A collection of tools associated with the 'qdap' package that may be useful outside of the context of text analysis.
qdapTools is an R package that contains tools associated with the qdap package that may be useful outside of the context of text analysis.
To download the development version of qdapTools:
if (!require("pacman")) install.packages("pacman")
pacman::p_load_gh("trinker/qdapTools")
Releases will be numbered with the following semantic versioning format:
And constructed with the following guidelines:
read_docx added to read in .docx documents.
start_end added to find the locations of start/end places for the ones in a
run_split added to split a string into run chunks.
shift_right added to shift vectors.
counts2list now uses
apply and gains a speed boost.
loc_split picks up a speed boost thanks to indexing and dropping a reliance
loc_split added to split data forms (
matrix) on a vector of integer locations.
matrix2long makes a long format data.frame. It takes a matrix object, stacks
all columns and adds identifying columns by repeating row and column names
run_split added to split strings into runs.
split_vector picks up a regex argument to allow for regular expression search of break location.
lookup threw an error with single length input
(see issue #6 for more). This behavior has been fixed.
lookup changed the order of existing
data.frames because of
scoping which modifies data in place. This was spotted by Kirill Muller (see
issue #7) and a solution provided by Matthew Flickinger
lookup would throw a warning and convert to a more restrictive mode when (1)
key.reassign modes didn't match & (2)
missing = NULL. This behavior has been fixed. See issue #5.
split_vector added to split a vector into a list of vectors based on split points.
rownames matching the names of the original list rather than numeric indexes.
row.names = FALSE was added to the call to data.frame to correct this.
pad did not work consistently across all platforms. This behavior has been fixed.
This version of qdapTools incorporates the data.table package. This
provides huge speed boosts within a flexible framework. The old behavior of
lookup functions was to convert
character. The latest
version does not perform this coercion. Those relying on this behavior may
find that their code breaks, hence the major bump to version 1.0.0. Thank you to
Arun Srinivasan for his demonstration of the
data.table package and help in
incorporating it into
lookup did not have a method for when key.match was a factor;
hash families of functions wraps
data.table package to
provide the ease of the lookup binary operators with the speed of the
qdapTools now uses the
testthat package to provide unit testing on
the package functions.
v_outer gains a speed boost through optimization, including a
suggestion from stackoverflow.com's eddi:
id now allows the user to supply a character string prefix via the
%l*% binary operator becomes deprecated as its behavior is no longer needed with the inclusion of the
data.table package. It will be removed in a subsequent version of
This version of qdapTools highlights optimization of
It also adds the
mtabulate function from the qdap package. Future development
will revolve around further optimization of
may utilize the data.table package to gain speed.
hash_look family of functions gains a major speed boost thanks
to @Arun Srinivasan. See: https://gist.github.com/arunsrinivasan/ee2d9ef43bdc02c32958
lookup becomes a generic method that operates on various classes. This
gains a slight speed boost.
v_outer becomes a generic method that operates on various classes. This
gains a slight speed boost.
This release moves the
list2df family of functions from
qdap to qdapTools.
This completes the process of moving generic
qdap tools into a separate
list2df family of functions have been moved from
qdapTools. These functions include:
id a function to generate a sequence of integers the
nrow of an
pad a convenience wrapper for
sprintf that pads the front end of strings
with spaces or 0s.
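The front-padding idea described for pad can be illustrated with a minimal base R sketch. This is not the package's implementation; the variable names (x, width, zero_padded, space_padded) are made up for illustration, and only base R's sprintf is assumed.

```r
# Illustrative sketch only: front-pad values to a common width with sprintf.
# This is NOT qdapTools::pad; the names below are hypothetical.
x <- c(1L, 10L, 100L)

# Pad to the width of the longest element.
width <- max(nchar(as.character(x)))

# Zero padding for integers: builds the format string "%03d".
zero_padded <- sprintf(paste0("%0", width, "d"), x)

# Space padding for strings: builds the format string "%3s".
space_padded <- sprintf(paste0("%", width, "s"), as.character(x))

print(zero_padded)   # "001" "010" "100"
print(space_padded)  # "  1" " 10" "100"
```

Padding to a common width is mainly useful when identifiers need to sort correctly as character strings, for example "010" sorting before "100".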
First push to CRAN.
%l*% added as a binary operator form of
lookup that returns a factor when one is supplied in column 2 of the
data.frame supplied. Suggestion by Kirill Muller, see: https://github.com/trinker/qdap/issues/167#issuecomment-41009219
Tools used by qdap that may be of use outside the context of text analysis have been moved to a separate package, qdapTools. This is the first installment of the package.