Tools for the 'qdap' Package

qdapTools is an R package that contains tools associated with the qdap package that may be useful outside of the context of text analysis.

To install the development version of qdapTools, download the zip ball or tar ball, decompress it, and run R CMD INSTALL on it, or use the pacman package:

if (!require("pacman")) install.packages("pacman")

Releases will be numbered with the following semantic versioning format:

major.minor.patch

And constructed with the following guidelines:

  • Breaking backward compatibility bumps the major (and resets the minor and patch)
  • New additions that do not break backward compatibility bump the minor (and reset the patch)
  • Bug fixes and miscellaneous changes bump the patch
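
As a worked illustration of the guidelines above, here is a minimal base R sketch. The helper name bump_version is hypothetical and is not part of qdapTools; it only shows how each kind of change moves a major.minor.patch string.

```r
# Hypothetical helper (not part of qdapTools): bump a "major.minor.patch" string.
bump_version <- function(version, type = c("patch", "minor", "major")) {
  type <- match.arg(type)
  parts <- as.integer(strsplit(version, ".", fixed = TRUE)[[1]])
  stopifnot(length(parts) == 3L, !anyNA(parts))
  names(parts) <- c("major", "minor", "patch")

  if (type == "major") {
    # Breaking change: bump the major, reset the minor and patch
    parts <- c(parts[["major"]] + 1L, 0L, 0L)
  } else if (type == "minor") {
    # Backward-compatible addition: bump the minor, reset the patch
    parts <- c(parts[["major"]], parts[["minor"]] + 1L, 0L)
  } else {
    # Bug fix or miscellaneous change: bump the patch only
    parts <- c(parts[["major"]], parts[["minor"]], parts[["patch"]] + 1L)
  }
  paste(parts, collapse = ".")
}

bump_version("0.1.1", "major")  # "1.0.0"
bump_version("1.2.3", "minor")  # "1.3.0"
bump_version("1.3.0", "patch")  # "1.3.1"
```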


  • read_docx added to read in .docx documents.

  • start_end added to find the locations of start/end places for the ones in a binary vector.

  • run_split added to split a string into run chunks.

  • shift, shift_left, and shift_right added to shift vectors.
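
To make the start_end idea above concrete, the following base R sketch finds where each run of ones begins and ends in a binary vector. The name start_end_sketch and the start/end column names are assumptions made for illustration; this is not the qdapTools implementation and its return format may differ from the package's.

```r
# Base R sketch; start_end_sketch is a hypothetical name, not the qdapTools function.
start_end_sketch <- function(x) {
  runs <- rle(as.integer(x))            # run-length encode the 0/1 vector
  ends <- cumsum(runs$lengths)          # last index of every run
  starts <- ends - runs$lengths + 1L    # first index of every run
  keep <- runs$values == 1L             # keep only the runs of ones
  data.frame(start = starts[keep], end = ends[keep])
}

x <- c(0, 1, 1, 0, 0, 1, 0, 1, 1, 1)
start_end_sketch(x)
# runs of ones occupy positions 2-3, 6-6, and 8-10
```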


  • counts2list now uses apply and gains a speed boost.

  • loc_split picks up a speed boost thanks to indexing and dropping a reliance on cut + split.


  • loc_split added to split data forms (list, vector, data.frame, matrix) on a vector of integer locations.

  • matrix2long makes a long format data.frame. It takes a matrix object, stacks all columns and adds identifying columns by repeating row and column names accordingly.

  • run_split added to split strings into runs.
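
The stacking that matrix2long is described as performing can be sketched in base R as follows. The function name matrix2long_sketch and the output column names (cols, rows, vals) are assumptions chosen for this example, not a statement of the package's actual output.

```r
# Base R sketch of stacking a matrix into long format (hypothetical helper).
matrix2long_sketch <- function(m) {
  stopifnot(is.matrix(m), !is.null(rownames(m)), !is.null(colnames(m)))
  data.frame(
    cols = rep(colnames(m), each = nrow(m)),   # column name repeated for each row
    rows = rep(rownames(m), times = ncol(m)),  # row names recycled per column
    vals = as.vector(m),                       # values stacked column by column
    stringsAsFactors = FALSE
  )
}

m <- matrix(1:6, nrow = 2, dimnames = list(c("r1", "r2"), c("a", "b", "c")))
long <- matrix2long_sketch(m)
long
```

Each cell of the matrix becomes one row of the result, so a 2 x 3 matrix yields a 6-row data.frame.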


  • split_vector picks up a regex argument to allow for regular expression search of break location.


  • lookup threw an error with single length input terms and missing=NULL (see issue #6 for more). This behavior has been fixed.

  • lookup changed the order of existing data.frames because of data.table's scoping, which modifies data in place. This was spotted by Kirill Muller (see issue #7) and a solution was provided by Matthew Flickinger.


  • lookup would throw a warning and convert to a more restrictive mode when (1) the modes of terms and key.reassign didn't match and (2) missing = NULL. This behavior has been fixed. See issue #5.
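
For readers unfamiliar with the idea behind these lookup fixes, here is a rough base R sketch of a dictionary-style lookup. It assumes that missing = NULL means unmatched terms are kept as they are, and that any other value of missing is substituted for unmatched terms. The function lookup_sketch and its argument names are hypothetical; the real lookup is built on data.table and handles many more input types.

```r
# Base R sketch of a key/value lookup (hypothetical helper, not qdapTools::lookup).
lookup_sketch <- function(terms, key, values, missing = NA) {
  idx <- match(terms, key)        # position of each term in the key
  out <- values[idx]              # NA where a term has no match
  unmatched <- is.na(idx)
  if (is.null(missing)) {
    # Assumption: NULL means "keep the original term when nothing matches"
    out <- as.character(out)
    out[unmatched] <- as.character(terms[unmatched])
  } else {
    out[unmatched] <- missing
  }
  out
}

key <- c("a", "b")
values <- c("apple", "banana")
lookup_sketch(c("a", "z", "b"), key, values)                  # "apple" NA "banana"
lookup_sketch(c("a", "z", "b"), key, values, missing = NULL)  # "apple" "z" "banana"
```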


  • split_vector added to split a vector into a list of vectors based on split points.
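
Splitting a vector on split points can be sketched in base R as below. This is an illustration only: split_vector_sketch and its at argument are hypothetical names, and the sketch drops the break elements themselves, which may not match every option of the real split_vector.

```r
# Base R sketch: break a vector into pieces wherever an element equals `at`.
split_vector_sketch <- function(x, at = "") {
  is_break <- x == at              # TRUE at every split point
  groups <- cumsum(is_break)       # group id increases after each split point
  # Drop the split points themselves and split the rest by group
  unname(split(x[!is_break], groups[!is_break]))
}

x <- c("a", "b", "", "c", "d", "", "e")
split_vector_sketch(x)
# a list with three pieces: c("a", "b"), c("c", "d"), and "e"
```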


  • list2df would return rownames matching the names of the original list rather than numeric indexes. row.names = FALSE was added to the call to data.frame to correct this.


  • pad did not work consistently across all platforms. This behavior has been fixed.

This version of qdapTools incorporates the data.table package. This provides huge speed boosts within a flexible framework. The old behavior of the lookup functions was to convert factor to character. The latest version does not perform this coercion. Those relying on this behavior may find that their code breaks, hence the major bump to version 1.0.0. Thank you to Arun Srinivasan for his demonstration of the data.table package and help in incorporating it into qdapTools.


  • lookup did not have a method for when key.match was a factor; lookup.factor added.


  • The lookup and hash families of functions wrap the data.table package to provide the ease of the lookup binary operators with the speed of the data.table package.

  • qdapTools now uses the testthat package to provide unit testing on the package functions.


  • v_outer gains a speed boost through optimization, including a suggestion from eddi.

  • id now allows the user to supply a character string prefix via the prefix argument.


  • The %l*% binary operator becomes deprecated as its behavior is no longer needed with the inclusion of the data.table package. It will be removed in a subsequent version of qdapTools.

This version of qdapTools highlights optimization of lookup and v_outer. It also adds the mtabulate function from the qdap package. Future development will revolve around further optimization of lookup and v_outer. lookup may utilize the data.table package to gain speed.


  • The lookup and hash_look families of functions gain a major speed boost thanks to @Arun Srinivasan.

  • lookup becomes a generic method that operates on various classes. This gains a slight speed boost.

  • v_outer becomes a generic method that operates on various classes. This gains a slight speed boost.


  • mtabulate moved from qdap to qdapTools.

This release moves the list2df family of functions from qdap to qdapTools.
This completes the process of moving generic qdap tools into a separate qdapTools package.


  • The list2df family of functions has been moved from qdap to qdapTools. These functions include: list2df, matrix2df, vect2df, list_df2df, list_vect2df, counts2list, & vect2list.


  • id added as a function to generate a sequence of integers the length/nrow of an object.

  • pad added as a convenience wrapper for sprintf that pads the front end of strings with spaces or 0s.
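
The kind of sprintf padding that pad is described as wrapping looks roughly like this in base R. The helper pad_sketch and its arguments are hypothetical and do not reproduce pad's real signature; the sketch assumes numeric input is whole numbers, since it converts with as.integer.

```r
# Base R sketch of front-padding with sprintf (hypothetical helper, not qdapTools::pad).
pad_sketch <- function(x, width = max(nchar(as.character(x))), zero = is.numeric(x)) {
  if (zero) {
    # Numbers: left-pad with zeros, e.g. "%03d"
    sprintf(paste0("%0", width, "d"), as.integer(x))
  } else {
    # Strings: left-pad with spaces, e.g. "%3s"
    sprintf(paste0("%", width, "s"), as.character(x))
  }
}

pad_sketch(c(1, 10, 100))        # "001" "010" "100"
pad_sketch(c("a", "bb", "ccc"))  # "  a" " bb" "ccc"
```

Padded values have equal width, so they sort the same way as strings and as numbers.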

First push to CRAN.

  • %l*% added as a binary operator form of lookup that returns a factor when one is supplied in column 2 of the key.match data.frame supplied. Suggestion by Kirill Muller.

Tools used by qdap that may be of use outside of the context of text analysis have been moved to a separate package, qdapTools. This is the first installment of the package.


1.3.1 by Tyler Rinker, 2 years ago

Authors: Bryan Goodrich [ctb], Dason Kurkiewicz [ctb], Kirill Muller [ctb], Tyler Rinker [aut, cre]

GPL-2 license

Imports chron, data.table, methods, RCurl, XML

Suggests testthat

Imported by anomalyDetection, censusGeography.

Depended on by qdap.