Kevin Miscellaneous

This package contains a collection of functions for common data extraction and reshaping operations, string manipulation, and functions for table and plot generation for R Markdown documents.


A large number of new functions have been added; explore the index and run 'example()' on any function to see usage examples.

  • 'Kmisc.knit2html' knits an .Rmd file to HTML using Kmisc's own CSS and JavaScript styling. Styles are borrowed and modified from Twitter Bootstrap, and the theme is inspired by Tomorrow Night Bright.
  • Added a utility wrapper to 'awk'; useful for calling 'awk' directly from the command line. The path to 'awk' can be set with 'awk.set'.
  • Added 'chunk', a function for generating 'chunks' of indices. E.g., chunk(1, 4, size=2) returns list(chunk1=c(1, 2), chunk2=c(3, 4)).
  • Added 'rowApply', 'colApply' wrappers that aim to behave like versions of 'apply' for matrices. A 'drop' argument is added to control whether dimension dropping occurs.
  • Added 'counts', a fast Rcpp-backed version of table for 1D vectors.
  • Added 'duplicate', a function that explicitly forces a copy of an R object.
  • Utility functions for extracting and splitting tabular files: 'extract_rows_from_file' extracts rows from a tabular file where a column meets some criteria (avoiding reading the whole file into memory).
  • 'factor_' is a faster implementation of 'factor', but with less flexibility (currently only allows for a levels argument)
  • 'htmlTable' wraps around knitr::kable, but allows the user to include HTML attributes (class, id, style...)
  • 'labeller' allows users to modify axis labels in a ggplot2 plot.
  • 'list2df' allows for the fast coercion from R lists to data.frames.
  • 'manhattan_plot' produces the typical manhattan plot as seen often in GWAS studies.
  • 'matches' generates matches over
  • 'mat2df' converts matrices to data.frames.
  • 'melt_' is a faster version of reshape2::melt. A companion function 'unmelt_' also exists to undo the operation of 'melt_' (but should be used with care).
  • 'pymat' allows one to use Python-style formatting of strings in R
  • Added 'readlines' and 'read', which act as faster versions of 'readLines' and 'scan', using mmap.
  • Added '[Rr]cpp_tapply_generator', which allows one to define functions to be called in a 'tapply' like manner: split by a factor grouping variable.
  • 'remove_na' is a helper function for removing NA elements from different R objects.
  • 'size' is a wrapper to 'object.size', but uses units="auto" by default.
  • 'split2df' splits a vector of strings following some regular structure to a data.frame, acting as a nice wrapper over strsplit.
  • 'split_file' is used to split a tabular data file based on the fields in one column. Can be used as a quick and dirty way of splitting large tabular data files for sub-processing.
  • 'str_collapse' is a fast version of 'paste0(..., collapse="")', using Rcpp sugar.
  • 'swap_' is a companion function to 'swap', allowing a user to provide a set of named mappings to '...'
  • 'tapply_' is a faster version of 'tapply' for the common case of splitting on a 1D vector
  • 'transpose' is like R's 't', but we implement a 'transpose.list' method that behaves specially (assumes a matrix-like list)
  • 'tree' prints the tree structure of an R list.
  • 'wrap' will wrap string output to a specified width (number of characters). The default is low as the primary use is in plot fields that need to be small (legend titles, etc.)
  • Cleaned up the internal implementation of HTML tags to avoid ugliness in earlier versions.
  • Added %kin% and %knin%, 'keeping' versions of %in% and %nin%
  • Registered C functions
  • Reorganized many of the regular expression type functions so that they can be called through the 're_' prefix, for consistency. Older functions (e.g. re.exists) are replaced by their 're_' counterparts (re_exists, re_extract, re_without).
  • Using the 'clipboard' utility function throws an error for non-Windows/Mac systems.
  • Fixed a bug in setting the row names of 'kMerge'.
  • Added 'pad', a function for padding various R objects with NAs.
  • Added 'remove_chars' and 'remove_digits' as utility regex functions for removing characters/digits from a character vector.
  • Package now requires Rcpp.
  • 'factor_to_char' now has a fast C implementation
  • Added 'split_runs', an Rcpp function for splitting a numeric or character vector by the runs observed.
  • Made an Rcpp implementation of 'stack_list', for fast stacking of lists of data.frames.
  • Fixed bug in 'swap' function.
  • Added 'grid.text2', a function that generates text with a simple background. Useful for overlaying simple formulas / results on a plot.
  • Fixed bugs with 'kMerge'.
  • Fixed compatibility issues with old versions of R (no calls to paste0()).
  • Added str_rev2(), for reversing UTF-8 strings (str_rev() remains for speed reasons when using ASCII characters). Similarly for str_slice() and str_slice2().
  • Added str_sort(), a function for sorting a (UTF-8) string lexically.
  • Updated documentation on extract() / without().
  • Improved error checking in extract() / without() - now ensures that the symbols passed to ... are atomic
  • Added functions for extracting rows from a data frame where some variable in the data frame (default: rownames) matches a regex pattern
  • Added anat() / anatomy() functions for fast str()-like calling on large data.frames
  • Added clean_doc(), a helper function that deletes all documentation in the current working directory - used for cleaning up orphaned docs, assuming you're writing documentation with roxygen2
  • Added is.sorted(), which is really just !is.unsorted() but seems the more 'logical' question in many cases
  • Added 'rcpp_apply_generator', a function for generating fast apply-type functions through Rcpp
  • Added some useful string operations: str_rev() and str_slice(). These functions are written in C for fast execution.
  • Added in.interval(), a function for checking whether each element of a vector x lies within an interval [lo, hi).
  • Added functions for extracting named objects; objects with names matching / not matching the regular expression supplied are returned
  • Fixed bug with the HTML tag functions' handling of functions as arguments
  • Functions no longer depend on Rcpp - may be reintroduced in future
  • Minor changes to code for R <2.15.0 compatibility
  • Updated documentation + DESCRIPTION for clarity
  • Package now byte compiles by default
  • Introduced attachHTML() and detachHTML() functions for attaching commonly used HTML functions (i.e., those that do not mask any base package functions)
  • Exported makeHTMLTag() function, added documentation + examples
  • Fixed bug in read.cb() for Mac systems
  • Added scan.cb(), for getting non-tabular data from clipboard
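
A few of the entries above describe behaviour precisely enough to sketch in base R. The block below is a minimal illustration under stated assumptions, not Kmisc's actual code: the '_sketch' names are hypothetical, the argument names are guesses based on the descriptions of 'chunk', 'in.interval' and 'is.sorted', and the real functions may differ in signature, return types and edge-case handling.

```r
# Base-R sketches only; these are NOT Kmisc's implementations.
# All function and argument names here are hypothetical.

# Split the indices min..max into consecutive chunks of at most `size` elements.
chunk_sketch <- function(min, max, size) {
  idx <- seq(min, max)
  groups <- ceiling(seq_along(idx) / size)   # 1, 1, 2, 2, ... for size = 2
  out <- split(idx, groups)
  names(out) <- paste0("chunk", seq_along(out))
  out
}

# TRUE where lo <= x < hi (closed on the left, open on the right).
in_interval_sketch <- function(x, lo, hi) {
  x >= lo & x < hi
}

# is.sorted() is described as the negation of base R's is.unsorted().
is_sorted_sketch <- function(x) {
  !is.unsorted(x)
}

chunks <- chunk_sketch(1, 4, size = 2)
inside <- in_interval_sketch(c(0, 1, 2.5, 3), lo = 1, hi = 3)
sorted <- is_sorted_sketch(c(1, 2, 2, 5))
```

With Kmisc installed, 'example(chunk)' and similar calls remain the authoritative source for how the real functions behave.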


0.5.0 by Kevin Ushey, 5 years ago

Authors: Kevin Ushey

GPL (>= 2) license

Imports Rcpp, data.table, lattice, grid, knitr, markdown

Suggests ggplot2, plyr, reshape2, googleVis, testthat, microbenchmark

Linking to Rcpp