Loads and processes huge text corpora that have been preprocessed with the sally toolbox (<http://www.mlsec.org/sally/>). sally acts as a very fast preprocessor that splits the text files into tokens or n-grams. The PRISMA package then reads these output files, applies testing-based token selection, and provides replicate-aware, highly tuned implementations of non-negative matrix factorization and principal component analysis, which allow very large data sets to be processed even on desktop machines.
Protocol Inspection and State Machine Analysis
The package PRISMA is hosted on CRAN, so
    install.packages("PRISMA")
    library(PRISMA)
    example(PRISMA)
    vignette("PRISMA")
will give you a first impression.
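One way to picture the "replicate-aware" processing mentioned in the description is to collapse identical documents into a single row plus a count, so that later computations run on the smaller matrix. The following is a minimal base-R sketch of that general idea only. It does not use or reproduce PRISMA's own code, and the toy matrix, the token names, and all variable names are assumptions made for illustration.

```r
# Toy document-by-token count matrix (hypothetical data).
# Rows 1, 3 and 4 are identical documents.
docs <- rbind(
  c(1, 0, 1),
  c(0, 1, 1),
  c(1, 0, 1),
  c(1, 0, 1),
  c(0, 1, 0)
)
colnames(docs) <- c("GET", "POST", "HTTP")

# Key each row by its contents, then keep one copy per distinct row
# together with the number of times it occurs.
keys <- apply(docs, 1, paste, collapse = " ")
first <- !duplicated(keys)
unique_docs <- docs[first, , drop = FALSE]
counts <- as.vector(table(keys)[keys[first]])

# Statistics on the full matrix can be recovered from the collapsed one
# by weighting each distinct row with its count.
weighted_means <- colSums(unique_docs * counts) / sum(counts)
full_means <- colMeans(docs)

print(nrow(unique_docs))
print(all.equal(weighted_means, full_means))
```

The weighted column means of the collapsed matrix match the plain column means of the full matrix, which is why working on distinct rows loses no information for this kind of statistic. On corpora with many repeated documents the collapsed matrix can be much smaller than the original.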