Statistical Analysis of Textual Data

Provides a set of functions for multivariate exploratory statistics on textual data. Classical methods such as correspondence analysis and agglomerative hierarchical clustering are available, along with chronologically constrained agglomerative hierarchical clustering enriched with trees labelled by words. Given a division of the corpus into parts, the characteristic words and documents of each part are identified. In addition, 'FactoMineR' functions are easy to access. Two of them are relevant to the textual domain. MFA() handles multiple lexical tables, which allows applications such as dealing with multilingual corpora and simultaneously analyzing both open-ended and closed questions in surveys. See <> for examples.




Version 1.4.1 by Ramón Alvarez-Esteban, 4 months ago


Authors: Mónica Bécue-Bertaut, Ramón Alvarez-Esteban, Josep-Anton Sánchez-Espigares, Belchin Kostov

Documentation: PDF Manual

GPL (>= 2.0) license

Imports ggdendro, ggforce, ggrepel, graphics, gridExtra, MASS, methods, stringi, stringr, slam, stats, utils, flexclust, flashClust

Depends on FactoMineR, ggplot2, tm

See at CRAN