Tools to Create, Modify and Manage 'CWB' Corpora

The 'Corpus Workbench' ('CWB', <http://cwb.sourceforge.net/>) offers a classic and mature approach for working with large, linguistically and structurally annotated corpora. The 'CWB' is memory-efficient, and its design makes running queries fast (Evert and Hardie 2011, <http://www.stefan-evert.de/PUB/EvertHardie2011.pdf>). The 'cwbtools' package offers pure R tools to create indexed corpus files, as well as high-level wrappers for the original C implementation of 'CWB' as exposed by the 'RcppCWB' package (<https://CRAN.R-project.org/package=RcppCWB>). Additional functionality to add and modify annotations of corpora from within R makes working with 'CWB'-indexed corpora more flexible and convenient. In combination with the R packages 'RcppCWB' and 'polmineR' (<https://CRAN.R-project.org/package=polmineR>), 'cwbtools' offers a lightweight infrastructure that supports combining quantitative and qualitative approaches to working with textual data.




install.packages("cwbtools")
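The install command above can be wrapped in a version check so that it only runs when needed. The following is a minimal sketch using base R only; `has_min_version()` is a hypothetical helper introduced here for illustration and is not part of 'cwbtools'. The version string is the one shown in this listing.

```r
# Hypothetical helper (not part of cwbtools): returns TRUE if `pkg` is
# installed at version `min_version` or later, FALSE otherwise.
has_min_version <- function(pkg, min_version) {
  # requireNamespace() returns FALSE instead of failing when pkg is missing
  if (!requireNamespace(pkg, quietly = TRUE)) return(FALSE)
  utils::packageVersion(pkg) >= package_version(min_version)
}

# Install from CRAN only if the listed release (0.3.1) or newer is missing.
# The interactive() guard is an assumption of this sketch: it keeps
# non-interactive runs (e.g. Rscript) from triggering a download.
if (interactive() && !has_min_version("cwbtools", "0.3.1")) {
  install.packages("cwbtools")
}
```

Installing from CRAN also pulls in the packages listed under Imports; those under Suggests are optional and are not installed by default.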

0.3.1 by Andreas Blaette, 13 days ago


https://www.github.com/PolMine/cwbtools


Report a bug at https://github.com/PolMine/cwbtools/issues


Browse source code at https://github.com/cran/cwbtools


Authors: Andreas Blaette [aut, cre], Christoph Leonhardt [ctb]




GPL-3 license


Imports: data.table, R6, xml2, stringi, curl, RcppCWB, pbapply, methods, cli, jsonlite, RCurl, rstudioapi, zen4R

Suggests: tm, knitr, tokenizers, tidytext, SnowballC, janeaustenr, devtools, NLP, testthat, usethis, rmarkdown


Imported by GermaParl.

