Morpheme Tokenization

Tokenize text into morphemes. The morphemepiece algorithm uses a lookup table to determine the morpheme breakdown of words, and falls back on a modified wordpiece tokenization algorithm for words not found in the lookup table.
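The two-stage idea described above (lookup first, wordpiece fallback second) can be sketched in a few lines of base R. This is only an illustration under stated assumptions: `toy_lookup`, `toy_vocab`, `wordpiece_fallback()` and `tokenize_word()` are hypothetical names invented here, the data is made up, and the fallback is the plain greedy longest-match wordpiece algorithm rather than the modified variant the package uses.

```r
# Toy data invented for illustration; NOT the package's real lookup table or vocabulary.
toy_lookup <- list(
  unhappiness = c("un", "happy", "ness"),
  walked = c("walk", "ed")
)
toy_vocab <- c("jump", "##ing", "##s", "re", "##do")

# Plain greedy longest-match-first wordpiece (an assumption; the package's
# fallback is a modified version). Non-initial pieces carry a "##" prefix.
wordpiece_fallback <- function(word, vocab, unk = "[UNK]") {
  pieces <- character(0)
  start <- 1L
  n <- nchar(word)
  while (start <= n) {
    end <- n
    found <- NA_character_
    # Shrink the candidate from the right until it appears in the vocabulary.
    while (end >= start) {
      candidate <- substr(word, start, end)
      if (start > 1L) candidate <- paste0("##", candidate)
      if (candidate %in% vocab) {
        found <- candidate
        break
      }
      end <- end - 1L
    }
    # No piece matched at this position: the whole word is unknown.
    if (is.na(found)) return(unk)
    pieces <- c(pieces, found)
    start <- end + 1L
  }
  pieces
}

# Stage 1: exact lookup of the word. Stage 2: fall back to wordpiece.
tokenize_word <- function(word, lookup = toy_lookup, vocab = toy_vocab) {
  hit <- lookup[[word]]
  if (!is.null(hit)) return(hit)
  wordpiece_fallback(word, vocab)
}

tokenize_word("walked")   # found in the lookup table
tokenize_word("jumping")  # not in the lookup table, split by the fallback
```

Keeping the lookup ahead of the fallback means that words with a known morpheme breakdown never reach the purely string-based splitter, which only sees words missing from the table.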



install.packages("morphemepiece")

1.0.1 by Jonathan Bratt, 7 days ago


https://github.com/macmillancontentscience/morphemepiece


Report a bug at https://github.com/macmillancontentscience/morphemepiece/issues


Browse source code at https://github.com/cran/morphemepiece


Authors: Jonathan Bratt [aut, cre], Jon Harmon [aut], Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning [cph]




License: Apache License (>= 2)


Imports: dlr, magrittr, morphemepiece.data, piecemaker, purrr, rlang, stringr

Suggests: dplyr, fs, ggplot2, here, knitr, remotes, rmarkdown, testthat, utils

