Morpheme Tokenization

Tokenize text into morphemes. The morphemepiece algorithm uses a lookup table to determine the morpheme breakdown of words, and falls back on a modified wordpiece tokenization algorithm for words not found in the lookup table.
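The lookup-then-fallback flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the package's implementation: the tables `LOOKUP` and `VOCAB`, the `##` marker notation, the `[UNK]` token, and the function names are all hypothetical. The fallback shown is plain greedy longest-match wordpiece, not the modified variant the package uses.

```python
# Toy data, invented for illustration only (not the package's real tables).
LOOKUP = {
    "unhappiness": ["un##", "happy", "##ness"],
    "cats": ["cat", "##s"],
}
VOCAB = {"walk", "##ing", "##ed", "##s", "un##", "happy", "##ness", "cat"}
UNK = "[UNK]"


def wordpiece_fallback(word, vocab=VOCAB, unk=UNK):
    """Greedy longest-match-first split (plain wordpiece, an assumption)."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # mark non-initial pieces
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits: the whole word becomes unknown
        pieces.append(match)
        start = end
    return pieces


def tokenize(text):
    """Use the lookup table first; fall back to wordpiece for unlisted words."""
    tokens = []
    for word in text.lower().split():
        if word in LOOKUP:
            tokens.extend(LOOKUP[word])
        else:
            tokens.extend(wordpiece_fallback(word))
    return tokens


print(tokenize("cats walking"))
```

Note that a lookup entry can return morphemes that are not substrings of the word (here "happy" for "unhappiness"), which a purely surface-based wordpiece split cannot do.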




1.0.1 by Jonathan Bratt, 7 days ago

Report a bug at

Browse source code at

Authors: Jonathan Bratt [aut, cre], Jon Harmon [aut], Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning [cph]

Documentation:   PDF Manual  

License: Apache License (>= 2)

Imports dlr, magrittr, piecemaker, purrr, rlang, stringr

Suggests dplyr, fs, ggplot2, here, knitr, remotes, rmarkdown, testthat, utils

See at CRAN