Extract Text from Microsoft Word Documents

Wraps the 'AntiWord' utility to extract text from Microsoft Word documents. The utility supports only the old 'doc' format, not the newer XML-based 'docx' format. Use the 'xml2' package to read the latter.
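
The two cases described above can be sketched roughly as follows. This is a minimal illustration, not the package's documented workflow. It assumes the package's exported antiword() function takes a path to a 'doc' file and returns the extracted text. The helper names read_doc_text, docx_paragraphs and read_docx_text, and the file names in the usage comments, are hypothetical. The 'docx' helper relies on a 'docx' file being a zip archive whose main body is stored in word/document.xml, with visible text held in w:t elements inside w:p paragraphs.

```r
# Old binary 'doc' format: delegate to the antiword package.
# Assumption: antiword::antiword() accepts a file path and returns text.
read_doc_text <- function(path) {
  antiword::antiword(path)
}

# Collect the text of each paragraph from a parsed word/document.xml.
# Only w:t (visible text run) elements are concatenated.
docx_paragraphs <- function(doc) {
  paragraphs <- xml2::xml_find_all(doc, "//w:p")
  vapply(paragraphs, function(p) {
    runs <- xml2::xml_find_all(p, ".//w:t")
    paste(xml2::xml_text(runs), collapse = "")
  }, character(1))
}

# Newer 'docx' format: unzip the archive, then parse the body with xml2.
read_docx_text <- function(path) {
  tmp <- tempfile()
  dir.create(tmp)
  on.exit(unlink(tmp, recursive = TRUE))
  utils::unzip(path, files = "word/document.xml", exdir = tmp)
  doc <- xml2::read_xml(file.path(tmp, "word", "document.xml"))
  docx_paragraphs(doc)
}

# Usage (hypothetical file names):
# text_old <- read_doc_text("report.doc")
# text_new <- read_docx_text("report.docx")
```

The 'docx' helper returns one string per paragraph and ignores headers, footers, footnotes and tables' structure, so it is a starting point rather than a full converter.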



  • Windows: shQuote() path to file to make it work for paths with spaces
  • Capture error messages sent to stderr() by antiword
  • Simplify build structure a bit
  • Fix UBSAN error



1.1 by Jeroen Ooms, a year ago

https://github.com/ropensci/antiword#readme (devel) http://www.winfield.demon.nl (upstream)

Report a bug at http://github.com/ropensci/antiword/issues

Browse source code at https://github.com/cran/antiword

Authors: Jeroen Ooms [aut, cre], Adri van Os [cph] (Author 'antiword' utility)


GPL-2 license

Imports sys

Imported by readtext, textreadr.

Suggested by tm.
