Text Extraction and Rendering of PDF Documents

Utilities based on 'libpoppler' for extracting text, fonts, attachments and metadata from a PDF file. Also renders PDF pages to bitmaps on supported platforms.
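As a rough illustration of the features listed above, the following sketch extracts text, metadata, fonts and attachments from a PDF and then renders its first page. It assumes the package exports pdf_text, pdf_info, pdf_fonts, pdf_attachments and pdf_render_page, as the upstream README describes. The test file is generated with base R graphics purely so the example is self-contained; the file name and variable names are illustrative, not part of the package.

```r
library(pdftools)

# Build a small one-page PDF with base R graphics (illustrative test file only)
pdf_file <- tempfile(fileext = ".pdf")
grDevices::pdf(pdf_file)
plot.new()
text(0.5, 0.5, "Hello pdftools")
invisible(dev.off())

# Text extraction: a character vector with one element per page
txt <- pdf_text(pdf_file)

# Metadata: a list that includes, among other fields, the page count
info <- pdf_info(pdf_file)

# Fonts used in the document, returned as a data frame
fonts <- pdf_fonts(pdf_file)

# Embedded attachments, if any (likely empty for this generated file)
files <- pdf_attachments(pdf_file)

# Rendering is only available on supported platforms, so guard the call.
# Page numbers are 1-based.
bitmap <- tryCatch(
  pdf_render_page(pdf_file, page = 1, dpi = 72),
  error = function(e) NULL
)

cat("Pages:", info$pages, "\n")
cat("Rendering available:", !is.null(bitmap), "\n")
```

If rendering succeeds, the resulting bitmap can be written to disk with one of the suggested packages, for example png::writePNG(bitmap, "page.png"), provided that package is installed.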



  • Add workaround for poppler landscape truncation bug (fixes #7)


  • Rebuild poppler on Windows to support PDF rendering


  • Update Homebrew URL in configure script
  • Fix autobrew (rename libopenjpeg -> libopenjp2)
  • Update libpoppler to 0.46 for Windows


  • Update libpoppler to 0.42 for Windows
  • Use the COMPILED_BY variable on Windows to support R 3.3


  • Switch pdf_render_page to 1-based indexing
  • Fix for red/blue channel mixup in pdf_render_page
  • Update example to use local PDF file


  • Initial release



1.2 by Jeroen Ooms, a month ago

https://ropensci.org/blog/2016/03/01/pdftools-and-jeroen (blog) https://github.com/ropensci/pdftools#readme (devel) https://poppler.freedesktop.org (upstream)

Report a bug at https://github.com/ropensci/pdftools/issues

Browse source code at https://github.com/cran/pdftools

Authors: Jeroen Ooms

Documentation: PDF Manual

MIT + file LICENSE license

Imports Rcpp

Suggests jpeg, png, webp

Linking to Rcpp

System requirements: Poppler C++ interface library and headers

Imported by textreadr.

Depended on by pdfsearch.

Suggested by hunspell, magick, tesseract.
