An R wrapper to the spaCy “industrial strength natural language processing” Python library from https://spacy.io.
The easiest way to install spaCy and spacyr is through the spacyr function
spacy_install(). By default, this function
creates a new conda environment called
spacy_condaenv, as long as
some version of conda is installed on the user’s system. You can
install miniconda from https://conda.io/miniconda.html. (Choose
the 64-bit version, or alternatively, run to the computer store now
and purchase a 64-bit system to replace your ancient 32-bit machine.)
If you already have any version of conda, you can skip this step.
To check whether conda is installed, enter
conda --version in the Terminal.
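The same check can also be made from within R. The following is a minimal sketch using only base R; it assumes conda is on the PATH, so a conda installation that is not on the PATH would be reported as missing here.

```r
# Look for a conda executable on the PATH.
# Sys.which() returns an empty string when nothing is found.
conda_path <- unname(Sys.which("conda"))

if (nzchar(conda_path)) {
  # Equivalent to typing `conda --version` in the Terminal.
  conda_version <- system2(conda_path, "--version", stdout = TRUE, stderr = TRUE)
  message("Found conda: ", paste(conda_version, collapse = " "))
} else {
  message("No conda found on the PATH; install miniconda first.")
}
```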
On a Windows-based system, Visual C++ Build Tools or Visual Studio Express must be installed to compile spaCy for pip installation. The version of Visual Studio required for the installation of spaCy is found here. The default Python version used in our installation method is 3.6.x.
Install the spacyr R package:
To install the latest package from source, you can simply run the following.
devtools::install_github("quanteda/spacyr", build_vignettes = FALSE)
Install spaCy in a conda environment
On Windows, you need to run R as an administrator for the installation to work properly. To do so, right-click the RStudio icon (or R desktop icon) and select “Run as administrator” when launching R.
To install spaCy, you can simply run spacy_install().
This will create a stand-alone conda environment including a Python executable separate from your system Python (or Anaconda Python), install the latest version of spaCy (and its required packages), and download the English language model. After installation, you can initialize spaCy in R with spacy_initialize().
This will return the following message if spaCy was installed with this method.
## Found 'spacy_condaenv'. spacyr will use this environment
## successfully initialized (spaCy Version: 2.0.18, language model: en)
## (python options: type = "condaenv", value = "spacy_condaenv")
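The installation and initialization steps described above can be sketched together as follows. The helper name setup_spacy() is hypothetical and not part of spacyr; it only wraps the spacy_install() and spacy_initialize() calls so that nothing is downloaded until you call it. The sketch assumes the spacyr version described here, in which spacy_install() creates the spacy_condaenv conda environment.

```r
# Hypothetical helper (not part of spacyr): install spaCy once, then start it.
setup_spacy <- function() {
  if (!requireNamespace("spacyr", quietly = TRUE)) {
    stop("Install the spacyr package first.")
  }
  # Creates the conda environment and downloads the English language model.
  spacyr::spacy_install()
  # Starts the Python backend and reports the environment that was found.
  spacyr::spacy_initialize()
  invisible(TRUE)
}
```

Call setup_spacy() once on a new machine. In later R sessions only spacy_initialize() should be needed, since the conda environment already exists.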
(optional) Add more language models
For spaCy installed by
spacy_install(), spacyr provides a
useful helper function to install additional language models. For
instance, to install the German language model, run spacy_download_langmodel("de").
(Again, Windows users have to run this command as an administrator. Otherwise, the symlink (alias) to the language model will fail.)
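As an illustration, downloading a model and then starting spaCy with it might look like the sketch below. The helper name use_language_model() is hypothetical and not part of spacyr; the sketch assumes spaCy was installed with spacy_install() and that the model code (here "de") is one spaCy provides.

```r
# Hypothetical helper (not part of spacyr): add a language model and start spaCy with it.
use_language_model <- function(model = "de") {
  stopifnot(is.character(model), length(model) == 1)
  if (!requireNamespace("spacyr", quietly = TRUE)) {
    stop("Install the spacyr package first.")
  }
  # Download the model into the environment created by spacy_install().
  spacyr::spacy_download_langmodel(model)
  # Start spaCy with the newly installed model instead of the default English one.
  spacyr::spacy_initialize(model = model)
  invisible(model)
}
```

If spaCy is already running in the current R session, call spacy_finalize() first, since an active session keeps the model it was started with.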
If you use the same setting for spaCy (e.g. condaenv or Python
path) every time and want to reduce the time for initialization, you can
fix the setting by specifying it in an R startup file (on Mac/Linux,
the file is
~/.Rprofile), which is read every time a new
R session is launched. You can set the option permanently when you call
spacy_initialize(save_profile = TRUE)
Once this is appropriately set up, the message from
spacy_initialize() changes to something like:
## spacy python option is already set, spacyr will use:
## condaenv = "spacy_condaenv"
## successfully initialized (spaCy Version: 2.0.18, language model: en)
## (python options: type = "condaenv", value = "spacy_condaenv")
To ignore the permanently set options, you can initialize spaCy with
spacy_initialize(refresh_settings = TRUE).
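The two modes can be sketched side by side. The helper name initialize_spacy() is hypothetical and not part of spacyr; it only switches between the save_profile and refresh_settings arguments of spacy_initialize() described above.

```r
# Hypothetical helper (not part of spacyr): start spaCy with or without the saved settings.
initialize_spacy <- function(ignore_saved = FALSE) {
  stopifnot(is.logical(ignore_saved), length(ignore_saved) == 1, !is.na(ignore_saved))
  if (!requireNamespace("spacyr", quietly = TRUE)) {
    stop("Install the spacyr package first.")
  }
  if (ignore_saved) {
    # Skip the settings stored in the R startup file and search again.
    spacyr::spacy_initialize(refresh_settings = TRUE)
  } else {
    # Record the chosen setting in the R startup file for later sessions.
    spacyr::spacy_initialize(save_profile = TRUE)
  }
  invisible(ignore_saved)
}
```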
nounphrase_consolidate() for direct extraction of entities, nounphrases, and tokens, and extraction of noun phrases from spacyr parsed texts.
spacy_parse() allowing the return of any tokens-level attribute available from https://spacy.io/api/token#attributes.
spacy_upgrade() to make installing or upgrading spaCy (and Python itself) easy and automatic.
multithreading argument. This uses the "pipes" functionality in spaCy for improved performance.
spacy_initialize(entity = FALSE) (#91)
ask = FALSE to
spacy_initialize(), to find spaCy installations automatically.
spacy_parse(), by changing