Tools to load 'R' packages and automatically generate BibTeX files citing them as well as load and cache plain-text and 'Excel' formatted data stored on 'GitHub', and from other sources.
Miscellaneous tools for reproducible research
repmis currently has the following functions:
LoadandCite: a function for installing and loading R packages. The command
also creates a BibTeX bibliography file with package citations.
InstallOldPackages: installs specific R package versions.
source_data: loads plain-text formatted data (e.g. CSV, TSV) or RDATA stored
at a URL (both http and https) into R. Note: the command can download data
from almost any secure (https) URL. This includes data in Dropbox Public
folders and published Google Docs plain-text formatted data sets (see the
Google Docs support pages for details). Note: currently only the old Google
Sheets supports publishing sheets to the Web as plain-text files.
source_data and all of the other data download commands in repmis find and
report the SHA-1 hash of each file they load.
You can use a file's SHA-1 hash to make sure you are downloading the file and
version of the file you think you are downloading. Note: if you are using
source_data to download data from GitHub, source_data's SHA-1 hash is not
the same as the Git commit's SHA-1 hash. (Thanks to Hadley Wickham's
devtools package for the code to make this possible.)
Data downloaded with source_data can be cached, so you don't have to
re-download it every time you run a script. To do this use the
source_XlsxData: downloads and loads a data set in Excel format. The
function relies on the xlsx package and can take any arguments that
git_stamp: gets the git stamp (commit and branch) for a repository.
Thanks to Måns Magnusson.
scan_https: reads a character text file from a secure (https) site into R as
a single object.
set_valid_wd: sets a valid working directory from a vector of possible
directories. This is useful if you run the same script on multiple machines.
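The idea behind set_valid_wd can be sketched in base R. This is only an illustration under stated assumptions: the helper first_valid_dir is hypothetical and is not the package's implementation, and the directory paths are made up. It picks the first candidate directory that exists on the current machine and reports when none does.

```r
# Hypothetical helper (not part of repmis): return the first directory in
# `dirs` that exists on this machine, or NULL with a message if none does.
first_valid_dir <- function(dirs) {
  found <- dirs[dir.exists(dirs)]
  if (length(found) == 0) {
    message("No valid directory found.")
    return(NULL)
  }
  found[1]
}

# Candidate project directories for different machines (made-up paths).
# The session's temporary directory is included so that one candidate exists.
candidates <- c("/made/up/path/on/laptop", tempdir(), "/made/up/path/on/server")

wd <- first_valid_dir(candidates)

# Only change the working directory when a valid candidate was found.
if (!is.null(wd)) setwd(wd)
```

Keeping the candidate paths in a single vector at the top of a script means the same file can run unchanged on each machine that appears in the vector.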
The package is available for download from CRAN.
You can also download the most recent version using the
install repmis in R. Here is the exact code for installing the current
source_DropboxData is no longer supported due to changes in the
set_valid_wd informs the user if no valid directory is found.
set_valid_wd sets a valid working directory from a vector of possible directories.
xlsx package moved to Suggests. Most of the package can now be used without xlsx or its rJava dependency installed.
Minor internal change to options for sourcing data.
Minor documentation improvements.
source_DropboxData now uses fread from the data.table package rather than
read.table for faster, more robust data loading.
Note: this may break code from previous versions in some instances. Please check.
scan_https for reading a character text file from a secure (https) site
into R as a single object.
git_stamp function for getting the git stamp (commit and branch) of a
repository. Thanks to Måns Magnusson.
Improved SHA-1 hash messages.
source_data now also loads RDATA files. Thanks to Måns Magnusson.
source_XlsxData function for downloading and loading Excel files.
Internal code improvements
clearCache arguments to source_data. This allows the user
to cache the downloaded data frame so that it does not need to be downloaded
every time the function is called.
Added ability to pass arguments to
Improvements made to LoadandCite, largely suggested by R Journal reviewers.
If pkgs = NULL then non-base packages loaded in the current session are cited.
can use the
style argument to style the citations for the Journal of
automatically includes a citation for the current R version and can check to see if the version of R running matches a specified version.
Minor example changes.
Minor internal changes.
Minor changes to
If file = NULL then the packages are loaded but no BibTeX file is created.
Other internal improvements and bug fixes.
Drawing on devtools version 1.2, source_data now finds SHA-1 hashes
for files and lets the user compare these with the version of the file they have
downloaded. This makes it easier to see whether you downloaded the file you
thought you had downloaded.
source_GitHubData is now deprecated and will not be updated from this point.
Minor default settings change for data commands.
source_DropboxData for loading plain-text data from a non-Public Dropbox
Big hat tip to Kay Cichini (http://thebiobucket.blogspot.com/2013/04/download-files-from-dropbox.html)
source_GitHubData turned into a wrapper for
Changed how the default repo in LoadandCite is determined/set.
LoadandCite repo option now gets repo information from
.Rprofile. Thanks to
Add tools to install old R package versions both in a standalone function
(InstallOldPackages) and as part of
package argument for
LoadandCite is deprecated. Use
This is so the syntax matches install.packages syntax better.
Documentation and external package load fixes.
Includes the functions LoadandCite for loading and citing R packages as well
as source_GitHubData for downloading plain-text data from GitHub.