Miscellaneous Tools for Reproducible Research

Tools to load 'R' packages and automatically generate BibTeX files citing them, as well as to load and cache plain-text and 'Excel' formatted data stored on 'GitHub' and other sources.

Version 0.4.5

repmis currently has the following functions:

  • LoadandCite: a function for installing and loading R packages. The command also creates a BibTeX bibliography file with package citations.

  • InstallOldPackages: installs specific R package versions.

  • source_data: loads plain-text formatted data (e.g. CSV, TSV) or RData files stored at a URL (both http and https) into R. The command can download data from almost any secure (https) URL, including Dropbox Public folders and Google Docs data sets published as plain text (see the Google Docs support pages for details; currently only the old Google Sheets supports publishing sheets to the Web as plain-text files).

    • source_data and all of the other data download commands in repmis find and report the SHA-1 hash of each file they load. You can use a file's SHA-1 hash to make sure you are downloading the file, and the version of the file, that you think you are downloading. Note: if you are using source_data to download data from GitHub, source_data's SHA-1 hash is not the same as the Git commit's SHA-1 hash. (Thanks to Hadley Wickham's devtools package for the code to make this possible.)

    • Data downloaded with source_data can be cached so that you don't have to re-download it every time you run a script. To do this, use the cache argument.

  • source_XlsxData: downloads and loads a data set in Excel format. The function relies on the xlsx package and can take any arguments that read.xlsx can.

  • git_stamp: gets the git stamp (commit and branch) for a repository. Thanks to Måns Magnusson.

  • scan_https: read a character text file from a secure (https) site into R as a single object.

  • set_valid_wd: sets a valid working directory from a vector of possible directories. This is useful if you run the same script on multiple machines.
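
As an illustration of the sha1 and cache arguments described above, here is a minimal sketch of a source_data call. The URL and the wrapper name load_remote_data are hypothetical placeholders, and the call assumes repmis is installed.

```r
# Hypothetical URL: substitute the address of your own plain-text data file.
data_url <- "https://example.com/path/to/data.csv"

# Hypothetical wrapper around repmis::source_data.
load_remote_data <- function(url, expected_sha1 = NULL) {
  # sha1:  a known SHA-1 hash to check the downloaded file against
  #        (NULL means no expected hash is supplied)
  # cache: keep a local copy so later runs need not download the file again
  repmis::source_data(url, sha1 = expected_sha1, cache = TRUE)
}

# The first call downloads the file and reports its SHA-1 hash.
# Record that hash and pass it as expected_sha1 in later runs.
# my_data <- load_remote_data(data_url)
```

Wrapping the call in a function keeps the URL and the recorded hash in one place, so a script that runs on several machines requests the same file version everywhere.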

The package is available for download from CRAN.

You can also install the most recent development version of repmis from GitHub using the devtools command install_github.
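
A minimal sketch of that installation, assuming the development repository is christophergandrud/repmis (the one named in the bug-report URL on this page) and that devtools is already installed:

```r
# install.packages("devtools")  # uncomment if devtools is not yet installed

# Assumption: the repository path is taken from the issue-tracker URL.
devtools::install_github("christophergandrud/repmis")
```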



## Version 0.5

Unfortunately, source_DropboxData is no longer supported due to changes in the Dropbox API.

## Version 0.4.4

set_valid_wd informs the user if no valid directory is found.

## Version 0.4.3

set_valid_wd sets a valid working directory from a vector of possible directories.

The xlsx package has moved to Suggests. Most of the package can now be used without xlsx, or its rJava dependency, installed.
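
The behaviour described for set_valid_wd can be sketched in base R. This is an illustration of the idea only, not the package's source code; the helper name set_first_valid_wd and the example paths are hypothetical.

```r
# Hypothetical helper: switch to the first directory in `dirs` that exists.
set_first_valid_wd <- function(dirs) {
  for (d in dirs) {
    if (dir.exists(d)) {
      setwd(d)
      return(invisible(d))  # stop at the first match
    }
  }
  # None of the candidates exist: tell the user and leave the
  # working directory unchanged.
  message("No valid directory found.")
  invisible(NULL)
}

# Example with hypothetical paths for two different machines:
# set_first_valid_wd(c("~/projects/analysis", "C:/Users/me/analysis"))
```

Listing every machine's project path in one vector lets the same script run unchanged wherever one of those paths exists.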

Minor internal change to options for sourcing data.

Minor documentation improvements.

source_data and source_DropboxData now use fread from the data.table package rather than read.table for faster, more robust data loading.

Note: this may break code written for previous versions in some instances. Please check.

Added scan_https for reading a character text file from a secure (https) site into R as a single object.

Added the git_stamp function for getting the git stamp (commit and branch) for a repository. Thanks to Måns Magnusson.

Improved SHA-1 hash messages.

source_data now also loads RDATA files. Thanks to Måns Magnusson.

Added source_XlsxData function for downloading and loading Excel files.

Internal code improvements.

Internal improvements.

Added cache and clearCache arguments to source_data. These allow the user to cache the downloaded data frame so that it does not need to be downloaded every time the function is called.

Added ability to pass arguments to source_data, source_DropboxData, and source_GitHubData.

Improvements made to LoadandCite largely suggested by R Journal reviewers. These include:

  • if pkgs = NULL then non-base packages loaded in the current session are cited.

  • can use the style argument to style the citations for the Journal of Statistical Software.

  • automatically includes a citation for the current R version and can check to see if the version of R running matches a specified version.
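
The BibTeX output described in this list can be approximated with base R alone. The following is a minimal sketch of the idea, not LoadandCite's implementation; the helper name write_package_bib is hypothetical.

```r
# Hypothetical helper: write BibTeX entries for R itself and for `pkgs`.
write_package_bib <- function(pkgs, file) {
  # citation() with no argument cites the running version of R
  entries <- c(list(citation()), lapply(pkgs, citation))
  # toBibtex() turns each citation into BibTeX lines; "" separates entries
  bib <- unlist(lapply(entries, function(entry) {
    c(as.character(toBibtex(entry)), "")
  }))
  writeLines(bib, file)
  invisible(bib)
}

# Cite R and the 'stats' package in a temporary .bib file
bib_file <- tempfile(fileext = ".bib")
write_package_bib("stats", bib_file)
```

Unlike LoadandCite, this sketch neither installs nor loads the packages, and it does not restyle entries for a particular journal.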

Minor example changes.

Minor internal changes.

Minor changes to LoadandCite: if file = NULL, the packages are loaded but no BibTeX file is created.

Other internal improvements and bug fixes.

Drawing on devtools version 1.2, source_data now finds SHA-1 hashes for files and lets the user compare these with the version of the file they have downloaded. This makes it easier to check that the file you downloaded is the one you intended to download.

source_GitHubData is now deprecated and will not be updated from this point. Please use source_data instead.
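
To show what such a hash comparison involves, here is a small offline sketch using the digest package (one of the packages repmis imports). It illustrates the general technique rather than the package's internal code; the file contents and variable names are made up for the example.

```r
library(digest)

# Write a small file with known contents (binary mode avoids
# platform-specific line-ending changes).
tmp <- tempfile(fileext = ".txt")
writeBin(charToRaw("hello\n"), tmp)

# SHA-1 hash of the file's bytes: a 40-character hexadecimal string.
file_sha1 <- digest(tmp, algo = "sha1", file = TRUE)

# Hypothetical check against a hash recorded on an earlier run.
recorded_sha1 <- file_sha1
same_file <- identical(file_sha1, recorded_sha1)

# Changing the contents changes the hash, so a mismatch signals
# that the file is not the version you expected.
writeBin(charToRaw("hello, world\n"), tmp)
changed_sha1 <- digest(tmp, algo = "sha1", file = TRUE)
still_same <- identical(changed_sha1, recorded_sha1)
```

Here same_file is TRUE and still_same is FALSE, because the hash depends only on the bytes in the file.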

Minor default settings change for data commands.

Add source_DropboxData for loading plain-text data from a non-Public Dropbox folder.

Big hat tip to Kay Cichini (http://thebiobucket.blogspot.com/2013/04/download-files-from-dropbox.html)

Add source_data. source_GitHubData turned into a wrapper for source_data.

Change how default repo in LoadandCite is determined/set.

## Version 0.2.1

LoadandCite repo option now gets repo information from .Rprofile. Thanks to Karthik Ram.

  • Add tools to install old R package versions both in a stand alone function (InstallOldPackages) and as part of LoadandCite.

  • The package argument for LoadandCite is deprecated. Use pkgs instead, so that the syntax better matches that of install.packages.

Documentation and external package load fixes.

Includes the functions LoadandCite for loading and citing R packages as well as source_GitHubData for downloading plain-text data from GitHub.

0.5 by Christopher Gandrud, a year ago


Report a bug at https://github.com/christophergandrud/repmis/issues

Browse source code at https://github.com/cran/repmis

Authors: Christopher Gandrud [aut, cre]

Task views: Web Technologies and Services

GPL (>= 3) license

Imports data.table, digest, httr, plyr, R.cache

Suggests xlsx