Task view: Web Technologies and Services

Last updated on 2021-10-07 by Scott Chamberlain, Thomas Leeper, Patrick Mair, Karthik Ram, Christopher Gandrud

This Task View contains information about how to use R and the world wide web together. The base version of R does not ship with many tools for interacting with the web. Thankfully, an increasingly large number of add-on packages fill that gap. This task view focuses on packages for obtaining web-based data and information, frameworks for building web-based R applications, and online services that can be accessed from R. A list of available packages and functions is presented below, grouped by the type of activity. The rOpenSci Task View: Open Data provides further discussion of online data sources that can be accessed from R.

If you have any comments or suggestions for additions or improvements for this Task View, go to GitHub and submit an issue, or make some changes and submit a pull request. If you can’t contribute on GitHub, send Scott an email. If you have an issue with one of the packages discussed below, please contact the maintainer of that package.

Tools for Working with the Web from R

Core Tools For HTTP Requests

There are three main packages that should cover most use cases of interacting with the web from R. crul is an R6-based HTTP client that provides asynchronous HTTP requests, a pagination helper, HTTP mocking via webmockr, and request caching for unit tests via vcr. crul targets R developers more than end users. httr provides a more user-facing client for HTTP requests and differs from crul in that it provides support for OAuth. Note that you can pass in additional curl options when you instantiate R6 classes in crul, and via the config parameter in httr. curl is a lower-level package that provides a closer interface between R and the libcurl C library, but is less user-friendly. curl underlies both crul and httr. curl may be useful for operations on web-based XML or to perform FTP operations (as crul and httr are focused primarily on HTTP). curl::curl() is an SSL-compatible replacement for base R’s url() and has support for HTTP/2, SSL (https, ftps), gzip, deflate and more. For websites serving insecure HTTP (i.e. using the “http” not “https” prefix), most R functions can extract data directly, including read.table and read.csv; this also applies to functions in add-on packages such as jsonlite::fromJSON() and XML::xmlParse(). For more specific situations, the following resources may be useful:

  • RCurl is another low level client for libcurl. Of the two low-level curl clients, we recommend using curl. httpRequest is another low-level package for HTTP requests that implements the GET, POST and multipart POST verbs, but we do not recommend its use.
  • request provides a high-level package that is useful for developing other API client packages. httping provides simplified tools to ping and time HTTP requests, around httr calls. httpcache provides a mechanism for caching HTTP requests.
  • For dynamically generated webpages (i.e., those requiring user interaction to display results), RSelenium can be used to automate those interactions and extract page contents. It provides a set of bindings for the Selenium 2.0 webdriver using the JsonWireProtocol. It can also aid in automated application testing, load testing, and web scraping. seleniumPipes (GitHub) provides a “pipe”-oriented interface to the same. An alternative to the former two packages is splashr, which aims to be a lightweight alternative. cpsievert/rdom (not on CRAN) uses phantomjs to access a webpage’s Document Object Model (DOM).
  • For capturing static content of web pages, postlightmercury is a client for the web service Mercury that turns web pages into structured and clean text.
  • Another, higher-level alternative package useful for webscraping is rvest, which is designed to work with magrittr to make it easy to express common web scraping tasks.
  • Many base R tools can be used to download web content, provided that the website does not use SSL (i.e., the URL does not have the “https” prefix). download.file() is a general purpose function that can be used to download a remote file. For SSL, the download() function in downloader wraps download.file(), and takes all the same arguments.
  • Tabular data sets (e.g., txt, csv, etc.) can be input using read.table(), read.csv(), and friends, again assuming that the files are not hosted via SSL. An alternative is to use httr::GET (or RCurl::getURL) to first read the file into R as a character vector before parsing with read.table(text=...), or you can download the file to a local directory. rio (GitHub) provides an import() function that can read a number of common data formats directly from an https:// URL. The repmis function source_data() can load and cache plain-text data from a URL (either http or https). That package also includes source_Dropbox() for downloading/caching plain-text data from non-public Dropbox folders and source_XlsxData() for downloading/caching Excel xlsx sheets.
  • Authentication: Using web resources can require authentication, either via API keys, OAuth, username:password combination, or via other means. Additionally, some web resources require that the authentication be in the header of an HTTP call, which requires a little bit of extra work. API keys and username:password combos can be combined within a URL for a call to a web resource, or can be specified via commands in RCurl or httr. OAuth is the most complicated authentication process, and can be most easily done using httr. See the 6 demos within httr, three for OAuth 1.0 (linkedin, twitter, vimeo) and three for OAuth 2.0 (facebook, GitHub, google). ROAuth is a package that provides a separate R interface to OAuth. OAuth is easier to do in httr, so start there. googleAuthR provides an OAuth 2.0 setup specifically for Google web services, and AzureAuth provides similar functionality for Azure Active Directory.
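
A minimal sketch of two patterns mentioned in the list above: parsing a text body that is already in memory with read.csv(text = ...), and preparing authentication for an httr request. The CSV content, user name, password, token, and example.org URL are invented placeholders; in practice the character string might come from something like httr::content(httr::GET(url), as = "text"). No network request is made here.

```r
library(httr)

# Stand-in for a response body that has already been fetched as text.
# (Hypothetical data, used only so the example runs offline.)
csv_text <- "id,value\n1,3.5\n2,4.25\n"

# Parse the in-memory text instead of reading from a file or URL.
dat <- read.csv(text = csv_text)

# Username:password authentication expressed as an httr config object.
basic_auth <- authenticate("some_user", "some_password")

# Authentication carried in a request header, e.g. a bearer token.
token_header <- add_headers(Authorization = "Bearer PLACEHOLDER_TOKEN")

# Either object could then be passed along with a request, for example:
# GET("https://example.org/api", basic_auth)
# GET("https://example.org/api", token_header)
```

Keeping the credentials in separate objects like this makes it easy to reuse them across several calls; real secrets would normally be read from environment variables rather than written into a script.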

Handling HTTP Errors/Codes

  • fauxpas brings a set of Ruby- or Python-like R6 classes for each individual HTTP status code, allowing simple and verbose messages, with a choice of using messages, warnings, or stops.
  • httpcode is a simple package to help a user/package find HTTP status codes and associated messages by name or number.
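
The idea of choosing between a message, a warning, or a stop for a given status code can be sketched in base R. The helper handle_status() below is hypothetical and is not the API of fauxpas or httpcode; it only illustrates the pattern under the assumption that codes of 400 and above are treated as failures.

```r
# Hypothetical helper (not part of fauxpas or httpcode): react to an HTTP
# status code with a message, a warning, or an error.
handle_status <- function(code, behavior = c("stop", "warning", "message")) {
  behavior <- match.arg(behavior)
  if (code < 400) {
    # Informational, success, and redirection codes: nothing to report.
    return(invisible(code))
  }
  kind <- if (code >= 500) "Server error" else "Client error"
  text <- sprintf("%s (HTTP %d)", kind, as.integer(code))
  switch(behavior,
    stop = stop(text, call. = FALSE),
    warning = warning(text, call. = FALSE),
    message = message(text)
  )
  invisible(code)
}

# A 404 reported as a warning rather than an error; the warning text is
# captured so that it can be inspected.
res <- tryCatch(
  handle_status(404, "warning"),
  warning = function(w) conditionMessage(w)
)
```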

Parsing Structured Web Data

The vast majority of web-based data is structured as plain text, HTML, XML, or JSON (JavaScript Object Notation). Web service APIs increasingly rely on JSON, but XML is still prevalent in many applications. There are several packages for specifically working with these formats. These functions can be used to interact directly with insecure web pages or can be used to parse locally stored or in-memory web files.

  • XML: There are two packages for working with XML: XML and xml2 (GitHub). Both support general XML (and HTML) parsing, including XPath queries. The package xml2 is less fully featured, but more user friendly with respect to memory management, classes (e.g., XML node vs. node set vs. document), and namespaces. Of the two, only XML supports de novo creation of XML nodes and documents. The XML2R (GitHub) package is a collection of convenient functions for coercing XML into data frames. A complement to these is selectr, which parses CSS3 selectors and translates them to XPath 1.0 expressions, so you can query documents with CSS selectors instead of XPath.
  • HTML: All of the tools that work with XML also work for HTML, though HTML is, in practice, more prone to be malformed. Some tools are designed specifically to work with HTML. xml2::read_html() is a good first function to use for importing HTML. htmltools provides functions to create HTML elements. The selectorgadget browser extension can be used to identify page elements. RHTMLForms reads HTML documents and obtains a description of each of the forms it contains, along with the different elements and hidden fields. scrapeR provides additional tools for scraping data from HTML documents. htmltidy (GitHub) provides tools to “tidy” messy HTML documents. htm2txt uses regular expressions to convert HTML documents to plain text by removing all HTML tags. Rcrawler does crawling and scraping of web pages.
  • JSON: There are several packages for reading and writing JSON: rjson, RJSONIO, and jsonlite. jsonlite includes a different parser from RJSONIO called yajl. We recommend using jsonlite. Check out the paper describing jsonlite by Jeroen Ooms: https://arxiv.org/abs/1403.2805. jqr provides bindings for the fast JSON library, jq. jsonvalidate (GitHub) validates JSON against a schema using the “is-my-json-valid” JavaScript library; ajv does the same using the ajv JavaScript library. ndjson (GitHub) supports the “ndjson” format.
  • RSS/Atom: tidyRSS parses RSS, Atom XML/JSON and geoRSS into a tidy data.frame.
  • swagger can be used to automatically generate functions for working with a web service API that provides documentation in Swagger.io format.
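
As a small illustration of parsing in-memory web data, the sketch below reads a JSON string with jsonlite and runs an XPath query with xml2. The JSON and XML snippets are made up for the example; real input would typically be the body of an HTTP response.

```r
library(jsonlite)
library(xml2)

# JSON: an array of objects is simplified into a data frame.
json_text <- '[{"name": "a", "n": 1}, {"name": "b", "n": 2}]'
records <- fromJSON(json_text)

# XML: parse a string, then select nodes with an XPath expression.
xml_source <- '<items><item id="1">first</item><item id="2">second</item></items>'
doc <- read_xml(xml_source)
items <- xml_find_all(doc, "//item")

# Pull the attribute values and text content out of the selected nodes.
item_ids <- xml_attr(items, "id")
item_labels <- xml_text(items)
```

The same xml2 functions work on HTML parsed with xml2::read_html(), which is usually the more forgiving entry point for real web pages.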

Tools for Working with URLs

  • The httr::parse_url() function can be used to extract portions of a URL. The RCurl::URLencode() and utils::URLencode() functions can be used to encode character strings for use in URLs. utils::URLdecode() decodes back to the original strings. urltools (GitHub) can also handle URL encoding, decoding, parsing, and parameter extraction.
  • iptools can facilitate working with IPv4 addresses, including for use in geolocation. A similar package, ipaddress, handles IPv4 and IPv6 addresses and networks.
  • urlshorteneR offers URL expansion and analysis for Bit.ly, Goo.gl, and is.gd. longurl uses the longurl.org API to provide similar functionality.
  • gdns provides access to Google’s secure HTTP-based DNS resolution service.
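
A brief sketch of the encoding and parsing functions named above, assuming httr is installed; the example.org URL and query values are placeholders.

```r
library(httr)

# Percent-encode a string for use in a URL. reserved = TRUE also encodes
# reserved characters such as "&".
encoded <- utils::URLencode("R & the web", reserved = TRUE)

# Decode back to the original string.
decoded <- utils::URLdecode(encoded)

# Split a URL into its components.
parts <- parse_url("https://example.org/search?q=rstats&page=2")
host <- parts$hostname
query_term <- parts$query$q
```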

Tools for Working with Scraped Webpage Contents

  • Several packages can be used for parsing HTML documents. boilerpipeR provides generic extraction of main text content from HTML files; removal of ads, sidebars and headers using the boilerpipe Java library. RTidyHTML interfaces to the libtidy library for correcting HTML documents that are not well-formed. This library corrects common errors in HTML documents. W3CMarkupValidator provides an R Interface to W3C Markup Validation Services for validating HTML documents.
  • For XML documents, the XMLSchema package provides facilities in R for reading XML schema documents and processing them to create definitions for R classes and functions for converting XML nodes to instances of those classes. It provides the framework for meta-computing with XML schema in R. xslt is an extension for the xml2 package to transform XML documents by applying an XSLT style-sheet. (It can be seen as a modern replacement for Sxslt, which is an interface to Dan Veillard’s libxslt translator, and the SXalan package.) This may be useful for webscraping, as well as transforming XML markup into another human- or machine-readable format (e.g., HTML, JSON, plain text, etc.). SSOAP provides a client-side SOAP (Simple Object Access Protocol) mechanism. Beware that SSOAP itself, or its dependencies, may fail to install. The best bet is to get the web service maintainers to switch to REST. XMLRPC provides an implementation of XML-RPC, a relatively simple remote procedure call mechanism that uses HTTP and XML. This can be used for communicating between processes on a single machine or for accessing Web services from within R.
  • Rcompression (not on CRAN): Interface to zlib and bzip2 libraries for performing in-memory compression and decompression in R. This is useful when receiving or sending contents to remote servers, e.g. Web services, HTTP requests via RCurl.
  • tm.plugin.webmining: Extensible text retrieval framework for news feeds in XML (RSS, ATOM) and JSON formats. Currently, the following feeds are implemented: Google Blog Search, Google Finance, Google News, NYTimes Article Search, Reuters News Feed, Yahoo Finance and Yahoo Inplay.
  • webshot uses PhantomJS to provide screenshots of web pages without a browser. It can be useful for testing websites (such as Shiny applications).

Other Useful Packages and Functions

  • JavaScript: V8 is an R interface to Google’s open source, high performance JavaScript engine. It can wrap JavaScript libraries as well as NPM packages. The SpiderMonkey package provides another means of evaluating JavaScript code, creating JavaScript objects and calling JavaScript functions and methods from within R. This can work by embedding the JavaScript engine within an R session or by embedding R in a browser such as Firefox and being able to call R from JavaScript and call back to JavaScript from R. The js package wraps V8 and validates, reformats, optimizes and analyzes JavaScript code.
  • Email: mailR is an interface to Apache Commons Email to send emails from within R. sendmailR provides a simple SMTP client. gmailr provides access to Google’s gmail.com RESTful API.
  • Mocking: webmockr is a library for stubbing and setting expectations on HTTP requests. It is inspired by Ruby’s webmock. This package only helps mock HTTP requests, and returns nothing when requests match expectations. webmockr integrates with the HTTP packages crul and httr. See Testing for mocking with returned responses.
  • Testing: vcr provides an interface to easily cache HTTP requests in R package test suites (but can be used outside of testing use cases as well). vcr relies on webmockr to do the HTTP request mocking. vcr integrates with the HTTP packages crul and httr. httptest provides a framework for testing packages that communicate with HTTP APIs, offering tools for mocking APIs, for recording real API responses for use as mocks, and for making assertions about HTTP requests, all without requiring a live connection to the API server at runtime. httptest only works with httr.
  • Miscellaneous: webutils contains various functions for developing web applications, including parsers for application/x-www-form-urlencoded as well as multipart/form-data. mime (GitHub) guesses the MIME type for a file from its extension. rsdmx provides tools to read data and metadata documents exchanged through the Statistical Data and Metadata Exchange (SDMX) framework. The package currently focuses on the SDMX XML standard format (SDMX-ML). robotstxt provides functions and classes for parsing robots.txt files and checking access permissions; spiderbar does the same. uaparserjs (GitHub) uses the javascript “ua-parser” library to parse User-Agent HTTP headers. rapiclient is a client for consuming APIs that follow the Open API format. restfulr models a RESTful service as if it were a nested R list.

Web and Server Frameworks

  • Model Operationalization (previously DeployR) is a Microsoft product that provides support for deploying R and Python models and code to a server as a web service to later consume.
  • The shiny package makes it easy to build interactive web applications with R.
  • dash is a web framework which is available for Python, R and Julia, with components written in React.js.
  • Other web frameworks include: fiery, which is meant to be more flexible but less easy to use than shiny (reqres and routr are utilities used by fiery that provide HTTP request and response classes, and HTTP routing, respectively); att/rcloud, which provides an iPython notebook-style web-based R interface; and Rook, which contains the specification and convenience software for building and running Rook applications.
  • The opencpu framework for embedded statistical computation and reproducible research exposes a web API interfacing R, LaTeX and Pandoc. This API is used for example to integrate statistical functionality into systems, share and execute scripts or reports on centralized servers, and build R based apps.
  • Several general purpose server/client frameworks for R exist. Rserve and RSclient provide server and client functionality for TCP/IP or local socket interfaces. httpuv provides low-level socket and protocol support for handling HTTP and WebSocket requests directly within R. Another related package, which httpuv perhaps replaces, is websockets. servr provides a simple HTTP server to serve files under a given directory based on httpuv.
  • Several packages offer functionality for turning R code into a web API. FastRWeb provides some basic infrastructure for this. plumber allows you to create a REST API by decorating existing R source code.
  • The WADL package provides tools to process Web Application Description Language (WADL) documents and to programmatically generate R functions to interface to the REST methods described in those WADL documents. (not on CRAN)
  • The RDCOMServer package provides a mechanism to export R objects as (D)COM objects in Windows. It can be used along with the RDCOMClient package which provides user-level access from R to other COM servers. (not on CRAN)
  • rapporter.net provides an online environment (SaaS) to host and run rapport statistical report templates in the cloud.
  • radiant (GitHub) is a Shiny-based GUI for R that runs in a browser from a server or local machine.
  • The Tiki Wiki CMS/Groupware framework has an R plugin (PluginR) to run R code from wiki pages, and use data from their own collected web databases (trackers). A demo: https://r.tiki.org/tiki-index.php.
  • MediaWiki has an extension (Extension:R) to run R code from wiki pages, and use uploaded data. A mailing list used to be available: R-sig-mediawiki.
  • whisker: Implementation of logicless templating based on Mustache in R. Mustache syntax is described in http://mustache.github.io/mustache.5.html
  • CGIwithR (not on CRAN) allows one to use R scripts as CGI programs for generating dynamic Web content. HTML forms and other mechanisms to submit dynamic requests can be used to provide input to R scripts via the Web to create content that is determined within that R script.
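
To illustrate the logicless templating that whisker provides, here is a minimal sketch that fills a Mustache template from an R list. The template text and values are invented for the example.

```r
library(whisker)

# A Mustache template: {{name}} and {{count}} are placeholders that are
# looked up in the data supplied at render time.
template <- "Hello {{name}}! You have {{count}} new messages."

# The values to substitute, given as a named list.
values <- list(name = "Ada", count = 3)

rendered <- whisker.render(template, values)
```

Because the template contains no logic, the same text could be shared with Mustache implementations in other languages.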

Web Services

Cloud Computing and Storage

  • The cloudyr project aims to provide interfaces to popular Amazon, Azure and Google cloud services without the need for external system dependencies.
  • Amazon Web Services is a popular, proprietary cloud service offering a suite of computing, storage, and infrastructure tools. aws.signature provides functionality for generating AWS API request signatures.
    • Elastic Compute Cloud (EC2) is a cloud computing service. segue (not on CRAN) is a package for managing EC2 instances and S3 storage, which includes a parallel version of lapply() for the Elastic Map Reduce (EMR) engine called emrlapply(). It uses Hadoop Streaming on Amazon’s EMR in order to get simple parallel computation.
    • DBREST: RAmazonDBREST provides an interface to Amazon’s Simple DB API.
    • paws (GitHub) is an interface to nearly all AWS APIs, including compute, storage, databases, and machine learning. It also requires no external system dependencies.
  • Azure and Microsoft 365 are Microsoft’s cloud computing services. The Azure platform provides PaaS, SaaS and IaaS and supports many different tools and frameworks, including both Microsoft-specific and third-party systems, while Microsoft 365 is a unified framework for accessing cloud data from Microsoft’s Office services, Windows and Dynamics. The AzureR package family aims to provide a suite of lightweight, powerful tools for working with Azure in R. The packages listed below are part of the family, and are also mirrored at the cloudyr project.
    • Azure Active Directory (AAD) is a centralized directory and identity service. AzureAuth is an R client for AAD; use this to obtain OAuth tokens for authenticating with other Azure services, including Resource Manager and storage (see next).
    • Microsoft Graph is the API framework for the Microsoft 365 platform, including Azure Active Directory and Office. AzureGraph is a low-level extensible R6-based interface to Graph. Microsoft365R is an interface to the Office part of Microsoft 365, including OneDrive and SharePoint Online.
    • Azure Resource Manager (ARM) is a service for deploying other Azure services. AzureRMR is an R interface to ARM, and allows managing subscriptions, resource groups, resources and templates. It exposes a general R6 class framework that can be extended to provide extra functionality for specific services (see next).
    • Azure Storage Accounts are a general-purpose data storage facility. Different types of storage are available: file, blob, table, Data Lake, and more. AzureStor provides an R interface to storage. Features include clients for file, blob and Data Lake Gen2 storage, parallelized file transfers, and an interface to Microsoft’s cross-platform AzCopy command line utility. Also supplied is an ARM interface, to allow creation and managing of storage accounts. AzureTableStor and AzureQstor extend AzureStor to provide interfaces to table storage and queue storage respectively.
    • AzureVM is a package for creating and managing virtual machines in Azure. It includes templates for a wide variety of common VM specifications and operating systems, including Windows, Ubuntu, Debian and RHEL.
    • AzureContainers provides a unified facility for working with containers in Azure. Specifically, it includes R interfaces to Azure Container Instances (ACI), Azure Container Registry (ACR) and Azure Kubernetes Service (AKS). Create Docker images and push them to an ACR repository; spin up ACI containers; deploy Kubernetes services in AKS.
    • Azure Data Explorer, also known as Kusto, is a fast, scalable data exploration and analytics service. AzureKusto is an R interface to ADE/Kusto. It includes a dplyr client interface similar to that provided by dbplyr for SQL databases, a DBI client interface, and an ARM interface for deploying and managing Kusto clusters and databases.
    • Azure Cosmos DB is a multi-model NoSQL database service, previously known as Document DB. AzureCosmosR is an interface to the core/SQL API for Cosmos DB. It also includes simple bridges to the table storage and MongoDB APIs.
    • Azure Computer Vision and Azure Custom Vision are AI services for image recognition and analysis. Computer Vision is a pre-trained service for handling commonly-encountered tasks, while Custom Vision allows you to train your own image recognition model on a custom dataset. AzureVision provides an interface to both these services.
  • googleComputeEngineR interacts with the Google Compute Engine API, and lets you create, start and stop instances in the Google Cloud.
  • Cloud Storage: googleCloudStorageR interfaces with Google Cloud Storage. boxr (GitHub) is a lightweight, high-level interface for the box.com API. rdrop2 is a Dropbox interface that provides access to a full suite of file operations, including dir/copy/move/delete operations, account information (including quotas) and the ability to upload and download files from any Dropbox account.
  • Docker: analogsea is a general purpose client for the Digital Ocean v2 API. In addition, the package includes functions to install various R tools including base R, RStudio server, and more. There’s an improving interface to interact with docker on your remote droplets via this package.
  • crunch (GitHub) provides an interface to the crunch.io storage and analytics platform. crunchy (GitHub) facilitates making Shiny apps on Crunch.
  • rrefine provides a client for the OpenRefine (formerly Google Refine) data cleaning service.

Document and Code Sharing

  • Code Sharing: gistr (GitHub) works with GitHub gists (gist.github.com) from R, allowing you to create new gists, update gists with new files, rename files, delete files, get and delete gists, star and un-star gists, fork gists, open a gist in your default browser, get embed code for a gist, list gist commits, and get rate limit information when authenticated. git2r provides bindings to the git version control system and gh is a client for the GitHub API. gitlabr is a GitLab-specific client.
  • Data archiving: rfigshare (GitHub) connects with Figshare.com. dataone (GitHub) provides a client for DataONE repositories.
  • Google Drive/Google Documents: The RGoogleDocs package is an example of using the RCurl and XML packages to quickly develop an interface to the Google Documents API. RGoogleStorage provides programmatic access to the Google Storage API. This allows R users to access and store data on Google’s storage. We can upload and download content, create, list and delete folders/buckets, and set access control permissions on objects and buckets.
  • Google Sheets: googlesheets (GitHub) can access private or public Google Sheets by title, key, or URL. Extract data or edit data. Create, delete, rename, copy, upload, or download spreadsheets and worksheets. gsheet (GitHub) can download Google Sheets using just the sharing link. Spreadsheets can be downloaded as a data frame, or as plain text to parse manually.
  • imguR (GitHub) is a package to share plots using the image hosting service Imgur.com. knitr also has a function imgur_upload() to upload images from literate programming documents.
  • SharePoint and OneDrive: Microsoft365R provides an interface to these services, which form part of the Microsoft 365 (formerly known as Office 365) suite.

Data Analysis and Processing Services

  • Geospatial/Geolocation/Geocoding: Several packages connect to geolocation/geocoding services. rgeolocate (GitHub) offers several online and offline tools. trestletech/rydn (not on CRAN) is an interface to the Yahoo Developers network geolocation APIs, and hrbrmstr/ipapi can be used to geolocate IPv4/6 addresses and/or domain names using the http://ip-api.com/ API. opencage (GitHub) provides access to the OpenCage geocoding service. hrbrmstr/nominatim (not on CRAN) connects to the OpenStreetMap Nominatim API for reverse geocoding. ropensci/PostcodesioR (not on CRAN) provides post code lookup and geocoding for the United Kingdom. geosapi is an R client for the GeoServer REST API, an open source implementation used widely for serving spatial data. geonapi provides an interface to the GeoNetwork legacy API, an open source catalogue for managing geographic metadata. ows4R is a new R client for the OGC standard Web-Services, such as Web Feature Service (WFS) for data and Catalogue Service (CSW) for metadata.
  • Machine Learning as a Service: Several packages provide access to cloud-based machine learning services. OpenML (GitHub) is the official client for the OpenML API. clarifai (GitHub) is a Clarifai.com client that enables automated image description. rLTP (GitHub) accesses the ltp-cloud service. languagelayeR is a client for Languagelayer, a language detection API. googlepredictionapi (not on CRAN) is an R client for the Google Prediction API, a suite of cloud machine learning tools. yhatr lets you deploy, maintain, and invoke models via the Yhat REST API. datarobot works with Data Robot’s predictive modeling platform. mscsweblm4r (GitHub) interfaces with the Microsoft Cognitive Services Web Language Model API and mscstexta4r (GitHub) uses the Microsoft Cognitive Services Text Analytics REST API. rosetteApi links to the Rosette text analysis API. googleLanguageR provides interfaces to Google’s Cloud Translation API, Natural Language API, Cloud Speech API, and the Cloud Text-to-Speech API. AzureVision provides interfaces to the Azure Computer Vision and Custom Vision image recognition services.
  • Machine Translation: translate provides bindings for the Google Translate API v2 and translateR provides bindings for both Google and Microsoft translation APIs. RYandexTranslate (GitHub) connects to Yandex Translate. transcribeR provides automated audio transcription via the HP IDOL service.
  • Document Processing: abbyyR (GitHub) and captr (GitHub) connect to optical character recognition (OCR) APIs. pdftables (GitHub) uses the PDFTables.com webservice to extract tables from PDFs.
  • Mapping: osmar provides infrastructure to access OpenStreetMap data from different sources, to work with the data in a common R manner, and to convert data into available infrastructure provided by existing R packages (e.g., into sp and igraph objects). osrm provides shortest paths and travel times from OpenStreetMap. osmplotr (GitHub) extracts customizable map images from OpenStreetMap. RgoogleMaps serves two purposes: it provides a comfortable R interface to query the Google server for static maps, and it uses the map as a background image to overlay plots within R. R2GoogleMaps provides a mechanism to generate JavaScript code from R that displays data using Google Maps. RKMLDevice allows you to create R graphics in Keyhole Markup Language (KML) format in a manner that allows them to be displayed on Google Earth (or Google Maps), and RKML provides users with high-level facilities to generate KML. ggmap allows for the easy visualization of spatial data and models on top of Google Maps, OpenStreetMaps, Stamen Maps, or CloudMade Maps using ggplot2. mapsapi is an sf-compatible interface to Google Maps API. leafletR allows you to display your spatial data on interactive web-maps using the open-source JavaScript library Leaflet. openadds (GitHub) is an Openaddresses client.
  • Online Surveys: qualtRics provides functions to interact with Qualtrics. WufooR (GitHub) can retrieve data from Wufoo.com forms. redcapAPI (GitHub) can provide access to data stored in a REDCap (Research Electronic Data CAPture) database, which is a web application for building and managing online surveys and databases developed at Vanderbilt University. rubenarslan/formr facilitates use of the formr survey framework, which is built on openCPU. Rexperigen is a client for the Experigen experimental platform.
  • Visualization: Plot.ly is a company that allows you to create visualizations on the web using R (and Python), which is accessible via plotly. googleVis provides an interface between R and the Google chart tools. The RUbigraph package provides an R interface to a Ubigraph server for drawing interactive, dynamic graphs. You can add and remove vertices/nodes and edges in a graph and change their attributes/characteristics such as shape, color, size.

Social Media Clients

  • Rfacebook provides an interface to the Facebook API.
  • The Rflickr package provides an interface to the Flickr photo management and sharing application Web service. (not on CRAN)
  • instaR (GitHub) is a client for the Instagram API.
  • Rlinkedin is a client for the LinkedIn API. Auth is via OAuth.
  • rpinterest connects to the Pinterest API.
  • vkR is a client for VK, a social networking site based in Russia.
  • rladies/meetupr is a client for the Meetup.com API.
  • Twitter: twitteR (GitHub) provides an interface to the Twitter web API. It claims to be deprecated in favor of rtweet (GitHub). gvegayon/twitterreport (not on CRAN) focuses on report generation based on Twitter data. streamR provides a series of functions that allow users to access Twitter’s filter, sample, and user streams, and to parse the output into data frames. OAuth authentication is supported. graphTweets produces a network graph from a data.frame of tweets. tweetscores (not on CRAN) implements a political ideology scaling measure for specified Twitter users.
  • brandwatchR is a package to retrieve data from the Brandwatch social listening API. Both raw text and aggregate statistics are available, as well as project and query management functions.

Web Analytics Services

  • Google Trends: gtrendsR offers functions to perform and display Google Trends queries. RGoogleTrends provides an alternative.
  • Google Analytics: googleAnalyticsR, ganalytics, and RGA provide functions for accessing and retrieving data from the Google Analytics APIs. The latter supports OAuth 2.0 authorization. RGA provides a shiny app to explore data. searchConsoleR links to the Google Search Console (formerly Webmaster Tools).
  • Online Advertising: fbRads can manage Facebook ads via the Facebook Marketing API. WillemPaling/RDoubleClick (not on CRAN) can retrieve data from Google’s DoubleClick Campaign Manager Reporting API. RSmartlyIO (GitHub) loads Facebook and Instagram advertising data provided by Smartly.io.
  • Other services: RSiteCatalyst has functions for accessing the Adobe Analytics (Omniture SiteCatalyst) Reporting API.
  • RAdwords (GitHub) is a package for loading Google Adwords data.
  • webreadr (GitHub) can process various common forms of request log, including the Common and Combined Web Log formats and AWS logs.
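
To make the Common Log Format mentioned above concrete, here is a minimal base-R sketch that splits such lines into fields. It is not how webreadr is implemented; the function name parse_clf() and the sample log line are illustrative assumptions, and the timestamp is deliberately left as text rather than converted to a date-time.

```r
# Parse Common Log Format (CLF) lines using base R only.
# parse_clf() is a hypothetical helper, not a function from webreadr.
parse_clf <- function(lines) {
  pattern <- paste0(
    "^(\\S+) (\\S+) (\\S+) ",  # host, identity, user
    "\\[([^]]+)\\] ",          # timestamp inside square brackets
    "\"([^\"]*)\" ",           # quoted request line
    "(\\d{3}) (\\d+|-)$"       # status code and response size
  )
  matches <- regmatches(lines, regexec(pattern, lines, perl = TRUE))
  matched <- lengths(matches) == 8L   # full match plus seven groups
  fields <- do.call(rbind, matches[matched])
  if (is.null(fields)) {
    fields <- matrix(character(0), ncol = 8L)  # no line matched
  }
  data.frame(
    host      = fields[, 2],
    user      = fields[, 4],
    timestamp = fields[, 5],   # kept as text in this sketch
    request   = fields[, 6],
    status    = as.integer(fields[, 7]),
    bytes     = suppressWarnings(as.integer(fields[, 8])),  # "-" becomes NA
    stringsAsFactors = FALSE
  )
}

# Illustrative input: one well-formed line and one line that is skipped.
log_lines <- c(
  '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326',
  'not a log line'
)
parsed <- parse_clf(log_lines)
```

Lines that do not match the pattern are silently dropped here; a dedicated package is likely the better choice for large or irregular log files.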

Web Services for R Package Development

  • R-Hub (http://log.r-hub.io/) is a project to enable package builds across all architectures. rhub is a package that interfaces with R-Hub to allow you to check a package on the platform.

Other Web Services

  • Push Notifications: RPushbullet provides an easy-to-use interface for the Pushbullet service, which provides fast and efficient notifications between computers, phones and tablets. pushoverr (GitHub) can send push notifications to mobile devices (iOS and Android) and desktop using Pushover. notifyme (GitHub) can control Philips Hue lighting.

  • Reference/bibliography/citation management: rorcid (GitHub) is a programmatic interface to the Orcid.org API, which can be used for identifying scientific authors and their publications (e.g., by DOI). rdatacite connects to DataCite, which manages DOIs and metadata for scholarly datasets. scholar provides functions to extract citation data from Google Scholar. Convenience functions are also provided for comparing multiple scholars and predicting future h-index values. rscopus provides functions to extract citation data from Elsevier Scopus APIs. mathpix converts an image of a formula (typeset or handwritten) to LaTeX code via the Mathpix web service. zen4R provides an interface to the Zenodo REST API, including management of depositions, attribution of DOIs by ‘Zenodo’ and upload of files.

  • Literature: rplos is a programmatic interface to the Web Service methods provided by the Public Library of Science journals for search. europepmc connects to the Europe PubMed Central service. pubmed.mineR is a package for text mining of PubMed Abstracts that supports fetching text and XML from PubMed. aRxiv is a client for the arXiv API, a repository of electronic preprints for computer science, mathematics, physics, quantitative biology, quantitative finance, and statistics. roadoi provides an interface to the Unpaywall API for finding free full-text versions of academic papers. rcoreoa is an interface to the CORE API, a search interface for open access scholarly articles. rcrossref is an interface to Crossref’s API; fulltext is a general purpose package to search for, retrieve and extract full text from scholarly articles; and rromeo (GitHub) is an interface to the SHERPA/RoMEO API, a database of scientific journal archival policies regarding pre-prints, post-prints, and accepted manuscripts.

  • Automated Metadata Harvesting: oai and OAIHarvester harvest metadata using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) standard.

  • Wikipedia: WikipediR (GitHub) is a wrapper for the MediaWiki API, aimed particularly at the Wikimedia ‘production’ wikis, such as Wikipedia. WikidataR (GitHub) can request data from Wikidata.org, the free knowledgebase. wikipediatrend (GitHub) provides access to Wikipedia page access statistics. WikidataQueryServiceR is a client for the Wikidata Query Service.

  • bigrquery (GitHub): An interface to Google’s BigQuery.

  • discgolf (GitHub) provides a client to interact with the API for the Discourse web forum platform. The API is for an installed instance of Discourse, not for the Discourse site itself.

  • stephlocke/mockaRoo (not on CRAN) uses the MockaRoo API to generate mock or fake data based on an input schema.

  • randNames (GitHub) generates random names and personal identifying information using the https://randomapi.com/ API.

  • rerddap: A generic R client to interact with any ERDDAP instance, which is a special case of OPeNDAP (https://en.wikipedia.org/wiki/OPeNDAP), or Open-source Project for a Network Data Access Protocol. It allows users to swap out the base URL to use any ERDDAP instance.

  • RStripe provides an interface to Stripe, an online payment processor.

  • slackr is a client for the Slack.com messaging platform.

  • dgrtwo/stackr (not on CRAN): An unofficial wrapper for the read-only features of the Stack Exchange API.

  • nealrichardson/useRsnap (not on CRAN) provides an interface to the API for Usersnap, a tool for collecting feedback from web application users.

  • duckduckr is an R interface to DuckDuckGo’s Instant Answer API.
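
As a rough illustration of the OAI-PMH protocol that oai and OAIHarvester build on, the sketch below assembles a ListRecords request URL in base R. It only constructs the URL and does not contact any server. The helper name oai_list_records_url() and the endpoint https://example.org/oai are placeholders chosen for this example, not part of either package.

```r
# Build an OAI-PMH "ListRecords" request URL using base R only.
# oai_list_records_url() is a hypothetical helper for illustration.
oai_list_records_url <- function(base_url, metadata_prefix = "oai_dc",
                                 from = NULL, until = NULL, set = NULL) {
  params <- list(
    verb = "ListRecords",
    metadataPrefix = metadata_prefix,
    from = from,     # optional lower date bound, e.g. "2021-01-01"
    until = until,   # optional upper date bound
    set = set        # optional set specification
  )
  # Drop the optional arguments that were not supplied.
  params <- params[!vapply(params, is.null, logical(1))]
  encoded <- vapply(
    params,
    function(x) utils::URLencode(as.character(x), reserved = TRUE),
    character(1)
  )
  paste0(base_url, "?", paste0(names(params), "=", encoded, collapse = "&"))
}

# Placeholder endpoint; substitute the base URL of a real repository.
request_url <- oai_list_records_url("https://example.org/oai",
                                    from = "2021-01-01")
```

The resulting URL could then be fetched with any of the HTTP clients discussed in this task view; the dedicated packages additionally handle response parsing and resumption tokens for paged results.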


curl — 4.3.2

A Modern and Flexible Web Client for R

httr — 1.4.2

Tools for Working with URLs and HTTP

crul — 1.1.0

HTTP Client

vcr — 1.0.2

Record 'HTTP' Calls to Disk

webmockr — 0.8.0

Stubbing and Setting Expectations on 'HTTP' Requests

jsonlite — 1.7.2

A Simple and Robust JSON Parser and Generator for R

shiny — 1.7.1

Web Application Framework for R

xml2 — 1.3.2

Parse XML

abbyyR — 0.5.5

Access to Abbyy Optical Character Recognition (OCR) API

ajv — 1.0.0

Another JSON Schema Validator

analogsea — 1.0.0

Interface to 'Digital Ocean'

aRxiv — 0.5.19

Interface to the arXiv API

aws.signature — 0.6.0

Amazon Web Services Request Signatures

AzureAuth — 1.3.3

Authentication Services for Azure Active Directory

AzureContainers — 1.3.2

Interface to 'Container Instances', 'Docker Registry' and 'Kubernetes' in 'Azure'

AzureCosmosR — 1.0.0

Interface to the 'Azure Cosmos DB' 'NoSQL' Database Service

AzureGraph — 1.3.1

Simple Interface to 'Microsoft Graph'

AzureKusto — 1.0.6

Interface to 'Kusto'/'Azure Data Explorer'

AzureQstor — 1.0.1

Interface to 'Azure Queue Storage'

AzureRMR — 2.4.3

Interface to 'Azure Resource Manager'

AzureStor — 3.5.1

Storage Management in 'Azure'

AzureTableStor — 1.0.0

Interface to the Table Storage Service in 'Azure'

AzureVision — 1.0.2

Interface to Azure Computer Vision Services

AzureVM — 2.2.2

Virtual Machines in 'Azure'

bigrquery — 1.4.0

An Interface to Google's 'BigQuery' 'API'

boilerpipeR — 1.3.2

Interface to the Boilerpipe Java Library

boxr — 0.3.6

Interface for the 'Box.com API'

brandwatchR — 0.3.0

'Brandwatch' API to R

captr — 0.3.0

Client for the Captricity API

clarifai — 0.4.2

Access to Clarifai API

crunch — 1.28.1

Crunch.io Data Tools

crunchy — 0.3.3

Shiny Apps on Crunch

dash — 0.5.0

An Interface to the 'dash' Ecosystem for Authoring Reactive Web Applications

dataone — 2.2.1

R Interface to the DataONE REST API

datarobot — 2.18.0

'DataRobot' Predictive Modeling API

discgolf — 0.2.0

Discourse API Client

downloader — 0.4

Download Files over HTTP and HTTPS

duckduckr — 1.0.0

Simple Client for the DuckDuckGo Instant Answer API

europepmc — 0.4.1

R Interface to the Europe PubMed Central RESTful Web Service

FastRWeb — 1.1-3

Fast Interactive Framework for Web Scripting Using R

fauxpas — 0.5.0

HTTP Error Helpers

fbRads — 0.2

Analyzing and Managing Facebook Ads from R

fiery — 1.1.3

A Lightweight and Flexible Web Framework

fulltext — 2.0

Full Text of 'Scholarly' Articles Across Many Data Sources

ganalytics — 0.10.7

Interact with 'Google Analytics'

gdns — 0.5.0

Tools to Work with Google's 'DNS'-over-'HTTPS' ('DoH') API

geonapi — 0.4

'GeoNetwork' API R Interface

geosapi — 0.5-1

GeoServer REST API R Interface

ggmap — 3.0.0

Spatial Visualization with ggplot2

gh — 1.3.0

'GitHub' 'API'

gistr — 0.9.0

Work with 'GitHub' 'Gists'

git2r — 0.28.0

Provides Access to Git Repositories

gitlabr — 2.0.0

Access to the 'Gitlab' API

gmailr — 1.0.0

Access the 'Gmail' 'RESTful' API

googleAnalyticsR — 1.0.1

Google Analytics API into R

googleAuthR — 1.4.0

Authenticate and Create Google APIs

googleCloudStorageR — 0.6.0

Interface with Google Cloud Storage API

googleComputeEngineR — 0.3.0

R Interface with Google Compute Engine

googleLanguageR — 0.3.0

Call Google's 'Natural Language' API, 'Cloud Translation' API, 'Cloud Speech' API and 'Cloud Text-to-Speech' API

googlesheets — 0.3.0

Manage Google Spreadsheets from R

googleVis — 0.6.10

R Interface to Google Charts

graphTweets — 0.5.3

Visualise Twitter Interactions

gsheet — 0.4.5

Download Google Sheets Using Just the URL

gtrendsR — 1.5.0

Perform and Display Google Trends Queries

htm2txt — 2.1.1

Convert Html into Text

htmltidy — 0.5.0

Tidy Up and Test XPath Queries on HTML and XML Content

htmltools — 0.5.2

Tools for HTML

httpcache — 1.2.0

Query Cache for HTTP Clients

httpcode — 0.3.0

'HTTP' Status Code Helper

httping — 0.2.0

'Ping' 'URLs' to Time 'Requests'

httpRequest — 0.0.10

Basic HTTP Request

httptest — 4.1.0

A Test Environment for HTTP Requests

httpuv — 1.6.3

HTTP and WebSocket Server Library

imguR — 1.0.3

An Imgur.com API Client Package

instaR — 0.2.4

Access to Instagram API via R

ipaddress — 0.5.3

Tidy IP Addresses

iptools — 0.7.2

Manipulate, Validate and Resolve 'IP' Addresses

jqr — 1.2.1

Client for 'jq', a 'JSON' Processor

js — 1.2

Tools for Working with JavaScript in R

jsonvalidate — 1.3.1

Validate 'JSON' Schema

languagelayeR — 1.2.4

Access the 'languagelayer' API

leafletR — 0.4-0

Interactive Web-Maps Based on the Leaflet JavaScript Library

longurl — 0.3.3

Expand Short 'URLs'

magrittr — 2.0.1

A Forward-Pipe Operator for R

mailR — 0.4.1

A Utility to Send Emails from R

mapsapi — 0.5.0

'sf'-Compatible Interface to 'Google Maps' APIs

mathpix — 0.4.0

Support for the 'Mathpix' API (Image to 'LaTeX')

Microsoft365R — 2.3.2

Interface to the 'Microsoft 365' Suite of Cloud Services

mime — 0.12

Map Filenames to MIME Types

mscstexta4r — 0.1.2

R Client for the Microsoft Cognitive Services Text Analytics REST API

mscsweblm4r — 0.1.2

R Client for the Microsoft Cognitive Services Web Language Model REST API

ndjson — 0.8.0

Wicked-Fast Streaming 'JSON' ('ndjson') Reader

notifyme — 0.3.0

Send Alerts to your Cellphone and Phillips Hue Lights

oai — 0.3.2

General Purpose 'Oai-PMH' Services Client

OAIHarvester — 0.3-3

Harvest Metadata Using OAI-PMH Version 2.0

openadds — 0.2.0

Client to Access 'Openaddresses' Data

opencage — 0.2.2

Geocode with the OpenCage API

opencpu — 2.2.5

Producing and Reproducing Results

OpenML — 1.10

Open Machine Learning and Open Data Platform

osmar — 1.1-7

OpenStreetMap and R

osmplotr — 0.3.3

Bespoke Images of 'OpenStreetMap' Data

osrm — 3.5.0

Interface Between R and the OpenStreetMap-Based Routing Service OSRM

ows4R — 0.1-5

Interface to OGC Web-Services (OWS)

paws — 0.1.12

Amazon Web Services Software Development Kit

pdftables — 0.1

Programmatic Conversion of PDF Tables

plotly — 4.10.0

Create Interactive Web Graphics via 'plotly.js'

plumber — 1.1.0

An API Generator for R

postlightmercury — 1.2

Parses Web Pages using Postlight Mercury

pubmed.mineR — 1.0.18

Text Mining of PubMed Abstracts

pushoverr — 1.0.0

Send Push Notifications using Pushover

qualtRics — 3.1.5

Download 'Qualtrics' Survey Data

radiant — 1.4.0

Business Analytics using R and Shiny

RAdwords — 0.1.18

Loading Google Adwords Data into R

randNames — 0.2.3

Package Provides Access to Fake User Data

rapiclient — 0.1.3

Dynamic OpenAPI/Swagger Client

rapport — 1.1

A Report Templating System

rcoreoa — 0.4.0

Client for the CORE API

Rcrawler — 0.1.9-1

Web Crawler and Scraper

rcrossref — 1.1.0

Client for Various 'CrossRef' 'APIs'

RCurl — 1.98-1.5

General Network (HTTP/FTP/...) Client Interface for R

rdatacite — 0.5.2

Client for the 'DataCite' API

rdrop2 —

Programmatic Interface to the 'Dropbox' API

redcapAPI — 2.3

Interface to 'REDCap'

repmis — 0.5

Miscellaneous Tools for Reproducible Research

reqres — 0.2.3

Powerful Classes for HTTP Requests and Responses

request — 0.1.0

High Level 'HTTP' Client

rerddap — 0.7.6

General Purpose Client for 'ERDDAP' Servers

restfulr — 0.0.13

R Interface to RESTful Web Services

Rexperigen — 0.2.1

R Interface to Experigen

Rfacebook — 0.6.15

Access to Facebook API via R

rfigshare — 0.3.7

An R Interface to 'figshare'

RGA — 0.4.2

A Google Analytics API Client

rgeolocate — 1.4.1

IP Address Geolocation

RgoogleMaps —

Overlays on Static Maps

rhub — 1.1.1

Connect to 'R-hub'

rio — 0.5.27

A Swiss-Army Knife for Data I/O

rjson — 0.2.20

JSON for R

RJSONIO — 1.3-1.6

Serialize R Objects to JSON, JavaScript Object Notation

Rlinkedin — 0.2

Access to the LinkedIn API via R

rLTP — 0.1.4

R Interface to the 'LTP'-Cloud Service

roadoi — 0.7.1

Find Free Versions of Scholarly Publications via Unpaywall

ROAuth — 0.9.6

R Interface For OAuth

robotstxt — 0.7.13

A 'robots.txt' Parser and 'Webbot'/'Spider'/'Crawler' Permissions Checker

Rook — 1.1-1

Rook - a web server interface for R

rorcid — 0.7.0

Interface to the 'Orcid.org' API

rosetteApi — 1.14.4

'Rosette' API

routr — 0.4.0

A Simple Router for HTTP and WebSocket Requests

rpinterest — 0.3.1

Access Pinterest API

rplos — 1.0.0

Interface to the Search API for 'PLoS' Journals

RPushbullet — 0.3.4

R Interface to the Pushbullet Messaging Service

rrefine — 1.1.2

r Client for OpenRefine API

rromeo — 0.1.1

Access Publisher Copyright & Self-Archiving Policies via the 'SHERPA/RoMEO' API

RSclient — 0.7-3

Client for Rserve

rscopus — 0.6.6

Scopus Database 'API' Interface

rsdmx — 0.6

Tools for Reading SDMX Data and Metadata

RSelenium — 1.7.7

R Bindings for 'Selenium WebDriver'

Rserve — 1.7-3.1

Binary R server

RSiteCatalyst — 1.4.16

R Client for Adobe Analytics API V1.4

RSmartlyIO — 0.1.3

Loading Facebook and Instagram Advertising Data from 'Smartly.io'

RStripe — 0.1

A Convenience Interface for the Stripe Payment API

rtweet — 0.7.0

Collecting Twitter Data

rvest — 1.0.2

Easily Harvest (Scrape) Web Pages

RYandexTranslate — 1.0

R Interface to Yandex Translate API

scholar — 0.2.2

Analyse Citation Data from Google Scholar

scrapeR — 0.1.6

Tools for Scraping Data from HTML and XML Documents

searchConsoleR — 0.4.0

Google Search Console R Client

seleniumPipes — 0.3.7

R Client Implementing the W3C WebDriver Specification

sendmailR — 1.2-1

send email using R

servr — 0.23

A Simple HTTP Server to Serve Static Files or Dynamic Documents

slackr — 3.2.0

Send Messages, Images, R Objects and Files to 'Slack' Channels/Users

spiderbar — 0.2.4

Parse and Test Robots Exclusion Protocol Files and Rules

splashr — 0.6.0

Tools to Work with the 'Splash' 'JavaScript' Rendering and Scraping Service

streamR — 0.4.5

Access to Twitter Streaming API via R

swagger — 3.33.1

Dynamically Generates Documentation from a 'Swagger' Compliant API

tidyRSS — 2.0.4

Tidy RSS for R

tm.plugin.webmining — 1.3

Retrieve Structured, Textual Data from Various Web Sources

transcribeR — 0.0.0

Automated Transcription of Audio Files Through the HP IDOL API

translate — 0.1.2

Bindings for the Google Translate API v2

translateR — 1.0

Bindings for the Google and Microsoft Translation APIs

twitteR — 1.1.9

R Based Twitter Client

uaparserjs — 0.3.5

Parse 'User-Agent' Strings

urlshorteneR — 1.4.3

R Wrapper for the 'Bit.ly' and 'Is.gd'/'v.gd' URL Shortening Services

urltools — 1.7.3

Vectorised Tools for URL Handling and Parsing

V8 — 3.4.2

Embedded JavaScript and WebAssembly Engine for R

vkR — 0.2

Access to VK API via R

W3CMarkupValidator — 0.1-6

R Interface to W3C Markup Validation Services

webreadr — 0.4.0

Tools for Reading Formatted Access Log Files

webshot — 0.5.2

Take Screenshots of Web Pages

webutils — 1.1

Utility Functions for Developing Web Applications

whisker — 0.4

{{mustache}} for R, Logicless Templating

WikidataQueryServiceR — 1.0.0

API Client Library for 'Wikidata Query Service'

WikidataR — 2.3.1

Read-Write API Client Library for 'Wikidata'

wikipediatrend — 2.1.6

Public Subject Attention via Wikipedia Page View Statistics

WikipediR — 1.5.0

A MediaWiki API Wrapper

WufooR — 1.0.1

R Wrapper for the 'Wufoo.com' - The Form Building Service

XML — 3.99-0.8

Tools for Parsing and Generating XML Within R and S-Plus

XML2R — 0.0.6

EasieR XML data collection

xslt — 1.4.3

Extensible Style-Sheet Language Transformations

yhatr — 0.15.1

R Binder for the Yhat API

zen4R — 0.5

Interface to 'Zenodo' REST API
