Task view: Machine Learning & Statistical Learning

Last updated on 2021-10-26 by Torsten Hothorn

Several add-on packages implement ideas and methods developed at the borderline between computer science and statistics - this field of research is usually referred to as machine learning. The packages can be roughly structured into the following topics:

  • Neural Networks and Deep Learning: Single-hidden-layer neural networks are implemented in package nnet (shipped with base R). Package RSNNS offers an interface to the Stuttgart Neural Network Simulator (SNNS). Packages implementing deep learning flavours of neural networks include deepnet (feed-forward neural network, restricted Boltzmann machine, deep belief network, stacked autoencoders), RcppDL (denoising autoencoder, stacked denoising autoencoder, restricted Boltzmann machine, deep belief network) and h2o (feed-forward neural network, deep autoencoders). An interface to tensorflow is available in tensorflow. The torch package implements an interface to the libtorch library.
  • Recursive Partitioning: Tree-structured models for regression, classification and survival analysis, following the ideas in the CART book, are implemented in rpart (shipped with base R) and tree. Package rpart is recommended for computing CART-like trees. A rich toolbox of partitioning algorithms is available in Weka; package RWeka provides an interface to this implementation, including the J4.8-variant of C4.5 and M5. The Cubist package fits rule-based models (similar to trees) with linear regression models in the terminal leaves, instance-based corrections and boosting. The C50 package can fit C5.0 classification trees, rule-based models, and boosted versions of these.
    Two recursive partitioning algorithms with unbiased variable selection and a statistical stopping criterion are implemented in packages party and partykit. Function ctree() is based on non-parametric conditional inference procedures for testing independence between the response and each input variable, whereas mob() can be used to partition parametric models. Extensible tools for visualizing binary trees and node distributions of the response are available in packages party and partykit as well.
    Graphical tools for the visualization of trees are available in package maptree.
    Partitioning of mixture models is performed by RPMM.
    Computational infrastructure for representing trees and unified methods for prediction and visualization is implemented in partykit. This infrastructure is used by package evtree to implement evolutionary learning of globally optimal trees. Survival trees are available in various packages.
  • Random Forests: The reference implementation of the random forest algorithm for regression and classification is available in package randomForest. Package ipred has bagging for regression, classification and survival analysis as well as bundling, a combination of multiple models via ensemble learning. In addition, a random forest variant for response variables measured at arbitrary scales based on conditional inference trees is implemented in package party. randomForestSRC implements a unified treatment of Breiman's random forests for survival, regression and classification problems. Quantile regression forests (quantregForest) regress quantiles of a numeric response on explanatory variables via a random forest approach. The varSelRF and Boruta packages focus on variable selection by means of random forest algorithms. In addition, packages ranger and Rborist offer R interfaces to fast C++ implementations of random forests. Reinforcement Learning Trees, featuring splits in variables which will be important down the tree, are implemented in package RLT. wsrf implements an alternative variable weighting method for variable subspace selection in place of the traditional random variable sampling. Package RGF is an interface to a Python implementation of a procedure called regularized greedy forests. Random forests for parametric models, including forests for the estimation of predictive distributions, are available in packages trtf (predictive transformation forests, possibly under censoring and truncation) and grf (an implementation of generalised random forests).
  • Regularized and Shrinkage Methods: Regression models with some constraint on the parameter estimates can be fitted with the lasso2 and lars packages. Lasso with simultaneous updates for groups of parameters (groupwise lasso) is available in package grplasso; the grpreg package implements a number of other group penalization models, such as group MCP and group SCAD. The L1 regularization path for generalized linear models and Cox models can be obtained from functions available in package glmpath; the entire lasso or elastic-net regularization path (also in elasticnet) for linear regression, logistic and multinomial regression models can be obtained from package glmnet. The penalized package provides an alternative implementation of lasso (L1) and ridge (L2) penalized regression models (both GLM and Cox models). Package biglasso fits Gaussian and logistic linear models under L1 penalty when the data can't be stored in RAM. Package RXshrink can be used to identify and display TRACEs for a specified shrinkage path and to determine the appropriate extent of shrinkage. Semiparametric additive hazards models under lasso penalties are offered by package ahaz. A generalisation of the Lasso shrinkage technique for linear regression is called relaxed lasso and is available in package relaxo. Fisher's LDA projection with an optional LASSO penalty to produce sparse solutions is implemented in package penalizedLDA. The shrunken centroids classifier and utilities for gene expression analyses are implemented in package pamr. An implementation of multivariate adaptive regression splines is available in package earth. Various forms of penalized discriminant analysis are implemented in packages hda and sda. Package LiblineaR offers an interface to the LIBLINEAR library. The ncvreg package fits linear and logistic regression models under the SCAD and MCP regression penalties using a coordinate descent algorithm. The same penalties are also implemented in the picasso package.
    An implementation of bundle methods for regularized risk minimization is available from package bmrm. The Lasso under non-Gaussian and heteroscedastic errors is estimated by hdm; the package also provides inference on low-dimensional components of Lasso regression and of estimated treatment effects in a high-dimensional setting. Package SIS implements sure independence screening in generalised linear and Cox models. Elastic nets for correlated outcomes are available from package joinet. Robust penalized generalized linear models and robust support vector machines are fitted by package mpath using composite optimization by conjugation operator. The islasso package provides an implementation of lasso based on the induced smoothing idea, which yields reliable p-values for all model parameters.
  • Boosting and Gradient Descent: Various forms of gradient boosting are implemented in package gbm (tree-based functional gradient descent boosting). Package xgboost implements tree-based boosting using efficient trees as base learners for several built-in and also user-defined objective functions. The hinge loss is optimized by the boosting implementation in package bst. An extensible boosting framework for generalized linear, additive and nonparametric models is available in package mboost. Likelihood-based boosting for mixed models is implemented in GMMBoost. GAMLSS models can be fitted using boosting by gamboostLSS. An implementation of various learning algorithms based on Gradient Descent for dealing with regression tasks is available in package gradDescent.
  • Support Vector Machines and Kernel Methods: The function svm() from e1071 offers an interface to the LIBSVM library and package kernlab implements a flexible framework for kernel learning (including SVMs, RVMs and other kernel learning algorithms). An interface to the SVMlight implementation (only for one-against-all classification) is provided in package klaR. The relevant dimension in kernel feature spaces can be estimated using rdetools which also offers procedures for model selection and prediction.
  • Bayesian Methods: Bayesian Additive Regression Trees (BART), where the final model is defined in terms of the sum over many weak learners (not unlike ensemble methods), are implemented in packages BayesTree, BART, and bartMachine. Bayesian nonstationary, semiparametric nonlinear regression and design by treed Gaussian processes including Bayesian CART and treed linear models are made available by package tgp. Bayesian structure learning in undirected graphical models for multivariate continuous, discrete, and mixed data is implemented in package BDgraph; corresponding methods relying on spike-and-slab priors are available from package ssgraph. Naive Bayes classifiers are available in naivebayes.
  • Optimization using Genetic Algorithms: Package rgenoud offers optimization routines based on genetic algorithms. The package Rmalschains implements memetic algorithms with local search chains, a special type of evolutionary algorithm that combines a steady state genetic algorithm with local search for real-valued parameter optimization.
  • Association Rules: Package arules provides both data structures for efficient handling of sparse binary data as well as interfaces to implementations of Apriori and Eclat for mining frequent itemsets, maximal frequent itemsets, closed frequent itemsets and association rules. Package opusminer provides an interface to the OPUS Miner algorithm (implemented in C++) for finding the key associations in transaction data efficiently, in the form of self-sufficient itemsets, using either leverage or lift.
  • Fuzzy Rule-based Systems: Package frbs implements a host of standard methods for learning fuzzy rule-based systems from data for regression and classification. Package RoughSets provides comprehensive implementations of the rough set theory (RST) and the fuzzy rough set theory (FRST) in a single package.
  • Model selection and validation: Package e1071 has function tune() for hyperparameter tuning and function errorest() (ipred) can be used for error rate estimation. The cost parameter C for support vector machines can be chosen utilizing the functionality of package svmpath. Data splitting for cross-validation and other resampling schemes is available in the splitTools package. Functions for ROC analysis and other visualisation techniques for comparing candidate classifiers are available from package ROCR. Packages hdi and stabs implement stability selection for a range of models; hdi also offers other inference procedures in high-dimensional models.
  • Causal Machine Learning: The package DoubleML is an object-oriented implementation of the double machine learning framework in a variety of causal models. Building upon the mlr3 ecosystem, estimation of causal effects can be based on an extensive collection of machine learning methods.
  • Other procedures: Evidential classifiers quantify the uncertainty about the class of a test pattern using a Dempster-Shafer mass function in package evclass. The OneR (One Rule) package offers a classification algorithm with enhancements for sophisticated handling of missing values and numeric data together with extensive diagnostic functions.
  • Meta packages: Package caret provides miscellaneous functions for building predictive models, including parameter tuning and variable importance measures. The package can be used with various parallel implementations (e.g. MPI, NWS, etc.). In a similar spirit, packages mlr3 and mlr3proba offer high-level interfaces to various statistical and machine learning packages. Package SuperLearner implements a similar toolbox. The h2o package implements a general purpose machine learning platform that has scalable implementations of many popular algorithms such as random forest, GBM, GLM (with elastic net regularization), and deep learning (feedforward multilayer networks), among others. An interface to the mlpack C++ library is available from package mlpack. CORElearn implements a rather broad class of machine learning algorithms, such as nearest neighbors, trees, random forests, and several feature selection methods. Similarly, package rminer interfaces several learning algorithms implemented in other packages and computes several performance measures.
  • GUI: rattle is a graphical user interface for data mining in R.
  • Visualisation (initially contributed by Brandon Greenwell): The stats::termplot() function can be used to plot the terms in a model whose predict method supports type="terms". The effects package provides graphical and tabular effect displays for models with a linear predictor (e.g., linear and generalized linear models). Friedman’s partial dependence plots (PDPs), which are low-dimensional graphical renderings of the prediction function, are implemented in a few packages. gbm, randomForest and randomForestSRC provide their own functions for displaying PDPs, but are limited to the models fit with those packages (the function partialPlot from randomForest is more limited since it only allows for one predictor at a time). Packages pdp, plotmo, and ICEbox are more general and allow for the creation of PDPs for a wide variety of machine learning models (e.g., random forests, support vector machines, etc.); both pdp and plotmo support multivariate displays (plotmo is limited to two predictors while pdp uses trellis graphics to display PDPs involving three predictors). By default, plotmo fixes the background variables at their medians (or first level for factors), which is faster than constructing PDPs but incorporates less information. ICEbox focuses on constructing individual conditional expectation (ICE) curves, a refinement over Friedman's PDPs. ICE curves, as well as centered ICE curves, can also be constructed with the partial() function from the pdp package. ggRandomForests provides ggplot2-based tools for the graphical exploration of random forest models (e.g., variable importance plots and PDPs) from the randomForest and randomForestSRC packages.
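
As a minimal sketch of the recursive partitioning entry above, the following fits a CART-like classification tree with rpart and checks it on held-out rows. The iris data set, the 100/50 split, the seed and the variable names are assumptions chosen for illustration only; they are not part of this task view.

```r
library(rpart)  # recommended package, installed with standard R distributions

set.seed(1)                             # arbitrary seed, for reproducibility only
train_idx <- sample(nrow(iris), 100)    # hypothetical 100/50 train/test split
train <- iris[train_idx, ]
test  <- iris[-train_idx, ]

# Fit a classification tree predicting Species from all other columns
fit <- rpart(Species ~ ., data = train, method = "class")

# Predict class labels for the held-out rows and compare with the truth
pred      <- predict(fit, newdata = test, type = "class")
confusion <- table(predicted = pred, observed = test$Species)
accuracy  <- sum(diag(confusion)) / sum(confusion)

print(confusion)
print(accuracy)
```

The same formula interface carries over, with package-specific arguments, to several other tree packages listed above, so a sketch like this may serve as a starting point for comparing them.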


ahaz — 1.14

Regularization for semiparametric additive hazards regression

arules — 1.7-1

Mining Association Rules and Frequent Itemsets

BART — 2.9

Bayesian Additive Regression Trees

bartMachine — 1.2.6

Bayesian Additive Regression Trees

BayesTree — 0.3-1.4

Bayesian Additive Regression Trees

BDgraph — 2.64

Bayesian Structure Learning in Graphical Models using Birth-Death MCMC

biglasso — 1.4.1

Extending Lasso Model Fitting to Big Data

bmrm — 4.1

Bundle Methods for Regularized Risk Minimization Package

Boruta — 7.0.0

Wrapper Algorithm for All Relevant Feature Selection

bst — 0.3-23

Gradient Boosting

C50 — 0.1.5

C5.0 Decision Trees and Rule-Based Models

caret — 6.0-90

Classification and Regression Training

CORElearn — 1.56.0

Classification, Regression and Feature Evaluation

Cubist — 0.3.0

Rule- And Instance-Based Regression Modeling

deepnet — 0.2

deep learning toolkit in R

DoubleML — 0.4.1

Double Machine Learning in R

e1071 — 1.7-9

Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien

effects — 4.2-0

Effect Displays for Linear, Generalized Linear, and Other Models

earth — 5.3.1

Multivariate Adaptive Regression Splines

elasticnet — 1.3

Elastic-Net for Sparse Estimation and Sparse PCA

evclass — 1.1.1

Evidential Distance-Based Classification

evtree — 1.0-8

Evolutionary Learning of Globally Optimal Trees

frbs — 3.2-0

Fuzzy Rule-Based Systems for Classification and Regression Tasks

gamboostLSS — 2.0-5

Boosting Methods for 'GAMLSS'

gbm — 2.1.8

Generalized Boosted Regression Models

ggRandomForests — 2.0.1

Visually Exploring Random Forests

glmnet — 4.1-3

Lasso and Elastic-Net Regularized Generalized Linear Models

glmpath — 0.98

L1 Regularization Path for Generalized Linear Models and Cox Proportional Hazards Model

GMMBoost — 1.1.3

Likelihood-Based Boosting for Generalized Mixed Models

gradDescent — 3.0

Gradient Descent for Regression Tasks

grf — 2.0.2

Generalized Random Forests

grplasso — 0.4-7

Fitting User-Specified Models with Group Lasso Penalty

grpreg — 3.4.0

Regularization Paths for Regression Models with Grouped Covariates

hda — 0.2-14

Heteroscedastic Discriminant Analysis

hdi — 0.1-9

High-Dimensional Inference

hdm — 0.3.1

High-Dimensional Metrics

h2o —

R Interface for the 'H2O' Scalable Machine Learning Platform

ICEbox — 1.1.2

Individual Conditional Expectation Plot Toolbox

ipred — 0.9-12

Improved Predictors

islasso — 1.4.1

The Induced Smoothed Lasso

joinet — 0.0.10

Multivariate Elastic Net Regression

kernlab — 0.9-29

Kernel-Based Machine Learning Lab

klaR — 0.6-15

Classification and Visualization

lars — 1.2

Least Angle Regression, Lasso and Forward Stagewise

lasso2 — 1.2-22

L1 Constrained Estimation aka `lasso'

LiblineaR — 2.10-12

Linear Predictive Models Based on the LIBLINEAR C/C++ Library

maptree — 1.4-7

Mapping, pruning, and graphing tree models

mboost — 2.9-5

Model-Based Boosting

mlpack —

'Rcpp' Integration for the 'mlpack' Library

mlr3 — 0.13.0

Machine Learning in R - Next Generation

mlr3proba — 0.4.2

Probabilistic Supervised Learning for 'mlr3'

mpath — 0.4-2.19

Regularized Linear Models

ncvreg — 3.13.0

Regularization Paths for SCAD and MCP Penalized Regression Models

naivebayes — 0.9.7

High Performance Implementation of the Naive Bayes Algorithm

nnet — 7.3-16

Feed-Forward Neural Networks and Multinomial Log-Linear Models

OneR — 2.2

One Rule Machine Learning Classification Algorithm with Enhancements

opusminer — 0.1-1

OPUS Miner Algorithm for Filtered Top-k Association Discovery

pamr — 1.56.1

Pam: Prediction Analysis for Microarrays

party — 1.3-9

A Laboratory for Recursive Partytioning

partykit — 1.2-15

A Toolkit for Recursive Partytioning

pdp — 0.7.0

Partial Dependence Plots

penalized — 0.9-51

L1 (Lasso and Fused Lasso) and L2 (Ridge) Penalized Estimation in GLMs and in the Cox Model

penalizedLDA — 1.1

Penalized Classification using Fisher's Linear Discriminant

picasso — 1.3.1

Pathwise Calibrated Sparse Shooting Algorithm

plotmo — 3.6.1

Plot a Model's Residuals, Response, and Partial Dependence Plots

quantregForest — 1.3-7

Quantile Regression Forests

randomForest — 4.6-14

Breiman and Cutler's Random Forests for Classification and Regression

randomForestSRC — 2.14.0

Fast Unified Random Forests for Survival, Regression, and Classification (RF-SRC)

ranger — 0.13.1

A Fast Implementation of Random Forests

rattle — 5.4.0

Graphical User Interface for Data Science in R

Rborist — 0.2-3

Extensible, Parallelizable Implementation of the Random Forest Algorithm

RcppDL — 0.0.5

Deep Learning Methods via Rcpp

rdetools — 1.0

Relevant Dimension Estimation (RDE) in Feature Spaces

relaxo — 0.1-2

Relaxed Lasso

rgenoud — 5.8-3.0

R Version of GENetic Optimization Using Derivatives

RGF — 1.0.8

Regularized Greedy Forest

RLT — 3.2.3

Reinforcement Learning Trees

Rmalschains — 0.2-6

Continuous Optimization using Memetic Algorithms with Local Search Chains (MA-LS-Chains) in R

rminer — 1.4.6

Data Mining Classification and Regression Methods

ROCR — 1.0-11

Visualizing the Performance of Scoring Classifiers

RoughSets — 1.3-7

Data Analysis Using Rough Set and Fuzzy Rough Set Theories

rpart — 4.1-15

Recursive Partitioning and Regression Trees

RPMM — 1.25

Recursively Partitioned Mixture Model

RSNNS — 0.4-14

Neural Networks using the Stuttgart Neural Network Simulator (SNNS)

RWeka — 0.4-44

R/Weka Interface

RXshrink — 2.0

Maximum Likelihood Shrinkage using Generalized Ridge or Least Angle Regression

sda — 1.3.8

Shrinkage Discriminant Analysis and CAT Score Variable Selection

SIS — 0.8-8

Sure Independence Screening

splitTools — 0.3.1

Tools for Data Splitting

ssgraph — 1.12

Bayesian Graphical Estimation using Spike-and-Slab Priors

stabs — 0.6-4

Stability Selection with Error Control

SuperLearner — 2.0-28

Super Learner Prediction

svmpath — 0.970

The SVM Path Algorithm

tensorflow — 2.7.0

R Interface to 'TensorFlow'

tgp — 2.4-17

Bayesian Treed Gaussian Process Models

torch — 0.6.0

Tensors and Neural Networks with 'GPU' Acceleration

tree — 1.0-41

Classification and Regression Trees

trtf — 0.3-8

Transformation Trees and Forests

varSelRF — 0.7-8

Variable Selection using Random Forests

wsrf — 1.7.22

Weighted Subspace Random Forest for Classification

xgboost —

Extreme Gradient Boosting
