A Fast Implementation of Random Forests

A fast implementation of Random Forests, particularly suited for high-dimensional data. Ensembles of classification, regression, survival and probability prediction trees are supported. Data from genome-wide association studies can be analyzed efficiently. In addition to data frames, datasets of class 'gwaa.data' (R package 'GenABEL') can be analyzed directly.
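As a minimal usage sketch (not taken from the package documentation), the following grows a classification forest and a probability forest and predicts for a few rows. It assumes the package is installed under its CRAN name 'ranger'; the built-in iris data, the seed value and the number of trees are illustrative choices only.

```r
library(ranger)

# Classification forest on the iris data that ships with R.
# num.trees and seed are arbitrary illustrative values.
rf <- ranger(Species ~ ., data = iris, num.trees = 100, seed = 42)

# Out-of-bag prediction error and confusion matrix of the fitted forest.
print(rf$prediction.error)
print(rf$confusion.matrix)

# Predict classes for the first five rows.
pred <- predict(rf, data = iris[1:5, ])
print(pred$predictions)

# Probability forest: predictions are class probabilities, one column per class.
prob_rf <- ranger(Species ~ ., data = iris, num.trees = 100,
                  probability = TRUE, seed = 42)
prob_pred <- predict(prob_rf, data = iris[1:5, ])
print(prob_pred$predictions)
```

The fitted object keeps the forest by default, so predict() can be called on it without refitting.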


News

  • Set write.forest=TRUE by default
  • Add num.trees option to predict()
  • Faster version of getTerminalNodeIDs(), included in predict()
  • Handle new factor levels in 'order' mode
  • Use unadjusted p-value for 2 categories in maxstat splitting
  • Bug fixes
  • Add Windows multithreading support for new toolchain
  • Add splitting by maximally selected rank statistics for survival and regression forests
  • Faster method for unordered factor splitting
  • Add p-values for variable importance
  • Runtime improvement for regression forests on classification data
  • Bug fixes
  • Reduce memory usage of saved forest objects (changed child.nodeIDs interface)
  • Add keep.inbag option to track in-bag counts
  • Add option sample.fraction for fraction of sampled observations
  • Add tree-wise split.select.weights
  • Add predict.all option in predict() to get individual predictions for each tree for classification and regression
  • Add case-specific random forests
  • Add case weights (weighted bootstrapping or subsampling)
  • Remove tuning functions; please use mlr or caret instead
  • Catch error of outdated gcc not supporting C++11 completely
  • Bug fixes
  • Allow the user to interrupt computation from R
  • Transpose classification.table and rename to confusion.matrix
  • Respect R seed for prediction
  • Memory improvements for variable importance computation
  • Fix bug: Probability prediction for single observations
  • Fix bug: Results not identical when using alternative interface
  • Small fixes for Solaris compiler
  • Add C-index splitting
  • Fix NA SNP handling
  • Fix matrix and gwaa alternative survival interface
  • Version submitted to JSS
  • Small changes in documentation
  • Preallocate memory for splitting
  • Remove recursive splitting
  • Allow matrix as input data in R version
  • Fix prediction of classification forests in R
  • Speedup growing for continuous covariates
  • Add memory-saving option for very large datasets (slower)
  • Remove memory mode option from R version since it gave no performance gain
  • Fix problems when using Rcpp <0.11.4
  • Add option to split on unordered categorical covariates
  • Optimize memory management for very large survival forests
  • Set required Rcpp version to 0.11.2
  • Fix large $call objects when using BatchJobs
  • Add details and example on GenABEL usage to documentation
  • Minor changes to documentation
  • Speedup for survival forests with continuous covariates
  • R version: Generate seed from R. It is no longer necessary to set the seed argument in ranger calls.
  • Windows support for R version (without multithreading)
  • Speedup growing of regression and probability prediction forests
  • Probability prediction forests are now handled like regression forests: MSE used for prediction error and permutation importance
  • Fixed name conflict with randomForest package for "importance"
  • Fixed a bug: prediction function is now working for probability prediction forests
  • Slot "predictions" for probability forests now contains class probabilities
  • importance function is now working even if randomForest package is loaded after ranger
  • Fixed a bug: Split selection weights are now working as expected
  • Small changes in documentation
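Several of the options named in the list above can be combined in one call. The sketch below is an illustration only, assuming a recent 'ranger' release from CRAN; the data set and all parameter values are arbitrary choices, not recommendations from the package authors.

```r
library(ranger)

# Grow a forest by subsampling half of the observations without replacement,
# track in-bag counts, and compute permutation variable importance.
rf <- ranger(Species ~ ., data = iris,
             num.trees = 50,
             replace = FALSE,
             sample.fraction = 0.5,
             keep.inbag = TRUE,
             importance = "permutation",
             seed = 1)

# One vector of in-bag counts per tree, each with one entry per observation.
length(rf$inbag.counts)
length(rf$inbag.counts[[1]])

# Permutation importance, one value per covariate.
print(rf$variable.importance)

# Predict with only the first 20 trees and keep each tree's prediction.
# For classification, the per-tree predictions are numeric class codes.
pred <- predict(rf, data = iris, num.trees = 20, predict.all = TRUE)
dim(pred$predictions)
```

Because write.forest is TRUE by default, the forest is stored in the fitted object and predict() works without any extra argument at fitting time.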
