Scalable implementation of classification and regression forests, as described by Breiman (2001).
Changes in 0.1-9:
Option 'nThread' limits OpenMP parallelization to a specified maximum number of threads.
Option 'oob' specifies an out-of-bag constraint for prediction.
Row sampling now implemented using 'Rcpp', in place of 'RcppArmadillo'.
Package 'data.table' now implements block decomposition of 'data.frame'.
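The out-of-bag idea behind the 'oob' option above can be sketched as follows. This is a conceptual Python illustration, not the package's implementation: the function names 'bootstrap_rows' and 'out_of_bag' are hypothetical, and the sketch assumes the usual bagging scheme in which each tree is trained on rows drawn with replacement, leaving the undrawn rows out-of-bag for that tree.

```python
import random


def bootstrap_rows(n_row, rng):
    """Draw n_row row indices with replacement (one tree's bag)."""
    return [rng.randrange(n_row) for _ in range(n_row)]


def out_of_bag(n_row, sampled):
    """Return the rows that never appear in the bag, in ascending order."""
    bagged = set(sampled)
    return [row for row in range(n_row) if row not in bagged]


rng = random.Random(0)          # seeded only to make the sketch repeatable
sampled = bootstrap_rows(20, rng)
oob = out_of_bag(20, sampled)   # rows this tree did not see during training
```

Under an out-of-bag constraint, a row would be predicted only by trees for which it appears in the corresponding 'oob' list.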
Changes in 0.1-8:
Command 'Validate' enables separate execution of out-of-bag validation.
Command 'Streamline' shrinks trained Rborist objects by emptying unused fields.
Option 'maxLeaf' prunes trees during training to a maximum number of leaves.
Changes in 0.1-4:
Sparse 'dgCMatrix' matrices accepted, if encoded in 'i/p' format.
Autocompression conserves space on a per-predictor basis.
Space-saving 'thinLeaves' option suppresses creation of summary data.
'splitQuantile' option allows fine tuning of split-point placement for numerical predictors.
Improved scaling with row count.
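The 'i/p' format mentioned above is the compressed sparse column layout: 'i' holds the zero-based row index of each nonzero entry, and 'p' holds, for each column, the offset at which its entries begin. A minimal Python sketch of the encoding follows. It is an illustration only, not the package's code, and the helper names 'dense_to_csc' and 'csc_column' are hypothetical.

```python
def dense_to_csc(rows):
    """Encode a dense row-major matrix in compressed sparse column form.

    Returns (i, p, x): zero-based row indices, column pointers, values.
    """
    n_col = len(rows[0]) if rows else 0
    i, p, x = [], [0], []
    for col in range(n_col):
        for row_idx, row in enumerate(rows):
            if row[col] != 0:
                i.append(row_idx)
                x.append(row[col])
        p.append(len(i))  # entries of column `col` end here
    return i, p, x


def csc_column(i, p, x, col, n_row):
    """Rebuild one dense column from the encoding."""
    dense = [0] * n_row
    for k in range(p[col], p[col + 1]):
        dense[i[k]] = x[k]
    return dense


dense = [
    [1, 0, 0],
    [0, 0, 2],
    [3, 0, 4],
]
i, p, x = dense_to_csc(dense)
```

An empty column shows up as two equal consecutive entries in 'p', which is why mostly-zero predictors cost almost nothing to store.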
Changes in 0.1-2:
Improved scaling with predictor count.
Improved conformance with Caret package.
'minNode' default lowered to reflect uniqueness of indices referenced within a node.
Name change: PreTrain deprecated in favor of PreFormat.
Minor reorganization to support sparse internal representation planned for next release.
Changes in 0.1-1:
Significant reductions in memory footprint.
Default predictor-selection mode changed to 'predFixed' (like 'mTry') for small predictor counts. 'predProb' remains the default at higher counts.
Binary classification now employs faster, weight-based algorithm.
Training produces rich internal state by default. In particular, quantile validation and prediction can be performed without having to train specially for them.
ForestFloorExport objects can be produced from training state for use by 'forestFloor' feature-analysis package.
PreTrain method produces pre-sorted predictor format, saving recomputation when retraining iteratively, such as during a Caret session.
OpenMP parallelization now performed per node/predictor pair, rather than per predictor.
Optional 'regMono' vector enforces monotonic constraints on numeric regressors.
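The two predictor-selection modes named above can be contrasted with a short Python sketch. It is not the package's code. It assumes that 'predFixed' draws a fixed number of distinct candidate predictors per split and that 'predProb' includes each predictor independently with a given probability. The function names 'select_fixed' and 'select_prob' are hypothetical.

```python
import random


def select_fixed(n_pred, pred_fixed, rng):
    """Draw exactly pred_fixed distinct predictor indices."""
    return sorted(rng.sample(range(n_pred), pred_fixed))


def select_prob(n_pred, pred_prob, rng):
    """Include each predictor independently with probability pred_prob."""
    return [j for j in range(n_pred) if rng.random() < pred_prob]


rng = random.Random(1)                   # seeded only for repeatability
fixed = select_fixed(10, 3, rng)         # always three candidates
variable = select_prob(10, 0.3, rng)     # three on average, count varies
```

With few predictors the fixed mode guarantees that every split has candidates to try, whereas the independent draw can come up empty. This may explain the choice of default at small predictor counts, though the changelog does not state the reason.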