Random Forest Prediction Decomposition and Feature Importance Measure
An R re-implementation of the 'treeinterpreter' package on PyPI
<https://pypi.org/project/treeinterpreter/>. Each prediction can be
decomposed as 'prediction = bias + feature_1_contribution + ... +
feature_n_contribution'. This decomposition is then used to calculate
the Mean Decrease Impurity (MDI) and Mean Decrease Impurity using
out-of-bag samples (MDI-oob) feature importance measures, based on the
work of Li et al. (2019).
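The decomposition can be illustrated with a minimal Python sketch for a single scikit-learn regression tree (an illustrative assumption; the package itself operates on whole random forests in R). Walking the decision path, each split's change in node mean is attributed to the split feature, so the leaf prediction telescopes to bias plus per-feature contributions.

```python
# Illustrative sketch (not the package's code): decompose a single
# decision-tree prediction as bias + per-feature contributions.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=200, n_features=4, random_state=0)
tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)

def decompose(tree, x):
    """Return (bias, contributions) so that prediction == bias + sum(contributions)."""
    t = tree.tree_
    node = 0
    bias = t.value[0][0][0]                 # mean response at the root
    contrib = np.zeros(x.shape[0])
    while t.children_left[node] != -1:      # -1 marks a leaf
        feat = t.feature[node]
        child = (t.children_left[node]
                 if x[feat] <= t.threshold[node]
                 else t.children_right[node])
        # the change in node mean is attributed to the split feature
        contrib[feat] += t.value[child][0][0] - t.value[node][0][0]
        node = child
    return bias, contrib

bias, contrib = decompose(tree, X[0])
print(np.isclose(bias + contrib.sum(), tree.predict(X[:1])[0]))  # True
```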
Integrative Random Forest for Gene Regulatory Network Inference
Provides a flexible integrative algorithm that allows information from prior data, such as protein-protein interactions and gene knock-down, to be jointly considered for gene regulatory network inference.
Block Forests: Random Forests for Blocks of Clinical and Omics Covariate Data
A random forest variant 'block forest' ('BlockForest') tailored to the
prediction of binary, survival and continuous outcomes using block-structured
covariate data, for example, clinical covariates plus measurements of a certain
omics data type or multi-omics data, that is, data for which measurements of
different types of omics data and/or clinical data for each patient exist. Examples
of different omics data types include gene expression measurements, mutation data
and copy number variation measurements.
Block forests are presented in Hornung & Wright (2019). The package includes four
other random forest variants for multi-omics data: 'RandomBlock', 'BlockVarSel',
'VarProb', and 'SplitWeights'. These were also considered in Hornung & Wright (2019),
but performed worse than block forest in their comparison study based on 20 real
multi-omics data sets. Therefore, we recommend using block forest ('BlockForest')
in applications. The other random forest variants can, however, be consulted for
academic purposes, for example, in the context of further methodological
developments.
Reference: Hornung, R. & Wright, M. N. (2019) Block Forests: random forests for blocks of clinical and omics covariate data. BMC Bioinformatics 20:358.
Clustered Random Forests for Optimal Prediction and Inference of Clustered Data
A clustered random forest algorithm for fitting random forests to data consisting of independent clusters that exhibit within-cluster dependence.
Details of the method can be found in Young and Buehlmann (2025).
Random Forests, Linear Trees, and Gradient Boosting for Inference and Interpretability
Provides fast implementations of Random Forests, Gradient Boosting, and Linear Random Forests, with an emphasis on inference and interpretability. Additionally contains methods for variable importance, out-of-bag prediction, regression monotonicity, and several methods for missing data imputation.
Oblique Random Forests for Right-Censored Time-to-Event Data
Oblique random survival forests incorporate linear combinations of input variables into random survival forests (Ishwaran, 2008).
Sequential Permutation Testing of Random Forest Variable Importance Measures
Sequential permutation testing for statistical
significance of predictors in random forests and other prediction methods.
The main function of the package is rfvimptest(), which allows testing for
the statistical significance of predictors in random forests using
different (sequential) permutation test strategies [1]. The advantage
of sequential over conventional permutation tests is that they
are computationally considerably less intensive, as the sequential
procedure is stopped as soon as there is sufficient evidence
for either the null or the alternative hypothesis.
Reference:
[1] Hapfelmeier, A., Hornung, R. & Haller, B. (2023) Efficient permutation
testing of variable importance measures by the example of random forests.
Computational Statistics & Data Analysis 181:107689.
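The early-stopping idea can be sketched as follows, in a hedged Python illustration (not the rfvimptest() implementation, and using a simple correlation statistic as a stand-in for a variable importance measure): after each permutation, the smallest and largest p-values still achievable are bounded, and the procedure stops as soon as the decision at level alpha can no longer change.

```python
# Illustrative curtailed permutation test: stop permuting once the
# decision at level alpha is certain, whatever the remaining draws do.
import numpy as np

rng = np.random.default_rng(1)

def seq_perm_test(x, y, stat, n_max=1000, alpha=0.05):
    """Sequential permutation test with curtailed stopping."""
    observed = stat(x, y)
    exceed = 0
    for b in range(1, n_max + 1):
        if stat(rng.permutation(x), y) >= observed:
            exceed += 1
        p_min = (exceed + 1) / (n_max + 1)                  # exceedances can only grow
        p_max = (exceed + 1 + (n_max - b)) / (n_max + 1)    # if every remaining draw exceeds
        if p_min > alpha:       # can never become significant: stop early
            return False, b
        if p_max <= alpha:      # can never lose significance: stop early
            return True, b
    return (exceed + 1) / (n_max + 1) <= alpha, n_max

# a strongly predictive variable is declared significant without
# exhausting all n_max permutations
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + rng.normal(0.0, 0.1, 50)
abs_corr = lambda a, b: abs(np.corrcoef(a, b)[0, 1])
significant, n_used = seq_perm_test(x, y, abs_corr)
print(significant, n_used)
```

For null predictors the saving is largest: exceedances accumulate quickly, so the non-significance decision is typically reached after a small fraction of n_max permutations.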
Integrated Prediction using Uni-Variate and Multivariate Random Forests
An implementation of a framework for drug sensitivity prediction from various genetic characterizations using ensemble approaches. Random Forest or Multivariate Random Forest predictive models can be generated from each genetic characterization and then combined using a least-squares regression approach. It also provides options for different error estimation approaches (leave-one-out, bootstrap, N-fold cross-validation, and 0.632+ bootstrap), along with generation of prediction confidence intervals using the Jackknife-after-Bootstrap approach.
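The combination step can be sketched in Python with two stand-in prediction vectors (hypothetical data; the package's actual interface is not shown): least-squares weights are fitted against the observed responses, so the combined predictor is at least as accurate in-sample as either input model.

```python
# Illustrative least-squares combination of two model predictions.
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(size=100)
# stand-ins for predictions from two models trained on different
# genetic characterizations (one noisier than the other)
pred1 = y + rng.normal(0.0, 0.5, 100)
pred2 = y + rng.normal(0.0, 1.0, 100)

P = np.column_stack([pred1, pred2])
weights, *_ = np.linalg.lstsq(P, y, rcond=None)   # least-squares weights
combined = P @ weights

mse = lambda p: np.mean((p - y) ** 2)
print(mse(combined) <= min(mse(pred1), mse(pred2)))  # True
```

Because each input prediction is itself a point in the span the least-squares fit searches over, the combined in-sample error can never exceed that of the better single model.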
Optimising Random Forest Stability by Determining the Optimal Number of Trees
Calculates the stability of a random forest for given numbers of trees. The non-linear relationship between stability and the number of trees is described using a logistic regression model and used to estimate the optimal number of trees.
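The notion of stability can be illustrated with a minimal Python sketch (not the package's estimator, which fits the logistic model in R): agreement between two forests grown with different seeds is measured on held-out data for increasing numbers of trees.

```python
# Illustrative stability measure: fraction of held-out predictions on
# which two independently seeded forests of the same size agree.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, _ = train_test_split(X, y, random_state=0)

def stability(n_trees):
    preds = [RandomForestClassifier(n_estimators=n_trees, random_state=s)
             .fit(X_tr, y_tr).predict(X_te) for s in (1, 2)]
    return np.mean(preds[0] == preds[1])    # fraction of matching predictions

for n in (5, 50, 500):
    print(n, stability(n))                  # agreement tends to rise with n
```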
A Random-Forest-Based Approach for Imputing Clustered Incomplete Data
Offers random-forest-based functions to impute clustered incomplete data. The package is tailored for, but not limited to, imputing multi-tissue expression data, in which a gene's expression is measured on the collected tissues of an individual but is missing on the uncollected tissues.
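A generic sketch of random-forest imputation, on simulated data (this illustrates the missForest-style idea only, not this package's cluster-aware method): the incomplete column is predicted from the complete ones using rows where it is observed.

```python
# Illustrative random-forest imputation of one incomplete column.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X_true = rng.normal(size=(200, 4))
X_true[:, 3] = X_true[:, 0] + 0.1 * rng.normal(size=200)  # col 3 tracks col 0
X = X_true.copy()
miss = rng.random(200) < 0.2          # knock out ~20% of column 3
X[miss, 3] = np.nan

# fit on rows where column 3 is observed, predict where it is missing
rf = RandomForestRegressor(n_estimators=100, random_state=0)
rf.fit(X[~miss, :3], X[~miss, 3])
X_imp = X.copy()
X_imp[miss, 3] = rf.predict(X[miss, :3])

print(np.mean(np.abs(X_imp[miss, 3] - X_true[miss, 3])))  # small imputation error
```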