METACRAN search results

RFmerge — by Mauricio Zambrano-Bigiarini, 5 years ago

Merging of Satellite Datasets with Ground Observations using Random Forests

S3 implementation of the Random Forest MErging Procedure (RF-MEP), which combines two or more satellite-based datasets (e.g., precipitation products, topography) with ground observations to produce a new dataset with improved spatio-temporal distribution of the target field. In particular, this package was developed to merge different Satellite-based Rainfall Estimates (SREs) with measurements from rain gauges, in order to obtain a new precipitation dataset where the time series in the rain gauges are used to correct different types of errors present in the SREs. However, this package might be used to merge other hydrological/environmental satellite fields with point observations. For details, see Baez-Villanueva et al. (2020) . Bugs / comments / questions / collaboration of any kind are very welcomed.

https://github.com/hzambran/RFmerge

missForestPredict — by Elena Albu, 5 months ago

Missing Value Imputation using Random Forest for Prediction Settings

Missing data imputation based on the 'missForest' algorithm (Stekhoven, Daniel J (2012) ) with adaptations for prediction settings. The function missForest() is used to impute a (training) dataset with missing values and to learn imputation models that can be later used for imputing new observations. The function missForestPredict() is used to impute one or multiple new observations (test set) using the models learned on the training data. For more details see Albu, E., Gao, S., Wynants, L., & Van Calster, B. (2024). missForestPredict--Missing data imputation for prediction settings .

https://github.com/sibipx/missForestPredict

forestControl — by Tom Wilson, 4 years ago

Approximate False Positive Rate Control in Selection Frequency for Random Forest

Approximate false positive rate control in selection frequency for random forest using the methods described by Ender Konukoglu and Melanie Ganz (2014) . Methods for calculating the selection frequency threshold at false positive rates and selection frequency false positive rate feature selection.

https://github.com/aberHRML/forestControl

ModelMap — by Elizabeth Freeman, 20 days ago

Modeling and Map Production using Random Forest and Related Stochastic Models

Creates sophisticated models of training data and validates the models with an independent test set, cross validation, or Out Of Bag (OOB) predictions on the training data. Create graphs and tables of the model validation results. Applies these models to GIS .img files of predictors to create detailed prediction surfaces. Handles large predictor files for map making, by reading in the .img files in chunks, and output to the .txt file the prediction for each data chunk, before reading the next chunk of data.

JRF — by Francesca Petralia Developer, 9 years ago

Joint Random Forest (JRF) for the Simultaneous Estimation of Multiple Related Networks

Simultaneous estimation of multiple related networks.

https://www.r-project.org

party — by Torsten Hothorn, 9 months ago

A Laboratory for Recursive Partytioning

A computational toolbox for recursive partitioning. The core of the package is ctree(), an implementation of conditional inference trees which embed tree-structured regression models into a well defined theory of conditional inference procedures. This non-parametric class of regression trees is applicable to all kinds of regression problems, including nominal, ordinal, numeric, censored as well as multivariate response variables and arbitrary measurement scales of the covariates. Based on conditional inference trees, cforest() provides an implementation of Breiman's random forests. The function mob() implements an algorithm for recursive partitioning based on parametric models (e.g. linear models, GLMs or survival regression) employing parameter instability tests for split selection. Extensible functionality for visualizing tree-structured regression models is available. The methods are described in Hothorn et al. (2006) , Zeileis et al. (2008) and Strobl et al. (2007) .

http://party.R-forge.R-project.org

MERO — by Mohamed Soudy, 3 years ago

Performing Monte Carlo Expectation Maximization Random Forest Imputation for Biological Data

Perform missing value imputation for biological data using the random forest algorithm, the imputation aim to keep the original mean and standard deviation consistent after imputation.

diversityForest — by Roman Hornung, 6 months ago

Innovative Complex Split Procedures in Random Forests Through Candidate Split Sampling

Implementation of three methods based on the diversity forest (DF) algorithm (Hornung, 2022, ), a split-finding approach that enables complex split procedures in random forests. The package includes: 1. Interaction forests (IFs) (Hornung & Boulesteix, 2022, ): Model quantitative and qualitative interaction effects using bivariable splitting. Come with the Effect Importance Measure (EIM), which can be used to identify variable pairs that have well-interpretable quantitative and qualitative interaction effects with high predictive relevance. 2. Two random forest-based variable importance measures (VIMs) for multi-class outcomes: the class-focused VIM, which ranks covariates by their ability to distinguish individual outcome classes from the others, and the discriminatory VIM, which measures overall covariate influence irrespective of class-specific relevance. 3. The basic form of diversity forests that uses conventional univariable, binary splitting (Hornung, 2022). Except for the multi-class VIMs, all methods support categorical, metric, and survival outcomes. The package includes visualization tools for interpreting the identified covariate effects. Built as a fork of the 'ranger' R package (main author: Marvin N. Wright), which implements random forests using an efficient C++ implementation.

h2o — by Tomas Fryda, 2 years ago

R Interface for the 'H2O' Scalable Machine Learning Platform

R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).

https://github.com/h2oai/h2o-3

survcompare — by Diana Shamsutdinova, 4 months ago

Nested Cross-Validation to Compare Cox-PH, Cox-Lasso, Survival Random Forests

Performs repeated nested cross-validation for Cox Proportionate Hazards, Cox Lasso, Survival Random Forest, and their ensemble. Returns internally validated concordance index, time-dependent area under the curve, Brier score, calibration slope, and statistical testing of non-linear ensemble outperforming the baseline Cox model. In this, it helps researchers to quantify the gain of using a more complex survival model, or justify its redundancy. Equally, it shows the performance value of the non-linear and interaction terms, and may highlight the need of further feature transformation. Further details can be found in Shamsutdinova, Stamate, Roberts, & Stahl (2022) "Combining Cox Model and Tree-Based Algorithms to Boost Performance and Preserve Interpretability for Health Outcomes" , where the method is described as Ensemble 1.

Search results

R links

R homepage

Download R

Mailing lists

R documentation

R manuals

R FAQs

The R Journal

CRAN links

CRAN homepage

CRAN repository policy

Submit a package

METACRAN stuff

About METACRAN

At github

Report a bug