Examples: visualization, C++, networks, data cleaning, html widgets, ropensci.

Found 1774 packages in 0.01 seconds

trtf — by Torsten Hothorn, 9 months ago

Transformation Trees and Forests

Recursive partytioning of transformation models with corresponding random forest for conditional transformation models as described in 'Transformation Forests' (Hothorn and Zeileis, 2021, ) and 'Top-Down Transformation Choice' (Hothorn, 2018, ).

bonsai — by Emil Hvitfeldt, 4 months ago

Model Wrappers for Tree-Based Models

Bindings for additional tree-based model engines for use with the 'parsnip' package. Models include gradient boosted decision trees with 'LightGBM' (Ke et al, 2017.), conditional inference trees and conditional random forests with 'partykit' (Hothorn and Zeileis, 2015. and Hothorn et al, 2006. ), and accelerated oblique random forests with 'aorsf' (Jaeger et al, 2022 ).

yaImpute — by Jeffrey S. Evans, a year ago

Nearest Neighbor Observation Imputation and Evaluation Tools

Performs nearest neighbor-based imputation using one or more alternative approaches to processing multivariate data. These include methods based on canonical correlation: analysis, canonical correspondence analysis, and a multivariate adaptation of the random forest classification and regression techniques of Leo Breiman and Adele Cutler. Additional methods are also offered. The package includes functions for comparing the results from running alternative techniques, detecting imputation targets that are notably distant from reference observations, detecting and correcting for bias, bootstrapping and building ensemble imputations, and mapping results.

varImp — by Philipp Probst, 5 years ago

RF Variable Importance for Arbitrary Measures

Computes the random forest variable importance (VIMP) for the conditional inference random forest (cforest) of the 'party' package. Includes a function (varImp) that computes the VIMP for arbitrary measures from the 'measures' package. For calculating the VIMP regarding the measures accuracy and AUC two extra functions exist (varImpACC and varImpAUC).

missRanger — by Michael Mayer, 10 months ago

Fast Imputation of Missing Values

Alternative implementation of the beautiful 'MissForest' algorithm used to impute mixed-type data sets by chaining random forests, introduced by Stekhoven, D.J. and Buehlmann, P. (2012) . Under the hood, it uses the lightning fast random forest package 'ranger'. Between the iterative model fitting, we offer the option of using predictive mean matching. This firstly avoids imputation with values not already present in the original data (like a value 0.3334 in 0-1 coded variable). Secondly, predictive mean matching tries to raise the variance in the resulting conditional distributions to a realistic level. This would allow, e.g., to do multiple imputation when repeating the call to missRanger(). Out-of-sample application is supported as well.

meta — by Guido Schwarzer, 2 months ago

General Package for Meta-Analysis

User-friendly general package providing standard methods for meta-analysis and supporting Schwarzer, Carpenter, and Rücker , "Meta-Analysis with R" (2015): - common effect and random effects meta-analysis; - several plots (forest, funnel, Galbraith / radial, L'Abbe, Baujat, bubble); - three-level meta-analysis model; - generalised linear mixed model; - logistic regression with penalised likelihood for rare events; - Hartung-Knapp method for random effects model; - Kenward-Roger method for random effects model; - prediction interval; - statistical tests for funnel plot asymmetry; - trim-and-fill method to evaluate bias in meta-analysis; - meta-regression; - cumulative meta-analysis and leave-one-out meta-analysis; - import data from 'RevMan 5'; - produce forest plot summarising several (subgroup) meta-analyses.

icRSF — by Hui Xu, 8 years ago

A Modified Random Survival Forest Algorithm

Implements a modification to the Random Survival Forests algorithm for obtaining variable importance in high dimensional datasets. The proposed algorithm is appropriate for settings in which a silent event is observed through sequentially administered, error-prone self-reports or laboratory based diagnostic tests. The modified algorithm incorporates a formal likelihood framework that accommodates sequentially administered, error-prone self-reports or laboratory based diagnostic tests. The original Random Survival Forests algorithm is modified by the introduction of a new splitting criterion based on a likelihood ratio test statistic.

recforest — by Juliette Murris, a year ago

Random Survival Forest for Recurrent Events

Analyze recurrent events with right-censored data and the potential presence of a terminal event (that prevents further occurrences, like death). 'recofest' extends the random survival forest algorithm, adapting splitting rules and node estimators to handle complexities of recurrent events. The methodology is fully described in Murris, J., Bouaziz, O., Jakubczak, M., Katsahian, S., & Lavenu, A. (2024) (< https://hal.science/hal-04612431v1/document>).

CORElearn — by Marko Robnik-Sikonja, a year ago

Classification, Regression and Feature Evaluation

A suite of machine learning algorithms written in C++ with the R interface contains several learning techniques for classification and regression. Predictive models include e.g., classification and regression trees with optional constructive induction and models in the leaves, random forests, kNN, naive Bayes, and locally weighted regression. All predictions obtained with these models can be explained and visualized with the 'ExplainPrediction' package. This package is especially strong in feature evaluation where it contains several variants of Relief algorithm and many impurity based attribute evaluation functions, e.g., Gini, information gain, MDL, and DKM. These methods can be used for feature selection or discretization of numeric attributes. The OrdEval algorithm and its visualization is used for evaluation of data sets with ordinal features and class, enabling analysis according to the Kano model of customer satisfaction. Several algorithms support parallel multithreaded execution via OpenMP. The top-level documentation is reachable through ?CORElearn.

randomUniformForest — by Saip Ciss, 3 years ago

Random Uniform Forests for Classification, Regression and Unsupervised Learning

Ensemble model, for classification, regression and unsupervised learning, based on a forest of unpruned and randomized binary decision trees. Each tree is grown by sampling, with replacement, a set of variables at each node. Each cut-point is generated randomly, according to the continuous Uniform distribution. For each tree, data are either bootstrapped or subsampled. The unsupervised mode introduces clustering, dimension reduction and variable importance, using a three-layer engine. Random Uniform Forests are mainly aimed to lower correlation between trees (or trees residuals), to provide a deep analysis of variable importance and to allow native distributed and incremental learning.