
Found 1652 packages in 0.01 seconds

party — by Torsten Hothorn, a year ago

A Laboratory for Recursive Partytioning

A computational toolbox for recursive partitioning. The core of the package is ctree(), an implementation of conditional inference trees which embed tree-structured regression models into a well-defined theory of conditional inference procedures. This non-parametric class of regression trees is applicable to all kinds of regression problems, including nominal, ordinal, numeric, censored as well as multivariate response variables and arbitrary measurement scales of the covariates. Based on conditional inference trees, cforest() provides an implementation of Breiman's random forests. The function mob() implements an algorithm for recursive partitioning based on parametric models (e.g. linear models, GLMs or survival regression) employing parameter instability tests for split selection. Extensible functionality for visualizing tree-structured regression models is available. The methods are described in Hothorn et al. (2006), Zeileis et al. (2008), and Strobl et al. (2007).
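For instance, a minimal sketch on the built-in iris data using the interfaces named above:

    library(party)
    # Conditional inference tree for a nominal response
    ct <- ctree(Species ~ ., data = iris)
    plot(ct)
    # Conditional random forest with 100 trees
    cf <- cforest(Species ~ ., data = iris,
                  controls = cforest_unbiased(ntree = 100))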

riceware — by Francois Michonneau, 11 years ago

A Diceware Passphrase Implementation

The Diceware method can be used to generate strong passphrases. In short, you roll a six-sided die five times in a row, and the resulting number is matched against a dictionary of easily remembered words. By combining seven words generated this way, you obtain a passphrase that is relatively easy to remember but would take a powerful computer several million years, on average, to guess.
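A minimal base-R sketch of the method as described (not the riceware interface itself); 'diceware_words', a lookup table keyed by five-digit dice codes, is hypothetical here, whereas the package ships its own word list:

    set.seed(42)  # reproducible illustration only; real use needs true randomness
    roll_key <- function() paste(sample(1:6, 5, replace = TRUE), collapse = "")
    keys <- replicate(7, roll_key())             # seven five-roll codes, e.g. "34251"
    # passphrase <- paste(diceware_words[keys], collapse = " ")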

ranlip — by Gleb Beliakov, 5 years ago

Generation of Random Vectors with User-Defined Density

Random vectors with arbitrary Lipschitz density are generated using acceptance/rejection. The method is based on G. Beliakov (2005).
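A generic acceptance/rejection sketch in base R (not the ranlip interface itself, which builds a much tighter hat function from the Lipschitz properties of the density): sample from a density f on the unit square bounded above by M, using uniform proposals.

    f <- function(x) 4 * x[1] * x[2]          # example density on [0,1]^2
    M <- 4                                    # upper bound on f
    raccept <- function(n) {
      out <- matrix(NA_real_, n, 2)
      i <- 0
      while (i < n) {
        x <- runif(2)
        if (runif(1) < f(x) / M) {            # accept with probability f(x)/M
          i <- i + 1
          out[i, ] <- x
        }
      }
      out
    }
    samples <- raccept(1000)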

randtests — by Frederico Caeiro, 2 years ago

Testing Randomness in R

Provides several nonparametric randomness tests for numeric sequences.
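For example, assuming the package's standard tests such as runs.test() and turning.point.test():

    library(randtests)
    set.seed(1)
    x <- rnorm(100)
    runs.test(x)            # Wald-Wolfowitz runs test
    turning.point.test(x)   # turning point test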

RItools — by Jake Bowers, 8 months ago

Randomization Inference Tools

Tools for randomization-based inference. Current focus is on the d^2 omnibus test of differences of means following Hansen and Bowers (2008). This test is useful for assessing balance in matched observational studies or for analysis of outcomes in block-randomized experiments.
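A hedged sketch assuming the package's xBalance() formula interface (the newer balanceTest() is similar); 'dat', 'treat', and the covariates are hypothetical:

    library(RItools)
    # xb <- xBalance(treat ~ age + income, data = dat,
    #                report = c("std.diffs", "chisquare.test"))
    # xb$overall   # the d^2 omnibus chi-squared test of balance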

h2o — by Tomas Fryda, 2 years ago

R Interface for the 'H2O' Scalable Machine Learning Platform

R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
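A brief sketch of the typical workflow: start a local cluster, push an R data frame into it, and fit one of the algorithms listed above (here a random forest on iris):

    library(h2o)
    h2o.init()                                   # start/connect to a local H2O instance
    iris_h2o <- as.h2o(iris)
    rf <- h2o.randomForest(x = 1:4, y = "Species",
                           training_frame = iris_h2o, ntrees = 50)
    h2o.performance(rf)                          # training metrics
    h2o.shutdown(prompt = FALSE)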

mixtools — by Derek Young, 10 months ago

Tools for Analyzing Finite Mixture Models

Analyzes finite mixture models for various parametric and semiparametric settings. This includes mixtures of parametric distributions (normal, multivariate normal, multinomial, gamma), various Reliability Mixture Models (RMMs), mixtures-of-regressions settings (linear regression, logistic regression, Poisson regression, linear regression with changepoints, predictor-dependent mixing proportions, random effects regressions, hierarchical mixtures-of-experts), and tools for selecting the number of components (bootstrapping the likelihood ratio test statistic, mixturegrams, and model selection criteria). Bayesian estimation of mixtures-of-linear-regressions models is available as well as a novel data depth method for obtaining credible bands. This package is based upon work supported by the National Science Foundation under Grant No. SES-0518772 and the Chan Zuckerberg Initiative: Essential Open Source Software for Science (Grant No. 2020-255193).
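For example, fitting a two-component normal mixture by EM with normalmixEM():

    library(mixtools)
    set.seed(1)
    x <- c(rnorm(600, 0, 1), rnorm(400, 4, 1))   # 60/40 mixture of N(0,1) and N(4,1)
    fit <- normalmixEM(x, k = 2)
    fit$lambda; fit$mu; fit$sigma                # mixing proportions and component parameters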

spatstat.core — by Adrian Baddeley, 4 years ago

Core Functionality of the 'spatstat' Family

Functionality for data analysis and modelling of spatial data, mainly spatial point patterns, in the 'spatstat' family of packages. (Excludes analysis of spatial data on a linear network, which is covered by the separate package 'spatstat.linnet'.) Exploratory methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported. Parametric models can be fitted to point pattern data using the functions ppm(), kppm(), slrm() and dppm(), which work similarly to glm(). Types of models include Poisson, Gibbs and Cox point processes, Neyman-Scott cluster processes, and determinantal point processes. Models may involve dependence on covariates, inter-point interaction, cluster formation and dependence on marks. Models are fitted by maximum likelihood, logistic regression, minimum contrast, and composite likelihood methods. A model can be fitted to a list of point patterns (replicated point pattern data) using the function mppm(). The model can include random effects and fixed effects depending on the experimental design, in addition to all the features listed above. Fitted point process models can be simulated automatically. Formal hypothesis tests of a fitted model are supported (likelihood ratio test, analysis of deviance, Monte Carlo tests) along with basic tools for model selection (stepwise(), AIC()) and variable selection (sdr()). Tools for validating the fitted model include simulation envelopes, residuals, residual plots and Q-Q plots, leverage and influence diagnostics, partial residuals, and added variable plots.
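A brief sketch with the 'cells' point pattern that ships with the spatstat family (note that spatstat.core has since been split into spatstat.explore and spatstat.model):

    library(spatstat)
    plot(Kest(cells))                          # estimated K-function
    fit <- ppm(cells ~ 1, Strauss(r = 0.1))    # fit a Strauss (Gibbs) point process
    plot(envelope(fit, Kest, nsim = 39))       # simulation envelope for the fitted model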

nbpMatching — by Cole Beck, a year ago

Functions for Optimal Non-Bipartite Matching

Perform non-bipartite matching and matched randomization. A "bipartite" matching utilizes two separate groups, e.g. smokers being matched to nonsmokers or cases being matched to controls. A "non-bipartite" matching creates mates from one big group, e.g. 100 hospitals being randomized for a two-arm cluster randomized trial, or 5000 children who have been exposed to various levels of secondhand smoke and are being paired to form a greater-exposure vs. lesser-exposure comparison. At the core of a non-bipartite matching is an N x N distance matrix for N potential mates. The distance between two units expresses a measure of similarity or quality as mates (the lower the better). The 'gendistance()' and 'distancematrix()' functions assist in creating this matrix. The 'nonbimatch()' function creates the matching that minimizes the total sum of distances between mates; hence, it is referred to as an "optimal" matching. The 'assign.grp()' function aids in performing a matched randomization. Note that bipartite matching can be performed using the 'prevent' option in 'gendistance()'.
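A hedged sketch of the workflow named above; 'covars' stands in for a hypothetical data frame of covariates whose first column is an id, as 'gendistance()' expects:

    library(nbpMatching)
    # dist.info <- gendistance(covars, idcol = 1)   # distances between all pairs of units
    # dist.mat  <- distancematrix(dist.info)        # coerce to a distancematrix object
    # matches   <- nonbimatch(dist.mat)             # optimal non-bipartite matching
    # assign.grp(matches)                           # matched randomization into two arms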

Boruta — by Miron Bartosz Kursa, 6 months ago

Wrapper Algorithm for All Relevant Feature Selection

An all relevant feature selection wrapper algorithm. It finds relevant features by comparing original attributes' importance with importance achievable at random, estimated using their permuted copies (shadows).
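For instance, on the iris data:

    library(Boruta)
    set.seed(1)
    bor <- Boruta(Species ~ ., data = iris, doTrace = 0)
    print(bor)        # which attributes were confirmed relevant
    plot(bor)         # importance of real attributes vs. their shadow copies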