A Laboratory for Recursive Partytioning
A computational toolbox for recursive partitioning.
The core of the package is ctree(), an implementation of
conditional inference trees which embed tree-structured
regression models into a well-defined theory of conditional
inference procedures. This non-parametric class of regression
trees is applicable to all kinds of regression problems, including
nominal, ordinal, numeric, censored as well as multivariate response
variables and arbitrary measurement scales of the covariates.
Based on conditional inference trees, cforest() provides an
implementation of Breiman's random forests. The function mob()
implements an algorithm for recursive partitioning based on
parametric models (e.g. linear models, GLMs or survival
regression) employing parameter instability tests for split
selection. Extensible functionality for visualizing tree-structured
regression models is available. The methods are described in
Hothorn et al. (2006)
A Diceware Passphrase Implementation
The Diceware method can be used to generate strong passphrases. In short, you roll a six-sided die 5 times in a row, and the resulting five-digit number is matched against a dictionary of easily remembered words. By combining 7 words generated this way, you obtain a passphrase that is relatively easy to remember but would take several million years (on average) for a powerful computer to guess.
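The procedure described above can be sketched in a few lines of Python. This is an illustration of the method only, not the package's code: the function names and the placeholder word list (`demo_wordlist`) are assumptions made for the example, and a real Diceware list maps each of the 6^5 = 7776 keys to an actual word.

```python
import itertools
import math
import secrets


def roll_key(rolls=5):
    """Roll a six-sided die `rolls` times and join the digits, e.g. '16342'."""
    return "".join(str(secrets.randbelow(6) + 1) for _ in range(rolls))


def demo_wordlist():
    """Hypothetical stand-in list: one placeholder 'word' per five-roll key."""
    keys = ("".join(map(str, d)) for d in itertools.product(range(1, 7), repeat=5))
    return {key: "word" + key for key in keys}


def passphrase(wordlist, n_words=7):
    """Look up one word per five-roll key and join the words with spaces."""
    return " ".join(wordlist[roll_key()] for _ in range(n_words))


def search_space_bits(n_words=7, rolls=5):
    """Size of the search space in bits: n_words * log2(6 ** rolls)."""
    return n_words * rolls * math.log2(6)
```

With 7 words drawn from 7776 equally likely entries, `search_space_bits()` comes to roughly 90.5 bits, which is the basis for the guessing-time claim.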
Generation of Random Vectors with User-Defined Density
Random vectors with arbitrary Lipschitz density are generated using acceptance/rejection. The method is based on G. Beliakov (2005)
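For readers unfamiliar with acceptance/rejection, here is a minimal Python sketch of the basic idea. It uses a constant envelope over a box, which is an assumption made for simplicity; it is not the Lipschitz-based construction of the cited method. The names `rejection_sample` and `tent` are hypothetical.

```python
import random


def rejection_sample(density, bounds, density_max, n, rng=None):
    """Draw n vectors from an unnormalised density on a box.

    bounds: one (low, high) pair per coordinate.
    density_max: an upper bound on the density over the whole box.
    """
    rng = rng or random.Random()
    samples = []
    while len(samples) < n:
        # Propose a point uniformly in the box.
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        # Accept it with probability density(x) / density_max.
        if rng.uniform(0.0, density_max) < density(x):
            samples.append(x)
    return samples


def tent(x):
    """Hypothetical 2-D Lipschitz density (unnormalised), peaked at the origin."""
    return max(0.0, 1.0 - abs(x[0]) - abs(x[1]))
```

A call such as `rejection_sample(tent, [(-1, 1), (-1, 1)], 1.0, 1000)` returns 1000 points concentrated near the origin. The tighter the envelope fits the density, the fewer proposals are rejected.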
Testing Randomness in R
Provides several nonparametric randomness tests for numeric sequences.
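As one example of what a nonparametric randomness test looks like, the sketch below implements a Wald-Wolfowitz runs test about the median with a normal approximation. It is written in Python for illustration; the function name `runs_test` is an assumption and this is not the package's interface.

```python
import math
import statistics


def runs_test(values):
    """Two-sided runs test about the median; returns (runs, z, p_value)."""
    median = statistics.median(values)
    # Classify each value as above or below the median, dropping ties.
    signs = [v > median for v in values if v != median]
    n1 = sum(signs)
    n2 = len(signs) - n1
    if n1 == 0 or n2 == 0:
        raise ValueError("need values on both sides of the median")
    # A run is a maximal block of consecutive equal signs.
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    n = n1 + n2
    mean = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n * n * (n - 1))
    if variance <= 0:
        raise ValueError("sequence too short for the normal approximation")
    z = (runs - mean) / math.sqrt(variance)
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return runs, z, p_value
```

Too many runs (a strictly alternating sequence) and too few runs (a sorted sequence) both give a small p-value, since either pattern is unlikely under randomness.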
Randomization Inference Tools
Tools for randomization-based inference. Current focus is on the d^2 omnibus test of differences of means following Hansen and Bowers (2008)
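The general idea of randomization-based inference can be illustrated with a simple permutation test for a difference of means on a single variable. This Python sketch is deliberately simpler than the d^2 omnibus statistic mentioned above and does not reproduce it; the function name and defaults are assumptions.

```python
import random
import statistics


def permutation_p_value(treated, control, n_perm=2000, seed=0):
    """Monte Carlo permutation p-value for |mean(treated) - mean(control)|."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(treated) - statistics.fmean(control))
    pooled = list(treated) + list(control)
    k = len(treated)
    extreme = 0
    for _ in range(n_perm):
        # Re-randomize group labels by shuffling the pooled data.
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:k]) - statistics.fmean(pooled[k:]))
        # Small tolerance guards against floating-point ties.
        if diff >= observed - 1e-12:
            extreme += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return (extreme + 1) / (n_perm + 1)
```

The reference distribution comes from the randomization itself rather than from a parametric model, which is the defining feature of this style of inference.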
R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Tools for Analyzing Finite Mixture Models
Analyzes finite mixture models for various parametric and semiparametric settings. This includes mixtures of parametric distributions (normal, multivariate normal, multinomial, gamma), various Reliability Mixture Models (RMMs), mixtures-of-regressions settings (linear regression, logistic regression, Poisson regression, linear regression with changepoints, predictor-dependent mixing proportions, random effects regressions, hierarchical mixtures-of-experts), and tools for selecting the number of components (bootstrapping the likelihood ratio test statistic, mixturegrams, and model selection criteria). Bayesian estimation of mixtures-of-linear-regressions models is available as well as a novel data depth method for obtaining credible bands. This package is based upon work supported by the National Science Foundation under Grant No. SES-0518772 and the Chan Zuckerberg Initiative: Essential Open Source Software for Science (Grant No. 2020-255193).
Core Functionality of the 'spatstat' Family
Functionality for data analysis and modelling of spatial data, mainly spatial point patterns, in the 'spatstat' family of packages. (Excludes analysis of spatial data on a linear network, which is covered by the separate package 'spatstat.linnet'.) Exploratory methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported. Parametric models can be fitted to point pattern data using the functions ppm(), kppm(), slrm(), and dppm(), similar to glm(). Types of models include Poisson, Gibbs and Cox point processes, Neyman-Scott cluster processes, and determinantal point processes. Models may involve dependence on covariates, inter-point interaction, cluster formation and dependence on marks. Models are fitted by maximum likelihood, logistic regression, minimum contrast, and composite likelihood methods. A model can be fitted to a list of point patterns (replicated point pattern data) using the function mppm(). The model can include random effects and fixed effects depending on the experimental design, in addition to all the features listed above. Fitted point process models can be simulated automatically. Formal hypothesis tests of a fitted model are supported (likelihood ratio test, analysis of deviance, Monte Carlo tests) along with basic tools for model selection (stepwise(), AIC()) and variable selection (sdr).
Tools for validating the fitted model include simulation envelopes, residuals, residual plots and Q-Q plots, leverage and influence diagnostics, partial residuals, and added variable plots.
Functions for Optimal Non-Bipartite Matching
Perform non-bipartite matching and matched randomization. A "bipartite" matching utilizes two separate groups, e.g. smokers being matched to nonsmokers or cases being matched to controls. A "non-bipartite" matching creates mates from one big group, e.g. 100 hospitals being randomized for a two-arm cluster randomized trial or 5000 children who have been exposed to various levels of secondhand smoke and are being paired to form a greater exposure vs. lesser exposure comparison. At the core of a non-bipartite matching is an N x N distance matrix for N potential mates. The distance between two units expresses a measure of similarity or quality as mates (the lower the better). The 'gendistance()' and 'distancematrix()' functions assist in creating this matrix. The 'nonbimatch()' function creates the matching that minimizes the total sum of distances between mates; hence, it is referred to as an "optimal" matching. The 'assign.grp()' function aids in performing a matched randomization. Note that bipartite matching can be performed using the prevent option in 'gendistance()'.
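To make the notion of an "optimal" non-bipartite matching concrete, the Python sketch below builds an N x N distance matrix from a single covariate and finds the pairing with the smallest total distance by exhaustive search. This is an illustration only: exhaustive search is feasible for a handful of units, it is not how the package solves the problem, and the helper names are hypothetical.

```python
import math


def distance_matrix(covariate):
    """N x N matrix of absolute differences on one covariate."""
    return [[abs(a - b) for b in covariate] for a in covariate]


def optimal_pairs(dist):
    """Return (total_distance, pairs) minimizing the sum of within-pair distances."""
    n = len(dist)
    if n % 2:
        raise ValueError("need an even number of units")

    def solve(units):
        if not units:
            return 0.0, []
        first, rest = units[0], units[1:]
        best_cost, best_pairs = math.inf, []
        # Try every possible mate for the first unmatched unit.
        for i, mate in enumerate(rest):
            cost, pairs = solve(rest[:i] + rest[i + 1:])
            cost += dist[first][mate]
            if cost < best_cost:
                best_cost, best_pairs = cost, [(first, mate)] + pairs
        return best_cost, best_pairs

    return solve(tuple(range(n)))
```

For covariate values `[1, 10, 2, 11]`, the search pairs unit 0 with unit 2 and unit 1 with unit 3, since any other pairing forces two distant units together.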
Wrapper Algorithm for All Relevant Feature Selection
An all relevant feature selection wrapper algorithm. It finds relevant features by comparing original attributes' importance with importance achievable at random, estimated using their permuted copies (shadows).
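The shadow-feature idea can be sketched in Python as follows. This is a simplified illustration under stated assumptions: absolute Pearson correlation stands in for the importance measure, and the loop only counts how often each feature beats the best shadow, without the statistical decision step a full wrapper would apply. The names `corr` and `shadow_hits` are hypothetical.

```python
import math
import random
import statistics


def corr(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def shadow_hits(columns, target, n_rounds=50, seed=0):
    """Count, per feature, the rounds in which it beats the best shadow."""
    rng = random.Random(seed)
    # Stand-in importance: absolute correlation with the target (an assumption).
    real = {name: abs(corr(values, target)) for name, values in columns.items()}
    hits = dict.fromkeys(columns, 0)
    for _ in range(n_rounds):
        shadow_max = 0.0
        for values in columns.values():
            # A shadow is a permuted copy: same values, link to the target broken.
            shadow = list(values)
            rng.shuffle(shadow)
            shadow_max = max(shadow_max, abs(corr(shadow, target)))
        for name in columns:
            if real[name] > shadow_max:
                hits[name] += 1
    return hits
```

A feature that wins in nearly every round is a candidate for "relevant", while one that rarely beats the shadows carries no more importance than is achievable at random.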