foreign — by R Core Team, 5 months ago

Read Data Stored by 'Minitab', 'S', 'SAS', 'SPSS', 'Stata', 'Systat', 'Weka', 'dBase', ...

Reading and writing data stored by some versions of 'Epi Info', 'Minitab', 'S', 'SAS', 'SPSS', 'Stata', 'Systat', 'Weka', and for reading and writing some 'dBase' files.

network — by Carter T. Butts, 9 months ago

Classes for Relational Data

Tools to create and modify network objects. The network class can represent a range of relational data types, and supports arbitrary vertex/edge/graph attributes.

reclin2 — by Jan van der Laan, 2 years ago

Record Linkage Toolkit

Functions to assist in performing probabilistic record linkage and deduplication: generating pairs, comparing records, an EM algorithm for estimating m- and u-probabilities (I. Fellegi & A. Sunter (1969), T.N. Herzog, F.J. Scheuren, & W.E. Winkler (2007), "Data Quality and Record Linkage Techniques", ISBN:978-0-387-69502-0), and forcing one-to-one matching. Can also be used for pre- and post-processing for machine learning methods for record linkage. The focus is on memory, CPU performance and flexibility.
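
The EM step for m- and u-probabilities mentioned above can be illustrated with a small sketch. This is not the reclin2 implementation: it is plain Python, it assumes binary agreement patterns and conditional independence between comparison fields, and the function names, starting values and toy data are hypothetical.

```python
import math

def clamp(p, eps=1e-6):
    """Keep a probability strictly inside (0, 1) to avoid division by zero."""
    return min(max(p, eps), 1.0 - eps)

def estimate_m_u(patterns, n_iter=50, p=0.1, m_init=0.9, u_init=0.1):
    """EM for the Fellegi-Sunter model (illustrative sketch).

    patterns: list of tuples of 0/1 agreement indicators, one tuple per record pair.
    Returns (p, m, u): the match proportion and, per field, the probability of
    agreement among matches (m) and among non-matches (u).
    """
    k = len(patterns[0])
    n = len(patterns)
    m = [m_init] * k
    u = [u_init] * k
    for _ in range(n_iter):
        # E-step: posterior probability that each pair is a true match.
        post = []
        for gamma in patterns:
            like_m, like_u = p, 1.0 - p
            for j, agree in enumerate(gamma):
                like_m *= m[j] if agree else 1.0 - m[j]
                like_u *= u[j] if agree else 1.0 - u[j]
            post.append(like_m / (like_m + like_u))
        # M-step: re-estimate parameters from the posterior weights.
        n_match = sum(post)
        p = clamp(n_match / n)
        m = [clamp(sum(g * gam[j] for g, gam in zip(post, patterns)) / n_match)
             for j in range(k)]
        u = [clamp(sum((1.0 - g) * gam[j] for g, gam in zip(post, patterns)) / (n - n_match))
             for j in range(k)]
    return p, m, u

def agreement_weight(m_j, u_j):
    """Log2 likelihood ratio contributed by agreement on one field."""
    return math.log2(m_j / u_j)

# Toy data (made up): 20 pairs agreeing on all fields, 80 agreeing on none.
toy_patterns = [(1, 1, 1)] * 20 + [(0, 0, 0)] * 80
p_hat, m_hat, u_hat = estimate_m_u(toy_patterns)
```

On real data the patterns are mixed, so the estimates are less extreme than in this toy case; the clamp only keeps the arithmetic finite.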

rstpm2 — by Mark Clements, 9 days ago

Smooth Survival Models, Including Generalized Survival Models

R implementation of generalized survival models (GSMs), smooth accelerated failure time (AFT) models and Markov multi-state models. For the GSMs, g(S(t|x))=eta(t,x) for a link function g, survival S at time t with covariates x and a linear predictor eta(t,x). The main assumption is that the time effect(s) are smooth. For fully parametric models with natural splines, this re-implements Stata's 'stpm2' function, which fits the flexible parametric survival models developed by Royston and colleagues. We have extended the parametric models to include any smooth parametric smoothers for time. We have also extended the model to include any smooth penalized smoothers from the 'mgcv' package, using penalized likelihood. These models include left truncation, right censoring, interval censoring, gamma frailties and normal random effects, and copulas. For the smooth AFTs, S(t|x) = S_0(t*eta(t,x)), where the baseline survival function S_0(t)=exp(-exp(eta_0(t))) is modelled using natural splines for eta_0, and the time-dependent cumulative acceleration factor eta(t,x)=\int_0^t exp(eta_1(u,x)) du for log acceleration factor eta_1(u,x). The Markov multi-state models allow for a range of models with smooth transitions to predict transition probabilities, length of stay, utilities and costs, with differences, ratios and standardisation.
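
To make the relation g(S(t|x))=eta(t,x) concrete, the following Python sketch inverts two common link functions to recover survival from a linear predictor. It is an illustration only, not rstpm2 code: the linear predictor uses a simple log-time term in place of splines, and the coefficient values are hypothetical.

```python
import math

def link_ph(s):
    """Proportional hazards link: g(S) = log(-log(S))."""
    return math.log(-math.log(s))

def surv_ph(eta):
    """Inverse of the proportional hazards link: S = exp(-exp(eta))."""
    return math.exp(-math.exp(eta))

def link_po(s):
    """Proportional odds link: g(S) = log((1 - S) / S)."""
    return math.log((1.0 - s) / s)

def surv_po(eta):
    """Inverse of the proportional odds link: S = 1 / (1 + exp(eta))."""
    return 1.0 / (1.0 + math.exp(eta))

def eta(t, x, gamma0=-1.0, gamma1=1.2, beta=0.5):
    """Hypothetical linear predictor: log-time baseline plus one covariate effect.

    A real model would replace gamma0 + gamma1 * log(t) with a spline in log(t).
    """
    return gamma0 + gamma1 * math.log(t) + beta * x

# Survival at t = 2 for x = 0 and x = 1 under the proportional hazards link.
s_x0 = surv_ph(eta(2.0, 0))
s_x1 = surv_ph(eta(2.0, 1))
```

With the proportional hazards link and a log-time baseline this reduces to a Weibull model, which is why a smooth function of log time is a natural generalization.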

cellWise — by Jakob Raymaekers, 2 years ago

Analyzing Data with Cellwise Outliers

Tools for detecting cellwise outliers and robust methods to analyze data which may contain them. Contains the implementation of the algorithms described in Rousseeuw and Van den Bossche (2018) (open access), Hubert et al. (2019) (open access), Raymaekers and Rousseeuw (2021) (open access), Raymaekers and Rousseeuw (2021) (open access), Raymaekers and Rousseeuw (2021) (open access), Raymaekers and Rousseeuw (2022) (open access), and Rousseeuw (2022) (open access). Examples can be found in the vignettes: "DDC_examples", "MacroPCA_examples", "wrap_examples", "transfo_examples", "DI_examples", "cellMCD_examples", "Correspondence_analysis_examples", and "cellwise_weights_examples".

NormData — by Wim Van der Elst, a year ago

Derivation of Regression-Based Normative Data

Normative data are often used to estimate the relative position of a raw test score in the population. This package allows for deriving regression-based normative data. It includes functions that enable the fitting of regression models for the mean and residual (or variance) structures, test the model assumptions, derive the normative data in the form of normative tables or automatic scoring sheets, and estimate confidence intervals for the norms. This package accompanies the book Van der Elst, W. (2024). Regression-based normative data for psychological assessment. A hands-on approach using R. Springer Nature.
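
As a rough illustration of how a regression-based norm locates a raw score in the population, the Python sketch below converts a score to a percentile rank. It is not NormData code: it assumes a linear mean model with homoscedastic, normally distributed residuals, and every coefficient and variable name is hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def percentile_rank(raw, age, education, coef, resid_sd):
    """Percentile rank of a raw score given the person's covariates.

    coef holds hypothetical regression coefficients for the mean structure;
    resid_sd is the (assumed constant) residual standard deviation.
    """
    predicted = coef["intercept"] + coef["age"] * age + coef["education"] * education
    z = (raw - predicted) / resid_sd  # standardized residual
    return 100.0 * normal_cdf(z)

# Hypothetical fitted model: scores decline with age and rise with education.
example_coef = {"intercept": 60.0, "age": -0.25, "education": 1.5}
rank = percentile_rank(raw=55.0, age=70, education=12, coef=example_coef, resid_sd=8.0)
```

When the residual variance depends on the covariates, resid_sd would itself be predicted from a second model for the variance structure.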

vegan — by Jari Oksanen, 3 months ago

Community Ecology Package

Ordination methods, diversity analysis and other functions for community and vegetation ecologists.

shinyHugePlot — by Junta Tagusari, a year ago

Efficient Plotting of Large-Sized Data

A tool to plot data with a large sample size using 'shiny' and 'plotly'. Relatively small samples are obtained from the original data using a specific algorithm. The samples are updated according to a user-defined x range. Jonas Van Der Donckt, Jeroen Van Der Donckt, Emiel Deprost (2022) <https://github.com/predict-idlab/plotly-resampler>.
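
The idea of drawing a small sample and refreshing it when the x range changes can be sketched as follows. The description does not name the algorithm, so this Python sketch uses min-max binning as a stand-in; it is not the shinyHugePlot or plotly-resampler implementation, and the function name and parameters are hypothetical. It assumes xs is sorted in ascending order.

```python
import math

def downsample_minmax(xs, ys, x_min, x_max, n_bins=100):
    """Keep at most 2 * n_bins points inside [x_min, x_max].

    The visible points are split into consecutive chunks, and the lowest and
    highest y value of each chunk are kept so that peaks survive the reduction.
    """
    idx = [i for i, x in enumerate(xs) if x_min <= x <= x_max]
    if len(idx) <= 2 * n_bins:
        # Few enough points: show them all.
        return [xs[i] for i in idx], [ys[i] for i in idx]
    size = math.ceil(len(idx) / n_bins)
    keep = []
    for start in range(0, len(idx), size):
        chunk = idx[start:start + size]
        lo = min(chunk, key=lambda i: ys[i])
        hi = max(chunk, key=lambda i: ys[i])
        keep.extend(sorted({lo, hi}))  # preserve x order within the chunk
    return [xs[i] for i in keep], [ys[i] for i in keep]

# Full view of 10,000 points, then a zoomed view that is re-sampled.
xs = list(range(10000))
ys = [math.sin(x / 200.0) for x in xs]
full_x, full_y = downsample_minmax(xs, ys, 0, 9999, n_bins=50)
zoom_x, zoom_y = downsample_minmax(xs, ys, 1000, 1099, n_bins=50)
```

Calling the function again with the new range after each zoom or pan keeps the number of plotted points bounded while showing more detail in the visible window.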

haldensify — by Nima Hejazi, 3 days ago

Highly Adaptive Lasso Conditional Density Estimation

An algorithm for flexible conditional density estimation based on application of pooled hazard regression to an artificial repeated measures dataset constructed by discretizing the support of the outcome variable. To facilitate flexible estimation of the conditional density, the highly adaptive lasso, a non-parametric regression function shown to estimate cadlag (RCLL) functions at a suitably fast convergence rate, is used. The use of pooled hazards regression for conditional density estimation as implemented here was first described by Díaz and van der Laan (2011). Building on the conditional density estimation utilities, non-parametric inverse probability weighted (IPW) estimators of the causal effects of additive modified treatment policies are implemented, using conditional density estimation to estimate the generalized propensity score. Non-parametric IPW estimators based on this can be coupled with undersmoothing of the generalized propensity score estimator to attain the semi-parametric efficiency bound (per Hejazi, Díaz, and van der Laan).
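
The construction of the artificial repeated measures dataset, and the way bin hazards map back to a density, can be sketched in Python as below. This is an illustration under simplifying assumptions (equal-width bins, no covariates, hazards supplied directly rather than fitted with the highly adaptive lasso); it is not haldensify code and all function names are hypothetical.

```python
from bisect import bisect_right

def make_breaks(values, n_bins):
    """Equal-width bin boundaries spanning the observed support."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + k * width for k in range(n_bins + 1)]

def bin_index(a, breaks):
    """Index of the bin containing a (the last bin includes its right edge)."""
    k = bisect_right(breaks, a) - 1
    return min(max(k, 0), len(breaks) - 2)

def to_repeated_measures(values, breaks):
    """One row per observation per bin up to the bin where the value falls.

    Each row is (obs_id, bin_id, in_bin), where in_bin is 1 only for the
    final row; a pooled hazard regression would model in_bin given bin_id.
    """
    rows = []
    for obs_id, a in enumerate(values):
        k = bin_index(a, breaks)
        for b in range(k + 1):
            rows.append((obs_id, b, 1 if b == k else 0))
    return rows

def density_from_hazards(a, hazards, breaks):
    """Density at a: P(fall in bin k | not in earlier bins) * P(not earlier) / width."""
    k = bin_index(a, breaks)
    prob = hazards[k]
    for j in range(k):
        prob *= 1.0 - hazards[j]
    return prob / (breaks[k + 1] - breaks[k])

values = [0.0, 0.5, 1.0]
breaks = make_breaks(values, n_bins=4)
rows = to_repeated_measures(values, breaks)
```

In the conditional setting the hazards would depend on covariates, so each row would also carry the covariate values of its observation.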

optmatch — by Josh Errickson, a year ago

Functions for Optimal Matching

Distance-based bipartite matching using minimum cost flow, oriented to matching of treatment and control groups in observational studies (Hansen and Klopfer 2006). Routines are provided to generate distances from generalised linear models (propensity score matching), formulas giving variables on which to limit matched distances, stratified or exact matching directives, or calipers, alone or in combination.
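
The difference between optimal and greedy pair matching on a propensity score distance can be shown with a tiny Python sketch. It is not optmatch code and does not use minimum cost flow: for illustration it finds the optimum by brute force over all assignments, which is only feasible for very small groups. The scores and function names are hypothetical.

```python
from itertools import permutations

def distance(t_score, c_score):
    """Absolute difference in (hypothetical) propensity scores."""
    return abs(t_score - c_score)

def greedy_match(treated, controls):
    """Match each treated unit, in order, to its nearest unused control."""
    used, pairs = set(), []
    for i, t in enumerate(treated):
        j = min((c for c in range(len(controls)) if c not in used),
                key=lambda c: distance(t, controls[c]))
        used.add(j)
        pairs.append((i, j))
    return pairs

def optimal_match(treated, controls, caliper=None):
    """Brute-force pair matching that minimizes the total distance.

    If caliper is given, assignments with any pair farther apart than the
    caliper are rejected. Returns None when no assignment is allowed.
    """
    best, best_cost = None, float("inf")
    for assignment in permutations(range(len(controls)), len(treated)):
        dists = [distance(t, controls[j]) for t, j in zip(treated, assignment)]
        if caliper is not None and max(dists) > caliper:
            continue
        if sum(dists) < best_cost:
            best, best_cost = list(enumerate(assignment)), sum(dists)
    return best

def total_distance(pairs, treated, controls):
    return sum(distance(treated[i], controls[j]) for i, j in pairs)

treated = [0.5, 0.6]
controls = [0.55, 0.2]
greedy_pairs = greedy_match(treated, controls)    # first unit grabs the shared best control
optimal_pairs = optimal_match(treated, controls)  # gives up that control to lower the total
```

Here the greedy total is 0.45 and the optimal total is 0.35, because the first treated unit's nearest control is needed more by the second treated unit.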