causalDisco — by Anne Helby Petersen, 2 years ago

Tools for Causal Discovery on Observational Data

Various tools for inferring causal models from observational data. The package includes an implementation of the temporal Peter-Clark (TPC) algorithm; see Petersen, Osler and Ekstrøm (2021). It also includes general tools for evaluating differences in adjacency matrices, which can be used to evaluate the performance of causal discovery procedures.
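
As a rough illustration of what "evaluating differences in adjacency matrices" can mean, the sketch below compares an estimated adjacency matrix with a reference one, entry by entry. It is written in plain Python and is not the causalDisco API; the function names `adjacency_confusion` and `precision_recall` are hypothetical, and the choice of precision and recall as summary metrics is an assumption made for the example.

```python
def adjacency_confusion(estimated, truth):
    """Count off-diagonal agreement between two square 0/1 adjacency matrices.

    Hypothetical helper for illustration; entry [i][j] == 1 means an edge i -> j.
    """
    if len(estimated) != len(truth):
        raise ValueError("matrices must have the same dimensions")
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for i, (est_row, true_row) in enumerate(zip(estimated, truth)):
        for j, (e, t) in enumerate(zip(est_row, true_row)):
            if i == j:
                continue  # self-loops are ignored
            if e:
                key = "tp" if t else "fp"
            else:
                key = "fn" if t else "tn"
            counts[key] += 1
    return counts


def precision_recall(counts):
    """Return (precision, recall); a value is None when its denominator is zero."""
    predicted = counts["tp"] + counts["fp"]
    actual = counts["tp"] + counts["fn"]
    precision = counts["tp"] / predicted if predicted else None
    recall = counts["tp"] / actual if actual else None
    return precision, recall


truth = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
estimated = [[0, 1, 1], [0, 0, 0], [0, 0, 0]]
counts = adjacency_confusion(estimated, truth)
print(counts, precision_recall(counts))
```

Here one edge is recovered, one is spurious and one is missed, so both precision and recall come out at 0.5.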

dataReporter — by Claus Thorn Ekstrøm, 2 years ago

Reproducible Data Screening Checks and Report of Possible Errors

Data screening is an important first step of any statistical analysis. 'dataReporter' auto-generates a customizable data report with a thorough summary of the checks and the results that a human can use to identify possible errors. It provides an extendable suite of tests for common potential errors in a dataset. See Petersen AH, Ekstrøm CT (2019). "dataMaid: Your Assistant for Documenting Supervised Data Quality Screening in R." Journal of Statistical Software, 90(6), 1-38 for more information.

RANN — by Gregory Jefferis, 5 years ago

Fast Nearest Neighbour Search (Wraps ANN Library) Using L2 Metric

Finds the k nearest neighbours for every point in a given dataset in O(N log N) time using Arya and Mount's ANN library (v1.1.3). There is support for approximate as well as exact searches, fixed radius searches and 'bd' as well as 'kd' trees. The distance is computed using the L2 (Euclidean) metric. Please see package 'RANN.L1' for the same functionality using the L1 (Manhattan, taxicab) metric.
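
To make the task concrete, here is a minimal brute-force sketch of an exact k-nearest-neighbour search under the L2 (Euclidean) metric. It is plain Python, not the RANN interface, and it runs in O(N^2 log N) time rather than using the kd- or bd-trees of the ANN library; the function name `k_nearest_neighbours` is hypothetical, and excluding each point from its own neighbour list is a choice made for this example.

```python
import math


def k_nearest_neighbours(points, k):
    """For every point, return the indices of its k nearest other points.

    Brute-force illustration using the Euclidean (L2) distance; ties are
    broken by the lower index.
    """
    neighbours = []
    for i, p in enumerate(points):
        # Distance from p to every other point, paired with that point's index.
        distances = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours.append([j for _, j in distances[:k]])
    return neighbours


points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
print(k_nearest_neighbours(points, k=1))
```

Swapping `math.dist` for a sum of absolute coordinate differences would give the L1 (Manhattan) variant that the description attributes to 'RANN.L1'.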

dataMaid — by Claus Thorn Ekstrøm, 3 years ago

A Suite of Checks for Identification of Potential Errors in a Data Frame as Part of the Data Screening Process

Data screening is an important first step of any statistical analysis. dataMaid auto-generates a customizable data report with a thorough summary of the checks and the results that a human can use to identify possible errors. It provides an extendable suite of tests for common potential errors in a dataset.

PCADSC — by Anne H. Petersen, 7 years ago

Tools for Principal Component Analysis-Based Data Structure Comparisons

A suite of non-parametric, visual tools for assessing differences in data structures for two datasets that contain different observations of the same variables. These tools are all based on Principal Component Analysis (PCA) and thus effectively address differences in the structures of the covariance matrices of the two datasets. The PCADSC tools consist of easy-to-use, intuitive plots that each focus on different aspects of the PCA decompositions. The cumulative eigenvalue (CE) plot describes differences in the variance components (eigenvalues) of the deconstructed covariance matrices. The angle plot presents the information loss when moving from the PCA decomposition of one dataset to the PCA decomposition of the other. The chroma plot describes the loading patterns of the two datasets, thereby presenting the relative weighting and importance of the variables from the original dataset.

Petersen — by Carl Schwarz, 5 months ago

Estimators for Two-Sample Capture-Recapture Studies

A comprehensive implementation of Petersen-type estimators and their many variants for two-sample capture-recapture studies. A conditional likelihood approach is used that allows for tag loss; non-reporting of tags; reward tags; categorical, geographical and temporal stratification; partial stratification; reverse capture-recapture; and continuous variables in modeling the probability of capture. Many examples from fisheries management are presented.
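
For orientation, the classical closed-form Lincoln-Petersen estimator for a two-sample study can be sketched as follows. This is plain Python, not the package's conditional likelihood implementation and not its API; the function names and the argument names `n1` (animals marked in the first sample), `n2` (animals caught in the second sample) and `m2` (marked animals among the second sample) are assumptions chosen for the example. Chapman's bias-adjusted variant is included for comparison.

```python
def lincoln_petersen(n1, n2, m2):
    """Classical Lincoln-Petersen abundance estimate: N = n1 * n2 / m2."""
    if m2 <= 0:
        raise ValueError("at least one recapture is needed (m2 > 0)")
    return n1 * n2 / m2


def chapman(n1, n2, m2):
    """Chapman's variant: N = (n1 + 1)(n2 + 1) / (m2 + 1) - 1.

    Remains defined when there are no recaptures (m2 == 0).
    """
    return (n1 + 1) * (n2 + 1) / (m2 + 1) - 1


# 100 fish tagged, 80 caught later, 20 of those carrying a tag.
print(lincoln_petersen(100, 80, 20))
print(round(chapman(100, 80, 20), 2))
```

The intuition is that the tagged fraction in the second sample, m2 / n2, estimates the tagged fraction in the whole population, n1 / N.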

geeasy — by Søren Højsgaard, 2 years ago

Solve Generalized Estimating Equations for Clustered Data

Estimation of generalized linear models with correlated/clustered observations by use of generalized estimating equations (GEE). See, e.g., Halekoh and Højsgaard (2005) for details. Several types of clustering are supported, including exchangeable variance structures, AR1 structures, M-dependent, user-specified variance structures and more. The model fitting computations are performed using modified code from the 'geeM' package, while the interface and output objects have been written to resemble the 'geepack' package. The package also contains additional tools for working with and inspecting results from the 'geepack' package, e.g. a 'confint' method for 'geeglm' objects from 'geepack'.

scalpel — by Ashley Petersen, 3 years ago

Processes Calcium Imaging Data

Identifies the locations of neurons, and estimates their calcium concentrations over time using the SCALPEL method proposed in Petersen, Ashley; Simon, Noah; Witten, Daniela. SCALPEL: Extracting neurons from calcium imaging data. Ann. Appl. Stat. 12 (2018), no. 4, 2430-2456. <https://projecteuclid.org/euclid.aoas/1542078051>.


ArchaeoPhases — by Anne Philippe, 2 years ago

Post-Processing of the Markov Chain Simulated by 'ChronoModel', 'Oxcal' or 'BCal'

Provides a list of functions for the statistical analysis of archaeological dates and groups of dates. It is based on the post-processing of the Markov chains whose stationary distribution is the posterior distribution of a series of dates. Such output can be simulated by different applications, such as 'ChronoModel' (see <https://chronomodel.com/>), 'Oxcal' (see <https://c14.arch.ox.ac.uk/oxcal.html>) or 'BCal' (see <https://bcal.shef.ac.uk/>). The only requirement is to have a csv file containing a sample from the posterior distribution. Note that this package interacts with data available through the 'ArchaeoPhases.dataset' package, which is available in a separate repository. The size of the 'ArchaeoPhases.dataset' package is approximately 4 MB.

ade4 — by Aurélie Siberchicot, a year ago

Analysis of Ecological Data: Exploratory and Euclidean Methods in Environmental Sciences

Tools for multivariate data analysis. Several methods are provided for the analysis (i.e., ordination) of one-table (e.g., principal component analysis, correspondence analysis), two-table (e.g., coinertia analysis, redundancy analysis), three-table (e.g., RLQ analysis) and K-table (e.g., STATIS, multiple coinertia analysis). The philosophy of the package is described in Dray and Dufour (2007).