Tools for Causal Discovery on Observational Data
Various tools for inferring causal models from observational data. The package
includes an implementation of the temporal Peter-Clark (TPC) algorithm. Petersen, Osler
and Ekstrøm (2021)
Sample Selection Models
Two-step and maximum likelihood estimation of Heckman-type sample selection models: standard sample selection models (Tobit-2), endogenous switching regression models (Tobit-5), sample selection models with binary dependent outcome variable, interval regression with sample selection (only ML estimation), and endogenous treatment effects models. These methods are described in the three vignettes that are included in this package and in econometric textbooks such as Greene (2011, Econometric Analysis, 7th edition, Pearson).
Reproducible Data Screening Checks and Report of Possible Errors
Data screening is an important first step of any statistical
analysis. 'dataReporter' automatically generates a customizable data report with a thorough
summary of the checks and the results that a human can use to identify possible
errors. It provides an extendable suite of tests for common potential
errors in a dataset. See Petersen AH, Ekstrøm CT (2019). "dataMaid: Your Assistant for Documenting Supervised Data Quality Screening in R." Journal of Statistical Software, 90(6), 1-38
A Suite of Checks for Identification of Potential Errors in a Data Frame as Part of the Data Screening Process
Data screening is an important first step of any statistical analysis. dataMaid automatically generates a customizable data report with a thorough summary of the checks and the results that a human can use to identify possible errors. It provides an extendable suite of tests for common potential errors in a dataset.
Fast Nearest Neighbour Search (Wraps ANN Library) Using L2 Metric
Finds the k nearest neighbours for every point in a given dataset in O(N log N) time using Arya and Mount's ANN library (v1.1.3). There is support for approximate as well as exact searches, fixed radius searches and 'bd' as well as 'kd' trees. The distance is computed using the L2 (Euclidean) metric. Please see package 'RANN.L1' for the same functionality using the L1 (Manhattan, taxicab) metric.
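To make concrete what an exact k-nearest-neighbour query under the L2 metric returns, here is a minimal brute-force sketch in Python. It is not the ANN library's kd-tree or bd-tree search and not this package's interface: the function name `knn_l2` is hypothetical, the sketch runs in O(N^2) time rather than O(N log N), and excluding each point from its own neighbour list is a choice made for this illustration only.

```python
import math

def knn_l2(points, k):
    """For each point, return the indices of its k nearest other points.

    Distances use the L2 (Euclidean) metric. Brute force: every pair of
    points is compared, so this is only suitable for small inputs.
    """
    neighbours = []
    for i, p in enumerate(points):
        # Distance from point i to every other point, paired with its index.
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()
        neighbours.append([j for _, j in dists[:k]])
    return neighbours

points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
nearest = knn_l2(points, 1)
```

A tree-based search such as the one the description mentions avoids comparing every pair by pruning regions of space that cannot contain a closer neighbour.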
Tools for Principal Component Analysis-Based Data Structure Comparisons
A suite of non-parametric, visual tools for assessing differences in data structures for two datasets that contain different observations of the same variables. These tools are all based on Principal Component Analysis (PCA) and thus effectively address differences in the structures of the covariance matrices of the two datasets. The PCADSC tools consist of easy-to-use, intuitive plots that each focus on different aspects of the PCA decompositions. The cumulative eigenvalue (CE) plot describes differences in the variance components (eigenvalues) of the deconstructed covariance matrices. The angle plot presents the information loss when moving from the PCA decomposition of one dataset to the PCA decomposition of the other. The chroma plot describes the loading patterns of the two datasets, thereby presenting the relative weighting and importance of the variables from the original dataset.
Analysis of Ecological Data: Exploratory and Euclidean Methods in Environmental Sciences
Tools for multivariate data analysis. Several methods are provided for the analysis (i.e., ordination) of one-table (e.g., principal component analysis, correspondence analysis), two-table (e.g., coinertia analysis, redundancy analysis), three-table (e.g., RLQ analysis) and K-table (e.g., STATIS, multiple coinertia analysis). The philosophy of the package is described in Dray and Dufour (2007)
A Collection of R Functions by the Petersen Lab
A collection of R functions that are widely used by the Petersen
Lab. Included are functions for various purposes, including evaluating the
accuracy of judgments and predictions, performing scoring of assessments,
generating correlation matrices, conversion of data between various types,
data management, psychometric evaluation, extensions related to latent
variable modeling, various plotting capabilities, and other miscellaneous
useful functions. By making the package available, we hope to make our
methods reproducible and replicable by others and to help others perform
their data processing and analysis methods more easily and efficiently. The
codebase is in
Estimators for Two-Sample Capture-Recapture Studies
A comprehensive implementation of Petersen-type estimators and their many variants for two-sample capture-recapture studies. A conditional likelihood approach is used that allows for tag loss; non-reporting of tags; reward tags; categorical, geographical and temporal stratification; partial stratification; reverse capture-recapture; and continuous variables in modeling the probability of capture. Many examples from fisheries management are presented.
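For orientation, the simplest Petersen-type estimate can be written in closed form. The Python sketch below shows the classical Lincoln-Petersen estimator and Chapman's bias-adjusted variant; it is not the conditional likelihood approach described above and not this package's interface. The function and argument names are hypothetical: `n1` is the number of animals marked in the first sample, `n2` the number captured in the second sample, and `m2` the number of marked animals among the second sample.

```python
def petersen_estimate(n1, n2, m2):
    """Lincoln-Petersen abundance estimate: N = n1 * n2 / m2."""
    if m2 <= 0:
        raise ValueError("at least one recapture is needed (m2 > 0)")
    return n1 * n2 / m2

def chapman_estimate(n1, n2, m2):
    """Chapman's variant: N = (n1 + 1)(n2 + 1) / (m2 + 1) - 1.

    Defined even when m2 == 0 and less biased in small samples.
    """
    return (n1 + 1) * (n2 + 1) / (m2 + 1) - 1

# Illustrative numbers only: 100 marked, 80 in the second sample, 20 recaptures.
n_hat = petersen_estimate(100, 80, 20)
n_hat_chapman = chapman_estimate(100, 80, 20)
```

The intuition is that the marked fraction in the second sample, m2 / n2, estimates the marked fraction in the whole population, n1 / N.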
Subdistribution Analysis of Competing Risks
Estimation, testing and regression modeling of
subdistribution functions in competing risks, as described in Gray
(1988), A class of K-sample tests for comparing the cumulative
incidence of a competing risk, Ann. Stat. 16:1141-1154
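As a rough illustration of what a subdistribution (cumulative incidence) function is, the following Python sketch computes the standard nonparametric estimate for one cause: at each event time, the probability of being event-free just before that time is multiplied by the fraction of subjects at risk who fail from the cause of interest. It is a sketch under stated assumptions, not this package's implementation, and it does not cover the K-sample test or regression modeling. The function name and the coding of `status` (0 for censored, positive integers for causes) are assumptions made for the example.

```python
def cumulative_incidence(times, status, cause):
    """Nonparametric cumulative incidence estimate for one cause.

    times  : observed times
    status : 0 = censored, positive integer = cause of failure
    cause  : the cause whose cumulative incidence is wanted
    Returns a list of (time, estimate) pairs at each distinct event time.
    """
    data = sorted(zip(times, status))
    n_at_risk = len(data)
    surv = 1.0  # probability of no event of any cause before the current time
    cif = 0.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        j = i
        d_any = 0    # events of any cause at time t
        d_cause = 0  # events of the cause of interest at time t
        while j < len(data) and data[j][0] == t:
            if data[j][1] != 0:
                d_any += 1
                if data[j][1] == cause:
                    d_cause += 1
            j += 1
        if d_any > 0:
            cif += surv * d_cause / n_at_risk
            surv *= 1 - d_any / n_at_risk
            steps.append((t, cif))
        n_at_risk -= j - i  # everyone observed at t leaves the risk set
        i = j
    return steps

curve = cumulative_incidence([1, 2, 3, 4], [1, 2, 0, 1], cause=1)
```

Unlike one minus a Kaplan-Meier curve that treats competing events as censoring, this estimate accounts for subjects who can no longer fail from the cause of interest because they already failed from another cause.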