Found 103 packages in 0.04 seconds

ctbi — by Francois Ritter, 2 years ago

A Procedure to Clean, Decompose and Aggregate Timeseries

Clean, decompose and aggregate univariate time series following the procedure "Cyclic/trend decomposition using bin interpolation" and the Logbox method for flagging outliers, both detailed in Ritter, F.: Technical note: A procedure to clean, decompose, and aggregate time series, Hydrol. Earth Syst. Sci., 27, 349–361, 2023.

statisfactory — by Adam B. Smith, 2 years ago

Statistical and Geometrical Tools

A collection of statistical and geometrical tools including the aligned rank transform (ART; Higgins et al. 1990; Peterson 2002; Wobbrock et al. 2011), 2-D histograms and histograms with overlapping bins, and a function for making all possible formulae within a set of constraints, amongst others.

CollapseLevels — by Krishanu Mukherjee, 5 years ago

Collapses Levels, Computes Information Value and WoE

Contains functions to help in selecting and exploring features (or variables) in binary classification problems. Provides functions to compute and display information value and weight of evidence (WoE) of the variables, and to convert numeric variables to categorical variables by binning. Functions are also provided to determine which levels (or categories) of a categorical variable can be collapsed (or combined) based on their response rates. The functions provided only work for binary classification problems.
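The weight of evidence and information value mentioned above can be sketched as follows. This is a minimal illustration in Python, not the CollapseLevels API: the function name `woe_iv` is hypothetical, the sign convention (events over non-events) is an assumption since conventions differ between sources, and every level is assumed to contain at least one event and one non-event.

```python
import math
from collections import defaultdict

def woe_iv(levels, responses):
    """Return (WoE per level, total information value) for a binary response.

    Hypothetical helper for illustration only. Assumes responses are 0/1
    and that every level has at least one 0 and one 1, so no logarithm
    of zero or division by zero occurs.
    """
    counts = defaultdict(lambda: [0, 0])  # level -> [non-events, events]
    for level, y in zip(levels, responses):
        counts[level][y] += 1

    total_non = sum(c[0] for c in counts.values())
    total_evt = sum(c[1] for c in counts.values())

    woe = {}
    iv = 0.0
    for level, (non, evt) in counts.items():
        p_evt = evt / total_evt   # share of all events in this level
        p_non = non / total_non   # share of all non-events in this level
        woe[level] = math.log(p_evt / p_non)
        iv += (p_evt - p_non) * woe[level]
    return woe, iv

levels = ["a"] * 4 + ["b"] * 4
responses = [1, 1, 1, 0, 1, 0, 0, 0]
woe, iv = woe_iv(levels, responses)
```

Levels with similar WoE values have similar response rates, which is one common reason to consider collapsing them into a single category.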

socialh — by Julia de Paula Soares Valente, 3 years ago

Rank and Social Hierarchy for Gregarious Animals

Tools developed to facilitate the establishment of the rank and social hierarchy for gregarious animals by the Si method developed by Kondo & Hurnik (1990). It is also possible to determine the number of agonistic interactions between two individuals and to build sociometric and dyadic matrices from datasets obtained through electronic bins. In addition, the results can be plotted using a bar plot, box plot, and sociogram.

bunching — by Panos Mavrokonstantis, 3 years ago

Estimate Bunching

Implementation of the bunching estimator for kinks and notches. Allows for flexible estimation of the counterfactual (e.g. controlling for round number bunching, accounting for other bunching masses within the bunching window, fixing the bunching point to be the minimum, maximum or median value in its bin, etc.). It produces publication-ready plots in the style followed since Chetty et al. (2011), with many options for customizing the plots.

lacunr — by Elliott Smeds, a year ago

Fast 3D Lacunarity for Voxel Data

Calculates 3D lacunarity from voxel data. It is designed for use with point clouds generated from Light Detection And Ranging (LiDAR) scans in order to measure the spatial heterogeneity of 3-dimensional structures such as forest stands. It provides fast 'C++' functions to efficiently bin point cloud data into voxels and calculate lacunarity using different variants of the gliding-box algorithm originated by Allain & Cloitre (1991).
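As a rough illustration of the gliding-box idea, the sketch below computes lacunarity for one box size as the second moment of the box masses divided by the squared first moment. It is written in plain Python for readability and is not the lacunr implementation; the function name `gliding_box_lacunarity` and the nested-list voxel layout are assumptions made for this example, and the grid is assumed to contain at least one occupied voxel.

```python
from itertools import product

def gliding_box_lacunarity(voxels, r):
    """Lacunarity of a 3D binary grid for a cubic box of side r.

    Hypothetical helper: `voxels` is a nested list indexed [x][y][z]
    holding 0 (empty) or 1 (occupied). The box glides over every
    position where it fits entirely inside the grid.
    """
    nx, ny, nz = len(voxels), len(voxels[0]), len(voxels[0][0])
    masses = []
    for i, j, k in product(range(nx - r + 1), range(ny - r + 1), range(nz - r + 1)):
        # Box mass: number of occupied voxels inside the box at (i, j, k).
        mass = sum(
            voxels[i + a][j + b][k + c]
            for a, b, c in product(range(r), repeat=3)
        )
        masses.append(mass)

    mean = sum(masses) / len(masses)
    mean_sq = sum(m * m for m in masses) / len(masses)
    return mean_sq / mean ** 2  # undefined when the grid is entirely empty

# A 2x2x2 grid with a single occupied voxel.
grid = [[[0, 0], [0, 0]], [[0, 0], [0, 0]]]
grid[0][0][0] = 1
sparse_lacunarity = gliding_box_lacunarity(grid, 1)
```

A uniformly filled grid gives a value of 1 at every box size, while clumped or sparse occupancy gives larger values. A production implementation would typically avoid recomputing each box sum from scratch.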

ptools — by Andrew Wheeler, 2 years ago

Tools for Poisson Data

Functions used for analyzing count data, mostly crime counts. Includes checking the difference between two Poisson counts (e-test), checking the fit for a Poisson distribution, small sample tests for counts in bins, and the Weighted Displacement Difference test (Wheeler and Ratcliffe, 2018) to evaluate crime changes over time in treated/control areas. Additionally includes functions for aggregating spatial data and spatial feature engineering.

GenWin — by Timothy M. Beissinger, 3 years ago

Spline Based Window Boundaries for Genomic Analyses

Defines window or bin boundaries for the analysis of genomic data. Boundaries are based on the inflection points of a cubic smoothing spline fitted to the raw data. Along with defining boundaries, a technique to evaluate results obtained from unequally-sized windows is provided. Applications are particularly pertinent for, though not limited to, genome scans for selection based on variability between populations (e.g. using Wright's fixation index, Fst, which measures variability in subpopulations relative to the total population).

esvis — by Daniel Anderson, 5 years ago

Visualization and Estimation of Effect Sizes

A variety of methods are provided to estimate and visualize distributional differences in terms of effect sizes. Particular emphasis is upon evaluating differences between two or more distributions across the entire scale, rather than at a single point (e.g., differences in means). For example, Probability-Probability (PP) plots display the difference between two or more distributions, matched by their empirical CDFs (see Ho and Reardon, 2012), allowing for examinations of where on the scale distributional differences are largest or smallest. The area under the PP curve (AUC) is an effect-size metric, corresponding to the probability that a randomly selected observation from the x-axis distribution will have a higher value than a randomly selected observation from the y-axis distribution. Binned effect size plots are also available, in which the distributions are split into bins (set by the user) and separate effect sizes (Cohen's d) are produced for each bin, again providing a means to evaluate the consistency (or lack thereof) of the difference between two or more distributions at different points on the scale. Evaluation of empirical CDFs is also provided, with built-in arguments for providing annotations to help evaluate distributional differences at specific points (e.g., semi-transparent shading). All functions take a consistent argument structure. Calculation of specific effect sizes is also possible. The following effect sizes are estimable: (a) Cohen's d, (b) Hedges' g, (c) percentage above a cut, (d) transformed (normalized) percentage above a cut, (e) area under the PP curve, and (f) the V statistic (see Ho, 2009), which essentially transforms the area under the curve to standard deviation units. By default, effect sizes are calculated for all possible pairwise comparisons, but a reference group (distribution) can be specified.
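Two of the effect sizes listed above, Cohen's d and Hedges' g, can be sketched in a few lines. This is a minimal Python illustration using the standard pooled-standard-deviation form and the usual small-sample correction factor; the function names are hypothetical and this is not the esvis interface, which may differ in detail.

```python
import statistics

def cohen_d(x, y):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(x), len(y)
    pooled_var = (
        (n1 - 1) * statistics.variance(x) + (n2 - 1) * statistics.variance(y)
    ) / (n1 + n2 - 2)
    return (statistics.mean(x) - statistics.mean(y)) / pooled_var ** 0.5

def hedges_g(x, y):
    """Cohen's d multiplied by the approximate small-sample bias correction."""
    n = len(x) + len(y)
    return cohen_d(x, y) * (1 - 3 / (4 * n - 9))

group_a = [2, 4, 6]
group_b = [1, 3, 5]
d = cohen_d(group_a, group_b)
g = hedges_g(group_a, group_b)
```

The correction factor shrinks d toward zero and matters mainly for small samples; with large groups the two values are nearly identical.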

PWFSLSmoke — by Jonathan Callahan, 4 years ago

Utilities for Working with Air Quality Monitoring Data

Utilities for working with air quality monitoring data with a focus on small particulates (PM2.5) generated by wildfire smoke. Functions are provided for downloading available data from the United States 'EPA' <https://www.epa.gov/outdoor-air-quality-data> and its 'AirNow' air quality site <https://www.airnow.gov>. Additional sources of PM2.5 data made accessible by the package include: 'AIRSIS' (aka "Oceaneering", not public) and 'WRCC' <https://wrcc.dri.edu/cgi-bin/smoke.pl>. Data compilations are hosted by the USFS 'AirFire' research team <https://www.airfire.org>.