Extensions of 'DTwrappers'
Provides methods for data analysis and cleaning that can be applied flexibly across multiple variables and within groups. These include cleaning accidental text, contingent calculations, counting missing data, and building summaries of the data.
Testbench for Univariate Time Series Cleaning
A reliable and efficient tool for cleaning univariate time series data. It implements procedures that automate the cleaning process and integrates with existing, already deployed tools for missing value imputation and outlier detection. It also provides a way of visualizing large time series at different resolutions.
Data Validation Infrastructure
Declare data validation rules and data quality indicators; confront data with them and analyze or visualize the results. The package supports rules that are per-field, in-record, cross-record or cross-dataset. Rules can be automatically analyzed for rule type and connectivity. Supports checks implied by an SDMX DSD file as well. See also Van der Loo and De Jonge (2018)
Curated Datasets and Tools for Epidemiological Data Analysis
Curated datasets and intuitive data management functions to streamline epidemiological data workflows. It is designed to support researchers in quickly accessing clean, structured data and applying essential cleaning, summarizing, visualization, and export operations with minimal effort. Whether you're preparing a cohort for analysis or creating reports, 'DIVINE' makes the process more efficient, transparent, and reproducible.
County-Level Estimates of Fertilizer Application in USA
Compiled and cleaned county-level estimates of fertilizer (nitrogen and phosphorus) application in the United States of America (USA) from 1945 to 2012. The commercial fertilizer data were originally generated by USGS based on commercial fertilizer sales data. The manure data were estimated based on county-level population data of livestock, poultry, and other animals. See the user manual for detailed data sources and cleaning methods. 'usfertilizer' uses the tidyverse to clean the original data and provide a user-friendly data frame. Please note that USGS does not endorse this package. Data from 1986 are not currently available.
A Tidy Solution for Epidemiological Data
Offers a tidy solution for epidemiological data. It houses a range of functions for epidemiologists and public health data wizards for data management and cleaning.
Block Assignment Files
Download and read US Census Bureau data relationship files. Provides support for cleaning and using block assignment files since 2010, as described in <https://www.census.gov/geographies/reference-files/time-series/geo/block-assignment-files.html>. Also includes support for working with block equivalency files, used for years outside of decennial census years.
Cases of COVID-19 in France
Imports and cleans 'opencovid19-fr' <https://github.com/opencovid19-fr/data> data on COVID-19 in France.
Dams in the United States from the National Inventory of Dams (NID)
The single largest source of data on dams in the United States is the National Inventory of Dams (NID) <http://nid.usace.army.mil> from the US Army Corps of Engineers. The entire NID dataset cannot be obtained all at once: the NID website limits extraction to a couple of thousand records at a time. Moreover, data selected through the NID user interface cannot be saved to a file. To make analysis of this data easier, all the data from the NID was extracted manually. The raw data was then checked for potential errors and cleaned. This package provides sample cleaned data from the NID and functionality to access the entire cleaned NID data.
Flexibly Reshape Data: A Reboot of the Reshape Package
Flexibly restructure and aggregate data using just two functions: 'melt' and 'dcast' (or 'acast').