Found 10000 packages in 0.11 seconds
Text Analysis for All
An R 'shiny' app designed for diverse text analysis tasks, offering a wide range of methodologies tailored to Natural Language Processing (NLP) needs. It is a versatile, general-purpose tool for analyzing textual data. 'tall' features a comprehensive workflow, including data cleaning, preprocessing, statistical analysis, and visualization, all integrated for effective text analysis.
'Geocoordinate Validation Service'
The 'Geocoordinate Validation Service' (GVS) runs checks of coordinates in latitude/longitude format. It returns annotated coordinates with additional flags and metadata that can be used in data cleaning. Additionally, the package has functions related to attribution and metadata information. More information can be found at < https://github.com/ojalaquellueva/gvs/tree/master/api>.
Sites, Population, and Records Cleaning Skills
Data cleaning including 1) generating datasets for time-series and case-crossover analyses based on raw hospital records, 2) linking individuals to an areal map, 3) picking out cases living within a buffer of certain size surrounding a site, etc. For more information, please refer to Zhang W,etc. (2018)
GIS Integration
Designed to facilitate the preprocessing and linking of GIS (Geographic Information System) databases < https://www.sciencedirect.com/topics/computer-science/gis-database>, the R package 'GISINTEGRATION' offers a robust solution for efficiently preparing GIS data for advanced spatial analyses. This package excels in simplifying intrica procedures like data cleaning, normalization, and format conversion, ensuring that the data are optimally primed for precise and thorough analysis.
Tidy Messy Data
Tools to help to create tidy data, where each column is a variable, each row is an observation, and each cell contains a single value. 'tidyr' contains tools for changing the shape (pivoting) and hierarchy (nesting and 'unnesting') of a dataset, turning deeply nested lists into rectangular data frames ('rectangling'), and extracting values out of string columns. It also includes tools for working with missing values (both implicit and explicit).
Simple Data Frames
Provides a 'tbl_df' class (the 'tibble') with stricter checking and better formatting than the traditional data frame.
Tools for Accessing the Botanical Information and Ecology Network Database
Provides Tools for Accessing the Botanical Information and Ecology Network Database. The BIEN database contains cleaned and standardized botanical data including occurrence, trait, plot and taxonomic data (See < https://bien.nceas.ucsb.edu/bien/> for more Information). This package provides functions that query the BIEN database by constructing and executing optimized SQL queries.
Economics and Pricing Tools
Functions to aid in micro and macro economic analysis and handling of price and currency data. Includes extraction of relevant inflation and exchange rate data from World Bank API, data cleaning/parsing, and standardisation. Inflation adjustment calculations as found in Principles of Macroeconomics by Gregory Mankiw et al (2014). Current and historical end of day exchange rates for 171 currencies from the European Central Bank Statistical Data Warehouse (2020) < https://sdw.ecb.europa.eu/curConverter.do>.
Automatic Database Normalisation for Data Frames
Automatic normalisation of a data frame to third normal form, with the intention of easing the process of data cleaning. (Usage to design your actual database for you is not advised.) Originally inspired by the 'AutoNormalize' library for 'Python' by 'Alteryx' (< https://github.com/alteryx/autonormalize>), with various changes and improvements. Automatic discovery of functional or approximate dependencies, normalisation based on those, and plotting of the resulting "database" via 'Graphviz', with options to exclude some attributes at discovery time, or remove discovered dependencies at normalisation time.
Draw Stratified Samples from the VADIR Database
Affords researchers the ability to draw stratified samples from the U.S. Department of Veteran's Affairs/Department of Defense Identity Repository (VADIR) database according to a variety of population characteristics. The VADIR database contains information for all veterans who were separated from the military after 1980. The central utility of the present package is to integrate data cleaning and formatting for the VADIR database with the stratification methods described by Mahto (2019) < https://CRAN.R-project.org/package=splitstackshape>. Data from VADIR are not provided as part of this package.