METACRAN search results

Text Analysis for All

An R 'shiny' app designed for diverse text analysis tasks, offering a wide range of methodologies tailored to Natural Language Processing (NLP) needs. It is a versatile, general-purpose tool for analyzing textual data. 'tall' features a comprehensive workflow, including data cleaning, preprocessing, statistical analysis, and visualization, all integrated for effective text analysis.

https://github.com/massimoaria/tall, https://www.k-synth.com/tall/

GVS — by Brian Maitner, 6 months ago

'Geocoordinate Validation Service'

The 'Geocoordinate Validation Service' (GVS) runs checks of coordinates in latitude/longitude format. It returns annotated coordinates with additional flags and metadata that can be used in data cleaning. Additionally, the package has functions related to attribution and metadata information. More information can be found at <https://github.com/ojalaquellueva/gvs/tree/master/api>.

rSPARCS — by Wangjian Zhang, 2 years ago

Sites, Population, and Records Cleaning Skills

Data cleaning including 1) generating datasets for time-series and case-crossover analyses based on raw hospital records, 2) linking individuals to an areal map, 3) picking out cases living within a buffer of a certain size surrounding a site, etc. For more information, please refer to Zhang W, et al. (2018).

GISINTEGRATION — by Leila Marvian Mashhad, a year ago

GIS Integration

Designed to facilitate the preprocessing and linking of GIS (Geographic Information System) databases <https://www.sciencedirect.com/topics/computer-science/gis-database>, the R package 'GISINTEGRATION' offers a robust solution for efficiently preparing GIS data for advanced spatial analyses. This package simplifies intricate procedures like data cleaning, normalization, and format conversion, ensuring that the data are ready for precise and thorough analysis.

tidyr — by Hadley Wickham, a year ago

Tidy Messy Data

Tools to help to create tidy data, where each column is a variable, each row is an observation, and each cell contains a single value. 'tidyr' contains tools for changing the shape (pivoting) and hierarchy (nesting and 'unnesting') of a dataset, turning deeply nested lists into rectangular data frames ('rectangling'), and extracting values out of string columns. It also includes tools for working with missing values (both implicit and explicit).

https://tidyr.tidyverse.org, https://github.com/tidyverse/tidyr
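
The reshaping ("pivoting") mentioned in the description above can be illustrated with a minimal sketch. The `wide` table and its column names are hypothetical, invented here for illustration; `pivot_longer()` is the 'tidyr' function for moving values that are spread across column names into rows.

```r
library(tidyr)

# Hypothetical wide table: the "year" variable is hidden in the column names.
wide <- data.frame(
  country = c("A", "B"),
  `2019`  = c(10, 20),
  `2020`  = c(11, 21),
  check.names = FALSE
)

# Pivot every column except `country` so that each row is one
# country-year observation and each column is one variable.
long <- pivot_longer(
  wide,
  cols      = -country,
  names_to  = "year",
  values_to = "value"
)

print(long)
```

After the pivot, `long` has one row per country and year, with the columns `country`, `year`, and `value`.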

tibble — by Kirill Müller, 10 days ago

Simple Data Frames

Provides a 'tbl_df' class (the 'tibble') with stricter checking and better formatting than the traditional data frame.

https://tibble.tidyverse.org/, https://github.com/tidyverse/tibble
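
As a rough illustration of the "stricter checking" named in the description above, the sketch below compares a base data frame with a tibble built from it. The data are hypothetical, and the three behaviours shown (no partial matching with `$`, no dropping to a vector on single-column subsetting, and recycling only of length-one vectors) are a selection rather than a complete list.

```r
library(tibble)

df <- data.frame(value = 1:3)
tb <- as_tibble(df)

# `$` on a data frame partially matches column names; a tibble does not.
partial_df <- df$val                    # matches `value`
partial_tb <- suppressWarnings(tb$val)  # NULL (tibble warns about the unknown column)

# Single-column subsetting drops to a vector for a data frame,
# but stays a tibble for a tibble.
col_df <- df[, "value"]
col_tb <- tb[, "value"]

# tibble() recycles only length-one vectors, so mismatched lengths are an error.
recycled <- try(tibble(a = 1:4, b = 1:2), silent = TRUE)
```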

BIEN — by Brian Maitner, 5 months ago

Tools for Accessing the Botanical Information and Ecology Network Database

Provides tools for accessing the Botanical Information and Ecology Network (BIEN) database. The BIEN database contains cleaned and standardized botanical data including occurrence, trait, plot and taxonomic data (see <https://bien.nceas.ucsb.edu/bien/> for more information). This package provides functions that query the BIEN database by constructing and executing optimized SQL queries.

priceR — by Steve Condylios, 10 months ago

Economics and Pricing Tools

Functions to aid in micro- and macroeconomic analysis and handling of price and currency data. Includes extraction of relevant inflation and exchange rate data from the World Bank API, data cleaning/parsing, and standardisation. Inflation adjustment calculations follow Principles of Macroeconomics by Gregory Mankiw et al. (2014). Current and historical end-of-day exchange rates for 171 currencies are taken from the European Central Bank Statistical Data Warehouse (2020) <https://sdw.ecb.europa.eu/curConverter.do>.

https://github.com/stevecondylios/priceR

autodb — by Mark Webster, 3 months ago

Automatic Database Normalisation for Data Frames

Automatic normalisation of a data frame to third normal form, with the intention of easing the process of data cleaning. (Usage to design your actual database for you is not advised.) Originally inspired by the 'AutoNormalize' library for 'Python' by 'Alteryx' (<https://github.com/alteryx/autonormalize>), with various changes and improvements. Automatic discovery of functional or approximate dependencies, normalisation based on those, and plotting of the resulting "database" via 'Graphviz', with options to exclude some attributes at discovery time, or remove discovered dependencies at normalisation time.

https://charnelmouse.github.io/autodb/, https://github.com/CharnelMouse/autodb
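
To make the idea of a functional dependency in the description above concrete, here is a minimal base-R sketch that checks whether one set of columns determines another. It does not use the 'autodb' API and is not the package's discovery algorithm; the helper `holds_fd()` and the `orders` data are hypothetical names introduced only for illustration.

```r
# Returns TRUE if the functional dependency lhs -> rhs holds in df,
# i.e. rows that agree on every lhs column also agree on rhs.
holds_fd <- function(df, lhs, rhs) {
  n_lhs  <- nrow(unique(df[, lhs, drop = FALSE]))
  n_both <- nrow(unique(df[, c(lhs, rhs), drop = FALSE]))
  n_lhs == n_both
}

# Hypothetical data: each customer lives in exactly one city,
# but a city can have several customers.
orders <- data.frame(
  order_id = c(1, 2, 3, 4),
  customer = c("ann", "bob", "ann", "cat"),
  city     = c("Oslo", "Rome", "Oslo", "Rome")
)

customer_determines_city <- holds_fd(orders, "customer", "city")
city_determines_customer <- holds_fd(orders, "city", "customer")
```

Here `customer -> city` holds while `city -> customer` does not. Normalisation would use a dependency like the first to move `customer` and `city` into their own table.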

sampleVADIR — by Trevor Swanson, 4 years ago

Draw Stratified Samples from the VADIR Database

Allows researchers to draw stratified samples from the U.S. Department of Veterans Affairs/Department of Defense Identity Repository (VADIR) database according to a variety of population characteristics. The VADIR database contains information for all veterans who were separated from the military after 1980. The central utility of the present package is to integrate data cleaning and formatting for the VADIR database with the stratification methods described by Mahto (2019) <https://CRAN.R-project.org/package=splitstackshape>. Data from VADIR are not provided as part of this package.

https://github.com/tswanson222/sampleVADIR
