

metasnf — by Prashanth S Velayudhan, 6 months ago

Meta Clustering with Similarity Network Fusion

Framework to facilitate patient subtyping with similarity network fusion and meta clustering. The similarity network fusion (SNF) algorithm was introduced by Wang et al. (2014). SNF is a data integration approach that can transform high-dimensional and diverse data types into a single similarity network suitable for clustering with minimal loss of information from each initial data source. The meta clustering approach was introduced by Caruana et al. (2006). Meta clustering involves generating a wide range of cluster solutions by adjusting clustering hyperparameters, then clustering the solutions themselves into a manageable number of qualitatively similar solutions, and finally characterizing representative solutions to find ones that are best for the user's specific context. This package provides a framework to easily transform multi-modal data into a wide range of similarity network fusion-derived cluster solutions as well as to visualize, characterize, and validate those solutions. Core package functionality includes easy customization of distance metrics, clustering algorithms, and SNF hyperparameters to generate diverse clustering solutions; calculation and plotting of associations between features, between patients, and between cluster solutions; and standard cluster validation approaches including resampled measures of cluster stability, standard metrics of cluster quality, and label propagation to evaluate generalizability in unseen data. Associated vignettes guide the user through using the package to identify patient subtypes while adhering to best practices for unsupervised learning.
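
The meta clustering step this description refers to is straightforward to illustrate outside the package. The sketch below uses only base R plus mclust's adjustedRandIndex(); it is not the metasnf API, and the k-means grid merely stands in for the SNF hyperparameter grid the package would sweep.

    # Base R illustration of meta clustering (not the metasnf API); assumes mclust is installed
    set.seed(1)
    X <- scale(iris[, 1:4])                        # stand-in for fused multi-modal data

    # 1. Generate many cluster solutions by varying hyperparameters
    grid <- expand.grid(k = 2:6, nstart = c(1, 25))
    solutions <- apply(grid, 1, function(p)
      kmeans(X, centers = p["k"], nstart = p["nstart"])$cluster)

    # 2. Quantify agreement between every pair of solutions (adjusted Rand index)
    n_sol <- ncol(solutions)
    ari <- matrix(1, n_sol, n_sol)
    for (i in seq_len(n_sol - 1)) for (j in (i + 1):n_sol)
      ari[i, j] <- ari[j, i] <- mclust::adjustedRandIndex(solutions[, i], solutions[, j])

    # 3. Cluster the solutions themselves and keep one representative per meta cluster
    meta <- cutree(hclust(as.dist(1 - ari)), k = 3)
    representatives <- tapply(seq_len(n_sol), meta, `[`, 1)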

autoFRK — by ShengLi Tzeng, 5 years ago

Automatic Fixed Rank Kriging

Automatic fixed rank kriging for (irregularly located) spatial data using a class of basis functions with multi-resolution features and ordered in terms of their resolutions. The model parameters are estimated by maximum likelihood (ML) and the number of basis functions is determined by Akaike's information criterion (AIC). For spatial data with either one realization or independent replicates, the ML estimates and AIC are efficiently computed using their closed-form expressions when no missing value occurs. Details regarding the basis function construction, parameter estimation, and AIC calculation can be found in Tzeng and Huang (2018). For data with missing values, the ML estimates are obtained using the expectation-maximization algorithm. Apart from the number of basis functions, there are no other tuning parameters, making the method fully automatic. Users can also include a stationary structure in the spatial covariance, which utilizes the 'LatticeKrig' package.
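
A minimal usage sketch, assuming the main entry point is autoFRK(Data, loc) and that the fitted object has a predict() method taking a newloc argument (check ?autoFRK for the exact interface); the simulated field below is purely illustrative.

    # Hedged sketch: autoFRK(Data, loc) and predict(fit, newloc = ...) are assumptions
    library(autoFRK)

    set.seed(42)
    loc <- matrix(runif(400), ncol = 2)            # 200 irregularly placed 2-D locations
    z   <- matrix(sin(6 * loc[, 1]) + cos(6 * loc[, 2]) + rnorm(200, sd = 0.1), ncol = 1)

    fit    <- autoFRK(Data = z, loc = loc)         # number of basis functions chosen by AIC
    newloc <- as.matrix(expand.grid(seq(0, 1, 0.05), seq(0, 1, 0.05)))
    pred   <- predict(fit, newloc = newloc)        # kriging prediction at new locations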

GapAnalysis — by Julian Ramirez-Villegas, 4 years ago

Conservation Indicators using Spatial Information

Supports the assessment of the degree of conservation of taxa in conservation systems, both ex situ [in genebanks, botanical gardens, and other repositories] and in situ [in protected natural areas]. Methods are described in Carver et al. [2021], building on Khoury et al. [2020], Khoury et al. [2019], Khoury et al. [2019], Castaneda-Alvarez et al. [2016], and Ramirez-Villegas et al. [2010].

dang — by Dirk Eddelbuettel, 2 years ago

'Dang' Associated New Goodies

A collection of utility functions.

metajam — by Julien Brun, a year ago

Easily Download Data and Metadata from 'DataONE'

A set of tools to foster the development of reproducible analytical workflows by simplifying the download of data and metadata from 'DataONE' (<https://www.dataone.org>) and easily importing this information into R.
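
A typical two-step use, sketched under the assumption that download_d1_data() and read_d1_files() are the two exported helpers and that the returned list has data and summary_metadata components; the object URL is a placeholder, not a real DataONE identifier.

    # Hedged sketch of the download-then-read workflow
    library(metajam)

    data_url  <- "https://cn.dataone.org/cn/v2/resolve/<object-identifier>"  # placeholder
    local_dir <- tempdir()

    pkg_dir <- download_d1_data(data_url, path = local_dir)  # fetch data + metadata files
    obj     <- read_d1_files(pkg_dir)                        # named list of tables
    head(obj$data)                                           # the data table
    obj$summary_metadata                                     # the parsed metadata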

splash — by Roberto Villegas-Diaz, 3 years ago

Simple Process-Led Algorithms for Simulating Habitats

This program calculates bioclimatic indices and fluxes (radiation, evapotranspiration, soil moisture) for use in studies of ecosystem function, species distribution, and vegetation dynamics under changing climate scenarios. Predictions are based on a minimum of required inputs: latitude, precipitation, air temperature, and cloudiness. The underlying model is described in Davis et al. (2017).

TFRE — by Yunan Wu, 2 years ago

A Tuning-Free Robust and Efficient Approach to High-Dimensional Regression

Provides functions to estimate the coefficients in high-dimensional linear regressions via a tuning-free and robust approach. The method was published in Wang, L., Peng, B., Bradic, J., Li, R. and Wu, Y. (2020), "A Tuning-free Robust and Efficient Approach to High-dimensional Regression", Journal of the American Statistical Association, 115:532, 1700-1714 (JASA's discussion paper). See also Wang, L., Peng, B., Bradic, J., Li, R. and Wu, Y. (2020), "Rejoinder to 'A Tuning-free Robust and Efficient Approach to High-dimensional Regression'", Journal of the American Statistical Association, 115, 1726-1729; Peng, B. and Wang, L. (2015), "An Iterative Coordinate Descent Algorithm for High-Dimensional Nonconvex Penalized Quantile Regression", Journal of Computational and Graphical Statistics, 24:3, 676-694; Clémençon, S., Colin, I., and Bellet, A. (2016), "Scaling-up Empirical Risk Minimization: Optimization of Incomplete U-statistics", The Journal of Machine Learning Research, 17(1):2682–2717; Fan, J. and Li, R. (2001), "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties", Journal of the American Statistical Association, 96:456, 1348-1360.
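
A hedged usage sketch: the exported fitting function is assumed to be TFRE(X, y), mirroring the package name, and the structure of the returned object is inspected rather than assumed; consult the package manual before relying on either.

    # Assumed interface: TFRE(X, y); heavy-tailed errors are where the robustness matters
    library(TFRE)

    set.seed(1)
    n <- 100; p <- 400
    X <- matrix(rnorm(n * p), n, p)
    beta <- c(2, -1.5, 1, rep(0, p - 3))
    y <- drop(X %*% beta) + rt(n, df = 2)   # heavy-tailed noise

    fit <- TFRE(X, y)                       # no tuning parameter chosen by hand
    str(fit)                                # inspect the returned estimates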

backtest — by Daniel Gerlanc, 10 years ago

Exploring Portfolio-Based Conjectures About Financial Instruments

The backtest package provides facilities for exploring portfolio-based conjectures about financial instruments (stocks, bonds, swaps, options, et cetera).
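
A minimal sketch, assuming the backtest() constructor and the starmine example data shipped with the package; the in.var/ret.var column names are from memory and may differ, so treat them as placeholders.

    # Hedged sketch: the bundled 'starmine' data and its column names are assumptions
    library(backtest)

    data(starmine)
    bt <- backtest(starmine,
                   in.var    = "smi",        # the signal (conjecture) being tested
                   ret.var   = "ret.0.1.m",  # forward one-month return
                   by.period = FALSE)
    summary(bt)                              # mean returns by signal quantile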

httk — by John Wambaugh, 2 months ago

High-Throughput Toxicokinetics

Pre-made models that can be rapidly tailored to various chemicals and species using chemical-specific in vitro data and physiological information. These tools allow incorporation of chemical toxicokinetics ("TK") and in vitro-in vivo extrapolation ("IVIVE") into bioinformatics, as described by Pearce et al. (2017). Chemical-specific in vitro data characterizing toxicokinetics have been obtained from relatively high-throughput experiments. The chemical-independent ("generic") physiologically-based ("PBTK") and empirical (for example, one compartment) "TK" models included here can be parameterized with in vitro data or in silico predictions which are provided for thousands of chemicals, multiple exposure routes, and various species. High throughput toxicokinetics ("HTTK") is the combination of in vitro data and generic models. We establish the expected accuracy of HTTK for chemicals without in vivo data through statistical evaluation of HTTK predictions for chemicals where in vivo data do exist. The models are systems of ordinary differential equations that are developed in MCSim and solved using compiled (C-based) code for speed. A Monte Carlo sampler is included for simulating human biological variability (Ring et al., 2017) and propagating parameter uncertainty (Wambaugh et al., 2019). Empirically calibrated methods are included for predicting tissue:plasma partition coefficients and volume of distribution (Pearce et al., 2017). These functions and data provide a set of tools for using IVIVE to convert concentrations from high-throughput screening experiments (for example, Tox21, ToxCast) to real-world exposures via reverse dosimetry (also known as "RTK") (Wetmore et al., 2015).
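
The forward (dose to concentration) and reverse (concentration to dose) directions described above can be sketched as follows; solve_pbtk(), calc_mc_css(), and calc_mc_oral_equiv() are the exported functions I believe implement them, and the chemical, dose schedule, and quantile are arbitrary examples.

    # Sketch of the generic PBTK + IVIVE workflow (argument values are arbitrary examples)
    library(httk)

    # Forward dosimetry: simulate the generic PBTK model for one chemical
    out <- solve_pbtk(chem.name = "bisphenol a", days = 10, doses.per.day = 1)
    head(out)

    # Population variability: Monte Carlo 95th-percentile steady-state plasma concentration
    css95 <- calc_mc_css(chem.name = "bisphenol a", which.quantile = 0.95)

    # Reverse dosimetry: convert an in vitro active concentration (uM) into an
    # administered equivalent dose (mg/kg/day)
    aed <- calc_mc_oral_equiv(conc = 1, chem.name = "bisphenol a")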

rcorpora — by Gábor Csárdi, a year ago

A Collection of Small Text Corpora of Interesting Data

A collection of small text corpora of interesting data. It contains all data sets from 'dariusk/corpora'. Some examples: names of animals: birds, dinosaurs, dogs; foods: beer categories, pizza toppings; geography: English towns, rivers, oceans; humans: authors, US presidents, occupations; science: elements, planets; words: adjectives, verbs, proverbs, US president quotes.
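
Access is through a single corpora() function, as far as I recall; the category paths below follow the 'dariusk/corpora' layout and are assumed to be present in the bundled snapshot.

    # Hedged sketch: corpora() with no arguments lists categories; with a path it returns the data
    library(rcorpora)

    head(corpora())                               # list the available corpora
    dogs <- corpora("animals/dogs")               # a list mirroring the underlying JSON file
    head(dogs$dogs)
    corpora("foods/pizzaToppings")$pizzaToppings  # another example category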