CareDensity — by Robin Denz, 6 months ago

Calculate the Care Density or Fragmented Care Density Given a Patient-Sharing Network

Given a patient-sharing network, calculate either the classic care density as proposed by Pollack et al. (2013) or the fragmented care density as proposed by Engels et al. (2024). By using the 'igraph' and 'data.table' packages, the provided functions scale well for very large graphs.
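As a rough illustration of the classic care density idea, the sketch below is written in plain Python rather than with the package's own functions. It assumes that the weight between two providers is the number of patients they share (including the patient being scored), and that a patient's care density is the sum of those weights over all pairs of the patient's providers, divided by the number of pairs. The function name `care_density` and the `(patient, provider)` input layout are hypothetical and are not taken from CareDensity.

```python
from collections import defaultdict
from itertools import combinations


def care_density(visits):
    """Compute a per-patient care density from (patient, provider) pairs.

    Assumption: the weight of a provider pair is the number of patients
    the two providers share; the density is the mean weight over all
    pairs of providers seen by the patient.
    """
    providers_of = defaultdict(set)  # patient -> providers seen
    patients_of = defaultdict(set)   # provider -> patients treated
    for patient, provider in visits:
        providers_of[patient].add(provider)
        patients_of[provider].add(patient)

    result = {}
    for patient, providers in providers_of.items():
        pairs = list(combinations(sorted(providers), 2))
        if not pairs:
            # Undefined for patients with fewer than two providers.
            result[patient] = None
            continue
        total = sum(len(patients_of[a] & patients_of[b]) for a, b in pairs)
        result[patient] = total / len(pairs)
    return result


visits = [
    ("p1", "A"), ("p1", "B"),
    ("p2", "A"), ("p2", "B"), ("p2", "C"),
    ("p3", "C"),
]
densities = care_density(visits)
```

Here patient `p1` sees two providers who share two patients, while `p3` sees a single provider and therefore has no defined value in this sketch.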

Lock5Data — by Robin Lock, 4 years ago

Datasets for "Statistics: UnLocking the Power of Data"

Datasets for the third edition of "Statistics: Unlocking the Power of Data" by Lock^5. Includes versions of datasets from earlier editions.

read.gb — by Robin Mercier, 4 years ago

Open GenBank Files

Opens complete record(s) with .gb extension from the NCBI/GenBank Nucleotide database and returns a list containing shaped record(s). These kinds of files contain detailed records of DNA samples (locus, organism, type of sequence, source of the sequence...). An example record can be found at <https://www.ncbi.nlm.nih.gov/nuccore/HE799070>.

lori — by Genevieve Robin, 6 months ago

Imputation of High-Dimensional Count Data using Side Information

Analysis, imputation, and multiple imputation of count data using covariates. LORI uses a log-linear Poisson model where main row and column effects, as well as effects of known covariates and interaction terms, can be fitted. The estimation procedure is based on the convex optimization of the Poisson loss penalized by a Lasso-type penalty and a nuclear norm. LORI returns estimates of main effects, covariate effects and interactions, as well as an imputed count table. The package also contains a multiple imputation procedure. The methods are described in Robin, Josse, Moulines and Sardy (2019).

dodgr — by Mark Padgham, 2 months ago

Distances on Directed Graphs

Distances on dual-weighted directed graphs using priority-queue shortest paths (Padgham, 2019). Weighted directed graphs have weights from A to B which may differ from those from B to A. Dual-weighted directed graphs have two sets of such weights. A canonical example is a street network to be used for routing in which routes are calculated by weighting distances according to the type of way and mode of transport, yet lengths of routes must be calculated from direct distances.
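The dual-weight idea can be sketched with a small priority-queue (Dijkstra-style) search in plain Python. This is not the package's implementation or API: the function name `dual_weighted_shortest_path` and the edge layout `(from, to, distance, weighted_distance)` are assumptions made for illustration. The route is chosen by minimizing the weighted distance, while the reported length is accumulated from the direct distances along that route.

```python
import heapq


def dual_weighted_shortest_path(edges, source, target):
    """Route on weighted distance, report the direct distance.

    edges: directed (from, to, distance, weighted_distance) tuples.
    Returns (path, distance), or (None, inf) if target is unreachable.
    """
    graph = {}
    for u, v, dist, weighted in edges:
        graph.setdefault(u, []).append((v, dist, weighted))

    best = {source: 0.0}  # lowest weighted cost found per node
    heap = [(0.0, 0.0, source, [source])]
    while heap:
        w_cost, d_cost, node, path = heapq.heappop(heap)
        if node == target:
            return path, d_cost
        if w_cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, dist, weighted in graph.get(node, []):
            new_w = w_cost + weighted
            if new_w < best.get(nxt, float("inf")):
                best[nxt] = new_w
                heapq.heappush(heap, (new_w, d_cost + dist, nxt, path + [nxt]))
    return None, float("inf")


# Hypothetical street network: A->B is short but heavily penalized
# for the chosen mode of transport, so the route detours via C.
edges = [
    ("A", "B", 1.0, 5.0),
    ("A", "C", 1.0, 1.0),
    ("C", "B", 1.5, 1.5),
    ("B", "A", 1.0, 1.0),  # reverse direction has its own weights
]
path, length = dual_weighted_shortest_path(edges, "A", "B")
```

In this toy network the direct edge from A to B is the shortest in direct distance, but its weighted distance is high, so the search prefers the detour and reports that detour's direct length.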

simDAG — by Robin Denz, a month ago

Simulate Data from a DAG and Associated Node Information

Simulate complex data from a given directed acyclic graph and information about each individual node. Root nodes are simply sampled from the specified distribution. Child nodes are simulated according to one of many implemented regressions, such as logistic regression, linear regression, Poisson regression and more. Also includes a comprehensive framework for discrete-time simulation, which can generate even more complex longitudinal data.
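The root-node and child-node scheme described above can be sketched in a few lines of plain Python. This does not use the package's interface: the function name `simulate_dag`, the variable names, and all coefficients are made up for illustration. Two root nodes are sampled directly from distributions, and one child node is drawn from a logistic regression on its parents.

```python
import math
import random


def simulate_dag(n, seed=0):
    """Simulate n rows from a toy DAG: age -> disease <- smoker."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Root nodes: sampled directly from their distributions.
        age = rng.gauss(50, 10)
        smoker = rng.random() < 0.3

        # Child node: logistic regression on the parent values
        # (intercept and coefficients are arbitrary assumptions).
        logit = -5.0 + 0.06 * age + 1.2 * smoker
        prob = 1 / (1 + math.exp(-logit))
        disease = rng.random() < prob

        rows.append({"age": age, "smoker": smoker, "disease": disease})
    return rows


rows = simulate_dag(200, seed=1)
```

Nodes are generated in topological order, so every parent value exists before its child is drawn; a fixed seed makes the simulated data reproducible.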

VSURF — by Robin Genuer, 2 years ago

Variable Selection Using Random Forests

Three-step variable selection procedure based on random forests. Initially developed to handle high-dimensional data (for which the number of variables largely exceeds the number of observations), the package is very versatile and can treat most dimensions of data, for regression and supervised classification problems. The first step eliminates irrelevant variables from the dataset. The second step aims to select all variables related to the response, for interpretation purposes. The third step refines the selection by eliminating redundancy in the set of variables selected by the second step, for prediction purposes. Genuer, R., Poggi, J.-M. and Tuleau-Malot, C. (2015) <https://journal.r-project.org/archive/2015-2/genuer-poggi-tuleaumalot.pdf>.

hyper2 — by Robin K. S. Hankin, a year ago

The Hyperdirichlet Distribution, Mark 2

A suite of routines for the hyperdirichlet distribution and reified Bradley-Terry; supersedes the 'hyperdirichlet' package; uses 'disordR' discipline. To cite in publications please use Hankin 2017, and for Generalized Plackett-Luce likelihoods use Hankin 2024.

rgrass — by Steven Pawley, 3 months ago

Interface Between 'GRASS' Geographical Information System and 'R'

An interface between the 'GRASS' geographical information system ('GIS') and 'R', based on starting 'R' from within the 'GRASS' 'GIS' environment, or running a free-standing 'R' session in a temporary 'GRASS' location; the package provides facilities for using all 'GRASS' commands from the 'R' command line. The original interface package for 'GRASS 5' (2000-2010) is described in Bivand (2000) and Bivand (2001) <https://www.r-project.org/conferences/DSC-2001/Proceedings/Bivand.pdf>. This was succeeded by 'spgrass6' for 'GRASS 6' (2006-2016) and 'rgrass7' for 'GRASS 7' (2015-present). The 'rgrass' package modernizes the interface for 'GRASS 8' while still permitting the use of 'GRASS 7'.

spatstat.explore — by Adrian Baddeley, 2 months ago

Exploratory Data Analysis for the 'spatstat' Family

Functionality for exploratory data analysis and nonparametric analysis of spatial data, mainly spatial point patterns, in the 'spatstat' family of packages. (Excludes analysis of spatial data on a linear network, which is covered by the separate package 'spatstat.linnet'.) Methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported.