Thematic Maps

Thematic maps are geographical maps in which spatial data distributions are visualized. This package offers a flexible, layer-based, and easy-to-use approach to creating thematic maps, such as choropleths and bubble maps.

Treemap Visualization

A treemap is a space-filling visualization of hierarchical structures. This package offers great flexibility for drawing treemaps.
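To make "space-filling visualization of hierarchical structures" concrete, here is a minimal sketch of the classic slice-and-dice layout, which splits a rectangle among children in proportion to their sizes and alternates the split direction at each level. It is an illustration only and is not taken from the package; the function names, the dictionary layout of the tree, and the example data are all assumptions.

```python
def node_size(node):
    """Size of a leaf, or the summed size of all leaves below an inner node."""
    children = node.get("children")
    if children:
        return sum(node_size(child) for child in children)
    return node["size"]


def slice_and_dice(node, x, y, width, height, depth=0, out=None):
    """Return (name, x, y, width, height) rectangles for every node in the tree."""
    if out is None:
        out = []
    out.append((node["name"], x, y, width, height))
    children = node.get("children", [])
    total = sum(node_size(child) for child in children)
    offset = 0.0  # fraction of the parent rectangle already handed out
    for child in children:
        share = node_size(child) / total
        if depth % 2 == 0:
            # Even depth: place children side by side along the x axis.
            slice_and_dice(child, x + offset * width, y,
                           width * share, height, depth + 1, out)
        else:
            # Odd depth: stack children along the y axis.
            slice_and_dice(child, x, y + offset * height,
                           width, height * share, depth + 1, out)
        offset += share
    return out


# Hypothetical hierarchy: one leaf and one inner node with two leaves.
tree = {
    "name": "root",
    "children": [
        {"name": "a", "size": 3},
        {"name": "b", "children": [
            {"name": "b1", "size": 1},
            {"name": "b2", "size": 1},
        ]},
    ],
}

rects = slice_and_dice(tree, 0.0, 0.0, 10.0, 4.0)
for rect in rects:
    print(rect)
```

Each leaf ends up with an area proportional to its size, and the leaves together fill the root rectangle. Other layout algorithms (for example squarified layouts) trade this simplicity for better aspect ratios.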

Thematic Map Tools

A set of tools for reading and processing spatial data, intended to support the workflow for creating thematic maps. This package also facilitates the use of 'tmap', the package for visualizing thematic maps.

Tableplot, a Visualization of Large Datasets

A tableplot is a visualization of a (large) dataset with a dozen variables, both numeric and categorical. Each column represents a variable and each row bin is an aggregate of a certain number of records. Numeric variables are visualized as bar charts, and categorical variables as stacked bar charts. Missing values are taken into account. Also supports large 'ffdf' datasets from the 'ff' package.
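The row-bin aggregation described above can be sketched in a few lines. This is a simplified illustration, not the package's implementation: the function name, the record layout, and the example data are assumptions, and the sort variable is assumed to have no missing values.

```python
from collections import Counter
from statistics import mean


def tableplot_bins(records, sort_key, n_bins):
    """Sort records by one numeric variable and aggregate them into row bins."""
    ordered = sorted(records, key=lambda r: r[sort_key], reverse=True)
    size = -(-len(ordered) // n_bins)  # ceiling division: records per bin
    bins = []
    for start in range(0, len(ordered), size):
        chunk = ordered[start:start + size]
        summary = {}
        for name in chunk[0]:
            values = [r[name] for r in chunk]
            present = [v for v in values if v is not None]
            if present and all(isinstance(v, (int, float)) for v in present):
                # Numeric variable: bar length is the bin mean; count missing values.
                summary[name] = {"mean": mean(present),
                                 "missing": len(values) - len(present)}
            else:
                # Categorical variable: stacked bar of category fractions,
                # with missing values as their own "NA" category.
                counts = Counter("NA" if v is None else v for v in values)
                summary[name] = {k: c / len(values) for k, c in counts.items()}
        bins.append(summary)
    return bins


# Hypothetical records; None marks a missing value.
records = [
    {"age": 30, "income": 100.0, "sex": "F"},
    {"age": 50, "income": None, "sex": "M"},
    {"age": 40, "income": 300.0, "sex": "F"},
    {"age": 20, "income": 200.0, "sex": None},
]

bins = tableplot_bins(records, "age", 2)
for row_bin in bins:
    print(row_bin)
```

Because every bin is summarized independently, the same aggregation can be run chunk by chunk over data that does not fit in memory.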

Asynchronous Disk-Based Representation of Massive Data

Stores very large data objects on a local drive while still making it possible to manipulate the data efficiently.

Memory-Efficient Storage of Large Data on Disk and Fast Access Functions

The ff package provides data structures that are stored on disk but behave (almost) as if they were in RAM by transparently mapping only a section (pagesize) into main memory, which is the effective virtual memory consumption per ff object. ff supports R's standard atomic data types 'double', 'logical', 'raw' and 'integer' and the non-standard atomic types boolean (1 bit), quad (2 bit unsigned), nibble (4 bit unsigned), byte (1 byte signed with NAs), ubyte (1 byte unsigned), short (2 byte signed with NAs), ushort (2 byte unsigned) and single (4 byte float with NAs). For example, 'quad' allows efficient storage of genomic data as an 'A','T','G','C' factor. The unsigned types support 'circular' arithmetic. There is also support for the close-to-atomic types 'factor', 'ordered', 'POSIXct' and 'Date', and for custom close-to-atomic types. ff has native C support for vectors, matrices and arrays with flexible dimorder (major column-order, major row-order and generalizations for arrays), and it also provides an ffdf class not unlike data.frames and import/export filters for csv files. ff objects store raw data in binary flat files in native encoding, and complement this with metadata stored in R as physical and virtual attributes. ff objects have well-defined hybrid copying semantics, which gives rise to certain performance improvements through virtualization. ff objects can be stored and reopened across R sessions. ff files can be shared by multiple ff R objects (using different data en/de-coding schemes) in the same process or from multiple R processes to exploit parallelism. A wide choice of finalizer options allows working with 'permanent' files as well as creating and removing 'temporary' ff files completely transparently to the user. On certain OS/filesystem combinations, creating the ff files works without notable delay thanks to sparse file allocation.
Several access optimization techniques such as Hybrid Index Preprocessing and Virtualization are implemented to achieve good performance even with large datasets, for example a virtual matrix transpose that does not touch a single byte on disk. Further, to reduce disk I/O, 'logicals' and non-standard data types are stored natively and compactly in binary flat files, i.e. logicals take up exactly 2 bits to represent TRUE, FALSE and NA. Beyond basic access functions, the ff package also provides compatibility functions that facilitate writing code for both ff and RAM objects, and support for batch processing on ff objects (e.g. as.ram, as.ff, ffapply). ff interfaces closely with functionality from package 'bit': chunked looping, fast bit operations and coercions between different objects that can store subscript information ('bit', 'bitwhich', ff 'boolean', ri range index, hi hybrid index). This allows working interactively with selections of large datasets and quickly modifying selection criteria. Further high-performance enhancements can be made available upon request.
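The idea of storing a three-valued logical in exactly 2 bits can be illustrated with a small bit-packing sketch. The code values and function names below are assumptions chosen for the example; the actual on-disk encoding used by ff is not described here and may differ.

```python
# Hypothetical 2-bit codes (an assumption, not ff's documented encoding).
CODES = {False: 0, True: 1, None: 2}   # None stands in for NA
DECODE = {0: False, 1: True, 2: None}


def pack_logicals(values):
    """Pack True/False/None into a bytearray, four values per byte."""
    packed = bytearray((len(values) + 3) // 4)
    for i, value in enumerate(values):
        packed[i // 4] |= CODES[value] << (2 * (i % 4))
    return packed


def get_logical(packed, i):
    """Read the i-th value back without unpacking anything else."""
    return DECODE[(packed[i // 4] >> (2 * (i % 4))) & 0b11]


values = [True, False, None, True, None]
packed = pack_logicals(values)
restored = [get_logical(packed, i) for i in range(len(values))]
print(len(values), "values stored in", len(packed), "bytes")
```

Random access stays cheap because the position of any element is a fixed function of its index: byte `i // 4`, bit offset `2 * (i % 4)`.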

Prediction Model Selection and Performance Evaluation in Multiple Imputed Datasets

Provides functions to apply pooling or backward selection of logistic, Cox regression and multilevel (mixed model) prediction models in multiply imputed datasets. Backward selection can be done from the pooled model using Rubin's Rules (RR), the D1, D2 and D3 methods, and the promising median p-values method. The model can contain continuous, dichotomous and categorical predictors, and interaction terms between all these types of predictors. Continuous predictors can also be introduced as restricted cubic spline coefficients. It is also possible to force (spline) predictors or interaction terms into the model during predictor selection. The package includes a function to evaluate the stability of the models using bootstrapping and cluster bootstrapping. The package further contains functions to generate pooled model performance measures in multiply imputed datasets, such as ROC/AUC, R-squares, Brier score, fit test values and calibration plots for logistic regression models. A function to apply bootstrap internal validation is also available, with two methods for combining bootstrapping and multiple imputation. The first method, boot_MI, draws bootstrap samples and then performs multiple imputation; the second, MI_boot, draws bootstrap samples from each imputed dataset before results are combined. The adjusted intercept after shrinkage of the pooled regression coefficients can be subsequently obtained. Backward selection as part of internal validation is also an option. A function to externally validate logistic prediction models in multiply imputed datasets is also available.
Eekhout (2017)
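As a reminder of what pooling with Rubin's Rules involves for a single regression coefficient, here is a minimal sketch. It uses the standard formulas (pooled estimate as the mean across imputations, total variance as within-imputation variance plus an inflated between-imputation variance); the function name and the numbers are made up for illustration and are not taken from the package.

```python
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed datasets with Rubin's Rules."""
    m = len(estimates)
    pooled = mean(estimates)                       # pooled point estimate
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # variance between imputations
    total = within + (1 + 1 / m) * between         # total variance
    return pooled, total ** 0.5


# Hypothetical coefficient estimates and standard errors from 3 imputed datasets.
pooled_estimate, pooled_se = pool_rubin([0.5, 0.7, 0.6], [0.1, 0.1, 0.1])
print(pooled_estimate, pooled_se)
```

The pooled standard error is larger than any single-dataset standard error whenever the estimates differ between imputations, which is how the uncertainty due to missing data enters the result.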

Routines for Performing Empirical Calibration of Observational Study Estimates

Routines for performing empirical calibration of observational study estimates. By using a set of negative control hypotheses, we can estimate the empirical null distribution of a particular observational study setup. This empirical null distribution can be used to compute a calibrated p-value, which reflects the probability of observing an estimated effect size when the null hypothesis is true, taking both random and systematic error into account. A similar approach can be used to calibrate confidence intervals, using both negative and positive controls.
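The idea of a calibrated p-value can be sketched as follows, under strong simplifying assumptions. The empirical null is modeled as a normal distribution whose mean and standard deviation are taken directly from the negative control estimates, ignoring each control's own standard error; a full treatment would account for that sampling error when fitting the null. All function names and numbers are hypothetical.

```python
import math
from statistics import mean, stdev


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fit_null(negative_control_log_rr):
    """Simplified null fit (assumption): sample mean and SD of the estimates."""
    return mean(negative_control_log_rr), stdev(negative_control_log_rr)


def traditional_p(log_rr, se):
    """Two-sided p-value assuming only random error around a null of zero."""
    lower = normal_cdf(log_rr, 0.0, se)
    return 2.0 * min(lower, 1.0 - lower)


def calibrated_p(log_rr, se, null_mu, null_sigma):
    """Two-sided p-value against the empirical null plus random error."""
    total_sd = math.sqrt(null_sigma ** 2 + se ** 2)
    lower = normal_cdf(log_rr, null_mu, total_sd)
    return 2.0 * min(lower, 1.0 - lower)


# Hypothetical log effect estimates for negative controls (true effect is null),
# showing a positive systematic bias in this study setup.
mu, sigma = fit_null([0.1, 0.3, 0.2, 0.4, 0.0])

p_trad = traditional_p(0.4, 0.1)
p_cal = calibrated_p(0.4, 0.1, mu, sigma)
print(p_trad, p_cal)
```

In this example the traditional p-value is very small, while the calibrated p-value is much larger because an estimate of this size is unremarkable given the bias seen in the negative controls.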

Rendering Parameterized SQL and Translation to Dialects

A rendering tool for parameterized SQL that also translates into different SQL dialects. These dialects include 'Microsoft SQL Server', 'Oracle', 'PostgreSQL', 'Amazon Redshift', 'Apache Impala', 'IBM Netezza', 'Google BigQuery', 'Microsoft PDW', and 'SQLite'.
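The two steps, rendering parameters into a SQL template and then translating the result to a target dialect, can be sketched as below. This is a toy illustration and not the package's API: the function names, the `@name` placeholder convention used here, the single translation rule, and the table and column names are all assumptions for the example.

```python
import re


def render_sql(sql, **params):
    """Replace @name placeholders with supplied values; fail on missing ones."""
    def substitute(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"no value supplied for @{name}")
        return str(params[name])
    return re.sub(r"@(\w+)", substitute, sql)


def translate_sql(sql, dialect):
    """Toy translation: rewrite 'SELECT TOP n ...' as '... LIMIT n'."""
    if dialect in ("postgresql", "sqlite"):
        match = re.match(r"(?i)SELECT\s+TOP\s+(\d+)\s+(.*?);?\s*$", sql, flags=re.S)
        if match:
            return f"SELECT {match.group(2)} LIMIT {match.group(1)};"
    return sql  # dialects without a rule here are passed through unchanged


template = "SELECT TOP @n * FROM @table WHERE year_of_birth > @year;"
rendered = render_sql(template, n=10, table="person", year=1980)
translated = translate_sql(rendered, "postgresql")
print(rendered)
print(translated)
```

Plain string substitution like this is only appropriate for trusted, developer-supplied parameters such as table names; user input should go through the database driver's bound parameters instead.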

Support for Parallel Computation, Logging, and Function Automation

Support for parallel computation with a progress bar and an option to stop or proceed on errors. Also provides logging to console and disk, with logging that persists in the parallel threads. Additional functions support function call automation with delayed execution (e.g. for executing functions in parallel).
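The combination of delayed function calls, parallel execution, shared logging, and a stop-or-proceed choice on errors can be sketched with Python's standard library. This is an analogy only, not the package's interface: the helper names are hypothetical, and a logged progress message stands in for a progress bar.

```python
import logging
from concurrent.futures import ThreadPoolExecutor
from functools import partial

# One logger configuration is shared by all worker threads.
logging.basicConfig(level=logging.INFO,
                    format="%(threadName)s %(levelname)s %(message)s")
log = logging.getLogger("parallel_sketch")


def run_all(calls, workers=2, stop_on_error=True):
    """Run delayed calls in parallel; either stop at or skip over failures."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(call) for call in calls]
        for done, future in enumerate(futures, start=1):
            try:
                results.append(future.result())
            except Exception as exc:
                log.error("task %d failed: %s", done, exc)
                if stop_on_error:
                    for pending in futures:
                        pending.cancel()  # drop tasks that have not started
                    raise
                results.append(None)      # proceed, marking the failed slot
            log.info("%d/%d tasks finished", done, len(futures))
    return results


def divide(a, b):
    return a / b


# Delayed execution: each partial stores a function call to be run later.
calls = [partial(divide, 6, 3), partial(divide, 1, 0), partial(divide, 9, 3)]
results = run_all(calls, stop_on_error=False)
print(results)
```

With `stop_on_error=True` the first failing task cancels the tasks that have not yet started and re-raises the error to the caller, which is the safer default for long pipelines.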