Data Driven Smooth Tests

Smooth tests of goodness of fit. These tests are data driven: the alternative hypothesis is selected dynamically based on the data. The package provides tests for the exponential, Gaussian, Gumbel and uniform distributions.
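As a rough illustration of what "data driven" means here, the sketch below tests the simple null hypothesis of uniformity on [0, 1]. It assumes an orthonormal shifted Legendre basis and a Schwarz-type penalty (k log n) for choosing the number of components; the function names and the `max_dim` parameter are hypothetical, and the package's own implementation may differ in its basis and in how it handles estimated parameters.

```python
import math


def legendre_orthonormal(j, u):
    """Value at u of the j-th orthonormal shifted Legendre polynomial on [0, 1]."""
    if j == 0:
        return 1.0
    x = 2.0 * u - 1.0
    p_prev, p = 1.0, x
    # Three-term recurrence: (n + 1) P_{n+1} = (2n + 1) x P_n - n P_{n-1}
    for n in range(1, j):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return math.sqrt(2 * j + 1) * p


def data_driven_smooth_test(sample, max_dim=10):
    """Return (selected dimension, smooth test statistic) for H0: uniform on [0, 1]."""
    n = len(sample)
    cumulative = []  # cumulative[k - 1] is the smooth statistic with k components
    total = 0.0
    for j in range(1, max_dim + 1):
        score = sum(legendre_orthonormal(j, u) for u in sample) / math.sqrt(n)
        total += score ** 2
        cumulative.append(total)
    # Assumed selection rule: smallest k maximising statistic minus k * log(n)
    penalised = [cumulative[k] - (k + 1) * math.log(n) for k in range(max_dim)]
    best = max(range(max_dim), key=lambda k: penalised[k])
    return best + 1, cumulative[best]


# An evenly spread sample should select one component with a statistic near zero.
grid = [(i + 0.5) / 100 for i in range(100)]
dim_grid, stat_grid = data_driven_smooth_test(grid)

# A sample squeezed into [0.9, 1.0) is far from uniform and gives a large statistic.
squeezed = [0.9 + 0.001 * i for i in range(100)]
dim_squeezed, stat_squeezed = data_driven_smooth_test(squeezed)
```

Under the null hypothesis the selected dimension is usually 1, so the statistic is typically compared with a chi-squared distribution with one degree of freedom or with simulated critical values.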

A Set of Datasets Used in My Classes or in the Book 'Modele Liniowe i Mieszane w R, Wraz z Przykladami w Analizie Danych'

A set of datasets and functions used in the book 'Modele liniowe i mieszane w R, wraz z przykladami w analizie danych'. Datasets either come from real studies or are created to be as similar as possible to real studies.

Mini Games from Adventures of Beta and Bit

Three games: proton, frequon and regression. Each one is a console-based data-crunching game for younger and older data scientists. Act as a data hacker and find Slawomir Pietraszko's credentials to the Proton server. In proton you have to solve four data-based puzzles to find the login and password. There are many ways to solve these puzzles. You may use loops, data filtering, ordering, aggregation or other tools. Only basic knowledge of R is required to play the game, yet the more functions you know, the more approaches you can try. In frequon you will help to perform a statistical cryptanalytic attack on a corpus of ciphered messages. This time seven sub-tasks push the bar much higher. Do you accept the challenge? In regression you will test your modeling skills in a series of eight sub-tasks. Try it only if ANOVA is your close friend. The games are part of the Beta and Bit project. You will find more about the Beta and Bit project at <http://betabit.wiki>.

Datasets and Functions Used in the Book 'Przewodnik po Pakiecie R'

Data sets and functions used in the Polish book "Przewodnik po pakiecie R" (The Hitchhiker's Guide to the R). See more at <http://biecek.pl/R>. Among others, you will find here data about housing prices, cancer patients, running times and many other topics.

Tools for Storing, Restoring and Searching for R Objects

Data exploration and modelling is a process in which many data artifacts are produced: subsets, data aggregates, plots, statistical models, different versions of data sets and different versions of results. The more projects we work on, the more artifacts are produced and the harder it is to manage them. Archivist helps to store and manage artifacts created in R. It allows you to store selected artifacts as binary files together with their metadata and relations. It allows you to share artifacts with others, either through a shared folder or GitHub. It allows you to look for already created artifacts by their class, name, date of creation or other properties, and makes it easy to restore them. It also allows you to check whether a new artifact is an exact copy of one that was produced some time ago, which may be useful for testing or caching.
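The store, search and exact-copy ideas described above can be sketched with a small content-addressed store. This is a language-agnostic illustration in Python, not the archivist API: the `ArtifactStore` class, its method names and the metadata fields are all hypothetical, and using an MD5 hash of the serialized object as the key is an assumption made for the sketch.

```python
import hashlib
import pickle
import tempfile
from datetime import datetime, timezone
from pathlib import Path


class ArtifactStore:
    """Toy store: one binary file per artifact, keyed by a hash of its content."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self.meta = {}  # hash -> metadata (kept in memory only in this sketch)

    @staticmethod
    def _key(obj):
        # Pickle output is not guaranteed to be canonical for every object;
        # it is stable enough for the simple values used in this example.
        return hashlib.md5(pickle.dumps(obj)).hexdigest()

    def save(self, obj, name):
        """Write the artifact to disk and record its metadata; return its hash."""
        key = self._key(obj)
        (self.root / f"{key}.pkl").write_bytes(pickle.dumps(obj))
        self.meta[key] = {
            "name": name,
            "cls": type(obj).__name__,
            "created": datetime.now(timezone.utc),
        }
        return key

    def load(self, key):
        """Restore an artifact from its hash."""
        return pickle.loads((self.root / f"{key}.pkl").read_bytes())

    def search(self, **criteria):
        """Return hashes of artifacts whose metadata match all given fields."""
        return [k for k, m in self.meta.items()
                if all(m.get(field) == value for field, value in criteria.items())]

    def is_known(self, obj):
        """Check whether an identical artifact has been stored before."""
        return self._key(obj) in self.meta


store = ArtifactStore(tempfile.mkdtemp())
key = store.save([1, 2, 3], name="small_vector")
restored = store.load(key)
matches = store.search(cls="list")
```

Keying artifacts by a hash of their content is what makes the exact-copy check cheap: two identical artifacts map to the same key, so a lookup replaces a full comparison.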

PogromcyDanych / DataCrunchers is the Massive Online Open Course that Brings R and Statistics to the People

The data sets used in the online course 'PogromcyDanych'. You can process data in many ways, and the Data Crunchers course will introduce you to this variety. For this reason we work on datasets of different sizes (from several to several hundred thousand rows), with various levels of complexity (from two to two thousand columns) and prepared in different formats (text data, quantitative data and qualitative data). All of these data sets are gathered in a single large package called PogromcyDanych to facilitate access to them. It contains all sorts of data sets, such as data about offer prices of cars, results of opinion polls, changes in stock market indices, names given to newborn babies, ski jumping results and outcomes of breast cancer patient treatment.

Tools for Accessing Various Datasets Developed by the Foundation SmarterPoland.pl

Tools for accessing and processing datasets prepared by the Foundation SmarterPoland.pl. Among others: access to the APIs of Google Maps, the Central Statistical Office of Poland, MojePanstwo, Eurostat, WHO and other sources.

Gaussian Mixture Modeling Algorithms and the Belief-Based Mixture Modeling

Two partially supervised mixture modeling methods are implemented: soft-label and belief-based modeling. For completeness, the package also provides unsupervised, semi-supervised and fully supervised mixture modeling. It can also be used to select the best-fitting model from a set of models with different numbers of components or different constraints on their structure. For a detailed introduction see: Przemyslaw Biecek, Ewa Szczurek, Martin Vingron, Jerzy Tiuryn (2012), The R Package bgmm: Mixture Modeling with Uncertain Knowledge, Journal of Statistical Software

Break Down Plots

Break Down Plots are inspired by waterfall plots created by the 'xgboostExplainer' package (see <https://github.com/AppliedDataSciencePartners/xgboostExplainer>). The idea behind Break Down Plots is to decompose a model prediction for a single observation. Break Down Plots show the contribution of every variable present in the model. Such plots work for binary classifiers and general regression models.
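For a linear model the decomposition can be written down directly, which makes the idea easy to see. The sketch below is an illustration under that assumption only: the function name, the model coefficients and the variable means are hypothetical, and the package's procedure for general models is not reproduced here.

```python
def break_down_linear(intercept, coefs, means, observation):
    """Split a linear prediction into a baseline plus one contribution per variable.

    The baseline is the prediction at the variable means; each contribution is
    the coefficient times the observation's deviation from the mean.
    """
    baseline = intercept + sum(coefs[v] * means[v] for v in coefs)
    contributions = {v: coefs[v] * (observation[v] - means[v]) for v in coefs}
    return baseline, contributions


# Hypothetical model: price = 50 + 2.0 * area - 0.5 * age
coefs = {"area": 2.0, "age": -0.5}
means = {"area": 60.0, "age": 20.0}
observation = {"area": 80.0, "age": 10.0}

baseline, contributions = break_down_linear(50.0, coefs, means, observation)
prediction = baseline + sum(contributions.values())
```

Here the baseline of 160 plus contributions of 40 (area) and 5 (age) adds up to the model prediction of 205; a waterfall plot draws exactly these steps from the baseline to the final value.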

Explaining and Visualizing Random Forests in Terms of Variable Importance

A set of tools to help explain which variables are most important in a random forest. Various variable importance measures are calculated and visualized in different settings in order to get an idea of how their importance changes depending on our criteria (Hemant Ishwaran and Udaya B. Kogalur and Eiran Z. Gorodeski and Andy J. Minn and Michael S. Lauer (2010)