Found 22 packages in 0.02 seconds
A Set of Datasets Used in My Classes or in the Book 'Modele Liniowe i Mieszane w R, Wraz z Przykladami w Analizie Danych'
A set of datasets and functions used in the book 'Modele liniowe i mieszane w R, wraz z przykladami w analizie danych'. Datasets either come from real studies or are created to be as similar as possible to real studies.
Data Driven Smooth Tests
Smooth tests of goodness of fit. These tests are data driven: the alternative hypothesis is selected dynamically based on the data. The package provides tests for the exponential, Gaussian, Gumbel and uniform distributions.
Descriptive mAchine Learning EXplanations
Machine Learning (ML) models are widely used and have various applications in classification or regression. Models created with boosting, bagging, stacking or similar techniques are often used due to their high performance, but such black-box models usually lack interpretability. The DALEX package contains various explainers that help to understand the link between input variables and model output. The single_variable() explainer extracts the conditional response of a model as a function of a single selected variable. It is a wrapper over the packages 'pdp' and 'ALEPlot'. The single_prediction() explainer attributes parts of a model prediction to particular variables used in the model. It is a wrapper over the 'breakDown' package. The variable_dropout() explainer calculates variable importance scores based on variable shuffling. All these explainers can be plotted with the generic plot() function and compared across different models.
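The idea of variable importance based on variable shuffling can be illustrated with a short language-neutral sketch. The Python code below is not the DALEX API and does not reproduce variable_dropout(); the names permutation_importance, mse and toy_model are hypothetical, and the loss (mean squared error) is an assumption chosen for simplicity. It shuffles one column at a time and reports how much the loss grows.

```python
import random


def mse(model, rows, targets):
    """Mean squared error of a model over a list of rows."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, n_features, seed=0):
    """Loss increase after shuffling each feature column in turn.

    Hypothetical helper for illustration; not part of any package.
    """
    rng = random.Random(seed)
    baseline = mse(model, rows, targets)
    scores = []
    for j in range(n_features):
        # Shuffle column j only, which breaks its link with the target.
        column = [r[j] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
        scores.append(mse(model, shuffled, targets) - baseline)
    return scores


def toy_model(row):
    """Toy model that depends on the first feature and ignores the second."""
    return 3.0 * row[0]


data_rng = random.Random(1)
rows = [(data_rng.random(), data_rng.random()) for _ in range(200)]
targets = [toy_model(r) for r in rows]
importance = permutation_importance(toy_model, rows, targets, n_features=2)
print(importance)
```

Shuffling the feature that the toy model ignores leaves the loss unchanged, while shuffling the feature it uses increases the loss.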
Datasets and Functions Used in the Book 'Przewodnik po Pakiecie R'
Data sets and functions used in the Polish book "Przewodnik po pakiecie R" (The Hitchhiker's Guide to the R). See more at <http://biecek.pl/R>. The package includes data about housing prices, cancer patients, running times and many others.
Mini Games from Adventures of Beta and Bit
Three games: proton, frequon and regression. Each one is a console-based data-crunching game for younger and older data scientists. Act as a data-hacker and find Slawomir Pietraszko's credentials to the Proton server. In proton you have to solve four data-based puzzles to find the login and password. There are many ways to solve these puzzles. You may use loops, data filtering, ordering, aggregation or other tools. Only basic knowledge of R is required to play the game, yet the more functions you know, the more approaches you can try. In frequon you will help to perform a statistical cryptanalytic attack on a corpus of ciphered messages. This time seven sub-tasks push the bar much higher. Do you accept the challenge? In regression you will test your modeling skills in a series of eight sub-tasks. Try it only if ANOVA is your close friend. The games are part of the Beta and Bit project. You will find more about the Beta and Bit project at <http://betabit.wiki>.
Tools for Storing, Restoring and Searching for R Objects
Data exploration and modelling produce many data artifacts: subsets, data aggregates, plots, statistical models, different versions of data sets and different versions of results. The more projects we work on, the more artifacts are produced and the harder they are to manage. Archivist helps to store and manage artifacts created in R. It stores selected artifacts as binary files together with their metadata and relations, and lets you share them with others, either through a shared folder or GitHub. You can search for existing artifacts by class, name, creation date or other properties, and restore them easily. Archivist can also check whether a new artifact is an exact copy of one produced some time ago, which may be useful for testing or caching.
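The store, search and exact-copy workflow described above can be sketched in a language-neutral way. The Python code below is not the archivist API; the class ArtifactStore and its methods are hypothetical, artifacts are kept in memory rather than in a repository, and serializing to JSON and hashing with SHA-256 are assumptions made for the sketch.

```python
import hashlib
import json


class ArtifactStore:
    """Hypothetical in-memory store keyed by a hash of the artifact's content."""

    def __init__(self):
        self._artifacts = {}  # key -> (serialized artifact, metadata)

    @staticmethod
    def _serialize(obj):
        # Sorted keys make the serialization, and so the hash, deterministic.
        return json.dumps(obj, sort_keys=True)

    @staticmethod
    def _key(payload):
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def save(self, obj, **metadata):
        """Store an artifact with its metadata and return its content hash."""
        payload = self._serialize(obj)
        key = self._key(payload)
        self._artifacts[key] = (payload, metadata)
        return key

    def load(self, key):
        """Restore an artifact from its content hash."""
        payload, _ = self._artifacts[key]
        return json.loads(payload)

    def search(self, **criteria):
        """Return the hashes of artifacts whose metadata match all criteria."""
        return [
            key
            for key, (_, meta) in self._artifacts.items()
            if all(meta.get(name) == value for name, value in criteria.items())
        ]

    def contains_copy_of(self, obj):
        """Check whether an identical artifact was stored earlier."""
        return self._key(self._serialize(obj)) in self._artifacts


store = ArtifactStore()
summary_key = store.save({"mean": 2.5, "n": 4}, name="summary", kind="aggregate")
print(store.search(kind="aggregate") == [summary_key])
print(store.contains_copy_of({"n": 4, "mean": 2.5}))
```

Using the content hash as the key means that identical artifacts map to the same entry, which is what makes the exact-copy check cheap enough for caching.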
Ceteris Paribus Profiles
Ceteris Paribus Profiles (What-If Plots) present model responses around selected points in a feature space, for example around a single prediction for an interesting observation. The plots are model-agnostic: they work for any predictive Machine Learning model and allow for model comparisons. Ceteris Paribus Plots supplement the Break Down Plots from the 'breakDown' package.
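A ceteris paribus profile can be computed for any model that exposes a prediction function. The Python sketch below is language-neutral and is not the package's API; the function ceteris_paribus_profile, the toy model and the feature names are all hypothetical. It varies one feature over a grid while keeping every other feature of the selected observation fixed.

```python
def ceteris_paribus_profile(model, observation, feature, grid):
    """Model responses when one feature varies and all others stay fixed.

    Hypothetical helper for illustration; works with any callable model.
    """
    profile = []
    for value in grid:
        what_if = dict(observation)  # copy, so the observation is not modified
        what_if[feature] = value
        profile.append((value, model(what_if)))
    return profile


def toy_price_model(flat):
    """Toy model: price grows with area and falls with age."""
    return 1000.0 * flat["area"] - 50.0 * flat["age"]


observation = {"area": 50.0, "age": 10.0}
profile = ceteris_paribus_profile(
    toy_price_model, observation, "area", [40.0, 50.0, 60.0]
)
for value, response in profile:
    print(value, response)
```

Because the function only calls the model, the same profile can be computed for several models and compared point by point.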
Tools for Accessing Various Datasets Developed by the Foundation SmarterPoland.pl
Tools for accessing and processing datasets prepared by the Foundation SmarterPoland.pl. These include access to the APIs of Google Maps, the Central Statistical Office of Poland, MojePanstwo, Eurostat, WHO and other sources.
Gaussian Mixture Modeling Algorithms and the Belief-Based Mixture Modeling
Two partially supervised mixture modeling methods:
soft-label and belief-based modeling are implemented.
For completeness, the package also provides
unsupervised, semi-supervised and fully supervised
mixture modeling. It can also be used to select
the best-fitting model from a set of models with different
component numbers or constraints on their structures.
For detailed introduction see:
Przemyslaw Biecek, Ewa Szczurek, Martin Vingron, Jerzy
Tiuryn (2012), The R Package bgmm: Mixture Modeling with
Uncertain Knowledge, Journal of Statistical Software
PogromcyDanych / DataCrunchers is the Massive Open Online Course that Brings R and Statistics to the People
The data sets used in the online course 'PogromcyDanych'. You can process data in many ways, and the Data Crunchers course introduces you to this variety. For this reason we work on datasets of different sizes (from several to several hundred thousand rows), with various levels of complexity (from two to two thousand columns) and prepared in different formats (text data, quantitative data and qualitative data). All of these data sets were gathered in a single big package called PogromcyDanych to facilitate access to them. It contains all sorts of data sets, such as data about offer prices of cars, results of opinion polls, changes in stock market indices, names given to newborn babies, ski jumping results and treatment outcomes of breast cancer patients.