moDel Agnostic Language for Exploration and eXplanation
An unverified black box model is the path to failure. Opaqueness leads to distrust.
Distrust leads to ignoration. Ignoration leads to rejection.
The DALEX package x-rays any model and helps to explore and explain its behaviour.
Machine Learning (ML) models are widely used and have various applications in classification
or regression. Models created with boosting, bagging, stacking or similar techniques are often
used due to their high performance. But such black-box models usually lack direct interpretability.
The DALEX package contains various methods that help to understand the link between input variables
and model output. The implemented methods help to explore a model at the level of a single instance
as well as at the level of the whole dataset.
All model explainers are model agnostic and can be compared across different models.
The DALEX package is the cornerstone of the 'DrWhy.AI' universe of packages for visual model exploration.
Find more details in (Biecek 2018).
Effects and Importances of Model Ingredients
Collection of tools for assessment of feature importance and feature effects.
Key functions are:
feature_importance() for assessment of global level feature importance,
ceteris_paribus() for calculation of the what-if plots,
partial_dependence() for partial dependence plots,
conditional_dependence() for conditional dependence plots,
accumulated_dependence() for accumulated local effects plots,
aggregate_profiles() and cluster_profiles() for aggregation of ceteris paribus profiles,
generic print() and plot() for better usability of selected explainers,
generic plotD3() for interactive, D3 based explanations, and
generic describe() for explanations in natural language.
The package 'ingredients' is a part of the 'DrWhy.AI' universe (Biecek 2018).
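To make the idea of global feature importance concrete, here is a minimal, language-neutral sketch of permutation-based importance written in Python. It is not the 'ingredients' API: the function names, the toy model and the choice of mean squared error as the loss are all assumptions made for illustration. The idea shown is that a feature matters if shuffling its column makes the model's loss go up.

```python
import random


def mse(model, rows, targets):
    """Mean squared error of the model's predictions on the given rows."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, n_repeats=10, seed=0):
    """Average increase in loss after shuffling each feature column."""
    rng = random.Random(seed)
    baseline = mse(model, rows, targets)
    importance = {}
    for name in rows[0]:
        increases = []
        for _ in range(n_repeats):
            # Shuffle one column, keep all other columns untouched.
            shuffled = [r[name] for r in rows]
            rng.shuffle(shuffled)
            permuted = [dict(r, **{name: v}) for r, v in zip(rows, shuffled)]
            increases.append(mse(model, permuted, targets) - baseline)
        importance[name] = sum(increases) / n_repeats
    return importance


# Hypothetical toy model: uses "x1" and ignores "x2".
def toy_model(row):
    return 3.0 * row["x1"]


rows = [{"x1": float(i), "x2": float(i % 3)} for i in range(20)]
targets = [toy_model(r) for r in rows]
scores = permutation_importance(toy_model, rows, targets)
```

In this toy setup the ignored feature "x2" gets an importance of zero, while "x1" gets a positive score because shuffling it breaks the link between inputs and targets.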
Drawing Survival Curves using 'ggplot2'
Contains the function 'ggsurvplot()' for easily drawing beautiful and 'ready-to-publish' survival curves with the 'number at risk' table and 'censoring count plot'. Other functions are also available to plot adjusted curves for the 'Cox' model and to visually examine 'Cox' model assumptions.
Tools for Storing, Restoring and Searching for R Objects
Data exploration and modelling is a process in which a lot of data artifacts are produced: subsets, data aggregates, plots, statistical models, different versions of data sets and different versions of results. The more projects we work on, the more artifacts are produced and the harder it is to manage them. Archivist helps to store and manage artifacts created in R. It allows you to store selected artifacts as binary files together with their metadata and relations, and to share artifacts with others, either through a shared folder or GitHub. It allows you to look for already created artifacts by their class, name, date of creation or other properties, and makes it easy to restore such artifacts. It also allows you to check whether a new artifact is an exact copy of one that was produced some time ago, which might be useful for either testing or caching.
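The store, restore and exact-copy check described above can be sketched with a content hash as the artifact's identifier. The Python sketch below is an illustration only, not the archivist API: the function names, the file layout and the use of an MD5 digest over a pickled object are assumptions chosen to keep the example short.

```python
import hashlib
import json
import pickle
import tempfile
from pathlib import Path


def save_artifact(repo, obj, **metadata):
    """Store obj as a binary file named by its content hash, plus metadata."""
    payload = pickle.dumps(obj)
    key = hashlib.md5(payload).hexdigest()
    (repo / f"{key}.pkl").write_bytes(payload)
    (repo / f"{key}.json").write_text(json.dumps(metadata))
    return key


def load_artifact(repo, key):
    """Restore an artifact from its hash."""
    return pickle.loads((repo / f"{key}.pkl").read_bytes())


def is_exact_copy(repo, obj):
    """Check whether an identical artifact is already stored."""
    key = hashlib.md5(pickle.dumps(obj)).hexdigest()
    return (repo / f"{key}.pkl").exists()


# Hypothetical usage with a temporary folder acting as the repository.
repo = Path(tempfile.mkdtemp())
coefs = [1.5, -0.25, 3.0]
key = save_artifact(repo, coefs, name="coefs", cls="list")
restored = load_artifact(repo, key)
```

Because the file name is derived from the content, saving the same object twice leads to the same key, which is what makes the exact-copy check usable for caching.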
Model Agnostic Explainers for Individual Predictions
Model agnostic tool for decomposition of predictions from black boxes. Break Down Table shows contributions of every variable to a final prediction. Break Down Plot presents variable contributions in a concise graphical way. This package works for binary classifiers and general regression models.
Model Agnostic Instance Level Variable Attributions
Model agnostic tool for decomposition of predictions from black boxes.
Supports additive attributions and attributions with interactions.
The Break Down Table shows contributions of every variable to a final prediction.
The Break Down Plot presents variable contributions in a concise graphical way.
This package works for classification and regression models.
It is an extension of the 'breakDown' package (Staniak and Biecek 2018).
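One way to read a Break Down Table is as a sequence of changes in the average prediction as variables are fixed, one by one, to the values of the explained observation. The Python sketch below illustrates that reading under stated assumptions: the variable order is given by hand, the model and data are toy examples, and none of the names come from the 'iBreakDown' package.

```python
def mean_prediction(model, rows, fixed):
    """Average prediction with the variables in `fixed` overwritten in every row."""
    return sum(model({**r, **fixed}) for r in rows) / len(rows)


def break_down(model, rows, observation, order):
    """Fix variables one at a time and record each change in the mean prediction."""
    fixed = {}
    previous = mean_prediction(model, rows, fixed)
    table = [("intercept", previous)]
    for name in order:
        fixed[name] = observation[name]
        current = mean_prediction(model, rows, fixed)
        table.append((name, current - previous))
        previous = current
    return table


# Hypothetical additive toy model and a tiny data set.
def toy_model(row):
    return 2.0 * row["age"] + 10.0 * row["smoker"]


rows = [
    {"age": 20.0, "smoker": 0.0},
    {"age": 40.0, "smoker": 1.0},
    {"age": 60.0, "smoker": 0.0},
    {"age": 80.0, "smoker": 1.0},
]
observation = {"age": 30.0, "smoker": 1.0}
table = break_down(toy_model, rows, observation, order=["age", "smoker"])
# table == [("intercept", 105.0), ("age", -40.0), ("smoker", 5.0)]
```

The contributions add up to the model's prediction for the observation (105 - 40 + 5 = 70). For an additive model like this one the order of variables does not change the contributions; when the model contains interactions, the order can matter.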
Gaussian Mixture Modeling Algorithms and the Belief-Based Mixture Modeling
Two partially supervised mixture modeling methods:
soft-label and belief-based modeling are implemented.
For completeness, the package also provides unsupervised,
semi-supervised and fully supervised mixture modeling.
It can also be used to select the best-fitting model
from a set of models with different
component numbers or constraints on their structures.
For a detailed introduction see:
Przemyslaw Biecek, Ewa Szczurek, Martin Vingron, Jerzy
Tiuryn (2012), The R Package bgmm: Mixture Modeling with
Uncertain Knowledge, Journal of Statistical Software.
Mini Games from Adventures of Beta and Bit
Three games: proton, frequon and regression. Each one is a console-based data-crunching game for younger and older data scientists. Act as a data-hacker and find Slawomir Pietraszko's credentials to the Proton server. In proton you have to solve four data-based puzzles to find the login and password. There are many ways to solve these puzzles. You may use loops, data filtering, ordering, aggregation or other tools. Only basic knowledge of R is required to play the game, yet the more functions you know, the more approaches you can try. In frequon you will help to perform a statistical cryptanalytic attack on a corpus of ciphered messages. This time seven sub-tasks push the bar much higher. Do you accept the challenge? In regression you will test your modeling skills in a series of eight sub-tasks. Try it only if ANOVA is your close friend. It is part of the Beta and Bit project. You will find more about the Beta and Bit project at <http://betabit.wiki>.
Ceteris Paribus Profiles
Ceteris Paribus Profiles (What-If Plots) are designed to present model responses around selected points in a feature space, for example around a single prediction for an interesting observation. The plots are designed to work in a model-agnostic fashion: they work for any predictive Machine Learning model and allow for model comparisons. Ceteris Paribus Plots supplement the Break Down Plots from the 'breakDown' package.
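A ceteris paribus profile can be computed by varying one variable over a grid while holding all other variables of the selected observation fixed. The Python sketch below shows this under stated assumptions: the function name, the toy model and the grid are hypothetical and are not taken from the package.

```python
def ceteris_paribus_profile(model, observation, variable, grid):
    """Model responses as one variable moves over a grid, others held fixed."""
    return [(value, model({**observation, variable: value})) for value in grid]


# Hypothetical toy model with a non-linear effect of age.
def toy_model(row):
    return row["age"] ** 2 / 100.0 + 5.0 * row["smoker"]


observation = {"age": 30.0, "smoker": 1.0}
profile = ceteris_paribus_profile(toy_model, observation, "age", [20.0, 30.0, 40.0])
# profile == [(20.0, 9.0), (30.0, 14.0), (40.0, 21.0)]
```

Since the function only needs a callable that maps a row to a prediction, the same profile can be computed for several models and the resulting curves compared.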
The Proton Game
'The Proton Game' is a console-based data-crunching game for younger and older data scientists. Act as a data-hacker and find Slawomir Pietraszko's credentials to the Proton server. You have to solve four data-based puzzles to find the login and password. There are many ways to solve these puzzles. You may use loops, data filtering, ordering, aggregation or other tools. Only basic knowledge of R is required to play the game, yet the more functions you know, the more approaches you can try. Knowledge of dplyr is not required but may be very helpful. This game is linked with the 'Pietraszko's Cave' story available at http://biecek.pl/BetaBit/Warsaw. It is part of the Beta and Bit series. You will find more about the Beta and Bit series at http://biecek.pl/BetaBit.