SmarterPoland — by Przemyslaw Biecek, 3 years ago

Tools for Accessing Various Datasets Developed by the Foundation SmarterPoland.pl

Tools for accessing and processing datasets prepared by the Foundation SmarterPoland.pl. Among others: access to the APIs of Google Maps, the Central Statistical Office of Poland, MojePanstwo, Eurostat, WHO and other sources.

DALEX2 — by Przemyslaw Biecek, 3 months ago

Descriptive mAchine Learning EXplanations

Machine learning models are widely used and have various applications in classification and regression tasks. Due to increasing computational power, the availability of new data sources and new methods, ML models are becoming more and more complex. Models created with techniques like boosting, bagging or neural networks are true black boxes: it is hard to trace the link between input variables and model outcomes. They are used because of their high performance, but lack of interpretability is one of their weakest sides. In many applications we need to know, understand or prove how input variables are used in the model and what impact they have on the final model prediction. DALEX2 is a collection of tools that help to understand how complex predictive models work. DALEX2 is a part of the DrWhy universe of tools for Explanation, Exploration and Visualisation of Predictive Models.

Przewodnik — by Przemyslaw Biecek, 2 years ago

Datasets and Functions Used in the Book 'Przewodnik po Pakiecie R'

Data sets and functions used in the Polish book "Przewodnik po pakiecie R" (The Hitchhiker's Guide to the R). See more at <http://biecek.pl/R>. Among others, you will find here data about housing prices, cancer patients, running times and many more.

factorMerger — by Agnieszka Sitko, a year ago

The Merging Path Plot

The Merging Path Plot is a methodology for adaptive fusing of k-groups with likelihood-based model selection. This package contains tools for exploration and visualization of k-group dissimilarities. Comparison of k-groups is one of the most important issues in exploratory analyses and it has a very wide range of applications. The traditional approach is to use pairwise post hoc tests in order to verify which groups differ significantly. However, with a large number of groups this approach fails in both the interpretation and the visualization layers. The Merging Path Plot solves this problem by using an easy-to-understand description of dissimilarity among groups based on the Likelihood Ratio Test (LRT) statistic. Work on this package was financially supported by the 'NCN Opus grant 2016/21/B/ST6/02176'.
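
The idea of fusing groups by an LRT statistic can be illustrated with a minimal Python sketch. It is not the package's R interface or its actual algorithm: it assumes a Gaussian one-way model with a common variance and a simple greedy strategy that, at each step, merges the pair of groups with the smallest LRT statistic. The names `rss`, `log_likelihood` and `merging_path` are hypothetical.

```python
import math
from itertools import combinations


def rss(groups):
    """Residual sum of squares around each group's own mean."""
    total = 0.0
    for values in groups:
        mean = sum(values) / len(values)
        total += sum((v - mean) ** 2 for v in values)
    return total


def log_likelihood(groups):
    """Maximised Gaussian log-likelihood with a common variance (assumption)."""
    n = sum(len(g) for g in groups)
    return -0.5 * n * (math.log(2 * math.pi * rss(groups) / n) + 1)


def merging_path(groups):
    """Greedily merge groups; return a list of (left, right, lrt_statistic)."""
    current = {(label,): list(vals) for label, vals in groups.items()}
    path = []
    while len(current) > 1:
        ll_now = log_likelihood(list(current.values()))
        best = None
        for a, b in combinations(current, 2):
            # Candidate partition: everything unchanged except a and b fused.
            rest = [v for k, v in current.items() if k not in (a, b)]
            merged = rest + [current[a] + current[b]]
            stat = 2 * (ll_now - log_likelihood(merged))
            if best is None or stat < best[0]:
                best = (stat, a, b)
        stat, a, b = best
        current[a + b] = current.pop(a) + current.pop(b)
        path.append((a, b, stat))
    return path
```

Calling `merging_path({"A": [...], "B": [...], "C": [...]})` returns the order in which groups were fused together with the statistic paid for each merge; groups with similar means are fused first and at a low cost.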

ddst — by Przemyslaw Biecek, 3 years ago

Data Driven Smooth Tests

Smooth testing of goodness of fit. These tests are data driven (the alternative hypothesis is dynamically selected based on the data). In this package you will find various tests for the exponential, Gaussian, Gumbel and uniform distributions.
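
What "data driven" means can be sketched in Python for the uniform case. This is only an illustration under stated assumptions, not the package's R code: it uses normalised Legendre polynomials on [0, 1] as the smooth components and a Schwarz-type penalty (`k * log(n)`) to pick the number of components. The function names and the `max_order` default are hypothetical, and no p-value is computed.

```python
import math


def legendre_basis(x, max_order):
    """Normalised Legendre polynomials b_1..b_max_order evaluated at x in [0, 1]."""
    u = 2.0 * x - 1.0
    p = [1.0, u]
    for j in range(1, max_order):
        # Standard three-term recurrence for Legendre polynomials.
        p.append(((2 * j + 1) * u * p[j] - j * p[j - 1]) / (j + 1))
    return [math.sqrt(2 * j + 1) * p[j] for j in range(1, max_order + 1)]


def data_driven_uniform_test(sample, max_order=5):
    """Return (selected_order, statistic) for a test of uniformity on [0, 1]."""
    n = len(sample)
    sums = [0.0] * max_order
    for x in sample:
        for j, b in enumerate(legendre_basis(x, max_order)):
            sums[j] += b
    components = [(s / math.sqrt(n)) ** 2 for s in sums]

    best_k, best_score, best_stat = None, None, None
    running = 0.0
    for k, c in enumerate(components, start=1):
        running += c  # smooth statistic with the first k components
        score = running - k * math.log(n)  # penalised score (assumed rule)
        if best_score is None or score > best_score:
            best_k, best_score, best_stat = k, score, running
    return best_k, best_stat
```

A sample that is close to uniform yields a small statistic and the smallest order, while a clearly skewed sample yields a large statistic.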

live — by Mateusz Staniak, 25 days ago

Local Interpretable (Model-Agnostic) Visual Explanations

Interpretability of complex machine learning models is a growing concern. This package helps to understand the key factors that drive the decision made by a complicated predictive model (a so-called black box model). This is achieved through local approximations that are based either on an additive regression-like model or on a CART-like model that allows for higher-order interactions. The methodology is based on Tulio Ribeiro, Singh, Guestrin (2016). More details can be found in Staniak, Biecek (2018).
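
A local approximation of this kind can be sketched in a few lines of Python. This is a simplified illustration, not the package's R implementation: it assumes Gaussian perturbations around the explained instance and an ordinary least squares surrogate solved through the normal equations. The names `fit_linear` and `explain_locally` and the parameters `n_samples`, `scale` and `seed` are hypothetical.

```python
import random


def fit_linear(X, y):
    """Least squares with intercept via normal equations (small problems only)."""
    rows = [[1.0] + list(x) for x in X]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        pv = A[col][col]
        A[col] = [v / pv for v in A[col]]
        for r in range(p):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][p] for i in range(p)]


def explain_locally(black_box, instance, n_samples=500, scale=0.1, seed=0):
    """Fit a linear surrogate to black-box predictions near `instance`.

    Returns [intercept, slope_1, ..., slope_d]; the slopes describe how the
    prediction reacts to each variable in the neighbourhood of the instance.
    """
    rng = random.Random(seed)
    neighbours = [
        [v + rng.gauss(0.0, scale) for v in instance] for _ in range(n_samples)
    ]
    responses = [black_box(x) for x in neighbours]
    return fit_linear(neighbours, responses)
```

For a black box such as `lambda x: x[0] ** 2 + 3 * x[1]` explained at `[1.0, 2.0]`, the fitted slopes should be close to the local derivatives, roughly 2 and 3.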

auditor — by Alicja Gosiewska, 6 months ago

Model Audit - Verification, Validation, and Error Analysis

Provides an easy-to-use, unified interface for creating validation plots for any model. The 'auditor' helps to avoid the repetitive work of writing code needed to create residual plots. These visualizations make it possible to assess and compare the goodness of fit, performance, and similarity of models.

archivist.github — by Marcin Kosinski, 8 months ago

Tools for Archiving, Managing and Sharing R Objects via GitHub

An extension of the 'archivist' package that integrates the archivist with GitHub via the GitHub API and the 'git2r' and 'httr' packages.

coxphSGD — by Marcin Kosinski, 2 years ago

Stochastic Gradient Descent log-Likelihood Estimation in Cox Proportional Hazards Model

Estimates the coefficients of the Cox proportional hazards model using a stochastic gradient descent algorithm for batch data.
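
The general technique can be sketched in Python as follows. This is an illustrative sketch rather than the package's R code: it assumes no tied event times, treats each batch as its own risk-set universe, and uses a fixed learning rate. The names `batch_gradient` and `cox_sgd` and the defaults for `learning_rate` and `epochs` are hypothetical.

```python
import math


def batch_gradient(beta, batch):
    """Gradient of the Cox partial log-likelihood computed on one batch.

    `batch` is a list of (time, event, covariates) tuples, where `event`
    is True for an observed event and False for a censored observation.
    """
    grad = [0.0] * len(beta)
    for t_i, event_i, x_i in batch:
        if not event_i:
            continue  # censored rows contribute only through risk sets
        # Risk set: subjects in the batch still under observation at t_i.
        risk = [
            (x, math.exp(sum(b * v for b, v in zip(beta, x))))
            for t, _, x in batch
            if t >= t_i
        ]
        denom = sum(w for _, w in risk)
        for k in range(len(beta)):
            weighted_mean = sum(w * x[k] for x, w in risk) / denom
            grad[k] += x_i[k] - weighted_mean
    return grad


def cox_sgd(batches, n_features, learning_rate=0.05, epochs=50):
    """Maximise the partial log-likelihood by batch-wise gradient steps."""
    beta = [0.0] * n_features
    for _ in range(epochs):
        for batch in batches:
            grad = batch_gradient(beta, batch)
            # Ascent step, scaled by batch size to keep steps comparable.
            beta = [
                b + learning_rate * g / len(batch) for b, g in zip(beta, grad)
            ]
    return beta
```

Each batch updates the coefficients using only its own observations, so the full data set never has to be held in one risk-set computation; the price is a noisier path towards the estimate.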

xspliner — by Krystian Igras, 3 months ago

Assisted Model Building, using Surrogate Black-Box Models to Train Interpretable Spline Based Additive Models

Builds a generalized linear model with automatic data transformation. The 'xspliner' helps to build simple, interpretable models that inherit information provided by more complicated ones. The resulting model may be treated as an explanation of the black box model that was supplied to the algorithm beforehand.