Provides graphics and other functions that evaluate and display models across many different kinds of model architecture. For instance, you can evaluate the effect size of a model input in the same way, regardless of architecture, interaction terms, etc.
Detailed examples of the use of the `statisticalModeling` package are contained in the package vignettes. This document is directed to instructors, to explain the motivation behind the package.
This package reflects my evolving thinking about how to teach statistics and the importance of integrating modeling into how students think about statistics. Many of the basic ideas have been expressed in my book Statistical Modeling: A Fresh Approach (2/e, 2011):
This package is about (3).
Teaching about statistical modeling often starts with linear regression. I think there is an advantage to introducing other modeling techniques at the same time or even before linear regression. Why?
R provides an infrastructure to support teaching about linear regression. This includes, of course, the `lm()` function, but also supporting functions for inference and graphics, e.g.
- `summary()`, when applied to an `lm` object, produces the traditional regression table and other information such as $R^2$.
- `abline()` makes it easy to plot a (single-variable) regression line over data. Functions in the `ggplot2` package make it easy to extend this to functions of several variables.
- `confint()` produces confidence intervals on coefficients.
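The base R workflow described above can be tried on a built-in data set. This is a minimal sketch; the choice of `mtcars` and of the variables `mpg` and `wt` is only for illustration and is not taken from the package documentation.

```r
# Fit a single-variable linear regression on a built-in data set.
mod <- lm(mpg ~ wt, data = mtcars)

# Traditional regression table, plus R^2.
reg_table <- summary(mod)
print(coef(reg_table))
r_squared <- reg_table$r.squared

# Plot the data and overlay the fitted regression line.
plot(mpg ~ wt, data = mtcars)
abline(mod)

# 95% confidence intervals on the coefficients.
ci <- confint(mod)
print(ci)
```

Each step uses a different function with its own conventions, and all of them are specific to `lm`-style models.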
The `mosaic` package has added support for bootstrapping, randomization tests, and the like. It also extends base functions such as `mean()` to accept the modeling formula interface, providing a straightforward and consistent template that covers a wide variety of statistical techniques.
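As a rough illustration of that template, the sketch below uses the formula interface to `mean()` and the `do()` / `resample()` idiom from `mosaic`. It assumes `mosaic` is installed; the data set and the number of bootstrap trials are arbitrary choices made for the example.

```r
# Runs only if the mosaic package is available.
if (requireNamespace("mosaic", quietly = TRUE)) {
  suppressPackageStartupMessages(library(mosaic))

  # Formula interface: mean of mpg within each level of cyl.
  group_means <- mean(mpg ~ cyl, data = mtcars)
  print(group_means)

  # Bootstrapping: refit the model on 200 resampled data sets.
  boot_fits <- do(200) * lm(mpg ~ wt, data = resample(mtcars))

  # Percentile interval for the slope on wt.
  print(quantile(boot_fits$wt, c(0.025, 0.975)))
}
```

The same `response ~ explanatory, data = ...` shape appears in both the summary statistic and the model fit.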
The `statisticalModeling` package provides an alternative interface that generalizes to many different statistical modeling types, both regression and classification. It includes:
- `evaluate_model()` produces model outputs that correspond to inputs. It simplifies quickly examining multivariate models, since it will choose sensible values for any inputs that have not been given specific values. It also generalizes across model architectures in ways that the `predict()` family of methods does not.
- `effect_size()` examines how a change in a model input is related to a change in model output. It is, in effect, a generalization of regression coefficients.
- `cv_pred_error()` makes it simple to apply cross-validation to compare models.
- `ensemble()` provides simple support for bootstrapping effect sizes.
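The functions listed above might be used along the following lines. This is a sketch rather than an excerpt from the vignettes: the models and data are illustrative, and the argument forms (a one-sided formula for `effect_size()`, several models passed to `cv_pred_error()`, and `nreps` for `ensemble()`) are written from recollection of the package help pages, so check them against `?effect_size`, `?cv_pred_error`, and `?ensemble`.

```r
# Runs only if the statisticalModeling package is available.
if (requireNamespace("statisticalModeling", quietly = TRUE)) {
  library(statisticalModeling)

  # Two candidate models of fuel economy (illustrative choice of data).
  small_mod <- lm(mpg ~ wt, data = mtcars)
  large_mod <- lm(mpg ~ wt + hp, data = mtcars)

  # Model outputs at input values chosen automatically by the package.
  outputs <- evaluate_model(large_mod)
  print(outputs)

  # Change in model output per unit change in wt,
  # with the other input held at a value the package selects.
  wt_effect <- effect_size(large_mod, ~ wt)
  print(wt_effect)

  # Cross-validated prediction error for the two models.
  cv_results <- cv_pred_error(small_mod, large_mod)
  print(cv_results)

  # Bootstrap replications of the larger model
  # (the nreps argument name is an assumption).
  boot_mods <- ensemble(large_mod, nreps = 10)
}
```

The calls would look the same if `lm()` were swapped for another supported model-fitting function, which is the point of the common interface.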
In terms of graphics
`fmodel()` is the extension to
The `fmodel()` function makes it straightforward to visualize models with multiple variables: variation with up to four explanatory variables can be shown, with any variables beyond four held constant. It works for many different regression model architectures as well as classification models.
`gf_density()`, and so on, bring the formula interface to `ggplot()`. This captures and extends the excellent simplicity of the `lattice`-graphics formula interface, while providing the intuitive "add this component" capabilities of
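A model plot of the kind `fmodel()` produces could be drawn roughly as follows. This is a sketch under assumptions: the model and data are illustrative, and the one-sided formula that names the variables to display is written from recollection of the `fmodel()` help page, so confirm it with `?fmodel`.

```r
# Runs only if the statisticalModeling package is available.
if (requireNamespace("statisticalModeling", quietly = TRUE)) {
  library(statisticalModeling)

  # A regression model with two explanatory variables.
  mod <- lm(mpg ~ wt + hp, data = mtcars)

  # Show model output against wt, with hp as the second displayed variable.
  model_plot <- fmodel(mod, ~ wt + hp)
  print(model_plot)
}
```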
Installations from CRAN are done in the usual way. The development version of the package is here on GitHub. To install it, use the following commands in your R system.
```r
# Install devtools if necessary
install.packages("devtools")
# Install statisticalModeling
devtools::install_github("dtkaplan/statisticalModeling")
```
See the `NEWS.md` file to track changes to the package.