Use a consistent syntax to create data structures of common statistical techniques that can be continued in a pipe chain. Design the analysis, add settings and variables, construct the results, and polish the final structure. Rinse and repeat for any number of statistical techniques.
Using a standard interface, create common result data structures, such as those from a linear regression or correlation. Design the analysis, add settings and variables, construct the results, and finally scrub and polish them.
One of the main goals of mason is to make it easy to implement other analyses in this infrastructure. Since, I’d argue, most statistical methods follow a similar pattern (what are the variables, what options to use for the method, what to select from the results), this can be easily encapsulated into a ‘blueprint -> construction -> scrubbing and polishing’ workflow.

mason was designed to be best used with the %>% pipe, though it doesn’t need to be. It was also designed to follow the tidy data philosophy, specifically that everything should result in a data frame, within limits. This makes it easier to do further analysis, visualization, and inclusion into report formats. This flow was deliberately chosen so it works well with ggplot2 and many other excellent packages out there that help make analyses easier.
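The pattern described above can be sketched in base R without mason itself. The helper names below (make_blueprint, construct_lm, scrub_results) are hypothetical and are not part of mason's API; they only illustrate how a specification step, a fitting step, and a tidying step can each hand a plain object to the next, ending in a data frame.

```r
# Hypothetical helpers for illustration only; these are NOT mason functions.

# Blueprint: record the data and the variables to use, without fitting anything yet.
make_blueprint <- function(data, yvar, xvar) {
    list(data = data, yvar = yvar, xvar = xvar)
}

# Construction: build the formula from the blueprint and fit the model.
construct_lm <- function(blueprint) {
    model_formula <- stats::reformulate(blueprint$xvar, response = blueprint$yvar)
    blueprint$fit <- stats::lm(model_formula, data = blueprint$data)
    blueprint
}

# Scrubbing: keep only the coefficient table, returned as a data frame.
scrub_results <- function(blueprint) {
    coefs <- summary(blueprint$fit)$coefficients
    data.frame(
        Yterms = blueprint$yvar,
        Xterms = blueprint$xvar,
        term = rownames(coefs),
        estimate = coefs[, "Estimate"],
        std.error = coefs[, "Std. Error"],
        p.value = coefs[, "Pr(>|t|)"],
        row.names = NULL,
        stringsAsFactors = FALSE
    )
}

results <- scrub_results(construct_lm(make_blueprint(iris, "Sepal.Length", "Petal.Length")))
print(results)
```

Because the last step returns an ordinary data frame, the result can be passed straight to plotting or reporting code.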
The package can be installed from CRAN using:

    install.packages("mason")

For the development version, install using:

    # install.packages("remotes")
    remotes::install_github('lwjohnst86/mason')
The typical usage for this package would flow like this:
    library(mason)
    design(iris, 'glm') %>%
        add_settings() %>%
        add_variables('yvars', c('Sepal.Length', 'Sepal.Width')) %>%
        add_variables('xvars', c('Petal.Length', 'Petal.Width')) %>%
        construct() %>%
        scrub() %>%
        polish_adjust_pvalue()
    #> # A tibble: 8 x 11
    #>   Yterms   Xterms   term   estimate std.error statistic   p.value conf.low
    #>   <chr>    <chr>    <chr>     <dbl>     <dbl>     <dbl>     <dbl>    <dbl>
    #> 1 Sepal.L… Petal.L… (Inte…    4.31     0.0784     54.9  2.43e-100    4.15
    #> 2 Sepal.L… Petal.L… <-Xte…    0.409    0.0189     21.6  1.04e- 47    0.372
    #> 3 Sepal.L… Petal.W… (Inte…    4.78     0.0729     65.5  3.34e-111    4.63
    #> 4 Sepal.L… Petal.W… <-Xte…    0.889    0.0514     17.3  2.33e- 37    0.788
    #> 5 Sepal.W… Petal.L… (Inte…    3.45     0.0761     45.4  9.02e- 89    3.31
    #> 6 Sepal.W… Petal.L… <-Xte…   -0.106    0.0183     -5.77 4.51e-  8   -0.142
    #> 7 Sepal.W… Petal.W… (Inte…    3.31     0.0621     53.3  1.84e- 98    3.19
    #> 8 Sepal.W… Petal.W… <-Xte…   -0.209    0.0437     -4.79 4.07e-  6   -0.295
    #> # ... with 3 more variables: conf.high <dbl>, sample.size <int>,
    #> #   adj.p.value <dbl>
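As a rough cross-check, the first two rows of the output above should correspond to a single model of Sepal.Length on Petal.Length. The sketch below fits that one model in base R. It assumes that mason's default settings for 'glm' use the Gaussian family with no covariates, which this README does not state explicitly.

```r
# Fit one of the four Y/X pairs by hand (base R only, no mason needed).
# Assumption: the mason call above uses glm's default gaussian family.
fit <- glm(Sepal.Length ~ Petal.Length, data = iris)

# Coefficient table: intercept first, then the Petal.Length slope.
coef_table <- summary(fit)$coefficients
print(round(coef_table[, c("Estimate", "Std. Error")], 4))
```

If the assumption holds, the estimates here line up with rows 1 and 2 of the table (about 4.31 for the intercept and 0.409 for the slope).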
Depending on the statistical method being used, each function may have slightly different arguments.
If there are problems, create an issue and let me know what the problem is!
add_settings following the naming convention add_settings.statmethod_bp and include the appropriate settings to the statistical method.
type argument in the
add_settings instructions above, do the same for the
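The naming convention add_settings.statmethod_bp follows R's S3 method dispatch, where a generic picks the method whose suffix matches the object's class. The sketch below uses a stand-in generic called my_add_settings and a made-up class mymethod_bp so that it runs without mason. The class name, the conf.level setting, and the structure of the blueprint object are all assumptions for illustration, not mason's internals.

```r
# Stand-in generic (hypothetical; mason's real generic is add_settings).
my_add_settings <- function(blueprint, ...) {
    UseMethod("my_add_settings")
}

# Method for a made-up blueprint class "mymethod_bp".
# The 'conf.level' setting is an invented example of a method-specific option.
my_add_settings.mymethod_bp <- function(blueprint, conf.level = 0.95, ...) {
    blueprint$settings <- list(conf.level = conf.level)
    blueprint
}

# A blueprint is assumed here to be a list carrying a class attribute.
bp <- structure(list(data = iris), class = "mymethod_bp")
bp <- my_add_settings(bp, conf.level = 0.9)
print(bp$settings$conf.level)
```

Calling the generic on an object of a class without a matching method would raise an error, which is why each new statistical method needs its own suffixed method.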
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.
mutate functions from dplyr version update
NEWS.md file to track changes to the package.