Covariate Balance Tables and Plots

Generate balance tables and plots for covariates of groups preprocessed through matching, weighting or subclassification, for example, using propensity scores. Includes integration with 'MatchIt', 'twang', 'Matching', 'optmatch', 'CBPS', 'ebal', and 'WeightIt' for assessing balance on the output of their preprocessing functions. Users can also specify data for balance assessment not generated through the above packages. Also included are methods for assessing balance in clustered or multiply imputed data sets.


Welcome to cobalt, which stands for Covariate Balance Tables (and Plots). cobalt allows users to assess balance on covariate distributions in preprocessed groups generated through weighting, matching, or subclassification, such as by using the propensity score. cobalt's primary function is bal.tab(), which stands for "balance table", and essentially replaces (or supplements) the balance assessment tools found in the R packages twang, MatchIt, CBPS, and Matching. To examine how bal.tab() integrates with these packages and others, see the help file for bal.tab() with ?bal.tab, which links to the methods used for each package. Each page has examples of how bal.tab() is used with the package. There are also three vignettes detailing the use of cobalt, which can be accessed with browseVignettes("cobalt"): one for basic uses of cobalt, one for the use of cobalt with additional packages, and another for the use of cobalt with multiply imputed and/or clustered data. Currently, cobalt is compatible with output from MatchIt, twang, Matching, optmatch, CBPS, ebal, and WeightIt, as well as data not processed through these packages.

Why cobalt?

Most of the major conditioning packages contain functions to assess balance; so why use cobalt at all? cobalt arose out of several desiderata when using these packages: to have standardized measures that were consistent across all conditioning packages, to allow for flexibility in the calculation and display of balance measures, and to incorporate recent methodological recommendations in the assessment of balance. In addition, cobalt has unique plotting capabilities that make use of ggplot2 in R for balance assessment and reporting.

Because conditioning methods are spread across several packages which each have their idiosyncrasies in how they report balance (if at all), comparing the resulting balance from various conditioning methods can be a challenge. cobalt unites these packages by providing a single, flexible tool that intelligently processes output from any of the conditioning packages and provides the user with both useful defaults and customizable options for display and calculation. cobalt also allows for balance assessment on data not generated through any of the conditioning packages. In addition, cobalt has tools for assessing and reporting balance for clustered data sets, data sets generated through multiple imputation, and data sets with a continuous treatment variable, all features that exist in very limited capacities or not at all in other packages.

A large focus in developing cobalt was to streamline output so that only the most useful, non-redundant, and complete information is displayed, all at the user's choice. Balance statistics are intuitive, methodologically informed, and simple to interpret. Visual displays of balance reflect the goals of balance assessment rather than being steps removed. While other packages have focused their efforts on processing data, cobalt only assesses balance, and does so particularly well.

New features are being added all the time, following the cutting edge of methodological work on balance assessment. As new packages and methods are developed, cobalt will be ready to integrate them to further our goal of simple, unified balance assessment.

Below are examples of cobalt's primary functions:

library("cobalt")
library("MatchIt")
data("lalonde", package = "cobalt")
m.out <- matchit(treat ~ age + educ + race + married + nodegree + re74 + re75, 
    data = lalonde)
 
# Checking balance before and after matching:
bal.tab(m.out, m.threshold = 0.1, un = TRUE)
#> 
#> Call:
#>   matchit(formula = treat ~ age + educ + race + married + nodegree + 
#>       re74 + re75, data = lalonde)
#> 
#> Balance Measures:
#>                 Type Diff.Un Diff.Adj        M.Threshold
#> distance    Distance  1.7941   0.9739                   
#> age          Contin. -0.3094   0.0718     Balanced, <0.1
#> educ         Contin.  0.0550  -0.1290 Not Balanced, >0.1
#> race_black    Binary  0.6404   0.3730 Not Balanced, >0.1
#> race_hispan   Binary -0.0827  -0.1568 Not Balanced, >0.1
#> race_white    Binary -0.5577  -0.2162 Not Balanced, >0.1
#> married       Binary -0.3236  -0.0216     Balanced, <0.1
#> nodegree      Binary  0.1114   0.0703     Balanced, <0.1
#> re74         Contin. -0.7211  -0.0505     Balanced, <0.1
#> re75         Contin. -0.2903  -0.0257     Balanced, <0.1
#> 
#> Balance tally for mean differences:
#>                    count
#> Balanced, <0.1         5
#> Not Balanced, >0.1     4
#> 
#> Variable with the greatest mean difference:
#>            Diff.Adj        M.Threshold
#> race_black    0.373 Not Balanced, >0.1
#> 
#> Sample sizes:
#>           Control Treated
#> All           429     185
#> Matched       185     185
#> Unmatched     244       0
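
To make the Diff.Un and Diff.Adj columns above easier to read, here is a minimal base R sketch of the two kinds of mean difference they appear to report: a standardized mean difference for continuous covariates and a raw difference in proportions for binary covariates. It assumes the convention of dividing by the treated-group standard deviation; cobalt's own computation depends on options such as s.d.denom and estimand. The toy data and the helper names std_mean_diff() and raw_diff() are hypothetical and are not part of cobalt.

```r
# Toy data (hypothetical; not the lalonde data set)
treat   <- c(1, 1, 1, 0, 0, 0)
age     <- c(1, 2, 3, 4, 5, 6)
married <- c(1, 0, 0, 1, 1, 0)

# Standardized mean difference for a continuous covariate.
# Assumption: the treated-group SD is used as the standardization factor.
std_mean_diff <- function(x, treat) {
  (mean(x[treat == 1]) - mean(x[treat == 0])) / sd(x[treat == 1])
}

# Raw difference in proportions for a binary (0/1) covariate.
raw_diff <- function(x, treat) {
  mean(x[treat == 1]) - mean(x[treat == 0])
}

std_mean_diff(age, treat)  # (2 - 5) / 1
raw_diff(married, treat)   # 1/3 - 2/3
```

Values near zero indicate similar group means; the m.threshold argument used above flags covariates whose absolute difference exceeds the chosen cutoff.
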
# Examining distributional balance with plots:
bal.plot(m.out, var.name = "educ")
bal.plot(m.out, var.name = "race")

# Generating a Love plot to report balance:
love.plot(bal.tab(m.out), threshold = 0.1, abs = TRUE, var.order = "unadjusted")
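
cobalt can also assess balance on data that did not come from one of the conditioning packages. The sketch below simulates a small data set, estimates propensity scores with glm(), builds ATT-style weights by hand, and passes them to the formula interface of bal.tab(). The simulated data and the variable names (dat, ps, w) are hypothetical. The bal.tab() call assumes the formula method accepts a numeric weights vector together with method = "weighting", so check ?bal.tab for the version you have installed.

```r
set.seed(123)

# Simulated data (hypothetical): two covariates and a binary treatment
n   <- 500
dat <- data.frame(age = rnorm(n, 30, 8), educ = rnorm(n, 12, 2))
dat$treat <- rbinom(n, 1, plogis(-0.5 + 0.04 * (dat$age - 30) +
                                   0.15 * (dat$educ - 12)))

# Propensity scores from a logistic regression
ps <- fitted(glm(treat ~ age + educ, data = dat, family = binomial))

# ATT-style weights: 1 for treated units, odds of treatment for controls
dat$w <- ifelse(dat$treat == 1, 1, ps / (1 - ps))

# Balance before and after weighting (runs only if cobalt is installed)
if (requireNamespace("cobalt", quietly = TRUE)) {
  print(cobalt::bal.tab(treat ~ age + educ, data = dat,
                        weights = dat$w, method = "weighting",
                        estimand = "ATT", un = TRUE))
}
```

Computing the weights outside of cobalt keeps the example independent of any particular weighting package; bal.tab() only needs the treatment, the covariates, and the weights.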

Please remember to cite this package when using it to analyze data. For example, in a manuscript, write: "Matching was performed using Matching (Sekhon, 2011), and covariate balance was assessed using cobalt (Greifer, 2017) in R (R Core Team, 2017)." Use citation("cobalt") to generate a bibliographic reference for the cobalt package.

News

cobalt News and Updates

Version 3.2.0

  • Added support for longitudinal treatments in bal.tab(), bal.plot(), and love.plot(), including output from iptw() in twang, CBMSM() from CBPS, and weightitMSM() from WeightIt.

  • Added a vignette to explain use with longitudinal treatments.

  • Edits to help files.

  • Added ability to change density options in bal.plot().

  • Added support for imp in bal.tab() for weightit objects.

  • Fixed bug when limited variables were present. (One found and fixed by sumtxt.)

  • Fixed bug with multiple methods when weights were entered as a list.

Version 3.1.0

  • Added full support for tibbles.

  • Examples for weightit methods in documentation and vignette now work.

  • Improved speed and performance.

  • Added pairwise option for bal.tab() with multinomial treatments.

  • Increased flexibility for displaying balance using love.plot() with clustered or multiply imputed data.

  • Added imbalanced.only and disp.bal.tab options to bal.tab().

  • Fixes to the vignettes. Also, creation of a new vignette to simplify the main one.

Version 3.0.0

  • Added support for multinomial treatments in bal.tab(), including output from CBPS and twang.

  • Added support for weightit objects from WeightIt, including for multinomial treatments.

  • Added support for ebalance.trim objects from ebal.

  • Fixes to the vignette.

  • Fixes to splitfactor() to handle tibbles better.

  • Fixed bug when using bal.tab() with multiply imputed data without adjustment. Fixed bug when using s.weights with the formula method of bal.tab().

Version 2.2.0

  • Added disp.ks and ks.threshold options to bal.tab() to display Kolmogorov-Smirnov statistics before and after preprocessing.

  • Added support for sampling weights, which are applied to both control and treated units, using option s.weights in bal.tab(). Sampling weights are also now compatible with the sampling weights in ps objects from twang; the default is to apply the sampling weights before and after adjustment, mimicking the behavior of bal.table() in twang.

  • Changed behavior of bal.tab() for ps objects to allow displaying balance for more than one stop method at a time, and to default to displaying balance for all available stop methods. The full.stop.method argument in bal.tab() has been renamed stop.method, but full.stop.method still works. get.w() for ps objects has also gone through some changes to be more like twang's get.weights().

  • Added support in bal.tab() and bal.plot() for subclassification with continuous treatments.

  • Added support in splitfactor() and unsplitfactor() for NA values

  • Fixed a bug in love.plot() caused when var.order was specified to be a sample that was not present.

Version 2.1.0

  • Added support in bal.tab(), bal.plot(), and love.plot() for examining balance on multiple weight specifications at a time

  • Added new utilities splitfactor(), unsplitfactor(), and get.w()

  • Added option in bal.plot() to display points sized by weights when treatment and covariate are continuous

  • Added which = "both" option in bal.plot() to simultaneously display plots for both adjusted and unadjusted samples; changed argument syntax to accommodate

  • Allowed bal.plot() to display balance for multiple clusters and imputations simultaneously

  • Allowed bal.plot() to display balance for multiple subclasses simultaneously with which.sub

  • Fixes to love.plot() to ensure adjusted points are in front of unadjusted points; changed colors and shape defaults and allowable values

  • Fixed bug where s.d.denom and estimand were not functioning correctly in bal.tab()

  • distance, addl, and weights can now be specified as lists of the usual arguments

Version 2.0.0

  • Added support for matching using the optmatch package or by specifying matching strata.

  • Added full support (bal.tab(), love.plot(), and bal.plot()) for multiply imputed data, including for clustered data sets.

  • Added support for multiple distance measures, including special treatment in love.plot()

  • Adjusted specifications in love.plot() for color and shape of points, and added option to generate a line connecting the points.

  • Adjusted love.plot() display to perform better on Windows.

  • Added capabilities for love.plot() and bal.plot() to display plots for multiple groups at a time

  • Added flexibility to f.build().

  • Updated bal.plot(), giving the capability to view multiple plots for subclassified or clustered data. Multinomial treatments are also supported.

  • Created a new vignette for clustered and multiply imputed data

  • Speed improvements

  • Fixed a bug causing mislabelling of categorical variables

  • Changed calculation of weighted variance to be in line with recommendations; CBPS can now be used with standardized weights

Version 1.3.1

  • Added support for entropy balancing through the ebal package.

  • Changed default color scheme of love.plot() to be black and white and added options for color, shape, and size of points.

  • Added sample size calculations for continuous treatments.

  • Edits to the vignette.

Version 1.3.0

  • Increased capabilities for cluster balance in bal.tab() and love.plot()

  • Increased information and decreased redundancy when assessing balance on interactions

  • Added "quick" option for bal.tab() to increase speed

  • Added options for print()

  • Bug fixes

  • Speed improvements

  • Edits to the vignette

Version 1.2.0

  • Added support for continuous treatment variables in bal.tab(), bal.plot(), and love.plot()

  • Added balance assessment within and across clusters

  • Other small performance changes to minimize errors and be more intuitive

  • Major revisions and adjustments to the vignette

Version 1.1.0

  • Added a vignette.

  • Fixed error in bal.tab.Match that caused wrong values and warning messages when used.

  • Added new capabilities to bal.plot, including the ability to view unadjusted sample distributions, categorical variables as such, and the distance measure. Also updated documentation to reflect these changes and make which.sub more focal.

  • Allowed subclasses to be different from simply 1:S by treating them like factors once input is numerical

  • Changed column names in Balance table output to fit more compactly, and updated documentation to reflect these changes.

  • Other small performance changes to minimize errors and be more intuitive.


install.packages("cobalt")

3.2.3 by Noah Greifer, 2 months ago


https://github.com/ngreifer/cobalt


Report a bug at https://github.com/ngreifer/cobalt/issues


Browse source code at https://github.com/cran/cobalt


Authors: Noah Greifer [aut, cre]




GPL (>= 2) license


Imports ggplot2, ggstance

Suggests MatchIt, WeightIt, twang, Matching, optmatch, ebal, CBPS, mice, knitr, rmarkdown


Imported by WeightIt.

