Fit dynamic group-level item response theory (IRT) and multilevel
regression and poststratification (MRP) models from item response data. dgo
models latent traits at the level of demographic and geographic groups,
rather than individuals, in a Bayesian group-level IRT approach developed by
Caughey and Warshaw (2015).
dgo is an R package for the dynamic estimation of group-level public opinion. You can use the package to estimate latent trait means in subpopulations from survey data. For example, dgo can estimate the average policy liberalism in each American state over time among Democrats, Independents, and Republicans, given their answers to survey questions about policy proposals.
dgo accomplishes this using a Bayesian group-level IRT approach developed by Caughey and Warshaw (2015). It models latent traits at the level of demographic and geographic groups rather than individuals, using a hierarchical model to borrow strength cross-sectionally and dynamic linear models to borrow strength across time.
The package can also produce smoothed estimates of subpopulations’ average responses to single survey items, using a dynamic multilevel regression and poststratification (MRP) model (Park, Gelman, and Bafumi 2004). For instance, you can use dgo to estimate public opinion in each state on same-sex marriage or the Affordable Care Act.
This model opens up new areas of research on historical public opinion in the United States at the subnational level. It also allows scholars of comparative politics to estimate dynamic cross-national models of public opinion.
The development version of dgo can be installed from GitHub using devtools:
if (!require(devtools, quietly = TRUE)) install.packages("devtools")
devtools::install_github("jamesdunham/dgo")
Load the package and set RStan’s recommended options for a local, multicore machine with excess RAM:
library(dgo)
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
The minimal workflow from raw data to estimation is to shape the input data using the shape() function, then pass the result to the dgirt() function to estimate a latent trait (e.g., conservatism) or to the dgmrp() function to estimate opinion on a single survey question.
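A minimal sketch of that workflow for a single survey item follows. It assumes the example dataset `opinion` bundled with the package and the column names `abortion`, `year`, `state`, and `race3`; adjust these to your own data. The sampler settings are illustrative only and are assumed to be passed on to RStan.

```r
library(dgo)

# Assumption: `opinion` is the example survey data shipped with dgo, with an
# item column `abortion` and grouping columns `year`, `state`, and `race3`.

# Step 1: shape the raw item responses into model-ready data.
shaped <- shape(
  opinion,
  item_names  = "abortion",
  time_name   = "year",
  geo_name    = "state",
  group_names = "race3",
  geo_filter  = c("CA", "GA", "LA", "MA")  # a few states keep the run short
)

# Step 2: estimate opinion on the single item with the dynamic MRP model.
# iter, chains, cores, and seed are illustrative sampler settings.
fit <- dgmrp(shaped, iter = 1500, chains = 4, cores = 4, seed = 42)
```

To estimate a latent trait instead, the same kind of shaped object, built from several items, would be passed to dgirt().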
Please report issues that you encounter.
OS X only: RStan creates temporary files during estimation in a location given by tempdir(), typically an arbitrary location in /var/folders. If a model runs for days, these files can be cleaned up while still needed, which induces an error. A good solution is to set a safer path for temporary files, using an environment variable checked at session startup. For help setting environment variables, see the Stack Overflow question. Confirm the new path before starting your model run by restarting R and checking the output from tempdir().
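One way to make this check, sketched under the assumption that the TMPDIR environment variable was set in ~/.Renviron (R consults TMPDIR, then TMP and TEMP, when a session starts). The directory shown in the comment is a hypothetical example.

```r
# Assumption: a line like the following was added to ~/.Renviron.
# The directory is hypothetical and must already exist:
#   TMPDIR=/Users/yourname/stan-tmp

# After restarting R, inspect where temporary files will be written.
tmp_path <- tempdir()
print(tmp_path)

# Flag the default macOS location, where long-lived files may be cleaned up.
uses_var_folders <- grepl("/var/folders", tmp_path, fixed = TRUE)
if (uses_var_folders) {
  message("tempdir() still points into /var/folders; check TMPDIR in ~/.Renviron")
}
```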
Models fitted before October 2016 (specifically <
using dgirt are not fully compatible with dgo. Their contents can be
extracted without using dgo, however, with the
operator. For example:
dgmrp() can generate warnings during model compilation. These are safe to ignore, or can be suppressed by following the linked instructions.
dgo is under development and we welcome suggestions.
The package citation is:
Dunham, James, Devin Caughey, and Christopher Warshaw. 2018. dgo: Dynamic Estimation of Group-level Opinion. R package. https://jdunham.io/dgo/.
plot_dgirt() has been replaced by argument
group_name, which takes the name of a single grouping variable. This is a quick workaround for compatibility with breaking changes in ggplot2 3.0.0.
shape() when 1) at least two
group_names are specified in an order other than alphabetical and 2) geographic
aggregate_data indicating zero trials. (They don't represent item responses.) Preserving them has the effect that unobserved groups, defined partially or entirely by the values of the grouping variables in zero-trial rows in
aggregate_data, can be included in a model.
aggregate_data is used without
item_data, 2) no demographic groups are specified via
group_names, and 3) geographic
modifier_data must cover all combinations of the geo and time variables in the item response data (individual or aggregated), but because of a bug in the validation of the geographic data, this requirement was not always enforced. In some cases a warning would appear instead of an error.
shape()now accepts aggregated item response data unaccompanied by individual-level item response data. The
item_names arguments are no longer required.
shape() for trimming raked weights. Note that trimming occurs before raked weights are rescaled to have mean 1, and the rescaled weights can be larger than
dgmrp() taking for reuse a previously compiled Stan model, as found in the
@stanmodel slot of a
dgmrp()can be used to specify arbitrary
.stan files on the disk in addition to those included with the package.
get_item_n() methods properly accepts a vector of variable names when combined with
dgmrp() for fitting single-issue MRP models with hierarchical covariates
dgmrp_fit for models fitted with
dgmrp(), inheriting from a new virtual class
dgirt() now returns a
dgirt_fit-class object that also inherits from
group_names change in 0.2.5
Error in .doLoadActions(where, attach))
group_names is no longer required. If omitted, the geographic variable given by
geo_name will define groups.
aggregate_item_names is no longer required. It defaults to the observed values of the
strata_names. It takes a formula or list of formulas and allows more complicated preweighting.
shape() specifies variables to be kept in
aggregate_data may include geographic areas, demographics, or time periods that don't appear in
plot_rhats() for model checking.
get_time_elapsed gives model run times. These also appear in