Dynamic Estimation of Group-Level Opinion

Fit dynamic group-level item response theory (IRT) and multilevel regression and poststratification (MRP) models from item response data. dgo models latent traits at the level of demographic and geographic groups, rather than individuals, in a Bayesian group-level IRT approach developed by Caughey and Warshaw (2015). The package also estimates subpopulations' average responses to single survey items with a dynamic MRP model proposed by Park, Gelman, and Bafumi (2004).

dgo is an R package for the dynamic estimation of group-level public opinion. You can use the package to estimate latent trait means in subpopulations from survey data. For example, dgo can estimate the average policy liberalism in each American state over time among Democrats, Independents, and Republicans, given their answers to survey questions about policy proposals.

dgo accomplishes this using a Bayesian group-level IRT approach developed by Caughey and Warshaw (2015). It models latent traits at the level of demographic and geographic groups rather than individuals. It uses a hierarchical model to borrow strength cross-sectionally and dynamic linear models to do so across time.

The package can also produce smoothed estimates of subpopulations' average responses to single survey items, using a dynamic multilevel regression and poststratification (MRP) model (Park, Gelman, and Bafumi 2004). For instance, you can use dgo to estimate public opinion in each state on same-sex marriage or the Affordable Care Act.

This model opens up new areas of research on historical public opinion in the United States at the subnational level. It also allows scholars of comparative politics to estimate dynamic cross-national models of public opinion.


dgo can be installed from CRAN:

install.packages("dgo")


Or get the latest version from GitHub using devtools:

if (!require(devtools, quietly = TRUE)) install.packages("devtools")
devtools::install_github("jamesdunham/dgo", dependencies = TRUE)

dgo requires a working installation of RStan. If you don't already have RStan, follow its "Getting Started" guide.


Load the package and set RStan's recommended options for a local, multicore machine with excess RAM:

library(dgo)
rstan::rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())

The minimal workflow from raw data to estimation is:

  1. shape input data using the shape() function; and
  2. pass the result to the dgirt() function to estimate a latent trait (e.g., conservatism) or to the dgmrp() function to estimate opinion on a single survey question.
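
As a minimal sketch of this workflow, the snippet below fits a single-item MRP model. It assumes the `opinion` example data distributed with dgo and its column names (`abortion`, `year`, `state`, `race3`), none of which are described above. The sampler settings are illustrative only.

```r
library(dgo)

# Step 1: shape the item response data.
# Assumes the bundled `opinion` data and its column names.
dgirt_in_abortion <- shape(opinion,
  item_names = "abortion",                 # survey item to model
  time_name = "year",                      # time variable
  geo_name = "state",                      # geographic variable
  group_names = "race3",                   # demographic grouping variable
  geo_filter = c("CA", "GA", "LA", "MA"))  # keep the example small

# Step 2: estimate opinion on the single item with dgmrp().
dgmrp_out_abortion <- dgmrp(dgirt_in_abortion,
  iter = 1500, chains = 4, cores = 4, seed = 42)

# Inspect the first few estimates.
head(as.data.frame(dgmrp_out_abortion))
```

To estimate a latent trait instead, the same pattern should apply with several items given in item_names and dgirt() called on the shaped data.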


Please report issues that you encounter.

  • OS X only: RStan creates temporary files during estimation in a location given by tempdir(), typically an arbitrary location in /var/folders. If a model runs for days, these files can be cleaned up while still needed, which induces an error. A good solution is to set a safer path for temporary files, using an environment variable checked at session startup. For help setting environment variables, see the Stack Overflow question here. Confirm the new path before starting your model run by restarting R and checking the output from tempdir().

  • Models fitted before October 2016 (specifically, before commit #8e6a2cf) using dgirt are not fully compatible with dgo. Their contents can be extracted without using dgo, however, with the $ indexing operator. For example: as.data.frame(dgirtfit_object$stan.cmb).

  • Calling dgirt() or dgmrp() can generate warnings during model compilation. These are safe to ignore, or can be suppressed by following the linked instructions.
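
For the OS X temporary-file note above, one possible approach is sketched below. It assumes the TMPDIR environment variable (one of the variables R checks at startup when choosing its temporary directory) and a user-level ~/.Renviron file. The helper name set_tmpdir_entry() and the directory ~/dgo_tmp are hypothetical and not part of dgo.

```r
# Append a TMPDIR entry to an Renviron file, which R reads at session startup.
# `tmp_path` and the default `renviron` location are illustrative assumptions.
set_tmpdir_entry <- function(tmp_path, renviron = "~/.Renviron") {
  # Create the directory if it does not exist yet.
  dir.create(tmp_path, showWarnings = FALSE, recursive = TRUE)
  entry <- paste0("TMPDIR=", normalizePath(tmp_path))
  # Append rather than overwrite, so existing entries are kept.
  cat(entry, "\n", sep = "", file = renviron, append = TRUE)
  invisible(entry)
}

# Example call with a hypothetical directory:
# set_tmpdir_entry("~/dgo_tmp")

# After restarting R, confirm the new location before a long model run:
Sys.getenv("TMPDIR")
tempdir()
```

If tempdir() still points into /var/folders after a restart, the entry was not picked up and the model run should not be started yet.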

Contributing and citing

dgo is under development and we welcome suggestions.

The package citation is:

Dunham, James, Devin Caughey, and Christopher Warshaw. 2017. dgo: Dynamic Estimation of Group-level Opinion. R package. https://jdunham.io/dgo/.


dgo 0.2.14

  • Avoid an error during testing, on R built --without-long-double.

dgo 0.2.13

  • Fix an issue introduced in v0.2.12 that led to an unexpected error in shape() when 1) at least two group_names are specified in an order other than alphabetic and 2) geographic modifier_data is used.

dgo 0.2.12

  • Allow modeling of unobserved groups using aggregated data. The previous behavior was to drop rows in aggregate_data indicating zero trials. (They don't represent item responses.) Preserving them has the effect that unobserved groups, defined partially or entirely by the values of the grouping variables in zero-trial rows in aggregate_data, can be included in a model.
  • Fix an unexpected error when 1) aggregate_data is used without item_data, 2) no demographic groups are specified via group_names, and 3) geographic modifier_data is used.
  • Fix the check for missing modifier_data. Geographic modifier_data must cover all combinations of the geo and time variables in the item response data (individual or aggregated), but because of a bug in the validation of the geographic data, this requirement was not always enforced. In some cases a warning would appear instead of an error.

dgo 0.2.11

  • Add poststratification over posterior samples (closes #21).
  • shape() now accepts aggregated item response data unaccompanied by individual-level item response data. The item_data and item_names arguments are no longer required.
  • Add a max_raked_weight argument to shape() for trimming raked weights. Note that trimming occurs before raked weights are rescaled to have mean 1, and the rescaled weights can be larger than max_raked_weight.
  • Remove the unused function expand_rownames().
  • Bugfixes.
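
The note on max_raked_weight above can be illustrated with plain arithmetic. This is a base-R sketch with made-up weights, not the package's internal code: it only shows why trimming before rescaling can leave weights above the cap.

```r
# Made-up raked weights and a cap, for illustration only.
raw_weights <- c(0.1, 0.2, 0.3, 5)
max_raked_weight <- 2

# Trim first: the largest weight is cut down to the cap.
trimmed <- pmin(raw_weights, max_raked_weight)

# Then rescale to mean 1. The trimmed mean here is below 1,
# so rescaling inflates every weight.
rescaled <- trimmed / mean(trimmed)

mean(rescaled)                    # 1 after rescaling
max(rescaled) > max_raked_weight  # TRUE: the largest weight exceeds the cap
```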

dgo 0.2.10

  • Remove Rcpp dependency by rewriting dichotomize() in R.
  • Avoid estimating models (using RStan) during tests, with the goal of rendering moot variation in build environments. This addresses a test failure during CRAN's r-release-osx-x86_64 build.

dgo 0.2.9

  • Switch from compiling Stan models at install time to compiling them at runtime, avoiding an Rcpp module issue.
  • Add a model argument to dgirt() and dgmrp() that accepts a previously compiled Stan model for reuse, as found in the @stanmodel slot of a dgirt_fit- or dgmrp_fit-class object.
  • The version argument to dgirt() and dgmrp() can be used to specify arbitrary .stan files on the disk in addition to those included with the package.
  • The by argument to the get_n() and get_item_n() methods properly accepts a vector of variable names when combined with aggregate arguments.
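
A minimal sketch of reusing a compiled model through the model argument described above. It assumes the `opinion` example data distributed with dgo and its column names; the object names and sampler settings are illustrative.

```r
library(dgo)

# Shape two items for a latent-trait model (assumes the bundled `opinion` data).
dgirt_in <- shape(opinion,
  item_names = c("affirmative_action", "gaymarriage_amendment"),
  time_name = "year",
  geo_name = "state",
  group_names = "race3",
  geo_filter = c("CA", "GA", "LA", "MA"))

# The first call compiles the Stan model at runtime.
first_fit <- dgirt(dgirt_in, iter = 1500, chains = 4, cores = 4, seed = 42)

# Later calls can pass the compiled model from the @stanmodel slot
# so that it is reused instead of compiled again.
second_fit <- dgirt(dgirt_in, iter = 1500, chains = 4, cores = 4, seed = 1,
  model = first_fit@stanmodel)
```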

dgo 0.2.8

  • Improve Stan models for shorter run times
  • Add dgmrp() for fitting single-issue MRP models with hierarchical covariates
  • Add class dgmrp_fit for models fitted with dgmrp(), inheriting from a new virtual class dgo_fit
  • dgirt() now returns a dgirt_fit-class object that also inherits from dgo_fit class
  • Bugfixes

dgo 0.2.7

  • Package renamed dgo: Dynamic Estimation of Group-level Opinion
  • Tweaks to pass CRAN checks: clean up examples and docs
  • Use roxygen2 for classes, methods, and NAMESPACE
  • Fix checks on P, S related to group_names change in 0.2.5
  • Fix Rcpp module issue from 0.2.6 (Error in .doLoadActions(where, attach))
  • Export expand_rownames()

dgo 0.2.6

  • Fix error in dgirt_plot
  • Fix path in tools/make_cpp.R

dgo 0.2.5

  • group_names is no longer required. If omitted, the geographic variable given by geo_name will define groups.
  • aggregate_item_names is no longer required. It defaults to the observed values of the item column in aggregate_data.
  • raking argument to shape() replaces strata_names. It takes a formula or list of formulas and allows more complicated preweighting.
  • id_vars argument to shape() specifies variables to be kept in item_data.
  • aggregate_data may include geographic areas, demographics, or time periods that don't appear in item_data.
  • Fix: use a smaller epsilon than the default in survey::rake() for convergence with non-frequency weights.
  • New dgirtfit methods rhats() and plot_rhats() for model checking.
  • New dgirtfit method get_time_elapsed gives model run times. These also appear in summary output.

