Visual Regression Testing and Graphical Diffing

An extension to the 'testthat' package that makes it easy to add graphical unit tests. It provides a Shiny application to manage the test cases.

vdiffr is an extension to the package testthat that makes it easy to test for visual regressions. It provides a Shiny app to manage failed tests and visually compare a graphic to its expected output.


Get the development version from GitHub with:

# install.packages("devtools")

or the last CRAN release with:

install.packages("vdiffr")

vdiffr requires FreeType greater than 2.6.0. It is automatically installed on Windows along with gdtools and comes with X11 on macOS. If you run an old Linux distribution, it is possible you will have to update the relevant package. See the section on Travis-CI below for some indications.

How to use vdiffr

Adding expectations

vdiffr integrates with testthat through the expect_doppelganger() expectation. It takes as arguments:

  • A title. This title is used in two ways. First, the title is standardised (it is converted to lowercase and any character that is not alphanumeric or a space is turned into a dash) and used as filename for storing the figure. Secondly, with ggplot2 figures the title is automatically added to the plot with ggtitle() (only if no ggtitle has been set).

  • A figure. This can be a ggplot object, a recordedplot, a function to be called, or more generally any object with a print method.

  • Optionally, a path in which to store the figures, relative to tests/figs/. By default, they are stored in a subfolder named after the current testthat context. Supply path to change the subfolder.

For example, the following tests will create figures in tests/figs/histograms/ called base-graphics-histogram.svg and ggplot2-histogram.svg:

disp_hist_base <- function() hist(mtcars$disp)
disp_hist_ggplot <- ggplot(mtcars, aes(disp)) + geom_histogram()
vdiffr::expect_doppelganger("Base graphics histogram", disp_hist_base)
vdiffr::expect_doppelganger("ggplot2 histogram", disp_hist_ggplot)

Note that in addition to automatic ggtitles, ggplot2 figures are assigned the minimalistic theme theme_test() (unless they already have been assigned a theme).
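
The file names in the example follow from the title standardisation rule described above. Here is a minimal sketch of that rule, assuming the testthat context is used verbatim as the subfolder name. standardise_title() and figure_path() are hypothetical helpers written for illustration, not vdiffr functions:

```r
# Hypothetical helpers; they mirror the documented rule, not vdiffr's internals.
standardise_title <- function(title) {
  # Lowercase, then turn spaces and other non-alphanumeric characters into dashes
  gsub("[^a-z0-9]", "-", tolower(title))
}

figure_path <- function(title, context, path = NULL) {
  # Assumption: the context name is the default subfolder; `path` overrides it
  subfolder <- if (is.null(path)) context else path
  parts <- c("tests", "figs", subfolder, paste0(standardise_title(title), ".svg"))
  paste(parts[nzchar(parts)], collapse = "/")
}

figure_path("Base graphics histogram", context = "histograms")
# "tests/figs/histograms/base-graphics-histogram.svg"
figure_path("ggplot2 histogram", context = "histograms", path = "")
# "tests/figs/ggplot2-histogram.svg"
```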

Running tests

You can run the tests the usual way, for example with devtools::test(). New cases for which you just wrote an expectation will be skipped. Failed tests will show as an error.

Managing the tests

When you have added new test cases or detected regressions, you can manage them from the R command line with the functions collect_cases(), validate_cases(), and delete_orphaned_cases(). However, it is easier to run the Shiny application manage_cases(). With this app you can:

  • Check how a failed case differs from its expected output using three widgets: Toggle (click to swap the images), Slide and Diff. If you use GitHub, you may be familiar with the last two.

  • Validate cases. You can do so groupwise (all new cases or all failed cases) or on a case by case basis. When you validate a failed case, the old expected output is replaced by the new one.

  • Delete orphaned cases. During a refactoring of your unit tests, some visual expectations may be removed or renamed. This means that some unused figures will linger in the tests/figs/ folder. These figures appear in the Shiny application under the category "Orphaned" and can be cleaned up from there.
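
The three categories handled by the app (new, failed, orphaned) can be pictured as set operations on case names. The sketch below is only an illustration under the assumption that each case is identified by its name and carries its SVG source as a string. classify_cases() is a hypothetical function, not part of vdiffr:

```r
# Hypothetical illustration of how cases fall into the three categories.
# `expected` and `current` are named character vectors: names are case names,
# values are SVG sources (validated on disk vs. produced by the current tests).
classify_cases <- function(expected, current) {
  shared <- intersect(names(current), names(expected))
  list(
    new      = setdiff(names(current), names(expected)),
    failed   = shared[current[shared] != expected[shared]],
    orphaned = setdiff(names(expected), names(current))
  )
}

expected <- c(a = "<svg>1</svg>", b = "<svg>2</svg>", old = "<svg>3</svg>")
current  <- c(a = "<svg>1</svg>", b = "<svg>changed</svg>", fresh = "<svg>4</svg>")

result <- classify_cases(expected, current)
result$new       # "fresh"
result$failed    # "b"
result$orphaned  # "old"
```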

Both manage_cases() and collect_cases() take package as first argument, the path to your package sources. This argument has exactly the same semantics as in devtools. You can use vdiffr tools the same way as you would use devtools::check(), for example. The default is ".", meaning that the package is expected to be found in the current folder.

All validated cases are stored in tests/figs/. This folder may be handy to showcase the different graphs offered in your package. You can also keep track of how your plots change as you tweak their layout and add features by checking the history on GitHub.

RStudio integration

An addin to launch manage_cases() is provided with vdiffr. Use the addin menu to launch the Shiny app in an RStudio dialog.

ESS integration

To use the Shiny app as part of ESS devtools integration with C-c C-w C-v, include something like this in your init file:

(defun ess-r-vdiffr-manage-cases ()
  (ess-r-package-send-process "vdiffr::manage_cases(%s)\n"
                              "Manage vdiffr cases for %s"))
(define-key ess-r-package-dev-map "\C-v" 'ess-r-vdiffr-manage-cases)

Technical Aspects

FreeType dependency

The software FreeType plays a key role in vdiffr. It is used to compute the extents of text boxes and thus determine the dimensions of graphical elements containing text. These dimensions are then recorded in the SVG files.

Small changes in the algorithm that FreeType uses to compute text extents produce different SVGs. For this reason, the FreeType version used to create the validated cases must match the one on the system running the tests. To avoid false failures, the visual tests are skipped when the versions differ. The patch version is not taken into account, so FreeType 2.7.1 is deemed compatible with 2.7.2 but not with 2.8.0.

In practice, this means that package contributors should only validate visual cases if their FreeType version matches that of the package maintainer. Also, the maintainer must update the version recorded in the package repository (in the file ./tests/figs/deps.txt) when FreeType has been updated on their system. Running vdiffr::validate_cases() updates the dependency file even if there are no visual cases to update.
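
The compatibility rule can be sketched in a few lines of base R. This is an illustration only, assuming versions are given as "major.minor.patch" strings. freetype_compatible() is a hypothetical helper, not the check vdiffr actually runs:

```r
# Hypothetical helper: two versions are compatible when their major and minor
# components agree; the third component is ignored.
major_minor <- function(version) {
  parts <- strsplit(version, ".", fixed = TRUE)[[1]]
  paste(parts[1:2], collapse = ".")
}

freetype_compatible <- function(recorded, installed) {
  identical(major_minor(recorded), major_minor(installed))
}

freetype_compatible("2.7.1", "2.7.2")  # TRUE: tests would run
freetype_compatible("2.7.1", "2.8.0")  # FALSE: tests would be skipped
```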

Windows platforms

Appveyor does not require any configuration since FreeType 2.6.0 is automatically installed on this platform along with gdtools. However, Fontconfig builds a cache of all system fonts the first time it is run, which can take a while. It is a good idea to add the following in a fontconfig-helper.R testthat file in order to speed up the cache building on Appveyor and on CRAN's Windows servers:

on_appveyor <- function() {
  identical(Sys.getenv("APPVEYOR"), "True")
}
on_cran <- function() {
  !identical(Sys.getenv("NOT_CRAN"), "true")
}

# Use minimal fonts.conf to speed up fc-cache
if (on_appveyor() || on_cran()) {
  # Assumption: gdtools::set_dummy_conf() installs the minimal configuration
  gdtools::set_dummy_conf()
}

Dependency notes

vdiffr currently uses svglite to save the plots in a text format that makes it easy to perform comparisons. This makes the test cases dependent on that package as the SVG translation of the plot may change across different versions of svglite (though that should not happen often). For this reason, whenever you validate a graphical test case, the tests/figs/deps.txt file is updated with a note containing the svglite version. This works the same way as the roxygen version note.

Your graphics might be dependent on other packages besides svglite. If your package is an extension to ggplot2 for instance, the appearance of your plot may change as ggplot2 evolves (as with the 2.0 version which tweaked the grayness of the background color among other changes). For this reason, expect_doppelganger() adds a dependence on ggplot2 when you supply a ggplot2 object. You can also manually add a dependency on any other package by calling vdiffr::add_dependency() anywhere in a test file.
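
To make the idea of a dependency note concrete, here is a rough sketch of recording package versions in a text file and detecting later changes. The actual layout of tests/figs/deps.txt is not shown in this document, so the "name: version" layout below is an assumption. record_dependencies() and dependencies_changed() are hypothetical helpers, not vdiffr functions:

```r
# Hypothetical helpers; the "name: version" file layout is an assumption.
current_versions <- function(packages) {
  vapply(
    packages,
    function(pkg) as.character(utils::packageVersion(pkg)),
    character(1)
  )
}

record_dependencies <- function(path, packages) {
  writeLines(paste0(packages, ": ", current_versions(packages)), path)
  invisible(path)
}

dependencies_changed <- function(path, packages) {
  recorded <- read.dcf(path)  # one record, one field per package
  packages[recorded[1, packages] != current_versions(packages)]
}

# Base packages are used so that the sketch runs on any R installation
deps_file <- tempfile(fileext = ".txt")
record_dependencies(deps_file, c("stats", "utils"))
changed <- dependencies_changed(deps_file, c("stats", "utils"))
length(changed)  # 0: nothing has changed since the note was written
```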


testthat Reporter

vdiffr extends testthat through a custom Reporter. Reporters are classes (R6 classes in recent versions of testthat) whose instances collect cases and output a summary of the tests. While reporters are usually meant to provide output for the end user, you can also use them in functions to interact with testthat.

vdiffr has a special reporter that does nothing but activate a collector for the visual test cases. collect_cases() calls devtools::test() with this reporter. When expect_doppelganger() is called, it first checks whether the case is new or failed. If so, and if vdiffr's collector is active, it calls the collector, which in turn records the current test case.
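
The interplay between the expectation and the collector can be sketched without testthat. In this simplified illustration, an environment plays the role of the collector and is only active while the tests are being collected. All names here are hypothetical and do not correspond to vdiffr internals:

```r
# Hypothetical sketch of the collector pattern, in base R only.
collector <- new.env()
collector$active <- FALSE
collector$cases <- list()

# Stands in for the part of the expectation that reports new or failed cases
record_case <- function(name, status) {
  if (collector$active) {
    collector$cases[[name]] <- status
  }
  invisible(status)
}

# Stands in for running the tests with the special reporter
with_collector <- function(code) {
  collector$active <- TRUE
  collector$cases <- list()
  on.exit(collector$active <- FALSE)
  force(code)  # the test code is evaluated while the collector is active
  collector$cases
}

record_case("ignored", "new")  # collector inactive: nothing is recorded

cases <- with_collector({
  record_case("base-graphics-histogram", "new")
  record_case("ggplot2-histogram", "failed")
})
names(cases)  # "base-graphics-histogram" "ggplot2-histogram"
```

Outside with_collector(), the same calls run as ordinary tests and leave no trace, which is how a normal test run and a collection run can share the same test files.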

This enables the user to run the tests with the usual development tools and get feedback in the form of skipped or failed cases. On the other hand, when vdiffr's tools are called, we collect information about the tests of interest and wrap them in a data structure.

SVG comparison

Comparing SVG files is convenient and should work correctly in most situations. However, SVG is not suitable for tracking really subtle changes and regressions. See vdiffr's issue #1 for a discussion on this. vdiffr may gain additional comparison backends in the future to make the tests more stringent.
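
A textual comparison of two SVG files can be sketched as follows. This is not vdiffr's implementation; it assumes that carriage returns are stripped before comparing, so that files written on Windows compare equal to files written elsewhere. read_svg() and svg_identical() are hypothetical helpers:

```r
# Hypothetical helpers for a plain-text SVG comparison.
read_svg <- function(path) {
  text <- readChar(path, nchars = file.info(path)$size, useBytes = TRUE)
  gsub("\r", "", text, fixed = TRUE)  # drop CR so CRLF and LF files match
}

svg_identical <- function(path_a, path_b) {
  identical(read_svg(path_a), read_svg(path_b))
}

unix_svg    <- tempfile(fileext = ".svg")
windows_svg <- tempfile(fileext = ".svg")
changed_svg <- tempfile(fileext = ".svg")
writeBin(charToRaw("<svg>\n<rect width='10'/>\n</svg>\n"), unix_svg)
writeBin(charToRaw("<svg>\r\n<rect width='10'/>\r\n</svg>\r\n"), windows_svg)
writeBin(charToRaw("<svg>\n<rect width='11'/>\n</svg>\n"), changed_svg)

same    <- svg_identical(unix_svg, windows_svg)  # TRUE: only line endings differ
differs <- svg_identical(unix_svg, changed_svg)  # FALSE: an attribute changed
```

Because the comparison is purely textual, any difference in the generated markup counts as a failure, while a change that leaves the markup untouched goes unnoticed.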


vdiffr 0.2.2

  • Skip tests if the system version of Cairo (actually the one gdtools was compiled with) doesn't match the version of Cairo used to generate the test cases. Cairo influences the computation of text metrics, which can cause spurious test failures.

    We plan to fix these issues once and for all by embedding gdtools, svglite, Cairo and FreeType in the vdiffr package.

vdiffr 0.2.1

This release fixes some CRAN failures.

  • Test cases of the mock package were updated to FreeType 2.8.0.

  • The unit test log file from the mock package is now preserved.

vdiffr 0.2.0

This release makes it easier to debug failures on remote systems. It also makes vdiffr more robust to failures caused by incompatible installations: instead of failing, the tests are skipped. This prevents spurious failures on CRAN.

Troubleshooting on remotes

  • expect_doppelganger() gains a verbose argument to print the SVG files for failed cases while testing. This is useful to debug failures on remotes.

  • When tests are run by R CMD check, failures are now recorded in a log file. This file will show up in the Travis log and can be retrieved from artifacts on Appveyor. It includes the SVG files for failed cases, which is useful to debug failures on remotes.

Handling of incompatible systems

The tests are now skipped if the FreeType version used to build the comparison SVGs does not match the version installed on the system where the tests are run. This is necessary because changes in new versions of FreeType might affect the computation of text extents, which then causes svglite to produce slightly different SVGs. The patch version is not taken into account, so FreeType 2.7.1 is deemed compatible with 2.7.2 but not with 2.8.0.

In practice, this means that package contributors should only validate visual cases if their FreeType version matches that of the package maintainer. Also, the maintainer must update the version recorded in the package repository (in the file ./tests/figs/deps.txt) when FreeType has been updated on their system. Running vdiffr::validate_cases() updates the dependency file even if there are no visual cases to update.

In the future, we may provide a version of vdiffr statically compiled with a specific version of FreeType to prevent these issues.

Other changes

  • The minimal R version required by vdiffr is now R 3.1.0.

vdiffr 0.1.1

  • expect_doppelganger() no longer throws an error when FreeType is too old. Instead, the test is skipped. This ensures that R CMD check passes on those platforms (e.g., CRAN's Solaris test server).

  • Depends on gdtools 0.1.2 or later as this version fixes a crash on Linux platforms.

  • widget_toggle(), widget_slide() and widget_diff() now take plots as arguments. This makes it easy to embed a vdiffr widget in R Markdown documents. The underscored versions take HTML sources as argument (paths to SVG files or inline SVGs).

vdiffr 0.1.0

  • Generated SVGs are now reproducible across platforms thanks to recent versions of svglite, gdtools, and the new package fontquiver. vdiffr now requires versions of FreeType greater than 2.6.1.

  • The figures folder is hardcoded to tests/figs/.

  • The figures are now stored in subfolders according to the current testthat context. expect_doppelganger() accepts the path argument to bypass this behaviour (set it to "" to store the figures in tests/figs/).

  • The title argument of expect_doppelganger() now serves as ggtitle() in ggplot2 figures (unless a title is already set). It is also standardised and used as filename to store the figure (spaces and non-alphanumeric characters are converted to dashes).

  • Add support for handling orphaned cases: you can now remove figures left over from deleted tests with delete_orphaned_cases() or from the Shiny app.

  • New filter argument to collect_cases() and manage_cases(). This lets you filter the test files from which to collect the cases, which is useful to speed up the collection for large codebases with a lot of unit tests.

  • Fix invalid generation of SVG files (#3).

  • Give a warning when multiple doppelgangers have the same name (#4).

  • Remove CR line endings before comparing SVG files, for compatibility with Windows.


Initial release

0.2.2 by Lionel Henry, 17 days ago

Authors: Lionel Henry [cre, aut], RStudio [cph], Carl Sutherland [aut] (jg-imagediff library), Humble Software [cph] (jg-imagediff library), David Hong [aut] (TwoFace library), jQuery Foundation [cph] (jQuery library), jQuery contributors [ctb, cph] (jQuery library; authors listed in inst/htmlwidgets/lib/jquery-authors.txt)

License: GPL-3 | file LICENSE

Imports devtools, fontquiver, gdtools, glue, grDevices, htmlwidgets, purrr, rlang, R6, Rcpp, shiny, svglite, testthat, xml2

Suggests ggplot2, roxygen2, rstudioapi, yaml

Linking to Rcpp

System requirements: FreeType >= 2.6.0

Suggested by atlantistools, cowplot, descriptr, earlyR, econullnetr, ggExtra, ggjoy, ggridges, ggstance, incidence, naniar, olsrr, projections, rcartocolor, sicegar, viridis, visdat.
