Data Validation Infrastructure

Declare data validation rules and data quality indicators; confront data with them and analyze or visualize the results. The package supports rules that are per-field, in-record, cross-record or cross-dataset. Rules can be automatically analyzed for rule type and connectivity.
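
As a rough illustration of the declare-and-confront workflow described above, the following minimal sketch defines a few rules and checks a small data set against them. The data frame and its column names (age, income) are hypothetical and chosen only for this example; validator, confront and summary are the package's exported functions.

```r
library(validate)

# Hypothetical toy data, invented for this illustration
dat <- data.frame(
  age    = c(25, -3, 40),
  income = c(3000, 1500, NA)
)

# Declare rules: two per-field checks and one cross-record check
rules <- validator(
  age >= 0,
  income > 0,
  mean(age) < 100
)

# Confront the data with the rules and summarise the outcome per rule
cf <- confront(dat, rules)
s  <- summary(cf)
print(s)
```

The summary has one row per rule, with counts of passes, fails and missing results, so a negative age shows up as a fail and a missing income as an NA rather than as a pass.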


News

version 0.2.0

  • new object of class 'indicator'
  • validation rules now support if-then-else syntax.
  • validation rules can now contain embedded if-statements, e.g. A | (if (B) C)
  • expressionset objects (validator, indicator) can be combined using '+'
  • results of getter functions (e.g. 'labels') are now named.
  • less verbose printing of options for expressionset objects (validator, indicator)
  • expressionset objects (validator, indicator) can now be coerced to data.frame with as.data.frame.
  • confrontation objects (validation, indication) can now be coerced to data.frame.
  • safer expression manipulation under the hood.
  • breaking: 'rule' column now called 'name' in summary of 'confrontation' objects.
  • bugfix: character indexing with multiple elements of expressionset objects would crash.
  • bugfix: proper recycling for replace methods in expressionset.
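
A short sketch of two of the 0.2.0 features listed above, combining rule sets with '+' and coercing to data.frame. The rule names (pos_x, pos_y) and the data are hypothetical; rules are named explicitly here so that the combined set has no duplicate names.

```r
library(validate)

# Two separately declared rule sets with explicit (hypothetical) rule names
v1 <- validator(pos_x = x > 0)
v2 <- validator(pos_y = y > 0)

# Combine expressionset objects with '+'
v <- v1 + v2
print(names(v))

# Both the rule set and the confrontation result can be coerced to data.frame
rules_df <- as.data.frame(v)
cf       <- confront(data.frame(x = c(1, -1), y = c(2, 3)), v)
result_df <- as.data.frame(cf)
print(result_df)
```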

version 0.1.8

  • Added loggers for the 'lumberjack' package: lbj_cells and lbj_rules
  • Tolerance for checking (non-strict) linear inequalities is now controlled by voptions(lin.ineq.opts).
  • Macro assignments through := are no longer put in brackets.

version 0.1.7

  • The missingness counters are now only internally documented and will be deprecated (the introduction of the '.' made them more or less obsolete).
  • Package now 'depends' on methods to allow dispatch on objects inheriting from data.frame
  • bugfix: The assignment created() <- was broken. #65, thanks to Andrew R Gibson.
  • bugfix: Simple equality checks on character data behaved unexpectedly. #67, thanks to Kevin Kuo.
  • Native routines now registered as required by updated CRAN policy.

version 0.1.5

  • The '.' is now used to reference the validated data set as whole.
  • Small change in output of 'compare' to match the table in van den Broek et al. (2013)
  • registered native routines as now recommended by CRAN
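
To illustrate the '.' reference introduced in this version, here is a minimal sketch of rules that inspect the data set as a whole rather than individual fields. The data frame is hypothetical.

```r
library(validate)

# Hypothetical data set, invented for this illustration
dat <- data.frame(id = 1:3, x = c(1, 2, 3))

# '.' stands for the whole data set being validated
rules <- validator(
  nrow(.) >= 3,          # the data set has at least three records
  "x" %in% names(.)      # a column named x is present
)

cf <- confront(dat, rules)
print(summary(cf))
```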

version 0.1.4

  • 'confront' now emits a warning when a variable name conflicts with the name of a reference data set
  • Deprecated 'validate_reset', in favour of the shorter 'reset' (use 'validate::reset' in case of ambiguity)
  • Deprecated 'validate_options' in favour of the shorter 'voptions'
  • New option na.value with default value NA, controlling the output when a rule evaluates to NA.
  • Added rules from the ESSnet on validation (deliverable 17) to automated tests.
  • added 'grepl' to allowed validation syntax (suggested by Dusan Sovic)
  • exported a few functions with keyword 'internal' for extensibility
  • Bugfix: blocks sometimes reported the wrong number of blocks (in case of a single connected block.)
  • Bugfix: macro expansion failed when macros were reused in other macros.
  • Bugfix: certain nonlinear relations were recognized as linear
  • Bugfix: rules that use (anonymous) function definitions raised error when printed.

version 0.1.3

  • initial release

install.packages("validate")

0.2.4 by Mark van der Loo, 2 months ago


https://github.com/data-cleaning/validate


Report a bug at https://github.com/data-cleaning/validate/issues


Browse source code at https://github.com/cran/validate


Authors: Mark van der Loo [cre, aut], Edwin de Jonge [aut], Paul Hsieh [ctb]


Documentation:   PDF Manual  


Task views: Official Statistics & Survey Methodology


GPL-3 license


Imports graphics, settings, yaml

Depends on methods

Suggests testthat, knitr, rmarkdown

Enhances lumberjack


Imported by dcmodify, deductive, rspa.

Depended on by errorlocate, validatetools.

