Statistical Disclosure Control Methods for Anonymization of Microdata and Risk Estimation

Data from statistical agencies and other institutions are mostly confidential. This package can be used to generate anonymized (micro)data, i.e. to create public- and scientific-use files. In addition, various risk estimation methods are included. Note that the package includes a graphical user interface that provides access to many of its methods.



sdcMicro is an R package for anonymizing microdata. Most of the package's functionality is also available via an interactive shiny-based graphical user interface.
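
As a rough illustration of the workflow described above, the following sketch builds an sdcMicro object from the testdata example dataset shipped with the package, applies local suppression to reach 3-anonymity, and extracts the anonymized data. The choice of key variables, numerical variables, and k = 3 is an assumption made purely for illustration, not a recommendation for real data.

```r
library(sdcMicro)

# Example dataset shipped with sdcMicro
data(testdata)

# Define the disclosure scenario: categorical key variables (quasi-identifiers),
# continuous variables, and the sampling weight.
# This particular selection is illustrative only.
sdc <- createSdcObj(
  dat       = testdata,
  keyVars   = c("urbrur", "roof", "walls", "water", "electcon", "relat", "sex"),
  numVars   = c("expend", "income", "savings"),
  weightVar = "sampling_weight"
)

# Suppress values in the key variables until 3-anonymity is reached
sdc <- localSuppression(sdc, k = 3)

# Print a summary of the current state of the object
print(sdc)

# Extract the anonymized microdata as a data.frame
anonymized <- extractManipData(sdc)
```

The same steps can be carried out interactively in the graphical user interface, which is started with sdcApp().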

News

5.3.0

  • Add versions for Stata export
  • add parameter shiny.server to sdcApp (to make it easily possible to run the app on a shiny server)
  • fixes due to new data.table version
  • improvements in sdcApp()

5.2.0

  • improvements in sdcApp()
  • updating dependencies due to new version of package car
  • bug in IL1 resolved; the methods IL1 and IL1s are now distinguished
  • also a new gh-page (http://sdctools.github.io/sdcMicro/) was created

5.1.0

  • bugfix in sdcApp() when using R-objects as data input
  • bugfix in sdcApp() when button to perform kAnon() was "lost"
  • bugfix in sdcApp() when >= 10 keyvars were used in localSuppression
  • bugfix in sdcApp(): sort table of risky observations correctly
  • support shiny server for the GUI
  • new method kAnon_violations() returning the number of records violating k-anonymity in the sample or the population
  • fixes and improvements in parametrisation and error-handling in riskyCells()
  • minor fixes in sdc_guidelines vignette, including a comment that the guidelines have not yet been revised for sdcMicro version >= 5.0.0
  • pass (...) in writeSafeFile(..., format="csv")
  • fixes and improvements in localSuppression()

5.0.4

  • new default theme "IHSN" for sdcApp()
  • fixing an issue in report() where disclosure risk for original data was wrongly displayed if alpha-parameter was set
  • allow passing through of arguments in sdcApp()
  • add functions argus_rankswap() and argus_microaggregation() that use c++-code directly from mu-argus
  • bugfix in dUtility()
  • new function riskyCells() that computes "unsafe cells" as in mu-argus
  • several code-optimizations and cleanup

5.0.3

  • improvement: show name of uploaded file in report when using sdcApp() (fixes #209)
  • correct summary statistics in GUI in case not all variables have been changed
  • fixes for file imports of datasets containing labels (e.g. Stata files)
  • allow changing the computation of suda2-scores by adding a parameter to suda2()
  • use some functions (gowerD,..) from VIM
  • bugfix for special case of only one dim-variable in freqCalc()
  • bugfix for edge-cases in localSuppression()/kAnon()
  • update references and improve documentation

5.0.2

  • consistency improvements
  • code cleanup
  • fixes for non-UTF-8 encoded metadata when using file import in the graphical user interface
  • do not allow missing values in weight-variable
  • various small bugfixes and improvements

5.0.1

  • This release includes some small improvements in the graphical user interface and preparations for the new major R version.

5.0.0

  • new argument 'excludeVars' in createSdcObj()
  • shiny-based GUI directly included in the package, can be started with sdcApp()
  • added vignette for sdcApp()
  • rewrite of function 'freqCalc()'
  • many improvements and bugfixes

4.1.7

  • IHSN SDC guidelines as vignette
  • cat. key variables returned as factors in extractManipData

4.1.6

  • show method for sdcMicroObj

4.1.5

  • pram bug fix

4.1.4

  • bug fix mafasts

4.1.1

  • only small bug fixes

4.1.0

  • new IHSN SDC Guidelines included
  • new implementation of freqCalc. Computation time is now linear in the size of the data, which yields a large speed-up for large data sets.
  • localSuppression, measure_risk and createSdcObj make use of the new implementation of freqCalc
  • C++-level glpk and R-level Rglpk removed for better compatibility with Mac
  • configure,cleanup removed and Makevars and Makevars.win rewritten according to Rcpp documentation
  • function microaggrGower added: microaggregation for numerical and categorical variables based on Gower distance
  • completely new report facility (knitr and brew instead of R2HTML), new class 'reportObj' which stores all info for reporting and is generated by calcReportData
  • new slot in class sdcObj for manipPramVars


install.packages("sdcMicro")

5.3.0 by Matthias Templ, 9 months ago


https://github.com/sdcTools/sdcMicro


Browse source code at https://github.com/cran/sdcMicro


Authors: Matthias Templ [aut, cre], Bernhard Meindl [aut], Alexander Kowarik [aut]


Documentation:   PDF Manual  


Task views: Official Statistics & Survey Methodology


GPL-2 license


Imports utils, stats, graphics, car, carData, rmarkdown, knitr, data.table, xtable, robustbase, cluster, MASS, e1071, tools, Rcpp, methods, sets, ggplot2, shiny, haven, rhandsontable, DT, shinyBS, prettydoc, VIM

Suggests laeken, testthat

Linking to Rcpp

