Make Fake Data

Make fake data, supporting addresses, person names, dates, times, colors, coordinates, currencies, digital object identifiers ('DOIs'), jobs, phone numbers, 'DNA' sequences, doubles and integers from distributions and within a range.


Project Status: Active. The project has reached a stable, usable state and is being actively developed.

charlatan makes fake data. It is inspired by, and borrows some code from, Python's faker.

Make fake data for:

  • person names
  • jobs
  • phone numbers
  • colors: names, hex, rgb
  • credit cards
  • DOIs
  • numbers in range and from distributions
  • gene sequences
  • geographic coordinates
  • emails
  • URIs, URLs, and their parts
  • IP addresses
  • more coming ...

Possible use cases for charlatan:

  • Students in a classroom setting learning any task that needs a dataset.
  • People doing simulations/modeling who need some fake data
  • Generate a fake dataset of users for a database before actual users exist
  • Complete missing spots in a dataset
  • Generate fake data to replace sensitive real data before public release
  • Create a random set of colors for visualization
  • Generate random coordinates for a map
  • Get a set of randomly generated DOIs (Digital Object Identifiers) to assign to fake scholarly artifacts
  • Generate fake taxonomic names for a biological dataset
  • Get a set of fake sequences to use to test code/software that uses sequence data

Reasons to use charlatan:

  • Lightweight, few dependencies
  • Relatively comprehensive types of data, and more being added
  • Comprehensive set of languages supported, more being added
  • Useful R features such as creating entire fake data.frames

CRAN version

install.packages("charlatan")

dev version

devtools::install_github("ropensci/charlatan")
library("charlatan")

high level function

... for all fake data operations

x <- fraudster()
x$job()
#> [1] "Engineer, communications"
x$name()
#> [1] "Ms. Fleeta Bashirian"
x$color_name()
#> [1] "Aquamarine"
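
As a minimal sketch, the three methods shown above can be called repeatedly on one fraudster() object to fill a small data.frame. Nothing beyond job(), name() and color_name() is used; the column names and the row count n are arbitrary choices for illustration, and each call is assumed to return a single string, as in the output above.

```r
library(charlatan)

x <- fraudster()
n <- 5  # arbitrary number of rows for this illustration

# Call each method n times; each call is assumed to return one string.
people <- data.frame(
  name  = vapply(seq_len(n), function(i) x$name(), character(1)),
  job   = vapply(seq_len(n), function(i) x$job(), character(1)),
  color = vapply(seq_len(n), function(i) x$color_name(), character(1)),
  stringsAsFactors = FALSE
)

people
```

The values differ on every run, so no expected output is shown.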

locale support

More locales are being added over time, e.g.,

Locale support for job data

ch_job(locale = "en_US", n = 3)
#> [1] "Investment banker, operational" "Psychologist, forensic"        
#> [3] "Magazine features editor"
ch_job(locale = "fr_FR", n = 3)
#> [1] "Diététicien"                       
#> [2] "Auteur interprète"                 
#> [3] "Ingénieur maintenance aéronautique"
ch_job(locale = "hr_HR", n = 3)
#> [1] "Voditelj skele u nacionalnoj plovidbi"
#> [2] "Čuvar prirode"                        
#> [3] "Viši arhivist"
ch_job(locale = "uk_UA", n = 3)
#> [1] "Режисер"   "Математик" "Ріелтор"
ch_job(locale = "zh_TW", n = 3)
#> [1] "多媒體開發主管"   "公共衛生醫師"     "產品企劃開發人員"

For colors:

ch_color_name(locale = "en_US", n = 3)
#> [1] "MediumSlateBlue" "Chocolate"       "HotPink"
ch_color_name(locale = "uk_UA", n = 3)
#> [1] "Морквяний"          "Яскраво-фіолетовий" "Брунато-малиновий"
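
To compare locales side by side, one option is to loop over a vector of locale codes. The sketch below uses only the ch_job() call and the locale codes that appear above; the variable names are illustrative.

```r
library(charlatan)

# Locale codes taken from the job examples above.
locales <- c("en_US", "fr_FR", "hr_HR", "uk_UA", "zh_TW")

# One fake job title per locale; the result is a character vector
# named by locale code.
jobs <- vapply(locales, function(loc) ch_job(locale = loc, n = 1), character(1))

jobs
```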

More coming soon ...

generate a dataset

ch_generate()
#> # A tibble: 10 x 3
#>    name                    job                        phone_number        
#>    <chr>                   <chr>                      <chr>               
#>  1 Hervey Luettgen         Consulting civil engineer  (135)742-8104x9887  
#>  2 Mr. Deontae Herzog      Further education lecturer 426.369.0824        
#>  3 Vicki Denesik           Solicitor, Scotland        1-535-887-8338x39579
#>  4 Mrs. Elvera Heidenreich Secretary/administrator    577.988.6970x0455   
#>  5 Bambi Sanford           Equities trader            1-861-301-3087x38656
#>  6 Garrison Jones          Field seismologist         824-865-3964        
#>  7 Alia Grant              Occupational hygienist     08896842450         
#>  8 Kyree Koss              Equities trader            (086)781-0334       
#>  9 Bama Christiansen DDS   Forensic scientist         582-048-8116        
#> 10 Alfie Koepp             Police officer             (645)984-3611x1223
ch_generate('job', 'phone_number', n = 30)
#> # A tibble: 30 x 2
#>    job                         phone_number       
#>    <chr>                       <chr>              
#>  1 Youth worker                615-108-9165       
#>  2 Charity officer             1-917-206-3061x001 
#>  3 Museum/gallery curator      08799787859        
#>  4 Social researcher           (436)081-1417x20183
#>  5 Ship broker                 840-520-7103       
#>  6 Electrical engineer         1-166-486-7102     
#>  7 Games developer             240-503-6455x54793 
#>  8 Multimedia programmer       +78(5)8476399438   
#>  9 Engineer, maintenance (IT)  03183289534        
#> 10 Conservator, museum/gallery +32(0)2448780352   
#> # ... with 20 more rows
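
The result is an ordinary tibble, so it can be extended like any other data frame. The sketch below assumes that "name" is accepted as a column selector in the same way as 'job' and 'phone_number' above (it appears as a default column in the first example); the id column is a hypothetical addition for illustration.

```r
library(charlatan)

# Ask for two of the column types shown in the default output.
staff <- ch_generate("name", "job", n = 5)

# Add a hypothetical sequential identifier alongside the fake columns.
staff$id <- seq_len(nrow(staff))

staff
```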

person name

ch_name()
#> [1] "Pauline Renner"
ch_name(10)
#>  [1] "Jon Anderson PhD"        "Dirk Hagenes"           
#>  [3] "Iola Hills"              "Ms. Merna Kilback PhD"  
#>  [5] "May Hermann"             "Dr. Zavier Kassulke III"
#>  [7] "Mr. Yancy Stiedemann"    "Ms. Melina Dach"        
#>  [9] "Ms. Janine Kunde"        "Lovett Greenfelder"

phone number

ch_phone_number()
#> [1] "799.053.8298x03215"
ch_phone_number(10)
#>  [1] "1-795-047-3421"       "01683058041"          "(199)253-2025"       
#>  [4] "442-608-7772x39728"   "006.994.5557"         "566.052.5676x69403"  
#>  [7] "+93(5)4894763387"     "1-067-264-4141x90001" "1-891-855-7961"      
#> [10] "334.562.6526"
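
One of the use cases listed earlier is replacing sensitive values before a public release. A minimal sketch of that idea, using ch_name() and ch_phone_number() as shown above: the patients table and its column names are hypothetical placeholders, not part of charlatan.

```r
library(charlatan)

# Hypothetical records standing in for a real, sensitive table.
patients <- data.frame(
  name   = c("Real Person A", "Real Person B", "Real Person C"),
  phone  = c("000-0001", "000-0002", "000-0003"),
  visits = c(2L, 5L, 1L),
  stringsAsFactors = FALSE
)

# Overwrite the identifying columns with fake values, one per row,
# and leave the non-identifying column untouched.
masked <- patients
masked$name  <- ch_name(nrow(patients))
masked$phone <- ch_phone_number(nrow(patients))

masked
```

This only swaps out the direct identifiers. Whether the remaining columns are safe to publish is a separate question that fake data alone does not answer.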

job

ch_job()
#> [1] "Careers adviser"
ch_job(10)
#>  [1] "Midwife"                                
#>  [2] "Solicitor"                              
#>  [3] "Engineer, maintenance (IT)"             
#>  [4] "Lecturer, higher education"             
#>  [5] "Scientist, research (physical sciences)"
#>  [6] "Hydrogeologist"                         
#>  [7] "Editor, commissioning"                  
#>  [8] "Research officer, political party"      
#>  [9] "Estate manager/land agent"              
#> [10] "Brewing technologist"

credit cards

ch_credit_card_provider()
#> [1] "JCB 15 digit"
ch_credit_card_provider(n = 4)
#> [1] "JCB 15 digit" "Voyager"      "Discover"     "JCB 16 digit"
ch_credit_card_number()
#> [1] "3766356690332082"
ch_credit_card_number(n = 10)
#>  [1] "55520251460378052"   "561252701122526"     "3337396755300677029"
#>  [4] "4321262613647302"    "4500978817193750"    "869995174334936019" 
#>  [7] "869976711099048283"  "54335568117639643"   "210023110046558888" 
#> [10] "6011082845864142715"
ch_credit_card_security_code()
#> [1] "546"
ch_credit_card_security_code(10)
#>  [1] "772"  "273"  "192"  "316"  "769"  "536"  "041"  "3830" "133"  "991"
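
Code that consumes card numbers often applies the Luhn checksum. The helper below is a base R sketch for testing that kind of code against generated numbers. luhn_valid() is a hypothetical function written for this example, not part of charlatan, and nothing here implies that numbers from ch_credit_card_number() pass the check.

```r
# Hypothetical helper (not part of charlatan): Luhn checksum in base R.
luhn_valid <- function(number) {
  # Accept only non-empty strings made up entirely of digits.
  if (!grepl("^[0-9]+$", number)) return(FALSE)
  digits <- rev(as.integer(strsplit(number, "")[[1]]))
  # Counting from the rightmost digit, double every second digit.
  second <- seq_along(digits) %% 2 == 0
  digits[second] <- digits[second] * 2
  # A doubled value above 9 is replaced by the sum of its digits.
  digits[digits > 9] <- digits[digits > 9] - 9
  sum(digits) %% 10 == 0
}

luhn_valid("79927398713")  # a commonly cited valid Luhn example
luhn_valid("79927398710")  # same number with the check digit changed
```

With charlatan loaded, the same helper could be applied to generated numbers via vapply(ch_credit_card_number(n = 10), luhn_valid, logical(1)).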

Meta

  • Please report any issues or bugs.
  • License: MIT
  • Get citation information for charlatan in R by running citation(package = 'charlatan')
  • Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

News

charlatan 0.3.0

NEW FEATURES

  • ch_job() and JobsProvider gains da_DK locale support (#94) from @MartinMSPedersen

MINOR IMPROVEMENTS

  • fixes for PersonProvider for locale fr_FR: fix accents; avoid awkward French names; now can do double first names; removed some duplicate names (#35) (#83) from @kylevoyto
  • remove leading and trailing whitespace in JobsProvider and PersonProvider where found; and remove some blank suffixes for fa_IR PersonProvider (#88) (#91) from @kylevoyto
  • standardization of locale names to always be xx_XX where first two letters are lowercase and second two are uppercase (#90) from @kylevoyto
  • change locale for Danish/Denmark from dk_DK to da_DK, since da is the ISO 639-1 language code for Danish (#93) from @MartinMSPedersen
  • fix Danish phone number formats to match phone numbers actually used there (#93) from @MartinMSPedersen
  • remove duplicates and sort names across PersonProvider for various locales (#96) from @MartinMSPedersen
  • mention similar packages (#72)

charlatan 0.2.2

BUG FIXES

  • run examples conditionally if packages installed for packages in Suggests: iptools and stringi (#82)

charlatan 0.2.0

NEW FEATURES

  • new package author: https://github.com/kylevoyto
  • gains ElementsProvider and associated methods ch_element_element() and ch_element_symbol() for getting element names and symbols (#55)
  • gains InternetProvider with many methods, including for domain names, urls (and their parts), emails, tld's, etc. (#66)
  • gains MiscProvider with methods for getting locale names and locale codes (#69)
  • gains UserAgentProvider for user agent strings (#57)
  • gains FileProvider with methods for mime type, file extension, file names and paths (#59)
  • gains LoremProvider with methods for words, sentences and paragraphs (#58)
  • JobProvider gains Finnish locale (#79)

MINOR IMPROVEMENTS

  • mention usage in the wild in README (#54)
  • change behavior when a locale doesn't have a data type from erroring to a zero length string (#64)
  • switch to markdown docs (#68)
  • fix PersonProvider for locale en_GB - we were ignoring probabilities of different names (#63) (#75)
  • fix ColorProvider: generate only the 216 colors in safe web colors (https://en.wikipedia.org/wiki/Web_colors#Web-safe_colors) - and fix method for generating hex colors (#18) (#42) (#76)
  • fix to have safe_color_name within ColorProvider be sensitive to locale (#17) (#77)
  • packages stringi and iptools moved from Imports to Suggests - not required for package use now unless a few specific methods used (#71)
  • AddressProvider gains methods street_name, street_address, postcode, and address. In addition, various fixes to AddressProvider (#62) (#80)

charlatan 0.1.0

NEW FEATURES

  • Released to CRAN.

0.3.0 by Scott Chamberlain, 2 months ago


https://github.com/ropensci/charlatan


Report a bug at https://github.com/ropensci/charlatan/issues


Browse source code at https://github.com/cran/charlatan


Authors: Scott Chamberlain [aut, cre], Kyle Voytovich [aut], Martin Pedersen [ctb], Brooke Anderson [rev] (Brooke Anderson reviewed the package for rOpenSci, see https://github.com/ropensci/onboarding/issues/94), Tristan Mahr [rev] (Tristan Mahr reviewed the package for rOpenSci, see https://github.com/ropensci/onboarding/issues/94)



MIT + file LICENSE license


Imports R6, tibble, whisker

Suggests roxygen2, testthat, knitr, rmarkdown, iptools, stringi


Suggested by salty.

