Main Package of the EMU Speech Database Management System

Provides the next iteration of the EMU Speech Database Management System (EMU-SDMS) with database management, data extraction, data preparation and data visualization facilities.


The emuR package provides the next iteration of the EMU Speech Database Management System (EMU-SDMS) with database management, data extraction, data preparation and data visualization facilities. It also contains a server that serves databases in the emuDB format (see vignette('emuDB_intro')) to the EMU-webApp (http://ips-lmu.github.io/EMU-webApp/). Annotations are queried using EMU's own query language, EQL2 (EMU Query Language Version 2).

This package is part of the next iteration of the EMU Speech Database Management System which aims to be as close to an all-in-one solution for generating, manipulating, querying, analyzing and managing speech databases as possible. For an overview of the system please visit this URL: http://ips-lmu.github.io/EMU.html.

Install the current release from CRAN:

install.packages("emuR")

Because this also installs all dependencies (including the wrassp package), it is the only step needed to install the EMU-SDMS on your system. The only other requirement of the EMU-SDMS is a modern web browser (Chrome (recommended!) / Firefox / ...), which most people already have installed.

  • for more information see the "An introduction to the emuR package" vignette:
vignette('emuR_intro')
  • to install from source instead, either download & extract the package from GitHub and install it with the following command:
install.packages("path/to/emuR", repos = NULL, type = "source")
  • or install the latest development version from GitHub (preferred method):
library(devtools)
install_github("IPS-LMU/emuR", build_vignettes = TRUE)
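
Once the package is installed, a first session might look like the sketch below. It assumes that the demo data shipped with emuR (the "ae" database written by create_emuRdemoData()) is used, and it wraps everything in a requireNamespace() guard so that the snippet does nothing when emuR is not available. The temporary directory name and the variable n_segments are illustrative choices, not part of the package.

```r
# Number of matching segments; stays NA if emuR is not installed.
n_segments <- NA_integer_

if (requireNamespace("emuR", quietly = TRUE)) {
  library(emuR)

  # Write the demo data into a fresh temporary directory (illustrative location).
  demo_dir <- tempfile("emuR_demo_")
  dir.create(demo_dir)
  create_emuRdemoData(dir = demo_dir)

  # load_emuDB() returns an emuDBhandle that the other functions take as input.
  ae <- load_emuDB(file.path(demo_dir, "emuR_demoData", "ae_emuDB"),
                   verbose = FALSE)

  # EQL2 query: all items labelled "n" on the Phonetic level.
  sl <- query(ae, "Phonetic == n")
  n_segments <- nrow(sl)
}
```

The resulting segment list can then be passed on to functions such as requery_hier() or get_trackdata(); the vignette mentioned above covers these steps in more detail.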

News

emuR 0.2.1

  • requery_hier() and requery_seq() now implement the same timeRefSegmentLevel parameter as query() (#135)
  • fixed requery_hier() bug when requerying on the same attribute definition
  • fixed requery_hier() bug when requerying on the same level but a different attribute definition
  • added new EMUwebAppConfig -> perspectives -> signalCanvases -> minMaxValLims config option to emuDB vignette

emuR 0.2.0

  • fixed bad DBconfig generation in add_perspective()
  • fixed list_linkDefinitions() returning strings as factors
  • fixed bad error message when passing in ITEM levels to autobuild_linkFromTimes()
  • fixed incorrect handling of DBconfig when writeToFS was set to FALSE (writeToFS is now called rewriteAllAnnots)
  • rewrite of query engine to not require links_ext table any more (== redundant links)
  • calcTimes parameter added to query() / requery_seq() / requery_hier() to make calculating times optional (extreme performance boost if no times have to be calculated)
  • rewrite of annotJSONcharToBundleAnnotDFs() for faster loading of emuDBs containing large annotJSONs
  • replaced tidyjson as annot.json parser with own solution, as tidyjson didn't scale well on larger annotation files
  • added verbose parameter to export_TextGridCollection()
  • improved pre-check of dir exists in export_TextGridCollection()
  • added new replace_itemLabels() function
  • improved export_TextGridCollection() doc
  • improved replace_itemLabels() speed
  • implemented rename_emuDB() (#116)
  • implemented duplicate_level() (#113)
  • implemented linkDuplicates parameter in duplicate_level()
  • autobuild_linkFromTimes() speed improvements
  • FUNCQ queries (start(), end(), medial()) now additionally support TRUE & FALSE and T & F values (vs. 0 & 1)
  • added attrDefNames column to list_levelDefinitions() output
  • can now deal with read only emuDBs by copying the cache to tempdir() and making it writable for the user
  • added start_item_seq_idx and end_item_seq_idx to emuRsegs object
  • added start_item_seq_idx and end_item_seq_idx type values to all intermediate result tables
  • added optional function to reduce hierarchical query results to left and right most children only (large performance gain on calcTimes = T)
  • rewriting annot.json files now updates MD5 sums as well (avoids unnecessary reload on next load_emuDB)
  • rewriting annot.json files now writes all (including empty / missing) attributeDef. labels
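
The idea behind keeping MD5 sums in step with rewritten annot.json files can be illustrated with a small base R sketch. It is not emuR's internal code: the helper needs_reload() and the file contents are hypothetical, and only tools::md5sum() is a real function. A file only needs to be re-read when its current checksum differs from the one stored at the last load, so updating the stored checksum after a rewrite avoids a needless reload.

```r
# Hypothetical helper (not part of emuR): TRUE when the file's current MD5
# differs from the checksum that was stored when the file was last loaded.
needs_reload <- function(path, cached_md5) {
  current <- unname(tools::md5sum(path))
  !identical(current, cached_md5)
}

# Stand-in for an annotation file; the contents are made up for illustration.
annot_file <- tempfile(fileext = ".json")
writeLines('{"levels": []}', annot_file)
cached_md5 <- unname(tools::md5sum(annot_file))

# Nothing changed since the checksum was stored.
unchanged <- needs_reload(annot_file, cached_md5)

# The file is rewritten but the stored checksum is left alone: looks stale.
writeLines('{"levels": [], "links": []}', annot_file)
stale <- needs_reload(annot_file, cached_md5)

# Updating the stored checksum together with the rewrite avoids the reload.
cached_md5 <- unname(tools::md5sum(annot_file))
refreshed <- needs_reload(annot_file, cached_md5)
```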

emuR 0.1.9

  • fixed problem in conversion to JSON with empty items array (object '{}' vs array '[]')
  • fixed problem of keywords "number" | "time" | "xmin" | ... in labels causing TextGrid parser to fail
  • fixed problem with too lax RegEx in TextGrid parser
  • fixed validation problem with missing levels regarding types
  • also allowing "time = " in TextTiers
  • "levels of type 'EVENT' are not allowed to be super levels (== parents) in a domination relationship" constraint enforced in add_linkDefinition
  • added "MEDIAFILE_SAMPLES" as constant name to access audio samples to get_trackdata() function
  • improved error message to include tgPath in create_DBconfigFromTextGrid function
  • create_emuRdemoData() no longer returns an integer value (it was implicitly returned from a wrassp function call)
  • improved the slow overlap checking function in the BPF parser (is now O(n) instead of O(n^2))
  • fixed col naming problems for new (unreleased) RSQLite version
  • added export_TextGridCollection() function
  • improved doc for get_trackdata
  • constant naming of EMU-SDMS vs EMU_SDMS in various files
  • rewriting all annotation files on add_levelDefinition and remove_levelDefinition
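
The move from an O(n^2) to an O(n) overlap check mentioned in the list above can be sketched in base R as follows. This is an illustration, not the BPF parser's actual code: both function names are hypothetical, and the linear version assumes that the segments are already sorted by start position. Under that assumption any overlapping pair implies that some pair of neighbours overlaps, so a single pass over adjacent segments is enough.

```r
# Hypothetical helper (not emuR's actual code): TRUE if any two segments overlap.
# Assumes `start` is sorted in increasing order and end[i] is the exclusive
# end position of segment i.
has_overlap_linear <- function(start, end) {
  n <- length(start)
  if (n < 2) return(FALSE)
  # Compare each segment's start with the end of its predecessor.
  any(start[-1] < end[-n])
}

# Naive O(n^2) reference that compares every pair, shown for comparison only.
has_overlap_quadratic <- function(start, end) {
  n <- length(start)
  if (n < 2) return(FALSE)
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      if (start[j] < end[i] && start[i] < end[j]) return(TRUE)
    }
  }
  FALSE
}

# Made-up sample positions: the second set has its third segment start early.
clean   <- list(start = c(0, 100, 250), end = c(100, 250, 400))
clashes <- list(start = c(0, 100, 240), end = c(100, 250, 400))

clean_result <- has_overlap_linear(clean$start, clean$end)
clash_result <- has_overlap_linear(clashes$start, clashes$end)
```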

emuR 0.1.8

  • fixed error handling of create_emuRtrackdata + added @export to roxygen doc
  • invalid annotJSONs generated by import_mediaFiles fixed
  • convert_TextGridCollection can now handle nested folders again
  • fixed invalid UUIDs in DBconfig produced by convert_BPFCollection; also added an additional unit test to detect this
  • list_bundles uses session argument again
  • fixed "Expression tree is too large (maximum depth 1000)" error in get_trackdata with long emuRsegs lists
  • get_trackdata with onTheFly calculation now reuses AsspDataObj if the current utterance is the same as the previous (large performance gain especially on long audio files)
  • checking if DBconfig exists for better error message if 'name' field is not set correctly in DBconfig
  • setting PRAGMA temp_store = 2; for SQLite connections
  • not extracting tables to R if no RegEx needed to create filtered_tmp tables (performance gain when querying large emuDBs)
  • convert_BPFCollection can now assign the same label to more than one item when unifying tiers
  • newline at the end of load_emuDB if no redundant links are built
  • queries using dominates operator '^' don't use linksExt table anymore -> large performance benefits
  • only using _filtered_tmp tables if RegEx patterns are used
  • changed primary key on items table which leads to massive performance gains (deleting _emuDBcache.sqlite required)
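
The reuse of an already computed object when consecutive segments come from the same utterance, as described for get_trackdata above, follows a simple "remember the last result" pattern. The base R sketch below shows that pattern only; it does not reproduce emuR's implementation, and make_last_value_cache(), slow_analysis() and the file names are hypothetical stand-ins.

```r
# Hypothetical one-slot cache: recompute only when the key (for example the
# path of a media file) differs from the key of the previous call.
make_last_value_cache <- function(compute) {
  last_key <- NULL
  last_value <- NULL
  function(key) {
    if (is.null(last_key) || !identical(key, last_key)) {
      last_value <<- compute(key)
      last_key <<- key
    }
    last_value
  }
}

# Stand-in for an expensive signal analysis; counts how often it really runs.
n_computations <- 0
slow_analysis <- function(path) {
  n_computations <<- n_computations + 1
  paste("analysis of", path)
}

cached_analysis <- make_last_value_cache(slow_analysis)

# Segments sorted by file: repeated keys hit the cache instead of recomputing.
keys <- c("a.wav", "a.wav", "a.wav", "b.wav", "b.wav")
results <- vapply(keys, cached_analysis, character(1), USE.NAMES = FALSE)
```

A one-slot cache like this only pays off when the input is ordered so that items sharing a key are adjacent, which is typically the case for segment lists ordered by bundle.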

emuR 0.1.7

  • R depends version bump to 3.2.0 (as requested by CRAN maintainer)
  • updated testthat::expect_less_than to expect_lt calls (due to deprecation warnings)
  • Using new .keep_all = T parameter of dplyr
  • removed legacy version of EQL vignette (overlooked as inst/doc was in .gitignore)

emuR 0.1.6

  • skipping in-depth tests on CRAN for query and autobuild SQL functions

emuR 0.1.5

  • fixed problem of interm_res_tables already being present with queries that have multiple levels of recursion on both sides of either the -> or the ^ operator (e.g. query(ae, "[[[Phonetic = n -> Phonetic =z] -> Phonetic = S ] ^ [Text = friends -> Text = she]]"))
  • fixed bad URL in README.md
  • added CITATION file

emuR 0.1.3.9000

  • renamed SQL tables & columns from camel case to underscore notation
  • variable SQL backend implementation

emuR 0.1.2.9000

  • multiple check fixes on various platforms

emuR 0.1.1.9000

  • fixed serve problem caused by internalVars bug
  • fixed file locking problem that caused vignettes to fail under Windows

emuR 0.1.0.9000

  • massive refactor of all functions that used to refer to an emuDB by name and optionally by its UUID. They now use the new emuDBhandle object that is now returned by the load_emuDB() function.
  • convert_XXX_to_emuDB() functions renamed to convert_XXX()

0.2.3 by Raphael Winkelmann, 3 months ago


https://github.com/IPS-LMU/emuR


Report a bug at https://github.com/IPS-LMU/emuR/issues


Browse source code at https://github.com/cran/emuR


Authors: Raphael Winkelmann [aut, cre], Klaus Jaensch [aut, ctb], Steve Cassidy [aut, ctb], Jonathan Harrington [aut, ctb]


GPL (>= 2) license


Imports MASS, tools, utils, stats, methods, graphics, grDevices, stringr, uuid, RCurl, base64enc, wrassp, jsonlite, RSQLite, DBI, httpuv, data.table, dplyr

Suggests ggplot2, testthat, knitr, compare, rmarkdown

