Main Package of the EMU Speech Database Management System

Provides the next iteration of the EMU Speech Database Management System (EMU-SDMS) with database management, data extraction, data preparation and data visualization facilities.


The emuR package provides the next iteration of the EMU Speech Database Management System (EMU-SDMS) with database management, data extraction, data preparation and data visualization facilities. It also contains a server that is intended to serve databases in the emuDB format (see vignette('emuDB_intro')) to the EMU-webApp. Annotations are queried using EMU's own EQL2 (EMU Query Language Version 2).

This package is part of the next iteration of the EMU Speech Database Management System which aims to be as close to an all-in-one solution for generating, manipulating, querying, analyzing and managing speech databases as possible. For an overview of the system please visit this URL:



As this also installs all of the dependencies (incl. the wrassp package) this is the only installation step necessary to install the EMU-SDMS on your system. The only other requirement of the EMU-SDMS is a modern web browser (Chrome (recommended!) / Firefox / ...) which most people should already have on their systems.

Quick start

  • for more information, see the "An introduction to the emuR package" vignette
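As a rough orientation, a first session might look like the sketch below. It assumes emuR is installed and uses the ae demo database that ships with the package; the wrapper function quick_start() and the fresh temporary directory are illustrative choices, not part of the package. The calls create_emuRdemoData(), load_emuDB() and query() are the ones referred to in the changelog further down.

```r
# Minimal quick-start sketch. quick_start() is a hypothetical wrapper that
# skips the demo when emuR is not available on this machine.
quick_start <- function() {
  if (!requireNamespace("emuR", quietly = TRUE)) {
    message("emuR is not installed; skipping the demo")
    return(NULL)
  }
  # Use a fresh directory so that repeated runs do not collide
  demo_dir <- tempfile("emuR_quickstart_")
  dir.create(demo_dir)

  # Write the demo data, including the ae emuDB, to demo_dir
  emuR::create_emuRdemoData(dir = demo_dir)

  # load_emuDB() returns an emuDBhandle that the other functions take
  ae <- emuR::load_emuDB(file.path(demo_dir, "emuR_demoData", "ae_emuDB"),
                         verbose = FALSE)

  # EQL2 query: all items labelled "n" on the Phonetic level
  emuR::query(ae, "Phonetic == n")
  # emuR::serve(ae) would then hand the database to the EMU-webApp;
  # it is left out here because it blocks the R session.
}

segs <- quick_start()
```

The returned segment list can then be passed on to functions such as get_trackdata().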

For Developers / Beta-Testers

Installation (two alternative methods)

  • either download & extract the package from GitHub. Then install it with the following command:
install.packages("path/to/emuR", repos = NULL, type="source")
  • or install the latest development version from GitHub (preferred method):
devtools::install_github("IPS-LMU/emuR", build_vignettes = TRUE)


emuR 0.2.3

new features / performance tweaks / improvements

  • tweaked runBASwebservice_maus(); improved performance for presegmented bundles
  • performance bump for fapply() by preallocating result matrix
  • performance bump for trapply() by preallocating result matrix
  • performance bump for mel.spectral() by preallocating result matrix
  • performance bump for bark.spectral() by preallocating result matrix
  • updated DBI calls to comply with the latest best practices (using DBI::dbExecute() instead of DBI::dbGetQuery() for non-SELECT queries)
  • BPF collection exporter documented and now public
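Several entries above mention preallocating the result matrix. The sketch below illustrates the general technique in base R and is not the code used inside fapply() or trapply(); the helper names and the per-row function row_summary() are hypothetical. It assumes the input matrix has at least one row.

```r
# Hypothetical per-row function standing in for the user function
row_summary <- function(x) c(min = min(x), max = max(x))

# Preallocating: size the result once, then fill it row by row
apply_rows_prealloc <- function(m, fun) {
  first <- fun(m[1, ])                       # probe the output width
  out <- matrix(NA_real_, nrow = nrow(m), ncol = length(first))
  out[1, ] <- first
  for (i in seq_len(nrow(m))[-1]) {
    out[i, ] <- fun(m[i, ])
  }
  out
}

# Growing: rbind() copies the whole result on every iteration
apply_rows_growing <- function(m, fun) {
  out <- NULL
  for (i in seq_len(nrow(m))) {
    out <- rbind(out, fun(m[i, ]))
  }
  unname(out)
}

set.seed(1)
m <- matrix(rnorm(2000), nrow = 500)
res_pre  <- apply_rows_prealloc(m, row_summary)
res_grow <- apply_rows_growing(m, row_summary)
```

Both variants return the same values; the preallocated one avoids the repeated copying, which matters more as the number of rows grows.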

bug fixes

  • export_TextGridCollection() now handles partial includes of bundle and session names correctly (issue #147)
  • added missing check if anagestConfig is defined to rename_attributeDefinition()
  • setting useBytes to TRUE to avoid re-encoding under Windows
  • fixed bug in add_ssffTrackDefinition() that was trying to access fp which was renamed in a refactor to filesDf
  • fixed export to autodetect S3 methods (cbind & rbind for trackdata)
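The useBytes fix above can be illustrated with base R. This is a sketch of the general behaviour of writeLines(), not the package's own code; the label and file name are made up. With useBytes = TRUE a UTF-8 string is written byte for byte instead of being re-encoded to the native encoding of the session.

```r
# A label containing a non-ASCII character; \u escapes yield UTF-8 strings
label <- "S\u00e4tze"

path <- tempfile(fileext = ".txt")
con <- file(path, open = "wb")             # binary mode: "\n" on every platform
writeLines(label, con, useBytes = TRUE)    # pass the bytes through unchanged
close(con)

# Compare what ended up on disk with the UTF-8 bytes of the label
bytes_on_disk <- readBin(path, what = "raw", n = file.size(path))
expected <- c(charToRaw(label), as.raw(0x0a))
```

Here bytes_on_disk and expected are identical, regardless of the locale the script runs in.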

emuR 0.2.2

new features / performance tweaks / improvements

  • some changes to the parameter names in the BAS webservice functions
  • convert_txtCollection and convert_BPFCollection now name topmost item "bundle"
  • added functions to set and get level descriptions in DBconfig
  • BAS webservice functions now perform a cache update prior to departure
  • added multiple perspectives to ae demo database
  • choosing explicit paths with intersecting hierarchies now possible
  • remove levelDef & linkDef now implement force parameters
  • new function convert_txtCollection converts plain text collections into single-node emuDB
  • new functions runBASwebservice_* that call various BAS webservices from inside emuR
  • NULLing out empty DFs on list_level/linkDefs for more consistent API
  • newLinkDefType argument implemented in autobuild_linkFromTimes() to generate linkDefinition if so desired
  • automatically removing superlevel from levelCanvasOrder if convertSuperlevel is set to TRUE in autobuild_linkFromTimes()

bug fixes

  • wrapped readChars in enc2utf8 to avoid encoding issues on Windows
  • updating label table correctly on add_attributeDefinition() (#138)
  • runBASwebservice_maus / minni / all now no longer ignore unlinked items (idx -1) but treat them as linkless segments
  • commented out cat() in train() function to be less verbose
  • BAS webservice calls now get their own temp directories (UUID based). This avoids race conditions when several scripts are running in parallel.
  • convert_txtCollection now treats perspectives as array (as it should)

emuR 0.2.1

new features / performance tweaks / improvements

  • added new EMUwebAppConfig -> perspectives -> signalCanvases -> minMaxValLims config option to emuDB vignette
  • requery_hier + requery_seq now implement the same timeRefSegmentLevel parameter as query (#135)

bug fixes

  • fixed requery_hier() bug of requery on same attribute definition
  • fixed requery_hier() bug of requery on same level but different attribute definition

emuR 0.2.0

new features / performance tweaks / improvements

  • rewrite of query engine to not require links_ext table any more (== redundant links)
  • calcTimes parameter added to query() / requery_seq() / requery_hier() to make calculating times optional (extreme performance boost if no times have to be calculated)
  • rewrite of annotJSONcharToBundleAnnotDFs() for faster loading of emuDBs containing large annotJSONs
  • replaced tidyjson as annot.json parser with own solution as tidyjson didn't scale well on larger annotation files
  • added verbose parameter to export_TextGridCollection()
  • improved pre-check of dir exists in export_TextGridCollection()
  • added new replace_itemLabels function
  • improved export_TextGridCollection() doc
  • improved replace_itemLabels() speed
  • implemented rename_emuDB() (#116)
  • implemented duplicate_level() (#113)
  • implemented linkDuplicates parameter in duplicate_level()
  • autobuild_linkFromTimes() speed improvements
  • FUNCQ queries (start(), end(), medial()) now additionally support TRUE & FALSE and T & F values (vs. 0 & 1)
  • added attrDefNames column to list_levelDefinitions() output
  • can now deal with read only emuDBs by copying the cache to tempdir() and making it writable for the user
  • added start_item_seq_idx and end_item_seq_idx to emuRsegs object
  • added start_item_seq_idx and end_item_seq_idx type values to all intermediate result tables
  • added optional function to reduce hierarchical query results to left and right most children only (large performance gain on calcTimes = T)
  • rewriting annot.json files now updates MD5 sums as well (avoids unnecessary reload on next load_emuDB)
  • rewriting annot.json files now writes all (including empty / missing) attributeDef. labels

bug fixes

  • fixed bad DBconfig gen. on add_perspective
  • fixed list_linkDefinitions() returning strings as factors
  • fixed bad error message when passing in ITEM levels to autobuild_linkFromTimes()
  • fixed incorrect handling of DBconfig when writeToFS was set to FALSE (writeToFS is now called rewriteAllAnnots)

emuR 0.1.9

new features / performance tweaks / improvements

  • also allowing "time = " in TextTiers
  • "levels of type 'EVENT' are not allowed to be super levels (== parents) in a domination relationship" constraint enforced in add_linkDefinition
  • added "MEDIAFILE_SAMPLES" as constant name to access audio samples to get_trackdata() function
  • improved error message to include tgPath in create_DBconfigFromTextGrid function
  • no integer return value returned by create_emuRdemoData() any more! It was implicitly returned from wrassp function call...
  • improved the slow overlap checking function in the BPF parser (is now O(n) instead of O(n^2))
  • fixed col naming problems for new (unreleased) RSQLite version
  • added export_TextGridCollection() function
  • improved doc for get_trackdata
  • constant naming of EMU-SDMS vs EMU_SDMS in various files
  • rewriting all annotation files on add_levelDefinition, remove_levelDefinition
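The move from O(n^2) to O(n) overlap checking mentioned above can be sketched in base R as follows. This is an illustration of the idea, not the BPF parser's implementation; both function names are hypothetical, and the linear version assumes the segments are already sorted by start time with end >= start. Under that assumption any overlap must show up between two neighbouring segments, so one pass over adjacent pairs is enough.

```r
# Linear check: only compare each segment with its successor.
# Assumes start is sorted in ascending order and end >= start.
has_overlap_sorted <- function(start, end) {
  n <- length(start)
  if (n < 2) return(FALSE)
  any(start[-1] < end[-n])
}

# Quadratic reference: compare every pair of segments
has_overlap_naive <- function(start, end) {
  n <- length(start)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      if (i < j && start[i] < end[j] && start[j] < end[i]) return(TRUE)
    }
  }
  FALSE
}

# Made-up sample positions: the first set is contiguous, the second overlaps
ok_start  <- c(0, 100, 250); ok_end  <- c(100, 250, 400)
bad_start <- c(0,  90, 250); bad_end <- c(100, 250, 400)

ok_result  <- has_overlap_sorted(ok_start, ok_end)
bad_result <- has_overlap_sorted(bad_start, bad_end)
```

If the input is not sorted, an initial order() step is needed, which makes the whole check O(n log n) rather than O(n).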

bug fixes

  • fixed problem in conversion to JSON with empty items array (object '{}' vs array '[]')
  • fixed problem of keywords "number" | "time" | "xmin" | ... in labels causing TextGrid parser to fail
  • fixed problem with too lax RegEx in TextGrid parser
  • fixed validation problem with missing levels regarding types

emuR 0.1.8

new features / performance tweaks / improvements

  • get_trackdata with onTheFly calculation now reuses AsspDataObj if the current utterance is the same as the previous (large performance gain especially on long audio files)
  • checking if DBconfig exists for better error message if 'name' field is not set correctly in DBconfig
  • setting PRAGMA temp_store = 2; for SQLite connections
  • not extracting tables to R if no RegEx needed to create filtered_tmp tables (performance gain when querying large emuDBs)
  • convert_BPFCollection can now assign the same label to more than one item when unifying tiers
  • newline at the end of load_emuDB if no redundant links are built
  • queries using dominates operator '^' don't use linksExt table anymore -> large performance benefits
  • only using _filtered_tmp tables if RegEx patterns are used
  • changed primary key on items table which leads to massive performance gains (deleting _emuDBcache.sqlite required)
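The AsspDataObj reuse described for get_trackdata above amounts to a one-entry cache keyed on the current file. The base R sketch below shows that pattern under stated assumptions: make_cached_loader() and fake_compute() are hypothetical names, and fake_compute() merely stands in for an expensive on-the-fly signal computation. It is not the code used in emuR.

```r
# Wrap an expensive loader so that consecutive requests for the same key
# reuse the previous result instead of recomputing it.
make_cached_loader <- function(load_fun) {
  last_key <- NULL
  last_value <- NULL
  n_loads <- 0L
  list(
    get = function(key) {
      if (is.null(last_key) || !identical(key, last_key)) {
        last_value <<- load_fun(key)   # cache miss: compute and remember
        last_key <<- key
        n_loads <<- n_loads + 1L
      }
      last_value
    },
    loads = function() n_loads          # how often load_fun was really called
  )
}

# Stand-in for the expensive computation (hypothetical)
fake_compute <- function(path) nchar(path)

loader <- make_cached_loader(fake_compute)
# Segments sorted by bundle, so equal paths are adjacent
paths <- c("a.wav", "a.wav", "a.wav", "b.wav", "b.wav")
values <- vapply(paths, loader$get, integer(1), USE.NAMES = FALSE)
n_computed <- loader$loads()
```

With five segments from two files the computation runs only twice, which is why the gain is largest when many segments come from the same long recording.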

bug fixes

  • fixed error handling of create_emuRtrackdata + added @export to roxygen doc
  • invalid annotJSONs generated by import_mediaFiles fixed
  • convert_TextGridCollection can now handle nested folders again
  • invalid UUIDs in DBConfig produced by convert_BPFCollection. Also added additional unit test to detect this.
  • list_bundles uses session argument again
  • fixed "Expression tree is too large (maximum depth 1000)" error in get_trackdata with long emuRsegs lists

emuR 0.1.7

  • R depends version bump to 3.2.0 (as requested by CRAN maintainer)
  • updated testthat::expect_less_than to expect_lt calls (due to deprecated warnings)
  • Using new .keep_all = T parameter of dplyr
  • removed legacy version of EQL vignette (overlooked as inst/doc was in .gitignore)

emuR 0.1.6

  • skipping in-depth thorough tests on CRAN for query and autobuild SQL functions

emuR 0.1.5

  • fixed problem of interm_res_tables already being present with queries that have multiple recursion depth on both sides of either -> or ^ operand (e.g. query (ae , "[[[Phonetic = n -> Phonetic =z] -> Phonetic = S ] ^ [Text = friends -> Text = she]]"))
  • fixed bad URL in
  • added CITATION file


  • renamed SQL tables & columns from camel case to underscore notation
  • variable SQL backend implementation


  • multiple check fixes on various platforms


  • serve problem with internalVars bug fixed
  • fixed file locking problem that caused vignettes to fail under Windows


  • massive refactor of all functions that used to refer to an emuDB by name and optionally by its UUID. They now use the new emuDBhandle object that is now returned by the load_emuDB() function.
  • convert_XXX_to_emuDB() functions renamed to convert_XXX()



1.0.0 by Raphael Winkelmann, 2 months ago

Authors: Raphael Winkelmann [aut, cre], Klaus Jaensch [aut, ctb], Steve Cassidy [aut, ctb], Jonathan Harrington [aut, ctb]

GPL (>= 2) license

Imports MASS, tools, utils, stats, methods, graphics, grDevices, stringr, uuid, RCurl, base64enc, shiny, wrassp, jsonlite, RSQLite, DBI, httpuv, dplyr, readr, tibble, purrr, compare

Suggests ggplot2, testthat, knitr, rmarkdown