A tool to read and manipulate data generated from 'RiverWare'(TM) <http://www.riverware.org/> simulations. 'RiverWare' and 'RiverSMART' generate data in "rdf", "csv", and "nc" format. This package provides an interface to read, aggregate, and summarize data from one or more simulations in a 'dplyr' pipeline.
RWDataPlyr is a tool to read and manipulate data generated from RiverWare(TM) simulations in rdf, csv, and nc formats and work with those data in a dplyr pipeline. It provides functions to gather, aggregate, and summarize data from multiple RiverWare simulations, i.e., scenarios.
RWDataPlyr can be installed from GitHub, and we suggest building the vignette.
# install.packages("devtools")
devtools::install_github('BoulderCodeHub/RWDataPlyr', build_vignettes = TRUE)
RWDataPlyr provides at least three workflows for reading and using RiverWare data:
rdf_aggregate() and user-specified
rw_scen_aggregate() and user-specified
Check out the workflow vignette for more details:
vignette("rwdataplyr-workflow", package = "RWDataPlyr")
This software is in the public domain because it contains materials that originally came from the U.S. Bureau of Reclamation, an agency of the United States Department of the Interior.
Although this code has been used by Reclamation, no warranty, expressed or implied, is made by Reclamation or the U.S. Government as to the accuracy and functioning of the program and related program material nor shall the fact of distribution constitute any such warranty, and no responsibility is assumed by Reclamation in connection therewith.
This software is provided "AS IS."
Released August 15, 2018
getDataForAllScens() now defaults to
NULL, so that this function conforms to CRAN policies regarding not writing to the user's home file space by default. This should not cause any backwards compatibility issues since all older code will explicitly specify the
rwd_agg_template() is now empty, so the user must specify the file explicitly for it to be created, also to conform to the same CRAN policy.
tempdir(), when necessary.
Released June 8, 2018
rdf_to_rwtbl2() was added to try to improve the performance of
rw_scen_aggregate(). (#85) These functions were shown to be about 6x to 7x slower than
getDataForAllScens() for the same aggregation. The new function calls a C++ version that creates the tbl_df, while maintaining the same information as
rdf_to_rwtbl(). The API for the new function is slightly different; the first argument is now a path to an rdf file, rather than an already read-in
rdf object. Otherwise, the same options are available. The C++ version of
rdf_to_rwtbl() is about 20x faster than the R version. However,
rw_scen_aggregate() are still about 4x slower than
getDataForAllScens(), indicating that some work is still necessary to bring the speeds of the two functions closer together (#90). As part of this work, the following modifications to other functions were made:
cpp arguments. By default,
rdf_to_rwtbl2() is used, but
rdf_to_rwtbl() can be forced by setting
cpp = FALSE.
TRUE (default), it returns an
rdf object; otherwise it returns a character vector.
rdf_to_rwtbl() in favor of
scenario is coerced into a character. Typically this is a character, but it was previously left as numeric if specified as a numeric. For easier compatibility with C++ and comparison between
rdf_to_rwtbl2(), it's now always a character.
verbose parameter to print out the status of processing multiple scenarios, rdfs, and slots. (#82)
rwd_agg_template()) to create a blank template (or with examples:
examples = TRUE) csv file to use to create
read_rwd_agg()) to read in csv files as
read_rdf() error messages (#86)
rw_scen_aggregate() will now work with unnamed
rwslot_* functions now error if the data passed to them are not regular (January - December or October - September) (#83)
rdf_get_slot() now has a
"timespan" attribute that corresponds to the start and end values of the rdf.
rw_scen_aggregate(). Now it is clear that the
Timestep column is not returned. (#84)
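The regularity requirement on the rwslot_* functions noted above (#83) can be illustrated with a short base R sketch. This is not the package's internal check: `is_regular()` is a hypothetical helper, and it assumes monthly data where "regular" means the series covers whole calendar years (January - December) or whole water years (October - September).

```r
# Hypothetical helper (not part of RWDataPlyr): TRUE if a monthly date vector
# spans whole calendar years (Jan-Dec) or whole water years (Oct-Sep).
is_regular <- function(dates) {
  mon <- as.integer(format(dates, "%m"))
  n <- length(mon)
  # Whole years of monthly data must have a multiple of 12 timesteps
  if (n == 0L || n %% 12L != 0L) return(FALSE)
  (mon[1] == 1L && mon[n] == 12L) || (mon[1] == 10L && mon[n] == 9L)
}

# Two calendar years, one water year, and a March - February series
cal_years  <- seq(as.Date("2020-01-01"), by = "month", length.out = 24)
water_year <- seq(as.Date("2019-10-01"), by = "month", length.out = 12)
partial    <- seq(as.Date("2020-03-01"), by = "month", length.out = 12)

print(c(is_regular(cal_years), is_regular(water_year), is_regular(partial)))
```

Only the first two series would pass such a check; the March - February series has the right length but starts in the wrong month.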
Released April 10, 2018
RWDataPlyr v0.6.0 includes a major revamping of how scenarios are processed and defines several new classes.
rdf_aggregate() added to upgrade existing
getDataForAllScens() function, which processes multiple scenarios at a time. (#51)
rwd_agg class, which upgrades the existing "slot aggregation list". The advantage of the new class is a much more flexible way to summarize and aggregate RiverWare slots. (#68)
rwtbl_slot_names()), get the original scenario folders (
rwtbl_get_scen_folder()), and get RiverWare slot names from the saved variable name (
rwtbl_var_to_slot()) were added. (#50)
getDataForAllScens() now always returns data invisibly, so the
retFile argument is deprecated. This function is also deprecated in favor of
Formalized the list returned by
createSlotAggList() as a
slot_agg_list class. Created applicable constructor, which deprecates
is_ methods and functions. (#67)
slot_agg_list objects work with the deprecated
rwd_agg objects are preferable.
read_rw_csv() to read csv files created from RiverWare or RiverSMART. (#30)
rdf_to_rwtbl() to convert rdf objects (read in from
read.rdf()) to tibble objects. (#30)
read.rdf() now returns an object with an rdf class
Switched to snake_case for all new functions and replaced existing functions with snake_case versions. (#76)
|Old Function||New Function|
flowWeightedAvgAnnConc(): because it is rarely used, it was converted to an internal function:
trace_fwaac(). Then created
rwslot_fwaac() that is exported and follows the input/output format of
rwdataplyr-workflow vignette. Browse with
vignette("rwdataplyr-workflow", package = "RWDataPlyr").
getDataForAllScens() will use the full month name in the
Month column. This change could break existing code if there are checks for particular months. (#20)
read.rdf()'s original implementation. Now
read.rdf2() (the faster implementation) is named
read.rdf2() is deprecated. (#63)
read.rdf() now works with rdf files that contain scalar slots (#52)
read_rdf() is added as an alias.
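The Month column change noted above (#20) is the kind of change that breaks code silently: no error is raised, the filter just matches nothing. The sketch below uses an illustrative data frame rather than real package output, and it assumes the older code compared against abbreviated month names; `to_full_month()` is a hypothetical helper, not a package function.

```r
# Illustrative data only; real getDataForAllScens() output has more columns.
new_style <- data.frame(
  Month = month.name[c(12, 1)],  # "December", "January"
  Value = c(10, 20),
  stringsAsFactors = FALSE
)

# A check written against an abbreviated name now silently matches no rows
n_abbrev <- nrow(new_style[new_style$Month == "Dec", ])

# The same check against the full month name matches as intended
n_full <- nrow(new_style[new_style$Month == "December", ])

# Hypothetical helper: map abbreviated names to full names and leave
# anything else unchanged, so one check works for old and new output.
to_full_month <- function(m) {
  ifelse(m %in% month.abb, month.name[match(m, month.abb)], m)
}

print(c(n_abbrev, n_full))
print(to_full_month(c("Dec", "December", "May")))
```

Normalizing the column once, right after reading the data, keeps any later month checks independent of which package version produced the file.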
Released May 26, 2017
scales::percent, and keep us from having to multiply/divide by 100 if we compute other averages outside of RWDataPlyr. However, this will cause any plotting code that is expecting percentages between 0 and 100 instead of between 0 and 1 to break (or at least look weird). (#53)
findAllSlots boolean parameter to
TRUE, an error is raised if one or more of the slots cannot be found. If it is
FALSE, then it will fill in the data frame with
-99, but not fail. (#38)
makeAllScenNames() will create scenario names for vectors of multiple dimensions.
getWYFromYearmon() will take a
yearmon object and determine the water year it falls into.
WYMaxLTE as a valid aggregation method. This will compare the maximum value in a water year to a threshold and determine if it is less than or equal to the threshold.
getDataForAllScens() (and internal function
processSlots()) work with rdf files that only include one trace of data (#40)
processSlots() now returns the same type/class for
Variable columns for both annual and monthly data.
Year is a numeric and
Variable is always a character. This could affect code that does not read the data frame back in and assumes that Variable is a factor. (#54)
createSlotAggList(), so it will not wait until
processSlots() is called to throw the error.
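The water-year logic behind getWYFromYearmon() and the WYMaxLTE aggregation described above can be sketched in base R. This is not the package's implementation: `water_year()` and `wy_max_lte()` are hypothetical helpers, plain Date values stand in for yearmon objects, and the sketch assumes the water year runs October - September and is labeled by the calendar year in which it ends.

```r
# Hypothetical helper: water year of each date, assuming an October -
# September water year named for the calendar year in which it ends.
water_year <- function(dates) {
  yr  <- as.integer(format(dates, "%Y"))
  mon <- as.integer(format(dates, "%m"))
  ifelse(mon >= 10L, yr + 1L, yr)
}

# Hypothetical helper mirroring the WYMaxLTE idea: for each water year,
# is the maximum value less than or equal to the threshold?
wy_max_lte <- function(dates, values, threshold) {
  wy_max <- tapply(values, water_year(dates), max)
  wy_max <= threshold
}

# Two water years of made-up monthly values (October 2017 - September 2019)
dates  <- seq(as.Date("2017-10-01"), as.Date("2019-09-01"), by = "month")
values <- c(rep(3500, 12), 3500, 3720, rep(3600, 10))

res <- wy_max_lte(dates, values, threshold = 3700)
print(res)
```

With these made-up values, water year 2018 stays at or below the threshold and water year 2019 does not, because a single month exceeds 3700.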