Drift Correcting Water Quality Data

A tidy implementation of equations that correct for instrumental drift in continuous water quality monitoring data. There are many sources of water quality data, including private (e.g., YSI instruments) and open source (e.g., USGS and NDBC), each of which is susceptible to errors and inaccuracies due to drift. This package allows users to correct their data using one or two standard reference values in a uniform, reproducible way. The equations implemented are from Hasenmueller (2011).



There are many sources of water quality monitoring data, including instruments (e.g., YSI instruments) and open source data sets (e.g., USGS and NDBC), all of which are susceptible to errors and inaccuracies due to drift. driftR provides a grammar for cleaning and correcting these data in a “tidy”, reproducible manner.

What’s New

Version 1.1 of driftR is here! It includes:

  • a streamlined dr_read function that includes built-in support for YSI Sonde 6600, YSI EXO, and Onset HOBO products.
  • a new argument in dr_read that gives the option to read in only clean variable names (e.g., no special characters, no spaces, etc.) using functionality from the janitor package.
  • expanded functionality for dr_drop, including the ability to drop by date range and by using expressions.
  • a new function, dr_replace, that converts observations that are likely measurement errors to NA using either a date range or an expression.
  • changes under the hood to dr_factor that expand its ability to handle a variety of date formats automatically.

Installation

The easiest way to get driftR is to install it from CRAN:

install.packages("driftR")

You can also install the development version of driftR from Github with devtools:

# install.packages("devtools")
devtools::install_github("shaughnessyar/driftR")

Background

The driftR package implements a series of equations used in Dr. Elizabeth Hasenmueller’s hydrology and geochemistry research. These equations correct continuous water quality monitoring data for incremental drift that occurs over time after calibration. There are two forms of corrections included in the package - a one-point calibration and a two-point calibration. One-point and two-point calibration values are suited for different types of measurements. The package is currently written for the easiest use with YSI multiparameter Sonde V2 series products, YSI EXO products, and Onset HOBO products.
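
To make the idea of a one-point and a two-point correction concrete, here is a minimal base R sketch. It assumes that drift accumulates linearly over the deployment, so the correction is scaled by the fraction of the deployment that has elapsed. The helper names (drift_fraction, correct_one, correct_two) and the sample values are hypothetical and are not part of driftR. The formulas illustrate linear drift correction in general and may differ in detail from the Hasenmueller (2011) equations that the package implements.

```r
# Hypothetical helpers for illustration only; these are not driftR functions.

# Fraction of the deployment elapsed at each observation:
# 0 at the first observation, 1 at the last.
drift_fraction <- function(date_time) {
  elapsed <- as.numeric(difftime(date_time, min(date_time), units = "secs"))
  elapsed / max(elapsed)
}

# One-point correction (assumed linear offset drift).
# cal_val: what the instrument reads in the standard at retrieval
# cal_std: the known value of that standard
correct_one <- function(x, frac, cal_val, cal_std) {
  x + frac * (cal_std - cal_val)
}

# Two-point correction (assumed linear drift at both standards).
# The drifted readings of the low and high standards are interpolated in
# time, then each observation is rescaled back onto the true standard range.
correct_two <- function(x, frac, val_low, std_low, val_high, std_high) {
  low  <- std_low  + frac * (val_low  - std_low)
  high <- std_high + frac * (val_high - std_high)
  (x - low) / (high - low) * (std_high - std_low) + std_low
}

# Made-up example: three daily readings over a two-day deployment.
times <- as.POSIXct(
  c("2018-02-01 00:00:00", "2018-02-02 00:00:00", "2018-02-03 00:00:00"),
  tz = "UTC"
)
frac <- drift_fraction(times)

# The instrument read 1.07 in a 1.00 standard at retrieval.
sp_cond <- c(1.00, 1.04, 1.07)
sp_cond_corr <- correct_one(sp_cond, frac, cal_val = 1.07, cal_std = 1)

# At the end of the deployment (frac = 1), readings equal to the drifted
# standard readings map back onto the true standard values.
ph_corr <- correct_two(c(7.01, 11.8), frac = c(1, 1),
                       val_low = 7.01, std_low = 7,
                       val_high = 11.8, std_high = 10)
```

In this sketch the first observation is left unchanged and the last receives the full correction, which matches the description above of drift that grows incrementally after calibration.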

The figure below illustrates the difference in chloride values between the uncorrected data and the same data with the drift corrections implemented by driftR applied. Note that the uncorrected data drifts to higher values over time. driftR uses calibration data to correct this drift.

Usage

As shown, continuous water quality instruments drift over time, so it becomes necessary to correct the data to maintain accuracy. driftR provides five verbs for applying these corrections in a consistent, reproducible manner: read, factor, correct, drop, and replace. These verbs are designed to be implemented in that order, though there may be multiple applications of correct for a given data set. All of the core functions for driftR have the dr_ prefix, making it easy to use them interactively in RStudio.

Basic Use

The following example shows a simple workflow for applying these verbs to some hypothetical data:

# load the driftR package
library(driftR)
 
# import data exported from a Sonde
waterTibble <- dr_read(file = "data.csv", instrument = "Sonde", defineVar = TRUE, 
                       cleanVar = TRUE, case = "snake")
 
# calculate correction factor and keep dateTime var
# results stored in new vectors corrFac and dateTime
waterTibble <- dr_factor(waterTibble, corrFactor = corrFac, dateVar = Date,
                         timeVar = Time, keepDateTime = TRUE)
 
# apply one-point calibration to SpCond;
# results stored in new vector SpCond_Corr
waterTibble <- dr_correctOne(waterTibble, sourceVar = SpCond, cleanVar = SpCond_Corr,
                             calVal = 1.07, calStd = 1, factorVar = corrFac)
 
# apply two-point calibration to pH;
# results stored in new vector pH_Corr
waterTibble <- dr_correctTwo(waterTibble, sourceVar = pH, cleanVar = pH_Corr,
                             calValLow = 7.01, calStdLow = 7, calValHigh = 11.8,
                             calStdHigh =  10, factorVar = corrFac)
 
# drop observations to account for instrument equilibration
waterTibble <- dr_drop(waterTibble, head=10, tail=5)
 
# replace observations with NA for a date range
waterTibble <- dr_replace(waterTibble, sourceVar = pH, overwrite = TRUE, dateVar = Date,
                          timeVar = Time, from = "2018-02-05", to = "2018-02-09")
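
For readers who want to see what the drop and replace steps amount to, the following base R sketch mimics them on a small made-up data frame. The helpers drop_ends and replace_range are hypothetical and are not the driftR implementation. Whether the package treats the end of a date range as inclusive is not stated here, so the inclusive range below is an assumption.

```r
# Hypothetical base R helpers; they mimic the idea of dropping and replacing
# observations and are not the driftR implementation.

# Remove the first `head` and last `tail` rows of a data frame.
drop_ends <- function(df, head = 0, tail = 0) {
  idx <- seq_len(nrow(df))
  df[idx > head & idx <= nrow(df) - tail, , drop = FALSE]
}

# Set `var` to NA for rows whose date falls within [from, to]
# (both ends assumed inclusive).
replace_range <- function(df, var, date_var, from, to) {
  dates <- as.Date(df[[date_var]])
  in_range <- dates >= as.Date(from) & dates <= as.Date(to)
  df[[var]][in_range] <- NA
  df
}

# Made-up data: 20 daily pH readings starting 2018-02-01.
water <- data.frame(
  Date = as.Date("2018-02-01") + 0:19,
  pH = 7 + 0.1 * (0:19)
)

# Drop 2 observations from the start and 1 from the end, then blank out
# pH for 2018-02-05 through 2018-02-09.
trimmed <- drop_ends(water, head = 2, tail = 1)
cleaned <- replace_range(trimmed, var = "pH", date_var = "Date",
                         from = "2018-02-05", to = "2018-02-09")
```

Replacing suspect values with NA rather than deleting the rows keeps the time series regularly spaced, which is useful when plotting or aggregating later.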

Use with %>%

All of the core functions return tibbles (or data frames) and make use of the tidy evaluation pronoun .data, so using them in concert with the pipe (%>%) is straightforward:

# load the driftR package
library(driftR)
 
# import data exported from a Sonde
waterTibble <- dr_read(file = "sondeData.csv", instrument = "Sonde", defineVar = TRUE,
                       cleanVar = TRUE, case = "snake")
 
# calculate correction factors, apply corrections, drop observations, and replace observations
waterTibble <- waterTibble %>%
  dr_factor(corrFactor = corrFac, dateVar = Date, timeVar = Time,
            keepDateTime = TRUE) %>%
  dr_correctOne(sourceVar = SpCond, cleanVar = SpCond_Corr, calVal = 1.07,
                calStd = 1, factorVar = corrFac) %>%
  dr_correctTwo(sourceVar = pH, cleanVar = pH_Corr, calValLow = 7.01, calStdLow = 7,
                calValHigh = 11.8, calStdHigh =  10, factorVar = corrFac) %>%
  dr_drop(head=10, tail=5) %>%
  dr_replace(sourceVar = pH, overwrite = TRUE, dateVar = Date, timeVar = Time,
             from = "2018-02-05", to = "2018-02-09")

Additional Documentation

See the package website for more information on these functions and a detailed vignette describing how to get started with driftR. There is also an additional vignette describing the specific ways in which dates and times can be used in driftR functions.

We also provide some introductory examples for how to use tidyr, ggplot2, and several other R packages to conduct some initial exploratory data analysis of driftR output. Finally, we provide a third vignette designed for users of instruments not supported directly by driftR who wish to use driftR with their data.

You can also view the help files from within R:

?dr_read

Want to Contribute?

Have a Concern?

If driftR does not seem to be working as advertised, please help us by creating a reproducible example, or reprex, that makes it easy to get help. You can find additional details in our support document.

Adding dr_read Functionality

We are interested in expanding the built-in capabilities of driftR to read in water quality data from other sources. As of version 1.1, we provide built-in support for YSI Sonde 6600, YSI EXO, and Onset HOBO products.

If you have some sample data (~500 observations are ideal) from another model or brand of instrument and are willing to share it, please reach out to one of the package authors or, better yet, open an Issue. If you have some R skills and want to write the function yourself, feel free to check out our contributing document and fork driftR.

Contributor Code of Conduct

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

About the Authors

Andrew Shaughnessy led the development of this package. He is a senior at Saint Louis University majoring in Chemistry and Environmental Science.

Christopher Prener, Ph.D. assisted in the development of this package. He is an Assistant Professor in the Department of Sociology and Anthropology at Saint Louis University. He has a broad interest in computational social science as well as the development of R packages to make research more reproducible and to generalize research code.

Elizabeth Hasenmueller, Ph.D. developed the original equations that this package implements and provided the example data. She is an Assistant Professor in the Department of Earth and Atmospheric Science at Saint Louis University.

News

driftR 1.1.0

Major Changes

  • dr_read added to correctly format data from YSI Sonde 6600 and EXO products, as well as Onset HOBO sensors. The new function contains an argument to specify the instrument, so dr_readSonde has been deprecated. The dr_read function can now reformat variable names using the janitor package's clean_names() function. This is a significant change and may break workflows based on driftR v1.0 that anticipate particular variable names. The cleanVar argument is, however, optional, so setting it to FALSE should leave the result compatible with legacy code.
  • dr_drop now includes options to drop by date and (optionally) time, as well as by using an expression to identify certain values.
  • dr_replace added to give the option to replace values that are measurement errors with NA based on date and (optionally) time, as well as by using an expression to identify certain values.
  • The format argument for dr_factor has been deprecated; date formatting is now detected automatically by the function.

Minor Changes

  • A redesigned hex logo has been added
  • The README and package website have been updated to reflect changes to the package
  • The vignette on tidy evaluation has been removed
  • A vignette on dates and times in driftR has been added

driftR 1.0.0

  • CRAN release version

driftR 0.2.2

  • Correct documentation vignettes and package website for technical details
  • Change plot on README to chloride
  • Add keepDateTime argument to dr_factor()

driftR 0.2.1

  • Complete documentation vignettes and package website.

driftR 0.2.0

  • Added a NEWS.md file to track changes to the package.
  • Ground up re-write of the functions to allow for non-standard evaluation and tidy evaluation from dplyr and rlang.
  • Functions now support piping.

1.1.0 by Andrew Shaughnessy, 6 months ago


https://github.com/shaughnessyar/driftR


Report a bug at https://github.com/shaughnessyar/driftR/issues


Browse source code at https://github.com/cran/driftR


Authors: Andrew Shaughnessy [aut, cre] , Christopher Prener [aut] , Elizabeth Hasenmueller [aut]



GPL-3 license


Imports dplyr, glue, janitor, lubridate, magrittr, readr, readxl, rlang, stringr, tibble

Suggests covr, knitr, testthat, rmarkdown

