Client for many 'NOAA' data sources including the 'NCDC' climate 'API' at <https://www.ncdc.noaa.gov/cdo-web/webservices/v2>, with functions for each of the 'API' 'endpoints': data, data categories, data sets, data types, locations, location categories, and stations. In addition, we have an interface for 'NOAA' sea ice data, the 'NOAA' severe weather inventory, 'NOAA' Historical Observing 'Metadata' Repository ('HOMR') data, 'NOAA' storm data via 'IBTrACS', tornado data via the 'NOAA' storm prediction center, and more.
rnoaa is an R interface to many NOAA data sources. We don't cover all of them, but we include many commonly used sources, and we are always adding new ones. We focus on easy-to-use interfaces for getting NOAA data, and on giving back data in formats that are easy to use downstream. We currently don't do much in the way of plots or analysis.
Functions to work with buoy data use netcdf files. You'll need the ncdf package for those functions, and those only. ncdf is in Suggests in this package, meaning you only need ncdf if you are using the buoy functions. You'll get an informative error telling you to install ncdf if you don't have it and you try to use the buoy functions. Installation of ncdf should be straightforward on Mac and Windows, but on Linux you may have issues. See http://cran.r-project.org/web/packages/ncdf/INSTALL
There are many NOAA NCDC datasets. All data sources work, except NEXRAD3, for an unknown reason. This relates to ncdc_*() functions only.
| Dataset | Description | Start Date | End Date | Data Coverage |
|---|---|---|---|---|
| GSOM | Global Summary of the Month | 1763-01-01 | 2017-04-01 | 1.00 |
| GSOY | Global Summary of the Year | 1763-01-01 | 2016-01-01 | 1.00 |
| NEXRAD2 | Weather Radar (Level II) | 1991-06-05 | 2017-05-01 | 0.95 |
| NEXRAD3 | Weather Radar (Level III) | 1994-05-20 | 2017-04-07 | 0.95 |
| PRECIP_15 | Precipitation 15 Minute | 1970-05-12 | 2014-01-01 | 0.25 |
Each NOAA dataset has a different set of attributes that you can potentially get back in your search. See http://www.ncdc.noaa.gov/cdo-web/datasets for detailed info on each dataset. We provide some information on the attributes in this package; see the attributes vignette to find out more.
You'll need an API key (essentially a password) to use the NOAA NCDC functions (those starting with ncdc*()) in this package. Go to http://www.ncdc.noaa.gov/cdo-web/token to get one. You can't use this package without an API key.
Once you obtain a key, there are two ways to use it.
a) Pass it inline with each function call (somewhat cumbersome)
ncdc(datasetid = 'PRECIP_HLY', locationid = 'ZIP:28801', datatypeid = 'HPCP', limit = 5, token = "YOUR_TOKEN")
b) Alternatively, you might find it easier to set this as an option, either by adding this line to the top of a script or somewhere in your
options(noaakey = "KEY_EMAILED_TO_YOU")
c) You can always store it permanently in your
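The ways of supplying a token can be combined into one lookup, sketched here in base R. The helper name resolve_noaa_key() is hypothetical and not part of rnoaa. The option name noaakey comes from the text above, and the NOAA_KEY environment variable is the one named in the changelog further down. The priority order (explicit argument, then environment variable, then option) is an assumption of this sketch.

```r
# Hypothetical helper (not part of rnoaa): resolve an API token from an
# explicit argument, then an environment variable, then an R option.
resolve_noaa_key <- function(token = NULL) {
  if (!is.null(token) && nzchar(token)) return(token)
  env_key <- Sys.getenv("NOAA_KEY", unset = "")
  if (nzchar(env_key)) return(env_key)
  opt_key <- getOption("noaakey", default = NULL)
  if (!is.null(opt_key) && nzchar(opt_key)) return(opt_key)
  stop("No NOAA token found: pass `token`, set NOAA_KEY, or set options(noaakey = ...)")
}

# Usage with placeholder values (not real tokens)
Sys.unsetenv("NOAA_KEY")
options(noaakey = "KEY_FROM_OPTION")
key_from_option <- resolve_noaa_key()            # falls back to the option
Sys.setenv(NOAA_KEY = "KEY_FROM_ENV")
key_from_env <- resolve_noaa_key()               # environment variable wins over the option
key_inline <- resolve_noaa_key("KEY_INLINE")     # explicit argument wins over both
```

Failing loudly when no token is found is a design choice of the sketch; it mirrors the idea of giving an informative error instead of sending an unauthenticated request.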
You'll need GDAL installed first. You may want to use GDAL >= 0.9-1 since that version or later can read TopoJSON format files as well, which aren't required here, but may be useful. Install GDAL:
sudo apt-get install gdal-bin
Then when you install the R package (rgeos also requires GDAL), you'll most likely need to specify where your gdal-config file is on your machine, as well as a few other things. I have an OSX Mavericks machine, and this works for me (there's no binary for Mavericks, so install the source version):
install.packages("", repos = NULL, type="source", configure.args = "--with-gdal-config=/Library/Frameworks/GDAL.framework/Versions/1.10/unix/bin/gdal-config --with-proj-include=/Library/Frameworks/PROJ.framework/unix/include --with-proj-lib=/Library/Frameworks/PROJ.framework/unix/lib")
The rest of the installation should be easy. If not, let us know.
Stable version from CRAN
or development version from GitHub
ncdc_locs(locationcategoryid='CITY', sortfield='name', sortorder='desc')
#> $meta
#> $meta$totalCount
#> [1] 1980
#>
#> $meta$pageCount
#> [1] 25
#>
#> $meta$offset
#> [1] 1
#>
#>
#> $data
#>       mindate    maxdate                  name datacoverage            id
#> 1  1892-08-01 2017-03-31            Zwolle, NL       1.0000 CITY:NL000012
#> 2  1901-01-01 2017-04-29            Zurich, SZ       1.0000 CITY:SZ000007
#> 3  1957-07-01 2017-04-29         Zonguldak, TU       1.0000 CITY:TU000057
#> 4  1906-01-01 2017-04-29            Zinder, NG       0.9025 CITY:NG000004
#> 5  1973-01-01 2017-04-29        Ziguinchor, SG       1.0000 CITY:SG000004
#> 6  1938-01-01 2017-04-29         Zhytomyra, UP       0.9723 CITY:UP000025
#> 7  1948-03-01 2017-04-29        Zhezkazgan, KZ       0.9302 CITY:KZ000017
#> 8  1951-01-01 2017-04-29         Zhengzhou, CH       1.0000 CITY:CH000045
#> 9  1941-01-01 2017-03-31          Zaragoza, SP       1.0000 CITY:SP000021
#> 10 1936-01-01 2009-06-17      Zaporiyhzhya, UP       1.0000 CITY:UP000024
#> 11 1957-01-01 2017-04-29          Zanzibar, TZ       0.8016 CITY:TZ000019
#> 12 1973-01-01 2017-04-29            Zanjan, IR       0.9105 CITY:IR000020
#> 13 1893-01-01 2017-05-01     Zanesville, OH US       1.0000 CITY:US390029
#> 14 1912-01-01 2017-04-29             Zahle, LE       0.9819 CITY:LE000004
#> 15 1951-01-01 2017-04-29           Zahedan, IR       0.9975 CITY:IR000019
#> 16 1860-12-01 2017-04-29            Zagreb, HR       1.0000 CITY:HR000002
#> 17 1975-08-29 2017-04-29         Zacatecas, MX       0.9306 CITY:MX000036
#> 18 1947-01-01 2017-04-29 Yuzhno-Sakhalinsk, RS       1.0000 CITY:RS000081
#> 19 1893-01-01 2017-05-01           Yuma, AZ US       1.0000 CITY:US040015
#> 20 1942-02-01 2017-05-01   Yucca Valley, CA US       1.0000 CITY:US060048
#> 21 1885-01-01 2017-05-01      Yuba City, CA US       1.0000 CITY:US060047
#> 22 1998-02-01 2017-04-29            Yozgat, TU       1.0000 CITY:TU000056
#> 23 1893-01-01 2017-05-01     Youngstown, OH US       1.0000 CITY:US390028
#> 24 1894-01-01 2017-05-01           York, PA US       1.0000 CITY:US420024
#> 25 1869-01-01 2017-05-01        Yonkers, NY US       1.0000 CITY:US360031
#>
#> attr(,"class")
#> [1] "ncdc_locs"
ncdc_stations(datasetid='GHCND', locationid='FIPS:12017', stationid='GHCND:USC00084289')
#> $meta
#> NULL
#>
#> $data
#>   elevation    mindate    maxdate latitude                  name
#> 1      12.2 1899-02-01 2017-04-30  28.8029 INVERNESS 3 SE, FL US
#>   datacoverage                id elevationUnit longitude
#> 1            1 GHCND:USC00084289        METERS  -82.3126
#>
#> attr(,"class")
#> [1] "ncdc_stations"
out <- ncdc(datasetid='NORMAL_DLY', stationid='GHCND:USW00014895', datatypeid='dly-tmax-normal', startdate = '2010-05-01', enddate = '2010-05-10')
head( out$data )
#>                  date        datatype           station value fl_c
#> 1 2010-05-01T00:00:00 DLY-TMAX-NORMAL GHCND:USW00014895   652    S
#> 2 2010-05-02T00:00:00 DLY-TMAX-NORMAL GHCND:USW00014895   655    S
#> 3 2010-05-03T00:00:00 DLY-TMAX-NORMAL GHCND:USW00014895   658    S
#> 4 2010-05-04T00:00:00 DLY-TMAX-NORMAL GHCND:USW00014895   661    S
#> 5 2010-05-05T00:00:00 DLY-TMAX-NORMAL GHCND:USW00014895   663    S
#> 6 2010-05-06T00:00:00 DLY-TMAX-NORMAL GHCND:USW00014895   666    S
out <- ncdc(datasetid='GHCND', stationid='GHCND:USW00014895', datatypeid='PRCP', startdate = '2010-05-01', enddate = '2010-10-31', limit=500)
ncdc_plot(out, breaks="1 month", dateformat="%d/%m")
You can pass many outputs from calls to the
noaa function in to the
out1 <- ncdc(datasetid='GHCND', stationid='GHCND:USW00014895', datatypeid='PRCP', startdate = '2010-03-01', enddate = '2010-05-31', limit=500)
out2 <- ncdc(datasetid='GHCND', stationid='GHCND:USW00014895', datatypeid='PRCP', startdate = '2010-09-01', enddate = '2010-10-31', limit=500)
ncdc_plot(out1, out2, breaks="45 days")
ncdc_datasets()
#> $meta
#> $meta$offset
#> [1] 1
#>
#> $meta$count
#> [1] 11
#>
#> $meta$limit
#> [1] 25
#>
#>
#> $data
#>                     uid    mindate    maxdate                        name
#> 1  gov.noaa.ncdc:C00861 1763-01-01 2017-05-01             Daily Summaries
#> 2  gov.noaa.ncdc:C00946 1763-01-01 2017-04-01 Global Summary of the Month
#> 3  gov.noaa.ncdc:C00947 1763-01-01 2016-01-01  Global Summary of the Year
#> 4  gov.noaa.ncdc:C00345 1991-06-05 2017-05-01    Weather Radar (Level II)
#> 5  gov.noaa.ncdc:C00708 1994-05-20 2017-04-07   Weather Radar (Level III)
#> 6  gov.noaa.ncdc:C00821 2010-01-01 2010-01-01     Normals Annual/Seasonal
#> 7  gov.noaa.ncdc:C00823 2010-01-01 2010-12-31               Normals Daily
#> 8  gov.noaa.ncdc:C00824 2010-01-01 2010-12-31              Normals Hourly
#> 9  gov.noaa.ncdc:C00822 2010-01-01 2010-12-01             Normals Monthly
#> 10 gov.noaa.ncdc:C00505 1970-05-12 2014-01-01     Precipitation 15 Minute
#> 11 gov.noaa.ncdc:C00313 1900-01-01 2014-01-01        Precipitation Hourly
#>    datacoverage         id
#> 1          1.00      GHCND
#> 2          1.00       GSOM
#> 3          1.00       GSOY
#> 4          0.95    NEXRAD2
#> 5          0.95    NEXRAD3
#> 6          1.00 NORMAL_ANN
#> 7          1.00 NORMAL_DLY
#> 8          1.00 NORMAL_HLY
#> 9          1.00 NORMAL_MLY
#> 10         0.25  PRECIP_15
#> 11         1.00 PRECIP_HLY
#>
#> attr(,"class")
#> [1] "ncdc_datasets"
ncdc_datacats(locationid = 'CITY:US390029')
#> $meta
#> $meta$totalCount
#> [1] 38
#>
#> $meta$pageCount
#> [1] 25
#>
#> $meta$offset
#> [1] 1
#>
#>
#> $data
#>                     name            id
#> 1    Annual Agricultural        ANNAGR
#> 2     Annual Degree Days         ANNDD
#> 3   Annual Precipitation       ANNPRCP
#> 4     Annual Temperature       ANNTEMP
#> 5    Autumn Agricultural         AUAGR
#> 6     Autumn Degree Days          AUDD
#> 7   Autumn Precipitation        AUPRCP
#> 8     Autumn Temperature        AUTEMP
#> 9               Computed          COMP
#> 10 Computed Agricultural       COMPAGR
#> 11           Degree Days            DD
#> 12      Dual-Pol Moments DUALPOLMOMENT
#> 13             Echo Tops       ECHOTOP
#> 14      Hydrometeor Type   HYDROMETEOR
#> 15            Miscellany          MISC
#> 16                 Other         OTHER
#> 17               Overlay       OVERLAY
#> 18         Precipitation          PRCP
#> 19          Reflectivity  REFLECTIVITY
#> 20    Sky cover & clouds           SKY
#> 21   Spring Agricultural         SPAGR
#> 22    Spring Degree Days          SPDD
#> 23  Spring Precipitation        SPPRCP
#> 24    Spring Temperature        SPTEMP
#> 25   Summer Agricultural         SUAGR
#>
#> attr(,"class")
#> [1] "ncdc_datacats"
tornadoes() simply gets all the data. So the call takes a while, but once done, is fun to play with.
shp <- tornadoes()
#> OGR data source with driver: ESRI Shapefile
#> Source: "/Users/sacmac/Library/Caches/rnoaa/tornadoes/torn", layer: "torn"
#> with 60114 features
#> It has 22 fields
#> Integer64 fields read as strings: om yr mo dy tz stf stn mag inj fat wid fc
library('sp')
plot(shp)
In this example, search for metadata for a single station ID
homr(qid = 'COOP:046742')
#> $`20002078`
#> $`20002078`$id
#> [1] "20002078"
#>
#> $`20002078`$head
#>                  preferredName latitude_dec longitude_dec precision
#> 1 PASO ROBLES MUNICIPAL AP, CA      35.6697     -120.6283   DDddddd
#>             por.beginDate por.endDate
#> 1 1949-10-05T00:00:00.000     Present
#>
#> $`20002078`$namez
#>                         name  nameType
#> 1   PASO ROBLES MUNICIPAL AP      COOP
#> 2   PASO ROBLES MUNICIPAL AP PRINCIPAL
#> 3 PASO ROBLES MUNICIPAL ARPT       PUB
#>
#> $`20002078`$identifiers
#>    idType          id
#> 1   GHCND USW00093209
#> 2 GHCNMLT USW00093209
...
Get storm data for the year 2010
storm_data(year = 2010)
#> # A tibble: 2,855 × 195
#>       serial_num season   num basin sub_basin  name            iso_time
#>            <chr>  <int> <int> <chr>     <chr> <chr>               <chr>
#> 1  2009317S10073   2010     1    SI        MM  ANJA 2009-11-13 06:00:00
#> 2  2009317S10073   2010     1    SI        MM  ANJA 2009-11-13 12:00:00
#> 3  2009317S10073   2010     1    SI        MM  ANJA 2009-11-13 18:00:00
#> 4  2009317S10073   2010     1    SI        MM  ANJA 2009-11-14 00:00:00
#> 5  2009317S10073   2010     1    SI        MM  ANJA 2009-11-14 06:00:00
#> 6  2009317S10073   2010     1    SI        MM  ANJA 2009-11-14 12:00:00
#> 7  2009317S10073   2010     1    SI        MM  ANJA 2009-11-14 18:00:00
#> 8  2009317S10073   2010     1    SI        MM  ANJA 2009-11-15 00:00:00
#> 9  2009317S10073   2010     1    SI        MM  ANJA 2009-11-15 06:00:00
#> 10 2009317S10073   2010     1    SI        MM  ANJA 2009-11-15 12:00:00
#> # ... with 2,845 more rows, and 188 more variables: nature <chr>,
#> #   latitude <dbl>, longitude <dbl>, wind.wmo. <dbl>, pres.wmo. <dbl>,
#> #   center <chr>, wind.wmo..percentile <dbl>, pres.wmo..percentile <dbl>,
#> #   track_type <chr>, latitude_for_mapping <dbl>,
#> #   longitude_for_mapping <dbl>, current.basin <chr>,
#> #   hurdat_atl_lat <dbl>, hurdat_atl_lon <dbl>, hurdat_atl_grade <dbl>,
#> #   hurdat_atl_wind <dbl>, hurdat_atl_pres <dbl>, td9636_lat <dbl>,...
Get forecast for a certain variable.
res <- gefs("Total_precipitation_surface_6_Hour_Accumulation_ens", lat = 46.28125, lon = -116.2188)
head(res$data)
#>   Total_precipitation_surface_6_Hour_Accumulation_ens lon lat ens time2
#> 1                                                   0 244  46   0     6
#> 2                                                   0 244  46   1    12
#> 3                                                   0 244  46   2    18
#> 4                                                   0 244  46   3    24
#> 5                                                   0 244  46   4    30
#> 6                                                   0 244  46   5    36
There is a suite of functions for Argo data; a few examples:
# Spatial search - by bounding box
argo_search("coord", box = c(-40, 35, 3, 2))
# Time based search
argo_search("coord", yearmin = 2007, yearmax = 2009)
# Data quality based search
argo_search("coord", pres_qc = "A", temp_qc = "A")
# Search on partial float id number
argo_qwmo(qwmo = 49)
# Get data
argo(dac = "meds", id = 4900881, cycle = 127, dtype = "D")
Get daily mean water level data at Fairport, OH (9063053)
coops_search(station_name = 9063053, begin_date = 20150927, end_date = 20150928, product = "daily_mean", datum = "stnd", time_zone = "lst")
#> $metadata
#> $metadata$id
#> [1] "9063053"
#>
#> $metadata$name
#> [1] "Fairport"
#>
#> $metadata$lat
#> [1] "41.7598"
#>
#> $metadata$lon
#> [1] "-81.2811"
#>
#>
#> $data
#>            t       v   f
#> 1 2015-09-27 174.430 0,0
#> 2 2015-09-28 174.422 0,0
Get citation information for rnoaa in R by running
citation(package = 'rnoaa')
Note that some NOAA datasets have changed names:
GSOM (Global Summary of the Month)
GSOY (Global Summary of the Year)
isd() gains new parameters additional to toggle whether the non-mandatory ISD fields (additional + remarks) are parsed and returned & force to toggle whether download new version or use cached version.
isd_read() gains new parameter additional (see description above) (#190)
arc2() to get data from Africa Rainfall Climatology version 2 (#201)
https - changed internal code to use http for coops, swdi, ersst, and tornadoes data sources (#187)
coops_search() to handle requests better: only certain date combinations allowed for certain COOPS products (#213) (#214) thanks @tphilippi!
hoardr package to manage caching in some functions. Will roll out to all functions that cache soon (#191)
GSOY - added to docs and examples of using GSOM and GSOY (#189)
coops_search() to fix time zone problems (#184) thanks @drf5n
ghcnd() - fix some column types that were of inappropriate type before (#211)
ghcnd(): we were coercing factors to integers, which caused nonsense output - first coercing to character now, then integer (#221)
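The factor pitfall behind this fix can be reproduced in base R with made-up values (this is an illustration, not the package's internal code):

```r
# A factor stores integer codes plus character labels
f <- factor(c("250", "0", "33"))

# Coercing the factor directly returns the internal level codes, not the values
codes <- as.integer(f)

# Going through character first recovers the numbers that were intended
values <- as.integer(as.character(f))

print(codes)   # 2 1 3
print(values)  # 250 0 33
```

The codes follow the sorted order of the levels ("0", "250", "33"), which is why the direct coercion looks like nonsense.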
ncdc() function. Added metadata to the package to help parse flags (#199)
isd() now using a new package isdparser to parse NOAA ISD files. We still fetch the file within rnoaa, but the file parsing is done by isdparser (#176) (#177) (#180) thanks @mrubayet for the push
meteo_* functions (#178) thanks @mrubayet
ghcnd() where internal unexported function was not found (#179)
isd_stations_search() to work correctly on Windows (#181) thanks @GuodongZhu
http (#182) thanks @maspotts
seaiceurls() function, just using base R functions.
isd_read() to read ISD output from isd() manually instead of letting isd() read in the data. This is useful when you use isd() but need to read the file in later when it's already cached. (#169)
rnoaa cache files that are downloaded from various NOAA web services. File caching is usually done when data comes from FTP servers. In some of these functions where we cache data, we used to write to your home directory, but have now changed all these functions to write to a proper cache directory in a platform independent way. We determine the cache directory using rappdirs::user_cache_dir(). Note that this may change your workflow if you'd been depending on cached files to be in a particular place on your file system. In addition, the path parameter in the changed functions is now defunct, but you get an informative warning about it (#171)
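A download-once-then-reuse pattern like the one described here can be sketched as follows. The helper cached_fetch() and the stand-in fake_download() are hypothetical names, not rnoaa functions, and the demo writes to a temporary directory so it does not touch a real user cache. The force argument is modelled loosely on the isd() parameter of the same name mentioned earlier in this changelog.

```r
# Hypothetical helper (not rnoaa's code): fetch a value once, then reuse the
# copy stored on disk unless `force = TRUE`.
cached_fetch <- function(key, fetch, cache_dir, force = FALSE) {
  dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
  path <- file.path(cache_dir, paste0(key, ".rds"))
  if (!force && file.exists(path)) {
    return(readRDS(path))
  }
  value <- fetch()
  saveRDS(value, path)
  value
}

# Stand-in for a slow download; counts how often it is actually called
n_downloads <- 0
fake_download <- function() {
  n_downloads <<- n_downloads + 1
  data.frame(station = "A", value = 1)
}

# A temporary directory keeps the demo self-contained; a package would more
# likely use a per-user location such as rappdirs::user_cache_dir("rnoaa")
demo_dir <- file.path(tempdir(), "rnoaa-cache-demo")
unlink(demo_dir, recursive = TRUE)
first  <- cached_fetch("example", fake_download, demo_dir)
second <- cached_fetch("example", fake_download, demo_dir)               # read from disk
third  <- cached_fetch("example", fake_download, demo_dir, force = TRUE) # fetched again
```

After these three calls the stand-in download has run twice: once for the first call and once for the forced refresh.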
storm_data() now returns a tibble/data.frame not inside of a list. We used to return a list with a single slot data with a data.frame, but this was unnecessary.
ghcnd_stations() now outputs a data.frame (tbl_df) by itself, instead of a data.frame nested in a list. This may change how you access data from this function. (#163)
isd() docs that when you get an error similar to Error: download failed for ftp://ftp.ncdc.noaa.gov/pub/data/noaa/1955/011490-99999-1955.gz, the file does not exist on NOAA's ftp servers. If your internet is down, you'll get a different error saying as much (#170)
meteo_*, and are meant to find weather monitors near locations (meteo_nearby_stations), find all monitors within a radius of a location (meteo_distance), calculate the distances between a location and all available stations (meteo_process_geographic_data), calculate the distance between two locations (meteo_spherical_distance), pull GHCND weather data for multiple weather monitors (meteo_pull_monitors), create a tidy GHCND dataset from a single monitor (meteo_tidy_ghcnd), and determine the "coverage" for a station data frame (meteo_coverage()). In addition, vis_miss() added to visualize missingness in a data.frame. See the PR diff against master for all the changes. (#159) Thanks a ton to @geanders et al. (@hrbrmstr, @maelle, @jdunic, @njtierney, @leighseverson, @RyanGan, @mandilin, @jferreri, @cpatrizio88, @ryan-hicks, @Ewen2015, @mgutilla, @hakessler, @rodlammers)
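To make the distance and radius ideas concrete, here is a minimal base R sketch using the standard haversine formula. It is not the implementation behind the meteo_* functions; the helper name spherical_distance_km(), the Earth radius of 6371 km, and the station table are all assumptions made for illustration.

```r
# Hypothetical helper (not rnoaa's implementation): haversine great-circle
# distance in kilometres between points given in decimal degrees.
spherical_distance_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Made-up station table for illustration
stations <- data.frame(
  id  = c("S1", "S2", "S3"),
  lat = c(41.76, 40.00, 41.50),
  lon = c(-81.28, -75.00, -81.70),
  stringsAsFactors = FALSE
)

# Distance from one location to every station, then keep those within 50 km
stations$dist_km <- spherical_distance_km(41.7598, -81.2811, stations$lat, stations$lon)
nearby <- stations[stations$dist_km <= 50, ]
```

Because the helper is vectorised over the second point, one call yields the distance to every station, and a radius search is a plain filter on that column.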
isd_stations_search() changed internal structure. We replaced usage of dplyr::filter for bbox inputs, and lat/long/radius inputs. This speeds up this function significantly. Thanks to @lukas-rokka (#157)
isd_stations() now return tibbles instead of data.frames
isd_stations_search() now caches using
seaiceeurls() function that's used to generate urls for the seaice() function - due to change in NOAA urls (#160)
ghncd_split_vars() to not fail on dplyr::contains call (#156) thanks @lawinslow!
httr version to call encoding explicitly (#135)
isd() function - it's a time consuming task as we have to parse a nasty string of characters line by line - more speed ups to come in future versions (#146)
dplyr::bind_rows() as the former is being deprecated (#152)
isd() function - was failing on some station names that had leading zeros. (#136)
ncdc_stations() - used to allow more than one station id to be passed in, but internally only handled one. This is a restriction due to the NOAA NCDC API. Documentation now shows an example of how to deal with many station ids (#138)
ncdc_*() functions to allow multiple inputs to those parameters where allowed (#139)
ncdc_plot() due to new
argo() functions: a) with new httr, box input of a vector no longer works, now manually make a character vector; b) errant file param being passed into the http request, removed (#155)
argo() (#123) for more, see http://www.argo.ucsd.edu/
coops_search() (#111) for idea from @fmichonneau (#124) for implementing @jsta See http://co-ops.nos.noaa.gov/api/ also (#126) (#128)
rgdal moved to Suggests to make usage easier (#125)
ncdc_plot() - made default breaks to just default to what ggplot2 does, but you can still pass in your own breaks (#131)
gefs_variables() (#106) (#119) thanks @potterzot - he's now an author too
isd_stations() to get ISD station data.
ncdf4 package. Windows binaries weren't available for ncdf4 prior to now. (#117)
isd() function to do transformations of certain variables to give back data that makes more sense (#115)
lawn added in Suggests, used in a few functions.
swdi() function man page that the nldn dataset is available to military users only (#107)
buoy() function to accept character class inputs for the buoyid parameter. The error occurred because matching was not case-insensitive, now works regardless of case (#118)
GET request retries for ghncd functions as some URLs fail unpredictably (#110)
?rnoaa-defunct for more information (#104)
radius parameter removed from ncdc_stations() function (#102), was already removed internally within the function in the last version, now not in the function definition, see also (#98) and (#99)
v1 where empty list not allowed to pass to the
ghcnd_version() (#85) (#86) (#87) (#88) (#94)
isd() functions, including better man file.
callopts parameter changed to
ncdc() requires that users do their own paging - previously this was done internally (#77)
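Doing your own paging usually means requesting fixed-size pages until one comes back short. The sketch below shows that loop in base R against a stand-in data source, so it runs without a token. fetch_all_pages() and fake_fetch() are hypothetical names, and the 1-based offset is an assumption of this sketch. With rnoaa, the fetch function would presumably wrap a call such as ncdc(..., limit = limit, offset = offset)$data, assuming ncdc() accepts an offset argument alongside the limit argument shown in the examples.

```r
# Hypothetical helper: call `fetch_page(limit, offset)` repeatedly until a
# page comes back with fewer than `limit` rows, then bind the pages together.
fetch_all_pages <- function(fetch_page, limit = 25, max_pages = 100) {
  pages <- list()
  offset <- 1  # assumed 1-based position of the first record to return
  for (i in seq_len(max_pages)) {
    page <- fetch_page(limit = limit, offset = offset)
    if (nrow(page) == 0) break
    pages[[length(pages) + 1]] <- page
    if (nrow(page) < limit) break   # short page: nothing left to fetch
    offset <- offset + limit
  }
  do.call(rbind, pages)
}

# Stand-in for a remote service holding 60 records
records <- data.frame(id = 1:60, value = (1:60) * 10)
fake_fetch <- function(limit, offset) {
  idx <- seq_len(nrow(records))
  records[idx >= offset & idx < offset + limit, , drop = FALSE]
}

all_rows <- fetch_all_pages(fake_fetch, limit = 25)
```

The max_pages cap is a safety net so that a misbehaving source cannot keep the loop running indefinitely.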
digest. A few new ones added:
erddap functions now defunct - see the package rerddap, a general purpose R client for ERDDAP servers. (#51) (#73) (#90) (#95)
noaa_stations() used to accept either a bounding box or a point defined by lat/long. The lat/long option was dropped as it required two packages, one of which is a pain to install for many users (#98) (#99)
isd() to get ISD data from NOAA FTP server. (#76)
erddap_table(), while gridded datasets are available via erddap_grid(). Helper function erddap_search() was modified to search for either tabledap or griddap datasets, and erddap_info() gets and prints summary information differently for tabledap and griddap datasets. (#63)
erddap_data() defunct, now as functions erddap_grid(), uses new store parameter which takes a function, either disk(path, overwrite) to store on disk or memory() to store in R memory.
assertthat library removed, replaced with
dplyr-like outputs with a summary of the data.frame, as appropriate.
... This parameter allows you to pass in options to httr::GET to modify curl requests. (#61)
check_key() looks for one of two stored keys, as an environment variable under the name NOAA_KEY, or an option variable under the name noaakey. Environment variables can be set during session like Sys.setenv(VAR = "..."), or stored long term in your .Renviron file. Option variables can be set during session like options(var = "..."), or stored long term in your
print.* functions no longer have public man files, but can be seen via
ncdf4, and new package Suggests:
noaa*() functions for NCDC data changed to
seaice(). When you call the old versions an error is thrown, with a message pointing you to the new function name. See ?rnoaa-defunct.
buoy(), including a number of helper functions.
ncdc() now splits apart attributes. Previously, the attributes were returned as a single column, but now there is a column for each attribute so data can be easily retrieved. Attribute columns differ for each different
buoy() function has been removed from the CRAN version of rnoaa. Install the version with buoy() and associated functions via
noaa_swdi() (function changed to swdi()) gains new parameter filepath to specify path to write a file to if format=shp. Examples added for using format=csv, shp, and kmz.
ncdc() gains new parameter includemetadata. If TRUE, includes metadata, if not, does not, and response should be faster as does not take time to calculate metadata.
noaa_stations() gains new parameter
extent is a vector of length 4 (for a bounding box) then radius is ignored, but if you pass in two points to extent, it is interpreted as a point, and then radius is used as the distance upon which to construct a bounding box. radius default is 10 km.
enddate are often required parameters, and changes were made to help users with this.