Taxonomic Information from Around the Web

Interacts with a suite of web 'APIs' for taxonomic tasks, such as getting database specific taxonomic identifiers, verifying species names, getting taxonomic hierarchies, fetching downstream and upstream taxonomic names, getting taxonomic synonyms, converting scientific to common names and vice versa, and more.


Build Status Build status codecov.io rstudio mirror downloads cran version

taxize allows users to search over many taxonomic data sources for species names (scientific and common) and download up and downstream taxonomic hierarchical information - among other things.

The taxize tutorial is can be found at https://ropensci.org/tutorials/taxize.html

The functions in the package that hit a specific API have a prefix and suffix separated by an underscore. They follow the format of service_whatitdoes. For example, gnr_resolve uses the Global Names Resolver API to resolve species names. General functions in the package that don't hit a specific API don't have two words separated by an underscore, e.g., classification.

You need API keys for Encyclopedia of Life (EOL), Tropicos, IUCN, and NatureServe.

Note that a few data sources require SOAP web services, which are difficult to support in R across all operating systems. These include: Pan-European Species directories Infrastructure and Mycobank. Data sources that use SOAP web services have been moved to taxizesoap at https://github.com/ropensci/taxizesoap.

Currently implemented in taxize

Souce Function prefix API Docs API key
Encylopedia of Life eol link link
Taxonomic Name Resolution Service tnrs "api.phylotastic.org/tnrs" none
Integrated Taxonomic Information Service itis link none
Global Names Resolver gnr link none
Global Names Index gni link none
IUCN Red List iucn link link
Tropicos tp link link
Theplantlist dot org tpl ** none
Catalogue of Life col link none
National Center for Biotechnology Information ncbi none none
CANADENSYS Vascan name search API vascan link none
International Plant Names Index (IPNI) ipni link none
Barcode of Life Data Systems (BOLD) bold link none
National Biodiversity Network (UK) nbn link none
Index Fungorum fg link none
EU BON eubon link none
Index of Names (ION) ion link none
Open Tree of Life (TOL) tol link none
World Register of Marine Species (WoRMS) worms link none
NatureServe natserv link link
Wikipedia wiki link none

**: There are none! We suggest using TPL and TPLck functions in the taxonstand package. We provide two functions to get bullk data: tpl_families and tpl_get.

***: There are none! The function scrapes the web directly.

May be in taxize in the future...

See the newdatasource tag in the issue tracker

Tutorial

For more examples see the tutorial

Installation

Stable version from CRAN

install.packages("taxize")

Development version from GitHub

Windows users install Rtools first.

install.packages("devtools")
devtools::install_github("ropensci/taxize")
library('taxize')

Get unique taxonomic identifier from NCBI

Alot of taxize revolves around taxonomic identifiers. Because, as you know, names can be a mess (misspelled, synonyms, etc.), it's better to get an identifier that a particular data sources knows about, then we can move forth acquiring more fun taxonomic data.

uids <- get_uid(c("Chironomus riparius", "Chaetopteryx"))

Retrieve classifications

Classifications - think of a species, then all the taxonomic ranks up from that species, like genus, family, order, class, kingdom.

out <- classification(uids)
lapply(out, head)
#> $`315576`
#>                 name         rank     id
#> 1 cellular organisms      no rank 131567
#> 2          Eukaryota superkingdom   2759
#> 3       Opisthokonta      no rank  33154
#> 4            Metazoa      kingdom  33208
#> 5          Eumetazoa      no rank   6072
#> 6          Bilateria      no rank  33213
#> 
#> $`492549`
#>                 name         rank     id
#> 1 cellular organisms      no rank 131567
#> 2          Eukaryota superkingdom   2759
#> 3       Opisthokonta      no rank  33154
#> 4            Metazoa      kingdom  33208
#> 5          Eumetazoa      no rank   6072
#> 6          Bilateria      no rank  33213

Immediate children

Get immediate children of Salmo. In this case, Salmo is a genus, so this gives species within the genus.

children("Salmo", db = 'ncbi')
#> $Salmo
#>    childtaxa_id                   childtaxa_name childtaxa_rank
#> 1       1509524  Salmo marmoratus x Salmo trutta        species
#> 2       1484545 Salmo cf. cenerinus BOLD:AAB3872        species
#> 3       1483130               Salmo zrmanjaensis        species
#> 4       1483129               Salmo visovacensis        species
#> 5       1483128                Salmo rhodanensis        species
#> 6       1483127                 Salmo pellegrini        species
#> 7       1483126                     Salmo opimus        species
#> 8       1483125                Salmo macedonicus        species
#> 9       1483124                Salmo lourosensis        species
#> 10      1483123                   Salmo labecula        species
#> 11      1483122                  Salmo farioides        species
#> 12      1483121                      Salmo chilo        species
#> 13      1483120                     Salmo cettii        species
#> 14      1483119                  Salmo cenerinus        species
#> 15      1483118                   Salmo aphelios        species
#> 16      1483117                    Salmo akairos        species
#> 17      1201173               Salmo peristericus        species
#> 18      1035833                   Salmo ischchan        species
#> 19       700588                     Salmo labrax        species
#> 20       237411              Salmo obtusirostris        species
#> 21       235141              Salmo platycephalus        species
#> 22       234793                    Salmo letnica        species
#> 23        62065                  Salmo ohridanus        species
#> 24        33518                 Salmo marmoratus        species
#> 25        33516                    Salmo fibreni        species
#> 26        33515                     Salmo carpio        species
#> 27         8032                     Salmo trutta        species
#> 28         8030                      Salmo salar        species
#> 
#> attr(,"class")
#> [1] "children"
#> attr(,"db")
#> [1] "ncbi"

Downstream children to a rank

Get all species in the genus Apis

downstream(as.tsn(154395), db = 'itis', downto = 'species', verbose = FALSE)
#> $`154395`
#>      tsn parentname parenttsn          taxonname rankid rankname
#> 1 154396       Apis    154395     Apis mellifera    220  species
#> 2 763550       Apis    154395 Apis andreniformis    220  species
#> 3 763551       Apis    154395        Apis cerana    220  species
#> 4 763552       Apis    154395       Apis dorsata    220  species
#> 5 763553       Apis    154395        Apis florea    220  species
#> 6 763554       Apis    154395 Apis koschevnikovi    220  species
#> 7 763555       Apis    154395   Apis nigrocincta    220  species
#> 
#> attr(,"class")
#> [1] "downstream"
#> attr(,"db")
#> [1] "itis"

Upstream taxa

Get all genera up from the species Pinus contorta (this includes the genus of the species, and its co-genera within the same family).

upstream("Pinus contorta", db = 'itis', upto = 'Genus', verbose=FALSE)
#>      tsn                        target
#> 1 183327                Pinus contorta
#> 2 183332 Pinus contorta ssp. bolanderi
#> 3 822698  Pinus contorta ssp. contorta
#> 4 183329 Pinus contorta ssp. latifolia
#> 5 183330 Pinus contorta ssp. murrayana
#> 6 529672 Pinus contorta var. bolanderi
#> 7 183328  Pinus contorta var. contorta
#> 8 529673 Pinus contorta var. latifolia
#> 9 529674 Pinus contorta var. murrayana
#>                                                        commonNames
#> 1               scrub pine,shore pine,tamarack pine,lodgepole pine
#> 2                                            Bolander's beach pine
#> 3                                                               NA
#> 4                         black pine,Rocky Mountain lodgepole pine
#> 5                              tamarack pine,Sierra lodgepole pine
#> 6                                              Bolander beach pine
#> 7                  coast pine,lodgepole pine,beach pine,shore pine
#> 8 tall lodgepole pine,lodgepole pine,Rocky Mountain lodgepole pine
#> 9      Murray's lodgepole pine,Sierra lodgepole pine,tamarack pine
#>      nameUsage
#> 1     accepted
#> 2 not accepted
#> 3 not accepted
#> 4 not accepted
#> 5 not accepted
#> 6     accepted
#> 7     accepted
#> 8     accepted
#> 9     accepted
#> $`Pinus contorta`
#>      tsn parentname parenttsn   taxonname rankid rankname
#> 1  18031   Pinaceae     18030       Abies    180    genus
#> 2  18033   Pinaceae     18030       Picea    180    genus
#> 3  18035   Pinaceae     18030       Pinus    180    genus
#> 4 183396   Pinaceae     18030       Tsuga    180    genus
#> 5 183405   Pinaceae     18030      Cedrus    180    genus
#> 6 183409   Pinaceae     18030       Larix    180    genus
#> 7 183418   Pinaceae     18030 Pseudotsuga    180    genus
#> 8 822529   Pinaceae     18030  Keteleeria    180    genus
#> 9 822530   Pinaceae     18030 Pseudolarix    180    genus
#> 
#> attr(,"class")
#> [1] "upstream"
#> attr(,"db")
#> [1] "itis"

Get synonyms

synonyms("Acer drummondii", db="itis")
#>      tsn             target commonNames    nameUsage
#> 1 183671    Acer drummondii          NA not accepted
#> 2 183672 Rufacer drummondii          NA not accepted
#> $`Acer drummondii`
#>   sub_tsn                    acc_name acc_tsn
#> 1  183671 Acer rubrum var. drummondii  526853
#> 2  183671 Acer rubrum var. drummondii  526853
#> 3  183671 Acer rubrum var. drummondii  526853
#>                      acc_author                        syn_author
#> 1 (Hook. & Arn. ex Nutt.) Sarg. (Hook. & Arn. ex Nutt.) E. Murray
#> 2 (Hook. & Arn. ex Nutt.) Sarg.             Hook. & Arn. ex Nutt.
#> 3 (Hook. & Arn. ex Nutt.) Sarg.     (Hook. & Arn. ex Nutt.) Small
#>                      syn_name syn_tsn
#> 1 Acer rubrum ssp. drummondii   28730
#> 2             Acer drummondii  183671
#> 3          Rufacer drummondii  183672
#> 
#> attr(,"class")
#> [1] "synonyms"
#> attr(,"db")
#> [1] "itis"

Get taxonomic IDs from many sources

get_ids(names="Salvelinus fontinalis", db = c('itis', 'ncbi'), verbose=FALSE)
#> $itis
#> Salvelinus fontinalis 
#>              "162003" 
#> attr(,"match")
#> [1] "found"
#> attr(,"multiple_matches")
#> [1] FALSE
#> attr(,"pattern_match")
#> [1] FALSE
#> attr(,"uri")
#> [1] "http://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=162003"
#> attr(,"class")
#> [1] "tsn"
#> 
#> $ncbi
#> Salvelinus fontinalis 
#>                "8038" 
#> attr(,"class")
#> [1] "uid"
#> attr(,"match")
#> [1] "found"
#> attr(,"multiple_matches")
#> [1] FALSE
#> attr(,"pattern_match")
#> [1] FALSE
#> attr(,"uri")
#> [1] "https://www.ncbi.nlm.nih.gov/taxonomy/8038"
#> 
#> attr(,"class")
#> [1] "ids"

You can limit to certain rows when getting ids in any get_*() functions

get_ids(names="Poa annua", db = "gbif", rows=1)
#> $gbif
#> Poa annua 
#> "2704179" 
#> attr(,"class")
#> [1] "gbifid"
#> attr(,"match")
#> [1] "found"
#> attr(,"multiple_matches")
#> [1] TRUE
#> attr(,"pattern_match")
#> [1] FALSE
#> attr(,"uri")
#> [1] "http://www.gbif.org/species/2704179"
#> 
#> attr(,"class")
#> [1] "ids"

Furthermore, you can just back all ids if that's your jam with the get_*_() functions (all get_*() functions with additional _ underscore at end of function name)

get_ids_(c("Chironomus riparius", "Pinus contorta"), db = 'nbn', rows=1:3)
#> $nbn
#> $nbn$`Chironomus riparius`
#>               guid             scientificName    rank taxonomicStatus
#> 1 NBNSYS0000027573        Chironomus riparius species        accepted
#> 2 NHMSYS0000864966 Damaeus (Damaeus) riparius species        accepted
#> 3 NHMSYS0021059238      Rhizoclonium riparium species        accepted
#> 
#> $nbn$`Pinus contorta`
#>               guid                scientificName    rank taxonomicStatus
#> 1 NBNSYS0000004786                Pinus contorta species        accepted
#> 2 NHMSYS0000494858 Pinus contorta var. murrayana variety        accepted
#> 3 NHMSYS0000494848  Pinus contorta var. contorta variety        accepted
#> 
#> 
#> attr(,"class")
#> [1] "ids"

Common names from scientific names

sci2comm('Helianthus annuus', db = 'itis')
#>      tsn                              target
#> 1  36616                   Helianthus annuus
#> 2 525928      Helianthus annuus ssp. jaegeri
#> 3 525929 Helianthus annuus ssp. lenticularis
#> 4 525930      Helianthus annuus ssp. texanus
#> 5 536095 Helianthus annuus var. lenticularis
#> 6 536096  Helianthus annuus var. macrocarpus
#> 7 536097      Helianthus annuus var. texanus
#>                                                  commonNames    nameUsage
#> 1 annual sunflower,sunflower,wild sunflower,common sunflower     accepted
#> 2                                                         NA not accepted
#> 3                                                         NA not accepted
#> 4                                                         NA not accepted
#> 5                                                         NA not accepted
#> 6                                                         NA not accepted
#> 7                                                         NA not accepted
#> $`Helianthus annuus`
#> [1] "common sunflower" "sunflower"        "wild sunflower"  
#> [4] "annual sunflower"

Scientific names from common names

comm2sci("black bear", db = "itis")
#> $`black bear`
#> [1] "Chiropotes satanas"          "Ursus thibetanus"           
#> [3] "Ursus thibetanus"            "Ursus americanus luteolus"  
#> [5] "Ursus americanus"            "Ursus americanus"           
#> [7] "Ursus americanus americanus"

Lowest common rank among taxa

spp <- c("Sus scrofa", "Homo sapiens", "Nycticebus coucang")
lowest_common(spp, db = "ncbi")
#>             name        rank      id
#> 21 Boreoeutheria below-class 1437010

Coerce codes to taxonomic id classes

numeric to uid

as.uid(315567)
#> [1] "315567"
#> attr(,"class")
#> [1] "uid"
#> attr(,"match")
#> [1] "found"
#> attr(,"multiple_matches")
#> [1] FALSE
#> attr(,"pattern_match")
#> [1] FALSE
#> attr(,"uri")
#> [1] "http://www.ncbi.nlm.nih.gov/taxonomy/315567"

list to uid

as.uid(list("315567", "3339", "9696"))
#> [1] "315567" "3339"   "9696"  
#> attr(,"class")
#> [1] "uid"
#> attr(,"match")
#> [1] "found" "found" "found"
#> attr(,"multiple_matches")
#> [1] FALSE FALSE FALSE
#> attr(,"pattern_match")
#> [1] FALSE FALSE FALSE
#> attr(,"uri")
#> [1] "http://www.ncbi.nlm.nih.gov/taxonomy/315567"
#> [2] "http://www.ncbi.nlm.nih.gov/taxonomy/3339"  
#> [3] "http://www.ncbi.nlm.nih.gov/taxonomy/9696"

Coerce taxonomic id classes to a data.frame

out <- as.uid(c(315567, 3339, 9696))
(res <- data.frame(out))
#>      ids class match multiple_matches pattern_match
#> 1 315567   uid found            FALSE         FALSE
#> 2   3339   uid found            FALSE         FALSE
#> 3   9696   uid found            FALSE         FALSE
#>                                           uri
#> 1 http://www.ncbi.nlm.nih.gov/taxonomy/315567
#> 2   http://www.ncbi.nlm.nih.gov/taxonomy/3339
#> 3   http://www.ncbi.nlm.nih.gov/taxonomy/9696

Contributors

Alphebetical

Road map

Check out our milestones to see what we plan to get done for each version.

Meta

  • Please report any issues or bugs.
  • License: MIT
  • Get citation information for taxize in R doing citation(package = 'taxize')
  • Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

News

taxize 0.9.0

Changes to get_*() functions

  • Added separate documentation file for all get* functions describing attributes and various exception behaviors
  • Some get*() functions had NaN as default rows parameter value. Those all changed to NA
  • Better failure behavior now when non-acceptable rows parameter value given
  • Added in all type checks for parameters across get_*() functions
  • Changed behavior across all get_*() functions to behave the same when ask = FALSE, rows = 1 and ask = TRUE, rows = 1 as these should result in the same outcome. (#627) thanks @zachary-foster !
  • Fixed direct match behavior so that when there's multiple results from the data provider, but no direct match, that the functions don't give back just NA with no inication that there were multiple matches.
  • Please let me know if any of these changes cause problems for your code or package.

NEW FEATURES

  • Change comm2sci() to S3 setup with methods for character, uid, and tsn (#621)
  • iucn_status() now has S3 setup with a single method that only handles output from the iucn_summary() function.

MINOR IMPROVEMENTS

  • Add required key parameter to fxn iucn_id() (#633)
  • imrove docs for sci2comm(): to indicate how to get non-simplified output (which includes what language the common name is from) vs. getting simplified output (#623) thanks @glaroc !
  • Fix to sci2comm() to not be case sensitive when looking for matches (#625) thanks @glaroc !
  • Two additional columns now returned with eol_search(): link and content
  • Improve docs in eol_search() to describe returned data.frame
  • Fix bold_bing() to use new base URL for their API
  • Improved description of the dataset rank_ref, see ?rank_ref

BUG FIXES

  • Fix to downstream() via fix to rank_ref dataset to include "infraspecies" and make "unspecified" and "no rank" requivalent. Fix to col_downstream() to remove properly ranks lower than allowed. (#620) thanks @cdeterman !
  • iucn_summary: changed to using rredlist package internally. sciname param changed to x. iucn_summary_id() now is deprecated in favor of iucn_summary(). iucn_summary() now has a S3 setup, with methods for character and iucn (#622)
  • Added "cohort" to rank_ref dataset as that rank sometimes used at NCBI (from bug reported in ncbi_downstream()) (#626)
  • Fix to sci2comm(), add tryCatch() to internals to catch failed requests for specific pageid's (#624) thanks @glaroc !
  • Fix URL for taxa for NBN taxonomic ids retrieved via get_nbnid() (#632)

taxize 0.8.9

BUG FIXES

  • Remove ape::neworder_phylo object, which is not used anymore in taxize
    (#618) (#619) thanks @ashiklom

taxize 0.8.8

NEW FEATURES

  • New function ncbi_downstream() and now NCBI is an option in the function downstream() (#583) thanks for the push @andzandz11
  • New data source: Wiki*, which includes Wikipedia, Wikispecies, and Wikidata - you can choose which you'd like to search. Uses new package wikitaxa, with contributions from @ezwelty (#317)
  • scrapenames() gains a parameter return_content, a boolean, to optionally return the OCR content as a text string with the results. (#614) thanks @fgabriel1891
  • New function get_iucn() - to get IUCN Red List ids for taxa. In addition, new S3 methods synonyms.iucn and sci2comm.iucn - no other methods could be made to work with IUCN Red List ids as they do no share their taxonomic classification data (#578) thanks @diogoprov

MINOR IMPROVEMENTS

  • bold now an option in classification() function (#588)
  • fix to NBN to use new base URL (#582) ($597)
  • genbank2uid() can give back more than 1 taxon matched to a given Genbank accession number. Now the function can return more than one match for each query, e.g., try genbank2uid(id = "AM420293") (#602) thanks @sariya
  • had to modify cbind() usage to incclude ... for method consistency (#612)
  • tax_rank() used to be able to do only ncbi and itis. Can now do a lot more data sources: ncbi, itis, eol, col, tropicos, gbif, nbn, worms, natserv, bold (#587)
  • Added to classification() docs in a section Lots of results a note about how to deal with results when there are A LOT of them. (#596) thanks @ahhurlbert for raising the issue
  • tnrs() now returns the resulting data.frame in the oder of the names passed in by the user (#613) thanks @wpetry
  • Changes to gnr_resolve() to now strip out taxonomic names submitted by user that are NA, or zero length strings, or are not of class character (#606)
  • Added description of the columns of the data.frame output in gnr_resolve() (#610) thanks @kamapu
  • Added noted in tnrs() docs that the service doesn't provide any information about homonyms. (#610) thanks @kamapu
  • Added parvorder to the taxize rank_ref dataset - used by NCBI - if tax returned with that rank, some functions in taxize were failing due to that rank missing in our reference dataset rank_ref (#615)

BUG FIXES

  • Fix to get_colid() via problem in parsing within col_search() (#585)
  • Fix to gbif_downstream (and thus fix in downstream()): there was two rows with form in our rank_ref reference dataset of rank names, causing > 1 result in some cases, then causing vapply to fail as it's expecting length 1 result (#599) thanks @andzandz11
  • Fix genbank2uid(): was failing when getting more than 1 result back, works now (#603) and fails better now, giving back warnings/error messages that are more informative (see also #602) thanks @sariya
  • Fix to synonyms.tsn(): in some cases a TSN has > 1 accepted name. We get accepted names first from the TSN, then look for synonyms, and hadn't accounted for > 1 accepted name. Fixed now (#607) thanks @tdjames
  • Fixed bug in sci2comm() - was not dealing internally with passing the simplify parameter (#616)

taxize 0.8.4

NEW FEATURES

  • Added WoRMS integration via the new worrms package on CRAN. Adds functions as.wormsid(), get_wormsid(), get_wormsid_(), children.wormsid(), classification.wormsid(), sci2comm.wormsid(), comm2sci.wormsid(), and synonyms.wormsid() (#574) (#579)
  • New functions for NatureServe data, including as.natservid, get_natservid, get_natservid_, and classification.natservid (#126)

BUG FIXES

  • EOL API keys were not passed on to internal functions. fixed now. thanks @dschlaep ! (#576)
  • Fix in rankagg() with respect to vegan package to work with older and new version of vegan - thank @jarioksa (#580) (#581)

taxize 0.8.0

NEW FEATURES

  • New data source added: Open Tree of Life. New functions for the data source added: get_tolid(), get_tolid_(), and as.tolid() (#517)
  • related to above classification() gains new method for TOL data
  • related to above lowest_common() gains new method for TOL data
  • Now using ritis package, an external dependency for ITIS taxonomy data. Note that a large number of ITIS functions were removed, and are now available via the package ritis. However, there are still many high level functions for working with ITIS data (see functions prefixed with itis_), and get_tsn(), classification.tsn(), and similar high level functions remain unchanged. (#525)
  • EUBON has a new API (v1.2). We now interact with that new API version. In addition, eubon() fxn is now eubon_search(), although either still work - though eubon() will be made defunct in the next version of this package. Additional new functions were added: eubon_capabilities(), eubon_children(), and eubon_hierarchy() (#567)
  • lowest_common() function gains two new data source options: COL (Catalogue of Life) and TOL (Tree of Life) (#505)
  • Addded new function synonyms_df() as a slim wrapper around data.table::rbindlist() to make it easy to combine many outputs from synonyms() for a single data source - there is a lot of heterogeneity among data sources in how they report synonyms data, so we don't attempt to combine data across sources (#533)

MINOR IMPROVEMENTS

  • Change NCBI URLs to https from http (#571)

BUG FIXES

  • Fixed bug in tax_name() in which when an invalid taxon was searched for then classification() returned no data and caused an error. Fixed now. (#560) thanks @ljvillanueva for reporting it!
  • Fixed bug in gnr_resolve() in which order of input names to the function was not retained. fixed now. (#561) thanks @bomeara for reporting it!
  • Fixed bug in gbif_parse() - data format changed coming back from GBIF - needed to replace NULL with NA (#568) thanks @ChrKoenig for reporting it!

taxize 0.7.9

NEW FEATURES

  • New vignette: "Strategies for programmatic name cleaning" (#549)

MINOR IMPROVEMENTS

  • get_*() functions now have new attributes to further help the user: multiple_matches (logical) indicating whether there were multiple matches or not, and pattern_match (logical) indicating whether a pattern match was made, or not. (#550) from (#547) discussion, thanks @ahhurlbert ! see also (#551)
  • Change all xml2::xml_find_one() to xml2::xml_find_first() for new xml2 version (#546)
  • gnr_resolve() now retains user supplied taxa that had no matches - this could affect your code, make sure to check your existing code (#558)
  • gnr_resolve() - stop sorting output data.frame, so order of rows in output data.frame now same as user input vector/list (#559)

BUG FIXES

  • Fixed internal fxn sub_rows() inside of most get_*() functions to not fail when the data.frame rows were less than that requested by the user in rows parameter (#556)
  • Fixed get_gbifid(), as sometimes calls failed because we now return numberic IDs but used to return character IDs (#555)
  • Fix to all get_() functions to call the internal sub_rows() function later in the function flow so as not to interfere with taxonomic based filtering (e.g., user filtering by a taxonomic rank) (#555)
  • Fix to gnr_resolve(), to not fail on parsing when no data returned when a preferred data source specified (#557)

taxize 0.7.8

MINOR IMPROVEMENTS

  • Fix to iucn_summary() (#543) thanks @mcsiple
  • Added message for when too many Ids passed in to ncbi_get_taxon_summary() suggesting to break up the ids into chunks (#541) thanks @daattali
  • Fix to itis_acceptname() to accept multiple names (#534) and now gives back same output regardless of whether match found or not (#531)

BUG FIXES

  • Fix to tax_name() for some queries that return no classification data via internal call to classification() (#542) thanks @daattali
  • Another fix for tax_name() (#530) thanks @ibartomeus
  • Fixed docs for rankagg() function, use requireNamespace() in examples to make sure user has vegan installed (#529)

taxize 0.7.6

MINOR IMPROVEMENTS

  • Changed defunct messages in eol_invasive() and gisd_invasive() to point to new location in the originr package. Also, cleaned out code in those functions as not avail. anymore (#494)
  • Access to IUCN taxonomy information is now provided through the newish rredlist package. (Two issues dealing with IUCN problems (#475) (#492))

BUG FIXES

  • Fix to get_gbifid() to use new internal code to provide two ways to search GBIF taxonomy API, either via /species/match or via /species/search, instead of /species/suggest, which we used previously. The suggest route was too coarse. get_gbifid() also gains a parameter method to toggle whether you search for names using /species/match or /species/search. (#528)
  • Fix for col_search() to handle when COL can return a value of missapplied name, which a switch() statement didn't handle yet (#511) thanks @JoStaerk !
  • Fixes for get_colid() and col_search() (#523) thanks @zachary-foster !

taxize 0.7.5

BUG FIXES

  • Fixed bug in the package dependency bold, which fixes taxize::bold_search(), so no actual changes in taxize for this, but take note (#521)
  • Fixed problem in gnr_resolve() where we indexed to data incorrectly. And added tests to account for this problem. Thanks @raredd ! (#519) (#520)
  • Fixed bug in iucn_summary() introduced in last version. iucn_summary() now uses the package rredlist, which requires an API key, and I didn't document how to use the key. Function now allows user to pass the key in as a parameter, and documents how to get a key and save it in either .Renviron or in .Rprofile (#522)

taxize 0.7.4

NEW FEATURES

  • New function lowest_common() for obtaining the lowest common taxon and rank for a given taxon name or ID. Methods so far for ITIS, NCBI, and GBIF (#505)
  • New contributor James O'Donnell (@jimmyodonnell) (via #505)
  • Now importing rredlist rredlist
  • New function iucn_summary_id() - same as iucn_summary(), except takes IUCN IDs as input instead of taxonomic names (#493)
  • All taxonomic rank columns in data.frame's now given back as lower case. This provides consistency, which is important, and many functions use ranks to determine what to do next, so using a consistent case is good.

MINOR IMPROVEMENTS

  • iucn_summary() fixes, long story short: a number of bug fixes, and uses the new IUCN API via the newish package rredlist when IDs are given as input, but uses the old IUCN API when taxonomic names given. Also: gains new parameter distr_details (#174) (#472) (#487) (#488)
  • Replaced XML with xml2 for XML parsing (#499)
  • Fixes to internal use of httr::content to explicitly state encoding="UTF-8" (#498)
  • gnr_resolve() now outputs a column (user_supplied_name) for the exact input taxon name - facilitates merging data back to original data inputs (#486) thanks @Alectoria
  • eol_dataobjects() gains new parameter taxonomy to toggle whether to return any taxonomy details from different data providers (#497)
  • Catalogue of Life URLs changed - updated all appropriate COL functions to use the new URLs (#501)
  • classification() was giving back rank values in mixed case from different data providers (e.g., class vs. Class). All rank values are now all lowercase (#504)
  • Changed number of results returned from internal GBIF search in get_gbfid to 50 from 20. Gives back more results, so more likely to get the thing searched for (#513)
  • Fix to gni_search() to make all output columns character class
  • iucn_id(), tpl_families(), and tpl_get() all gain a new parameter ... to pass on curl options to httr::GET()

BUG FIXES

  • Fixes to get_eolid(): URI returned now always has the pageid, and goes to the right place; API key if passed in now actually used, woopsy (#484)
  • Fixes to get_uid(): when a taxon not found, the "match" attribute was saying found sometimes anyway - that is now fixed; additionally, fixed docs to correctly state that we give back 'NA due to ask=FALSE' when ask = FALSE (#489) Additionally, made this doc fix in other get_*() function docs
  • Fix to apgOrders() function (#490)
  • Fixes to tp_search() which fixes get_tpsid(): Tropicos doesn't allow periods (.) in query strings, so those are URL encoded now; Tropicos doesn't like sub-specific rank names in name query strings, so we warn when those are found, but don't alter user inputs; and improved docs to be more clear about how the function fails (#491) thanks @scelmendorf !
  • Fix to classification(db = "itis") to fail better when no taxa found (#495) thanks @ashenkin !
  • eol_pages() fixes: the EOL API route for this method gained a new parameter taxonomy, this function gains that parameter. That change caused this fxn to fail. Now fixed. Also, parameter subject changed to subjects (#500)
  • Fix to col_search() due to when misapplied name come back as a data slot. There was previously no parser for that type. Now there is, and it works (#512)

taxize 0.7.0

NEW FEATURES

  • Now requires R >= 3.2.1. Good idea to update your R installation anyway (#476)
  • New function ion() for obtaining data from Index of Organism Names (#345)
  • New function eubon() for obtaining data from EU (European Union) BON taxonomy (#466) Note that you may onloy get partial results for some requests as paging isn't implemented yet in the EU BON API (#481)
  • New suite of functions, with prefix fg_*() for obtaining data from Index Fungorum. More work has to be done yet on this data source, but these initial functions allow some Index Fungorum data access (#471)
  • New function gbif_downstream() for obtaining downstream names from GBIF's backbone taxonomy. Also available in downstream(), where you can request downstream names from GBIF, along with other data sources (#414)

MINOR IMPROVEMENTS

  • Note added in docs for all db parameters to warn users that if they provide the wrong db value for the given taxon ID, they can get data back, but it would be wrong. That is, all taxonomic data sources available in taxize use their own unique IDs, so a single ID value can be in multiple data sources, even though the ID refers to different taxa in each data source. There is no way we can think of to prevent this from happening, so be cautious. (#465)
  • A note added to all IUCN functions to warn users that sometimes incorrect data is returned. This is beyond our control, as sometimes IUCN itself gives back incorrect data, and sometimes EOL/Global Names (which we use in some of the IUCN functions) give back incorrect data. (#468) (#473) (#174) (472) (#475)

BUG FIXES

  • Fix to gnr_resolve() to by default capitalize first name of a name string passed to the function. GNR is case sensitive, so case matters (#469)

DEFUNCT

  • phylomatic_tree() and phylomatic_format() are defunct. They were deprecated in recent versions, but are now gone. See the new package brranching for Phylomatic data (#479)

taxize 0.6.6

MINOR IMPROVEMENTS

  • stripauthority argument in gnr_resolve() has been renamed to canonical to better match what it actually does (#451)
  • gnr_resolve() now returns a single data.frame in output, or NULL when no data found. The input taxa that have no match at all are returned in an attribute with name not_known (#448)
  • updated some functions to work with to R >3.2.x
  • In vascan_search() changed callopts parameter to ... to pass in curl options to the request.
  • In ipni_search() changed callopts parameter to ... to pass in curl options to the request. In addition, better http error handling, and added a test suite for this function. (#458)
  • stringsAsFactors=FALSE now used for gibf_parse() (https://github.com/ropensci/taxize/commit/c0c4175d3a0b24d403f18c057258b67d3fbf17f0)
  • Made nearly all column headers and list names lowercase to simplify indexing to elements, as well as combining outputs. (#462)
  • Plantminer API updated to use a new API. Option to search ThePlantList or the Brazilian Flora Checklist (#464)
  • Added more details to the documentation for get_uid() to make more clear how to use the varoious parameters to get the desired result, and how to avoid certain pitfalls (#436)
  • Removed the parameter asdf from the function eol_dataobjects() - now returning data.frame's only.
  • Added some error catching to get_eolid() via tryCatch() to fail better when names not found.
  • Dropped openssl as a package dependency. Not needed anymore because uBio dropped.

BUG FIXES

  • gnr_resolve() failed when no canonical form was found.
  • Fixed gnr_resolve() when no results found when best_match_only=TRUE (#432)
  • Fixed bug in internal function itisdf() to give back an empty data.frame when no results found, often with subspecific taxa. Helps solve errors reported in use of downstream(), itis_downstream(), and gethierarchydownfromtsn() (#459)

NEW FEATURES

  • gnr_resolve() gains new parameter with_canonical_ranks (logical) to choose whether infraspecific ranks are returned or not.
  • New function iucn_id() to get the IUCN ID for a taxon from it's name. (#431)

DEFUNCT

  • All functions that interacted with the taxonomy service uBio are now defunct. Of course we would deprecate first, then make defunct later, to make transition easier, but that is out of our hands. The functions that are defunct are: ubio_classification(), ubio_classification_search(), ubio_id(), ubio_search(), ubio_synonyms(), get_ubioid(), ubio_ping(). In addition, ubio has been removed as an option in the synonyms() function, and references for uBio have been removed from the taxize_cite() utility function. (#449)

taxize 0.6.2

MINOR IMPROVEMENTS

  • rankagg() doesn't depend on data.table anymore (fixes issue with CRAN checks)
  • Replaced RCurl::base64Decode() with openssl::base64_decode(), needed for ubio_*() functions (#447)
  • Importing only functions (via importFrom) used across all imports now (#446). In addition, importFrom for all non-base R pkgs, including graphics, methods, stats and utils packages (#441)
  • Fixes to prevent problems with httr v1, where you can't pass a zero length list to the query parameter in GET(), but can pass NULL (#445)
  • Fixes to all of the gni_*() functions, including code tidying, some DRYing out, and ability to pass in curl options (#444)

BUG FIXES

  • Fixed typo in taxize_cite()
  • Fixed a bug in classification() where numeric IDs as input got converted to itis ids just because they were numeric. Fixed now. (#434)
  • Catalogue of Life (COL) changed from using short numeric codes for taxa to long alphanumeric UUID type ids. This required fixing functions using COL web services (#435)

taxize 0.6.0

NEW FEATURES

  • Added a method for Catalogue of Life for the synonyms function to get name synonyms. (#430)
  • Added datasets apgFamilies and apgOrders. (#418)
  • col_search() gains parameters response to get a terse or full response, and ... to pass in curl options.
  • eol_dataobjects() gains parameter ... to pass in curl options, and parameter returntype renamed to asdf (for "as data.frame").
  • ncb_get_taxon_summary() gains parameter ... to pass in curl options.
  • The children() function gains the rows parameter passed on to get_*() functions, supported for data sources ITIS and Catalogue of Life, but not for NCBI.
  • The upstream() function gains the rows parameter passed on to get_*() functions, supported for both data sources ITIS and Catalogue of Life.
  • The classification() function gains the rows parameter passed on to get_*() functions, for all sources used in the function.
  • The downstream() function gains the rows parameter passed on to get_*() functions, for all sources used in the function.
  • Nearly all taxonomic ID retrieveal functions (i.e., get_*()) gain new parameters to help filter results (e.g., division, phylum, class, family, parent, rank, etc.). These parameters allow direct matching or regex filters (e.g., .a to match any character followed by an a). (#410) (#385)
  • Nearly all taxonomic ID retrieveal functions (i.e., get_*()) now give back more information (mostly higher taxonomic data) to help in the interactive decision process. (#327)
  • New data source added to synonyms() function: Catalogue of Life. (#430)

MINOR IMPROVEMENTS

  • vegan package, used in class2tree() function, moved from Imports to Suggests. (#392)
  • Improved taxize_cite() a lot - get URLs and sometimes citation information for data sources available in taxize. (#270)
  • Fixed typo in apg_lookup() function. (#422)
  • Fixed documentation in apg_families() function. (#418)
  • Across many functions, fixed support for passing in curl options, and added examples of curl option use.
  • callopts parameter in eol_pages(), eol_search(), gnr_resolve(), tp_accnames(), tp_dist(), tp_search(), tp_summary(), tp_synonyms(), ubio_search() changed to ...
  • accepted parameter in get_tsn() changed to FALSE by default. (#425)
  • Default value of db parameter in resolve() changed to gnr as tnrs is often quite slow.
  • General code tidying across the package to make code easier to read.

BUG FIXES

  • Fixed encoding issues in tpl_families() and tpl_get(). (#424)

DEPRECATED AND DEFUNCT

  • The following functions that were deprecated are now defunct (no longer available): ncbi_getbyname(), ncbi_getbyid(), ncbi_search(), eol_invasive(), gisd_isinvasive(). These functions are available in the traits package. (#382)
  • phylomatic_tree() is deprecated, but will be defunct in a upcoming version.

taxize 0.5.2

NEW FEATURES

  • New set of functions to ping each of the APIs used in taxize. E.g., itis_ping() pings ITIS and returns a logical, indicating if the ITIS API is working or not. You can also do a very basic test to see whether content returned matches what's expected. (#394)
  • New function status_codes() to get vector of HTTP status codes. (#394)

MINOR IMPROVEMENTS

  • Removed startup message.
  • Now can pass in curl options to itis_ping(), and all *_ping() functions.

BUG FIXES

  • Moved examples that were in \donttest into \dontrun.

taxize 0.5.0

NEW FEATURES

  • New function genbank2uid() to get a NCBI taxonomic id (i.e., a uid) from a either a GenBank accession number of GI number. (#375)
  • New function get_nbnid() to get a UK National Biodiversity Network taxonomic id (i.e., a nbnid). (#332)
  • New function nbn_classification() to get a taxonomic classification for a UK National Biodiversity Network taxonomic id. Using this new function, generic method classification() gains method for nbnid. (#332)
  • New function nbn_synonyms() to get taxonomic synonyms for a UK National Biodiversity Network taxonomic id. Using this new function, generic method synonyms() gains method for nbnid. (#332)
  • New function nbn_search() to search for taxa in the UK National Biodiversity Network. (#332)
  • New function ncbi_children() to get direct taxonomic children for a NCBI taxonomic id. Using this new function, generic method children() gains method for ncbi. (#348) (#351) (#354)
  • New function upstream() to get taxa upstream of a taxon. E.g., getting families upstream from a genus gets all families within the one level higher up taxonomic class than family. (#343)
  • New suite of functions as.*() to coerce numeric/alphanumeric codes to taxonomic identifiers for various databases. There are methods on this function for each of itis, ncbi, tropicos, gbif, nbn, bold, col, eol, and ubio. By default as.*() funtions make a quick check that the identifier is a real one by making a GET request against the identifier URI - this can be toggle off by setting check=FALSE. There are methods for returning itself, character, numeric, list, and data.frame. In addition, if the as.*.data.frame() function is used, a generic method exists to coerce the data.frame back to a identifier object. (#362)
  • New suite of functions named, for example, get_tsn_() (the underscore is the only different from the previous function name). These functions don't do the normal interactive process of prompts that e.g., get_tsn() do, but instead returned a list of all ids, or a subset via the rows parameter. (#237)
  • New function ncbi_get_taxon_summary() to get taxonomic name and rank for 1 or more NCBI uid's. (#348)

MINOR IMPROVEMENTS

  • assertthat removed from package imports, replaced with stopifnot(), to reduce dependency load. (#387)
  • eol_hierarchy() now defunct (no longer available) (#228) (#381)
  • tp_classifcation() now defunct (no longer available) (#228) (#381)
  • col_classification() now defunct (no longer available) (#228) (#381)
  • New manual page listing all the low level ITIS functions for which their manual pages are not shown in the package index, but are available if you to ?fxn-name.
  • All get_*() functions gain a new parameter rows to allow selection of particular rows. For example, rows=1 to select the first row, or rows=1:3 to select rows 1 through 3. (#347)
  • classification() now by default returns taxonomic identifiers for each of the names. This can be toggled off by the return_id=FALSE. (#359) (#360)
  • Simplification of many higher level functions to use switch() on the db parameter, which helps give better error message when a db value is not possible or spelled incorrectly. (#379)

BUG FIXES

  • Lots of reduction of redundancy in internal functions. (#378)

taxize 0.4.0

NEW FEATURES

  • New data sources added to taxize: BOLD (Biodiversity of Life Database). Three more data sources were added (World Register of Marine Species (WoRMS), Pan-European Species directories Infrastructure (PESI), and Mycobank), but are not available on CRAN. Those three data sources provide data via SOAP web services protocol, which is hard to support in R. Thus, those sources are available on Github. See https://github.com/ropensci/taxize#version-with-soap-data-sources
  • New function children(), which is a single interface to various data sources to get immediate children from a given taxonomic name. (#304)
  • New functions added to search BOLD data" bold_search() that searches for taxa in the BOLD database of barcode data; get_boldid() to search for a BOLD taxon identifier. (#301)
  • New function get_ubioid() to get a uBio taxon identifier. (#318)
  • New function started (not complete yet) to get suggested citations for the various data sources available in taxize: taxize_cite(). (#270)

MINOR IMPROVEMENTS

  • Using jsonlite instead of RJSONIO throughout the taxize.
  • get_ids() gains new option to search for a uBio ID, in addition to the others, itis, ncbi, eol, col, tropicos, and gbif.
  • Fixed documentation for stripauthority parameter gnr_resolve(). (#325)
  • iplant_resolve() now outputs data.frame structure instead of a list. (#306)
  • Clarified parameter seqrange in ncbi_getbyname() and ncbi_search() (#328)
  • synonyms() gains new data source, can now get synonyms from uBio data source (#319)
  • vascan_search() giving back more useful results now.

BUG FIXES

  • Added error catching for when URI is too long, i.e., when too many names provided (#329) (#330)
  • Various fixes to tnrs() function, including more meaningful error messages on failures (#323) (#331)
  • Fixed bug in getpublicationsfromtsn() that caused function to fail on data.frame's with no data on name assignment (#297)
  • Fixed bug in sci2comm() that caused fxn to fail when using db=itis sometimes (#293)
  • Fixes to scrapenames(). Sending a text blob via the text parameter now works.
  • Fixes to resolve() so that function now works for all 3 data sources. (#337)

taxize 0.3.0

NEW FEATURES

  • New function iplant_resolve() to do name resolution using the iPlant name resolution service. Note, this is different from http://taxosaurus.org/ that is wrapped in the tnrs() function.
  • New function ipni_search() to search for names in the International Plant Names Index (IPNI).
  • New function resolve() that unifies name resolution services from iPlant's name resolution service (via iplant_resolve()), Taxosaurus' TNRS (via tnrs()), and GNR's name resolution service (via gnr_resolve()).
  • All get_*() functions how returning a new uri attribute that is a link to the taxon on on the web. If NA is given back (e.g. nothing found), the uri attribute is blank. You can go directly to the uri in your default browser by doing, for example: browseURL(attr(result, "uri")).
  • get_eolid() now returns an attribute provider because EOL collates taxonomic data form a lot of sources, then gives back IDs that are internal EOL ids, not those matching the id of the source they pull from. This should help with provenance, and should help if there is confusion about why the id givenb back by this function does not match that from the original source.
  • Within the get_tsn() function, now using the function itis_terms(), which gives back the accepted status of the taxa. This allows a new parameter in the function (accepted, logical) that allows user to say give back only accepted status names (accepted=TRUE), or to give back all names (accepted=FALSE).
  • gnr_resolve() gains two new parameters best_match_only (logical, to return best match only) and preferred_data_sources (to return preferred data sources) and callopts to pass in curl options.
  • tnrs(), tp_accnames(), tp_refs(), tp_summary(), and tp_synonyms() gain new parameter callopts to pass in curl options.

MINOR IMPROVEMENTS

  • class2tree() can now handle NA in classification objects.
  • classification.eolid() and classification.colid() now return the submitted name along with the classification.
  • Changed from CC0 to MIT license.
  • Updated citation to have both the taxize paper in F1000 Research and the package citation.
  • Sped up some functions by removing internal use of plyr functions, see #275.
  • Removed dependency on rgbif - copied into this package a few functions needed internally. This avoids users having to install GDAL binary.
  • Added in verbose parameter to many more functions to allow suppression of help messages.
  • In most functions when using httr, now manually parsing JSON to a list then to another data format instead of allowing internal httr parsing - in addition added checks on content type and encoding in many functions.
  • Added match.arg iternally to get_ids() for the db parameter so that a) unique short abbreviations of possible values are possible, and b) gives a meaningful warning if unsupported values are given.
  • Most long-named ITIS functions (e.g., getexpertsfromtsn, getgeographicdivisionsfromtsn) gain parameter curlopts to pass in curl options.
  • Added stringsAsFactors=FALSE to all data.frame creations to eliminate factor variables.

BUG FIXES

  • classification.gbifid() did not return the correct result when taxon not found.
  • Fixed bugs in many functions, see #245, #248, #254, #277.
  • classification() used to fail when it was passed a subset of a vector of ids, in which case the class information was stripped off. Now works (#284)

taxize 0.2.2

NEW FEATURES

  • itis_downstream() and col_downstream() functions accessible now from a single function downstream() (https://github.com/ropensci/taxize/issues/238)

MINOR IMPROVEMENTS

  • Added a extension function classification() for the gbif id class, classification.gbifid() (https://github.com/ropensci/taxize/issues/241)

BUG FIXES

  • Added some error catching to class2tree function. (https://github.com/ropensci/taxize/issues/240)
  • Fixed problems in cbind.classification() and rbind.classification() where the first column of the ouput was a useless column name, and all column names now lower case for consistency. (https://github.com/ropensci/taxize/issues/243)
  • classification() was giving back IDS instead of taxon names on the list element names, fixed this so hopefully all are giving back names. (https://github.com/ropensci/taxize/issues/243)
  • Fixed bugs in col_*() functions so they give back data.frame's now with character class columns instead of factors, damned stringsAsFactors! (https://github.com/ropensci/taxize/issues/246)

taxize 0.2.0

MINOR IMPROVEMENTS

  • New dataset: Lookup-table for family, genus, and species names for ThePlantList under dataset name "theplantlist".
  • get_ids() now accepts "gbif" as an option via use of get_gbifid().
  • Changed function itis_phymat_format() to phylomatic_format() - this function gets the typical Phylomatic format name string "family/genus/genus_epithet"

BUG FIXES

  • Updated gbif_parse() base url to the new one (http://api.gbif.org/v1/parser/name).
  • Fixes to phylomatic_tree().

NEW FEATURES

  • New function class2tree() to convert list of classifications to a tree. For example, go from a list of classifications from the function classification() to this function to get a taxonomy tree in ape phylo format.
  • New function get_gbfid() to get a Global Biodiversity Information Facility identifier. This is the ID GBIF uses in their backbone taxonomy.
  • classification() outputs gain rbind() and cbind() generic methods that act on the various outputs of classification() to bind data width-wise, or column-wise, respectively.

taxize 0.1.9

MINOR IMPROVEMENTS

  • Updated ncbi_search() to retrieve more than a max of 500, slightly changed column headers in output data files, and if didn't before, now accepts a vector/list of taxonomic names instead of just one name.

taxize 0.1.8

NEW FEATURES

  • We attempted to make all ouput column names lowercase, and to increase consistency across column names in outputs from similar functions.
  • New function scrapenames() uses the Global Names Recognition and Discovery service to extract taxonomic names from a web page, pdf, or other document.
  • New function vascan_search() to search the CANADENSYS Vascan names database.

BUG FIXES

  • Fixed bugs in get_tpsid(), get_eolid() and eol_pages().
  • phylomatic_tree() bugs fixed.

MINOR IMPROVEMENTS

  • classification() methods were simplified. Now classification() is the workhorse for every data-source. col_classification(), eol_hierarchy(), and tp_classification() are now deprecated and will be removed in the next taxize version.
  • classification() gains four new arguments: start, checklist, key, and callopts.
  • comm2sci() gains argument simplify to optionally simplify output to a vector of names (TRUE by default).
  • get_eolid() and get_tpsid() both gain new arguments key to specify an API key, and ... to pass on arguments to eol_search().
  • Added ncbi as a data source (db="ncbi") in sci2comm().
  • tax_agg() now accepts a matrix in addition to a data.frame. Thanks to @tpoi
  • tnrs() changes: Using httr instead of RCurl; now forcing splitting up name vector when long. Still issues when using POST requests (getpost="POST") wherein a request sent with 100 names only returns 30 for example. Investigating this now.

NOTES

  • Function name change: tp_acceptednames() now tp_accnames().
  • Function name change: tp_namedistributions() now tp_dist().
  • Function name change: tp_namereferences() now tp_refs().
  • Internal ldfast() function changed name to taxize_ldfast() to avoid namespace conflicts with similar function in another package.
  • Three functions now with ncbi_* prefix: get_seqs() is now ncbi_getbyname(); get_genes() is now ncbi_getbyid(); and get_genes_avail() is now ncbi_search().

taxize 0.1.5

NEW FEATURES

  • classification() gains extension method classification.ids() to accept output from get_ids() - which attempts to get a taxonomic hierarchy from each of the taxon identifiers with the output from get_ids().
  • synonyms() gains extension method synonyms.ids() to accept output from get_ids() - which attempts to get synonyms from each of the taxon identifiers with the output from get_ids().

taxize 0.1.4

NEW FEATURES

  • Reworked functions that interact with the ITIS API so that lower level functions were grouped together into higher level functions. All the approximately 50 lower level functions are still exported but are not included in the index help file (due to @keywords internal for each fxn) - but can still be used normally, and man files are avaialable at ?functionName.
  • New function itis_ping() to check if the ITIS API service is up, similar to eol_ping() for the EOL API.
  • New function itis_getrecord() to get a partial or full record, using a TSN or lsid.
  • New function itis_refs() to get references associated with a TSN.
  • New function itis_kingdomnames() to get all kingdom names, or kingdom name for a TSN.
  • New function itis_lsid() to get a TSN from an lsid, get a partial or full record from an lsid.
  • New function itis_native() to get status as native, exotic, etc. in various geographic regions.
  • New function itis_hierarchy() to get full hierarchy, or immediate up or downstream hierarchy.
  • New function itis_terms() to get tsn's, authors, common names, and scientific names from a given query.
  • New function sci2comm() to get common (vernacular) names from input scientific names from various data sources.
  • New function comm2sci() to get scientific names from input common (vernacular) names from various data sources.
  • New function get_ids() to get taxonomic identifiers across all sources.

MINOR IMPROVEMENTS

  • itis_taxrank() now outputs a character, not a factor; loses parameter verbose, and gains ..., which passes on further arguments to gettaxonomicranknamefromtsn.
  • tp_synonyms(), tp_summary(), plantminer(), itis_downstream(), gisd_isinvasive(), get_genes_avail(), get_genes(), eol_invasive(), eol_dataobjects(), andn tnrs() gain parameter verbose to optionally suppress messages.
  • phylomatic_tree() format changed so that names are passed in normall (e.g., Poa annua) instead of the slashpath format (family/genus/genus_species). Also, taxaformat parameter dropped.
  • itis_acceptname() gains ... to pass in further arguments to getacceptednamesfromtsn()
  • tp_namedistributions() loses parameter format.
  • get_tsn() and get_uid() return infomation about match as attribute.
  • clarified iucn-documentation

BUG FIXES

  • Fixed bug in synonyms() so that further arguments can be passed on to get_tsn() to suppress messages.
  • Removed test for ubio_classification_search(), a function that isn't operational yet.

taxize 0.1.1

NEW FEATURES

  • New functions added just like get_uid()/get_tsn() but for EOL, Catalogue of Life, and Tropicos, see get_eolid(), get_colid(), and get_tpsid(), respectively.
  • classification() methods added for EOL, Catalogue of Life, and Tropicos, see functions classification.eolid(), classification.colid(), and classification.tpsid() respectively.
  • New function col_search() to search for names in the Catalogue of Life.
  • User can turn off interactive mode in get_* functions. All get_* functions gain an ask argument, if TRUE (default) a user prompt is used for user to select which row they want, if FALSE, NA is returned when many results available; and added tests for the new argument. Affects downstream functions too.
  • New function eol_invasive() to search EOL collections of invasive species lists.
  • New function tp_search() to search for a taxonomic IDs from Tropicos.
  • New function tp_classification() to get a taxonomic hierarchy from Tropicos.
  • New function gbif_parse() to parse scientific names into their components, using the GBIF name parser API.
  • New function itis_searchcommon() to search for common names across both searchbycommonnamebeginswith, and searchbycommonnameendswith.

BUG FIXES

  • tax_name() and other function broke, because get_tsn() and get_uid() returned wrong value when a taxon was not found. Fixed.

MINOR IMPROVEMENTS

  • Added tests for new classification() methods for EOL, COL, and Tropicos.
  • Added tests for new functions tp_search() and tp_classification().

NOTES

  • Moved tests from inst/tests to tests/testthat according to new preferred location of tests.
  • Updated CITATION in inst/ with our F1000Research paper info.
  • Package repo name on Github changed from taxize_ to taxize - remember to use "taxize" in install_github() calls now instead of "taxize_"

taxize 0.1.0

NEW FEATURES

  • New function tpl_families() to get data.frame of families from The Plantlist.org site.
  • New function names_list() to get a random vector of species names using the
  • Added two new data sets, plantGenusNames.RData and plantNames.RData, to be used in names_list().
  • New function ldfast(), a replacement function for plyr::ldply that should be faster in all cases.
  • Changed API key names to be more consistent, now tropicosApiKey, eolApiKey, ubioApiKey, and pmApiKey - do change these in your .Rprofile if you store them there.
  • Added a startup message.

MINOR IMPROVEMENTS

  • Across most functions, removed dependencies on plyr, using ldfast() instead, for increased speed.
  • Across most functions, changed from using RCurl to using httr.
  • Across most functions, stop_for_status() now used directly after Curl call to check the http status code, stoping the function if appropriate code found.
  • Many functions changed parameter ... to callopts, which passes on additional Curl options, with default an empty list (list()), which makes function testing easier.
  • eol_search() gains parameters page, exact, filter_tid, filter_heid, filter_by_string, matching, cache_ttl, and callopts.
  • eol_hierarchy() gains parameter callopts, and loses parameter usekey (always using API key now).
  • eol_pages() gains parameters images, videos, sounds, maps, text, subject, licenses, details, common_names, synonyms, references, vetted, cache_ttl, and callopts.
  • gni_search(): parameter url lost, is defined inside the function now, and .Rd file gains url references.
  • phylomatic_tree() now checks to make sure family names were found for input taxa. If not, the function stops with message informing this.
  • tpl_get() updated with fixes/improvements by John Baumgartner - now gets taxa from all groups, whereas only retrieved from Angiosperms before. In addition, csv files from The Plantlist.org are downloaded directly rather than read into R and written out again.
  • tpl_search() now checks for missing data or errors, and stops function with error message.

BUG FIXES

  • capwords() fxn changed to taxize_capwords() to avoid namespace conflicts with other packages with a similar function.
  • ubio_namebank() was giving back base64 encoded data, now decoded appropriately.

NOTES

  • Added John Baumgartner as an author.

taxize 0.0.6

NEW FEATURES

  • tax_name() accepts multiple ranks to query.
  • tax_name() accepts vectors as input.
  • tax_name() has an option to query both, NCBI and ITIS, in one call and return the union of both.
  • new extractor function for iucn_summary(): iucn_status(), to extract status from iucn-objects.
  • tax_agg(): A function to aggregate species data to given taxonomic rank.
  • tax_rank(): Get taxonomic rank for a given taxon name.
  • classification() accepts taxon names as input and returns a named list.
  • new function apg_lookup() looks up APGIII taxonomy and replaces family names
  • new function gni_parse() parses scientific names using EOl's name parser API
  • new function iucn_getname() is a utility to find IUCN names using the EOL API
  • new function rank_agg() aggregates data by a given taxonomic rank
  • new data table apg_families
  • new data table apg_orders
  • gnr_resolve() gains new arguments gnr_resolvee_once, with_context, stripauthority, highestscore, and http, and loses returndf (that is, a data.frame is returned by default)
  • gni_search() gains parameter parse_names

MINOR IMPROVEMENTS

  • tnrs() parameter getpost changed from default of 'GET' to 'POST'
  • Across all functions, the url parameter specifying an API endpoint was moved inside of functions (i.e., not available as a parameter in the function call)
  • gnr_datasources() parameter todf=TRUE by default now, returning a data.frame
  • col_classification() minor formatting improvements

BUG FIXES

  • iucn_summary() returns no information about population estimates.
  • get_tsn() raised a warning in specific situations.
  • tax_name() did not work for multiple ranks with ITIS.
  • fixed errors in getfullhierarchyfromtsn()
  • fixed errors in gethierarchydownfromtsn()
  • fixed errors in getsynonymnamesfromtsn()
  • fixed errors in searchforanymatch()
  • fixed errors in searchforanymatchedpage()

NOTES

  • Removed dependency to NCBI2R
  • Improvements of documentation
  • Citation added

taxize 0.0.5

BUG FIXES

  • removed tests for now until longer term fix is made so that web APIs that are temporarily down don't cause tests to fail.

taxize 0.0.4

BUG FIXES

  • added R (>= 2.15.0) so that package tests don't fail on some systems due to paste0()
  • remove test for ubio_namebank() function as it sometimes fails

taxize 0.0.3

BUG FIXES

  • iucn_summary() does not break when API returns no information.
  • tax_name() returns NA when taxon is not found on API.
  • get_uid() asks for user input when more then one UID is found for a taxon.
  • changed base URL for phylomatic_tree(), and associated parameter changes

NEW FEATURES

  • added check for invasive species status for a set of species from GISD database via gisd_isinvasive().
  • Further development with the EOL-API: eol_dataobjects().
  • added Catalogue of Life: col_classification(), col_children(), and col_downstream().
  • new fxn get_genes(), retrieve gene sequences from NCBI by accession number.
  • new functions to interact with the Phylotastic name resolution service: tnrs_sources() and tnrs()
  • Added unit tests

DEPRECATED AND DEFUNCT

  • itis_name() fxn deprecated - use tax_name() instead

taxize 0.0.2

BUG FIXES

  • changed paste0 to paste to avoid problems on certain platforms.
  • removed all tests until the next version so that tests will not fail on any platforms.
  • plyr was missing as import for iucn_summary fxn.

NEW FEATURES

  • added NEWS file.

taxize 0.0.1

NEW FEATURES

  • released to CRAN

Reference manual

It appears you don't have a PDF plugin for this browser. You can click here to download the reference manual.

install.packages("taxize")

0.9.3 by Scott Chamberlain, 2 months ago


https://github.com/ropensci/taxize


Report a bug at https://github.com/ropensci/taxize/issues


Browse source code at https://github.com/cran/taxize


Authors: Scott Chamberlain [aut, cre] (<https://orcid.org/0000-0003-1444-9135>), Eduard Szoecs [aut], Zachary Foster [aut], Zebulun Arendsee [aut], Carl Boettiger [ctb], Karthik Ram [ctb], Ignasi Bartomeus [ctb], John Baumgartner [ctb], James O'Donnell [ctb], Jari Oksanen [ctb], Bastian Greshake Tzovaras [ctb], Philippe Marchand [ctb], Vinh Tran [ctb]


Documentation:   PDF Manual  


Task views: Phylogenetics, Especially Comparative Methods


MIT + file LICENSE license


Imports graphics, methods, stats, utils, crul, httr, xml2, jsonlite, reshape2, stringr, plyr, foreach, ape, zoo, bold, data.table, rredlist, rotl, ritis, tibble, worrms, natserv, wikitaxa

Suggests testthat, knitr, vegan


Imported by TR8, bdvis, brranching, camtrapR, metacoder, myTAI, originr, rusda, taxa, taxlist, traits.

Depended on by MonoPhy, aptg.

Suggested by RNeXML, binomen, mapr, rbison, rnoaa, spocc.

Enhanced by rerddap.


See at CRAN