With this package, users can access the largest collection of public data and statistics on the Internet, featuring about 2.5 billion time series from thousands of sources collected in the 'Knoema' repository, and analyze the data with R. Because data in 'Knoema' is time series data, the 'Knoema' function returns it in formats usable in R such as 'ts', 'xts' or 'zoo'. For more information about the 'Knoema' API, see <https://knoema.com/dev/docs>.
This is the official documentation for Knoema's R package. The package can be used to obtain data from datasets on knoema.com.
To install the development version from GitHub using the devtools package:
install.packages("devtools")
library("devtools")
install_github("Knoema/knoema-r-driver")
To install the most recent package from CRAN:
install.packages("Knoema")
Note: the CRAN version might not reflect the latest changes made to this package. If you are interested in the latest changes, use the version from GitHub.
By default, the package lets you work only with public datasets from knoema.com and limits the number of requests. To make full use of the package, we recommend passing the client.id and client.secret parameters. You can get them after registering on knoema.com, under "My profile - Apps - create new" (or by using an existing application); the direct link is https://knoema.com/user/apps. If that page already lists applications, open one of them; otherwise create a new one. The client id and client secret are shown at the bottom of the application page. How to pass these parameters to the functions is shown below.
There is one function for retrieving series from datasets in R: Knoema. It works with Knoema datasets.
The following quick call retrieves a time series from a dataset:
library("Knoema")
data = Knoema("IMFWEO2017Apr", list(country = "914", subject = "ngdp"))
This example finds all data points for the dataset IMFWEO2017Apr with the selection country = Albania and subject = Gross domestic product, current prices (U.S. dollars), and stores the series in ts format.
Please note that you need to specify every dimension of the dataset and provide a selection for each one. Otherwise, the function returns an error.
To select several elements in a dimension, separate them with semicolons:
data = Knoema("IMFWEO2017Oct", list("country" = "914;512;111", "subject" = "lp;ngdp"))
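When the element codes are already held in R vectors, the semicolon-separated strings used above can be assembled with base R. The following is a minimal sketch; make_selection is a hypothetical helper written for this illustration and is not part of the Knoema package.

```r
# Hypothetical helper (not part of the Knoema package): collapse each
# vector of element codes into the "a;b;c" form used in a selection.
make_selection <- function(...) {
  lapply(list(...), function(codes) paste(codes, collapse = ";"))
}

selection <- make_selection(
  country = c("914", "512", "111"),
  subject = c("lp", "ngdp")
)

# selection$country is "914;512;111" and selection$subject is "lp;ngdp"
str(selection)
```

The resulting list can then be passed as the second argument, for example Knoema("IMFWEO2017Oct", selection).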
When the dimension names of a dataset consist of several words, quote them as in the following example:
data = Knoema("FDI_FLOW_CTRY", list("Reporting country" = "AUS", "Partner country/territory" = "w0", "Measurement principle" = "DI", "Type of FDI" = "T_FA_F", "Type of entity" = "ALL", "Accounting entry" = "NET", "Level of counterpart" = "IMC", "Currency" = "USD"))
In addition to the required selections for dimensions, you can specify the frequency and the time range in the parameters, as in the example below:
data = Knoema("IMFWEO2017Oct", list(country = "914;512;111", subject = "lp;ngdp", frequency = "A", timerange = "2007-2017"))
The package supports the following types: "ts", "xts", "zoo", "DataFrame", "DataTable", "MetaDataFrame" and "MetaDataTable". The default type is "ts". The example below shows how to set the type:
data = Knoema("IMFWEO2017Oct", list(country = "914;512;111", subject = "lp;ngdp"), type = "zoo")
To access private datasets, pass the client.id and client.secret parameters to the function:
data = Knoema("MEI_BTS_COS_2015", list(location = "AT;AU", subject = "BSCI", measure = "blsa", frequency = "Q;M"), type = "DataFrame", client.id = "some client id", client.secret = "some client secret")
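To keep credentials out of scripts, one option is to read them from environment variables. This is only a sketch under stated assumptions: the variable names KNOEMA_CLIENT_ID and KNOEMA_CLIENT_SECRET and the wrapper knoema_private are hypothetical and chosen for this illustration; the Knoema() call uses only the parameters shown in the example above.

```r
# Hypothetical wrapper: reads the credentials from environment variables
# (the variable names are assumptions, e.g. set in ~/.Renviron) and
# forwards them to Knoema().
knoema_private <- function(dataset.id, selection, type = "ts") {
  client_id     <- Sys.getenv("KNOEMA_CLIENT_ID")
  client_secret <- Sys.getenv("KNOEMA_CLIENT_SECRET")

  # Fail early with a clear message if either value is missing.
  if (!nzchar(client_id) || !nzchar(client_secret)) {
    stop("Set KNOEMA_CLIENT_ID and KNOEMA_CLIENT_SECRET first")
  }

  Knoema::Knoema(dataset.id, selection, type = type,
                 client.id = client_id, client.secret = client_secret)
}
```

With both variables set, a call such as knoema_private("MEI_BTS_COS_2015", list(location = "AT;AU", subject = "BSCI", measure = "blsa", frequency = "Q;M"), type = "DataFrame") would correspond to the example above.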
Knoema supports searching by mnemonics. A mnemonic is a unique identifier of a series. Different datasets can contain the same series with the same mnemonic; in this case, the search returns the series that was updated most recently. A series can have several mnemonics at once, and you can search by any of them. An example of searching by mnemonics:
data = Knoema('dataset_id', mnemonics = 'mnemonic1;mnemonic2')
To download data by mnemonics without providing a dataset id, use this example:
data = Knoema(mnemonics = 'mnemonic1;mnemonic2')
You can avoid errors caused by the request limit or by missing access rights by passing correct client.id and client.secret parameters.
Error: "dataset.id should be a string. Can't be NULL"
Error: "dataset.id should be a string. Can't be double"
These errors appear when you pass NULL or a number in place of the dataset id, for example Knoema(NULL) or Knoema(123).
Error: "The function does not support specifying mnemonics and selection in a single call"
This error appears when you use mnemonics and selection in one query. Examples:
Knoema('IMFWEO2017Oct', selection = list(country = '912', subject = 'lp'), mnemonics = 'some_mnemonic')
Knoema(selection = list(country = 'USA'), mnemonics = 'some_mnemonic')
Error: "Dimension with id or name some_name_of_dimension is not found"
This error appears when you use a name that doesn't correspond to the name or id of any existing dimension. Example:
Knoema('IMFWEO2017Oct', list(dimension_not_exist = '914', subject = 'lp'))
Error: "Selection for dimension dimension_name is empty"
This error appears when you use an empty selection for a dimension or when none of the specified elements exist. Examples:
Knoema('IMFWEO2017Oct', list(country = '', subject = 'lp'))
Knoema('IMFWEO2017Oct', list('country' = '914', 'subject' = 'nonexistent_element1; nonexistent_element2'))
Error: "The following frequencies are not correct: list of frequencies" This error appears when you use frequencies that don't correspond to supported formats. Example:
Knoema("IMFWEO2017Oct", list(country = "914", subject = "LP", frequency = "A;nonexistent_frequency"))
Only the following frequency abbreviations are supported: A, H, Q, M, W, D.
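A frequency string can be checked locally before a request is sent. The sketch below is an illustration only: invalid_frequencies is a hypothetical helper, and its exact-match comparison is an assumption that may differ from the validation the package performs.

```r
# Frequency abbreviations listed above as supported.
supported_frequencies <- c("A", "H", "Q", "M", "W", "D")

# Hypothetical helper: return the parts of an "A;Q;M"-style string that
# are not in the supported list (exact, case-sensitive comparison).
invalid_frequencies <- function(frequency) {
  parts <- strsplit(frequency, ";", fixed = TRUE)[[1]]
  setdiff(parts, supported_frequencies)
}

invalid_frequencies("A;Q")                      # character(0)
invalid_frequencies("A;nonexistent_frequency")  # "nonexistent_frequency"
```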
Error: "Requested dataset doesn't exist or you don't have access to it"
This error appears when you use a dataset that doesn't exist or that you don't have access rights to. Example:
Knoema("IMFWEO2017Apr", list(country = "914", subject = "LP"))
This error is returned when the requested dataset doesn't exist. If your dataset exists and you have access to it, check that you have set the client.id and client.secret parameters.
Error: "Underlying data is very large. Can't create visualization"
This error appears when the selection is too large. Try to reduce the selection.
Error: "The specified host incorect_host doesn't exist"
This error can appear when you use a host that doesn't exist. Example:
Knoema("IMFWEO2017Apr", list(country = "914", subject = "LP"), host='knoema_incorect.com')
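In a longer script it can be useful to catch such errors instead of letting them stop execution. The following is a minimal sketch using base R's tryCatch; try_fetch is a hypothetical wrapper, and the sketch assumes the package reports the errors listed above as ordinary R error conditions. The failure is simulated so that the code runs without network access.

```r
# Hypothetical wrapper: run a fetch function and turn any error into a
# message plus a NULL result instead of stopping the script.
try_fetch <- function(fetch) {
  tryCatch(
    fetch(),
    error = function(e) {
      message("Knoema request failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Simulated failure standing in for a real request.
result <- try_fetch(function() stop("Selection for dimension country is empty"))
is.null(result)  # TRUE
```

A real request would be wrapped the same way, for example try_fetch(function() Knoema("IMFWEO2017Oct", list(country = "914", subject = "lp"))).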