'DrugBank' Database XML Parser

This tool is for parsing the 'DrugBank' XML database < http://drugbank.ca/>. The parsed data are then returned in a proper 'R' dataframe with the ability to save them in a given database.


dbparser

The main purpose of the dbparser package is to parse the DrugBank database which is downloadable in XML format from this link. The parsed data can then be explored and analyzed as desired by the user. The dbparser package further provides the facility of saving the parsed data into a given database.

Installation

You can install the released version of dbparser from CRAN with:

install.packages("dbparser")

Example

This is a basic example which shows you how to solve a common problem:

get_xml_db_rows(
              system.file("extdata", "drugbank_record.xml", package = "dbparser")
            )
 
## load drugs data
drugs <- parse_drug()
 
## load drug groups data
drug_groups <- parse_drug_groups()
 
## load drug targets actions data
drug_targets_actions <- parse_drug_targets_actions()

Saving into a database

The parsed data may be saved into a given database. Databases supported by dbparser include MS SQL Server, MySQL and any database supported by DBI package. Following is an example of saving the parsed data into a MySQL database.

library(dbparser)
 
## open a connection to the desired database engine with an already
## created database
 open_db(xml_db_name =  "drugbank.xml", driver = "SQL Server",
 server = "ServerName\\\\SQL2016", output_database = "drugbank")
 
## save 'drugs' dataframe to DB
 parse_drug(TRUE)
 
## save 'drug_groups' dataframe to DB
 parse_drug_groups(TRUE)
 
## save 'drug_targets_actions' dataframe to DB
 parse_drug_targets_actions(TRUE)
 
## finally close db connection 
 close_db()

Exploring the data

Following is an example involving a quick look at a few aspects of the parsed data. First we look at the proportions of biotech and small-molecule drugs in the data.

## view proportions of the different drug types (biotech vs. small molecule)
drugs %>% 
    select(type) %>% 
    ggplot(aes(x = type)) + 
    geom_bar() + 
    guides(fill=FALSE)     ## removes legend for the bar colors

Below, we view the different drug_groups in the data and how prevalent they are.

## view proportions of the different drug types for each drug group
drugs %>% 
    rename(parent_key = primary_key) %>% 
    full_join(drug_groups, by = 'parent_key') %>% 
    select(type, text) %>% 
    ggplot(aes(x = text, fill = type)) + 
    geom_bar() + 
    theme(legend.position= 'bottom') + 
    labs(x = 'Drug Group', 
         y = 'Quantity', 
         title="Drug Type Distribution per Drug Group", 
         caption="created by ggplot") + 
    coord_flip()

Finally, we look at the drug_targets_actions to observe their proportions as well.

## get counts of the different target actions in the data
targetActionCounts <- 
    drug_targets_actions %>% 
    group_by(text) %>% 
    summarise(count = n()) %>% 
    arrange(desc(count))
 
## get bar chart of the 10 most occurring target actions in the data
p <- 
    ggplot(targetActionCounts[1:10,], 
           aes(x = reorder(text,count), y = count, fill = letters[1:10])) + 
    geom_bar(stat = 'identity') +
    labs(fill = 'action', 
         x = 'Target Action', 
         y = 'Quantity', 
         title = 'Target Actions Distribution', 
         subtitle = 'Distribution of Target Actions in the Data',
         caption = 'created by ggplot') + 
    guides(fill = FALSE) +    ## removes legend for the bar colors
    coord_flip()              ## switches the X and Y axes
 
## display plot
p

News

dbparser 1.0.0

  • Added a NEWS.md file to track changes to the package.

Reference manual

It appears you don't have a PDF plugin for this browser. You can click here to download the reference manual.

install.packages("dbparser")

1.0.3 by Mohammed Ali, 6 days ago


https://dainanahan.github.io/dbparser/index.html


Report a bug at https://github.com/Dainanahan/dbparser/issues


Browse source code at https://github.com/cran/dbparser


Authors: Mohammed Ali [aut, cre] , Ali Ezzat [aut]


Documentation:   PDF Manual  


MIT + file LICENSE license


Imports DBI, odbc, purrr, tibble, tidyr, tools, XML

Suggests knitr, rmarkdown, testthat


See at CRAN