Translate CSS Selectors to XPath Expressions

Translates a CSS3 selector into an equivalent XPath expression. This makes CSS selectors usable with the XML package, which can only evaluate XPath expressions. Convenience functions for applying CSS selectors to XML nodes are also provided. This package is a port of the Python package 'cssselect' (<https://cssselect.readthedocs.io/>).



selectr is a package that makes working with HTML and XML documents easier. It translates CSS selectors into XPath expressions so that you can query XML and xml2 documents with familiar selector syntax.

library(selectr)
xpath <- css_to_xpath("#selectr")
xpath
#> [1] "descendant-or-self::*[@id = 'selectr']"

Installation

Install the release version from CRAN

install.packages("selectr")

Install the development version from GitHub

# install.packages("devtools")
devtools::install_github("sjp/selectr")

Overview

The key functions in selectr are:

  • Translate a CSS selector into an XPath expression with css_to_xpath().

  • Query an XML or xml2 document with querySelector() and its variants.

    • Find the first matching node with querySelector().

    • Find all matching nodes with querySelectorAll().

    • Find the first matching node in a namespaced document with querySelectorNS().

    • Find all matching nodes in a namespaced document with querySelectorAllNS().
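As a rough sketch of how the translation step and the query step relate, the string returned by css_to_xpath() can be passed to any XPath evaluator. The example below uses xml2's xml_find_all() and xml_attr(); the sample document and the variable names are illustrative only.

```r
library(selectr)
library(xml2)

# Illustrative document, not part of the package.
doc <- read_xml('<foo><bar><baz id="first"/></bar><baz id="second"/></foo>')

# Translate the selector once; the result is a plain character string.
xpath <- css_to_xpath("baz")

# Evaluate the XPath expression with xml2 and read the id attributes.
nodes <- xml_find_all(doc, xpath)
ids <- xml_attr(nodes, "id")
ids
```

querySelectorAll() wraps these two steps, so calling css_to_xpath() directly is mainly useful when the same expression is reused or handed to other XPath-aware tools.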

Examples

The following example queries the same document with querySelector() and querySelectorAll(), first using the XML package and then using xml2.

library(selectr)
xmlText <- '<foo><bar><baz id="first"/></bar><baz id="second"/></foo>'
 
library(XML)
doc <- xmlParse(xmlText)
querySelector(doc, "baz")
#> <baz id="first"/>
querySelectorAll(doc, "baz")
#> [[1]]
#> <baz id="first"/>
#>
#> [[2]]
#> <baz id="second"/>
#>
#> attr(,"class")
#> [1] "XMLNodeSet"
 
library(xml2)
doc <- read_xml(xmlText)
querySelector(doc, "baz")
#> {xml_node}
#> <baz id="first">
querySelectorAll(doc, "baz")
#> {xml_nodeset (2)}
#> [1] <baz id="first"/>
#> [2] <baz id="second"/>
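For namespaced documents, a minimal sketch with querySelectorNS() might look like the following. It assumes that the namespace argument is a named character vector mapping a prefix to its URI and that the CSS namespace syntax prefix|element is used in the selector; the SVG snippet is illustrative only.

```r
library(selectr)
library(XML)

# Illustrative document with a default namespace.
svgText <- '<svg xmlns="http://www.w3.org/2000/svg"><circle cx="10" cy="10" r="10"/></svg>'
doc <- xmlParse(svgText)

# Assumption: the third argument maps a prefix to a namespace URI.
# The prefix "svg" is chosen here and then used in the selector.
ns <- c(svg = "http://www.w3.org/2000/svg")
node <- querySelectorNS(doc, "svg|circle", ns)
xmlName(node)
```

The prefix does not need to appear in the document itself; it only has to be bound to the same URI as the elements being matched.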


0.4-1 by Simon Potter, 2 years ago


https://sjp.co.nz/projects/selectr


Report a bug at https://github.com/sjp/selectr/issues


Browse source code at https://github.com/cran/selectr


Authors: Simon Potter [aut, trl, cre], Simon Sapin [aut], Ian Bicking [aut]




License: BSD_3_clause + file LICENCE


Imports methods, stringr, R6

Suggests testthat, XML, xml2


Imported by Rcrawler, cliapp, ganalytics, rvest, wikilake.

Suggested by aire.zmvm, unpivotr.

