Convert Spatial Data Using Tidy Tables

Tools to convert from specific formats to more general forms of spatial data. Using tables to store the actual entities present in spatial data provides flexibility, and the functions here deliberately minimize the level of interpretation applied, leaving that for specific applications. Includes support for simple features, round-trip for 'Spatial' classes and long-form tables, analogous to 'ggplot2::fortify'. There is also a more 'normal form' representation that decomposes simple features and their kin to tables of objects, parts, and unique coordinates.


Spbabel provides simple tools to flip between specialist, bespoke formats and tabular, generic forms of spatial data. This package aims to assist in the ongoing development of tools for spatial data in R. It is really a set of tools for developing other tools, but do see the examples in the vignettes.

The key functions for simple decomposition and recomposition are sptable and sp, which provide the identified coordinates in a single data frame.

A more useful and extensible decomposition is provided by map_table, which provides data frames of the object data, the parts data and the coordinates data as separate tables linked by ID. See http://rpubs.com/cyclemumner/sc-rationale for more on the rationale. This table-based framework allows easy transfer between the different spatial representations in R, using generic database-ready tables.
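
As a rough illustration of the multi-table idea, the base R sketch below splits a small long-form coordinate table into object, branch, and unique-vertex tables linked by ID. The toy data, the table names, and the vertex_ column are assumptions made for this sketch; it is not the implementation behind map_table, whose actual output may differ.

```r
# Toy long-form table: two square rings, objects "a" and "b",
# sharing the edge from (1, 0) to (1, 1). All names are illustrative.
long <- data.frame(
  object_ = c(rep("a", 5), rep("b", 5)),
  branch_ = c(rep("a1", 5), rep("b1", 5)),
  order_  = c(1:5, 1:5),
  x_      = c(0, 1, 1, 0, 0,  1, 2, 2, 1, 1),
  y_      = c(0, 0, 1, 1, 0,  0, 0, 1, 1, 0),
  stringsAsFactors = FALSE
)

# Object table: one row per object.
objects <- unique(long["object_"])

# Branch table: one row per part, linked to its parent object.
branches <- unique(long[c("object_", "branch_")])

# Vertex table: unique coordinates only, each given its own ID.
vertices <- unique(long[c("x_", "y_")])
vertices$vertex_ <- paste0("v", seq_len(nrow(vertices)))

# Link table: which vertex each branch visits, and in what order.
# match() on a pasted key looks up the vertex ID for each original row.
key_long <- paste(long$x_, long$y_)
key_vert <- paste(vertices$x_, vertices$y_)
links <- data.frame(
  branch_ = long$branch_,
  order_  = long$order_,
  vertex_ = vertices$vertex_[match(key_long, key_vert)],
  stringsAsFactors = FALSE
)

nrow(long)      # coordinate rows in the long form
nrow(vertices)  # unique vertices after normalizing
```

Joining links back to vertices and sorting by branch_ and order_ recovers the original coordinate sequence, so nothing is lost by storing each shared vertex once.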

Currently supported:

  • sf
  • sp
  • rangl
  • trip
  • rgl

In progress:

  • everything else
  • especially the other trajectory / animal tracking packages

Tracking packages such as adehabitatLT, trajectories, crawl, move, and dozens of others contain objects that could be coerced in a straightforward way; see the Spatio Temporal Task View for more (in the Moving Objects / Trajectories section). The hyperframe in spatstat is another example, and another aim is to follow up Edzer's work in spacetime.

If you know of other variants that should be included, please file an issue or let me know. Once the basic framework is available, adding new conversions will be pretty simple.

There already are converters for Spatial classes, so why do this? There are converters, but the sp and sf classes adhere to a common denominator in modern GIS standards which is quite restrictive. There are many spatial data structures in R that cannot be represented by these classes, nor by extending the standard packages. Most other spatial software also works around the simple features standards, and so we enter regions where we have no standards at all.

Conversion between existing forms is simply a side-benefit of having a more general framework. The main motivation is to be able to convert these commonly used types into forms ready for modern tools for interactive use, and to allow database back-ending without a proliferation of complicated workarounds doing constant translation.

Not all pairwise combinations are of interest, but most importantly some of the representations are more general than others. The only one that can be used to represent all others is a set of relational tables, and 'gris' does most of this, but 'ggplot2' also comes pretty close. Neither has been used extensively to do this though!

Installation

Install the package from CRAN:

install.packages("spbabel")

The development version can be installed directly from GitHub:

devtools::install_github("mdsumner/spbabel")

Formal and informal spatial data in R

Spatial data in the sp package have a formal definition (extending class Spatial) that is modelled on shapefiles, and close at least in spirit to the Simple Features definition. See What is Spatial in R? for more details. Spatial data in the ggplot2 package has no formal definition and there's not a lot of guidance for how to switch between these two worlds, or the opportunities that exist for other options.

The spbabel package tries to help by providing a more systematic encoding into the long-form with consistent naming and lossless ways to re-compose the original (or somewhat modified) objects.

The long-form version is similar to that implemented in:

  • sp's as() coercion for SpatialLinesDataFrame to SpatialPointsDataFrame
  • raster's geom()
  • ggplot2's fortify()
  • gris' normalized tables

How does spbabel work?

The sptable function decomposes a Spatial object to a single table structured as a row for every coordinate in all the sub-geometries, including duplicated coordinates that close polygonal rings, close lines and shared vertices between objects.

The sp function re-composes a Spatial object from a table; it auto-detects the topology from the column names present:

  • SpatialPolygons: object_, branch_, island_, order_
  • SpatialLines: object_, branch_, order_
  • SpatialPoints: object_
  • SpatialMultiPoints: object_, branch_

After quite a lot of experimentation the long-form single table of all coordinates, with object, branch, island-status, and order provides the best middle-ground for transferring between different representations of Spatial data. Tables are always based on the "tibble" since it's a much better data frame.

The sptable function creates the table of coordinates with identifiers for object and branch, which is understood by sptable<- to "fortify" and sp for the reverse.

The long-form table may seem like soup, but it's not meant to be seen for normal use. It's very easy to dump this to databases, or to ask spatial databases for this form. There are other more normalized multi-table approaches as well - this is just a powerful lowest common denominator.

We can tidy this up by encoding the geometry data into a geometry-column, into nested data frames, or by normalizing to tables that store only one kind of data, or with recursive data structures such as lists of matrices. Each of these has strengths and weaknesses. Ultimately I want this to evolve into a fully-fledged set of tools for representing spatial/topological data in R, but still by leveraging existing code wherever possible.

Why do this?

I want these things, and spbabel is the right compromise for where to start:

  • flexibility in the number and type/s of attribute stored as "coordinates", x, y, lon, lat, z, time, temperature, etc.
  • ability to store attributes on parts (!) i.e. the state is the object, the county is the part
  • shared vertices
  • ability to store points, lines and areas together, sharing topology where appropriate
  • provide a flexible basis for conversion between other formats.
  • flexibility and ease of use
  • integration with database engines and other systems
  • integration with D3 via htmlwidgets, with shiny, and with gggeom, ggvis or similar
  • data-flow with dplyr piping as the engine behind a D3 web interface

Flexibility in attributes generally is the key to breaking out of traditional GIS constraints that don't allow clear continuous / discrete distinctions, or time-varying objects/events, 3D/4D geometry, or clarity on topology versus geometry. When everything is tables this becomes natural, and we can build structures like link-relations between tables that transfer data only when required.

The ability to use Manifold System seamlessly with R is a particular long-term goal, and this will be best done(TM) via dplyr "back-ending".

A more general approach to this is started here: https://github.com/mdsumner/sc

The decomposition and rebuild process of sf objects is now better thought out here: https://github.com/mdsumner/gibble and is to be built into whatever sc becomes.

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

News

spbabel 0.4.8

  • bug fix, sp recomposition for multipoint now correctly splits on object_ (rather than branch_)

  • bug fix, MULTIPOINT and point were getting "order_"

  • bug fix, POLYGON island status wasn't recorded (redundantly) along with branch - so recomposition resulted in lines

spbabel 0.4.7

  • remove dependency on sf, spbabel can decompose sf but cannot recompose. Importing sf requires too many dependencies that are not relevant to the workflows.

spbabel 0.4.6

  • new concept of "island", as the intermediary part before a branch for MULTIPOLYGON only

  • added support for sf, new model based on "feature_table"

  • proper support for SpatialPoints in map_table

spbabel 0.4.5

  • fixed bug in sp() logic that recreates a SpatialLines (it was using a Polygon under the hood)

  • sped up sptable by using old raster code, after generalizing to all types

  • new map_table method for 'trip' objects

  • workarounds for SpatialPoints, SpatialMultiPoints (removed problematic high-level use of as_tibble, which meant that points/multipoints weren't being built properly)

  • use duplicated rather than distinct_, see https://github.com/mdsumner/spbabel/issues/27

  • semi_cascade now keeps quiet

  • spbabel<- replacement function now drops attributes if object and row numbers not the same

spbabel 0.4.0

  • new function 'map_table' to produce the more general multiple-table model

  • branch IDs can now be factor, before this resulted in empty data.frames from split

  • moved to using character IDs for object, branch, vertex

  • added track data set

  • added holey data set

  • update to use tibble rather than dplyr data_frame

  • fix MultiPoints

  • updates for dplyr distinct(.keep_all)

  • extra documentation added

  • fix up package structure for CRAN

spbabel 0.3.2

  • removed internal use of a matrix in .pointsGeom

  • de- and re-composition of SpatialPoints and SpatialMultiPoints now consistent with other types

  • re-composition of poly (object_, branch_, island_, order_), line (object_, branch_, order_), point (object_), and multipoint (object_, branch_) now differentiated simply by usage of those column names

  • renamed spFromTable to sp generic, spFromTable deprecated

  • fixed up multipoint support

spbabel 0.3.1

  • removed all nesting and normalize approaches out of spbabel

  • removed all dplyr verb methods to spdplyr

  • various improvements provided by jlegewie, removed transmute_ (not needed), improved filter_ and select_, added left_join and inner_join, see https://github.com/mdsumner/spbabel/pull/10

  • added group_by and complementary summarize capability for Spatial

  • set data.frame and tbl and tbl_df as S4 compatible

spbabel 0.3.0

  • committing to names object_, branch_, island_, order_, x_ and y_, and Object_ and Branch_

  • removed "part" terminology, in favour of "branch"

  • remove ptransform - maybe use reproj instead, wip

  • added methods for ptransform, needs tests

  • working on embedded tables, with disparate tables per row rather than hierarchical

  • added nesting for Spatial

spbabel 0.1.0

  • added a replacement function sptable<-

  • added a data set of MultiPointsDataFrame "mpoint1"

  • Added a NEWS.md file to track changes to the package.

  • First function version - with methods for dplyr verbs.

0.4.8 by Michael D. Sumner, a year ago


https://mdsumner.github.io/spbabel


Report a bug at https://github.com/mdsumner/spbabel/issues


Browse source code at https://github.com/cran/spbabel


Authors: Michael D. Sumner [aut, cre]



GPL-3 license


Imports dplyr, methods, sp, tibble

Suggests testthat, ggplot2, maptools, raster, rmarkdown, knitr, covr, broom, ggpolypath, maps, sf, trip, viridis


Imported by angstroms, spdplyr, tabularaster.

