Tools to convert from specific formats to more general forms of spatial data. Using tables to store the actual entities present in spatial data provides flexibility, and the functions here deliberately minimize the level of interpretation applied, leaving that for specific applications. Includes support for simple features, round-trip for 'Spatial' classes and long-form tables, analogous to 'ggplot2::fortify'. There is also a more 'normal form' representation that decomposes simple features and their kin to tables of objects, parts, and unique coordinates.
Spbabel provides simple tools to flip between specialist, bespoke formats and tabular, generic forms of spatial data. This package aims to assist in the ongoing development of tools for spatial data in R. This is really a set of tools for developing other tools, but do see some examples in the vignettes.
The key functions for simple decomposition and recomposition are sptable and sp, which provide the identified coordinates in a single data frame.
A more useful and extensible decomposition is provided by map_table, which returns data frames of the object data, the parts data and the coordinates data as separate tables linked by ID. For more on the rationale, see http://rpubs.com/cyclemumner/sc-rationale. This table-based framework allows easy transfer between the different spatial representations in R, in generic database-ready tables.
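The linked-table idea can be sketched in base R. This is a minimal illustration only: the table names, the `name` attribute and the values are hypothetical, and the real output of map_table may be organised differently.

```r
# Hypothetical three-table layout, built by hand (not produced by spbabel).
objects <- data.frame(object_ = c("a", "b"),
                      name    = c("lake", "road"),
                      stringsAsFactors = FALSE)
parts <- data.frame(object_ = c("a", "b"),
                    branch_ = c("a1", "b1"),
                    stringsAsFactors = FALSE)
coords <- data.frame(branch_ = c("a1", "a1", "a1", "a1", "b1", "b1"),
                     order_  = c(1, 2, 3, 4, 1, 2),
                     x_      = c(0, 1, 1, 0, 0, 2),
                     y_      = c(0, 0, 1, 0, 2, 2),
                     stringsAsFactors = FALSE)

# Joining on the ID columns recovers a single long-form table.
long_form <- merge(merge(objects, parts, by = "object_"), coords, by = "branch_")
long_form <- long_form[order(long_form$object_, long_form$branch_, long_form$order_), ]
```

Because each table holds one kind of data, attributes can be added to objects or to coordinates without touching the other tables.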
Tracking packages such as adehabitatLT, trajectories, move, and dozens of others contain objects that could be coerced in a straightforward way; see the Spatio Temporal Task View for more (in the Moving Objects / Trajectories section). The hyperframe in spatstat is another example, and Edzer's work in spacetime is another to follow up.
If you know of other variants that should be included, please file an issue or let me know. Once the basic framework is available, adding new conversions will be pretty simple.
There already are converters for Spatial classes, so why do this? There are converters, but the sf classes adhere to a common denominator in modern GIS standards which is quite restrictive. There are many spatial data structures in R that cannot be represented, and that cannot be represented by extending the standard packages. Most other spatial software also goes around the simple features standards, and so we enter regions where we have no standards at all.
Conversion between existing forms is simply a side-benefit of having a more general framework. The main motivation is to be able to convert these commonly used types into forms ready for modern tools for interactive use, and to allow database back-ending without a proliferation of complicated workarounds doing constant translation.
Not all pairwise combinations are of interest, but most importantly some of the representations are more general than others. The only one that can be used to represent all others is a set of relational tables, and 'gris' does most of this, but 'ggplot2' also comes pretty close. Neither has been used extensively to do this though!
Install the package from CRAN with install.packages("spbabel").
The development version can be installed directly from GitHub with devtools::install_github("mdsumner/spbabel").
Spatial data in the sp package have a formal definition (extending class Spatial) that is modelled on shapefiles, and close at least in spirit to the Simple Features definition. See "What is Spatial in R?" for more details. Spatial data in the ggplot2 package have no formal definition, and there's not a lot of guidance for how to switch between these two worlds, or on the opportunities that exist for other options.
The spbabel package tries to help by providing a more systematic encoding into the long-form with consistent naming and lossless ways to re-compose the original (or somewhat modified) objects.
The long-form version is similar to that produced by 'ggplot2::fortify'.
The sptable function decomposes a Spatial object to a single table structured as a row for every coordinate in all the sub-geometries, including duplicated coordinates that close polygonal rings, close lines and shared vertices between objects.
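To make the row-per-coordinate layout concrete, here is a minimal base R sketch. It is not the sptable implementation: the input structure (`shapes`) and the helper `decompose()` are hypothetical, and only the column names follow the convention used in this README.

```r
# Hypothetical input: each object is a list of two-column coordinate matrices.
shapes <- list(
  a = list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 0))),  # closed ring, first == last
  b = list(rbind(c(0, 2), c(2, 2)),
           rbind(c(3, 3), c(4, 4), c(5, 3)))
)

# Illustrative helper: one row per coordinate, with object and branch identifiers.
decompose <- function(shapes) {
  rows <- list()
  branch <- 0L
  for (obj in names(shapes)) {
    for (m in shapes[[obj]]) {
      branch <- branch + 1L
      rows[[length(rows) + 1L]] <- data.frame(
        object_ = obj,
        branch_ = as.character(branch),
        order_  = seq_len(nrow(m)),
        x_      = m[, 1],
        y_      = m[, 2],
        stringsAsFactors = FALSE
      )
    }
  }
  do.call(rbind, rows)
}

long_form <- decompose(shapes)
```

Note that the closing coordinate of the ring in object "a" is kept as its own row, which matches the description above of duplicated coordinates being included.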
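The sp function re-composes a Spatial object from a table; it auto-detects the topology from the column names present: polygons (object_, branch_, island_, order_), lines (object_, branch_, order_), points (object_), and multipoints (object_, branch_).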
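The column-name convention could be checked along these lines. This is a sketch under assumptions, not the logic inside sp: `detect_topology()` is a hypothetical helper, and the column sets are those given in this document for polygons, lines, points and multipoints.

```r
# Hypothetical helper: infer the topology from which identifier columns exist.
# Checks run from the most specific column set to the least specific.
detect_topology <- function(tab) {
  nms <- names(tab)
  if (all(c("object_", "branch_", "island_", "order_") %in% nms)) return("polygon")
  if (all(c("object_", "branch_", "order_") %in% nms)) return("line")
  if (all(c("object_", "branch_") %in% nms)) return("multipoint")
  if ("object_" %in% nms) return("point")
  stop("no object_ column: cannot infer topology")
}

pts   <- data.frame(object_ = "a", x_ = 0, y_ = 0)
lines <- data.frame(object_ = "a", branch_ = "1", order_ = 1:2,
                    x_ = c(0, 1), y_ = c(0, 1))
topologies <- c(detect_topology(pts), detect_topology(lines))
```

One consequence of this design is that dropping a column changes the interpretation; for example, removing island_ from a polygon table would make it read as lines.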
After quite a lot of experimentation the long-form single table of all coordinates, with object, branch, island-status, and order provides the best middle-ground for transferring between different representations of Spatial data. Tables are always based on the "tibble" since it's a much better data frame.
The sptable function creates the table of coordinates with identifiers for object and branch, which is understood by sptable<- to "fortify" and sp for the reverse.
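The reverse direction amounts to splitting the long-form table on its identifiers. The following base R sketch assumes a hand-built table and a hypothetical `recompose()` helper; it returns nested lists of coordinate matrices rather than Spatial objects, so it only illustrates the grouping step.

```r
# Hand-built long-form table using the column names from this README.
long_form <- data.frame(
  object_ = c("a", "a", "a", "b", "b"),
  branch_ = c("1", "1", "1", "2", "2"),
  order_  = c(1, 2, 3, 1, 2),
  x_      = c(0, 1, 1, 0, 2),
  y_      = c(0, 0, 1, 2, 2),
  stringsAsFactors = FALSE
)

# Illustrative helper: group by object, then by branch, keeping vertex order.
# Caveat: character IDs sort alphabetically, so "10" would come before "2".
recompose <- function(tab) {
  tab <- tab[order(tab$object_, tab$branch_, tab$order_), ]
  lapply(split(tab, tab$object_), function(obj) {
    lapply(split(obj, obj$branch_), function(br) cbind(x = br$x_, y = br$y_))
  })
}

shapes <- recompose(long_form)
```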
The long-form table may seem like soup, but it's not meant to be seen for normal use. It's very easy to dump this to databases, or to ask spatial databases for this form. There are other more normalized multi-table approaches as well - this is just a powerful lowest common denominator.
We can tidy this up by encoding the geometry data into a geometry-column, into nested data frames, or by normalizing to tables that store only one kind of data, or with recursive data structures such as lists of matrices. Each of these has strengths and weaknesses. Ultimately I want this to evolve into a fully-fledged set of tools for representing spatial/topological data in R, but still by leveraging existing code wherever possible.
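As one illustration of the normalizing option, the sketch below separates unique vertices from the table that records how branches use them. The names `vertices`, `links` and `vertex_` are hypothetical, and matching coordinates through a pasted text key is a simplification that assumes exactly repeated values.

```r
# Long-form coordinates for one closed ring; the first vertex is repeated at the end.
coords <- data.frame(branch_ = c("a1", "a1", "a1", "a1"),
                     order_  = 1:4,
                     x_      = c(0, 1, 1, 0),
                     y_      = c(0, 0, 1, 0),
                     stringsAsFactors = FALSE)

# Text key per coordinate pair (assumes exact equality of repeated values).
key <- paste(coords$x_, coords$y_)
first <- !duplicated(key)

# Table of unique vertices, each with its own ID.
vertices <- coords[first, c("x_", "y_")]
vertices$vertex_ <- as.character(seq_len(nrow(vertices)))

# Link table: which vertex each branch uses, and in what order.
links <- data.frame(branch_ = coords$branch_,
                    order_  = coords$order_,
                    vertex_ = vertices$vertex_[match(key, key[first])],
                    stringsAsFactors = FALSE)

# Joining the two tables restores the original coordinates.
rebuilt <- merge(links, vertices, by = "vertex_")
rebuilt <- rebuilt[order(rebuilt$branch_, rebuilt$order_), ]
```

The link table keeps the ring closed by pointing to the same vertex twice, so shared vertices are stored once while the long-form table remains recoverable.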
I want these things, and spbabel is the right compromise for where to start:
Flexibility in attributes generally is the key to breaking out of traditional GIS constraints that don't allow clear continuous / discrete distinctions, or time-varying objects/events, 3D/4D geometry, or clarity on topology versus geometry. When everything is tables this becomes natural, and we can build structures like link-relations between tables that transfer data only when required.
The ability to use Manifold System seamlessly with R is a particular long-term goal, and this will be best done(TM) via dplyr "back-ending".
A more general approach to this is started here: https://github.com/mdsumner/sc
The decomposition and rebuild process of sf objects is now better thought out here: https://github.com/mdsumner/gibble and is to be built into whatever sc becomes.
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.
bug fix, sp recomposition for multipoint now correctly splits on object_ (rather than branch_)
bug fix, MULTIPOINT and point were getting "order_"
bug fix, POLYGON island status wasn't recorded (redundantly) along with branch - so recomposition resulted in lines
new concept of "island", as the intermediary part before a branch for MULTIPOLYGON only
added support for sf, new model based on "feature_table"
proper support for SpatialPoints in map_table
fixed bug in sp() logic that recreates a SpatialLines (it was using a Polygon under the hood)
sped up sptable by using old raster code, after generalizing to all types
new map_table method for 'trip' objects
workarounds for SpatialPoints, SpatialMultiPoints (removed problematic high-level use of as_tibble, which meant that points/multipoints weren't being built properly)
use duplicated rather than distinct_, see https://github.com/mdsumner/spbabel/issues/27
semi_cascade now keeps quiet
spbabel<- replacement function now drops attributes if object and row numbers not the same
new function 'map_table' to produce the more general multiple-table model
branch IDs can now be factor, before this resulted in empty data.frames from split
moved to using character IDs for object, branch, vertex
added track data set
added holey data set
update to use tibble rather than dplyr data_frame
updates for dplyr
extra documentation added
fix up package structure for CRAN
removed internal use of a matrix in .pointsGeom
de- and re-composition of SpatialPoints and SpatialMultiPoints now consistent with other types
re-composition of poly (object_, branch_, island_, order_), line (object_, branch_, order_), point (object_), and multipoint (object_, branch_) now differentiated simply by usage of those column names
renamed spFromTable to sp generic, spFromTable deprecated
fixed up multipoint support
removed all nesting and normalize approaches out of spbabel
removed all dplyr verb methods to spdplyr
various improvements provided by jlegewie, removed transmute_ (not needed), improved filter_ and select_, added left_join and inner_join, see https://github.com/mdsumner/spbabel/pull/10
added group_by and complementary summarize capability for Spatial
set data.frame and tbl and tbl_df as S4 compatible
committing to names object_, branch_, island_, order_, x_ and y_, and Object_ and Branch_
removed "part" terminology, in favour of "branch"
remove ptransform - maybe use reproj instead, wip
added methods for ptransform, needs tests
working on embedded tables, with disparate tables per row rather than hierarchical
added nesting for Spatial
added a replacement function
added a data set of MultiPointsDataFrame "mpoint1"
NEWS.md file to track changes to the package.
First function version - with methods for dplyr verbs.