An R interface to the 'SciDB' array database <http://scidb.org>.
Install the package from CRAN with:
install.packages("scidb")
The current development version can be installed directly from source on GitHub with the devtools package (this requires an R development environment):
devtools::install_github("Paradigm4/SciDBR")                    # development branch
devtools::install_github("Paradigm4/SciDBR", ref="laboratory")  # experimental branch
The SciDB R package requires a simple open-source HTTP network service called shim to be installed on the computer where SciDB is installed. This service only needs to be installed on the SciDB machine, not on client computers that connect to SciDB from R. See http://github.com/paradigm4/shim for source code and installation instructions.
Developers, please note that R CMD check-style unit tests are skipped unless a system environment variable named SCIDB_TEST_HOST is set to the host name or IP address of SciDB. See the tests directory for test code.
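The skip behavior described above can be sketched in base R. This is a minimal illustration, not the package's actual test code: the helper name scidb_tests_enabled is hypothetical, and only the SCIDB_TEST_HOST variable name comes from the text above.

```r
# Hypothetical helper (not part of the scidb package):
# TRUE when SCIDB_TEST_HOST is set to a non-empty value.
scidb_tests_enabled <- function() {
  nzchar(Sys.getenv("SCIDB_TEST_HOST", unset = ""))
}

if (scidb_tests_enabled()) {
  host <- Sys.getenv("SCIDB_TEST_HOST")
  message("Running SciDB tests against ", host)
  # ... tests that need a live SciDB connection would go here ...
} else {
  message("SCIDB_TEST_HOST is not set; skipping SciDB tests")
}
```

To enable the tests for one session, set the variable before running them, for example with Sys.setenv(SCIDB_TEST_HOST = "localhost") in R or by exporting it in the shell.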
GitHub branch policy: the "devel" branch is for new ideas that have a good chance of making it into the CRAN package, although not all tests may pass yet. The "master" branch should pass all R CMD check tests against the current R-devel version of R on all platforms, tested against the current SciDB release.
This is a major release that breaks API compatibility with previous package releases. Array objects have been removed. All SciDB arrays are now presented as virtual data frames in R. This change was informed by the most common uses we've seen.
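As a rough sketch of the data frame presentation, the snippet below uploads a small R data frame and downloads it again. It assumes the 2.0 series functions scidbconnect, as.scidb, and as.R with a connection object passed as the first argument, a shim service listening on its default port, and (purely as a convenience for this example) a host name taken from the SCIDB_TEST_HOST environment variable. The block does nothing unless the package is installed and a host is configured.

```r
# Run only when the scidb package is installed and a SciDB host is configured.
host <- Sys.getenv("SCIDB_TEST_HOST", unset = "")

if (nzchar(host) && requireNamespace("scidb", quietly = TRUE)) {
  db <- scidb::scidbconnect(host = host)  # connect through shim (default port assumed)
  x  <- scidb::as.scidb(db, head(iris))   # upload a small data frame as a SciDB array
  df <- scidb::as.R(x)                    # download the array contents into R
  str(df)                                 # inspect the returned data frame
}
```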
CHANGES IN scidb 2.0.0:
o Big change: all SciDB arrays are presented as data frames
o Greatly simplified the package, improved data transfer performance
o Support for SciDB 16.9 and older versions at least back to 15.7
CHANGES IN scidb 1.1-3:
o Support for upcoming SciDB 14.12, untested backwards support for older releases
o Improved type parsing support, new binary transfer option for data frame objects. This is much faster than UTF-8 transfer from SciDB.
o Support for streaming data from SciDB.
o Option to disable interrupt handling for faster large data transfers.
o New functions include order, a more flexible dist, and peek.
CHANGES IN scidb 1.1-2:
o Support for SciDB 14.3, untested support for older SciDB releases
o Dropping arrays from other R sessions requires a non-default option facilitating multiple-user settings.
o Queries can be canceled with R user interrupts now (for example with CTRL + C or ESC).
o Many improvements to merge and aggregate functions
o N-d arrays with multiple attributes can now index with `$`
o Improved R save/load of R data files containing SciDB objects
o Support for Paradigm4 glm and a simple glm model matrix builder
o Support for Paradigm4 truncated SVD routine
o Added hist, quantile, all.equal, antijoin, a `c` (SciDB concat-like) function and many others
CHANGES IN scidb 1.1-1:
o The aggregate function now supports moving and fixed window aggregates.
TLS/SSL ENCRYPTION AND AUTHENTICATION
o The package now supports TLS/SSL encrypted communication with the shim network service and authentication. The only function affected by this is `scidbconnect` -- simply supply an SSL port number and username and password arguments.
o Most scidb and scidbdf objects can now represent array promises: un-evaluated SciDB queries equipped with a result schema and a context environment.
NEW COMPOSABLE OPERATORS
o Many new functions were introduced that closely follow underlying SciDB AFL operators. All of the new functions are composable with lazy evaluation of the underlying query using the new `scidbexpr` class. Results are only computed and stored by SciDB when required or explicitly requested.
R SPARSE MATRIX SUPPORT
o The package now maps sparse SciDB arrays to sparse R matrices. Only matrices (2-d arrays) with double-precision attributes are supported. Sparse array arithmetic uses the new P4 sparse matrix multiply operator when available.
USING RCurl NOW
o We introduced a dependency on RCurl in order to support SSL and authentication with the SciDB shim service.
NO MORE SciDB NID
o The package no longer tries to support SciDB arrays with NID dimensions, which never really worked anyway. Instead, many functions now take advantage of the new SciDB `uniq` and `index_lookup` operators if available (SciDB >= 13.6). Future package versions will take this further and introduce array dimension labeling using the new operators.
MANY NEW DATABASE-LIKE FEATURES
o Aggregate was improved; merge, sort, unique, index_lookup, and other new functions were added. See the vignette for more information.
CHANGES IN scidb 1.1-0:
SIGNIFICANT BUG FIX:
o Materializing subsetting operations could return inconsistently ordered data when results spanned SciDB array chunks across multiple SciDB instances. Data are now returned correctly in such cases.
OTHER BUG FIXES:
o Fixed a bug in the processing of the start argument in the as.scidb function.
o Fixed several bugs in the image function.
o The iquery function now accepts n=Inf to efficiently download all output from a query at once. The iterative=TRUE option should still be used with smaller n values to iterate over large results.
o The crossprod and tcrossprod functions are now available for SciDB arrays and mixed SciDB/R objects.
o A diag function is now available for SciDB matrices, returning the result as a new SciDB 1-D array (vector).
o Element-wise exponentiation was implemented for scidb array objects.
o Implemented a sum function for scidb array objects.