A Future API for Parallel and Distributed Processing using 'batchtools'

Implementation of the Future API on top of the 'batchtools' package. This allows you to process futures, as defined by the 'future' package, in parallel out of the box, not only on your local machine or ad-hoc cluster of machines, but also via high-performance compute ('HPC') job schedulers such as 'LSF', 'OpenLava', 'Slurm', 'SGE', and 'TORQUE' / 'PBS', e.g. 'y <- future_lapply(files, FUN = process)'.


Package: future.batchtools

Version: 0.6.0-9000 [2017-09-10]


o If the built-in attempts of batchtools to find a default template file fail, then system.file("templates", package = "future.batchtools") is searched for template files as well. Currently, there exists a torque.tmpl file.

o A job's name in the scheduler is now set as the future's label (requires batchtools 0.9.4 or newer). If no label is specified, the default job name is controlled by batchtools.

o The period between polls of the scheduler to check whether a future (job) is finished now increases geometrically as a function of the number of polls. This lowers the load on the scheduler for long-running jobs.

o The error message for expired batchtools futures now includes the last few lines of the logged output, which sometimes include clues on why the future expired. For instance, if a TORQUE / PBS job uses more than the allocated amount of memory, it might be terminated by the scheduler, leaving the message "PBS: job killed: vmem 1234000 exceeded limit 1048576" in the output.

o print() for BatchtoolsFuture returns the object invisibly.
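
To illustrate the geometrically increasing poll period described above, here is a minimal base-R sketch. The helper poll_interval() and its parameters 'base' and 'alpha' are assumptions made for illustration; they are not taken from the package's internals.

```r
# Hypothetical helper: the wait (in seconds) before poll number `i`
# grows geometrically. `base` and `alpha` are illustrative assumptions.
poll_interval <- function(i, base = 1.0, alpha = 2.0) {
  base * alpha^(i - 1)
}

# Waiting times before the first six polls
intervals <- poll_interval(1:6)
print(intervals)

# Each interval is `alpha` times the previous one
ratios <- intervals[-1] / intervals[-length(intervals)]
print(ratios)
```

With a growth factor close to 1, short jobs are still detected quickly, while the scheduler is queried less and less often for jobs that run for hours.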


o Calling future_lapply() with functions whose globals are part of non-default packages would, when using batchtools futures, give an error complaining that the global is missing. This was due to updates in future (>= 1.4.0) that broke this package.

o loggedOutput() for BatchtoolsFuture would always return NULL unless an error had occurred.
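
As a rough illustration of how the tail of a job's logged output can be attached to the error message for an expired future (see the new features of this version), consider the base-R sketch below. The helper expired_message(), the label "my-job", and all log lines except the PBS message quoted above are assumptions for illustration, not the package's actual implementation or output.

```r
# Hypothetical log lines; only the PBS line is taken from the text above.
log_lines <- c(
  "Starting job",
  "Loading required package: future",
  "Processing file 1 of 10",
  "PBS: job killed: vmem 1234000 exceeded limit 1048576"
)

# Hypothetical helper: append the last `n` log lines to the error message.
expired_message <- function(label, log_lines, n = 3L) {
  last <- utils::tail(log_lines, n)
  header <- sprintf(
    "Future ('%s') expired. The last %d lines of the logged output:",
    label, length(last)
  )
  paste(c(header, last), collapse = "\n")
}

msg <- expired_message("my-job", log_lines)
cat(msg, "\n")
```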

Version: 0.5.0 [2017-06-02]

o First version submitted to CRAN.


o Added more tests; test coverage now at 93%.

Version: 0.4.0 [2017-05-16]


o Added batchtools_custom() for specifying batchtools futures using any type of batchtools cluster functions.

o batchtools_template(pathname = NULL, type = ) now relies on the batchtools package for locating the template file.

o nbrOfWorkers() for batchtools futures now defaults to +Inf unless the evaluator's 'workers' or 'cluster.functions' specify something else.

o Renamed argument 'pathname' to 'template' for batchtools_() functions.


o Under plan(batchjobs_*), futures would, when created, produce the error "all(is.finite(workers)) is not TRUE" due to an outdated sanity check.


o TESTS: Added test of future_lapply() for batchtools backends.

o TESTS: Added optional tests for batchjobs_* HPC schedulers listed in environment variable 'R_FUTURE_TESTS_STRATEGIES'.


o CLEANUP: Package no longer depends on R.utils.

Version: 0.3.0 [2017-03-19]


o The number of jobs one can add to the queues of HPC schedulers is in principle unlimited, which is why the number of available workers for such batchtools_* backends is reported as +Inf. However, as the number of workers is used by future_lapply() to decide how many futures should be used to best partition the elements, this means that future_lapply() will always use one future per element. Because of this, it is now possible to specify plan(batchtools_*, workers = n) where 'n' is the target number of workers.
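
The effect of the 'workers' argument on how elements are partitioned can be sketched in base R as follows. The helper make_chunks() is hypothetical, and parallel::splitIndices() is used here only as a stand-in for whatever chunking future_lapply() does internally.

```r
# Hypothetical helper: split element indices into one chunk per future.
# `workers` may be +Inf, in which case every element gets its own future.
make_chunks <- function(n_elements, workers) {
  n_futures <- min(n_elements, workers)
  parallel::splitIndices(n_elements, n_futures)
}

# Default for HPC schedulers: workers = +Inf, so one future per element
chunks_inf <- make_chunks(10, workers = Inf)
print(length(chunks_inf))

# With a target of three workers, the ten elements share three futures
chunks_3 <- make_chunks(10, workers = 3)
print(lengths(chunks_3))
```

Fewer, larger chunks mean fewer jobs on the scheduler's queue, at the cost of less fine-grained load balancing.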

Version: 0.2.0 [2017-02-23]


o batchtools (>= 0.9.2) now supports exporting objects with any type of names (previously only possible if they mapped to strictly valid filenames). This allowed me to avoid lots of internal workaround code for encoding and decoding globals.

Version: 0.1.0 [2017-02-11]

o Package created by porting the code of future.BatchJobs. This version passes 'R CMD check --as-cran' with all OK after a minimal amount of adjustments to the ported code.

0.6.0 by Henrik Bengtsson, 5 months ago


