Generates high-entropy integer synthetic populations from marginal and (optionally) seed data using quasirandom sampling,
in arbitrary dimensionality (Smith, Lovelace and Birkin (2017)).
- adds new functionality for multidimensional integerisation.
- deletes previously deprecated functionality
Building on the prob2IntFreq function - which takes a discrete probability distribution and a count, and returns the closest integer population to the distribution that sums to the count - a multidimensional equivalent, integerise, is introduced.
In one dimension, for example:
>>> import numpy as np
>>> import humanleague
>>> p = ...
>>> ...
produces the optimal (i.e. closest possible) integer population to the discrete distribution.
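The one-dimensional problem described above can be sketched in plain Python. This is a minimal illustration using largest-remainder rounding, which is one standard way to find the integer population closest to a scaled distribution; it is not necessarily how prob2IntFreq is implemented in the package. The function name `closest_integer_population` and the sample distribution and count are assumptions chosen for illustration.

```python
import math


def closest_integer_population(probs, count):
    """Round count * probs to integers that sum exactly to count.

    Hypothetical helper (not part of humanleague): uses largest-remainder
    rounding and returns the integer frequencies together with the RMS error
    against the unrounded expected values.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    if not math.isclose(sum(probs), 1.0, rel_tol=1e-9):
        raise ValueError("probabilities must sum to 1")

    expected = [p * count for p in probs]
    # Start from the floor of each expected value...
    freqs = [math.floor(e) for e in expected]
    shortfall = count - sum(freqs)
    # ...then give the remaining units to the largest fractional parts.
    by_remainder = sorted(
        range(len(probs)), key=lambda i: expected[i] - freqs[i], reverse=True
    )
    for i in by_remainder[:shortfall]:
        freqs[i] += 1

    rmse = math.sqrt(
        sum((f - e) ** 2 for f, e in zip(freqs, expected)) / len(probs)
    )
    return freqs, rmse


# Illustrative input: four categories, population of 11.
freqs, rmse = closest_integer_population([0.1, 0.2, 0.3, 0.4], 11)
print(freqs, rmse)
```

The floor step guarantees no category exceeds its expected value by one or more, and distributing the shortfall by remainder keeps the total squared deviation as small as possible for a fixed sum.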
The integerise function generalises this problem and applies it to higher dimensions: given an n-dimensional array of real numbers where the 1-d marginal sums in every dimension are integral (and thus the total population is too), it attempts to find an integral array that also satisfies these constraints.
The QISI algorithm is repurposed to this end. As it is a sampling algorithm, it cannot guarantee that a solution will be found, nor, if one is found, that it is optimal. A failure does not prove that no solution exists for the given input.
>>> a = ...
# marginal sums
>>> ...
>>> ...
# perform integerisation
>>> r = ...
>>> ...
True
>>> ...
>>> ...
0.5766281297335398
# check marginals are preserved
>>> ... == ...
>>> ... == ...
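The constraint that integerisation must preserve, namely that every 1-d marginal sum is integral and unchanged, can be checked independently of the package. The sketch below is a plain-Python illustration in which the array is stored as a dictionary keyed by index tuples; the helper names `marginals` and `has_integral_marginals` and the 2x2 example values are assumptions made for this illustration, not part of the humanleague API.

```python
from itertools import product


def marginals(cells, shape):
    """Return the 1-d marginal sums of a dense n-d array.

    `cells` maps index tuples to values; `shape` gives the size of each
    dimension. Hypothetical helper, not part of humanleague.
    """
    sums = [[0.0] * n for n in shape]
    for index in product(*(range(n) for n in shape)):
        value = cells.get(index, 0.0)
        # Each cell contributes to one marginal entry per dimension.
        for dim, i in enumerate(index):
            sums[dim][i] += value
    return sums


def has_integral_marginals(cells, shape, tol=1e-9):
    """True if every 1-d marginal sum is within tol of an integer."""
    return all(
        abs(s - round(s)) <= tol
        for dim_sums in marginals(cells, shape)
        for s in dim_sums
    )


# Illustrative 2x2 input with fractional cells but integral marginals.
shape = (2, 2)
real = {(0, 0): 0.5, (0, 1): 1.5, (1, 0): 1.5, (1, 1): 0.5}
# One integer array with the same marginals.
integral = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 1}

print(has_integral_marginals(real, shape))
print(marginals(real, shape) == marginals(integral, shape))
```

A check of this kind is useful both before calling an integerisation routine, to confirm the input satisfies the precondition, and afterwards, to confirm that the returned array matches the original marginals in every dimension.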
synthPopG implement restricted versions of algorithms that are available in other functions.
qis in place of
qisi in place of
humanleague is a Python and R package for microsynthesising populations from marginal and (optionally) seed data. The package is implemented in C++ for performance.
The package contains algorithms that use a number of different microsynthesis techniques:
The latter provides a bridge between deterministic reweighting and combinatorial optimisation, offering advantages of both techniques:
The package also contains the following utility functions:
Version 1.0.1 reflects the work described in the Quasirandom Integer Sampling (QIS) paper.
For development version
Or, for the legacy version
Requires Python 3 and NumPy. PyPI package:
python3 -m pip install humanleague --user
[Conda-forge package is being worked on]
$ ./setup.py install --user
$ ./setup.py test
Consult the package documentation, e.g.
> library(humanleague)
> ?humanleague
in R, or for Python:
>>> import humanleague as hl
>>> help(hl)