contextual is an R package that facilitates the simulation and evaluation of both context-free and contextual Multi-Armed Bandit policies. The package has been developed to ease the implementation, evaluation, and dissemination of both existing and new bandit algorithms and policies.
To install contextual from CRAN:
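For example, from an R session:

``` r
# Install the released version from CRAN
install.packages("contextual")
```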
To install the development version (requires the devtools package):
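For example, assuming the development sources live in the Nth-iteration-labs/contextual repository on GitHub:

``` r
# install.packages("devtools")
devtools::install_github("Nth-iteration-labs/contextual")
```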
When working on or extending the package, clone its GitHub repository, then do:
``` r
install.packages("devtools")
devtools::install_deps(dependencies = TRUE)
devtools::build()
devtools::reload()   # clean and rebuild
```
See the demo directory for practical examples and replications of both synthetic and offline (contextual) bandit policy evaluations.
How to replicate figures from two books, both offering a first introduction to context-free Multi-Armed Bandits:
Basic, context-free multi-armed bandit examples:
Examples of both synthetic and offline contextual multi-armed bandit evaluations:
Some more extensive vignettes to get you started with the package:
Paper offering a general overview of the package's structure & API:
Overview of contextual's growing library of contextual and context-free bandit policies (a short usage sketch follows the list):
- CMAB Naive Epsilon-Greedy
- LinUCB (General, Disjoint, Hybrid)
- Linear Thompson Sampling
- Lock-in Feedback (LiF)
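As a minimal sketch of how such a contextual policy is paired with a synthetic contextual bandit, assuming the ContextualLinearBandit and LinUCBDisjointPolicy classes and their k, d, and alpha arguments (names may differ slightly between package versions):

``` r
library(contextual)

# Synthetic linear contextual bandit: k arms, d context features (assumed constructor)
bandit <- ContextualLinearBandit$new(k = 4, d = 3)

# Disjoint LinUCB with exploration parameter alpha (assumed constructor)
policy <- LinUCBDisjointPolicy$new(alpha = 0.6)

agent   <- Agent$new(policy, bandit)
history <- Simulator$new(list(agent), horizon = 100, simulations = 50)$run()

plot(history, type = "cumulative")
```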
Overview of contextual's bandit library:
The bandits are organized into four categories: Basic Synthetic, Contextual Synthetic, Offline, and Continuous. Among the Basic Synthetic bandits:

- Basic Bernoulli Bandit
- Basic Gaussian Bandit
By default, contextual uses R's built-in parallel package to evaluate multiple agents in parallel over repeated simulations. See the demo/alternative_parallel_backends directory for several alternative parallel backends.
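A minimal sketch of such a parallel comparison of several agents, assuming the usual BasicBernoulliBandit, EpsilonGreedyPolicy, ThompsonSamplingPolicy, Agent, and Simulator workflow (constructor arguments may differ between package versions):

``` r
library(contextual)

# One synthetic bandit, two context-free policies to compare
bandit <- BasicBernoulliBandit$new(weights = c(0.9, 0.1, 0.1))
agents <- list(
  Agent$new(EpsilonGreedyPolicy$new(0.1), bandit),
  Agent$new(ThompsonSamplingPolicy$new(1.0, 1.0), bandit)
)

# Each agent's simulation is repeated and, by default, distributed
# over the available cores via R's parallel package
simulator <- Simulator$new(agents, horizon = 100, simulations = 100)
history   <- simulator$run()

plot(history, type = "cumulative")
summary(history)
```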
- Robin van Emden: author, maintainer
- Maurits Kaptein: supervisor
If you encounter a clear bug, please file a minimal reproducible example on GitHub.