Fast methods for learning sparse Bayesian networks from high-dimensional data using sparse regularization, as described in Aragam, Gu, and Zhou (2017)
Introducing sparsebn: A new R package for learning sparse Bayesian networks and other graphical models from high-dimensional data via sparse regularization. Designed from the ground up to handle:
The emphasis of this package is scalability and statistical consistency on high-dimensional datasets. Compared to existing algorithms, sparsebn scales much better and is under active development. For more details on this package, including worked examples and the methodological background, please see our new preprint [1].
The main methods for learning graphical models are:
estimate.dag for directed acyclic graphs (Bayesian networks).estimate.precision for undirected graphs (Markov random fields).estimate.covariance for covariance matrices.Currently, estimation of precision and covariances matrices is limited to Gaussian data.
The workhorse behind sparsebn is the sparsebnUtils package, which provides various S3 classes and methods for representing and manipulating graphs. The basic algorithms are implemented in ccdrAlgorithm and discretecdAlgorithm.
You can install:
the latest CRAN version with
install.packages("sparsebn")
the latest development version from GitHub with
devtools::install_github(c("itsrainingdata/sparsebn/", "itsrainingdata/sparsebnUtils/dev", "itsrainingdata/ccdrAlgorithm/dev", "gujyjean/discretecdAlgorithm"))
[1] Aragam, B., Gu, J., and Zhou, Q. (2017). Learning large-scale Bayesian networks with the sparsebn package. arXiv: 1703.04025.
[2] Aragam, B. and Zhou, Q. (2015). Concave penalized estimation of sparse Gaussian Bayesian networks. The Journal of Machine Learning Research. 16(Nov):2273−2328.
[3] Fu, F., Gu, J., and Zhou, Q. (2014). Adaptive penalized estimation of directed acyclic graphs from categorical data. arXiv: 1403.2310.
[4] Aragam, B., Amini, A. A., and Zhou, Q. (2015). Learning directed acyclic graphs with penalized neighbourhood regression. arXiv: 1511.08963.
[5] Fu, F. and Zhou, Q. (2013). Learning sparse causal Gaussian networks with experimental intervention: Regularization and coordinate descent. Journal of the American Statistical Association, 108: 288-300.
estimate.dag now supports white lists and black lists (#6)openCytoscape (#4)plotDAG now includes labels for each subplot by defaultNEWS.md file to track changes to the packageplotDAG to provide convenient default for plotting large graphsestimate.dag now takes an optional logical argument adaptive: If TRUE, then an adaptive version of the CD algorithm will be run for discrete data. This argument is ignored for continuous data.