The R package BGGM provides tools for Bayesian inference in
Gaussian graphical models (GGMs). The methods are organized around two general approaches to
Bayesian inference: (1) estimation (Williams2019) and (2) hypothesis testing
(Williams2019_bf). The key distinction is that the former focuses on either the
posterior or the posterior predictive distribution, whereas the latter focuses on model comparison
with the Bayes factor.
The methods in BGGM build upon existing algorithms that are well-known in the literature. The central contribution of BGGM is to extend those approaches:
1. Bayesian estimation with the novel matrix-F prior distribution (Mulder2018).
   - Estimation (estimate).
2. Bayesian hypothesis testing with the novel matrix-F prior distribution (Mulder2018).
3. Comparing GGMs (williams2020comparing).
   - Partial correlation differences (ggm_compare_estimate).
   - Posterior predictive check (ggm_compare_ppc).
   - Exploratory hypothesis testing (ggm_compare_explore).
   - Confirmatory hypothesis testing (ggm_compare_confirm).
4. Extending inference beyond the conditional (in)dependence structure.
   - Predictability with Bayesian variance explained (gelman_r2_2019) (predictability).
   - Posterior uncertainty in the partial correlations (estimate).
   - Custom network statistics (roll_your_own).
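As a rough orientation, the estimation entry in the list above might be used as follows. This is a minimal sketch, assuming the estimate and select interfaces, the returned elements pcor_mat and adj, and the bundled ptsd data set as documented for recent BGGM releases; the argument names (type, iter, progress, cred) should be checked against the installed version. The guard is only there so the script also runs where BGGM is not installed.

```r
# Hedged sketch: assumes BGGM's documented estimate()/select() interface.
have_bggm <- requireNamespace("BGGM", quietly = TRUE)

if (have_bggm) {
  # First five items of the PTSD symptom data shipped with BGGM
  Y <- BGGM::ptsd[, 1:5]

  # Sample from the posterior (few iterations, for illustration only)
  fit <- BGGM::estimate(Y, type = "continuous", iter = 250, progress = FALSE)

  # Keep edges whose 95% credible interval excludes zero
  E <- BGGM::select(fit, cred = 0.95)

  print(round(fit$pcor_mat, 2))  # posterior mean partial correlations
  print(E$adj)                   # adjacency matrix of the selected graph
}
```

In real analyses the number of iterations would be much larger than the 250 used here.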
Furthermore, the computationally intensive tasks are written in C++
via the R package Rcpp (eddelbuettel2011rcpp) and the C++
library Armadillo (sanderson2016armadillo). There are plotting functions
for each method, control variables can be included in the model, and there is support for
missing values (bggm_missing).
Supported Data Types:
Continuous: The continuous method was described in Williams2019_bf.
Binary: The binary method builds directly upon talhouk2012efficient, which in turn built upon the approaches of lawrence2008bayesian and webb2008bayesian (to name a few).
Ordinal: Ordinal data require sampling thresholds. Two approaches are included in BGGM: (1) the customary approach described in albert1993bayesian (the default) and (2) the 'Cowles' algorithm described in cowles1996accelerating.
Mixed: The mixed data (a combination of discrete and continuous) method was introduced in hoff2007extending. This is a semi-parametric copula model (i.e., a copula GGM) based on the rank likelihood. Note that this can be used for data consisting entirely of ordinal data.
Additional Features:
The primary focus of BGGM is Gaussian graphical modeling (the inverse covariance matrix).
The remainder is a suite of useful methods not explicitly for GGMs:
Bivariate correlations for binary (tetrachoric), ordinal (polychoric), mixed (rank based), and continuous (Pearson's) data (zero_order_cors).
Multivariate regression for binary (probit), ordinal (probit), mixed (rank likelihood), and continuous data (estimate).
Multiple regression for binary (probit), ordinal (probit), mixed (rank likelihood), and continuous data (e.g., coef.estimate).
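The difference between the bivariate (zero-order) correlations listed here and the partial correlations that define a GGM can be sketched in base R. The example below is illustrative only: it uses simulated data and the plain sample covariance matrix, not BGGM's Bayesian samplers, and all variable names are made up for the sketch.

```r
set.seed(1)

# Simulate a chain x1 -> x2 -> x3, so x1 and x3 are related only through x2
n  <- 2000
x1 <- rnorm(n)
x2 <- 0.7 * x1 + rnorm(n)
x3 <- 0.7 * x2 + rnorm(n)
Y  <- cbind(x1, x2, x3)

# Zero-order (Pearson) correlations
zero_order <- cor(Y)

# Partial correlations from the inverse covariance (precision) matrix:
# rho_ij = -theta_ij / sqrt(theta_ii * theta_jj)
Theta <- solve(cov(Y))
pcors <- -cov2cor(Theta)
diag(pcors) <- 1

print(round(zero_order, 2))
print(round(pcors, 2))
```

The zero-order correlation between x1 and x3 is clearly positive, whereas their partial correlation is near zero once x2 is conditioned on. That is the conditional independence a GGM is meant to capture.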
Note on Conditional (In)dependence Models for Latent Data:
All of the data types (besides continuous) model latent data. That is, unobserved (latent) data is assumed to be Gaussian. For example, a tetrachoric correlation (binary data) is a special case of a polychoric correlation (ordinal data). Both capture relations between "theorized normally distributed continuous latent variables" (Wikipedia). In both instances, the corresponding partial correlation between observed variables is conditioned on the remaining variables in the latent space. This implies that interpretation is similar to continuous data, but with respect to latent variables. We refer interested users to page 2364, section 2.2, in webb2008bayesian.
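The latent-variable idea can be illustrated with a small simulation. This is a sketch under assumed values (a latent correlation of 0.6 and a threshold at zero), not BGGM's estimation procedure. It only shows that correlating the observed binary scores directly understates the relation between the underlying Gaussian variables, which is why tetrachoric and polychoric correlations target the latent scale.

```r
set.seed(2)

# Latent bivariate Gaussian data with correlation 0.6 (assumed for illustration)
n   <- 5000
rho <- 0.6
z1  <- rnorm(n)
z2  <- rho * z1 + sqrt(1 - rho^2) * rnorm(n)

# Observed binary variables: only whether the latent score exceeds zero is recorded
b1 <- as.integer(z1 > 0)
b2 <- as.integer(z2 > 0)

latent_r   <- cor(z1, z2)  # correlation on the latent scale
observed_r <- cor(b1, b2)  # Pearson correlation of the binary scores (attenuated)

print(c(latent = latent_r, observed = observed_r))
```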
High Dimensional Data?
BGGM was built specifically for social-behavioral scientists. Of course, the methods can be used by all researchers. However, there is currently no support for high-dimensional data (i.e., more variables than observations), which are commonplace in the genetics literature. Such data are rare in the social-behavioral sciences. In the future, support for high-dimensional data may be added to BGGM.
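One general reason the high-dimensional setting needs dedicated methods can be sketched in base R: with more variables than observations, the sample covariance matrix is rank deficient, so its inverse does not exist. This illustrates a property of the sample covariance matrix under simulated data. It is not a description of BGGM's internals.

```r
set.seed(3)

n <- 10   # observations
p <- 20   # variables (more than observations)
Y <- matrix(rnorm(n * p), nrow = n, ncol = p)

S <- cov(Y)

# The sample covariance has rank at most n - 1, which is below p here,
# so S is singular and has no ordinary inverse
rank_S <- qr(S)$rank
print(c(rank = rank_S, variables = p))
```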