A re-implementation and extension of the permutation-based network comparison test introduced in Van Borkulo et al. (2017). The extensions include scaling to networks with many nodes and the option to use custom test statistics.
nct(
  Y_g1,
  Y_g2,
  iter = 1000,
  desparsify = TRUE,
  method = "pearson",
  FUN = NULL,
  cores = 1,
  progress = TRUE,
  update_progress = 4,
  ...
)
Y_g1 | A matrix (or data.frame) of dimensions n by p, corresponding to the first dataset (p must be the same as in Y_g2).
Y_g2 | A matrix of dimensions n by p, corresponding to the second dataset (p must be the same as in Y_g1).
iter | Numeric. Number of (Monte Carlo) permutations (defaults to 1000).
desparsify | Logical. Should the de-sparsified glasso estimator be computed (defaults to TRUE)?
method | Character string. Which correlation coefficient (or covariance) is to be computed. One of "pearson" (default), "kendall", or "spearman".
FUN | A function or list of functions (defaults to NULL) computing user-defined test statistics (see User-Defined Functions).
cores | Numeric. Number of cores to use when executing the permutations in parallel (defaults to 1).
progress | Logical. Should a progress bar be included (defaults to TRUE)?
update_progress | How many times should the progress bar be updated (defaults to 4)?
... | Additional arguments passed to ggmncv.
A list of class nct, including the following:

glstr_pvalue: Global strength p-value.
sse_pvalue: Sum of squared error p-value.
jsd_pvalue: Jensen-Shannon divergence p-value.
max_pvalue: Maximum difference p-value.
glstr_obs: Global strength observed.
sse_obs: Sum of squared error observed.
jsd_obs: Jensen-Shannon divergence observed.
max_obs: Maximum difference observed.
glstr_perm: Global strength permutations.
sse_perm: Sum of squared error permutations.
jsd_perm: Jensen-Shannon divergence permutations.
max_perm: Maximum difference permutations.
For user-defined functions, i.e., those provided to FUN, the function name is pasted to _pvalue, _obs, and _perm.
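For example, these elements can be accessed directly from the returned object (a minimal sketch, assuming a fitted object named compare_ggms as in the examples below):

# p-value and observed value for the Jensen-Shannon divergence
compare_ggms$jsd_pvalue
compare_ggms$jsd_obs

# permutation distribution for global strength
compare_ggms$glstr_perm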
User-Defined Functions
These functions must have two arguments, corresponding to the partial correlation network for each group. An example is provided below.
For user-defined functions (FUN), absolute values are used to compute the p-value, assuming more than one value is returned (e.g., centrality). This is done to mimic the R package NCT. A fail-safe way to ensure the p-value is computed correctly is to access the permutations and observed values from the nct object.
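As a minimal sketch (not code from GGMncv itself), a p-value for the user-defined Correlation statistic from the examples below could be computed by hand, assuming the returned elements follow the _obs / _perm naming convention described above:

# observed statistic and its permutation distribution
obs  <- compare_ggms$Correlation_obs
perm <- compare_ggms$Correlation_perm

# p-value based on absolute values, mirroring the built-in handling of FUN
mean(abs(perm) >= abs(obs))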
Finally, comparing edges is not implemented. The most straightforward way to do this is with compare_edges, which uses the de-sparsified estimator.
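A hedged sketch of that workflow, assuming compare_edges accepts one fitted ggmncv object per group (see its documentation for the exact arguments):

# fit each group separately
fit1 <- ggmncv(cor(y1), n = nrow(y1))
fit2 <- ggmncv(cor(y2), n = nrow(y2))

# assumed usage: edge-wise comparison based on the de-sparsified estimator
edge_test <- compare_edges(fit1, fit2)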
In Van Borkulo et al. (2017), it was suggested that these are tests of invariance. To avoid confusion, that terminology is not used in GGMncv. This is because these tests assume invariance, i.e., that the null hypothesis is true, and thus can only be used to detect differences. Hence, it would be incorrect to suggest that the networks are the same, or that there is evidence for invariance, merely because the null hypothesis was not rejected (Williams et al. 2021).
For the default test statistics, Jensen-Shannon divergence is a symmetrized version of Kullback-Leibler divergence (the average of both directions).
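For reference, a minimal sketch of this symmetrized divergence for two zero-mean Gaussians with covariance matrices S1 and S2 (a standard closed-form expression, not GGMncv's internal code):

# KL divergence between N(0, S1) and N(0, S2)
kl_mvn <- function(S1, S2) {
  p <- ncol(S1)
  0.5 * (sum(diag(solve(S2, S1))) - p +
           as.numeric(determinant(S2)$modulus - determinant(S1)$modulus))
}

# the average of both directions
jsd_mvn <- function(S1, S2) 0.5 * (kl_mvn(S1, S2) + kl_mvn(S2, S1))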
Computational Speed
This implementation has two key features that should make it scale to larger networks: (1) parallel computation and (2) the R package glassoFast is used under the hood (as opposed to glasso). CPU (time) comparisons are provided in Sustik and Calderhead (2012).
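For example, the permutations can be distributed across cores via the cores argument (a sketch using the data from the examples below):

# run the permutations in parallel on two cores
compare_ggms <- nct(y1, y2, iter = 1000, cores = 2, progress = FALSE)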
Non-regularized
A non-regularized test can be implemented by setting lambda = 0. Note that lambda is passed to ggmncv via ... .
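For example (a sketch using the data from the examples below):

# lambda = 0 is passed to ggmncv via ..., yielding non-regularized networks
compare_ggms <- nct(y1, y2, iter = 1000, lambda = 0, progress = FALSE)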
References

Sustik MA, Calderhead B (2012). "GLASSOFAST: An efficient GLASSO implementation." UTCS Technical Report TR-12-29.

Van Borkulo CD, Boschloo L, Kossakowski J, Tio P, Schoevers RA, Borsboom D, Waldorp LJ (2017). "Comparing network structures on three aspects: A permutation test." Manuscript submitted for publication, 10.

Williams DR, Briganti G, Linkowski P, Mulder J (2021). "On Accepting the Null Hypothesis of Conditional Independence in Partial Correlation Networks: A Bayesian Analysis." PsyArXiv. doi:10.31234/osf.io/7uhx8, https://psyarxiv.com/7uhx8.
Examples

# \donttest{
# generate network
main <- gen_net(p = 10)

# assume groups are equal
y1 <- MASS::mvrnorm(n = 500, mu = rep(0, 10), Sigma = main$cors)
y2 <- MASS::mvrnorm(n = 500, mu = rep(0, 10), Sigma = main$cors)

compare_ggms <- nct(y1, y2, iter = 500, progress = FALSE)

compare_ggms
#> Network Comparsion Test
#> (GGMncv Edition)
#> ----
#> Maximum Difference
#> p-value: 0.538
#> ----
#> Global Strength
#> p-value: 0.562
#> ----
#> Sum of Squared Error
#> p-value: 0.196
#> ----
#> Jensen-Shannon divergence
#> p-value: 0.122
#> ----

# custom function
# note: x & y are partial correlation networks

# correlation
Correlation <- function(x, y){
  cor(x[upper.tri(x)], y[upper.tri(y)])
}

compare_ggms <- nct(y1, y2, iter = 100, FUN = Correlation, progress = FALSE)

compare_ggms
#> Network Comparsion Test
#> (GGMncv Edition)
#> ----
#> Maximum Difference
#> p-value: 0.45
#> ----
#> Global Strength
#> p-value: 0.52
#> ----
#> Sum of Squared Error
#> p-value: 0.22
#> ----
#> Jensen-Shannon divergence
#> p-value: 0.12
#> ----
#> note: compute p-values manually for custom tests. see vignettes.

# correlation and strength
Strength <- function(x, y){
  NetworkToolbox::strength(x) - NetworkToolbox::strength(y)
}

compare_ggms <- nct(y1, y2, iter = 100,
                    FUN = list(Correlation = Correlation,
                               Strength = Strength),
                    progress = FALSE)

compare_ggms
#> Network Comparsion Test
#> (GGMncv Edition)
#> ----
#> Maximum Difference
#> p-value: 0.44
#> ----
#> Global Strength
#> p-value: 0.54
#> ----
#> Sum of Squared Error
#> p-value: 0.21
#> ----
#> Jensen-Shannon divergence
#> p-value: 0.1
#> ----
#> note: compute p-values manually for custom tests. see vignettes.
# }