Compute the de-sparsified (sometimes called "de-biased") glasso estimator with the approach described in Equation 7 of Jankova and Van De Geer (2015). The basic idea is to undo \(L_1\)-regularization in order to compute p-values and confidence intervals (i.e., to perform statistical inference).
desparsify(object, ...)
object | An object of class `ggmncv`. |
---|---|
... | Currently ignored. |
The de-sparsified estimates, including:
Theta
: De-sparsified precision matrix
P
: De-sparsified partial correlation matrix
According to Jankova and Van De Geer (2015), the de-sparsified estimator, \(\hat{\mathrm{\bf T}}\), is defined as
\(\hat{\mathrm{\bf T}} = 2\hat{\boldsymbol{\Theta}} - \hat{\boldsymbol{\Theta}}\hat{\mathrm{\bf R}}\hat{\boldsymbol{\Theta}},\)
where \(\hat{\boldsymbol{\Theta}}\) denotes the graphical lasso estimator of the precision matrix and \(\hat{\mathrm{\bf R}}\) is the sample correlation matrix. Further details can be found in Section 2 ("Main Results") of Jankova and Van De Geer (2015).
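To make the formula concrete, the estimator can be sketched directly in R on toy matrices. Here `Theta` is a hypothetical stand-in for a glasso precision estimate (not actual GGMncv output), and the conversion to partial correlations follows the usual sign-flipped standardization of a precision matrix.

```r
# a minimal sketch of Equation 7, T = 2*Theta - Theta %*% R %*% Theta,
# with illustrative (not package-supplied) matrices
R <- matrix(c(1, 0.5,
              0.5, 1), 2, 2)          # sample correlation matrix
Theta <- solve(R + diag(0.1, 2))      # stand-in for a glasso precision estimate

# de-sparsified precision matrix (Equation 7)
That <- 2 * Theta - Theta %*% R %*% Theta

# partial correlations: -Theta_ij / sqrt(Theta_ii * Theta_jj)
P <- -cov2cor(That)
diag(P) <- 0
```

Because the de-sparsified estimator removes the shrinkage toward zero, its off-diagonal entries are typically dense (no exact zeros), which is what makes subsequent inference possible.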
This approach is built upon earlier work on the de-sparsified lasso estimator (Javanmard and Montanari 2014; Van de Geer et al. 2014; Zhang and Zhang 2014).
This assumes (reasonably) Gaussian data and should not be expected to work for, say, polychoric correlations. Further, all work to date has only considered the graphical lasso estimator, not de-sparsifying nonconvex regularization. Accordingly, it is probably best to set `penalty = "lasso"` in `ggmncv`.
This function only provides the de-sparsified estimator, not p-values or confidence intervals (see `inference`).
Jankova J, Van De Geer S (2015).
“Confidence intervals for high-dimensional inverse covariance estimation.”
Electronic Journal of Statistics, 9(1), 1205--1229.
Javanmard A, Montanari A (2014).
“Confidence intervals and hypothesis testing for high-dimensional regression.”
The Journal of Machine Learning Research, 15(1), 2869--2909.
Van de Geer S, Bühlmann P, Ritov Y, Dezeure R (2014).
“On asymptotically optimal confidence regions and tests for high-dimensional models.”
The Annals of Statistics, 42(3), 1166--1202.
Zhang C, Zhang SS (2014).
“Confidence intervals for low dimensional parameters in high dimensional linear models.”
Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(1), 217--242.
# data
Y <- GGMncv::Sachs[,1:5]

n <- nrow(Y)
p <- ncol(Y)

# fit model
# note: fix lambda, as in the reference
fit <- ggmncv(cor(Y), n = nrow(Y),
              progress = FALSE,
              penalty = "lasso",
              lambda = sqrt(log(p)/n))

# fit model
# note: no regularization
fit_non_reg <- ggmncv(cor(Y), n = nrow(Y),
                      progress = FALSE,
                      penalty = "lasso",
                      lambda = 0)

# remove (some) bias and sparsity
That <- desparsify(fit)

# graphical lasso estimator
fit$P
#>             [,1]        [,2]        [,3]        [,4]       [,5]
#> [1,]  0.00000000 -0.09592018  0.10347554  0.00000000 -0.3163087
#> [2,] -0.09592018  0.00000000  0.22902674  0.31104348  0.1749705
#> [3,]  0.10347554  0.22902674  0.00000000  0.07298544 -0.4440013
#> [4,]  0.00000000  0.31104348  0.07298544  0.00000000 -0.2600128
#> [5,] -0.31630867  0.17497053 -0.44400133 -0.26001275  0.0000000

# de-sparsified estimator
That$P
#>              [,1]       [,2]        [,3]         [,4]       [,5]
#> [1,]  0.000000000 -0.1094239  0.10948543  0.001852132 -0.3202569
#> [2,] -0.109423934  0.0000000  0.26145644  0.335307844  0.2114870
#> [3,]  0.109485428  0.2614564  0.00000000  0.059653978 -0.4636671
#> [4,]  0.001852132  0.3353078  0.05965398  0.000000000 -0.2804660
#> [5,] -0.320256910  0.2114870 -0.46366714 -0.280466031  0.0000000

# mle
fit_non_reg$P
#>              [,1]       [,2]        [,3]         [,4]       [,5]
#> [1,]  0.000000000 -0.1101030  0.10995010  0.002016594 -0.3192876
#> [2,] -0.110102999  0.0000000  0.26561295  0.337956876  0.2162442
#> [3,]  0.109950103  0.2656129  0.00000000  0.056272312 -0.4664779
#> [4,]  0.002016594  0.3379569  0.05627231  0.000000000 -0.2833199
#> [5,] -0.319287553  0.2162442 -0.46647791 -0.283319888  0.0000000