Abstract
This paper presents a Bayesian reformulation of covariate-assisted principal regression for covariance matrix outcomes, to identify low-dimensional components in the covariance associated with covariates. By introducing a geometric formulation for the covariance matrices and leveraging the Euclidean geometry of their tangent space, we estimate dimension reduction parameters and model covariance heterogeneity as a function of covariates. This method enables joint estimation and uncertainty quantification of the relevant model parameters associated with the heteroscedasticity. We demonstrate our approach through simulation studies and apply it to analyze associations between covariates and brain functional connectivity, using data from the Human Connectome Project.
Keywords: brain functional connectivity, dimension reduction, heteroscedasticity
1 Introduction
This paper reformulates the covariate-assisted principal (CAP) regression of Zhao et al. (2021b) in the Bayesian paradigm. The approach identifies covariate-relevant components of the covariance of multivariate response data. Specifically, the method estimates a set of linear projections of multivariate response signals whose variances are related to external covariates. In neuroscience, there is interest in analyzing the statistical dependency between time series of brain signals from distinct regions of the brain, which we refer to as functional connectivity (FC) (Lindquist 2008; Fornito and Bullmore 2012; Fornito et al. 2013; Monti et al. 2014; Fox and Dunson 2015). The brain signals underlying FC are multivariate, and each region's activity is considered relative to the others (Varoquaux et al. 2010) in analyzing FC, as this statistical dependency is related with behavioral characteristics (covariates). This paper develops a Bayesian approach to conducting supervised dimension reduction for the response signals, to analyze the association between external covariates and the FC characterized by the multivariate signals' covariances.
Typically, the first step in analyzing brain FC is to define a set of nodes corresponding to spatial regions of interest (ROIs), where each node is associated with its own time course of imaging data. The network connections (or an "edge" structure between the nodes) are subsequently estimated based on the statistical dependency between the nodes' time courses (van der Heuvel and Hulshoff Pol 2010; Friston 2011). FC networks have been inferred using Pearson's correlation coefficients (Hutchison et al. 2013) and also with partial correlations in the context of Gaussian graphical models (Whittaker 1990; Hinne et al. 2014), summarized in the precision or inverse covariance matrix. In recent years, there has been a focus on subject-level graphical models where the node-to-node dependencies vary with respect to subject-level covariates. This line of research involves methods to estimate or test group-specific graphs (Guo et al. 2011; Danaher et al. 2014; Narayan et al. 2015; Peterson et al. 2015; Xia et al. 2015; Cai et al. 2016; Saegusa and Shojaie 2016; Lin et al. 2017; Tan et al. 2017; Xia and Li 2017; Durante and Dunson 2018; Xia et al. 2018), as well as general Gaussian graphical models for graph edges that allow both continuous and discrete covariates, estimated based on trees (Liu et al. 2010), kernels (Kolar et al. 2010; Lee and Xue 2018), or linear or additive regression (Ni et al. 2019; Wang et al. 2022; Zhang and Li 2023). However, like other standard node-wise regression methods (e.g. Meinshausen and Buhlmann 2006; Peng et al. 2009; Kolar et al. 2010; Cheng et al. 2014; Leday et al. 2017; Ha et al. 2021) in Gaussian graphical models, these approaches focus on edge detection (ie estimation of the off-diagonal elements) rather than estimation of the full precision or covariance matrix, and they do not explicitly constrain the precision or covariance matrices to be positive definite. Works on general tensor outcome regression (Li and Zhang 2017; Sun and Li 2017; Lock 2018) also do not generally guarantee the positive definiteness of the outcomes. While dimension reduction of individual covariances has been studied in brain dynamic connectivity analysis (Dai et al. 2020), in computer vision (Harandi et al. 2017; Li and Lu 2018; Gao et al. 2023), in brain-computer interfaces (Davoudi et al. 2017; Xie et al. 2017), and in multi-group covariance estimation (Flury 1984, 1986; Boik 2002; Pourahmadi et al. 2007; Hoff 2009; Franks and Hoff 2019), these approaches either did not utilize covariate information in conducting the dimension reduction or viewed the data at the group level, which does not account for subject-level heterogeneity in the brain networks. Gaussian graphical models have also been applied to study brain connectivity networks in fMRI data (e.g. Li and Solea 2018; Zhang et al. 2020); however, the focus was on analyzing connectivity networks without explicitly considering their relationship with subject-level covariates.
In this paper, in line with the covariance regression literature (see, e.g. Engle and Kroner 1995; Fong et al. 2006; Varoquaux et al. 2010; Pourahmadi 2011; Hoff and Niu 2012; Fox and Dunson 2015; Zou et al. 2017; Zhao et al. 2021a,b, 2024), we frame the problem of analyzing FC as modeling of heteroscedasticity, ie estimating a covariance function across a range of values of an explanatory x-variable. In contrast to the approach developed in Zhao et al. (2021b), where each projection vector is estimated sequentially, and that in Franks (2022), where statistical inference is conducted conditionally on the estimated dimension-reduced subspace, the proposed framework allows coherent and simultaneous inference on all model parameters within the Bayesian paradigm.
One typical approach to associating brain FC with behavior is to take a massive univariate test approach that relates each connectivity matrix element to subject-level covariates (e.g. Woodward et al. 2011; Grillon et al. 2013). However, this "massive edgewise regression" lacks statistical power, as it (i) ignores dependencies among the connectivity elements; and (ii) involves a quadratically increasing number of regressions that exacerbates the problem of multiple testing. On the other hand, multivariate methods such as principal component analysis (PCA), as considered in Crainiceanu et al. (2011), consider the data from all ROIs at once, reducing the dimensionality of the original outcome to a smaller number of "network" components; however, these common components may be associated with small eigenvalues, or the corresponding eigenvalues may not be associated with covariates.
The outcome data of interest are multivariate time-series resting-state fMRI (rs-fMRI) data measured simultaneously across p ROIs (or parcels) defined based on an anatomical parcellation (Eickhoff et al. 2018) or "network nodes" (Smith et al. 2012) derived from a data-driven algorithm such as independent component analysis (ICA) (Calhoun et al. 2009; Smith et al. 2013). As in Seiler and Holmes (2017), we will apply the Bayesian CAP regression to data from the Human Connectome Project (HCP) (Van Essen et al. 2013) to compare short sleepers (ie 6 hours or less) with conventional sleepers (ie 7 to 9 hours) with respect to their FC.
2 Method
2.1 Covariance regression models
We consider n subjects, with subject-specific covariances $\Sigma_i$ ($i = 1, \ldots, n$) for brain activity time series from p ROIs. The space of valid covariance matrices is the space of symmetric positive definite (SPD) matrices, denoted as $\mathcal{S}_p^+$ in this paper. The rs-fMRI time series $z_{it} \in \mathbb{R}^p$ for a given subject i are drawn from a Gaussian distribution, $z_{it} \sim N_p(\mu_i, \Sigma_i)$, with $\mu_i \in \mathbb{R}^p$ and $\Sigma_i \in \mathcal{S}_p^+$. Without loss of generality, we assume that the observed signal is mean-centered so that $\mu_i = 0$ for each subject i, as our focus is on FC characterized by the covariance between the brain signals. We observe $z_{it}$ over $t = 1, \ldots, T_i$ time points for each subject i, along with a subject-level vector of covariates $x_i \in \mathbb{R}^q$.
In this paper, instead of directly modeling the subject-specific covariances $\Sigma_i$ (as in Seiler and Holmes (2017); Fox and Dunson (2015); Zou et al. (2017)), in which most of the covariance heterogeneity may be unrelated with $x_i$, we aim to extract a lower dimensional component whose covariance heterogeneity is related with $x_i$. We will characterize this lower dimensional structure by a dimension reducing matrix $\Gamma = [\gamma_1, \ldots, \gamma_d] \in \mathbb{R}^{p \times d}$ with $\Gamma^\top \Gamma = I_d$ (ie $\Gamma$ is in a Stiefel manifold), where $d < p$. Specifically, we consider a latent factor model for $z_{it}$,

$z_{it} = \Gamma s_{it} + \Gamma_0 s_{0,it}$   (2.1)

with latent factors $s_{it}$ and $s_{0,it}$, of dimensions d and p−d, respectively, where

$s_{it} \sim N_d(0, \Lambda_i), \quad \Lambda_i = \mathrm{diag}\{\exp(\eta_{i1}), \ldots, \exp(\eta_{id})\}$   (2.2)

models the x-related heteroscedasticity along the projection directions $\Gamma$. In (2.2), $\Lambda_i$ is a diagonal matrix, whose diagonal elements are given by a linear predictor vector $\eta_i = (\eta_{i1}, \ldots, \eta_{id})^\top = B^\top \tilde{x}_i + b_i$. In (2.1), $\Gamma$ specifies the Principal Directions of Covariance (PDCs) of $z_{it}$ related with $x_i$, whereas the other orthogonal components $\Gamma_0 \in \mathbb{R}^{p \times (p-d)}$, which satisfy $\Gamma^\top \Gamma_0 = 0$ and $s_{0,it} \sim N_{p-d}(0, \Lambda_{0,i})$, are included to account for the "noise" directions and magnitudes of the heteroscedasticity that are unrelated with $x_i$.

In (2.2), the matrix $B = [\beta_1, \ldots, \beta_d] \in \mathbb{R}^{(q+1) \times d}$ (where the first row represents the intercept) is a regression coefficient matrix that relates $\tilde{x}_i = (1, x_i^\top)^\top$ (with its first element being 1) to the subject-level outcome covariance. Under model (2.1), the subject-level covariance is given by

$\Sigma_i = \mathrm{cov}(z_{it}) = \Gamma \Lambda_i \Gamma^\top + \Gamma_0 \Lambda_{0,i} \Gamma_0^\top$   (2.3)

that decomposes the individual covariance matrices into two components, covariate-related and covariate-unrelated, a principal factor decomposition of $\Sigma_i$. In (2.3), unlike the more general structure on $\Lambda_{0,i}$, whose variability is unrelated with $x_i$, the PDCs $\Gamma$ serve as features (ie "subnetworks") that we expect to be consistent across subjects. Along $\Gamma$, model (2.2) incorporates subject-level random effects $b_i$ to capture additional heteroscedasticity not captured by $x_i$. In model (2.2), the diagonality of the d × d core tensor $\Lambda_i$ is needed as an identifiability condition, since any non-diagonal SPD $\Lambda_i$ can be diagonalized by its normalized eigenvectors $V$ (assuming common eigenvectors for $\Lambda_i$ across subjects), and $\Gamma V$ can instead be used as the orthonormal dimension reduction matrix. While we impose the diagonality of $\Lambda_i$, we allow $b_i \sim N_d(0, \Sigma_b)$, where $\Sigma_b$ may have off-diagonal elements that allow residual correlation in the projected signals' variances beyond what is modeled by the common covariates $x_i$.
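To make the construction concrete, the following minimal sketch simulates a subject-specific covariance under (2.1)–(2.3); the dimensions, parameter values, and helper names (e.g. `subject_covariance`, the uniform draw for the noise eigenvalues) are illustrative assumptions for this sketch, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)
p, d, q = 10, 2, 2

# Orthonormal basis: Gamma (p x d) spans the PDCs, Gamma0 its complement.
Q, _ = np.linalg.qr(rng.standard_normal((p, p)))
Gamma, Gamma0 = Q[:, :d], Q[:, d:]

B = rng.standard_normal((q + 1, d))      # coefficients; first row = intercepts
Sigma_b = 0.1 * np.eye(d)                # random-effect covariance

def subject_covariance(x_i):
    """Sigma_i = Gamma Lambda_i Gamma' + Gamma0 Lambda0_i Gamma0', eq. (2.3)."""
    x_tilde = np.concatenate(([1.0], x_i))                # prepend 1 (intercept)
    b_i = rng.multivariate_normal(np.zeros(d), Sigma_b)   # subject random effect
    eta_i = B.T @ x_tilde + b_i                           # linear predictor, eq. (2.2)
    Lambda_i = np.diag(np.exp(eta_i))                     # diagonal SPD core
    Lambda0_i = np.diag(rng.uniform(0.5, 1.5, p - d))     # x-unrelated "noise" part
    return Gamma @ Lambda_i @ Gamma.T + Gamma0 @ Lambda0_i @ Gamma0.T

Sigma_i = subject_covariance(rng.standard_normal(q))
assert np.all(np.linalg.eigvalsh(Sigma_i) > 0)            # SPD by construction
```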
Remark 2.1.
The covariance model (2.1) and (2.2) should be distinguished from the principal component (PC) regression that relates $x_i$ with the PCs $\Gamma^\top z_{it}$, as our interest is in studying the association between the covariates $x_i$ and the variance of the components (ie heteroscedasticity), rather than with the components themselves.
For a multivariate outcome signal $z_{it}$ at time point t for subject i, Seiler and Holmes (2017) utilized a heteroscedasticity model, $z_{it} \sim N_p(0, \Psi + \beta x_i x_i^\top \beta^\top)$, where the outcome covariance matrix is modeled by a quadratic function of $x_i$, in which $\beta \in \mathbb{R}^{p \times q}$ is the regression coefficient matrix associated with $x_i$, and $\Psi$ is a diagonal covariance matrix. However, this model is quite restrictive, as its outer product term $\beta x_i x_i^\top \beta^\top$ is of rank 1, and the noise covariance term $\Psi$ is diagonal with independent variances. On the other hand, model (2.3) identifies a covariate-associated rank-d (where $1 \le d < p$) structure via $\Gamma \Lambda_i \Gamma^\top$ and allows a less restrictive noise covariance structure, which makes the covariance modeling with respect to $x_i$ more flexible than that of Seiler and Holmes (2017). In particular, the outcome dimension reduction via $\Gamma$ implicit in model (2.3) offers computational advantages by working with low dimensional (d-by-d) covariances (rather than full p-by-p covariances), which can be particularly advantageous when the number of within-subject time points ($T_i$) is relatively small compared to the signal dimension p. The general outer product approach proposed by Hoff and Niu (2012) replaces the diagonal noise covariance term by a full p × p SPD matrix, requiring a large number of parameters (that can scale quadratically in p). The approaches proposed in Fox and Dunson (2015); Zou et al. (2017) also similarly model the whole p × p matrix $\Sigma_i$, which may make the interpretation challenging for large matrices (Zhao et al. 2021b).
Zhao et al. (2021b) considered CAP regression, where the PDCs $\gamma_k$ are sequentially estimated subject to the identifiability constraints $\gamma_k^\top \bar\Sigma \gamma_k = 1$ (in which $\bar\Sigma$ is a p × p covariance representative of the overall study population) and $\gamma_k^\top \bar\Sigma \gamma_j = 0$ for $j < k$. However, under a sequential optimization framework, joint inference on the outcome projection matrix $\Gamma$ and the regression coefficients B is not straightforward, and thus, Zhao et al. (2021a,b, 2024) conducted bootstrap-based statistical inference only on the coefficients B, and not on $\Gamma$. On the other hand, the proposed model (2.1), coupled with the core tensor model (2.2), further accounts for the additional heteroscedasticity in the projected outcomes by using subject-level random effects $b_i$ to relax the model assumption, while simultaneously modeling all the relevant parameters $(\Gamma, B, \Sigma_b)$, allowing for a more coherent downstream analysis that improves the model interpretability, which we will discuss in Section 4.
2.2 Tangent space parametrization of dimension-reduced covariance
Due to the positive definiteness constraint $v^\top \Sigma v > 0$ for all nonzero $v \in \mathbb{R}^p$, the space of covariance matrices forms a curved manifold that does not conform to Euclidean geometry; for example, the negative of an SPD matrix and some linear combinations of SPD matrices are not SPD (Schwartzman 2016). Thus, analyzing covariances in a Euclidean vector space is not adequate to capture the curved nature of PDCs, and leads to biased estimation of the PDCs (Zhao et al. 2021b). However, $\mathcal{S}_p^+$ is a Riemannian manifold under the affine-invariant Riemannian metric (AIRM) (Pennec et al. 2006), whose tangent space forms a vector space. We will use a Riemannian parametrization of SPD matrices in estimating the PDCs in this paper. A tangent space projection requires selection of a reference point that is close to the objects to be projected. A sensible reference point on $\mathcal{S}_p^+$ is a mean of $\{\Sigma_i\}_{i=1}^n$, denoted as $\bar\Sigma$. We will use the matrix whitening transport of Ng et al. (2016) to bring the covariances close together, by applying matrix whitening based on $\bar\Sigma$. The resulting whitened covariances would be close to the identity matrix, at which we can construct a common tangent space for projection.
Remark 2.2.
Here we briefly review some relevant concepts of Riemannian geometry. Let $A \in \mathcal{S}_p^+$, and let $T_A$ be the tangent space at A. Given two tangent vectors $X, Y \in T_A$, the AIRM inner product is $\langle X, Y \rangle_A = \mathrm{tr}(A^{-1} X A^{-1} Y)$. Given $X \in T_A$, there is a unique geodesic, denoted as $\gamma_{(A,X)}(t)$, such that $\gamma_{(A,X)}(0) = A$ and $\dot\gamma_{(A,X)}(0) = X$,

$\gamma_{(A,X)}(t) = A^{1/2} \exp(t A^{-1/2} X A^{-1/2}) A^{1/2}$   (2.4)

that connects A to a point $B \in \mathcal{S}_p^+$ when evaluated at t = 1. For $X \in T_A$, the Exponential map, defined as $\mathrm{Exp}_A(X) = \gamma_{(A,X)}(1)$, projects the given X to a point $B \in \mathcal{S}_p^+$, in such a way that the distance between A and X on the tangent plane is the same as that between A and B on the manifold. The (AIRM) Log map, which is the inverse mapping of $\mathrm{Exp}_A$, projects the point B back to the tangent vector,

$\mathrm{Log}_A(B) = A^{1/2} \log(A^{-1/2} B A^{-1/2}) A^{1/2}$   (2.5)

and we can re-express the geodesic (2.4) as $\gamma_{(A,X)}(t) = \mathrm{Exp}_A(tX)$. The corresponding geodesic distance between A and B is $d(A, B) = \|\log(A^{-1/2} B A^{-1/2})\|_F$, where $\|\cdot\|_F$ is the Frobenius norm.
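As an illustration of Remark 2.2, the following is a minimal sketch of the AIRM Exp/Log maps and geodesic distance, implemented via eigendecompositions; the function names (`spd_fun`, `log_map`, `exp_map`, `geo_dist`) are ours, not the paper's.

```python
import numpy as np

def spd_fun(A, f):
    """Apply a scalar function to a symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return V @ np.diag(f(w)) @ V.T

def log_map(A, B):
    """Log_A(B) = A^{1/2} log(A^{-1/2} B A^{-1/2}) A^{1/2}, eq. (2.5)."""
    Ah, Aih = spd_fun(A, np.sqrt), spd_fun(A, lambda w: 1.0 / np.sqrt(w))
    return Ah @ spd_fun(Aih @ B @ Aih, np.log) @ Ah

def exp_map(A, X):
    """Exp_A(X): inverse of the Log map, taking tangent vectors back to SPD."""
    Ah, Aih = spd_fun(A, np.sqrt), spd_fun(A, lambda w: 1.0 / np.sqrt(w))
    return Ah @ spd_fun(Aih @ X @ Aih, np.exp) @ Ah

def geo_dist(A, B):
    """AIRM geodesic distance ||log(A^{-1/2} B A^{-1/2})||_F."""
    Aih = spd_fun(A, lambda w: 1.0 / np.sqrt(w))
    return np.linalg.norm(spd_fun(Aih @ B @ Aih, np.log), 'fro')

# Round trip: Exp_A(Log_A(B)) recovers B, and the distance is symmetric.
rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4)); A = M @ M.T + 4 * np.eye(4)
M = rng.standard_normal((4, 4)); B = M @ M.T + 4 * np.eye(4)
assert np.allclose(exp_map(A, log_map(A, B)), B, atol=1e-8)
assert abs(geo_dist(A, B) - geo_dist(B, A)) < 1e-8
```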
In this paper, for each dimension reducing matrix $\Gamma$, we will use $\bar\Lambda = \Gamma^\top \bar\Sigma \Gamma$, where $\bar\Sigma$ is a fixed representative population level covariance, to "whiten" the individual level dimension-reduced covariances $\Lambda_i$ of model (2.3). Specifically, we will normalize $\Lambda_i$ by $\bar\Lambda^{-1/2}$ (where $\bar\Lambda^{1/2}$ is computed based on the eigendecomposition of $\bar\Lambda$), so that the resulting individual "whitened" SPD $\bar\Lambda^{-1/2} \Lambda_i \bar\Lambda^{-1/2}$ is close to the identity matrix $I_d$. We will parametrize these in the tangent space at $I_d$, by projecting at $I_d$ using the Log map,

$\Lambda_i^* := \mathrm{Log}_{I_d}(\bar\Lambda^{-1/2} \Lambda_i \bar\Lambda^{-1/2}) = \log(\bar\Lambda^{-1/2} \Lambda_i \bar\Lambda^{-1/2}),$   (2.6)

locally mapping the bipoint $(\bar\Lambda, \Lambda_i)$ to an element in the tangent space at $I_d$. For notational convenience, in (2.6) we denote the Log map, given $\bar\Lambda$, as $\Lambda_i^*$, which is no longer constrained by positive definiteness (Pervaiz et al. 2020) and forms a vector space. Then, treating $\Lambda_i^*$ as a local perturbation of $I_d$ in tangent space, we model $\Lambda_i^*$ in (2.6) by a linear model of the form,

$\Lambda_i^* = \mathrm{diag}(\eta_i), \quad \eta_i = B^\top \tilde{x}_i + b_i,$   (2.7)

where the linear predictor $\eta_i$ lies in an (unrestricted) Euclidean vector space. Upon parametrizing $\Lambda_i^*$ (with appropriate priors on B and $b_i$), we will re-map these covariate-parametrized objects in (2.7) to the original space $\mathcal{S}_d^+$, by first taking the Exponential map $\mathrm{Exp}_{I_d}(\Lambda_i^*) = \exp(\Lambda_i^*)$ (ie taking (2.4) at t = 1 and $A = I_d$) and then translating it back to the base point $\bar\Lambda$ through "de-whitening" with $\bar\Lambda^{1/2}$, yielding

$\Lambda_i = \bar\Lambda^{1/2} \exp(\mathrm{diag}(\eta_i)) \bar\Lambda^{1/2},$   (2.8)

which completes our parametrization of the core tensor $\Lambda_i$ in (2.3). To define the mapping (2.6), we select $\bar\Sigma$ to represent an estimate of the Euclidean average of the $\Sigma_i$'s. Among the estimators examined in previous works (Dadi et al. 2019; Pervaiz et al. 2020), this choice of $\bar\Sigma$ showed stable performance across various scenarios. We set $\bar\Sigma = \frac{1}{n} \sum_{i=1}^n \hat\Sigma_i$, where $\hat\Sigma_i = \frac{1}{T_i} \sum_{t=1}^{T_i} z_{it} z_{it}^\top$.
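A minimal sketch of the whitening parametrization (2.6)–(2.8) follows; `Lambda_bar` plays the role of $\bar\Lambda = \Gamma^\top \bar\Sigma \Gamma$, and the function names are our illustrative choices.

```python
import numpy as np

def spd_fun(A, f):
    """Apply a scalar function to a symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return V @ np.diag(f(w)) @ V.T

def core_from_eta(eta_i, Lambda_bar):
    """Eq. (2.8): de-whiten exp(diag(eta_i)) to obtain the SPD core Lambda_i."""
    Lh = spd_fun(Lambda_bar, np.sqrt)
    return Lh @ np.diag(np.exp(eta_i)) @ Lh

def tangent_from_core(Lambda_i, Lambda_bar):
    """Eq. (2.6): whiten Lambda_i by Lambda_bar^{-1/2}, then take the matrix log."""
    Lih = spd_fun(Lambda_bar, lambda w: 1.0 / np.sqrt(w))
    return spd_fun(Lih @ Lambda_i @ Lih, np.log)

# Round trip: the tangent coordinates of (2.8) recover diag(eta_i).
rng = np.random.default_rng(2)
M = rng.standard_normal((3, 3)); Lambda_bar = M @ M.T + 3 * np.eye(3)
eta = rng.standard_normal(3)
assert np.allclose(tangent_from_core(core_from_eta(eta, Lambda_bar), Lambda_bar),
                   np.diag(eta), atol=1e-8)
```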
2.3 Posterior inference
2.3.1 Prior and likelihood specification
We perform posterior inference on the tangent space parameterized model (2.7), which will be mapped to parametrization (2.2). Let $\mathcal{D} = \{(z_{i1}, \ldots, z_{iT_i}, x_i)\}_{i=1}^n$ represent the observed data, let $b = \{b_i\}_{i=1}^n$ denote the collection of subject-level random effects, and let $\Theta = (\Gamma, B, \Sigma_b, b)$. The posterior of the parameters can be expressed as the product of a prior and the likelihood,

$\pi(\Theta \mid \mathcal{D}) \propto \pi(\Theta)\, \mathcal{L}(\mathcal{D} \mid \Theta).$   (2.9)

The covariate relevant component likelihood for subject i under (2.1) is

$\mathcal{L}_i(\Theta) = \prod_{t=1}^{T_i} N_d(\Gamma^\top z_{it};\, 0, \Lambda_i) = \prod_{t=1}^{T_i} N_d(\Gamma^\top z_{it};\, 0, \bar\Lambda^{1/2} \exp(\mathrm{diag}(\eta_i)) \bar\Lambda^{1/2}),$   (2.10)

where the last expression follows from the tangent-space parametrization (2.8) of $\Lambda_i$. Equation (2.10) indicates that the likelihood is in the form of a Gaussian likelihood of transformed responses,

$\tilde{z}_{it} = \Gamma^\top z_{it} \sim N_d(0, \Lambda_i),$   (2.11)

and no attempt will be made to estimate the parameters $(\Gamma_0, \Lambda_{0,i})$ in (2.1) unrelated with $x_i$.
We specify the prior in (2.9) as

$\pi(\Theta) = \pi(\Gamma)\, \pi(B)\, \pi(\Sigma_b) \prod_{i=1}^n \pi(b_i \mid \Sigma_b),$   (2.12)

using independent priors $\pi(\Gamma)$, $\pi(B)$ and $\pi(\Sigma_b)$, and a conditional prior on $b_i$ given $\Sigma_b$ based on $b_i \sim N_d(0, \Sigma_b)$. Below, $\sigma_b$ denotes the vector of standard deviations given by the square roots of the diagonal elements of $\Sigma_b$. For B, we use a mean zero matrix Gaussian prior with element-wise standard deviation $\sigma_B$. For $\Sigma_b$, which we decompose into $\Sigma_b = \mathrm{diag}(\sigma_b)\, R\, \mathrm{diag}(\sigma_b)$, we use a unit-scale half-Cauchy distribution (Gelman 2006; Polson and Scott 2012) on each element of the standard deviation vector $\sigma_b$ (allowing for the possibility of extreme values) and a Lewandowski-Kurowicka-Joe (LKJ) prior (Lewandowski et al. 2009) on the correlation matrix R with hyperparameter η > 0 (specifying the amount of expected prior correlation). For $\Gamma$, we use a matrix angular central Gaussian (MACG) distribution (Chikuse 1990; Jupp and Mardia 1999) with hyperparameter $\Sigma_r \in \mathcal{S}_p^+$. An orthonormal random matrix $\Gamma$ is said to be distributed as MACG($\Sigma_r$) if $\Gamma = U(U^\top U)^{-1/2}$, where U follows a p × d matrix normal distribution, whose density is

$p(U) = (2\pi)^{-pd/2} |\Sigma_r|^{-d/2} \exp\{-\tfrac{1}{2} \mathrm{tr}(U^\top \Sigma_r^{-1} U)\}.$   (2.13)

If the row covariance $\Sigma_r = I_p$, then the prior on U encodes no spatial information. In our illustrations, we employed flat priors on $\Gamma$ and the correlation matrix R (with $\Sigma_r = I_p$ and η = 1, respectively), and a weakly informative prior on B, using a large element-wise prior standard deviation $\sigma_B$.
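For illustration, a draw from the MACG prior can be generated as the polar factor of a matrix normal draw; the sketch below assumes the uniform case $\Sigma_r = I_p$ by default, and the helper name `r_macg` is ours.

```python
import numpy as np

def r_macg(p, d, Sigma_r=None, rng=None):
    """Draw Gamma ~ MACG(Sigma_r) as the polar factor of a matrix normal draw."""
    rng = np.random.default_rng() if rng is None else rng
    Sigma_r = np.eye(p) if Sigma_r is None else Sigma_r
    U = np.linalg.cholesky(Sigma_r) @ rng.standard_normal((p, d))
    w, V = np.linalg.eigh(U.T @ U)                 # (U'U)^{-1/2} via eigendecomposition
    return U @ V @ np.diag(1.0 / np.sqrt(w)) @ V.T

Gamma = r_macg(p=15, d=4, rng=np.random.default_rng(3))
assert np.allclose(Gamma.T @ Gamma, np.eye(4), atol=1e-8)   # orthonormal columns
```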
2.3.2 Posterior computation via polar expansion
Markov chain Monte Carlo (MCMC) sampling of $\Theta$ from the posterior (2.9) is challenging due to the restriction that $\Gamma$ lies in a Stiefel manifold. We will use polar expansion to transform the orthonormal parameter $\Gamma$ to an unconstrained object to work around this restriction. Generally, "parameter expansion" of a statistical model refers to methods that expand the parameter space by introducing redundant working parameters for computational purposes (Jauch et al. 2021). By polar decomposition (Higham 1986), any arbitrary matrix $U \in \mathbb{R}^{p \times d}$ can be decomposed into two components,

$U = \Gamma_U P_U, \quad \Gamma_U = U (U^\top U)^{-1/2}, \quad P_U = (U^\top U)^{1/2},$   (2.14)

where the first component $\Gamma_U$ is an orthonormal (rotation) matrix, and the second $P_U$ is a symmetric nonnegative definite (stretch tensor) matrix.

Using a MACG prior on $\Gamma$ with the prior on U in (2.13) allows for posterior inference on U (rather than directly on $\Gamma$). By employing the polar expansion of $\Gamma$ to U in (2.14), we "parameter expand" an orthonormal $\Gamma$ to an unconstrained U. This expanded parameter maintains the same model likelihood as in (2.10). However, the prior $\pi(\Gamma)$ in (2.12) expands to $\pi(U)$ under parametrization (2.14), leading to the corresponding posterior expansion from $\pi(\Theta \mid \mathcal{D})$ in (2.9) to $\pi(U, B, \Sigma_b, b \mid \mathcal{D})$. Using MCMC, we first approximate samples from the expanded posterior, then conduct the polar decomposition (2.14) to obtain the samples from the posterior of $\Gamma$, which can be verified via a change of variables from U to $(\Gamma_U, P_U)$. Specifically, given a Markov chain $\{(U^{(s)}, B^{(s)}, \Sigma_b^{(s)}, b^{(s)})\}_{s=1}^S$ with a stationary distribution proportional to the expanded posterior, we approximate the posterior of $\Theta$ by $\{(\Gamma^{(s)}, B^{(s)}, \Sigma_b^{(s)}, b^{(s)})\}_{s=1}^S$, where $\Gamma^{(s)} = U^{(s)}(U^{(s)\top} U^{(s)})^{-1/2}$ for each s, yielding approximate samples from $\pi(\Theta \mid \mathcal{D})$.
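The post-hoc polar step can be sketched with scipy's `polar` routine; the `to_stiefel` name and the array shapes below are our illustrative assumptions.

```python
import numpy as np
from scipy.linalg import polar

def to_stiefel(U_samples):
    """Map each expanded draw U^(s) to Gamma^(s) = U (U'U)^{-1/2}, eq. (2.14)."""
    return np.stack([polar(U)[0] for U in U_samples])

# Example: three draws of an unconstrained 10 x 2 working parameter.
rng = np.random.default_rng(4)
G = to_stiefel(rng.standard_normal((3, 10, 2)))
assert np.allclose(G[0].T @ G[0], np.eye(2), atol=1e-8)
```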
In this paper, we approximate the posterior distribution of the parameters using an adaptive Hamiltonian Monte Carlo (HMC) sampler (Neal 2011) with automatic differentiation and adaptive tuning, implemented in Stan (Stan Development Team 2023). Consequently, we obtain HMC posterior samples of $\Theta = (\Gamma, B, \Sigma_b, b)$. The mapping between B under parametrization (2.7) and under the original parametrization (2.2) is given in Supplementary Materials S1. As in any PCA-type analysis, there is a sign non-identifiability of $\Gamma$: the matrix is identified only up to random sign changes of each component; that is, the component vectors $\gamma_k$ and $-\gamma_k$ correspond to the same direction. We align the posterior samples $\{\Gamma^{(s)}\}$ as follows. For each draw $s \geq 2$, we compared the sign of $\gamma_k^{(s)\top} \gamma_k^{(1)}$, where $\Gamma^{(1)}$ is the first post-warmup sample, and if the signs disagreed (ie the inner product was negative), we multiplied $\gamma_k^{(s)}$ by −1. The aligned $\Gamma^{(s)}$'s were used to construct the credible intervals of $\Gamma$. In Sections 3 and 4, we employed a burn-in of 700 steps, during which Stan optimizes the tuning parameters of the HMC sampler. After burn-in, we ran HMC for an additional 1300 steps to generate 1300 post-warmup samples. Convergence was assessed by examining traceplots of random parameter subsets.
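The sign-alignment rule above can be sketched as follows, assuming `Gamma_samples` stacks the post-warmup draws with shape (S, p, d); the names are ours.

```python
import numpy as np

def align_signs(Gamma_samples):
    """Flip column signs of each draw to match the first post-warmup draw."""
    ref = Gamma_samples[0]
    aligned = [ref]
    for G in Gamma_samples[1:]:
        flip = np.sign(np.einsum('pk,pk->k', G, ref))  # sign of <gamma_k^(s), gamma_k^(1)>
        flip[flip == 0] = 1.0                          # leave exactly orthogonal columns alone
        aligned.append(G * flip)
    return np.stack(aligned)
```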
Unlike ICA, where the order of the extracted components is relatively arbitrary, the components in (2.1) specified by $\Gamma = [\gamma_1, \ldots, \gamma_d]$ can be ranked based on the sample variance of the expected log-variance they explain across observations, $\mathrm{var}_i(\tilde{x}_i^\top \beta_k)$, where $\beta_k$ is the kth column of B; here we exclude the subject-level random effects $b_i$ to quantify only the covariate-associated heteroscedasticity. Specifically, we sort the d estimated components in decreasing order of the magnitude of the sample variance of the expected log-variance attributable to $x_i$.
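A sketch of this ranking rule follows, assuming `B_hat` is the posterior mean coefficient matrix and `X_tilde` the n × (q+1) covariate matrix with a leading column of ones (names are ours).

```python
import numpy as np

def rank_components(B_hat, X_tilde):
    """Order components by the sample variance, over subjects, of the
    covariate-driven expected log-variance (random effects excluded)."""
    eta = X_tilde @ B_hat                    # (n, d) expected log-variances
    return np.argsort(-np.var(eta, axis=0))  # component indices, decreasing order
```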
2.3.3 Determination of the number d of the components
We propose to use a selection criterion based on the Watanabe-Akaike Information Criterion (WAIC) (Watanabe 2010), which can be used to estimate the expected log predictive density. Given a fixed d, we compute the log pointwise predictive density (LPPD) of the dimension reduced model, penalized by the WAIC effective degrees of freedom (e.g. Gelman et al. (2014)). Specifically, we select the dimensionality d of the covariate-assisted outcome projection by comparing, in the projected outcome space, a model incorporating covariate-explained heteroscedasticity ($\eta_i = B^\top \tilde{x}_i + b_i$) with one without it ($\eta_i = \beta_0 + b_i$, where $\beta_0$ is the intercept). The expected deviance (scaled by −2) is estimated by

$\mathrm{WAIC}(d) = -2 \left\{ \sum_{i=1}^n \log\left( \frac{1}{S} \sum_{s=1}^S r_i(\Theta^{(s)}) \right) - \sum_{i=1}^n \mathrm{var}_{s}\left( \log r_i(\Theta^{(s)}) \right) \right\},$   (2.15)

where $r_i(\Theta) = \mathcal{L}_i(\Theta)/\mathcal{L}_i^0(\Theta)$, in which $\mathcal{L}_i^0$ denotes the subject-i likelihood without covariate-explained heteroscedasticity, ie $r_i$ is the ratio of the likelihoods of the two models with vs. without covariate-explained heteroscedasticity, computed using the MCMC posterior parameter samples $\{\Theta^{(s)}\}_{s=1}^S$. If the covariates are predictive of the covariances along all PDCs of rank d, then the corresponding expected log ratio will be large. However, for a too large rank d, the covariates may not predict the covariances in all posited directions $\Gamma$, leading to a smaller expected log ratio compared to that with the optimal projected outcome dimension. Considering the ratio is crucial for making this criterion comparable across different d's, and we select the d that minimizes the expected deviance (2.15). In Supplementary Materials S2, we demonstrate the validity of this criterion in selecting the correct number of covariate-relevant heteroscedasticity components.
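A sketch of the criterion (2.15) as reconstructed here, assuming `log_r` holds an (S, n) array of log likelihood-ratios $\log r_i(\Theta^{(s)})$ computed from the MCMC draws (names and shapes are ours):

```python
import numpy as np
from scipy.special import logsumexp

def waic_deviance(log_r):
    """-2 * (lppd - p_WAIC) from an (S, n) array of log ratios log r_i(theta^(s))."""
    S = log_r.shape[0]
    lppd = logsumexp(log_r, axis=0) - np.log(S)   # log of the posterior-mean ratio
    p_waic = np.var(log_r, axis=0, ddof=1)        # effective-parameter penalty
    return -2.0 * np.sum(lppd - p_waic)

# The rank d would be chosen to minimize waic_deviance over candidate values.
```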
3 Simulation illustration
3.1 Simulation setup
For each unit (subject) i, we simulate a set of outcome signals $z_{it} \in \mathbb{R}^p$ from a Gaussian distribution with mean zero and p × p unit-specific covariance $\Sigma_i$. We vary the number of subjects $n \in \{100, 200, 300, 400\}$, the number of within-subject time points $T \in \{10, 20, 30\}$, and the signal dimension $p \in \{10, 20\}$. We use model (2.3) to generate $\Sigma_i$, where the core SPD $\Lambda_i = \mathrm{diag}(\exp(\eta_{i1}), \exp(\eta_{i2}))$ with d = 2 is defined based on the subject-level linear predictors $\eta_i = (\eta_{i1}, \eta_{i2})^\top = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + b_i$ of dimension d = 2, where $\beta_0$ is the intercept vector, and $\beta_1$ and $\beta_2$ are the regression coefficients for the covariates $x_{i1}$ and $x_{i2}$. We generate the covariates $x_{i1}$ and $x_{i2}$ and the subject-specific random effects $b_i \sim N_2(0, \Sigma_b)$ to define $\eta_i$.

For each simulation run, we use the von Mises–Fisher distribution to randomly generate an orthonormal basis matrix $[\Gamma, \Gamma_0]$, whose noise subcomponent $\Gamma_0$ is further transformed by subject-specific orthonormal matrices, each randomly generated from the von Mises–Fisher distribution. Then, the "noise" covariance components $\Lambda_{0,i}$ are specified as diagonal matrices with randomly generated positive elements, whereas $(\Gamma, \Lambda_i)$ specify the "signal" components. For each simulation run, we compute the base covariance $\bar\Sigma$ that we use for the tangent-space parametrization of model (2.3) as the sample marginal covariance of the training sample.

To investigate the robustness of the method against model misspecification, we further consider the case where there are no common eigenvectors across subjects. We consider subject-level random perturbations of $\Gamma$ using subject-level rotation matrices with random angles, and use the perturbed, subject-specific projection matrices in place of the common $\Gamma$ in generating the responses in (2.1); we refer to these as "model misspecification" cases.
3.2 Evaluation metric
We run the simulation 50 times. For each simulation run, we compute, as evaluation metrics, the absolute cosine similarity between the estimated and true loading coefficient vectors $\gamma_k$ (k = 1, 2) (where a value close to 1 indicates proximity) and the root mean squared error (RMSE) for the regression coefficient vectors $\beta_k$ (k = 1, 2), as well as the RMSE for the elements of the random effect covariance matrix $\Sigma_b$, where estimates are taken to be posterior means. While we conduct the model estimation using the tangent space parameterization (2.7) with the base $\bar\Lambda$, the results are mapped to the original parametrization with B in (2.3). This approximately amounts to shifting the intercept vector by the diagonal elements of $\log \bar\Lambda$ (see Supplementary Materials S1). We report the estimation performance for the intercept $\beta_0$ by reporting its RMSE under the original parametrization with B. Additionally, to assess whether the constructed credible intervals provide reasonably correct coverage for the true values of the parameters, we evaluate the posterior credible intervals of the model parameters ($\gamma_k$, $\beta_k$, and $\Sigma_b$) with respect to the frequentist coverage proportion. Specifically, for each simulation run, we estimate the posterior distribution of the parameters and calculate the 95% posterior credible intervals for the parameters, and then evaluate how often the credible intervals contain the true parameter values. We used a random initialization of the Markov chains in our posterior sampling.
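For concreteness, the two evaluation metrics can be sketched as follows (function names are ours).

```python
import numpy as np

def abs_cosine(g_hat, g_true):
    """Absolute cosine similarity between estimated and true loading vectors."""
    return np.abs(g_hat @ g_true) / (np.linalg.norm(g_hat) * np.linalg.norm(g_true))

def rmse(est, truth):
    """Root mean squared error over the elements of a parameter block."""
    est, truth = np.asarray(est), np.asarray(truth)
    return np.sqrt(np.mean((est - truth) ** 2))
```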
3.3 Simulation results
In Fig. 1, as the sample sizes (n and T) increase, the estimation performance tends to improve overall. Particularly when the sample sizes are relatively small (e.g. n = 100), performance tends to depend on the magnitude of the covariate effects on the outcome projection components: performance for the parameters of the first component ($\gamma_1$ and $\beta_1$) tends to be slightly better than that for the second component ($\gamma_2$ and $\beta_2$), reflecting stronger covariate effects on the first projection component. The number of subjects (n) and time points (T) both influence performance; increasing T enhances estimation by providing more subject-level information for accurate estimates of the subject-specific random effects and their covariance $\Sigma_b$, and accordingly the population-level parameters B and $\Gamma$. The p = 10 cases reported in Supplementary Materials S3 show qualitatively similar results to the p = 20 cases.
Fig. 1.
The model parameter estimation performance for the p = 20 case, for the loading coefficient vectors $\gamma_k$ (k = 1, 2), the elements of the random effect covariance matrix $\Sigma_b$, the regression coefficients $\beta_k$ (k = 1, 2), and the intercept $\beta_0$, averaged across 50 simulation replications, with varying n and T.
In terms of coverage probability, the results in Table 1 for both the p = 10 and p = 20 cases indicate that the "actual" coverage probability is reasonably close to the "nominal" coverage probability of 0.95, particularly with larger sample sizes (e.g. n = 400, T = 30) for the regression coefficients $\beta_k$. Overall, the results in Table 1 suggest that the Bayesian credible intervals exhibit reasonable frequentist coverage, providing estimates of the parameter uncertainty that align with the desired coverage level. In Supplementary Materials S4, we further examine the model's performance under misspecification: 1) when excluding the random effect component $b_i$; and 2) when there are no common "signal" eigenvectors across subjects. Without the random effect, estimation performance remains comparable in terms of bias, but the 95% credible intervals tend to underestimate uncertainties, particularly for the regression coefficients $\beta_k$. The absence of common covariate-related eigenvectors introduces bias in estimating $\Gamma$, leading to lower coverage levels of the credible intervals than nominal. The average computation time (on a MacBook with an Apple M3 Max chip and 96 GB unified memory) was about 0.8 hours for obtaining 1300 posterior samples with n = 400 subjects, T = 30 time points, and p = 20.
Table 1.
The proportion of times that the 95% posterior credible intervals contain the true values of the projection loading vectors $\gamma_k$ (k = 1, 2), the regression coefficients $\beta_k$ (k = 1, 2), and the elements of $\Sigma_b$, averaged across 50 simulation replications, with varying n and T.ᵃ
|  |  | p = 10 |  |  |  |  | p = 20 |  |  |  |  |
| n | T | $\gamma_1$ | $\gamma_2$ | $\beta_1$ | $\beta_2$ | $\Sigma_b$ | $\gamma_1$ | $\gamma_2$ | $\beta_1$ | $\beta_2$ | $\Sigma_b$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 100 | 10 | 0.89 | 0.90 | 0.93 | 0.90 | 0.91 | 0.86 | 0.86 | 0.86 | 0.84 | 0.88 |
|  | 20 | 0.85 | 0.85 | 0.91 | 0.87 | 0.93 | 0.90 | 0.89 | 0.92 | 0.87 | 0.94 |
|  | 30 | 0.88 | 0.87 | 0.91 | 0.90 | 0.91 | 0.87 | 0.89 | 0.88 | 0.88 | 0.88 |
| 200 | 10 | 0.90 | 0.88 | 0.95 | 0.92 | 0.94 | 0.90 | 0.93 | 0.96 | 0.94 | 0.89 |
|  | 20 | 0.92 | 0.91 | 0.97 | 0.92 | 0.93 | 0.91 | 0.92 | 0.93 | 0.94 | 0.93 |
|  | 30 | 0.89 | 0.89 | 0.96 | 0.88 | 0.94 | 0.89 | 0.89 | 0.90 | 0.89 | 0.89 |
| 300 | 10 | 0.88 | 0.88 | 0.90 | 0.90 | 0.88 | 0.90 | 0.91 | 0.96 | 0.92 | 0.89 |
|  | 20 | 0.90 | 0.86 | 0.90 | 0.84 | 0.91 | 0.92 | 0.90 | 0.96 | 0.92 | 0.90 |
|  | 30 | 0.91 | 0.90 | 0.91 | 0.90 | 0.91 | 0.91 | 0.91 | 0.94 | 0.92 | 0.91 |
| 400 | 10 | 0.91 | 0.89 | 0.96 | 0.92 | 0.85 | 0.90 | 0.93 | 0.93 | 0.92 | 0.90 |
|  | 20 | 0.94 | 0.91 | 0.96 | 0.96 | 0.87 | 0.92 | 0.91 | 0.94 | 0.92 | 0.93 |
|  | 30 | 0.93 | 0.92 | 0.96 | 0.95 | 0.89 | 0.93 | 0.91 | 0.92 | 0.92 | 0.92 |
ᵃCoverage was computed for each entry, then averaged within components ($\gamma_k$ and $\beta_k$) and across the simulation replications (rounded to two significant digits).
4 Application
In this section, we apply the Bayesian CAP regression to data from the HCP. As in Seiler and Holmes (2017), we used the rs-fMRI data from 820 HCP subjects and examined the associations between rs-fMRI FC and sleep duration. Each subject underwent 4 complete 15-min sessions (with a repetition time of 720 ms, corresponding to 1200 time points per session), and each 15-min run of each subject's rs-fMRI data was preprocessed according to Smith et al. (2013). We focused on the first session, whose length is about a typical duration for rs-fMRI studies. We also applied the proposed method to the other three sessions to examine the sensitivity and reliability of the regression (see Supplementary Materials S6, where the covariate-related FC exhibits a high level of consistency across all 4 scanning sessions, with intra-cluster correlation coefficient values of 0.84, 0.72, 0.84 and 0.83 for the 4 identified network components in terms of the log-variance).
We used a data-driven parcellation based on spatial ICA with p = 15 components (ie using p = 15 data-driven "network nodes;" see Fig. 2 for their most relevant axial slices in MNI152 space) from the HCP PTN (Parcellation + Timeseries + Netmats) dataset, where each subject's rs-fMRI timeseries data were mapped onto the set of ICA maps (Filippini et al. 2009). We refer to Smith et al. (2013) for details about the preprocessing and the ICA time series computation. We conduct inference on the association between the FC over these IC network nodes (Smith et al. 2012) and sleep duration, gender, and their interaction.
Fig. 2.
Fifteen independent components (ICs) from spatial group-ICA constituting a data-driven parcellation with 15 components (“network nodes”), provided by the HCP PTN dataset, represented at the most relevant axial slices in MNI152 space. According to Seiler and Holmes (2017), these IC networks correspond to default network (Net15), cerebellum (Net9), visual areas (Net1, Net3, Net4, and Net8), cognition-language (Net2, Net5, Net10, and Net14), perception-somesthesis-pain (Net2, Net6, Net10, and Net14), sensorimotor (Net7 and Net11), executive control (Net12) and auditory (Net12 and Net13).
As in Seiler and Holmes (2017), we classified the subjects into two groups: a group of 489 conventional sleepers (average sleep duration between 7 and 9 hours each night) and a group of 241 short sleepers (average of 6 hours or less each night). This yielded a total of 730 participants to compare FC (over the IC networks in Fig. 2) between short and conventional sleepers. Since the time series are temporally correlated, we inferred the equivalent sample size of independent samples. We computed the effective sample size (ESS) defined by Kass et al. (1998), based on the autocorrelations of $z_{ijt}$, the data at time t of the jth network node for subject i, following a conservative approach taking the minimum over all p components and n subjects as the overall estimator. Based on the estimated ESS, we performed thinning of the observed timeseries data, subsampling time points for each subject accordingly. The resulting outcome data were then mean-removed per subject, and we focused on the association between their covariances and the covariates.
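A sketch of the ESS-based thinning under the Kass et al. (1998) definition $\mathrm{ESS} = T/(1 + 2\sum_k \rho_k)$, with $\rho_k$ the lag-k autocorrelation; the truncation lag `max_lag` and the function names are our assumptions.

```python
import numpy as np

def ess(x, max_lag=50):
    """ESS = T / (1 + 2 * sum_k rho_k), with empirical autocorrelations rho_k."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    acf = np.correlate(x, x, mode='full')[x.size - 1:] / (x.var() * x.size)
    return x.size / (1.0 + 2.0 * np.sum(acf[1:max_lag + 1]))

# Conservative overall estimate: minimum over all p nodes and n subjects, e.g.
#   T_eff = min(ess(Z[i, :, j]) for i in range(n) for j in range(p))
# followed by subsampling each series to about int(T_eff) equally spaced points.
```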
We used the WAIC criterion (2.15) to identify d = 4 projection components. The models' WAIC values over the range of d = 1 to 6 were −227.9, −397.6, −520.4, −602.7, −573.4, and −358.4, with the minimum attained at d = 4. The parameters ($\Gamma$, B and $\Sigma_b$) with d = 4 are summarized by their posterior means and 95% credible intervals, reported in Supplementary Materials S4. The expected value of the log Deviation from Diagonality (DfD) was 0.60, suggesting a moderate departure from the diagonality assumed in (2.2), but the deviation is not overly pronounced.
Under model (2.1), for a linear contrast vector c, we can define the log covariance "contrast" map due to a c-change in the covariates $\tilde{x}_i$, which corresponds to $\Gamma\, \mathrm{diag}(B^\top c)\, \Gamma^\top$ (see Supplementary Materials S7), where B is the regression coefficient matrix in (2.2). Specifically, the diagonal elements of this contrast matrix can be extracted and exponentiated. This represents the response signals' variance ratio (VR) corresponding to a c-change in the covariates. For the four contrasts derived from the SleepDuration × Gender interaction, the left two column panels in Fig. 3 present the response signals' variance ratios, contrasting (i) short vs. conventional sleepers among males; (ii) short vs. conventional sleepers among females; (iii) males vs. females among short sleepers; and (iv) males vs. females among conventional sleepers.
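A sketch of the VR computation from posterior draws, assuming draws of B and $\Gamma$ are stacked as arrays of shapes (S, q+1, d) and (S, p, d); the names and shapes are our assumptions.

```python
import numpy as np

def variance_ratios(B_draws, Gamma_draws, c):
    """Per-node VRs exp{diag(Gamma diag(B'c) Gamma')} over posterior draws."""
    out = []
    for B, G in zip(B_draws, Gamma_draws):
        contrast = G @ np.diag(B.T @ c) @ G.T    # log covariance contrast map
        out.append(np.exp(np.diag(contrast)))    # exponentiated diagonal = VR
    return np.stack(out)                         # summarize via mean and 95% CI
```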
Fig. 3.
The response signals’ variance ratio (posterior means and 95% credible intervals), corresponding to the four contrasts formed by the Gender-by-SleepDuration interaction. The 95% credible intervals that do not include the variance ratio of 1 are highlighted in red. The sets (“parcel sets”) of network nodes whose signals’ variances are expected to change in the same impact directions due to the corresponding contrasts are indicated in the last column panels, for the Short vs. Conventional sleeper contrasts in the top row, and for the Male vs. Female contrasts in the bottom row.
In Fig. 3, the nodes, or "parcels," whose VR values were identified (based on 95% credible intervals) to be significantly different from 1 all had VR > 1. The third column panels of Fig. 3 indicate the nodes whose signals' variances are expected to change in the same direction, for the Short vs. Conventional sleeper contrasts in the top row panel, and for the Male vs. Female contrasts in the bottom row panel.
For each contrast, we can infer the contrast's impact on the connectivity via 95% credible intervals on the off-diagonal (connectivity) elements of the contrast matrix $\Gamma\, \mathrm{diag}(B^\top c)\, \Gamma^\top$. The first column panels in Fig. 4 display the covariance elements identified to be significant, whereas the second column panels display the posterior means of the matrix elements, where each row panel corresponds to a contrast in the covariates. The statistical significance maps in Fig. 4 indicate that, overall, there are more substantial connectivity differences between Short and Conventional sleepers (the first two row panels) than between Males and Females (the last two row panels), and the Short vs. Conventional sleeper differences were slightly more pronounced among Males (the first row panel) than among Females (the second row panel). While there were several identified connectivity differences between Males and Females among Short sleepers, there were no statistically significant Male vs. Female differences among Conventional sleepers.
Fig. 4.
The statistical significance map (the left column panels) and the posterior mean (the right column panels) of the log covariance contrast $\Gamma\, \mathrm{diag}(B^\top c)\, \Gamma^\top$ for each of the four covariate contrasts c derived from the SleepDuration × Gender interaction.
One conventional method for analyzing group ICA data involves initially computing subject-level Pearson correlations between the ICs, which are then Fisher z-transformed. This process is performed on the 105 pairs of correlations (calculated from the 15 ICs), while we conduct an element-wise log transformation on the p = 15 diagonal (variance) elements. A total of 120 element-wise linear regressions were then conducted on SleepDuration, Gender, and their interaction, and P-values were corrected for multiplicity using the Benjamini–Hochberg (BH) procedure (Benjamini and Hochberg 1995) to control the false discovery rate (FDR) at 0.05. The patterns of the connectivity differences, implied by each c-contrast, from this mass-univariate approach are presented in Supplementary Material S10, and were similar to the results from Bayesian CAP in Fig. 4. However, compared to the results from Bayesian CAP, far fewer statistically significant elements (13 vs. 77, out of 480 elements) were identified.
While the CAP regression formulation of Zhao et al. (2021b) also alleviates the multiplicity issue and thus can improve statistical power, its inference is limited to the association between covariates and the projected outcome components, making it challenging to interpret the covariates' impacts on the measured ROIs directly. Therefore, the approach is not directly comparable with the proposed approach here. In Supplementary Materials S8, we display the similarity (taking values between −1 and 1, with 0 indicating orthogonality) of the estimated projection directions from CAP (Zhao et al. 2021b) (in their first four leading components) and those from the proposed Bayesian latent factor model, which shows a positive association for each projection direction, with similarity of at least 0.4. We also report the CAP regression coefficients (with 95% bootstrap confidence intervals) for each estimated projected outcome component.
According to the meta-analysis in Smith et al. (2009), the identified Parcel Set contrasting the Short vs. Conventional sleepers in Fig. 3 mainly corresponds to visual areas (network nodes N1, N3, N4, N8), auditory areas (N12, N13) and sensorimotor areas (N11). Curtis et al. (2016) found that self-reported sleep duration primarily co-varied with FC in auditory, visual, and sensorimotor cortices. Specifically, shorter sleep durations were associated with increased FC among auditory, visual, and sensorimotor cortices (these regions roughly correspond to the network nodes N1, N3, N4, N8, N11, N12, and N13), and decreased FC between these regions and the cerebellum (N9). These positive and negative associations found in Curtis et al. (2016) are consistent with the results in the contrast maps presented in Fig. 4 that contrast Short vs. Conventional sleepers.
5 Discussion
Extending the frequentist approach developed in Zhao et al. (2021b) with the probabilistic model (2.1), coupled with a geometric formulation of the dimension-reduced covariance objects in (2.3), the proposed Bayesian method provides a framework to conduct inference on all relevant parameters simultaneously, producing more interpretable results regarding how the covariates' effects are expressed in the ROIs. Furthermore, the outcome dimension reduction approach avoids the need to work with subject-specific full p-by-p sample covariance matrices, which can suffer from estimation instability when the number of time points (volumes) is not large (which is typically the case for fMRI signals). Generally, the CAP formulation of Zhao et al. (2021b) allows for a more targeted and efficient analysis by identifying the specific components of the outcome data relevant to the association between covariates and FC.
Although the computational burden and complexity associated with working with the full p-by-p sample covariance matrix can be significantly alleviated by reducing the dimensionality of the outcome data, the method is generally not suitable for very high-dimensional outcome data, such as voxel-level data, and is better suited for intermediate spaces, such as those produced by ICA or an anatomical parcellation. Overfitting might occur due to the large number of parameters in the estimation of the outcome projection matrix $\Gamma$. Future work will apply prior distributions on the dimension reducing matrix $\Gamma$, as well as on the covariate effect parameters B, that promote sparsity, for improved estimation and interpretation in higher dimensional spaces.
As in Zhao et al. (2021a,b, 2024), the assumption that we make in conducting the inference is that of partially common eigenvectors of the covariance structure (Wang et al. 2021), in which the covariance is decomposed into shared and unique components, where the shared components capture the information related to the covariates. Future endeavors will explore strategies to mitigate concerns related to model misspecification by addressing heterogeneity in these shared components across subjects. We have conducted preliminary thinning of the observed multivariate time series to achieve an effective sample size, involving subsampling to eliminate temporal dependencies. Subsequent investigations will refine this approach to delve into individual differences in dynamic FC (e.g. Zhang et al. 2020; Bahrami et al. 2022), incorporating dimension reduction models that account for both between-subject heterogeneity in spatial patterns and within-subject temporal correlation through state-space modeling of latent factors. This will facilitate a deeper exploration of associations between covariates and FC.
A main challenge in modeling covariance matrices is the positive definiteness constraint. Unlike a mean vector, where a link function can act element-wise, positive definiteness is a constraint on all the entries of a covariance matrix jointly (Pourahmadi 2011). One approach is to transform the problem into an unconstrained estimation problem through a transformation such as the Cholesky decomposition, although this requires natural ordering information. An alternative way is to consider a more fundamental geometric formulation that views individual covariances as elements of a (nonlinear) manifold. A more global transformation (compared to an entry-wise transformation), such as the matrix log-transformation, then maps individual covariances to a tangent space, allowing for unconstrained operations. However, a global log-transformation poses interpretability challenges, as it generally alters the covariates' impact directions with respect to the measured ROIs. Our geometry-based CAP approach focuses on identifying relevant eigenvectors, while simultaneously estimating eigenvalue-by-covariate associations through a linear model in a tangent space. By assuming and identifying relevant eigenvectors that align with the covariates' impact directions, the global log transformation maintains their orientation regarding the covariates' effects; thus, the estimated pairwise covariance contrasts preserve their interpretability as covariate-induced pairwise connectivity differences.
Yet another important challenge is high dimensionality, as the number of covariance elements increases quadratically in the response variable's dimension. Generally, the CAP regression of Zhao et al. (2021b), and its extension developed here, is useful when there is no need to model the generation of the entire observations, and one is only interested in isolating a potentially low-dimensional representation of the data in which they exhibit certain desired characteristics, such as covariate-explained heteroscedasticity maximizing the associated model likelihood. Such supervised dimension reduction can generally mitigate the curse of dimensionality in covariance modeling.
Acknowledgements
The author is grateful to Dr Xiaomeng Ju and Dr Thaddeus Tarpey for helpful discussions and to the three reviewers of this manuscript for their constructive reviews.
Supplementary material
Supplementary material is available at Biostatistics Journal online.
Funding
This work was supported by National Institutes of Health (NIH) grant 5 R01 MH099003. Data were provided by the Human Connectome Project, WU-Minn Consortium (Principal Investigators: David Van Essen and Kamil Ugurbil; 1U54MH091657) funded by the 16 NIH Institutes and Centers that support the NIH Blueprint for Neuroscience Research; and by the McDonnell Center for Systems Neuroscience at Washington University.
Conflict of interest statement
None declared.
Data availability
The code used in this paper is accessible at the following GitHub repository: https://github.com/syhyunpark/bcap
References
- Bahrami M, Laurienti PJ, Shappell HM, Dagenbach D, Simpson SL. A mixed-modeling framework for whole-brain dynamic network analysis. Network Neurosci. 2022:6(2):591–613. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Benjamini Y, Hochberg Y. Controlling the false discovery rate: A practical and powerful approach to multiple testing. J R Stat Soc Ser B (Methodological). 1995:57(1):289–300. [Google Scholar]
- Boik RJ. Spectral models for covariance matrices. Biometrika. 2002:89(1):159–182. [Google Scholar]
- Cai TT, Li H, Liu W, Xie J. Joint estimation of multiple high-dimensional precision matrices. Stat Sin. 2016:26(2):445–464. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Calhoun V, Liu J, Adali T. A review of group ICA for fMRI data and ICA for joint inference of imaging, genetic, and ERP data. Neuroimage. 2009:45(1):S163–S172. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cheng J, Levina E, Wang P, Zhu J. A sparse ising model with covariates. Biometrics. 2014:70(4):943–953. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chikuse Y. The matrix angular central gaussian distribution. J Multivar Anal. 1990:33(2):265–274. [Google Scholar]
- Crainiceanu CM, Caffo BS, Luo S, Zipunnikov VM, Punjabi NM. Population value decomposition, a framework for the analysis of image populations. J Am Stat Assoc. 2011:106(495):775–790. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Curtis BJ, Williams PG, Jones CR, Anderson JS. Sleep duration and resting fMRI functional connectivity: examination of short sleepers with and without perceived daytime dysfunction. Brain Behav. 2016:6(12):e00576. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dadi K, Rahim M, Abraham A, Chyzhyk D, Milham M, Thirion B, Varoquaux G. Benchmarking functional connectome-based predictive models for resting-state fMRI. Neuroimage. 2019:192(15):115–134. [DOI] [PubMed] [Google Scholar]
- Dai M, Zhang Z, Srivastava A. Analyzing dynamical functional connectivity as trajectories on space of covariance matrices. IEEE Trans Med Imaging. 2020:39(3):611–620. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Danaher P, Wang P, Witten DM. The joint graphical lasso for inverse covariance estimation across multiple classes. J R Stat Soc Ser B. 2014:76:(2)373–397. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Davoudi A, Ghidary SS, Sadatnejad K. Dimensionality reduction based on distance preservation to local mean for symmetric positive definite matrices and its application in brain–computer interfaces. J Neural Eng. 2017:14(3):036019. [DOI] [PubMed] [Google Scholar]
- Durante D, Dunson DB. Bayesian inference and testing of group differences in brain networks. Bayesian Anal. 2018:13(1):29–58. [Google Scholar]
- Eickhoff SB, Yeo BTT, Genon S. Imaging-based parcellations of the human brain. Nat Rev Neurosci. 2018:19(11):672–686. [DOI] [PubMed] [Google Scholar]
- Engle RF, Kroner KF. Multivariate simultaneous generalized ARCH. Econ Theory. 1995:11(1):122–150. [Google Scholar]
- Filippini N, MacIntosh BJ, Hough MG, Goodwin GM, Frisoni GB, Smith SM, Matthews PM, Beckmann CF, Mackay CE. Distinct patterns of brain activity in young carriers of the apoe-e4 allele. Proc Natl Acad Sci USA. 2009:106(17):7209–7214. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Flury BN. Common principal components in k groups. J Am Stat Assoc. 1984:79(388):892–898. [Google Scholar]
- Flury BN. Asymptotic theory for common principal component analysis. Ann Stat. 1986:14(2):418–430. [Google Scholar]
- Fong PW, Li WK, An HZ. A simple multivariate ARCH model specified by random coefficients. Comput Stat Data Anal. 2006:51(3):1779–1802. [Google Scholar]
- Fornito A, Bullmore ET. Connectomic intermediate phenotypes for psychiatric disorders. Front Psychiatry. 2012:3:32. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fornito A, Zalesky A, Breakspear M. Graph analysis of the human connectome: promise, progress, and pitfalls. Neuroimage. 2013:15(80):426–444. [DOI] [PubMed] [Google Scholar]
- Fox EB, Dunson DB. Bayesian nonparametric covariance regression. J Mach Learn Res. 2015:16(77):2501–2542. [Google Scholar]
- Franks AM. Reducing subspace models for large-scale covariance regression. Biometrics. 2022:78(4):1604–1613. [DOI] [PubMed] [Google Scholar]
- Franks AM, Hoff P. Shared subspace models for multi-group covariance estimation. J Mach Learn Res. 2019:20(171):1–37. [Google Scholar]
- Friston K. Functional and effective connectivity. Brain Connect. 2011:1(1):13–36. [DOI] [PubMed] [Google Scholar]
- Gao W, Ma Z, Xiong C, Gao T. Dimensionality reduction of SPD data based on Riemannian manifold tangent spaces and local affinity. Appl Intell. 2023:53:1887–1911. [Google Scholar]
- Gelman A. Prior distributions for variance parameters in hierarchical models. Bayesian Anal. 2006:1(3):515–533. [Google Scholar]
- Gelman A, Hwang J, Vehtari A. Understanding predictive information criteria for bayesian models. Stat Comput. 2014:24:997–1016. [Google Scholar]
- Grillon ML, Oppenheim C, Varoquaux G, Charbonneau F, Devauchelle AD, Krebs MO, Bayle F, Thirion B, Huron C. Hyperfrontality and hypoconnectivity during refreshing in schizophrenia. Psychiatry Res. 2013:211(3):226–233. [DOI] [PubMed] [Google Scholar]
- Guo J., Levina E., Michailidis G., Zhu J. Joint estimation of multiple graphical models. Biometrika. 2011:98:1–15. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ha M, Stingo F, Baladandayuthapani V. Bayesian structure learning in multi-layered genomic networks. J Am Stat Assoc. 2021:116(534):605–618. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Harandi M, Salzmann M, Hartley R. Dimensionality reduction on SPD manifolds: the emergence of geometry-aware methods. IEEE Trans Pattern Anal Mach Intell. 2017:40(1):48–62. [DOI] [PubMed] [Google Scholar]
- Higham NJ. Computing the polar decomposition—with applications. SIAM J Sci Stat Comput. 1986:7(4):1059–1417. [Google Scholar]
- Hinne M, Ambrogioni L, Janssen RJ, Heskes T, van Gerven MA. Structurally-informed bayesian functional connectivity analysis. Neuroimage. 2014:1(86):294–305. [DOI] [PubMed] [Google Scholar]
- Hoff P. A hierarchical eigenmodel for pooled covariance estimation. J R Stat Soc Ser B (Stat Methodol). 2009:71(5):971–992. [Google Scholar]
- Hoff P, Niu X. A covariance regression model. Stat Sin. 2012:22(2):729–753. [Google Scholar]
- Hutchison RM, Womelsdorf T, Allen EA, Bandettini PA, Calhoun VD, Corbetta M, Della PS, Duyn JH, Glover GH, Gonzalez-Castillo J, Handwerker DA, Keilholz S, Kiviniemi V, Leopold DA, de Pasquale F, Sporns O, Walter M. et al. Dynamic functional connectivity: promise, issues, and interpretations. Neuroimage. 2013:80(15):360–378. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jauch M, Hoff PD, Dunson DB. Monte Carlo simulation on the Stiefel manifold via polar expansion. J Comput Graph Stat. 2021:30(3):622–631. [Google Scholar]
- Jupp PE, Mardia KV.. Directional Statistics. London: John Wiley & Sons, 1999. [Google Scholar]
- Kass RE, Carlin BP, Gelman A, Neal R.. Markov chain monte carlo in practice: a roundtable discussion. Am Stat. 1998:52(2):93–100. [Google Scholar]
- Kolar M, Parikh AP, Xing EP.. On sparse nonparametric conditional covariance selection. In: ICML-10, Madison, WI: Omnipress, 2010, 559–566. [Google Scholar]
- Leday GG, de Gunst GB, Kpogbezan MC, van der Vaart AW, van Wieringen WN, van de Wiel MA. Gene network reconstruction using global-local shrinkage priors. Ann Appl Stat. 2017:11(1):41–68. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lee KH, Xue L. Nonparametric finite mixture of gaussian graphical models. Technometrics. 2018:60(4): 511–521. [Google Scholar]
- Lewandowski D, Kurowicka D, Joe H.. Generating random correlation matrices based on vines and extended onion method. J Multivar Anal. 2009:100(9):1989–2001. [Google Scholar]
- Li B, Solea E. A nonparametric graphical model for functional data with application to brain networks based on FMRI. J Am Stat Assoc. 2018:113(524):1637–1655. [Google Scholar]
- Li L, Zhang X. Parsimonious tensor response regression. J Am Stat Assoc. 2017:112(519):1131–1146. [Google Scholar]
- Li Y, Lu R. Locality preserving projection on SPD matrix lie group: algorithm and analysis. Sci China Inf Sci. 2018:61:092104. [Google Scholar]
- Lin Z, Wang T, Yang C, Zhao H. On joint estimation of gaussian graphical models for spatial and temporal data. Biometrics. 2017:73(3):769–779. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lindquist M. The statistical analysis of fMRI data. Stat Sci. 2008:23(4):439–464. [Google Scholar]
- Liu H, Chen X, Wasserman L, Lafferty J.. Graph-valued regression. In: Advances in Neural Information Processing Systems 23 (NIPS 2010), Vancouver, British Columbia, Canada. Curran Associates, Inc.; 2010, 1423–1431. [Google Scholar]
- Lock EF. Tensor-on-tensor regression. J Comput Graph Stat. 2018:27(3):638–647. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Meinshausen N, Buhlmann P.. High-dimensional graphs and variable selection with the lasso. Ann Stat. 2006:34(3):1436–1462. [Google Scholar]
- Monti RP, Hellyer P, Sharp D, Leech R, Anagnostopoulos C, Montana G. Estimating time-varying brain connectivity networks from functional mri time series. Neuroimage. 2014:103:427–443. [DOI] [PubMed] [Google Scholar]
- Narayan M, Allen GI, Tomson S.. 2015. Two sample inference for populations of graphical models with applications to functional connectivity, arXiv, arXiv:1502.03853, preprint: not peer reviewed.
- Neal RM. MCMC Using Hamiltonian Dynamics. Chapter 5, Boca Raton: Chapman and Hall-CRC Press; 2011. [Google Scholar]
- Ng B, Varoquaux G, Poline J, Greicius M, Thirion B. Transport on Riemannian manifold for connectivity-based brain decoding. IEEE Trans Med Imaging. 2016:35(1):208–216. [DOI] [PubMed] [Google Scholar]
- Ni Y, Stingo FC, Baladandayuthapani V. Bayesian graphical regression. J Am Stat Assoc. 2019:114(525):184–197. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Peng J, Wang P, Zhou N, Zhu J. Partial correlation estimation by joint sparse regression models. J Am Stat Assoc. 2009:104(486):735–746. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pennec X, Fillard P, Ayache N. A Riemannian framework for tensor computing. Int J Comput Vis. 2006:66(1):41–66. [Google Scholar]
- Pervaiz U, Vidaurre D, Woolrich MW, Smith SM. Optimising network modelling methods for fMRI. Neuroimage. 2020:211:116604. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Peterson CB, Stingo FC, Vannucci M. Bayesian inference of multiple gaussian graphical models. J Am Stat Assoc. 2015:110(509):159–174. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Polson NG, Scott JG. On the half-cauchy prior for a global scale parameter. Bayesian Anal. 2012:7(4):887–902. [Google Scholar]
- Pourahmadi M. Covariance estimation: the GLM and regularization perspectives. Stat Sci. 2011:26(3): 369–387. [Google Scholar]
- Pourahmadi M, Daniels MJ, Park T. Simultaneous modelling of the Cholesky decomposition of several covariance matrices. J Multivar Anal. 2007:98(3):568–587. [Google Scholar]
- Saegusa T, Shojaie A. Joint estimation of precision matrices in heterogeneous populations. Electronic J Stat. 2016:10(1):1341–1392. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Schwartzman A. Lognormal distributions and geometric averages of symmetric positive definite matrices. Int Stat Rev. 2016:84(3):456–486. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Seiler C, Holmes S. Multivariate heteroscedasticity models for functional brain connectivity. Front Neurosci. 2017:11:696. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Smith SM, Fox PT, Miller KL, Glahn D, Fox PM, Mackay CE, Filippini N, Watkins KE, Toro R, Larid AR. et al. Correspondence of the brain’s functional architecture during activation and rest. Proc Natl Acad Sci USA. 2009:106(31):13040–13045. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Smith SM, Miller KL, Moeller S, Xu J, Auerbach EJ, Woolrich MW, Beckmann CF, Jenkinson M, Andersson J, Glasser MF, Van Essen DC, Feinberg DA, Yacoub ES. et al. Temporally-independent functional modes of spontaneous brain activity. Proc Natl Acad Sci USA. 2012:109(8): 3131–3136. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Smith SM, Vidaurre D, Beckmann CF, Glasser MF, Jenkinson M, Miller KL, Nichols TE, Robinson EC, Salimi-Khorshidi G, Woolrich MW, Barch DM, Ugurbil K. et al. Functional connectomics from resting-state fmri. Trends Cognit Sci. 2013:17(12):666–682. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Stan Development Team. Stan modeling language users guide and reference manual, 2.35. 2023. https://mc-stan.org.
- Sun WW, Li L. Store: sparse tensor response regression and neuroimaging analysis. J Mach Learn Res. 2017: 18(1):4908–4944. [Google Scholar]
- Tan LSL, Jasra A, De Iorio M, Ebbels TMD. Bayesian inference for multiple gaussian graphical models with application to metabolic association networks. Ann Appl Stat. 2017:11(4):2222–2251. [Google Scholar]
- van der Heuvel M, Hulshoff Pol H.. Exploring the brain network: a review on resting-state fMRI functional connectivity. Neuropsychopharmacol Rep. 2010:20(8):519–534. [DOI] [PubMed] [Google Scholar]
- Van Essen D, Smith S, Barch D, Behrens T, Yacoub E, Ugurbil K. et al. The WU-Minn human connectome project: an overview. Neuroimage. 2013:80:62–79. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Varoquaux G, Baronnet F, Kleinschmidt A, Fillard P, Thirion B. Detection of brain functional-connectivity difference in post-stroke patients using group-level covariance modeling. Med Image Comput Comput Assist Interv. 2010:13(1):200–208. [DOI] [PubMed] [Google Scholar]
- Wang B, Luo X, Zhao Y, Caffo B. Semiparametric partial common principal component analysis for covariance matrices. Biometrics. 2021:77(4):1175–1186. [DOI] [PubMed] [Google Scholar]
- Wang Z, Kaseb AO, Amin HM, Hassan MM, Wang W, Morris JS. Bayesian edge regression in undirected graphical models to characterize interpatient heterogeneity in cancer. J Am Stat Assoc. 2022:117(538): 533–546. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Watanabe S. Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. J Mach Learn Res. 2010:11(116):3571–3594. [Google Scholar]
- Whittaker J. Graphical Models in Applied Multivariate Statistics. Wiley Series in Probability and Mathematical Statistics, Chichester: John Wiley and Sons, 1990. [Google Scholar]
- Woodward ND, Rogers B, Heckers S. Functional resting-state networks are differentially affected in schizophrenia. Schizophrenia Res. 2011:130(1–3):86–93. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xia Y, Cai T, Cai TT. Testing differential networks with applications to the detection of gene-gene interactions. Biometrika. 2015:102(2):247–266. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xia Y, Cai T, Cai TT. Multiple testing of submatrices of a precision matrix with applications to identification of between pathway interactions. J Am Stat Assoc. 2018:113(521):328–339. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xia Y, Li L. Hypothesis testing of matrix graph model with application to brain connectivity analysis. Biometrics. 2017:73(3):780–791. [DOI] [PubMed] [Google Scholar]
- Xie X, Yu ZL, Lu H, Gu Z, Li Y. Motor imagery classification based on bilinear sub-manifold learning of symmetric positive-definite matrices. IEEE Trans Neural Syst Rehabil Eng. 2017:25(6):504–516. [DOI] [PubMed] [Google Scholar]
- Zhang J, Li Y. High-dimensional gaussian graphical regression models with covariates. J Am Stat Assoc. 2023;118(543):2088–2100. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhang J, Wei SW, Li L. Mixed-effect time-varying network model and application in brain connectivity analysis. J Am Stat Assoc. 2020:115(532):2022–2036. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhao Y, Caffo BS, Luo X. Principal regression for high dimensional covariance matrices. Electronic J Stat. 2021a:15(2):4192–4235. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhao Y, Caffo BS, Luo X. Longitudinal regression of covariance matrix outcomes. Biostatistics. 2024;25(2):385–401. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhao Y, Wang B, Mostofsky SH, Caffo BS, Luo X. Covariate assisted principal regression for covariance matrix outcomes. Biostatistics. 2021b:22(3):629–645. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zou T, Lan W, Wang H, Tsai C-L.. Covariance regression analysis. J Am Stat Assoc. 2017:112(517):266–281. [Google Scholar]