Bootstrapping GEE models for fMRI regional connectivity

Gina M D’Angelo; Nicole A Lazar; Gongfu Zhou; William F Eddy; John C Morris; Yvette I Sheline

doi:10.1016/j.neuroimage.2012.08.036

Author manuscript; available in PMC: 2013 Dec 1.

Published in final edited form as: Neuroimage. 2012 Aug 18;63(4):1890–1900. doi: 10.1016/j.neuroimage.2012.08.036

PMCID: PMC3491908 NIHMSID: NIHMS401900 PMID: 22906513

Abstract

An Alzheimer’s fMRI study has motivated us to evaluate inter-regional correlations during rest between groups. We apply generalized estimating equation (GEE) models to test for differences in regional correlations across groups. Both the GEE marginal model and GEE transition model are evaluated and compared to the standard pooling Fisher-z approach using simulation studies. Standard errors of all methods are estimated both theoretically (model-based) and empirically (bootstrap). Of all the methods, we find that the transition models have the best statistical properties. Overall, the model-based standard errors and bootstrap standard errors perform about the same. We also demonstrate the methods with a functional connectivity study in a healthy cognitively normal population of ApoE4+ participants and ApoE4− participants who are recruited from the Adult Children’s Study conducted at the Washington University Knight Alzheimer’s Disease Research Center.

Keywords: resting-state fMRI, time-series, temporal dependence, brain regional correlations, functional connectivity

1 Introduction

Functional magnetic resonance imaging (fMRI) is a neuroimaging approach that facilitates the understanding of how brain regions are functionally activated and related. fMRI data are measured with the blood oxygen level dependent (BOLD) contrast effect, which is the ratio of oxygenated to deoxygenated blood (Huettel et al., 2008). Images of the working brain are taken every few seconds, resulting in a series of temporally correlated scans. Hence, the fMRI scans are typically analyzed as statistical time-series data (Lazar, 2008; Friston and Buchel, 2004). It was shown early on (Worsley and Friston, 1995) that if this dependence structure is not taken into account, inference will be adversely affected, as it is difficult to distinguish between true signal and artifacts that arise from temporal correlation. In task-based fMRI studies it is now common to account for temporal correlation via a general linear model (GLM) with correlated errors for each subject, followed by a group summary model, also known as the “mixed-effects analysis” (Friston et al., 2005). Although there has been some progress in modeling the temporal dependence of task-based fMRI studies, few methods have been developed for groupwise resting-state functional connectivity studies.

A recent area of interest in the analysis of fMRI data is resting-state functional connectivity. Functional connectivity studies typically focus on understanding how regions are correlated over time during a resting state (i.e. with no stimulus or task). One goal of such studies is to evaluate whether correlations between regions differ across groups of subjects. A common approach for comparing connectivity patterns across groups is to compare the Fisher-z transformed correlations between regions (Fox et al., 2009); in the rest of this paper we refer to this approach as the “standard pooling Fisher-z method”. A notable aspect of the standard pooling Fisher-z approach is that the temporal data are treated as though they were independent. Just as in the simpler case of analyzing responsivity to a task or stimulus, here too ignoring the dependence structure of the time series has an adverse effect on inference. Specifically, regional connectivity can be inflated and the estimates will not be consistent, leading to incorrect comparisons (D’Angelo et al., 2011). Not much progress has been made to date on modeling this temporal correlation when determining resting-state functional connectivity group differences. In particular, for resting-state connectivity studies, the temporal correlation and lag, i.e. dependence of the value on its previous values, have yet to be handled with a modeling approach while simultaneously entering all subjects’ data into a single model at the first stage. Therefore, we suggest a two-stage approach that uses generalized estimating equations (GEEs) (Diggle et al., 1994; Liang and Zeger, 1986; Zeger et al., 1988) to handle the temporal dependence through a modeling strategy and then uses residuals from that model in the calculation of correlations.

The GEE method is ideally suited for fMRI data. GEEs have been developed to handle correlated response outcome data, such as a repeated measures response, cluster data, and correlated bivariate responses. Although the GEE approach is somewhat similar to a GLM with correlated error, one advantage to GEEs is that they require fewer distributional assumptions of the dependent variable than are specified with the GLM. Another advantage to GEEs over the GLM mixed effects analysis is that all subjects may be put into the model simultaneously at the first stage, while accounting for both lag and temporal dependence. Various model-free approaches, including principal component analysis (Bullmore et al., 1996) and independent component analysis (Beckmann et al., 2005; Calhoun et al., 2001), have been applied to resting-state functional connectivity data. However, these methods do not address the lag or explicitly specify the temporal correlation of resting-state data, and hence are not appropriate for our goal of comparing group regional correlations.

The focus of this paper is on groupwise comparison of functional connectivity using selected regions obtained from a seed-based analysis. We propose a two-stage approach with a GEE to estimate brain regional associations and assess between group differences while accounting for the temporal dependence in the individual time-series (Diggle et al., 1994). We evaluate both GEE marginal and GEE transition models. Marginal models are a population-average approach where we estimate the marginal expectation of the response. Transition models estimate the expectation of the current value conditional on the previous values. We also are interested in making inferences about functional connectivity differences between groups. To make inferences of group connectivity differences, we propose using both model-based standard errors (SEs) and bootstrap SEs to estimate p-values and to calculate confidence intervals. We investigate the properties of these GEE models and compare them to the standard pooling Fisher-z method using simulation studies. Furthermore, we compare the model-based SE estimates to the bootstrap SE estimates to assess their statistical properties. The methods are demonstrated with a functional connectivity study in a healthy cognitively normal population of ApoE4+ participants and ApoE4− participants from the Adult Children’s Study (ACS).

2 Data and functional connectivity

Our data consist of a study cohort of 100 healthy cognitively normal participants with no brain amyloid deposition (Pittsburgh Compound B negative) recruited from the Adult Children’s Study (ACS) (Sheline et al., 2010a). The ACS is conducted at the Washington University Knight Alzheimer’s Disease Research Center. The sample consists of cognitively normal participants who are all CDR 0 at baseline, where CDR is the Clinical Dementia Rating (Morris, 1993). Sheline et al. (2010a) define PIB- (Pittsburgh Compound B negative) as having a PIB value less than .18, the usual PIB threshold in our studies. Participants are classified as being either ApoE4+ (n=38) or ApoE4− (n=62), where ApoE4+ status is defined as having at least one ε4 allele and ApoE4− is defined as having no ε4 allele. ApoE4 is an important genetic marker of Alzheimer’s disease, and it is the allele of the apolipoprotein E (APOE) gene located on chromosome 19q13.

The preprocessing steps used are the standard steps described in Fox et al. (2009). Structural images were first obtained to be used for atlas registration of the fMRI data. Structural data were acquired using a T1-weighted MP-RAGE. Functional data were collected next using a T2-weighted gradient echo sequence [echo time 27 ms; repetition time (TR) 384 ms; field of view 256 mm; flip angle 90°]. fMRI BOLD datasets were collected while subjects fixated on a cross-hair. Two fMRI runs with 164 frames were acquired at a TR of 2.2 sec (approximately 6 minutes each). For the fMRI data, 36 contiguous, 4.0 mm thick slices were acquired parallel to the anterior-posterior commissure plane (4.0 mm approximately isotropic voxels) providing complete brain coverage. The fMRI data were transformed to a common atlas space, normalized across runs, corrected for head motion, and blurred with a 6 mm full-width at half-maximum Gaussian filter. A temporal filter with a frequency full width at half maximum cut-off of 0.1 Hz was applied to the fMRI data. Furthermore, noise was removed by regression of several nuisance variables, including the signal averaged over the whole brain and white matter parameters.

For the seed region, Sheline et al. (2010a, 2010b) selected bilateral precuneus (Talairach coordinates +/−7, −60, +21) (Talairach and Tournoux, 1988), which is among the regions affected early in the course of AD. They selected ROIs that were found to differ between healthy controls and both Alzheimer’s disease participants and PIB+ participants (Sheline et al., 2010b). Regional time series are an average of the voxel time series that are contained in a sphere with a 12 mm diameter (coordinates in Talairach space) (Talairach and Tournoux, 1988). The data were initially analyzed via the standard pooling Fisher-z approach using the seed region and ROIs selected. With the standard pooling Fisher-z method the following steps are taken: 1) the Pearson correlation between the regions for each subject is calculated; 2) the Pearson correlation coefficients are transformed to Fisher-z; 3) a group average of the Fisher-z transformation is calculated; and 4) an unpaired t-test of the group averages is performed. By this approach, 14 ROIs identified to have correlation differences between ApoE4+ and ApoE4− with bilateral precuneus are reported in Table 1. Figure 1 depicts the brain maps of the significant functional connectivity findings. In our analysis we compare the GEE approaches to the standard pooling Fisher-z approach using the same seed and ROIs.
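The four steps of the standard pooling Fisher-z method can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy data are hypothetical, and the unpaired t statistic is computed in its unequal-variance (Welch) form because the text does not say which variant was used.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r):
    """Fisher-z transform: 0.5 * ln((1 + r) / (1 - r))."""
    return 0.5 * math.log((1 + r) / (1 - r))


def pooled_fisher_z(seed, roi, group):
    """Steps 1-4 for one seed-ROI pair.

    seed, roi: one time series per subject; group: 0/1 label per subject.
    Returns (group-1 minus group-0 mean Fisher-z, Welch t statistic).
    """
    z = [fisher_z(pearson(s, r)) for s, r in zip(seed, roi)]  # steps 1-2
    stats = {}
    for g in (0, 1):
        zg = [zi for zi, gi in zip(z, group) if gi == g]
        mean = sum(zg) / len(zg)                               # step 3
        var = sum((zi - mean) ** 2 for zi in zg) / (len(zg) - 1)
        stats[g] = (mean, var, len(zg))
    diff = stats[1][0] - stats[0][0]
    # Assumption: Welch-type standard error for the unpaired t-test (step 4).
    se = math.sqrt(stats[0][1] / stats[0][2] + stats[1][1] / stats[1][2])
    return diff, diff / se


# Toy data (hypothetical): 20 subjects, 164 frames, stronger coupling in group 1.
rng = random.Random(0)
seed, roi, group = [], [], []
for i in range(20):
    g = i % 2
    rho = 0.6 if g else 0.1
    s = [rng.gauss(0, 1) for _ in range(164)]
    r = [rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for a in s]
    seed.append(s)
    roi.append(r)
    group.append(g)

diff, t_stat = pooled_fisher_z(seed, roi, group)
```

Note that the toy series are generated as independent draws over time, which is exactly the assumption the pooling approach makes and the one the GEE approaches are meant to relax.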

Table 1.

Regions of interest (Talairach coordinates) selected for ACS ApoE4+/ApoE4− connectivity study

Hypothalamus (00, 00, −09)

Dorsal occipital cortex (00, −89, +40)

Gyrus rectus (+04, +23, −21)

Inferior orbital cortex (+07, +58, −21)

Medial prefrontal cortex (+16, +53, +18)

Right hippocampus (+19, −19, −06)

Right parahippocampus (+31, −30, −14)

Right superior temporal gyrus/fronto-parietal operculum (+51, −03, +08)

Middle temporal cortex (+58, −12, −22)

Pregenual anterior cingulate/striatum (−02, +24, +02)

Caudal orbital cortex (−13, +24, −16)

Dorsal anterior cingulate (−14, +18, +30)

Left hippocampus & parahippocampus (−17, −34, −12)

Left superior temporal gyrus/fronto-parietal operculum (−37, −40, +09)


3 Notation and methodology

We denote the covariates by a p × 1 vector $X_{ij}$ and the outcome variables by $Y_{qij}$, where q = 1, …, Q indexes the seed region ($Y_{1ij}$) and its (Q − 1) regions of interest (ROIs), i = 1, …, n denotes the ith subject, and j = 1, …, J denotes the jth fMRI measurement. Our objective is to test for a group difference in the relationship between multiple ROIs $Y_q$ and the seed region $Y_1$, where $Y_q$ and $Y_1$ are nJ × 1 vectors, q > 1, and the two groups are denoted as g = {0, 1}. This objective will be addressed via (Q − 1) pairwise correlations between the (q − 1)th ROI $Y_q$ and the seed region $Y_1$. To reach our objective of comparing regional correlations between groups we suggest using an approach that can handle time-series outcome data. We describe two modeling approaches that can simultaneously handle time-series outcome data, include all subjects in the same model, and adjust for covariates. Also, we describe the GEE methodology that is used for regression parameter estimation of these modeling approaches in the presence of repeated measures data.

3.1 Generalized estimating equations

We propose using two modeling approaches for time-series data, the marginal model and the transition model. Each region has its own model. The marginal model (Diggle et al., 1994) is a population-average approach to longitudinal data in which the focus is the marginal expectation of the response $Y_{qij}$, $μ_{qij} = E (y_{qij} \mid x_{ij}^{T})$. The marginal expectation of $Y_{qij}$ is characterized as a function of the explanatory factors, i.e. $h (μ_{qij}) = x_{ij}^{T} β_{q}$ for a link function h and a p × 1 vector of regression coefficients $β_q = (β_{q0}, \dots, β_{qp'})^T$, where p′ = p − 1. Also, the variance of $Y_{qij}$ is a function of the mean, $\operatorname{Var}(y_{qij}) = g(μ_{qij}) φ$, for some variance function g and dispersion parameter φ. The correlation of the repeated measurements from the same subject, i.e. $Y_{qij}$ and $Y_{qik}$, is a function of the marginal means and additional parameters, $α_q$, and is given by $\operatorname{Corr}(Y_{qij}, Y_{qik}) = ρ(α_q)$. The regression coefficients, $β_q$, have a similar interpretation to that of the population-average effect from a cross-sectional analysis. Since our data are continuous and we assume they are Gaussian, $h(μ_{qij}) = μ_{qij}$ and $g(μ_{qij}) = 1$. The regression of the outcome on the covariates and the dependence structure are modeled separately (Diggle et al., 1994). The parameters are estimated by quasi-likelihood, iterating until convergence.

The other model we propose utilizing is the transition model. The objective of the transition model is to model the history of the outcome. In a transition model (Diggle et al., 1994) the current value of the outcome is influenced by its previous values. The model is $E (y_{qij} \mid x_{ij}^{T}, y_{qi,j-1}, \dots, y_{qi,j-K}) = μ_{qij}$. We call the dependence on the past K values “lag K”. The idea behind the transition model is that the influence of the past outcome values can be removed when they are adjusted for. The conditional mean and variance of the transition model are $h (μ_{qij}) = (x_{ij}^{T}, y_{qi,j-1}, \dots, y_{qi,j-K}) β_{q}$ and $\operatorname{Var}(y_{qij}) = g(μ_{qij}) φ$, where $β_q$ is a (p + K) × 1 vector of regression coefficients. Again, since the data are continuous and we assume Gaussianity, $h(μ_{qij}) = μ_{qij}$ and $g(μ_{qij}) = 1$. Now the interpretation of the regression coefficients, $β_q$, for the covariates is that they are adjusted for the history of the transition process.
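To make the lag-K structure concrete, the following sketch builds the rows that a transition model would regress on for one subject and one region: each row pairs the current value with its covariates followed by the K previous values. The function name is hypothetical, and dropping the first K time points (whose history is incomplete) is an assumption of this sketch, since the text does not say how the initial values are handled.

```python
def transition_rows(y, x, K):
    """Build (response, predictors) pairs for a lag-K transition model.

    y: one subject's time series y_1..y_J for a region.
    x: covariate vector per time point (each a list, intercept included).
    The first K time points are dropped because their history is incomplete.
    """
    rows = []
    for j in range(K, len(y)):
        lags = [y[j - k] for k in range(1, K + 1)]  # y_{j-1}, ..., y_{j-K}
        rows.append((y[j], list(x[j]) + lags))
    return rows


# Toy series with p = 2 covariates (intercept and time) and lag K = 2.
y = [1.0, 2.0, 4.0, 8.0, 16.0]
x = [[1.0, float(j)] for j in range(1, 6)]
rows = transition_rows(y, x, K=2)
```

Each predictor vector has p + K entries, matching the length of the coefficient vector in the transition model.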

As mentioned previously we have time-series data which by definition are correlated. A parameter estimation approach for both the marginal model and the transition model with correlated data is the generalized estimating equation (GEE) (Diggle et al., 1994; Liang and Zeger, 1986; Zeger et al., 1988). GEEs are derived from quasi-likelihood theory (Wedderburn, 1974; McCullagh and Nelder, 1989) rather than from likelihood-based derivations such as the GLM. This implies that the actual form of the distribution of the outcome does not need to be specified. In quasi-likelihood, it is only necessary to specify the mean-covariance structure and the relationship between the mean of the outcome and the covariates. We estimate the regression coefficient, β_q, by solving the GEE (Diggle et al., 1994; Liang and Zeger, 1986; Zeger et al., 1988):

$$S_{β_{q}} (β_{q}, \hat{α}_{q}) = \sum_{i} D_{qi} [\operatorname{Var}(Y_{qi})]^{-1} (Y_{qi} - μ_{qi}) = 0 \tag{1}$$

where $Y_{qi} = (Y_{qi1}, \dots, Y_{qiJ})^T$, $D_{qi} = \frac{\partial μ_{qi}^{T}}{\partial β_{q}}$, $μ_{qi} = (μ_{qi1}, \dots, μ_{qiJ})^T$, $V_{qi}(α_{q}) = \operatorname{Var}(Y_{qi}) = A_{qi}^{1/2} R_{qi}(α_{q}) A_{qi}^{1/2} φ$, $A_{qi} = \operatorname{diag}(1, \dots, 1)$, and $α_q$ is replaced with a consistent estimate, $\hat{α}_q$. The GEE uses a “working” correlation matrix, $R_{qi}(α_q)$, since the true correlation is unknown. There are various correlation structures that can be used. In this manuscript we consider the exchangeable correlation structure and the autoregressive correlation structure of order 1 (AR(1)). The exchangeable correlation structure assumes a constant correlation between any two measurements on the same subject and is given by

$$\operatorname{Corr}_{ex}(Y_{qij}, Y_{qik}) = \begin{cases} 1 & j = k \\ α_{q} & j \neq k \end{cases}.$$

The AR(1) correlation structure indicates that two observations further away are less correlated than those close together and is

$$\operatorname{Corr}_{ar1}(Y_{qij}, Y_{qi,j+t}) = α_{q}^{t} \quad \text{for } t = 0, 1, \dots, J - j.$$

The correlation, R_qi (α_q), is a nuisance parameter in the GEEs, and β_q is the parameter of interest.
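As a small illustration of the two working correlation structures, the sketch below builds them as plain J × J matrices for a given value of the correlation parameter. The function names are hypothetical; in practice GEE software constructs these matrices internally.

```python
def exchangeable(J, alpha):
    """J x J working correlation: 1 on the diagonal, alpha elsewhere."""
    return [[1.0 if j == k else alpha for k in range(J)] for j in range(J)]


def ar1(J, alpha):
    """J x J working correlation with entry alpha ** |j - k|."""
    return [[alpha ** abs(j - k) for k in range(J)] for j in range(J)]


R_ex = exchangeable(3, 0.5)
R_ar = ar1(3, 0.5)
```

With J = 3 and α = 0.5, every off-diagonal entry of the exchangeable matrix is 0.5, whereas the AR(1) matrix decays from 0.5 at lag 1 to 0.25 at lag 2.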

As shown above, for the GEE it is only necessary to specify: 1) the mean and the variance of the outcome conditioned on the covariates, and 2) the relationship between the expected outcome and the covariates. With the transition model, when the specification of the conditional mean of Y_q is correct then in theory the repeated transitions can be treated as independent data and standard statistical methods such as a GLM can be used. When the lag specification is incorrect other measures need to be utilized such as using the GEE (Diggle et al., 1994), by specifying an additional correlation structure that may not be captured by the specified lag in the transition model. Generally speaking, not much is known in practice regarding the independence assumption so we recommend being cautious and using the GEE instead of the GLM since the parameter estimates can be less biased and more efficient when using the GEE over the GLM.

In this work we use both GEE models for each region. With the marginal model we consider both exchangeable correlations and AR(1) correlations. For the transition model we consider the exchangeable correlations and AR(1) correlations along with lags 1–3. The within-subject correlation has been accounted for by the GEE. Plots of the fMRI time-series data for each subject (see Figures 2a–2c) resemble a cosine wave and appear to be periodic. Therefore, we assume the function of time to be a cosine wave. In order to use the cosine waves in the models we have to estimate the frequency of oscillations in addition to the other parameters. We use periodograms (Shumway and Stoffer, 2010) to estimate the dominant frequency for each subject and then take the average of the dominant frequency from all subjects. This average frequency for each region is used as an estimate of the frequency of oscillations in the models.
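The frequency-estimation step can be sketched as follows: compute a raw periodogram for each subject, take the frequency at which it peaks, and average those frequencies across subjects. This is a minimal standard-library illustration with hypothetical function names and noise-free toy data; it uses a direct discrete Fourier transform rather than any particular periodogram routine.

```python
import cmath
import math


def dominant_frequency(y):
    """Frequency (cycles per time point) with the largest periodogram value."""
    n = len(y)
    mean = sum(y) / n
    centered = [v - mean for v in y]          # remove the zero-frequency term
    best_freq, best_power = 0.0, -1.0
    for m in range(1, n // 2 + 1):            # Fourier frequencies m / n
        coef = sum(v * cmath.exp(-2j * math.pi * m * t / n)
                   for t, v in enumerate(centered))
        power = abs(coef) ** 2 / n
        if power > best_power:
            best_freq, best_power = m / n, power
    return best_freq


def average_dominant_frequency(series_by_subject):
    """Average the per-subject dominant frequencies for one region."""
    freqs = [dominant_frequency(y) for y in series_by_subject]
    return sum(freqs) / len(freqs)


# Toy data: four subjects, 200 time points, one cycle every 20 time points.
subjects = [[5 * math.cos(2 * math.pi * t / 20 + 0.3 * s) for t in range(200)]
            for s in range(4)]
omega_hat = average_dominant_frequency(subjects)
```

The averaged value would then be plugged into the cosine and sine terms of the mean model as the frequency of oscillation.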

Figure 2. Precuneus vs time for ApoE4+: (a) subject 1, (b) subject 2, (c) all subjects

We suggest a two-stage approach that uses the GEE to handle the temporal dependence through a modeling strategy and then uses the GEE residuals to calculate correlations between the regions. The correlation between the regions, $\hat{ρ}_{1qi}$ (defined below), is of primary interest and is different from the within-subject correlation, $R_{qi}(α_q)$, estimated from the GEE model. As stated earlier, $R_{qi}(α_q)$ is a nuisance parameter. We use the residuals estimated from the Q GEE models in the regional correlation analysis, and the residuals are defined to be

$$\hat{ε}_{qij} = y_{qij} - \hat{μ}_{qij}, \quad q = 1, \dots, Q. \tag{2}$$

For each subject, the Pearson correlation between the residuals of the seed region and the ROI is calculated

$$\hat{ρ}_{1qi} = \frac{\operatorname{cov}(\hat{ε}_{1i}, \hat{ε}_{qi})}{σ_{\hat{ε}_{1i}} σ_{\hat{ε}_{qi}}}, \quad q > 1, \tag{3}$$

where $\hat{ε}_{1i} = (\hat{ε}_{1i1}, \dots, \hat{ε}_{1iJ})^T$ and $\hat{ε}_{qi} = (\hat{ε}_{qi1}, \dots, \hat{ε}_{qiJ})^T$ are the residuals defined in (2). The correlation for each subject is then transformed to a Fisher-z

$$τ_{1qi} = .5 \ln \left(\frac{1 + \hat{ρ}_{1qi}}{1 - \hat{ρ}_{1qi}}\right), \quad q > 1, \tag{4}$$

where $\hat{ρ}_{1qi}$ is defined in (3). Then a group average of the Fisher-z transformation is calculated

$$\bar{τ}_{1qg} = E (τ_{1q} \mid g = G) = \frac{\sum_{i} I (g_{i} = G)\, τ_{1qi}}{n_{g}}, \quad g = \{0, 1\}, \tag{5}$$

where $τ_{1qi}$ is defined in (4), g denotes group status, G = {0, 1}, and

$$n_{g} = \sum_{i} I (g_{i} = G), \quad g = \{0, 1\}. \tag{6}$$

Lastly, a standard error needs to be estimated to test for differences of correlations across groups. A model-based approach is used first, with the standard errors and confidence intervals based on the t-distribution. The model-based standard error is

$$se = \frac{\sum_{i} (I (g_{i} = 0)\, τ_{1qi} - \bar{τ}_{1q0})^{2}}{n_{0}} + \frac{\sum_{i} (I (g_{i} = 1)\, τ_{1qi} - \bar{τ}_{1q1})^{2}}{n_{1}}, \tag{7}$$

where $τ_{1qi}$ is defined in (4), $\bar{τ}_{1q0}$ and $\bar{τ}_{1q1}$ are defined in (5), and $n_0$ and $n_1$ are defined in (6). In addition, we estimate the standard errors and confidence intervals with the bootstrap to include an empirical approach, since we may be uncertain of the standard error estimate provided by the model-based approach. Details for inferences using bootstrapping approaches are provided in the next section. We employ this two-stage procedure for all regions. We use the R package geepack (Højsgaard et al., 2005) to estimate the GEE parameters.

To set ideas and demonstrate the usefulness of the GEE approach, we briefly describe a PET example. A longitudinal study in Alzheimer’s disease (AD) has been conducted in which we collect PET PIB and CSF biomarkers, CSF-40 and CSF-42, at baseline and two follow-up annual visits. We want to determine if PET-PIB differs by group status and also if the association between CSF biomarkers and PET-PIB differs between healthy normal controls and AD subjects, with g = 0 if a control and g = 1 if an AD subject. A total of 25 healthy normal subjects, $n_0 = 25$, and 20 AD subjects, $n_1 = 20$, have been collected. Since there are three annual visits we denote the subscript for time to be j = 1, 2, 3. PET PIB is used to measure MCBP (mean cortical binding potential), which is one of the outcomes, $Y_{1ij}$, measured for the ith subject at time j, i = 1, …, 45. The other two outcomes are CSF-40, $Y_{2ij}$, and CSF-42, $Y_{3ij}$. Here we have three outcomes and let q = 1, 2, 3. Suppose we also need to adjust for group, $X_{1i}$, baseline age, $X_{2i}$, gender, $X_{3i}$, and time, j, yielding $X = (1, X_{1i}, X_{2i}, X_{3i}, j)^T$, where 1 is for the intercept and $(X_{1i}, X_{2i}, X_{3i})$ are fixed-time covariates. The mean function is $h(μ_{qij}) = μ_{qij} = β_{q0} + x_{1i} β_{q1} + x_{2i} β_{q2} + x_{3i} β_{q3} + j β_{q4}$ for the marginal model and $h(μ_{qij}) = μ_{qij} = β_{q0} + x_{1i} β_{q1} + x_{2i} β_{q2} + x_{3i} β_{q3} + j β_{q4} + y_{qi,j-1} β_{q5} + \dots + y_{qi,j-K} β_{q,4+K}$ for the transition model.

For example with PIB, the estimating equation (1) using the marginal model and exchangeable correlation is

$$S_{β_{1}} (β_{1}, \hat{α}_{1}) = \sum_{i} D_{1i} [\operatorname{Var}(Y_{1i})]^{-1} (Y_{1i} - μ_{1i}) = 0 \tag{8}$$

where $Y_{1i} = (Y_{1i1}, Y_{1i2}, Y_{1i3})^T$, $D_{1i} = \frac{\partial μ_{1i}^{T}}{\partial β_{1}}$, $β_1 = (β_{10}, β_{11}, β_{12}, β_{13}, β_{14})^T$, $μ_{1i} = (μ_{1i1}, μ_{1i2}, μ_{1i3})^T$,

$$\begin{aligned}
μ_{1i1} &= β_{10} + x_{1i} β_{11} + x_{2i} β_{12} + x_{3i} β_{13} + 1 \times β_{14} \\
μ_{1i2} &= β_{10} + x_{1i} β_{11} + x_{2i} β_{12} + x_{3i} β_{13} + 2 \times β_{14} \\
μ_{1i3} &= β_{10} + x_{1i} β_{11} + x_{2i} β_{12} + x_{3i} β_{13} + 3 \times β_{14},
\end{aligned}$$

and

$$V_{1i} (\hat{α}) = \operatorname{Var}(Y_{1i}) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & \hat{α}_{1} & \hat{α}_{1} \\ \hat{α}_{1} & 1 & \hat{α}_{1} \\ \hat{α}_{1} & \hat{α}_{1} & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \hat{φ},$$

where φ is the dispersion parameter and is defined by

$$\hat{φ} = \frac{1}{N - p} \sum_{i = 1}^{n} \sum_{j = 1}^{J_{i}} \hat{ε}_{qij}^{2} = \frac{1}{3 \times 45 - 5} \sum_{i = 1}^{45} \sum_{j = 1}^{3} \hat{ε}_{1ij}^{2},$$

J_i = 3 for all subjects, N is the total number of measurements, and p is the number of regression parameters.

The regression parameters are estimated from (8), and each describes the relationship between its covariate and PET-PIB, the outcome that plays the role of the seed region in this example. For example, adjusting for all other covariates, $β_{11}$ describes the relationship between group and PET-PIB. If the group regression coefficient estimated from this GEE model is significant then there is a relationship between PET-PIB and AD. These models account for the within-subject correlation. Furthermore, if we want to know whether the AD groups differ in the association between PIB and the CSF markers then we would also need to run a GEE model for both CSF markers. We would obtain the residuals for the ith individual from each of the three models: $\hat{ε}_{1i} = (\hat{ε}_{1i1} = y_{1i1} − \hat{μ}_{1i1}, \hat{ε}_{1i2} = y_{1i2} − \hat{μ}_{1i2}, \hat{ε}_{1i3} = y_{1i3} − \hat{μ}_{1i3})^T$ for PIB, $\hat{ε}_{2i} = (\hat{ε}_{2i1}, \hat{ε}_{2i2}, \hat{ε}_{2i3})^T$ for CSF-40, and $\hat{ε}_{3i} = (\hat{ε}_{3i1}, \hat{ε}_{3i2}, \hat{ε}_{3i3})^T$ for CSF-42. Then we would take the residuals from all three GEE models and calculate the correlations between the residuals of PIB and CSF-40, $\hat{ρ}_{12i} = \frac{\operatorname{cov}(\hat{ε}_{1i}, \hat{ε}_{2i})}{σ_{\hat{ε}_{1i}} σ_{\hat{ε}_{2i}}}$, and also between the residuals of PIB and CSF-42, $\hat{ρ}_{13i} = \frac{\operatorname{cov}(\hat{ε}_{1i}, \hat{ε}_{3i})}{σ_{\hat{ε}_{1i}} σ_{\hat{ε}_{3i}}}$, for each subject, where $\hat{ρ}_{1qi}$ is defined in (3). Then the Fisher-z transformation of PIB and CSF-40 and the Fisher-z transformation of PIB and CSF-42 would be calculated using (4) to obtain $τ_{12i}$ and $τ_{13i}$, respectively, for the ith individual. Next, we would calculate the control group means ($\bar{τ}_{120}$, $\bar{τ}_{130}$) and AD group means ($\bar{τ}_{121}$, $\bar{τ}_{131}$) of these Fisher-z transformations using (5). Lastly, the t-test would be calculated using the standard error defined by (7).

3.2 Bootstrap approach

Bootstrap approaches can be quite useful for inferential purposes in the event one is unsure of the model-based standard error estimate or when dealing with complex data (Davison and Hinkley, 1997; Efron and Tibshirani, 1993). We propose the individual bootstrap, i.e. resampling subjects applied to all methods under comparison. The individual bootstrap involves resampling n subjects and keeping the subjects’ whole time-series. This ensures that the dependency structure of the temporal data is kept intact for each region. Also, there is no violation of the independence assumption of bootstrapping since the whole series data are selected.

We use the bootstrap to estimate the standard error and calculate the confidence interval of the difference of the group-average Fisher-z transformations as follows. First we obtain the correlations from the B bootstrapped datasets. With the standard pooling Fisher-z approach we resample the subjects first and then obtain the correlations $\hat{ρ}_{1qi,b}^{*}$, b = 1, …, B, for the bth bootstrap. For the GEE model, we first obtain residuals from the GEE model with all the original subjects; then we resample the subjects on this residualized dataset and obtain the correlations $\hat{ρ}_{1qi,b}^{*}$, b = 1, …, B, for the bth bootstrap. After obtaining the B bootstrapped correlations we do the following for all methods. We estimate the difference of the group-average Fisher-z transformations, $τ_{1q,b}^{*}$, b = 1, …, B, for the bth bootstrap. Let $τ_{1q}^{(d)}$ be the difference of the group-average Fisher-z transformations from the actual sample. The bias is estimated as $\operatorname{Bias}_{1q} = B^{-1} \sum_{b=1}^{B} τ_{1q,b}^{*} - τ_{1q}^{(d)}$. The average of the bootstrapped estimates of the Fisher-z group differences is $\bar{τ}_{1q}^{*} = \frac{1}{B} \sum_{b=1}^{B} τ_{1q,b}^{*}$. The variance of the bootstrapped estimates of the Fisher-z group difference is

$$V_{1q} = \frac{1}{B - 1} \sum_{b = 1}^{B} (τ_{1q,b}^{*} - \bar{τ}_{1q}^{*})^{2}. \tag{9}$$

The bias-adjusted confidence interval is $τ_{1q}^{(d)} - \operatorname{Bias}_{1q} \pm t_{1 - α/2, df} \sqrt{V_{1q}}$. Resampling 100–500 times is recommended; we chose to bootstrap 100 times since we did not see much change in the standard errors with 500 bootstraps.
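The bootstrap quantities above can be sketched as follows. Because each subject's whole time series (or residual series) is kept intact, the per-subject correlation does not change when a subject is redrawn, so the sketch resamples per-subject Fisher-z values directly. Several choices here are assumptions rather than details taken from the text: subjects are resampled from the pooled sample, draws that leave a group empty are redrawn, and the critical value `crit` defaults to the normal value 1.96 as a stand-in because the degrees of freedom of the t quantile are not specified. All names are hypothetical.

```python
import math
import random


def group_diff(tau, group):
    """Difference of group-average Fisher-z values (group 1 minus group 0)."""
    m1 = [t for t, g in zip(tau, group) if g == 1]
    m0 = [t for t, g in zip(tau, group) if g == 0]
    return sum(m1) / len(m1) - sum(m0) / len(m0)


def bootstrap_diff(tau, group, B=100, crit=1.96, seed=0):
    """Subject-level bootstrap of the Fisher-z group difference.

    Returns (observed difference, bias, bootstrap SE, bias-adjusted CI).
    """
    rng = random.Random(seed)
    n = len(tau)
    observed = group_diff(tau, group)
    reps = []
    while len(reps) < B:
        idx = [rng.randrange(n) for _ in range(n)]   # resample subjects
        g_star = [group[i] for i in idx]
        if 0 < sum(g_star) < n:                      # need both groups present
            reps.append(group_diff([tau[i] for i in idx], g_star))
    mean_star = sum(reps) / B
    bias = mean_star - observed
    var = sum((r - mean_star) ** 2 for r in reps) / (B - 1)
    se = math.sqrt(var)
    center = observed - bias
    return observed, bias, se, (center - crit * se, center + crit * se)


# Toy per-subject Fisher-z values (hypothetical): 15 subjects per group.
rng = random.Random(1)
group = [0] * 15 + [1] * 15
tau = [rng.gauss(0.1 if g == 0 else 0.4, 0.07) for g in group]
observed, bias, se, ci = bootstrap_diff(tau, group)
```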

3.3 Algorithm

To summarize our approach we have provided an algorithm to obtain the connectivity difference measures with all approaches.

Step 1
Start with preprocessed resting-state fMRI data for n subjects. Identify the seed region and ROIs.
Step 2
For the standard pooling Fisher-z approach correlate the seed region with each ROI for each subject to obtain the Pearson correlation, $\hat{ρ}_{1qi}$, using (3).
Step 3
For all GEE approaches run a GEE model in R, SAS, or Stata specifying the correlation structure. Each region will have its own GEE model. Request the residuals, $\hat{ε}_{qij}$, from the qth GEE model. Next correlate the seed region with each ROI for each subject to obtain the Pearson correlation, $\hat{ρ}_{1qi}$, using (3).
Step 4
The Fisher-z transformation, $τ_{1qi}$, for the seed region and each ROI is obtained for each subject using (4). This is done for all methods.
Step 5
For each method and seed region-ROI pair calculate the group mean Fisher-z, $\bar{τ}_{1qg}$, using (5) and the Fisher-z group difference $\bar{τ}_{1q1} − \bar{τ}_{1q0}$.
Step 6
For each method and seed region-ROI pair calculate the model-based standard error with (7) to be used for a t-test of connectivity group differences.
Step 7
To obtain a t-test with the bootstrap standard error for each seed region-ROI pair do the following. For the standard pooling Fisher-z approach use the original data and do subject resampling B times to get B bootstrap datasets. With the GEE approaches first obtain the residuals from the GEE models with the original data; then do subject resampling B times to get B datasets and include the residuals in that bootstrapped dataset from the GEE models. To estimate ${\hat{ρ}}_{1 q i, b}^{*}$ for the bth bootstrap correlate the seed region with each ROI for each subject to obtain the Pearson correlation; repeat for each method. Then follow steps 4 and 5 to estimate the difference of the group-average Fisher-z transformations, $τ_{1 q, b}^{*}$ , for the bth bootstrap. Finally, estimate the bootstrap standard error, $\sqrt{V_{1 q}}$ , with (9) to be used for a t-test of group differences.

4 Simulation studies

We perform a number of simulation studies in order to evaluate the statistical properties of the various approaches and to assess model misspecification. We compare six approaches: the standard pooled Fisher-z approach (Pool), a GEE marginal model with AR(1) correlation (GEE AR(1)), GEE transition models with lags 1–3 and the correct function of time with an exchangeable correlation (Tran 1, Tran 2, Tran 3), and a GEE transition model with lag 1 and incorrect function of time with an exchangeable correlation (Tran 1*). We also considered the marginal model with exchangeable correlations, the marginal model with an incorrect function of time and either exchangeable or AR(1) correlations, and transition models with AR(1) correlations (results not presented). We generate the data to have a cosine function of time; the marginal model and transition model with incorrect function of time take time to be linear. This allows us to assess robustness of the GEE model to an incorrect function of time. The bias, average of the standard error (SE), mean squared error (MSE), and 95% coverage probabilities of the difference between the group average Fisher-z estimates, $(τ_1 − τ_0)$, are calculated, where $τ_g = .5 \ln[(1 + ρ_g)/(1 − ρ_g)]$ and g = {0, 1}. Results for the standard error and coverages are reported under their respective model-based approach or bootstrap approach.

Our data generation consists of j = 1, …, 200 time points per subject, two groups denoted as g = {0, 1}, lag 1, and a total of n = 60 subjects with equal numbers in each group. To examine finite-sample properties we also considered sample sizes of 30 and 200 with equal numbers in each group (results not presented). We generate two variables $(Y_{1j}, Y_{2j})$ as $(Y_{1j}, Y_{2j}) \sim BVN(μ_j, Σ_g)$, where $μ_j = (μ_{1j}, μ_{2j})$, $μ_{1j} = A \cos(2πωj + φ) + b_1 y_{1,j-1}$, $μ_{2j} = A \cos(2πωj + φ) + b_2 y_{2,j-1}$, and $Σ_g$ is the covariance matrix for the corresponding group, with variances equal to 1 and correlation denoted by $ρ_g$. The lag parameters are $(b_1, b_2) = (−.85, −.8)$ for group 0 and $(b_1, b_2) = (−.75, −.7)$ for group 1; the amplitude is A = 5; the frequency of oscillation is ω = 1/20; and the phase shift is φ = .6π. The three sets of correlation values and Fisher-z difference values selected for the simulation studies are: ($ρ_0$ = .05, $ρ_1$ = .1, $τ_1 − τ_0$ = .05); ($ρ_0$ = .1, $ρ_1$ = .2, $τ_1 − τ_0$ = .1); and ($ρ_0$ = .05, $ρ_1$ = .35, $τ_1 − τ_0$ = .32). We generate 1000 replications for each study. Since each simulation study takes approximately three weeks to run on an AMD Opteron 270 2.0 GHz Dual Core server, we have evaluated a limited number of values for the parameters. For these simulation studies, we assume the frequency of oscillations is not known and estimate it prior to the GEE parameter estimation process (as described in Section 3). The GEE marginal model of interest is $y_{qij} = β_0 + β_1 \cos(2πωj) + β_2 \sin(2πωj) + β_3 g$. The transition models with the correct function of time are $y_{qij} = β_{0} + β_{1} \cos(2πωj) + β_{2} \sin(2πωj) + β_{3} g + \sum_{k = 1}^{K} β_{4 + k} y_{qi(j - k)}$. The transition model with the incorrect function of time and lag of 1 is $y_{qij} = β_0 + β_1 j + β_2 g + β_3 y_{qi(j-1)}$.
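A minimal sketch of this data-generating process is given below. It draws, at each time point, a pair of unit-variance normal errors with correlation ρ_g and adds them to the cosine mean plus the lag-1 term. The starting values $y_{q,0} = 0$, the absence of a burn-in period, and all names are assumptions of this illustration, since the text does not describe how the series were initialized.

```python
import math
import random


def simulate_subject(rho, b1, b2, J=200, A=5.0, omega=1 / 20,
                     phase=0.6 * math.pi, rng=random):
    """One subject's pair of series with cosine mean, lag-1 terms, correlated noise."""
    y1, y2 = [], []
    prev1 = prev2 = 0.0                      # assumed starting values y_{q,0} = 0
    for j in range(1, J + 1):
        wave = A * math.cos(2 * math.pi * omega * j + phase)
        e1 = rng.gauss(0, 1)
        # Second error has unit variance and correlation rho with the first.
        e2 = rho * e1 + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        prev1 = wave + b1 * prev1 + e1
        prev2 = wave + b2 * prev2 + e2
        y1.append(prev1)
        y2.append(prev2)
    return y1, y2


# First simulation setting: (rho, b1, b2) per group, 30 subjects in each group.
rng = random.Random(2012)
params = {0: (0.05, -0.85, -0.80), 1: (0.10, -0.75, -0.70)}
data = [(g, *simulate_subject(*params[g], rng=rng))
        for g in (0, 1) for _ in range(30)]
```

Each element of `data` holds the group label followed by the two simulated series for one subject.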

The true SE of the group Fisher-z transformation difference is

$$\sqrt{\frac{1 / (J - 3)}{n_{1}} + \frac{1 / (J - 3)}{n_{2}}} = \sqrt{\frac{1 / (200 - 3)}{30} + \frac{1 / (200 - 3)}{30}} = 0.0184.$$

As may be noticed, the true standard error is a function of the number of time points and number of subjects. Therefore, the standard error will be the same regardless of the correlation values across simulation studies and will decrease as the sample size increases.
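
The dependence on the number of time points and subjects can be made concrete with a small helper. The function name `true_se` is hypothetical; the formula is the one given above.

```python
import math

def true_se(J, n1, n2):
    """True SE of the group Fisher-z difference.

    J is the number of time points per subject; n1 and n2 are the group sizes.
    Each subject's Fisher-z estimate has variance 1 / (J - 3).
    """
    per_subject_var = 1.0 / (J - 3)
    return math.sqrt(per_subject_var / n1 + per_subject_var / n2)

# The setting reported in the text: 200 time points, 30 subjects per group.
se_n60 = true_se(200, 30, 30)
```

Because the correlation values do not enter the formula, the same function applies to all three simulation settings; only J and the group sizes matter.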

Results are reported in Tables 2, 3, and 4. The Pool approach has the most bias and the largest MSE. This is due to not accounting for the lag and possibly to not modeling time. The transition models with the correct function of time all have the smallest bias, the smallest SEs, which are also closest to the true SE, and the smallest MSE. The marginal model has small bias and small MSE but the largest SE. The transition model with lag 1 and an incorrect function of time has more bias, a larger SE, and a larger MSE than the other GEE approaches, but still has smaller bias and smaller MSE than the Pool approach and a smaller SE than the marginal model. The coverages are always too low for the Pool approach and for the transition model with the incorrect function of time; the undercoverage is worst with the Pool approach. For both the model-based SE and the bootstrap SE approaches, the coverages for the marginal model and the transition models with the correct function of time are close to the nominal 95% level. The model-based and bootstrap standard errors are very similar. As the connectivity difference increases, the Pool approach and the transition model with the incorrect function of time show smaller bias, smaller MSE, and higher coverage; however, they still perform worse than the other GEE models.
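
One way to read the MSE column alongside the bias and SE columns is through the standard decomposition of mean squared error into squared bias plus variance. The following worked check uses the rounded Table 2 entries and assumes that the variance entering the MSE is the empirical variance of the estimates across replications, so the figures are approximate:

$$\mathrm{MSE}(\hat{τ}_{1} - \hat{τ}_{0}) = \mathrm{bias}^{2} + \mathrm{Var}(\hat{τ}_{1} - \hat{τ}_{0})$$

$$\text{Pool: } 0.0289 - 0.168^{2} \approx 0.0289 - 0.0282 = 0.0007, \qquad \sqrt{0.0007} \approx 0.026$$

$$\text{Tran 1: } 0.0004 - 0.001^{2} \approx 0.0004, \qquad \sqrt{0.0004} = 0.020$$

Under this reading, the implied standard deviations (about 0.026 and 0.020) are close to the reported E(SE) values of 0.027 and 0.018, given that the MSE is rounded to four decimals, and nearly all of the Pool MSE comes from squared bias rather than from variance.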

Table 2.

Simulation study: model-based and bootstrap results for correlation differences of .05 (.05, .10) and Fisher-z difference of .05. Results presented for Fisher-z difference between groups

Summary Statistics	Pool	GEE AR(1)	Tran 3	Tran 2	Tran 1	Tran 1^*
bias	0.168	0.001	0.001	0.001	0.001	0.124
MSE	0.0289	0.0014	0.0004	0.0004	0.0004	0.0164
Model-based
E(SE)	0.027	0.037	0.019	0.019	0.018	0.031
95% Cov	0	0.95	0.946	0.945	0.938	0.034
Bootstrap
E(SE)	0.027	0.037	0.018	0.018	0.018	0.031
95% cov	0	0.947	0.936	0.93	0.927	0.031

Note: Tran j is with j lag; ^* = wrong function of time; 95% coverages for bootstrap are bias-adjusted

Table 3.

Simulation study: model-based and bootstrap results for correlation differences of .10 (.1, .2) and Fisher-z difference of .1. Results presented for Fisher-z difference between groups

Summary Statistics	Pool	GEE AR(1)	Tran 3	Tran 2	Tran 1	Tran 1^*
bias	0.15	0	0	0	0.001	0.11
MSE	0.0231	0.0014	0.0004	0.0004	0.0004	0.013
Model-based
E(SE)	0.027	0.037	0.019	0.019	0.018	0.032
95% Cov	0	0.948	0.943	0.946	0.941	0.082
Bootstrap
E(SE)	0.027	0.037	0.018	0.018	0.018	0.031
95% cov	0.001	0.948	0.935	0.935	0.934	0.081

Note: Tran j is with j lag; ^* = wrong function of time; 95% coverages for bootstrap are bias-adjusted

Table 4.

Simulation study: model-based and bootstrap results for correlation differences of .30 (.05, .35) and Fisher-z difference of .32. Results presented for Fisher-z difference between groups

Summary Statistics	Pool	GEE AR(1)	Tran 3	Tran 2	Tran 1	Tran 1^*
bias	0.087	−0.001	−0.001	−0.001	−0.001	0.065
MSE	0.0083	0.0014	0.0004	0.0004	0.0004	0.0052
Model-based
E(SE)	0.027	0.037	0.019	0.019	0.018	0.032
95% Cov	0.117	0.947	0.938	0.937	0.937	0.451
Bootstrap
E(SE)	0.027	0.037	0.018	0.018	0.018	0.031
95% cov	0.131	0.945	0.935	0.935	0.932	0.45

Note: Tran j is with j lag; ^* = wrong function of time; 95% coverages for bootstrap are bias-adjusted

Our results held over various sample sizes; and as expected the standard errors are reduced as the sample size increases. The results for the GEE approaches with the correct function of time are very similar whether we select the exchangeable correlation or AR(1) correlation structure; the standard error tends to be slightly smaller for the AR(1) correlation, most likely due to the nature of the data. The marginal model with an incorrect function of time performs poorly regardless of the correlation structure selected and is not an improvement over the Pool approach. However, we found that the correlation structure selection matters the most with the transition model and an incorrect function of time. With an incorrect function of time in the transition model, the AR(1) correlation has superior performance over the exchangeable correlation since it has less bias, smaller MSE, and wider coverages. Based on our choices for these simulation studies, the transition model with the AR(1) tends to be more robust to model misspecification. Based on our findings, we recommend using the GEE transition model and either the model-based or bootstrap SE.

5 Example

The objective for the resting-state ACS fMRI dataset is to determine which functional connections between the precuneus and its ROIs differ between the ApoE4+ and ApoE4− groups among those who are cognitively normal. Each of the 15 regions has 164 BOLD measures per scanning session, with two scanning sessions. We exclude the first four frames from each run to remove the effect of the magnet initialization. We compare a total of six approaches: the standard pooling Fisher-z method; a GEE marginal model with a cosine function of time and AR(1) correlation; GEE transition models with a cosine function of time, lags 1–3, and an exchangeable correlation; and a GEE transition model with lag 1, a linear function of time, and an exchangeable correlation. Based on our plots of the current values versus the previous values using lags up to seven, it seems unnecessary to include a lag larger than three in the analysis. We apply both the model-based SE approach and the bootstrap SE approach to all six methods to obtain standard errors and confidence intervals. We include both an uncorrected analysis, where p-value≤.05 is considered significant, and an analysis corrected for multiple testing using a Bonferroni correction, where p-value≤.004 is considered significant.

In all models we adjust for time and group; therefore, for the GEE models we specify the outcome to be the qth regional fMRI time-series data and the covariates to be time and group. For the transition models we also adjust for the outcomes from the previous K time-points, where K is the lag of choice. Since age is related to ApoE4 status, we include an analysis that does not adjust for age (unadjusted analysis) and an analysis that adjusts for age (adjusted analysis; results only presented in figure). The GEE methods can include age as an additional covariate. The Pool approach requires an extra step for the adjusted analysis: after obtaining the Fisher-z estimates for each subject, we use a GLM to estimate group differences, where the outcome is the Fisher-z estimate and the covariates are group and age. We then use a t-test for the group coefficient to test for functional connectivity group differences. We focus the discussion here on the uncorrected unadjusted results, since the number of regions is small and age does not change the functional connectivity findings except for a few cases with the multiple testing correction.
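
The two significance thresholds and the interval columns of Tables 5–8 can be illustrated with a short sketch. It assumes 14 comparisons for the Bonferroni correction (which gives roughly .0036, consistent with the rounded .004 used here) and normal-quantile (Wald) intervals; the helper names are hypothetical, and the authors' exact computation may differ.

```python
import math

ALPHA = 0.05
N_TESTS = 14                          # assumption: one test per precuneus-ROI pair
BONFERRONI_ALPHA = ALPHA / N_TESTS    # roughly .0036; the text rounds to .004

def wald_ci(estimate, se, z_crit=1.959964):
    """95% Wald interval: estimate +/- z_crit * SE."""
    return (estimate - z_crit * se, estimate + z_crit * se)

def wald_p_value(estimate, se):
    """Two-sided normal-approximation p-value for H0: difference = 0."""
    return math.erfc(abs(estimate / se) / math.sqrt(2))

def classify(p, corrected_alpha=0.004):
    """Label a p-value using the thresholds stated in the text."""
    if p > ALPHA:
        return "NS at .05"
    if p > corrected_alpha:
        return "NS after Bonferroni"
    return "significant"

# Pool estimate for the hypothalamus (Table 5): difference 0.071, model-based SE 0.026.
low, high = wald_ci(0.071, 0.026)
```

With the rounded table inputs, `low` and `high` round to the (0.020, 0.122) interval reported for that entry, which suggests, but does not confirm, that the model-based intervals are of this Wald form.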

We only considered the 99 subjects who had two fMRI runs available. This gives us 38 ApoE4+ subjects to be compared with 61 ApoE4− subjects. Age differs significantly between the ApoE4 groups (p-value=0.009), with a mean (SD) of 63.2 (7.4) yrs in the ApoE4− group and 58.7 (8.5) yrs in the ApoE4+ group. The ApoE4 groups do not differ significantly (p>.05) on the other demographics: gender (19 (31%) males ApoE4−, 9 (24%) males ApoE4+); Mini-Mental State Examination (MMSE) score (mean (SD): 29.3 (.88) for ApoE4−; 29.6 (.68) for ApoE4+); and education level (mean (SD): 16.1 (2.6) yrs for ApoE4−; 16.2 (2.0) yrs for ApoE4+).

Tables 5, 6, 7, and 8 report the ApoE4+/− functional connectivity differences between the precuneus and its ROIs. Figures 3–4 depict the positive and negative group connectivity differences and their inferential findings across all methods using the model-based SE for the unadjusted and adjusted analyses, respectively. For all regions, the GEE marginal model yields results similar to the Pool approach. We suspect this is because the marginal model does not account for the lag and because the function of time may differ slightly from the one specified. The results for the transition models differ from those of the GEE marginal model and the Pool approach. In general, the magnitudes of the connectivity differences are smaller for the transition models, except for the hypothalamus, pregenual AC, and left hippocampus. The transition models with lag 1 always have a smaller SE than the transition models with lag > 1. Ordered from smallest to largest, the standard errors tend to be those of the transition lag 1 models, then the marginal model and Pool approach (which are similar), and then the transition lag > 1 models. There are a few exceptions: in some regions the SEs of the transition models are always smaller than those of the Pool approach and marginal model (medial prefrontal cortex, right parahippocampus (model-based only), middle temporal, and left hippocampus), and in a few instances the Pool approach and marginal model have smaller standard errors than the lag models (right hippocampus, dorsal AC, and left superior temporal). In all 14 analyses, the Pool approach and marginal model results are significant. Most of the transition model results are significant, with the following exceptions: inferior orbital cortex; right hippocampus (transition lag > 1 models); right parahippocampus; right superior temporal (transition lag > 1 models, excluding lag 2 with the bootstrap); dorsal AC (all except transition lag 1 with the bootstrap); and left superior temporal (all except transition lag 1 with the bootstrap).

Table 5.

Functional connectivity of Precuneus and ROIs (Hypothalamus, Dorsal occipital cortex, Gyrus rectus, Inferior orbital cortex) comparing ApoE4+ and ApoE4−

Coefficient estimates	Pool	GEE AR(1)	Tran 3	Tran 2	Tran 1	Tran 1^*
Hypothalamus
Fisher-z difference	0.071	0.071	0.077	0.076	0.078	0.078
Model-based: SE	0.026	0.026	0.029	0.028	0.022	0.022
p-value	0.007^**	0.007^**	0.008^**	0.007^**	0.001	0.001
95% CI	(0.020, 0.122)	(0.020, 0.123)	(0.020, 0.135)	(0.022, 0.131)	(0.034, 0.122)	(0.034, 0.122)
Bootstrap: SE	0.027	0.027	0.029	0.028	0.022	0.022
p-value	0.009^**	0.009^**	0.011^**	0.009^**	0.001	0.001
95% CI	(0.018, 0.123)	(0.018, 0.124)	(0.018, 0.134)	(0.019, 0.131)	(0.034, 0.121)	(0.034, 0.121)
Dorsal Occ Cor
Fisher-z difference	0.124	0.124	0.12	0.119	0.11	0.11
Model-based: SE	0.040	0.040	0.049	0.048	0.038	0.038
p-value	0.003	0.003	0.016^**	0.016^**	0.005^**	0.005^**
95% CI	(0.045, 0.204)	(0.044, 0.204)	(0.023, 0.217)	(0.023, 0.215)	(0.035, 0.186)	(0.035, 0.186)
Bootstrap: SE	0.038	0.038	0.049	0.048	0.037	0.037
p-value	0.002	0.002	0.023^**	0.022^**	0.006^**	0.006^**
95% CI	(0.047, 0.197)	(0.046, 0.196)	(0.016, 0.211)	(0.017, 0.209)	(0.031, 0.180)	(0.031, 0.180)
Gyrus Rectus
Fisher-z difference	−0.095	−0.096	−0.092	−0.083	−0.062	−0.061
Model-based: SE	0.025	0.025	0.030	0.028	0.022	0.022
p-value	0.000	0.000	0.002	0.003	0.006^**	0.006^**
95% CI	(−0.144, −0.046)	(−0.144, −0.047)	(−0.151, −0.034)	(−0.138, −0.028)	(−0.105, −0.018)	(−0.105, −0.018)
Bootstrap: SE	0.026	0.026	0.029	0.028	0.023	0.023
p-value	0.000	0.000	0.002	0.003	0.009^**	0.009^**
95% CI	(−0.147, −0.043)	(−0.148, −0.044)	(−0.151, −0.035)	(−0.139, −0.028)	(−0.109, −0.016)	(−0.109, −0.016)
Inferior orbital cortex
Fisher-z difference	0.08	0.079	0.043	0.044	0.046	0.046
Model-based: SE	0.027	0.027	0.031	0.030	0.024	0.024
p-value	0.004	0.004	0.171^,^*	0.145^,^*	0.064^,^*	0.063^,^*
95% CI	(0.027, 0.133)	(0.026, 0.132)	(−0.019, 0.104)	(−0.015, 0.103)	(−0.003, 0.094)	(−0.002, 0.094)
Bootstrap: SE	0.025	0.025	0.029	0.028	0.024	0.023
p-value	0.003	0.004	0.191^,^*	0.155^,^*	0.070^,^*	0.069^,^*
95% CI	(0.026, 0.124)	(0.025, 0.123)	(−0.020, 0.097)	(−0.016, 0.097)	(−0.004, 0.090)	(−0.003, 0.090)

Note: Tran j is with j lag; ^* = linear function of time; Occ Cor = Occipital Cortex; 95% coverages are bias-adjusted; ^,^* = NS at α=.05; ^** = NS for Bonferroni correction at α=.004

Table 6.

Functional connectivity of Precuneus and ROIs (Medial prefrontal cortex, R hippocampus, R parahippocampus, Right superior temporal gyrus/fronto-parietal operculum) comparing ApoE4+ and ApoE4−

Coefficient estimates	Pool	GEE AR(1)	Tran 3	Tran 2	Tran 1	Tran 1^*
Medial prefront cort
Fisher-z difference	0.119	0.119	0.105	0.101	0.09	0.09
Model-based: SE	0.038	0.038	0.037	0.036	0.034	0.034
p-value	0.002	0.002	0.005^**	0.006^**	0.009^**	0.009^**
95% CI	(0.044, 0.194)	(0.044, 0.194)	(0.032, 0.178)	(0.030, 0.173)	(0.023, 0.157)	(0.023, 0.157)
Bootstrap: SE	0.037	0.037	0.036	0.035	0.034	0.034
p-value	0.002	0.002	0.004	0.005^**	0.011^**	0.011^**
95% CI	(0.045, 0.191)	(0.045, 0.190)	(0.033, 0.175)	(0.030, 0.170)	(0.021, 0.156)	(0.021, 0.157)
R hippocampus
Fisher-z difference	−0.094	−0.094	−0.052	−0.051	−0.059	−0.059
Model-based: SE	0.025	0.025	0.034	0.033	0.027	0.027
p-value	0.000	0.000	0.128^,^*	0.128^,^*	0.031^**	0.031^**
95% CI	(−0.143, −0.045)	(−0.144, −0.045)	(−0.120, 0.015)	(−0.116, 0.015)	(−0.113, −0.006)	(−0.113, −0.006)
Bootstrap: SE	0.023	0.023	0.032	0.031	0.026	0.026
p-value	0.000	0.000	0.120^,^*	0.115^,^*	0.024^**	0.024^**
95% CI	(−0.136, −0.046)	(−0.136, −0.047)	(−0.113, 0.013)	(−0.110, 0.012)	(−0.109, −0.008)	(−0.109, −0.008)
R parahippocampus
Fisher-z difference	−0.085	−0.085	−0.048	−0.047	−0.052	−0.052
Model-based: SE	0.039	0.039	0.038	0.037	0.032	0.032
p-value	0.031^**	0.030^**	0.204^,^*	0.205^,^*	0.113^,^*	0.112^,^*
95% CI	(−0.162, −0.008)	(−0.163, −0.008)	(−0.123, 0.027)	(−0.121, 0.026)	(−0.116, 0.012)	(−0.116, 0.012)
Bootstrap: SE	0.033	0.033	0.034	0.033	0.029	0.029
p-value	0.013^**	0.013^**	0.185^,^*	0.188^,^*	0.107^,^*	0.107^,^*
95% CI	(−0.147, −0.018)	(−0.148, −0.018)	(−0.112, 0.022)	(−0.110, 0.022)	(−0.105, 0.010)	(−0.105, 0.010)
R superior temporal
Fisher-z difference	−0.116	−0.115	−0.079	−0.079	−0.087	−0.087
Model-based: SE	0.038	0.038	0.042	0.041	0.034	0.034
p-value	0.003	0.003	0.066^,^*	0.059^,^*	0.012^**	0.012^**
95% CI	(−0.192, −0.040)	(−0.191, −0.040)	(−0.162, 0.005)	(−0.161, 0.003)	(−0.154, −0.020)	(−0.154, −0.020)
Bootstrap: SE	0.036	0.036	0.041	0.040	0.033	0.033
p-value	0.001	0.001	0.053^,^*	0.048^**	0.009^**	0.009^**
95% CI	(−0.188, −0.046)	(−0.187, −0.046)	(−0.162, 0.001)	(−0.160, −0.001)	(−0.154, −0.023)	(−0.154, −0.023)

Note: Tran j is with j lag; ^* = linear function of time; prefront cort = prefrontal cortex; R superior temporal = Right superior temporal gyrus/fronto-parietal operculum; 95% coverages are bias-adjusted; ^,^* = NS at α=.05; ^** = NS for Bonferroni correction at α=.004

Table 7.

Functional connectivity of Precuneus and ROIs (Middle temporal cortex, Pregenual anterior cingulate/striatum, Caudal orbital cortex, Dorsal anterior cingulate) comparing ApoE4+ and ApoE4−

Coefficient estimates	Pool	GEE AR(1)	Tran 3	Tran 2	Tran 1	Tran 1^*
Middle temporal
Fisher-z difference	−0.124	−0.124	−0.085	−0.082	−0.072	−0.072
Model-based: SE	0.041	0.041	0.037	0.036	0.033	0.033
p-value	0.003	0.003	0.023^**	0.024^**	0.033^**	0.033^**
95% CI	(−0.204, −0.043)	(−0.205, −0.044)	(−0.157, −0.012)	(−0.153, −0.011)	(−0.139, −0.006)	(−0.139, −0.006)
Bootstrap: SE	0.041	0.041	0.037	0.037	0.034	0.034
p-value	0.003	0.003	0.023^**	0.025^**	0.040^**	0.040^**
95% CI	(−0.204, −0.043)	(−0.204, −0.043)	(−0.160, −0.012)	(−0.156, −0.010)	(−0.140, −0.003)	(−0.140, −0.003)
Pregenual AC/striatum
Fisher-z difference	−0.098	−0.098	−0.136	−0.137	−0.138	−0.138
Model-based: SE	0.040	0.041	0.046	0.045	0.039	0.039
p-value	0.017^**	0.018^**	0.004	0.003	0.001	0.001
95% CI	(−0.178, −0.018)	(−0.178, −0.017)	(−0.226, −0.045)	(−0.226, −0.049)	(−0.216, −0.060)	(−0.216, −0.060)
Bootstrap: SE	0.037	0.037	0.041	0.04	0.033	0.033
p-value	0.010^**	0.011^**	0.002	0.002	0.000	0.000
95% CI	(−0.169, −0.024)	(−0.169, −0.023)	(−0.210, −0.048)	(−0.210, −0.051)	(−0.199, −0.067)	(−0.199, −0.068)
Caudal orbital cortex
Fisher-z difference	0.111	0.111	0.096	0.096	0.079	0.079
Model-based: SE	0.031	0.031	0.033	0.032	0.026	0.026
p-value	0.000	0.000	0.005^**	0.003	0.003	0.003
95% CI	(0.050, 0.172)	(0.050, 0.172)	(0.031, 0.162)	(0.033, 0.160)	(0.028, 0.130)	(0.028, 0.130)
Bootstrap: SE	0.032	0.032	0.032	0.031	0.026	0.026
p-value	0.001	0.001	0.005^**	0.004	0.004	0.004
95% CI	(0.042, 0.170)	(0.042, 0.169)	(0.028, 0.154)	(0.030, 0.153)	(0.025, 0.126)	(0.025, 0.126)
Dorsal AC
Fisher-z difference	−0.11	−0.11	−0.064	−0.064	−0.065	−0.065
Model-based: SE	0.031	0.031	0.04	0.041	0.033	0.033
p-value	0.001	0.001	0.131^,^*	0.121^,^*	0.053^,^*	0.053^,^*
95% CI	(−0.173, −0.048)	(−0.172, −0.048)	(−0.147, 0.019)	(−0.146, 0.017)	(−0.131, 0.001)	(−0.131, 0.001)
Bootstrap: SE	0.032	0.032	0.044	0.043	0.034	0.034
p-value	0.001	0.001	0.131^,^*	0.121^,^*	0.051^,^*	0.050^**
95% CI	(−0.176, −0.050)	(−0.176, −0.049)	(−0.156, 0.020)	(−0.154, 0.018)	(−0.135, 0.000)	(−0.135, 0.000)

Note: Tran j is with j lag; ^* = linear function of time; Middle temporal = Middle temporal cortex; AC = Anterior Cingulate; 95% coverages are bias-adjusted; ^,^* = NS at α=.05; ^** = NS for Bonferroni correction at α=.004

Table 8.

Functional connectivity of Precuneus and ROIs (L hippocampus and parahippocampus, L superior temporal gyrus/fronto-parietal operculum) comparing ApoE4+ and ApoE4−

Coefficient estimates	Pool	GEE AR(1)	Tran 3	Tran 2	Tran 1	Tran 1^*
L hipp and para
Fisher-z difference	−0.133	−0.132	−0.149	−0.146	−0.125	−0.125
Model-based: SE	0.038	0.038	0.035	0.034	0.032	0.032
p-value	0.001	0.001	0.000	0.000	0.000	0.000
95% CI	(−0.207, −0.058)	(−0.207, −0.057)	(−0.219, −0.079)	(−0.215, −0.078)	(−0.189, −0.061)	(−0.189, −0.061)
Bootstrap: SE	0.039	0.039	0.034	0.033	0.031	0.031
p-value	0.001	0.001	0.000	0.000	0.000	0.000
95% CI	(−0.204, −0.050)	(−0.203, −0.050)	(−0.214, −0.081)	(−0.209, −0.080)	(−0.185, −0.063)	(−0.185, −0.063)
L sup temp gyrus
Fisher-z difference	−0.113	−0.113	−0.052	−0.054	−0.069	−0.069
Model-based: SE	0.032	0.032	0.044	0.043	0.035	0.035
p-value	0.001	0.001	0.243^,^*	0.215^,^*	0.054^,^*	0.054^,^*
95% CI	(−0.177, −0.049)	(−0.177, −0.048)	(−0.140, 0.036)	(−0.140, 0.032)	(−0.139, 0.001)	(−0.138, 0.001)
Bootstrap: SE	0.029	0.029	0.040	0.039	0.032	0.032
p-value	0.000	0.000	0.163^,^*	0.136^,^*	0.024^**	0.024^**
95% CI	(−0.174, −0.058)	(−0.174, −0.058)	(−0.136, 0.023)	(−0.135, 0.019)	(−0.136, −0.010)	(−0.136, −0.010)

Note: Tran j is with j lag; ^* = linear function of time; hipp = hippocampus; para = parahippocampus; sup temp = superior temporal; 95% coverages are bias-adjusted; ^,^* = NS at α=.05; ^** = NS for Bonferroni correction at α=.004

Figure 3.

Method comparison for model-based SE and unadjusted analysis: group differences of functional connectivity between Precuneus and 14 ROIs. The heatmap color indicates the group differences of functional connectivity between Precuneus and each ROI listed on the y-axis. The color legend represents the range of correlation differences where red indicates a positive correlation difference between ApoE4+ and ApoE4−, and blue indicates a negative correlation difference between ApoE4+ and ApoE4−. All correlation differences are statistically significant except where shown on the plot with a * and/or **, where: *=NS at uncorrected, **=NS at corrected. On the color bar is a histogram of the correlation differences across all methods and ROIs.

Figure 4.

Method comparison for model-based SE and adjusted analysis: group differences of functional connectivity between Precuneus and 14 ROIs. The heatmap color indicates the group differences of functional connectivity between Precuneus and each ROI listed on the y-axis. The color legend represents the range of correlation differences where red indicates a positive correlation difference between ApoE4+ and ApoE4−, and blue indicates a negative correlation difference between ApoE4+ and ApoE4−. All correlation differences are statistically significant except where shown on the plot with a * and/or **, where: *=NS at uncorrected, **=NS at corrected. On the color bar is a histogram of the correlation differences across all methods and ROIs.

With the multiple testing correction there are some additional findings. The group connectivity findings for the right parahippocampus are no longer significant for any of the methods. The following eight regional functional connectivity group differences are no longer significant for any of the transition models: dorsal occipital cortex, inferior orbital cortex, medial prefrontal cortex (except lag 3 bootstrap), right hippocampus, right superior temporal gyrus, middle temporal cortex, dorsal AC, and left superior temporal gyrus. A few additional, isolated nonsignificant findings include: hypothalamus for all but the transition 1 models, gyrus rectus for the transition 1 models, pregenual AC for the Pool approach and marginal model, and caudal orbital cortex for the transition 3 model. The multiple testing correction leads to three additional regions no longer being statistically significant for the marginal and Pool approaches. Also, most of the regions are no longer statistically significant for the transition models.

We have shown how the functional connectivity values can differ when accounting for the lag in the data. The transition models may improve the standard error estimate which drives the inferential findings. The transition models do lead to some nonsignificant findings. Based on our simulation studies and the differences found in the example dataset we recommend transition models. According to the simulation studies, the transition models will lead to unbiased and more efficient estimates if the lag and function of time can be estimated correctly.

6 Discussion

Resting-state fMRI studies have been limited in their analytical approaches. Minimal progress has been made in modeling the temporal dependencies of fMRI data when comparing group functional connectivity. The ACS fMRI data have motivated us to study techniques that can handle the temporal characteristics of fMRI data while comparing regional correlations across groups. When we ignore the temporal correlation, the data may appear to be more correlated than they actually are and the estimates will not be consistent, leading to incorrect comparisons. Much work has been done in the task-based area addressing these concerns; however, much less attention has been given to the resting-state area, specifically to determining group correlation differences. Our work demonstrates the application of GEE approaches and a bootstrap approach to resting-state fMRI data. In particular, we evaluated the potential utility of the transition model to analyze functional connectivity data.

With time-series data, previous values will most likely be related to the current values. When removing the influence from the values of the previous timepoints, the correlations and differences may no longer reflect these temporal dependencies. These temporal dependencies are not of primary interest. The magnitude of the differences in functional correlations may be affected when accounting for the lag of resting-state fMRI data. As demonstrated in our simulation studies and real data example, the transition models yield smaller magnitudes of connectivity differences than the other approaches. Our simulation studies have suggested that the transition models may have better statistical properties than the marginal model and the standard pooling Fisher-z approach. Based on the ACS example, we have shown that functional connectivity group differences will vary according to the method selected. The transition model did result in fewer connectivity difference findings. There were a few instances with the transition models in which the bootstrap SEs were slightly smaller than the model-based SEs, leading to a more significant finding than with the model-based SEs. However, it should be mentioned that this occurred at the α = .05 threshold, where the p-values for the model-based approach were very close to .05. We are not sure why this occurred. The majority of the time, the model-based and bootstrap standard errors were very similar.
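
To make the comparison of bootstrap and model-based SEs more concrete, a subject-level nonparametric bootstrap of a group Fisher-z difference could be sketched as follows. This is an illustration only: it assumes each subject contributes a single Fisher-z value and that subjects are resampled with replacement within each group, which may differ from the procedure described in Section 3; the function names and the default number of resamples are hypothetical.

```python
import math
import random

def mean_z_difference(z0, z1):
    """Difference in group-average Fisher-z values (group 1 minus group 0)."""
    return sum(z1) / len(z1) - sum(z0) / len(z0)

def bootstrap_se(z0, z1, n_boot=1000, seed=0):
    """Bootstrap SE of the group difference, resampling subjects within group."""
    rng = random.Random(seed)
    reps = []
    for _ in range(n_boot):
        s0 = [rng.choice(z0) for _ in z0]   # resample group 0 subjects
        s1 = [rng.choice(z1) for _ in z1]   # resample group 1 subjects
        reps.append(mean_z_difference(s0, s1))
    center = sum(reps) / n_boot
    # Sample standard deviation of the bootstrap replicates.
    return math.sqrt(sum((r - center) ** 2 for r in reps) / (n_boot - 1))
```

For a GEE model the same resampling scheme would apply, but each resample would require refitting the model to the selected subjects' full time series, which is considerably more expensive.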

There are some limitations with our approach. The function of time needs to be further evaluated. The simulation studies have demonstrated that when the function of time is very incorrect the standard pooling Fisher-z approach and marginal model both perform poorly, whereas the transition models are an improvement. Furthermore, the example found the results to be similar between the standard pooling Fisher-z approach and marginal approach and also similar between the transition lag 1 model with the cosine function of time and linear function of time. Therefore, additional investigation is required to determine how robust each method is to model misspecification. Another issue is with the estimation of frequency of oscillation. When estimating the cosine functions we took an average of the subjects’ dominant frequency of oscillations. This may not accurately reflect the data; therefore, we are in the early stages of pursuing wavelets and mixtures of cosines to determine if these are a better fit to the data. We are also currently evaluating the block bootstrap which is appropriate for time-series data. We have had some success with the block bootstrap; however, we need to understand the dependency structure of our fMRI data to use this method appropriately. Additional simulation studies and methods are being designed to further evaluate properties of the GEE methods and bootstrap methods for resting-state fMRI data.
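
For readers unfamiliar with the block bootstrap mentioned above, a generic moving-block version can be sketched as below. This is a textbook-style illustration, not the variant the authors are evaluating; the block length is a tuning choice that depends on the dependence structure of the series, and the function names are hypothetical.

```python
import random

def moving_block_indices(n, block_len, rng):
    """Indices for one moving-block bootstrap resample of a length-n series."""
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and n")
    starts = range(n - block_len + 1)       # every admissible block start
    indices = []
    while len(indices) < n:
        start = rng.choice(starts)
        indices.extend(range(start, start + block_len))
    return indices[:n]                       # trim the final block if needed

def block_resample(series, block_len, rng=None):
    """Resample a time series in contiguous blocks to preserve local dependence."""
    if rng is None:
        rng = random.Random()
    return [series[i] for i in moving_block_indices(len(series), block_len, rng)]
```

Resampling whole blocks rather than single time points keeps short-range temporal dependence intact within each block, which is why the choice of block length hinges on understanding the dependency structure of the data.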

GEE methods hold promise in the functional connectivity area as they are ideally suited for modeling time-series data. Furthermore, GEE methods are flexible by having the ability to adjust for covariates. Our results suggest that the transition model and bootstrapping may contribute to our understanding of regional correlations. Code may be requested from the corresponding author.

Highlights.

We used GEE to compare groupwise functional connectivity.
We estimated the model-based standard errors (SE) and bootstrap SE.
We compared GEE models to the standard pooling Fisher-z approach.
GEE transition models have the best statistical properties.

Acknowledgments

Contract/grant sponsor: NIH grants K25AG035062, P01AG026276, P50AG05681, P01AG03991, K24MHO79510

We acknowledge the Washington University Knight ADRC and Dr. Yvette Sheline’s lab for providing data.

References

Beckmann CF, DeLuca M, Devlin JT, Smith SM. Investigations into resting-state connectivity using independent component analysis. Philos Trans R Soc Lond B Biol Sci. 2005;360:1001–1013. doi: 10.1098/rstb.2005.1634.
Bullmore ET, Rabe-Hesketh S, Morris RG, Williams SCR, Gregory L, Gray JA, Brammer MJ. Functional magnetic resonance image analysis of a large-scale neurocognitive network. Neuroimage. 1996;4:16–33. doi: 10.1006/nimg.1996.0026.
Calhoun VD, Adali T, Pearlson GD, Pekar JJ. A method for making group inferences from functional MRI data using independent component analysis. Hum Brain Mapp. 2001;14:140–151. doi: 10.1002/hbm.1048.
D’Angelo GM, Lazar NA, Eddy WF, Morris JC, Sheline YI. A generalized estimating equations approach for resting-state functional MRI group analysis. Conf Proc IEEE Eng Med Biol Soc. 2011;2011:5064–7. doi: 10.1109/IEMBS.2011.6091254.
Davison AC, Hinkley DV. Bootstrap Methods and Their Application. Cambridge University Press; NY: 1997.
Diggle PJ, Liang KY, Zeger SL. Analysis of Longitudinal Data. Oxford; NY: 1994.
Efron B, Tibshirani RJ. An Introduction to the Bootstrap. Chapman & Hall/CRC; Boca Raton, FL: 1993.
Fox MD, Zhang D, Snyder AZ, Raichle ME. The global signal and observed anticorrelated resting state brain networks. J Neurophysiol. 2009;101:3270–3283. doi: 10.1152/jn.90777.2008.
Friston K, Buchel C. Functional connectivity. In: Frackowiak RSJ, Friston KJ, Frith CD, Dolan RJ, Price CJ, Zeki S, Ashburner J, Penny W, editors. Human Brain Function. 2. Elsevier; London: 2004. pp. 999–1018.
Friston KJ, Stephan KE, Lund TE, Morcom A, Kiebal S. Mixed effects and fMRI studies. Neuroimage. 2005;24:244–252. doi: 10.1016/j.neuroimage.2004.08.055.
Højsgaard S, Halekoh U, Yan J. The R package geepack for generalized estimating equations. Journal of Statistical Software. 2005;15:1–11.
Huettel SA, Song AW, McCarthy G. Functional Magnetic Resonance Imaging. 2. Sinauer Associates; MA: 2008.
Lazar NA. The Statistical Analysis of Functional MRI Data. Springer; NY: 2008.
Liang KY, Zeger SL. Longitudinal data analysis using generalized linear models. Biometrika. 1986;73:13–22.
McCullagh P, Nelder JA. Generalized Linear Models. 2. Chapman and Hall/CRC; Boca Raton, FL: 1989.
Morris JC. The clinical dementia rating (CDR): Current version and scoring rules. Neurology. 1993;43:2412–2414. doi: 10.1212/wnl.43.11.2412-a.
Sheline YI, Morris JC, Snyder AZ, Price JL, Yan Z, D’Angelo G, Liu C, Dixit S, Benzinger T, Fagan A, Goate A, Mintun MA. APOE4 allele disrupts resting state fMRI connectivity in the absence of amyloid plaques or decreased CSF Aβ42. J Neurosci. 2010a;30:17035–17040. doi: 10.1523/JNEUROSCI.3987-10.2010.
Sheline YI, Raichle ME, Snyder AZ, Morris JC, Head D, Wang S, Mintun MA. Amyloid plaques disrupt resting state default mode network connectivity in cognitively normal elderly. Biol Psychiatry. 2010b;67:584–587. doi: 10.1016/j.biopsych.2009.08.024.
Shumway RH, Stoffer DS. Time Series Analysis and Its Applications: With R Examples. 3. Springer; NY: 2010.
Talairach J, Tournoux P. Co-Planar Stereotaxic Atlas of the Human Brain: 3-D Proportional System: an Approach to Cerebral Imaging. Thieme; NY: 1988.
Wedderburn RWM. Quasi-likelihood functions, generalized linear models, and the Gauss-Newton method. Biometrika. 1974;61:439–447.
Worsley KJ, Friston KJ. Analysis of fMRI time-series revisited again. Neuroimage. 1995;2:173–181. doi: 10.1006/nimg.1995.1023.
Zeger SL, Liang KY, Albert PS. Models for longitudinal data: a generalized estimating equation approach. Biometrics. 1988;44:1049–1060.
