Abstract
An Alzheimer’s fMRI study has motivated us to evaluate differences in inter-regional correlations between groups. The overall objective is to assess inter-regional correlations in the resting state, with no stimulus or task. We propose using a generalized estimating equation (GEE) transition model and a GEE marginal model to model the within-subject correlation for each region. Residuals calculated from the GEE models are used to correlate brain regions and assess between-group differences. A typical approach for comparing group correlations is the standard pooling approach, which averages Fisher-z transformed correlations within each group and assumes temporal independence. The GEE approaches and the standard Fisher-z pooling approach are demonstrated with an Alzheimer’s disease (AD) connectivity study in a population of AD subjects and healthy control subjects. We also compare these methods using simulation studies and show that the transition model may have better statistical properties.
I. Introduction
Functional magnetic resonance imaging (fMRI) is a noninvasive neuroimaging technique for the study of brain function. fMRI data are based on the ratio of oxygenated to deoxygenated blood, referred to as the blood oxygen level dependent (BOLD) contrast effect [1], and are represented as a series of percent signal changes. Although images are taken every few seconds, the BOLD response is slow: after a short delay, the response increases gradually to a peak and then decreases back to baseline [2]. The duration of the BOLD response is much longer than the acquisition time of the fMRI scans, causing the scans to be correlated. Therefore, as others have suggested, we treat the fMRI scans as statistical time-series data [1, 3–7]. A recent area of interest that has emerged from fMRI studies is resting-state connectivity. Here, the overall goal is to determine inter-regional correlations during a resting state with no stimulus or task.
It is common to compare connectivity between groups by calculating the Pearson correlation between the regions for each subject. The Pearson correlation coefficients are transformed to Fisher-z values, a group average of the Fisher-z values is taken, and lastly an unpaired t-test is performed on the group averages. We refer to this approach as the standard pooling Fisher-z method. A drawback of this approach is that it does not address the temporal correlation in the time-series, and the correlations between regions can be inflated as a result. Hence, we suggest a two-stage approach that uses generalized estimating equations (GEE) to handle the temporal dependence through a modeling strategy and then uses the GEE residuals to calculate correlations.
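The standard pooling Fisher-z method described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the code used in the study: the function names, the toy data generator, and the sign convention (group 1 minus group 0) are assumptions made for the example, and a pooled-variance form of the unpaired t-test is assumed.

```python
import math
import random
from statistics import mean, variance

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_z(r):
    """Fisher-z transform, 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)

def pooled_fisher_z_test(group0, group1):
    """Each group is a list of (seed_series, target_series), one pair per subject.
    Returns (difference of group-average Fisher-z, SE, t statistic, df)."""
    z0 = [fisher_z(pearson(s, t)) for s, t in group0]
    z1 = [fisher_z(pearson(s, t)) for s, t in group1]
    n0, n1 = len(z0), len(z1)
    df = n0 + n1 - 2
    # Pooled-variance (equal variance) unpaired t-test: an assumption of this sketch.
    pooled_var = ((n0 - 1) * variance(z0) + (n1 - 1) * variance(z1)) / df
    se = math.sqrt(pooled_var * (1 / n0 + 1 / n1))
    diff = mean(z1) - mean(z0)
    return diff, se, diff / se, df

# Hypothetical toy data: temporally independent series with a known correlation.
random.seed(0)

def toy_subject(rho, n=160):
    seed = [random.gauss(0, 1) for _ in range(n)]
    target = [rho * s + math.sqrt(1 - rho ** 2) * random.gauss(0, 1) for s in seed]
    return seed, target

controls = [toy_subject(0.1) for _ in range(20)]
patients = [toy_subject(0.4) for _ in range(14)]
diff, se, t_stat, df = pooled_fisher_z_test(controls, patients)
print(f"diff={diff:.3f}  SE={se:.3f}  t={t_stat:.2f}  df={df}")
```

A p-value would be obtained by referring the t statistic to a t distribution with the returned degrees of freedom; that step is omitted because the standard library has no t-distribution function. Note that the toy series are temporally independent, which is exactly the assumption the pooling method makes and resting-state fMRI data violate.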
Some progress has been made at modeling the temporal dependence of fMRI data [4–7]. However, not much progress has been made on modeling temporal dependence when determining group differences of resting-state connectivity. In addition, much of the general time-series literature has focused on a single time-series analysis. This is a limitation when the focus is on the analysis of multiple subjects. We are interested in modeling n individuals’ time-series to assess between-group comparisons of regional associations.
For this manuscript, we propose a two-stage approach with GEE to estimate brain regional associations and assess between group differences while accounting for the temporal dependence in the individual time-series [8]. We evaluate both GEE marginal and GEE transition models. Marginal models estimate the regression and correlation structure separately. Transition models estimate the expectation of the current value conditional on the previous values. We investigate the properties of these GEE models and compare them to the standard pooling Fisher-z method, using an Alzheimer’s disease (AD) functional connectivity study and simulation studies.
II. Data and functional connectivity
Our study cohort consists of 20 healthy control subjects and 14 AD subjects recruited from the Washington University Alzheimer’s Disease Research Center (ADRC). The 14 AD subjects had very mild dementia of the Alzheimer’s type (DAT), with clinical dementia rating (CDR) scores of 0.5–1, and were compared with 20 age-matched normal controls with no brain amyloid deposition (PIB−). All subjects were part of ongoing longitudinal studies at the Washington University ADRC. Summary statistics for age, MMSE, gender, and education are reported in Table 1. fMRI BOLD datasets were collected while subjects fixated on a cross-hair. The fMRI data were transformed to a common atlas space, corrected for head motion, and blurred with a 6 mm full-width half-maximum Gaussian filter. Preprocessing of the BOLD volumetric time-series also included temporal filtering. Furthermore, noise was removed by regression of several nuisance variables, including the signal averaged over the whole brain and head motion parameters [9]. Two seed regions were selected that are among the regions affected early in the course of AD: dorsolateral prefrontal cortex (DLPFC) (−36, 27, 29) and precuneus (−2, −66, 39). These regions are characterized by high levels of atrophy, hypometabolism, and amyloid deposition. The seed regions each have a diameter of 12 mm (coordinates in Talairach space) [10]. The data were initially analyzed with the standard Fisher-z approach using the seed regions and selected regions of interest (ROIs). By this approach, three ROIs were identified as having correlation differences between AD and controls with DLPFC: lateral parietal cortex (−44, −64, 43), left hippocampus (25, −34, −21), and right hippocampus (−21, −31, −17).
The ROIs that had correlation differences between AD and controls with the precuneus seed region were right pregenual anterior cingulate (AC) (6, 41, 20), right anterior prefrontal cortex (25, 58, 12), medial frontal cortex (BA 8) (4, 24, 48), and right visual cortex (17, −71, −10). In our analysis we compare the GEE approaches to the standard pooling approach using the same seeds and ROIs.
Table 1.
Characteristic | Total (n=34) | Healthy (n=20) | AD (n=14) | p-value |
---|---|---|---|---|
Male* | 14 (41.2) | 8 (40.0) | 6 (42.9) | 1.000 |
Age** | 73.9 (6.8) | 72.7 (6.0) | 75.6 (7.6) | 0.255 |
MMSE** | 27.1 (3.6) | 29.2 (0.9) | 24.2 (4.0) | <0.001 |
Educ** | 14.4 (2.9) | 14.6 (2.9) | 14.3 (3.1) | 0.712 |
* n (%)
** mean (SD); Educ = education
III. Notation and Methodology
We denote the regions to be Zqi=(Zqi1,.., ZqiJ) where q=1,.., P denotes the seed region and its (P-1) ROIs, i=1,.., n denotes the ith subject, and j=1,.., J denotes the jth fMRI measurement. Our objective is to test for a group difference in the relationship between multiple ROIs Zq and the seed region Z1, where q>1. This will be determined via (P-1) pairwise correlations between the (q-1)th target region Zq and seed region Z1, where q>1.
GEE [8,11,12] methods have been developed to handle correlated data using an estimating equation that does not result from a likelihood-based derivation as general linear models (GLM) do. A quasi-likelihood [13] specifies only the mean and variance. The quasi-likelihood uses the following estimating equation to estimate the parameters [8, 11]:
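$$\sum_{i=1}^{n} \left( \frac{\partial \mu_{qi}}{\partial \beta} \right)' V_{qi}^{-1} \left( Z_{qi} - \mu_{qi} \right) = 0 \qquad (1)$$
where q=1,.., P, μqi = E(Zqi) = β′Wi, Wi are covariates, $V_{qi} = A_{qi}^{1/2} R_{qi} A_{qi}^{1/2}$, Rqi=Corr(Zqi), and Aqi=diag(uqi1,.., uqiJ). We use the geepack package [14] in R to estimate the GEE parameters.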
As previously mentioned we propose using two approaches, the GEE marginal model and the GEE transition model. For a GEE marginal model [8], the marginal expectation of Zq is characterized as a function of explanatory factors where E(Zqij|Wij)=β′Wij. The regression of the outcome on the covariates and the dependence structure are modeled separately [8]. Marginal models are designed to estimate population average parameters where the objective is to compare groups of subjects among a population. In a transition model [8] the current value of the outcome is influenced by its previous values where E(Zqij| Wij, Zqij-1,.., Zqij-K)=β′ (Wij, Zqij-1,.., Zqij-K). The dependence on the past K values will be referred to as lag K. If the specification of the conditional mean of Zq is correct, then the repeated transitions can be treated as independent data and standard statistical methods such as GLM can be used. If the specification is incorrect, then other measures need to be utilized such as using the GEE [8], by specifying an additional correlation structure that may not be captured by the specified lag in the transition model. The idea is that the influence of the past values can be removed when adjusted for in the transition model. Additional details for these GEE methods are provided in Diggle et al. [8].
In this work we use both models for each region. With the marginal model we consider both exchangeable and autoregressive lag 1, AR(1), working correlations. For the transition model we consider both independence and exchangeable working correlations along with lags 1–5. In all models we also adjust for time and group. The within-subject correlation is thereby accounted for by the GEE. The residuals from these P GEE models, one per region, are then used in the correlation analysis. The Pearson correlation between the residuals of the seed region and those of its target region is calculated for each subject. The correlation for each subject is then transformed to a Fisher-z value, and a group average of the Fisher-z values is calculated. Lastly, we perform an unpaired t-test on the group averages to compare connectivity between AD and healthy controls. We employ this two-stage procedure for all regions.
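The two-stage procedure can be illustrated with a simplified sketch. It assumes a lag-1 transition model with an identity link and an independence working correlation, in which case the GEE point estimates coincide with pooled ordinary least squares; it therefore does not reproduce geepack or the exchangeable and AR(1) structures used in the paper. The function names and the toy AR(1) data are hypothetical.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def transition_residuals(series_by_subject, groups, lag=1):
    """Stage 1: pooled least-squares fit of Z_ij on (1, time, group, Z_i,j-1..Z_i,j-lag).
    Returns (coefficients, one residual list per subject); the first `lag` scans are dropped."""
    rows, ys, owner = [], [], []
    for i, (z, g) in enumerate(zip(series_by_subject, groups)):
        for j in range(lag, len(z)):
            rows.append([1.0, float(j), float(g)] + [z[j - k] for k in range(1, lag + 1)])
            ys.append(z[j])
            owner.append(i)
    p = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * y for r, y in zip(rows, ys)) for a in range(p)]
    beta = solve(xtx, xty)
    resid = [[] for _ in series_by_subject]
    for r, y, i in zip(rows, ys, owner):
        resid[i].append(y - sum(bk * xk for bk, xk in zip(beta, r)))
    return beta, resid

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def fisher_z_of_residuals(seed_resid, target_resid):
    """Stage 2: per-subject Fisher-z of the seed/target residual correlation."""
    return [math.atanh(pearson(s, t)) for s, t in zip(seed_resid, target_resid)]

# Hypothetical toy data: two AR(1) series per subject with correlated innovations.
random.seed(1)

def toy_pair(rho, phi=0.6, n=160):
    seed, target = [0.0], [0.0]
    for _ in range(n - 1):
        e1 = random.gauss(0, 1)
        e2 = rho * e1 + math.sqrt(1 - rho ** 2) * random.gauss(0, 1)
        seed.append(phi * seed[-1] + e1)
        target.append(phi * target[-1] + e2)
    return seed, target

groups = [0] * 6 + [1] * 6
pairs = [toy_pair(0.2 if g == 0 else 0.5) for g in groups]
beta_seed, seed_resid = transition_residuals([p[0] for p in pairs], groups, lag=1)
beta_tgt, target_resid = transition_residuals([p[1] for p in pairs], groups, lag=1)
z = fisher_z_of_residuals(seed_resid, target_resid)
print("estimated lag-1 coefficient (seed region):", round(beta_seed[3], 3))
```

The per-subject values in `z` would then be averaged within each group and compared with an unpaired t-test, exactly as in the standard pooling method; the only change is that the correlations are computed on model residuals rather than on the raw series.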
IV. Data Analysis Example
The AD dataset of 20 controls and 14 AD subjects is used for a data analysis example. Each region has 164 BOLD measures per scanning session, with two scanning sessions. We exclude the first 4 frames to remove the effect of the magnet initialization. We compare a total of 13 approaches: GEE marginal models with exchangeable or AR(1) correlation, GEE transition models with lags 1–5 and independence or exchangeable correlation, and the standard pooling Fisher-z method. In all analyses the GEE marginal results are very similar for the AR(1) and exchangeable correlations; therefore, we present only the results for the AR(1). Likewise, in all analyses the GEE transition model results are very similar across both correlation structures; therefore, we present only the results for the independence correlation. Results of the transition model are reported only for lags ≤ 3 because the results for lags > 3 are similar: the magnitudes of the connectivity differences are maintained and the significance findings stay the same. We do not correct for multiple testing, and a p-value < 0.05 is considered significant. The analyses are based only on the first scanning session; therefore, the results may differ from those reported in Section II.
Table 2 reports functional connectivity difference results between DLPFC and its target regions: lateral parietal cortex, left hippocampus, and right hippocampus (not shown). For all regions, the GEE marginal model yields results similar to those of the standard pooling approach. We suspect this is because the lag is not accounted for in the marginal model. The results for the transition model differ from those of the GEE marginal model and the standard pooling approach. In general, for the transition model the connectivity differences are smaller and the variances increase, except for the right hippocampus, where the differences become larger and the standard errors decrease. In addition, the findings for the group connectivity differences for lateral parietal cortex and right hippocampus differ between the transition model and the other approaches. For lateral parietal cortex, the standard pooling and GEE marginal approaches find a significant group difference in connectivity with DLPFC, whereas for the transition models the connectivity difference decreases and the standard error increases as the lag increases, resulting in nonsignificant differences. For the right hippocampus, the standard pooling and GEE marginal approaches find no statistically significant group difference in the correlation with DLPFC, whereas with the transition model the group difference becomes larger as the lag increases and is significant for all lags. The results for the left hippocampus and DLPFC are the same for all approaches, except that the connectivity differences are smaller for the transition models.
Table 2.
Region / Method | Conn diff* (SE) | p-value |
---|---|---|
Lateral parietal | | |
Stand Pool Fisher-Z | −0.258 (0.105) | 0.020 |
GEE marginal AR(1) | −0.258 (0.105) | 0.020 |
GEE transition ind lag 1 | −0.222 (0.098) | 0.031 |
GEE transition ind lag 2 | −0.175 (0.114) | 0.135 |
GEE transition ind lag 3 | −0.179 (0.114) | 0.125 |
Left hippocampus | | |
Stand Pool Fisher-Z | −0.270 (0.063) | <0.001 |
GEE marginal AR(1) | −0.270 (0.063) | <0.001 |
GEE transition ind lag 1 | −0.181 (0.064) | 0.008 |
GEE transition ind lag 2 | −0.192 (0.071) | 0.011 |
GEE transition ind lag 3 | −0.190 (0.071) | 0.012 |
* Conn diff: difference between group-average Fisher-z values
Next, we discuss results for precuneus as the seed region and right pregenual AC, right anterior prefrontal cortex (not shown), medial frontal cortex (not shown), and right visual cortex as the target ROIs (Table 3). The results across all methods are quite similar for right anterior prefrontal cortex and medial frontal cortex. There is a suggested trend toward larger connectivity between precuneus and right anterior prefrontal cortex in the controls than in the AD group. There is no statistically significant difference in connectivity between precuneus and medial frontal cortex. The transition models for precuneus and right pregenual AC indicate that connectivity is slightly larger (although not significantly) for the AD group than for the control group, the opposite of the direction found with the other approaches. For right visual cortex, the connectivity differences are larger with the standard pooling and GEE marginal models, yielding significant results, whereas with the transition model the connectivity difference between visual cortex and precuneus is smaller and not significant.
Table 3.
Region / Method | Conn diff* (SE) | p-value |
---|---|---|
R pregenual AC | | |
Stand Pool Fisher-Z | −0.036 (0.090) | 0.693 |
GEE marginal AR(1) | −0.035 (0.090) | 0.697 |
GEE transition ind lag 1 | 0.014 (0.082) | 0.863 |
GEE transition ind lag 2 | 0.031 (0.092) | 0.736 |
GEE transition ind lag 3 | 0.025 (0.094) | 0.793 |
R visual cortex | | |
Stand Pool Fisher-Z | 0.166 (0.066) | 0.018 |
GEE marginal AR(1) | 0.165 (0.066) | 0.019 |
GEE transition ind lag 1 | 0.091 (0.055) | 0.105 |
GEE transition ind lag 2 | 0.075 (0.062) | 0.234 |
GEE transition ind lag 3 | 0.070 (0.063) | 0.271 |
* Conn diff: difference between group-average Fisher-z values
Previous values are most likely related to the current value, with the strongest relationship for the values closest in time. When this influence is removed, the correlations and group differences may no longer reflect these temporal dependencies, which are not of primary interest. The magnitude of the differences in functional correlations may be affected when accounting for the lag of resting-state fMRI data. In the majority of comparisons the difference is either the same or smaller with the transition approach than with the other approaches, and in one instance the difference is larger. In about half of the comparisons the inference from the transition model differs from that of the GEE marginal model and the standard pooling approach. These findings suggest that the lag should be considered in functional connectivity analysis. Also, the GEE model seems robust to the choice of working correlation structure.
V. Simulation Studies
We perform a number of small simulation studies to evaluate some statistical properties of the various approaches. We compare six approaches: the standard Fisher-z, the GEE marginal model with AR(1) correlation, GEE transition models with lags 1–3 and the correct function of time with an exchangeable correlation, and a GEE transition model with lag 1 and an incorrect function of time with an exchangeable correlation. The bias and root mean squared error (√MSE) of the difference between the group average Fisher-z estimates, (τ1−τ0), are calculated, where τg=.5ln[(1+ρg)/(1−ρg)] and g=(0,1).
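The Fisher-z target and the two performance measures can be written out explicitly. The sketch below uses illustrative correlations of ρ0 = 0.05 and ρ1 = 0.10 for the true difference; the list of replicate estimates is hypothetical and stands in for the output of any of the six approaches.

```python
import math

def fisher_z(rho):
    """tau = 0.5 * ln((1 + rho) / (1 - rho))."""
    return 0.5 * math.log((1 + rho) / (1 - rho))

def bias_and_root_mse(estimates, truth):
    """Bias and root mean squared error of replicate estimates of a known quantity."""
    n = len(estimates)
    bias = sum(estimates) / n - truth
    mse = sum((e - truth) ** 2 for e in estimates) / n
    return bias, math.sqrt(mse)

# True difference tau_1 - tau_0 for the illustrative correlations.
true_diff = fisher_z(0.10) - fisher_z(0.05)

# Hypothetical estimates of (tau_1 - tau_0) from five simulated replicates.
estimates = [0.04, 0.07, 0.05, 0.06, 0.03]
bias, rmse = bias_and_root_mse(estimates, true_diff)
print(f"true diff={true_diff:.4f}  bias={bias:.4f}  root MSE={rmse:.4f}")
```

Because MSE equals the variance of the estimates plus the squared bias, the root MSE can never be smaller than the absolute bias, which is a quick consistency check on any table of simulation results.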
Our data generation uses j=1,..,200 time-points per subject, two groups denoted g=(0,1), lag 1, and a total of n=60 subjects with equal numbers in each group. We generate two variables (Z1, Z2) as (Z1, Z2)~BVN(μ, Σg) where μ=(μ1, μ2),
[Equation (2)]
Σg is the covariance matrix for the corresponding group with variances being 1 and the correlations denoted by ρg. In the simulation studies we let the lag parameters have the same value of (b1=−.85, b2=−.8) for group 0, (b1=−.75, b2=−.7) for group 1, the amplitude be A=5, the frequency of oscillations be ω=1/20, the phase shift be ϕ=.6π, and generate 200 replications for each study. For these simulations we assume the frequency is known and do not estimate it during the estimation process of GEE. We consider the correlation values: (ρ0=.05, ρ1=.1) and (τ1−τ0)=.05; (ρ0=.1, ρ1=.2) and (τ1−τ0)=.1; and (ρ0=.05, ρ1=.35) and (τ1−τ0)=.32. The following are the conditional expectations specified for each model. The GEE marginal model is
[Equation (3)]
The transition models with the correct function of time are:
[Equation (4)]
where K specifies the lag. The transition model with the incorrect function of time and lag of 1 is:
[Equation (5)]
Results are reported in Table 4. The transition model with lag 3 performs the same as the models with lags 1–2; therefore, we present only the results for lags 1–2. The pooled approach is the most biased and has the largest root MSE. This is due to not accounting for the lag and possibly to not modeling time. The transition models with the correct function of time all have small bias and small root MSE. The marginal model performs almost as well as the transition models when time is modeled correctly. The transition model with lag 1 and an incorrect function of time has more bias and a larger root MSE than the other GEE approaches but still has smaller bias and root MSE than the standard approach. As the connectivity difference increases, the bias and root MSE of the pooled approach and of the transition model with an incorrect function of time become smaller. These results suggest that the transition model may contribute to our understanding of regional correlations. However, we need to investigate our simulation specifications further to provide additional understanding of the GEE approaches.
Table 4.
Measure | Pool | GEE marginal AR(1) | Transition K=1 | Transition K=2 | Transition K=1* |
---|---|---|---|---|---|
τ1 − τ0 = 0.05 | | | | | |
Bias | 0.166 | −0.003 | 0 | 0 | 0.121 |
√MSE | 0.168 | 0.036 | 0.018 | 0.018 | 0.125 |
τ1 − τ0 = 0.10 | | | | | |
Bias | 0.148 | −0.003 | 0 | 0 | 0.106 |
√MSE | 0.15 | 0.036 | 0.018 | 0.018 | 0.11 |
τ1 − τ0 = 0.32 | | | | | |
Bias | 0.085 | −0.004 | −0.001 | −0.001 | 0.062 |
√MSE | 0.089 | 0.036 | 0.018 | 0.018 | 0.069 |
K = lag
* incorrect function of time
VI. Conclusion
This work demonstrated the application of GEE approaches to resting-state fMRI data. In particular, we evaluated the potential utility of the transition model for analyzing functional connectivity data. Based on the AD example, we have shown that functional connectivity group differences vary according to the method selected. The transition model yielded fewer significant connectivity differences. The simulation studies also suggested that the transition model may have better statistical properties. Advantages of the GEE are that only the first two moments need to be specified and that the distribution of the dependent variable does not need to be specified. A disadvantage of the GEE is that a working correlation has to be specified. Additional simulation studies are being designed to further evaluate the properties of the GEE methods for resting-state fMRI data. GEE methods hold promise in the functional connectivity area because they are well suited for modeling time-series data and are flexible in their ability to adjust for covariates.
Acknowledgments
This work was supported in part by NIH grants K25AG035062, P50AG05681, P01AG03991, and K24MH079510.
We acknowledge the Washington University ADRC and Dr. Yvette Sheline’s lab for providing data.
Contributor Information
Gina M. D’Angelo, Email: gina@wubios.wustl.edu, Division of Biostatistics, Washington University School of Medicine, St. Louis, MO 63130 USA (314-362-3758; fax: 314-362-2693)
Nicole A. Lazar, Email: nlazar@stat.uga.edu, Department of Statistics, The University of Georgia, Athens, GA 30602 USA
William F. Eddy, Email: bill@stat.cmu.edu, Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213 USA
John C. Morris, Email: morrisj@abraxas.wustl.edu, Department of Neurology, Washington University School of Medicine, St. Louis, MO 63130 USA
Yvette I. Sheline, Email: yvette@npg.wustl.edu, Department of Psychiatry, Washington University School of Medicine, St. Louis, MO 63130 USA
References
- 1. Lazar NA. The Statistical Analysis of Functional MRI Data. NY: Springer; 2008.
- 2. Henson R. Analysis of fMRI Time Series. In: Frackowiak RSJ, Friston KJ, Frith CD, Dolan RJ, Price CJ, Zeki S, Ashburner J, Penny W, editors. Human Brain Function. 2. London: Elsevier; 2004. pp. 793–822.
- 3. Friston K, Buchel C. Functional Connectivity. In: Frackowiak RSJ, Friston KJ, Frith CD, Dolan RJ, Price CJ, Zeki S, Ashburner J, Penny W, editors. Human Brain Function. 2. London: Elsevier; 2004. pp. 999–1018.
- 4. Bullmore ET, Rabe-Hesketh S, Morris RG, Williams SCR, Gregory L, Gray JA, Brammer MJ. Functional magnetic resonance image analysis of a large-scale neurocognitive network. NeuroImage. 1996;4(1):16–33. doi: 10.1006/nimg.1996.0026.
- 5. Hampson M, Peterson BS, Skudlarski P, Gatenby JC, Gore JC. Detection of functional connectivity using temporal correlations in MR Images. Human Brain Mapping. 2002;15(4):247–262. doi: 10.1002/hbm.10022.
- 6. Lahaye PJ, Poline JB, Flandin G, Dodel S, Garnero L. Functional connectivity: studying nonlinear, delayed interactions between BOLD signals. NeuroImage. 2003;20(2):962–974. doi: 10.1016/S1053-8119(03)00340-9.
- 7. Sun FT, Miller LM, D’Esposito M. Measuring interregional functional connectivity using coherence and partial coherence analyses of fMRI data. NeuroImage. 2004;21(2):647–658. doi: 10.1016/j.neuroimage.2003.09.056.
- 8. Diggle PJ, Liang KY, Zeger SL. Analysis of Longitudinal Data. NY: Oxford; 1994.
- 9. Fox MD, Zhang D, Snyder AZ, Raichle ME. The global signal and observed anticorrelated resting state brain networks. Journal of Neurophysiology. 2009;101(6):3270–3283. doi: 10.1152/jn.90777.2008.
- 10. Talairach J, Tournoux P. Co-Planar Stereotaxic Atlas of the Human Brain: 3-D Proportional System: an Approach to Cerebral Imaging. NY: Thieme; 1988.
- 11. Zeger SL, Liang KY. Longitudinal data analysis for discrete and continuous outcomes. Biometrics. 1986;42(1):121–130.
- 12. Hardin JW, Hilbe JM. Generalized Estimating Equations. Boca Raton, FL: Chapman & Hall/CRC; 2003.
- 13. Wedderburn RWM. Quasi-likelihood functions, generalized linear models, and the Gauss-Newton method. Biometrika. 1974;61(3):439–447.
- 14. Højsgaard S, Halekoh U, Yan J. The R package geepack for generalized estimating equations. Journal of Statistical Software. 2005;15(2):1–11.