eLife. 2023 Jan 6;12:e81869. doi: 10.7554/eLife.81869

Multimodal brain age estimates relate to Alzheimer disease biomarkers and cognition in early stages: a cross-sectional observational study

Peter R Millar 1, Brian A Gordon 2, Patrick H Luckett 3, Tammie LS Benzinger 2,3, Carlos Cruchaga 4, Anne M Fagan 1, Jason J Hassenstab 1, Richard J Perrin 1,5, Suzanne E Schindler 1, Ricardo F Allegri 6, Gregory S Day 7, Martin R Farlow 8, Hiroshi Mori 9, Georg Nübling 10,11; The Dominantly Inherited Alzheimer Network, Randall J Bateman 1, John C Morris 1, Beau M Ances 1,2
Editors: Karla L Miller 12, Jeannie Chin 13
PMCID: PMC9988262  PMID: 36607335

Abstract

Background:

Estimates of ‘brain-predicted age’ quantify apparent brain age compared to normative trajectories of neuroimaging features. The brain age gap (BAG) between predicted and chronological age is elevated in symptomatic Alzheimer disease (AD) but has not been well explored in presymptomatic AD. Prior studies have typically modeled BAG with structural MRI, but more recently other modalities, including functional connectivity (FC) and multimodal MRI, have been explored.

Methods:

We trained three models to predict age from FC, structural (S), or multimodal MRI (S+FC) in 390 amyloid-negative cognitively normal (CN/A−) participants (18–89 years old). In independent samples of 144 CN/A−, 154 CN/A+, and 154 cognitively impaired (CI; CDR > 0) participants, we tested relationships between BAG and AD biomarkers of amyloid and tau, as well as a global cognitive composite.

Results:

All models predicted age in the control training set, with the multimodal model outperforming the unimodal models. All three BAG estimates were significantly elevated in CI compared to controls. FC-BAG was significantly reduced in CN/A+ participants compared to CN/A−. In CI participants only, elevated S-BAG and S+FC BAG were associated with more advanced AD pathology and lower cognitive performance.

Conclusions:

Both FC-BAG and S-BAG are elevated in CI participants. However, FC and structural MRI also capture complementary signals. Specifically, FC-BAG may capture a unique biphasic response to presymptomatic AD pathology, while S-BAG may capture pathological progression and cognitive decline in the symptomatic stage. A multimodal age-prediction model improves sensitivity to healthy age differences.

Funding:

This work was supported by the National Institutes of Health (P01-AG026276, P01-AG03991, P30-AG066444, 5-R01-AG052550, 5-R01-AG057680, 1-R01-AG067505, 1S10RR022984-01A1, and U19-AG032438), the BrightFocus Foundation (A2022014F), and the Alzheimer’s Association (SG-20-690363-DIAN).

Research organism: Human

eLife digest

The brains of people with advanced Alzheimer’s disease often look older than expected based on the patients’ actual age. This ‘brain age gap’ (how old a brain appears compared to the person’s chronological age) can be calculated thanks to machine learning algorithms which analyse images of the organ to detect changes related to aging. Traditionally, these models have relied on images of the brain structure, such as the size and thickness of various brain areas; more recent models have started to use activity data, such as how different brain regions work together to form functional networks.

While the brain age gap is a useful measure for researchers who investigate aging and disease, it is not yet helpful for clinicians. For example, it is unclear whether the machine learning algorithm could detect changes in the brains of individuals in the initial stages of Alzheimer’s disease, before they start to manifest cognitive symptoms.

Millar et al. explored this question by testing whether models which incorporate structural and activity data could be more sensitive to these early changes. Three machine learning algorithms (relying on either structural data, activity data, or a combination of both) were used to predict the brain ages of participants with no sign of disease; with biological markers of Alzheimer’s disease but preserved cognitive functions; and with marked cognitive symptoms of the condition.

Overall, the combined model was slightly better at predicting the brain age of healthy volunteers, and all three models indicated that patients with dementia had a brain which looked older than normal. For this group, the model based on structural data was also able to make predictions which reflected the severity of cognitive decline. Crucially, the algorithm which used activity data predicted that, in individuals with biological markers of Alzheimer’s disease but no cognitive impairment, the brain looked in fact younger than chronological age. Exactly why this is the case remains unclear, but this signal may be driven by neural processes which unfold in the early stages of the disease. While more research is needed, the work by Millar et al. helps to explore how various types of machine learning models could one day be used to assess and predict brain health.

Introduction

Alzheimer disease (AD) is marked by structural and functional disruptions in the brain, some of which can be observed through multimodal magnetic resonance imaging (MRI) in preclinical and symptomatic stages of the disease (Frisoni et al., 2010; Brier et al., 2014a). More recently, the ‘brain-predicted age’ framework has emerged as a promising tool for neuroimaging analyses, leveraging recent developments and accessibility of machine-learning techniques, as well as large-scale, publicly available neuroimaging datasets (Cole and Franke, 2017b; Franke and Gaser, 2019). These models are trained to quantify how ‘old’ a brain appears, as compared to a normative sample of training data - typically consisting of cognitively normal participants across the adult lifespan (e.g., Cole et al., 2015). Thus, the framework allows for a residual-based interpretation of the brain age gap (BAG), defined as the difference between model-predicted age and chronological age, as an index of vulnerability and/or resistance to underlying disease pathology. Indeed, several studies have demonstrated that BAG is elevated (i.e. the brain ‘appears older’ than expected) in a host of neurological and psychiatric disorders, including symptomatic AD (Franke et al., 2010; Franke and Gaser, 2012; Gaser et al., 2013), as well as schizophrenia (e.g., Koutsouleris et al., 2014), HIV (e.g., Cole et al., 2017c), and type-2 diabetes (e.g., Franke et al., 2013), and moreover, predicts mortality (Cole et al., 2018). Conversely, lower BAG is associated with lower risk of disease progression (Gaser et al., 2013; Wang et al., 2019; Bocancea et al., 2021). Critically, at least one comparison suggests that BAG exceeds other established MRI (hippocampal volume) and CSF (pTau and Aβ42) biomarkers in sensitivity to AD progression (Gaser et al., 2013). 
Thus, by summarizing complex, non-linear, highly multivariate patterns of neuroimaging features into a simple, interpretable summary metric, BAG may reflect a comprehensive biomarker of brain health.

Several studies have established that symptomatic AD and mild cognitive impairment (MCI) are associated with elevated BAG (Cole and Franke, 2017b; Franke and Gaser, 2019). However, the sensitivity of these model estimates to AD in the presymptomatic stage (i.e. present amyloid pathology in the absence of cognitive decline [Sperling et al., 2011]) is less clear. The development of sensitive, reliable, non-invasive biomarkers of preclinical AD pathology is critical for the assessment of individual AD risk, as well as the evaluation of AD clinical prevention trials. Recent studies have demonstrated that greater BAG is associated with greater amyloid PET burden in a Down syndrome cohort (Cole et al., 2017a) and with greater tau PET burden in sporadic MCI and symptomatic AD (Lee et al., 2022). One approach to maximize sensitivity of BAG to presymptomatic AD pathology may be to train brain age models exclusively on amyloid-negative participants. As undetected AD pathology might influence MRI measures, and thus confound effects otherwise attributed to ‘healthy aging’ (Brier et al., 2014b), including the patterns learned by a traditional brain age model, an alternative model trained on amyloid-negative participants only might be more sensitive to detect presymptomatic AD pathology as deviations in BAG. Indeed, one recent study demonstrated that an amyloid-negative trained brain age model (Ly et al., 2020) is more sensitive to progressive stages of AD than a typical amyloid-insensitive model (Cole et al., 2015). However, this comparison included amyloid-negative and amyloid-positive test samples from two separate cohorts and thus may be driven by cohort, scanner, and/or site differences. To validate the applicability of the brain-predicted age approach to presymptomatic AD, it is important to test a model’s sensitivity to amyloid status, as well as continuous relationships with AD biomarkers, within a single cohort.
Another recent comparison demonstrated that both traditional and amyloid-negative trained brain age models were similarly related to molecular AD biomarkers, but that further attempts to ‘disentangle’ AD from brain age by including more advanced AD continuum participants in the training sample significantly reduced relationships between brain age and AD markers (Hwang et al., 2022). Thus, in this study, we will apply the amyloid-negative training approach to a multimodal MRI dataset in order to maximize sensitivity to AD pathology in the presymptomatic stage.

Most of the brain-predicted age reports described above focused primarily on structural MRI. However, other studies have successfully modeled brain age using a variety of other modalities, including metabolic PET (Goyal et al., 2019; Lee et al., 2022), diffusion MRI (Cherubini et al., 2016; Petersen et al., 2022), and functional connectivity (FC) (Dosenbach et al., 2010; Liem et al., 2017; Eavani et al., 2018; Nielsen et al., 2019). Integration of multiple neuroimaging modalities may maximize sensitivity of BAG estimates to preclinical AD. Indeed, recent multimodal comparisons suggest that structural MRI and FC capture complementary age-related signals (Eavani et al., 2018; Dunås et al., 2021) and that age prediction may be improved by incorporating multiple modalities (Liem et al., 2017; Engemann et al., 2020). One recent study has shown that BAG estimates from an FC graph theory-based model are significantly elevated in autosomal dominant AD mutation carriers and are positively associated with amyloid PET (Gonneaud et al., 2021). Furthermore, we have recently demonstrated that FC correlation-based BAG estimates are surprisingly reduced in cognitively normal participants with evidence of amyloid pathology and elevated pTau, as well as in cognitively normal APOE ε4 carriers at genetic risk of AD (Millar et al., 2022). Thus, incorporating FC into BAG models may improve sensitivity to early AD.

This project aimed to develop multimodal models of brain-predicted age, incorporating both FC and structural MRI. Participants with presymptomatic AD pathology were excluded from the training set to maximize sensitivity. We hypothesized that BAG estimates would be sensitive to the presence of AD biomarkers and early cognitive impairment. We further considered whether estimates were continuously associated with AD biomarkers of amyloid and tau, as well as cognition. We hypothesized that FC and structural MRI would capture complementary signals related to age and AD. Thus, we systematically compared models trained on unimodal FC, structural MRI, and combined modalities to test the added utility of multimodal integration in accurately predicting age and whether each modality captures unique relationships with AD biomarkers and cognition.

Methods

Participants

We formed a training sample of healthy controls spanning the adult lifespan by combining structural and FC-MRI data from three sources, as described previously (Millar et al., 2022): the Charles F. and Joanne Knight AD Research Center (ADRC) at Washington University in St. Louis (WUSTL), healthy controls from studies in the Ances lab at WUSTL (Thomas et al., 2013; Petersen et al., 2021), and mutation-negative controls from the Dominantly Inherited Alzheimer Network (DIAN) study of autosomal dominant AD at multiple international sites including WUSTL (McKay et al., 2022). To minimize the likelihood of undetected AD pathology in our training set, participants over the age of 50 were only included in the training set if they were cognitively normal, as assessed by the Clinical Dementia Rating (CDR 0; Morris, 1993), and had at least one biomarker indicating the absence of amyloid pathology (CN/A−, see below). We excluded 59 participants who did not have available CDR or biomarker measures (see Figure 1—figure supplement 1). As CDR and amyloid biomarkers were not available in the Ances lab controls, we included only participants at or below age 50 from this cohort in the training set. These healthy control participants were randomly divided into a training set (~80%; N=390) and a held-out test set (~20%; N=97), which did not significantly differ in age, sex, education, or race, see Table 1.

Table 1. Demographic information of the combined samples.

Column groups: Training sets (total N=390); Test sets § (total N=97); Analysis sets (total N=452).

| Measure | Training: Ances Controls (CN/<50) | Training: DIAN Controls (CN/A−) | Training: Knight ADRC Controls (CN/A−) | Test: Ances Controls (CN/<50) | Test: DIAN Controls (CN/A−) | Test: Knight ADRC Controls (CN/A−) | Analysis: CN/A− | Analysis: CN/A+ | Analysis: CI |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| N | 136 | 120 | 134 | 38 | 26 | 33 | 144 | 154 | 154 |
| Age (mean, SD) | 29.92 (9.92) | 40.02 (10.26) | 64.97 (10.57) | 26.68 (7.11) | 41.46 (12.34) | 64.73 (10.57) | 66.93 (8.53) | 72.56 (7.15) | 75.67 (6.86) |
| CDR (N 0 / N 0.5 / N 1.0 / N 2.0) | NA | 120 / 0 / 0 / 0 | 134 / 0 / 0 / 0 | NA | 26 / 0 / 0 / 0 | 33 / 0 / 0 / 0 | 144 / 0 / 0 / 0 | 154 / 0 / 0 / 0 | 0 / 119 / 35 / 2 |
| Amyloid status (N + / N -) | NA | 120 / 0 | 134 / 0 | NA | 26 / 0 | 33 / 0 | 144 / 0 | 0 / 154 | 0 / 57 |
| Biomarkers available (N PET / CSF / both) | NA | 30 / 6 / 79 | 11 / 22 / 91 | NA | 3 / 1 / 21 | 5 / 0 / 28 | 24 / 0 / 120 | 17 / 0 / 137 | 14 / 0 / 43 |
| APOE ε4 carrier status (N + / N -) | NA | 76 / 44 | 99 / 34 | NA | 19 / 7 | 28 / 5 | 115 / 29 | 71 / 83 | 55 / 98 |
| MMSE (mean, SD) | NA | NA | 29.26 (1.05) | NA | NA | 29.45 (0.94) | 29.13 (1.17) | 28.97 (1.33) | 25.37 (3.55) |
| Sex (N female / N male) | 70 / 64 | 85 / 35 | 84 / 50 | 19 / 18 | 16 / 10 | 22 / 11 | 89 / 55 | 91 / 63 | 68 / 86 |
| Years of education (mean, SD) | 13.68 (2.16) | 14.78 (3.04) | 16.16 (2.43) | 13.95 (1.99) | 14.92 (2.83) | 16.48 (2.43) | 15.71 (2.65) | 15.90 (2.64) | 15.05 (2.97)* |
| Race (N American Indian or Alaska Native) | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| Race (N Asian) | 1 | 1 | 2 | 0 | 0 | 0 | 0 | 1 | 0 |
| Race (N Black) | 67 | 0 | 20 | 17 | 0 | 7 | 17 | 16 | 20 |
| Race (N Native Hawaiian or Other Pacific Islander) | 2 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 |
| Race (N White) | 57 | 118 | 112 | 17 | 26 | 26 | 127 | 137 | 134 |
| Site | WUSTL | Multiple sites | WUSTL | WUSTL | Multiple sites | WUSTL | WUSTL | WUSTL | WUSTL |
| Scanner | Siemens Trio | Siemens Trio / Verio | Siemens Trio / Biograph | Siemens Trio | Siemens Trio / Verio | Siemens Trio / Biograph | Siemens Trio / Biograph | Siemens Trio / Biograph | Siemens Trio / Biograph |
| Field strength | 3T | 3T | 3T | 3T | 3T | 3T | 3T | 3T | 3T |

CN = Cognitively Normal, <50 = less than age 50, A− = amyloid negative, A+ = amyloid positive, CI = Cognitively Impaired, DIAN = Dominantly Inherited Alzheimer Network, ADRC = Alzheimer Disease Research Center, AD = Alzheimer disease, CDR = Clinical Dementia Rating, MMSE = Mini Mental State Examination, WUSTL = Washington University in St. Louis, T = Tesla. Differences between each group and the CN/A− analysis set were tested with t tests for continuous variables and χ2 tests for categorical variables.

* p < 0.05, ^ p < 0.10.

p < 0.01.

p < 0.001.

§ Test sets include randomly-selected, non-overlapping subsets of participants drawn from the same studies as the training sets.

Finally, independent samples for hypothesis testing included three groups from the Knight ADRC: a randomly selected sample of 144 CN/A− controls who did not overlap with the training or testing sets, 154 CN/A+ participants, and 154 cognitively impaired (CI) participants (CDR > 0 with a biomarker measure consistent with amyloid pathology [see below] and/or a primary diagnosis of AD or uncertain dementia [McKhann et al., 2011]). See Table 1 for demographic details of each sample. All participants provided written informed consent in accordance with the Declaration of Helsinki and their local institutional review board. All procedures were approved by the Human Research Protection Office at WUSTL (IRB ID # 201204041).

PET and CSF biomarkers

Amyloid burden was imaged with PET using [11C]-Pittsburgh Compound B (PIB; Klunk et al., 2004) or [18F]-Florbetapir (AV45; Wong et al., 2010). Regional standardized uptake value ratios (SUVRs) were modeled from 30 to 60 min after injection for PIB and from 50 to 70 min for AV45, using cerebellar gray as the reference region (Su et al., 2013). Regions of interest were segmented automatically using FreeSurfer 5.3 (Fischl, 2012). Global amyloid burden was defined as the mean of partial-volume-corrected (PVC) SUVRs from bilateral precuneus, superior and rostral middle frontal, lateral and medial orbitofrontal, and superior and middle temporal regions (Su et al., 2013). Amyloid summary SUVRs were harmonized across tracers using a centiloid conversion (Su et al., 2018).

Tau deposition was imaged with PET using [18F]-Flortaucipir (AV-1451; Chien et al., 2013). Regional SUVRs were modeled from 80 to 100 min after injection, using cerebellar gray as the reference region. A tau summary measure was defined as the mean of PVC SUVRs from bilateral amygdala, entorhinal, inferior temporal, and lateral occipital regions (Mishra et al., 2017).

CSF was collected via lumbar puncture using methods described previously (Fagan et al., 2006). After overnight fasting, 20–30 mL samples of CSF were collected, centrifuged, then aliquoted (500 µL) in polypropylene tubes, and stored at –80°C. CSF amyloid β peptide 42 (Aβ42), Aβ40, and phosphorylated tau-181 (pTau) were measured with automated Lumipulse immunoassays (Fujirebio, Malvern, PA, USA) using a single lot of assays for each analyte. Aβ42 and pTau estimates were each normalized for individual differences in CSF production rates by forming a ratio with Aβ40 as the denominator (Hansson et al., 2019; Guo et al., 2020). As pTau/Aβ40 was highly skewed, we applied a log transformation to these estimates before statistical analysis.

Amyloid positivity was defined using previously published cutoffs for PIB (SUVR > 1.42; Vlassenko et al., 2016) or AV45 (SUVR > 1.19; Su et al., 2019). Additionally, the CSF Aβ42/Aβ40 ratio has been shown to be highly concordant with amyloid PET (positivity cutoff < 0.0673; Schindler et al., 2018; Volluz et al., 2021). Thus, participants were defined as amyloid-positive (for CN/A+ and CI groups) if they had either a PIB, AV45, or CSF Aβ42/Aβ40 ratio measure in the positive range. Participants with discordant positivity between PET and CSF estimates were defined as amyloid-positive.
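The positivity rule above can be sketched in a few lines. This is only an illustration of the stated logic, not the study's code: the cutoffs are the ones quoted in the text, while the function name `amyloid_status` and its argument names are hypothetical.

```python
from typing import Optional

# Cutoffs as quoted in the text above.
PIB_SUVR_CUTOFF = 1.42     # PIB: positive if SUVR is above this value
AV45_SUVR_CUTOFF = 1.19    # AV45: positive if SUVR is above this value
CSF_RATIO_CUTOFF = 0.0673  # CSF Abeta42/Abeta40: positive if ratio is below this value


def amyloid_status(pib_suvr: Optional[float] = None,
                   av45_suvr: Optional[float] = None,
                   csf_ab42_ab40: Optional[float] = None) -> Optional[bool]:
    """Return True (A+), False (A-), or None if no biomarker is available."""
    calls = []
    if pib_suvr is not None:
        calls.append(pib_suvr > PIB_SUVR_CUTOFF)
    if av45_suvr is not None:
        calls.append(av45_suvr > AV45_SUVR_CUTOFF)
    if csf_ab42_ab40 is not None:
        calls.append(csf_ab42_ab40 < CSF_RATIO_CUTOFF)
    if not calls:
        return None
    # Any measure in the positive range yields A+, so discordant
    # PET/CSF cases are classified as amyloid-positive.
    return any(calls)
```

Under this sketch, a participant with a negative PET value but a positive CSF ratio is returned as amyloid-positive, matching the handling of discordant cases described above.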

Cognitive battery

Knight ADRC participants completed a 2 hr battery of cognitive tests. We examined global cognition by forming a composite of tasks across cognitive domains, including processing speed (Trail Making A; Schindler et al., 2018), executive function (Trail Making B; Schindler et al., 2018), semantic fluency (Animal Naming; Armitage, 1946), and episodic memory (Free and Cued Selective Reminding Test free recall score; Goodglass and Kaplan, 1983; Grober et al., 1988). This composite has recently been used to study individual differences in cognition in relation to preclinical AD biomarkers and structural MRI (Aschenbrenner et al., 2018), as well as functional MRI measures (Millar et al., 2021).

MRI acquisition

All MRI data were obtained using a Siemens 3T scanner, although there was a variety of specific models within and across studies. As described previously (Millar et al., 2022), participants in the Knight ADRC and Ances lab studies completed one of two comparable structural MRI protocols, varying by scanner (sagittal T1-weighted magnetization-prepared rapid gradient echo sequence [MPRAGE] with repetition time [TR] = 2400 or 2300 ms, echo time [TE] = 3.16 or 2.95 ms, flip angle = 8 or 9°, frames = 176, field of view = sagittal 256×256 or 240×256 mm, 1 mm isotropic or 1×1×1.2 mm voxels; oblique T2-weighted fast spin echo sequence [FSE] with TR = 3200 ms, TE = 455 ms, 256×256 acquisition matrix, 1 mm isotropic voxels) and an identical resting-state fMRI protocol (interleaved whole-brain echo planar imaging sequence [EPI] with TR = 2200 ms, TE = 27 ms, flip angle = 90°, field of view = 256 mm, 4 mm isotropic voxels for two 6 min runs [164 volumes each] of eyes open fixation). DIAN participants completed a similar MPRAGE protocol (TR = 2300 ms, TE = 2.95 ms, flip angle = 9°, field of view = 270 mm, 1.1×1.1×1.2 mm voxels; McKay et al., 2022). Resting-state EPI sequence parameters for the DIAN participants differed across sites and scanners with the most notable difference being shorter resting-state runs (one 5 min run of 120 volumes; see Supplementary file 1 for summary of structural and functional MRI parameters; McKay et al., 2022).

FC preprocessing and features

All MRI data were processed using common pipelines. Initial fMRI preprocessing followed conventional methods, as described previously (Shulman et al., 2010; Millar et al., 2022), including frame alignment, debanding, rigid body transformation, bias field correction, and normalization of within-run intensity values to a whole-brain mode of 1000 (Power et al., 2012). Transformation to an age-appropriate in-house atlas template (based on independent samples of either younger adults or CN older adults) was performed using a composition of affine transforms connecting the functional volumes with the T2-weighted and MPRAGE images. Frame alignment was included in a single resampling that generated a volumetric time series of the concatenated runs in isotropic 3 mm atlas space.

As described previously (Fox et al., 2009; Millar et al., 2022), additional processing was performed to allow for nuisance variable regression. Data underwent framewise censoring based on motion estimates (framewise displacement [FD] > 0.3 mm and/or derivative of variance [DVARS] > 2.5 above participant’s mean). To further minimize the confounding influence of head motion on FC estimates (Power et al., 2012) in all samples, we only included scans with low head motion (mean FD < 0.30 mm and > 50% frames retained after motion censoring). BOLD data underwent a temporal band-pass filter (0.005 Hz < f < 0.1 Hz) and nuisance variable regression, including motion parameters, timeseries from FreeSurfer 5.3-defined (Fischl, 2012) whole brain (global signal), CSF, ventricle, and white matter masks, as well as the derivatives of these signals. Finally, BOLD data were spatially blurred (6 mm full width at half maximum).
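As a rough illustration of the motion criteria above, the following sketch flags frames for censoring and applies the scan-level inclusion rule. Several details are assumptions rather than facts from the text: mean FD is computed over all frames before censoring, the DVARS criterion is passed in as a precomputed per-frame flag (its exact definition is not reproduced here), and the function name is hypothetical.

```python
from statistics import mean

FD_FRAME_CUTOFF = 0.3   # mm; frames above this are censored
MEAN_FD_CUTOFF = 0.30   # mm; scan-level inclusion threshold
MIN_RETAINED = 0.5      # more than 50% of frames must survive censoring


def scan_passes_motion_qc(fd, dvars_flag=None):
    """Return (include_scan, keep_mask) for one scan.

    fd         : per-frame framewise displacement in mm
    dvars_flag : optional per-frame booleans, True where the DVARS
                 criterion is exceeded (assumed to be computed elsewhere)
    """
    n = len(fd)
    if dvars_flag is None:
        dvars_flag = [False] * n
    # A frame is kept only if it passes both the FD and DVARS criteria.
    keep = [f <= FD_FRAME_CUTOFF and not d for f, d in zip(fd, dvars_flag)]
    retained_fraction = sum(keep) / n
    include = mean(fd) < MEAN_FD_CUTOFF and retained_fraction > MIN_RETAINED
    return include, keep
```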

Final BOLD time series data were averaged across voxels within a set of 300 spherical regions of interest (ROIs) in cortical, subcortical, and cerebellar areas (Seitzman et al., 2020). For each scan, we calculated the 300×300 Fisher-transformed Pearson correlation matrix of the final averaged BOLD time series between all ROIs. We then used the vectorized upper triangle of each correlation matrix (excluding auto-correlations; 44,850 total correlations) as input features for predicting age. Since site and/or scanner differences between samples might confound neuroimaging estimates, we harmonized FC matrices using an empirical Bayes modeling approach (ComBat; Johnson et al., 2007; Fortin et al., 2017), which has previously been applied to FC data (Yu et al., 2018).
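The feature count follows from the number of unique ROI pairs: 300 × 299 / 2 = 44,850. A minimal pure-Python sketch of the feature construction is given below, assuming each ROI time series is a plain list of numbers; the function names are illustrative, and ComBat harmonization is not shown.

```python
import math


def n_unique_edges(n_rois):
    """Number of upper-triangle entries, excluding the diagonal."""
    return n_rois * (n_rois - 1) // 2


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fc_features(timeseries):
    """Vectorized upper triangle of the Fisher-transformed correlation matrix.

    timeseries: list of ROI time series (one list per ROI).
    Note: math.atanh raises ValueError for |r| == 1, which should not
    occur between distinct ROIs in real data.
    """
    n = len(timeseries)
    features = []
    for i in range(n):
        for j in range(i + 1, n):
            r = pearson(timeseries[i], timeseries[j])
            features.append(math.atanh(r))  # Fisher r-to-z transform
    return features
```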

Structural MRI processing and features

All T1-weighted images underwent cortical reconstruction and structural segmentation through a common pipeline with FreeSurfer 5.3 (Fischl et al., 2002; Fischl, 2012). Structural processing included segmentation of subcortical white matter and deep gray matter, intensity normalization, registration to a spherical atlas, and parcellation of the cerebral cortex based on the Desikan atlas (Desikan et al., 2006). Inclusion and exclusion errors of parcellation and segmentation were identified and edited by a centralized team of trained research technicians according to standardized criteria (Su et al., 2013). We then used the FreeSurfer-defined thickness estimates from 68 cortical regions (Desikan et al., 2006), along with volume estimates from 33 subcortical regions (Fischl et al., 2002) as input features for predicting age. We harmonized structural features across sites and scanners using the same ComBat approach (Johnson et al., 2007; Fortin et al., 2017), which has also been applied to structural MRI data (Fortin et al., 2018).

Gaussian process regression

As described previously (Millar et al., 2022), machine-learning analyses were conducted using the Regression Learner application in Matlab (MathWorks, 2021). We trained two Gaussian process regression (GPR; Rasmussen et al., 2004) models, each with a rational quadratic kernel function to predict chronological age using fully-processed, harmonized MRI features (FC or structural) in the training set. The σ hyperparameter was tuned within each model by searching a range of values from 10^−4 to 10 × SD_age using Bayesian optimization across 100 training evaluations. The optimal value of σ for each model was found (see Figure 1—figure supplement 2) and was applied for all subsequent applications of that model. All other hyperparameters were set to default values (basis function = constant and standardize = true).
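For readers unfamiliar with the kernel, the textbook form of the rational quadratic covariance function is sketched below. This is not the Matlab implementation used in the study; the parameter names (`sigma_f`, `length_scale`, `alpha`) are illustrative, and the text does not state the kernel parameter values that were fitted.

```python
def rational_quadratic(x, y, sigma_f=1.0, length_scale=1.0, alpha=1.0):
    """Textbook rational quadratic kernel between feature vectors x and y.

    k(x, y) = sigma_f^2 * (1 + d^2 / (2 * alpha * length_scale^2)) ** (-alpha)
    where d is the Euclidean distance between x and y.
    """
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return sigma_f ** 2 * (1.0 + d2 / (2.0 * alpha * length_scale ** 2)) ** (-alpha)
```

The kernel equals `sigma_f ** 2` for identical inputs and decays smoothly with distance; as `alpha` grows it approaches the squared exponential kernel.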

Model performance in the training set was assessed using 10-fold cross validation via the Pearson correlation coefficient (r), the proportion of variance explained (R2), the mean absolute error (MAE), and root-mean-square error (RMSE) between true chronological age and the cross-validated age predictions merged across the 10 folds. We then evaluated generalizability of the models to predict age in unseen data by applying the trained models to the held-out test set of healthy controls. Finally, we applied the same fully-trained GPR models to separate analysis sets of 154 CI, 154 CN/A+, and 144 CN/A− controls to test our hypotheses regarding AD-related group effects and individual difference relationships. Unimodal models were each constructed with a single GPR model. The multimodal model was constructed by taking the ‘stacked’ predictions from each first-level unimodal model as features for training a second-level GPR model (Liem et al., 2017; Engemann et al., 2020; Dunås et al., 2021).
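The four performance metrics can be computed from paired true and predicted ages as in the sketch below. One assumption is flagged: R2 is computed here as 1 − SS_res/SS_tot, whereas the text does not say whether that definition or the squared correlation was used. The function name is hypothetical.

```python
import math


def performance(true_age, pred_age):
    """Return Pearson r, R2, MAE, and RMSE for paired age values."""
    n = len(true_age)
    mt = sum(true_age) / n
    mp = sum(pred_age) / n
    stt = sum((t - mt) ** 2 for t in true_age)
    spp = sum((p - mp) ** 2 for p in pred_age)
    stp = sum((t - mt) * (p - mp) for t, p in zip(true_age, pred_age))
    errors = [p - t for t, p in zip(true_age, pred_age)]
    ss_res = sum(e ** 2 for e in errors)
    return {
        "r": stp / math.sqrt(stt * spp),
        "R2": 1.0 - ss_res / stt,  # assumed definition: 1 - SS_res / SS_tot
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(ss_res / n),
    }
```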

For each participant, we calculated model-specific BAG estimates as the difference between chronological age and age predictions from the unimodal FC model (FC-BAG), structural model (S-BAG), and multimodal model (S+FC BAG). To correct for regression dilution commonly observed in similar models (Le et al., 2018; Smith et al., 2019; Liang et al., 2019), we included chronological age as a covariate in all statistical tests of BAG (Cole et al., 2017a; Le et al., 2018). However, to avoid inflating estimates of prediction accuracy (Butler et al., 2021), only uncorrected age prediction values were used for evaluating model performance in the training and test sets.
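A minimal sketch of the BAG calculation and of age residualization follows. The sign convention (predicted minus chronological age, so that positive values mean an older-appearing brain) follows the definition given in the Introduction. Residualizing by simple linear regression is shown only as one way to remove the age dependence; in the analyses described here, age enters the statistical models as a covariate. Function names are illustrative.

```python
def brain_age_gap(pred_age, chron_age):
    """BAG = predicted age minus chronological age, per participant."""
    return [p - a for p, a in zip(pred_age, chron_age)]


def residualize_on_age(bag, chron_age):
    """Residuals of BAG after ordinary least squares regression on age."""
    n = len(bag)
    ma = sum(chron_age) / n
    mb = sum(bag) / n
    slope = (sum((a - ma) * (b - mb) for a, b in zip(chron_age, bag))
             / sum((a - ma) ** 2 for a in chron_age))
    intercept = mb - slope * ma
    return [b - (intercept + slope * a) for a, b in zip(chron_age, bag)]
```

When predictions are compressed toward the sample mean, raw BAG is positive in younger and negative in older participants; the residuals remove that linear dependence on age.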

Statistical analysis

All statistical analyses were conducted in R 4.0.2 (R Development Core Team, 2020). Demographic differences in the AD samples were tested with independent-samples t tests for continuous variables and χ2 tests for categorical variables, using CN/A− controls as a reference group. Differences in brain age model performance were tested using Williams’s test of difference between dependent correlations sharing one variable, i.e., Pearson’s r between age and each model prediction of age. To correct for age-related bias in BAG (Le et al., 2018; as previously mentioned), we controlled for age as a covariate during all statistical tests. Group differences in each BAG estimate were tested using an omnibus ANOVA test with follow-up pairwise t tests on age-residualized BAG estimates, using a false discovery rate (FDR) correction for multiple comparisons. Assumptions of normality were tested by visual inspection of quantile-quantile plots. Assumptions of equality of variance were tested with Levene’s test. Linear regression models tested the effects of cognitive impairment (CDR > 0 vs. CDR 0) and amyloid positivity (A− vs. A+) on BAG estimates from each model, controlling for true age (as noted above), sex, and years of education. Given the potential confounding influence of head motion on FC-derived measures (Power et al., 2012; Van Dijk et al., 2012; Satterthwaite et al., 2012), we also included mean FD as an additional covariate of non-interest in the FC and S+FC models. We tested continuous relationships with AD biomarkers and cognitive estimates using linear regression models, including the same demographic and motion covariates. Since the range of amyloid biomarkers was drastically reduced in the CN/A− sample, we excluded these participants from models testing continuous amyloid relationships. Effect sizes were computed as partial η2 (ηp2).
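The text does not name the specific FDR procedure. Assuming the common Benjamini-Hochberg method (which is what R's `p.adjust` applies for `method = "fdr"`), the adjustment of a set of pairwise p values can be sketched as follows; the function name is hypothetical.

```python
def fdr_bh(pvals):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```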

Results

Sample description and demographics

Demographic characteristics of the training sets, test sets, and analysis sets are reported in Table 1. CN/A+ participants were older (t = 6.15, p < 0.001) and more likely to be APOE ε4 carriers (χ2 = 34.73, p < 0.001) than amyloid-negative controls. Furthermore, CI participants were older (t = 9.71, p < 0.001), were more likely to be male (χ2 = 8.60, p = 0.003), were more likely to be APOE ε4 carriers (χ2 = 56.67, p < 0.001), had fewer years of education (t = 2.03, p < 0.043), and had lower MMSE scores (t = 12.46, p < 0.001) than amyloid-negative controls.

Comparison of model performance

All models accurately predicted chronological age in the training sets, as assessed using 10-fold cross validation, as well as in the held-out test sets. Overall, prediction accuracy was lowest in the FC model (MAE_FC/Train = 8.67 years, R2_FC/Train = 0.68, MAE_FC/Test = 8.25 years, R2_FC/Test = 0.73; see Figure 1A & B). The structural MRI model (MAE_S/Train = 5.97 years, R2_S/Train = 0.81, MAE_S/Test = 6.26 years, R2_S/Test = 0.82; see Figure 1C & D) significantly outperformed the FC model in age prediction accuracy (Williams’s t for S vs. FC = 5.39, p < 0.001). There was a significant, but modestly sized, positive correlation between FC-BAG and S-BAG in the adult lifespan CN/A− training and testing sets (r = 0.095, p = 0.036; see Figure 1—figure supplement 3A), as well as the AD analysis sets (r = 0.134, p = 0.004; see Figure 1—figure supplement 3B).

Figure 1. Performance of the brain age models in the training (left column) and test sets (right column) for each modality: functional connectivity (FC; A and B), structural MRI (S; C and D) and multimodal models (S+FC; E and F).

Age predicted by each model (y axis) is plotted against true age (x axis). Colored lines and shaded areas represent regression lines and 95% confidence regions. Dashed black lines represent perfect prediction. Model performance is evaluated by Pearson’s r, proportion of variance explained (R2), mean absolute error (MAE), and root-mean-square error (RMSE).


Figure 1—figure supplement 1. Flow chart of participant inclusion, exclusion, and group assignments.


Figure 1—figure supplement 2. Tuning curves of σ hyperparameter in training for structural (A) and functional connectivity (B) Gaussian process regression (GPR) models.


Figure 1—figure supplement 3. Correlation between S-brain age gap (BAG; x axis) and functional connectivity (FC)-BAG (y axis) estimates in the training and validation sets (A) and analysis sets (B).


Both BAG estimates are residualized for age. Dotted black lines represent no difference between predicted and chronological age for each model. Colored lines and shaded areas represent group-specific regression lines and 95% confidence regions. Dashed black lines represent main effect regression lines across all groups.
Figure 1—figure supplement 4. Violin plot of R2 performance estimates from 1000 bootstrapped samples in which a stacked brain age model combined the fully-trained structural MRI model (R2S) with a reshuffled functional connectivity (FC) model (i.e. FC training features were randomly reassigned in each bootstrap sample).

Most bootstrapped stacked models perform about as well or worse than the unimodal structural MRI model (R2S, black dashed line). The fully-trained stacked multimodal model (R2S+FC, red solid line) outperforms all bootstrapped models, suggesting that the modest increase in model performance observed in the multimodal model over the unimodal structural model is due to meaningful age-related FC signal, rather than capitalizing on noise in a larger feature set.

Finally, the multimodal model (MAES+FC/Train = 5.34 years, R2S+FC/Train = 0.86, MAES+FC/Test = 5.25 years, R2S+FC/Test = 0.87; see Figure 1E & F) significantly outperformed both the FC model (Williams’s tS+FC vs. FC = 11.20, p < 0.001) and the structural MRI model (Williams’s tS+FC vs. S = 5.67, p < 0.001). It is possible that the modest increase in the multimodal model was due to capitalizing on noise, simply by adding more features to the structural model. Hence, we also compared the observed R2S+FC to a bootstrapped distribution of R2 performance estimates from 1000 resamples using a model in which the original structural MRI model was stacked with a model trained on randomly reshuffled FC features. Thus, this distribution represents the expected improvements in model performance from simply adding new features to the structural MRI model with the stacked approach. The observed R2S+FC outperformed all R2 estimates from this bootstrapped distribution (p < 0.001; see Figure 1—figure supplement 4), suggesting that the modest increase in model performance observed in the stacked multimodal (S+FC) model over the unimodal structural model is due to meaningful age-related FC signal, rather than capitalizing on noise in a larger feature set.
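The comparison against the reshuffled-FC distribution can be sketched as an empirical p value: the proportion of null R2 estimates that equal or exceed the observed R2. The sketch below is illustrative only. The null values are simulated placeholders rather than outputs of the stacked model, and the add-one correction in the p value is an assumption about how the comparison might be computed.

```python
import random


def empirical_p_value(observed, null_values):
    """Proportion of null estimates at least as large as the observed one.

    Assumption: an add-one correction is used so that the p value is never
    exactly zero when no null estimate exceeds the observed value.
    """
    exceed = sum(1 for value in null_values if value >= observed)
    return (exceed + 1) / (len(null_values) + 1)


# Observed test-set R2 of the multimodal model, as reported in the text.
observed_r2 = 0.87

# Hypothetical placeholder null distribution: 1000 simulated R2 values
# standing in for stacked models trained on reshuffled FC features.
rng = random.Random(0)
null_r2 = [rng.uniform(0.78, 0.84) for _ in range(1000)]

p_value = empirical_p_value(observed_r2, null_r2)
```

With 1000 resamples and no null estimate reaching the observed value, this correction gives p = 1/1001, which is consistent with the reported p < 0.001.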

BAG differences in cognitive impairment and amyloid positivity

Residual FC-BAG was normally distributed (see Figure 2—figure supplement 1), and variance in FC-BAG did not significantly differ between the analysis sets, Levene’s statistic = 0.01, p = 0.988. An omnibus ANOVA revealed significant differences in residual FC-BAG across the three groups, F(2,449) = 9.80, p < 0.001. FC-BAG was 2.17 years higher in CI participants compared to CN controls (β = 2.17, p = 0.030, ηp2 = 0.01; see Figure 2A&B, Table 2A). Follow-up t tests revealed that residual FC-BAG was significantly elevated in CI relative to CN/A+ participants (pFDR < 0.001). FC-BAG was also 1.64 years lower in A+ participants compared to A− (β = –1.64, p = 0.035, ηp2 = 0.01), controlling for global CDR and the other covariates. Follow-up t tests revealed that residual FC-BAG was significantly lower in CN/A+ participants compared to CN/A− controls (pFDR = 0.002).
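The "residual" BAG used in these comparisons controls for chronological age. A minimal sketch of one way to do this, assuming a simple linear regression of BAG on age and keeping the residuals, is shown below. The exact correction applied in the original analysis is not described in this section, so the procedure, function name, and toy values are all illustrative assumptions.

```python
def residualize(outcome, covariate):
    """Return residuals of a simple linear regression of outcome on covariate."""
    n = len(covariate)
    mean_x = sum(covariate) / n
    mean_y = sum(outcome) / n
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, outcome))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(covariate, outcome)]


# Hypothetical toy values, not data from the study.
age = [25.0, 40.0, 55.0, 70.0, 85.0]
predicted_age = [33.0, 44.0, 56.0, 66.0, 78.0]

# Raw BAG: predicted minus chronological age (negatively related to age here).
bag = [p - a for p, a in zip(predicted_age, age)]

# Residual BAG: the part of BAG that is not linearly explained by age.
residual_bag = residualize(bag, age)
```

By construction, the residuals average zero and are uncorrelated with age, so group differences in residual BAG cannot be attributed to a linear age effect.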

Figure 2. Group differences in functional connectivity (FC; A and B), structural (S; C and D), and multimodal (S+FC; E and F) brain age in the analysis sets.

Comparisons are presented between cognitively normal (Clinical Dementia Rating [CDR] = 0) biomarker-negative controls (CN/A−; blue) vs. CN/A+ (green) vs. cognitively impaired participants (CI, red). Scatterplots (A, C, and E) show predicted vs. true age for each group. Colored lines and shaded areas represent group-specific regression lines and 95% confidence regions. Dashed black lines represent perfect prediction. Violin plots (B, D, and F) show residual brain age gap (BAG; controlling for true age) for the corresponding model in each group. p values are reported from pairwise independent-samples t tests.

Figure 2—figure supplement 1. Quantile-quantile plots of brain age gap, controlling for age, in each of the analysis sets (cognitively normal, amyloid negative [CN/A−]; CN/A+; and cognitively impaired [CI]) for functional connectivity [FC; A], structural [S; B] and multimodal [S+FC; C] models.

Table 2. Linear regression models predicting functional connectivity (FC)-brain age gap (BAG) (A), S-BAG (B), and S+FC BAG (C).

CDR = Clinical Dementia Rating. FD = framewise displacement.


A. FC-BAG (df = 348)
Term             Estimate   SE      p value   ηp2
Intercept        30.903     3.809   <0.001
CDR > 0           2.169     0.997    0.030    0.013
Amyloid+         –1.640     0.776    0.035    0.013
Age (y)           0.586     0.044   <0.001    0.335
Sex = female     –1.174     0.700    0.094    0.008
Education (y)    –0.006     0.127    0.964    0.000
Mean FD           5.528     5.467    0.313    0.003

B. S-BAG (df = 349)
Term             Estimate   SE      p value   ηp2
Intercept         5.830     4.899    0.235
CDR > 0           5.105     1.287   <0.001    0.043
Amyloid+          0.900     1.002    0.369    0.002
Age (y)           0.151     0.057    0.008    0.020
Sex = female      1.792     0.904    0.048    0.011
Education (y)    –0.155     0.164    0.345    0.003
Mean FD           NA        NA       NA       NA

C. S+FC BAG (df = 348)
Term             Estimate   SE      p value   ηp2
Intercept        11.755     4.197    0.005
CDR > 0           4.305     1.099   <0.001    0.042
Amyloid+          0.060     0.855    0.944    0.000
Age (y)           0.201     0.049   <0.001    0.047
Sex = female      0.691     0.771    0.371    0.002
Education (y)    –0.152     0.140    0.276    0.003
Mean FD           4.893     6.024    0.417    0.002

Residual S-BAG was also normally distributed (see Figure 2—figure supplement 1), and variance in S-BAG did not significantly differ between the analysis sets, Levene’s statistic = 0.10, p = 0.902. An omnibus ANOVA revealed significant differences in residual S-BAG across the three groups, F(2,449) = 20.64, p < 0.001. S-BAG was 5.10 years higher in CI participants compared to CN controls (β = 5.10, p < 0.001, ηp2 = 0.04; see Figure 2C&D, Table 2B). Follow-up t tests revealed that residual S-BAG was significantly elevated in CI participants relative to CN/A− and CN/A+ participants (pFDR’s < 0.001). S-BAG did not significantly differ as a function of amyloid positivity, controlling for CDR and the other covariates.
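For readers checking the effect sizes, partial eta squared for a single-degree-of-freedom regression term can be recovered from the estimate, its standard error, and the residual degrees of freedom, because ηp2 = t²/(t² + df) with t = estimate/SE. The sketch below applies this identity to the CDR > 0 rows of Table 2. The function name is hypothetical, and small discrepancies for other rows are expected because the tabled estimates and standard errors are rounded.

```python
def partial_eta_squared(estimate, standard_error, residual_df):
    """Partial eta squared for a 1-df term: t^2 / (t^2 + residual df)."""
    t_stat = estimate / standard_error
    return t_stat ** 2 / (t_stat ** 2 + residual_df)


# (estimate, SE, residual df) for the CDR > 0 term, taken from Table 2.
cdr_rows = {
    "FC-BAG": (2.169, 0.997, 348),
    "S-BAG": (5.105, 1.287, 349),
    "S+FC BAG": (4.305, 1.099, 348),
}

recovered = {
    model: round(partial_eta_squared(*row), 3)
    for model, row in cdr_rows.items()
}
# recovered == {"FC-BAG": 0.013, "S-BAG": 0.043, "S+FC BAG": 0.042}
```

These values match the ηp2 column of Table 2 for the CDR > 0 term in all three models.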

Residual S+FC-BAG was also normally distributed (see Figure 2—figure supplement 1), and variance in S+FC-BAG did not significantly differ between the analysis sets, Levene’s statistic = 0.89, p = 0.412. An omnibus ANOVA revealed significant differences in residual S+FC-BAG across the three groups, F(2,449) = 21.84, p < 0.001. S+FC-BAG was 4.31 years higher in CI participants compared to CN controls (β = 4.31, p < 0.001, ηp2 = 0.04; see Figure 2E&F, Table 2C). Follow-up t tests revealed that residual S+FC-BAG was significantly elevated in CI participants relative to CN/A− and CN/A+ participants (pFDR’s < 0.001). S+FC-BAG did not significantly differ as a function of amyloid positivity, controlling for CDR and the other covariates.
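The follow-up comparisons above report FDR-corrected p values (pFDR). The specific correction procedure is not stated in this section. The sketch below assumes the widely used Benjamini-Hochberg step-up adjustment and applies it to hypothetical raw p values, not to values from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p values from four pairwise tests.
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(raw_p)
```

Each adjusted value is the raw p value scaled by the number of tests divided by its rank, then capped so that adjusted values never decrease as raw p values increase.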

Relationships with amyloid markers

355 participants (144 CN/A−, 154 CN/A+, 57 CI) had an available amyloid PET scan, and 300 (120 CN/A−, 137 CN/A+, 43 CI) had an available CSF estimate of Aβ42/40. In the FC model, FC-BAG was not significantly related to amyloid PET, nor was there an interactive relationship with amyloid PET between groups (see Figure 3A). There were also no significant main effects or interactions between FC-BAG, S-BAG, or S+FC BAG and CSF Aβ42/40 (see Figure 3B, D, and F).

Figure 3. Continuous relationships between amyloid biomarkers and functional connectivity (FC-brain age gap [BAG]; A and B), structural (S-BAG; C and D), and multimodal (S+FC BAG; E and F) BAG in the analysis sets.

Scatterplots show amyloid PET (A, C, and E) and CSF AB42/40 (B, D, and F) as a function of residual BAG (controlling for true age) in each group. Colored lines and shaded areas represent group-specific regression lines and 95% confidence regions. Dashed black lines represent main effect regression lines across all groups.

In the structural and multimodal models, there were significant main effects, such that greater S-BAG (β = 0.79, p = 0.004, ηp2 = 0.041; see Figure 3C) and greater S+FC BAG (β = 0.81, p = 0.015, ηp2 = 0.029; see Figure 3E) were both associated with greater amyloid PET. In the multimodal model only, this relationship was further characterized by a non-significant interaction (β = 1.16, p = 0.087, ηp2 = 0.014), such that the association was significantly positive in CI participants (β = 1.53, p = 0.029, ηp2 = 0.092) but not in CN/A+ participants (β = –0.05, p = 0.881, ηp2 = 0.001).

Relationships with tau markers

99 participants (42 CN/A–, 40 CN/A+, 17 CI) had an available tau PET scan, and 300 (120 CN/A–, 137 CN/A+, 43 CI) had an available CSF estimate of pTau-181/Aβ40. In the FC model, FC-BAG was not significantly related to tau PET or CSF pTau-181/Aβ40 (see Figure 4A and B). However, there was a non-significant interaction, suggesting a more positive association between CSF pTau-181/Aβ40 and FC-BAG in CI participants than in CN controls (β = 0.02, p = 0.059, ηp2 = 0.016).

Figure 4. Continuous relationships between tau biomarkers and functional connectivity (FC-brain age gap [BAG]; A and B), structural (S-BAG; C and D), and multimodal (S+FC BAG; E and F) BAG in the analysis sets.

Scatterplots show Tau PET summary (A, C, and E) and log-transformed CSF pTau/Aβ40 (B, D, and F) as a function of residual BAG (controlling for true age) in each group. Colored lines and shaded areas represent group-specific regression lines and 95% confidence regions. Dashed black lines represent main effect regression lines across all groups.

In the structural and multimodal models, there were significant main effects, such that greater S-BAG (β = 0.02, p < 0.001, ηp2 = 0.141; see Figure 4C) and greater S+FC BAG (β = 0.02, p = 0.001, ηp2 = 0.110; see Figure 4E) were both associated with greater tau PET. These main effects were further characterized by significant interactions (S-BAG: β = 0.04, p < 0.001, ηp2 = 0.176; S+FC-BAG: β = 0.07, p < 0.001, ηp2 = 0.250), such that the positive association was only observed in CI participants, but not in the other groups.
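The interactions reported here ask whether the slope relating BAG to a biomarker differs between groups. One way to illustrate the idea, assuming a fully interacted two-group model with group-specific intercepts, is that the interaction coefficient equals the difference between the within-group simple slopes. The sketch below uses hypothetical toy values and omits the covariates included in the original models, so it illustrates the logic rather than reproducing the analysis.

```python
def ols_slope(predictor, outcome):
    """Slope of a simple linear regression of outcome on predictor."""
    n = len(predictor)
    mean_x = sum(predictor) / n
    mean_y = sum(outcome) / n
    sxx = sum((x - mean_x) ** 2 for x in predictor)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictor, outcome))
    return sxy / sxx


def slope_difference(reference, comparison):
    """Comparison-group slope minus reference-group slope.

    Each argument is a (residual_bag, biomarker) pair of equal-length lists.
    """
    return ols_slope(*comparison) - ols_slope(*reference)


# Hypothetical toy values, not data from the study.
residual_bag = [-2.0, -1.0, 0.0, 1.0, 2.0]
cn_group = (residual_bag, [1.0, 1.0, 1.0, 1.0, 1.0])  # flat: no association
ci_group = (residual_bag, [0.6, 0.8, 1.0, 1.2, 1.4])  # positive association

interaction_estimate = slope_difference(cn_group, ci_group)
```

In this toy example the biomarker is unrelated to BAG in the reference group and rises by 0.2 units per year of BAG in the comparison group, so the slope difference is 0.2.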

Consistent with tau PET, CSF pTau/Aβ40 demonstrated similar interactive relationships, such that greater S-BAG (β = 0.02, p < 0.001, ηp2 = 0.052; see Figure 4D) and greater S+FC BAG (β = 0.04, p < 0.001, ηp2 = 0.075; see Figure 4F) were both associated with greater CSF pTau/Aβ40 in the CI participants, but not in the other groups.

Relationships with cognition

445 participants (144 CN/A−, 153 CN/A+, 148 CI) had available performance measures from the cognitive composite tasks. In the FC model, there was a significant main effect, such that across all groups, greater FC-BAG was associated with lower cognitive composite score (β = –0.01, p = 0.006, ηp2 = 0.017; see Figure 5A). However, this effect was driven by group differences in both variables, as there were neither relationships between FC-BAG and cognition within any of the groups nor were there any significant interactions.

Figure 5. Continuous relationships between global cognition and functional connectivity (FC-brain age gap [BAG]; A), structural (S-BAG; B), and multimodal (S+FC BAG; C) in the analysis sets.

Scatterplots show global cognition as a function of residual BAG (controlling for true age) in each group. Colored lines and shaded areas represent group-specific regression lines and 95% confidence regions. Dashed black lines represent main effect regression lines across all groups.

In the structural and multimodal models, there were significant main effects, such that greater S-BAG (β = –0.03, p < 0.001, ηp2 = 0.104; see Figure 5B) and greater S+FC BAG (β = –0.03, p < 0.001, ηp2 = 0.096; see Figure 5C) were both associated with lower cognitive composite scores. Both effects were further characterized by significant interactions such that the negative associations were observed in the CI participants, but not in the other groups (S-BAG: β = –0.03, p < 0.001, ηp2 = 0.045; S+FC-BAG: β = –0.04, p < 0.001, ηp2 = 0.047).

Discussion

We first found that machine-learning models successfully predicted age when trained on FC, structural MRI, and multimodal datasets. As expected, the structural model predicted age with greater accuracy than the FC model, but the multimodal model outperformed both unimodal models. Second, BAG estimates from all models were significantly elevated in CI participants compared to CN controls. BAG estimates in the FC model were significantly reduced in cognitively normal participants with elevated amyloid, but no structural group differences were observed in presymptomatic stages. Third, interactive relationships were observed, such that greater BAG was associated with greater continuous AD biomarker load in CI, but not in CN, participants. Specifically, in the FC model, such a pattern only appeared in a non-significant interaction predicting CSF pTau/Aβ40. However, in the structural model, these interactions were significantly observed in relation to CSF pTau/Aβ40 and tau PET. In the multimodal model, these same interactions were also observed in addition to a non-significant interaction with amyloid PET. Finally, regarding cognitive relationships, similar interactive patterns were observed, such that in CI participants, greater BAG estimates from structural and multimodal models were associated with lower cognitive performance; however, this relationship was not observed in the FC model.

Predicting brain age with multiple modalities

We found that a GPR model trained on structural MRI features predicted chronological age in a cognitively normal, amyloid-negative adult sample with an R2 of 0.81. This level of performance is comparable to other structural models, which have reported R2s ranging from 0.80 to 0.95 (Cole and Franke, 2017b; Liem et al., 2017; Eavani et al., 2018; Wang et al., 2019; Bashyam et al., 2020; Ly et al., 2020; Gong et al., 2021; Lee et al., 2022). As previously reported (Millar et al., 2022), the FC-trained model predicted age with an R2 of 0.68, again consistent with previous FC models, which have achieved R2s from 0.53 to 0.80 (Liem et al., 2017; Eavani et al., 2018; Gonneaud et al., 2021). Our observation that structural MRI outperformed FC in age prediction is also consistent with previous direct comparisons between modalities (Liem et al., 2017; Eavani et al., 2018; Dunås et al., 2021).

Importantly, however, there was only a modest positive correlation between FC and structural BAG estimates, after correcting for age-related biases, suggesting that functional and structural MRI capture distinct age-related signals. Indeed, the multimodal model outperformed both unimodal models by integrating these complementary signals. These observations, again, are consistent with other recent reports of multimodal age prediction models (Liem et al., 2017; Eavani et al., 2018; Engemann et al., 2020; Dunås et al., 2021). Future models may improve age prediction accuracy by combining data from structural, FC, and/or other neuroimaging modalities, several of which may be available in typical MRI sessions of multiple sequences.

BAG as a marker of cognitive impairment

Structural BAG was elevated by 5.10 years in CI participants compared to CN controls. This effect is comparable to previous structural age prediction models, demonstrating elevations in AD and MCI samples between 5 and 10 years (Cole and Franke, 2017b; Franke and Gaser, 2019). As previously reported, FC BAG was also elevated in CI participants, but to a relatively smaller extent, i.e., 2.17 years (Millar et al., 2022). The multimodal BAG was similarly elevated in CI participants by 4.31 years. Thus, each model is clearly sensitive to group differences in AD status at the symptomatic stage.

Consistent with one previous report (Lee et al., 2022), we demonstrated that within the CI participants, BAG estimates were related to individual differences in AD biomarkers and cognitive function. These effects were most pronounced in the structural model, which showed relationships with tau biomarkers and cognition in the CI participants, and the multimodal model, which showed relationships with tau, cognition, and amyloid PET. Thus, age prediction models that include structural MRI (including unimodal and multimodal approaches) may be useful in tracking AD pathological progression and cognitive decline within the symptomatic stage of the disease.

BAG as a marker of presymptomatic AD

We found that structural and multimodal BAG did not differ between cognitively normal participants with and without amyloid pathology. In cognitively normal participants, structural BAG estimates did not significantly associate with individual differences in any AD biomarkers. Overall, although structural and multimodal BAG estimates track well with some biomarkers of AD pathophysiology, as previously reported (Lee et al., 2022), our novel results suggest that these relationships are not observed until the symptomatic stage of the disease, at which point structural changes become more apparent.

As we have previously reported (Millar et al., 2022), FC-BAG was lower in presymptomatic AD participants compared to amyloid-negative controls. Extending beyond this group difference, we now also note that FC-BAG was negatively associated with amyloid PET in CN/A+ participants. The combined reduction of FC-BAG in the presymptomatic stage and increase in the symptomatic stage suggest a biphasic functional response to AD progression, which is partially consistent with some prior suggestions (Jagust and Mormino, 2011; Jones et al., 2016; Jones et al., 2017; Schultz et al., 2017; Wales and Leung, 2021; see Millar et al., 2022 for a more detailed discussion).

Interpretation of this biphasic pattern is still unclear, although the present results provide at least one novel insight. Specifically, one potential interpretation is that the ‘younger’ appearing FC pattern in the presymptomatic stage may reflect a compensatory response to early AD pathology (Cabeza et al., 2018). This interpretation leads to the prediction that reduced FC-BAG should be associated with better cognitive performance in the preclinical stage. However, this interpretation is not supported by the current results, as FC-BAG did not correlate with cognition in any of the analysis samples.

Alternatively, pathological AD-related FC disruptions may be orthogonal to healthy age-related FC differences, as supported by our previous observation that age and AD are predicted by mostly non-overlapping FC networks (Millar et al., 2022). For instance, the ‘younger’ FC pattern in CN/A+ participants may be driven by hyper-excitability in the preclinical stage (Harris et al., 2020; Ranasinghe et al., 2022). It is also worth considering that patterns of younger FC-BAG in CN/A+ participants may somehow correspond to a recent observation that patterns of youthful-appearing aerobic glycolysis are relatively preserved in the presymptomatic stage of AD (Goyal et al., 2022). Finally, this effect may simply be spuriously driven by poor performance of the FC brain age model, sample-specific noise, and/or statistical artifacts related to regression dilution and its correction (Butler et al., 2021). Hence, future studies should attempt to replicate these results in independent samples and further test potential theoretical interpretations.

BAG as a marker of cognition

Although FC-BAG was not associated with individual differences in a global cognitive composite within any of our analysis samples, greater structural and multimodal BAG estimates were associated with lower cognitive performance within the CI participants. Hence, these estimates may be sensitive markers of cognitive decline in the symptomatic stage. This finding is consistent with previous reports that other structural brain age estimates are associated with cognitive performance in AD (Eavani et al., 2018), Down syndrome (Cole et al., 2017a), HIV (Petersen et al., 2021; Petersen et al., 2022), as well as cognitively normal controls (Richard et al., 2018).

Limitations and future directions

The training sets included MRI scans from a range of sites, scanners, and acquisition sequence parameters, which may introduce noise and/or confounding variance into MRI features. We attempted to mitigate this problem by: (1) including only data from Siemens 3T scanners with similar protocols; (2) processing all MRI data through common pipelines and quality assessments; and (3) harmonizing across sites and scanners with ComBat (Fortin et al., 2017).

Additionally, the training set (N = 390) was relatively small compared to prior models, which have included training samples over 1,000 (e.g., Cole et al., 2015; Bashyam et al., 2020). Future studies may further improve model performance by including larger samples of well-characterized participants in the training set.

Although we took appropriate steps to detect and control for AD-related pathology in the CN/A− training sets, we were unable to control for other non-AD pathologies, e.g., Lewy body disease, TDP-43, etc., which may be present.

Structural MRI was quantified using the Desikan atlas (Desikan et al., 2006), which, although widely used, provides a relatively coarse parcellation of structural anatomy and, moreover, does not align with the parcellation used to define FC regions (Seitzman et al., 2020). Although the structural MRI data still outperformed FC in predicting age, future brain age models may further improve performance by using more refined and harmonized anatomical parcellations to define brain regions.

The sample sizes of the continuous biomarker and cognitive analyses differed across measures, depending on availability, and were particularly low for analyses of tau PET. Future studies might improve upon this approach by using a larger and more complete biomarker sample.

Moreover, estimates of BAG likely capture variance in early-life factors, which may obscure associations with AD and cognition, especially in cross-sectional designs (Vidal-Piñeiro et al., 2021). Future studies may improve the sensitivity of BAG estimates to disease-related markers by testing associations with longitudinal change.

Finally, although the Ances lab controls were relatively diverse, participants in other samples were mostly white and highly educated. Hence, these models may not be generalizable to broader samples. Future models would benefit by using more representative training samples.

Conclusions

We compared three MRI-based machine-learning models in their ability to predict age, as well as their sensitivity to early-stage AD, AD biomarkers, and cognition. Although FC and structural MRI models were both successful in detecting differences related to healthy aging and cognitive impairment, we note clear evidence that these modalities capture complementary signals. Specifically, FC-BAG was uniquely reduced in cognitively normal participants with elevated amyloid, although the interpretation of this finding still warrants further investigation. In contrast, structural BAG was uniquely associated with biomarkers of AD pathology and cognitive function within the CI participants. Finally, the multimodal age prediction model, which combined FC and structural MRI, further improved the prediction of healthy age differences and also was related to biomarkers and cognition in CI participants. Thus, multimodal brain age models may be useful for maximizing sensitivity to AD across the spectrum of disease progression.

Acknowledgements

We thank the participants for their dedication to this project, and Haleem Azmy, Anna Boerwinkle, and Dimitre Tomov for technical and processing support. This manuscript has been reviewed by DIAN Study investigators for scientific content and consistency of data interpretation with previous DIAN Study publications. We acknowledge the altruism of the participants and their families and the contributions of the DIAN research and support staff at each of the participating sites. We thank the personnel of the Administration, Biomarker, Biostatistics, Clinical, Genetics, and Neuroimaging Cores of the Knight ADRC, as well as the Administration, Biomarker, Biostatistics, Clinical, Cognition, Genetics, and Imaging Cores of DIAN. This research was funded by grants from the National Institutes of Health (P01-AG026276, P01-AG03991, P30-AG066444, 5-R01-AG052550, 5-R01-AG057680, 1-R01-AG067505, 1S10RR022984-01A1) and the BrightFocus Foundation (A2022014F), with generous support from the Paula and Rodger O Riney Fund and the Daniel J Brennan MD Fund. Data collection and sharing for this project was supported by The Dominantly Inherited Alzheimer Network (DIAN, U19-AG032438) funded by the National Institute on Aging (NIA), the Alzheimer’s Association (SG-20-690363-DIAN), the German Center for Neurodegenerative Diseases (DZNE), and the Raul Carrea Institute for Neurological Research (FLENI), with partial support from the Research and Development Grants for Dementia from the Japan Agency for Medical Research and Development (AMED), the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), the Spanish Institute of Health Carlos III (ISCIII), the Canadian Institutes of Health Research (CIHR), the Canadian Consortium of Neurodegeneration and Aging, the Brain Canada Foundation, and Fonds de Recherche du Québec – Santé.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Peter R Millar, Email: pmillar@wustl.edu.

Karla L Miller, University of Oxford, United Kingdom.

Jeannie Chin, Baylor College of Medicine, United States.

The Dominantly Inherited Alzheimer Network:

Adam Sarah, Allegri Ricardo, Araki Aki, Barthelemy Nicolas, Bateman Randall, Bechara Jacob, Benzinger Tammie, Berman Sarah, Bodge Courtney, Brandon Susan, Brooks William Bill, Brosch Jared, Buck Jill, Buckles Virginia, Carter Kathleen, Cash Lisa, Chen Charlie, Chhatwal Jasmeer, Mendez Patricio C, Chua Jasmin, Chui Helena, Courtney Laura, Cruchaga Carlos, Day Gregory S, DeLaCruz Chrismary, Denner Darcy, Diffenbacher Anna, Dincer Aylin, Donahue Tamara, Douglas Jane, Duong Duc, Egido Noelia, Esposito Bianca, Fagan Anne, Farlow Marty, Feldman Becca, Fitzpatrick Colleen, Flores Shaney, Fox Nick, Franklin Erin, Joseph-Mathurin Nelly, Fujii Hisako, Gardener Samantha, Ghetti Bernardino, Goate Alison, Goldberg Sarah, Goldman Jill, Gonzalez Alyssa, Gordon Brian, Gräber-Sultan Susanne, Graff-Radford Neill, Graham Morgan, Gray Julia, Gremminger Emily, Grilo Miguel, Groves Alex, Haass Christian, Häsler Lisa, Hassenstab Jason, Hellm Cortaiga, Herries Elizabeth, Hoechst-Swisher Laura, Hofmann Anna, Holtzman David, Hornbeck Russ, Igor Yakushev, Ihara Ryoko, Ikeuchi Takeshi, Ikonomovic Snezana, Ishii Kenji, Jack Clifford, Jerome Gina, Johnson Erik, Jucker Mathias, Karch Celeste, Käser Stephan, Kasuga Kensaku, Keefe Sarah, Klunk William, Koeppe Robert, Koudelis Deb, Kuder-Buletta Elke, Laske Christoph, Levey Allan, Levin Johannes, Li Yan, Lopez MD Oscar, Marsh Jacob, Martins Ralph, Mason Neal S, Masters Colin, Mawuenyega Kwasi, McCullough Austin, McDade Eric, Mejia Arlene, Morenas-Rodriguez Estrella, Morris John, Mountz James, Mummery Cath, Nadkarni Neelesh, Nagamatsu Akemi, Neimeyer Katie, Niimi Yoshiki, Noble James, Norton Joanne, Nuscher Brigitte, Obermüller Ulricke, O'Connor Antoinette, Patira Riddhi, Perrin Richard, Ping Lingyan, Preische Oliver, Renton Alan, Ringman John, Salloway Stephen, Schofield Peter, Senda Michio, Seyfried Nicholas T, Shady Kristine, Shimada Hiroyuki, Sigurdson Wendy, Smith Jennifer, Smith Lori, Snitz Beth, Sohrabi Hamid, Stephens Sochenda, Taddei Kevin, 
Thompson Sarah, Vöglein Jonathan, Wang Peter, Wang Qing, Weamer Elise, Xiong Chengjie, Xu Jinbin, and Xu Xiong

Funding Information

This paper was supported by the following grants:

  • National Institutes of Health P01-AG026276 to John C Morris.

  • National Institutes of Health P01-AG03991 to John C Morris.

  • National Institutes of Health P30-AG066444 to John C Morris.

  • National Institutes of Health 5-R01-AG052550 to Beau M Ances.

  • National Institutes of Health 5-R01-AG057680 to Beau M Ances.

  • National Institutes of Health U19-AG032438 to Randall J Bateman.

  • BrightFocus Foundation A2022014F to Peter R Millar.

  • Alzheimer's Association SG-20-690363-DIAN to Randall J Bateman.

Additional information

Competing interests

No competing interests declared.

Tammie LS Benzinger received doses (AV45, AV1451) and partial support for PET scanning through an investigator-initiated research grant awarded to Washington University from Avid Radiopharmaceuticals (a wholly-owned subsidiary of Eli Lilly and Company). The author received consulting fees from Eisai and Siemens, and received payment for the Biogen speaker's bureau. Tammie Benzinger acts as site investigator in clinical trials sponsored by Avid Radiopharmaceuticals, Eli Lilly and Company, Biogen, Eisai, Janssen and Roche. The author has no other competing interests to declare.

Carlos Cruchaga has received research support from Biogen, EISAI, Alector and Parabon. Carlos Cruchaga is a member of the advisory board of Vivid Genetics, Circular Genomics and Alector. The author has no other competing interests to declare.

has received consulting fees from DiamiR and Siemens Healthcare Diagnostics Inc and has received consulting fees for participation on Scientific advisory boards for Roche Diagnostics, Genentech and Diadem. The author has received travel support for in-person attendance at ABC-DS Meeting/Retreat and travel support/honorarium for in-person attendance at Scientific Advisory Board meeting for South Texas Alzheimer's Disease Research Center (ADRC). The author has no other competing interests to declare.

has received consulting fees from Roche and Parabon Nanolabs. The author has no other competing interests to declare.

Suzanne E Schindler received personal honoraria for presenting lectures from the University of Wisconsin, St. Luke's Hospital, and Houston Methodist Medical Center, personal honoraria for serving on the Alzheimer Disease Center Clinical Task Force from University of Washington, and personal honoraria for serving on the National Centralized Repository for Alzheimer's Disease biospecimen review committee from University of Indiana. The author received travel support from National Institute on Aging grant R01AG070941, and is a board member of the Greater Missouri Alzheimer's Association. The author received plasma Ab42/Ab40 data provided by C2N Diagnostics at no cost. No payments/research funding was provided by C2N Diagnostics. No gifts/financial incentives of any kind have been provided to Dr. Schindler by C2N Diagnostics. The author has no other competing interests to declare.

Gregory S Day received fees for consulting and for acting as Dementia Topic Editor from DynaMed (EBSCO Health) and received fees for consulting and grant writing/implementation from Parabon Nanolabs. The author received payment for CME content development from PeerView Media and Continuing Education Inc, payment for educational content development and focus group participation from Eli Lilly Co, and payment for continuum manuscript authorship from the American Academy of Neurology. The author received payment for expert testimony in the case of Wernicke encephalopathy from Barrow Law. Gregory S Day acts as Clinical Director for Anti-NMDA Receptor Encephalitis Foundation, Inc. The author has stock holdings at ANI Pharmaceuticals, Inc and stock options at Parabon Nanolabs. The author has no other competing interests to declare.

received grants from AbbVie, Eisai, Novartis, ADCS Posiphen, Genentech and Suven Life Sciences (no grant numbers available). The author has received consulting fees from Artery Therapeutics, Avanir, Biogen, Cyclo Therapeutics, Green Valley, Lexeo, McClena, Nervive, Oligomerix, Pinteon, Prothena, Vaxinity, Athira, AZTherapies, Cognition Therapeutics, Gemvax, Ionis, Longeveron, Merck, Neurotrope Biosciences, Otsuka, Proclara and SToP-AD. The author has no other competing interests to declare.

Randall J Bateman received funding and non-financial support for the DIAN-TU-001 trial from Avid Radiopharmaceuticals, and funding for the DIAN-TU-001 trial from Janssen, Hoffman La-Roche/Genentech, Eli Lilly & Co., Eisai, Biogen, AbbVie and Bristol Myers Squibb. The author has equity ownership interest in C2N Diagnostics and receives royalty income based on technology (stable isotope labeling kinetics and blood plasma assay) licensed by Washington University to C2N Diagnostics. The author received International Conference Lecture Honoraria from Korean Dementia Association and Conference Lecture Honoraria from Weill Cornell Medical College. The author received support for travel expenses from Alzheimer's Association Roundtable and Duke Margolis Alzheimer's Roundtable. The author participates on an unpaid Advisory Board for Roche Gantenerumab Steering Committee and Biogen - Combination Therapy for Alzheimer's Disease, and participates on an unpaid Scientific Advisory Board for UK Dementia Research Institute at University College London and Stanford University, Next Generation Translational Proteomics for Alzheimer's and Related Dementias. The author receives an income from C2N Diagnostics for serving on the scientific advisory board. The author has received equipment and materials from Avid Radiopharmaceuticals, Eli Lilly & Co, Hoffman La-Roche, Eisai and Janssen. Unrelated to this article, Randall Bateman serves as principal investigator of the DIAN-TU, which is supported by the Alzheimer's Association, GHR Foundation, an anonymous organization and the DIAN-TU Pharma Consortium (Active: Eli Lilly and Company/Avid Radiopharmaceuticals, F. Hoffman-La Roche/Genentech, Biogen, Eisai, and Janssen. Previous: Abbvie, Amgen, AstraZeneca, Forum, Mithridion, Novartis, Pfizer, Sanofi, and United Neuroscience). In addition, in-kind support has been received from CogState and Signant Health. Unrelated to this article, Randall Bateman has submitted the US nonprovisional patent application "Methods for Measuring the Metabolism of CNS Derived Biomolecules In Vivo" and provisional patent application "Plasma Based Methods for Detecting CNS Amyloid Deposition". The author has no other competing interests to declare.

has received consulting fees from Barcelona Brain Research Center BBRC and Native Alzheimer Disease-Related Resource Center in Minority Aging Research, Ext Adv Board. The author has received payment or honoraria for lectures from Montefiore Grand Rounds, NY and Tetra-Inst ADRC seminar series, Grand Rds, NY. The author has participated on the Research Strategy Council for the Cure Alzheimer's Fund, the Diverse VCID Observational Study Monitoring Board and the LEADS Advisory Board, Indiana University. The author has no other competing interests to declare.

Author contributions

Conceptualization, Software, Formal analysis, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing.

Resources, Supervision, Funding acquisition, Project administration, Writing – review and editing.

Conceptualization, Software, Methodology, Writing – review and editing.

Resources, Supervision, Funding acquisition, Project administration, Writing – review and editing.

Resources, Funding acquisition, Project administration, Writing – review and editing.

Resources, Funding acquisition, Project administration, Writing – review and editing.

Resources, Funding acquisition, Project administration, Writing – review and editing.

Resources, Funding acquisition, Project administration, Writing – review and editing.

Resources, Funding acquisition, Project administration, Writing – review and editing.

Resources, Funding acquisition, Project administration, Writing – review and editing.

Resources, Funding acquisition, Project administration, Writing – review and editing.

Resources, Funding acquisition, Project administration, Writing – review and editing.

Resources, Funding acquisition, Project administration, Writing – review and editing.

Resources, Funding acquisition, Project administration, Writing – review and editing.

Data curation.

Resources, Funding acquisition, Project administration, Writing – review and editing.

Resources, Funding acquisition, Project administration, Writing – review and editing.

Conceptualization, Resources, Supervision, Funding acquisition, Methodology, Project administration, Writing – review and editing.

Ethics

Human subjects: All participants provided written informed consent in accordance with the Declaration of Helsinki and their local institutional review board. All procedures were approved by the Human Research Protection Office at WUSTL (IRB ID # 201204041).

Additional files

Supplementary file 1. Summary of acquisition parameters for structural T1 and resting-state functional MRI.

TR = repetition time, TE = echo time.

elife-81869-supp1.docx (1.4MB, docx)
MDAR checklist

Data availability

This project utilized datasets obtained from the Knight ADRC and DIAN. The Knight ADRC and DIAN encourage and facilitate research by current and new investigators, and thus, the data and code are available to all qualified researchers after appropriate review. Requests for access to the data used in this study may be placed to the Knight ADRC Leadership Committee (https://knightadrc.wustl.edu/professionals-clinicians/request-center-resources/) and the DIAN Steering Committee (https://dian.wustl.edu/our-research/for-investigators/dian-observational-study-investigator-resources/data-request-form/). Requests for access to the Ances lab data may be placed to the corresponding author. Code used in this study is available at https://github.com/peterrmillar/MultimodalBrainAge (copy archived at swh:1:rev:de233b8fe813f5fcca317ce0a6353047f0dfbb92).

References

  1. Armitage SG. An analysis of certain psychological tests used for the evaluation of brain injury. Psychological Monographs. 1946;60:i1–i48. doi: 10.1037/h0093567.
  2. Aschenbrenner AJ, Gordon BA, Benzinger TLS, Morris JC, Hassenstab JJ. Influence of tau PET, amyloid PET, and hippocampal volume on cognition in Alzheimer disease. Neurology. 2018;91:e859–e866. doi: 10.1212/WNL.0000000000006075.
  3. Bashyam VM, Erus G, Doshi J, Habes M, Nasrallah IM, Truelove-Hill M, Srinivasan D, Mamourian L, Pomponio R, Fan Y, Launer LJ, Masters CL, Maruff P, Zhuo C, Völzke H, Johnson SC, Fripp J, Koutsouleris N, Satterthwaite TD, Wolf D, Gur RE, Gur RC, Morris J, Albert MS, Grabe HJ, Resnick S, Bryan RN, Wolk DA, Shou H, Davatzikos C. MRI signatures of brain age and disease over the lifespan based on a deep brain network and 14 468 individuals worldwide. Brain. 2020;143:2312–2324. doi: 10.1093/brain/awaa160.
  4. Bocancea DI, van Loenhoud AC, Groot C, Barkhof F, van der Flier WM, Ossenkoppele R. Measuring resilience and resistance in aging and Alzheimer disease using residual methods: a systematic review and meta-analysis. Neurology. 2021;97:474–488. doi: 10.1212/WNL.0000000000012499.
  5. Brier MR, Thomas JB, Ances BM. Network dysfunction in Alzheimer’s disease: refining the disconnection hypothesis. Brain Connectivity. 2014a;4:299–311. doi: 10.1089/brain.2014.0236.
  6. Brier MR, Thomas JB, Snyder AZ, Wang L, Fagan AM, Benzinger T, Morris JC, Ances BM. Unrecognized preclinical Alzheimer disease confounds rs-fcMRI studies of normal aging. Neurology. 2014b;83:1613–1619. doi: 10.1212/WNL.0000000000000939.
  7. Butler ER, Chen A, Ramadan R, Le TT, Ruparel K, Moore TM, Satterthwaite TD, Zhang F, Shou H, Gur RC, Nichols TE, Shinohara RT. Pitfalls in brain age analyses. Human Brain Mapping. 2021;42:4092–4101. doi: 10.1002/hbm.25533.
  8. Cabeza R, Albert M, Belleville S, Craik FIM, Duarte A, Grady CL, Lindenberger U, Nyberg L, Park DC, Reuter-Lorenz PA, Rugg MD, Steffener J, Rajah MN. Maintenance, reserve and compensation: the cognitive neuroscience of healthy ageing. Nature Reviews. Neuroscience. 2018;19:701–710. doi: 10.1038/s41583-018-0068-2.
  9. Cherubini A, Caligiuri ME, Peran P, Sabatini U, Cosentino C, Amato F. Importance of multimodal MRI in characterizing brain tissue and its potential application for individual age prediction. IEEE Journal of Biomedical and Health Informatics. 2016;20:1232–1239. doi: 10.1109/JBHI.2016.2559938.
  10. Chien DT, Bahri S, Szardenings AK, Walsh JC, Mu F, Su M-Y, Shankle WR, Elizarov A, Kolb HC. Early clinical PET imaging results with the novel PHF-tau radioligand [F-18]-T807. Journal of Alzheimer’s Disease. 2013;34:457–468. doi: 10.3233/JAD-122059.
  11. Cole JH, Leech R, Sharp DJ. Prediction of brain age suggests accelerated atrophy after traumatic brain injury. Annals of Neurology. 2015;77:571–581. doi: 10.1002/ana.24367.
  12. Cole JH, Annus T, Wilson LR, Remtulla R, Hong YT, Fryer TD, Acosta-Cabronero J, Cardenas-Blanco A, Smith R, Menon DK, Zaman SH, Nestor PJ, Holland AJ. Brain-predicted age in Down syndrome is associated with beta amyloid deposition and cognitive decline. Neurobiology of Aging. 2017a;56:41–49. doi: 10.1016/j.neurobiolaging.2017.04.006.
  13. Cole JH, Franke K. Predicting age using neuroimaging: innovative brain ageing biomarkers. Trends in Neurosciences. 2017b;40:681–690. doi: 10.1016/j.tins.2017.10.001.
  14. Cole JH, Underwood J, Caan MWA, De Francesco D, van Zoest RA, Leech R, Wit F, Portegies P, Geurtsen GJ, Schmand BA, Schim van der Loeff MF, Franceschi C, Sabin CA, Majoie C, Winston A, Reiss P, Sharp DJ, COBRA collaboration. Increased brain-predicted aging in treated HIV disease. Neurology. 2017c;88:1349–1357. doi: 10.1212/WNL.0000000000003790.
  15. Cole JH, Ritchie SJ, Bastin ME, Valdés Hernández MC, Muñoz Maniega S, Royle N, Corley J, Pattie A, Harris SE, Zhang Q, Wray NR, Redmond P, Marioni RE, Starr JM, Cox SR, Wardlaw JM, Sharp DJ, Deary IJ. Brain age predicts mortality. Molecular Psychiatry. 2018;23:1385–1392. doi: 10.1038/mp.2017.62.
  16. Desikan RS, Ségonne F, Fischl B, Quinn BT, Dickerson BC, Blacker D, Buckner RL, Dale AM, Maguire RP, Hyman BT, Albert MS, Killiany RJ. An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. NeuroImage. 2006;31:968–980. doi: 10.1016/j.neuroimage.2006.01.021.
  17. Dosenbach NUF, Nardos B, Cohen AL, Fair DA, Power JD, Church JA, Nelson SM, Wig GS, Vogel AC, Lessov-Schlaggar CN, Barnes KA, Dubis JW, Feczko E, Coalson RS, Pruett JR, Barch DM, Petersen SE, Schlaggar BL. Prediction of individual brain maturity using fMRI. Science. 2010;329:1358–1361. doi: 10.1126/science.1194144.
  18. Dunås T, Wåhlin A, Nyberg L, Boraxbekk CJ. Multimodal image analysis of apparent brain age identifies physical fitness as predictor of brain maintenance. Cerebral Cortex. 2021;31:3393–3407. doi: 10.1093/cercor/bhab019.
  19. Eavani H, Habes M, Satterthwaite TD, An Y, Hsieh MK, Honnorat N, Erus G, Doshi J, Ferrucci L, Beason-Held LL, Resnick SM, Davatzikos C. Heterogeneity of structural and functional imaging patterns of advanced brain aging revealed via machine learning methods. Neurobiology of Aging. 2018;71:41–50. doi: 10.1016/j.neurobiolaging.2018.06.013.
  20. Engemann DA, Kozynets O, Sabbagh D, Lemaître G, Varoquaux G, Liem F, Gramfort A. Combining magnetoencephalography with magnetic resonance imaging enhances learning of surrogate-biomarkers. eLife. 2020;9:e54055. doi: 10.7554/eLife.54055.
  21. Fagan AM, Mintun MA, Mach RH, Lee S-Y, Dence CS, Shah AR, LaRossa GN, Spinner ML, Klunk WE, Mathis CA, DeKosky ST, Morris JC, Holtzman DM. Inverse relation between in vivo amyloid imaging load and cerebrospinal fluid Abeta42 in humans. Annals of Neurology. 2006;59:512–519. doi: 10.1002/ana.20730.
  22. Fischl B, Salat DH, Busa E, Albert M, Dieterich M, Haselgrove C, van der Kouwe A, Killiany R, Kennedy D, Klaveness S, Montillo A, Makris N, Rosen B, Dale AM. Whole brain segmentation. Neuron. 2002;33:341–355. doi: 10.1016/S0896-6273(02)00569-X.
  23. Fischl B. FreeSurfer. NeuroImage. 2012;62:774–781. doi: 10.1016/j.neuroimage.2012.01.021.
  24. Fortin JP, Parker D, Tunç B, Watanabe T, Elliott MA, Ruparel K, Roalf DR, Satterthwaite TD, Gur RC, Gur RE, Schultz RT, Verma R, Shinohara RT. Harmonization of multi-site diffusion tensor imaging data. NeuroImage. 2017;161:149–170. doi: 10.1016/j.neuroimage.2017.08.047.
  25. Fortin JP, Cullen N, Sheline YI, Taylor WD, Aselcioglu I, Cook PA, Adams P, Cooper C, Fava M, McGrath PJ, McInnis M, Phillips ML, Trivedi MH, Weissman MM, Shinohara RT. Harmonization of cortical thickness measurements across scanners and sites. NeuroImage. 2018;167:104–120. doi: 10.1016/j.neuroimage.2017.11.024.
  26. Fox MD, Zhang D, Snyder AZ, Raichle ME. The global signal and observed anticorrelated resting state brain networks. J Neurophysiol. 2009;101:3270–3283. doi: 10.1152/jn.90777.2008.
  27. Franke K, Ziegler G, Klöppel S, Gaser C. Estimating the age of healthy subjects from T1-weighted MRI scans using kernel methods: exploring the influence of various parameters. NeuroImage. 2010;50:883–892. doi: 10.1016/j.neuroimage.2010.01.005.
  28. Franke K, Gaser C. Longitudinal changes in individual BrainAGE in healthy aging, mild cognitive impairment, and Alzheimer’s disease. GeroPsych. 2012;25:235–245. doi: 10.1024/1662-9647/a000074.
  29. Franke K, Gaser C, Manor B, Novak V. Advanced BrainAGE in older adults with type 2 diabetes mellitus. Frontiers in Aging Neuroscience. 2013;5:90. doi: 10.3389/fnagi.2013.00090.
  30. Franke K, Gaser C. Ten years of BrainAGE as a neuroimaging biomarker of brain aging: what insights have we gained? Frontiers in Neurology. 2019;10:789. doi: 10.3389/fneur.2019.00789.
  31. Frisoni GB, Fox NC, Jack CR, Scheltens P, Thompson PM. The clinical use of structural MRI in Alzheimer disease. Nature Reviews. Neurology. 2010;6:67–77. doi: 10.1038/nrneurol.2009.215.
  32. Gaser C, Franke K, Klöppel S, Koutsouleris N, Sauer H. BrainAGE in mild cognitive impaired patients: predicting the conversion to Alzheimer’s disease. PLOS ONE. 2013;8:e67346. doi: 10.1371/journal.pone.0067346.
  33. Gong W, Beckmann CF, Vedaldi A, Smith SM, Peng H. Optimising a simple fully convolutional network for accurate brain age prediction in the PAC 2019 challenge. Frontiers in Psychiatry. 2021;12:627996. doi: 10.3389/fpsyt.2021.627996.
  34. Gonneaud J, Baria AT, Binette AP, Gordon BA, Chhatwal JP, Cruchaga C. Accelerated functional brain aging in pre-clinical familial Alzheimer’s disease. Nat Commun. 2021;12:5346. doi: 10.1038/s41467-021-25492-9.
  35. Goodglass H, Kaplan E. Boston Diagnostic Aphasia Examination Booklet, III: Oral Expression: Animal Naming Fluency in Controlled Association. Philadelphia: Lea & Febiger; 1983.
  36. Goyal MS, Blazey TM, Su Y, Couture LE, Durbin TJ, Bateman RJ, Benzinger TLS, Morris JC, Raichle ME, Vlassenko AG. Persistent metabolic youth in the aging female brain. PNAS. 2019;116:3251–3255. doi: 10.1073/pnas.1815917116.
  37. Goyal MS, Blazey T, Metcalf NV, McAvoy MP, Strain J, Rahmani M, Durbin TJ, Xiong C, Benzinger TLS, Morris JC, Raichle ME, Vlassenko AG. Brain Aerobic Glycolysis and Resilience in Alzheimer Disease. bioRxiv. 2022. doi: 10.1101/2022.06.21.497006.
  38. Grober E, Buschke H, Crystal H, Bang S, Dresner R. Screening for dementia by memory testing. Neurology. 1988;38:900–903. doi: 10.1212/wnl.38.6.900.
  39. Guo T, Korman D, La Joie R, Shaw LM, Trojanowski JQ, Jagust WJ, Landau SM, Alzheimer’s Disease Neuroimaging Initiative. Normalization of CSF ptau measurement by Aβ40 improves its performance as a biomarker of Alzheimer’s disease. Alzheimer’s Research & Therapy. 2020;12:97. doi: 10.1186/s13195-020-00665-8.
  40. Hansson O, Lehmann S, Otto M, Zetterberg H, Lewczuk P. Advantages and disadvantages of the use of the CSF amyloid β (Aβ) 42/40 ratio in the diagnosis of Alzheimer’s disease. Alzheimer’s Research & Therapy. 2019;11:34. doi: 10.1186/s13195-019-0485-0.
  41. Harris SS, Wolf F, De Strooper B, Busche MA. Tipping the scales: peptide-dependent dysregulation of neural circuit dynamics in Alzheimer’s disease. Neuron. 2020;107:417–435. doi: 10.1016/j.neuron.2020.06.005.
  42. Hwang G, Abdulkadir A, Erus G, Habes M, Pomponio R, Shou H, Doshi J, Mamourian E, Rashid T, Bilgel M, Fan Y, Sotiras A, Srinivasan D, Morris JC, Albert MS, Bryan NR, Resnick SM, Nasrallah IM, Davatzikos C, Wolk DA, from the iSTAGING consortium, ADNI. Disentangling Alzheimer’s disease neurodegeneration from typical brain ageing using machine learning. Brain Communications. 2022;4:fcac117. doi: 10.1093/braincomms/fcac117.
  43. Jagust WJ, Mormino EC. Lifespan brain activity, β-amyloid, and Alzheimer’s disease. Trends in Cognitive Sciences. 2011;15:520–526. doi: 10.1016/j.tics.2011.09.004.
  44. Johnson WE, Li C, Rabinovic A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics. 2007;8:118–127. doi: 10.1093/biostatistics/kxj037.
  45. Jones DT, Knopman DS, Gunter JL, Graff-Radford J, Vemuri P, Boeve BF, Petersen RC, Weiner MW, Jack CR. Cascading network failure across the Alzheimer’s disease spectrum. Brain. 2016;139:547–562. doi: 10.1093/brain/awv338.
  46. Jones DT, Graff-Radford J, Lowe VJ, Wiste HJ, Gunter JL, Senjem ML, Botha H, Kantarci K, Boeve BF, Knopman DS, Petersen RC, Jack CR. Tau, amyloid, and cascading network failure across the Alzheimer’s disease spectrum. Cortex; a Journal Devoted to the Study of the Nervous System and Behavior. 2017;97:143–159. doi: 10.1016/j.cortex.2017.09.018.
  47. Klunk WE, Engler H, Nordberg A, Wang Y, Blomqvist G, Holt DP, Bergström M, Savitcheva I, Huang G, Estrada S, Ausén B, Debnath ML, Barletta J, Price JC, Sandell J, Lopresti BJ, Wall A, Koivisto P, Antoni G, Mathis CA, Långström B. Imaging brain amyloid in Alzheimer’s disease with Pittsburgh compound-B. Annals of Neurology. 2004;55:306–319. doi: 10.1002/ana.20009.
  48. Koutsouleris N, Davatzikos C, Borgwardt S, Gaser C, Bottlender R, Frodl T, Falkai P, Riecher-Rössler A, Möller H-J, Reiser M, Pantelis C, Meisenzahl E. Accelerated brain aging in schizophrenia and beyond: a neuroanatomical marker of psychiatric disorders. Schizophrenia Bulletin. 2014;40:1140–1153. doi: 10.1093/schbul/sbt142.
  49. Le TT, Kuplicki RT, McKinney BA, Yeh HW, Thompson WK, Paulus MP, Tulsa 1000 Investigators. A nonlinear simulation framework supports adjusting for age when analyzing BrainAGE. Frontiers in Aging Neuroscience. 2018;10:317. doi: 10.3389/fnagi.2018.00317.
  50. Lee J, Burkett BJ, Min HK, Senjem ML, Lundt ES, Botha H, Graff-Radford J, Barnard LR, Gunter JL, Schwarz CG, Kantarci K, Knopman DS, Boeve BF, Lowe VJ, Petersen RC, Jack CR, Jones DT. Deep learning-based brain age prediction in normal aging and dementia. Nature Aging. 2022;2:412–424. doi: 10.1038/s43587-022-00219-7.
  51. Liang H, Zhang F, Niu X. Investigating systematic bias in brain age estimation with application to post-traumatic stress disorders. Human Brain Mapping. 2019;40:3143–3152. doi: 10.1002/hbm.24588.
  52. Liem F, Varoquaux G, Kynast J, Beyer F, Kharabian Masouleh S, Huntenburg JM, Lampe L, Rahim M, Abraham A, Craddock RC, Riedel-Heller S, Luck T, Loeffler M, Schroeter ML, Witte AV, Villringer A, Margulies DS. Predicting brain-age from multimodal imaging data captures cognitive impairment. NeuroImage. 2017;148:179–188. doi: 10.1016/j.neuroimage.2016.11.005.
  53. Ly M, Yu GZ, Karim HT, Muppidi NR, Mizuno A, Klunk WE, Aizenstein HJ. Improving brain age prediction models: incorporation of amyloid status in Alzheimer’s disease. Neurobiology of Aging. 2020;87:44–48. doi: 10.1016/j.neurobiolaging.2019.11.005.
  54. MathWorks. Regression learner app. 1.2.1 [Internet]. 2021. https://www.mathworks.com/help/stats/regression-learner-app.html
  55. McKay NS, Gordon BA, Hornbeck RC, Jack CR, Koeppe R, Flores S, Keefe S, Hobbs DA, Joseph-Mathurin N, Wang Q, Rahmani F, Chen CD, McCullough A, Koudelis D, Chua J, Ances BM, Millar PR, Nickels M, Perrin RJ, Allegri RF, Berman SB, Brooks WS, Cash DM, Chhatwal JP, Farlow MR, Fox NC, Fulham M, Ghetti B, Graff-Radford N, Ikeuchi T, Day G, Klunk W, Levin J, Lee JH, Martins R, Masters CL, McConathy J, Mori H, Noble JM, Rowe C, Salloway S, Sanchez-Valle R, Schofield PR, Shimada H, Shoji M, Su Y, Suzuki K, Vöglein J, Yakushev I, Swisher L, Cruchaga C, Hassenstab J, Karch C, McDade E, Xiong C, Morris JC, Bateman RJ, Benzinger TLS, Dominantly Inherited Alzheimer Network. Neuroimaging within the Dominantly Inherited Alzheimer’s Network (DIAN): PET and MRI. bioRxiv. 2022. doi: 10.1101/2022.03.25.485799.
  56. McKhann GM, Knopman DS, Chertkow H, Hyman BT, Jack CR, Kawas CH, Klunk WE, Koroshetz WJ, Manly JJ, Mayeux R, Mohs RC, Morris JC, Rossor MN, Scheltens P, Carrillo MC, Thies B, Weintraub S, Phelps CH. The diagnosis of dementia due to Alzheimer’s disease: recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimer’s & Dementia. 2011;7:263–269. doi: 10.1016/j.jalz.2011.03.005.
  57. Millar PR, Ances BM, Gordon BA, Benzinger TLS, Morris JC, Balota DA. Evaluating cognitive relationships with resting-state and task-driven blood oxygen level-dependent variability. Journal of Cognitive Neuroscience. 2021;33:279–302. doi: 10.1162/jocn_a_01645.
  58. Millar PR, Luckett PH, Gordon BA, Benzinger TLS, Schindler SE, Fagan AM, Cruchaga C, Bateman RJ, Allegri R, Jucker M, Lee JH, Mori H, Salloway SP, Yakushev I, Morris JC, Ances BM, Dominantly Inherited Alzheimer Network. Predicting brain age from functional connectivity in symptomatic and preclinical Alzheimer disease. NeuroImage. 2022;256:119228. doi: 10.1016/j.neuroimage.2022.119228.
  59. Mishra S, Gordon BA, Su Y, Christensen J, Friedrichsen K, Jackson K, Hornbeck R, Balota DA, Cairns NJ, Morris JC, Ances BM, Benzinger TLS. AV-1451 PET imaging of tau pathology in preclinical Alzheimer disease: defining a summary measure. NeuroImage. 2017;161:171–178. doi: 10.1016/j.neuroimage.2017.07.050.
  60. Morris JC. The clinical dementia rating (CDR): current version and scoring rules. Neurology. 1993;43:2412–2414. doi: 10.1212/wnl.43.11.2412-a.
  61. Nielsen AN, Greene DJ, Gratton C, Dosenbach NUF, Petersen SE, Schlaggar BL. Evaluating the prediction of brain maturity from functional connectivity after motion artifact denoising. Cerebral Cortex. 2019;29:2455–2469. doi: 10.1093/cercor/bhy117.
  62. Petersen KJ, Metcalf N, Cooley S, Tomov D, Vaida F, Paul R, Ances BM. Accelerated brain aging and cerebral blood flow reduction in persons with human immunodeficiency virus. Clinical Infectious Diseases. 2021;73:1813–1821. doi: 10.1093/cid/ciab169.
  63. Petersen KJ, Strain JF, Cooley SA, Vaida FF, Ances BM. Machine learning quantifies accelerated white-matter aging in persons with HIV. The Journal of Infectious Diseases. 2022;226:49–58. doi: 10.1093/infdis/jiac156.
  64. Power JD, Barnes KA, Snyder AZ, Schlaggar BL, Petersen SE. Spurious but systematic correlations in functional connectivity MRI networks arise from subject motion. NeuroImage. 2012;59:2142–2154. doi: 10.1016/j.neuroimage.2011.10.018.
  65. Ranasinghe KG, Verma P, Cai C, Xie X, Kudo K, Gao X, Lerner H, Mizuiri D, Strom A, Iaccarino L, La Joie R, Miller BL, Gorno-Tempini ML, Rankin KP, Jagust WJ, Vossel K, Rabinovici GD, Raj A, Nagarajan SS. Altered excitatory and inhibitory neuronal subpopulation parameters are distinctly associated with tau and amyloid in Alzheimer’s disease. eLife. 2022;11:e77850. doi: 10.7554/eLife.77850.
  66. Rasmussen CE, von Luxburg U, Rätsch G. In: Advanced Lectures on Machine Learning. Carbonell JG, Siekmann J, editors. Berlin, Heidelberg: Springer-Verlag; 2004. Advanced lectures on machine learning; pp. 63–71.
  67. R Development Core Team. Vienna, Austria: R Foundation for Statistical Computing; 2020. https://www.r-project.org/
  68. Richard G, Kolskår K, Sanders A-M, Kaufmann T, Petersen A, Doan NT, Monereo Sánchez J, Alnæs D, Ulrichsen KM, Dørum ES, Andreassen OA, Nordvik JE, Westlye LT. Assessing distinct patterns of cognitive aging using tissue-specific brain age prediction based on diffusion tensor imaging and brain morphometry. PeerJ. 2018;6:e5908. doi: 10.7717/peerj.5908.
  69. Satterthwaite TD, Wolf DH, Loughead J, Ruparel K, Elliott MA, Hakonarson H, Gur RC, Gur RE. Impact of in-scanner head motion on multiple measures of functional connectivity: relevance for studies of neurodevelopment in youth. NeuroImage. 2012;60:623–632. doi: 10.1016/j.neuroimage.2011.12.063.
  70. Schindler SE, Gray JD, Gordon BA, Xiong C, Batrla-Utermann R, Quan M, Wahl S, Benzinger TLS, Holtzman DM, Morris JC, Fagan AM. Cerebrospinal fluid biomarkers measured by elecsys assays compared to amyloid imaging. Alzheimer’s & Dementia. 2018;14:1460–1469. doi: 10.1016/j.jalz.2018.01.013.
  71. Schultz AP, Chhatwal JP, Hedden T, Mormino EC, Hanseeuw BJ, Sepulcre J, Huijbers W, LaPoint M, Buckley RF, Johnson KA, Sperling RA. Phases of hyperconnectivity and hypoconnectivity in the default mode and salience networks track with amyloid and tau in clinically normal individuals. The Journal of Neuroscience. 2017;37:4323–4331. doi: 10.1523/JNEUROSCI.3263-16.2017.
  72. Seitzman BA, Gratton C, Marek S, Raut RV, Dosenbach NUF, Schlaggar BL, Petersen SE, Greene DJ. A set of functionally-defined brain regions with improved representation of the subcortex and cerebellum. NeuroImage. 2020;206:116290. doi: 10.1016/j.neuroimage.2019.116290.
  73. Shulman GL, Pope DLW, Astafiev SV, McAvoy MP, Snyder AZ, Corbetta M. Right hemisphere dominance during spatial selective attention and target detection occurs outside the dorsal frontoparietal network. J Neurosci. 2010;30:3640–3651. doi: 10.1523/JNEUROSCI.4085-09.2010.
  74. Smith SM, Vidaurre D, Alfaro-Almagro F, Nichols TE, Miller KL. Estimation of brain age delta from brain imaging. NeuroImage. 2019;200:528–539. doi: 10.1016/j.neuroimage.2019.06.017.
  75. Sperling RA, Aisen PS, Beckett LA, Bennett DA, Craft S, Fagan AM. Toward defining the preclinical stages of Alzheimer’s disease: recommendations from the National Institute on Aging and the Alzheimer’s Association workgroup. Alzheimers Dement. 2011;7:280–292. doi: 10.1016/j.jalz.2011.03.003.
  76. Su Y, D’Angelo GM, Vlassenko AG, Zhou G, Snyder AZ, Marcus DS, Blazey TM, Christensen JJ, Vora S, Morris JC, Mintun MA, Benzinger TLS. Quantitative analysis of PIB-PET with FreeSurfer ROIs. PLOS ONE. 2013;8:e73377. doi: 10.1371/journal.pone.0073377.
  77. Su Y, Flores S, Hornbeck RC, Speidel B, Vlassenko AG, Gordon BA, Koeppe RA, Klunk WE, Xiong C, Morris JC, Benzinger TLS. Utilizing the centiloid scale in cross-sectional and longitudinal PIB PET studies. NeuroImage. Clinical. 2018;19:406–416. doi: 10.1016/j.nicl.2018.04.022.
  78. Su Y, Flores S, Wang G, Hornbeck RC, Speidel B, Joseph-Mathurin N, Vlassenko AG, Gordon BA, Koeppe RA, Klunk WE, Jack CR, Farlow MR, Salloway S, Snider BJ, Berman SB, Roberson ED, Brosch J, Jimenez-Velazques I, van Dyck CH, Galasko D, Yuan SH, Jayadev S, Honig LS, Gauthier S, Hsiung GYR, Masellis M, Brooks WS, Fulham M, Clarnette R, Masters CL, Wallon D, Hannequin D, Dubois B, Pariente J, Sanchez-Valle R, Mummery C, Ringman JM, Bottlaender M, Klein G, Milosavljevic-Ristic S, McDade E, Xiong C, Morris JC, Bateman RJ, Benzinger TLS. Comparison of Pittsburgh compound B and florbetapir in cross-sectional and longitudinal studies. Alzheimer’s & Dementia. 2019;11:180–190. doi: 10.1016/j.dadm.2018.12.008.
  79. Thomas JB, Brier MR, Snyder AZ, Vaida FF, Ances BM. Pathways to neurodegeneration: effects of HIV and aging on resting-state functional connectivity. Neurology. 2013;80:1186–1193. doi: 10.1212/WNL.0b013e318288792b.
  80. Van Dijk KRA, Sabuncu MR, Buckner RL. The influence of head motion on intrinsic functional connectivity MRI. NeuroImage. 2012;59:431–438. doi: 10.1016/j.neuroimage.2011.07.044.
  81. Vidal-Piñeiro D, Wang Y, Krogsrud SK, Amlien IK, Baaré WFC, Bartrés-Faz D, Bertram L, Brandmaier AM, Drevon CA, Düzel S, Ebmeier KP, Henson RN, Junque C, Kievit RA, Kühn S, Leonardsen E, Lindenberger U, Madsen KS, Magnussen F, Mowinckel AM, Nyberg L, Roe JM, Segura B, Smith SM, Sørensen Ø, Suri S, Westerhausen R, Zalesky A, Zsoldos E, Walhovd KB, Fjell AM, the Australian Imaging Biomarkers and Lifestyle flagship study of ageing. Individual Variations in “Brain Age” Relate to Early Life Factors More than to Longitudinal Brain Change. bioRxiv. 2021. doi: 10.1101/2021.02.08.428915.
  82. Vlassenko AG, McCue L, Jasielec MS, Su Y, Gordon BA, Xiong C, Holtzman DM, Benzinger TLS, Morris JC, Fagan AM. Imaging and cerebrospinal fluid biomarkers in early preclinical Alzheimer disease. Annals of Neurology. 2016;80:379–387. doi: 10.1002/ana.24719.
  83. Volluz KE, Schindler SE, Henson RL, Xiong C, Gordon BA, Benzinger TLS, Holtzman DM, Morris JC, Fagan AM. Correspondence of CSF biomarkers measured by lumipulse assays with amyloid PET. Alzheimer’s & Dementia. 2021;17:S5. doi: 10.1002/alz.051085.
  84. Wales RM, Leung HC. The effects of amyloid and tau on functional network connectivity in older populations. Brain Connectivity. 2021;11:599–612. doi: 10.1089/brain.2020.0902.
  85. Wang J, Knol MJ, Tiulpin A, Dubost F, de Bruijne M, Vernooij MW, Adams HHH, Ikram MA, Niessen WJ, Roshchupkin GV. Gray matter age prediction as a biomarker for risk of dementia. PNAS. 2019;116:21213–21218. doi: 10.1073/pnas.1902376116.
  86. Wong DF, Rosenberg PB, Zhou Y, Kumar A, Raymont V, Ravert HT, Dannals RF, Nandi A, Brasić JR, Ye W, Hilton J, Lyketsos C, Kung HF, Joshi AD, Skovronsky DM, Pontecorvo MJ. In vivo imaging of amyloid deposition in Alzheimer disease using the radioligand 18F-AV-45 (florbetapir [corrected] F 18). Journal of Nuclear Medicine. 2010;51:913–920. doi: 10.2967/jnumed.109.069088.
  87. Yu M, Linn KA, Cook PA, Phillips ML, McInnis M, Fava M, Trivedi MH, Weissman MM, Shinohara RT, Sheline YI. Statistical harmonization corrects site effects in functional connectivity measurements from multi-site fMRI data. Human Brain Mapping. 2018;39:4213–4227. doi: 10.1002/hbm.24241.

Editor's evaluation

Karla L Miller

This is a useful study exploring multi-modality brain age (structural plus resting state MRI) in people in the early stages or at risk of Alzheimer's disease. The authors found solid evidence that people with cognitive impairment had older-appearing brains and that older-appearing brains were related to Alzheimer's risk factors such as amyloid and tau deposition. Their data suggest that the multi-modality brain age model is more accurate than a unimodal structural MRI model.

Decision letter

Editor: Karla L Miller
Reviewed by: James Cole, Didac Vidal-Pineiro

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Multimodal brain age estimates relate to Alzheimer disease biomarkers and cognition in early stages: a cross-sectional observational study" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Jeannie Chin as the Senior Editor. The following individuals involved in the review of your submission have agreed to reveal their identity: James Cole (Reviewer #1); Didac Vidal-Pineiro (Reviewer #3).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

The overall consensus from the reviewers is that some of the hypotheses (and conclusions) are supported by the results (specifically, the Vol-BAG model) while others are much weaker (specifically, the FC-BAG models). In consultation between the reviewers, they agreed that the work is "useful" but found the evidence to be "incomplete" for the conclusions as presented.

1. The functional connectivity model provides a poor fit to the data. Given that the FC-based models have been recently published (Millar 2022), the reviewers feel that the authors should temper claims about the importance of the FC-based modelling.

2. The reviewers are skeptical about claims regarding the improvements afforded by multi-modal brain age models. In particular, the bootstrapping analyses do not actually support the claims that FC data improve the quality of brain age modelling.

3. Overall, the reviewers feel that some of the conclusions, such as the biphasic relationships between functional brain-age models and pathological status, are not strongly supported and need to be tempered. In particular, the reviewers object to referring to results as "marginal".

4. Discussion about the potential implications of sample size would be welcome.

Reviewer #1 (Recommendations for the authors):

– In the Methods, they say they used a Gaussian mixture model to define pTau positivity. There are multiple ways to implement GMMs, so more details should be included here.

– The presentation of the MRI Acquisition section in the Methods is not very clear. I suggest the authors consider an alternative format, possibly a supplementary table, where the acquisition details can be more easily appraised. Currently, the acquisition details on the DIAN participants are scarce relative to the ADRC participants.

– Can the authors explain and justify why the fMRI processing included registration to an older-adult template? Could this have caused a bias in the registration accuracy for younger participants?

– It is unclear to me why they chose to perform 10-fold CV and hold-out validation with 1000 bootstraps. To my mind, the latter would have been sufficient. If the authors think including the initial 10-fold CV as well is important, this should be clearly justified.

– It is important that R² is reported for each model performance, not just MAE. As R² is a ratio the values can readily be compared across published studies, while the MAE cannot as it is heavily dependent on the age distribution of the test set. For completeness, they could also consider reporting the Pearson's r correlation between predicted age and age, and the root mean square error as well.

– It is unclear how the model performance comparisons were conducted (Results, pg. 12). While t-tests are mentioned in the text, the exact details should be included in the Methods. My concern here is that the n (sample size) for these comparisons is based on the number of bootstraps (arbitrarily determined by the authors to be 1000), rather than the actual sample size. If that is the case (and Figure 1D suggests it is), this procedure is incorrect as the sensitivity that these tests have to detect differences would be purely a factor of the number of bootstraps, rather than the number of observations. This means that the experimenter can simply choose to make smaller differences 'significant' simply by adding more bootstraps. This needs to be clarified and corrected if appropriate. One approach to achieve the goal of comparing model performances is to take Pearson's correlations with age from each model and use Z-transformations to test the alternative hypothesis that the correlations are different (e.g., the Steiger test). In that way, the n would be determined by the number of observations, so statistical power would appropriately reflect the data.

– I recommend avoiding saying things like 'marginally lower' when a p-value = 0.110. There's no real evidence that there's a difference here, so hard to say whether it's truly lower or not. Generally avoiding 'trends' at 0.1> p >0.05 is best practice. P-values are important, but effect sizes (with confidence intervals) are often more informative.

– In the Discussion, when comparing age prediction accuracy between studies, it's important not to rely on MAE alone as this can vary greatly as a function of the test set age distribution. They should use R² instead. Where R² is unavailable, it's essential that the age range of each study mentioned in comparison is reported to provide context to the MAE values.

– The evidence for a biphasic relationship between FC-BAG and pre-clinical/clinical status is somewhat over-interpreted, particularly given there was no difference between A+T- and A+T+ people (p=0.11) and the fit of FC brain age is quite poor (i.e., far from the line of identity in Figure 2A). I suggest more caution when discussing this.

– A key limitation that was not mentioned was the small sample size relative to other studies. Perhaps the model performance is similar but given that only MAE is used to compare studies it is hard to draw meaningful conclusions. My impression is that had larger datasets been available, then performance would have improved.

Reviewer #2 (Recommendations for the authors):

– As explained in the previous section, the FC-BAG model has very limited prediction power, and therefore the results from the FC-BAG model are not reliable while providing marginal benefit. The FC-BAG results should be moved to the supplementary materials.

– For the FC-BAG models and their relation to other clinical variables, please also include another version of the model with mean, median, and maximum head motion during the entire rsfMRI scan as covariates, to further ensure the reliability of the results.

– It is not clear to me that the bootstrap-based t-test provides evidence in favor of the Vol+FC-BAG model. In other words, a stacked model combining FC-BAG and Vol-BAG will always perform as well as or better than each model. If the stacking approach takes this into account (not clear in the method section, needs further explanation), the marginal increase in performance can be explained by this unidirectional effect and needs further confirmation based on a model selection step (e.g. using new independent data not used in the training-validation of the FC-BAG and Vol-BAG models to compare the Vol+FC-BAG and Vol-BAG models).

– After the previous step authors can choose the best performing model (either Vol-BAG or Vol+FC-BAG model) and only present the data for the selected model since results between the two models are redundant and don't add extra information to the reader.

– The analysis of hippocampal volume (especially related to preclinical AD) needs to be confirmed. To do so, hippocampal volume as well as volumetric features from regions highly correlated with hippocampal volume should be removed from the feature set of Vol-BAG and Vol+FC-BAG models. The models need to be retrained using the same procedure. The relationship between hippocampal volume and the newly calculated Vol-BAG and Vol+FC-BAG values should be reported alongside the current results.

Reviewer #3 (Recommendations for the authors):

Find below some recommendations on how (I think) the science in this manuscript might be improved in no particular order.

1. Training sample. It is unclear why one would like to minimize undetected AD pathology (amyloid positivity, that is) in the cognitively healthy training sample as many of these individuals (when Tau negative) have minimal changes in brain structure and function. Since you create a BA "norm" from these individuals, one may benefit from including a bigger, more representative sample using more lenient inclusion criteria. Decisions regarding the training sample can have a big impact on the subsequent interpretation of BA results (e.g. Hwang, 2022, Brain Comm).

2. Group descriptors. It is still a matter of ongoing debate, but I recommend using another descriptor for the amyloid positive group rather than "preclinical AD". Even in the NIA-AA Research Framework from 2018 (Jack Jr.) they only use this tag for individuals that are amyloid and tau positive.

3. Biomarker definition. I am not an expert on biomarkers, but the definition of pTau positivity is uncommon to me: "Gaussian mixture model approach to defining pTau positivity based on the CSF pTau/Aβ40 ratio." Could the authors justify it and/or cite the corresponding references?

4. Statistical analysis. If I have not misread, the methods section only mentions three test groups (A-, A+, and CDR>0) but the analysis is performed with four groups. This leads to confusion and should be corrected. Also, most higher-level analyses reported in the results are not described in this section. These analyses should be described in the methods section. It is difficult to evaluate whether the performed analyses are appropriate without this description. For example, (lines 323-7) the authors report three different regression models and then a fourth analysis combining the four groups, but only for FC-BAG. This procedure is unclear, not described (as far as I can see), and not justified. Another example is the analysis with NFL which is not mentioned until line 412 (p.20) in the Results section. Also, the authors use different samples for different tests, due to the lack of Biomarker information for some individuals. I suggest adding degrees of freedom/n when reporting the results, so the reader has some information regarding the sample used.

5. The authors are repeating the same analysis in three different modalities (also sometimes they repeat the analyses across several pairs of groups [e.g. lines 323-7]). Thus, I would strongly recommend using some type of multiple comparison corrections.

6. Table 2. The authors should mention what the units in the table represent. Also, I recommend adding df and exact significance values (at least if p >.001).

7. Atlas. The authors used the D-K atlas (not strictly the FS-defined) for BA computation. This is a suboptimal choice, and I would recommend in the future using more fine-grained parcellations. This is not a strong issue, but the choice surprised me since the authors used a 300-ROI parcellation for the rs-fMRI. Also, since the authors use cortical thickness for sampling the cortex, I would not use "Volumetric"-BA as a descriptor.

8. Movement and rs-fMRI. The rs-fMRI preprocessing used might still lead to a signal that is related to movement. Since movement is almost always related to age and disease [and thus can affect both the BA computation and the tests in the test sample], I would suggest taking additional steps in this regard. At the minimum, I would include total motion as an additional covariate in the higher-level analysis and discuss this issue in the limitations section.

9. The results in cognitively healthy samples are largely negative (i.e. do not differ with groups). One possible explanation is that the authors are using cross-sectional samples and thus – even when using BA metrics – have a signal that captures ongoing aging (accelerated aging, if you wish) and baseline (lifelong, preexisting) variability between individuals. The latter may obscure possible existing effects. I recommend the authors acknowledge the limitations of using cross-sectional data to study changes that ought to be longitudinal.

eLife. 2023 Jan 6;12:e81869. doi: 10.7554/eLife.81869.sa2

Author response


Essential revisions:

The overall consensus from the reviewers is that some of the hypotheses (and conclusions) are supported by the results (specifically, the Vol-BAG model) while others are much weaker (specifically, the FC-BAG models). In consultation between the reviewers, they agreed that the work is "useful" but found the evidence to be "incomplete" for the conclusions as presented.

1. The functional connectivity model provides a poor fit to the data. Given that the FC-based models have been recently published (Millar 2022), the reviewers feel that the authors should temper claims about the importance of the FC-based modelling.

Although the reviewers are correct that the FC model indeed provided a relatively poor fit, compared to structural MRI data, a major goal of this project was to test whether each modality (structural MRI and FC) captures unique patterns related to AD progression. As we are primarily motivated to evaluate these models in their associations with AD, it is important to consider that the most accurate BAG models for age prediction are not necessarily the ones that are most sensitive to disease. Indeed, at least one study suggests that models with “moderate” age prediction accuracy might be the most useful in detecting deviation related to disease, as compared to overly “loose” or “tight” age prediction models (Bashyam et al., 2020). We now justify our motivations more clearly in the “Introduction”:

“This project aimed to develop multimodal models of brain-predicted age, incorporating both FC and structural MRI. Participants with presymptomatic AD pathology were excluded from the training set to maximize sensitivity. We hypothesized that BAG estimates would be sensitive to the presence of AD biomarkers and early cognitive impairment. We further considered whether estimates were continuously associated with AD biomarkers of amyloid and tau, as well as cognition. We hypothesized that FC and structural MRI would capture complementary signals related to age and AD. Thus, we systematically compared models trained on unimodal FC, structural MRI, and combined modalities, to test the added utility of multimodal integration in accurately predicting age and whether each modality captures unique relationships with AD biomarkers and cognition.”

Moreover, in the current revision, we aim to focus the discussion on novel associations with this biphasic FC pattern (including the tests of continuous associations with biomarkers and cognition), rather than recapitulating the previously published finding. We also discuss the potential relevance of this result to emerging results from MEG (Ranasinghe et al., 2022) and metabolic PET studies (Goyal et al., 2022). Finally, we now also acknowledge the poor prediction performance of the FC model as a potential spurious explanation of these findings. The discussion of this result now tempers prior claims about the importance of FC:

“As we have previously reported (31), FC-BAG was lower in presymptomatic AD participants compared to amyloid-negative controls. Extending beyond this group difference, we now also note that FC-BAG was negatively associated with amyloid PET in CN/A+ participants. The combined reduction of FC-BAG in the presymptomatic phase and increase in the symptomatic phase suggest a biphasic functional response to AD progression, which is partially consistent with some prior suggestions (77–81) (see ref 31 for a more detailed discussion).

Interpretation of this biphasic pattern is still unclear, although the present results provide at least one novel insight. Specifically, one potential interpretation is that the “younger” appearing FC pattern in the preclinical phase may reflect a compensatory response to early AD pathology (82). This interpretation leads to the prediction that reduced FC-BAG should be associated with better cognitive performance in the preclinical stage. However, this interpretation is not supported by the current results, as FC-BAG did not correlate with cognition in any of the analysis samples.

Alternatively, pathological AD-related FC disruptions may be orthogonal to healthy age-related FC differences, as supported by our previous observation that age and AD are predicted by mostly non-overlapping FC networks (31). For instance, the “younger” FC pattern in CN/A+ participants may be driven by hyper-excitability in the preclinical stage (83,84). It is also worth considering that patterns of younger FC-BAG in CN/A+ participants may somehow correspond to a recent observation that patterns of youthful-appearing aerobic glycolysis are relatively preserved in the preclinical stage of AD (85). Finally, this effect may simply be spuriously driven by poor performance of the FC brain age model, sample-specific noise, and/or statistical artifacts related to regression dilution and its correction (71). Hence, future studies should attempt to replicate these results in independent samples and further test potential theoretical interpretations.”

2. The reviewers are skeptical about claims regarding the improvements afforded by multi-modal brain age models. In particular, the bootstrapping analyses do not actually support the claims that FC data improve the quality of brain age modelling.

We thank the reviewers for pointing out this flaw in the comparison of model performance. We now test for significant differences between z-transformed Pearson correlations with age in each model using a Williams’s test (as these correlations are dependent in that they share a common variable, age, as opposed to the Steiger test of correlations between different variables). We now report these test results in the “Comparison of Model Performance” section:

“All models accurately predicted chronological age in the training sets, as assessed using 10-fold cross validation, as well as in the held-out test sets. Overall, prediction accuracy was lowest in the FC model (MAE_FC/Train = 8.67 years, R²_FC/Train = 0.68, MAE_FC/Test = 8.25 years, R²_FC/Test = 0.73, see Figure 1A and B). The structural MRI model (MAE_S/Train = 5.97 years, R²_S/Train = 0.81, MAE_S/Test = 6.26 years, R²_S/Test = 0.82, see Figure 1C and D) significantly outperformed the FC model in age prediction accuracy, Williams’s t (S vs. FC) = 5.39, p < .001. Finally, the multimodal model (MAE_S+FC/Train = 5.34 years, R²_S+FC/Train = 0.86, MAE_S+FC/Test = 5.25 years, R²_S+FC/Test = 0.87, see Figure 1E and F) significantly outperformed both the FC model (Williams’s t (S+FC vs. FC) = 11.20, p < .001) and the structural MRI model (Williams’s t (S+FC vs. S) = 5.67, p < .001).”
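As a rough illustration of the comparison described above, the following sketch computes Williams's t for two dependent correlations that share one variable (here, chronological age). It uses the standard textbook form of the statistic with n − 3 degrees of freedom; the function name and the input values are hypothetical and are not taken from the study, and the authors' exact implementation is not described in this letter.

```python
import math

def williams_t(r12, r13, r23, n):
    """Williams's t for H0: rho12 == rho13, where variable 1 is shared.

    r12, r13: correlations of the shared variable (e.g. age) with each
              model's predictions.
    r23:      correlation between the two models' predictions.
    n:        number of participants.
    Returns (t, df) with df = n - 3.
    """
    # Determinant of the 3x3 correlation matrix.
    det = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23
    r_bar = (r12 + r13) / 2
    numerator = (n - 1) * (1 + r23)
    denominator = 2 * ((n - 1) / (n - 3)) * det + r_bar**2 * (1 - r23) ** 3
    t = (r12 - r13) * math.sqrt(numerator / denominator)
    return t, n - 3

# Hypothetical correlations, for illustration only.
t_stat, df = williams_t(r12=0.90, r13=0.82, r23=0.75, n=150)
```

Because n enters as the number of participants, the sensitivity of the test reflects the sample size rather than an arbitrary number of bootstrap resamples. A p-value would be obtained by referring t to a t distribution with df degrees of freedom.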

3. Overall, the reviewers feel that some of the conclusions, such as the biphasic relationships between functional brain-age models and pathological status, are not strongly supported and need to be tempered. In particular, the reviewers object to referring to results as "marginal".

We appreciate the reviewers’ concern over the weak support for some of the results, particularly in the biphasic relationship observed in the multimodal BAG model. In the revised analyses, which focus on a three group comparison (CN/A- vs. CN/A+ vs. CI), the biphasic pattern in the FC-BAG model is clearly reproduced and survives FDR correction for multiple comparisons. However, the previously noted “marginal” biphasic pattern in the S+FC-BAG model is no longer apparent. Thus, we limit our discussion of the biphasic pattern to the FC model, and not the multimodal model. Moreover, we no longer refer to results as “marginal” throughout the revised submission.
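The response mentions FDR correction without naming the procedure. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up method, adjusted p-values could be computed as follows; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return (adjusted p-values, rejection flags) in the input order.

    Assumes the Benjamini-Hochberg procedure; the source only says "FDR".
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    rejected = [q <= alpha for q in adjusted]
    return adjusted, rejected

# Hypothetical p-values from a family of tests, for illustration only.
p_values = [0.001, 0.020, 0.030, 0.400]
adjusted, rejected = benjamini_hochberg(p_values)
```

A result "survives" correction in this sense when its adjusted p-value stays at or below the chosen alpha.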

4. Discussion about the potential implications of sample size would be welcome.

We agree with the reviewers that the sample size of the training set was relatively small compared to prior models. We now acknowledge this issue as a limitation and an avenue for future development:

“Additionally, the training set (N = 390) was relatively small compared to prior models, which have included training samples over 1000 (e.g., 5,76). Future studies may further improve model performance by including larger samples of well-characterized participants in the training set.”

Reviewer #1 (Recommendations for the authors):

– In the Methods, they say they used a Gaussian mixture model to define pTau positivity. There are multiple ways to implement GMMs, so more details should be included here.

We apologize for the lack of clarity in the GMM methods, as multiple reviewers also noted similar concerns. Specifically, we fit a two-component GMM to the continuous pTau data, and then used the model classification to define pTau- and pTau+ participants. However, in order to simplify the analyses and interpretation of results, we have removed the analyses stratifying by pTau positivity and instead focus only on A- vs. A+ participants (see responses below to reviewer #3, comments #3 and 4).
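To make the two-component approach concrete, here is a minimal sketch of fitting a one-dimensional, two-component Gaussian mixture by expectation-maximization and labelling each value by its more probable component. The initialization, the fixed iteration count, the rule that the higher-mean component counts as "positive", and all data values are assumptions for illustration; the authors' software and settings are not stated in this letter.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_two_component_gmm(values, n_iter=200):
    """Fit a 1-D, two-component Gaussian mixture with plain EM."""
    xs = sorted(values)
    half = len(xs) // 2
    # Initialize the means from the lower and upper halves of the data.
    means = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    grand_mean = sum(xs) / len(xs)
    grand_var = sum((x - grand_mean) ** 2 for x in xs) / len(xs)
    variances = [grand_var, grand_var]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            dens = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var, 1e-6)  # guard against collapse
    return weights, means, variances

def classify_positive(values, weights, means, variances):
    """Label a value positive if the higher-mean component is more probable."""
    high = 0 if means[0] > means[1] else 1
    labels = []
    for x in values:
        dens = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
        labels.append(dens[high] > dens[1 - high])
    return labels

# Hypothetical biomarker values, for illustration only.
values = [0.8, 0.9, 1.0, 1.1, 1.2, 0.95, 1.05, 2.8, 3.0, 3.2, 2.9, 3.1]
weights, means, variances = fit_two_component_gmm(values)
labels = classify_positive(values, weights, means, variances)
```

In practice an established library implementation would normally be preferred; the hand-written EM loop here only shows what "fit a two-component GMM, then use the model classification" amounts to.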

– The presentation of the MRI Acquisition section in the Methods is not very clear. I suggest the authors consider an alternative format, possibly a supplementary table, where the acquisition details can be more easily appraised. Currently, the acquisition details on the DIAN participants are scarce relative to the ADRC participants.

We apologize for the lack of clarity. In the revision, we now provide more specific details on the acquisition parameters for DIAN participants in the main text (“MRI Acquisition” section) and also provide a summary table of the parameters in the Supplementary Material.

“All MRI data were obtained using a Siemens 3T scanner, although there was a variety of specific models within and across studies. As described previously (Millar et al., 2022), participants in the Knight ADRC and Ances lab studies completed one of two comparable structural MRI protocols, varying by scanner (sagittal T1-weighted magnetization-prepared rapid gradient echo sequence [MPRAGE] with repetition time [TR] = 2400 or 2300 ms, echo time [TE] = 3.16 or 2.95 ms, flip angle = 8° or 9°, frames = 176, field of view = sagittal 256x256 or 240x256 mm, 1-mm isotropic or 1x1x1.2 mm voxels; oblique T2-weighted fast spin echo sequence [FSE] with TR = 3200 ms, TE = 455 ms, 256 x 256 acquisition matrix, 1-mm isotropic voxels) and an identical resting-state fMRI protocol (interleaved whole-brain echo planar imaging sequence [EPI] with TR = 2200 ms, TE = 27 ms, flip angle = 90°, field of view = 256 mm, 4-mm isotropic voxels for two 6-minute runs [164 volumes each] of eyes open fixation). DIAN participants completed a similar MPRAGE protocol (TR = 2300 ms, TE = 2.95 ms, flip angle = 9°, field of view = 270 mm, 1.1x1.1x1.2 mm voxels)(McKay et al., 2022). Resting-state EPI sequence parameters for the DIAN participants differed across sites and scanners with the most notable difference being shorter resting-state runs (one 5-minute run of 120 volumes; see Supplementary File 1 for summary of structural and functional MRI parameters) (McKay et al., 2022).”

– Can the authors explain and justify why the fMRI processing included registration to an older-adult template? Could this have caused a bias in the registration accuracy for younger participants?

We apologize for the lack of clarity. In fact, we used two separate templates (one for younger adults and one for older adults). All participants were registered to the age-appropriate template. We now specify this procedure more clearly in “FC Preprocessing and Features”:

“Transformation to an age-appropriate in-house atlas template (based on independent samples of either younger adults or CN older adults) was performed using a composition of affine transforms connecting the functional volumes with the T2-weighted and MPRAGE images.”

– It is unclear to me why they chose to perform 10-fold CV and hold-out validation with 1000 bootstraps. To my mind, the latter would have been sufficient. If the authors think including the initial 10-fold CV as well is important, this should be clearly justified.

We agree with the reviewer that the 10-fold CV and bootstrapped hold-out validation are somewhat redundant. The hold-out validation step was initially performed to facilitate comparison across models. However, several reviewers have critiqued this approach. We have now removed the bootstrapping approach and instead focus on cross validation in the training set, as well as a non-bootstrapped validation in the testing set. We now specify this approach in the “Gaussian Process Regression (GPR)” section:

“Model performance in the training set was assessed using 10-fold cross validation via the Pearson correlation coefficient (r), the proportion of variance explained (R²), the mean absolute error (MAE), and root-mean-square error (RMSE) between true chronological age and the cross-validated age predictions merged across the 10 folds. We then evaluated generalizability of the models to predict age in unseen data by applying the trained models to the held-out test set of healthy controls.”
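The four metrics named above can be sketched as follows, given true ages and the cross-validated predictions merged across folds. R² is computed here as 1 − SS_res/SS_tot, which is an assumption, since the text does not say whether R² denotes this quantity or the squared Pearson correlation. The function name and the ages are hypothetical.

```python
import math

def age_prediction_metrics(true_age, predicted_age):
    """Return Pearson r, R2, MAE, and RMSE for a set of age predictions."""
    n = len(true_age)
    mean_t = sum(true_age) / n
    mean_p = sum(predicted_age) / n
    errors = [p - t for t, p in zip(true_age, predicted_age)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    ss_tot = sum((t - mean_t) ** 2 for t in true_age)
    # Assumed definition: coefficient of determination.
    r2 = 1 - ss_res / ss_tot
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(true_age, predicted_age))
    ss_pred = sum((p - mean_p) ** 2 for p in predicted_age)
    r = cov / math.sqrt(ss_tot * ss_pred)
    return {"r": r, "R2": r2, "MAE": mae, "RMSE": rmse}

# Hypothetical ages in years, for illustration only.
true_age = [20.0, 35.0, 50.0, 65.0, 80.0]
predicted_age = [24.0, 33.0, 52.0, 61.0, 78.0]
metrics = age_prediction_metrics(true_age, predicted_age)
```

MAE and RMSE are in years and depend on the age range of the sample, whereas r and R² are unitless, which is why the reviewer asks for R² when comparing across studies.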

– It is important that R² is reported for each model performance, not just MAE. As R² is a ratio the values can readily be compared across published studies, while the MAE cannot as it is heavily dependent on the age distribution of the test set. For completeness, they could also consider reporting the Pearson's r correlation between predicted age and age, and the root mean square error as well.

We agree with the reviewer that R² (as well as Pearson’s r and RMSE) are important metrics of model performance, especially for comparison with other studies. We now report these measures in Figure 1, as well as in the “Comparison of Model Performance”:

“All models accurately predicted chronological age in the training sets, as assessed using 10-fold cross validation, as well as in the held-out test sets. Overall, prediction accuracy was lowest in the FC model (MAE_FC/Train = 8.67 years, R²_FC/Train = 0.68, MAE_FC/Test = 8.25 years, R²_FC/Test = 0.73, see Figure 1A and B). The structural MRI model (MAE_S/Train = 5.97 years, R²_S/Train = 0.81, MAE_S/Test = 6.26 years, R²_S/Test = 0.82, see Figure 1C and D) significantly outperformed the FC model in age prediction accuracy, Williams’s t (S vs. FC) = 5.39, p < .001. Finally, the multimodal model (MAE_S+FC/Train = 5.34 years, R²_S+FC/Train = 0.86, MAE_S+FC/Test = 5.25 years, R²_S+FC/Test = 0.87, see Figure 1E and F) significantly outperformed both the FC model (Williams’s t (S+FC vs. FC) = 11.20, p < .001) and the structural MRI model (Williams’s t (S+FC vs. S) = 5.67, p < .001).”

– It is unclear how the model performance comparisons were conducted (Results, pg. 12). While t-tests are mentioned in the text, the exact details should be included in the Methods. My concern here is that the n (sample size) for these comparisons is based on the number of bootstraps (arbitrarily determined by the authors to be 1000), rather than the actual sample size. If that is the case (and Figure 1D suggests it is), this procedure is incorrect as the sensitivity that these tests have to detect differences would be purely a factor of the number of bootstraps, rather than the number of observations. This means that the experimenter can simply choose to make smaller differences 'significant' simply by adding more bootstraps. This needs to be clarified and corrected if appropriate. One approach to achieve the goal of comparing model performances is to take Pearson's correlations with age from each model and use Z-transformations to test the alternative hypothesis that the correlations are different (e.g., the Steiger test). In that way, the n would be determined by the number of observations, so statistical power would appropriately reflect the data.

We thank the reviewer for pointing out this flaw in the comparison of model performance. We now test for significant differences between z-transformed Pearson correlations with age in each model using a Williams’s test (as these correlations are dependent in that they share a common variable, age, as opposed to the Steiger test of correlations between different variables). We now report these test results in the “Comparison of Model Performance” section:

“All models accurately predicted chronological age in the training sets, as assessed using 10-fold cross validation, as well as in the held-out test sets. Overall, prediction accuracy was lowest in the FC model (MAE_FC/Train = 8.67 years, R²_FC/Train = 0.68, MAE_FC/Test = 8.25 years, R²_FC/Test = 0.73, see Figure 1A and B). The structural MRI model (MAE_S/Train = 5.97 years, R²_S/Train = 0.81, MAE_S/Test = 6.26 years, R²_S/Test = 0.82, see Figure 1C and D) significantly outperformed the FC model in age prediction accuracy, Williams’s t (S vs. FC) = 5.39, p < .001. Finally, the multimodal model (MAE_S+FC/Train = 5.34 years, R²_S+FC/Train = 0.86, MAE_S+FC/Test = 5.25 years, R²_S+FC/Test = 0.87, see Figure 1E and F) significantly outperformed both the FC model (Williams’s t (S+FC vs. FC) = 11.20, p < .001) and the structural MRI model (Williams’s t (S+FC vs. S) = 5.67, p < .001).”

– I recommend avoiding saying things like 'marginally lower' when a p-value = 0.110. There's no real evidence that there's a difference here, so hard to say whether it's truly lower or not. Generally avoiding 'trends' at 0.1> p >0.05 is best practice. P-values are important, but effect sizes (with confidence intervals) are often more informative.

We appreciate the reviewer’s concern with over-interpretation of non-significant relationships. We now avoid using the terms “marginal” and “trend” throughout the manuscript. We also report effect sizes (partial η²) for all regression-based analyses.

– In the Discussion, when comparing age prediction accuracy between studies, it's important not to rely on MAE alone as this can vary greatly as a function of the test set age distribution. They should use R² instead. Where R² is unavailable, it's essential that the age range of each study mentioned in comparison is reported to provide context to the MAE values.

We thank the reviewer for pointing out this flaw. We now discuss our model performance in comparison to prior models using R², instead of MAE, in “Predicting Brain Age with Multiple Modalities”:

“We found that a GPR model trained on structural MRI features predicted chronological age in a cognitively normal, amyloid-negative adult sample with an R² of 0.81. This level of performance is comparable to other structural models, which have reported R² values ranging from 0.80 to 0.95 (Bashyam et al., 2020; Cole and Franke, 2017; Eavani et al., 2018; Gong et al., 2021; Lee et al., 2022; Liem et al., 2017; Ly et al., 2020; Wang et al., 2019). As previously reported (Millar et al., 2022), the FC-trained model predicted age with an R² of 0.68, again consistent with previous FC models, which have achieved R² values from 0.53 to 0.80 (Eavani et al., 2018; Gonneaud et al., 2021; Liem et al., 2017). Our observation that structural MRI outperformed FC in age prediction is also consistent with previous direct comparisons between modalities (Dunås et al., 2021; Eavani et al., 2018; Liem et al., 2017).”

– The evidence for a biphasic relationship between FC-BAG and pre-clinical/clinical status is somewhat over-interpreted, particularly given there was no difference between A+T- and A+T+ people (p=0.11) and the fit of FC brain age is quite poor (i.e., far from the line of identity in Figure 2A). I suggest more caution when discussing this.

In the revised analyses, which focus on a three group comparison (CN/A- vs. CN/A+ vs. CI), the biphasic pattern in the FC-BAG model is clearly reproduced and survives FDR correction for multiple comparisons. However, the previously noted “marginal” biphasic pattern in the S+FC-BAG model is no longer apparent. Thus, we limit our discussion of the biphasic pattern to the FC model, and not the multimodal model. Moreover, we aim to focus the discussion on novel associations with this biphasic FC pattern (including the tests of continuous associations with biomarkers and cognition), rather than recapitulating the previously published finding. We also discuss the potential relevance of this result to emerging results from MEG (Ranasinghe et al., 2022) and metabolic PET studies (Goyal et al., 2022). Finally, we now also acknowledge the poor prediction performance of the FC model as a potential spurious explanation of these findings. The discussion of this result now reads as follows:

“As we have previously reported (31), FC-BAG was lower in presymptomatic AD participants compared to amyloid-negative controls. Extending beyond this group difference, we now also note that FC-BAG was negatively associated with amyloid PET in CN/A+ participants. The combined reduction of FC-BAG in the presymptomatic phase and increase in the symptomatic phase suggest a biphasic functional response to AD progression, which is partially consistent with some prior suggestions (77–81) (see ref 31 for a more detailed discussion).

Interpretation of this biphasic pattern is still unclear, although the present results provide at least one novel insight. Specifically, one potential interpretation is that the “younger” appearing FC pattern in the preclinical phase may reflect a compensatory response to early AD pathology (82). This interpretation leads to the prediction that reduced FC-BAG should be associated with better cognitive performance in the preclinical stage. However, this interpretation is not supported by the current results, as FC-BAG did not correlate with cognition in any of the analysis samples.

Alternatively, pathological AD-related FC disruptions may be orthogonal to healthy age-related FC differences, as supported by our previous observation that age and AD are predicted by mostly non-overlapping FC networks (31). For instance, the “younger” FC pattern in CN/A+ participants may be driven by hyper-excitability in the preclinical stage (83,84). It is also worth considering that patterns of younger FC-BAG in CN/A+ participants may somehow correspond to a recent observation that patterns of youthful-appearing aerobic glycolysis are relatively preserved in the preclinical stage of AD (85). Finally, this effect may simply be spuriously driven by poor performance of the FC brain age model, sample-specific noise, and/or statistical artifacts related to regression dilution and its correction (71). Hence, future studies should attempt to replicate these results in independent samples and further test potential theoretical interpretations.”

– A key limitation that was not mentioned was the small sample size relative to other studies. Perhaps the model performance is similar but given that only MAE is used to compare studies it is hard to draw meaningful conclusions. My impression is that had larger datasets been available, then performance would have improved.

We agree with the reviewer that the sample size of the training set was relatively small compared to prior models. We now acknowledge this issue as a limitation and an avenue for future development:

“Additionally, the training set (N = 390) was relatively small compared to prior models, which have included training samples over 1000 (e.g., 5,76). Future studies may further improve model performance by including larger samples of well-characterized participants in the training set.”

Reviewer #2 (Recommendations for the authors):

– As explained in the previous section, the FC-BAG model has very limited prediction power, and therefore the results from the FC-BAG model are not reliable while providing marginal benefit. The FC-BAG results should be moved to the supplementary materials.

Although FC performed relatively poorly in predicting age, a major goal of this project was to test whether each modality (structural MRI and FC) captures unique patterns related to AD progression. Indeed, the FC model captures a unique pattern: FC-BAG is reduced in CN/A+ participants but increased in CI participants, which stands in contrast to the pattern observed in the S-BAG model. We view this as an important observation that belongs in the main text rather than in a supplementary analysis. We now justify our motivations more clearly in the “Introduction”:

“This project aimed to develop multimodal models of brain-predicted age, incorporating both FC and structural MRI. Participants with presymptomatic AD pathology were excluded from the training set to maximize sensitivity. We hypothesized that BAG estimates would be sensitive to the presence of AD biomarkers and early cognitive impairment. We further considered whether estimates were continuously associated with AD biomarkers of amyloid and tau, as well as cognition. We hypothesized that FC and structural MRI would capture complementary signals related to age and AD. Thus, we systematically compared models trained on unimodal FC, structural MRI, and combined modalities, to test the added utility of multimodal integration in accurately predicting age and whether each modality captures unique relationships with AD biomarkers and cognition.”

– For the FC-BAG models and their relation to other clinical variables, please also provide another version of the model including mean, median, and maximum head motion during the entire rsfMRI scan as covariates in the model to further ensure the reliability of the results.

We agree with the reviewer (as well as Reviewer #3) that appropriate consideration and control for head motion artifact is a critical element in analysis of FC data. Hence, we now include mean framewise displacement (FD) as an additional covariate in all statistical analyses involving the FC and multimodal (S+FC) BAG estimates. We do not include median and maximum, as suggested by the reviewer, in order to minimize potential multi-collinearity in our regression models. As noted in “Statistical Analysis”:

“Given the potential confounding influence of head motion on FC-derived measures (60,76,77), we also included mean FD as an additional covariate of non-interest in the FC and S+FC models.”

– It is not clear to me that the bootstrap-based t-test provides evidence in favor of the Vol+FC-BAG model. In other words, a stacked model combining FC-BAG and Vol-BAG will always perform as well as or worse than each model. If the stacking approach takes this into account (not clear in the method section, needs further explanation) the marginal increase in performance can be explained by this unidirectional effect and needs further confirmation based on a model selection step (e.g. using new independent data not used in the training-validation of FC-BAG and Vol-BAG model to compare Vol+FC-BAG and Vol-BAG model).

We appreciate the reviewer’s concern and agree that it is important to demonstrate that increases in model performance are meaningful, rather than driven by unidirectional effects of adding more features and/or capitalizing on chance. Thus, we performed a supplementary analysis, in which we combined the fully trained structural MRI brain age model with a model trained on “reshuffled” FC features using the same stacking approach in 1000 bootstrap samples. The distribution of R² in this analysis therefore reflects the expected range of model performance from adding unrelated FC features to the structural brain age model. In fact, most of the bootstrapped models performed similarly to or worse than the unimodal structural model (see Figure 1—figure supplement 4), suggesting that our stacking approach does not have a unidirectional effect of improvement from adding unrelated features. No simulation achieved performance as high as or higher than that of the fully trained S+FC model, suggesting that the modest increase in the stacked multimodal model (compared to the unimodal structural MRI model) is driven by meaningful age-related FC signal, rather than by simply capitalizing on chance in a larger feature set. We now describe this analysis in “Comparison of Model Performance” and Figure 1—figure supplement 4:

“It is possible that the modest increase in the multimodal model was due to capitalizing on noise, simply by adding more features to the structural model. Hence, we also compared the observed R² of the S+FC model to a bootstrapped distribution of R² performance estimates from 1000 resamples using a model in which the original structural MRI model was stacked with a model trained on randomly reshuffled FC features. Thus, this distribution represents the expected improvements in model performance from simply adding new features to the structural MRI model with the stacked approach. The observed R² of the S+FC model outperformed all R² estimates from this bootstrapped distribution (p < 0.001, see Figure 1—figure supplement 4), suggesting that the modest increase in model performance observed in the stacked multimodal (S+FC) model over the unimodal structural model is due to meaningful age-related FC signal, rather than capitalizing on noise in a larger feature set.”
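The logic of this reshuffling check can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the authors' pipeline: it uses simulated ages and predictions, replaces the stacked model with a hypothetical fixed-weight average of the two first-level predictions, and shuffles the FC predictions across participants as a stand-in for retraining a model on reshuffled FC features. All function names (`r_squared`, `stack`, `reshuffled_null`, `empirical_p`, `simulate`) are invented for this sketch.

```python
import random

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def stack(struct_pred, fc_pred):
    """Hypothetical stand-in for the stacked model: a fixed 50/50 average."""
    return [(s + f) / 2 for s, f in zip(struct_pred, fc_pred)]

def reshuffled_null(ages, struct_pred, fc_pred, n_resamples=1000, seed=0):
    """R2 of the stacked prediction after breaking the link between
    FC predictions and participants (assumed simplification)."""
    rng = random.Random(seed)
    null = []
    for _ in range(n_resamples):
        shuffled = fc_pred[:]
        rng.shuffle(shuffled)
        null.append(r_squared(ages, stack(struct_pred, shuffled)))
    return null

def empirical_p(observed, null_values):
    """One-sided p-value: share of null estimates at least as large as the
    observed one. The +1 terms keep the estimate away from exactly zero."""
    exceed = sum(1 for v in null_values if v >= observed)
    return (exceed + 1) / (len(null_values) + 1)

def simulate(n=200, seed=1):
    """Simulated data only: ages plus noisy first-level predictions."""
    rng = random.Random(seed)
    ages = [rng.uniform(18, 89) for _ in range(n)]
    struct_pred = [a + rng.gauss(0, 8) for a in ages]
    fc_pred = [a + rng.gauss(0, 12) for a in ages]
    return ages, struct_pred, fc_pred

ages, struct_pred, fc_pred = simulate()
observed = r_squared(ages, stack(struct_pred, fc_pred))
null = reshuffled_null(ages, struct_pred, fc_pred)
p_value = empirical_p(observed, null)
print(f"observed R2 = {observed:.3f}, empirical p = {p_value:.4f}")
```

With 1000 resamples, the smallest p-value this estimator can return is 1/1001, which is consistent with reporting p < 0.001 when no resample matches the observed performance.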

– After the previous step authors can choose the best performing model (either Vol-BAG or Vol+FC-BAG model) and only present the data for the selected model since results between the two models are redundant and don't add extra information to the reader.

Although our revised and supplementary analyses support the selection of the S+FC BAG model for most accurate prediction of age, as noted above in the response to comment #1, a major goal of this project was to test whether each modality (structural MRI and FC) captures unique patterns related to AD progression. As we are primarily motivated to evaluate these models in their associations with AD, it is important to consider that the most accurate BAG models for age prediction are not necessarily the ones that are most sensitive to disease. In fact, at least one study suggests that models with “moderate” age prediction accuracy might be the most useful in detecting deviation related to disease, as compared to overly “loose” or “tight” age prediction models (Bashyam et al., 2020). We now justify our motivations more clearly in the “Introduction”:

“This project aimed to develop multimodal models of brain-predicted age, incorporating both FC and structural MRI. Participants with presymptomatic AD pathology were excluded from the training set to maximize sensitivity. We hypothesized that BAG estimates would be sensitive to the presence of AD biomarkers and early cognitive impairment. We further considered whether estimates were continuously associated with AD biomarkers of amyloid, tau, and neurodegeneration (Jack et al., 2016), as well as cognitive function. We hypothesized that FC and structural MRI would capture complementary signals related to age and AD. Thus, we systematically compared models trained on unimodal FC, structural MRI, and combined modalities, to test the added utility of multimodal integration in accurately predicting age and whether each modality captures unique relationships with AD biomarkers and cognition.”

– The analysis of hippocampal volume (especially related to the preclinical AD) needs to be confirmed. To do so, hippocampal volume as well as volumetric features from regions highly correlated with hippocampal volume should be removed from the feature set of Vol-BAG and Vol+FC-BAG models. The models need to be retrained using the same procedure. The relationship between hippocampal volume and the newly calculated Vol-BAG and Vol+FC-BAG values should be reported alongside the current results.

We agree with the reviewer that the associations between hippocampal volume and S-BAG and S+FC-BAG create problems for interpretation, as the S-BAG and S+FC-BAG models include hippocampal volume as input features and are thus circular. Moreover, it is more of interest to this study to test associations with the biomarkers associated with earlier Alzheimer disease stages, including amyloid and tau. Thus, in the interest of simplifying the focus of the study, as well as the interpretation of results, we have decided to remove the analyses of the neurodegeneration markers (including hippocampal volume).

Reviewer #3 (Recommendations for the authors):

Find below some recommendations on how (I think) the science in this manuscript might be improved in no particular order.

1. Training sample. It is unclear why one would like to minimize undetected AD pathology (amyloid positivity, that is) in the cognitively healthy training sample as many of these individuals (when Tau negative) have minimal changes in brain structure and function. Since you create a BA "norm" from these individuals, one may benefit from including a bigger, more representative sample using more lenient inclusion criteria. Decisions regarding the training sample can have a big impact on the subsequent interpretation of BA results (e.g. Hwang, 2022, Brain Comm).

We agree with the reviewer that the composition of the training sample is critical for interpreting outputs from a brain age model. Indeed, this consideration motivated us to train our model in amyloid-negative participants for both theoretical and empirical reasons. Specifically, although individuals in the earliest preclinical stages of AD (i.e. A+T-N-) likely have minimal detectable structural changes, it is possible that structural changes might be observable in later stage participants (i.e., T+, N+) even if they are cognitively normal. Thus, removing all A+ participants from the training set is a conservative approach to minimize the potential influence of presymptomatic AD pathology in any stage.

Further, although amyloid positivity may lead to minimal structural differences, prior work from our lab (and others) suggests that amyloid may be associated with differences in functional connectivity, and critically that presymptomatic amyloid pathology may confound effects that are otherwise interpreted to reflect “healthy aging” (Brier et al., 2014). Thus, if these participants are included in the training set, the FC model would learn to associate these disease-related FC patterns with normative aging. When applied to an analysis set of amyloid-positive participants, such a model would be less likely to identify deviation in the BAG, as those disease-related differences are incorporated into the model of healthy aging.

This argument was recently tested by Ly and colleagues (2020), who compared two brain age models: one trained on amyloid-negative participants and another trained on cognitively normal participants regardless of amyloid status. They found that the amyloid-negative trained model was able to detect differences in brain age between amyloid-positive and amyloid-negative test sets, but the model that did not exclude amyloid-positive participants was not sensitive to this difference. Although this study was limited in that the amyloid-positive and amyloid-negative samples were drawn from separate, unmatched cohorts, it represents an important proof of concept, upon which we aim to expand in this paper.

We have now revised the introduction to make the motivation for this design decision more clear:

“One approach to maximize sensitivity of BAG to presymptomatic AD pathology may be to train brain age models exclusively on amyloid-negative participants. As undetected AD pathology might influence MRI measures, and thus confound effects otherwise attributed to “healthy aging” (Brier, Thomas, Snyder, et al., 2014), including the patterns learned by a traditional brain age model, an alternative model trained on amyloid-negative participants only might be more sensitive to detect presymptomatic AD pathology as deviations in BAG. Indeed, one recent study demonstrated that an amyloid-negative trained brain age model (Ly et al., 2020) is more sensitive to progressive stages of AD than a typical amyloid-insensitive model (Cole et al., 2015). However, this comparison included amyloid-negative and amyloid-positive test samples from two separate cohorts, and thus may be driven by cohort, scanner, and/or site differences. To validate the applicability of the brain-predicted age approach to preclinical AD, it is important to test a model’s sensitivity to amyloid status, as well as continuous relationships with preclinical AD biomarkers, within a single cohort. Another recent comparison demonstrated that both traditional and amyloid-negative trained brain age models were similarly related to molecular AD biomarkers, but that further attempts to “disentangle” AD from brain age by including more advanced AD continuum participants in the training sample significantly reduced relationships between brain age and AD markers (Hwang et al., 2022). Thus, in this study we will apply the amyloid-negative training approach to a multimodal MRI dataset, in order to maximize sensitivity to AD pathology in the presymptomatic stage.”

2. Group descriptors. It is still a matter of ongoing debate, but I recommend using another descriptor for the amyloid positive group rather than "preclinical AD". Even in the NIA-AA Research framework from 2018 (Jack Jr.) they only use this tag for individuals that are amyloid and tau positive.

We have revised our terminology throughout the manuscript and figures to refer to our groups by clinical assessment and molecular categorization (e.g., CN/A-, CN/A+, CI), rather than staged progression terms (e.g., “preclinical AD”).

3. Biomarker definition. I am not an expert on biomarkers, but the definition of pTau positivity is uncommon to me: "Gaussian mixture model approach to defining pTau positivity based on the CSF pTau/Aβ40 ratio." Could the authors justify this choice and/or cite the corresponding references?

To clarify, we fit a two-component GMM to the continuous pTau data, and then used the model classification to define pTau- and pTau+ participants. However, in order to simplify the analyses and interpretation of results, we have removed the analyses stratifying by pTau positivity and instead focus only on A- vs. A+ participants (see response below to comment #4).
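For readers unfamiliar with this kind of cutoff, the idea of a two-component mixture classification can be sketched as follows. This is a minimal illustration, assuming a one-dimensional Gaussian mixture fit by expectation-maximization; the response does not describe the fitting software or settings, the function names are hypothetical, and the ratio values are invented toy numbers rather than study data.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def fit_two_component_gmm(values, n_iter=200):
    """Fit a 1-D, two-component Gaussian mixture with plain EM.
    Returns (weights, means, sds), each a two-element list."""
    xs = sorted(values)
    half = len(xs) // 2
    # Initialize the two means from the lower and upper halves of the data.
    means = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    grand_mean = sum(xs) / len(xs)
    sd0 = math.sqrt(sum((x - grand_mean) ** 2 for x in xs) / len(xs))
    sds = [sd0, sd0]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value.
        resp = []
        for x in values:
            dens = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in (0, 1)]
            total = (dens[0] + dens[1]) or 1e-300  # guard against underflow
            resp.append([d / total for d in dens])
        # M-step: update weights, means, and standard deviations.
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            sds[k] = max(math.sqrt(var), 1e-6)  # avoid a collapsed component
    return weights, means, sds

def classify_positive(values, weights, means, sds):
    """Label a value positive if the higher-mean component is more probable."""
    hi = 0 if means[0] > means[1] else 1
    lo = 1 - hi
    labels = []
    for x in values:
        p_hi = weights[hi] * normal_pdf(x, means[hi], sds[hi])
        p_lo = weights[lo] * normal_pdf(x, means[lo], sds[lo])
        labels.append(p_hi > p_lo)
    return labels

# Toy biomarker ratios (invented for illustration only).
ratios = [0.018, 0.020, 0.021, 0.022, 0.019, 0.023, 0.020, 0.021,
          0.050, 0.055, 0.060, 0.052, 0.058]
weights, means, sds = fit_two_component_gmm(ratios)
labels = classify_positive(ratios, weights, means, sds)
print(labels)
```

The appeal of this approach is that the positivity threshold is derived from the data distribution itself rather than from an externally fixed cutoff, which is also why it depends on the sample being clearly bimodal.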

4. Statistical analysis. If I have not misread, the methods section only mentions three test groups (A-, A+, and CDR>0) but the analysis is performed with four groups. This leads to confusion and should be corrected. Also, most higher-level analyses reported in the results are not described in this section. These analyses should be described in the methods section. It is difficult to evaluate whether the performed analyses are appropriate without this description. For example, (lines 323-7) the authors report three different regression models and then a fourth analysis combining the four groups, but only for FC-BAG. This procedure is unclear, not described (as far as I can see), and not justified. Another example is the analysis with NFL which is not mentioned until line 412 (p.20) in the Results section. Also, the authors use different samples for different tests, due to the lack of Biomarker information for some individuals. I suggest adding degrees of freedom/n when reporting the results, so the reader has some information regarding the sample used.

We apologize for the lack of clarity in the statistical analysis. In the revision, we have improved the clarity of this section in the following ways:

A. We no longer analyze the data using a four group split (i.e., A-T- vs. A+T- vs. A+T+ vs. CI). Instead, we focus on analyses of three groups (CN/A- vs. CN/A+ vs. CI) consistently throughout the study.

B. We now provide more detail on the higher level analyses, in which we test for group differences between the three analysis sets and test continuous associations with biomarkers and cognitive measures:

“Group differences in each BAG estimate were tested using an omnibus analysis of variance (ANOVA) test with follow-up pairwise t tests on age-residualized BAG estimates, using a false discovery rate (FDR) correction for multiple comparisons. Assumptions of normality were tested by visual inspection of quantile-quantile plots (see Figure 2—figure supplement 1). Assumptions of equality of variance were tested with Levene’s test. Linear regression models tested the effects of cognitive impairment (CDR > 0 vs. CDR 0) and amyloid positivity (A- vs. A+) on BAG estimates from each model, controlling for true age (as noted above) and demographic covariates (sex, years of education, and race). Given the potential confounding influence of head motion on FC-derived measures (59,75,76), we also included mean FD as an additional covariate of non-interest in the FC and S+FC models. We tested continuous relationships with AD biomarkers and cognitive estimates using linear regression models, including the same demographic and motion covariates.”
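To make the term "age-residualized BAG" concrete, the sketch below computes a brain age gap and removes its linear dependence on chronological age. It assumes that residualization means taking residuals from a simple linear regression of BAG on age, which is a common approach but is not spelled out in this response; the function names and the toy numbers are hypothetical.

```python
def brain_age_gap(predicted_age, chronological_age):
    """BAG = brain-predicted age minus chronological age."""
    return [p - a for p, a in zip(predicted_age, chronological_age)]

def residualize(values, covariate):
    """Residuals from a simple linear regression of values on covariate."""
    n = len(values)
    mean_v = sum(values) / n
    mean_c = sum(covariate) / n
    sxy = sum((c - mean_c) * (v - mean_v) for c, v in zip(covariate, values))
    sxx = sum((c - mean_c) ** 2 for c in covariate)
    slope = sxy / sxx
    intercept = mean_v - slope * mean_c
    return [v - (intercept + slope * c) for c, v in zip(covariate, values)]

# Toy values only: predictions are pulled toward the sample mean, so the raw
# BAG is positive for younger and negative for older participants.
ages = [20.0, 40.0, 60.0, 80.0]
predicted = [32.0, 45.0, 60.0, 73.0]

bag = brain_age_gap(predicted, ages)
bag_resid = residualize(bag, ages)
print(bag)        # raw gap, correlated with age
print(bag_resid)  # gap with the linear age trend removed
```

After this step the residualized gap is uncorrelated with age by construction, so group comparisons are not driven by age differences between groups. Additional covariates such as sex, education, race, or mean FD would require a multiple regression rather than the single-predictor fit shown here.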

C. As noted in the response to Reviewer #2, comment #3, analyses of neurodegeneration biomarkers are of less interest to this study, compared to earlier biomarkers of amyloid and tau. Thus, analyses of NfL have been removed from the study.

D. We now report degrees of freedom for our regression analyses of group differences (see Table 2). We also report the number of participants in each group with available measures of each biomarker throughout the results, for example:

“355 participants (144 CN/A-, 154 CN/A+, 57 CI) had an available amyloid PET scan and 300 (120 CN/A-, 137 CN/A+, 43 CI) had an available CSF estimate of Aβ42/40.”

5. The authors are repeating the same analysis in three different modalities (also sometimes they repeat the analyses across several pairs of groups [e.g. lines 323-7]). Thus, I would strongly recommend using some type of multiple comparison corrections.

We agree with the reviewer that appropriate correction for multiple comparisons is necessary for these analyses. We now apply a false discovery rate (FDR) correction to the follow-up pairwise t tests, as described in “Statistical Analysis”:

“Group differences in each BAG estimate were tested using an omnibus analysis of variance (ANOVA) test with follow-up pairwise t tests on age-residualized BAG estimates, using a false discovery rate (FDR) correction for multiple comparisons.”
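As an illustration of what an FDR correction does to a set of pairwise p-values, the sketch below implements the Benjamini-Hochberg step-up adjustment. The response does not name the specific FDR procedure, so Benjamini-Hochberg is an assumption here, and the p-values are placeholders rather than results from the study.

```python
def fdr_bh(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Placeholder p-values for the three pairwise group contrasts.
contrasts = ["CN/A- vs. CN/A+", "CN/A- vs. CI", "CN/A+ vs. CI"]
raw_p = [0.01, 0.04, 0.03]

for name, p_raw, p_adj in zip(contrasts, raw_p, fdr_bh(raw_p)):
    print(f"{name}: raw p = {p_raw:.3f}, FDR-adjusted p = {p_adj:.3f}")
```

Unlike a Bonferroni correction, which multiplies every p-value by the number of tests, this adjustment scales each p-value by the number of tests divided by its rank, so it is less conservative when several contrasts show an effect.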

6. Table 2. The authors should mention what the units in the table represent. Also, I recommend adding df and exact significance values (at least if p >.001).

Table 2 presents the β estimates and standard error for the terms in the linear regression models predicting each BAG estimate. We now label this information more explicitly with separated columns. Further, we now provide exact p values for all terms and df for each model.

7. Atlas. The authors used the D-K atlas (not strictly the FS-defined) for BA computation. This is a suboptimal choice, and I would recommend in the future using more fine-grained parcellations. This is not a strong issue, but the choice surprised me since the authors used a 300-ROI parcellation for the rs-fMRI. Also, since the authors use cortical thickness for sampling the cortex, I would not use "Volumetric"-BA as a descriptor.

We agree with the reviewer that the D-K atlas is a relatively coarse anatomical parcellation. However, as these analyses were based on large, existing datasets that had already been processed and QC’ed with a harmonized pipeline, it would require significant effort to re-parcellate and QC the full dataset. Moreover, despite this coarse parcellation, the structural MRI data still predict age quite well and outperform the FC data, which use the finer-grained set of ROIs. We now acknowledge the choice of the D-K parcellation as a potential limitation and area of future development in the “Limitations”:

“Structural MRI was quantified using the Desikan atlas (Desikan et al., 2006), which although widely used, provides a relatively coarse parcellation of structural anatomy, and moreover, does not align with the parcellation used to define FC regions (Seitzman et al., 2020). Although the structural MRI data still outperformed FC in predicting age, future brain age models may further improve performance by using more refined and harmonized anatomical parcellations to define brain regions.”

Additionally, we now refer to the “volumetric” brain age model as “structural”, e.g., S-BAG, throughout the manuscript and figures.

8. Movement and rs-fMRI. The rs-fMRI preprocessing used might still lead to a signal that is related to movement. Since movement is almost always related to age and disease [and thus can affect both the BA computation and the tests in the test sample], I would suggest taking additional steps in this regard. At the minimum, I would include total motion as an additional covariate in the higher-level analysis and discuss this issue in the limitations section.

We agree with the reviewer (as well as Reviewer #2) that appropriate consideration and control for head motion artifact is a critical element in analysis of FC data. Hence, we now include mean framewise displacement (FD) as an additional covariate in all statistical analyses involving the FC and multimodal (S+FC) BAG estimates. As noted in “Statistical Analysis”:

“Given the potential confounding influence of head motion on FC-derived measures (60,76,77), we also included mean FD as an additional covariate of non-interest in the FC and S+FC models.”

9. The results in cognitively healthy samples are largely negative (i.e. do not differ with groups). One possible explanation is that the authors are using cross-sectional samples and thus – even when using BA metrics – have a signal that captures ongoing aging (accelerated aging, if you wish) and baseline (lifelong, preexisting) variability between individuals. The latter may obscure possible existing effects. I recommend the authors acknowledge the limitations of using cross-sectional data to study changes that ought to be longitudinal.

We appreciate the reviewer’s suggestion and now discuss this issue as a limitation and area of future development:

“Moreover, estimates of BAG likely capture variance in early-life factors, which may obscure associations with Alzheimer disease and cognition, especially in cross-sectional designs (87). Future studies may improve the sensitivity of BAG estimates to disease-related markers by testing associations with longitudinal change.”

Associated Data


    Supplementary Materials

    Supplementary file 1. Summary of acquisition parameters for structural T1 and resting-state functional MRI.

    TR = repetition time, TE = echo time.

    elife-81869-supp1.docx (1.4MB, docx)
    MDAR checklist

    Data Availability Statement

    This project utilized datasets obtained from the Knight ADRC and DIAN. The Knight ADRC and DIAN encourage and facilitate research by current and new investigators, and thus, the data and code are available to all qualified researchers after appropriate review. Requests for access to the data used in this study may be placed to the Knight ADRC Leadership Committee (https://knightadrc.wustl.edu/professionals-clinicians/request-center-resources/) and the DIAN Steering Committee (https://dian.wustl.edu/our-research/for-investigators/dian-observational-study-investigator-resources/data-request-form/). Requests for access to the Ances lab data may be placed to the corresponding author. Code used in this study is available at https://github.com/peterrmillar/MultimodalBrainAge (copy archived at swh:1:rev:de233b8fe813f5fcca317ce0a6353047f0dfbb92).

