Abstract
Age is a significant risk factor for mild cognitive impairment (MCI) and Alzheimer’s disease (AD) and identifying brain age patterns is critical for comprehending the normal aging and MCI/AD processes. Prior studies have widely established the univariate relationships between brain regions and age, while multivariate associations remain largely unexplored. Herein, various artificial intelligence (AI) models were used to perform brain age prediction using an MRI dataset (n = 825). The optimal AI model was then integrated with the feature importance methods, namely Shapley additive explanations (SHAP), local interpretable model-agnostic explanations, and layer-wise relevance propagation, to identify the significant multivariate brain regions hierarchically involved in this prediction. Our results showed that the deep learning model (referred to as AgeNet) outperformed conventional machine learning models for brain age prediction, and that AgeNet integrated with SHAP (referred to as AgeNet-SHAP) identified all ground-truth perturbed regions as key predictors of brain age in semi-simulation, demonstrating the validity of our methodology. In the experimental dataset, when compared to cognitively normal (CN) participants, MCI exhibited moderate differences in brain regions, whereas AD showed highly robust and widely distributed regional differences. Individualized AgeNet-SHAP regional features further showed associations with clinical severity scores in the AD continuum. These results collectively facilitate data-driven explainable AI approaches for disease progression, diagnostics, prognostics, and personalized medicine efforts.
Keywords: Alzheimer’s disease (AD), mild cognitive impairment (MCI), brain age, deep learning (DL), machine learning (ML), feature importance, SHapley Additive exPlanations (SHAP)
Introduction
Alzheimer’s disease (AD) is the most prevalent neurodegenerative disorder, characterized by neural degeneration and structural changes in the brain. AD represents a significant global health challenge, with over 55 million individuals affected worldwide [1]. In the USA alone, over 6.9 million adults aged 65 and older are affected, and projections indicate that this number will double by 2050 [2]. This highlights the pressing need to develop effective preventive strategies and therapeutic interventions for AD, which requires a clear understanding of AD mechanisms. To fully uncover AD mechanisms, it is crucial to have a deeper understanding of the various facets of AD, including factors such as age and brain alterations.
The human brain changes with chronological aging, and neurodegenerative diseases such as AD can disrupt the normal aging trajectory and accelerate this process [3, 4]. Brain age, a promising biomarker for neurodegenerative diseases derived from neuroimaging data, quantifies this accelerated aging process. The difference between chronological age and brain age is often referred to as the brain age gap, with a larger gap being associated with worse AD outcomes, including significant cognitive decline [5]. Thus, understanding the brain alterations that accompany brain aging in healthy and MCI/AD populations could provide valuable insights into MCI/AD mechanisms.
Brain structural alterations can be assessed using neuroimaging markers [4, 6, 7], such as magnetic resonance imaging (MRI) [4], which, for instance, has demonstrated an association between gray matter volume reduction and increased age [8, 9]. The integration of neuroimaging with machine learning (ML)/artificial intelligence (AI) techniques has led to significant advancements in AD research [10–13]. However, existing studies predominantly focus on characterizing univariate relationships between age and selected or individual brain regions or tissue types [14, 15]. In reality, multiple brain regions show associations with age [16, 17] and work collaboratively in brain processes [7, 18, 19]. Prior studies on brain age prediction using neuroimaging data and ML/AI techniques have primarily focused on prediction accuracy [5, 6, 20], but the underlying regional features that contribute to such predictions are less interpretable or explainable. These studies, among others, highlight the critical need for ML/AI algorithms that are more interpretable or explainable, so that they can capture multivariate interactions between brain regions and provide their hierarchical contributions for a precise understanding of brain aging mechanisms in health and disorders, particularly AD.
In this study, we used semi-simulated and experimental data to validate and investigate the performance of various AI models, including conventional ML models and a deep learning (DL) model named AgeNet, for brain age prediction. We then integrated the best-performing predictive model, AgeNet, with Shapley additive explanations (SHAP) [21], local interpretable model-agnostic explanations (LIME) [13, 22], and layer-wise relevance propagation (LRP) [23], and compared their performance on the semi-simulated data, for which the ground truth is known. The best-performing combination, AgeNet-SHAP, was used to identify the hierarchy of multivariate associations between brain regions and brain age in cognitively normal control and MCI/AD populations. We hypothesize that our approach can effectively capture the important regional hubs related to aging in cognitively normal control and AD populations. In addition, we hypothesize that the AgeNet-SHAP framework can reveal the relevance of those regions to clinical severity.
Materials and methods
Experimental datasets
We utilized the T1-weighted MR images of 825 participants at their baseline visit from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) [24] database as the experimental data for this study. Among them, 684 participants (55.1–91.5 years old, with a mean of 73.08 and standard deviation of 7.43 years; 317 females) belonged to ADNI-2 (referred to as “ADNI-2” data), and 141 participants (58.3–94.0 years old, with a mean of 77.48 and standard deviation of 7.43 years; 64 females) belonged to ADNI-1, ADNI-GO, ADNI-3, and ADNI-4 (collectively referred to as “ADNI-others” data). In the ADNI-2 data, there were 206 cognitively normal (CN) individuals and 478 MCI/AD patients based on the record of their baseline visit. The MCI/AD participants were further categorized into 171 early MCI (EMCI), 152 late MCI (LMCI), 6 MCI, and 149 AD. Similarly, the ADNI-others data were divided into 28 CN and 113 MCI/AD (23 EMCI, 45 MCI, 45 AD) participants based on their baseline visits. Both the ADNI-2 and ADNI-others data included the chronological age of each participant recorded on the MRI date. Furthermore, using the Multi-atlas region segmentation utilizing ensembles of registration techniques and parameter modification (MUSE) method [25], each MRI was segmented into 145 distinct regions of interest (ROIs), covering the whole-brain gray matter, white matter, and cerebrospinal fluid areas. The volumetric measurement of each ROI was extracted and then corrected for site and sex effects using an existing harmonization technique [19, 26]. Then, the ROI volumes for each participant were normalized by dividing each ROI volume by the sum of all ROI volumes. Normalization removes head size differences and makes the data more suitable for AI modeling. We used the normalized, harmonized ROI volumes to predict age, and this predicted age is referred to as brain age.
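The per-participant normalization described above can be sketched as follows. This is a minimal illustration, assuming the harmonized ROI volumes are held in a participants-by-ROIs NumPy array; the function name `normalize_roi_volumes` and the toy values are hypothetical and not taken from the study.

```python
import numpy as np

def normalize_roi_volumes(volumes):
    """Divide each participant's ROI volumes by that participant's total volume."""
    volumes = np.asarray(volumes, dtype=float)
    totals = volumes.sum(axis=1, keepdims=True)  # one total per participant (row)
    return volumes / totals

# Toy example: 3 hypothetical participants x 4 ROIs (the study uses 145 ROIs).
toy_volumes = np.array([
    [10.0, 20.0, 30.0, 40.0],
    [5.0, 5.0, 5.0, 5.0],
    [2.0, 4.0, 6.0, 8.0],
])
normalized = normalize_roi_volumes(toy_volumes)
```

After normalization every row sums to 1, so participants 1 and 3, whose brains differ only in overall size, end up with identical feature vectors.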
Note that although the ADNI-2 and ADNI-others cohorts originate from the same initiative, they differ in several aspects, such as age distribution and other demographic factors. Leveraging these datasets together enables a more robust evaluation of the generalizability of the proposed method. Additionally, the Clinical Dementia Rating Scale Sum of Boxes (CDR-SB) [27, 28] scores of all participants were used to investigate clinical severity. For each subject, the CDR-SB score closest to the MRI date was selected.
Data used in the preparation of this article were obtained from the ADNI database (adni.loni.usc.edu). The ADNI was launched in 2003 as a public-private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial MRI, positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of MCI and early AD.
Semi-simulated dataset
We constructed semi-simulated data to validate the ability of the AgeNet-SHAP approach to capture the hierarchical multivariate relationships between brain regions and brain age. The aim of the simulation analysis is to establish arbitrary multivariate correlations between brain volumetric measures and brain age, and to evaluate whether our methodology can accurately detect these complex relationships against a known ground truth. The semi-simulated data were created using the MRIs of 234 CN participants (56.5–90.3 years old, with a mean of 75.03 and standard deviation of 6.88 years; 121 females) from the ADNI-2 and ADNI-others datasets, which preserves normal brain variation. By starting with CN data, which represent healthy, non-pathological brain aging, we create a clean baseline in which any deviations introduced through our synthetic perturbations can be unambiguously interpreted as disease-like signals. Had we included any patient data in the simulation, their intrinsic neurobiological heterogeneity and unknown contributions to brain age prediction would have introduced uncontrolled confounding factors. To generate the semi-simulated data, we chose the 10 ROIs (gray matter or white matter) with the strongest negative correlation with age. The ROIs selected for perturbation are the left planum temporale, left central operculum, left superior frontal gyrus medial segment, right planum temporale, left opercular part of the inferior frontal gyrus, right medial frontal cortex, right posterior insula, right anterior limb of internal capsule, left planum polare, and right planum polare. For 60% of the participants, we reduced the volumes of the selected ROIs by a factor between 30% and 50% and increased age by a factor between 10% and 30%. These factors were drawn from a uniform random distribution. While perturbing, we kept age within the range of 55–97 years.
The perturbations retain the maximum negative correlation between age and the selected ROIs, while maintaining suboptimal correlations for the other ROIs. Further details of the semi-simulated data are presented in Supplementary Materials Section S1.
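The perturbation procedure described above can be sketched as below. This is an illustrative sketch, not the authors' code. Several details are assumptions because the text does not specify them: Pearson correlation is used to rank ROIs, one shrink factor and one age factor are drawn per perturbed participant (rather than per ROI), and the 55–97 year range is enforced by clipping. The function name `make_semi_simulated`, the seeds, and the toy data are hypothetical.

```python
import numpy as np

def make_semi_simulated(volumes, ages, n_rois=10, frac=0.6, seed=0):
    """Perturb the ROIs most negatively correlated with age in a subset of participants."""
    rng = np.random.default_rng(seed)
    volumes = np.asarray(volumes, dtype=float).copy()
    ages = np.asarray(ages, dtype=float).copy()
    n_subj = volumes.shape[0]

    # Pearson correlation of each ROI with age (assumed correlation measure).
    vc = volumes - volumes.mean(axis=0)
    ac = ages - ages.mean()
    corr = (vc * ac[:, None]).sum(axis=0) / (
        np.sqrt((vc ** 2).sum(axis=0)) * np.sqrt((ac ** 2).sum())
    )
    selected = np.argsort(corr)[:n_rois]  # most negative correlations first

    # Randomly choose the participants to perturb (60% by default).
    n_pert = int(round(frac * n_subj))
    subjects = rng.choice(n_subj, size=n_pert, replace=False)

    # Uniform factors: 30-50% volume reduction, 10-30% age increase.
    shrink = rng.uniform(0.30, 0.50, size=n_pert)
    age_up = rng.uniform(0.10, 0.30, size=n_pert)
    volumes[np.ix_(subjects, selected)] *= (1.0 - shrink)[:, None]
    # Clipping is an assumed way of keeping age within 55-97 years.
    ages[subjects] = np.clip(ages[subjects] * (1.0 + age_up), 55.0, 97.0)
    return volumes, ages, selected, subjects

# Toy data: 50 hypothetical participants x 20 ROIs.
toy_rng = np.random.default_rng(1)
toy_ages = toy_rng.uniform(56.0, 90.0, size=50)
toy_vol = toy_rng.uniform(1.0, 2.0, size=(50, 20))
sim_vol, sim_age, rois, subj = make_semi_simulated(toy_vol, toy_ages)
```

The returned `rois` index array plays the role of the ground truth: a valid feature importance method should rank these ROIs highest.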
Predicting brain age and regional patterns via explainable AI models
We used three ML models and one DL model to predict brain age. The best-performing model was then combined with feature importance methods (SHAP, LIME, and LRP) to identify the significant multivariate brain regional hubs contributing to the brain age prediction. The ML models used in this study include Lasso regression (LR) [29, 30], Ridge regression (RR) [31, 32], and support vector regression (SVR) [33]. These models were selected for their effectiveness in handling high-dimensional data, with SVR being particularly useful due to its non-linear capabilities. The hyperparameters for each model were optimized using a grid search based on the negative mean absolute error scoring metric. For LR and RR, the best alpha values were chosen within the ranges of 0.15–0.35 with an increment of 0.05 and 150–250 with an increment of 10, respectively. This parameter controls the strength of the regularization applied to the model to prevent overfitting. LR uses L1 regularization, while RR uses L2 regularization. For SVR, the Radial Basis Function (RBF) kernel was used, and the gamma, epsilon, and C hyperparameters were tuned across the following values: gamma for n = −12 to −2 with an increment of 1, epsilon for n = −7 to 1 with an increment of 1, and C for n = −1 to 4 with an increment of 1. These hyperparameters work together to balance the model’s complexity, generalization ability, and sensitivity to training data. The regularization parameter (C) determines how strictly the model adheres to the training data by penalizing errors. The epsilon parameter defines the width of the insensitive tube around the predicted regression line, within which errors are not penalized. Finally, gamma determines how far the influence of a single training point extends, effectively controlling the flexibility of the fitted regression function. The ML models were trained following a 10-fold cross-validation (CV) [34] procedure, in which the data were split into training and testing sets.
The dataset was first randomly shuffled with a fixed random state to ensure reproducibility, then divided into 10 equal folds. For each iteration of CV, 9 folds (90% of the data) were used for model training, while the remaining 1 fold (10%) served as the test set. During training, we further allocated 10% of the training data as a validation set to monitor model performance. All reported performance metrics were calculated on the independent test sets across all 10 iterations. Note that the test data is distinct from train data and is not used in training the models.
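The splitting scheme described above can be sketched with plain NumPy index arithmetic. This is a minimal sketch under stated assumptions: the seed value 42 is hypothetical (the text only says a fixed random state was used), folds are "equal" up to one sample because 825 is not divisible by 10, and the validation subset is taken as the first 10% of the shuffled training pool, which the text does not specify.

```python
import numpy as np

def ten_fold_splits(n_samples, n_folds=10, val_frac=0.10, seed=42):
    """Yield (train, validation, test) index arrays for each CV iteration."""
    rng = np.random.default_rng(seed)          # fixed seed for reproducibility
    shuffled = rng.permutation(n_samples)      # shuffle once, before splitting
    folds = np.array_split(shuffled, n_folds)  # near-equal folds
    for k in range(n_folds):
        test_idx = folds[k]
        train_pool = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        n_val = int(round(val_frac * len(train_pool)))
        # Hold out part of the training pool to monitor performance during training.
        val_idx, train_idx = train_pool[:n_val], train_pool[n_val:]
        yield train_idx, val_idx, test_idx

splits = list(ten_fold_splits(825))  # 825 = number of participants in the study
```

Each participant appears in exactly one test fold, so concatenating the test predictions across the 10 iterations gives one out-of-sample prediction per participant.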
Furthermore, given the ability of DL models to capture complex non-linear relationships better than conventional ML models, we tested brain age prediction using a basic deep neural network. Considering the limitations of our dataset (i.e. small to moderate sample size), we chose this model because its architectural simplicity helps it generalize while limiting overfitting. The DL model, named AgeNet, consists of five layers for estimating brain age: four hidden dense layers followed by a final output dense layer. The hidden layers contain a decreasing number of units (256, 128, 64, and 32), each using a rectified linear unit (ReLU) activation function, followed by batch normalization and a dropout layer with a dropout rate of 0.4. The final output layer has a single unit with a linear activation function to predict the brain age value. The training configuration comprised the mean absolute error (MAE) loss function, the Adam optimizer, an initial learning rate of 0.01, a batch size of 64, and 300 epochs. Note that we used only the MAE loss function, without any additional weighted loss components. The hyperparameters were tuned to optimize the model’s performance, and the details of hyperparameter tuning are given in Supplementary Materials Section S2. This model was also trained using 10-fold CV, and the reported results were computed on the test data. The test data do not overlap with the training data, so the reported metrics reflect out-of-sample performance.
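The architecture described above can be sketched as follows. The text does not name the deep learning framework, so the use of PyTorch here is an assumption, as are the default weight initialization and the helper name `build_agenet`. The single training step uses random placeholder data purely to show how the MAE loss and Adam optimizer fit together; it is not the authors' training loop.

```python
import torch
from torch import nn

def build_agenet(n_rois=145, dropout=0.4):
    """Four hidden dense blocks (256, 128, 64, 32 units) plus a linear output unit."""
    layers = []
    in_features = n_rois
    for units in (256, 128, 64, 32):
        # Dense -> ReLU -> batch normalization -> dropout, in the order given in the text.
        layers += [
            nn.Linear(in_features, units),
            nn.ReLU(),
            nn.BatchNorm1d(units),
            nn.Dropout(dropout),
        ]
        in_features = units
    layers.append(nn.Linear(in_features, 1))  # single unit, linear activation
    return nn.Sequential(*layers)

model = build_agenet()
loss_fn = nn.L1Loss()                                       # mean absolute error
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)   # initial learning rate 0.01

# One illustrative optimization step on random placeholder data (batch size 64).
x = torch.rand(64, 145)
y = torch.rand(64, 1) * 40.0 + 55.0  # placeholder "ages" in roughly 55-95
model.train()
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
```

In a full run this step would be repeated over mini-batches for 300 epochs within each CV fold, with `model.eval()` called before predicting on the test fold so that dropout is disabled and batch normalization uses its running statistics.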
Since the AgeNet model tends to perform better than the ML models for brain age prediction, we integrated it with SHAP method to identify the significant brain regional hubs hierarchically involved in the brain age prediction. SHAP is a widely used technique for feature importance that accounts for interactions among features. It is also generally more stable and consistent than other promising feature importance methods, such as LIME [13, 22] and LRP [23]. The detailed comparison of SHAP with other feature importance techniques (LIME and LRP) is included in Supplementary Materials Section S3. The significance of a brain region (ROI) is determined based on SHAP value, and this value is computed by considering the impact of the region on the final predictive output by adding and removing it from the input. The SHAP value for a region is obtained by averaging the absolute SHAP values of that feature for all participants. A higher average absolute SHAP indicates a higher significance of that feature for making the prediction. The equation used to calculate the SHAP value for an input feature r is [10]:
$$\phi_r(f, x) = \sum_{z' \subseteq x'} \frac{|z'|!\,(N - |z'| - 1)!}{N!}\,\left[f_x(z') - f_x(z' \setminus r)\right]$$

where $x$ denotes the input ROIs and $y$ is the output of the AgeNet model ($f$). Let $x'$ be the simplified input, which maps to the original input $x$ through the function $x = h_x(x')$, with $f_x(z') = f(h_x(z'))$; specifically, for absent features $h_x$ substitutes baseline values (typically derived from the training distribution, $X_{train}$), and for present features $h_x$ retains the original values from the instance being explained. $N$ is the length of the binary coalition vector, and $z'$ is a binary coalition vector that represents a perturbed version of the input features. We used SHAP’s primary explainer interface to quantify feature contributions through Shapley values. The overall AgeNet-SHAP workflow is illustrated in Fig. 1. As shown in the figure, given the ROI features and age as input, the model f (AgeNet) is trained on the training set. The architecture of the model f is shown in the uppermost block of the diagram. Once the model is trained, the test set is given to it as input to evaluate its predictive performance and to determine the significance of the input ROI features for brain age prediction using their SHAP values. We computed the SHAP values on the test data for each fold during the 10-fold CV.
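To make the Shapley value concrete, the sketch below computes it exactly by enumerating every coalition of features, replacing absent features with baseline values. This brute-force version is exponential in the number of features and is only feasible for toy inputs; the SHAP library relies on approximations, so this is not the authors' computation. The names `exact_shapley` and `toy_model`, and all numeric values, are hypothetical.

```python
from itertools import combinations
from math import factorial

def exact_shapley(f, x, baseline):
    """Exact Shapley values of f at x, with absent features set to baseline values."""
    n = len(x)

    def value(present):
        # Present features keep their original value; absent ones use the baseline.
        z = [x[i] if i in present else baseline[i] for i in range(n)]
        return f(z)

    phi = [0.0] * n
    for r in range(n):
        others = [i for i in range(n) if i != r]
        for size in range(n):
            # Weight of a coalition of this size that excludes feature r.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[r] += weight * (value(s | {r}) - value(s))
    return phi

def toy_model(z):
    # Hypothetical "brain age" model with one interaction term between features 1 and 2.
    return 80.0 - 3.0 * z[0] - 2.0 * z[1] + 0.5 * z[1] * z[2]

x = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = exact_shapley(toy_model, x, baseline)
```

The values in `phi` sum to the difference between the prediction at `x` and the prediction at the baseline, and the interaction term is shared equally between the two features involved, which is the sense in which Shapley values account for interactions among features.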
Figure 1.
The schematic workflow of the AgeNet-SHAP method. Given the brain regional features as input, it predicts the brain age using the f (AgeNet) model. The architecture of the f model is provided in the top block. It also computes the SHAP values of the input features to determine their impact on the output of f model
For robustness, we ran each model five times and averaged the results, where each run consisted of a full 10-fold CV training-testing process. Model performance was evaluated by computing the Spearman rank-order correlation coefficient [35, 36] between the actual and predicted brain age within the test set, together with the P-value of this correlation. This is a non-parametric approach that does not make assumptions about the underlying data distribution. The P-values were then false discovery rate (FDR)-corrected, because FDR correction controls the rate of false positives when performing multiple comparisons, ensuring that observed significant correlations are more likely to reflect true effects rather than random variation. Additionally, to display the brain regions, they were overlaid on the standard MRI template in MNI space using the axial orientation. Moreover, we assessed brain structural differences between the CN group and individuals with MCI and AD using Cohen’s d [37], a widely recognized measure of effect size. Individualized AgeNet-SHAP regional values were used to compute Cohen’s d. Only ROIs with FDR-corrected P-values < .05 were considered statistically significant. Comparisons were made between the CN group and both the MCI and AD groups to examine how brain alterations, in terms of multivariate regional associations with brain age, evolve with group-wise disease progression.
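The two statistical steps above, FDR correction and Cohen’s d, can be sketched as follows. The text does not name the FDR procedure, so the Benjamini-Hochberg method used here is an assumption, as is the pooled-standard-deviation form of Cohen’s d. The function names and toy values are hypothetical.

```python
import numpy as np

def fdr_bh(p_values):
    """Benjamini-Hochberg adjusted P-values (assumed FDR procedure)."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)  # p_(i) * m / i
    # Enforce monotonicity, working from the largest P-value downwards.
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(adjusted_sorted, 0.0, 1.0)
    return adjusted

def cohens_d(a, b):
    """Cohen's d between two groups using the pooled standard deviation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

# Toy usage: four hypothetical per-ROI P-values and two small groups.
adjusted = fdr_bh([0.01, 0.04, 0.03, 0.005])
d = cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
```

In the setting described here, `cohens_d` would be applied once per ROI to the individualized SHAP values of two groups (for example CN and AD), and only ROIs whose adjusted P-value falls below .05 would be retained.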
Clinical severity of AD
The clinical severity of MCI/AD was explored by investigating the correlations between the individualized AgeNet-SHAP regional features obtained from the experimental data and the CDR-SB [27, 28] scores. A CDR-SB score ranges from 0 to 18, derived from the assessment of three cognitive and three functional domains (memory, orientation, judgment and problem-solving, community affairs, home and hobbies, and personal care), where a higher score indicates a greater clinical severity. This analysis highlights the relevance of AgeNet-SHAP based generated features to determine the severity of MCI/AD.
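The per-region severity analysis described above can be sketched as a Spearman correlation between each ROI's individualized SHAP values and the CDR-SB scores. The sketch below implements Spearman's coefficient as the Pearson correlation of average ranks, which handles the many ties expected in CDR-SB; P-values are omitted for brevity. The function names and toy data are hypothetical.

```python
import numpy as np

def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    a = np.asarray(values, dtype=float)
    order = np.argsort(a, kind="mergesort")
    ranks = np.empty(len(a), dtype=float)
    ranks[order] = np.arange(1, len(a) + 1)
    _, inverse, counts = np.unique(a, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ranks)
    return rank_sums[inverse] / counts[inverse]

def spearman_rho(x, y):
    """Spearman correlation as the Pearson correlation of average ranks."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])

def roi_severity_correlations(shap_values, cdr_sb):
    """Correlate each ROI column of SHAP values with the CDR-SB scores."""
    shap_values = np.asarray(shap_values, dtype=float)
    return np.array([
        spearman_rho(shap_values[:, j], cdr_sb) for j in range(shap_values.shape[1])
    ])

# Toy data: 6 hypothetical participants x 2 ROIs.
toy_cdr_sb = np.array([0.0, 0.5, 0.5, 1.0, 3.0, 4.5])
toy_shap = np.column_stack([
    [0.1, 0.2, 0.2, 0.3, 0.5, 0.9],        # rises with severity
    [-0.1, -0.2, -0.2, -0.3, -0.5, -0.9],  # falls with severity
])
rho = roi_severity_correlations(toy_shap, toy_cdr_sb)
```

A positive coefficient means that a region pushes the predicted brain age upward more strongly in participants with higher CDR-SB scores, and a negative coefficient means the opposite.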
Results
Semi-simulated data results
In the semi-simulated data, the DL-based AgeNet and conventional ML models were compared for their performance in predicting brain age, as presented in Table 1. The AgeNet model achieved a correlation of 0.900 between the actual and predicted brain age, with an FDR-corrected P-value of 3.11E-87 (Fig. 2a), while the LR, RR, and SVR models obtained correlations of 0.769, 0.758, and 0.738, respectively. AgeNet also had the lowest test loss, while the ML models had higher test losses, suggesting that the AgeNet model captures complex non-linear relationships in the data more effectively than conventional ML models. Moreover, AgeNet-SHAP identified all perturbed brain regions (shown in Fig. 2b) as significant brain regions for brain age prediction based on their SHAP values, as illustrated in Fig. 2c. The detailed comparison of AgeNet-SHAP with AgeNet-LIME and AgeNet-LRP is included in Supplementary Materials Section S3. Note that Fig. 2b highlights only the regions that were perturbed to create the semi-simulated data. Since SHAP values are continuous and reflect a hierarchy of feature importance, the perturbed regions in Fig. 2c are shown in different colors. Despite the variation in color intensity, which indicates differing levels of importance, all highlighted regions fall within the top 10 most significant regions for brain age prediction. This suggests that our model consistently identified the perturbed regions as highly influential for the prediction task, even though their relative contributions may differ.
Table 1.
The performance of LR, RR, SVR, and AgeNet for brain age predictions using the semi-simulated data.
| Model | Test loss | Spearman’s correlation | P-value |
|---|---|---|---|
| LR | 4.493 | 0.769 | 1.28E-48 |
| RR | 4.723 | 0.758 | 2.00E-46 |
| SVR | 5.418 | 0.738 | 7.20E-43 |
| AgeNet | 2.319 | 0.900 | 3.11E-87 |
Figure 2.
AgeNet-SHAP based brain age prediction using the semi-simulated data. (a) Spearman’s correlation between the actual and predicted brain age. (b) The top 10 regions chosen for perturbation are shown in red. (c) Significant brain regions identified by AgeNet-SHAP. The results are averaged over all participants. z shows the brain slice position
Experimental data results
The models were trained and tested on the entire experimental dataset following the 10-fold CV process, and the results were then separated into the ADNI-2 and ADNI-others test samples. The AgeNet model outperformed the conventional ML models for brain age prediction on both the ADNI-2 and ADNI-others experimental datasets, as shown in Table 2. It obtained a correlation of 0.870 (P = 2.20E-308, FDR-corrected, Fig. 3a) and a correlation of 0.890 (P = 2.20E-308, FDR-corrected, Fig. 4a) between the actual and predicted age on the ADNI-2 and ADNI-others data, respectively, along with the lowest test loss (3.429) on the entire experimental data. In contrast, the correlations achieved by LR, RR, and SVR on the ADNI-2 data were 0.631, 0.622, and 0.625, respectively, and the correlations obtained by these models on the ADNI-others data were 0.564, 0.577, and 0.586, respectively. Additionally, AgeNet integrated with SHAP (AgeNet-SHAP) identified the significant brain regional hubs hierarchically involved in brain age estimation across the CN, MCI, and AD groups, as illustrated in Fig. 3b–d for the ADNI-2 data and Fig. 4b–d for the ADNI-others data. Within each group, 90% of the top 50 significant brain regions overlapped between the ADNI-2 and ADNI-others data. Furthermore, Figs 3e and 4e show the Cohen’s d differences between the CN and MCI groups in the ADNI-2 and ADNI-others data, respectively. Likewise, Figs 3f and 4f show the Cohen’s d differences between the CN and AD groups in the ADNI-2 and ADNI-others data, respectively. Figure 3e and f show that, relative to the CN group, AD patients exhibited much larger brain alterations than MCI patients, in both range and magnitude.
For example, the right amygdala and left middle temporal gyrus showed moderate differences in MCI relative to CN, whereas in AD they showed much larger differences relative to CN. A similar pattern was observed in Fig. 4e and f, where AD patients again showed more pronounced brain alterations than MCI patients, relative to the CN participants.
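The overlap between the top-ranked regions of two cohorts, as reported above, can be computed along the following lines. This is a sketch under assumptions: region importance is taken as the mean absolute SHAP value per ROI (as defined in the Methods), and overlap is the fraction of shared indices among the top k regions; the exact procedure used for the reported 90% figure is not described in the text. The function names and toy values are hypothetical.

```python
import numpy as np

def mean_abs_shap(shap_values):
    """Region importance: average absolute SHAP value over participants."""
    return np.abs(np.asarray(shap_values, dtype=float)).mean(axis=0)

def top_k_overlap(importance_a, importance_b, k=50):
    """Fraction of the top-k regions shared by two importance rankings."""
    top_a = set(np.argsort(importance_a)[::-1][:k].tolist())
    top_b = set(np.argsort(importance_b)[::-1][:k].tolist())
    return len(top_a & top_b) / k

# Toy example with 10 hypothetical regions and k = 5.
importance_cohort_a = np.array([10, 9, 8, 7, 6, 5, 4, 3, 2, 1], dtype=float)
importance_cohort_b = np.array([9, 10, 7, 8, 0, 6, 1, 2, 3, 4], dtype=float)
overlap = top_k_overlap(importance_cohort_a, importance_cohort_b, k=5)
```

In this toy case four of the five top regions are shared, giving an overlap of 0.8; with 145 ROIs and k = 50 the same function would yield the kind of percentage reported for the two cohorts.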
Table 2.
The performance of LR, RR, SVR, and AgeNet for brain age predictions using the experimental datasets.
| Dataset | Evaluation metric | LR | RR | SVR | AgeNet |
|---|---|---|---|---|---|
| ADNI-2 | Test loss | 4.738 | 4.725 | 4.745 | 3.429 |
| ADNI-2 | Spearman’s correlation | 0.631 | 0.622 | 0.625 | 0.870 |
| ADNI-2 | P-value | 2.05E-77 | 1.33E-74 | 1.80E-75 | 2.20E-308 |
| ADNI-others | Test loss | 4.738 | 4.725 | 4.745 | 3.429 |
| ADNI-others | Spearman’s correlation | 0.564 | 0.577 | 0.586 | 0.890 |
| ADNI-others | P-value | 5.09E-13 | 1.08E-13 | 3.72E-14 | 2.20E-308 |
Figure 3.
AgeNet-SHAP on ADNI-2 data. (a) Spearman’s correlation between the actual and predicted brain age for entire data. The individualized AgeNet-SHAP regional features were averaged within each group and significant brain regions are shown for (b) CN, (c) MCI, and (d) AD groups. Cohen’s d effect size group differences are presented for (e) CN–MCI and (f) CN–AD. The results are averaged over all participants. z shows the brain slice position. Color areas show the significant regions while the gray areas show the nonsignificant background
Figure 4.
AgeNet-SHAP on ADNI-others data. (a) Spearman’s correlation between the actual and predicted brain age for entire data. The individualized AgeNet-SHAP regional features were averaged within each group and significant brain regions are shown for (b) CN, (c) MCI, and (d) AD groups. Cohen’s d effect size group differences are presented for (e) CN–MCI and (f) CN–AD. The results are averaged over all participants. z shows the brain slice position. Color areas show the significant regions while the gray areas show the nonsignificant background
Clinical severity results
The multivariate regional AgeNet-SHAP features were examined for associations with the clinical severity scores in the AD continuum using the entire experimental data. The brain regions significantly associated with CDR-SB scores, based on individualized AgeNet-SHAP features, are depicted in Fig. 5. Only correlations with P-values < .05 were considered significant and are presented. Notable brain regions exhibiting positive correlations with CDR-SB in the AD continuum were the left precuneus, left middle temporal gyrus, right fornix, right ventral DC, and left entorhinal area. Conversely, prominent regions showing significant negative correlations with CDR-SB in the AD continuum were the left occipital fusiform gyrus, left superior parietal lobule, left subcallosal area, right basal forebrain, and right triangular part of the inferior frontal gyrus.
Figure 5.
Spearman’s correlations between the individualized AgeNet-SHAP regional features and the CDR-SB scores across the entire experimental data. The color bar highlights the correlation values. The results are averaged over all participants. z shows the brain slice position. Color areas show the significant regions while the gray areas show the nonsignificant background
Discussion
This study systematically evaluated the performance of various ML/DL models for brain age prediction in normal aging and MCI/AD population and found that the DL model (AgeNet) outperformed the conventional ML models on both semi-simulated and experimental datasets. We further integrated the outperforming AgeNet model with SHAP, LIME, and LRP feature importance strategies and found that AgeNet-SHAP outperformed AgeNet-LIME and AgeNet-LRP. This integrated AgeNet-SHAP method captured the multivariate associations between brain age and neuroanatomical changes in CN, MCI, and AD individuals. Group differences assessed using multivariate regional AgeNet-SHAP features between the CN and MCI groups, as well as between the CN and AD groups revealed that the disease progression (i.e. from CN to MCI to AD dementia stages) leads to increased and widely distributed multivariate hierarchical brain atrophy patterns. Additionally, individualized AgeNet-SHAP regional features further correlated with the participants’ clinical severity scores, suggesting the potential clinical relevance of these novel multivariate features.
We observed an overlap among the most significant brain regions identified by AgeNet-SHAP across the CN, MCI, and AD groups within both the ADNI-2 and ADNI-others data. However, the order and magnitude of the significance varied among the groups for each dataset. For instance, in both cohorts, the left anterior limb of internal capsule showed greater significance in AD than in CN, while the significance of the left planum polare decreased from CN to AD. This might suggest that the left planum polare is more susceptible to AD in the early stage of the disease course. Likewise, the distinct regional hubs in MCI/AD compared to CN suggest that those regions are highly significant for AD progression. Disease progression alters the brain in such a way that the associations between brain regions and brain age change noticeably, and our model effectively captured those changes. Understanding such structural brain changes at different stages of the disease course is critical for developing more targeted treatments for patients with different levels of clinical severity. Additionally, when comparing the results from the ADNI-2 and ADNI-others datasets, the AgeNet model achieved consistently high correlations for brain age prediction in both cohorts, indicating robust performance and strong generalizability across datasets. Furthermore, notable overlaps, as well as some variations, were observed in the significant brain regional hubs identified by AgeNet-SHAP across CN and patient groups in both datasets. The overlapping regions highlight the high reproducibility of AgeNet-SHAP results, further supporting the reliability of our method. Meanwhile, the observed differences suggest that the model captures underlying variations in dataset characteristics, such as age distribution. Overall, these findings demonstrate the robustness of the proposed method and reinforce its validity and applicability across heterogeneous datasets.
Moreover, the AgeNet-SHAP method identified several significant brain regions associated with brain age estimation, and many of our findings align with previous studies. For instance, the third ventricle was previously recognized as a key region for brain age in the MCI and AD groups [38], and our method similarly highlighted it in these groups. Additionally, brainstem atrophy has been shown to correlate with aging in AD participants [39], and our approach also linked this region to aging in AD. The left planum polare has been associated with a high brain age gap in AD [40], and AgeNet-SHAP identified it as a dominant region for brain age estimation in both MCI and AD groups. The broad consistency between our findings and the existing literature supports the validity of our method and results, as well as the relevance of the identified novel multivariate hierarchical regions in normal aging and MCI/AD.
Furthermore, the significant brain regions captured in the clinical severity analysis are also consistent with findings from the existing literature. For instance, prominent atrophy is known to occur in the temporal lobe in AD [41, 42], and our analysis identified the left middle temporal gyrus as having significant positive correlations with CDR-SB scores. Similarly, brain atrophy with normal aging has been observed in the occipital and parietal lobes [43], and our method identified the superior parietal lobule and occipital fusiform gyrus as having notable negative correlations with CDR-SB. These findings suggest that AgeNet-SHAP-derived multivariate regional features are meaningful and can capture clinical severity patterns associated with both AD and normal aging processes.
In conclusion, this study modelled the multivariate associations between brain age and MRI-derived brain regions by integrating the optimal AI predictive model with the feature importance technique SHAP. It demonstrated that multiple brain regions jointly contribute to aging processes, and that the proposed multivariate approach captures the dominant regional hubs hierarchically involved in brain aging and clinical severity prediction. Overall, our explainable AI-based modelling approach captured multivariate regional brain age and clinical severity patterns in MCI/AD relative to CN, thereby providing a deeper understanding of disease progression mechanisms. This work can be extended in the future to other neuroimaging modalities, such as PET, EEG, and DTI, which could reveal additional aspects of brain aging and support a more complete understanding of brain aging mechanisms in MCI/AD. Furthermore, future research could investigate voxel-wise multivariate associations between brain regions and brain age prediction using more advanced DL models. With the availability of larger datasets, the approach can also be extended to analyze narrower age ranges for finer-grained insights into aging.
Acknowledgments
The authors acknowledge the publicly available ADNI dataset (https://adni.loni.usc.edu). Data collection and sharing for this project was funded by the ADNI (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California. The preliminary preprint of this work is available on medRxiv (doi: https://doi.org/10.1101/2025.02.28.25323097) and will be linked to the published version after the paper is published.
Contributor Information
Gauri Darekar, Department of Radiology, Mallinckrodt Institute of Radiology, Washington University School of Medicine, St Louis, MO, 63110, United States; Institute for Informatics, Data Science and Biostatistics, Washington University School of Medicine, St Louis, MO, 63110, United States.
Taslim Murad, Department of Radiology, Mallinckrodt Institute of Radiology, Washington University School of Medicine, St Louis, MO, 63110, United States.
Hui-Yuan Miao, Department of Radiology, Mallinckrodt Institute of Radiology, Washington University School of Medicine, St Louis, MO, 63110, United States.
Deepa S Thakuri, Department of Radiology, Mallinckrodt Institute of Radiology, Washington University School of Medicine, St Louis, MO, 63110, United States; School of Medicine, University of Missouri, Columbia, MO, 65212, United States.
Ganesh B Chand, Department of Radiology, Mallinckrodt Institute of Radiology, Washington University School of Medicine, St Louis, MO, 63110, United States; Imaging Core, Knight Alzheimer Disease Research Center, Washington University School of Medicine, St Louis, MO, 63110, United States; Institute of Clinical and Translational Sciences, Washington University School of Medicine, St Louis, MO, 63110, United States; NeuroGenomics and Informatics Center, Washington University School of Medicine, St Louis, MO, 63110, United States.
Alzheimer’s Disease Neuroimaging Initiative:
Gauri Darekar, Taslim Murad, Hui-Yuan Miao, Deepa S Thakuri, and Ganesh B Chand
Author contributions
Gauri Darekar (Conceptualization [lead], Data curation [equal], Formal analysis [lead], Investigation [equal], Methodology [equal], Software [equal], Validation [equal], Visualization [equal], Writing—original draft [lead], Writing—review & editing [supporting]), Taslim Murad (Conceptualization [equal], Data curation [equal], Formal analysis [lead], Investigation [equal], Methodology [equal], Software [equal], Validation [equal], Visualization [equal], Writing—original draft [lead], Writing—review & editing [lead]), Hui-Yuan Miao (Investigation [supporting], Writing—original draft [supporting], Writing—review & editing [supporting]), Deepa S. Thakuri (Investigation [supporting], Writing—original draft [supporting], Writing—review & editing [supporting]), and Ganesh B. Chand (Conceptualization [lead], Data curation [equal], Formal analysis [equal], Funding acquisition [lead], Investigation [lead], Methodology [equal], Project administration [equal], Resources [lead], Software [equal], Supervision [equal], Validation [equal], Visualization [equal], Writing—original draft [equal], Writing—review & editing [supporting]).
Supplementary data
Supplementary data is available at Biology Methods and Protocols online.
Conflict of interest statement. None declared.
Funding
This work was supported by the Mallinckrodt Institute of Radiology (MIR) of Washington University in St Louis and by the National Institutes of Health (NIH) [K01AG083230 to G.B.C.].
Data availability
Original ADNI data are publicly available (https://adni.loni.usc.edu/). The generated results will be shared upon reasonable request following applicable human subjects’ data transfer procedures.
Code availability
Our code is available at https://github.com/ganchand/AgeNet-SHAP. For harmonization, we used the code located at https://github.com/ganchand/AJP_Codes. The MUSE code is located at https://github.com/CBICA/MUSE.
References
- 1. Alzheimer’s Association. 2024 Alzheimer’s disease facts and figures. Alzheimers Dement 2024;20:3708–821.
- 2. Parums DV. A review of the current status of disease-modifying therapies and prevention of Alzheimer’s disease. Med Sci Monit 2024;30:e945091-1.
- 3. Pan Y, Nicolazzo JA. Impact of aging, Alzheimer’s disease and Parkinson’s disease on the blood-brain barrier transport of therapeutics. Adv Drug Deliv Rev 2018;135:62–74.
- 4. Beheshti I, Mishra S, Sone D et al. T1-weighted MRI-driven brain age estimation in Alzheimer’s disease and Parkinson’s disease. Aging Dis 2020;11:618–28.
- 5. Bashyam VM, Erus G, Doshi J et al. MRI signatures of brain age and disease over the lifespan based on a deep brain network and 14 468 individuals worldwide. Brain 2020;143:2312–24.
- 6. Lee J, Burkett BJ, Min H-K et al. Deep learning-based brain age prediction in normal aging and dementia. Nat Aging 2022;2:412–24.
- 7. Chand GB, Wu J, Hajjar I et al. Interactions of the salience network and its subsystems with the default-mode and the central-executive networks in normal aging and mild cognitive impairment. Brain Connect 2017;7:401–12.
- 8. Courchesne E, Chisum HJ, Townsend J et al. Normal brain development and aging: quantitative analysis at in vivo MR imaging in healthy volunteers. Radiology 2000;216:672–82.
- 9. Lemaitre H, Goldman AL, Sambataro F et al. Normal age-related brain morphometric changes: nonuniformity across cortical thickness, surface area and gray matter volume? Neurobiol Aging 2012;33:617.e1–e9.
- 10. Bhattarai P, Thakuri DS, Nie Y et al. Explainable AI-based deep-SHAP for mapping the multivariate relationships between regional neuroimaging biomarkers and cognition. Eur J Radiol 2024;174:111403.
- 11. Millar PR, Gordon BA, Luckett PH et al.; Dominantly Inherited Alzheimer Network. Multimodal brain age estimates relate to Alzheimer disease biomarkers and cognition in early stages: a cross-sectional observational study. Elife 2023;12:e81869.
- 12. Wen J, Varol E, Sotiras A et al.; Alzheimer’s Disease Neuroimaging Initiative. Multi-scale semi-supervised clustering of brain images: deriving disease subtypes. Med Image Anal 2022;75:102304.
- 13. Bhattarai P, Taha A, Soni B et al. Predicting cognitive dysfunction and regional hubs using Braak staging amyloid-beta biomarkers and machine learning. Brain Inform 2023;10:33.
- 14. Dickerson BC, Brickhouse M, McGinnis S et al. Alzheimer’s disease: the influence of age on clinical heterogeneity through the human brain connectome. Alzheimers Dement (Amst) 2017;6:122–35.
- 15. Sun J, Han J-DJ, Chen W. Exploring the relationship among Alzheimer’s disease, aging and cognitive scores through neuroimaging-based approach. Sci Rep 2024;14:27472.
- 16. Fujita S, Mori S, Onda K et al. Characterization of brain volume changes in aging individuals with normal cognition using serial magnetic resonance imaging. JAMA Netw Open 2023;6:e2318153.
- 17. Driscoll I, Davatzikos C, An Y et al. Longitudinal pattern of regional brain volume change differentiates normal aging from MCI. Neurology 2009;72:1906–13.
- 18. Chand GB, Hajjar I, Qiu D. Disrupted interactions among the hippocampal, dorsal attention, and central‐executive networks in amnestic mild cognitive impairment. Hum Brain Mapp 2018;39:4987–97.
- 19. Chand GB, Thakuri DS, Soni B. Salience network anatomical and molecular markers are linked with cognitive dysfunction in mild cognitive impairment. J Neuroimaging 2022;32:728–34.
- 20. Leonardsen EH, Peng H, Kaufmann T et al. Deep neural networks learn general and clinically relevant representations of the ageing brain. NeuroImage 2022;256:119210.
- 21. Lundberg S. A unified approach to interpreting model predictions. arXiv, arXiv:1705.07874, 2017, preprint: not peer reviewed.
- 22. Ribeiro MT, Singh S, Guestrin C. “Why should I trust you?” Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco, CA, USA: The ACM, 2016.
- 23. Montavon G et al. Layer-wise relevance propagation: an overview. Explainable AI: interpreting, explaining and visualizing deep learning. New York, NY, USA: Springer Nature, 2019, 193–209.
- 24. Jack CR, Bernstein MA, Fox NC et al. The Alzheimer’s disease neuroimaging initiative (ADNI): MRI methods. J Magn Reson Imaging 2008;27:685–91.
- 25. Doshi J, Erus G, Ou Y et al.; Alzheimer’s Neuroimaging Initiative. MUSE: MUlti-atlas region Segmentation utilizing Ensembles of registration algorithms and parameters, and locally optimal atlas selection. NeuroImage 2016;127:186–95.
- 26. Chand GB, Singhal P, Dwyer DB et al. Schizophrenia imaging signatures and their associations with cognition, psychopathology, and genetics in the general population. Am J Psychiatry 2022;179:650–60.
- 27. Morris JC, Storandt M, Miller JP et al. Mild cognitive impairment represents early-stage Alzheimer disease. Arch Neurol 2001;58:397–405.
- 28. Morris JC. The Clinical Dementia Rating (CDR) current version and scoring rules. Neurology 1993;43:2412–4.
- 29. Ranstam J, Cook JA. LASSO regression. J Br Surg 2018;105:1348.
- 30. Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc Ser B Stat Methodol 1996;58:267–88.
- 31. de Vlaming R, Groenen PJ. The current and future use of ridge regression for prediction in quantitative genetics. BioMed Res Int 2015;2015:143712.
- 32. Hoerl AE, Kennard RW. Ridge regression: biased estimation for nonorthogonal problems. Technometrics 1970;12:55–67.
- 33. Chand GB, Habes M, Dolui S et al. Estimating regional cerebral blood flow using resting-state functional MRI via machine learning. J Neurosci Methods 2020;331:108528.
- 34. Stone M. Cross‐validatory choice and assessment of statistical predictions. J R Stat Soc Ser B (Methodol) 1974;36:111–33.
- 35. Bishara AJ, Hittner JB. Reducing bias and error in the correlation coefficient due to nonnormality. Educ Psychol Meas 2015;75:785–804.
- 36. Arndt S, Turvey C, Andreasen NC. Correlating and predicting psychiatric symptom ratings: Spearman’s r versus Kendall’s tau correlation. J Psychiatr Res 1999;33:97–104.
- 37. Cohen J. Statistical Power Analysis for the Behavioral Sciences. New York, NY, USA: Routledge, 2013.
- 38. Apostolova LG, Green AE, Babakchanian S et al. Hippocampal atrophy and ventricular enlargement in normal aging, mild cognitive impairment (MCI), and Alzheimer Disease. Alzheimer Dis Assoc Disord 2012;26:17–27.
- 39. Jacobs HIL, O’Donnell A, Satizabal CL et al. Associations between brainstem volume and Alzheimer’s disease pathology in middle-aged individuals of the Framingham Heart Study. J Alzheimers Dis 2022;86:1603–9.
- 40. Nguyen H-D, Clément M, Mansencal B et al. Brain structure ages—a new biomarker for multi‐disease classification. Hum Brain Mapp 2024;45:e26558.
- 41. Sivera R, Delingette H, Lorenzi M et al. A model of brain morphological changes related to aging and Alzheimer’s disease from cross-sectional assessments. NeuroImage 2019;198:255–70.
- 42. Zhou TD, Zhang Z, Balachandrasekaran A et al. Prospective longitudinal perfusion in probable Alzheimer’s disease correlated with atrophy in temporal lobe. Aging Dis 2024;15:1855–71.
- 43. Raji CA, Lopez OL, Kuller LH et al. Age, Alzheimer disease, and brain structure. Neurology 2009;73:1899–905.