Abstract
Introduction:
A data-driven index of dementia risk based on MRI, the Alzheimer’s Disease Pattern Similarity (AD-PS) score, was estimated for participants in the Atherosclerosis Risk in Communities (ARIC) study.
Methods:
AD-PS scores were generated for 839 cognitively nonimpaired individuals with a mean follow-up of 4.86 years. The scores and a hypothesis-driven volumetric measure based on several brain regions susceptible to AD were compared as predictors of incident cognitive impairment in different settings.
Results:
Logistic regression analyses suggest that the data-driven AD-PS scores are more predictive of incident cognitive impairment than the volumetric measure. Both biomarkers were more predictive of incident cognitive impairment in participants who were Whites, females and APOE-ε4 carriers. Random Forests analyses including predictors from different domains ranked the AD-PS scores as the most relevant MRI predictor of cognitive impairment.
Conclusions:
Overall, the AD-PS scores were the stronger MRI-derived predictor of incident cognitive impairment in cognitively nonimpaired individuals.
Keywords: Alzheimer’s disease, MRI, AD-PS, machine learning, Random Forests, ARIC
INTRODUCTION
Despite intense research into novel biomarkers of dementia derived from blood and positron emission tomography (PET) imaging[1, 2], biomarkers from structural MRI remain an area of great interest. MRI captures cumulative damage caused by pathological processes over time[3, 4] and offers several advantages: 1) it is less expensive than PET and less invasive than obtaining cerebrospinal fluid; 2) it is readily available in large legacy databases where other Alzheimer’s disease (AD) biomarkers were not collected; 3) it characterizes neurodegeneration within the A/T/N model[5]; and 4) it can define severity and progression of brain disease.
The development of MRI-based biomarkers of dementia risk remains an active area of research[6–10] that continues to produce innovations. Some MRI biomarkers are guided by expert knowledge. Racine and colleagues proposed the personalized AD cortical thickness index[11]. They used a composite measure estimated as the average cortical thickness of nine regions believed to be early targets of AD to predict progression from mild cognitive impairment (MCI) to dementia. Brickman and colleagues proposed a measure of degenerative and cerebrovascular pathology[12], which correlated with Aβ PET imaging and cerebrospinal fluid levels of total tau, phosphorylated tau, and Aβ1–42 and predicted incident cognitive impairment. Wu et al. investigated the value of different MRI measures as risk factors for incident MCI and AD[13] in the Atherosclerosis Risk in Communities (ARIC) cohort, reporting that both brain tissue atrophy and vascular lesions contribute to dementia and cognitive impairment in ARIC. These approaches have in common the use of hypothesis-based composite measures that include several brain regions susceptible to AD. Some composites are volumetric and others are based on the average cortical thickness of the hypothesized brain regions.
Several groups have proposed MRI data-driven biomarkers based on machine learning methods[6, 7, 9]. Very few have been systematically deployed in the context of AD and related dementias. The Spatial Pattern of Abnormality for Recognition of Early Alzheimer’s disease index is one of the better-known examples of a data-driven index of AD risk and has been applied to different problems in AD[7, 14–16]. We introduced the Alzheimer’s Disease Pattern Similarity (AD-PS) scores using high-dimensional machine learning methods[10, 17–19]. This work was extended to the Women’s Health Initiative Memory Study (WHIMS) MRI cohort[20], where AD-PS scores were associated with incident cognitive impairment, age and global cognitive function. Scores were consistent with the relative trajectories of global cognitive function in WHIMS women over 10 years of follow-up[21]. WHIMS AD-PS scores, as a measure of neuroanatomic risk of dementia, have been linked to air pollution[22, 23].
To date, there have been relatively few comparisons between data-driven and hypothesis-driven MRI indices as predictors of incident cognitive impairment, particularly in diverse cohorts. This work pursues several objectives: 1) to extend AD-PS scores to the ARIC cohort and evaluate their associations with incident cognitive impairment in a diverse cohort of cognitively nonimpaired individuals; 2) to evaluate the relative merit of AD-PS scores compared to a hypothesis-driven composite volumetric measure of several brain regions susceptible to AD available in ARIC; and 3) to perform exploratory stratified analyses across sex, race and APOE ε4 carrier status to evaluate the impact of these factors on the AD-PS scores and the composite volumetric measure when predicting incident cognitive impairment.
MATERIALS AND METHODS
Two datasets were used for this study. ARIC is the main target cohort, and Alzheimer’s Disease Neuroimaging Initiative (ADNI) MRI data were used to train the machine learning algorithms that generate AD-PS scores from the MRI data of ARIC participants.
The ADNI was launched in 2003 as a public-private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial MRI, PET, other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment(MCI) and early AD. The ADNI study provides a rich and well characterized cohort of cognitively normal participants and AD patients, which we have used actively in our previous work[24–26]. The ADNI data are described in the supplemental materials.
The ARIC study began in 1987, funded by the National Heart, Lung, and Blood Institute(NHLBI). From 1987 through 1989, 15,792 mostly White and African American participants aged 45–64 years were recruited from four field centers located in Forsyth County, NC; Jackson, MS; Minneapolis suburbs, MN; and Washington County, MD. Using probability sampling, each ARIC field center recruited approximately 4,000 individuals aged 45–64 years from a defined population in their community. Only African Americans were recruited in Jackson, MS; the remaining sites reflected local populations, mostly White in Minneapolis and Washington County and both races in Forsyth County. The Institutional Review Boards from all centers approved ARIC protocols; participants provided written consent for their study participation and for use of their genetic data. To date, there have been seven examinations; relevant to this work are visit 5 (2011 to 2013) and visit 6 (2016 to 2017).
ARIC DATA
ARIC - Cognitive evaluation
The ARIC cognitive assessment used in visit 5 was described previously[27]. Briefly, ARIC obtained cognitive evaluations at visits 5 and 6, with a mean follow-up time of 4.86 years. Three cognitive instruments administered beginning with ARIC visit 2 were also used in visit 5: the Delayed Word Recall Task (DWRT), Digit Symbol Substitution (DSS) from the Wechsler Adult Intelligence Scale-Revised (WAIS-R), and a Word Fluency Test (WFT). Z scores for each test were estimated using means and standard deviations from visit 2. Factor scores were used for global cognition and for three previously derived cognitive domains: executive function, language, and memory[28–30].
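The standardization step described above can be illustrated with a minimal Python sketch. It assumes only that each raw test score is centered and scaled by a reference mean and standard deviation (in the study, those of visit 2). The function name `z_scores` and all numeric values are hypothetical and are not ARIC data.

```python
from statistics import mean, stdev

def z_scores(raw_scores, ref_mean, ref_sd):
    """Standardize raw scores against a fixed reference distribution."""
    if ref_sd <= 0:
        raise ValueError("reference SD must be positive")
    return [(x - ref_mean) / ref_sd for x in raw_scores]

# Hypothetical reference sample, standing in for scores at the reference visit.
reference = [4.0, 6.0, 8.0]
ref_mean, ref_sd = mean(reference), stdev(reference)

# Hypothetical later-visit scores, standardized against the reference.
later = [6.0, 8.0, 2.0]
standardized = z_scores(later, ref_mean, ref_sd)
```

Using a fixed reference visit, rather than re-standardizing at each visit, keeps scores from different visits on one common scale.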
The cognitive status(nonimpaired, MCI, or dementia) of participants who attended visits 5 and 6 was classified using a standardized algorithm based on cognitive assessment and verified by expert committee review, using information from in-person cognitive batteries, the Clinical Dementia Rating scale, and functional questionnaires completed by participants and/or informants. Since the goal of this study is to evaluate early detection of dementia risk using imaging biomarkers, only cognitively nonimpaired individuals (CNI) at visit 5 were included in our analyses. Cognitively nonimpaired was defined as not meeting criteria for MCI or dementia (supplementary materials). MCI type and the etiology of dementia were not adjudicated at visit 6.
ARIC-MRI
Structural brain images were obtained using 3-T MRI scanners (Siemens Verio [Maryland site], Siemens Skyra [North Carolina study center], Siemens Trio [Minnesota site], and Siemens Skyra [Mississippi site]) as previously described[31]. We used cortical volumes of regions of interest, estimated using the FreeSurfer system (Laboratory for Computational Neuroimaging) and available in the ARIC database: frontal, temporal, occipital, parietal, deep gray matter, ventricular, total brain volume (TBV) and a composite of the volumes of brain regions susceptible to AD, including hippocampus, parahippocampal, entorhinal, inferior parietal lobule, precuneus and cuneus. This last measure has been referred to in previous work as the “AD-signature”[13, 32]. Because of growing evidence that these areas could be the target of other brain diseases, such as limbic-predominant age-related TDP-43 encephalopathy[33], we use the term “composite volumetric measure of regions susceptible to AD” or simply “composite volumetric measure (CVM)”. We used white matter hyperintensity (WMH) volumes and volumes of several brain regions (Table S2) as predictors in some of the analyses and the intracranial volume (ICV) to adjust for differences in brain size. MRI scans from 839 ARIC CNI were available.
ARIC PET Data
A subset of participants without dementia, ages 67–88 years, were imaged using 18-florbetapir PET at three sites (Maryland; North Carolina; and Mississippi) during visit 5. The details of 18-florbetapir PET image processing and co-registration with MRI, carried out at the Johns Hopkins University reading center, were described previously[34] and can be found in the supplementary materials. A global cortical measure of florbetapir uptake was used as a weighted average (based on region of interest size) of the orbitofrontal, prefrontal, and superior frontal cortices; the lateral temporal, parietal, and occipital lobes; the precuneus, the anterior cingulate, and the posterior cingulate. This measure is called global cortical standardized uptake value ratio (GC SUVR). We included in our analyses SUVRs from individual regions linked in prior studies to AD: medial temporal, amygdala, hippocampus, anterior cingulate, posterior cingulate, caudate, putamen and thalamus [29]. An automated region for cerebellum gray matter was used as a reference. Here we used data from 193 participants who were adjudicated as CNI in visit 5.
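The size-weighted average behind the global cortical measure can be sketched in a few lines of Python. This is an illustration only: the region names, SUVR values and region sizes are hypothetical, and the helper `weighted_global_suvr` is not part of the ARIC PET processing pipeline.

```python
def weighted_global_suvr(region_suvr, region_size):
    """Average regional SUVRs, weighting each region by its size."""
    total_size = sum(region_size[name] for name in region_suvr)
    if total_size <= 0:
        raise ValueError("total region size must be positive")
    weighted_sum = sum(region_suvr[name] * region_size[name] for name in region_suvr)
    return weighted_sum / total_size

# Hypothetical regional SUVRs and region sizes (arbitrary units).
suvr = {"precuneus": 1.2, "anterior_cingulate": 1.4}
size = {"precuneus": 300.0, "anterior_cingulate": 100.0}

gc_suvr = weighted_global_suvr(suvr, size)  # (1.2*300 + 1.4*100) / 400
```

Larger regions pull the global value toward their own SUVR, so in this toy example the result lies closer to the precuneus value than to the anterior cingulate value.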
Estimation of the AD-PS scores
The general approach is depicted in Figure 1. MRI scans from both ARIC and ADNI (Table S1) were aligned to a common template (derived from ADNI images) using image processing tools available in the Advanced Normalization Tools (see image processing in supplementary materials). Next, we used high-dimensional machine learning methods to estimate the AD-PS scores. Details of the machine learning algorithms were published previously[10, 19, 20, 22]. Briefly, a regularized logistic regression (RLR) classifier was estimated in a voxel-wise manner using the grey matter probability maps (resulting from the image processing described above) from cognitively normal (CN) and AD participants available in the training dataset (ADNI in our case). The weights estimated by solving the optimization problem associated with the RLR classifier are used to estimate conditional probabilities of AD given an MRI scan. To estimate the optimal values of the regularization parameters, we combined nested 10-fold cross-validation and grid search. These probabilities, which we refer to as AD-PS scores, were computed as the mean of 5 repetitions of the computation to account for variability due to the random partitioning in cross-validation during model estimation. The scores are computed for the target dataset, in this case ARIC.
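The scoring step can be sketched as follows, assuming the classifier weights have already been estimated. The sketch covers only how a fitted logistic model turns a vector of voxel values into a probability and how probabilities from repeated fits are averaged; it omits the regularization, grid search and nested cross-validation. The weights, intercepts and three-voxel "image" are hypothetical, and both function names are illustrative.

```python
import math

def logistic_probability(weights, intercept, voxels):
    """Probability from a fitted logistic model: sigmoid(w . x + b)."""
    if len(weights) != len(voxels):
        raise ValueError("weights and voxels must have the same length")
    linear = intercept + sum(w * v for w, v in zip(weights, voxels))
    # Numerically stable sigmoid for large positive or negative inputs.
    if linear >= 0:
        return 1.0 / (1.0 + math.exp(-linear))
    e = math.exp(linear)
    return e / (1.0 + e)

def pattern_similarity_score(fitted_models, voxels):
    """Mean probability across repeated model fits (5 in the study)."""
    probs = [logistic_probability(w, b, voxels) for w, b in fitted_models]
    return sum(probs) / len(probs)

# Hypothetical (weights, intercept) pairs from two repetitions.
models = [([0.5, -0.5, 0.0], 0.0), ([1.0, -1.0, 0.0], 0.0)]
gm_map = [0.4, 0.4, 0.9]  # hypothetical grey matter probabilities

score = pattern_similarity_score(models, gm_map)
```

Because each score is an average of probabilities, it always lies between 0 and 1, with higher values indicating greater similarity to the pattern learned from AD participants in the training data.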
Figure 1 –

General approach to estimate Alzheimer’s disease pattern similarity (AD-PS) scores for ARIC MRI images.
Analyses
We performed logistic regression analyses considering incident cognitive impairment (either MCI or dementia) as the outcome and AD-PS scores or the CVM described above as the main independent variables, fitted in separate models. Other variables included age, race-center, sex, education, and intracranial volume, where race-center combined information from both race and study center. We estimated the area under the curve (AUC) based on 10-fold cross-validation. Similar analyses were performed stratifying by sex, race and APOE ε4 carrier status. The DeLong method[35] was used to evaluate the significance of the increase in performance resulting from adding each biomarker (AD-PS or CVM) to the basic model based only on covariates. We performed Random Forests (RF)[36] classification analyses (see supplementary materials for RF details) to investigate the relative importance of both MRI biomarkers when predicting incident cognitive impairment, including AD-PS scores, the CVM, several MRI variables derived using FreeSurfer (described above), cognitive, clinical and demographic variables, and APOE ε4 data (Table S2). The derived MRI variables were scaled by dividing them by the corresponding intracranial volume (ICV).
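The AUC used as the performance measure has a simple rank-based interpretation: it is the probability that a randomly chosen participant who developed impairment receives a higher predicted probability than a randomly chosen participant who did not. The Python sketch below computes it directly from that definition. It is an illustration only, and it implements neither the cross-validated model fitting nor the DeLong test. The labels and predicted probabilities are hypothetical.

```python
def auc(scores, labels):
    """Rank-based AUC: P(score of a case > score of a non-case), ties count 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))

# Hypothetical out-of-fold predicted probabilities for four participants.
labels = [0, 0, 1, 1]
covariates_only = [0.10, 0.40, 0.35, 0.80]
with_biomarker = [0.10, 0.30, 0.35, 0.80]

auc_base = auc(covariates_only, labels)
auc_full = auc(with_biomarker, labels)
```

In a cross-validated setting, the scores passed to `auc` would be predictions for participants who were held out when the model was fitted.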
Similar logistic regression and RF analyses were performed using a subset (N=193) of CNI with MRI and amyloid PET data available. This allowed us to compare the performance of the anatomical measures derived from MRI with measures derived from 18-florbetapir PET. RF models were constructed using the same predictors as described above, adding several Aβ amyloid PET measures from brain areas previously linked to AD (see ARIC PET section above).
All RF analyses were performed using methods for imbalanced classification available in the R package randomForestSRC[37, 38]. We selected ntree = 3000 and AUC as the splitting rule, both of which are recommended for imbalanced learning. Other parameters were set to their default values. The permutation index was used to determine variable importance. The pROC R package was used to estimate AUCs and confidence intervals as a measure of performance. For variable selection we used a strategy proposed by Strobl for the RF permutation index[39]: variables with a negative permutation index are discarded as noise, as are variables whose positive index is smaller than the absolute value of the most negative index[40, 41].
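The variable selection rule can be written as a short Python sketch. The importance values below are hypothetical and do not reproduce the study's results, and `select_by_negative_threshold` is an illustrative helper rather than a function of randomForestSRC. One added assumption is that variables with an importance of exactly zero are also discarded.

```python
def select_by_negative_threshold(importance):
    """Keep variables whose importance reaches |most negative importance|."""
    negatives = [v for v in importance.values() if v < 0]
    # With no negative values there is no noise estimate, so the threshold is 0.
    threshold = abs(min(negatives)) if negatives else 0.0
    # Assumption: zero-importance variables are dropped along with negative ones.
    return {name: v for name, v in importance.items() if v > 0 and v >= threshold}

# Hypothetical permutation importance values.
importance = {
    "AD-PS": 0.030,
    "age": 0.012,
    "CVM": 0.008,
    "noise_a": 0.003,
    "noise_b": -0.005,
}

selected = select_by_negative_threshold(importance)
```

The rationale is that a negative importance can only arise by chance, so the magnitude of the most negative value gives a rough estimate of how large a chance fluctuation can be in either direction.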
RESULTS
Age, race, education, APOE ε4 carrier status, CVM and AD-PS scores at visit 5 for 839 CNI with MRI are presented in Table 1, distributed by cognitive status as adjudicated at visit 6. The average AD-PS score of CNI estimated at visit 5 increases with the severity of the future cognitive impairment classification adjudicated at visit 6.
Table 1 –
Demographic characteristics, APOE ε4 carrier status and MRI measures at Visit 5 for CNI participants with MRI (N=839) and for those with MRI and amyloid PET (N=193) by cognitive status at Visit 6.
| Characteristic | Total | Normal at Visit 6 | MCI at Visit 6 | Dementia at Visit 6 |
|---|---|---|---|---|
| N | 839 | 691 | 122 | 26 |
| Age mean(SD) | 75.3 (5.0) | 74.9 (5.0) | 76.5 (4.7) | 79.4 (4.0) |
| Gender | ||||
| Female | 555 (66.2%) | 435 (63.1%) | 78 (63.9%) | 21 (80.8%) |
| Male | 316 (34.8%) | 256 (36.9%) | 44 (35.1%) | 5 (19.2%) |
| Race | ||||
| Black | 272 (32.4%) | 227 (32.9%) | 31 (25.4%) | 14 (53.9%) |
| White | 567(67.6%) | 464 (67.1%) | 91 (74.6%) | 12 (46.1%) |
| Education | ||||
| Basic | 98 (11.7%) | 72 (10.4%) | 20 (16.4%) | 6 (23.1%) |
| Intermediate | 306 (36.5%) | 264 (38.2%) | 34 (27.9%) | 8 (30.8%) |
| Advanced | 435 (51.8%) | 355 (51.4%) | 68 (55.7%) | 12 (46.1%) |
| APOE ε4 carrier status | ||||
| ε4 carrier (%) | 224 (27.7%) | 171 (24.8%) | 45 (36.9%) | 8 (30.8%) |
| AD-PS scores mean(SD) | 0.20 (20.3) | 0.18 (0.18) | 0.30 (0.25) | 0.47 (0.33) |
| Composite volume mean(SD) | 59893.1 (6674.0) | 60270.5 (6654.3) | 58943.2 (6349.4) | 54321 (5935.1) |
| MRI and amyloid PET Subset | Total | Normal at Visit 6 | MCI at Visit 6 | Dementia at Visit 6 |
| N | 193 | 174 | 16 | 3 |
| Age mean(SD) | 75.0 (5.2) | 74.8 (5.3) | 76.6 (4.8) | 79.3 (4.5) |
| Gender | ||||
| Female | 121 (62.7%) | 109 (62.6%) | 10 (62.5%) | 2 (66.7%) |
| Male | 72 (37.3%) | 65 (37.4%) | 6 (37.5%) | 1 (33.3%) |
| Race | ||||
| Black | 78 (40.4%) | 69 (39.7%) | 6 (37.5%) | 3 (100%) |
| White | 115 (59.6%) | 105 (63.3%) | 10 (62.5%) | 0 (0%) |
| Education* | ||||
| Basic | 23 (11.9%) | 18 (10.3%) | 4 (25%) | 1 (33.4%) |
| Intermediate | 83 (43.0%) | 75 (43.1%) | 8 (50%) | 0 (0.0%) |
| Advanced | 87 (45.1%) | 81 (46.6%) | 4 (25%) | 2 (66.6%) |
| APOE ε4 carrier status | ||||
| Yes (%)* | 50 (25.9%) | 41 (23.6%) | 8 (50%) | 1 (33.3%) |
| AD-PS scores mean(SD) | 0.19 (0.19) | 0.18 (0.18) | 0.37 (0.20) | 0.39 (0.42) |
| Composite volume mean(SD) | 59588 (6448) | 59923 (6414) | 57737 (5842) | 50040 (1546) |
| Global cortical SUVR mean(SD) | 1.23 (0.20) | 1.21 (0.17) | 1.44 (0.31) | 1.44 (0.35) |
Education levels – Basic (<12 years), Intermediate (completed high school or GED*), Advanced (some college)
GED - General Educational Development Test
Table 2 presents results for the incident impairment analyses based on logistic regression using the full sample (N = 839) and stratifications by race, sex and APOE ε4 carrier status. Overall, the AUC for incident cognitive impairment was 0.692 for AD-PS and 0.654 for the CVM. Consistently, AD-PS scores were significantly more predictive of incident cognitive impairment than the CVM (see also Table S3). Both MRI-based biomarkers were more predictive of incident cognitive impairment in participants who were Whites, females and APOE ε4 carriers. The Spearman correlation between the two measures was rho=−0.13 (p<0.001).
Table 2 –
The predictive value of the two MRI measures in CNI participants at visit 5 with MRI and for those with MRI and amyloid PET.
| Group | Effect | N | Cases (*Dem.) | AUC | 95% CI | **p-value |
|---|---|---|---|---|---|---|
| Full Sample | AD-PS | 839 | 148 (26) | 0.692 | [0.687–0.697] | 0.0004 |
| | CVM | | | 0.654 | [0.650–0.658] | 0.036 |
| Whites | AD-PS | 567 | 103 (12) | 0.688 | [0.681–0.695] | 0.006 |
| | CVM | | | 0.672 | [0.664–0.680] | 0.10 |
| Blacks | AD-PS | 272 | 45 (14) | 0.645 | [0.625–0.665] | 0.10 |
| | CVM | | | 0.596 | [0.576–0.616] | 0.08 |
| Males | AD-PS | 305 | 49 (5) | 0.661 | [0.643–0.679] | 0.08 |
| | CVM | | | 0.615 | [0.597–0.633] | 0.17 |
| Females | AD-PS | 534 | 99 (25) | 0.685 | [0.677–0.693] | 0.003 |
| | CVM | | | 0.645 | [0.637–0.653] | 0.21 |
| APOE ε4 carriers | AD-PS | 224 | 53 (8) | 0.744 | [0.728–0.760] | 0.06 |
| | CVM | | | 0.692 | [0.674–0.710] | 0.37 |
| APOE ε4 non-carriers | AD-PS | 615 | 95 (18) | 0.668 | [0.659–0.677] | 0.002 |
| | CVM | | | 0.617 | [0.609–0.625] | 0.07 |
| MRI and amyloid PET Subset | | | | | | |
| Full Sample | AD-PS | 193 | 19 (3) | 0.737 | [0.714–0.760] | 0.024 |
| | CVM | | | 0.633 | [0.605–0.661] | 0.19 |
| | GC SUVR | | | 0.672 | [0.633–0.711] | 0.18 |
*Dem. – Number of dementia cases
**p-value – DeLong test comparing AUCs produced by AD-PS or CVM plus covariates with respect to the basic model containing only covariates.
For participants with both MRI and PET (N=193), we present the demographic information at visit 5 (Table 1) and logistic regression results for the full sample (Table 2). The model including AD-PS scores generated the largest AUC when predicting incident cognitive impairment. For completeness, the results across different stratifications of the dataset are presented in the supplementary materials (Table S4), although the sample size in several cases was small.
Figure 2 shows the rank of predictors based on the RF permutation index resulting from the incident cognitive impairment analysis with 839 CNI at visit 5. The horizontal red line defines the threshold for variable selection according to the Strobl criterion[39]. AD-PS scores were the most relevant predictor of incident cognitive impairment, followed by global cognition and memory domain scores, age, CVM, WFT z-score and ventricular volume. Performance of the classifier was AUC = 0.735 [0.692–0.780] CI (95%). The results of the incident impairment analysis in the PET sample (N = 193) are shown in Figure 3. Results were driven mostly by 5 predictors: three from florbetapir PET (anterior cingulate, posterior cingulate and global cortical SUVRs) followed by two derived from MRI (AD-PS scores and TBV). The classifier performance was AUC = 0.825 [0.726–0.934] CI (95%).
Several complementary analyses were performed. Linear regression models were fitted to evaluate associations of both MRI biomarkers with memory and executive function scores. The AD-PS scores were in all cases significantly associated with these cognitive measures cross-sectionally and longitudinally, while the CVM was not associated with memory function cross-sectionally or with executive function longitudinally (see Table S5). Logistic regression analyses (amyloid+ versus amyloid−) adjusted for age, sex, education, race and ICV were fitted to investigate associations of both MRI measures with amyloid PET in CNI, but results were not significant in either case (not presented).
Figure 2 –

Rank of predictors based on the RF permutation index resulting from the incident cognitive impairment analysis with 839 cognitively nonimpaired participants at visit 5. The horizontal red line defines the threshold for variable selection. The AD-PS score was the most relevant predictor. Performance of the classifier was AUC = 0.735 [0.692–0.780] CI (95%).
Figure 3 –

Rank of predictors based on the RF permutation index resulting from the incident cognitive impairment analysis of 193 cognitively nonimpaired ARIC PET study participants at visit 5, adding information derived from florbetapir PET to the MRI, cognitive, clinical and demographic data. The horizontal red line defines the threshold for variable selection. The AD-PS scores and total brain volume ranked as the two most relevant MRI measures after anterior cingulate, posterior cingulate and global cortical SUVRs, which were the three most relevant predictors in this analysis. Performance of the classifier was AUC = 0.834 [0.735–0.932] CI (95%).
DISCUSSION
The AD-PS score, a data-driven MRI biomarker of dementia risk, was estimated for ARIC participants. ARIC is a diverse cohort containing one of the largest MRI databases ever collected among African Americans. Using data from CNI, AD-PS scores were strong predictors of incident cognitive impairment over 4.86 years of follow up.
An important goal was to understand the relative merit of our data-driven score with respect to a hypothesis-driven CVM based on regions susceptible to AD. Areas contributing to the estimation of the AD-PS scores are selected by the algorithm from the gray matter tissue in a voxel-wise manner (see Figure S1 in supplementary materials). We investigated the relative value of these two MRI-based biomarkers for prediction of incident cognitive impairment using: 1) parsimonious logistic regression models and 2) high-dimensional RF models that included other MRI, cognitive, demographic, genetic and clinical measures. Logistic regression analyses based on individuals with MRI (N=839) showed that models including the AD-PS scores were in most cases significantly more predictive of incident cognitive impairment than the CVM. In analyses stratified by race, sex and APOE ε4, we observed that both MRI metrics were more predictive of incident cognitive impairment in Whites, females and APOE ε4 carriers. Complementary analyses were performed to further investigate relationships between AD-PS and CVM (see Table S3). When both biomarkers were included in the same model, the CVM associations with incident cognitive impairment became non-significant in all cases.
To investigate the relative performance of both MRI biomarkers in the presence of multiple variables from different domains we used RF, a state-of-the-art machine learning method well known to the predictive modelling community. The permutation index is perhaps the most popular variable importance measure it provides. It evaluates the decrease in model performance when a given variable is randomly permuted. If the variable is important, model performance will decrease; if it is not, performance will remain essentially unchanged. The RF analyses performed using MRI, cognitive, clinical, demographic and APOE ε4 carrier status predictors showed the AD-PS score to be a more relevant predictor of incident cognitive impairment than the CVM. Overall, AD-PS scores were the most relevant predictor in the model, followed by global cognition and memory domain cognitive scores, age and ventricular volume. Complementary RF analyses dropping one of the two measures (AD-PS or CVM) at a time from the full model showed that only the removal of AD-PS scores led to a significant difference in model performance (AUC = 0.717 [0.672–0.763] CI (95%); p<0.05)[35].
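The idea behind the permutation index can be illustrated with a generic Python sketch. It is not the randomForestSRC implementation: it uses a hypothetical toy classifier instead of a random forest, measures performance by accuracy rather than AUC, and permutes values in the full toy sample rather than in out-of-bag data. All names and values are assumptions made for illustration.

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)

def permutation_importance(predict, rows, labels, column, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling one column across rows."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    drops = []
    for _ in range(n_repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        permuted = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
        drops.append(baseline - accuracy(predict, permuted, labels))
    return sum(drops) / n_repeats

def toy_predict(row):
    """Hypothetical classifier that depends only on column 0."""
    return 1 if row[0] > 0.5 else 0

rows = [[0.1, 5.0], [0.9, 1.0], [0.2, 3.0], [0.8, 4.0], [0.3, 2.0], [0.7, 6.0]]
labels = [0, 1, 0, 1, 0, 1]

imp_used = permutation_importance(toy_predict, rows, labels, column=0)
imp_unused = permutation_importance(toy_predict, rows, labels, column=1)
```

Shuffling the column the classifier relies on breaks its link with the outcome and lowers accuracy, whereas shuffling the ignored column leaves every prediction unchanged, giving an importance of exactly zero.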
To investigate the performance of the AD-PS scores and the CVM with respect to amyloid PET SUVRs, we took advantage of data collected in the ARIC PET study[34]. Similar analyses were repeated with CNI (N=193). The results showed the anterior and posterior cingulate and global cortical SUVRs to be the most relevant predictors of incident cognitive impairment, followed by AD-PS scores. Intriguingly, the other MRI predictor that survived the threshold was TBV, a traditional biomarker of brain tissue atrophy. This is possibly due to differences between the datasets (e.g. sample sizes, % of cases, demographic differences, etc.) and requires further confirmation.
The anterior and posterior cingulate cortex are part of the default mode network[42], which has been linked to early changes related to AD[43]. Development of Aβ amyloid plaques in the anterior and posterior cingulate in cognitively normal individuals has been previously reported[44, 45]. Nadkarni and colleagues used the A/T/N/V framework[5] to evaluate associations of Aβ amyloid burden, white matter hyperintensities and FDG-PET with incident MCI in cognitively normal individuals[46]. They found that baseline Aβ amyloid positivity, alone or combined with hypometabolism positivity, was associated with incident amnestic MCI. Similarly, Burnham and colleagues, using data from 573 participants in the Australian Imaging, Biomarker and Lifestyle (AIBL) study (mean age=73 yrs), evaluated their clinical progression over 6 years of follow-up based on Aβ amyloid PET and hippocampal volume[47]. They reported that Aβ amyloid burden was a risk factor for cognitive decline and progression from preclinical to symptomatic stages of the disease, with neurodegeneration acting as a compounding factor. Similar to other groups, we have used SUVRs of several brain ROIs as predictors[48, 49]. However, a more common practice is to use only a composite of different ROIs.
The increasing role of the less expensive plasma biomarkers[2] raises questions about the role of MRI biomarkers in the future. According to the ADRD research framework[5], Aβ, tau and MRI biomarkers reside in different domains (A, T and N respectively). MRI provides valuable information related to neurodegeneration and cerebrovascular disease. In clinical practice, MRI biomarkers like the AD-PS scores can provide complementary information about abnormalities in the brain. In addition, prediction of future events is difficult in any field, and prediction of future cognitive impairment is no exception, given the complexity of the processes that lead to it. Accurate prediction of incident cognitive impairment will require complex models including predictors from different sources. In the foreseeable future, MRI will be an important component of the diagnostic process[16] and MRI-derived biomarkers will contribute valuable information for mathematical models designed to predict incident cognitive impairment.
Our study is not without limitations. The ADNI cohort is highly selected and not representative of the general population. The majority of participants in the training dataset are White (>90%), and this must be considered when interpreting results across race. Ideally, the training dataset would be more representative of the ARIC cohort. However, this is the second database to which AD-PS scores have been extended via machine learning inference. In both cases, with very different populations, the scores are predictive of incident cognitive impairment. We believe that the AD-PS scores will benefit from a much larger training dataset. Our comparison of the AD-PS scores with respect to the composite measure was restricted to the volumetric version available in ARIC. However, other hypothesis-driven volumetric composites, or composites based on the average cortical thickness of several regions instead of volumes[12], could be designed, and these could be more sensitive to future cognitive decline. Our dataset did not contain AD blood-based biomarkers such as p-tau 181[2], so we were unable to evaluate their relative performance with respect to AD-PS scores. The number of participants with florbetapir PET who developed cognitive impairment was relatively small, which could impact our analyses. Most African American participants were from one site. Simulation studies have reported permutation index biases in some situations[50]. However, we do not expect this to affect the comparison between AD-PS scores and the CVM or the models’ performance.
Finally, despite being based on a training dataset (ADNI) with an excellent clinical characterization of AD, AD-PS scores are most likely capturing mixed pathology, either because AD often co-exists or overlaps with other brain diseases or because the classification was not verified by biomarkers or confirmed at autopsy. Some researchers have suggested that mixed pathology biomarkers have the potential to be better predictors of future clinical outcomes than a biomarker of a specific pathology[12, 51].
CONCLUSIONS
We estimated a data-driven MRI index of dementia risk, which we call AD-PS scores, in the ARIC study. Scores were predictive of incident cognitive impairment adjudicated ~4.86 years after the MRIs were collected. Overall, the data-driven AD-PS score outperformed the CVM. RF analyses using variables derived from MRI, amyloid PET, cognitive testing, genetic and demographic data showed the AD-PS scores to be the most relevant predictor of incident cognitive impairment among the MRI variables, ranking after several amyloid PET measures. Our work supports the potential of data-driven biomarkers of dementia. Future work will investigate early signs of dementia risk according to AD-PS scores and define cut-off values for clinical diagnosis. More importantly, we will refine our machine learning methodology by investigating the use of very large training datasets, the inclusion of imaging information from large numbers of MCI individuals in model inference, other outcomes for training the machine learning algorithms, the use of Deep Learning and Manifold Learning methods, applications to other available imaging databases, and extensions to or combinations with other imaging modalities.
Supplementary Material
ACKNOWLEDGEMENTS
The Atherosclerosis Risk in Communities Study is carried out as a collaborative study supported by National Heart, Lung, and Blood Institute contracts (HHSN268201700001I, HHSN268201700002I, HHSN268201700003I, HHSN268201700005I, HHSN268201700004I). Neurocognitive data is collected by U01 2U01HL096812, 2U01HL096814, 2U01HL096899, 2U01HL096902, 2U01HL096917 from the NIH (NHLBI, NINDS, NIA and NIDCD), and with previous brain MRI examinations funded by R01-HL70825 from the NHLBI. The authors thank the staff and participants of the ARIC study for their important contributions. RC, KH and TH receive funding from the Wake Forest Alzheimer’s Disease Core Center (P30AG049638-01A1).
Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie; Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.
CONFLICT OF INTEREST
RC received a contract from Atrium Health, paid to the institution, to do machine learning using emergency room data. FC was a consultant for San Diego State University and was a member of a DSMB at Rutgers University. KH was a consultant with Fred Hutchinson Cancer Research Center and UNC at Chapel Hill. She received support to attend meetings from the Alzheimer’s Association and the NIH. KH also participated in a DSMB at WFSM with no compensation and is an editor of several journals of the Alzheimer’s Association with no compensation. MG received honoraria for a lecture at Johns Hopkins Biostatistics. RG received honoraria for lectures at the University of Alabama at Birmingham, the University of Michigan, and the American College of Cardiology. She is secretary of the American Neurological Association with no compensation. CW received payment as a forensic panel consultant. He received payments from the Biogen MRI Protocol Advisory Board and the Biogen US Evolving the Care Team: Treat & Monitor Advisory Board. RB, AA, RT and TH have nothing to declare.
References
- 1. La Joie R, et al., Prospective longitudinal atrophy in Alzheimer’s disease correlates with the intensity and topography of baseline tau-PET. Sci Transl Med, 2020. 12(524).
- 2. Karikari TK, et al., Blood phosphorylated tau 181 as a biomarker for Alzheimer’s disease: a diagnostic performance and prediction modelling study using data from four prospective cohorts. Lancet Neurol, 2020. 19(5): p. 422–433.
- 3. Jack CR Jr., Brain Atrophy on Magnetic Resonance Imaging as a Biomarker of Neurodegeneration. JAMA Neurol, 2016. 73(10): p. 1179–1182.
- 4. Vemuri P and Jack CR Jr., Role of structural MRI in Alzheimer’s disease. Alzheimers Res Ther, 2010. 2(4): p. 23.
- 5. Jack CR Jr., et al., NIA-AA Research Framework: Toward a biological definition of Alzheimer’s disease. Alzheimers Dement, 2018. 14(4): p. 535–562.
- 6. Vemuri P, et al., Antemortem MRI based STructural Abnormality iNDex (STAND)-scores correlate with postmortem Braak neurofibrillary tangle stage. Neuroimage, 2008. 42(2): p. 559–67.
- 7. Davatzikos C, et al., Longitudinal progression of Alzheimer’s-like patterns of atrophy in normal older adults: the SPARE-AD index. Brain, 2009. 132(Pt 8): p. 2026–35.
- 8. Kloppel S, et al., Accuracy of dementia diagnosis: a direct comparison between radiologists and a computerized method. Brain, 2008. 131(Pt 11): p. 2969–74.
- 9. Kloppel S, et al., Automatic classification of MR scans in Alzheimer’s disease. Brain, 2008. 131(Pt 3): p. 681–9.
- 10. Casanova R, et al., Alzheimer’s disease risk assessment using large-scale machine learning methods. PLoS One, 2013. 8(11): p. e77949.
- 11. Racine AM, et al., The personalized Alzheimer’s disease cortical thickness index predicts likely pathology and clinical progression in mild cognitive impairment. Alzheimers Dement (Amst), 2018. 10: p. 301–310.
- 12. Brickman AM, et al., An MRI measure of degenerative and cerebrovascular pathology in Alzheimer disease. Neurology, 2018. 91(15): p. e1402–e1412.
- 13. Wu A, et al., Association of Brain Magnetic Resonance Imaging Signs With Cognitive Outcomes in Persons With Nonimpaired Cognition and Mild Cognitive Impairment. JAMA Netw Open, 2019. 2(5): p. e193359.
- 14. Habes M, et al., White matter hyperintensities and imaging patterns of brain ageing in the general population. Brain, 2016. 139(Pt 4): p. 1164–79.
- 15. Janowitz D, et al., Inflammatory markers and imaging patterns of advanced brain aging in the general population. Brain Imaging Behav, 2020. 14(4): p. 1108–1117.
- 16. Habes M, et al., The Brain Chart of Aging: Machine-learning analytics reveals links between brain aging, white matter disease, amyloid burden, and cognition in the iSTAGING consortium of 10,216 harmonized MR scans. Alzheimers Dement, 2021. 17(1): p. 89–102.
- 17. Casanova R, et al., High dimensional classification of structural MRI Alzheimer’s disease data based on large scale regularization. Frontiers of Neuroscience in Neuroinformatics, 2011. 5:22. Epub 2011 Oct 14.
- 18. Casanova R, Maldjian JA, and Espeland MA, Evaluating the impact of different factors on voxel-wise classification methods of ADNI structural MRI brain images. International Journal of Biomedical Datamining, 2011. 1: p. 11.
- 19. Casanova R, et al., Classification of structural MRI images in Alzheimer’s disease from the perspective of ill-posed problems. PLoS One, 2012. 7(10): p. e44877.
- 20. Casanova R, et al., Using high-dimensional machine learning methods to estimate an anatomical risk factor for Alzheimer’s disease across imaging databases. Neuroimage, 2018. 183: p. 401–411.
- 21. Espeland MA, et al., Sex-related differences in the prevalence of cognitive impairment among overweight and obese adults with type 2 diabetes. Alzheimers Dement, 2018. 14(9): p. 1184–1192.
- 22. Younan D, et al., Particulate matter and episodic memory decline mediated by early neuroanatomic biomarkers of Alzheimer’s disease. Brain, 2020. 143(1): p. 289–302.
- 23. Younan D, et al., PM2.5 associated with gray matter atrophy reflecting increased Alzheimers risk in older women. Neurology, 2020.
- 24. Casanova R, et al., High dimensional classification of structural MRI Alzheimer’s disease data based on large scale regularization. Frontiers of Neuroscience in Neuroinformatics, 2011. 5:22. Epub 2011 Oct 14. PMCID: PMC3193072.
- 25. Casanova R, et al., Classification of structural MRI images in Alzheimer’s disease from the perspective of ill-posed problems. PLoS One, 2012. 7(10): p. e44877. PMCID: PMC3468621.
- 26. Casanova R, et al., Alzheimer’s disease risk assessment using large-scale machine learning methods. PLoS One, 2013. 8(11): p. e77949. PMCID: PMC3826736.
- 27. Knopman DS, et al., Mild Cognitive Impairment and Dementia Prevalence: The Atherosclerosis Risk in Communities Neurocognitive Study (ARIC-NCS). Alzheimers Dement (Amst), 2016. 2: p. 1–11.
- 28. Gross AL, et al., Application of Latent Variable Methods to the Study of Cognitive Decline When Tests Change over Time. Epidemiology, 2015. 26(6): p. 878–87.
- 29. Park LQ, et al., Confirmatory factor analysis of the ADNI Neuropsychological Battery. Brain Imaging Behav, 2012. 6(4): p. 528–39.
- 30. Hayden KM, et al., Factor structure of the National Alzheimer’s Coordinating Centers uniform dataset neuropsychological battery: an evaluation of invariance between and within groups over time. Alzheimer Dis Assoc Disord, 2011. 25(2): p. 128–37.
- 31. Schneider ALC, et al., Diabetes, Prediabetes, and Brain Volumes and Subclinical Cerebrovascular Disease on MRI: The Atherosclerosis Risk in Communities Neurocognitive Study (ARIC-NCS). Diabetes Care, 2017. 40(11): p. 1514–1521.
- 32. Power MC, et al., The Association of Long-Term Exposure to Particulate Matter Air Pollution with Brain MRI Findings: The ARIC Study. Environ Health Perspect, 2018. 126(2): p. 027009.
- 33. Nelson PT, et al., Limbic-predominant age-related TDP-43 encephalopathy (LATE): consensus working group report. Brain, 2019. 142(6): p. 1503–1527.
- 34. Gottesman RF, et al., The ARIC-PET amyloid imaging study: Brain amyloid differences by age, race, sex, and APOE. Neurology, 2016. 87(5): p. 473–80.
- 35. DeLong ER, DeLong DM, and Clarke-Pearson DL, Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics, 1988. 44(3): p. 837–45.
- 36. Breiman L, Random Forests. Machine Learning, 2001. 45: p. 5–32.
- 37. Ishwaran H and Kogalur U, Random forests for survival, regression and classification (RF-SRC). R package version 1.6, 2014. http://CRAN.R-project.org/package=randomForestSRC
- 38. O’Brien R and Ishwaran H, A random forests quantile classifier for class imbalanced data. Pattern Recognition, 2017. 90: p. 232–249.
- 39. Strobl C, Malley J, and Tutz G, An introduction to recursive partitioning: rationale, application, and characteristics of classification and regression trees, bagging, and random forests. Psychol Methods, 2009. 14(4): p. 323–48.
- 40. Casanova R, et al., Investigating Predictors of Cognitive Decline Using Machine Learning. J Gerontol B Psychol Sci Soc Sci, 2020. 75(4): p. 733–742.
- 41. Kaup AR, et al., Cognitive resilience to apolipoprotein E epsilon4: contributing factors in black and white older adults. JAMA Neurol, 2015. 72(3): p. 340–8.
- 42. Raichle ME, et al., A default mode of brain function. Proc Natl Acad Sci U S A, 2001. 98(2): p. 676–82.
- 43. Greicius MD, et al., Default-mode network activity distinguishes Alzheimer’s disease from healthy aging: evidence from functional MRI. Proc Natl Acad Sci U S A, 2004. 101(13): p. 4637–42.
- 44. Chetelat G, et al., Amyloid imaging in cognitively normal individuals, at-risk populations and preclinical Alzheimer’s disease. Neuroimage Clin, 2013. 2: p. 356–65.
- 45. Aizenstein HJ, et al., Frequent amyloid deposition without significant cognitive impairment among the elderly. Arch Neurol, 2008. 65(11): p. 1509–17.
- 46. Nadkarni NK, et al., Association Between Amyloid-beta, Small-vessel Disease, and Neurodegeneration Biomarker Positivity, and Progression to Mild Cognitive Impairment in Cognitively Normal Individuals. J Gerontol A Biol Sci Med Sci, 2019. 74(11): p. 1753–1760.
- 47. Burnham SC, et al., Clinical and cognitive trajectories in cognitively healthy elderly individuals with suspected non-Alzheimer’s disease pathophysiology (SNAP) or Alzheimer’s disease pathology: a longitudinal study. Lancet Neurol, 2016. 15(10): p. 1044–53.
- 48. Guo T, et al., Association of CSF Abeta, amyloid PET and cognition in cognitively unimpaired elderly adults. Neurology, 2020.
- 49. Lockhart SN, et al., Amyloid and tau PET demonstrate region-specific associations in normal older people. Neuroimage, 2017. 150: p. 191–199.
- 50. Strobl C, et al., Bias in random forest variable importance measures: illustrations, sources and a solution. BMC Bioinformatics, 2007. 8: p. 25.
- 51. Bennett DA, An MRI biomarker of mixed pathology. Neurology, 2018. 91(15): p. 682–683.