Journal of Clinical Neurology (Seoul, Korea). 2024 Aug 2;20(5):478–486. doi: 10.3988/jcn.2023.0289

Predicting the Progression of Mild Cognitive Impairment to Alzheimer’s Dementia Using Recurrent Neural Networks With a Series of Neuropsychological Tests

Chaeyoon Park a, Gihun Joo b, Minji Roh b, Seunghun Shin b, Sujin Yum b,c, Na Young Yeo b,c, Sang Won Park c,d, Jae-Won Jang b,c,d,e, Hyeonseung Im a,b,e,f; for the Alzheimer’s Disease Neuroimaging Initiative
PMCID: PMC11372213  PMID: 39227330

Abstract

Background and Purpose

The prevalence of Alzheimer’s dementia (AD) is increasing as populations age, causing immense suffering for patients, families, and communities. Unfortunately, no treatments for this neurodegenerative disease have been established. Predicting AD is therefore becoming more important, because early diagnosis is the best way to prevent its onset and delay its progression.

Methods

Mild cognitive impairment (MCI) is the stage between normal cognition and AD, with large variations in its progression. The disease can be effectively managed by accurately predicting the probability of MCI progressing to AD over several years. In this study we used the Alzheimer’s Disease Neuroimaging Initiative dataset to predict the progression of MCI to AD over a 3-year period from baseline. We developed and compared various recurrent neural network (RNN) models to determine the predictive effectiveness of four neuropsychological (NP) tests and magnetic resonance imaging (MRI) data at baseline.

Results

The experimental results confirmed that the Preclinical Alzheimer’s Cognitive Composite score was the most effective of the four NP tests, and that the prediction performance of the NP tests improved over time. Moreover, the gated recurrent unit model exhibited the best performance among the prediction models, with an average area under the receiver operating characteristic curve of 0.916.

Conclusions

Timely prediction of progression from MCI to AD can be achieved using a series of NP test results and an RNN, both with and without using the baseline MRI data.

Keywords: Alzheimer’s disease, mild cognitive impairment, dementia, neuropsychological tests, neural network models


INTRODUCTION

The incidence of neurodegenerative diseases among the aging population has increased recently. Alzheimer’s dementia (AD) is a neurodegenerative disease that initially impairs neuropsychological (NP) function, is accompanied by mental disorders, and causes neurological and physical complications and eventually death. AD symptoms tend to progress for more than 8 years,1,2 during which the condition causes enormous stress to patients, their caregivers, and the wider community.3 However, a clear and fundamental treatment for AD has not yet been developed.4,5 Predicting AD is therefore becoming more important, because early diagnosis is the best way to prevent its onset and delay its progression.6

AD progressively worsens, and can be classified into several stages according to the symptoms.1,6 Mild cognitive impairment (MCI) corresponds to the earliest time when cognitive defects are clinically observed, and carries a high risk of deteriorating to AD.6 Although several studies have focused on predicting the transition from MCI to AD, most have predicted the presence of AD at a specific time.7,8 Because the progression speed and pattern of the disease differ markedly between patients, these characteristics cannot be quantified if the judgment is made at a single time point. In contrast, predicting disease progression over several years would help doctors, patients, and caregivers to deal with the disease. Meanwhile, patients with MCI often receive many follow-up tests in clinical practice, with several NP and magnetic resonance imaging (MRI) tests generally being used to indicate disease progression. However, each such test requires expensive equipment and highly trained professionals to operate it, resulting in considerable time and expense.9 Reducing the need for these tests could reduce the burden on hospitals, patients, and society. Moreover, identifying useful indicators for predicting MCI-to-AD progression could reduce the cost of performing additional screening for these predictions.

The primary purpose of this study is to identify the most effective NP tests, in combination with neural network models, for predicting the progression from MCI to AD over a 3-year period. We therefore developed a prediction model that uses patient data obtained in a series of NP tests performed over 2 years to determine if the conversion from MCI to AD occurred within that 3-year period. Moreover, the need for expensive imaging data, such as from MRI, was assessed by comparing the results of AD predictions with and without using imaging data. Finally, we identified the most-effective indicator for AD prediction by comparing the results of four representative NP tests and baseline MRI data and by using three representative neural networks that can analyze time-series data: recurrent neural networks (RNNs), long short-term memory (LSTM), and gated recurrent units (GRUs).

METHODS

Study population

The data used for this study were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (https://adni.loni.usc.edu/). The ADNI was a multicenter study initiated by Michael W. Weiner in 2004 which focused on the early diagnosis and follow-up of AD.

We downloaded the data from the ADNI database on March 13, 2022. The data comprised baseline and diagnostic (DX) data obtained at 6, 12, 24, and 36 months after baseline in 790 Americans, which included healthy individuals and patients with MCI and AD. These data included basic information such as age, sex, duration of education, and the presence of APOE4, as well as brain MRI and NP measurements (up to the 36-month visit).

MCI and AD were diagnosed based on the presence of objective memory impairment. All participants diagnosed with MCI had Mini-Mental State Examination (MMSE) scores of 24–30, a global Clinical Dementia Rating (CDR) score of 0.5, a CDR memory score of 0.5, and a score indicating impaired delayed recall of Story A on the Wechsler Memory Scale–Revised (scores of <11, ≥9, and ≥6 for ≥16, 8–15, and 0–7 years of education, respectively). All participants diagnosed with AD had an MMSE score of 20–24, a CDR score of 0.5 or 1, and a score indicating impaired delayed recall of Story A of the Wechsler Memory Scale–Revised (scores of ≤8, ≤4, and ≤2 for ≥16, 8–15, and 0–7 years of education, respectively).

Ethical approval, consent to participate, and consent for publication were obtained for this study, which was approved by the Institutional Review Board of Kangwon National University Hospital (IRB No. KNUH-2023-01-010).

MRI measures

All participants underwent brain imaging using a 1.5-T or 3-T MRI scanner. Baseline MRI data were acquired and processed at multiple sites using a standard protocol.10 We analyzed the cortical volumes of the hippocampus, entorhinal cortex, and fusiform gyrus, which all deteriorate during early-stage AD.11 The volume of each region was measured using the FreeSurfer automatic image analysis tool.12 Versions 4.3 and 5.1 of FreeSurfer were used to analyze the 1.5-T and 3-T MRI data, respectively. Each image was segmented according to an atlas defined in FreeSurfer.13

NP measures

NP test data were measured at baseline and at 12 months and 24 months, including the scores on the MMSE, the 13-item cognition subscale of the Alzheimer’s Disease Assessment Scale (ADAS13), and Preclinical Alzheimer’s Cognitive Composite (PACC).14 The original PACC score was determined using the MMSE, Logical Memory Delayed-Recall test, Digit Symbol Substitution Test (DSST), and Free and Cued Selective Reminding Test (FCSRT). Two modified PACC (mPACC) parameters were used in the ADNI database, in which the mPACCdigit used the delayed recall portion of the ADAS13 instead of the FCSRT, and mPACCtrailsB used trails B instead of the DSST.

Study process

The goal of this study was to predict MCI-to-AD progression using three recurrent-neural-network-based deep-learning models, and to compare the performances of the models. We therefore identified 561 patients diagnosed with MCI from among the 790 included participants and predicted whether they would progress to AD during 36 months of longitudinal annual follow-ups. The data at 6 months were excluded because the examinations were not accurately performed at that time point in clinical practice. Since we made a prediction every year from baseline up to 36 months, 166 patients were excluded due to missing values for NP tests or brain MRI data at baseline, 12 months, or 24 months, and the data of the remaining 395 patients were used for model learning. Fig. 1 presents a flowchart of participant selection and the overall study process.

Fig. 1. Study population and overall study process. AUC, area under the receiver operating characteristic curve; GRU, gated recurrent unit; LSTM, long short-term memory; MCI, mild cognitive impairment; RNN, recurrent neural network.


The variables used to predict progression to AD were baseline clinical data (age, sex, duration of education, and presence of APOE4), MRI data (e.g., the volumes of the hippocampus, fusiform gyrus, and entorhinal cortex),14,15,16,17 NP test results (MMSE,18 ADAS13,19,20 mPACCdigit, and mPACCtrailsB21,22,23 scores) at baseline and at 12 months and 24 months, and DX results. Only the MRI data obtained at the first visit were used, because MRI is rarely performed annually in real-world clinical practice. The MMSE can be used to evaluate the NP function of a patient, and the ADAS13 is widely used in clinical studies. The PACC score is designed to sensitively measure early changes in NP function by combining scores such as that on the MMSE. We took the PACC scores from the ADNI database as the mPACCdigit and mPACCtrailsB, which were modified to include the Digit Symbol Coding Test and Trail-Making Test B, respectively; both of these tests evaluate executive function and attention. Table 1 lists the detailed characteristics of the datasets.

Table 1. Baseline characteristics of the dataset.

Characteristic Baseline (n=395) 12 months (n=395) 24 months (n=395)
Age (yr) 72.6±7.3 - -
Sex, female 160 (40.5) - -
Education (yr) 16.1±2.8 - -
APOE ε4 carrier 192 (48.6) - -
Race, white 372 (94.2) - -
Race, black 10 (2.5) - -
Race, others 13 (3.3) - -
Married 305 (77.2) - -
Widowed 46 (11.6) - -
Divorced 36 (9.1) - -
Never married or unknown 8 (6+2) (2.0) - -
MRI
Hippocampus 6806.8±1139.1 - -
Entorhinal cortex 3508.9±773.2 - -
Fusiform gyrus 17681.6±2710.3 - -
NP test score
MMSE 27.7±1.7 27.2±2.5 26.5±3.3
ADAS13 16.3±6.4 16.7±7.9 18.5±9.6
mPACCdigit -6.1±3.7 -6.2±5.4 -7.3±6.5
mPACCtrailsB -5.7±3.6 -5.9±5.1 -6.8±6.1
RAVLT, immediate 34.3±10.1 33.8±11.1 32.6±12.1
RAVLT, learning 4.2±2.5 3.9±2.6 3.6±2.7
RAVLT, forgetting 4.8±2.4 4.7±2.6 4.8±2.6
LDELTOTAL 5.7±3.4 6.7±5.1 6.5±5.5
DIGITSCOR 38.6±11.2 38.2±12.6 36.8±13.4
TRABSCOR 110.0±62.3 116.0±70.1 119.7±75.8
FAQ 3.3±4.3 4.7±5.7 6.3±7.3

Data are mean±standard deviation or n (%) values.

ADAS13, 13-item cognition subscale of the Alzheimer’s Disease Assessment Scale; DIGITSCOR, Digit Span; FAQ, Functional Activities Questionnaire; LDELTOTAL, Delayed-Recall Total; MMSE, Mini-Mental State Examination; mPACC, modified Preclinical Alzheimer Cognitive Composite; MRI, magnetic resonance imaging; NP, neuropsychological; RAVLT, Rey’s Auditory Verbal Learning Test; TRABSCOR, Trail-Making Test B time to complete.

We compared the effectiveness of the four NP tests by using each of them independently to make predictions. To predict MCI-to-AD progression, we trained and compared the performances of prediction models based on RNN, LSTM, and GRU, which among the various deep-learning algorithms are well suited to processing sequence data. All three models used a dropout rate of 0.2, 30 epochs, and a batch size of 32. The models were trained using the sigmoid activation function and Adam optimization. We also conducted cross-validation using the K-fold technique,24 which is commonly used to stabilize model performance for small samples.

Data preprocessing

In this study we applied MaxAbsScaler25 to scale age, duration of education, and the four MRI datasets (hippocampus, fusiform, entorhinal, and middle temporal), which have various value ranges, to prevent the models from focusing on specific data when learning. We also applied one-hot encoding to sex, the presence of APOE4, and DX to inform the model that each feature value had no continuity. Scaling and one-hot encoding were not applied to the NP test results.
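The two preprocessing operations described above can be illustrated with a minimal sketch in plain Python. It mimics what max-abs scaling does (dividing each feature by its largest absolute value) rather than calling scikit-learn itself, and the helper names and toy values are hypothetical, not taken from the ADNI data.

```python
def max_abs_scale(values):
    """Scale a column into [-1, 1] by dividing by its largest absolute value."""
    largest = max(abs(v) for v in values)
    if largest == 0:
        return [0.0 for _ in values]
    return [v / largest for v in values]


def one_hot(values, categories):
    """Map each categorical value to a 0/1 indicator vector."""
    return [[1 if v == c else 0 for c in categories] for v in values]


# Hypothetical toy values, not taken from the ADNI dataset.
ages = [65.0, 72.0, 80.0]
sexes = ["F", "M", "F"]

scaled_ages = max_abs_scale(ages)           # the largest age maps to 1.0
encoded_sexes = one_hot(sexes, ["F", "M"])  # no ordering implied between categories
```

In a cross-validated setting the scaling factor would normally be computed on the training folds only; whether the study did so is not stated.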

Recurrent neural network

An RNN26 is the most-basic sequence model. It receives its input as a sequence and, at each time step, updates a hidden state from the current input and the previous hidden state, and then generates the output for that step from the updated hidden state. This model has the advantages of a simple structure and being able to handle sequential data. However, RNNs suffer from long-term dependency issues, since they do not remember the information of the initial inputs in long sequences, and from gradient instability in the backpropagation process.
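As an illustration of the recurrence, the following sketch implements one common (Elman-style) formulation, h_t = tanh(W_x x_t + W_h h_{t-1} + b), in plain Python. The weights and the three-visit input are hypothetical, and the sketch is not the study's implementation.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent step: combine the current input with the previous hidden state."""
    h_new = []
    for i in range(len(b)):
        total = b[i]
        total += sum(w_x[i][j] * x[j] for j in range(len(x)))
        total += sum(w_h[i][j] * h_prev[j] for j in range(len(h_prev)))
        h_new.append(math.tanh(total))
    return h_new


def run_rnn(sequence, w_x, w_h, b):
    """Return the hidden state after every time step, starting from zeros."""
    h = [0.0] * len(b)
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


# Hypothetical weights and a three-visit sequence with one feature per visit.
w_x = [[0.5], [-0.3]]
w_h = [[0.1, 0.0], [0.0, 0.1]]
b = [0.0, 0.0]
states = run_rnn([[1.0], [0.5], [0.2]], w_x, w_h, b)
```

Because each state passes through the same weights again at the next step, the influence of early inputs shrinks or grows multiplicatively, which is the source of the long-term dependency problem.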

Long short-term memory

An LSTM27 model was proposed to overcome the limitations of RNNs. The LSTM approach uses forget, input, and output gates to control how much of the previous and current information is retained. This method mitigates the gradient and long-term dependence problems of RNNs, but is limited by its greater computation requirements.
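A minimal sketch of the three gates, using the standard LSTM equations with a single scalar unit for brevity. The parameter layout (`p` mapping a gate name to an input weight, a recurrent weight, and a bias) is a hypothetical convenience, not the study's code.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One scalar LSTM step; p maps gate name -> (w_x, w_h, bias)."""
    def pre(name):
        w_x, w_h, bias = p[name]
        return w_x * x + w_h * h_prev + bias

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much of the candidate to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate cell content
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Hypothetical parameters: all zeros, so every gate evaluates to 0.5.
params = {name: (0.0, 0.0, 0.0)
          for name in ("forget", "input", "output", "candidate")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=1.0, p=params)
```

The additive update of the cell state `c` is what lets information and gradients survive over many steps, at the cost of four sets of weights per unit.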

Gated recurrent unit

A GRU28 model simplifies the complex structure of the LSTM by replacing its forget, input, and output gates with update and reset gates. This model alleviates the overfitting problem of LSTM by reducing the number of parameters to be learned. Its performance is similar to that of LSTM, with the added advantage of faster computation.
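The update and reset gates can be sketched for a single scalar unit as follows. The sketch assumes the convention in which the update gate weights the previous state (some libraries and papers swap the roles of z and 1 - z), and the parameter layout is hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gru_step(x, h_prev, p):
    """One scalar GRU step; p maps gate name -> (w_x, w_h, bias)."""
    def pre(name, h):
        w_x, w_h, bias = p[name]
        return w_x * x + w_h * h + bias

    z = sigmoid(pre("update", h_prev))   # share of the old state to keep
    r = sigmoid(pre("reset", h_prev))    # share of the old state fed to the candidate
    candidate = math.tanh(pre("candidate", r * h_prev))
    return z * h_prev + (1.0 - z) * candidate


# Hypothetical parameters: all zeros, so both gates evaluate to 0.5.
params = {name: (0.0, 0.0, 0.0) for name in ("update", "reset", "candidate")}
h = gru_step(x=1.0, h_prev=0.8, p=params)
```

The unit needs three sets of weights (update, reset, candidate) where an LSTM unit needs four, which is where the reduction in parameters comes from.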

Experimental settings

All experiments were conducted on a computer workstation with an Intel Core i7-8700 3.20-GHz CPU, 32 GB of RAM, and an NVIDIA GeForce RTX 2080 SUPER GPU. The host operating system was Windows 10 (64-bit) and all prediction models were implemented using TensorFlow, Python 3, and the scikit-learn library. We used K-fold cross-validation to improve the accuracy of model performance comparisons. K-fold cross-validation involves dividing the data into k nonoverlapping subsets or folds, allowing each subset to be used as a test dataset at least once, which results in more-reliable evaluations. Five-fold cross-validation was used in this study; that is, the data were divided into five folds. All three models implemented in this study (RNN, LSTM, and GRU) consisted of four layers: three dropouts and one time-distributed layer. Fig. 2 presents the overall structure of the prediction model.
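The fold construction can be sketched in plain Python as follows. The shuffling, the seed, and the helper name are assumptions for illustration; the text does not say whether the folds were shuffled or stratified, and the study itself used scikit-learn.

```python
import random


def k_fold_indices(n_samples, k, seed):
    """Shuffle the sample indices once and split them into k nonoverlapping folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((train_idx, test_idx))
    return splits


# 395 patients and five folds, as in the study; the seed is arbitrary.
splits = k_fold_indices(n_samples=395, k=5, seed=0)
```

With 395 patients and five folds, each test fold holds 79 patients and each training set holds the remaining 316.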

Fig. 2. Model overview. The input for each time point consists of NP test results of baseline, 12 months, and 24 months, and all inputs include CD and MRI data of baseline. CD, clinical data; MRI, magnetic resonance imaging; NP, neuropsychological; RNN, recurrent neural network.
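One way to read the input layout in Fig. 2 is that the baseline clinical and MRI features are repeated at every time step alongside that visit's NP score. The sketch below assembles such a sequence for one patient; the function name, feature order, and values are hypothetical.

```python
def build_sequence(clinical, mri, np_scores):
    """Return one patient's input as a list of per-visit feature vectors.

    clinical:  baseline clinical features (already scaled or encoded)
    mri:       baseline MRI volumes (already scaled); pass [] to omit MRI
    np_scores: NP test score at baseline, 12 months, and 24 months
    """
    static = list(clinical) + list(mri)
    return [static + [score] for score in np_scores]


# Hypothetical, already-preprocessed values for a single patient.
clinical = [0.9, 0.8, 1, 0, 0, 1]  # age, education, sex one-hot, APOE4 one-hot
mri = [0.7, 0.6, 0.8]              # hippocampus, entorhinal cortex, fusiform gyrus
mmse = [28, 27, 25]                # baseline, 12 months, 24 months

with_mri = build_sequence(clinical, mri, mmse)
without_mri = build_sequence(clinical, [], mmse)
```

Passing an empty MRI list yields the input for the models trained without baseline MRI data, so the two experimental conditions differ only in the width of each time step.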


RESULTS

Fig. 3 and Table 2 present the average values of the areas under the receiver operating characteristic curve (AUCs) obtained by repeating the model training 10 times using five-fold cross-validation to predict MCI-to-AD progression up to 36 months from 12 months after baseline using the baseline MRI data. Fig. 4 and Table 3 present the average values obtained without using the baseline MRI data. The rows labeled “Overall” in Tables 2 and 3 list the average values of the prediction results at 12, 24, and 36 months, to compare the overall performance of each model for different NP tests. Supplementary Figs. 1 and 2 (in the online-only Data Supplement) present the receiver operating characteristic curves for the GRU model with and without using the baseline MRI data, respectively, as a representative case.
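For readers unfamiliar with the metric, the AUC equals the probability that a randomly chosen patient who progressed receives a higher predicted score than a randomly chosen patient who did not. The sketch below computes it directly from that definition and then summarizes a list of AUCs. The labels and scores are hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which form was reported.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive-negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def mean_and_sd(values):
    """Mean and population standard deviation of a list of AUCs."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance ** 0.5


# Hypothetical labels (1 = progressed to AD) and predicted probabilities.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)  # 3 of the 4 positive-negative pairs are ordered correctly
```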

Fig. 3. Performance comparisons for each model with varying NP tests (with MRI data in the baseline). ADAS13, 13-item cognition subscale of the Alzheimer’s Disease Assessment Scale; AUC, area under the receiver operating characteristic curve; GRU, gated recurrent unit; LSTM, long short-term memory; M12, 12 months; M24, 24 months; M36, 36 months; MMSE, Mini-Mental State Examination; mPACC, modified Preclinical Alzheimer Cognitive Composite; NP, neuropsychological; RNN, recurrent neural network.


Table 2. AUCs of all prediction models using the baseline MRI data.

Model Time MMSE ADAS13 mPACCdigit mPACCtrailsB
RNN 12 months 0.657±0.037 0.803±0.007 0.831±0.005 0.849±0.005
RNN 24 months 0.871±0.002 0.908±0.003 0.903±0.003 0.917±0.003
RNN 36 months 0.963±0.002 0.975±0.002 0.973±0.001 0.971±0.001
RNN Overall 0.830±0.012 0.895±0.003 0.902±0.002 0.912±0.001
LSTM 12 months 0.666±0.019 0.804±0.003 0.830±0.003 0.848±0.005
LSTM 24 months 0.869±0.003 0.908±0.002 0.904±0.002 0.917±0.002
LSTM 36 months 0.963±0.002 0.974±0.003 0.973±0.001 0.972±0.002
LSTM Overall 0.833±0.007 0.895±0.002 0.903±0.001 0.912±0.002
GRU 12 months 0.713±0.014 0.801±0.004 0.832±0.003 0.854±0.002
GRU 24 months 0.869±0.003 0.911±0.002 0.907±0.002 0.920±0.001
GRU 36 months 0.962±0.001 0.977±0.001 0.976±0.001 0.975±0.001
GRU Overall 0.848±0.005 0.896±0.002 0.905±0.001 0.916±0.001

Data are mean±standard deviation values. Predictions were based on patients’ clinical, MRI, and NP test data.

ADAS13, 13-item cognition subscale of the Alzheimer’s Disease Assessment Scale; AUCs, areas under the receiver operating characteristic curve; GRU, gated recurrent unit; LSTM, long short-term memory; MMSE, Mini-Mental State Examination; mPACC, modified Preclinical Alzheimer Cognitive Composite; MRI, magnetic resonance imaging; NP, neuropsychological; RNN, recurrent neural network.

Fig. 4. Performance comparisons for each model with varying NP tests (without MRI data in the baseline). ADAS13, 13-item cognition subscale of the Alzheimer’s Disease Assessment Scale; AUC, area under the receiver operating characteristic curve; GRU, gated recurrent unit; LSTM, long short-term memory; M12, 12 months; M24, 24 months; M36, 36 months; MMSE, Mini-Mental State Examination; mPACC, modified Preclinical Alzheimer Cognitive Composite; NP, neuropsychological; RNN, recurrent neural network.


Table 3. AUCs of all prediction models without using the baseline MRI data.

Model Time MMSE ADAS13 mPACCdigit mPACCtrailsB
RNN 12 months 0.676±0.030 0.799±0.004 0.823±0.002 0.847±0.004
RNN 24 months 0.866±0.004 0.905±0.003 0.901±0.001 0.917±0.002
RNN 36 months 0.962±0.001 0.974±0.002 0.973±0.001 0.973±0.001
RNN Overall 0.835±0.010 0.893±0.001 0.899±0.001 0.912±0.002
LSTM 12 months 0.649±0.023 0.793±0.005 0.824±0.003 0.846±0.003
LSTM 24 months 0.867±0.003 0.907±0.003 0.903±0.003 0.917±0.002
LSTM 36 months 0.962±0.002 0.974±0.002 0.973±0.002 0.972±0.002
LSTM Overall 0.826±0.008 0.891±0.003 0.900±0.002 0.912±0.002
GRU 12 months 0.699±0.010 0.796±0.004 0.825±0.002 0.846±0.002
GRU 24 months 0.867±0.002 0.909±0.002 0.905±0.001 0.919±0.001
GRU 36 months 0.962±0.001 0.977±0.001 0.975±0.001 0.975±0.001
GRU Overall 0.843±0.003 0.894±0.001 0.902±0.001 0.914±0.001

Data are mean±standard deviation values. Predictions were based on patients’ clinical and NP test data.

ADAS13, 13-item cognition subscale of the Alzheimer’s Disease Assessment Scale; AUCs, areas under the receiver operating characteristic curve; GRU, gated recurrent unit; LSTM, long short-term memory; MMSE, Mini-Mental State Examination; mPACC, modified Preclinical Alzheimer Cognitive Composite; MRI, magnetic resonance imaging; NP, neuropsychological; RNN, recurrent neural network.

Fig. 3 and Table 2 indicate that there were no significant differences among the performances of the three models, and the performance of each gradually improved from baseline, meaning that the prediction for 36 months was the most accurate due to the accumulation of information from NP test results. However, the performance of the models varied depending on the NP tests. The performance was worst when using the MMSE, with an average AUC of 0.830–0.848, and best when using the mPACCtrailsB (AUC=0.912–0.916). Similar results were obtained for the mPACCdigit. The predictive performance was better when using the ADAS13 than the MMSE; however, it was less effective than the mPACCdigit and mPACCtrailsB. These results indicate that the mPACCdigit and mPACCtrailsB can be more useful in predicting progression to AD than the simple NP-based MMSE and the ADAS13, which are commonly used in clinical practice.

Fig. 4 and Table 3 present patterns similar to those in Fig. 3 and Table 2 but with slightly worse performance due to the baseline MRI data not being utilized. They likewise indicate that the performances of the three models gradually improved over time, with no significant intermodel differences. Additionally, the performance of each model was the highest when using the mPACCtrailsB, with an average AUC of 0.913, and was worse when using the mPACCdigit, ADAS13, or MMSE.

When comparing the performance in predicting the progression to AD with and without the baseline MRI data, the AUCs decreased by an average of 0.002, 0.002, 0.003, and 0.001 for the MMSE, ADAS13, mPACCdigit, and mPACCtrailsB, respectively, when those data were not used. These results indicate that the presence of MRI data made no significant difference to the performance of predicting MCI-to-AD progression, and hence that AD can be predicted effectively using only NP tests.
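The size of these differences can be checked from the "Overall" rows of Tables 2 and 3. The sketch below averages the three models for each NP test and subtracts; because it works from the rounded table entries, the third decimal can differ slightly from the figures reported in the text, which were presumably computed from unrounded values.

```python
# "Overall" AUCs copied from Tables 2 and 3, ordered RNN, LSTM, GRU.
with_mri = {
    "MMSE": [0.830, 0.833, 0.848],
    "ADAS13": [0.895, 0.895, 0.896],
    "mPACCdigit": [0.902, 0.903, 0.905],
    "mPACCtrailsB": [0.912, 0.912, 0.916],
}
without_mri = {
    "MMSE": [0.835, 0.826, 0.843],
    "ADAS13": [0.893, 0.891, 0.894],
    "mPACCdigit": [0.899, 0.900, 0.902],
    "mPACCtrailsB": [0.912, 0.912, 0.914],
}


def mean(values):
    return sum(values) / len(values)


# Average drop in AUC per NP test when the baseline MRI data are left out.
drops = {
    test: mean(with_mri[test]) - mean(without_mri[test])
    for test in with_mri
}
```

Every recomputed drop is below 0.004, and the smallest is for the mPACCtrailsB, in line with the reported pattern.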

DISCUSSION

This study makes several important contributions. First, we studied RNN-based models that utilized previous results using minimal features, and predicted MCI-to-AD progression using annual NP test results. We did this by training prediction models for patients diagnosed with MCI at baseline using the ADNI dataset and predicted MCI-to-AD progression over a 3-year period. Most previous investigations of predicting cognitive deterioration from MCI to AD used multimodal data,29,30 while we only used a tabular-style dataset obtained using a simple examination. Meanwhile, the traditional logistic regression approach can also be applied to our study, albeit less naturally, by developing an independent logistic regression model for each of the 12-, 24-, and 36-month predictions. Supplementary Tables 1 and 2 (in the online-only Data Supplement) indicate that the average performance of the three logistic regression models was similar to that of the RNN-based prediction models. We note here that as the number of time steps to predict increases, so does the number of logistic regression models.

Second, the mPACC was the most-effective representative NP test for predicting MCI-to-AD progression. Various combinations of the four longitudinal NP tests and baseline clinical and MRI data were assessed to determine the model that performed best. The performances of the various prediction models were compared using their AUCs, and the average over 10 repetitions of five-fold cross-validation was used for generalization.

Third, we performed extensive experiments to compare the predictive performance of the RNN, LSTM, and GRU models, and found that the GRU model performed best with the mPACCtrailsB and baseline MRI results, with an average AUC of 0.916. This exceeds the previously reported AUC of 0.75 for predicting MCI-to-AD progression, which was obtained using 10-fold cross-validation with the radiomic features of amyloid-beta positron-emission tomography (PET) images.31 Varatharajah et al.32 obtained an AUC of 0.93 for predicting MCI-to-AD progression using data from medical imaging, the cerebrospinal fluid, and genes. Platero et al.33 also used age, sex, duration of education, standard NP measures, and MRI markers obtained from the ADNI dataset to predict MCI-to-AD progression, and obtained AUCs of 0.892, 0.905, and 0.928 at 1, 2, and 3 years, respectively. Although not directly comparable, the AUCs of predictions at the same time points using our GRU model with mPACCtrailsB were 0.854, 0.920, and 0.975, respectively, which reflects the yearly accumulation of results from NP tests. This indicates that a time-series analysis with simple examination data is effective in predicting MCI-to-AD progression.

Fourth, we examined the effectiveness of using baseline MRI data to predict MCI-to-AD progression. We applied deep-learning methods to easily accessible longitudinal NP tests combined with other baseline data. Although using baseline MRI data could improve the predictive performance, the difference was not statistically significant. More specifically, the models without MRI data obtained similar performances, especially at 24 months and 36 months, meaning that they could be used in the absence of initial MRI data and would only be based on serial NP test data collected annually in clinical practice.

Composite scores such as the mPACC were superior across all periods to screening tests such as the MMSE, indicating that the mPACC can provide accurate predictions across ongoing follow-ups. However, the performance at 24 months and 36 months became similar regardless of the NP test used, suggesting that these RNN methods, even those including the MMSE, can be applied as follow-up progresses. This is clinically meaningful given that the MMSE is a basic clinical tool that is commonly used for annual follow-ups.

This study was also subject to some limitations. First, while our methods were internally validated using K-fold cross-validation, the study analyzed data from only 395 patients, which restricts the clinical generalizability of the results. Therefore, future studies should perform external validation by repeating the experiments with external data. It could also be useful to develop an application that utilizes our model to contribute to clinical decision-making. Second, we used older FreeSurfer software (versions 4.3 and 5.1) to analyze ADNI data, which may have restricted the detailed analysis of brain regions compared with using newer versions. Although these versions are valid for quantitative brain analysis and provide valuable insights, the lack of features in the older FreeSurfer versions could have reduced the accuracy of the ultrastructural property analysis. While previous studies indicated minimal volumetric differences between FreeSurfer versions, the use of the most recent software may have improved the quality of our findings. We aim to address this limitation in future research by employing newer versions of FreeSurfer and improving the quality of imaging data. Third, we did not analyze confirmatory biomarkers such as amyloid PET images or cerebrospinal fluid, which are included in the ADNI to define AD. Although these biomarkers would have more accurately defined AD, we preferred to predict MCI-to-AD progression using a deep-learning algorithm combined with longitudinal NP tests, which are relatively inexpensive, noninvasive, and routinely used in clinical practice. Because sample size is critical in deep learning, we adopted a syndromal definition in this study, and further study using subjects with confirmed biomarkers is required.

In conclusion, this study has advanced MCI-to-AD predictions using minimal features and annual NP tests and highlighted the superior performance of the GRU model using the mPACC as a key indicator. Despite some limitations, our findings indicate the potential of deep learning in clinical practice, including when comprehensive data or resources are not available. This research sets the stage for more-comprehensive future studies to enhance the diagnosis and monitoring of AD.

Footnotes

Data used in the preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (https://adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in the analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.

Author Contributions:
  • Conceptualization: Chaeyoon Park, Jae-Won Jang, Hyeonseung Im.
  • Data curation: Chaeyoon Park, Gihun Joo, Seunghun Shin, Sujin Yum, Jae-Won Jang, Hyeonseung Im.
  • Formal analysis: Chaeyoon Park, Gihun Joo, Seunghun Shin.
  • Funding acquisition: Jae-Won Jang, Hyeonseung Im.
  • Investigation: Chaeyoon Park, Gihun Joo, Minji Roh.
  • Methodology: Chaeyoon Park, Gihun Joo, Hyeonseung Im.
  • Project administration: Jae-Won Jang, Hyeonseung Im.
  • Resources: Jae-Won Jang, Hyeonseung Im.
  • Software: Chaeyoon Park, Gihun Joo.
  • Supervision: Jae-Won Jang, Hyeonseung Im.
  • Validation: Chaeyoon Park, Gihun Joo, Minji Roh, Na Young Yeo, Sang Won Park, Jae-Won Jang, Hyeonseung Im.
  • Visualization: Chaeyoon Park, Gihun Joo.
  • Writing—original draft: Chaeyoon Park, Gihun Joo, Jae-Won Jang, Hyeonseung Im.
  • Writing—review & editing: Chaeyoon Park, Gihun Joo, Minji Roh, Na Young Yeo, Sang Won Park, Jae-Won Jang, Hyeonseung Im.

Conflicts of Interest: The authors have no potential conflicts of interest to disclose.

Funding Statement: This research was supported by “Regional Innovation Strategy (RIS)” through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (MOE) (2022RIS-005). This work was also supported by Institute for Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2022-0-01196, Regional strategic Industry convergence security core talent training business) and by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2023-00242528).

Availability of Data and Material

The datasets generated or analyzed during the study are available from the corresponding author on reasonable request.

Supplementary Materials

The online-only Data Supplement is available with this article at https://doi.org/10.3988/jcn.2023.0289.

Supplementary Table 1

AUC for all prediction models (with MRI data in the baseline)

Supplementary Table 2

AUC for all prediction models (without MRI data in the baseline)

Supplementary Fig. 1

The ROC curves of the GRU model for each NP test (with MRI data in the baseline). ADAS13, 13-item cognition subscale of the Alzheimer's Disease Assessment Scale; AUC, area under the receiver operating characteristic curve; GRU, gated recurrent unit; M12, 12 months; M24, 24 months; M36, 36 months; MMSE, Mini-Mental State Examination; mPACC, modified Preclinical Alzheimer Cognitive Composite; MRI, magnetic resonance imaging; NP, neuropsychological; ROC, receiver operating characteristic.

Supplementary Fig. 2

The ROC curves of the GRU model for each NP test (without MRI data in the baseline). ADAS13, 13-item cognition subscale of the Alzheimer's Disease Assessment Scale; AUC, area under the receiver operating characteristic curve; GRU, gated recurrent unit; M12, 12 months; M24, 24 months; M36, 36 months; MMSE, Mini-Mental State Examination; mPACC, modified Preclinical Alzheimer Cognitive Composite; MRI, magnetic resonance imaging; NP, neuropsychological; ROC, receiver operating characteristic.


Articles from Journal of Clinical Neurology (Seoul, Korea) are provided here courtesy of Korean Neurological Association