Abstract
Background
With global population aging, the prevalence of cognitive dysfunction in the elderly population has risen significantly. Early intervention may substantially alleviate the disease burden and reduce the economic costs associated with cognitive impairment. This study aims to construct a risk prediction model for cognitive dysfunction based on machine learning (ML) algorithms, providing healthcare professionals and patients with a more accurate and effective tool for risk assessment.
Methods
This study included 1,325 elderly participants who completed cognitive assessments and comprehensive laboratory blood tests. Risk factors for cognitive dysfunction were identified through univariate analysis, multivariate logistic regression, LASSO regression, and the Boruta algorithm. Nine ML methods—Random Forest (RF), Light Gradient Boosting Machine (LightGBM), Extreme Gradient Boosting (XGBoost), Logistic Regression, K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Artificial Neural Network (ANN), Decision Tree, and Elastic Net—were employed to construct the prediction models. The Shapley Additive Explanations (SHAP) algorithm was utilized to interpret the final model.
Results
The Random Forest model exhibited the highest predictive performance, with an AUC value exceeding those of other models. SHAP analysis identified age, race, education level, diabetes, and depression as the primary predictors of cognitive dysfunction in the elderly. The calibration curve indicated a strong alignment between the model’s predictions and actual outcomes, while the decision curve confirmed the model’s clinical applicability.
Conclusion
Age, race, education level, diabetes, and depression are significant influencing factors of cognitive dysfunction in the elderly. Among the ML algorithms evaluated, the Random Forest model exhibited the best predictive performance.
1. Introduction
The global population is aging rapidly, and increased life expectancy has made cognitive dysfunction in the elderly a major public health concern [1]. Cognitive impairment not only diminishes cognitive function and quality of life but also imposes substantial disease and economic burdens on patients and their families [2–4]. According to the World Alzheimer Report, an estimated 46.8 million individuals worldwide were affected by dementia in 2015, with projections indicating a rise to 131.5 million by 2050 [5]. Every three seconds, someone is diagnosed with dementia, and the annual cost of dementia is estimated at $1 trillion, a figure expected to double by 2030. Alzheimer’s disease (AD) is the most common form of dementia [6]. Primary prevention of AD holds significant potential, as one-third of global AD cases are attributable to modifiable risk factors. The World Health Organization (WHO) Guidelines for Risk Reduction of Cognitive Decline and Dementia [7] and the 2020 Lancet Commission report [8] identified several modifiable risk factors for dementia, including low education, advanced age, smoking, excessive alcohol consumption, obesity, depression, physical inactivity, hearing impairment, hypertension, diabetes, social isolation, traumatic brain injury, and air pollution.
Growing evidence suggests a link between cognitive dysfunction and inflammation [9]. Immunosenescence and inflammaging are hallmark features of aging [10], with inflammation playing a pivotal role in the pathogenesis of cognitive decline and dementia [11–14]. The systemic immune-inflammation index (SII) and systemic inflammation response index (SIRI), recently developed composite inflammatory markers, are widely used to assess systemic inflammation [15,16]. Studies by Wang et al. [17–22] have demonstrated a significant association between SII, SIRI, and cognitive impairment. Insulin resistance (IR) is also significantly associated with an increased risk of cognitive decline [23]. The triglyceride-glucose (TyG) index, derived from fasting triglyceride (TG) and fasting blood glucose (FPG) levels, is a cost-effective and readily available surrogate marker for IR [24]. Research [25–28] indicates that a higher TyG index is significantly correlated with an elevated risk of dementia. Investigating the relationship between SII, SIRI, the TyG index, and cognitive function may provide a basis for early detection of cognitive impairment.
With the rapid advancement of artificial intelligence, risk models incorporating demographic, behavioral, and psychosocial factors have emerged. Machine learning (ML) offers unique advantages in medical prediction by automatically identifying key predictors and their interactions through feature importance ranking and decision-splitting mechanisms [29]. However, existing studies are often limited to single-algorithm validation, lacking multi-model performance comparisons and interpretability analyses. Therefore, this study leverages data from the 2011–2014 National Health and Nutrition Examination Survey (NHANES) to explore risk factors for cognitive dysfunction in the elderly. By employing nine ML algorithms to construct predictive models and comparing their performance using calibration and decision curve analyses, this study aims to identify the optimal model, offering new insights for healthcare professionals in predicting cognitive impairment risk among the elderly.
2. Materials and methods
2.1 Study population
This study utilized cross-sectional data from the 2011–2014 National Health and Nutrition Examination Survey (NHANES), conducted by the National Center for Health Statistics (NCHS) and the Centers for Disease Control and Prevention (CDC) to assess the health and nutritional status of individuals across all age groups in the United States. The survey was approved by the NCHS Ethics Review Board, and all participants provided written informed consent (https://www.cdc.gov/nchs/nhanes/about/erb.html?CDC_AAref_Val=https://www.cdc.gov/nchs/nhanes/irba98.htm). Inclusion criteria were: (1) age ≥ 60 years; and (2) complete responses to all survey items. A total of 1,325 elderly participants were included in the final analysis (Fig 1).
Fig 1. Flowchart of participant selection.
2.2 Study variables and definitions
2.2.1 Cognitive function assessment.
Cognitive ability in participants aged ≥60 years was evaluated using three standardized tests:
Consortium to Establish a Registry for Alzheimer’s Disease (CERAD): Including immediate recall (CERAD-IR) and delayed recall (CERAD-DR) tests.
Animal Fluency Test (AFT).
Digit Symbol Substitution Test (DSST).
These tools are widely used in studies analyzing cognitive function and its risk factors [30–32]. A composite Z-score, termed the Overall cognitive ability score, was calculated by averaging the standardized scores of the CERAD, AFT, and DSST tests [33–36]. Although no definitive threshold for cognitive impairment has been established in prior studies, the 25th percentile of the Overall cognitive ability score was used as the cutoff in this study [37–39]. Participants were categorized into two groups: normal cognitive function and cognitive impairment.
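The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and toy scores are hypothetical, the CERAD subtests are treated as a single score, the population standard deviation is used for standardization, and "impaired" is taken to mean strictly below the 25th percentile. The source does not specify these details.

```python
from statistics import mean, pstdev, quantiles


def z_scores(values):
    """Standardize raw scores to mean 0 and SD 1 (population SD: an assumption)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


def composite_scores(cerad, aft, dsst):
    """Average the per-test z-scores for each participant."""
    columns = [z_scores(cerad), z_scores(aft), z_scores(dsst)]
    return [mean(row) for row in zip(*columns)]


def flag_impairment(composite):
    """Flag scores strictly below the 25th percentile (strictness is an assumption)."""
    cutoff = quantiles(composite, n=4, method="inclusive")[0]
    return cutoff, [score < cutoff for score in composite]


# Toy scores for eight hypothetical participants (not NHANES data).
cerad = [18, 22, 25, 15, 27, 20, 24, 12]
aft = [14, 17, 20, 11, 22, 16, 19, 9]
dsst = [40, 48, 55, 30, 60, 45, 52, 25]

composite = composite_scores(cerad, aft, dsst)
cutoff, impaired = flag_impairment(composite)
print(f"cutoff = {cutoff:.3f}, impaired = {impaired}")
```

Because the cutoff is a percentile of the sample itself, roughly one quarter of participants fall into the impaired group by construction.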
2.2.2 Depression.
The Patient Health Questionnaire-9 (PHQ-9), a self-report tool widely used in clinical practice and research, was employed to screen, diagnose, and assess depression. The questionnaire consists of nine items covering core depressive symptoms, including low mood, loss of interest, sleep disturbances, appetite changes, fatigue, feelings of worthlessness, poor concentration, psychomotor retardation, and suicidal ideation. Each item is scored from 0 (“not at all”) to 3 (“nearly every day”), with a maximum total score of 27. A PHQ-9 score ≥10 was considered indicative of depression [40].
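As a minimal sketch of the PHQ-9 rule described above (nine items scored 0 to 3, total ≥10 indicating depression), the following Python snippet may help. The function names and the input validation are illustrative assumptions, not part of the NHANES processing pipeline.

```python
def phq9_total(item_scores):
    """Sum the nine PHQ-9 item scores, each of which must be 0, 1, 2, or 3."""
    if len(item_scores) != 9:
        raise ValueError("PHQ-9 requires exactly nine item scores")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each PHQ-9 item must be scored 0, 1, 2, or 3")
    return sum(item_scores)


def has_depression(item_scores, cutoff=10):
    """Apply the cutoff used in this study: a total score >= 10."""
    return phq9_total(item_scores) >= cutoff


# Hypothetical responses: total of 9 (below cutoff) and 10 (at cutoff).
print(has_depression([1, 1, 1, 1, 1, 1, 1, 1, 1]))  # total 9
print(has_depression([2, 1, 1, 1, 1, 1, 1, 1, 1]))  # total 10
```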
2.2.3 Immune-inflammatory indices and triglyceride-glucose index.
The systemic immune-inflammation index (SII), systemic inflammation response index (SIRI), and triglyceride-glucose (TyG) index were calculated using complete blood count (CBC) laboratory results from the NHANES database [18]. The following measurements were used (reported in 1,000 cells/μL or mg/dL): platelet count (PC), neutrophil count (NC), monocyte count (MC), lymphocyte count (LC), triglycerides (TG), and fasting blood glucose (FPG). The indices were derived as follows:
SII = (platelet count × neutrophil count)/lymphocyte count;
SIRI = (neutrophil count × monocyte count)/lymphocyte count;
TyG = ln[fasting triglycerides (TG, mg/dL) × fasting blood glucose (FPG, mg/dL)/2].
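The three indices translate directly into code. The sketch below assumes cell counts in 1,000 cells/μL and lipid and glucose values in mg/dL, as stated above. The function names and example values are hypothetical.

```python
import math


def sii(platelets, neutrophils, lymphocytes):
    """Systemic immune-inflammation index: (platelets x neutrophils) / lymphocytes."""
    return platelets * neutrophils / lymphocytes


def siri(neutrophils, monocytes, lymphocytes):
    """Systemic inflammation response index: (neutrophils x monocytes) / lymphocytes."""
    return neutrophils * monocytes / lymphocytes


def tyg(triglycerides_mg_dl, glucose_mg_dl):
    """Triglyceride-glucose index: ln(TG x FPG / 2), both in mg/dL."""
    return math.log(triglycerides_mg_dl * glucose_mg_dl / 2)


# Hypothetical participant: counts in 1,000 cells/uL, TG and FPG in mg/dL.
print(sii(platelets=250, neutrophils=4.0, lymphocytes=2.0))
print(siri(neutrophils=4.0, monocytes=0.5, lymphocytes=2.0))
print(round(tyg(triglycerides_mg_dl=150, glucose_mg_dl=100), 3))
```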
2.2.4 Covariates.
Based on study design requirements, variables were assessed across different dimensions, with categorical variables assigned numerical values. Binary variables were coded as 0 or 1, while multi-category variables were assigned incremental values (e.g., 0, 1, 2). The selected covariates included:
Continuous variables: Age, minutes of sedentary activity.
Categorical variables: Gender (male, female); Race/ethnicity (Mexican American, Other Hispanic, Non-Hispanic White, Non-Hispanic Black, Other Race, including multiracial); Education level (<9th grade, 9–11th grade [including 12th grade without diploma], high school graduate/GED or equivalent, some college or AA degree, college graduate or above); Marital status (married, widowed, divorced, separated, never married, living with partner); BMI, kg/m² (<25, 25 to <30, ≥30); Self-reported sleep trouble (no, yes); Diabetes (no, borderline, yes); Heart disease (no, yes); Stroke (no, yes); Depression (no, yes).
2.3 Model development
Risk factors for cognitive impairment were screened using univariate analysis, multivariate logistic regression, Least Absolute Shrinkage and Selection Operator (LASSO) regression, and the Boruta algorithm. The dataset was randomly split into training (70%) and testing (30%) sets. Nine supervised machine learning (ML) algorithms were employed to construct prediction models:
1. Random Forest (RF); 2. Light Gradient Boosting Machine (LightGBM); 3. Extreme Gradient Boosting (XGBoost); 4. Logistic Regression; 5. K-Nearest Neighbor (KNN); 6. Support Vector Machine (SVM); 7. Artificial Neural Network (ANN); 8. Decision Tree; 9. Elastic Net.
Hyperparameter optimization is crucial for model performance. We employed grid search with 5-fold cross-validation to identify the optimal hyperparameter combination for each algorithm, and the configuration yielding the best average performance across the folds was selected for model building. The search space and the selected value for each hyperparameter are detailed in Table 1.
Table 1. Hyperparameter values for each machine learning algorithm.
| Algorithms | Hyperparameters | Explanations | Search Space | Optimal Value |
|---|---|---|---|---|
| KNN | n_neighbors | Number of K neighbors | {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} | 1 |
| | p | Power parameter of distance measurement (Minkowski distance) | {1, 2} | 1 |
| | weights | Weight functions used in prediction | {uniform, distance} | uniform |
| SVM | C | Regularization parameter | {0.001, 0.01, 0.1, 1, 10, 100} | 100 |
| | gamma | Kernel function parameters | {0.001, 0.01, 0.1, 1, 10, 100} | 1 |
| | kernel | Kernel function type | {linear, rbf, sigmoid} | rbf |
| ANN | hidden_layer_sizes | Number of hidden layer neurons | {1, 2,..., 32} | 27 |
| | max_iter | Maximum number of iterations | {500, 1000, 1500} | 500 |
| Decision Tree | max_depth | The maximum depth of a tree | {2, 3, 4, 5, 6, 7, 8, None} | None |
| | min_samples_split | The minimum number of samples required to split internal nodes | {2, 3,..., 10} | 4 |
| | min_samples_leaf | Minimum number of samples required on leaf nodes | {1, 2, 3, 4, 5} | 1 |
| | criterion | A function for measuring the quality of splitting | {gini, entropy} | entropy |
| | max_features | The number of features to consider when searching for the best segmentation | {auto, sqrt, log2, None} | None |
| LightGBM | learning_rate | Learning rate | {0.01, 0.05, 0.1} | 0.05 |
| | num_leaves | The maximum number of leaves per tree | {31, 40, 50} | 50 |
| | max_depth | The maximum depth of a tree | {−1, 5, 10} | 10 |
| | n_estimators | Iteration times (number of trees) | {100, 200, 300} | 200 |
| | subsample | Subsample ratio | {0.8, 0.9, 1.0} | 0.9 |
| | colsample_bytree | Column sampling ratio for each tree | {0.8, 0.9, 1.0} | 1 |
| | reg_alpha | L1 regularization | {0, 0.1, 0.5} | 0.1 |
| | reg_lambda | L2 regularization | {0, 0.1, 0.5} | 0 |
| Random Forest | n_estimators | The number of trees in the forest | {10, 20,..., 200} | 40 |
| | max_depth | The maximum depth of a tree | {None, 3, 4,..., 13} | 10 |
| | min_samples_split | The minimum number of samples required to split internal nodes | {2, 3,..., 10} | 3 |
| | min_samples_leaf | Minimum number of samples required on leaf nodes | {1, 2, 3, 4, 5} | 2 |
| | max_features | The number of features to consider when searching for the best segmentation | {auto, sqrt, log2, None, 1,...} | log2 |
| XGBoost | learning_rate | Learning rate | {0.01, 0.05, 0.1} | 0.01 |
| | max_depth | The maximum depth of a tree | {3, 6, 9} | 9 |
| | n_estimators | The number of trees | {50, 100, 200} | 100 |
| | subsample | Subsample ratio | {0.8, 0.9, 1.0} | 0.9 |
| | colsample_bytree | Column sampling ratio for each tree | {0.8, 0.9, 1.0} | 0.8 |
| Elastic Net | C | Regularization strength | {0.001, 0.01, 0.1, 1, 10, 100} | 0.01 |
| | l1_ratio | Regularization ratio of L1 and L2 | {0.1, 0.3, 0.5, 0.7, 0.9} | 0.3 |
| Logistic Regression | C | The reciprocal of the regularization strength | {1e-4, 1e-3,..., 1e+4} | 0.01 |
| | penalty | Specify the penalties used in regularization | {l1, l2, None} | l2 |
| | solver | Algorithms used for optimizing problems | {liblinear, lbfgs} | lbfgs |
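To make the grid search with 5-fold cross-validation concrete, the sketch below enumerates the XGBoost search space from Table 1 and picks the combination with the best mean fold score. It uses only the Python standard library. The helper names, the unshuffled contiguous folds, the sample count of 100, and the `toy_score` placeholder (which stands in for training and validating a real model) are all assumptions for illustration, not the authors' implementation.

```python
from itertools import product

# XGBoost search space as listed in Table 1.
XGBOOST_GRID = {
    "learning_rate": [0.01, 0.05, 0.1],
    "max_depth": [3, 6, 9],
    "n_estimators": [50, 100, 200],
    "subsample": [0.8, 0.9, 1.0],
    "colsample_bytree": [0.8, 0.9, 1.0],
}


def expand_grid(grid):
    """Yield every hyperparameter combination in the grid as a dict."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous (train, validation) index pairs."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(start)) + list(range(start + size, n_samples))
        folds.append((train, val))
        start += size
    return folds


def grid_search(grid, n_samples, score_fn, k=5):
    """Return the combination with the best mean validation score over k folds."""
    folds = k_fold_indices(n_samples, k)
    best_params, best_score = None, float("-inf")
    for params in expand_grid(grid):
        scores = [score_fn(params, train, val) for train, val in folds]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


def toy_score(params, train_idx, val_idx):
    """Placeholder scorer (assumption): a real one would fit on train_idx
    and return a validation metric such as AUC on val_idx."""
    return -abs(params["max_depth"] - 6) - abs(params["subsample"] - 0.9)


combos = list(expand_grid(XGBOOST_GRID))
n_fits = len(combos) * 5  # every combination is fitted once per fold
best_params, best_score = grid_search(XGBOOST_GRID, n_samples=100, score_fn=toy_score)
print(len(combos), n_fits, best_params)
```

The count of fits grows multiplicatively with each added hyperparameter, which is why the search spaces in Table 1 are kept to a handful of values per parameter.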
2.4 Evaluation metrics
Performance metrics included accuracy, recall, specificity, positive predictive value (PPV), negative predictive value (NPV), area under the receiver operating characteristic curve (AUC-ROC), and F1-score. Calibration curves were used to assess model consistency, with the Brier score quantifying calibration performance (lower scores indicate better accuracy: 0–0.1 = excellent, 0.1–0.25 = good, > 0.25 = poor). Decision curve analysis (DCA) evaluated clinical utility, and the Shapley Additive Explanations (SHAP) algorithm interpreted feature contributions to model predictions.
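The threshold-based metrics listed above all follow from the four cells of a confusion matrix, and the Brier score is the mean squared difference between predicted probabilities and observed outcomes. The following sketch shows these standard definitions. The function names and the example counts and probabilities are hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)          # positive predictive value (precision)
    npv = tn / (tn + fn)          # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical confusion matrix and predictions (not study data).
metrics = classification_metrics(tp=80, fp=20, tn=75, fn=25)
brier = brier_score([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4])
print({name: round(value, 3) for name, value in metrics.items()}, round(brier, 4))
```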
2.5 Statistical analysis
Analyses were conducted using R 4.3 and Python 3.11.5. Non-normally distributed continuous variables were expressed as median (interquartile range) [M(P25, P75)], with group comparisons performed using nonparametric tests. Categorical variables were reported as frequencies and percentages [n (%)], with group comparisons assessed via chi-square (χ²) tests, as reported in Table 2. A two-tailed P < 0.05 was considered statistically significant.
3. Results
3.1 Participant characteristics
Among the 1,325 participants, 1,008 (76.08%) had normal cognitive function, while 317 (23.92%) exhibited cognitive impairment, consistent with the group sizes in Table 2. Significant differences (P < 0.05) were observed between groups for age, race, education level, marital status, diabetes, stroke, and depression. No significant differences (P > 0.05) were found for sedentary activity, SII, SIRI, TyG index, gender, BMI, self-reported sleep trouble, or heart disease (Table 2).
Table 2. Basic characteristics of study participants (n = 1325).
| Variables | Total (n = 1325) | Non-cognitive impairment group (n = 1008) | Cognitive impairment group (n = 317) | Statistic | P |
|---|---|---|---|---|---|
| Age, M(P25, P75) | 69.00(64.00,76.00) | 67.00(63.00,74.00) | 73.00(66.00,80.00) | Z = −7.592 | <0.001 |
| Minutes sedentary activity, M(P25, P75) | 360.00(240.00,480.00) | 360.00(240.00,480.00) | 360.00(240.00,480.00) | Z = −1.346 | 0.178 |
| SII, M(P25, P75) | 440.75(318.50,642.21) | 442.57(320.57,629.50) | 434.45(306.31,702.80) | Z = −0.637 | 0.524 |
| SIRI, M(P25, P75) | 1.08(0.72,1.60) | 1.06(0.71,1.53) | 1.14(0.72,1.73) | Z = −1.433 | 0.152 |
| TyG, M(P25, P75) | 8.64(8.27,9.07) | 8.63(8.24,9.06) | 8.67(8.34,9.11) | Z = −1.723 | 0.085 |
| Gender, n(%) | | | | χ² = 3.020 | 0.082 |
| Male | 650(49.06) | 481(47.72) | 169(53.31) | | |
| Female | 675(50.94) | 527(52.28) | 148(46.69) | | |
| Race/Hispanic origin, n(%) | | | | χ² = 76.251 | <0.001 |
| Mexican American | 117(8.83) | 72(7.14) | 45(14.20) | | |
| Other Hispanic | 135(10.19) | 77(7.64) | 58(18.30) | | |
| Non-Hispanic White | 672(50.72) | 563(55.85) | 109(34.38) | | |
| Non-Hispanic Black | 270(20.38) | 185(18.35) | 85(26.81) | | |
| Other Race – Including Multi-Racial | 131(9.89) | 111(11.01) | 20(6.31) | | |
| Education level, n(%) | | | | χ² = 301.940 | <0.001 |
| Less than 9th grade | 149(11.25) | 42(4.17) | 107(33.75) | | |
| 9-11th grade (Includes 12th grade with no diploma) | 187(14.11) | 111(11.01) | 76(23.97) | | |
| High school graduate/GED or equivalent | 317(23.92) | 243(24.11) | 74(23.34) | | |
| Some college or AA degree | 375(28.30) | 333(33.04) | 42(13.25) | | |
| College graduate or above | 297(22.42) | 279(27.68) | 18(5.68) | | |
| Marital status, n(%) | | | | χ² = 32.456 | <0.001 |
| Married | 784(59.17) | 615(61.01) | 169(53.31) | | |
| Widowed | 243(18.34) | 161(15.97) | 82(25.87) | | |
| Divorced | 157(11.85) | 124(12.30) | 33(10.41) | | |
| Separated | 29(2.19) | 14(1.39) | 15(4.73) | | |
| Never married | 70(5.28) | 60(5.95) | 10(3.15) | | |
| Living with a partner | 42(3.17) | 34(3.37) | 8(2.52) | | |
| BMI, n(%) | | | | χ² = 1.076 | 0.584 |
| < 25 | 366(27.62) | 274(27.18) | 92(29.02) | | |
| 25 to < 30 | 462(34.87) | 359(35.62) | 103(32.49) | | |
| ≥ 30 | 497(37.51) | 375(37.20) | 122(38.49) | | |
| Ever told the doctor had trouble sleeping, n(%) | | | | χ² = 0.285 | 0.593 |
| No | 933(70.42) | 706(70.04) | 227(71.61) | | |
| Yes | 392(29.58) | 302(29.96) | 90(28.39) | | |
| Diabetes, n(%) | | | | χ² = 35.861 | <0.001 |
| No | 969(73.13) | 769(76.29) | 200(63.09) | | |
| Borderline | 59(4.45) | 51(5.06) | 8(2.52) | | |
| Yes | 297(22.42) | 188(18.65) | 109(34.38) | | |
| Heart disease, n(%) | | | | χ² = 1.754 | 0.185 |
| No | 1192(89.96) | 913(90.58) | 279(88.01) | | |
| Yes | 133(10.04) | 95(9.42) | 38(11.99) | | |
| Stroke, n(%) | | | | χ² = 6.040 | 0.014 |
| No | 1232(92.98) | 947(93.95) | 285(89.91) | | |
| Yes | 93(7.02) | 61(6.05) | 32(10.09) | | |
| Depression, n(%) | | | | χ² = 28.234 | <0.001 |
| No | 1197(90.34) | 935(92.76) | 262(82.65) | | |
| Yes | 128(9.66) | 73(7.24) | 55(17.35) | | |
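As a worked check on how the χ² statistics in Table 2 arise, the sketch below applies Pearson's chi-square test (without continuity correction, an assumption that matches the reported value) to the Gender rows. It uses only the standard library. For one degree of freedom the P value equals erfc(√(χ²/2)). The function name is hypothetical.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test for a 2x2 table, without continuity correction.

    Returns (chi2, p), where p is the upper-tail probability of a
    chi-square distribution with 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Gender counts from Table 2: rows are male and female,
# columns are the non-impaired and impaired groups.
gender = [[481, 169], [527, 148]]
chi2, p = chi_square_2x2(gender)
print(f"chi2 = {chi2:.3f}, P = {p:.3f}")
```

Rows with more than two categories (for example race or education level) have more degrees of freedom and would need the general chi-square distribution rather than this 1-df shortcut.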
3.2 Feature selection
Multivariate logistic regression, LASSO regression, and the Boruta algorithm were used to screen risk factors closely related to cognitive impairment in the elderly, with the variables that showed statistically significant differences in the univariate analysis entered as candidates. The five important predictors obtained from multivariate logistic regression, LASSO regression, and the Boruta algorithm were age, race, education level, diabetes, and depression (Table 3, Fig 2).
Table 3. Multivariate logistic analysis.
| Variables | β | S.E | Z | P | OR (95%CI) |
|---|---|---|---|---|---|
| Intercept | −8.099 | 1.004 | −8.069 | | |
| Age | 0.126 | 0.014 | 8.781 | <0.001 | 1.134(1.102 ~ 1.166) |
| Race/Hispanic origin | | | | | |
| Mexican American | | | | | 1.000(Reference) |
| Other Hispanic | 0.737 | 0.336 | 2.194 | 0.028 | 2.090(1.082 ~ 4.038) |
| Non-Hispanic White | −0.725 | 0.308 | −2.356 | 0.018 | 0.485(0.265 ~ 0.885) |
| Non-Hispanic Black | 0.646 | 0.315 | 2.047 | 0.041 | 1.907(1.028 ~ 3.539) |
| Other Race – Including Multi-Racial | −0.300 | 0.391 | −0.767 | 0.443 | 0.741(0.344 ~ 1.594) |
| Education level | | | | | |
| Less than 9th grade | | | | | 1.000(Reference) |
| 9-11th grade (Includes 12th grade with no diploma) | −1.358 | 0.274 | −4.959 | <0.001 | 0.257(0.150 ~ 0.440) |
| High school graduate/GED or equivalent | −2.143 | 0.270 | −7.941 | <0.001 | 0.117(0.069 ~ 0.199) |
| Some college or AA degree | −2.976 | 0.285 | −10.427 | <0.001 | 0.051(0.029 ~ 0.089) |
| College graduate or above | −3.486 | 0.345 | −10.091 | <0.001 | 0.031(0.016 ~ 0.060) |
| Marital status | | | | | |
| Married | | | | | 1.000(Reference) |
| Widowed | 0.099 | 0.203 | 0.489 | 0.625 | 1.104(0.742 ~ 1.643) |
| Divorced | 0.070 | 0.268 | 0.262 | 0.794 | 1.073(0.634 ~ 1.815) |
| Separated | 0.567 | 0.466 | 1.217 | 0.224 | 1.764(0.707 ~ 4.401) |
| Never married | −0.655 | 0.431 | −1.520 | 0.128 | 0.519(0.223 ~ 1.209) |
| Living with a partner | 0.149 | 0.511 | 0.292 | 0.771 | 1.161(0.426 ~ 3.159) |
| Diabetes | | | | | |
| No | | | | | 1.000(Reference) |
| Borderline | −0.530 | 0.473 | −1.120 | 0.263 | 0.589(0.233 ~ 1.488) |
| Yes | 0.596 | 0.178 | 3.337 | <0.001 | 1.814(1.279 ~ 2.574) |
| Stroke | | | | | |
| No | | | | | 1.000(Reference) |
| Yes | 0.300 | 0.282 | 1.063 | 0.288 | 1.349(0.776 ~ 2.345) |
| Depression | | | | | |
| No | | | | | 1.000(Reference) |
| Yes | 0.696 | 0.250 | 2.791 | 0.005 | 2.006(1.230 ~ 3.272) |
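The OR column in Table 3 is the exponential of the coefficient β, and a Wald-type 95% confidence interval is exp(β ± 1.96 × S.E.). The sketch below applies this to the Age row. The use of a Wald interval is an assumption (the source does not state how the intervals were computed), and small discrepancies from the table are expected because β and S.E. are reported to three decimals.

```python
import math


def odds_ratio_ci(beta, se, z_crit=1.96):
    """Return (OR, lower, upper) using a Wald interval (an assumption)."""
    return (
        math.exp(beta),
        math.exp(beta - z_crit * se),
        math.exp(beta + z_crit * se),
    )


# Age row of Table 3: beta = 0.126, S.E. = 0.014.
or_age, lower_age, upper_age = odds_ratio_ci(0.126, 0.014)
print(f"OR = {or_age:.3f} ({lower_age:.3f} ~ {upper_age:.3f})")
```

Read this way, each additional year of age multiplies the odds of cognitive impairment by about 1.13, holding the other variables in the model fixed.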
Fig 2. Lasso regression and Boruta algorithm for screening predictive factors.
(A) Coefficient path of Lasso regression. (B) Cross-validation results of Lasso regression. (C) Boruta algorithm results.
3.3 Performance comparison of nine prediction models
The nine ML algorithms were trained using the selected predictors. Random Forest demonstrated the highest performance: Training set: AUC = 0.872 (95% CI: 0.854–0.890), accuracy = 0.787, sensitivity = 0.795, specificity = 0.780, PPV = 0.786, NPV = 0.792, F1-score = 0.789. Testing set: AUC = 0.870 (95% CI: 0.850–0.890), accuracy = 0.770, sensitivity = 0.778, specificity = 0.762, PPV = 0.768, NPV = 0.775, F1-score = 0.772 (Table 4, Fig 3).
Table 4. Performance comparison of 9 machine learning prediction models. For each model, the first row reports the training set and the second row the test set.
| Model | ROC AUC | Accuracy | Sensitivity | Specificity | Positive Predictive Value | Negative Predictive Value | F1 Score |
|---|---|---|---|---|---|---|---|
| Random Forest | 0.872 | 0.787 | 0.795 | 0.780 | 0.786 | 0.792 | 0.789 |
| | 0.870 | 0.770 | 0.778 | 0.762 | 0.768 | 0.775 | 0.772 |
| LightGBM | 0.870 | 0.784 | 0.795 | 0.773 | 0.781 | 0.791 | 0.787 |
| | 0.865 | 0.770 | 0.765 | 0.776 | 0.775 | 0.770 | 0.769 |
| XGBoost | 0.868 | 0.790 | 0.783 | 0.797 | 0.797 | 0.788 | 0.788 |
| | 0.868 | 0.762 | 0.778 | 0.746 | 0.756 | 0.772 | 0.766 |
| Logistic Regression | 0.848 | 0.765 | 0.751 | 0.779 | 0.774 | 0.758 | 0.761 |
| | 0.846 | 0.764 | 0.761 | 0.766 | 0.764 | 0.764 | 0.763 |
| KNN | 0.766 | 0.766 | 0.766 | 0.766 | 0.766 | 0.768 | 0.765 |
| | 0.756 | 0.755 | 0.785 | 0.726 | 0.743 | 0.772 | 0.762 |
| SVM | 0.829 | 0.788 | 0.809 | 0.767 | 0.777 | 0.800 | 0.793 |
| | 0.813 | 0.765 | 0.778 | 0.752 | 0.760 | 0.774 | 0.768 |
| ANN | 0.847 | 0.752 | 0.735 | 0.769 | 0.762 | 0.745 | 0.747 |
| | 0.847 | 0.755 | 0.742 | 0.769 | 0.763 | 0.752 | 0.751 |
| Decision Tree | 0.823 | 0.770 | 0.765 | 0.776 | 0.774 | 0.767 | 0.769 |
| | 0.782 | 0.739 | 0.722 | 0.756 | 0.746 | 0.732 | 0.734 |
| ElasticNet | 0.843 | 0.762 | 0.758 | 0.766 | 0.765 | 0.762 | 0.760 |
| | 0.843 | 0.759 | 0.755 | 0.763 | 0.762 | 0.758 | 0.758 |
Fig 3. ROC curves of 9 machine learning prediction models.
(A) Training set. (B) Test set.
The calibration and clinical utility of all nine models (Random Forest, LightGBM, XGBoost, Logistic Regression, KNN, SVM, ANN, Decision Tree, and Elastic Net) were then evaluated. On the test set, the calibration curves showed that the Brier scores of all nine models were below 0.20, indicating good calibration, with predicted risks consistent with the observed outcomes (Fig 4).
Fig 4. Calibration curves of 9 machine learning prediction models.
(A) Training set. (B) Test set.
The DCA curves showed that, at risk thresholds between 0.1 and 0.8, all nine models (Random Forest, LightGBM, XGBoost, Logistic Regression, KNN, SVM, ANN, Decision Tree, and Elastic Net) achieved a favorable clinical net benefit, indicating good clinical applicability (Fig 5).
Fig 5. DCA curves of 9 machine learning prediction models.
(A) Training set. (B) Test set.
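The quantity plotted in a decision curve is the net benefit at a given risk threshold, conventionally defined as TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The sketch below implements this standard definition together with the "treat all" reference strategy. The function names and toy data are hypothetical, and the source does not state which software produced its curves.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every participant."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy outcomes and predicted risks for ten hypothetical participants.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.05, 0.7, 0.35]

nb_model = net_benefit(y_true, y_prob, threshold=0.5)
nb_all = net_benefit_treat_all(y_true, threshold=0.5)
print(round(nb_model, 3), round(nb_all, 3))
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all value and zero (the treat-none strategy).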
Based on the above analysis, the Random Forest model performs the best in predicting the risk of cognitive impairment in the elderly, with high prediction accuracy and good clinical practicality. Therefore, the Random Forest model was chosen as the final model for predicting the risk of cognitive impairment in the elderly.
3.4 Feature importance analysis
SHAP analysis of the Random Forest model ranked predictor importance as: education level > age > race > diabetes > depression (Fig 6A). The beeswarm plot showed a negative association between education level and cognitive impairment, and positive associations for age, diabetes, and depression (Fig 6B).
Fig 6. SHAP visual explanation of the global model.
(A) Bar chart. (B) Beeswarm plot.
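The idea behind SHAP can be illustrated with exact Shapley values on a toy model: each feature's attribution is its marginal contribution to the prediction, averaged over all orders in which features are switched from a baseline to the case's values. The sketch below is not the Random Forest model or the SHAP library's algorithm. The toy risk function, its coefficients, the single baseline, and all names are assumptions chosen only to show the computation.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values: average marginal contribution over all orderings."""
    names = list(x)
    totals = {name: 0.0 for name in names}
    for order in permutations(names):
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = x[name]      # switch this feature to the case value
            updated = predict(current)
            totals[name] += updated - previous
            previous = updated
    n_orders = factorial(len(names))
    return {name: total / n_orders for name, total in totals.items()}


def toy_risk(f):
    """Hypothetical risk score with one interaction term (not the fitted model)."""
    score = 0.02 * (f["age"] - 60) - 0.10 * f["education"] + 0.15 * f["diabetes"]
    score += 0.05 * f["diabetes"] * f["depression"]
    return score


case = {"age": 75, "education": 1, "diabetes": 1, "depression": 1}
baseline = {"age": 60, "education": 3, "diabetes": 0, "depression": 0}

phi = shapley_values(toy_risk, case, baseline)
print({name: round(value, 3) for name, value in phi.items()})
```

The attributions sum to the difference between the case's prediction and the baseline prediction, which is the additivity property that force and waterfall plots visualize.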
For a single case, SHAP force plots, waterfall plots, and decision plots explain how each predictor contributes to the prediction. For example, Case 1 had an education level of less than 9th grade, was Mexican American, was 60 years old, and had neither diabetes nor depression; the Random Forest model predicted a probability of cognitive impairment of 0.96 for this case (Fig 7).
Fig 7. Visual interpretation of SHAP for single-sample cases.
(A) Force Plot. (B) Waterfall Plot. (C) Decision Plot.
4. Discussion
In recent years, machine learning (ML), deep learning, artificial intelligence, and statistical analysis have been increasingly applied to medical research [41–45]. ML algorithms can leverage large datasets for training, thereby enhancing the accuracy and predictive power of models. These algorithms autonomously learn patterns and relationships from data to construct predictive models without manual intervention, improving efficiency. Moreover, they can continuously update and optimize models with new data, ensuring adaptability to evolving environments and datasets [46]. ML excels at processing high-dimensional data and complex relationships, uncovering nonlinear associations and patterns that may elude traditional methods. Its capacity to handle large-scale data enables the extraction of actionable insights, while its interpretability clarifies model mechanics and decision-making processes, offering precise predictive and decision-support tools in medicine [47].
During model development, nine ML algorithms were evaluated, with the Random Forest (RF) model demonstrating superior performance in predicting cognitive impairment risk among elderly individuals. As a classical ML algorithm, RF efficiently handles high-dimensional data by employing an ensemble of decision trees, which mitigates overfitting and captures complex feature interactions [48–50]. Its robustness to noise and outliers further enhances reliability in real-world applications. To improve transparency, the Shapley Additive Explanations (SHAP) algorithm was employed for model interpretation. Globally, SHAP quantified the relative contribution of each feature to cognitive impairment risk; locally, it elucidated how individual predictors influenced specific cases. This dual interpretability strengthens the model’s clinical utility [51].
Feature (variable) selection is central to predictive model development [52,53]. This study used three methods, namely multivariate logistic regression, LASSO regression, and the Boruta algorithm, to identify age, race, education level, diabetes, and depression as predictors. The risk prediction model built on these five predictors showed good predictive performance, calibration, and clinical benefit. Previous studies have examined age [54,55], education level [56,57], race [58], diabetes [59], and depression [44], which indicates that the selected predictors are readily available and reliable.
SHAP analysis identified education level as the primary predictor, with a higher education level associated with a lower risk of cognitive impairment. Education can help improve memory, cognitive stimulation, and cognitive abilities [60]. Cognitively stimulating activities may slow the rate of hippocampal atrophy during normal aging [61] and may even prevent the accumulation of amyloid plaques [62]; the deposition of amyloid beta (Aβ) is a biomarker of cognitive impairment. Education mainly strengthens process control and conceptual understanding in cognitive function. Compared with those with fewer years of education, those with more years of education have an 85% lower risk of mild cognitive impairment (MCI) and Alzheimer’s disease [63]. The cognitive reserve hypothesis [64] proposes that a stimulating environment promotes the growth of new neurons (neurogenesis), thereby promoting neural plasticity. As educational level and cognitive reserve increase, the expression of cognitive decline may be delayed [65].
Guidelines and expert consensus statements [66–69] identify age as one of the predictors of cognitive impairment risk. As the body ages, organs and tissues deteriorate and the functional connections of brain networks selectively weaken, leading to a decline in cognitive ability. Lee et al. [70,71] found that hippocampal neurons, located deep in the temporal lobe, help classify and make sense of perceptions and experiences ranging from the most basic to the highly complex. With aging, the balance between pattern separation and pattern completion is disrupted, and memory is impaired. Moreover, the hippocampus is highly susceptible to hypoxic/ischemic damage, and the function of the hippocampal vascular system is crucial for maintaining neurocognitive health. Hippocampal blood flow decreases during healthy aging, which can lead to neuronal atrophy and memory decline [72,73].
Diabetes and cognitive impairment are closely related [74]. Type 2 diabetes can increase the incidence of Alzheimer’s disease (AD) by 1.5–2.5 times [71]. Many scholars have found that diabetes and cognitive impairment share common pathophysiological bases. Diabetes can cause inflammatory reactions, metabolic disorders, microvascular disease, oxidative stress, Aβ deposition, and neurofibrillary tangles, leading to insulin resistance, impaired synaptic plasticity, synaptic degeneration, and cell death. Patients with diabetes show abnormal insulin signal transduction, weakened mitochondrial function, autonomic nervous dysfunction, and neuroinflammation, which can affect brain tissue and structure and ultimately lead to cognitive decline [75,76].
Depression increases the risk of cognitive impairment through multiple mechanisms at the neurobiological, endocrine, and behavioral levels. Depression leads to abnormal levels of key neurotransmitters such as serotonin and dopamine in the brain, impairing synaptic transmission efficiency and directly affecting memory encoding and cognitive flexibility [77]. Long-term depression may interfere with glutamate-mediated neuronal excitability, inhibit hippocampal neural plasticity, and accelerate cognitive decline [78]. Depressive states activate microglia and promote the release of pro-inflammatory factors such as IL-6 and TNF-α, which damage neurons and hinder synaptic remodeling [79]. Depression also leads to hyperfunction of the hypothalamic-pituitary-adrenal (HPA) axis and sustained elevation of cortisol, which is directly toxic to hippocampal neurons [80]. Depression is often accompanied by abnormal glucose metabolism, and an increase in the TyG index reflects insulin resistance, which can reduce brain glucose utilization and impair cognitive function [29]. Depression-related stress hormones can promote abnormal phosphorylation of tau protein and accelerate the formation of neurofibrillary tangles, which are directly associated with the cognitive symptoms of Alzheimer’s disease [81]. The relationship between cognitive impairment and depression is bidirectional [82]. On the one hand, cognitive impairment reduces social participation and emotional regulation ability, which in turn triggers depression and exacerbates depressive symptoms [83,84]. On the other hand, depression accelerates cognitive decline by promoting neuroinflammation and abnormal brain function, and this vicious cycle accelerates the transition to dementia.
Previous studies have found that the risk of cognitive impairment in the elderly increases as the SII, SIRI, and TyG indices increase [14,18,85–92]. However, after excluding records with missing complete blood count (CBC) values from the NHANES 2011–2014 laboratory data, this study did not find statistically significant differences in SII, SIRI, or TyG between the cognitively normal and cognitively impaired elderly groups. Several explanations are possible. First, removing records with missing values may have left a small sample, which reduces the power of the test and may prevent the detection of actual differences. Second, SII and SIRI mainly reflect systemic inflammatory status, and their specificity for cognitive impairment may not be high. Although inflammation is associated with cognitive decline, SII and SIRI may not be sensitive biomarkers of cognitive impairment, especially in the elderly population, in which multiple confounding factors such as coexisting chronic diseases and drug treatment are present. Third, the TyG index evaluates insulin resistance. Although there is a theoretical association between insulin resistance and cognitive impairment (such as Alzheimer’s disease), individual variation in metabolic factors in the actual population (such as lifestyle and genetic background) may dilute the differences between the cognitively normal and cognitively impaired groups. Together, these factors may explain why SII, SIRI, and TyG lacked statistical significance as predictive factors in this study.
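For reference, the three indices discussed above can be computed from routine laboratory values. The following is a minimal sketch based on the definitions commonly used in the literature (SII = platelets × neutrophils / lymphocytes; SIRI = neutrophils × monocytes / lymphocytes; TyG = ln[fasting triglycerides (mg/dL) × fasting glucose (mg/dL) / 2]). The function and parameter names are hypothetical, and the sketch is not necessarily the exact variable handling used in this study.

```python
import math


def sii(platelets, neutrophils, lymphocytes):
    """Systemic immune-inflammation index: platelets * neutrophils / lymphocytes.

    All counts are assumed to share the same unit (e.g., 10^9 cells/L).
    """
    return platelets * neutrophils / lymphocytes


def siri(neutrophils, monocytes, lymphocytes):
    """Systemic inflammation response index: neutrophils * monocytes / lymphocytes."""
    return neutrophils * monocytes / lymphocytes


def tyg(triglycerides_mg_dl, glucose_mg_dl):
    """Triglyceride-glucose index: ln(TG [mg/dL] * fasting glucose [mg/dL] / 2)."""
    return math.log(triglycerides_mg_dl * glucose_mg_dl / 2)


# Illustrative (made-up) laboratory values, not data from this study.
example_sii = sii(platelets=250.0, neutrophils=4.0, lymphocytes=2.0)
example_siri = siri(neutrophils=4.0, monocytes=0.5, lymphocytes=2.0)
example_tyg = tyg(triglycerides_mg_dl=100.0, glucose_mg_dl=100.0)
```

Because all three indices are ratios or log-transformed products, records with a missing or zero lymphocyte count, or a missing fasting measurement, cannot be scored, which is one reason the usable sample shrinks after data cleaning.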
This study has several limitations. First, the sample available for the SII, SIRI, and TyG indices was relatively small after records with missing complete blood count (CBC) values were removed, which limits the learning ability of the ML models. Second, feature selection has shortcomings and did not fully explore all potential factors that affect the risk of cognitive impairment in the elderly. Third, model selection and parameter tuning need further optimization. As a result, indicators such as the AUC and recall of our model, although within an acceptable range, leave significant room for improvement. Future research should expand the sample size for indices such as SII, SIRI, and TyG to improve statistical power, and combine multimodal evaluation (such as imaging and genetic markers) to reduce confounding bias. More efficient algorithms will also be explored to improve model performance and expand the applicability of the model.
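The effect of sample size on the ability to detect group differences can be illustrated with a simple power calculation. The sketch below uses a normal approximation for a two-sided, two-sample comparison of means with a standardized effect size. The function name, the effect size, and the group sizes are hypothetical illustrations and are not values from this study.

```python
from statistics import NormalDist


def two_sample_power(effect_size, n1, n2, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    effect_size is the standardized mean difference (Cohen's d);
    n1 and n2 are the two group sizes.
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    # Noncentrality: d * sqrt(n1 * n2 / (n1 + n2))
    ncp = effect_size * (n1 * n2 / (n1 + n2)) ** 0.5
    # Probability of rejecting in either tail.
    return nd.cdf(ncp - z_crit) + nd.cdf(-ncp - z_crit)


# Hypothetical scenario: a small effect (d = 0.2) with small vs. larger groups.
power_small = two_sample_power(0.2, n1=100, n2=100)
power_large = two_sample_power(0.2, n1=500, n2=500)
```

Under these assumed inputs, the larger groups yield higher power for the same effect size, which is the rationale for expanding the sample in future work.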
In summary, age, race, education level, diabetes, and depression are influencing factors of cognitive impairment in the elderly. This study constructed a prediction model for the risk of cognitive impairment in the elderly based on machine learning algorithms. Among the algorithms evaluated, random forest showed the best predictive performance and has predictive value, which can provide new ideas and methods for the early identification of, and intervention in, cognitive impairment risk in the elderly.
Data Availability
This study utilized data from the 2011-2014 National Health and Nutrition Examination Survey (NHANES), which is open to researchers around the world. Researchers may access and download the data for research use (https://wwwn.cdc.gov/nchs/nhanes/continuousnhanes/default.aspx?BeginYear=2011, https://wwwn.cdc.gov/nchs/nhanes/continuousnhanes/default.aspx?BeginYear=2013).
Funding Statement
The research reported in this publication has been supported by the Key Project of the Natural Science Foundation of Xinjiang Autonomous Region (2022D01D63 to XZ).
References
- 1. Partridge L, Deelen J, Slagboom PE. Facing up to the global challenges of ageing. Nature. 2018;561(7721):45–56. doi: 10.1038/s41586-018-0457-8
- 2. Pan C-W, Wang X, Ma Q, Sun H-P, Xu Y, Wang P. Cognitive dysfunction and health-related quality of life among older Chinese. Sci Rep. 2015;5:17301. doi: 10.1038/srep17301
- 3. Jia J, Wei C, Chen S, Li F, Tang Y, Qin W, et al. The cost of Alzheimer’s disease in China and re-estimation of costs worldwide. Alzheimers Dement. 2018;14(4):483–91. doi: 10.1016/j.jalz.2017.12.006
- 4. Academy of Cognitive Disorders of China (ACDC), Han Y, Jia J, Li X, Lv Y, Sun X, et al. Expert consensus on the care and management of patients with cognitive impairment in China. Neurosci Bull. 2020;36(3):307–20. doi: 10.1007/s12264-019-00444-y
- 5. Broadhouse KM, Singh MF, Suo C, Gates N, Wen W, Brodaty H, et al. Hippocampal plasticity underpins long-term cognitive gains from resistance exercise in MCI. Neuroimage Clin. 2020;25:102182. doi: 10.1016/j.nicl.2020.102182
- 6. Li J-Q, Tan L, Wang H-F, Tan M-S, Tan L, Xu W, et al. Risk factors for predicting progression from mild cognitive impairment to Alzheimer’s disease: a systematic review and meta-analysis of cohort studies. J Neurol Neurosurg Psychiatry. 2016;87(5):476–84. doi: 10.1136/jnnp-2014-310095
- 7. Chowdhary N, Barbui C, Anstey KJ, Kivipelto M, Barbera M, Peters R, et al. Reducing the risk of cognitive decline and dementia: WHO recommendations. Front Neurol. 2022;12:765584. doi: 10.3389/fneur.2021.765584
- 8. Livingston G, Huntley J, Sommerlad A, Ames D, Ballard C, Banerjee S, et al. Dementia prevention, intervention, and care: 2020 report of the Lancet Commission. Lancet. 2020;396(10248):413–46. doi: 10.1016/S0140-6736(20)30367-6
- 9. Barter J, Kumar A, Bean L, Ciesla M, Foster TC. Adulthood systemic inflammation accelerates the trajectory of age-related cognitive decline. Aging (Albany NY). 2021;13(18):22092–108. doi: 10.18632/aging.203588
- 10. Aiello A, Farzaneh F, Candore G, Caruso C, Davinelli S, Gambino CM, et al. Immunosenescence and its hallmarks: how to oppose aging strategically? A review of potential options for therapeutic intervention. Front Immunol. 2019;10:2247. doi: 10.3389/fimmu.2019.02247
- 11. Irwin MR, Vitiello MV. Implications of sleep disturbance and inflammation for Alzheimer’s disease dementia. Lancet Neurol. 2019;18(3):296–306. doi: 10.1016/S1474-4422(18)30450-2
- 12. Wang M, Zeng X, Liu Q, Yang Z, Li J. The association between sleep duration and cognitive function in the U.S. elderly from NHANES 2011-2014: A mediation analysis for inflammatory biomarkers. J Affect Disord. 2025;375:465–71. doi: 10.1016/j.jad.2025.01.154
- 13. Tondo G, Aprile D, De Marchi F, Sarasso B, Serra P, Borasio G, et al. Investigating the prognostic role of peripheral inflammatory markers in mild cognitive impairment. J Clin Med. 2023;12(13):4298. doi: 10.3390/jcm12134298
- 14. Zheng Y, Yu Y, Gao L, Yu M, Jiang L, Zhu Q. Association of red blood cell count, hemoglobin concentration, and inflammatory indices with cognitive impairment severity in Alzheimer’s disease. Sci Rep. 2025;15(1):17425. doi: 10.1038/s41598-025-02468-z
- 15. Wang Q, Ma J, Jiang Z, Ming L. Prognostic value of neutrophil-to-lymphocyte ratio and platelet-to-lymphocyte ratio in acute pulmonary embolism: a systematic review and meta-analysis. Int Angiol. 2018;37(1):4–11. doi: 10.23736/S0392-9590.17.03848-2
- 16. Gasparyan AY, Ayvazyan L, Mukanova U, Yessirkepov M, Kitas GD. The platelet-to-lymphocyte ratio as an inflammatory marker in rheumatic diseases. Ann Lab Med. 2019;39(4):345–57. doi: 10.3343/alm.2019.39.4.345
- 17. Wang X, Li T, Li H, Li D, Wang X, Zhao A, et al. Association of dietary inflammatory potential with blood inflammation: the prospective markers on mild cognitive impairment. Nutrients. 2022;14(12):2417. doi: 10.3390/nu14122417
- 18. Lu W, Zhang K, Chang X, Yu X, Bian J. The association between systemic immune-inflammation index and postoperative cognitive decline in elderly patients. Clin Interv Aging. 2022;17:699–705. doi: 10.2147/CIA.S357319
- 19. Xiao Y, Teng Z, Xu J, Qi Q, Guan T, Jiang X, et al. Systemic immune-inflammation index is associated with cerebral small vessel disease burden and cognitive impairment. Neuropsychiatr Dis Treat. 2023;19:403–13. doi: 10.2147/NDT.S401098
- 20. Guo Z, Zheng Y, Geng J, Wu Z, Wei T, Shan G, et al. Unveiling the link between systemic inflammation markers and cognitive performance among older adults in the US: A population-based study using NHANES 2011-2014 data. J Clin Neurosci. 2024;119:45–51. doi: 10.1016/j.jocn.2023.11.004
- 21. Nolasco-Rosales GA, Alonso-García CY, Hernández-Martínez DG, Villar-Soto M, Martínez-Magaña JJ, Genis-Mendoza AD, et al. Aftereffects in epigenetic age related to cognitive decline and inflammatory markers in healthcare personnel with post-COVID-19: A cross-sectional study. Int J Gen Med. 2023;16:4953–64. doi: 10.2147/IJGM.S426249
- 22. van der Willik KD, Koppelmans V, Hauptmann M, Compter A, Ikram MA, Schagen SB. Inflammation markers and cognitive performance in breast cancer survivors 20 years after completion of chemotherapy: a cohort study. Breast Cancer Res. 2018;20(1):135. doi: 10.1186/s13058-018-1062-3
- 23. Hong S, Han K, Park C-Y. The insulin resistance by triglyceride glucose index and risk for dementia: population-based study. Alzheimers Res Ther. 2021;13(1):9. doi: 10.1186/s13195-020-00758-4
- 24. Minh HV, Tien HA, Sinh CT, Thang DC, Chen C-H, Tay JC, et al. Assessment of preferred methods to measure insulin resistance in Asian patients with hypertension. J Clin Hypertens (Greenwich). 2021;23(3):529–37. doi: 10.1111/jch.14155
- 25. Wei B, Dong Q, Ma J, Zhang A. The association between triglyceride-glucose index and cognitive function in nondiabetic elderly: NHANES 2011-2014. Lipids Health Dis. 2023;22(1):188. doi: 10.1186/s12944-023-01959-0
- 26. Nayak SS, Kuriyakose D, Polisetty LD, Patil AA, Ameen D, Bonu R, et al. Diagnostic and prognostic value of triglyceride glucose index: a comprehensive evaluation of meta-analysis. Cardiovasc Diabetol. 2024;23(1):310. doi: 10.1186/s12933-024-02392-y
- 27. Zhang Z, Sheng Z, Liu J, Zhang D, Wang H, Wang L, et al. The association of the triglyceride-glucose index with Alzheimer’s disease and its potential mechanisms. J Alzheimers Dis. 2024;102(1):77–88. doi: 10.1177/13872877241284216
- 28. Bai W, An S, Jia H, Xu J, Qin L. Relationship between triglyceride-glucose index and cognitive function among community-dwelling older adults: a population-based cohort study. Front Endocrinol (Lausanne). 2024;15:1398235. doi: 10.3389/fendo.2024.1398235
- 29. Ding C, Lu R, Kong Z, Huang R. Exploring the triglyceride-glucose index’s role in depression and cognitive dysfunction: Evidence from NHANES with machine learning support. J Affect Disord. 2025;374:282–9. doi: 10.1016/j.jad.2025.01.051
- 30. Pang K, Liu C, Tong J, Ouyang W, Hu S, Tang Y. Higher total cholesterol concentration may be associated with better cognitive performance among elderly females. Nutrients. 2022;14(19):4198. doi: 10.3390/nu14194198
- 31. Lee S, Min J-Y, Kim B, Ha S-W, Han JH, Min K-B. Serum sodium in relation to various domains of cognitive function in the elderly US population. BMC Geriatr. 2021;21(1):328. doi: 10.1186/s12877-021-02260-4
- 32. Botelho J, Leira Y, Viana J, Machado V, Lyra P, Aldrey JM, et al. The role of inflammatory diet and vitamin D on the link between periodontitis and cognitive function: a mediation analysis in older adults. Nutrients. 2021;13(3):924. doi: 10.3390/nu13030924
- 33. Scherr M, Kunz A, Doll A, Mutzenbach JS, Broussalis E, Bergmann HJ, et al. Ignoring floor and ceiling effects may underestimate the effect of carotid artery stenting on cognitive performance. J Neurointerv Surg. 2016;8(7):747–51. doi: 10.1136/neurintsurg-2014-011612
- 34. Lim CR, Harris K, Dawson J, Beard DJ, Fitzpatrick R, Price AJ. Floor and ceiling effects in the OHS: an analysis of the NHS PROMs data set. BMJ Open. 2015;5(7):e007765. doi: 10.1136/bmjopen-2015-007765
- 35. Wang X, Wen Q, Li Y, Zhu H, Zhang F, Li S, et al. Systemic inflammation markers (SII and SIRI) as predictors of cognitive performance: evidence from NHANES 2011-2014. Front Neurol. 2025;16:1527302. doi: 10.3389/fneur.2025.1527302
- 36. Liao X, Li Y, Zhang Z, Xiao Y, Yu X, Huang R, et al. Associations of the body roundness index with cognitive function in US older adults and the mediating role of depression: a cross-sectional study from the NHANES 2011-2014. Sci Rep. 2025;15(1):16884. doi: 10.1038/s41598-025-01383-7
- 37. Nie W, Hu J. The relationship between grip strength and cognitive impairment: evidence from NHANES 2011-2014. Brain Behav. 2025;15(3):e70381. doi: 10.1002/brb3.70381
- 38. Qian Y, Liu Q, Li T. Association between composite dietary antioxidant index and cognitive function impairment in the elderly: evidence from NHANES 2011-2014. Front Neurol. 2025;16:1529989. doi: 10.3389/fneur.2025.1529989
- 39. Wu J, Shi M, Wang C. Association between tinnitus and cognitive impairment: analysis of National Health and Nutrition Examination Survey 2011-2014. Front Neurol. 2025;16:1533821. doi: 10.3389/fneur.2025.1533821
- 40. Qato DM, Ozenberger K, Olfson M. Prevalence of Prescription Medications With Depression as a Potential Adverse Effect Among Adults in the United States. JAMA. 2018;319(22):2289–98. doi: 10.1001/jama.2018.6741
- 41. Zhou Y, Han W, Yao X, Xue J, Li Z, Li Y. Developing a machine learning model for detecting depression, anxiety, and apathy in older adults with mild cognitive impairment using speech and facial expressions: A cross-sectional observational study. Int J Nurs Stud. 2023;146:104562. doi: 10.1016/j.ijnurstu.2023.104562
- 42. Vermeulen RJ, Andersson V, Banken J, Hannink G, Govers TM, Rovers MM, et al. Limited generalizability and high risk of bias in multivariable models predicting conversion risk from mild cognitive impairment to dementia: A systematic review. Alzheimers Dement. 2025;21(4):e70069. doi: 10.1002/alz.70069
- 43. Wang X, Zhou S, Ye N, Li Y, Zhou P, Chen G, et al. Predictive models of Alzheimer’s disease dementia risk in older adults with mild cognitive impairment: a systematic review and critical appraisal. BMC Geriatr. 2024;24(1):531. doi: 10.1186/s12877-024-05044-8
- 44. Grueso S, Viejo-Sobera R. Machine learning methods for predicting progression from mild cognitive impairment to Alzheimer’s disease dementia: a systematic review. Alzheimers Res Ther. 2021;13(1):162. doi: 10.1186/s13195-021-00900-w
- 45. Nagaraj S, Duong TQ. Deep learning and risk score classification of mild cognitive impairment and Alzheimer’s disease. J Alzheimers Dis. 2021;80(3):1079–90. doi: 10.3233/JAD-201438
- 46. Chen Y, Qian X, Zhang Y, Su W, Huang Y, Wang X, et al. Prediction Models for conversion from mild cognitive impairment to Alzheimer’s Disease: A systematic review and meta-analysis. Front Aging Neurosci. 2022;14:840386. doi: 10.3389/fnagi.2022.840386
- 47. Oh SS, Kang B, Hong D, Kim JI, Jeong H, Song J, et al. A multivariable prediction model for mild cognitive impairment and dementia: algorithm development and validation. JMIR Med Inform. 2024;12:e59396. doi: 10.2196/59396
- 48. Velazquez M, Lee Y, Alzheimer’s Disease Neuroimaging Initiative. Random forest model for feature-based Alzheimer’s disease conversion prediction from early mild cognitive impairment subjects. PLoS One. 2021;16(4):e0244773. doi: 10.1371/journal.pone.0244773
- 49. Couronné R, Probst P, Boulesteix A-L. Random forest versus logistic regression: a large-scale benchmark experiment. BMC Bioinformatics. 2018;19(1):270. doi: 10.1186/s12859-018-2264-5
- 50. Li J, Tian Y, Zhu Y, Zhou T, Li J, Ding K, et al. A multicenter random forest model for effective prognosis prediction in collaborative clinical research network. Artif Intell Med. 2020;103:101814. doi: 10.1016/j.artmed.2020.101814
- 51. Song Y, Yuan Q, Liu H, Gu K, Liu Y. Machine learning algorithms to predict mild cognitive impairment in older adults in China: A cross-sectional study. J Affect Disord. 2025;368:117–26. doi: 10.1016/j.jad.2024.09.059
- 52. Deo RC. Machine learning in medicine. Circulation. 2015;132(20):1920–30. doi: 10.1161/CIRCULATIONAHA.115.001593
- 53. Liu Y, Zhou S, Wei H, An S. A comparative study of forest methods for time-to-event data: variable selection and predictive performance. BMC Med Res Methodol. 2021;21(1):193. doi: 10.1186/s12874-021-01386-8
- 54. Hu M, Shu X, Yu G, Wu X, Välimäki M, Feng H. A risk prediction model based on machine learning for cognitive impairment among Chinese community-dwelling elderly people with normal cognition: development and validation study. J Med Internet Res. 2021;23(2):e20298. doi: 10.2196/20298
- 55. Livingston G, Sommerlad A, Orgeta V, Costafreda SG, Huntley J, Ames D, et al. Dementia prevention, intervention, and care. Lancet. 2017;390(10113):2673–734. doi: 10.1016/S0140-6736(17)31363-6
- 56. Na K-S. Prediction of future cognitive impairment among the community elderly: A machine-learning based approach. Sci Rep. 2019;9(1):3335. doi: 10.1038/s41598-019-39478-7
- 57. Sattler C, Toro P, Schönknecht P, Schröder J. Cognitive activity, education and socioeconomic status as preventive factors for mild cognitive impairment and Alzheimer’s disease. Psychiatry Res. 2012;196(1):90–5. doi: 10.1016/j.psychres.2011.11.012
- 58. Aschwanden D, Aichele S, Ghisletta P, Terracciano A, Kliegel M, Sutin AR, et al. Predicting cognitive impairment and dementia: a machine learning approach. J Alzheimers Dis. 2020;75(3):717–28. doi: 10.3233/JAD-190967
- 59. Casagrande SS, Lee C, Stoeckel LE, Menke A, Cowie CC. Cognitive function among older adults with diabetes and prediabetes, NHANES 2011-2014. Diabetes Res Clin Pract. 2021;178:108939. doi: 10.1016/j.diabres.2021.108939
- 60. Casemiro FG, Quirino DM, Diniz MAA, Rodrigues RAP, Pavarini SCI, Gratão ACM. Effects of health education in the elderly with mild cognitive impairment. Rev Bras Enferm. 2018;2:801–10. doi: 10.1590/0034-7167-2017-0032
- 61. Valenzuela MJ, Sachdev P, Wen W, Chen X, Brodaty H. Lifespan mental activity predicts diminished rate of hippocampal atrophy. PLoS One. 2008;3(7):e2598. doi: 10.1371/journal.pone.0002598
- 62. Landau SM, Marks SM, Mormino EC, Rabinovici GD, Oh H, O’Neil JP, et al. Association of lifetime cognitive engagement and low β-amyloid deposition. Arch Neurol. 2012;69(5):623–9. doi: 10.1001/archneurol.2011.2748
- 63. Lachman ME, Agrigoroaei S, Murphy C, Tun PA. Frequent cognitive activity compensates for education differences in episodic memory. Am J Geriatr Psychiatry. 2010;18(1):4–10. doi: 10.1097/JGP.0b013e3181ab8b62
- 64. Stern Y. Cognitive reserve in ageing and Alzheimer’s disease. Lancet Neurol. 2012;11(11):1006–12. doi: 10.1016/S1474-4422(12)70191-6
- 65. Le Carret N, Lafont S, Mayo W, Fabrigoule C. The effect of education on cognitive performances and its implication for the constitution of the cognitive reserve. Dev Neuropsychol. 2003;23(3):317–37. doi: 10.1207/S15326942DN2303_1
- 66. Stern Y, Arenaza-Urquijo EM, Bartrés-Faz D, Belleville S, Cantilon M, Chetelat G, et al. Whitepaper: Defining and investigating cognitive reserve, brain reserve, and brain maintenance. Alzheimers Dement. 2020;16(9):1305–11. doi: 10.1016/j.jalz.2018.07.219
- 67. Różyk-Myrta A. Guidelines for prevention and treatment of cognitive impairment in the elderly. Med Sci Monit. 2015;21:585–97. doi: 10.12659/msm.892542
- 68. Sadowsky CH, Galvin JE. Guidelines for the management of cognitive and behavioral problems in dementia. J Am Board Fam Med. 2012;25(3):350–66. doi: 10.3122/jabfm.2012.03.100183
- 69. Ngo J, Holroyd-Leduc JM. Systematic review of recent dementia practice guidelines. Age Ageing. 2015;44(1):25–33. doi: 10.1093/ageing/afu143
- 70. Lee H, Wang Z, Tillekeratne A, Lukish N, Puliyadi V, Zeger S, et al. Loss of functional heterogeneity along the CA3 transverse axis in aging. Curr Biol. 2022;32(12):2681-2693.e4. doi: 10.1016/j.cub.2022.04.077
- 71. Anacker C, Hen R. Adult hippocampal neurogenesis and cognitive flexibility - linking memory and mood. Nat Rev Neurosci. 2017;18(6):335–46. doi: 10.1038/nrn.2017.45
- 72. Johnson AC. Hippocampal vascular supply and its role in vascular cognitive impairment. Stroke. 2023;54(3):673–85. doi: 10.1161/STROKEAHA.122.038263
- 73. Lisman J, Buzsáki G, Eichenbaum H, Nadel L, Ranganath C, Redish AD. Viewpoints: how the hippocampus contributes to memory, navigation and cognition. Nat Neurosci. 2017;20(11):1434–47. doi: 10.1038/nn.4661
- 74. Black S, Kraemer K, Shah A, Simpson G, Scogin F, Smith A. Diabetes, depression, and cognition: a recursive cycle of cognitive dysfunction and glycemic dysregulation. Curr Diab Rep. 2018;18(11):118. doi: 10.1007/s11892-018-1079-0
- 75. Gispen WH, Biessels GJ. Cognition and synaptic plasticity in diabetes mellitus. Trends Neurosci. 2000;23(11):542–9. doi: 10.1016/s0166-2236(00)01656-8
- 76. Artola A. Diabetes-, stress- and ageing-related changes in synaptic plasticity in hippocampus and neocortex--the same metaplastic process? Eur J Pharmacol. 2008;585(1):153–62. doi: 10.1016/j.ejphar.2007.11.084
- 77. Liu X, Hao J, Yao E, Cao J, Zheng X, Yao D, et al. Polyunsaturated fatty acid supplement alleviates depression-incident cognitive dysfunction by protecting the cerebrovascular and glymphatic systems. Brain Behav Immun. 2020;89:357–70. doi: 10.1016/j.bbi.2020.07.022
- 78. McEwen BS, Nasca C, Gray JD. Stress effects on neuronal structure: hippocampus, amygdala, and prefrontal cortex. Neuropsychopharmacology. 2016;41(1):3–23. doi: 10.1038/npp.2015.171
- 79. Jin K, Lu J, Yu Z, Shen Z, Li H, Mou T, et al. Linking peripheral IL-6, IL-1β and hypocretin-1 with cognitive impairment from major depression. J Affect Disord. 2020;277:204–11. doi: 10.1016/j.jad.2020.08.024
- 80. Reppermund S, Zihl J, Lucae S, Horstmann S, Kloiber S, Holsboer F, et al. Persistent cognitive impairment in depression: the role of psychopathology and altered hypothalamic-pituitary-adrenocortical (HPA) system regulation. Biol Psychiatry. 2007;62(5):400–6. doi: 10.1016/j.biopsych.2006.09.027
- 81. Garcia MJ, Leadley R, Ross J, Bozeat S, Redhead G, Hansson O, et al. Prognostic and Predictive Factors in Early Alzheimer’s disease: a systematic review. J Alzheimers Dis Rep. 2024;8(1):203–40. doi: 10.3233/ADR-230045
- 82. Rubin R. Exploring the relationship between depression and dementia. JAMA. 2018;320(10):961. doi: 10.1001/jama.2018.11154
- 83. Mohan D, Iype T, Varghese S, Usha A, Mohan M. A cross-sectional study to assess prevalence and factors associated with mild cognitive impairment among older adults in an urban area of Kerala, South India. BMJ Open. 2019;9(3):e025473. doi: 10.1136/bmjopen-2018-025473
- 84. Song D, Yu DSF, Li PWC, Sun Q. Identifying the factors related to depressive symptoms amongst community-dwelling older adults with mild cognitive impairment. Int J Environ Res Public Health. 2019;16(18):3449. doi: 10.3390/ijerph16183449
- 85. Zhang W, Wu J, Yu L, Zhang C, Zhang H, Guo P, et al. Association between spinal cord injury and cognitive impairment, and the mediating role of inflammation. J Craniofac Surg. 2025;36(6):e799–803. doi: 10.1097/scs.0000000000011593
- 86. Geng C, Chen C. Association between elevated systemic inflammatory markers and the risk of cognitive decline progression: a longitudinal study. Neurol Sci. 2024;45(11):5253–9. doi: 10.1007/s10072-024-07654-x
- 87. Chen K, Wang L, Ning H, Pan H, Zhang W. Neutrophil-to-lymphocyte ratio; platelet-to-lymphocyte ratio; systemic immune-inflammatory Index: inflammatory indicators of cognitive impairment in schizophrenia patients. Front Psychiatry. 2025;16:1552451. doi: 10.3389/fpsyt.2025.1552451
- 88. Liu Q, Li Z, Huang L, Zhou D, Fu J, Duan H, et al. Telomere and mitochondria mediated the association between dietary inflammatory index and mild cognitive impairment: A prospective cohort study. Immun Ageing. 2023;20(1):1. doi: 10.1186/s12979-022-00326-4
- 89. Liu Q, Zhou D, Duan H, Zhu Y, Du Y, Sun C, et al. Association of dietary inflammatory index and leukocyte telomere length with mild cognitive impairment in Chinese older adults. Nutr Neurosci. 2023;26(1):50–9. doi: 10.1080/1028415X.2021.2017660
- 90. Ding T, Aimaiti M, Cui S, Shen J, Lu M, Wang L, et al. Meta-analysis of the association between dietary inflammatory index and cognitive health. Front Nutr. 2023;10:1104255. doi: 10.3389/fnut.2023.1104255
- 91. Mohammadi A, Mohammadi M, Almasi-Dooghaee M, Mirmosayyeb O. Neutrophil to lymphocyte ratio in Alzheimer’s disease: A systematic review and meta-analysis. PLoS One. 2024;19(6):e0305322. doi: 10.1371/journal.pone.0305322
- 92. Algul FE, Kaplan Y. Increased systemic immune-inflammation index as a novel indicator of Alzheimer’s disease severity. J Geriatr Psychiatry Neurol. 2025;38(3):214–22. doi: 10.1177/08919887241280880