Construction of a machine learning-based risk prediction model for depression in middle-aged and elderly patients with cardiovascular metabolic diseases in China: a longitudinal study

Gege Zhang; Sijie Dong; Li Wang

doi:10.1186/s12889-025-23075-7

2025 May 23;25:1904. doi: 10.1186/s12889-025-23075-7

PMCID: PMC12101033 PMID: 40410764

Abstract

Background

The incidence of cardiovascular metabolic diseases (CMD) continues to rise among middle-aged and elderly populations, affecting not only physical health but also significantly increasing the risk of depression. This study aims to construct a machine learning model to predict the risk of depression in middle-aged and elderly patients with CMD and to identify key risk factors.

Methods

Based on data from the China Health and Retirement Longitudinal Study (CHARLS) from 2018 to 2020, 4,477 patients aged 45 and above were included. LASSO regression was used to screen for risk factors, and three machine learning algorithms—logistic regression (LR), random forest (RF), and XGBoost—were employed to build predictive models. The performance of the models was evaluated using ROC curves, calibration curves, and decision curves.

Results

The study found several risk factors significantly associated with depression, including disability status, pain, retirement status, number of chronic diseases, education level, age, gender, place of residence, life satisfaction, optimism about the future, and self-rated health status. Depression was significantly more common among women, rural residents, individuals with disabilities, non-retirees, and those with chronic illnesses, who made up 56%, 64%, 39%, 85%, and 73% of the depressed group, respectively. The LR model demonstrated the best predictive performance, with an AUC of 0.69. Key predictive factors included self-rated health, residence, education level, gender, pain, life satisfaction, age, and hope for the future.

Conclusion

This study developed a depression risk prediction model based on logistic regression, providing important references for psychological health interventions in middle-aged and elderly patients with CMD. Identifying and intervening in high-risk populations is crucial for improving patients' quality of life.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12889-025-23075-7.

Keywords: Cardiovascular metabolic diseases, Depression, Machine learning, Risk prediction, Middle-aged and elderly population

Background

Cardiovascular metabolic diseases (CMD) is a collective term for a group of chronic conditions, including hypertension, diabetes, dyslipidaemia, coronary heart disease, and stroke, that are highly prevalent among middle-aged and elderly populations and severely affect physical and mental health as well as quality of life [1]. Numerous studies in China and worldwide have confirmed that CMD and its related diseases are major risk factors for mortality [2]. According to 2023 data from the "China Cardiovascular Health and Disease Report," the prevalence of CMD in China continues to rise, with a large number of patients suffering from hypertension and diabetes. The proportion of overweight and obese adults exceeds 50%, posing significant challenges to the public health system. It is particularly concerning that 50%–80% of patients with diabetes die from cardiovascular and cerebrovascular diseases. Their cardiovascular mortality rate is 7.5 times that of non-diabetic patients, and the risk of death increases further when both conditions are present simultaneously [3–5]. These stark statistics highlight the necessity of proactive prevention and intervention for CMD and its complications. In addition to its direct impact on physical health, CMD is closely associated with mental health issues, particularly depression. Numerous studies have shown that the incidence of depression among middle-aged and older CMD patients is significantly higher than that of the general population, and depressive symptoms can exacerbate cardiovascular disease, creating a vicious cycle that ultimately reduces the patient's quality of life and may even threaten their life [6]. Therefore, early identification and intervention for high-risk groups is crucial. Furthermore, an increasing number of studies indicate that CMD may have a negative impact on cognitive function, raising the risk of dementia [7].

The trend of population ageing in China is becoming increasingly evident, accompanied by a continuous rise in the prevalence of chronic diseases, which poses the most significant threat to the health of middle-aged and elderly individuals. The rapid increase in the prevalence of hypertension is particularly concerning, as it is not only one of the leading causes of death but also a core component of chronic metabolic diseases, further exacerbating the disease burden. Numerous studies have shown that the prevalence of depression among patients with hypertension and type 2 diabetes is significantly higher than that of the general population. This association may stem from the long-term physiological and psychological stress endured by patients, as well as concerns about the progression of their illnesses [8, 9]. Moreover, patients with metabolic syndrome have a more fragile emotional state, making them prone to emotional disorders and cognitive decline [10, 11]; this phenomenon is even more pronounced among the elderly.

The relationship between depression and CMD is bidirectional, with a vicious cycle exacerbating the severity of the illness and the suffering of patients. Depression not only increases the risk of developing cardiovascular diseases but also worsens the condition, significantly impacting quality of life [12, 13]. Therefore, it is crucial to implement proactive mental health interventions for middle-aged and elderly patients with CMD, focusing on early identification and treatment to improve their overall health status. Depression has become a global public health issue, with a particularly significant impact on patients with cardiovascular and cerebrovascular diseases, increasing the likelihood of non-adherence to medical advice, a decline in quality of life, and a heightened risk of suicide [14].

Currently, effective and timely risk assessment for depression in high-risk populations remains a challenge. In clinical practice, the risk assessment of depression mainly relies on the experience and judgment of clinicians, lacking objective and quantifiable assessment tools. However, the rapid development of artificial intelligence (AI) and machine learning (ML) technologies has brought new solutions to this issue. Machine learning can extract key features from vast amounts of data to build predictive models, thereby more accurately identifying high-risk individuals and enabling early intervention [15].

This study utilises data from the China Health and Retirement Longitudinal Study (CHARLS, 2018–2020) and employs machine learning methods to construct a model for predicting the risk of depression in middle-aged and elderly patients with cardiovascular metabolic diseases (CMD). We analyse the data to identify key factors closely associated with depression risk and compare the predictive performance of different machine learning models. The ultimate goal is to provide a reliable tool for clinical practice that assesses depression risk in middle-aged and elderly patients with CMD more accurately, thereby improving their overall health and quality of life.

Materials and methods

This study utilised data from the China Health and Retirement Longitudinal Study (CHARLS) conducted between 2018 and 2020. CHARLS is a large interdisciplinary survey project organised by the National School of Development at Peking University. The project employs a household survey approach to collect high-quality longitudinal survey data from a nationally representative sample of individuals aged 45 and above, along with their spouses. The CHARLS survey covers 150 counties and 450 communities (villages) across 28 provinces. It employs probability sampling proportional to population size, conducted in four stages: county, community (village), household, and individual. This method has significant advantages in terms of scientific rigor, geographical coverage, representativeness, and sample authenticity.

Study population

According to the purpose of the study, the inclusion criteria were: (1) individuals aged 45 and above; (2) patients clinically diagnosed with CMD. The exclusion criteria were: (1) participants who exhibited depressive symptoms at baseline or had mental health disorders, or who did not complete the depression scale in 2018; (2) participants unable to provide supplementary survey information or lacking reliable information; (3) participants who did not attend follow-up in 2020; (4) participants who did not complete the depression scale in 2020. Ultimately, this study included a total of 4,477 middle-aged and elderly individuals; the detailed participant selection process is illustrated in Fig. 1. The sampling unit for this study was middle-aged and elderly patients with hypertension. Furthermore, because the sample size selected for this study was sufficiently large, no sampling weights were applied.

Research variables

Outcome variables

In the CHARLS questionnaire, the CESD-10 scale is used to screen for depressive symptoms among participants. This scale exhibits strong reliability and validity, particularly in measuring depressive symptoms in older adults [40, 41]. The CESD-10 consists of 10 items and employs a 4-point Likert scoring method. Items 5 (“I feel hopeful about the future”) and 8 (“I feel happy”) are considered positive items, while the remaining items are negative. The scoring criteria for negative items are as follows: “Rarely or not at all (less than 1 day) = 0,” “Not much (1–2 days) = 1,” “Sometimes or about half the time (3–4 days) = 2,” and “Most of the time (5–7 days) = 3.” Positive items are scored in reverse. The scores for each item are summed to obtain a total score ranging from 0 to 30. A total score of ≥10 indicates the presence of depressive symptoms, while a score of less than 10 suggests the absence of depressive symptoms. Higher scores indicate greater severity of depressive symptoms [42].
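The scoring rule described above can be sketched in a few lines. This is a minimal illustration only: the function name `cesd10_score` and the example responses are hypothetical, and the raw answers are assumed to be coded 0 to 3 in the order given in the text.

```python
# Hypothetical helper illustrating the CESD-10 scoring rule described above.
POSITIVE_ITEMS = {5, 8}   # "hopeful about the future" and "happy" are reverse-scored
CUTOFF = 10               # a total score >= 10 indicates depressive symptoms


def cesd10_score(responses):
    """Return (total, has_symptoms) for ten raw responses coded 0-3.

    responses[i] is the raw answer to item i+1, where 0 means rarely or
    not at all (<1 day) and 3 means most of the time (5-7 days).
    """
    if len(responses) != 10 or any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("expected ten responses, each coded 0-3")
    total = 0
    for item, raw in enumerate(responses, start=1):
        # Reverse-score positive items so that a higher score always means worse.
        total += (3 - raw) if item in POSITIVE_ITEMS else raw
    return total, total >= CUTOFF


# Made-up respondent, not taken from CHARLS.
example = [1, 0, 2, 1, 3, 0, 1, 3, 0, 1]
total, flagged = cesd10_score(example)
```

Because the two positive items are reversed, a respondent who answers "rarely" to every item still scores 6 rather than 0.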

The choice of CESD-10 as the outcome variable is based on the following reasons:

(1) The CHARLS database does not contain clinical diagnosis data.
(2) Validation studies of CESD-10 in the Chinese elderly population show relatively high sensitivity and specificity [16, 17].
(3) In public health contexts, the need for symptom screening takes precedence over diagnostic practices.

Demographic factors

Demographic factors include gender, education level, marital status, place of residence, age, and retirement. Gender is divided into male and female; education level is categorised as "Below elementary school", "Elementary school graduate", "Middle school graduate," and "High school graduate and above"; marital status can be classified as "married" (including those cohabiting with a spouse or temporarily separated due to work or other reasons) and "unmarried" (including those separated, divorced, or widowed); place of residence is divided into "urban" and "rural"; retirement is classified as "yes" or "no", with age treated as a continuous variable.

Behavioral factors

Behavioural factors include exercise, social activities, smoking, drinking, and sleep duration. Exercise is divided into "with" and "without", while social activities are classified as "none", "light", or "heavy". Smoking and drinking are divided into "currently yes/no". Sleep duration is measured by the question, "On average, how many hours did you sleep per night in the past month?" and is considered a continuous variable in this study.

Health status

In this study, health status includes the extent of medical visits, disability, pain, and chronic diseases. Medical visits are defined as whether the individual has sought medical attention in the past year. For disability, the following five questions were used: (1) Do you have any of the following disabilities? (2) Do you have a brain injury/intellectual disability? (3) Do you have vision problems? (4) Do you have hearing problems? (5) Do you have a speech impairment? If participants answered "yes" to at least one of the above questions, they were defined as having a disability; otherwise, they were classified as "no." For hearing difficulties, the following three questions were used: (1) Have you ever worn a hearing aid? (2) How would you rate your hearing: very good, good, fair, poor, or very poor? (3) Do you have hearing problems? If the answer to question 1 or 3 is "yes," or if the answer to question 2 is "poor", then the individual is defined as having hearing difficulties. In this study, chronic diseases include nine types: cancer, chronic lung disease, liver disease, arthritis or rheumatism, kidney disease, digestive system diseases, asthma, and memory disorders. The number of chronic diseases in this study is treated as a categorical variable, defined as "no" for fewer than three diseases, "mild" for three or more diseases, and "severe" for more than three diseases. Additionally, pain is defined as "no" for fewer than three sites, "mild" for three or more sites, and "severe" for more than three sites.

Mental health factors

Mental health factors include hope for the future (defined as "unknown", "no", or "yes") and life satisfaction and self-rated health (each defined as "good", "fair", or "poor"); all three variables are categorical in this study.

Statistical methods

In this study, data analysis was conducted using RStudio with the assistance of several R packages, including rms, ROCR, Hmisc, randomForest, glmnet, and caret. First, we described the general information and scores of the patients. As the measurement data in this study were non-normally distributed, they are presented as medians and interquartile ranges. Comparisons between and within groups were conducted using the Mann-Whitney U test. Categorical data are expressed as proportions, composition ratios, or frequencies, with comparisons performed using the chi-squared test. The significance level was set at α = 0.05. Next, the Lasso regression method was employed to identify significant risk factors. The risk factors selected by Lasso were used as input variables for the logistic regression (LR), random forest (RF), and XGBoost machine learning algorithms, which were compared in constructing a depression risk prediction model for middle-aged and elderly patients with cardiovascular metabolic diseases in the training set. Finally, the performance of the predictive model was evaluated through the area under the receiver operating characteristic (ROC) curve, calibration curve, and decision curve analysis (DCA). A comprehensive comparison of the machine learning models was conducted, including sensitivity, specificity, F1 score, precision, and accuracy. These metrics were used to assess the predictive capability and classification effectiveness of the models, ensuring their practical applicability in real-world scenarios.

Results

This study analysed the baseline characteristics of 4,477 middle-aged and elderly patients (Table 1) and found that several factors were significantly associated with the occurrence of depression. Women accounted for 56% of those with depression, compared with 44% for men (P < 0.001). Marital status was also associated with depression: unmarried patients made up 14% of the depressed group but only 10% of the non-depressed group, suggesting that marital relationships may play a protective role in mental health (P < 0.001). Place of residence mattered as well, with rural residents making up 64% of those with depression and urban residents only 36% (P < 0.001). Additionally, patients with disabilities and those who had not yet retired accounted for 39% and 85% of the depressed group, respectively, compared with 29% and 75% of the non-depressed group (P < 0.001).
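The test statistics in Table 1 are not labelled, but the values reported for the two-level variables appear to match a Pearson chi-squared test with Yates' continuity correction, which is the default for 2×2 tables in R's `chisq.test`. The sketch below is an assumption about how those values arise, not the authors' code; it recomputes the statistic for the gender and residence rows from the printed counts.

```python
def chi2_yates_2x2(a, b, c, d):
    """Pearson chi-squared statistic with Yates' continuity correction
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Continuity correction: shrink |ad - bc| by n/2, but never below zero.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * diff ** 2 / (row1 * row2 * col1 * col2)


# Counts from Table 1, ordered (not depressed, depressed) within each row.
gender_stat = chi2_yates_2x2(1450, 634, 1901, 492)      # female, male
residence_stat = chi2_yates_2x2(1716, 402, 1635, 724)   # urban, rural
```

Both results agree with the printed statistics to within rounding.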

Table 1.

A comparative analysis of depression and related variables in middle-aged and elderly patients with cardiovascular metabolic diseases

Variables	Total (n = 4477)	Not depressed (n = 3351)	Depressed (n = 1126)	p	Statistic
Gender, n (%)				< 0.001	57.031
Female	2084 (47)	1450 (43)	634 (56)
Male	2393 (53)	1901 (57)	492 (44)
Marital status, n (%)				< 0.001	13.467
Unmarried	490 (11)	333 (10)	157 (14)
Married	3987 (89)	3018 (90)	969 (86)
Residence, n (%)				< 0.001	80.681
Urban	2118 (47)	1716 (51)	402 (36)
Rural	2359 (53)	1635 (49)	724 (64)
Disability, n (%)				< 0.001	41.458
No	3078 (69)	2391 (71)	687 (61)
Yes	1399 (31)	960 (29)	439 (39)
Retire, n (%)				< 0.001	42.1
No	3475 (78)	2522 (75)	953 (85)
Yes	1002 (22)	829 (25)	173 (15)
Exercise, n (%)				0.134	2.248
None	323 (7)	230 (7)	93 (8)
Yes	4154 (93)	3121 (93)	1033 (92)
Hope, n (%)				< 0.001	51.669
Don't know	221 (5)	148 (4)	73 (6)
No	1451 (32)	1003 (30)	448 (40)
Yes	2805 (63)	2200 (66)	605 (54)
Insurance, n (%)				0.025	5
None	76 (2)	48 (1)	28 (2)
Yes	4401 (98)	3303 (99)	1098 (98)
Drink, n (%)				< 0.001	24.173
No	2132 (48)	1524 (45)	608 (54)
Yes	2345 (52)	1827 (55)	518 (46)
Smoke, n (%)				< 0.001	17.107
No	2400 (54)	1736 (52)	664 (59)
Yes	2077 (46)	1615 (48)	462 (41)
Hospital, n (%)				< 0.001	15.837
None	3728 (83)	2834 (85)	894 (79)
Yes	749 (17)	517 (15)	232 (21)
Edu, n (%)				< 0.001	131.629
Below elementary school	1366 (31)	882 (26)	484 (43)
Elementary school graduate	1317 (29)	1004 (30)	313 (28)
Middle school graduate	1071 (24)	847 (25)	224 (20)
High school graduate and above	723 (16)	618 (18)	105 (9)
Self-reported health, n (%)				< 0.001	167.131
Poor	912 (20)	551 (16)	361 (32)
Fair	2506 (56)	1893 (56)	613 (54)
Good	1059 (24)	907 (27)	152 (13)
Life satisfaction, n (%)				< 0.001	89.028
Poor	181 (4)	86 (3)	95 (8)
Fair	2446 (55)	1806 (54)	640 (57)
Good	1850 (41)	1459 (44)	391 (35)
Social, n (%)				< 0.001	21.587
None	1984 (44)	1422 (42)	562 (50)
Mild	2032 (45)	1559 (47)	473 (42)
Severe	461 (10)	370 (11)	91 (8)
Chronic disease, n (%)				< 0.001	89.615
None	1711 (38)	1399 (42)	312 (28)
Mild	2265 (51)	1637 (49)	628 (56)
Severe	501 (11)	315 (9)	186 (17)
Pain, n (%)				< 0.001	111.54
None	2033 (45)	1658 (49)	375 (33)
Mild	919 (21)	686 (20)	233 (21)
Severe	1525 (34)	1007 (30)	518 (46)
Sleep time, Median (Q1,Q3)	6 (5, 8)	6.5 (5.5, 8)	6 (5, 8)	< 0.001	2,081,674
Age, Median (Q1,Q3)	62 (55, 69)	62 (55, 68)	64 (57, 70)	< 0.001	1,652,193


In terms of health behaviours, drinking and smoking habits were significantly associated with the occurrence of depression: among those with depression, 54% were non-drinkers and 59% were non-smokers (P < 0.001). Chronic diseases and pain were also significantly associated with depression, with 73% of the depressed group having chronic diseases and 46% reporting severe pain (P < 0.001). Patients who self-reported poor health and those with a low frequency of social activity showed a significantly increased risk of depression (P < 0.001), and sleep duration and median age also differed significantly between patients with and without depression.

In addition, all screened patients were divided into a training set (3,135 cases) and a test set (1,342 cases) in a 7:3 ratio (Table 2). The training set was used for parameter selection and model training, while the test set was used to assess the model's generalisability. There were no significant differences between the two groups in the baseline characteristics shown in Table 2 (all P > 0.05), providing a solid foundation for subsequent data analysis. Through Lasso regression analysis, 11 risk factors associated with depression were identified, including disability status, pain, retirement status, number of chronic diseases, education level, age, gender, place of residence, life satisfaction, optimism about the future, and self-rated health status. A nomogram based on these factors was then constructed from the multivariate logistic regression analysis to simplify the assessment of depression risk. This nomogram allows healthcare professionals to quickly calculate a patient's risk of depression and provides support for personalised interventions.
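The text does not say how the 7:3 split was drawn. One possibility, shown here purely as an assumption, is a split stratified by outcome in which the training count is rounded up within each class (similar in spirit to `createDataPartition` in the caret package listed in the methods). The function `stratified_split` and the seed are hypothetical; under these assumptions the sketch gives the same set sizes as reported.

```python
import math
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx), sampling train_fraction of each class.

    Rounding the per-class training count up is an assumption made here;
    the source does not describe the splitting procedure.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        k = math.ceil(len(indices) * train_fraction)
        train.extend(indices[:k])
        test.extend(indices[k:])
    return sorted(train), sorted(test)


# Outcome counts from Table 1: 3,351 not depressed (0) and 1,126 depressed (1).
labels = [0] * 3351 + [1] * 1126
train_idx, test_idx = stratified_split(labels)
```

Stratifying on the outcome keeps the proportion of depressed participants nearly identical in both sets, which matters when the positive class is only about a quarter of the sample.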

Table 2.

Comparison of variables between the training group and the testing group of middle-aged and elderly patients with cardiovascular metabolic diseases

Variables	Total (n = 4477)	Train (n = 3135)	Test (n = 1342)	p	Statistic
Residence, n (%)				0.498	0.46
Urban	2118 (47)	1494 (48)	624 (46)
Rural	2359 (53)	1641 (52)	718 (54)
Disability, n (%)				0.52	0.414
No	3078 (69)	2165 (69)	913 (68)
Yes	1399 (31)	970 (31)	429 (32)
Retire, n (%)				0.867	0.028
No	3475 (78)	2436 (78)	1039 (77)
Yes	1002 (22)	699 (22)	303 (23)
Hope, n (%)				0.399	1.84
Don't know	221 (5)	162 (5)	59 (4)
No	1451 (32)	1025 (33)	426 (32)
Yes	2805 (63)	1948 (62)	857 (64)
Edu, n (%)				0.732	1.288
Below elementary school	1366 (31)	968 (31)	398 (30)
Elementary school graduate	1317 (29)	911 (29)	406 (30)
Middle school graduate	1071 (24)	744 (24)	327 (24)
High school graduate and above	723 (16)	512 (16)	211 (16)
Self-reported health, n (%)				0.976	0.048
Poor	912 (20)	636 (20)	276 (21)
Fair	2506 (56)	1756 (56)	750 (56)
Good	1059 (24)	743 (24)	316 (24)
Life satisfaction, n (%)				0.623	0.948
Poor	181 (4)	130 (4)	51 (4)
Fair	2446 (55)	1723 (55)	723 (54)
Good	1850 (41)	1282 (41)	568 (42)
Chronic disease, n (%)				0.401	1.825
None	1711 (38)	1200 (38)	511 (38)
Mild	2265 (51)	1597 (51)	668 (50)
Severe	501 (11)	338 (11)	163 (12)
Pain, n (%)				0.699	0.716
None	2033 (45)	1418 (45)	615 (46)
Mild	919 (21)	654 (21)	265 (20)
Severe	1525 (34)	1063 (34)	462 (34)
Age, Median (Q1,Q3)	62 (55, 69)	63 (55, 68)	62 (55, 69)	0.813	2,112,961.5


In the comparison of model performance on the test set, the three predictive models (Logistic Regression, Random Forest, and XGBoost) all showed some ability to discriminate (AUC 0.64–0.69), with the Logistic Regression model achieving the highest AUC (0.69), indicating the best performance in identifying patients with depression. The calibration curves, which plot actual against predicted probabilities, indicated that the model is somewhat lacking in predicting low-risk patients. The Lasso regression curve and nomogram further supported the rationality of feature selection and the effectiveness of model construction.

Identification of risk factors

In the training set, the presence or absence of depression was used as the dependent variable (yes = 1, no = 0), while the pre-selected risk factors for depression in middle-aged and elderly patients with cardiovascular metabolic diseases served as independent variables. Lasso regression was employed to filter the risk factors. As shown in Fig. 2, the coefficients of the independent variables initially included in the model are gradually shrunk, and some are eventually reduced to zero, which helps to avoid overfitting. The optimal lambda was selected through tenfold cross-validation so as to minimise the cross-validated error. Ultimately, 11 risk factors were identified: disability status, pain, retirement status, number of chronic illnesses, level of education, age, gender, place of residence, life satisfaction, optimism about the future, and self-rated health status.

Fig. 2 — Using the Lasso regression model to select risk factors. (A) Coefficient profiles plotted against the log(lambda) sequence; non-zero coefficients are obtained at the optimal lambda. (B) Results of ten-fold cross-validation for selecting the optimal lambda in the LASSO model, plotted as the partial likelihood deviance (binomial deviance) curve, with dashed vertical lines drawn at the minimum lambda and according to the standard error criteria
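The way an L1 penalty drives some coefficients exactly to zero can be seen in the soft-thresholding operator. In the special case of an orthonormal design with squared-error loss, the LASSO estimate is the soft-thresholded least-squares estimate (up to the scaling convention for lambda). The sketch below illustrates only that shrinkage step with made-up coefficients; it is not the penalised logistic regression fitted in the study.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical unpenalised coefficients for three illustrative predictors.
ols_coefs = {"strong": 0.90, "moderate": -0.40, "weak": 0.05}

for lam in (0.0, 0.1, 0.5, 1.0):
    shrunk = {name: soft_threshold(b, lam) for name, b in ols_coefs.items()}
    kept = [name for name, b in shrunk.items() if b != 0.0]
    print(f"lambda={lam:.1f} keeps {kept}")
```

As lambda grows, weaker predictors drop out first, which is the behaviour traced by the coefficient paths in Fig. 2A.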

Construction of a logistic regression risk prediction model

Based on multivariate logistic regression analysis, gender, place of residence, retirement status, education level, self-reported health status, life satisfaction, chronic diseases, pain, optimism about the future, and age were all identified as independent risk factors for depression in middle-aged and elderly patients with cardiovascular metabolic diseases (Tables 3 and 4). These findings emphasise the multidimensional nature of mental health management and reveal the impact of living environment, health status, and personal psychological factors on the risk of depression. We developed a nomogram to predict depression risk in middle-aged and elderly patients with cardiovascular metabolic diseases, as shown in Fig. 3. As an illustration, we calculated the risk factor scores for a randomly selected participant: gender (45 points), residence (46 points), disability (38 points), retirement status (33 points), hope (34 points), education (21 points), self-reported health (59 points), life satisfaction (40 points), chronic diseases (32 points), pain (38 points), and age (54 points). The total score is 440 points, corresponding to a risk of over 50% and indicating a significantly increased risk of depression for this patient.

Table 3.

Multivariate logistic regression analysis of depression in middle-aged and elderly patients with cardiovascular metabolic diseases

	Estimate	Std. Error	Wald	Pr(>|z|)	OR (95% CI)
(Intercept)	−1.41	0.458	9.489	0.002	0.24 (0.1–0.6)
Gender	−0.416	0.095	19.338	< 0.001	0.66 (0.55–0.79)
Residence	0.41	0.097	17.807	< 0.001	1.51 (1.25–1.82)
Disability	0.231	0.095	5.899	0.015	1.26 (1.05–1.52)
Retire	−0.37	0.134	7.576	0.006	0.69 (0.53–0.9)
Hope1	0.366	0.195	3.523	0.061	1.44 (0.98–2.11)
Hope2	0.025	0.193	0.017	0.895	1.03 (0.7–1.5)
Edu2	−0.199	0.115	3.005	0.083	0.82 (0.66–1.03)
Edu3	−0.375	0.129	8.472	0.004	0.69 (0.53–0.88)
Edu4	−0.454	0.162	7.889	0.005	0.64 (0.46–0.87)
Self-reported health 2	−0.371	0.106	12.196	< 0.001	0.69 (0.56–0.85)
Self-reported health 3	−0.837	0.148	32.051	< 0.001	0.43 (0.32–0.58)
Life satisfaction 2	−0.687	0.195	12.411	< 0.001	0.5 (0.34–0.74)
Life satisfaction 3	−0.989	0.201	24.084	< 0.001	0.37 (0.25–0.55)
Chronic disease 1	0.248	0.102	5.916	0.015	1.28 (1.05–1.56)
Chronic disease 2	0.444	0.153	8.42	0.004	1.56 (1.16–2.1)
Pain1	0.22	0.12	3.368	0.066	1.25 (0.98–1.58)
Pain2	0.366	0.11	11.156	0.001	1.44 (1.16–1.79)
Age	0.018	0.005	10.772	0.001	1.02 (1.01–1.03)
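The odds ratios and intervals in Table 3 follow from the estimates and standard errors in the usual way: the odds ratio is the exponentiated coefficient, and a Wald 95% interval exponentiates the estimate plus or minus 1.96 standard errors. The helper `odds_ratio_ci` below is a hypothetical illustration of that arithmetic, applied to the gender row.

```python
import math


def odds_ratio_ci(estimate, std_error, z=1.96):
    """Return (OR, lower, upper) from a Wald interval on the log-odds scale."""
    return (
        math.exp(estimate),
        math.exp(estimate - z * std_error),
        math.exp(estimate + z * std_error),
    )


# Gender row of Table 3: estimate -0.416, standard error 0.095.
gender_or, gender_lo, gender_hi = odds_ratio_ci(-0.416, 0.095)
```

Rounded to two decimals, these values match the printed 0.66 (0.55–0.79).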


Table 4.

Variable assignment modes for middle-aged and elderly patients with cardiovascular metabolic diseases

Variables	Assignment mode
Gender	Female = 0; Male = 1
Residence	Urban = 0; Rural = 1
Disability	No = 0; Yes = 1
Retire	No = 0; Yes = 1
Hope	Don't know = 0; No = 1; Yes = 2
Edu	Below elementary school = 1; Elementary school graduate = 2; Middle school graduate = 3; High school graduate and above = 4
Self-reported health	Poor = 1; Fair = 2; Good = 3
Life satisfaction	Poor = 1; Fair = 2; Good = 3
Chronic disease	None = 0; Mild = 1; Severe = 2
Pain	None = 0; Mild = 1; Severe = 2
Age	Measured value
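To make the link between Tables 3 and 4 concrete, the sketch below turns the printed coefficients into a predicted probability for a single patient. It assumes treatment (dummy) coding with the lowest-coded level of each categorical variable as the reference, which is inferred from the row names in Table 3 and is not stated in the source. The patients and the function `predicted_risk` are hypothetical.

```python
import math

# Coefficients copied from Table 3; categorical terms are keyed by (variable, level).
INTERCEPT = -1.41
AGE = 0.018
BINARY = {"gender": -0.416, "residence": 0.41, "disability": 0.231, "retire": -0.37}
CATEGORICAL = {
    ("hope", 1): 0.366, ("hope", 2): 0.025,
    ("edu", 2): -0.199, ("edu", 3): -0.375, ("edu", 4): -0.454,
    ("health", 2): -0.371, ("health", 3): -0.837,
    ("life", 2): -0.687, ("life", 3): -0.989,
    ("chronic", 1): 0.248, ("chronic", 2): 0.444,
    ("pain", 1): 0.22, ("pain", 2): 0.366,
}


def predicted_risk(patient):
    """Logistic-model probability for a patient dict coded as in Table 4."""
    eta = INTERCEPT + AGE * patient["age"]
    for name, beta in BINARY.items():
        eta += beta * patient[name]
    for name in ("hope", "edu", "health", "life", "chronic", "pain"):
        # Assumed reference levels have no entry, so they contribute zero.
        eta += CATEGORICAL.get((name, patient[name]), 0.0)
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical high-risk profile: 70-year-old rural woman, not retired, with a
# disability, poor self-rated health and life satisfaction, severe pain.
high = {"gender": 0, "residence": 1, "disability": 1, "retire": 0, "hope": 0,
        "edu": 1, "health": 1, "life": 1, "chronic": 2, "pain": 2, "age": 70}
# Hypothetical low-risk profile: 50-year-old retired urban man in good health.
low = {"gender": 1, "residence": 0, "disability": 0, "retire": 1, "hope": 2,
       "edu": 4, "health": 3, "life": 3, "chronic": 0, "pain": 0, "age": 50}
risk_high, risk_low = predicted_risk(high), predicted_risk(low)
```

A nomogram such as Fig. 3 is a graphical version of the same calculation: each predictor's contribution is rescaled to points, and the total maps to a probability.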


Fig. 3 — Nomogram for predicting depression risk in middle-aged and elderly patients with cardiovascular metabolic diseases

Construction of the random forest risk prediction model

In the training group, samples were used to establish a random forest model. With an mtry value of 2, the error stabilised when ntree = 300. Therefore, the optimal model is when mtry = 2 and ntree = 100. Based on the mean reduction of the Gini index, the importance of 11 variables was ranked. Age, education level, self-rated health, life satisfaction, pain, and chronic diseases are the six key indicators for predicting depression in middle-aged and elderly patients with cardiovascular metabolic diseases, as shown in Fig. 4.

Fig. 4 — Ranking of important factors affecting depression in middle-aged and elderly patients with cardiovascular metabolic diseases in the random forest model

Construction of the XGBoost risk prediction model

This study analyzed risk factors for depression in older adults with cardiovascular metabolic diseases using the XGBoost model. The feature importance results presented in Fig. 5 indicate that the top eight influencing factors are self-rated health, place of residence, education level, gender, pain, life satisfaction, age, and hope for the future. SHAP analysis further clarified these findings. The first figure shows the average SHAP values for each feature, highlighting that self-rated health is the most significant factor, followed by gender, education level, and place of residence. The second figure presents a violin plot that illustrates the distribution of SHAP values for each feature, offering deeper insights, particularly regarding the concentrated impact of disability and age on depression risk. Finally, the third figure reveals the complex relationships between these risk factors, emphasizing the importance of psychosocial elements such as education level, hope, and life satisfaction. Based on these findings, the study selected these top eight factors as input features for the model.

Fig. 5 — (A) The importance of features affecting depression in middle-aged and elderly patients with cardiovascular metabolic diseases in the XGBoost model; (B) distribution of SHAP values by key features; (C) SHAP value relationships among key risk factors

Comparison of the performance of three predictive models

In this study, we utilized a test set comprising 1,342 participants to evaluate the predictive effectiveness of three models—Random Forest (RF), Logistic Regression (LR), and XGBoost—in assessing depression among middle-aged and elderly patients with cardiovascular metabolic diseases. As illustrated in Fig. 6, the area under the curve (AUC) for all three models ranged from 0.6 to 0.7. Among them, the LR model exhibited the highest AUC value at 0.69, while the RF model had the lowest AUC of 0.64. Table 5 provides a comparison of performance metrics for the three predictive models. The results indicate that the logistic regression (LR) model exhibits the highest accuracy (0.750) and specificity (0.954), whereas the random forest (RF) model demonstrates the best sensitivity (0.727). Overall, despite the relatively low F1 scores across all models, the logistic regression model shows better precision (0.511) and overall performance. These findings suggest that the logistic regression model is effective in assessing depression in this specific population.

Fig. 6 — Comparison of AUC for three predictive models of depression in middle-aged and elderly patients with cardiovascular metabolic diseases

Table 5.

Performance metrics comparison of three predictive models for depression in middle-aged and elderly patients with cardiovascular metabolic diseases

Model	Accuracy	Sensitivity	Specificity	Precision	F1 score
Logistic regression	0.750	0.142	0.954	0.511	0.223
Random forest	0.727	0.727	0.927	0.376	0.193
XGBoost	0.737	0.134	0.939	0.425	0.203
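The F1 score is the harmonic mean of precision and sensitivity, so the columns of Table 5 can be checked against one another. The sketch below is a simple consistency check on the printed values, not a re-analysis. For the logistic regression and XGBoost rows the identity holds to within rounding. For the random forest row, the printed precision and F1 would imply a sensitivity of roughly 0.13 rather than 0.727; the source does not resolve this difference.

```python
def f1_from(precision, sensitivity):
    """F1 is the harmonic mean of precision and sensitivity (recall)."""
    return 2 * precision * sensitivity / (precision + sensitivity)


def sensitivity_implied_by(precision, f1):
    """Invert the F1 identity to recover the sensitivity it implies."""
    return f1 * precision / (2 * precision - f1)


# (sensitivity, precision, F1) as printed in Table 5.
table5 = {
    "Logistic regression": (0.142, 0.511, 0.223),
    "Random forest": (0.727, 0.376, 0.193),
    "XGBoost": (0.134, 0.425, 0.203),
}

for model, (sens, prec, f1) in table5.items():
    recomputed = f1_from(prec, sens)
    implied = sensitivity_implied_by(prec, f1)
    print(f"{model}: F1 recomputed={recomputed:.3f}, implied sensitivity={implied:.3f}")
```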


We evaluated the model using the test group data, applying the likelihood ratio test and other relevant statistics. The likelihood ratio test yielded a P value of less than 0.05, indicating that the model as a whole was statistically significant. The C statistic was 0.715 (95% CI: 0.694–0.736), reflecting moderate discriminatory ability (values of 0.51–0.71 indicate fair accuracy, 0.71–0.9 high accuracy, and values above 0.9 very high accuracy). For the calibration assessment, we employed the Hosmer–Lemeshow goodness-of-fit test, which yielded a chi-square value of 6.6128 and a corresponding P value of 0.5789. This result indicates no significant difference between the predicted and observed values (P > 0.05), suggesting good overall fit of the model (Attachment 1).
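For a binary outcome, the C statistic equals the area under the ROC curve: the proportion of (case, non-case) pairs in which the case receives the higher predicted probability, with ties counted as one half. A minimal sketch on toy data (not study data) is given below; the function name `c_statistic` is hypothetical.

```python
def c_statistic(labels, scores):
    """Fraction of (case, non-case) pairs in which the case has the higher
    predicted score; tied pairs count as half."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Toy labels and predicted probabilities, for illustration only.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_auc = c_statistic(toy_labels, toy_scores)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported 0.69 and 0.715 should be read.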

As shown in Attachment 2, the decision curve analysis (DCA) illustrates the clinical utility of our prediction model across different treatment thresholds. The red curve (representing the "Treat All" strategy) indicates a rapid decline in net benefit when the threshold probability exceeds 10%, suggesting that indiscriminate interventions waste resources (e.g., at a 20% threshold, net benefit decreases significantly compared to the model-based approach). In contrast, the blue curve (model-predicted probability) shows superior net benefit within the 5%–35% threshold range, peaking at 25%. This threshold aligns with the optimal cost–benefit balance for managing depression in middle-aged and elderly CMD patients.
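For readers unfamiliar with decision curve analysis, the net benefit at a threshold probability pt is conventionally defined as TP/n − (FP/n) × pt/(1 − pt), where TP and FP are the numbers of true and false positives among n patients. The sketch below applies this standard definition; the function names and all counts are hypothetical and are not taken from the study.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit of a model at a given threshold probability."""
    weight = threshold / (1.0 - threshold)  # harm of a false positive relative to a true positive
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of intervening on every patient regardless of predicted risk."""
    weight = threshold / (1.0 - threshold)
    return prevalence - (1.0 - prevalence) * weight


# Hypothetical cohort: 1,000 patients, 250 with depression (illustration only).
# At a 25% threshold the model is assumed to flag 150 true and 200 false positives.
model_nb = net_benefit(tp=150, fp=200, n=1000, threshold=0.25)
treat_all_nb = net_benefit_treat_all(prevalence=0.25, threshold=0.25)
print(f"model: {model_nb:.3f}, treat all: {treat_all_nb:.3f}")
```

In this illustration the "Treat All" strategy has a net benefit of zero once the threshold equals the prevalence, which is the general reason its curve falls away as the threshold rises.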

Discussion

This study employed LASSO regression to identify risk factors and systematically compared the performance of three machine learning models, logistic regression (LR), random forest (RF), and XGBoost, in predicting depression risk among patients with cardiovascular metabolic diseases (CMD). The results indicated that the LR model, which is based on a linear assumption, demonstrated the best predictive performance (AUC = 0.69). This finding is consistent with recent studies on psychological risk prediction in chronic disease patients but differs from most mental health research, which commonly favors ensemble learning methods [18, 19]. This difference may be attributed to two characteristics of the dataset used in this study. First, the eleven risk factors selected by LASSO (hope, life satisfaction, disability status, pain, retirement status, number of chronic diseases, education level, age, gender, place of residence, and self-rated health status) conform to the assumptions of LR. Second, the relatively modest sample size (n = 4,477) might limit the ability of RF to capture complex interactions. Even so, the RF model's reported sensitivity (0.727) and specificity (0.927) suggest that it identifies both positive and negative cases reasonably well.

In terms of algorithm selection, we considered three factors: (1) Clinical Interpretability: The odds ratios (OR) provided by LR can be readily translated into risk assessment tools, as illustrated in the nomogram in Fig. 3; (2) Complexity of Feature Interactions: RF was deployed to uncover the nonlinear relationships between physiological indicators (such as the number of chronic diseases) and psychological factors (such as life satisfaction), which aligns with recommendations from research on multi-morbidity prediction [20]; (3) Handling Data Imbalance: In addressing the distribution of the depression cohort, XGBoost utilized a class weighting mechanism to enhance recognition of the minority class. However, its specificity (0.939) remains lower than that of the LR model (0.954), indicating that traditional statistical approaches may still be more effective in accurately identifying high-risk populations.
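The interpretability argument for LR rests on the fact that each fitted coefficient converts directly into an odds ratio, OR = exp(β), with an approximate 95% confidence interval of exp(β ± 1.96 × SE). The sketch below illustrates this conversion; the coefficient and standard error are hypothetical values, not estimates from this study.

```python
import math


def odds_ratio(beta, se, z=1.96):
    """Convert a logistic regression coefficient into an odds ratio with a Wald interval.

    Returns (OR, lower bound, upper bound); z = 1.96 gives an approximate 95% interval.
    """
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


# Hypothetical coefficient for a binary predictor (illustration only).
beta, se = math.log(2.0), 0.10
or_, lower, upper = odds_ratio(beta, se)
print(f"OR = {or_:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```

A coefficient of ln 2 therefore corresponds to a doubling of the odds of depression for patients with the predictor, holding the other predictors fixed.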

Relative to existing literature, this study achieved three methodological advancements: (1) Enhanced Feature Selection Efficiency: LASSO regression successfully reduced the initial variables by 60%; (2) Optimized Clinical Decision Support: As depicted in the decision curve in Attachment 2, the model exhibited significant net benefits within the 5–35% risk threshold range, aligning with the cost-effectiveness frameworks typical in primary health care; (3) Multidimensional Risk Integration: The nomogram integrates objective physiological indicators (disability status, pain severity) with subjective psychological perceptions (future optimism) into a unified assessment framework for the first time, which may address the limitations of traditional scales that primarily focus on symptom descriptions [21]. These enhancements offer practical tools for grassroots medical institutions to implement stratified management of depression risk.

The prevalence of cardiovascular metabolic diseases (such as hypertension, diabetes, dyslipidemia, stroke, and coronary heart disease) among middle-aged and elderly individuals is rising annually. This trend is closely linked to factors such as aging, unhealthy lifestyle choices, and psychosocial influences. Research indicates that patients in this age group often experience multiple chronic conditions, including obesity and high cholesterol, which together significantly increase the risk of cardiovascular events [22, 23]. Moreover, cardiovascular metabolic diseases affect not only patients' physical health but also their mental well-being. Many middle-aged and elderly patients experience psychological distress, including depression and anxiety, with incidence rates reaching as high as 56% among those with cardiovascular diseases [24, 25]. In summary, middle-aged and elderly patients with cardiovascular metabolic diseases frequently encounter a range of physiological and psychological challenges, including deterioration in physical function, coexistence of chronic illnesses, and difficulty adapting to lifestyle changes. These patients are particularly vulnerable in terms of emotional and mental health, making them more susceptible to negative emotions such as depression and anxiety [26, 27].

This study analysed the baseline characteristics of 4,477 middle-aged and elderly patients and revealed significant associations between various factors and depression. Our data indicate that gender, marital status, place of residence, disability and retirement status, health behaviours, and self-rated health status may all be significantly associated with the onset of depression. In this study, the proportion of female patients in the depression group was higher than that of male patients, consistent with existing literature showing higher depression prevalence in women. This disparity is potentially linked to physiological factors (e.g., hormonal fluctuations), particularly the effects of the menstrual cycle, pregnancy, and menopause on mood [28, 29]. Socio-cultural factors also play a significant role; men, for instance, are typically expected to exhibit strength and emotional restraint [30, 31]. In addition, the incidence of depression among married patients is relatively low (14% vs. 10%), a finding that highlights the protective role of good social support for mental health [32–34].

There is a significant difference in the incidence of depression between urban and rural residents, with the prevalence among rural residents at 64%, compared to just 36% for urban residents. This disparity may be closely linked to several factors. First, the lack of medical resources in rural areas and the absence of specialized mental health services hinder rural residents' access to timely and effective psychological interventions [35]. Moreover, rural residents often have weak social support networks. Rural residency was associated with elevated depression risk, possibly reflecting disparities in healthcare access and socioeconomic stressors [36].

At the same time, the lifestyle and economic conditions of rural residents are also important factors affecting mental health. Low economic levels may subject rural residents to greater life pressures, such as poverty and increased medical burdens, which can weaken their ability to resist depression [37, 38].

The study also revealed that the incidence of depression is significantly higher among disabled individuals and non-retired patients. This suggests that the loss of physical function and potential changes in social roles following retirement have a considerable impact on mental health. Disabled individuals often face physiological limitations that result in reduced mobility and social participation, leading to social isolation and feelings of loneliness, both of which are important risk factors for depression [39, 40]. As for retirement status, individuals may experience feelings of loss and reduced social engagement after retirement, which can elevate the risk of depression [41]. Furthermore, there is a significant association between drinking and smoking habits and the occurrence of depression. This study found that among individuals with depression, 54% were non-drinkers and 59% were non-smokers, underscoring the importance of healthy behaviors for mental health. Drinking and smoking not only reflect underlying mental health issues but may also serve as risk factors for the development of depression [42]. Improving lifestyle choices can significantly reduce the incidence of depression, highlighting the importance of incorporating the control of harmful habits and the promotion of healthy lifestyles into public health interventions [43].

Chronic diseases and pain showed significant associations with depression in our study, with 73% of patients with chronic illnesses exhibiting depressive symptoms. These conditions were linked to poorer daily functioning and quality of life measures, paralleling findings from prior epidemiological studies [44]. The observed bidirectional patterns between chronic conditions and depressive symptoms underscore the importance of comprehensive care approaches. Notably, pain severity demonstrated a dose–response relationship with depression risk, consistent with previous reports of comorbid pain-depression syndromes [45]. Patients reporting poor self-rated health and limited social engagement exhibited markedly higher depression prevalence (64% vs 36%). While therapeutic strategies targeting these factors may prove beneficial, our predictive model specifically identifies them as key risk markers rather than causal determinants. Sleep disturbances covaried strongly with depressive symptoms, mirroring established psychophysiological correlations [46]. Age-related increases in depression risk tracked with accumulating health burdens and social isolation metrics, aligning with developmental stress accumulation theories [47].

Overall, the risk factors included in this study hold significant value for predicting the risk of depression in middle-aged and older patients with cardiovascular metabolic diseases. Machine learning, as a subfield of artificial intelligence, involves a systematic process of learning from data and training models to accurately predict future events [48]. Recently, numerous researchers have utilized machine learning-based predictive models to identify related diseases, achieving high accuracy [49, 50]. In this study, we were the first to establish machine learning predictive models using three classifiers: logistic regression (LR), random forest (RF), and XGBoost, all of which demonstrated stable predictive performance. Furthermore, we conducted SHAP analysis to identify key risk factors, enhancing the interpretability and transparency of our current machine learning models. Finally, the decision curve analysis (DCA) indicates that our machine learning models possess good clinical utility and acceptability in predicting depression risk among older patients with cardiovascular metabolic diseases.

Limitations and strengths

This research has certain limitations that must be acknowledged. The reliance on self-reported data from the CESD-10 for depression symptom screening introduces potential recall bias, where individuals may underestimate transient symptoms, as well as social desirability bias, where symptoms may be minimized due to stigma. Although the CESD-10 aligns with clinical diagnoses in validation studies, it cannot replace structured psychiatric assessments based on DSM-5 criteria and may misclassify subclinical cases. Additionally, a 15% attrition rate in our longitudinal cohort may disproportionately exclude severe cases, leading to an underestimation of depression risk. While inverse probability weighting was applied, unmeasured factors like disease severity trajectories may still introduce residual confounding. The absence of biological markers, such as CRP and neuroimaging data, restricts the exploration of pathophysiological pathways linking cardiovascular metabolic diseases and depression. Further complicating the findings is the urban-rural distribution of the sample, which deviates from China's urbanization rate, potentially affecting the estimates of urban stressors and rural healthcare disparities. Moreover, the observational design inherently limits causal inference, and omitted psychosocial variables may hinder the model's comprehensiveness. Despite these limitations, the study has several strengths. The use of LASSO regression enhances feature selection efficiency, avoiding overfitting while maintaining clinical interpretability, and the logistic regression model demonstrates moderate discriminative ability (AUC = 0.69). Additionally, the development of a multidimensional risk score consolidates objective and subjective indicators into a unified primary care tool, offering a cost-effective intervention framework.
Notably, this is the first machine learning model aimed at predicting depression risk in middle-aged and older Chinese patients with cardiovascular metabolic diseases, with potential applicability in resource-limited settings. By systematically assessing various demographic, behavioral, and psychosocial factors, the research addresses gaps in previous studies that focused solely on physiological indicators, emphasizing the role of positive psychology as a protective factor in managing comorbid mental health issues.

Conclusion

In this study, several machine learning algorithms were used to build models that predict the risk of depression in middle-aged and elderly patients with cardiovascular metabolic diseases. After evaluation and comparison, the logistic regression model showed the best predictive performance. The results indicated that disability status, pain, retirement status, number of chronic diseases, education level, age, gender, place of residence, life satisfaction, optimism about the future, and self-rated health status are risk factors for depression in middle-aged and elderly patients with cardiovascular metabolic diseases. The model can identify high-risk populations with clinically significant depressive symptoms, providing a basis for stratified management in primary healthcare. Healthcare professionals can develop intervention programmes based on these risk factors and implement them early to reduce the adverse effects of depression on middle-aged and elderly patients with cardiovascular metabolic diseases.

Supplementary Information

Supplementary Material 1 (22.7 KB, docx)

Supplementary Material 2 (19 KB, docx)

Acknowledgements

We thank Peking University for the open data resources and all investigators who participated in the study.

Abbreviations

ROC: Receiver operating characteristic
AUC: Area under the curve
DCA: Decision curve analysis
OR: Odds ratio
CI: Confidence interval

Authors’ contributions

Gege Zhang and Sijie Dong contributed to the study design and performed the data analysis. Gege Zhang wrote the manuscript, while Li Wang critically revised and edited the manuscript for important intellectual content. All authors reviewed and approved the final manuscript.

Funding

This work was not specifically funded.

Data availability

The datasets generated and/or analyzed during the current study are available in the CHARLS repository, http://charls.pku.edu.cn.

Declarations

Ethics approval and consent to participate

This is a retrospective study based on the CHARLS database. Patient information was anonymized before the study, so no additional informed consent was required and no ethical conflict arose. The original CHARLS was approved by the Ethical Review Committee of Peking University (IRB00001052–11015), and all participants signed informed consent at the time of participation. This research followed the guidance of the Declaration of Helsinki.

Consent for publication

All authors agree to the content of this manuscript and its publication.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Yang BY, Hu LW, Jalaludin B, Knibbs LD, Markevych I, Heinrich J, et al. Association between residential greenness, cardiometabolic disorders, and cardiovascular disease among adults in China. JAMA Netw Open. 2020;3(9):e2017507.
2. Collaborators GBDCoD. Global, regional, and national age-sex-specific mortality for 282 causes of death in 195 countries and territories, 1980–2017: a systematic analysis for the Global Burden of Disease Study 2017. Lancet. 2018;392(10159):1736–88.
3. Collaboration NCDRF. Worldwide trends in diabetes since 1980: a pooled analysis of 751 population-based studies with 4.4 million participants. Lancet. 2016;387(10027):1513–30.
4. Fox CS, Pencina MJ, Wilson PW, Paynter NP, Vasan RS, D’Agostino RB Sr. Lifetime risk of cardiovascular disease among individuals with and without diabetes stratified by obesity status in the Framingham heart study. Diabetes Care. 2008;31(8):1582–4.
5. Qazi MU, Malik S. Diabetes and cardiovascular disease: original insights from the framingham heart study. Glob Heart. 2013;8(1):43–8.
6. Warriach ZI, Patel S, Khan F, Ferrer GF. Association of depression with cardiovascular diseases. Cureus. 2022;14(6):e26296.
7. Pandit M, Frishman WH. The association between cardiovascular disease and dementia: a review of trends in epidemiology, risk factors, pathophysiologic mechanisms, and clinical implications. Cardiol Rev. 2024;32(5):463–7.
8. Tassew WC, Nigate GK, Assefa GW, Zeleke AM, Ferede YA. Systematic review and meta-analysis on the prevalence and associated factors of depression among hypertensive patients in Ethiopia. PLoS ONE. 2024;19(6):e0304043.
9. Joseph JJ, Golden SH. Cortisol dysregulation: the bidirectional link between stress, depression, and type 2 diabetes mellitus. Ann N Y Acad Sci. 2017;1391(1):20–34.
10. Zhang W, Sun Q, Chen B, Basta M, Xu C, Li Y. Insomnia symptoms are associated with metabolic syndrome in patients with severe psychiatric disorders. Sleep Med. 2021;83:168–74.
11. Li Z, Jiang YY, Long C, Peng X, Tao J, Pu Y, et al. Bridging metabolic syndrome and cognitive dysfunction: role of astrocytes. Front Endocrinol (Lausanne). 2024;15:1393253.
12. Hare DL, Toukhsati SR, Johansson P, Jaarsma T. Depression and cardiovascular disease: a clinical review. Eur Heart J. 2014;35(21):1365–72.
13. Hegeman A, Schutter N, Comijs H, Holwerda T, Dekker J, Stek M, et al. Loneliness and cardiovascular disease and the role of late-life depression. Int J Geriatr Psychiatry. 2018;33(1):e65–72.
14. Shiga T. Depression and cardiovascular diseases. J Cardiol. 2023;81(5):485–90.
15. Obermeyer Z, Emanuel EJ. Predicting the future - big data, machine learning, and clinical medicine. N Engl J Med. 2016;375(13):1216–9.
16. Zhou LN, Ma XC, Wang W. Relationship between cognitive performance and depressive symptoms in Chinese older adults: the China Health and Retirement Longitudinal Study (CHARLS). J Affect Disord. 2021;281:454–8.
17. Williams MW, Li CY, Hay CC. Validation of the 10-item center for epidemiologic studies depression scale post stroke. J Stroke Cerebrovasc Dis. 2020;29(12):105334.
18. Shameer K, Johnson KW, Glicksberg BS, Dudley JT, Sengupta PP. Machine learning in cardiovascular medicine: are we there yet? Heart. 2018;104(14):1156–64.
19. Chekroud AM, Bondar J, Delgadillo J. The promise of machine learning in predicting treatment outcomes in psychiatry. World Psychiatry. 2021;20:154–70.
20. Miyachi Y, Ishii O, Torigoe K. Design, implementation, and evaluation of the computer-aided clinical decision support system based on learning-to-rank: collaboration between physicians and machine learning in the differential diagnosis process. BMC Med Inform Decis Mak. 2023;23(1):26.
21. Levis B, Benedetti A, Ioannidis JPA, Sun Y, Negeri Z, He C, et al. Patient health questionnaire-9 scores do not accurately estimate depression prevalence: individual participant data meta-analysis. J Clin Epidemiol. 2020;122:115-28.e1.
22. Heidenreich PA, Trogdon JG, Khavjou OA, Butler J, Dracup K, Ezekowitz MD, et al. Forecasting the future of cardiovascular disease in the United States: a policy statement from the American Heart Association. Circulation. 2011;123(8):933–44.
23. Yu S, Guo X, Yang H, Zheng L, Sun Y. Metabolic syndrome in hypertensive adults from rural Northeast China: an update. BMC Public Health. 2015;15:247.
24. Moussavi S, Chatterji S, Verdes E, Tandon A, Patel V, Ustun B. Depression, chronic diseases, and decrements in health: results from the World Health Surveys. Lancet. 2007;370(9590):851–8.
25. Gasse C, Laursen TM, Baune BT. Major depression and first-time hospitalization with ischemic heart disease, cardiac procedures and mortality in the general population: a retrospective Danish population-based cohort study. Eur J Prev Cardiol. 2014;21(5):532–40.
26. Wang X, Jie W, Huang X, Yang F, Qian Y, Yang T, et al. Association of psychological resilience with all-cause and cause-specific mortality in older adults: a cohort study. BMC Public Health. 2024;24(1):1989.
27. Ji Q, Zhang L, Xu J, Ji P, Song M, Chen Y, et al. The relationship between stigma and quality of life in hospitalized middle-aged and elderly patients with chronic diseases: the mediating role of depression and the moderating role of psychological resilience. Front Psychiatry. 2024;15:1346881.
28. Albert PR. Why is depression more prevalent in women? J Psychiatry Neurosci. 2015;40(4):219–21.
29. Duong P, Tenkorang MAA, Trieu J, McCuiston C, Rybalchenko N, Cunningham RL. Neuroprotective and neurotoxic outcomes of androgens and estrogens in an oxidative stress environment. Biol Sex Differ. 2020;11(1):12.
30. Addis ME, Mahalik JR. Men, masculinity, and the contexts of help seeking. Am Psychol. 2003;58(1):5–14.
31. Schuch JJ, Roest AM, Nolen WA, Penninx BW, de Jonge P. Gender differences in major depressive disorder: results from the Netherlands study of depression and anxiety. J Affect Disord. 2014;156:156–63.
32. Holt-Lunstad J, Smith TB, Layton JB. Social relationships and mortality risk: a meta-analytic review. PLoS Med. 2010;7(7):e1000316.
33. Shi Y, Whisman MA. Marital satisfaction as a potential moderator of the association between stress and depression. J Affect Disord. 2023;327:155–8.
34. Cuijpers P, Geraedts AS, van Oppen P, Andersson G, Markowitz JC, van Straten A. Interpersonal psychotherapy for depression: a meta-analysis. Am J Psychiatry. 2011;168(6):581–92.
35. Tarlow KR, Johnson TA, McCord CE. Rural status, suicide ideation, and telemental health: risk assessment in a clinical sample. J Rural Health. 2019;35(2):247–52.
36. Barger SD. Social integration, social support and mortality in the US National Health Interview Survey. Psychosom Med. 2013;75(5):510–7.
37. Guan M. Measuring the effects of socioeconomic factors on mental health among migrants in urban China: a multiple indicators multiple causes model. Int J Ment Health Syst. 2017;11:10.
38. Lorant V, Deliege D, Eaton W, Robert A, Philippot P, Ansseau M. Socioeconomic inequalities in depression: a meta-analysis. Am J Epidemiol. 2003;157(2):98–112.
39. Prince M, Bryce R, Albanese E, Wimo A, Ribeiro W, Ferri CP. The global prevalence of dementia: a systematic review and metaanalysis. Alzheimers Dement. 2013;9(1):63–75 e2.
40. Simpson J, Thomas C. Clinical psychology and disability studies: bridging the disciplinary divide on mental health and disability. Disabil Rehabil. 2015;37(14):1299–304.
41. Bian C, Zhao WW, Yan SR, Chen SY, Cheng Y, Zhang YH. Effect of interpersonal psychotherapy on social functioning, overall functioning and negative emotions for depression: a meta-analysis. J Affect Disord. 2023;320:230–40.
42. Boateng-Poku A, Benca-Bachman CE, Najera DD, Whitfield KE, Taylor JL, Thorpe RJ Jr, et al. The role of social support on the effects of stress and depression on African American tobacco and alcohol use. Drug Alcohol Depend. 2020;209:107926.
43. Gebreslassie M, Sampaio F, Nystrand C, Ssegonja R, Feldman I. Economic evaluations of public health interventions for physical activity and healthy diet: a systematic review. Prev Med. 2020;136:106100.
44. Katon WJ. Clinical and health services relationships between major depression, depressive symptoms, and general medical illness. Biol Psychiatry. 2003;54(3):216–26.
45. Dwyer CP, MacNeela P, Durand H, O’Connor LL, Main CJ, McKenna-Plumley PE, et al. Effects of biopsychosocial education on the clinical judgments of medical students and GP trainees regarding future risk of disability in chronic lower back pain: a randomized control trial. Pain Med. 2020;21(5):939–50.
46. Olfati M, Samea F, Faghihroohi S, Balajoo SM, Küppers V, Genon S, et al. Prediction of depressive symptoms severity based on sleep quality, anxiety, and gray matter volume: a generalizable machine learning approach across three datasets. eBioMedicine. 2024;108:105313.
47. Blazer DG. Depression in late life: review and commentary. J Gerontol. 2003;58A(3):249–65.
48. Verma AA, Murray J, Greiner R, Cohen JP, Shojania KG, Ghassemi M, et al. Implementing machine learning in medicine. Can Med Assoc J. 2021;193(34):E1351–7.
49. Kang SH, Cheon BK, Kim JS, Jang H, Kim HJ, Park KW, et al. Machine learning for the prediction of amyloid positivity in amnestic mild cognitive impairment. J Alzheimers Dis. 2021;80(1):143–57.
50. Tan WY, Hargreaves C, Chen C, Hilal S. A machine learning approach for early diagnosis of cognitive impairment using population-based data. J Alzheimers Dis. 2023;91(1):449–61.


[CR26] 26.Wang X, Jie W, Huang X, Yang F, Qian Y, Yang T, et al. Association of psychological resilience with all-cause and cause-specific mortality in older adults: a cohort study. BMC Public Health. 2024;24(1):1989. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR27] 27.Ji Q, Zhang L, Xu J, Ji P, Song M, Chen Y, et al. The relationship between stigma and quality of life in hospitalized middle-aged and elderly patients with chronic diseases: the mediating role of depression and the moderating role of psychological resilience. Front Psychiatry. 2024;15:1346881. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR28] 28.Albert PR. Why is depression more prevalent in women? J Psychiatry Neurosci. 2015;40(4):219–21. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR29] 29.Duong P, Tenkorang MAA, Trieu J, McCuiston C, Rybalchenko N, Cunningham RL. Neuroprotective and neurotoxic outcomes of androgens and estrogens in an oxidative stress environment. Biol Sex Differ. 2020;11(1):12. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR30] 30.Addis ME, Mahalik JR. Men, masculinity, and the contexts of help seeking. Am Psychol. 2003;58(1):5–14. [DOI] [PubMed] [Google Scholar]

[CR31] 31.Schuch JJ, Roest AM, Nolen WA, Penninx BW, de Jonge P. Gender differences in major depressive disorder: results from the Netherlands study of depression and anxiety. J Affect Disord. 2014;156:156–63. [DOI] [PubMed] [Google Scholar]

[CR32] 32.Holt-Lunstad J, Smith TB, Layton JB. Social relationships and mortality risk: a meta-analytic review. PLoS Med. 2010;7(7):e1000316. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR33] 33.Shi Y, Whisman MA. Marital satisfaction as a potential moderator of the association between stress and depression. J Affect Disord. 2023;327:155–8. [DOI] [PubMed] [Google Scholar]

[CR34] 34.Cuijpers P, Geraedts AS, van Oppen P, Andersson G, Markowitz JC, van Straten A. Interpersonal psychotherapy for depression: a meta-analysis. Am J Psychiatry. 2011;168(6):581–92. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR35] 35.Tarlow KR, Johnson TA, McCord CE. Rural status, suicide ideation, and telemental health: risk assessment in a clinical sample. J Rural Health. 2019;35(2):247–52. [DOI] [PubMed] [Google Scholar]

[CR36] 36.Barger SD. Social integration, social support and mortality in the US National Health Interview Survey. Psychosom Med. 2013;75(5):510–7. [DOI] [PubMed] [Google Scholar]

[CR37] 37.Guan M. Measuring the effects of socioeconomic factors on mental health among migrants in urban China: a multiple indicators multiple causes model. Int J Ment Health Syst. 2017;11:10. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR38] 38.Lorant V, Deliege D, Eaton W, Robert A, Philippot P, Ansseau M. Socioeconomic inequalities in depression: a meta-analysis. Am J Epidemiol. 2003;157(2):98–112. [DOI] [PubMed] [Google Scholar]

[CR39] 39.Prince M, Bryce R, Albanese E, Wimo A, Ribeiro W, Ferri CP. The global prevalence of dementia: a systematic review and metaanalysis. Alzheimers Dement. 2013;9(1):63–75 e2. [DOI] [PubMed]

[CR40] 40.Simpson J, Thomas C. Clinical psychology and disability studies: bridging the disciplinary divide on mental health and disability. Disabil Rehabil. 2015;37(14):1299–304. [DOI] [PubMed] [Google Scholar]

[CR41] 41.Bian C, Zhao WW, Yan SR, Chen SY, Cheng Y, Zhang YH. Effect of interpersonal psychotherapy on social functioning, overall functioning and negative emotions for depression: a meta-analysis. J Affect Disord. 2023;320:230–40. [DOI] [PubMed] [Google Scholar]

[CR42] 42.Boateng-Poku A, Benca-Bachman CE, Najera DD, Whitfield KE, Taylor JL, Thorpe RJ Jr, et al. The role of social support on the effects of stress and depression on African American tobacco and alcohol use. Drug Alcohol Depend. 2020;209:107926. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR43] 43.Gebreslassie M, Sampaio F, Nystrand C, Ssegonja R, Feldman I. Economic evaluations of public health interventions for physical activity and healthy diet: a systematic review. Prev Med. 2020;136:106100. [DOI] [PubMed] [Google Scholar]

[CR44] 44.Katon WJ. Clinical and health services relationships between major depression, depressive symptoms, and general medical illness. Biol Psychiatry. 2003;54(3):216–26. [DOI] [PubMed] [Google Scholar]

[CR45] 45.Dwyer CP, MacNeela P, Durand H, O’Connor LL, Main CJ, McKenna-Plumley PE, et al. Effects of biopsychosocial education on the clinical judgments of medical students and GP trainees regarding future risk of disability in chronic lower back pain: a randomized control trial. Pain Med. 2020;21(5):939–50. [DOI] [PubMed] [Google Scholar]

[CR46] 46.Olfati M, Samea F, Faghihroohi S, Balajoo SM, Küppers V, Genon S, et al. Prediction of depressive symptoms severity based on sleep quality, anxiety, and gray matter volume: a generalizable machine learning approach across three datasets. eBioMedicine. 2024;108:105313. [DOI] [PMC free article] [PubMed]

[CR47] 47.Blazer DG. Depression in late life: review and commentary. J Gerontol. 2003;58A(3):249–65. [DOI] [PubMed] [Google Scholar]

[CR48] 48.Verma AA, Murray J, Greiner R, Cohen JP, Shojania KG, Ghassemi M, et al. Implementing machine learning in medicine. Can Med Assoc J. 2021;193(34):E1351–7. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR49] 49.Kang SH, Cheon BK, Kim JS, Jang H, Kim HJ, Park KW, et al. Machine learning for the prediction of amyloid positivity in amnestic mild cognitive impairment. J Alzheimers Dis. 2021;80(1):143–57. [DOI] [PubMed] [Google Scholar]

[CR50] 50.Tan WY, Hargreaves C, Chen C, Hilal S. A machine learning approach for early diagnosis of cognitive impairment using population-based data. J Alzheimers Dis. 2023;91(1):449–61. [DOI] [PMC free article] [PubMed] [Google Scholar]
