Scientific Reports. 2021 Aug 3;11:15706. doi: 10.1038/s41598-021-95208-y

Classification of psychiatric symptoms using deep interaction networks: the CASPIAN-IV study

Hamid Reza Marateb 1,2, Zahra Tasdighi 3,#, Mohammad Reza Mohebian 4,#, Azam Naghavi 5,#, Moritz Hess 6, Mohammad Esmaiel Motlagh 7, Ramin Heshmat 8, Marjan Mansourian 2,9, Miguel Angel Mañanas 2,10, Harald Binder 6, Roya Kelishadi 11
PMCID: PMC8333323  PMID: 34344950

Abstract

Identifying the possible factors of psychiatric symptoms among children can reduce the risk of adverse psychosocial outcomes in adulthood. We designed a classification tool to examine the association between modifiable risk factors and psychiatric symptoms, defined based on the Persian version of the WHO-GSHS questionnaire in a developing country. Ten thousand three hundred fifty students, aged 6–18 years from all Iran provinces, participated in this study. We used feature discretization and encoding, stability selection, and regularized group method of data handling (GMDH) to classify the a priori specific factors (e.g., demographic, sleeping-time, life satisfaction, and birth-weight) to psychiatric symptoms. Self-rated health was the most critical feature. The selected modifiable factors were eating breakfast, screentime, salty snack for depression symptom, physical activity, salty snack for worriedness symptom, (abdominal) obesity, sweetened beverage, and sleep-hour for mild-to-moderate emotional symptoms. The area under the ROC curve of the GMDH was 0.75 (CI 95% 0.73–0.76) for the analyzed psychiatric symptoms using threefold cross-validation. It significantly outperformed the state-of-the-art (adjusted p < 0.05; McNemar's test). In this study, the association of psychiatric risk factors and the importance of modifiable nutrition and lifestyle factors were emphasized. However, as a cross-sectional study, no causality can be inferred.

Subject terms: Biomedical engineering, Psychiatric disorders, Risk factors, Paediatric research, Psychology

Introduction

Mental health and illness are a public health concern1. The prevalence of non-communicable diseases (NCDs), such as mental disorders, is rapidly increasing, and the prevention of their associated risk factors has become one of the world's health priorities2. The World Health Organization (WHO) reported that about 450 million people across age groups suffer from severe or mild mental disorders worldwide3. Moreover, at least 52 million people have severe mental disorders, such as schizophrenia, and nearly 150 million people live with unspecified mental illnesses, including psychological distress4. The literature has also discussed whether such distress is a symptom of a mental disorder or a marker of functional impairment5.

Psychological distress is the most common mental health issue among children and is considered one of the leading causes of the global burden of disease6. The National Comprehensive Cancer Network (NCCN) defines distress as an unpleasant emotional experience of a psychological nature, such as depression, worriedness, and panic7. If children's psychological distress remains untreated, their development can be significantly affected.

Children worldwide are affected by psychological distress similar to that of adults8. Such distress is related to an increased risk of harmful events, including drug addiction and poor educational performance9. Such conditions are common in the Eastern Mediterranean Region (EMR), including Iran and neighboring countries, and are the leading cause of years of life lived with disability (YLDs). In the EMR, depression accounted for the most Disability-Adjusted Life Years (DALYs), and worriedness ranked second in 201310. In summary, depression and worriedness, two essential components of psychological distress, are among the leading causes of illness and disability in adolescents11.

Many studies have attempted to investigate the association between several factors and psychological distress among children. Risk factors associated with such psychological distress appear to be modifiable, partly through the link between these characteristics and lifestyle factors. In general, the literature review shows that the prevalence of psychological distress varies with factors such as gender12, age13, hours of sleep14, physical activity and screentime15, family size16, life satisfaction17, residence area18, socioeconomic status19, self-rated health20, body mass index21, eating breakfast22, body image23, number of close friends24, having weight-reduction plan25, junk-food consumption26, sweetened beverage consumption27, smoking28, abdominal obesity29, parents’ education30, birth weight31, breastfeeding32, family history of sudden death33 and family history of cancer34. However, many studies have relied on univariate analysis and descriptive statistics of psychological distress in children and adolescents, whereas fewer have classified a comprehensive set of modifiable risk factors together with their interactions and associations with psychological distress.

Few studies have been conducted on using data mining for psychological distress classification, and to the best of our knowledge, none of them comprehensively considered various determinants and their interactions3, 35, 36. Thus, this study aims to classify the risk factors associated with psychiatric symptoms based on the demographic, lifestyle, socioeconomic status, and family history of diseases in a large sample of children and adolescents. The Group Method of Data Handling (GMDH), proposed by Ivakhnenko37, was used for classification in our study. In this network, optimal hyperparameters, the number of hidden layers, and neurons in such layers are automatically identified. Moreover, the interpretable interaction network is provided by the GMDH, which is very important in medical data mining38.

Results

Prevalence of mild-to-moderate emotional symptoms, worriedness, and depression

This national survey's participation rate was 90.6%, and the subjects enrolled were 13,486 children and adolescents out of 14,880 invited subjects. The number of missing values ranged from zero (living place and gender) to 1158 (birth weight category) in the enrolled subjects. The occurrence of missing data on the dependent and independent variables of the enrolled subjects was random (Little's MCAR test39; p = 0.421). In our study, the subjects with complete information were first analyzed. Accordingly, 10,350 subjects (i.e., 76.7% of the enrolled subjects) were analyzed. Overall, the percentages of 6–10, 11–14- and 15–19-year-old age groups were 33.7, 35.0, and 31.3, respectively, and 50.1% of the population was boys. The prevalence of having worriedness symptoms, mild-to-moderate emotional symptoms, and depression symptoms was 23.7%, 11.1%, and 20.1%, respectively. Specifically, the prevalence of having a worriedness symptom was 20.6% and 26.8% in boys and girls, respectively. 8.9% of boys and 13.2% of the girls suffered from mild-to-moderate emotional symptoms, while 18.5% of boys and 21.7% of girls experienced depression symptoms. The distribution of demographic variables, family history of diseases, and lifestyle factors was presented in different psychiatric groups (Tables 1, 2). The pairwise association between the input features and the outcome variables was shown in Fig. 1.

Table 1.

Demographic and socioeconomic characteristics of participants: the CASPIAN-IV study.

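Feature | Worriedness symptom: No, N (%) | Worriedness symptom: Yes, N (%) | p | Depression symptom: No, N (%) | Depression symptom: Yes, N (%) | p | Mild-to-moderate emotional problems: No, N (%) | Mild-to-moderate emotional problems: Yes, N (%) | p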
Family size
Less/= 4 4035 (39.0) 1125 (10.9) < 0.001 4166 (40.2) 994 (9.6) 0.035 4667 (45.1) 493 (4.8)  < 0.001
> 4 3861 (37.3) 1329 (12.8) 4104 (39.7) 1086 (10.5) 4538 (43.8) 652 (6.3)
Place living
Urban 5888 (56.9) 1985 (19.2) < 0.001 6168 (59.6) 1705 (16.5) < 0.001 6946 (67.1) 927 (9.0) < 0.001
Rural 2008 (19.4) 469 (4.5) 2102 (20.3) 375 (3.6) 2259 (21.8) 218 (2.1)
Gender
Boy 4112 (39.7) 1069 (10.3) < 0.001 4224 (40.8) 957 (9.2) < 0.001 4720 (45.6) 461 (4.5) < 0.001
Girl 3784 (36.6) 1385 (13.4) 4046 (39.1) 1123 (10.9) 4485 (43.3) 684 (6.6)
Socioeconomic status
Weak 2440 (23.6) 800 (7.7) 0.179 2584 (25.0) 656 (6.3) 0.948 2833 (27.4) 407 (4.0) 0.005
Moderate 2647 (25.6) 825 (8.0) 2780 (26.8) 692 (6.7) 3109 (30.0) 363 (3.5)
Good 2809 (27.1) 829 (8.0) 2906 (28.1) 732 (7.1) 3263 (31.5) 375 (3.6)
BMI
Underweight 978 (9.5) 270 (2.6) 0.050 993 (9.6) 255 (2.5) 0.041 1108 (10.7) 140 (1.4) 0.980
Healthy weight 5239 (50.6) 1617 (15.6) 5523 (53.4) 1333 (12.9) 6098 (58.9) 758 (7.3)
Overweight and obese 1679 (16.2) 567 (5.5) 1754 (16.9) 492 (4.7) 1999 (19.3) 247 (2.4)
Age categories
6–10 3071 (29.7) 416 (4.0) < 0.001 3111 (30.1) 376 (3.6) < 0.001 3342 (32.3) 145 (1.4) < 0.001
11–14 2711 (26.2) 916 (8.9) 2898 (28.0) 729 (7.1) 3208 (31.0) 419 (4.0)
15–19 2114 (20.4) 1122 (10.8) 2261 (21.8) 975 (9.4) 2655 (25.7) 581 (5.6)
Abdominal obesity: based on waist to height ratio
No 6396 (61.8) 1942 (18.8) 0.042 6703 (64.8) 1635 (15.8) 0.013 7426 (71.7) 912 (8.8) 0.412
Yes 1500 (14.5) 512 (4.9) 1567 (15.1) 445 (4.3) 1779 (17.2) 233 (2.3)
Mother education
Illiterate 1120 (10.8) 401 (3.9) < 0.001 1202 (11.6) 319 (3.1) 0.064 1316 (12.7) 205 (2.0) 0.001
Diploma 6013 (58.1) 1870 (18.1) 6286 (60.7) 1597 (15.4) 7026 (67.9) 857 (8.3)
University 763 (7.4) 183 (1.7) 782 (7.6) 164 (1.6) 863 (8.3) 83 (0.8)
Birth weight category
Less than 2500 g 705 (6.8) 251 (2.4) 0.042 755 (7.3) 201 (1.9) 0.423 831 (8.0) 125 (1.2) 0.109
2500–4000 g 6624 (64.0) 2000 (19.3) 6915 (66.8) 1709 (16.5) 7696 (74.4) 928 (9.0)
More than 4000 g 567 (5.5) 203 (2.0) 600 (5.8) 170 (1.7) 678 (6.5) 92 (0.9)
Milk type during infancy
Others 1226 (11.8) 389 (3.8) 0.699 1297 (12.5) 318 (3.1) 0.657 1430 (13.8) 185 (1.8) 0.586
Mother milk 6670 (64.4) 2065 (20.0) 6973 (67.4) 1762 (17.0) 7775 (75.1) 960 (9.3)
Family history of sudden death
No 6940 (67.1) 2122 (20.5) 0.065 7271 (70.2) 1791 (17.3) 0.027 8084 (78.1) 978 (9.5) 0.023
Yes 956 (9.2) 332 (3.2) 999 (9.7) 289 (2.8) 1121 (10.8) 167 (1.6)
Family history of cancer
No 6727 (65.0) 2082 (20.1) 0.668 7081 (68.4) 1728 (16.7) 0.004 7848 (75.8) 961 (9.3) 0.238
Yes 1169 (11.3) 372 (3.6) 1189 (11.5) 352 (3.4) 1357 (13.1) 184 (1.8)

N number of people who are in each category.
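The prevalence figures quoted in the text can be re-derived from the counts in Table 1. The sketch below does this arithmetic for the Gender rows; the dictionary layout and the helper name `pct` are illustrative choices, not part of the study's code.

```python
# Counts transcribed from the Results text and from the Gender rows of Table 1.
INVITED, ENROLLED, ANALYZED = 14880, 13486, 10350

# (No, Yes) counts per outcome, split by gender.
boys = {"worriedness": (4112, 1069), "depression": (4224, 957), "emotional": (4720, 461)}
girls = {"worriedness": (3784, 1385), "depression": (4046, 1123), "emotional": (4485, 684)}


def pct(part, whole):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * part / whole, 1)


participation = pct(ENROLLED, INVITED)    # 90.6
complete_cases = pct(ANALYZED, ENROLLED)  # 76.7

prevalence = {}
for outcome in boys:
    b_no, b_yes = boys[outcome]
    g_no, g_yes = girls[outcome]
    prevalence[outcome] = {
        "overall": pct(b_yes + g_yes, ANALYZED),
        "boys": pct(b_yes, b_no + b_yes),
        "girls": pct(g_yes, g_no + g_yes),
    }

print(participation, complete_cases)
print(prevalence)
```

Each value agrees with the percentage reported in the prevalence paragraph, for example 23.7% overall, 20.6% in boys, and 26.8% in girls for the worriedness symptom.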

Table 2.

Lifestyle and health-related characteristics of participants: the CASPIAN-IV study.

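Feature | Worriedness symptom: No, N (%) | Worriedness symptom: Yes, N (%) | p | Depression symptom: No, N (%) | Depression symptom: Yes, N (%) | p | Mild-to-moderate emotional problems: No, N (%) | Mild-to-moderate emotional problems: Yes, N (%) | p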
Sleeping time
< 5 h 25 (0.2) 27 (0.3) < 0.001 32 (0.3) 20 (0.2) < 0.001 37 (0.4) 15 (0.1) < 0.001
5–8 h 1629 (15.8) 678 (6.5) 1736 (16.8) 571 (5.5) 1953 (18.9) 354 (3.4)
> 8 h 6242 (60.3) 1749 (16.9) 6502 (62.8) 1489 (14.4) 7215 (69.7) 776 (7.5)
Screen time
<= 4 6600 (63.8) 1879 (18.1) < 0.001 6865 (66.3) 1614 (15.6) < 0.001 7653 (73.9) 826 (8.0) < 0.001
> 4 1296 (12.5) 575 (5.6) 1405 (13.6) 466 (4.5) 1552 (15.0) 319 (3.1)
Physical activity
Mild 2408 (23.3) 986 (9.5) < 0.001 2599 (25.1) 795 (7.7) < 0.001 2858 (27.6) 536 (5.2) < 0.001
Moderate 3026 (29.2) 860 (8.3) 3131 (30.3) 755 (7.3) 3500 (33.8) 386 (3.7)
Severe 2462 (23.8) 608 (5.9) 2540 (24.5) 530 (5.1) 2847 (27.5) 223 (2.2)
Self-rated health
Good 6299 (60.9) 1800 (74%) 0.534 6612 (63.9) 1200 (57%) 0.139 7364 (71.1) 704 (61%) 0.644
Moderate 1481 (14.3) 303 (12%) 1523 (14.7) 401 (19%) 1703 (16.5) 200 (17%)
Bad 116 (1.1) 351 (14%) 135 (1.3) 479 (24%) 138 (1.3) 241 (22%)
Breakfast categories
Non skipper 5719 (55.2) 1427 (13.8) < 0.001 5900 (57.0) 1246 (12.0) < 0.001 6523 (63.0) 623 (6.0) < 0.001
Semi skipper 939 (9.1) 402 (3.9) 1051 (10.2) 290 (2.8) 1172 (11.4) 169 (1.6)
Skipper 1238 (12.0) 625 (6.0) 1319 (12.7) 544 (5.3) 1510 (14.6) 353 (3.4)
Body image
Under weight 2679 (25.9) 800 (7.7) < 0.001 2817 (27.2) 662 (6.4) < 0.001 3088 (29.8) 391 (3.8) < 0.001
Normal 3817 (36.9) 1086 (10.5) 3978 (38.4) 925 (8.9) 4443 (42.9) 460 (4.5)
Overweight 1400 (13.5) 568 (5.5) 1475 (14.3) 493 (4.8) 1674 (16.2) 294 (2.8)
Number of close friend's category
Nothing 155 (1.5) 67 (0.6) 0.003 169 (1.6) 53 (0.5) 0.070 189 (1.8) 33 (0.3) < 0.001
One 1096 (10.6) 397 (3.8) 1181 (11.4) 312 (3.0) 1299 (12.6) 194 (1.9)
Two 1878 (18.2) 581 (5.6) 2005 (19.4) 454 (4.4) 2156 (20.8) 303 (2.9)
Three or more 4767 (46.1) 1409 (13.6) 4915 (47.5) 1261 (12.2) 5561 (53.7) 615 (6.0)
Having nutrition plan based on special diet
No 6999 (67.6) 2046 (19.8) < 0.001 7326 (70.8) 1719 (16.6) < 0.001 8101 (78.3) 944 (9.1) < 0.001
Yes 897 (8.7) 408 (3.9) 944 (9.1) 361 (3.5) 1104 (10.7) 201 (1.9)
Salty snack including chips
Seldom/never 4118 (39.8) 1183 (11.4) < 0.001 4276 (41.3) 1025 (9.9) < 0.001 4788 (46.2) 513 (5.0) < 0.001
Weekly 2903 (28.0) 902 (8.7) 3062 (29.6) 743 (7.2) 3375 (32.6) 430 (4.1)
Daily 875 (8.5) 369 (3.6) 932 (9.0) 312 (3.0) 1042 (10.1) 202 (2.0)
Sweetened beverage consumption
Seldom/never 4981 (48.1) 257 (2.5) < 0.001 5184 (50.1) 223 (2.2) < 0.001 6860 (66.3) 130 (1.3) < 0.001
Weekly 2436 (23.6) 787 (7.6) 2573 (24.8) 650 (6.3) 2851 (27.5) 372 (3.6)
Daily 479 (4.6) 1410 (13.6) 513 (5.0) 1207 (11.6) 5748 (55.5) 643 (6.2)
Fast food consumption
Seldom/never 5915 (57.2) 1698 (16.4) < 0.001 6161 (59.5) 1452 (14.0) < 0.001 606 (5.9) 753 (7.3) < 0.001
Weekly 1802 (17.4) 652 (6.3) 1915 (18.5) 539 (5.2) 2124 (20.5) 330 (3.2)
Daily 179 (1.7) 104 (1.0) 194 (1.9) 89 (0.9) 221 (2.1) 62 (0.6)
Passive or active smoker
No 4453 (43.0) 1139 (11.0) < 0.001 4656 (45.0) 936 (9.0) < 0.001 5111 (49.4) 481 (4.6) < 0.001
Yes 3443 (33.3) 1315 (12.7) 3614 (34.9) 1144 (11.1) 4094 (39.6) 664 (6.4)
Life-satisfaction
Low 1241 (12.0) 1715 (16.6) < 0.001 1313 (12.7) 1413 (13.7) < 0.001 1537 (14.9) 702 (6.7) < 0.001
High 6655 (64.3) 739 (7.1) 6957 (67.2) 667 (6.4) 7668 (74.1) 443 (4.3)

N number of people who are in each category.

Figure 1.

The bivariate correlation between the inputs and outputs. x1: sleeping-time cat. (category); x2: screen time cat.; x3: family-size cat.; x4: life satisfaction cat.; x5: residence area; x6: gender; x7: physical activity cat.; x8: SES cat.; x9: self-rated health cat. (SRH); x10: BMI cat.; x11: breakfast cat.; x12: body image cat.; x13: age cat.; x14: the number of close friends cat.; x15: weight-reduction plan; x16: salty-snack cat.; x17: sweetened beverage consumption cat.; x18: fast food consumption cat.; x19: smoker; x20: abdominal obesity; x21: mother education cat.; x22: birth-weight cat.; x23: milk type during infancy; x24: the family history of sudden death; x25: the family history of cancer; y1: worriedness symptom; y2: depression symptom; y3: mild-to-moderate emotional symptoms; y4: psychiatric symptoms. All ordinal variables were encoded in ascending order (e.g., mild to severe, or seldom to daily), except for x9 (1: good, 2: moderate, 3: bad), x11 (1: non-skipper, 2: semi-skipper, 3: skipper), and x17 (1: daily, 2: weekly, 3: seldom/never).

The association between outcome variables was measured using the Phi coefficient40. It was 0.208 (CI 95% 0.191–0.224) (p < 0.001) between worriedness and depression symptoms. The association between worriedness symptoms and mild-to-moderate emotional symptoms was 0.385 (CI 95% 0.370–0.399) (p < 0.001). Similarly, the association between depression symptoms and mild-to-moderate emotional symptoms was 0.285 (CI 95% 0.269–0.300) (p < 0.001). Thus, no more than a trivial association was observed in the first and last outcome pairs40, 41, while worriedness symptoms and mild-to-moderate emotional symptoms were weakly correlated in our study.
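For two binary outcomes, the Phi coefficient is computed from the 2 × 2 cross-tabulation. The following is a minimal sketch of that formula; the cell counts in the usage line are hypothetical and are not taken from the CASPIAN-IV data, since the cross-tabulations of outcome pairs are not reported in this section.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi coefficient for a 2 x 2 table.

    a: both outcomes present, b: first only, c: second only, d: neither.
    """
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("a row or column of the table is empty")
    return (a * d - b * c) / denom


# Hypothetical counts, for illustration only.
example_phi = phi_coefficient(a=300, b=700, c=900, d=8100)
print(round(example_phi, 3))
```

Phi equals the Pearson correlation between the two 0/1 indicators, so it ranges from −1 to 1 and is 0 when the outcomes are independent in the table.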

Moreover, the association between depression symptoms and each of Questions 1–5 (i.e., questions used to define mild-to-moderate emotional symptoms) was assessed. The lowest association was with confusion (Q5) [Phi coefficient = 0.179 (CI 95% 0.162–0.195); p < 0.001], while the highest association was with worthlessness (Q1) [Phi coefficient = 0.220 (CI 95% 0.203–0.236); p < 0.001]. Thus, no more than a trivial association between depression and any of the five mild-to-moderate emotional symptoms items was observed. Factor analysis based on principal components analysis (PCA) was also applied to the original Q1–Q5. Only one principal component had an eigenvalue greater than one and was retained. It showed acceptable discrimination for depression diagnosis [Area under the ROC Curve = 0.736 (CI 95% 0.726–0.746); p < 0.001]. Thus, the combination of Q1–Q5 could be indirectly used for depression symptom diagnosis.

Classification results

Among the 25 features used in our study, five, ten, and four were selected by the stability feature selection and the GMDH network for depression symptoms, mild-to-moderate emotional symptoms, and worriedness symptoms, respectively, and the selection was consistent across cross-validation folds. The proposed GMDH networks for depression symptoms, mild-to-moderate emotional symptoms, and worriedness symptoms are shown in Figs. 2, 3 and 4. The three most important features for depression symptoms were the eating breakfast category (cat.), self-rated health cat. (SRH), and diet program. For mild-to-moderate emotional symptoms, they were age cat., self-rated health cat., and milk type, while age cat., physical activity cat., and socioeconomic status (SES) cat. were the most critical features for worriedness symptoms. Feature importance was ranked by the number of interactions in which a feature participates in the network.

Figure 2.

A representative GMDH network for classifying depression symptoms.

Figure 3.

A representative GMDH network for classifying mild-to-moderate emotional symptoms.

Figure 4.

A representative GMDH network for classifying worriedness symptoms.

The average performance of the GMDH network and the state-of-the-art methods on the test folds during threefold cross-validation is shown in Table 3. All performance indices of the analyzed methods and their CI 95%, computed on the cross-validated confusion matrix, are reported in Table 4. The GMDH network significantly outperformed the state-of-the-art methods for all outcomes (adjusted p < 0.05; McNemar's test), except for LDA in mild-to-moderate emotional symptoms classification, where the two were not significantly different.
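McNemar's test compares two classifiers evaluated on the same subjects by looking only at the discordant cases, i.e., subjects that one classifier labels correctly and the other does not. The sketch below uses the chi-square form with continuity correction, which is one common variant; whether the study used this or the exact binomial form, and which multiple-comparison adjustment produced the adjusted p values, is not stated in this section. The discordant counts in the usage line are hypothetical.

```python
import math


def mcnemar(b, c):
    """McNemar's chi-square test with continuity correction.

    b: cases classifier A gets right and classifier B gets wrong.
    c: cases classifier A gets wrong and classifier B gets right.
    Returns (statistic, p_value) for one degree of freedom.
    """
    if b + c == 0:
        return 0.0, 1.0  # no discordant cases: no evidence of a difference
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical discordant counts, for illustration only.
stat, p_value = mcnemar(b=40, c=10)
print(round(stat, 2), p_value)
```

Concordant cases (both right or both wrong) do not enter the statistic, which is why two classifiers with similar accuracy can still differ significantly, or not, depending on where their errors fall.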

Table 3.

The performance of the different classifiers in MEAN ± SD over the test folds using threefold cross-validation.

Outcome Classifier Se (%) Sp (%) Pr (%) Acc (%)
Depression symptom GMDH 78 ± 3 96 ± 1 83 ± 4 92 ± 2
LDA 62 ± 2 66 ± 1 31 ± 1 65 ± 1
SVM 19 ± 2 91 ± 2 36 ± 5 77 ± 1
MLP 10 ± 3 98 ± 1 57 ± 3 80 ± 1
Mild-to-moderate emotional symptoms GMDH 70 ± 3 70 ± 2 23 ± 2 70 ± 2
LDA 64 ± 3 70 ± 1 21 ± 1 70 ± 1
SVM 14 ± 1 96 ± 1 29 ± 2 87 ± 1
MLP 4 ± 4 99 ± 1 45 ± 11 89 ± 1
Worriedness symptom GMDH 70 ± 3 88 ± 2 64 ± 3 84 ± 2
LDA 62 ± 1 65 ± 1 35 ± 1 64 ± 1
SVM 23 ± 3 88 ± 2 38 ± 1 73 ± 1
MLP 13 ± 1 97 ± 1 54 ± 3 77 ± 1
Psychiatric symptoms GMDH 80 ± 2 70 ± 3 59 ± 2 74 ± 2
LDA 63 ± 1 66 ± 1 50 ± 1 65 ± 1
SVM 32 ± 2 87 ± 3 57 ± 4 67 ± 2
MLP 32 ± 1 88 ± 1 60 ± 2 69 ± 1

Se Sensitivity, Sp Specificity, Pr Precision, Acc Accuracy.

Table 4.

The performance of the different classifiers and their CI 95% based on the cross-validated confusion matrix.

Outcome Classifier Se (%) Sp (%) Pr (%) NPV (%) Acc (%) AUC DOR DP MCC K(C)
Depression symptom GMDH 79 (77, 81) 97 (96, 97) 87 (85, 88) 95 (94, 95) 93 (93, 94) 0.88 (0.87, 0.89) 121.6 (103.2, 143.4) 2.04 (1.97, 2.11) 0.79 (0.78, 0.80) 0.79 (0.77, 0.80)
LDA 62 (60, 64) 66 (65, 67) 31 (30, 33) 87 (87, 88) 65 (64, 66) 0.64 (0.63, 0.65) 3.2 (2.9, 3.5) 0.48 (0.44, 0.53) 0.23 (0.21, 0.25) 0.20 (0.18, 0.22)
SVM 19 (17, 21) 91 (91, 92) 36 (33, 39) 82 (81, 83) 77 (76, 78) 0.55 (0.54, 0.57) 2.5 (2.2, 2.9) 0.39 (0.33, 0.45) 0.14 (0.12, 0.15) 0.13 (0.10, 0.16)
MLP 10 (9, 12) 98 (97, 98) 57 (52, 62) 81 (81, 82) 80 (80, 81) 0.54 (0.53, 0.56) 5.9 (4.7, 7.2) 0.75 (0.66, 0.84) 0.18 (0.16, 0.20) 0.12 (0.09, 0.16)
Mild-to-moderate emotional symptoms GMDH 71 (68, 74) 69 (68, 70) 22 (21, 24) 95 (94, 96) 69 (68, 70) 0.70 (0.68, 0.72) 5.5 (4.8, 6.2) 0.72 (0.66, 0.78) 0.26 (0.24, 0.28) 0.20 (0.18, 0.23)
LDA 64 (62, 67) 70 (69, 71) 21 (20, 23) 94 (94, 95) 70 (69, 71) 0.67 (0.66, 0.69) 4.3 (3.7, 4.8) 0.62 (0.56, 0.67) 0.23 (0.21, 0.25) 0.18 (0.16, 0.21)
SVM 14 (12, 16) 96 (95, 96) 29 (25, 33) 90 (89, 91) 87 (86, 87) 0.55 (0.53, 0.57) 3.7 (3.0, 4.5) 0.55 (0.47, 0.63) 0.13 (0.12, 0.15) 0.12 (0.08, 0.17)
MLP 4 (3, 6) 99 (99, 100) 50 (40, 60) 89 (89, 90) 89 (88, 89) 0.52 (0.51, 0.54) 8.4 (5.6, 12.4) 0.90 (0.73, 1.07) 0.12 (0.10, 0.14) 0.06 (0.01, 0.11)
Worriedness symptom GMDH 71 (69, 73) 86 (85, 88) 61 (59, 63) 91 (90, 91) 82 (81, 83) 0.79 (0.77, 0.80) 15.0 (13.5, 16.8) 1.15 (1.10, 1.20) 0.54 (0.53, 0.56) 0.54 (0.52, 0.56)
LDA 62 (60, 64) 65 (64, 66) 35 (34, 37) 84 (84, 85) 64 (63, 65) 0.63 (0.62, 0.65) 3.0 (2.7, 3.3) 0.46 (0.42, 0.50) 0.23 (0.21, 0.25) 0.21 (0.19, 0.23)
SVM 23 (22, 25) 88 (87, 89) 38 (35, 40) 79 (78, 80) 73 (72, 74) 0.56 (0.54, 0.57) 2.2 (2.0, 2.5) 0.34 (0.29, 0.39) 0.14 (0.12, 0.16) 0.13 (0.10, 0.16)
MLP 13 (12, 14) 97 (96, 97) 54 (50, 58) 78 (77, 79) 77 (76, 77) 0.55 (0.53, 0.56) 4.1 (3.5, 4.9) 0.60 (0.53, 0.67) 0.17 (0.16, 0.19) 0.13 (0.10, 0.16)
Psychiatric symptoms GMDH 78 (77, 79) 71 (70, 72) 59 (58, 61) 86 (85, 87) 73 (72, 75) 0.75 (0.73, 0.76) 8.7 (7.9, 9.5) 0.92 (0.88, 0.96) 0.47 (0.45, 0.48) 0.46 (0.44, 0.47)
LDA 63 (61, 64) 66 (65, 67) 50 (48, 51) 76 (75, 78) 65 (63, 66) 0.64 (0.63, 0.65) 3.2 (3.0, 3.5) 0.50 (0.46, 0.53) 0.27 (0.26, 0.29) 0.27 (0.25, 0.29)
SVM 32 (30, 33) 87 (86, 87) 56 (54, 58) 70 (69, 71) 67 (66, 68) 0.59 (0.58, 0.60) 3.0 (2.7, 3.3) 0.47 (0.43, 0.51) 0.22 (0.20, 0.24) 0.20 (0.18, 0.23)
MLP 33 (31, 34) 88 (87, 89) 60 (58, 62) 71 (70, 72) 69 (68, 70) 0.60 (0.59, 0.61) 3.6 (3.3, 4.0) 0.54 (0.50, 0.59) 0.25 (0.23, 0.27) 0.23 (0.21, 0.25)

Se sensitivity, Sp specificity, Acc accuracy, Pr precision, F1S F1-Score, AUC area under the receiver operating characteristic (ROC) curve, LR likelihood ratio, DOR diagnosis odds ratio, MCC Matthews correlation coefficient, DP discriminant power, K(C) Cohen's kappa coefficient, NPV negative predictive value. All AUC and K(C) values were statistically significant (p < 0.05). Statistical power (Power), false alarm (FA), and F1-score (F1S) values can be calculated directly as Power = Se, FA = 1 − Sp(%)/100, and F1S = 2 * Se * Pr/(Se + Pr).
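The indices in Table 4 follow from a 2 × 2 confusion matrix, and the derived quantities named in the footnote follow from Se, Sp, and Pr alone. The sketch below shows both routes using standard textbook definitions; the function names are illustrative, and the confusion-matrix counts in the second usage line are hypothetical because the cross-validated confusion matrices are not reproduced in this section. Discriminant power is omitted because its exact formula is not given here.

```python
import math


def derived_from_rates(se, sp, pr):
    """Power, false alarm, F1 and DOR from Se, Sp, Pr given as fractions."""
    return {
        "power": se,
        "false_alarm": 1 - sp,
        "f1": 2 * se * pr / (se + pr),
        "dor": (se / (1 - se)) * (sp / (1 - sp)),
    }


def metrics_from_counts(tp, fn, fp, tn):
    """Standard indices from a 2 x 2 confusion matrix."""
    n = tp + fn + fp + tn
    acc = (tp + tn) / n
    # Agreement expected by chance, used by Cohen's kappa.
    p_exp = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    return {
        "se": tp / (tp + fn),
        "sp": tn / (tn + fp),
        "pr": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "acc": acc,
        "mcc": (tp * tn - fp * fn)
        / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        "kappa": (acc - p_exp) / (1 - p_exp),
    }


# Rounded Se, Sp, Pr of the GMDH depression-symptom row in Table 4.
table_row = derived_from_rates(se=0.79, sp=0.97, pr=0.87)
print(round(table_row["dor"], 1))  # 121.6, the DOR reported for that row

# Hypothetical counts, for illustration only.
demo = metrics_from_counts(tp=80, fn=20, fp=10, tn=90)
print({k: round(v, 3) for k, v in demo.items()})
```

With the rounded values of that row, the false alarm rate is 0.03 and the F1-score is about 0.83.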

We further classified the imputed dataset in which the number of subjects was increased from 10,350 to 11,820 using the GMDH algorithm (Table 5). No significant improvement was observed compared with the dataset with complete information (adjusted p > 0.05; McNemar's test).

Table 5.

The performance of the GMDH classifier in MEAN ± SD over the test folds using threefold cross-validation on the imputed dataset.

Outcome Se (%) Sp (%) Pr (%) Acc (%)
Depression symptom 79 ± 4 95 ± 2 82 ± 3 91 ± 3
Mild-to-moderate emotional symptoms 71 ± 2 71 ± 3 40 ± 3 71 ± 1
Worriedness symptom 72 ± 2 87 ± 3 41 ± 2 85 ± 1
Psychiatric symptoms 81 ± 3 71 ± 2 81 ± 2 77 ± 3

Se Sensitivity, Sp Specificity, Pr Precision, Acc Accuracy.

The performance of the proposed algorithm for classifying extended three-class psychiatric symptoms was provided in Table 6. The algorithm was run for each age category (6–10, 11–14, and 15–19 years) to improve its performance. The selected factors by the stability feature selection were breakfast, life satisfaction, self-rated health, screen time, residence area, sleeping-time, and weight-reduction plan for the first age category. They were self-rated health, life satisfaction, gender, breakfast, sleeping-time, residence area, weight-reduction plan, physical activity, and body image for the second age category. The algorithm selected self-rated health, life satisfaction, gender, breakfast, physical activity, body image, and screen time for the third age category. Among the selected factors, self-rated health, life satisfaction, and breakfast were common in all age categories.

Table 6.

The performance of the GMDH algorithm (in percent) based on the cross-validated confusion matrix for classifying extended three-class psychiatric symptoms.

Age category (year) | Low-risk Pr | Low-risk Se | Medium-risk Pr | Medium-risk Se | High-risk Pr | High-risk Se | Overall PrM | Overall SeM | Overall F1SM | Overall Acc | Overall AR (CI 95%)
6–10 73 47 70 56 73 88 72 64 68 72 50 (47–52)
11–14 72 68 66 66 65 70 68 68 68 68 52 (50–54)
15–19 75 81 68 67 69 57 71 68 70 72 54 (52–56)

Pr Precision, Se Sensitivity, Acc Accuracy, PrM Macro-averaged precision, SeM Macro-averaged recall (= sensitivity), F1SM Macro-averaged F1 Score, AR Agreement rate (Cohen’s Kappa), CI confidence interval.
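Macro-averaging gives each class equal weight regardless of its size. The sketch below recomputes the overall columns for the 6–10 age category from the per-class values in Table 6. It assumes that F1SM is the harmonic mean of PrM and SeM; this definition is not stated in the footnote, but it reproduces the tabulated value for this row, whereas averaging the per-class F1 scores gives about 66.

```python
# Per-class precision and sensitivity (%) for the 6-10 age category, from Table 6.
precision = {"low": 73, "medium": 70, "high": 73}
sensitivity = {"low": 47, "medium": 56, "high": 88}


def macro_average(per_class):
    """Unweighted mean over classes."""
    return sum(per_class.values()) / len(per_class)


pr_m = macro_average(precision)
se_m = macro_average(sensitivity)
# Assumed definition: harmonic mean of the macro-averaged precision and recall.
f1_m = 2 * pr_m * se_m / (pr_m + se_m)

print(round(pr_m), round(se_m), round(f1_m))  # 72 64 68
```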

Discussion

In our study, we considered depression symptoms, worriedness symptoms, and mild-to-moderate emotional symptoms for classification. The SRH was recognized as the primary variable for classifying depressive symptoms and mild-to-moderate emotional symptoms (Figs. 2, 3). Overall, few studies were performed on SRH in the literature. It has been shown that depression was strongly associated with reporting poor SRH42. There was also a relationship between poor SRH, depressive and anxiety symptoms among university students with high academic stress43.

Moreover, a longitudinal study of adolescent health in the United States demonstrated that one of the main factors associated with persistent depressive symptoms was poor self-rated general health44. Although there was no significant correlation between SRH and depression symptoms in our dataset (Rank Biserial rrb = 0.002; p = 0.819) (Fig. 1), its interactions with screentime, diet, and breakfast were selected by the GMDH network (Fig. 2). This illustrates how the interaction network can identify indirect factors that a univariate analysis misses. The same held for mild-to-moderate emotional symptoms: there was no significant correlation between SRH and mild-to-moderate emotional symptoms in our dataset (Rank Biserial rrb = 0.008; p = 0.390) (Fig. 1), yet its interactions with milk type, abdominal obesity, and beverage consumption were selected by the GMDH network (Fig. 3).

One of the most critical risk factors for children, the physical activity level, was selected for those having worriedness symptoms in our study (Fig. 4). The beneficial effects of regular physical activity on health are indisputable in modern medicine45. Furthermore, a large amount of exercise plays an essential role in minimizing worry in clinical settings46. Although the correlation between physical activity and worriedness symptoms was very low in our study (Rank Biserial rrb = −0.087; p < 0.001) (Fig. 1), its interactions with the other features were selected by the GMDH network (Fig. 4).

In our study, sleep hour was selected as a factor for mild-to-moderate emotional symptoms (Fig. 3). This is in agreement with previous studies47. Adverse general health outcomes are associated with indicators of sleep problems, such as short sleep duration. Another study on 11,788 pupils from 11 different European countries showed a negative association between hours of sleep per night and emotional symptoms48. Although the correlation between sleep hours and mild-to-moderate emotional symptoms was very low in our study (Rank Biserial rrb = −0.080; p < 0.001) (Fig. 1), its interaction with the milk type during infancy was selected by the GMDH network (Fig. 3). It was shown in the literature that there is a relationship between breastfeeding and sleep quality in infants49. Also, breastfeeding is related to behavior problems in children and adolescents50. However, there was no significant correlation between sleep hour and milk type in our study (Rank Biserial rrb = 0.005; p = 0.634) (Fig. 1).

One of the notable factors in our research was screen time (Fig. 2). Some studies showed that children who watch TV for more than two hours a day usually have lower self-esteem, lower school performance, and unhealthy eating habits51. Such consequences would lead to psychological distress in young children52. The consequences reported by these articles are in agreement with our finding of a positive association between screen time and having depression symptoms. Although the correlation between screen time and depression symptoms was very low in our study (Rank Biserial rrb = 0.056; p < 0.001) (Fig. 1), its interactions with SRH and breakfast were selected by the GMDH network (Fig. 2).

Salty snack consumption was a predictor of depression and worriedness symptoms (Figs. 2, 4). In general, there have been few studies in this field26. It has also been demonstrated that 12- to 13-year-old Norwegian adolescents with healthy dietary patterns have better mental health conditions53. Although the correlation between salty snack consumption and depression or worriedness symptoms was very low in our study (depression: Rank Biserial rrb = 0.030; p = 0.002, worriedness: Rank Biserial rrb = 0.044; p < 0.001) (Fig. 1), its interactions with SRH and breakfast were selected for depression symptoms, and its interactions with age category, SES category, and physical activity were selected for worriedness symptoms (Figs. 2, 4).

Breakfast is one of the most important meals. The prevalence of breakfast skipping is increasing among adolescents. Previous studies showed that breakfast intake is related to mental problems54. Another study showed that skipping breakfast at least four times a week was significantly associated with a higher depressed mood score55. Our finding is consistent with such results on the association of breakfast consumption with depression symptoms (Fig. 2). Although the correlation between breakfast consumption and depression symptoms was relatively low in our study (Rank Biserial rrb = 0.107; p < 0.001) (Fig. 1), its interactions with the others were selected (Fig. 2).

Discretization was used in our study to generate categorical input variables instead of interval variables. Although this procedure reduces the flexibility of the variables, it could increase the classifiers' performance and their generalization. When we instead used the original interval variables, the average accuracy of the GMDH classification system decreased by 3%, 4%, 2%, and 7% for depression symptoms, mild-to-moderate emotional symptoms, worriedness symptoms, and psychiatric symptoms, respectively. Moreover, the correlation between the original input variables and their categorical versions ranged from 0.68 for age (Spearman’s rho; p < 0.001) to 0.95 for screentime (Spearman’s rho; p < 0.001). Thus, such a discretization did not substantially reduce the amount of information.
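As an illustration of the discretization step, the sketch below maps two interval variables to the ordinal categories listed in Table 2, with codes ascending as described in the Fig. 1 caption. The cut-offs come from the table rows; treating exactly 5 h and exactly 8 h of sleep as the middle category and reading the screen-time threshold in hours are assumptions, and the function names are illustrative.

```python
def sleep_category(hours):
    """Sleeping time: 1 = less than 5 h, 2 = 5 to 8 h, 3 = more than 8 h."""
    if hours < 5:
        return 1
    if hours <= 8:
        return 2
    return 3


def screen_category(hours):
    """Screen time: 1 = at most 4 (assumed hours), 2 = more than 4."""
    return 1 if hours <= 4 else 2


# Hypothetical raw values, for illustration only.
raw_sleep = [4.5, 6.0, 8.0, 9.5]
raw_screen = [1.0, 4.0, 5.5]

print([sleep_category(h) for h in raw_sleep])    # [1, 2, 2, 3]
print([screen_category(h) for h in raw_screen])  # [1, 1, 2]
```

Coarse bins discard within-bin variation, which is why the rank correlation between the raw and categorical versions is below 1, as reported above.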

In our study, the GMDH network was used as a classifier. This network grows incrementally using regularized least squares (RLS), a convex optimization problem, and it also generates the interaction network (Figs. 2, 3, 4), leading to better clinical interpretations. The proposed GMDH network had very good diagnostic accuracy for depression symptom classification, while it showed good diagnostic accuracy for the other outcomes. Its agreement rate with the gold standard was excellent for depression symptoms and fair to good for worriedness symptoms and psychiatric symptoms. However, it showed a poor agreement rate for the mild-to-moderate emotional symptoms classification. The proposed system's discriminant power was fair for depression symptoms and limited for worriedness symptoms, but poor for mild-to-moderate emotional symptoms. The proposed system's false alarm (FA) rate ranged from 3% for depression symptoms to 31% for mild-to-moderate emotional symptoms. However, the proposed system's statistical power was higher than 70% for all outcomes. Moreover, the false discovery rate (i.e., 1 − precision) ranged from 13% for depression symptoms to 78% for mild-to-moderate emotional symptoms.
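To make the idea of an interaction neuron concrete, the sketch below fits the classic quadratic GMDH building block on one pair of inputs with ridge-regularized least squares and then picks the best pair by error on a held-out split. It is a generic illustration under stated assumptions, not the study's implementation: the customized cost function for imbalanced data, the layer-growing rule, and the hyperparameters are not reproduced, the regularization weight `lam` is arbitrary, and the data are synthetic.

```python
from itertools import combinations, product


def terms(u, v):
    """Quadratic polynomial terms of one input pair."""
    return [1.0, u, v, u * v, u * u, v * v]


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][k] * w[k] for k in range(i + 1, n))
        w[i] = (M[i][n] - tail) / M[i][i]
    return w


def fit_neuron(X, y, i, j, lam=1e-3):
    """Ridge fit of one neuron: solve (Z'Z + lam I) w = Z'y."""
    Z = [terms(row[i], row[j]) for row in X]
    p = len(Z[0])
    A = [[sum(z[r] * z[c] for z in Z) + (lam if r == c else 0.0)
          for c in range(p)] for r in range(p)]
    rhs = [sum(z[r] * t for z, t in zip(Z, y)) for r in range(p)]
    return solve(A, rhs)


def mse(w, X, y, i, j):
    """Mean squared error of a fitted neuron on (X, y)."""
    preds = [sum(a * b for a, b in zip(w, terms(row[i], row[j]))) for row in X]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)


# Synthetic ordinal data: the label depends jointly on inputs 0 and 2.
X = [list(r) for r in product([1, 2, 3], repeat=3)]
y = [1.0 if r[0] >= 2 and r[2] >= 2 else 0.0 for r in X]
X_train, y_train = X[0::2], y[0::2]
X_valid, y_valid = X[1::2], y[1::2]

# One GMDH-style selection step: fit every pair, rank by held-out error.
scores = {}
for i, j in combinations(range(3), 2):
    w = fit_neuron(X_train, y_train, i, j)
    scores[(i, j)] = mse(w, X_valid, y_valid, i, j)

best_pair = min(scores, key=scores.get)
print(best_pair, scores)
```

In a full GMDH network, the outputs of the best neurons would become inputs to the next layer, and the pairs that survive selection are what an interaction diagram such as Figs. 2, 3 and 4 displays.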

The proposed GMDH network had the best and worst classification performance (MCC) for depression symptoms and mild-to-moderate emotional symptoms, respectively (Table 4). The dataset was highly imbalanced for the mild-to-moderate emotional symptoms outcome (prevalence of 11.1%). However, all performance indices were consistent across the test folds (Table 3). It must be mentioned that the diagnostic accuracy of classifying psychological distress is not usually high in the literature56. Meanwhile, there could be two reasons why the GMDH significantly outperformed the MLP classifier. First, the MLP is a fully connected network, while the GMDH is not (Figs. 2, 3, 4), resulting in more parameters in the MLP. Second, the cost function of the GMDH was customized for imbalanced data, while cross-entropy was used for the MLP, which could be improved by using weighted cross-entropy57.

The GMDH algorithm was further used to classify the extended three-class psychiatric symptoms. Overall, the agreement rate of the extended system was comparable to that of the four two-class problems (Tables 4, 6). However, the extended system uses the original answers to the questions and identifies the severity of the psychiatric symptoms as a whole, rather than defining a psychiatric symptom based on one question. Accordingly, it might be preferred in practice. For the extended system, analyzing the interaction network identified self-rated health, life satisfaction, breakfast consumption, and sleeping/screen time as the most critical factors.

In our study, we designed classification systems for psychiatric symptoms. Proper diagnosis of mental disorders, such as depressive and anxiety disorders, requires detailed analysis58. As an important limitation of our study, each of the depression and worriedness symptom outcomes was derived from a single question. The mild-to-moderate emotional symptoms outcome was created based on five questions (confusion, insomnia, anxiety, angriness, and worthlessness). Generally, depression might be embedded in emotional symptoms, which is one of the limitations of our methodology. However, because depression was defined in our study as preventing students from routine activities, it was more alarming than the mild-to-moderate emotional symptoms defined based on Q1–Q5. Q1–Q5 could nevertheless be considered as symptoms of depression. In our analysis, no more than a trivial association between depression and any of the five mild-to-moderate emotional symptoms items was observed, although the combination of Q1–Q5 had acceptable discrimination for depression diagnosis. Moreover, the overall outcome (i.e., psychiatric symptoms) was generated based on all seven questions in our analysis.

The advantage of our study is the large sample size. Moreover, it analyzed a comprehensive set of factors related to psychiatric symptoms to monitor direct and indirect modifiable factors. However, it is a repeated cross-sectional study59, and no causality can be inferred. We only analyzed the CASPIAN-IV data, and examining the trend and association between variables over time is the focus of our future activity. Another limitation was the possible bias in the participants' self-reported answers. Moreover, there is evidence of substance use among adolescents in Iran and of their considerable psychological dysfunction60; however, substance use was not recorded in CASPIAN-IV, which is another limitation of our study.

In conclusion, our study emphasized the modifiable factors of psychiatric symptoms, including breakfast, salty snack, and sweetened beverage consumption, screentime, (abdominal) obesity, sleep hours, and physical activity. Iran ranked fourth among the countries with the highest age-standardized mental disorder DALYs rates (2436.44 DALYs per 100,000, based on the GBD 2019)61. Such disorders could stem from childhood psychiatric symptoms, and empowering protective factors and changing modifiable risk factors might reduce such rates in the future. The developed algorithm could be implemented as a web-based or Android application to identify whether a student could be at high risk of “psychiatric symptoms” based on the selected input variables (Figs. 2, 3, 4). Such indirect health screenings have great potential to be integrated into schools, which is the focus of our future activities.

Methods

Data from a large population enrolled in the fourth study of a national surveillance program, entitled “Childhood and Adolescence Surveillance and Prevention of Adult Non-communicable disease” (CASPIAN-IV), supported by the WHO/Eastern Mediterranean region, the Iranian Ministry of Health, and the Ministry of Education, were analyzed in our project. The detailed methodology is published elsewhere62. We briefly describe the study population.

The population and sampling method

A sample of 14,880 students aged 6–18 years was selected by multi-stage sampling from schools of urban and rural areas of 30 Iranian provinces. Having explained the objectives and protocols, participants were enrolled in the study. Parents gave the written informed consent and oral permission, while oral assent was obtained from students to express willingness to participate in research. Trained healthcare professionals performed all the data collection procedures. Study protocols were reviewed and approved by the Research and Ethics Council of Isfahan University of Medical Sciences (#5429-90). All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.

Outcome variables

The World Health Organization-Global School-based Student Health Survey (WHO-GSHS) was used in our study. It covers alcohol, tobacco use, hygiene, physical activity, mental health, dietary behaviors, violence, protective factors, and unintentional injuries among children and youths. After translating the questions into Persian and simplifying those that were difficult to understand, the questionnaire's reliability and validity were assessed (refs. 63–65). We considered psychiatric symptoms as depression symptoms, worriedness symptoms, and mild-to-moderate emotional symptoms, where the latter included confusion, insomnia, anxiety, angriness, and worthlessness66. The three indicated psychiatric symptoms were taken into consideration in this study. Also, a subject with any of the depression symptoms, worriedness symptoms, or mild-to-moderate emotional symptoms was considered as having “psychiatric symptoms,” the overall outcome of the study. The psychiatric symptoms were assessed by the questions presented in Supplementary Table S1. The first five questions were used to identify mild-to-moderate emotional symptoms in our research. Those who experienced at least 3 out of 5 problems every day, more than once a week, or once a week were defined as “adolescents with mild-to-moderate emotional symptoms.” An indicator of depression symptoms was a positive answer to the 6th question. In the last question, students who were worried most of the time or always, so that they could not sleep at night, were considered as having worriedness symptoms35. Thus, our goal was to design and implement four binary classifiers to classify depression symptoms, mild-to-moderate emotional symptoms, worriedness symptoms, and “psychiatric symptoms” using the input variables. We further created an extended outcome using all seven questions of the questionnaire. The original responses to the seven questions were analyzed, and no dichotomization was used. 
The first principal component was extracted using Principal Component Analysis (PCA). Then, tertiles of the PC were extracted, splitting the subjects into “low,” “medium,” and “high” risk psychiatric symptoms groups. Thus, an extended three-class classification problem was also considered.
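The construction of the extended outcome can be illustrated with a minimal sketch, assuming the seven answers are coded numerically and that higher codes mean more frequent symptoms. The function names, the toy answers, and the use of power iteration to approximate the first principal component are illustrative assumptions and not the authors' implementation.

```python
from statistics import mean


def first_pc_scores(rows, n_iter=200):
    """Scores of each subject on the first principal component (power iteration)."""
    n, p = len(rows), len(rows[0])
    col_means = [mean(row[c] for row in rows) for c in range(p)]
    centred = [[row[c] - col_means[c] for c in range(p)] for row in rows]
    # Sample covariance matrix of the centred answers.
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(p)]
           for a in range(p)]
    loading = [1.0] * p  # starting vector; assumes positively correlated items
    for _ in range(n_iter):
        product = [sum(cov[a][b] * loading[b] for b in range(p)) for a in range(p)]
        norm = sum(v * v for v in product) ** 0.5
        loading = [v / norm for v in product]
    return [sum(r[c] * loading[c] for c in range(p)) for r in centred]


def tertile_labels(scores):
    """Split subjects into low / medium / high groups at the score tertiles."""
    ranked = sorted(scores)
    n = len(ranked)
    cut1, cut2 = ranked[n // 3], ranked[(2 * n) // 3]
    return ["low" if s < cut1 else "medium" if s < cut2 else "high" for s in scores]


# Hypothetical answers of six subjects to three items (not CASPIAN-IV data).
answers = [[1, 1, 1], [1, 2, 1], [2, 2, 2], [3, 2, 3], [4, 4, 3], [4, 4, 4]]
groups = tertile_labels(first_pc_scores(answers))
```

With real data, the sign of the first component is arbitrary, so the direction of the "low" and "high" groups should be checked against the raw answers.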

Input variables

The input variables considered in our psychiatric symptom classification system are as follows67: age category, sex, socioeconomic status (SES), physical activity level, body mass index (BMI) category, abdominal obesity, family size, residence area, sleep-time category, screen time category, (passive and active) smoking habit, life satisfaction, health status, as well as the consumption of breakfast, fast food, salty snack and beverage, having nutrition plan, the number of close friends, mothers' education level, body image, birth weight category, milk type used in infancy, and the family history of cancer and sudden death.

Measurements

In this study, age was categorized as 6–10, 11–14, and 15–19 years68. The screen time was considered as a categorical variable and consisted of the time spent on watching television (TV)/video and computer games during leisure time; ≤ 4 h was defined as low and > 4 h as high35. The sleep time (hours per week) was divided into three categories: ≤ 5 h (low), 5–8 h (moderate), and ≥ 8 h (high)69. The residence area was considered either urban or rural. The physical activity at school and out of school was quantified using principal component analysis (PCA). The obtained scores were then categorized into tertiles66. Variables including family assets, such as ownership of a house, car, and computer, the occupation and education level of parents, and school type (private/public), were summarized in one main PCA component. Students were then classified as having low, moderate, or high SES, based on the component tertiles. Active smoking was defined as using tobacco products (cigarettes, pipe, hookah, etc.) every day, while passive smoking was defined as exposure to tobacco smoke produced by others (second-hand smoke)66. Subjects with either passive or active smoking were considered smokers, and non-smokers otherwise. The general state of the participant's health was determined by the self-rated health (SRH) variable, asking “How would you describe your general state of health?” on the GSHS questionnaire, with the categories of “good,” “moderate,” and “bad”35. Life satisfaction was evaluated by asking about the degree of satisfaction with their life, using a ten-point scale from 10 = very satisfied to 1 = very dissatisfied. Scores below 6 signified low satisfaction, and high satisfaction otherwise70. 
Body image was assessed using the question, “What do you think regarding your body size?”; the answer to this question was obtained with the following options: “much too fat,” “a bit too fat,” “about the right size,” “a bit too thin,” and “much too thin.” For the analysis, the variable was divided into overweight (much too fat and a bit too fat), underweight (much too thin and a bit too thin), and normal weight cognition71. Breakfast consumption was categorized into three groups: non-skipper (those eating breakfast 5–7 days a week), semi-skipper (those eating breakfast 3–4 days a week), and skipper (those eating breakfast 0–2 days a week)22. The students were asked about the frequency of salty snack consumption, categorized as “seldom or never,” “weekly,” and “daily” consumption72. The family size was categorized as “less than or equal to 4” or “greater than 4”. The number of close friends was categorized as none, one, two, or three or more. The nutrition plan was assessed as “adherence to a weight-modifying plan based on a special diet” or not71. Sugar-sweetened beverage consumption (i.e., soda, soft drinks) was categorized as “daily,” “weekly,” and “seldom or never”72. The consumption of fast foods (pizza, fried chicken, cheeseburgers, hamburgers, and hot dogs) was categorized into three groups: daily, weekly, and seldom or never. The education level of mothers was categorized into three groups: illiterate, diploma, and university degree. Participants' birth weight (BW; g) was asked from their parents and then categorized into three groups: low (BW < 2500 g), normal (BW: 2500–4000 g), and high (BW > 4000 g)73. We also assessed whether the children and adolescents were breastfed during their infancy74, and the variable milk type was categorized as breast milk (1) or other (0). 
Moreover, we considered the family history of sudden death (yes or no) and also the family history of cancer (yes or no) of the first-degree relatives of the subjects enrolled in the study.
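Several of the cut-offs described in this section can be written down directly. The following is a minimal sketch; the function and argument names are hypothetical, and only thresholds stated in the text above are encoded (the sleep-time categories are omitted because their boundaries overlap as written).

```python
def screen_time_category(hours):
    """Leisure screen time: <= 4 h is low, > 4 h is high."""
    return "low" if hours <= 4 else "high"


def breakfast_category(days_per_week):
    """Breakfast eaten 5-7 days: non-skipper; 3-4 days: semi-skipper; 0-2: skipper."""
    if days_per_week >= 5:
        return "non-skipper"
    if days_per_week >= 3:
        return "semi-skipper"
    return "skipper"


def life_satisfaction_category(score):
    """Ten-point scale (10 = very satisfied): scores below 6 are low satisfaction."""
    return "low" if score < 6 else "high"


def birth_weight_category(grams):
    """Low (< 2500 g), normal (2500-4000 g), or high (> 4000 g) birth weight."""
    if grams < 2500:
        return "low"
    if grams <= 4000:
        return "normal"
    return "high"
```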

Anthropometric measurement

In our study, trained healthcare providers performed anthropometric measurements at school. All measurements were conducted with calibrated instruments, according to standard protocols66. Height was measured in the standing position, barefoot, with the shoulders touching the wall. It was recorded to the nearest 0.2 cm. Weight was measured without shoes and in light clothing to the nearest 200 g. Waist circumference (WC) was measured with a non-elastic tape to the nearest 0.2 cm.

We calculated the BMI as weight in kilograms divided by height in meters squared (kg/m²). The subjects were classified as underweight, healthy weight, or overweight or obese if the BMI was below the 5th percentile, between the 5th and 85th percentiles, or higher than the 85th percentile, respectively (i.e., BMI categories)75. Abdominal obesity was defined as a WC-to-height ratio (WHtR) of more than 0.5 (ref. 76).
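The two anthropometric indices can be sketched as follows. The function names are hypothetical, and the BMI percentile is assumed to be looked up beforehand from age- and sex-specific reference tables, which are not reproduced here.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by squared height in meters."""
    return weight_kg / height_m ** 2


def bmi_category(bmi_percentile):
    """Category from an age- and sex-specific BMI percentile (looked up elsewhere)."""
    if bmi_percentile < 5:
        return "underweight"
    if bmi_percentile <= 85:
        return "healthy weight"
    return "overweight or obese"


def has_abdominal_obesity(waist_cm, height_cm):
    """Waist-to-height ratio (WHtR) strictly above 0.5."""
    return waist_cm / height_cm > 0.5
```

For example, a student weighing 50 kg at a height of 1.60 m has a BMI of 50 / 2.56, which is about 19.5 kg/m².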

Feature extraction

The interval variables (e.g., age, BMI, birth weight, family size, number of close friends, sleep, and screen time) were first categorized using unsupervised discretization. Although discretization reduces the flexibility of interval variables, it could improve classification performance and generalization77. The input features in our study were thus entirely categorical. Their measurement scale was nominal (e.g., sex, smoking status, milk type, family history of sudden death, family history of cancer) or ordinal (e.g., age category, SES, physical activity level, BMI category). For each categorical variable, the weight-of-evidence encoding78 was used for obtaining continuous covariates. For each outcome variable, stability feature selection79 was then performed. The selected features were used in the following classification procedure.
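Weight-of-evidence encoding replaces each category by the log ratio of its share among positive cases to its share among negative cases. The sketch below is one common formulation; the sign convention, the additive smoothing constant, and the toy data are assumptions, since the text does not specify these details.

```python
from collections import Counter
from math import log


def woe_table(categories, labels, smoothing=0.5):
    """Map each category to ln(share among positives / share among negatives)."""
    pos = Counter(c for c, y in zip(categories, labels) if y == 1)
    neg = Counter(c for c, y in zip(categories, labels) if y == 0)
    levels = sorted(set(categories))
    # Additive smoothing keeps the logarithm finite for empty cells (assumption).
    n_pos = sum(pos.values()) + smoothing * len(levels)
    n_neg = sum(neg.values()) + smoothing * len(levels)
    return {c: log(((pos[c] + smoothing) / n_pos) / ((neg[c] + smoothing) / n_neg))
            for c in levels}


# Hypothetical self-rated health answers and a binary symptom label.
health = ["good", "good", "bad", "bad", "bad", "good"]
symptom = [0, 0, 1, 1, 0, 0]
table = woe_table(health, symptom)
encoded = [table[c] for c in health]  # continuous covariate fed to the classifier
```

Under this convention, a positive value marks a category that is over-represented among subjects with the symptom.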

Classification

Twenty-five input variables were used in GMDH (i.e., layer zero), while the outcomes depression, worriedness, and mild-to-moderate emotional symptoms were separately used as outputs. At the first layer, each pairwise interaction of the inputs was considered as a neuron. Suppose that the pair $x_{i,j}$ and $x_{i,k}$ (features no. j and k from subject no. i) are combined to generate the estimated outcome $\tilde{y}_i$ at the first layer using the second-order polynomial model shown in Eq. (1).

$$\tilde{y}_i = a_0 + a_1 x_{i,j} + a_2 x_{i,k} + a_3 x_{i,j}^2 + a_4 x_{i,k}^2 + a_5 x_{i,j} x_{i,k} \tag{1}$$

where the coefficients $A = [a_0, \ldots, a_5]^T$ could be estimated using Regularized Least Squares (RLS)80 on the entire estimation set in Eqs. (2,3).

$$A = \left(X^T X + \lambda I_6\right)^{-1} X^T Y \tag{2}$$

where

$$X_{N_e \times 6} = \begin{bmatrix} 1 & x_{1,j} & x_{1,k} & x_{1,j}^2 & x_{1,k}^2 & x_{1,j} x_{1,k} \\ 1 & x_{2,j} & x_{2,k} & x_{2,j}^2 & x_{2,k}^2 & x_{2,j} x_{2,k} \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & x_{N_e,j} & x_{N_e,k} & x_{N_e,j}^2 & x_{N_e,k}^2 & x_{N_e,j} x_{N_e,k} \end{bmatrix}, \quad Y_{N_e \times 1} = [y_1, y_2, \ldots, y_{N_e}]^T \tag{3}$$

and $\lambda$ is the regularization parameter, $y_i$ is the output of sample no. $i$ of the estimation set, and $I_6$ is the identity matrix of size six. In addition to the regularization, the training set was divided into the estimation set with $N_e$ samples and the validation set with $N_v$ samples to avoid over-fitting during learning. The regularization parameter was tuned using the brute-force search algorithm to maximize the Matthews correlation coefficient (MCC)81 on each output validation set.
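Equations (1) to (3) for a single neuron can be sketched in a few lines. This is a minimal illustration in plain Python rather than the authors' MATLAB code; the function names and the small Gaussian-elimination solver are assumptions introduced for the example.

```python
def design_row(xj, xk):
    """One row of X in Eq. (3): the six polynomial terms of Eq. (1)."""
    return [1.0, xj, xk, xj * xj, xk * xk, xj * xk]


def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_neuron(xj, xk, y, lam):
    """Coefficients A of Eq. (2): solve (X^T X + lam * I_6) A = X^T Y."""
    rows = [design_row(a, b) for a, b in zip(xj, xk)]
    gram = [[sum(r[a] * r[b] for r in rows) + (lam if a == b else 0.0)
             for b in range(6)] for a in range(6)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(6)]
    return solve(gram, xty)


def predict(coeffs, xj, xk):
    """Estimated outcome of Eq. (1) for one subject."""
    return sum(c * f for c, f in zip(coeffs, design_row(xj, xk)))
```

Solving the linear system directly avoids forming the matrix inverse in Eq. (2) explicitly, which is the usual numerical practice. In a tuning loop, `fit_neuron` would be called on the estimation set for each candidate value of the regularization parameter, and the value giving the best validation score would be kept.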

At the first layer, each neuron's RLS coefficients were estimated on the estimation set using the above procedure. Each neuron's performance was then assessed on the validation set, and the corresponding MCC values were calculated. At most ten neurons that performed better than the previous layer's neurons were selected, and their pairwise interactions were analyzed at the next layer. The network was built up layer by layer during training until the “early-stopping” criterion was met: whenever the validation set's performance decreased at the next layer, the output of the current layer's best neuron was selected as the output of the entire GMDH network. The presented algorithm was used for the binary classification problems. For the extended multi-class problem, the macro-averaged F1-score82 was used instead of the MCC as the fitness function.
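The neuron-selection step can be sketched as follows. The MCC formula is standard; the selection rule compares each candidate with the best validation MCC of the previous layer, which is one reading of the description above and should be treated as an assumption, as should the function names.

```python
from math import sqrt


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def select_neurons(validation_scores, previous_best, max_keep=10):
    """Keep at most max_keep neurons whose validation score beats the previous layer."""
    better = [(name, s) for name, s in validation_scores.items() if s > previous_best]
    better.sort(key=lambda item: item[1], reverse=True)
    return better[:max_keep]
```

If `select_neurons` returns an empty list, no neuron improved on the previous layer, which corresponds to the early-stopping condition.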

Comparison with the state-of-the-art

Other classifiers, namely linear discriminant analysis (LDA), multilayer perceptron (MLP), and support vector machines (SVM), were used for comparison. LDA is a base classifier used to identify whether the classes could be accurately identified using linear boundaries. SVM, on the other hand, constructs a hyperplane in a high-dimensional space. The nonlinear SVM with the radial basis function (RBF) kernel was used in our study. We tuned the RBF kernel radius and the soft-margin parameter using the method proposed by Wu and Wang83. MLP, a feed-forward fully-connected artificial neural network (ANN) model, maps a set of inputs onto an output. In our study, one hidden layer with ten neurons and the sigmoid activation function was used. The parameters of the network were tuned on the validation set.

The validation framework

In our study, threefold cross-validation with stratified sampling84 was used. The same test folds were used for different classifiers. In the GMDH and MLP classifiers, 75% of the training set was used for estimation, while 25% was used for validation. The performance indices in Supplementary Table S2 were reported for different classifiers. True Positive, False Positive, True Negative, and False Negative were calculated by comparing the classifiers' results and the gold standard in four classification problems (i.e., depression symptom, mild-to-moderate emotional symptoms, worriedness symptom, and finally psychiatric symptoms). The interpretation of the reference intervals of the indices AUC, K(C), MCC, and DP85 is listed in Supplementary Table S3. Moreover, following the STARD guideline86, the CI 95% of the performance indices were reported for the cross-validated confusion matrices. The performance of the proposed GMDH algorithm on the three-class extended problem was assessed based on the macro- and micro-averaged indices presented by Sokolova and Lapalme82.
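The data partitioning described above can be sketched as follows. The function names and the fixed random seed are assumptions, and the 75%/25% estimation/validation split is drawn at random without stratification here because the text does not say how it was made.

```python
import random


def stratified_folds(labels, n_folds=3, seed=0):
    """Assign sample indices to folds so each fold keeps the class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


def estimation_validation_split(train_indices, validation_share=0.25, seed=0):
    """Randomly split training indices into estimation and validation subsets."""
    shuffled = list(train_indices)
    random.Random(seed).shuffle(shuffled)
    n_validation = round(validation_share * len(shuffled))
    return shuffled[n_validation:], shuffled[:n_validation]


# Hypothetical imbalanced outcome: 6 positive and 24 negative subjects.
labels = [1] * 6 + [0] * 24
folds = stratified_folds(labels)
train = folds[1] + folds[2]          # fold 0 is held out as the test fold
estimation, validation = estimation_validation_split(train)
```

Reusing the same `folds` for every classifier keeps the test sets identical, which is what allows paired comparisons between classifiers.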

Statistical analysis

In this paper, subjects with complete information were used in the analysis. Only for comparison87 (Table 5), the Multivariate Imputation by Chained Equations (MICE) R-package88 was used to impute the ordinal data of the enrolled subjects. All variables were reported as frequency and percentage, since they were categorical. The χ2 analysis was used to compare the categorical variables in their categories. Spearman's rho was used as the correlation coefficient between interval-ordinal pairs. Kendall's τb was used as the correlation coefficient between ordinal-ordinal pairs. The phi coefficient and rank-biserial correlation coefficient were used for the binary-binary and binary-ordinal association, respectively40. Cochran's Q test with McNemar's post hoc test for pairwise comparison with Bonferroni correction was used to compare different classifiers' performance. A significance level of 0.05 was used in our analysis. MATLAB version 8.6 (The MathWorks Inc., Natick, MA, USA) was used for classification, while R version 4.0.0 (R Core Team (2020), https://www.R-project.org/) was used for data imputation. The statistical analysis was performed using the SPSS statistical package, version 18.0 (SPSS Inc., Chicago, IL, USA).
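The pairwise comparison of two classifiers can be illustrated with a minimal sketch of McNemar's test followed by a Bonferroni adjustment. The exact binomial form of the test is assumed here because the text does not state which variant was used, and the function names are hypothetical.

```python
from math import comb


def mcnemar_exact(correct_a, correct_b):
    """Two-sided exact McNemar p-value from per-subject correctness flags."""
    only_a = sum(1 for a, b in zip(correct_a, correct_b) if a and not b)
    only_b = sum(1 for a, b in zip(correct_a, correct_b) if b and not a)
    n = only_a + only_b  # discordant pairs are the only informative ones
    if n == 0:
        return 1.0
    k = min(only_a, only_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    return [min(1.0, p * len(p_values)) for p in p_values]
```

An adjusted p-value below 0.05 would then indicate that the two classifiers differ in their error patterns on the same test subjects.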

Acknowledgements

The authors would like to thank the CASPIAN team working on this national project and all students and their families participating in this project. The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Grant Agreement No. 712949 (TECNIOspring PLUS) and from the Agency for Business Competitiveness of the Government of Catalonia (TECSPR18-1-0017).

Author contributions

H.R.M., M.M., H.B., and R.K. participated in conceptualization. H.R.M., M.E.M., R.H., M.M., and R.K. contributed to data curation. H.R.M., Z.T., M.R.M., A.N., M.M., M.A.M., and H.B. participated in formal analysis. M.E.M., R.H., M.M., M.A.M., and R.H. contributed to funding acquisition. H.R.M., Z.T., M.R.M., A.N., M.H., and M.M. participated in the investigation. H.R.M., M.M., and H.B. were responsible for methodology. M.M., M.E.M., R.H., and R.K. were responsible for project administration and resources. H.R.M., Z.T., M.R.M., A.N., and M.M. developed the software. M.M., H.B., M.A.M., and R.K. supervised the project. H.R.M., M.H., and M.M. performed the validation. H.R.M., Z.T., M.R.M., and A.N. contributed to visualization. H.R.M., Z.T., M.R.M., and A.N. wrote the original draft, and M.H., M.E.M., R.H., M.M., M.A.M., H.B., and R.K. contributed to review & editing. All authors read and approved the final version of the manuscript and agreed on all the work aspects.

Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: Zahra Tasdighi, Mohammad Reza Mohebian and Azam Naghavi.

Contributor Information

Marjan Mansourian, Email: marjan.mansourian@upc.edu.

Roya Kelishadi, Email: kelishadi@med.mui.ac.ir.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-021-95208-y.

References

  • 1.Knifton L, Quinn N. Public Mental Health: Global Perspectives. McGraw-Hill Education; 2013. [Google Scholar]
  • 2.Mahoney LT, et al. Coronary risk factors measured in childhood and young adult life are associated with coronary artery calcification in young adults: The muscatine study. J. Am. Coll. Cardiol. 1996;27:277–284. doi: 10.1016/0735-1097(95)00461-0. [DOI] [PubMed] [Google Scholar]
  • 3.Emami H, Ghazinour M, Rezaeishiraz H, Richter J. Mental health of adolescents in Tehran, Iran. J. Adolesc. Health. 2007;41:571–576. doi: 10.1016/j.jadohealth.2007.06.005. [DOI] [PubMed] [Google Scholar]
  • 4.Omidi A, Tabatabayee A, Sazvar A, Akkasheh G. The epidemic logical study of psychiatric disorders in Nathanz, Isfahan. Andeesheh va Reftar J. 2003;8:32–38. [Google Scholar]
  • 5.Phillips MR. Is distress a symptom of mental disorders, a marker of impairment, both or neither? World Psychiatry. 2009;8:91–92. [PMC free article] [PubMed] [Google Scholar]
  • 6.Moksnes UK, Espnes GA. Self-esteem and emotional health in adolescents—gender and age as potential moderators. Scand. J. Psychol. 2012;53:483–489. doi: 10.1111/sjop.12021. [DOI] [PubMed] [Google Scholar]
  • 7.Clifton, D. & Fletcher, J. in Textbook of Palliative Care (eds. MacLeod, R. D. & Van den Block, L.) 1527–1562 (Springer, 2019).
  • 8.Sadock B, Ahmad S, Sadock V. Kaplan and Sadock's Pocket Handbook of Clinical Psychiatry. Wolters Kluwer; 2019. [Google Scholar]
  • 9.Patel V, Flisher AJ, Hetrick S, McGorry P. Mental health of young people: A global public-health challenge. Lancet (London, England) 2007;369:1302–1313. doi: 10.1016/s0140-6736(07)60368-7. [DOI] [PubMed] [Google Scholar]
  • 10.Charara R, et al. The burden of mental disorders in the eastern Mediterranean region, 1990–2013. PLoS ONE. 2017;12:e0169575. doi: 10.1371/journal.pone.0169575. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Maldonado L, et al. Impact of early adolescent anxiety disorders on self-esteem development from adolescence to young adulthood. J. Adolesc. Health. 2013;53:287–292. doi: 10.1016/j.jadohealth.2013.02.025. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Van Droogenbroeck F, Spruyt B, Keppens G. Gender differences in mental health problems among adolescents and the role of social support: Results from the Belgian health interview surveys 2008 and 2013. BMC Psychiatry. 2018;18:6. doi: 10.1186/s12888-018-1591-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Jorm AF, et al. Age group differences in psychological distress: The role of psychosocial risk factors that vary with age. Psychol. Med. 2005;35:1253–1263. doi: 10.1017/s0033291705004976. [DOI] [PubMed] [Google Scholar]
  • 14.Costello EJ, Mustillo S, Erkanli A, Keeler G, Angold A. Prevalence and development of psychiatric disorders in childhood and adolescence. Arch. Gen. Psychiatry. 2003;60:837–844. doi: 10.1001/archpsyc.60.8.837. [DOI] [PubMed] [Google Scholar]
  • 15.Taheri E, et al. Association of Physical activity and screen time with psychiatric distress in children and adolescents: CASPIAN-IV Study. J. Trop. Pediatr. 2019;65:361–372. doi: 10.1093/tropej/fmy063. [DOI] [PubMed] [Google Scholar]
  • 16.Grinde B, Tambs K. Effect of household size on mental problems in children: Results from the Norwegian Mother and Child Cohort study. BMC Psychol. 2016;4:31–31. doi: 10.1186/s40359-016-0136-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Okwaraji FE, Obiechina KI, Onyebueke GC, Udegbunam ON, Nnadum GS. Loneliness, life satisfaction and psychological distress among out-of-school adolescents in a Nigerian urban city. Psychol. Health Med. 2018;23:1106–1112. doi: 10.1080/13548506.2018.1476726. [DOI] [PubMed] [Google Scholar]
  • 18.Gong Y, Palmer S, Gallacher J, Marsden T, Fone D. A systematic review of the relationship between objective measurements of the urban environment and psychological distress. Environ. Int. 2016;96:48–57. doi: 10.1016/j.envint.2016.08.019. [DOI] [PubMed] [Google Scholar]
  • 19.Kosidou K, et al. Socioeconomic status and risk of psychological distress and depression in the Stockholm Public Health Cohort: A population-based study. J. Affect. Disord. 2011;134:160–167. doi: 10.1016/j.jad.2011.05.024. [DOI] [PubMed] [Google Scholar]
  • 20.Cano A, et al. Family support, self-rated health, and psychological distress. Prim. Care Companion J. Clin. Psychiatry. 2003;5:111–117. doi: 10.4088/pcc.v05n0302. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Zakeri M, Sedaghat M, Motlagh ME, Ashtiani RT, Ardalan G. BMI correlation with psychiatric problems among 10–18 years Iranian students. Acta Med. Iran. 2012;50:177. [PubMed] [Google Scholar]
  • 22.Ahadi Z, et al. Association of breakfast intake with psychiatric distress and violent behaviors in Iranian children and adolescents: The CASPIAN-IV study. Indian J. Pediatrics. 2016;83:922–929. doi: 10.1007/s12098-016-2049-7. [DOI] [PubMed] [Google Scholar]
  • 23.Friedman KE, Reichmann SK, Costanzo PR, Musante GJ. Body image partially mediates the relationship between obesity and psychological distress. Obes. Res. 2002;10:33–41. doi: 10.1038/oby.2002.5. [DOI] [PubMed] [Google Scholar]
  • 24.Pengpid S, Peltzer K. Prevalence and associated factors of psychological distress among a national sample of in-school adolescents in Morocco. BMC Psychiatry. 2020;20:475. doi: 10.1186/s12888-020-02888-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Lasikiewicz N, Myrissa K, Hoyland A, Lawton CL. Psychological benefits of weight loss following behavioural and/or dietary weight loss interventions. A systematic research review. Appetite. 2014;72:123–137. doi: 10.1016/j.appet.2013.09.017. [DOI] [PubMed] [Google Scholar]
  • 26.Zahedi H, et al. Association between junk food consumption and mental health in a national sample of Iranian children and adolescents: The CASPIAN-IV study. Nutrition. 2014;30:1391–1397. doi: 10.1016/j.nut.2014.04.014. [DOI] [PubMed] [Google Scholar]
  • 27.Lien L, Lien N, Heyerdahl S, Thoresen M, Bjertness E. Consumption of soft drinks and hyperactivity, mental distress, and conduct problems among adolescents in Oslo, Norway. Am. J. Public Health. 2006;96:1815–1820. doi: 10.2105/AJPH.2004.059477. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Zvolensky MJ, et al. Psychological distress among smokers in the United States: 2008–2014. Nicotine Tob. Res. 2018;20:707–713. doi: 10.1093/ntr/ntx099. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Rivenes AC, Harvey SB, Mykletun A. The relationship between abdominal fat, obesity, and common mental disorders: Results from the HUNT Study. J. Psychosom. Res. 2009;66:269–275. doi: 10.1016/j.jpsychores.2008.07.012. [DOI] [PubMed] [Google Scholar]
  • 30.Assari S. Parental educational attainment and mental well-being of college students; diminished returns of blacks. Brain Sci. 2018;8:193. doi: 10.3390/brainsci8110193. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Wiles NJ, Peters TJ, Leon DA, Lewis G. Birth weight and psychological distress at age 45–51 years: Results from the Aberdeen Children of the 1950s cohort study. Brit. J. Psychiatry. 2005;187:21–28. doi: 10.1192/bjp.187.1.21. [DOI] [PubMed] [Google Scholar]
  • 32.Krol KM, Grossmann T. Psychological effects of breastfeeding on children and mothers. Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz. 2018;61:977–985. doi: 10.1007/s00103-018-2769-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Keyes KM, et al. The burden of loss: unexpected death of a loved one and psychiatric disorders across the life course in a national study. Am. J. Psychiatry. 2014;171:864–871. doi: 10.1176/appi.ajp.2014.13081132. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Liu Y, Cao C. The relationship between family history of cancer, coping style and psychological distress. Pak. J. Med. Sci. 2014;30:507–510. doi: 10.12669/pjms.303.4634. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Kelishadi R, et al. Relationship between leisure time screen activity and aggressive and violent behaviour in Iranian children and adolescents: The CASPIAN-IV study. Paediatrics Int. Child Health. 2015;35:305–311. doi: 10.1080/20469047.2015.1109221. [DOI] [PubMed] [Google Scholar]
  • 36.Ahadi Z, et al. Regional disparities in psychiatric distress, violent behavior, and life satisfaction in Iranian adolescents: The CASPIAN-III study. J. Dev. Behav. Pediatr. 2014;35:582–590. doi: 10.1097/dbp.0000000000000103. [DOI] [PubMed] [Google Scholar]
  • 37.Ivakhnenko AG. Heuristic self-organization in problems of engineering cybernetics. Automatica. 1970;6:207–219. doi: 10.1016/0005-1098(70)90092-0. [DOI] [Google Scholar]
  • 38.Vellido A. The importance of interpretability and visualization in machine learning for applications in medicine and health care. Neural Comput. Appl. 2019 doi: 10.1007/s00521-019-04051-w. [DOI] [Google Scholar]
  • 39.Jakobsen JC, Gluud C, Wetterslev J, Winkel P. When and how should multiple imputation be used for handling missing data in randomised clinical trials—a practical guide with flowcharts. BMC Med. Res. Methodol. 2017;17:162. doi: 10.1186/s12874-017-0442-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Khamis H. Measures of association: How to choose? J. Diagnostic Med. Sonogr. 2008;24:155–162. doi: 10.1177/8756479308317006. [DOI] [Google Scholar]
  • 41.Fleiss JL, Levin B, Paik MC. Statistical Methods for Rates and Proportions. 3. Wiley; 2003. [Google Scholar]
  • 42. Naicker K, Galambos NL, Zeng Y, Senthilselvan A, Colman I. Social, demographic, and health outcomes in the 10 years following adolescent depression. J. Adolesc. Health. 2013;52:533–538. doi: 10.1016/j.jadohealth.2012.12.016.
  • 43. Hilger-Kolb J, Diehl K, Herr R, Loerbroks A. Effort-reward imbalance among students at German universities: Associations with self-rated health and mental health. Int. Arch. Occup. Environ. Health. 2018;91:1011–1020. doi: 10.1007/s00420-018-1342-3.
  • 44. Rushton JL, Forcier M, Schectman RM. Epidemiology of depressive symptoms in the national longitudinal study of adolescent health. J. Am. Acad. Child Adolesc. Psychiatry. 2002;41:199–205. doi: 10.1097/00004583-200202000-00014.
  • 45. van Minnen A, Hendriks L, Olff M. When do trauma experts choose exposure therapy for PTSD patients? A controlled study of therapist and patient factors. Behav. Res. Ther. 2010;48:312–320. doi: 10.1016/j.brat.2009.12.003.
  • 46. Anderson E, Shivakumar G. Effects of exercise and physical activity on anxiety. Front. Psychiatry. 2013;4:27. doi: 10.3389/fpsyt.2013.00027.
  • 47. Wang B, et al. Developmental trajectories of sleep problems from childhood to adolescence both predict and are predicted by emotional and behavioral problems. Front. Psychol. 2016;7:1874. doi: 10.3389/fpsyg.2016.01874.
  • 48. Sarchiapone M, et al. Hours of sleep in adolescents and its association with anxiety, emotional concerns, and suicidal ideation. Sleep Med. 2014;15:248–254. doi: 10.1016/j.sleep.2013.11.780.
  • 49. Ramamurthy MB, et al. Effect of current breastfeeding on sleep patterns in infants from Asia-Pacific region. J. Paediatr. Child Health. 2012;48:669–674. doi: 10.1111/j.1440-1754.2012.02453.x.
  • 50. Poton WL, Soares ALG, Oliveira ERA, Gonçalves H. Breastfeeding and behavior disorders among children and adolescents: A systematic review. Rev Saude Publica. 2018;52:9. doi: 10.11606/S1518-8787.2018052000439.
  • 51. Stiglic N, Viner RM. Effects of screentime on the health and well-being of children and adolescents: A systematic review of reviews. BMJ Open. 2019;9:e023191. doi: 10.1136/bmjopen-2018-023191.
  • 52. Hamer M, Stamatakis E, Mishra G. Psychological distress, television viewing, and physical activity in children aged 4 to 12 years. Pediatrics. 2009;123:1263–1268. doi: 10.1542/peds.2008-1523.
  • 53. Oellingrath IM, Svendsen MV, Hestetun I. Eating patterns and mental health problems in early adolescence—A cross-sectional study of 12–13-year-old Norwegian schoolchildren. Public Health Nutr. 2014;17:2554–2562. doi: 10.1017/S1368980013002747.
  • 54. Rampersaud GC, Pereira MA, Girard BL, Adams J, Metzl JD. Breakfast habits, nutritional status, body weight, and academic performance in children and adolescents. J. Am. Diet. Assoc. 2005;105:743–760. doi: 10.1016/j.jada.2005.02.007.
  • 55. Holtzman NS. To Skip or Not to Skip? Varying Definitions of Breakfast Skipping and Associations with Disordered Eating, Obesity, and Depression. Citeseer; 2010.
  • 56. Gaebel W, et al. Accuracy of diagnostic classification and clinical utility assessment of ICD-11 compared to ICD-10 in 10 mental disorders: Findings from a web-based field study. Eur. Arch. Psychiatry Clin. Neurosci. 2020;270:281–289. doi: 10.1007/s00406-019-01076-z.
  • 57. Aurelio YS, de Almeida GM, de Castro CL, Braga AP. Learning from imbalanced data sets with weighted cross-entropy function. Neural Process. Lett. 2019;50:1937–1949. doi: 10.1007/s11063-018-09977-1.
  • 58. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders (DSM-5®). American Psychiatric Pub; 2013.
  • 59. Arsang-Jang S, Kelishadi R, Esmail Motlagh M, Heshmat R, Mansourian M. Temporal trend of non-invasive method capacity for early detection of metabolic syndrome in children and adolescents: A Bayesian multilevel analysis of pseudo-panel data. Ann. Nutr. Metab. 2019;75:55–65. doi: 10.1159/000500274.
  • 60. Momtazi S, Rawson R. Substance abuse among Iranian high school students. Curr. Opin. Psychiatry. 2010;23:221–226. doi: 10.1097/YCO.0b013e328338630d.
  • 61. Mansourian M, Khademi S, Marateb HR. A comprehensive review of computer-aided diagnosis of major mental and neurological disorders and suicide: A biostatistical perspective on data mining. Diagnostics (Basel, Switzerland). 2021;11:393. doi: 10.3390/diagnostics11030393.
  • 62. Kelishadi R, et al. Methodology and early findings of the fourth survey of childhood and adolescence surveillance and prevention of adult non-communicable disease in Iran: The CASPIAN-IV Study. Int. J. Prev. Med. 2013;4:1451–1460.
  • 63. Kelishadi R, et al. Joint association of active and passive smoking with psychiatric distress and violence behaviors in a representative sample of Iranian children and adolescents: the CASPIAN-IV Study. Int. J. Behav. Med. 2015;22:652–661. doi: 10.1007/s12529-015-9462-6.
  • 64. Ziaei R, et al. Reliability and validity of the Persian version of Global School-based Student Health Survey adapted for Iranian school students. J. Clin. Res. Gov. 2014;3:134–140.
  • 65. Heshmat R, et al. Association of socioeconomic status with psychiatric problems and violent behaviours in a nationally representative sample of Iranian children and adolescents: The CASPIAN-IV study. BMJ Open. 2016;6:e011615. doi: 10.1136/bmjopen-2016-011615.
  • 66. Djalalinia S, et al. Association of breast feeding and birth weight with anthropometric measures and blood pressure in children and adolescents: The CASPIAN-IV study. Pediatr. Neonatol. 2015;56:324–333. doi: 10.1016/j.pedneo.2015.01.004.
  • 67. Alonso SG, et al. Data mining algorithms and techniques in mental health: A systematic review. J. Med. Syst. 2018;42:161. doi: 10.1007/s10916-018-1018-2.
  • 68. Jari M, et al. A nationwide survey on the daily screen time of Iranian children and adolescents: The CASPIAN-IV study. Int. J. Prev. Med. 2014;5:224–229.
  • 69. Azadbakht L, et al. The association of sleep duration and cardiometabolic risk factors in a national sample of children and adolescents: the CASPIAN III study. Nutrition. 2013;29:1133–1141. doi: 10.1016/j.nut.2013.03.006.
  • 70. Mirmoghtadaee P, et al. The association of socioeconomic status of family and living region with self-rated health and life satisfaction in children and adolescents: The CASPIAN-IV study. Med. J. Islamic Republic of Iran (MJIRI) 2016;30:891–898.
  • 71. Bahreynian M, et al. Association of perceived weight status versus body mass index on adherence to weight-modifying plan among Iranian children and adolescents: The CASPIAN-IV Study. Indian Pediatr. 2015;52:857–863. doi: 10.1007/s13312-015-0732-9.
  • 72. Payab M, et al. Association of junk food consumption with high blood pressure and obesity in Iranian children and adolescents: the CASPIAN-IV Study. Jornal de Pediatria (Versão em Português) 2015;91:196–205. doi: 10.1016/j.jpedp.2014.07.008.
  • 73. Ansari H, et al. Association of birth weight with abdominal obesity and weight disorders in children and adolescents: The weight disorder survey of the CASPIAN-IV Study. J. Cardiovasc. Thorac. Res. 2017;9:140–146. doi: 10.15171/jcvtr.2017.24.
  • 74. Kelishadi R, Farajian S. The protective effects of breastfeeding on chronic non-communicable diseases in adulthood: A review of evidence. Adv Biomed Res. 2014;3:3. doi: 10.4103/2277-9175.124629.
  • 75. Bahreynian M, et al. Association between obesity and parental weight status in children and adolescents. J. Clin. Res. Pediatr. Endocrinol. 2017;9:111–117. doi: 10.4274/jcrpe.3790.
  • 76. Esmaili H, et al. Prevalence of general and abdominal obesity in a nationally representative sample of Iranian children and adolescents: the CASPIAN-IV study. Iran. J. Pediatr. 2015;25:1–5. doi: 10.5812/ijp.25(3)2015.401.
  • 77. Lustgarten JL, Gopalakrishnan V, Grover H, Visweswaran S. Improving classification performance with discretization on biomedical datasets. AMIA Annu. Symp. Proc. 2008;2008:445–449.
  • 78. Rusiñol de Rueda M. Two-Layer Feed Forward Neural Network (TLFN) in Predicting Loan Default Probability. Universitat Politècnica de Catalunya; 2019.
  • 79. Meinshausen N, Bühlmann P. Stability selection. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 2010;72:417–473. doi: 10.1111/j.1467-9868.2010.00740.x.
  • 80. Beck A. Introduction to Nonlinear Optimization: Theory, Algorithms, and Applications with MATLAB. Society for Industrial and Applied Mathematics; 2014.
  • 81. Chicco D, Jurman G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genom. 2020;21:6. doi: 10.1186/s12864-019-6413-7.
  • 82. Sokolova M, Lapalme G. A systematic analysis of performance measures for classification tasks. Inf. Process. Manage. 2009;45:427–437. doi: 10.1016/j.ipm.2009.03.002.
  • 83. Wu K-P, Wang S-D. Choosing the kernel parameters for support vector machines by the inter-cluster distance in the feature space. Pattern Recogn. 2009;42:710–717. doi: 10.1016/j.patcog.2008.08.030.
  • 84. Kohavi R. A study of cross-validation and bootstrap for accuracy estimation and model selection. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, Vol. 2, 1137–1143 (Morgan Kaufmann Publishers Inc., Montreal, Quebec, Canada, 1995).
  • 85. Sokolova M, Japkowicz N, Szpakowicz S. Beyond accuracy, F-score and ROC: A family of discriminant measures for performance evaluation. In AI 2006: Advances in Artificial Intelligence (eds. Sattar A, Kang B-H) 1015–1021 (Springer, Berlin, Heidelberg, 2006).
  • 86. Bossuyt PM, et al. STARD 2015: An updated list of essential items for reporting diagnostic accuracy studies. BMJ Brit. Med. J. 2015;351:h5527. doi: 10.1136/bmj.h5527.
  • 87. Sterne JA, et al. Multiple imputation for missing data in epidemiological and clinical research: Potential and pitfalls. BMJ (Clin. Res. ed.) 2009;338:b2393. doi: 10.1136/bmj.b2393.
  • 88. van Buuren S, Groothuis-Oudshoorn K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 2011;45(3):1–67.

Supplementary Materials

Supplementary Tables. (21.9KB, docx)

Data Availability Statement

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.


Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
