F1000Research. 2022 Apr 4;11:390. [Version 1] doi: 10.12688/f1000research.110090.1

Machine learning techniques for predicting depression and anxiety in pregnant and postpartum women during the COVID-19 pandemic: a cross-sectional regional study

Radwan Qasrawi 1,2,a, Malak Amro 1, Stephanny VicunaPolo 1, Diala Abu Al-Halawa 3, Hazem Agha 3, Rania Abu Seir 4, Maha Hoteit 5,6,7, Reem Hoteit 8, Sabika Allehdan 9, Nouf Behzad 10, Khlood Bookari 11,12, Majid AlKhalaf 12, Haleama Al-Sabbah 13, Eman Badran 14, Reema Tayyem 15,16
PMCID: PMC9445566  PMID: 36111217

Abstract

Background: Maternal depression and anxiety are significant public health concerns that play an important role in the health and well-being of mothers and children. The COVID-19 pandemic, the consequential lockdowns and related safety restrictions worldwide negatively affected the mental health of pregnant and postpartum women.

Methods: This regional study aimed to develop a machine learning (ML) model for the prediction of maternal depression and anxiety. The study used a dataset collected during the COVID-19 pandemic, between July and December 2020. The population sample included 3569 women (1939 pregnant and 1630 postpartum) from five Arab countries (Jordan, Palestine, Lebanon, Saudi Arabia, and Bahrain). The performance of seven machine learning algorithms was assessed for the prediction of depression and anxiety symptoms.

Results: The Gradient Boosting (GB) and Random Forest (RF) models outperformed the other studied ML algorithms, with accuracy values of 83.3% and 83.2% for depression, respectively, and 82.9% and 81.3% for anxiety, respectively. The Matthews Correlation Coefficient (MCC) was also evaluated for the ML models; the Naïve Bayes (NB) and GB models presented the highest values (0.63 and 0.59) for depression and (0.74 and 0.73) for anxiety, respectively. The feature importance ranking showed that stress during pregnancy, family support, financial issues, income, and social support were the most important features in predicting anxiety and depression.

Conclusion: Overall, the study evidenced the power of ML models in predicting maternal depression and anxiety and proved to be an efficient tool for identifying and predicting the associated risk factors that influence maternal mental health. The deployment of machine learning models for screening and early detection of depression and anxiety among pregnant and postpartum women might facilitate the development of health prevention and intervention programs that will enhance maternal and child health in low- and middle-income countries.

Keywords: Machine Learning, Anxiety, Depression, Pregnancy, COVID-19, Random Forest

Introduction

The emergence of the Coronavirus disease (COVID-19) in late 2019 and early 2020 has severely impacted the global population. COVID-19 is an infectious disease that spreads primarily through droplets of saliva or nasal discharge 1 ; its infection rate is high and its consequences can be lethal 24 . The disease is particularly dangerous to vulnerable populations, such as the elderly and those with underlying medical conditions including cardiovascular disease, diabetes, respiratory disease, and cancer 35 . Nonetheless, the specific implications of COVID-19 infection on pregnancy and childbirth have remained unidentified throughout the pandemic 69 .

The uncertainty about the nature, transmission, and mortality of the virus, together with its rapid spread and the consequential social and mobility restrictions (quarantines, lockdowns, and social distancing), have impacted the mental health of pregnant women worldwide 5, 10 . In fact, the psychological effects of COVID-19 on pregnant women may lead to the appearance or worsening of stress, anxiety, and depression symptoms, as indicated in the study by Broche-Perez et al. 11 . In a 2020 study, Tokgoz et al. 12 demonstrated that pregnant women during the COVID-19 pandemic presented higher rates of depression, stress, and anxiety than pregnant women before the pandemic. The study further evidenced that mental health disorders during pregnancy can result in pre-term labour, low birth weight, delayed neuropsychiatric development in children, preeclampsia, and unscheduled caesarean delivery 11, 13 .

However, mental health disorders among pregnant women are widely undiagnosed and could result in worse consequences for mother and child 14, 15 . Furthermore, the traditionally applied screening programs for psychological conditions rely on self-reporting and are for the most part designed to detect the population with pre-existing symptoms 6, 16 . Contrary to the available traditional assessment of psychological disorders, artificial intelligence (AI) models can predict potential incidence of depression and anxiety among pregnant women, which would facilitate pre-emptive action, treatment, and early diagnosis 16, 17 . As a matter of fact, one subarea of AI, machine learning (ML), has previously been used in the field of mental health for the prediction of psychological conditions such as anxiety, depression, obsessive-compulsive disorder (OCD), and post-traumatic stress disorder (PTSD), both prior to and during the onset of the COVID-19 pandemic 18, 19 .

In a 2020 study by Seah et al. 20 , five ML algorithms were applied for the prediction of anxiety, depression, and stress in individuals around the world using the Depression, Anxiety and Stress Scale questionnaire (DASS-21). The Random Forest classifier, a machine learning algorithm used for data classification, had the best accuracy in predicting psychological conditions. A study by Priya et al. 21 used ML tools to create a new diagnostic methodology for anxiety and depression to replace traditional diagnosis through self-reported symptoms. The tool represented an improvement in diagnosis and treatment. Similarly, a study by Richter et al. 22 applied eight ML algorithms, including a hybrid model, for the prediction of psychological problems such as anxiety, depression, and stress. The study found that the hybrid model presented higher accuracy rates than the single algorithms used. In relation to maternal health, only a few studies have used ML models for the prediction of psychological disorders 16, 19, 23, 24 . Among the available studies in this field, Shin et al. 24 developed a predictive model for postpartum depression using nine different ML approaches; the results showed that the Random Forest model achieved the highest accuracy rates. In addition, the study by Hochman et al. 16 provided evidence that ML models are able to accurately screen and identify populations at high risk of postpartum depression for preventive intervention.

Thus, the use of ML techniques in mental health prediction and diagnosis might yield positive results in the reduction of self-harm and the provision of timely treatment for at-risk patients. However, very few studies have used ML in maternal mental health, especially in relation to COVID-19 11, 16, 19, 24 . This study aims to enrich the literature by assessing the performance of ML techniques in studying the effect of the COVID-19 lockdown on maternal mental health in low- and middle-income countries. The study used ML to predict depression and anxiety symptoms from different features during the COVID-19 lockdown. To date, this is the first international study using population-based datasets from five countries (Palestine, Lebanon, Jordan, Saudi Arabia, and Bahrain) that accounts for multiple maternal and mental health variables.

Methods

Ethics

The study obtained written approval of the Ethics Committee in Scientific Research of University of Jordan, Jordan (19/2020/585), as well as universities from all participating countries. Written informed consent was obtained from all participants.

Data set. This study is the first of its kind as it utilized a regional dataset for evaluating the performance of ML algorithms in predicting depression and anxiety among pregnant and postpartum women during the COVID-19 lockdown in five Arab countries. The stratification of participants into different sets of data according to country of residence and the overall prediction for the total participants provide important and interesting information about the effect of the COVID-19 lockdown on pregnant women. A total of 3,569 women (1,939 pregnant and 1,630 postpartum) from five countries (Jordan, Palestine, Lebanon, Saudi Arabia, and Bahrain) participated in the study. Data were collected during the period of lockdown from July to December 2020.

The data set was extracted from a regional study conducted by the authors to assess the impact of the COVID-19 pandemic on the physical and mental health of pregnant and postpartum women. The study collected data from five Arab countries: Lebanon, Palestine, Jordan, Bahrain, and Saudi Arabia. A total of 3569 women who were pregnant at the time of the survey or had been pregnant during the COVID-19 pandemic lockdown were included in this study. The data set, which includes socio-demographic variables and risk factors related to depression, anxiety, and physical and mental health among pregnant women, is described in Table 1.

Table 1. Descriptions of the machine learning model variables.

Variable | Description | Value
Country | Country of residence | Jordan; Palestine; Lebanon; Saudi Arabia; Bahrain
Locality | Place of residence | Urban; Non-urban
Age | Women’s age | 0) ≤20; 1) 21-29; 2) 30-39; 4) 40+
Marriage age | Women’s age at marriage | 0) ≤18; 1) 19-25; 2) 26-30; 3) 31+
W_education | Women’s education level | 0) ≤ Secondary; 1) Diploma; 2) Bachelor +
Preg_freq | Number of pregnancies | One; Two; Three +
Abortion_freq | Number of miscarriages | Zero; One; Two +
Income | Family income during COVID-19 | 1) Decreased; 2) Increased; 3) Remained the same
W_work | Women’s working status | Yes; No
Pha | Physical activity level | Inactive (<1/2 hour); Active (>=1/2 hour)
Ta | Technology activities | Yes; No
Healthy intake | Healthy food consumption | Yes; No
Unhealthy intake | Unhealthy food consumption | Yes; No
Diagnosed_C19 | COVID-19 diagnosis | Yes; No
Relatives_diagnosed | One of your relatives diagnosed with COVID-19 | Yes; No
Health_prob | Has any health problem | Yes; No
Cancelling app | Cancelling clinic visit due to lockdown | Yes; No
Smoking | Are you smoking? | Yes; No
Sleeping | Sleeping hours during pandemic (hours per day) | 0) <6; 1) 6-8; 2) >8
Preg_depression | Depression level | Low; Moderate; High
Preg_anxiety | Anxiety level | Low; Moderate; High
Preg_stress | Stress level | Low; Moderate; High

Questionnaire

A cross-sectional study design was used to collect the study data during the COVID-19 pandemic, from August to November 2020, in five countries in the Arab region (Lebanon, Palestine, Jordan, Bahrain, and Saudi Arabia). The snowball sampling method was used to recruit pregnant women. The initial participants were contacted through the research team’s professional network and the obstetric and maternity clinics in the participating countries. Data were collected through a web-based questionnaire, which was previously validated in two published studies 25, 26 ; the software was designed by the Palestinian National Nutrition Platform. The questionnaire was disseminated by researchers through their social media networks (Facebook, WhatsApp, and Instagram) and the participating universities’ networks. Furthermore, hard copies of the questionnaire were distributed through obstetric and maternity clinics to women in some areas with limited internet access. The survey covered several sociodemographic features of pregnant women, medical history, nutrition patterns, physical activity, smoking, education, residency, economic situation, anxiety indicators, and depression indicators. Questions regarding the pre-pregnancy period were not included in the survey. For the full questionnaire, see Extended data 27 .

Criteria

The following criteria guided the data collection process: (i) pregnancy during the COVID-19 pandemic period; (ii) normal pregnancy (i.e., no complications); (iii) aged over 18; (iv) place of residence (the five study countries); (v) having answered all questions in the questionnaire. Moreover, the exclusion criteria included conception during the intra-COVID-19 pandemic period, as well as risk factors such as miscarriage and chronic health complications.

Outcome variables. The outcome variables included pregnant women's depression and anxiety levels. Participants were assessed for depression and anxiety using the Patient Health Questionnaire (PHQ-9) and Generalized Anxiety Disorder (GAD-7) scales.

Depression: The depression data were collected using the Patient Health Questionnaire (PHQ), a self-reported scale 28 designed to screen for symptoms of depression. Each PHQ item has four answer categories (never = 0; several days = 1; more than half of the days = 2; nearly every day = 3). The total score was calculated by summing the item responses. The PHQ total score was classified into the following groups: Low = 0; Moderate = 1; High = 2.

Anxiety: The GAD-7 scale 28 was used for measuring generalized anxiety disorder. Each item was scored with the same four response categories (never = 0; several days = 1; more than half of the days = 2; nearly every day = 3). The total score was calculated by summing the item responses. The GAD-7 total score was classified into the following groups: Low = 0; Moderate = 1; High = 2.
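The scoring procedure described for the PHQ-9 and GAD-7 can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the cut-offs separating the Low, Moderate, and High groups are not reported in the text, so they are passed in as explicit assumptions.

```python
from typing import Sequence

# Response coding taken from the text: never=0, several days=1,
# more than half of the days=2, nearly every day=3.
VALID_RESPONSES = {0, 1, 2, 3}


def total_score(item_responses: Sequence[int], n_items: int) -> int:
    """Sum one participant's item responses (PHQ-9: 9 items, GAD-7: 7 items)."""
    if len(item_responses) != n_items:
        raise ValueError(f"expected {n_items} items, got {len(item_responses)}")
    if any(r not in VALID_RESPONSES for r in item_responses):
        raise ValueError("each response must be coded 0, 1, 2, or 3")
    return sum(item_responses)


def severity_class(score: int, moderate_from: int, high_from: int) -> int:
    """Map a total score to Low=0, Moderate=1, High=2.

    The cut-offs are caller-supplied assumptions; the paper does not state them.
    """
    if score >= high_from:
        return 2
    if score >= moderate_from:
        return 1
    return 0


# One hypothetical participant; the cut-offs 10 and 15 are illustrative only.
phq9 = [1, 2, 0, 3, 1, 1, 2, 0, 1]
score = total_score(phq9, n_items=9)
label = severity_class(score, moderate_from=10, high_from=15)
print(score, label)
```

Keeping the cut-offs as parameters makes it easy to re-run the classification under whichever thresholds a given study adopts.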

Features. All potential features (predictors), including associated risk factors and socio-demographic variables were considered in the ML models for the assessment of pregnant women before and during the COVID-19 lockdown. The socio-demographic features included women’s age, age at marriage, country of residence, education level, work status, family income, and locality.

Associated risk factors included pre- and post-COVID-19 pandemic food consumption patterns, smoking status, body mass index (BMI), healthy food consumption (fruits, vegetables, meat, grains, and dairy products), unhealthy food consumption (sweets, soft drinks, energy drinks, and fast food), physical activity level, technology-related activities, COVID-19 diagnosis, relatives diagnosed with COVID-19, underlying health conditions (diabetes, gestational diabetes, hypertension, gestational hypertension, heart and arterial diseases, liver diseases, high cholesterol, high triglycerides, thyroid disorders, or respiratory problems), cancellation of follow-up appointments due to COVID-19, number of pregnancies, number of abortions, family problems, social problems, psychological stress, and work-related stress.

Data analysis. General descriptive analysis and ANOVA tests were used for describing the distribution of women based on risk factors, while prediction and classifications were measured using ML techniques. The classification accuracy, confusion matrix, precision, sensitivity, and specificity were used for evaluating the ML prediction performance. The ML algorithms were applied in the Python AI development platform to predict the incidence and severity level of depression and anxiety during COVID-19 among pregnant women. The data set was divided into a 70:20:10 ratio for training, testing and validation.
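As an illustration of the 70:20:10 partition mentioned above, the following minimal sketch shuffles the records and cuts them into three parts using only the Python standard library. The function name and the seed are hypothetical, and the text does not say whether the authors stratified the split by outcome class, so no stratification is attempted here.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_70_20_10(rows: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle rows and cut them into training (70%), testing (20%), validation (rest)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = len(shuffled) * 7 // 10
    n_test = len(shuffled) * 2 // 10
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # remainder, roughly 10%
    return train, test, validation


# Row indices stand in for the 3569 participant records.
train, test, validation = split_70_20_10(range(3569))
print(len(train), len(test), len(validation))  # 2498 713 358
```

With class proportions as uneven as those reported for the high-severity groups, a stratified split would be a reasonable alternative.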

To evaluate whether the ML algorithms can predict pregnant women's depression and anxiety, the outcome variables and features were included in the ML models. The performance of the ML models was evaluated first for depression and then, separately, for anxiety. The performance metrics for Gradient Boosting Machines (GB), Distributed Random Forests (RF), Extreme Randomized Forests (XRT), Naïve Bayes (NB), Support Vector Machine (SVM), Multilayer Neural Network (MNN), and Decision Tree (DT) are presented in the results. Accuracy, precision, Area Under the Curve (AUC), the Matthews Correlation Coefficient (MCC), and the Receiver Operating Characteristic (ROC) curve were used to measure performance.

Results

Table 2 and Table 3 show the descriptive analysis of the participants’ data by anxiety and depression levels. Results indicated that the women’s mean age was 28.5 (±5.3) years. Among participants, 11.6% and 8.7% had moderate and high levels of depression, respectively while 22.4% and 7.7% had moderate and high levels of anxiety, respectively.

Table 2. Descriptive analysis of study variables and maternal depression.

Category Level of Depression
No Depression Moderate High F P-Value
n(row%)
The country of residence Jordan 229(50.0) 133(29.0) 96(21.0) 7.9 <0.001
Palestine 208(49.3) 119(28.2) 95(22.5)
Lebanon 232(59.2) 88(22.4) 72(18.4)
Saudi Arabia 120(61.5) 47(24.1) 28(14.4)
Bahrain 89(66.4) 26(19.4) 19(14.2)
Locality Urban 594(54.0) 285(25.9) 221(20.1) 1.1 .285
Non-Urban 284(56.7) 128(25.5) 89(17.8)
Women age (years) <35 744(54.7) 350(25.7) 267(19.6) 0.4 .542
≥35 133(55.6) 63(26.4) 43(18.0)
Age at marriage (years) <20 142(53.4) 72(27.1) 52(19.5) 1.1 .324
20–29 671(54.5) 317(25.8) 243(19.7)
≥30 64(62.1) 24(23.3) 15(14.6)
Education level ≤Secondary school 175(49.6) 96(27.2) 82(23.2) 2.8 .095
> Secondary school 703(56.3) 317(25.4) 228(18.3)
Currently working Yes 357(59.7) 133(22.2) 108(18.1) 2.9 .090
No 521(51.9) 280(27.9) 202(20.1)
Family income Decreased 489(50.0) 267(27.3) 222(22.7) 22.7 .000
Increased/same 389(62.4) 146(23.4) 88(14.1)
Physical activity during pandemic Inactive (<1/2 hour per day) 323(57.6) 124(22.1) 114(20.3) 0.5 .488
Active (≥1/2 hour per day) 546(53.6) 279(27.4) 194(19.0)
Food group adherence during pandemic No/low adherence (0–2) 429(53.5) 223(27.8) 150(18.7) 0.5 .494
Moderate/high adherence (3–5) 115(55.6) 53(25.6) 39(18.8)
Smoking during pandemic Non-smoker 495(56.8) 200(22.9) 177(20.3) 10.6 .001
Smoker 62(48.4) 40(31.3) 26(20.3)
Number of pregnancies One 305(56.5) 139(25.7) 96(17.8)
Two 236(53.9) 117(26.7) 85(19.4) .509 .676
Three 158(55.6) 62(21.8) 64(22.5)
Four + 178(52.7) 95(28.1) 65(19.2)
Number of miscarriages Zero 638(54.2) 316(26.8) 223(18.9)
One time 162(55.7) 69(23.7) 60(20.6) .202 .895
Two Time + 77(58.3) 28(21.2) 27(20.5)
Sleeping hours during pandemic <6 58(49.6) 22(18.8) 37(31.6)
6–8 429(58) 181(24.5) 130(17.6) 5.8 .003
>8 388(52.6) 207(28.1) 142(19.3)
Has been diagnosed with COVID-19 No 826(55.5) 383(25.7) 279(18.8)
Yes 52(46.0) 30(26.5) 31(27.4) 4.5 .034
Has any of your relatives been diagnosed with COVID-19 No 720(55.9) 322(25.0) 246(19.1)
Yes 158(50.5) 91(29.1) 64(20.4) 0.3 .572
Chronic disease No 791(55.0) 375(26.1) 273(19.0)
Yes 87(53.7) 38(23.5) 37(22.8) 0.0 .886
Stress during pandemic No 177(85.5) 19(9.2) 11(5.3)
Yes 701(50.3) 394(28.3) 298(21.4) 130.8 <0.001
Family problems No 754(58) 343(26.4) 204(15.7)
Yes 124(41.3) 70(23.3) 106(35.3) 78.2 <0.001
Financial problems No 665(59.0) 293(26.0) 169(15.0)
Yes 213(44.9) 120(25.3) 141(29.7) 72.7 <0.001
Social problems No 805(54.9) 385(26.3) 275(18.8)
Yes 73(53.7) 28(20.6) 35(25.7) 8.0 .005
Psychological problems No 698(60.9) 277(24.2) 171(14.9)
Yes 180(39.6) 136(29.9) 139(30.5) 111.1 <0.001
Work stress No 749(55.5) 349(25.9) 252(18.7)
Yes 129(51.4) 64(25.5) 58(23.1) 8.1 .004

Table 3. Descriptive analysis of study variables and maternal anxiety.

Level of Anxiety
Category No Anxiety Moderate High F P-Value
n(row%)
Country of residence Jordan 208(36.7) 253(44.7) 105(18.6) 4.9 .001
Palestine 132(31.2) 207(48.9) 84(19.9)
Lebanon 154(39.0) 186(47.1) 55(13.9)
Saudi Arabia 75(38.5) 97(49.7) 23(11.8)
Bahrain 68(50.4) 58(43.0) 9(6.7)
Locality Urban 407(35.7) 539(47.3) 193(16.9) 4.6 .033
Non-urban 230(40.0) 262(45.6) 83(14.4)
Women age (years) <35 535(36.9) 681(46.9) 235(16.2) 0.3 .608
≥35 102(38.9) 119(45.4) 41(15.6)
Marriage age (years) <20 113(37.0) 131(43.0) 61(20.0) 2.1 .128
20–29 476(36.6) 620(47.7) 203(15.6)
≥30 48(44.0) 49(45.0) 12(11.0)
Education level ≤Secondary school 115(30.5) 172(45.6) 90(23.9) 21.3 <0.001
> Secondary school 522(39.0) 629(47.0) 186(13.9)
Currently working Yes 268(40.7) 314(47.7) 76(11.6) 14.1 <0.001
No 369(34.9) 487(46.1) 200(18.9)
Family income Decreased 355(33.0) 519(48.2) 203(18.8) 18.0 <0.001
Increased/same 282(44.3) 282(44.3) 73(11.5)
Physical activity during pandemic Inactive (<1/2 hour per day) 225(38.7) 267(46.0) 89(15.3) 1.9 .165
Active (≥1/2 hour per day) 406(36.5) 526(47.3) 180(16.2)
Food group adherence during pandemic No/low Adherence (0-2) 325(36.4) 409(45.8) 159(17.8) 0.0 .945
Moderate/high Adherence (3-5) 79(37.3) 102(48.1) 31(14.6)
Smoking during pandemic Non-smoker 313(35.7) 425(48.5) 138(15.8) 1.2 .267
Smoker 38(29.7) 66(51.6) 24(18.8)
Number of pregnancies One 216(38.6) 259(46.3) 85(15.2)
Two 168(36.6) 222(48.4) 69(15.0) .078 .925
Three 112(36.7) 141(46.2) 52(17.0)
≥Four 141(36.2) 178(45.8) 70(18.0)
Number of miscarriages Zero 461(36.5) 597(47.3) 204(16.2)
One time 116(37.5) 149(48.2) 44(14.2) .189 .827
≥Two times 60(42.3) 54(38.0) 28(19.7)
Sleeping hours during pandemic <6 36(29.0) 56(45.2) 32(25.8) 5.1 .006
6–8 324(40.2) 361(44.8) 121(15.0)
>8 275(35.4) 381(49.0) 121(15.6)
Has been diagnosed with COVID-19 No 597(37.5) 748(46.9) 249(15.6) 2.0 .158
Yes 40(33.3) 53(44.2) 27(22.5)
Has any of your relatives been diagnosed with COVID-19 No 538(38.7) 633(45.6) 218(15.7) 8.6 .003
Yes 99(30.5) 168(51.7) 58(17.8)
Chronic diseases No 558(36.4) 727(47.5) 246(16.1) 1.2 .275
Yes 79(43.2) 74(40.4) 30(16.4)
Stress during pandemic No 251(79.4) 55(17.4) 10(3.2) 253.3 <0.001
Yes 386(27.6) 746(53.4) 265(19.0)
Family problems No 572(40.5) 665(47.1) 176(12.5) 102.5 <0.001
Yes 65(21.6) 136(45.2) 100(33.2)
Financial problems No 537(43.3) 539(43.5) 164(13.2) 72.9 <0.001
Yes 100(21.1) 262(55.3) 112(23.6)
Social problems No 597(37.9) 732(46.5) 246(15.6) 7.2 .007
Yes 40(28.8) 69(49.6) 30(21.6)
Psychological problems No 543(43.2) 540(43.0) 173(13.8) 76.3 <0.001
Yes 94(20.5) 261(57.0) 103(22.5)
Work stress No 578(39.5) 649(44.4) 235(16.1) 13.9 <0.001
Yes 59(23.4) 152(60.3) 41(16.3)

The rates of anxiety and depression were found to differ by country of residence, education level, family income, work stress, social problems, health problems, family problems, financial problems, psychological problems, sleeping hours per day, fear of COVID-19 infection, and unhealthy food consumption. The greatest percentages of women with high levels of depression and anxiety symptoms were found in Palestine (22.5% and 19.9%, respectively) and Jordan (21.0% and 18.6%, respectively). Furthermore, the results in Table 2 indicated that the highest percentages of high depression were among women with self-reported family problems (35.3%), sleep deprivation (<6 hours per day) (31.6%), psychological problems (30.5%), financial problems (29.7%), COVID-19 diagnosis (27.4%), social problems (25.7%), and work stress (23.1%). Similarly, high levels of anxiety symptoms were found among women with self-reported family problems (33.2%), financial problems (23.6%), psychological problems (22.5%), and social problems (21.6%), as shown in Table 3.

ML performance measures

Different performance measures were considered in our study to evaluate whether the ML models can predict women’s depression and anxiety symptoms during the COVID-19 lockdown. Seven ML classification algorithms were tested on our dataset, including SVM, K-nearest neighbour (KNN), NB, Random Forest (RF), Neural Network (NN), DT, and GB. The performance was evaluated using several assessment measures, such as accuracy, precision, Area Under the Curve (AUC), the Matthews Correlation Coefficient (MCC), and the Receiver Operating Characteristic (ROC) curve. The performance measures were calculated using the following equations:

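1. Specificity

\[ \text{Specificity} = \frac{TN}{FP + TN} \]

2. Precision

\[ \text{Precision} = \frac{TP}{TP + FP} \]

3. Recall

\[ \text{Recall} = \frac{TP}{TP + FN} \]

4. F-measure

\[ FM = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \]

5. Matthews Correlation Coefficient

\[ MCC = \frac{(TN \cdot TP) - (FN \cdot FP)}{\sqrt{(FP + TP)(FN + TP)(TN + FP)(TN + FN)}} \]

6. Accuracy

\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \]

Here TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.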
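The six measures listed above can be computed directly from the four confusion-matrix counts. The sketch below is an illustration under stated assumptions: the function name and the example counts are hypothetical, the MCC uses the standard four-factor denominator, and all denominators are assumed to be non-zero.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute the listed measures from confusion-matrix counts.

    Assumes every denominator is non-zero; degenerate cases are not handled.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Standard MCC denominator: product of the four marginal totals.
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "specificity": tn / (tn + fp),
        "precision": precision,
        "recall": recall,
        "f_measure": 2 * precision * recall / (precision + recall),
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical counts for one class treated as "positive" against the rest.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the outcome has three classes (low, moderate, high), one way to apply these binary formulas is per class, treating each class in turn as positive; whether the authors did so is not stated.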

Figure 1 represents the comparison of accuracy rates among the selected machine learning algorithms in predicting women's depression and anxiety symptoms. All tested models reported a high level of accuracy (ranging from 80.0–83.3%) for predicting depression among pregnant women except for NB. On the other hand, various levels of accuracy were reported for the ML models when predicting anxiety. The GB model presented the highest accuracy rate (82.9%) followed by RF and NB (81.3%). Nonetheless, all the ML models reported an acceptable rate of accuracy for both depression and anxiety symptoms.

Figure 1. Percentage of ML models accuracy by depression and anxiety symptoms.

Additional performance measures were used for evaluating the ML prediction performance for depression and anxiety symptoms, including AUC, sensitivity, specificity, F-measure, and MCC. Figure 2 illustrates the different performance measures of the depression prediction models. Balanced accuracy, sensitivity, and F-measure values were observed across the ML models. The AUC varied across models; DT reported the lowest AUC (68.8%), while the other models ranged from 82.6% to 91.9%. The MCC varied considerably across models and was relatively low for most of them; the NB model reported the highest MCC value (63%). Overall, GB reported the highest AUC, accuracy (ACC), sensitivity, and F1 measures among the ML models.

Figure 2. Evaluation of ML models performance analysis in predicting depression symptoms.

Figure 3 shows the different performance measures of the ML models for anxiety prediction. The RF, NN, KNN, and GB models reported quite similar AUC, ACC, sensitivity, and F1 measures. The SVM and DT models had the lowest accuracy measures. Sensitivity and accuracy were highest for the GB and NB models. The MCC varied among the ML models; the highest values were found for the NB and GB models (74.3% and 72.8%, respectively), while the SVM had the lowest (52.4%). Overall, GB achieved the best accuracy, sensitivity, and F1 measures (82.9%) and a balanced MCC of 72.8%.

Figure 3. Evaluation of ML models performance analysis in predicting anxiety symptoms.

The GB and RF receiver operating characteristic (ROC) curves for the moderate and high depression and anxiety classes are presented in Figure 4 (A and B) and Figure 5 (A and B), respectively. Three depression and anxiety classes were used: low, moderate, and high. The ROC curves lie toward the upper-left corner; thus, the gradient boosting algorithm showed better prediction of positive cases than the other studied algorithms (AUC of 91.9% and 93.5% for depression and anxiety, respectively).

Figure 4. Gradient Boosting and Random Forest ROC sensitivity and specificity analysis: (A) Moderate depression symptoms analysis; (B) High depression symptoms analysis.

Figure 5. Gradient Boosting and Random Forest ROC sensitivity and specificity analysis: (A) Moderate anxiety symptoms analysis; (B) High anxiety symptoms analysis.
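Since the ROC curves are reported separately for the moderate and high classes, the AUC for a single class can be illustrated with a one-vs-rest calculation. The sketch below uses the pairwise-ranking definition of AUC rather than any particular library; the function name and the toy scores are hypothetical, and treating each class against the rest is an assumption about how such per-class curves are typically obtained.

```python
from typing import Sequence


def auc_one_vs_rest(labels: Sequence[int], scores: Sequence[float], positive: int) -> float:
    """AUC for one class against all others, via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count one half).
    """
    pos = [s for y, s in zip(labels, scores) if y == positive]
    neg = [s for y, s in zip(labels, scores) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: classes 0=low, 1=moderate, 2=high; the scores are hypothetical
# predicted probabilities for the "high" class.
labels = [0, 0, 1, 2, 2, 1, 0, 2]
scores = [0.1, 0.3, 0.4, 0.8, 0.35, 0.2, 0.05, 0.9]
auc_high = auc_one_vs_rest(labels, scores, positive=2)
print(round(auc_high, 3))
```

The pairwise form is quadratic in the number of cases, which is acceptable for illustration; rank-based implementations are preferable for larger samples.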

Features’ importance

The 23 variables used for predicting depression and anxiety symptoms in the ML models were ranked by importance on a scale from 0 to 100%, and only variables with an importance level greater than 60% were considered. The models yielded different variable importance rankings for depression and anxiety. The distributions of the most important variables for depression and anxiety can be found in Figure 6 and Figure 7, respectively. The most significant variables in predicting depression symptoms were stress during lockdown, psychological factors, family problems, and country of residence, while the most significant variables in predicting anxiety were stress during lockdown, financial problems, family problems, social problems, and COVID-19 diagnosis.
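The ranking and 60% cut-off described above can be illustrated as follows. This sketch assumes that importances are scaled relative to the top-ranked variable, which is one common convention; the text does not specify how the 0 to 100% scale was obtained, and the function name and raw values below are hypothetical.

```python
from typing import Dict, List, Tuple


def rank_features(raw_importance: Dict[str, float],
                  threshold_pct: float = 60.0) -> List[Tuple[str, float]]:
    """Scale importances so the top feature is 100%, keep those above the threshold."""
    top = max(raw_importance.values())
    if top <= 0:
        raise ValueError("at least one feature must have positive importance")
    scaled = {name: 100.0 * value / top for name, value in raw_importance.items()}
    kept = [(name, pct) for name, pct in scaled.items() if pct > threshold_pct]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical raw importances; these are not the paper's values.
raw = {"stress": 0.30, "family_problems": 0.21, "income": 0.12, "smoking": 0.03}
ranked = rank_features(raw)
for name, pct in ranked:
    print(f"{name}: {pct:.0f}%")
```

Under this convention the cut-off is relative, so the set of retained variables depends on how dominant the top-ranked variable is.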

Figure 6. Variables' importance ranking analysis in predicting depression symptoms.

Figure 7. Variables' importance ranking analysis in predicting anxiety symptoms.

Discussion

In this study, we used machine learning techniques for the prediction of depression and anxiety among pregnant and postpartum women from five Middle Eastern countries (Lebanon, Palestine, Jordan, Bahrain, and Saudi Arabia) during the COVID-19 lockdown. We found that 20.3% of women had moderate to severe maternal depression, while 30.1% had moderate to severe anxiety, with the highest rates among Palestinian and Jordanian women. The findings of this study are consistent with other studies that indicated high levels of anxiety and depression symptoms among pregnant women during the COVID-19 lockdown 3 . Women reported significant concern and were found to be at high risk of developing post-traumatic stress disorder, which requires direct intervention from health care providers to care for the mental health of pregnant and postpartum women during the COVID-19 pandemic.

The performance of different ML models in predicting maternal depression and anxiety was evaluated by measuring accuracy, specificity, precision, recall, F-measure, and the Matthews Correlation Coefficient (MCC). The accuracy of the studied models was similar and did not indicate a significant difference across models. GB and RF reported the best accuracy, sensitivity, and F1 measures for depression prediction. The MCC was measured for the selected models as an alternative performance measure that is less sensitive to an unbalanced dataset. The MCC showed acceptable scores for both depression and anxiety symptoms. The NB model had the highest MCC values, followed by GB and RF. Thus, the results of this study are consistent with other studies that assessed the performance of ML classifiers in predicting depression among pregnant and postpartum women, where the RF model showed the highest accuracy and AUC values 16, 24 .

The ML prediction models of postpartum depression developed by Shin et al. (2020) achieved an AUC of 0.79. By comparison, the study in 29 used a multilayer perceptron approach with several risk factors to predict depression among Spanish pregnant and postpartum women; the model achieved an AUC of 0.82, a sensitivity of 0.84, and a specificity of 0.81. Furthermore, one study used a Logistic Regression (LR) classifier for depression prediction and achieved an accuracy of 83.3% 30 , while employing multiple ML algorithms, including KNN, LR, Linear Discriminant Analysis (LDA), and B, improved the overall accuracy to 90% 28, 29 . Additionally, our results were consistent with the study by Shin et al. 24 , in which ML classifiers were used to predict postpartum depression, and depression before pregnancy, stress during pregnancy, and smoking were found to be the most significant risk factors. Contrary to our findings, the study in 31 reported that women’s age, marital status, and education were the most significant factors relating to postpartum depression.

Furthermore, our study reported a higher AUC performance measure than other similar ML prediction studies, whose AUC measures were 80% 32 , 79% 5 , and 78% 33 . The results in our study showed an accuracy of 83.3%, which is comparable to the 84% accuracy rate reported in other studies 26, 30 . Nevertheless, the study sample used in this research was collected from diverse population groups across countries, thus diverging background and environmental factors were expected to affect the homogeneity of the dataset.

Significant risk factors for pregnant and postpartum depression and anxiety were found, including country of residence, family income, smoking, COVID-19 diagnosis, number of hours of sleep, stress during the COVID-19 lockdown, family support, social support, financial situation, psychological problems, and work stress. Additionally, risk factors particularly significant for anxiety included education level, locality, and work status. We found the rates of anxiety symptoms to be higher than those of depression among pregnant and postpartum women during the pandemic lockdown. The results showed that Jordanian, Palestinian and Lebanese women had higher anxiety and depression than Saudi and Bahraini women. The increased risk for depression and anxiety among women could be explained by low family income, financial problems, and poor healthcare systems available in these countries 34 . The study also reported significant differences in anxiety because of locality and education levels. Women with lower education levels reported higher anxiety; similarly, women living in urban areas presented higher anxiety levels. These findings could be explained by the stricter lockdown in cities and the lack of knowledge about the disease among women with lower education levels.

The machine learning models returned the five highest-ranking features affecting women’s depression symptoms, in descending order: stress during pregnancy, psychological problems, family support, country of residence, and number of hours of sleep. The highest-ranking features for anxiety were stress during pregnancy, financial problems, family problems, social problems, and COVID-19 diagnosis. Our findings are consistent with similar studies indicating that stress during pregnancy negatively affects women's mental health and might influence the incidence of postpartum depression 9, 14, 32 . Furthermore, the results were consistent with other studies indicating that family income and social and psychological problems had a significant impact on maternal mental health 3, 6, 8 .
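The text does not state how the feature ranking was computed. As one model-agnostic possibility, the sketch below uses permutation importance: shuffle one feature column at a time and record how much accuracy drops. This is an assumption chosen for illustration, not the authors' documented method; the toy data, `toy_model`, and the feature meanings are hypothetical.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, n_features, repeats=5, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the label
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / repeats)
    return importances


# Toy data (hypothetical): feature 0 is a "stress" flag that fully determines
# the label; feature 1 is unrelated noise.
rows = [(i % 2, (i // 2) % 2) for i in range(20)]
labels = [r[0] for r in rows]


def toy_model(row):
    """Stand-in classifier that predicts the label from feature 0 only."""
    return row[0]


importances = permutation_importance(toy_model, rows, labels, n_features=2)
ranking = sorted(range(2), key=lambda j: importances[j], reverse=True)
print(importances, ranking)
```

Tree ensembles such as Gradient Boosting and Random Forest also expose built-in (embedded) importance scores; the permutation approach is shown here only because it works with any classifier.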

The study provides an interesting finding: the performance measures are relatively high and remain stable across the selected ML models, especially AUC, accuracy, and sensitivity, even with a reduced number of variables. This finding is consistent with other studies 17, 31, 34–36 that indicated a high correlation between anxiety and depression symptoms and other socio-demographic risk factors. Thus, stress, family support, financial situation, psychological problems, and country of residence were among the most important variables associated with depression and anxiety during the pandemic lockdown. This is important to consider when developing intervention strategies and programs. The stability in performance measures suggests that self-reported survey methods can be used as a good assessment tool for anxiety and depression. Moreover, pregnant women had more anxiety symptoms than depression symptoms during lockdown, which might affect maternal and child health.

Our findings suggest that deploying machine learning techniques for the screening of pregnant and postpartum women will help in identifying those at highest risk of anxiety and depression through clustering and classification, which will in turn aid in the development of effective preventive interventions. Thus, this research not only addresses the integration of innovative technology for the prediction and diagnosis of depression and anxiety among pregnant and postpartum women in low- and middle-income countries, but given the international dataset used, it assesses the prediction power of several ML algorithms across diverging population groups with distinct risk factors. Additionally, the study included variables specific to the COVID-19 lockdown period, which differentiates it from similar studies.

Nevertheless, this study has some limitations, including the size of the study sample. A smaller dataset limits the ability to train a robust range of algorithms, and it also limits the number of clusters and classifications produced by the ML predictive models. In addition, the study used an online self-reported assessment, which was not fully completed by all study participants; incomplete and missing data were excluded from our dataset. Finally, a more comprehensive study with a larger and more representative dataset including clinical data is recommended for future research among low- and middle-income countries.
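Excluding incomplete responses, as described above, amounts to listwise deletion. The following is a minimal sketch under the assumption that each questionnaire response is a dictionary and that a missing answer is stored as `None` or an empty string; the field names and the helper `drop_incomplete` are hypothetical and do not reflect the actual dataset schema.

```python
def drop_incomplete(records, required_fields):
    """Keep only records in which every required field has a value.

    A value of 0 is treated as a valid answer; only None and the empty
    string count as missing.
    """
    def is_complete(rec):
        return all(rec.get(f) not in (None, "") for f in required_fields)

    return [rec for rec in records if is_complete(rec)]


# Hypothetical responses for illustration only.
responses = [
    {"id": 1, "sleep_hours": 7, "family_support": "yes"},
    {"id": 2, "sleep_hours": None, "family_support": "no"},   # missing answer
    {"id": 3, "sleep_hours": 5, "family_support": ""},        # left blank
    {"id": 4, "sleep_hours": 0, "family_support": "no"},      # 0 is a real answer
]
complete = drop_incomplete(responses, ["sleep_hours", "family_support"])
print([r["id"] for r in complete])
```

Listwise deletion is simple but shrinks the sample, which compounds the sample-size limitation noted in the same paragraph.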

Conclusion

The study assessed the performance measures of machine learning algorithms in predicting depression and anxiety among pregnant and postpartum women in low- and middle-income countries during the COVID-19 pandemic lockdown. Based on the results presented, this research concludes that ML algorithms, particularly (yet not exclusively) Gradient Boosting and Random Forest, are effective predictive models for maternal mental health. These models could be integrated into clinical medical information systems for the automatic prediction of pregnant women’s depression and anxiety based on the identified key variables. The deployment of ML models will provide effective clinical applications for the development of prevention and intervention programs. Likewise, by making use of accurate machine learning techniques such as Random Forest, public health professionals, healthcare providers, and decision-makers will be able to predict rising issues and implement relevant intervention programs to enhance maternal and child health in their respective countries.

Data availability

Underlying data

Harvard Dataverse: Pregnancy and Mental Health Data during COVID-19

https://doi.org/10.7910/DVN/FCDGEB 27

This project contains the following underlying data:

  • ML-DataSet.xlsx

  • Variables_Descriptions.xlsx

Extended data

Dataverse: Pregnancy and Mental Health Data during COVID-19

https://doi.org/10.7910/DVN/FCDGEB 27

This project contains the following extended data:

  • English questionnaire.xlsx

Data are available under the terms of the Creative Commons Zero "No rights reserved" data waiver (CC0 1.0 Public domain dedication).

Acknowledgements

The authors would like to thank the study participants for their time and effort in responding to our study. Furthermore, the authors would like to thank the following for assisting in data collection: Elissa Naim, Manal Fardon (Lebanon); Narmeen Al-Awwad (The Hashemite University, Jordan); Asma Bash (The University of Jordan); Nahla Al-Bayyari (Al-Balqa Applied University, Jordan); Shreen Sulten, Nada Omar Abdul jawad, and Mahmoud Sami (King Hamad University Hospital, Kingdom of Bahrain); Rana Ghabbash, Asma Imam (Al-Quds University, Palestine); Firas Abdel Jawad (Makassed Hospital); Nabil Thawabteh (Makassed Hospital); Areej Alamery (Ministry of Health, Saudi Arabia).

Author Disclaimer: The views expressed in this article do not necessarily represent the views, decisions or policies of WHO, Saudi FDA or the other institutions with which the authors are affiliated.

Funding Statement

The author(s) declared that no grants were involved in supporting this work.

[version 1; peer review: 2 approved]

References

  • 1. Hatmal MM, Al-Hatamleh MAI, Olaimat AN, et al.: Side effects and perceptions following covid-19 vaccination in jordan: A randomized, cross-sectional study implementing machine learning for predicting severity of side effects. Vaccines (Basel). 2021;9(6):556. 10.3390/vaccines9060556
  • 2. Ahorsu DK, Imani V, Lin CY, et al.: Associations Between Fear of COVID-19, Mental Health, and Preventive Behaviours Across Pregnant Women and Husbands: An Actor-Partner Interdependence Modelling. Int J Ment Health Addict. 2022;20(1):68–82. 10.1007/s11469-020-00340-x
  • 3. Ravaldi C, Ricca V, Wilson A, et al.: Previous psychopathology predicted severe COVID-19 concern, anxiety, and PTSD symptoms in pregnant women during "lockdown" in Italy. Arch Womens Ment Health. 2020;23(6):783–786. 10.1007/s00737-020-01086-0
  • 4. WHO: COVID-19 Weekly Epidemiological Update 35. World Heal. Organ., no.2021;1–3.
  • 5. Wang S, Pathak J, Zhang Y: Using Electronic Health Records and Machine Learning to Predict Postpartum Depression. Stud Health Technol Inform. 2019;264(1):888–892. 10.3233/SHTI190351
  • 6. Preis H, Mahaffey B, Heiselman C, et al.: Pandemic-related pregnancy stress and anxiety among women pregnant during the coronavirus disease 2019 pandemic. Am J Obstet Gynecol MFM. 2020;2(3):100155. 10.1016/j.ajogmf.2020.100155
  • 7. Shayganfard M, Mahdavi F, Haghighi M, et al.: Health anxiety predicts postponing or cancelling routine medical health care appointments among women in perinatal stage during the covid-19 lockdown. Int J Environ Res Public Health. 2020;17(21):8272. 10.3390/ijerph17218272
  • 8. Dong H, Hu R, Lu C, et al.: Investigation on the mental health status of pregnant women in China during the Pandemic of COVID-19. Arch Gynecol Obstet. 2021;303(2):463–469. 10.1007/s00404-020-05805-x
  • 9. Motrico E, Bina R, Domínguez-Salas S, et al.: Impact of the Covid-19 pandemic on perinatal mental health (Riseup-PPD-COVID-19): protocol for an international prospective cohort study. BMC Public Health. 2021;21(1):368. 10.1186/s12889-021-10330-w
  • 10. Broche-Pérez Y, Fernández-Fleites Z, Fernández-Castillo E, et al.: Anxiety, Health Self-Perception, and Worry About the Resurgence of COVID-19 Predict Fear Reactions Among Genders in the Cuban Population. Front Glob Womens Health. 2021;2:634088. 10.3389/fgwh.2021.634088
  • 11. Tokgoz VY, Kaya Y, Tekin AB: The level of anxiety in infertile women whose ART cycles are postponed due to the COVID-19 outbreak. J Psychosom Obstet Gynecol. 2020;1–8. 10.1080/0167482X.2020.1806819
  • 12. Effati-Daryani F, Zarei S, Mohammadi A, et al.: Depression, stress, anxiety and their predictors in Iranian pregnant women during the outbreak of COVID-19. BMC Psychol. 2020;8(1):99. 10.1186/s40359-020-00464-8
  • 13. Effati-daryani F, Zarei S, Mohammadi A, et al.: Depression, stress, anxiety and their predictors in Iranian pregnant women during the outbreak of COVID-19. BMC Psychol. 2020;8(1):99. 10.1186/s40359-020-00464-8
  • 14. Seng JS, Sperlich M, Low LK, et al.: Childhood Abuse History, Posttraumatic Stress Disorder, Postpartum Mental Health, and Bonding: A Prospective Cohort Study. J Midwifery Womens Health. 2013;58(1):57–68. 10.1111/j.1542-2011.2012.00237.x
  • 15. Abdollahi F, Etemadinezhad S, Lye MS: Postpartum mental health in relation to sociocultural practices. Taiwan J Obstet Gynecol. 2016;55(1):76–80. 10.1016/j.tjog.2015.12.008
  • 16. Hochman E, Feldman B, Weizman A, et al.: Development and validation of a machine learning-based postpartum depression prediction model: A nationwide cohort study. Depress Anxiety. 2021;38(4):400–411. 10.1002/da.23123
  • 17. Şahan E, Ünal SM, Kırpınar İ: Can we predict who will be more anxious and depressed in the COVID-19 ward? J Psychosom Res. 2021;140:110302. 10.1016/j.jpsychores.2020.110302
  • 18. Umanandhini D: Survey on Stress Types Using Data Mining Algorithms. Int J Innov Res Adv Eng. 2017;4(4):2014–2018.
  • 19. Seah JHK, Jin Shim K: Data Mining Approach to the Detection of Suicide in Social Media: A Case Study of Singapore. 2018 IEEE International Conference on Big Data (Big Data). 2019;5442–5444. 10.1109/BigData.2018.8622528
  • 20. Priya A, Garg S, Tigga NP: Predicting Anxiety, Depression and Stress in Modern Life using Machine Learning Algorithms. Procedia Comput Sci. 2020;167(2019):1258–1267. 10.1016/j.procs.2020.03.442
  • 21. Richter T, Fishbain B, Markus A, et al.: Using machine learning-based analysis for behavioral differentiation between anxiety and depression. Sci Rep. 2020;10(1):16381. 10.1038/s41598-020-72289-9
  • 22. Kumar P, Garg S, Garg A: Assessment of Anxiety, Depression and Stress using Machine Learning Models. Procedia Comput Sci. 2020;171(2019):1989–1998. 10.1016/j.procs.2020.04.213
  • 23. Kessler RC, van Loo HM, Wardenaar KJ, et al.: Testing a machine-learning algorithm to predict the persistence and severity of major depressive disorder from baseline self-reports. Mol Psychiatry. 2016;21(10):1366–1371. 10.1038/mp.2015.198
  • 24. Shin D, Lee KJ, Adeluwa T, et al.: Machine Learning-Based Predictive Modeling of Postpartum Depression. J Clin Med. 2020;9(9):2899. 10.3390/jcm9092899
  • 25. Tayyem RF, Allehdan SS, Alatrash RM, et al.: Adequacy of nutrients intake among jordanian pregnant women in comparison to dietary reference intakes. Int J Environ Res Public Health. 2019;16(18):3440. 10.3390/ijerph16183440
  • 26. Dra AFR: Food Group Intake of Pregnant Jordanian Women Based on the Three Pregnancy Trimesters. Angew. Chemie Int. Ed. 6(11), 951–952. 2017;25:1–77.
  • 27. Qasrawi R, Hoteit M, Allehdan S, et al.: Pregnancy and Mental Health Data during COVID19. Harvard Dataverse, V1,2022. 10.7910/DVN/FCDGEB
  • 28. Spitzer RL, Kroenke K, Williams JB, et al.: A brief measure for assessing generalized anxiety disorder: The GAD-7. Arch Intern Med. 2006;166(10):1092–1097. 10.1001/archinte.166.10.1092
  • 29. Tortajada S, García-Gomez JM, Vicente J, et al.: Prediction of postpartum depression using multilayer perceptrons and pruning. Methods Inf Med. 2009;48(3):291–298. 10.3414/ME0562
  • 30. Hosseinifard B, Moradi MH, Rostami R: Classifying depression patients and normal subjects using machine learning techniques and nonlinear features from EEG signal. Comput Methods Programs Biomed. 2013;109(3):339–345. 10.1016/j.cmpb.2012.10.008
  • 31. Jiménez-Serrano S, Tortajada S, García-Gómez JM: A mobile health application to predict postpartum depression based on machine learning. Telemed J E Health. 2015;21(7):567–574. 10.1089/tmj.2014.0113
  • 32. Andersson S, Bathula DR, Iliadis SI, et al.: Predicting women with depressive symptoms postpartum with machine learning methods. Sci Rep. 2021;11(1):7877. 10.1038/s41598-021-86368-y
  • 33. Zhang D, Shen D, Alzheimer's Disease Neuroimaging Initiative: Multi-modal multi-task learning for joint prediction of multiple regression and classification variables in Alzheimer's disease. NeuroImage. 2012;59(2):895–907. 10.1016/j.neuroimage.2011.09.069
  • 34. Gjerdingen DK, Chaloner KM: The relationship of women’s postpartum mental health to employment, childbirth, and social support. J Fam Pract. 1994;38(5):465–472.
  • 35. Lebel C, MacKinnon A, Bagshawe M, et al.: Elevated depression and anxiety symptoms among pregnant individuals during the COVID-19 pandemic. J Affect Disord. 2020;277:5–13. 10.1016/j.jad.2020.07.126
  • 36. Onoye JM, Goebert D, Morland L, et al.: PTSD and postpartum mental health in a sample of Caucasian, Asian, and Pacific Islander women. Arch Womens Ment Health. 2009;12(6):393–400. 10.1007/s00737-009-0087-0
F1000Res. 2022 Sep 5. doi: 10.5256/f1000research.121665.r144638

Reviewer response for version 1

Iyad Tumar 1

The study used machine learning techniques to predict the effect of COVID-19 on depression and anxiety in women. The machine learning models used an original dataset collected from Arab countries during the COVID-19 pandemic lockdown. The study sample was composed of 3569 women (1939 pregnant and 1630 postpartum). The study indicated that the gradient boosting algorithm achieved the highest performance compared to other algorithms.

The study addressed an important problem in developing countries, and the result of the study is very encouraging, mainly in the deployment of machine learning in the fields of mental health and public health. Furthermore, the study is well-written and organized. However, a few issues need more clarification:

  1. The study contains many figures and tables; it would be much better to move some of them to the Appendix.

  2. I recommend double checking the samples and features of Table 2.

  3. Are there any significant differences between pregnant women and postpartum?

  4. Do the authors use a weighted dataset to analyze and run the ML models?

  5. It is important to show the overfitting problem during the training and testing.

Is the work clearly and accurately presented and does it cite the current literature?

Yes

If applicable, is the statistical analysis and its interpretation appropriate?

Yes

Are all the source data underlying the results available to ensure full reproducibility?

Yes

Is the study design appropriate and is the work technically sound?

Yes

Are the conclusions drawn adequately supported by the results?

Yes

Are sufficient details of methods and analysis provided to allow replication by others?

Yes

Reviewer Expertise:

My research area is in the fields of Artificial Intelligence (AI). Within AI, I am interested in problems related to health or education modeling, machine learning, and data mining, and their interdisciplinary applications to real-life problems. Furthermore, I’m interested in computer networks and cybersecurity, in which I worked on the development of models and methods of network management and protection.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

F1000Res. 2022 Jun 8. doi: 10.5256/f1000research.121665.r136365

Reviewer response for version 1

Chadi Ibrahim Fakih 1

The context of the work proposed in this article is to use machine learning techniques to predict maternal depression and anxiety levels during the COVID-19 pandemic. The data are collected from five Arab countries and composed of 3569 samples (1939 pregnant women and 1630 postpartum). Several machine learning techniques are applied such as Random Forest, Gradient Boosting and Naïve Bayes. As a result, the gradient boosting algorithm outperformed the other methods used in this research.

Regarding the form, the article is well organized and written. The addressed problem is quite important and the results obtained are promising. However, the article suffers from several weak points:

  1. How is the features’ importance evaluated (filtering method)? Do the authors focus on filter-based, wrapper-based or embedded-based methods? It seems they relied on an embedded one, but they should be precise.

  2. Why are feature extraction methods such as LDA, ICA and t-SNE not used?

  3. The article contains a lot of figures and tables per evaluation measure. I think these figures and tables can be put in the appendix, with the focus on one evaluation measure such as accuracy.

  4. You may provide a matrix that shows the correlation between each couple of features and between each feature and the different classes.

  5. What is the bCA mentioned under the figures 2 and 3?  (Component analysis or classification accuracy).

  6. The authors should show a comparison of their work with other works, even if they are not applied to the same dataset. It is important to see how the efficiency of each machine learning algorithm varies between this work and other works.

  7. The authors stated that they used 3569 samples in their study, but when we refer to table 2 and table 3 there is something missing. For example, if we count in table 2 all the samples in the row that corresponds to the country of residence regardless of the depression level, they sum to 1601. In contrast, the sum of the samples in the row smoking during pandemic is 1000 and others sum to 1600. Are there missing feature values?

  8. There is something missing in table 2 (table 3 also). Table 2 shows that the total samples belonging to class “No depression” is 878 out of 1601, “Moderate depression” is 413/1601 and “High depression” is 310/1601. Is the total number of samples 1601? I thought it was 3569.

  9. On page 5, section “Results”, you mention “Among participants, 11.6% and 8.7% had moderate and high levels of depression, respectively while 22.4% and 7.7% had moderate and high levels of anxiety, respectively”. This means the class “No depression” constitutes 80% of the dataset and “No anxiety” 70% of the dataset. How is this class imbalance handled during experimentation?

  10. Put the confusion matrix for depression-prediction and the one for anxiety-prediction to see how the misclassified samples are distributed.

  11. The dataset is split into 70:20:10 (training:testing:validation). What type of parameters did you use that need validation? Did you split them randomly into three groups? You should use techniques other than random sampling such as cross-validation.

  12. Authors should show the training error to compare it with the testing error in order to show if there is overfitting.

  13. No need to put the formulas of the evaluations measures (Precision, recall, AUC,…). They are known.

  14. Gradient-boosting and Random-forest are decision-tree based classifiers. It is important to derive the rules that constitute the model showing the set of rules of each class.

Is the work clearly and accurately presented and does it cite the current literature?

Yes

If applicable, is the statistical analysis and its interpretation appropriate?

Yes

Are all the source data underlying the results available to ensure full reproducibility?

Yes

Is the study design appropriate and is the work technically sound?

Yes

Are the conclusions drawn adequately supported by the results?

Yes

Are sufficient details of methods and analysis provided to allow replication by others?

Yes

Reviewer Expertise:

Human reproduction, endocrinology, ART

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.


    Articles from F1000Research are provided here courtesy of F1000 Research Ltd
