Early COVID-19 quarantine: A machine learning approach to model what differentiated the top 25% well-being scorers

Theodoros Kyriazos; Michalis Galanakis; Eirini Karakasidou; Anastassios Stalikas

doi:10.1016/j.paid.2021.110980

2021 May 12;181:110980.

PMCID: PMC9711521 PMID: 36471777

Abstract

This study focused on the interaction of demographics and well-being. Diener's subjective well-being (SWB) model was validated with Exploratory Graph Analysis and Confirmatory Factor Analysis to track well-being differences among individuals quarantined during COVID-19. After splitting the data (train 70%, test 30%), six tree-based Machine Learning models were trained to classify the top 25% SWB scorers during the COVID-19 quarantine. The model input variables were demographics, to avoid overlap between inputs and outputs. A 10-fold cross-validation method was then applied to the training data to select the optimal Machine Learning model among the six tested. A CART classification tree was the optimal algorithm (Train-Accuracy = 0.77, Test-Accuracy = 0.75). A clean tree with three terminal nodes suggested that individuals who spent time on perceived creative activities during the COVID-19 quarantine had, under clearly described conditions, a high probability of being top SWB scorers. The key importance of creative activities was subsequently cross-validated with three different model configurations: (1) a different tree-based model (Test-Accuracy = 0.75); (2) a different operationalization of subjective well-being (Test-Accuracy = 0.75); and (3) a different construct (depression; Test-Accuracy = 0.73). This integrative approach to studying individual differences in subjective well-being bridges Exploratory Graph Analysis and Machine Learning in a single research cycle with multiple cross-validations.

Keywords: Subjective well-being, Machine learning, Tree-based models, Network psychometrics, Exploratory Graph Analysis, Confirmatory Factor Analysis, COVID-19

1. Introduction

Loneliness and controlled socializing during the early COVID-19 quarantine posed a significant threat to the well-being of quarantined individuals (Brooks et al., 2020). Physical distancing, financial uncertainty, bereavement, and unemployment can put quarantined individuals at risk of post-traumatic stress, depression, anxiety (Mazza et al., 2020), suicide, self-harm, lack of life meaning, relationship breakdown (Holmes et al., 2020), confusion, and social withdrawal (Ingram & Luxton, 2005).

Studies conducted during the early COVID-19 containment measures reported individual differences by demographics. Females were more distressed than males (Mazza et al., 2020). Unmarried females were also more distressed than married females. The same was true for males, at a lower magnitude (Srilakshmidevi & Suseela, 2020). Similarly, people with a psychiatric diagnosis, vulnerable health, rich contact history, or lack of daily routines were more distressed (García-Dantas, Justo-Alonso, Rio-Casanova, González-Vázquez, & Sánchez-Martín, 2020). However, findings for years of education and age were inconsistent (Mazza et al., 2020).

1.1. Well-being, individual differences and syndemics

Ongoing research suggests that COVID-19 containment measures can affect well-being, as described by the dynamic approach of syndemics (Holmes et al., 2020). Syndemics are global demographic tendencies (like population aging) interacting with health conditions (like diabetes) to generate individual differences through comorbidities.

Well-being is an umbrella term for a number of constructs tapping positive human functioning (Boniwell, David, & Ayers, 2013). It mainly involves subjective well-being (SWB; Diener, Suh, Lucas, & Smith, 1999), flourishing (Seligman, 2011), and happiness (Seligman, 2002). Over the past decades, SWB has dominated the well-being literature (Boniwell et al., 2013). SWB is related to the hedonic well-being tradition (Ryan & Deci, 2001) and means experiencing more positive than negative emotions, and satisfaction in most life domains (Ruini, 2017). Therefore, it involves appraising life both affectively and cognitively (Diener et al., 1999). The affective appraisal refers to the experience of moods or emotions in momentary events. The cognitive appraisal refers to satisfaction with how an individual perceives life and the potential discrepancy between the present life situation and the perceived ideal (Boniwell et al., 2013). The term “subjective” implies the potential contrast between objective living conditions (like material goods or health status) and SWB ratings (Diener et al., 1999). This subjective dimension was the main reason SWB was selected for measuring well-being differences during the COVID-19 containment measures. SWB has been operationalized by an affect measure combined with a life satisfaction measure (Diener et al., 2010).

1.2. The present study

This study focused on the interactions of demographics and subjective well-being differences through syndemics, due to physical distancing of the Greek adult population during early COVID-19 containment measures.

The dynamic complexity of differences in subjective well-being through syndemics can be effectively studied with multivariate approaches, and Machine Learning (ML) techniques can effectively classify this multivariate complexity.

The objectives of this study were to (a) validate Diener et al.'s (1999) SWB model, facilitating COVID-19 research on well-being differences during the COVID-19 containment measures; (b) contribute to applied research on individual differences a Machine Learning research cycle with multiple cross-validating steps, comparing Machine Learning models to study what differentiates the SWB of quarantined individuals; (c) examine the most important demographic differentiating variables for the optimal Machine Learning model, with an SWB syndemics approach; and (d) examine the replicability of the most important demographic differentiating variables for SWB in multiple model configurations.

2. Material and methods

2.1. Participants

The sample comprised 759 adults (78% females). Of the sample, 25% were 18–40 years old, 42% were 41–60, 3% were 61–70, and 1% were over 70. Almost 1 in 2 participants were single/not married (47%); the rest were married/living together (40%) or divorced/widowed (13%). Most (59%) did not have children. Most participants had received tertiary education (88%), and the rest a lower level (13%). Participants were private-sector employees (31%), public-sector employees (26%), self-employed (17%), students (10%), unemployed (7%), retired (4%), or other (6%). Monthly income was none (13%), ≤600€ (13%), 601–1200€ (41%), 1201–1800€ (21%), or >2500€ (13%). Of the respondents, 98.8% had not tested positive for COVID-19, and 97.5% had no family member who had tested positive. In addition, 84% did not belong to a vulnerable group and 64% had a vulnerable family member. Three demographics were rated on a 5-point Likert scale (Table 1).

Table 1.

Frequencies of COVID-19 related demographics answered on a Likert scale.

	Likert scale points
Question	1	2	3	4	5
1. Do you engage in creative activities during the quarantine?^a	3%	13%	26%	42%	16%
2. The financial impact from quarantine was for you…^b	27%	19%	33%	15%	5%
3. Has your daily routine changed during the quarantine?^a	2%	12%	21%	40%	25%

^a 1 = Not at all, 3 = Neither slightly nor strongly, 5 = Very strongly.

^b 1 = Very low, 3 = Moderate, 5 = Very high.

2.2. Measures

2.2.1. Scale of Positive and Negative Experience 8 (SPANE-8)

SPANE-8 (Kyriazos, Stalikas, Prassa, & Yotsidi, 2018a) is a shorter version of SPANE-12 (Diener et al., 2010) with 4 items for SPANE P (Pleasant, Happy, Joyful, Contented) and 4 for SPANE N (Bad, Sad, Afraid, Angry). Items are rated on a 5-point Likert scale from 1 (Very Rarely or Never) to 5 (Very Often or Always), midpoint = Sometimes. The positive and negative scores (SPANE Positive and SPANE Negative) can range from 4 to 20. Their difference (Affect Balance or SPANE B) can range from −16 (unhappiest possible) to 16 (happiest possible).
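The scoring just described can be sketched in a few lines. This is a minimal illustration rather than the authors' code (the analyses were run in R): the function name `score_spane8` and the dict-based input are assumptions, and the item positions are taken from the EGA description in Section 3.1.1.

```python
# Item positions as listed in Section 3.1.1 (positive: 2, 3, 6, 8; negative: 1, 4, 5, 7).
POSITIVE_ITEMS = (2, 3, 6, 8)
NEGATIVE_ITEMS = (1, 4, 5, 7)


def score_spane8(responses):
    """Score SPANE-8 from a dict mapping item number (1-8) to a 1-5 rating.

    Hypothetical helper for illustration; not part of any published package.
    """
    for item in POSITIVE_ITEMS + NEGATIVE_ITEMS:
        if not 1 <= responses[item] <= 5:
            raise ValueError(f"item {item} must be rated from 1 to 5")
    positive = sum(responses[i] for i in POSITIVE_ITEMS)  # SPANE P, 4 to 20
    negative = sum(responses[i] for i in NEGATIVE_ITEMS)  # SPANE N, 4 to 20
    return {
        "SPANE_P": positive,
        "SPANE_N": negative,
        "SPANE_B": positive - negative,  # Affect Balance, -16 to 16
    }


# The two extreme response patterns reproduce the bounds of the Affect Balance score.
happiest = score_spane8({1: 1, 2: 5, 3: 5, 4: 1, 5: 1, 6: 5, 7: 1, 8: 5})
unhappiest = score_spane8({1: 5, 2: 1, 3: 1, 4: 5, 5: 5, 6: 1, 7: 5, 8: 1})
```

The extreme patterns give Affect Balance scores of 16 and −16, matching the range stated above.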

2.2.2. Satisfaction with Life Scale (SWLS)

The SWLS (Diener, Emmons, Larsen, & Griffin, 1985) measures perceived global satisfaction with life (e.g. ‘So far I have gotten the important things I want in life’) with 5 items rated on a 7-point scale, from 1 (Strongly Disagree) to 7 (Strongly Agree), midpoint = Neither Agree nor Disagree. The total score ranges from 5 to 35: 5–9 = Extremely dissatisfied, 20 = Neutral, 31–35 = Extremely satisfied.

2.2.3. Mental Health Continuum–Short Form (MHC–SF)

MHC–SF (Keyes et al., 2008) is a 14-item questionnaire measuring emotional (i.e. subjective), social, and psychological well-being in 3 factors (EWB, SoWB, and PWB, respectively). Items are rated on a 6-point frequency scale (from never to every day). Higher scores suggest higher frequency.

2.2.4. Depression Anxiety Stress Scale, short-form (DASS-9)

DASS-9 (Yusoff, 2013; Greek version by Kyriazos, Stalikas, Prassa, & Yotsidi, 2018b) is a shorter version of DASS-21 (Lovibond & Lovibond, 1995). It measures depression, anxiety, and stress in three 3-item factors, rated on a 4-point scale from 0 (did not apply to me at all) to 3 (applied to me very much, or most of the time).

2.3. Procedure

This study used a cross-sectional design with the network sampling method. Data were collected via a web link posted on the webpages and Facebook accounts of the team members. The fields of the digital form were set as “required”. The link was available online from April 5th, 2020 until May 4th, 2020, 6:30 A.M. Note that Greece took early containment measures locally on February 28, 2020, and nationally on March 23. On May 4th, containment measures were loosened, and they were eventually removed by June 2020.

2.4. Analytic strategy

Generally, Machine Learning (ML) searches for generalizable predictive patterns in a dataset to make predictions of optimal precision. In contrast, traditional statistics focuses on inferring relationships between variables from a sample. An important advantage of ML models is that researchers do not need to assume the distribution of the dependent/independent variables, whereas traditional statistical models rest on a number of distributional and other assumptions (Kassambara, 2018). Additionally, in comparison to traditional statistical analyses, ML offers the major advantage of training models on train data to improve predictions on test data. To identify the most important factors related to SWB, classification and regression tree-based ML models were tested. Table S1 (Supplementary material) presents each step of the adopted analytic strategy. Data were analysed with R version 4.0.2.

3. Results

Table 2 presents descriptive statistics.

Table 2.

Descriptive statistics of the study dimensions (N = 759).

Dimension (latent variable)	M	Mdn	SD	Range
SPANE 8 Positive Experiences (SPANE-8 P)	12.95	13	3.36	4–20
SPANE 8 Negative Experiences (SPANE-8 N)	10.08	10	3.49	4–20
Satisfaction with Life Scale (SWLS)	24.07	25	5.69	7–35
DASS 9 Depression (DASS-9 D)	2.6	2	2.11	0–9
Mental Health Emotional Well-being (EWB)	10.51	11	2.94	0–15

Note. M = Mean, Mdn = Median, SD=Standard deviation.

3.1. Building and validating Diener et al.'s (1999) SWB model

Univariate normality was examined with the Kolmogorov-Smirnov, Shapiro-Wilk, Shapiro-Francia, and Anderson-Darling tests (all p < .001). Multivariate normality was examined with Mardia's multivariate kurtosis and skewness, Henze–Zirkler's consistent test, the Doornik–Hansen test, and the Energy test (all p < .001). There were no missing values, and there were 16 multivariate outliers (D² > critical value, χ²(13) = 34.53, p < .001). Outliers did not impair the findings, so they were kept in the dataset, N = 759.

3.1.1. Exploratory Graph Analysis (EGA): building the SWB model

EGA is a network psychometrics technique (see Epskamp, Maris, Waldorp, & Borsboom, 2018). It evaluates the number of dimensions without a priori assumptions. Dimensions are equivalent to latent variables (Golino & Epskamp, 2017). A 3-cluster network was identified (Fig. 1) with the Glasso estimator (γ = 0.5), as expected (Diener et al., 1999). The first cluster (dimension) contained the 4 SPANE Positive items (2, 3, 6, 8). The second cluster grouped the 4 SPANE Negative items (1, 4, 5, 7). The third contained the 5 SWLS items. The edges connecting the three clusters showed the expected pattern of positive and negative correlations (Fig. 1).

Fig. 1 — The 3-cluster EGA network to establish Diener et al.'s (1999) SWB model, using SPANE −8 Positive (red cluster with 4 nodes or 1), SPANE-8 Negative (blue cluster with 4 nodes or 2), SWLS (green cluster with 5 nodes or 3). Green edges (positive partial correlations) connected the SWLS nodes with the nodes of the SPANE Positive and red edges (negative partial correlations) connected the SWLS cluster with the nodes of the SPANE Negative. Red edges connected the two SPANE-8 clusters. Edge width indicated the magnitude of the partial correlations. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

3.1.2. Confirmatory Factor Analysis (CFA): confirming the SWB model

To evaluate the generated EGA network, an equivalent CFA model was estimated (WLSMV estimator). Goodness-of-fit criteria were RMSEA ≤ 0.06, RMSEA 90% CI ≤ 0.06, CFI ≥ 0.95, TLI ≥ 0.95, and SRMR ≤ 0.08. The results showed a remarkably good fit, χ²(62) = 59.99 (p = .549), RMSEA = 0.000 [90% CI = 0.000, 0.021], CFI = 1.000, TLI = 0.994, SRMR = 0.033. Factor loadings ranged from 0.721 to 0.871 for SPANE Positive, from 0.547 to 0.866 for SPANE Negative, and from 0.633 to 0.898 for SWLS. Inter-factor correlations were Ft1 → Ft2 = −0.744, Ft2 → Ft3 = −0.451, and Ft3 → Ft1 = 0.562 (Fig. 2). Statistical power based on population RMSEA (MacCallum, Browne, & Sugawara, 1996) suggested N ≥ 183, α = 0.80 (df = 62, N = 759).
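As a quick check on the reported RMSEA, the point estimate can be recomputed from the chi-square statistic. The sketch below assumes the common sample formula RMSEA = sqrt(max(χ² − df, 0) / (df · (N − 1))). Software using robust (WLSMV) corrections or N instead of N − 1 may differ slightly, so this illustrates the logic and does not reproduce the authors' R output.

```python
import math


def rmsea(chi_square, df, n):
    """Point estimate of RMSEA under the assumed formula described above.

    The numerator is truncated at zero, so any model whose chi-square
    does not exceed its degrees of freedom yields RMSEA = 0.
    """
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Values reported for the SWB model: chi-square(62) = 59.99, N = 759.
rmsea_swb = rmsea(59.99, 62, 759)

# Hypothetical contrast: a chi-square twice the degrees of freedom at the same N.
rmsea_contrast = rmsea(124.0, 62, 759)
```

Because the reported χ² (59.99) is smaller than its degrees of freedom (62), the truncated numerator is zero, which is consistent with the reported RMSEA of 0.000.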

Fig. 2 — Path diagram of Diener et al.'s (1999) SWB model specified to calculate the EGA model fit. SPANE-8 Positive = (Ft1), SPANE-8 Negative = (Ft2), SWLS = (Ft3).

3.1.3. Internal consistency reliability, model-based reliability and validity

The reliability of SPANE Positive, SPANE Negative and SWLS was α = 0.88 [95% CI = 0.87, 0.89], 0.79 [95% CI = 0.77, 0.82], and 0.87 [95% CI = 0.85, 0.88] respectively. All calculated ω coefficients (see Table S2 in Supplementary material), and Average Variance Extracted (see Kyriazos, 2018) suggested adequate model-based reliability and convergent validity respectively (Table S2). The greatest lower bound (Jackson & Agunwamba, 1977) was ≥α (see Table S2).

3.2. Using SWB to compare and select machine learning models

3.2.1. Calculation of the SWB class

The Affect Balance (SPANE-B) and SWLS scores were rescaled (0–1) with min-max scaling to calculate the SWB scoring rule (Table 3).

Table 3.

The binary SWB scoring rule.

If SWB > 3rd quartile (Q₃) then =1 else = 0
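The scoring rule in Table 3 can be sketched as follows. The paper does not state how the two rescaled scores were combined into one SWB score, so the simple sum used here is an assumption, as are the function names and the toy data. The `inclusive` quantile method is chosen because it corresponds to the default quantile definition in R.

```python
from statistics import quantiles


def min_max(values):
    """Rescale a list of scores to the 0-1 range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def swb_classes(spane_b, swls):
    """Return 1 for respondents above the third quartile of the SWB score, else 0.

    Assumption: SWB is the sum of the rescaled Affect Balance and SWLS scores.
    """
    composite = [b + s for b, s in zip(min_max(spane_b), min_max(swls))]
    q3 = quantiles(composite, n=4, method="inclusive")[2]
    return [1 if c > q3 else 0 for c in composite]


# Hypothetical scores for eight respondents (not study data).
spane_b = [-16, -8, -4, 0, 2, 6, 10, 16]
swls = [5, 10, 14, 18, 20, 24, 30, 35]
classes = swb_classes(spane_b, swls)
```

In this toy sample, the two respondents with the highest composite scores (one quarter of the sample) receive class 1.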

3.2.2. Validation dataset

The binary SWB classifying rule (Table 3) was used to train the ML models. To this end, the dataset (N = 759) was split into a training dataset (70%) and a test dataset (30%); see Table 4.

Table 4.

The frequencies of the SWB binary classes were held constant across datasets.

	Dataset
SWB classes	Total (N = 759)		Train (n = 532)		Test (n = 227)
	N	%	N	%	N	%
1	188	25	132	25	56	25
0	571	75	400	75	171	75
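A split that keeps class frequencies constant across datasets is usually called a stratified split. The paper does not say how its split was drawn, so the sketch below is only one plausible way to obtain the counts in Table 4; the function name and the seed are assumptions.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split indices into train and test sets, class by class.

    Each class is shuffled separately and cut at train_fraction, so both
    sets keep (approximately) the class proportions of the full sample.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_fraction * len(idx))
        train_idx += idx[:cut]
        test_idx += idx[cut:]
    return sorted(train_idx), sorted(test_idx)


# Class counts from Table 4: 188 respondents in class 1 and 571 in class 0.
labels = [1] * 188 + [0] * 571
train_idx, test_idx = stratified_split(labels)
train_ones = sum(labels[i] for i in train_idx)
test_ones = sum(labels[i] for i in test_idx)
```

With these class counts, a 70% cut within each class gives 132 and 400 training cases and 56 and 171 test cases, the same counts as in Table 4.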

3.2.3. Cross-validation dataset

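In the training dataset, the k-fold cross-validation method (k = 10) was used. Specifically, the training data were divided into k subsets; in each iteration, the model was trained on k − 1 (= 9) subsets and evaluated on the remaining one. This process was repeated 10 times, so that each subset served once as the held-out set, to build an overall accuracy metric for the entire train dataset (n = 532).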
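The resampling scheme can be sketched independently of any particular learner. In the sketch below, the majority-class "model" is only a placeholder standing in for the tree-based learners, and all function names are assumptions made for illustration.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(labels, fit, predict, k=10, seed=0):
    """Train on k-1 folds, score accuracy on the held-out fold, repeat k times."""
    accuracies = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        model = fit(train, labels)
        correct = sum(predict(model, i) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return accuracies


def fit_majority(train_idx, labels):
    """Placeholder learner: remember the most frequent class in the training folds."""
    ones = sum(labels[i] for i in train_idx)
    return 1 if ones > len(train_idx) / 2 else 0


def predict_majority(model, i):
    """Placeholder prediction: always return the remembered class."""
    return model


# Training-set class counts from Table 4: 132 cases of class 1, 400 of class 0.
labels = [1] * 132 + [0] * 400
folds = k_fold_indices(len(labels))
fold_accuracies = cross_validate(labels, fit_majority, predict_majority)
mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
```

The ten fold accuracies form the kind of distribution that is summarized by its minimum, quartiles, mean, and maximum when models are compared.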

3.2.4. ML model building and optimal model selection

Tree-based ML models were used. Tree-based models generate relatively simple, easy-to-follow if-then conditions, unlike other ML algorithms (Burger, 2018). Six different tree-based ML algorithms were evaluated to find the most robust one for evaluating differences of the top 25% SWB individuals: Classification and Regression Trees (CART), C5.0, Random Forest, Conditional Inference Tree (CTREE), Stochastic Gradient Boosting, and Bagged CART.

Model evaluation metrics were Accuracy (correctly predicted instances rate), Cohen's kappa (κ), Sensitivity (true positive rate), and Specificity (true negative rate; see Brownlee, 2014). All 6 models had the same 16 demographic input variables, converted into numerical variables (Fig. 3). Centering, scaling, and transforming inputs were omitted. Zero-variance and near-zero-variance inputs were not examined because, unlike many other models, tree-based models are not prone to crashes or unstable fits caused by such inputs (Brownlee, 2014).

Fig. 3 — Box and whisker plots by SWB class value for each input (N = 759). Note. area = respondent lives in an absolute lock-down area due to a large number of COVID-19 cases, Creative = perceived creativity of activities during quarantine, routin = change in daily routine during quarantine, finan = financial impact of quarantine, job = respondent's job, diag = respondent was tested positive, f_diag = a family member was tested positive, quaran = be on quarantine, vul = respondent belongs to a vulnerable group, fam_vul = A family member belongs to a vulnerable group, Age = respondent's age, Sex = respondent's gender, Marital = Marital Status, kids = respondent is a parent, Ed lev = Educational level, income = monthly income.

3.2.5. Optimal model

Table 5 summarizes the accuracy (Acc) of the 6 models (train dataset, n = 532), based on the distributions from the 10-fold cross-validation. A mean Acc > 0.50 was the baseline (>chance; Carpenter, Sprechmann, Calderbank, Sapiro, & Egger, 2016); therefore, the Acc across models was acceptable, suggesting a learnable problem (Brownlee, 2016). CART showed the highest mean Acc (M = 0.77) and κ (M = 0.19); see Table 5 and Fig. 4.

Table 5.

Descriptive statistics for the accuracy of the 6 models evaluated in the train dataset (n = 532). The distributions were calculated from the 10-fold cross-validation method (number of resamples = 10).

Model (R name)	Accuracy						Kappa
	Min	Q1	Mdn	M	Q3	Max	M
CART (rpart)	0.72	0.74	0.77	0.77	0.79	0.83	0.19
C5.0 (c50)	0.70	0.74	0.75	0.75	0.77	0.79	0.18
Random Forest (ranger)	0.70	0.74	0.77	0.76	0.77	0.79	0.16
Conditional Inference Tree (CTREE)	0.74	0.74	0.75	0.75	0.75	0.77	0.05
Stochastic Gradient Boosting (gbm)	0.66	0.70	0.77	0.75	0.79	0.85	0.19
Bagged CART method (treebag)	0.67	0.70	0.73	0.73	0.74	0.81	0.15

Note. Models are compared in terms of mean accuracy and Cohen's Kappa (κ) presented in bold typeface.

Fig. 4 — Dotplot comparing the accuracy and Cohen's Kappa of the ML models tested in the train dataset (n = 532). Note. rpart = CART, c50 = C5.0, ranger = Random Forest, CTREE = Conditional Inference Tree, gbm = Stochastic Gradient Boosting, treebag = Bagged CART method.

To further evaluate the optimal model, pairwise differences in Acc between the distributions of the 6 models tested in the train dataset (n = 532) were calculated with a Bonferroni correction (see Table S3 in Supplementary material).

The CART model generated a clean tree diagram with 3 terminal nodes, with no need for fine-tuning (Fig. 5). Specifically, the root node split into one terminal node for cases with perceived creativity of activities ≤4 (1–5 scale) and a branch leading to two more terminal nodes for cases with perceived creativity = 5, which were split further by the perceived financial impact of the quarantine. The scoring algorithm reads as follows. Respondents (79% in the train dataset) with perceived creativity ≤4 had a 21% probability of being among the top 25% SWB scorers during the quarantine. Respondents (9% in the train dataset) with perceived creativity = 5 had a 30% probability of being among the top 25% of SWB scorers during quarantine, on condition that their perceived financial impact of the quarantine was >2 (1–5 scale). Likewise, respondents (7% in the test dataset) with perceived creativity = 5 had a 66% probability of being classified among the top 25% of SWB scorers during quarantine, provided that their perceived financial impact of the quarantine was ≤2 (Fig. 5).
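Written as if-then rules, the three terminal nodes described above look roughly like this. The function names are hypothetical, the probabilities are those reported in the text, and the 0.5 cut-off used to turn a probability into a class is an assumption (a leaf predicting its majority class).

```python
def cart_top_swb_probability(creativity, financial_impact):
    """Reported probability of being a top-25% SWB scorer, given two 1-5 ratings."""
    for name, value in (("creativity", creativity),
                        ("financial_impact", financial_impact)):
        if value not in range(1, 6):
            raise ValueError(f"{name} must be an integer rating from 1 to 5")
    if creativity <= 4:
        return 0.21  # terminal node 1: creativity rated 4 or lower
    if financial_impact > 2:
        return 0.30  # terminal node 2: creativity = 5, financial impact above 2
    return 0.66      # terminal node 3: creativity = 5, financial impact 2 or lower


def cart_predicted_class(creativity, financial_impact, threshold=0.5):
    """Assumed decision rule: predict class 1 when the node probability exceeds 0.5."""
    return 1 if cart_top_swb_probability(creativity, financial_impact) > threshold else 0


# Only the third terminal node predicts membership in the top 25%.
predicted_for_creative_low_impact = cart_predicted_class(5, 1)
predicted_for_creative_high_impact = cart_predicted_class(5, 4)
```

Under this assumed cut-off, only respondents with the highest creativity rating and a low financial impact are predicted to be top scorers, which is why the tree assigns class 1 to so few cases.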

3.2.6. Optimal model evaluation

The CART model classification rules were assessed with a confusion matrix in the test dataset (n = 227). CART correctly predicted 7/56 cases of SWB = 1 and 164/171 cases of SWB = 0 (Acc = 75%, Sensitivity = 96%, Specificity = 13%; positive class: 0). Table 6 summarizes the metrics of the CART model in the train and test datasets.

Table 6.

The metrics of the CART model in the train (n = 532) and test (n = 227) datasets.

Dataset	Model	Acc	Acc 95% CI lower	Acc 95% CI upper	p value	Sensitivity	Specificity
Train (n = 532)	CART	0.77	0.74	0.81	0.124	0.97	0.19
Test (n = 227)	CART	0.75	0.69	0.81	0.535	0.96	0.13

Note. Acc = accuracy, 95% CI = 95% confidence interval.
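The test-set metrics can be recomputed from the confusion-matrix counts given in the text (164 of 171 class-0 cases and 7 of 56 class-1 cases classified correctly, with class 0 as the positive class). The helper below is a minimal sketch with an assumed name. Cohen's kappa is included for illustration only, since Table 6 does not list it.

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity, specificity, and Cohen's kappa from a 2x2 table."""
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Positive class is 0, so:
#   tp = class 0 predicted as 0 (164), fn = class 0 predicted as 1 (171 - 164 = 7),
#   tn = class 1 predicted as 1 (7),   fp = class 1 predicted as 0 (56 - 7 = 49).
test_metrics = binary_metrics(tp=164, fn=7, fp=49, tn=7)
```

Accuracy is 171/227, sensitivity is 164/171, and specificity is 7/56, which round to the reported 75%, 96%, and 13%. The low specificity shows that the model rarely identifies top scorers even though its overall accuracy is close to the share of the majority class.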

3.2.7. Most important classification variable

Eight classification variables had variable importance ≠ 0. Variable importance is the sum of the goodness-of-split measures for each split in which the variable was the primary variable (Therneau, Atkinson, Ripley, & Ripley, 2015). The most important variable for the CART model was the perceived creativity of activities during quarantine, i.e. ‘Do you engage in creative activities during the quarantine?’ (variable importance = 8.99), followed by the financial impact of quarantine (importance = 7.52). The remaining 8 variables had zero importance (Fig. 6).
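The idea of summing goodness-of-split measures can be illustrated with a toy tree. The sketch assumes Gini impurity as the splitting criterion and ignores surrogate splits, which tree software may also credit to a variable. The counts are hypothetical and are not meant to reproduce the importance values reported above.

```python
def gini(n_pos, n_neg):
    """Gini impurity of a node with n_pos class-1 and n_neg class-0 cases."""
    p = n_pos / (n_pos + n_neg)
    return 2 * p * (1 - p)


def split_improvement(left, right):
    """Goodness of a split: node size times the drop in Gini impurity.

    left and right are (n_pos, n_neg) tuples for the two child nodes.
    """
    n_left, n_right = sum(left), sum(right)
    n = n_left + n_right
    parent = gini(left[0] + right[0], left[1] + right[1])
    children = (n_left * gini(*left) + n_right * gini(*right)) / n
    return n * (parent - children)


def variable_importance(splits):
    """Sum the improvement of every split in which a variable is the primary one."""
    importance = {}
    for name, left, right in splits:
        importance[name] = importance.get(name, 0.0) + split_improvement(left, right)
    return importance


# Hypothetical two-split tree on 100 cases (not study data): the root splits on
# creativity, and its right child (20 cases) splits on financial impact.
toy_splits = [
    ("creativity", (20, 60), (10, 10)),
    ("financial_impact", (3, 7), (7, 3)),
]
toy_importance = variable_importance(toy_splits)
```

In this toy tree, the root split contributes more than the second split because it acts on more cases, which is one reason a variable used near the root tends to rank highest.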

Fig. 6 — Variable importance plot in the train dataset (n = 532) showing how each variable contributes to the model (CART). Note. Creativ = perceived creativity of activities during quarantine, finan = financial impact of quarantine, fam_vul = A family member belongs to a vulnerable group, Marital = Marital Status, Ed_lev = Educational Level, vul = respondent belongs to a vulnerable group, quaran = be on quarantine, Sex = respondent's gender, f_diag = a family member was tested positive, area = respondent lives in a universally lock-down area due to the large number of COVID-19 cases, diag = respondent was tested positive, kids = respondent is a parent, routin = change in daily routine during quarantine, income = monthly income, Age = respondent's age, job = respondent's job.

3.3. Cross-validation of the most important classification variable in different model configurations

3.3.1. Different tree-based model × SWB

The popular unbiased recursive partitioning algorithm CTREE (Schlosser, Hothorn, & Zeileis, 2019) also showed high accuracy in train data (Acc M = 0.75, κ M = 0.05; see Table 5 in the Optimal model section). The most important classification variable for the CTREE model remained the perceived creativity of activities during quarantine. This cross-validation corroborated the CART variable importance ranking.

Two more model configurations were tested for the CART model (train dataset, n = 532): (a) an SWB model operationalized with MHC-SF Emotional Well-Being (Keyes et al., 2008), trained on the top 25% scorers during COVID-19 quarantine; and (b) a depression model trained on the lowest 50% DASS-9 Depression scorers (Kyriazos et al., 2018b; Lovibond & Lovibond, 1995; Yusoff, 2013) during COVID-19 quarantine.

3.3.2. CART × different SWB operationalization

Repeating the above process, an equivalent scoring rule was calculated for MHC-SF EWB (Keyes et al., 2008) to classify the top 25% MHC-SF EWB scorers. Then, the 10-fold cross-validation method was used to train the CART model on the SWB variable operationalized by a different instrument (see model metrics in Table 7).

Table 7.

The goodness-of-fit metrics of the 3 tree-based models used to cross-validate that perceived creativity of activities during quarantine was a major contributor to the classification (test data, n = 227).

Model configuration	Test purpose	MIV	Train data Acc (Kappa)	Test data Acc (Kappa)
1. CTREE × Diener et al.’s (1999) SWB	Testing a popular unbiased recursive partitioning algorithm	Perceived creativity	0.75 (0.00)	0.75 (0.00)
2. CART × MHC-SF EWB (Keyes et al., 2008)	Testing a different SWB operationalization	Perceived creativity	0.79 (0.23)	0.75 (0.07)
3. CART × DASS D	Testing a different construct	Perceived creativity	0.73 (0.39)	0.73 (0.39)

Note. MIV = most important variable, Acc = accuracy, Kappa = Cohen's kappa (κ).

3.3.3. CART × different construct (DASS-9 Depression)

In yet another replication, DASS-9 Depression scores below the 2nd quartile (Q2) were coded 1, and all others 0. This procedure generated a binary outcome variable identifying the 50% of scorers with the lowest depression. Then, the CART model was trained with the 10-fold cross-validation method in the train data, n = 532 (see model metrics in Table 7).

Perceived creativity of activities during quarantine was ranked the most important classification variable both for the model trained on emotional well-being (MHC-SF EWB; Keyes et al., 2008) and for the model trained on DASS-9 Depression. Table 7 presents the metrics of the three models used for cross-validating the importance of perceived creativity of activities during quarantine.

4. Discussion

This study attempted to (a) use EGA and CFA to validate Diener et al.'s (1999) SWB model, facilitating COVID-19 research on well-being differences during the COVID-19 containment measures; and (b) support applied research on individual differences with a Machine Learning research cycle of multiple cross-validating steps, comparing six Machine Learning models to select the optimal model.

Initially, an EGA network was successfully evaluated with three SWB dimensions (Diener et al., 1999). Subsequently, six ML models were trained to classify the top 25% SWB scorers during COVID-19 containment measures. All inputs were demographics for two reasons. (A) The syndemics complexity necessitated multivariate research methods (Holmes et al., 2020). (B) When demographics are used to train the ML models, inputs do not affect the calculation of the scoring rule, which avoids indirect overlap of input and outcome variables. For example, even if we excluded all SWB scoring items, an input like ‘On the whole, I am satisfied with myself’ (Rosenberg Self-Esteem Scale) would be highly correlated with ‘I am satisfied with my life’ (SWLS; Diener et al., 1985), used for the SWB score. This could cause problems for training algorithms and generate meaningless node rules, similar to the chicken-egg dilemma. Decision trees and rule-based models were expected to perform better than regression and instance-based models for this classification (Brownlee, 2016). Tree-based models are white-box algorithms, generating easy-to-follow if-then rules (Burger, 2018).

Six different tree-based ML models were trained. CART had the highest Accuracy. Additionally, CART was considered suitable for reproducing human behavior in a ‘user-friendly’ classification, unlike the rest of the models, which generate complicated classifications (Burger, 2018). The CART model Accuracy of about 75% in both seen and unseen data was satisfactory for human behavior. The tree rules showed that respondents with lower perceived creativity of activities had only a 21% probability of being among the top 25% SWB scorers. In contrast, respondents with the highest perceived creativity of activities had a 30% probability of being among the top 25% of SWB scorers when the perceived financial impact of the quarantine was not low (rating >2). Respondents with both the highest perceived creativity and a low perceived financial impact of the quarantine (rating ≤2) had a 66% probability of being among the top 25% of SWB scorers.

The importance of creative activities was subsequently cross-validated using three different model configurations, yielding adequate Accuracy. First, perceived creativity remained the most influential classification variable for the CTREE model (a popular unbiased recursive partitioning algorithm), corroborating the CART results. Second, perceived creativity remained an important contributor to the CART model when SWB was operationalized by the MHC-SF EWB factor (Keyes et al., 2008). Lastly, the same procedure was replicated to evaluate whether perceived creativity remained important when the CART model was trained to predict low depression scorers during the quarantine. Crucially, perceived creativity of activities during quarantine had the greatest influence in all cross-validating classifications, supporting the previous findings.

Generally, objective life circumstances can explain a maximum of 10% change in SWB (Diener et al., 1999). Thus, reaching SWB during COVID-19 containment is possible. Moreover, the positive affect component of SWB is associated with creativity, resourcefulness (Seligman, 2002), and openness to new experiences (Fredrickson, 2001). Interestingly, the environment interacts with cognitive and personality variables to fuel creative behavior. Studies of families and creativity argued that dysfunctional environments may be associated with extreme creativity levels, and happier families with more moderate creativity levels (Kerr, 2009). This association of creativity with dysfunctional contexts seems to hold for the COVID-19 quarantine too. Kapoor and Kaufman (2020) proposed that engaging in creative activities during the COVID-19 quarantine could buffer against the negative effects of the pandemic context. Similarly, a French sample showed a significant increase in everyday creative activities during lockdown but not in professional ones. Likewise, creative growth during COVID-19 was associated with higher flourishing in a sample of 1420 employees from China, Germany, and the United States (Tang, Hofreiter, Reiter-Palmon, Bai, & Murugavel, 2021).

4.1. Conclusions

In a nutshell, individuals who get involved in creative activities are, regardless of their other individual differences, more likely to be among the top 25% SWB scorers. Certainly, it is not clear whether someone performs creative activities and then reaches higher SWB levels or the opposite, because establishing temporal precedence requires an experimental setting (see Shadish, Cook, & Campbell, 2002). A limitation is the imbalanced sample across gender and some COVID-19 demographics. Furthermore, multiple assessment methods were impossible. However, ML models with only demographic inputs have minimal self-report bias. Future research could include longitudinal studies on SWB during COVID-19 and the study of other well-being models with ML.

The following are the supplementary data related to this article.

Table S1

Description of the analyses performed.

Table S2

Internal consistency reliability, model-based reliability, model-based convergent validity, and the greatest lower bound estimate for the 3 latent variables in Diener et al.'s (1999) SWB model tested with CFA in the entire sample (N = 759).

Table S3

Accuracy differences between the distributions of the 6 models tested in the train dataset, n = 532. Upper diagonal = estimates of the difference, lower diagonal = p-value for H0 (Difference = 0).

Funding

This research did not receive any funding.

CRediT authorship contribution statement

Theodoros Kyriazos: Methodology, Software, Validation, Formal analysis, Data curation, Writing – original draft, Writing – review & editing, Visualization. Michalis Galanakis: Conceptualization, Investigation, Resources, Writing – original draft, Writing – review & editing, Visualization. Eirini Karakasidou: Investigation, Resources, Writing – original draft, Writing – review & editing, Visualization. Anastassios Stalikas: Conceptualization, Writing – review & editing, Supervision, Project administration.

Declaration of competing interest

The authors declare they have no known conflict of interest.

References

Boniwell I., David S.A., Ayers A.C., editors. The Oxford handbook of happiness. Oxford University Press; London: 2013.
Brooks S.K., Webster R.K., Smith L.E., Woodland L., Wessely S., Greenberg N., Rubin G.J. The psychological impact of quarantine and how to reduce it: Rapid review of the evidence. The Lancet. 2020;395:912–920. doi: 10.1016/S0140-6736(20)30460-8.
Brownlee J. Machine learning mastery. 2014. Available at http://machinelearningmastery (Accessed July, 2020)
Brownlee J. Author; Melbourne: 2016. Machine learning mastery with R.
Burger S.V. O’Reilly; Sebastopol: 2018. Introduction to machine learning with R.
Carpenter K.L.H., Sprechmann P., Calderbank R., Sapiro G., Egger H.L. Quantifying risk for anxiety disorders in preschool children: A machine learning approach. Plos One. 2016;11(11) doi: 10.1371/journal.pone.0165524.
Diener E., Emmons R.A., Larsen R.J., Griffin S. The satisfaction with life scale. Journal of Personality Assessment. 1985;49:71–75. doi: 10.1207/s15327752jpa4901_13.
Diener E., Suh E.M., Lucas R.E., Smith H.L. Subjective well-being: Three decades of progress. Psychological Bulletin. 1999;125(2):276.
Diener E., Wirtz D., Tov W., Kim-Prieto C., Choi D.W., Oishi S., Biswas-Diener R. New well-being measures: Short scales to assess flourishing and positive and negative feelings. Social Indicators Research. 2010;97(2):143–156.
Epskamp S., Maris G., Waldorp L.J., Borsboom D. In: Handbook of psychometrics. Irwing P., Hughes D., Booth T., editors. Wiley; New York: 2018. Network psychometrics; pp. 953–986.
Fredrickson B.L. The role of positive emotions in positive psychology: The broaden-and-build theory of positive emotions. American Psychologist. 2001;56(3):218. doi: 10.1037//0003-066x.56.3.218.
García-Dantas A., Justo-Alonso A., Rio-Casanova L.D., González-Vázquez A.I., Sánchez-Martín M. 2020. Immediate psychological responses during the early stage of the coronavirus pandemic (COVID-19) in the general population in Spain. (Available at SSRN 3576927)
Golino H.F., Epskamp S. Exploratory graph analysis: A new approach for estimating the number of dimensions in psychological research. Plos One. 2017;12(6) doi: 10.1371/journal.pone.0174035.
Holmes E.A., et al. Multidisciplinary research priorities for the COVID-19 pandemic: A call for action for mental health science. Lancet Psychiatry. 2020;7:547–560. doi: 10.1016/S2215-0366(20)30168-1.
Ingram R.E., Luxton D.D. Development of psychopathology: A vulnerability-stress perspective. 2005. Vulnerability-stress models; pp. 32–46.
Jackson P., Agunwamba C. Lower bounds for the reliability of the total score on a test composed of nonhomogeneous items: I: Algebraic lower bounds. Psychometrika. 1977;42:567–578.
Kapoor H., Kaufman J.C. Meaning-making through creativity during COVID-19. Frontiers in Psychology. 2020;11:595990. doi: 10.3389/fpsyg.2020.595990.
Kassambara A. Machine Learning Essentials (Ed. 1) 2018. Available at http://www.sthda.com/english
Kerr B. In: The encyclopedia of positive psychology. Lopez S., editor. Blackwell; Chichester: 2009. Creativity; pp. 254–256.
Keyes C.L., Wissing M., Potgieter J.P., Temane M., Kruger A., Van Rooy S. Evaluation of the mental health continuum–short form (MHC–SF) in Setswana-speaking South Africans. Clinical Psychology & Psychotherapy. 2008;15(3):181–192. doi: 10.1002/cpp.572.
Kyriazos T.A. Applied psychometrics: Writing-up a factor analysis construct validation study with examples. Psychology. 2018;9:2503–2530.
Kyriazos T.A., Stalikas A., Prassa K., Yotsidi V. A 3-faced construct validation and a bifactor subjective well-being model using the Scale of Positive and Negative Experience, Greek version. Psychology. 2018;9:1143–1175.
Kyriazos T.A., Stalikas A., Prassa K., Yotsidi V. Can the depression anxiety stress scales short be shorter? Factor structure and measurement invariance of DASS-21 and DASS-9 in a Greek, non-clinical sample. Psychology. 2018;9:1095–1127.
Lovibond P.F., Lovibond S.H. The structure of negative emotional states: Comparison of the Depression Anxiety Stress Scales (DASS) with the Beck Depression and Anxiety Inventories. Behaviour Research and Therapy. 1995;33(3):335–343. doi: 10.1016/0005-7967(94)00075-u.
MacCallum R.C., Browne M.W., Sugawara H.M. Power analysis and determination of sample size for covariance structure modeling. Psychological Methods. 1996;1(2):130.
Mazza C., et al. A nationwide survey of psychological distress among Italian people during the COVID-19 pandemic: Immediate psychological responses and associated factors. International Journal of Environmental Research and Public Health. 2020;17:3165. doi: 10.3390/ijerph17093165.
Ruini C. Springer; Switzerland: 2017. Positive psychology in the clinical domains: Research and practice.
Ryan R.M., Deci E.L. On happiness and human potentials: A review of research on hedonic and eudaimonic well-being. Annual Review of Psychology. 2001;52:141–166. doi: 10.1146/annurev.psych.52.1.141.
Schlosser L., Hothorn T., Zeileis A. The power of unbiased recursive partitioning: A unifying view of CTree, MOB, and GUIDE. 2019. https://arXiv.org/abs/1906.10179 E-Print Archive.
Seligman M.E.P. Free Press; New York: 2002. Authentic happiness: Using the new positive psychology to realize your potential for lasting fulfillment.
Seligman M.E.P. Nicholas Brealey; London: 2011. Flourish: A new understanding of happiness and wellbeing and how to achieve them.
Shadish W.R., Cook T.D., Campbell D.T. Houghton Mifflin; Boston: 2002. Experimental and quasi-experimental designs for generalized causal inference.
Srilakshmidevi B., Suseela V. Psychological issues based on gender and marital status during Covid-19 lockdown period. Tathapi. 2020;19(8):755–764.
Tang M., Hofreiter S., Reiter-Palmon R., Bai X., Murugavel V. Creativity as a means to well-being in times of COVID-19 pandemic. Frontiers in Psychology. 2021;12:601389. doi: 10.3389/fpsyg.2021.601389.
Therneau T., Atkinson B., Ripley B., Ripley M.B. 2015. Package ‘rpart’. Available online: cran.ma.ic.ac.uk/web/packages/rpart/rpart.pdf.
Yusoff M.S.B. Psychometric properties of the depression anxiety stress scale in a sample of medical degree applicants. International Medical Journal. 2013;20:295–300.
