PLOS Mental Health. 2026 Mar 4;3(3):e0000552. doi: 10.1371/journal.pmen.0000552

Contribution of social determinants to symptoms of generalized anxiety disorder

Jerzy Bala 1, Jennifer J Newson 1, Tara C Thiagarajan 1,*
Editor: Karli Montague-Cardoso 2
PMCID: PMC12959655  PMID: 41779807

Abstract

Symptoms of anxiety are known to be triggered by a range of life context factors including early life trauma, poor sleep quality, infrequent exercise, unemployment and social isolation. Machine learning techniques offer a powerful method for analyzing these factors in combination, enabling the evaluation of aggregate predictive associations rather than causal pathways and the identification of their relative association with anxiety symptoms. However, most studies examining these factors have either been small-scale or included only a small number of factors. Here we applied multiple machine learning approaches (Random Forest, Gradient Boosting, Naïve Bayes, Information Gain, and SHAP) to a cross-sectional data sample of 4,186 individuals to reveal how a broad range of lifestyle and life context factors are associated with the experience of anxiety symptoms, as measured by the Generalized Anxiety Disorder-7 screening questionnaire (GAD-7). The results showed that, in combination, early life trauma, poor sleep quality, infrequent exercise, unemployment, and deterioration of social bonds were substantially associated with anxiety symptoms, particularly for older age groups, with frequency of a good night’s sleep having an outsized impact. For older ages, this was followed by employment status and experience of interpersonal trauma, as well as frequency of in-person socializing. For younger ages (18–34), employment status was less important with interpersonal trauma being a more significant factor. Specifically, poor sleep, rarely socializing in person, not being able to work or being unemployed, bullying by peers, or neglect/abuse by a parent or caregiver had the largest associations with anxiety symptoms. These findings have implications for how we approach both prevention and treatment of anxiety.

Introduction

Feelings of fear and anxiety, evoked by threats or by the anticipation of real or imagined threats, are natural adaptive responses that facilitate survival. However, when these feelings become excessive in frequency, severity or duration, they can become maladaptive and functionally impairing. Although there are multiple disorders associated with maladaptive anxiety (e.g., phobias, panic, social anxiety), Generalized Anxiety Disorder (GAD) is among the most prevalent and clinically significant anxiety-related conditions, and is described by the DSM-5 as ‘excessive worry and apprehensive expectations, occurring more days than not for at least 6 months, about a number of events or activities, such as work or school performance’ [1]. Within a primary care or therapeutic setting, it is often initially screened for using the Generalized Anxiety Disorder-7 questionnaire (GAD-7), which includes 7 questions on symptoms such as worry, fear, restlessness and irritability; respondents rate each item according to how frequently they have been bothered by the problem over the last two weeks [2]. A sum score threshold of 10 on the GAD-7 (moderate anxiety) has been shown to have a sensitivity of 89% and a specificity of 82% for GAD, as determined by telephone interview with a mental health professional [2].

Although there are various treatments available to help reduce or manage symptoms of anxiety (e.g., cognitive behavioural therapy, psychoactive medications), its high and increasing prevalence in the general population [3–6], debilitating life impact [7,8] and low treatment availability [9] mean there is a need to better understand underlying risk factors so that incidence can be preventatively reduced. However, the aetiology and risk profile of anxiety is complex and still poorly understood. Although family studies have revealed a genetic component [10,11], people’s environment, life experiences and lifestyle habits (together, social determinants) all play a prominent role in the onset and trajectory of anxiety symptoms. For example, a range of factors including early life trauma [12–14], poor sleep [15,16], unhealthy diet [17,18], infrequent exercise [19,20], unemployment [21,22] and social isolation [23,24] have all been associated with anxiety symptoms, often bidirectionally.

Machine learning techniques offer a powerful method to analyze these factors in combination to predict outcomes and identify their relative association with anxiety symptoms (see [25–29] for reviews). However, to date, most machine learning studies in this context have been small scale (typically a few hundred people) [30–34], included only a limited number of factors (e.g., demographics and medical history) [35,36], and/or focused on specific populations (e.g., a specific country, age group or clinical population) [31,34–46]. As a result, it is currently unknown how well anxiety symptoms can be predicted by lifestyle and life circumstance factors in the aggregate, and which ones are more important predictors of anxiety symptoms in the general population. This has implications not only for identifying at-risk populations and directing public health policy, but also in terms of how anxiety symptoms can be prevented and how they should be treated from a therapeutic perspective [47–49].

Here we take advantage of a large-scale cross-sectional data collection effort in which a subsample of respondents from the general population (N = 4,186) completed the GAD-7 screening questionnaire and answered questions describing their lifestyle, life circumstances, and experience of various adversities and interpersonal traumas. We applied multiple machine learning approaches (Random Forest, Gradient Boosting, Naïve Bayes, Information Gain and SHAP) to this data sample to determine the degree to which these lifestyle and life experience factors, in the aggregate, were associated with GAD-7 outcomes in terms of statistical classification of symptom severity (rather than prospective prediction), and to describe the relative importance of these different factors. These findings have implications not only for identifying at-risk populations and directing public health policy, but also in terms of how anxiety can be prevented and how it should be treated from a therapeutic perspective [47–49].

Methods

Data acquisition and data elements

Participants were recruited as part of the ongoing Global Mind Project through online advertisements placed on Facebook and Google that targeted age-sex groups and geographical regions across broad-based interests and keywords [50]. The recruitment and data quality procedures have been previously described in detail [59]. Respondents were aged 18–85, predominantly from 20 countries, with 45.7% of respondents reporting their biological sex as male (see Table A in S1 Text for the age and sex breakdown and Table B in S1 Text for the percentage of respondents by country). Participants were directed to the survey website (https://sapienlabs.org/mhq/) and completed the GAD-7 questions as part of a larger assessment of mental wellbeing [51,52]. Respondents also answered questions on a broad range of life context factors including lifestyle habits, life circumstances and experience of various adversities and interpersonal traumas (see Table 1).

Table 1. Life context factors queried in the survey.

| Group | Factor |
| --- | --- |
| Demographics | Age; Biological Sex; Employment status; Educational attainment |
| Geography | Country; State; City |
| Lifestyle | Frequency of getting a good night’s sleep; Frequency of doing exercise; Frequency of Socializing |
| Substance use | Tobacco products (e.g., cigarettes, chewing tobacco, cigars, etc.); Vaping products; Alcoholic beverages (e.g., beer, wine, spirits, etc.); Cannabis (e.g., marijuana, pot, grass, hash, etc.); Cocaine (e.g., coke, crack, etc.); Amphetamine type stimulants (e.g., speed, diet pills, ecstasy, etc.); Inhalants (e.g., nitrous, glue, petrol, paint thinner, etc.); Sedatives or Sleeping Pills (e.g., Benzodiazepines, Valium, Serepax, Rohypnol, etc.); Hallucinogens (e.g., LSD, acid, mushrooms, PCP, Special K, etc.); Opioids (e.g., heroin, morphine, methadone, codeine, etc.); Barbiturates (e.g., Nembutal, Pentobarbital) |
| Medical | Presence/absence of diagnosed medical disorder; Type of medical condition; Mental health treatment status; Reasons for not seeking help; Type of mental health treatment; Diagnosed mental health disorder(s) |
| Childhood Interpersonal Traumas | Parental divorce or family breakup; Prolonged physical abuse, or severe physical assault; Prolonged sexual abuse, or severe sexual assault; Physical violence in the home between family members (e.g., between parents); Cyberbullying or online abuse; Prolonged or sustained bullying in person from peers; Prolonged emotional or psychological abuse or neglect from parent/caregiver; Lived with a parent/caregiver who was an alcoholic or who regularly used street drugs; Threatening, coercive or controlling behavior by another person; Forced family control over major life decisions (e.g., marriage); Parent/Caregiver/Sibling with mental illness or who committed suicide; Parent/Caregiver/Sibling went to prison; Serious injury, harm, or death you caused to someone else |
| Childhood Financial Adversities | Extreme poverty leading to homelessness and/or hunger |
| Childhood Other Adversities | Life threatening or debilitating injury or illness; Sudden or premature death of a parent or sibling; Involvement or close witness to a war; Displacement from your home due to political, environmental or economic reasons; Suffered a loss in a major fire, flood, earthquake, or natural disaster; Caring for a parent or sibling with a major chronic disability or illness |
| Adulthood Interpersonal Traumas | Divorce/separation or family breakup; Prolonged physical abuse, or severe physical assault; Prolonged sexual abuse, or severe sexual assault; Cyberbullying or online abuse; Serious injury, harm, or death you caused to someone else; Threatening, coercive or controlling behavior by another person; Forced family control over major life decisions (e.g., marriage) |
| Adulthood Financial Adversities | Extreme poverty leading to homelessness and/or hunger; Loss of your job or livelihood leading to an inability to make ends meet |
| Adulthood Other Life Adversities | Life threatening or debilitating injury or illness; Sudden or premature death of a loved one; Caring for a child or partner with a major chronic disability or illness; Involvement or close witness to a war; Suffered a loss in a major fire, flood, earthquake, or natural disaster; Displacement from your home due to political, environmental or economic reasons |

Cross-sectional data was collected between 14/09/2022 and 29/09/2022, during which 4,421 respondents completed the GAD-7. Standard Global Mind Project cleaning criteria were applied to the data. Only respondents who responded ‘Yes’ to the MHQ question ‘Did you find this assessment easy to understand?’, and who had a standard deviation of >0.2 across MHQ rated question responses were included in the analysis, leading to a final sample size of 4,186. This criterion, part of the validated MHQ quality-control framework [51,59], uses within-respondent response variability to identify disengaged or inattentive respondents; SD < 0.2 across 47 MHQ items indicates extremely low variability (e.g., selecting the same response for nearly all items), a pattern empirically associated with poor data quality.
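The two inclusion criteria described above can be illustrated with a minimal Python sketch. This is not the Global Mind Project's cleaning code: the field names, the example ratings, and the use of the population standard deviation (`statistics.pstdev`) are assumptions made for illustration, since the text does not state which SD estimator is used.

```python
from statistics import pstdev

def passes_quality_check(understood, ratings, min_sd=0.2):
    """Return True if a respondent meets both inclusion criteria.

    understood: answer to 'Did you find this assessment easy to understand?'
    ratings:    the respondent's answers to the rated questions
    min_sd:     variability threshold (0.2 in the text)
    """
    # Criterion 1: the respondent reported understanding the assessment.
    if understood != "Yes":
        return False
    # Criterion 2: responses must vary enough across rated items.
    # Population SD is an assumption; the source does not specify the estimator.
    return pstdev(ratings) > min_sd

# Hypothetical respondents (illustrative values only).
respondents = [
    {"id": "a", "understood": "Yes", "ratings": [5, 5, 5, 5, 5, 5]},  # no variability
    {"id": "b", "understood": "Yes", "ratings": [1, 4, 7, 3, 9, 2]},
    {"id": "c", "understood": "No", "ratings": [1, 4, 7, 3, 9, 2]},
]

kept = [r["id"] for r in respondents
        if passes_quality_check(r["understood"], r["ratings"])]
```

In this toy example only respondent "b" is retained: "a" selects the same response for every item and "c" did not find the assessment easy to understand.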

Participants took part in the online survey voluntarily, anonymously, and without any financial compensation. Participants consented to take part by clicking on a start button after reading a detailed privacy policy. Cases with missing data on predictor variables were excluded from analysis on a listwise basis. All procedures involving human subjects were approved by the Health Media Lab Institutional Review Board (HML IRB; OHRP Institutional Review Board #00001211, Federal Wide Assurance #00001102, IORG #0000850).

Calculation of GAD-7 Sum Scores

No post-stratification weights were applied to the sample, as the study’s primary purpose was model performance evaluation rather than population prevalence estimation. Each GAD-7 item was rated on a frequency scale of 0–3 that reflected how often a symptom had bothered the respondent over the last 2 weeks (0 = Not at all; 1 = Several days; 2 = More than half the days; 3 = Nearly every day). The sum of these ratings, the GAD-7 sum score, was computed for each respondent, and the proportion of respondents within each category (Minimal anxiety = 0–4; Mild anxiety = 5–9; Moderate anxiety = 10–14; Severe anxiety = 15–21) was calculated. GAD-7 sum scores in this data spanned the full range from 0 to 21, with 24.1% having scores of 10 or higher (Fig 1).
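The scoring rule above can be written as a short Python sketch. The function names are illustrative and not taken from the study; the item scale, the severity bands and the threshold of 10 follow the text.

```python
def gad7_sum_score(item_ratings):
    """Sum the seven GAD-7 item ratings (each 0-3), giving a score of 0-21."""
    if len(item_ratings) != 7 or any(r not in (0, 1, 2, 3) for r in item_ratings):
        raise ValueError("expected seven ratings, each an integer from 0 to 3")
    return sum(item_ratings)

def gad7_category(score):
    """Map a sum score to the severity bands used in the text."""
    if score <= 4:
        return "Minimal"
    if score <= 9:
        return "Mild"
    if score <= 14:
        return "Moderate"
    return "Severe"

def is_moderate_to_severe(score, threshold=10):
    """Binary classification target used in this study: sum score >= 10."""
    return score >= threshold

example_score = gad7_sum_score([1, 2, 1, 2, 1, 2, 1])
example_category = gad7_category(example_score)
```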

Fig 1. Histogram of GAD-7 sum scores in the sample.


Classification models

Multiple classification models were compared to identify the model type with the best performance. These included Logistic Regression, tree-based models (Random Forest and Gradient Boosting (XGBoost)) and Naïve Bayes, all implemented using Orange Data Mining, an open-source machine learning and data visualization toolkit designed for data analysis through visual programming or Python scripting (https://orangedatamining.com/). The Logistic Regression model was implemented with L1 (LASSO) regularization (cost strength C = 12) and class balancing enabled. Table C in S1 Text provides the full model comparison across all classifiers. Models were created for the identification of individuals with GAD-7 scores ≥10 and <10 separately, then combined into a composite Logistic Regression model. This approach trains separate classifiers on minority and majority classes to mitigate imbalanced-data bias and is an overfitting-mitigation strategy rather than a method for imputing missing outcome data. All features were one-hot encoded, where each answer option was coded as 1 if selected and 0 if not.
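The one-hot encoding step can be illustrated as follows. This is a plain-Python sketch rather than the Orange workflow used in the study; the helper name and the `question=option` column naming are assumptions, and the option lists contain only the answer options mentioned in this article, so they may be incomplete.

```python
def one_hot_encode(responses, options_by_question):
    """Encode one respondent's answers as 0/1 indicators, one per answer option."""
    encoded = {}
    for question, options in options_by_question.items():
        for option in options:
            # 1 if this option was selected for the question, otherwise 0.
            encoded[f"{question}={option}"] = int(responses.get(question) == option)
    return encoded

# Answer options as named in the text (possibly not the full survey lists).
OPTIONS = {
    "sleep": ["Hardly ever", "Some of the time", "Most of the time", "All of the time"],
    "employment": ["Employed", "Unemployed", "Not able to work",
                   "Homemaker", "Retired", "Studying"],
}

row = one_hot_encode({"sleep": "Hardly ever", "employment": "Retired"}, OPTIONS)
```

Each question contributes as many binary features as it has answer options, which is why the later analyses distinguish a feature (a single option) from a feature category (all options of one question).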

Performance metrics including ROC area under the curve (AUC), accuracy, precision, recall and F1 scores were computed. This was done using all the data together, as well as separating the data by both geography and age. Geographically, models were built separately for all data acquired from western/developed countries (N = 7 countries; 40% of sample) and from non-western/developing countries (N = 13 countries; 60% of sample). Similarly, models were built for each decadal age group, pooling all geographies. Results reported are based on a 5-fold stratified cross-validation.
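To make the metrics and the stratified split concrete, the sketch below computes accuracy, precision, recall and F1 for the positive class and assigns samples to folds so that each fold keeps roughly the same class ratio. It is a simplified stand-in written for illustration (no shuffling, no model fitting), not the Orange implementation used in the study; the 24/76 class split mirrors the approximate proportion of GAD-7 scores ≥10 reported above.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy plus precision, recall and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": correct / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}

def stratified_folds(labels, k=5):
    """Assign sample indices to k folds, dealing each class out round-robin."""
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds

# Toy labels: 24 positives (GAD-7 >= 10) and 76 negatives.
y_true = [1] * 24 + [0] * 76

# A classifier that always predicts the majority class looks accurate
# but never detects the minority class.
baseline = binary_metrics(y_true, [0] * len(y_true))

folds = stratified_folds(y_true, k=5)
positives_per_fold = [sum(y_true[i] for i in fold) for fold in folds]
```

The majority-class baseline reaches an accuracy of 0.76 with a recall of 0 for the positive class, which is why class-specific recall and F1, rather than accuracy alone, are informative for imbalanced data such as these.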

Estimation of contribution of variable categories to model performance

Using Logistic Regression models, the contribution of each category of lifestyle or life experience factor to the model performance in predicting symptoms of moderate to severe anxiety, defined as GAD-7 scores ≥10, was evaluated. These included frequency of exercise, frequency of socializing, employment status, number of interpersonal traumas, number of financial adversities, number of other life adversities and substance use. For each feature category (e.g., all exercise frequencies for the frequency of exercise question), the increase in each performance metric was evaluated at different positions of forward addition, when added first to a base model that included biological sex, and also when added last after all other feature categories/features had been included.
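The first-versus-last comparison can be sketched as below. The helper `addition_deltas` and the toy scoring function are hypothetical: in the study, the score would be a cross-validated metric (e.g., AUC) from a Logistic Regression model retrained on each set of feature categories, whereas here a simple count of distinct "signals" stands in for the metric so that the effect of overlap between categories is easy to see.

```python
def addition_deltas(category, all_categories, score_fn, base=("sex",)):
    """Change in score when `category` is added first vs. last.

    score_fn takes a list of feature categories and returns a performance
    metric for a model trained on them (here supplied by the caller).
    """
    base = list(base)
    others = [c for c in all_categories if c != category]
    # Added first: on top of the base model only.
    added_first = score_fn(base + [category]) - score_fn(base)
    # Added last: after every other category is already in the model.
    added_last = score_fn(base + others + [category]) - score_fn(base + others)
    return added_first, added_last

# Toy stand-in for model performance: each category explains a set of
# signals, and overlapping signals are only counted once.
SIGNALS = {
    "sex": {1},
    "sleep": {2, 3, 4, 5},
    "exercise": {4, 5, 6},   # shares signals 4 and 5 with sleep
    "employment": {7},       # independent of the others
}

def toy_score(categories):
    covered = set()
    for name in categories:
        covered |= SIGNALS[name]
    return len(covered)

categories = ["sleep", "exercise", "employment"]
deltas = {c: addition_deltas(c, categories, toy_score) for c in categories}
```

In this toy setup sleep contributes 4 units when added first but only 2 when added last, because part of its signal is shared with exercise, while the independent employment category contributes the same amount in both positions.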

Information gain

Information Gain was used to assess the contribution of each feature (i.e., each one-hot encoded option), computed as the reduction in binary entropy of the target variable when it is split based on that feature [53]. For feature categories, aggregate Information Gain values were computed by averaging the individual Information Gain values for each feature/answer option within the category.
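As an illustration of this definition, the sketch below computes Information Gain for a binary (one-hot) feature against a binary target and averages it within a category. The function names and toy data are assumptions for illustration; the study itself used Orange Data Mining.

```python
from math import log2

def binary_entropy(labels):
    """Entropy (in bits) of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))

def information_gain(feature, labels):
    """Reduction in target entropy after splitting on a 0/1 feature."""
    selected = [y for x, y in zip(feature, labels) if x == 1]
    not_selected = [y for x, y in zip(feature, labels) if x == 0]
    n = len(labels)
    conditional = (len(selected) / n * binary_entropy(selected)
                   + len(not_selected) / n * binary_entropy(not_selected))
    return binary_entropy(labels) - conditional

def category_information_gain(features, labels):
    """Average Information Gain over all one-hot options in a category."""
    gains = [information_gain(f, labels) for f in features]
    return sum(gains) / len(gains)

# Toy target: 1 = GAD-7 >= 10, 0 = GAD-7 < 10.
labels = [1, 1, 0, 0]
perfect_option = [1, 1, 0, 0]        # fully separates the two classes
uninformative_option = [1, 0, 1, 0]  # carries no information about the target

gain_perfect = information_gain(perfect_option, labels)
gain_none = information_gain(uninformative_option, labels)
gain_category = category_information_gain(
    [perfect_option, uninformative_option], labels)
```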

SHAP

We used the SHAP method to compute Shapley values to assess how specific features affected prediction outcomes. Briefly, the marginal contribution of a feature (a one-hot encoded option of a factor such as exercise) was computed for each grouping (subset) of the other features as the difference in the predicted outcome with and without the feature. The Shapley value was the (weighted) average of these marginal contributions, providing a view of both the magnitude and direction of each feature’s contribution [54].
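The weighted average described above can be made explicit with an exact Shapley computation over a very small feature set. This sketch is for illustration only: the SHAP library estimates these values for a fitted model relative to background data rather than enumerating all subsets, and the feature names and toy value function below are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value_fn):
    """Exact Shapley values; value_fn maps a frozenset of features to a prediction."""
    n = len(features)
    result = {}
    for feature in features:
        others = [f for f in features if f != feature]
        total = 0.0
        for size in range(n):
            # Weight of a subset of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = frozenset(subset)
                # Marginal contribution: prediction with vs. without the feature.
                total += weight * (value_fn(without | {feature}) - value_fn(without))
        result[feature] = total
    return result

def toy_prediction(present):
    """Hypothetical model output given which options are selected."""
    value = 0.2                                   # baseline
    if "poor_sleep" in present:
        value += 0.3
    if "rarely_socializes" in present:
        value += 0.1
    if {"poor_sleep", "rarely_socializes"} <= present:
        value += 0.1                              # interaction, shared by both
    if "employed" in present:
        value -= 0.1                              # pushes toward the negative class
    return value

features = ["poor_sleep", "rarely_socializes", "employed"]
phi = shapley_values(features, toy_prediction)
```

Here the interaction term is split evenly between the two interacting features (0.35 for poor sleep and 0.15 for rarely socializing), the sign of each value gives the direction of the contribution, and the values sum to the difference between the full prediction and the baseline.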

Results

Prediction of moderate to severe anxiety by lifestyle and life circumstance

Here we used multiple models (Logistic Regression, Random Forest, Gradient Boosting and Naïve Bayes) to determine how well, in aggregate, multiple lifestyle and life experience factors predicted symptoms of moderate to severe anxiety, defined as a GAD-7 score of 10 or higher. Given that only 24% of the sample reported symptoms of moderate to severe anxiety, models that classified high GAD-7 scores tended to overfit in training, while models that predicted the converse tended to overgeneralize. The tree-based models, Gradient Boosting and Random Forest, tended to overfit the majority class (GAD-7 < 10; Table C in S1 Text). This is evidenced by their high precision (0.82 and 0.86, respectively) and F1 scores (0.88 and 0.87, respectively) for this class, but considerably lower performance on the minority class (GAD-7 ≥ 10), with lower recall (0.25 and 0.47, respectively) and F1 scores (0.33 and 0.49, respectively). When faced with imbalanced datasets (i.e., unequal class distributions, where only 24% of the sample had GAD-7 scores ≥10), this overfitting to the majority class is characteristic of decision tree-based algorithms, as they tend to create overly complex models that capture noise in the majority class. Naïve Bayes showed similar, though slightly poorer, performance compared to tree-based models. Logistic Regression, while not achieving the absolute highest scores in any single metric, demonstrated the most balanced performance across both classes, as evidenced by its superior AUC (0.80) and F1 score (0.53) for the GAD-7 ≥ 10 class. This balanced performance occurs because tree-based models often maximize overall accuracy by correctly classifying the majority class at the expense of minority class detection, whereas Logistic Regression’s linear decision boundary provides more stable probability estimates across classes, making it the most suitable model for this study’s objective of identifying individuals with moderate to severe anxiety symptoms.
Aggregate performance was similar even when models were created separately for western developed and non-western developing countries (Table 2, bottom rows), with AUC scores of 0.81 and 0.77 and F1-scores of 0.76 and 0.74, respectively. Therefore, Logistic Regression models combining all geographic regions (i.e., both western/developed and non-western/developing countries) were used for further analysis.

Table 2. Composite Logistic Regression performance for classification of moderate to severe anxiety symptoms.

| Composite Model | AUC | Accuracy | Precision | Recall | F1 |
| --- | --- | --- | --- | --- | --- |
| All Countries | 0.80 | 0.73 | 0.80 | 0.73 | 0.75 |
| Western Developed Countries | 0.81 | 0.75 | 0.81 | 0.75 | 0.76 |
| Non-Western Developing Countries | 0.77 | 0.72 | 0.79 | 0.72 | 0.74 |

However, model performance, indicating the ability to classify moderate to severe anxiety symptoms based on lifestyle and life experience factors, was systematically and significantly lower for younger age groups relative to older age groups (Fig 2, Table 3).

Fig 2. Accuracy and F1 scores for the classification of moderate to severe anxiety symptoms using Logistic Regression showed better performance for older age groups.


Table 3. Logistic regression model performance by age group.

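| Age Group | AUC | Accuracy | Precision | Recall | F1 |
| --- | --- | --- | --- | --- | --- |
| 18–24 | 0.69 | 0.67 | 0.68 | 0.67 | 0.67 |
| 25–34 | 0.65 | 0.64 | 0.68 | 0.64 | 0.66 |
| 35–44 | 0.72 | 0.68 | 0.73 | 0.68 | 0.70 |
| 45–54 | 0.73 | 0.71 | 0.76 | 0.71 | 0.73 |
| 55–64 | 0.78 | 0.72 | 0.79 | 0.72 | 0.75 |
| 65–74 | 0.78 | 0.77 | 0.88 | 0.77 | 0.81 |
| 75–84 | 0.61 | 0.80 | 0.90 | 0.80 | 0.84 |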

Accuracy and F1 scores were 0.77 and 0.86, respectively, for those aged 65 and older but only 0.67 and 0.68 for those aged 34 and under, while performance fell in between for the middle age groups. This suggests that while older age groups were more likely to experience anxiety symptoms associated with the lifestyle and adverse life circumstances captured here, younger age groups were increasingly likely to experience anxiety symptoms associated with other factors not included in this model.

Hierarchy of factors contributing to model performance

Many factors contributing to anxiety may be inter-related. For example, one might sleep worse if not exercising or if experiencing interpersonal trauma, or one might be more likely to use substances such as tobacco or alcohol when unemployed. We therefore examined the impact of adding lifestyle and life experience factors on model performance (AUC and F1 scores) when they were added either first or last (Fig 3A, Table 4, Table D in S1 Text). Adding a factor first indicates its contribution inclusive of its interactions and correlations with other factors, while adding it last provides insight into its contribution independent of those interactions and correlations.

Fig 3. Hierarchy of factors contributing to the classification of moderate to severe anxiety symptoms.


(A) Impact to AUC of adding the factor last after all other factors for all ages and 18 to 34. (B) Information Gain of factors. Legend spans both A and B.

Table 4. Ranking of impact of factors on model performance (AUC and F1 scores) based on first or last inclusion in forward addition models.

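| Factor | Average rank: All ages | Average rank: 18–34 | All ages: AUC First | All ages: AUC Last | All ages: F1 First | All ages: F1 Last | 18–34: AUC First | 18–34: AUC Last | 18–34: F1 First | 18–34: F1 Last |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Sleep | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 |
| Employment | 3 | 6 | 2 | 2 | 6 | 3 | 7 | 6 | 7 | 5 |
| Social | 4 | 4 | 3 | 3 | 4 | 7 | 5 | 5 | 3 | 3 |
| Interpersonal-Trauma | 3 | 2 | 4 | 4 | 1 | 2 | 2 | 2 | 2 | 2 |
| Exercise | 5 | 4 | 5 | 4 | 3 | 7 | 4 | 4 | 4 | 2 |
| Substance Use | 6 | 6 | 6 | 6 | 8 | 4 | 6 | 7 | 8 | 4 |
| Education | 6 | 8 | 7 | 5 | 5 | 6 | 9 | 9 | 6 | 6 |
| Financial Adversities | 7 | 8 | 8 | 6 | 7 | 5 | 8 | 8 | 9 | 6 |
| Other Adversities | 7 | 4 | 9 | 6 | 7 | 5 | 3 | 3 | 5 | 6 |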

Note: ‘Impact’ refers to the change in AUC or F1 score when each variable group is added in the forward-selection framework using the LASSO-regularized Logistic Regression classifier. Adding a factor ‘first’ (after the base model with biological sex) captures its contribution inclusive of correlations with other factors; adding ‘last’ reveals its independent contribution after accounting for shared variance. Actual delta values are provided in Table D in S1 Text. R = Rank.

We performed this analysis both for all ages together and for the 18–34 age group separately. Because of inter-relationships between factors, the contributions of all factors diminished substantially when added last compared to when added first, both for all ages and for the 18–34 age group alone. For all age groups together, frequency of good sleep contributed the most to both AUC and F1 scores (0.163 and 0.041, respectively), while employment status had the second highest impact (0.108 and 0.017, respectively), followed by the experience of interpersonal trauma, frequency of social interaction and frequency of exercise. Educational attainment, substance use, financial adversities, and other adversities ranked lower, with no impact when added last. Similarly, frequency of good sleep also ranked highest in its contribution to AUC and F1 scores overall (average rank of adding first and last) for the 18–34 age group alone. However, employment status had little impact (ranked 6th) while the experience of interpersonal trauma ranked higher (2nd). Frequency of exercise and frequency of social interaction followed, jointly ranking 4th. The experience of other adversities (i.e., not financial or interpersonal, such as illness, injury, or natural disasters) also ranked 4th when added first, but had no impact when added last, similar to financial adversities and educational attainment.

Hierarchy of factors using Information Gain

We similarly evaluated the hierarchy of factors associated with moderate to severe anxiety symptoms for all age groups together, and for younger ages 18–34 separately, using Information Gain, a model-independent method (Fig 3B). Here again the results were consistent with the impact on model performance described above. In particular, the feature category of sleep dominated as the most important factor associated with moderate to severe anxiety symptoms across all age groups, followed by employment status, frequency of socializing and the experience of interpersonal trauma. Similarly, for the 18–34 age group alone, sleep was the most important factor, followed by the experience of interpersonal trauma, while employment status had a lesser impact.

The Information Gain values of each feature for all ages are shown in Table E in S1 Text. Getting a good night’s sleep ‘Hardly ever’ or ‘Most of the time’ had the top two highest Information Gain values while ‘Rarely/Never’ exercising or socializing and being ‘Retired’ had the next highest. Bullying by peers and parental abuse or neglect in childhood as well as being ‘Not able to work’ or ‘Unemployed’, other frequencies of getting a good night’s sleep and abuse or assault in childhood were also in the top 15 features.

Factor contribution using SHAP

Finally, we used SHAP (SHapley Additive exPlanations) as a qualitative tool to highlight the directionality and consistency of associations, rather than as a definitive measure of causal importance (Fig 4). We show the direction of impact of the top 4 factors across all ages (frequency of getting a good night’s sleep, employment status, frequency of socializing and frequency of exercising). It is important to note that when predictors are correlated, SHAP values share importance estimates across correlated predictors because they summarize contributions across all feature combination scenarios; however, the consistency of rankings across SHAP, Information Gain, and forward-selection analyses strengthens confidence in the identified hierarchy. Here, each individual was plotted as a point in either blue or red, where blue indicates that the option was not selected and red indicates that it was selected. Values to the right of zero indicate how much the option pushed the model towards a positive classification of moderate to severe anxiety symptoms, while values to the left of zero indicate how much it pushed the model towards a negative classification. Selection of ‘Hardly ever’ having a good night’s sleep consistently and substantially contributed towards a positive classification of moderate to severe anxiety symptoms, while having a good night’s sleep only ‘Some of the time’ also contributed towards a positive classification, but to a lesser extent. In contrast, having a good night’s sleep ‘Most of the time’ or ‘All of the time’ contributed to a negative classification. Similarly, ‘Rarely/never’ socializing in person contributed to a positive classification, while socializing 1–3 times a month or more contributed to a negative classification. Finally, having an employment status of ‘Not able to work’ contributed most strongly to a positive classification of moderate to severe anxiety symptoms, followed by a status of ‘Homemaker’ and ‘Unemployed’. In contrast, being ‘Employed’, ‘Retired’ or ‘Studying’ contributed strongly to a negative classification.

Fig 4. SHAP values for each category of sleep frequency, socializing frequency and employment status.


A red dot represents individuals who selected the option, while a blue dot indicates individuals who did not select the option. Points to the right of 0 indicate a contribution to positive classification of moderate to severe anxiety symptoms, points to the left of 0 indicate a contribution to a negative classification.

Discussion

Here we show that lifestyle and life context factors, particularly sleep quality, frequency of exercise, social interaction, and interpersonal trauma, play a substantial role in predicting moderate to severe anxiety symptoms, defined here as GAD-7 scores of 10 and above. The machine learning models employed here also revealed the hierarchy across these factors, with sleep quality being the most prominent across all age groups. Additionally, the influence of lifestyle factors on anxiety symptoms differs across age groups. These findings build on existing machine learning studies that, to date, have been smaller in scale [30–34] or scope [31,34–40,42–45] and are a first demonstration of the aggregate contribution of a large number of adversities and traumas together with lifestyle factors to the incidence of anxiety symptoms across a large-scale sample from the general population. Altogether, they highlight the complex interplay and hierarchy of factors associated with the experience of anxiety symptoms and have implications for targeted interventions and public health policies aimed at preventing and treating anxiety.

Model performance and differences by age

The lifestyle and life context factors used in this study had substantial predictive power for older age groups. However, their predictive power was systematically diminished for younger age groups. While this study included a broad range of adversities, traumas, and lifestyle factors, some factors that impact mental health outcomes, such as diet (e.g., ultra-processed food [55]) and social media use [56], were not included and could be substantial contributors in younger age groups. In addition, studies are increasingly showing an impact of environmental toxins on mental health [57,58], which could also play a role and remains to be studied. Importantly, the age difference in model performance suggests that the factors associated with symptoms of anxiety differ between older and younger generations, which has implications for how we approach both prevention and treatment. While this finding may appear counterintuitive given that adverse events are commonly reported among younger populations, the specific constellation of lifestyle and life context factors captured here, many of which represent cumulative life experiences such as employment history and interpersonal traumas, may be more salient risk markers for anxiety in older adults, whereas younger adults may be more affected by contemporary factors not included in this model. This pattern is consistent with our previous MHQ study of transdiagnostic mental distress [59], where model performance also improved systematically with age (accuracy 0.68 for ages 18–24 vs. 0.94 for ages 75–84).

Hierarchy of life context factors driving anxiety

We used multiple methods to identify the hierarchy of life context factors associated with symptoms of moderate to severe anxiety. Across these methods we showed consistently that sleep status, and in particular frequent poor sleep, dominated the classification of moderate to severe anxiety symptoms for both age groups. For older age groups this was followed by employment status, frequency of social interaction, number of interpersonal traumas, and frequency of exercise. In particular, for the older age group, not being able to work and being unemployed contributed most substantially to a classification of moderate to severe anxiety symptoms, while being employed or retired had the opposite effect. The experience of interpersonal trauma was a more predictive factor for the 18–34 age group, while employment status was a weaker factor. Among the individual interpersonal traumas, bullying by peers and abuse or neglect by a parent or caregiver in childhood were the strongest predictive factors.

However, although sleep quality was the most significant predictor of moderate to severe anxiety symptoms, even without the inclusion of this factor, other life factors such as social interaction, exercise, employment status and experience of interpersonal trauma, in aggregate, predicted moderate or severe anxiety symptoms with an AUC of 0.75 and F1 score of 0.73. This suggests firstly, that anxiety symptoms are predictably associated with people’s lifestyle habits (i.e., rarely/never socializing or exercising) and certain types of adverse life experiences (i.e., life challenges such as not being able to work, unemployment, and the experience of various types of abuse or assault) and secondly, that these factors likely contribute to poor sleep with reciprocal feedback [60–62]. Although this study was cross-sectional in design and therefore cannot categorically distinguish between causality and consequence of symptoms, these findings point to the substantial sociological basis of anxiety symptoms that could be prevented and mitigated through shifts in culture and economics.

Similarity of factors to transdiagnostic predictions

Anxiety is substantially comorbid with various disorders such as depression, obsessive-compulsive disorders (OCD) and panic disorder [63,64] among others. In line with this, we have previously shown that the same set of lifestyle and life context factors were similarly able to predict overall mental distress [59], as measured by the MHQ, a transdiagnostic measure that aggregates across 47 symptoms spanning 10 disorders [65]. That study used a larger sample (N = 270,000) collected between April 2020 and December 2021, whereas the present study uses a distinct sample (N = 4,186) collected in September 2022 with the GAD-7 as the outcome measure. The comparison between disorder-specific (GAD-7) and transdiagnostic (MHQ) prediction provides unique insight into anxiety-specific risk factors. While the top 5 categories of factors were the same across both studies, there are certain differences worth noting. In particular, social interaction was the top contributor to prediction of the MHQ, followed by sleep quality, exercise frequency, employment status and experience of interpersonal trauma. In contrast, sleep quality played a more dominant role in the prediction of moderate to severe anxiety symptoms. This suggests that the specific symptoms of anxiety may be more tied to sleep than other aspects of mental distress.

Strengths and limitations

Key strengths of this study include the wide range of variables studied, which integrate lifestyle habits with adverse experiences; the use of multiple algorithms, including hierarchical analysis; and the ability to stratify the findings by demographics such as age and geography. However, there are several limitations to note. First, the study is cross-sectional and therefore cannot distinguish between cause and consequence, especially given the bidirectional nature of many of the factors investigated here. Nonetheless, these multivariate data provide a unique opportunity to examine the hierarchy of impact of a broad range of factors on anxiety in a cost-effective and timely fashion and can provide well-evidenced hypotheses for interventional testing. Second, although a wide range of lifestyle and life context factors was used, several key factors were missing. These factors are now included in more recent iterations of the MHQ and will be included in future analyses. Third, the sample population included in this study, while large-scale and obtained through tailored outreach, was a non-probability sample that may not be representative of the general population. As the assessment is performed online, this is particularly the case in countries where internet penetration is lower. However, comparisons of the Global Mind Data from the US showed that it is broadly comparable to data from the US Census, and studies are underway to perform similar comparisons for other countries where equivalent national statistics are available. Additionally, all measures were self-reported, which may be subject to recall bias and social desirability effects. Furthermore, while machine learning techniques offer advantages in handling complex, multivariate relationships, they are susceptible to overfitting and may not generalize well to populations with different characteristics from the training sample.

Implications for approaches to prevention and treatment

Given the known bidirectional association between anxiety and life context, these findings have implications for how we approach both prevention and treatment of anxiety. From a treatment perspective, they suggest that it is important to first determine an individual’s specific life context before making treatment decisions relating to anxiety. At an individual level, lifestyle factors such as exercise and regular social interaction could substantially reduce the risk of severe anxiety symptoms evoked by adversity. In addition, a better understanding of sleep mechanisms and targeting the underlying causes of sleep challenges could also be a possible path to treatment of anxiety symptoms, particularly when these occur in the absence of obvious adverse events or lifestyle risk factors. For instance, sleep apnea or poor sleep hygiene may drive sleep challenges that in turn cause greater anxiety. Finally, at a population level, age-tailored socioeconomic programs aimed at increasing employment opportunities and reducing interpersonal abuse and assault could substantially decrease the incidence of anxiety.

Supporting information

S1 Text. Table A. Percentage of respondents by age and biological sex. Table B. Percentage of respondents by country. Table C. Model results for all model types. Table D. Ranking of impact of factors on model performance (AUC and F1 scores) based on first or last inclusion in forward addition models. Table E. InfoGain values of each answer option for all ages.

(DOCX)

pmen.0000552.s001.docx (28.3KB, docx)

Acknowledgments

We thank members of the Sapien Labs team for their assistance with recruitment and data management. We are grateful to all survey respondents for their participation in the Global Mind Project.

Data Availability

The full dataset from the Global Mind Project, including the data used in this study, is freely available for not-for-profit purposes from the Sapien Labs Researcher Hub. Access can be requested here: https://sapienlabs.org/global-mind-project/researcher-hub/.

Funding Statement

This work was supported by funding from Sapien Labs.

References

  • 1. Diagnostic and statistical manual of mental disorders. 5 ed. APA. 2013.
  • 2. Spitzer RL, Kroenke K, Williams JBW, Löwe B. A brief measure for assessing generalized anxiety disorder: the GAD-7. Arch Intern Med. 2006;166(10):1092–7. doi: 10.1001/archinte.166.10.1092
  • 3. COVID-19 Mental Disorders Collaborators. Global prevalence and burden of depressive and anxiety disorders in 204 countries and territories in 2020 due to the COVID-19 pandemic. Lancet. 2021;398(10312):1700–12. doi: 10.1016/S0140-6736(21)02143-7
  • 4. Szuhany KL, Simon NM. Anxiety Disorders: A Review. JAMA. 2022;328(24):2431–45. doi: 10.1001/jama.2022.22744
  • 5. Wu Y, Li X, Ji X, Ren W, Zhu Y, Chen Z, et al. Trends in the epidemiology of anxiety disorders from 1990 to 2021: A global, regional, and national analysis with a focus on the sociodemographic index. J Affect Disord. 2025;373:166–74. doi: 10.1016/j.jad.2024.12.086
  • 6. Yang X, Fang Y, Chen H, Zhang T, Yin X, Man J, et al. Global, regional and national burden of anxiety disorders from 1990 to 2019: results from the Global Burden of Disease Study 2019. Epidemiol Psychiatr Sci. 2021;30:e36. doi: 10.1017/S2045796021000275
  • 7. Meier SM, Mattheisen M, Mors O, Mortensen PB, Laursen TM, Penninx BW. Increased mortality among people with anxiety disorders: total population study. Br J Psychiatry. 2016;209(3):216–21. doi: 10.1192/bjp.bp.115.171975
  • 8. Mendlowicz MV, Stein MB. Quality of life in individuals with anxiety disorders. Am J Psychiatry. 2000;157(5):669–82. doi: 10.1176/appi.ajp.157.5.669
  • 9. Alonso J, Liu Z, Evans-Lacko S, Sadikova E, Sampson N, Chatterji S, et al. Treatment gap for anxiety disorders is global: Results of the World Mental Health Surveys in 21 countries. Depress Anxiety. 2018;35(3):195–208. doi: 10.1002/da.22711
  • 10. Meier SM, Deckert J. Genetics of Anxiety Disorders. Curr Psychiatry Rep. 2019;21(3):16. doi: 10.1007/s11920-019-1002-7
  • 11. Ask H, Cheesman R, Jami ES, Levey DF, Purves KL, Weber H. Genetic contributions to anxiety disorders: where we are and where we are heading. Psychol Med. 2021;51(13):2231–46. doi: 10.1017/S0033291720005486
  • 12. Chu DA, Williams LM, Harris AWF, Bryant RA, Gatt JM. Early life trauma predicts self-reported levels of depressive and anxiety symptoms in nonclinical community adults: relative contributions of early life stressor types and adult trauma exposure. J Psychiatr Res. 2013;47(1):23–32. doi: 10.1016/j.jpsychires.2012.08.006
  • 13. Juruena MF, Eror F, Cleare AJ, Young AH. The Role of Early Life Stress in HPA Axis and Anxiety. Adv Exp Med Biol. 2020;1191:141–53. doi: 10.1007/978-981-32-9705-0_9
  • 14. Liu J, Shi Y, Xie S, Xing L, Wang L, Li W, et al. Meta-analysis of prospective longitudinal cohort studies on the impact of childhood traumas on anxiety disorders. J Affect Disord. 2025;374:443–59. doi: 10.1016/j.jad.2025.01.067
  • 15. Cox RC, Olatunji BO. Sleep in the anxiety-related disorders: A meta-analysis of subjective and objective research. Sleep Med Rev. 2020;51:101282. doi: 10.1016/j.smrv.2020.101282
  • 16. Chellappa SL, Aeschbach D. Sleep and anxiety: From mechanisms to interventions. Sleep Med Rev. 2022;61:101583. doi: 10.1016/j.smrv.2021.101583
  • 17. Aucoin M, LaChance L, Naidoo U, Remy D, Shekdar T, Sayar N, et al. Diet and Anxiety: A Scoping Review. Nutrients. 2021;13(12):4418. doi: 10.3390/nu13124418
  • 18. Chen H, Cao Z, Hou Y, Yang H, Wang X, Xu C. The associations of dietary patterns with depressive and anxiety symptoms: a prospective study. BMC Med. 2023;21(1):307. doi: 10.1186/s12916-023-03019-x
  • 19. Stanczykiewicz B, Banik A, Knoll N, Keller J, Hohl DH, Rosińczuk J, et al. Sedentary behaviors and anxiety among children, adolescents and adults: a systematic review and meta-analysis. BMC Public Health. 2019;19(1):459. doi: 10.1186/s12889-019-6715-3
  • 20. Allen MS, Walter EE, Swann C. Sedentary behaviour and risk of anxiety: A systematic review and meta-analysis. J Affect Disord. 2019;242:5–13. doi: 10.1016/j.jad.2018.08.081
  • 21. Arena AF, Mobbs S, Sanatkar S, Williams D, Collins D, Harris M, et al. Mental health and unemployment: A systematic review and meta-analysis of interventions to improve depression and anxiety outcomes. J Affect Disord. 2023;335:450–72. doi: 10.1016/j.jad.2023.05.027
  • 22. Virgolino A, Costa J, Santos O, Pereira ME, Antunes R, Ambrósio S, et al. Lost in transition: a systematic review of the association between unemployment and mental health. J Ment Health. 2022;31(3):432–44. doi: 10.1080/09638237.2021.2022615
  • 23. Santini ZI, Jose PE, York Cornwell E, Koyanagi A, Nielsen L, Hinrichsen C, et al. Social disconnectedness, perceived isolation, and symptoms of depression and anxiety among older Americans (NSHAP): a longitudinal mediation analysis. Lancet Public Health. 2020;5(1):e62–70. doi: 10.1016/S2468-2667(19)30230-0
  • 24. Wilkialis L, Rodrigues NB, Cha DS, Siegel A, Majeed A, Lui LMW, et al. Social Isolation, Loneliness and Generalized Anxiety: Implications and Associations during the COVID-19 Quarantine. Brain Sci. 2021;11(12):1620. doi: 10.3390/brainsci11121620
  • 25. Altintaş E, Uylaş Aksu Z, Gümüş Demi̇r Z. Machine Learning Techniques for Anxiety Disorder. European Journal of Science and Technology. 2021. doi: 10.31590/ejosat.999914
  • 26. Daza A, Saboya N, Necochea-Chamorro JI, Zavaleta Ramos K, Vásquez Valencia Y del R. Systematic review of machine learning techniques to predict anxiety and stress in college students. Informatics in Medicine Unlocked. 2023;43:101391. doi: 10.1016/j.imu.2023.101391
  • 27. Kotsilieris T, Pintelas E, Livieris IE, Pintelas P. Predicting anxiety disorders and suicide tendency using machine learning: a review. IJMEI. 2020;12(6):599. doi: 10.1504/ijmei.2020.111040
  • 28. Muhammad A, Ashjan B, Ghufran M, Taghreed S, Nada A, Nada A, et al. Classification of Anxiety Disorders using Machine Learning Methods: A Literature Review. Insights Biomed Res. 2020;4(1). doi: 10.36959/584/455
  • 29. Pintelas EG, Kotsilieris T, Livieris IE, Pintelas P. A review of machine learning prediction methods for anxiety disorders. In: Proceedings of the 8th International Conference on Software Development and Technologies for Enhancing Accessibility and Fighting Info-exclusion, 2018. 8–15. doi: 10.1145/3218585.3218587
  • 30. Anbarasi LJ, Jawahar M, Ravi V, Cherian SM, Shreenidhi S, Sharen H. Machine learning approach for anxiety and sleep disorders analysis during COVID-19 lockdown. Health Technol (Berl). 2022;12(4):825–38. doi: 10.1007/s12553-022-00674-7
  • 31. Chavanne AV, Paillère Martinot ML, Penttilä J, Grimmer Y, Conrod P, Stringaris A, et al. Anxiety onset in adolescents: a machine-learning prediction. Mol Psychiatry. 2023;28(2):639–46. doi: 10.1038/s41380-022-01840-z
  • 32. Collins S, Hoare E, Allender S, Olive L, Leech RM, Winpenny EM, et al. A longitudinal study of lifestyle behaviours in emerging adulthood and risk for symptoms of depression, anxiety, and stress. J Affect Disord. 2023;327:244–53. doi: 10.1016/j.jad.2023.02.010
  • 33. Priya A, Garg S, Tigga NP. Predicting Anxiety, Depression and Stress in Modern Life using Machine Learning Algorithms. Procedia Computer Science. 2020;167:1258–67. doi: 10.1016/j.procs.2020.03.442
  • 34. Sau A, Bhakta I. Predicting anxiety and depression in elderly patients using machine learning technology. Healthcare Tech Letters. 2017;4(6):238–43. doi: 10.1049/htl.2016.0096
  • 35. Hueniken K, Somé NH, Abdelhack M, Taylor G, Elton Marshall T, Wickens CM, et al. Machine Learning-Based Predictive Modeling of Anxiety and Depressive Symptoms During 8 Months of the COVID-19 Global Pandemic: Repeated Cross-sectional Survey Study. JMIR Ment Health. 2021;8(11):e32876. doi: 10.2196/32876
  • 36. Tabares Tabares M, Vélez Álvarez C, Bernal Salcedo J, Murillo Rendón S. Anxiety in young people: Analysis from a machine learning model. Acta Psychol (Amst). 2024;248:104410. doi: 10.1016/j.actpsy.2024.104410
  • 37. Byeon H. Exploring Factors for Predicting Anxiety Disorders of the Elderly Living Alone in South Korea Using Interpretable Machine Learning: A Population-Based Study. Int J Environ Res Public Health. 2021;18(14):7625. doi: 10.3390/ijerph18147625
  • 38. Carpenter KLH, Sprechmann P, Calderbank R, Sapiro G, Egger HL. Quantifying Risk for Anxiety Disorders in Preschool Children: A Machine Learning Approach. PLoS One. 2016;11(11):e0165524. doi: 10.1371/journal.pone.0165524
  • 39. Farooq SA, Konda O, Kunwar A, Rajeev N. Anxiety Prediction and Analysis- A Machine Learning Based Approach. In: 2023 4th International Conference for Emerging Technology (INCET), 2023. 1–7. doi: 10.1109/incet57972.2023.10170115
  • 40. Husain W, Xin LK, Rashid NA, Jothi N. Predicting Generalized Anxiety Disorder among women using random forest approach. In: 2016 3rd International Conference on Computer and Information Sciences (ICCOINS), 2016. 37–42. doi: 10.1109/iccoins.2016.7783185
  • 41. Li Y, Song Y, Sui J, Greiner R, Li X-M, Greenshaw AJ, et al. Prospective prediction of anxiety onset in the Canadian longitudinal study on aging (CLSA): A machine learning study. J Affect Disord. 2024;357:148–55. doi: 10.1016/j.jad.2024.04.098
  • 42. Nayan MdIH, Uddin MSG, Hossain MdI, Alam MdM, Zinnia MA, Haq I, et al. Comparison of the Performance of Machine Learning-based Algorithms for Predicting Depression and Anxiety among University Students in Bangladesh. Asian Journal of Social Health and Behavior. 2022;5(2):75–84. doi: 10.4103/shb.shb_38_22
  • 43. Nemesure MD, Heinz MV, Huang R, Jacobson NC. Predictive modeling of depression and anxiety using electronic health records and a novel machine learning approach with artificial intelligence. Sci Rep. 2021;11(1):1980. doi: 10.1038/s41598-021-81368-4
  • 44. Qasrawi R, Vicuna Polo SP, Abu Al-Halawa D, Hallaq S, Abdeen Z. Assessment and Prediction of Depression and Anxiety Risk Factors in Schoolchildren: Machine Learning Techniques Performance Analysis. JMIR Form Res. 2022;6(8):e32736. doi: 10.2196/32736
  • 45. Talib AA, Binnouh A, Alqahtani H, Bassfar Z, Alhmiedat T, Alatawi A. Predicting anxiety among technical employees: A machine learning approach. Harbin Gongcheng Daxue Xuebao/Journal of Harbin Engineering University. 2023;44:953–65.
  • 46. Wei Z, Wang X, Ren L, Liu C, Liu C, Cao M, et al. Using machine learning approach to predict depression and anxiety among patients with epilepsy in China: A cross-sectional study. J Affect Disord. 2023;336:1–8. doi: 10.1016/j.jad.2023.05.043
  • 47. Firth J, Solmi M, Wootton RE, Vancampfort D, Schuch FB, Hoare E, et al. A meta-review of “lifestyle psychiatry”: the role of exercise, smoking, diet and sleep in the prevention and treatment of mental disorders. World Psychiatry. 2020;19(3):360–80. doi: 10.1002/wps.20773
  • 48. Penninx BW, Pine DS, Holmes EA, Reif A. Anxiety disorders. The Lancet. 2021;397(10277):914–27. doi: 10.1016/s0140-6736(21)00359-7
  • 49. Bandelow B, Michaelis S, Wedekind D. Treatment of anxiety disorders. Dialogues Clin Neurosci. 2017;19(2):93–107. doi: 10.31887/DCNS.2017.19.2/bbandelow
  • 50. Taylor J, Sukhoi O, Newson J, Thiagarajan T. Global Mind Project data in the United States: A comparison with national statistics. 2023. https://osf.io/p9ur6
  • 51. Newson JJ, Thiagarajan TC. Assessment of Population Well-Being With the Mental Health Quotient (MHQ): Development and Usability Study. JMIR Ment Health. 2020;7(7):e17935. doi: 10.2196/17935
  • 52. Newson JJ, Pastukh V, Thiagarajan TC. Assessment of Population Well-being With the Mental Health Quotient: Validation Study. JMIR Ment Health. 2022;9(4):e34105. doi: 10.2196/34105
  • 53. Qu K, Xu J, Hou Q, Qu K, Sun Y. Feature selection using Information Gain and decision information in neighborhood decision system. Applied Soft Computing. 2023;136:110100. doi: 10.1016/j.asoc.2023.110100
  • 54. Lundberg S, Lee SI. A Unified Approach to Interpreting Model Predictions. 2017. doi: 10.48550/arXiv.1705.07874
  • 55. Lane MM, Gamage E, Travica N, Dissanayaka T, Ashtree DN, Gauci S, et al. Ultra-Processed Food Consumption and Mental Health: A Systematic Review and Meta-Analysis of Observational Studies. Nutrients. 2022;14(13):2568. doi: 10.3390/nu14132568
  • 56. Twenge JM, Martin GN. Gender differences in associations between digital media use and psychological well-being: Evidence from three large datasets. J Adolesc. 2020;79:91–102. doi: 10.1016/j.adolescence.2019.12.018
  • 57. Grandjean P, Landrigan PJ. Neurobehavioural effects of developmental toxicity. Lancet Neurol. 2014;13(3):330–8. doi: 10.1016/S1474-4422(13)70278-3
  • 58. James AA, OShaughnessy KL. Environmental chemical exposures and mental health outcomes in children: a narrative review of recent literature. Front Toxicol. 2023;5:1290119. doi: 10.3389/ftox.2023.1290119
  • 59. Bala J, Newson JJ, Thiagarajan TC. Hierarchy of demographic and social determinants of mental health: analysis of cross-sectional survey data from the Global Mind Project. BMJ Open. 2024;14(3):e075095. doi: 10.1136/bmjopen-2023-075095
  • 60. Blanchflower DG, Bryson A. Unemployment and sleep: evidence from the United States and Europe. Econ Hum Biol. 2021;43:101042. doi: 10.1016/j.ehb.2021.101042
  • 61. Kajeepeta S, Gelaye B, Jackson CL, Williams MA. Adverse childhood experiences are associated with adult sleep disorders: a systematic review. Sleep Med. 2015;16(3):320–30. doi: 10.1016/j.sleep.2014.12.013
  • 62. Dolezal BA, Neufeld EV, Boland DM, Martin JL, Cooper CB. Interrelationship between Sleep and Exercise: A Systematic Review. Adv Prev Med. 2017;2017:1364387. doi: 10.1155/2017/1364387
  • 63. Goodwin GM. The overlap between anxiety, depression, and obsessive-compulsive disorder. Dialogues Clin Neurosci. 2015;17(3):249–60. doi: 10.31887/DCNS.2015.17.3/ggoodwin
  • 64. Noyes R Jr. Comorbidity in generalized anxiety disorder. Psychiatr Clin North Am. 2001;24(1):41–55. doi: 10.1016/s0193-953x(05)70205-7
  • 65. Newson JJ, Sukhoi O, Thiagarajan TC. MHQ: constructing an aggregate metric of population mental wellbeing. Popul Health Metr. 2024;22(1):16. doi: 10.1186/s12963-024-00336-y
PLOS Ment Health. doi: 10.1371/journal.pmen.0000552.r001

Decision Letter 0

Karli Montague-Cardoso

10 Nov 2025

PMEN-D-25-00320

Contribution of social determinants to symptoms of generalized anxiety disorder

PLOS Mental Health

Dear Dr. Thiagarajan,

Thank you for submitting your manuscript to PLOS Mental Health. I am sorry for the delay. After careful consideration of the reviewer reports, we feel that your paper has merit but does not fully meet PLOS Mental Health’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please address all of the comments raised, which you can find below. You will notice that Reviewer 1 was especially concerned about the rationale and analysis approach and so it is important that you focus on these particular aspects. We will be unable to proceed with the paper if these concerns are not fully addressed.

Please submit your revised manuscript by Dec 24 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at mentalhealth@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pmen/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.

Kind regards,

Karli Montague-Cardoso

Staff Editor

PLOS Mental Health

Journal Requirements:

1. We noticed that you have some minor occurrences of overlapping text with the following previous publication(s), which need to be addressed:

- https://doi.org/10.3389/fpsyt.2025.1576964

- https://doi.org/10.1136/bmjopen-2023-075095

In your revision ensure you cite all your sources (including your own works), and quote or rephrase any duplicated text outside the methods section. Further consideration is dependent on these concerns being addressed.

2.  Please note that PLOS Mental Health has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, we expect all author-generated code to be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/mentalhealth/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse.

3. Please amend your detailed Financial Disclosure statement. This is published with the article. It must therefore be completed in full sentences and contain the exact wording you wish to be published.

i. Please clarify all sources of financial support for your study. List the grants, grant numbers, and organizations that funded your study, including funding received from your institution. Please note that suppliers of material support, including research materials, should be recognized in the Acknowledgements section rather than in the Financial Disclosure.

ii. State the initials, alongside each funding source, of each author to receive each grant. For example: "This work was supported by the National Institutes of Health (####### to AM; ###### to CJ) and the National Science Foundation (###### to AM)."

iii. State what role the funders took in the study. If the funders had no role in your study, please state: “The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.”

iv. If any authors received a salary from any of your funders, please state which authors and which funders.

4. Please ensure that your Ethics Statement is available in its entirety at the beginning of your Methods section, under a subheading 'Ethics Statement'. It must include:

1) The name(s) of the Institutional Review Board(s) or Ethics Committee(s)

2) The approval number(s), or a statement that approval was granted by the named board(s)

3) (for human participants/donors) - A statement that formal consent was obtained (must state whether verbal/written) OR the reason consent was not obtained (e.g. anonymity).

5. Please provide separate figure files in .tif or .eps format.

For more information about figure files please see our guidelines:

https://journals.plos.org/mentalhealth/s/figures

https://journals.plos.org/mentalhealth/s/figures#loc-file-requirements

6. For studies involving third-party data, we encourage authors to share any data specific to their analyses that they can legally distribute. PLOS recognizes, however, that authors may be using third-party data they do not have the rights to share. When third-party data cannot be publicly shared, authors must provide all information necessary for interested researchers to apply to gain access to the data (https://journals.plos.org/plosone/s/data-availability#loc-acceptable-data-access-restrictions).

For any third-party data that the authors cannot legally distribute, they should include the following information in their Data Availability Statement upon submission:

1) A description of the data set and the third-party source

2) If applicable, verification of permission to use the data set

3) Confirmation of whether the authors received any special privileges in accessing the data that other researchers would not have

4) All necessary contact information others would need to apply to gain access to the data

If the reviewer comments include a recommendation to cite specific previously published works, please review and evaluate these publications to determine whether they are relevant and should be cited. There is no requirement to cite these works unless the editor has indicated otherwise.

Additional Editor Comments (if provided):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Does this manuscript meet PLOS Mental Health’s publication criteria?

Reviewer #1: No

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available (please refer to the Data Availability Statement at the start of the manuscript PDF file)?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

Reviewer #2: Yes

**********

Reviewer #1: The abstract gives me the impression that the results will be a hodge-podge, as the types of predictors listed include potential causes (e.g., childhood adversities), prodromes (e.g., sleep problems), and endogenous variables (e.g., social isolation). And the classifiers listed included most that are predictive and one, SHAP, that merely helps interpret results from the others. We are told nothing in the abstract about sample size, design, or strength of associations.

Line 52 Specific phobia and adjustment disorder are much more common than GAD

Line 89 Sentence structure

Line 95 You’re not really “predicting” if the study is cross-sectional and some of the variables are likely to have bidirectional associations with GAD.

Lines 122-2 I don’t understand the rationale for this exclusion based on standard deviation. On the face of it, it does not make sense. Spell out the rationale.

Line 132 Based on absence of mention, I take it that no calibration was used to make this sample representative of the population on known population characteristics.

Line 150 I don’t get this. You estimated a model for having GAD and another model for not having GAD? Why? This was to deal with missing outcomes data?

The description of the analysis approach didn’t make much sense. Nor did the presentation of results. We’d normally expect lasso to be the benchmark model and to see how much you could improve on this with more complex models. But I never saw a table reporting this. And Table 4 doesn’t tell us either the classifier on which results are based or what the meaning of “impact” is. Are these rank orders of SHAP values? From what model?

It's important to appreciate that SHAP values do not tell you unequivocally about predictor importance. When predictors are correlated, estimates of importance get shared across correlated predictors because the SHAP value summarizes across all scenarios, in some of which those correlated predictors are present. Your interpretations are naïve.

Reviewer #2: This is a very interesting study, and the authors should be commended for a well-written and impactful manuscript. Specific comments follow:

1. Although the title mentions the contribution of social determinants, the term itself does not appear in the manuscript. It is unclear whether the authors are referring to social determinants of health or specifically of mental health. The study explores contextual factors such as life trauma, poor sleep quality, infrequent exercise, unemployment, and social isolation, so it would be helpful to clarify why the term “social determinants” was not used.

2. While the variables are outlined on pages 4 and 5, there is limited information on how demographics, geography, lifestyle, substance use, medical history, and childhood interpersonal trauma were measured. For example, it is clear that GAD-7 was used to assess anxiety symptoms, but it is not clear how the other measures were defined and operationalised.

3. On page 8, the manuscript states that only respondents who answered “yes” to the MHQ question, completed the survey in a timeframe appropriate for reading all questions (above 7 minutes), and had a standard deviation above 0.2 across MHQ-rated responses were included in the analysis. The authors should clarify why these criteria were chosen and how they were assessed.

4. On page 11, the term “imbalanced datasets” is introduced without explanation. It is unclear what this refers to: was the dataset imbalanced because of unequal class distributions, missing data, or another reason? Additionally, on page 12, it is noted that logistic regression did not achieve the highest scores in any single metric. It would be helpful to briefly explain why this might be the case. The term “geographies” is also introduced on page 12 in the statement “regression models combining all geographies were used for further analysis.” The authors should clarify what “geographies” refers to in this context. Further, how was missing data handled?

5. The finding that older age groups were more likely to experience anxiety symptoms associated with lifestyle and adverse life circumstances, while younger age groups were increasingly affected by other factors not included in the model, is interesting. However, this seems counterintuitive, as anxiety symptoms associated with adverse events are often more prevalent among younger populations. The authors may wish to elaborate on this finding, if necessary.

6. In the limitations section, the authors could consider noting the use of self-reported measures, and the potential limitations of machine learning techniques and algorithms.
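The inclusion criteria queried in comment 3 can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the authors' pipeline: the field names (`confirmed`, `minutes_taken`, `ratings`) and the example records are hypothetical, and the use of the population standard deviation is an assumption, since the comment does not say which form was used.

```python
from statistics import pstdev

MIN_MINUTES = 7.0  # threshold quoted in comment 3
MIN_SD = 0.2       # threshold quoted in comment 3


def passes_quality_checks(response):
    """Return True if a response meets all three criteria described in comment 3."""
    return (
        response["confirmed"] == "yes"                # answered "yes" (hypothetical field)
        and response["minutes_taken"] > MIN_MINUTES   # enough time to read all questions
        and pstdev(response["ratings"]) > MIN_SD      # not the same rating throughout
    )


# Hypothetical records, made up purely to exercise the filter.
responses = [
    {"id": 1, "confirmed": "yes", "minutes_taken": 12.5, "ratings": [1, 5, 9, 3]},
    {"id": 2, "confirmed": "yes", "minutes_taken": 4.0, "ratings": [1, 5, 9, 3]},   # too fast
    {"id": 3, "confirmed": "yes", "minutes_taken": 15.0, "ratings": [5, 5, 5, 5]},  # no variation
    {"id": 4, "confirmed": "no", "minutes_taken": 20.0, "ratings": [2, 7, 4, 8]},   # answered "no"
]

kept_ids = [r["id"] for r in responses if passes_quality_checks(r)]
print(kept_ids)
```

Writing the criteria as explicit thresholds makes it easy to report how many respondents each rule excludes, which is the kind of detail the comment asks for.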

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

Do you want your identity to be public for this peer review? If you choose “no”, your identity will remain anonymous but your review may still be made public.

For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

PLOS Ment Health. doi: 10.1371/journal.pmen.0000552.r003

Decision Letter 1

Karli Montague-Cardoso

19 Jan 2026

Contribution of social determinants to symptoms of generalized anxiety disorder

PMEN-D-25-00320R1

Dear Dr. Thiagarajan,

We are pleased to inform you that your manuscript 'Contribution of social determinants to symptoms of generalized anxiety disorder' has been provisionally accepted for publication in PLOS Mental Health.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they'll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact mentalhealth@plos.org.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Mental Health.

Best regards,

Karli Montague-Cardoso

Staff Editor

PLOS Mental Health

***********************************************************

Reviewer Comments (if any, and for reference):

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Text. Table A. Percentage of respondents by age and biological sex. Table B. Percentage of respondents by country. Table C. Model results for all model types. Table D. Ranking of impact of factors on model performance (AUC and F1 scores) based on first or last inclusion in forward addition models. Table E. InfoGain values of each answer option for all ages.

    (DOCX)

    pmen.0000552.s001.docx (28.3KB, docx)
    Attachment

    Submitted filename: Response to Reviewers.docx

    pmen.0000552.s003.docx (19.5KB, docx)

    Data Availability Statement

    The full dataset from the Global Mind Project, including the data used in this study, is freely available for not-for-profit purposes from the Sapien Labs Researcher Hub. Access can be requested here: https://sapienlabs.org/global-mind-project/researcher-hub/.


    Articles from PLOS Mental Health are provided here courtesy of PLOS
