Author manuscript; available in PMC: 2025 May 1.
Published in final edited form as: J Affect Disord. 2024 Feb 8;352:133–137. doi: 10.1016/j.jad.2024.02.012

Linguistic Markers of Anxiety and Depression in Somatic Symptom and Related Disorders: Observational Study of a Digital Intervention

Matteo Malgaroli 1, Thomas D Hull 2, Adam Calderon 1,3, Naomi M Simon 1
PMCID: PMC10947071  NIHMSID: NIHMS1968371  PMID: 38336165

Abstract

Background:

Somatic Symptom and Related Disorders (SSRD), including chronic pain, result in frequent primary care visits, depression and anxiety symptoms, and diminished quality of life. Treatment access remains limited due to structural barriers and functional impairment. Digital delivery can improve access and enables transcript analysis via Natural Language Processing (NLP) to inform treatment. Therefore, we investigated asynchronous message-delivered SSRD treatment and used NLP methods to identify markers of symptom reduction from emotional valence.

Methods:

173 individuals diagnosed with SSRD received interventions from licensed therapists via messaging 5 days/week for 8 weeks. Depression and anxiety symptoms were measured with the PHQ-9 and GAD-7 at baseline and every three weeks thereafter. Symptom trajectories were identified using unsupervised random forest clustering. Expressed emotional valence and use of emotional words were extracted from patients’ de-identified transcripts using VADER and the NRC Emotion Lexicon, respectively. Valence differences were examined using logistic regression.

Results:

Two subpopulations were identified, showing symptom Improvement (n=72; 41.62%) and Non-response (n=101; 58.38%). Improvement patients expressed more positive valence in the first week of treatment (OR=1.84; CI: 1.12–3.02; p=.015) and were less likely to express negative valence by the end of treatment (OR=.50; CI: .30–.83; p=.008). Non-response patients used more negative valence words, including “pain”.

Limitations:

Findings were derived from observational data obtained during an ecological intervention, without the inclusion of a control group.

Conclusions:

NLP identified linguistic markers distinguishing changes in anxiety and depression symptoms over treatment. Digital interventions offer new forms of delivery and provide the opportunity to automatically collect data for linguistic analysis.

Keywords: Digital Health, Somatic Symptom and Related Disorders, Depression, Anxiety, Pain, Natural Language Processing

Introduction

Medically unexplained pain and somatic complaints are the most frequent reason for primary care visits (Kroenke, 2003). Of these, 10–20% develop into chronic conditions known as Somatic Symptom and Related Disorders (SSRD) in DSM-5, and as Somatoform Disorders in ICD-10 (Hüsing et al., 2018). Individuals with SSRD suffer from depression and anxiety (Katz et al., 2015), functional impairment, and low quality of life (Liao et al., 2019). Although advances in treating SSRD have been made, only a minority of individuals receive mental health care (Mack et al., 2014). Structural obstacles include a shortage of clinicians, lack of training (Murray et al., 2016), and hesitance in discussing psychological problems in primary care (Katz et al., 2015). Given the debilitating impact of SSRD, it is imperative to improve rates of treatment access and to evaluate markers of symptom reduction.

Digital health interventions offer a solution to overcome structural and individual barriers (Patel et al., 2020). Among the synchronous and asynchronous forms of telemedicine, two-way messaging is increasingly used for scalable, clinician-administered treatment (Hull et al., 2020). An analytic benefit of messaging is that it generates therapy transcripts from communications between patients and therapists. When paired with Natural Language Processing (NLP) algorithms, textual data can be analyzed to identify conversational markers that could potentially inform clinical decision-making (Malgaroli et al., 2023). Linguistic signatures of patients’ emotional processes can be identified in their conversations with therapists and be associated with fluctuations in anxiety and depression (Nook et al., 2022). Among NLP markers, there is an increasing focus on analyzing patients’ emotional valence, consisting of positive (e.g., joy, happiness, and excitement) and negative (e.g., sadness, anger, frustration, and fear) emotions expressed by patients during treatment (Burkhardt et al., 2021). While these studies demonstrate the feasibility and benefits of capturing NLP markers, less is known about linguistic characteristics related to treatment response for SSRD. Expressions of negative valence emotions are particularly crucial for SSRD, as they are linked to increased somatic pain perception (Lumley et al., 2011).

Therefore, to study advances in digital health treatments and identify treatment markers, we conducted a longitudinal observation of treatment delivered using two-way asynchronous messaging. We examined 8 weeks of treatment in which participants received interventions from licensed therapists delivered via messaging five days a week. We identified trajectories of treatment outcome based on anxiety and depression symptom changes over 8 weeks of interventions. We then used NLP methods to analyze valence markers in patient messages to distinguish improving patients from non-responders.

Methods

Participants and setting

The sample consisted of treatment-seeking individuals in the United States who signed up for a digital health platform used by independently practicing licensed therapists (www.talkspace.com). The platform is accessible through employee assistance programs, self-referral, and as a behavioral health benefit through individual insurance plans. Data for the study were collected as part of organizational quality assurance and program management processes between February 2017 and June 2021. All patients and clinicians gave written informed consent to use their data in a de-identified, aggregate format as part of the user agreement before they began using the platform. Study procedures were approved by the NYU IRB.

Patients received a primary ICD-10 diagnosis of Pain Disorder, Somatic Disorder, Conversion Disorder, or Hypochondria. Patients were diagnosed by a licensed clinician using a standardized intake to identify presenting complaints and treatment history. Additional inclusion criteria were: 1) baseline score on the PHQ-9 and/or GAD-7 ≥10; 2) being an English speaker; 3) regular internet access. Exclusion criteria were: 1) current or past diagnoses of bipolar disorder, schizophrenia spectrum, and psychotic disorders; 2) substance or alcohol use disorder; 3) any condition deemed by the intake clinician to require hospitalization.

We evaluated 519 electronic records, of which 301 were excluded due to comorbid conditions, and another 45 due to sub-clinical symptoms. The final sample consisted of 173 patients.

Intervention and Measures

The intervention consisted of two-way asynchronous messages exchanged through a HIPAA-compliant interface for smartphones and computers. Following intake, patients communicated with their provider at any time by sending any number of messages. Providers delivered one or more interventions to patients five days a week, based on scheduled response times. The study analyzed a two-month treatment period, aligning with previous findings that most symptom change in asynchronous message therapy happens within this timeframe (Hull et al., 2020; Nook et al., 2022). Psychiatric symptoms were measured using the Patient Health Questionnaire-9 (PHQ-9) (Kroenke et al., 2001) and the Generalized Anxiety Disorder-7 scale (GAD-7) (Spitzer et al., 2006) at baseline and approximately every three weeks throughout therapy. PHQ-9 and GAD-7 scores of 10 or higher show high sensitivity and specificity for moderate depression and anxiety symptoms, respectively (Kroenke et al., 2001; Spitzer et al., 2006). De-identified transcripts of treatment were available, consisting of messages written by patients to their therapists with their time of delivery in masked form. All transcript data was de-identified by the platform before collection, using an algorithm that scrubs out personal identifiers including proper nouns, locations, dates, and others.

Data Analysis

Statistical analyses were performed in R, version 4.1.3. First, unsupervised machine learning identified patient subgroups based on PHQ-9 and GAD-7 outcomes. Second, we analyzed de-identified treatment transcripts with NLP methods to examine patients’ expressed emotions and their associations with treatment outcomes.

Treatment Outcomes.

We identified symptom trajectories with unsupervised random forest and hierarchical clustering, using the R packages randomForest and stats, respectively. This approach allows the identification of patient subtypes (i.e., responders and non-responders) by assessing symptom proximities over time and then aggregating patients based on proximities (Siegel et al., 2021). Specifically, PHQ-9 and GAD-7 symptom scores from baseline to week 8 were used as input for the random forest, without specifying a classification target. The learning procedure generated a proximity matrix that captured similarities between patients: each cell in the proximity matrix is the proportion of times two patients were in the same terminal node of the random forest based on patterns of GAD-7 and PHQ-9 scores over treatment. Proximity scores were then used as input for hierarchical clustering to generate groupings. As the sample size was not large, the number of PHQ-9/GAD-7 clusters was restricted to two.
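
The proximity and clustering steps described above can be illustrated with a minimal Python sketch. It is not the authors' R code: it assumes the random forest has already been fitted and that each patient's terminal-node id per tree is available, and the `leaves` values, the function names, and the choice of average linkage are hypothetical (the linkage method is not stated in the text).

```python
from itertools import combinations

def proximity_matrix(leaves):
    """leaves[i][t] is the terminal-node id of patient i in tree t."""
    n, n_trees = len(leaves), len(leaves[0])
    prox = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        # Proportion of trees in which the two patients share a terminal node
        shared = sum(a == b for a, b in zip(leaves[i], leaves[j]))
        prox[i][j] = prox[j][i] = shared / n_trees
    return prox

def two_clusters(prox):
    """Agglomerative clustering on 1 - proximity, stopped at two clusters.

    Average linkage is an assumption made for this sketch.
    """
    clusters = [[i] for i in range(len(prox))]

    def dist(a, b):
        return sum(1.0 - prox[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 2:
        # Merge the pair of clusters with the smallest average dissimilarity
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical terminal-node assignments: 5 patients x 4 trees
leaves = [
    [1, 3, 2, 5],
    [1, 3, 2, 4],
    [1, 3, 7, 5],
    [6, 8, 9, 2],
    [6, 8, 9, 3],
]
prox = proximity_matrix(leaves)
groups = two_clusters(prox)
print(groups)
```

In this toy input, patients 0 to 2 frequently share terminal nodes with one another and never with patients 3 and 4, so the two-cluster solution separates them accordingly.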

NLP Analysis.

To examine patients’ emotional valence, we divided transcripts according to symptom measures: baseline to 7 days of treatment (Week 0), from the second week of treatment to completion of the second symptom survey (Week 4), and from then to the third assessment (Week 8). For each time point, we extracted aggregated positive and negative valence scores using VADER (Hutto & Gilbert, 2014). VADER was designed to analyze sentiment in digital communications across multiple domains (Hutto & Gilbert, 2014) and has been used to study linguistic markers of depression (Bathina et al., 2021). VADER uses word-order sensitive heuristics to produce valence scores, adjusting the intensity of lexical features based on punctuation, capitalization, and the presence of negations, contrastive conjunctions, and degree adverbs within preceding trigrams. After deriving VADER scores, we identified associations between patients’ valence and treatment trajectories. We conducted a binomial logistic regression with Firth’s penalized likelihood method, including demographics and negative and positive valence (weeks 0 to 8) as predictors and using treatment outcome as the classification target.
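
To make the windowing and rule-based scoring concrete, the following Python sketch assigns messages to the three assessment windows and scores them with a toy lexicon. It is not VADER and not the authors' pipeline: the lexicon entries, the booster and negation constants, the survey days, and all function names are hypothetical, chosen only to show how negations and degree adverbs in the three preceding tokens can adjust a word's valence.

```python
import re

# Hypothetical mini-lexicon (word -> valence); real lexicons are far larger.
LEXICON = {"good": 2.0, "hopeful": 1.5, "better": 2.0,
           "pain": -2.0, "worse": -2.0, "afraid": -2.5}
NEGATIONS = {"not", "no", "never"}
BOOSTERS = {"very": 0.5, "really": 0.5}   # illustrative intensity increments
NEGATION_SCALE = -0.5                     # illustrative flip-and-dampen factor

def message_valence(text):
    """Return (positive, negative) valence sums for one message."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = neg = 0.0
    for i, tok in enumerate(tokens):
        if tok not in LEXICON:
            continue
        score = LEXICON[tok]
        window = tokens[max(0, i - 3):i]  # the three preceding tokens
        for w in window:
            if w in BOOSTERS:             # degree adverbs intensify the score
                score += BOOSTERS[w] if score > 0 else -BOOSTERS[w]
        if any(w in NEGATIONS for w in window):
            score *= NEGATION_SCALE       # negation reverses and dampens
        if score > 0:
            pos += score
        else:
            neg += -score
    return pos, neg

def window_label(day, second_survey_day, third_survey_day):
    """Map a message's treatment day to an assessment window."""
    if day <= 7:
        return "Week 0"
    if day <= second_survey_day:
        return "Week 4"
    if day <= third_survey_day:
        return "Week 8"
    return None

def window_valence(messages, second_survey_day, third_survey_day):
    """Mean (positive, negative) valence per window for one patient."""
    sums = {}
    for day, text in messages:
        label = window_label(day, second_survey_day, third_survey_day)
        if label is None:
            continue
        pos, neg = message_valence(text)
        p, n, k = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (p + pos, n + neg, k + 1)
    return {label: (p / k, n / k) for label, (p, n, k) in sums.items()}

# Hypothetical patient: (treatment day, message text)
messages = [
    (1, "The pain is really worse today"),
    (5, "I am not hopeful"),
    (20, "Feeling better and hopeful"),
    (40, "I am not afraid, things are very good"),
]
scores = window_valence(messages, second_survey_day=30, third_survey_day=56)
print(scores)
```

The per-window positive and negative scores produced this way are the kind of predictors that would then enter the logistic regression alongside demographics.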

To interpret valence findings, we also examined differences in words associated with emotional distress. We examined transcripts at the aggregated level over treatment, by preprocessing and tokenizing transcripts into unigrams using the R package tidytext. We then generated counts of unigrams for each patient using human-curated words associated with Sadness, Fear, and Anticipation/Anxiety from the NRC Emotion Lexicon. We analyzed differences in counts between patient clusters using chi-square analyses, and examined frequencies of words related to somatic complaints (Lumley et al., 2011).
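
The lexicon counting and group comparison can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' tidytext workflow: the word list is a hypothetical stand-in for one NRC Emotion Lexicon category, the messages are invented, and the statistic is a plain Pearson chi-square on a word-by-group count table, which would be unreliable at these toy sample sizes. Obtaining a p-value would additionally require the chi-square distribution from a statistics library.

```python
import re
from collections import Counter

# Hypothetical stand-in for one emotion category of a curated lexicon
FEAR_WORDS = {"pain", "surgery", "hospital", "afraid"}

def lexicon_counts(messages, lexicon):
    """Count unigrams from `lexicon` across a group's messages."""
    counts = Counter()
    for text in messages:
        for tok in re.findall(r"[a-z']+", text.lower()):
            if tok in lexicon:
                counts[tok] += 1
    return counts

def chi_square(counts_a, counts_b):
    """Pearson chi-square for a (word x group) table of counts.

    Returns (statistic, degrees of freedom, total count).
    """
    words = sorted(set(counts_a) | set(counts_b))
    table = [[counts_a[w], counts_b[w]] for w in words]
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for row, rt in zip(table, row_tot):
        for obs, ct in zip(row, col_tot):
            expected = rt * ct / n
            stat += (obs - expected) ** 2 / expected
    df = (len(words) - 1) * (2 - 1)
    return stat, df, n

# Invented example messages for the two groups
non_response = ["The pain is back", "More pain after surgery", "Pain at the hospital"]
improvement = ["Less pain today", "I was afraid of the hospital"]

counts_nr = lexicon_counts(non_response, FEAR_WORDS)
counts_im = lexicon_counts(improvement, FEAR_WORDS)
stat, df, n = chi_square(counts_nr, counts_im)
print(counts_nr, counts_im, round(stat, 3), df, n)
```

With one row per lexicon word and two groups, the degrees of freedom equal the number of distinct words minus one, and n is the total number of lexicon word occurrences.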

Results

Patient characteristics

Mean age was 35 years (SD=9.65). The majority of the sample had obtained a bachelor’s degree or higher (n=117; 67.6%) and identified as white (n=104; 60.1%) and female (n=113; 65.3%). Pain Disorder was the modal diagnosis (n=106; 61.3%), followed by Hypochondria (n=39; 22.5%), Conversion (n=19; 11.0%) and Somatic Disorders (n=9; 5.2%). At baseline, patients endorsed moderate levels of depression (PHQ-9: M=13.9; SD=4.7) and anxiety symptoms (GAD-7: M=13.9; SD=4.7). Treatment duration was two months for 134 patients (77.6%), with 6 patients (4.03%) dropping out within a month. Median survey completion time was 29.84 days (IQR=21–37.44) from baseline to the second survey (Week 4) and 25.73 days (IQR=21.11–30.75) from the second to the third survey (Week 8).

Symptom Trajectories

Unsupervised random forest clustering separated patients into two distinct groups based on PHQ-9 and GAD-7 trajectories (Figure 1). The first and modal group was the Non-response group (n=101; 58.38%), characterized by trajectories of initially moderate anxiety and depression scores (PHQ-9: M=14.19, SD=5.71; GAD-7: M=12.86, SD=4.55) that remained above clinical cut-off levels of depression and anxiety at Week 8 (PHQ-9: M=11.99, SD=3.45; GAD-7: M=11.01, SD=2.91). The second group, Improvement (n=72; 41.62%), consisted of patients with moderate anxiety and depression symptoms at baseline (PHQ-9: M=13.85, SD=5.06; GAD-7: M=15.53, SD=4.38) that decreased over time and fell below clinical cut-offs by Week 8 (PHQ-9: M=9.27, SD=2.38; GAD-7: M=9.06, SD=2.22).

Fig. 1.

Mean trajectories of anxiety (GAD-7) and depression (PHQ-9) symptom scores over 8 weeks of treatment (n = 173). Error bars correspond to 95% confidence intervals.

Linguistic Markers from Treatment Transcripts

We analyzed valence and linguistic differences between Improvement and Non-response groups using de-identified transcripts of patients who completed two months of treatment (n=134). A median of 65.5 (IQR: 30.25–121.75) messages per patient was available.

Results from the logistic regression (Figure 2) indicated that the Improvement group was significantly more likely to express positive valence in the first week of treatment (OR=1.84; CI: 1.12–3.02; p=.015) and less likely to express negative valence by the end of treatment (OR=.50; CI: .30–.83; p=.008). Chi-square analysis suggested lexical differences between Non-response and Improvement, with non-responders using more words related to sadness [χ²(524, n=11427)=1246, p<.001], fear [χ²(584, n=11481)=1371, p<.001], and anticipation/anxiety [χ²(421, n=11947)=819.5, p<.001] throughout treatment. Examining word frequencies of terms within the lexicons showed that non-responders used more somatic terms, including “pain” (Non-response vs. Improvement counts: 578 vs. 214), “surgery” (123 vs. 24), and “hospital” (84 vs. 23).

Fig. 2.

Standardized logistic regression coefficients of linguistic markers of symptom Improvement (n=134).

Discussion

Digital health interventions are gaining interest, but Somatic Symptom and Related Disorders (SSRD) remain underexplored. This study examined PHQ-9 and GAD-7 symptom change trajectories over 8 weeks of messaging-based psychotherapy. In addition, we used NLP methods to identify valence markers of patients’ treatment response. Our findings reveal two distinct patient groups based on treatment outcomes: the Improvement group (41.62%), with clinically significant symptom reduction, and the Non-response group (58.38%), with persisting symptoms. These results align with previous studies on traditional psychological treatments for SSRD (Dessel et al., 2014), and support the feasibility of messaging as an intervention delivery method. Emotional valence analysis of patients’ psychotherapy transcripts indicated that greater positive valence expression in the first week of treatment and lower negative valence at the end of treatment (weeks 4 to 8) were associated with improvement. Significant differences between improving and non-responding patients were observed in the use of words related to sadness, fear, and anxiety throughout treatment. Of particular note, the Non-response group more frequently communicated about “pain” with their therapist. These results support findings showing pain complaints as a predictor of non-response (Mehltretter et al., 2020). An increase in positive valence at the onset of treatment may signal early cognitive shifts in patients’ perceptions of their somatic conditions (Dessel et al., 2014). Concurrently, persistent use of language marked by negative valence and words from lexicons related to negative emotions could help identify patients who might benefit from adjunct treatment strategies. These associations indicate potential for employing passively captured linguistic features as markers in the treatment of SSRD.

Limitations

First, our study was observational and lacked a control group and follow-up assessments, which limits our ability to draw causal inferences about the treatment. Second, we focused on a relatively small sample of English speakers only. As such, our linguistic findings might only generalize to telemedicine settings and similar demographics. Third, our study was limited by the use of the PHQ-9 and GAD-7, which confined our examination to the symptoms and somatic characteristics these scales measure. To validate our initial findings, we suggest further studies with larger, representative samples, clinician-rated measures that encompass somatic symptoms, and a control group. These studies should explore using advanced NLP methods, such as Large Language Models, to improve context-sensitive analysis.

Conclusions

Despite these limitations, our study’s strengths include unique de-identified transcripts for NLP analysis and a data-driven approach addressing SSRD heterogeneity. Furthermore, the study was conducted in a real-world setting, demonstrating the feasibility of using digital platforms both to deliver interventions and to gather data for NLP analysis. Findings underscore the potential to enhance SSRD treatment access using digital interventions and show that NLP can identify markers associated with clinical improvement. NLP offers the opportunity to capture markers of anxiety and depression automatically, more objectively, and without patient burden (Malgaroli et al., 2023). If validated on larger, more representative samples, NLP-extracted markers could help identify at-risk individuals earlier to provide tailored treatment.

Highlights:

  • NLP characterizes emotional differences in anxiety and depression symptoms.

  • Improving patients expressed more positive valence in the first week of treatment.

  • Non-responders expressed more negative valence by the end of treatment.

  • Non-responders used more pain-related words throughout treatment.

  • Message delivery offers opportunities to increase psychotherapy access.

Funding

MM’s research was supported by the National Center for Advancing Translational Sciences (NCATS) and National Institutes of Health (NIH) through grants # 1K23MH134068 and # 2KL2TR001446-06A1, Talkspace, and by the American Foundation for Suicide Prevention through grant PRG-0-104-19. TDH’s research was supported by National Institutes of Health Awards # R44MH124334 and R01MH125179-01. NS received research grants from the NIH, DoD, American Foundation for Suicide Prevention, PCORI, and Vanda Pharmaceuticals; support from Cohen Veterans Network; personal fees from Praxis Therapeutics, Genomind, Bionomics Limited, Cerevel, and Engrail Therapeutics Inc; fees and royalties from Wiley, Wolters Kluwer, and American Psychiatric Association Publishing; and spousal stock from G1 Therapeutics and Zentalis (outside of submitted work). The funding sources had no involvement other than financial support. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

Acronyms:

GAD-7: Generalized Anxiety Disorder-7

NLP: Natural Language Processing

PHQ-9: Patient Health Questionnaire-9

SSRD: Somatic Symptom and Related Disorders

Footnotes

Conflict of interest statement

TDH is an employee of the platform that provided the data examined in this study. Talkspace had no role in the analysis, interpretation of the data, or decision to submit the manuscript for publication.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  1. Bathina KC, Ten Thij M, Lorenzo-Luaces L, Rutter LA, & Bollen J (2021). Individuals with depression express more distorted thinking on social media. Nature Human Behaviour, 5(4), 458–466.
  2. Burkhardt HA, Alexopoulos GS, Pullmann MD, Hull TD, Areán PA, & Cohen T (2021). Behavioral Activation and Depression Symptomatology: Longitudinal Assessment of Linguistic Indicators in Text-Based Therapy Sessions. Journal of Medical Internet Research, 23(7), e28244. 10.2196/28244
  3. Dessel N. van, Boeft M. den, Wouden J. C. van der, Kleinstäuber M, Leone SS, Terluin B, Numans ME, Horst H. E. van der, & Marwijk H. van. (2014). Non-pharmacological interventions for somatoform disorders and medically unexplained physical symptoms (MUPS) in adults. Cochrane Database of Systematic Reviews, 11. 10.1002/14651858.CD011142.pub2
  4. Hull TD, Malgaroli M, Connolly PS, Feuerstein S, & Simon NM (2020). Two-way messaging therapy for depression and anxiety: Longitudinal response trajectories. BMC Psychiatry, 20(1), 297. 10.1186/s12888-020-02721-x
  5. Hüsing P, Löwe B, & Toussaint A (2018). Comparing the diagnostic concepts of ICD-10 somatoform disorders and DSM-5 somatic symptom disorders in patients from a psychosomatic outpatient clinic. Journal of Psychosomatic Research, 113, 74–80. 10.1016/j.jpsychores.2018.08.001
  6. Hutto C, & Gilbert E (2014). VADER: A Parsimonious Rule-Based Model for Sentiment Analysis of Social Media Text. Proceedings of the International AAAI Conference on Web and Social Media, 8(1), Article 1.
  7. Katz J, Rosenbloom BN, & Fashler S (2015). Chronic Pain, Psychopathology, and DSM-5 Somatic Symptom Disorder. The Canadian Journal of Psychiatry, 60(4), 160–167. 10.1177/070674371506000402
  8. Kroenke K (2003). Patients presenting with somatic complaints: Epidemiology, psychiatric comorbidity and management. International Journal of Methods in Psychiatric Research, 12(1), 34–43. 10.1002/mpr.140
  9. Kroenke K, Spitzer RL, & Williams JBW (2001). The PHQ-9. Journal of General Internal Medicine, 16(9), 606–613. 10.1046/j.1525-1497.2001.016009606.x
  10. Liao S-C, Ma H-M, Lin Y-L, & Huang W-L (2019). Functioning and quality of life in patients with somatic symptom disorder: The association with comorbid depression. Comprehensive Psychiatry, 90, 88–94. 10.1016/j.comppsych.2019.02.004
  11. Lumley MA, Cohen JL, Borszcz GS, Cano A, Radcliffe AM, Porter LS, Schubiner H, & Keefe FJ (2011). Pain and emotion: A biopsychosocial review of recent research. Journal of Clinical Psychology, 67(9), 942–968. 10.1002/jclp.20816
  12. Mack S, Jacobi F, Gerschler A, Strehle J, Höfler M, Busch MA, Maske UE, Hapke U, Seiffert I, Gaebel W, Zielasek J, Maier W, & Wittchen H-U (2014). Self-reported utilization of mental health services in the adult German population - evidence for unmet needs? Results of the DEGS1-Mental Health Module (DEGS1-MH): Utilization of Mental Health Services in Germany. International Journal of Methods in Psychiatric Research, 23(3), 289–303. 10.1002/mpr.1438
  13. Malgaroli M, Hull TD, Zech JM, & Althoff T (2023). Natural language processing for mental health interventions: A systematic review and research framework. Translational Psychiatry, 13(1), Article 1. 10.1038/s41398-023-02592-2
  14. Mehltretter J, Rollins C, Benrimoh D, Fratila R, Perlman K, Israel S, Miresco M, Wakid M, & Turecki G (2020). Analysis of Features Selected by a Deep Learning Model for Differential Treatment Selection in Depression. Frontiers in Artificial Intelligence, 2, 31. 10.3389/frai.2019.00031
  15. Murray AM, Toussaint A, Althaus A, & Löwe B (2016). The challenge of diagnosing non-specific, functional, and somatoform disorders: A systematic review of barriers to diagnosis in primary care. Journal of Psychosomatic Research, 80, 1–10. 10.1016/j.jpsychores.2015.11.002
  16. Nook EC, Hull TD, Nock MK, & Somerville LH (2022). Linguistic measures of psychological distance track symptom levels and treatment outcomes in a large set of psychotherapy transcripts. Proceedings of the National Academy of Sciences, 119(13), e2114737119. 10.1073/pnas.2114737119
  17. Patel S, Akhtar A, Malins S, Wright N, Rowley E, Young E, Sampson S, & Morriss R (2020). The Acceptability and Usability of Digital Health Interventions for Adults With Depression, Anxiety, and Somatoform Disorders: Qualitative Systematic Review and Meta-Synthesis. Journal of Medical Internet Research, 22(7), e16228. 10.2196/16228
  18. Siegel CE, Laska EM, Lin Z, Xu M, Abu-Amara D, Jeffers MK, Qian M, Milton N, Flory JD, Hammamieh R, Daigle BJ, Gautam A, Dean KR, Reus VI, Wolkowitz OM, Mellon SH, Ressler KJ, Yehuda R, Wang K, … Marmar CR (2021). Utilization of machine learning for identifying symptom severity military-related PTSD subtypes and their biological correlates. Translational Psychiatry, 11(1), Article 1. 10.1038/s41398-021-01324-8
  19. Spitzer RL, Kroenke K, Williams JBW, & Löwe B (2006). A Brief Measure for Assessing Generalized Anxiety Disorder: The GAD-7. Archives of Internal Medicine, 166(10), 1092–1097. 10.1001/archinte.166.10.1092
