Proceedings of the National Academy of Sciences of the United States of America. 2022 Mar 22;119(13):e2114737119. doi: 10.1073/pnas.2114737119

Linguistic measures of psychological distance track symptom levels and treatment outcomes in a large set of psychotherapy transcripts

Erik C Nook a,1, Thomas D Hull b,c, Matthew K Nock a, Leah H Somerville a
PMCID: PMC9060508  PMID: 35316132

Significance

Using language to “distance” ourselves from distressing situations (i.e., by talking less about ourselves and the present moment) can help us manage emotions. Here, we translate this basic research to discover that such “linguistic distancing” is a replicable measure of mental health in a large set of therapy transcripts (N = 6,229). Additionally, clustering techniques showed that language alone could identify participants who differed on both symptom severity and treatment outcomes. These findings lay the foundation for 1) tools that can rapidly identify people in need of psychological services based on language alone and 2) linguistic interventions that can improve mental health.

Keywords: language, linguistic distance, treatment outcomes, internalizing symptoms, psychotherapy

Abstract

Using language to distance oneself from negative stimuli (e.g., by reducing use of the word “I” and present-tense verbs) is associated with effective emotion regulation. Given that internalizing disorders like anxiety and depression are characterized by maladaptive emotion regulation, stronger linguistic distance may be both a diagnostic marker of lower internalizing symptoms and a prognostic indicator of treatment progress. Here, we tested these hypotheses in a large corpus of naturalistic psychotherapeutic exchanges between clients and their therapists (>1.2 million messages from 6,229 clients). In both exploratory (n = 3,729) and validation (n = 2,500) datasets, we found that clients’ internalizing symptoms decreased over therapy, that client linguistic distance increased over therapy, and that internalizing symptoms tracked fluctuations in linguistic distance both within and between individuals. In other words, clients shifted from discussing themselves and the present moment to discussing other people and time points over treatment, and this psycholinguistic shift was related to symptom reductions. However, effect sizes for linguistic results were small, and we failed to find consistent evidence that linguistic distance statistically mediated changes in symptoms over time. Finally, clustering analyses revealed that data-driven groups of clients defined solely on the basis of their linguistic distance differed in both their symptom severity and treatment outcomes. Together, these findings provide replicable evidence that linguistic distance is a marker of internalizing symptom severity and treatment progress in real-world therapeutic interactions.


Psychopathology is both common and costly: Approximately 50% of Americans will experience a psychological disorder, and psychopathology accounts for 7% of the global burden of disease (1, 2). Scholars agree that the current mental healthcare system is insufficient for meeting this demand, due to a number of factors, including there being too few therapists, therapists being hard to reach, and most therapies having only moderate efficacy (3–5). As such, there is a dire need for tools that increase treatment accessibility and efficacy. Clinical scientists have called for technological innovations that could achieve these goals (6–9), leading to a wave of technology-assisted psychotherapies in which therapists treat clients via text messaging (10–12). These platforms increase the reach of any given therapist and can effectively treat internalizing disorders like anxiety and depression at a fraction of the cost of in-person treatment (11–14).

Although these technological innovations provide a vital step toward addressing the burden of global mental health, there remains a need for tools that can detect individuals who may need them, as well as techniques that improve the efficacy of existing treatments. Language represents a prime entry point for developing these detection and intervention tools at a large scale, given the facts that 1) verbal and digital conversation is ubiquitous in human society and 2) psychotherapeutic interventions are essentially a set of conversations. In this study, we leverage basic understanding of the relationships between language and emotion to demonstrate that a linguistic measure of healthy emotion regulation tracks psychological symptoms in a large set of psychotherapy transcripts.

Every day, people use words to identify and communicate about their emotional experiences (15–17). Difficulties with emotion regulation [i.e., the ways in which people modify or manage their emotional experiences (18, 19)] are robustly connected to psychopathology (20–22), and successful therapy operates through changing patients’ emotion regulation habits (23–26). Consequently, finding a linguistic signature of poor emotion regulation could measure levels of psychopathology and their remission over treatment at a large scale. Interestingly, a growing number of studies show that using language to increase psychological distance [i.e., “taking a step back” and seeing challenging situations as separated from oneself (27, 28)] can facilitate effective emotion regulation (29–33). Increasing distance along “social” and “temporal” dimensions by reducing use of first-person singular pronouns (e.g., “I”) and present-tense verbs (e.g., “feel”) both decreases the intensity of negative affect and predicts the success of emotion regulation (29, 30).

These studies provide compelling evidence that linguistic distance predicts emotion regulation success in the laboratory, and here we ask about its clinical relevance in real-world therapeutic conversations. Individuals who fail to gain distance from their negative emotions may struggle to effectively regulate these emotions, leading to symptoms of internalizing disorders like anxiety and depression. If so, linguistic markers of low psychological distance should track symptoms of psychopathology. Preliminary results align with this notion, as greater use of first-person singular pronouns (indicating less psychological distance along the social dimension) is associated with clinical problems like depression, anxiety, PTSD, and suicide (34–40). However, only a handful of studies have investigated this relationship within the context of treatment itself, and they have returned mixed results (41–44). Consequently, there is a need for a large, systematic test of whether linguistic distance is longitudinally associated with psychological symptoms in naturalistic exchanges during psychotherapy.

Here, we investigate relations between linguistic distance and psychotherapy outcomes in a large dataset (n = 6,229 participants) from a message-based psychotherapy service (Talkspace). Talkspace clients and their licensed therapists communicate primarily via text messaging, and clients complete self-report inventories of anxiety and depression symptoms every 3 wk. In this study, we examined a programmatic set of research questions that, together, test whether a client’s linguistic distance tracks levels of internalizing symptoms both between and within individuals and whether it might serve as a mediating mechanism of treatment outcomes. We initially conducted analyses in an exploratory subset of the data (n = 3,727). We then preregistered our hypothesized model and analytic plan (https://osf.io/r5gn2) and replicated all analyses in a holdout validation dataset (n = 2,500). Our analyses investigated 1) simple linear relations between time in treatment, symptoms, and linguistic distance; 2) whether increasing linguistic distance mediated reduced symptoms over time in treatment; and 3) whether clusters of clients defined purely on the trajectory of their linguistic distance over treatment differed in their symptom severity or treatment outcomes. Establishing replicable relations between linguistic distance and symptoms in such large-scale real-world data lays a foundation for research that can use language to both detect people at risk for psychopathology and enhance the efficacy of psychotherapy, ultimately reducing the global burden of psychopathology.

Results

Are Internalizing Symptoms, Linguistic Distance, and Time in Treatment Related?

Internalizing symptoms over time in treatment.

Mixed-effects regressions showed that internalizing symptoms fell over the course of treatment with a medium effect size in both the exploratory and validation datasets; βe = −0.42, pe < 0.001, R2βe = 0.37, βv = −0.43, pv < 0.001, and R2βv = 0.40 (Fig. 1). The subscripts “e” and “v” are used to indicate that statistics are from the exploratory and validation datasets, respectively.

Fig. 1.

Effect size plot for mixed-effects regressions depicting relations between internalizing symptoms, linguistic distance, and time in treatment within the exploratory (black) and validation (gray) datasets. None of the 95% CIs include zero, indicating significant associations. Ss = Subjects.

Linguistic distance over time.

Linguistic distance (i.e., clients’ use of verbs and pronouns that were distanced from themselves and the present moment) increased over time in therapy in both the exploratory and validation datasets; βe = 0.07, pe < 0.001, R2βe = 0.02, βv = 0.08, pv < 0.001, and R2βv = 0.02 (Fig. 1). A small effect size indicated that this was a subtle linguistic shift over the course of therapy.
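As a rough illustration of how such a score could be built from word counts, the sketch below averages a temporal component (share of tensed verbs that are not present tense) and a social component (share of pronouns that are not first-person singular). This scoring rule, the function names, and the equal weighting are assumptions made for illustration; the exact computation used in the study is not described in this section. The input values are the full-sample per-message means from Table 1, used only as example numbers.

```python
def proportion_distanced(distanced: float, non_distanced: float) -> float:
    """Share of the counted words in a category that are psychologically distanced."""
    total = distanced + non_distanced
    if total == 0:
        raise ValueError("no words counted in this category")
    return distanced / total


def linguistic_distance(present: float, past: float, future: float,
                        first_person_singular: float, other_pronouns: float) -> float:
    """Hypothetical composite: mean of temporal and social distance (both in 0..1)."""
    # Temporal distance: verbs referring to a time other than the present moment.
    temporal = proportion_distanced(past + future, present)
    # Social distance: pronouns referring to someone other than "I".
    social = proportion_distanced(other_pronouns, first_person_singular)
    return (temporal + social) / 2


# Example input only: per-message means from Table 1 (full sample).
score = linguistic_distance(present=8.73, past=3.87, future=0.82,
                            first_person_singular=8.13, other_pronouns=3.75)
print(round(score, 3))
```

Higher values mean more distanced language. A score computed from averaged counts, as here, is not the same as the average of per-message scores, so the printed value should not be read as a result of the study.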

Linguistic distance and internalizing symptoms.

We observed small but significant relationships between internalizing symptoms and linguistic distance in the exploratory and validation datasets; βe = −0.12, pe < 0.001, R2βe = 0.05, βv = −0.14, pv < 0.001, and R2βv = 0.06. As hypothesized, worse internalizing symptoms were associated with less distanced language. We next decomposed variance in linguistic distance into within-person and between-person components (Fig. 1). Mixed-effects models revealed that internalizing symptoms were significantly associated with both between-person, βe = −0.20, pe < 0.001, R2βe = 0.03, βv = −0.21, pv < 0.001, and R2βv = 0.04, and within-person, βe = −0.04, pe < 0.001, R2βe = 0.02, βv = −0.06, pv < 0.001, and R2βv = 0.005, variance in linguistic distance. Effects ranged from very small to small, and effects were larger for between-person than within-person relationships.
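One common way to separate within-person from between-person variance is person-mean centering: each client's average score carries the between-person signal, and each observation's deviation from that average carries the within-person signal. The sketch below assumes this approach; the section does not state which decomposition the authors used, and the client labels and scores are made-up toy values.

```python
from statistics import mean


def decompose(scores_by_client: dict[str, list[float]]):
    """Split repeated scores into between-person means and within-person deviations."""
    between = {}
    within = {}
    for client, scores in scores_by_client.items():
        person_mean = mean(scores)
        between[client] = person_mean                      # stable, trait-like level
        within[client] = [s - person_mean for s in scores]  # fluctuation around that level
    return between, within


# Toy data (hypothetical): linguistic distance at three assessment windows.
toy = {
    "client_a": [0.30, 0.35, 0.40],
    "client_b": [0.20, 0.20, 0.26],
}
between, within = decompose(toy)
print(between)
print(within)
```

Both components can then enter the same mixed-effects model as separate predictors, which is what allows the between-person and within-person slopes to differ.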

Does Linguistic Distance Mediate Symptom Reduction?

Mediation analyses implemented in a Bayesian framework produced mixed support for the notion that within-person increases in linguistic distance mediate decreased symptoms across time. The mediation model was not significant in the exploratory dataset (Fig. 2A), but it was significant in the validation dataset (Fig. 2B). The very small proportion mediated in the validation dataset (0.5%) indicates that the potential mediating role of within-person fluctuations in linguistic distance is extremely small.
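For readers unfamiliar with the "proportion mediated" statistic, the sketch below uses the standard product-of-coefficients definition: the indirect effect (a × b) divided by the total effect (indirect plus direct). The path values are hypothetical numbers chosen only to show how a proportion near 0.5% can arise; they are not estimates from the study, and the authors' Bayesian models yield full posterior distributions rather than single point values.

```python
def proportion_mediated(a: float, b: float, c_prime: float) -> float:
    """Indirect effect (a * b) as a share of the total effect (a * b + c_prime)."""
    indirect = a * b
    total = indirect + c_prime
    if total == 0:
        raise ValueError("total effect is zero; proportion mediated is undefined")
    return indirect / total


# Hypothetical standardized paths (NOT taken from the paper):
a = 0.05         # time in treatment -> linguistic distance
b = -0.04        # linguistic distance -> symptoms
c_prime = -0.40  # direct effect of time on symptoms

pm = proportion_mediated(a, b, c_prime)
print(f"{pm:.2%}")
```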

Fig. 2.

Bayesian mediation models testing whether within-person variance in client linguistic distance mediated changes in internalizing symptoms across time in the (A) exploratory and (B) validation datasets. The 95% CR for the indirect effect included zero for the exploratory but not the validation dataset, providing mixed evidence for the proposed mediation model. Median regression estimates are reported from Bayesian regression models, with their corresponding 95% CRs.

Can Symptoms Be Inferred from Linguistic Patterns Alone?

We used finite mixture regression analyses to cluster clients into groups that shared similar trajectories of linguistic distance over treatment. We then found that these groups—defined based on language alone—significantly differed in both treatment outcomes and symptom severity. We first used Akaike Information Criterion (AIC) values to establish that a four-cluster mixture regression solution provided the best fit to the data for both the exploratory dataset (AICe-4-cluster = 3,562,482; AICe-3-cluster = 3,567,742; AICe-2-cluster = 3,574,110; AICe-1-cluster = 3,593,658) and the validation dataset (AICv-4-cluster = 2,388,523; AICv-3-cluster = 2,392,075; AICv-2-cluster = 2,395,775; AICv-1-cluster = 2,411,255). Clusters were remarkably similar across the exploratory and validation datasets, even though they were defined completely independently (Fig. 3 A and E). Linguistic distance for clusters 1e and 1v started high and rose over therapy (although not significantly for the validation dataset); βe = 0.08, pe = 0.002, R2βe = 0.02, Ne = 569, βv = 0.06, pv = 0.139, R2βv = 0.01, and Nv = 270. Clusters 2e and 2v started slightly less high but rose strongly over therapy; βe = 0.13, pe < 0.001, R2βe = 0.05, Ne = 1,277, βv = 0.14, pv < 0.001, R2βv = 0.06, and Nv = 722. Clusters 3e and 3v started low and remained low over therapy; βe = 0.01, pe = 0.610, R2βe = 0.0004, Ne = 735, βv = 0.02, pv = 0.501, R2βv = 0.001, and Nv = 654. Finally, clusters 4e and 4v started low and rose over therapy; βe = 0.06, pe = 0.001, R2βe = 0.01, Ne = 1,146, βv = 0.09, pv < 0.001, R2βv = 0.02, and Nv = 854.*
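The model comparison above amounts to picking the candidate with the lowest AIC. The short sketch below applies that rule to the AIC values reported in this paragraph; the helper names are illustrative, and only the one- to four-cluster solutions listed in the text are compared.

```python
# AIC values as reported in the text, keyed by number of clusters.
aic = {
    "exploratory": {1: 3_593_658, 2: 3_574_110, 3: 3_567_742, 4: 3_562_482},
    "validation": {1: 2_411_255, 2: 2_395_775, 3: 2_392_075, 4: 2_388_523},
}


def best_k(aic_by_k: dict[int, int]) -> int:
    """Return the cluster count with the lowest AIC (lower means better fit)."""
    return min(aic_by_k, key=aic_by_k.get)


def delta_aic(aic_by_k: dict[int, int]) -> dict[int, int]:
    """AIC difference of each candidate from the best-fitting candidate."""
    best = min(aic_by_k.values())
    return {k: value - best for k, value in aic_by_k.items()}


for name, values in aic.items():
    print(name, "best k =", best_k(values), "delta AIC =", delta_aic(values))
```

In both datasets the four-cluster solution has the lowest AIC among the candidates listed, and the gain from each added cluster shrinks as the number of clusters grows.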

Fig. 3.

Results of finite mixture regressions clustering participants based on the linguistic distance in their texts over the course of therapy for the (A–D) exploratory and (E–H) validation datasets. (A and E) Four clusters were identified, which differed in starting and ending linguistic distance as well as the slope of changes in linguistic distance over the course of treatment. (B and F) Clusters differed significantly in baseline internalizing symptoms such that the clusters that started with higher linguistic distance (i.e., clusters 1 and 2) had lower symptoms at the start of treatment than those that started with lower linguistic distance (i.e., clusters 3 and 4). (C and G) Clusters differed significantly in final internalizing symptoms, with clusters 1 and 2 also reporting significantly fewer symptoms than clusters 3 and 4. (D and H) Estimated marginal means of changes in internalizing symptoms across clusters (accounting for baseline symptom levels). Cluster 2, which had both a high starting level of linguistic distance and the strongest increase over time, achieved the best treatment response, significantly stronger than clusters 3 and 4. ***P < 0.001, **P < 0.01, *P < 0.05.

A one-way ANOVA revealed that these clusters, defined solely based on trajectories of linguistic distance over time, differed significantly in baseline internalizing symptom scores in both the exploratory and validation datasets; Fe (3, 3,723) = 14.31, pe < 0.001, η2e = 0.01, Fv (3, 2,496) = 9.18, pv < 0.001, and η2v = 0.01 (Fig. 3 B and F). Pairwise comparisons of conditions revealed remarkably consistent results in both the validation and exploratory datasets. Post hoc Tukey comparisons revealed that baseline internalizing symptom levels were significantly lower for clusters 1 and 2 than clusters 3 and 4; pes < 0.001 and pvs < 0.05. Clusters 1 and 2 did not differ in their baseline symptom levels, and neither did clusters 3 and 4; pes > 0.175 and pvs > 0.641. As such, clusters of clients who started treatment with higher linguistic distance had lower internalizing symptoms at baseline.

Analyses of final internalizing symptom scores showed similar patterns. Clusters differed significantly in their final symptom levels in both the exploratory and validation datasets; Fe (3, 3,723) = 23.18, pe < 0.001, η2e = 0.02, Fv (3, 2,496) = 13.94, pv < 0.001, and η2v = 0.02 (Fig. 3 C and G). Tukey post hoc comparisons indicated that final internalizing symptom scores were significantly lower for clusters 1 and 2 compared to clusters 3 and 4; pes < 0.001 and pvs < 0.006. Again, clusters 1 and 2 did not differ from each other, and neither did clusters 3 and 4; pes > 0.848 and pvs > 0.848. As such, clusters of clients with higher linguistic distance had less severe internalizing symptoms.

Finally, one-way analyses of covariance (ANCOVAs) showed that these clusters also differed in how strongly their internalizing symptoms changed across therapy (i.e., final – baseline internalizing symptom scores, controlling for baseline scores); Fe (3, 3,722) = 5.49, pe < 0.001, ηp2e = 0.004, Fv (3, 2,495) = 2.85, pv = 0.036, and ηp2v = 0.003 (Fig. 3 D and H). Post hoc Tukey tests revealed that symptoms fell more strongly across treatment for cluster 2 than for clusters 3 and 4; pes < 0.001 and pvs < 0.02. As such, the cluster with the strongest increase in linguistic distance across treatment also had the greatest treatment response. In the validation dataset only, cluster 3v also showed significantly less treatment gain than cluster 1v, pv = 0.029, meaning that the cluster that did not increase in linguistic distance over treatment fared poorest. No other pairwise comparisons for change in internalizing symptoms reached significance; pes > 0.07 and pvs > 0.31.
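The effect sizes reported for these tests can be checked from the F statistics and their degrees of freedom, because for a single effect (partial) η² = F·df_effect / (F·df_effect + df_error). The sketch below applies this identity to the values reported above; the function and variable names are illustrative.

```python
def eta_squared_from_f(f: float, df_effect: int, df_error: int) -> float:
    """(Partial) eta squared implied by an F statistic and its degrees of freedom."""
    return (f * df_effect) / (f * df_effect + df_error)


# (label, F, df_effect, df_error, reported effect size, decimals reported)
reported = [
    ("baseline, exploratory",        14.31, 3, 3723, 0.01,  2),
    ("baseline, validation",          9.18, 3, 2496, 0.01,  2),
    ("final, exploratory",           23.18, 3, 3723, 0.02,  2),
    ("final, validation",            13.94, 3, 2496, 0.02,  2),
    ("change (ANCOVA), exploratory",  5.49, 3, 3722, 0.004, 3),
    ("change (ANCOVA), validation",   2.85, 3, 2495, 0.003, 3),
]

for label, f, df1, df2, stated, decimals in reported:
    implied = eta_squared_from_f(f, df1, df2)
    print(f"{label}: implied = {implied:.4f}, reported = {stated}")
```

Each implied value rounds to the effect size stated in the text, which confirms that cluster membership explains roughly 0.3% to 2% of the variance in these outcomes.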

As such, clustering participants based on trajectories of linguistic distance revealed replicable signatures of participants who differed in both their overall symptom severity and their treatment response. Analyses of temporal and social components of the linguistic distancing measure revealed that temporal distance clusters differed in treatment outcomes but not baseline symptom levels, whereas social distance clusters differed in chronic symptom levels but not treatment outcomes (SI Appendix).

Discussion

Given the immense burden of mental illness, there is a dire need for methods that can detect psychological symptoms and enhance current treatments at a large scale. Due to the central role of language in psychotherapy, we tested whether a linguistic marker of psychological distance could be used as an indicator of a client’s symptom severity throughout treatment. Using a large sample of psychotherapy transcripts, we found replicable evidence that linguistic distance indeed tracks internalizing symptoms at both within- and between-person levels and that clustering analyses reveal groups of participants who differ in both their symptom severity and treatment gains. These results support a theoretical model in which linguistic distance reflects healthy emotion regulation, making linguistic distance a tool for measuring mental health and treatment progress.

Both initial analyses of an exploratory dataset and preregistered analyses of a holdout validation dataset provided consistent evidence that linguistic distance increased over time and tracked internalizing symptoms. Although prior research has shown that using language to distance oneself from aversive stimuli is related to effective emotion regulation (29–33), here we demonstrate the translational impact of this basic finding by showing its relationship with psychological symptoms in a naturalistic and longitudinal dataset of psychotherapy transcripts. Not only does this result support a theoretical model in which distancing language facilitates healthy emotion regulation, it also lends evidence to theories that emphasize the transdiagnostic role of emotion regulation in psychopathology (20, 45, 46). Additionally, the discovery of a stable small relation to psychopathology in a large dataset clarifies mixed results obtained from prior studies, most of which used small samples (41–44). Furthermore, this study shows that the relationship between linguistic distance and internalizing symptoms exists at both between-person and within-person levels. This extends prior research that focused only on between-person relationships (34, 47), providing strong evidence that linguistic distance can serve as a diagnostic and prognostic indicator of symptom severity, even as symptom levels fluctuate over the course of treatment.

However, evidence for mediation (i.e., that increasing linguistic distance explained reduced symptoms across treatment) was inconsistent across exploratory and validation datasets. These inconsistencies suggest either that linguistic distance plays a very small mechanistic role in psychotherapy or that it merely reflects (rather than reduces) internalizing symptoms in therapy. Although this result runs contrary to hypotheses, it prompts future research that can decipher how linguistic distance tracks internalizing symptoms without serving a mediating role. Testing relations between language and symptoms at closer timescales or examining measures of a client’s actual emotion regulation or psychological distancing skills (rather than their linguistic correlates) could provide better tests of this underlying model.

Finally, clustering approaches provided replicable evidence that trajectories of linguistic distance can predict treatment outcomes and symptom severity. In particular, we found that starting levels and slopes of linguistic distance related to symptom severity and treatment response, respectively. These results supplement the regression results described above to demonstrate that client language is intimately linked to mental health and treatment response, allowing us to deduce clinically relevant diagnostic and prognostic information from linguistic data alone. These clustering methods pave the way for more sophisticated machine learning approaches that could provide accurate clinical predictions from a client’s linguistic data. Additionally, differences between the temporal and social distancing metrics presented in SI Appendix carry several interesting implications for psychotherapy. These analyses showed that temporal distance clusters differed significantly in their treatment outcomes but not baseline symptoms, whereas social distance clusters differed in internalizing symptom severity but not changes in symptoms over time. Consequently, linguistic measures of social distance (i.e., pronoun use) may provide a trait-like measure of overall internalizing dysfunction, whereas temporal distance may reflect within-person shifts in one’s retreating symptoms. Future research that parses temporal and social distance at both the linguistic and phenomenological levels (i.e., assessing clients’ experienced tendency to dilate their psychological focus away from themselves and/or the present moment) could shed further light on these hallmark symptoms of depression and anxiety, as well as the role of this process in successful treatment (see SI Appendix for further discussion).

A strength of the current study is its unprecedented scale and naturalism, made possible through an inclusive approach to analyses. Indeed, using unfiltered data reduced experimenter degrees of freedom and provided the most conservative test of our research questions. However, taking such an unconstrained approach means that substantial noise remains in the data. The decisions to 1) include every text message (even if they are extremely short or may not be related to therapeutic interventions; e.g., messages about scheduling), 2) include all participants (even those who provided very few text messages), and 3) average linguistic data over a 3-wk period to match the frequency of symptom measures could add noise and cloud accurate assessments of effect sizes. Developing principled inclusion criteria and filtering methods could improve effect size estimates of relationships between variables. Nonetheless, this study serves as a foundational litmus test of these relationships, and future studies using machine learning and natural language processing approaches could further refine effect size estimates.

Potentially because of this naturalistic approach, effect sizes for linguistic relationships were consistently small. This indicates that we observed subtle linguistic shifts over treatment and that linguistic interventions may only provide a small “nudge” when it comes to actual clinical impact. However, it is important to remember that linguistic distance was a byproduct, not a target of treatment, meaning that observing this effect in the context of an inclusive and naturalistic dataset provides strong support for the underlying theoretical model. Additionally, it’s possible that the higher level of noise at the text level and relatively small number of within-person symptom measurements (i.e., three to five) compared to the high number of subjects (i.e., thousands) could have rendered within-subject relations much weaker than between-person relations. Future research should use principles noted above to reduce noise and improve estimates of effect sizes, examine whether there are moderators that shift “for whom” these effects work, and increase the frequency of within-person symptom sampling to test whether within-person effects are actually larger than those estimated here. That said, there are reasons to value these small effects. Researchers have recently argued that celebrating small effects is key to developing a replicable psychological science (48), and even small effect sizes can have a large impact when they are employed on a large scale. For example, if 25% of 327 million Americans suffer from psychopathology in any given year, helping patients recover just 1 d faster will restore 82 million days of human productivity. As such, it would also be prudent to conduct cost–benefit analyses to quantify the actual impact of these interventions.
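The back-of-the-envelope figure in this paragraph can be reproduced directly. The sketch below uses only the numbers stated in the text (327 million people, 25% annual prevalence, 1 d saved per affected person); the variable names are illustrative.

```python
population = 327_000_000    # Americans, as stated in the text
annual_prevalence = 0.25    # share with psychopathology in a given year
days_saved_per_person = 1   # recovering "just 1 d faster"

affected = population * annual_prevalence
days_restored = affected * days_saved_per_person

print(f"{affected / 1e6:.2f} million people affected")
print(f"{days_restored / 1e6:.2f} million days restored")  # 81.75, i.e., about 82 million
```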

The current findings advance the field’s ability to detect mental health problems from language alone. To work toward deployable tools with real-world impact, future research should address a few key limitations of the current study. First, we propose a theoretical model in which linguistic distance reflects emotion regulation abilities, which increase across time and ultimately improve internalizing symptoms. However, the current study does not include measures of emotion regulation, leaving it unclear what linguistic distance represents in this study. It could, indeed, reflect improved emotion regulation, but it could also reflect myriad other constructs (e.g., avoidance or improved therapeutic alliance). Future studies that empirically evaluate how adaptive emotion regulation fits in the proposed model are needed. Second, because there was no control group in this study, we cannot infer that Talkspace conversations were causally involved in either clients’ decreased symptoms across time or their increased linguistic distance across time. Studies that utilize active control conditions are needed to determine causal relationships. These experiments could also 1) give a clearer sense of the downstream impacts of these relationships (e.g., if they can causally reduce symptoms in the long term) and 2) adjudicate between the directions of language–symptom relationships. The current study tests one direction (i.e., that language predicts and explains symptom changes), and, although this association emerges, mediations were inconsistent. Follow-up experiments that include measures at a fine timescale could compare this direction with its reverse (i.e., that symptoms predict and explain language changes).

Third, although we demonstrate an overall relationship between higher linguistic distance and reduced symptoms, it is possible that distancing is not always an adaptive strategy. Indeed, substantial data show that “experiential avoidance” (i.e., pushing away internal or external stressors; ostensibly increasing distance) is maladaptive, whereas mindfully attending to the present moment (ostensibly decreasing distance) is adaptive (49–52). Similarly, depression and anxiety are stereotypically seen as disorders in which people are overly focused on past losses or future threats, respectively. Why then would distancing from the present be helpful if habitually being “away from” the present moment is associated with psychopathology? One way to reconcile this apparent paradox is to consider that people suffering from depression and anxiety might not be distancing themselves from the present moment to think about the past and future; instead they are pulling these past and future moments into the present, seeing them with very low psychological distance, and acting as if they are currently happening (e.g., “I can’t believe I am such a failure” or “catastrophe is imminent”). As such, it is possible that learning to resist the avoidant strategies of worry and rumination requires taking a distanced perspective on that maladaptive habit and gaining skills to interrupt these processes. Mindful awareness is one strategy to do just that, as, even though it requires attending to the present moment, it also calls for viewing one’s thoughts as detached and separate from oneself (51, 53), a highly distanced perspective. These are initial attempts at resolving the puzzle of how a distanced perspective may facilitate psychological health, even though prior research establishes avoidance as unhelpful and present-focused mindful awareness as helpful. However, these possibilities require additional empirical investigation, a line of research that would benefit from incorporating emerging frameworks that emphasize the contextual nature of emotion regulation to parse when and in which contexts high distance is adaptive (54–56).

In conclusion, this study used a large dataset of therapeutic exchanges to show that the psychological distance encoded in one’s speech reflects one’s level of internalizing symptoms and can even track within-person changes in symptom severity across time in treatment. Although mixed results emerged for whether linguistic distance played a mediating role in treatment outcomes, the current study lends support to the theoretical model suggesting that linguistic distancing tracks both emotion regulation and mental health. Findings extend prior research, foster future research questions, and lay the foundation for future tools that can use language to both detect individuals suffering from psychopathology and guide interventions that reduce human suffering.

Methods

Participants.

This study included data from a random sample of 6,229 clients who utilized the digital psychotherapy service Talkspace (https://www.talkspace.com/) between 2016 and 2019. Given the longitudinal focus of this study, participants were only included if they had completed at least three symptom inventory questionnaires spanning at least 6 wk of treatment. For included clients, we downloaded 1) a fully deidentified record of all text message exchanges between the client and their Talkspace therapist, 2) their responses to measures of depression and anxiety, and 3) their self-reported demographics. Talkspace clients and therapists agreed to third parties conducting research on their data as a part of the terms of use (https://www.talkspace.com/public/terms). The Harvard University institutional review board designated the current study not human research (IRB18-1583), as the study utilized preexisting deidentified data for which consent to research was provided. The overall sample of 6,229 participants was randomly divided into an exploratory dataset (Ne = 3,729; 60%) and a validation dataset (Nv = 2,500; 40%), with analyses of the validation dataset only occurring after preregistering analyses and hypotheses (see https://osf.io/r5gn2). Participant demographics are displayed in Table 1.
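The inclusion rule and the random split described above can be sketched as follows. This is a minimal illustration under stated assumptions: assessment times are represented as day offsets from the first assessment, "at least 6 wk" is read as a span of 42 days or more, and the seed, function names, and integer client IDs are hypothetical. The authors' actual procedure and random seed are not given in the text.

```python
import random


def eligible(assessment_days: list[int],
             min_assessments: int = 3,
             min_span_days: int = 42) -> bool:
    """At least three symptom questionnaires spanning at least 6 wk of treatment."""
    return (len(assessment_days) >= min_assessments
            and max(assessment_days) - min(assessment_days) >= min_span_days)


def split_sample(client_ids, n_validation: int, seed: int = 0):
    """Randomly hold out n_validation clients; the remainder is exploratory."""
    ids = list(client_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    return ids[n_validation:], ids[:n_validation]


# Sample sizes from the text: 6,229 clients, 2,500 held out for validation.
exploratory, validation = split_sample(range(6229), n_validation=2500)
print(len(exploratory), len(validation))

print(eligible([0, 21, 42]))  # three assessments over 6 wk
print(eligible([0, 7, 14]))   # three assessments over only 2 wk
```

Fixing the validation set at 2,500 clients, rather than taking an exact 60% of the sample, reproduces the reported subsample sizes of 3,729 and 2,500.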

Table 1.

Sample and platform description

Full sample Exploratory subsample Validation subsample
Gender, No. (%)
 Female 4,742 (77.4) 2,857 (77.7) 1,885 (77.0)
 Male 1,306 (21.3) 766 (20.8) 540 (22.1)
 Transgender female 11 (0.2) 8 (0.2) 3 (0.1)
 Transgender male 11 (0.2) 5 (0.1) 6 (0.2)
 Gender queer 27 (0.4) 21 (0.6) 6 (0.2)
 Gender variant 6 (0.1) 3 (0.1) 3 (0.1)
 Other 20 (0.3) 16 (0.4) 4 (0.2)
 No response 106 53 53
Age, No. (%)*
 18–25 1,031 (22.2) 606 (21.8) 425 (22.8)
 26–35 2,536 (54.6) 1,517 (54.6) 1,019 (54.8)
 36–49 871 (18.8) 526 (18.9) 345 (18.5)
 50+ 203 (4.4) 131 (4.7) 72 (3.9)
 No response 1,588 949 639
Race, No. (%)
 Caucasian 1,172 (60.4) 698 (60.3) 474 (60.5)
 African American 284 (14.6) 179 (15.5) 105 (13.4)
 Asian 140 (7.2) 82 (7.1) 58 (7.4)
 Hispanic 120 (6.2) 66 (5.7) 54 (6.9)
 Native American 5 (0.3) 2 (0.2) 3 (0.4)
 Other 195 (10.1) 115 (9.9) 80 (10.2)
 Declined to identify 24 (1.2) 15 (1.3) 9 (1.1)
 No response 4,289 2,572 1,717
Education level, No. (%)
 Less than high school 28 (0.5) 14 (0.4) 14 (0.7)
 High school 808 (15.7) 477 (15.3) 331 (16.2)
 Associate’s degree 78 (1.5) 40 (1.3) 38 (1.9)
 Some college no degree 200 (3.9) 126 (4.0) 74 (3.6)
 Bachelor’s degree 3,683 (71.4) 2,238 (71.9) 1,445 (70.7)
 Master’s degree 260 (5.0) 163 (5.2) 97 (4.7)
 Professional degree 43 (0.8) 23 (0.7) 20 (1.0)
 Doctoral degree 56 (1.1) 31 (1.0) 25 (1.2)
 No response 1,073 617 456
Symptom measures, mean (SD)
 Baseline internalizing symptoms 22.21 (9.90) 21.99 (9.89) 22.54 (9.92)
 Final internalizing symptoms 15.16 (9.87) 14.95 (9.79) 15.48 (10.00)
 Baseline depression symptoms 11.03 (5.82) 10.93 (5.84) 11.18 (5.80)
 Final depression symptoms 7.56 (5.58) 7.44 (5.54) 7.73 (5.62)
 Baseline anxiety symptoms 11.18 (5.04) 11.06 (5.02) 11.36 (5.06)
 Final anxiety symptoms 7.60 (4.90) 7.50 (4.83) 7.75 (4.99)
Therapy and text qualities
 Text-only subscription, No. (%) 6,108 (98.1) 3,659 (98.1) 2,449 (98.0)
 No. of client messages 759,706 455,379 304,327
 Length of client messages (words), mean (SD) 80.81 (146.80) 81.61 (147.02) 79.61 (146.46)
 No. of therapist messages 461,911 273,208 188,703
 Length of therapist messages (words), mean (SD) 82.69 (107.30) 84.32 (108.00) 80.33 (106.23)
 Present-tense verbs per message, mean (SD) 8.73 (15.10) 8.81 (15.11) 8.61 (15.09)
 Past-tense verbs per message, mean (SD) 3.87 (9.34) 3.89 (9.34) 3.83 (9.34)
 Future-tense verbs per message, mean (SD) 0.82 (1.80) 0.83 (1.81) 0.81 (1.79)
 First-person singular pronouns per message, mean (SD) 8.13 (14.50) 8.17 (14.49) 8.06 (14.51)
 Other pronouns per message, mean (SD) 3.75 (9.38) 3.78 (9.38) 3.71 (9.38)
 Number of therapists, mean (SD) 1.21 (0.51) 1.21 (0.51) 1.22 (0.52)
Number of symptom measures, mean (SD) 3.45 (0.51) 3.45 (0.51) 3.45 (0.51)
 Exactly 3 measures, No. (%) 3,453 (55) 2,058 (55) 1,395 (56)
 Exactly 4 measures, No. (%) 2,741 (44) 1,651 (44) 1,090 (44)
 Exactly 5 measures, No. (%) 35 (1) 20 (1) 15 (1)
Days between start of therapy and final symptom measure, mean (SD) 63.93 (11.90) 63.94 (11.89) 63.91 (11.92)

Percentages ignore clients who did not respond to each demographic question.

*Participants selected from predefined age bins. Age is given in years.

Therapy Platform.

Talkspace is a digital mental health platform that provides session-based teletherapy, as well as asynchronous messaging therapy, from which these data were drawn. Potential clients register with the service and begin by describing their presenting complaint and treatment goals to a consultation therapist. This information enables the system to provide the client with three licensed National Committee for Quality Assurance credentialed therapist options. These recommendations are based on each therapist’s history with demographically and diagnostically similar clients. The chosen “primary therapist” then treats the client. Clients can purchase live phone and video sessions, but most clients select the messaging-only plan (98.1% in this dataset; Table 1).

Clients may send therapists messages whenever they wish using the HIPAA-compliant smartphone-based application or the Talkspace website. Therapists respond by messages during designated hours. Clients have the option to transfer to a different primary therapist, but most clients interacted with only one therapist (i.e., 82.5% in the full dataset; Table 1). Mean length of text messages in the full dataset was ∼80 words, providing substantial data for linguistic analysis (Table 1).

Symptom Assessments.

Procedure.

Symptom questionnaires were sent to clients via the messaging platform approximately every 3 wk over the course of therapy. The link to complete questionnaires expired only when the next set of questionnaires was sent (i.e., participants could complete questionnaires whenever they liked after receiving the link, up until the next questionnaire administration). The date on which participants completed the questionnaire was recorded. This date was transformed into a measure of their current time in therapy at that symptom measurement by computing the number of days between questionnaire completion and the start of therapy (i.e., the date of the first text message between the client and their primary therapist).

Depression symptoms.

Symptoms of depression were measured using the eight-item Personal Health Questionnaire (PHQ-8), a validated and widely used tool for assessing depressive symptoms (57). Participants rated how often over the last 2 wk they had been bothered by eight of the nine symptoms of major depressive disorder (i.e., anhedonia, low mood, sleep disturbance, fatigue, appetite disturbance, low self-esteem, concentration difficulties, and psychomotor agitation or slowing). Responses were made on a four-point scale (0 = not at all, 1 = several days, 2 = more than half the days, and 3 = nearly every day). Responses were summed to provide a measure of overall depression symptom severity, with scores ranging from 0 to 24. Unlike PHQ-9, PHQ-8 does not include an item assessing suicidal ideation. However, studies have shown that PHQ-8 and PHQ-9 provide equivalently sensitive and valid measures of depressive symptoms (58–60).

Anxiety symptoms.

Anxiety symptoms were assessed using the seven-item Generalized Anxiety Disorder Questionnaire (GAD-7) (61), a widely used and validated measure of anxiety symptoms. Participants rated how often over the last 2 wk they had been bothered by core symptoms of generalized anxiety disorder (i.e., feelings of anxiety, uncontrollable worrying, difficulty relaxing, restlessness, irritability, and fears of catastrophic outcomes). Responses were made on a four-point scale (0 = not at all, 1 = several days, 2 = more than half the days, 3 = nearly every day) and summed to provide a measure of overall anxiety symptom severity, with scores ranging from 0 to 21.

Data Processing.

Producing a combined measure of internalizing symptoms.

Preliminary analyses in the exploratory dataset revealed that scores on PHQ-8 and GAD-7 were strongly related to each other [within-person correlation using the statsBy function in the psych package (62): re = 0.70, pe < 0.001, rv = 0.69, and pv < 0.001]. We consequently collapsed these two measures into a single assessment of internalizing symptoms by summing the two scales together, as has been done in prior work (63). Nonetheless, preregistered supplementary analyses were conducted on depression and anxiety scores separately both to present these individual statistics and to show that results were largely equivalent across the two measures (SI Appendix, Table S2).
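The scoring described above can be illustrated with a minimal Python sketch (the authors worked in R; this is not their code). The function names and the example responses are hypothetical. The sketch only assumes what the text states: eight PHQ-8 items and seven GAD-7 items, each rated 0 to 3, summed within each scale and then across scales.

```python
from typing import Sequence

PHQ8_ITEMS = 8  # each item scored 0-3, so totals range from 0 to 24
GAD7_ITEMS = 7  # each item scored 0-3, so totals range from 0 to 21


def sum_score(responses: Sequence[int], n_items: int) -> int:
    """Sum item responses after checking the item count and the 0-3 range."""
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} responses, got {len(responses)}")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("responses must be integers from 0 to 3")
    return sum(responses)


def internalizing_score(phq8: Sequence[int], gad7: Sequence[int]) -> int:
    """Combined internalizing score: PHQ-8 total plus GAD-7 total (0-45)."""
    return sum_score(phq8, PHQ8_ITEMS) + sum_score(gad7, GAD7_ITEMS)


# Hypothetical responses for one client at one assessment (not study data)
phq8_example = [1, 2, 1, 3, 0, 1, 2, 0]  # sums to 10
gad7_example = [2, 2, 1, 1, 0, 3, 1]     # sums to 10
print(internalizing_score(phq8_example, gad7_example))  # 20
```

Because the two scales have different maxima (24 and 21), the summed score ranges from 0 to 45 and weights depression items slightly more heavily in aggregate.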

Text processing.

We developed code in R Version 4.0.4 (64) to extract individual text messages from Talkspace text records. Text messages were extracted with their corresponding date and time of delivery, as well as the author of the text (i.e., client or therapist). Linguistic distance for each text was first computed following prior work (29, 30, 65). However, analyses of the exploratory dataset suggested that overall pronoun and verb use increased over the course of treatment (presumably due to changes in topics of conversation). These overall shifts across time made this measure unsuited to the current study (SI Appendix). We consequently developed more-precise measures of linguistic distance that controlled for overall shifts in verb and pronoun use over the course of treatment.

Linguistic Inquiry and Word Count (LIWC) (66) software was used to compute the percentage of words that were verbs (divided into past, present, and future tense) and pronouns (divided into first-person singular, first-person plural, second person, third-person singular, and third-person plural). We computed a temporal distance score for each text message by computing the proportion of verbs that were not in the present tense [i.e., (past + future)/(past + future + present)]. Similarly, we computed a social distance score for each text message by computing the proportion of pronouns that were not first-person singular [i.e., (second person + first-person plural + third-person singular + third-person plural)/(second person + first-person plural + third-person singular + third-person plural + first-person singular)]. Temporal distance scores were treated as missing for text messages that included no verbs (7.1% of client text messages in the exploratory dataset and 6.9% in the validation dataset), and social distance scores were treated as missing for text messages that included no pronouns (9.4% of client text messages in the exploratory dataset and 9.1% in the validation dataset).

We then averaged these two measures at the text level into a single combined linguistic distance score (11.2% of messages in the exploratory dataset and 10.9% in the validation dataset were unusable because they included no pronouns or verbs). This revised measure of linguistic distance 1) captures the relative focus on temporal and social targets that are distanced from the present moment and 2) accounts for overall differences in verb and pronoun use across treatment. Analyses of social and temporal distance as separate metrics are presented in SI Appendix.
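The two ratios and their average can be sketched in Python as follows. This is an illustration rather than the authors' R code, and all function and argument names are hypothetical. Inputs may be either raw counts or LIWC percentages, since the ratios are identical when every category shares the same word-count denominator. One assumption is labeled in the code: the combined score is treated as missing whenever either component is missing.

```python
from typing import Optional


def temporal_distance(past: float, present: float, future: float) -> Optional[float]:
    """Proportion of verbs not in the present tense; None if no verbs."""
    total = past + present + future
    if total == 0:
        return None  # message contained no verbs: score is missing
    return (past + future) / total


def social_distance(first_sing: float, first_plur: float, second: float,
                    third_sing: float, third_plur: float) -> Optional[float]:
    """Proportion of pronouns not first-person singular; None if no pronouns."""
    others = first_plur + second + third_sing + third_plur
    total = others + first_sing
    if total == 0:
        return None  # message contained no pronouns: score is missing
    return others / total


def linguistic_distance(temporal: Optional[float],
                        social: Optional[float]) -> Optional[float]:
    """Average of the two components.

    Assumption (not confirmed by the text): the combined score is missing
    whenever either component is missing.
    """
    if temporal is None or social is None:
        return None
    return (temporal + social) / 2


# Hypothetical message: 3 past, 9 present, 0 future verbs; 6 "I"-type
# pronouns, 1 "we"-type, 1 "you"-type, no third-person pronouns.
t = temporal_distance(past=3, present=9, future=0)  # 3 / 12 = 0.25
s = social_distance(6, 1, 1, 0, 0)                  # 2 / 8 = 0.25
print(linguistic_distance(t, s))                    # 0.25
```

Normalizing by total verbs and total pronouns, rather than by total words, is what keeps the score insensitive to overall shifts in verb and pronoun use across treatment.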

Aligning text and questionnaire data.

Text data were collected at a more granular timescale (i.e., minutes, hours, or days) compared to symptom measures (i.e., every 3 wk). We computed the mean linguistic distance in users’ text messages within the ∼3-wk periods between symptom assessments and aligned these averages with the symptom assessments completed at the end of each of these observation periods. We quantified time (i.e., days in therapy) by computing the number of days between the date questionnaires were completed and the beginning of therapy (i.e., the date of the first text message between the client and the primary therapist). This resulted in a dataset comprising baseline symptom measures (at time = 0), symptom measures at each subsequent symptom measurement point, and the mean linguistic distance of client text messages sent before each of these symptom measurements, all nested within participants.
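A minimal Python sketch of this alignment step is given below, with hypothetical names and invented example data. Two details are assumptions rather than facts from the text: each observation window is taken to run from just after the previous assessment date up to and including the current one, and the baseline assessment is given no linguistic score because no observation period precedes it.

```python
from datetime import date
from statistics import mean
from typing import Dict, List, Optional, Tuple

Message = Tuple[date, Optional[float]]  # (date sent, linguistic distance)
Assessment = Tuple[date, int]           # (date completed, symptom score)


def align_text_to_assessments(messages: List[Message],
                              assessments: List[Assessment],
                              therapy_start: date) -> List[Dict]:
    """Pair each symptom assessment with the mean linguistic distance of
    the messages sent during the observation period that precedes it."""
    rows = []
    previous_date = None
    for assessed_on, symptoms in sorted(assessments):
        if previous_date is None:
            scores = []  # assumption: baseline has no preceding window
        else:
            # assumption: window is (previous assessment, current assessment]
            scores = [d for sent_on, d in messages
                      if d is not None and previous_date < sent_on <= assessed_on]
        rows.append({
            "days_in_therapy": (assessed_on - therapy_start).days,
            "symptoms": symptoms,
            "mean_distance": mean(scores) if scores else None,
        })
        previous_date = assessed_on
    return rows


# Invented example for one client (not study data)
start = date(2018, 1, 1)
assessments = [(date(2018, 1, 1), 24), (date(2018, 1, 22), 19),
               (date(2018, 2, 12), 15)]
messages = [(date(2018, 1, 3), 0.25), (date(2018, 1, 10), None),
            (date(2018, 1, 20), 0.5), (date(2018, 2, 1), 0.5),
            (date(2018, 2, 10), 0.75)]
rows = align_text_to_assessments(messages, assessments, start)
for row in rows:
    print(row)
```

Messages with a missing score (no verbs or no pronouns) are skipped when averaging, so a window with no usable messages yields a missing mean rather than zero.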

Consideration of exclusion criteria.

It is worth noting that we adopted an inclusive approach to analyzing this real-world dataset. Although criteria could have been developed to exclude participants (e.g., minimum number of text messages, minimum initial symptom severity, or type of subscription) or text messages (e.g., minimum word count), we refrained from imposing experimenter-defined cutoffs as much as possible. Given the novelty of this naturalistic analysis, we chose to take an inclusive approach to provide unbiased insight into research questions, but the presence of unfiltered noise should be noted when interpreting results.

Research Questions, Analyses, and Hypotheses.

Are internalizing symptoms, linguistic distance, and time in treatment related?

We first tested the “arms” of a mediation model in which linguistic distance mediates reductions in internalizing symptoms over time in treatment. This involved using mixed-effect models to test for linear relationships between 1) days in treatment and internalizing symptoms, 2) days in treatment and linguistic distance, and 3) linguistic distance and internalizing symptoms. We hypothesized that 1) time in therapy would be negatively related to symptoms, 2) linguistic distance would be positively related to time in therapy, and 3) linguistic distance would be negatively related to internalizing symptoms at both within-person and between-person levels of analysis.

For this third relationship, it was important to decompose measures of linguistic distance into within-person and between-person components within this longitudinal design (67, 68). This is because a relationship between linguistic distance and internalizing symptoms could emerge in mixed-effect models either 1) because, as people increase their linguistic distance, their symptoms reduce (a within-person relationship) or 2) because individuals who, overall, have higher linguistic distance have lower symptoms than individuals who, overall, have lower linguistic distance (a between-person relationship). We consequently followed prior work in decomposing linguistic distance into within-person and between-person components and used these components in mixed-effects regressions (67–72). A variable representing the within-person fluctuation in linguistic distance was created by subtracting each individual’s mean linguistic distance score from the score of each of their observations, producing a variable representing within-person deviation, centered around their individual mean. Then, a variable representing between-person variance in linguistic distance was constructed by subtracting the overall group mean of linguistic distance from that participant’s average linguistic distance. This produced a variable that was constant for each participant and represented how their mean level deviated from the group’s mean. These within-person and between-person variables were entered simultaneously in mixed-effects models testing relations between linguistic distance and internalizing symptoms.
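The decomposition can be sketched in Python as follows. Names and example values are hypothetical, and the authors' own implementation was in R. One assumption is labeled in the code: the "overall group mean" is computed across all observations, although a mean of person means would be an equally plausible reading (the two coincide when every participant contributes the same number of observations, as in this example).

```python
from collections import defaultdict
from statistics import mean
from typing import List, Tuple


def decompose(observations: List[Tuple[str, float]]) -> List[Tuple[str, float, float]]:
    """Split each observation into within- and between-person components.

    within  = observation minus that person's own mean
    between = that person's mean minus the overall group mean
    """
    by_person = defaultdict(list)
    for pid, value in observations:
        by_person[pid].append(value)
    person_means = {pid: mean(values) for pid, values in by_person.items()}
    # Assumption: the group mean is taken over all observations.
    grand_mean = mean([value for _, value in observations])
    return [(pid, value - person_means[pid], person_means[pid] - grand_mean)
            for pid, value in observations]


# Invented linguistic distance scores for two clients (not study data)
obs = [("A", 0.25), ("A", 0.5), ("A", 0.75),
       ("B", 0.0), ("B", 0.25), ("B", 0.5)]
components = decompose(obs)
for pid, within, between in components:
    print(pid, within, between)
```

In this example client A has a person mean of 0.5 and client B of 0.25, against a group mean of 0.375, so the between-person values are 0.125 and -0.125, while the within-person values sum to zero for each client.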

Does linguistic distance mediate symptom reduction?

We next conducted mediation analyses to formally test whether increasing linguistic distance over the course of treatment mediated symptom changes, using measures of linguistic distance that had been decomposed into their within-person and between-person components. Typically, mediation analyses utilize bootstrapping methods (i.e., randomly sampling from the original dataset with replacement thousands of times) to generate many samples from which a confidence window can be constructed to test the significance of the mediation model (73). However, the appropriate method for bootstrapping multilevel data is not clear, as random samples can be drawn at the participant level, at the observation level, or at both participant and observation levels. To sidestep this issue, we used Bayesian analytic procedures, which do not involve bootstrapping methods, for our mediation analyses (70).

Like the mixed-effects models described above, Bayesian regression models included a random effect of participant to account for the multilevel nature of the dataset. To provide relatively unbiased starting points for Bayesian analyses, we supplied weakly informative priors (Gaussian distribution of M = 0, SD = 10) for all regressors in the models. Bayesian analyses were implemented using the Stan language in R (74). Two Markov chains used the Monte Carlo No U-Turn Sampler (75) to approximate the posterior distribution of each regressor across 12,500 iterations, with the first 2,500 iterations discarded as burn-in. The indirect effect (i.e., the a × b pathway for the within-person parameter) and proportion mediated (i.e., indirect effect/[indirect effect + direct effect] × 100) were computed for each mediation model. A significant mediation was determined when the 95% credible range (CR) of posterior density for the indirect effect did not include zero. We hypothesized that within-person increases in linguistic distance would mediate decreased symptoms across therapy.
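The summary quantities defined above can be sketched in Python, given posterior draws for the a path, the b path, and the direct effect. This does not fit any model (the authors used Stan via R), and the draws in the usage example are synthetic values generated purely for illustration, not study results. Function names are hypothetical. Two assumptions are labeled in the code: the credible range is computed as an equal-tailed 95% interval, and the proportion mediated is computed from posterior means.

```python
import random
from statistics import mean
from typing import Dict, List


def percentile(sorted_values: List[float], q: float) -> float:
    """Linear-interpolation percentile of pre-sorted values, q in [0, 1]."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def summarize_mediation(a_draws: List[float], b_draws: List[float],
                        direct_draws: List[float]) -> Dict:
    """Indirect effect (a x b per draw), its 95% interval, and the
    proportion mediated: indirect / (indirect + direct) x 100."""
    indirect = sorted(a * b for a, b in zip(a_draws, b_draws))
    # Assumption: equal-tailed interval (the text does not specify the type).
    lower, upper = percentile(indirect, 0.025), percentile(indirect, 0.975)
    # Assumption: proportion mediated is computed from posterior means.
    indirect_mean, direct_mean = mean(indirect), mean(direct_draws)
    return {
        "indirect": indirect_mean,
        "interval": (lower, upper),
        "excludes_zero": lower > 0 or upper < 0,
        "proportion_mediated": indirect_mean / (indirect_mean + direct_mean) * 100,
    }


# Synthetic draws for illustration only: 2 chains x 10,000 retained iterations
rng = random.Random(0)
n_draws = 2 * (12_500 - 2_500)
a = [rng.gauss(0.5, 0.1) for _ in range(n_draws)]
b = [rng.gauss(-0.4, 0.1) for _ in range(n_draws)]
direct = [rng.gauss(-1.0, 0.2) for _ in range(n_draws)]
print(summarize_mediation(a, b, direct))
```

Multiplying a and b draw by draw propagates posterior uncertainty in both paths into the indirect effect, which is why no bootstrapping step is needed.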

Can symptoms be inferred from linguistic patterns alone?

Finally, we used clustering approaches to supplement the regression models utilized above. One limitation of regressions is the extent of aggregation that is required to align text and questionnaire data, resulting in loss of information and introduction of noise. As such, we utilized finite mixture regression techniques (76), which analyze data at the text level. In essence, mixture regressions identify clusters of individuals based on similarities of joint distributions among variables. This means that participants who tend to have the same relationship between two variables are grouped together. In this case, we used mixture regressions to cluster individuals based on how their linguistic distance in individual text messages varied across time (e.g., grouping clients whose linguistic distance increased over time into one cluster and grouping clients whose linguistic distance decreased over time into a different cluster). This allowed us to test whether text data could be used to draw inferences about clients’ symptoms and treatment outcomes.

Mixture regression models were conducted on a dataset that included the linguistic distance score for every text message that clients sent to their primary therapist over the course of therapy. For added precision, time in therapy was quantified as a decimal value that included the proportion of a day that had passed since the first text sent between the client and the therapist. We conducted mixture regression analyses that grouped participants into one, two, three, and four clusters, and we then selected the number of clusters that provided the best fit, as determined by AIC. For additional stability and model fit, mixture regressions for each cluster size were implemented 10 times (to account for subtle differences that can emerge depending on random starting points of the clustering algorithm), and the best-fitting model was selected. Mixture regressions included a random effect of subject to account for nesting of text messages within subjects. We then conducted analyses in the aggregated dataset described above (i.e., in which linguistic data were averaged to match the timeline of symptom inventories) to test how each measure of linguistic distance varied across time in each cluster (using mixed-effects models), as well as how clusters differed in baseline and final internalizing symptoms (using ANOVAs and post hoc Tukey tests), and how they differed in their change in internalizing symptom scores [i.e., analyzing final – baseline internalizing symptoms change scores using ANCOVAs to control for baseline symptom levels (77, 78)].
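The model selection step (fit each cluster count several times, then keep the fit with the lowest AIC) can be sketched in Python as follows. The sketch does not fit mixture regressions, which the authors did with the flexmix package in R. It only shows the selection logic, using AIC = 2k − 2 ln L, where k is the number of estimated parameters and ln L the log-likelihood. The `Fit` record, the function names, and every number in the example are invented for illustration.

```python
from typing import List, NamedTuple


class Fit(NamedTuple):
    """Hypothetical record of one fitted mixture regression."""
    n_clusters: int
    log_likelihood: float
    n_params: int


def aic(fit: Fit) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * fit.n_params - 2 * fit.log_likelihood


def select_best(fits: List[Fit]) -> Fit:
    """Return the fit with the lowest AIC across all cluster counts and
    all random restarts."""
    return min(fits, key=aic)


# Invented fits (not study results): two restarts shown for two clusters
fits = [
    Fit(n_clusters=1, log_likelihood=-5200.0, n_params=3),
    Fit(n_clusters=2, log_likelihood=-5100.0, n_params=7),
    Fit(n_clusters=2, log_likelihood=-5090.0, n_params=7),
    Fit(n_clusters=3, log_likelihood=-5085.0, n_params=11),
    Fit(n_clusters=4, log_likelihood=-5084.0, n_params=15),
]
best = select_best(fits)
print(best.n_clusters, aic(best))  # 3 10192.0
```

Restarts with the same cluster count share the same number of parameters, so within a cluster count the lowest AIC is simply the highest log-likelihood; the penalty term matters only when comparing across cluster counts.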

Additional preregistered analyses.

All analyses were initially only conducted in the exploratory dataset of 3,729 participants, and analyses of the 2,500 participants in the validation dataset occurred following preregistration. Note that we preregistered analyzing social and temporal components of the linguistic distancing measure separately. In the revision process, we decided to combine these into a single measure. Results and conclusions are largely the same when each component is analyzed separately, and all preregistered analyses are provided in SI Appendix, including separate analyses of depression and anxiety symptoms. We also preregistered an additional set of analyses related to the role of therapist linguistic distancing in treatment outcomes. Because the current paper focuses on client language, we have reserved analyses of therapist language for a subsequent report focused on interpersonal processes in therapy.

Model building.

Mixed-effects models all included a random intercept for subject. We followed conventional model-building steps to test whether adding random slopes improved model fit (as determined by a lower AIC and a significant model comparison). For both the exploratory and validation datasets, these steps consistently revealed that adding a random slope for time in therapy significantly improved model fit. Hence, a random slope of time in therapy was added to all models that included this variable as a fixed effect. Random slopes for models without time as a predictor (e.g., relating client temporal distance and internalizing symptoms) were included when doing so improved model fit. Linear mixed-effect model regression estimates are reported in standardized units (i.e., β). Note that there are several methods for computing standardized βs in mixed-effects models, and here coefficients are standardized at their relevant “level” (i.e., in relation to within-person or between-person variance) using the “pseudo” option of the “standardize_parameters” function in the effectsize package (79). We characterize effect sizes according to conventions for correlation coefficients (i.e., ∼0.1 = small, ∼0.3 = medium, and ∼0.5 = large) (80). To provide an additional estimate of effect sizes in mixed-effects models, we report the proportion of variance explained by each predictor (i.e., semipartial R²β) following the conventions described by Edwards and coworkers (81, 82) and using Satterthwaite estimation of degrees of freedom. Regression estimates for Bayesian mediation models are reported in their raw unstandardized form (i.e., b), but we report the proportion mediated as the key effect size for each mediation model. We use eta squared (i.e., η²) to report the effect size of one-way ANOVAs and use partial eta squared (i.e., ηp²) for ANCOVAs that control for baseline symptoms.

Software.

LIWC 2007 (66) was used to extract word class frequencies from text messages. Mixed-effects models were conducted in lme4 (83), with P values calculated using the lmerTest package (84). Standardized betas of linear mixed-effects models were extracted using the effectsize package (79). Bayesian analyses were conducted using the brms package (85, 86). Mixture regressions were conducted in the flexmix package (76).

Acknowledgments

We thank James Gross, Ethan Kross, and the Cambridge Writing Group for helpful comments. We thank Harvard University’s Institute for Quantitative Social Science for guidance on statistical analyses. This work was supported by an NSF Graduate Research Fellowship to E.C.N. (DGE1144152).

Footnotes

Competing interest statement: T.D.H. is an employee of Talkspace. To eliminate the impact of this conflict of interest on the study, T.D.H. did not contribute to decisions concerning data analysis or reporting of results. M.K.N. receives publication royalties from Macmillan, Pearson, and UpToDate. M.K.N. has been a paid consultant in the past year for Microsoft Corporation, the Veterans Health Administration, and Cerebral, and for a legal case regarding a death by suicide. M.K.N. is an unpaid scientific advisor for Empatica, Koko, and TalkLife. These roles are not perceived as creating conflicts of interests but are reported for transparency.

This article is a PNAS Direct Submission.

*Two participants from the exploratory dataset were not included in temporal distance or social distance clusters because they had no text messages that included verbs or pronouns. These participants sent very few text messages (i.e., 13 or fewer).

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2114737119/-/DCSupplemental.

Data Availability

Markdowns documenting analytic code and results of all analyses have been deposited in the Open Science Framework (https://osf.io/u98r3/). Some study data are available. (Due to the sensitive and proprietary nature of Talkspace data, they cannot be made publicly available. However, Talkspace data can be made available upon completion of a Data Use Agreement and data security review with Talkspace.)

References

  • 1.J. Rehm, K.D. Shield, Global burden of disease and the impact of mental and addictive disorders. Curr. Psychiatry Rep. 21, 10 (2019). [DOI] [PubMed]
  • 2.Kessler R. C., et al. , The global burden of mental disorders: An update from the WHO World Mental Health (WMH) surveys. Epidemiol. Psychiatr. Sci. 18, 23–33 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Merwin E., Hinton I., Dembling B., Stern S., Shortages of rural mental health professionals. Arch. Psychiatr. Nurs. 17, 42–51 (2003). [DOI] [PubMed] [Google Scholar]
  • 4.Andrews G., Sanderson K., Slade T., Issakidis C., Why does the burden of disease persist? Relating the burden of anxiety and depression to effectiveness of treatment. Bull. World Health Organ. 78, 446–454 (2000). [PMC free article] [PubMed] [Google Scholar]
  • 5.Chisholm D., et al. ; Lancet Global Mental Health Group, Scale up services for mental disorders: A call for action. Lancet 370, 1241–1252 (2007). [DOI] [PubMed] [Google Scholar]
  • 6.Kazdin A. E., Rabbitt S. M., Novel models for delivering mental health services and reducing the burdens of mental illness. Clin. Psychol. Sci. 1, 170–191 (2013). [Google Scholar]
  • 7.Kazdin A. E., Blase S. L., Rebooting psychotherapy research and practice to reduce the burden of mental illness. Perspect. Psychol. Sci. 6, 21–37 (2011). [DOI] [PubMed] [Google Scholar]
  • 8.Perle J. G., Nierenberg B., How psychological telehealth can alleviate society’s mental health burden: A literature review. J. Technol. Hum. Serv. 31, 22–41 (2013). [Google Scholar]
  • 9.Holmes E. A., et al. , The Lancet Psychiatry Commission on psychological treatments research in tomorrow’s science. Lancet Psychiatry 5, 237–286 (2018). [DOI] [PubMed] [Google Scholar]
  • 10.Rathbone A. L., Prescott J., The use of mobile apps and SMS messaging as physical and mental health interventions: Systematic review. J. Med. Internet Res. 19, e295 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Hull T. D., Mahan K., A study of asynchronous mobile-enabled SMS text psychotherapy. Telemed e-Health 23, 240–247 (2017). [DOI] [PubMed] [Google Scholar]
  • 12.DellaCrosse M., Mahan K., Hull T. D., The effect of messaging therapy for depression and anxiety on employee productivity. J. Technol. Behav. Sci. 4, 1–5 (2019). [Google Scholar]
  • 13.Andrews G., Cuijpers P., Craske M. G., McEvoy P., Titov N., Computer therapy for the anxiety and depressive disorders is effective, acceptable and practical health care: A meta-analysis. PLoS One 5, e13196 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Hull T. D., Malgaroli M., Connolly P. S., Feuerstein S., Simon N. M., Two-way messaging therapy for depression and anxiety: longitudinal response trajectories. BMC Psychiatry 20, 297 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Brans K., Van Mechelen I., Rimé B., Verduyn P., To share, or not to share? Examining the emotional consequences of social sharing in the case of anger and sadness. Emotion 14, 1062–1071 (2014). [DOI] [PubMed] [Google Scholar]
  • 16.Rimé B., Emotion elicits the social sharing of emotion: Theory and empirical review. Emot. Rev. 1, 60–85 (2009). [Google Scholar]
  • 17.Rimé B., Bouchat P., Paquot L., Giglio L., Intrapersonal, interpersonal, and social outcomes of the social sharing of emotion. Curr. Opin. Psychol. 31, 127–134 (2020). [DOI] [PubMed] [Google Scholar]
  • 18.Gross J. J., The emerging field of emotion regulation: An integrative review. Rev. Gen. Psychol. 2, 271–299 (1998). [Google Scholar]
  • 19.Gross J. J., Emotion regulation: Current status and future prospects. Psychol. Inq. 26, 1–26 (2015). [Google Scholar]
  • 20.Aldao A., Nolen-Hoeksema S., Schweizer S., Emotion-regulation strategies across psychopathology: A meta-analytic review. Clin. Psychol. Rev. 30, 217–237 (2010). [DOI] [PubMed] [Google Scholar]
  • 21.Fernandez K. C., Jazaieri H., Gross J. J., Emotion regulation: A transdiagnostic perspective on a new RDoC domain. Cognit. Ther. Res. 40, 426–440 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Gross J. J., Jazaieri H., Emotion, emotion regulation, and psychopathology: An affective science perspective. Clin. Psychol. Sci. 2, 387–401 (2014). [Google Scholar]
  • 23.Radkovsky A., McArdle J. J., Bockting C. L. H., Berking M., Successful emotion regulation skills application predicts subsequent reduction of symptom severity during treatment of major depressive disorder. J. Consult. Clin. Psychol. 82, 248–262 (2014). [DOI] [PubMed] [Google Scholar]
  • 24.Berking M., et al. , Emotion-regulation skills as a treatment target in psychotherapy. Behav. Res. Ther. 46, 1230–1237 (2008). [DOI] [PubMed] [Google Scholar]
  • 25.Berking M., Ebert D., Cuijpers P., Hofmann S. G., Emotion regulation skills training enhances the efficacy of inpatient cognitive behavioral therapy for major depressive disorder: A randomized controlled trial. Psychother. Psychosom. 82, 234–245 (2013). [DOI] [PubMed] [Google Scholar]
  • 26.Daros A. R., et al. , A meta-analysis of emotional regulation outcomes in psychological interventions for youth with depression and anxiety. Nat. Hum. Behav. 5, 1443–1457 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Parkinson C., Liu S., Wheatley T., A common cortical metric for spatial, temporal, and social distance. J. Neurosci. 34, 1979–1987 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Trope Y., Liberman N., Construal-level theory of psychological distance. Psychol. Rev. 117, 440–463 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Nook E. C., Schleider J. L., Somerville L. H., A linguistic signature of psychological distancing in emotion regulation. J. Exp. Psychol. Gen. 146, 337–346 (2017). [DOI] [PubMed] [Google Scholar]
  • 30.Nook E. C., Vidal Bustamante C. M., Cho H. Y., Somerville L. H., Use of linguistic distancing and cognitive reappraisal strategies during emotion regulation in children, adolescents, and young adults. Emotion 20, 525–540 (2020). [DOI] [PubMed] [Google Scholar]
  • 31.Kross E., et al. , Third-person self-talk reduces ebola worry and risk perception by enhancing rational thinking. Appl. Psychol. Health Well-Being 9, 387–409 (2017). [DOI] [PubMed] [Google Scholar]
  • 32.Kross E., et al. , Self-talk as a regulatory mechanism: How you do it matters. J. Pers. Soc. Psychol. 106, 304–324 (2014). [DOI] [PubMed] [Google Scholar]
  • 33.Orvell A., et al. , Does distanced self-talk facilitate emotion regulation across a range of emotionally intense experiences? Clin. Psychol. Sci. 9, 68–78 (2021). [Google Scholar]
  • 34. Tackman A. M., et al., Depression, negative emotionality, and self-referential language: A multi-lab, multi-measure, and multi-language-task research synthesis. J. Pers. Soc. Psychol. 116, 817–834 (2019).
  • 35. Edwards E. R., Wupperman P., Emotion regulation mediates effects of alexithymia and emotion differentiation on impulsive aggressive behavior. Deviant Behav. 38, 1160–1171 (2017).
  • 36. Brockmeyer T., et al., Me, myself, and I: Self-referent word use as an indicator of self-focused attention in relation to depression and anxiety. Front. Psychol. 6, 1564 (2015).
  • 37. Todorov G., Mayilvahanan K., Cain C., Cunha C., Context- and subgroup-specific language changes in individuals who develop PTSD after trauma. Front. Psychol. 11, 989 (2020).
  • 38. Zimmermann J., Brockmeyer T., Hunn M., Schauenburg H., Wolf M., First-person pronoun use in spoken language as a predictor of future depressive symptoms: Preliminary evidence from a clinical sample of depressed patients. Clin. Psychol. Psychother. 24, 384–391 (2017).
  • 39. Stirman S. W., Pennebaker J. W., Word use in the poetry of suicidal and nonsuicidal poets. Psychosom. Med. 63, 517–522 (2001).
  • 40. Berry-Blunt A. K., Holtzman N. S., Donnellan M. B., Mehl M. R., The story of “I” tracking: Psychological implications of self-referential language use. Soc. Personal. Psychol. Compass 15, e12647 (2021).
  • 41. Althoff T., Clark K., Leskovec J., Large-scale analysis of counseling conversations: An application of natural language processing to mental health. Trans. Assoc. Comput. Linguist. 4, 463–476 (2016).
  • 42. Haug S., Strauss B., Gallas C., Kordy H., New prospects for process research in group therapy: Text-based process variables in psychotherapeutic Internet chat groups. Psychother. Res. 18, 88–96 (2008).
  • 43. Van Staden C. W., Fulford K. W. M. M., Changes in semantic uses of first person pronouns as possible linguistic markers of recovery in psychotherapy. Aust. N. Z. J. Psychiatry 38, 226–232 (2004).
  • 44. Arntz A., Hawke L. D., Bamelis L., Spinhoven P., Molendijk M. L., Changes in natural language use as an indicator of psychotherapeutic change in personality disorders. Behav. Res. Ther. 50, 191–202 (2012).
  • 45. Barlow D. H., et al., The Unified Protocol for Transdiagnostic Treatment of Emotional Disorders: Therapist Guide (Oxford University Press, New York, NY, 2011).
  • 46. Mennin D. S., Emotion regulation therapy for generalized anxiety disorder. Clin. Psychol. Psychother. 11, 17–29 (2004).
  • 47. Edwards T., Holtzman N. S., A meta-analysis of correlations between depression and first person singular pronoun use. J. Res. Pers. 68, 63–68 (2017).
  • 48. Götz F. M., Gosling S. D., Rentfrow P. J., Small effects: The indispensable foundation for a cumulative psychological science. Perspect. Psychol. Sci. 17, 205–215 (2022).
  • 49. Kashdan T. B., Barrios V., Forsyth J. P., Steger M. F., Experiential avoidance as a generalized psychological vulnerability: Comparisons with coping and emotion regulation strategies. Behav. Res. Ther. 44, 1301–1320 (2006).
  • 50. Kashdan T. B., et al., A contextual approach to experiential avoidance and social anxiety: Evidence from an experimental interaction and daily interactions of people with social anxiety disorder. Emotion 14, 769–781 (2014).
  • 51. Wielgosz J., Goldberg S. B., Kral T. R. A., Dunne J. D., Davidson R. J., Mindfulness meditation and psychopathology. Annu. Rev. Clin. Psychol. 15, 285–316 (2019).
  • 52. Kabat-Zinn J., Mindfulness-based interventions in context: Past, present, and future. Clin. Psychol. Sci. Pract. 10, 144–156 (2003).
  • 53. Hayes S. C., Wilson K. G., Mindfulness: Method and process. Clin. Psychol. Sci. Pract. 10, 161–165 (2003).
  • 54. Aldao A., The future of emotion regulation research: Capturing context. Perspect. Psychol. Sci. 8, 155–172 (2013).
  • 55. Ford B. Q., Troy A. S., Reappraisal reconsidered: A closer look at the costs of an acclaimed emotion-regulation strategy. Curr. Dir. Psychol. Sci. 28, 195–203 (2019).
  • 56. Satpute A. B., Nook E. C., Cakar M., “The role of language in the construction of emotion and memory” in Neuroscience of Enduring Change: Implications for Psychotherapy, Lane R., Nadel L., Eds. (Oxford University Press, New York, NY, 2020), pp. 56–88.
  • 57. Kroenke K., et al., The PHQ-8 as a measure of current depression in the general population. J. Affect. Disord. 114, 163–173 (2009).
  • 58. Razykov I., Ziegelstein R. C., Whooley M. A., Thombs B. D., The PHQ-9 versus the PHQ-8—Is item 9 useful for assessing suicide risk in coronary artery disease patients? Data from the Heart and Soul Study. J. Psychosom. Res. 73, 163–168 (2012).
  • 59. Corson K., Gerrity M. S., Dobscha S. K., Screening for depression and suicidality in a VA primary care setting: 2 items are better than 1 item. Am. J. Manag. Care 10, 839–845 (2004).
  • 60. Shin C., Lee S. H., Han K. M., Yoon H. K., Han C., Comparison of the usefulness of the PHQ-8 and PHQ-9 for screening for major depressive disorder: Analysis of psychiatric outpatient data. Psychiatry Investig. 16, 300–305 (2019).
  • 61. Spitzer R. L., Kroenke K., Williams J. B., Löwe B., A brief measure for assessing generalized anxiety disorder: The GAD-7. Arch. Intern. Med. 166, 1092–1097 (2006).
  • 62. Revelle W., psych: Procedures for Personality and Psychological Research (Northwestern University, Evanston, IL, 2016).
  • 63. Kroenke K., et al., Patient health questionnaire anxiety and depression scale: Initial validation in three clinical trials. Psychosom. Med. 78, 716–727 (2016).
  • 64. R Core Team, R: A Language and Environment for Statistical Computing (R Core Team, 2021).
  • 65. Mehl M. R., Robbins M. L., Holleran S. E., How taking a word for a word can be problematic: Context-dependent linguistic markers of extraversion and neuroticism. J. Methods Meas. Soc. Sci. 3, 30–50 (2012).
  • 66. Pennebaker J. W., Booth R. J., Francis M. E., Linguistic Inquiry and Word Count: LIWC 2007 (Pennebaker Conglomerates, Inc., Austin, TX, 2007).
  • 67. Enders C. K., Tofighi D., Centering predictor variables in cross-sectional multilevel models: A new look at an old issue. Psychol. Methods 12, 121–138 (2007).
  • 68. Bolger N., Laurenceau J.-P., Intensive Longitudinal Methods: An Introduction to Diary and Experience Sampling Research (Guilford, New York, NY, 2013).
  • 69. Nook E. C., Flournoy J. C., Rodman A. M., Mair P., McLaughlin K. A., High emotion differentiation buffers against internalizing symptoms following exposure to stressful life events in adolescence: An intensive longitudinal study. Clin. Psychol. Sci. 9, 699–718 (2021).
  • 70. Rodman A. M., et al., A year in the social life of a teenager: Within-person fluctuations in stress, phone communication, and anxiety and depression. Clin. Psychol. Sci. 9, 791–809 (2020).
  • 71. Starr L. R., Hershenberg R., Li Y. I., Shaw Z. A., When feelings lack precision: Low positive and negative emotion differentiation and depressive symptoms in daily life. Clin. Psychol. Sci. 5, 613–631 (2017).
  • 72. Vidal Bustamante C. M., et al., Within-person fluctuations in stressful life events, sleep, and anxiety and depression symptoms during adolescence: A multiwave prospective study. J. Child Psychol. Psychiatry 61, 1116–1125 (2020).
  • 73. Davison A. C., Hinkley D. V., Bootstrap Methods and Their Applications (Cambridge University Press, Cambridge, United Kingdom, 1997).
  • 74. Stan Development Team, Stan modeling language: User’s guide and reference manual. https://mc-stan.org/manual.html. Accessed 14 February 2020.
  • 75. Hoffman M. D., Gelman A., The no-U-turn sampler: Adaptively setting path lengths in Hamiltonian Monte Carlo. J. Mach. Learn. Res. 15, 1593–1623 (2014).
  • 76. Leisch F., FlexMix: A general framework for finite mixture models and latent class regression in R. J. Stat. Softw. 11, 1–18 (2004).
  • 77. Vickers A. J., Altman D. G., Statistics notes: Analysing controlled trials with baseline and follow up measurements. BMJ 323, 1123–1124 (2001).
  • 78. Clifton L., Clifton D. A., The correlation between baseline score and post-intervention score, and its implications for statistical analysis. Trials 20, 43 (2019).
  • 79. Ben-Shachar M., Lüdecke D., Makowski D., effectsize: Estimation of effect size indices and standardized parameters. J. Open Source Softw. 5, 2815 (2020).
  • 80. Cohen J., A power primer. Psychol. Bull. 112, 155–159 (1992).
  • 81. Edwards L. J., Muller K. E., Wolfinger R. D., Qaqish B. F., Schabenberger O., An R² statistic for fixed effects in the linear mixed model. Stat. Med. 27, 6137–6157 (2008).
  • 82. Jaeger B. C., Edwards L. J., Das K., Sen P. K., An R² statistic for fixed effects in the generalized linear mixed model. J. Appl. Stat. 44, 1086–1105 (2017).
  • 83. Bates D., Mächler M., Bolker B. M., Walker S. C., Fitting linear mixed-effects models using lme4. J. Stat. Softw. 67, 1–48 (2015).
  • 84. Kuznetsova A., Brockhoff P. B., Christensen R. H. B., lmerTest package: Tests in linear mixed effects models. J. Stat. Softw. 82, 1–26 (2017).
  • 85. Bürkner P. C., brms: An R package for Bayesian multilevel models using Stan. J. Stat. Softw. 80, 1–28 (2017).
  • 86. Bürkner P. C., Advanced Bayesian multilevel modeling with the R package brms. R J. 10, 395–411 (2018).

Associated Data

Supplementary Materials

Supplementary File
pnas.2114737119.sapp.pdf (462.3KB, pdf)

Data Availability Statement

Markdowns documenting analytic code and results of all analyses have been deposited in the Open Science Framework (https://osf.io/u98r3/). Some study data are available. (Due to the sensitive and proprietary nature of Talkspace data, they cannot be made publicly available. However, Talkspace data can be made available upon completion of a Data Use Agreement and data security review with Talkspace.)


Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
