Abstract
Low self-esteem and interpersonal needs (i.e., thwarted belongingness (TB) and perceived burdensomeness (PB)) have a major impact on depression and suicide attempts. Individuals seek social connectedness on social media to boost their self-esteem and alleviate their loneliness. Social media platforms allow people to express their thoughts, experiences, beliefs, and emotions. Prior studies on mental health from social media have focused on symptoms, causes, and disorders, whereas an initial screening of social media content for interpersonal risk factors and low self-esteem may raise early alerts and help assign therapists to users at risk of mental disturbance. Standardized scales measure self-esteem and interpersonal needs through questions created using psychological theories. In the current research, we introduce a psychology-grounded and expertly annotated dataset, LoST: Low Self esTeem, to study and detect low self-esteem on Reddit. Through an annotation approach involving checks on coherence, correctness, consistency, and reliability, we ensure a gold standard for supervised learning. We present results from different deep language models tested using two data augmentation techniques. Our findings suggest developing a class of language models that infuses psychological and clinical knowledge.
Index Terms: dataset, interpersonal risk factors, low self-esteem, Reddit post
I. Introduction
In the first year of the COVID-19 pandemic, the prevalence of anxiety and depression increased by 25%. Yet, according to World Health Organization (WHO) reports¹, many of these cases have gone undiagnosed. Interpersonal risk factors, including loneliness and low self-esteem, have impacted individuals’ mental health, triggering sub-clinical depression that worsens into clinical depression if left untreated. A recent study uses the interpersonal requirements of belongingness and positive self-esteem to highlight the latent phase from depression to suicidal ideation [19]. Poor self-esteem is a common issue: estimates of incidence in the general population go up to 85%². Depression and the interplay between depression and low self-esteem can result in excessive stress, subpar performance under challenging circumstances, social anxiety, and reduced quality of life [1]. Further, researchers describe low self-esteem as a major trigger for increased risk of depression, anxiety, and suicidal ideation, affecting cognitive functioning, sleep quality, and overall well-being [18]. Youth with multiple suicide attempts were more likely to have persistent suicidal ideation, interpersonal struggles, a feeling of disconnection from others, and low self-competence [8]. Past work shows that people with low self-esteem have low self-confidence, which is closely associated with social disengagement [27]. On the other hand, having strong or “intact” self-esteem can be a barrier against the onset of mental illness. Raising one’s self-esteem therefore seems a sensible strategy for preventing and treating mental illness in the general population.
A major challenge in the US is the overburdened healthcare system, which has forced users (early adults and teenagers) to seek alternate avenues to meet their treatment needs. For instance, Reddit has been the go-to anonymous social media platform for people to express their thoughts and beliefs easily without being judgmental of each other’s experiences. Prior works demonstrate the potential to learn about social media users’ mental health conditions using user-generated text and behavioral analyses of their social media activity [5]. In this work, we focus on detecting user-generated Reddit posts with low self-esteem to indicate the prospective presence of depression (see illustration in Figure 1).
Fig. 1. Overview of the annotation scheme for the LoST dataset. A specialist can be assigned through early prediction of hallucinations of low self-esteem. Our task of classifying Reddit posts with low self-esteem supports early control mechanisms against increasing levels of depression and the severity of suicide risk.
Advancements in natural language processing (NLP) for understanding the language of mental health on social media have gained significant traction, with the purpose of assisting therapists and ensuring that users at risk are recognized promptly [10] [15]. Datasets to support mental health research using NLP and artificial intelligence (AI) have focused more on statistical features and less on standardized scales or questionnaires (SCQ). Recently, the research community has demonstrated explainable detection of suicide and depression by using datasets accompanied by SCQ [30]. Through the following post P, we present the benefit of having a dataset annotated using SCQ:
P: I feel like a loser because I do not have a group of friends. I have friends but I do not have a group of them to be with while I’m at college. I feel like a loser because I’m always alone. Unfortunately I’m to blame because I am extremely introverted and self conscious. I’m self conscious to the point where I won’t be friends with people because acquaintances mocked them frequently. I’m very lonely. I began to use weed as a coping mechanism for my depression and we all know how that has turned out. I’m very low risk because I believe that I have more to live for. I’m just tired of feeling alone.
The author’s perception of being unsuccessful and mentally weak in post P suggests the presence of low self-esteem in the given text. It exemplifies a recent model that emphasizes how self-esteem relates to perceptions of one’s value concerning personal adequacy [22] and loneliness [26]. The highlighted phrases in P are the focused text spans that present the warning signs of low self-esteem. These text spans reflect the psychological theories used by annotators in creating the proposed dataset.
a). Psychological Ground:
According to the American psychologist Abraham Maslow’s theory of the “hierarchy of human needs,” having a high sense of self-worth is a fundamental need [9]. He distinguishes between two types of “esteem”: (i) the need for respect from others in the form of success, admiration, and recognition, and (ii) the need for respect from oneself in the form of self-love, self-confidence, skill, or aptitude. In this context, two professionals (a clinical psychologist and a social NLP researcher) worked together to design the annotation guidelines. To diagnose poor self-esteem, our professionals use SCQs, including Rosenberg’s Self-esteem Scale (RSS), the Coopersmith Self-Esteem Inventory (CSEI), and the Interpersonal Needs Questionnaire (INQ-18).
b). Our Contributions:
Our goal is to facilitate public health surveillance and health applications via releasing mental health data annotated from social media posts and analyzing current AI models for low self-esteem detection. To the best of our knowledge, the quantitative literature on low self-esteem and mental health has no publicly available language resources due to the sensitive nature of the data. To this end, our contributions can be summarized as follows:
- We construct and publicly release LoST: Low Self-esTeem, a new psychology-grounded dataset of 3,251 Reddit posts to facilitate social computing in mental health.
- We create a robust dataset considering FAIR principles to facilitate reliability and reusability [12].
- We experiment with deep language models as classifiers for low self-esteem detection and establish them as baselines to identify challenges toward better AI solutions.
II. Dataset
We adhere to ethical considerations in constructing and releasing the LoST dataset in the public domain. In this section, we first discuss the construction of the corpus (see Section II-A), followed by the expert-driven annotation scheme (see Section II-B). Our experts, a senior clinical psychologist and a social NLP researcher, frame the annotation scheme through extensive discussions on guidelines and perplexities, train three postgraduate students for eight hours, and employ them on the annotation task. We further examine the annotation task’s coherence, correctness, consistency, and reliability (see Section II-C).
A. Corpus Construction
We extract 200 Reddit posts per day from the subreddits r/depression and r/SuicideWatch from 2 December 2021 to 4 January 2022 through the Python Reddit API Wrapper (PRAW)³. We filter the candidate posts with three major criteria:
- We keep posts whose body has length > 0, and we do not release any metadata, to adhere to ethical constraints.
- We first identify the supportive statements and remove the posts that do not reflect mental disturbance. For example, we remove the following posts:
  P1: Mental health is a very important part of one’s life. One should focus on their mental and emotional wellbeing.
  P2: We encourage and welcome all our friends with mental disorders to join hands with us on our platform and lets make our journey beautiful and courageous.
  The generic nature of the Reddit posts P1 and P2 does not convey any significant information about individuals who are at risk.
- We remove the posts that reflect the intent of self-harm and suicidal tendencies without any contextual information about cause [16] and consequences [11]. For example, consider the post below:
  P3: I am done with my life. I don’t want to live anymore.
  The given post P3 highlights the user’s suicidal ideation, but there is no context representing the cause or consequence of the suicidal tendency.
We obtained a total of 4,357 candidate posts. About 25% of the posts exceed 300 words in length. As our psychology-driven task of identifying low self-esteem is highly complex, we simplify it by filtering out the candidate posts with more than 300 words. We obtain a final corpus of 3,251 posts and deploy it for the annotation task.
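The two mechanical filters described above (non-empty body, at most 300 words) can be sketched as follows. This is a minimal illustration, not the authors' released code: the function name `filter_candidates` and whitespace-based word counting are assumptions, and the two judgment-based criteria (generic supportive posts, posts without cause or consequence context) are manual steps that are not modeled here.

```python
def filter_candidates(posts, max_words=300):
    """Keep posts with a non-empty body of at most `max_words` words.

    `posts` is an iterable of post bodies (strings). Word counting by
    whitespace splitting is an assumption; the paper does not specify
    the tokenization used for the 300-word cutoff.
    """
    kept = []
    for body in posts:
        text = body.strip()
        if not text:
            continue  # criterion: body length must be > 0
        if len(text.split()) > max_words:
            continue  # criterion: drop posts longer than 300 words
        kept.append(text)
    return kept


# Tiny usage example with synthetic bodies.
long_post = " ".join(["word"] * 301)
edge_post = " ".join(["word"] * 300)
kept_posts = filter_candidates(["", "I feel alone.", long_post, edge_post])
```

A post of exactly 300 words is kept under this reading, since only posts with more than 300 words are removed.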
B. Annotation Scheme
Detecting low self-esteem in a given text is a highly subjective and complex problem, and naive judgment may induce errors. To mitigate this problem, we built a team of a clinical psychologist (reading between the lines to understand the psychological perception of the human mind) and a social NLP expert (text-based marking for outstanding AI models). To balance the trade-off between the clinical psychology and NLP domains, the experts suggest fine-grained guidelines to mark low self-esteem as a latent feeling of self-doubt, worthlessness, and lack of confidence. To facilitate the annotation task, our experts frame the annotation scheme around two research questions: (i) “RQ1: Does the text contain indicators of low self-esteem that signal suicidal risk or self-harm in a person?”, and (ii) “RQ2: To what extent should annotators read in-between-the-lines to mark the presence or absence of low self-esteem?”. In this section, we discuss the annotation guidelines (see Section II-B1) and perplexity guidelines (see Section II-B2) that ensure coherence during the annotation task.
1). Annotation Guidelines:
Prior work examines self-esteem in terms of self-worth, suggesting three dimensions of self-esteem: worth-based, efficacy-based, and authenticity-based esteem [25]. With this background knowledge and a comprehensive literature survey, our experts follow SCQs such as RSS [23], a well-established questionnaire for detecting low self-esteem, to frame the annotation guidelines. RSS is a ten-item scale with items answered on a four-point scale, from strongly agree to strongly disagree. Among these ten items, five are positively worded statements and the remaining five are negatively worded statements, measuring global self-worth with a score of 0–40. We also consider the adult version of the CSEI, a 58-item inventory designed to assess an individual’s global self-esteem [2]. Furthermore, our experts consider the 18-item INQ to make the annotation guidelines even more mature. The experts annotate 50 data-points of the corpus, in isolation, using the fine-grained guidelines to avoid biases. We discover 40% of possible dilemmas in the annotation task, owing to its subjective nature. To resolve this problem, we identify the gaps in the annotation guidelines and address them with perplexity guidelines.
2). Perplexity Guidelines:
We propose perplexity guidelines to simplify the task and facilitate future annotations. We observe two major confusions:
a). Low Self-esteem in the Past:
Annotators must check whether low self-esteem expressed about the past is still an alarming prospect of self-harm or suicidal risk. Consider the post A1 given below:
A1: I used to be boring and unattractive when no one used to like me, especially before Christmas but today I am celebrating this New Year with my friends where everyone likes me.
We frame rules to handle such contradictory statements (as shown in A1), which contain text spans indicating both (i) boring and unattractive: an unpleasant perception about oneself before Christmas, and (ii) liking me: people liking the author. The post exemplifies knowing oneself through public opinion, suggesting only temporary acceptance. Our experts argue that low self-esteem is present in the user’s perception and may return after the temporary celebration. Thus, our experts recommend marking such contradictions as reflecting low self-esteem.
b). Ambiguity with Social Experiences:
Sometimes, the unjust behaviour or biases of society make a person mentally disturbed. In our work, our guidelines support detecting low self-esteem within oneself rather than in public opinion. Consider the post A2 below:
A2: They don’t like me at all. My friends are mean to me. They told me that they are not having any celebrations, but in reality, they were celebrating in amusement park. I don’t care about what they think, but now its all boring.
In A2, the author is feeling alone because of some subjective biases. Although public opinion may seed the feeling of interpersonal risk factors within an individual, such texts are not marked as the presence of LoST if low self-esteem is not explicitly stated in the given text, as we choose not to make any assumptions.⁴ As a result, we observe the user expressing a feeling of lonesomeness and alienation that is clearly not rooted in low self-esteem. Thus, we mark the given text as having no signs of low self-esteem.
C. Annotation Task
We employ three postgraduate students for manual annotation, and the experts train them for eight hours on the annotation scheme to ensure coherence. Through three successive trial sessions, each annotating 40 samples, we verify their alignment with and understanding of the task to facilitate correct and coherent annotations.
The three students sat in separate rooms and annotated their files individually to avoid any biases. Each student annotated 50 samples per day for 66 weekdays to maintain consistency. We further validate all three annotated files using the Fleiss’ Kappa inter-observer agreement study, where κ is calculated as 78.52%, ensuring the reliability of our judgment, followed by the experts’ validation. We obtain the final annotations through a majority voting mechanism. We justify the findability and reusability of LoST through the FAIR principles.
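The agreement check and label aggregation described above can be sketched as follows. This is an illustrative implementation of the standard Fleiss' kappa formula and a simple majority vote, assuming binary labels (0 = absent, 1 = present) and three annotators per post; the function names and the toy ratings are hypothetical and do not reproduce the reported κ.

```python
from collections import Counter


def fleiss_kappa(ratings, categories=(0, 1)):
    """Fleiss' kappa for `ratings`, a list of per-item label lists.

    Every item must be rated by the same number of annotators.
    Undefined (division by zero) if all ratings fall in one category.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][j]: number of annotators assigning item i to category j
    counts = [[Counter(item)[c] for c in categories] for item in ratings]
    # Observed agreement per item, then averaged over items
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_items) / n_items
    # Expected agreement from the overall category proportions
    totals = [sum(row[j] for row in counts) for j in range(len(categories))]
    p_e = sum((t / (n_items * n_raters)) ** 2 for t in totals)
    return (p_bar - p_e) / (1 - p_e)


def majority_vote(item):
    """Final label for one post; with three annotators and two labels there are no ties."""
    return Counter(item).most_common(1)[0][0]


# Toy example: four posts, three annotators each (synthetic labels).
toy_ratings = [[1, 1, 1], [0, 0, 0], [1, 1, 0], [0, 0, 1]]
kappa = fleiss_kappa(toy_ratings)
final_labels = [majority_vote(item) for item in toy_ratings]
```

An odd number of annotators with binary labels guarantees that majority voting always yields a unique label, which is one practical reason to use three annotators.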
D. FAIR Principles
The FAIR guiding principles increase the Findability, Accessibility, Interoperability, and Reusability of the dataset, emphasizing machine actionability given the increasing reliance on computational systems to facilitate future studies [29]. The LoST dataset contains the text and the label for each of the 3,251 data-points. We release the first version of the dataset as LoST.v1 on GitHub⁵. The LoST dataset is available in comma-separated format to facilitate its reusability and interoperability.
To adhere to the ethical constraints of privacy, safety, and accountability, we do not make any metadata available in the public domain, yet our dataset can be used effectively and opens up new research directions in NLP-centered mental health analysis.
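Because the release is a comma-separated file with one text and one label per data-point, it can be read with standard tooling. The sketch below uses only the Python standard library; the column names `text` and `label` and the in-memory sample rows are assumptions for illustration, and the released file may use different headers.

```python
import csv
import io
from collections import Counter


def load_lost(handle, text_col="text", label_col="label"):
    """Read (text, label) pairs from an open CSV handle.

    `text_col` and `label_col` are hypothetical header names; adjust them
    to match the headers of the released file.
    """
    reader = csv.DictReader(handle)
    return [(row[text_col], int(row[label_col])) for row in reader]


# In-memory stand-in for the released file (synthetic rows).
sample = io.StringIO(
    'text,label\n'
    '"I feel like a loser, always alone.",1\n'
    '"My friends went out without me.",0\n'
)
rows = load_lost(sample)
label_counts = Counter(label for _, label in rows)
```

For the real file, the same function can be called on `open(path, newline="", encoding="utf-8")`, where `path` points to the downloaded CSV.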
III. Experiments and Evaluation
In this section, we perform experiments with the LoST dataset over six different classifiers and evaluate their performance on low self-esteem detection with precision, recall, F1-score, and accuracy. To measure the impact of the imbalanced dataset, we use the Matthews Correlation Coefficient (MCC), which ranges from −1 to +1: values closer to 0 suggest random prediction, values closer to +1 suggest near-perfect prediction, and values closer to −1 indicate poor models [6]. The MCC on the original data suggests the need to add more samples for the less-represented class. We employ two data augmentation techniques and run all baselines on the resulting dataset.
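To make the role of MCC concrete, the sketch below computes it from the four cells of a binary confusion matrix using the standard definition. The function name `mcc` and the convention of returning 0.0 when the denominator is zero are choices made for this illustration; the class counts in the usage example (729 positive and 2,522 negative posts) are inferred from the corpus size and the class ratio reported in this paper.

```python
import math


def mcc(tp, fp, fn, tn):
    """Matthews Correlation Coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal is empty (zero denominator), a common
    convention for the undefined case.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# A degenerate classifier that always predicts "absent" on imbalanced data:
# accuracy looks acceptable, while MCC exposes the lack of predictive signal.
tp, fp, fn, tn = 0, 0, 729, 2522
accuracy = (tp + tn) / (tp + fp + fn + tn)
always_absent_mcc = mcc(tp, fp, fn, tn)
```

In this example the accuracy is roughly 0.78 although the classifier never detects a single positive post, while the MCC is 0, which is why MCC is reported alongside accuracy for imbalanced data.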
a). Classifiers:
We perform extensive analysis to build baselines and highlight their limitations. We consider sequence-to-sequence and attention-based models as suitable baselines for the task. The following classifiers have shown state-of-the-art performance in their respective studies: (i) Recurrent Neural Networks (RNN) (LSTM [4], GRU [7]), and (ii) pre-trained transformer-based models such as BERT [20], RoBERTa [21], and XLNet [3].
b). Experimental Setup:
For consistency, we used the same experimental settings for all models with 10-fold cross-validation. All our results are reported as the average across all folds. Posts of varying lengths are padded, and models are trained for up to 150 epochs with early stopping at a patience of 10 epochs. We set the hyperparameters for our experiments with transformer-based models as H = 200, O = Adam, learning rate = 1 × 10⁻⁵, and batch size = 16.
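The evaluation protocol above (10-fold cross-validation, up to 150 epochs, early stopping with a patience of 10) can be sketched in plain Python as follows. This is a schematic under stated assumptions: the folds are unstratified random splits, early stopping monitors a validation loss, and the helper names `k_fold_indices` and `stop_epoch` are hypothetical; the paper does not specify these details.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k shuffled, unstratified folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most 1
    for i in range(k):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, folds[i]


def stop_epoch(val_losses, patience=10):
    """Return the 1-based epoch at which training stops.

    Training halts once the monitored loss has failed to improve for
    `patience` consecutive epochs, or when the loss list is exhausted.
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, waited = loss, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return len(val_losses)


folds = list(k_fold_indices(3251, k=10))
# Loss improves until epoch 2, then plateaus: stop after 10 stale epochs.
stopped_at = stop_epoch([1.0, 0.9] + [0.95] * 20, patience=10)
```

Per-fold metrics would then be averaged across the ten folds, as the text describes.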
A. Experimental Results
1). Original data:
Table I contains the classification performance of low self-esteem detection. The results suggest average performance with low values of MCC and the need to reduce class imbalance. We attribute this average performance to the inability of existing classifiers to capture contextual information. Among the existing classifiers, RoBERTa outperforms all other models on the original data, with 82% accuracy. The MCC values for classifiers on the original data of LoST average +0.20 for RNN models and +0.48 for pre-trained transformer-based models, suggesting near-random prediction, especially for RNN models. We employ data augmentation techniques to reduce the class imbalance.
TABLE I.
Comparison of the baseline methods. Precision (P), Recall (R), F1-score (F1), and Accuracy are averaged over 10-fold cross-validation. Absent: Absence of Low Self-Esteem, Present: Presence of Low Self-Esteem. The second column specifies Composition (C) as Original data (O) or Augmented data (A).
| Model | C | Absent P | Absent R | Absent F1 | Present P | Present R | Present F1 | Accuracy |
|---|---|---|---|---|---|---|---|---|
| LSTM | O | 0.81 | 0.92 | 0.86 | 0.50 | 0.26 | 0.34 | 0.77 |
| LSTM | A | 0.84 | 0.72 | 0.78 | 0.72 | 0.84 | 0.78 | 0.78 |
| GRU | O | 0.81 | 0.91 | 0.86 | 0.48 | 0.27 | 0.35 | 0.76 |
| GRU | A | 0.84 | 0.82 | 0.83 | 0.80 | 0.82 | 0.81 | 0.82 |
| BERT | O | 0.89 | 0.86 | 0.88 | 0.57 | 0.62 | 0.60 | 0.81 |
| BERT | A | 0.96 | 0.84 | 0.89 | 0.78 | 0.94 | 0.85 | 0.88 |
| RoBERTa | O | 0.92 | 0.86 | 0.89 | 0.51 | 0.67 | 0.58 | 0.82 |
| RoBERTa | A | 0.90 | 0.83 | 0.86 | 0.80 | 0.88 | 0.84 | 0.85 |
| XLNet | O | 0.86 | 0.89 | 0.87 | 0.65 | 0.59 | 0.62 | 0.81 |
| XLNet | A | 0.95 | 0.80 | 0.87 | 0.74 | 0.93 | 0.82 | 0.85 |
2). Augmented data:
As observed before, the ratio of data-points for presence : absence is 22.4 : 77.6, which is roughly 1 : 3. To handle this problem of an imbalanced dataset, we augment the positive data-points with two data augmentation approaches:
- Easy Data Augmentation (EDA): We employ the EDA method to generate an additional corpus of 729 samples [28]. EDA originally comprises Synonym Replacement (SR), Random Insertion (RI), Random Swap (RS), and Random Deletion (RD). We found that RS deforms the lexical sequence, so we exclude it and use SR and RD with 20%, followed by RI. Consider an example E below:
  E: I am good for nothing and thus, jobless since two years.
  Augmentation: I am good for nothing and (← RD) thus, jobless (← unemployed, SR) since two (← RD) few (← SR) years now (← RI).
  Augmented text (E): I am good for nothing thus, unemployed since few years now.
- Back Translation (BT): We use the French language for BT [24] to add 729 samples for the positive data points of the LoST dataset. Consider the example E given above and the following conversions with back translation:
  French Translation: Je suis bon à rien et donc sans travail depuis deux ans.
  Augmented text (E): I’m good for nothing and therefore out of work for two years.
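The EDA variant described above (SR and RD, followed by RI, with RS left out) can be sketched as below. This is a simplified illustration and not the authors' pipeline: the toy synonym table stands in for the WordNet lookup used by the original EDA method, the number of replaced and inserted words is fixed by hand, the 20% rate is applied here to deletion only, and all function names are hypothetical.

```python
import random

# Toy synonym table for illustration only; EDA proper draws synonyms from WordNet.
TOY_SYNONYMS = {
    "jobless": ["unemployed"],
    "unemployed": ["jobless"],
    "two": ["few"],
    "good": ["decent"],
}


def synonym_replacement(words, synonyms, n, rng):
    """SR: replace up to n randomly chosen words that have a known synonym."""
    out = list(words)
    candidates = [i for i, w in enumerate(out) if w in synonyms]
    for i in rng.sample(candidates, min(n, len(candidates))):
        out[i] = rng.choice(synonyms[out[i]])
    return out


def random_deletion(words, p, rng):
    """RD: drop each word with probability p, keeping at least one word."""
    if not words:
        return []
    kept = [w for w in words if rng.random() >= p]
    return kept or [rng.choice(words)]


def random_insertion(words, synonyms, n, rng):
    """RI: insert a synonym of a random word at a random position, n times."""
    out = list(words)
    for _ in range(n):
        candidates = [w for w in out if w in synonyms]
        if not candidates:
            break
        new_word = rng.choice(synonyms[rng.choice(candidates)])
        out.insert(rng.randint(0, len(out)), new_word)
    return out


def augment(sentence, synonyms=TOY_SYNONYMS, p_delete=0.2, seed=0):
    """Apply SR, then RD, then RI (Random Swap is intentionally omitted)."""
    rng = random.Random(seed)
    words = sentence.split()
    words = synonym_replacement(words, synonyms, n=2, rng=rng)
    words = random_deletion(words, p_delete, rng)
    words = random_insertion(words, synonyms, n=1, rng=rng)
    return " ".join(words)


augmented = augment("I am good for nothing and thus, jobless since two years.")
```

Because the operations are random, the output differs from the hand-annotated example E; fixing `seed` only makes a given run reproducible.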
We obtain 3 times 729 samples as the resulting set of data-points for label 1. Data augmentation thus reduces the class imbalance of LoST. We further perform our experiments with both the original data and the augmented data. We observe an improvement in MCC for RNN models (with an average of +0.64) and pre-trained transformer-based models (with an average of +0.73), suggesting reliable inferences. For all the models, we observe improved accuracy and F1-score with augmented data compared with the original data of LoST. The pre-trained model BERT outperforms all the baselines, suggesting the need for more explainable and accountable models.
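The effect of augmentation on class balance follows from simple arithmetic, sketched below. The original positive count of 729 is inferred from the statement that the positive class triples to 3 times 729 samples and from the reported 22.4% share of 3,251 posts; it is an assumption rather than a figure stated directly in the text.

```python
total_posts = 3251
original_positive = 729  # inferred: 22.4% of 3,251, matching "3 times 729"
negatives = total_posts - original_positive

# Original positives plus one EDA copy and one back-translated copy of each.
augmented_positive = 3 * original_positive
augmented_total = negatives + augmented_positive

share_before = round(100 * original_positive / total_posts, 1)
share_after = round(100 * augmented_positive / augmented_total, 1)
```

Under these assumptions, the positive share rises from about 22% to about 46%, which brings the two classes close to parity.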
B. Discussion
Table I reveals that, in contrast to posts with low self-esteem signals, our baselines are confident in categorizing posts with no signals of low self-esteem. This suggests room for AI and NLP to develop novel algorithms for detecting warning signs of low self-esteem. We considered as baselines a diverse class of language models (base, not fine-tuned): sequence-to-sequence, attention-based, knowledge distillation, and autoregressive models, which have been predominantly used in prior literature [14]. All the models built over augmented data saw an average 7% gain in accuracy, and such consistency indicates the dependability of the dataset. Irrespective of the size of the model, the BERT and RoBERTa language models consistently outperformed the others, highlighting the significance of attention. We observe that by adding external knowledge, such as the knowledge in SCQs, autoregressive models (like XLNet) can outperform BERT/RoBERTa.
We recommend infusing clinically grounded knowledge to build more informative models and testing the classifiers with human-generated explanations, keeping both as open research directions for future developments. We recommend the use of SCQs such as RSS, CSEI, and INQ-18 to build a lexicon based on the experts’ recommendations and infuse it as external knowledge to build more contextualized models. We further suggest developing consequence-preserving data augmentation techniques on the original data of LoST for better representation. The practical applications of this task include problems with work-life and abusive relationships. For instance, the research community has witnessed a surge in job layoffs since the COVID-19 pandemic [13], which has impacted interpersonal risk factors [17] and highlighted the problems of low self-esteem in affected people. Other than jobs and careers, poor performance in school, hallucinations of an unattractive personality among friends and family, and failed relationships are other problems associated with low self-esteem. An early check on such situational mental disturbance may prevent chronic clinical depression and suicidal ideation.
IV. Conclusion
We present LoST, an SCQ-informed gold standard dataset of 3,251 Reddit posts for low self-esteem to facilitate social computing in the mental health domain. Our expert-driven annotation scheme and the annotation task enable a coherent, complete, consistent, and reliable dataset. We developed binary classifiers and observed the best performance with RoBERTa (82% accuracy on the original data) and BERT (88% accuracy on the augmented data). The implications of this work include the potential to improve public health surveillance and health applications that rely on automatically identifying consequences in the posts in which users describe their mental health issues. In future work, we plan to enhance the LoST dataset with more samples and additional labels of contextual explanations (similar to CAMS [16]). We further plan to develop new models tailored explicitly to low self-esteem by infusing knowledge through clinical psychology-grounded lexicons.
Ethics and Broader Impact.
Social media data is often sensitive, especially when the data is related to mental health. Our LoST dataset contains only publicly available posts, and no user’s metadata is made available as we are committed to the ethical practices of protecting the privacy and anonymity of the users. The examples shown in this paper are anonymized, obfuscated, and paraphrased to prevent misuse. Thus, this study does not require any ethical approval. Due to the subjective nature of our task, we expect some biases in our annotations. Clearly, machine learning predictions cannot replace professional mental health diagnostics, counseling, or therapy. As shown in our evaluation, their accuracy and trustworthiness remain insufficient for such purposes. We assume all posts to be a genuine expression of users’ experiences and that they are not manipulative.
Acknowledgement
We extend our sincere acknowledgement to Ritika Bhardwaj, Astha Jain, Amrit Chadha, Veena Krishnan, Ruchi Joshi, and Surjodeep Sarkar for their unwavering support throughout the project. This project was partially supported by NIH R01 AG068007, and the publication cost is partially supported by UMBC SURFF and Healthcare NLP/Healthy AI d.b.a Shaip.
Footnotes
4. We acknowledge that we make contextual observations by reading between the lines, but no emotion-driven assumptions to enable unbiased classification.
Contributor Information
Muskan Garg, Mayo Clinic, Rochester, MN, USA.
Manas Gaur, University of Maryland, Baltimore County, MD, USA.
Raxit Goswami, HealthcareNLP LLC., Louisville, KY, USA.
Sunghwan Sohn, Mayo Clinic, Rochester, MN, USA.
References
- [1] Acarturk C, Smit F, De Graaf R, Van Straten A, Ten Have M, Cuijpers P: Incidence of social phobia and identification of its risk indicators: a model for prevention. Acta Psychiatrica Scandinavica 119(1), 62–70 (2009)
- [2] Adair FL: Coopersmith self-esteem inventories. Test critiques 1, 226–232 (1984)
- [3] Alshahrani A, Ghaffari M, Amirizirtol K, Liu X: Identifying optimism and pessimism in twitter messages using xlnet and deep consensus. In: 2020 International Joint Conference on Neural Networks (IJCNN). pp. 1–8. IEEE (2020)
- [4] Bai X: Text classification based on lstm and attention. In: 2018 Thirteenth International Conference on Digital Information Management (ICDIM). pp. 29–32. IEEE (2018)
- [5] Burke M, Marlow C, Lento T: Social network activity and social well-being. In: Proceedings of the SIGCHI conference on human factors in computing systems. pp. 1909–1912 (2010)
- [6] Chicco D, Jurman G: The advantages of the matthews correlation coefficient (mcc) over f1 score and accuracy in binary classification evaluation. BMC genomics 21, 1–13 (2020)
- [7] Cho K, van Merriënboer B, Bahdanau D, Bengio Y: On the properties of neural machine translation: Encoder–decoder approaches. In: Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation. pp. 103–111 (2014)
- [8] Choi KH, Wang SM, Yeon B, Suh SY, Oh Y, Lee HK, Kweon YS, Lee CT, Lee KU: Risk and protective factors predicting multiple suicide attempts. Psychiatry research 210(3), 957–961 (2013)
- [9] Cox R: The rich harvest of abraham maslow. Motivation and personality pp. 245–271 (1987)
- [10] De Choudhury M, Gamon M, Counts S, Horvitz E: Predicting depression via social media. In: Proceedings of the International AAAI Conference on Web and Social Media. vol. 7 (2013)
- [11] Dunn HL: High-level wellness for man and society. American journal of public health and the nations health 49(6), 786–792 (1959)
- [12] Dunning A, De Smaele M, Böhmer J: Are the fair data principles fair? International Journal of digital curation 12(2), 177–195 (2017)
- [13] El-Deeb A: The first tech layoff wave after years of hypergrowth: How this affects the industry? ACM SIGSOFT Software Engineering Notes 48(1), 4–5 (2023)
- [14] Futami H, Inaguma H, Ueno S, Mimura M, Sakai S, Kawahara T: Distilling the knowledge of bert for sequence-to-sequence asr. arXiv preprint arXiv:2008.03822 (2020)
- [15] Garg M: Mental health analysis in social media posts: A survey. Archives of Computational Methods in Engineering pp. 1–24 (2023)
- [16] Garg M, Saxena C, Krishnan V, Joshi R, Saha S, Mago V, Dorr BJ: Cams: An annotated corpus for causal analysis of mental health issues in social media posts. arXiv preprint arXiv:2207.04674 (2022)
- [17] Garg M, Shahbandegan A, Chadha A, Mago V: An annotated dataset for explainable interpersonal risk factors of mental disturbance in social media posts. Findings of Association of Computation Linguistics (ACL) (2023)
- [18] Korkmaz H, Korkmaz S, Çakar M: Suicide risk in chronic heart failure patients and its association with depression, hopelessness and self esteem. Journal of clinical neuroscience 68, 51–54 (2019)
- [19] Levi-Belz Y, Aisenberg D: Interpersonal predictors of suicide ideation and complicated-grief trajectories among suicide bereaved individuals: A four-year longitudinal study. Journal of affective disorders 282, 1030–1035 (2021)
- [20] Martínez-Castaño R, Htait A, Azzopardi L, Moshfeghi Y: Bert-based transformers for early detection of mental health illnesses. In: Experimental IR Meets Multilinguality, Multi-modality, and Interaction: 12th International Conference of the CLEF Association, CLEF 2021, Virtual Event, September 21–24, 2021, Proceedings 12. pp. 189–200. Springer (2021)
- [21] Murarka A, Radhakrishnan B, Ravichandran S: Detection and classification of mental illnesses on social media using roberta. arXiv preprint arXiv:2011.11226 (2020)
- [22] Rimes K, Smith P, Bridge L: Low self-esteem: A refined cognitive behavioural model. Behavioural and Cognitive Psychotherapy (2023)
- [23] Rosenberg M: Rosenberg self-esteem scale. Journal of Religion and Health (1965)
- [24] Sennrich R, Haddow B, Birch A: Improving neural machine translation models with monolingual data. arXiv preprint arXiv:1511.06709 (2015)
- [25] Stets JE, Burke PJ: Self-esteem and identities. Sociological perspectives 57(4), 409–433 (2014)
- [26] Uram P, Skalski S: Still logged in? the link between facebook addiction, fomo, self-esteem, life satisfaction and loneliness in social media users. Psychological Reports 125(1), 218–231 (2022)
- [27] Watson J, Nesdale D: Rejection sensitivity, social withdrawal, and loneliness in young adults. Journal of Applied Social Psychology 42(8), 1984–2005 (2012)
- [28] Wei J, Zou K: Eda: Easy data augmentation techniques for boosting performance on text classification tasks. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). pp. 6382–6388 (2019)
- [29] Wilkinson MD, Dumontier M, Aalbersberg IJ, Appleton G, Axton M, Baak A, Blomberg N, Boiten JW, da Silva Santos LB, Bourne PE, et al.: The fair guiding principles for scientific data management and stewardship. Scientific data 3(1), 1–9 (2016)
- [30] Zirikly A, Dredze M: Explaining models of mental health via clinically grounded auxiliary tasks. In: CLPsych (2022)
