Abstract
Given the rapidity with which social media has gained ascendancy in society, coupled with the considerable shortfall in addressing the health of social media users, there is a pressing need for automated systems that help identify individuals at risk. In this study, we investigated the potential of people’s social media language to predict their vulnerability to future episodes of mental distress. This work aims to (a) explore the most frequent affective expressions used by online users that reflect their mental health condition and (b) develop predictive models to detect users at risk of psychological distress. Dominant sentiment extraction techniques were employed to quantify the affective expressions and to classify and predict the incidence of psychological distress. We trained a set of seven supervised machine learning classifiers on logs crowd-sourced from 2500 Indian Social Networking Sites (SNS) users and validated them on 3149 tweets collected from Indian Twitter. We tested the models on these two SNS datasets, which differ in scale and ground-truth labeling method, and discuss the relationship between key factors and mental health. Classifier performance is evaluated at all classification thresholds using accuracy, precision, recall, and F1-score, and the experimental results show accuracies ranging from ~ 82 to ~ 99%, improving on the models of relevant existing studies. Thus, this paper presents an automated decision support system to detect users’ susceptibility to mental distress and provides evidence that it can serve as an efficient tool to preserve the psychological health of social media users.
Keywords: Machine learning, Mental sentiment analysis, Mental distress, Social network mental distress, Depression, Stress, Anxiety
Globally, people of all ages suffer from mental health problems yet often do not consult mental health professionals, allowing their conditions to deteriorate. Today’s online population trend shows that people disclose their feelings and daily lives without hesitation on networking sites, making these platforms an emotional record that researchers have successfully leveraged to help detect mental distress (Shen et al., 2017).
Psychological distress (PD) is an umbrella term encompassing mental disorders such as major depressive disorder, anxiety disorder, schizophrenia, bipolar disorder, post-traumatic stress disorder (PTSD), somatization disorder, attention deficit hyperactivity disorder (ADHD), and a variety of other clinical conditions. In general, it can be described as unpleasant feelings or emotions that impair a person’s daily functioning. Sadness, anxiety, distraction, and other symptoms of mental illness are manifestations of psychological distress (Viertiö et al., 2021). The literature suggests that online society is facing psychological distress, and addressing the issue has become necessary in order to protect society (Keles et al., 2020).
For over a decade, researchers have been examining the potential of social media as a tool for mental health measurement and surveillance (Coppersmith et al., 2014; Lin et al., 2014; Wongkoblap et al., 2018). The sentiment and language used in postings on platforms like Facebook, Twitter, and Instagram may indicate feelings of sadness, helplessness, anxiety, panic, stress, worthlessness, and hopelessness that characterize psychological distress as it manifests in the everyday life of online society (Coppersmith et al., 2014; Gkotsis et al., 2017). The explosion of data from ever-growing social networks across the world has attracted researchers to capture and analyze social media data. This has popularized natural language processing tools, machine learning techniques, and data science methods for interpreting such data and gaining considerable, actionable insights (Amir et al., 2017; Glaz et al., 2021). Several researchers have investigated mental health problems using online social platform data and have developed predictive models to classify vulnerable users (De Choudhury et al., 2013; Wongkoblap et al., 2018).
In the literature, most studies have explored the mental health problems faced by online users in Western, European, and Eastern countries (Reece et al., 2017). To the best of our knowledge, very few researchers have developed predictive models to classify Indian online users at risk. In December 2017, the President of India warned of a potential “mental health epidemic” in India, with 10 per cent of its 1.3 billion-strong population having suffered from one or more mental health problems.1 According to the WHO, India, China, and the USA are the countries most affected by anxiety, schizophrenia, and bipolar disorder.2 Moreover, a November 2019 report on mental health in Asia counted only 0.3 psychiatrists and 0.07 clinical psychologists per 100,000 persons in India.3 Our study becomes all the more pertinent as the world faces the repercussions of the COVID-19 pandemic; with the surge in usage of online platforms, society is at heightened risk of mental health issues (Prout et al., 2020).
In this study, we set out to detect mental health problems of people in different parts of India on the basis of affective expressions they share on social platforms. To estimate their PD risk, we employed two dominant feature extraction techniques, Bag of Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF), to build PD models. For our binary classification problem, we chose seven efficient machine learning techniques, namely, Support Vector Machine (SVM), Random Forest (RF), Decision Tree (DT), K-Nearest Neighbor (KNN), Multi-Layer Perceptron (MLP), Naïve Bayes (NB), and Logistic Regression (LR), to classify respondents as distressed or non-distressed.
Our main contributions in this paper are as follows:
(i) We gathered a ground-truth set of affective expressions shared on social media platforms by 2500 participants from different parts of India through a structured questionnaire using crowdsourcing. The samples were labeled as psychologically distressed or normal on the basis of each participant’s self-disclosure. We propose our datasets as a benchmark for psychological distress research in India. To our knowledge, this is the first attempt to document the mental health risk of the Indian social networking site population at large scale.
(ii) We employed two efficient text extraction techniques combined with the capabilities of machine learning algorithms to build predictive models that can predict whether postings are indicative of psychological distress. Further, we validated the results on our dataset with random tweets posted by the general Twitter population of India.
(iii) We illustrate the significance of affective words frequently used by psychologically distressed people to express their suffering in the virtual world, so that timely intervention can be initiated.
In the rest of the paper, we first present the literature review, followed by the proposed methodology, which consists of details of data collection, data pre-processing, feature creation, and model formulation. Lastly, we present the experimental results, followed by conclusions.
Literature Review
We present the relevant literature of the last decade to throw light on the significance of this research and to provide strong indication that the virtual social landscape contains vital information useful for describing and mediating mental health risk.
Natural language as a lens for the detection of mental health issues has been studied for over a decade; most early studies focused only on the occurrence of depression, captured the real-time moods of Twitter users, and suggested that language has potential benefits for describing depressive mood (Park et al., 2012). A study leveraging social media language, emotion, style, and user engagement predicted that Twitter posts could be indicative of depression, with an SVM classifier accuracy of 73% (De Choudhury et al., 2013). Another study suggested that observing and extracting emotions from social media text using machine learning techniques and natural language processing can help in diagnosing depression level (Hassan et al., 2017); the authors reported accuracies of 91% for SVM, 83% for NB, and 80% for maximum entropy. In a multilevel predictive model, the authors employed an SVM classifier with a radial basis kernel to detect depression in social network users and reported accuracy ranging between 68 and 81%, depending on whether all or reduced dimensions were used; adding a life-satisfaction label with principal component analysis, a feature extraction technique, yielded a performance of 81% (Wongkoblap et al., 2018). Another study analyzed Facebook language to screen individuals for depression and documented that language references to sadness, loneliness, hostility, rumination, and increased self-reference may point clinicians to specific symptoms of depression (Eichstaedt et al., 2018).
Recently, researchers have been developing deep learning models to identify textual content associated with mental health issues. In a recent study, the authors demonstrated that the use of sentiments, emotions, and negative words in users’ everyday posts is prominent in determining the level of depression; their classification model, built on long short-term memory (a deep learning algorithm), achieved an accuracy of 70.89%, precision of 50.24%, and recall of 70.89% (Kholifah et al., 2020). Another recent investigation of a dataset retrieved from Twitter highlighted posts related to mental health risk; by applying multiple instance learning with an anaphoric resolution encoder, the researchers achieved 92% accuracy (Wongkoblap et al., 2021). Furthermore, a synthesis of sentiment analysis, machine learning, and NLP techniques achieved 98% precision in identifying depressive texts (Martins et al., 2021).
We infer from the above literature that depression is the most studied type of mental illness in social media posts, and very few works have explored the detection of other types of psychological distress. A broader range of mental illness was examined in the literature by identifying self-reported diagnoses (Coppersmith et al., 2015). In this study, we focus on psychological distress, which encompasses depression, anxiety, and stress episodes, and present a comprehensive automated data-driven decision-making system for the detection of mental distress. The literature also suggests that most studies focused on Western and European online users. To our knowledge, no benchmark dataset for psychological distress research suitable to our study is publicly available (Shen et al., 2017). Our aim is to investigate Indian online users at risk of psychological distress and to determine whether their online posts are indicative of mental ill-health, in order to address the rising issue of mental health risk.
Proposed Methodology
In our approach, the online logs containing affective expressions of respondents are classified using a binary classification technique, where each log/tweet ti in T = {t1, t2, t3, …, tn} is assigned to a category in D = {distressed (1), non-distressed (0)}. There are four main components in our methodology: data collection, data pre-processing, feature creation from text, and model formulation for developing an efficient automated system to ascertain the probability of social media users being distressed or non-distressed based on the emotions they share on virtual platforms.
Data Collection
In the first phase, we crowd-sourced the data through a structured questionnaire shared with people from various walks of life between December 2020 and September 2021. The 2500 participants shared their posts from the previous 3 to 4 months in addition to demographic data, social networking site experiences, and stress, anxiety, and depression markers, with details of internet usage. Stress, anxiety, and depression were measured with DASS-21 (Depression Anxiety and Stress Scale), a popular and validated scale (Lovibond & Lovibond, 1995). The participants’ social networking site addiction was assessed using the Bergen Social Networking Scale, a well-performing scale validated in the literature (Andreassen et al., 2012).
Data Pre-processing
In this phase, we performed two activities: (i) exploratory and statistical analysis of data described in Table 1 and Table 2, respectively, and (ii) analysis of textual data.
Table 1.
Description of variables and methods/scales used for data analysis and interpretation of the results
| Type of variables/features | Variables | Data type | Data analysis method/scale | Data pre-processing results with interpretation |
|---|---|---|---|---|
| Demographic and SNS behavior | Gender | Categorical (male, female, prefer not to say) | Exploratory data analysis using programming in Python | Male: 1226; Female: 1263; Prefer not to say: 11 |
| | Age | Categorical (13–20 years, 21–28 years, 29–36 years, above 36 years) | | 50% of respondents below 30 years; most respondents in the 21–28 years category |
| | Work_Experience | Categorical (below 5 years, 6–10 years, 11–15 years, 16–20 years, still a student, unemployed) | | 20% have less than 5 years of work experience; 18% are still students; 15% are unemployed; 14% have more than 15 years of experience |
| | Number of SNS accessed | Categorical (1, 2–3, 4–6, more than 6) | | Most respondents have 2–3 social media accounts |
| | Frequency of use of SNS | Categorical (once a day, 2–5 times a day, 5–10 times, 10+ times, not daily) | | Most respondents use social media 2–5 times daily |
| | Time spent on SNS | Categorical (less than 30 min, 30–60 min, 1–2 h, 2–3 h, 3 h and above) | | Most respondents spend 30 min to 2 h daily on SNS |
| | Frequency of posts on SNS | Categorical (daily, never, weekly, monthly, every few months) | | 78% of respondents post frequently on SNS |
| | Purpose for using SNS | Categorical (news, browsing, family/friends, inspiration, buying/selling) | | Mostly used to connect with family and friends and for news/informative content |
| Social networking site addiction | Salience, Tolerance, Mood_modification, Relapse, Withdrawal, Conflict | Categorical (very rarely, 1; rarely, 2; sometimes, 3; often, 4; very often, 5) | Bergen Social Networking site addiction scale (score above 19 indicates at-risk) | ~43% of respondents found to be addicted to SNS |
| Self-report of mental health state | Suffering from mental health problem | Categorical (yes/no) | EDA | Yes: 1062; No: 1438; ~57.5% reported no mental health problem, ~42.5% reported suffering from a mental health problem |
| Self-report of taking pills for mental health issues | Medications | Categorical (yes/no) | EDA | Yes: 1050; No: 1450; ~42% reported taking pills for mental health issues |
| Psychological disorder | Depression, Anxiety, Stress | Numeric (never, 0; sometimes, 1; often, 2; almost always, 3) | DASS (Depression Anxiety and Stress Scale) (total score above 59 indicates severity) | ~43% of respondents found to be at risk of psychological disorder |
| SNS posts | Text shared by respondents (logs) | Textual data | Natural language processing tools using programming in Python | ~43% of respondents posted comments containing markers, i.e., words that may suggest risk of mental health issues |
Table 2.
Bivariate statistical analysis of variables with results
A chi-square test is used to assess the association between two categorical variables; the t-test is an inferential statistic used to ascertain a significant difference between the means of two groups, which may be correlated on certain features. The p-value describes how likely it is that the data would have occurred by random chance; a p-value less than 0.05 is statistically significant. The t-test statistic indicates the extent of the difference between the two sample sets: a small t-score means the two samples are similar, and a large t-score indicates they are different.
The exploratory data analysis in Table 1 is done to visualize the online behavioral pattern and identify the relationship between variables as seen among the respondents of our dataset. Table 1 also presents the description of scales employed to measure the SNS addiction of the respondents with their anxiety, stress, and depression levels. Table 2 depicts the statistical methods applied on the pre-processed data in order to discover and interpret the correlation between the various features and mental health vulnerability.
Further, for the text analysis activity, we scrutinized the logs of the respondents to identify those at risk. Expressions of feelings on social media platforms contain abbreviations, emojis, smileys, URLs, hashtags, etc., which pose a challenge when pre-processing this type of raw text. We tackled this challenge using natural language processing methods. First, we cleaned the logs by removing URLs, hashtags, mentions, punctuation, emojis, etc., followed by the removal of stop words, i.e., commonly used words of little importance such as under, over, this, that, from, on, and of. Next, we applied stemming, i.e., reducing words to their root form, followed by tokenization. Tokenization is the breaking down of the entire document into words called tokens in order to interpret the meaning of the text by analyzing the sequence of words.
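As an illustration (not the authors’ exact code), the cleaning, stop-word removal, stemming, and tokenization steps described above can be sketched in Python; the stop-word list and the suffix-stripping stemmer below are simplified stand-ins for standard NLP resources such as NLTK’s.

```python
import re

# Illustrative stop-word list; the study used a standard list (e.g., NLTK's).
STOP_WORDS = {"under", "over", "this", "that", "from", "on", "of",
              "a", "an", "the", "is", "and", "to", "in", "me", "up", "all"}

def clean_log(text):
    """Remove URLs, mentions, hashtags, emojis, punctuation, and digits."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URLs
    text = re.sub(r"[@#]\w+", " ", text)                # mentions, hashtags
    text = re.sub(r"[^a-z\s]", " ", text)               # punctuation, emojis, digits
    return re.sub(r"\s+", " ", text).strip()

def simple_stem(word):
    """Crude suffix stripper standing in for a real stemmer (e.g., Porter)."""
    for suffix in ("ness", "ing", "ed", "es"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Clean, tokenize, drop stop words, and stem a raw log."""
    tokens = clean_log(text).split()
    return [simple_stem(t) for t in tokens if t not in STOP_WORDS]

print(preprocess("Feeling hopeless again... #sad https://t.co/xyz @friend"))
# → ['feel', 'hopeless', 'again']
```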
The abovementioned activities were accomplished with statistical and natural language processing tools on the Colab platform.
Feature Creation and Formulation of Model
For the purpose of feature creation from the pre-processed text data, we chose the set of mental keywords shown in Table 3. These keywords were selected based on literature that classifies social media users as depressed and builds predictive models for various types of mental disorders (Becker et al., 2018; Coppersmith et al., 2018; Eichstaedt et al., 2018).
Table 3.
Set of mental keywords considered for text analysis
| Mental sentiment keywords |
| Depression, Hopeless, Breakdown, Paranoia, Antidepressant, Loneliness, ADHD, Stress, Anxiety, Trauma, Sadness, Bipolar, PTSD |
We tested the performance of two different text representation techniques. We first built a binary Bag of Words (BoW) model in which the mental keywords were searched in each log; the status of each log is set to one (1) if a mental keyword appears in it and zero (0) otherwise. Table 4 shows the binary BoW representation of the logs of a few respondents, using frequently occurring mental sentiment words.
Table 4.
The output of PD-BoW model representation of logs showing status as 1 in case the mental keyword appears in the logs otherwise status is 0
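A minimal sketch of the binary PD-BoW representation, assuming the keyword set of Table 3 (lower-cased) and a hypothetical log; as described above, each vocabulary position is flagged 1 when the corresponding mental keyword appears in the log.

```python
# Keyword set follows Table 3 (lower-cased); the log is hypothetical.
MENTAL_KEYWORDS = ["depression", "hopeless", "breakdown", "paranoia",
                   "antidepressant", "loneliness", "adhd", "stress",
                   "anxiety", "trauma", "sadness", "bipolar", "ptsd"]

def binary_bow(log, vocabulary=MENTAL_KEYWORDS):
    """Return a 0/1 vector: 1 if a mental keyword occurs in the log, else 0."""
    tokens = set(log.lower().split())
    return [1 if word in tokens else 0 for word in vocabulary]

log = "exam stress and anxiety keeping me up all night"
print(dict(zip(MENTAL_KEYWORDS, binary_bow(log))))
```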
The BoW model represents only the existence of words in the respondents’ logs and does not account for the importance of mental keywords in a log; for example, the word “Hopelessness” in the second-to-last log in Table 4 carries more weight than the rest of the words for measuring the polarity of the log. Thus, to enhance our findings, we employed Term Frequency-Inverse Document Frequency (TF-IDF) scores to build our second model. Here the logs are represented as vectors, and each vector contains a score for each word. The scores, calculated as per Eqs. 1, 2, and 3, reflect how relevant a word is to a log in the collection of logs (corpus) and can be treated as cues or signals of mental state emotions. First, the Term Frequency (TF) score is calculated using Eq. 1, which measures how frequently a word “w” appears in a log “L”:
TF(w, L) = n / (total number of terms in log L)  (1)
where n is the number of times the term “w” appears in the log “L.” Thus, each log and word would have its own TF value. Next, Eq. (2) is used for calculation of IDF score which represents the measure of how relevant a word is to the log in conveying sentiments of mental health issues.
IDF(w) = log(N / Nw), where N is the total number of logs and Nw is the number of logs containing “w”  (2)
Finally, the TF-IDF score of any word in any log can be computed by Eq. 3.
TF-IDF(w, L) = TF(w, L) × IDF(w)  (3)
Compared to BoW, this model does not represent the logs as vectors of 0s and 1s; rather, it assigns more precise values between 0 and 1. The TF-IDF model works well because it gives importance to the uncommon words reflecting mental-issue sentiments, rather than treating all words as equal as the BoW model does.
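The TF-IDF scoring of Eqs. 1–3 can be illustrated on a toy corpus. This is a sketch, not the study’s implementation; production code would typically use scikit-learn’s `TfidfVectorizer`, which applies a smoothed variant of the IDF.

```python
import math

def tf(word, log_tokens):
    # Eq. 1: frequency of the word relative to the total terms in the log
    return log_tokens.count(word) / len(log_tokens)

def idf(word, corpus):
    # Eq. 2: log of (total number of logs / number of logs containing the word)
    n_containing = sum(1 for log in corpus if word in log)
    return math.log(len(corpus) / n_containing)

def tf_idf(word, log_tokens, corpus):
    # Eq. 3: product of the TF and IDF scores
    return tf(word, log_tokens) * idf(word, corpus)

corpus = [["feeling", "anxiety", "today"],
          ["good", "day", "today"],
          ["today", "was", "fine"]]
rare = tf_idf("anxiety", corpus[0], corpus)    # rare word -> high score
common = tf_idf("today", corpus[0], corpus)    # appears everywhere -> 0
print(round(rare, 3), common)
# → 0.366 0.0
```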
We present the TF-IDF values of a few respondents in Table 5. TF-IDF assigns larger values to less frequent words and is high when both the TF and IDF values are higher; i.e., words/sentiments like “adhd,” “anxiety,” and “depressed” have higher values because they are rare across all logs combined but frequent in a distressed user’s logs. We also see that words like “about,” “would,” and “agree” are reduced to 0 and carry little importance.
Table 5.
The output of PD-TF-IDF model representation of logs showing the larger values (highlighted) for the mental keywords
After feature creation for both models, we fitted them with seven classification algorithms chosen for their common capabilities: these methods understand, analyze, and connect data, and draw apt problem-specific conclusions (Rastogi & Singh, 2021; Javaid et al., 2022).
To measure the performance of the classifiers and to compare it with other works in the literature, we used accuracy as one of the performance metrics. Accuracy is the simplest measure: the ratio of correctly predicted cases to total cases. A model may appear best when accuracy is high, but accuracy is a reliable measure only when the dataset is symmetric, i.e., when the numbers of false positives and false negatives are almost the same. So, in addition to accuracy, we used other performance metrics: precision, recall, F1 score, and the receiver operating characteristic (ROC) curve. Precision is the ratio of correctly predicted distressed cases to all cases predicted as distressed; it answers the question: how many of the cases the model identified as distressed are actually distressed? High precision corresponds to a low false positive rate and indicates good model performance. Recall is the ratio of correctly predicted distressed cases to all cases in the actual distressed class; it answers the question: out of all the distressed cases, how many did we properly detect? The F1/F-measure score is the weighted harmonic mean of precision and recall. The ROC curve graphically illustrates the diagnostic capability of a classification algorithm at all classification thresholds. The area under the curve (AUC) measures the degree of separability between the distressed and non-distressed classes; a higher AUC indicates better performance at distinguishing between users at risk and not at risk. The values of these key performance metrics range between 0 and 1, with higher values denoting how well the models predict distressed cases, so a best-fit model can be identified for practical use in real-world scenarios (Hajian-Tilaki, 2013; Xu & Goodacre, 2018).
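These metrics follow directly from the confusion-matrix counts. The sketch below, with hypothetical labels, mirrors what scikit-learn’s accuracy_score, precision_score, recall_score, and f1_score would report for the distressed class.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for the distressed (1) class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

# Hypothetical labels: 1 = distressed, 0 = non-distressed
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(classification_metrics(y_true, y_pred))
# → (0.75, 0.75, 0.75, 0.75)
```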
Experimental Results
We analyzed the logs posted by social networking site users to identify probable users at risk. For this purpose, we used two datasets: (i) our dataset (collected through the structured questionnaire) and (ii) a Twitter dataset (Indian tweets crawled from Twitter). The scrutiny of the textual data for the emotional vocabulary most frequently used by SNS users is presented in Fig. 1. We found that the word “depressed” is used most frequently by the respondents of our dataset, whereas the word “stress” is used most frequently by Twitter users to share their emotions with the virtual world when distressed. Figure 2 illustrates the clouds of affective words for our dataset and the Twitter dataset, where the size of a word depicts its frequency in the logs.
Fig. 1.
Mental sentiments/keywords frequently used by online users to share their feelings with virtual world: a our dataset and b Twitter dataset
Fig. 2.
Affective expression cloud exemplifies the frequently used words by online users to share their feelings with virtual world: a our dataset and b Twitter dataset
Further, on both datasets, two models were trained: one with the Bag of Words representation of logs, called PD-BoW, and another with the Term Frequency-Inverse Document Frequency representation, named PD-TF-IDF. Both models were constructed with the seven popular classifiers. For training and testing, we split both datasets with a ratio of 0.7 (70% of the data for training and 30% for testing). To compare the performance of the models, the resulting accuracies are presented in Table 6. We found that the performance of the models constructed on our dataset is in line with the models based on the Twitter dataset, validating the reliability of our dataset. The SVM classifier (with accuracy ranging from ~ 93 to ~ 99%) outperformed the others on both datasets, closely followed by the LR and DT classifiers. We then compared our results with a few other studies in this field and found that our predictive models outperform those of Tiwari et al. (2021), Joshi et al. (2018), Deshpande and Rao (2017), and Nadeem (2016), as shown in Table 6. However, comparison across studies in this field is difficult because each study not only uses a different dataset with different feature creation and a different set of classifiers but also has a different proportion of samples in each class.
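Under our reading, the training setup can be sketched with scikit-learn as below. The mini-corpus, the classifier choice (a linear SVM standing in for the seven classifiers), and the split seed are illustrative assumptions, not the study’s actual data or code.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Hypothetical mini-corpus; 1 = distressed, 0 = non-distressed
logs = ["feeling hopeless and depressed again", "great trip with family",
        "anxiety and stress all week", "enjoyed the football match",
        "cannot sleep loneliness hurts", "lovely dinner with friends",
        "panic and trauma flashbacks", "excited about the new job"] * 5
labels = [1, 0, 1, 0, 1, 0, 1, 0] * 5

# 70/30 split, as in the study
X_train, X_test, y_train, y_test = train_test_split(
    logs, labels, test_size=0.3, random_state=42, stratify=labels)

vectorizer = TfidfVectorizer()
clf = LinearSVC()
clf.fit(vectorizer.fit_transform(X_train), y_train)
y_pred = clf.predict(vectorizer.transform(X_test))
print(f"accuracy: {accuracy_score(y_test, y_pred):.2f}")
```

Swapping `LinearSVC` for `RandomForestClassifier`, `DecisionTreeClassifier`, `KNeighborsClassifier`, `MLPClassifier`, `MultinomialNB`, or `LogisticRegression` reproduces the rest of the classifier suite.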
Table 6.
Comparative analysis of accuracies between proposed models and few previously proposed models in existing studies (accuracies in percentage)
| Proposed model accuracies | Existing model accuracies* | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| Our dataset | Twitter dataset | |||||||||
| Feature extraction technique | PD-TF-IDF | PD-BOW | 10-fold cross-validation | PD-TF-IDF | PD-BOW | 10-fold cross-validation | Sentiment polarity1 | BOW2 | BOW3 | BOW4 |
| Classification Algorithms | ||||||||||
| SVM | 99.3 | 98.5 | 98.8 | 95.4 | 93.7 | 94.4 | 64.4 | –– | 79.0 | –– |
| RF | 82.0 | 82.0 | 99.4 | 88.2 | 87.8 | 96.6 | 51.3 | 88.0 | –– | –– |
| DT | 95.0 | 96.3 | 99.4 | 93.1 | 93.1 | 92.4 | 92.8 | –– | –– | –– |
| KNN | 85.2 | 76.9 | 61.0 | 80.6 | 72.4 | 64.7 | 76.6 | 86.0 | –– | –– |
| MLP | 94.8 | 96.5 | 99.4 | 86.2 | 90.5 | 92.4 | –– | 83.0 | –– | –– |
| NB | 96.5 | 94.8 | 96.5 | 89.7 | 91.4 | 89.6 | 87.1 | –– | 83.0 | 81.0 |
| LR | 97.7 | 98.0 | 98.5 | 94.1 | 94.6 | 94.6 | –– | 87.0 | –– | –– |
| Sample size | 2500 logs | 2500 logs | 2500 logs | 3149 tweets | 3149 tweets | 3149 tweets | 20,000 tweets | 1.2 M tweets | 10,000 tweets | 2.5 M tweets |
Table 7.
Results of performance metrics-precision, recall, and F1-score for classifiers implemented on TF-IDF and BoW models
| Our dataset | Twitter dataset | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PD-TF-IDF | PD-BOW | PD-TF-IDF | PD-BOW | ||||||||||
| Precision | Recall | F1 | Precision | Recall | F1 | Precision | Recall | F1 | Precision | Recall | F1 | ||
| Classification Algorithms | SVM | 99 | 99 | 99 | 99 | 98 | 98 | 95 | 95 | 95 | 94 | 94 | 94 |
| RF | 88 | 79 | 80 | 88 | 79 | 80 | 90 | 88 | 88 | 90 | 88 | 88 | |
| DT | 96 | 94 | 95 | 97 | 95 | 96 | 94 | 93 | 93 | 94 | 93 | 93 | |
| KNN | 89 | 83 | 84 | 78 | 75 | 75 | 81 | 81 | 81 | 75 | 72 | 72 | |
| MLP | 95 | 95 | 95 | 97 | 96 | 96 | 86 | 86 | 86 | 91 | 91 | 91 | |
| NB | 96 | 97 | 96 | 95 | 95 | 95 | 90 | 90 | 90 | 91 | 91 | 91 | |
| LR | 98 | 97 | 98 | 98 | 98 | 98 | 94 | 94 | 94 | 95 | 95 | 95 | |
Note: (i) Highlighted values show that the SVM classifier outperformed the other classification algorithms
(ii) Precision, recall, and F1 values are in %
We used the 10-fold cross-validation technique to verify our results for each algorithm and present the accuracy outcomes in Table 6. We also used precision, recall, and F-measure as performance indices to evaluate psychological distress estimation and present the results in Table 7. On this set of performance metrics as well, the SVM classifier outperformed the other classifiers in predicting distressed cases. Lastly, we developed and examined the receiver operating characteristic (ROC) curves to illustrate the performance of a binary classifier system over various discrimination thresholds. The good performance of the SVM classifier is also evident from the ROC curves shown in Tables 8 and 9 for our dataset and the Twitter dataset, respectively. We also infer that the overall performance of the PD-TF-IDF model is slightly better in prediction than that of the PD-BoW model, demonstrating that the sentiments and emotions posted by online users hold relevance and can be taken as indicative signals for identifying distressed users.
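An illustrative 10-fold cross-validation run (an assumed setup with a toy corpus, not the study’s data) can be written with scikit-learn’s cross_val_score:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy corpus; 1 = distressed, 0 = non-distressed
logs = ["hopeless and depressed", "happy holiday", "stress and anxiety",
        "fun with friends", "trauma and panic", "beautiful morning"] * 10
labels = [1, 0, 1, 0, 1, 0] * 10

# TF-IDF features feeding a linear SVM, scored over 10 stratified folds
model = make_pipeline(TfidfVectorizer(), LinearSVC())
scores = cross_val_score(model, logs, labels, cv=10, scoring="accuracy")
print(f"mean 10-fold accuracy: {scores.mean():.2f}")
```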
Table 8.
Receiver operating curve (ROC) of classifiers implemented on PD-TF-IDF and PD-BOW representation using our dataset

Table 9.
Receiver operating curve (ROC) of classifiers implemented on PD-TF-IDF and PD-BoW representation using Twitter dataset

Conclusion
The purpose of this study is to demonstrate how a set of personal, emotional, cognitive, and social behavioral markers manifested in social networking site postings can be harnessed to predict distress-indicative logs and thereby comprehend the mental health risk tendencies in the online population. We experimented with a real-world corpus comprising demographic factors, internet usage behavior, factors responsible for social networking site addiction, and depression, stress, and anxiety markers. We built two probabilistic models trained on this corpus to determine whether the vocabulary of logs could indicate a distressed episode. The models leveraged two types of text representation (Bag of Words and Term Frequency-Inverse Document Frequency) to extract relevant emotions from the logs. We employed seven efficient machine learning classifiers, and our models outperformed a few other existing similar models in the literature. The SVM classifier built with the TF-IDF text analysis technique stood out with excellent performance (accuracy ~ 99%, precision ~ 99%, recall ~ 99%, F1 score ~ 99%). This can be attributed to the intricacies that come to light due to all possible co-occurrences of features in our dataset.
This research also provides a better feature processing technique for identifying the various features related to the occurrence of mental distress. Further, our dataset does not suffer from class skewness, and during the training and testing of our models there were neither underfitting nor overfitting issues. For validating the models, we performed 10-fold cross-validation on our dataset.
Based on our crowd-sourced dataset gathered from the general Indian social networking site community, the target variable mental health was found to be significantly associated with age, work experience, salience, relapse, conflict, and SNS addiction. We also found that distressed individuals showed higher levels of depression, anxiety, and stress. Since we collected data through a self-report questionnaire, the results could be affected by common method bias, making it difficult to generalize the outcomes to the broader online population. Despite this limitation, the study highlights the strong psychometric properties of the scales and puts forward a suitable, viable, context-sensitive instrument for the timely identification of individuals at higher risk of developing mental distress. Well-timed intervention could reduce the detrimental consequences related to this modern illness.
Emotion detection from text has attracted considerable attention owing to the key role of sentiment in human-machine interaction, and we provide evidence that the use of vocabulary such as "stress," "depression," and "anxiety" in postings on social networking sites can be indicative of a person at risk of mental distress. Through this study, we conclude that surveillance of affective expressions on social media can be an effective tool for measuring the presence of psychological distress, thus preserving the mental health of the online community. Since we focused on the Indian online population, we further conclude that information gleaned from the social networking site and Twitter postings of Indian users carries the potential to identify risk factors, enable early detection of mental ill-health and rapid treatment, and thus improve the health care support system.
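The keyword-based signal described above can be illustrated with a simple lexicon check. This is only an illustrative sketch (the term list and function are hypothetical), not the authors' actual extraction method:

```python
# Illustrative sketch: flag posts containing distress-related
# vocabulary such as "stress", "depression", and "anxiety".
# DISTRESS_TERMS is an assumed toy lexicon, not the study's.
import re

DISTRESS_TERMS = {"stress", "stressed", "depression", "depressed",
                  "anxiety", "anxious"}

def distress_terms_in(post: str) -> set:
    """Return the distress-vocabulary terms appearing in a post."""
    tokens = set(re.findall(r"[a-z']+", post.lower()))
    return tokens & DISTRESS_TERMS

posts = [
    "exam week, the stress and anxiety are unreal",
    "lovely walk in the park this morning",
]
first = distress_terms_in(posts[0])   # {'stress', 'anxiety'}
second = distress_terms_in(posts[1])  # set()
```

A raw lexicon match like this would only be a first-pass filter; the supervised models in the study weight such terms in context rather than flagging every occurrence.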
Data Collection
Data was collected using a crowdsourcing method, based purely on the willingness of the respondents to share their information for this study. Respondents' personal details were not collected, in order to maintain integrity and meet ethical requirements.
Author Contribution
The manuscript was jointly designed and written by both authors, Anju Singh and Dr. Jaspreet Singh.
The experiment in the study was performed by Anju Singh and reviewed by Dr. Jaspreet Singh. Both authors contributed equally to this study and vouch for the reported results and for protocol adherence during the study. Both authors edited and approved the manuscript.
Data Availability
Data will be made available by the corresponding author upon reasonable request, within the limitations of the informed consent, for further study in this area of research.
Declarations
Ethics Declarations
Informed consent was obtained from all participants included in the study.
Conflict of Interest
The authors declare no competing interests.
Footnotes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Anju Singh and Jaspreet Singh contributed equally to this study.
Contributor Information
Anju Singh, Email: anjusingh40@gmail.com.
Jaspreet Singh, Email: jsarora1@gmail.com.
References
- Amir, S., Coppersmith, G., Carvalho, P., Silva, M. J., & Wallace, B. C. (2017). Quantifying mental health from social media with neural user embeddings. Proceedings of the 2nd Machine Learning for Healthcare Conference, 306–321. https://proceedings.mlr.press/v68/amir17a.html
- Andreassen, C. S., Torsheim, T., Brunborg, G. S., & Pallesen, S. (2012). Development of a Facebook addiction scale. Psychological Reports, 110(2), 501–517. https://doi.org/10.2466/02.09.18.PR0.110.2.501-517
- Becker, D., van Breda, W., Funk, B., Hoogendoorn, M., Ruwaard, J., & Riper, H. (2018). Predictive modeling in e-mental health: A common language framework. Internet Interventions, 12, 57–67. https://doi.org/10.1016/j.invent.2018.03.002
- Coppersmith, G., Dredze, M., & Harman, C. (2014). Quantifying mental health signals in Twitter. Proceedings of the Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, 51–60, Baltimore, Maryland, USA. Association for Computational Linguistics. https://doi.org/10.3115/v1/W14-3207
- Coppersmith, G., Dredze, M., Harman, C., & Hollingshead, K. (2015). From ADHD to SAD: Analyzing the language of mental health on Twitter through self-reported diagnoses. Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, 1–10, Denver, Colorado. Association for Computational Linguistics. https://doi.org/10.3115/v1/W15-1201
- Coppersmith, G., Leary, R., Crutchley, P., & Fine, A. (2018). Natural language processing of social media as screening for suicide risk. Biomedical Informatics Insights, 10, 1178222618792860. https://doi.org/10.1177/1178222618792860
- De Choudhury, M., Counts, S., & Horvitz, E. (2013). Social media as a measurement tool of depression in populations. Proceedings of the 5th Annual ACM Web Science Conference, 47–56, Association for Computing Machinery, NY, USA. https://doi.org/10.1145/2464464.2464480
- Deshpande, M., & Rao, V. (2017). Depression detection using emotion artificial intelligence. 2017 International Conference on Intelligent Sustainable Systems (ICISS), 858–862. https://doi.org/10.1109/ISS1.2017.8389299
- Eichstaedt, J. C., Smith, R. J., Merchant, R. M., Ungar, L. H., Crutchley, P., Preoţiuc-Pietro, D., Asch, D. A., & Schwartz, H. A. (2018). Facebook language predicts depression in medical records. Proceedings of the National Academy of Sciences, 115(44), 11203–11208. https://doi.org/10.1073/pnas.1802331115
- Gkotsis, G., Oellrich, A., Velupillai, S., Liakata, M., Hubbard, T. J. P., Dobson, R. J. B., & Dutta, R. (2017). Characterisation of mental health conditions in social media using informed deep learning. Scientific Reports, 7(1), 45141. https://doi.org/10.1038/srep45141
- Glaz, A. L., Haralambous, Y., Kim-Dufor, D.-H., Lenca, P., Billot, R., Ryan, T. C., Marsh, J., DeVylder, J., Walter, M., Berrouiguet, S., & Lemey, C. (2021). Machine learning and natural language processing in mental health: Systematic review. Journal of Medical Internet Research, 23(5), e15708. https://doi.org/10.2196/15708
- Hajian-Tilaki, K. (2013). Receiver Operating Characteristic (ROC) curve analysis for medical diagnostic test evaluation. Caspian Journal of Internal Medicine, 4(2), 627–635.
- Hassan, A. U., Hussain, J., Hussain, M., Sadiq, M., & Lee, S. (2017). Sentiment analysis of social networking sites (SNS) data using machine learning approach for the measurement of depression. 2017 International Conference on Information and Communication Technology Convergence (ICTC), 138–140. https://doi.org/10.1109/ICTC.2017.8190959
- Javaid, M., Haleem, A., Singh, R. P., Suman, R., & Rab, S. (2022). Significance of machine learning in healthcare: Features, pillars and applications. International Journal of Intelligent Networks, 3, 58–73. https://doi.org/10.1016/j.ijin.2022.05.002
- Joshi, D. J., Makhija, M., Nabar, Y., Nehete, N., & Patwardhan, M. S. (2018). Mental health analysis using deep learning for feature extraction. Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, 356–359, Association for Computing Machinery, NY, USA. https://doi.org/10.1145/3152494.3167990
- Keles, B., McCrae, N., & Grealish, A. (2020). A systematic review: The influence of social media on depression, anxiety and psychological distress in adolescents. International Journal of Adolescence and Youth, 25(1), 79–93. https://doi.org/10.1080/02673843.2019.1590851
- Kholifah, B., Syarif, I., & Badriyah, T. (2020). Mental disorder detection via social media mining using deep learning. Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control, 5(4), 309–316. https://doi.org/10.22219/kinetik.v5i4.1120
- Lin, H., Jia, J., Guo, Q., Xue, Y., Li, Q., Huang, J., Cai, L., & Feng, L. (2014). User-level psychological stress detection from social media using deep neural network. MM 2014 - Proceedings of the 2014 ACM Conference on Multimedia, Association for Computing Machinery, NY, USA, 507–516. https://doi.org/10.1145/2647868.2654945
- Lovibond, P. F., & Lovibond, S. H. (1995). The structure of negative emotional states: Comparison of the Depression Anxiety Stress Scales (DASS) with the Beck depression and anxiety inventories. Behaviour Research and Therapy, 33(3), 335–343. https://doi.org/10.1016/0005-7967(94)00075-u
- Martins, R., Almeida, J., Henriques, P., & Novais, P. (2021). Identifying depression clues using emotions and AI. Proceedings of the 13th International Conference on Agents and Artificial Intelligence, 2, 1137–1143. https://doi.org/10.5220/0010332811371143
- Nadeem, M. (2016). Identifying depression on Twitter. arXiv:1607.07384 [cs, stat]. http://arxiv.org/abs/1607.07384
- Park, M., Cha, C., & Cha, M. (2012). Depressive moods of users portrayed in Twitter. Proceedings of the ACM SIGKDD Workshop on Healthcare Informatics (HI-KDD), 1–8. ACM, New York, NY.
- Prout, T. A., Zilcha-Mano, S., Aafjes-van Doorn, K., Békés, V., Christman-Cohen, I., Whistler, K., Kui, T., & Di Giuseppe, M. (2020). Identifying predictors of psychological distress during COVID-19: A machine learning approach. Frontiers in Psychology, 11. https://doi.org/10.3389/fpsyg.2020.586202
- Rastogi, S., & Singh, J. (2021). A systematic review on machine learning for fall detection system. Computational Intelligence, 37, 951–974. https://doi.org/10.1111/coin.12441
- Reece, A. G., Reagan, A. J., Lix, K. L. M., Dodds, P. S., Danforth, C. M., & Langer, E. J. (2017). Forecasting the onset and course of mental illness with Twitter data. Scientific Reports, 7(1), 13006. https://doi.org/10.1038/s41598-017-12961-9
- Shen, G., Jia, J., Nie, L., Feng, F., Zhang, C., Hu, T., Chua, T.-S., & Zhu, W. (2017). Depression detection via harvesting social media: A multimodal dictionary learning solution. Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI 2017, 3838–3844. https://doi.org/10.24963/ijcai.2017/536
- Tiwari, P. K., Sharma, M., Garg, P., Jain, T., Verma, V. K., & Hussain, A. (2021). A study on sentiment analysis of mental illness using machine learning techniques. IOP Conference Series: Materials Science and Engineering, 1099(1), 12043. https://doi.org/10.1088/1757-899X/1099/1/012043
- Viertiö, S., Kiviruusu, O., Piirtola, M., Kaprio, J., Korhonen, T., Marttunen, M., & Suvisaari, J. (2021). Factors contributing to psychological distress in the working population, with a special reference to gender difference. BMC Public Health, 21(1), 611. https://doi.org/10.1186/s12889-021-10560-y
- Wongkoblap, A., Vadillo, M. A., & Curcin, V. (2018). A multilevel predictive model for detecting social network users with depression. 2018 IEEE International Conference on Healthcare Informatics (ICHI), 130–135. https://doi.org/10.1109/ICHI.2018.00022
- Wongkoblap, A., Vadillo, M. A., & Curcin, V. (2021). Deep learning with anaphora resolution for the detection of tweeters with depression: Algorithm development and validation study. JMIR Mental Health, 8(8), e19824. https://doi.org/10.2196/19824
- Xu, Y., & Goodacre, R. (2018). On splitting training and validation set: A comparative study of cross-validation, bootstrap and systematic sampling for estimating the generalization performance of supervised learning. Journal of Analysis and Testing, 2(3), 249–262. https://doi.org/10.1007/s41664-018-0068-2