Table 2.
Summary of machine learning studies of detection of depression using text data from social media.
| Reference | Sample | Platform | Outcome | Depression identification method | ML<sup>a</sup> approach type | Features examined | Cross-validation | Type of study |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Wang et al [21], 2013 | 122 depressed and 346 nondepressed subjects; ages not reported | Sina microblog | Bayes: mean absolute error=0.186, ROC<sup>b</sup>=0.908, F-measure=0.85; Trees: mean absolute error=0.239, ROC=0.798, F-measure=0.762; Rules: mean absolute error=0.269, ROC=0.869, F-measure=0.812 | Researcher-inferred | 3 classification approaches: Bayes, trees, and rules | Ten features from three dimensions: microblog content, interactions, and behaviors. Four of the ten features (1st person singular, 1st person plural, positive emoticons, and negative emoticons) pertain to microblog content, three pertain to interactions (mentioning, being forwarded, and commenting), and two pertain to behaviors (original blogs and blogs posted between midnight and 6:00 am). | 10-fold cross-validation | Observational cohort study |
| Burdisso et al [14], 2019 | 486 training subjects (83 depressed/403 nondepressed); 401 test subjects (52 depressed/349 nondepressed); ages not reported | Reddit | SS3<sup>c</sup>: F-measure=0.61, precision=0.63, recall=0.60 | User-declared | The proposed model: SS3 | Words in users’ online text posts on Reddit | 4-fold cross-validation | Observational cohort study |
| Nguyen et al [22], 2014 | 5000 posts made by users from clinical communities and 5000 posts from control communities; ages not reported | LiveJournal | Lasso to classify communities (Accuracy): ANEW<sup>d</sup>=0.89, mood=0.96, topic=1, LIWC<sup>e</sup>=1; Lasso to classify posts (Accuracy): topic=0.93, LIWC=0.88 | Community membership-based | The Lasso model | Affective features, mood tags, topics, and LIWC features, all extracted from posts on LiveJournal. | 10-fold cross-validation | Observational cohort study |
| Fatima et al [23], 2018 | 4026 posts (2019/2007) from depressive and non-depressive communities; ages not reported | LiveJournal | The proposed RF<sup>f</sup>-based model (Accuracy): post=0.898, community=0.950, depression degree=0.923; SVM<sup>g</sup> (Accuracy): post=0.8, community=0.895 | Community membership-based | Random forest, SVM | Feature values used as inputs to the classification algorithm, extracted from the following categories of online text: first person singular, positive emotion, negative emotion, anxiety, cognitive process, insight, cause, affiliation, health, and informal language. | 10-fold cross-validation | Observational cohort study |
| Tung & Lu [15], 2016 | 724 posts; ages not reported | PTT<sup>h</sup> | EDDTW<sup>i</sup>: precision=0.593, recall=0.668, F-measure=0.624 | Researcher-inferred | EDDTW | Negative emotion lexicon, negative thought lexicon, negative event lexicon, and symptom lexicon. | 10-fold cross-validation | Observational cohort study |
| Husseini Orabi et al [24], 2018 | 154 subjects (53 labeled as Depressed/101 labeled as Control); ages not reported | Twitter | The optimized CNN<sup>j</sup> model: accuracy=0.880 | User-declared | CNN-based models, RNN<sup>k</sup>-based models, SVM | Twitter texts from which all @mentions, retweets, nonalphanumeric characters, and URLs were extracted by the researchers. | 5-fold cross-validation | Observational cohort study |
| Islam et al [19], 2018 | 7145 Facebook comments (58% depressed/42% nondepressed); ages not reported | Facebook | Decision Tree (F-measure): emotional process=0.73, linguistic style=0.73, temporal process=0.73, all features=0.73; SVM (F-measure): emotional process=0.73, linguistic style=0.73, temporal process=0.73, all features=0.73; KNN<sup>l</sup> (F-measure): emotional process=0.71, linguistic style=0.70, temporal process=0.70, all features=0.67; Ensemble (F-measure): emotional process=0.73, linguistic style=0.73, temporal process=0.73, all features=0.73 | User-declared | SVM, decision tree, ensemble, KNN | Emotional information (positive, negative, anxiety, anger, and sad), linguistic style (prepositions, articles, personal, conjunctions, auxiliary verbs), temporal process information (past, present, and future) | 10-fold cross-validation | Observational cohort study |
| Shen et al [6], 2017 | 1402 depressed users, 36,993 depression-candidate users, and over 300 million nondepressed users; ages not reported | Twitter | Accuracy: NB<sup>m</sup>=0.73, MSNL<sup>n</sup>=0.83, WDL<sup>o</sup>=0.77, MDL<sup>p</sup>=0.85 | User-declared | MDL, NB, MSNL, WDL | Features of network interactions (number of tweets, social interactions, and posting behaviors), user profiles (users’ personal information in social networks), and visual, emotional, topic-level, and domain-specific features | 5-fold cross-validation | Observational cohort study |
| De Choudhury, Gamon, et al [25], 2013 | 476 users (171 depressed/305 nondepressed), with a median age of 25 | | Accuracy: engagement=0.553, ego-network=0.612, emotion=0.643, linguistic style=0.684, depression language=0.692, demographics=0.513, all features=0.712 | Researcher-inferred | SVM | Engagement, egocentric social graph, emotion, linguistic style, depression language, demographics | 10-fold cross-validation | Observational cohort study |
| Mariñelarena-dondena et al [26], 2017 | 135 articles (20 depressed/115 nondepressed); ages not reported | | Precision=0.850, recall=0.810, F-measure=0.829, accuracy=0.948 | User-declared | SVD, GBM<sup>q</sup>, SMOTE<sup>r</sup> | n-grams, use of which can create a large feature space and hold much important information | Not reported | Observational cohort study |
| Tsugawa et al [27], 2015 | 209 Japanese users (81 depressed/128 nondepressed) aged 16-55 years, with a median age of 28.8 years | Twitter | Precision=0.61, recall=0.37, F-measure=0.46, accuracy=0.66 | Researcher-inferred | LDA<sup>s</sup>, SVM | Frequencies of words used in the tweet, ratio of tweet topics found by LDA, ratio of positive-affect words contained in the tweet, ratio of negative-affect words contained in the tweet, hourly posting frequency, tweets per day, average number of words per tweet, overall retweet rate, overall mention rate, ratio of tweets containing a URL, number of users following, number of users followed | 10-fold cross-validation | Observational cohort study |
| Chen et al [28], 2018 | 446 perinatal users; ages not reported | WeChat circle of friends | The LSTM<sup>w</sup> result was similar to that of the EPDS<sup>x</sup> | Researcher-inferred | LSTM | Top 10 emotions in the data set | Not reported | Observational cohort study |
| De Choudhury, Counts, et al [29], 2013 | 489 users, with a median age of 25 years | | Accuracy: eng.+ego=0.593, n-grams=0.600, style=0.658, emo.+time=0.686, all features=0.701 | Researcher-inferred | PCA<sup>t</sup>, SVM | Postcentric features (emotion, time, linguistic style, n-grams), user-centric features (engagement, ego-network) | 5-fold cross-validation | Observational cohort study |
| Dinkel et al [30], 2019 | 142 speakers (42 depressed/100 nondepressed); ages not reported | Distress Analysis Interview Corpus-Wizard of Oz (WOZ-DAIC) database | Precision=0.93, recall=0.83, F-measure=0.87 | Researcher-inferred | LSTM | Words from online posts | 10-fold cross-validation | Observational cohort study |
| Sadeque et al [7], 2017 | 888 users (136 depressed/752 nondepressed); ages not reported | | F-measure: LibSVM<sup>u</sup>=0.40, WekaSVM<sup>v</sup>=0.30, RNN=0.34, Ensemble=0.45 | User-declared | LibSVM, RNN, Ensemble, WekaSVM | Depression lexicon<sup>y</sup>, Metamap features<sup>z</sup> | 5-fold cross-validation | Observational cohort study |
| Shatte et al [31], 2020 | 365 fathers in the perinatal period; ages not reported | | Precision=0.67, recall=0.68, F-measure=0.67, accuracy=0.66 | Researcher-inferred | SVM | Fathers’ behaviors, emotions, linguistic style, and discussion topics | 10-fold cross-validation | Observational cohort study |
| Li et al [32], 2020 | 1,410,651 users; ages not reported | Twitter | Accuracy: SVM (radial basis function kernel)=0.82, SVM (linear kernel)=0.87, logistic regression=0.86, naïve Bayes=0.81, simple neural network=0.87 | Researcher-inferred | SVM, logistic regression, naïve Bayes classifier, simple neural network | 512 features that were extracted from tweets using a universal sentence encoder | Not reported | Observational cohort study |
<sup>a</sup>ML: machine learning.
<sup>b</sup>ROC: receiver operating characteristic.
<sup>c</sup>SS3: sequential S3 (smoothness, significance, and sanction).
<sup>d</sup>ANEW: affective norms for English words.
<sup>e</sup>LIWC: linguistic inquiry and word count.
<sup>f</sup>RF: random forest.
<sup>g</sup>SVM: support vector machine.
<sup>h</sup>PTT: the gossip forum on the Professional Technology Temple.
<sup>i</sup>EDDTW: event-driven depression tendency warning.
<sup>j</sup>CNN: convolutional neural networks.
<sup>k</sup>RNN: recurrent neural network.
<sup>l</sup>KNN: k-nearest neighbor.
<sup>m</sup>NB: naive Bayesian.
<sup>n</sup>MSNL: multiple social networking learning.
<sup>o</sup>WDL: Wasserstein Dictionary Learning.
<sup>p</sup>MDL: multimodal depressive dictionary learning.
<sup>q</sup>GBM: gradient boosting machine.
<sup>r</sup>SMOTE: synthetic minority oversampling technique.
<sup>s</sup>LDA: latent Dirichlet allocation.
<sup>t</sup>PCA: principal component analysis.
<sup>u</sup>LibSVM: library for support vector machines.
<sup>v</sup>WekaSVM: Waikato Environment for Knowledge Analysis for support vector machines.
<sup>w</sup>LSTM: long short-term memory.
<sup>x</sup>EPDS: Edinburgh Postnatal Depression Scale.
<sup>y</sup>A cluster of unigrams that has a great likelihood of appearing in depression-related posts.
<sup>z</sup>The features were extracted using Metamap based on concepts from the Unified Medical Language System Metathesaurus.
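
Several rows of Table 2 report precision, recall, and F-measure together. As a reading aid, the sketch below recomputes the F-measure from the other two values, assuming each study reports the balanced F-measure (F1, the harmonic mean of precision and recall). That assumption is not confirmed by the table: studies that average per class or weight recall differently can report values that differ from this recomputation. The function name `f_measure` and the dictionary `reported` are illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the study uses the balanced (beta = 1) definition.
    """
    if precision + recall == 0:
        return 0.0  # convention used here when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from two rows of Table 2
reported = {
    "Burdisso et al [14]": (0.63, 0.60),
    "Tsugawa et al [27]": (0.61, 0.37),
}

for study, (p, r) in reported.items():
    print(f"{study}: F1 = {f_measure(p, r):.2f}")
# Burdisso et al [14]: F1 = 0.61
# Tsugawa et al [27]: F1 = 0.46
```

For these two rows the recomputed values agree with the F-measures listed in the table (0.61 and 0.46).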