Table 8.
Binary classification scores using all posts of 70 depressed users and 10 of their matched control users<sup>a</sup>.
| Model | Positive predictive value, mean (SD) | Sensitivity, mean (SD) | F1-score, mean (SD) |
| --- | --- | --- | --- |
| SVM<sup>b</sup> using TF-IDF<sup>c</sup> | 0.000 (N/A<sup>d</sup>) | 0.000 (N/A) | 0.000 (N/A) |
| SVM using word embeddings | 0.212 (N/A) | 0.371 (N/A) | 0.268 (N/A) |
| SVM using TF-IDF and word embeddings | 0.000 (N/A) | 0.000 (N/A) | 0.000 (N/A) |
| BERT<sup>e</sup> LM<sup>f</sup> | 0.100 (0.000) | 0.014 (0.000) | 0.025 (0.00) |
| ALBERT<sup>g</sup> LM | 0.089 (0.019) | 0.014 (0.000) | 0.025 (0.001) |
| BioBERT LM | 0.067 (0.115) | 0.005 (0.008) | 0.009 (0.016) |
| Longformer LM | 0.024 (0.019) | 0.019 (0.033) | 0.021 (0.037) |
| MentalBERT LM | 0.167 (0.058) | 0.014 (0.000) | 0.026 (0.001) |
| MentalRoBERTa LM | 0.272 (0.185) | 0.034 (0.008) | 0.057 (0.018) |
| Naive baseline—all depression | 0.091 (N/A) | 1.000 (N/A) | 0.167 (N/A) |
<sup>a</sup>Language model experiments were run 3 times each; therefore, both mean and SD scores are provided.
<sup>b</sup>SVM: support vector machine.
<sup>c</sup>TF-IDF: term frequency–inverse document frequency.
<sup>d</sup>N/A: not applicable.
<sup>e</sup>BERT: Bidirectional Encoder Representations from Transformers.
<sup>f</sup>LM: language model.
<sup>g</sup>ALBERT: A Lite Bidirectional Encoder Representations from Transformers.
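
The naive baseline row can be checked by hand, and the short sketch below does that arithmetic. It assumes that the caption means each of the 70 depressed users has 10 matched controls (700 control users in total), which is an interpretation rather than something the table states outright. The function name `precision_recall_f1` and the zero-division convention (return 0 when a denominator is 0) are illustrative choices, not taken from the study.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (positive predictive value, sensitivity, F1) from raw counts.

    Assumed convention: any metric whose denominator is 0 is reported as 0.0.
    """
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = ppv + sensitivity
    f1 = 2 * ppv * sensitivity / denom if denom else 0.0
    return ppv, sensitivity, f1


# Assumption: 70 depressed users, each with 10 matched controls.
n_depressed = 70
n_control = 70 * 10

# "All depression" baseline: every user is predicted to be depressed,
# so all depressed users are true positives and all controls are false positives.
ppv, sens, f1 = precision_recall_f1(tp=n_depressed, fp=n_control, fn=0)
print(round(ppv, 3), round(sens, 3), round(f1, 3))  # prints: 0.091 1.0 0.167
```

Under this assumption the output agrees with the naive baseline row (0.091, 1.000, 0.167). With the same convention, a classifier that never predicts depression (`tp=0, fp=0, fn=70`) would score 0 on all three metrics; this is consistent with the 0.000 rows but does not establish that this is what happened in those experiments.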