Table 6. Binary classification scores using all posts of 70 depressed users and 3 of their matched control users^a.
| Model | Positive predictive value, mean (SD) | Sensitivity, mean (SD) | F1-score, mean (SD) |
| --- | --- | --- | --- |
| SVM^b using TF-IDF^c | 0.800 (N/A^d) | 0.086 (N/A) | 0.153 (N/A) |
| SVM using word embeddings | 0.411 (N/A) | 0.529 (N/A) | 0.459 (N/A) |
| SVM using TF-IDF and word embeddings | 0.800 (N/A) | 0.057 (N/A) | 0.107 (N/A) |
| BERT^e LM^f | 0.653 (0.033) | 0.481 (0.022) | 0.546 (0.025) |
| ALBERT^g LM | 0.652 (0.034) | 0.476 (0.009) | 0.547 (0.018) |
| BioBERT LM | 0.654 (0.028) | 0.410 (0.030) | 0.496 (0.020) |
| Longformer LM | 0.653 (0.036) | 0.476 (0.036) | 0.534 (0.031) |
| MentalBERT LM | 0.657 (0.034) | 0.509 (0.008) | 0.562 (0.016) |
| MentalRoBERTa LM | 0.614 (0.023) | 0.471 (0.015) | 0.522 (0.002) |
| Naive baseline (all depression) | 0.250 (N/A) | 1.000 (N/A) | 0.167 (N/A) |
^a Language model experiments were run 3 times each; therefore, both mean and SD scores are provided.
^b SVM: support vector machine.
^c TF-IDF: term frequency–inverse document frequency.
^d N/A: not applicable.
^e BERT: Bidirectional Encoder Representations from Transformers.
^f LM: language model.
^g ALBERT: A Lite Bidirectional Encoder Representations from Transformers.
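For reference, the F1-score in this table is the harmonic mean of positive predictive value (precision) and sensitivity (recall). A minimal sketch of the relationship (the helper name `f1_score` is illustrative, not from the paper); note that for the language-model rows the reported mean F1 is presumably an average of per-run F1 values, so it need not exactly equal the harmonic mean of the reported mean PPV and sensitivity:

```python
def f1_score(ppv: float, sensitivity: float) -> float:
    """Harmonic mean of positive predictive value and sensitivity."""
    if ppv + sensitivity == 0:
        return 0.0
    return 2 * ppv * sensitivity / (ppv + sensitivity)

# Example: the SVM + TF-IDF row (PPV 0.800, sensitivity 0.086) yields an F1
# close to the reported 0.153; small differences come from rounding of the
# reported inputs and, for the language models, from averaging F1 over runs.
print(f"{f1_score(0.800, 0.086):.3f}")
```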