Table 5.
Binary classification scores using all posts of 70 depressed users and 1 matched control user for each^a.
| Model | Positive predictive value, mean (SD) | Sensitivity, mean (SD) | F1-score, mean (SD) |
| SVM^b using TF-IDF^c | 0.637 (N/A^d) | 0.557 (N/A) | 0.590 (N/A) |
| SVM using word embeddings | 0.558 (N/A) | 0.543 (N/A) | 0.548 (N/A) |
| SVM using TF-IDF and word embeddings | 0.673 (N/A) | 0.557 (N/A) | 0.596 (N/A) |
| BERT^e LM^f | 0.638 (0.021) | 0.805 (0.022) | 0.709 (0.012) |
| ALBERT^g LM | 0.606 (0.008) | 0.786 (0.015) | 0.683 (0.010) |
| BioBERT LM | 0.601 (0.005) | 0.862 (0.022) | 0.707 (0.005) |
| Longformer LM | 0.633 (0.009) | 0.838 (0.036) | 0.719 (0.018) |
| MentalBERT LM | 0.660 (0.019) | 0.848 (0.008) | 0.738 (0.013) |
| MentalRoBERTa LM | 0.629 (0.002) | 0.819 (0.022) | 0.709 (0.006) |
| Naive baseline (all users classified as depressed) | 0.500 (N/A) | 1.000 (N/A) | 0.667 (N/A) |
^a Language model experiments were each run 3 times; therefore, both mean and SD are reported.
^b SVM: support vector machine.
^c TF-IDF: term frequency–inverse document frequency.
^d N/A: not applicable.
^e BERT: Bidirectional Encoder Representations from Transformers.
^f LM: language model.
^g ALBERT: A Lite Bidirectional Encoder Representations from Transformers.
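
As a reading aid for the naive baseline row, the following minimal sketch shows how its three scores follow from the standard definitions of positive predictive value, sensitivity, and F1-score. It assumes a balanced evaluation set with equal numbers of depressed and control users (the counts of 70 and 70 are taken from the caption, and only their ratio matters here); the helper name `precision_recall_f1` is hypothetical and not from the original work. Note that for the other rows, the mean F1-score over several runs need not equal the F1-score computed from the mean positive predictive value and mean sensitivity.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (PPV, sensitivity, F1) from confusion-matrix counts.

    Hypothetical helper for illustration; not taken from the original study.
    """
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = ppv + sensitivity
    f1 = 2 * ppv * sensitivity / denom if denom else 0.0
    return ppv, sensitivity, f1


# Assumption: a balanced set of 70 depressed users and 70 matched controls.
n_depressed, n_control = 70, 70

# The naive baseline labels every user as depressed, so all depressed users
# are true positives, all controls are false positives, and none are missed.
ppv, sens, f1 = precision_recall_f1(tp=n_depressed, fp=n_control, fn=0)

print(f"PPV={ppv:.3f} sensitivity={sens:.3f} F1={f1:.3f}")
# prints: PPV=0.500 sensitivity=1.000 F1=0.667
```

Because this baseline reaches an F1-score of 0.667 without using any information from the posts, a model's F1-score is most informative when read against that value rather than against 0.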