2023 Mar 24;2:e41205. doi: 10.2196/41205

Table 5.

Binary classification scores using all posts of 70 depressed users and 1 of their matched control usersᵃ.

| Model | Positive predictive value, mean (SD) | Sensitivity, mean (SD) | F1-score, mean (SD) |
|---|---|---|---|
| SVMᵇ using TF-IDFᶜ | 0.637 (N/Aᵈ) | 0.557 (N/A) | 0.590 (N/A) |
| SVM using word embeddings | 0.558 (N/A) | 0.543 (N/A) | 0.548 (N/A) |
| SVM using TF-IDF and word embeddings | 0.673 (N/A) | 0.557 (N/A) | 0.596 (N/A) |
| BERTᵉ LMᶠ | 0.638 (0.021) | 0.805 (0.022) | 0.709 (0.012) |
| ALBERTᵍ LM | 0.606 (0.008) | 0.786 (0.015) | 0.683 (0.010) |
| BioBERT LM | 0.601 (0.005) | 0.862 (0.022) | 0.707 (0.005) |
| Longformer LM | 0.633 (0.009) | 0.838 (0.036) | 0.719 (0.018) |
| MentalBERT LM | 0.660 (0.019) | 0.848 (0.008) | 0.738 (0.013) |
| MentalRoBERTa LM | 0.629 (0.002) | 0.819 (0.022) | 0.709 (0.006) |
| Naive baseline: all depression | 0.500 (N/A) | 1.000 (N/A) | 0.667 (N/A) |

ᵃ Language model experiments were run 3 times each; therefore, both mean and SD scores are provided.

ᵇ SVM: support vector machine.

ᶜ TF-IDF: term frequency–inverse document frequency.

ᵈ N/A: not applicable.

ᵉ BERT: Bidirectional Encoder Representations from Transformers.

ᶠ LM: language model.

ᵍ ALBERT: A Lite Bidirectional Encoder Representations from Transformers.
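
As a reading aid for the three score columns, the sketch below shows how an F1-score follows from a positive predictive value (precision) and a sensitivity (recall) as their harmonic mean. The function name `f1_score` is hypothetical and not taken from the article. Only the naive baseline row is checked here; the other rows are reported as means, so applying the formula to their mean positive predictive value and mean sensitivity would not necessarily reproduce the reported mean F1-score.

```python
def f1_score(ppv: float, sensitivity: float) -> float:
    """Harmonic mean of positive predictive value and sensitivity.

    Illustrative helper (name is an assumption, not from the article).
    Returns 0.0 when both inputs are 0 to avoid division by zero.
    """
    if ppv + sensitivity == 0:
        return 0.0
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Naive baseline row from Table 5: PPV 0.500, sensitivity 1.000.
baseline_f1 = f1_score(0.500, 1.000)
print(round(baseline_f1, 3))  # 0.667
```

The result agrees with the 0.667 reported for the naive baseline, which gives a reference point for judging the F1-scores of the other models.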