Table 2.
User-level performance (%) using different features.
Features^a | Accuracy | F1 | AUC^b |
VADER^c | 54.9 | 61.7 | 54.6 |
Demographics | 58.7 | 56.0 | 61.4 |
Engagement | 58.7 | 62.3 | 61.7 |
Personality | 64.8 | 67.8 | 72.4 |
LIWC^d | 70.6 | 70.8 | 76.0 |
V + D + E + P + L^e | 71.5 | 72.0 | 78.3 |
XLNet | 78.1 | 77.9 | 84.9 |
All (random forest) | 78.4 | 78.1 | 84.9 |
All (logistic regression) | 78.3 | 78.5 | *86.4*^f |
All (SVM^g) | *78.9* | *79.2* | 86.1 |
^a We used SVM for classifying individual features.
^b AUC: area under the receiver operating characteristic curve.
^c VADER: Valence Aware Dictionary and Sentiment Reasoner.
^d LIWC: Linguistic Inquiry and Word Count.
^e V + D + E + P + L: VADER + demographics + engagement + personality + LIWC.
^f Italics indicate the best performing model in each column.
^g SVM: support vector machine.