Author manuscript; available in PMC: 2025 Feb 8.
Published in final edited form as: Proc ACM Interact Mob Wearable Ubiquitous Technol. 2024 Mar 6;8(1):31. doi: 10.1145/3643540

Table 9.

Balanced Accuracy Performance Summary on Three External Datasets. These datasets come from diverse social media platforms. For each column, the best result is bolded, and the second best is underlined.

| Category | Model | Red-Sam (Task #2) | Twt-60Users (Task #2) | SAD (Task #1) |
|---|---|---|---|---|
| Zero-shot Prompting | Alpaca (ZS_best) | 0.527±0.006 | 0.569±0.017 | 0.557±0.041 |
| | Alpaca-LoRA (ZS_best) | 0.577±0.004 | 0.649±0.021 | 0.477±0.016 |
| | FLAN-T5 (ZS_best) | 0.563±0.029 | 0.613±0.046 | 0.767±0.050 |
| | LLaMA2 (ZS_best) | 0.574±0.008 | 0.736±0.019 | 0.704±0.026 |
| | GPT-3.5 (ZS_best) | 0.506±0.004 | 0.571±0.000 | 0.750±0.027 |
| | GPT-4 (ZS_best) | 0.511±0.000 | 0.566±0.017 | 0.854±0.006 |
| Instructional Finetuning | Mental-Alpaca | 0.604±0.012 | 0.718±0.011 | 0.819±0.006 |
| | Δ Alpaca (FT_vs_ZS) | ↑ +0.077 | ↑ +0.149 | ↑ +0.262 |
| | Mental-FLAN-T5 | 0.582±0.002 | 0.736±0.003 | 0.779±0.002 |
| | Δ FLAN-T5 (FT_vs_ZS) | ↑ +0.019 | ↑ +0.123 | ↑ +0.012 |
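
The Δ rows appear to be the finetuned model's mean balanced accuracy minus the best zero-shot mean of the same base model on each dataset. The following is a minimal sketch that reproduces that arithmetic from the means reported in Table 9. The dictionary layout and the function name `finetuning_gain` are illustrative assumptions, not the authors' code, and standard deviations are ignored.

```python
# Mean balanced accuracy copied from Table 9 (standard deviations omitted).
# Column order: Red-Sam, Twt-60Users, SAD.
DATASETS = ("Red-Sam", "Twt-60Users", "SAD")

# Best zero-shot result per base model.
zero_shot = {
    "Alpaca": (0.527, 0.569, 0.557),
    "FLAN-T5": (0.563, 0.613, 0.767),
}

# Instruction-finetuned counterparts of the same base models.
finetuned = {
    "Alpaca": (0.604, 0.718, 0.819),   # Mental-Alpaca
    "FLAN-T5": (0.582, 0.736, 0.779),  # Mental-FLAN-T5
}


def finetuning_gain(base_model):
    """Return finetuned-minus-zero-shot balanced accuracy per dataset.

    Hypothetical helper for illustration; rounds to three decimals
    to match the precision used in the table.
    """
    pairs = zip(finetuned[base_model], zero_shot[base_model])
    return tuple(round(ft - zs, 3) for ft, zs in pairs)


deltas = {name: finetuning_gain(name) for name in zero_shot}

for name, gains in deltas.items():
    cells = ", ".join(f"{d}: {g:+.3f}" for d, g in zip(DATASETS, gains))
    print(f"Δ {name} (FT vs. ZS): {cells}")
```

Under these assumptions the computed differences match the six Δ values in the table, which suggests the Δ rows compare each finetuned model with its own base model's best zero-shot score rather than with the best zero-shot model overall.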