Author manuscript; available in PMC: 2025 Feb 8.
Published in final edited form as: Proc ACM Interact Mob Wearable Ubiquitous Technol. 2024 Mar 6;8(1):31. doi: 10.1145/3643540

Table 5.

Balanced Accuracy Performance Change using Enhancement Strategies.

| Model | Task #1 (Dreaddit) | Task #2 (DepSeverity) | Task #3 (DepSeverity) | Task #4 (SDCNL) | Task #5 (CSSRS-Suicide) | Task #6 (CSSRS-Suicide) | Mean Δ (All Six Tasks) |
|---|---|---|---|---|---|---|---|
| ΔAlpaca ZS_context | ↑ +0.019 | ↑ +0.045 | ↑ +0.023 | ↑ +0.004 | ↑ +0.014 | ↑ +0.018 | ↑ +0.021 |
| ΔAlpaca ZS_mh | ↑ +0.000 | ↑ +0.055 | ↑ +0.013 | ↓ −0.011 | ↑ +0.006 | ↑ +0.004 | ↑ +0.011 |
| ΔAlpaca ZS_both | ↓ −0.053 | ↑ +0.037 | ↓ −0.010 | ↑ +0.039 | ↓ −0.007 | ↓ −0.010 | ↓ −0.001 |
| ΔAlpaca-LoRA ZS_context | ↓ −0.035 | ↓ −0.047 | ↓ −0.094 | ↓ −0.030 | ↑ +0.027 | ↑ +0.027 | ↓ −0.025 |
| ΔAlpaca-LoRA ZS_mh | ↓ −0.071 | ↓ −0.047 | ↓ −0.105 | ↓ −0.005 | ↑ +0.017 | ↑ +0.029 | ↓ −0.031 |
| ΔAlpaca-LoRA ZS_both | ↓ −0.071 | ↓ −0.048 | ↓ −0.051 | ↓ −0.003 | ↓ −0.023 | ↑ +0.037 | ↓ −0.027 |
| ΔFLAN-T5 ZS_context | ↑ +0.004 | ↑ +0.011 | ↓ −0.018 | ↑ +0.010 | ↓ −0.018 | ↓ −0.040 | ↓ −0.009 |
| ΔFLAN-T5 ZS_mh | ↓ −0.043 | ↑ +0.003 | ↓ −0.030 | ↑ +0.005 | ↓ −0.013 | ↓ −0.046 | ↓ −0.021 |
| ΔFLAN-T5 ZS_both | ↓ −0.055 | ↓ −0.003 | ↓ −0.007 | ↑ +0.002 | ↓ −0.010 | ↓ −0.036 | ↓ −0.018 |
| ΔLLaMA2 ZS_context | ↓ −0.062 | ↑ +0.014 | ↓ −0.019 | ↑ +0.000 | ↑ +0.031 | ↑ +0.106 | ↑ +0.012 |
| ΔLLaMA2 ZS_mh | ↓ −0.102 | ↑ +0.018 | ↓ −0.033 | ↑ +0.053 | ↑ +0.004 | ↑ +0.031 | ↓ −0.005 |
| ΔLLaMA2 ZS_both | ↓ −0.136 | ↑ +0.011 | ↑ +0.016 | ↑ +0.054 | ↓ −0.002 | ↑ +0.067 | ↑ +0.002 |
| ΔGPT-3.5 ZS_context | ↑ +0.003 | ↑ +0.011 | ↓ −0.060 | ↑ +0.157 | ↑ +0.007 | ↑ +0.031 | ↑ +0.025 |
| ΔGPT-3.5 ZS_mh | ↓ −0.006 | ↓ −0.006 | ↑ +0.039 | ↑ +0.116 | ↓ −0.093 | ↑ +0.077 | ↑ +0.021 |
| ΔGPT-3.5 ZS_both | ↓ −0.005 | ↓ −0.015 | ↑ +0.014 | ↑ +0.172 | ↑ +0.047 | ↑ +0.020 | ↑ +0.039 |
| ΔGPT-4 ZS_context | ↑ +0.006 | ↑ +0.000 | ↑ +0.001 | ↑ +0.000 | ↓ −0.007 | ↑ +0.023 | ↑ +0.004 |
| ΔGPT-4 ZS_mh | ↑ +0.025 | ↓ −0.035 | ↑ +0.067 | ↑ +0.002 | ↓ −0.023 | ↓ −0.022 | ↑ +0.002 |
| ΔGPT-4 ZS_both | ↑ +0.018 | ↓ −0.031 | ↑ +0.061 | ↑ +0.003 | ↓ −0.063 | ↓ −0.006 | ↓ −0.003 |
| Mean Δ (All Six Models) | ↓ −0.031 | ↑ +0.000 | ↓ −0.011 | ↑ +0.032 | ↓ −0.006 | ↑ +0.017 | ↑ +0.000 |
| Mean Δ (Alpaca, GPT-3.5, GPT-4) | ↑ +0.001 | ↑ +0.007 | ↑ +0.017 | ↑ +0.053 | ↓ −0.013 | ↑ +0.015 | ↑ +0.013 |

Green/red color indicates increased/decreased accuracy. This table zooms in on the zero-shot section of Table 4. ↑/↓ marks the entries with better/worse performance in comparison.
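As a reading aid, the rightmost column of Table 5 (the average over all six tasks) can be approximately recomputed from the per-task entries. The sketch below assumes that column is the unweighted arithmetic mean of the six task deltas, which the table does not state explicitly. The helper names `mean_delta` and `is_consistent` are hypothetical. Because the printed values are rounded to three decimals, the recomputed mean is only expected to match the reported one within a small rounding slack, not exactly.

```python
from statistics import mean

# Per-task balanced-accuracy deltas (Task #1 ... Task #6) copied from three
# rows of Table 5, paired with the row average reported in the last column.
rows = {
    "Alpaca ZS_context": ([0.019, 0.045, 0.023, 0.004, 0.014, 0.018], 0.021),
    "LLaMA2 ZS_context": ([-0.062, 0.014, -0.019, 0.000, 0.031, 0.106], 0.012),
    "GPT-3.5 ZS_both": ([-0.005, -0.015, 0.014, 0.172, 0.047, 0.020], 0.039),
}


def mean_delta(deltas):
    """Unweighted mean of the per-task deltas (assumed aggregation rule)."""
    return mean(deltas)


def is_consistent(deltas, reported, slack=0.001):
    """Check the recomputed mean against the reported one.

    Each printed delta may be off by up to 0.0005 from rounding, and so may
    the reported mean, so a slack of 0.001 is allowed.
    """
    return abs(mean_delta(deltas) - reported) <= slack + 1e-12


if __name__ == "__main__":
    for name, (deltas, reported) in rows.items():
        print(f"{name}: recomputed {mean_delta(deltas):+.4f}, "
              f"reported {reported:+.3f}, "
              f"consistent={is_consistent(deltas, reported)}")
```

The same check can be applied column-wise to the two summary rows, again under the assumption that they are plain averages over the listed models.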