Table 5:
Observations on the out-of-distribution (OOD) dataset. Comparison of the explanations obtained by LIME against three types of annotated textual cues, Trigger (T), LoST indicators (L), and Consequences (C), using ROUGE and BLEU scores. Italicized values mark the highest LoST scores among true positives, and underlined values mark cases where the LoST and Trigger scores are comparable.
| Model | Eval. | Trigger (T) | LoST (L) | Consequences (C) |
|---|---|---|---|---|
| BERT | ROUGE | 0.0449 | 0.1764 | 0.0533 |
| | BLEU | 0.0225 | 0.1328 | 0.0432 |
| ALBERT | ROUGE | 0.1525 | 0.1857 | 0.0259 |
| | BLEU | 0.1142 | 0.1468 | 0.0166 |
| DistilBERT | ROUGE | 0.0569 | 0.1291 | 0.0266 |
| | BLEU | 0.0138 | 0.0993 | 0.0200 |
| DeBERTa | ROUGE | 0.0979 | 0.1094 | 0.0159 |
| | BLEU | 0.0638 | 0.0739 | 0.0100 |
| ClinicalBERT | ROUGE | 0.0607 | 0.0968 | 0.0533 |
| | BLEU | 0.0279 | 0.0536 | 0.0500 |
| PsychBERT | ROUGE | 0.1444 | 0.2310 | 0.0352 |
| | BLEU | 0.1018 | 0.1843 | 0.0260 |
| MentalBERT | ROUGE | 0.1054 | 0.2228 | 0.0214 |
| | BLEU | 0.0633 | 0.1897 | 0.0133 |
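
As a rough illustration of how overlap scores of this kind can be computed, the sketch below compares a list of explanation keywords with an annotated cue span. It is not the evaluation code behind Table 5: the exact ROUGE and BLEU variants are not stated here, so unigram ROUGE-1 F1 and BLEU-1 with a brevity penalty are assumed, and the function names and example strings are hypothetical.

```python
from collections import Counter
import math


def unigram_overlap(candidate, reference):
    """Clipped count of unigrams shared by candidate and reference."""
    cand, ref = Counter(candidate), Counter(reference)
    return sum(min(count, ref[tok]) for tok, count in cand.items())


def rouge1_f1(candidate, reference):
    """Unigram ROUGE F1 (assumed variant; the source does not name one)."""
    if not candidate or not reference:
        return 0.0
    overlap = unigram_overlap(candidate, reference)
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def bleu1(candidate, reference):
    """Unigram BLEU with brevity penalty (assumed variant)."""
    if not candidate or not reference:
        return 0.0
    precision = unigram_overlap(candidate, reference) / len(candidate)
    # Penalize candidates that are shorter than the reference.
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * precision


# Hypothetical example: keywords from an explanation vs. an annotated cue span.
explanation_tokens = "lost my job feel empty".split()
annotated_cue_tokens = "feel empty and alone".split()

rouge_score = rouge1_f1(explanation_tokens, annotated_cue_tokens)
bleu_score = bleu1(explanation_tokens, annotated_cue_tokens)
print(f"ROUGE-1 F1: {rouge_score:.4f}, BLEU-1: {bleu_score:.4f}")
```

Under this reading, a higher score in the L column than in the T or C column would mean the explanation tokens overlap more with the annotated LoST indicators than with the other cue types.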