Author manuscript; available in PMC: 2023 Feb 25.
Published in final edited form as: Proc Conf. 2022 Jul;2022:1029–1040. doi: 10.18653/v1/2022.naacl-main.75

Table 4:

Paragraph-level performance of the evidence retriever module. The overall evaluation metrics (precision, recall, and F1-score) are macro-averaged. Evidence prediction is the main task, whereas SA and SI prediction are auxiliary tasks that help the model align the paragraph vector representations for hospital-stay-level suicidal behavior prediction.

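| Paragraph Evidence Prediction | | | | Paragraph SA Prediction | | | | Paragraph SI Prediction | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Evidence | P | R | F | Labels | P | R | F | Labels | P | R | F |
| Yes | 0.79 | 0.87 | 0.83 | Positive | 0.71 | 0.74 | 0.73 | Positive | 0.46 | 0.62 | 0.53 |
| No | 0.95 | 0.91 | 0.93 | Neg_Unsure | 0.19 | 0.26 | 0.22 | Negative | 0.38 | 0.46 | 0.42 |
| - | - | - | - | Neutral-SA | 0.95 | 0.92 | 0.93 | Neutral-SI | 0.98 | 0.99 | 0.98 |
| Overall | 0.87 | 0.89 | 0.88 | Overall | 0.62 | 0.64 | 0.63 | Overall | 0.61 | 0.69 | 0.64 |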

P: Precision, R: Recall and F: F1-score.
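
As a rough illustration of how the Overall row of Table 4 relates to the per-class rows, the sketch below assumes that macro-averaging here means the unweighted mean of each metric over the classes of a task. The dictionary layout and the `macro_average` helper are hypothetical names introduced for this example, not part of the evidence retriever module, and the inputs are the rounded values printed in the table rather than the unrounded scores.

```python
# Per-class (precision, recall, F1) values copied from Table 4.
# The structure and names below are illustrative assumptions only.
per_class = {
    "Evidence": {
        "Yes": (0.79, 0.87, 0.83),
        "No": (0.95, 0.91, 0.93),
    },
    "SA": {
        "Positive": (0.71, 0.74, 0.73),
        "Neg_Unsure": (0.19, 0.26, 0.22),
        "Neutral-SA": (0.95, 0.92, 0.93),
    },
    "SI": {
        "Positive": (0.46, 0.62, 0.53),
        "Negative": (0.38, 0.46, 0.42),
        "Neutral-SI": (0.98, 0.99, 0.98),
    },
}


def macro_average(scores):
    """Return the unweighted mean of (P, R, F) over all classes, rounded to 2 places."""
    n_classes = len(scores)
    # zip(*...) regroups the per-class tuples into one column per metric.
    columns = zip(*scores.values())
    return tuple(round(sum(column) / n_classes, 2) for column in columns)


for task, scores in per_class.items():
    p, r, f = macro_average(scores)
    print(f"{task}: P={p:.2f} R={r:.2f} F={f:.2f}")
```

Because every class counts equally under this assumption, a rare class with low scores such as Neg_Unsure lowers the overall SA figures considerably, even though the frequent Neutral-SA class scores above 0.9.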