
Table 2:

Performance of fine-tuned T5 models on the summarization task, with 95% confidence intervals. The first row is a baseline representing the best previously reported performance on this task. See the Appendix for the full set of results.

Model              Training     Summarization
Gao et al., 2023   Single-task   7.60 (5.31 – 9.89)
T5 220M            Single-task  26.35 (22.18 – 30.52)
                   Multi-task   24.84 (20.28 – 29.40)
T5 770M            Single-task  26.90 (22.58 – 31.23)
                   Multi-task   23.99 (19.86 – 28.13)
SciFive 220M       Single-task  25.31 (21.45 – 29.17)
                   Multi-task   24.38 (19.99 – 28.78)
SciFive 770M       Single-task  27.31 (23.09 – 31.53)
                   Multi-task   25.31 (21.45 – 29.17)
Clinical-T5 220M   Single-task  25.35 (21.19 – 29.51)
                   Multi-task   26.21 (21.92 – 30.49)
Clinical-T5 770M   Single-task  28.28 (24.17 – 32.38)
                   Multi-task   28.55 (24.29 – 32.80)
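The table does not state how the 95% confidence intervals were computed. A common choice for summarization metrics is a nonparametric percentile bootstrap over per-example scores; the sketch below illustrates that approach. It is an assumption, not the paper's documented method, and the variable per_example_rouge is hypothetical, standing in for whatever per-example metric values (e.g., ROUGE) were collected on the test set.

```python
import numpy as np

def bootstrap_ci(scores, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile-bootstrap 95% CI for the mean of per-example scores.

    scores: array-like of per-example metric values (e.g., ROUGE on a 0-100 scale).
    Returns (mean, lower, upper).
    """
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    # Resample the test set with replacement and record the mean each time.
    resampled_means = np.array([
        rng.choice(scores, size=len(scores), replace=True).mean()
        for _ in range(n_resamples)
    ])
    lower, upper = np.percentile(
        resampled_means, [100 * alpha / 2, 100 * (1 - alpha / 2)]
    )
    return scores.mean(), lower, upper

# Hypothetical usage with per-example ROUGE scores from a test set:
# mean, lo, hi = bootstrap_ci(per_example_rouge)
# print(f"{mean:.2f} ({lo:.2f} - {hi:.2f})")  # matches the table's format
```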