Table 2: Performance of fine-tuned T5 models on the summarization task. 95% confidence intervals are given in parentheses. The first row is a baseline representing the best prior performance on this task. See the Appendix for the full set of results.
| Model | Training | Summarization |
|---|---|---|
| Gao et al., 2023 | Single task | 7.60 (5.31 – 9.89) |
| T5 220M | Single task | 26.35 (22.18 – 30.52) |
| T5 220M | Multi-task | 24.84 (20.28 – 29.40) |
| T5 770M | Single task | 26.90 (22.58 – 31.23) |
| T5 770M | Multi-task | 23.99 (19.86 – 28.13) |
| SciFive 220M | Single task | 25.31 (21.45 – 29.17) |
| SciFive 220M | Multi-task | 24.38 (19.99 – 28.78) |
| SciFive 770M | Single task | 27.31 (23.09 – 31.53) |
| SciFive 770M | Multi-task | 25.31 (21.45 – 29.17) |
| Clinical-T5 220M | Single task | 25.35 (21.19 – 29.51) |
| Clinical-T5 220M | Multi-task | 26.21 (21.92 – 30.49) |
| Clinical-T5 770M | Single task | 28.28 (24.17 – 32.38) |
| Clinical-T5 770M | Multi-task | 28.55 (24.29 – 32.80) |
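The table does not specify how the 95% intervals were obtained; a common choice for per-example summarization scores is a percentile bootstrap over the test set. The sketch below illustrates that approach under stated assumptions (the `bootstrap_ci` helper, the resampling scheme, and the synthetic scores are all illustrative, not the paper's exact procedure).

```python
import numpy as np

def bootstrap_ci(scores, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of per-example scores.

    `scores` is a 1-D array of per-document summarization scores
    (e.g., ROUGE on the test set). This is an illustrative assumption
    about how such intervals can be computed, not the paper's method.
    """
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    n = scores.shape[0]
    # Resample the test set with replacement and record each resample's mean.
    idx = rng.integers(0, n, size=(n_resamples, n))
    means = scores[idx].mean(axis=1)
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return scores.mean(), (lo, hi)

# Usage with synthetic scores (not the results reported in Table 2).
if __name__ == "__main__":
    fake_scores = np.random.default_rng(1).normal(loc=26.0, scale=12.0, size=200)
    mean, (lo, hi) = bootstrap_ci(fake_scores)
    print(f"{mean:.2f} ({lo:.2f} – {hi:.2f})")
```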