2024 Oct 3;26:e60601. doi: 10.2196/60601

Table 3.

Evaluation of the single-document summarization task. We compared models on ROUGE-1 (R-1), ROUGE-2 (R-2), and ROUGE-L (R-L); some results are taken from other papers [52].

	PubMed			MIMIC-CXR^a			MEDIQA-AnS (p)			MEDIQA-AnS (s)
Model	R-1	R-2	R-L	R-1	R-2	R-L	R-1	R-2	R-L	R-1	R-2	R-L
Pegasus	45.97	20.15	28.25	22.49	11.57	20.35	18.29	4.82	13.87	22.21	8.23	16.76
BigBird	46.32	20.65	42.33^b	38.99	29.52	38.59	13.18	2.14	10.04	14.89	3.13	11.15
BART	48.35^b	21.43^b	36.90	41.70^b	32.93^b	41.16^b	24.02^b	7.20	17.09^b	38.19	22.20	30.58
SciFive	—^c	—	—	35.41	26.48	35.07	13.08	2.15	10.10	16.88	6.47	14.42
BioBART	—	—	—	41.61	32.90	41.00	22.58	7.49^b	16.69	39.40^b	24.64^b	32.07^b

^aMIMIC-CXR: MIMIC Chest X-Ray database.

^bThe best score within the same data set.

^cNot applicable.
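
For readers unfamiliar with the metrics in this table, the following is a minimal sketch of how ROUGE-1, ROUGE-2, and ROUGE-L can be computed for one candidate summary against one reference. It makes several assumptions that the table does not confirm: whitespace tokenization, lowercasing, no stemming, and F1 as the reported variant. The function names and the example sentences are hypothetical, and standard ROUGE toolkits apply additional preprocessing, so this sketch is not expected to reproduce the reported scores.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """F1 from an overlap count and the candidate and reference totals."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1: clipped n-gram overlap between candidate and reference."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # Counter intersection clips counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence."""
    c, r = candidate.lower().split(), reference.lower().split()
    return f1(lcs_length(c, r), len(c), len(r))


# Hypothetical example texts, not drawn from any of the data sets above.
reference = "no acute cardiopulmonary abnormality is seen"
candidate = "no acute abnormality seen"

r1 = rouge_n(candidate, reference, 1)
r2 = rouge_n(candidate, reference, 2)
rl = rouge_l(candidate, reference)
print(f"R-1={r1:.2f} R-2={r2:.2f} R-L={rl:.2f}")
```

The functions return values between 0 and 1, whereas the values in the table appear to be on a 0 to 100 scale. Corpus-level scores such as those above are typically obtained by averaging per-document scores over the test set.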