Table 3.
Evaluation of the single-document summarization task, comparing ROUGE-1 (R-1), ROUGE-2 (R-2), and ROUGE-L (R-L) scores. Some results are taken from other papers [52].
| Model | PubMed R-1 | PubMed R-2 | PubMed R-L | MIMIC-CXRa R-1 | MIMIC-CXRa R-2 | MIMIC-CXRa R-L | MEDIQA-AnS (p) R-1 | MEDIQA-AnS (p) R-2 | MEDIQA-AnS (p) R-L | MEDIQA-AnS (s) R-1 | MEDIQA-AnS (s) R-2 | MEDIQA-AnS (s) R-L |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Pegasus | 45.97 | 20.15 | 28.25 | 22.49 | 11.57 | 20.35 | 18.29 | 4.82 | 13.87 | 22.21 | 8.23 | 16.76 |
| BigBird | 46.32 | 20.65 | 42.33b | 38.99 | 29.52 | 38.59 | 13.18 | 2.14 | 10.04 | 14.89 | 3.13 | 11.15 |
| BART | 48.35b | 21.43b | 36.90 | 41.70b | 32.93b | 41.16b | 24.02b | 7.20 | 17.09b | 38.19 | 22.20 | 30.58 |
| SciFive | —c | — | — | 35.41 | 26.48 | 35.07 | 13.08 | 2.15 | 10.10 | 16.88 | 6.47 | 14.42 |
| BioBART | — | — | — | 41.61 | 32.90 | 41.00 | 22.58 | 7.49b | 16.69 | 39.40b | 24.64b | 32.07b |
aMIMIC-CXR: MIMIC Chest X-Ray database.
bThe best score for each metric within the same data set.
cNot applicable.
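For readers unfamiliar with the metrics in Table 3, the following is a minimal sketch of how ROUGE-N and ROUGE-L F1 scores can be computed. It assumes lowercased whitespace tokenization with no stemming, and the helper names (`ngrams`, `f1`, `rouge_n`, `lcs_length`, `rouge_l`) are illustrative. The exact ROUGE implementation and settings behind the reported scores are not stated here, so this will not necessarily reproduce them.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision (overlap/candidate) and recall (overlap/reference)."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n=1):
    """ROUGE-N F1: clipped n-gram overlap between candidate and reference."""
    # Assumption: simple lowercased whitespace tokenization, no stemming.
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # Counter & takes the minimum count
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))
```

These functions return values between 0 and 1. The scores in Table 3 appear to be on a 0 to 100 scale, so a comparable figure would be obtained by multiplying by 100.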