Author manuscript; available in PMC: 2024 Apr 30.
Published in final edited form as: Proc Conf Assoc Comput Linguist Meet. 2023 Jul;2023:10520–10542. doi: 10.18653/v1/2023.acl-long.587

Figure 5:

Impact of calibration performance on downstream performance (relevance). An average rank of 0 indicates a model that always identifies the most relevant summary; the worst possible score is 3.
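
As a rough illustration of the average-rank scale described in the caption, the sketch below assumes four candidate summaries per example (which would be consistent with a worst score of 3) and takes the rank to be the 0-based relevance position of the candidate the model scores highest. The function names, the tie handling, and the toy numbers are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def selected_rank(model_scores: Sequence[float], relevance: Sequence[float]) -> int:
    """Return the 0-based relevance rank of the candidate the model prefers.

    Assumption: rank = number of candidates strictly more relevant than
    the one the model scores highest (0 is best, len - 1 is worst).
    """
    picked = max(range(len(model_scores)), key=lambda i: model_scores[i])
    return sum(1 for r in relevance if r > relevance[picked])


def average_rank(examples: Sequence[Tuple[Sequence[float], Sequence[float]]]) -> float:
    """Average the selected rank over (model_scores, relevance) pairs."""
    ranks = [selected_rank(scores, rel) for scores, rel in examples]
    return sum(ranks) / len(ranks)


# Toy data (hypothetical): four candidates per example.
examples = [
    ([0.9, 0.1, 0.3, 0.2], [0.8, 0.5, 0.4, 0.1]),  # model picks the most relevant
    ([0.1, 0.2, 0.3, 0.9], [0.8, 0.5, 0.4, 0.1]),  # model picks the least relevant
]

print(average_rank(examples))
```

Under these assumptions, a model that always prefers the most relevant candidate averages 0, and one that always prefers the least relevant of four averages 3.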