Author manuscript; available in PMC: 2024 Apr 30.
Published in final edited form as: Proc Conf Assoc Comput Linguist Meet. 2023 Jul;2023:10520–10542. doi: 10.18653/v1/2023.acl-long.587

Table 6:

PRIMERA models calibrated to improve relevance. Calibration candidates are pooled from fine-tuned PRIMERA and LongT5 models. REL stands for RelAgg (from §4.1). FAITH stands for FaithAgg (from §4.2).

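| Selection Type | Selection Strategy | Clinical REL | Clinical FAITH | Chemical REL | Chemical FAITH | Biomedical REL | Biomedical FAITH | Dataset Avg. REL | Dataset Avg. FAITH |
|---|---|---|---|---|---|---|---|---|---|
| Random | | .220 | .180 | .081 | −.038 | .028 | .061 | .110 | .068 |
| Quality Based | Extreme | .263 | .152 | .049 | −.168 | .039 | .002 | .117 | −.005 |
| Quality Based | Average | .028 | −.080 | .015 | .056 | .030 | .025 | .024 | .000 |
| Quality Based | Min | .193 | −.022 | .069 | −.049 | .039 | −.012 | .100 | −.027 |
| Quality Based | High | .218 | .095 | .056 | −.029 | .019 | .004 | .098 | .023 |
| Margin Based | Max | .235 | .210 | .062 | .031 | .032 | −.011 | .110 | .077 |
| Margin Based | Min | .158 | −.115 | .028 | .080 | .014 | .015 | .067 | −.007 |
| Diversity Based | Max | .274 | .151 | .054 | −.166 | .015 | −.011 | .114 | −.009 |
| Diversity Based | Min | .275 | .091 | −.049 | −.114 | .020 | −.037 | .082 | −.020 |
| Likelihood Based | Extreme Beam | .260 | .140 | .029 | −.158 | .030 | −.008 | .106 | −.009 |
| Likelihood Based | Top Beam | .287 | .142 | .066 | −.042 | .030 | −.008 | .128 | .031 |
| Likelihood Based | Bottom Beam | .101 | .125 | .059 | .085 | .025 | −.002 | .062 | .069 |
| Spurious Correlates | Max Length | .255 | .150 | .051 | −.095 | .017 | −.027 | .108 | .009 |
| Spurious Correlates | Min Length | .181 | .243 | .042 | .052 | .033 | .022 | .085 | .106 |
| Avg. Across Strategies | | .211 | .104 | .044 | −.040 | .027 | .001 | .094 | .022 |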
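
The Dataset Avg. columns appear to be the unweighted mean of the Clinical, Chemical, and Biomedical scores for each row. That reading is an assumption, since the table does not state the formula. The sketch below recomputes the average for three rows copied from the table and compares it with the reported value. The names `ROWS`, `DATASETS`, and `dataset_avg` are illustrative and do not come from the paper.

```python
from statistics import mean

DATASETS = ("Clinical", "Chemical", "Biomedical")

# (REL, FAITH) pairs copied from three rows of Table 6.
# "reported" holds the table's own Dataset Avg. values for the same row.
ROWS = {
    "Random": {
        "Clinical": (0.220, 0.180),
        "Chemical": (0.081, -0.038),
        "Biomedical": (0.028, 0.061),
        "reported": (0.110, 0.068),
    },
    "Likelihood Based / Top Beam": {
        "Clinical": (0.287, 0.142),
        "Chemical": (0.066, -0.042),
        "Biomedical": (0.030, -0.008),
        "reported": (0.128, 0.031),
    },
    "Spurious Correlates / Min Length": {
        "Clinical": (0.181, 0.243),
        "Chemical": (0.042, 0.052),
        "Biomedical": (0.033, 0.022),
        "reported": (0.085, 0.106),
    },
}


def dataset_avg(row):
    """Unweighted mean over the three datasets (assumed formula), to 3 decimals."""
    rel = mean(row[d][0] for d in DATASETS)
    faith = mean(row[d][1] for d in DATASETS)
    return round(rel, 3), round(faith, 3)


if __name__ == "__main__":
    for name, row in ROWS.items():
        print(name, "computed:", dataset_avg(row), "reported:", row["reported"])
```

For these three rows the recomputed means match the reported averages to three decimals, which is consistent with the assumed formula. Because the per-dataset inputs are themselves rounded, other rows could differ in the last digit.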