Author manuscript; available in PMC: 2024 Apr 30.
Published in final edited form as: Proc Conf Assoc Comput Linguist Meet. 2023 Jul;2023:10520–10542. doi: 10.18653/v1/2023.acl-long.587

Table 4:

Number of candidates pooled for each training instance. m is the % of noun phrases masked, s is the % of entities swapped, and t is the softmax temperature for GPT-3.

| Method | Hyper-Param | Number |
|---|---|---|
| Mask-And-Fill (Low) | m = 0.25 | 10 |
| Mask-And-Fill (High) | m = 0.75 | 10 |
| Swap Intrinsic (Low) | s = 0.5 | 10 |
| Swap Intrinsic (High) | s = 1.0 | 10 |
| Swap Extrinsic (Low) | s = 0.5 | 10 |
| Swap Extrinsic (High) | s = 1.0 | 10 |
| Paraphrase | t = 0.7 | 5 |
| Reference | N/A | 1 |
| Total For Faithfulness | | 66 |
| Diverse Beam (PRIMERA) | p = 1 | 10 |
| Diverse Beam (LongT5) | p = 1 | 10 |
| Total For Relevance | | 20 |
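
The two totals in Table 4 follow from summing the per-method counts. The short Python sketch below reproduces that arithmetic as a sanity check. The dictionary and function names (`FAITHFULNESS_POOL`, `RELEVANCE_POOL`, `pool_size`) are illustrative assumptions and do not come from the paper's code; only the counts are taken from the table.

```python
# Per-method candidate counts, copied from Table 4.
# The variable and function names are hypothetical, chosen for illustration only.
FAITHFULNESS_POOL = {
    "Mask-And-Fill (Low)": 10,
    "Mask-And-Fill (High)": 10,
    "Swap Intrinsic (Low)": 10,
    "Swap Intrinsic (High)": 10,
    "Swap Extrinsic (Low)": 10,
    "Swap Extrinsic (High)": 10,
    "Paraphrase": 5,
    "Reference": 1,
}

RELEVANCE_POOL = {
    "Diverse Beam (PRIMERA)": 10,
    "Diverse Beam (LongT5)": 10,
}


def pool_size(pool: dict[str, int]) -> int:
    """Return the total number of candidates pooled per training instance."""
    return sum(pool.values())


if __name__ == "__main__":
    print("Faithfulness:", pool_size(FAITHFULNESS_POOL))  # 6 * 10 + 5 + 1 = 66
    print("Relevance:", pool_size(RELEVANCE_POOL))        # 10 + 10 = 20
```

Keeping the two pools as separate mappings mirrors the table, which reports the faithfulness and relevance totals separately.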