Table 6: Hyperparameters for fine-tuning T5 and BART.
| Hyperparameter | Setting |
|---|---|
| Optimizer | Adam |
| Epochs | 10 (with early stopping) |
| Learning rate | 1e-4, 1e-5, 1e-6 |
| Batch size | 4 |
| Task prefix (T5) | “summarize:” |
| Encoder max length | 512 |
| Decoder max length | 128 |
| Beam size | 10 |
| Length penalty | 1 |
| No-repeat n-gram size | 2 |
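
For readers who want to reproduce the setup, the settings in Table 6 can be collected into a single configuration object. The following is a minimal sketch, not the authors' code: the class name `FineTuneConfig` and the helpers `generation_kwargs` and `format_input` are hypothetical, the decoding keys (`max_length`, `num_beams`, `length_penalty`, `no_repeat_ngram_size`) assume the keyword names accepted by `generate` in Hugging Face Transformers, the three learning rates are simply stored as listed, and joining the T5 prefix to the input with a single space is an assumption.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FineTuneConfig:
    """Values copied from Table 6; field names are illustrative."""
    optimizer: str = "adam"
    max_epochs: int = 10  # upper bound; early stopping may end training sooner
    learning_rates: Tuple[float, ...] = (1e-4, 1e-5, 1e-6)  # as listed in the table
    batch_size: int = 4
    t5_task_prefix: str = "summarize:"
    encoder_max_length: int = 512
    decoder_max_length: int = 128
    beam_size: int = 10
    length_penalty: float = 1.0
    no_repeat_ngram_size: int = 2


def generation_kwargs(cfg: FineTuneConfig) -> dict:
    """Map the decoding rows of Table 6 to assumed `generate` keyword names."""
    return {
        "max_length": cfg.decoder_max_length,
        "num_beams": cfg.beam_size,
        "length_penalty": cfg.length_penalty,
        "no_repeat_ngram_size": cfg.no_repeat_ngram_size,
    }


def format_input(text: str, cfg: FineTuneConfig, model_family: str) -> str:
    """Prepend the task prefix for T5 only; BART inputs are left unchanged."""
    if model_family == "t5":
        # Assumption: prefix and source text are separated by one space.
        return f"{cfg.t5_task_prefix} {text}"
    return text


CONFIG = FineTuneConfig()
```

With this sketch, `generation_kwargs(CONFIG)` yields the beam-search settings as one dictionary that could be unpacked into a decoding call, and `format_input` keeps the T5-specific prefix out of the BART pipeline.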