Table 7: Hyperparameters used for pretraining and finetuning.

| Hyperparameter | Pretraining | Finetuning |
|---|---|---|
| Batch size | 128 | 32 |
| Learning rate | 0.1 | 2e-5 |
| Optimizer | SGD | Adam |
| Temperature (contrastive loss) | 0.4 | - |
| Number of epochs | 100 | 10 |
| Adam betas (β₁, β₂) | - | (0.9, 0.99) |
| Augmentation probability | 0.2 | - |
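For reference, a minimal sketch of how the Table 7 optimizer settings could be instantiated, assuming a PyTorch setup (the framework, the `build_optimizer` helper, and the config dictionaries are illustrative assumptions, not taken from the paper):

```python
import torch

# Pretraining settings from Table 7: SGD, lr = 0.1, batch size 128, 100 epochs,
# contrastive-loss temperature 0.4, augmentation probability 0.2.
PRETRAIN_CFG = dict(batch_size=128, lr=0.1, epochs=100, temperature=0.4, aug_prob=0.2)

# Finetuning settings from Table 7: Adam, lr = 2e-5, betas = (0.9, 0.99),
# batch size 32, 10 epochs.
FINETUNE_CFG = dict(batch_size=32, lr=2e-5, betas=(0.9, 0.99), epochs=10)

def build_optimizer(model: torch.nn.Module, stage: str) -> torch.optim.Optimizer:
    """Return the optimizer listed in Table 7 for the given training stage."""
    if stage == "pretrain":
        return torch.optim.SGD(model.parameters(), lr=PRETRAIN_CFG["lr"])
    if stage == "finetune":
        return torch.optim.Adam(model.parameters(),
                                lr=FINETUNE_CFG["lr"],
                                betas=FINETUNE_CFG["betas"])
    raise ValueError(f"unknown stage: {stage}")
```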