Table 4.
Parameter | Value |
---|---|
Training data size | 88 641 |
Number of training batches | 500 |
Batch size | 1 |
Number of epochs | 1 |
Initial learning rate | 5e-5 |
Warmup steps | 10% of the number of training batches |
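
As a rough illustration of how the warmup setting in Table 4 translates into a step count, the sketch below derives the number of warmup steps from the number of training batches. The shape of the schedule (linear ramp to the initial learning rate, then constant) is an assumption made only for illustration; the table does not state which schedule or optimizer was used, and the names `WARMUP_STEPS` and `learning_rate` are hypothetical.

```python
# Values taken from Table 4.
NUM_TRAINING_BATCHES = 500
INITIAL_LR = 5e-5
WARMUP_PERCENT = 10  # warmup steps are 10% of the training batches

# Integer arithmetic avoids floating-point rounding in the step count.
WARMUP_STEPS = NUM_TRAINING_BATCHES * WARMUP_PERCENT // 100


def learning_rate(step: int) -> float:
    """Hypothetical schedule: linear warmup, then constant.

    The linear ramp and the constant phase are assumptions; the table
    only specifies the initial learning rate and the warmup length.
    """
    if step < WARMUP_STEPS:
        # Ramp linearly so the last warmup step reaches INITIAL_LR.
        return INITIAL_LR * (step + 1) / WARMUP_STEPS
    return INITIAL_LR


if __name__ == "__main__":
    print("warmup steps:", WARMUP_STEPS)
    for step in (0, WARMUP_STEPS - 1, NUM_TRAINING_BATCHES - 1):
        print(step, learning_rate(step))
```

With the values in the table, the warmup phase covers the first 50 of the 500 training batches.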