Table 2. Hyperparameters used for training.
Optimizer | SGDM |
Mini-batch size | 16 |
Initial learning rate | 3e-4 |
Learning rate drop factor | 0.2 |
Learning rate drop period | 8 |
L2 regularization factor | 0.004 |
Validation frequency | 16 |
Momentum | 0.9 |
Maximum number of epochs | 20 |
SGDM: stochastic gradient descent with momentum.
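
To make the interaction of the learning rate settings in Table 2 concrete, the following minimal sketch computes the learning rate per epoch and applies one SGDM update. It rests on assumptions the table does not state: that the drop period is counted in epochs, that the schedule is a piecewise-constant step decay, and that L2 regularization is added to the gradient as weight decay. The function names `learning_rate` and `sgdm_step` are hypothetical, and the velocity form of the momentum update is one common formulation, not necessarily the one used in the training framework.

```python
# Values taken from Table 2.
INITIAL_LR = 3e-4
DROP_FACTOR = 0.2
DROP_PERIOD = 8    # assumption: measured in epochs
MOMENTUM = 0.9
L2_FACTOR = 0.004
MAX_EPOCHS = 20


def learning_rate(epoch):
    """Step-decay learning rate for a 0-based epoch index (assumed schedule)."""
    return INITIAL_LR * DROP_FACTOR ** (epoch // DROP_PERIOD)


def sgdm_step(weight, grad, velocity, lr):
    """One SGDM update on a scalar weight.

    Assumption: the L2 term is added to the gradient before the
    momentum update (velocity form of SGDM).
    """
    grad = grad + L2_FACTOR * weight
    velocity = MOMENTUM * velocity - lr * grad
    return weight + velocity, velocity


# Learning rate for every epoch of the run.
schedule = [learning_rate(epoch) for epoch in range(MAX_EPOCHS)]

if __name__ == "__main__":
    for epoch, lr in enumerate(schedule):
        print(f"epoch {epoch + 1:2d}: lr = {lr:.1e}")
```

Under these assumptions the learning rate is 3e-4 for epochs 1 to 8, 6e-5 for epochs 9 to 16, and 1.2e-5 for the remaining four epochs.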