Table 2. Training hyperparameters.
| Hyperparameter | Value |
| --- | --- |
| Maximum epochs | 16 |
| Validation frequency | once per epoch |
| Validation patience (epochs) | 2 |
| Initial learning rate | 0.001 |
| Learning rate drop period (epochs) | 4 |
| Learning rate drop factor | 0.1 |
| Minibatch size | 8 |
The network is trained for a maximum of 16 epochs. The cross-entropy loss on the validation dataset is computed once per epoch, and training terminates early if this loss fails to fall below the previous minimum validation loss for two consecutive epochs (validation patience of 2). The initial learning rate is 0.001 and is reduced by a factor of ten every four epochs. We use a minibatch size of eight to limit the loss of generalizability that can occur at larger batch sizes [54].
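These settings amount to a step-decay learning rate schedule combined with early stopping on the validation loss. The sketch below shows one way such a configuration could be realized; it is written in PyTorch purely for illustration, since the training framework is not specified in this excerpt. The Adam solver and the placeholder names `model`, `train_loader`, and `val_loader` are assumptions, not taken from the text; the minibatch size of eight would be set when constructing the training data loader.

```python
# Illustrative sketch of the Table 2 training schedule (not the authors' code).
# Assumes: a classification `model`, and `train_loader`/`val_loader` built with
# batch_size=8; the Adam optimizer is an assumption, as the solver is not stated.
import torch
import torch.nn as nn


def train(model, train_loader, val_loader, device="cpu"):
    model.to(device)
    criterion = nn.CrossEntropyLoss()                           # cross-entropy loss, as in the text
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # initial learning rate 0.001
    # Drop the learning rate by a factor of ten every four epochs.
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=4, gamma=0.1)

    best_val_loss = float("inf")
    epochs_without_improvement = 0

    for epoch in range(16):                                     # maximum of 16 epochs
        model.train()
        for inputs, targets in train_loader:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad()
            loss = criterion(model(inputs), targets)
            loss.backward()
            optimizer.step()
        scheduler.step()

        # Validate once per epoch (validation frequency of 1).
        model.eval()
        val_loss, n = 0.0, 0
        with torch.no_grad():
            for inputs, targets in val_loader:
                inputs, targets = inputs.to(device), targets.to(device)
                val_loss += criterion(model(inputs), targets).item() * inputs.size(0)
                n += inputs.size(0)
        val_loss /= n

        # Early stopping with a validation patience of two epochs.
        if val_loss < best_val_loss:
            best_val_loss = val_loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= 2:
                break

    return model
```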