Epoch | Step | Training Loss | Validation Loss | Accuracy |
---|---|---|---|---|
1 | 73 | 0.1179 | 0.0977 | 98.85% |
2 | 147 | 0.06 | 0.0693 | 98.98% |
3 | 219 | 0.0376 | 0.0604 | 99.11% ¹ |
4 | 292 | 0.024 | 0.0769 | 98.09% |
5 | 366 | 0.0236 | 0.1111 | 97.45% |
6 | 440 | 0.0172 | 0.0542 | 98.98% ² |
7 | 514 | 0.0114 | 0.0630 | 98.85% |
8 | 587 | 0.0051 | 0.0674 | 98.60% |
9 | 661 | 0.0044 | 0.0640 | 98.85% |
10 | 735 | 0.0037 | 0.0646 | 98.85% |
11 | 809 | 0.0034 | 0.0652 | 98.85% |
12 | 882 | 0.0032 | 0.0656 | 98.85% |
13 | 949 | 0.0032 | 0.0657 | 98.85% |

¹ After epoch 3, the model reached its best accuracy overall, evaluated on a test set split from the training set. This checkpoint was saved as Model v1.

² After epoch 6, the model reached the best accuracy of the second training run. This checkpoint was saved as Model v2.
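
The checkpoint choices in the footnotes can be reproduced from the table with a few lines of Python. This is only a sketch: the `LOG` list and the `best_epoch` helper are hypothetical names introduced here, and `first_epoch=4` assumes the second training run begins right after the Model v1 checkpoint, which the table does not state.

```python
# (epoch, accuracy in %) pairs copied from the table above.
LOG = [
    (1, 98.85), (2, 98.98), (3, 99.11), (4, 98.09), (5, 97.45),
    (6, 98.98), (7, 98.85), (8, 98.60), (9, 98.85), (10, 98.85),
    (11, 98.85), (12, 98.85), (13, 98.85),
]


def best_epoch(log, first_epoch=1):
    """Return the (epoch, accuracy) pair with the highest accuracy,
    considering only epochs >= first_epoch. Ties go to the earliest epoch."""
    candidates = [row for row in log if row[0] >= first_epoch]
    return max(candidates, key=lambda row: row[1])


# Best checkpoint over the whole table (Model v1).
print(best_epoch(LOG))                  # (3, 99.11)

# Assumption: the second run starts at epoch 4 (Model v2).
print(best_epoch(LOG, first_epoch=4))   # (6, 98.98)
```

Selecting on accuracy alone matches the footnotes. Validation loss is also lowest at epoch 6 (0.0542), so either criterion would pick the same Model v2 checkpoint under this assumption.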