Table 2. Comparison of test results with different parameter configurations.
| Learning rate | Epoch | Batch size | Optimizer | Test loss | Test accuracy |
|---|---|---|---|---|---|
| 0.001 | 50 | 64 | Adam | 0.9687 | 96.33% |
| 0.001 | 100 | 32 | Adam | 0.0685 | 97.36% |
| 0.001 | 150 | 16 | Adam | 0.0744 | 97.79% |
| 0.0001 | 50 | 16 | Adam | 0.0865 | 97.78% |
| 0.0001 | 150 | 32 | Adam | 0.6556 | 97.82% |
| **0.0001** | **100** | **64** | **Adam** | **0.0265** | **98.05%** |
| 0.0002 | 50 | 64 | Adam | 0.1243 | 98.01% |
| 0.0002 | 100 | 32 | Adam | 0.0434 | 97.73% |
| 0.0002 | 150 | 16 | Adam | 0.0792 | 97.56% |
| 0.001 | 50 | 64 | Adagrad | 0.9699 | 95.34% |
| 0.001 | 100 | 32 | Adagrad | 0.0768 | 96.27% |
| 0.001 | 150 | 16 | Adagrad | 0.0868 | 96.52% |
| 0.0001 | 50 | 16 | Adagrad | 0.0978 | 96.98% |
| 0.0001 | 150 | 32 | Adagrad | 0.7686 | 96.92% |
| 0.0001 | 100 | 64 | Adagrad | 0.0357 | 97.85% |
| 0.0002 | 50 | 64 | Adagrad | 0.2126 | 97.02% |
| 0.0002 | 100 | 32 | Adagrad | 0.0567 | 95.56% |
| 0.0002 | 150 | 16 | Adagrad | 0.0876 | 96.88% |
Bold values mark the best-performing configuration among all combinations tested. They are also the final parameter settings used in our proposed model.
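
As a minimal sketch, the best-performing row can be recovered from the table programmatically. The selection rule used here (highest test accuracy, ties broken by lower test loss) and the names `RESULTS` and `best_config` are assumptions made for illustration; the source does not state how the optimal configuration was chosen.

```python
# Rows copied from Table 2:
# (learning rate, epochs, batch size, optimizer, test loss, test accuracy in %)
RESULTS = [
    (0.001, 50, 64, "Adam", 0.9687, 96.33),
    (0.001, 100, 32, "Adam", 0.0685, 97.36),
    (0.001, 150, 16, "Adam", 0.0744, 97.79),
    (0.0001, 50, 16, "Adam", 0.0865, 97.78),
    (0.0001, 150, 32, "Adam", 0.6556, 97.82),
    (0.0001, 100, 64, "Adam", 0.0265, 98.05),
    (0.0002, 50, 64, "Adam", 0.1243, 98.01),
    (0.0002, 100, 32, "Adam", 0.0434, 97.73),
    (0.0002, 150, 16, "Adam", 0.0792, 97.56),
    (0.001, 50, 64, "Adagrad", 0.9699, 95.34),
    (0.001, 100, 32, "Adagrad", 0.0768, 96.27),
    (0.001, 150, 16, "Adagrad", 0.0868, 96.52),
    (0.0001, 50, 16, "Adagrad", 0.0978, 96.98),
    (0.0001, 150, 32, "Adagrad", 0.7686, 96.92),
    (0.0001, 100, 64, "Adagrad", 0.0357, 97.85),
    (0.0002, 50, 64, "Adagrad", 0.2126, 97.02),
    (0.0002, 100, 32, "Adagrad", 0.0567, 95.56),
    (0.0002, 150, 16, "Adagrad", 0.0876, 96.88),
]


def best_config(rows):
    """Return the row with the highest test accuracy.

    Ties are broken by the lower test loss (assumed rule, not from the source).
    """
    return max(rows, key=lambda r: (r[5], -r[4]))


if __name__ == "__main__":
    lr, epochs, batch, opt, loss, acc = best_config(RESULTS)
    print(f"lr={lr}, epochs={epochs}, batch={batch}, optimizer={opt}, "
          f"loss={loss}, accuracy={acc}%")
```

Under this rule the selected row is learning rate 0.0001, 100 epochs, batch size 64, Adam, which has both the highest test accuracy and the lowest test loss in the table.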