Table 1.
Test accuracy on the CIFAR-10 and CIFAR-100 datasets for CNNs trained with BP and Adam, with and without GRAPES modulation.
| Optimizer | CIFAR-10 | CIFAR-100 |
|---|---|---|
| Adam | 84.78 ± 0.20 | 58.20 ± 0.39 |
| Adam + GRAPES | **85.59 ± 0.17** | **58.85 ± 0.38** |
The network is a nine-layer residual architecture. The models are trained for 250 epochs with an initial learning rate of η = 1e-2, decayed by 90% every 50 epochs. The accuracy for each run is computed as the mean of the test accuracy over the last 10 training epochs. The reported result is the mean and standard deviation over ten independent runs. Bold font indicates the best performance for each dataset.
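The schedule and evaluation protocol described above can be summarized in the following minimal sketch, assuming a PyTorch training loop; `make_model`, `train_one_epoch`, and `evaluate` are hypothetical helpers standing in for the residual network, the training step (with or without GRAPES modulation), and the test-set evaluation, none of which are specified here.

```python
# Sketch of the training schedule and evaluation protocol (assumptions noted below).
import numpy as np
import torch

EPOCHS = 250          # total training epochs
INIT_LR = 1e-2        # initial learning rate eta
DECAY_EVERY = 50      # decay interval in epochs
DECAY_FACTOR = 0.1    # "decayed by 90%" -> multiply the learning rate by 0.1

def run_once(make_model, train_loader, test_loader):
    """Train one model and return its run accuracy:
    the mean test accuracy over the last 10 epochs."""
    model = make_model()                                   # hypothetical helper
    optimizer = torch.optim.Adam(model.parameters(), lr=INIT_LR)
    scheduler = torch.optim.lr_scheduler.StepLR(
        optimizer, step_size=DECAY_EVERY, gamma=DECAY_FACTOR)
    test_acc_history = []
    for _ in range(EPOCHS):
        train_one_epoch(model, train_loader, optimizer)    # hypothetical helper
        test_acc_history.append(evaluate(model, test_loader))  # hypothetical helper
        scheduler.step()
    return float(np.mean(test_acc_history[-10:]))

# Reported result: mean and standard deviation over ten independent runs.
accs = [run_once(make_model, train_loader, test_loader) for _ in range(10)]
print(f"{np.mean(accs):.2f} ± {np.std(accs):.2f}")
```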