Brain Sci. 2020 Jul 3;10(7):427. doi: 10.3390/brainsci10070427

Table 10.

Error rates of ten optimizers at various learning rates on the proposed patch-wise CNN model architecture.

| Optimizer | 1e-1 | 1e-2 | 1e-3 | 1e-4 | 1e-5 | 1e-6 | 1e-7 | 1e-8 | 1e-9 | 1e-10 |
|-----------|------|------|------|------|------|------|------|------|------|-------|
| Adam      | 0.05 | 0.03 | 0.04 | 0.06 | 0.07 | 0.1  | 0.1  | 0.07 | 0.1  | 0.1   |
| Adagrad   | 1.04 | 0.86 | 0.56 | 0.37 | 0.54 | 0.77 | 0.81 | 0.54 | 0.77 | 0.81  |
| AdaDelta  | 1.86 | 1.81 | 1.83 | 2.01 | 2.08 | 2.17 | 2.2  | 2.08 | 2.17 | 2.2   |
| SGD       | 0.53 | 0.49 | 0.47 | 0.63 | 0.67 | 0.81 | 0.97 | 0.67 | 0.81 | 0.97  |
| NAG       | 2.3  | 2.25 | 2.19 | 2.39 | 2.43 | 2.67 | 2.79 | 2.43 | 2.67 | 2.79  |
| RMSprop   | 2.12 | 2.13 | 2.1  | 2.17 | 2.29 | 2.15 | 2.49 | 2.29 | 2.15 | 2.49  |
| Momentum  | 0.25 | 0.26 | 0.28 | 0.43 | 0.49 | 0.51 | 0.57 | 0.49 | 0.51 | 0.57  |
| Adamax    | 1.69 | 1.52 | 1.26 | 1.49 | 1.6  | 1.92 | 2.09 | 1.6  | 1.92 | 2.09  |
| CLR       | 1.88 | 1.88 | 1.79 | 1.97 | 2.05 | 2.15 | 2.45 | 2.05 | 2.15 | 2.45  |
| Nadam     | 1.79 | 1.76 | 1.65 | 1.69 | 1.81 | 2.07 | 2.31 | 1.81 | 2.07 | 2.31  |

(The learning-rate column headers are shown as negative powers of ten, i.e. 1e-1 through 1e-10; the minus signs appear to have been lost in extraction, since learning rates above 1 are not practical.)
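A table like this is produced by a grid sweep over optimizers and learning rates, evaluating the trained model at each cell. The sketch below illustrates the sweep pattern on a toy one-parameter objective rather than the paper's CNN: the `sgd_step` and `adam_step` updates are the textbook rules, and the objective, step counts, and learning-rate grid are all assumptions for illustration, not the authors' setup.

```python
import math

def sgd_step(w, g, lr, state):
    """Plain gradient-descent update (textbook SGD, no momentum)."""
    return w - lr * g, state

def adam_step(w, g, lr, state, b1=0.9, b2=0.999, eps=1e-8):
    """Textbook Adam update with bias-corrected moment estimates."""
    m, v, t = state
    t += 1
    m = b1 * m + (1 - b1) * g          # first-moment estimate
    v = b2 * v + (1 - b2) * g * g      # second-moment estimate
    mhat = m / (1 - b1 ** t)
    vhat = v / (1 - b2 ** t)
    return w - lr * mhat / (math.sqrt(vhat) + eps), (m, v, t)

def sweep(step_fn, init_state, lrs, steps=200):
    """Minimize the toy objective f(w) = (w - 3)^2 starting from w = 0,
    and record the final error |w - 3| for each learning rate."""
    errors = {}
    for lr in lrs:
        w, state = 0.0, init_state
        for _ in range(steps):
            g = 2.0 * (w - 3.0)        # gradient of (w - 3)^2
            w, state = step_fn(w, g, lr, state)
        errors[lr] = abs(w - 3.0)
    return errors

# Hypothetical learning-rate grid, mirroring the table's 1e-1 ... 1e-5 range.
lrs = [1e-1, 1e-2, 1e-3, 1e-4, 1e-5]
sgd_err = sweep(sgd_step, None, lrs)
adam_err = sweep(adam_step, (0.0, 0.0, 0), lrs)
```

On this toy problem both optimizers converge at the larger rates and stall at the tiny ones, which is the same qualitative pattern the table reports: each row has a sweet-spot learning rate, with error rising as the rate shrinks too far.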