TABLE 5. Mean and standard deviation of the average prediction error over the five runs of each of the 24 tests, for each of the three progress indication methods.
| Deep learning model | Learning rate decay method | Optimization algorithm | Progress indication method 1 | Progress indication method 2 | Progress indication method 3 |
|---|---|---|---|---|---|
| GoogLeNet | Using a constant learning rate | Adam | 0.50±0.10 | 0.45±0.12 | 0.51±0.14 |
| | | RMSprop | 0.53±0.25 | 0.42±0.11 | 0.42±0.13 |
| | | SGD | 0.18±0.03 | 0.30±0.01 | 0.11±0.01 |
| | | AdaGrad | 0.17±0.07 | 0.41±0.02 | 0.15±0.02 |
| | Exponential decay method | Adam | 2.46±1.20 | 1.46±0.66 | 0.89±0.26 |
| | | RMSprop | 1.20±0.51 | 0.79±0.19 | 0.66±0.05 |
| | | SGD | 1.32±0.53 | 0.97±0.31 | 0.70±0.20 |
| | | AdaGrad | 1.22±0.29 | 0.80±0.16 | 0.58±0.09 |
| | Step decay method | Adam | 0.45±0.06 | 0.45±0.07 | 0.44±0.11 |
| | | RMSprop | 0.73±0.50 | 0.54±0.14 | 0.57±0.14 |
| | | SGD | 0.40±0.04 | 0.49±0.05 | 0.34±0.09 |
| | | AdaGrad | 0.35±0.04 | 0.44±0.05 | 0.52±0.09 |
| GRU | Using a constant learning rate | Adam | 1.94±0.67 | 0.54±0.08 | 0.48±0.05 |
| | | RMSprop | 1.55±0.53 | 0.60±0.17 | 0.52±0.19 |
| | | SGD | 0.65±0.08 | 0.43±0.08 | 0.58±0.12 |
| | | AdaGrad | 0.93±0.60 | 0.52±0.03 | 0.48±0.08 |
| | Exponential decay method | Adam | 2.40±1.17 | 0.60±0.18 | 0.44±0.13 |
| | | RMSprop | 1.27±0.22 | 0.44±0.13 | 0.25±0.09 |
| | | SGD | 1.39±0.25 | 0.93±0.15 | 0.51±0.07 |
| | | AdaGrad | 1.45±0.62 | 0.66±0.58 | 0.42±0.26 |
| | Step decay method | Adam | 1.94±0.60 | 0.55±0.18 | 0.46±0.17 |
| | | RMSprop | 1.59±0.17 | 0.51±0.07 | 0.47±0.13 |
| | | SGD | 0.57±0.10 | 0.41±0.08 | 0.55±0.12 |
| | | AdaGrad | 1.99±0.50 | 0.63±0.21 | 0.45±0.16 |
| Over all runs in all tests | | | 1.13±0.84 | 0.60±0.33 | 0.48±0.21 |
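Each cell above is the mean ± standard deviation of the average prediction error over five runs. As a minimal sketch of how such a cell could be computed (the per-run error values here are illustrative placeholders, and whether the paper uses the sample or population standard deviation is an assumption):

```python
import statistics

# Hypothetical average prediction errors from five runs of one test
# (values invented for illustration only).
run_errors = [0.52, 0.47, 0.61, 0.38, 0.50]

mean = statistics.mean(run_errors)
std = statistics.stdev(run_errors)  # sample standard deviation; the paper may use the population form

# Format as "mean±std" with two decimals, matching the table's cells.
print(f"{mean:.2f}±{std:.2f}")
```

With these placeholder values the cell would read 0.50±0.08; the actual runs behind each table entry are not reproduced here.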