TABLE 4.
Summary statistics of the average estimation error across the five runs, for each combination of deep learning model, learning rate schedule, and optimization method in the unloaded system test presented in Sections A to C.
| Deep learning model | Learning rate schedule | Optimization method | Average estimation error |
|---|---|---|---|
| GoogLeNet | fixed learning rate | SGD | 0.093±0.012 |
| GoogLeNet | fixed learning rate | AdaGrad | 0.093±0.008 |
| GoogLeNet | exponential decay | RMSprop | 0.990±0.282 |
| GoogLeNet | exponential decay | SGD | 0.897±0.274 |
| GoogLeNet | exponential decay | AdaGrad | 0.632±0.197 |
| GoogLeNet | step decay | RMSprop | 0.364±0.025 |
| GoogLeNet | step decay | SGD | 0.540±0.160 |
| GoogLeNet | step decay | AdaGrad | 0.552±0.196 |
| GRU | fixed learning rate | RMSprop | 0.275±0.101 |
| GRU | fixed learning rate | SGD | 0.793±0.326 |
| GRU | fixed learning rate | AdaGrad | 0.536±0.226 |
| GRU | exponential decay | RMSprop | 0.284±0.046 |
| GRU | exponential decay | SGD | 0.695±0.530 |
| GRU | exponential decay | AdaGrad | 0.631±0.360 |
| GRU | step decay | Adam | 0.333±0.050 |
| GRU | step decay | RMSprop | 0.384±0.133 |
| GRU | step decay | SGD | 0.568±0.217 |
| GRU | step decay | AdaGrad | 0.304±0.169 |
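Entries of the form `mean±deviation` in a table like this are commonly produced by aggregating the per-run errors. A minimal sketch, with purely hypothetical error values standing in for the five runs of one (model, schedule, optimizer) combination:

```python
import statistics

# Hypothetical per-run average estimation errors for a single
# (model, schedule, optimizer) combination; the real values come
# from the five experiment runs reported in the table.
errors = [0.085, 0.101, 0.090, 0.098, 0.091]

mean = statistics.mean(errors)
# Population standard deviation; statistics.stdev (sample) is
# another common choice for the spread term.
std = statistics.pstdev(errors)

summary = f"{mean:.3f}±{std:.3f}"
```

Whether the spread term is a population or sample standard deviation (or a standard error) is an assumption here; the source does not specify which statistic follows the ± sign.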