Table 1. Evaluation of Networks from Combination of Top Hyperparameters Selected in Each Optimization Phase
| Epochs | Batch Size | Weight Initializer | Optimizer | Activation | Hidden Layers | Nodes per Hidden Layer | Dropout Rate | Mean MSE | Mean MAE |
|---|---|---|---|---|---|---|---|---|---|
| 50 | 16 | Truncated Normal | Adagrad | ReLU | 2 | 300 | 0 | 108.742 | 6.6 |
| 50 | 16 | Truncated Normal | Adagrad | ReLU | 2 | 200 | 0 | 101.4 | 6.6 |
| 75 | 32 | Truncated Normal | Adagrad | ReLU | 2 | 300 | 0 | 116.341 | 6.2 |
| 75 | 32 | Truncated Normal | Adagrad | ReLU | 2 | 200 | 0 | 99.2 | 5.9 |
| 50 | 16 | Truncated Normal | Adagrad | ELU | 2 | 300 | 0 | 90.3 | 5.9 |
| 50 | 16 | Truncated Normal | Adagrad | ELU | 2 | 200 | 0 | 95.7 | 6.1 |
| 75 | 32 | Truncated Normal | Adagrad | ELU | 2 | 300 | 0 | 87.4 | 5.6 |
| **75** | **32** | **Truncated Normal** | **Adagrad** | **ELU** | **2** | **200** | **0** | **87.2** | **5.68** |
The network with the lowest mean MSE, which was used to build KiDNN, is shown in bold. MSE, mean squared error; MAE, mean absolute error; ReLU, rectified linear unit; ELU, exponential linear unit; Adagrad, adaptive gradient.
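To make the selected configuration concrete, the following is a minimal sketch in Keras/TensorFlow of the bolded network from Table 1 (two hidden layers of 200 nodes, ELU activation, truncated-normal weight initialization, Adagrad optimizer, no dropout, trained for 75 epochs with a batch size of 32). The input dimension `n_features`, the single regression output, the default learning rate, and the MSE loss are assumptions for illustration and are not specified in the table.

```python
# Sketch of the Table 1 best-performing configuration (assumed, not the authors' code).
from tensorflow import keras
from tensorflow.keras import layers

n_features = 1024  # assumed input size; replace with the real feature length

model = keras.Sequential([
    layers.Input(shape=(n_features,)),
    # Two hidden layers, 200 nodes each, ELU activation, truncated-normal init
    layers.Dense(200, activation="elu", kernel_initializer="truncated_normal"),
    layers.Dense(200, activation="elu", kernel_initializer="truncated_normal"),
    layers.Dense(1),  # single regression output (assumed)
])

# MSE loss and MAE metric mirror the evaluation columns of Table 1.
model.compile(optimizer=keras.optimizers.Adagrad(), loss="mse", metrics=["mae"])

# X_train / y_train are placeholders for the actual training data:
# model.fit(X_train, y_train, epochs=75, batch_size=32, validation_split=0.1)
```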