Table 1.
Hyperparameter | Values considered |
---|---|
Preprocessing | norm; norm+tanh; norm+tanh+norm |
Hidden units | [8192, 8192]; [4096, 4096]; [2048, 2048]; [8192, 4096]; [4096, 2048]; [4096, 4096, 4096]; [2048, 2048, 2048]; [4096, 2048, 1024]; [8192, 4096, 2048] |
Learning rates | $10^{-2}$; $10^{-3}$; $10^{-4}$; $10^{-5}$ |
Dropout | no dropout; input: 0.2, hidden: 0.5 |
Note: All possible combinations of the listed hyperparameter values were evaluated via grid search.
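
As an illustration of what the grid search in Table 1 covers, the following minimal sketch enumerates every combination of the listed values. The names `search_space` and `enumerate_grid` are hypothetical and chosen only for this example; the training and validation step applied to each configuration is not described in the table and is therefore left out.

```python
from itertools import product

# Search space transcribed from Table 1 (key names are illustrative assumptions).
search_space = {
    "preprocessing": ["norm", "norm+tanh", "norm+tanh+norm"],
    "hidden_units": [
        (8192, 8192), (4096, 4096), (2048, 2048),
        (8192, 4096), (4096, 2048),
        (4096, 4096, 4096), (2048, 2048, 2048),
        (4096, 2048, 1024), (8192, 4096, 2048),
    ],
    "learning_rate": [1e-2, 1e-3, 1e-4, 1e-5],
    # None stands for "no dropout"; otherwise input and hidden dropout rates.
    "dropout": [None, {"input": 0.2, "hidden": 0.5}],
}


def enumerate_grid(space):
    """Yield one dict per combination of hyperparameter values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


configs = list(enumerate_grid(search_space))
print(len(configs))  # 3 * 9 * 4 * 2 = 216
```

Under these assumptions the grid contains 216 configurations, each of which would be trained and scored separately.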