Table 7. Overview of Combinations of Hyperparameters Explored.
| number of layers | parameter | values<sup>a</sup> |
|---|---|---|
| 4 | neurons | (8000, 4000, 1000, X), (4000, 2000, 500, X) |
| | dropout rate | 0, 0.3 |
| | regularizer rate | 0.000001 |
| | learning rate | 0.0001 |
| 5 | neurons | (9000, 4000, 1000, 100, X), (5000, 2000, 1000, 100, X) |
| | regularizer rate | 0, 0.0000001 |
| | learning rate | 0.0001 |
<sup>a</sup>“X” in the number of neurons denotes the number of end points employed for each multi-task model (i.e., the number of neurons in the output layer).
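
As an illustration only, the grid in Table 7 could be enumerated as in the sketch below. It assumes that every listed value of one parameter was combined with every listed value of the others within the same layer count (a full Cartesian product), which the table does not state explicitly. The names `GRID`, `expand`, and `n_endpoints` are hypothetical, and the value passed for `n_endpoints` is a placeholder rather than a figure from the study.

```python
from itertools import product

# Values transcribed from Table 7; "X" is the placeholder for the output-layer size.
# No dropout rate is listed for the 5-layer networks, so none is set here.
GRID = {
    4: {
        "neurons": [(8000, 4000, 1000, "X"), (4000, 2000, 500, "X")],
        "dropout_rate": [0, 0.3],
        "regularizer_rate": [1e-6],
        "learning_rate": [1e-4],
    },
    5: {
        "neurons": [(9000, 4000, 1000, 100, "X"), (5000, 2000, 1000, 100, "X")],
        "regularizer_rate": [0, 1e-7],
        "learning_rate": [1e-4],
    },
}


def expand(grid, n_endpoints):
    """Return one config dict per combination (assumes a full Cartesian product)."""
    configs = []
    for n_layers, params in grid.items():
        names = list(params)
        for values in product(*(params[name] for name in names)):
            config = dict(zip(names, values))
            # Replace the "X" placeholder with the number of end points.
            config["neurons"] = tuple(
                n_endpoints if n == "X" else n for n in config["neurons"]
            )
            config["n_layers"] = n_layers
            configs.append(config)
    return configs


# n_endpoints=12 is an arbitrary illustrative value.
configs = expand(GRID, n_endpoints=12)
for config in configs:
    print(config)
```

Under the full-product assumption, each layer count contributes two neuron layouts times two values of the one varied rate, so the sketch yields four configurations for the 4-layer networks and four for the 5-layer networks.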