Table 3. The 10 best-performing hyperparameter combinations that advanced to fine-tuning.
See Materials and methods and Table 1 for a detailed description of the hyperparameters.
| λ1 | λ2 | β | ρ | Activation | Learning rate | γ | Optimizer | Loss type | Hidden layers | Size ratio | Decay |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 0 | 0.01 | 0.01 | tanh | 1.0 × 10⁻⁴ | 0 | adam | CE | 4 | 1 | 0.95 |
| 0.1 | 0 | 1 | 0.5 | sigmoid | 1.0 × 10⁻⁴ | 1 | adam | CE | 2 | 0.9 | 0.95 |
| 0.1 | 0 | 5 | 0.5 | sigmoid | 1.0 × 10⁻¹ | 4 | adam | CE | 2 | 0.5 | 0 |
| 0.1 | 0 | 1 | 0.005 | relu | 1.0 × 10⁻¹ | 4 | adam | FL | 6 | 1 | 0.25 |
| 0.1 | 0 | 5 | 0.01 | relu | 1.0 × 10⁻⁵ | 5 | adam | FL | 4 | 1 | 0.95 |
| 0.1 | 0 | 0.01 | 0.1 | leakyrelu | 1.0 × 10⁻⁵ | 0 | adam | FL | 8 | 0.9 | 0.95 |
| 0.1 | 0 | 1 | 0.01 | tanh | 1.0 × 10⁻⁴ | 0 | adam | CE | 6 | 1 | 0.95 |
| 0 | 1.0 × 10⁻⁸ | 0.001 | 0.05 | relu | 1.0 × 10⁻⁵ | 4 | adam | CE | 8 | 0.6 | 0.95 |
| 0.1 | 0 | 0 | 0.01 | relu | 1.0 × 10⁻¹ | 5 | adam | FL | 8 | 0.9 | 0 |
| 0.1 | 0 | 0.01 | 0.01 | tanh | 1.0 × 10⁻³ | 5 | adam | CE | 2 | 1 | 0.95 |