Table 2. Summary of the best hyperparameters and fitness values at generation 50.
| Parameter name | Symbol | R | R + P | R + N | R + P + N |
|---|---|---|---|---|---|
| CACLA + var beta | β | 0.0251* | 0.01685 | 0.0001* | 0.0156 |
| Initial variance | var0 | 5.1 | 4.4 | 4.7 | 4.3 |
| Initial iterations | | 10 | 10 | 11 | 10 |
| Discount factor | γ | 0.836 | 0.887 | 0.715 | 0.868 |
| Exploration rate | σ | 1.4 | 1.2 | 1.3 | 0.9 |
| Exploration rate decay | κ | 1 | 1 | 1 | 1 |
| Learning rate Critic | η | 0.0001* | 0.0276 | 0.0601 | 0.0076 |
| Critic MLP in → h1st | Cin→h1st | 35 | 35 | 25 | 30 |
| Critic MLP h2nd → out | Ch2nd→out | 10 | 15 | 10 | 15 |
| Learning rate Actor | α | 0.0426 | 0.1151 | 0.0226 | 0.0751 |
| Actor MLP in → h1st | Ain→h1st | 30 | 30 | 30 | 30 |
| Actor MLP h2nd → out | Ah2nd→out | 15 | 15 | 10 | 10 |
| Reward | R | 10 | 10 | 10 | 10 |
| Punishment | P | N.A. | −0.1 | N.A. | −0.1 |
| Avg. fitness (SD) | | 6.253625 (±5.480689) | 8.939111 (±6.389975) | 4.765209 (±5.280052) | 8.864413 (±6.943758) |
| Best fitness | | 1.504085 | 1.550623 | 1.425379 | 1.377808 |
Fitness is the total reaching distance, in meters, on the testing set; smaller values are better.
N.A., not applicable.
The star (*) indicates values that reached their maximum or minimum allowed value.
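
Because smaller fitness is better, the two fitness rows can be compared with a simple minimum over the four conditions. The sketch below copies the values from the table; the dictionary layout and the helper name `lowest` are illustrative assumptions, not part of the original work. Note that the standard deviations are large relative to the means, so the ordering of the averages should be read with caution.

```python
# Fitness values copied from Table 2 (total reaching distance in meters;
# smaller is better). The data layout and names here are illustrative only.
avg_fitness = {
    "R": 6.253625,
    "R + P": 8.939111,
    "R + N": 4.765209,
    "R + P + N": 8.864413,
}
best_fitness = {
    "R": 1.504085,
    "R + P": 1.550623,
    "R + N": 1.425379,
    "R + P + N": 1.377808,
}


def lowest(scores):
    """Return the (condition, value) pair with the smallest fitness."""
    return min(scores.items(), key=lambda item: item[1])


lowest_avg = lowest(avg_fitness)
lowest_best = lowest(best_fitness)
print(lowest_avg)   # ('R + N', 4.765209)
print(lowest_best)  # ('R + P + N', 1.377808)
```

On these numbers, R + N has the lowest average fitness, while R + P + N contains the single best individual.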