Table 2: Selected MRPPO Hyperparameters.
| Hyperparameter | Value |
|---|---|
| Environmental Steps, τ | 4096 |
| Epochs, E | 5 |
| Mini Batches, M | 4 |
| Discount Factor, γ | 0.95 |
| GAE Factor, λ | 0.85 |
| Clipping Parameter, ε | $2 \times 10^{-3}$ |
| Value Function Coefficient, $c_1$ | 0.5 |
| Entropy Coefficient, $c_2$ | $1 \times 10^{-4}$ |
| Learning Rate, $l_r$ | $1 \times 10^{-4}$ |
| Maximum Gradient Norm, $c_{\max}$ | 0.5 |
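
For readers who want to reuse these settings, the table can be captured as a small configuration object. The following is a minimal sketch, not the authors' implementation: the class name `MRPPOConfig` and all field names are hypothetical, and the two derived properties assume that τ is the number of environment steps collected per policy update and that those steps are split evenly across the M mini-batches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MRPPOConfig:
    """Hyperparameter values copied from Table 2; names are illustrative only."""

    env_steps: int = 4096         # Environmental Steps, tau
    epochs: int = 5               # Epochs, E
    mini_batches: int = 4         # Mini Batches, M
    gamma: float = 0.95           # Discount Factor
    gae_lambda: float = 0.85      # GAE Factor
    clip_eps: float = 2e-3        # Clipping Parameter, epsilon
    value_coef: float = 0.5       # Value Function Coefficient, c_1
    entropy_coef: float = 1e-4    # Entropy Coefficient, c_2
    learning_rate: float = 1e-4   # Learning Rate
    max_grad_norm: float = 0.5    # Maximum Gradient Norm, c_max

    @property
    def mini_batch_size(self) -> int:
        # Assumption: the collected steps are divided evenly among mini-batches.
        return self.env_steps // self.mini_batches

    @property
    def updates_per_rollout(self) -> int:
        # Assumption: every epoch makes one gradient step per mini-batch.
        return self.epochs * self.mini_batches


config = MRPPOConfig()
print(config.mini_batch_size)      # 1024
print(config.updates_per_rollout)  # 20
```

Under these assumptions, each batch of 4096 collected steps yields mini-batches of 1024 samples and 20 gradient updates before new data is gathered.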