Table 5. Parameter configuration of the actor-critic neural networks and agent.
| Parameter | Actor network | Critic network |
|---|---|---|
| Input layer size | 19 | 19 |
| Hidden layer 1 size | 150 | 150 |
| Hidden layer 1 activation | Tanh | Tanh |
| Hidden layer 2 size | 150 | 150 |
| Hidden layer 2 activation | Tanh | Tanh |
| Hidden layer 3 size | 100 | 100 |
| Hidden layer 3 activation | ReLU | ReLU |
| Output layer size | 27 | 1 |
| AC agent parameters | | |
| Number of steps to look ahead | 70 | 70 |
| Learning rate | 0.001 | 0.001 |
| Entropy loss weight | 0.25 | 0.25 |
| Gradient threshold | 1 | 1 |
| Discount factor | 0.91 | 0.91 |
| Max number of episodes | 5,000 | 5,000 |
| Max steps per episode | 4,000 | 4,000 |
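
The layer configuration in Table 5 can be checked with a short, dependency-free sketch. The code below is illustrative only and is not the authors' implementation: it assumes fully connected layers with biases, a linear output layer, that the 27 actor outputs are logits over discrete actions, and an arbitrary uniform weight initialisation. All function and variable names are hypothetical.

```python
import math
import random

# Layer widths and activations transcribed from Table 5.
INPUT_SIZE = 19
HIDDEN = [(150, "tanh"), (150, "tanh"), (100, "relu")]
ACTOR_OUTPUTS = 27   # assumption: logits over 27 discrete actions
CRITIC_OUTPUTS = 1   # scalar state-value estimate

ACTIVATIONS = {
    "tanh": math.tanh,
    "relu": lambda x: max(0.0, x),
    "linear": lambda x: x,  # assumption: no activation on the output layer
}


def layer_specs(n_outputs):
    """Return (fan_in, fan_out, activation) for every dense layer."""
    sizes = [INPUT_SIZE] + [width for width, _ in HIDDEN] + [n_outputs]
    acts = [act for _, act in HIDDEN] + ["linear"]
    return [(sizes[i], sizes[i + 1], acts[i]) for i in range(len(acts))]


def count_parameters(n_outputs):
    """Weights plus biases, assuming fully connected layers with biases."""
    return sum(fi * fo + fo for fi, fo, _ in layer_specs(n_outputs))


def init_network(n_outputs, seed=0):
    """Small random weights and zero biases (initialisation is an assumption)."""
    rng = random.Random(seed)
    net = []
    for fan_in, fan_out, act in layer_specs(n_outputs):
        scale = 1.0 / math.sqrt(fan_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        biases = [0.0] * fan_out
        net.append((weights, biases, act))
    return net


def forward(net, x):
    """Propagate one observation vector through the network."""
    for weights, biases, act in net:
        f = ACTIVATIONS[act]
        x = [f(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x


def softmax(logits):
    """Numerically stable softmax turning actor logits into probabilities."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


actor = init_network(ACTOR_OUTPUTS, seed=1)
critic = init_network(CRITIC_OUTPUTS, seed=2)
observation = [0.0] * INPUT_SIZE  # placeholder observation, not real data

action_probs = softmax(forward(actor, observation))
state_value = forward(critic, observation)[0]

print(len(action_probs), count_parameters(ACTOR_OUTPUTS),
      count_parameters(CRITIC_OUTPUTS))
# prints: 27 43477 40851
```

Under these assumptions the actor has 43,477 trainable parameters and the critic 40,851; the two networks differ only in the final layer (100 × 27 + 27 versus 100 × 1 + 1).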