Table 2. DDPG setup hyper-parameters.
Hyper-parameter | Value |
---|---|
Actor/Critic learning rate | 1 × 10⁻³ |
Reward discount factor | 0.9 |
Soft replacement | 0.01 |
Batch size | 32 |
Running episodes | 300 |
Number of track points | 50 |
Training steps per update | 200 |
Memory capacity | 80,000 |
Updates | episodes × points × steps |
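The settings in Table 2 can be gathered into a single configuration object, which also makes the update count in the last row concrete. The following is a minimal sketch, not the authors' implementation: the class name `DDPGConfig`, its field names, and the helper `soft_update` are hypothetical, and it assumes that "soft replacement" denotes the usual DDPG target-network mixing coefficient τ and that "steps" in the last row refers to the training steps per update.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DDPGConfig:
    """Hyper-parameters from Table 2 (field names are illustrative)."""
    actor_lr: float = 1e-3         # actor learning rate
    critic_lr: float = 1e-3        # critic learning rate
    gamma: float = 0.9             # reward discount factor
    tau: float = 0.01              # soft replacement coefficient (assumed meaning)
    batch_size: int = 32
    episodes: int = 300            # running episodes
    track_points: int = 50         # number of track points
    steps_per_update: int = 200    # training steps per update
    memory_capacity: int = 80_000  # replay memory size

    @property
    def total_updates(self) -> int:
        # Last row of Table 2: episodes x points x steps
        return self.episodes * self.track_points * self.steps_per_update


def soft_update(target, source, tau):
    """Move each target parameter a fraction tau toward the source parameter."""
    return [(1.0 - tau) * t + tau * s for t, s in zip(target, source)]


cfg = DDPGConfig()
print(cfg.total_updates)  # 300 * 50 * 200 = 3000000
```

Under these assumptions the listed values give 3,000,000 updates in total, and with τ = 0.01 each soft replacement moves the target networks 1% of the way toward the learned networks.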