Table 6.
Training parameter settings for DDPG.
| Training Parameters | Values |
|---|---|
| Size of experience pool | 5000 |
| Number of cycles in the outer layer | 2000 |
| Time step of inner layer | 7 |
| Adjustment time of inner action space | 60 |
| Threshold of discarding the sample | 0.5 |
| Discount factor | 0.9 |
| Learning rate of actual network | 0.0001 |
| Learning rate of target network | 0.01 |
| Soft update rate | 0.001 |
| Exploration noise | 0.01 |
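
As a reading aid, the settings in Table 6 could be collected in a single configuration object, as in the minimal sketch below. The class name `DDPGTrainingConfig`, all field names, and the helper `soft_update` are hypothetical and chosen only for illustration; the values are copied from the table. The helper shows the standard DDPG soft update of the target parameters, which may differ in detail from the implementation used in this work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DDPGTrainingConfig:
    """Values copied from Table 6; the field names are illustrative only."""
    experience_pool_size: int = 5000          # size of experience pool
    outer_cycles: int = 2000                  # number of cycles in the outer layer
    inner_time_steps: int = 7                 # time step of inner layer
    inner_action_adjustment_time: int = 60    # adjustment time of inner action space (unit not stated in the table)
    sample_discard_threshold: float = 0.5     # threshold of discarding the sample
    discount_factor: float = 0.9
    lr_actual_network: float = 1e-4           # learning rate of actual network
    lr_target_network: float = 1e-2           # learning rate of target network
    soft_update_rate: float = 1e-3
    exploration_noise: float = 0.01


def soft_update(target_params, online_params, tau):
    """Standard DDPG soft update: target <- tau * online + (1 - tau) * target.

    Parameters are plain lists of floats here to keep the sketch dependency-free.
    """
    return [tau * w + (1.0 - tau) * w_t
            for w, w_t in zip(online_params, target_params)]


config = DDPGTrainingConfig()

# Toy example: with a soft update rate of 0.001, the target parameters
# move only a small step toward the online parameters on each update.
new_target = soft_update([0.0, 1.0], [1.0, 1.0], config.soft_update_rate)
print(new_target)
```

With a soft update rate this small, the target parameters track the online parameters slowly, which is the usual reason for choosing a value well below 1.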