Table 3.
Comparison of experimental evaluation processes.
Policy | Evaluation: Mean_length | Evaluation: Mean_reward | Evaluation: Success_rate | Description |
---|---|---|---|---|
DRLNDT | 110.055 | 80.3718 | 0.99502488 | Transformer + SAC |
Baseline | 181.065 | 54.3497 | 0.59701493 | RNN + SAC |
Standard SAC | 218.16 | 65.6080 | 0.7312 | |
DRLNDT-n-10 | 314.845 | 27.8312 | 0.398 | Historical state length n = 10 |
DRLNDT-n-20 | 179.08 | 20.5407 | 0 | Historical state length n = 20 |
DRLNDT-n-30 | 175.09 | 24.3238 | 0 | Historical state length n = 30 |
DRLNDT-n-40 | 110.12 | 80.3325 | 0.99502488 | Historical state length n = 40 |
DRLNDT-n-60 | 108.025 | 80.9850 | 0.995 | Historical state length n = 60 |
DRLNDT-n-70 | 140.1 | 80.7300 | 0.995 | Historical state length n = 70 |
DRLNDT-n-80 | 155.255 | 14.7500 | 0 | Historical state length n = 80 |
DRLNDT-w-0.125 | 166.01 | 31.7600 | 0 | Potential_reward_w = 0.125 |
DRLNDT-w-0.175 | 359.04 | 80.3810 | 0.99502 | Potential_reward_w = 0.175 |
DRLNDT-w-0.25 | 110.015 | 80.5790 | 0.99502 | Potential_reward_w = 0.25 |
DRLNDT-w-0.5 | 119.025 | 80.7813 | 0.99502 | Potential_reward_w = 0.5 |
Modular Pipi | 1113 | 83.0000 | 0.99 | |