2024 Mar 19;18:1338189. doi: 10.3389/fnbot.2024.1338189

Table 3.

Comparison of experimental evaluation processes.

| Policy | Evaluation: Mean_length | Evaluation: Mean_reward | Evaluation: Success_rate | Description |
| --- | --- | --- | --- | --- |
| DRLNDT | 110.055 | 80.3718 | 0.99502488 | Transformer + SAC |
| Baseline | 181.065 | 54.3497 | 0.59701493 | RNN + SAC |
| Standard SAC | 218.16 | 65.6080 | 0.7312 | |
| DRLNDT-n-10 | 314.845 | 27.8312 | 0.398 | Historical state length n = 10 |
| DRLNDT-n-20 | 179.08 | 20.5407 | 0 | Historical state length n = 20 |
| DRLNDT-n-30 | 175.09 | 24.3238 | 0 | Historical state length n = 30 |
| DRLNDT-n-40 | 110.12 | 80.3325 | 0.99502488 | Historical state length n = 40 |
| DRLNDT-n-60 | 108.025 | 80.9850 | 0.995 | Historical state length n = 60 |
| DRLNDT-n-70 | 140.1 | 80.7300 | 0.995 | Historical state length n = 70 |
| DRLNDT-n-80 | 155.255 | 14.7500 | 0 | Historical state length n = 80 |
| DRLNDT-w-0.125 | 166.01 | 31.7600 | 0 | Potential_reward_w = 0.125 |
| DRLNDT-w-0.175 | 359.04 | 80.3810 | 0.99502 | Potential_reward_w = 0.175 |
| DRLNDT-w-0.25 | 110.015 | 80.5790 | 0.99502 | Potential_reward_w = 0.25 |
| DRLNDT-w-0.5 | 119.025 | 80.7813 | 0.99502 | Potential_reward_w = 0.5 |
| Modular Pipi | 1113 | 83.0000 | 0.99 | |
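
As a reading aid, the historical-state-length rows of Table 3 can be filtered programmatically to see which values of n reach a high success rate and which of those gives the shortest mean episode. The following is a minimal sketch, not part of the original study: the numbers are copied from the table, while the names `ABLATION_N`, `successful_lengths`, `shortest_episode` and the 0.9 success threshold are assumptions introduced only for illustration.

```python
# Rows copied from Table 3 (DRLNDT-n-* entries):
# historical state length n -> (mean_length, mean_reward, success_rate)
ABLATION_N = {
    10: (314.845, 27.8312, 0.398),
    20: (179.08, 20.5407, 0.0),
    30: (175.09, 24.3238, 0.0),
    40: (110.12, 80.3325, 0.99502488),
    60: (108.025, 80.9850, 0.995),
    70: (140.1, 80.7300, 0.995),
    80: (155.255, 14.7500, 0.0),
}


def successful_lengths(rows, threshold=0.9):
    """Return the history lengths whose success rate meets the threshold.

    The threshold is a hypothetical cut-off chosen for this example.
    """
    return sorted(n for n, (_, _, success) in rows.items() if success >= threshold)


def shortest_episode(rows, threshold=0.9):
    """Among the successful history lengths, return the one with the
    smallest mean episode length. Raises ValueError if none qualify."""
    candidates = successful_lengths(rows, threshold)
    return min(candidates, key=lambda n: rows[n][0])


if __name__ == "__main__":
    print(successful_lengths(ABLATION_N))
    print(shortest_episode(ABLATION_N))
```

With the values in the table, only n = 40, 60, and 70 pass this threshold, and n = 60 has the smallest mean episode length among them.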