2024 Mar 19;18:1338189. doi: 10.3389/fnbot.2024.1338189

Table 3.

Comparison of experimental evaluation processes.

Policy	Mean_length	Mean_reward	Success_rate	Description
DRLNDT	110.055	80.3718	0.99502488	Transformer + SAC
Baseline	181.065	54.3497	0.59701493	RNN + SAC
Standard SAC	218.16	65.6080	0.7312
DRLNDT-n-10	314.845	27.8312	0.398	Historical state length n = 10
DRLNDT-n-20	179.08	20.5407	0	Historical state length n = 20
DRLNDT-n-30	175.09	24.3238	0	Historical state length n = 30
DRLNDT-n-40	110.12	80.3325	0.99502488	Historical state length n = 40
DRLNDT-n-60	108.025	80.9850	0.995	Historical state length n = 60
DRLNDT-n-70	140.1	80.7300	0.995	Historical state length n = 70
DRLNDT-n-80	155.255	14.7500	0	Historical state length n = 80
DRLNDT-w-0.125	166.01	31.7600	0	Potential_reward_w = 0.125
DRLNDT-w-0.175	359.04	80.3810	0.99502	Potential_reward_w = 0.175
DRLNDT-w-0.25	110.015	80.5790	0.99502	Potential_reward_w = 0.25
DRLNDT-w-0.5	119.025	80.7813	0.99502	Potential_reward_w = 0.5
Modular Pipi	1113	83.0000	0.99