Research on AGV Path Planning Based on Improved DQN Algorithm

2025 Jul 29;25(15):4685. doi: 10.3390/s25154685

Algorithm 1: B-PER DQN algorithm

1. Initialize the prioritized experience replay buffer D with capacity C
2. Initialize environment parameters and training parameters
3. For episode = 1 to M do
4. Reset the environment, get the initial state s, initialize the episode reward to 0
5. While not done:
6. Select the action a according to Equation (8)
7. Execute the action a and store (s, s', r, d) in D
8. If C > 256:
9. Sample a batch of batch-size transitions from D according to Equation (5)
10. Calculate and update the network parameters according to Equation (11)
11. End if
12. Dynamically adjust τ according to Equation (10)
13. Decay ε according to Equation (9)
14. End for
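
The control flow of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation. Equations (5) and (8)-(11) are not reproduced in this section, so generic stand-ins are assumed: ε-greedy action selection for step 6, proportional prioritized sampling for step 9, a one-step TD update for step 10, a soft target update with a fixed τ for step 12 (the paper adjusts τ dynamically), and multiplicative ε decay applied once per episode for step 13. A Q-table replaces the neural network so that the example runs with the standard library only, the `Corridor` environment and all hyperparameter values are hypothetical, the condition in step 8 is read as "more than 256 transitions stored", and the action is stored with each transition because the update needs it.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay; a generic stand-in for buffer D."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data, self.priorities, self.pos = [], [], 0

    def __len__(self):
        return len(self.data)

    def add(self, transition):
        # New transitions inherit the current maximum priority,
        # so they are likely to be sampled soon.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:  # buffer full: overwrite the oldest slot
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        # Sampling probability is proportional to priority ** alpha.
        weights = [p ** self.alpha for p in self.priorities]
        idxs = random.choices(range(len(self.data)), weights=weights, k=batch_size)
        return idxs, [self.data[i] for i in idxs]

    def update(self, idxs, td_errors, floor=1e-3):
        # Priority = |TD error| plus a small floor so nothing is starved.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + floor


class Corridor:
    """Hypothetical 1-D track: start in cell 0, goal in the last cell."""

    def __init__(self, n=6):
        self.n = n

    def reset(self):
        self.s = 0
        return self.s

    def step(self, a):  # a: 0 = left, 1 = right
        self.s = max(0, min(self.n - 1, self.s + (1 if a == 1 else -1)))
        done = self.s == self.n - 1
        return self.s, (1.0 if done else -0.01), done


def train(episodes=300, capacity=2000, warmup=256, batch_size=32, gamma=0.95,
          lr=0.1, tau=0.05, eps=1.0, eps_decay=0.98, eps_min=0.05, seed=0):
    random.seed(seed)
    env = Corridor()
    q = [[0.0, 0.0] for _ in range(env.n)]   # online estimates (network stand-in)
    q_target = [row[:] for row in q]         # target copy
    buf = PrioritizedReplayBuffer(capacity)  # step 1
    for _ in range(episodes):                # step 3
        s, done, steps = env.reset(), False, 0          # step 4
        while not done and steps < 100:                 # step 5 (with a step cap)
            if random.random() < eps:                   # step 6: epsilon-greedy
                a = random.randrange(2)
            else:
                a = max((0, 1), key=lambda x: q[s][x])
            s2, r, done = env.step(a)                   # step 7
            buf.add((s, a, r, s2, done))
            if len(buf) > warmup:                       # step 8
                idxs, batch = buf.sample(batch_size)    # step 9
                errs = []
                for bs, ba, br, bs2, bd in batch:       # step 10: TD update
                    target = br + (0.0 if bd else gamma * max(q_target[bs2]))
                    err = target - q[bs][ba]
                    q[bs][ba] += lr * err
                    errs.append(err)
                buf.update(idxs, errs)
            for i in range(env.n):                      # step 12: soft update, fixed tau
                for j in range(2):
                    q_target[i][j] += tau * (q[i][j] - q_target[i][j])
            s, steps = s2, steps + 1
        eps = max(eps_min, eps * eps_decay)             # step 13
    return q, eps
```

Calling `train()` returns the learned table and the final ε. In this toy setting the greedy action in every non-terminal cell is expected to point toward the goal. Importance-sampling weights, which prioritized replay commonly uses to correct sampling bias, are omitted here for brevity.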