Safe Decision Controller for Autonomous Driving Based on Deep Reinforcement Learning in Nondeterministic Environment

2023 Jan 20;23(3):1198. doi: 10.3390/s23031198

Algorithm 1 DQN Algorithm

Input: $MDP = (S, A, T, R)$, $done$, $update$

Output: Optimal policy $\pi$

1: Init($\pi$)
2: $s_{t} \leftarrow s_{0}$
3: $a \leftarrow NOP$
4: while $s_{t} \neq done$ and $episode < episodes$ do
5:     // Iterative selection of optimal value
6:     $A' = A$
7:     $a \leftarrow Random\_Select(A')$
8:     $update(s_{t}, a, \pi)$ // Update parameters
9:     Go to the next state $s_{t+1}$
10: end while
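The control flow of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation. The deep Q-network is replaced by a tabular Q-function so the example stays self-contained, and the environment `ChainMDP`, the helpers `train` and `greedy_policy`, the `max_steps` cap, and the values of `alpha` and `gamma` are all hypothetical choices made for illustration. The `update` function here also takes the reward and next state explicitly, which Algorithm 1 leaves implicit in $update(s_{t}, a, \pi)$.

```python
import random
from collections import defaultdict

NOP = 0  # hypothetical "no operation" action id (assumption)


class ChainMDP:
    """Toy 5-state chain (assumption): action 1 moves right, action 0 stays.
    Reaching the last state ends the episode with reward 1."""

    actions = (0, 1)

    def __init__(self, n_states=5):
        self.n_states = n_states

    def reset(self):
        return 0  # initial state s_0

    def step(self, state, action):
        next_state = min(state + action, self.n_states - 1)
        done = next_state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return next_state, reward, done


def update(q, s, a, r, s_next, done, actions, alpha=0.5, gamma=0.9):
    """One temporal-difference update; a tabular stand-in for the DQN step."""
    target = r if done else r + gamma * max(q[(s_next, b)] for b in actions)
    q[(s, a)] += alpha * (target - q[(s, a)])


def train(env, episodes=200, max_steps=100, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)                  # step 1: Init(pi)
    for _ in range(episodes):               # episode < episodes
        s = env.reset()                     # step 2: s_t <- s_0
        a = NOP                             # step 3: a <- NOP
        for _ in range(max_steps):          # step 4: loop until done
            candidates = list(env.actions)  # step 6: A' = A
            a = rng.choice(candidates)      # step 7: Random_Select(A')
            s_next, r, done = env.step(s, a)
            update(q, s, a, r, s_next, done, env.actions)  # step 8
            s = s_next                      # step 9: go to s_{t+1}
            if done:
                break
    return q


def greedy_policy(q, env):
    """Read the learned policy off the Q-table for all non-terminal states."""
    return {s: max(env.actions, key=lambda b: q[(s, b)])
            for s in range(env.n_states - 1)}
```

Calling `greedy_policy(train(ChainMDP()), ChainMDP())` returns the action preferred in each non-terminal state of the toy chain. Because actions are drawn uniformly at random, as in step 7, exploration is independent of the current value estimates, and the policy is only extracted after training.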