2023 Jan 20;23(3):1230. doi: 10.3390/s23031230
Algorithm 1 Procedure of DQN Optimizer and Classifier.
  Initialize replay memory
  Initialize action-value function Q with random weights
  for episode = 1, M do
      for t = 1, T do
          With probability ϵ, select a random action
          if the random action is a feature:
              Execute the action in the emulator and observe the reward
              Set state and preprocess policy
              Store the transition in replay memory
              Perform a gradient descent step
          if the random action is a subject number:
              Execute the action in the emulator and observe the reward
              Set state and preprocess policy
              Store the transition in replay memory
      end for
  end for
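
The control flow of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `ToyEmulator` class, its reward values, the linear Q-function, the greedy choice taken when no random action is drawn, and all sizes and hyperparameters are hypothetical stand-ins. Only the overall structure follows the listing: a replay memory, randomly initialized Q weights, an exploration probability ϵ, and a gradient descent step performed in the feature branch but not in the subject-number branch.

```python
import random
from collections import deque

N_FEATURES = 4   # hypothetical number of selectable features
N_SUBJECTS = 3   # hypothetical number of subject labels
N_ACTIONS = N_FEATURES + N_SUBJECTS
STATE_DIM = N_FEATURES


class ToyEmulator:
    """Hypothetical stand-in for the emulator; not the paper's environment."""

    def __init__(self, rng):
        self.rng = rng
        self.reset()

    def reset(self):
        self.true_subject = self.rng.randrange(N_SUBJECTS)
        self.state = [0.0] * STATE_DIM
        return list(self.state)

    def step(self, action):
        if action < N_FEATURES:      # feature action: mark feature as selected
            self.state[action] = 1.0
            reward = -0.1            # assumed small selection cost
        else:                        # subject action: guess the subject number
            guess = action - N_FEATURES
            reward = 1.0 if guess == self.true_subject else -1.0
        return list(self.state), reward


def q_values(weights, state):
    """Linear Q-function: one weight vector (last entry is a bias) per action."""
    return [w[-1] + sum(wi * si for wi, si in zip(w, state)) for w in weights]


def gradient_step(weights, batch, gamma, lr):
    """One semi-gradient step on the squared TD error per sampled transition."""
    for state, action, reward, next_state in batch:
        target = reward + gamma * max(q_values(weights, next_state))
        error = q_values(weights, state)[action] - target
        w = weights[action]
        for i, si in enumerate(state):
            w[i] -= lr * error * si
        w[-1] -= lr * error


def train(episodes=20, steps=10, epsilon=0.2, gamma=0.9, lr=0.05, seed=0):
    rng = random.Random(seed)
    memory = deque(maxlen=500)                      # replay memory
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(STATE_DIM + 1)]
               for _ in range(N_ACTIONS)]           # random initial Q weights
    env = ToyEmulator(rng)
    updates = 0
    for _ in range(episodes):
        state = env.reset()
        for _ in range(steps):
            if rng.random() < epsilon:
                action = rng.randrange(N_ACTIONS)   # random action
            else:
                q = q_values(weights, state)
                action = q.index(max(q))            # greedy action (assumed)
            next_state, reward = env.step(action)
            memory.append((state, action, reward, next_state))
            if action < N_FEATURES:                 # feature branch only
                batch = rng.sample(list(memory), min(8, len(memory)))
                gradient_step(weights, batch, gamma, lr)
                updates += 1
            state = next_state
    return weights, len(memory), updates
```

Calling `train()` returns the learned weights, the number of stored transitions, and the number of gradient steps taken. Because the update runs only after feature actions, the number of gradient steps is at most the number of stored transitions. A neural network and a separate target network, as in standard DQN, could replace the linear Q-function without changing this loop structure.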