. 2022 Feb 18;12(2):309. doi: 10.3390/jpm12020309
Algorithm 1. QL_MAS
 Initialize: the capacity Cap of the memory M; the values Q: ∀r, a | Q(r, a) = 0; the weights of the estimation LSTM-DQN, θ = θ_0; the weights of the target LSTM-DQN, θ′
 For episode = 1 → ep do # ep is the number of episodes
  Fix the initial positions of the agents according to the map from the RAG
  For i = 1 → Reg do # Reg is the number of regions
   N = Reg # N is the number of agents
   Place an agent A_i in region i
  End for
  Do while t < Cap
   For j = 1 → N do
    Compute the initial actions, Eq. (2)
    Compute the initial Q-values, Eq. (1)
    Identify the best adjacent neighbors that satisfy the similarity criteria
    Negotiate to decide the optimal proposal
   End for
   Fusion
   Update the map of the regions
   Update Reg
   N = Reg
   For j = 1 → N do
    Compute the actions, Eq. (2)
    Compute the Q-values, Eq. (1)
    Compute the reward, Eq. (4)
    Compute the next state, Eq. (3)
    Save data d = {state(t), action(t), R(t), state(t + 1)} in memory M
   End for
  End Do
  Reset
 End for
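
The control flow of one episode of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the LSTM-DQN, the Q-values, and Equations (1)–(4) are omitted, and several details are assumptions made only for illustration. Each region is summarized by a hypothetical mean intensity, the similarity criterion is an absolute difference below a hypothetical `threshold`, negotiation is reduced to "two adjacent agents fuse only if each proposes the other", the reward is a placeholder, and `t` is assumed to count stored transitions. The names `best_neighbor`, `fuse`, and `fill_memory` are likewise hypothetical.

```python
from collections import deque, namedtuple

# One stored sample d = {state(t), action(t), R(t), state(t + 1)}.
Transition = namedtuple("Transition", "state action reward next_state")

def best_neighbor(region, means, adjacency, threshold):
    """Most similar adjacent region within `threshold`, or None (assumed criterion)."""
    candidates = [(abs(means[region] - means[n]), n)
                  for n in adjacency[region]
                  if abs(means[region] - means[n]) <= threshold]
    return min(candidates)[1] if candidates else None

def fuse(a, b, means, sizes, adjacency):
    """Merge region b into region a and update the region adjacency graph."""
    total = sizes[a] + sizes[b]
    means[a] = (means[a] * sizes[a] + means[b] * sizes[b]) / total
    sizes[a] = total
    for n in adjacency.pop(b):          # former neighbors of b
        adjacency[n].discard(b)
        if n != a:                      # re-attach them to the merged region
            adjacency[n].add(a)
            adjacency[a].add(n)
    del means[b], sizes[b]

def fill_memory(means, sizes, adjacency, cap, threshold=10.0):
    """Run the inner loop until the memory M holds `cap` transitions."""
    memory = deque(maxlen=cap)          # memory M with capacity Cap
    t = 0                               # assumption: t counts stored transitions
    while t < cap:
        # Each agent proposes its best neighbor; mutual proposals are accepted.
        proposals = {r: best_neighbor(r, means, adjacency, threshold) for r in means}
        pairs = {tuple(sorted((r, p))) for r, p in proposals.items()
                 if p is not None and proposals[p] == r}
        merged = set()
        for a, b in sorted(pairs):      # fusion, then the map and Reg are updated
            fuse(a, b, means, sizes, adjacency)
            merged.add(a)
        # One agent per remaining region (N = Reg) stores a transition.
        for r in sorted(means):
            if t >= cap:
                break
            action = "merge" if r in merged else "stay"   # stand-in for Eq. (2)
            reward = 1.0 if r in merged else 0.0          # placeholder, not Eq. (4)
            state = means[r]
            memory.append(Transition(state, action, reward, state))
            t += 1
    return memory
```

Calling `fill_memory` on a three-region chain whose first two regions are similar fuses those two regions in the first pass and then keeps storing transitions for the two remaining agents until the memory is full. A bounded `deque` is used here so that the capacity limit is enforced by the container itself.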