2022 Mar 17;22(6):2328. doi: 10.3390/s22062328
Algorithm 3 The two-layer hybrid MRA training algorithm.
  • 1:

    (Input) λi, μi, νi, ∀i, batch size η, learning rate α, minimum exploration rate ϵmin, discount factor ζ, exploration decay rate d, and convergence threshold ϱ;

  • 2:

    (Output) Learned DQN to decide Pi, θi, fi, ∀i, for (7);

  • 3:

    (Upper-layer DQN-based learning:)

  • 4:

    Initialize action a(0) and replay buffer D = ∅;

  • 5:

    for episode = 1 to M do

  • 6:

         Initialize state s(0);

  • 7:

         for time t=1 to N do

  • 8:

              Observe current state s(t) = {L(t), F(t)};

  • 9:

              ϵ=max(ϵ·d,ϵmin);

  • 10:

              if random number r<ϵ then

  • 11:

                   Select a(t) from A^F at random;

  • 12:

              else

  • 13:

                   Select a(t) = argmax_a Q*(s(t), a, ω);

  • 14:

              end if

  • 15:

              Observe next state s′;

  • 16:

              (Lower-layer game-theory-based iteration:)

  • 17:

              for each link i do

  • 18:

                   for iteration k=1 to K do

  • 19:

                        Update Pi[k] with (47);

  • 20:

                        Update θi[k] with (48);

  • 21:

                        if |Ui[k] − Ui[k−1]| ≤ ϱ then

  • 22:

                             k′ = k; break;

  • 23:

                        end if

  • 24:

                   end for

  • 25:

                   k* = min{k′, K};

  • 26:

                   Pi(t)=Pi[k*]; θi(t)=θi[k*];

  • 27:

              end for

  • 28:

              Determine Ui(t) based on Pi(t) and θi(t) in the lower layer, and fi(t) in the upper layer, ∀i;

  • 29:

              Store transition (s(t), a(t), r(t), s′) in D;

  • 30:

              Select η random samples (s(j),a(j),r(j),s(j+1)) from D;

  • 31:

              Calculate Q̂(s(j), a(j), ω) and perform SGD to find the optimal DNN weights ω*;

  • 32:

              Update ω=ω* for DQN in the upper layer;

  • 33:

              s(t) = s′;

  • 34:

         end for

  • 35:

    end for
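
The control flow of Algorithm 3 can be illustrated with a minimal Python sketch. It is not the authors' implementation: it covers only the exploration-rate decay (step 9), the ϵ-greedy selection (steps 10 to 14), the lower-layer iteration with early stopping (steps 18 to 26), and the mini-batch sampling from the replay buffer (step 30). The names `decay_epsilon`, `select_action`, `lower_layer`, and `sample_batch` are hypothetical, `update` and `utility` are stand-ins for the updates (47) and (48) and for Ui, and the stopping test is assumed to be "at most ϱ". The DQN itself is replaced by a plain list of Q-values.

```python
import random
from collections import deque


def decay_epsilon(eps, d, eps_min):
    """Step 9: eps = max(eps * d, eps_min)."""
    return max(eps * d, eps_min)


def select_action(q_values, eps, rng):
    """Steps 10-14: epsilon-greedy choice over a list of Q-values."""
    if rng.random() < eps:
        # Explore: pick any action index uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the index with the largest Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def lower_layer(update, utility, x0, K, rho):
    """Steps 18-26: iterate until the utility changes by at most rho.

    `update` and `utility` are hypothetical stand-ins for (47)-(48)
    and U_i. Returns the final iterate and the iterations used (k*).
    """
    x, u_prev = x0, utility(x0)
    for k in range(1, K + 1):
        x = update(x)
        u = utility(x)
        if abs(u - u_prev) <= rho:  # assumed comparison: "at most rho"
            return x, k
        u_prev = u
    return x, K  # no convergence within K iterations


def sample_batch(buffer, eta, rng):
    """Step 30: draw up to eta transitions uniformly from the buffer."""
    return rng.sample(list(buffer), min(eta, len(buffer)))


if __name__ == "__main__":
    rng = random.Random(0)
    replay = deque(maxlen=100)          # replay buffer D
    eps = 1.0
    for t in range(5):
        eps = decay_epsilon(eps, d=0.5, eps_min=0.2)
        action = select_action([0.1, 0.9, 0.3], eps, rng)
        # Toy lower layer: halve x each step, utility is x itself.
        x, k_star = lower_layer(lambda v: v / 2, lambda v: v, 1.0, 50, 0.01)
        replay.append((t, action, -x, t + 1))   # (s, a, r, s')
    print(len(sample_batch(replay, 3, rng)), k_star)
```

In this toy setting the lower layer stops at the first iteration whose utility change falls to ϱ or below, and otherwise runs all K iterations, which mirrors the role of k* in steps 25 and 26.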