2020 Aug 30;22(9):957. doi: 10.3390/e22090957
Algorithm 5: The algorithm for optimizing initial problems.
Require: α ∈ A, β ∈ B, S = {0, 1, …, Smax}, Reward r, Learning rate λ = {λ1, λ2, λ3},
  Greedy probability ε = {ε1, ε2, ε3} and Discount factor γ = {γ1, γ2, γ3}.
Ensure: Optimal action configuration α,β,S in each state.
  1: Initialize Ut, α, β, S;
  2: while n <= threshold4 do
  3:  Fix the CRE bias β and the number of activated SBSs S, and calculate the ABS ratio α
   according to Algorithm 2. Pass the solved α to step (4) and step (5);
  4:  Fix the ABS ratio α and the number of activated SBSs S, and calculate the CRE bias β
   according to Algorithm 3. Pass the solved β to step (3) and step (5);
  5:  Fix the ABS ratio α and the CRE bias β, and calculate the number of activated SBSs S
   according to Algorithm 4. Pass the solved S to step (3) and step (4);
  6:  n=n+1;
  7: end while
  8: Output: α, β, S;
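
The loop in Algorithm 5 is an alternating (block-coordinate) update: each of the three variables is re-solved in turn while the other two are held fixed. The following is a minimal Python sketch of that control flow only. Algorithms 2, 3 and 4 are not reproduced in this listing, so they appear as caller-supplied functions; the names solve_alpha, solve_beta, solve_s and max_rounds (standing in for threshold4) are hypothetical, and starting the counter n at 0 is an assumption, since the listing does not state its initial value.

```python
from typing import Callable, Tuple


def alternating_optimization(
    solve_alpha: Callable[[float, int], float],  # stand-in for Algorithm 2
    solve_beta: Callable[[float, int], float],   # stand-in for Algorithm 3
    solve_s: Callable[[float, float], int],      # stand-in for Algorithm 4
    alpha0: float,
    beta0: float,
    s0: int,
    max_rounds: int,                             # plays the role of threshold4
) -> Tuple[float, float, int]:
    """Update alpha, beta and S in turn, holding the other two fixed."""
    alpha, beta, s = alpha0, beta0, s0
    n = 0  # assumption: the listing does not say where n starts
    while n <= max_rounds:
        alpha = solve_alpha(beta, s)   # step 3: beta and S fixed
        beta = solve_beta(alpha, s)    # step 4: alpha and S fixed
        s = solve_s(alpha, beta)       # step 5: alpha and beta fixed
        n += 1                         # step 6
    return alpha, beta, s              # step 8


# Toy stand-ins, used only to exercise the loop. They are NOT the
# Q-learning procedures of Algorithms 2-4.
result = alternating_optimization(
    solve_alpha=lambda beta, s: beta / 2,
    solve_beta=lambda alpha, s: float(s),
    solve_s=lambda alpha, beta: 4,
    alpha0=0.0,
    beta0=0.0,
    s0=0,
    max_rounds=3,
)
print(result)
```

Because each step consumes the most recent values of the other two variables, a change in one variable takes a full round to propagate to the others. This is why the loop runs for several rounds rather than a single pass.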