2020 Aug 30;22(9):957. doi: 10.3390/e22090957
Algorithm 3: Given the ABS ratio α and the number of activated SBSs S, optimize the CRE bias β.
Require: a_t = (α, β, S).
Ensure: Optimized CRE bias β.
  1: Initialize Q_β(s, a), the state s, and n = 0;
  2: Set the learning rate λ_2, greedy probability ε_2, discount factor γ_2, and threshold_2;
  3: while n ≤ threshold_2 do
  4:  In state s, select the optimal action a with greedy probability ε_2 (a random action otherwise);
  5:  Observe the reward r;
  6:  Randomly transition from s to s′;
  7:  Update Q_β(s, a) according to Formula (19);
  8:  s ← s′;
  9:  n = n + 1;
  10: end while
  11: Output: β = a;
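
The loop in Algorithm 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. Formula (19) is not reproduced in this listing, so the standard tabular Q-learning update is used in its place. The function name `optimize_cre_bias`, the reward callback `reward_fn`, the state list, the candidate bias values, and all default hyperparameters are hypothetical. The final step is read as returning the greedy action in the last visited state, which is one possible interpretation of "β = a".

```python
import random


def optimize_cre_bias(reward_fn, states, beta_candidates,
                      lr=0.1, greedy_prob=0.9, discount=0.9,
                      threshold=5000, seed=0):
    """Tabular Q-learning sketch of Algorithm 3 (all names are hypothetical)."""
    rng = random.Random(seed)

    # Step 1: initialize Q_beta(s, a), the state s, and the counter n.
    q = {(s, a): 0.0 for s in states for a in beta_candidates}
    s = rng.choice(states)
    n = 0

    # Steps 3-10: iterate while n <= threshold_2.
    while n <= threshold:
        # Step 4: greedy action with probability greedy_prob, random otherwise
        # (assumed reading of "greedy probability").
        if rng.random() < greedy_prob:
            a = max(beta_candidates, key=lambda b: q[(s, b)])
        else:
            a = rng.choice(beta_candidates)

        # Step 5: observe the reward for using bias a in state s.
        r = reward_fn(s, a)

        # Step 6: random transition to the next state s'.
        s_next = rng.choice(states)

        # Step 7: ASSUMPTION: standard Q-learning update standing in for
        # Formula (19), which is not shown in this section.
        best_next = max(q[(s_next, b)] for b in beta_candidates)
        q[(s, a)] += lr * (r + discount * best_next - q[(s, a)])

        # Steps 8-9: advance the state and the counter.
        s = s_next
        n += 1

    # Step 11: report the greedy bias in the final state (assumed reading).
    best_beta = max(beta_candidates, key=lambda b: q[(s, b)])
    return best_beta, q
```

As a usage example with a toy reward that peaks at a bias of 6 (an arbitrary value chosen only for illustration), `optimize_cre_bias(lambda s, a: -abs(a - 6), [0, 1, 2], [0, 2, 4, 6, 8])` returns the selected bias together with the learned Q-table. In the paper's setting the reward would instead come from the network performance observed under the fixed α and S.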