| Algorithm 3: Optimizing the CRE bias for a given ABS ratio and number of activated SBSs. |
| Require: The ABS ratio and the number of activated SBSs. |
| Ensure: Optimized CRE bias. |
| 1: Initialize the Q-table, the state s, and n = 0; |
| 2: Set the learning rate, the greedy probability, the discount factor, and the maximum number of iterations; |
| 3: while n <= the maximum number of iterations do |
| 4: In state s, select the optimal action a with the greedy probability; |
| 5: Observe the reward r; |
| 6: Transfer randomly from s to the next state; |
| 7: Update the Q-value according to Formula (19); |
| 8: Set s to the next state; |
| 9: n = n + 1; |
| 10: end while |
| 11: Output: the optimized CRE bias; |
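
The loop in Algorithm 3 can be sketched in Python as a small tabular Q-learning routine. This is a minimal illustration under stated assumptions, not the authors' implementation: the update rule is the standard Q-learning update, assumed here to stand in for Formula (19); steps 8 and 9 are taken to be the state advance and the counter increment; and the function `optimize_cre_bias`, the states, the candidate bias values `BIASES_DB`, the targets `TARGET_DB`, and the reward `toy_reward` are all hypothetical placeholders.

```python
import random

def optimize_cre_bias(n_states, n_actions, reward, *, alpha=0.1, gamma=0.9,
                      greedy_prob=0.9, max_iters=2000, seed=0):
    """Tabular Q-learning laid out like steps 1-11 of Algorithm 3."""
    rng = random.Random(seed)
    # Steps 1-2: initialize the Q-table, the state s, and the counter n.
    q = [[0.0] * n_actions for _ in range(n_states)]
    s = rng.randrange(n_states)
    n = 0
    while n <= max_iters:                                   # step 3
        # Step 4: take the best-known action with probability greedy_prob,
        # otherwise explore with a uniformly random action.
        if rng.random() < greedy_prob:
            a = max(range(n_actions), key=lambda k: q[s][k])
        else:
            a = rng.randrange(n_actions)
        r = reward(s, a)                                    # step 5
        s_next = rng.randrange(n_states)                    # step 6: random transition
        # Step 7: standard Q-learning update (assumed form of Formula (19)).
        q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])
        s = s_next                                          # step 8 (assumed)
        n += 1                                              # step 9 (assumed)
    # Step 11: report the greedy action (a bias index) for every state.
    return [max(range(n_actions), key=lambda k: q[st][k])
            for st in range(n_states)]


# Hypothetical toy problem: three states, five candidate bias values in dB.
BIASES_DB = [0, 4, 8, 12, 16]
TARGET_DB = [4, 8, 12]  # made-up best bias per state, used only to build the reward

def toy_reward(s, a):
    # Zero at the made-up target bias, increasingly negative away from it.
    return -abs(BIASES_DB[a] - TARGET_DB[s])

policy = optimize_cre_bias(len(TARGET_DB), len(BIASES_DB), toy_reward)
best_bias = [BIASES_DB[a] for a in policy]
print(best_bias)
```

Because the next state is drawn at random regardless of the action, the learned bias for each state in this toy setup depends only on the immediate reward, so `best_bias` should match `TARGET_DB`. A real reward would come from the network metric the paper optimizes, which this sketch does not model.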