| Algorithm 1 Reward function of the proposed method | |
| 1 | Current load of connecting eNB: |
| 2 | Ideal load (the control target value): |
| 3 | Available bandwidth of connecting eNB: |
| 4 | Action selected at time : |
| 5 | Episode end time: |
| 6 | Normalization variable: |
| 7 | if then |
| 8 | |
| 9 | else |
| 10 | if and then |
| 11 | |
| 12 | else |
| 13 | ) |
| 14 | end if |
| 15 | end if |
| 16 | return |
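
The branch structure of Algorithm 1 (an outer test, then a nested two-part test, with one reward expression per branch) can be sketched in Python. The conditions and reward expressions in the listing above did not survive extraction, so every predicate and return value below is a placeholder assumption chosen only to make the control flow runnable; it is not the authors' reward function. The names `RewardInputs`, `reward`, and all field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RewardInputs:
    """Inputs listed in lines 1-6 of Algorithm 1 (field names are hypothetical)."""
    current_load: float         # current load of the connecting eNB
    ideal_load: float           # ideal load, i.e. the control target value
    available_bandwidth: float  # available bandwidth of the connecting eNB
    action: int                 # action selected at the current time step
    time: int                   # current time step
    episode_end: int            # episode end time
    norm: float                 # normalization variable


def reward(x: RewardInputs) -> float:
    """Three-branch reward mirroring the if / else-if / else layout of Algorithm 1.

    All tests and values are ASSUMED placeholders, not taken from the source.
    """
    if x.norm <= 0:
        raise ValueError("normalization variable must be positive")

    # Outer test (line 7). ASSUMPTION: it checks whether the episode has ended.
    if x.time >= x.episode_end:
        return 0.0  # placeholder for the lost expression on line 8

    # Inner test (line 10) joins two conditions with "and".
    # ASSUMPTION: load at or below target, and some bandwidth still available.
    if x.current_load <= x.ideal_load and x.available_bandwidth > 0:
        return 1.0  # placeholder for the lost expression on line 11

    # Final branch (line 13). ASSUMPTION: a penalty proportional to the
    # distance from the target load, scaled by the normalization variable.
    return -abs(x.current_load - x.ideal_load) / x.norm
```

Returning from each branch keeps the three outcomes visibly separate; once the actual conditions and expressions are known, only the three tests and three return values should need replacing.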