Online Learning Approach for Predictive Real-Time Energy Trading in Cloud-RANs

2021 Mar 25;21(7):2308. doi: 10.3390/s21072308

Algorithm 1 Super Arm Exploration

1: Initialize: total number of trials $K$.
2: for $k = 1 : K$
3: Solve problem (16) for a given $B_{n}^{[ahead]}(k)$.
4: CU calculates $B_{n}^{[total]}(k)$ as per (2), $R(B_{n}^{[ahead]}(k))$ as per (17), and $R(A_{k}^{[set]})$ as per (18).
5: if $k = 1$
6: then $B_{n}^{[ahead]}(k + 1) = B_{n}^{[ahead]}(k) + \Delta E$, $n \in L_{b}$.
7: else if the super arm reward of all the RRHs satisfies $R(A_{k}^{[set]}) \leq R(A_{k - 1}^{[set]})$,
8: then $B_{n}^{[ahead]}(k + 1) = B_{n}^{[ahead]}(k - 1)$, $\forall n \in L_{b}$,
9: else if the individual reward for the $n$-th RRH, $n \in N$, satisfies $R(B_{n}^{[ahead]}(k)) \geq R(B_{n}^{[ahead]}(k - 1))$ and $B_{n}^{[ahead]}(k) \neq E^{J}$,
10: then $B_{n}^{[ahead]}(k + 1) = B_{n}^{[ahead]}(k) + \Delta E$,
11: else $B_{n}^{[ahead]}(k + 1) = B_{n}^{[ahead]}(k)$.
12: end if
13: Calculate the total energy cost of all the RRHs, $\beta^{[k, f, t]}$, as $\beta^{[k, f, t]} = \sum_{n \in L_{b}} B_{n}^{[total]}(k)$.
14: Calculate the energy package index $p$ at all RRHs from $p = \frac{B_{n}^{[ahead]}(k)}{\Delta E}$, $n \in L_{b}$.
15: Update $\mu_{n, p}^{[k, f, t]} = R(B_{n}^{[ahead]}(k))$, $\forall p \in J, n \in L_{b}$;
16: Update $A_{k + 1}^{[set]} = \{B_{1}^{[ahead]}(k + 1), \dots, B_{N}^{[ahead]}(k + 1)\}$;
17: end for
18: Estimated mean reward for $K$ trials: $\hat{\mu}_{n, p}^{[f, t]} = \frac{\sum_{k = 1}^{K} \mu_{n, p}^{[k, f, t]}}{K}$, $\forall p \in J, n \in L_{b}$.
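
The control flow of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. Problem (16) and Equations (2), (17), and (18) are not reproduced in this listing, so they are represented by the hypothetical callables `total_energy`, `arm_reward`, and `super_arm_reward`, which the caller must supply. The sketch tracks the integer package index $p = B_{n}^{[ahead]}(k) / \Delta E$ rather than the energy amount itself, reads steps 9 to 11 as a per-RRH decision, and assumes that an arm not played in trial $k$ contributes zero to the sum in step 18. All function and variable names are illustrative.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def super_arm_exploration(
    initial_index: Sequence[int],   # starting package index p_n(1) per RRH
    delta_e: float,                 # package step, Delta E
    max_index: int,                 # index J of the largest package E^J
    num_trials: int,                # K
    total_energy: Callable[[int, float], float],         # stand-in for (2)
    arm_reward: Callable[[int, float], float],           # stand-in for (17)
    super_arm_reward: Callable[[Sequence[float]], float],  # stand-in for (18)
) -> Tuple[Dict[Tuple[int, int], float], List[float]]:
    """Return estimated mean rewards per (RRH, package) and per-trial costs."""
    num_rrh = len(initial_index)
    current = list(initial_index)    # p_n(k)
    previous = list(initial_index)   # p_n(k - 1)
    prev_arm: List[float] = []
    prev_super = 0.0
    reward_sum: Dict[Tuple[int, int], float] = {}
    costs: List[float] = []

    for k in range(1, num_trials + 1):
        # Steps 3-4: evaluate the current super arm (assumed callables).
        ahead = [p * delta_e for p in current]
        totals = [total_energy(n, ahead[n]) for n in range(num_rrh)]
        arm = [arm_reward(n, ahead[n]) for n in range(num_rrh)]
        sup = super_arm_reward(ahead)

        if k == 1:
            # Step 6: first trial, every RRH moves up one package (no cap,
            # as in the listing).
            nxt = [p + 1 for p in current]
        elif sup <= prev_super:
            # Step 8: total reward did not improve, revert to trial k - 1.
            nxt = list(previous)
        else:
            # Steps 9-11: raise only RRHs whose own reward did not drop
            # and that are not already at the largest package.
            nxt = [
                p + 1 if arm[n] >= prev_arm[n] and p != max_index else p
                for n, p in enumerate(current)
            ]

        costs.append(sum(totals))                      # step 13
        for n, p in enumerate(current):                # steps 14-15
            reward_sum[(n, p)] = reward_sum.get((n, p), 0.0) + arm[n]

        previous, current = current, nxt               # step 16
        prev_arm, prev_super = arm, sup

    # Step 18: average over all K trials (unplayed arms count as zero).
    mean_reward = {key: s / num_trials for key, s in reward_sum.items()}
    return mean_reward, costs


# Toy usage: two RRHs with made-up quadratic rewards peaking at 3 and 2 units.
targets = [3.0, 2.0]
demo_means, demo_costs = super_arm_exploration(
    initial_index=[1, 1],
    delta_e=1.0,
    max_index=5,
    num_trials=6,
    total_energy=lambda n, b: b,
    arm_reward=lambda n, b: -(b - targets[n]) ** 2,
    super_arm_reward=lambda ahead: sum(
        -(b - targets[n]) ** 2 for n, b in enumerate(ahead)
    ),
)
```

With these toy rewards the revert rule in step 8 makes the run alternate between two super arms once the total reward stops improving, which is one reason the mean in step 18 is taken over all $K$ trials rather than read from the final trial alone.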