Joint Optical and Wireless Resource Allocation for Cooperative Transmission in C-RAN

2020 Dec 31;21(1):217. doi: 10.3390/s21010217

Algorithm 1 Reinforcement Learning-based Wavelength Allocation Algorithm
Input: $Γ, ε, λ_{t}$
Output: Wavelength allocation matrix
1:	Initialize Process:
2:	Based on the historical network data $Γ = \{\{H_{1}, Q_{1}\}, \{H_{2}, Q_{2}\}, \dots, \{H_{T}, Q_{T}\}\}$, the BBU initializes a virtual environment. An agent is then created, whose action set is $A = \{a_{1}, a_{2}, \dots, a_{L}\}$. Initialize $t = 1$, $Q_{a_{l}} = 0$ for each $a_{l}$, the exploration probability $ε$, and the learning rate $λ_{t}$.
3:	Learning Process:
4:	for each transmission interval $t = 1 : T$ do
5:	Generate a random number $z$
6:	if $z < ε$ then
7:	The agent selects a wavelength allocation strategy $a_{t}$ from $A$ uniformly at random.
8:	else
9:	The agent selects the wavelength allocation strategy $a_{t}$ with the maximum Q-value.
10:	end if
11:	Under $a_{t}$, $H_{t}$, and $Q_{t}$, Algorithm 2 is performed to produce a clustering result.
12:	The environment feeds the signal $r_{a_{t}}$ back to the agent.
13:	The agent makes an update according to Equation (5)
14:	end for
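
The learning loop of Algorithm 1 follows an ε-greedy pattern, which can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the authors' implementation. Equation (5) is not reproduced in this section, so the sketch assumes the common incremental update $Q_{a_{t}} \leftarrow Q_{a_{t}} + λ_{t}(r_{a_{t}} - Q_{a_{t}})$. The function `epsilon_greedy_allocation`, the callback `reward_fn` (a stand-in for running Algorithm 2 and computing the feedback signal), and the default learning rate $λ_{t} = 1/t$ are all hypothetical. The sketch returns the index of the best strategy rather than a wavelength allocation matrix.

```python
import random
from typing import Callable, List, Sequence, Tuple


def epsilon_greedy_allocation(
    history: Sequence[Tuple[object, object]],
    num_actions: int,
    reward_fn: Callable[[int, object, object], float],
    epsilon: float = 0.1,
    learning_rate: Callable[[int], float] = lambda t: 1.0 / t,
    seed: int = 0,
) -> Tuple[List[float], int]:
    """Epsilon-greedy loop over T historical samples (H_t, Q_t).

    reward_fn(action, h_t, q_t) is a hypothetical stand-in for
    "run Algorithm 2, then compute the feedback signal r_{a_t}".
    """
    rng = random.Random(seed)
    # Initialize the Q-value of every strategy a_l to zero.
    q_values = [0.0] * num_actions

    for t, (h_t, q_t) in enumerate(history, start=1):
        z = rng.random()  # random number z, assumed uniform in [0, 1)
        if z < epsilon:
            # Explore: pick a strategy with equal probability.
            action = rng.randrange(num_actions)
        else:
            # Exploit: pick the strategy with the maximum Q-value
            # (ties resolve to the lowest index).
            action = max(range(num_actions), key=q_values.__getitem__)

        # Feedback signal from the (virtual) environment.
        reward = reward_fn(action, h_t, q_t)

        # ASSUMED update rule; Equation (5) of the paper may differ.
        lam = learning_rate(t)
        q_values[action] += lam * (reward - q_values[action])

    best = max(range(num_actions), key=q_values.__getitem__)
    return q_values, best
```

In this sketch the caller would pass the recorded pairs $\{H_{t}, Q_{t}\}$ as `history` and wrap the clustering step inside `reward_fn`; a fixed `seed` keeps runs reproducible while experimenting with $ε$ and $λ_{t}$.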