2021 Jan 30;23(2):171. doi: 10.3390/e23020171
Algorithm 2 Q-learning algorithm for the QLSCF algorithm
While episode < number:
    Initial observation s;
    While frame < 10000:
        Initial information of polar codes: y_1^N, L_1^N;
        If episode < 0.1 × number:
            a ← choose_action(s, ε_l);
        Else:
            a ← choose_action(s, ε_s);
        End if
        (s′, r) ← env(a, y_1^N, L_1^N, s);
        Q(s, a) ← Q(s, a) + α[r + γ max_{a′} Q(s′, a′) − Q(s, a)];
        s = s′;
        frame += 1;
    End while
End while
// The env function
Function (s′, r) ← env(a, y_1^N, L_1^N, s):
    s′ ← getstate(s, a);
    flag ← polardecoder(a, y_1^N, L_1^N, s′);
    If flag = 1:
        r = |L_a| + 1;
    Else:
        r = |L_a| − 1;
    End if
End function
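
The control flow of Algorithm 2 can be illustrated with a minimal tabular Q-learning sketch in Python. It is not the authors' implementation: the polar decoder is replaced by a hypothetical toy environment (`toy_env`) in which only one flip position leads to success, the |L_a| values are fixed made-up numbers rather than per-frame channel outputs, the state is simply the last decoding outcome, and the values of α, γ, ε_l and ε_s are arbitrary assumptions. Only the loop structure, the two-stage ε schedule, the update rule and the |L_a| ± 1 reward pattern follow the algorithm box.

```python
import random
from collections import defaultdict

ACTIONS = [0, 1, 2, 3]          # hypothetical candidate flip positions
LLR_ABS = [0.9, 0.4, 0.2, 0.7]  # hypothetical |L_a| values, fixed for the toy


def choose_action(q, s, epsilon, rng):
    """Epsilon-greedy selection over ACTIONS."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)                 # explore
    return max(ACTIONS, key=lambda a: q[(s, a)])   # exploit


def toy_env(a, s):
    """Toy stand-in for env(); it does not decode anything.

    Assumption: only flipping position 2 makes the decoder report success.
    The current state s is accepted for symmetry with env() but is unused.
    """
    flag = 1 if a == 2 else 0
    s_next = flag                       # toy getstate(): remember last outcome
    r = LLR_ABS[a] + 1 if flag == 1 else LLR_ABS[a] - 1
    return s_next, r


def q_update(q, s, a, r, s_next, alpha, gamma):
    """Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[(s_next, a2)] for a2 in ACTIONS)
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])


def train(num_episodes=50, frames=200, eps_large=0.9, eps_small=0.1,
          alpha=0.5, gamma=0.5, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)              # unseen (state, action) pairs start at 0
    for episode in range(num_episodes):
        s = 0                           # initial observation
        for _ in range(frames):
            # Large epsilon for the first 10% of episodes, small afterwards.
            eps = eps_large if episode < 0.1 * num_episodes else eps_small
            a = choose_action(q, s, eps, rng)
            s_next, r = toy_env(a, s)
            q_update(q, s, a, r, s_next, alpha, gamma)
            s = s_next
    return q


if __name__ == "__main__":
    q = train()
    for s in (0, 1):
        # Print the greedy (highest-Q) action learned for each toy state.
        print(s, max(ACTIONS, key=lambda a: q[(s, a)]))
```

Because the toy reward depends only on the chosen action, the learned greedy policy should settle on the single successful flip position in both states. In a real setting, `toy_env` would be replaced by a call into an SC flip decoder, and the state and action spaces would follow the paper's definitions.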