2022 Sep 6;22(18):6730. doi: 10.3390/s22186730
Algorithm 1 BIOS optimization algorithm based on DQN (Part 1).
Input: N Capacity of the replay memory,
   M Number of final exploration frames for optimization,
   T Number of optimization steps per episode,
   s_initial State corresponding to the initial configuration
1 Initialize replay memory (D) to capacity N;
2 Initialize Q-network with random weights θ;
3 Initialize environment;
4 For episode = 1, M do
5   Initialize environment state s = s_initial;
6   For t = 1, T do
7     Calculate Q(s,a;θ) in state s;
8     With probability ε, select a random action a;
       otherwise, select a = argmax_a Q(s,a;θ);
9     Execute action a in the environment,
       obtain the reward r, the next state s′, and a flag indicating whether the server went down;
10      If the server went down: store the transition (s,a,r,s′) in D k times;
11      else: store the transition (s,a,r,s′) in D once;
12      Sample a random minibatch of transitions from D;
13      Update the parameters θ of the Q-network with Formula (5);
14      Every C steps, reset the target-network parameters θ⁻ = θ;
15   End For
16 End For
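
The loop above can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several pieces are assumptions made only so the example runs on its own: `ToyServerEnv` is a hypothetical five-state environment standing in for the real server, a small Q-table replaces the Q-network, the standard one-step TD target replaces Formula (5) (which is not reproduced here), and the values of `k`, `C`, `eps`, `alpha`, `gamma` and `batch` are arbitrary. The part taken directly from the listing is the storage rule: a transition that ends in server downtime is written to the replay memory k times, any other transition once.

```python
import random
from collections import deque


class ToyServerEnv:
    """Hypothetical stand-in for the BIOS tuning environment (not from the paper)."""
    N_STATES, N_ACTIONS, S_INITIAL = 5, 2, 2

    def reset(self):
        self.s = self.S_INITIAL
        return self.s

    def step(self, a):
        nxt = self.s + (1 if a == 1 else -1)
        if nxt < 0:  # assumed crash condition: stepping below state 0
            self.s = self.S_INITIAL  # assumed: server restarts in s_initial
            return self.s, -10.0, True
        self.s = min(nxt, self.N_STATES - 1)
        return self.s, float(self.s), False


def train(env, N=1000, M=20, T=10, k=4, C=10,
          eps=0.1, alpha=0.1, gamma=0.9, batch=8, seed=0):
    rng = random.Random(seed)
    D = deque(maxlen=N)  # replay memory with capacity N
    # Q-table with small random values (stand-in for random network weights).
    q = [[rng.uniform(-0.01, 0.01) for _ in range(env.N_ACTIONS)]
         for _ in range(env.N_STATES)]
    q_target = [row[:] for row in q]  # target copy, refreshed every C steps
    step = 0
    for _episode in range(M):
        s = env.reset()
        for _t in range(T):
            # Epsilon-greedy action selection.
            if rng.random() < eps:
                a = rng.randrange(env.N_ACTIONS)
            else:
                a = max(range(env.N_ACTIONS), key=lambda b: q[s][b])
            s_next, r, down = env.step(a)
            # Downtime transitions are stored k times, all others once.
            D.extend([(s, a, r, s_next, down)] * (k if down else 1))
            # Sample a minibatch and apply a one-step TD update.
            for bs, ba, br, bs2, bdown in rng.sample(list(D), min(batch, len(D))):
                # Assumption: a downtime transition is treated as terminal.
                target = br if bdown else br + gamma * max(q_target[bs2])
                q[bs][ba] += alpha * (target - q[bs][ba])
            step += 1
            if step % C == 0:
                q_target = [row[:] for row in q]
            s = s_next
    return q, D
```

Calling `q, D = train(ToyServerEnv())` returns the learned table and the replay memory. Because downtime transitions appear k times in D, they are proportionally more likely to be drawn into each minibatch.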