|
Algorithm 1 BIOS optimization algorithm based on DQN (Part 1). |
|
Input: N Initial capacity of the replay memory, |
| M Number of final exploration frames for optimization, |
| T Number of optimization steps per iteration process, |
| State corresponding to the initial configuration |
| 1 Initialize replay memory (D) to capacity N; |
| 2 Initialize Q-network with random weights $\theta$; |
| 3 Initialize environment; |
| 4 For episode = 1, M do |
| 5 Initialize environment state $s_1$; |
| 6 For t = 1, T do |
| 7 Calculate $Q(s_t, a; \theta)$ for every action $a$ in state $s_t$; |
| 8 With probability $\epsilon$ select a random action $a_t$, |
| Otherwise, select $a_t = \arg\max_a Q(s_t, a; \theta)$; |
| 9 Execute action $a_t$ in the environment, |
| obtain reward $r_t$, next state $s_{t+1}$, and a flag indicating whether the server went down; |
| 10 If the server went down: store the same transition $(s_t, a_t, r_t, s_{t+1})$ in D multiple times; |
| 11 else: store the transition in D once; |
| 12 Sample a random minibatch of transitions from D; |
| 13 Update the parameters of the Q-network using Formula (5); |
| 14 Every C steps reset the target network, $\hat{Q} = Q$; |
| 15 End For |
| 16 End For |
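
The control flow of Algorithm 1 can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: `ToyBiosEnv` is a hypothetical five-state environment rather than a real BIOS interface, a small Q-table stands in for the Q-network, the standard DQN temporal-difference target is used in place of Formula (5), and the duplication count `dup` and all hyperparameter values are placeholders rather than values from this work.

```python
import random
from collections import deque


class ToyBiosEnv:
    """Hypothetical stand-in for the BIOS environment (not from the paper)."""

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        """Return (next_state, reward, downtime)."""
        if action == 1:  # safe change: move one configuration forward
            self.state = min(self.state + 1, self.n_states - 1)
            return self.state, 1.0, False
        return self.state, -1.0, True  # action 0 always crashes the toy server


def store(memory, transition, downtime, copies):
    """Steps 10-11: store a downtime transition `copies` times, else once."""
    for _ in range(copies if downtime else 1):
        memory.append(transition)


def train(env, N=500, M=30, T=20, C=10, batch=16, dup=3,
          eps=0.2, gamma=0.9, lr=0.1, seed=0):
    rng = random.Random(seed)
    n_actions = 2
    D = deque(maxlen=N)  # step 1: replay memory with capacity N
    # Step 2: a randomly initialised Q-table stands in for the Q-network.
    q = [[rng.uniform(-0.01, 0.01) for _ in range(n_actions)]
         for _ in range(env.n_states)]
    q_target = [row[:] for row in q]  # assumed target copy, reset every C steps
    step = 0
    for _episode in range(M):
        s = env.reset()  # step 5
        for _t in range(T):
            # Steps 7-8: epsilon-greedy selection over the current Q-values.
            if rng.random() < eps:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda x: q[s][x])
            s_next, r, downtime = env.step(a)  # step 9
            store(D, (s, a, r, s_next), downtime, dup)  # steps 10-11
            # Steps 12-13: sample a minibatch and apply the assumed TD update.
            for sj, aj, rj, sj1 in rng.sample(list(D), min(batch, len(D))):
                target = rj + gamma * max(q_target[sj1])
                q[sj][aj] += lr * (target - q[sj][aj])
            step += 1
            if step % C == 0:  # step 14: refresh the target copy
                q_target = [row[:] for row in q]
            s = s_next
    return q, D


if __name__ == "__main__":
    q_values, memory = train(ToyBiosEnv())
    print(len(memory), q_values[-1])
```

Storing downtime transitions several times raises their share of the replay memory, so uniformly sampled minibatches contain them more often. This is one plausible reading of steps 10 and 11; the sketch does not treat downtime as a terminal state, which is also an assumption.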