| Algorithm 2 Off-policy learning process for mutating DoS cyber-attack frames |
|
Require: …, arbitrarily, and …
for each network frame do
  Sort the start of the protocol conversation
  Build an action space S for each frame
  for each episode do
    Initialize agent a with states s at time t
    for each step do
      Choose A from S using a policy derived from Q
      Take action A, observe R and the next state
    end for
    until the state is terminal, hence the DoS cyber-attack frame is fully mutated
  end for
end for |
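
The loop structure of Algorithm 2 can be illustrated with a minimal tabular sketch. It assumes the standard Q-learning update with an ε-greedy behaviour policy, which the listing above does not spell out. The mutation operators, the reward function `toy_reward`, the state encoding (index of the next byte to mutate), and all hyperparameter values are hypothetical placeholders chosen for illustration and are not taken from the algorithm itself.

```python
import random
from collections import defaultdict

# Hypothetical per-byte mutation operators; this dict plays the role of the
# action space built for each frame.
MUTATIONS = {
    "keep": lambda b: b,
    "flip": lambda b: b ^ 0xFF,
    "incr": lambda b: (b + 1) % 256,
    "zero": lambda b: 0,
}
ACTIONS = list(MUTATIONS)


def toy_reward(original: int, mutated: int) -> float:
    """Placeholder reward (an assumption): 1.0 if the byte changed, else 0.0."""
    return 1.0 if mutated != original else 0.0


def greedy(q, state):
    """Return the action with the highest Q-value in the given state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


def mutate_frame(frame: bytes, episodes=500, alpha=0.5, gamma=0.9,
                 epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)  # Q[(state, action)], initialised to 0.0
    n = len(frame)
    for _ in range(episodes):
        state = 0  # index of the next byte to mutate
        while state < n:  # state == n is terminal: frame fully mutated
            # Behaviour policy: epsilon-greedy with respect to Q.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy(q, state)
            mutated = MUTATIONS[action](frame[state])
            reward = toy_reward(frame[state], mutated)
            next_state = state + 1
            # Off-policy target: greedy value of the next state (0 at terminal).
            best_next = 0.0 if next_state == n else q[(next_state, greedy(q, next_state))]
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    # Apply the learned greedy policy once to produce the mutated frame.
    out = bytes(MUTATIONS[greedy(q, i)](frame[i]) for i in range(n))
    return out, q


if __name__ == "__main__":
    original = bytes([0x08, 0x00, 0x45, 0x10])
    mutated, _ = mutate_frame(original)
    print(original.hex(), "->", mutated.hex())
```

In a real setting the placeholder reward would be replaced by feedback from the system under test, and the state and action encodings would follow the protocol fields of the captured conversation rather than raw byte positions.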