2023 Jan 20;23(3):1231. doi: 10.3390/s23031231
Algorithm 2 Off-policy learning process for mutating DoS cyber-attack frames
Require: $Q(s_t, A_t)$, $\forall s_t \in S, A_t \in A$ arbitrarily, and $Q(\text{terminal state}, \cdot) = 0$
for each network frame do
    Sort the start of the protocol conversation
    Build an action space S for each frame
    for each $s_t$ do
        Initialize agent a with states s at time t+1
        for each $s_{t+1}$ do
            Choose A from S using $\Lambda$ derived from Q
            Take action $A_t$, observe R, $s_{t+1}$
            $Q(s_t, A_t) \leftarrow Q(s_t, A_t) + \alpha \left[ R(s_{t+1}, A_{t+1}) + \Phi \max Q(s_{t+1}, A_{t+1}) - Q(s_t, A_t) \right]$
            $s_t \leftarrow s_{t+1}$
        end for
        until $s_t$ is terminal, hence the DoS cyber-attack frame is fully mutated.
    end for
end for
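
The update rule in Algorithm 2 can be illustrated with a minimal tabular Q-learning sketch in Python. This is not the authors' implementation. It assumes a toy environment in which the state is the number of frame fields already mutated, and the frame counts as fully mutated (terminal) once all fields have been changed. The function name `q_learning`, the parameter `n_fields`, the two actions, the reward values, and the use of an epsilon-greedy rule for the policy Λ are all hypothetical choices made for illustration. The names `alpha` and `phi` stand for the α and Φ of the update rule.

```python
import random


def q_learning(n_fields=4, episodes=500, alpha=0.5, phi=0.9, epsilon=0.1, seed=0):
    """Tabular off-policy Q-learning on a toy 'frame mutation' chain.

    Assumptions (not from the paper):
      - state s = number of fields already mutated, 0..n_fields
      - s == n_fields is the terminal state (frame fully mutated)
      - action 0 leaves the frame unchanged, action 1 mutates the next field
      - reward is 1.0 on reaching the terminal state, -0.01 otherwise
    """
    rng = random.Random(seed)
    actions = (0, 1)

    # Q(s, A) initialised arbitrarily (0.0 here); Q(terminal, .) is never
    # updated, so it stays 0 as the Require line demands.
    q = {(s, a): 0.0 for s in range(n_fields + 1) for a in actions}

    for _ in range(episodes):  # one episode per (toy) network frame
        s = 0
        while s != n_fields:  # until s_t is terminal
            # Policy derived from Q: epsilon-greedy (an assumed choice).
            if rng.random() < epsilon:
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda x: q[(s, x)])

            # Take the action, observe reward R and next state s_{t+1}.
            s_next = s + a
            r = 1.0 if s_next == n_fields else -0.01

            # Q(s_t, A_t) <- Q(s_t, A_t)
            #     + alpha * [R + phi * max Q(s_{t+1}, .) - Q(s_t, A_t)]
            best_next = max(q[(s_next, x)] for x in actions)
            q[(s, a)] += alpha * (r + phi * best_next - q[(s, a)])

            s = s_next  # s_t <- s_{t+1}
    return q


if __name__ == "__main__":
    table = q_learning()
    for state in range(4):
        greedy = max((0, 1), key=lambda x: table[(state, x)])
        print(f"state {state}: greedy action {greedy}")
```

The maximum over next-state actions is what makes the update off-policy: the target uses the best estimated action in the next state, regardless of which action the exploring policy actually takes there.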