| Algorithm 1. Used Approach |
| Result: Best Solution (Route) and Cumulative Reward |
| Initialization; |
| 1. Generate Attack Graph using the Architecture Analysis and Design Language (AADL), the JKind model checker, and Graphviz; |
| 2. Convert Attack Graph to Refinement Graph; |
| 3. Formulate the RL problem. Define environment, agent, states, actions, and rewards; |
| 4. Train RL Agent in MDP Environment; |
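
Steps 3 and 4 and the stated result (best route and cumulative reward) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the toy graph `GRAPH`, its edge rewards, the goal bonus `GOAL_REWARD`, the hyperparameters, and the choice of tabular Q-learning, since the algorithm above does not name a particular RL method. States are graph nodes, actions are outgoing edges, and the reward is the edge cost plus a bonus for reaching the goal.

```python
import random

# Hypothetical attack graph: node -> {successor: reward for taking that edge}.
# Node names and reward values are illustrative, not taken from the source.
GRAPH = {
    "entry": {"web": -1.0, "vpn": -2.0},
    "web": {"db": -1.0, "app": -3.0},
    "vpn": {"app": -1.0},
    "app": {"db": -1.0},
    "db": {},  # goal node, no outgoing edges
}
START, GOAL = "entry", "db"
GOAL_REWARD = 10.0  # assumed bonus for reaching the goal


def step_reward(state, action):
    """Reward for moving from `state` to the successor node `action`."""
    bonus = GOAL_REWARD if action == GOAL else 0.0
    return GRAPH[state][action] + bonus


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy policy on the graph MDP."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in nbrs} for s, nbrs in GRAPH.items()}
    for _ in range(episodes):
        state = START
        while state != GOAL:
            actions = list(GRAPH[state])
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q[state][a])  # exploit
            reward = step_reward(state, action)
            # Value of the best action in the next state (0 at the goal).
            next_best = max(q[action].values(), default=0.0)
            target = reward + gamma * next_best
            q[state][action] += alpha * (target - q[state][action])
            state = action
    return q


def best_route(q):
    """Follow the greedy policy and return (route, cumulative reward)."""
    route, total, state = [START], 0.0, START
    while state != GOAL:
        action = max(q[state], key=q[state].get)
        total += step_reward(state, action)
        route.append(action)
        state = action
    return route, total


if __name__ == "__main__":
    route, total = best_route(train())
    print(route, total)
```

The toy graph is acyclic, so every episode terminates at the goal. A graph with cycles would need a step limit per episode, which this sketch omits.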