Yi, J. and Liu, X. [9] | Leverages MulVAL attack graphs and predefined vulnerabilities. | Simulated networks with hosts and subnets. | 🗸 | | | Capable of scaling to subnet-based configurations but limited by fixed graph structures. |
Hamidi, M., et al. [10] | Connects with tools like Metasploit, SQLmap, and Weevely via APIs. | Controlled setups with predefined exploitation paths. | 🗸 | 🗸 | | Limited adaptability due to predefined tools and static environments. |
Ghanem, M. and Chen, T. [15] | Analyzes penetration testing expert behavior using logs from servers, databases, and routing devices. | Simulated environments with predefined vulnerability paths. | 🗸 | | | Limited by static, predefined scenarios. |
Ghanem, M. and Chen, T. [16] | Processes state and action spaces with probabilistic representations of devices and networks. | Networks with devices modeled probabilistically for vulnerabilities. | 🗸 | | 🗸 | Constrained by reliance on probabilistic state-space representations. |
Zennaro, F., et al. [17] | Uses Q-learning to train agents in Capture the Flag scenarios. | Simplified scenarios with predefined port vulnerabilities. | 🗸 | | | Restricted to predefined attack paths and ports. |
Chaudhary, S., et al. [18] | Employs DT scripts and Python-based log analysis for vulnerability identification. | Focused on file exploitation in predefined Windows and Linux environments. | | | 🗸 | Restricted to static environments, without provisions for scalability or dynamic updates. |
Nhu, N., et al. [19] | Employs Docker-based environments for training reinforcement learning agents. | Dockerized setups with a variety of CVEs. | 🗸 | 🗸 | | Scales moderately well but lacks contextual processing for extrapolation. |
Schwartz, J. and Kurniawati, H. [20] | Focuses on Metasploit-based testing for FTP vulnerabilities. | Single-port FTP exploitation scenarios. | 🗸 | | | Minimal scalability beyond basic vulnerability testing. |
Tran, K., et al. [22] | Implements Cascaded Reinforcement Learning Agents for discrete action spaces. | Simulated networks with multiple subnets and hosts. | 🗸 | | 🗸 | Highly scalable in subnet-based scenarios but less effective in dynamic configurations. |
Nguyen, H., et al. [21] | Implements action spaces using Metasploit modules for scanning, exploitation, and PEsc. | Simulations with connected hosts and service vulnerabilities like CVE-2021-41773 and CVE-2015-3306. | 🗸 | | | Limited to predefined Metasploit actions and lacks dynamic adaptability to emerging or IoT environments. |
Ying, W., et al. [23] | Analyzes and filters CVE data with NLP techniques for event extraction, covering vulnerabilities from 1999 to 2021. | Employs a database of 4638 vulnerabilities from CVE with detailed categorization of 16 CWE types. | | 🗸 | | Limited to textual analysis and lacks integration with reinforcement learning or adaptive exploration. |
BERT QA RL + RS (This proposal) | Combines BERT’s contextual processing with reinforcement learning for adaptive exploration, integrating real-time data updates for dynamic environments. | Supports diverse configurations, including interconnected services, AB weaknesses, CFs, and real-world scenarios. | 🗸 | 🗸 | 🗸 | Highly scalable due to its modular design, contextual adaptability, and ability to generalize policies across complex environments like cloud and IoT systems.
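
Several of the compared approaches rely on reinforcement learning over a fixed set of actions and predefined vulnerabilities, which is the source of the "restricted to predefined attack paths and ports" limitation noted above. The following is a minimal sketch of tabular Q-learning on a hypothetical toy scenario. It is not the implementation of any cited work: the environment, the reward values, and all names (`step`, `train`, `VULN_PORT`, and so on) are assumptions chosen only for illustration.

```python
import random

# Hypothetical toy setting (an assumption, not taken from any cited paper):
# three candidate ports, exactly one of which is exploitable.
N_PORTS = 3
VULN_PORT = 2                 # index of the single exploitable port
SCAN = 0                      # action 0 scans; action i + 1 exploits port i
N_ACTIONS = N_PORTS + 1
UNSCANNED, SCANNED = 0, 1     # the only two non-terminal states


def step(state, action):
    """Return (next_state, reward, done) for the toy environment."""
    if action == SCAN:
        return SCANNED, -1.0, False        # scanning costs one step
    if state == SCANNED and action - 1 == VULN_PORT:
        return state, 10.0, True           # flag captured, episode ends
    return state, -1.0, False              # failed exploit attempt


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2,
          max_steps=20, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(2)]
    for _ in range(episodes):
        state = UNSCANNED
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(N_ACTIONS)          # explore
            else:
                action = max(range(N_ACTIONS),
                             key=lambda a: q[state][a])    # exploit
            nxt, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best next action.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Best action per state according to the learned Q-table."""
    return [max(range(N_ACTIONS), key=lambda a: q[s][a]) for s in range(2)]


q_table = train()
policy = greedy_policy(q_table)
```

In this sketch the learned policy scans first and then exploits the one vulnerable port. Because the action set and the vulnerable port are hard-coded, the Q-table has to be relearned whenever the environment changes, which illustrates the adaptability limitation attributed to the fixed-action approaches in the comparison.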