2024 Jun 7;3:e55957. doi: 10.2196/55957

Table 2.

Overview of the layered integrative approach for evaluating artificial intelligence (AI) in health care, delineating the structured, multistage framework for the comprehensive assessment and continuous improvement of AI systems.

Stage	Verification paradigm	Objective	Integration
Initial assessment	Quiz, vignette, and knowledge survey	To gauge the AI’s foundational medical knowledge and its ability to apply this knowledge in simulated real-world scenarios	Forms the baseline assessment of the AI’s capabilities, setting the stage for more targeted evaluations
Refinement	Historical data comparison	To refine the AI’s understanding and application of medical knowledge by comparing its recommendations or diagnoses against known outcomes from historical data	Uses the insights gained from initial assessments to focus on areas requiring improvement, ensuring that the AI’s recommendations are grounded in real-world evidence
Expert feedback	Expert consensus	To incorporate nuanced clinical insights and expert judgments into the AI’s learning, ensuring that it aligns with current clinical practices and expert opinions	Builds on the refined knowledge base by integrating expert clinical insights, further improving the AI’s decision-making processes
Comprehensive evaluation	Cross-discipline validation	To evaluate the AI’s recommendations and diagnoses across various medical disciplines, ensuring a comprehensive and holistic assessment	Leverages the foundational knowledge, refined understanding, and expert insights to test the AI’s capabilities in a multidisciplinary context, identifying any gaps or biases
Complexity handling	Rare or complex simulation and scenario testing	To test the AI’s ability to handle complex, rare, or novel medical scenarios, ensuring that it can adapt to a wide range of clinical challenges	Uses the comprehensive evaluations as a foundation to challenge the AI with scenarios that require sophisticated reasoning, further refining its decision-making abilities
Knowledge accuracy	False myth	To ensure that the AI’s current knowledge base is accurate and up-to-date, identifying and correcting any misconceptions or outdated information	Builds on the previous layers by specifically targeting and rectifying inaccuracies in the AI’s knowledge, ensuring reliability
Complexity and nuance handling	Challenging (or controversial) question	To evaluate the AI’s ability to navigate complex medical questions that may not have straightforward answers, assessing its reasoning in ambiguous situations	Further refines the AI’s decision-making process by exposing it to nuanced clinical scenarios, enhancing its ability to provide balanced and informed recommendations
Real-world efficacy	Real-time monitoring	To monitor the AI’s recommendations and diagnoses in real-world clinical settings, assessing its practical efficacy and safety	Applies all previous layers of assessment in a live clinical environment, providing direct feedback on the AI’s performance and areas for improvement
Transparency and trust	Algorithm transparency and audit	To ensure that the decision-making processes of the AI are transparent and understandable, building trust among health care providers and patients	Uses insights from real-world applications and previous evaluations to demystify the AI’s logic, ensuring that it is both effective and comprehensible
Continuous improvement	Feedback loop	To continuously refine and improve the AI system based on real-world data, feedback, and evolving medical knowledge	Represents the culmination of the integrative approach, in which feedback from all previous stages is used to iteratively enhance the AI system, ensuring that it remains effective, safe, and ethically compliant over time
Ethical and legal compliance	Ethical and legal review	To ensure that all AI recommendations and processes adhere to established ethical guidelines and legal standards	Runs parallel to all stages, providing a constant check on the AI’s compliance with ethical norms and legal requirements, safeguarding against potential malpractice, and ensuring that patient rights are protected
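
The layering in Table 2 can be read as an ordered pipeline in which each stage receives the findings of the stages before it, while the ethical and legal review accompanies every stage. The following Python sketch is one possible way to encode that structure and is not taken from the article. The names `Stage`, `run_layered_evaluation`, `evaluate`, and `ethics_check` are hypothetical, and the choice to halt the pipeline when the ethical and legal review fails is an assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Stage:
    """One row of Table 2: a stage and its verification paradigm."""
    name: str
    paradigm: str


# Sequential stages, in the order listed in Table 2.
SEQUENTIAL_STAGES: List[Stage] = [
    Stage("Initial assessment", "Quiz, vignette, and knowledge survey"),
    Stage("Refinement", "Historical data comparison"),
    Stage("Expert feedback", "Expert consensus"),
    Stage("Comprehensive evaluation", "Cross-discipline validation"),
    Stage("Complexity handling", "Rare or complex simulation and scenario testing"),
    Stage("Knowledge accuracy", "False myth"),
    Stage("Complexity and nuance handling", "Challenging (or controversial) question"),
    Stage("Real-world efficacy", "Real-time monitoring"),
    Stage("Transparency and trust", "Algorithm transparency and audit"),
    Stage("Continuous improvement", "Feedback loop"),
]

# Table 2 describes this stage as running parallel to all the others.
PARALLEL_STAGE = Stage("Ethical and legal compliance", "Ethical and legal review")

Finding = Dict[str, str]


def run_layered_evaluation(
    evaluate: Callable[[Stage, List[Finding]], str],
    ethics_check: Callable[[Stage, List[Finding]], bool],
) -> List[Finding]:
    """Run the stages in order, passing earlier findings to later stages.

    `evaluate` and `ethics_check` are hypothetical callbacks supplied by the
    caller. Halting on a failed ethics check is an assumption of this sketch,
    not a rule stated in the table.
    """
    findings: List[Finding] = []
    for stage in SEQUENTIAL_STAGES:
        # The ethical and legal review accompanies every stage.
        if not ethics_check(stage, list(findings)):
            findings.append(
                {"stage": stage.name, "status": "halted by " + PARALLEL_STAGE.paradigm}
            )
            break
        # Each stage builds on the findings accumulated so far.
        status = evaluate(stage, list(findings))
        findings.append({"stage": stage.name, "status": status})
    return findings


if __name__ == "__main__":
    # Placeholder callbacks: every stage passes and the review never objects.
    report = run_layered_evaluation(
        evaluate=lambda stage, prior: "passed",
        ethics_check=lambda stage, prior: True,
    )
    for row in report:
        print(row["stage"], "->", row["status"])
```

In practice, the findings of the final feedback loop stage would feed into a new pass through the pipeline. The sketch leaves that iteration to the caller.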