2024 Jun 7;3:e55957. doi: 10.2196/55957

Table 2.

Overview of the layered integrative approach for evaluating artificial intelligence (AI) in health care, delineating the structured, multistage framework for the comprehensive assessment and continuous improvement of AI systems.

| Stage | Verification paradigm | Objective | Integration |
| --- | --- | --- | --- |
| Initial assessment | Quiz, vignette, and knowledge survey | To gauge the AI’s foundational medical knowledge and its ability to apply this knowledge in simulated real-world scenarios | Forms the baseline assessment of the AI’s capabilities, setting the stage for more targeted evaluations |
| Refinement | Historical data comparison | To refine the AI’s understanding and application of medical knowledge by comparing its recommendations or diagnoses against known outcomes from historical data | Uses the insights gained from initial assessments to focus on areas requiring improvement, ensuring that the AI’s recommendations are grounded in real-world evidence |
| Expert feedback | Expert consensus | To incorporate nuanced clinical insights and expert judgments into the AI’s learning, ensuring that it aligns with current clinical practices and expert opinions | Builds on the refined knowledge base by integrating expert clinical insights, further improving the AI’s decision-making processes |
| Comprehensive evaluation | Cross-discipline validation | To evaluate the AI’s recommendations and diagnostics across various medical disciplines, ensuring a comprehensive and holistic assessment | Leverages the foundational knowledge, refined understanding, and expert insights to test the AI’s capabilities in a multidisciplinary context, identifying any gaps or biases |
| Complexity handling | Rare or complex simulation and scenario testing | To test the AI’s ability to handle complex, rare, or novel medical scenarios, ensuring that it can adapt to a wide range of clinical challenges | Uses the comprehensive evaluations as a foundation to challenge the AI with scenarios that require sophisticated reasoning, further refining its decision-making abilities |
| Knowledge accuracy | False myth | To ensure that the AI’s current knowledge base is accurate and up-to-date, identifying and correcting any misconceptions or outdated information | Builds on the previous layers by specifically targeting and rectifying inaccuracies in the AI’s knowledge, ensuring reliability |
| Complexity and nuance handling | Challenging (or controversial) question | To evaluate the AI’s ability to navigate complex medical questions that may not have straightforward answers, assessing its reasoning in ambiguous situations | Further refines the AI’s decision-making process by exposing it to nuanced clinical scenarios, enhancing its ability to provide balanced and informed recommendations |
| Real-world efficacy | Real-time monitoring | To monitor the AI’s recommendations and diagnoses in real-world clinical settings, assessing its practical efficacy and safety | Applies all previous layers of assessment in a live clinical environment, providing direct feedback on the AI’s performance and areas for improvement |
| Transparency and trust | Algorithm transparency and audit | To ensure that the decision-making processes of the AI are transparent and understandable, building trust among health care providers and patients | Uses insights from real-world applications and previous evaluations to demystify the AI’s logic, ensuring that it is both effective and comprehensible |
| Continuous improvement | Feedback loop | To continuously refine and improve the AI system based on real-world data, feedback, and evolving medical knowledge | Represents the culmination of the integrative approach, in which feedback from all previous stages is used to iteratively enhance the AI system, ensuring that it remains effective, safe, and ethically compliant over time |
| Ethical and legal compliance | Ethical and legal review | To ensure that all AI recommendations and processes adhere to established ethical guidelines and legal standards | Runs parallel to all stages, providing a constant check on the AI’s compliance with ethical norms and legal requirements, safeguarding against potential malpractices, and ensuring that patient rights are protected |
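
The layered structure in Table 2 can be read as an ordered pipeline in which each stage sees the findings of the stages before it, while the ethical and legal review is applied alongside every stage. The following Python sketch is one possible illustration of that reading, not an implementation from the article: the names `Stage`, `Evaluator`, `run_layers`, and `compliance_check` are hypothetical, and the evaluators are placeholders that a real study would replace with the verification paradigms named in the table.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Stage:
    name: str
    paradigm: str


# Sequential stages, in the order they appear in Table 2.
SEQUENTIAL_STAGES: List[Stage] = [
    Stage("Initial assessment", "Quiz, vignette, and knowledge survey"),
    Stage("Refinement", "Historical data comparison"),
    Stage("Expert feedback", "Expert consensus"),
    Stage("Comprehensive evaluation", "Cross-discipline validation"),
    Stage("Complexity handling", "Rare or complex simulation and scenario testing"),
    Stage("Knowledge accuracy", "False myth"),
    Stage("Complexity and nuance handling", "Challenging (or controversial) question"),
    Stage("Real-world efficacy", "Real-time monitoring"),
    Stage("Transparency and trust", "Algorithm transparency and audit"),
    Stage("Continuous improvement", "Feedback loop"),
]

# Listed last in Table 2, but described there as running parallel to all stages.
PARALLEL_STAGE = Stage("Ethical and legal compliance", "Ethical and legal review")

# Assumed (hypothetical) evaluator shape: receives the findings of all earlier
# stages and returns a list of findings for the current stage.
Evaluator = Callable[[Dict[str, List[str]]], List[str]]


def run_layers(
    evaluators: Dict[str, Evaluator],
    compliance_check: Callable[[Stage, List[str]], bool],
) -> Dict[str, List[str]]:
    """Run each stage in order, passing earlier findings forward."""
    findings: Dict[str, List[str]] = {}
    for stage in SEQUENTIAL_STAGES:
        # Stages without a registered evaluator contribute no findings.
        evaluate = evaluators.get(stage.name, lambda prior: [])
        result = evaluate(dict(findings))  # shallow copy of the findings so far
        # The parallel review is applied to every stage's output.
        if not compliance_check(stage, result):
            raise RuntimeError(
                f"{PARALLEL_STAGE.paradigm} failed at stage: {stage.name}"
            )
        findings[stage.name] = result
    return findings
```

As a usage illustration with made-up placeholder findings, an evaluator for "Refinement" could read `prior["Initial assessment"]` to decide which areas to compare against historical data, and `compliance_check` could be any function that returns `False` to halt the pipeline when a stage's output fails review.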