Abstract
AI-based prediction models have matched or surpassed the performance of experienced physicians in various research settings. However, only a few have made it into clinical practice. Further, there is no standardized protocol for integrating AI-based physician support systems into the daily clinical routine to improve healthcare delivery. More generally, AI/physician collaboration strategies have not been extensively investigated. A recent study compared four potential strategies for AI model deployment and physician collaboration, using an AI model trained to identify signs of acute respiratory distress syndrome (ARDS) on chest X-ray images. Here we discuss strategies and challenges with AI/physician collaboration when AI-based decision support systems are implemented in the clinical routine.
Subject terms: Diagnosis, Prognosis
Potential roles for AI-based algorithms in clinical settings
In a recent New York Times editorial, an author dubbed artificial intelligence (AI) a “Pandora’s box” whose lid humans were lifting1. Indeed, AI seems to be everywhere, from the realistic-appearing conversations of ChatGPT to the image recognition software allowing many of us to unlock our cellphones with a glance. Within healthcare, both excitement and concern about AI are nothing new.
Over the last few years, several AI-based models have demonstrated their usefulness in various medical disciplines, including dermatology2, ophthalmology3, and radiology4. Thus far, the prevailing thought within healthcare has been that AI will be most useful within clinical decision support systems (CDSS). CDSS aim to aid clinicians with the complex decision-making process to improve healthcare quality. Where exactly AI fits best within the decision-making process remains debated5.
In a recent case study, Farzaneh et al.5 systematically determined how AI could be integrated into a specific clinical scenario. They took an AI model6 trained to identify patterns of acute respiratory distress syndrome (ARDS) on chest X-ray images and evaluated its strengths and weaknesses compared to physicians. In doing so, they tested four different physician–AI collaboration strategies. One strategy involved the AI model reviewing a chest X-ray first and then deferring to a physician in cases of uncertainty. This strategy achieved higher diagnostic accuracy than the other three: physicians reviewing the X-ray first and deferring to the AI model in cases of uncertainty, the AI model examining the X-ray alone, and the physician examining the X-ray alone.
Ultimately, these findings imply that the AI model had higher and more consistent accuracy on less complicated chest X-rays, while physicians had higher accuracy on difficult chest X-rays5. This could mean that those caring for ARDS patients can use AI models to help triage these patients when the X-ray findings are clear, while physicians focus on interpreting the more complicated images.
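The four collaboration strategies can be sketched as simple routing rules. The following is a minimal illustration, not the implementation used by Farzaneh et al.: the function names, the probability thresholds, and the way physician confidence is represented are all assumptions made for this example, since the editorial does not state how uncertainty was defined.

```python
from typing import Callable

# Hypothetical thresholds for illustration only; the source does not specify
# how AI or physician uncertainty was quantified.
LOW, HIGH, CUTOFF = 0.2, 0.8, 0.5


def ai_alone(ai_prob: float, cutoff: float = CUTOFF) -> bool:
    """The AI model decides on its own."""
    return ai_prob >= cutoff


def physician_alone(physician_label: bool) -> bool:
    """The physician decides on their own."""
    return physician_label


def ai_first(ai_prob: float, physician_read: Callable[[], bool],
             low: float = LOW, high: float = HIGH) -> bool:
    """AI reads first and defers to the physician when it is uncertain."""
    if ai_prob >= high:
        return True           # AI confidently flags ARDS
    if ai_prob <= low:
        return False          # AI confidently rules ARDS out
    return physician_read()   # uncertain band: the physician is consulted


def physician_first(physician_label: bool, physician_confident: bool,
                    ai_prob: float, cutoff: float = CUTOFF) -> bool:
    """Physician reads first and defers to the AI when they are uncertain."""
    if physician_confident:
        return physician_label
    return ai_prob >= cutoff


# An uncertain AI score (0.55) is routed to the physician, who reads it as negative.
decision = ai_first(0.55, physician_read=lambda: False)
```

Passing the physician read as a callable reflects the practical appeal of the AI-first strategy: the physician is only consulted for the images the model cannot confidently classify.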
Key challenges hampering CDSS implementation into clinical practice
Despite the promise of AI-based CDSS, several obstacles are blocking their implementation. Four key challenges to the widespread use of AI-based CDSS are trust, bias, scalability, and deployment.
Trust
Both patients and clinicians must trust AI models used for decision support if they are to be widely adopted7,8. The study by Farzaneh et al.5 is an example of the type of research that must continue to be done to evaluate and build trust in AI/physician collaboration workflows. Additionally, AI models will need to be explainable, that is, able to convey which parameters influenced a decision and why. For example, visualizing alerts or displaying regions of concern in an X-ray can help to overcome distrust in the “black-box” nature of most AI-based systems8,9. On the other hand, excessive trust in and reliance on a CDSS can interfere with the development of clinical skills10.
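One simple way to display regions of concern is occlusion sensitivity: mask one patch of the image at a time and record how much the model score drops. The sketch below illustrates the idea under stated assumptions. It is not taken from any of the cited studies, `score_fn` is a hypothetical stand-in for a trained model, and the image is a plain nested list rather than a real radiograph.

```python
from typing import Callable, List

Image = List[List[float]]


def occlusion_map(image: Image, score_fn: Callable[[Image], float],
                  patch: int = 2, fill: float = 0.0) -> Image:
    """Return a heat map where each pixel holds the score drop caused by
    masking the patch that contains it (larger drop = more influential)."""
    base = score_fn(image)
    h, w = len(image), len(image[0])
    heat = [[0.0] * w for _ in range(h)]
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            rows = range(top, min(top + patch, h))
            cols = range(left, min(left + patch, w))
            occluded = [row[:] for row in image]  # copy so the input is untouched
            for r in rows:
                for c in cols:
                    occluded[r][c] = fill
            drop = base - score_fn(occluded)
            for r in rows:
                for c in cols:
                    heat[r][c] = drop
    return heat


# Toy stand-in model (hypothetical): it only "looks at" the top-left 2x2 corner.
def toy_score(img: Image) -> float:
    return sum(img[r][c] for r in range(2) for c in range(2))


toy_image = [[1.0] * 4 for _ in range(4)]
heat = occlusion_map(toy_image, toy_score)
```

In this toy case only the top-left patch receives a non-zero value, which is the kind of localized highlight a clinician could check against their own reading of the image.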
Bias
Datasets used to train AI models can contain bias, amplifying existing inequity in the healthcare system. This type of bias specifically harms disadvantaged populations11,12. For example, AI models trained on a dataset with unequal representation of a particular minority group might generate less accurate predictions for that minority patient population, leading to worse patient care. Various strategies for detecting and mitigating bias12,13 have been developed to tackle this issue, but further approaches are needed to ensure generalizability and fairness.
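A first step in detecting this kind of bias is to compare model performance across patient subgroups. The sketch below is a minimal illustration under stated assumptions: the group labels and records are invented for the example, and accuracy is used only for simplicity; it is not the method of the cited mitigation studies.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Record = Tuple[str, int, int]  # (subgroup, true label, predicted label)


def accuracy_by_group(records: Iterable[Record]) -> Dict[str, float]:
    """Compute prediction accuracy separately for each subgroup."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for group, y_true, y_pred in records:
        total[group] += 1
        correct[group] += int(y_true == y_pred)
    return {group: correct[group] / total[group] for group in total}


def max_accuracy_gap(per_group: Dict[str, float]) -> float:
    """Largest accuracy difference between any two subgroups."""
    values = list(per_group.values())
    return max(values) - min(values)


# Invented toy data: group "B" is under-represented and predicted less accurately.
toy_records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 1),
    ("B", 1, 0), ("B", 0, 0),
]
per_group = accuracy_by_group(toy_records)
gap = max_accuracy_gap(per_group)
```

A large gap does not by itself establish unfairness, but it flags subgroups for which the model should be examined more closely before deployment.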
Scalability
While Farzaneh et al.5 describe a specific case study that helps optimize AI-physician collaboration, it is unrealistic to expect every implementation of AI support to occur only after a published trial. Instead, AI-CDSS will likely scale based on inferences from studies of similar clinical challenges. The specific point at which AI will be implemented in a workflow will ultimately differ among healthcare settings.
Looking into the future, generalizable CDSS tools will be implemented in healthcare settings other than those in which they were developed. How such generalizable tools will be developed and implemented in local workflows remains an open question14. Other challenges include insufficient IT infrastructure in under-resourced clinical settings, where building on AI is difficult. As technical complexity increases, greater computer literacy and proficiency will be required. A lack of these skills can hinder the adoption of clinical decision support systems10,15.
Deployment
Challenges to AI-based CDSS at the deployment level center on regulatory concerns and long-term effects. AI-based CDSS will require new rules about where responsibility lies for potential mistakes. To take just one example, failure to use an available decision support system could be considered malpractice on the part of an individual physician or of a healthcare institution that has adopted the AI-CDSS tool16.
AI in medical education
AI as well as augmented and virtual reality offer unique opportunities for medical training and education17. For example, AI tools could provide medical students with a three-dimensional virtual reality experience that changes the way anatomy is taught and learned18. Further, as medical training became challenging during the COVID-19 pandemic, surgeon training in lung cancer surgery through the metaverse (a 3-D-enabled digital space using augmented and virtual reality) was implemented in a smart operating room19. Other recent publications demonstrated how AI could be used to identify a surgeon’s skill20–22 and thus improve continuous learning. Integrating AI in medical (student) education may not only enrich the teaching and learning experience, but may also help to teach the opportunities and challenges of AI, and thus result in more awareness, trust, and better use of AI-based systems.
Conclusion
In many cases, CDSS have produced better outcomes when AI algorithms and physicians collaborate. However, integrating CDSS into the daily clinical routine requires rigorous clinical validation with real-world data in a real-world environment before implementation in clinical practice. Key challenges include trust, bias, scalability, and deployment. Further, regulatory and privacy issues need to be addressed.
Acknowledgements
M.M. is a fellow of the BIH—Charité Digital Clinician Scientist Program funded by the Charité—Universitätsmedizin Berlin, the Berlin Institute of Health at Charité, and the German Research Foundation (DFG).
Author contributions
M.M. wrote the first draft of the paper. M.R. contributed to the first draft and provided critical revisions. J.C.K. provided critical revisions. All authors approved the final paper.
Competing interests
J.C.K. is the Editor-in-Chief of npj Digital Medicine. M.M. and M.R. declare no competing interests.
References
- 1. Friedman TL. In: The New York Times. 2023.
- 2. Esteva A, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542:115–118. doi: 10.1038/nature21056.
- 3. Gulshan V, et al. Performance of a deep-learning algorithm vs manual grading for detecting diabetic retinopathy in India. JAMA Ophthalmol. 2019;137:987–993. doi: 10.1001/jamaophthalmol.2019.2004.
- 4. McKinney SM, et al. International evaluation of an AI system for breast cancer screening. Nature. 2020;577:89–94. doi: 10.1038/s41586-019-1799-6.
- 5. Farzaneh N, Ansari S, Lee E, Ward KR, Sjoding MW. Collaborative strategies for deploying artificial intelligence to complement physician diagnoses of acute respiratory distress syndrome. NPJ Digit. Med. 2023;6:62. doi: 10.1038/s41746-023-00797-9.
- 6. Sjoding MW, et al. Deep learning to detect acute respiratory distress syndrome on chest radiographs: a retrospective study with external validation. Lancet Digit. Health. 2021;3:e340–e348. doi: 10.1016/S2589-7500(21)00056-X.
- 7. Esteva A, et al. Deep learning-enabled medical computer vision. NPJ Digit. Med. 2021;4:5. doi: 10.1038/s41746-020-00376-2.
- 8. Asan O, Bayrak AE, Choudhury A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 2020;22:e15154. doi: 10.2196/15154.
- 9. Fujimori R, et al. Acceptance, barriers, and facilitators to implementing artificial intelligence-based decision support systems in emergency departments: quantitative and qualitative evaluation. JMIR Form. Res. 2022;6:e36501. doi: 10.2196/36501.
- 10. Sutton RT, et al. An overview of clinical decision support systems: benefits, risks, and strategies for success. NPJ Digit. Med. 2020;3:17. doi: 10.1038/s41746-020-0221-y.
- 11. Panch T, Mattie H, Atun R. Artificial intelligence and algorithmic bias: implications for health systems. J. Glob. Health. 2019;9:010318. doi: 10.7189/jogh.09.020318.
- 12. Vokinger KN, Feuerriegel S, Kesselheim AS. Mitigating bias in machine learning for medicine. Commun. Med. (Lond.) 2021;1:25. doi: 10.1038/s43856-021-00028-w.
- 13. Yang J, Soltan AAS, Eyre DW, Yang Y, Clifton DA. An adversarial training framework for mitigating algorithmic biases in clinical machine learning. NPJ Digit. Med. 2023;6:55. doi: 10.1038/s41746-023-00805-y.
- 14. Pinsky MR, Dubrawski A, Clermont G. Intelligent clinical decision support. Sensors (Basel) 2022;22:1408. doi: 10.3390/s22041408.
- 15. Devaraj S, Sharma SK, Fausto DJ, Viernes S, Kharrazi H. Barriers and facilitators to clinical decision support systems adoption: a systematic review. J. Bus. Adm. Res. 2014. doi: 10.5430/jbar.v3n2p36.
- 16. Mamo C. Not Using AI in Healthcare Will Soon be Malpractice. 2021. https://emerging-europe.com/news/not-using-ai-in-healthcare-will-soon-be-malpractice/
- 17. Wang G, et al. Development of metaverse for intelligent healthcare. Nat. Mach. Intell. 2022;4:922–929. doi: 10.1038/s42256-022-00549-6.
- 18. Abdellatif H, et al. Teaching, learning and assessing anatomy with artificial intelligence: the road to a better future. Int. J. Environ. Res. Public Health. 2022;19:14209. doi: 10.3390/ijerph192114209.
- 19. Koo H. Training in lung cancer surgery through the metaverse, including extended reality, in the smart operating room of Seoul National University Bundang Hospital, Korea. J. Educ. Eval. Health Prof. 2021;18:33. doi: 10.3352/jeehp.2021.18.33.
- 20. Kiyasseh D, et al. A multi-institutional study using artificial intelligence to provide reliable and fair feedback to surgeons. Commun. Med. (Lond.) 2023;3:42. doi: 10.1038/s43856-023-00263-3.
- 21. Kiyasseh D, et al. Human visual explanations mitigate bias in AI-based assessment of surgeon skills. NPJ Digit. Med. 2023;6:54. doi: 10.1038/s41746-023-00766-2.
- 22. Kiyasseh D, et al. A vision transformer for decoding surgeon activity from surgical videos. Nat. Biomed. Eng. 2023;7:780–796. doi: 10.1038/s41551-023-01010-8.