Introduction
Artificial intelligence (AI) is a hot topic and might be the most intensively discussed subject in recent times, not only for autonomous driving and IoT applications but also, or even particularly, in the field of medicine. AI stimulates computer-based processes in manifold ways, and this issue of Visceral Medicine highlights some of them from an expert point of view and with respect to the current literature. AI methods promise revolutionary progress in almost all fields of medicine; however, as a new, somewhat obscure and elusive technology, at least for non-experts, they also raise concern and caution. Fear of the unknown is a natural attitude of our species, one that served as a guarantor of survival in the face of wild animals and poisonous fruits, and it can also be found in the context of AI. However, reservation and skepticism are often aligned with missing, or more to the point, layman knowledge, while elucidation and information cure it. The following expert discussion thus aims at nothing less than to provide views from experts of different areas in order to allow for a profound and unprejudiced perception of the topic of AI. Fortunately, we succeeded in attracting five renowned experts for this interdisciplinary discussion, who were all asked to answer a few selected questions. With specialists in visceral surgery, gastroenterology, and computer science, one representative from the European Commission Expert Group on AI, and one from industry, we hope to have illuminated this topic from all relevant directions, enabling the completion of a well-balanced essay. The following discussion summarizes the provided expert opinions in an integrative fashion.
What Are the Areas of Medicine That You Think Can Benefit Most from AI?
Beneficial effects of AI are expected not only for daily practice but also for research and education. Depending on their primary focus, however, the involved experts saw the major impact of AI in different fields. In the view of the contributing clinicians, computer-based diagnostics and especially imaging modalities will benefit most from AI. Müller-Stich expects decision support systems to play a relevant role and to support patient care during the entire treatment process, starting pre-therapeutically and continuing over the clinical and interventional course until discharge and through the postoperative course. Other good examples in this field are computer-based real-time tissue characterization methods, which provide support during diagnostic endoscopy and intervention, as highlighted by Meining. AI is expected to facilitate early lesion detection and preventive medicine and to enable new and individualized treatment. AI will provide the tools necessary to transform data into information, which then enables decision support. Nonetheless, there are complex challenges and possible shortcomings that must be considered before full implementation can occur. As Navab explains, AI is based on machine learning (ML), and its success is therefore highly dependent on the availability of data, and in particular annotated data. Such data may be available more easily for diagnostics than for treatment. In the field of diagnostics, the most progress has thus been, and will be, made first in medical imaging. One of the reasons for this is that the success of ML in the last decade appeared first through advancements in image analysis and computer vision. Therefore, fields such as pathology and radiology, in which images play crucial roles and were easily available in high volume, were the first to see the implementation and success of ML in the improvement of their outcomes.
The existence of the internationally accepted DICOM standard for the storage and transfer of imaging data might have contributed to this cutting-edge position. With the success of ML becoming more and more evident, other fields of medicine have started gathering digital annotated data, but those relying heavily on image analysis will remain the ones taking first advantage of ML in the coming years. Nevertheless, AI will penetrate all fields of medicine and thus will not be used exclusively in imaging, as noted by the industry expert (A. Jarc), who is absolutely convinced that AI will also stimulate surgery, and especially robotic surgery. AI will be used here to support the surgeon through intraoperative guidance and may influence how, and how well, surgery is performed. Thus, and as already mentioned, AI will not aim at replacing clinicians, e.g., by autonomous robotic systems, but will provide supportive measures to improve their performance. This could happen on the fly through methods of augmented reality, or postoperatively by identifying objective performance features, which help assess one's own quality of care and compare one's own performance to that of other experts who have performed the same procedure. This objective qualitative evaluation would allow the surgical community to teach and scale optimal skills in surgery like never before. Especially young trainees would benefit from this approach, as it could help shorten learning curves and improve the overall performance level.
Another aspect is added by L. Bouarfa, a computer scientist and member of the European Commission's High-Level Expert Group on Artificial Intelligence; she expects the validation of new medical interventions and treatment concepts to undergo a major transformation through AI, which will support or even replace clinical trials under real-world conditions by means of big data analysis. Alternatively, adaptive clinical trial designs might be implemented and could also help shorten the regulatory process and the approval of new methods and treatment concepts.
What Could the Future of an AI-Enabled Healthcare System Look Like? How Will Healthcare Employees Apply AI Technologies in Daily Work?
Navab is convinced that AI methods will not dominate the healthcare system at the sharp end but will at first become hidden components in medicine for support and assistance; they will help doctors with their decision-making, not replace them entirely. Visions for the implementation of AI in healthcare derive from advanced car navigation systems or aviation. While the driver still decides where to go and, at least to date, controls the car, the navigation system uses the real-time flow of location and movement of other vehicles on the streets to provide guidance to the user. The user is globally aware of the way the system works but is not directly aware of all the information and features that the system is using at all times. As soon as AI-enabled methods have proven their usefulness, and in particular their reliability, they will become standard, allowing doctors to focus on more and more complex issues. Meining suggests that AI will free up physicians' time from mundane tasks such as collecting data, analyzing data, or writing reports, and will help them focus on delivering value to patients, saving as many lives as possible. Annoying tasks and bureaucracy will be dramatically reduced or even eliminated. In the opinion of A. Jarc, AI may change the role of physicians and healthcare workers in the future by shifting the nature of “collaboration” between humans and machines, with the goal of improving both outcomes and the empathetic human contact and interaction surrounding a patient's care.
As already prefigured by health trackers and apps, Bouarfa thinks that AI will transform the healthcare system to be preventive in a proactive fashion. AI will be fed by smart sensor technology; wearables, cameras, intelligent devices, robots, tracking systems, and others will systematically collect data and store them in structured form, as assumed by Müller-Stich. AI-enabled care will not wait for people to get sick but will intervene early and effectively. Solutions such as Google Flu Trends, which analyzed search terms for hints of an epidemic flu and was able to “predict” an epidemic 10–14 days before the respective health authority [1], will be refined and adapted to individual needs, including automatic checks. These systems will alert people for early diagnosis and treatment.
As suggested by Bouarfa, AI will further support the “navigation” of patients, both by patients themselves, e.g., for selecting the best treatment site and specialist for a given problem, and by medical professionals, who will be aided in selecting the most promising treatment modality for each patient after AI-enabled diagnostics. Medical employees will use AI on a daily basis, which will empower them in several ways, e.g., by planning cases based on urgency and risk scores, by advising recommended checks, and, during the course of clinical care, by the prediction and early detection of complications. The products of automatic data processing will be provided app-based on mobile devices or integrated into the healthcare equipment.
Bouarfa believes that the fruitful implementation of AI in healthcare depends on the acceptance of professionals and their cooperation with engineers and computer scientists. As published by Warr et al. [2]: “If aerospace engineers had sought to mimic birds flapping wings and feathers, they would have created an unsafe and ineffective transport mechanism. We should treat healthcare like engineering, and focus on building AI systems that help healthcare stakeholders and professionals navigate through complexity, to meet the ultimate objectives of more accurate and effective diagnoses for the patients. Doctors, drug providers, payers and regulators need to oversee processes and interpret the results from computers, somewhat similarly to how pilots oversee the flying of the plane.”
Which Measures and Changes Are Necessary so That AI Can Be Successfully Established in the Clinic?
In the opinion of the interviewed experts, establishing AI in the clinic is first an ethical-regulatory but also a technical issue. As L. Bouarfa points out, everybody must be aware of the fact that the patient and his or her care stand at the center of AI solutions and that profit and secondary added value must take second place. A behavioral and cultural change at the point of care thus might be required, which might be fostered by patient organizations and cross-stakeholder collaboration, as well as by healthcare professionals, who are asked to carefully track outcomes and to establish outcome-based measures. In this context, the establishment of common vocabularies and process models is necessary in the assessment of A. Jarc, so that users and healthcare professionals can easily understand them. Müller-Stich and Meining both require that the implementation of AI in the clinic focus on meaningful technologies that are thoroughly validated. Everybody involved in this subject has to take on the responsibility that the gathered information is used for the improvement of global healthcare and that no information is withheld if its availability could accelerate progress toward better healthcare solutions. AI should be applied to the benefit of patients and involved professionals, and ease and streamline the current workflow instead of interfering with it. Outcomes of AI-supported methods and treatments have to be openly communicated, and insight has to be given into the analytic process. The latter, in the opinion of Meining, mainly applies to the approval and control of input data and to how, and on which clinical standards, the AI was developed, rather than to explaining the underlying algorithms and reasoning. From the regulatory side, responsibilities have to be defined to guarantee the safety of AI-based measures in the medical context. This includes that AI methods are critically assessed and their approval is regulated by assigned authorities.
Another responsibility concerns the access to and use of patient data, which form the basis of any AI method and thus must be controlled not only with respect to privacy issues but also with regard to their value for global healthcare. Who will own the data, or will patient data become generally available for healthcare use? We have to define solutions that balance the need for big data against the privacy protection of an ill patient. If we strive for accelerated progress in the quality of treatment through the opportunity of big and smart data analysis, Müller-Stich assumes that we have to accept the inability to completely control the mode of action of AI algorithms. Accordingly, clear regulations have to be defined by a reliable authority, including active representatives of patients at local, national, and international levels. The acquisition, storage, and use of patient data, which is an absolute necessity for the improvement of AI solutions and of healthcare, must be regulated in a cautious and reliable manner. The heterogeneity of data and the associated bias that results from the diverse characteristics of patients, clinicians, and care teams around the world also need to be addressed, and it is important that we learn from existing examples how bias can find its way into the world and how we can control for this.
N. Navab sees this question more from the technical side and suggests that digitalization is probably still the strongest barrier to the deployment of AI as a universal solution in medicine. Digitalization includes the standardization of data communication between different systems through the establishment of generally accepted standards and protocols, and thus the realization of systems interoperability. Beyond this, the need for secure data access, data safety, and data protection has to be addressed, and reliable solutions have to be developed, preferably on an international basis. The generation of structured data in the clinic has to be advanced by the installation of suitable sensors, the use of digital methods, and other solutions, such as advanced user interfaces that facilitate data input but also allow for the communication of AI-derived measures.
What Are the Ethical Concerns about AI and How Can These Be Overcome?
L. Bouarfa has observed that AI raises several concerns in society. To start with, she notices a clear fear of unfair bias. Bias occurs when a system shows an unfair inclination or prejudice against a group of people. This group can be defined by any criteria, but gender, race, age, and religion are common ones. Bias is further related to what data are used for training algorithms, how data are labelled, and who is designing the algorithms, as all of these are major factors in the objectivity of the AI results. A. Jarc confirms the data-related bias and demands that we strive to create inclusive development teams and act, as a community, as good stewards of data by developing and deploying AI technologies that respect a user's privacy and maintain the highest standards of data protection. Bias can be reduced by having multidisciplinary teams involved in the product development process, by balancing the classes in the available data, or by controlling for some variables in the data (e.g., doctor gender, location of institution). Another possibility, as suggested by Navab, is patient bodies or governmental organizations representing patients which control the correct use of data. They should balance patient privacy and ethical concerns on the one hand against the progress of healthcare and the discovery of novel solutions on the other.
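To make the idea of "balancing the classes in the available data" concrete, the following minimal Python sketch shows one common reweighting heuristic, in which each class is weighted inversely to its frequency so that an under-represented patient group is not drowned out during training. The data and group names are hypothetical, and this is only an illustration of the general technique, not a method used by any of the systems discussed here.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class inversely to its frequency, so that a model
    trained with these per-sample weights treats all classes as
    equally important (the common 'balanced' reweighting heuristic:
    w_c = n_samples / (n_classes * n_c))."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

# Hypothetical cohort where 90% of records come from one patient group.
labels = ["group_a"] * 90 + ["group_b"] * 10
weights = balanced_class_weights(labels)
# Records from the minority group receive 9x the weight of majority
# records, counteracting the imbalance during training.
```

Reweighting is only one option; resampling the minority class or collecting more representative data address the same imbalance at the data level rather than the loss level.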
However, Bouarfa also sees a source of social worry in the accuracy, or robustness, of AI. It is a matter of concern when an algorithm does not achieve sufficient accuracy to make good predictions about patients. To decide whether an algorithm provides better value than the current standard of care, comparative analyses need to be performed between a control and a trial group in the real-world setting, where patients are treated with and without the algorithm. Then we can measure the uplift in human-reported outcomes, both for patients and doctors, and we can report on classical measures for the patients, such as efficiency, efficacy, and safety. Moreover, liability has to be defined for when errors occur.
Finally, there is the element of the ‘human in the loop’. Fully autonomous AI systems are not adequate for healthcare. In the healthcare context, we need a human in the loop to analyze the AI system's output and make a decision based on moral and societal values. There can be instances when a choice is made by a fully autonomous AI system, for example, conversational algorithms such as chatbots. If this is the case, then the system has to make clear to the user that he or she is interacting with an AI algorithm and not with an actual human being. According to Meining, specific attention must be given to strengthening trust with the patient and to reviewing what amount of information is sufficiently rich and understandable for autonomous patient reflection and decision-making.
Many Clinicians Ask for an “Explainable” AI to Be Able to Understand the Decisions Made; Does This Make Sense, or Will We Need Other Methods to Validate Such Algorithms?
No full agreement was found among the involved experts on this question, not even among the clinicians. While one expert rated completely explainable AI as utopian, as he “does not even understand the mode of operation of a TV or a computer,” Meining assessed explainable AI as definitely sensible. He expects that the major barriers encountered by AI systems are the quality of input data sets, the lack of appropriate annotations, and the absence or neglect of robust gold standards against which to train AI models. Therefore, explanations need to be given to clinicians (who will certainly need some guidance by experts of the respective professional societies) so that they can judge the usability of an AI in the clinical context.
A more sophisticated answer was contributed by N. Navab, who differentiated between explainability and interpretability. He believes that AI solutions for medicine need to be interpretable, in the sense of the doctors being able to see, for example, why the system has decided that a patient has a disease or a given level of severity. Interpretability here could be achieved, for example, by the system highlighting unnatural or excessive vascularization around a tumor, which made it classify the lesion as malignant. However, the exact mathematical model or method may not be exposed, and therefore no explainability is given. Navab sees a subtle difference between why and how, where the explanation of why should already meet the demands of clinicians. AI methods need to be interpretable for doctors and explainable for computational scientists, as the former need to trust the AI support and intelligence, while the latter need to be able to improve and perfect the existing models and architectures. Thus, explainability seems a significant requirement in order to allow humanity as a whole to improve the science and technology of AI at the highest speed; it might, however, be dispensable for clinicians.
Explainability, in the clinical setting, should arguably be reframed as accuracy, which must be approved depending on the application and the degree of criticality. In the context of AI, at least for clinicians, there is an issue with transparency, known as ‘the black-box effect’. However, there are different algorithms that can analyze causality and correlation, which can help explain the output of an ML algorithm and confirm its accuracy. In this manner, explainable AI is already here, and some solutions include “explainability engines,” which provide the reasons behind every output and indicate whether confounding variables are missing in the data (L. Bouarfa). These engines highlight the key existing variables that correlate with the outcome.
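As a simplified illustration of what such an engine does at its core, the Python sketch below ranks input variables by the absolute Pearson correlation of each with the outcome. The variable names and data are entirely hypothetical, and real explainability engines use far richer attribution methods; this only demonstrates the basic principle of surfacing the variables that drive an output.

```python
import statistics

def correlation_report(features, outcome):
    """Rank input variables by the absolute Pearson correlation of
    each with the outcome -- a toy stand-in for an 'explainability
    engine' that highlights which variables drive a model's output."""
    def pearson(xs, ys):
        mx, my = statistics.fmean(xs), statistics.fmean(ys)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = sum((x - mx) ** 2 for x in xs) ** 0.5
        sy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (sx * sy) if sx and sy else 0.0
    ranked = {name: pearson(vals, outcome) for name, vals in features.items()}
    # Strongest (absolute) correlations first.
    return dict(sorted(ranked.items(), key=lambda kv: -abs(kv[1])))

# Hypothetical toy data: 'lesion_size' tracks the outcome, 'noise' does not.
features = {
    "lesion_size": [1.0, 2.0, 3.0, 4.0, 5.0],
    "noise":       [2.0, 2.0, 1.0, 3.0, 2.0],
}
outcome = [0, 0, 1, 1, 1]
report = correlation_report(features, outcome)
```

Note that correlation alone cannot establish causality; as the text above stresses, a missing confounder can make an irrelevant variable appear influential, which is exactly why such engines also try to flag absent confounding variables.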
To summarize and according to A. Jarc, those building AI technologies will need to collaborate closely with those applying AI technologies in order to ensure that the technologies are designed, validated, deployed, and iterated upon to meet their goals to improve the quality of patient care.
Conflict of Interest Statement
Authors D. Wilhelm, N. Padoy, A. Meining, B. Müller-Stich, and N. Navab have no conflicts of interest to declare. L. Bouarfa is the founder and CEO of OKRA Technologies. A. Jarc is an employee and the director of Research and Data Science at Intuitive Surgical, Inc.
Funding Sources
The authors did not receive any funding.
Author Contributions
All authors contributed equally to the completion, revision, and finalization of the manuscript.
References
- 1. Araz OM, Bentley D, Muelleman RL. Using Google Flu Trends data in forecasting influenza-like-illness related ED visits in Omaha, Nebraska. Am J Emerg Med. 2014 Sep;32(9):1016–23. doi: 10.1016/j.ajem.2014.05.052.
- 2. Warr W, Willetts M, Holmes C. We don't need AI to pass the Turing Test to be helpful in healthcare. The BMJ Opinion. 2019. Available from: https://blogs.bmj.com/bmj/2019/06/07/we-dont-need-ai-to-pass-the-turing-test-to-be-helpful-in-healthcare/