With recent advances in artificial intelligence (AI), particularly in machine learning and deep learning, there is increasing excitement regarding their potential use in medicine. Continual learning, also known as lifelong learning or online machine learning, is a fundamental concept in machine learning (ML) in which models continuously learn and evolve from increasing amounts of input data while retaining previously learned knowledge.1 It is a dynamic process of supervised learning that allows a model to learn incrementally and autonomously adjust its behavior without forgetting the original task.
The recommender systems used by companies such as Netflix and Amazon are well-known examples of continual learning. These systems instantly gather new labeled data as people interact with the model output and adjust their recommendations accordingly.2 In medicine, a continual learning model (previously trained with labeled, stationary data from other patients) would ideally assist the physician by performing tasks such as providing diagnoses or making management decisions. New patient data and the results of previous tasks (actual diagnoses or treatment outcomes) would be introduced to the model, which would then transfer its previous knowledge to the new data, fine-tune its current task, or even incrementally learn new tasks.
Although continual learning ML systems sound ideal for medical purposes, in practice there are many long-standing challenges in applying them.3 One main obstacle is catastrophic forgetting (also called catastrophic interference), in which new information interferes with what the model has already learned. This can lead to an abrupt decrease in performance while the new data are being integrated or, even worse, to the model's previous knowledge being overwritten by the new data.4,5 Most current applications of continual learning in nonmedical fields are less critically affected by this limitation.2 Continual learning models in healthcare settings, by contrast, must address many heterogeneous problems that require multiple, complex tasks. In addition, although this is not unique to medicine, the stakes for real-time medical applications of AI are high because of their impact on health outcomes.
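The mechanics of catastrophic forgetting can be illustrated with a minimal, hypothetical sketch (plain NumPy, not a clinical model): a logistic-regression classifier is trained on task A, then updated using data from task B only, and its accuracy on task A collapses because the new gradients overwrite the old weights.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task(center_pos, center_neg, n=200):
    """Toy two-Gaussian binary classification task (illustrative only)."""
    X = np.vstack([rng.normal(center_pos, 0.5, (n, 2)),
                   rng.normal(center_neg, 0.5, (n, 2))])
    y = np.concatenate([np.ones(n), np.zeros(n)])
    return X, y

def train(w, X, y, epochs=50, lr=0.1):
    """Plain gradient descent on logistic loss, using one task's data only."""
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-X @ w))
        w = w - lr * X.T @ (p - y) / len(y)
    return w

def accuracy(w, X, y):
    return float(np.mean(((X @ w) > 0) == y))

# Task B reverses task A's class geometry, so its gradients
# directly oppose what the model learned on task A.
XA, yA = make_task(( 2,  2), (-2, -2))
XB, yB = make_task((-2, -2), ( 2,  2))

w = train(np.zeros(2), XA, yA)
acc_before = accuracy(w, XA, yA)   # near-perfect on task A

w = train(w, XB, yB)               # continue training on task B only
acc_after = accuracy(w, XA, yA)    # task A performance is overwritten
```

Here forgetting is total because the two toy tasks conflict directly; real clinical tasks overlap only partially, but the same interference degrades earlier capabilities whenever updates draw on new data alone.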
A simple solution to catastrophic interference is to retrain the model from scratch each time new data become available, but this is computationally expensive and inhibits real-time inference. Advances in cloud computing may eventually address this problem, but at present the HIPAA-compliant, GPU-accelerated resources needed to retrain on full datasets are legally complex to create and difficult to maintain securely. Healthcare information governance also varies across countries and is constantly evolving, making compliance difficult to maintain. In addition, retaining the retrospective training sets needed to fully retrain the model alongside new data is especially challenging in healthcare because of consent-for-use constraints. For these reasons, online training methods that avoid full retraining and instead use only the new data are probably more realistic in the healthcare setting.
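One way to picture such an online scheme is a model whose update step touches only the newly arrived batch, so the retrospective training set never needs to be stored or re-processed. A minimal NumPy sketch (class names and the update rule are illustrative assumptions, not a production design):

```python
import numpy as np

rng = np.random.default_rng(1)

class OnlineLogReg:
    """Online logistic regression: each newly labeled batch triggers a
    few gradient steps on that batch alone; old data are never revisited."""
    def __init__(self, n_features, lr=0.05):
        self.w = np.zeros(n_features)
        self.lr = lr

    def predict(self, X):
        return (X @ self.w) > 0

    def update(self, X_new, y_new, steps=10):
        for _ in range(steps):
            p = 1 / (1 + np.exp(-X_new @ self.w))
            self.w -= self.lr * X_new.T @ (p - y_new) / len(y_new)

def new_batch(n=100):
    """Simulated stream of new patients from a stationary population."""
    X = rng.normal(0, 1, (n, 3))
    y = (X @ np.array([1.5, -2.0, 0.5]) > 0).astype(float)
    return X, y

model = OnlineLogReg(n_features=3)
for _ in range(30):                 # thirty arriving batches
    X_new, y_new = new_batch()
    model.update(X_new, y_new)      # only the new data are used

X_test, y_test = new_batch(1000)
test_acc = float(np.mean(model.predict(X_test) == y_test))
```

This avoids the cost and governance burden of full retraining; on a non-stationary stream, however, this is exactly where catastrophic interference re-enters.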
Current applications of ML and deep learning in medical research have been mostly limited to supervised learning, in which a focused task (e.g., classification or segmentation of images) is trained using labeled data.6,7 To date, only a few automated algorithms have been approved by the Food and Drug Administration (FDA), in limited capacities such as detection of diabetic retinopathy or breast abnormalities.8,9 All of these algorithms have been “locked” for safety, to prevent any potential for further learning or change post-approval.8,9 However, continual learning (i.e. “unlocked”) ML models may be more advantageous, as they can incrementally learn from their mistakes and fine-tune their performance with progressively more data, similar to the way human clinicians learn.
There are certain areas within clinical medicine where continual learning ML models could be safely implemented. One example is diagnostic testing, although labeling the new data would be the rate-limiting step. When new patient data become available, the trained model would perform inference and make a diagnostic call. The new data would also need to be manually graded against the reference standard, and the results would then be used to update the model (Figure 1A). Manual image grading is time-consuming and will limit the overall utility of an automated AI algorithm, since all new incremental data will require human input to produce reliable labels; however, the performance of the model as it “learns” would not directly affect patient outcomes.
Figure 1. Diagram of potential continual learning algorithms.
A) Diagnostics. Unlike the traditional locked model, the current state AI model continually updates as new data are inputted. When new data are fed into the current state AI model, the model provides the diagnostic output. Meanwhile, the same new data are manually labeled and graded. This information is fed into the continual learning model, updating the current state AI model. B) Predictive Analytics. Instead of labeling the data as in A, there is a waiting period between the arrival of new clinical data and the extraction of the clinical outcome. Once the clinical outcome has occurred and has been compared with the model’s prediction, this information is used to update the current state AI model. C) Clinical Decision Making. New data are fed into the current state AI model, which provides a treatment decision. Once this is implemented in the clinical pathway, there is a waiting period for the resulting clinical outcome to occur. This information is used to update the current state AI model.
AI, artificial intelligence
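The diagnostic workflow in Figure 1A can be sketched as a small event loop (all names hypothetical; `MajorityModel` is a deliberately trivial stand-in for a trained classifier): inference returns immediately, while the model update is gated on the manual reference-standard grade.

```python
class MajorityModel:
    """Trivial stand-in for a trained classifier: predicts the most
    common graded label seen so far (illustrative only)."""
    def __init__(self):
        self.counts = {0: 0, 1: 0}

    def predict(self, features):
        return max(self.counts, key=self.counts.get)

    def update(self, features, label):
        self.counts[label] += 1

class DiagnosticLoop:
    """Figure 1A as code: the diagnostic call is returned at once,
    but learning waits on the manual grading step."""
    def __init__(self, model):
        self.model = model
        self.awaiting_grade = {}             # cases queued for manual grading

    def new_case(self, case_id, features):
        self.awaiting_grade[case_id] = features
        return self.model.predict(features)  # diagnosis available now

    def grade_arrived(self, case_id, reference_label):
        features = self.awaiting_grade.pop(case_id)  # the rate-limiting step
        self.model.update(features, reference_label)

loop = DiagnosticLoop(MajorityModel())
loop.new_case("pt1", [0.3, 0.7])    # instant diagnostic call
loop.grade_arrived("pt1", 1)        # days later: manual reference grade
```

The queue makes the bottleneck explicit: the model improves only as fast as human graders can supply reliable labels.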
Continual learning ML models could also be used for predictive analytics, in situations where clinical outcomes can be automatically obtained and fed into the algorithm (Figure 1B). For example, if a model were to predict a critical clinical outcome such as all-cause mortality within three months, then at three months the actual clinical outcome would be used to update the model. Since the standard of care would not change, this scenario is a safer setting in which to test continual learning algorithms, with the added benefit that manual grading would not be necessary. Ultimately, if the model’s performance improves and surpasses expert predictions, then it may seem reasonable to integrate the model’s output into the clinical care pathway. It is important to note that before the model’s predictions are used to change clinical decisions, a prospective randomized clinical trial should be performed to compare against the standard of care. In addition, the performance of the model may be impacted, as the model will need to readapt with changing care paradigms.
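This loop (Figure 1B) differs from the diagnostic one only in how labels arrive: the outcome is extracted automatically after a fixed follow-up horizon rather than graded by hand. A minimal sketch with hypothetical names and a stub model that merely records its updates:

```python
import datetime as dt

class StubModel:
    """Placeholder for a trained risk model; it only logs its updates."""
    def __init__(self):
        self.updates = []

    def predict(self, features):
        return 0.5                    # dummy risk score

    def update(self, features, outcome):
        self.updates.append(outcome)

class OutcomeTracker:
    """Figure 1B as code: each prediction is logged, and once the
    follow-up horizon passes, the actual outcome (e.g. 3-month
    all-cause mortality) updates the model with no manual grading."""
    HORIZON = dt.timedelta(days=90)

    def __init__(self, model):
        self.model = model
        self.open_predictions = {}    # patient_id -> (features, due date)

    def predict(self, patient_id, features, today):
        due = today + self.HORIZON    # when the outcome can be extracted
        self.open_predictions[patient_id] = (features, due)
        return self.model.predict(features)

    def record_outcome(self, patient_id, outcome):
        # outcome pulled automatically from the record at the horizon
        features, due = self.open_predictions.pop(patient_id)
        self.model.update(features, outcome)

tracker = OutcomeTracker(StubModel())
tracker.predict("pt1", [0.2, 0.7], today=dt.date(2020, 1, 1))
tracker.record_outcome("pt1", outcome=1)   # known 90 days later
```

Because the update pair (features, outcome) is assembled automatically, the human bottleneck of Figure 1A disappears, at the cost of a built-in delay equal to the follow-up horizon.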
The ultimate goal is for continual learning models to do just that: optimize clinical management decisions in real time. For example, AI models could be combined with therapeutics to provide the optimal medication dosing and/or combination of drugs for individual patients, or could help control the ventilator settings of intubated patients in critical care units.10 In these situations, the model would be making active clinical decisions and attempting to optimize the eventual clinical outcome of the patient, which can lead to potential complications. Once the AI model output becomes fully integrated into the clinical management decision, there will be a delay in evaluating the ultimate outcome for each participant (Figure 1C). Patients could potentially be harmed by erroneous versions of the model as it changes and updates. In addition, when these models are used in real time, there is no separate set of aggregated data on which to test the model’s safety, since the model is directly influencing the clinical outcome.
Other significant challenges need consideration prior to implementing continual learning models in the clinical arena. First, no established methods exist for evaluating the quality of these models. After the initial launch of the model and evaluation of its performance using traditional metrics, other factors such as the collection process for new data, the automated organization or labeling of new data, the knowledge transfer between new and original data, and the overall performance of the model after incorporating data would all need to be validated while ensuring that no catastrophic interference occurred. Second, the regulatory challenges will be substantial. Last year’s white paper from the FDA titled “Proposed Regulatory Framework for Modifications to Artificial Intelligence/Machine Learning Based Software as a Medical Device” illustrates that a new framework is needed to allow AI algorithms that, by their very nature, will continuously update and change after they are approved.11 Third, the use of these models for clinical applications will require acceptance from everyone involved in medical care. There is no fail-safe AI model, as the entire premise of continual learning models is that they improve because they make mistakes, and systems must be established to respond when errors occur. Finally, continual learning models will have to merge clinical data from large numbers of patients, which may lead to privacy concerns.
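One candidate check against catastrophic interference, offered here only as a sketch (no established standard exists, and all names are hypothetical), is to re-score every updated model on a frozen benchmark set and roll back whenever performance regresses:

```python
import copy

class ToyModel:
    """Toy threshold classifier (illustrative only)."""
    def __init__(self):
        self.theta = 0.5

    def predict(self, x):
        return int(x > self.theta)

    def update(self, X_new, y_new):
        # deliberately naive update: threshold at the new batch's mean
        self.theta = sum(X_new) / len(X_new)

class GuardedUpdater:
    """After each incremental update, re-score a frozen benchmark set;
    if accuracy drops by more than `tol`, restore the last accepted
    model state (one possible interference guard, not a standard)."""
    def __init__(self, model, bench_X, bench_y, tol=0.05):
        self.model = model
        self.bench = (bench_X, bench_y)
        self.tol = tol
        self.best = self._score()
        self.checkpoint = copy.deepcopy(model)

    def _score(self):
        X, y = self.bench
        return sum(self.model.predict(x) == t for x, t in zip(X, y)) / len(y)

    def try_update(self, X_new, y_new):
        self.model.update(X_new, y_new)
        score = self._score()
        if score < self.best - self.tol:                  # regression detected
            self.model = copy.deepcopy(self.checkpoint)   # roll back
            return False
        self.checkpoint = copy.deepcopy(self.model)
        self.best = max(self.best, score)
        return True

guard = GuardedUpdater(ToyModel(), [0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
ok = guard.try_update([0.4, 0.6], [0, 1])      # harmless batch: accepted
bad = guard.try_update([0.05, 0.1], [1, 1])    # harmful batch: rolled back
```

A frozen benchmark of this kind also provides an audit trail of every accepted and rejected model state, which could feed the kind of regulatory reporting the FDA framework anticipates.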
There is enormous potential for the use of continual learning AI models in the practice of medicine, but this technology should be implemented cautiously, beginning with lower-risk applications. Results from these lower-risk cases can be used to develop regulatory guidelines and establish systems for addressing problems as they arise. As with any new technology, careful risk management will be essential, but the potential benefits of this powerful tool are impressive and may ultimately change the practice of medicine.
Acknowledgments
Financial Support: NIH/NEI K23EY029246, R01AG060942, and an unrestricted grant from Research to Prevent Blindness. The sponsors/funding organizations had no role in the design or conduct of this research. We have not been paid to write this article by a pharmaceutical company or other agency. I, Aaron Lee, had the final responsibility for the decision to submit for publication.
Footnotes
DECLARATION OF INTEREST
Dr. A. Lee reports other from US Food and Drug Administration, grants from Santen, personal fees from Genentech, grants from Carl Zeiss Meditec, grants from Novartis, personal fees from Topcon, personal fees from Verana Health, outside the submitted work; Dr. C. Lee has nothing to disclose. This article does not reflect the opinions of the US Government or of the US FDA.
REFERENCES
- 1.Parisi GI, Kemker R, Part JL, Kanan C, Wermter S. Continual lifelong learning with neural networks: A review. Neural Netw 2019; 113: 54–71. [DOI] [PubMed] [Google Scholar]
- 2.Portugal I, Alencar P, Cowan D. The use of machine learning algorithms in recommender systems: A systematic review. Expert Systems with Applications. 2018; 97: 205–27. [Google Scholar]
- 3.Hassabis D, Kumaran D, Summerfield C, Botvinick M. Neuroscience-Inspired Artificial Intelligence. Neuron 2017; 95: 245–58. [DOI] [PubMed] [Google Scholar]
- 4.McClelland JL, McNaughton BL, O’Reilly RC. Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol Rev 1995; 102: 419–57. [DOI] [PubMed] [Google Scholar]
- 5.McCloskey M, Cohen NJ. Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem. Psychology of Learning and Motivation. 1989; : 109–65. [Google Scholar]
- 6.Lee CS, Baughman DM, Lee AY. Deep learning is effective for the classification of OCT images of normal versus Age-related Macular Degeneration. Ophthalmol Retina 2017; 1: 322–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.De Fauw J, Ledsam JR, Romera-Paredes B, et al. Clinically applicable deep learning for diagnosis and referral in retinal disease. Nat Med 2018; 24: 1342–50. [DOI] [PubMed] [Google Scholar]
- 8.Rana SP, Dey M, Tiberi G, et al. Machine Learning Approaches for Automated Lesion Detection in Microwave Breast Imaging Clinical Data. Sci Rep 2019; 9: 10510. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Abràmoff MD, Lavin PT, Birch M, Shah N, Folk JC. Pivotal trial of an autonomous AI-based diagnostic system for detection of diabetic retinopathy in primary care offices. NPJ Digit Med 2018; 1: 39. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Ghassemi MM, Alhanai T, Westover MB, Mark RG, Nemati S. Personalized medication dosing using volatile data streams. In: Workshops at the Thirty-Second AAAI Conference on Artificial Intelligence. 2018. https://www.aaai.org/ocs/index.php/WS/AAAIW18/paper/viewPaper/17234. [Google Scholar]
- 11.US Food and Drug Administration. Proposed Regulatory Framework for Modifications to Artificial Intelligence/Machine Learning Based Software as a Medical Device. https://www.fda.gov/files/medical%20devices/published/US-FDA-Artificial-Intelligence-and-Machine-Learning-Discussion-Paper.pdf (accessed March 8, 2020).