NPJ Digital Medicine. 2024 Dec 3;7:351. doi: 10.1038/s41746-024-01357-5

Artificial intelligence related safety issues associated with FDA medical device reports

Jessica L Handley 1, Seth A Krevat 1,2, Allan Fong 3, Raj M Ratwani 1,2
PMCID: PMC11615200  PMID: 39627534

Abstract

The Biden 2023 Artificial Intelligence (AI) Executive Order calls for the creation of a patient safety program. Patient safety reports are a natural starting point for identifying issues. We examined the feasibility of this approach by analyzing reports associated with AI/Machine Learning (ML)-enabled medical devices. Of the 429 reports reviewed, 108 (25.2%) were potentially AI/ML related, with 148 (34.5%) containing insufficient information to determine an AI/ML contribution. A more comprehensive approach is needed.

Subject terms: Health policy, Public health


Artificial intelligence (AI) use in clinical settings has tremendous potential to improve care and reduce healthcare workforce burden; however, it may also introduce patient safety risks1–3. To begin to address these risks, President Biden’s October 2023 AI Executive Order calls for an AI safety program. Federal agencies, working with patient safety organizations (PSOs), shall establish approaches for identifying and capturing clinical errors resulting from AI in healthcare settings and create a central tracking repository for these issues4. Patient safety event reports, which are descriptions of actual safety incidents or potential safety hazards typically entered by frontline clinicians, are a natural starting point for identifying AI-related safety issues. These reports are already collected by most U.S. healthcare facilities, aggregated by PSOs, and collected by some federal agencies. Further, there is precedent for these reports being used to identify safety issues associated with electronic health records and other technologies5,6. The Executive Order’s explicit reference to PSOs’ involvement in developing the AI safety program suggests that patient safety event reports are being viewed as important sources of information on AI-related safety issues. However, the feasibility of using patient safety event reports in this way is unknown.

We sought to determine whether safety reports associated with AI/machine learning (ML)-enabled medical devices submitted to the Food and Drug Administration’s (FDA) Manufacturer and User Facility Device Experience (MAUDE) database describe AI/ML safety issues and contain enough detail to identify how AI/ML may have contributed to the safety issue. A prior analysis of these reports identified the different factors contributing to safety issues, such as device or use problems, but it did not address whether safety reports can serve to identify clinical errors involving AI/ML-enabled devices7. Although the MAUDE database was never intended to identify clinical errors involving AI, we analyzed these reports with a focus on whether they enable the identification of AI/ML contributions to patient safety, to inform efforts under the Biden Executive Order and the FDA’s real-world monitoring of AI/ML-enabled medical devices.

We identified and reviewed 429 safety reports associated with AI/ML-enabled medical devices. Of the reports reviewed, 108 (25.2%) were potentially AI/ML related and 173 (40.3%) were unlikely AI/ML related. There was insufficient information to determine if AI/ML contributed to the safety event in 148 reports (34.5%); see Table 1.

Table 1.

Frequency counts, percentages, definitions, and examples of the contribution of artificial intelligence/machine learning to safety issues identified in MAUDE reports

Category | Frequency count (%) | Definition | Examples
Potentially AI/ML Related | 108 (25.2%) | The MAUDE report contained language suggesting AI/ML potentially contributed to the event. | “Utilizing insulin algorithm software Monarch Endotool, a pt was administered insulin 9 units as recommended by endotool the pt was hypokalemic potassium replacement was started simultaneously with initiation of insulin the pt was transferred to the icu for dka management he became unresponsive and coded due to an unstable cardiac arrhythmia postcode the pt was found to have critically low potassium level which contributed to the code”
Unlikely AI/ML Related | 173 (40.3%) | The MAUDE report did not contain language suggesting AI/ML contributed to the event. | “While inserting trocar and sleeve part of the tricuspid membrane came apart breaking off and going into abdomen device was not retrieved”
Insufficient Information | 148 (34.5%) | There was not enough information provided in the MAUDE report to determine if AI/ML contributed to the event. | “Additional information will be provided once the investigation has been completed the device manufacturer date is not known at this time however, should it become available, it will be provided in future reports”

Our analysis shows that safety issues are being reported about AI/ML-enabled medical devices and that these issues were potentially related to AI/ML in 25.2% of the reports reviewed. This finding underscores the need for an AI patient safety program, as outlined by the Biden Executive Order. However, the reports identified as potentially AI/ML related lacked sufficient detail to determine how AI/ML contributed at a level of specificity that would enable improvements to the technology. A previous study was able to classify these types of reports as device- or use-related; however, these categories still do not provide the level of specificity needed to identify specific improvements7. Further, 34.5% of reports contained insufficient information to determine whether AI/ML contributed at all. Together, these results highlight that patient safety reports alone, which were never intended for identifying AI issues, may not be sufficient for identifying AI/ML-related safety issues or how AI/ML may have contributed to an issue. Safety reports may fall short because reporters may not have insight into whether AI/ML contributed to the safety issue they observed, given that these algorithms work “behind the scenes”. Thus, a different approach to AI safety is needed to better capture these issues.

A more comprehensive patient safety program should include additional mechanisms for capturing AI safety issues that extend beyond self-reported safety concerns. The safety program should include guidelines to inform the safe implementation and use of AI in clinical settings, proactive AI algorithm monitoring processes, and methods to trace AI algorithm contributions to safety issues8. Guidelines will be especially important to support healthcare facilities as they adopt more AI/ML-enabled technologies and may not have the expertise to implement these technologies safely. These guidelines should inform how to assess technologies for safe use, implement them into the healthcare work system, and frequently monitor for safety issues. In addition, the FDA should develop mechanisms other than the MAUDE database to capture safety issues associated with AI/ML-enabled medical devices. The FDA has been developing best practices and methods to enable updates to AI/ML algorithms under predetermined change control plans.

Limitations of this study include the analysis of safety reports from a single database maintained by one federal agency and the recognition that reporters may not be aware of when AI is contributing to a safety issue. It is possible that patient safety event reports at the healthcare facility level contain different details about AI-related safety issues.

Beyond the Executive Order safety program, additional safety mechanisms, such as AI assurance labs developed through public-private partnerships like the Coalition for Health AI, can promote the adoption of safer AI algorithms9. Keeping patients safe will require engagement from multiple stakeholders, including federal agencies, AI developers, healthcare facilities, frontline clinicians, and patient advocacy groups.

Methods

Medical device manufacturers, importers, and device user facilities (e.g., hospitals and nursing homes) are required to report to the FDA MAUDE database when they learn that any of their medical devices may have caused or contributed to a death or serious injury10. The purpose of MAUDE is to support post-market surveillance of medical devices and risk management, independent of cause or contributing factors. The FDA defines a medical device per Section 201(h) of the Food, Drug, and Cosmetic Act as11:

An instrument, apparatus, implement, machine, contrivance, implant, in vitro reagent, or other similar or related article, including any component, part, or accessory, which is:

  • (A) recognized in the official National Formulary, or the United States Pharmacopoeia, or any supplement to them,

  • (B) intended for use in the diagnosis of disease or other conditions, or in the cure, mitigation, treatment, or prevention of disease, in man or other animals, or

  • (C) intended to affect the structure or any function of the body of man or other animals, and which does not achieve its primary intended purposes through chemical action within or on the body of man or other animals and which is not dependent upon being metabolized for the achievement of its primary intended purposes.

To identify potential AI/ML safety-related reports from the MAUDE database, a list of FDA-approved AI/ML-enabled medical devices, made publicly available by the FDA, was matched to all reports in the MAUDE database through October 18, 2023 using openFDA, an Elasticsearch-based API that serves public FDA data12,13. MAUDE reports with an exact or partial match of an AI/ML-enabled medical device brand name, generic name, or manufacturer name were retrieved, resulting in 429 reports.
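For readers who want to reproduce this kind of retrieval, a minimal Python sketch of the matching step is shown below. The openFDA device/event endpoint URL is real; the queried field names (device.brand_name, device.generic_name, device.manufacturer_d_name), the input file name, and the simple OR query are assumptions about how the matching could be implemented, not the authors’ actual code.

```python
# Minimal sketch: match FDA-listed AI/ML-enabled device names against MAUDE
# adverse-event reports via the openFDA device/event endpoint.
# Field names and the input file are assumptions; adjust to the openFDA schema as needed.
import requests

OPENFDA_DEVICE_EVENT = "https://api.fda.gov/device/event.json"

def fetch_reports_for_device(name: str, limit: int = 100) -> list[dict]:
    """Return MAUDE reports whose brand, generic, or manufacturer name matches `name`."""
    query = (
        f'device.brand_name:"{name}" OR '
        f'device.generic_name:"{name}" OR '
        f'device.manufacturer_d_name:"{name}"'
    )
    resp = requests.get(OPENFDA_DEVICE_EVENT, params={"search": query, "limit": limit}, timeout=30)
    if resp.status_code == 404:  # openFDA returns 404 when no records match
        return []
    resp.raise_for_status()
    return resp.json().get("results", [])

if __name__ == "__main__":
    # Hypothetical input: the FDA's public list of AI/ML-enabled device names, one per line.
    with open("ai_ml_devices.txt") as f:
        device_names = [line.strip() for line in f if line.strip()]
    reports = []
    for name in device_names:
        reports.extend(fetch_reports_for_device(name))
    print(f"Retrieved {len(reports)} candidate MAUDE reports")
```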

Reports were independently reviewed by a physician safety leader and a human factors expert to determine whether AI/ML contributed to the safety event or whether insufficient information was provided to identify AI/ML as a contributor. Each report was classified into one of three categories (Table 1). For a report to be considered potentially AI/ML related, the report had to explicitly describe that an AI algorithm was used, and its use had to be associated with the reported safety issue. If the report explicitly described an aspect of the device that may have contributed to the safety issue, and that aspect was not AI/ML related, it was coded as unlikely AI/ML related. All other reports were coded as insufficient information. Discrepancies were discussed until a consensus was reached.
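The classification and tallying described above can be illustrated with a short, hypothetical sketch. The category names and the published percentages mirror Table 1; the per-report labels and the reconciliation helper are illustrative only and do not reproduce the review process used in the study.

```python
# Illustrative sketch: reconcile two independent reviewers' labels into the three
# Table 1 categories and compute frequency percentages. Labels below are made up.
from collections import Counter

CATEGORIES = ("Potentially AI/ML Related", "Unlikely AI/ML Related", "Insufficient Information")

def reconcile(label_a: str, label_b: str) -> str:
    """Keep agreed labels; flag disagreements for discussion until consensus."""
    return label_a if label_a == label_b else "NEEDS CONSENSUS DISCUSSION"

# Hypothetical per-report labels from the two reviewers.
labels_a = ["Potentially AI/ML Related", "Unlikely AI/ML Related", "Insufficient Information"]
labels_b = ["Potentially AI/ML Related", "Insufficient Information", "Insufficient Information"]

final = [reconcile(a, b) for a, b in zip(labels_a, labels_b)]
counts = Counter(final)
total = len(final)
for cat in CATEGORIES:
    print(f"{cat}: {counts[cat]} ({100 * counts[cat] / total:.1f}%)")
# With the study's 429 reports, this tally yields 108 (25.2%), 173 (40.3%), and 148 (34.5%).
```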

Acknowledgements

This work was supported by grant R01HS026481 from the Agency for Healthcare Research and Quality to Dr. Ratwani and the MedStar Health Research Institute.

Author contributions

J.H., S.K., A.F., and R.R. conceived of the idea and contributed to discussing the data and writing the paper. J.H. and S.K. analyzed the data.

Data availability

The safety reports analyzed will be made available upon request.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References


