Author manuscript; available in PMC: 2018 Jun 1.
Published in final edited form as: Int J Radiat Oncol Biol Phys. 2017 Feb 12;98(2):438–446. doi: 10.1016/j.ijrobp.2017.02.006

Medical Device Recalls in Radiation Oncology: Analysis of U.S. Food and Drug Administration Data, 2002–2015

Michael J Connor, Kathryn Tringale, Vitali Moiseenko, Deborah C Marshall, Kevin Moore, Laura Cervino, Todd Atwood, Derek Brown, Arno J Mundt, Todd Pawlicki, Abram Recht, Jona A Hattangadi-Gluth
PMCID: PMC5518627  NIHMSID: NIHMS865819  PMID: 28463163

Abstract

Purpose

Medical devices in radiation oncology have undergone remarkable technological advancement over the last two decades. The U.S. Food and Drug Administration (FDA) administers recalls of medical devices posing safety risks. We analyzed all recalls involving radiation oncology devices (RODs) from the FDA’s recall database, comparing these to non-radiation oncology device recalls to identify discipline-specific trends that may inform improvements in device safety.

Methods and Materials

Recall data on RODs from 2002–2015 were sorted into four product categories (external beam, brachytherapy, planning systems, and simulation systems). Outcomes included determined cause of recall, recall class (severity), quantity in commerce, time until recall termination (date FDA determines recall is complete), and time since 510(k) approval. Descriptive statistics were performed with linear regression of time-series data. Results for RODs were compared to those for other devices by Pearson’s Chi-squared test for categorical data and two-sample Kolmogorov-Smirnov test for distributions.

Results

There were 502 ROD recalls and 9,534 other class II device recalls from 2002–2015. Most recalls were for external beam devices (66.7%) and planning systems (22.9%), and recall events peaked in 2011. RODs differed significantly from other devices in all recall outcomes (p≤0.04). Recall cause was commonly software-related (49% vs. 10% for other devices). Recall severity was more often moderate among RODs (97.6% vs. 87.2%) instead of severe (0.2% vs. 4.4%; p<0.001). Time from 510(k) market approval to recall was shorter among RODs (p<0.001), and progressively shortened over time. RODs had fewer recalled devices in commerce than other devices (p<0.001).

Conclusions

Compared to other class II devices, RODs experience recalls sooner after market approval and are trending sooner still. Most of these recalls were moderate in severity and software issues are prevalent. Comprehensive analysis of recall data can identify areas for device improvement, such as better system design among RODs.

Introduction

Radiation oncology depends critically on advanced medical technologies, which have evolved enormously over the last several decades: linear accelerators with real-time intra-fraction monitoring, high-performance particle therapies, brachytherapy systems, and complex treatment planning software. However, with rapidly evolving technology comes potential for error. In 2010, the U.S. Food and Drug Administration (FDA), which oversees all medical devices, wrote to manufacturers of radiation oncology devices (RODs) about adverse event reports involving “under-doses, over-doses, and misaligned exposures” (1). The agency held a public meeting to discuss device improvements that could make radiation delivery safer. The national press has also publicized catastrophic and avoidable errors in radiation delivery (2–5).

The vast majority of radiation therapy devices are cleared through the premarket notification process (“510(k)”), which requires demonstration that the device is substantially equivalent in safety and efficacy to a legally marketed device. Malfunctioning or unsafe devices already marketed may be subject to recall. Recall of a device means “a firm's removal or correction of a marketed product that the FDA considers to be in violation of the laws it administers and against which the agency would initiate legal action, e.g., seizure” (21 CFR 7.3(g)). Recalls are usually initiated by the manufacturer, though the FDA can require one in rare instances. Correction or other remedial action may be taken rather than withdrawal of the device from use. Thus, recall trends provide critical insight into the monitoring and maintenance of device safety.

According to the FDA’s Center for Devices and Radiological Health (CDRH), the most commonly recalled medical device from 2003–2012 was the linear accelerator (6). Nonetheless, details of ROD recalls remain poorly characterized. Previous studies have utilized recall data to analyze a variety of safety concerns with medical technologies including orthopedic devices (7), diagnostic radiology devices (8), coronary stents (9), medical software (10), and cardiovascular devices (11). We sought to characterize all recalls involving RODs and to compare these to recalls of other devices.

Materials and Methods

Data Acquisition

Device recall data from December 4, 2002 through December 31, 2015 were obtained from openFDA on February 7, 2016 (12). The FDA 510(k) device preapproval database, which contains all releasable 510(k) approvals from July 15, 1976 through December 31, 2015, was obtained in January 2016 (13). Data files were imported and analyzed using R (14) with the data.table (15) and jsonlite (16) packages.

Many devices may be recalled due to a single event or problem (17). FDA’s recall database captures this distinction through the use of separate “Z-numbers”, unique identifiers for each separate device, and a recall event ID, which is the same for all products affected by the same event. To avoid distorting the magnitude of some problems, for this analysis the term “recall” refers to records with unique combinations of recall event ID, 510(k) number, and the cause of the recall. Thus, we considered all devices with a single problem or event as one recall, unless the reasons for recall or type of device recalled were different (for example, products of different sizes would not be double-counted, whereas hip and knee implants sharing a single faulty component would count as two separate recalls).
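The counting rule above can be sketched in a few lines. This is only an illustration in Python (the analysis itself was done in R); the records and the field names `event_id`, `k_number`, and `cause` are hypothetical and do not reflect the actual openFDA schema.

```python
# Hypothetical product-level records; field names and values are illustrative only.
records = [
    # Two sizes of the same product, same event, same 510(k), same cause: one recall.
    {"z_number": "Z-0001", "event_id": "E100", "k_number": "K000001", "cause": "Device design"},
    {"z_number": "Z-0002", "event_id": "E100", "k_number": "K000001", "cause": "Device design"},
    # Hip and knee implants sharing one faulty component: same event, different 510(k)s.
    {"z_number": "Z-0003", "event_id": "E200", "k_number": "K000002", "cause": "Component"},
    {"z_number": "Z-0004", "event_id": "E200", "k_number": "K000003", "cause": "Component"},
]


def count_recalls(rows):
    """Count unique (event ID, 510(k) number, cause) combinations."""
    unique_keys = {(r["event_id"], r["k_number"], r["cause"]) for r in rows}
    return len(unique_keys)


n_recalls = count_recalls(records)  # four product records collapse to three recalls
```

Using a set of tuples keeps the rule explicit: a record adds a new recall only if at least one of the three key fields differs.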

Product Code Classification

Each device is classified by a three-letter product code within the FDA system. We identified the 39 product codes specifically designated for therapeutic radiation devices (21 C.F.R. §892) and designated these as RODs. We also identified 4 more product codes associated with therapeutic radiation by manual search outside of this regulatory subpart, resulting in 43 total product codes representing radiation oncology devices. These were further categorized broadly into 4 categories: External Beam, Brachytherapy, Planning System, and Simulation System (Supplementary Table 1).

Device Class

The FDA classifies devices into three categories: Class I (low risk and subject to the least regulatory controls; e.g., dental floss), Class II (higher risk, require greater regulatory controls to provide reasonable assurance of the device’s safety and effectiveness; e.g., MRI, contact lenses), and Class III (highest risk, subject to the highest level of regulatory control; e.g., replacement heart valves). Nearly all radiation oncology devices are designated Class II. We therefore restricted analysis of all other non-radiation oncology devices to Class II devices to maximize the comparability of the two groups.

Additional Data

While the openFDA initiative provided a convenient download for recall data, we found additional information in the FDA recall database web interface (18). Therefore, each openFDA recall ID was queried against the web database to obtain: the supplemental data of “quantity in commerce”, more consistent information on the date of the recall via a “date posted” field, and 510(k) numbers associated with recalls back to 2003, rather than back to 2006 via openFDA.

FDA Determined Cause

The FDA makes a determination as to the root cause of the recall. We grouped common causes: for example, “software change control”, “software design”, etc. were considered as “software”. The classification scheme is shown in Supplementary Table 2.
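As a rough sketch of this grouping step, the snippet below collapses software-related cause strings into a single category. The prefix rule is an assumption made for illustration; the authors' actual mapping is the one given in Supplementary Table 2, and the function name `group_cause` is hypothetical.

```python
from collections import Counter


def group_cause(fda_cause):
    """Map a raw FDA-determined cause to a broader group (illustrative rule only)."""
    cleaned = fda_cause.strip()
    # Assumption: any cause beginning with "software" belongs to the "Software" group.
    if cleaned.lower().startswith("software"):
        return "Software"
    return cleaned


raw_causes = ["Software change control", "Software design", "Labeling"]
grouped_counts = Counter(group_cause(c) for c in raw_causes)
```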

Quantity in Commerce

This data field, supplied by the manufacturer and available only through the FDA’s web interface, gives the number of devices in distribution at the time of the recall. The field is free text and may therefore include difficult-to-parse entries such as “USA: 55 units; Foreign: 123 units”. We stripped all non-numeric characters from the field to yield numerical values and then manually checked every record. In instances such as the one above, the quantity in commerce was set to the sum of the separate distribution regions. Where a total quantity was explicitly stated, e.g., “12 boxes of 10 each (120 units)”, that value (120) was used; where no total was stated by the manufacturer, the base unit was used, e.g., “70 bottles of 12 oz. each” was recorded as “70”.
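The three rules described above can be approximated in code. The following Python sketch is a heuristic written for illustration, not the authors' procedure (which combined character stripping with a manual check of every record); the function name and the use of parentheses and semicolons as cues are assumptions.

```python
import re

NUMBER = re.compile(r"\d[\d,]*")  # digits, optionally with thousands separators


def _to_int(token):
    return int(token.replace(",", ""))


def parse_quantity(text):
    """Heuristically extract a quantity in commerce from a free-text entry."""
    # Rule 1: an explicit total in parentheses, e.g. "(120 units)", takes priority.
    explicit_total = re.search(r"\(\s*(\d[\d,]*)", text)
    if explicit_total:
        return _to_int(explicit_total.group(1))
    # Rule 2: several regions separated by ";" are summed (first number in each).
    parts = [p for p in text.split(";") if NUMBER.search(p)]
    if len(parts) > 1:
        return sum(_to_int(NUMBER.search(p).group()) for p in parts)
    # Rule 3: otherwise the first number is taken as the base unit count.
    first = NUMBER.search(text)
    return _to_int(first.group()) if first else None
```

Applied to the three examples in the text, this sketch yields 178, 120, and 70. Entries that fit none of these patterns would still need manual review.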

Time Since 510(k) Approval

Each recall record lists one or more 510(k) numbers. The approval dates for each recall’s 510(k) numbers were determined by merging the recall data with the 510(k) approval database (19). Time since 510(k) approval was computed by subtracting the 510(k) approval date from the date the recall was posted. Where a recall listed more than one 510(k) number, the 510(k) approval date was taken as the average of the approval dates of all listed 510(k) numbers. Data using the minimum and maximum dates were also reported.
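A minimal sketch of this calculation follows, written in Python for illustration. The dates are invented, and the function and key names are hypothetical; averaging dates through their ordinal day numbers is one reasonable way to obtain an "average date", not necessarily the method used in the original R analysis.

```python
from datetime import date


def days_since_clearance(posted, clearance_dates):
    """Days from 510(k) clearance to recall posting, using the average,
    minimum, and maximum clearance date among the listed 510(k) numbers."""
    ordinals = [d.toordinal() for d in clearance_dates]
    mean_ordinal = sum(ordinals) / len(ordinals)
    return {
        "using_average_date": posted.toordinal() - mean_ordinal,
        "using_min_date": (posted - min(clearance_dates)).days,
        "using_max_date": (posted - max(clearance_dates)).days,
    }


# Hypothetical recall posted 2012-06-01 that lists two 510(k) clearances.
example = days_since_clearance(
    date(2012, 6, 1),
    [date(2008, 6, 1), date(2010, 6, 1)],
)
```

For this invented recall the intervals are 1,461 days from the earlier clearance, 731 days from the later one, and 1,096 days from their average.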

Recall Class

Recall severity is classified by the FDA according to whether use of, or exposure to, the violative product would reasonably cause serious adverse health consequences or death (Class I); may cause temporary or medically reversible adverse health consequences, or carries only a remote probability of serious adverse health consequences (Class II); or would not be likely to cause adverse health consequences (Class III).

Time to Termination

A recall progresses through four phases: manufacturer initiation and notification of FDA district office; FDA district office issuance of alert and classification recommendation; final classification and posting to FDA.gov; and termination. Recalls are terminated (phase IV) when the FDA “determines that manufacturers have completed all reasonable efforts to remove or correct the product in accordance with the recall strategy, and that proper disposition or correction has been made commensurate with the degree of hazard of the recalled product” (20). Date posted and date terminated are available within the public datasets. Recalls not yet terminated were not included.

Statistical Analysis

Descriptive statistics for RODs were compared to other devices by Pearson’s Chi-squared test for categorical data and the two-sample Kolmogorov-Smirnov test for distributions. P-values of <0.05 were considered significant.
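To make the categorical comparison concrete, the sketch below computes Pearson’s Chi-squared statistic for the recall class counts reported in Table 1 (Class 1, 2, 3 for RODs: 1, 490, 11; for other devices: 424, 8,311, 799). It is written in plain Python for illustration, whereas the analysis itself was run in R. With two degrees of freedom the chi-squared survival function reduces to exp(−x/2), which is used here for the p-value; that shortcut applies only to this 3×2 case.

```python
import math


def chi_squared(table):
    """Pearson's chi-squared statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Rows: recall Class 1, 2, 3. Columns: RODs, other Class 2 devices (Table 1).
recall_class_counts = [
    [1, 424],
    [490, 8311],
    [11, 799],
]

statistic, dof = chi_squared(recall_class_counts)
# Valid only because dof == 2: the survival function is exp(-x / 2).
p_value = math.exp(-statistic / 2)
```

The resulting p-value is far below 0.001, in line with the significance level reported for recall class.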

Results

Recall Characteristics

Database characteristics are shown in Table 1. There were 502 ROD recalls and 9,534 other class II device recalls. The number of recalls increased over each 5-year period for both RODs and other devices, with the greatest number reported in 2011–2015. ROD recalls and other device recalls differed in their distribution across 5-year time periods (chi-squared test, p<0.001).

Table 1. Radiation Oncology-related vs. All Other Class 2 Recall Characteristics

| Characteristic | Radiation Oncology Devices (n=502) | Other Class 2 Devices (n=9,534) | p-value |
| --- | --- | --- | --- |
| Year | | | < .001* |
| 2002–2005 | 15 (3.0) | 1,396 (14.6) | |
| 2006–2010 | 162 (32.3) | 3,363 (35.3) | |
| 2011–2015 | 325 (64.7) | 4,775 (50.1) | |
| Device Product Categories | | | |
| External Beam | 335 (66.7) | | |
| Planning System | 115 (22.9) | | |
| Brachytherapy | 42 (8.4) | | |
| Simulation System | 10 (2.0) | | |
| Recall Class | | | < .001* |
| Class 1 | 1 (0.2) | 424 (4.4) | |
| Class 2 | 490 (97.6) | 8,311 (87.2) | |
| Class 3 | 11 (2.2) | 799 (8.4) | |
| Time Since 510(k) Clearance (n=471; 8,267) | | | |
| Oldest: Mean # days (range) | 2,117 (32 – 10,301) | 2,842 (13 – 14,182) | |
| Average: Mean # days (range) | 2,294 (43 – 10,301) | 2,967 (13 – 14,182) | < .001† |
| Newest: Mean # days (range) | 2,480 (53 – 12,893) | 3,093 (13 – 14,182) | |
| Average: Recalls within 12 months | 47 (10.0) | 488 (5.9) | < .001* |
| Quantity in Commerce (n=480; 9,184) | | | < .001† |
| Mean | 682 | 233,498 | |
| Median | 107 | 433 | |
| Range | 1 – 22,900 | 0 – 225,000,000 | |
| Time to Termination (n=438; 8,294) | | | .04† |
| Mean | 470.9 | 497.7 | |
| Median | 335.5 | 315 | |
| Range | 0 – 2,639 | 0 – 47,46 | |

* Pearson’s Chi-squared test

† Two-sample Kolmogorov-Smirnov test

Most ROD recalls (66.7%) were for external beam devices, followed by planning systems (22.9%), brachytherapy devices (8.4%), and simulation systems (2.0%). External beam recalls (n=73), brachytherapy recalls (n=12), and planning system recalls (n=44) all peaked in 2011, whereas simulation system recalls were most frequent in 2008 (n=4) (Fig 1). Overall, the number of recalls peaked in 2012 for other devices (1,019, 10.7%) and in 2011 for RODs (129, 25.7%) (Fig 1).

Fig. 1. Recalls over time by event type, RODs vs. other devices.

The number of recalls increased over time for other devices (linear regression; β=63.5 recalls per year, R²=0.86, p<0.001). Within radiation oncology, only recalls of external beam devices increased significantly over time (β=3.5 recalls per year, R²=0.44, p=0.01).

Recall Class/Severity

Nearly all ROD recalls were Class II recalls (97.6%, n=490); 2.2% were Class III, and a solitary recall was Class I. The Class I recall described an intraoperative external beam device which “may shed particles identified as tungsten” that “may look like suspicious calcifications” on follow-up imaging. For other devices, 87.2% of recalls were Class II, 8.4% Class III, and 4.4% Class I (Table 1). Recall classification differed significantly between RODs and other devices by Pearson’s Chi-squared test (p<0.001).

FDA Determined Cause

The FDA-determined causes of recalls are shown in Table 2. The top three causes for ROD recalls were software (49%), device design or change control (16.7%), and other/under investigation (16.1%). The top three causes for other device recalls were other/under investigation (33.2%), material/component (11.9%), and process (10.7%). Determined causes differed significantly between RODs and other devices by Pearson’s Chi-squared test (p<0.001).

Table 2. FDA Determined Causes, Radiation Oncology vs. All Other Class 2 Device Recalls

| FDA Determined Cause | Radiation Oncology Devices (n=502): Count | Radiation Oncology Devices: Percent | Other Class 2 Devices (n=9,534)*: Count | Other Class 2 Devices: Percent |
| --- | --- | --- | --- | --- |
| Software | 246 | 49.0 | 958 | 10.0 |
| Device Design/Change Control | 84 | 16.7 | 991 | 10.4 |
| Other/Under Investigation | 81 | 16.1 | 3,163 | 33.2 |
| Labeling | 24 | 4.8 | 675 | 7.1 |
| Material/Component | 21 | 4.2 | 1,138 | 11.9 |
| Process | 20 | 4.0 | 1,024 | 10.7 |
| Component | 15 | 3.0 | 357 | 3.7 |
| Employee/Use Error | 7 | 1.4 | 297 | 3.1 |
| Radiation Control for Health and Safety Act | 3 | 0.6 | 110 | 1.2 |
| Equipment Maintenance | 1 | 0.2 | 111 | 1.2 |
| Packaging | 0 | 0.0 | 507 | 5.3 |

* For brevity, the following determined causes were omitted, as 0% of ROD recalls and <1% of other device recalls cited them: Counterfeit, Environmental control, PMA, Reprocessing Controls, Storage, Vendor change control.

Time Since 510(k) Approval

Of the 502 ROD recalls, 471 (93.8%) had at least one valid 510(k) number recorded (Table 1). These recalls occurred a mean of 2,294 days (range: 43–10,301 days) after the average date of 510(k) approval. For other devices, 8,267 recalls (86.7%) had 510(k) numbers, and the mean time from the average date of 510(k) approval to recall was 2,967 days (range: 13–14,182). A larger share of ROD recalls than of other device recalls occurred in the first 12 months after market clearance (10% vs 5.9%, p<0.001). The distribution functions for time since 510(k) differed significantly for RODs vs. other devices by two-sample Kolmogorov-Smirnov test (p<0.001) (Fig 2A).
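For readers unfamiliar with the two-sample Kolmogorov-Smirnov test, its statistic D is the largest vertical gap between the two empirical cumulative distribution functions. The Python sketch below computes D only (not the p-value) and uses small invented samples of days since clearance; none of these values come from the study data.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: maximum gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for value in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, value) / len(a)  # share of a that is <= value
        cdf_b = bisect_right(b, value) / len(b)  # share of b that is <= value
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical days from 510(k) clearance to recall, for illustration only.
rod_days = [100, 400, 900, 1500, 2500]
other_days = [300, 1200, 2600, 3500, 5000]
d_example = ks_statistic(rod_days, other_days)
```

Because D depends on the whole distribution rather than a single summary value, two groups can differ by this test even when their means or medians are close.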

Fig. 2. A. Probability distributions for time since 510(k) approval, RODs vs. other devices. B. Years since 510(k) approval over time. Shaded regions: standard deviation. Dashed lines: linear regressions.

The mean and standard deviation of years since 510(k) approval for recalls over time are shown in Fig. 2B. Time since 510(k) approval decreased significantly for RODs (β=−0.47 years/year, p<0.001), whereas it increased for other devices (β=0.22 years/year, p<0.001).

Quantity in Commerce

There were 480 ROD recall events for which quantity in commerce could be computed. These ranged from 1–22,900, with a mean of 682 and a median of 107. For other devices, 9,184 events had data for quantity in commerce, which varied from 0–225,000,000, with a mean of 233,485 and a median of 433 (Table 1). The distribution functions for quantity in commerce are shown in Fig. 3A in a semi-log plot to improve visualization, as the data were highly positively skewed. The probability distributions differed significantly by two-sample Kolmogorov-Smirnov test (p<0.001). Quantity in commerce was steady over time for both recalled RODs and other devices; linear regression was not significant in either group (Fig. 3B).

Fig. 3. A. Probability distributions for quantity in commerce, RODs vs. other devices. B. Quantity in commerce over time. Shaded regions: standard deviation. Dashed lines: linear regressions.

Time to Termination

There were 438 and 8,294 terminated recalls for RODs and other devices, respectively (Table 1). Median termination time was 335.5 days for RODs and 315 days for other devices. The distributions differed significantly for RODs vs. other devices by two-sample Kolmogorov-Smirnov test (p=0.04) (Fig 4A). Other devices had a longer mean time owing to a longer tail of extremely delayed terminations, while RODs had a longer median time. The time to termination decreased over time for both RODs (β=−43.1 days per year, p<0.001) and other devices (β=−47.7 days per year, p<0.001), and converged over time, with no statistically significant difference in slopes (Fig 4B).

Fig. 4. A. Probability distributions for time to termination, RODs vs. other devices. B. Time to termination over time. Shaded regions: standard deviation. Dashed lines: linear regressions.

Discussion

Congress and the U.S. Government Accountability Office (GAO) have recently pushed for systematic analyses of medical device recall data, including trends in numbers, device type, and causes, to identify ways to prevent health risks from unsafe or defective devices (20). To our knowledge, this is the first comprehensive analysis of medical device recalls in radiation oncology. We found that ROD recalls comprised 5% of all recalls, with external beam devices recalled most commonly. ROD recalls peaked in 2011, which the FDA linked with “increased awareness prompted by targeted interactions with industry and individual manufacturers” (6) such as a public workshop to discuss mitigating the rising incidence of therapeutic radiation errors. There was also increased media attention in 2010, which may have prompted manufacturers to be more vigilant in issuing voluntary recalls.

Most ROD recalls were class II, or moderate in risk, with only one class I potentially high risk recall. This contrasts with a higher proportion of class I recalls for other devices. Successful safety efforts within radiation oncology may be partly responsible, such that most recalls are reported even when the probability of serious adverse health effects is remote. Indeed, this is supported by a recent analysis of FDA data on adverse device events (19), where patient injury was less common among RODs than among other devices. The quantity of devices in commerce was less for recalled RODs than other devices, reflecting a smaller device market in radiation oncology compared to specialties like cardiology, general surgery, and orthopedics (6). It is also possible that RODs are recalled before they are as widely distributed.

Software issues account for half of all ROD recalls. An analysis of all medical device recalls showed that recalls related to software have steadily risen, from 5.9% in 1983–1991 to 19.4% in 2005–2011 (10). The complexity of embedded software is increasing. “Predicate creep”, whereby devices/systems become quite dissimilar from the original predicate over multiple 510(k) cycles, may be more problematic with software (21). Software components rarely have a clear track record of quality and reliability in other devices, unlike hardware components (22). An FDA report found linear accelerators were the most recalled product type from 2003–2012 (6), most often for software-related causes. System compatibility (interoperability between treatment planning and treatment delivery systems), user interfaces (human factors), and dose calculation (clinical decision support software) were identified as the most pressing issues. For example, one software-related linear accelerator recall occurred when the use of multiple isocenters may have moved the patient to an unintended position (23). An analysis of adverse events among RODs also found that software-related issues were the most common (19), cited in 30% of adverse events. The FDA recently shared a new approach to the analysis of software-related radiation therapy recalls, finding defects in the software development process (24). A survey of the medical device industry found that sound software engineering practices were rarely adopted in device development, and that software developers were often unaware of the impact of their products on device safety (25). Linear accelerator software errors are well known in radiation oncology, and these errors have many contributing factors, only one of which is the actual software code (26). A recent study on radiation oncology incident reports found that 32% of incidents with high potential for clinical harm involved the human-software interface (27). Continued focus on software reliability may not improve safety in software-based equipment (3, 28). Device manufacturers in radiation oncology should focus on applying systems engineering and human-centered design to software-equipment development (29).

RODs were recalled sooner after market approval than other devices. Time from 510(k) clearance to recall also decreased significantly over the evaluated period. This identifies a new trend, as a previous FDA analysis of all device recalls found no relationship with time on the market (6). These data are consistent with adverse event data, which also show events occurring sooner after market clearance among RODs (19). Ten percent of ROD recalls occurred in the first 12 months after market clearance. For instance, a planning system was recalled three months after 510(k) clearance because dose and monitor unit calculations were incorrect when viewing CT images from the head (30). Collectively, this suggests that pre-market testing of RODs may be inadequate. The Institute of Medicine has declared the current 510(k) process inadequate (31), and others have also criticized the process in its current form (21, 32). Devices cleared via the 510(k) pathway are more likely to be recalled than those cleared through the more rigorous premarket approval process (7, 33). Our data suggest that the two to five years after 510(k) approval are the most critical period in which to focus safety efforts among RODs to mitigate recall events.

The median number of days from recall posting to termination was longer for RODs. A 2011 GAO report found that the FDA missed its 3-month time frame for terminating completed recalls over 50% of the time. Additionally, FDA’s decisions to terminate completed recalls—that is, assess whether firms had taken sufficient actions to prevent a recurrence of the problems that led to the recalls—were frequently not made within the prescribed time frame. The GAO found that these FDA shortcomings may increase the risk that unsafe medical devices could remain on the market (20). Reassuringly, phase IV termination time has been steadily decreasing over time, among RODs and other devices.

Limitations of this study primarily stem from the FDA databases themselves, including the possibility of incomplete or inaccurate information. Not all safety notices from manufacturers that describe correction or removal amount to an official recall (34); some device issues may therefore be underreported. Despite these shortcomings, the FDA itself has relied on the data to identify recall trends (6, 20).

Clearly, a better framework for recall reporting and analysis is needed. The FDA is phasing in a Global Unique Device Identification database, allowing for more effective management of recalls, more accurate reports of adverse events, and a potential reduction in medical errors (35). Better data collection and coding of root causes may reveal important and actionable trends in radiation oncology device recalls. Efforts such as the ASTRO-AAPM Radiation Oncology Incident Learning System (RO-ILS) (36) and the Center for Assessment of Radiological Sciences’ Radiotherapy Incident Reporting and Analysis System (RIRAS) (37) may also help gather discipline-specific data on device and user errors. RO-ILS should be better leveraged so that any anomalous equipment behavior reported is immediately disseminated to RO-ILS users and the vendor, making clinics aware in real time pending a full vendor investigation and report-out. This would be a threshold improvement in safety for radiation oncology but requires cooperation between RO-ILS, clinics, and vendors. Pre- and post-market scrutiny may also be increased for these devices, as they continue to be recalled sooner and sooner after approval.

Conclusion

We identified significant differences between recalls of RODs and other devices. RODs played a significant role in the increasing prevalence of software failures, with 49% of ROD recalls attributed to software. RODs spend less time on the market prior to recall, suggesting shortcomings of the 510(k) vetting process. ROD recalls also tended to take longer to terminate. Improving the quality of data on recalls will help identify where to focus efforts to improve the quality of our medical devices, thereby improving patient outcomes.

Supplementary Material

1

Supplementary Table 1. Product Code Classifications

2

Supplementary Table 2. FDA Determined Cause Classifications‡

3

Summary.

FDA recalls among radiation oncology devices peaked in 2011 and mostly reflected software issues. These recalls differ significantly from other devices in cause of recall, recall class (severity), quantity in commerce, and time from 510(k) market clearance to recall. The field should demand better design of these systems as well as improved regulatory requirements, software quality efforts, and enhanced post-market surveillance.

Acknowledgments

Funding: This work was partially supported by the following grants: National Institutes of Health KL2TR00099, UL1TR000100 (J.H-G.); American Cancer Society Pilot Award ACS-IRG #70-002 (J.H-G.); National Cancer Institute Cancer Center Specialized Grant P30CA023100 (J.H-G.)

Footnotes

Conflict of Interest: Drs. Hattangadi-Gluth, Moore, and Cervino have research grants from Varian Medical Systems, unrelated to the current study.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References
