Radiology: Artificial Intelligence. 2023 Oct 18;5(6):e230369. doi: 10.1148/ryai.230369

Removing Radiographic Markers Using Deep Learning to Enable Image Sharing

Ken Chang, Matthew D. Li
PMCID: PMC10698605  PMID: 38074775

See also the article by Khosravi and Mickley et al in this issue.

Ken Chang, MD, PhD, is a diagnostic radiology resident at Stanford University. He completed his PhD at the Martinos Center for Biomedical Imaging at Massachusetts General Hospital/Harvard Medical School, developing artificial intelligence techniques to enhance medical imaging workflows. His research interests include distributed learning, response assessment, and radiogenomics.

Matthew D. Li, MD, is an assistant professor in the department of radiology and diagnostic imaging at the University of Alberta and a radiologist at MIC Medical Imaging in Edmonton, Alberta, Canada, with a clinical specialization in musculoskeletal radiology. His research focuses on the use of artificial intelligence for assessing disease severity and change over time in medical imaging.

For medical imaging artificial intelligence (AI) algorithms to be clinically useful, they must be generalizable across various patient populations, radiology device vendors, image acquisition parameters, and clinical settings. The performance of AI algorithms can improve with increased dataset sizes (1,2), data from multiple institutions, and greater diversity (3), which can be achieved by sharing datasets across institutions. However, before datasets can be shared, they must be curated to remove protected health information (PHI) that could compromise patient privacy. Dataset anonymization is a labor-intensive task, which hinders the ability to share or publicly release experimental data. Other than studies using public domain medical imaging datasets, the radiology AI literature has few instances of data sharing (4). Increased sharing of datasets would allow improved collaboration and reproduction of research findings and facilitate public release of more high-quality datasets, which in turn could catalyze further AI algorithm development and validation.

Khosravi and Mickley et al (5), in this issue of Radiology: Artificial Intelligence, offer a proof of concept to anonymize medical images to lower the barriers for image data sharing. The authors trained a YOLOv5-x model, a widely used object detection deep learning model, to detect patient-identifying information in radiographs of the hip or pelvis. Importantly, the authors used a two-step approach to retain information—in their specific use case, the burned-in laterality markers on radiographs—that may be useful to train AI algorithms on the dataset.
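To make the mechanics concrete, the following is a minimal sketch in Python of such a detect-and-redact step. The torch.hub loading call and the results-parsing interface are part of the public YOLOv5 API, but the weights file (phi_detector.pt), the class names, and the retained-class logic are illustrative assumptions, not the authors' released implementation.

    import cv2
    import torch

    # Load a custom-trained YOLOv5 detector (hypothetical weights path).
    model = torch.hub.load("ultralytics/yolov5", "custom", path="phi_detector.pt")

    # Assumed class name for the markers retained in the second pass.
    KEEP_CLASSES = {"laterality_marker"}

    def anonymize(image_path: str, output_path: str) -> None:
        """Black out every detected text region except allowlisted markers."""
        image = cv2.imread(image_path)         # BGR array
        results = model(image[..., ::-1])      # YOLOv5 expects RGB input
        detections = results.pandas().xyxy[0]  # columns: xmin, ymin, xmax, ymax, confidence, class, name
        for _, det in detections.iterrows():
            if det["name"] in KEEP_CLASSES:
                continue  # second pass: spare diagnostically useful markers
            x1, y1 = int(det["xmin"]), int(det["ymin"])
            x2, y2 = int(det["xmax"]), int(det["ymax"])
            image[y1:y2, x1:x2] = 0  # redact detected PHI by zeroing pixels
        cv2.imwrite(output_path, image)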

The authors demonstrated 100% accuracy of their AI model on an internal test set and 96% accuracy on an out-of-domain set of chest radiographs from the publicly available CheXpert dataset (6). The authors highlighted some notable failure modes on the external test set, including the false-positive detection of electrocardiography leads. Using a tuning set of only 20 chest radiographs, the authors were able to increase external test set performance to 99.6%.

This study used a two-pass approach to remove PHI and selectively retain laterality markers. A natural next step for this approach could include removing PHI while retaining other diagnostically important radiographic markers that indicate radiographic positioning or technique (eg, supine, decubitus, posteroanterior, weight bearing). Such information could aid AI applications, such as the detection of pneumoperitoneum on abdominal radiographs, joint space narrowing on knee radiographs, or cardiomegaly on chest radiographs. One could extend this anonymization approach to other imaging modalities where burned-in markers may be present, such as US, CT, MRI, or nuclear radiology.
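As a sketch of what such an extension to DICOM-based modalities might look like, assuming an uncompressed, single-frame grayscale image: the pydicom calls below are real, but the expanded marker allowlist and the detect_burned_in_text wrapper are hypothetical placeholders for a detector like the one sketched above.

    import pydicom

    # Hypothetical expanded allowlist of diagnostically useful marker classes.
    KEEP_CLASSES = {"laterality_marker", "positioning_marker", "technique_marker"}

    def anonymize_dicom(path: str, out_path: str) -> None:
        """Redact burned-in annotations in a DICOM frame, sparing allowlisted markers."""
        ds = pydicom.dcmread(path)
        frame = ds.pixel_array                    # uncompressed, single-frame grayscale assumed
        for det in detect_burned_in_text(frame):  # hypothetical detector wrapper
            if det.class_name in KEEP_CLASSES:
                continue
            frame[det.y1:det.y2, det.x1:det.x2] = 0
        ds.PixelData = frame.tobytes()            # write redacted pixels back
        ds.BurnedInAnnotation = "NO"              # standard DICOM attribute (0028,0301)
        ds.save_as(out_path)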

AI algorithms similar to the one in this work could form part of a larger, multicomponent anonymization pipeline to enable dataset sharing or public release. AI-based large language models have already been used to anonymize radiology reports (7), and automated anonymization of both images and reports would be a powerful combination: the anonymized radiology report could supply information for training an AI model on the image data with decreased manual labeling, facilitating unsupervised or weakly supervised learning.
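A minimal orchestration sketch of such a pipeline follows. The anonymize function is the image-redaction sketch above; deidentify_report is a hypothetical stand-in for a report deidentification tool such as that of reference 7.

    from dataclasses import dataclass

    @dataclass
    class AnonymizedStudy:
        image_path: str
        report_text: str

    def anonymize_study(image_path: str, report_text: str) -> AnonymizedStudy:
        """Run both components; each is a stand-in for a separately validated tool."""
        anon_image_path = image_path.replace(".png", "_anon.png")
        anonymize(image_path, anon_image_path)        # image redaction (sketched above)
        anon_report = deidentify_report(report_text)  # hypothetical text deidentifier (7)
        return AnonymizedStudy(anon_image_path, anon_report)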

Although the potential of automated radiology data anonymization is tantalizing, the performance of such models on external test data, including in this work, is near perfect but not perfect. How would AI anonymization algorithms be implemented in real-world use cases? The release of even a single individual's information would be a problem for both the individual and the institution from which the data originate. Given this concern, such AI tools could be used in tandem with a manual reviewer or as a first reviewer in a two-reviewer schema, with the AI model calibrated for high sensitivity, at the cost of specificity, to ensure that no PHI is missed. Although not completely automatic, such an approach would greatly decrease the time and expense of manual review of imaging data.
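One way to realize such a sensitivity-first calibration is sketched below, under the same assumptions as the earlier snippet: the detector's confidence threshold is set very low so that borderline text regions are not silently missed, and any image with a low-confidence detection is routed to the human reviewer. Both cutoffs are illustrative, not validated operating points.

    # YOLOv5 AutoShape confidence threshold; set very low to favor sensitivity.
    model.conf = 0.05
    # Assumed cutoff: any detection below this confidence triggers human review.
    REVIEW_CUTOFF = 0.50

    def triage(image) -> str:
        detections = model(image).pandas().xyxy[0]
        if detections.empty:
            return "auto_release"   # nothing found even at a permissive threshold
        if (detections["confidence"] < REVIEW_CUTOFF).any():
            return "manual_review"  # uncertain detection: send to the human reviewer
        return "auto_redact"        # confident detections: redact, then spot-check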

There are important caveats to this study. First, the authors showed improved performance with in-domain fine-tuning on a small dataset. Requiring fine-tuning for every use case, even with a small tuning dataset, may deter many researchers from taking on the task of anonymization for data sharing. Second, many possible modes of dataset shift could affect the performance of the anonymization AI algorithm, including previously unseen medical devices (such as the electrocardiography leads in this study) and changes in image contrast, patient positioning, image orientation, or the font or shape of the markers. Ongoing checks of the algorithm's performance will remain a necessary safeguard against inadvertent sharing of incompletely anonymized image data. Third, the authors evaluated only a single deep learning object detection algorithm in a landscape of many similar or superior detection algorithms; the YOLOv5-x model was originally released in 2020, AI models have advanced rapidly since then, and performance may improve with newer architectures and training strategies. Last, only a single modality and two anatomic regions were considered in this study, whereas coverage of many more modalities and body regions is needed.

Nonetheless, the work reported here is a strong proof of concept for automated anonymization to enable medical image data sharing and public release.

Footnotes

The authors declared no funding for this work.

Disclosures of conflicts of interest: K.C. No relevant relationships. M.D.L. RSNA Research & Education Foundation Resident/Fellow Research Grant recipient, 2020-2021; Radiology: Artificial Intelligence Trainee Editorial Board alum; associate editor for Radiology: Artificial Intelligence.

References

1. Dunnmon JA, Yi D, Langlotz CP, Ré C, Rubin DL, Lungren MP. Assessment of convolutional neural networks for automated classification of chest radiographs. Radiology 2019;290(2):537–544.
2. He B, Kwan AC, Cho JH, et al. Blinded, randomized trial of sonographer versus AI cardiac function assessment. Nature 2023;616(7957):520–524.
3. Chang K, Beers AL, Brink L, et al. Multi-institutional assessment and crowdsourcing evaluation of deep learning for automated classification of breast density. J Am Coll Radiol 2020;17(12):1653–1662.
4. Venkatesh K, Santomartino SM, Sulam J, Yi PH. Code and data sharing practices in the radiology artificial intelligence literature: a meta-research study. Radiol Artif Intell 2022;4(5):e220081.
5. Khosravi B, Mickley JP, Rouzrokh P, et al. Anonymizing radiographs using an object detection deep learning algorithm. Radiol Artif Intell 2023;5(6):e230085.
6. Irvin J, Rajpurkar P, Ko M, et al. CheXpert: a large chest radiograph dataset with uncertainty labels and expert comparison. arXiv 1901.07031 [preprint]. https://arxiv.org/abs/1901.07031. Published January 21, 2019. Accessed September 1, 2023.
7. Chambon PJ, Wu C, Steinkamp JM, Adleberg J, Cook TS, Langlotz CP. Automated deidentification of radiology reports combining transformer and "hide in plain sight" rule-based methods. J Am Med Inform Assoc 2023;30(2):318–328.
