Introduction
The field of radiology is no stranger to automation and emerging technologies. AI in radiology can be traced back to the 1960s, when Lee Lusted anticipated “an electronic ‘scanner-computer’ to look at chest photofluorograms and to separate the clearly normal chest films from the abnormal chest films. The abnormal chest films would be marked for later study by the radiologist.” [1] With algorithms for triage of emergencies like pulmonary embolism and intraparenchymal hemorrhage now becoming commonplace, current technology is quickly growing beyond these early dreams.
Generative artificial intelligence (AI) represents an evolving class of algorithms capable of producing new multimodal content based on patterns identified from massive datasets [2]. At the forefront of generative AI are large language models (LLMs) such as GPT-4o, Gemini 1.5, or Llama 4, which have garnered widespread attention for their ability to generate human-like text. These models are trained on vast corpora of material, enabling them to produce coherent sentences, paragraphs, or entire articles. Already, generative AI has been applied in various domains of medicine, such as research, medical education, and clinical tasks (reviewed in further detail elsewhere [3, 4]). Clinicians and patients alike are also turning to LLMs for assistance with daily tasks and medical advice. A recent BMJ survey found that at least one in five general practitioners use ChatGPT for daily tasks [5], while an Australian survey demonstrated that nearly 10% of the population had sought health information from ChatGPT in the past six months [6]. As with all emerging technologies, an increasing scope of adoption for clinical use requires a thoughtful evaluation of its advantages, limitations, and ethical considerations to ensure responsible and effective implementation.
Radiology journals and peer-reviewed articles serve as essential platforms for this discourse, ensuring that novel concepts are rigorously evaluated, refined, and integrated into clinical practice. Over the last few decades the process of creation, review, and distribution of scientific literature has been disrupted by the internet, giving rise to open access journal models [7], debates over publication costs [8], pressures on impact factors and associated metrics [9, 10], and shifting reviewer dynamics [11]. Generative AI is poised to intensify these existing debates and introduce new challenges. Generative AI can serve as a catalyst for more efficient research communication and collaboration, but its adoption must be balanced against the responsibility to preserve the integrity of the scientific record and human creativity. In this editorial, we build upon previous efforts to clarify the role of generative AI in scientific publishing [12–16]. We examine four areas of current interest in the use of generative AI in peer-reviewed journal publications: 1) Transparency and accountability, 2) Misinformation, 3) Plagiarism and intellectual property, and 4) Bias and inclusivity, aiming to provide a practical perspective on how radiology journals and researchers can responsibly harness the power of generative AI without compromising the core tenets of scientific publishing.
1. Transparency and Accountability
Generative AI technologies increasingly support scientific writing by generating drafts, helping format manuscripts to meet journal guidelines, enhancing clarity and language, and (a favorite LLM task for one of the authors here) refining casual language into scholarly prose. They may also play a potentially more questionable role in identifying citations, creating figures, or summarizing existing literature [17–19]. Regardless of use, their rapid adoption in scientific publishing reinforces the need for clear disclosure. Authors are ethically obligated to specify the type and degree of AI involvement in any manuscript, whether it be simple grammar checks, text or figure generation, or sourcing of references. By openly acknowledging AI contributions, authors invite reviewers and editors to act as “second-line guardians,” scrutinizing the integrity and rigor of the content. This transparent disclosure can then be formally incorporated into the manuscript’s accompanying statements. Already, this change is in progress: major journals now require authors to fully disclose any use of AI tools like ChatGPT in manuscript preparation. For example, JAMA’s policy mandates clearly describing any AI-generated content and citing the AI model used [20]. Some publishers (e.g., Science, Nature) have banned listing AI as an author and consider undisclosed AI-written text a form of misconduct [21, 22].
This journal, Clinical Imaging, has addressed this issue through its “Declaration of Generative AI in Scientific Writing”, which instructs authors to add an “AI use” statement before the reference list. We endorse this policy and recommend one refinement: disclosure of generative AI use should be a required step in the submission process, and even the use of generative AI for grammar and phrasing should be reported.
Crucial to this disclosure is that AI models cannot be held responsible for inaccurate or misleading content. They have no moral agency, and accountability rightly reverts to human participants in the process—authors, reviewers, and editors. In this sense, generative AI does not diminish the need for human oversight but amplifies it, emphasizing the irreplaceable value of human judgment. Far from making human labor obsolete, generative AI heightens the necessity for it; ensuring the credibility of scientific discourse relies more than ever on the diligence and expertise of those involved.
2. Risk of Misinformation
At its core, a language model does not understand the text it produces in the way a human would. It is fundamentally predicting the most statistically likely next word, and its responses are shaped by probabilities learned from its training data. While this allows such models to mimic expert language and provide useful summaries, they are also prone to “hallucinations”: factual errors or unwarranted conclusions that may sound fluent but ultimately lack substance [23]. In a scholarly context, such inaccuracies can erode trust in the literature. Consequently, the reputation of both individuals and institutions looms larger than ever. An author’s track record of meticulously verifying AI-generated material can serve as a powerful credential, while an editor’s vigilance in identifying errors or potential misinformation can become a hallmark of a reputable journal.
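The “most statistically likely next word” mechanic can be made concrete with a toy sketch. This is a bigram counter over an invented miniature corpus, vastly simpler than any real LLM (which uses neural networks over tokens); it illustrates the principle only:

```python
# Toy next-word predictor built from bigram counts.
# The "corpus" below is invented purely for demonstration.
corpus = ("the scan was normal . the scan was abnormal . "
          "the film was normal .").split()

# Estimate P(next word | current word) from bigram frequencies.
counts = {}
for cur, nxt in zip(corpus, corpus[1:]):
    counts.setdefault(cur, {})
    counts[cur][nxt] = counts[cur].get(nxt, 0) + 1

def next_token(cur):
    """Return the most frequent continuation seen in training, else None."""
    options = counts.get(cur, {})
    return max(options, key=options.get) if options else None

# "was normal" occurs twice and "was abnormal" once, so the model always
# continues "was" with "normal" -- fluent, but blind to the actual case.
print(next_token("was"))  # → normal
```

The last line captures the hallucination risk in miniature: the output is driven entirely by frequencies in the training data, not by any understanding of whether the statement is true.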
However, this heightened emphasis on reputation comes with a cautionary note: fostering trust must not devolve into insular or elitist practices. While it is natural to rely on known, reputable circles when evaluating new submissions, it is equally important not to discourage new entrants or lesser-known researchers who may struggle to establish their credibility. The scientific community must strike a delicate balance between healthy skepticism (prompted by the pervasiveness of AI) and openness to fresh voices and perspectives.
Although generative AI poses the risk of generating fluent-sounding misinformation in the hands of the careless or malicious, it may also accelerate literature review, particularly given its ability to summarize text and answer free-form questions. Models such as OpenEvidence are trained not only to answer such questions but also to provide high-quality peer-reviewed references [24]. Disciplined researchers have both an opportunity and a duty to verify these rapidly gleaned AI statements against the cited literature. When used responsibly, generative AI provides a rapid means to peruse the scientific literature, possibly more so than existing internet search engines.
3. Plagiarism and Intellectual Property
The seamless and automated nature of AI text generation raises the stakes around plagiarism and intellectual property. Even without malicious intent, generative AI tools can produce content that closely resembles copyrighted material or fails to credit original sources, leading to inadvertent infringements. Journals and authors alike must incorporate enhanced screening procedures. Robust plagiarism-checking protocols, potentially augmented by AI detection tools, become increasingly important to separate legitimate scholarly work from automated duplication or imitation. However, extreme caution must be exercised in deploying such tools. As with any classifier, AI detection tools will always have associated false negative and, perhaps more critically, false positive rates [25]. An accusation of plagiarism is not to be taken lightly in academia, given the grave professional consequences.
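The stakes of those error rates can be sketched with base-rate arithmetic. Every number below (submission volume, prevalence, sensitivity, false positive rate) is a hypothetical assumption for illustration, not a measured property of any real detector:

```python
# Hypothetical screening scenario for an AI-text detector.
n_submissions = 10_000
prevalence = 0.05     # assumed fraction with undisclosed AI-generated text
sensitivity = 0.90    # assumed true positive rate
fpr = 0.02            # assumed false positive rate

ai_written = n_submissions * prevalence       # 500 manuscripts
human_written = n_submissions - ai_written    # 9,500 manuscripts

true_positives = ai_written * sensitivity     # 450 correctly flagged
false_positives = human_written * fpr         # 190 innocent authors flagged

# Positive predictive value: of all flagged manuscripts, the share that
# actually contain undisclosed AI-generated text.
ppv = true_positives / (true_positives + false_positives)
print(f"flagged in error: {false_positives:.0f}; PPV = {ppv:.2f}")
```

Under these assumptions, a seemingly small 2% false positive rate produces nearly as many wrongly flagged human authors (190) as correctly flagged manuscripts (450), which is why human adjudication of any automated flag remains essential.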
Human intervention, therefore, is crucial. Undoubtedly, AI detection tools may rightfully flag articles that present nonsensical citations and overtly plagiarized sections of text (with accompanying source material). Beyond such obvious clues, however, the human reviewer must ultimately provide nuanced judgment: it is easy to envision a scenario where a detector incorrectly flags a section of text as having a high likelihood of being AI-generated. On whom should the burden of proof lie? Should the human authors be asked to rewrite the relevant sections? While there is no easy answer to such questions, editors and peer reviewers, equipped with an understanding of how generative models function, will be more important than ever in providing nuanced and ethical judgment.
4. Bias and Inclusivity
Any AI system is only as impartial as the data on which it is trained. When those data reflect inevitable historical biases (social, cultural, or linguistic), these biases can resurface in AI-generated text. For example, the training data for many LLMs are drawn predominantly from English-language and Western sources, which can bias responses toward the cultural perspectives of English speakers. Consequently, authors should be transparent about which AI tools have been used, and reviewers and editors must familiarize themselves with the known limitations and biases of these models. By proactively recognizing that every generative model carries inherent assumptions, the scientific community can critically appraise potential distortions. Importantly, bias becomes genuinely harmful when it goes unrecognized [26]. If researchers and reviewers suspect it, they can confront and correct it. This dynamic requires ongoing education for all stakeholders, ensuring they are up to date on the capabilities and known pitfalls of common generative AI platforms.
Conversely, generative AI provides a practical opportunity to overcome human reviewer biases, particularly for authors for whom English is not their native language [27]. English remains the lingua franca of scientific publishing, which often places non-native writers at a distinct disadvantage. Generative AI offers an opportunity to level the playing field, either through direct translation or via copyediting. Yet AI editing taken to the extreme can easily distort or replace the original intentions of the human author. This problem, however, is not new: prior to generative AI, many authors used human-run editing services to revise their manuscripts. Banning or regulating these editing services is untenable, but it is important to acknowledge that these issues exist [28]. In contrast, the unparalleled ability of LLMs to translate writing into multiple languages has the potential to broaden access to knowledge in a previously unrealized way, allowing individuals who might otherwise face language barriers to engage with the scientific literature.
Conclusion: The Imperative of Scientific Rigor and Human Oversight
Generative AI offers remarkable efficiencies in manuscript preparation, literature analysis, and data synthesis. We believe that generative AI ultimately increases, rather than diminishes, the collective responsibility of authors, editors, and reviewers. Each stakeholder now bears a heightened duty to verify, scrutinize, and ensure that the scientific record remains both accurate and ethical. Trust and rigor, built over decades in scientific circles, gain renewed significance in a publishing landscape where information can be generated with unprecedented speed. AI also presents a new and unique opportunity to accelerate research by providing rapid access to high-quality references, summaries, translations, and text editing, particularly for non-native English writers. Balance, rather than prohibition or uncritical acceptance, will be the most difficult path forward but is ultimately the correct choice. Rather than undermining human input, generative AI reaffirms its necessity in the ongoing evolution of scientific discovery.
Acknowledgments
Declaration of generative AI and AI-assisted technologies in the writing process
During the preparation of this work the author(s) used ChatGPT (OpenAI, model o3 and GPT-4.5) in order to brainstorm ideas, draft preliminary outlines, refine grammar, and improve clarity. After using this tool/service, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the content of the published article.
References
1. Meyers PH, et al. Automated Computer Analysis of Radiographic Images. Radiology, 1964. 83: p. 1029–34.
2. OpenAI. Generative models. 2016; Available from: https://openai.com/index/generative-models/.
3. Lee P, Bubeck S, and Petro J. Benefits, Limits, and Risks of GPT-4 as an AI Chatbot for Medicine. N Engl J Med, 2023. 388(13): p. 1233–1239.
4. The Lancet Digital Health. Large language models: a new chapter in digital health. Lancet Digit Health, 2024. 6(1): p. e1.
5. Blease CR, et al. Generative artificial intelligence in primary care: an online survey of UK general practitioners. BMJ Health Care Inform, 2024. 31(1).
6. Ayre J, Cvejic E, and McCaffery KJ. Use of ChatGPT to obtain health information in Australia, 2024: insights from a nationally representative survey. Med J Aust, 2025. 222(4): p. 210–212.
7. McKiernan EC, et al. How open science helps researchers succeed. Elife, 2016. 5.
8. Sanderson K. Who should pay for open-access publishing? APC alternatives emerge. Nature, 2023. 623(7987): p. 472–473.
9. McKiernan EC, et al. Use of the Journal Impact Factor in academic review, promotion, and tenure evaluations. Elife, 2019. 8.
10. Moher D, et al. Assessing scientists for hiring, promotion, and tenure. PLoS Biol, 2018. 16(3): p. e2004089.
11. Aczel B, et al. The present and future of peer review: Ideas, interventions, and evidence. Proc Natl Acad Sci U S A, 2025. 122(5): p. e2401232121.
12. Koller D, et al. Why We Support and Encourage the Use of Large Language Models in NEJM AI Submissions. NEJM AI, 2024. 1(1): p. AIe2300128.
13. Ong JCL, et al. Medical Ethics of Large Language Models in Medicine. NEJM AI, 2024. 1(7): p. AIra2400038.
14. Haltaufderheide J and Ranisch R. The ethics of ChatGPT in medicine and healthcare: a systematic review on Large Language Models (LLMs). NPJ Digit Med, 2024. 7(1): p. 183.
15. Liebrenz M, et al. Generating scholarly content with ChatGPT: ethical challenges for medical publishing. Lancet Digit Health, 2023. 5(3): p. e105–e106.
16. Thirunavukarasu AJ, et al. Large language models in medicine. Nat Med, 2023. 29(8): p. 1930–1940.
17. Walters WH and Wilder EI. Fabrication and errors in the bibliographic citations generated by ChatGPT. Sci Rep, 2023. 13(1): p. 14045.
18. Májovský M, et al. Artificial Intelligence Can Generate Fraudulent but Authentic-Looking Scientific Medical Articles: Pandora’s Box Has Been Opened. J Med Internet Res, 2023. 25: p. e46924.
19. Spinellis D. False authorship: an explorative case study around an AI-generated article published under my name. Res Integr Peer Rev, 2025. 10(1): p. 8.
20. Flanagin A, Kendall-Taylor J, and Bibbins-Domingo K. Guidance for Authors, Peer Reviewers, and Editors on Use of AI, Language Models, and Chatbots. JAMA, 2023. 330(8): p. 702–703.
21. Thorp HH. ChatGPT is fun, but not an author. Science, 2023. 379(6630): p. 313.
22. McNutt MK, et al. Transparency in authors’ contributions and responsibilities to promote integrity in scientific publication. Proc Natl Acad Sci U S A, 2018. 115(11): p. 2557–2560.
23. Shen Y, et al. ChatGPT and Other Large Language Models Are Double-edged Swords. Radiology, 2023. 307(2): p. e230163.
24. Hurt RT, et al. The Use of an Artificial Intelligence Platform OpenEvidence to Augment Clinical Decision-Making for Primary Care Physicians. J Prim Care Community Health, 2025. 16: p. 21501319251332215.
25. Dalalah D and Dalalah OMA. The false positives and false negatives of generative AI detection tools in education and academic research: The case of ChatGPT. The International Journal of Management Education, 2023. 21(2): p. 100822.
26. Gopal DP, et al. Implicit bias in healthcare: clinical practice, research and decision making. Future Healthc J, 2021. 8(1): p. 40–48.
27. Ferguson G, Pérez-Llantada C, and Plo R. English as an international language of scientific publication: a study of attitudes. World Englishes, 2011. 30(1): p. 41–59.
28. Lozano GA. Ethics of using language editing services in an era of digital communication and heavily multi-authored papers. Sci Eng Ethics, 2014. 20(2): p. 363–77.
