Dear editor,
We read with great interest the letter by Jamaluddin et al., which highlighted hallucination as a key challenge in artificial intelligence (AI)-generated writing, particularly in academic and clinical contexts.1 Since the publication of their correspondence, developments in large language models (LLMs) have progressed rapidly, with newer systems demonstrating measurable improvements in accuracy. In light of recent performance data for GPT-5, it is timely to revisit this discussion - not to reiterate the risks alone but to acknowledge the notable reduction in hallucination rates and to consider its positive implications for medical and scientific writing.
The launch of GPT-5 has been accompanied by encouraging evidence that one of the most persistent limitations of LLMs - the phenomenon of ‘hallucination’ - can be meaningfully reduced.2 Hallucinations, defined as outputs that are factually inaccurate or not grounded in real-world data, have long posed challenges in medical writing, research and clinical documentation.3 As highlighted in previous work, such errors may arise from limitations in training data, overfitting or gaps in conceptual understanding, and they become especially concerning when fabricated references or distorted interpretations appear in academic or clinical contexts.4
According to recently reported figures, GPT-5 demonstrates a substantial improvement over its predecessors when evaluated with web access. The hallucination rate has decreased to 9.6%, down from 12.9% for GPT-4o - representing a 26% relative reduction. The enhanced reasoning variant, GPT-5-thinking, achieved a rate as low as 4.5%. In addition, GPT-5 produced 44% fewer responses containing at least one major factual error compared with GPT-4o. These advances suggest that improvements in training, reasoning capacity and real-time access to online sources can directly translate into more reliable AI-assisted content generation.2
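For transparency, the 26% figure quoted above can be reproduced directly from the two reported rates; the brief check below uses only the percentages already cited from the system card and assumes no additional data.

\[
\text{Relative reduction} = \frac{12.9\% - 9.6\%}{12.9\%} = \frac{3.3}{12.9} \approx 0.26 \;(\approx 26\%).
\]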
Nevertheless, performance remains context-dependent. When evaluated without internet connectivity on fact-seeking tasks, GPT-5’s hallucination rate increased to 47%, underscoring that human oversight remains indispensable, particularly in high-stakes domains such as medicine. As emphasised by Jamaluddin et al., authors must critically verify AI-generated content and references, while editors and reviewers should adopt proactive screening methods - both manual and software-assisted - to safeguard the accuracy and credibility of publications.5
From a broader perspective, GPT-5’s performance trajectory offers optimism for the future integration of LLMs into scientific and clinical workflows. The data indicate that hallucination management is not only possible but also measurable, and sustained progress could make AI tools more dependable partners in knowledge creation. However, realising this potential will require continued collaboration between developers, clinicians and the academic community, with an unwavering commitment to verification and quality control.4
In conclusion, GPT-5’s marked reduction in hallucinations represents a meaningful step forward. While challenges remain, the combination of technical advancements and responsible human oversight could transform AI from a promising assistant into a trusted contributor to medical and scientific literature.
Acknowledgements
The authors gratefully acknowledge the assistance of colleagues who provided valuable insights during manuscript preparation. In addition, the authors declare that OpenAI’s ChatGPT-5 (GPT-5, OpenAI, San Francisco, CA, USA) was utilised as a writing and language enhancement tool during the preparation of this letter. The model was employed to improve the clarity, grammar and coherence of the text and to ensure scientific precision in phrasing. The authors verified all AI-generated content for factual accuracy, reference integrity and alignment with the intended message. No text, data or references were accepted without human review and modification.
Funding Statement
The authors declare that this study received no financial support.
Author Contributions
Concept: Alyanak B, Temel MH, Yildizgoren MT, Design: Polat S, Dede BT, Data Collection or Processing: Temel MH, Bagcier F, Analysis or Interpretation: Polat S, Dede BT, Literature Search: Alyanak B, Yildizgoren MT, Writing: Polat S, Bagcier F.
Conflicts of interest
The authors declare no conflicts of interest that are relevant to the content of this article.
References
1. Jamaluddin J, Gaffar NA, Din NSS. Hallucination: a key challenge to artificial intelligence-generated writing. Malays Fam Physician. 2023;18(68). doi: 10.51866/lte.527.
2. OpenAI. GPT-5 System Card. August 13, 2025. Accessed August 13, 2025. https://cdn.openai.com/gpt-5-system-card.pdf
3. Alkaissi H, McFarlane SI. Artificial hallucinations in ChatGPT: implications in scientific writing. Cureus. 2023;15(2):e35179. doi: 10.7759/cureus.35179.
4. Al Jannadi K. AI hallucinations: types, causes, impacts, and strategies for detection and prevention. August 26, 2023. Accessed August 11, 2025. https://www.researchgate.net/publication/386148806_AI_Hallucinations_Types_Causes_Impacts_and_Strategies_for_Detection_and_Prevention
5. Roustan D, Bastardot F. The clinicians’ guide to large language models: a general perspective with a focus on hallucinations. Interact J Med Res. 2025;14:e59823. doi: 10.2196/59823.
