Over the last 5 years, artificial intelligence (AI) has seen widespread adoption in various fields, including healthcare, finance, and transportation. This growth can be attributed to significant advancements in AI sub-areas [1]. Large language models (LLMs), which are advanced AI models trained on extensive textual data, have gained immense popularity among the general population, including patients, who increasingly rely on them to seek clinical information.
In vitreoretinal conditions in particular, patients are often unaware of “alarm” symptoms, and this lack of awareness may delay presentation for medical attention [2]. Patients often turn to the internet, social networks, and, with the recent boom in artificial intelligence, LLMs to find answers to their questions [3].
We evaluated three free LLMs to assess the type and accuracy of the information they provided in the context of vitreoretinal surgery. The questions were based on common patient queries from our daily practice. The LLMs tested were ChatGPT 3.5 (OpenAI), Bing AI (Microsoft, powered by GPT-4), and DocsGPT Beta (optimized for healthcare and medical contexts by OpenAI). We categorized the questions into two groups: medical advice (Table 1) and medical conditions/post-operative advice (Table 2). The answers provided by the LLMs were reviewed by three vitreoretinal surgeons and classified into one of three categories: accurate and sufficient (correct content with all important information present); partially accurate and sufficient (some incorrect information, but overall answer content understandable and informative); or inaccurate (completely wrong answer or fundamental errors in the response). No human subjects were involved in our study, and the questions used did not include any personal information about patients.
Table 1.
Medical advice questions.
| Questions | ChatGPT | Bing AI | DocsGPT |
|---|---|---|---|
| 1. I have a sudden increase in floaters in my eye. What could it be? What should I do? | Accurate and sufficient | Accurate and sufficient | Accurate and sufficient |
| 2. I have frequent flashes in my eye. What could it be? What should I do? | Accurate and sufficient | Accurate and sufficient | Accurate and sufficient |
| 3. I have a black shadow in the peripheral field. What could it be? What should I do? | Accurate and sufficient | Accurate and sufficient | Accurate and sufficient |
| 4. I have distortion if I close one eye. What could it be? What should I do? | Partially accurate and sufficient | Accurate and sufficient | Partially accurate and sufficient |
| 5. I have a missing patch in the center of my vision if I close one eye. What could it be? What should I do? | Accurate and sufficient | Accurate and sufficient | Accurate and sufficient |
Table 2.
Pre- and post-operative advice questions.
| Questions | ChatGPT | Bing AI | DocsGPT |
|---|---|---|---|
| 1. What is a vitrectomy? | Accurate and sufficient | Accurate and sufficient | Accurate and sufficient |
| 2. What are the risks associated with vitrectomy? | Accurate and sufficient | Accurate and sufficient | Accurate and sufficient |
| 3. Is gas always used after a vitrectomy? | Accurate and sufficient | Accurate and sufficient | Accurate and sufficient |
| 4. How long will gas remain in my eye after vitrectomy? | Accurate and sufficient | Accurate and sufficient | Accurate and sufficient |
| 5. Can I fly after vitrectomy? | Accurate and sufficient | Accurate and sufficient | Accurate and sufficient |
| 6. How should I posture after vitrectomy? | Accurate and sufficient | Accurate and sufficient | Accurate and sufficient |
| 7. Do I have to buy special equipment for posturing after a vitrectomy? | Accurate and sufficient | Accurate and sufficient | Accurate and sufficient |
| 8. When can I do exercise after a vitrectomy? | Partially accurate and sufficient | Partially accurate and sufficient | Partially accurate and sufficient |
| 9. Can I wear my contact lenses after a vitrectomy? | Accurate and sufficient | Accurate and sufficient | Accurate and sufficient |
| 10. Can I swim after a vitrectomy? | Accurate and sufficient | Accurate and sufficient | Accurate and sufficient |
| 11. Can I lift objects after a vitrectomy? | Accurate and sufficient | Accurate and sufficient | Partially accurate and sufficient |
| 12. Can I do exercise after a vitrectomy? | Accurate and sufficient | Partially accurate and sufficient | Partially accurate and sufficient |
| 13. Can I go for a walk after a vitrectomy? | Accurate and sufficient | Accurate and sufficient | Accurate and sufficient |
| 14. Can I drive after a vitrectomy? | Accurate and sufficient | Partially accurate and sufficient | Accurate and sufficient |
| 15. When should I go to the optician after a vitrectomy? | Inaccurate | Accurate and sufficient | Accurate and sufficient |
| 16. I have increasing redness and pain after a vitrectomy. What could it be? What should I do? | Accurate and sufficient | Accurate and sufficient | Accurate and sufficient |
Regarding medical advice questions, ChatGPT and DocsGPT each had 80% of their answers classified as accurate and sufficient and 20% as partially accurate and sufficient. In contrast, Bing AI had 100% of its answers classified as accurate and sufficient (Table 1). For pre- and post-operative advice questions (Table 2), ChatGPT provided accurate and sufficient responses in 88% of cases, partially accurate and sufficient responses in 6%, and inaccurate responses in 6%. Bing AI and DocsGPT each scored 81% for accurate and sufficient answers and 19% for partially accurate and sufficient responses. ChatGPT offered the most detailed information among the LLMs, while Bing AI was the only one providing verifiable references. However, these differences did not substantially affect the accuracy of the responses. All LLMs demonstrated acceptable performance. This research did not aim to determine the best LLM.
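The percentages reported for the pre- and post-operative questions can be reproduced from the ratings in Table 2. The following is a minimal sketch, not part of the original analysis: the single-letter codes, the `TABLE_2` dictionary, and the `percentages` helper are illustrative names, and rounding half up to whole percentages is an assumption that happens to match the reported figures.

```python
import math
from collections import Counter

# Ratings for the 16 questions in Table 2, in question order (transcribed by hand).
# A = accurate and sufficient, P = partially accurate and sufficient, I = inaccurate.
TABLE_2 = {
    "ChatGPT": "AAAAAAAPAAAAAAIA",
    "Bing AI": "AAAAAAAPAAAPAPAA",
    "DocsGPT": "AAAAAAAPAAPPAAAA",
}


def percentages(ratings: str) -> dict[str, int]:
    """Return the share of each rating as a whole percentage (rounded half up)."""
    counts = Counter(ratings)
    total = len(ratings)
    return {
        label: math.floor(100 * counts[label] / total + 0.5)
        for label in "API"
    }


for model, ratings in TABLE_2.items():
    assert len(ratings) == 16  # one rating per question in Table 2
    print(model, percentages(ratings))
```

With 16 questions, each single answer shifts a share by 6.25 percentage points, which is why one partially accurate and one inaccurate answer both round to 6% for ChatGPT.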
LLMs could play a crucial role in vitreoretinal care, assisting with tasks such as summarizing topics for patients, addressing their questions and emails, and facilitating communication with non-English speakers through translation services [4]. These capabilities are especially valuable in virtual clinics, where patients can seek clarifications after receiving virtual medical evaluations [5]. Accessibility is a key advantage, enabling patients to obtain quick answers at any time, which is particularly beneficial for those in remote areas. Moreover, LLMs generate responses that are easier to understand than medical terminology.
LLMs offer potential benefits, but we must address their inherent limitations before incorporating this technology into medical practice. While our evaluation showed generally acceptable performance, further in-depth and extensive testing is necessary. Developing models trained specifically for vitreoretinal surgery could enhance accuracy and coverage of complex topics, helping LLMs become reliable tools. Proper patient education on LLM usage is vital: patients must understand that LLMs complement, not replace, medical professionals. Ethical and legal concerns about data collection and dissemination require attention, as sensitive patient information may be at risk [6]. Strict data protection measures, guidelines, and informed consent are essential.
LLMs have great potential to be a valuable tool in vitreoretinal surgery. The key now is how we, as ophthalmologists, engage with and integrate LLMs into our daily practice.
Author contributions
RA and AM conceived and designed the research. RA, AM, and JH analyzed the data. RA, AM, JH, and LW analyzed and interpreted the literature. RA, AM, JH, and LW drafted the manuscript and made critical revisions of the manuscript.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Littman ML, Ajunwa I, Berger G, Boutilier C, Currie M, Doshi-Velez F, et al. Gathering strength, gathering storms: The one hundred year study on artificial intelligence (AI100) 2021 study panel report. Stanford University, Stanford, CA, 2021. http://ai100.stanford.edu/2021-report. Accessed 16 June 2023.
- 2. Anguita R, Ting MYL, Makuloluwa A, Charteris DG. Causal factors for late presentation of retinal detachment. Eye (Lond) 2023;37:185–6. doi: 10.1038/s41433-022-02109-z.
- 3. Ruran HB, Petty CR, Eliott D, Rao RC, Phipatanakul W, Young BK. Patient perceptions of retinal detachment management and recovery through social media. Semin Ophthalmol. 2023;38:498–502. doi: 10.1080/08820538.2023.2168492.
- 4. Large language models in medicine: The potential to reduce workloads, leverage the EMR for Better Communication & More (2023a) The Rheumatologist. Available at: https://www.the-rheumatologist.org/article/large-language-models-in-medicine-the-potential-to-reduce-workloads-leverage-the-emr-for-better-communication-more/
- 5. Hanumunthadu D, Adan K, Tinkler K, Balaskas K, Hamilton R, Nicholson L, et al. Outcomes following implementation of a high-volume medical retina virtual clinic utilising a diagnostic hub during COVID-19. Eye (Lond) 2022;36:627–33. doi: 10.1038/s41433-021-01510-4.
- 6. Li H, Moon JT, Purkayastha S, Celi LA, Trivedi H, Gichoya JW. Ethics of large language models in medicine and medical research. Lancet Digit Health. 2023;5:e333–35. doi: 10.1016/S2589-7500(23)00083-3.
