PCN Reports: Psychiatry and Clinical Neurosciences

Letter. 2025 Aug 10;4(3):e70179. doi: 10.1002/pcn5.70179

Programming literacy for physician‐scientists in the AI era

Ryoma Kani 1, Hidehiro Gon 2, Shinichiro Nakajima 3, Kazunari Yoshida 4
PMCID: PMC12335935  PMID: 40791978

Physician‐scientists have unique advantages in conducting clinically impactful research by leveraging their clinical insights and experience. In modern medical research, including psychiatry, proficiency in data analysis has become important for physician‐scientists to maximize this advantage. 1 The emergence of generative artificial intelligence (AI) tools has significantly lowered the barriers to data analysis, enabling more physician‐scientists to conduct meaningful research. 2 While AI tools facilitate the development of programming skills, understanding fundamental programming concepts remains necessary.

Programming skills provide a practical foundation for implementing and understanding various data analysis methods. In modern medical research, programming languages such as R and Python have become essential tools, offering advantages over traditional point‐and‐click statistical software. For instance, in psychiatry and clinical neuroscience research, programming literacy enables researchers to handle and analyze large‐scale data sets, including electronic health records (EHRs), neuroimaging, and genetic data. 3 , 4 Furthermore, programming platforms help establish reproducible workflows, enabling research teams to enhance research quality by systematically documenting their methodological decisions through collaborative coding. 5
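
As an illustration of the kind of scripted, reproducible workflow described above, consider a minimal Python sketch (the records and variable names here are hypothetical; in practice the data would come from an EHR export rather than being hard‐coded):

```python
import statistics

# Hypothetical patient records; a real analysis would load these
# from an EHR export (e.g., a CSV file) instead of hard-coding them.
records = [
    {"age": 64, "phq9": 12},
    {"age": 71, "phq9": 18},
    {"age": 58, "phq9": 9},
]

# Because every step is written down as code, collaborators can
# rerun and audit exactly the same analysis.
ages = [r["age"] for r in records]
scores = [r["phq9"] for r in records]
print("Mean age:", round(statistics.mean(ages), 1))
print("Median PHQ-9:", statistics.median(scores))
```

Unlike a sequence of clicks in a graphical statistics package, this script is itself the methodological record: it can be version‐controlled, shared, and rerun by collaborators.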

While physician‐scientists can conduct basic statistical analyses independently, their analytical expertise varies considerably, and they often rely on statisticians for complex statistical methods. However, physician‐scientists without sufficient analytical skills may not be able to properly consider the implications of the analysis results provided by statisticians. Furthermore, statisticians with limited clinical knowledge may find it challenging to analyze data in a clinically meaningful way, potentially missing important medical contexts. Although collaboration with statisticians remains valuable, more effective collaborative approaches can be achieved through reciprocal relationships where physician‐scientists develop fundamental data analysis literacy while sharing clinical insights with statisticians.

Physician‐scientists face significant obstacles in developing these skills, as their dual responsibilities as clinicians and researchers, along with career stage‐specific barriers, make skill acquisition difficult. For mid‐career physicians, this represents a fundamental and challenging “digital skills gap”: they must teach themselves technical skills without a foundational computer science education. This gap stems from the historical absence of data science and coding instruction in traditional medical curricula, which makes late‐career skill acquisition significantly more difficult. Residents pursuing research careers face particular difficulties because intensive clinical duties limit the consistent time available for learning programming. 6 Medical students also struggle, as their education focuses primarily on mastering an expanding body of medical knowledge and preparing for licensing examinations, leaving little room for developing coding skills. 7 Across all career stages, various learning resources such as textbooks and online courses are available, but the lack of personalized mentoring can significantly impede progress.

The advent of generative AI represents a promising approach to addressing the programming literacy challenges facing physician‐scientists. 2 These tools can provide context‐aware code suggestions and real‐time debugging support, potentially offering a more accessible entry point for time‐constrained medical professionals. However, using generative AI in research programming requires careful consideration. The phenomenon of AI hallucination presents a particular risk. A concerning example from clinical practice illustrates it: a physician used ChatGPT to generate an overview of rare diseases, and the output included citations to non‐existent PubMed articles, potentially leading to inappropriate patient management decisions. 8 In the context of coding, non‐functional code is easily identified, but errors in code that produces seemingly correct output can be especially difficult to detect. Fundamental programming literacy therefore remains essential for effectively validating AI‐generated code.
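
A small, hedged illustration of this failure mode (the data values are hypothetical): both lines below run without error and print plausible numbers, yet they answer different statistical questions, a distinction an AI assistant may silently get wrong.

```python
import statistics

scores = [10, 12, 15, 11, 17]  # hypothetical symptom scores

# Both calls run cleanly and print plausible values, but they are not
# interchangeable: pstdev divides by n (population), stdev by n - 1
# (sample). Only code review, not a runtime error, reveals a mix-up.
print("Population SD:", round(statistics.pstdev(scores), 2))
print("Sample SD:", round(statistics.stdev(scores), 2))
```

A researcher who cannot read the code has no way to notice which formula was used; the output looks equally "correct" either way.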

To mitigate these risks, researchers should implement concrete verification methods: testing AI‐generated code with known data sets where expected outcomes can be verified and cross‐checking AI‐provided literature citations against actual publications. Many journals currently require disclosure of AI use in research, with authors maintaining full responsibility for accuracy, emphasizing the critical need for thorough validation. 9 Furthermore, beyond addressing hallucination risks, responsible use of generative AI in clinical research requires broader safety considerations, including version‐controlled prompts and temperature settings for reproducibility, and automatic log audits to prevent protected health information leakage.
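
The first verification method above can be sketched in a few lines of Python: run the candidate (possibly AI‐generated) code on a small data set whose correct answer was computed by hand in advance, and assert agreement. The function and data names here are hypothetical stand‐ins:

```python
def ai_generated_mean(values):
    # Stand-in for a function produced by an AI assistant.
    return sum(values) / len(values)

# Known data set with a hand-computed expected outcome.
known_values = [2.0, 4.0, 6.0]
expected_mean = 4.0

# Validate the AI-generated code before trusting it on real data.
result = ai_generated_mean(known_values)
assert abs(result - expected_mean) < 1e-9, "AI-generated code failed validation"
print("Validation passed:", result)
```

Only after the code reproduces known answers should it be applied to the actual study data.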

To leverage technological advances effectively, medical institutions could consider several approaches to foster programming literacy. 1 , 6 One strategy would be flexible, modular training approaches—focused workshops on specific applications such as EHR data analysis, clinical data visualization, or biostatistical programming. These targeted sessions are accessible even to busy clinicians, allowing them to immediately apply new skills to their current research projects. The establishment of mentorship programs, pairing technically proficient physician‐scientists with learners, might accelerate skill development. Through collaboration with statisticians, physician‐scientists can learn informatics and statistical methods within interdisciplinary teams.

As research methodologies become more sophisticated, programming literacy may become an increasingly important skill for successful physician‐scientists. The synergy between clinical expertise and programming capabilities, enhanced by generative AI tools and guided by appropriate validation practices, will help physician‐scientists accelerate medical discoveries while ensuring research reproducibility.

AUTHOR CONTRIBUTIONS

Ryoma Kani, Hidehiro Gon, and Kazunari Yoshida did the literature search, extracted the data, and wrote the first draft of the manuscript. All authors approved the final version of the manuscript.

CONFLICT OF INTEREST STATEMENT

S. N. has received grants from the Japan Society for the Promotion of Science (18H02755 and 22H03002), the Japan Agency for Medical Research and Development (AMED: JP24wm0625302 and JP24wm0625307), the Japan Research Foundation for Clinical Pharmacology, the Naito Foundation, the Takeda Science Foundation, the Watanabe Foundation, the Osakeno‐Kagaku Foundation, and the Astellas Foundation within the past three years. S. N. has also received research support, manuscript fees, or speaker's honoraria from Sumitomo Pharma, Meiji Seika Pharma, Otsuka, PDR pharma, and MSD within the past three years. K. Y. has received manuscript fees from Sumitomo Dainippon Pharma and consultant fees from Signant Health and WCG Clinical, Inc. within the past three years. The remaining authors declare no conflicts of interest.

ETHICS APPROVAL STATEMENT

N/A.

PATIENT CONSENT STATEMENT

N/A.

CLINICAL TRIAL REGISTRATION

N/A.

AI STATEMENT

This manuscript was edited for English language and academic style using Claude 3.5 Sonnet and ChatGPT‐4, developed by Anthropic Inc. and OpenAI Inc., respectively, on July 5, 2025. The authors retain full responsibility for the accuracy and integrity of the final content.

ACKNOWLEDGMENTS

This study was supported by JSPS KAKENHI (Grant Number 24K18724).

DATA AVAILABILITY STATEMENT

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

REFERENCES

1. Gerber BS, Kannampallil T, Kitsiou S, Heckerling PS. Physician scientists should learn how to program. J Investig Med. 2017;65(8):e5. doi: 10.1136/jim-2017-000573
2. Perkel JM. Six tips for better coding with ChatGPT. Nature. 2023;618(7964):422–423. doi: 10.1038/d41586-023-01833-0
3. Ressler KJ, Williams LM. Big data in psychiatry: multiomics, neuroimaging, computational modeling, and digital phenotyping. Neuropsychopharmacology. 2021;46(1):1–2. doi: 10.1038/s41386-020-00862-x
4. Newby D, Taylor N, Joyce DW, Winchester LM. Optimising the use of electronic medical records for large scale research in psychiatry. Transl Psychiatry. 2024;14(1):232. doi: 10.1038/s41398-024-02911-1
5. Munafò MR, Nosek BA, Bishop DVM, Button KS, Chambers CD, Percie du Sert N, et al. A manifesto for reproducible science. Nat Hum Behav. 2017;1(1):0021. doi: 10.1038/s41562-016-0021
6. Mokhtari B, Badalzadeh R, Ghaffarifar S. The next generation of physician‐researchers: undergraduate medical students' and residents' attitudes, challenges, and approaches towards addressing them. BMC Med Educ. 2024;24(1):1313. doi: 10.1186/s12909-024-06166-8
7. D'Eon MF. The overcrowded curriculum is alarming. Can Med Educ J. 2023;14(4):1–5. doi: 10.36834/cmej.78084
8. Alkaissi H, McFarlane SI. Artificial hallucinations in ChatGPT: implications in scientific writing. Cureus. 2023;15(2):e35179. doi: 10.7759/cureus.35179
9. Ganjavi C, Eppler MB, Pekcan A, Biedermann B, Abreu A, Collins GS, et al. Publishers' and journals' instructions to authors on use of generative artificial intelligence in academic and scientific publishing: bibliometric analysis. BMJ. 2024;384:077192. doi: 10.1136/bmj-2023-077192


Articles from PCN Reports: Psychiatry and Clinical Neurosciences are provided here courtesy of John Wiley & Sons Australia and Japanese Society of Psychiatry and Neurology
