Abstract
Our research letter investigates the potential, as well as the current limitations, of widely available text-to-image tools in generating images for medical education. We focused on illustrations of important physical signs in the face that medics should know about, for which confidentiality issues in the conventional use of patient photographs may be a particular concern, and we used facial images of hypothyroidism and Horner syndrome as examples.
Keywords: artificial intelligence, AI, medical illustration, medical images, medical education, image, images, illustration, illustrations, photo, photos, photographs, face, facial, paralysis, photograph, photography, Horner's syndrome, Horner syndrome, Bernard syndrome, Bernard's syndrome, miosis, oculosympathetic, ptosis, ophthalmoplegia, nervous system, autonomic, eye, eyes, pupil, pupils, neurologic, neurological
Introduction
Artificial intelligence (AI) has become integral in medicine, outperforming skilled radiologists in certain domains [1]. However, there is limited exploration of AI's potential in producing illustrations for medical education [2,3]. Confidentiality concerns can limit traditional patient photo use, especially when facial features are essential [4]. Using widely available AI text-to-image tools, we aimed to create images portraying distinct facial signs important for medical trainees—hypothyroidism (myxedema) and Horner syndrome [5,6]. These tools generate unique, high-quality images based on text prompts, utilizing learned probability distributions rather than pre-existing images [7].
Methods
ChatGPT was used to generate prompts for the two AI text-to-image tools used in this study, DALL·E 2 and Midjourney (Multimedia Appendix 1) [8-10]. These prompts were then used to generate images of hypothyroidism and Horner syndrome. The images were assessed and selected by using the following suitability criteria:
- Images were excluded if any of the following features were present: insufficient coverage of the face, blurring, a lack of realistic or humanoid features, a lack of continuity of edges, background noise, cloning errors, and geometrical and shadow inconsistencies.
- Remaining images were accepted if they adequately represented the facial features of hypothyroidism or Horner syndrome, as judged by the coauthors (all of whom are experienced physicians).
If adequate images could not be generated via the above methods, additional prompts that were not generated with ChatGPT were used. If adequate images were still not generated, then secondary editing with Microsoft Paint and GNU Image Manipulation Program (GIMP) was performed on the best image to try to meet the criteria listed above.
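The two-stage screening described above can be sketched in code. This is a minimal illustration only, not the procedure's actual implementation: the screening in this study was done by physician judgment, and the names `ImageReview`, `EXCLUSION_FEATURES`, and `screen`, as well as the example image IDs, are hypothetical. The fallback steps (alternative prompts and secondary editing) are not modeled.

```python
from dataclasses import dataclass, field

# Exclusion features taken from the criteria list above.
# The identifier spellings are hypothetical labels for illustration.
EXCLUSION_FEATURES = {
    "insufficient_face_coverage",
    "blurring",
    "unrealistic_or_non_humanoid",
    "discontinuous_edges",
    "background_noise",
    "cloning_errors",
    "geometry_or_shadow_inconsistency",
}


@dataclass
class ImageReview:
    """One reviewer record for a generated image (hypothetical structure)."""
    image_id: str
    # Exclusion features the reviewer noted in the image, if any.
    defects: set = field(default_factory=set)
    # Physician judgment: does the image adequately show the condition?
    represents_condition: bool = False


def screen(reviews):
    """Apply the exclusion criteria first, then the acceptance criterion."""
    excluded = [r for r in reviews if r.defects & EXCLUSION_FEATURES]
    remaining = [r for r in reviews if not (r.defects & EXCLUSION_FEATURES)]
    accepted = [r for r in remaining if r.represents_condition]
    return excluded, remaining, accepted


# Made-up records for three images, for illustration only.
reviews = [
    ImageReview("img_001", defects={"blurring"}),
    ImageReview("img_002", represents_condition=True),
    ImageReview("img_003"),
]
excluded, remaining, accepted = screen(reviews)
print(len(excluded), len(remaining), len(accepted))  # prints: 1 2 1
```

An image with any single exclusion feature is removed before clinical adequacy is considered, which mirrors the order of the two criteria.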
Results
Facial Features of Hypothyroidism
Using ChatGPT, the following text prompt was generated (restricted to the DALL·E 2 prompt word limit):
Generate an image depicting a middle-aged Caucasian woman with hypothyroidism presenting with facial myxedema. The woman should be shown in a frontal view, focusing on her face, scalp, and neck, without any makeup. The face must be very rounded and extreme scalp balding with coarse hair. Skin looks dry and pale. Outer eyebrows have a paucity of hairs, eyelids look very puffy. She looks tired.
The prompt was used to generate 120 images. Of these, 53 were removed, using our preset exclusion criteria. Of the remaining 67, only 17 met some of the criteria for adequately representing facial features of hypothyroidism. The best image was selected as Figure 1 [9], with no additional editing needed.
Figure 1. Artificial intelligence text-to-image production of facial features typical of hypothyroidism (myxedema) showing classical clinical features, including a rounded face with dry, pale skin; puffy eyelids; a general appearance of tiredness; and partial balding with coarse hair and loss of hair in the eyebrows (especially in the outer third). This image was produced by using DALL·E 2 [9] alone and without additional editing.
Horner Syndrome
The following prompt was obtained from ChatGPT:
Create an illustrative depiction of a patient displaying Horner's syndrome, emphasizing the key clinical features, such as ptosis (drooping of the upper eyelid), miosis (constricted pupil), and anhidrosis (lack of sweating) on one side of the face. Ensure the image is clear and medically accurate, aiding in the understanding of this neurological condition.
Of the 120 images generated, 85 met our exclusion criteria, and none of the remaining images met our inclusion criteria, even after alternative prompts and DALL·E 2 were used. We therefore selected the best image (produced by Midjourney) and performed secondary editing with Microsoft Paint and GIMP (Figure 2 [10]). This produced an image of Horner syndrome that was judged to be adequate.
Figure 2. Generated illustration of Horner syndrome. Image 1 was produced by using Midjourney [10]. Image 2 shows the result after minor image editing (as described in our Methods section) to accentuate the key teaching features, which are labeled in image 3 (A: ptosis; B: miosis; C: apparent enophthalmos; D: upside-down ptosis).
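The screening counts reported for the two conditions can be tallied with a short script. This is a sketch under stated assumptions: the function name `screening_tally` and its field names are hypothetical, the 17 hypothyroidism images are those that met only some of the adequacy criteria, and the Horner syndrome figures assume that the 120 images and 85 exclusions refer to the same batch.

```python
def screening_tally(generated, excluded, candidates):
    """Summarize one screening run from its reported counts.

    generated:  number of images produced from the prompt
    excluded:   number removed by the exclusion criteria
    candidates: number of remaining images that met at least some
                of the adequacy criteria
    """
    remaining = generated - excluded
    return {
        "remaining": remaining,
        "excluded_fraction": excluded / generated,
        "candidate_fraction_of_remaining": (
            candidates / remaining if remaining else 0.0
        ),
    }


# Counts as reported in the Results section.
hypothyroidism = screening_tally(generated=120, excluded=53, candidates=17)
horner = screening_tally(generated=120, excluded=85, candidates=0)

print(hypothyroidism["remaining"], horner["remaining"])  # prints: 67 35
```

On these counts, a larger share of the Horner syndrome images was excluded at the first stage, and none of those remaining was judged adequate without editing.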
Discussion
We aimed to explore the potential, as well as the current limits, of AI text-to-image generation in producing illustrations of medical conditions affecting the face. Without the use of high-quality medical images, it can be more challenging to teach others about these important conditions [11]. We showed that AI text-to-image generation is readily possible for hypothyroidism, a condition with symmetrical features. However, for Horner syndrome, a condition with asymmetrical features, adequate images could only be produced after some additional slight editing, reflecting a possible limiting factor of these tools. To our knowledge, ours are the first AI-generated images of the classical facial features of hypothyroidism and Horner syndrome.
Confidentiality has become an increasing concern in the use of medical images over the last few decades. Text-to-image tools raise their own ethical issues, including issues of consent for the original photos used to train these tools. Additionally, issues of accuracy are key; nonmedics who use such tools might be misled about medical signs. Targets for future research are the potential for biases with these tools and the danger of stereotypes being perpetuated. Despite these limitations, AI-generated images may enhance case-based learning, allowing students to study and analyze a diverse range of medical cases. Text-to-image tools show exciting potential and may allow easier access to high-quality images in medical education [12,13].
Abbreviations
- AI
- artificial intelligence 
- GIMP
- GNU Image Manipulation Program 
Multimedia Appendix 1. Tools used in this article (all prompts entered in English).
Footnotes
Conflicts of Interest: None declared.
References
1. Plesner LL, Müller FC, Nybing JD, Laustrup LC, Rasmussen F, Nielsen OW, Boesen M, Andersen MB. Autonomous chest radiograph reporting using AI: estimation of clinical impact. Radiology. 2023 May;307(3):e222268. doi: 10.1148/radiol.222268.
2. Williams MC, Williams SE, Newby DE. Artificial intelligence-based text-to-image generation of cardiac CT. Radiol Cardiothorac Imaging. 2023 Apr 6;5(2):e220297. doi: 10.1148/ryct.220297. https://europepmc.org/abstract/MED/37274418
3. Adams LC, Busch F, Truhn D, Makowski MR, Aerts HJWL, Bressem KK. What does DALL-E 2 know about radiology? J Med Internet Res. 2023 Mar 16;25:e43110. doi: 10.2196/43110. https://www.jmir.org/2023//e43110/
4. Hill K. Consent, confidentiality and record keeping for the recording and usage of medical images. J Vis Commun Med. 2006 Jun;29(2):76–79. doi: 10.1080/01405110600863365.
5. Siskind SM, Lee SY, Pearce EN. Investigating hypothyroidism. BMJ. 2021 Apr 27;373:n993. doi: 10.1136/bmj.n993.
6. Amonoo-Kuofi HS. Horner's syndrome revisited: with an update of the central pathway. Clin Anat. 1999;12(5):345–361. doi: 10.1002/(SICI)1098-2353(1999)12:5<345::AID-CA5>3.0.CO;2-L.
7. Zhang C, Zhang C, Zhang M, Kweon IS. Text-to-image diffusion models in generative AI: a survey. arXiv. Preprint posted online on Apr 2, 2023. doi: 10.48550/arXiv.2303.07909. https://arxiv.org/pdf/2303.07909.pdf
8. ChatGPT. OpenAI. [2024-01-10]. https://chat.openai.com
9. DALL·E 2. OpenAI. [2024-01-10]. https://openai.com/dall-e-2
10. Midjourney. Midjourney Inc. [2023-07-03]. https://www.midjourney.com/home/
11. Sagoo MG, Vorstenbosch MATM, Bazira PJ, Ellis H, Kambouri M, Owen C. Online assessment of applied anatomy knowledge: the effect of images on medical students' performance. Anat Sci Educ. 2021 May;14(3):342–351. doi: 10.1002/ase.1965.
12. Preiksaitis C, Rose C. Opportunities, challenges, and future directions of generative artificial intelligence in medical education: scoping review. JMIR Med Educ. 2023 Oct 20;9:e48785. doi: 10.2196/48785. https://mededu.jmir.org/2023//e48785/
13. Amri MM, Hisan UK. Incorporating AI tools into medical education: harnessing the benefits of ChatGPT and Dall-E. Journal of Novel Engineering Science and Technology. 2023 Apr 24;2(02):34–39. doi: 10.56741/jnest.v2i02.315. https://journal.iistr.org/index.php/JNEST/article/view/315/249