BMC Oral Health. 2025 Oct 17;25:1632. doi: 10.1186/s12903-025-06983-3

What artificial intelligence (AI) can tell us about Nasoalveolar Molding (NAM)?

Şirin Hatipoğlu 1, Esra Çifçi Özkan 2, Fatma Aslı Konca Taşova 3, Özge Özdal Zincir 4
PMCID: PMC12535079  PMID: 41107819

Abstract

Background

The aim of this study was to evaluate the accuracy, reliability and comprehensibility of information about Nasoalveolar Molding (NAM) provided by artificial intelligence (AI).

Methods

A cross-sectional content analysis was conducted on the responses generated by ChatGPT-4 (OpenAI LLC, San Francisco, CA, USA), Gemini (Alphabet Inc., Mountain View, CA, USA) and Copilot (Microsoft Corporation, Redmond, WA, USA). In total, 11 domains and 129 questions were generated, and the answers received by the AI models were evaluated. Descriptive statistics were applied. The Pearson chi-square test was used to test the relationships between categorical variables when the sample size assumption was met, and Fisher’s exact test was used when the sample size assumption was not met. Analyses were performed via the IBM SPSS 27 (IBM Corp. Armonk, NY, USA) program.

Results

There was no statistically significant difference between the AI types and the responses given overall (p > 0.05). However, a significant difference was found between AI types only in the ‘Soft tissues’ domain (p = 0.013), in which ChatGPT-4 gave exclusively ‘Objectively True’ responses.

When each AI type was evaluated separately, the answers in the “Knowledge/Information” domain differed significantly from those in the other domains for all models (ChatGPT-4: p = 0.003, Gemini: p = 0.044, Copilot: p < 0.001). In this domain, ChatGPT-4 and Copilot produced answers mostly in the 'Selected Facts' category, whereas Gemini's answers fell mostly in the 'False' category. For the 'Function' and 'Other' domains, ChatGPT-4 mostly gave 'False' answers. Copilot produced mostly 'Objectively True' answers only for 'Satisfaction' and exclusively 'False' answers for the 'Microbiological/Physiological' domain.

Conclusions

These findings reveal that the accuracy of AI-supported language models in providing medical information may vary according to the subject matter.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12903-025-06983-3.

Keywords: ChatGPT, Gemini, Copilot, Artificial intelligence, Nasoalveolar molding

Introduction

Cleft lip and palate (CLP) is a congenital abnormality resulting from inadequate fusion of the embryonic processes that form the soft and bone tissues of the palate and lips. The prevalence of CLP in live-born infants is estimated to be 1/470–850 in Asian, 1/1370–5000 in Black and 1/775–1000 in Caucasian populations [1]. The family of a baby born with CLP may face both emotional and practical challenges, as well as worries about their child’s future health and social status. The timing and uncertainties of the interventions that need to be implemented can be an additional source of stress for parents [2, 3].

As parents learn to live with this condition, they try to find answers to many questions: How will I feed my baby? When do doctors repair the deformity? Professional support and the right information are important for finding answers to these questions and overcoming difficulties [4].

Treatment of CLP requires a multidisciplinary approach. Nasoalveolar molding (NAM) treatment is a frequently used method before surgical intervention. NAM is performed to align and approximate the alveolar cleft segments and to correct nasal cartilage and lip deformities. NAM facilitates surgical soft tissue repair with minimal tension and scar formation, resulting in improved surgical outcomes [5, 6].

NAM has been shown to improve surgical outcomes, requiring a low rate of soft tissue revision and alveolar bone grafting and reducing the number of total operations per patient from birth to facial maturity [7]. Although NAM treatment has been developed to facilitate the treatment of children with CLP deformities, the treatment process can be anxiety-provoking for parents. The intensity of the treatment period, how long it will last, which stages will be passed through, the expected results, the financial burden, and the nutritional problems that may occur during treatment may cause anxiety for parents. Parents can turn to a variety of support resources to address their concerns and ask questions during this process. Surgeons, orthodontists, nurses, psychologists, social media and, increasingly, digital platforms can be used to address their concerns and find answers to their questions [8, 9].

Currently, the internet has become a very popular means of accessing information. People frequently use the internet to find answers to their questions about medical issues. In particular, artificial intelligence (AI) language models have become important tools for finding answers to medical questions [10]. While these platforms can provide fast and seemingly accurate information, there are questions about the accuracy and reliability of the information obtained [11].

Natural language generation (NLG) models based on AI, which seek to introduce advanced intelligence to various computer systems, are AI models that can be trained on large amounts of text data to generate human-like text, answer questions, and perform other language-related tasks with high accuracy [12]. These models can help healthcare professionals stay informed about updates and new developments in their field, as well as provide summaries of current medical literature and guidance on differential diagnoses, treatment options and potential risks. Patients can use these language models to access health information, medical advice and counseling and to answer their questions [13].

The generative pretrained transformer (GPT) is an NLG model that can generate human-like text, answer questions and perform other language-related tasks with high accuracy [14]. Generative pretrained transformer 4 (GPT-4), developed by OpenAI (OpenAI LLC, San Francisco, CA, USA), is the fourth-generation language model of the GPT series [15]. GPT-4, released on March 14, 2023, was trained on large amounts of textual data, including books, articles and websites [14]. It has the ability to generate accurate, comprehensive and detailed answers because of the vast amount of data available from internet sources [16]. The roles of ChatGPT in the healthcare field include use by patients, use by healthcare professionals, use in public health and use in databases [17]. Other AI models offer functionalities similar to those of ChatGPT. Gemini, developed by Google (Alphabet, Inc., Mountain View, CA, USA), can perform many different functions, such as understanding language, providing information, generating creative content and answering questions to help users with a variety of topics. It provides up-to-date information thanks to Google’s powerful infrastructure and search engine integration [18]. Powered by Microsoft’s (Microsoft Corporation, Redmond, WA, USA) AI innovation and using OpenAI’s GPT-4 language model, Copilot is a generative AI tool that delivers natural conversational interactions [19]. As each model is trained on different datasets, their responsiveness and strengths may differ [20]. ChatGPT-4.0, Gemini and Copilot were all developed via large language models (LLMs) and were introduced in 2023 [21].

Although the assessment of AI language models’ responses to medical questions has been demonstrated in various studies [22–26], information on how effective they are at generating responses for a specific treatment is limited [27]. For example, the literature has not evaluated the content, accuracy and reliability of answers to possible questions about NAM, which we aimed to do in this study, together with a systematic comparison of the performance of the three selected AI-supported language models. The findings provide recommendations for improving information sharing in the NAM treatment process and increasing the effectiveness of AI tools in healthcare. This study provides important data for understanding the role of AI language models in health communication, and the findings can be used to improve parents’ access to accurate information.

Our hypothesis, in light of previous studies, was that the information provided by AI chatbots on NAM would be accurate and that ChatGPT-4 would be the most accurate information provider, even if there would be differences between different chatbots.

Materials and methods

In this study, content analysis of the responses generated by ChatGPT-4 (OpenAI LLC, San Francisco, CA, USA), Gemini (Alphabet Inc., Mountain View, CA, USA) and Copilot (Microsoft Corporation, Redmond, WA, USA) to the questions about NAM was conducted.

First, 11 domains and 129 questions were created by 4 authors (SH, FAKT, OOZ, ECO) (Appendix 1). Of the authors of the questions, 2 were orthodontists experienced in providing NAM treatment, and 1 was an oral and maxillofacial surgeon. The questions were designed to cover all topics that patients may ask or be curious about regarding NAM. While creating the question pool, all questions that the public/laypeople may ask that could be relevant to the subject and treatment process were included by specialists in the field. The questions are based on the NAM treatment criteria defined by Grayson BH et al. in 1993 [28] and 1999 [6]; unfortunately, there is no other specific guide on this subject that could serve as a template. The answers given by each AI to the questions were collected by one author (FAKT).

During the interaction with the language models, no specific prompt engineering or parameter adjustment was performed. All the questions were manually entered into the web interfaces of ChatGPT (OpenAI), Gemini (Google), and Copilot (Microsoft) without modifying the model parameters; therefore, the default settings were used for each platform. On the basis of the providers’ standard configurations, the approximate parameters were as follows: ChatGPT: temperature ≈ 0.7, top-p = 1.0, max tokens ≈ 1024; Gemini: temperature ≈ 0.7, top-p = 1.0; Copilot (Balanced mode): temperature ≈ 0.7, top-p = 1.0. The questions were posed to the AI chatbots between January 13 and 16, 2025, via the web client interface. Each LLM provider’s account was logged in, and the questions were asked one by one. ChatGPT was queried over 2 days (January 13 and 14, 2025), whereas Copilot and Gemini were each queried on a single day (January 15 and 16, respectively).
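The default sampling parameters above could be pinned explicitly in a replication that queries the models through an API rather than the web interface. A minimal, hypothetical sketch follows; the request shape mirrors common chat-API conventions, and the helper function, model name, and payload are illustrative assumptions, not the configuration used in this study:

```python
# Hypothetical sketch: pin the approximate web-UI defaults reported above
# so that an API-based replication uses the same sampling parameters.
# The payload shape follows common chat-API conventions (illustrative only).
def build_request(question, model="gpt-4"):
    """Assemble one question as a chat-style request payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "temperature": 0.7,  # approximate default reported above
        "top_p": 1.0,
        "max_tokens": 1024,
    }

req = build_request("What is Nasoalveolar Molding (NAM) treatment?")
print(req["temperature"])  # -> 0.7
```

Logging each payload alongside the model's reply would make the question-by-question collection described above auditable after the fact.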

The accuracy of the collected answers was independently scored by 3 orthodontists (SH, FAKT, ECO) and 1 oral and maxillofacial surgeon (OOZ). Before the scoring process started, a meeting was organized to establish a common understanding of the scoring system among the assessors. After the scoring process, a second meeting was organized, and all the answers were evaluated. Answers scored differently by the researchers were reread and discussed until a consensus score was accepted as correct. A five-point Likert scale was used to assess the accuracy of the answers (Table 1) [29].

Table 1.

Definitions of the different accuracy categories

Category Definition
A: Objectively True A claim that is based on scientific evidence and presents all relevant information, whether positive or negative.
B: Selected Facts A claim that presents some true selected facts based on scientific evidence but omits important information related to a product.
C: Minimal Facts A claim that exaggerates the benefit of the product, with an overemphasis on the benefit supported by poor-quality scientific evidence.
D: NonFacts A claim that presents an intangible characteristic. Often these claims are in the form of product opinions or lifestyle claims, leaving clinicians/patients to misinterpret the opinion as an objective product evaluation.
F: False A claim that is objectively false either due to lack of evidence to support it or contradicting available evidence.

The scores assessed in such an analysis are not treated as hard-and-fast “ground truths”; rather, the analysis focuses on the median value of the scores for each response, as the quality of the responses is assessed subjectively [30]. The analysis was based on the principles of the crowd (or ensemble) scoring strategy, which aims to analyze the quality of each response more fairly by considering each response from different perspectives via various evaluators [25]. In this study, since only publicly available information was evaluated, ethical approval was not needed.
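The median-based crowd (ensemble) scoring idea can be sketched as follows. The Likert mapping mirrors the five accuracy categories in Table 1, while the numeric ratings in the example are hypothetical, not the study's actual scores:

```python
# Sketch of crowd (ensemble) scoring: each response is rated independently
# by several evaluators, and the median rating is taken as the consensus.
# Categories map the five-point Likert scale of Table 1 (1 = False .. 5 =
# Objectively True). Ratings below are hypothetical examples.
from statistics import median

LIKERT = {1: "False", 2: "NonFacts", 3: "Minimal Facts",
          4: "Selected Facts", 5: "Objectively True"}

def consensus(scores):
    """Median of the evaluators' scores, mapped back to its category label."""
    return LIKERT[round(median(scores))]

# Four evaluators rate one AI response (hypothetical values).
print(consensus([5, 4, 5, 5]))  # -> Objectively True
```

Taking the median rather than the mean keeps a single outlying evaluator from pulling the consensus category, which is the fairness property the strategy aims for.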

Statistical methods

In this study, descriptive statistics of the data are given. To test the relationships between categorical variables, the Pearson chi-square test was applied when the sample size assumption (expected value >5) was met, and Fisher’s exact test was applied when the sample size assumption was not met. Analyses were performed via the IBM SPSS 27 (IBM Corp. Armonk, NY, USA) program.
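The test-selection rule described above (Pearson chi-square when every expected cell count exceeds 5, Fisher's exact test otherwise) can be illustrated with a short sketch. The analyses themselves were run in SPSS; this is only an assumption-labeled illustration, using the 'False' row counts from Table 2 as example data:

```python
# Illustration of the test-selection rule above (not the SPSS workflow):
# compute expected cell counts under independence and choose the Pearson
# chi-square test when all expected counts exceed 5, otherwise Fisher's
# exact test.

def expected_counts(table):
    """Expected cell counts under independence for a 2-D contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]

def choose_test(table, threshold=5):
    """Return the test suggested by the expected-count assumption."""
    expected = expected_counts(table)
    if all(cell > threshold for row in expected for cell in row):
        return "pearson_chi_square"
    return "fisher_exact"

# 'False' counts per model from Table 2 (ChatGPT, Gemini, Copilot)
# versus the remaining answers out of 129 questions each.
table = [[6, 10, 6], [123, 119, 123]]
print(choose_test(table))  # -> pearson_chi_square
```

With these counts every expected cell exceeds 5 (the smallest is 22 × 129 / 387 ≈ 7.3), consistent with the chi-square test being applied to Table 2, while the sparse per-domain tables (Tables 3–6) fall back to Fisher's exact test.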

Results

The Pearson chi-square test was applied to examine the relationships between the answers given according to AI type. As a result of the analysis, no statistically significant relationship was found between AI types and responses (p > 0.05) (Table 2) (Fig. 1); the distributions of responses were homogeneous across the models.

Table 2.

Distribution of responses by AI type and their relationships

ChatGPT Gemini Copilot
Answers n % %AI n % %AI n % %AI Test Statistics p
False 6 27,3 4,7 10 45,5 7,8 6 27,3 4,7 9,080 0,336
NonFacts 4 20,0 3,1 7 35,0 5,4 9 45,0 7,0
Minimal Facts 4 18,2 3,1 7 31,8 5,4 11 50,0 8,5
Selected Facts 25 30,9 19,4 26 32,1 20,2 30 37,0 23,3
Objectively True 90 37,2 69,8 79 32,6 61,2 73 30,2 56,6

% Row percentage and %AI Column percentage for AI

Fig. 1.

Fig. 1

Distribution of answers according to AI type

Fisher’s exact tests were used to examine the relationships between the answers given according to AI type for each domain. Although the ‘Objectively True’ response rates were 69.8% for ChatGPT, 61.2% for Gemini and 56.6% for Copilot, and the ‘False’ response rates were 7.8% for Gemini and 4.7% for both ChatGPT and Copilot, the differences among these rates were not statistically significant (Table 2). A statistically significant relationship between AI type and responses was observed only for the ‘Soft Tissues’ domain (p < 0.05) (Table 3): ChatGPT answered entirely ‘Objectively True’, whereas Gemini answered mostly ‘Objectively True’ and ‘Selected Facts’. The correct answers for the ‘Soft Tissues’ domain ranked ChatGPT > Gemini > Copilot.

Table 3.

Distribution of responses by AI type for domains and their relationships

Domains Answers ChatGPT Gemini Copilot Test Statistic p
n % %AI n % %AI n % %AI
Knowledge/Information False 1 12,5 3,3 6 75,0 20,0 1 12,5 3,3 7,201 0,490
NonFacts 1 33,3 3,3 1 33,3 3,3 1 33,3 3,3
Minimal Facts 2 40,0 6,7 2 40,0 6,7 1 20,0 3,3
Selected Facts 13 34,2 43,3 10 26,3 33,3 15 39,5 50,0
Objectively True 13 36,1 43,3 11 30,6 36,7 12 33,3 40,0
Compliance False 0 0,0 0,0 0 0,0 0,0 1 100,0 6,3 5,745 0,843
NonFacts 0 0,0 0,0 1 50,0 6,3 1 50,0 6,3
Minimal Facts 0 0,0 0,0 0 0,0 0,0 1 100,0 6,3
Selected Facts 2 50,0 12,5 1 25,0 6,3 1 25,0 6,3
Objectively True 14 35,0 87,5 14 35,0 87,5 12 30,0 75,0
Alveolar Crest Alignment False 0 0,0 0,0 0 0,0 0,0 0 0,0 0,0 9,955 0,072
NonFacts 1 33,3 14,3 1 33,3 14,3 1 33,3 14,3
Minimal Facts 0 0,0 0,0 0 0,0 0,0 4 100,0 57,1
Selected Facts 1 20,0 14,3 3 60,0 42,9 1 20,0 14,3
Objectively True 5 55,6 71,4 3 33,3 42,9 1 11,1 14,3
Soft Tissues False 0a 0,0 0,0 1a 100,0 8,3 0a 0,0 0,0 13,268 0,013*
NonFacts 0a 0,0 0,0 0a 0,0 0,0 1a 100,0 8,3
Minimal Facts 0a 0,0 0,0 1a 100,0 8,3 0a 0,0 0,0
Selected Facts 0a 0,0 0,0 5b 62,5 41,7 3a, b 37,5 25,0
Objectively True 12a 48,0 100,0 5b 20,0 41,7 8a, b 32,0 66,7
Function False 2 50,0 33,3 1 25,0 16,7 1 25,0 16,7 3,856 0,918
NonFacts 1 16,7 16,7 2 33,3 33,3 3 50,0 50,0
Minimal Facts 1 100,0 16,7 0 0,0 0,0 0 0,0 0,0
Selected Facts 0 0,0 0,0 0 0,0 0,0 0 0,0 0,0
Objectively True 2 28,6 33,3 3 42,9 50,0 2 28,6 33,3
Satisfaction False 0 0,0 0,0 0 0,0 0,0 0 0,0 0,0 1,305 1,000
NonFacts 0 0,0 0,0 0 0,0 0,0 0 0,0 0,0
Minimal Facts 0 0,0 0,0 0 0,0 0,0 0 0,0 0,0
Selected Facts 0 0,0 0,0 1 50,0 9,1 1 50,0 9,1
Objectively True 11 35,5 100,0 10 32,3 90,9 10 32,3 90,9
Harms False 0 0,0 0,0 2 100,0 14,3 0 0,0 0,0 7,525 0,443
NonFacts 1 100,0 7,1 0 0,0 0,0 0 0,0 0,0
Minimal Facts 0 0,0 0,0 2 50,0 14,3 2 50,0 14,3
Selected Facts 3 33,3 21,4 2 22,2 14,3 4 44,4 28,6
Objectively True 10 38,5 71,4 8 30,8 57,1 8 30,8 57,1
Oral Hygiene False 0 0,0 0,0 0 0,0 0,0 1 100,0 16,7 6,426 0,308
NonFacts 0 0,0 0,0 1 33,3 16,7 2 66,7 33,3
Minimal Facts 0 0,0 0,0 1 100,0 16,7 0 0,0 0,0
Selected Facts 0 0,0 0,0 0 0,0 0,0 0 0,0 0,0
Objectively True 6 46,2 100,0 4 30,8 66,7 3 23,1 50,0
Microbiological/Physiological False 0 0,0 0,0 0 0,0 0,0 2 100,0 100,0 6,043 0,204
NonFacts 0 0,0 0,0 0 0,0 0,0 0 0,0 0,0
Minimal Facts 0 0,0 0,0 0 0,0 0,0 0 0,0 0,0
Selected Facts 1 100,0 50,0 0 0,0 0,0 0 0,0 0,0
Objectively True 1 33,3 50,0 2 66,7 100,0 0 0,0 0,0
Efficiency/Cost Effectiveness False 0 0,0 0,0 0 0,0 0,0 0 0,0 0,0 3,376 0,634
NonFacts 0 0,0 0,0 0 0,0 0,0 0 0,0 0,0
Minimal Facts 0 0,0 0,0 0 0,0 0,0 2 100,0 22,2
Selected Facts 3 42,9 33,3 2 28,6 22,2 2 28,6 22,2
Objectively True 6 33,3 66,7 7 38,9 77,8 5 27,8 55,6
Other False 3 100,0 18,8 0 0,0 0,0 0 0,0 0,0 6,919 0,548
NonFacts 0 0,0 0,0 1 100,0 6,3 0 0,0 0,0
Minimal Facts 1 33,3 6,3 1 33,3 6,3 1 33,3 6,3
Selected Facts 2 28,6 12,5 2 28,6 12,5 3 42,9 18,8
Objectively True 10 29,4 62,5 12 35,3 75,0 12 35,3 75,0

% Row percentage, %AI Column percentage for AI

*p < 0,05

For ChatGPT, Gemini and Copilot, a statistically significant relationship was found between domain types and responses.

For ChatGPT, mostly ‘Selected Facts’ answers were obtained for the ‘Knowledge/Information’ domain, and mostly ‘False’ answers were obtained for the ‘Function’ and ‘Other’ domains (p < 0.05) (Table 4) (Fig. 2).

Table 4.

Distribution of responses by domain for ChatGPT and their relationships

Domains False NonFacts Minimal Facts Selected Facts Objectively True Test Statistics p
n % %C. n % %C. n % %C. n % %C. n % %C.
Knowledge/Information 1a, b 3,3 16,7 1a, b 3,3 25,0 2a, b 6,7 50,0 13b 43,3 52,0 13a 43,3 14,4 52,719 0,003*
Compliance 0a 0,0 0,0 0a 0,0 0,0 0a 0,0 0,0 2a 12,5 8,0 14a 87,5 15,6
Alveolar Crest Alignment 0a 0,0 0,0 1a 14,3 25,0 0a 0,0 0,0 1a 14,3 4,0 5a 71,4 5,6
Soft Tissues 0a 0,0 0,0 0a 0,0 0,0 0a 0,0 0,0 0a 0,0 0,0 12a 100,0 13,3
Function 2a 33,3 33,3 1a, b 16,7 25,0 1a, b 16,7 25,0 0a 0,0 0,0 2b 33,3 2,2
Satisfaction 0a 0,0 0,0 0a 0,0 0,0 0a 0,0 0,0 0a 0,0 0,0 11a 100,0 12,2
Harms 0a 0,0 0,0 1a 7,1 25,0 0a 0,0 0,0 3a 21,4 12,0 10a 71,4 11,1
Oral Hygiene 0a 0,0 0,0 0a 0,0 0,0 0a 0,0 0,0 0a 0,0 0,0 6a 100,0 6,7
Microbiological/Physiological 0a 0,0 0,0 0a 0,0 0,0 0a 0,0 0,0 1a 50,0 4,0 1a 50,0 1,1
Efficiency/Cost Effectiveness 0a 0,0 0,0 0a 0,0 0,0 0a 0,0 0,0 3a 33,3 12,0 6a 66,7 6,7
Other 3a 18,8 50,0 0a 0,0 0,0 1a 6,3 25,0 2a 12,5 8,0 10a 62,5 11,1

% Row percentage, %C Column percentage for responses, and different lettering values indicate differences between column proportions

*p < 0,05

Fig. 2.

Fig. 2

Pie chart of the distribution of responses according to domains for ChatGPT

For Gemini, the answers received in the ‘Knowledge/Information’ domain were mostly ‘False’ (p < 0.05) (Table 5) (Fig. 3).

Table 5.

Distribution of responses by domain for Gemini and their relationships

Domains False NonFacts Minimal Facts Selected Facts Objectively True Test Statistics p
n % %C. n % %C. n % %C. n % %C. n % %C.
Knowledge/Information 6a 20,0 60,0 1a, b 3,3 14,3 2a, b 6,7 28,6 10a, b 33,3 38,5 11b 36,7 13,9 45,080 0,044*
Compliance 0a 0,0 0,0 1a 6,3 14,3 0a 0,0 0,0 1a 6,3 3,8 14a 87,5 17,7
Alveolar Crest Alignment 0a 0,0 0,0 1a 14,3 14,3 0a 0,0 0,0 3a 42,9 11,5 3a 42,9 3,8
Soft Tissues 1a 8,3 10,0 0a 0,0 0,0 1a 8,3 14,3 5a 41,7 19,2 5a 41,7 6,3
Function 1a, b 16,7 10,0 2b 33,3 28,6 0a, b 0,0 0,0 0a 0,0 0,0 3a, b 50,0 3,8
Satisfaction 0a 0,0 0,0 0a 0,0 0,0 0a 0,0 0,0 1a 9,1 3,8 10a 90,9 12,7
Harms 2a 14,3 20,0 0a 0,0 0,0 2a 14,3 28,6 2a 14,3 7,7 8a 57,1 10,1
Oral Hygiene 0a 0,0 0,0 1a 16,7 14,3 1a 16,7 14,3 0a 0,0 0,0 4a 66,7 5,1
Microbiological/Physiological 0a 0,0 0,0 0a 0,0 0,0 0a 0,0 0,0 0a 0,0 0,0 2a 100,0 2,5
Efficiency/Cost Effectiveness 0a 0,0 0,0 0a 0,0 0,0 0a 0,0 0,0 2a 22,2 7,7 7a 77,8 8,9
Other 0a 0,0 0,0 1a 6,3 14,3 1a 6,3 14,3 2a 12,5 7,7 12a 75,0 15,2

% Row percentage, %C Column percentage for responses, and different lettering values indicate differences between column proportions

*p < 0,05

Fig. 3.

Fig. 3

Pie chart of the distribution of responses by domain for Gemini

For Copilot, the responses for the ‘Knowledge/Information’ domain were mostly ‘Selected Facts’; the responses for the ‘Microbiological/Physiological’ domain were mostly ‘False’; the responses for the ‘Alveolar Crest Alignment’ domain were mostly ‘Minimal Facts’; and the responses for the ‘Satisfaction’ domain were mostly ‘Objectively True’ (p < 0.05) (Table 6) (Fig. 4).

Table 6.

Distribution of responses by domain for Copilot and their relationships

Domains False NonFacts Minimal Facts Selected Facts Objectively True Test Statistics p
n % %C. n % %C. n % %C. n % %C. n % %C.
Knowledge/Information 1a, b 3,3 16,7 1a, b 3,3 11,1 1a, b 3,3 9,1 15b 50,0 50,0 12a 40,0 16,4 67,052 < 0,001*
Compliance 1a 6,3 16,7 1a 6,3 11,1 1a 6,3 9,1 1a 6,3 3,3 12a 75,0 16,4
Alveolar Crest Alignment 0a, b 0,0 0,0 1a, b 14,3 11,1 4b 57,1 36,4 1a 14,3 3,3 1a 14,3 1,4
Soft Tissues 0a 0,0 0,0 1a 8,3 11,1 0a 0,0 0,0 3a 25,0 10,0 8a 66,7 11,0
Function 1a, b 16,7 16,7 3b 50,0 33,3 0a, b 0,0 0,0 0a 0,0 0,0 2a 33,3 2,7
Satisfaction 0a 0,0 0,0 0a 0,0 0,0 0a 0,0 0,0 1a 9,1 3,3 10a 90,9 13,7
Harms 0a 0,0 0,0 0a 0,0 0,0 2a 14,3 18,2 4a 28,6 13,3 8a 57,1 11,0
Oral Hygiene 1a 16,7 16,7 2a 33,3 22,2 0a 0,0 0,0 0a 0,0 0,0 3a 50,0 4,1
Microbiological/Physiological 2a 100,0 33,3 0a, b 0,0 0,0 0a, b 0,0 0,0 0b 0,0 0,0 0b 0,0 0,0
Efficiency/Cost Effectiveness 0a 0,0 0,0 0a 0,0 0,0 2a 22,2 18,2 2a 22,2 6,7 5a 55,6 6,8
Other 0a 0,0 0,0 0a 0,0 0,0 1a 6,3 9,1 3a 18,8 10,0 12a 75,0 16,4

% Row percentage, %C Column percentage for responses, and different lettering values indicate differences between column proportions

*p < 0,05

Fig. 4.

Fig. 4

Pie chart of the distribution of responses by domain for Copilot

A comparison of the performance of the three AI models is given in Table 7.

Table 7.

A comparison of the performance of the three AI models

ChatGPT-4 Gemini Copilot
Description Could not define the abbreviation CLP; expanded it in 6 different ways, none of which was Cleft Lip and Palate. We had to write out the full term in order to obtain answers. Able to define CLP; there was no need to write out the full term. Able to define CLP; there was no need to write out the full term.
Answer Long and complicated; gives additional information beyond the answer to the question, which can obscure the key facts. Short and clear. Short and clear.
Copying Has a copy button. No copy button. Has a copy button.
Usage Easy to use. Easy to use. Easy to use.
Sources Does not always cite sources. Always cites sources after each answer. Always cites sources after each answer.

Discussion

In this study, we examined the accuracy, reliability and extent of the information provided by AI-supported language models for NAM treatment. Although online resources have become widely used to answer questions about health problems that individuals are trying to overcome, the accuracy and quality of the answers obtained from these resources are questionable [40]. Incorrect or incomplete information can affect treatment success and lead to false expectations among parents. It is therefore important to evaluate such resources to prevent the potentially harmful effects of misdirection on such specialized and important issues and to promote the flow of accurate information between healthcare providers and patients.

In our study, the content of the responses generated by ChatGPT-4, Gemini and Copilot for NAM was analyzed, and no statistically significant relationships were found between the three AI types and the responses. Similarly, Rossetti et al., in their study evaluating the overall performance of three different AI chatbots, reported that the chatbots showed promising accuracy in quickly predicting correct answers and produced grammatically correct, coherent writing [33].

In our study, ChatGPT was found to yield better and more accurate answers only for the ‘Soft Tissues’ domain: ChatGPT answered entirely ‘Objectively True’, whereas Gemini answered mostly ‘Objectively True’ and ‘Selected Facts’, and the correct answers ranked ChatGPT > Gemini > Copilot. In the literature, studies comparing ChatGPT, Copilot and Gemini report different results and find differences between the AIs. ChatGPT-4.0 was reported to outperform Gemini and Copilot, providing detailed and accurate answers, but its answers were found to be more complex for patients to understand [34]. We also found that the responses given by ChatGPT-4 were longer and more complex than those given by Gemini and Copilot; in addition, it provides extra information beyond the answer to the question, which can obscure the key facts (Table 7). In another study comparing ChatGPT-4, Copilot and Gemini, both ChatGPT-4 and Copilot outperformed Gemini [33]. In contrast, another study comparing three AI language models revealed that Copilot gave more accurate results for interpreting biochemical data than ChatGPT-3.5 and Gemini did [35]. Similarly, in the nursing context, where the nursing skills and knowledge of ChatGPT-3.5, Copilot, Gemini, and Llama-2 were evaluated, Copilot was found to have the highest accuracy, followed by ChatGPT-3.5, Gemini and Llama-2 [36].

In our study, the ‘Objectively True’ response rates of the 3 AI types were homogeneous, ranging from 57% to 70%. This finding shows that the AI models provide moderately accurate and adequate general information about NAM treatment. Abu Arqub et al. reported that the responses generated by ChatGPT exhibited insufficient accuracy and did not support existing evidence [37]. On the other hand, Demirsoy et al. prepared frequently asked questions about the four basic areas of orthodontics and directed them to ChatGPT, and the results of their study showed that ChatGPT has the potential to be a valuable resource for patient education and information dissemination in orthodontics [38]. Tanaka et al. evaluated five orthodontists’ assessments of ChatGPT responses regarding clear aligners, temporary anchorage devices and digital imaging and reported that ChatGPT was effective in providing quality responses [39]. Duran et al. reported that ChatGPT provides highly reliable, high-quality, but difficult-to-read information about CLP and emphasized that the information obtained should be verified by a qualified medical professional [40].

In our study, although there was no significant difference among the responses of the AI types, statistically significant relationships were found between domain types and responses for each AI. This shows that although there is no difference between AI software programs in terms of general knowledge, there are differences in specific topics and subtopics. Similarly, Nastasi et al. reported that ChatGPT’s medical advice was generally safe but sometimes lacked detail or nuance. Because 35% of the responses offered general advice and lacked personalized information, they concluded that ChatGPT is currently useful for providing background information on general clinical issues but cannot reliably provide personalized or appropriate medical advice [22].

Although the assessment of AI language models’ responses to medical questions has been demonstrated in various studies [22–26], information on how effective they are at generating responses for a specific treatment is limited [27]. Although there are studies in the literature investigating the utility of ChatGPT in cleft lip repair [32], the literature has not yet evaluated the content, accuracy and reliability of responses to possible NAM-related questions, which we aimed to do in our study, nor has it systematically compared the performance of the three selected AI-assisted language models. Our findings provide recommendations for improving information sharing in the NAM treatment process and increasing the effectiveness of AI tools in healthcare. This study provides important data for understanding the role of AI language models in health communication, and the findings can be used to improve parents’ access to accurate information.

Our hypothesis, in light of previous studies, was that the information provided by AI chatbots on NAM would be accurate and that ChatGPT-4 would be the most accurate information provider, even if there would be differences between different chatbots. However, the results revealed that there was no statistically significant difference between the AI types and the responses given.

With information constantly being uploaded to the internet, AI chatbots learn continuously and very quickly. The answers to our questions were mostly drawn from health, hospital or physician websites. Having professionals upload data to these sites, or having professionals perform a final check before data are uploaded, may be one way to increase the accuracy of the information provided. In addition, increasing the flow of data from scientific sources is another way to increase accuracy.

The constant flow of information on the internet means that AI chatbots are open to continuous learning and development. Because our study was conducted over a specific time interval, we measured the accuracy, reliability and comprehensibility of the information within that interval; asking the same questions at different times and evaluating the answers may yield different results because of differences in information timeliness. In addition, we used a five-point Likert scale as the evaluation criterion and performed our evaluation with only 4 evaluators, and including different evaluation methods or different groups of evaluators could enrich the evaluation. These factors constitute the limitations of our study.

Conclusions

The direct use of AI-supported language models, especially in patient information processes, should be carefully evaluated and considered. Notably, patients can access the most accurate NAM information from their healthcare providers.

Supplementary Information

Supplementary Material 1. (17.2KB, docx)
Supplementary Material 2. (500.1KB, docx)

Acknowledgements

Not available.

Appendix

Table 8.

Table of domains and questions

Domain Questions
1. Knowledge/Information 1. What is CLP?
2. Which type of CLP is most common?
3. What is Nasoalveolar Molding (NAM) treatment?
4. Which gender is most commonly treated with NAM?
5. Is there a certain age limit for NAM treatment?
6. When does NAM treatment end (criteria)?
7. How is NAM treatment performed?
8. In the NAM appliance, when are the buttons placed?
9. When is a nasal stent placed in NAM?
10. What is the importance of taping in NAM treatment?
11. What material is the NAM apparatus made of?
12. To whom is NAM performed?
13. To whom is NAM not performed?
14. Is NAM treatment necessary?
15. What are the instructions for use of the NAM appliance?
16. Are the patient instructions for use of the NAM appliance easily understood by patients?
17. Is the content of social media platforms related to NAM treatment accurate in terms of the reliability of information?
18. Are there Randomized Controlled Trials proving the efficacy of NAM treatment?
19. How often are patients undergoing NAM treatment checked?
20. Is the frequency of NAM patient check-ups different from the frequency of other orthodontic treatment check-ups?
21. What are the reasons why patients on NAM treatment are called for follow-up visits more frequently than patients on fixed treatment?
22. What are the consequences for patients undergoing NAM treatment if they miss their check-ups?
23. What are the advantages of NAM treatment?
24. What are the disadvantages of NAM treatment?
25. What are the feeding instructions for infants undergoing NAM treatment?
26. How to feed babies undergoing NAM treatment?
27. How are CLP babies fed?
28. Can babies on NAM treatment be bottle-fed?
29. Can babies undergoing NAM treatment be breastfed?
30. What should be considered during the feeding of babies undergoing NAM treatment?
31. Are there myths about NAM treatment? If so, what are they?
2. Compliance
32. Is parental compliance necessary for NAM treatment?
33. Is it easy for parents to adapt to NAM treatment?
34. Is patient compliance necessary for NAM treatment?
35. Is it easy for patients to comply with NAM treatment?
36. What happens if the NAM appliance is lost?
37. How often are NAM appliance fractures reported?
38. How often are NAM appliance losses reported?
39. What is the average daily usage time of the NAM device?
40. Is the NAM appliance aesthetic?
41. Does NAM treatment change facial aesthetics?
42. Is the NAM appliance easy for the patient to use?
43. Is the NAM appliance easy for the parent to use?
44. What happens if the use of the NAM device is interrupted for a period of time while treatment is ongoing?
45. Can the orthodontist recognize if the NAM appliance is not used properly?
46. Which methods can be used to measure the compliance of patients/parents using NAM appliances?
3. Alveolar crest alignment
47. What is the success rate of alveolar crest alignment with NAM treatment in patients with unilateral CLP?
48. What is the success rate of alveolar crest alignment with NAM treatment in patients with bilateral CLP?
49. Is the alveolar crest alignment obtained with NAM treatment more successful in patients with unilateral or bilateral CLP?
50. How close do the alveolar crests come together with NAM treatment in patients with unilateral CLP?
51. How close do the alveolar crests come together with NAM treatment in patients with bilateral CLP?
52. Do the alveolar crests converge more in patients with unilateral or bilateral CLP treated with NAM?
53. Is the alveolar crest alignment obtained with NAM treatment stable in the long term?
4. Soft tissues
54. What are the effects of NAM treatment on the soft tissue profile?
55. Are the soft tissue profile changes obtained with NAM treatment stable in the long term?
56. What are the effects of NAM treatment on the soft tissue frontal plane?
57. Are the soft tissue frontal plane changes obtained with NAM treatment stable in the long term?
58. What are the effects of NAM therapy on the nostrils?
59. Are the nostril changes obtained with NAM treatment stable in the long term?
60. What are the effects of NAM treatment on columella?
61. Are columella changes obtained with NAM treatment stable in the long term?
62. What are the effects of NAM treatment on the distance between cleft lips?
63. What are the effects of NAM treatment on alar distance?
64. What are the effects of NAM treatment on facial asymmetry?
65. What are the effects of NAM treatment on nasal asymmetry?
5. Function
66. Does NAM treatment have an impact on infants' feeding?
67. Does NAM treatment have an impact on breastfeeding?
68. Does NAM treatment have an effect on infants' pacifier use?
69. Does respiratory relief occur with NAM treatment?
70. Does NAM treatment have an effect on the eruption of teeth in babies?
71. Does NAM treatment have any effect on speech?
6. Satisfaction
72. Are the families of babies treated with NAM satisfied with the treatment outcomes?
73. Are orthodontists of babies treated with NAM satisfied with the treatment results?
74. Are orthodontists more satisfied with the outcomes of infant patients treated with NAM than those not treated?
75. Are surgeons of babies treated with NAM satisfied with the treatment outcomes?
76. Are surgeons more satisfied with the outcomes of infant patients treated with NAM than those not treated?
77. Does NAM treatment increase parental self-confidence?
78. Does NAM treatment increase the parent's self-esteem?
79. Does NAM treatment affect babies' sleep patterns?
80. Does NAM treatment affect infant feeding?
81. Does NAM treatment affect the psychological and psychosocial status of parents?
82. Does NAM treatment affect the success of surgical interventions?
7. Harms
83. Is NAM treatment associated with nasal irritation?
84. Is NAM treatment associated with irritation of the alveolar crests?
85. Is NAM treatment associated with irritation of the palate?
86. Is NAM treatment associated with irritation of the sulci?
87. Is NAM treatment associated with irritation of the frenulum attachments?
88. Is NAM treatment associated with irritation of the lips?
89. Is NAM treatment associated with irritation of the cheeks?
90. Is NAM treatment associated with redness of the cheeks?
91. Is NAM treatment associated with gingival irritation?
92. Does NAM treatment cause discomfort/irritation?
93. What are the most commonly reported side effects of NAM treatment?
94. What kind of emergencies are there in NAM treatment?
95. What measures can be taken to prevent emergencies in NAM treatment?
96. What to do in case of a possible emergency in NAM treatment?
8. Oral Hygiene
97. How to clean the mouth of babies undergoing NAM treatment?
98. How often is the inside of the mouth cleaned for babies undergoing NAM treatment?
99. How to clean the NAM appliance?
100. How often is the NAM appliance cleaned?
101. Does NAM treatment affect bad breath?
102. Does NAM treatment change the coating on the tongue?
9. Microbiological/physiological
103. Does NAM treatment change Candida albicans involvement in the oral cavity?
104. Does NAM treatment change the microbiological content of the oral cavity?
10. Efficiency/cost effectiveness
105. What is the average treatment time with NAM?
106. Does the type of CLP affect the duration of NAM treatment?
107. What is the average NAM treatment duration for unilateral CLP?
108. What is the average NAM treatment duration for bilateral CLP?
109. What is the rate of convergence of the alveolar crests in patients treated with NAM?
110. What is the speed of achieving nasal changes in NAM-treated patients?
111. Is NAM treatment successful?
112. Does the success of NAM treatment vary according to the type of CLP?
113. What are the types of CLP best treated with NAM therapy?
11. Other
114. Do parents of NAM patients have fear and anxiety before starting NAM treatment?
115. Does fear and anxiety decrease in parents of NAM patients after starting NAM treatment?
116. Do NAM-treated patients need extra bone grafts later in life compared to non-NAM-treated patients?
117. Do patients who undergo NAM treatment need extra soft tissue (lip, nose, etc.) revision surgery at older ages compared to patients who do not undergo NAM treatment?
118. Do NAM-treated patients need orthognathic surgery in the future compared to non-NAM-treated patients?
119. Does NAM treatment eliminate the need for orthognathic surgery in the future?
120. Can NAM treatment be performed with clear aligners instead of the NAM appliance?
121. Can NAM treatment be performed with 3D printed aligners instead of the NAM appliance?
122. What is the main mechanism of alveolar crest correction with NAM therapy?
123. What is the main mechanism by which cleft lips move closer together with NAM treatment?
124. What is the main mechanism of nasal changes achieved with NAM therapy?
125. Does NAM treatment continue after lip surgery?
126. Is any kind of appliance made after lip surgery for NAM-treated babies?
127. Is retention necessary after NAM treatment?
128. Which devices can be used if retention is necessary after NAM treatment?
129. Can the NAM appliance be used as a retention device?

Authors’ contributions

Ş.H.: Conceptualization, Methodology, Investigation, Resources, Writing – Review and Editing, Supervision, Project administration. E.Ç.Ö.: Investigation, Resources, Writing – Original Draft, Writing – Review and Editing, Visualization. F.A.K.T.: Validation, Investigation, Resources. Ö.Ö.Z.: Validation, Investigation, Resources.

Funding

Not applicable.

Data availability

No datasets were generated or analysed during the current study.

Declarations

Ethics approval

In this study, since only publicly available information was evaluated, ethical approval was not needed.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.


Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

