Abstract
Background
ChatGPT's potential as a diet information tool is emerging. However, little is known about the extent to which the information provided by ChatGPT aligns with that provided by dietitians.
Objective
This study aimed to assess ChatGPT's capacity to provide responses to diet-related questions, compared to responses by dietitians.
Methods
A total of 928 diet-related questions and corresponding responses from dietitians were collected from Naver Knowledge-iN, a Korean online Q&A platform, between January 18, 2023, and January 17, 2024. ChatGPT-4o was used to generate responses to the same questions. Five text similarity indices—Dice Coefficient, Jaccard Index, Overlap Coefficient, Cosine Similarity, and Term Frequency-Inverse Document Frequency—were used to assess the similarity between ChatGPT's and dietitians’ responses. Questions with the top 5% response similarity were reviewed to identify characteristics of the questions for which ChatGPT generated responses similar to those of dietitians. Responses with the bottom 5% similarity were reviewed to identify reasons for the low similarity.
Results
The average similarity coefficient between ChatGPT and dietitian responses was 0.42. Questions with high response similarity tended to include detailed information, such as specific food items or portions (76.1%), the questioner's context (69.6%), or personal characteristics (17.4%). Low response similarity was mainly due to ChatGPT providing significantly longer responses than dietitians.
Conclusions
ChatGPT's responses showed content similarity to dietitians' responses but were not identical. Developing prompt engineering techniques to enhance ChatGPT's ability to provide more expert-like and personalized information could benefit users seeking dietary information.
Keywords: Dietitian, ChatGPT, dietary information, similarity, large language model, artificial intelligence
Introduction
Diet-related questions represent a large segment of health-related queries online.1 Although nutrition professionals, such as dietitians, provide guidance through in-person consultations, telehealth services, and mobile applications, individuals often rely on online information due to its convenience and confidentiality.1–4 However, online dietary information is often low in credibility and fails to account for individuals’ unique contexts, such as complex health conditions and lifestyles.2,4–8 Furthermore, the large amount of available dietary information can be overwhelming, making it difficult for users to discern which information is applicable to their specific situation.5
Generative artificial intelligence (AI) models are emerging as important sources of health information, including dietary information. These models have the potential to provide reliable information across various healthcare fields. ChatGPT, the most widely recognized generative AI model, can deliver quality information comparable to that of human experts in various areas, including nutrition, mental health, periodontal disease, eye care, vaccines, and occupational medicine.9–18 While generative AI models can play a positive role in providing health information, there are concerns about their generation of misleading or false information (i.e., hallucination) and its potential to harm health.17,19–22
Most previous studies on health information provided by ChatGPT have focused on evaluating the quality of its responses or comparing them with existing guidelines or textbooks. Considering that even human experts sometimes deviate from textbook information, the assessment of ChatGPT's performance needs to include a direct comparison with the information provided by human experts. Furthermore, most studies assessing ChatGPT's performance have relied on well-structured questions that were created by experts for the study or frequently asked by patients.19,23–26 In real life, however, people often use non-standard words or slang, make grammatical errors, ramble, or fail to articulate their questions. In particular, individuals use the same dietary information differently depending on their food preferences, food culture, lifestyle, health status, and health goals. When answering people's dietary questions, it is therefore important to understand the individual's context and provide personalized information, and it is necessary to assess how ChatGPT responds to questions asked by real people. To our knowledge, however, only a few studies have compared the responses of human experts and ChatGPT to health questions asked by individuals in real-life contexts, such as those on online question-and-answer (Q&A) or medical consultation platforms.10,12,18,27–29 While these studies included a wide range of health-related questions, none has evaluated how ChatGPT performs on real-world diet-related questions. One study compared responses from ChatGPT and human dietitians to commonly asked nutrition questions,23 but these were not unfiltered, real-world questions.
It is also important to identify the characteristics of the questions that ChatGPT responds well to, enabling users to learn how to phrase their questions to receive better responses. Although previous studies suggested using a ChatGPT response validation tool30 and fine-tuning with cancer guidelines,31 there is a lack of research assessing the characteristics of questions that elicit high-quality answers.
This study aimed to assess ChatGPT's capacity to provide information in response to diet-related questions. We considered dietitians’ responses as a gold standard and compared the responses of human dietitians with those of ChatGPT to diet-related questions from an online Q&A forum. We specifically focused on the similarities between the responses from dietitians and ChatGPT and the characteristics of questions that resulted in human expert-like ChatGPT responses.
Methods
Data source
We collected diet-related questions posted on Naver Knowledge-iN, an online Q&A platform operated by Naver with a database of nearly 970 million questions and answers (as of December 8, 2023), more than 100 million of which were uploaded in 2023 alone.32 Naver is the leading portal site in the Republic of Korea, accounting for 58.14% of the search engine share and 65.63% of health/medical-related searches in 2024.33 On Naver Knowledge-iN, users can post questions across various fields, and anyone can provide answers. Experts who wish to contribute to the platform can submit proof of their qualification, such as a dietitian license, to Naver Knowledge-iN. Once reviewed, qualified individuals are designated as experts and listed accordingly. Responses from experts in relevant fields are distinguished from those of general users by special marks, which help questioners judge the credibility of the responses. For diet-related questions in the nutrition and weight management categories, only registered dietitians are listed as experts.
We collected diet-related questions along with responses from registered dietitians. A total of 945 diet-related questions posted between January 18, 2023, and January 17, 2024, in the nutrition or weight management categories and answered by a total of 15 dietitians were collected. Each question had one response from a dietitian. The questions were reviewed by two authors (YM, SKC), one of whom (SKC) is a registered dietitian. After excluding 17 questions that were not related to diet, 928 diet-related questions were deemed eligible for analysis. We collected the titles and contents of the questions as well as the responses from dietitians.
ChatGPT response generation
We generated ChatGPT responses corresponding to the collected questions. We input the combined text in the form of “(question title) (question content)” into ChatGPT verbatim, without any fine-tuning or additional prompt engineering, and generated responses in Korean, consistent with the source text. The ChatGPT session was reset for each question after a response was generated. We used GPT-4o because it is known to perform better on multi-language tasks and to generate more reliable responses than the previous model.34
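The generation procedure above can be sketched as follows. This is an illustration, not the authors' pipeline: the paper does not state whether the web interface or the API was used, so the client usage and function names here are assumptions based on the current OpenAI Python SDK.

```python
def build_prompt(title: str, content: str) -> str:
    """Combine a question verbatim as "(question title) (question content)"."""
    return f"{title} {content}"


def generate_response(title: str, content: str, model: str = "gpt-4o") -> str:
    """Send one question as a fresh, stateless request.

    Issuing a new single-message request per question carries no
    conversation history, mirroring the per-question session reset
    described above. Hypothetical usage: requires the `openai`
    package and an API key in the environment.
    """
    from openai import OpenAI  # imported lazily; assumed dependency

    client = OpenAI()
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": build_prompt(title, content)}],
    )
    return reply.choices[0].message.content
```

Because each call builds its own message list, no context leaks between questions, which is the behavioral equivalent of resetting the ChatGPT session.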
Analysis of responses
We analyzed the length of diet-related questions, responses from dietitians, and responses from ChatGPT, as well as the text similarity between ChatGPT's responses and those of dietitians. The length of each text was calculated in two ways: by word count and by token count. To calculate the word count, the text was divided into words based on spaces and special characters. For the token count, each text was tokenized by dividing it into morphemes with Okt, a Korean morphological analyzer.35 Unlike English, Korean is an agglutinative and ambiguous language; the meaning of a suffix can change depending on the nominal or predicate stem it attaches to.36 We therefore calculated the similarities between the token sets of dietitian responses and ChatGPT-generated responses in several ways, using five similarity metrics: Dice Coefficient (DC), Jaccard Index (JI), Overlap Coefficient (OC), Cosine Similarity (CS), and Term Frequency-Inverse Document Frequency (TF-IDF).37,38 All measures range from 0 to 1, where 1 indicates the highest and 0 the lowest similarity between two texts.
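As a minimal illustration, the five measures can be computed over token lists (produced, in the study, by Okt morpheme tokenization) roughly as below. This is a sketch rather than the authors' code; in particular, the TF-IDF variant here follows one common convention (cosine similarity of TF-IDF vectors with a smoothed IDF estimated from a supplied corpus), which the paper does not specify.

```python
import math
from collections import Counter


def dice(a, b):
    # DC = 2|A ∩ B| / (|A| + |B|), over the unique-token sets.
    A, B = set(a), set(b)
    return 2 * len(A & B) / (len(A) + len(B))


def jaccard(a, b):
    # JI = |A ∩ B| / |A ∪ B|.
    A, B = set(a), set(b)
    return len(A & B) / len(A | B)


def overlap(a, b):
    # OC = |A ∩ B| / min(|A|, |B|).
    A, B = set(a), set(b)
    return len(A & B) / min(len(A), len(B))


def cosine(a, b):
    # Cosine similarity over raw term-frequency vectors.
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb)


def tfidf_cosine(a, b, corpus):
    # Cosine similarity over TF-IDF vectors; IDF is smoothed
    # (sklearn-style) and estimated from `corpus`, a list of token lists.
    n = len(corpus)
    df = Counter()
    for doc in corpus:
        df.update(set(doc))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in df}
    va = {t: c * idf.get(t, 0.0) for t, c in Counter(a).items()}
    vb = {t: c * idf.get(t, 0.0) for t, c in Counter(b).items()}
    dot = sum(va[t] * vb.get(t, 0.0) for t in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0
```

All five functions return values in [0, 1], with 1 for identical token sets, matching the interpretation given above.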
We compared the responses from dietitians with those generated by ChatGPT using paired t-tests, considering the number of tokens and the number of words in responses. The mean, standard deviation, minimum, and maximum values of the five similarity indices were calculated. Based on the arithmetic mean of the five similarity indices, we identified questions with the top 5% (n = 46) and bottom 5% (n = 46) response similarities between dietitians and ChatGPT.
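The selection step above can be sketched as follows; the function names and data shapes are illustrative, assuming each question carries its five index values and that the extremes are taken by rank.

```python
import statistics


def mean_similarity(indices):
    # Arithmetic mean of the five similarity indices for one question,
    # e.g. {"DC": 0.43, "JI": 0.28, "OC": 0.50, "CS": 0.44, "TFIDF": 0.48}.
    return statistics.mean(indices.values())


def top_bottom_fraction(mean_scores, frac=0.05):
    # Return (top, bottom) question positions ranked by average similarity;
    # with 928 questions and frac=0.05 this yields 46 questions each.
    n = max(1, round(len(mean_scores) * frac))
    order = sorted(range(len(mean_scores)), key=mean_scores.__getitem__)
    return order[-n:], order[:n]
```

The paired t-tests on response lengths were run in SAS; the ranking itself is model-agnostic and depends only on the per-question mean scores.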
By reviewing the questions with the top 5% response similarity, we identified the characteristics of the questions to which ChatGPT generated responses similar to those of dietitians. SKC initially reviewed the questions and developed an initial categorization of their characteristics. All authors then reviewed how the categories were constructed and whether more representative features could be used. After all authors discussed and agreed on the categorization, the characteristics of the high similarity questions were categorized as follows: questions about specific branded products; questions about the calorie or nutrient content of food items; questions about the interpretation of nutrition facts; questions about nutritional terminologies or concepts; provision of information on specific food items or portions; the questioner's context (e.g., when to eat, the purpose of dietary management); the questioner's personal characteristics; or the questioner's physical activity level. One author (SKC) then coded the high similarity questions according to their characteristics; each question was coded with at least one characteristic. After the coding was completed, all authors reviewed the coding results and discussed whether the coding was consistent with the agreed-upon categorization. We also reviewed the content of responses with the bottom 5% similarity between dietitians and ChatGPT to identify reasons for the low similarity, using the same review and coding process as for the high-similarity questions.
All statistical analyses were performed using Statistical Analysis System (SAS) software (ver. 9.4; SAS Institute, Cary, NC, USA). Statistical significance was set at P < .05. This study was reviewed and deemed to be exempt by the University of Seoul Institutional Review Board.
Results
Length of the questions and similarities in responses between dietitians and ChatGPT
For the 928 eligible questions, the average similarity coefficient between responses by dietitians and ChatGPT was 0.42. The highest coefficient was OC (mean = 0.50), and the lowest was JI (mean = 0.28) (Table 1). Examples of the questions with the highest and lowest response similarity are illustrated in Supplemental Material 1.
Table 1.
Length of the questions reviewed and similarities in responses between dietitians and ChatGPT.
| | Mean ± SD | Range |
|---|---|---|
| Similarity index | | |
| Dice coefficient | 0.43 ± 0.06 | 0.14–0.67 |
| Jaccard index | 0.28 ± 0.05 | 0.08–0.51 |
| Overlap coefficient | 0.50 ± 0.09 | 0.25–0.86 |
| Cosine similarity | 0.44 ± 0.06 | 0.25–0.67 |
| Term frequency-inverse document frequency | 0.48 ± 0.13 | 0.12–0.89 |
| Average similarities of the five indices | 0.42 ± 0.07 | 0.05–0.69 |
| Length of the questions reviewed | | |
| Number of tokens | 60.35 ± 54.05 | 5–410 |
| Number of words | 37.62 ± 34.28 | 2–264 |
For the eligible questions, the average number of tokens and words per question was 60.35 (range: 5–410) and 37.62 (range: 2–264), respectively. The average number of tokens and words did not differ significantly between questions with the top 5% and the bottom 5% response similarity (high similarity: 43.91 tokens, 26.65 words; low similarity: 40.48 tokens, 25.04 words), suggesting that question length is not associated with ChatGPT's performance (data not shown).
Comparison of the length of responses between dietitians and ChatGPT
When comparing the length of the responses from dietitians with those generated by ChatGPT, both the mean number of tokens and word counts of responses were significantly higher for ChatGPT (824.50 tokens, 192.90 words) than for dietitians (623.73 tokens, 138.48 words) (P < .001) (Table 2). In some cases, however, ChatGPT had shorter responses than dietitians. One of the examples was responses to the following question:
I know that either Kamut enzyme or Bergamot can help with dieting, so if I had to choose, which one should I choose? For those who are familiar with Kamut enzyme or Bergamot, please just say Kamut enzyme if it's Kamut or Bergamot if it's Bergamot.
Table 2.
Comparison of the number of tokens and words in responses between dietitians and ChatGPT.
| | Dietitian | | ChatGPT | | P |
|---|---|---|---|---|---|
| | Mean ± SD | Range | Mean ± SD | Range | |
| Total (n = 928) | | | | | |
| Tokens | 623.73 ± 272.13 | 21–1531 | 824.50 ± 274.50 | 6–2422 | <0.001 |
| Words | 138.48 ± 61.33 | 5–328 | 192.90 ± 65.64 | 1–593 | <0.001 |
| High similarity (n = 46) | | | | | |
| Tokens | 599.76 ± 259.89 | 138–1134 | 701.28 ± 278.26 | 300–1293 | <0.001 |
| Words | 130.72 ± 58.72 | 26–245 | 159.59 ± 62.84 | 64–290 | <0.001 |
| Low similarity (n = 46) | | | | | |
| Tokens | 279.28 ± 192.47 | 21–881 | 711.22 ± 289.51 | 6–1224 | <0.001 |
| Words | 61.76 ± 43.30 | 5–193 | 164.80 ± 70.16 | 1–298 | <0.001 |
The response from a dietitian was 113 words, including an explanation of how bergamot helps with weight loss, while ChatGPT responded in one word, “Bergamot,” as requested (see Supplemental Material 2).
Characteristics of questions with high or low response similarity
Characteristics of 46 questions that had the top 5% response similarity between dietitians and ChatGPT are presented in Table 3. Most questions with high response similarity included detailed information. For example, questions included information on specific food items or portions (76.1%), the questioner's context (69.6%), or personal characteristics (17.4%). A good example of a question that ChatGPT provided a response highly similar to that of a dietitian is one from a 27-year-old woman. She provided detailed information about her basal metabolic rate, anthropometric measurements, and usual eating pattern:
My basal metabolic rate is usually around 1400 kcal, and on days when I work out, it's around 2500 kcal. I'm 27 years old, 168 cm/69 kg. I'm trying to lose weight and get down to my ideal weight of 62 kg, so I’ve been eating less. I usually eat about half of a regular meal for lunch, eggs and bananas for dinner, and no snacks at all! I feel like I’m eating less than 1400 kcal a day, so I have some questions about that!
Table 3.
Characteristics of questions with the top 5% response similarity between dietitians and ChatGPT (n = 46).
| Characteristics a | N | % |
|---|---|---|
| Providing information on specific foods and portions | 35 | 76.1 |
| Providing information on the questioner's context | 32 | 69.6 |
| Asking for calorie/nutrient content of foods or meals | 12 | 26.1 |
| Providing information on the questioner's personal characteristics | 6 | 17.4 |
| Asking about specific branded products | 5 | 13.0 |
| Providing information on the questioner's physical activity level | 4 | 8.7 |
| Asking about the interpretation of nutrition facts | 3 | 6.5 |
| Asking about nutritional terminologies or concepts | 2 | 4.4 |
A question can be coded with multiple characteristics.
She then asked four questions about her basal metabolic rate and food choices for losing weight. Since she provided detailed information upfront, both ChatGPT and a dietitian provided similar responses to all four questions, tailored to her situation. Questions that asked for numerical information or could be answered with a scientific rationale, such as asking for the calorie or nutrient content of food items or meals (26.1%), the interpretation of nutrition facts (6.5%), or nutritional terminologies or concepts (4.4%), also had higher response similarity.
We reviewed the content of responses with the bottom 5% similarity between dietitians and ChatGPT (n = 46). The primary reason for the low response similarity was the markedly different response lengths between dietitians and ChatGPT. While the content of the low-similarity responses did not differ between dietitians and ChatGPT, ChatGPT tended to provide longer responses that included more general explanations than those provided by dietitians. About 30.4% of cases (n = 14) were attributable to problematic responses by dietitians (n = 11) or ChatGPT (n = 4), with one case falling under both categories. Seven of the 15 dietitians who responded to diet-related questions on Naver Knowledge-iN were involved in problematic responses. Of the 11 problematic responses by dietitians, five included only general information, such as “If you exercise, it would be better,” so the questioner might not receive the answers they wanted. Other reasons for problematic responses by dietitians included misunderstanding the question (n = 4) or addressing only part of the question (n = 2). ChatGPT also generated problematic responses, including providing information that might not be applicable to all individuals (n = 1), misunderstanding the question (n = 1), and addressing only part of the question (n = 1). Notably, one response by ChatGPT was evidently incorrect. The question asked about the recommended amount of carbohydrate intake while dieting. ChatGPT described several diets, including extremely low-carbohydrate diets of 20–30 g/day. While a prolonged low-carbohydrate diet may lead to adverse health outcomes, such as increased low-density lipoprotein cholesterol, symptoms of ketosis, impaired glucose metabolism, psychological discomfort, and even higher mortality,39–41 ChatGPT did not recommend restricting such diets to a short duration. We therefore considered the response inaccurate.
Discussion
We investigated the similarity between responses by dietitians and ChatGPT to diet-related questions on a public Q&A platform. Overall, text similarity between dietitians’ and ChatGPT's responses was not high; however, the content of the two was mostly similar. For questions with detailed information, ChatGPT generated responses similar to those of dietitians. ChatGPT tended to generate longer responses than dietitians, which contributed to the low similarities between the two responses.
The average text similarity between dietitians’ and ChatGPT's responses, measured using various similarity indices, was 0.42. This indicates that even if the two responses shared common keywords and topics, they were not identical. However, upon reviewing all responses, we found that the main content from dietitians and ChatGPT was mostly similar, with both including key concepts or expressions used by the questioner. Differences were primarily due to variations in response length or additional elements, such as greetings, introductions, or more detailed explanations, rather than differences in the core content. Since we used ChatGPT in a zero-shot setting and the core content was nearly identical, fine-tuning GPT on dietitians’ responses with similar structures would be expected to increase the similarity score. We therefore concluded that ChatGPT is capable of providing expert-like responses to diet-related questions, although its response capacity varied depending on how the question was asked.
ChatGPT is more likely to generate responses similar to those of human experts when answering questions that include detailed information, such as the questioner's characteristics and context, specific food items, or portion sizes. Human dietitians, with diverse experiences, can comprehend the context of a question and provide appropriate responses. ChatGPT, lacking lived experience, requires more explicit information to generate accurate responses.17 These results imply that including comprehensive information in diet-related questions is crucial for obtaining expert-level responses from ChatGPT. A multitude of prompt engineering techniques can be used to instruct ChatGPT on how to respond, and any of these techniques can generate better responses than providing no instruction at all.42 In particular, the quality of responses to health questions depends on the input prompt.43 ChatGPT users may not know which information to include in a question to get good responses, or how to employ prompt engineering. In addition, they may not attempt to use a different prompt to obtain more appropriate information, similar to the tendency of many Internet users not to refine their search queries or seek additional information.44 Consequently, they may encounter incorrect or inadequate information rather than the information they need. This issue is even more critical for individuals with medical conditions that require careful dietary management. In this context, developing question guides or prompt engineering techniques that help users formulate relevant queries can help ensure the quality and accuracy of the information they receive. Similarly, ChatGPT could present simple prompts that help users obtain responses suited to their needs; such user-friendly prompt engineering would allow ChatGPT to provide tailored responses even for non-experts. For example, if ChatGPT asked users posing diet-related questions to set their context, such as food preferences, health conditions, grocery stores near home, and the purpose of diet management, it could provide more user-specific information without requiring users to know what to include in the prompt to obtain good responses.
ChatGPT tended to generate lengthy responses that included general explanations about food items, nutrition, and health rather than content specific to the questions. Providing more information is not always better: some individuals may prefer comprehensive information, while others may only want answers to their specific queries. When people encounter general information, they may be unsure whether it is relevant to their situation, especially if their health condition is complex or the information is extensive. Furthermore, individuals with limited health literacy experience a greater cognitive burden and often require more time to process health information than those with high health literacy.45 Excessive information from ChatGPT may cause information overload, exceeding their capacity for effective processing.45,46 Generative AI models will need the capacity to distinguish between need-to-know and good-to-know information and to provide an appropriate amount of information based on users’ preferences.
In light of ChatGPT's good performance in this study, it can serve as a valuable resource for individuals seeking diet-related information, as well as for human experts who need assistance drafting responses to patients’ inquiries or educational materials.11,47 However, we cannot guarantee that ChatGPT's responses are free of misinformation or hallucinations, since we did not review the content of all responses. When we reviewed 10% of the ChatGPT responses, including those with the highest or lowest 5% similarity to dietitians, only one response, which was among those with the lowest response similarity, provided inaccurate information. While the number of responses containing misinformation was small, inaccurate dietary information could negatively affect people's health, especially that of children, pregnant women, older adults, and individuals with diseases. In addition, as ChatGPT tends to provide general information rather than specific details, its responses may lack the nuance needed to account for variations in the nutrient content and health effects of the same food items depending on their ingredients and cooking methods. In particular, it may generate information appropriate for the general population but inappropriate for individuals with health issues. Unlike licensed healthcare professionals, AI tools are not held accountable for harm, which may make users more vulnerable to potentially harmful information.48 In addition, the storage of user data by AI tools may raise privacy concerns, such as the risk of data leakage or misuse.48,49 Previous studies have also pointed out the limitations of using AI tools as a standalone solution for nutrition management, emphasizing the necessity of integrating them with human expertise.20,47,49–53 Therefore, we cannot advocate substituting ChatGPT for human dietitians as a primary source of diet-related information.
In an era of rapid advancement in generative AI technologies, it is vital for dietitians to continuously enhance their professional competencies. While this study underscores the pivotal role of human expertise, it also identifies instances, although limited, where human dietitians provided inaccurate information. Professionals who lack sufficient competence risk jeopardizing public trust. Consequently, ongoing efforts by human dietitians to remain competitive and adapt alongside technological innovations are essential for sustaining their relevance and credibility in the evolving healthcare landscape.47,49
This study has several limitations. First, we only compared the text similarity and length of responses between dietitians and ChatGPT and did not consider other characteristics of the responses, such as expressions of empathy or emotional encouragement. Although text similarity indices indicated that both responses were generally similar, there may be subtle differences in emotional expression and nuance. While several studies demonstrated that ChatGPT can exhibit empathy,10,12,18,27 AI-generated empathetic responses lack the authentic emotional resonance of human empathy.20,54 This limitation highlights the need for future research examining how differences in emotional expression in dietary information, whether provided by human experts or generative AI services, affect users’ perceptions and intentions to adopt healthy dietary practices. Second, we did not conduct a qualitative evaluation of the response contents involving multiple experts. Additionally, as the questions and responses in this study were written in Korean, we were unable to evaluate ChatGPT's capacity to respond in other languages. Since ChatGPT's training data consist primarily of English-language sources, it may not respond well to questions about foods unique to Korea or traditional Korean cuisine, which could lead to differences between the responses provided by dietitians and those generated by ChatGPT. However, we believe that the influence of cultural or language differences on the study results is limited. We used ChatGPT-4o, which has improved capacity for responding to inquiries in multiple languages compared to previous models. OpenAI, the developer of ChatGPT, has reported that ChatGPT-4 demonstrates better performance on questions written in Korean than ChatGPT-3 does on questions written in English.55 Moreover, previous studies have reported that while ChatGPT performs best in English, it also demonstrates good performance in non-English languages.56,57
Despite these limitations, this study has notable strengths. To test ChatGPT's capacity in a natural setting, we used the questions asked by users online in their original form, whereas most studies evaluating ChatGPT's capacity to respond to diet- or health-related questions have used researcher-generated questions.19,23,25,26,47 Furthermore, we compared ChatGPT's responses with real-world responses from dietitians, rather than with written guidelines or textbooks.
Conclusions
ChatGPT seems to be useful to obtain general diet-related information; however, there is still room for improvement. Further research is required to develop prompt engineering for obtaining high-quality diet-related information, conducting effective personalized nutritional assessments, and managing diets using ChatGPT.
Supplemental Material
Supplemental material, sj-docx-1-dhj-10.1177_20552076251361381 for ChatGPT and human dietitian responses to diet-related questions on an online Q&A platform: A comparative study by Seul Ki Choi, Yunseo Moon and Hyunggu Jung in DIGITAL HEALTH
Footnotes
ORCID iDs: Seul Ki Choi https://orcid.org/0000-0002-3330-3652
Yunseo Moon https://orcid.org/0009-0009-2483-0547
Hyunggu Jung https://orcid.org/0000-0002-2967-4370
Ethical approval: The study was reviewed by the University of Seoul Institutional Review Board and received an exemption because it used de-identified and publicly available data.
Contributorship: SKC conceptualized the study, analyzed the data, interpreted the results, and drafted the manuscript. YM conducted data collection and analysis, interpreted the results, and drafted the manuscript. HJ conceptualized the study, interpreted the results, reviewed the manuscript critically, and supervised the study. All authors read and approved the final manuscript.
Funding: This work was supported by the New Faculty Startup Fund from Seoul National University. This work was also supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (No. RS-2024-00407105).
Declaration of conflicting interest: The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement: The datasets generated for this study are available from the corresponding author on reasonable request.
Supplemental material: Supplemental material for this article is available online.
References
- 1. Jia X, Pang Y, Liu LS. Online health information seeking behavior: a systematic review. Healthcare 2021; 9: 1740.
- 2. Brunton C, Arensberg MB, Drawert S, et al. Perspectives of registered dietitian nutritionists on adoption of telehealth for nutrition care during the COVID-19 pandemic. Healthcare 2021; 9: 235.
- 3. Powell J, Inglis N, Ronnie J, et al. The characteristics and motivations of online health information seekers: cross-sectional survey and qualitative interview study. J Med Internet Res 2011; 13: e20.
- 4. Ruani MA, Reiss MJ, Kalea AZ. Diet-nutrition information seeking, source trustworthiness, and eating behavior changes: an international web-based survey. Nutrients 2023; 15: 4515.
- 5. Ramondt S, Ramírez AS. Assessing the impact of the public nutrition information environment: adapting the cancer information overload scale to measure diet information overload. Patient Educ Counsel 2019; 102: 37–42.
- 6. Adamski M, Truby H, Klassen KM, et al. Using the internet: nutrition information-seeking behaviours of lay people enrolled in a massive online nutrition course. Nutrients 2020; 12: 750.
- 7. Denniss E, Lindberg R, McNaughton SA. Quality and accuracy of online nutrition-related information: a systematic review of content analysis studies. Public Health Nutr 2023; 26: 1345–1357.
- 8. Metzger MJ. Making sense of credibility on the web: models for evaluating online information and recommendations for future research. J Am Soc Inf Sci Technol 2007; 58: 2078–2091.
- 9. Alan R, Alan BM. Utilizing ChatGPT-4 for providing information on periodontal disease to patients: a DISCERN quality analysis. Cureus 2023; 15: e46213.
- 10. Ayers JW, Poliak A, Dredze M, et al. Comparing physician and artificial intelligence chatbot responses to patient questions posted to a public social media forum. JAMA Intern Med 2023; 183: 589–596.
- 11. Ayers JW, Zhu Z, Poliak A, et al. Evaluating artificial intelligence responses to public health questions. JAMA Netw Open 2023; 6: e2317517.
- 12. Bernstein IA, Zhang YV, Govil D, et al. Comparison of ophthalmologist and large language model chatbot responses to online patient eye care questions. JAMA Netw Open 2023; 6: e2330320.
- 13. Deiana G, Dettori M, Arghittu A, et al. Artificial intelligence and public health: evaluating ChatGPT responses to vaccination myths and misconceptions. Vaccines (Basel) 2023; 11: 1217.
- 14. Goodman RS, Patrinely JR, Stone CA, et al. Accuracy and reliability of chatbot responses to physician questions. JAMA Netw Open 2023; 6: e2336483.
- 15. Lee T-C, Staller K, Botoman V, et al. ChatGPT answers common patient questions about colonoscopy. Gastroenterology 2023; 165: 509–511.e7.
- 16. Padovan M, Cosci B, Petillo A, et al. ChatGPT in occupational medicine: a comparative study with human experts. Bioengineering 2024; 11: 57.
- 17. Spallek S, Birrell L, Kershaw S, et al. Can we use ChatGPT for mental health and substance use education? Examining its quality and potential harms. JMIR Med Educ 2023; 9: e51243.
- 18. Yalamanchili A, Sengupta B, Song J, et al. Quality of large language model responses to radiation oncology patient care questions. JAMA Netw Open 2024; 7: e244630.
- 19. Sng GGR, Tung JYM, Lim DYZ, et al. Potential and pitfalls of ChatGPT and natural-language artificial intelligence models for diabetes education. Diabetes Care 2023; 46: e103–e105.
- 20. Arslan S. Exploring the potential of Chat GPT in personalized obesity treatment. Ann Biomed Eng 2023; 51: 1887–1888.
- 21. Bayram HCM, Ozturkcan A. AI showdown: info accuracy on protein quality content in foods from ChatGPT 3.5, ChatGPT 4, Bard AI and Bing Chat. Br Food J 2024; 126: 3335–3346.
- 22. Liao L-L, Chang L-C, Lai I-J. Assessing the quality of ChatGPT's dietary advice for college students from dietitians' perspectives. Nutrients 2024; 16: 1939.
- 23. Kirk D, van Eijnatten E, Camps G. Comparison of answers between ChatGPT and human dieticians to common nutrition questions. J Nutr Metab 2023; 2023: 5548684.
- 24. Niszczota P, Rybicka I. The credibility of dietary advice formulated by ChatGPT: robo-diets for people with food allergies. Nutrition 2023; 112: 112076.
- 25. Ponzo V, Goitre I, Favaro E, et al. Is ChatGPT an effective tool for providing dietary advice? Nutrients 2024; 16: 469.
- 26. Gibson D, Jackson S, Shanmugasundaram R, et al. Evaluating the efficacy of ChatGPT as a patient education tool in prostate cancer: multimetric assessment. J Med Internet Res 2024; 26: e55939.
- 27. Biswas MR, Islam A, Shah Z, et al. Can ChatGPT be your personal medical assistant? In: Proceedings of the 2023 10th International Conference on Social Networks Analysis, Management and Security. IEEE, 2023, pp. 1–5.
- 28. Xue Z, Zhang Y, Gan W, et al. Quality and dependability of ChatGPT and DingXiangYuan forums for remote orthopedic consultations: comparative analysis. J Med Internet Res 2024; 26: e50882.
- 29. He W, Zhang W, Jin Y, et al. Physician versus large language model chatbot responses to web-based questions from autistic patients in Chinese: cross-sectional comparative analysis. J Med Internet Res 2024; 26: e54706.
- 30. Vaira LA, Lechien JR, Abbate V, et al. Validation of the Quality Analysis of Medical Artificial Intelligence (QAMAI) tool: a new tool to assess the quality of health information provided by AI platforms. Eur Arch Otorhinolaryngol 2024; 281: 6123–6131.
- 31. Lee J-W, Yoo I-S, Kim J-H, et al. Development of AI-generated medical responses using the ChatGPT for cancer patients. Comput Methods Programs Biomed 2024; 254: 108302.
- 32. Yoo JH. "I felt so relieved after sharing my worries"… A place where troubled teens and young adults are gathered. Hankyung, 9 January 2023.
- 33. InternetTrend™. Web traffic statistics, http://www.internettrend.co.kr/trendForward.tsp (accessed 10 February 2025).
- 34. Islam R, Moushi OM. GPT-4o: the cutting-edge advancement in multimodal LLM. TechRxiv 2024. DOI: 10.36227/techrxiv.171986596.65533294/v1.
- 35. Moon S, Cho WI, Han HJ, et al. OpenKorPOS: democratizing Korean tokenization with voting-based open corpus annotation. In: Proceedings of the 13th Language Resources and Evaluation Conference, 2022, pp. 4975–4983.
- 36. Sohn H-M. The Korean language. Cambridge: Cambridge University Press, 2001.
- 37. Nikolentzos G, Meladianos P, Rousseau F, et al. Shortest-path graph kernels for document similarity. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017, pp. 1890–1900.
- 38. Seegmiller B, Papanikolaou D, Schmidt LD. Measuring document similarity with weighted averages of word embeddings. Explor Econ Hist 2023; 87: 101494.
- 39. Seidelmann SB, Claggett B, Cheng S, et al. Dietary carbohydrate intake and mortality: a prospective cohort study and meta-analysis. Lancet Public Health 2018; 3: e419–e428.
- 40. Popiolek-Kalisz J. Ketogenic diet and cardiovascular risk – state of the art review. Curr Probl Cardiol 2024; 49: 102402.
- 41. Brinkworth GD, Buckley JD, Noakes M, et al. Long-term effects of a very low-carbohydrate diet and a low-fat diet on mood and cognitive function. Arch Intern Med 2009; 169: 1873–1880.
- 42. Liu M, Okuhara T, Chang X, et al. Performance of ChatGPT across different versions in medical licensing examinations worldwide: systematic review and meta-analysis. J Med Internet Res 2024; 26: e60807.
- 43. Fernández-Pichel M, Pichel JC, Losada DE. Evaluating search engines and large language models for answering health questions. NPJ Digit Med 2025; 8: 153.
- 44. Dean B. How people use Google search (new user behavior study), https://backlinko.com/google-user-behavior (2024, accessed 7 February 2025).
- 45. Meppelink CS, Smit EG, Diviani N, et al. Health literacy and online health information processing: unraveling the underlying mechanisms. J Health Commun 2016; 21: 109–120.
- 46. Sweller J. Cognitive load during problem solving: effects on learning. Cogn Sci 1988; 12: 257–285.
- 47. Chatelan A, Clerc A, Fonta P-A. ChatGPT and future artificial intelligence chatbots: what may be the influence on credentialed nutrition and dietetics practitioners? J Acad Nutr Diet 2023; 123: 1525–1531.
- 48. Wang C, Liu S, Yang H, et al. Ethical considerations of using ChatGPT in health care. J Med Internet Res 2023; 25: e48009.
- 49. Arslan S. Decoding dietary myths: the role of ChatGPT in modern nutrition. Clin Nutr ESPEN 2024; 60: 285–288.
- 50. Arslan S. ChatGPT is no nutrition encyclopedia, but does it need to be? Clin Nutr ESPEN 2025; 66: 213–214.
- 51. Bayram HM, Çelik ZM, Barcın Güzeldere HK. Can artificial intelligence (AI) chatbot tools be used effectively for nutritional management in obesity? Nutr Health. Epub ahead of print 20 March 2025. DOI: 10.1177/02601060251329070.
- 52. Papastratis I, Stergioulas A, Konstantinidis D, et al. Can ChatGPT provide appropriate meal plans for NCD patients? Nutrition 2024; 121: 112291.
- 53. Podszun M, Hieronimus B. Can ChatGPT generate energy, macro- and micro-nutrient sufficient meal plans for different dietary patterns? 2023.
- 54. Montemayor C, Halpern J, Fairweather A. In principle obstacles for empathic AI: why we can't replace human empathy in healthcare. AI Soc 2022; 37: 1353–1359.
- 55. OpenAI. GPT-4 technical report. 2023.
- 56. Fang C, Wu Y, Fu W, et al. How does ChatGPT-4 preform on non-English national medical licensing examination? An evaluation in Chinese language. PLoS Digit Health 2023; 2: e0000397.
- 57. Gimeno A, Krause K, D'Souza S, et al. Completeness and readability of GPT-4-generated multilingual discharge instructions in the pediatric emergency department. JAMIA Open 2024; 7: ooae050.
Supplementary Materials
Supplemental material, sj-docx-1-dhj-10.1177_20552076251361381 for ChatGPT and human dietitian responses to diet-related questions on an online Q&A platform: A comparative study by Seul Ki Choi, Yunseo Moon and Hyunggu Jung in DIGITAL HEALTH
