Editorial
Medical Science Educator. 2024 May 28;34(5):1179–1184. doi: 10.1007/s40670-024-02078-w

Enhancing English Presentation Skills with Generative Artificial Intelligence: A Guide for Non-native Researchers

Takanobu Hirosawa 1,✉,#, Taro Shimizu 1,#
PMCID: PMC11496412  PMID: 39450042

Abstract

This commentary explores the use of generative artificial intelligence (AI), particularly Google Gemini (previously Bard), to enhance English presentation skills among non-native researchers. We present a step-by-step methodology for using Google Gemini’s Speech-to-Text and Text-to-Speech features. Our findings suggest that Google Gemini effectively aids in drafting presentations, practicing pronunciation, and verifying content, tapping into an area that is often unexplored: the use of AI for presentation skills in scientific research. Despite its potential, users must exercise caution due to the experimental nature of this AI technology. Adapting to such technologies is timely and beneficial for the global scientific community.

Keywords: Artificial intelligence, Natural language processing, Communication, Learning

Background

Why are English Presentations Difficult?

English proficiency is increasingly recognized as a critical factor for academic and professional success, especially in the scientific community. This is particularly true in fields like medicine [1] and science [2], where English serves as the lingua franca for research dissemination, international collaboration, and education. In Japan, the medical system is intricately connected with English-speaking countries through various channels. English is the primary language for most of the leading medical journals, international conferences, and professional exchanges. Consequently, Japanese medical professionals and students are often required to access, interpret, and contribute to English-language resources to stay abreast of the latest research, clinical practices, and innovations in their field. This requirement underscores the necessity of effective English communication skills, not only for passive comprehension but also for active engagement, such as presenting at international conferences, publishing in peer-reviewed journals, and collaborating with international colleagues.

For non-native English speakers, mastering the nuances of English presentation skills—such as pronunciation, vocabulary selection, and the clarity of articulation—presents a significant challenge [3]. These challenges are compounded in countries like Japan, where educational systems traditionally emphasize passive language skills, reading and listening, over active skills, speaking and writing [4]. This imbalance hinders the development of effective presentation skills, essential for communicating scientific findings and participating in the global scientific community.

What is Artificial Intelligence (AI) and Generative AI?

Artificial intelligence (AI) is a field of computer science that aims to create machines capable of mimicking human intelligence. This endeavor encompasses a broad spectrum of technologies, such as generative AI, natural language processing, large language models, and more. Among these, generative AI is a subset of AI that focuses on autonomously creating digital content such as text, images, and video. Recently, the use of generative AI technologies has increased, partly due to the enhanced performance of large language models. Additionally, the ease of access to generative AI, which requires no special knowledge or equipment, has contributed to its widespread use. As a result, almost all fields, including science [5] and medicine [6, 7], are being affected by the introduction of generative AI tools such as Google Gemini (previously Bard) and ChatGPT.

What is Google Gemini?

Google Gemini, an AI tool developed by Google, exemplifies this trend. Google Gemini utilizes a large language model (LLM) [8]. Google Gemini was initially released as an experimental version, named Google Bard, and has since received regular updates to enhance functionality and user experience. Available for free to users logged into a Google account, it also offers an upgraded version, Google Gemini Advanced, through a monthly subscription. Notably, Google Gemini offers unique features, including voice input and text-to-speech capability, which can be highly beneficial for improving researchers’ presentation skills.

Compared to Traditional Methods

There has been significant demand for research into language learning methods. Compared to traditional language learning methods, such as speech recognition technology [9], generative AI excels in several key areas critical for effective language learning. For example, interactive feedback, adaptation to the user’s proficiency level, comprehensive language engagement, content generation, simulation, accessibility, and flexibility are key features of generative AI. Additionally, some generative AI tools integrate speech-to-text and text-to-speech capabilities, as well as draft refinement, all in one tool. However, apart from several case reports in educational fields [10, 11], the literature on the use of generative AI for improving presentation skills, particularly regarding pronunciation and the clarity of English presentations, is limited.

This gap in the literature raises several questions about the effectiveness and practicality of using generative AI tools for language learning and presentation skills enhancement. For instance, how accurate are the pronunciations provided by these AI tools, especially for words of non-English origin? Can non-native speakers easily recognize and correct inaccuracies or “hallucinations,” that is, errors and misinformation generated by the AI [12]? Moreover, the use of AI for language learning beyond traditional contexts, for example in medicine or science, remains underexplored. Nevertheless, generative AI, such as Google Gemini, has the potential to help researchers become more proficient English speakers. While several studies have explored the use of generative AI for enhancing scientific writing [13–17], there has been limited research on leveraging generative AI for scientific speaking, specifically for presentations at conferences or for case presentations. This commentary delves into the potential of generative AI, specifically Google Gemini, as an innovative tool to assist non-native English-speaking researchers in refining their English presentation skills. Additionally, this article serves as a foundational guide for a new era of enhancing skills through generative AI.

Activity

How to Use Google Gemini to Practice Your English Presentation

Here is a practical, step-by-step guide to using Google Gemini to improve your English presentation:

  1. Prepare Your Draft

Make sure your draft does not contain any sensitive or confidential information. If you have concerns about Google Gemini storing your input, disable the Gemini Activity feature. Always consult your co-researchers and supervisor regarding privacy and confidentiality issues.

  2. Polish Your Draft with Google Gemini

  • Speech-to-Text: Click the microphone button and use voice input (Fig. 1, arrow). If this is your first time using voice input, grant microphone access for Google Gemini. If the speech-to-text capability does not transcribe your draft accurately, consider it an opportunity to improve your pronunciation or choose words that are easier for you to articulate.

Fig. 1. Example of input draft through speech-to-text capability. The arrow indicates the microphone button.

  • Elaborate Your Draft with Google Gemini: Enter the following prompt before your draft content: “You are an English speaker. Elaborate the following English presentation draft: (your draft).” If you want Google Gemini simply to read the presentation back without elaboration, change the prompt to “Read back the following sentences: (copy and paste the draft)” and listen to the result. Review the output, and paraphrase it where necessary, to ensure relevance and clarity.

  • Text-to-Speech: After generating output from Google Gemini, click the speaker button to listen to the elaborated draft (Fig. 2, arrow). This step allows you to learn correct pronunciation. When using Google Gemini for English practice, ensure your prompt is solely in English. The text-to-speech capability does not support switching between multiple languages in one session. For example, if you start the session in English, ensure the entire content remains in English.

Fig. 2. Example of generating output from Google Bard (currently Gemini). The arrow indicates the speaker button.

  3. Verify Output with Reliable Sources

Keep in mind that all generative AI, including Google Gemini, may generate “hallucinations” [12]. While LLMs can predict subsequent words in a sequence, the models lack understanding of the truthfulness or depth of the content. In the context of scientific research and medicine, generative AI lacks an understanding of the basic mechanisms and underlying pathophysiology, respectively. Therefore, it is crucial for the user to cross-check the output with reliable sources.

Afterward, repeat steps 2 and 3 to practice speaking your draft, further elaborate on it, and listen to the improvements via Google Gemini.
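The two prompts in step 2 can be kept as reusable templates so that the wording stays identical from one practice round to the next. The following is a minimal sketch in Python, not part of the original workflow: the prompt wording is quoted from the guide above, while the names `ELABORATE_TEMPLATE`, `READ_BACK_TEMPLATE`, and `build_prompt` are hypothetical. The sketch only prepares text to paste into the Google Gemini chat box; it does not call any Google service.

```python
# Prompt wording is quoted from the guide; all names here are hypothetical.
ELABORATE_TEMPLATE = (
    "You are an English speaker. "
    "Elaborate the following English presentation draft: {draft}"
)
READ_BACK_TEMPLATE = "Read back the following sentences: {draft}"


def build_prompt(draft: str, elaborate: bool = True) -> str:
    """Return the text to paste into the chat box for one practice round."""
    cleaned = " ".join(draft.split())  # collapse line breaks and repeated spaces
    if not cleaned:
        raise ValueError("draft is empty")
    template = ELABORATE_TEMPLATE if elaborate else READ_BACK_TEMPLATE
    return template.format(draft=cleaned)


if __name__ == "__main__":
    draft = "Artificial intelligence chatbot is anticipated to assist with medical diagnosis."
    print(build_prompt(draft))                   # elaboration round
    print(build_prompt(draft, elaborate=False))  # read-back round
```

Calling `build_prompt(draft, elaborate=False)` produces the read-back prompt, which is useful when the draft is already final and only the pronunciation needs practice.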

Results

Using Google Gemini for Presentation Enhancement

We provide examples of conference presentations and case presentations.

Example 1: Conference Presentation

Below is a snippet from an introduction section of a presentation about AI’s diagnostic accuracy. We will demonstrate how to refine this content using Google Gemini for enhanced presentation quality. We accessed Google Bard (currently Gemini) on March 9, 2024.

  1. Prepare Your Draft

Input: The initial draft emphasized the potential of AI in medical diagnosis, starting, “Artificial intelligence chatbot is anticipated to assist with medical diagnosis. For diagnostic excellence, it is essential not only to generate differential-diagnosis lists but also to evaluate whether the final diagnosis is included in these lists for proper feedback. Moreover, it necessitates a consideration of various factors that may differ among patients, including symptoms and medical history, to ensure the accuracy and relevance of the diagnosis.”

  2. Polish Your Draft with Google Gemini

  • Speech-to-Text: After pushing the microphone button, speak your draft into the microphone (Fig. 1).

  • Elaborate Your Draft with Google Gemini

    Output: AI in Medical Diagnosis: Partnering for Accuracy

    AI chatbots are poised to transform medical diagnosis. But for true excellence, we need more than just a list of potential diagnoses. The doctor’s final diagnosis should be included in that list, allowing for feedback and continuous AI improvement. Additionally, AI must consider individual variations in symptoms and medical history to ensure diagnoses are accurate and relevant. By focusing on these aspects, AI can become a powerful partner in delivering the best possible care.

    Elaborated Points:

    Context Enhancement: The output provided more context with a focused header to immediately convey the main topic.

    Clarification: It clarified whose diagnosis needs to be included (the doctor’s) and underscored the importance of incorporating continuous AI feedback.

    Modification for Clarity: Sentences were restructured to be more easily pronounced and understood by non-native English speakers.

  • Text-to-Speech: After generating output from Google Gemini, click the speaker button to listen to the elaborated draft (Fig. 2).

  3. Verify Output with Reliable Sources

Example 2: Case Presentation

We present another example focused on a case presentation. When presenting a clinical case, greater attention to privacy issues is required than in general conference presentations [18]. The case is that of a 60-year-old female who presented with fatigue and fever. The date of access was the same as in the previous example.

  1. Prepare Your Draft

Input: The draft for this presentation began with “The case is a 60-year-old female. The patient’s chief complaints were fatigue and fever. One month before admission, the patient noticed fatigue. Two days before admission, she had a fever and cough. But other people around the patient did not have the same symptoms.”

  2. Polish Your Draft with Google Gemini

  • Speech-to-Text: Use voice input for the case presentation

  • Elaborate Your Draft with Google Gemini

    Output: Introduction

    Today, we’ll discuss the case of a 60-year-old female patient who presented with fatigue and fever.

    History of Present Illness

    The patient’s main concerns are fatigue and fever. She reports noticing fatigue one month ago. This symptom gradually worsened, and two days before admission, she developed a fever and cough. It’s important to note that no close contacts have reported similar symptoms, suggesting a less contagious cause.

    Further Details to Explore….

  • Elaborated Points:

    Detailed Timeline and Symptoms: The output included a detailed timeline and symptom progression, enhancing understanding of the patient’s condition over time.

    Significance of Evaluation: Explained the importance of evaluating symptoms in the context of their development and impact.

    Additional Insights: Offered next steps and further considerations, ensuring a comprehensive overview for potential diagnosis.

  • Text-to-Speech: Push the speaker button to listen to the elaborated draft.

  3. Verify Output with Reliable Sources

In the conference presentation example, the output included technical aspects of AI, diagnosis, and diagnostic accuracy. In the case presentation, the output was oriented towards the clinical context of a patient. Both examples emphasized refining the initial draft to make it more concise and clearer with Google Gemini’s assistance, highlighting its potential in enhancing the quality of the presentation content.
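Step 3 asks the user to verify the output, and both examples show the output adding material that was not in the draft (for instance, “suggesting a less contagious cause” in Example 2). One way to decide where to look first is to list the output sentences whose vocabulary is mostly new relative to the draft. The sketch below is an illustration under stated assumptions, not a method from this commentary: the function names, the four-letter cutoff for content words, and the 0.5 threshold are all arbitrary choices, and the check only prioritizes sentences for manual review against reliable sources. It does not judge whether a sentence is true.

```python
import re


def content_words(text: str) -> set[str]:
    """Lower-cased words of four or more letters, a crude proxy for content words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) >= 4}


def sentences_to_verify(draft: str, output: str, threshold: float = 0.5) -> list[str]:
    """Return output sentences in which most content words do not occur in the draft."""
    draft_words = content_words(draft)
    flagged = []
    # Split after sentence-ending punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", output.strip()):
        words = content_words(sentence)
        if not words:
            continue
        new_fraction = len(words - draft_words) / len(words)
        if new_fraction > threshold:  # threshold is an assumed, tunable value
            flagged.append(sentence)
    return flagged
```

A sentence that is not flagged can still contain an unsupported addition, so the full output should still be read against the draft.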

Discussion

Perspectives on Enhancing Presentation Skills Through Generative AI

Generative AI, such as Google Gemini, offers an interactive approach for non-native English speakers. Such tools act not only as passive tools for correction but also as active participants. When using them, researchers not only improve their written and spoken English but also become accustomed to nuances such as intonation, word choice, and sentence construction, which can be critical for effective presentations. The real-time feedback method encourages continuous learning [19], enabling researchers to identify and rectify their mistakes instantly.

Adapting Your Draft for Your Presentation

Users should ensure that the draft is adjusted for clear pronunciation, especially if they speak English as a second language. Some words can be challenging for non-native English speakers. For instance, in Example 1, words like “differ” and “defer” can pose pronunciation challenges because they sound similar [20]. Of course, how difficult an English word is to pronounce depends considerably on the speaker’s native language, as the phonetic inventories and syntactic structures of languages vary widely. Adjust the prompt and the draft so that sentences use words that are easier for non-native English speakers to pronounce.
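To find which words to practice or replace, the written draft can be compared with the text that the speech-to-text step actually produced. The following is a minimal sketch using Python’s standard `difflib` module. It assumes the transcript has been copied by hand from the Google Gemini input box, and the function names are hypothetical.

```python
import difflib
import re


def to_words(text: str) -> list[str]:
    """Lower-case the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def misheard_words(draft: str, transcript: str) -> list[tuple[str, str]]:
    """Pair each stretch of the draft with what the transcript shows instead."""
    a, b = to_words(draft), to_words(transcript)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # replace, delete, or insert
            pairs.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return pairs


if __name__ == "__main__":
    draft = "various factors that may differ among patients"
    transcript = "various factors that may defer among patients"
    print(misheard_words(draft, transcript))  # [('differ', 'defer')]
```

Words that appear repeatedly in the returned pairs are candidates for extra pronunciation practice or for replacement with a synonym that is easier to articulate.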

Caution When Using Generative AI for Presentation Practice

While generative AI tools such as Google Gemini present substantial benefits for enhancing presentation skills, their use must be approached with caution. These AI tools can potentially generate errors and misinformation, necessitating the verification of their output against reliable sources such as textbooks, references, and accurate datasets. Despite several studies highlighting the usefulness of AI tools in improving speaking ability [21, 22], comprehensive research on AI-assisted pronunciation, especially within the context of generative AI, remains limited. Special attention should be paid to technical or specialized terminology, which generative AI may not yet fully comprehend. Additionally, given that Google Gemini is in an experimental version, it is important to stay informed about any updates to the tool and adapt your usage to accommodate these changes.

Generative AI tools have been widely utilized across various fields, including science and medicine. There exists potential in developing generative AI tools specifically tailored for presentation practice. Now is the time for us to wisely adapt to using these tools, while minimizing any associated risks. We anticipate that this commentary will provide researchers with actionable steps to improve their English pronunciation, presentation structure, and content verification, thereby enhancing their overall presentation skills.

Acknowledgements

This study was made possible using the resources from the Department of Diagnostic and Generalist Medicine, Dokkyo Medical University.

Abbreviations

AI: Artificial intelligence

LLM: Large language model

Declarations

Conflict of Interest

The authors declare no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Takanobu Hirosawa and Taro Shimizu contributed equally to this work.

References

  • 1. Chan SMH, Mamat NH, Nadarajah VD. Mind your language: the importance of English language skills in an International Medical Programme (IMP). BMC Med Educ. 2022;22(1):405.
  • 2. Martirosyan NM, Hwang E, Wanjohi R. Impact of English proficiency on academic performance of international students. J Int Students. 2015;5(1):60–71.
  • 3. Dalem M. Difficulties of speaking that are encountered by English language students at Al Margeb University. Premise: J Eng Educ Appl Linguist. 2017;6(2):20–9.
  • 4. Amaki Y. Perspectives on English education in the Japanese public school system: the views of foreign assistant language teachers (ALTs). Educ Stud Jpn. 2008;3:53–63.
  • 5. Stokel-Walker C, Van Noorden R. What ChatGPT and generative AI mean for science. Nature. 2023;614(7947):214–6.
  • 6. Thirunavukarasu AJ, Ting DSJ, Elangovan K, Gutierrez L, Tan TF, Ting DSW. Large language models in medicine. Nat Med. 2023;29(8):1930–40.
  • 7. Liu J, Wang C, Liu S. Utility of ChatGPT in clinical practice. J Med Internet Res. 2023;25:e48568.
  • 8. Hashemi-Pour C. Google Gemini (formerly Bard): TechTarget Network; 2024. Available from: https://www.techtarget.com/searchenterpriseai/definition/Google-Bard.
  • 9. Shadiev R, Hwang W-Y, Chen N-S, Huang Y-M. Review of speech-to-text recognition technology for enhancing learning. J Educ Technol Soc. 2014;17(4):65–84.
  • 10. Kohnke L, Moorhouse BL, Zou D. Exploring generative artificial intelligence preparedness among university language instructors: a case study. Comput Educ: Artif Intell. 2023;5:100156.
  • 11. Wang T, Lund BD, Marengo A, Pagano A, Mannuru NR, Teel ZA, et al. Exploring the potential impact of artificial intelligence (AI) on international students in higher education: generative AI, chatbots, analytics, and international student success. Appl Sci. 2023;13(11):6716.
  • 12. Alkaissi H, McFarlane SI. Artificial Hallucinations in ChatGPT: implications in Scientific writing. Cureus. 2023;15(2):e35179.
  • 13. Koo M. The importance of proper use of ChatGPT in medical writing. Radiology. 2023;307(3):e230312.
  • 14. Salvagno M, Taccone FS, Gerli AG. Can artificial intelligence help for scientific writing? Crit Care. 2023;27(1):75.
  • 15. Gilat R, Cole BJ. How will artificial intelligence affect scientific writing, reviewing and editing? The future is here. Arthroscopy. 2023;39(5):1119–20.
  • 16. Zheng H, Zhan H. ChatGPT in scientific writing: a cautionary tale. Am J Med. 2023;136(8):725–6.
  • 17. Biswas S. ChatGPT and the future of medical writing. Radiology. 2023;307(2):e223312.
  • 18. Draper H, Rogers W. Re-evaluating confidentiality: using patient information in teaching and publications. Adv Psychiatr Treat. 2005;11(2):115–21.
  • 19. Porter B, Grippa F. A platform for AI-enabled real-time feedback to promote digital collaboration. Sustainability. 2020;12(24):10243.
  • 20. Kondo M, editor. Is acquisition of L2 phonemes difficult? Production of English stress by Japanese speakers. Proceedings of the 10th Generative Approaches to Second Language Acquisition Conference (GASLA 2009); 2009: Citeseer.
  • 21. Noviyanti SD. Artificial intelligence (AI)-Based pronunciation checker: an alternative for independent learning in pandemic situation. J Engl Lang Teach Foreign Lang Context. 2020;5(2):162.
  • 22. Fatima SN. Artificial intelligence for pronunciation improvement in second language acquisition. Artif Intell. 2024;8(41). doi: 10.54850/jrspelt.8.41.001.

Articles from Medical Science Educator are provided here courtesy of Springer
