AEM Education and Training. 2021 Jul 1;5(3):e10605. doi: 10.1002/aet2.10605

The future of simulation‐based medical education: Adaptive simulation utilizing a deep multitask neural network

Aaron J Ruberto 1,2, Dirk Rodenburg 3, Kyle Ross 4, Pritam Sarkar 4, Paul C Hungler 5, Ali Etemad 4, Daniel Howes 6, Daniel Clarke 7, James McLellan 5, Daryl Wilson 8, Adam Szulewski 1
PMCID: PMC8155693  PMID: 34222746

Abstract

Background

In resuscitation medicine, effectively managing cognitive load in high‐stakes environments has important implications for education and expertise development. There exists the potential to tailor educational experiences to an individual's cognitive processes via real‐time physiologic measurement of cognitive load in simulation environments.

Objective

The goal of this research was to test a novel simulation platform that utilized artificial intelligence to deliver a medical simulation that was adaptable to a participant's measured cognitive load.

Methods

The research was conducted in 2019. Two board‐certified emergency physicians and two medical students participated in a 10‐minute pilot trial of a novel simulation platform. The system utilized artificial intelligence algorithms to measure cognitive load in real time via electrocardiography and galvanic skin response. In turn, modulation of simulation difficulty, determined by a participant's cognitive load, was facilitated through symptom severity changes of an augmented reality (AR) patient. A postsimulation survey assessed the participants’ experience.

Results

Participants completed a simulation that successfully measured cognitive load in real time through physiological signals. The simulation difficulty was adapted to the participant's cognitive load, which was reflected in changes in the AR patient's symptoms. Participants found the novel adaptive simulation platform to be valuable in supporting their learning.

Conclusion

Our research team created a simulation platform that adapts to a participant's cognitive load in real time. The ability to customize a medical simulation to a participant's cognitive state has potential implications for the development of expertise in resuscitation medicine.

INTRODUCTION

Caring for critically ill patients is a complex and challenging task, requiring physicians to make life‐and‐death decisions under difficult circumstances. 1 Simulation‐based medical education (SBME) is an important pedagogical strategy within modern medical training that provides an ethical and effective bridging strategy between “knowing” and “doing.” 2 , 3 , 4 , 5 , 6 , 7 While SBME has been successfully implemented in undergraduate and postgraduate training across North America, its application to continuing professional development for quality improvement remains a challenge. The expertise reversal effect suggests that certain learning techniques (for example, traditional SBME) that are effective for novice learners may theoretically be inefficient or even harmful for more advanced learners as a result of relative cognitive overload. 8 In this context, the expertise reversal effect may be related to the “one‐size‐fits‐all” approach of existing simulation programs, in which training typically (1) does not fully replicate complex, real‐life encounters and (2) is not dynamically adapted to suit the cognitive load and level of expertise of the participant. Cognitive load refers to the amount of working memory being used to perform a task. 9

Exceeding cognitive capacity has been shown to significantly degrade medical performance and learning. 10 , 11 , 12 , 13 Effectively managing cognitive load during simulation has implications for medical practice and education. In line with best educational practices, the design and presentation of simulations should match the educational objectives, the level of expertise, and the cognitive load of the individual medical student or physician. For example, well‐known educational models like the four‐component instructional design method suggest building tasks in a simple‐to‐complex and low‐ to high‐fidelity manner. 14 Mismatches of both task complexity and fidelity with an individual's cognitive capacity can significantly impede skill acquisition via cognitive overload. 12

In an attempt to better understand the cognitive processes of medical students and physicians and tailor SBME to the individual participant, both psychometric, and more recently physiological, tools have been utilized to assess cognitive load. 12 , 15 , 16 , 17 , 18 , 19 , 20 , 21 Psychometric tools include subjective inventories such as the Paas cognitive load scale, NASA Task Load Index, and the Cognitive Load Component questionnaire. 16 , 17 Although validated in numerous contexts, these subjective measures are limited by the requirement for retrospective assessment. In contrast, physiologic indicators of cognitive load, including pupillometry, eye tracking, electrocardiography (ECG), electroencephalography, and galvanic skin response (GSR) allow for less intrusive real‐time data capture. Although used to some degree in medical education research, 16 , 19 , 20 , 21 these multimodal tools that rely on physiologic measurement mark a relatively novel and objective approach to the assessment of cognitive load in resuscitation medicine simulation.

The primary objective of this pilot research was to enhance SBME through the development of an augmented reality (AR) simulation platform that measures and adapts to the participants’ cognitive load. The research was implemented in two phases. During the first phase, physiological data and subjective cognitive load scores were acquired from participants during two emergency medical simulations. The data were analyzed for the most salient predictive features and used to train a deep neural network to classify low and high levels of cognitive load based on physiological data. The methods of data acquisition, artificial intelligence algorithm design, and analysis for the first phase have been detailed previously by Ross et al. 22 and Sarkar et al. 23 This paper describes the second phase of the project, in which the classification algorithms for autonomously discriminating between high and low levels of cognitive load were utilized within an adaptive simulation platform.

METHODS

Two female third‐year medical students (MS1, MS2) and two board‐certified emergency physicians, one male and one female (EP1, EP2), participated in a 10‐minute adaptive simulation trial. The simulated patient presented with an asthma exacerbation, chosen for the relative ease of scaling symptom severity when programming an AR patient (Figure 1). Each participant was supported by three experienced medical personnel in the roles of the simulated medical team: two emergency medicine registered nurses (RNs) and one junior resident physician. The resident physicians were enrolled in either emergency medicine or anesthesiology residency programs. The RNs and resident physicians had prior experience in both simulated and real‐world emergency patient management. The research was conducted in 2019 and was approved by the Queen's University Health Sciences and Affiliated Teaching Hospitals Research Ethics Board.

FIGURE 1. Flow diagram of the simulation and vital signs for the simulated patient at each modulation stage of severity (MSS).

Prior to participation in the simulation, participants had ECG and GSR sensors (Shimmer3, Dublin, Ireland) attached to their chest and hands (Figure 2). Signals from both sensors were wirelessly transmitted to a computer in the simulation control room. Heart rate variability derived from the ECG signal, along with GSR data, was used to classify cognitive load. The AR headset (Microsoft Hololens, Redmond, WA) displayed a simulated AR patient with modifiable respiratory symptoms (e.g., respiratory rate, depth of respiration, and appearance of distress) over a patient simulation manikin (Laerdal SimMan 3G, Stavanger, Norway; Figure 3). The AR patient was visible only to the participant wearing the headset. The simulation had seven modulation stages of severity (MSS) for the patient (Figure 1); each MSS corresponded to a visually apparent change in the AR patient's respiratory symptoms.
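The exact physiological features used by the classifier are detailed in Ross et al. 22 and Sarkar et al. 23; as an illustrative sketch only, one common way to summarize a window of ECG and GSR data is a heart rate variability statistic (here RMSSD over R‐R intervals) paired with mean skin conductance. The function names and feature choices below are our own, not the study platform's code.

```python
import math

def rmssd(rr_intervals_ms):
    """Root mean square of successive R-R interval differences, a common
    heart rate variability feature; higher cognitive load is typically
    associated with reduced variability."""
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def feature_vector(rr_intervals_ms, gsr_microsiemens):
    """Illustrative two-feature summary of one recording window,
    combining ECG-derived HRV with mean galvanic skin response."""
    mean_gsr = sum(gsr_microsiemens) / len(gsr_microsiemens)
    return {"hrv_rmssd": rmssd(rr_intervals_ms), "gsr_mean": mean_gsr}
```

In the actual platform, features like these would be streamed from the wireless sensors and fed to the trained neural network rather than computed offline.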

FIGURE 2. Participant outfitted with the AR headset (Microsoft Hololens) and ECG and GSR sensors (Shimmer3). AR, augmented reality; GSR, galvanic skin response.

FIGURE 3. Manikin‐augmented reality patient overlay (Laerdal SimMan 3G), visible to the participant wearing the AR headset (Microsoft Hololens). AR, augmented reality.

Three minutes of baseline ECG and GSR signals were measured prior to commencing the simulation. The participant then had 1 minute to read the scenario introduction before entering the simulation room. In the simulation room, the participant engaged the patient and medical team during a 2‐minute engagement phase at MSS‐III (Figure 1). During this phase, the AI algorithm classified cognitive load from the ECG and GSR signals. This initial classification defined the upper and lower limits of the patient's MSS range for the remainder of the simulation. If the participant's cognitive load was labeled “high,” the range was set to MSS‐I to MSS‐V (less difficult); if labeled “low,” the range was set to MSS‐III to MSS‐VII (more difficult).

After the engagement phase, the simulation seamlessly progressed to an 8‐minute modulation phase for the remainder of the scenario (Figure 1). During this stage, the participant continued to manage the patient and cognitive load was classified by the AI algorithm at 2‐minute intervals for a total of four assessments. Based on a binary classification of the participant's cognitive load (low or high) the MSS was adjusted up or down a single MSS to increase or decrease the severity of the patient's respiratory symptoms.
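The adaptation rules described above can be sketched in a few lines; the representation below (integer MSS levels, function names such as set_bounds and next_mss) is our own illustration, not the study platform's software.

```python
# Sketch of the MSS adaptation rules: the engagement-phase classification
# fixes the severity range, then each 2-minute binary assessment steps the
# MSS up or down by one within that range.

ENGAGEMENT_MSS = 3  # every participant starts the engagement phase at MSS-III

def set_bounds(initial_load):
    """Initial cognitive load classification fixes the MSS range."""
    if initial_load == "high":
        return (1, 5)   # MSS-I to MSS-V, less difficult
    return (3, 7)       # MSS-III to MSS-VII, more difficult

def next_mss(current, load, bounds):
    """Step severity down on high load, up on low load, clamped to bounds."""
    lo, hi = bounds
    step = -1 if load == "high" else 1
    return max(lo, min(hi, current + step))
```

For instance, a participant classified as “high” during engagement gets bounds (1, 5) and moves from MSS‐III to MSS‐II; a subsequent “low” classification moves the patient back up to MSS‐III, matching the example trajectory in the Results.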

Five realistic distractors were implemented throughout the simulation. The introduction of these distractors was used to increase the cognitive load of the medical student or physician, which allowed the algorithm to better differentiate high versus low cognitive load. The distractors included an ECG displaying sinus tachycardia, a blood glucose level of 14 mmol/L, a home medication list, an emergency medical service patch call, and a simultaneous code blue in the emergency department (Figure 1). Participants were expected to manage these distractors while simultaneously caring for the patient in their simulation.

At the end of the modulation phase the simulation concluded. Participants were brought out of the simulation room and had their sensor equipment removed. Each participant was then brought to a debriefing room, where a postsimulation survey was administered to assess their experience with the adaptive simulation (Table 1).

TABLE 1.

Postsimulation survey

Question MS1 MS2 EP1 EP2 Comments
To what extent do you feel the augmented reality within the simulation enhanced the simulation experience? 7 8 4 6

Increased visual field on Hololens (if possible), troubleshoot resolution of image from different angles.

This is incredibly cool, and I think would greatly enhance traditional simulation once these kinks are ironed out.

To what extent do you feel the augmented reality within the simulation enhanced the believability of the simulation? 8 8 5 6 Great simulation of tachypnea. Obviously it would be even more improved if the AR patient's mouth moved with speaking.
How did each of the following factors impact your experience?
Scenario realism 7 8 8 7
Patient realism 7 8 7 6 If possible having the patient speak would definitely increase the realism.
Asthma realism 6 8 9 6 Sound of breathing, wheezing.
How aware were you of the following during the simulation?
Changes in symptom severity? 6 8 8 9
Changes in patient distress? 6 8 8 9 I did notice the cyanosis of the actual manikin under the AR, which is always a trigger for me to recognize the patient's condition is worsening. Would be neat to add that to the AR in some capacity.
Changes in symptom severity in response to your level of cognitive load (i.e., decreased severity when experiencing high cognitive load, increased severity when experiencing low cognitive load)? 5 4 9 8

Hard to answer as I wasn't aware it was happening ‐ but very cool that it did!

I think this was done well.

Timing in symptom severity changes in response to your level of affect (i.e., decreased severity when experiencing high emotional stress, increased severity when experiencing low emotional stress)? 6 4 8 8
How valuable was the inclusion of augmented reality in simulation based on your experience in this scenario? 6 8 6 6 Seeing a visual representation of the patient breathing really increased the believability of the scenario.
How valuable was the inclusion of the modulation of patient distress and symptom severity through augmented reality in simulation based on your experience in this scenario? 8 9 7 6
How valuable do you think this type of augmented reality presentation and modulation might be in supporting better learning outcomes for simulated experiences? 8 9 8 10
Do you have any additional comments? Very cool and I can see this would have a lot of promise in the future.

Following participation in the simulation, each participant completed a questionnaire regarding their experience within the simulation. Questions used a 10‐point Likert scale (1 = not at all; 10 = a great deal), with the opportunity after each question to comment on how to improve the simulation.

Abbreviations: AR, augmented reality; EP, emergency physician; MS, medical student.

RESULTS

We successfully designed a medical simulation platform that classified participants' cognitive load via physiologic measurement of ECG and GSR signals. Based on these measurements, an AR patient overlay on the simulation manikin displayed visual changes of respiratory distress, modulating symptom severity in real time to adapt to the individual's cognitive load (Figure 3). For example, if a participant was determined to have high cognitive load at 2 minutes during the engagement phase, the simulation would be adjusted to MSS‐II for the initial modulation phase. If, on the next cognitive load assessment at 4 minutes (after the glucose value distractor), the participant was determined to have low cognitive load, the simulation would be adjusted to MSS‐III, and so on until the participant completed the scenario or 10 minutes had elapsed (Figure 1).

The postsimulation user experience survey results are shown in Table 1. The survey provided valuable information on each participant's experience with this novel adaptive simulation platform. When asked whether the AR patient presentation enhanced the simulation experience, MS1 and MS2 gave higher ratings (7 and 8) than EP1 and EP2 (4 and 6). When asked how aware they were of changes in symptom severity in response to their perceived cognitive load, EP1 and EP2 reported higher ratings (9 and 8) than MS1 and MS2 (5 and 4). All four participants reported that the simulation and its modulation would be valuable in supporting better learning outcomes for simulated experiences, with ratings of 8, 9, 8, and 10 for MS1, MS2, EP1, and EP2, respectively. Overall, the emergency physicians were more aware of their cognitive state and of the modulation of the simulation than the medical students. The majority of the postsimulation comments were positive and suggested specific ways the simulation could be enhanced, usually by improving the realism of the AR patient presentation.

DISCUSSION

By continuously monitoring cognitive functioning via ECG and GSR signals, the platform described in this article utilized machine learning to detect and automatically classify, in real time, the cognitive load of the participant. After cognitive load classification, we dynamically adapted the simulated environment to participant cognitive state via changes in the AR patient's respiratory symptoms.

The ability to classify the cognitive load of a participant has important implications for simulation design. If the difficulty of a simulation exceeds a participant's abilities, research has shown that the participant will perform poorly and learning will be negatively impacted. Conversely, a simulation whose difficulty is too low for a participant's level of expertise can also have a negative effect on learning outcomes. 8 , 24 In some specialties, such as emergency and critical care medicine, there are also circumstances in which purposefully overloading a trainee cognitively creates a desirable opportunity to train for such situations. Adaptive simulation, in which the complexity of both the scenario and the tasks can be modulated in real time in response to the participant's cognitive state, offers a way to address these issues. The research described here works toward an adaptive simulation platform that can tailor simulation pedagogy to the individual participant throughout their developmental trajectory.

The use of cognitive load theory to improve learning outcomes in medical simulation has previously been studied. 10 , 11 , 12 , 16 , 17 , 25 Prior SBME literature has explored various ways of measuring cognitive load with the goal of modifying a simulation to meet the educational needs of the student. These measures are predominantly self‐reported, subjective, and collected retrospectively. 15 , 16 , 17 Physiological indicators of cognitive load offer the ability to objectively assess cognitive load in real time and seamlessly integrate these measures into a given simulated environment, without disrupting or impacting performance. 17 , 19 , 26 , 27 , 28 , 29

LIMITATIONS

The primary limitation of our research was the small number of participants, a consequence of the pilot nature of the study and of funding and time constraints. Furthermore, worsening patient vital signs do not necessarily increase a physician's cognitive load in a linear manner; in some instances, equivocal examination findings and diagnostic uncertainty may prove more cognitively taxing than a rapidly deteriorating patient. 9 The current binary nature of our cognitive load classification algorithm also limits how finely the simulation can be tailored to participants' learning. Despite these limitations, we were able to demonstrate that an AR simulation can be designed to dynamically respond to cognitive load. The results, while exploratory, demonstrate the potential of this pedagogical approach and the role of AI in mediating adaptive learning systems. Future research will focus on recruiting more participants and on improving the discriminatory power of our classifiers. This will allow for a more nuanced and statistically verified analysis of the impact of adaptive AR technology on the learning experience and, eventually, its pedagogical efficacy.

CONCLUSION

Our research team developed a novel simulation platform that adapts augmented reality patient symptom severity modulation to participants’ measured cognitive load. This represents a step forward in advancing medical simulation design and has the potential to enhance the way that simulation‐based medical education is conceptualized, designed, and delivered. There is still more work to be done, including refining our algorithms, increasing the granularity of cognitive load classification, exploring other surrogate measurements of cognitive load, and ultimately evaluating the precise impact on learning outcomes.

CONFLICT OF INTEREST

The authors have no potential conflicts to disclose.

AUTHOR CONTRIBUTIONS

Aaron J. Ruberto: study concept and design, acquisition of the data, analysis and interpretation of the data, drafting of the manuscript. Dirk Rodenburg: study concept and design, acquisition of the data, analysis and interpretation of the data, drafting of the manuscript, obtained funding, study supervision. Kyle Ross: acquisition of the data, analysis and interpretation of the data, drafting of the manuscript, statistical expertise. Pritam Sarkar: acquisition of the data, analysis and interpretation of the data, drafting of the manuscript, statistical expertise. Paul C. Hungler: study concept and design; drafting of the manuscript; obtained funding, administrative, technical, or material support; study supervision. Ali Etemad: study concept and design, acquisition of the data, analysis and interpretation of the data, statistical expertise, obtained funding, study supervision, drafting of the manuscript. Daniel Howes: study concept and design, drafting of the manuscript, obtained funding, study supervision. Daniel Clarke: acquisition of the data; drafting of the manuscript; administrative, technical, or material support. James McLellan: study concept and design, drafting of the manuscript, obtained funding. Daryl Wilson: analysis and interpretation of the data, drafting of the manuscript. Adam Szulewski: study concept and design, acquisition of the data, analysis and interpretation of the data, drafting of the manuscript, obtained funding, study supervision.

ACKNOWLEDGMENTS

The interdisciplinary research team included physicians, computer scientists, engineers, computer graphics designers, and educational specialists funded by the Department of National Defence Innovation for Defence Excellence and Security grant (IDEaS).

Funding information

Funded by Innovation for Defence Excellence and Security (IDEaS), Government of Canada, Department of National Defence (grant W7714‐196812/001/SV).

Supervising Editor: Susan E. Farrell

REFERENCES

1. Sarcevic A, Zhang Z, Marsic I, Burd RS. Checklist as a memory externalization tool during a critical care process. AMIA Annu Symp Proc. 2016;2016:1080‐1089.
2. Miller GE. The assessment of clinical skills/competence/performance. Acad Med. 1990;65(9):s63‐s67.
3. Steadman RH, Coates WC, Huang YM, et al. Simulation‐based training is superior to problem‐based learning for the acquisition of critical assessment and management skills. Crit Care Med. 2006;34(1):151‐157.
4. Curtis JR, Back AL, Ford DW, et al. Effect of communication skills training for residents and nurse practitioners on quality of communication with patients with serious illness: a randomized trial [published erratum appears in JAMA. 2014;311(13):1360]. JAMA. 2013;310(21):2271‐2281.
5. Cohen ER, Barsuk JH, Moazed F, et al. Making July safer: simulation‐based mastery learning during intern boot camp. Acad Med. 2013;88(2):233‐239.
6. Kirkman T. The effectiveness of human patient simulation on nursing students' transfer of learning. Clin Simul Nurs. 2012;8(8):e408.
7. Schroedl CJ, Corbridge TC, Cohen ER, et al. Use of simulation‐based education to improve resident learning and patient care in the medical intensive care unit: a randomized trial. J Crit Care. 2012;27(2):219.e7‐e13.
8. Kalyuga S. Expertise reversal effect and its implications for learner‐tailored instruction. Educ Psychol Rev. 2007;19(4):509‐539.
9. Szulewski A, Howes D, van Merriënboer JJ, Sweller J. From theory to practice: the application of cognitive load theory to the practice of medicine. Acad Med. 2021;96(1):24‐30.
10. Fraser KL, Ayres P, Sweller J. Cognitive load theory for the design of medical simulations. Simul Healthc. 2015;10(5):295‐307.
11. Laxmisan A, Hakimzada F, Sayan OR, Green RA, Zhang J, Patel VL. The multitasking clinician: decision‐making and cognitive demand during and after team handoffs in emergency care. Int J Med Inform. 2007;76(11‐12):801‐811.
12. Sweller J. Cognitive load during problem solving: effects on learning. Cogn Sci. 1988;12(2):257‐285.
13. Young JQ, Van Merrienboer J, Durning S, Ten Cate O. Cognitive load theory: implications for medical education: AMEE Guide No. 86. Med Teach. 2014;36(5):371‐384.
14. Van Merrienboer JJ, Kirschner P. Ten Steps to Complex Learning: A Systematic Approach to Four‐Component Instructional Design. Mahwah, NJ: Lawrence Erlbaum Associates Publishers; 2007.
15. Naismith LM, Cheung JJ, Ringsted C, Cavalcanti RB. Limitations of subjective cognitive load measures in simulation‐based procedural training. Med Educ. 2015;49(8):805‐814.
16. Naismith LM, Cavalcanti RB. Validity of cognitive load measures in simulation‐based training: a systematic review. Acad Med. 2015;90(11):S24‐S35.
17. Szulewski A, Gegenfurtner A, Howes DW, Sivilotti ML, van Merriënboer JJ. Measuring physician cognitive load: validity evidence for a physiologic and a psychometric tool. Adv Health Sci Educ. 2017;22(4):951‐968.
18. Szulewski A, Braund H, Egan R, et al. Starting to think like an expert: an analysis of resident cognitive processes during simulation‐based resuscitation examinations. Ann Emerg Med. 2019;74(5):647‐659.
19. Bruder E. Manipulation of cognitive load in simulation‐based medical education (master's thesis). QSpace; 2018. http://hdl.handle.net/1974/23962
20. Dias RD, Ngo‐Howard MC, Boskovski MT, Zenati MA, Yule SJ. Systematic review of measurement tools to assess surgeons' intraoperative cognitive workload. Br J Surg. 2018;105(5):491‐501.
21. Szulewski A, Roth N, Howes D. The use of task‐evoked pupillary response as an objective measure of cognitive load in novices and trained physicians: a new tool for the assessment of expertise. Acad Med. 2015;90(7):981‐987.
22. Ross K, Sarkar P, Rodenburg D, et al. Toward dynamically adaptive simulation: multimodal classification of user expertise using wearable devices. Sensors (Switzerland). 2019;19(19):4270.
23. Sarkar P, Ross K, Ruberto AJ, Rodenburg D, Hungler P, Etemad A. Classification of cognitive load and expertise for adaptive simulation using deep multitask learning. In: 2019 8th International Conference on Affective Computing and Intelligent Interaction (ACII). 2019:496‐502.
24. Hamstra SJ, Brydges R, Hatala R, Zendejas B, Cook DA. Reconsidering fidelity in simulation‐based training. Acad Med. 2014;89(3):387‐392.
25. White MR, Braund H, Howes D, et al. Getting inside the expert's head: an analysis of physician cognitive processes during trauma resuscitations. Ann Emerg Med. 2018;72(3):289‐298.
26. Nourbakhsh N, Wang Y, Chen F, Calvo RA. Using galvanic skin response for cognitive load measurement in arithmetic and reading tasks. In: Proceedings of the 24th Australian Computer‐Human Interaction Conference (OzCHI). 2012:420‐423.
27. Perlin K, He Z, Zhu F, et al. Detecting users' cognitive load by galvanic skin response with affective interference. ACM Trans Interact Intell Syst. 2017;7(2):1‐20.
28. Charles RL, Nixon J. Measuring mental workload using physiological measures: a systematic review. Appl Ergon. 2019;74:221‐232.
29. Solhjoo S, Haigney MC, McBee E, et al. Heart rate and heart rate variability correlate with clinical reasoning performance and self‐reported measures of cognitive load. Sci Rep. 2019;9(1):1‐9.
