Author manuscript; available in PMC: 2020 Sep 1.
Published in final edited form as: Proc Int Symp Hum Factors Ergon Healthc. 2019 Sep 15;8(1):110–114. doi: 10.1177/2327857919081025

Can Eye Tracking be Used to Predict Performance Improvements in Simulated Medical Training? A Case Study in Central Venous Catheterization

Hong-En Chen 1, Rucha R Bhide 1, David F Pepley 1, Cheyenne C Sonntag 2, Jason Z Moore 1, David C Han 2, Scarlett R Miller 1
PMCID: PMC6944057  NIHMSID: NIHMS1063324  PMID: 31909058

Abstract

Manikins have traditionally been used to train ultrasound-guided Central Venous Catheterization (CVC), but they are static in nature and require an expert observer to provide feedback. As a result, virtual simulation and personalized learning have been increasingly adopted in medical education to provide quantitative feedback efficiently. The Dynamic Haptic Robotic Trainer (DHRT) trains surgical residents in CVC needle insertions by simulating various patient profiles and presenting personalized feedback on objective performance. However, no studies have examined the learning gains from this personalized feedback or how the feedback relates to what the user focuses on during training. Thus, this study was developed to determine the effectiveness of the current personalized learning interface through a long-term investigation with 7 surgical residents. The eye tracking analysis showed that residents spent significantly more time fixated on percent aspiration as training progressed, and that the more time participants spent looking at the Number of Insertion Attempts, Percent Aspiration, and Angle of Insertion on the DHRT GUI, the better they performed on subsequent trials on the DHRT system.

INTRODUCTION

Central Venous Catheterization (CVC) is a common medical procedure used to provide nutrients and medication to the body via direct access to the heart (Graham, Ozment, Tegtmeyer, & Braner, 2013). During this procedure, a needle is inserted, typically into the internal jugular vein (IJ), under ultrasound guidance. Next, a guidewire is threaded and a catheter is inserted over the guidewire (Osborne, 2005). While over 5 million central lines are placed in the United States each year, accounting for 15 million catheter days per year (McGee & Gould, 2003; Raad, 1998), complications occur in up to 15% of patients (McGee & Gould, 2003). The greatest predictor of placement complications is the number of unsuccessful needle insertion attempts: the failure rate of cannulation increases from 1.6% (1 attempt) to 10.2% (2 attempts) to 43.2% (3 or more attempts) (Mansfield, Hohn, Fornage, Gregurich, & Ota, 1994). With rising healthcare costs, proper training in CVC is essential to reduce the cost and occurrence of vascular-related infections.

In order to reduce these complications, manikin simulators have been integrated throughout medical education. While manikin trainers are physically realistic and simulate tactile feedback, they represent only a single patient anatomy. In addition, these manikins provide no objective performance criteria; instead, a trained preceptor (e.g., faculty) must be present to provide real-time feedback on performance, which introduces subjectivity to the evaluation process and makes standardization difficult. Because of this, existing medical simulator education has been criticized as time consuming and resource intensive (Ogden, Cobbs, Howell, Sibbitt, & DiPette, 2007; Sherertz et al., 2000).

In order to combat the deficits of these manikin-based systems, the Dynamic Haptic Robotic Trainer (DHRT) was developed to train ultrasound-guided CVC needle insertion skills by providing multiple patient anatomies and objective feedback on performance (Pepley et al., 2017). The DHRT system includes a simulated and interactive ultrasound probe, ultrasound screen, needle, and a variety of patient cases. In order to reduce the time and resource burden associated with traditional training techniques, the DHRT system deploys a personalized learning interface that provides real-time automated feedback through a graphical user interface (GUI) (Yovanoff et al., 2017); see Figure 1 and the Methods section for a description. However, the utility of this feedback system for learning gains has yet to be investigated.

Figure 1:


(Left) Personalized learning interface with (1) Overall Grade, (2) Case Difficulty, (3) Number of Insertions, (4) Angle of insertion, (5) Distance to the Center of the Vein, (6) Percent of time spent aspirating. (Right) Sample heat map of participant viewing their feedback on one trial, where red represents a high concentration of fixation.

One way to monitor what the user is looking at in this system, and how that relates to performance gains, is through eye tracking. Eye tracking has been used in a variety of industries, from neuroscience and psychology to marketing and advertising, to better understand what people are focusing their attention on (Duchowski, 2002). Eye movements, including fixations on certain areas of a user interface, can reflect the amount of mental processing applied at specific gaze points (Ghaoui, 2005). In the medical field, an eye tracking study of expertise in a simulated laparoscopic task found that experts maintained steadier gaze patterns and required less visual feedback to complete the procedure (Law, Atkins, Kirkpatrick, & Lomax, 2004). Another study examined whether eye gaze patterns during sinus surgery could be indicative of surgical skill, concluding that there was a correlation between performance and eye gaze patterns (Ahmidi et al., 2010).

In this study, eye tracking was used to determine which pieces of information participants fixated on in the personalized learning interface, and how gaze patterns changed from trial to trial throughout training. These data were then compared to participant performance to determine if there was any correlation. Eye tracking during CVC training allows researchers to determine whether the information on the GUI is actually useful and whether it helps participants learn. Few studies have used eye tracking to examine learning and performance gains in CVC insertion. Thus, the goal of this paper is to explore the effectiveness of the learning interface by examining performance improvements and gaze fixation patterns that may better inform its metrics and design. This type of information can inform the development of personalized learning systems for other medical procedures to improve training and skill retention.

METHODS

The purpose of the current study was to identify the effectiveness of the DHRT personalized learning interface previously developed by (Yovanoff et al., 2017) through an eye tracking investigation. Six key Areas Of Interest (AOIs) were analyzed: 1) overall grade, 2) case difficulty, 3) number of insertion attempts, 4) angle of insertion, 5) distance to the center of the vein, and 6) percentage of time aspirating, see Figure 1. Specifically, the study was developed to answer the following research questions (RQ):

  • RQ1: How do eye fixation patterns on different AOIs change as training progresses?

  • RQ2: How does performance in each AOI change throughout training? How does this relate to what participants fixated on in the personalized learning interface?

Participants

In order to answer these questions, participants were recruited from a first-year surgical residency program at Penn State Hershey Medical Center (HMC). In total, there were 7 participants (5 males, 2 females) in the study, who specialized in General Surgery (3), Otolaryngology (1), Orthopedics (1), Urology (1), and Unspecified (1). These participants were selected because they were right-handed (necessary for the DHRT system at the time) and did not wear glasses (necessary for the eye tracking system deployed). This experiment represents a subset of a larger study (N=26) aimed at comparing the effectiveness of the DHRT system to current manikin simulators. Only aspects of the study that pertain to the current investigation will be described here.

Procedure

At the start of the Institutional Review Board approved study, the purposes and procedures were explained and informed consent was obtained. Participants completed a total of three DHRT sessions, a practice session and two training sessions, over a 2-month period. The first practice session consisted of 2 needle insertions (Trials 1–2), while the two subsequent sessions included 20 additional insertions: Session II (Trials 3–12) and Session III (Trials 13–22). Before each training session, participants were fitted and calibrated with Tobii Pro Glasses 2, which collects raw data at a frequency of 20 Hz (“Tobii Pro Glasses 2,” 2018). Once calibrated, the participants independently completed their insertions on the DHRT system. The 22 insertions covered 17 distinct patient profiles, with the same baseline profile used for the first and last trial of each training session. For the purposes of the current study, only training Sessions II and III will be discussed.

Eye Tracking Measures:

This section defines the eye tracking measures that were used in our data analysis. All of the eye tracking metrics were based on eye gaze fixations, which are defined as the amount of time spent maintaining a gaze in a specific location. Specifically, the fixation filter in Tobii Pro software uses a classification algorithm that detects changes in eye gaze locations using a sliding window method (Olsson, 2007). Signals, or eye gazes, that slowly shift around the same area are classified as a fixation; abrupt changes in signals indicate a change in fixation location. Figure 2 (right) provides a demonstration of two fixation points.
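The sliding-window idea behind this classification can be illustrated with a simplified dispersion-based fixation filter (often called I-DT). This is only a sketch of the general technique, not Tobii's actual algorithm (Olsson, 2007); the threshold values, function names, and data layout below are illustrative assumptions.

```python
# Simplified dispersion-based fixation detection (I-DT style): gaze samples
# that stay within a small spatial window are grouped into one fixation, and
# an abrupt jump in gaze location starts the search for a new fixation.
# Thresholds here are hypothetical, not Tobii's settings.

def _dispersion(window):
    """Spatial spread of a window of (x, y) samples: x-range + y-range."""
    xs = [p[0] for p in window]
    ys = [p[1] for p in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def detect_fixations(samples, max_dispersion=30.0, min_samples=4):
    """Group (x, y) gaze samples (screen pixels) into fixations.

    Returns a list of (start_index, end_index, centroid) tuples.
    """
    fixations = []
    start = 0
    while start < len(samples):
        end = start + min_samples
        if end > len(samples):
            break
        if _dispersion(samples[start:end]) <= max_dispersion:
            # Grow the window while the dispersion stays under the threshold.
            while end < len(samples) and _dispersion(samples[start:end + 1]) <= max_dispersion:
                end += 1
            window = samples[start:end]
            cx = sum(p[0] for p in window) / len(window)
            cy = sum(p[1] for p in window) / len(window)
            fixations.append((start, end - 1, (cx, cy)))
            start = end
        else:
            # No fixation starts here; slide the window forward one sample.
            start += 1
    return fixations
```

Fed a gaze trace that hovers near one screen location, jumps, and hovers near another, this returns two fixations, mirroring the two fixation points shown in Figure 2 (right).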

Figure 2:


(Left) The eye tracking setup on the DHRT system and (Right) a gaze plot with two sample fixation points. The size of the fixation circle grows as the length of time spent fixating in that area increases.

Total Fixation Duration:

The total amount of time spent fixating on the feedback screen after each trial.

AOI Fixation Duration:

The proportion of time each participant spent fixating on a particular AOI after each trial. For each participant, the fixation duration was normalized by dividing the time spent fixating on points within a particular AOI (e.g., overall grade) by the Total Fixation Duration for that trial.

Mean Fixation Duration:

For each participant, this was calculated as the average AOI Fixation Duration for both Training Sessions (II and III).
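The two derived measures above can be sketched in a few lines; the function names and the dictionary layout are assumptions for illustration, not the study's analysis code.

```python
# Hypothetical sketch of the AOI Fixation Duration (normalized per trial)
# and Mean Fixation Duration (averaged across trials) measures.
# Fixation times are in seconds.

def aoi_fixation_duration(aoi_fixation_times, total_fixation_duration):
    """Normalize per-AOI fixation time by the trial's Total Fixation Duration."""
    return {aoi: t / total_fixation_duration
            for aoi, t in aoi_fixation_times.items()}

def mean_fixation_duration(per_trial_aoi_times):
    """Average a participant's per-AOI fixation times across all trials."""
    aois = per_trial_aoi_times[0].keys()
    n = len(per_trial_aoi_times)
    return {aoi: sum(trial[aoi] for trial in per_trial_aoi_times) / n
            for aoi in aois}
```

For example, a participant who fixated on the overall grade for 5 of 10 total seconds in a trial would have an AOI Fixation Duration of 0.5 for that AOI.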

Eye gaze points were analyzed using the real-world mapping feature and manually confirmed in the Tobii Pro Lab software. The real-world mapping feature automatically codes eye gaze points from the video onto a still image (snapshot) (“Real-World Mapping,” 2018). The software outputs a variety of eye gaze metrics, including scan path, type of gaze, fixation duration, gaze duration, and location of eye movement. While several gaze filters were available, the fixation filter was used for this study due to its precise classification method.

For the purposes of this study, only eye gaze data for the feedback screen was coded. Thus, whenever the participant looked at something other than the feedback screen (e.g., the floor or the ultrasound probe), the data were not analyzed for the current study; see Figure 2 for the experimental setup. In order to map eye gaze data, a snapshot of the participants’ feedback screen was added to the Tobii project for every trial. Each time a participant looked at a specific area of the feedback screen in the video, that location was clicked on the corresponding snapshot.

Performance Measures:

Performance improvement was measured on five AOIs, excluding case difficulty. Each of the six AOIs shown in Figure 1 is summarized below:

Overall Performance Score:

An overall performance score (e.g., 891.67, see Figure 2) was calculated for each patient profile based on the weighted performance across all other AOIs; see (Pepley et al., 2017) for more details on the calculation.

Case Difficulty:

The patient profiles were assigned difficulty scores according to depth and size of the vein and artery. These scores ranged from 1 to 5, 5 being the most difficult and 1 being the least difficult, where smaller or deeper vessels were considered more difficult. These ratings were developed with the aid of a vascular surgeon. This AOI did not correspond to performance and was excluded from the analyses.

Number of Insertion Attempts:

This AOI was defined as the number of times the needle punctured the simulated tissue surface. This metric is important because prior work has shown that the number of insertion attempts is the greatest predictor of cannulation failure (Mansfield et al., 1994).

Angle of Insertion:

The angle of insertion was calculated as the average angle of the needle, relative to the simulated tissue surface, while the needle tip was under the tissue surface. The ideal range for the angle of insertion was between 30–45°.

Distance from the center of the vein:

This measure was calculated as the distance between the final location of the needle tip and the center of the target vein. Participants were instructed to place the needle at the center of the vein as often as possible.

Percent aspiration:

This was calculated as the percentage of time the needle was aspirated while the needle tip was under the simulated tissue surface. Participants were instructed to aspirate 100% of the time to reduce the risk of air embolism.

Performance improvements between trials:

The performance improvement between trials for each of the five AOIs shown in Figure 1 was calculated as the signed change in score between subsequent trials.

RESULTS AND DISCUSSION

To answer our research questions, statistical analyses were conducted on performance and fixation patterns on the five AOIs from the two full training sessions (Sessions II and III). These analyses and results are presented in relation to each research question. All statistics were analyzed using SPSS (v. 25.0) with a significance level of 0.05.

RQ1: How do eye fixation patterns on different AOIs change as training progresses?

Our first research question was developed to understand what people were looking at on the personalized learning interface and to determine whether these fixations changed throughout the course of learning. To understand this, we first performed a repeated measures ANOVA to determine if there were significant differences in the Mean Fixation Duration between the AOIs throughout each training session. Mauchly’s Test of Sphericity indicated that the assumption of sphericity had been violated, χ2 = 47.496, p < 0.0005; therefore, a Greenhouse-Geisser correction was used. There was a significant effect of AOI on Mean Fixation Duration, F(3.543, 21.26) = 32.673, p < 0.0005. Follow-up paired sample t-tests revealed that participants spent significantly more time fixating on the Distance to the Center of the Vein (21.9 ± 14.8s), Number of Insertion Attempts (21.0 ± 14.6s), and Overall Performance Score (19.7 ± 12.5s) than on the Angle of Insertion (15.5 ± 12.4s) or Percent Aspiration (8.2 ± 8.8s), p < 0.005. In addition, participants spent significantly more time fixating on the Angle of Insertion than on Percent Aspiration, p < 0.005. There were no other statistically significant differences.

In order to understand whether participants spent more (or less) time fixating on these AOIs over the course of the two training sessions, paired sample t-tests were conducted on the Mean Fixation Duration for each AOI between Sessions II and III. Assumptions were checked prior to the test. Results indicated that there was a significant difference in Mean Fixation Duration on the number of insertion attempts and percent aspiration between Sessions II and III. Specifically, participants spent more time looking at the number of insertion attempts AOI during Session II (21.2 ± 6.5s) than during Session III (18.9 ± 6.6s), t(6) = 2.48, p < 0.048. Additionally, participants spent less time fixating on the percent aspiration AOI during Session II (6.5 ± 2.8s) than in Session III (9.7 ± 1.3s), t(6) = −3.417, p < 0.014. This indicates that they were more attentive to feedback on their percent aspiration during the latter training session, relative to the previous session. Interestingly, participants spent the least amount of time fixating on the percent aspiration AOI compared to all other AOIs. This could be due to its placement in the lower right-hand corner of the screen, discussed more under RQ2.
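The session-level comparison above is a standard paired t-test on per-participant means. The sketch below implements it directly from the textbook formula so it is self-contained; the fixation durations are illustrative values, not the study's data (n = 7 residents, one mean per session each).

```python
import math
import statistics

def paired_t(session_a, session_b):
    """Paired t statistic on per-participant differences (a - b)."""
    diffs = [a - b for a, b in zip(session_a, session_b)]
    n = len(diffs)
    # t = mean(d) / (s_d / sqrt(n)), with s_d the sample standard deviation.
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))

# Hypothetical per-resident mean fixation durations (seconds), n = 7.
session_2 = [22.1, 19.5, 28.3, 14.2, 25.0, 18.7, 20.6]
session_3 = [19.4, 17.8, 25.1, 13.0, 22.9, 16.5, 17.6]

t_stat = paired_t(session_2, session_3)
# With df = n - 1 = 6, |t| > 2.447 is significant at alpha = 0.05 (two-tailed).
print(f"t(6) = {t_stat:.2f}")
```

Because the same residents are measured in both sessions, the test operates on within-participant differences, which is why the degrees of freedom are n − 1 = 6 despite 14 observations.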

RQ2: How does performance in each AOI change throughout training?

Our second research question was developed to understand if each AOI on the DHRT GUI was providing valuable feedback that helped participants evaluate and improve on their own performance. In order to understand this, regression analyses were performed with the independent variables being the time spent fixating on each of the AOIs, and the dependent variable being the difference in performance in each of the five AOIs between trials. The results of the regression revealed that two predictors explained 12.7% of the variance for performance improvements in the Number of Insertion Attempts (R2= .127, F(6, 116)= 2.801, p < .014) including the time fixated on the angle of insertion AOI (β = −.133, p < .009) and aspiration % AOI (β = .194, p < .040), see Figure 3. In addition, the regression analyses showed that one predictor explained 11.4% of the variance for performance improvements in the Angle of Insertion (R2 = .114, F(6, 116)= 2.490, p < .027), see Figure 4. Specifically, the time fixated on the angle of insertion AOI (β = −.298, p < .003) significantly predicted the improvement on the participants’ Angle of Insertion performance in subsequent trials. There were no other significant findings.
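The regressions above ask whether the time fixated on an AOI predicts the performance change on the next trial. The sketch below reduces this to a single predictor via ordinary least squares for clarity; the data and variable names are illustrative assumptions, not the study's (which entered six predictors simultaneously).

```python
# Single-predictor OLS: does time fixated on the angle-of-insertion AOI after
# a trial (x) predict the change in angle performance on the next trial (y)?
# Data below are made up for illustration; negative y = error decreased.

def simple_regression(x, y):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1, r_squared)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = my - b1 * mx
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return b0, b1, 1 - ss_res / ss_tot

fixation_s = [2.0, 5.0, 8.0, 3.0, 10.0, 6.0]           # seconds on the AOI
angle_change = [1.0, -2.0, -4.0, 0.5, -6.0, -2.5]      # next-trial change

b0, b1, r2 = simple_regression(fixation_s, angle_change)
```

A negative slope (b1 < 0) in this toy data corresponds to the paper's finding: more time on the AOI, larger improvement on the next trial. R² plays the same role as the variance-explained figures reported above.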

Figure 3:


Regression analysis for changes in # insertions between trials. The time spent fixating on the # insertions and aspiration % in the GUI significantly contributed to the model.

Figure 4:


Regression for changes in the angle of insertion between trials. The time spent fixating on the aspiration percent in the GUI significantly contributed to the model.

CONCLUSION

The goal of the current study was to understand what surgical residents focused their attention on during performance feedback, and how that feedback improved their performance in each of the areas of interest (AOIs) presented by the Dynamic Haptic Robotic Trainer (DHRT) personalized learning interface. Results from the eye tracking analysis showed that residents spent significantly more time fixated on percent aspiration from Session II to Session III. Additionally, the results showed that the more time participants spent looking at the Number of Insertion Attempts, Percent Aspiration, and Angle of Insertion on the GUI, the better they performed on subsequent trials on the DHRT system. The results of this study support the use of the DHRT GUI for improving resident performance. Our results also demonstrate how eye tracking can be used to measure the effectiveness of GUI design in medical simulators.

While these results are promising, there were several limitations to the study. Due to the design of the robot at the time of the study (right-handed only) and the number of participants who wore glasses, only the 7 participants who did not wear glasses were eye tracked, so as not to interfere with the accuracy of the gaze data. Additionally, this study focused on the performance improvements enabled by the interface design, but did not conduct another full round of usability testing for a redesign. Future design recommendations include replacing the “Case Difficulty” AOI with a “Time to Complete” AOI.

ACKNOWLEDGEMENTS

This work was supported by the National Heart, Lung, and Blood Institute of the National Institutes of Health under Award Number R01HL127316. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

REFERENCES

  1. Ahmidi N, Hager GD, Ishii L, Fichtinger G, Gallia GL, & Ishii M (2010). Surgical task and skill classification from eye tracking and tool motion in minimally invasive surgery. Paper presented at the International Conference on Medical Image Computing and Computer-Assisted Intervention.
  2. Duchowski AT (2002). A breadth-first survey of eye-tracking applications. Behavior Research Methods, Instruments, & Computers, 34(4), 455–470.
  3. Ghaoui C (2005). Encyclopedia of Human Computer Interaction. IGI Global.
  4. Graham A, Ozment C, Tegtmeyer K, & Braner D (Producer). (2013). Central Venous Catheterization. Retrieved from http://www.youtube.com/watch?v=L_Z87iEwjbE
  5. Law B, Atkins SM, Kirkpatrick AE, & Lomax AJ (2004). Eye gaze patterns differentiate novice and experts in a virtual laparoscopic surgery training environment. Paper presented at the Proceedings of the 2004 Symposium on Eye Tracking Research & Applications, San Antonio, TX.
  6. Mansfield PF, Hohn DC, Fornage BD, Gregurich MA, & Ota DM (1994). Complications and failures of subclavian-vein catheterization. New England Journal of Medicine, 331(26), 1735–1738.
  7. McGee DC, & Gould MK (2003). Preventing complications of central venous catheterization. New England Journal of Medicine, 348(12), 1123–1133.
  8. Ogden PE, Cobbs LS, Howell MR, Sibbitt SJ, & DiPette DJ (2007). Clinical simulation: importance to the internal medicine educational mission. The American Journal of Medicine, 120(9), 820–824.
  9. Olsson P (2007). Real-time and offline filters for eye tracking.
  10. Osborne T (2005). Central venous catheter. WO Patent 2,005,004,966.
  11. Pepley DF, Gordon AB, Yovanoff MA, Mirkin KA, Miller SR, Han DC, & Moore JZ (2017). Training surgical residents with a haptic robotic central venous catheterization simulator. Journal of Surgical Education, 74(6), 1066–1073.
  12. Raad I (1998). Intravascular-catheter-related infections. Lancet, 351(9106), 893.
  13. Real-World Mapping. (2018). Retrieved from https://www.tobiipro.com/learn-and-support/learn/steps-in-an-eye-tracking-study/data/real-world-mapping/
  14. Ritter FE, & Schooler LJ (2001). The learning curve. International Encyclopedia of the Social and Behavioral Sciences, 13, 8602–8605.
  15. Sherertz RJ, Ely EW, Westbrook DM, Gledhill KS, Streed SA, Kiger B, … Cruz J (2000). Education of physicians-in-training can decrease the risk for vascular catheter infection. Annals of Internal Medicine, 132(8), 641–648.
  16. Tobii Pro Glasses 2. (2018). Retrieved from https://www.tobiipro.com/product-listing/tobii-pro-glasses-2/
  17. Yovanoff M, Pepley D, Mirkin K, Moore J, Han D, & Miller S (2017). Personalized learning in medical education: Designing a user interface for a dynamic haptic robotic trainer for central venous catheterization. Paper presented at the Proceedings of the Human Factors and Ergonomics Society Annual Meeting, Austin, TX.
