Author manuscript; available in PMC: 2024 Jul 1.
Published in final edited form as: Otolaryngol Head Neck Surg. 2024 Mar 15;171(1):188–196. doi: 10.1002/ohn.714

Tremor Assessment in Robot-Assisted Microlaryngeal Surgery Using Computer Vision-Based Tool Tracking

Sue M Cho 1,*, Henry H Joo 2,*, Pranathi Golla 3, Manish Sahu 1, Ahjeetha Shankar 2, Danielle R Trakimas 2, Francis Creighton 2, Lee Akst 2, Russell H Taylor 1, Deepa Galaiya 2
PMCID: PMC11211051  NIHMSID: NIHMS1986154  PMID: 38488231

Abstract

Objective.

Use microscopic video-based tracking of laryngeal surgical instruments to investigate the effect of robot assistance on instrument tremor.

Study Design.

Experimental trial.

Setting.

Tertiary Academic Medical Center.

Methods.

In this randomized cross-over trial, 36 videos were recorded from 6 surgeons performing left and right cordectomies on cadaveric pig larynges. These recordings captured 3 distinct conditions: without robotic assistance, with robot-assisted scissors, and with robot-assisted graspers. To assess tool tremor, we employed computer vision-based algorithms for tracking surgical tools. Absolute tremor bandpower and normalized path length were utilized as quantitative measures. Wilcoxon rank sum exact tests were employed for statistical analyses and comparisons between trials. Additionally, surveys were administered to assess the perceived ease of use of the robotic system.

Results.

Absolute tremor bandpower showed a significant decrease when using robot-assisted instruments compared to freehand instruments (P = .012). Normalized path length significantly decreased with robot-assisted compared to freehand trials (P = .001). For the scissors, robot-assisted trials resulted in a significant decrease in absolute tremor bandpower (P = .002) and normalized path length (P < .001). For the graspers, there was no significant difference in absolute tremor bandpower (P = .4), but there was a significantly lower normalized path length in the robot-assisted trials (P = .03).

Conclusion.

This study demonstrated that computer-vision-based approaches can be used to assess tool motion in simulated microlaryngeal procedures. The results suggest that robot assistance is capable of reducing instrument tremor.

Keywords: nnU-Net, robotic surgery, surgical tool tracking, tremor


Transoral microlaryngeal surgery is a minimally invasive approach for laryngo-pharyngeal cancers and lesions.1 Like many procedures in otolaryngology, microlaryngeal surgery is performed within the tight confines of the laryngeal vault, demanding controlled, precise motion of long tools in a narrow channel for safe and desirable patient outcomes. Even minor instrument tremor can result in injury to the vocal folds, leading to permanent vocal damage, particularly in patients with abnormal anatomy and in pediatric patients with smaller airways.2

In response to these challenges, a novel robotic ear, nose, and throat microsurgery system (REMS) was developed, which incorporates cooperative control of microsurgical instruments. In this system, the robot arm grasps the instrument shaft adjacent to the handle used by the surgeon, allowing surgeon-robot collaboration during the microsurgical procedure. This contrasts with systems such as the da Vinci robot, where the surgeon operates the surgical robot remotely and is not in the surgical field.3 The Galen ES system (Galen Robotics), based on the REMS,4 recently received Food and Drug Administration approval for laryngeal surgery [regulation number 874.4460].

The potential of REMS to improve precision in simulated laryngoscopy surgical tasks was demonstrated in a preliminary study.5 However, that study’s primary limitation was its reliance on simulated tasks, which may not generalize to actual laryngeal procedures. The next step is to use a higher-fidelity cadaver model to compare tremor between robot-assisted and freehand surgery, which would give a clearer picture of the benefits of robot assistance for tremor reduction during microlaryngeal procedures.

Conventionally, surgical performance has been assessed subjectively, which is both time-intensive and prone to several biases. In recent years, there have been several advancements in automating evaluation of surgical skills, covered in a systematic review by Levin et al.6 Previously, automated analysis of surgical tool motion using electromagnetic trackers has been pioneered in several surgical settings. For example, Smith et al demonstrated that motion tracking offers valuable insights for assessing laparoscopic dexterity.7 Others have successfully validated hand-motion analysis (HMA) as an objective measure of surgical skill in the operating room,8–10 building the foundation for further advancements in surgical tool motion studies.

In this study, we use a computer-vision-based automated tool-tracking algorithm to quantify surgical tool motion. Unlike systems that rely on physical tracking hardware such as electromagnetic trackers, computer vision-based methods can analyze intraoperative videos, which are easily captured using the modern surgical microscopes common to many procedures in otolaryngology. Automated tool motion analysis of surgical videos has also been reported to offer improved accuracy and rapid feedback compared to physical trackers.6

Our study uses these computer vision-based methods in surgical tool tracking to characterize the effects of the robotic microsurgery platform, Galen ES version of the REMS system, on tremor and tool motion. The primary objective of this research is to quantitatively assess tremor reduction during a microlaryngeal procedure with and without robotic assistance using a high-fidelity cadaveric pig model and a computer vision-based algorithm. By comparing the tremor during freehand and robot-assisted cordectomies, this investigation could offer critical insights into the potential advantages of robot-assisted microlaryngeal surgery.

Methods

Study Design

This study was approved by the Johns Hopkins Institutional Review Board (HIRB00001598). Six otolaryngologists, comprising 3 residents and 3 attendings, were recruited for this study. Cadaveric porcine larynges were mounted on a 3D-printed laryngoscope and viewed through a surgical microscope (Haag-Streit) equipped with stereo vision cameras for video recording (Figure 1).

Figure 1.

Experimental setup. Top and bottom-left show the experimental setup, including the 3D-printed laryngoscope. Bottom-right shows example frame of video data captured by the surgical microscope.

In the first phase of the study, participants were instructed on how to perform a simulated cordectomy procedure using forward-action laryngeal graspers and curved laryngeal scissors (Integra) on the left and right false vocal folds. Participants adjusted their chair, table, armrests, and microscope settings to achieve optimal ergonomics and limb support. Then, they performed the cordectomy procedure on both sides without robotic assistance. Subsequently, they were given a tutorial on the Galen ES robot followed by a hands-on session, to ensure they were adequately familiar with the system’s functionalities. The Galen ES robotic arm has a universal mount with custom adapters that can accommodate a variety of instruments, including both the graspers and scissors. Participants performed left and right cordectomies using a combination of robot-assisted scissors with freehand graspers and robot-assisted graspers with freehand scissors, for a total of 4 trials (2 on each side). The trial order within each of the 2 phases was randomized. Before and after the experiment, participant information and feedback were collected via written surveys including Likert scales (Supplemental surveys 1 and 2, available online).

Surgical Tool Tracking

All freehand and robot-assisted trials were recorded at 25 frames per second using a surgical microscope. The recorded videos were annotated to outline the instruments and to isolate specific intervals of operation. These annotations served as training data for nnU-Net, a deep-learning segmentation method.11 The model’s inferences were used as detections and integrated with the OpenCV CSRT tracker (discriminative correlation filter with channel and spatial reliability).12 The vision-based tracking algorithm generated frame-by-frame coordinates for the joints of the graspers and scissors (Figure 2). The power spectral density (PSD) of the instrument acceleration was calculated using the pwelch function in MATLAB (version R2022b; MathWorks).8 The absolute tremor bandpower was then computed by approximating the area under the curve (AUC) of the PSD plot between 8 and 12 Hz (Figure 3), as this range has been shown to represent intraoperative physiological tremor.9 The normalized path length of the instrument was calculated by summing the Euclidean distances between consecutive points along the path (Figure 4) and dividing by the elapsed time.

Figure 2.

Representative frame of tracked tool joints. The image shows the automated tool detection of tracked joints in a left cordectomy. Left instrument: scissors. Right instrument: graspers.

Figure 3.

Representative process for absolute tremor bandpower calculation. This figure displays representative plots of displacement, acceleration, and power spectral density (PSD) for absolute bandpower calculation.
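The displacement, acceleration, and PSD steps can be illustrated with a minimal Python sketch. It is not the authors' MATLAB pipeline: the function names, the Hann window, the 128-sample segment length, the 50% overlap, the second-difference acceleration estimate, and the trapezoidal band integration are all assumptions chosen for illustration, and the sketch handles one coordinate axis at a time.

```python
import cmath
import math

def second_difference(pos, fs):
    """Approximate acceleration (units/s^2) from evenly sampled positions."""
    return [(pos[i + 1] - 2 * pos[i] + pos[i - 1]) * fs * fs
            for i in range(1, len(pos) - 1)]

def welch_psd(signal, fs, seg_len=128):
    """One-sided Welch PSD (Hann window, 50% overlap); seg_len must be even."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / seg_len)
              for n in range(seg_len)]
    scale = fs * sum(w * w for w in window)  # density normalization
    n_bins = seg_len // 2 + 1
    starts = list(range(0, len(signal) - seg_len + 1, seg_len // 2))
    if not starts:
        raise ValueError("signal is shorter than one segment")
    psd = [0.0] * n_bins
    for start in starts:
        tapered = [v * w for v, w in zip(signal[start:start + seg_len], window)]
        for k in range(n_bins):
            # Direct DFT of the windowed segment at bin k.
            coeff = sum(v * cmath.exp(-2j * math.pi * k * n / seg_len)
                        for n, v in enumerate(tapered))
            # Double every bin except DC and Nyquist for a one-sided spectrum.
            one_sided = 1.0 if k in (0, seg_len // 2) else 2.0
            psd[k] += one_sided * abs(coeff) ** 2 / scale
    freqs = [k * fs / seg_len for k in range(n_bins)]
    return freqs, [p / len(starts) for p in psd]

def band_power(freqs, psd, low=8.0, high=12.0):
    """Trapezoidal area under the PSD between low and high (Hz)."""
    idx = [i for i, f in enumerate(freqs) if low <= f <= high]
    return sum(0.5 * (psd[a] + psd[b]) * (freqs[b] - freqs[a])
               for a, b in zip(idx, idx[1:]))

def tremor_bandpower(positions_mm, fs=25.0):
    """Absolute 8-12 Hz bandpower of acceleration for one coordinate axis."""
    accel = second_difference(positions_mm, fs)
    freqs, psd = welch_psd(accel, fs)
    return band_power(freqs, psd)
```

With positions in millimeters, the result is in (mm/s²)². At 25 frames per second the Nyquist frequency is 12.5 Hz, so the 8 to 12 Hz band sits just inside the measurable range.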

Figure 4.

Representative instrument pathing. This graph shows the path of scissors in a left cordectomy. Shading denotes progression over the course of the procedure (from light start to dark end).
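The normalized path length of such a tracked path can be sketched in a few lines of Python. The function name and the assumption that points are (x, y) pairs in millimeters sampled at a fixed frame rate are illustrative and are not taken from the study's code.

```python
import math

def normalized_path_length(points_mm, fs=25.0):
    """Total path length divided by elapsed time (mm/s).

    points_mm: sequence of (x, y) positions in mm, one per video frame.
    fs: frame rate in frames per second (assumed constant).
    """
    if len(points_mm) < 2:
        raise ValueError("need at least two tracked points")
    # Sum of Euclidean distances between consecutive tracked points.
    path = sum(math.dist(p, q) for p, q in zip(points_mm, points_mm[1:]))
    # N points span N - 1 frame intervals.
    elapsed = (len(points_mm) - 1) / fs
    return path / elapsed
```

For example, four points tracing two 5 mm steps and one stationary frame at 25 frames per second cover 10 mm in 0.12 s.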

Statistical Analysis

Wilcoxon rank sum exact tests were utilized to analyze the impact of experience levels and robot assistance. All statistical analyses were performed in R (version 4.2.1), and statistical significance was set to P < .05.
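For readers unfamiliar with the exact form of the rank sum test, the following Python sketch shows how a two-sided exact P value can be obtained by counting rank subsets. It is an illustration rather than the R routine used in the study: the function name is hypothetical, tied values are not handled, and the two-sided P value is taken as twice the smaller tail probability, capped at 1.

```python
from math import comb

def rank_sum_exact_p(x, y):
    """Two-sided exact Wilcoxon rank sum P value; assumes no tied values."""
    pooled = sorted(list(x) + list(y))
    if len(set(pooled)) != len(pooled):
        raise ValueError("this sketch assumes no ties")
    rank = {v: i + 1 for i, v in enumerate(pooled)}
    n1, n = len(x), len(pooled)
    w = sum(rank[v] for v in x)  # observed rank sum of the first sample
    max_sum = sum(range(n - n1 + 1, n + 1))
    # ways[k][s]: number of k-element subsets of ranks 1..n that sum to s.
    ways = [[0] * (max_sum + 1) for _ in range(n1 + 1)]
    ways[0][0] = 1
    for r in range(1, n + 1):
        # Descending k ensures each rank is used at most once per subset.
        for k in range(min(r, n1), 0, -1):
            for s in range(max_sum, r - 1, -1):
                ways[k][s] += ways[k - 1][s - r]
    total = comb(n, n1)
    lower = sum(ways[n1][: w + 1]) / total  # P(W <= w)
    upper = sum(ways[n1][w:]) / total       # P(W >= w)
    return min(1.0, 2 * min(lower, upper))
```

The dynamic program avoids enumerating every subset, which matters for group sizes such as 20 versus 9, where there are about 10 million possible assignments.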

Results

The average absolute tremor bandpower for freehand trials, at 559.4 (mm/s²)², was significantly larger than that of robot-assisted trials, at 311.8 (mm/s²)² (P = .012). Likewise, the normalized path length was higher in freehand (5.7 mm/s) than in robot-assisted trials (P = .001) (Figure 5). The procedure completion time was similar between the freehand trials (mean = 24 s, SD = 14 s) and robot-assisted trials (mean = 28 s, SD = 14 s) (P = .4).

Figure 5.

Freehand versus robot-assisted. Boxplots comparing robot assistance for absolute tremor bandpower and normalized path length. Rows comprise all data, scissors, and graspers, respectively (dots: mean, *P < .05, **P < .01, ***P < .001).

The results were then stratified by instrument type. For the scissors, robot-assisted trials resulted in a significant decrease in absolute tremor bandpower (P = .002) and normalized path length (P < .001). For the graspers, there was no significant difference in absolute tremor bandpower (P = .400), but there was a significantly lower normalized path length in the robot-assisted trials (P = .030).

The scissors were further stratified based on experience level and hand dominance (Figure 6). The resident group demonstrated a significant reduction in absolute tremor bandpower (P = .001) when using robot-assisted scissors compared to freehand scissors. However, the attending group did not show the same degree of improvement from freehand to robot-assistance (P = .200). Additionally, both the attending (P = .040) and resident (P = .010) groups had a statistically significant decrease in normalized path length when using robot-assisted scissors compared to freehand scissors. When examining the effect of hand dominance, the improvement in absolute tremor bandpower was only significant in the nondominant hand with the addition of robotic assistance (P = .020), but not in the dominant hand (P = .300). However, for normalized path length, both the dominant (P = .048) and nondominant trials (P = .020) experienced a significant decrease with the addition of robotic assistance (Table 1).

Figure 6.

Stratified freehand versus robot-assisted analysis. Boxplots comparing robot assistance in scissors for absolute tremor bandpower, stratified by expertise level and hand dominance (dots: mean, *P < .05, **P < .01, ***P < .001).

Table 1.

Stratified Analyses

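Subset of trials | Condition | Absolute tremor bandpower, (mm/s²)² | Normalized path length, mm/s
Graspers | No robot, N = 20 | 329 (217) | 4.25 (1.08)
Graspers | Robot, N = 9 | 317 (326) | 3.31 (0.76)
Graspers | P value | .4 | .03*
Graspers: residents | No robot, N = 10 | 261 (149) | 4.2 (1.0)
Graspers: residents | Robot, N = 5 | 416 (411) | 3.5 (0.8)
Graspers: residents | P value | >.9 | .3
Graspers: attendings | No robot, N = 10 | 397 (259) | 4.3 (1.2)
Graspers: attendings | Robot, N = 4 | 192 (147) | 3.1 (0.7)
Graspers: attendings | P value | 397 (259) | .1
Graspers: nondominant hand | No robot, N = 9 | 431 (239) | 5.07 (0.86)
Graspers: nondominant hand | Robot, N = 6 | 414 (367) | 3.63 (0.70)
Graspers: nondominant hand | P value | .7 | .005**
Graspers: dominant hand | No robot, N = 11 | 245 (164) | 3.6 (0.7)
Graspers: dominant hand | Robot, N = 3 | 122 (53) | 2.7 (0.4)
Graspers: dominant hand | P value | .13 | .06
Scissors | No robot, N = 20 | 790 (477) | 7.2 (2.1)
Scissors | Robot, N = 9 | 307 (213) | 4.4 (0.7)
Scissors | P value | .002** | <.001***
Scissors: residents | No robot, N = 11 | 832 (490) | 6.8 (1.4)
Scissors: residents | Robot, N = 4 | 185 (26) | 4.4 (0.4)
Scissors: residents | P value | .001** | .01*
Scissors: attendings | No robot, N = 9 | 739 (485) | 7.7 (2.8)
Scissors: attendings | Robot, N = 5 | 404 (252) | 4.4 (0.9)
Scissors: attendings | P value | .2 | .04*
Scissors: nondominant hand | No robot, N = 8 | 825 (588) | 6.6 (2.2)
Scissors: nondominant hand | Robot, N = 6 | 235 (165) | 4.1 (0.5)
Scissors: nondominant hand | P value | .020* | .02*
Scissors: dominant hand | No robot, N = 12 | 767 (415) | 7.6 (2.1)
Scissors: dominant hand | Robot, N = 3 | 450 (259) | 5.0 (0.6)
Scissors: dominant hand | P value | .3 | .048*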

Rank sum test results comparing subsets of steadiness measurements between freehand (“No robot”) and robot-assisted (“Robot”) cases. Cells are populated with mean (standard deviation).

*P < .05. **P < .01. ***P < .001.

When asked about the clinical utility of Galen ES, all participants reported that they could see a system like Galen ES assisting with procedures in the foreseeable future (Supplemental Table S1, available online). Notably, half of the participants believed that the cooperative control of Galen ES would be particularly advantageous when used in the nondominant hand. Half of the participants also indicated a preference for robot-assistance for the graspers over scissors. In terms of usability, Galen ES had a positive reception, with an average rating of 3.7 out of 5.0. Nonetheless, some concerns were expressed in the participants’ comments. The most frequently voiced concerns pertained to diminished haptic feedback and limited degrees of rotational freedom.

Discussion

This experimental study evaluated the impact of Galen ES on surgical tremor and tool motion in laryngeal surgery utilizing a computer vision-based method for automatically tracking surgical instruments on microscopic videos. In the cordectomy procedure highlighted in this study, robotic assistance was most likely to reduce tremor when participants were using scissors, operating with the nondominant hand, or if they were a trainee.

The economy of motion analysis in our study allowed for novel quantitative tremor assessments. While other studies have shown some of the benefits of REMS using simple simulation models,1,2,13 our approach expanded on these findings by adding a quantitative tremor analysis without any additional hardware. When considering the impact of the Galen ES version of the REMS system, mean tremor was lower with robot assistance for both instruments, although the reduction was statistically significant only for the scissors. The more pronounced tremor reduction in scissors likely reflects the baseline discrepancy in freehand tremor between the 2 instruments. We found that scissors have a higher baseline amount of tremor and normalized path length in freehand surgery (without robotic assistance) compared to graspers (Figure 7); this is expected given that graspers are often held more fixed as retractors during surgery, while the scissors have relatively more dynamic motion during cordectomy. Subanalyses by surgeon experience level revealed that resident surgeons exhibited a greater reduction in absolute tremor bandpower with robot-assisted scissors compared to freehand than attending surgeons did. This suggests that the robot may be particularly useful for instrument stabilization for early-career surgeons and trainees. Subanalyses by hand dominance also revealed that tremor reduction is more pronounced in the nondominant hand. This suggests that Galen ES may be useful to surgeons during ambidextrous maneuvers, which was supported by subjective survey comments at the end of the study.

Figure 7.

Freehand versus robot-assisted average PSD (power spectral density). Plots comparing average power spectral density in freehand and robot assistance. Rows comprise all data, scissors, and graspers, respectively.

There was a decrease in normalized path length with the use of the robot. Calculated as instrument travel divided by procedure time, normalized path length is meant to be an indirect measure of motion economy. The lower normalized path lengths in the robot-assisted trials may suggest that robot assistance limited unnecessary instrument travel. As normalized path length is also related to average instrument speed, however, it is possible that the robot may undesirably restrict high-speed movements, which could ultimately prolong operating time. This concern was supported by survey results, in which some participants mentioned difficulty moving the instruments due to a limited range of motion. This may be a limitation to the Galen ES, but it is also possible that increased familiarity with the robot can mitigate these concerns. While the time to task completion was not significantly different between freehand and robot-assisted trials in the present study, the robot’s impact on surgical efficiency should be carefully studied when the Galen ES is implemented in the operating room. Future patient-based studies evaluating Galen ES should also consider other important metrics, such as patient outcomes.
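The link between normalized path length and average instrument speed can be made explicit with a short derivation. The symbols are introduced here for illustration and assume N tracked positions p_1, ..., p_N sampled at a constant frame interval Δt, so that the elapsed time is T = (N - 1)Δt.

```latex
\[
\mathrm{NPL}
  = \frac{1}{T}\sum_{i=1}^{N-1} \lVert p_{i+1} - p_i \rVert
  = \frac{1}{(N-1)\,\Delta t}\sum_{i=1}^{N-1} \lVert p_{i+1} - p_i \rVert
  = \frac{1}{N-1}\sum_{i=1}^{N-1} \frac{\lVert p_{i+1} - p_i \rVert}{\Delta t}
\]
```

The last expression is the mean of the frame-to-frame speeds, so under these assumptions a lower normalized path length is equivalent to a lower average instrument speed over the procedure.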

An intriguing discrepancy arose from the qualitative feedback: half of the study participants preferred robot assistance with graspers over scissors. This contrasts with the quantitative data, in which scissors showed a more pronounced reduction in tremor bandpower. While users perceive robotic assistance for graspers to be more beneficial, the objective data suggest that scissors derive greater advantages from robotic assistance during routine surgery, particularly given the pronounced tremor in freehand cutting. Tremor suppression may therefore not be the top priority for surgeons. Such insights are important for those designing surgical robots, as user feedback should be considered alongside quantitative improvements.

Computer vision approaches have been pioneered in other surgical specialties and have several advantages, including increased objectivity and scalability. Several researchers have leveraged Deep Neural Networks (DNNs) to objectively assess surgical skills.14 For example, Lavanchy et al developed a 3-stage machine learning algorithm for automating surgical skills assessment in laparoscopic cholecystectomy videos, achieving an accuracy of 87% in distinguishing between good and poor surgical skill levels compared to expert evaluators.15 Most similar to our study, Conroy et al demonstrated a video-based surgical skills assessment for laryngeal procedures.13 However, our methodology offers a few key improvements, including the use of cadaveric pig larynges instead of an artificial target and the use of the surgical microscope video feed instead of additional camera hardware or tracking equipment.

Our current analysis is grounded in a 2-dimensional image space, approximated in millimeters. This means that the tracking framework in the present study could also be applied to endoscopic and plain video data. Future work could also utilize the bifocal nature of surgical microscopes to expand our analysis to a 3-dimensional framework, providing deeper insights into intricate procedures that involve significant depth movement.

This study had several limitations. Our instrument tracking pipeline is contingent upon the quality of recordings, requiring well-focused and stable footage devoid of occlusions. Recordings failing to meet these criteria led to unsuccessful tracking, resulting in the exclusion of a portion of our dataset (7 out of 36 videos). Although this exclusion is not inconsequential, the data derived from the remaining sample were sufficiently representative and yielded statistically significant results. However, this reduced sample size limits the generalizability of our conclusions, and conducting a larger-scale study could broaden the applicability of our findings. Furthermore, the nature of the experiments prohibited blinding participants and creating a placebo condition with the robot powered off. Finally, the configuration of the laryngoscope, lack of anatomical structures surrounding the larynx, and simplified procedure task differ from true operating conditions, so the present findings may not be generalizable to actual surgery.

Conclusion

Employing a computer vision-based tremor assessment of microscopic videos, we evaluated the effects of Galen ES on instrument stability during a cordectomy procedure in a cadaveric porcine model. Robot assistance reduced the absolute tremor bandpower, especially for trainee surgeons. Further, robot assistance decreased normalized path length, possibly suggesting improved economy of motion. The findings of this study demonstrate the potential of Galen ES to bridge gaps in experience, handedness and dexterity. Future applications include using the assessment methodology to offer constructive feedback to surgeons-in-training and further improving intraoperative performance and usability of Galen ES.

Supplementary Material

Survey questions

Competing interests:

The technology described herein is currently under a license agreement between Galen Robotics, Inc., and Johns Hopkins University. Dr. Taylor is entitled to royalty distributions on the aforementioned technology. Dr. Taylor is a paid consultant to and owns equity in Galen Robotics, Inc. These arrangements have been reviewed and approved by the Johns Hopkins University in accordance with its conflict of interest policies. The authors have no other funding, financial relationships, or conflicts of interest to disclose.

Footnotes

This article will be presented at the AAO-HNSF 2023 Annual Meeting & OTO Experience; October 2023; Nashville, Tennessee.

Supplemental Material

Additional supporting information is available in the online version of the article.

References

1. Tateya I, Shiotani A, Satou Y, et al. Transoral surgery for laryngo-pharyngeal cancer - the paradigm shift of the head and cancer treatment. Auris Nasus Larynx. 2016;43(1):21–32. doi: 10.1016/j.anl.2015.06.013
2. Holzki J, Laschat M, Puder C. Iatrogenic damage to the pediatric airway mechanisms and scar development. Pediatric Anesthesia. 2009;19(s1):131–146. doi: 10.1111/j.1460-9592.2009.03003.x
3. Feng AL, Razavi CR, Lakshminarayanan P, et al. The robotic ENT microsurgery system: a novel robotic platform for microvascular surgery. Laryngoscope. 2017;127(11):2495–2500. doi: 10.1002/lary.26667
4. Olds KC. Robotic Assistant Systems for Otolaryngology-Head and Neck Surgery. Dissertation. Johns Hopkins University; 2015. https://jscholarship.library.jhu.edu/bitstream/handle/1774.2/37927/OLDS-DISSERTATION-2015.pdf
5. Akst LM, Olds KC, Balicki M, Chalasani P, Taylor RH. Robotic microlaryngeal phonosurgery: testing of a “steady-hand” microsurgery platform. Laryngoscope. 2018;128(1):126–132. doi: 10.1002/lary.26621
6. Levin M, McKechnie T, Khalid S, Grantcharov TP, Goldenberg M. Automated methods of technical skill assessment in surgery: a systematic review. J Surg Educ. 2019;76(6):1629–1639. doi: 10.1016/j.jsurg.2019.06.011
7. Smith SGT, Torkington J, Brown TJ, Taffinder NJ, Darzi A. Motion analysis. Surg Endosc. 2002;16(4):640–645. doi: 10.1007/s004640080081
8. Ahmidi N, Poddar P, Jones JD, et al. Automated objective surgical skill assessment in the operating room from unstructured tool motion in septoplasty. Int J Comput Assist Radiol Surg. 2015;10(6):981–991. doi: 10.1007/s11548-015-1194-1
9. Grober ED, Roberts M, Shin EJ, Mahdi M, Bacal V. Intraoperative assessment of technical skills on live patients using economy of hand motion: establishing learning curves of surgical competence. Am J Surg. 2010;199(1):81–85. doi: 10.1016/j.amjsurg.2009.07.033
10. Dosis A. Synchronized video and motion analysis for the assessment of procedures in the operating theater. Arch Surg. 2005;140(3):293–299. doi: 10.1001/archsurg.140.3.293
11. Isensee F, Jaeger PF, Kohl SAA, Petersen J, Maier-Hein KH. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat Methods. 2021;18(2):203–211. doi: 10.1038/s41592-020-01008-z
12. Lukežič A, Vojíř T, Čehovin Zajc L, Matas J, Kristan M. Discriminative correlation filter tracker with channel and spatial reliability. Int J Comput Vis. 2018;126(7):671–688. doi: 10.1007/s11263-017-1061-3
13. Conroy E, Surender K, Geng Z, Chen T, Dailey S, Jiang J. Video-based method of quantifying performance and instrument motion during simulated phonosurgery. Laryngoscope. 2014;124(10):2332–2337. doi: 10.1002/lary.24724
14. Yanik E, Intes X, Kruger U, et al. Deep neural networks for the assessment of surgical skills: a systematic review. J Defense Model Simul Appl Methodol Technol. 2022;19(2):159–171. doi: 10.1177/15485129211034586
15. Lavanchy JL, Zindel J, Kirtac K, et al. Automation of surgical skill assessment using a three-stage machine learning algorithm. Sci Rep. 2021;11:5197. doi: 10.1038/s41598-021-84295-6
