Abstract
Purpose
This article investigates the current state of the art in the use of auditory display in image-guided medical interventions. Auditory display is a means of conveying information using sound, and we review the use of this approach to support navigated interventions. We discuss the benefits and drawbacks of published systems and outline directions for future investigation.
Methods
We undertook a review of scientific articles on the topic of auditory rendering in image-guided intervention. This includes methods for avoidance of risk structures and instrument placement and manipulation. The review did not include auditory display for status monitoring, for instance in anesthesia.
Results
We identified 13 publications in the course of the search. Most of the literature (62%) investigates the use of auditory display to convey the distance of a tracked instrument to an object using proximity or safety margins. The remainder discuss continuous guidance for navigated instrument placement. Four of the articles present clinical evaluations, 9 present laboratory evaluations, and 3 present informal evaluations (3 present both laboratory and clinical evaluations).
Conclusion
Auditory display is a growing field that has been largely neglected in research in image-guided intervention. Despite benefits of auditory displays reported in both the reviewed literature and non-medical fields, adoption in medicine has been slow. Future challenges include increasing interdisciplinary cooperation with auditory display investigators to develop more meaningful auditory display designs and comprehensive evaluations which target the benefits and drawbacks of auditory display in image guidance.
Keywords: Auditory display, Image-guided Interventions, Human-Computer Interaction
1 Introduction
Modern medical image-guided interventions depend on reliable access to patient data to ensure a successful procedure. Navigated interventions typically employ virtual images of planning data overlaid on images of patient anatomy to aid surgeons during the procedure, for instance, to view the location of a tracked instrument in relation to patient anatomy, to locate the target site, or to become better aware of the locations of risk structures or objects of interest. The field of image-guided interventions has grown greatly over the last 20 years thanks to progress in medical imaging and computing technology. For an overview of image-guided intervention technology and clinical applications, see Cleary and Peters [11]. Using image guidance during an intervention, clinicians can access important information that was previously unavailable, typically on a computer screen placed in the operating room.
However, despite the benefits of image guidance in medicine, displaying information on a screen is sometimes not ideal [25] and alternatives to traditional computer screens are being researched. Clinicians must remove their view from the operating situs to receive information [17], meaning important notifications might not be perceived [39], and 3-D information might not be correctly interpreted [5]. To remedy some of the deficiencies inherent in visual display, the relatively new field of auditory display [23] presents an interesting possibility for image-guided medicine. Auditory display harnesses sound to present information; changes in a data source can be mapped to parameters in a sound synthesizer so that a user can hear information as opposed to viewing it on a screen. This article presents a review of the literature on the use of auditory display in image-guided interventions, covering applications such as volumetric resection, telerobotic suture tying, resection path guidance, needle placement, and temporal bone drilling. Various motivations, auditory display approaches, and evaluation results are presented and discussed, and the primary problems and future trends in auditory display in image guidance are outlined.
2 Methods
2.1 Literature Search
A search of the literature was performed using a combination of the following search terms: ‘auditory display,’ ‘image guidance,’ ‘sonification,’ ‘image-guided navigation,’ and ‘auditory navigation.’ We performed forward and backward searches using PubMed and Google Scholar for related literature. Cited, citing, and similar articles that satisfied the eligibility criteria (see below) were thus included in the review. We did not include language restrictions in the search. Literature search and review were performed by two independent reviewers (DB, CH).
2.2 Eligibility Criteria
We considered literature for inclusion that included auditory display as an integral part of navigated image-guided intervention support. These included scientific articles that described laboratory prototypes, clinical applications of auditory display, and detailed, published descriptions of systems yet to be evaluated in a laboratory or clinical setting. We excluded literature that focused on sound as a means of continuous interventional process monitoring, such as basic warning alarms or auditory displays used in anesthesia. Although important for the success of an intervention, sound in the context of continuous process monitoring is not used for the direct task of image-guided navigation. For detailed information on auditory display for monitoring in anesthesia, please see Sanderson et al. [31].
2.3 Data extraction
During the search process, we retrieved articles meeting the aforementioned eligibility criteria for further assessment. The following information was extracted from each article:
– Interventional tasks to be supported by auditory display
– Motivations for including auditory display for navigational support
– Auditory display methods employed
– Evaluation designs and findings
– Clinical considerations and discussion specifically concerning the use of auditory display
3 Results
The search yielded 13 articles [4,5,9,10,12,17,22,34,35,37–40] that met the eligibility criteria. The eligible articles cover a wide range of interventional tasks, implemented auditory display methods, evaluation styles and environments, and findings.
3.1 Interventional Tasks Supported by Auditory Display
The selection of literature reveals a broad spectrum of interventional tasks supported by auditory display, see Table 1. Four of the 13 articles concern needle placement: Wegner and Karron [37], Wegner [38], Black et al. [4], and Bork et al. [5]. Specifically, Wegner [38] explores a generalized instrument placement task with a tracked drawing device that is meant to aid “… a procedure requiring an insertion trajectory.” Black et al. [4] explore radiofrequency needle ablation targeting lesions, and Bork et al. [5] support needle biopsy targeting lesions.
Table 1.
Overview of clinical applications and auditory display approaches in image-guided interventions
| Article | Clinical Application | Auditory Display Method |
|---|---|---|
| Proximity Alerts | | |
| Kitagawa et al. (2005) [22] | Sensory substitution for forces during robotic suture ties | Single tone when desired manual tension of suture tie has been reached |
| Willems et al. (2005) [39] | Volume resection during frameless stereotaxy neuronavigation | Safety margin tone: 510 Hz ‘pure tone’, 2 per second. System error tone of 870 Hz, 1 per second. |
| Woerdeman et al. (2009) [40] | Volume resection during frameless stereotaxy neuronavigation | Safety margin tone: ‘pure tone’, 3 per second, increasing frequency and volume until tumor reached. System error tone. |
| Strauss et al. (2010) [34] | Endo- and transnasal surgery | Safety margin tone for risk structures as part of collision-warning system |
| Voormolen et al. (2012) [35] | Temporal bone drilling for target access during neuronavigation | Safety margin tone for drill tip inside margins, see also [40] |
| Cho et al. (2013) [9] | Protecting facial nerve during otologic surgery by monitoring safe region | Safety margin tones between drill tip and surface of facial nerve corresponding to 300, 600, and 900 Hz |
| Cho et al. (2014) [10] | Guiding cochlear implantation | Safety margin tones for absolute distances as well as relative distances between the risk structures scala vestibuli and scala tympani with tones of 300, 600, and 900 Hz |
| Dixon et al. (2014) [12] | Endoscopic cranial base surgery with virtual endoscopy | Safety margin tone played when drill tip inside margins. Auditory icons used to distinguish dura and carotid arteries. |
| Continuous Auditory Display | | |
| Wegner et al. (1997–8) [37,38] | Generalized 3-D medical instrument placement | Various recommendations: beat interference using harmonic structures, 3-D audio spatialization, distance-based spherical triggers, wave-terrain synthesis |
| Hansen et al. (2013) [17] | Path marking in open liver resection with surgical ultrasound dissector | Parametric auditory display encodes distance towards resection path, left/right sides encoded, confirmation tone when placement correct |
| Black et al. (2013) [4] | Radiofrequency ablation needle guidance | Parameter-mapping auditory display with two-dimensional distance encoding for needle tip and handle using inter-onset interval and alternating pitch comparison |
| Bork et al. (2015) [5] | Needle biopsy to target virtual lesions | Temporal distance coding: repeating metronome-like tone for virtual sphere propagating in space, bell tone when sphere reaches object of interest |
Four of the articles support temporal bone drilling applications. Cho et al. first support monitoring the “distance between the drill tip and important organs,” [9] and later [10] support monitoring the distance between the drill tip and facial nerve, as well as the distance between the scala vestibuli and scala tympani. Voormolen et al. [35] support monitoring the distance between the drill tip and the facial nerve and sigmoid sinus. Dixon et al. [12] implement auditory display alerts for monitoring the distance of the drill to dura and carotid arteries in skull base surgery.
Three of the articles describe aids for tissue resection. Willems et al. [39] and Woerdeman et al. [40] support volumetric lesion resection for neuronavigation. Hansen et al. [17] develop an auditory display for open liver surgery for transferring a preoperatively planned initial resection path onto the surface of the liver.
Finally, Kitagawa et al. [22] implement auditory display as a means of sensory substitution for the loss of haptic feedback encountered when performing suture manipulation during telerobotic surgery. Strauss et al. [34] support functional endoscopic surgery of the paranasal sinuses with a collision warning system for monitoring distance of the surgical instrument to the frontal skull base, lamina papyracea, and internal carotid artery.
3.2 Clinical Motivations for Exploring Auditory Display
The motivations for developing auditory display to aid image-guided interventions arise from shortcomings in traditional image-guided intervention methods described in the literature. Three primary motivations named in the literature include:
– increasing awareness of structures surrounding the tracked instrument
– reducing attention to the screen or increasing attention to the patient or test phantom
– helping clinicians correctly interpret (multidimensional) navigation data.
Aiming to improve clinician interaction with visual displays and change view behavior was also cited in much of the literature [4,9,10,17,34,35,37,38]. Investigators commented on the necessity of clinicians to draw attention away from the situs in order to view the navigation screen. Willems et al. [39] argue that “to appreciate the visual information offered … the surgeons’ attention (the visual focus) will need to be drawn away from the actual surgery. This will result in the images being used only at intervals chosen by the surgeon.” Hansen et al. [17] state that “the navigation system needs to be frequently consulted by surgeons, which leads to increased mental load and time pressure during surgery. The surgeon’s attention to the working area is interrupted by viewing the navigation system’s screen.” Cho et al. [9] note that “when using a navigation system, the surgeons visual focus must move between the operating field and the navigation monitor to identify the position of the drill, causing a temporary interruption in the temporal bone dissection.” Wegner [38] cites as motivation for auditory display “users who cannot tolerate the encumbrance of graphical display hardware, and whose visual faculties have pre-existing obligations, such as addressing the task at hand.”
Most articles that focus on threshold alerts [12,34,35,39,40] understandably mention the aim of increasing awareness of the anatomy or critical structures surrounding the tracked instrument. For instance, Dixon et al. state that “…surgery can be technically demanding and requires a continuous appreciation of the surrounding critical structures.” Strauss et al. mention a desire to “improve the situational awareness of the surgeon,” because “in the field of surgical navigation, situational awareness for the described conventional task is not optimal.” [34] Woerdeman et al. lament that “complete spatial awareness at all times can be compromised during IGS.” [40] According to Voormolen et al., “standard neuronavigation does not adequately notify the surgeon about where he/she is drilling in relation to surrounding temporal bone critical structures.” [35]
A third primary motivation cited by a number of articles is the correct interpretation of navigation information [5,34,39,40]. Bork et al. [5] cite the lack of usable depth information in augmented reality applications: “when [augmented reality] visualization is implemented as a simple superimposition of virtual objects on the video stream, the virtual objects appear to float above the anatomy. This lack of correct depth perception has been recognized as a major challenge,” hoping that auditory display can ameliorate this lack of depth information in the camera view. Strauss et al. [34] state that “the surgeon is continuously required to translate the supplied information into a 3-D model. This approach is laborious and prone to error.” Willems et al. [39] and Woerdeman et al. [40] both note the difficulty in interpreting conventional 2-D views of 3-D scenarios.
Further motivations include the ability to hear structures occluded in visual display [5], substitute for the loss of tactile sense during robotic surgery [22], lessen simulator sickness [38] and vertigo [37], reduce clinician workload [4,12,17,34,37], and reduce memory burden [38].
3.3 Methods of Auditory Display for Image-Guided Interventions
An auditory display is one that uses sound rather than a screen to communicate information [36]. Such displays take data from some source and typically map it to changing parameters of a sound generator, which produces the acoustic output. Auditory displays are quite common in everyday life, with applications including speech, radios, music, alarm clocks, bells, telephone ringtones, microwave buzzers, sirens, and horns [19]. Historically, scientific auditory display has been sparsely employed, for example to bring seismometer recordings into audible range for analysis [33]; the first International Conference on Auditory Display was held in 1992 [23]. Using auditory display for guidance tasks in related fields has been described for applications such as obstacle avoidance and route finding for blind pedestrians, encouraging athletes towards more efficient movements, or aiding patients during rehabilitation [29]. In the case of image-guided interventions, the data source is typically distance information delivered by the navigation system.
In addition to ameliorating the shortcomings in traditional image-guided interventions described above, Wegner and Karron [37,38] describe multiple benefits of auditory display for interventional use, including the omnidirectionality of audio allowing for information display without line-of-sight, the relatively open auditory perceptual channel, reduced computational demands of audio synthesis, and the ability of humans to perceive parallel streams of audio. Further general benefits of auditory display include improving ergonomics, for instance by reducing the number of head and neck movements needed to switch between viewing various displays [6], and promoting rapid detection of events in high-stress environments [23]. Auditory display has been shown to be fairly easy to learn [36], and even engaging or fun to use [32].
Although there are multiple methods of auditory display available to the sound designer, the reviewed literature includes three primary auditory display methods: alerts, auditory icons, and parameter mapping.
3.3.1 Alerts
Alerts are sounds that are played back when the source data reaches a predetermined threshold. These are common in the operating environment [31]. The purpose of alerts is to indicate that an event has taken place or is about to occur, thereby prompting the listener to take action. In the reviewed literature, alerts have been described by 6 of the 13 articles. For instrument tracking, the alert plays back when the distance of the tracked instrument to a certain structure has passed a predetermined distance threshold [9,10,34,35,39], or in the case of Kitagawa et al. [22], when the tension applied during telerobotic suture tying reaches a desired threshold.
For volumetric resection, Willems et al. [39] implement an alert that plays when the tip of the tracked instrument encroaches on a predefined contour. The alert plays back twice per second with a frequency of 510 Hz and a duration of 0.1 seconds. For temporal bone drilling, Cho et al. (2013) [9] create three absolute distance margins of 2, 4, and 6 mm from the facial nerve, which correspond to alerts of 900, 600, and 300 Hz, respectively, played back for 20 ms. In a second article [10], two alerts play back when the tip is within 5 mm of either the scala vestibuli or the scala tympani. If the distance to the scala vestibuli is greater than the distance to the scala tympani, an alert of 900 Hz is played; if the distance to the scala tympani is greater than the distance to the scala vestibuli, an alert of 300 Hz is played. Strauss et al. [34] describe playing back an alert when the instrument reaches a “minimal distance,” but detailed descriptions of the auditory display method are not provided in the text. For telerobotic suture tension, Kitagawa et al. [22] describe an “[audio feedback] mode, which provided a single tone when the magnitude of the applied tension reached the [optimal] manual tension,” although detailed descriptions of the auditory display method are not provided.
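The threshold logic reported for Cho et al. [9,10] can be illustrated with a minimal Python sketch. The margin distances and tone frequencies are taken from the description above; the function names, the inclusive boundaries, the rule that the innermost margin wins, and the behavior for equal distances are assumptions made for illustration and are not taken from the original systems.

```python
from typing import Optional

# Margins and tones as reported for Cho et al. (2013): 2, 4, 6 mm -> 900, 600, 300 Hz.
MARGINS_MM_TO_HZ = [(2.0, 900.0), (4.0, 600.0), (6.0, 300.0)]


def margin_alert_hz(distance_mm: float) -> Optional[float]:
    """Tone for the innermost margin containing the drill tip, or None outside all margins.

    Assumption: margin boundaries are inclusive and the innermost margin takes priority.
    """
    for margin_mm, tone_hz in MARGINS_MM_TO_HZ:
        if distance_mm <= margin_mm:
            return tone_hz
    return None


def scala_alert_hz(d_vestibuli_mm: float, d_tympani_mm: float,
                   trigger_mm: float = 5.0) -> Optional[float]:
    """Relative alert as described for Cho et al. (2014): active within 5 mm of either scala."""
    if min(d_vestibuli_mm, d_tympani_mm) > trigger_mm:
        return None  # tip is not within the trigger distance of either structure
    if d_vestibuli_mm > d_tympani_mm:
        return 900.0
    if d_tympani_mm > d_vestibuli_mm:
        return 300.0
    return None  # equal distances: behavior not described in the text (assumption)
```

For example, `margin_alert_hz(3.0)` selects the 600 Hz tone because the tip lies between the 2 mm and 4 mm margins.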
3.3.2 Auditory Icons
Somewhat more complex than alerts are auditory icons, which use everyday sounds to convey information about events by analogy to everyday sound-producing events [8]. These icons are used in a similar way to visual icons: they map system events to those found in everyday listening, for example mimicking the sound of throwing trash in a bin when a file is deleted in a desktop graphical environment. Short auditory icons use the richness of everyday sounds and their ease of comprehension to link sounds to events.
In the case of Dixon et al. [12], aforementioned simple abstract alerts were first developed, but preliminary tests suggested that participants found it “difficult to distinguish acoustically which anatomical structure was close and how far away it was.” Auditory icons representing the dura and carotid arteries were developed to be easier to learn. For instance, the sound of an arterial Doppler trace was used to represent proximity to a major artery. Dixon et al. manually set safety margins to 2 and 3 mm.
3.3.3 Parameter Mapping Models
In contrast to alerts and auditory icons, parameter mapping links continuous changes in one set of data to continuous changes in audio parameters, providing a higher level of complexity. In essence, the underlying data delivered by the navigation system are used to ‘play’ a realtime software instrument according to those changes. Because audio has a wide range of parameters [38] that may be altered, such as frequency, intensity, and timbre, continuous parameter mapping is also suitable for displaying multivariate data. This technique attempts to make the listener an active participant in the listening process by providing interactive, changing mappings that relate data to audio. This method is useful for smoothly representing continuous changes in events.
In image-guided intervention support systems which implement parameter-mapping auditory display, the tracked instrument itself is in essence the physical musical instrument: the clinician plays the realtime software instrument by moving the tracked instrument. The range of parameter mappings found in the literature extends from fairly simple frequency and volume mappings [35,40] up to complex methods such as 3-D audio spatialization and wave-terrain synthesis [37,38].
For volumetric resection for neuronavigation, Woerdeman et al. [40] adapt a previous approach of the group [39], which employed a simple alert, to play back a parameter-mapping alert. They describe a “soft warning sound (an intermittent pure tone)” with a duration of 0.1 seconds that plays back at a rate of 3 times per second at a distance of 5 mm from the tumor outline. Once the instrument is inside the 5 mm threshold, volume and tone frequency increase proportionally until the outline of the tumor is reached. After the instrument encroaches on the tumor outline, a “continuous pure tone” is played back. Voormolen et al. [35] cite and employ this method for temporal bone drilling but do not describe further adaptation details.
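A minimal Python sketch of such a distance-to-tone mapping follows. Only the 5 mm threshold, the intermittent tone inside the threshold, and the continuous tone past the outline come from the description above. The frequency range, the gain range, the linear interpolation, and the use of a signed distance (negative meaning inside the tumor outline) are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tone:
    frequency_hz: float
    gain: float        # 0.0 (silent) to 1.0 (full volume)
    continuous: bool   # False: intermittent 0.1 s beeps, 3 per second


def proximity_tone(distance_mm: float,
                   threshold_mm: float = 5.0,   # reported threshold
                   f_far_hz: float = 500.0,     # assumed frequency at the threshold
                   f_near_hz: float = 1000.0,   # assumed frequency at the outline
                   gain_far: float = 0.2,       # assumed gain at the threshold
                   gain_near: float = 1.0) -> Optional[Tone]:
    """Map distance to the tumor outline onto tone frequency and volume."""
    if distance_mm > threshold_mm:
        return None  # outside the safety margin: no sound
    if distance_mm <= 0.0:
        # Outline reached or crossed: continuous tone.
        return Tone(f_near_hz, gain_near, continuous=True)
    closeness = 1.0 - distance_mm / threshold_mm  # 0 at the threshold, 1 at the outline
    return Tone(
        frequency_hz=f_far_hz + closeness * (f_near_hz - f_far_hz),
        gain=gain_far + closeness * (gain_near - gain_far),
        continuous=False,
    )
```

With these assumed ranges, a tip 2.5 mm from the outline would produce an intermittent 750 Hz tone at roughly 60% gain.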
Biopsy needle placement support is described by Bork et al. [5], who use a method of parameter mapping called temporal distance coding [14], in which the time an object is rendered depends on the distance from the tracked instrument. In this case, auditory temporal distance coding allowed playback of the distance from the tip of the biopsy needle to objects of interest within the AR environment. Virtual ‘spheres’ propagate from the needle tip at a certain speed. The longer it takes for these spheres to collide with the objects of interest, the longer a metronome sound is played back. Once an object is ‘hit’, a bell tone is played. Thus, the more metronome sounds are played before a bell tone is played, the further the object is from the needle tip. The user is not explicitly guided towards a target using auditory display, but rather receives information concerning the distance of objects of interest in the area.
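The principle of temporal distance coding can be sketched in Python as a schedule of sound events. The propagation speed, the spacing between metronome ticks, and the function name are hypothetical values chosen for illustration; the text above only states that a metronome sound plays while the sphere travels and that a bell marks the hit.

```python
from typing import List, Tuple


def temporal_distance_events(distance_mm: float,
                             speed_mm_per_s: float = 50.0,   # assumed sphere speed
                             tick_spacing_mm: float = 10.0   # assumed travel between ticks
                             ) -> List[Tuple[float, str]]:
    """Return (time in seconds, sound) events for one sphere leaving the needle tip."""
    if distance_mm < 0 or speed_mm_per_s <= 0 or tick_spacing_mm <= 0:
        raise ValueError("distance must be non-negative; speed and spacing must be positive")
    events = []
    i = 0
    # One metronome tick each time the sphere has travelled another tick_spacing_mm
    # without yet reaching the object of interest.
    while i * tick_spacing_mm < distance_mm:
        events.append((i * tick_spacing_mm / speed_mm_per_s, "tick"))
        i += 1
    # Bell tone at the moment the sphere reaches the object.
    events.append((distance_mm / speed_mm_per_s, "bell"))
    return events
```

Under these assumptions an object 35 mm away yields four ticks followed by a bell, while an object 85 mm away yields nine ticks, so the listener can judge distance by counting ticks.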
A different method for needle placement is investigated by Black et al. [4]. This auditory display encodes the distance of the tip of the needle to the correct insertion point, the distance of the shaft to the correct position, and the depth as the distance of the tip to the target lesion. Two auditory display methods are described. In both methods, the task of needle placement is split into tip placement, handle placement, and insertion phases. The first method employs a tone with a moving pitch and a reference tone; the pitch of the moving tone is mapped to the distance in the y-axis. These are brought together, creating an auditory display that mimics tuning an instrument. Distance in the x-axis is mapped to the inter-onset interval of a train of tones, from 250 ms at the outer edge of the navigation area to 100 ms at the center. The second method further separates motion along the x-axis and y-axis. Placement along the x-axis is performed first using changes in inter-onset interval, and the same procedure is then repeated for movement along the y-axis corridor. After correct tip and handle placement, the needle is inserted and depth to target is mapped to the increasing pitch of 10 consecutive tones, whereafter a bell tone is played back upon reaching the target lesion.
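The inter-onset interval mapping and the stepped depth cue described for Black et al. [4] might be sketched in Python as follows. The 250 ms and 100 ms endpoints and the 10 depth tones come from the description above; the linear interpolation, the size of the navigation area, and the function names are assumptions for illustration only.

```python
def inter_onset_interval_ms(x_offset_mm: float,
                            area_radius_mm: float = 50.0,  # assumed size of navigation area
                            ioi_center_ms: float = 100.0,  # reported value at the center
                            ioi_edge_ms: float = 250.0     # reported value at the outer edge
                            ) -> float:
    """Shorter intervals (faster pulses) as the needle approaches the center on the x-axis."""
    fraction = min(abs(x_offset_mm) / area_radius_mm, 1.0)  # 0 at center, 1 at or past edge
    return ioi_center_ms + fraction * (ioi_edge_ms - ioi_center_ms)


def depth_step(remaining_depth_mm: float, total_depth_mm: float, n_steps: int = 10) -> int:
    """Index 0..n_steps-1 of the rising tone for the current depth; n_steps means 'bell'."""
    if remaining_depth_mm <= 0.0:
        return n_steps  # target lesion reached: play the bell tone
    progress = 1.0 - min(remaining_depth_mm / total_depth_mm, 1.0)
    return min(int(progress * n_steps), n_steps - 1)
```

For instance, a needle tip halfway between the edge and the center of the assumed navigation area gives an interval of 175 ms.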
Hansen et al. [17] support resection line marking for open liver surgery with a parameter-mapping auditory display. In this method, the navigation system delivers the nearest distance between the instrument tip and the planned resection line. The distance is divided into three margins: ‘safe,’ ‘warning,’ and ‘outside.’ When the instrument is in the safe margin, signaling to the clinician that the position is correct, a confirmation tone is played back with a frequency of 698.5 Hz and an inter-onset interval of 660 ms at the center of the safe margin and 180 ms at the edge of the safe margin. In the warning margin, the distance is mapped to inter-onset interval, pitch, and tone length. Pitches to the left of the resection line become consecutively lower, while those to the right of the resection line become consecutively higher, thus providing directional information. Outside the warning zone, no sound is played to prevent unwanted sound when the instrument is not near the resection line.
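A rough Python sketch of this three-zone scheme is given below. The 698.5 Hz confirmation tone and the 660 ms and 180 ms inter-onset intervals are taken from the description above. The zone widths, the signed-distance convention (negative for left of the resection line), and the semitone pitch steps in the warning margin are assumptions; the mappings to inter-onset interval and tone length in the warning margin are omitted because their values are not given here.

```python
import math
from dataclasses import dataclass
from typing import Optional

CONFIRMATION_HZ = 698.5  # reported confirmation tone frequency


@dataclass
class Cue:
    zone: str                      # 'safe', 'warning', or 'outside'
    frequency_hz: Optional[float]  # None when silent
    ioi_ms: Optional[float]        # inter-onset interval; None where not modelled


def resection_cue(signed_distance_mm: float,
                  safe_mm: float = 3.0,            # assumed half-width of safe margin
                  warning_mm: float = 15.0,        # assumed outer limit of warning margin
                  semitones_per_mm: float = 1.0    # assumed pitch step size
                  ) -> Cue:
    """Negative distances lie left of the planned resection line, positive ones right."""
    d = abs(signed_distance_mm)
    if d <= safe_mm:
        # Reported: 660 ms at the center of the safe margin, 180 ms at its edge
        # (linear interpolation in between is an assumption).
        ioi = 660.0 + (d / safe_mm) * (180.0 - 660.0)
        return Cue("safe", CONFIRMATION_HZ, ioi)
    if d <= warning_mm:
        steps = math.ceil((d - safe_mm) * semitones_per_mm)
        direction = 1 if signed_distance_mm > 0 else -1  # right: higher, left: lower
        return Cue("warning", CONFIRMATION_HZ * 2 ** (direction * steps / 12), None)
    return Cue("outside", None, None)  # silent outside the warning margin
```

Because pitch moves in opposite directions on the two sides of the line, the clinician can tell from the tone alone which way to correct.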
Wegner and Karron [37] map a discrete error function in the plane perpendicular to the trajectory path to MIDI tones. For placement in this plane, a chordal drone is produced, with deviations from the correct placement producing inharmonicity. Another tone is produced at regular points along the trajectory path to ‘tag’ the distance traveled. A second method [38] employs beat interference between 3 pairs of sinusoids which correspond to 3 axes in space. By reducing the beat interference between each of the 3 pairs, the correct position is found.
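The beat-interference idea rests on a standard acoustic fact: two sinusoids with nearby frequencies f1 and f2 produce an amplitude envelope that pulses at |f1 − f2|, and the pulsing disappears when the frequencies match. The Python sketch below shows one way a positional error on three axes could drive such beats. The reference frequencies, the detuning factor, and the function name are hypothetical and are not taken from Wegner [38].

```python
from typing import Sequence, Tuple


def beat_frequencies_hz(error_mm: Sequence[float],
                        reference_hz: Sequence[float] = (220.0, 330.0, 440.0),  # assumed
                        detune_hz_per_mm: float = 0.5                           # assumed
                        ) -> Tuple[float, ...]:
    """Beat rate heard on each axis when a moving tone is detuned by the positional error.

    sin(2*pi*f1*t) + sin(2*pi*f2*t) = 2*cos(pi*(f1 - f2)*t) * sin(pi*(f1 + f2)*t),
    so the loudness envelope pulses |f1 - f2| times per second.
    """
    beats = []
    for err, ref in zip(error_mm, reference_hz):
        moving_hz = ref + detune_hz_per_mm * err  # tone shifted by the error on this axis
        beats.append(abs(moving_hz - ref))
    return tuple(beats)
```

In this sketch an error of (4, −2, 0) mm gives beats of 2, 1, and 0 per second, and the instrument is on target once all three pairs sound steady.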
Figure 1 visualizes the primary mapping approaches encountered in the literature.
Fig. 1.

Various approaches to map tracking data to auditory display found in the literature. From top to bottom: risk avoidance using safety margins, resection path following, 3-D trajectory following, and temporal distance coding.
3.4 Experimental Designs and Findings
The experimental designs in the reviewed literature range from informal evaluations to phantom studies in laboratory conditions to clinical evaluations.
3.4.1 Informal Evaluations
Three of the 13 reviewed articles provide informal evaluations without statistical data gathering or analysis; see Table 2 for an overview. Wegner and Karron [37] provide solely a technical description of their range of auditory display methods for generalized tracked instrument placement. Wegner [38] states that “informal usability testing” was completed, but does not further elaborate. Black et al. [4] perform informal, ‘think-aloud’ evaluations [24] of 2 auditory display methods for ablation needle placement with 8 non-expert participants. Comments were gathered during the interviews and suggest general satisfaction with performance during the placement task.
Table 2.
Overview of literature with informal evaluation
| Article | Informal Evaluation | Findings |
|---|---|---|
| Wegner (1998) [38] | Informal usability testing | General benefit during placement task |
| Black et al. (2013) [4] | 8 participants: talk-aloud walkthrough and interview | Satisfaction with performance; auditory display engaging and fun to use; preference for less ‘synthesized’ tones |
3.4.2 Laboratory Evaluations
The majority (10) of the reviewed articles describe evaluations in laboratory conditions on phantoms. Of these, 4 also include a clinical evaluation described in the same article (see Section 3.4.3). See Table 3 for an overview.
Table 3.
Overview of literature with laboratory evaluation
| Article | Laboratory Evaluation | Findings |
|---|---|---|
| Kitagawa et al. (2005) [22] | 5 surgeons: suture ties with different materials for manual tying and no-feedback, auditory, visual, and audiovisual displays | Suture tie tension consistency using visual and audiovisual displays superior to hand ties; consistency of tie tension using auditory display comparable to hand ties |
| Willems et al. (2005) [39] | 5 surgeons: volume resection on floral foam phantoms using auditory display and standard visual navigation. | Auditory display increased similarity of the resected to target volume, reduced the amount of target tissue not removed, increased amount of non-target tissue removed |
| Woerdeman et al. (2009) [40] | 4 surgeons: volume resection using auditory, conventional display, and heads-up display | Similar task completion time and target volume removal. Auditory display subjectively preferred over conventional display by improving time spent viewing phantom |
| Strauss et al. (2010) [34] | 5 surgeons: reported and actual distance to structures using conventional and combined audiovisual display | Audiovisual display improved reported accuracy over conventional display. |
| Voormolen et al. (2012) [35] | 5 surgeons: bone drilling in phantoms, comparing combined audiovisual display and conventional display | Fewer critical structures hit when using audiovisual display. Improved subjective orientation and tumor exposure |
| Cho et al. (2013) [9] | 1 surgeon: bone drilling with and without audiovisual display | Facial nerve was hit less using audiovisual display, greater uniformity of safe margin in resection |
| Hansen et al. (2013) [17] | 12 surgeons: resection line marking with audiovisual and conventional navigation display. | Audiovisual display reduced the percent of time viewing the screen, increased accuracy of the marking task, and increased task completion time. |
| Dixon et al. (2014) [12] | 7 surgeons: dissection and clivus ablation with and without audiovisual display | Using the audiovisual display reduced perceived workload scores for mental demand, effort, and frustration. |
| Bork et al. (2015) [5] | 15 participants: lesion targeting with simple overlay, auditory feedback, visual feedback, and audiovisual feedback. | Audiovisual feedback resulted in most target hits and least localization error. Auditory, visual, and audiovisual more accurate, slower than simple overlay. Audiovisual display outperformed auditory and visual display in accuracy, task completion time, and number of lesions hit. |
For the task of telerobotic suture tying reported by Kitagawa et al. [22], 5 surgeons completed suture ties using no feedback, auditory feedback, visual feedback, and audiovisual feedback after 1 hour of training with the robot system. Findings indicate that the consistency of tying tension with sensory substitution using visual and audiovisual displays was superior to that of hand ties, and that the consistency of tying tension using auditory display was comparable to that of hand ties.
Willems et al. [39] compare volume resection on floral foam phantoms with 3 experienced surgeons who each completed one resection using both auditory display and standard visual navigation. Results indicate that, using auditory display, the similarity of the resected volume to the target volume increased marginally and the amount of target tissue not removed was reduced, but the amount of non-target tissue removed increased.
Using a similar task, Woerdeman et al. [40] describe an evaluation with 4 surgeons performing volume resection with auditory display, conventional IGS display, and a heads-up display. Task completion time between auditory display, conventional display and heads-up display did not differ, and target volume removal did not differ between auditory and conventional displays. However, auditory display was subjectively perceived to improve performance compared to conventional display.
Strauss et al. [34] compare surgeon-reported and actual distance measurement points from instrument to risk structures with 5 ‘advanced beginner’ surgeons using a combined audiovisual collision warning system. Results indicate that the audiovisual display improved reported accuracy by 76% over conventional display.
Voormolen et al. [35] evaluate 5 surgeons each performing a temporal bone drilling task in two phantoms, once with conventional image guidance and once with the combined audiovisual assistance. Using the audiovisual system, no critical structures were damaged (as opposed to three structures using conventional methods), and participants reported improved subjective orientation and improved tumor exposure with the system.
Cho et al. (2013) [9] describe a laboratory study with one inexperienced surgeon who drilled 10 bone phantoms, 5 using an audiovisual display and 5 without navigation. The drill distance to the facial nerve was recorded to determine when the surgeon encroached the safe margin of 2 mm to the nerve and when the nerve was hit. Using no navigation, the nerve was hit in 4 of 5 attempts, whereas with audiovisual display the nerve was hit once. In addition, the uniformity of the safe margin in the resected area appeared greater with audiovisual display.
Hansen et al. [17] compare resection line marking on a floral foam phantom with 12 surgeons using combined audiovisual and conventional 3-D navigation display. Findings indicate that the auditory display reduced the percent of time viewing the screen from 96% using visual display to 10% using combined audiovisual display. Auditory display increased accuracy of the marking task, but also increased task completion time.
Dixon et al. [12] report an evaluation of 14 cadaver specimens with 7 surgeons who each performed dissection and clivus ablation on 2 heads, once each using conventional display and audiovisual display. Using the audiovisual display reduced NASA-TLX perceived workload [18] scores for mental demand, effort, and frustration.
Bork et al. [5] evaluate lesion targeting using a biopsy needle with 15 non-clinical participants. Each participant completed 3 attempts using simple lesion overlay, auditory-only feedback, visual feedback, and audiovisual feedback. Participants verbally confirmed reaching each lesion point. Results show that targeting using combined audiovisual feedback resulted in the most target hits and the least localization error. Auditory, visual, and audiovisual displays improved accuracy but resulted in slower task completion times than the simple overlay. Audiovisual display outperformed auditory and visual display with respect to accuracy, task completion time, and number of lesions hit.
3.4.3 Clinical Evaluations
Four of the 13 reviewed articles evaluated their approaches in clinical conditions in addition to the laboratory studies described in section 3.4.2 (see Table 4).
Table 4.
Overview of literature with clinical evaluation
| Study | Clinical Evaluation | Findings |
|---|---|---|
| Woerdeman et al. (2009) [40] | 1 surgeon: 6 patients resection with auditory display randomly activated | No specific effect of auditory display on instrument tip speed. Subjective reports of improvements in decision-making. |
| Strauss et al. (2010) [34] | 4 surgeons: functional endoscopic sinus surgery | Complication rate reduced and preparation time increased when using audiovisual display |
| Cho et al. (2013) [9] | 1 surgeon: 2 cochlear implantations, 1 acoustic tumor resection | Warning margins with auditory display allowed drilling continuously without removing view from situs |
| Cho et al. (2014) [10] | 1 surgeon: 2 cochlear implantations | Auditory display helped locate correct cochleostomy point while keeping focus on microscope |
Woerdeman et al. [40] report 6 patient neurosurgical resections during which auditory display was switched on and off at random intervals by the primary investigator. Instrument tip movement was measured, including mean tip translational speed and mean tip rotational speed. A specific effect on instrument tip speed could not be determined. Surgeons subjectively reported improvements in decision-making when using the auditory display without negatively influencing instrument use.
Strauss et al. [34] measured complication rate during functional endoscopic sinus surgery with 4 surgeons and 104 patients. A combined audiovisual collision-warning system was compared to conventional navigation display. The complication rates for critical incidents using conventional navigation and the collision-warning system were 11.3% and 7%, respectively, and the rates for small complications were 3.9% and 2.12%, respectively. The time to prepare the navigation system rose when using the collision-warning system.
Cho et al. (2013) [9] report one experienced surgeon who performed 2 cochlear implantations and 1 acoustic tumor resection using auditory display with 3 warning margins. Although no statistics were gathered, findings indicate that the auditory display made it possible for the surgeon to continuously concentrate on the operating situs without having to switch the view to the screen. In a second study in 2014, Cho et al. [10] report that, during surgeries on 2 patients, the auditory display helped the surgeon find the correct cochleostomy spot while maintaining focus on the microscope.
4 Discussion
This review presents, to the authors’ knowledge, the first overview of the state of the art of auditory display for image-guided interventions. The articles included in the review cover a range of interventional tasks to be supported, auditory display methods, and evaluation designs and findings. Whereas a majority of the literature covers the use of auditory display to inform the clinician of risk structures, thereby prompting the clinician to navigate away from a certain object of interest, a number of articles attempt to aid clinicians in navigation towards a target itself.
The body of results of the reviewed literature shows that in most cases, systems with auditory display were found to be beneficial. Advantages include improved recognition of the presence of or distance to anatomical risk structures [9,34,35], reduced complication rate [34], improved placement accuracy [5,17,34], improved resection volume similarity [39], improved orientation [35], and reduced workload demands [12]. Reported drawbacks included increased task time [17,34] and increased amount of non-target tissue removed during volumetric resection [39]. However, none of the reviewed literature reports negative subjective perception of the implemented auditory display.
Considering the wide range of tasks that are currently supported with image guidance [11] and the primarily positive evaluation results of the use of auditory display in the reviewed literature, it is surprising that only a limited number of investigations have addressed auditory display. Even when auditory display has been integrated into support for image guidance, most attempts implement only a simple threshold-based alert. The paucity of investigations into more complex navigational aids might be traced to an unfamiliarity with the relatively nascent field of auditory display and its possibilities in enhancing guidance tasks, possibly prompted by clinician dissatisfaction with previous experiences with alarms.
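To make the distinction between a threshold-based alert and a continuous navigational aid concrete, the following minimal Python sketch contrasts the two. It is purely illustrative and is not taken from any of the reviewed systems: the margin values, the distance range, the pitch range, and the function names `threshold_alert` and `continuous_pitch_hz` are all assumptions chosen for the example.

```python
# Illustrative sketch only; all numeric values and names are hypothetical.

# Assumed warning margins in millimetres, outermost first.
WARNING_MARGINS_MM = (5.0, 3.0, 1.0)


def threshold_alert(distance_mm: float, margins=WARNING_MARGINS_MM) -> int:
    """Return a discrete alert level: 0 = silent, higher = closer to the structure.

    The level is simply the number of margins the instrument has crossed.
    """
    return sum(1 for margin in margins if distance_mm <= margin)


def continuous_pitch_hz(
    distance_mm: float,
    max_distance_mm: float = 20.0,  # assumed range over which sound varies
    low_hz: float = 220.0,          # assumed pitch at max_distance_mm or beyond
    high_hz: float = 880.0,         # assumed pitch at zero distance
) -> float:
    """Map distance to pitch continuously.

    Pitch is interpolated exponentially, so equal steps in distance
    correspond to equal musical intervals.
    """
    clamped = min(max(distance_mm, 0.0), max_distance_mm)
    closeness = 1.0 - clamped / max_distance_mm
    return low_hz * (high_hz / low_hz) ** closeness
```

With the threshold approach the clinician hears a change only when a margin is crossed, whereas the continuous mapping conveys every change in distance within the assumed range. Which of the two is preferable for a given task is precisely the kind of question that comparative evaluations would need to answer.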
The abundance of other sounds in the operating environment, including speech and instrument noises, plays a role in the distrust or rejection of new auditory displays. This is discussed in the reviewed literature: an editorial comment in response to Willems et al. [39] recognizes the benefit of reducing the necessity of viewing a navigation screen, but is pessimistic about its clinical application:
“We doubt whether we would like to have such an auditory warning system in the operating room creating distracting sounds. Especially when surgery comes to critical areas, a beeping neuronavigation system may be annoying, since the operating room is already filled with acoustic warning systems of the anesthesiologist, with which an additional system should also not interfere. We should seek ways to increase the comfort for the surgeon in the operating room, allowing concentration on the surgical field, supported by enhanced guidance systems using modern 3-D visualising techniques.”
This sentiment is reflected in statements by authors of the reviewed literature, who caution that auditory display methods should take the sounds in the existing operating room into account during the design phase to reduce annoyance and avoid overburdening the environment [5,9,12,17,22,37,39]. To be sure, the operating room is a noisy environment, with average sound levels considerably higher than those of other workplaces [26]. However, according to Katz [21], “there is little evidence to demonstrate a direct association between excessive operating room noise and poor surgical outcomes.” Music in the operating room has become commonplace [26], and surgeon-selected music in the operating room has been reported to enhance performance [1] and reduce workload [15]. In addition, Moorthy et al. [28] report that surgeons can effectively block out unwanted noise in the operating room.
As a whole, the effect of noise and music in the operating room is complex and not widely investigated, bringing into question any sweeping pessimism about the inclusion of auditory display for image guidance. Unfortunately, clinicians may regard novel auditory display methods as just another alarm. The perception that auditory displays equal alarms could be due to clinicians’ lack of experience with beneficial auditory displays, but also to approaches that amount to little more than simple alarms applied to an image guidance task.
Parseihian et al. [29] note a major problem of designing auditory display for guidance tasks: the aesthetics of the resulting sound design. The authors cite the discomfort caused by using auditory display designs; they can become fatiguing to use or do not match the listening tastes of the intended user. Indeed, the relationship between the intended urgency of a situation and the urgency perceived when using an auditory display is an important issue that should be considered during design [13]. Common auditory warnings in the operating room have been found to be inappropriate, conveying an unintended level of urgency [27]. Thus, clinicians’ dissatisfaction with inappropriately urgent alarms may be one reason that investigations into potentially useful auditory display for image guidance never properly develop.
Many of the reviewed approaches are indeed simple in nature, and most articles do not cite psychoacoustically or psychologically driven motivations for sound design decisions, prompting the assumption that most investigations tend to lack interdisciplinary collaboration between researchers of image guidance systems and researchers in the field of auditory display.
Bringing auditory display for image guidance into the operating room to provide usable and flexible support for clinicians demands fundamental changes. Enhanced cooperation with sound designers and experts in auditory display to produce more aesthetically appropriate auditory displays will encourage contextual inquiry to help limit implementation in cases when it is disadvantageous, and help discover new applications which might benefit from auditory display.
Increasing sound design complexity so that auditory displays sound more like instruments and less like alarms could increase acceptance [17] and help differentiate the perception of displays from pure alarms that exist in the already noisy environment. Sound designs that are customizable based on clinicians’ desires are also an interesting option to increase acceptance [17, 29]. Future development should carefully implement toggling so that sound output is only produced when absolutely necessary, a suggestion also offered by Dixon et al. [12]. This will further reduce unnecessary sound output and its related annoyance.
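One possible way to realize such toggling is sketched below in Python. This is an assumption-laden illustration rather than a method described in the reviewed literature: it gates sound output with hysteresis, a technique chosen here for the example, so that sound switches on when the instrument comes within one distance and switches off only after it has retreated beyond a slightly larger one. The class name `SoundGate` and both distances are hypothetical.

```python
# Illustrative sketch only; the class name and distances are hypothetical.


class SoundGate:
    """Enable sound output only near a structure, using two thresholds.

    Using a larger switch-off distance than switch-on distance (hysteresis)
    avoids rapid on/off flicker when the instrument hovers near one boundary.
    """

    def __init__(self, on_below_mm: float = 5.0, off_above_mm: float = 7.0):
        if off_above_mm < on_below_mm:
            raise ValueError("off_above_mm must not be smaller than on_below_mm")
        self.on_below_mm = on_below_mm
        self.off_above_mm = off_above_mm
        self.active = False  # sound is off until the instrument comes close

    def update(self, distance_mm: float) -> bool:
        """Update the gate with the latest distance; return True if sound should play."""
        if not self.active and distance_mm <= self.on_below_mm:
            self.active = True
        elif self.active and distance_mm >= self.off_above_mm:
            self.active = False
        return self.active
```

In use, `update` would be called with each new tracked distance, and the auditory display would be rendered only while it returns `True`. Suitable distances would have to be determined per task and per clinician.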
More thorough evaluations of developed methods will help discern exactly which aspects of auditory display are most useful in the operating room and which are superfluous or better supported by other means. In addition to comparing the effects of auditory display to other intraoperative modalities such as augmented reality [16], virtual reality, and conventional navigation, evaluations should include comparisons of multiple auditory display methods, which none of the reviewed literature provided. Investigations into comparing multiple methods for auditory display for 1-dimensional guidance tasks suggest expanding such evaluations to 2-D and 3-D tasks [30]. Such an approach could be taken within the context of multidimensional tracked instrument placement, for instance, for needles [2, 3], aspirators, or continuum robots [7].
5 Conclusion
This review of the literature on the use of auditory display for image guidance shows that, despite apparent benefits of augmenting or replacing certain aspects of image-guided interventions with sound information, investigations have been sparse. Positive results include increased risk structure awareness, placement accuracy, and general subjective satisfaction with auditory display, although investigators warn of aspects of annoyance and additional noise in busy operating rooms. There is a need for intensified development and comprehensive evaluation of novel auditory displays that reach beyond simple alerts and alarms to provide clinicians the optimal tool when needed during image-guided interventions.
Acknowledgments
Funding: The work of this paper is partly funded by the German Research Foundation (DFG) under grant number HA 7819/1-1, and the Federal Ministry of Education and Research within the Forschungscampus STIMULATE under grant number 13GW0095A.
Footnotes
Passages from Strauss et al. [34] translated from the original German into English by author DB.
Musical Instrument Digital Interface, a protocol for electronic musical instrument communication
Ethical Approval
For this type of study formal consent is not required. This article does not contain any studies with human participants or animals performed by any of the authors.
Conflict of Interest
The authors declare that they have no conflict of interest.
Informed consent
This article does not contain patient data.
Contributor Information
David Black, Medical Image Computing, University of Bremen; Jacobs University, Bremen; Fraunhofer MEVIS, Bremen, Germany.
Christian Hansen, Otto-von-Guericke University Magdeburg, Germany.
Arya Nabavi, International Neuroscience Institute, Hannover, Germany.
Ron Kikinis, Medical Image Computing, University of Bremen; Fraunhofer MEVIS, Bremen, Germany; Brigham and Women’s Hospital and Harvard Medical School, Boston, USA.
Horst Hahn, Fraunhofer MEVIS, Bremen, Germany; Jacobs University, Bremen, Germany.
References
- 1.Allen K, Blascovich J. Effects of music on cardiovascular reactivity among surgeons. Journal of the American Medical Association. 1994;272(11):882–4.
- 2.Arnolli MM, Hanumara NC, Franken M, Brouwer DM, Broeders IA. An overview of systems for CT and MRI-guided percutaneous needle placement in the thorax and abdomen. Medical Robotics and Computer Assisted Surgery. 2015;11(4):458–475. doi: 10.1002/rcs.1630.
- 3.Banz V, Müller PC, Tinguely P, Inderbitzin D, Ribes D, Peterhans M, Candinas D, Weber S. Intraoperative image-guided navigation system: development and applicability in 65 patients undergoing liver surgery. Langenbeck’s Archives of Surgery. 2016;401(4):495–502. doi: 10.1007/s00423-016-1417-0.
- 4.Black D, Al Issawi J, Hansen C, Rieder C, Hahn HK. Auditory Support for Navigated Radiofrequency Ablation. Proceedings of CURAC-Deutsche Gesellschaft für Computer- und Roboterassistierte Chirurgie. 2013:30–33.
- 5.Bork F, Fuerst B, Schneider A, Pinto F, Graumann C, Navab N. Auditory and Visio-Temporal Distance Coding for 3-Dimensional Perception in Medical Augmented Reality. Proceedings of 2015 IEEE International Symposium on Mixed and Augmented Reality (ISMAR) 2015:7–12.
- 6.Brock D, Stroup J, Ballas J. Using an auditory display to manage attention in a dual task, multiscreen environment. Proceedings of 8th International Conference on Auditory Display. 2002:177–180.
- 7.Burgner J, Rucker D, Gilbert H, Swaney P, Russell P, Weaver K, Webster R. A Telerobotic System for Transnasal Surgery. IEEE ASME Transactions on Mechatronics. 2013;19(3):996–1006. doi: 10.1109/TMECH.2013.2265804.
- 8.Buxton B, Gaver W, Bly S. The Use of Non-Speech Audio at the Interface. 1994 http://www.billbuxton.com/Audio.TOC.html Accessed 15 December 2016.
- 9.Cho B, Oka M, Matsumoto N, Ouchida R, Hong J, Hashizume M. Warning navigation system using realtime safe region monitoring for otologic surgery. Int J Comput Assist Radiol Surg. 2013;8:395–405. doi: 10.1007/s11548-012-0797-z.
- 10.Cho B, Matsumoto N, Komune S, Hashizume M. A Surgical Navigation System for Guiding Exact Cochleostomy Using Auditory Feedback: A Clinical Feasibility Study. BioMed Research International. 2014;2014:769659. doi: 10.1155/2014/769659.
- 11.Cleary K, Peters T. Image-Guided Interventions: Technology Review and Clinical Applications. Annual Review of Biomedical Engineering. 2010;12:1–433. doi: 10.1146/annurev-bioeng-070909-105249.
- 12.Dixon B, Daly M, Chan H, Vescan A, Witterick I, Irish J. Augmented real-time navigation with critical structure proximity alerts for endoscopic skull base surgery. Laryngoscope. 2014;124:853–859. doi: 10.1002/lary.24385.
- 13.Edworthy J, Loxley S, Dennis I. Improving auditory warning design: relationship between warning sound parameters and perceived urgency. Human factors. 1991;33(2):205–31. doi: 10.1177/001872089103300206.
- 14.Furmanski C, Azuma R, Daily M. Augmented-reality visualizations guided by cognition: perceptual heuristics for combining visible and obscured information. Proceedings of the International Symposium on Mixed and Augmented Reality. 2002:215.
- 15.George S, Ahmed S, Mammen K, John G. Influence of music on operation theatre staff. Anaesthesiology Clinical Pharmacology. 2011;27(3):354–7. doi: 10.4103/0970-9185.83681.
- 16.Haerle S, Daly M, Chan H, Vescan A, Witterick I, Gentili F, Zadeh G, Kucharczyk W, Irish J. Localized Intraoperative Virtual Endoscopy (LIVE) for Surgical Guidance in 16 Skull Base Patients. Otolaryngology Head Neck Surgery. 2015;152:165–17. doi: 10.1177/0194599814557469.
- 17.Hansen C, Black D, Lange C, Rieber F, Lamade W, Donati M, Oldhafer K, Hahn H. Auditory Support for Resection Guidance in Navigated Liver Surgery. Medical Robotics and Computer Assisted Surgery. 2013;9(1):36. doi: 10.1002/rcs.1466.
- 18.Hart S. NASA-Task Load Index (NASA-TLX); 20 Years Later. Human Factors and Ergonomics Society 50th Annual Meeting. 2006:904–908.
- 19.Hedge A. Auditory Displays. 2013 http://ergo.human.cornell.edu/studentdownloads/DEA3250pdfs/idauditory.pdf Accessed 15 December 2016.
- 20.Hermann T. Taxonomy and definitions for sonification and auditory display. Proceedings of 14th International Conference on Auditory Display 2008.
- 21.Katz J. Noise in the Operating Room. Anesthesiology. 2014;121(4):894–8. doi: 10.1097/ALN.0000000000000319.
- 22.Kitagawa M, Dokko D, Okamura A, Yuh D. Effect of sensory substitution on suture-manipulation forces for robotic surgical systems. Thoracic and Cardiovascular Surgery. 2005;129(1):151–8. doi: 10.1016/j.jtcvs.2004.05.029.
- 23.Kramer G, Walker B, Bonebright T, Cook P, Flowers J, Miner N, Neuhoff J, Bargar R, Barrass S, Berger J, Evreinov G, Fitch W, Gröhn M, Handel S, Kaper H, Levkowitz H, Lodha S, Shinn-Cunningham B, Simoni M, Tipei S. The Sonification Report: Status of the Field and Research Agenda. Report prepared for the National Science Foundation by members of the International Community for Auditory Display 1999.
- 24.Lewis C. Using the “Thinking Aloud” Method In Cognitive Interface Design. IBM Research Report RC-9265 1982.
- 25.Mewes A, Hensen B, Wacker F, Hansen C. Touch-less Interaction with Software in Interventional Radiology and Surgery: A Systematic Literature Review. Int J Comput Assist Radiol Surg. 2016 doi: 10.1007/s11548-016-1480-6.
- 26.Miller R, Eriksson L, Fleisher L, Wiener-Kronish J, Cohen N, Young W. Anesthesia. Elsevier; Amsterdam: 2015.
- 27.Mondor T, Finley G. The perceived urgency of auditory warning alarms used in the hospital operating room is inappropriate. Canadian J Anesth. 2003;50(3):221–228. doi: 10.1007/BF03017788.
- 28.Moorthy K, Munz Y, Undre S, Darzi A. Objective evaluation of the effect of noise on the performance of a complex laparoscopic task. Surgery. 2004;136(1):25–30. doi: 10.1016/j.surg.2003.12.011. Discussion 31.
- 29.Parseihian G, Ystad S, Aramaki M, Martinet R. The process of sonification design for guidance tasks. Journal of Mobile Media. 2015;9(2).
- 30.Parseihian G, Gondre C, Aramaki M, Ystad S, Kronland-Martinet R. Comparison and Evaluation of Sonification Strategies for Guidance Tasks. IEEE Trans Multimedia. 2016;18(4):674–686.
- 31.Sanderson P, Watson M, Russell W. Advanced patient monitoring displays: tools for continuous informing. Anesth Analg. 2005;101:161–168. doi: 10.1213/01.ANE.0000154080.67496.AE.
- 32.Saue S. A model for interaction in exploratory sonification displays. In 6th International Conference on Auditory Display. 2000:105–110.
- 33.Speeth S. Seismometer Sounds. Journal of the Acoustical Society of America. 1961;33:909–916.
- 34.Strauß G, Schaller S, Zaminer B, Heininger S, Hofer M, Manzey D, Meixensberger J, Dietz S, Luth T. Klinische Erfahrungen mit einem Kollisionswarnsystem. HNO. 2010;59:470–479. doi: 10.1007/s00106-010-2237-0.
- 35.Voormolen E, Woerdeman P, van Stralen M, Noordmans H, Viergever M, Regli L, van der Sprenkel J. Validation of exposure visualization and audible distance emission for navigated temporal bone drilling in phantoms. PLoS One. 2012;7:e41262. doi: 10.1371/journal.pone.0041262.
- 36.Walker B, Nees M. Theory of Sonification. In: Hermann T, Hunt A, Neuhoff J, editors. Handbook of Sonification. Academic Press; New York: 2011. pp. 9–31.
- 37.Wegner K, Karron DB. Surgical navigation using audio feedback. In: Morgan K, Hofman H, Stredney D, Weghorst S, editors. Medicine Meets Virtual Reality. IOS Press; 1997. pp. 450–458.
- 38.Wegner K. Surgical navigation system and method using audio feedback. 5th International Conference on Auditory Display 1998.
- 39.Willems P, Noordmans H, van Overbeeke J, Viergever M, Tulleken C, van der Sprenkel J. The impact of auditory feedback on neuronavigation. Acta Neurochirurgica. 2005;147:167–173. doi: 10.1007/s00701-004-0412-3.
- 40.Woerdeman P, Willems P, Noordmans H, van der Sprenkel J. Auditory feedback during frameless image-guided surgery in a phantom model and initial clinical experience. Neurosurgery. 2009;110:257–262. doi: 10.3171/2008.3.17431.
