Philosophical Transactions of the Royal Society B: Biological Sciences. 2023 Aug 7;378(1886):20220340. doi: 10.1098/rstb.2022.0340

Conserved features of eye movement related eardrum oscillations (EMREOs) across humans and monkeys

Stephanie N Lovich 1,2,3,4, Cynthia D King 1,2,3,4, David L K Murphy 1,3,4,5, Hossein Abbasi 8, Patrick Bruns 8, Christopher A Shera 9, Jennifer M Groh 1,2,3,4,6,7
PMCID: PMC10404921  PMID: 37545299

Abstract

Auditory and visual information involve different coordinate systems, with auditory spatial cues anchored to the head and visual spatial cues anchored to the eyes. Information about eye movements is therefore critical for reconciling visual and auditory spatial signals. The recent discovery of eye movement-related eardrum oscillations (EMREOs) suggests that this process could begin as early as the auditory periphery. How this reconciliation might happen remains poorly understood. Because humans and monkeys both have mobile eyes and therefore both must perform this shift of reference frames, comparison of the EMREO across species can provide insights to shared and therefore important parameters of the signal. Here we show that rhesus monkeys, like humans, have a consistent, significant EMREO signal that carries parametric information about eye displacement as well as onset times of eye movements. The dependence of the EMREO on the horizontal displacement of the eye is its most consistent feature, and is shared across behavioural tasks, subjects and species. Differences chiefly involve the waveform frequency (higher in monkeys than in humans) and patterns of individual variation (more prominent in monkeys than in humans), and the waveform of the EMREO when factors due to horizontal and vertical eye displacements were controlled for.

This article is part of the theme issue ‘Decision and control processes in multisensory perception’.

Keywords: visual–auditory integration, eye movements, reference frame, otoacoustic emissions, middle ear muscles, EMREO

1. Introduction

Linking visual and auditory space requires information about eye movements. Spatial cues to sound location involve interaural level differences (ILDs) and interaural timing differences (ITDs), cues that can be used to determine the positions of sounds with respect to the head, or a head-centred reference frame. By contrast, the locations of visual stimuli are detected via the pattern of light on the retina, an eye-centred reference frame. In short, every time the eyes move, the retina shifts with respect to the head and ears, changing the relationship between eye-centred visual spatial cues and head-centred auditory spatial cues. Thus, the brain must keep track of eye movements to resolve the difference between these two coordinate systems and support binding of visual and auditory stimuli based on their location (e.g. [1,2]).

Early work on how the brain achieves reference frame alignment was conducted via single unit recordings in rhesus monkeys. In the superior colliculus, changes in eye position were found to cause auditory receptive fields to shift their positions with respect to the head ([3–5]; see also [6–8]). This suggested that a coordinate transformation of auditory signals was occurring somewhere within the brain [1]. Subsequent work extended these observations to other multisensory brain regions such as the frontal eye fields [9,10] and parietal cortex [11–14]. Intriguingly, effects of eye position were also found in predominantly auditory brain regions such as auditory cortex [15–17] and the inferior colliculus [18–23] during responses to sound stimuli.

These reports concerning eye movement effects on auditory responses in auditory brain regions motivated our foray into the most peripheral part of the auditory system, the ear itself. We reasoned that information about eye movements could be conveyed via the descending pathways to the motor actuators within the ear, such as the middle ear muscles and the outer hair cells, and that, just like with conventional otoacoustic emissions and middle ear reflex testing (e.g. [24–26]), the impact of such signals might produce movements of the eardrum that could be detected by microphones in the ear canal. This led to the discovery of eye movement-related eardrum oscillations (EMREOs) [27–30] and inquiries into their relationship to visual and auditory perception [31,32].

EMREOs occur with saccades, time-locked to the beginning of, or sometimes slightly preceding, their onset, and they show phase-resetting at saccade offset. The oscillation then continues for at least several tens of milliseconds after the eyes stop moving. The waveforms of EMREOs carry precise and parametric information about the direction and amplitude of the saccade [29]. EMREOs occur in the absence of incoming sound, and how they (or more properly, the underlying mechanism they reflect) impact sound transduction is unknown. It is also unknown how they might contribute to the eye movement-related modulation of sound responses observed later in the brain's auditory pathway [9–23].

Intriguingly, EMREOs are seen in both humans and rhesus monkeys [27]. A detailed comparison between the human and monkey EMREOs can therefore serve as a natural experiment to shed light on which features of the EMREO are most likely to be functionally important to auditory coordinate transformations or any other potential purpose. This human–monkey comparison is particularly apt because the two species have similar visual and auditory acuity and similar eye movements [33]. They are also known to integrate visual and auditory space in reasonably similar ways, showing similar thresholds for fusing versus distinguishing visual and auditory locations [34]. Features of the EMREO that are shared across humans and monkeys may represent aspects that are particularly important for these shared perceptual and oculomotor attributes. In short, a quantitative cross-species comparative analysis can shed light on the function and underlying mechanism of this phenomenon, and is a needed advance over the qualitative assessment provided in our initial study [27].

We report here that EMREOs in both humans and monkeys exhibit a similar dependence on the horizontal displacement of the eye during the accompanying saccade. Differences across species include the frequency of the EMREO waveform, which is higher in monkeys than in humans, and the dependence of the EMREO on the vertical displacement of the eye, which showed more variation across individual monkeys than in individual humans, causing this signal to ‘wash out’ when averaged across monkeys. Monkeys and humans also differed in the nature of the EMREO waveform when the dependence on horizontal and vertical saccade amplitude is controlled for. However, EMREOs were similar across different task designs in humans, suggesting that they are associated with the eye movements regardless of how the eye movements are elicited. We hypothesize that the eye displacement signals reflected in EMREOs may be used during auditory localization and perception, and that their horizontal dependence may be specifically used to tailor interaural timing and level difference cues to sound location in an eye movement-dependent fashion.

2. Methods

(a) Participants

We recorded eye movement-related eardrum oscillations (EMREOs) simultaneously in the left and right ears of 4 monkeys (3 female, 1 male) and 21 human subjects (14 female, 7 male). All procedures involving human subjects were approved by the Duke University Institutional Review Board. All procedures involving monkey subjects were performed in accordance with an animal protocol approved by Duke University IACUC. Subjects had apparently normal hearing and normal or corrected vision. Informed consent was obtained from all human participants before testing, and all human participants received monetary compensation for participation.

(b) Tasks

(i) Monkey behavioural task

To maximize the rate of data collection in the monkeys and permit the testing of relatively untrained animals, we allowed monkeys to freely view a black monitor screen and make saccades wherever and whenever they wished (free-viewing saccade task). The black screen made capturing the pupil of the monkey easier and more accurate because of the size of the pupil in darkness. The free-viewing task allowed us to test any of the available monkeys in our colony regardless of their training status. No external sounds were presented.

(ii) Human behavioural tasks

Since our previous study involved performance of visually guided saccades to specific targets [27], but the monkey testing for the present study involved free viewing, we tested 4 of our human subjects on both a visually guided saccade task similar to the one we used before (figure 1a) and a free-viewing task (figure 1b) similar to the one used for monkeys (figure 1c). The remainder of the human subjects were tested only on the visually guided saccade task. For the free-viewing task, humans viewed abstract Jackson Pollock paintings, which helped them stay awake and engaged throughout the task. They made saccades wherever and whenever they wished, similar to the monkeys. This permitted us to establish whether different tasks affect the EMREO signals (in humans) and to compare the monkey and human results using a free-viewing task. No external sounds were presented during either task.

Figure 1.

Task and microphone setup. (a) Humans performed a visually guided saccade task in which the black fixation dot appeared on the screen for 750 ms. The black dot would then disappear and one of the green dots would appear for 750 ms. Subjects had to saccade to the green dot and fixate for a minimum of 200 ms, at which point the dot would turn red signalling that the trial was over. An example of one subject's eye position during performance of one block can be seen on the right. (b) Humans performed a free-viewing saccade task interleaved with the visually guided saccade task in which they made saccades in any direction and magnitude of their own volition (example performance on the right). (c) Monkeys performed the same free-viewing task as humans but with a black screen (example performance on the right). (d) A microphone was placed into each ear canal of the subject and recorded pressure changes of the eardrum while the subject made saccades. No external sounds were presented. OAEs, otoacoustic emissions.

(c) Sessions, head restraint, eye tracking and saccade identification

(i) Sessions

An earphone assembly consisting of a microphone (Etymotic ER-10B+) and transducer (Etymotic ER 2) was placed in each of the subject's ears while they made saccades (figure 1d). In a given session, human subjects performed three blocks (5 min/block) of the visually guided saccade task interleaved with similar length blocks of free viewing, for a total of six blocks. Each session lasted approximately one hour. Microphones (48 kHz sampling rate) were assessed prior to each session by placing each earphone assembly into a test tube and playing a frequency sweep from 10 Hz to 2 kHz through the transducer (pure tone sweep from 10–1600 Hz: 10 Hz steps for 10–200 Hz, 100 Hz steps for 300–1000 Hz, 200 Hz steps for 1200–2000 Hz) to evaluate recording fidelity over a broad frequency range. Microphones were also assessed before every block of a session using the same frequency sweep while in the ears of the subject. Monkey sessions lasted about 60–180 min. Human participants were tested in a total of 3–5 sessions each and monkeys were tested in 20–30 sessions.
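As a minimal sketch, the set of test-tone frequencies in the microphone check can be enumerated from the step sizes quoted above. The function name `sweep_frequencies` is hypothetical, and the sketch follows the listed step ranges (which extend to 2000 Hz) rather than any particular implementation used for the recordings.

```python
def sweep_frequencies():
    """Tone frequencies (Hz) built from the step sizes quoted in the text."""
    low = list(range(10, 201, 10))       # 10 Hz steps from 10 to 200 Hz
    mid = list(range(300, 1001, 100))    # 100 Hz steps from 300 to 1000 Hz
    high = list(range(1200, 2001, 200))  # 200 Hz steps from 1200 to 2000 Hz
    return low + mid + high


freqs = sweep_frequencies()
print(len(freqs), "tones from", freqs[0], "Hz to", freqs[-1], "Hz")
```

The dense sampling below 200 Hz reflects the fact that the signal of interest lies at much lower frequencies than conventional otoacoustic emissions.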

(ii) Head restraint and eye tracking

Head movements were minimized in human participants with a chin rest and in monkey participants using a surgically implanted head post. Surgical implantation was accomplished under general anesthesia, aseptic conditions and with suitable analgesics to minimize discomfort in accordance with IACUC regulations. Eye movements were monitored in both humans and monkeys using the same eye tracking system (EyeLink1000), and using similar brightness, pupil size, and corneal reflections settings. Eye position was sampled at 1000 Hz. In most subjects, the left eye was tracked and the right eye was assumed to move in a conjugate fashion. In human subjects, the eye tracking system was calibrated based on saccades to visual targets. In the monkeys, a visual task-based calibration was not performed but the EyeLink's base settings for gain and offset based on subject distance were used and found to be effective. We verified this by checking the monkey eye data against the following assumptions: (a) the average eye position should be roughly straight ahead, and (b) the standard deviation of the distribution of eye positions should be roughly 10°. Operationally, such assumptions have worked well for us in the past when calibrating eye tracking in monkeys prior to training on visual tasks. The data generally accorded well with the standard deviation assumption, indicating the gain setting was roughly appropriate, but the centre of the distributions was not necessarily centred at straight ahead for all monkeys and sessions, indicating that the offset setting was not always perfect. As described further below, the data analysis methods used were robust to differences in these calibration procedures.
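The two calibration assumptions described above (mean gaze roughly straight ahead, standard deviation roughly 10°) can be expressed as a simple check. This is an illustrative sketch only: the function name, the tolerances `centre_tol` and `sd_tol`, and the choice to test each dimension separately are assumptions and are not taken from the study.

```python
import numpy as np


def check_uncalibrated_gaze(h_deg, v_deg, expected_sd=10.0, centre_tol=5.0, sd_tol=0.5):
    """Compare gaze statistics with the two assumptions stated in the text.

    centre_tol (degrees) and sd_tol (fraction of expected_sd) are
    illustrative tolerances, not values from the study.
    """
    report = {}
    for name, pos in (("horizontal", h_deg), ("vertical", v_deg)):
        pos = np.asarray(pos, dtype=float)
        mean, sd = pos.mean(), pos.std(ddof=1)
        report[name] = {
            "mean_deg": mean,
            "sd_deg": sd,
            # Offset setting: mean gaze should be near straight ahead (0 deg).
            "offset_ok": abs(mean) <= centre_tol,
            # Gain setting: spread of gaze should be near the expected SD.
            "gain_ok": abs(sd - expected_sd) <= sd_tol * expected_sd,
        }
    return report


# Synthetic session: appropriate gain, but an 8 deg horizontal offset.
rng = np.random.default_rng(0)
report = check_uncalibrated_gaze(rng.normal(8.0, 10.0, 5000), rng.normal(0.0, 10.0, 5000))
print(report["horizontal"]["offset_ok"], report["horizontal"]["gain_ok"])
```

In this synthetic case the gain check passes while the horizontal offset check fails, which mirrors the pattern reported for some monkey sessions.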

(d) Initial analysis steps for microphone and eye tracking data

After collection, microphone data were downsampled from 48 kHz to 2 kHz for further analysis. These data were otherwise unfiltered. Saccade onsets and offsets were identified based on the rate of change in eye acceleration (third derivative, or ‘jerk’). After smoothing with a 7 point filter (7 ms), the first peak of the jerk was considered to be saccade onset and the second peak of the jerk was considered to be saccade offset [28,29].
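The jerk-based identification of saccade onset and offset can be sketched as follows. Several details are assumptions made for illustration: a one-dimensional position trace, numerical differentiation with `np.gradient`, a 7-point moving average applied to the jerk itself, and a relative height threshold (`rel_height`) to ignore small fluctuations. The function name is hypothetical, and the example uses a synthetic minimum-jerk movement rather than recorded data.

```python
import numpy as np


def saccade_bounds_from_jerk(position_deg, fs=1000, smooth_pts=7, rel_height=0.5):
    """Return (onset, offset) sample indices from the first two jerk peaks."""
    pos = np.asarray(position_deg, dtype=float)
    direction = np.sign(pos[-1] - pos[0])
    if direction == 0:
        return None  # no net displacement in this trace
    dt = 1.0 / fs
    velocity = np.gradient(pos, dt)
    acceleration = np.gradient(velocity, dt)
    # Jerk, signed so that the peaks are positive for either saccade direction.
    jerk = direction * np.gradient(acceleration, dt)
    # 7-point moving average (7 ms at 1000 Hz).
    kernel = np.ones(smooth_pts) / smooth_pts
    jerk_s = np.convolve(jerk, kernel, mode="same")
    # Local maxima that exceed a fraction of the largest peak.
    mid = jerk_s[1:-1]
    is_peak = (mid > jerk_s[:-2]) & (mid >= jerk_s[2:]) & (mid > rel_height * jerk_s.max())
    peaks = np.flatnonzero(is_peak) + 1
    if peaks.size < 2:
        return None
    return int(peaks[0]), int(peaks[1])


# Synthetic 10 deg movement lasting 50 ms, starting 200 ms into the trace.
t = np.arange(500)  # sample index = time in ms at 1000 Hz
tau = np.clip((t - 200) / 50.0, 0.0, 1.0)
pos = 10.0 * (10 * tau**3 - 15 * tau**4 + 6 * tau**5)
onset, offset = saccade_bounds_from_jerk(pos)
print("onset sample:", onset, "offset sample:", offset)
```

For this idealized profile the jerk is largest at the start and end of the movement, so the first and second peaks fall close to the true onset and offset.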

(e) Data inclusion criteria and normalization of signals across subjects/sessions

Data could be excluded at the session/block level or at the trial/saccade level. At the session/block level, data for the entire session/block were excluded if the assessment of the microphone via frequency sweep for that session or block indicated any major issues, such as a possible shift in the position of the earphone assembly. Within each session and block, individual trials or saccades could be excluded on the basis of the eye tracking data. Trials/saccades were excluded if there was less than 200 ms of steady fixation before saccade onset or after saccade offset. Saccades could also be excluded if the saccade curvature was too great (more than 4.5° of subtended angle). These screens typically caused the exclusion of many eye movements in the free-viewing tasks as they proved to be drifting movements rather than true saccades or they were not preceded and followed by satisfactory periods of steady fixation. After these screens were applied, we also excluded individual trials based on whether the root mean square (RMS) noise level on individual trials exceeded a criterion value (0.02, arbitrary units, set based on inspection of the normal RMS range in quiet recording sessions). We also excluded trials if the maximum or minimum microphone value on that trial was more than 10 standard deviations from the mean microphone value in that session or block. Overall, these exclusion criteria served to minimize the inclusion of trials in which the monkey generated noise via fidgeting or other behaviours. The same criteria were applied to both human and monkey sessions. For monkeys, between 7 and 37% of potential saccades were included for analysis, yielding between 9000 and 51 000 total saccades for each subject. For the four human subjects tested similarly to the monkeys, between 24 and 86% of potential saccades were included for analysis, yielding between 700 and 1500 saccades for each subject and task condition.

After these exclusion criteria were applied, we then computed the Z-scores of the microphone values relative to a pre-saccadic baseline (−100 to −40 ms prior to saccade onset). Results in the present study are therefore expressed in units of standard deviation relative to this baseline period.
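The microphone-based trial screens and the baseline normalization can be sketched as below. This is a minimal illustration under stated assumptions: trials are stored as a saccade-aligned array of shape (trials, samples), the RMS is computed on the raw trace, the 10 standard deviation criterion uses the mean and SD of all samples in the block, and the baseline mean and SD are pooled across retained trials. None of these implementation choices is specified in the text, and the function name is hypothetical.

```python
import numpy as np


def screen_and_zscore(trials, t_ms, rms_max=0.02, n_sd=10.0, baseline=(-100.0, -40.0)):
    """Apply the RMS and extreme-value screens, then Z-score to the baseline."""
    trials = np.asarray(trials, dtype=float)
    # Screen 1: per-trial RMS must not exceed the criterion (arbitrary units).
    rms = np.sqrt(np.mean(trials**2, axis=1))
    keep = rms <= rms_max
    # Screen 2: per-trial max/min within n_sd SDs of the block mean.
    mu, sd = trials.mean(), trials.std()
    extreme = (trials.max(axis=1) > mu + n_sd * sd) | (trials.min(axis=1) < mu - n_sd * sd)
    keep &= ~extreme
    kept = trials[keep]
    # Z-score relative to the pre-saccadic baseline window (pooled over trials).
    in_base = (t_ms >= baseline[0]) & (t_ms <= baseline[1])
    base = kept[:, in_base]
    z = (kept - base.mean()) / base.std()
    return z, keep


# Synthetic block: 50 quiet trials at 2 kHz, one made artificially noisy.
rng = np.random.default_rng(0)
t_ms = np.arange(-300, 300) * 0.5  # -150 ms to +149.5 ms around saccade onset
trials = rng.normal(0.0, 0.005, size=(50, t_ms.size))
trials[0] *= 10.0
z, keep = screen_and_zscore(trials, t_ms)
print("trials kept:", int(keep.sum()), "of", keep.size)
```

After this step a value of 1 in the output corresponds to one baseline standard deviation, which is the unit used for all results that follow.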

(f) Data analysis

Extending/replicating our previous methods [27,29], we fit the raw EMREO signal collected in each ear to a linear regression model that includes the basic parameters of each eye movement:

$$\mathrm{Mic}(t) = B_{\Delta H}(t)\,\Delta H + B_{\Delta V}(t)\,\Delta V + B_{H}(t)\,H + B_{V}(t)\,V + C(t).$$

The first four terms concern the change in eye position relative to the starting position for each saccade in the horizontal and vertical dimensions, and are measured in degrees. Specifically, these terms concern the horizontal displacement of the eye (ΔH), vertical displacement of the eye (ΔV), the horizontal initial position (H) and the vertical initial position of the eye (V). Time-varying coefficients BΔH(t), BΔV(t), BH(t) and BV(t) were fit to these parameters. The final term is a time-varying constant component (C(t)) that captures everything else left in the signal. C(t) can be thought of as the best-fitting average oscillation across all eye positions and displacements.
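One way to fit a model of this form is ordinary least squares carried out independently at each time point, which yields one coefficient time course per regressor. The sketch below assumes that approach and uses `numpy.linalg.lstsq`; the function name, the array layout (saccades × time samples) and the synthetic data are illustrative assumptions, and the significance testing described in the Results is not included.

```python
import numpy as np


def fit_emreo_regression(mic, dH, dV, H, V):
    """Least-squares fit of Mic(t) on [dH, dV, H, V, 1] at every time point.

    mic: array of shape (n_saccades, n_time); the other inputs have one
    value per saccade (degrees). Returns coefficient time courses.
    """
    X = np.column_stack([dH, dV, H, V, np.ones(len(dH))])
    # lstsq solves all time points at once: coef has shape (5, n_time).
    coef, *_ = np.linalg.lstsq(X, np.asarray(mic, dtype=float), rcond=None)
    names = ("B_dH", "B_dV", "B_H", "B_V", "C")
    return dict(zip(names, coef))


# Synthetic check: generate noiseless data from known coefficients.
rng = np.random.default_rng(1)
n_saccades, n_time = 200, 50
dH, dV, H, V = rng.uniform(-20, 20, size=(4, n_saccades))
true = rng.normal(size=(5, n_time))
mic = np.column_stack([dH, dV, H, V, np.ones(n_saccades)]) @ true
fit = fit_emreo_regression(mic, dH, dV, H, V)
print(np.allclose(fit["B_dH"], true[0]), np.allclose(fit["C"], true[4]))
```

Because each time point is fit separately, the constant term C(t) is simply the intercept time course, consistent with its description as the average oscillation remaining once the eye movement parameters are accounted for.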

This regression model allows us to analyse the EMREO with respect to the individual parameters of the synchronous eye movement and is largely robust to the particular task involved, except that the initial position parameters will mainly capture fixational scatter in the visually guided saccade task versus a much larger range of initial positions in the free-viewing tasks. In addition, as noted above, the monkey calibration procedures left the precise values of position somewhat uncertain. Thus, we do not necessarily expect perfect correspondence of the initial position coefficients of the regression fits across these two tasks; any small differences in the accuracy of eye tracking calibration will also affect the precise values of these coefficients. Accordingly, we will focus on the regression coefficients for displacement in the horizontal and vertical dimensions as well as the constant component when comparing humans and monkeys, building on our related work [29,30].

3. Results

In the human participants, we first confirmed that task performance had little impact on key aspects of the EMREO signal. We found that the regression coefficients of horizontal displacement (ΔH), vertical displacement (ΔV) and the constant component (C) across four human subjects while they performed the visually guided saccade task (purple) and the free-viewing saccade task (green) were similar (figure 2a, data from four right ears). While different subjects showed different oscillatory patterns, the overall pattern was generally very consistent within subjects across tasks. The thick parts of the EMREO signal traces in figure 2a indicate the time periods when the corresponding regression coefficient differed significantly from zero with 95% confidence, and the shaded areas are the SEM. The horizontal displacement regression coefficient (figure 2a, top row) generally differed from zero for several tens of milliseconds starting at saccade onset except where it crossed zero as it transitioned between a peak and a trough, but there was little notable difference across tasks. The vertical term showed more variation across subjects (figure 2a middle row) but, when present, was still quite similar within each subject for each task (e.g. S88). The constant component was also variable across subjects, but similar across tasks. Overall, these results indicate that an oscillatory dependence on horizontal saccade amplitude occurs across all subjects and for both visually guided and free-viewing saccades, with some variation in the timing, amplitude, precise waveform and presence/magnitude of the vertical dependence and constant components. This pattern is consistent with our previous work [27–30] and suggests that the free-viewing saccade task is an equally effective way to collect EMREO data.

Figure 2.

Regression analysis and individual subject data. (a) The regression coefficient reflecting the influence of horizontal and vertical displacements of the eye on the EMREO signal (top, middle, respectively) and the constant component (bottom) in the right ears of humans across both tasks (visually guided in purple and free viewing in green). The numbers of saccades for each subject/task type are noted in the horizontal component graph for each subject. (b) Similar to (a) but for the right ears of monkeys during the free-viewing task. Shading indicates standard error of the measurement (SEM). Because there are many more saccades for the monkeys than the humans, the SEM appears smaller for this species.

Figure 2b shows corresponding findings for the right ears of four monkey subjects while they performed the free-viewing saccade task (black). Like the human data, the EMREO is strongly dependent on the horizontal displacement of the associated eye movement (figure 2b top row). The oscillation co-varies in phase and amplitude similarly to the signal recorded in humans. Like the humans, the waveform varied to some extent across subjects, but all showed a sharp onset at (or slightly before) saccade onset, continuing for at least one full oscillatory cycle (i.e. including one peak and one trough), and all subjects had a significant signal at some points in the first few tens of milliseconds after saccade onset. Similar to the human data, the vertical and constant components varied across monkey subjects more than did the horizontal component. The vertical coefficients (middle row) differed considerably across monkeys in magnitude and waveform. As can be seen from the differences in the scales of the y-axes of the corresponding horizontal and vertical plots for each monkey, the vertical dimension contributed more weakly than the horizontal dimension did to the EMREO signal in two monkeys (U, J), but more strongly in a third (Y); the two dimensions exerted a roughly comparable effect in the fourth (F). Similarly, while some subjects have clear, significant oscillations in the constant component (monkey F and monkey U), others have a much noisier signal (monkey Y; figure 2b bottom row). The constant component of the monkeys also differs from the humans in that the largest positive peaks occur later in the time course of the EMREO (around 75 ms–100 ms) compared to within the first 50 ms after saccade onset in humans.

A population-level comparison across all subject ears is illustrated in figure 3a–c. This figure shows the average of all human data for both the visually guided saccade task (purple) and the free-viewing saccade task (green) as well as the monkey data for the free-viewing saccade task (black), aligned to onset of the saccade. The additional human subjects tested only on the visually guided saccade task (n = 17 subjects, n = 34 ears) are also included (orange traces). As suggested in figure 2, task has little effect on the EMREO signal within the same subjects (green versus purple) and the results in this small sample accord well with the larger group tested only with the visual task (orange).

Figure 3.

Average EMREO signal across populations of subjects. Average EMREO regression coefficients across both left and right ears are shown for 3 groups/sets of conditions for human subjects in comparison to the monkeys. Purple and green traces show the same set of 4 human participants as illustrated in figure 2, who performed the visually guided saccade task (purple) and the free-viewing saccade paradigm (green). Orange traces show an additional larger group (N = 17 subjects, 34 ears) who only performed the visually guided saccade task. The monkey free-viewing saccades data are shown in black. The regression coefficient concerning horizontal displacement of the eye is plotted in (a), the coefficient for vertical displacement is shown in (b) and the constant component is plotted in (c). Regression coefficients were first computed within each subject ear. To facilitate combining across ears, the sign convention for the horizontal dimension was defined relative to the recorded ear (e.g. saccade amplitude was assigned a positive value for contralateral saccades and a negative value for ipsilateral saccades). The regression coefficients for each subject ear were then averaged together across all the relevant subjects (monkeys, human free viewing, corresponding human visually guided task, and the additional human subjects tested in visually guided tasks only). Shading indicates SEM across the subject ears included in each group. Mean saccade ending time ± standard deviation for each species and task is plotted above the regression coefficients along the same time axis.
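The ear-relative sign convention and the across-ear averaging described in this caption can be sketched as follows. The sketch assumes that raw horizontal displacement is positive for rightward saccades (an assumption, not stated in the text) and that the SEM is the sample standard deviation across ears divided by the square root of the number of ears. Function names are hypothetical.

```python
import numpy as np


def to_contra_positive(dH_deg, ear):
    """Re-sign horizontal displacement so contralateral saccades are positive.

    Assumes raw dH is positive for rightward saccades.
    """
    dH = np.asarray(dH_deg, dtype=float)
    if ear == "left":
        return dH    # rightward saccades are contralateral to the left ear
    if ear == "right":
        return -dH   # leftward saccades are contralateral to the right ear
    raise ValueError("ear must be 'left' or 'right'")


def group_mean_and_sem(coefs):
    """Mean and SEM across subject ears; coefs has shape (n_ears, n_time)."""
    coefs = np.asarray(coefs, dtype=float)
    mean = coefs.mean(axis=0)
    sem = coefs.std(axis=0, ddof=1) / np.sqrt(coefs.shape[0])
    return mean, sem


mean, sem = group_mean_and_sem([[1.0, 2.0], [3.0, 4.0]])
print(to_contra_positive([10.0, -5.0], "right"), mean, sem)
```

Applying the re-signing before the regression means that coefficients from left and right ears share a polarity and can be averaged without cancelling.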

On average, the chief difference between the horizontal displacement regression coefficients in humans versus monkeys was in the waveform shape and time course. The oscillation initially occurs at a higher frequency for the monkeys' responses than the humans, and this rapid oscillatory portion then transitions to several much slower peaks/troughs.

The chief differences for the average vertical and constant components between humans and monkeys were in their magnitude (figure 3b). For the constant component, the human data exhibit clear significant peaks and align in phase at the onset of the saccade for both tasks (purple: visually guided; green: free viewing). There is no comparably consistent signal in the monkey data. As seen in figure 2, some individual monkeys (monkey F and monkey U) have significant positive and negative peaks between 50–100 ms, very different from what we see in the humans. However, the timing and waveform of these signals differed in these two animals, and the other two monkeys had no peaks in the constant component that were significantly different from zero. Therefore the average across the monkeys is much smaller than for the humans.

Similarly, even though all four individual monkeys exhibited vertical components, they all differed from each other in waveform, whereas in the humans this signal was not necessarily larger but its timing and waveform were more consistent across individuals. The net result is that when averaging across individual subjects, the vertical signal actually appears smaller in the monkeys than in the humans. It is unclear why the constant and vertical components show this inter-individual variability within monkeys or variability between the monkey and human groups.

To a degree, the faster time course of the monkey EMREO compared to humans may relate to the faster time course of monkey saccades combined with the known phase-resetting associated with saccade offset. As shown in figure 3a,b, the saccades of the monkeys were shorter in duration than those of the humans (black versus purple versus green versus orange bars at the top of the graphs). It is known that monkeys make faster saccades than humans (compare findings in [33,35–37]), and this was true in our participants as well.

In sum, our findings indicate that the dependence of the EMREO on the horizontal displacement of the eye movement is quite conserved across humans and monkeys, differing mainly in the time course of the oscillation. Greater differences are observed in the vertical and constant components, and whether these signals appear comparable versus smaller in monkeys versus humans depends on whether the analysis is conducted at the level of individual participants or is averaged across the population.

4. Discussion

Sounds and visual stimuli are detected in different ‘native’ coordinate systems. The location of a sound is computed from cues involving loudness and timing differences across ears (ILDs and ITDs), as well as variation in spectral frequency as a function of location with respect to the ears. These cues yield location information in a head-centred reference frame. The location of a visual stimulus is ascertained from the pattern of light hitting the retina, producing cues in an eye-centred reference frame. Information about the position and movements of the eyes with respect to the head is therefore necessary before the locations of visual and auditory stimuli can be matched up with one another. Understanding how the brain incorporates this information to reconcile these different coordinate systems is critical to knowing how the brain links information across the visual and auditory systems.

Research into the intersection between eye movements and hearing has involved psychophysical studies in humans (e.g. [23,38–48]) and neurophysiological studies in monkeys (e.g. [3–6,9,11–23,49,50]; see also [7,8]). The discovery of EMREOs in both humans and monkeys [27–32] afforded an opportunity to study an early aspect of this process in the same way in two different species. Given that how EMREOs actually contribute to auditory coordinate transformations has not yet been established, a comparison between human and monkey EMREOs provides an opportunity to identify clues as to which aspects of the EMREO signal are conserved across species.

We report here that in both humans and monkeys, EMREO waveforms depend on the magnitude and direction of the accompanying saccade, especially in the horizontal dimension. Performance of a visual task is not necessary; the signal is similar regardless of whether there is an explicit reason for making a saccade or if the participant is free-viewing. EMREOs occur without the presentation of visual or auditory stimuli and across an assortment of target locations, saccade magnitudes and saccade tasks. The EMREO's dependence on horizontal eye displacement exhibits a higher oscillation frequency in monkeys than humans, but a similar pattern is not seen for the vertical dimension and the significance of this aspect of the signal is therefore unclear. A portion of this observation may simply reflect the fact that phase-resetting of the EMREO occurs at saccade offset, and monkey saccades are shorter in duration than human saccades. Put another way, if the period of the EMREO oscillation is roughly proportional to saccade duration, the systematic differences in saccade duration between monkeys and humans could account for this finding. However, it is unclear why a similar pattern is not clearly observed in the other aspects of the EMREO such as the vertical and constant components. Differences in the resonance properties of the eardrum or associated structures could also contribute to these frequency/shape differences.

By contrast to the largely similar horizontal displacement component (within and across species), the waveforms and amplitudes of the vertical and constant components in the EMREO regression analysis differed across humans and monkeys, and, in the latter, across subjects. The constant term reflects the average EMREO waveform across all directions and amplitudes of movement. In humans, there appears to be a basic non-zero EMREO waveform upon which the dependence on eye movement parameters is superimposed. In monkeys, this basic EMREO waveform is much smaller and variable across individuals. The vertical signal is also sufficiently variable across monkeys that it largely disappears when averaging at the population level (figure 3), leaving chiefly the dependence of the waveform on the eye movement parameters as the dominant signal. That the basic EMREO waveform is different between humans and monkeys, and shows different properties in the horizontal versus vertical dimensions despite largely similar visual–auditory–oculomotor constraints and performance in the two species, suggests that the overall shape of the EMREO waveform may be of less significance than its dependence on eye movement parameters. As long as the signal is reproducible within an individual participant, it may contain the signatures needed for the brain to interpret its hypothesized impact on sound transduction in an eye movement-dependent fashion.

Our study provides an advance over our previous work [27] in several ways. Our previous study presented monkey data in aggregated form only, and while the results appeared qualitatively similar to the aggregated human data, we did not perform an analysis of individual monkey data to establish that parallel on a more formal basis. Here, we used the same regression analysis on both human and monkey data, and we evaluated the pattern of results at both the individual and population levels. For comparison purposes, we also examined both the dependence of the EMREO on the horizontal and vertical displacements of the eyes and the average EMREO waveform.

Our study has several limitations. The free-viewing method did not permit fully exploring aspects of the EMREO signal relating to the initial position of the eyes. In addition, the free-viewing paradigm may not have yielded fully comparable distributions of eye movements across individuals and species, and this could have led to different patterns of ‘variance capture’ by the different regression terms across individuals and species. The lack of a visually guided saccade task also limits the accuracy of our eye tracking calibration. Future work with a controlled task will be needed to clarify these points. A second, related limitation of both the current work and our previous work is that the use of a microphone to measure the efferent system of the ear limits the analysis to oscillatory aspects of the signal; it is possible that there are steady-state changes that are not reflected in the measurements made here. Such steady-state changes can potentially be detected via laser Doppler vibrometry or optical coherence tomography [51] and may provide more information about sensitivity to initial/final eye position.

A third limitation is that we did not attempt to quantify the overall amplitude of the EMREO in humans versus monkeys. This would have been challenging for several reasons. First, the microphone used here is specialized for the frequency range of conventional otoacoustic emissions (e.g. greater than 300 Hz), and thus its gain in the much lower frequency range of EMREOs is less certain (we have previously estimated the amplitude of the human EMREO for an 18 degree eye movement to be about 57 dB SPL, based on modelling of the microphone's transfer function in this frequency range [27]). Second, the relationship between the microphone's output and the amplitude of the eardrum oscillation will depend on the size and acoustic properties of the ear canal space, which likely differ between humans and monkeys. Accordingly, the microphone measurements used in this study are expressed in units of standard deviation relative to a baseline period. This permits comparison across species on an even footing, although it leaves uncertain the precise conversion to sound pressure level.
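
Expressing a trace in units of baseline standard deviation can be sketched as below. This is an illustration under assumptions rather than the procedure actually used: this section does not state whether normalization was done per trial, per session or per subject, so the sketch normalizes each trace by its own baseline window, and it also subtracts the baseline mean. The names `zscore_to_baseline` and `spl_to_pascal` are hypothetical. The second helper only converts a dB SPL value to RMS pressure using the standard 20 µPa reference, to give a sense of scale for the 57 dB SPL estimate quoted above.

```python
import numpy as np

def zscore_to_baseline(trace, baseline):
    """Express a trace in units of the standard deviation of its baseline window.

    trace    : array whose last axis is time
    baseline : slice selecting the baseline samples (an assumed window)
    """
    base = trace[..., baseline]
    mu = base.mean(axis=-1, keepdims=True)
    sd = base.std(axis=-1, ddof=1, keepdims=True)
    return (trace - mu) / sd

def spl_to_pascal(db_spl, p_ref=20e-6):
    """Convert dB SPL to RMS pressure in pascals (reference pressure 20 micropascals)."""
    return p_ref * 10 ** (db_spl / 20)

# Toy trace: the first three samples are treated as the baseline period.
trace = np.array([1.0, 2.0, 3.0, 10.0])
normalized = zscore_to_baseline(trace, slice(0, 3))

# Scale check for the previously estimated 57 dB SPL human EMREO.
pressure_pa = spl_to_pascal(57.0)
```

Normalizing in this way removes the dependence on microphone gain and ear canal acoustics, which is why it allows comparison across species, at the cost of losing the absolute pressure scale.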

Our overarching theory is that the mechanism(s) underlying EMREOs contribute to the process of localizing an incoming sound with respect to the eyes. How exactly this may happen remains uncertain. The cues to sound location involve interaural timing differences, interaural level differences and the spectral content of sound. Eye movement-related adjustments to any of these cues could contribute to the computation of sound location with respect to the eyes for subsequent integration with visual signals.

Understanding how this might occur requires considering the potential role(s) of the ear's motor elements in generating EMREOs. These candidate motor elements consist of the two middle ear muscles (MEMs), the stapedius and tensor tympani, and the outer hair cells (OHCs). Generally, all three of these motor elements are thought to control the gain of the response of the ear to sound, with the MEMs thought of as playing a role in dampening the responses to loud sounds and the OHCs thought of as enhancing the responses to quiet sounds. Adjusting the gain of the response differently in the two ears in an eye movement-dependent fashion would likely affect the inferred interaural level difference cues. For example, if deviation of the eyes to the left were to cause the gain of sound transduction to decrease in the left ear and/or increase in the right ear, the resulting alteration in the binaural level difference produced by incoming sounds would no longer be anchored solely to the head but could then be interpreted as indicating sound location with respect to the eyes. In addition, recent work by Cho and colleagues [52] has suggested that contractions of the MEMs should not only influence the gain of responsiveness to incoming sound, but also affect the time delay of sound transmission, and thus potentially interaural timing difference cues in a similar fashion. Both timing and level difference cues relate solely to the horizontal dimension, which is the most consistent and powerful contributor to the EMREO signal in both humans and monkeys. How spectral cues are processed in the brain is less well known, but the ability of OHCs to alter processing in a frequency-specific fashion could be relevant to how this cue might be modulated in an eye movement-dependent way.
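
The level-difference example in the paragraph above can be made concrete with a small calculation. This is a sketch under assumptions, not a model from the study: the sign convention (right ear minus left ear, in dB), the function name `effective_ild_db` and all numerical values are hypothetical and chosen only to illustrate the direction of the effect.

```python
def effective_ild_db(head_ild_db, gain_left_db, gain_right_db):
    """Interaural level difference (right minus left, in dB) after ear-specific gain changes.

    head_ild_db   : level difference produced by the sound's head-centred location
    gain_left_db  : assumed eye-movement-related gain change in the left ear
    gain_right_db : assumed eye-movement-related gain change in the right ear
    """
    return head_ild_db + (gain_right_db - gain_left_db)

# Hypothetical case: a sound slightly to the right of the head (+3 dB),
# with the eyes deviated left so that left-ear gain drops by 1 dB and
# right-ear gain rises by 1 dB.
shifted = effective_ild_db(head_ild_db=3.0, gain_left_db=-1.0, gain_right_db=1.0)
print(shifted)  # prints 5.0
```

With these made-up values the level difference grows from 3 dB to 5 dB, so the sound would be signalled as lying further to the right, which is the direction expected for a location expressed relative to leftward-deviated eyes.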

Candidate anatomical pathways exist by which oculomotor commands could reach any or all of these actuators. For example, the superior colliculus (SC), a brain region involved in controlling eye movements, projects to the inferior colliculus (IC), an auditory structure [53,54]. Auditory signals in both the SC and the IC are sensitive to changes in eye position (SC: [3–8,49]; IC: [18–23]), and the IC conveys descending projections to both the superior olivary complex [55] (the source of descending input to the outer hair cells [56,57]) and the cochlear nucleus [58,59], which in turn projects to the facial and trigeminal nerves that innervate the stapedius and tensor tympani, respectively [60]. Preliminary findings from our group are consistent with a role for all three of these types of motor actuators in the generation of EMREOs [30]. Future studies regarding these motor components and how they control mechanical processes in the ear will benefit from an animal model with a significant and consistent EMREO and with comparable ear anatomy to humans. With an animal model similar enough to humans, interventional studies can be conducted to isolate the source of motor commands and their impact on the motor actuators that create and/or modulate the EMREO signal.

Overall, it is important to understand the similarities and differences of the EMREO signal across species when considering future work to determine which anatomical components are necessary and functional in the EMREO signal. With information about the key features that are conserved in the signal, we can answer questions about the physical mechanisms in the ear that generate the EMREO and lead to sound localization and, eventually, perception.

Ethics

All procedures involving human subjects were approved by the Duke University Institutional Review Board. All procedures involving monkey subjects were performed in accordance with an animal protocol approved by Duke University IACUC. Subjects had apparently normal hearing and normal or corrected vision. Informed consent was obtained from all human participants before testing, and all human participants received monetary compensation for participation.

Data accessibility

Additional data are available via figshare [61].

Authors' contributions

S.N.L.: conceptualization, data curation, formal analysis, methodology, software, visualization, writing—original draft, writing—review and editing; C.D.K.: conceptualization, methodology, visualization, writing—review and editing; D.L.K.M.: software, writing—review and editing; H.A.: writing—review and editing; P.B.: writing—review and editing; C.A.S.: conceptualization, writing—review and editing; J.G.: conceptualization, data curation, formal analysis, funding acquisition, investigation, methodology, project administration, resources, software, supervision, validation, visualization, writing—review and editing.

All authors gave final approval for publication and agreed to be held accountable for the work performed therein.

Conflict of interest declaration

We declare we have no competing interests.

Funding

We thank Gelana Tostaeva, Justine Griego and Tingan Zhu for general laboratory support. This work was supported by NIDCD R01DC017532 and NIDCD R01DC020363 to C.D.K. and J.M.G. and by Volkswagen Foundation Grant 97624 to P.B.

References

  1. Groh JM, Sparks DL. 1992. Two models for transforming auditory signals from head-centered to eye-centered coordinates. Biol. Cybern. 67, 291-302. (doi:10.1007/BF02414885)
  2. Groh JM. 2014. Your sunglasses are in the milky way. In Making space: how the brain knows where things are, pp. 161-176. Cambridge, MA: Harvard University Press.
  3. Jay MF, Sparks DL. 1984. Auditory receptive fields in primate superior colliculus shift with changes in eye position. Nature 309, 345-347. (doi:10.1038/309345a0)
  4. Jay MF, Sparks DL. 1987. Sensorimotor integration in the primate superior colliculus. II. Coordinates of auditory signals. J. Neurophysiol. 57, 35-55. (doi:10.1152/jn.1987.57.1.35)
  5. Lee J, Groh JM. 2012. Auditory signals evolve from hybrid- to eye-centered coordinates in the primate superior colliculus. J. Neurophysiol. 108, 227-242. (doi:10.1152/jn.00706.2011)
  6. Hartline PH, Vimal RL, King AJ, Kurylo DD, Northmore DP. 1995. Effects of eye position on auditory localization and neural representation of space in superior colliculus of cats. Exp. Brain Res. 104, 402-408. (doi:10.1007/BF00231975)
  7. Zella JC, Brugge JF, Schnupp JW. 2001. Passive eye displacement alters auditory spatial receptive fields of cat superior colliculus neurons. Nat. Neurosci. 4, 1167-1169. (doi:10.1038/nn773)
  8. Populin LC, Tollin DJ, Yin TC. 2004. Effect of eye position on saccades and neuronal responses to acoustic stimuli in the superior colliculus of the behaving cat. J. Neurophysiol. 92, 2151-2167. (doi:10.1152/jn.00453.2004)
  9. Russo GS, Bruce CJ. 1994. Frontal eye field activity preceding aurally guided saccades. J. Neurophysiol. 71, 1250-1253. (doi:10.1152/jn.1994.71.3.1250)
  10. Caruso VC, Pages DS, Sommer MA, Groh JM. 2019. Compensating for a shifting world: a quantitative comparison of the reference frame of visual and auditory signals across three multimodal brain areas. J. Neurophysiol. 126, jn.00385.2020. (doi:10.1152/jn.00385.2020)
  11. Stricanne B, Andersen RA, Mazzoni P. 1996. Eye-centered, head-centered, and intermediate coding of remembered sound locations in area LIP. J. Neurophysiol. 76, 2071-2076. (doi:10.1152/jn.1996.76.3.2071)
  12. Cohen YE, Andersen RA. 2000. Reaches to sounds encoded in an eye-centered reference frame. Neuron 27, 647-652. (doi:10.1016/S0896-6273(00)00073-8)
  13. Mullette-Gillman OA, Cohen YE, Groh JM. 2005. Eye-centered, head-centered, and complex coding of visual and auditory targets in the intraparietal sulcus. J. Neurophysiol. 94, 2331-2352. (doi:10.1152/jn.00021.2005)
  14. Mullette-Gillman OA, Cohen YE, Groh JM. 2009. Motor-related signals in the intraparietal cortex encode locations in a hybrid, rather than eye-centered, reference frame. Cereb. Cortex 19, 1761-1775. (doi:10.1093/cercor/bhn207)
  15. Werner-Reiss U, Kelly KA, Trause AS, Underhill AM, Groh JM. 2003. Eye position affects activity in primary auditory cortex of primates. Current Biology 13, 554-562. (doi:10.1016/S0960-9822(03)00168-4)
  16. Fu KM, Shah AS, O'Connell MN, McGinnis T, Eckholdt H, Lakatos P, Smiley J, Schroeder CE. 2004. Timing and laminar profile of eye-position effects on auditory responses in primate auditory cortex. J. Neurophysiol. 92, 3522-3531. (doi:10.1152/jn.01228.2003)
  17. Maier JX, Groh JM. 2010. Comparison of gain-like properties of eye position signals in inferior colliculus versus auditory cortex of primates. Front. Integr. Neurosci. 4, 121-132. (doi:10.3389/fnint.2010.00121)
  18. Groh JM, Trause AS, Underhill AM, Clark KR, Inati S. 2001. Eye position influences auditory responses in primate inferior colliculus. Neuron 29, 509-518. (doi:10.1016/S0896-6273(01)00222-7)
  19. Zwiers MP, Versnel H, Van Opstal AJ. 2004. Involvement of monkey inferior colliculus in spatial hearing. J. Neurosci. 24, 4145-4156. (doi:10.1523/JNEUROSCI.0199-04.2004)
  20. Porter KK, Metzger RR, Groh JM. 2006. Representation of eye position in primate inferior colliculus. J. Neurophysiol. 95, 1826-1842. (doi:10.1152/jn.00857.2005)
  21. Bulkin DA, Groh JM. 2012. Distribution of visual and saccade related information in the monkey inferior colliculus. Front. Neural Circuits 6, 61. (doi:10.3389/fncir.2012.00061)
  22. Bulkin DA, Groh JM. 2012. Distribution of eye position information in the monkey inferior colliculus. J. Neurophysiol. 107, 785-795. (doi:10.1152/jn.00662.2011)
  23. Willett SM, Groh JM, Maddox RK. 2019. Hearing in a ‘moving’ visual world: coordinate transformations along the auditory pathway. In Springer handbook of auditory research. Multisensory processes: the auditory perspective (eds Lee AKC, Wallace MT, Coffin AB, Popper AN, Fay RR), pp. 85-104. Berlin, Germany: Springer.
  24. Kemp DT. 1978. Stimulated acoustic emissions from within the human auditory system. J. Acoust. Soc. Am. 64, 1386-1391. (doi:10.1121/1.382104)
  25. Shera CA. 2004. Mechanisms of mammalian otoacoustic emission and their implications for the clinical utility of otoacoustic emissions. Ear Hear. 25, 86-97. (doi:10.1097/01.AUD.0000121200.90211.83)
  26. Schairer KS, Feeney MP, Sanford CA. 2013. Acoustic reflex measurement. Ear Hear. 34, 43s-47s. (doi:10.1097/AUD.0b013e31829c70d9)
  27. Gruters KG, Murphy DLK, Jenson CD, Smith DW, Shera CA, Groh JM. 2018. The eardrums move when the eyes move: a multisensory effect on the mechanics of hearing. Proc. Natl Acad. Sci. USA 115, E1309-E1318. (doi:10.1073/pnas.1717948115)
  28. Murphy DL, King CD, Schlebusch SN, Landrum R, Shera CA, Groh JM. 2020. Evidence for a system in the auditory periphery that may contribute to linking sounds and images in space. bioRxiv 2020.07.19.210864. (doi:10.1101/2020.07.19.210864)
  29. Lovich SN, King CD, Murphy DL, Landrum R, Shera CA, Groh JM. 2022. Parametric information about eye movements is sent to the ears. bioRxiv 2022.11.27.518089. (doi:10.1101/2022.11.27.518089)
  30. King CD, Lovich SN, Murphy DL, Landrum R, Kaylie D, Shera CA, Groh JM. 2023. Individual similarities and differences in eye-movement-related eardrum oscillations (EMREOs). bioRxiv 2023.03.09.531896. (doi:10.1101/2023.03.09.531896)
  31. Abbasi H, King CD, Lovich S, Röder B, Groh JM, Bruns P. 2023. Audiovisual temporal recalibration modulates eye movement-related eardrum oscillations. International Multisensory Research Forum (IMRF), Brussels, Belgium, 27–30 June 2023. Abstract 37. See https://imrf2023.sciencesconf.org/data/pages/IMRF23_FullProgram.pdf.
  32. Bröhl F, Kayser C. 2023. Detection of spatially-localized sounds is robust to saccades and concurrent eye movement-related eardrum oscillations (EMREOs). bioRxiv 2023.04.17.537161. (doi:10.1101/2023.04.17.537161)
  33. Jay MF, Sparks D. 1990. Localization of auditory and visual targets for the initiation of saccadic eye movements. In Comparative perception, Vol. I. Basic mechanisms (eds Berkley MA, Stebbins WC), pp. 351-374. New York, NY: John Wiley & Sons.
  34. Mohl JT, Pearson JM, Groh JM. 2020. Monkeys and humans implement causal inference to simultaneously localize auditory and visual stimuli. J. Neurophysiol. 124, 715-727. (doi:10.1152/jn.00046.2020)
  35. Bahill AT, Clark MR, Stark L. 1975. The main sequence, a tool for studying human eye movements. Math. Biosci. 24, 191-204. (doi:10.1016/0025-5564(75)90075-9)
  36. Groh JM, Sparks DL. 1996. Saccades to somatosensory targets. I. Behavioral characteristics. J. Neurophysiol. 75, 412-427. (doi:10.1152/jn.1996.75.1.412)
  37. Guadron L, van Opstal AJ, Goossens J. 2022. Speed-accuracy tradeoffs influence the main sequence of saccadic eye movements. Sci. Rep. 12, 5262. (doi:10.1038/s41598-022-09029-8)
  38. Weerts TC, Thurlow WR. 1971. The effects of eye position and expectation on sound localization. Percept. Psychophys. 9, 35-39. (doi:10.3758/BF03213025)
  39. Lewald J, Ehrenstein WH. 1996. The effect of eye position on auditory lateralization. Exp. Brain Res. 108, 473-485. (doi:10.1007/BF00227270)
  40. Lewald J. 1997. Eye-position effects in directional hearing. Behav. Brain Res. 87, 35-48. (doi:10.1016/S0166-4328(96)02254-1)
  41. Lewald J. 1998. The effect of gaze eccentricity on perceived sound direction and its relation to visual localization. Hear. Res. 115, 206-216. (doi:10.1016/S0378-5955(97)00190-1)
  42. Boucher L, Groh JM, Hughes HC. 2001. Afferent delays and the mislocalization of perisaccadic stimuli. Vision Res. 41, 2631-2644. (doi:10.1016/S0042-6989(01)00156-0)
  43. Lewald J, Ehrenstein WH. 2001. Effect of gaze direction on sound localization in rear space. Neurosci. Res. 39, 253-257. (doi:10.1016/S0168-0102(00)00210-8)
  44. Lewald J, Getzmann S. 2006. Horizontal and vertical effects of eye-position on sound localization. Hear. Res. 213, 99-106. (doi:10.1016/j.heares.2006.01.001)
  45. Pavani F, Husain M, Driver J. 2008. Eye-movements intervening between two successive sounds disrupt comparisons of auditory location. Exp. Brain Res. 189, 435-449. (doi:10.1007/s00221-008-1440-7)
  46. Klingenhoefer S, Bremmer F. 2009. Perisaccadic localization of auditory stimuli. Exp. Brain Res. 198, 411-423. (doi:10.1007/s00221-009-1869-3)
  47. Collins T, Heed T, Röder B. 2010. Eye-movement-driven changes in the perception of auditory space. Attent. Percept. Psychophys. 72, 736-746. (doi:10.3758/APP.72.3.736)
  48. Krüger HM, Collins T, Englitz B, Cavanagh P. 2016. Saccades create similar mislocalizations in visual and auditory space. J. Neurophysiol. 115, 2237-2245. (doi:10.1152/jn.00853.2014)
  49. Jay MF, Sparks DL. 1987. Sensorimotor integration in the primate superior colliculus. I. Motor convergence. J. Neurophysiol. 57, 22-34. (doi:10.1152/jn.1987.57.1.22)
  50. Caruso VC, Pages DS, Sommer MA, Groh JM. 2021. Compensating for a shifting world: evolving reference frames of visual and auditory signals across three multimodal brain areas. J. Neurophysiol. 126, 82-94. (doi:10.1152/jn.00385.2020)
  51. Starovoyt A, Putzeys T, Wouters J, Verhaert N. 2019. High-resolution imaging of the human cochlea through the round window by means of optical coherence tomography. Sci. Rep. 9, 14271. (doi:10.1038/s41598-019-50727-7)
  52. Cho NH, Ravicz ME, Puria S. 2023. Human middle-ear muscle pulls change tympanic-membrane shape and low-frequency middle-ear transmission magnitudes and delays. Hear. Res. 430, 108721. (doi:10.1016/j.heares.2023.108721)
  53. Coleman JR, Clerici WJ. 1987. Sources of projections to subdivisions of the inferior colliculus in the rat. J. Comp. Neurol. 262, 215-226. (doi:10.1002/cne.902620204)
  54. Sparks DL, Hartwich-Young R. 1989. The deep layers of the superior colliculus. In The neurobiology of saccadic eye movements (eds Wurtz RH, Goldberg ME), pp. 213-255. New York, NY: Elsevier.
  55. Faye-Lund H. 1986. Projection from the inferior colliculus to the superior olivary complex in the albino rat. Anatomy Embryol. 175, 35-52. (doi:10.1007/BF00315454)
  56. Guinan JJ Jr. 2006. Olivocochlear efferents: anatomy, physiology, function, and the measurement of efferent effects in humans. Ear Hear. 27, 589-607. (doi:10.1097/01.aud.0000240507.83072.e7)
  57. Ciuman RR. 2010. The efferent system or olivocochlear function bundle – fine regulator and protector of hearing perception. Int. J. Biomed. Sci. 6, 276-288.
  58. Milinkeviciute G, Muniak MA, Ryugo DK. 2017. Descending projections from the inferior colliculus to the dorsal cochlear nucleus are excitatory. J. Comp. Neurol. 525, 773-793. (doi:10.1002/cne.24095)
  59. Balmer TS, Trussell LO. 2022. Descending axonal projections from the inferior colliculus target nearly all excitatory and inhibitory cell types of the dorsal cochlear nucleus. J. Neurosci. 42, 3381-3393. (doi:10.1523/JNEUROSCI.1190-21.2022)
  60. Mukerji S, Windsor AM, Lee DJ. 2010. Auditory brainstem circuits that mediate the middle ear muscle reflex. Trends Amplif. 14, 170-191. (doi:10.1177/1084713810381771)
  61. Lovich S, King C, Murphy DL, Groh J. 2023. Lovich et al. Phil Trans B 2023 dataset. Figshare Dataset. (doi:10.6084/m9.figshare.23297849.v1)


Articles from Philosophical Transactions of the Royal Society B: Biological Sciences are provided here courtesy of The Royal Society
