Exploring the Role of Feedback-Based Auditory Reflexes in Forward Masking by Schroeder-Phase Complexes

Magdalena Wojtczak; Jordan A Beim; Andrew J Oxenham

doi:10.1007/s10162-014-0495-3

2014 Oct 22;16(1):81–99. doi: 10.1007/s10162-014-0495-3

PMCID: PMC4310863 PMID: 25338224

Abstract

Several studies have postulated that psychoacoustic measures of auditory perception are influenced by efferent-induced changes in cochlear responses, but these postulations have generally remained untested. This study measured the effect of stimulus phase curvature and temporal envelope modulation on the medial olivocochlear reflex (MOCR) and on the middle-ear muscle reflex (MEMR). The role of the MOCR was tested by measuring changes in the ear-canal pressure at 6 kHz in the presence and absence of a band-limited harmonic complex tone with various phase curvatures, centered either at (on-frequency) or well below (off-frequency) the 6-kHz probe frequency. The influence of possible MEMR effects was examined by measuring phase-gradient functions for the elicitor effects and by measuring changes in the ear-canal pressure with a continuous suppressor of the 6-kHz probe. Both on- and off-frequency complex tone elicitors produced significant changes in ear canal sound pressure. However, the pattern of results was not consistent with the earlier hypotheses postulating that efferent effects produce the psychoacoustic dependence of forward-masked thresholds on masker phase curvature. The results also reveal unexpectedly long time constants associated with some efferent effects, the source of which remains unknown.

Keywords: medial olivocochlear reflex, middle-ear muscle reflex, forward masking, Schroeder-phase complexes

INTRODUCTION

The mechanical responses to sound in the cochlea have been generally described as time-invariant, with the time constants associated with nonlinear aspects, such as suppression, being generally considered negligible from a perceptual perspective (e.g., Ruggero and Temchin 2007). Consequently, the synapse between the inner hair cell and the spiral ganglion is thought to be the first stage where perceptually relevant changes in responses over time occur in the form of neural adaptation (Smith 1977; Abbas 1979; Smith 1979; Smith and Brachman 1982). On the other hand, physiological evidence has also shown that cochlear gain and consequently cochlear responses can decrease over time due to the activation of efferent fibers that project from the medial olivary complex (MOC) and synapse with the outer hair cells (OHCs) in the organ of Corti (for a review, see Guinan 2006). The time course of the effect of efferent activation is relatively slow, with a 25- to 30-ms latency, followed by a buildup over about 70 ms to a nearly asymptotic level. After the offset of the eliciting stimulus, the effect of efferent activation remains constant for about 25–30 ms and then decays over a 160–200-ms interval (Backus and Guinan 2006).

Despite robust physiological evidence showing decreased peripheral responses due to the stimulation of MOC efferents by electric shocks in cats (Gifford and Guinan 1987; Liberman 1989; Warren and Liberman 1989) and reduced magnitudes of evoked otoacoustic emissions in the presence of acoustic elicitors of the medial olivocochlear reflex (MOCR) in humans (Collet et al. 1990; Veuillet et al. 1991; Collet et al. 1992; Norman and Thornton 1993; Maison et al. 2000; Guinan et al. 2003; Backus and Guinan 2006; Lilaonitkul and Guinan 2009a, 2012), the functional role of the reflex remains poorly understood. There is evidence from physiological animal studies that MOC efferents play a role in protecting the cochlea and the synaptic connections with afferent auditory nerve fibers from the effects of aging (Liberman et al. 2014) and from noise-induced damage (Kujawa and Liberman 1997; Maison et al. 2013). It is likely that they have a similar role in the human auditory system. However, the effects of efferent activation on performance in perceptual tasks remain elusive. Scharf et al. (1994, 1997) measured performance by listeners with Ménière’s disease in a series of basic psychophysical tasks before and after sectioning of the olivocochlear bundle. They found no effect of the sectioning on the detection of tones in quiet and in noise, no effect on tuning measured with a notched noise masker, and no effect on intensity and frequency discrimination. The only significant effect was an improvement in the detection of tones with unexpected frequencies embedded in noise maskers after the sectioning of efferents, which led the authors to conclude that efferent activation may facilitate detection under selective attention. However, Scharf et al. tempered their conclusions by noting that the vestibular neurotomy performed on their patients may not have resulted in a complete elimination of MOC efferent connections to the cochlea.

Recently, a growing number of studies have implicated efferent activation as a factor contributing to various psychophysical effects that appeared consistent with relatively slow changes in cochlear responses over time. Examples of such effects include a so-called temporal effect or overshoot (McFadden and Champlin 1990; von Klitzing and Kohlrausch 1994; Strickland 2004; Strickland and Krishnan 2005; Strickland 2008), the effect of a precursor on cochlear gain and compression, as estimated from the growth of psychophysical forward masking (Krull and Strickland 2008; Jennings et al. 2009; Roverud and Strickland 2010, 2014), changes in frequency selectivity during the course of acoustic stimulation (Jennings et al. 2009; Jennings and Strickland 2012), changes in frequency selectivity due to contralateral noise (Aguilar et al. 2013), changes in the rate of recovery from forward masking for high masker levels (Wojtczak and Oxenham 2009a), and masker phase effects in forward masking by harmonic complexes (Wojtczak and Oxenham 2009b).

All these studies used psychophysical methods and provided no independent estimate or measure of efferent activation. In two recent studies, psychophysical measurements of overshoot were combined with noninvasive physiological measurements of the stimulus frequency otoacoustic emissions (SFOAEs) to verify the MOCR-based explanation of the effect (Keefe et al. 2009; Walsh et al. 2010). While Walsh et al. (2010) found good correspondence between the time course of the psychophysical overshoot and changes in the magnitude of a so-called “nonlinear SFOAE” with a delay from masker onset, Keefe et al. (2009) found no changes in SFOAE threshold with the delay despite very robust (on average 16 dB) psychophysical overshoot in the same listeners. Because of the general lack of consistent independent verification, statements regarding efferent involvement in various psychophysical phenomena remain speculative. The aim of this study was to provide an independent test of the hypothesized role of the MOCR in forward masking by harmonic complexes with identical power spectra and different phase spectra, as used by Wojtczak and Oxenham (2009b). In that study, forward masking produced by Schroeder-phase complexes was measured as a function of their phase curvature, which was defined as

\frac{\partial^{2} \theta(f)}{\partial f^{2}} = C \frac{2\pi}{N f_{0}^{2}} \qquad (1)

where θ(f) denotes the component starting phase as a function of component frequency f, N is the number of components in the harmonic complex, f₀ denotes the fundamental frequency, and C is a constant that was varied between −1 and 1 in steps of 0.25 to obtain maskers with different phase curvatures (Lentz and Leek 2001; Oxenham and Dau 2001a). The study showed that when the probe frequency was 1 or 2 kHz, there was a significant effect of C value on masked thresholds for an on-frequency masker (with components around the probe frequency) but not for an off-frequency masker (with components placed around the frequency an octave below the probe frequency). However, masker phase effects were significant for a 6-kHz probe in both the on- and off-frequency masking conditions.
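The definition above can be illustrated with a minimal synthesis sketch. It assumes the common closed form θ_n = Cπn(n+1)/N for the starting phase of harmonic number n; this closed form is not stated in the text, but its second difference across adjacent harmonics equals 2πC/N, which matches the stated curvature once divided by f₀². The function names, the 44,100-Hz sampling rate, and the choice of harmonics 48 to 72 of 100 Hz (the 4,800 to 7,200 Hz complex described under Stimuli and Procedure) are illustrative assumptions.

```python
import math

def schroeder_phases(harmonics, c):
    """Starting phases theta_n = c*pi*n*(n+1)/N (assumed closed form)."""
    n_comp = len(harmonics)
    return [c * math.pi * n * (n + 1) / n_comp for n in harmonics]

def schroeder_complex(f0, harmonics, c, fs, duration):
    """Equal-amplitude harmonic complex, returned as a list of samples."""
    phases = schroeder_phases(harmonics, c)
    n_samples = int(round(fs * duration))
    wave = []
    for i in range(n_samples):
        t = i / fs
        wave.append(sum(math.sin(2 * math.pi * n * f0 * t + ph)
                        for n, ph in zip(harmonics, phases)))
    return wave

def crest_factor(wave):
    """Peak magnitude divided by rms; larger values mean a peakier waveform."""
    rms = math.sqrt(sum(x * x for x in wave) / len(wave))
    return max(abs(x) for x in wave) / rms

F0 = 100.0                         # fundamental frequency in Hz
HARMONICS = list(range(48, 73))    # 25 components, 4,800 to 7,200 Hz
FS = 44100                         # sampling rate in Hz (illustrative)

# One 10-ms period of the sine-phase (C = 0) and Schroeder-positive (C = 1) complexes.
sine_phase = schroeder_complex(F0, HARMONICS, 0.0, FS, 0.01)
schroeder_pos = schroeder_complex(F0, HARMONICS, 1.0, FS, 0.01)
```

Comparing `crest_factor(sine_phase)` with `crest_factor(schroeder_pos)` shows the expected contrast at the input to the ear: the C = 0 complex is strongly peaked, whereas the C = 1 complex has a much flatter envelope. The envelope at the output of a cochlear filter would additionally depend on the filter's own phase curvature, which this sketch does not model.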

Two findings of the Wojtczak and Oxenham study were surprising and led them to invoke an additional mechanism to account for their data. The first finding was the effect of masker phase curvature in off-frequency masking of the 6-kHz probe. In earlier studies, masker phase effects had been explained as resulting from the interaction between the phase curvatures of the masker and the cochlear filter tuned to the probe frequency (Smith et al. 1986; Kohlrausch and Sander 1995; Lentz and Leek 2001; Oxenham and Dau 2001a, b) and from compression of the waveform at the output of that filter. The role of compression was evidenced by reduced masker phase effects in listeners with hearing impairment (Summers and Leek 1998; Summers 2000; Oxenham and Dau 2004) and by the presence of the effects in forward masking (Carlyon and Datta 1997; Wojtczak and Oxenham 2009b). In forward masking, listeners cannot use dips in the masker envelope to detect the signal. However, harmonic complexes with fluctuating temporal envelopes have lower rms amplitude after being subjected to compression than the complexes with the same original rms amplitude and spectrum but with flat temporal envelopes. This fact has been used to explain a decreased effectiveness of Schroeder-phase maskers with fluctuating envelopes at the output of the cochlea. Because off-frequency stimuli with frequencies about an octave below the characteristic frequency (CF) of the measurement place are thought to produce a linear response on the basilar membrane (BM; Ruggero et al. 1997), masker phase effects should be observed in forward masking for on-frequency but not for off-frequency maskers.
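The rms argument in the paragraph above can be checked numerically with a small sketch. It assumes an instantaneous power-law compressor applied to the envelope with an exponent of 0.2; that exponent, the raised-cosine envelope shape, and all names are illustrative assumptions rather than values taken from the studies cited.

```python
import math

def rms(values):
    """Root-mean-square of a sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def compress(envelope, exponent=0.2):
    """Instantaneous power-law compression of a non-negative envelope
    (assumed form; the exponent is a hypothetical value)."""
    return [e ** exponent for e in envelope]

N_POINTS = 1000

# Flat envelope with unit rms.
flat = [1.0] * N_POINTS

# Fully modulated envelope, rescaled so its rms matches the flat envelope.
raw = [1.0 + math.cos(2 * math.pi * k / N_POINTS) for k in range(N_POINTS)]
scale = rms(flat) / rms(raw)
modulated = [scale * r for r in raw]

rms_flat_out = rms(compress(flat))
rms_mod_out = rms(compress(modulated))
```

With equal rms before compression, `rms_mod_out` comes out below `rms_flat_out`, because a compressive function attenuates the envelope peaks more than it boosts the dips.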

The second surprising finding in the study by Wojtczak and Oxenham (2009b) was the effect of masker duration: The effect of masker phase curvature was stronger for the 200-ms than for the 30-ms maskers. Because cochlear compression is known to be nearly instantaneous (Ruggero et al. 1997), the duration effect, along with the phase effects in off-frequency masking, suggested that an additional mechanism may be involved. Wojtczak and Oxenham (2009b) suggested that the mechanism may be one (or both) of the two feedback-based mechanisms with relatively long time constants, the MOCR (Backus and Guinan 2006) and the middle-ear muscle reflex (MEMR; Church and Cudahy 1984). Wojtczak and Oxenham (2009b) hypothesized that maskers producing the most modulated envelopes, and thus, the smallest average excitation on the BM (due to the interaction between the masker and cochlear-filter phase curvatures and compression) may be the least effective elicitors of the feedback-based reflexes. On the other hand, maskers producing waveforms with flatter envelopes at the output of the cochlea would result in a greater excitation and therefore would be more effective at eliciting either of the reflexes. As a consequence, maskers with flatter envelopes at the output of the cochlea would produce higher forward-masked thresholds than the maskers with fluctuating envelopes, due to either a greater reduction of cochlear gain at the signal frequency place on the BM (MOCR) or a greater attenuation of the transmission through the middle ear (MEMR). The difference in threshold would be greater for maskers with a longer duration that would allow for a longer buildup time for the effect of the involved reflex. Although either reflex could play a role, the authors favored the explanation in terms of the MOCR because the MEMR has been shown to predominantly affect transmission of low frequencies (<2 kHz) through the middle ear (e.g., Schairer et al. 2007) whereas Wojtczak and Oxenham (2009b) observed the off-frequency masker phase effects only for a 6-kHz probe and not for the 1- and 2-kHz probes.

In the present study, a method for measuring the effect of efferent activation on stimulus frequency otoacoustic emission (SFOAE) developed by Guinan et al. (2003) was implemented to measure changes in the ear-canal sound pressure at the probe frequency due to the on- and off-frequency Schroeder-phase complexes with different phase curvatures. In addition, psychophysical measurements were performed to test an alternative hypothesis that the off-frequency masker phase effects observed by Wojtczak and Oxenham (2009b) were due to residual compression, such that the place along the BM with a CF corresponding to the probe frequency still responded at least somewhat compressively to the off-frequency masker.

EXPERIMENT 1: EFFECTS OF ELICITOR PHASE CURVATURE ON EAR-CANAL SOUND PRESSURE AT THE PROBE FREQUENCY

Rationale

The aim of this experiment was to test the hypothesis that MOC efferent activation is affected by component phase relationships within a harmonic complex tone. The stimuli were the on- and off-frequency Schroeder-phase maskers used by Wojtczak and Oxenham (2009b) with the 6-kHz probe, because unexpected masker phase effects in off-frequency masking conditions were observed for this probe frequency, and because studies have shown that the ipsilaterally activated MEMR should not affect the transmission of a 6-kHz tone through the middle ear (Schairer et al. 2007).

The data from Wojtczak and Oxenham (2009b) showed that for the off-frequency masker of a 6-kHz probe, forward-masked thresholds were the lowest when all the masker components started with the same (0° or sine) phase, and they were the highest for maskers with phase curvatures obtained by setting the value of C in Eq. 1 to −1 (Schroeder-phase negative masker) and 1 (Schroeder-phase positive masker). To be consistent with the role of efferent activation, as hypothesized in the study of Wojtczak and Oxenham, the Schroeder-positive and Schroeder-negative complexes should produce a greater reduction in cochlear gain, and consequently a greater reduction in SFOAE magnitude at the probe frequency, than the sine-phase harmonic complex at the same overall intensity.

Listeners

Normal-hearing listeners were used for this study. Their hearing thresholds were below 15 dB HL at audiometric frequencies between 250 and 8,000 Hz, as measured using an ANSI-certified audiometer (Madsen Conera). Three of the recruited listeners were excluded because they showed no significant post-elicitor effects on the ear-canal pressure for the Schroeder-phase elicitors during a 2-h session. Seven listeners (one male, six females), with ages in the range of 21–49 years (median 26 years), provided useable data that were analyzed to test for the effects of the elicitor phase curvature. Prior to data collection, the listeners provided written informed consent and the protocol for this study was approved by the Institutional Review Board of the University of Minnesota.

Stimuli and Procedure

Ear-canal pressure waveforms were recorded during the presentation of a continuous 6-kHz tone and an intermittent Schroeder-phase complex. A schematic illustration of the stimuli in a recording trial is shown in Figure 1. A Schroeder-phase complex consisting of 25 harmonics of a 100-Hz fundamental frequency, presented ipsilaterally with the tonal probe, was used to elicit the MOCR. On-frequency Schroeder-phase complexes (Fig. 1A) consisted of components from 4,800 to 7,200 Hz, and off-frequency complexes (Fig. 1B) consisted of components from 1,600 to 4,000 Hz. These spectral configurations of the on- and off-frequency elicitors were identical to those of the on- and off-frequency forward maskers in the study by Wojtczak and Oxenham (2009b). For each spectral configuration, three phase curvatures of the elicitor, given by C = −1, 0, and 1 in Eq. 1, were used in separate blocks of trials. A trial consisted of eight 8.5-s segments, each comprising 1 s of the probe alone, followed by a 2.5-s interval during which an MOCR elicitor was added to the probe, followed by a 5-s interval containing the probe alone. The elicitor’s polarity was alternated between consecutive segments. This allowed for the cancellation of the physical waveform of the elicitor during the averaging of the recorded waveform across the eight segments while preserving the elicitor’s effect on the ear-canal sound pressure at the probe frequency.
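The polarity-alternation step can be sketched as follows. This is a scaled-down illustration, not the recording code used in the study: the 1,000-Hz sampling rate, the 60-Hz and 40-Hz stand-ins for the probe and elicitor, and the 10 % drop in probe amplitude during the elicitor are all hypothetical values chosen to keep the example fast and the effect visible. Only the segment timing (1 s, 2.5 s, 5 s) and the number of segments per trial come from the text.

```python
import math

FS = 1000                       # Hz; reduced rate for speed (assumption)
PROBE_HZ = 60.0                 # stand-in for the 6-kHz probe (assumption)
ELICITOR_HZ = 40.0              # stand-in for the harmonic complex (assumption)
T_PRE, T_ELICITOR, T_POST = 1.0, 2.5, 5.0   # segment timing in s, from the text
N_SEGMENTS = 8                  # segments per trial, from the text

def make_segment(polarity):
    """One 8.5-s segment: continuous probe plus a polarity-signed elicitor."""
    n_samples = int(round((T_PRE + T_ELICITOR + T_POST) * FS))
    segment = []
    for i in range(n_samples):
        t = i / FS
        elicitor_on = T_PRE <= t < T_PRE + T_ELICITOR
        # Hypothetical elicitor effect: probe amplitude drops by 10 % while
        # the elicitor is on. It does not flip sign with elicitor polarity.
        probe_amp = 0.9 if elicitor_on else 1.0
        probe = probe_amp * math.sin(2 * math.pi * PROBE_HZ * t)
        elicitor = 0.0
        if elicitor_on:
            elicitor = polarity * math.sin(2 * math.pi * ELICITOR_HZ * t)
        segment.append(probe + elicitor)
    return segment

def average_segments(segments):
    """Sample-by-sample mean across segments."""
    return [sum(column) / len(segments) for column in zip(*segments)]

# Alternate the elicitor polarity between consecutive segments, then average.
segments = [make_segment(+1 if k % 2 == 0 else -1) for k in range(N_SEGMENTS)]
averaged = average_segments(segments)
```

In `averaged`, the elicitor waveform cancels because it appears with opposite signs in equal numbers of segments, while the polarity-independent change in the probe survives the averaging.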

FIG. 1 — Schematic illustration of the spectro-temporal configuration of stimuli used to measure changes in the ear-canal pressure produced by on-frequency (A) and off-frequency (B) harmonic complex elicitors. The *plus* and *minus signs* indicate the alternating polarity of the elicitors in consecutive presentations. The *blue line* represents the continuous probe.

The probe was presented at a level of 50-dB sound pressure level (SPL). This level was 10 dB higher than the probe level used in most SFOAE-based measurements of the effects of MOC efferent activation (Guinan et al. 2003; Backus and Guinan 2006; Lilaonitkul and Guinan 2012). It was necessary to use a higher probe level because the measurements in this study were performed using a 6-kHz probe, and the level of 40 dB SPL was often not sufficiently high to produce a measurable SFOAE at this frequency. It cannot be ruled out that the probe itself activated the MOCR. However, pure tones have been shown to be relatively ineffective elicitors of efferent effects and the on-frequency effects reported in earlier studies usually did not reach significance for elicitor levels below 60–70 dB SPL. In addition, the effects elicited by pure tones have been shown to decrease with increasing frequency of the probe tone (Lilaonitkul and Guinan 2009b, 2012). Walsh et al. (2010) also showed no change in the magnitude of the nonlinear SFOAE measured over the course of 500 ms for a 60-dB SPL 4-kHz probe. It is therefore assumed that any contribution to changes in the ear-canal sound pressure due to a continuous 50-dB SPL 6-kHz probe in this study is negligible. The onset and offset of the probe (at the beginning and end of a trial) were gated with 10-ms raised-cosine ramps. The on-frequency elicitors were presented at 65 dB SPL, and the off-frequency elicitors were presented at 75 dB SPL. The level of the off-frequency elicitors was 10 dB below that used for off-frequency maskers by Wojtczak and Oxenham (2009b) to avoid clipping of the recorded waveform, which occurred for the phase curvature defined by C = 0. All the elicitors were gated with 10-ms raised-cosine ramps.

Prior to testing, it was confirmed for each listener that they had no significant spontaneous emissions within 100 Hz of the probe frequency. Estimates of spontaneous emissions were obtained using the procedure described by Penner et al. (1993). In addition, measurements of the effect of a 60-dB SPL broadband noise elicitor on a tonal probe were performed for probe frequencies within a ±120-Hz range around 6 kHz. The purpose of these measurements was to find a proximal frequency for which the effect of efferent activation was the strongest to ensure that robust effects were observed, as was done in previous studies (e.g., Guinan et al. 2003). Since no appreciable differences were found across the frequencies tested, a 6-kHz probe was used for all the listeners participating in the experiment. In addition, a suppression technique was used to estimate the magnitude of the SFOAE at 6 kHz by intermittently presenting a suppressor tone with a frequency of 5,890 Hz and a level of 70-dB SPL instead of the harmonic complex elicitor during the continuous probe presentation.

Stimuli were generated and recorded on a PC via a 24-bit D/A LynxTwo (LynxStudio) sound card using a sampling rate of 44,100 Hz. The stimuli were delivered to the ear canal via the ear piece of an ER10C system (Etymotic Research). The ear piece contained two sound sources and one microphone. The probe and the elicitor were routed to separate sound sources and presented ipsilaterally to the right ear for all the listeners except S5 for whom stronger effects were found in the left ear. The recorded waveforms were analyzed online for artifact rejection. Only artifact-free recordings contributed to the average waveforms that were used for further analyses. Listeners completed the test with one elicitor configuration (on- or off-frequency chosen at random) with different C values selected in a random order before moving on to the next. For each elicitor, 50 artifact-free 8.5-s segments were recorded. During the recordings, the listeners were seated in a comfortable reclining chair located in a double-walled sound-attenuating booth. They were asked to remain still but awake. The listeners were given breaks as needed during which they could choose to take the ear piece out or remain in the booth with the ear piece in the ear canal. An in-the-ear calibration was performed at the beginning of each session, at the beginning of each elicitor condition, after each break, and any time the probe had to be repositioned, to make sure that the stimuli were presented at the same level throughout the experiment. Typically, recordings for all three C values for one spectral configuration of the elicitor were obtained without a break, before recordings for the second spectral configuration commenced. The measurements of the 6-kHz SFOAE with a suppressor tone were performed after all the measurements with harmonic complex elicitors were completed.

Analysis of the Recorded Waveforms

For each condition, recorded segments that passed the online artifact rejection test underwent additional visual screening to remove recordings that showed small but systematic artifacts, such as those resulting from slow shifts in the position of the ear probe during the recorded segment. Discarded segments were replaced by additional data collected to obtain a total of 50 clean segments per subject. The segments were then averaged, resulting in an 8.5-s waveform, and high-pass filtered using an eighth-order Butterworth filter with a cutoff frequency of 400 Hz. The waveform was subsequently heterodyned to obtain a complex-valued ear-canal sound pressure at the probe frequency (Guinan et al. 2003; Backus and Guinan 2006). The heterodyning involved the calculation of the analytic signal from the 8.5-s average waveform, shifting it by the frequency of the probe, and low-pass filtering the resultant complex-valued waveform using a fourth-order Butterworth filter with a cutoff frequency of 50 Hz. The waveform was then downsampled to save storage space. The upper and lower left panels of Figure 2 show the magnitude and phase of the heterodyned ear-canal sound pressure at the probe frequency, respectively, for the on-frequency elicitor (blue line) and the off-frequency elicitor (red line). To extract changes in the ear-canal sound pressure due to the elicitor, the vector average of the complex ear-canal sound pressure (i.e., the mean real and imaginary part) was first calculated within the 500-ms pre-elicitor window. The window extended over the segment of the heterodyned pressure waveform from 550 to 50 ms before the elicitor’s onset (the green rectangle in the top left panel of Fig. 2). Changes in the ear-canal sound pressure were obtained by subtracting the mean real and imaginary parts in the pre-elicitor window from the real and imaginary part of every complex-valued point of the heterodyned pressure waveform, respectively. 
This vector subtraction resulted in a complex-valued sound pressure representing the noise floor during the time interval when the pressure at the probe frequency was unaffected by an elicitor and in a change in ear-canal sound pressure during the time interval when the elicitor had an effect. The green rectangle positioned at 3.70 s in the top right panel of Figure 2 illustrates the position of the 100-ms post-elicitor window used to estimate the effect of an elicitor on the ear-canal sound pressure at 6 kHz, hereafter referred to as the residual. The residual was calculated using a post-elicitor (rather than during-elicitor) window to avoid effects of (two-tone) suppression of the basilar-membrane response to the probe by the components of the elicitor and to capture the hypothesized dependence of the residual on the elicitor’s phase curvature during the time period over which the elicitor likely produced forward masking. The post-elicitor window was positioned 20 ms after the offset of the elicitor—a delay that did not include the exact temporal position of the probe in the psychophysical forward-masking experiment in the study by Wojtczak and Oxenham (2009b)—to avoid the inclusion of effects of two-tone suppression in the estimate of the residual. A window immediately following the elicitor would include such effects due to the processing performed on the signal (low-pass filtering). The window was also much longer than 10 ms (the duration of the probe in the psychophysical study) to decrease the variability of the estimated residual. However, it is reasonable to assume that the differences in forward masking observed over the first 10 ms of recovery should be present throughout a time period over which forward-masked thresholds are significantly above the threshold in quiet. 
Thus, it is assumed that an exact match between the temporal position of the probe in the psychophysical task and the position of the post-elicitor window in this study was not necessary for the purpose of testing the working hypothesis. The post-elicitor effect was calculated as 20 log(ΔP_post), where ΔP_post represents the magnitude obtained by averaging the real and imaginary parts of the change in ear-canal sound pressure within the post-elicitor window (top right panel in Fig. 2). The post-elicitor effects were then compared across the three C values of the elicitor.
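The analysis chain described above (heterodyning, pre-elicitor vector averaging, vector subtraction, and the post-elicitor estimate in dB) can be sketched in a few functions. This is a simplified illustration under stated assumptions, not the authors' implementation: the real waveform is demodulated directly with a complex exponential instead of forming the analytic signal first, a 20-ms boxcar average (first spectral null at 50 Hz) stands in for the fourth-order Butterworth low-pass filter, and the high-pass filtering and downsampling steps are omitted. The synthetic recording, its 0.01 change amplitude, and all function names are hypothetical, and the result is in dB re an arbitrary unit rather than dB SPL.

```python
import cmath
import math

def heterodyne(wave, fs, probe_hz, smooth_s=0.02):
    """Complex pressure at probe_hz: demodulate, then boxcar low-pass
    (a simple stand-in for the Butterworth filter named in the text)."""
    shifted = [2.0 * x * cmath.exp(-2j * math.pi * probe_hz * i / fs)
               for i, x in enumerate(wave)]
    half = max(1, int(round(smooth_s * fs / 2)))
    cumulative = [0j]
    for z in shifted:
        cumulative.append(cumulative[-1] + z)
    n = len(shifted)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half)
        out.append((cumulative[hi] - cumulative[lo]) / (hi - lo))
    return out

def window(values, fs, start_s, stop_s):
    """Samples between start_s and stop_s (seconds)."""
    return values[int(round(start_s * fs)):int(round(stop_s * fs))]

def vector_mean(values):
    """Mean of the real and imaginary parts, returned as one complex number."""
    return sum(values) / len(values)

def residual_db(pressure, fs, onset_s, offset_s,
                post_delay_s=0.02, post_len_s=0.1):
    """20*log10 of the vector-averaged change in the post-elicitor window."""
    # Pre-elicitor window: 550 to 50 ms before elicitor onset.
    baseline = vector_mean(window(pressure, fs, onset_s - 0.55, onset_s - 0.05))
    delta = [p - baseline for p in pressure]
    start = offset_s + post_delay_s
    post = vector_mean(window(delta, fs, start, start + post_len_s))
    return 20.0 * math.log10(abs(post))

FS, PROBE_HZ = 44100, 6000.0
ONSET_S, OFFSET_S = 1.0, 3.5

def synthetic_recording(duration_s=4.0, change_amp=0.01):
    """Unit-amplitude probe plus a small hypothetical change at the probe
    frequency that starts at elicitor onset and outlasts the elicitor."""
    wave = []
    for i in range(int(round(duration_s * FS))):
        t = i / FS
        x = math.sin(2 * math.pi * PROBE_HZ * t)
        if t >= ONSET_S:
            x += change_amp * math.cos(2 * math.pi * PROBE_HZ * t)
        wave.append(x)
    return wave

pressure = heterodyne(synthetic_recording(), FS, PROBE_HZ)
effect_db = residual_db(pressure, FS, ONSET_S, OFFSET_S)
```

For this synthetic input, `effect_db` should come out close to 20 log(0.01), i.e., about −40 dB re the arbitrary unit, because the vector subtraction removes the unchanged probe and leaves only the added component.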

FIG. 2 — The magnitude of the ear-canal pressure (*top left panel*), the phase of the ear-canal pressure (*bottom left panel*), the change in magnitude (*top right panel*), and the change in phase (*bottom right panel*) of the ear-canal pressure from averaged recording segments. The *blue traces* show data obtained for the on-frequency elicitor, and the *red traces* are for the off-frequency elicitor. The *green box in the top left panel* illustrates the position of the window over which the vector average of the pressure waveform was calculated. The *green boxes in the top right panel* illustrate the positions of pre- and post-elicitor windows used to estimate noise floor and the elicitor effect, respectively. The data are for one listener, for the on- and off-frequency elicitors generated with C = −1.

Results and Discussion

Figures 2 and 3 show examples of data from two listeners. Each figure shows data for one arbitrarily selected C value (indicated in the figure captions) to illustrate two general patterns observed in the data. These patterns did not show systematic dependence on the C value or the elicitor condition (on- vs off-frequency) across the listeners, and they are representative of the general trends observed in the data discussed below in detail.

FIG. 3 — As Figure 2 but for a different listener and for the on- and off-frequency elicitors generated with C = 0.

The top and bottom left panels in Figures 2 and 3 show the magnitudes and phases of the averaged 8.5-s heterodyned ear-canal pressure waveforms, respectively, for the on-frequency (blue line) and off-frequency (red line) elicitor conditions. The extracted changes in the magnitude and phase of the ear-canal sound pressure at 6 kHz, ΔP, are shown in the top and bottom right panels, respectively. The vertical dashed lines in all panels mark the elicitor’s onset and offset times.

In both figures, the top right panels show a rapid growth of the ΔP magnitude (expressed in dB) that coincided with the onsets of the on- and off-frequency elicitors. In Figure 2, the fast ΔP onset is followed by a gradual (extending over a few hundred milliseconds) increase in ΔP magnitude to an approximately constant level, whereas in Figure 3 the onset is followed by a small gradual decrease in ΔP magnitude. At first appearance, the effects during the elicitor appear consistent with those reported in the studies of the effect of efferent activation on the SFOAE for broadband and notched noise elicitors (Guinan et al. 2003; Backus and Guinan 2006). The relatively slow buildup in Figure 2 seems consistent with the effect of the MOCR taking over the initial dominant effect produced by two-tone suppression of the response to the 6-kHz probe on the BM. The slow decrease in ΔP after the initial rapid increase in ΔP magnitude in the right top panel of Figure 3 is reminiscent of onset adaptation of the distortion product otoacoustic emissions (DPOAEs) due to the ipsilaterally evoked MOCR in cats shown by Liberman et al. (1996). Liberman et al. argued that the primaries presented at relatively high levels evoke the MOCR which decreases the BM response to the primaries (and thus the amplitude of DPOAE) via efferent feedback. A similar finding in humans was reported by Kim et al. (2001), but the effects were smaller than those in animals. Given these reports, the decrease in ΔP magnitude during the first few hundred milliseconds of the elicitor appears consistent with the effect of the MOCR on the elicitor itself.

However, the above interpretation in terms of the effect of the MOCR on the SFOAE evoked by the 6-kHz probe is complicated by very large ΔP magnitudes during the elicitor, particularly for the off-frequency elicitor. In that condition, the ΔP magnitudes often exceeded the magnitudes of the SFOAEs at 6 kHz measured using the standard suppression technique (e.g., Brass and Kemp 1993; Guinan et al. 2003) by up to 10–15 dB for the subjects whose data are shown in Figures 2 and 3 and for all of the other subjects (data not shown). The large effects are inconsistent with the results from previous studies that used lower probe and elicitor levels to measure the effects of efferent activation on the SFOAE (Guinan et al. 2003; Backus and Guinan 2006; Lilaonitkul and Guinan 2009a, b, 2012). In these studies, the reported ΔP magnitudes were always a fraction of the SFOAE magnitude measured with a single-tone suppressor. Because of this discrepancy and the uncertainty about the mechanism underlying elicitor effects in this study, the ΔP magnitude was not normalized by the SFOAE magnitude estimated using the single-tone suppression technique, as was typically done by Guinan and colleagues (Guinan et al. 2003; Backus and Guinan 2006; Lilaonitkul and Guinan 2009a, 2012). It should be noted that such normalizing would not have affected the relative effects across the three elicitor phase curvatures as it would amount to subtracting the same dB amount from each effect for a given subject.
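The last point can be written out explicitly. In the short derivation below, ΔP_C denotes the post-elicitor ΔP magnitude for a given C value and P_SFOAE the SFOAE magnitude from the single-tone suppression measurement for the same subject; both symbols are introduced here only for illustration.

```latex
% Normalized effect for a given subject and phase-curvature constant C
20\log_{10}\!\left(\frac{\Delta P_{C}}{P_{\mathrm{SFOAE}}}\right)
  = 20\log_{10}\Delta P_{C} - 20\log_{10} P_{\mathrm{SFOAE}}

% Difference between two elicitors (C_1 and C_2): the subject-specific term cancels
20\log_{10}\!\left(\frac{\Delta P_{C_1}}{P_{\mathrm{SFOAE}}}\right)
  - 20\log_{10}\!\left(\frac{\Delta P_{C_2}}{P_{\mathrm{SFOAE}}}\right)
  = 20\log_{10}\Delta P_{C_1} - 20\log_{10}\Delta P_{C_2}
```

Because the subtracted term does not depend on C, within-subject comparisons across the three phase curvatures are the same with or without normalization.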

There are a few possible explanations for why the effect of the elicitor on the ear-canal pressure at the probe frequency was greater than the estimated SFOAE magnitude. One possible explanation is that the intense elicitors used in this study drove the outer hair cell stereocilia at the basal end of the cochlea into their nonlinear region thereby generating an SFOAE-like residual at the probe frequency via local distortion processes, as reported in a study by Guinan (1990). This explanation would imply that the elicitor used as a forward masker in the previous psychophysical study by Wojtczak and Oxenham (2009b) would itself produce an additional source of energy at the probe frequency and the amount of added energy could depend on masker phase curvature, contributing to the observed masker phase effects. Although appealing, this interpretation is weakened by the fact that similarly large effects are present when the harmonic complex elicitor is replaced by a notched noise around 6 kHz and when the elicitor is presented contralaterally to the probe (Walsh and Wojtczak 2014). Another possibility is that since the probe was presented at 50 dB SPL, the SFOAE traveling back through the middle ear to the ear canal was generated not only around the place with the CF of 6 kHz but also contained significant contributions from basally distributed generators (Siegel and Badri 2002; Siegel et al. 2003; Siegel et al. 2004; Siegel et al. 2005; Charaziak et al. 2013; Moleti et al. 2013; Sisto et al. 2013). A single-tone suppressor with a frequency 110 Hz below that of the probe may have been insufficient to eliminate an SFOAE generated by all the sources via two-tone suppression and/or via efferent activation. The highest component of the off-frequency elicitor was 4,000 Hz, so it is unlikely that this elicitor could eliminate the SFOAE at 6 kHz more effectively than the 5,890-Hz tone via two-tone suppression.
However, because the off-frequency elicitors had broader spectra, they may have been more effective at suppressing the SFOAE generators via the feedback-based efferent system, thus producing a larger residual, ΔP.

An alternative explanation for the sizeable ΔP magnitude is in terms of the MEMR. The activation of the MEMR would affect the impedance of the middle ear, thereby changing the ear-canal sound pressure in a way unrelated to the inner-ear response and thus the SFOAE. A problem with this explanation is that the effects of the MEMR are known to be quite slow. Even for the most intense activators of the MEMR (i.e., 110 dB or more), the effects have been shown to exhibit at least a 20-ms latency followed by at least a 50–100 ms buildup time (Hung and Dallos 1972). Thus, based on the reported time courses, the MEMR could not account for the rapid change in the ear-canal pressure shown in Figure 3.

After the offset of the elicitor (marked by the vertical dashed line at 3.5 s), the data in both figures show an initial rapid decrease in ΔP magnitude followed by a very slow further decay during which the effect remained significantly above the noise floor for a period of several seconds. The presence of the post-elicitor effect is also evidenced by the relatively narrow spread of the ΔP phase compared to that in the pre-elicitor interval (between 0 and 1 s). As the effect decreased to the level of the noise floor, the ΔP phase became scattered over the range from −180° to 180°, as expected (e.g., Guinan et al. 2003). For the off-frequency elicitor, data from all the listeners exhibited a nonmonotonicity in the recovery function that followed the fast decrease in ΔP magnitude at the elicitor’s offset. The magnitude of the post-elicitor ΔP was also in most cases larger for the off-frequency elicitor than for the on-frequency elicitor.

Overall, the elicitors used in this study produced substantial changes in the ear-canal pressure that persisted for a long time after their offsets. The recovery times were much longer than those reported for the effects of efferent activation by a 60-dB SPL notched noise on the SFOAE at 1 kHz (Backus and Guinan 2006). The nonmonotonic behavior of the post-elicitor ΔP magnitude suggests that more than one mechanism may have played a role.

The post-elicitor effects expressed in dB SPL are shown in Figure 4, for the on-frequency elicitor (upper panel) and the off-frequency elicitor (lower panel). In each panel, the three bars plotted for each subject show the effect for the three phase curvatures used: Schroeder-phase negative (filled bar), zero-phase complex (coarse-hatched bar), and Schroeder-phase positive (fine-hatched bar). The effects of the elicitor were considered significant when they met two criteria. First, they had to exceed 5 dB (i.e., exceed two standard deviations from the mean noise-floor level calculated from the average pre-elicitor ΔP magnitude, which was obtained by averaging the real and imaginary parts of the complex-valued ΔP within the 100-ms pre-elicitor window positioned at 0.85 s in the top right panel of Figure 2). Second, they had to be statistically significant according to a one-tailed Welch’s t test. According to these two criteria, all the effects shown in Figure 4 were significant. The rightmost set of bars shows the mean effect for the seven subjects tested.
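The two significance criteria can be sketched as follows. This is only an illustration under stated assumptions: the per-run levels are hypothetical, the function names are introduced here, and the text does not specify which samples entered the Welch's t test, so the comparison of effect runs against noise-floor runs is an assumption. The sketch computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom but leaves the p-value lookup to a statistics package.

```python
import math
from statistics import mean, stdev, variance

P_REF = 20e-6  # reference pressure for dB SPL, in pascals


def vector_average_db_spl(samples):
    """Average real and imaginary parts separately, then return the
    magnitude of the averaged complex pressure in dB SPL."""
    re = mean(z.real for z in samples)
    im = mean(z.imag for z in samples)
    return 20.0 * math.log10(math.hypot(re, im) / P_REF)


def exceeds_two_sd(effect_db, noise_db):
    """Criterion 1: effect lies more than two SDs above the mean noise floor."""
    return effect_db > mean(noise_db) + 2.0 * stdev(noise_db)


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for two samples with possibly unequal variances."""
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical levels in dB SPL (illustrative only, not data from the study).
noise_runs = [-12.0, -10.5, -11.0, -13.0, -11.5]   # pre-elicitor window
effect_runs = [2.0, 3.5, 1.0, 4.0, 2.5]            # post-elicitor window

criterion_1 = exceeds_two_sd(mean(effect_runs), noise_runs)
t_stat, dof = welch_t(effect_runs, noise_runs)
```

Averaging the real and imaginary parts before taking the magnitude means that components with random phase tend to cancel, so the resulting noise-floor estimate is lower than the average of the individual magnitudes would be.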

FIG. 4. Effects of the on-frequency elicitors (*upper panel*) and off-frequency elicitors (*lower panel*) for the parameter C values of −1 (*filled bars*), 0 (*coarse-hatched bars*), and 1 (*fine-hatched bars*). The *three rightmost bars in both panels* represent the mean across the seven listeners tested. The *error bars on the rightmost bars* represent one standard error of the mean.

To be consistent with the original hypothesis of Wojtczak and Oxenham (2009b), the effects shown in Figure 4 for C = −1 and C = 1 should have been consistently larger or smaller than for C = 0, depending on the mechanism involved. For example, if the effects in Figure 4 were due to the MOCR, and thus due to a reduction of cochlear gain, then a greater reduction in SFOAE magnitude (i.e., taller bars) for C = −1 and 1 than for C = 0 would be consistent with the psychophysical forward-masking data. A similar pattern would be expected if the effects reflected a reduced admittance due to the activation of the MEMR. If, however, the effects were due to an increased admittance at 6 kHz, then a smaller change in the ear-canal pressure (i.e., a smaller increase in admittance) for elicitors with C = −1 and 1 compared with that for the elicitor with C = 0 would be consistent with the psychophysical data. Neither result was consistently observed in the individual or the mean data shown in Figure 4. A repeated-measures two-way ANOVA with the main factors of phase curvature and condition (on- vs. off-frequency elicitor) showed that the effect of the phase curvature was statistically significant [F(2,12) = 4.34, p = 0.04]. The post-elicitor effect was significantly larger for off-frequency elicitors than for on-frequency elicitors [F(1,6) = 35.55, p = 0.001], but there was no significant interaction between the elicitor’s phase curvature and condition [F(2,12) = 0.90, p = 0.43]. Although the effect of the phase curvature was significant, it reflected the tendency for the elicitor effect to be the largest for C = −1 and smallest for C = 1, inconsistent with the working hypothesis proposed to explain psychophysical masking by these harmonic complexes.
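The logic of the comparison can be written as a small check. This is a hedged sketch: the function name and the example values are hypothetical and only mirror the reasoning above, in which both Schroeder-phase elicitors must fall on the same side of the zero-phase elicitor for the pattern to match either version of the hypothesis.

```python
def classify_pattern(effects_db):
    """Classify one set of post-elicitor effects keyed by C in {-1, 0, 1}."""
    neg, zero, pos = effects_db[-1], effects_db[0], effects_db[1]
    if neg > zero and pos > zero:
        # Predicted if the effects reflect reduced cochlear gain (MOCR)
        # or reduced admittance (MEMR).
        return "both larger"
    if neg < zero and pos < zero:
        # Predicted if the effects reflect increased admittance at 6 kHz.
        return "both smaller"
    return "inconsistent"


# Hypothetical effects (dB SPL) following the reported tendency:
# largest for C = -1 and smallest for C = 1.
example = {-1: 14.0, 0: 12.5, 1: 11.0}
result = classify_pattern(example)
```

A monotonic ordering across C values, such as the one in `example`, falls into the third category because the two Schroeder-phase conditions lie on opposite sides of the zero-phase condition.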

In summary, the original hypothesis that masker-phase-dependent changes in forward masking can be explained in terms of efferent effects was not supported. All six elicitors (with three C values in on- and off-frequency conditions) produced significant changes in the ear-canal sound pressure at the probe frequency during and after the elicitor, but the changes were not affected by the phase characteristics of the masker in a way that was consistent with the psychophysical forward-masking data in the study by Wojtczak and Oxenham (2009b). In addition, based on the results shown thus far, it is not possible to determine which of the feedback-based reflexes, the MOCR or MEMR, dominated these pressure changes. In the following experiment, two paradigms were used to gain more insight into the mechanisms producing the changes in ear-canal sound pressure.