Abstract
Context effects in loudness have been observed in normal auditory perception and may reflect a general gain control of the auditory system. However, little is known about such effects in cochlear-implant (CI) users. Discovering whether and how CI users experience loudness context effects should help us better understand the underlying mechanisms. In the present study, we examined the effects of a long-duration (1-s) intense precursor on the loudness relations between shorter-duration (200-ms) target and comparison stimuli. The precursor and target were separated by a silent gap of 50 ms, and the target and comparison were separated by a silent gap of 2 s. For normal-hearing listeners, the stimuli were narrowband noises; for CI users, all stimuli were delivered as pulse trains directly to the implant. Significant changes in loudness were observed in normal-hearing listeners, in line with earlier studies. The CI users also experienced some loudness changes but, in contrast to the results from normal-hearing listeners, the effect did not increase with increasing level difference between precursor and target. A “dual-process” hypothesis, used to explain earlier data from normal-hearing listeners, may provide an account of the present data by assuming that one of the two mechanisms, involving “induced loudness reduction,” was absent or reduced in CI users.
Keywords: auditory context effects, loudness recalibration, cochlear implants, loudness enhancement
Introduction
Our perception of a stimulus or event is dependent in large part on the context in which it is presented. Much has been learned about perceptual processing through the study of context effects and their neural correlates. In auditory perception, judgments of the loudness of a sound can be affected by the intensity relation between that target sound and sounds that precede it. Early studies showed that when an intense auditory stimulus precedes a weaker one, the loudness of the weaker stimulus can be judged to have increased by as much as 30 dB, whereas when the preceding signal, or precursor, is less intense than the following target signal, the loudness of the target decreases somewhat from its loudness in isolation (Galambos et al. 1972; Elmasian and Galambos 1975; Elmasian et al. 1975). This phenomenon, known as loudness enhancement or decrement, was thought to reflect a general principle of intensity coding and gain control in the auditory system.
These early studies generally involved a three-tone paradigm, with a conditioning (or precursor) tone, followed by a target tone and then a comparison tone, which subjects adjusted in level to match the loudness of the target tone. All three tones were presented at the same frequency. Manipulations of the presentation ear revealed that loudness enhancement was strongest when all tones were presented to the same ear (binaural or monaural presentations). In a dichotic situation (with the precursor and target presented to opposite ears), less, but still significant, loudness enhancement was observed (Elmasian and Galambos 1975). In contrast, loudness decrement effects seemed relatively insensitive to ear of presentation (Elmasian et al. 1980), suggesting that loudness enhancement may involve some monaural, possibly peripheral, processing components, whereas loudness decrement may primarily involve more central sites. A finding that raised fundamental questions concerning the peripheral nature of loudness enhancement was that enhancement (and decrement) could also be observed when the conditioning tone was presented after the target tone in time (Elmasian et al. 1980).
Loudness enhancement and decrement are considered “assimilative” effects, in that the loudness of the target is drawn towards that of the conditioner (and presumably vice versa). Other studies of loudness context effects have reported the opposite, namely that an intense precursor tone can reduce the loudness of a subsequent tone that is presented at a lower level. In contrast to loudness enhancement, this “loudness recalibration” (e.g., Marks 1994; Mapes-Riordan and Yost 1999) or “induced loudness reduction” (e.g., Scharf et al. 2002) seems to be a relatively long-lasting effect. It is generally observed when the precursor and target are at the same frequency, but the comparison tone is presented at a frequency that is remote from that of the precursor and target. As with loudness enhancement, the effect can be relatively large, ranging from about 10 to 20 dB, depending on the measurement method and stimulus parameters used. Interestingly, maximum loudness recalibration is not obtained directly after the precursor, but instead builds up to reach a maximum at a delay of around 1 s, and is still observable at a delay of 3 s (Mapes-Riordan and Yost 1999).
As proposed by Scharf et al. (2002), and supported by Arieh and Marks (2003a), the build-up and relatively long time constants associated with loudness recalibration suggest a possible reinterpretation of the earlier loudness enhancement studies, where all three tones were presented at the same frequency. In particular, it may be that the comparison tone is reduced in loudness by the precursor rather than the target tone being increased in loudness. To investigate this issue, Oberfeld (2007) used a four-tone task, with the first three tones (precursor, target, and comparison) at the same frequency and fourth tone at a remote frequency. He asked listeners to compare the loudness of the original comparison (third) tone with that of the fourth tone. According to Oberfeld’s results, it seems that both enhancement and adaptation contribute to loudness recalibration. Results from his study support an earlier hypothesis of Arieh and Marks (2003a) that loudness recalibration reflects a dual-process mechanism. On one hand, when an intense auditory signal (precursor) precedes a weaker one (target) by a short gap (less than 100 ms), the loudness of the following signal can be enhanced (Elmasian and Galambos 1975; Marks 1988); on the other hand, when the time interval between precursor and target (close in frequency) exceeds 200 ms, the target signal will be reduced, perhaps due to adaptation (Arieh and Marks 2003a). These properties of loudness recalibration could be explained by the interaction between a fast-onset and fast-decay enhancement process and a fast-onset but slower-decay adaptation process (Oberfeld 2007).
There are many potential sources of both enhancement and adaptation along the auditory pathways, and few attempts have been made to constrain the locus or nature of these sources. One of the potential sources of an adaptation-like process is the medial olivocochlear (MOC) efferent system, which acts to reduce both the gain and frequency selectivity of the basilar membrane response to sound, by affecting the action of the outer hair cells (Nieder et al. 2003; Guinan 2006; Jennings et al. 2009). As such, an MOC-based effect could, in principle, help explain why loudness effects transfer only partially across the ears; MOC effects are activated bilaterally but are strongest for ipsilateral activation (Guinan 2006). Although the time constants associated with the MOC fast effect are not thought to extend to several seconds, the slow effect of MOC may at least contribute to loudness changes (Cooper and Guinan 2003).
In this study, we investigated context effects on loudness using both normal-hearing listeners and cochlear-implant (CI) users with a three-tone paradigm similar to that used in early loudness enhancement studies. We use loudness context effect (LCE) as a relatively neutral term to avoid any assumption regarding whether the effect reflects an enhancement of the target or adaptation of the comparison (or both). The stimuli were presented as high-rate pulse trains to single electrodes of the CIs. In the normal-hearing listeners, narrowband noises were used (rather than tones) to better simulate the spread of excitation produced by single-electrode stimulation in CIs (e.g., Bingabr et al. 2008). In addition, we varied the frequency (or electrode) of the precursor relative to that of the target and comparison tones. The rationale was that if two different mechanisms are responsible for the time course of LCE, then the two mechanisms might have different frequency selectivity. The comparison of normal-hearing listeners and CI users allowed us to test the role of the MOC efferent system. Because MOC efferent activation affects cochlear gain, it requires an intact cochlea. Therefore, any portion of the effect due to MOC efferent effects should not be observed in CI users. Thus, if CI users show some LCE, we could conclude that LCE cannot be due solely to MOC activation (although it may still play some role). As a result, investigating LCE in CI users may provide us with important information about the potential underlying mechanisms. Some researchers have suggested that the cochlear gain changes induced by the MOC efferent system may be important for speech perception in noise (e.g., Guinan 2010; Garinis et al. 2011; Clark et al. 2012; de Boer et al. 2012; Mishra and Lutman 2014). Therefore, any differences in the results between normal-hearing listeners and CI users may provide guidance for future CI signal processing systems to restore normal context effects for auditory and speech perception.
Experiment 1: Loudness Context Effects in Normal-Hearing Listeners
Methods
Subjects
Seven listeners (two males, five females) participated in this experiment and were compensated for their time. Their ages ranged from 18 to 63 years (mean age 26.1 years; only one subject older than 45). All listeners had normal hearing, as defined by audiometric thresholds below 20 dB HL at octave frequencies between 0.25 and 8 kHz. All participants provided written informed consent, and all protocols were approved by the Institutional Review Board of the University of Minnesota.
Stimuli
A schematic diagram of the stimuli used in this experiment is shown in Figure 1A. Each trial consisted of three sounds: a precursor, a target, and a comparison. The temporal properties of the stimuli remained constant for the entire experiment. The total duration of the precursor was 1 s, and the total durations of both the target and the comparison were 200 ms. The precursor and target were separated by a silent gap of 50 ms, which was sufficient to trigger both loudness enhancement and induced loudness reduction (ILR) effects according to Arieh and Marks (2003a), and the target and comparison were separated by a silent gap of 2 s. All the stimuli were gated on and off with 10-ms raised-cosine ramps. All the stimuli were narrowband noises, created by filtering a Gaussian white noise with a second-order IIR peaking filter in the time domain, with slopes of either 24 or 96 dB/octave. The use of bandpass noise was intended to simulate the spread of current produced by CIs, and the different slopes were intended to simulate different degrees of current spread produced by monopolar and bipolar stimulation modes. The 24 dB/octave slopes were chosen to be within the range provided by Bingabr et al. (2008) to simulate monopolar stimulation (although shallower slopes have also been assumed; see Oxenham and Kreft 2014); the 96 dB/octave slopes were chosen to be in the range of the values provided by Bingabr et al. (2008) to simulate bipolar stimulation.
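As a rough illustration of the trial structure described above, the following Python sketch lays out the three intervals on a common time axis and builds a 10-ms raised-cosine gating envelope for the target. The function names, the sample-based implementation, and the example rate of 20,000 Hz are assumptions made for illustration only; they are not taken from the study's software.

```python
import math

# Durations in seconds, taken from the stimulus description above.
PRECURSOR_DUR = 1.0
TARGET_DUR = 0.2
COMPARISON_DUR = 0.2
GAP_PRECURSOR_TARGET = 0.05
GAP_TARGET_COMPARISON = 2.0
RAMP_DUR = 0.01  # 10-ms raised-cosine onset and offset ramps


def trial_timeline():
    """Return (name, onset, offset) tuples, in seconds, for one trial."""
    events = []
    t = 0.0
    for name, dur, gap_after in [
        ("precursor", PRECURSOR_DUR, GAP_PRECURSOR_TARGET),
        ("target", TARGET_DUR, GAP_TARGET_COMPARISON),
        ("comparison", COMPARISON_DUR, 0.0),
    ]:
        events.append((name, t, t + dur))
        t += dur + gap_after
    return events


def raised_cosine_envelope(duration, ramp, fs):
    """Gating envelope: 0 -> 1 over `ramp` s, flat at 1, then 1 -> 0 over `ramp` s."""
    n_total = round(duration * fs)
    n_ramp = round(ramp * fs)
    env = [1.0] * n_total
    for i in range(n_ramp):
        gain = 0.5 * (1.0 - math.cos(math.pi * i / n_ramp))
        env[i] = gain                # onset ramp
        env[n_total - 1 - i] = gain  # mirrored offset ramp
    return env


timeline = trial_timeline()
# fs=20000 is an assumed example rate, not the rate used in the experiment.
target_env = raised_cosine_envelope(TARGET_DUR, RAMP_DUR, fs=20000)
```

Multiplying a noise sample by `target_env`, element by element, would apply the gating; the same envelope function could be reused for the precursor and comparison with their own durations.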
The level of the target was always 60 dB sound pressure level (SPL). A precursor level of 70 dB SPL was tested in conjunction with filter slopes of both 24 and 96 dB/octave. The 10 dB level difference between the precursor and target was selected because it was deemed large enough to produce some effect, based on previous studies (Elmasian and Galambos 1975; Elmasian et al. 1980), but not so large as to make a comparison with CI users difficult, based on their more limited dynamic range (Hong et al. 2003). The center frequency of the precursor within each block was selected from one of five values (455, 762, 1278, 2142, or 3590 Hz), approximately logarithmically spaced around the center frequency of the target and comparison, which was always 1278 Hz. The spacing between adjacent components corresponds to 3.5 to 4.5 equivalent rectangular bandwidths (ERBs) of the auditory filters (Glasberg and Moore 1990). The level of precursor and target remained constant within each block. The level of the comparison varied between trials within a specific range centered around the target level, from 57 to 63 dB SPL in 1-dB steps.
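The spacing of the five precursor center frequencies can be checked with a short calculation. The sketch below uses the ERB-number formula of Glasberg and Moore (1990), 21.4 log10(4.37F + 1) with F in kHz, and prints the ratio and the ERB-number step between adjacent center frequencies; the variable and function names are illustrative.

```python
import math

CENTER_FREQS_HZ = [455, 762, 1278, 2142, 3590]


def erb_number(freq_hz):
    """ERB-number scale of Glasberg and Moore (1990); input in Hz."""
    return 21.4 * math.log10(4.37 * freq_hz / 1000.0 + 1.0)


pairs = list(zip(CENTER_FREQS_HZ, CENTER_FREQS_HZ[1:]))
ratios = [hi / lo for lo, hi in pairs]                          # constant ratio = log spacing
erb_steps = [erb_number(hi) - erb_number(lo) for lo, hi in pairs]

for (lo, hi), ratio, step in zip(pairs, ratios, erb_steps):
    print(f"{lo} -> {hi} Hz: ratio {ratio:.3f}, step {step:.2f} ERBs")
```

A nearly constant frequency ratio indicates logarithmic spacing, and the printed steps can be compared with the range of 3.5 to 4.5 ERBs quoted above.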
Additional data were collected with an 85-dB SPL precursor and a 60-dB SPL target, with filter slopes of 96 dB/octave and only one precursor center frequency of 1278 Hz, corresponding to the center frequency of the target and comparison. The comparison level range was from 55 to 65 dB SPL, in 2-dB steps. A larger step size was used with the higher precursor level, because a larger effect was expected, based on previous literature (Elmasian and Galambos 1975).
The stimuli were generated digitally and played out diotically from a LynxStudio L22 24-bit soundcard at a sampling rate of 22.5 kHz via Sennheiser HD650 headphones to listeners seated in a double-walled sound-attenuating chamber.
Procedure
A training session was run prior to the actual experiment, involving the target and comparison sounds, but no precursor. Listeners were instructed to respond to the question, “Which sound is louder?” via virtual buttons on the computer display. As in the actual experiment, the target was always 60 dB SPL. The comparison was presented at one of six levels: 57, 58, 59, 61, 62, and 63 dB SPL. Each level was presented 10 times, resulting in 60 trials per training block. Feedback was provided throughout the training session. Listeners were required to reach 80 % correct to proceed to the actual experiment. All of the participants achieved this level of performance within two blocks of training.
In the actual experiment, listeners were asked to ignore the precursor (if present) and to again judge which of the two short sounds (the target and the comparison) was louder. A reference condition with no precursor (similar to the training condition) was also included. Each precursor condition was repeated five times in random order within each of three sessions. The first session involved the 70-dB SPL precursor at one of five center frequencies with the 24-dB/octave filter slopes; the second session involved the 70-dB SPL precursor at one of five center frequencies with the 96-dB/octave filter slopes; the third session involved the 85-dB SPL precursor at only a single center frequency with the 96-dB/octave filter slopes. In the first and second sessions, each block comprised one precursor frequency (or no precursor) with seven comparison levels, repeated 10 times in random order, making a total of 70 trials per block. Each session contained 30 blocks (five repetitions for each of the six precursor conditions, with trials in a new random order in each block), for a total of 50 repetitions of each stimulus per subject. In the final session, with the 85-dB precursor, six comparison levels were each repeated 10 times, for a total of 60 trials per block. A total of 10 blocks of trials were presented per subject in the last session (reference and on-frequency condition, five times for each condition), for a total of 50 repetitions of each stimulus per subject. No feedback was provided in the test sessions. The whole experiment lasted about 6 to 8 h, divided into 2-h sessions.
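The block and trial counts for the first two sessions can be sketched as follows. This is a minimal illustration assuming that block order is fully randomized across the session and that trials are shuffled independently within each block; the text does not specify the randomization scheme in more detail, and all names below are hypothetical.

```python
import random

PRECURSOR_CONDITIONS = ["none", 455, 762, 1278, 2142, 3590]  # Hz; "none" = reference
COMPARISON_LEVELS = [57, 58, 59, 60, 61, 62, 63]              # dB SPL
BLOCK_REPEATS = 5       # each precursor condition appears in 5 blocks
TRIALS_PER_LEVEL = 10   # each comparison level appears 10 times per block


def build_session(rng):
    """Return a list of (condition, comparison levels in trial order) blocks."""
    block_order = PRECURSOR_CONDITIONS * BLOCK_REPEATS
    rng.shuffle(block_order)  # assumed: blocks fully randomized within a session
    session = []
    for condition in block_order:
        trials = COMPARISON_LEVELS * TRIALS_PER_LEVEL
        rng.shuffle(trials)   # new random trial order in every block
        session.append((condition, trials))
    return session


session = build_session(random.Random(0))  # the seed is arbitrary
```

With these settings the session has 30 blocks of 70 trials, and each combination of precursor condition and comparison level occurs 5 × 10 = 50 times, matching the counts given above.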
Results and Discussion
The mean results are presented in Figure 2. In each panel, the proportion of trials in which the comparison was judged to be louder than the target is plotted as a function of the comparison level. Figures 2A and 2B show the results with a 70-dB SPL precursor, with data from the 24 and 96 dB/octave filter slopes, respectively. Different symbols represent the different precursor center frequencies, as shown in the legend. Figure 2D shows the data using the precursor level of 85 dB SPL and filter slopes of 96 dB/octave. Figure 2C replots the on-frequency-precursor and no-precursor conditions from Figure 2B for ease of comparison.
Consider first the conditions with no precursor (filled circles). In all three conditions, the point of subjective equality (PSE), i.e., the level at which the comparison was judged louder than the target 50 % of the time, was reached at a comparison level between 58 and 60 dB SPL. In other words, the two stimuli were judged equally loud when the target was 0–2 dB higher in level than the comparison. Perceptual biases of this kind have occurred in other loudness comparison studies, although the direction of the bias does not appear to be always consistent. For instance, in Elmasian et al. (1980), for baseline conditions, the 50-dB target alone was matched with a comparison tone level of around 52 dB, whereas the 70-dB target alone was matched with a comparison tone level nearer 66 dB SPL.
Consider next the effect of adding a precursor. In general, the addition of a precursor resulted in the target being judged louder (and/or the comparison being judged quieter), as shown by the fact that the filled circles (precursor absent) lie above the other symbols in all conditions. Moreover, the on-frequency precursor produced the largest effects, as shown by the fact that the open circles generally fall below all the other symbols. In general, the effect of the precursor diminished with increasing spectral distance between the precursor and target. This trend is particularly apparent in the case of the 24 dB/octave slopes, where the progression from no difference to a large difference in center frequency was more systematic; in the condition with 96 dB/octave slopes, the on-frequency precursor produced the largest effect, but all other precursor conditions produced similarly small effects.
Finally, consider the effect of precursor level. As expected from previous studies (Elmasian and Galambos 1975; Mapes-Riordan and Yost 1999), the overall effect (difference between no precursor and on-frequency precursor) seems greater with the higher-level than with the lower-level precursor (compare Fig. 2C and 2D).
Probit analysis was used to fit each of the curves shown in Fig. 2 for each subject individually. The fitted curves from each subject and each condition were then used to calculate the comparison level at the PSE. A level higher than 60 dB SPL implies that the comparison required a higher level than the target to be judged equally loud.
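To make the PSE estimate concrete, the sketch below fits a cumulative-normal psychometric function to proportion-louder data and reads off the 50 % point. It is a simplified stand-in, assuming an ordinary least-squares line through probit-transformed (z-scored) proportions rather than the maximum-likelihood fit of a full probit analysis; the function name, the clipping value, and the example data are all hypothetical.

```python
from statistics import NormalDist


def probit_pse(levels, prop_louder, clip=0.01):
    """Estimate the PSE from a straight-line fit to probit-transformed proportions.

    Simplification: ordinary least squares on z-scores, not the
    maximum-likelihood fit used by full probit analysis.
    """
    std_normal = NormalDist()
    # Clip to avoid infinite z-scores at proportions of exactly 0 or 1.
    z = [std_normal.inv_cdf(min(max(p, clip), 1.0 - clip)) for p in prop_louder]
    n = len(levels)
    mean_x = sum(levels) / n
    mean_z = sum(z) / n
    sxx = sum((x - mean_x) ** 2 for x in levels)
    sxz = sum((x - mean_x) * (zi - mean_z) for x, zi in zip(levels, z))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    return -intercept / slope  # level at which z = 0, i.e., p = 0.5


# Hypothetical data generated from a known psychometric function (PSE = 61 dB SPL).
levels = [57, 58, 59, 60, 61, 62, 63]
true_curve = NormalDist(mu=61.0, sigma=2.0)
props = [true_curve.cdf(x) for x in levels]
pse = probit_pse(levels, props)
```

Because the example proportions come from a noiseless cumulative normal, the fit recovers the generating PSE; with real response counts a weighted or maximum-likelihood fit would be the more appropriate choice.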
To confirm the statistical significance of the trends described above, within-subjects analyses of variance (ANOVA) were carried out with Huynh-Feldt corrections for lack of sphericity applied where appropriate, using the fitted PSEs as the dependent variable. In the first analysis considering just the conditions with the 70-dB precursor, the factors were filter slope (24 or 96 dB/octave) and precursor (six levels: five frequencies or no precursor). Significant main effects were observed for both precursor [F(5,30) = 8.86; p = 0.001] and filter slope [F(1,6) = 6.94; p = 0.039]. There was also a significant interaction between filter slope and precursor type [F(5,30) = 3.24; p = 0.019]. A planned comparison found a significant difference between the PSE in the no-precursor condition and the PSE in the on-frequency condition [F(1,6) = 14.5; p = 0.009]. In addition, when the no-precursor condition was removed, contrast analysis revealed a quadratic trend for precursor frequency [F(1,6) = 23.4; p = 0.003]. These two findings indicate that the precursor affected loudness judgments and that the effect appeared to be frequency selective, with the effect decreasing with increasing frequency distance between the precursor and the target frequency. Although the effect of filter slope and its interaction with precursor frequency reached significance, the effects appear small and not easily interpretable.
To assess the effect of precursor level, the difference in PSE between the no-precursor condition and the on-frequency precursor condition was calculated from the data from session 2 (70 dB SPL precursor) and session 3 (85 dB SPL precursor). These differences, which represent the effect of the precursor on the loudness comparison, or LCE (in dB), were subjected to a paired-samples (within-subjects) t test. As illustrated in Figure 3A, the difference in LCEs, which were 1.52 and 5.74 dB for the 70- and 85-dB precursor, respectively, was significant [t(6) = 5.08, p = 0.002].
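The LCE calculation and the paired comparison can be sketched as below. The per-listener PSE values are hypothetical and are not the study's data; the helper names are likewise illustrative. The sketch computes only the t statistic and its degrees of freedom, not the p value.

```python
import math
from statistics import mean, stdev


def lce(pse_on_frequency, pse_no_precursor):
    """Loudness context effect in dB for one listener."""
    return pse_on_frequency - pse_no_precursor


def paired_t(sample_a, sample_b):
    """Paired-samples t statistic and degrees of freedom for b - a."""
    diffs = [b - a for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical per-listener PSEs (dB SPL), not the study's data.
pse_ref_70 = [59.0, 59.5, 60.0]
pse_on_70 = [60.0, 61.0, 61.5]
pse_ref_85 = [58.5, 59.0, 59.5]
pse_on_85 = [63.5, 65.0, 66.5]

lce_70 = [lce(on, ref) for on, ref in zip(pse_on_70, pse_ref_70)]
lce_85 = [lce(on, ref) for on, ref in zip(pse_on_85, pse_ref_85)]
t_stat, df = paired_t(lce_70, lce_85)
```

Each listener contributes one LCE per precursor level, so the test operates on the within-listener difference between the two LCE values.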
One puzzling aspect of the data is that the larger LCE with the higher-level precursor is not just due to the higher PSE in the precursor condition but seems to be also due to the lower PSE in the no-precursor condition. It is not clear why the no-precursor PSE was lower in the session that tested the higher-level precursor. It is conceivable that having blocks with the higher-level precursor interspersed with the no-precursor blocks led to an “over-compensation” of responses in the no-precursor blocks, in order for subjects to keep the overall number of “louder” and “quieter” responses more equal, when averaged over the session. However, the effect was relatively small, and further study would be needed to test this speculation.
In summary, significant LCE was observed in normal-hearing listeners. The effect exhibited frequency selectivity: it was greatest when the precursor was at the same frequency as the target and decreased with increasing spectral distance between the precursor and the target. The effect was also level-dependent, as it was greater for the 85-dB precursor than for the 70-dB precursor. Although the effect of filter slope reached statistical significance when all conditions were included, the overall amount of LCE and the effect of frequency separation between precursor and target were similar for both filter slopes tested.
Experiment 2: Loudness Context Effects in Cochlear-Implant Users
Methods
Subjects
Seven post-lingually deafened CI users participated in this study and were compensated for their time. Information regarding the individual CI users is provided in Table 1. All participants provided written informed consent, and all protocols were approved by the Institutional Review Board of the University of Minnesota.
Table 1.
a The level was measured on the target electrode (electrode 8).
Stimuli
The design was similar to that of Experiment 1. All the stimuli were delivered directly to the internal cochlear stimulator (ICS) system based on a clinical research platform, BEDCS, provided by Advanced Bionics. All the center frequencies in Experiment 1 were converted to corresponding electrodes as shown in Figure 1B (compare left and right y-axis labels). The durations of all the signals and gaps were exactly the same as those used in Experiment 1, with the exception that no onset and offset ramps were used. All stimuli were presented as trains of 32 μs/phase, cathodic-first biphasic pulses, presented in monopolar mode at a rate of 2000 pulses per second (pps).
Procedure
Before the experiment, three parameters were measured for each selected electrode of each subject to set the stimulus levels: threshold (THS), most comfortable level (MCL), and maximum acceptable level (MAL). Stimuli were 200-ms pulse trains. The THS was measured using a three-interval, three-alternative forced-choice (3IFC/3AFC) procedure with a two-down, one-up adaptive tracking rule and correct-answer feedback. The THS estimates from two tracks were averaged to obtain a final THS value for each electrode and each subject. The MAL was measured using a one-up, one-down adaptive tracking procedure in which the sound was presented, followed by the question “Was it too loud?” A subject’s “no” and “yes” choices led to increases and decreases in signal level, respectively. The track terminated when the subject had responded that the intensity was too loud six times, and the average current level at the last six reversals was calculated. The procedure to obtain MCL was the same as that for the MAL, except that the question posed to the subject was “Was the sound medium loud/comfortable?” The MCL estimates from two tracks were averaged to obtain a final value of MCL for each electrode for each subject.
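The one-up, one-down track used for the MAL can be illustrated with a small simulation. The starting level, the step size, the definition of a reversal as any change in response from one trial to the next, and the deterministic stand-in listener are all assumptions introduced here; the text specifies only the tracking rule, the stopping criterion, and the averaging over the last six reversals.

```python
def run_mal_track(is_too_loud, start_level, step, max_yes=6, n_reversals=6):
    """One-up, one-down track; returns the mean level at the last reversals.

    `is_too_loud(level)` stands in for the listener's yes/no answer.
    """
    level = start_level
    previous = None
    reversal_levels = []
    yes_count = 0
    while yes_count < max_yes:
        answer = is_too_loud(level)
        if previous is not None and answer != previous:
            reversal_levels.append(level)  # response changed at this level
        if answer:
            yes_count += 1
            level -= step  # "yes, too loud" -> decrease the level
        else:
            level += step  # "no" -> increase the level
        previous = answer
    last = reversal_levels[-n_reversals:]
    if not last:
        raise ValueError("track ended without any reversal")
    return sum(last) / len(last)


# Deterministic stand-in listener with a hypothetical limit of 145 (arbitrary units).
estimate = run_mal_track(lambda level: level > 145, start_level=100, step=10)
```

With this idealized listener the track oscillates between the two levels that bracket the limit, so the reversal average falls midway between them. A real listener's responses would be noisier, which is one reason to average estimates across tracks.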
A similar training session with the same criteria was set up as in Experiment 1. The target was always presented at 70 % of the dynamic range (DR) of MCL in units of μA, and the comparison level was selected from 64, 66, 68, 72, 74, and 76 % DR of MCL, based on pilot data. Within one training block, 10 repetitions for each level were presented in random order. No precursor was included. Feedback was provided and listeners were required to reach 80 % correct to proceed to the actual experiment. All of the participants achieved this level of performance within two blocks of training.
The subjects’ task was again to compare the loudness of the two brief sounds, the target and comparison. In the first session, a no-precursor reference condition and five precursor conditions (with precursors presented to electrode E2, E5, E8, E11, or E14) were tested. The target level was 70 % DR of MCL, and the precursor (if present) was presented at MCL. In the second and third sessions, two conditions (no-precursor reference and E8) and two level relationships were investigated. The target level was fixed at MCL for both sessions. In the second session, the precursor level was the mean value of MCL and MAL in μA. In the third session, the precursor was presented at MAL. Each condition was repeated five times in random order in all three sessions, which resulted in 30 blocks in the first session and 10 blocks in each of the last two sessions. There were seven comparison levels symmetrically distributed around the target level in all sessions, from 64 to 76 % DR of MCL in the first session, and from 94 to 106 % in the second and third sessions, with a step size of 2 % DR. Each comparison level was run 10 times in each block, for a total of 70 trials in each block and 50 repetitions of each stimulus per subject. No feedback was provided in the test sessions. The entire experiment lasted about 6–8 h, divided into 2-h sessions.
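One possible reading of the "% DR of MCL" scale is sketched below. It assumes that 0 % corresponds to THS and 100 % to MCL, with linear interpolation in μA, so that values above 100 % extrapolate beyond MCL. This mapping is an assumption made for illustration and is not confirmed by the text; the electrode values are hypothetical.

```python
def percent_dr_to_current(percent, ths_ua, mcl_ua):
    """Map a level in % DR to current in microamps.

    Assumed reading: 0 % corresponds to THS and 100 % to MCL, with linear
    interpolation in microamps (values above 100 % extrapolate past MCL).
    """
    return ths_ua + (percent / 100.0) * (mcl_ua - ths_ua)


# Hypothetical electrode: THS = 100 uA, MCL = 300 uA.
session1_levels = [percent_dr_to_current(p, 100.0, 300.0) for p in range(64, 77, 2)]
session23_levels = [percent_dr_to_current(p, 100.0, 300.0) for p in range(94, 107, 2)]
```

Under this reading, the seven comparison levels in each session are equally spaced in μA and centered on the target level (70 % DR in the first session, MCL in the second and third).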
Results and Discussion
Figure 4 shows the mean results of Experiment 2. Panels A and B show the results with an MCL precursor and a target at 70 % DR of MCL. Panels C and D show the data with the precursor at the level corresponding to the mean of MCL and MAL, and at MAL, respectively. Different symbols represent the different precursor electrode numbers, as shown in the legend.
In general, some trends found in CI users were similar to those in normal-hearing subjects, with the presence of the precursor resulting in the target being judged louder than the comparison at equal levels. However, in contrast to the findings with normal-hearing listeners, the higher-level precursor did not result in a larger LCE.
As in Experiment 1, a probit analysis was carried out on the data from the individual CI users in each condition, and the PSEs were derived from those fits. A one-way within-subjects ANOVA on the PSE data revealed a significant main effect for the precursor at MCL (Fig. 4A) [F(5,30) = 2.76; p = 0.043]. A planned analysis comparing the on-frequency precursor condition with the no-precursor condition revealed a significant effect [F(1,6) = 11.7; p = 0.021]. Also, contrast analysis from an ANOVA with only the precursor conditions revealed a significant quadratic trend of electrode number [F(1,6) = 12.0; p = 0.013], reflecting the observation that the amount of LCE decreased with increasing electrode distance from the target. Considering the individual data, only one of the seven CI users showed a negative effect, with a lower PSE in the on-frequency precursor condition than in the no-precursor condition.
To assess the effect of precursor level, a paired-samples t test was used to compare the LCE with the precursor at MAL and at the mean of MCL and MAL, as shown in Figure 3B. No significant effect of precursor level was found [t(6) = −0.207; p = 0.843], reflecting the similar difference between the precursor and no-precursor conditions shown in Figure 3B and seen also by comparing Figure 4C and 4D. Considering the expansive or at least linear loudness growth function of CI users with current level in μA, the increment in the current level of the precursor from the MCL/MAL midpoint to MAL should have resulted in a considerable change in loudness (Hong and Rubinstein 2006). Yet, this relatively large change in the presumed loudness of the precursor failed to produce a change in the size of LCE.
In summary, a significant LCE was observed in CI users. In line with earlier data using artificial vowel stimuli (Wang et al. 2012), the data suggest that some auditory context effects can be observed in CI users. In addition, CI users also demonstrated some spectral (or cochlear spatial) selectivity, in that the effect was greatest when the precursor and target were presented to the same electrode. One apparent difference between the data from normal-hearing listeners and CI users is the lack of a level effect in the CI users. However, the lack of a level effect, along with any conclusions about the size of LCE, is tempered by the fact that a direct comparison between normal-hearing listeners and CI users is made difficult by the different units (dB SPL vs. μA) and by the uncertain nature of the relationship between these variables and loudness. The final section attempts to provide a more quantitative comparison of the data from normal-hearing listeners and CI users by equating the results in terms of overall dynamic range.
Comparing Loudness Context Effects in Normal-Hearing Listeners and Cochlear-Implant Users
To provide a more direct comparison between the LCE measured in normal-hearing listeners and CI users, we scaled the amount of LCE for both groups, relative to their respective dynamic ranges. For this calculation, the currents in μA of the CI users were converted to dB values (re: 1 μA). Then, the differences in dB (or ratio in μA) between the PSEs with and without precursors were divided by the total dynamic range in dB of the individual CI users (THS to MAL), or by an assumed dynamic range of 100 dB for the normal-hearing listeners. The resulting ratio was then treated as a percentage. For instance, a 10-dB difference in PSE for the normal-hearing group, given their 100-dB dynamic range, would be regarded as a 10 % PSE shift. Figure 5A shows the mean normalized PSE shift for the two groups, calculated in this manner.
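The normalization described above can be written out as a short calculation. The sketch assumes the usual 20 log10 convention for expressing current in dB re 1 μA; the function names and the example currents are hypothetical, while the 10-dB example for the normal-hearing group is the one given in the text.

```python
import math


def ua_to_db(current_ua):
    """Current in microamps expressed in dB re 1 uA (assumed 20*log10 convention)."""
    return 20.0 * math.log10(current_ua)


def normalized_shift_ci(pse_precursor_ua, pse_reference_ua, ths_ua, mal_ua):
    """PSE shift as a percentage of the listener's dynamic range (THS to MAL)."""
    shift_db = ua_to_db(pse_precursor_ua) - ua_to_db(pse_reference_ua)
    dynamic_range_db = ua_to_db(mal_ua) - ua_to_db(ths_ua)
    return 100.0 * shift_db / dynamic_range_db


def normalized_shift_nh(pse_precursor_db, pse_reference_db, dynamic_range_db=100.0):
    """Same normalization for acoustic levels, with the assumed 100-dB range."""
    return 100.0 * (pse_precursor_db - pse_reference_db) / dynamic_range_db


nh_example = normalized_shift_nh(70.0, 60.0)                  # the 10-dB example from the text
ci_example = normalized_shift_ci(110.0, 100.0, 50.0, 500.0)   # hypothetical currents
```

Because a CI user's electrical dynamic range in dB is much smaller than 100 dB, a shift of a fraction of a dB in current can correspond to a normalized shift of several percent.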
For the normal-hearing listeners, the small level-difference condition refers to the condition in which the precursor was 70 dB SPL, and the large level-difference condition refers to the condition in which the precursor was 85 dB SPL (both with the 96 dB/octave filter slopes and a target level of 60 dB SPL). For the CI users, the small level-difference condition refers to the condition in which the precursor was presented at a level corresponding to the midpoint between MCL and MAL, and the large level-difference condition refers to the condition in which the precursor was presented at MAL (in both cases, the target was presented at MCL). In terms of dynamic range, the difference in precursor level was 15 % for the normal-hearing group and ranged from 2.1 to 12.7 % for the CI group.
A mixed-model ANOVA on the normalized PSEs with group (normal hearing or CI) as a between-subjects factor and level difference (small or large) as a within-subjects factor revealed a significant effect of group [F(1,12) = 18.1; p = 0.001], a significant effect of level difference [F(1,12) = 39.4; p < 0.001], and a significant interaction [F(1,12) = 29.5; p < 0.001], emphasizing the observation that the normalized LCE seems generally smaller in the CI group, and that there is less effect (if any) of level difference in the CI group.
Previous studies have discussed the frequency selectivity of potential underlying processes (Elmasian and Galambos 1975; Marks 1994; Oberfeld 2007) and have concluded that maximal LCE occurs when all stimuli are presented at the same (or similar) frequencies. We observed similar results in both the normal-hearing listeners and CI users in the present experiments. To compare frequency selectivity across the groups, we expressed the normalized LCE, calculated as above, as a proportion of the maximum LCE observed in the on-frequency condition, and plotted it in Figure 5B. To obtain these values, we first obtained the PSEs (in dB, as described above) for each subject in each precursor condition. For each precursor condition, we then individually normalized the PSE as $\mathrm{PSE}_{ni} = (\mathrm{PSE}_{i} - \mathrm{PSE}_{\mathrm{ref}})/(\mathrm{PSE}_{\mathrm{on}} - \mathrm{PSE}_{\mathrm{ref}})$, where $\mathrm{PSE}_{i}$ is the original PSE of condition $i$, and $\mathrm{PSE}_{\mathrm{on}}$ and $\mathrm{PSE}_{\mathrm{ref}}$ are the PSEs from the on-frequency precursor condition and the no-precursor reference condition, respectively. Finally, the averaged values for each subject group were calculated. For the two filter-slope conditions with the normal-hearing listeners, the outcomes are as expected: narrower excitation patterns result in greater frequency selectivity. Interestingly, the frequency selectivity observed in the CI group is quite similar to that found in the normal-hearing group, with frequency selectivity falling between the two curves from the normal-hearing listeners.
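The per-subject normalization relative to the on-frequency condition can be sketched as follows. The PSE values are hypothetical, and the dictionary layout and function name are illustrative choices rather than a description of the analysis code actually used.

```python
def normalize_pse(pse_by_condition, on_key, ref_key):
    """Express each condition's PSE shift as a proportion of the on-frequency shift."""
    pse_ref = pse_by_condition[ref_key]
    pse_on = pse_by_condition[on_key]
    span = pse_on - pse_ref  # maximum LCE; must be nonzero for the ratio to be defined
    return {cond: (pse - pse_ref) / span
            for cond, pse in pse_by_condition.items() if cond != ref_key}


# Hypothetical PSEs (dB SPL) for one listener; numeric keys are precursor frequencies in Hz.
pses = {"none": 59.0, 455: 59.2, 762: 59.8, 1278: 61.0, 2142: 59.6, 3590: 59.1}
normalized = normalize_pse(pses, on_key=1278, ref_key="none")
```

By construction the on-frequency condition maps to 1 and the reference to 0, so the remaining values describe how quickly the effect falls off with spectral (or electrode) distance. A listener with little or no on-frequency effect would make the ratio unstable, which is worth keeping in mind when averaging across subjects.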
General Discussion
This study measured how the loudness relationship between two brief (200-ms) sounds (the target and comparison), spaced 2 s apart, is affected by the presence of a longer (1-s) precursor, preceding the target and separated by a gap of 50 ms. Both normal-hearing listeners and CI users were tested with the precursor presented at various spectral (or electrode) positions relative to the target.
Our findings from Experiment 1, using normal-hearing listeners, are consistent with those of previous studies. A more intense precursor resulted in the target sound being judged louder than the comparison signal when they were presented at equal levels (e.g., Galambos et al. 1972; Elmasian and Galambos 1975), and an increase in precursor level resulted in an increased effect (Elmasian et al. 1980; Zeng 1994; Arieh and Marks 2003b). Finally, the effect of the precursor depended on the frequency proximity of the precursor to the target and comparison, with the maximum effect occurring when the center frequencies of the precursor and target were the same.
In Experiment 2, CI users showed significant LCE and frequency-selectivity effects similar to those found in normal-hearing listeners. The fact that LCE was observed at all in CI users suggests that at least part of LCE originates from a stage of processing higher than the cochlea. This observation implies that the MOC cannot be the only source of LCE. Thus, to the extent that LCE reflects auditory gain control, it must occur at least in part beyond the cochlea. However, one potentially important difference between the normal-hearing and CI results was that the LCE observed in CI users did not seem to depend on precursor level, in contrast to the large level effects observed in normal-hearing listeners.
It is difficult to make quantitative comparisons between the results from normal-hearing listeners and those from CI users, because of the different units (dB SPL vs. μA), and the different (and uncertain) relationship between these units and the underlying neural responses and percepts. We provided one possible approach here, by normalizing the units in terms of overall dynamic range (on an individual basis for the CI users and with an assumption of 100 dB for the normal-hearing listeners). However, the conclusions based on these comparisons must be treated with caution. In addition, although the differences in level between the precursors and the targets in the CI users were substantial, it remains unknown whether the differences in the results reflect genuinely smaller effects in CI users or differences in current level between the precursor and the target that were not sufficient to induce large effects. Future studies using wider ranges of level differences should help resolve this question.
Several theories have been proposed to explain aspects of LCE. Taking account of the fact that the loudness of a target is enhanced if the precursor is more intense than the target, and that its loudness is reduced if the precursor is less intense (Zwislocki and Sokolich 1974; Elmasian et al. 1980), a “mergence hypothesis” was proposed, whereby the loudness of the target is derived from a weighted average of the intensities of both precursor and target (Elmasian et al. 1980). The fact that the effect is only observed for precursor-target gaps of less than 400 ms provides an upper bound to the time window associated with such mergence (Zwislocki and Sokolich 1974). This framework explained many aspects of LCE, except for the “mid-difference hump” discussed by Oberfeld (2007), whereby the effect size is maximal only when the level difference is moderate (e.g., 20–30 dB). According to mergence theory, the effect size should increase monotonically with the level of the precursor, in contrast to the data (Zeng 1994; Plack 1996; Mapes-Riordan and Yost 1999). To account for the mid-difference hump, Oberfeld (2008) proposed the “similarity model”. The idea is that mergence in the auditory perceptual system becomes more effective when two sounds are more similar perceptually. Therefore, if the precursor is presented at a level that is too different from that of the target, the mergence between the precursor and target would be weaker. With appropriately selected parameters, this model can quantitatively predict some of the LCE patterns observed in behavioral studies (Oberfeld 2008).
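The qualitative difference between the two accounts can be illustrated with a toy Python sketch. This is not the formulation of Elmasian et al. (1980) or Oberfeld (2008): the averaging is done on levels in dB rather than on intensities, and the fixed weight, the Gaussian fall-off of the weight with level difference, and all parameter values are assumptions chosen only to show a monotonic versus a non-monotonic pattern.

```python
import math


def mergence_shift(precursor_level, target_level, weight=0.2):
    """Shift (dB) when the target level is replaced by a fixed-weight average.

    merged = (1 - weight) * target + weight * precursor, so the shift
    relative to the target alone is weight * (precursor - target).
    """
    return weight * (precursor_level - target_level)


def similarity_shift(precursor_level, target_level, max_weight=0.2, width=35.0):
    """Same average, but the weight shrinks as the two levels become dissimilar.

    The Gaussian weighting function and its width (dB) are illustrative assumptions.
    """
    difference = precursor_level - target_level
    weight = max_weight * math.exp(-((difference / width) ** 2))
    return weight * difference


target = 60.0  # dB SPL
for difference in range(0, 65, 10):
    fixed = mergence_shift(target + difference, target)
    similar = similarity_shift(target + difference, target)
    print(f"{difference:2d} dB: mergence {fixed:5.2f} dB, similarity {similar:5.2f} dB")
```

With a fixed weight, the predicted shift grows without limit as the precursor level increases, whereas a similarity-dependent weight produces a maximum at a moderate level difference and a decline beyond it. In both versions, a precursor less intense than the target yields a negative shift, corresponding to a loudness decrement.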
As mentioned earlier, studies that use only a single frequency for the target and comparison sounds cannot distinguish between an enhancement of the target and a decrease in the loudness of the comparison (Scharf et al. 2002). Studies using loudness comparisons across frequency have resulted in the proposal of a dual-process account (Arieh and Marks 2003a; Oberfeld 2007). The first process is described as a fast-onset and fast-decay process, which is basically the “similarity model” discussed above. The second process is assumed to be a fast-onset, slow-decay process, which is responsible for the reduction in the loudness of the comparison signal. This process has been termed “induced loudness reduction” by Scharf et al. (2002) and could last for seconds (Arieh and Marks 2003a; Arieh et al. 2005). According to Arieh and Marks (2003a), this effect could monotonically increase to as much as 11 dB within about 1 s and then level off. In Elmasian and Galambos (1975), the amount of loudness enhancement was about 4 dB, with the precursor and target tones presented at 80 and 70 dB SPL, respectively, which was comparable to what we measured here. Considering the short gap (100 ms) between the precursor and target in their study, the two processes may have partially cancelled each other out, as presumably also occurred in the current study. In our experiment, the effect of the precursor reached a maximum of about 6 dB in normal-hearing listeners. The equivalent effect in CI users appeared to be smaller, when calculated in comparable units (based on overall dynamic range), and the amount of enhancement was less (or not) dependent on precursor level. This outcome, which suggests that at least one of the mechanisms underlying LCE may be different or absent in CI users, is intriguing. Further insights into the respective contributions of the two processes in CI users might be gained by applying the method proposed by Oberfeld (2007) to separate the two processes.
However, any direct comparison between the results of normal-hearing listeners and CI users must be treated with caution, given the uncertainties surrounding the mapping of acoustic sound pressure level in normal-hearing listeners to electrical current in CI users. Further insights may be gained by tracking the time course of LCE in these two populations and by separating the effects of the precursor on the target and the comparison stimulus.
Acknowledgments
This work was supported in part by NIH grant R01 DC012262. Author NW was supported by Advanced Bionics and by a Doctoral Dissertation Fellowship from the Graduate School of the University of Minnesota.
Conflict of Interest
The authors declare that they have no conflict of interest.
Contributor Information
Ningyuan Wang, Email: wang2087@umn.edu
Heather A. Kreft, Email: plumx002@umn.edu
Andrew J. Oxenham, Email: oxenham@umn.edu
References
- Arieh Y, Marks LE. Time course of loudness recalibration: implications for loudness enhancement. J Acoust Soc Am. 2003;114:1550–1556. doi: 10.1121/1.1603768.
- Arieh Y, Marks LE. Recalibrating the auditory system: a speed-accuracy analysis of intensity perception. J Exp Psychol Hum Percept Perform. 2003;29:523–536. doi: 10.1037/0096-1523.29.3.523.
- Arieh Y, Kelly K, Marks LE. Tracking the time to recovery after induced loudness reduction. J Acoust Soc Am. 2005;117:3381–3384. doi: 10.1121/1.1898103.
- Bingabr M, Espinoza-Varas B, Loizou PC. Simulating the effect of spread of excitation in cochlear implants. Hear Res. 2008;241:73–79. doi: 10.1016/j.heares.2008.04.012.
- Clark NR, Brown GJ, Jurgens T, Meddis R. A frequency-selective feedback model of auditory efferent suppression and its implications for the recognition of speech in noise. J Acoust Soc Am. 2012;132:1535–1541. doi: 10.1121/1.4742745.
- Cooper NP, Guinan JJ Jr. Separate mechanical processes underlie fast and slow effects of medial olivocochlear efferent activity. J Physiol. 2003;548:307–312. doi: 10.1113/jphysiol.2003.039081.
- de Boer J, Thornton AR, Krumbholz K. What is the role of the medial olivocochlear system in speech-in-noise processing? J Neurophysiol. 2012;107:1301–1312. doi: 10.1152/jn.00222.2011.
- Elmasian R, Galambos R. Loudness enhancement: monaural, binaural, and dichotic. J Acoust Soc Am. 1975;58:229–234. doi: 10.1121/1.380650.
- Elmasian R, Morgan R, Galambos R. Time course of loudness enhancement and intensity discrimination. J Acoust Soc Am. 1975;58:S35–S35. doi: 10.1121/1.2002086.
- Elmasian R, Galambos R, Bernheim A Jr. Loudness enhancement and decrement in four paradigms. J Acoust Soc Am. 1980;67:601–607. doi: 10.1121/1.383937.
- Galambos R, Bauer J, Picton T, Squires K, Squires N. Loudness enhancement following contralateral stimulation. J Acoust Soc Am. 1972;52:4. doi: 10.1121/1.1913224.
- Garinis AC, Glattke T, Cone BK. The MOC reflex during active listening to speech. J Speech Lang Hear Res. 2011;54:1464–1476. doi: 10.1044/1092-4388(2011/10-0223).
- Glasberg BR, Moore BC. Derivation of auditory filter shapes from notched-noise data. Hear Res. 1990;47:103–138. doi: 10.1016/0378-5955(90)90170-T.
- Guinan JJ Jr. Olivocochlear efferents: anatomy, physiology, function, and the measurement of efferent effects in humans. Ear Hear. 2006;27:589–607. doi: 10.1097/01.aud.0000240507.83072.e7.
- Guinan JJ Jr. Cochlear efferent innervation and function. Curr Opin Otolaryngol Head Neck Surg. 2010;18:447–453. doi: 10.1097/MOO.0b013e32833e05d6.
- Hong RS, Rubinstein JT. Conditioning pulse trains in cochlear implants: effects on loudness growth. Otol Neurotol. 2006;27:50–56. doi: 10.1097/01.mao.0000187045.73791.db.
- Hong RS, Rubinstein JT, Wehner D, Horn D. Dynamic range enhancement for cochlear implants. Otol Neurotol. 2003;24:590–595. doi: 10.1097/00129492-200307000-00010.
- Jennings SG, Strickland EA, Heinz MG. Precursor effects on behavioral estimates of frequency selectivity and gain in forward masking. J Acoust Soc Am. 2009;125:2172–2181. doi: 10.1121/1.3081383.
- Mapes-Riordan D, Yost WA. Loudness recalibration as a function of level. J Acoust Soc Am. 1999;106:3506–3511. doi: 10.1121/1.428203.
- Marks LE. Magnitude estimation and sensory matching. Percept Psychophys. 1988;43:511–525. doi: 10.3758/BF03207739.
- Marks LE. “Recalibrating” the auditory system: the perception of loudness. J Exp Psychol Hum Percept Perform. 1994;20:382–396. doi: 10.1037/0096-1523.20.2.382.
- Mishra SK, Lutman ME. Top-down influences of the medial olivocochlear efferent system in speech perception in noise. PLoS ONE. 2014;9:e85756. doi: 10.1371/journal.pone.0085756.
- Nieder B, Buus S, Florentine M, Scharf B. Interactions between test- and inducer-tone durations in induced loudness reduction. J Acoust Soc Am. 2003;114:2846–2855. doi: 10.1121/1.1616580.
- Oberfeld D. Loudness changes induced by a proximal sound: loudness enhancement, loudness recalibration, or both? J Acoust Soc Am. 2007;121:2137–2148. doi: 10.1121/1.2710433.
- Oberfeld D. The mid-difference hump in forward-masked intensity discrimination. J Acoust Soc Am. 2008;123:1571–1581. doi: 10.1121/1.2837284.
- Oxenham AJ, Kreft HA. Speech perception in tones and noise via cochlear implants reveals influence of spectral resolution on temporal processing. Trends Hear. 2014;18. pii: 2331216514553783.
- Plack CJ. Loudness enhancement and intensity discrimination under forward and backward masking. J Acoust Soc Am. 1996;100:1024–1030. doi: 10.1121/1.416288.
- Scharf B, Buus S, Nieder B. Loudness enhancement: induced loudness reduction in disguise? (L) J Acoust Soc Am. 2002;112:807–810. doi: 10.1121/1.1500755.
- Wang N, Kreft H, Oxenham AJ. Vowel enhancement effects in cochlear-implant users. J Acoust Soc Am. 2012;131:EL421–EL426. doi: 10.1121/1.4710838.
- Zeng FG. Loudness growth in forward masking: relation to intensity discrimination. J Acoust Soc Am. 1994;96:2127–2132. doi: 10.1121/1.410154.
- Zwislocki JJ, Sokolich WG. Loudness enhancement of a tone burst by a preceding tone burst. Percept Psychophys. 1974;16:87–90. doi: 10.3758/BF03203256.