Abstract
While bilateral cochlear implants (CIs) provide some binaural benefits, these benefits are limited compared to those observed in normal-hearing (NH) listeners. The large frequency-to-electrode allocation bandwidths (BWs) in CIs compared to auditory filter BWs in NH listeners increase the interaural fluctuation rate available for binaural unmasking, which may limit binaural benefits. The purpose of this work was to investigate the effect of interaural fluctuation rate on correlation change discrimination and binaural masking-level differences in NH listeners presented with a CI simulation using a pulsed-sine vocoder. In experiment 1, correlation-change just-noticeable differences (JNDs) and tone-in-noise thresholds were measured for narrowband noises with different BWs and center frequencies (CFs). The results suggest that the BW, CF, and/or interaural fluctuation rate are important factors for correlation change discrimination. In experiment 2, the interaural fluctuation rate was systematically varied and dissociated from changes in BW and CF by using a pulsed-sine vocoder. Results indicated that the interaural fluctuation rate did not affect correlation change JNDs for correlated reference noises; however, slow interaural fluctuations increased correlation change JNDs for uncorrelated reference noises. In experiment 3, the BW, CF, and vocoder pulse rate were varied while interaural fluctuation rate was held constant. JNDs increased for increasing BW and decreased for increasing CF. In summary, relatively fast interaural fluctuation rates are not detrimental for detecting changes in interaural correlation. Thus, limiting factors to binaural benefits in CI listeners could be a result of other temporal and/or spectral deficiencies from electrical stimulation.
Keywords: cochlear implants, interaural fluctuation rate, correlation change discrimination
INTRODUCTION
Cochlear implants (CIs) in many cases restore high levels of speech understanding to people with severe-to-profound sensorineural hearing loss. The success of the CI has motivated bilateral implantation, thus providing auditory information to both ears in an attempt to produce binaural benefits, specifically better sound localization and speech understanding in noise. These binaural benefits are mostly derived from temporal envelope information because the present generation of CIs primarily encodes the temporal envelope. CI speech processing consists of bandpass filtering incoming sounds into a contiguous set of channels, extracting the envelope of each channel, and using the envelopes to modulate high-rate electrical pulse trains at each electrode. There are several factors that limit the binaural benefits achieved by CI users, including the lack of time synchronization between processors (van Hoesel 2004; Litovsky et al. 2012), placement of the electrode array in the cochlea (Poon et al. 2009; Kan et al. 2013), age of onset of deafness (Litovsky et al. 2010), and lack of temporal fine-structure encoded in the signal (van Hoesel 2007).
Normal-hearing (NH) listeners are highly sensitive to interaural difference cues that produce binaural benefits. A majority of the binaural benefits result from low-frequency (<1,500 Hz) interaural time differences (ITDs); however, interaural level differences (ILDs) contribute to binaural benefits, particularly at higher frequencies (Bronkhorst and Plomp 1988; Wightman and Kistler 1992; Macpherson and Middlebrooks 2002). Bilateral CI users do not have access to low-frequency fine-structure ITDs because only envelope information is encoded. Therefore, bilateral CI users must rely on envelope ITDs and ILDs for binaural benefits, which is thought to be a major reason why CI users do not demonstrate larger binaural unmasking of speech when presented in spatially separated noise (van Hoesel et al. 2008; Loizou et al. 2009; Culling et al. 2012).
Another interaural cue, called the interaural correlation, is thought to be related to the binaural unmasking of speech (Culling et al. 2004; Lavandier and Culling 2010; Culling et al. 2012). The interaural correlation of a signal is the statistical similarity of the signals between the two ears, and is related to the magnitude of the time-varying ITD and ILD fluctuations (Goupell 2010). NH listeners can detect small changes in the interaural correlation of sounds, particularly at 500 Hz where fine-structure cues are available (e.g., Culling et al. 2001), and also in the envelopes of high-frequency stimuli (van de Par and Kohlrausch 1997; Goupell 2012). Many bilateral CI users demonstrate binaural masking level differences (BMLDs), or lower tone-in-noise thresholds for dichotic versus diotic stimuli if the experiments are performed with single-electrode pairs using time-synchronized research processors (Long et al. 2006; Van Deun et al. 2009, 2011; Lu et al. 2010, 2011). Since lower thresholds for the dichotic conditions can only be achieved by detecting differences between the ears, this is evidence that bilateral CI users are sensitive to changes in interaural envelope correlation. The link between BMLDs and interaural correlation sensitivity has been previously established in NH listeners (Koehnke et al. 1986). However, the envelope encoding in CIs has many differences compared to envelope encoding in NH listeners, particularly for the rate of envelope fluctuations.
Under some conditions, the binaural system is less effective at processing rapid changes in ITDs, ILDs, and interaural correlation than it is at processing slower changes (e.g., Grantham and Wightman 1978; Grantham 1984), which has been called "binaural sluggishness." Several studies have examined the possibility that high-rate interaural fluctuations in decorrelated noises reduce sensitivity to changes in interaural correlation in NH listeners (e.g., Zurek and Durlach 1987; Buss and Hall 2010; van de Par et al. 2012; Yasin and Henning 2012). In normal acoustic hearing, the fluctuation rate of the ITDs and ILDs in decorrelated signals is determined by the fine-structure and envelope modulations inherent to the stimulus and the bandwidth (BW) of the stimulus (determined by the physical BW of the stimulus if it is smaller than the auditory filter BW, or the BW of the auditory filter where a portion of the stimulus is being passed). For a Gaussian noise applied to an ideal bandpass filter, the expected number of maxima of the envelope in 1 s is approximately 64 % of the BW (Rice 1954). The time-varying ITD can be calculated from the instantaneous changes in the fine-structure of the noises and the time-varying ILD can be calculated from the instantaneous changes in the envelopes of the noises (Goupell and Hartmann 2006). It is also clear from a plot of the instantaneous ITD and ILD as a function of time (Fig. 1) that the average fluctuation rates of the ITDs and ILDs increase with BW and are similar to each other.
In NH, the ITD and ILD fluctuation rates are limited by both the modulations inherent to the acoustic stimulus and the BW of the auditory filter(s) through which the stimulus passes. This is unlike what occurs in bilateral CI speech processing strategies, where the ITD fluctuations are discarded leaving only the ILD fluctuations from the envelopes, and the ILD fluctuation rate is limited by (1) the low-pass filter on the envelope extraction and (2) the analysis-channel BW if it is narrower than the BW of the low-pass filter. The cutoff frequency on this low-pass filter is usually set between 200 and 400 Hz (Loizou 2006). The BW of each analysis channel in a CI is often larger than a typical auditory-filter BW because the present generation of multi-electrode arrays has a limited number of channels in which to encode the entire frequency range necessary to present speech information (see Table 1). Where CI analysis-channel BWs are larger than typical auditory-filter BWs, the rate of envelope fluctuations would be higher for CI listeners than for NH listeners, particularly at low center frequencies (CFs). Another major difference in CI signal encoding compared to NH listeners is that current spread will encode the same envelope fluctuations coherently over a large frequency range. In NH, the systematic relationship between cochlear filter BWs and CF introduces a dependence of interaural fluctuation rate on frequency. Therefore, there are fundamental differences in the interaural information presented to NH and CI listeners. It is likely that CI listeners typically need to process faster ILD fluctuation rates than NH listeners at low frequencies, which may limit the usefulness of interaural correlation change discrimination in producing binaural unmasking in bilateral CI users.
TABLE 1.
Electrode | Lower cutoff (Hz) | Upper cutoff (Hz) | CF (Hz) | Analysis BW (Hz) | Auditory BW (Hz) | Difference (Hz) |
---|---|---|---|---|---|---|
22 | 188 | 313 | 242.6 | 125 | 50.9 | 74.1 |
21 | 313 | 438 | 370.3 | 125 | 64.7 | 60.3 |
20 | 438 | 563 | 496.6 | 125 | 78.3 | 46.7 |
19 | 563 | 688 | 622.4 | 125 | 91.9 | 33.1 |
18 | 688 | 813 | 747.9 | 125 | 105.4 | 19.6 |
17 | 813 | 938 | 873.3 | 125 | 119.0 | 6.0 |
16 | 938 | 1,063 | 998.5 | 125 | 132.5 | −7.5 |
15 | 1,063 | 1,188 | 1,123.8 | 125 | 146.0 | −21.0 |
14 | 1,188 | 1,313 | 1,248.9 | 125 | 159.5 | −34.5 |
13 | 1,313 | 1,563 | 1,432.6 | 250 | 179.3 | 70.7 |
12 | 1,563 | 1,813 | 1,683.4 | 250 | 206.4 | 43.6 |
11 | 1,813 | 2,063 | 1,934.0 | 250 | 233.5 | 16.5 |
10 | 2,063 | 2,313 | 2,184.4 | 250 | 260.5 | −10.5 |
9 | 2,313 | 2,688 | 2,493.5 | 375 | 293.8 | 81.2 |
8 | 2,688 | 3,063 | 2,869.4 | 375 | 334.4 | 40.6 |
7 | 3,063 | 3,563 | 3,303.6 | 500 | 381.3 | 118.7 |
6 | 3,563 | 4,063 | 3,804.8 | 500 | 435.4 | 64.6 |
5 | 4,063 | 4,688 | 4,364.3 | 625 | 495.8 | 129.2 |
4 | 4,688 | 5,313 | 4,990.7 | 625 | 563.4 | 61.6 |
3 | 5,313 | 6,063 | 5,675.6 | 750 | 637.3 | 112.7 |
2 | 6,063 | 6,938 | 6,485.8 | 875 | 724.8 | 150.2 |
1 | 6,938 | 7,938 | 7,421.2 | 1,000 | 825.7 | 174.3 |
The lower and upper frequency boundaries, geometric CF, and analysis-channel BW are reported for each channel/electrode. In addition, the critical BW for an auditory filter at the analysis-channel CF following Moore and Glasberg (1983) is reported. Lastly, the difference between the analysis-channel and auditory-filter BWs is reported in the rightmost column.
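The derived columns of Table 1 can be recomputed from the channel edges alone. The sketch below is illustrative only: the function names are hypothetical, and the auditory-filter formula ERB = 24.7(4.37F + 1), with F in kHz, is an assumption chosen because it appears to reproduce the tabulated values; the text itself only cites Moore and Glasberg (1983).

```python
import math

# Channel edges in Hz for electrodes 22 down to 1, copied from Table 1.
EDGES_HZ = [188, 313, 438, 563, 688, 813, 938, 1063, 1188, 1313, 1563,
            1813, 2063, 2313, 2688, 3063, 3563, 4063, 4688, 5313, 6063,
            6938, 7938]


def erb_hz(cf_hz):
    """Auditory-filter BW in Hz. ASSUMPTION: ERB = 24.7 * (4.37 * F_kHz + 1),
    which seems to match the tabulated values; not stated in the text."""
    return 24.7 * (4.37 * cf_hz / 1000.0 + 1.0)


def table_rows(edges=EDGES_HZ):
    """Return (electrode, lower, upper, CF, analysis BW, auditory BW, difference)."""
    rows = []
    for i, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        cf = math.sqrt(lo * hi)          # geometric center frequency
        analysis_bw = hi - lo            # analysis-channel bandwidth
        auditory_bw = erb_hz(cf)
        rows.append((22 - i, lo, hi, round(cf, 1), analysis_bw,
                     round(auditory_bw, 1),
                     round(analysis_bw - auditory_bw, 1)))
    return rows


if __name__ == "__main__":
    for row in table_rows():
        print(row)
```

Under this assumption, the analysis channel is narrower than the auditory filter for only four of the 22 electrodes (16, 15, 14, and 10), which is the sense in which the analysis BW is "often" larger.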
The purpose of this work was to determine the effect of the rate of fluctuations in detecting interaural correlation changes in NH listeners presented CI simulations. Utilizing envelope-based processing similar to the processing in a CI, we investigated the contribution of individual stimulus factors to correlation change discrimination (namely, dissociating BW from interaural fluctuation rate). CI processing was simulated using a Gaussian-enveloped pulse train with a sinusoidal carrier that sampled the temporal envelope of narrowband noises (i.e., pulsed-sine vocoder) because it allows for independent control of interaural fluctuation rate, BW, and CF. In experiment 1, correlation change sensitivity was measured as a function of BW and CF using unprocessed/non-vocoded noises to determine if the effects of BW were consistent across CF. Then, using pulsed-sine vocoding, the effect of interaural fluctuation rate was investigated for a fixed BW and CF (experiment 2) and the effects of BW and CF were investigated for fixed interaural fluctuation rate (experiment 3). If high-rate interaural fluctuations are detrimental to binaural processing, sensitivity should decrease with increasing interaural fluctuation rate. Such a result could have consequences for binaural hearing in bilateral CI users; in particular, it may be a limiting factor for binaural unmasking of speech in spatially separated noise.
EXPERIMENT I: CORRELATION CHANGE DISCRIMINATION AS A FUNCTION OF BW, CF, AND TASK TYPE
This experiment presents data from a larger experiment, part of which was published in Goupell (2012). The methodology was the same for both studies.
Listeners and equipment
Nine listeners (three females and six males), 20 to 30 years old, participated in this experiment. They had hearing thresholds at octave frequencies between 250 and 8,000 Hz that were within 20 dB of typical thresholds. They also had less than 10-dB interaural asymmetry in thresholds at any frequency. Nine listeners participated in the correlation change discrimination tasks, while only five listeners participated in the BMLD tasks. The first author was one of the listeners and participated in the correlation change and BMLD tasks. The listeners consented to the testing and all procedures were approved by the University of Wisconsin's Institutional Review Board.
The stimuli were delivered by a personal computer to a Tucker–Davis System 3 real-time processor (RP2.1) and headphone driver (HB7), and over a pair of headphones (Sennheiser HD580). Listeners performed the experiments in a double-walled sound attenuating booth (IAC).
Stimuli
Stimuli were dual-channel narrowband Gaussian white noise pairs; a sine tone was embedded in the noise for the BMLD tasks. Both the noise and tone were 500 ms in duration and were temporally shaped by a Tukey window with a rise–fall time of 10 ms. The CF of the stimuli was 500, 2,000, 4,000, or 8,000 Hz. The BW of the stimuli was 10 Hz, 50 Hz, or 1 equivalent rectangular BW (ERB) according to Moore and Glasberg (1983) (BW = 78, 240, 456, or 888 Hz for CF = 500, 2,000, 4,000, and 8,000 Hz, respectively). The stimuli were arithmetically centered on the CF. The stimuli had an A-weighted sound pressure level of 65 dB, a 48.848-kHz sampling rate, and 24-bit resolution.
The stimuli for the correlation change discrimination tasks had a varied interaural correlation, ρ, and were compared against one of two reference correlations (ρref = 1 or 0). The value of ρ was precisely controlled by using the Gram-Schmidt Orthogonalization process (Culling et al. 2001). The stimuli for the BMLD tasks had a tone-in-noise configuration (NoSo and NoSπ) with the tone at the CF of the noise. In total, 48 conditions were tested in this experiment (4 CF × 3 BW × 4 configurations).
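As an illustration of how a noise pair with an exactly specified interaural correlation can be built, the following sketch applies one Gram-Schmidt step to two independent noises and then mixes them. It is a simplified sketch, not the authors' stimulus code: the noises are broadband Gaussian samples (no bandpass filtering or windowing), and all function names are hypothetical.

```python
import math
import random


def dot(x, y):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))


def correlated_pair(n_samples, rho, rng):
    """Return (left, right) whose normalized zero-lag correlation equals rho."""
    left = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    other = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    # Gram-Schmidt step: remove from `other` its projection onto `left`.
    proj = dot(other, left) / dot(left, left)
    orth = [o - proj * l for o, l in zip(other, left)]
    # Give the orthogonal noise the same energy as `left`.
    gain = math.sqrt(dot(left, left) / dot(orth, orth))
    orth = [gain * o for o in orth]
    # Mix so that the right-ear signal has correlation rho with the left.
    mix = math.sqrt(1.0 - rho * rho)
    right = [rho * l + mix * o for l, o in zip(left, orth)]
    return left, right


def interaural_correlation(left, right):
    """Normalized cross-correlation at zero lag."""
    return dot(left, right) / math.sqrt(dot(left, left) * dot(right, right))


if __name__ == "__main__":
    rng = random.Random(0)
    for rho in (1.0, 0.95, 0.5, 0.0):
        l, r = correlated_pair(4000, rho, rng)
        print(rho, interaural_correlation(l, r))
```

Because the second noise is made exactly orthogonal to the first before mixing, the sample correlation of each token equals the requested ρ to within floating-point error, rather than only on average. For narrowband stimuli, the orthogonalization would presumably need to be applied after filtering.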
The stimuli were generated offline before the experiments. Reproducible noise tokens were used in order to have a set of stimuli and methods that was comparable to those used in subsequent experiments using pulsed-sine vocoded stimuli. It was necessary to generate the pulsed-sine vocoded stimuli offline because the stimulus generation time was on the order of seconds. Therefore, there were 25 different noise tokens for each condition and each value of ρ or SNR used in the adaptive procedure. Stimuli were randomly drawn from the 25 tokens without replacement on each trial.
Procedure
A four-interval, two-alternative forced-choice procedure was used. The listener initiated each trial by pressing a button. Four intervals were played, grouped into two sequential pairs. Each of the four intervals contained a different noise token. The inter-interval duration was 250 ms within the first pair (first and second intervals) and within the second pair (third and fourth intervals), and 500 ms between the two pairs (second and third intervals). The first and fourth intervals always contained reference stimuli. Of the second and third intervals, one contained a reference stimulus and the other contained the target stimulus; the interval containing the target was chosen randomly. The listener was instructed to choose the pair perceived to be different (i.e., the pair that contained the target interval). No feedback was provided during the testing.
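The trial structure described above can be summarized in a few lines. This is a minimal sketch with hypothetical names; it assumes that the four token indices on a trial are drawn without replacement from a pool of 25, which is one reading of the token selection described in the Stimuli section.

```python
import random

# Silent gaps (ms) after intervals 1, 2, and 3.
GAPS_MS = (250, 500, 250)


def make_trial(rng, n_tokens=25):
    """Build one four-interval trial.

    ASSUMPTION: token indices are sampled without replacement across the
    four intervals of a trial.
    """
    tokens = rng.sample(range(n_tokens), 4)
    target_interval = rng.choice((2, 3))      # intervals 1 and 4 are references
    kinds = ["target" if i == target_interval else "reference"
             for i in (1, 2, 3, 4)]
    correct_pair = 1 if target_interval == 2 else 2
    return {"tokens": tokens, "kinds": kinds,
            "gaps_ms": GAPS_MS, "correct_pair": correct_pair}


if __name__ == "__main__":
    print(make_trial(random.Random(1)))
```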
A two-down, one-up adaptive staircase procedure was used to measure just-noticeable differences (JNDs) and tone-in-noise thresholds. For correlation change discrimination, the initial target value was perfectly uncorrelated (ρ=0) for a correlated reference (ρref = 1), and perfectly correlated (ρ=1) for an uncorrelated reference (ρref = 0). The step size was until the first reversal, 0.2 until the second, 0.1 until the third, and 0.05 for the rest of the staircase.1 For the correlation change discrimination tasks, if listeners could not detect the change in correlation (i.e., had four incorrect answers at the easiest possible testing level), the run was stopped and the JND was recorded as "Not Determinable." This translated to a numerical value of Δα=1.1 for that run when the data were analyzed. For the tone in noise configurations, the level of the sine tone began at a +10-dB SNR. The step size was 8 dB until the first reversal, 4 dB until the second, and 2 dB for the rest of the staircase. Ten reversals were measured for each staircase. The last six reversals were averaged to calculate the JND or threshold for a run. Three runs were performed per condition and the average JND or threshold over the three runs was recorded. In conditions that had a large standard deviation over the three runs (33 % of the JND for the correlation change discrimination tasks and 3 dB for the tone in noise tasks), two extra runs were performed and all five were used to calculate the average JND or threshold.
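The tone-in-noise staircase can be sketched as follows, using the step sizes given above (8 dB until the first reversal, 4 dB until the second, 2 dB thereafter), ten reversals per run, and the mean of the last six reversals as the threshold. The function name and the deterministic listener in the usage example are hypothetical, and the sketch omits the "Not Determinable" stopping rule and the extra-run criterion.

```python
def two_down_one_up(respond, start, steps, n_reversals=10, n_average=6):
    """Run a two-down, one-up staircase.

    respond(level) returns True for a correct response at `level`.
    steps[k] is the step size in force after k reversals; the last entry
    is used for the rest of the run.
    """
    level = start
    n_correct = 0
    direction = 0                 # -1: level is decreasing, +1: increasing
    reversals = []
    while len(reversals) < n_reversals:
        if respond(level):
            n_correct += 1
            if n_correct < 2:
                continue          # need two consecutive correct to step down
            n_correct = 0
            move = -1
        else:
            n_correct = 0
            move = +1             # a single incorrect response steps up
        if direction != 0 and move != direction:
            reversals.append(level)
        direction = move
        step = steps[min(len(reversals), len(steps) - 1)]
        level += move * step
    threshold = sum(reversals[-n_average:]) / n_average
    return threshold, reversals


if __name__ == "__main__":
    # Hypothetical listener who is always correct at or above -20 dB SNR.
    threshold, reversals = two_down_one_up(
        lambda snr_db: snr_db >= -20.0, start=10.0, steps=(8.0, 4.0, 2.0))
    print(threshold, reversals)
```

With a real listener the responses are probabilistic, and a two-down, one-up rule is commonly taken to converge near the 70.7 %-correct point of the psychometric function.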
Training that included correct answer feedback was performed before data collection using 500-Hz CF and 10-Hz BW stimuli, as outlined in Goupell (2012). During data collection, conditions with different BW and CF were randomized over listeners, and at least three runs were completed before moving on to a new condition. The task type was fixed until all the data were taken for the different BWs and CFs. The order of testing was ρref = 1, ρref = 0, NoSo, and NoSπ.
Results
The average correlation change discrimination JNDs are shown in Figure 2. Note that the 10-Hz BW data have been previously reported in Goupell (2012). A three-way repeated-measures analysis of variance (RM ANOVA) was performed on the data with factors BW, CF, and reference (ρref = 1 and 0). The data show that JNDs decreased with increasing BW [F(2,48) = 15.8, p = 0.0002, ηp2 = 0.66]. Subsequent Tukey Honestly Significant Difference post-hoc tests conducted on the JNDs showed that the 10-Hz BW was higher than the 50-Hz BW (p = 0.006); the 10-Hz BW was higher than the 1-ERB (p < 0.0001); and the 50-Hz BW was higher than the 1-ERB (p = 0.001). Regarding CFs, JNDs increased for larger CF [F(3,48) = 34.2, p < 0.0001, ηp2 = 0.81]. Tukey post-hoc tests showed that the JNDs were significantly higher for each level of CF (p < 0.0001), except that the JNDs for 2,000 and 4,000 Hz were not different (p = 0.32).
Regarding reference correlation, JNDs were higher for the ρref = 0 conditions compared to the ρref = 1 conditions [F(1,48) = 12.9, p = 0.007, ηp2 = 0.62]. The interaction CF × reference was significant [F(3,48) = 4.31, p = 0.015, ηp2 = 0.35], showing a larger change for the ρref = 1 conditions as a function of CF than for the ρref = 0 conditions. The interactions BW × CF, BW × reference, and BW × CF × reference were not significant (p > 0.05).
The average NoSo and NoSπ results are shown in Figure 3. A three-way RM ANOVA was performed with factors BW, CF, and configuration (NoSo and NoSπ). There was a significant effect of configuration [F(1,48) = 60.7, p = 0.0015, ηp2 = 0.79]; therefore two separate two-way RM ANOVAs were performed with factors BW and CF.
For the NoSo data, thresholds decreased with increasing BW [F(2,24) = 9.91, p = 0.007, ηp2 = 0.75]. Tukey post-hoc tests showed that thresholds for the 10-Hz BW were not different from thresholds for the 50-Hz BW (p = 0.076), but were higher than thresholds for the 1-ERB (p < 0.0001). Thresholds for the 50-Hz BW were higher than thresholds for the 1-ERB (p = 0.017). Thresholds decreased with increasing CF until 4 kHz then increased again [F(3,24) = 11.7, p = 0.0007, ηp2 = 0.71]. Tukey post-hoc tests showed that thresholds for the 500-Hz CF were higher than thresholds for the 2,000-Hz CF (p = 0.020) and thresholds for the 4,000-Hz CF (p < 0.0001), but were not different from thresholds for the 8,000-Hz CF (p = 0.27). Thresholds for the 2,000-Hz CF were higher than thresholds for the 4,000-Hz CF (p = 0.039), but were not different from thresholds for the 8,000-Hz CF (p = 0.57). Thresholds for the 4,000-Hz CF were lower than thresholds for the 8,000-Hz CF (p = 0.002). The BW × CF interaction was not significant [F(6,24) = 0.95, p = 0.48, ηp2 = 0.19].
For the NoSπ data, there was a significant effect of BW [F(2,24) = 5.55, p = 0.031, ηp2 = 0.58]. Tukey post-hoc tests showed that thresholds for the 10-Hz BW were not different from thresholds for the 50-Hz BW (p = 0.97) and the 1-ERB (p = 0.68). Thresholds for the 50-Hz BW were lower than thresholds for the 1-ERB (p = 0.040). Thresholds increased with increasing CF [F(3,24) = 24.8, p < 0.0001, ηp2 = 0.86]. Tukey post-hoc tests showed that thresholds for the 500-Hz CF were lower than thresholds for the 2,000-, 4,000-, and 8,000-Hz CFs (p < 0.0001). Thresholds for the 2,000-Hz CF were not different from thresholds for the 4,000-Hz CF (p = 0.24), but were lower than thresholds for the 8,000-Hz CF (p < 0.0001). Thresholds for the 4,000-Hz CF were lower than thresholds for the 8,000-Hz CF (p = 0.0001). The BW × CF interaction was not significant [F(6,24) = 1.30, p = 0.30, ηp2 = 0.25].
Table 2 shows a comparison between the ρref = 1 JNDs and NoSπ thresholds for the five listeners that participated in the BMLD conditions. The value of ρ corresponding to an NoSπ threshold was found by first averaging ρ over the 25 tokens at each SNR used in the experiment and then linearly interpolating between the two tested SNRs nearest the threshold. The values of ρ are almost identical for the 500-Hz CF conditions, and show slightly larger differences at the higher CFs where listeners needed larger changes in correlation to perform the discrimination task.
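The conversion from an NoSπ threshold to an interaural correlation can be illustrated with a short sketch. Two things are shown: the linear interpolation between neighboring SNRs described above, and, for comparison, the expected correlation ρ = (1 − S/N)/(1 + S/N) for a tone added in antiphase to a diotic noise when tone and noise are uncorrelated. The latter is a standard expected-value relation used here as an assumption; it is not the token-averaged calculation used for Table 2, and the function names and example values are hypothetical.

```python
def rho_expected(snr_db):
    """Expected interaural correlation of an NoSpi stimulus.

    ASSUMPTION: tone and noise are uncorrelated, so with left = n + s and
    right = n - s, rho = (N - S) / (N + S) for noise power N, tone power S.
    """
    snr = 10.0 ** (snr_db / 10.0)     # tone-to-noise power ratio
    return (1.0 - snr) / (1.0 + snr)


def rho_at_threshold(threshold_db, snr_lo_db, rho_lo, snr_hi_db, rho_hi):
    """Linearly interpolate rho between the two tested SNRs that bracket
    the threshold, given the token-averaged rho at each of them."""
    frac = (threshold_db - snr_lo_db) / (snr_hi_db - snr_lo_db)
    return rho_lo + frac * (rho_hi - rho_lo)


if __name__ == "__main__":
    print(rho_expected(-22.2))
    # Hypothetical token-averaged values at two neighboring SNRs.
    print(rho_at_threshold(-21.0, -22.0, 0.990, -20.0, 0.980))
```

For the wider BWs the expected-value relation lands close to the tabulated values (for example, about 0.988 at −22.2 dB). For the 10-Hz BW conditions it can differ by a few thousandths, which is presumably why token-averaged values were used.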
TABLE 2.
CF (Hz) | BW (Hz) | NoSπ threshold (dB) | NoSπ threshold (ρ) | ρref = 1 JND (ρ) | Difference (ρ) |
---|---|---|---|---|---|
500 | 10 | −28.7 | 0.998 | 0.997 | 0.001 |
500 | 50 | −24.8 | 0.993 | 0.996 | −0.003 |
500 | 78 | −25.3 | 0.994 | 0.996 | −0.002 |
2,000 | 10 | −19.8 | 0.982 | 0.957 | 0.025 |
2,000 | 50 | −22.2 | 0.988 | 0.988 | 0.000 |
2,000 | 240 | −19.1 | 0.978 | 0.982 | −0.004 |
4,000 | 10 | −17.9 | 0.971 | 0.947 | 0.024 |
4,000 | 50 | −20.4 | 0.982 | 0.957 | 0.025 |
4,000 | 456 | −17.4 | 0.963 | 0.979 | −0.016 |
8,000 | 10 | −14.8 | 0.938 | 0.852 | 0.087 |
8,000 | 50 | −14.6 | 0.932 | 0.923 | 0.009 |
8,000 | 888 | −11.8 | 0.875 | 0.910 | −0.035 |
Discussion
The purpose of this experiment was to determine the effect of rate of interaural fluctuations on correlation change discrimination. If the rapid interaural fluctuations are the only factor that affects discrimination of changes in interaural correlation, then correlation change JNDs and NoSπ thresholds should increase with increasing BW. Furthermore, JNDs and thresholds should increase more as a function of CF for the 1-ERB stimuli (i.e., there should be a significant interaction of BW × CF for the NoSπ thresholds). The results of this experiment (see Fig. 2) show that correlation change discrimination JNDs decreased or did not change for increasing BW for both correlated and uncorrelated references for all CFs. While there was a trend for increased NoSπ thresholds for the 1-ERB stimuli in Figure 3, the interaction between BW × CF was not significant. Therefore this experiment provided no clear evidence that rapid fluctuations are detrimental for discrimination of changes in interaural correlation when the stimuli remain in a single auditory filter. However, this interpretation is confounded by the fact that interaural fluctuation rate and BW co-vary for noise stimuli. Furthermore, there was clearly an effect of CF (see Figs. 2 and 3). Therefore, the role of interaural fluctuation rate alone was investigated in Experiment 2 with stimuli that do not contain the BW and CF confounds.
To relate the results of the present study to previous studies, Gabriel and Colburn (1981) investigated BW effects for correlation change discrimination and found no effect of BW for CF = 500 Hz and ρref = 1 stimuli when the BW was less than 115 Hz for one listener and 10 Hz for another listener. The average change in correlation needed for discrimination was approximately Δρ=0.004 for a BW less than 100 Hz. The results of the current work in Figure 2 show no effect of BW up to 78 Hz at a CF of 500 Hz for a group of nine listeners. The average change in correlation needed for discrimination was Δρ=0.017. Gabriel and Colburn also reported that for CF = 500 Hz and ρref = 0, JNDs decreased with increasing BW. The average value needed to detect a change in correlation in that study was between Δρ=0.3 and 0.7 for a BW less than 100 Hz. That was a much greater difference than the ρref = 1 conditions, which is consistent with other studies (e.g., Boehnke et al. 2002). In the present study, the average change in correlation needed to discriminate a partially correlated target from uncorrelated references was Δρ=0.72. Therefore, our results are consistent with the results of Gabriel and Colburn (1981).
The NoSo and NoSπ conditions tested in this study reproduced conditions tested by van de Par and Kohlrausch (1999) for 400-ms, 70-dB SPL stimuli with three listeners. For the CF = 500, 2,000, and 4,000 Hz and BW = 10 and 50 Hz NoSo conditions, van de Par and Kohlrausch found thresholds between +2 and −3 dB SNR. Thresholds for a 10-Hz BW were consistently higher than thresholds for a 50-Hz BW. The thresholds found in this study (see Fig. 3) are slightly smaller for these conditions, between 0 and −5 dB SNR. The NoSπ data in van de Par and Kohlrausch (1999) were best at a CF of 500 Hz and worsened with increasing CF, which is consistent with the results of this study. Additionally, their NoSπ data show little effect of BW between 10 and 50 Hz, also consistent with the results of this study.
Previous research has also shown that correlation change discrimination for ρref = 1 and NoSπ detection yield the same sensitivity when NoSπ thresholds are converted to changes in correlation (Koehnke et al. 1986; Jain et al. 1991). The equivalence of ρref = 1 and NoSπ thresholds was demonstrated in the data collected from five listeners in this experiment, shown in Table 2. In a study demonstrating the contrary, Bernstein and Trahiotis (1997) showed a pronounced discrepancy between ρref = 1 JNDs and NoSπ thresholds for wideband (100–3,000 Hz) and narrowband (450–550 Hz) noises. One possible explanation for a difference is that the onset of the sine tone in Bernstein and Trahiotis (1997) was gated asynchronously to the onset of the noise, i.e., there was a temporal fringe for the NoSπ condition but not the ρref = 1 condition. However, data from Yasin and Henning (2012) showed little effect on NoSπ thresholds in the presence of a short temporal fringe. There was no temporal fringe for either ρref = 1 or NoSπ detection in the Koehnke et al. study and the present study. If there is an added benefit of listening to an asynchronous onset of the target sine tone in the NoSπ detection, this would explain the difference in the correlation change needed to achieve threshold across the correlation change and NoSπ detection tasks as measured in Bernstein and Trahiotis, and why our data are more similar to those in Koehnke et al.
EXPERIMENT II: VARIED INTERAURAL FLUCTUATION RATE FOR FIXED VOCODER BW AND VOCODER CF
Experiment 1 did not show evidence that rapid interaural fluctuations were detrimental for discrimination of changes in interaural correlation. This experiment aimed to better determine the role of interaural fluctuations by allowing the interaural fluctuation rate to vary while the CF and BW were fixed.
Methods
Gaussian-noise stimuli were generated as in experiment 1 with a 500-Hz CF and BWs of 10, 25, 50, 100, 200, and 400 Hz, and these stimuli were used to produce the pulsed-sine vocoded stimuli for this experiment. The average rate of the inherent changes in the fine-structure and envelope (and concomitantly the interaural fluctuations, see Fig. 1) was assumed to be 6.4, 16, 32, 64, 128, and 256 Hz, because the average rate of change is about 64 % of the BW (Rice 1954). Note that only the interaural fluctuation rate changed, not the overall distribution of the interaural differences (Zurek 1991). Pulsed-sine vocoded stimuli were generated by first extracting the Hilbert envelope from both channels of the unprocessed narrowband noises. These envelopes were temporally sampled with periodic Gaussian-enveloped tone pulse trains. The pulses had a 4-kHz carrier frequency and were diotic before the amplitudes between the ears were adjusted. The rate of the pulse trains was 800 pulses per second (pps), a rate that adequately samples the envelope at all modulation frequencies. The final −3-dB BW of the stimuli after vocoding was 2 kHz (irrespective of interaural fluctuation rate) and the CF was 4 kHz. To distinguish them from the BW and CF of the unprocessed stimuli, we refer to the BW and CF after vocoding as the vocoder BW (VBW) and the vocoder CF (VCF). With the VBW and pulse rate chosen for this experiment, the −3-dB pulse duration was 1/2,000 = 0.5 ms, resulting in a modulation depth of the pulse train greater than 99 %.
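A minimal sketch of the envelope sampling step is given below. It is not the authors' implementation: the Hilbert envelope is taken as an input rather than computed, the function names are hypothetical, and the Gaussian pulse width is derived by assuming that the 2-kHz VBW is the half-power (−3-dB) spectral bandwidth of a single pulse. Under that assumption, the unmodulated pulse train has a modulation depth above 99 %, consistent with the value stated above.

```python
import math


def fluctuation_rates(bws_hz, fraction=0.64):
    """Average fluctuation rate taken as about 64 % of the BW (Rice 1954)."""
    return [fraction * bw for bw in bws_hz]


def gaussian_sigma(vbw_hz):
    """Std. dev. (s) of the Gaussian pulse envelope exp(-t^2 / (2 sigma^2)).

    ASSUMPTION: vbw_hz is the half-power spectral bandwidth of one pulse.
    """
    return math.sqrt(math.log(2.0)) / (math.pi * vbw_hz)


def modulation_depth(pulse_rate, vbw_hz, n_neighbours=3):
    """(max - min) / (max + min) of the summed pulse envelopes."""
    sigma, period = gaussian_sigma(vbw_hz), 1.0 / pulse_rate

    def env_sum(t):
        return sum(math.exp(-(t - k * period) ** 2 / (2.0 * sigma ** 2))
                   for k in range(-n_neighbours, n_neighbours + 1))

    hi, lo = env_sum(0.0), env_sum(period / 2.0)
    return (hi - lo) / (hi + lo)


def pulsed_sine_vocode(envelope, fs, pulse_rate=800.0, carrier_hz=4000.0,
                       vbw_hz=2000.0):
    """Sample `envelope` (one value per audio sample) at each pulse time and
    scale a Gaussian-enveloped tone pulse by the sampled value."""
    sigma = gaussian_sigma(vbw_hz)
    half = int(4.0 * sigma * fs)          # truncate each pulse at +/- 4 sigma
    out = [0.0] * len(envelope)
    k = 0
    while True:
        centre = int(round(k * fs / pulse_rate))
        if centre >= len(envelope):
            break
        t_k = centre / fs
        amp = envelope[centre]
        for i in range(max(0, centre - half), min(len(out), centre + half + 1)):
            dt = i / fs - t_k
            out[i] += (amp * math.exp(-dt * dt / (2.0 * sigma * sigma))
                       * math.cos(2.0 * math.pi * carrier_hz * dt))
        k += 1
    return out


if __name__ == "__main__":
    print(fluctuation_rates([10, 25, 50, 100, 200, 400]))
    print(modulation_depth(800.0, 2000.0))
```

Applying `pulsed_sine_vocode` separately to the left and right envelopes with the same carrier gives two signals that differ only in pulse amplitude, so the interaural information is carried entirely by time-varying ILDs.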
Four conditions were tested: ρref = 1, ρref = 0, NoSo, and NoSπ. For the latter two conditions, only envelope modulation rates of 6.4, 32, and 256 Hz were tested on five listeners. As in experiment 1, the data were collected as part of a larger set of conditions (Goupell 2012). The methods were the same as in experiment 1, except that there was no additional training because the detection cues should be similar between experiments 1 and 2. Low-frequency masking noise to mask distortion products was not used because it did not seem advantageous to utilize fluctuating ILDs from low-level distortion products near 100 Hz compared to high-level acoustic energy near 4 kHz.
Results
Figure 4A shows the correlation change discrimination JNDs for the pulse-train vocoded stimuli and Figure 4B shows the JNDs normalized to the JND at the 256-Hz rate. The normalized JNDs in Figure 4B confirm the trends in Figure 4A. Again, note that all of the 10-Hz BW data have been previously reported in Goupell (2012). A two-way RM ANOVA was performed with factors of interaural fluctuation rate and reference correlation. Correlation change JNDs for ρref = 1 were lower than those for ρref = 0 [reference: F(1,24) = 19.0, p = 0.0024, ηp2 = 0.70] and there was a larger difference for lower average interaural fluctuation rates [rate × reference: F(5,24) = 7.5, p < 0.0001, ηp2 = 0.48]. Therefore, we analyzed the ρref = 1 and 0 data in separate one-way RM ANOVAs. There was no effect of rate for ρref = 1 [F(5,12) = 0.52, p = 0.76, ηp2 = 0.06]. Lower rates had higher JNDs than higher rates for ρref = 0 [F(5,12) = 6.45, p = 0.0002, ηp2 = 0.45]. Tukey post-hoc tests for ρref = 0 showed that there was no significant difference between rates of 6.4 and both 16 and 32 Hz (p > 0.05 for both), but JNDs for 6.4 Hz were significantly higher than JNDs for 64-, 128-, and 256-Hz rates (p = 0.033, 0.019, and 0.024, respectively). There were no significant differences for any of the combinations of rates above 6.4 Hz (p > 0.05).
Figure 5 shows the average NoSo and NoSπ thresholds for the pulsed-sine vocoded stimuli. NoSπ thresholds were lower than NoSo thresholds [F(1,12) = 74.0, p = 0.001, ηp2 = 0.95]. There was no effect of envelope modulation rate [F(2,12) = 1.57, p = 0.27, ηp2 = 0.28] and the rate × configuration interaction was not significant [F(2,12) = 2.76, p = 0.13, ηp2 = 0.41].
As in experiment 1, the amount of correlation needed to achieve threshold was compared between the ρref = 1 and NoSπ conditions. Table 3 shows that the average NoSπ thresholds converted to ρ yielded values within Δρ=0.006 of the correlation change JNDs.
TABLE 3.
Env. mod. rate (Hz) | NoSπ threshold (dB) | NoSπ threshold (ρ) | ρref = 1 JND (ρ) | Difference (ρ) |
---|---|---|---|---|
6.4 | −22.2 | 0.989 | 0.983 | 0.006 |
32 | −20.6 | 0.985 | 0.982 | 0.003 |
256 | −20.9 | 0.986 | 0.980 | 0.006 |
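The conversion from an NoSπ threshold in dB to an interaural correlation can be sketched with the standard relation for a tone added out of phase to a diotic noise, ρ = (1 − SNR)/(1 + SNR), where SNR is the signal-to-noise power ratio. Using this relation is an assumption about how the ρ column of Table 3 was obtained, and the function name `nospi_db_to_rho` is hypothetical. Applied to the averaged dB thresholds, it gives values close to, but not identical with, the tabulated ρ values, so the conversion used for the table may differ in detail.

```python
def nospi_db_to_rho(threshold_db):
    """Interaural correlation of an NoSpi stimulus at a given SNR in dB.

    Assumes left ear = noise + tone and right ear = noise - tone, with the
    tone and noise uncorrelated, so that
    rho = (N - S) / (N + S) = (1 - snr) / (1 + snr).
    """
    snr = 10.0 ** (threshold_db / 10.0)  # signal-to-noise power ratio S/N
    return (1.0 - snr) / (1.0 + snr)


if __name__ == "__main__":
    # Average NoSpi thresholds (dB) listed in Table 3, by envelope modulation rate.
    for rate_hz, thr_db in [(6.4, -22.2), (32, -20.6), (256, -20.9)]:
        rho = nospi_db_to_rho(thr_db)
        print(f"{rate_hz:>5} Hz: {thr_db} dB -> rho = {rho:.3f}")
```

Because ρ approaches 1 as the SNR falls, small differences in dB near threshold correspond to very small differences in ρ, which is why the comparison in Table 3 is reported to three decimal places.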
Discussion
The pulsed-sine vocoding of narrowband noises in this experiment replaced any informative fine-structure information and encoded only the envelope information. Hence, changes in correlation for these stimuli were represented solely as dynamic ILDs, which would be the interaural information available to bilateral CI users. The average JND for the pulsed-sine vocoded stimuli was worse than the best JNDs for the unprocessed stimuli in experiment 1 at a 500-Hz CF. However, the JNDs for the pulsed-sine vocoded stimuli, which had a 4-kHz VCF, were better than the JNDs for the unprocessed stimuli at a 4-kHz CF. Although the fine-structure information (hence the dynamic IPDs) remained in the unprocessed (non-vocoded) 4-kHz CF stimuli, the auditory system likely cannot utilize the IPDs at high CFs because of the lack of neural phase locking, and must rely on the ILDs from the envelopes. If JNDs only reflect the availability of the interaural cues, the JNDs for the pulsed-sine vocoded stimuli should be at least as good as the JNDs for the unprocessed stimuli. The JNDs for the pulsed-sine vocoded stimuli are better than the JNDs from the unprocessed stimuli at 4-kHz CF and the VBW for the pulsed-sine vocoded stimuli is at least four times greater than the BW of the unprocessed stimuli. Therefore, there seems to be a benefit for detecting binaural cues when they are encoded coherently across a number of auditory filters. Table 3 demonstrates that, as in experiment 1, the amount of correlation needed to achieve threshold was nearly the same for the ρref = 1 and NoSπ conditions for the five listeners who performed both measurements. This finding reinforces the equivalence of the two measurements, which occurred even for pulsed-sine vocoded stimuli.
Varying the envelope modulation rate for NoSo and the interaural fluctuation rate for the ρref = 1 and NoSπ conditions (Figs. 4 and 5) did not produce a significant effect. Therefore, the listeners tested here did not derive a benefit from slow interaural fluctuations for detecting changes in interaural correlation. Slow interaural fluctuations would be perceived as momentary, but possibly infrequent, ILD fluctuations or features in the stimuli. However, there was a significant effect of interaural fluctuation rate for ρref = 0, where JNDs were worst for the lowest interaural fluctuation rate of 6.4 Hz, and improved with increasing interaural fluctuation rate. Higher interaural fluctuation rates would be perceived as a blurred but stable auditory image.
One way to interpret the poor performance of the ρref = 0 conditions at slow interaural fluctuation rates is that the stability of the auditory image is an important factor. For slow envelope modulation rates, the ILDs vary slowly and randomly. Gabriel and Colburn described the slow interaural fluctuations of decorrelated stimuli as "wandering tones" (Gabriel and Colburn 1981, p. 1396). Comparing the perceptual width of two such wandering tones (one with ρ=0 and one with ρ≈0) would be difficult because a listener might only get a momentary "glimpse" of the largest ILDs, and hence a momentary glimpse at the possible width. Indeed, Gabriel and Colburn's listeners described this comparison as "annoying." The listeners of the current study had a similar subjective impression and reaction to the slow interaural fluctuation rate ρref = 0 conditions. At higher interaural fluctuation rates, where listeners perceived a blurred intracranial image, there were many opportunities to detect the largest ILDs. Hence, we deduce that a contributing factor to produce lower JNDs for the ρref = 0 conditions at higher interaural fluctuation rates was an easier comparison between image widths.
Interpretation of the data in terms of auditory image stability of the reference stimuli may be supported by studies on dynamic interaural differences (Grantham and Wightman 1978; Grantham 1984). These studies utilized sinusoidally changing ITDs or ILDs with different modulation rates, and listeners were asked to choose the interval with the perceptually moving stimulus. Results suggest that listeners' perception of a stimulus changed from a "moving image" to a "blurry image" at a modulation rate of approximately 50 Hz (Grantham and Wightman 1978, pp. 513–514). Similarly, we saw saturation in the decrease of JNDs for ρref = 0 conditions between 32 and 64 Hz (Fig. 4), supporting our interpretation of the data that blurry images provide more stable reference images, thus making the task easier for the listeners.
One limiting factor in this experiment is the across-listener and within-listener variability in the measurements (see Fig. 4). Binaural sensitivity is often highly variable (Bernstein et al. 1998; Buss et al. 2007), and one technique to reduce the across-listener variability is to calculate normalized JNDs. Comparing Figure 4A and B, it is clear that some of the variability in the measurements can be attributed to the across-listener variability. The within-subject variability is more pronounced at low rates, which is likely a result of how the stimuli were generated. We used a physical parameter (the normalized interaural correlation) to generate the stimuli, which may be inconsistent with the listeners' perceptual scale. Applying physiologically relevant transformations to the interaural correlation for the stimulus generation may reduce this source of variability (Bernstein and Trahiotis 2009); however, it is unclear that such transformations would completely remove the stimulus variability (Goupell 2010).
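A pair of noises with a specified normalized interaural correlation can be generated by mixing two independent noises, following the vector picture described in the footnotes: the shared component is weighted by ρ and the orthogonal component by α. The sketch below assumes α² + ρ² = 1 (unit-length vectors) and uses broadband Gaussian samples from Python's standard library rather than the narrowband noises of the experiments. The function names are hypothetical, and the code illustrates the mixing rule only, not the stimulus-generation code used in the study.

```python
import math
import random


def correlated_noise_pair(rho, n_samples, seed=0):
    """Return (left, right) Gaussian noises whose expected correlation is rho."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    rng = random.Random(seed)
    alpha = math.sqrt(1.0 - rho * rho)  # weight of the orthogonal component
    left, right = [], []
    for _ in range(n_samples):
        n1 = rng.gauss(0.0, 1.0)  # component shared by both ears
        n2 = rng.gauss(0.0, 1.0)  # independent component, right ear only
        left.append(n1)
        right.append(rho * n1 + alpha * n2)
    return left, right


def normalized_correlation(x, y):
    """Zero-lag normalized correlation: sum(xy) / sqrt(sum(x^2) * sum(y^2))."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    for target in (1.0, 0.98, 0.0):
        left, right = correlated_noise_pair(target, 20000)
        measured = normalized_correlation(left, right)
        print(f"target rho = {target}, measured rho = {measured:.3f}")
```

For a finite noise sample, the measured correlation scatters around the target value, and the scatter is largest near ρ = 0. This is one sense in which a physical stimulus parameter and the cue available on a given trial can differ.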
In conclusion, this experiment showed that interaural fluctuation rate was a factor for detecting changes in correlation, but only for ρref = 0 reference stimuli.
EXPERIMENT III: VARIED VOCODER BW, VOCODER CF, AND PULSE RATE WITH FIXED INTERAURAL FLUCTUATION RATE
An additional experiment was conducted to investigate the effect of BW and CF of the vocoded stimuli (i.e., the VBW and VCF) when the interaural fluctuation rate was held constant. The factor of pulse rate was also varied to investigate any effect resulting from the number of harmonics in the stimuli.
Methods
The unprocessed stimuli were 50-Hz BW, 500-Hz CF noises, which have an average interaural fluctuation rate of 32 Hz. These stimuli were chosen because there was no significant effect of interaural fluctuation rate above 32 Hz in experiment 2. The unprocessed stimuli were vocoded using pulses that produced a VBW = 1 or 2 kHz and VCF = 4 or 8 kHz, with pulse rate = 200 or 400 pps. Therefore, eight conditions were tested (2 VBW × 2 VCF × 2 pulse rates). All of these conditions maintained a pulse train modulation depth greater than 99 %. The task was correlation change discrimination with ρref = 1. Otherwise, the methods were the same as those in experiment 2.
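The design described above can be sketched as a short enumeration. The proportionality constant between unprocessed BW and average interaural fluctuation rate is inferred here only from the 50-Hz BW and 32-Hz rate pair quoted in this section, so it is an assumption rather than a stated formula; the variable and function names are hypothetical.

```python
from itertools import product

# Assumed proportionality, inferred only from the 50-Hz BW / 32-Hz rate pair.
RATE_PER_HZ_OF_BW = 32.0 / 50.0  # = 0.64


def average_fluctuation_rate_hz(bw_hz):
    """Approximate average interaural fluctuation rate for a given noise BW."""
    return RATE_PER_HZ_OF_BW * bw_hz


# Factor levels for experiment 3, as listed in the Methods.
VBW_HZ = (1000, 2000)
VCF_HZ = (4000, 8000)
PULSE_RATE_PPS = (200, 400)

# Full factorial crossing: 2 VBW x 2 VCF x 2 pulse rates.
conditions = [
    {"vbw_hz": vbw, "vcf_hz": vcf, "pulse_rate_pps": pps}
    for vbw, vcf, pps in product(VBW_HZ, VCF_HZ, PULSE_RATE_PPS)
]

if __name__ == "__main__":
    print("fluctuation rate for 50-Hz BW:", average_fluctuation_rate_hz(50), "Hz")
    for condition in conditions:
        print(condition)
```

Every condition shares the same unprocessed noise, so the interaural fluctuation rate is the same across all eight cells while VBW, VCF, and pulse rate vary.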
Results
Figure 6 shows the JNDs for this experiment as a function of VBW. A three-way RM ANOVA showed that JNDs decreased with increasing VBW [F(1,16) = 6.22, p = 0.037, ηp2 = 0.44] and JNDs increased for increasing VCF [F(1,16) = 33.4, p = 0.0004, ηp2 = 0.81]. However, there was no significant effect of pulse rate [F(1,16) = 0.00001, p = 0.994, ηp2 = 0]. None of the interactions were significant (p > 0.05).
Discussion
The final experiment showed that when discriminating changes in correlation of pulsed-sine vocoded stimuli for a constant interaural fluctuation rate, JNDs decreased a small amount for increasing VBW and JNDs increased for increasing VCF. Therefore, both VBW and VCF are factors that affect detecting changes in interaural correlation. There was no effect of pulse rate (concomitantly, there was no effect of number of harmonics), which is consistent with the results of Goupell (2012). The effects of VBW and VCF on correlation discrimination sensitivity can be thought of as independent factors because of the lack of significant interaction. As postulated in experiment 2, it is advantageous to have a larger number of auditory filters encoding correlation changes in a coherent or comodulated manner, which is an aspect unique to stimuli that are pulsed-sine vocoded or presented electrically with a CI. Such an advantage does not occur for non-vocoded noises, where the interaural information in separate auditory filters can be quite different. It is for this reason that we believe Gabriel and Colburn (1981) showed that ρref = 1 JNDs increased for BWs larger than 115 Hz for non-vocoded noises.
Similar to other reports, binaural sensitivity for rapidly modulated stimuli improves as BW is increased (Goupell et al. 2013b) and worsens as CF is increased (Bernstein and Trahiotis 1996; Majdak and Laback 2009). Using bandpass-filtered pulse trains, Majdak and Laback showed a small but significant decrease in static ITD sensitivity for increasing VCF when the VCF/VBW ratio was held approximately constant. We also showed a significant decrease in binaural sensitivity with increasing VCF when the VCF/VBW ratio was held constant, which was confirmed by a separate two-way RM ANOVA using just the 4-kHz-VCF/1-kHz-VBW and 8-kHz-VCF/2-kHz-VBW data with factors VCF and rate [VCF: F(1,16) = 11.9, p = 0.009, ηp2 = 0.60]. Hence, our results are in agreement with those from Majdak and Laback. Stimuli with an approximately constant VCF/VBW ratio are much like electrical stimulation that occurs in CIs. In many reports with CI listeners, there is no significant effect of electrode location (i.e., CF) on binaural performance (Lu et al. 2010; Litovsky et al. 2012), which is contrary to the results from this experiment using a pulsed-sine vocoder CI simulation; however, part of the lack of significant effect in CI listeners may be because of the variability inherent to the CI population.
GENERAL DISCUSSION
Summary
Bilateral CI users likely need to process faster ILD fluctuation rates than NH listeners at low frequencies, which may limit their ability to experience binaural unmasking of speech in spatially separated noise. Therefore, we investigated the effects of BW, CF, and average interaural fluctuation rate on the ability of NH listeners to detect changes in interaural correlation. This was done by employing both unprocessed (non-vocoded) Gaussian noises and pulsed-sine vocoded noises; in the latter stimuli, the effects of BW, CF, and interaural fluctuation rate could be independently controlled. The data for the pulsed-sine vocoded stimuli (Figs. 4, 5, and 6) show that VCF and possibly VBW are the factors that affect detecting interaural correlation changes. The unprocessed data in experiment 1 (Figs. 2 and 3) can be understood in a similar way: BW and CF are the critical parameters for ρref = 1 JNDs and NoSπ thresholds.
The average interaural fluctuation rate had no effect on detecting changes from correlated references (ρref = 1 and NoSπ). However, it did have an effect on correlation change discrimination with uncorrelated references (ρref = 0). Specifically, slow interaural fluctuation rates were detrimental to discrimination of changes from uncorrelated references, which we attribute to an auditory image that lacks a stable and easily perceived intracranial width.
Interaural fluctuation rate and binaural sluggishness
Many studies have considered or shown the importance of interaural fluctuations for detecting changes in interaural correlation (e.g., Webster 1951; Jeffress et al. 1956; Hafter and Carrier 1970; Buss et al. 2003; Goupell and Hartmann 2007; van der Heijden and Joris 2009). Slow sinusoidal changes in the ITD and ILD can be explicitly tracked by the auditory system. Blauert (1972) reported that listeners could detect sinusoidal changes in interaural differences "in detail" for ITDs up to 3.1 Hz and ILDs up to 2.4 Hz. In other studies, listeners reported two different perceptions of the stimuli depending on the rate of the modulations of sinusoidally changing ITDs (Grantham and Wightman 1978) and ILDs (Grantham 1984). For slow modulations, less than 50 Hz, listeners reported that they tracked the changing position of the intracranial image. For fast modulations, greater than 50 Hz, listeners reported that they perceived a "blur," implying that the auditory system was unable to track these fast changes in interaural differences, consistent with the binaural temporal integration window as measured by Culling and Summerfield (1998). Therefore, binaural processing does demonstrate sluggishness (Grantham and Wightman 1978; Grantham 1982, 1984; Akeroyd and Summerfield 1999; Bernstein et al. 2001). However, not all binaural tasks have the same time constants (Akeroyd and Bernstein 2001) and comparison of time constants across different types of binaural tasks should be approached cautiously.
Several studies have considered the possibility that the rapid interaural fluctuations reduce binaural performance in NoSπ detection because of binaural sluggishness (McFadden et al. 1972; Zwicker and Henning 1984; Zurek and Durlach 1987; Henning et al. 2007; Nitschmann et al. 2009; Buss and Hall 2010; van de Par et al. 2012). For example, there is an increase in NoSπ thresholds with an increase in noise BW (Bourbon and Jeffress 1965; Wightman 1971; Sever and Small 1979; Hall et al. 1983; Zwicker and Henning 1984; Zurek and Durlach 1987; van de Par and Kohlrausch 1999; Nitschmann et al. 2009; Yasin and Henning 2012). When using off-frequency targets with noise maskers (i.e., spectral masking patterns), there is a sharp decrease in the BMLD (NoSπ–NoSo thresholds) as the frequency separation between target and masker increases (Zwicker and Henning 1984; Henning et al. 2007; Buss and Hall 2010; Nitschmann and Verhey 2012; van de Par et al. 2012). If the off-frequency masker only affected thresholds by energy-based masking cues, the BMLD should be independent of target-masker frequency separation. The evidence to the contrary suggests different mechanisms affecting the diotic and dichotic thresholds. Therefore, the decrease in BMLD with frequency separation may occur because the interaural fluctuation rate increases as the target moves farther away from the noise band.² Most importantly, when an off-frequency masking experiment is performed with a tonal masker, a non-monotonic masking function with minima at approximately ±10 Hz could be interpreted as evidence that listeners derived a benefit from slow interaural fluctuations (McFadden et al. 1972).
Data from correlation change discrimination studies could also be interpreted such that rapid interaural fluctuations reduce binaural performance because of binaural sluggishness. Gabriel and Colburn (1981) performed a systematic study of the effect of BW on correlation change discrimination for two reference correlations, perfectly correlated (ρref = 1) and uncorrelated (ρref = 0), with a 500-Hz CF. For ρref = 1, they found that there was no effect of BW until about 115 Hz, at which point JNDs increased. Evidence to the contrary, where rapid fluctuations were beneficial for discrimination, was also demonstrated in that same report. For ρref = 0, Gabriel and Colburn found that JNDs decreased with increasing BW from 3 to 115 Hz, after which JNDs remained unchanged. Culling et al. (2001) performed a systematic study of CF with fixed ERB noises where the target was a decorrelated band presented either in isolation or fringed with correlated noise. For 1.3-ERB noises in isolation, there was little effect of CF from 250 (BW = 70 Hz) to 1,500 Hz (BW = 240 Hz). However, when correlated flanking bands were added, thus increasing the rate of interaural fluctuations, performance worsened.
The data from this study, particularly Figures 4 and 5, do not support the interpretation that rapid interaural fluctuations impair the binaural processing of interaural fluctuations, which would be a result of binaural sluggishness. The data from the unprocessed noises in Figures 2 and 3 can be interpreted similarly; however, the confounding variables of BW and CF obscure a clean interpretation of the data in terms of the interaural fluctuation rate. A major difference in the unprocessed and vocoded stimuli is that the vocoded stimuli present coherent modulations across many auditory filters, thus having consistent interaural fluctuations across many auditory filters. The unprocessed stimuli have inconsistent interaural fluctuations across auditory filters. Therefore, the most parsimonious explanation of the different effects of BW, CF, and interaural fluctuation rate observed in Figures 2 and 3 compared to Figures 4 and 5 is the difference in the consistency of the interaural fluctuations across auditory filters.
Other techniques have been employed to study the effects of interaural fluctuation rate on decorrelated stimuli. van de Par et al. (2012) also concluded that rapid interaural fluctuations do not degrade binaural sensitivity by using off-frequency interference tones both above and below the noise masker. Such stimuli reduced the effectiveness of monaural modulation cues in the task, thus providing evidence that the monaural modulation cues were the cause of the decreased BMLDs with increasing frequency separation, not the rapid interaural fluctuations. Another technique adds extra binaural modulations in either the ITDs or ILDs of NoSπ stimuli to "scramble" or "jam" the interaural fluctuations inherent to the stimuli (Culling 2011). Culling used 1.3-ERB Gaussian noises at CFs from 250 to 1500 Hz. For the AM conditions (i.e., scrambled ILDs), there was a negligible effect of the modulation rate of the additional modulations, consistent with the results of the current study. However, it should be noted that there are major differences between the Culling study and the present study. One difference is that the inherent interaural fluctuations covaried with CF for Culling's stimuli. Another difference is that Culling did not test at CFs greater than 1500 Hz. The last difference is that Culling's stimuli also provided ITD fluctuations. Future research could apply the binaural-modulation technique to systematically study the effect of interaural modulation rate on correlation change discrimination as was done using vocoder processing in the present study.
Implications for Cochlear Implants
Vocoders have commonly been used to simulate CI processing (Shannon et al. 1995). In the pulsed-sine vocoder processing that was used in the present experiments, several aspects of CI processing have been reproduced: (1) replacement of any informative fine-structure information and representation of only the temporal envelope, (2) use of high-rate pulse trains to encode acoustic information, and (3) a large effective BW of the stimuli to model large spread of current that occurs in monopolar electrical stimulation (on the order of mm, Nelson et al. 2008). The major motivation of this study was to determine if the rapid interaural fluctuation rates produced by CI-like processing could explain some of the smaller than expected binaural benefits in bilateral CI users compared to CI simulations in NH listeners (e.g., Lu et al. 2010). The study showed that rapid interaural fluctuation rates do not degrade interaural correlation change sensitivity. Therefore, the poor binaural benefits in bilateral CI listeners must be a result of another factor or factors, such as poor temporal modulation processing across the ears in both single-channel (van Hoesel 2007; Goupell et al. 2013a) and multi-channel stimulation (Lu et al. 2011).
The NoSπ thresholds found in this study can be compared to CI data in similar conditions. For example, Lu et al. (2010) performed BMLD measurements in bilateral CI listeners and NH listeners, the latter presented with pulsed-sine vocoded stimuli (similar to those used in this study). They found an average BMLD of 4.6 ± 4.9 dB for bilateral CI listeners and 12.6 ± 5.4 dB for the NH listeners. In experiment 2 of the present study, BMLDs averaged over different envelope modulation rates were 14.8 ± 3.8 dB. Therefore, the two measurements obtained with pulsed-sine vocoders are in agreement. Lu et al. varied the CF of the sine tone (125, 250, and 500 Hz) and concomitantly varied the BW of the unprocessed masking noise (50, 100, and 200 Hz); therefore, the interaural fluctuation rate changed between conditions. Similar to the present study, Lu et al. used a pulsed-sine vocoder to convey only the envelope information, and their pulses had a fixed VBW across conditions. They found no effect of changing the unprocessed BW and CF in both the CI and NH listeners. Since the unprocessed BW is related to the average interaural fluctuation rate, the results of Lu et al. are consistent with the results of the current study.
Lu et al. used 1,000-pps pulse trains and tested different VCFs (0.37, 1.4, and 5.3 kHz) using a constant VBW (pulse duration of about 1 ms) and found no effect of VCF on NoSπ thresholds. The results of experiment 3 (using 200- and 400-pps pulse trains) suggest that VBW and VCF are important factors for correlation change discrimination. Perhaps the reason for the discrepancy is that we used higher (4- and 8-kHz) VCFs, whereas Lu et al. used lower VCFs.
Although we have outlined three processing aspects of the pulsed-sine vocoder that we believe closely follow CI processing, we have at least three recommendations for future vocoder implementations. First, VBW should be kept constant in mm, not Hz, as VCF changes to more accurately model the spread of excitation (Nelson et al. 2008; Goupell et al. 2013b; Kan et al. 2013). Second, a deep modulation depth for the acoustic pulses is desirable (we chose at least 99 %). Modulation depth has been shown to be important for binaural processing, such as static ITD sensitivity (Bernstein and Trahiotis 2009). A Gaussian pulse with a 1-ms, –3-dB duration has a 1,000-Hz, –3-dB VBW, because the VBW is the inverse of the duration for a Gaussian pulse. From numerical calculations, we have determined that a relationship of approximately 2.5-Hz VBW for every 1 pps is necessary to achieve a 99 % modulation depth for Gaussian-enveloped tone pulse trains. Therefore, for the high pulse rates necessary to simulate CI processing, the VBW must be quite large (1,000 pps would need a 2,500-Hz VBW), which is not necessarily disadvantageous because the spread of excitation in monopolar CI stimulation is usually quite large (Nelson et al. 2008). Third, the pulse trains should have a VBW appropriate for the VCF of the testing. A 1,000-pps pulse train with a 1,000-Hz VBW should not be tested at VCFs less than 500 Hz to avoid aliasing the signal. In fact, given the shallow insertion depths of CIs into the cochlea, apical stimulation should probably be performed closer to a VCF of 1–1.5 kHz (Ketten et al. 1998).
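The numerical rules of thumb in these recommendations can be collected into a small parameter check. This is a minimal sketch: the 2.5-Hz-per-pps factor and the inverse relation between Gaussian pulse duration and VBW are taken from the text, whereas reading the aliasing limit as VCF ≥ VBW/2 is an interpretation of the 500-Hz example, and all function names are hypothetical.

```python
# From the text: about 2.5 Hz of VBW per 1 pps gives a 99 % modulation depth.
HZ_PER_PPS_FOR_99_PERCENT_DEPTH = 2.5


def min_vbw_hz(pulse_rate_pps):
    """Smallest VBW (Hz) expected to reach about 99 % modulation depth."""
    return HZ_PER_PPS_FOR_99_PERCENT_DEPTH * pulse_rate_pps


def pulse_duration_s(vbw_hz):
    """-3-dB duration (s) of a Gaussian pulse with the given -3-dB VBW."""
    return 1.0 / vbw_hz


def min_vcf_hz(vbw_hz):
    """Lowest VCF (Hz) that keeps the band above 0 Hz (assumed VCF >= VBW / 2)."""
    return vbw_hz / 2.0


def check_vocoder(pulse_rate_pps, vbw_hz, vcf_hz):
    """Report whether a parameter set satisfies both rules of thumb."""
    return {
        "depth_ok": vbw_hz >= min_vbw_hz(pulse_rate_pps),
        "vcf_ok": vcf_hz >= min_vcf_hz(vbw_hz),
    }


if __name__ == "__main__":
    print(min_vbw_hz(1000))           # VBW needed at 1,000 pps
    print(min_vcf_hz(1000))           # lowest VCF for a 1,000-Hz VBW
    print(check_vocoder(400, 1000, 4000))
```

Under these rules, the experiment 3 settings (200 or 400 pps, VBW of 1 or 2 kHz, VCF of 4 or 8 kHz) all pass, with the 400-pps, 1-kHz-VBW case sitting exactly at the 2.5-Hz-per-pps boundary.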
CONCLUSION
Using a pulsed-sine vocoder as a CI simulation, NH listeners' ability to detect changes in interaural correlation was measured as a function of BW, CF, and interaural fluctuation rate. Smaller BWs and higher CFs produced poorer sensitivity to changes in correlation. Slow fluctuation rates decreased sensitivity for detecting changes in correlation from uncorrelated reference stimuli, but not from correlated reference stimuli. Therefore, even though bilateral CI users typically experience faster fluctuation rates when processing stimuli than NH listeners, this is likely not a limiting factor in producing binaural unmasking of spatially separated speech for bilateral CI users. Rather, bilateral CI listeners could be experiencing inconsistent across-ear temporal modulation processing, corrupted by both single-channel encoding and across-channel interactions.
Acknowledgments
We thank Corey Stoelb, Mitch Mostardi, Nick Liimatta, Garrison Draves, and Abbey Baus for help collecting data. Corey Stoelb was also vital in organization and plotting of the data. This work was supported by NIH Grant K99/R00 DC010206 (Goupell), R01 DC003083 (Litovsky), and P30 HD03352 (Waisman Center core grant).
Footnotes
1. Using vector diagrams (e.g., Zurek 1991), the interaural correlation between two noise vectors is the projection of one vector on the other. In other words, the interaural correlation ρ is the similarity of two noises. The perpendicular or orthogonal component that we call α is the dissimilarity of the two noises. We have used α as the scale in the adaptive procedure because the magnitudes of the IPD and ILD fluctuations, which may be related to detecting changes in correlation, vary nearly linearly in terms of α, not ρ (Goupell 2010).
2. For an off-frequency target and masking configuration, the temporal aspects of the combination of target and masker are dependent on SNR, masker BW, and frequency separation of target and masker. We recommend the vector diagram analysis of the stimuli in Henning et al. (2007) for a better understanding of this complex relationship.
References
- Akeroyd MA, Bernstein LR. The variation across time of sensitivity to interaural disparities: behavioral measurements and quantitative analyses. J Acoust Soc Am. 2001;110:2516–2526. doi: 10.1121/1.1412442.
- Akeroyd MA, Summerfield AQ. A binaural analog of gap detection. J Acoust Soc Am. 1999;105:2807–2820. doi: 10.1121/1.426897.
- Bernstein LR, Trahiotis C. The normalized correlation: accounting for binaural detection across center frequency. J Acoust Soc Am. 1996;100:3774–3784. doi: 10.1121/1.417237.
- Bernstein LR, Trahiotis C. The effects of randomizing values of interaural disparities on binaural detection and on discrimination of interaural correlation. J Acoust Soc Am. 1997;102:1113–1120. doi: 10.1121/1.419863.
- Bernstein LR, Trahiotis C. How sensitivity to ongoing interaural temporal disparities is affected by manipulations of temporal features of the envelopes of high-frequency stimuli. J Acoust Soc Am. 2009;125:3234–3242. doi: 10.1121/1.3101454.
- Bernstein LR, Trahiotis C, Hyde EL. Inter-individual differences in binaural detection of low-frequency or high-frequency tonal signals masked by narrow-band or broadband noise. J Acoust Soc Am. 1998;103:2069–2078. doi: 10.1121/1.421378.
- Bernstein LR, Trahiotis C, Akeroyd MA, Hartung K. Sensitivity to brief changes of interaural time and interaural intensity. J Acoust Soc Am. 2001;109:1604–1615. doi: 10.1121/1.1354203.
- Blauert J. On the lag of lateralization caused by interaural time and intensity differences. Audiology. 1972;11:265–270. doi: 10.3109/00206097209072591.
- Boehnke SE, Hall SE, Marquardt T. Detection of static and dynamic changes in interaural correlation. J Acoust Soc Am. 2002;112:1617–1626. doi: 10.1121/1.1504857.
- Bourbon WT, Jeffress LA. Effect of bandwidth of masking noise on detection of homophasic and antiphasic tonal signals (A). J Acoust Soc Am. 1965;39:1180–1181. doi: 10.1121/1.1939415.
- Bronkhorst AW, Plomp R. The effect of head-induced interaural time and level differences on speech intelligibility in noise. J Acoust Soc Am. 1988;83:1508–1516. doi: 10.1121/1.395906.
- Buss E, Hall JW 3rd. The role of off-frequency masking in binaural hearing. J Acoust Soc Am. 2010;127:3666–3677. doi: 10.1121/1.3377053.
- Buss E, Hall JW 3rd, Grose JH. The masking level difference for signals placed in masker envelope minima and maxima. J Acoust Soc Am. 2003;114:1557–1564. doi: 10.1121/1.1598199.
- Buss E, Hall JW 3rd, Grose JH. Individual differences in the masking level difference with a narrowband masker at 500 or 2000 Hz. J Acoust Soc Am. 2007;121:411–419. doi: 10.1121/1.2400849.
- Culling JF. Subcomponent cues in binaural unmasking. J Acoust Soc Am. 2011;129:3846–3855. doi: 10.1121/1.3560944.
- Culling JF, Summerfield AQ. Measurements of the binaural temporal window using a detection task. J Acoust Soc Am. 1998;103:3540–3553. doi: 10.1121/1.423061.
- Culling JF, Colburn HS, Spurchise M. Interaural correlation sensitivity. J Acoust Soc Am. 2001;110:1020–1029. doi: 10.1121/1.1383296.
- Culling JF, Hawley ML, Litovsky RY. The role of head-induced interaural time and level differences in the speech reception threshold for multiple interfering sound sources. J Acoust Soc Am. 2004;116:1057–1065. doi: 10.1121/1.1772396.
- Culling JF, Jelfs S, Talbert A, Grange JA, Backhouse SS. The benefit of bilateral versus unilateral cochlear implantation to speech intelligibility in noise. Ear Hear. 2012;33:673–682. doi: 10.1097/AUD.0b013e3182587356.
- Gabriel KJ, Colburn HS. Interaural correlation discrimination: I. Bandwidth and level dependence. J Acoust Soc Am. 1981;69:1394–1401. doi: 10.1121/1.385821.
- Goupell MJ (2010) Interaural fluctuations and the detection of interaural incoherence. IV. The effect of compression on stimulus statistics. J Acoust Soc Am: 3691–3702 [DOI] [PMC free article] [PubMed]
- Goupell MJ. The role of envelope statistics in detecting changes in interaural correlation. J Acoust Soc Am. 2012;132:1561–1572. doi: 10.1121/1.4740498. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Goupell MJ, Hartmann WM. Interaural fluctuations and the detection of interaural incoherence: bandwidth effects. J Acoust Soc Am. 2006;119:3971–3986. doi: 10.1121/1.2200147. [DOI] [PubMed] [Google Scholar]
- Goupell MJ, Hartmann WM. Interaural fluctuations and the detection of interaural incoherence: II. Brief duration noises. J Acoust Soc Am. 2007;121:2127–2136. doi: 10.1121/1.2436714. [DOI] [PubMed] [Google Scholar]
- Goupell MJ, Kan A, Litovsky RY. Typical mapping procedures can produce non-centered auditory images in bilateral cochlear-implant users. J Acoust Soc Am. 2013;133:EL101–EL107. doi: 10.1121/1.4776772. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Goupell MJ, Stoelb C, Kan A, Litovsky RY. Effect of mismatched place-of-stimulation on the salience of binaural cues in conditions that simulate bilateral cochlear-implant listening. J Acoust Soc Am. 2013;133:2272–2287. doi: 10.1121/1.4792936. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Grantham DW. Detectability of time-varying interaural correlation in narrow-band noise stimuli. J Acoust Soc Am. 1982;72:1178–1184. doi: 10.1121/1.388326. [DOI] [PubMed] [Google Scholar]
- Grantham DW. Discrimination of dynamic interaural intensity differences. J Acoust Soc Am. 1984;76:71–76. doi: 10.1121/1.391009. [DOI] [PubMed] [Google Scholar]
- Grantham DW, Wightman FL. Detectability of varying interaural temporal differences. J Acoust Soc Am. 1978;63:511–523. doi: 10.1121/1.381751. [DOI] [PubMed] [Google Scholar]
- Hafter ER, Carrier SC. Masking-level differences obtained with a pulsed tonal masker. J Acoust Soc Am. 1970;47:1041–1047. doi: 10.1121/1.1912003. [DOI] [PubMed] [Google Scholar]
- Hall JW, Tyler RS, Fernandes MA. Monaural and binaural auditory frequency resolution measured using bandlimited noise and notched-noise masking. J Acoust Soc Am. 1983;73:894–898. doi: 10.1121/1.389013. [DOI] [PubMed] [Google Scholar]
- Henning GB, Yasin I, Witton C (2007) Remote masking and the binaural masking-level difference. In: Kollmeier B, Klump G, Hohmann V, Langemann U, Mauermann M, Uppenkamp S, and Verhey JL (ed) Hearing - From Sensory Processing to Perception. Springer Verlag, Berlin, pp. 457–466
- Jain M, Gallagher DT, Koehnke J, Colburn HS. Fringed correlation discrimination and binaural detection. J Acoust Soc Am. 1991;90:1918–1926. doi: 10.1121/1.401671. [DOI] [PubMed] [Google Scholar]
- Jeffress LA, Blodgett HC, Sandel TT, Wood CL 3rd. Masking of tonal signals. J Acoust Soc Am. 1956;28.
- Kan A, Stoelb C, Litovsky RY, Goupell MJ. Effect of mismatched place-of-stimulation on binaural fusion and lateralization in bilateral cochlear-implant users. J Acoust Soc Am. 2013;134:2923–2936. doi: 10.1121/1.4820889.
- Ketten DR, Skinner MW, Wang G, Vannier MW, Gates GA, Neely JG. In vivo measures of cochlear length and insertion depth of Nucleus cochlear implant electrode arrays. Ann Otol Rhinol Laryngol Suppl. 1998;175:1–16.
- Koehnke J, Colburn HS, Durlach NI. Performance in several binaural-interaction experiments. J Acoust Soc Am. 1986;79:1558–1562. doi: 10.1121/1.393682.
- Lavandier M, Culling JF. Prediction of binaural speech intelligibility against noise in rooms. J Acoust Soc Am. 2010;127:387–399. doi: 10.1121/1.3268612.
- Litovsky RY, Jones GL, Agrawal S, van Hoesel R. Effect of age at onset of deafness on binaural sensitivity in electric hearing in humans. J Acoust Soc Am. 2010;127:400–414. doi: 10.1121/1.3257546.
- Litovsky RY, Goupell MJ, Godar S, Grieco-Calub T, Jones GL, Garadat SN, Agrawal S, Kan A, Todd A, Hess C, Misurelli S. Studies on bilateral cochlear implants at the University of Wisconsin's Binaural Hearing and Speech Laboratory. J Am Acad Audiol. 2012;23:476–494. doi: 10.3766/jaaa.23.6.9.
- Loizou PC. Speech processing in vocoder-centric cochlear implants. In: Moller A, editor. Cochlear and brainstem implants. Basel: Karger; 2006. pp. 109–143.
- Loizou PC, Hu Y, Litovsky R, Yu G, Peters R, Lake J, Roland P. Speech recognition by bilateral cochlear implant users in a cocktail-party setting. J Acoust Soc Am. 2009;125:372–383. doi: 10.1121/1.3036175.
- Long CJ, Carlyon RP, Litovsky RY, Downs DH. Binaural unmasking with bilateral cochlear implants. J Assoc Res Otolaryngol. 2006;7:352–360. doi: 10.1007/s10162-006-0049-4.
- Lu T, Litovsky R, Zeng FG. Binaural masking level differences in actual and simulated bilateral cochlear implant listeners. J Acoust Soc Am. 2010;127:1479–1490. doi: 10.1121/1.3290994.
- Lu T, Litovsky R, Zeng FG. Binaural unmasking with multiple adjacent masking electrodes in bilateral cochlear implant users. J Acoust Soc Am. 2011;129:3934–3945. doi: 10.1121/1.3570948.
- Macpherson EA, Middlebrooks JC. Listener weighting of cues for lateral angle: the duplex theory of sound localization revisited. J Acoust Soc Am. 2002;111:2219–2236. doi: 10.1121/1.1471898.
- Majdak P, Laback B. Effects of center frequency and rate on the sensitivity to interaural delay in high-frequency click trains. J Acoust Soc Am. 2009;125:3903–3913. doi: 10.1121/1.3120413.
- McFadden D, Russell WE, Pulliam KA. Monaural and binaural masking patterns for a low-frequency tone. J Acoust Soc Am. 1972;51:534–543. doi: 10.1121/1.1912875.
- Moore BC, Glasberg BR. Suggested formulae for calculating auditory-filter bandwidths and excitation patterns. J Acoust Soc Am. 1983;74:750–753. doi: 10.1121/1.389861.
- Nelson DA, Donaldson GS, Kreft H. Forward-masked spatial tuning curves in cochlear implant users. J Acoust Soc Am. 2008;123:1522–1543. doi: 10.1121/1.2836786.
- Nitschmann M, Verhey JL. Modulation cues influence binaural masking-level difference in masking-pattern experiments. J Acoust Soc Am. 2012;131:EL223–EL228. doi: 10.1121/1.3681925.
- Nitschmann M, Verhey JL, Kollmeier B. The role of across-frequency processes in dichotic listening conditions. J Acoust Soc Am. 2009;126:3188–3198. doi: 10.1121/1.3243307.
- Poon BB, Eddington DK, Noel V, Colburn HS. Sensitivity to interaural time difference with bilateral cochlear implants: development over time and effect of interaural electrode spacing. J Acoust Soc Am. 2009;126:806–815. doi: 10.1121/1.3158821.
- Rice SO. Mathematical analysis of random noise. In: Wax N, editor. Selected papers on noise and stochastic processes. New York: Dover; 1954. pp. 133–294.
- Sever JC Jr, Small AM Jr. Binaural critical masking bands. J Acoust Soc Am. 1979;66:1343–1350. doi: 10.1121/1.383528.
- Shannon RV, Zeng FG, Kamath V, Wygonski J, Ekelid M. Speech recognition with primarily temporal cues. Science. 1995;270:303–304. doi: 10.1126/science.270.5234.303.
- van de Par S, Kohlrausch A. A new approach to comparing binaural masking level differences at low and high frequencies. J Acoust Soc Am. 1997;101:1671–1680. doi: 10.1121/1.418151.
- van de Par S, Kohlrausch A. Dependence of binaural masking level differences on center frequency, masker bandwidth, and interaural parameters. J Acoust Soc Am. 1999;106:1940–1947. doi: 10.1121/1.427942.
- van de Par S, Luebken B, Verhey JL, Kohlrausch A. Off-frequency BMLD: The role of monaural processing. In: Moore BCJ, Patterson RD, Winter IM, Carlyon RP, Gockel HE, editors. Basic Aspects of Hearing: Physiology and Perception. Cambridge, England: Springer; 2012. pp. 293–301.
- van der Heijden M, Joris PX. Interaural correlation fails to account for detection in a classic binaural task: Dynamic ITDs dominate NOSπ detection. J Assoc Res Otolaryngol. 2009;11:113–131. doi: 10.1007/s10162-009-0185-8.
- Van Deun L, van Wieringen A, Francart T, Scherf F, Dhooge IJ, Deggouj N, Desloovere C, Van de Heyning PH, Offeciers FE, De Raeve L, Wouters J. Bilateral cochlear implants in children: binaural unmasking. Audiol Neuro Otol. 2009;14:240–247. doi: 10.1159/000190402.
- Van Deun L, van Wieringen A, Francart T, Buchner A, Lenarz T, Wouters J. Binaural unmasking of multi-channel stimuli in bilateral cochlear implant users. J Assoc Res Otolaryngol. 2011;12:659–670. doi: 10.1007/s10162-011-0275-2.
- van Hoesel RJM. Exploring the benefits of bilateral cochlear implants. Audiol Neuro Otol. 2004;9:234–246. doi: 10.1159/000078393.
- van Hoesel RJM. Sensitivity to binaural timing in bilateral cochlear implant users. J Acoust Soc Am. 2007;121:2192–2206. doi: 10.1121/1.2537300.
- van Hoesel RJM, Bohm M, Pesch J, Vandali A, Battmer RD, Lenarz T. Binaural speech unmasking and localization in noise with bilateral cochlear implants using envelope and fine-timing based strategies. J Acoust Soc Am. 2008;123:2249–2263. doi: 10.1121/1.2875229.
- Webster FA. The influence of interaural phase on masked thresholds: I. The role of interaural time-deviation. J Acoust Soc Am. 1951;23:452–462. doi: 10.1121/1.1906787.
- Wightman FL. Detection of binaural tones as a function of masker bandwidth. J Acoust Soc Am. 1971;50:623–636. doi: 10.1121/1.1912678.
- Wightman FL, Kistler DJ. The dominant role of low-frequency interaural time differences in sound localization. J Acoust Soc Am. 1992;91:1648–1661. doi: 10.1121/1.402445.
- Yasin I, Henning GB. The effects of noise-bandwidth, noise-fringe duration, and temporal signal location on the binaural masking-level difference. J Acoust Soc Am. 2012;132:327–338. doi: 10.1121/1.4718454.
- Zurek PM. Probability distributions of interaural phase and level differences in binaural detection stimuli. J Acoust Soc Am. 1991;90:1927–1932. doi: 10.1121/1.401672.
- Zurek PM, Durlach NI. Masker-bandwidth dependence in homophasic and antiphasic tone detection. J Acoust Soc Am. 1987;81:459–464. doi: 10.1121/1.394911.
- Zwicker E, Henning GB. Binaural masking-level differences with tones masked by noises of various bandwidths and levels. Hear Res. 1984;14:179–183. doi: 10.1016/0378-5955(84)90016-9.