Evidence of across-channel processing for spectral-ripple discrimination in cochlear implant listeners

Jong Ho Won; Gary L Jones; Ward R Drennan; Elyse M Jameyson; Jay T Rubinstein

doi:10.1121/1.3624820

2011 Oct;130(4):2088–2097. doi: 10.1121/1.3624820

PMCID: PMC3206911 PMID: 21973363

Abstract

Spectral-ripple discrimination has been used widely for psychoacoustical studies in normal-hearing, hearing-impaired, and cochlear implant listeners. The present study investigated the perceptual mechanism for spectral-ripple discrimination in cochlear implant listeners. The main goal of this study was to determine whether cochlear implant listeners use a local intensity cue or global spectral shape for spectral-ripple discrimination. The effect of electrode separation on spectral-ripple discrimination was also evaluated. Results showed that it is highly unlikely that cochlear implant listeners depend on a local intensity cue for spectral-ripple discrimination. A phenomenological model of spectral-ripple discrimination, as an “ideal observer,” showed that a perceptual mechanism based on discrimination of a single intensity difference cannot account for performance of cochlear implant listeners. Spectral modulation depth and electrode separation were found to significantly affect spectral-ripple discrimination. The evidence supports the hypothesis that spectral-ripple discrimination involves integrating information from multiple channels.

INTRODUCTION

Spectral-ripple discrimination, originally developed to investigate the spectral resolution of the normal auditory system (e.g., Supin et al., 1994, 1998, 1999), has recently gained wide attention in the cochlear implant (CI) research field (e.g., Henry and Turner, 2003; Henry et al., 2005; Won et al., 2007; Litvak et al., 2007; Saoji et al., 2009; Drennan et al., 2010; Won et al., 2010, 2011; Anderson et al., 2011). These previous studies demonstrated that spectral-ripple discrimination correlates with vowel and consonant recognition in quiet (Henry et al., 2003, 2005; Saoji et al., 2009), speech perception in noise (Won et al., 2007), and music perception (Won et al., 2010). Henry et al. (2005) demonstrated that normal-hearing listeners showed the best spectral-ripple discrimination performance, followed by hearing-impaired listeners and CI users. Spectral-ripple discrimination is also useful for comparing CI sound encoding strategies (Berenstein et al., 2008; Drennan et al., 2010) and for evaluating the brain-behavior relationship for spectral sensitivity using an electrophysiological acoustic change complex in response to spectral-ripple phase inversion (Won et al., 2011). All of these previous reports suggest that spectral-ripple discrimination is an efficient measure of spectral resolution that is useful for multiple clinically relevant research purposes.

Figure 1 shows the acoustic spectrum, excitation pattern, and sound processor output for spectral-ripple stimuli. Stimuli with ripple densities of 1, 2, and 4 ripples∕octave are shown. Two different ripple stimuli are used for the discrimination task: standard and inverted ripples, with the locations of the spectral peaks and valleys reversed relative to each other. The spectral-ripple depth is 30 dB in this case. As shown in the acoustic spectrum, as the ripple density increases, ripples are spaced more closely, making it more difficult to discriminate between the standard and inverted ripple spectra. The middle column of Fig. 1 shows excitation patterns (Glasberg and Moore, 1990; Moore et al., 1997) for spectral-ripple stimuli. The excitation patterns represent the distribution of neural activity as a function of frequency in a normal auditory system. The largest excitation-pattern differences were observed at 1 ripple∕octave, whereas the difference became smaller as the ripple density increased. The right column of Fig. 1 shows the CI sound processor output corresponding to 16 electrodes. The Advanced Bionics HiResolution® sound processing strategy was used for this analysis. Average outputs over the duration of the ripple stimuli (0.5 s) for each electrode are plotted. For the ripple densities of 1 and 2 ripples∕octave, multiple peaks and valleys are faithfully present across the electrode outputs; however, there is a gradual decrease in the distance between the peaks and valleys as the ripple density increases, resulting in reduced spectral contrast between the standard and inverted ripple stimuli, especially at high ripple density (4 ripples∕octave). This is consistent with the excitation patterns shown in the middle column of Fig. 1. Behavioral discrimination shows a similar trend: CI subjects’ discrimination performance is worse at high ripple density.
The spectral contrast between the standard and inverted ripple stimuli is reflected in the sound processor outputs, and these electrical stimulation patterns would evoke a different excitation pattern for the standard and inverted ripple stimuli in the auditory nerve, suggesting that one of the possible mechanisms of spectral-ripple discrimination is the CI listeners’ ability to detect and discriminate any possible spectral maxima and minima over a broad frequency region.

Figure 1. Acoustic spectrum (left column), cochlear filter excitation pattern (middle column), and sound processor output (right column) for standard (solid lines) and inverted (dotted lines) spectral-ripple stimuli. Stimuli with ripple densities of 1, 2, and 4 ripples∕octave are shown in the upper, middle, and lower plot in each panel.

Other possible factors that could influence spectral-ripple discrimination performance in CI listeners include the number of electrodes available to the subjects, integrity or health of the auditory nerve, channel interaction, and sound processing strategies. Previous studies showed that spectral-ripple discrimination improved as the number of electrodes increased (Henry and Turner, 2003), suggesting that spectral-ripple discrimination ability benefits from having multichannel information. Another possible factor is that the levels at the edge frequencies (i.e., lowest and highest frequency) could change depending on the spectral modulation starting phase, which could potentially provide a cue for discrimination. Anderson et al. (2011) evaluated the spectral edge effect on discrimination performance. They created rippled noise with smooth spectral edges by applying a Hanning window to the spectral edges. When CI listeners were tested with steep (i.e., non-windowed) and smooth spectral edges, they did not show a difference in performance, which suggests that the spectral edge effect is unlikely to affect discrimination performance. Anderson et al. (2011) also investigated spectral-ripple discrimination using four different octave-wide band conditions and showed substantial variations in threshold across frequency for most subjects. This observation might indicate that if a certain frequency region has a better peripheral condition with, for example, more neural survival or more optimal positioning of the electrodes, then better spectral-ripple discrimination may be observed in that region. From the perspective of CI devices, spectral-ripple discrimination performance depends in part on the sound encoding strategy. Drennan et al. (2010) measured spectral-ripple discrimination with the HiResolution and Fidelity120 strategies and showed a significantly better threshold with Fidelity120 than with HiResolution.
Here and throughout this paper, “spectral-ripple threshold” means “threshold for discriminating ripple density”; thus lower thresholds imply worse ripple discrimination performance.

However, some questions have been raised recently about spectral-ripple discrimination (e.g., McKay et al., 2009), speculating that CI listeners can potentially discriminate spectral-ripple stimuli using cues that are not related to spectral resolution, such as spectral center of gravity or local loudness changes. Spectral center of gravity refers to the gross maxima of the spectrum shape of the acoustic sound (Chistovich and Lublinskaya, 1979). Normal-hearing listeners can distinguish vowels if differences between formants exceed the critical distance of 3 bark for the spectral center of gravity (Chistovich and Lublinskaya, 1979). One can determine whether spectral center of gravity serves as a powerful acoustic cue for spectral-ripple discrimination by estimating the spectral center of gravity for spectral-ripple stimuli. Figure 2 shows examples of spectral center of gravity for ripple stimuli with 1 and 2 ripples∕octave (bandwidth: 100–5000 Hz) estimated every 0.02 s, computed using a custom MATLAB program (Clark and Atlas, 2009; Atlas et al., 2010). Standard and inverted ripple phases are shown by the solid and dashed lines, respectively. For rippled noise with 1 ripple∕octave spacing, the average difference in spectral center of gravity between standard and inverted ripples was 20 Hz. For 2 ripples∕octave, it was 12 Hz. The maximum difference at one specific time location was 75 and 48 Hz for 1 and 2 ripples∕octave, respectively. All of these differences are less than 0.5 bark, demonstrating that it is highly unlikely that the spectral center of gravity is a cue, and especially unlikely that it is the only cue, for spectral-ripple discrimination by CI listeners and even by normal-hearing listeners.

Figure 2. Spectral center of gravity over the 500-ms duration of standard- and inverted-phase ripple stimuli with 1 and 2 ripples∕octave. The time window for the center of gravity estimation (i.e., the size of each bin) was 0.02 s.

Given the wide range of practical applications of the spectral-ripple discrimination test for CI users, it is important to determine the dependence, if any, of spectral-ripple discrimination on non-spectral cues. The present study is a further examination of whether CI listeners discriminate spectral-ripple stimuli by integrating information from multiple channels (i.e., across-channel processing). Goupell et al. (2008) found no evidence of across-channel processing in CI users in a “profile analysis” task, and they specifically questioned whether CI users respond to intensity information in just one channel in the spectral-ripple discrimination test. In contrast with spectral-ripple discrimination, Goupell et al. used a testing paradigm in which only a single spectral peak or trough was presented and the electrical pulse amplitude in each channel was fixed over the stimulus duration. In addition to these stimulus differences, it should be noted that there are subtle but important differences between the two tasks in the role and purposes of the level rove. In psychoacoustical experiments, a level rove is widely used to prevent listeners from using intensity cues for discrimination when the stimuli are different in level. The traditional profile analysis task and spectral-ripple discrimination differ in that the “peak” stimulus has a higher overall level than the “no peak” stimulus in the profile analysis task; whereas, standard- and inverted-phase rippled noise tokens are equal in level (if unroved). In light of multiple differences in the stimuli and the task, it is unclear whether the absence of any evidence of across-channel processing in the study of Goupell et al. is relevant to spectral-ripple discrimination by CI users.

The current study presents a series of experiments to evaluate the perceptual mechanisms for spectral-ripple discrimination in CI users. Experiment 1 was designed to test the hypothesis that spectral-ripple discrimination involves integrating information from multiple channels. Standard and inverted ripple stimuli are equal in level, but as a result of cochlear implant sound processing, each individual subject receives different patterns of the growth of loudness across electrodes. The magnitude of level rove was generally smaller than the spectral-ripple depth in the previous studies (e.g., Henry and Turner, 2003; Henry et al., 2005; Won et al., 2007; Anderson et al., 2011), thus there is still a concern that listeners may perform the test on the basis of intensity cues without resolving the spectral peaks and valleys. To address the concern about whether spectral-ripple discrimination is driven by the use of a loudness cue, behavioral performance and model prediction were evaluated for different degrees of intensity cues.

In Experiment 2, the effect of electrode separation on spectral-ripple discrimination in CI listeners was evaluated by testing the hypothesis that spectral-ripple discrimination abilities improve with decreasing channel interaction. Sensitivity to a single electrode stimulus decreases when one or more electrodes stimulate overlapping subsets of nerve fibers due to the decrease of across-fiber independence in excitation. This “channel interaction” effect was examined as a potential contributing factor in spectral-ripple discrimination. Larger electrode separation would be expected to decrease possible channel interaction between two active electrodes.

In Experiment 3, the effect of spectral modulation depth was examined. In principle, the available cues are larger at greater modulation depths. Our previous reports used a spectral-ripple modulation depth of 30 dB (Won et al., 2007, 2010, 2011), whereas Experiment 1 in the present study used a spectral modulation depth of 13 dB. This experiment compared spectral-ripple discrimination thresholds obtained with spectral modulation depths of 13 and 30 dB. We predicted that spectral-ripple discrimination thresholds with 30-dB depth would be greater than thresholds with 13-dB depth; however, the thresholds with each depth would be highly correlated.

EXPERIMENT 1: THE EFFECT OF SPECTRAL MODULATION PHASE AND LEVEL ROVE ON SPECTRAL-RIPPLE DISCRIMINATION

Subjects

Eight postlingually deafened CI listeners participated. Table 1 shows relevant information for the listeners. This study was approved by the University of Washington Institutional Review Board.

Table 1.

Subject characteristics.

Subject	Age (yr)	Duration of hearing loss (yr)^a	Duration of implant use (yr)	Implant device	Sound processor strategy	Experiment participated
S03	64	5	14	Nucleus 22	SPEAK	1
S04	66	1	7	Nucleus 24	ACE	1,2
S12	53	0	6	MedEl Combi40+	CIS	1
S34	59		5	HiRes90K	HiRes	2
S40	72	5	6	HiRes90K	HiRes	3
S41	52	7	5	HiRes90K	HiRes	3
S42	68	5	3	HiRes90K	Fidelity 120	3
S48	70	10	3	HiRes90K	HiRes	1,2
S49	64	4	1	HiRes90K	Fidelity 120	3
S50	40	15	6	Clarion CII	Fidelity 120	3
S51	56	7	6	Clarion CII	HiRes	3
S52	79	0	3	HiRes90K	Fidelity 120	1,2
S53	63	3	7	Clarion CII	Fidelity 120	3
S54	25	3	2.5	HiRes90K	HiRes	3
S55	65	40	1	HiRes90K	HiRes	3
S69	60	30	1.5	HiRes90K	Fidelity 120	1,2
S71	71	15	1.5	HiRes90K	Fidelity 120	1,2
S79	64	10	0.5	N5	ACE	1,2

^a The duration of hearing loss before implantation.

Procedure

To create spectral-ripple stimuli, the following equation was used:

s(t) = \sum_{i=1}^{200} 10^{D \times |\sin(π \times R \times F_i + φ)| / 20} \times \sin(2 \times π \times 100 \times 50^{(i-1)/200} \times t + ϕ_i),    (1)

in which D is the ripple depth in dB, R is the ripple density in ripples∕octave, F_i is the number of octaves from the low cutoff frequency of the passband to the ith component frequency (i.e., [(i-1)log₁₀(50)]∕[200log₁₀(2)]), φ is the spectral modulation starting phase in radians, t is time in seconds, the ϕ_i are the randomized phases in radians (ranging from 0 to 2π) for the 200 pure-tone components, and “×” indicates multiplication. A ripple depth (D) of 13 dB was used in this experiment. The 200 tones were spaced equally on a logarithmic frequency scale over a passband of 100–4903 Hz. The ripple peaks were spaced equally on a logarithmic frequency scale. The stimuli had a total duration of 500 ms and were ramped with 150-ms rise∕fall times. Stimuli were filtered with a long-term, speech-shaped filter that was created in CoolEdit 2000 with parameters specified in accordance with the findings of Byrne et al. (1994).
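As an illustration of Eq. (1), the following is a minimal Python sketch of the tone-complex synthesis, not the authors' stimulus-generation code. The sampling rate, the linear ramp shape, and all function names are assumptions made for this example, and the speech-shaped filtering step is omitted.

```python
import math
import random


def component_frequencies(n_tones=200, f_low=100.0, ratio=50.0):
    """Tone frequencies 100 * 50 ** ((i - 1) / 200) Hz for i = 1..200."""
    return [f_low * ratio ** (i / n_tones) for i in range(n_tones)]


def ripple_component_gains(density, depth_db=13.0, phase=0.0, n_tones=200):
    """Linear gain per tone: 10 ** (D * |sin(pi * R * F_i + phase)| / 20)."""
    octaves_total = math.log2(50.0)  # the passband spans a 50:1 frequency ratio
    gains = []
    for i in range(n_tones):  # i here plays the role of (i - 1) in Eq. (1)
        f_oct = i * octaves_total / n_tones  # F_i: octaves above the 100-Hz edge
        ripple = abs(math.sin(math.pi * density * f_oct + phase))
        gains.append(10.0 ** (depth_db * ripple / 20.0))
    return gains


def ripple_stimulus(density, depth_db=13.0, phase=0.0, dur=0.5,
                    fs=22050, ramp=0.15, rng=None):
    """Sum of 200 random-phase tones weighted by the ripple envelope.

    fs and the linear ramp shape are assumptions; the text specifies only
    the 500-ms duration and the 150-ms rise/fall times.
    """
    rng = rng or random.Random()
    freqs = component_frequencies()
    gains = ripple_component_gains(density, depth_db, phase)
    tone_phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in freqs]
    n = int(round(dur * fs))
    n_ramp = max(1, int(round(ramp * fs)))
    samples = []
    for k in range(n):
        t = k / fs
        x = sum(g * math.sin(2.0 * math.pi * f * t + p)
                for g, f, p in zip(gains, freqs, tone_phases))
        envelope = min(1.0, k / n_ramp, (n - 1 - k) / n_ramp)
        samples.append(envelope * x)
    return samples
```

In this sketch, a standard ripple corresponds to `phase=0.0` and an inverted ripple to `phase=math.pi / 2`, which swaps the peaks and valleys of the |sin| envelope. The highest component, `component_frequencies()[-1]`, falls at about 4903 Hz, matching the upper passband edge stated above.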

To determine whether spectral-ripple discrimination is dependent on within-channel intensity difference cues or global intensity changes (i.e., information integrated from multiple electrodes), the present study examined spectral-ripple discrimination with three different level roves and two different spectral modulation starting phase conditions. The three level roves were 0, 7, and 15 dB, applied in 1-dB steps. With the 15-dB level rove, the rove range exceeded the ripple depth (13 dB). The two spectral modulation starting phase conditions were (1) fixed-phase stimuli, in which the spectral modulation starting phase was set to zero radians (sine phase) for standard ripples and to π∕2 for inverted ripples, and (2) random-phase stimuli, in which the starting phase of the standard ripple was randomly selected from a uniform distribution (0 to 2π rad), and the phase of each corresponding inverted ripple stimulus was determined by adding π∕2 to the phase of the standard ripple stimulus. These two spectral phase conditions were tested to determine whether the starting phase of the sinusoidal ripple shape in the spectral domain gives any cue for discrimination. In theory, the randomization of the starting phase limits the ability of listeners to rely exclusively on a certain frequency channel to perform spectral-ripple discrimination at a certain ripple density. A similar approach was used by Eddins and Bero (2007), who examined spectral modulation detection in normal-hearing listeners.
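The phase and level-rove conditions can be sketched as a per-trial parameter draw. This is a hedged illustration only: the function name `draw_trial`, the representation of the rove as a whole-dB attenuation per interval, and the assumption that the two reference intervals share one starting phase within a trial are choices made for this example, not details confirmed by the text.

```python
import math
import random


def draw_trial(phase_mode, rove_db, rng):
    """Return (phases, attenuations_db, odd_index) for one three-interval trial.

    phase_mode: "fixed" (standard phase 0) or "random" (uniform on 0..2*pi).
    rove_db:    rove range in dB (0, 7, or 15 in the experiment), 1-dB steps.
    """
    if phase_mode == "fixed":
        standard_phase = 0.0
    elif phase_mode == "random":
        standard_phase = rng.uniform(0.0, 2.0 * math.pi)
    else:
        raise ValueError("phase_mode must be 'fixed' or 'random'")
    odd_index = rng.randrange(3)  # interval that carries the inverted ripple
    phases = [standard_phase + math.pi / 2 if k == odd_index else standard_phase
              for k in range(3)]
    # Independent whole-dB attenuation for each interval (assumed rove scheme).
    attenuations_db = [rng.randint(0, rove_db) for _ in range(3)]
    return phases, attenuations_db, odd_index
```

With `rove_db=15`, the level difference between two intervals can exceed the 13-dB ripple depth, which is the property the 15-dB condition was chosen to test.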

The procedure for determining spectral-ripple discrimination thresholds in this experiment is the same as that described by Won et al. (2007). A three-interval, two-up, one-down adaptive procedure was used to determine the spectral-ripple discrimination threshold. More specifically, for the fixed-phase condition, in which the “odd” stimulus always had a π∕2 starting phase, the test is a three-interval, three-alternative forced-choice task. For the random-phase condition, in which the starting phase of the “odd” stimulus varies across trials, the test is a three-interval oddity task (Versfeld et al., 1996). After the stimuli were presented, subjects were asked to click on one of three onscreen buttons labeled 1, 2, and 3. The “odd” stimulus was different from the two reference stimuli. The interstimulus interval (offset to onset) was 500 ms. The threshold for a single adaptive track was estimated by averaging the ripple spacing (the number of ripples∕octave) for the final eight of 13 reversals. The ripple densities differed by ratios of 1.414. A single adaptive track took about 5 min to complete. Six different testing conditions were carried out in random order (3 level roves × 2 phase conditions). For each testing condition, three adaptive tracks were completed to determine the average threshold for that condition, and then subjects were tested with another testing condition. All tests were conducted in a double-walled, sound-treated booth (IAC). Custom MATLAB (The Mathworks, Inc.) programs were used to present stimuli on a Macintosh G5 computer with a Crown D45 amplifier. A single loudspeaker (B&W DM303), positioned 1 m in front of the subjects, presented stimuli in a sound field. When the level rove was 15 dB, the presentation level for each interval was randomly chosen between 49 and 64 dBA in 1-dB steps. For the level rove of 7 dB, the presentation level was varied randomly between 57 and 64 dBA.
Without the level rove, the presentation level was set to 65 dBA. The speaker exceeded ANSI standards for speech audiometry, varying ± 2 dB from 100 to 20 000 Hz. All subjects listened to the stimuli using their own sound processor set to a comfortable listening level. This experiment took about 2 h for each subject.
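The adaptive track described above can be sketched as follows. This is a minimal illustration rather than the authors' test software: the starting density, the trial cap, and the `respond` callback (a stand-in for the listener) are hypothetical, and the sketch assumes that two consecutive correct responses raise the ripple density while one incorrect response lowers it.

```python
def run_adaptive_track(respond, start_density=0.5, step_ratio=1.414,
                       n_reversals=13, n_average=8, max_trials=1000):
    """Two-up, one-down track over ripple density (ripples/octave).

    respond(density) -> True if the listener picks the odd interval.
    Returns the mean density at the final n_average of n_reversals reversals.
    """
    density = start_density
    correct_in_row = 0
    last_direction = 0  # +1: density was raised (harder), -1: lowered (easier)
    reversals = []
    for _ in range(max_trials):
        if len(reversals) >= n_reversals:
            break
        if respond(density):
            correct_in_row += 1
            if correct_in_row < 2:
                continue  # wait for a second consecutive correct response
            correct_in_row = 0
            direction = 1
        else:
            correct_in_row = 0
            direction = -1
        if last_direction and direction != last_direction:
            reversals.append(density)  # density at which the track turned
        last_direction = direction
        density = density * step_ratio if direction > 0 else density / step_ratio
    if len(reversals) < n_reversals:
        raise ValueError("track ended before the required number of reversals")
    tail = reversals[-n_average:]
    return sum(tail) / len(tail)
```

For example, a deterministic stand-in listener such as `lambda d: d < 2.0` makes the track oscillate around 2 ripples∕octave, so the returned threshold lands between the two densities that bracket that limit.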

Results

Figure 3 shows mean spectral-ripple thresholds for each of the six testing conditions. Error bars indicate ± one standard error across eight subjects. Comparison between the phase conditions reveals that the spectral-ripple threshold is robust to the spectral modulation starting phase. Thresholds for the 15-dB level rove condition were not significantly different from those for the 0-dB level rove condition (paired t-test, p > 0.05), although they tended to be lower (i.e., worse). A 2 × 3 × 3 repeated-measures analysis of variance (ANOVA) (two phases, three level roves, and three repetitions) indicated that ripple starting phase [F(1,7) = 0.27, p = 0.62], level rove [F(2,14) = 3.01, p = 0.079], and repetition [F(2,14) = 2.13, p = 0.16] had no significant effect on thresholds. No interaction was found between the parameters. A post hoc Tukey test also showed that thresholds with the three different level roves were not significantly different from each other. Significant correlations were found among thresholds for the six testing conditions. In particular, a strong correlation (r = 0.88, p < 0.01) was found between thresholds obtained with the fixed-phase, 0-dB level rove and the random-phase, 15-dB level rove conditions, suggesting that a similar hearing mechanism was used for the two markedly different testing conditions.

Figure 3. Spectral-ripple discrimination thresholds for six testing conditions. Error bars represent one standard error across subjects. Data points are slightly horizontally displaced for clarity.

Use of a single intensity cue: Model results

The results of the experiment described above are consistent with the use of cues in multiple channels, but they do not specifically rule out a single-channel mechanism of spectral-ripple discrimination. Thus, a phenomenological model was developed to determine whether it is possible to account for the performance of CI users with a single-channel mechanism. As illustrated in Fig. 4, for each acoustic stimulus the model calculated the pulse trains delivered by a speech processor using an Advanced Bionics HiResolution processing strategy and then multiplied a matrix of channel interactions by the mean pulse amplitudes at the CI electrodes. Tests with the model were conducted using a three-interval oddball paradigm and a two-up∕one-down adaptive procedure, as with human subjects. In each trial, the testing program randomly selected three rippled noise tokens and three stimulus presentation levels within the tested level roving range, and the model output (the “activity vector” in Fig. 4) was calculated for each of the three noise tokens.^1

Figure 4. This schematic diagram illustrates the workings of a two-step phenomenological model that was used to determine whether a single-channel mechanism can account for the spectral-ripple discrimination thresholds of CI users. In the first step the model takes the acoustic stimulus and calculates the electrical stimulus delivered by the device. In the second step the electrical stimulus is multiplied by a matrix of interactions between cochlear implant channels.
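The second model step, multiplying the mean pulse amplitudes by a matrix of channel interactions, could be sketched as below. The form of the matrix is an assumption made for illustration: current is taken to decay exponentially with distance at a fixed rate in dB∕mm, the electrode spacing `spacing_mm` is a hypothetical placeholder, and amplitudes are treated as linear quantities. The source does not give these details.

```python
import math


def interaction_matrix(n_channels=16, decay_db_per_mm=4.0, spacing_mm=1.0):
    """Assumed interaction weights: 10 ** (-decay * distance / 20).

    An infinite decay rate gives the identity matrix (no interaction).
    """
    matrix = []
    for i in range(n_channels):
        row = []
        for j in range(n_channels):
            if i == j:
                row.append(1.0)
            elif math.isinf(decay_db_per_mm):
                row.append(0.0)
            else:
                distance_mm = spacing_mm * abs(i - j)
                row.append(10.0 ** (-decay_db_per_mm * distance_mm / 20.0))
        matrix.append(row)
    return matrix


def activity_vector(matrix, mean_amplitudes):
    """Matrix-vector product: spread each channel's amplitude to its neighbors."""
    return [sum(w * a for w, a in zip(row, mean_amplitudes)) for row in matrix]
```

Under these assumptions, a shallower decay (for example 4 instead of 8 dB∕mm) spreads each channel's activity further, flattening the peaks and valleys of the activity vector.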

For this “single-channel mechanism,” the modeled ideal observer had a two-step decision process. First, the model selected the channel in which the difference between the maximum and the minimum of activity was largest. That is, the model was allowed to attend to all channels during the three stimulus intervals and then pick the channel that gave it the largest cue. Second, the model compared activity at the selected channel for the three rippled noise tokens, discarded the two stimuli for which the difference in activity at this channel was smallest, and selected the remaining rippled noise token as the “oddball.” The model’s response was marked as correct if it chose the inverted-phase noise token as the “oddball” and as incorrect if it chose one of the two standard-phase tokens. For each condition, the mean (± standard error) of thresholds from 10 model runs is plotted. A single asterisk indicates comparisons for which p < 0.05; two asterisks indicate comparisons for which p < 0.01. It is clear from the model results [open symbols in Fig. 5a] that single-channel model thresholds declined markedly as the level rove was increased. This is in marked contrast with the relatively stable thresholds of CI users as the level rove was increased [filled circles in Fig. 5a]. A striking example of the effect of level roving on these single-channel thresholds is shown by the upward pointing triangles in Fig. 5a: Even when modeled channel interaction was zero, the single-channel mechanism achieved spectral-ripple discrimination thresholds that were not much greater than the mean thresholds of CI users (filled circles) at a 15-dB level rove. Relative to the no-interaction condition (infinite current decay), performance decreased when current decay was reduced to 8 or 4 dB∕mm, which are typical of bipolar and monopolar stimulation modes, respectively (Bingabr et al., 2008). 
For interactions typical of bipolar stimulation mode (downward pointing triangles), the model’s thresholds at a 15-dB level rove were found by t-tests to be lower than thresholds of CI users (p < 0.01). When interactions were stepped up further to a level typical of monopolar stimulation mode (open circles), which is the stimulation mode of all CI users in this study, the model’s thresholds were lower than those of CI users at level roves of 7 and 15 dB (p < 0.01 for both). Model calculations in Fig. 5a are for tests with fixed ripple starting phase; similar results were obtained with random starting phase (data not shown). These simulations suggest that CI users cannot perform spectral-ripple discrimination by discriminating intensity differences in a single CI channel.
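The two-step decision rule of the single-channel observer can be written compactly. The sketch below follows the description above (pick the channel with the largest activity range across the three intervals, then choose the interval left over after discarding the closest pair); the function names are hypothetical, and detection noise is not modeled.

```python
def odd_one_out(values):
    """Index (0, 1, or 2) left after discarding the closest pair of values."""
    pairs = [(0, 1), (0, 2), (1, 2)]
    a, b = min(pairs, key=lambda p: abs(values[p[0]] - values[p[1]]))
    return 3 - a - b  # the three indices sum to 3


def pick_oddball_single_channel(activity_vectors):
    """activity_vectors: three equal-length lists, one per stimulus interval."""
    n_channels = len(activity_vectors[0])

    def activity_range(ch):
        vals = [vec[ch] for vec in activity_vectors]
        return max(vals) - min(vals)

    # Step 1: attend to the channel offering the largest single-channel cue.
    best_channel = max(range(n_channels), key=activity_range)
    # Step 2: compare the three intervals on that channel only.
    return odd_one_out([vec[best_channel] for vec in activity_vectors])
```

Because a level rove shifts the activity on every channel, the largest single-channel difference can come from the rove rather than from the ripple phase, which is one way to read the drop in model thresholds as the rove increases.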

Figure 5. Thresholds of a phenomenological model that uses discrimination of a single intensity difference to perform the spectral-ripple discrimination task are shown for a mechanism based on the largest single-channel intensity difference (left panel) or on overall intensity difference across all channels (right panel). Model thresholds are shown by open symbols connected by dashed lines, and measured thresholds in 8 CI subjects (shown by filled circles and solid lines) are included for ease of comparison. “MP” and “BP” refer to monopolar and bipolar stimulation modes. Note that the vertical scale differs between the two plots. *: p < 0.05, **: p < 0.01. Error bars represent one standard error.

Figure 5b shows results when the oddball was selected by an “overall intensity difference” mechanism. Specifically, the model calculated the mean of each activity vector (i.e., averaged across all channels), discarded the two stimuli for which the difference in the means was smallest, and selected the remaining stimulus as the “oddball.” The data are plotted as a function of level rove, and performance of CI users at these level roves is shown for comparison [filled circles and solid lines in Fig. 5b]. When a small amount of detection noise was simulated [open triangles in Fig. 5b] thresholds with the overall intensity discrimination mechanism were lower than thresholds of CI users at roves of 0, 7, and 15 dB (p < 0.001 for all three comparisons). The conservative (small) estimate of detection noise used here was the mean of the intensity difference limens for words of the top two performers in the data reported by Rogers et al. (2006). Even when no detection noise was modeled [open circles in Fig. 5b] thresholds were lower with the overall intensity discrimination mechanism than thresholds of CI users at roves of 7 and 15 dB (p < 0.001 for both). These results suggest that CI users cannot rely on overall level differences among rippled noise tokens to perform spectral-ripple discrimination. Taken together, the modeling results strongly suggest that CI users do not perform spectral-ripple discrimination by simply discriminating a single intensity difference between standard- and inverted-phase rippled noise tokens.
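The “overall intensity difference” rule differs from the single-channel rule only in the quantity compared. A minimal sketch, with a hypothetical function name and without the simulated detection noise, is:

```python
def pick_oddball_overall_level(activity_vectors):
    """Choose the odd interval from the across-channel mean of each vector.

    activity_vectors: three equal-length lists, one per stimulus interval.
    """
    means = [sum(vec) / len(vec) for vec in activity_vectors]
    pairs = [(0, 1), (0, 2), (1, 2)]
    # Discard the two intervals whose mean activities are closest together.
    a, b = min(pairs, key=lambda p: abs(means[p[0]] - means[p[1]]))
    return 3 - a - b  # remaining index
```

In this sketch, two intervals with mirrored activity patterns but equal means are indistinguishable to the observer, so its choice is driven entirely by overall level.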