Abstract
Cochlear implant (CI) users can achieve remarkable speech understanding, but there is great variability in outcomes that is only partially accounted for by age, residual hearing, and duration of deafness. Results might be improved with the use of psychophysical tests to predict which sound processing strategies offer the best potential outcomes. In particular, the spectral-ripple discrimination test offers a time-efficient, nonlinguistic measure that is correlated with perception of both speech and music by CI users. Features that make this “one-point” test time-efficient, and thus potentially clinically useful, are also connected to controversy within the CI field about what the test measures. The current work examined the relationship between thresholds in the one-point spectral-ripple test, in which stimuli are presented acoustically, and interaction indices measured under the controlled conditions afforded by direct stimulation with a research processor. Results of these studies include the following: (1) within individual subjects there were large variations in the interaction index along the electrode array, (2) interaction indices generally decreased with increasing electrode separation, and (3) spectral-ripple discrimination improved with decreasing mean interaction index at electrode separations of one, three, and five electrodes. These results indicate that spectral-ripple discrimination thresholds can provide a useful metric of the spectral resolution of CI users.
INTRODUCTION
The use of cochlear implants (CIs) has led to remarkable successes, such as mean open-set sentence recognition scores of around 80% in a quiet background without visual cues, where 70% is considered sufficient to support a telephone conversation (Wilson and Dorman, 2008; Zeng et al., 2008; Rubinstein, 2012). However, there is great variability in outcomes that is only partially accounted for by age, duration of deafness, and degree of residual hearing. Poor understanding of the factors that contribute to individual performance is a critical limitation affecting CI development. The spectral-ripple discrimination test offers a time-efficient, nonlinguistic measure that may be useful for predicting performance of CI users on speech perception (Henry and Turner, 2003; Won et al., 2007) and for comparing CI sound encoding strategies (Berenstein et al., 2008; Drennan et al., 2010). Performance in the task is correlated with vowel and consonant recognition by CI users in quiet (Henry and Turner, 2003; Henry et al., 2005), speech perception in noise (Won et al., 2007), and music perception (Won et al., 2010). These results have been interpreted as indicating the usefulness of spectral-ripple discrimination thresholds as an approximate metric of the spectral resolution of CI users, much as the ripple phase-inversion technique has been used to characterize the frequency resolving power of listeners with normal hearing (Supin et al., 1994, 1997, 1999).
However, there is controversy about the use of the spectral-ripple discrimination test and the interpretation of spectral-ripple discrimination thresholds when the listeners are CI users (Goupell et al. 2008; Azadpour and McKay, 2012). A summary of frequently raised concerns about the spectral-ripple discrimination test was provided by Azadpour and McKay (2012), who argued that it is not clear what underlying psychophysical abilities give rise to the correlation between spectral-ripple discrimination and speech understanding of CI users. First, Azadpour and McKay identified simple cues that they believe could be used by a CI user, such as overall loudness cues, spectral edge cues, or shifts in the spectral center of gravity; they designated these putative cues as “contaminating factors” in the spectral-ripple discrimination test. Goupell et al. (2008) suggested that changes in the intensity of a single channel might be the cue used by CI users in the spectral-ripple discrimination test. A second line of criticism concerns the method of stimulus presentation, in particular, the use of an acoustic stimulus. The spectral-ripple discrimination test is, by design, a fairly brief, “one-point” measure in which only the ripple density parameter is varied and a clinical CI speech processor is used. The criticism of this approach offers an illustration of how the purposes of the spectral-ripple discrimination test differ from the purposes of direct-stimulation testing paradigms, which are typically time intensive but allow the experimenter to specify all parameters of the electrical stimulus directly. Third, there is not general agreement about whether there is value in testing the spectral resolution of CI users. For example, there are questions about whether CI users are sensitive to spectral profiles (e.g., Goupell et al., 2008) and about whether spectral cues other than global spectral changes contribute to the speech understanding of CI users (Azadpour and McKay, 2012). Given the wide range of practical applications of the spectral-ripple discrimination test, it is crucial to address concerns about its usefulness for assessing the spectral resolution of CI users. This article presents the results of experiments that were conducted to investigate the relationship, if any, between extensive, multi-point measures of channel interactions and one-point spectral-ripple discrimination thresholds.
“Spectral ripple” refers to modulation of the amplitude spectrum of a stimulus. In the spectral-ripple discrimination test, the listener's sensitivity to inversions of ripple phase (exchanging the positions of spectral peaks and troughs) is measured at various ripple densities, expressed in ripples/octave. Higher thresholds (more ripples/octave) indicate better performance. Tests of discrimination or detection of spectrally modulated noise were first used to test listeners with normal hearing (Summers and Leek, 1994; Supin et al., 1994, 1999; Macpherson and Middlebrooks, 2003) and were adapted for tests in CI users (Henry and Turner, 2003; Henry et al., 2005; Litvak et al., 2007; Won et al., 2007; Saoji et al., 2009). Possible factors that could influence spectral-ripple discrimination performance in CI listeners include the number of electrodes available to the subjects, the amount of intracochlear current spread, the integrity or health of the auditory nerve, and the sound processing strategy. Previous studies have shown that spectral-ripple discrimination improved as the number of electrodes increased (Henry and Turner, 2003), suggesting that spectral-ripple discrimination ability benefits from having multi-channel information. In addition, Won et al. (2011b) found that spectral-ripple discrimination ability increased with increasing electrode separation; this suggests that performance in the test depends on the extent of overlap in the excitation patterns of the stimulated electrodes.
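To make the ripple-density and phase-inversion parameters concrete, the following minimal Python sketch defines a sinusoidal spectral envelope on a log-frequency axis. It is purely illustrative and is not the stimulus-generation code of Won et al. (2007); the reference frequency, frequency range, and 30-dB peak-to-trough depth are stated assumptions, and the inversion flag simply exchanges the positions of peaks and troughs.

```python
import numpy as np

def ripple_envelope_db(freq_hz, ripples_per_octave, peak_to_trough_db=30.0,
                       f_ref=100.0, inverted=False):
    """Sinusoidal spectral envelope (in dB) on a log-frequency axis.

    Ripple density is given in ripples/octave; setting `inverted=True`
    shifts the ripple phase by pi, exchanging the positions of spectral
    peaks and troughs as in the discrimination task.
    """
    octaves = np.log2(np.asarray(freq_hz, dtype=float) / f_ref)  # position re: f_ref
    phase = np.pi if inverted else 0.0
    return (peak_to_trough_db / 2.0) * np.sin(
        2.0 * np.pi * ripples_per_octave * octaves + phase)

# Example: a 2 ripples/octave envelope and its phase-inverted counterpart
freqs = np.logspace(np.log10(100.0), np.log10(5000.0), 8)
print(np.round(ripple_envelope_db(freqs, 2.0), 1))
print(np.round(ripple_envelope_db(freqs, 2.0, inverted=True), 1))
```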
Despite evidence from several studies suggesting a useful role in predicting speech outcomes for tests of spectral-ripple discrimination (e.g., Henry and Turner, 2003; Henry et al., 2005; Won et al., 2007) and spectral modulation detection (e.g., Litvak et al., 2007; Saoji et al., 2009), critics question the validity of measuring psychophysical ability with a test in which the ultimate parameters of the electrical stimulus are determined by the CI user's speech processing program. Azadpour and McKay (2012) suggested that the results reported in the literature arose due to the influence of “contaminating” factors, but the evidence for this claim is far from conclusive. In fact, sensitivity to some of the potential cues described by Azadpour and McKay (2012), such as spectral shifts and spectral edges, does require some degree of spectral resolution. Moreover, multiple lines of evidence suggest that these and other cues labeled as “contaminating” by Azadpour and McKay might not contribute significantly to performance on the task (Anderson et al., 2011; Won et al., 2011b). Finally, the potential presence of such cues in the acoustic stimulus would have little relevance for its practical application if there was compelling evidence that the test does provide a useful metric of the overall spectral resolution of CI users.
A far broader issue is the conclusion of Azadpour and McKay (2012) that spectral resolution might have very limited relevance to speech understanding in CI users. This has profound implications for speech encoding by CIs and for a large body of research concerning interactions between CI channels (e.g., Nelson et al., 1995; Chatterjee and Shannon, 1998; Throckmorton and Collins, 1999). This claim also has important consequences for the present experiment; namely, it raises the question of whether there is any value in assessing the spectral resolution of CI users. Azadpour and McKay support this claim with the finding that spatial resolution about electrode number 14 in eight users of the Nucleus Freedom™ implant was not correlated with speech scores. However, their result may just serve as an indication that it is difficult to predict speech outcomes from spatial resolution about any one electrode. The latter interpretation is consistent with the observation that the weight given to any one frequency band for speech recognition is highly variable in CI users (Mehr et al., 2001). Previous studies have found a relationship between speech understanding and a multi-electrode average of place-pitch sensitivity (Donaldson and Nelson, 2000) or electrode discrimination (Henry et al., 2000). As regards interactions involving a particular electrode, Stickney et al. (2006) calculated channel interactions due to a single perturbation electrode in the middle of the electrode array; they did not find significant correlations with speech recognition for continuously interleaved sampling (CIS), variations of which dominate the CI field today. Henry et al. (2000) did report significant correlations within narrow frequency bands, but they used a band-specific measure of information transmitted, not raw speech scores. In summary, the observation of Azadpour and McKay that raw speech scores could not be predicted from spatial resolution about a single electrode appears to be consistent with several previously published studies. In particular, their result can be explained without the need to exclude a role for spectral resolution in the speech understanding of CI users. Thus experiments to examine the relationship, if any, between spectral-ripple discrimination thresholds and the spectral resolution of CI users could have important implications for predicting speech outcomes in CI users.
There is evidence that performance on the spectral-ripple discrimination test might also not be predictable from spectral resolution at any one location along the electrode array. Anderson et al. (2011) found no significant correlation between the width of a single spatial tuning curve and broadband spectral-ripple discrimination thresholds. However, tuning curve bandwidths were correlated with spectral-ripple discrimination thresholds in an octave-wide band in the same frequency region. Thus the data of Anderson et al. suggest that multi-point measures of spectral resolution across the whole electrode array may be required to adequately characterize the relationship of broadband spectral-ripple discrimination thresholds with other measures of frequency selectivity.
One obvious limitation on the spectral resolution of CI users is the number of available channels. Spectral resolution is further limited when the channels are not independent. In multi-electrode CIs, there are psychophysically and physiologically measurable effects on sensitivity to a single-electrode stimulus when a second electrode on the same cochlear array is also stimulated. These “channel interactions” can result from overlapping excitation of peripheral auditory nerve fibers or from more central factors. Peripheral channel interactions would contribute to a CI user's ability to analyze and integrate information from multiple channels. For example, patients whose peripheral channel interactions are high receive inputs to the auditory nerve with a high degree of spectral smearing; such patients would be expected to have poorer spectral resolution.
The primary aim of this study was to evaluate the relationship of channel interactions and spectral-ripple discrimination across a range of CI users that includes both low- and high-performing listeners. It was hypothesized that higher channel interactions in CI users are associated with poorer spectral-ripple discrimination and vice versa. This hypothesis was tested using the spectral-ripple discrimination test in which stimuli are presented acoustically along with psychophysical measures of channel interactions under the controlled conditions afforded by direct stimulation with a research processor. Channel interactions were measured at dozens of electrode pairs spanning the entire electrode array and at several different electrode separations. Measured interactions were reported in the form of a normalized metric, the interaction index, as described in Sec. 2. Results supported the hypothesis with a significant negative correlation of spectral-ripple discrimination performance with mean interaction indices at multiple electrode separations.
CHANNEL INTERACTIONS MEASURED BY DIRECT STIMULATION
Methods
General approach
The interaction index (Eddington and Whearty, 2001; Boëx et al., 2003; Stickney et al., 2006) offers a normalized measure of how sensitivity to a probe electrode is affected by polarity inversions of simultaneous pulses on another electrode as shown in Fig. 1. The interaction index is calculated from detection thresholds for a probe pulse train in the presence of a subthreshold “perturbation” pulse train on another electrode with the opposite polarity of the probe (the “T−” threshold) or the same polarity as the probe (“T+”) (Boëx et al., 2003; Stickney et al., 2006). If there is zero overlap between the channels, then the effect of the pulse train in the perturbation channel on probe detection is the same for both polarities, i.e., T− − T+ = 0. If there is 100% overlap between the probe and perturbation channels, then for a perturbation current of C (see Fig. 1), C additional units of current are needed to reach the opposite-polarity threshold relative to the probe-alone threshold (T− = Talone + C), and C fewer units of current are needed to reach the same-polarity threshold (T+ = Talone − C). Subtracting these two expressions yields the equation T− − T+ = 2C in the case of 100% overlap. To summarize, the phase-inversion technique maps a 0% interaction to a difference of 0 and a 100% interaction to a difference of 2C. Thus the formula in Eq. 1 expresses channel interactions on a normalized scale from 0 to 1:
Interaction index = (T− − T+) / (2C).    (1)
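For illustration, a short Python sketch of Eq. (1) follows. The function name is hypothetical, the clipping of out-of-range values to [0, 1] reflects the reporting convention described in Sec. 2B, and the current values in the example are made up.

```python
def interaction_index(t_minus, t_plus, perturbation_current, clip=True):
    """Normalized interaction index from Eq. (1).

    t_minus: probe detection threshold with perturbation pulses opposite
             in polarity to the probe (T-)
    t_plus:  probe detection threshold with perturbation pulses matching
             the probe polarity (T+)
    perturbation_current: amplitude C of the subthreshold perturbation
             pulses, in the same current units as the thresholds

    Returns (T- - T+) / (2C); values outside [0, 1] are clipped, as in
    the reporting convention of Sec. 2B.
    """
    index = (t_minus - t_plus) / (2.0 * perturbation_current)
    if clip:
        index = min(max(index, 0.0), 1.0)
    return index

# Worked example: with 100% channel overlap, T- = Talone + C and
# T+ = Talone - C, so the index is (2C) / (2C) = 1.
t_alone, c = 200.0, 40.0                                  # hypothetical current units
print(interaction_index(t_alone + c, t_alone - c, c))     # -> 1.0 (full overlap)
print(interaction_index(t_alone, t_alone, c))             # -> 0.0 (no interaction)
```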
Subjects
Six users of Advanced Bionics HiRes90K implants participated. These implants were chosen because they have independent current sources for each electrode that allow simultaneous pulse presentation on different electrodes via a BEDCS™ research interface. One bilaterally implanted patient was tested with each CI; thus data are reported for seven ears. Table I shows basic information about the CI users who participated in this experiment. All experimental procedures followed the regulations set by the National Institutes of Health and were approved by the University of Washington's Human Subject Institutional Review Board. All subjects had at least 6 months of experience with their cochlear implant.
TABLE I.
Subject | Age (yr) | Duration of hearing loss (yr) | Duration of implant use (yr) | Implant device | Sound processor strategy |
---|---|---|---|---|---|
S48 | 70 | 10 | 3 | HiRes90K | HiResolution |
S52 | 79 | 0 | 3 | HiRes90K | Fidelity 120 |
S71 | 71 | 15 | 1.5 | HiRes90K | Fidelity 120 |
S80 | 61 | 2 | 1.5 | HiRes90K | HiResolution |
S84 | 46 | 26 | 0.5 | HiRes90K | HiResolution |
S110 (L) | 47 | 17 | 8 | HiRes90K | Fidelity 120 |
S110 (R) | 47 | 7 | 17 | HiRes90K | Fidelity 120 |
Stimuli
Pairs of 813-Hz biphasic pulse trains, a cathodic-first “probe” pulse train presented amid a subthreshold cathodic- or anodic-first perturbation pulse train with a temporal fringe, were used as described by Boëx et al. (2003). Each pulse of the brief (30-ms) probe train was simultaneous with a pulse of the longer (300-ms) perturbation train that began 135 ms before the probe and ended 135 ms after the probe. The simultaneity of probe and perturbation pulses was verified visually on an oscilloscope and was also confirmed indirectly by measuring interaction indices as large as 1, which are not observed with nonsimultaneous pulses (de Balthasar et al., 2003). Pulse width was 21.8 μs, which was found in initial testing to be the smallest pulse width at which T− and T+ thresholds could always be collected for all electrode pairings. Patients had pulse widths of 10.9 μs in their standard maps, but for adjacent electrode pairs at a 10.9-μs pulse width, the probe was often inaudible up to the highest available stimulation levels when the perturbation signal was polarity-inverted (the T− condition).
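The timing relationships in this paragraph can be summarized with a few lines of Python. This is only a bookkeeping sketch of the parameter values given above; the actual pulse trains were generated through the BEDCS interface, and the rounded pulse counts are approximate.

```python
# Timing sketch for the probe/perturbation stimulus (illustrative only).
pulse_rate_hz = 813.0          # pulse rate of both biphasic trains
pulse_width_us = 21.8          # phase duration used in these experiments
probe_dur_ms = 30.0            # brief probe train
fringe_ms = 135.0              # perturbation fringe before and after the probe

perturbation_dur_ms = fringe_ms + probe_dur_ms + fringe_ms
assert perturbation_dur_ms == 300.0        # matches the 300-ms perturbation train

# Approximate pulse counts at 813 pulses/s
print(round(probe_dur_ms / 1000.0 * pulse_rate_hz))         # ~24 probe pulses
print(round(perturbation_dur_ms / 1000.0 * pulse_rate_hz))  # ~244 perturbation pulses
```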
Procedures
These experiments used a PC with a sound card, SoundWave™ software for CI patient maps, a PSP™ research processor, and the BEDCS™ programming interface for direct stimulation of CII and HiRes90K implants from Advanced Bionics Corporation, which was controlled through custom-written MATLAB- and Python-based programs. First, at each active electrode (N ≤ 16), thresholds were collected for 300-ms pulse trains. Second, at each probe/perturbation electrode pair, maximum comfortable levels were determined for a 30-ms probe pulse train at each polarity of an inaudible (2 dB below threshold) 300-ms perturbation train on the perturbation electrode. Subsequently, detection levels for a 30-ms probe pulse train at each polarity of an inaudible (2 dB below threshold) 300-ms perturbation train on another electrode were measured adaptively in a two-down/one-up paradigm with six reversals using an approach similar to that of Boëx et al. (2003). Each pair of opposite-polarity T− and same-polarity T+ thresholds was collected in a single run using two randomly interleaved adaptive tracks. On each trial there were three stimulus intervals, only one of which contained the probe. The timing of the three intervals was indicated by lights displayed on a computer screen. The initial amplitude of the probe pulses in each track was set to a level that had been determined in initial testing to be below the maximum comfortable level but easily audible. The computer program that controlled the experiment specifically prevented the probe level in each adaptive track from exceeding the previously determined maximum comfortable level for that specific combination of perturbation electrode, probe electrode, and pulse polarity. The step size was 1.4 dB of current until the first reversal, then 0.7 dB until the second reversal, and 0.35 dB thereafter. The run was stopped as soon as both adaptive tracks had completed at least six reversals. For each adaptive track, the threshold and standard error were calculated using the Spearman–Karber method (Miller and Ulrich, 2001, 2004). This method estimates the mid-point between chance performance (33.3% for the task in these experiments) and the maximum performance of 100% correct; i.e., thresholds correspond to 66.7% correct performance. The interaction index was calculated using Eq. 1.
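The level-adjustment rule for the interleaved adaptive tracks can be outlined in Python as follows. This is a sketch, not the experiment's control software: the Spearman–Karber threshold estimate is not reproduced here (the sketch only records reversal levels), the class name and starting levels are hypothetical, and the simulated listener at the end merely demonstrates how a T− and a T+ track would be interleaved within one run.

```python
import random

class TwoDownOneUpTrack:
    """One adaptive track for probe detection (2-down/1-up), sketched from the text.

    Step sizes: 1.4 dB until the first reversal, 0.7 dB until the second,
    then 0.35 dB; a track is finished after at least six reversals.  The
    real analysis estimated threshold with the Spearman-Karber method from
    the trial-by-trial data rather than from reversal levels.
    """

    def __init__(self, start_level_db, max_comfortable_db):
        self.level = start_level_db
        self.max_level = max_comfortable_db   # probe level never exceeds this
        self.correct_in_a_row = 0
        self.reversals = []
        self.last_direction = None            # +1 = level went up, -1 = level went down

    def _step_db(self):
        n = len(self.reversals)
        return 1.4 if n == 0 else 0.7 if n == 1 else 0.35

    def record(self, correct):
        """Update the probe level after one three-interval trial."""
        if correct:
            self.correct_in_a_row += 1
            if self.correct_in_a_row < 2:
                return                        # need two in a row to step down
            self.correct_in_a_row = 0
            direction = -1
        else:
            self.correct_in_a_row = 0
            direction = +1
        step = self._step_db()
        if self.last_direction is not None and direction != self.last_direction:
            self.reversals.append(self.level)
        self.last_direction = direction
        self.level = min(self.level + direction * step, self.max_level)

    @property
    def done(self):
        return len(self.reversals) >= 6

# One run: the T- and T+ tracks are randomly interleaved until both finish.
tracks = {"T-": TwoDownOneUpTrack(55.0, 65.0), "T+": TwoDownOneUpTrack(55.0, 65.0)}
rng = random.Random(0)
while not all(t.done for t in tracks.values()):
    name = rng.choice([k for k, t in tracks.items() if not t.done])
    tracks[name].record(correct=rng.random() < 0.7)   # simulated listener
print({k: round(sum(t.reversals) / len(t.reversals), 2) for k, t in tracks.items()})
```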
Channel interactions were quantified using the interaction index, which was calculated according to Eq. 1 from measurements at 46 electrode pairings in each subject. Four listeners completed the full testing protocol at all 46 electrode pairings. Evaluation of two listeners (three ears) was performed with a reduced protocol in which only electrode pairings with an electrode separation of three electrodes were tested. The 46 tested probe-perturbation electrode pairs that were included in the full test protocol are illustrated in Fig. 2. They consisted of 40 electrode pairs that were distributed in an approximately uniform manner across the electrode array at electrode separations of one, three, and five electrodes; four electrode pairs at an electrode separation of nine electrodes, and two electrode pairs consisting of opposite ends of the array. As noted in the preceding text, some listeners completed a reduced protocol in which only pairs separated by three electrodes were tested. In addition, there were slight modifications to which electrode pairings were tested for one listener, who had three disabled electrodes [see also Fig. 4e]. Total testing time for collecting the 46 interaction indices was about 20 h per subject.
Results and discussion
Interaction indices in these experiments spanned the full range from 0 (no interaction) to 1 (100% interaction). In some cases, the calculated interaction index was slightly negative (2% of adaptive tracks) or slightly greater than 1 (8% of adaptive tracks); calculated indices less than 0 or greater than 1 are reported as 0 or 1, respectively. The standard errors of the individual interaction indices were calculated using the Spearman–Karber method (see Sec. 2A4) and were fairly small: 95% of the standard errors were between 0.01 and 0.09. Two examples of pairs of interleaved adaptive tracks are shown for one subject in Fig. 3. The left panel of this figure shows the interleaved T− and T+ tracks at one perturbation-probe electrode pair with a separation of nine electrodes. For this large electrode separation, the T− and T+ adaptive tracks converged to similar levels; thus the interaction index was near 0, as one would expect when the numerator of Eq. 1 is small. The right panel shows the adaptive tracks for an electrode pair separated by one electrode. For this pair of adjacent electrodes, the T− and T+ tracks had very different points of convergence, and the calculated interaction index was near 1. The adaptive tracks in these plots exhibit a pattern that was typical of the data in these experiments: the adaptive tracking procedure converged toward threshold rather quickly.
The interaction indices measured across the electrode array for all seven tested CIs are shown in Fig. 4. For subjects in the upper row, the 46 measured interaction indices are plotted on a scale in which white indicates a 0% interaction and black indicates a 100% interaction. For subjects in the lower row, interaction indices for electrode pairs with a separation of three electrodes are plotted. Note that in both the upper and lower rows, the values for the interaction of each electrode with itself, by definition a 100% interaction, are also plotted; these values appear on the main diagonal, which extends from the upper left corner to the lower right corner of each matrix. The measured data points are shown in black outline to distinguish them from the white background. Visual inspection of the interaction matrices reveals several noteworthy patterns in the data. First, in each matrix, interaction indices generally decrease with increasing distance from the main diagonal. In other words, channel interaction decreased with increasing electrode separation. Second, there is considerable variability in channel interactions in different regions of the electrode array, as can be seen by scanning parallel to the main diagonal of each matrix. Moreover, channel interactions varied across subjects: When comparing results of subjects whose data are shown in the upper row (i.e., subjects who completed the full testing protocol), interactions were generally lowest in the subject at the left and highest in the two subjects at the right.
PSYCHOPHYSICAL PERFORMANCE WITH ACOUSTICALLY PRESENTED STIMULI
Methods
Subjects
Participants were the same as in the experiment described in Sec. 2. Subjects who did not typically use a HiResolution (HiRes) processing strategy were mapped for a HiRes strategy on a PSP™ research processor, which they used during testing. The bilateral CI listener was tested in each ear separately.
Procedures
Spectral-ripple discrimination thresholds were collected previously in these subjects using established techniques (Won et al., 2007). Briefly, three rippled noise tokens with a 30-dB peak-to-trough ratio, two with standard ripple phase and one with inverted ripple phase, were selected for each trial. The order of presentation of the three tokens was randomized, and the task was to select the “oddball.” Stimuli were controlled by a desktop computer (Apple PowerMac G5 with analog sound I/O) and presented in sound field in a double-walled sound-treated IAC booth at 65 dBA from a loudspeaker located directly in front of the listener at a distance of 1 m. A level attenuation of 1–8 dB (in 1-dB increments) was randomly selected for each interval in the three-interval task. Ripple density was varied within the range 0.125–11.314 ripples per octave in equal-ratio steps of 1.414 in an adaptive two-up/one-down procedure with 13 reversals that converges to the 70.7% correct point (Levitt, 1971).1 Thus the number of ripples per octave was increased by a factor of 1.414 after two consecutive correct responses and decreased by a factor of 1.414 after a single incorrect response. The threshold for each adaptive run was calculated as the mean of the last eight reversals. The spectral-ripple discrimination threshold for each subject is the mean of six adaptive runs. Total testing time to complete six runs was about 30 min.
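A minimal Python sketch of one spectral-ripple adaptive run is given below. The 0.125–11.314 ripples/octave range, the 1.414 step ratio, the 13 reversals, and the mean of the last eight reversals follow the text; the starting density and the simulated listener are assumptions made only so that the example runs.

```python
import numpy as np

def ripple_threshold_run(respond, start_density=1.0, step_ratio=1.414,
                         lo=0.125, hi=11.314, n_reversals=13, n_average=8):
    """One adaptive run of the spectral-ripple test (2-up/1-down sketch).

    `respond(density)` returns True when the listener picks the oddball.
    Density is multiplied by `step_ratio` after two consecutive correct
    responses (harder) and divided by it after one incorrect response
    (easier), converging on 70.7% correct; the run threshold is the mean
    of the last `n_average` of `n_reversals` reversals.
    """
    density, correct_in_a_row, last_dir = start_density, 0, None
    reversals = []
    while len(reversals) < n_reversals:
        if respond(density):
            correct_in_a_row += 1
            if correct_in_a_row < 2:
                continue
            correct_in_a_row, direction = 0, +1     # more ripples/octave
        else:
            correct_in_a_row, direction = 0, -1     # fewer ripples/octave
        if last_dir is not None and direction != last_dir:
            reversals.append(density)
        last_dir = direction
        density = float(np.clip(density * step_ratio ** direction, lo, hi))
    return float(np.mean(reversals[-n_average:]))

# Demo with a simulated listener who is reliable below about 3 ripples/octave;
# a subject's reported threshold is the mean of six such runs.
rng = np.random.default_rng(1)
fake_listener = lambda d: rng.random() < (0.95 if d < 3.0 else 1.0 / 3.0)
print(round(ripple_threshold_run(fake_listener), 2))
```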
Participants were also tested on speech recognition and temporal modulation detection. There is evidence that in addition to significant correlations with spectral-ripple discrimination thresholds (Henry and Turner, 2003; Won et al., 2007), speech results are also correlated with modulation detection thresholds (MDTs), and the variance in speech recognition explained by the combination of spectral-ripple discrimination thresholds and MDTs is greater than the variance explained by either threshold alone (Won et al., 2011a). Consonant–Nucleus–Consonant (CNC) word scores had previously been collected in these listeners with a HiRes strategy. Two sets of 50 CNC monosyllabic words (Peterson and Lehiste, 1962) were presented in a quiet background at 62 dBA from a single loudspeaker positioned 1 m in front of the subject. Two CNC word lists were randomly chosen out of 10 lists for each subject. The subjects were instructed to repeat the word that they heard. A total percent correct score was calculated after 100 presentations as the percent of words correctly repeated. Temporal MDTs had been collected previously in these listeners with a HiRes strategy. MDTs in dB relative to 100% modulation [20log10(mi)] were obtained. The basic approach follows that of Bacon and Viemeister (1985), and the details of the method are the same as Won et al. (2011a) except that only a single modulation frequency of 50 Hz was tested. A two-interval, two-alternative forced-choice (AFC) procedure was used to measure MDTs. Stimuli were presented at 65 dBA. During one of the two 1-s observation intervals, the carrier was sinusoidally amplitude modulated. The subjects were instructed to choose the interval that contained the modulated noise. Visual feedback of the correct answer was given after each presentation. A two-down, one-up adaptive procedure was used to measure the modulation depth (mi) threshold, converging on 70.7% correct performance (Levitt, 1971), starting with a modulation depth of 100% and changing in steps of 4 dB from the first to the fourth reversal, and 2 dB for the next 10 reversals. For each tracking history, the final 10 reversals were averaged to obtain the MDT for that tracking history. The threshold for each subject was calculated as the mean of three tracking histories.
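Because the MDT adaptive rule parallels the tracking sketches above, the example below shows only how a track's reversal values (modulation indices mi) would be converted to dB re: 100% modulation and averaged. The reversal values themselves are made up, and averaging in dB is an assumption consistent with the dB step sizes described in the text.

```python
import numpy as np

def mdt_db(modulation_index):
    """Modulation detection threshold in dB re: 100% modulation, 20*log10(mi)."""
    return 20.0 * np.log10(modulation_index)

# Reversal depths (modulation indices mi) from one hypothetical track; the
# final 10 reversals are averaged to give that track's MDT, and the subject's
# MDT is the mean of three tracks.
reversal_depths = np.array([0.50, 0.32, 0.40, 0.25, 0.32, 0.20, 0.25,
                            0.16, 0.20, 0.16, 0.13, 0.16])
track_mdt = mdt_db(reversal_depths[-10:]).mean()
print(round(track_mdt, 1))   # about -13.5 dB for these made-up values
```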
Results and discussion
A comparison of mean interaction indices for the tested CI users with their spectral-ripple discrimination thresholds is shown at electrode separations of one, three, and five electrodes in Figs. 5a through 5c. Four CI users were tested at electrode separations of one and five electrodes [Figs. 5a and 5c]. Six CI users (seven ears) were tested at an electrode separation of three electrodes [Fig. 5b]. The individual measurements of spectral-ripple discrimination thresholds and interaction indices at each electrode separation are shown by small symbols, and the means for each subject are shown by large symbols. Spectral-ripple discrimination thresholds of CI users decrease with increasing mean interaction index.
There is support in the literature for the view that performance in the spectral-ripple discrimination task can be predicted from an average across the electrode array rather than from any one location along the electrode array (see Sec. 1). Nonetheless the variability across the electrode array within individual subjects could still play a role in performance. To allow for this possibility, the tests for a relationship between spectral-ripple discrimination and channel interactions were performed using a bootstrapping approach, which employs sampling of the variable interaction indices within a single subject. Specifically, interaction indices for each listener were randomly selected with replacement, in a bootstrapping procedure with 10 000 repetitions. In each repetition, the correlation coefficient was calculated between spectral-ripple discrimination scores and the subject means of the randomly selected interaction indices. There was a significant, negative correlation between spectral-ripple discrimination performance and mean interaction index at electrode separations of one electrode (r = −0.97, P < 0.001), three electrodes (r = −0.77, P < 0.001), and five electrodes (r = −0.92, P < 0.001). Note that the number of subjects is higher for tests at an electrode separation of three electrodes than at separations of one and five electrodes. For this correlational analysis at three different electrode separations, the criterion for statistical significance was divided by 3 (i.e., α* = 0.0167) to correct for the number of comparisons. The negative sign of the correlation coefficients indicates that performance on the spectral-ripple discrimination test is impaired with increasing channel interactions; this is consistent with the hypothesis.
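The bootstrapping analysis can be sketched in Python as below. The resampling with replacement of each listener's interaction indices, the recomputation of subject means, the Pearson correlation with spectral-ripple thresholds, and the 10 000 repetitions follow the description above; the function name and the toy data are hypothetical, and the way the reported r and P values were summarized from the bootstrap distribution is not reproduced here.

```python
import numpy as np

def bootstrap_correlations(interaction_indices, ripple_thresholds,
                           n_reps=10_000, seed=0):
    """Bootstrap the correlation between subject-mean interaction indices
    and spectral-ripple thresholds (sketch of the analysis described above).

    interaction_indices: one 1-D array per listener, holding that listener's
        interaction indices at a single electrode separation.
    ripple_thresholds: one spectral-ripple threshold per listener.

    Each repetition resamples every listener's indices with replacement,
    recomputes the subject means, and takes the Pearson correlation of
    those means with the ripple thresholds.
    """
    rng = np.random.default_rng(seed)
    ripple = np.asarray(ripple_thresholds, dtype=float)
    rs = np.empty(n_reps)
    for i in range(n_reps):
        means = [rng.choice(ii, size=len(ii), replace=True).mean()
                 for ii in interaction_indices]
        rs[i] = np.corrcoef(means, ripple)[0, 1]
    return rs

# Toy data for three hypothetical listeners (illustration only):
indices = [np.array([0.2, 0.3, 0.1]),
           np.array([0.5, 0.6, 0.4]),
           np.array([0.8, 0.9, 0.7])]
thresholds = [4.0, 2.5, 1.2]
print(round(bootstrap_correlations(indices, thresholds).mean(), 2))  # strongly negative
```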
The key finding is that there was a significant, strong correlation of spectral-ripple discrimination thresholds with the mean interaction index at multiple electrode separations. It was expected that the correlation between the mean interaction index and ripple discrimination might be constrained by a ceiling effect at the smallest electrode separation (one electrode) and by floor effects at moderate to large electrode separations. However, the correlation was significant at these electrode separations. There was also some indication of a relationship between the mean interaction index and ripple thresholds at the two largest electrode separations of nine and 15 electrodes (data not shown), but statistical comparisons were not planned at these large electrode separations due to the small number of data points at these separations and to limit the correction for the number of comparisons. The significant negative correlations observed at electrode separations of one, three, and five electrodes indicate that the spectral-ripple discrimination test does assess the spectral resolution of CI users.
The mean interaction indices at an electrode separation of three electrodes were compared to CNC word scores and MDTs in these same listeners using a bootstrapping approach as described in the preceding text. Mean interaction indices in seven ears were significantly correlated with CNC word scores (r = 0.43; P < 0.005), and the magnitude of the correlation was moderate. This is as expected because speech understanding is thought to depend on other cues (e.g., temporal) in addition to spectral cues. As regards the comparison between mean interaction indices and MDTs, the correlation magnitude was small (r = 0.14) and not statistically significant. The finding that spectral-ripple discrimination thresholds were significantly correlated with the mean of interactions across the electrode array, while MDTs were not, is consistent with the view that spectral-ripple discrimination thresholds provide information about the spectral resolution of CI users.
GENERAL DISCUSSION
The spectral-ripple discrimination test has several properties that are desirable for use in the evaluation and development of speech processing algorithms or in the selection of processing strategies for individual patients. These include its nonlinguistic nature, a significant correlation with multiple clinically pertinent outcomes, and reliability of thresholds between initial and subsequent testing (Henry and Turner, 2003; Henry et al., 2005; Won et al., 2007; Berenstein et al., 2008; Drennan et al., 2010; Won et al., 2010). However, other key attributes of this “one-point” test, its ease of use and short duration, are connected to concerns that have been raised regarding the interpretation of spectral-ripple discrimination thresholds as a useful metric of spectral resolution. As was discussed in Sec. 1, the spectral resolution of CI users has typically been assessed with time-intensive testing paradigms in which electrical stimuli are specified by the experimenter. However, in the spectral-ripple discrimination test, the experimenter varies a parameter of an acoustic stimulus, while the electrical stimulus depends on the patient's clinical speech processor. The present study addressed these concerns by comparing performance in the spectral-ripple discrimination test to extensive psychophysical measurements of the interaction index under the controlled conditions afforded by direct stimulation with a research processor.
Results of these measurements suggest that spectral-ripple discrimination thresholds depend on spread of current along the cochlea to surviving neural afferents. This relationship was surprisingly robust. Contrary to the authors' expectation that the correlation between the mean interaction index and spectral-ripple discrimination thresholds might be reduced by ceiling effects at the smallest electrode separation and floor effects at moderate to large electrode separations, the mean interaction index was significantly correlated with spectral-ripple discrimination thresholds not only at a separation of three electrodes but also at separations of one and five electrodes. This correspondence with spread of current along the cochlea indicates that spectral-ripple discrimination thresholds do reflect the spectral resolution of CI users. Further support for this interpretation comes from the observation that the mean interaction index was not significantly correlated with temporal modulation detection thresholds (MDTs) at a 50-Hz modulation. The latter result parallels the finding by Won et al. (2011a) that spectral-ripple discrimination thresholds were not correlated with MDTs, and it is consistent with their observation that a significantly greater share of the variance in the CNC word scores of 24 CI users could be accounted for by a combination of spectral-ripple discrimination thresholds and MDTs than by the (primarily temporal) MDT measure alone. Taken together these results strongly suggest that the information contributed by spectral-ripple discrimination thresholds is spectral.
This result is consistent with the conclusions of previous studies that found significant correlations of spectral-ripple discrimination thresholds with speech understanding (e.g., Henry and Turner, 2003; Henry et al., 2005; Won et al., 2007). Nonetheless, there is a view among some CI researchers that the prediction of speech outcomes should be based on testing with speech materials (e.g., McKay et al., 2009). However, there would be numerous advantages to identifying a set of psychophysical tests that, collectively, could account for a large share of the variance in speech outcomes. The drawbacks of relying on a speech corpus include the need to develop test materials for tens or even hundreds of languages, the resulting difficulty of standardizing and comparing across various language versions of the test, and the long period of time required from when a CI user is fitted with a new speech processing strategy until the listener's improvement in speech understanding stabilizes. For example, Donaldson and Nelson (2000) found that place-pitch sensitivity was significantly correlated with long-term speech outcomes but not with performance after 1 month's use of a new speech processing strategy. Psychophysical tests, on the other hand, are “portable” across languages, and there is evidence that performance on such tests is relatively stable compared to their improvements in speech perception over time (Brown et al., 1995; Fu et al., 2002; Drennan et al., 2011). Moreover, there is the potential to tailor psychophysical tests such that individual tests are geared to different types of cues. Finally, psychophysical tests could also be useful for predicting outcomes in other areas such as music perception (e.g., Drennan and Rubinstein, 2008; Won et al., 2010).
Azadpour and McKay (2012) concluded that spectral resolution might have very limited relevance to speech understanding in CI users, although it must be emphasized that they compared speech understanding to spatial resolution about a single electrode. Their result contrasts with the current study, in which CNC word scores were significantly correlated with mean interaction indices. The current results are consistent with results of numerous published studies that have suggested a relationship between speech understanding and spectral resolution (e.g., Dorman et al., 1990; Busby et al., 1993; Nelson et al., 1995; Throckmorton and Collins, 1999; Donaldson and Nelson, 2000; Henry et al., 2000; Litvak et al., 2007). Although there is not complete agreement in the published data (e.g., Zwolan et al., 1997), overall there is considerable support in the literature for the view that spectral resolution is relevant to speech understanding in CI users.
Azadpour and McKay (2012) discussed performance in tests with spectrally modulated noise in terms of a potential for unknown contributions of contaminating cues. However, it appears that use of the “contaminating” label does not result in meaningful distinctions among tests. Furthermore, it may overshadow crucial concerns such as time efficiency and potential clinical applicability. In light of the evidence that the spectral-ripple discrimination test provides a useful metric of the overall spectral resolution of CI users, these issues appear to have little relevance for practical applications of the test.
Given that the time requirements for measuring a large set of interaction indices render this approach impractical for most uses, the question of whether the channel interactions of individual CI users can be adequately characterized with more time-efficient measures such as spectral-ripple discrimination thresholds is of considerable practical importance. Results of the current study suggest that good results can be obtained by measuring the interaction index at just one electrode separation (e.g., at a separation of one, three, or five electrodes) provided that multiple probe and perturbation electrodes along the electrode array are tested. Nonetheless, that would still require many hours of testing. The published literature on the interaction index includes far smaller data sets than the current study (e.g., Boëx et al., 2003; Stickney et al., 2006). However, the interpretation of the results, particularly negative results, might not always be clear when the matrix of all possible electrode pairings is sampled quite sparsely. In other words, quite large data sets might be necessary to adequately characterize channel interactions using the interaction index. Thus approaches such as the spectral-ripple discrimination test could offer a far more efficient way to evaluate average channel interactions and spectral resolution across the electrode array.
ACKNOWLEDGMENTS
We are grateful for the extraordinarily dedicated efforts of our subjects. This research was supported by NIH-NIDCD Grants F32 DC011431, T32-DC000033, R01 DC007525, P30 DC04661, and L30 DC008490 and an educational fellowship from the Advanced Bionics Corporation.
Portions of this work were presented at the 161st meeting of the Acoustical Society of America, Seattle, WA, May 23–27, 2011.
Footnotes
Although a majority of stimulus parameters in psychoacoustics are of such a nature that the task becomes more difficult as the tested parameter (e.g., stimulus level) decreases, the spectral-ripple discrimination task becomes more difficult as the ripple density increases. Thus Won et al. (2007) designated the procedure that converges to 70.7% correct as “two-up/one-down,” and higher spectral-ripple discrimination thresholds indicate better performance.
References
- Anderson, E. S., Oxenham, A. J., Kreft, H., Nelson, P. B., and Nelson, D. A. (2011). “Comparing spectral tuning curves, spectral ripple resolution, and speech perception in cochlear implant users,” J. Acoust. Soc. Am. 130, 364–375. 10.1121/1.3589255
- Azadpour, M., and McKay, C. M. (2012). “A psychophysical method for measuring spatial resolution in cochlear implants,” J. Assoc. Res. Otolaryngol. 13, 145–157. 10.1007/s10162-011-0294-z
- Bacon, S. P., and Viemeister, N. F. (1985). “Temporal modulation transfer functions in normal-hearing and hearing-impaired listeners,” Audiology 24, 117–134. 10.3109/00206098509081545
- Berenstein, C. K., Mens, L. H., Mulder, J. J., and Vanpoucke, F. J. (2008). “Current steering and current focusing in cochlear implants: Comparison of monopolar, tripolar, and virtual channel electrode configurations,” Ear Hear. 29(2), 250–260. 10.1097/AUD.0b013e3181645336
- Boëx, C., de Balthasar, C., Kós, M. I., and Pelizzone, M. (2003). “Electrical field interactions in different cochlear implant systems,” J. Acoust. Soc. Am. 114, 2049–2057. 10.1121/1.1610451
- Brown, C. J., Abbas, P. J., Bertschy, M., Tyler, R. S., Lowder, M., Takahashi, G., Purdy, S., and Gantz, B. J. (1995). “Longitudinal assessment of physiological and psychophysical measures in cochlear implant users,” Ear Hear. 16, 439–449. 10.1097/00003446-199510000-00001
- Busby, P. A., Tong, Y. C., and Clark, G. M. (1993). “Electrode position, repetition rate, and speech perception by early- and late-deafened cochlear implant patients,” J. Acoust. Soc. Am. 93, 1058–1067. 10.1121/1.405554
- Chatterjee, M., and Shannon, R. V. (1998). “Forward masked excitation patterns in multielectrode electrical stimulation,” J. Acoust. Soc. Am. 103, 2565–2572. 10.1121/1.422777
- de Balthasar, C., Boëx, C., Cosendai, G., Valentini, G., Sigrist, A., and Pelizzone, M. (2003). “Channel interactions with high-rate biphasic electrical stimulation in cochlear implant subjects,” Hear. Res. 182, 77–87. 10.1016/S0378-5955(03)00174-6
- Donaldson, G. S., and Nelson, D. A. (2000). “Place-pitch sensitivity and its relation to consonant recognition by cochlear implant listeners using the MPEAK and SPEAK speech processing strategies,” J. Acoust. Soc. Am. 107, 1645–1658. 10.1121/1.428449
- Dorman, M. F., Smith, L., McCandless, G., Dunnavant, G., Parkin, J., and Dankowski, K. (1990). “Pitch scaling and speech understanding by patients who use the Ineraid cochlear implant,” Ear Hear. 11, 310–315. 10.1097/00003446-199008000-00010
- Drennan, W. R., and Rubinstein, J. T. (2008). “Music perception in cochlear implant users and its relationship with psychophysical capabilities,” J. Rehabil. Res. Dev. 45, 779–790. 10.1682/JRRD.2007.08.0118
- Drennan, W. R., Won, J. H., Jameyson, M. E., and Rubinstein, J. T. (2011). “Stability of clinically meaningful, non-linguistic measures of hearing performance with a cochlear implant,” in the 2011 Conference on Implantable Auditory Prostheses, Pacific Grove, CA.
- Drennan, W. R., Won, J. H., Nie, K., Jameyson, M. E., and Rubinstein, J. T. (2010). “Sensitivity of psychophysical measures to signal processor modifications in cochlear implant users,” Hear. Res. 262, 1–8. 10.1016/j.heares.2010.02.003
- Eddington, D. K., and Whearty, M. (2001). “Electrode interaction and speech reception using lateral-wall and medial-wall electrode systems,” in the 2001 Conference on Implantable Auditory Prostheses, Pacific Grove, CA.
- Fu, Q.-J., Shannon, R. V., and Galvin, J. J., III (2002). “Perceptual learning following changes in the frequency-to-electrode assignment with the Nucleus-22 cochlear implant,” J. Acoust. Soc. Am. 112, 1664–1674. 10.1121/1.1502901
- Goupell, M. J., Laback, B., Majdak, P., and Baumgartner, W. D. (2008). “Current-level discrimination and spectral profile analysis in multi-channel electrical stimulation,” J. Acoust. Soc. Am. 124, 3142–3157. 10.1121/1.2981638
- Henry, B. A., McKay, C. M., McDermott, H. J., and Clark, G. M. (2000). “The relationship between speech perception and electrode discrimination in cochlear implantees,” J. Acoust. Soc. Am. 108, 1269–1280. 10.1121/1.1287711
- Henry, B. A., and Turner, C. W. (2003). “The resolution of complex spectral patterns by cochlear implant and normal-hearing listeners,” J. Acoust. Soc. Am. 113, 2861–2873. 10.1121/1.1561900
- Henry, B. A., Turner, C. W., and Behrens, A. (2005). “Spectral peak resolution and speech recognition in quiet: Normal hearing, hearing impaired, and cochlear implant listeners,” J. Acoust. Soc. Am. 118, 1111–1121. 10.1121/1.1944567
- Levitt, H. (1971). “Transformed up-down methods in psychoacoustics,” J. Acoust. Soc. Am. 49, 467–477. 10.1121/1.1912375
- Litvak, L. M., Spahr, A. J., Saoji, A. A., and Fridman, G. Y. (2007). “Relationship between perception of spectral ripple and speech recognition in cochlear implant and vocoder listeners,” J. Acoust. Soc. Am. 122, 982–991. 10.1121/1.2749413
- Macpherson, E. A., and Middlebrooks, J. C. (2003). “Vertical-plane sound localization probed with ripple-spectrum noise,” J. Acoust. Soc. Am. 114, 430–445. 10.1121/1.1582174
- McKay, C. M., Azadpour, M., and Akhoun, I. (2009). “In search of frequency resolution,” in the 2009 Conference on Implantable Auditory Prostheses, Tahoe City, CA, p. 54.
- Mehr, M. A., Turner, C. W., and Parkinson, A. (2001). “Channel weights for speech recognition in cochlear implant users,” J. Acoust. Soc. Am. 109, 359–366. 10.1121/1.1322021
- Miller, J., and Ulrich, R. (2001). “On the analysis of psychometric functions: The Spearman-Kärber method,” Percept. Psychophys. 63, 1399–1420. 10.3758/BF03194551
- Miller, J., and Ulrich, R. (2004). “A computer program for Spearman-Kärber and probit analysis of psychometric function data,” Behav. Res. Methods Instrum. Comput. 36, 11–16. 10.3758/BF03195545
- Nelson, D. A., Van Tasell, D. J., Schroder, A. C., Soli, S., and Levine, S. (1995). “Electrode ranking of place pitch and speech recognition in electrical hearing,” J. Acoust. Soc. Am. 98, 1987–1999. 10.1121/1.413317
- Peterson, G., and Lehiste, I. (1962). “Revised CNC lists for auditory tests,” J. Speech Hear. Disord. 27, 62–70.
- Rubinstein, J. T. (2012). “Cochlear implants: The hazards of unexpected success,” Can. Med. Assoc. J. 184, 1343–1344. 10.1503/cmaj.111743
- Saoji, A. A., Litvak, L. M., Spahr, A. J., and Eddins, D. A. (2009). “Spectral modulation detection and vowel and consonant identifications in cochlear implant listeners,” J. Acoust. Soc. Am. 126, 955–958. 10.1121/1.3179670
- Stickney, G. S., Loizou, P. C., Mishra, L. N., Assmann, P. F., Shannon, R. V., and Opie, J. M. (2006). “Effects of electrode design and configuration on channel interactions,” Hear. Res. 211, 33–45. 10.1016/j.heares.2005.08.008
- Summers, V., and Leek, M. R. (1994). “The internal representation of spectral contrast in hearing-impaired listeners,” J. Acoust. Soc. Am. 95, 3518–3528. 10.1121/1.409969
- Supin, A. Ya., Popov, V. V., Milekhina, O. N., and Tarakanov, M. B. (1994). “Frequency resolving power measured by rippled noise,” Hear. Res. 78, 31–40. 10.1016/0378-5955(94)90041-8
- Supin, A. Ya., Popov, V. V., Milekhina, O. N., and Tarakanov, M. B. (1997). “Frequency-temporal resolution of hearing measured by rippled noise,” Hear. Res. 108, 17–27. 10.1016/S0378-5955(97)00035-X
- Supin, A. Ya., Popov, V. V., Milekhina, O. N., and Tarakanov, M. B. (1999). “Ripple depth and density resolution of rippled noise,” J. Acoust. Soc. Am. 106, 2800–2804. 10.1121/1.428105
- Throckmorton, C. S., and Collins, L. M. (1999). “Investigation of the effects of temporal and spatial interactions on speech-recognition skills in cochlear-implant subjects,” J. Acoust. Soc. Am. 105, 861–873. 10.1121/1.426275
- Wilson, B. S., and Dorman, M. F. (2008). “Cochlear implants: A remarkable past and a brilliant future,” Hear. Res. 242, 3–21. 10.1016/j.heares.2008.06.005
- Won, J. H., Drennan, W. R., Kang, R. S., and Rubinstein, J. T. (2010). “Psychoacoustic abilities associated with music perception in cochlear implant users,” Ear Hear. 31, 796–805. 10.1097/AUD.0b013e3181e8b7bd
- Won, J. H., Drennan, W. R., Nie, K., Jameyson, E. M., and Rubinstein, J. T. (2011a). “Acoustic temporal modulation detection and speech perception in cochlear implant listeners,” J. Acoust. Soc. Am. 130, 376–388. 10.1121/1.3592521
- Won, J. H., Drennan, W. R., and Rubinstein, J. T. (2007). “Spectral-ripple resolution correlates with speech reception in noise in cochlear implant users,” J. Assoc. Res. Otolaryngol. 8, 384–392. 10.1007/s10162-007-0085-8
- Won, J. H., Jones, G. L., Drennan, W. R., Jameyson, E. M., and Rubinstein, J. T. (2011b). “Evidence of across-channel processing for spectral-ripple discrimination in cochlear implant listeners,” J. Acoust. Soc. Am. 130, 2088–2097. 10.1121/1.3624820
- Zeng, F.-G., Rebscher, S., Harrison, W., Sun, X., and Feng, H. (2008). “Cochlear implants: System design, integration, and evaluation,” IEEE Rev. Biomed. Eng. 1, 115–142. 10.1109/RBME.2008.2008250
- Zwolan, T. A., Collins, L. M., and Wakefield, G. H. (1997). “Electrode discrimination and speech recognition in postlingually deafened adult cochlear implant subjects,” J. Acoust. Soc. Am. 102, 3673–3685. 10.1121/1.420401