Abstract
Objectives.
The study objective was to characterize cochlear implant (CI) pitch perception for pure, complex, and modulated tones for frequencies and fundamental frequencies in the ecologically essential range between 110 and 440 Hz. Stimulus manipulations were used to examine CI users’ reliance on stimulation place and rate cues for pitch discrimination.
Design.
The study used a within-subjects design in which twenty-one CI users completed pitch discrimination measures using pure, complex, and modulated tones. Stimulus manipulations were used to test whether CI users have better pitch discrimination for low-pass compared to high-pass filtered harmonic complexes, and whether they have better pitch discrimination when provided a covarying place cue while listening to amplitude-modulated tones.
Results.
Averaged across conditions, participants had better pitch discrimination for pure tones compared to either complex or amplitude-modulated tones. Participants had better pitch discrimination for low-pass compared to high-pass harmonic complexes and better pitch discrimination for amplitude-modulated tones when provided a covarying place cue.
Conclusions.
CI users integrate place and rate cues across the ecologically essential pitch range between 110 and 440 Hz. We interpret the observed better pitch discrimination for low-pass compared to high-pass filtered harmonic complexes, and for amplitude-modulated tones when provided a covarying place cue, as evidence for the importance of providing place-of-excitation cues for fundamental frequencies below 440 Hz. The Discussion considers how such encoding could be implemented with existing devices.
Keywords: auditory neuroscience, cochlear implants, pitch perception, psychophysics
INTRODUCTION
Pitch perception is an intimate aspect of hearing, in that the pitch of a person’s voice provides a sense of familiarity and closeness (Belin et al. 2011; Belin et al. 2004). People who lose their hearing and then have partial restoration through cochlear implants (CIs) can obtain high levels of environmental awareness and speech comprehension (Niparko et al. 2010; Firszt et al. 2004). Pitch perception, however, tends to be poor even in recipients with the best outcomes and even among those who receive auditory training (Luo et al. 2019; Won et al. 2010; Gfeller et al. 2007). Consequently, aspects of hearing facilitated by pitch, such as music appreciation, speech comprehension in noise, and vocal emotion recognition, are diminished (Looi et al. 2012; Caldwell & Nittrouer, 2013; Deroche et al., 2014; do Nascimento & Bevilacqua, 2005; Q.-J. Fu & Nogaki, 2005; Gfeller et al., 2000; Gilbers et al., 2015; Luo et al., 2007; Bruns et al. 2016).
This article considers how electrical stimulation could be configured to improve pitch discrimination for CI users. It is well known that a sense of pitch can be provided by the location of implanted electrodes, with more deeply inserted electrodes producing a lower pitch (Shannon 1983; Landsberger & Galvin 2011). It is also well known that a sense of pitch can be provided by stimulation timing, with higher stimulation rates or higher modulation frequencies producing a higher pitch (Tong et al., 1982; Zeng, 2002). Modern CIs use both stimulation place and timing cues to provide pitch (Wouters et al., 2015), with place cues provided by tonotopic mapping of acoustic frequency to electrode position and timing cues usually provided by the modulation frequency of stimulation (Svirsky 2017; Wilson & Dorman 2008), though some devices use variable pulse rates (Vandali et al. 2005; Hochmair et al. 2015). While methods for providing place and timing cues for pitch are present in existing devices, there is considerable debate regarding the optimal use of these cues (Miyazono & Moore 2013; Cedolin & Delgutte 2010; Verschooten et al. 2019; Kong & Carlyon 2010; Oxenham et al. 2011; Marimuthu et al. 2016; Loeb 2005; Chatterjee & Oberzut, 2011; Chatterjee & Peng, 2008; Deroche et al., 2014; Q.-J. Fu, 2002; Q.-J. Fu et al., 2004; Geurts & Wouters, 2001, 2004; Shannon, 1992; Vandali & van Hoesel, 2012).
In normal hearing, it is difficult to characterize the contributions of place and timing cues to the salience of pitch (Plack & Oxenham 2005; Moore, 1995). Considering the simple case of pure tones, a pure tone of a given frequency excites a characteristic place in the cochlea, and hence in the auditory nerve, providing a consistent place cue for pitch (Joris et al. 2004). Likewise, a pure tone excites action potentials in the auditory nerve that are phase locked to frequencies up to 2 to 3 kHz, providing a consistent timing cue for pitch (Dynes & Delgutte 1992). It is unclear to what extent these tonotopic place and synchronous rate cues contribute to the pitch of pure tones. It is known, however, that the resulting resolution in normal hearing is exquisite, with listeners able to discriminate pure tones based on frequency differences of less than 1% for a wide range of conditions (Micheyl et al. 2012; Moore et al. 2009).
Most natural sounds that produce a strong sense of pitch are not pure tones but contain a complex structure of harmonics based on a fundamental frequency (McDermott & Oxenham 2008). As for pure tones, it is difficult in normal hearing to characterize the contributions of place and timing cues to complex pitch, though progress has been made concerning purely temporal cues for pitch using high-frequency filtering (Oxenham et al. 2004; Carlyon & Deeks 2002; Oxenham et al. 2011). Complex tones are composed of harmonic components that are integer multiples of the fundamental frequency. Since components are linearly spaced, and since cochlear filtering is approximately logarithmic, the lower harmonics are more finely represented as place cues; they are described as tonotopically resolved when sufficiently far apart to produce discernible peaks in the auditory nerve response (Plack & Oxenham 2005; Cariani & Delgutte 1996a). Studies have shown that pitch discrimination is better for resolved harmonics, which contain both place and timing cues, than for unresolved harmonics, which contain only timing cues (Moore & Sęk 2009; Houtsma & Smurzynski 1990).
These considerations are important for understanding and improving pitch discrimination for CI users, who are remarkably sensitive to place cues for pitch (Goldsworthy et al. 2013; Goldsworthy 2015; Padilla et al. 2017). Most CIs have between 12 and 22 intracochlear electrodes that are used to provide tonotopic stimulation. With so few electrodes, the frequency-to-electrode mapping typically places the center frequencies of adjacent electrodes about one-third octave apart (a 20 to 30% frequency difference). Despite this, the most sensitive CI users can distinguish pure tones based on frequency differences of about 1%, a sensitivity generally conveyed by place-of-excitation cues (Goldsworthy et al. 2013; Goldsworthy 2015; Swanson et al., 2009), though several newer strategies attempt to convey pitch cues using temporal or temporal-envelope fine structure (Goldsworthy 2022; Arnoldner et al. 2007; Wouters et al. 2015; Gazibegovic et al. 2010).
This remarkable pitch discrimination for pure tones unfortunately does not extend to complex tones. For complex tones with fundamental frequencies in the range of human voices (100 to 300 Hz), CI filtering does not provide a tonotopically resolved representation of harmonic structure (Swanson et al. 2019). Consequently, CI users rely on temporal cues for discriminating complex tones in this range. Temporal cues for pitch become less effective with increasing rate, with marked deterioration of resolution between 200 and 300 Hz. This leads many CI users to express frustration with melody recognition above middle C (~262 Hz) (Looi et al. 2012). It has been shown, however, that discrimination sometimes improves for fundamental frequencies above 300 Hz, likely because better access to place cues compensates for the impoverished rate cues (Swanson et al. 2019).
The study described here examined pitch sensitivity in CI users for pure, complex, and modulated tones. The methods were designed to characterize how CI users rely on stimulation place and timing cues for pitch. Our motivation was driven by evidence that CI users generally have better pitch discrimination for pure compared to complex or modulated tones (Goldsworthy 2015; Goldsworthy et al. 2013). Our theory is that providing a clear and consistent place cue for complex tones can improve pitch resolution. Two specific hypotheses were tested within this theory: first, that CI users have better pitch discrimination for low-pass compared to high-pass filtered complex tones; second, that CI users have better pitch discrimination for amplitude-modulated tones when provided a covarying place cue. Both hypotheses were supported. Extensive analyses of CI stimulation patterns, provided in an appendix, show that the primary difference between stimuli is an additional place cue. The Discussion considers alternative interpretations for why discrimination might be better for low-pass compared to high-pass filtered harmonic complexes, and better for amplitude-modulated tones with a covarying carrier frequency, but our interpretation is that the driving mechanism is the provision of place cues. The Discussion also considers how place cues could be better provided by existing CIs.
MATERIALS AND METHODS
Twenty-one adult CI users took part in an online listening experiment conducted remotely using a web application. Participants were recruited through IRB-approved flyers and announcements, which helped establish a CI user participant list. Recruitment took place over email, and instructions were given both over email and Zoom. Subjects confirmed their understanding of the protocol and testing. Participants were asked to connect to their computer as they normally would for telecommunications. Bilateral participants were given the option of testing each ear monaurally in sequence (“Both” as reported in Table 1) or binaurally (“Both Together” as reported in Table 1) through their usual method of streaming. Sequential monaural testing was preferred, but the binaural option was offered to improve retention among subjects who did not wish to complete the entire protocol twice. Participants with residual hearing were asked either to stream directly to their CI via Bluetooth or to remove their hearing aid. The one participant (Subject 5) who did not have access to streaming removed their hearing aid and listened in free field; this participant reported that their aided ear qualified for cochlear implantation. Participants were asked to use a website designed for listening experiments to complete approximately 90 minutes of listening exercises in one session, with breaks as needed, taking each test in the order presented. The listening exercises included pure tone loudness scaling, pure tone detection, pure tone frequency discrimination, complex tone fundamental frequency discrimination, and an exercise designed to test combined use of carrier and modulation frequency using amplitude-modulated tones. These measures characterize pure and complex tone frequency discrimination and test whether listeners benefit from covarying place and rate cues in a modulation frequency discrimination task.
For all listening exercises except for detection thresholds, the condition order was randomized for each repetition. An interface was provided before each run allowing participants to listen to the initial difference between high and low pitches for the condition. This online listening experiment was conducted using TeamHearing (https://www.teamhearing.org), an open-source web application for auditory rehabilitation and assessment.
Table 1. Subject information.
Age at time of testing and age at onset of hearing loss are given in years. Duration of profound hearing loss prior to implantation is given in years and was estimated from subject interviews.
ID | Age | Gender | Etiology | Ear Tested | Age at Onset | Duration of Deafness | Years Implanted | CI Company & Processor | Implant Model | CI Stimulation Strategy | Method of Streaming |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 49 | M | Meniere’s Disease | Both | 39 | L:1 R:4 | L:3 R:6 | Cochlear N7s | L:CI532 R:CI24RE (CA) | ACE | Mini Mic |
2 | 36 | F | Unknown | Both | 15 | L:5 R:1 | L:9 R:13 | Cochlear N7s | L:CI24RE (CA) R:CI24RE (CA) | ACE | Mini Mic2 |
3 | 75 | F | Progressive SNHL | Both Together | 40 | L:1 R:5 | L:21 R:17 | Cochlear N6s | L:CI24R (CS) R:CI24RE (CA) | ACE | Cochlear Binaural Cable |
5 | 83 | M | Noise Induced | Right | 40 | 20 | 13 | Cochlear N6 | CI24RE (CA) | ACE | Juster Multimedia Speaker SP-689 |
8 | 72 | F | Sudden SNHL | Right | 68 | 1 | 3 | Cochlear N7 | CI522 | ACE | N7 Bluetooth |
9 | 74 | M | Unknown | Both | Birth | L:10 R:61 | L:3 R:4 | Cochlear N7s | L:CI532 R:CI532 | ACE | Mini Mic2 |
10 | 46 | M | Ototoxic Medicine | Left | 12 | 1 | 33 | Cochlear N6 | CI22M-, USA | ACE | Mini Mic |
11 | 58 | F | Sudden SNHL | Right | 55 | 1 | 2 | Advanced Bionics Naida CI Q90 | HiRes Ultra 3D CI HIFocus SlimJ | HiRes Optima-S | AB Bluetooth |
12 | 48 | M | Unknown | Right | 40 | 2 | 1 | Cochlear N7 | CI532 | ACE | Mini Mic |
13 | 59 | M | Mumps Disease | Right | 14 | 42 | 3 | Med-EL Sonnet | Sonata | FS4 | I-loop streaming |
14 | 71 | F | Ototoxic Medicine | Both | 4 | L:54 R:33 | L:7 R:28 | L: A.B. Marvel R:A.B. Harmony | L: HiRes 90K HiFocus 1j R:C1 implant, Radial Bipolar Standard Electrode Array | L: HiRes Optima-S R: CIS | L: Bluetooth R: Sony WH-XB900N Headphones |
15 | 57 | M | Ototoxic Medicine | Left | 54 | 1 | 1 | Advanced Bionics Naida CI Q90 | HiRes Ultra 3D CI with HiFocus Mid-Scala Electrode | HiRes Optima-S | Bluetooth/Compilot |
16 | 65 | M | Ototoxic Medicine | Left | 38 | 5 | 18 | Cochlear N5 | CI24R (CS) | ACE | Sony MDR-D150 Headphones |
17 | 74 | F | Unknown | Both Together | Birth | L:9 R:9 | L:20 R:15 | Cochlear N6s | L:CI24R (CS) R:CI24RE (CA) | ACE | Free Field through HP Computer Speakers |
18 | 72 | F | Measles In Utero | Both Together | Birth | L:1 R:1 | L:12 R:10 | Cochlear N6s | L:CI24RE (CA) R:CI512 | ACE | Computer Speakers |
19 | 18 | F | Unknown | Both Together | Birth | L:1 R:5 | L:17 R:13 | Cochlear N7s | L:CI24R(ST) R:CI24RE(ST) | ACE | Mini Mic |
20 | 66 | F | Unknown | Both Together | 18 | L:14 R:16 | L:4 R:5 | L:Cochlear N6 R:Cochlear N7 | L:CI522 R:CI522 | ACE | Free Field through iPad Speakers |
21 | 65 | M | Autoimmune | Left | 19 | 1 | 6 | Cochlear N7 | CI612 | ACE | Mini Mic |
22 | 65 | F | Mumps Disease | Left | 5 | 58 | 2 | Cochlear N7 | CI512 | ACE | Mini Mic |
23 | 66 | M | Meniere’s Disease | Both Together | L:36 R:42 | L:1 R:7 | L:16 R:14 | Cochlear N6s | L:CI24RE (CA) R:CI24RE (CA) | ACE | Mini Mic |
24 | 62 | M | Progressive SNHL | Left | 57 | 1 | 2 | Med-EL Sonnet 2 | Mi 1200 Synchrony | FS4-P | Bluetooth/Roger Pen |
Participants
Twenty-one adult CI users took part in this study. Sixteen participants used Cochlear Corporation devices (one with an N22 implant), three used Advanced Bionics devices (one with a C1 implant), and two used MED-EL devices. Participant ages ranged from 18 to 83 years, with an average of 61 years and a standard deviation of 15 years. Participants provided informed consent and were paid for their participation. The experimental protocol was approved by the University of Southern California Institutional Review Board. Participant information is provided in Table 1.
Stimuli and Procedures
Calibration measures, including pure tone loudness scaling and pure tone detection thresholds, were collected to characterize the relative loudness of stimuli in subsequent psychoacoustic procedures. Pure tone loudness scaling of an 880 Hz sinusoid was measured to characterize loudness growth (see Supplemental Figure 1). Participants were provided with an application interface to set gain values that produced tones that were “Soft”, “Medium Soft”, “Medium”, and “Medium Loud” for an 880 Hz pure tone that was 400 ms in duration with 20 ms raised-cosine attack and release ramps. This measure, along with pure tone detection thresholds, serves as a perceptual anchor for the remaining procedures.
Pure tone detection was measured for 400 ms sinusoids with 20 ms raised-cosine attack and release ramps as a calibration procedure (see Supplemental Figure 1). These measures were collected to characterize relative loudness levels for remote testing, since all subsequent procedures were set to “comfortable” listening levels, which can then be referenced to sensation levels (dB SL). Detection thresholds were measured for octave frequencies from 110 to 7040 Hz. An application interface was provided at the beginning of each measurement run allowing participants to set the gain to be “soft but audible”. From the starting gain, detection thresholds were measured using a three-interval, three-alternative, forced-choice procedure in which two of the intervals contained silence and the target interval contained the gain-adjusted tone. The gain was reduced by one step following correct answers and increased by three steps following mistakes. The initial step was 6 dB but was decreased by 2 dB after each of the first two mistakes. A run continued until participants made four mistakes, and the average of the last four reversals was taken as the detection threshold. This procedure converges to 75% detection accuracy (Kaernbach 1991). Correct-answer feedback was provided on all trials of this and all subsequently described discrimination procedures.
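The adaptive track described above can be sketched as follows. This is an illustrative Python implementation, not the TeamHearing code; the `respond` callback standing in for the listener is hypothetical.

```python
def detection_staircase(initial_gain_db, respond):
    """Sketch of the weighted up-down track described above: the gain
    falls one step after a correct answer and rises three steps after a
    mistake, converging on 75% correct (Kaernbach 1991). `respond(gain)`
    is a hypothetical stand-in for the listener; it returns True when
    the listener answers the trial correctly."""
    gain = initial_gain_db
    step = 6.0                       # initial step size in dB
    mistakes = 0
    reversals = []
    direction = None                 # -1 while falling, +1 while rising
    while mistakes < 4:              # run ends after the fourth mistake
        correct = respond(gain)
        new_direction = -1 if correct else +1
        if direction is not None and new_direction != direction:
            reversals.append(gain)   # record gain at each reversal
        direction = new_direction
        if correct:
            gain -= step
        else:
            mistakes += 1
            if mistakes <= 2:
                step -= 2.0          # step shrinks 6 -> 4 -> 2 dB
            gain += 3 * step
    if not reversals:
        return gain
    # threshold = average of the last four reversals
    return sum(reversals[-4:]) / len(reversals[-4:])
```

With a deterministic responder that answers correctly whenever the gain is above some true threshold, the track settles near that threshold.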
Pure tone frequency discrimination was measured for frequencies near 110, 220, 440, 880, 1760, and 3520 Hz. Each condition was tested three times in random order. Stimuli were 400 ms sinusoids with 20 ms raised-cosine attack and release ramps. An application interface was provided at the beginning of a run allowing participants to set the stimulus gain to be “comfortable”. Discrimination thresholds were measured using a two-interval, two-alternative, forced-choice procedure with the target having an adaptively higher frequency.
Participants were instructed to choose the interval that was “higher in pitch”. At the beginning of a run, the adaptive frequency difference was 100% (an octave). This frequency difference was reduced by a fixed factor after correct answers and increased by a factor of 2 after mistakes. For each trial, a roved frequency value was selected from a quarter-octave-wide uniform distribution geometrically centered on the condition frequency. Relative to this roved frequency value, the standard frequency was lowered, and the target frequency raised, each by half of the adaptive frequency difference. The gains for the standard and target intervals were independently roved by 6 dB, also based on a uniform distribution, centered on the comfortable listening level. A run continued until participants made four mistakes, and the average of the last four reversals was taken as the discrimination threshold.
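The per-trial roving described above can be sketched as follows. This is illustrative Python; the symmetric (geometric) split of the adaptive difference between standard and target is an assumption, and `make_trial` is a hypothetical helper name.

```python
import random

def make_trial(center_freq, adaptive_pct, rove_db=6.0):
    """Sketch of one 2AFC trial's stimulus parameters.
    `adaptive_pct` is the current adaptive frequency difference in
    percent; the standard and target are placed symmetrically (in the
    geometric sense, an assumption) about a roved center frequency."""
    # Rove the center within a quarter-octave-wide uniform distribution,
    # geometrically centered on the condition frequency.
    roved = center_freq * 2.0 ** random.uniform(-0.125, 0.125)
    # Split the difference so that target / standard = 1 + pct / 100.
    ratio = (1.0 + adaptive_pct / 100.0) ** 0.5
    standard = roved / ratio
    target = roved * ratio
    # Independently rove presentation level by 6 dB (uniform, centered
    # on the comfortable listening level) for each interval.
    standard_gain = random.uniform(-rove_db / 2, rove_db / 2)
    target_gain = random.uniform(-rove_db / 2, rove_db / 2)
    return standard, target, standard_gain, target_gain
```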
Fundamental frequency discrimination was measured for harmonic complexes with fundamental frequencies near 110, 220, and 440 Hz for low-pass and high-pass filtered complexes. These fundamental frequencies were chosen as representative of spoken speech. The low-pass filtering condition was chosen as representative of spoken speech with a high-frequency roll-off; the high-pass filtering condition was chosen to characterize differences in apical versus basal stimulation (Swanson et al. 2019). Each condition was tested three times in random order.
Harmonic complexes were constructed in the frequency domain by summing harmonics in sine phase from the fundamental to 10 kHz with either a low-pass or high-pass filter applied. The form of the low-pass filter was:

G(f) = min(1, max(0, 2 − f / f_e)),

where G(f) is expressed as a linear multiplier applied to each harmonic, f is the frequency, and f_e is the edge frequency of the passband, which was set as 1 kHz for the low-pass filter. Note that, as thus defined, the low-pass filter gain is 0 above 2 kHz. The high-pass filter was similarly defined:

G(f) = min(1, max(0, 2 − f_e / f)),

with the edge frequency specified as 4 kHz, resulting in a gain of 0 below 2 kHz.
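A minimal sketch of this stimulus construction is shown below. The roll-off shape used here (linear in amplitude, reaching zero one octave beyond the edge frequency) is an assumption consistent with the stated zero-gain points; `harmonic_complex` is a hypothetical helper name.

```python
import numpy as np

def harmonic_complex(f0, fs=44100, dur=0.4, edge=1000.0, kind="low"):
    """Sketch of the stimuli: harmonics in sine phase from the
    fundamental to 10 kHz, each weighted by a low-pass or high-pass
    gain. The linear roll-off reaching zero an octave past the edge
    frequency is an assumption (it matches the stated zero-gain
    points: 0 above 2 kHz for a 1 kHz low-pass edge, 0 below 2 kHz
    for a 4 kHz high-pass edge)."""
    t = np.arange(int(fs * dur)) / fs
    signal = np.zeros_like(t)
    for h in range(1, int(10000 / f0) + 1):
        f = h * f0
        if kind == "low":
            gain = min(1.0, max(0.0, 2.0 - f / edge))   # 0 above 2*edge
        else:
            gain = min(1.0, max(0.0, 2.0 - edge / f))   # 0 below edge/2
        signal += gain * np.sin(2 * np.pi * f * t)
    return signal
```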
Fundamental frequency discrimination was measured using a two-interval, two-alternative, forced-choice procedure in which participants were instructed to choose the interval that was “higher in pitch”. The same adaptive and scoring logic described for pure tone frequency discrimination was used.
Modulation frequency discrimination was measured for sinusoidally amplitude-modulated (SAM) tones with and without a covarying carrier frequency. Discrimination thresholds were measured for modulation frequencies near 110, 220, and 440 Hz using a two-interval, two-alternative, forced-choice procedure in which participants were instructed to choose the interval that was “higher in pitch”. The same adaptive procedure and scoring logic were used as described above, with modulation frequencies roved within a quarter octave and stimulus gain independently roved by 6 dB for standard and target stimuli.
For half of the conditions, the carrier frequency of the SAM tone was four times the nominal modulation frequency. For example, for the condition where the nominal modulation frequency was 110 Hz, the carrier frequency was 440 Hz for both standard and target stimuli (i.e., generally inharmonic complex tones). For the other half of the conditions, the carrier frequency covaried with the trial-by-trial modulation frequencies of the standard and target stimuli. For example, if the standard and target modulation frequencies were 100 and 120 Hz, then the corresponding carrier frequencies would be 400 and 480 Hz, respectively (i.e., harmonic complex tones). Natural sounds are harmonic in that their components are integer multiples of the fundamental frequency; for example, for a natural periodic sound with a fundamental frequency of 110 Hz, harmonics occur at 220, 330, and 440 Hz, and so on. Neural decoding mechanisms may be tuned to expect such harmonic structure, which may lead to degraded performance with inharmonic stimuli (e.g., a modulation frequency of 110 Hz with a 500 Hz carrier), for which there is timing irregularity across modulation periods. Representative examples of the electrical stimulation patterns and modeled neural activity produced by pure, complex, and modulated tones are provided in the appendix.
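The two SAM conditions can be sketched as follows. This is illustrative Python; 100% modulation depth and the helper name `sam_tone` are assumptions not stated in the text.

```python
import numpy as np

def sam_tone(fm, fs=44100, dur=0.4, covary=True, nominal_fm=None):
    """Sketch of the SAM stimuli: a sinusoidal carrier, fully amplitude
    modulated at fm (100% depth is an assumption). With covary=True the
    carrier tracks 4 * fm on every trial (harmonic); otherwise it stays
    fixed at 4x the nominal condition frequency, which is generally
    inharmonic once fm is roved or adapted."""
    fc = 4.0 * fm if covary else 4.0 * (nominal_fm if nominal_fm else fm)
    t = np.arange(int(fs * dur)) / fs
    envelope = 0.5 * (1.0 + np.sin(2 * np.pi * fm * t))  # 100% SAM
    return envelope * np.sin(2 * np.pi * fc * t)
```

For example, `sam_tone(120.0, covary=False, nominal_fm=110.0)` keeps the carrier at 440 Hz while the modulation is 120 Hz, mirroring the fixed-carrier condition.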
Data Analysis
Each procedure was conducted with multiple frequencies (or fundamental or modulation frequencies) with three repetitions of each condition. Neither repetition nor any of the second-order interactions with repetition were significant, indicating little to no effect of familiarization as testing proceeded. Frequency discrimination thresholds were analyzed using a one-way repeated-measures analysis of variance with frequency as the factor. Fundamental and modulation frequency discrimination each included an additional factor. Fundamental frequency discrimination examined the effect of filtering, and measured thresholds were analyzed using a two-way repeated-measures analysis of variance with frequency and filtering type (low-pass versus high-pass) as factors, including their interaction. Modulation frequency discrimination examined the effect of covarying stimulation cues, and measured thresholds were analyzed using a two-way repeated-measures analysis of variance with frequency and stimulation type (covarying or not) as factors, including their interaction. Of interest is whether the effect size comparing filtering and stimulation types was affected by frequency; consequently, planned multiple comparisons of filtering and stimulation types were conducted based on the analysis of variance statistics with a 0.05 Tukey-Kramer critical value. Effect sizes are reported as Cohen’s d with 95% confidence intervals, estimated using a bootstrapping procedure, reported in brackets. Unplanned comparisons were explored with implant company as a grouping factor for all measures.
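The reported effect sizes and bootstrapped confidence intervals could be computed along the following lines. This is a sketch only: the exact Cohen's d variant and bootstrap details are not specified in the text, so a paired d (mean difference over the standard deviation of differences) and a percentile bootstrap are assumed.

```python
import random
import statistics

def cohens_d(x, y):
    """Paired Cohen's d: mean of the pairwise differences divided by
    their standard deviation (one common convention; an assumption)."""
    diffs = [a - b for a, b in zip(x, y)]
    return statistics.mean(diffs) / statistics.stdev(diffs)

def bootstrap_ci(x, y, n_boot=2000, alpha=0.05, seed=1):
    """Percentile-bootstrap confidence interval for paired Cohen's d,
    a sketch of the bracketed 95% intervals reported in the text."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        # Resample subject pairs with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        diffs = [x[i] - y[i] for i in idx]
        if statistics.stdev(diffs) == 0:
            continue  # degenerate resample; skip it
        stats.append(statistics.mean(diffs) / statistics.stdev(diffs))
    stats.sort()
    lo = stats[int(alpha / 2 * len(stats))]
    hi = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lo, hi
```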
Pearson correlation coefficients were calculated between discrimination thresholds for each procedure averaged across conditions for each subject.
In addition, unplanned correlations were calculated between discrimination thresholds for each procedure and measures of modulation detection and musical sophistication as reported in Camarena et al. (2022). The musical background and related skills of the participants were not assessed prior to that study. Sixteen of the participants in the present study also took part in Camarena et al. (2022). In that study, measures of modulation detection were predictive of consonance and dissonance perception. Other studies have found that modulation sensitivity is also predictive of speech comprehension in background noise. Consequently, we extend the analyses of the present article to consider correlations between pitch discrimination and modulation sensitivity. It is also well known that musicians have better pitch perception than non-musicians (Micheyl et al. 2006); consequently, we further make use of the results from Camarena et al. (2022) to test for correlations between pitch resolution and musical sophistication. Briefly, the reported modulation detection thresholds were measured for modulation frequencies of 10 and 110 Hz with a 1 kHz sinusoidal carrier. Musical sophistication was quantified using the Goldsmiths Musical Sophistication Index (Müllensiefen et al. 2014). Details are described in Camarena et al. (2022).
RESULTS
Calibration procedures.
See Supplemental Figure 1 for pure tone loudness scaling and detection thresholds, which characterize the stimulus levels of the psychoacoustic procedures relative to detection thresholds (sensation levels).
Pure tone frequency discrimination.
Figure 1 shows measured frequency discrimination thresholds for all participants. The differences in discrimination thresholds across frequencies were significant as determined by a one-way repeated-measures ANOVA. Discrimination was typically poor when measured near 110 Hz, with many participants reporting audibility issues for such low-frequency pure tones. Some participants commented that loudness differences between comparisons (even though stimulus intensity was roved by 6 dB) were a more effective cue for discrimination, since lower frequency sounds were often heard as softer. Consequently, particular caution is recommended when interpreting the results for the 110 Hz condition. Discrimination improved with increasing frequency from 110 to 880 Hz, with a best average performance of 3.1% at 880 Hz; 7 of the participants had thresholds better than 1% for that condition. Discrimination worsened above 880 Hz. These trends likely reflect the way acoustic frequency is mapped to electrode location. To illustrate, the default frequency allocations for Advanced Bionics, Cochlear, and MED-EL were used to express filter bandwidth as a percent of the center frequency and overlaid on Figure 1 for comparison.
Figure 1.
Pure tone frequency discrimination thresholds as percent difference on a logarithmic scale for frequencies from 110 to 3520 Hz. Smaller gray circles indicate discrimination thresholds for each participant averaged across measurement runs. Larger black circles indicate discrimination thresholds averaged across participants with error bars indicating standard errors of the means. The solid, dashed, and dotted black lines indicate the default filter bandwidth for Advanced Bionics, Cochlear, and MED-EL devices, respectively, expressed as a percent of the filter’s center frequency. The shaded region indicates 95% confidence intervals for normal hearing listeners as characterized by Micheyl et al., 2012. Color online: the color of smaller symbols indicates implant manufacturer: Cochlear Corporation (blue), Advanced Bionics Corporation (red), MED-EL Corporation (gold).
Fundamental frequency discrimination.
Figure 2 shows fundamental frequency discrimination thresholds for all participants. A two-way analysis of variance was conducted on discrimination thresholds with fundamental frequency and filtering condition (low-pass versus high-pass) as within-subject factors. The difference in discrimination across fundamental frequencies was significant. Multiple comparisons of frequencies using Bonferroni critical values confirmed that discrimination was poorer for higher fundamental frequencies. Discrimination was worse for high-pass than for low-pass filtered complexes. The interaction between fundamental frequency and filtering condition was not significant.
Figure 2.
Fundamental frequency discrimination as percent differences for fundamental frequencies from 110 to 440 Hz. Smaller symbols indicate thresholds for each participant averaged across measurement. Larger symbols indicate thresholds averaged across participants with error bars indicating standard errors of the means. Larger circles with error bars replot frequency discrimination thresholds from Figure 1 for 110, 220, and 440 Hz. Color online: the color of smaller symbols indicates implant manufacturer: Cochlear Corporation (blue), Advanced Bionics Corporation (red), MED-EL Corporation (gold).
Effect size was quantified using Cohen’s method, and planned multiple comparisons using Bonferroni critical values were conducted to compare discrimination for the low-pass and high-pass filtering conditions at each fundamental frequency. The effect ranged from small to medium for the comparison of low-pass and high-pass conditions near fundamental frequencies of 110, 220, and 440 Hz.
Modulation frequency discrimination with covarying cues.
Figure 3 shows modulation frequency discrimination thresholds for all participants. A two-way analysis of variance was conducted on discrimination thresholds with modulation frequency and stimulation type (covarying or not) as within-subject factors. The effect of modulation frequency was significant, with multiple comparisons using Bonferroni critical values confirming that discrimination was worse for higher modulation frequencies. The main hypothesis was supported, with better discrimination when modulation and carrier frequency covaried. The interaction between modulation frequency and stimulation type was also significant, apparent in that discrimination worsened with increasing modulation frequency for the modulation-frequency-only condition but not for the combined-cue condition.
Figure 3.
Modulation frequency discrimination thresholds as percent differences for modulation frequencies from 110 to 440 Hz. Smaller symbols indicate thresholds for each participant averaged across measurement runs. Larger symbols indicate thresholds averaged across participants with error bars indicating standard errors of the means. Larger circles with error bars replot frequency discrimination thresholds from Figure 1 for the 440, 880, and 1760 Hz conditions reflecting the carrier frequency used in the corresponding condition. Color online: the color of smaller symbols indicates implant manufacturer: Cochlear Corporation (blue), Advanced Bionics Corporation (red), MED-EL Corporation (gold).
Effect size was quantified using Cohen’s method, and planned multiple comparisons using Bonferroni critical values were conducted to compare discrimination for the two cue types at each modulation frequency. The effect was small near 110 Hz, with a 1.3% improvement leading to an average discrimination threshold of 4.1% with covarying cues. The effect was large near 220 and 440 Hz, with a 10.1% improvement leading to an average discrimination threshold of 4.4% for modulation frequencies near 220 Hz and a 39.2% improvement leading to an average discrimination threshold of 5.1% for modulation frequencies near 440 Hz. In summary, providing a covarying place cue yielded a large benefit for modulation frequency discrimination of SAM tones for modulation frequencies near 220 and 440 Hz.
Correlations between measures.
Figure 4 compares discrimination thresholds across procedures. Discrimination thresholds were well correlated for each comparison (all p < 0.001). Averaged across all conditions, thresholds were lower for pure tones than for either complex or modulated tones, and lower for complex tones than for modulated tones.
Figure 4.
Comparison of discrimination measures across procedures. Discrimination thresholds averaged across conditions (with a logarithmic transform) are plotted for each subject. Solid straight lines indicate regression lines that minimize the squared errors. Dashed lines are unity lines indicating where discrimination thresholds are the same for the two procedures. Dotted lines connect the data for the separately tested ears of bilateral implant users. Pearson correlation coefficients and corresponding significance levels are shown in the upper left corner of each comparison.
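The correlations in Figures 4 and 5 operate on logarithmically transformed thresholds. A minimal sketch of that computation (the per-subject threshold values are hypothetical, not the study data):

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical per-subject thresholds (% difference) for two procedures
pure_tone = [1.2, 3.5, 8.0, 20.0, 45.0]
complex_tone = [2.0, 5.0, 11.0, 30.0, 60.0]

# Correlate after a logarithmic transform so that subjects with very
# large thresholds do not dominate the correlation
r = pearson_r([math.log10(t) for t in pure_tone],
              [math.log10(t) for t in complex_tone])
print(round(r, 3))
```

The logarithmic transform reflects that discrimination thresholds span orders of magnitude across participants, so distances are more meaningful on a ratio scale.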
Correlations with modulation sensitivity and musicianship.
In CI signal processing, timing information is generally conveyed through amplitude modulation. Modulation sensitivity may therefore predict how well CI users perform on pitch perception tasks, and both modulation sensitivity and pitch perception can improve with musicianship. This motivated correlating the modulation detection results from Camarena et al., 2022 for a subset of subjects with the results of the present study. Figure 5 compares discrimination thresholds for each procedure with measures of modulation detection and musical sophistication as reported in Camarena et al., 2022, for a subset of 16 subjects (1 subject tested with both ears separately) who participated in both studies. These comparisons were made because Camarena and colleagues found strong correlations between both modulation sensitivity and musicianship and a wide range of perceptual measures. Here, the results indicate strong correlations between pitch discrimination and modulation detection (all p < 0.001), but no significant correlations between pitch discrimination and musical sophistication based on the composite score of the Goldsmiths Musical Sophistication Index. While only a weak trend, the relationship between pitch discrimination and musical sophistication is in the expected direction, with the more musically sophisticated participants having lower discrimination thresholds.
Figure 5.
Comparison of frequency, fundamental frequency, and modulation frequency discrimination thresholds with modulation detection thresholds and with musical sophistication as reported in Camarena et al., 2021. Discrimination thresholds averaged across conditions (with a logarithmic transform) are plotted for each subject. Solid straight lines indicate regression lines that minimize total squared error. Dotted lines connect the data for the separately tested ears of bilateral implant users. Pearson correlation coefficients and corresponding significance levels are shown in the upper left corner of each comparison.
DISCUSSION
The reported study characterizes pitch perception of CI users for frequencies and fundamental frequencies near 110, 220, and 440 Hz (corresponding to musical note ranges centered on A2, A3, and A4, respectively, in Western music notation). This range is of special interest for several reasons. Musically, this range spans the bass and treble clefs, encompassing middle C, with all musical instruments represented in this range. Ecologically, this range spans the essential range of the speaking voice, with the average fundamental frequency of male talkers near 110 Hz, female talkers near 220 Hz, and children ranging from 220 to 440 Hz (Stevens 1998). In terms of stimulation cues for CIs, it has been well described in previous studies that the timing cues for pitch degrade near 220 Hz while resolved place-of-excitation cues emerge near 440 Hz (Marozeau et al. 2014; Swanson et al. 2019). For these reasons, it is important to carefully consider the extent to which pitch sensitivity in this range varies between CI users. The two main limitations of the present study are the limited information supplied by the CI processor, compared to the finely controlled stimuli of electrode psychophysics, and the experience each CI user has with these degraded cues. Furthermore, sound processing for CIs is continually evolving to better provide both place and precise timing cues for pitch. As sound processing evolves, plasticity will be an important avenue to explore. Understanding the accessibility of these cues should inform better sound-processing design.
The results of the present study clarify how pitch is perceived by CI users for pure, complex, and modulated tones. For pure tones, frequency discrimination varies widely across listeners, with discrimination thresholds ranging from an octave (100% difference) to substantially less than a semitone (a semitone corresponding to roughly a 6% difference). The best discrimination for pure tones was observed near 880 Hz, with the best performers falling within the range observed in normal-hearing listeners (Micheyl et al. 2012). A relevant finding of the present study is that frequency discrimination closely tracks the frequency allocation provided by clinical programming. This suggests that discrimination in the 110 to 440 Hz range, which is poorly represented by how acoustic frequency is mapped to electrode location, could be improved by increasing the density of frequency allocation within that range (Bissmeyer & Goldsworthy 2022). However, doing so would require reducing the density of frequency allocation at higher frequencies. One potential solution to this issue is Advanced Bionics phantom stimulation, which adds a low-frequency channel to the filter bank without sacrificing resolution by changing the frequency allocation of the higher channels (de Jong et al., 2020). The exceptional resolution provided near 880 Hz characterized here is a purposeful design consideration, as it provides essential resolution for encoding the first and second formant regions of speech. The extent to which these two design objectives can be balanced is an ongoing topic of interest, with some authors suggesting CI programming specifically designed for music (Laneau et al., 2006; Wouters et al., 2015).
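The interval-to-percent conversions used throughout (a semitone as roughly a 6% difference, an octave as a 100% difference) are easy to verify:

```python
# Converting musical intervals to percent frequency differences.
# A semitone is a frequency ratio of 2**(1/12); an octave is a ratio of 2.
semitone_pct = (2 ** (1 / 12) - 1) * 100   # ~5.95%, the "roughly 6%" above
octave_pct = (2 - 1) * 100                 # 100%
print(round(semitone_pct, 2), octave_pct)

# For example, a threshold of one semitone at 220 Hz means frequency
# differences smaller than about 13 Hz cannot be reliably discriminated.
print(round(220 * semitone_pct / 100, 1))
```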
The results described here also provide insight into how the pitch of complex tones changes across the 110 to 440 Hz range. It is known that pitch discrimination tends to deteriorate across this region, but with highly variable results (Zaltz et al. 2018; Goldsworthy 2015; Goldsworthy et al. 2013). The results of the present study show that many CI users can consistently discriminate pitches, with some having discrimination thresholds near a 1% difference. The temporal cues for pitch are poorly encoded by signal processing for CIs, and some argue that even if timing cues were encoded, they might not be perceptible (Carlyon et al., 2010; Zeng, 2002). But stimulation place cues emerge with some fidelity near 440 Hz (Marozeau et al. 2014; Swanson et al. 2019), and the evidence of the present study clearly indicates that many CI users can use these cues to make pitch judgments. Several participants who used Cochlear devices could consistently hear frequency differences near or below 1% even though their devices completely discard fine timing cues. In the absence of fine timing cues, participants are likely relying on either small variations in the fractional charge delivered across multiple electrodes or subtle changes in the rate of amplitude modulation to hear out frequency differences.
The results from the discrimination procedures for modulated tones further clarify the roles of stimulation place and timing cues for pitch perception in CI users. Modulated tones with a fixed carrier frequency do not have place-of-excitation cues except for weakly encoded sideband cues. These sideband cues are generally difficult to perceive, and in the present study they were controlled for by roving the carrier frequency. Consequently, pitch discrimination for modulated tones is primarily driven by timing cues associated with the periodicity of the modulation frequency. Without a covarying place cue, pitch discrimination clearly deteriorates across the 110 to 440 Hz range, dramatically worsening from average discrimination thresholds of 5.4% difference near 110 Hz (better than a semitone) to 44.3% difference near 440 Hz (worse than half an octave). But with the covarying place cue, pitch discrimination is consistently near a semitone in resolution.
An essential question related to covarying place and timing cues to encode fundamental frequency is to what extent these two cues can be combined into a salient percept. In normal hearing, when encoding low frequencies, place and timing cues in the auditory nerve are inextricably linked. Increasing the frequency of a pure tone, or increasing the fundamental frequency of a harmonic complex, necessarily changes both the tonotopic place and temporal responses of the nerve in a consistent manner (Cariani & Delgutte 1996a; Cariani & Delgutte 1996b). The extent to which place and timing cues contribute to a shared decoding mechanism for pitch is unknown. Still considering normal hearing, the temporal fine structure of the higher harmonics of a complex tone is not well coded into synchronous firing of the auditory nerve (Dynes & Delgutte 1992; Verschooten et al. 2019); however, the relatively slow temporal envelope of these higher harmonics encodes fundamental frequency into timing cues. Thus, in normal hearing, two quite different roles of place and timing cues emerge for complex sounds. For low frequencies, place and timing cues are inextricably linked to fundamental frequency—and to each other—and likely contribute to a common decoding mechanism for pitch. For high frequencies, in contrast, the temporal envelope encodes fundamental frequency while place cues independently encode the spectral profile, the percept of which is generally referred to as timbre (Allen et al. 2017).
It is unclear to what extent place and timing cues can provide similar pitch percepts for CI users. Some studies have shown that small changes in stimulation place can be offset by small changes in stimulation rate (Rader et al. 2016; Stohl et al. 2008; Bhattacharya et al. 2011; Vandali et al. 2013). However, several studies have shown that the percepts evoked by stimulation place differ from those evoked by stimulation rate (Landsberger et al. 2016; Tong et al. 1982; McKay et al. 2000; Landsberger et al. 2018). Consequently, there is debate over whether the sense of pitch evoked by stimulation place would be better characterized as timbral brightness. This issue is not easy to resolve because of the role that experience plays in decoding pitch. In normal hearing, place and timing cues associated with low frequencies are inextricably linked from birth, which provides experience for any neural mechanism that jointly uses this information. Likewise, normal-hearing listeners have experience from birth listening to independent place and timing cues associated with high frequencies, allowing them to disambiguate aspects of pitch and timbre. CIs generally do not encode fundamental frequency into a consistent covarying representation of place and rate. Until they do, the extent to which these covarying cues could be learned to represent a common pitch percept remains uncertain. The most tantalizing evidence so far is that the Fine Structure Processing strategies implemented on MED-EL devices, which arguably provide access to covarying place and rate cues, lead to long-term gains in pitch perception after a year of experience (Roy et al. 2015).
While many challenges exist for optimally encoding the fundamental frequency of complex sounds into stimulation place and timing cues for CIs, the results of the present study indicate that the combination of cues can provide consistent and excellent resolution for CI users. The worthwhile end would be to improve pitch resolution for CI users, which would likely improve their music appreciation, speech recognition in noise, tonal language perception, and overall quality of life (Gfeller et al. 2019).
Supplementary Material
Figure A1. Electrical stimulation patterns and modeled neural response for pure tones at frequencies of 220, 440, and 880 Hz. Stimuli are shorter in duration than those used in the described behavioral studies to provide better visual resolution of the temporal dynamics. Three successive tone pips, 30 ms in duration, were used to probe the signal processing emulations provided by each implant manufacturer.
Figure A2. As Figure A1, but for low pass filtered harmonic complexes for fundamental frequencies of 110, 220, and 440 Hz. Individual tone pips were 70 ms in duration, shorter than used in the behavioral studies, to provide better visual resolution of the temporal dynamics.
Figure A3. As Figure A2, but for high pass filtered harmonic complexes.
Figure A4. As Figure A3, but for sinusoidally amplitude modulated tones.
Figure A5. As Figure A4, but with covarying control over the carrier frequency of the modulated tone.
Figure A6. Synchrony quantified as vector strength versus frequency, or fundamental frequency, for pure, complex, and modulated tones.
APPENDIX
The aim of this study was to characterize the pitch resolution of CI users for pure, complex, and modulated tones. Our motivation was to better understand how CI users rely on place and rate cues of electrical stimulation. Pure tones were used as probes that primarily provide place cues, with timing cues discarded by Cochlear devices and variable amounts of temporal cues provided by Advanced Bionics and MED-EL devices. Low-pass filtered harmonic complexes were compared with high-pass filtered harmonic complexes, as often done in studies of normal hearing, since low-pass filtered harmonics provide place and rate cues while high-pass filtered harmonics primarily provide timing cues. Sinusoidally amplitude-modulated tones were used to provide timing cues, with comparison to amplitude-modulated tones with covarying carrier frequency to provide an additional place cue. These stimuli were processed through various signal-processing strategies to produce electrical stimulation; consequently, it is important to characterize which cues are available in the output stimulation for the different devices. To clarify how representative stimuli result in different stimulation patterns, stimuli were processed through signal processing emulations made available by the Advanced Bionics, Cochlear, and MED-EL corporations. Stimuli were analyzed in terms of place-of-excitation cues and the corresponding vector strength of synchrony between stimulation and the input frequency, or fundamental frequency, of interest.
Signal processing for Advanced Bionics devices was emulated using demonstration code developed and provided by Advanced Bionics as an emulation of Fidelity 120 with current steering. Signal processing for Cochlear devices was emulated using the Nucleus MATLAB Toolbox version 4.42 (Swanson & Mauch, 2006). Default parameters were used to emulate the Advanced Combination Encoder (ACE) strategy with 8 active electrodes per frame. Signal processing for MED-EL devices was emulated using simCoding software provided by MED-EL. Parameters were used to emulate the FS4-p coding strategy, which provides temporal fine structure on the first four apical channels.
Neural response to electrical stimulation was modeled using a model of current spread and a point process model of neural excitation (Goldwyn et al. 2012; Litvak et al. 2007). This modeling framework is as described in Goldsworthy, 2022. In brief, current spread was modeled using an inverse law, assuming the electrode array was linear and parallel to the modeled nerve fibers. Interelectrode spacing varies among companies, with Cochlear having a spacing of 0.4–0.9 mm (Arnoldner et al., 2007; Ertas et al., 2022; Fu & Shannon, 1999), Advanced Bionics a spacing of 0.85–3.0 mm (Ertas et al., 2022), and MED-EL a spacing of 1.9–2.4 mm (Arnoldner et al., 2007; Ertas et al., 2022; Shin et al., 2021; Vermeire et al., 2015). The rationale for using simple models of electrode geometry and summation of electric fields was described by Litvak et al., 2007. The modeled voltage after current spread was used as the input to a point process model of neuronal excitation as described by Goldwyn et al., 2012. The model consists of a cascade of linear and nonlinear stages associated with biophysical mechanisms of neuronal dynamics. A semi-analytical procedure determines parameters based on statistics of single fiber responses to electrical stimulation, including threshold, relative spread, jitter, and chronaxie. Refractory and summation effects that influence the responses of auditory nerve fibers to high-rate pulsatile stimulation are accounted for. For the present study, the distance from each electrode to the closest neuron was specified as 1 millimeter. The electrical current at the location of the nearest neuron was normalized to an input level of 1 milliampere. Neural locations were modeled for cochlear distances from 0 to 28 mm in increments of 0.2 mm. All model parameters were as described by Goldwyn et al., 2012.
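The current-spread stage can be illustrated with a minimal sketch using the inverse law and the geometry from the text. This is a simplification for intuition, not the Goldwyn et al. implementation; the normalization convention below is an assumption:

```python
import math

def voltage_at_neuron(electrode_mm, neuron_mm, current_ma=1.0,
                      electrode_to_nerve_mm=1.0):
    """Inverse-law current spread along a linear array assumed parallel
    to the nerve fibers. Normalized (by assumption) so a 1 mA input
    yields a value of 1 at the closest neuron, 1 mm away."""
    d = math.hypot(electrode_mm - neuron_mm, electrode_to_nerve_mm)
    return current_ma * electrode_to_nerve_mm / d

# Neural locations from 0 to 28 mm in 0.2 mm increments, as in the text
locations = [i * 0.2 for i in range(141)]
# Field from a single electrode at 14 mm; peaks opposite the electrode
field = [voltage_at_neuron(14.0, x) for x in locations]
print(round(max(field), 3))
```

The inverse law means the field decays slowly with distance along the array, which is why stimulation of one electrode excites a broad neural region.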
Figure A1 shows representative electrical stimulation and modeled neural response for pure tones at three representative frequencies (220, 440, and 880 Hz) for each of the three signal processing emulations made available by CI manufacturers. For a pure tone, stimulation for Advanced Bionics and Cochlear devices is relatively constrained to two or three electrodes compared to the broader stimulation pattern observed with MED-EL devices. However, the modeled neural response is comparably compact for MED-EL devices. With a longer electrode array, the modeled place of excitation for the 220 and 440 Hz tones extends beyond 20 mm into the cochlea, relatively deep compared to Advanced Bionics or Cochlear devices. Perhaps the starkest difference between manufacturers is the temporal encoding of frequency into stimulation and modeled neural response. Advanced Bionics provides some temporal encoding of frequency information, with lower rates observed for the 220 and 440 Hz tones compared to the 880 Hz tone, but it will be shown later in this appendix that the synchrony of stimulation to the input frequency is relatively low compared to MED-EL devices. The emulation of the Cochlear Corporation ACE processor does not provide temporal cues for frequency in the electrical stimulation, with identical stimulation rates used for all tones. The FS4 emulation of MED-EL processing does provide temporal cues for frequency, with the stimulation rate based on the temporal fine structure of the input tone frequency. These differences in how temporal cues are used to encode frequency are considered in more detail in Figure A6, which shows the synchrony of stimulation, quantified as vector strength, as a function of input frequency for each of the sound processing strategies.
Figure A2 shows representative stimulation patterns and modeled neural response for complex tones with low harmonics. This comparison of stimulation patterns shows that the primary cue provided by each strategy for the 110 Hz complex tone is the temporal periodicity of the firing pattern, which is clearly seen for each emulation. For the 220 and 440 Hz tones, the modulation depth of this stimulation periodicity diminishes (further analysis is provided in Figure A6). For the Advanced Bionics emulation, place pitch cues can be observed in the patterning of electrical stimulation across electrodes for the 220 and 440 Hz conditions. Analyses not presented here indicate that this place-cue patterning emerges for Cochlear devices near 300 Hz and for MED-EL devices near 440 Hz. In general, for complex tones with low harmonics, both place and timing cues for pitch are encoded in the resulting stimulation, but with the relative strength of the encoding dependent on the details of filtering.
Figure A3 shows stimulation patterns, like Figure A2, but for complex tones with high harmonics. Depending on the filter specifications of the sound processing emulation, the resulting temporal cues for pitch can be encoded with exceptional precision. The stimulation extends visibly deeper for Cochlear and MED-EL devices compared to that provided by Advanced Bionics, likely driven by the narrower spectral filtering used with Fidelity 120 compared to the filtering used with ACE and FS4. Notably, there are no place-of-excitation cues in the stimulation associated with the fundamental frequency, so this condition provides a clean comparison of performance with only temporal cues. Synchrony analysis is again provided in Figure A6.
Figure A4 shows electrical stimulation patterns and modeled neural response for sinusoidally amplitude modulated tones without a covarying carrier frequency. The representative modulated tones have modulation frequencies of 110, 220, and 440 Hz. Without covarying control over the carrier frequency, the center of the place-of-excitation cues remains the same across modulation frequencies. It is important to note, though, that for the higher modulation frequencies, the two sidebands produced by amplitude modulation, located at the carrier frequency plus and minus the modulation frequency, become noticeably resolved in the electrical stimulation pattern. These sidebands limit the efficacy of using modulated tones at higher modulation frequencies because they provide inconsistent place cues: the lower sideband moves down and the higher sideband moves up with increasing modulation frequency.
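The sideband geometry described above follows directly from the spectrum of a sinusoidally amplitude-modulated tone, which contains components at the carrier frequency and at the carrier plus and minus the modulation frequency. A small illustration (the 1760 Hz carrier is assumed for the example, not necessarily the study's value):

```python
def sam_components(carrier_hz, mod_hz):
    """Spectral components of a sinusoidally amplitude-modulated tone:
    (lower sideband, carrier, upper sideband)."""
    return (carrier_hz - mod_hz, carrier_hz, carrier_hz + mod_hz)

# As the modulation frequency increases, the lower sideband moves down
# and the upper sideband moves up, so their place cues diverge.
for fm in (110, 220, 440):
    print(fm, sam_components(1760, fm))
```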
Figure A5 shows analyses similar to Figure A4, but with covarying control over the carrier frequency of the modulated tone. With covarying control over the carrier frequency, a clear change in place of excitation occurs in the electrical stimulation pattern and modeled neural response. The temporal cues associated with modulation frequency are notably weaker than for complex tones. Further analysis of stimulation synchrony to the modulation frequency of interest is provided in Figure A6.
Figure A6 provides analysis details for the temporal cues provided by electrical stimulation for pure, complex, and modulated tones. Synchrony is quantified as the vector strength between electrical stimulation and the frequency, or fundamental frequency, of the input tone. For pure tones, the simulation of Advanced Bionics stimulation indicates some synchrony to the input stimuli for frequencies between 250 and 500 Hz, but the vector strength is less than 0.5. For the simulation of Cochlear stimulation, synchrony to the input pure tone frequency is completely discarded. In contrast, for MED-EL devices, synchrony to the input tone frequency is near 1 for all frequencies up to 880 Hz, which is the upper edge of the highest frequency filter used in fine structure processing. For both low and high pass filtered harmonic tones, the general trend observed when comparing simulations across input fundamental frequencies is that synchrony tends to be highest for MED-EL devices, followed by Cochlear, and then Advanced Bionics. For MED-EL devices and low harmonics, vector strength at the fundamental frequency jumps to values near 1, which is likely driven by the fine structure encoding subroutines of that simulation. Synchrony quantified as vector strength is comparatively poor for modulated tones relative to complex tones with multiple harmonics.
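Vector strength, as used in Figure A6, is the magnitude of the mean unit phasor of pulse times with respect to the target frequency. A minimal sketch of the standard computation, using synthetic pulse times rather than emulator output:

```python
import cmath
import math

def vector_strength(pulse_times_s, freq_hz):
    """Vector strength of event times relative to freq_hz: 1 means
    perfect phase locking, values near 0 mean no synchrony."""
    phasors = [cmath.exp(2j * math.pi * freq_hz * t) for t in pulse_times_s]
    return abs(sum(phasors)) / len(phasors)

# Pulses locked to every cycle of a 220 Hz tone: perfect synchrony
locked = [k / 220 for k in range(50)]
print(round(vector_strength(locked, 220), 3))  # 1.0

# Pulses at a fixed 900 pulse-per-second rate, unrelated to 220 Hz,
# as with a fixed-rate strategy that discards fine timing
fixed_rate = [k / 900 for k in range(50)]
print(round(vector_strength(fixed_rate, 220), 3))
```

This captures the contrast described above: fine-structure-locked stimulation yields vector strength near 1, while a fixed stimulation rate unrelated to the input frequency yields values near 0.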
These analyses were provided to better characterize the tonotopic place and synchronous rate cues available for the stimuli used in this study. A primary hypothesis of this study was that CI users would perform better with low pass compared to high-pass filtered harmonics because the former contains both place and rate cues in the excitation pattern whereas the latter only contains synchronous rate cues associated with the fundamental. The analysis of stimulation patterns provided here supports that interpretation of the behavioral results. A secondary hypothesis of this study was that CI users would perform better on pitch ranking of modulated tones when the carrier frequency of the modulated tone was covaried with the modulation frequency. The analysis of stimulation patterns provided here partially supports the interpretation that the additive cue is a tonotopic place of excitation cue; however, for high modulation frequencies, artifactual sideband cues arise in the place of excitation pattern that limit the use of modulated tones as probes of place versus rate cues.
Footnotes
Financial disclosures/conflicts of interest: This research was supported by a grant from the National Institute on Deafness and Other Communication Disorders (NIDCD) of the National Institutes of Health: R01 DC018701. The funding organization had no role in the design and conduct of the study; in the collection, analysis, and interpretation of the data; in the decision to submit the article for publication; or in the preparation, review, or approval of the article.
REFERENCES
- Allen EJ, Burton PC, Olman CA, et al. (2017). Representations of pitch and timbre variation in human auditory cortex. Journal of Neuroscience, 37, 1284–1293. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Arnoldner C, Riss D, Brunner M, et al. (2007). Speech and music perception with the new fine structure speech coding strategy: preliminary results. Acta Otolaryngol, 127, 1298–1303. Available at: 10.1080/00016480701275261. [DOI] [PubMed] [Google Scholar]
- Swanson B, H.M. (2006). Nucleus Matlab Toolbox. 4.20 Software User Manual Cochlear Ltd, Lane Cove NSW, Australia (2006). [Google Scholar]
- Belin P, Bestelmeyer PEG, Latinus M, et al. (2011). Understanding Voice Perception. British Journal of Psychology, 102, 711–725. [DOI] [PubMed] [Google Scholar]
- Belin P, Fecteau S, Bédard C (2004). Thinking the voice: neural correlates of voice perception. Trends Cogn Sci, 8, 129–135. Available at: https://pubmed.ncbi.nlm.nih.gov/15301753/ [Accessed August 2, 2022]. [DOI] [PubMed] [Google Scholar]
- Bhattacharya A, Vandali AE, Zeng F-G (2011). Combined spectral and temporal enhancement to improve cochlear-implant speech perception. J Acoust Soc Am, 130, 2951–2960. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bissmeyer SRS, Goldsworthy RL (2022). Combining Place and Rate of Stimulation Improves Frequency Discrimination in Cochlear Implant Users. Hear Res, 424, 108583. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bruns L, Mürbe D, Hahne A (2016). Understanding music with cochlear implants. Sci Rep, 6. Available at: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4997320/. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Camarena A, Manchala G, Papadopoulos J, et al. (2021). Pleasantness Ratings of Musical Dyads in Cochlear Implant Users. Brain Sci, 12, 33. Available at: https://www.mdpi.com/2076-3425/12/1/33. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cariani PA, Delgutte B (1996a). Neural correlates of the pitch of complex tones. I. Pitch and pitch salience. J Neurophysiol, 76, 1698–1716. [DOI] [PubMed] [Google Scholar]
- Cariani PA, Delgutte B (1996b). Neural correlates of the pitch of complex tones. II. Pitch shift, pitch ambiguity, phase invariance, pitch circularity, rate pitch, and the dominance region for pitch. J Neurophysiol, 76, 1717–1734. [DOI] [PubMed] [Google Scholar]
- Carlyon RP, Deeks JM (2002). Limitations on rate discrimination. J Acoust Soc Am, 112, 1009–1025. [DOI] [PubMed] [Google Scholar]
- Carlyon RP, Deeks JM, McKay CM (2010). The upper limit of temporal pitch for cochlear-implant listeners: Stimulus duration, conditioner pulses, and the number of electrodes stimulated. J Acoust Soc Am, 127, 1469–1478. [DOI] [PubMed] [Google Scholar]
- Cedolin L, Delgutte B (2010). Spatiotemporal representation of the pitch of harmonic complex tones in the auditory nerve. Journal of Neuroscience, 30, 12712–12724. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dynes SBC, Delgutte B (1992). Phase-locking of auditory-nerve discharges to sinusoidal electric stimulation of the cochlea. Hear Res, 58, 79–90. [DOI] [PubMed] [Google Scholar]
- Firszt JB, Holden LK, Skinner MW, et al. (2004). Recognition of speech presented at soft to loud levels by adult cochlear implant recipients of three cochlear implant systems. Ear Hear, 25, 375–387. [DOI] [PubMed] [Google Scholar]
- Gazibegovic D, Arnold L, Rocca C, et al. (2010). Evaluation of Music Perception in Adult Users of HiRes® 120 and Previous Generations of Advanced Bionics® Sound Coding Strategies. Cochlear Implants Int, 11, 296–301. Available at: 10.1179/146701010X12671177989354. [DOI] [PubMed] [Google Scholar]
- Gfeller KE, Mallalieu RMM, Mansouri A, et al. (2019). Practices and Attitudes That Enhance Music Engagement of Adult Cochlear Implant Users. Front Neurosci, 13. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gfeller KE, Turner C, Oleson J, et al. (2007). Accuracy of cochlear implant recipients on pitch perception, melody recognition, and speech reception in noise. Ear Hear, 28, 412–423. [DOI] [PubMed] [Google Scholar]
- Goldsworthy RL (2022). Computational Modeling of Synchrony in the Auditory Nerve in Response to Acoustic and Electric Stimulation. Front Comput Neurosci, 16. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Goldsworthy RL (2015). Correlations Between Pitch and Phoneme Perception in Cochlear Implant Users and Their Normal Hearing Peers. JARO - Journal of the Association for Research in Otolaryngology, 16, 797–809. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Goldsworthy RL, Delhorne LA, Braida LD, et al. (2013). Psychoacoustic and phoneme identification measures in cochlear-implant and normal-hearing listeners. Trends Amplif, 17, 27–44. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Goldwyn JH, Rubinstein JT, Shea-Brown E (2012). A point process framework for modeling electrical stimulation of the auditory nerve. J Neurophysiol, 108, 1430–1452. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hochmair I, Hochmair E, Nopp P, et al. (2015). Deep electrode insertion and sound coding in cochlear implants. Hear Res, 322, 14–23. [DOI] [PubMed] [Google Scholar]
- Houtsma AJM, Smurzynski J (1990). Pitch identification and discrimination for complex tones with many harmonics. Journal of the Acoustical Society of America, 87, 304–310. [Google Scholar]
- Joris PX, Schreiner CE, Rees A (2004). Neural Processing of Amplitude-Modulated Sounds. 541–577. [DOI] [PubMed] [Google Scholar]
- Kaernbach C (1991). Simple adaptive testing with the weighted up-down method. Percept Psychophys, 49, 227–229.
- Kong Y-Y, Carlyon RP (2010). Temporal pitch perception at high rates in cochlear implants. J Acoust Soc Am, 127, 3114–3123.
- Landsberger DM, Galvin JJ (2011). Discrimination between sequential and simultaneous virtual channels with electrical hearing. J Acoust Soc Am, 130, 1559–1566.
- Landsberger DM, Marozeau J, Mertens G, et al. (2018). The relationship between time and place coding with cochlear implants with long electrode arrays. J Acoust Soc Am, 144, EL509–EL514.
- Landsberger DM, Vermeire K, Claes A, et al. (2016). Qualities of single electrode stimulation as a function of rate and place of stimulation with a cochlear implant. Ear Hear, 37, e149–e159.
- Laneau J, Wouters J, Moonen M (2006). Improved music perception with explicit pitch coding in cochlear implants. Audiol Neurotol, 11, 38–52.
- Litvak LM, Spahr AJ, Emadi G (2007). Loudness growth observed under partially tripolar stimulation: Model and data from cochlear implant listeners. J Acoust Soc Am, 122, 967–981.
- Loeb GE (2005). Are cochlear implant patients suffering from perceptual dissonance? Ear Hear, 26, 435–450.
- Looi V, Gfeller KE, Driscoll VD (2012). Music appreciation and training for cochlear implant recipients: A review. Semin Hear, 33, 307–334.
- Luo X, Soslowsky S, Pulling KR (2019). Interaction between pitch and timbre perception in normal-hearing listeners and cochlear implant users. J Assoc Res Otolaryngol, 20, 57–72.
- Marimuthu V, Swanson BA, Mannell R (2016). Cochlear implant rate pitch and melody perception as a function of place and number of electrodes. Trends Hear, 20, 1–20.
- Marozeau J, Simon N, Innes-Brown H (2014). Cochlear implants can talk but cannot sing in tune. Acoust Aust, 42, 131–135.
- McDermott JH, Oxenham AJ (2008). Music perception, pitch, and the auditory system. Curr Opin Neurobiol, 18, 452–463.
- McKay CM, McDermott HJ, Carlyon RP (2000). Place and temporal cues in pitch perception: Are they truly independent? Acoust Res Lett Online, 1, 25–30.
- Micheyl C, Delhommeau K, Perrot X, et al. (2006). Influence of musical and psychoacoustical training on pitch discrimination. Hear Res, 219, 36–47.
- Micheyl C, Xiao L, Oxenham AJ (2012). Characterizing the dependence of pure-tone frequency difference limens on frequency, duration, and level. Hear Res, 292, 1–13.
- Miyazono H, Moore BCJ (2013). Implications for pitch mechanisms of perceptual learning of fundamental frequency discrimination: Effects of spectral region and phase. Acoust Sci Technol, 34, 404–412.
- Moore BCJ, Sęk A (2009). Sensitivity of the human auditory system to temporal fine structure at high frequencies. J Acoust Soc Am, 125, 3186.
- Moore BCJ, Tyler LK, Marslen-Wilson W (2009). The Perception of Speech: From Sound to Meaning, Oxford University Press.
- Müllensiefen D, Gingras B, Musil J, et al. (2014). The musicality of non-musicians: An index for assessing musical sophistication in the general population. PLoS One, 9, e89642.
- Niparko JK, Tobey EA, Thal DJ, et al. (2010). Spoken language development in children following cochlear implantation. JAMA, 303, 1498–1506.
- Oxenham AJ, Bernstein JGW, Penagos H (2004). Correct tonotopic representation is necessary for complex pitch perception. Proc Natl Acad Sci U S A, 101, 1421–1425.
- Oxenham AJ, Micheyl C, Keebler MV, et al. (2011). Pitch perception beyond the traditional existence region of pitch. Proc Natl Acad Sci U S A, 108, 7629–7634.
- Padilla M, Stupak N, Landsberger DM (2017). Pitch ranking with different virtual channel configurations in electrical hearing. Hear Res, 348, 54–62.
- Plack CJ, Oxenham AJ, Fay RR, et al. (Eds.) (2005). Pitch: Neural Coding and Perception, New York: Springer.
- Rader T, Döge J, Adel Y, et al. (2016). Place dependent stimulation rates improve pitch perception in cochlear implantees with single-sided deafness. Hear Res, 339, 94–103.
- Roy AT, Carver C, Jiradejvong P, et al. (2015). Musical sound quality in cochlear implant users: A comparison in bass frequency perception between fine structure processing and high-definition continuous interleaved sampling strategies. Ear Hear, 36, 582–590.
- Shannon RV (1983). Multichannel electrical stimulation of the auditory nerve in man. I. Basic psychophysics. Hear Res, 11, 157–189.
- Stevens KN (1998). Acoustic Phonetics, Cambridge, MA: MIT Press.
- Stohl JS, Throckmorton CS, Collins LM (2008). Assessing the pitch structure associated with multiple rates and places for cochlear implant users. J Acoust Soc Am, 123, 1043–1053.
- Svirsky M (2017). Cochlear implants and electronic hearing. Phys Today, 70, 53–58.
- Swanson BA, Marimuthu VMR, Mannell RH (2019). Place and temporal cues in cochlear implant pitch and melody perception. Front Neurosci, 13, 1–18.
- Tong YC, Clark GM, Blamey PJ, et al. (1982). Psychophysical studies for two multiple-channel cochlear implant patients. J Acoust Soc Am, 71, 153–160.
- Vandali AE, Sly D, Cowan R, et al. (2013). Pitch and loudness matching of unmodulated and modulated stimuli in cochlear implantees. Hear Res, 302, 32–49.
- Vandali AE, Sucher C, Tsang DJ, et al. (2005). Pitch ranking ability of cochlear implant recipients: A comparison of sound-processing strategies. J Acoust Soc Am, 117, 3126–3138.
- Verschooten E, Shamma S, Oxenham AJ, et al. (2019). The upper frequency limit for the use of phase locking to code temporal fine structure in humans: A compilation of viewpoints. Hear Res, 377, 109–121.
- Wilson BS, Dorman MF (2008). Cochlear implants: A remarkable past and a brilliant future. Hear Res, 242, 3–21.
- Won JH, Drennan WR, Kang RS, et al. (2010). Psychoacoustic abilities associated with music perception in cochlear implant users. Ear Hear, 31, 796–805.
- Wouters J, McDermott HJ, Francart T (2015). Sound coding in cochlear implants: From electric pulses to hearing. IEEE Signal Process Mag, 32, 67–80.
- Zaltz Y, Goldsworthy RL, Kishon-Rabin L, et al. (2018). Voice discrimination by adults with cochlear implants: The benefits of early implantation for vocal-tract length perception. J Assoc Res Otolaryngol, 19, 193–209.
- Zeng FG (2002). Temporal pitch in electric hearing. Hear Res, 174, 101–106.
Supplementary Materials
Figure A1. Electrical stimulation patterns and modeled neural response for pure tones at frequencies of 220, 440, and 880 Hz. Stimuli are shorter in duration than those used in the behavioral studies to provide better visual resolution of the temporal dynamics. Three successive tone pips, each 30 ms in duration, were used to probe the signal processing emulations provided by each implant manufacturer.
Figure A2. Synchrony quantified as vector strength versus frequency, or fundamental frequency, for pure, complex, and modulated tones.
Figure A3. As Figure A1, but for low-pass filtered harmonic complexes with fundamental frequencies of 110, 220, and 440 Hz. Individual tone pips were 70 ms in duration, shorter than those used in the behavioral studies, to provide better visual resolution of the temporal dynamics.
Figure A4. As Figure A3, but for high pass filtered harmonic complexes.
Figure A5. As Figure A4, but for sinusoidally amplitude modulated tones.
Figure A6. As Figure A5, but with covarying control over the frequency of the modulated tone.
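Figure A2 quantifies synchrony as vector strength, the standard phase-locking metric of Goldberg and Brown: each spike is mapped to its phase within the stimulus cycle, and the resultant of the unit phase vectors is normalized by spike count, giving 1 for perfect locking and values near 0 for spikes spread uniformly across the cycle. A minimal sketch of that computation (an illustration of the metric, not the analysis code used in the study):

```python
import numpy as np

def vector_strength(spike_times, freq):
    """Vector strength of spike times relative to a stimulus at `freq` Hz.

    Each spike time is converted to a phase within the stimulus cycle;
    the mean resultant length of the unit phase vectors is returned:
    1.0 = perfect phase locking, ~0 = no synchrony.
    """
    phases = 2 * np.pi * freq * np.asarray(spike_times, dtype=float)
    return np.hypot(np.cos(phases).sum(), np.sin(phases).sum()) / len(spike_times)

# Spikes locked to every cycle of a 220 Hz tone -> vector strength of 1
locked = np.arange(50) / 220.0
# Spikes spread evenly across the cycle -> vector strength near 0
uniform = np.arange(50) / 220.0 + np.linspace(0, 1 / 220.0, 50, endpoint=False)
```

The same function applies whether the reference frequency is a pure-tone frequency, a fundamental frequency, or a modulation rate, which is how a single synchrony axis can span the pure, complex, and modulated tone conditions of Figure A2.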