Author manuscript; available in PMC: 2017 Jul 1.
Published in final edited form as: Ear Hear. 2016 Jul-Aug;37(4):465–472. doi: 10.1097/AUD.0000000000000257

Acoustic cue weighting by adults with cochlear implants: A mismatch negativity study

Aaron C Moberly 1, Jyoti Bhat 1, Antoine J Shahin 2
PMCID: PMC4899308  NIHMSID: NIHMS736558  PMID: 26655914

Abstract

Objectives

Formant Rise Time (FRT) and Amplitude Rise Time (ART) are acoustic cues that inform phonetic identity. FRT represents the rate of transition of the formant(s) to a steady state, while ART represents the rate at which the sound reaches its peak amplitude. Normal hearing (NH) native English speakers weight FRT more than ART during the perceptual labeling of the /bα/-/wα/ contrast. This weighting strategy is reflected neurophysiologically in the magnitude of the mismatch negativity (MMN) – MMN is larger during the FRT than the ART distinction. The present study examined the neurophysiological basis of acoustic cue weighting in adult cochlear implant (CI) listeners using the MMN design. It was hypothesized that individuals with CIs who weight ART more in behavioral labeling (ART-users) would show larger MMNs during the ART than the FRT contrast, and the opposite would be seen for FRT-users.

Design

Electroencephalography (EEG) was recorded while twenty adults with CIs listened passively to combinations of three synthetic speech stimuli: a /bα/ with /bα/-like FRT and ART; a /wα/ with /wα/-like FRT and ART; and a /bα/wa stimulus with /bα/-like FRT and /wα/-like ART. The MMN response was elicited during the FRT contrast by having participants passively listen to a train of /wα/ stimuli interrupted occasionally by /bα/wa stimuli, and vice versa. For the ART contrast, the same procedure was implemented using the /bα/ and /bα/wa stimuli.

Results

Both ART- and FRT-users with CIs showed MMNs that were equal in magnitude during the FRT and ART contrasts, with the exception that FRT-users exhibited MMNs for the ART and FRT contrasts that were temporally segregated. That is, their MMNs occurred significantly earlier during the ART contrast (~100 ms following sound onset) than during the FRT contrast (~200 ms). In contrast, the MMNs of ART-users for both contrasts occurred later (~230 ms) and were not significantly separable in time. Interestingly, the temporal segregation observed in FRT-users is consistent with the MMN behavior in NH listeners.

Conclusions

Results suggest that listeners with CIs who learn to classify phonemes based on formant dynamics, as NH listeners do, develop a strategy similar to that of NH listeners, in which the amplitude and spectral representations of phonemes in auditory memory are temporally segregated.

Keywords: Amplitude rise time, Auditory evoked potentials, Cochlear implants, Formant rise time, Mismatch negativity, Speech perception

Introduction

During speech recognition, a listener’s brain organizes representations of spectral and temporal “cues” into a phonetic code and, ultimately, into a speech percept. Listeners with normal hearing (NH) rely on, or assign different weights to, distinct acoustic cues to classify phonemes (Bailey & Summerfield, 1980; Best et al., 1981). Generally, NH adults of the same primary language assign similar weights to a given acoustic cue during phonetic judgments (Nittrouer & Miller, 1997; Ohde & Haley, 1997). This holds true even if cues are equally discriminable (Holt & Lotto, 2006), most likely because those strategies support the most efficient and accurate speech recognition in that language (Best, 1994; Jusczyk et al., 1995; Nittrouer, 2005).

Previous studies have used simple acoustic manipulations, such as the /bα/-/wα/ consonant-vowel (CV) contrast, to assess acoustic cue-weighting strategies. The onset and steady-state formant frequencies are similar for /bα/ and /wα/, but both the time it takes to reach steady-state values, the Formant Rise Time (FRT), and the time required to reach peak amplitude, the Amplitude Rise Time (ART), differ for /bα/ and /wα/. Both FRT and ART are short for /bα/ and longer for /wα/. This contrast could theoretically be perceived by weighting either the spectral cue (FRT) or the temporal cue (ART). However, NH native English listeners predominantly weight the FRT, compared to the ART, in labeling these CVs (Nittrouer & Studdert-Kennedy, 1986; Nittrouer et al., 2012; Walsh & Diehl, 1991). Interestingly, this perceptual effect was found to be reflected in the dynamics of the mismatch negativity (MMN) (Moberly et al., 2014a). The MMN is a pre-attentive auditory-evoked potential (AEP) that occurs 100–300 ms following sound onset and is characterized by a fronto-central negativity elicited by any acoustically discriminable “deviant” sound within a regular stream of “standard” stimuli (Näätänen, 2001; Näätänen et al., 2007; Picton et al., 2000). The MMN is thought to represent the brain’s response to a mismatch between a stimulus representation and a trace in short- or long-term auditory memory (Näätänen et al., 2011; Pulvermüller & Shtyrov, 2006). In the Moberly et al. (2014a) study, the following stimuli were used: a /bα/ CV with /bα/-like FRT and ART; a /wα/ CV with /wα/-like FRT and ART; and a /bα/wa CV with /bα/-like FRT and /wα/-like ART. The MMN response was found to represent the superposition of two components. The earlier component (200–250 ms) reflected acoustic change detection, while the later one (250–300 ms) represented the encoding of the more dominantly weighted cue.
Specifically, the later portion of the MMN was larger during the FRT (/bα/wa versus /wα/, with constant ART but different FRT) than ART (/bα/wa versus /bα/, with constant FRT but different ART) contrast. Moberly et al. (2014a) concluded that the greater MMN elicited during the FRT contrast reflected the weighting strategy or perceptual organization of these cues in auditory memory, such that the FRT representation is weighted more favorably (pushed to the foreground in auditory memory) over the ART representation (pushed to the background).

Building on these neurophysiological results, this study sought to understand the neurophysiological bases of acoustic cue weighting in listeners with cochlear implants (CIs). Unlike in NH listeners, examining the MMN in CI users can be challenging for technical and perceptual reasons. First, the signal from the CI contaminates the EEG signal, and teasing the brain and CI signals apart can be difficult. Second, substantial variability in speech recognition remains among CI users (Kiefer et al., 1998; Peterson et al., 2010; Shipp & Nedzelski, 1995). As such, it cannot be assumed that all CI subjects are equally sensitive (or have equal discriminative abilities) to the FRT and ART contrasts when speech is delivered through their implants. A lack of MMN response could therefore simply be a result of poor auditory discrimination of the tested contrast, which would limit encoding of that cue in auditory memory (i.e., leading to weaker cue weighting). This concern was addressed in a study by Moberly et al. (2014b), which examined discrimination abilities of CI users to non-speech synthetic FRT and ART contrasts and related these findings to cue-weighting strategies. Wide variability was seen in cue-weighting strategies among the CI users (Figure 1, adapted from Moberly et al., 2014b), but the group as a whole showed equal weighting of the two cues in the /bα/ and /wα/ labeling contrast. However, there were several CI listeners who weighted FRT more than ART and others who favored ART more than FRT in categorizing /bα/ and /wα/. These findings would suggest that participants, on average, had relatively similar sensitivity to the spectral and temporal cues. This is inconsistent with the premise that, because CIs deliver the temporal envelope (Friesen et al., 2001) to a greater extent than the spectral structure (Wilson & Dorman, 2008), CI users’ perceptual reliance on ART should necessarily be dominant.

Figure 1.

ART and FRT weighting factors for individual participants. Adapted from “Do adults with cochlear implants rely on different acoustic cues for phoneme perception than adults with normal hearing?,” by A.C. Moberly et al., 2014, JSLHR, 57, 566–82, ASHA. Adapted with permission.

The authors are not aware of studies that used the MMN to examine amplitude and formant cue weighting in CI listeners. However, using other designs, MMN responses have generally been obtained in individuals with CIs who exhibit relatively good speech perception (Kraus et al., 1993; Ponton et al., 2000; Roman et al., 2005; Zhang et al., 2011). Importantly, MMN responses have been observed for individuals with CIs listening to both duration and frequency contrasts (Ponton & Don, 1995). In this study, the same synthetic versions of the /bα/-/wα/ contrast documented in the study by Moberly et al. (2014a) were used to elucidate the relationship between MMN and weighting strategies in CI users for ART and FRT. Because the adults with CIs tested in the Moberly et al. (2014b) study showed variable weighting strategies, with some relying on the spectral cue more than the temporal cue, and vice versa, it was hypothesized that the group who weighted FRT more heavily than ART (the “FRT group”) would show a larger MMN response for the FRT contrast. Likewise, the group who weighted ART more heavily than FRT (the “ART group”) would show a larger MMN for the ART contrast.

Materials and methods

Subjects

Subjects were tested at The Auditory Neuroscience Lab, Eye and Ear Institute, The Ohio State University Wexner Medical Center. Informed written consent was obtained from all subjects in accordance with the ethical guidelines of the Institutional Review Board of The Ohio State University. Subjects were recruited from a clinical pool of the Department of Otolaryngology at The Ohio State University Wexner Medical Center, and they were paid for their participation.

Participants were twenty adult, native English listeners with CIs who had severe-to-profound sensorineural hearing loss and qualified for implantation after age 9 years and, thus, were post-lingually deafened. They were the same subjects who participated in the previous labeling and discrimination behavioral study examining cue weighting and salience for the /bα/-/wα/ contrast for adults with CIs (Moberly et al., 2014b). In data analysis for the current study, participants were divided into two groups, the FRT-users and the ART-users, based on the results of /bα/-/wα/ contrast labeling in the Moberly et al. (2014b) study. Those whose computed weighting factor (a measure of the perceptual weight assigned to each cue, based on logistic regression of responses during the labeling tasks) was larger for the FRT cue than the ART cue (N = 9) were included in the “FRT-user” group. The “ART-user” group consisted of those participants whose weighting factor was larger for ART than for FRT (N = 11) (Table 1). All subjects were right handed (Edinburgh Handedness Inventory) and between the ages of 18 and 62. All had CI-aided thresholds measured by audiologists within one year prior to testing. Four (out of eleven) of the ART-users were implanted in the right ear and six (out of nine) of the FRT-users were implanted in the right ear. Mean CI-aided pure-tone thresholds for the frequencies of 0.25 to 4 kHz were better than 35 dB hearing level for all subjects. Five subjects had bilateral implants (4 in FRT-users group and 1 in ART-users group), five typically used a hearing aid on the contralateral ear (3 in FRT-users group and 2 in ART-users group), and ten did not use additional amplification. To measure residual hearing, audiometric testing was performed on non-implanted and implanted ears with a Welch Allyn TM262 audiometer and TDH-39 headphones. This was done to determine whether either ear needed to be plugged during screening and testing.
None of the participants had pure-tone average (PTA) thresholds for the frequencies of 0.5, 1, and 2 kHz in either ear better than 68 dB HL. Therefore, ear plugging was not necessary during screening and testing, which were performed at around 68 dB sound pressure level.

Table 1.

Mean weighting factors for Amplitude Rise Time (ART) and Formant Rise Time (FRT) for ART-user group and FRT-user group. The t, (df), and p value columns show results of independent samples t-tests comparing weighting factors between groups.

Weighting factor | ART-users (N = 11) | FRT-users (N = 9) | t | df | p
ART, mean (SD) | 3.3 (1.9) | 2.3 (1.4) | 1.24 | 18 | 0.233
FRT, mean (SD) | 1.8 (1.6) | 4.5 (2.3) | 3.08 | 18 | 0.006
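As a rough check on Table 1, the t statistics can be approximated from the reported means, SDs, and group sizes. The sketch below assumes a pooled-variance (equal variances) independent-samples t-test, which the table does not state explicitly; the function name `pooled_t` is illustrative. Because the table values are rounded, the results only approximate the reported statistics.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic from summary statistics.

    Assumes equal variances (pooled estimate); returns (|t|, df).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return abs(mean1 - mean2) / standard_error, df

# Values taken from Table 1: ART-users (N = 11) versus FRT-users (N = 9).
t_art, df_art = pooled_t(3.3, 1.9, 11, 2.3, 1.4, 9)  # ART weighting factor
t_frt, df_frt = pooled_t(1.8, 1.6, 11, 4.5, 2.3, 9)  # FRT weighting factor

print(round(t_art, 2), df_art)
print(round(t_frt, 2), df_frt)
```

Both comparisons have 18 degrees of freedom, as in the table, and the recomputed statistics land close to the reported 1.24 and 3.08; the small differences are what one would expect from rounding of the summary values.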

For subjects with two CIs, only the first-implanted CI was used during testing. Nineteen subjects used Cochlear Nucleus devices, stimulated using monopolar stimulation, and using the ACE coding strategy. One subject used an Advanced Bionics device with the Harmony strategy. Eleven subjects performed testing using CIs on their right ears, and nine subjects used CIs on their left ears. Before testing, participants completed questionnaires regarding hearing history. Demographics for the subjects are shown in Table 2.

Table 2.

Participant demographics.

Participant | Age (years) | Gender | Implant | Processor | Contralateral Aid | PTA (dB HL) | Age Onset Hearing Loss (years) | Implant Age (years) | Etiology Hearing Loss
1 | 37 | M | Freedom | Freedom | Yes | No response | 20 | 30 | Adult progressive
2 | 40 | M | Nucleus 24 | CP810 | No | No response | 20 | 32 | Menieres
3 | 30 | F | Nucleus 24 | Freedom | No | No response | 0 | 21 | Congenital
4 | 31 | M | Freedom | CP810 | Bilateral CI | No response | 3 | 23 | Child progressive
5 | 29 | M | Freedom | CP810 | No | No response | 3 | 25 | Child progressive
6 | 29 | M | Advanced Bionics | AB Harmony | No | 78.3 | 1 | 22 | Infant ototoxicity
7 | 32 | F | Nucleus 22 | Freedom | No | No response | 2.5 | 9.5 | Meningitis, ototoxicity
8 | 29 | F | CI512 | CP810 | No | No response | 1.5 | 27 | Meningitis, child progressive
9 | 34 | F | Nucleus 24 | CP810 | No | No response | 0 | 25 | Congenital progressive
10 | 18 | F | Cochlear | Unknown | Yes | No response | 3 | 16 | Child progressive
11 | 37 | M | Freedom | Freedom | Bilateral CI | No response | 33 | 33 | Adult meningitis
12 | 54 | M | Nucleus 24 | CP810 | Bilateral CI | No response | 0 | 48 | Congenital progressive
13 | 47 | F | Nucleus 24 | Freedom | No | No response | 0 | 37 | Congenital progressive
14 | 60 | F | Freedom | Freedom | Bilateral CI | No response | 40 | 54 | Adult progressive
15 | 46 | M | Nucleus 24 | Freedom | Bilateral CI | No response | 18 | 38 | Adult progressive
16 | 62 | M | CI512 | CP810 | No | No response | 14 | 60 | Menieres
17 | 52 | F | Freedom | Freedom | Yes | 68.3 | 3 | 48 | Child progressive
18 | 40 | F | Freedom | Freedom | No | No response | 0 | 33 | Congenital progressive
19 | 62 | F | Freedom | CP810 | Yes | 76.7 | 13 | 62 | Adult progressive
20 | 62 | F | Freedom | CP810 | Yes | 90 | 13 | 56 | Adult progressive

PTA: Better ear pure tone average at 0.5, 1, and 2 kHz, with “no response” indicating a PTA ≥ 90 dB HL

Stimuli

Three synthetic versions of /bα/ and /wα/ from Nittrouer et al. (2013) were used (Figure 2, reprinted from Moberly et al., 2014a): 1) a synthetic stimulus with a /bα/-like FRT (30 ms) and ART (10 ms) (termed the /bα/); 2) a synthetic stimulus with a /wα/-like FRT (110 ms) and ART (70 ms) (termed the /wα/); and 3) a synthetic stimulus with a /bα/-like FRT (30 ms) and a /wα/-like ART (70 ms) (termed the /bα/wa). For natural speech, fundamental frequency can differ between /bα/ and /wα/ contexts, and this could serve as a confounding cue. By using synthetic stimuli, fundamental frequency was held constant. Thus, these stimuli allow examination of MMN responses to spectral and temporal changes. Stimuli were created with a Klatt synthesizer (Sensyn), with a sampling rate of 10 kHz. The token durations were 370 ms, with fundamental frequency constant at 100 Hz. The first two formant starting and steady-state frequencies were the same for all stimuli, but the time to reach steady-state frequencies varied. F1 started at 450 Hz and rose to 760 Hz at steady state. F2 started at 800 Hz and rose to 1150 Hz. F3 was constant at 2400 Hz. More details about the creation of stimuli are found in Nittrouer et al. (2013).
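The three stimuli differ only in how quickly the formants and the amplitude reach their steady-state values. The following sketch lays out those control trajectories using the parameter values given above. It is not the Klatt synthesis itself: the linear ramp shape, the amplitude scale from 0 to 1, and the names `linear_ramp` and `build_tracks` are assumptions made for illustration.

```python
FS_HZ = 10000       # sampling rate reported for the stimuli
DURATION_MS = 370   # token duration reported for the stimuli

def linear_ramp(start, end, rise_ms, total_ms=DURATION_MS, fs_hz=FS_HZ):
    """Track that rises linearly from start to end over rise_ms, then holds.

    The linear shape is an assumption; the source gives only rise times.
    """
    n_total = int(round(total_ms * fs_hz / 1000.0))
    n_rise = int(round(rise_ms * fs_hz / 1000.0))
    return [start + min(i / n_rise, 1.0) * (end - start) for i in range(n_total)]

def build_tracks(frt_ms, art_ms):
    """Formant and amplitude trajectories for one stimulus."""
    return {
        "f1_hz": linear_ramp(450.0, 760.0, frt_ms),
        "f2_hz": linear_ramp(800.0, 1150.0, frt_ms),
        "f3_hz": [2400.0] * int(round(DURATION_MS * FS_HZ / 1000.0)),
        "amplitude": linear_ramp(0.0, 1.0, art_ms),  # 0-to-1 scale is assumed
    }

tracks = {
    "ba": build_tracks(frt_ms=30, art_ms=10),
    "wa": build_tracks(frt_ms=110, art_ms=70),
    "ba_wa": build_tracks(frt_ms=30, art_ms=70),  # /ba/-like FRT, /wa/-like ART
}
```

Laid out this way, the logic of the two contrasts is easy to see: the hybrid stimulus shares its formant tracks with /bα/ (so the pair differs only in ART) and its amplitude track with /wα/ (so that pair differs only in FRT).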

Figure 2.

Waveforms (above) and spectrograms (below) for synthetic /bα/, /bα/wa, and /wα/ stimuli. Reprinted from “Neurophysiology of spectrotemporal cue organization of spoken language in auditory memory,” by A.C. Moberly et al., 2014, Brain Lang, 130, 42–49, Elsevier. Reprinted with permission.

We should note that all three sounds are distinguishable from one another. The /bα/ and /bα/wa are clearly distinguishable from /wα/ due to differences in their formant transitions. However, /bα/ and /bα/wa are also distinguishable from one another. /bα/wa is slightly softer than /bα/. However, as previously reported, NH listeners rely mainly on formant dynamics to classify these CVs. If the question were related to amplitude or intensity sensitivity as opposed to classification, one may expect different results.

Procedures

Subjects underwent EEG testing while listening to a passive oddball auditory task. A series of “standard” stimuli were presented and interrupted by occasional “deviant” stimuli. Standard and deviant stimuli were pseudo-randomly presented with Presentation software (Neurobehavioral Systems, Albany, CA). Stimuli were delivered using free-field stimulation with two Tannoy Precision 8D (TANNOY, Scotland, UK) speakers 1.5 meters from the participant at 45 degrees off center. Loudness was calibrated at 68 dB at subject distance but was then adjusted (< ± 5 dB) to the participants’ level of comfort and kept constant across the experiment. Continuous EEG data were recorded with a 64-channel cap (10–20 system, Ag-AgCl electrodes, 512 Hz A/D conversion rate, BioSemi ActiveTwo system, Amsterdam, Netherlands) in a sound-attenuated room. Common Mode Sense (CMS) and Driven Right Leg (DRL) passive electrodes served as grounds.

The task consisted of eight oddball blocks, with two identical blocks for each of the four conditions, and one control block of /bα/ only stimuli. In each oddball block, the /bα/wa served as either a standard or deviant stimulus. This design created four conditions: (1) standard /bα/, deviant /bα/wa, an ART contrast; (2) standard /bα/wa, deviant /bα/, an ART contrast; (3) standard /wα/, deviant /bα/wa, an FRT contrast; and (4) standard /bα/wa, deviant /wα/, an FRT contrast. Importantly, blocks were set up to incorporate a “flip-flop” design in which a stimulus acted as the standard in one block and as the deviant in another block. Using a “flip-flop” design largely eliminated responses that were due to differences in the obligatory potentials, namely the N1-P2 and sustained field responses (Hall, 2007; Wunderlich & Cone-Wesson, 2001).

In each oddball block, participants listened to 300 stimulus trials containing 15% deviants (45 trials) and 85% standards (255 trials). For the control block, listeners were presented with 200 stimulus trials of the /bα/ only stimulus. For all blocks, the Inter Stimulus Interval (ISI) was 1000 ms. The nine blocks were randomized across participants. During testing, stimuli were presented in a pseudo-random sequence, with at least 3 standard stimuli presented prior to each deviant stimulus. Throughout the blocks, subjects watched a silent movie of their choice on a 24-inch LCD computer monitor, which was placed 1 meter in front of them. Listeners were instructed to ignore auditory stimuli. Testing duration was 10 minutes per block with one-minute breaks between blocks. The full testing session lasted around 2 hours.
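One way to build a block that satisfies these constraints (300 trials, 45 deviants, at least 3 standards before each deviant) is sketched below. This is an illustrative construction, not the procedure used by the Presentation script, which the text does not describe; the function name `oddball_sequence` and the slot-filling approach are assumptions.

```python
import random

def oddball_sequence(n_trials=300, n_deviants=45, min_standards_before=3, rng=None):
    """Pseudo-random standard/deviant order with a minimum run of standards
    before every deviant."""
    rng = rng if rng is not None else random.Random()
    n_standards = n_trials - n_deviants
    mandatory = n_deviants * min_standards_before
    if mandatory > n_standards:
        raise ValueError("not enough standards to satisfy the spacing constraint")

    # Slot k (k < n_deviants) holds extra standards placed before deviant k;
    # the final slot holds standards that trail the last deviant.
    extra = [0] * (n_deviants + 1)
    for _ in range(n_standards - mandatory):
        extra[rng.randrange(n_deviants + 1)] += 1

    sequence = []
    for k in range(n_deviants):
        sequence.extend(["standard"] * (min_standards_before + extra[k]))
        sequence.append("deviant")
    sequence.extend(["standard"] * extra[n_deviants])
    return sequence

block = oddball_sequence(rng=random.Random(0))
print(len(block), block.count("deviant"), block.count("standard"))
```

Reserving the mandatory standards first and then scattering the remainder guarantees the spacing rule by construction, so no rejection sampling is needed.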

Data analysis

EEGLAB (Delorme & Makeig, 2004) and in-house MATLAB code (The MathWorks, Inc., Natick, MA) were used to process EEG. First, the continuous EEG files were combined for each of the nine blocks into one grand continuous file for each individual. Grand files were then epoched from −100 ms to +500 ms around the stimulus marker. Epoched data were then referenced to the Nasion (Nz) electrode and baselined to the pre-stimulus interval (−100 ms to 0 ms). Independent component analysis (ICA), using the infomax algorithm of EEGLAB (runica.m function), was then performed, with 64 ICA components generated for each individual. Principal component analysis (PCA) was applied as a pre-filtering step to the ICA, in which the data were sorted into 64 principal components. Visual inspection of topographies and waveforms of the ICA components was performed, and components representing ocular artifacts, muscle activity, and CI artifact were rejected (mean 12.5/64 components per subject, range 4–20). Not surprisingly, the number of ICA components rejected was significantly larger than that in NH individuals (Moberly et al., 2014a). Most of the subjects exhibited 2–3 clear ICA components that represented the CI artifact, while in a few subjects the CI artifact was observed in several ICA components. Following ICA correction, data were subjected to additional artifact cleaning. Trials containing amplitudes of ±50 μV or greater in any channel were rejected. This low rejection threshold was feasible given the ample number of trials. Bad channels (above the implant) were then interpolated by replacing the channels’ activity with an average of the surrounding electrodes. The data were average-referenced (without the Nz channel) and bandpass filtered between 0.1 and 30 Hz using a zero-phase FIR filter. Trials were separated to generate a set of standard trials and deviant trials for each of the ART and FRT conditions for each individual.
Auditory evoked potentials (AEPs) for each standard and deviant condition were computed by averaging all trials separately for each condition, producing one standard and deviant pair for each subject, channel, and condition. The final number of trials included for each condition (across the flip-flop conditions and following artifact rejection) was 874 ± 104 standard and 153 ± 19 deviant trials for each of the ART and FRT contrasts.
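Three of the steps above (baseline correction, amplitude-based trial rejection, and averaging) are simple enough to sketch directly. The code below is a simplified illustration in plain Python, not the EEGLAB/MATLAB pipeline used in the study; it leaves out ICA, channel interpolation, re-referencing, and filtering, and the function names are hypothetical.

```python
def baseline_correct(epoch, times_ms, baseline=(-100.0, 0.0)):
    """Subtract the mean of the pre-stimulus samples from one channel's epoch."""
    pre = [v for v, t in zip(epoch, times_ms) if baseline[0] <= t < baseline[1]]
    offset = sum(pre) / len(pre)
    return [v - offset for v in epoch]

def is_clean(trial, threshold_uv=50.0):
    """True if no sample in any channel reaches +/- threshold_uv.

    trial: list of channels, each a list of samples in microvolts.
    """
    return all(abs(v) < threshold_uv for channel in trial for v in channel)

def average_trials(trials):
    """Average across trials, giving one evoked waveform per channel."""
    n_trials = len(trials)
    n_channels = len(trials[0])
    n_samples = len(trials[0][0])
    return [
        [sum(trial[c][s] for trial in trials) / n_trials for s in range(n_samples)]
        for c in range(n_channels)
    ]

# Toy data: two single-channel trials, one of which exceeds the threshold.
trials = [[[1.0, 3.0, 5.0]], [[3.0, 5.0, 7.0]], [[0.0, 80.0, 0.0]]]
kept = [trial for trial in trials if is_clean(trial)]
aep = average_trials(kept)
```

Applied separately to the standard and deviant trials of each contrast, the averaging step yields the standard and deviant AEP pair from which the difference waveform is formed.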

MMN analysis was performed for the mean AEP waveforms of the fronto-central electrodes Fz, F1, F2, F3, F4, FC1, FC2, FC3, FC4, FCz, C1, C2, C3, C4, and Cz, which were computed for each standard and deviant pair, condition, and participant. For each individual, the standard and deviant waveforms (collapsed across the “flip-flop” conditions) were submitted to a two-tailed sliding t-test to assess whether a statistically significant (p < 0.05, Bonferroni corrected) MMN response had been elicited. This t-test method is similar to the MMN identification technique used by Kraus, McGee, Carrell, and Sharma (1995) and also similar to the method used by Bishop and Hardiman (2010), who performed a t-test on single-trial analysis of difference waveforms. The test compared the period between 0 ms and 350 ms by sliding a 15 ms segment every 1 sample point (2 ms) and performing the t-test between the deviant and standard waveforms. Note that this time period began earlier than is typically used for analyzing MMN responses in individuals with NH. The analysis period was extended based on previous results of MMN studies with CI users, in which MMN latencies tended to be shorter for CI users than for individuals with NH (Ponton and Don, 1995). The MMN onset was taken as the latency at which the t-test (Bonferroni corrected for the number of executions) became significant, and the MMN offset was taken as the latency at which the t-test was no longer significant. Thus, the MMN duration was determined to be the duration between the onset and offset time points, as long as a region of negativity was visibly confirmed on the MMN (deviant minus standard) waveform. The MMN peak values were determined to be the time-points at which the largest negative deflection in the difference waveform coincided with a significant p-value.
Because this method could conceivably lead to multiple noncontiguous regions of significant negativities in the difference waveforms, this analysis was supplemented with a topographic analysis. That is, topographies were evaluated for the MMN amplitude peaks noted on the MMN waveforms within the statistically significant MMN time periods. Individual topographic plots were examined visually to verify or rule out the presence of MMN, defined as a significant fronto-central to mastoid negativity. Therefore, the combination of a significant difference between the standard and deviant waveform amplitude on the sliding t-test, a visible negativity on the difference (deviant minus standard) waveform, and a confirmatory topography was used to verify or exclude the presence of an MMN response for each individual subject. An MMN was not identified for 3 out of 20 subjects (15%) in one of the conditions (FRT or ART “flip-flop” averaged waveforms) using the sliding t-test. For these subjects, an MMN was picked as the negative-going waveform with zero crossing between 50 and 350 ms (even though it did not reach significance with the sliding t-test). If the t-test, visible difference waveform negativity, and topographic criteria revealed more than one peak as consistent with an MMN response, the peak with the most MMN-like topographic response was taken as the true MMN response. This process ensured a conservative verification and measurement of true MMN responses.
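The onset, offset, and peak rules described above amount to finding contiguous runs of significant negativity in the difference waveform. The sketch below illustrates that step only. It assumes the per-window p-values have already been produced by the sliding t-test (they are passed in, not computed here), and the function names are hypothetical. The offset is reported as the last significant sample, which is one reading of "no longer significant."

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test threshold after Bonferroni correction."""
    return alpha / n_tests

def significant_negative_runs(times_ms, diff_wave, p_values, alpha_corrected):
    """Return (onset_ms, offset_ms, peak_ms) for each contiguous run in which
    the deviant-minus-standard wave is negative and p < alpha_corrected."""
    runs = []
    start = None
    last = len(diff_wave) - 1
    for i, (d, p) in enumerate(zip(diff_wave, p_values)):
        inside = p < alpha_corrected and d < 0
        if inside and start is None:
            start = i
        if start is not None and (not inside or i == last):
            end = i if inside else i - 1
            # Peak: most negative sample within the significant run.
            peak = min(range(start, end + 1), key=lambda j: diff_wave[j])
            runs.append((times_ms[start], times_ms[end], times_ms[peak]))
            start = None
    return runs

# Toy example with two separate significant negativities.
times = [0, 2, 4, 6, 8, 10, 12]
diff = [0.1, -1.0, -3.0, -2.0, 0.2, -1.0, -0.5]
pvals = [0.5, 0.001, 0.001, 0.001, 0.5, 0.001, 0.001]
runs = significant_negative_runs(times, diff, pvals, bonferroni_alpha(0.05, len(times)))
```

When more than one run is returned, as in the toy example, the choice between candidates still depends on the visual and topographic criteria described in the text, which this sketch does not attempt to automate.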

Subsequently, the MMN was determined to be the area under the curve (potential × time points spanning the onset and offset window), as long as a region of negativity was visibly confirmed on the MMN (deviant minus standard) waveform and its peak was consistent with an MMN topography (fronto-central negativity with reversals in lower temporal-posterior channels). MMN area has frequently been used as a measure of MMN magnitude (Kraus et al., 1995; Tremblay et al., 1997, 1998; Ylinen et al., 2009). The MMN area is a useful way to assess MMN, because the response may have a variable latency and duration as well as a shallow peak. The MMN area was calculated as the area under the curve for the difference waveform (deviant minus standard) from the MMN onset to the MMN offset, based on the results of a sliding t-test. The MMN latency was determined to be the time point that corresponded to 50% of the area under the curve (Luck and Hillyard, 1990). This latency identification method was chosen instead of the traditional peak latency values to circumvent the multiple and broad peak problem associated with these data.
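The area and 50% fractional-area latency measures can be sketched as follows. The trapezoidal integration, the linear interpolation between samples, and the function name are assumptions for illustration; the text specifies only that the area spans onset to offset and that latency is the point at which half of the area has accumulated. The sketch also assumes the difference wave keeps one sign within the window.

```python
def mmn_area_and_latency(times_ms, diff_wave, onset_ms, offset_ms, fraction=0.5):
    """Area of the deviant-minus-standard wave between onset and offset, and
    the time at which the cumulative area reaches `fraction` of the total."""
    idx = [i for i, t in enumerate(times_ms) if onset_ms <= t <= offset_ms]

    # Cumulative trapezoidal area, one entry per sample in the window.
    cumulative = [0.0]
    for a, b in zip(idx[:-1], idx[1:]):
        dt = times_ms[b] - times_ms[a]
        cumulative.append(cumulative[-1] + 0.5 * (diff_wave[a] + diff_wave[b]) * dt)

    total = cumulative[-1]
    target = fraction * total
    for k in range(1, len(cumulative)):
        before, after = cumulative[k - 1], cumulative[k]
        reached = after <= target if total < 0 else after >= target
        if reached:
            step = after - before
            weight = 0.0 if step == 0 else (target - before) / step
            t0, t1 = times_ms[idx[k - 1]], times_ms[idx[k]]
            return total, t0 + weight * (t1 - t0)
    return total, times_ms[idx[-1]]

# Symmetric toy negativity centred on 120 ms (amplitudes in microvolts).
times = [100, 110, 120, 130, 140]
diff = [0.0, -1.0, -2.0, -1.0, 0.0]
area, latency = mmn_area_and_latency(times, diff, onset_ms=100, offset_ms=140)
```

For a symmetric deflection like the toy example, the fractional-area latency coincides with the peak; for broad or multi-peaked responses it shifts toward wherever most of the area lies, which is the property that makes it preferable to peak latency for these data.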

The ninth control condition of /bα/ only stimuli was used in order to evaluate if spuriously significant negativity would be found using the same analysis on the average of 100 permutations, each of which had 85% randomly labeled “standard” and 15% randomly labeled “deviant” stimuli. This helped validate the analyses used for identifying true MMN responses. The analysis did not show any MMN-like responses, validating that the negativities observed for the deviant stimuli in the main analyses represented true responses.
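The logic of this control analysis can be illustrated with a short sketch: trials from a block of identical stimuli are repeatedly split at random into pseudo-deviants (15%) and pseudo-standards (85%), and the resulting difference waves are averaged over permutations. The data layout (one channel per trial) and the function names are assumptions, and the subsequent sliding t-test on the result is not shown.

```python
import random

def mean_wave(trials, indices):
    """Average the selected single-channel trials sample by sample."""
    n_samples = len(trials[0])
    return [sum(trials[i][s] for i in indices) / len(indices) for s in range(n_samples)]

def permuted_difference(trials, n_permutations=100, deviant_fraction=0.15, rng=None):
    """Mean pseudo-deviant minus pseudo-standard wave across random relabelings."""
    rng = rng if rng is not None else random.Random()
    n_trials = len(trials)
    n_deviants = round(n_trials * deviant_fraction)
    n_samples = len(trials[0])
    total = [0.0] * n_samples
    for _ in range(n_permutations):
        deviant_idx = set(rng.sample(range(n_trials), n_deviants))
        standard_idx = [i for i in range(n_trials) if i not in deviant_idx]
        deviant = mean_wave(trials, sorted(deviant_idx))
        standard = mean_wave(trials, standard_idx)
        for s in range(n_samples):
            total[s] += deviant[s] - standard[s]
    return [v / n_permutations for v in total]

# Toy control block: 200 identical three-sample trials.
control_trials = [[1.0, 2.0, 3.0] for _ in range(200)]
control_diff = permuted_difference(control_trials, rng=random.Random(1))
```

With real EEG the trials are not identical, so the permuted difference wave will contain noise; the point of the control is that this noise should not pass the same significance and topography criteria used to identify an MMN.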

Statistical Analysis

Differences between MMN areas under the curve or differences between MMN latencies were assessed using a two-way analysis of variance (ANOVA, General Linear Model of Statistica v. 9.1, StatSoft, OK), with the independent variables being CI-group (ART-users or FRT-users) and cue (ART or FRT), and the dependent variable being MMN area or MMN latency.

Duncan’s test for post hoc analyses was used to account for multiple comparisons. Statistical tests were considered significant for p < 0.05. Sphericity violations were corrected using the Greenhouse-Geisser method.

Results

Figure 3A presents the ART-user and FRT-user groups’ results, showing the deviants (grey) and standards (black) waveforms for the ART and FRT contrasts. Figure 3B shows the group average MMN waveforms (deviants minus standards) for the ART (black) and FRT (grey) contrasts.

Figure 3.

A) Group average standard and deviant auditory evoked potential (AEP) waveforms for the ART-user group and FRT-user group to the spectral (FRT) contrast and the temporal (ART) contrast, collapsed across “flip-flop” conditions. B) Group average MMN waveforms (deviant – standard) for the ART-user group and FRT-user group to the spectral (FRT) and the temporal (ART) contrasts. Topographic plots of the group average MMN peaks with corresponding time points are shown below the MMN waveforms.

MMN area under the curve and latency analyses

An ANOVA contrasting MMN areas with variables being CI-group and cue (ART, FRT) revealed no main effects or interactions between the variables (F < 0.5). An ANOVA contrasting the MMN latency with variables being CI-group and cue likewise revealed no main effects (F < 3) of group or cue. However, there was an interaction between the variables (F(1, 18) = 4.5, p = 0.05; partial η² = 0.2). This effect can be interpreted in two ways. First, during the ART contrast, shorter MMN latencies occurred in the FRT-users compared to ART-users (p < 0.02, Duncan’s test, Figure 3B). This was not the case during the FRT contrast (p > 0.4). Second, for the FRT-users, shorter MMN latencies occurred for the ART than the FRT contrast (p < 0.02, Duncan’s test). This was not the case for the ART-users, in whom the MMNs for ART and FRT contrasts overlapped in time (Figure 3B).
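The reported effect size for the interaction is consistent with the standard identity relating partial eta squared to the F statistic and its degrees of freedom. As a worked check, using the reported F(1, 18) = 4.5:

$$
\eta_p^2 = \frac{F \cdot df_{\text{effect}}}{F \cdot df_{\text{effect}} + df_{\text{error}}} = \frac{4.5 \times 1}{4.5 \times 1 + 18} = \frac{4.5}{22.5} = 0.2
$$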

In short, the MMN area analysis did not reveal magnitude differences between ART- and FRT-users or between ART and FRT contrasts. However, there were MMN latency effects. Specifically, the MMNs for ART and FRT contrasts were significantly segregated in time (occurring earlier for the ART contrast) in the FRT-users but not in the ART-users.

Validation of the existence of the MMN response

The ninth condition, which included /bα/ only stimuli (the control condition), was examined to evaluate whether any significant negativity would be found while performing the same analysis on an average of 100 trial permutations. In this case, 85% of trials were randomly labeled as “standard,” while 15% were randomly labeled as “deviant” stimuli, even though all stimuli in the ninth condition were identical. This helped to validate the analyses used to identify true MMN responses in the trial blocks. The analysis for the control block did not reveal any MMN-like responses, which validated that the negativities observed for the deviant stimuli during the trial blocks represented a true MMN response effect.

Discussion

In the present study, the neurophysiological basis (MMN response dynamics) of temporal (ART) and spectral (FRT) cue-weighting was examined during the /bα/-/wα/ contrast for adults with CIs. Based on the results of the previous study (Moberly et al., 2014a) and prior reports (Lipski et al., 2012; Tuomainen & van der Lely, 2007; Ylinen et al., 2009) on NH listeners showing larger MMNs for more heavily weighted cues, it was expected that CI listeners who weight ART more (ART-users) would show larger MMNs during the ART than the FRT contrast, and vice versa for FRT-users. These expectations were not realized for the current group of CI users. However, CI listeners who weighted FRT more than ART during phonetic classifications, which is the strategy predominantly favored by NH listeners, showed latency segregation of the ART and FRT MMNs similar to NH listeners (Moberly et al., 2014a). These results suggest that while the neurophysiological mechanisms informing perception in individuals with hearing loss diverge from those of NH listeners on the whole, a subset of CI users are able to adapt to some NH strategies, as reflected in their behavior and neurophysiology.

Perhaps the fact that the MMN magnitude did not reflect weighting strategies in the current data should not be surprising. Even though all CI individuals favored one cue over the other during classification (see Moberly et al., 2014b), about half of these subjects used the two cues in close proportions (Figure 1). Thus, the degree of dominance for one cue over the other (ART or FRT) is not as conclusive in the current CI population as in NH listeners (Carpenter and Shahin, 2013; Moberly et al., 2014a), which may explain the lack of MMN magnitude differences.

What distinguished the two groups of CI users, however, is that the FRT-user group exhibited MMNs that were temporally segregated for the ART and FRT contrasts. The ART MMN occurred much earlier (~100 ms) than the FRT MMN in the FRT-users. On the other hand, ART-users exhibited the MMNs of the different cues within close temporal proximity (Figure 3B). This temporal segregation for cue labeling reflected by the MMN behavior in FRT-users is consistent with the NH MMN behavior reported previously (Moberly et al., 2014a). In that study, it was posited that the early MMN response represented change detection, with the later MMN response potentially representing cue weighting. That is, the later MMN indexed a process whereby the dominant cue (FRT) representation is brought to the foreground in auditory memory, while representations of less useful cues (i.e., ART) are pushed to the background in auditory memory. For the FRT-users, a similar strategy appears to apply, with FRT-users processing ART changes earlier than FRT changes, allowing for the temporal segregation of the two processes along the auditory stream. This suggests that some CI users can adapt to strategies similar to NH behavior, although this adaptation is in the timing, not the magnitude, of the MMN.

Perhaps demographic or language measures of the two groups, FRT-users and ART-users, might explain why the temporal segregation of the MMN response pattern was seen for the FRT-users but not the ART-users. Examination of demographic and language measures for the individuals of these two groups revealed no differences between groups on measures of vocabulary, socioeconomic status, years of education, years of deafness, age at time of testing, or age at time of implantation. However, interestingly, the FRT-user group demonstrated a significantly higher word recognition score than the ART-user group (66% versus 39%, t(14) = −2.33, p = .035). These findings suggest that the CI users with better speech recognition abilities may be more likely to perceptually weight speech cues like adults with NH, and this is represented neurophysiologically in the temporal segregation of the MMN responses to ART and FRT distinctions.
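As a check on the reported statistic, the two-tailed p-value can be recovered from the t value and degrees of freedom alone. The following is a minimal sketch, not the authors' analysis code; the function names and the use of Simpson's rule for the integration are assumptions made for illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed p-value: 1 minus the area under the density on [-|t|, |t|].

    The area on [0, |t|] is approximated with Simpson's rule (steps must be
    even) and doubled, because the distribution is symmetric around zero.
    """
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = total * h / 3
    return 1 - 2 * half_area

# Values taken from the text: t(14) = -2.33
p = two_tailed_p(-2.33, 14)
print(round(p, 3))  # 0.035, in agreement with the reported p = .035
```

The sign of t only reflects the order in which the two groups were entered into the comparison, so the absolute value is used.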

As has previously been found, the MMN responses for adults with CIs listening to different stimulus manipulations occurred earlier than those of adults with NH (Ponton and Don, 1995). The NH adults previously tested (Moberly et al., 2014a) showed the largest overall MMN responses during the 201–250 ms time window, with a predominance of the MMN response to FRT over ART occurring during the 251–300 ms time window. For the adults with CIs presented here, the FRT-users showed a similar finding, but the MMN occurred earlier during the ART distinction. It might be argued that the early response could represent an N1 obligatory potential. However, the use of a “flip-flop” protocol design should, for the most part, eliminate differences between the deviant and standard waveforms that are due solely to N1 differences. Therefore, these early ART responses are most likely true MMN responses, and their behavior is consistent with the explanation given earlier, whereby the early MMN represents change detection and the later one represents organization in auditory memory.
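The logic of the “flip-flop” comparison can be illustrated with a minimal sketch. This is not the analysis pipeline used in the study; the function names, the 50-ms sampling grid, the toy amplitudes, and the early window limits are hypothetical. The key point is that the deviant response to a stimulus is compared with the standard response to the same physical stimulus recorded in the reversed block, so that stimulus-driven (obligatory) activity cancels in the difference wave.

```python
def mean(values):
    return sum(values) / len(values)

def average_epochs(epochs):
    """Average equal-length single-trial epochs sample by sample."""
    return [mean(samples) for samples in zip(*epochs)]

def difference_wave(deviant_epochs, standard_epochs):
    """Deviant-minus-standard wave for the SAME physical stimulus."""
    dev = average_epochs(deviant_epochs)
    std = average_epochs(standard_epochs)
    return [d - s for d, s in zip(dev, std)]

def window_mean(wave, times_ms, start_ms, end_ms):
    """Mean amplitude of the wave for samples within [start_ms, end_ms]."""
    inside = [v for v, t in zip(wave, times_ms) if start_ms <= t <= end_ms]
    return mean(inside)

# Hypothetical toy data (arbitrary units), one value every 50 ms.
times_ms = [0, 50, 100, 150, 200, 250, 300]
# Same stimulus presented as the standard (block A) and as the deviant (block B).
# Both contain an identical N1-like deflection; only the deviant carries an
# additional negativity at 100-150 ms.
standard_epochs = [[0.0, -1.0, -2.0, -1.0, 0.0, 0.0, 0.0],
                   [0.0, -1.0, -2.0, -1.0, 0.0, 0.0, 0.0]]
deviant_epochs = [[0.0, -1.0, -3.0, -2.0, 0.0, 0.0, 0.0],
                  [0.0, -1.0, -3.0, -2.0, 0.0, 0.0, 0.0]]

diff = difference_wave(deviant_epochs, standard_epochs)
early = window_mean(diff, times_ms, 100, 150)  # hypothetical early window
late = window_mean(diff, times_ms, 201, 250)   # window named in the text
print(early, late)
```

In this toy case the shared N1-like deflection cancels, leaving a negativity only in the early window; a residual difference of this kind is what would be interpreted as an MMN rather than an obligatory response.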

It is possible that for some of our adults with CIs, the brain’s reliance on ART was increased over its typical reliance on FRT, because FRT was relatively degraded by the implant’s processing and signal delivery. However, the presence of MMN responses in most subjects to both the FRT and ART contrasts, along with data from the previous study showing relatively similar average discrimination for the FRT and ART contrasts (Moberly et al., 2014b), suggests that both cues had similar salience for our listeners. The presence of late MMN responses to FRT in both groups of CI listeners during the 201–300 ms time window lends further support to the theory that the MMN represents a level of pre-attentive perceptual organization, not just change detection.

It should be noted that the current results do not provide convincing evidence that the MMN can be used as a valuable tool for assessing perceptual differences of speech cue weighting at the individual level in clinical populations, in line with the conclusions of several previous studies that have attempted to identify the MMN response in individual subjects (Bishop & Hardiman, 2010; McGee et al., 1997; Näätänen et al., 2004; Pakarinen et al., 2007; Picton et al., 2000; Ponton et al., 1997). Therefore, MMN may be a valuable tool for examining cue-weighting strategies in groups of subjects, but less so for evaluating individuals. Further investigations of CI listeners are warranted to improve the use of the MMN response, perhaps in combination with other electrophysiological measures, as an investigatory method for assessing perceptual cue weighting.

Finally, we should note that one caveat of the current experimental design was the use of two loudspeakers for sound presentation. Phase misalignment of the two sound waves may have caused destructive interference, albeit minute, at the ear. This issue may be especially relevant in CI research and should be considered in future studies.

Conclusion

The findings of this study provide evidence that in adults with cochlear implants, the perceptual weighting of the ART and FRT cues distinguishing the /bα/-/wα/ CVs, and the neurophysiological mechanisms supporting these strategies, diverge from those seen in NH listeners. However, a subset of CI users with relatively good speech recognition skills showed an MMN response pattern, namely temporal segregation of the ART and FRT MMNs, that was consistent with the MMN response dynamics of adults with normal hearing. This was in contrast to a second subset with poorer speech recognition, whose MMN responses did not mirror those seen in normal hearing adults. These findings suggest that some CI users may learn or relearn to adapt to the cue weighting strategy of NH listeners, and that this adaptation can be indexed by the dynamics of the MMN.

Acknowledgments

This study was supported by an NIH/NIDCD award (AJS, R01-DC013543). The authors would like to thank Dr. Susan Nittrouer, Dr. Joanna Lowenstein, and Eric Tarr for their assistance in stimulus design and synthesis, and Dr. D. Bradley Welling for providing access to his patients as participants in the study. The authors would also like to acknowledge Dr. Kelly Tremblay for her insightful comments on data analysis.

Source of Funding: This work was supported by an NIH/NIDCD grant (AJS, R01-DC013543).

Footnotes

Conflicts of Interest: The authors declare no conflict of interest.

References

  1. Bailey P, Summerfield Q. Information in speech: observations on the perception of [s]-stop clusters. J Exp Psychol Hum Percept Perform. 1980;6:536–563. doi: 10.1037//0096-1523.6.3.536.
  2. Best CT. The emergence of native-language phonological influences in infants: A perceptual assimilation model. In: Goodman JC, Nusbaum HC, editors. The development of speech perception: The transition from speech sounds to spoken words. Cambridge: MIT Press; 1994. pp. 167–224.
  3. Best CT, Morrongiello BA, Robson RC. Perceptual equivalence of acoustic cues in speech and nonspeech perception. Percept Psychophys. 1981;29:191–211. doi: 10.3758/bf03207286.
  4. Bishop DVM, Hardiman MJ. Measurement of mismatch negativity in individuals: A study using single-trial analysis. Psychophys. 2010;47:697–705. doi: 10.1111/j.1469-8986.2009.00970.x.
  5. Carpenter AL, Shahin AJ. Development of the N1–P2 auditory evoked response to amplitude rise time and rate of formant transition of speech sounds. Neurosci Lett. 2013;544:56–61. doi: 10.1016/j.neulet.2013.03.041.
  6. Delorme A, Makeig S. EEGLAB: An open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J Neurosci Methods. 2004;134:9–21. doi: 10.1016/j.jneumeth.2003.10.009.
  7. Friesen LM, Shannon RV, Baskent D, et al. Speech recognition in noise as a function of the number of spectral channels: Comparison of acoustic hearing and cochlear implants. JASA. 2001;110:1150–1163. doi: 10.1121/1.1381538.
  8. Hall JW. New Handbook of Auditory Evoked Responses. Pearson publishing; Boston, MA: 2007.
  9. Holt LL, Lotto AJ. Cue weighting in auditory categorization: Implications for first and second language acquisition. JASA. 2006;119:3059–3071. doi: 10.1121/1.2188377.
  10. Jusczyk PW, Hohne EA, Mandel DR. Picking up regularities in the sound structure of the native language. In: Strange W, editor. Speech Perception and Linguistic Experience: Issues in Cross-language Research. Baltimore: York Press; 1995. pp. 91–119.
  11. Kiefer J, von Ilberg C, Reimer B, et al. Results of cochlear implantation in patients with severe to profound hearing loss - Implications for the indications. Audiology. 1998;37:382–395. doi: 10.3109/00206099809072991.
  12. Kraus N, McGee T, Carrell TD, et al. Neurophysiologic bases of speech discrimination. Ear Hear. 1995;16:19–37. doi: 10.1097/00003446-199502000-00003.
  13. Kraus N, Micco AG, Koch DB, et al. The mismatch negativity cortical evoked potential elicited by speech in cochlear-implant users. Hear Res. 1993;65:118–24. doi: 10.1016/0378-5955(93)90206-g.
  14. Lipski SC, Escudero P, Benders T. Language experience modulates weighting of acoustic cues for vowel perception: An event-related potential study. Psychophys. 2012;49:638–650. doi: 10.1111/j.1469-8986.2011.01347.x.
  15. Luck SJ, Hillyard SA. Electrophysiological evidence for parallel and serial processing during visual search. Perception & Psychophysics. 1990;48:603–617. doi: 10.3758/bf03211606.
  16. McGee T, Kraus N, Nicol T. Is it really a mismatch negativity? An assessment of methods for determining response validity in individual subjects. Electroen Clin Neuro. 1997;104:359–368. doi: 10.1016/s0168-5597(97)00024-5.
  17. Moberly AC, Bhat J, Welling DB, et al. Neurophysiology of spectrotemporal cue organization of spoken language in auditory memory. Brain Lang. 2014a;130:42–49. doi: 10.1016/j.bandl.2014.01.007.
  18. Moberly AC, Lowenstein JH, Tarr E, et al. Do adults with cochlear implants rely on different acoustic cues for phoneme perception than adults with normal hearing? J Speech Lang Hear Res. 2014b;57:566–582. doi: 10.1044/2014_JSLHR-H-12-0323.
  19. Näätänen R. The perception of speech sounds by the human brain as reflected by the mismatch negativity (MMN) and its magnetic equivalent (MMNm). Psychophys. 2001;38:1–21. doi: 10.1017/s0048577201000208.
  20. Näätänen R, Kujala T, Winkler I. Auditory processing that leads to conscious perception: A unique window to central auditory processing opened by the mismatch negativity and related responses. Psychophys. 2011;48:4–22. doi: 10.1111/j.1469-8986.2010.01114.x.
  21. Näätänen R, Pakarinen S, Rinne T, et al. The mismatch negativity (MMN) – towards the optimal paradigm. Clin Neurophys. 2004;115:140–144. doi: 10.1016/j.clinph.2003.04.001.
  22. Näätänen R, Paavilainen P, Rinne T, et al. The mismatch negativity (MMN) in basic research of central auditory processing: a review. Clin Neurophys. 2007;118:2544–2590. doi: 10.1016/j.clinph.2007.04.026.
  23. Nittrouer S. Age-related differences in weighting and masking of two cues to word-final stop voicing in noise. JASA. 2005;118:1072–1088. doi: 10.1121/1.1940508.
  24. Nittrouer S, Lowenstein JH, Tarr E. Amplitude rise time does not cue the /bα/-/wα/ contrast for adults or children. J Speech Lang Hear Res. 2012;56:427–440. doi: 10.1044/1092-4388(2012/12-0075).
  25. Nittrouer S, Miller ME. Predicting developmental shifts in perceptual weighting schemes. JASA. 1997;101:2253–2266. doi: 10.1121/1.418207.
  26. Nittrouer S, Studdert-Kennedy M. The stop-glide distinction: Acoustic analysis and perceptual effect of variation in syllable amplitude envelope for initial /b/ and /w/. JASA. 1986;80:1026–1029. doi: 10.1121/1.393843.
  27. Ohde RN, Haley KL. Stop-consonant and vowel perception in 3- and 4-year-old children. JASA. 1997;102:3711–3722. doi: 10.1121/1.420135.
  28. Pakarinen S, Takegata R, Rinne T, et al. Measurement of extensive auditory discrimination profiles using mismatch negativity (MMN) of the auditory event-related potential. Clin Neurophys. 2007;118:177–185. doi: 10.1016/j.clinph.2006.09.001.
  29. Picton TW, Alain C, Otten L, et al. Mismatch negativity: different water in the same river. Audiol Neurootol. 2000;5:111–139. doi: 10.1159/000013875.
  30. Peterson NR, Pisoni DB, Miyamoto RT. Cochlear implants and spoken language processing abilities: Review and assessment of the literature. Restor Neurol Neurosci. 2010;28:237–250. doi: 10.3233/RNN-2010-0535.
  31. Ponton CW, Don M. The mismatch negativity in cochlear implants users. Ear Hear. 1995;16:131–146. doi: 10.1097/00003446-199502000-00010.
  32. Ponton CW, Don M, Eggermont JJ, et al. Integrated MMN (MMNi): A noise free representation that allows distribution free single-point statistical tests. Electroen Clin Neuro. 1997;104:143–150. doi: 10.1016/s0168-5597(97)96104-9.
  33. Ponton CW, Eggermont JJ, Don M, et al. Maturation of the mismatch negativity: Effects of profound deafness and cochlear implant use. Audiol Neurootol. 2000;5:167–185. doi: 10.1159/000013878.
  34. Pulvermüller F, Shtyrov Y. Language outside the focus of attention: The mismatch negativity as a tool for studying higher cognitive processes. Prog Neurobiol. 2006;79:49–71. doi: 10.1016/j.pneurobio.2006.04.004.
  35. Roman S, Canévet G, Marquis P, et al. Relationship between auditory perception skills and mismatch negativity recorded in free field in cochlear-implant users. Hear Res. 2005;201:10–20. doi: 10.1016/j.heares.2004.08.021.
  36. Shipp DB, Nedzelski J. Prognostic indicators of speech recognition performance in adult cochlear implant users: a prospective analysis. Ann Otol Rhinol Laryngol Suppl. 1995;166:194–196.
  37. Tremblay K, Kraus N, Carrell TD, et al. Central auditory system plasticity: Generalization to novel stimuli following listening training. JASA. 1997;102:3762–3773. doi: 10.1121/1.420139.
  38. Tremblay K, Kraus N, McGee T. The time course of auditory perceptual learning: Neurophysiological changes during speech-sound training. Neuroreport. 1998;9:3557–3560. doi: 10.1097/00001756-199811160-00003.
  39. Tuomainen O, van der Lely H. Processing of acoustic cues for voicing in English: A MMN study. Proceedings of the 16th International Congress of Phonetic Sciences. 2007:813–816.
  40. Walsh MA, Diehl RL. Formant transition duration and amplitude rise time as cues to the stop/glide distinction. Q J Exp Psychol. 1991;43:603–620. doi: 10.1080/14640749108400989.
  41. Wilson BS, Dorman MF. Cochlear implants: Current designs and future possibilities. J Rehabil Res Dev. 2008;45:695–730. doi: 10.1682/jrrd.2007.10.0173.
  42. Wunderlich JL, Cone-Wesson BK. Effects of stimulus frequency and complexity on the mismatch negativity and other components of the cortical auditory-evoked potential. JASA. 2001;109:1526–1537. doi: 10.1121/1.1349184.
  43. Ylinen S, Uther M, Latvala A, et al. Training the brain to weight speech cues differently: A study of Finnish second-language users of English. J Cog Neurosci. 2009;22:1319–1332. doi: 10.1162/jocn.2009.21272.
  44. Zhang F, Hammer T, Banks H, et al. Mismatch negativity and adaptation measures of late auditory evoked potential in cochlear implant users. Hear Res. 2011;275:17–29. doi: 10.1016/j.heares.2010.11.007.
