Abstract
The auditory mismatch negativity (MMN) component of event-related potentials (ERPs) has served as a neural index of auditory change detection. MMN is elicited by presentation of infrequent (deviant) sounds randomly interspersed among frequent (standard) sounds. Deviants elicit a larger negative deflection in the ERP waveform compared to the standard. There is considerable debate as to whether the neural mechanism of this change detection response is due to release from neural adaptation (neural adaptation hypothesis) or from a prediction error signal (predictive coding hypothesis). Previous studies have not been able to distinguish between these explanations because paradigms typically confound the two. The current study disambiguated effects of stimulus-specific adaptation from expectation violation using a unique stimulus design that compared expectation violation responses that did and did not involve stimulus change. The expectation violation response without the stimulus change differed in timing, scalp distribution, and attentional modulation from the more typical MMN response. There is insufficient evidence from the current study to suggest that the negative deflection elicited by the expectation violation alone includes the MMN. Thus, we offer a novel hypothesis that the expectation violation response reflects a fundamentally different neural substrate than that attributed to the canonical MMN.
Keywords: Attention, Event-related brain potentials (ERPs), Expectation, Mismatch negativity (MMN), Novelty detection, Predictive coding
1. Introduction
In humans, deviance detection can be measured using a scalp-recorded event-related brain response called the mismatch negativity (MMN) component (Squires et al. 1975; Näätänen et al. 1978). The MMN is elicited by infrequently occurring tones that are randomly interspersed with frequently occurring tones and differ from them in a sound feature (e.g., frequency or intensity), as well as by auditory pattern violations (Alho et al. 1993; Sussman et al. 1998; Alain et al. 1999; Sussman et al. 2002; Sussman 2007). The averaged time-locked response to the infrequent tones shows a larger negative-going deflection than the averaged evoked response to the frequently repeated tones. This negative voltage difference between the responses to the frequent and infrequent tones delineates the MMN component and reflects neurophysiological deviance detection.
The neural mechanism that generates the larger negative response evoked by the deviant is highly debated, and the debate centers on two main theories: one involving neural adaptation and the other predictive coding. The neuronal adaptation hypothesis (Ulanovsky et al. 2003; Pérez-González et al. 2005; Anderson et al. 2009; May and Tiitinen 2010; Fishman 2013) proposes that elicitation of the MMN component is the result of a release from an adapted state of the neurons when a rare stimulus (deviant) is presented. This theory interprets MMN as an “N1 enhancement”, a simple bottom-up mechanism, with the larger amplitude response to the deviant due to stimulation of less- or non-adapted cells (Butler 1968; Budd et al. 1998; May and Tiitinen 2010). The predictive coding hypothesis, in contrast, is based on a perceptual, top-down expectation, in which the predictions must be actively generated. The prediction is maintained by hierarchical interactions between the sensory input (e.g., bottom-up stimulus repetition) and top-down expectations resulting from the stimulus repetition (Rao and Ballard 1999; Friston 2005). This theory suggests that MMN elicitation reflects a comparison between the sensory input and predictions from higher cortical areas (Bendixen, Schröger, & Winkler, 2009; Friston, 2005; Garrido, Kilner, Kiebel, & Friston, 2009; Grimm & Escera, 2012; Rao & Ballard, 1999). The ‘prediction error’ is thus determined by a discrepancy between the input and the prediction. Therefore, this theory suggests that the larger amplitude response evoked by the deviant is due to the violation of the top-down expectation for a recurring stimulus rather than the stimulus change itself.
In most, if not all, previous MMN studies, the deviant that violated an expectation set up by a repetitive stimulus or pattern of stimuli also changed in some stimulus feature (e.g., Alain et al., 1999; Giard, Perrin, Pernier, & Bouchet, 1990; Javitt, Steinschneider, Schroeder, & Arezzo, 1996; Näätänen, Paavilainen, Tiitinen, Jiang, & Alho, 1993). This paradigmatic approach (time-locking the expectation violation to a stimulus change) cannot distinguish between processes associated with release from adaptation and expectation violation because both responses are coupled to the same stimulus event. Thus, it is not currently known whether MMN can be elicited by a pure violation of expectation, or whether a change in some stimulus parameter is needed to trigger the response.
The purpose of the current study was to differentiate deviant neural responses elicited by a change in some stimulus parameter (adaptation hypothesis) from those evoked only by a violation of expectation (predictive coding hypothesis), when both types of deviants involved violations of the same repeating tone pattern. Pattern deviants did not involve a change in any tone feature or timing of the stimuli. We used a unique stimulus design to distinguish between these explanations and determine whether a stimulus would elicit a deviant response if it violated a temporal expectation even when no stimulus feature was altered. We presented a repeating patterned tone-sequence (X1X2X3O, “standard”) and two deviant patterns: in half of the deviant patterns the terminal stimulus occurred later than expected (X1X2X3X4X5O, “late deviant”), and in the other half it occurred earlier than expected (X1O, “early deviant”). In this way, the ‘deviants’ involved a violation of the temporal pattern created by the unfolding sequence of events. The temporal expectation was violated by ‘misplacing’ the position of the O tone within the sequence, and not by changing the interstimulus relationship of sounds within the sequence. This experimental design allowed us to measure and compare the response to an expectation violation that involved no change in the stimulus (X4 of the late deviant) with the response to a deviant that involved both a change in the stimulus and a change in the expected pattern (the O of the early deviant). An enhanced negativity to the X4 tone could not be explained by the adaptation hypothesis, that is, by a simple bottom-up mechanism, because there would be no ‘fresh afferents’ response: no non-adapted cells would contribute to the enhancement.
Thus, an enhanced negativity to the X4 tone would be better explained by the predictive coding hypothesis, in which the expectation of the O tone of the standard pattern (X1X2X3O X1X2X3O) was violated by the repetition of the X tone in its place (X1X2X3X4X5O).
Integral to the predictive coding hierarchical model (Rao & Ballard, 1999; Friston, 2005) is the involvement of top-down processes that initiate expectations (i.e., the higher-level structures feed back to lower-level structures). To assess top-down involvement in the expectation violation response, we also manipulated the direction of the listener’s attention to the stimuli by presenting the same sets of patterned sounds in two conditions of attention: one in which listeners actively detected pattern deviants (‘active listening’), and one in which they watched a movie and had no task involving the sounds (‘passive listening’). The predictive coding model would predict that the response enhancement to the X4 tone of the late deviant should occur only during target detection, when active listening initiates the hierarchical interactions, and not during passive listening.
2. Methods
2.1 Participants
Ten healthy adults were paid for their participation (six females, M = 30 years, SD = 5). To be conservative, we derived our power calculation based on the smallest and most variable ERP component (MMN). Using the amplitude estimation of the MMN obtained from our previous study that used a similar paradigm (Sussman et al. 2002), there was ample power in a Passive condition (1-β=0.77), and in an Active condition (1-β=0.99) to detect an MMN with an alpha level of 0.05 and 10 adults.
All participants provided written informed consent prior to testing after they were told about the experiment, in accordance with the Declaration of Helsinki. The protocol was approved by the Internal Review Board of the Albert Einstein College of Medicine, where the study was conducted. All participants passed a hearing screening (20 dB HL at 500, 1000, 2000, & 4000 Hz) bilaterally, and had no reported history of neurological disorders.
2.2 Stimuli
Two tones (50 ms duration, 5 ms rise/fall time) were created with Adobe Audition 3 software. One had a fundamental frequency (f0) of 880 Hz (hereafter called the ‘X’ tone), and the other had an f0 of 988 Hz (hereafter called the ‘O’ tone). Both tones had four harmonics respective to their f0. Tones were calibrated to 55 dB(A) and were presented bilaterally through insert earphones (E-a-rtone 3A, Indianapolis, IN). The two tones were presented in a fixed sequential order (XXXO…) with the O tone presented every fourth tone. This standard pattern (STD) occurred 80% of the time. Two ‘deviant’ patterns were created by presenting the O tone early in the pattern, after only one X tone (XO; ‘early Dev’), or late, after five repetitions of the X tone (XXXXXO; ‘late Dev’). Each deviant pattern occurred 10% of the time. The deviant patterns were pseudo-randomized within the sound sequence, with the constraint that at least one standard pattern followed any deviant. The presentation rate was 200 ms stimulus onset asynchrony. For descriptive purposes, X stimuli are paired with a subscript number to denote their position within the standard and deviant patterns (e.g., the STD pattern: X1 X2 X3 O). An O tone always terminates the standard and deviant patterns (STD, early Dev, late Dev) (Fig 1A).
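The pseudo-randomization constraint described above can be illustrated with a minimal sketch. It is not the authors' stimulus-generation code. The per-block counts (160 standard, 20 early Dev, 20 late Dev patterns) assume that the totals given in the Procedures were split evenly across the twelve blocks, and the function names are hypothetical. The constraint is met by gluing one standard pattern onto each deviant before shuffling.

```python
import random

# Tone content of each pattern, as described in the Stimuli section.
PATTERNS = {
    "std": ["X", "X", "X", "O"],               # X1 X2 X3 O
    "early": ["X", "O"],                       # X1 O
    "late": ["X", "X", "X", "X", "X", "O"],    # X1 ... X5 O
}


def make_pattern_order(n_std, n_early, n_late, rng):
    """Shuffle pattern labels so every deviant is followed by a standard."""
    if n_std < n_early + n_late:
        raise ValueError("need at least one standard per deviant")
    # Each deviant carries its own trailing standard (hypothetical strategy),
    # so the constraint holds by construction after shuffling.
    units = [["early", "std"]] * n_early + [["late", "std"]] * n_late
    units += [["std"]] * (n_std - n_early - n_late)
    rng.shuffle(units)
    return [label for unit in units for label in unit]


def patterns_to_tones(order):
    """Expand a list of pattern labels into the tone sequence."""
    return [tone for label in order for tone in PATTERNS[label]]


rng = random.Random(0)                         # fixed seed, for reproducibility
order = make_pattern_order(160, 20, 20, rng)   # assumed composition of one block
tones = patterns_to_tones(order)
```

With these assumed counts, one block contains 160 × 4 + 20 × 2 + 20 × 6 = 800 tones, matching the block length reported in the Procedures.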
2.3 Procedures
Participants were seated comfortably in a sound-attenuated, electrically shielded booth (Industrial Acoustics Corp., Bronx, NY). There were two conditions: Passive and Active. In the Passive condition, participants were told to ignore the auditory stimuli and had no task with the sounds. They watched a self-selected movie with subtitles and no sound. In the Active condition, participants were informed about the four-tone stimulus pattern and were instructed to press a response key the moment they detected a deviant pattern: the O tone in the early Dev pattern, and X4 in the late Dev pattern. Subjects fixated on a cross at the center of the monitor, where they watched the movie in the Passive condition. The experimenter monitored eye blinking and saccades during the EEG recording to ensure that subjects were reading the captions during the Passive condition task and that the eyes were open during both tasks. In each condition, 9600 tones (1920 four-tone STD patterns, 240 early Dev patterns and 240 late Dev patterns) were presented in twelve separately randomized blocks of 800 tones. Half of the participants were presented with the Active condition first and the other half with the Passive condition first. Total recording session time, including electrode cap placement and breaks, was approximately 2.5 hours.
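As a consistency check on the counts above, the following short sketch recomputes the total number of tones, the pattern proportions, and the pure stimulation time per condition from the reported pattern counts and the 200 ms stimulus onset asynchrony. The variable names are illustrative only.

```python
SOA_MS = 200  # stimulus onset asynchrony, from the Stimuli section
N_BLOCKS = 12

tones_per_pattern = {"std": 4, "early": 2, "late": 6}
patterns = {"std": 1920, "early": 240, "late": 240}

total_patterns = sum(patterns.values())
total_tones = sum(patterns[k] * tones_per_pattern[k] for k in patterns)
share = {k: patterns[k] / total_patterns for k in patterns}

# Stimulation time per condition, excluding breaks between blocks.
minutes_per_condition = total_tones * SOA_MS / 60000
tones_per_block = total_tones // N_BLOCKS

print(total_tones, tones_per_block, minutes_per_condition)  # 9600 800 32.0
print(share)  # {'std': 0.8, 'early': 0.1, 'late': 0.1}
```

The counts reproduce the stated 80%/10%/10% pattern proportions and imply 32 minutes of sound presentation per condition.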
2.4 Electroencephalogram (EEG) Recording
EEG was recorded using a 32-channel electrode cap placed according to the modified International 10–20 System from FPz, Fz, Cz, Pz, Oz, FP1, FP2, F7, F8, F3, F4, FC5, FC6, FC1, FC2, T7, T8, C3, C4, CP5, CP6, CP1, CP2, P7, P8, P3, P4, O1, O2, and from the left and right mastoids (LM and RM, respectively). Horizontal eye movements were measured by recording the horizontal electro-oculogram (EOG) in a bipolar configuration between the F7 and F8 electrodes. Vertical EOG was monitored using the FP1 electrode in a bipolar configuration with an external electrode placed below the left eye. The reference electrode was placed at the tip of the nose. PO9 was used for the ground electrode. Impedance was maintained below 5 kΩ across all electrodes. The EEG and EOG were digitized (Neuroscan Synamps amplifier, Compumedics Corp., El Paso, Texas) at 500 Hz (0.05–100 Hz bandpass). The EEG was then filtered offline (0.1–30 Hz) using a finite impulse response filter with zero phase shift and a roll-off slope of 24 dB/octave using Neuroscan SCAN software 4.3 for PC.
2.5 Data Analysis
2.5.1 Behavioral Data
Reaction time (RT), hit rate (HR), and false alarm rate (FAR) were calculated for the Active condition. A button response was counted as a “hit” when the response was made 100–900 ms from the onset of the stimulus at which the change could be detected (i.e., the O tone of the early Dev and the X4 tone of the late Dev). A button press at any other time was counted as a false alarm. To compare behavioral performance between the early Dev and the late Dev, Student’s t-tests for dependent measures were conducted separately for RT, HR, and FAR.
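The scoring rule above can be sketched as follows. This is an illustration under stated assumptions, not the scoring program used in the study: the window bounds are treated as inclusive, at most one press is credited per target, and the function returns a raw false alarm count because the denominator used for FAR is not specified in the text. The function name and the example times are hypothetical.

```python
def score_responses(target_onsets_ms, press_times_ms, lo=100, hi=900):
    """Return (hit rate, number of false alarms, list of hit RTs in ms).

    A press is a hit if it falls lo..hi ms after the onset of a target
    that has not yet been credited; any other press is a false alarm.
    """
    pending = sorted(target_onsets_ms)
    hits, false_alarms, rts = 0, 0, []
    for press in sorted(press_times_ms):
        # Discard targets whose response window has already closed.
        while pending and press - pending[0] > hi:
            pending.pop(0)
        if pending and lo <= press - pending[0] <= hi:
            rts.append(press - pending.pop(0))
            hits += 1
        else:
            false_alarms += 1
    hit_rate = hits / len(target_onsets_ms) if target_onsets_ms else 0.0
    return hit_rate, false_alarms, rts


# Made-up example: three targets, two timely presses, one stray press.
hit_rate, n_false_alarms, rts = score_responses(
    [1000, 3000, 5000], [1450, 2000, 3520]
)
```

In this made-up example, two of three targets are detected (RTs of 450 and 520 ms) and the press at 2000 ms counts as a false alarm.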
2.5.2 ERP Data
Filtered EEG was then epoched into 900-ms segments, including a 100-ms pre-stimulus period. Epochs were baseline corrected, and those containing artifacts (EEG or EOG activity exceeding ±75 µV) in any recorded electrode were excluded from further analysis. Overall, an average of 18% of trials was rejected.
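A minimal sketch of this epoching and rejection step is given below. It assumes that baseline correction means subtracting the mean of the 100-ms pre-stimulus interval from each channel, which is a common convention but is not stated explicitly in the text. The data layout (a dictionary of per-channel sample lists in microvolts) and the function names are hypothetical, and the actual processing was done in Neuroscan software.

```python
FS_HZ = 500        # sampling rate reported in the recording section
PRE_MS = 100       # pre-stimulus period
EPOCH_MS = 900     # total epoch length, including the pre-stimulus period
REJECT_UV = 75.0   # absolute amplitude criterion for artifact rejection


def ms_to_samples(ms, fs_hz=FS_HZ):
    return int(round(ms * fs_hz / 1000))


def extract_epoch(eeg, onset_sample):
    """Cut one epoch around onset_sample from every channel.

    eeg maps channel name -> list of samples in microvolts.
    Returns a dict of baseline-corrected segments, or None if the epoch
    runs off the recording or any channel exceeds the rejection criterion.
    """
    n_pre = ms_to_samples(PRE_MS)
    start = onset_sample - n_pre
    stop = start + ms_to_samples(EPOCH_MS)
    epoch = {}
    for channel, samples in eeg.items():
        if start < 0 or stop > len(samples):
            return None
        segment = samples[start:stop]
        baseline = sum(segment[:n_pre]) / n_pre   # assumed baseline definition
        segment = [v - baseline for v in segment]
        if max(abs(v) for v in segment) > REJECT_UV:
            return None                           # artifact in any channel
        epoch[channel] = segment
    return epoch
```

At 500 Hz, each accepted epoch holds 450 samples per channel, the first 50 of which form the baseline.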
2.5.2.1 Time course analysis
A statistical analysis was performed in time windows that were not based on peak latency identification of the more traditional ERP components (see Traditional ERP Components below). Rather than choosing the measurement intervals based on observed peaks in the grand-mean difference waveforms, the intervals were taken in 50-ms steps from the onset to the offset of the negative deflection observed in the difference waveforms. These roughly corresponded to the N1, MMN, and N2b ERP components in early (100–150 ms), intermediate (150–200 ms), and later (200–250 ms) time intervals. The mean amplitudes were separately obtained from the difference (deviant-minus-standard) waveforms in each individual at Fz, for each condition and deviant type (Figs 2B, 3B). Because there was an a priori expectation for the direction of polarity, one-sample, one-tailed Student’s t-tests were used to determine in which time intervals the negative deflection was significantly greater than zero.
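The time course analysis can be sketched in a few lines. The sketch assumes a 500 Hz sampling rate and a 100-ms pre-stimulus period (both from the Methods), treats each window as half-open, and replaces a p-value computation with a comparison against the one-tailed critical value of t for α = 0.05 and 9 degrees of freedom (1.833), which applies only to a sample of ten participants. The function names are hypothetical.

```python
import math
import statistics

WINDOWS_MS = {"early": (100, 150), "intermediate": (150, 200), "late": (200, 250)}
T_CRIT_DF9 = 1.833  # one-tailed critical t, alpha = 0.05, df = 9 (n = 10)


def window_mean(wave, t0_ms, t1_ms, fs_hz=500, pre_ms=100):
    """Mean amplitude of one difference waveform between t0_ms and t1_ms."""
    i0 = int(round((t0_ms + pre_ms) * fs_hz / 1000))
    i1 = int(round((t1_ms + pre_ms) * fs_hz / 1000))
    return statistics.fmean(wave[i0:i1])   # half-open window (assumption)


def one_sample_t(values):
    """t statistic for H0: population mean equals zero."""
    n = len(values)
    return statistics.fmean(values) / (statistics.stdev(values) / math.sqrt(n))


def is_negative_deflection(values, t_crit=T_CRIT_DF9):
    """One-tailed test in the negative direction; t_crit must match len(values) - 1."""
    return one_sample_t(values) < -t_crit
```

A typical use would collect `window_mean` for each of the ten participants in one window and pass the resulting list to `is_negative_deflection`.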
2.5.2.2 Traditional ERP components
The traditional ERP components that delineate pattern violation detection (MMN, N2b, and P3b) were measured by first visually identifying the peak latency in the grand-mean difference (deviant-minus-standard) waveforms. The peak latency for MMN was determined at the LM electrode (where the peak maximal signal-to-noise ratio [SNR] could be delineated without overlap from other components) in all conditions. For the Active condition, the peak latency of the N2b component was determined from the Cz electrode, and the peak latency of the P3b component was determined from the Pz electrode (the electrodes of their respective maximal SNR). An interval (50 ms for MMN and N2b, and 60 ms for the P3b) centered on the grand-mean peak latency was then used to obtain the mean amplitude for each individual, for each stimulus type (standard and deviant) for each deviant (early Dev and late Dev), in each condition (Passive and Active). For MMN, the grand-mean peak latency in the Active condition for the early Dev was 152 ms and for the late Dev was 170 ms. In the Passive condition grand-mean peak latencies were 154 ms (early Dev) and 172 ms (late Dev). For N2b, the peaks were 202 ms (early Dev) and 214 ms (late Dev); and for the P3b the peaks were 460 ms (early Dev) and 504 ms (late Dev). Table 1 summarizes the peak latency for all ERP components in all conditions. Mean latency for each component was calculated separately, using a peak detection program from the windows centered on the peak latencies as determined above, for each individual and each stimulus type separately.
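The window construction described above can be sketched as follows, again assuming a 500 Hz sampling rate and a 100-ms pre-stimulus period. The functions are hypothetical illustrations, not the peak detection program used in the study. The polarity of the peak search is left to the caller, because the sign of a component depends on the electrode and reference at which it is measured.

```python
def centered_window(peak_ms, width_ms):
    """Window of width_ms centered on a grand-mean peak latency."""
    half = width_ms / 2
    return peak_ms - half, peak_ms + half


def peak_latency_ms(wave, window_ms, polarity, fs_hz=500, pre_ms=100):
    """Latency of the extreme sample inside window_ms.

    polarity = -1 searches for the most negative sample,
    polarity = +1 for the most positive one.
    """
    i0 = int(round((window_ms[0] + pre_ms) * fs_hz / 1000))
    i1 = int(round((window_ms[1] + pre_ms) * fs_hz / 1000))
    idx = max(range(i0, i1 + 1), key=lambda i: polarity * wave[i])
    return idx * 1000 / fs_hz - pre_ms


print(centered_window(152, 50))  # (127.0, 177.0)
print(centered_window(460, 60))  # (430.0, 490.0)
```

For the grand-mean peak latencies reported above, a 50-ms window around 152 ms spans 127–177 ms, and a 60-ms window around 460 ms spans 430–490 ms.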
Table 1.
Condition | MMN ED | MMN LD | N2b ED | N2b LD | P3b ED | P3b LD |
---|---|---|---|---|---|---|
Active | 149 (14) | 171 (16) | 202 (23) | 207 (25) | 461 (21) | 515 (23) |
Passive | 153 (22) | 174 (17) | | | | |
Standard deviation is shown in parentheses.
Latency measurements were taken at electrode LM for MMN, Cz for N2b, and Pz for P3b. ED: early Dev; LD: late Dev.
2.5.2.3 Standard and deviant epochs used for difference waveform comparisons
To minimize effects that may be evoked by subtracting responses to physically different stimuli, comparisons were made to the same physical sequence of tones with the difference being that one violated the standard pattern and the other was part of the standard pattern. For the response to the early Dev, the ERP evoked by the O tone (988 Hz) that terminated the pattern early was compared with the ERP evoked by the O tone (988 Hz) of the standard pattern. Thus, the comparison epochs contained the same physical sequence of stimuli (O-X1-X2-X3-O), with the difference between them in the role of the first O tone of the epoch, which was a deviant or a standard (Fig 1B). For the late Dev, we compared the ERP response to the X4 tone (880 Hz) with the ERP response to the X2 tone (880 Hz) of the standard pattern. Here, the X4 tone occurred at the time that a violation of the pattern could be detected (the O tone of the standard pattern was expected to occur as the fourth tone) and this was compared with an X tone that was not deviant (Fig 1C).
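The selection of comparison events can be sketched as below. The sketch takes a list of pattern labels in presentation order and returns the tone indices of the four event types. Restricting the comparison standard O tones to those followed by another standard pattern is an assumption made here so that the following tones match the O-X1-X2-X3-O sequence; the text does not state how the standard epochs were selected. The function and key names are hypothetical.

```python
PATTERN_LENGTH = {"std": 4, "early": 2, "late": 6}


def comparison_events(order):
    """Map event type -> list of tone indices (0-based) in the tone sequence."""
    events = {"early_dev": [], "early_std": [], "late_dev": [], "late_std": []}
    first_tone = 0  # index of the first tone of the current pattern
    for k, label in enumerate(order):
        followed_by_std = k + 1 < len(order) and order[k + 1] == "std"
        if label == "early":
            events["early_dev"].append(first_tone + 1)      # O after X1
        elif label == "late":
            events["late_dev"].append(first_tone + 3)       # X4
        else:
            events["late_std"].append(first_tone + 1)       # X2 of the standard
            if followed_by_std:                             # assumed selection rule
                events["early_std"].append(first_tone + 3)  # O of the standard
        first_tone += PATTERN_LENGTH[label]
    return events


events = comparison_events(["std", "early", "std", "std", "late", "std"])
```

Multiplying a tone index by the 200 ms stimulus onset asynchrony gives the event onset relative to the start of the sequence.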
2.5.3 Statistical Analyses
We first checked to confirm there were no effects of block order presentation (Passive condition first vs. Active condition first) on any of our dependent measures. No block order effects were found (p values ranged from 0.41–0.91), and thus data were grouped by condition and stimulus type.
An omnibus repeated measures analysis of variance (ANOVA) was conducted to verify the presence of the MMN component (STD, Dev), and to assess scalp topography (Fz, Cz, Pz), and effects of attention (active, passive) and deviant type (early Dev, late Dev). For the N2b and P3b ERP components, elicited only during sound task performance in the Active condition, repeated measures ANOVAs were conducted separately to assess the presence of the component (STD, Dev), scalp topography (Fz, Cz, Pz), and effects of deviant type (early Dev, late Dev). The amplitudes of the ERP responses to the terminal tone of the standard, early Dev, and late Dev patterns (the O tone) were compared using one-way repeated-measures ANOVA.
For all ANOVA calculations, where data violated the assumption of sphericity, degrees of freedom were corrected using Greenhouse-Geisser estimates of sphericity. Corrected df and p values are reported. For post hoc analyses, Tukey HSD for repeated measures was conducted on pairwise contrasts only when the omnibus ANOVA was significant. Contrasts were reported as significantly different at p < 0.05. All statistical analyses were performed using Statistica 12 software (Statsoft, Inc., Tulsa, OK).
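The Greenhouse-Geisser correction multiplies both the effect and the error degrees of freedom by the sphericity estimate ε. A minimal sketch for an effect that involves a within-subject factor with a given number of levels (crossed only with two-level factors, as in this design) follows. The function name is hypothetical, and the example value ε = 0.73 is taken from the Results.

```python
def gg_corrected_df(n_levels, n_subjects, epsilon):
    """Greenhouse-Geisser corrected (effect df, error df).

    n_levels: levels of the multi-level within-subject factor (3 electrodes).
    Two-level factors crossed with it contribute a factor of 1 to the df.
    """
    df_effect = n_levels - 1
    df_error = (n_levels - 1) * (n_subjects - 1)
    return epsilon * df_effect, epsilon * df_error


df_effect, df_error = gg_corrected_df(n_levels=3, n_subjects=10, epsilon=0.73)
```

With three electrodes, ten participants, and ε = 0.73, this gives approximately 1.46 and 13.14 degrees of freedom, close to the corrected values reported in the Results; small differences presumably reflect rounding of ε.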
Scalp current density maps were created for display purposes at the peak latency of the ERP components using BESA software (Gräfelfing, Germany).
3. Results
3.1 Behavioral Performance
Accuracy of target responses did not significantly differ as a function of deviant type, as reflected in HR (early Dev: M = .65, SE = .061; late Dev: M = .70, SE = .077) (t1, 9 = −1.672, p = 0.129) and FAR (early Dev: M = .007, SE = .0025; late Dev: M = 0.013, SE = .007) (t1, 9 = −1.55, p = 0.278). However, participants were significantly faster in responding to the early Dev (M = 486 ms, SE = 22) than the late Dev (M = 516 ms, SE = 27) (t1, 9 = −3.432, p = .007, r=.60), ∆ = −30.8, 95% CI [−51.0, −10.5].
3.2 ERPs
Figure 2 (Passive condition) and Figure 3 (Active condition) depict the grand-mean ERPs evoked by the early Dev and by the late Dev overlain with the grand-mean ERPs evoked by comparison standards, displaying the electrodes used to measure the ERP components, and their difference waveforms with the corresponding intervals used for the time course analysis. Table 1 summarizes the ERP latencies and Tables 2 and 3 summarize the ERP amplitudes.
Table 2.
Deviant Type | Fz DEV | Fz STD | Fz Difference | Cz DEV | Cz STD | Cz Difference | Pz DEV | Pz STD | Pz Difference |
---|---|---|---|---|---|---|---|---|---|
Early Dev | −1.27 (1.3) | 0.59 (0.5) | −1.86 | −0.92 (1.0) | 0.75 (0.5) | −1.67 | −0.48 (1.0) | 0.38 (0.6) | −0.86 |
Late Dev | −0.99 (1.4) | −0.17 (0.4) | −0.81 | −0.82 (1.4) | −0.10 (0.4) | −0.72 | −0.69 (1.4) | −0.09 (0.4) | −0.60 |
Grand-mean amplitudes (in microvolts). SD shown in parentheses.
Table 3.
Component | Deviant Type | Fz DEV | Fz STD | Fz Difference | Cz DEV | Cz STD | Cz Difference | Pz DEV | Pz STD | Pz Difference |
---|---|---|---|---|---|---|---|---|---|---|
MMN | Early Dev | −2.62 (2.1) | 0.82 (0.5) | −3.44 | −1.72 (1.8) | 1.15 (0.4) | −2.87 | −0.41 (1.2) | 0.29 (0.9) | −0.70 |
MMN | Late Dev | −1.50 (2.0) | −0.90 (0.8) | −0.60 | −1.23 (2.1) | −0.91 (0.7) | −0.32 | −0.89 (1.5) | −0.36 (0.5) | −0.53 |
N2b | Early Dev | −2.95 (2.7) | 0.95 (0.8) | −3.90 | −2.55 (2.8) | 1.19 (0.7) | −3.74 | −1.37 (2.1) | 0.17 (0.8) | −1.55 |
N2b | Late Dev | −2.31 (2.4) | −0.85 (0.8) | −1.46 | −3.04 (2.6) | −0.99 (0.8) | −2.05 | −2.81 (2.1) | −0.44 (0.8) | −2.37 |
P3b | Early Dev | 1.53 (2.2) | 0.87 (0.8) | 0.66 | 2.79 (3.1) | 0.12 (0.7) | 2.67 | 4.30 (2.9) | −0.76 (1.1) | 5.06 |
P3b | Late Dev | 1.43 (1.8) | −0.49 (0.9) | 1.93 | 2.69 (2.6) | −0.71 (1.0) | 3.41 | 3.67 (2.9) | −0.73 (0.9) | 4.40 |
Grand-mean amplitudes (in microvolts). SD is shown in parentheses.
3.2.1 Time course analysis
An enhanced negative response was elicited both by the O tone of the early Dev and by the adapted X4 tone of the late Dev without an active task, when participants watched a movie and had no task with the sounds (Passive condition). The negative amplitude of the difference waveform at Fz for the early Dev response was significantly greater than zero in the early (t1,9 = −3.731, p = 0.002), intermediate (t1,9 = −3.584, p = 0.003), and late (t1,9 = −1.883, p = 0.046) time intervals, but only in the intermediate time interval (t1,9 = −1.912, p = 0.044) of the late Dev response (Fig 2B, difference waveforms, bottom row; Fig 4, top row).
An enhanced negative response was also elicited by early and late deviants when participants actively detected pattern violations (Active condition). The negative difference amplitude for the early Dev was significantly greater than zero in all time intervals (early: t1, 9 = −5.77, p =0.0001; intermediate: t1,9 = −4.20, p =0.001; late: t1,9 = −3.92, p = 0.002), similar to the early Dev in the Passive condition (Fig 4, left column). In contrast, for late Dev, only the late time interval was significant (t1,9 = −1.84, p = 0.0497) (Fig 3B, difference waveforms, bottom row) and not the intermediate interval found in the Passive condition (p=.42 early, p=.16 intermediate) (Fig 4, right column).
3.2.2 Traditional ERP components
We also analyzed the ERP responses by their more traditional labeling. MMN, N2b, and P3b components were elicited by pattern deviants. Tables 1–3 summarize the mean amplitudes and latencies elicited by deviants and standards, measured in the corresponding latency range of the ERP components. MMN was elicited by the early Dev and by the late Dev pattern violations in both Passive and Active conditions. Overall, the early Dev MMN amplitude was significantly larger (more negative) than the late Dev MMN amplitude (significant main effect of deviant type, F1, 9 = 22.64, p = 0.001, ηp2 = 0.72). MMN peak latency was also significantly shorter to the early Dev deviant than to the late Dev deviant (main effect of deviant type, F1,9 = 20.37, p = 0.001, ηp2 = 0.69). Scalp distribution was consistent with known MMN topography (main effect of electrode: F1.08, 9.75 = 7.08, ε = 0.54, p =0.02, ηp2= 0.44). Post hoc calculations showed this was due to a significantly more negative amplitude at Fz than Pz (p=0.004). However, the scalp distribution significantly differed between deviants, being more frontal for the late Dev than the early Dev (significant interaction between deviant type, stimulus type, and electrode: F1.3, 11.3 = 17.07, ε = 0.63, p< 0.001, ηp2 = 0.65). Post hoc analysis of this interaction revealed a more negative amplitude to the early Dev deviant than to the standard at Fz, Cz, and Pz (p<0.001 for each comparison), whereas the late Dev deviant amplitude was more negative than the standard only at Fz (p=0.016) and not at Cz (p=0.14) or Pz (p=0.08). In sum, the MMN elicited by the early Dev pattern violation began earlier, was larger in amplitude, and, with its ‘typical’ component topography, was more similar to the ‘traditional’ MMN response than was the response to the late Dev pattern violation.
Attention directed toward or away from the pattern violations did not affect the MMN amplitude (passive vs. active listening) (no main effect of attention on amplitude: F1, 9 = 3.02, p = 0.12), or latency (measured from the left mastoid, no main effect of attention on latency: F1, 9<1, p = 0.52). However, the ‘true’ amplitudes and latencies of the responses cannot be compared between Passive and Active conditions due to overlap with other responses in the Active condition (e.g., N2b).
N2b was elicited by both the early Dev and the late Dev (significant main effect of stimulus type, F1, 9 = 10.50, p = 0.01, ηp2 = 0.54), which was due to a more negative deflection in response to deviants than standards (Fig 3A, Cz electrode). There was a difference in the scalp topography between pattern deviant types (significant interaction between deviant type and electrode, F1.46, 13.12 = 8.09, ε = 0.73, p = 0.008, ηp2 = 0.47) (Fig 3B). Post hoc calculations showed this was due to a more negative amplitude at Cz than Fz for the late Dev (p=0.045), and no difference across electrodes for the early Dev (Fz-Cz: p=0.20; Cz-Pz: p=0.99). There was no significant difference in peak N2b latency for the early Dev (202 ms, SD=23) compared to the late Dev (207 ms, SD = 25) (t1,9 = 1.14, p=0.28).
P3b was elicited by both detected pattern deviants (main effect of stimulus type, F1, 9 = 21.93, p = 0.001, ηp2 = 0.71). The positive deflection in response to deviants was larger than to standards (Fig 3A, Pz electrode). The P3b amplitude did not differ depending on deviant type (early vs. late, no main effect of deviant type, F1, 9<1, p = 0.37). There was a significant scalp distribution difference between early and late deviants (deviant type, stimulus type, electrode interaction: F1.12, 10.11 = 5.87, ε = 0.56, p = 0.033, ηp2 = 0.39), as was similarly found for the MMN and N2b components. Post hoc calculations showed this was due to a larger positive amplitude to the deviant than to the standard at Pz and Cz (p < 0.001) for the early Dev deviants, and no significant difference between deviant and standard at Fz (p = 0.52; i.e., centro-parietal distribution) (Fig 3A). In contrast, the late Dev had a broader distribution, with significant differences between deviants and standards at Pz, Cz, and Fz electrodes (p<0.001 for all contrasts) (Fig 3B). The P3b component also peaked earlier for the early Dev (461 ms, SD = 21) than for the late Dev (515 ms, SD = 23) (t1,9 = 7.73, p = 0.00003) (Fig 3B). Thus, even though the task was similar for both types of deviants (target responses involved pressing a key for unexpected pattern violations), there were significant differences in latency and scalp topography for the target detection components depending on whether the pattern violation included a stimulus change or not.
Pearson’s r was calculated to assess the relationship between RT and ERP component (MMN and P3b) amplitude and latency. There were nonsignificant correlations between RT and the component variables (RT and MMN latency D1: r = .16, p = .65; D2: r = .57, p = .09; RT and P3b latency D1: r = −.10, p = .79; D2: r = .44, p = .20; RT and MMN amplitude D1: r = .62, p = .06; D2: r = .44, p = .21; RT and P3b amplitude D1: r = −.41, p = .24; D2: r = −.12, p = .75).
3.3 O tones
The responses to the O tones are displayed in Figs 2–3. In the Passive condition, for the early deviant, the O tone coincided with the expectation violation and the termination of the pattern, whereas for the late deviant, the expectation violation occurred two tones prior to the O tone, which was the termination of the deviant pattern (Fig 2). Thus, it was difficult to compare responses to the O tones in this condition, as we cannot confirm whether the expectation violation response concluded at the time of detection of the violation within the pattern or at the terminal tone of the pattern. In the Active condition, the response to the O tone to the late deviant could not be delineated due to overlap with the P3b response (Fig 3). Thus, we did not compare the O tone responses statistically.
4. Discussion
The goal of this study was to evaluate neural adaptation and predictive coding explanations for the neurophysiological auditory change detection response (MMN). In most standard MMN paradigms, where the deviant is a different stimulus from the standard, there is a conflation of neuronal adaptation and expectation violation in the neural response, which precludes differentiating between these theories. We disentangled the two effects by comparing a pattern expectation violation response without stimulus change (i.e., coinciding with a tone repetition, the late pattern Dev) to an expectation violation response that includes a stimulus change (the early pattern Dev), and by manipulating the direction of participants’ attention. We found larger (enhanced) responses evoked by both deviant types when compared to physically identical standard stimuli. The early Dev enhanced negativity included a response to a stimulus change, demarcating the pattern violation, and could thus be explained by either theoretical view. However, the enhanced response to the late Dev was evoked by a repeated stimulus (X1X2X3X4X5O) and is difficult to attribute to a release from adaptation. Repetition of the X tone should result in a reduced response by the fourth repetition (X4) (Haenschel et al. 2005). Thus, the enhanced response triggered by X4 is better explained as a neurophysiological manifestation of an expectation violation. Notably, the expectation violation response did not depend upon perceptual expectancy, suggesting that active listening to initiate hierarchical interactions with incoming sensory input is not required for generating prediction errors. Moreover, clear differences between the responses to the two deviant types in our results suggest that the traditional MMN response may reflect a combination of processes, which can be distinguished from the expectation violation response, as described in detail below.
4.1 Responses to expectation violation are difficult to ascribe to stimulus-specific neural adaptation
By design, the responses evoked by the deviants cannot be attributed to adaptation with respect to the overall probability of the O or X tones. Deviancy was set up by violating the sequential relationship of the tones, rather than by manipulating the overall probability of one stimulus compared to another. That is, the placement of the O tones within the overall sequence demarcated an early pattern deviant or a late pattern deviant, but did not alter the ratio of X to O tones. In addition, both Dev types were compared to the response to a physically identical sequence. This is the primary strength of our experimental design, and contrasts with the traditional ‘standard-oddball’ paradigms most commonly used to study the MMN.
We note that a number of studies have used alternative approaches to attempt to distinguish between expectation violation signals and stimulus-specific adaptation. One such study used an ‘equiprobable’ ensemble in which the deviant, occurring with 10% probability in a 90–10% oddball sequence was compared with the response to the same stimulus presented among a diverse stimulus ensemble with each stimulus type occurring with 10% probability (Jacobsen et al. 2003). The rationale of this paradigm is that the contribution of adaptation should be subtracted out (10% probability deviant minus 10% probability standard) and should thus leave only the expectation violation response. However, this logic neglects that adaptation is not perfectly stimulus-specific. The response to a stimulus is reduced not only when it is preceded by itself but also by other similar stimuli (Malone et al. 2002; Blake and Merzenich 2002; Ulanovsky et al. 2003; Kvale and Schreiner 2004; Fishman et al. 2004). Thus, unless the adaptation caused by the standard and the equiprobable distribution are carefully measured, it is impossible to know which condition should evoke stronger stimulus-specific adaptation.
Another study taking a different approach convincingly demonstrated that MMN could be elicited by a pattern expectation violation that was not attributable to specific stimulus attributes (Alain et al. 1999). They compared pattern violations involving timing and frequency differences, in separate passive listening conditions, with large and small separations between the standard and deviants. Deviants replaced the first tone of the expected standard patterns. Alain et al. (1999) demonstrated that the MMN to the deviant was based on the violation of the pattern encoded in memory. This was determined by the larger MMN amplitude elicited by the deviant tone with the larger distance from the expected first tone of the pattern, and not by the deviant tone with the larger physical distance from the preceding tone of the standard pattern (in frequency or interstimulus interval). Whereas this study provided strong evidence that patterns were encoded in memory, the results could not differentiate an expectation violation explanation from a release-from-adaptation explanation because deviants involved stimulus changes in frequency or timing compared to the standard repeating pattern.
While the deviant responses in our paradigm cannot be explained by global stimulus probability, one might note that more ‘local’ stimulus-specific adaptation, such as the particular sequence of stimuli that appeared just before the extracted snippets, could explain the responses. However, these ‘local’ effects appear inconsistent with the observed responses. For the early Dev, the unexpected O tone occurred in close proximity to the terminal O tone of the previous standard pattern (i.e., XXXOXO). Thus, on the basis of adaptation, it would be expected that the O tone of the early Dev (XO) should have a reduced amplitude compared to the O tone of the preceding standard pattern. However, the response to the O tone in the early Dev was significantly larger than the response to its comparison standard O tone (Fig 3B). Similarly, for the late Dev, the X2 tone was compared with the X4 tone. Here, at the local level, the fourth repetition of the X tone would be expected to be more adapted than the second repetition of the X tone (Haenschel et al. 2005). However, the evoked response to X4 was significantly larger than the response to X2 (presumably because it violated the expectation for the O tone). Thus, there is no obvious way that the early or late Dev responses can be explained at the local or global level by the adaptation hypothesis. Only if cross-adaptation were stronger than stimulus-specific adaptation (for example, if responses to X stimuli were reduced more strongly by a preceding O than by a preceding X) would adaptation arising from the immediately preceding stimuli be expected to result in stronger responses to the early and late Dev.
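The local-adaptation argument above can be made concrete with a minimal simulation. This is a sketch under stated assumptions, not an analysis performed in the study: each tone type carries an adaptation state that partially recovers between tones and is increased by each presented tone, by `same` when the tone matches and by `cross` when it does not. The parameter values, the five-cycle XXXO context, and the function names are hypothetical choices for illustration only.

```python
def simulate(sequence, same, cross, decay=0.7):
    """Return the modeled response (1 - adaptation) to each tone in `sequence`.

    `same` and `cross` are assumed adaptation gains for a matching and a
    non-matching preceding tone; `decay` is the assumed recovery per tone.
    """
    adapt = {"X": 0.0, "O": 0.0}
    responses = []
    for tone in sequence:
        responses.append(1.0 - adapt[tone])
        for key in adapt:
            gain = same if key == tone else cross
            recovered = decay * adapt[key]          # partial recovery
            adapt[key] = recovered + gain * (1.0 - recovered)
    return responses


def compare(same, cross, n_standard=5):
    """Model the comparisons described in the text for one parameter set."""
    context = "XXXO" * n_standard                   # repeating standard pattern
    early = simulate(context + "XO", same, cross)   # early Dev: XXXO then XO
    late = simulate(context + "XXXX", same, cross)  # late Dev: X where O was due
    return {
        "standard_O": early[len(context) - 1],      # last O of the standard
        "early_dev_O": early[-1],                   # O of the early Dev
        "X2": late[len(context) + 1],
        "X4": late[len(context) + 3],
    }


# Purely stimulus-specific adaptation (no cross-adaptation).
pure_ssa = compare(same=0.4, cross=0.0)

# Cross-adaptation stronger than same-stimulus adaptation.
strong_cross = compare(same=0.1, cross=0.5)

for label, result in (("pure SSA", pure_ssa), ("strong cross", strong_cross)):
    print(label, {name: round(value, 3) for name, value in result.items()})
```

With purely stimulus-specific adaptation, this toy model gives a smaller response to the early Dev O than to the standard O, and a smaller response to X4 than to X2, which is the opposite of the observed pattern. The observed direction appears in the model only when the cross-adaptation gain exceeds the same-stimulus gain.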
Our results are more consistent with previous studies suggesting that MMN is part of a system of predictive coding (Herholz et al. 2009; Garrido et al. 2009b; Mill et al. 2011; Wacongne et al. 2011; Todorovic and de Lange 2012) and not a simple reflection of a release from adaptation. The enhanced responses were elicited by early Dev and late Dev tones that violated pattern expectations established by the predictable standard XXXO pattern. However, the current data extend the results of previous studies by 1) using comparison responses that were physically matched; 2) assessing effects of attention on expectation violation; and 3) comparing expectation violation responses with the more typical MMN-type paradigm to disambiguate effects of neural adaptation.
Previous studies that have tested the predictive coding hypothesis of the MMN have used comparison responses that were not physically matched and thus could not conclusively distinguish pattern processing from stimulus-specific effects (Herholz et al. 2009; Wacongne et al. 2011; Todorovic and de Lange 2012). Further, results from previous studies generally support a hierarchical model, suggesting that the MMN is a consequence of top-down expectations on effects of stimulus repetition (Garrido et al. 2009b; Wacongne et al. 2011; Todorovic and de Lange 2012). In the current study, elicitation of the expectation violation response in the Passive condition suggests that top-down modulation of the sensory input to initiate the hierarchical interactions is not a requirement. Our results, and those of Alain et al. (1999), demonstrate that the expectation violation response can be evoked without perceptual expectancy.
Notably, our results also indicate that the expectation violation response may reflect fundamentally different neural generators than those attributed to the canonical MMN response, but generators that likely overlap with effects of stimulus repetition, as described below in more detail.
4.2 Differences between early and late pattern deviants reflect divergent processes
Significant differences in timing, scalp distribution, and attentional modulation between the response to the O tone of the early Dev and the X4 tone of the late Dev raise questions about whether the late Dev response (expectation violation response not explained by neuronal adaptation) is equivalent to the traditionally defined MMN. The scalp-recorded MMN ERP component likely reflects multiple converging processes, including neural adaptation and expectation violation (Garrido et al. 2008), with previous stimulus designs making it difficult to distinctly observe the overlapping neural contributions. Here we delineate two distinct components, suggesting that although they may overlap, they reflect divergent processes.
The timing of the ERP responses differed between the early Dev and late Dev, and differed depending on whether a task was performed. With no sound task, in the Passive condition, the negativity to the early Dev was significantly larger, started earlier, and ended later than the late Dev negativity. The earlier onset response may be explained by a contribution from neuronal adaptation, coinciding with the N1 latency range (although the same physical stimulus was used for comparison). The extended overall response length for the early Dev may be indicative of integrative processes: the change in stimulus coinciding with the violation of the expectation. In contrast, where there was no stimulus change, the late Dev negative response coincided in time with the MMN latency range and not with the N1 latency range. When performing the sound task, in the Active condition, the early Dev negativity was similar to that elicited passively, having an overall larger and longer latency response than the late Dev response. In contrast, the timing of the late Dev response was longer than that elicited in the Passive condition (Fig 4, right column, Table 1). Typically, timing differences for MMN have only been shown to occur with respect to the ease of detectability (e.g., earlier peak latency associated with easier to detect deviants; Näätänen et al. 1993; Tiitinen et al. 1994), and not from attention effects, such as between Passive and Active conditions for the same deviants (Sussman 2007). The timing differences overall suggest divergent responses between deviant types, as well as attention effects on the expectation violation not observed for the early Dev that included overlapping effects of adaptation. This is remarkable when considering that both responses were prompted by pattern expectation violations.
The timing differences between the early Dev and the late Dev were also reflected in the behavioral reaction time responses. RT was shorter to the early Dev, consistent with the shorter peak latency of the typical ERP components (MMN and P3b), compared to the late Dev responses (Table 1). However, there were nonsignificant correlations between RT and amplitude, and between RT and latency, of the ERP components. For the MMN component, the latency difference was found in both Passive and Active conditions, thus observed even without a behavioral response. This may indicate that the latency difference in the ERP components between early and late deviants was not directly affecting the response decision. The difference in response latency may be attributed to the confluence of cues for the early Dev, namely, a switch in tone frequency that also violated pattern expectation. The early deviant may have been easier to detect, reflected in faster detection times in both behavior and neurophysiologic responses.
Scalp voltage distribution of the MMN, N2b, and P3b ERP components was another important indication of the difference between the two deviant types. Typical scalp distribution was observed for all ERP components associated with the early Dev: MMN had a fronto-central focus; N2b was more centrally focused; and P3b had a parietal-central focus. In contrast, scalp topography was atypical for the late Dev: the MMN component had a specific left-lateralized frontal focus; the N2b was more parietal; and the P3b had a broader distribution that included frontal sites. The difference in scalp topography for the late Dev response may be a reflection of deviance detection that is an expectation violation without the contribution of activity from non-adapted cells. That is, the processes associated with expectation violation may be visualized at the scalp differently when there is no overlap with the more traditional MMN deviance detection response that includes a stimulus change.
The disparities in scalp distribution of all the ERP components between the early Dev and the late Dev indicate different neural substrates for the two negativities, extending from the processes associated with deviance detection to the higher cognitive processes involving attention. For example, the frontal contribution of the P3b of the late Dev may be part of the working memory demand that occurs without the extra cue of a stimulus change. Another possibility is that the frontal contribution represents additional executive functions used when there is no stimulus-specific change cue. In either case these would overlap with the P3b target detection response in timing, and would be indistinguishable in the ERP waveform as a separate contribution.
Attention affected the late Dev expectation violation response in multiple ways not observed for the early Dev involving stimulus change plus pattern violation detection. Active detection of the pattern violations modulated the timing of the neural response to the late Dev, but did not affect the timing of the early Dev response (Fig 4). Whereas scalp topography of the N2b and P3b components was typical for the early Dev involving a stimulus change, it was atypical for the ‘pure’ expectation violation (late Dev). Previous ‘typical’ MMN studies involving active target detection for oddball or patterned stimuli have not demonstrated scalp distribution differences in the attention-related N2b or P3b components when comparing target detection for different types of deviants. Thus, the current results may also suggest that a different strategy was employed for detection of the pattern violation prompting the button press for the late Dev that only involved expectation violation.
In sum, the late Dev response was smaller, had atypical component topography, and was affected by attention. These differences between the early and late Dev responses suggest that the ‘pure’ expectation violation response (late Dev) is not the ‘traditional’ MMN response, reflecting a different response with a different neural substrate.
5. Conclusions
The current data suggest that the late Dev response is an expectation violation response with different neural generators than the more traditional MMN response involving a stimulus change. It is possible that the canonical MMN component includes multiple convergent processes, both adaptation and expectation violation, and that these different processes have not yet been fully delineated or explored due to confounding factors inherent in most typical MMN paradigms. However, there is little evidence from the current study to suggest that the negative response elicited by the expectation violation alone includes the MMN. Moreover, the expectation violation response, distinguished from effects of neural adaptation, was not dependent upon top-down perceptual expectancy, indicating that the hierarchical model engaging attention network interactions on lower sensory input is not required for generating prediction errors.
Acknowledgments
This work was supported by the National Institutes of Health (R01 DC004263 to ESS); the Army Research Office (58760LS to ESS, AK, OS); and the Summer Undergraduate Research Program (SURP) of the Albert Einstein College of Medicine (to SW).
References
- Alain C, Cortese F, Picton TW. Event-related brain activity associated with auditory pattern processing. Neuroreport. 1999;10:2429–2434. doi: 10.1097/00001756-199908020-00038.
- Alho K, Huotilainen M, Tiitinen H, et al. Memory-related processing of complex sound patterns in human auditory cortex: a MEG study. Neuroreport. 1993;4:391–394. doi: 10.1097/00001756-199304000-00012.
- Anderson LA, Christianson GB, Linden JF. Stimulus-specific adaptation occurs in the auditory thalamus. J Neurosci. 2009;29:7359–7363. doi: 10.1523/JNEUROSCI.0793-09.2009.
- Bendixen A, Schröger E, Winkler I. I heard that coming: event-related potential evidence for stimulus-driven prediction in the auditory system. J Neurosci. 2009;29:8447–8451. doi: 10.1523/JNEUROSCI.1493-09.2009.
- Blake DT, Merzenich MM. Changes of AI receptive fields with sound density. J Neurophysiol. 2002;88:3409–3420. doi: 10.1152/jn.00233.2002.
- Budd TW, Barry RJ, Gordon E, et al. Decrement of the N1 auditory event-related potential with stimulus repetition: habituation vs. refractoriness. Int J Psychophysiol. 1998;31:51–68. doi: 10.1016/s0167-8760(98)00040-3.
- Butler RA. Effect of changes in stimulus frequency and intensity on habituation of the human vertex potential. J Acoust Soc Am. 1968;44:945–950. doi: 10.1121/1.1911233.
- Fishman YI. The Mechanisms and Meaning of the Mismatch Negativity. Brain Topogr. 2013;27:500–526. doi: 10.1007/s10548-013-0337-3.
- Fishman YI, Arezzo JC, Steinschneider M. Auditory stream segregation in monkey auditory cortex: effects of frequency separation, presentation rate, and tone duration. J Acoust Soc Am. 2004;116:1656–1670. doi: 10.1121/1.1778903.
- Friston K. A theory of cortical responses. Philos Trans R Soc Lond B Biol Sci. 2005;360:815–836. doi: 10.1098/rstb.2005.1622.
- Garrido MI, Friston KJ, Kiebel SJ, et al. The functional anatomy of the MMN: a DCM study of the roving paradigm. Neuroimage. 2008;42:936–944. doi: 10.1016/j.neuroimage.2008.05.018.
- Garrido MI, Kilner JM, Kiebel SJ, Friston KJ. Dynamic causal modeling of the response to frequency deviants. J Neurophysiol. 2009a;101:2620–2631. doi: 10.1152/jn.90291.2008.
- Garrido MI, Kilner JM, Stephan KE, Friston KJ. The mismatch negativity: a review of underlying mechanisms. Clin Neurophysiol. 2009b;120:453–463. doi: 10.1016/j.clinph.2008.11.029.
- Giard MH, Perrin F, Pernier J, Bouchet P. Brain generators implicated in the processing of auditory stimulus deviance: a topographic event-related potential study. Psychophysiology. 1990;27:627–640. doi: 10.1111/j.1469-8986.1990.tb03184.x.
- Grimm S, Escera C. Auditory deviance detection revisited: evidence for a hierarchical novelty system. Int J Psychophysiol. 2012;85:88–92. doi: 10.1016/j.ijpsycho.2011.05.012.
- Haenschel C, Vernon DJ, Dwivedi P, et al. Event-related brain potential correlates of human auditory sensory memory-trace formation. J Neurosci. 2005;25:10494–10501. doi: 10.1523/JNEUROSCI.1227-05.2005.
- Herholz SC, Lappe C, Pantev C. Looking for a pattern: an MEG study on the abstract mismatch negativity in musicians and nonmusicians. BMC Neurosci. 2009;10:42. doi: 10.1186/1471-2202-10-42.
- Jacobsen T, Schröger E, Horenkamp T, Winkler I. Mismatch negativity to pitch change: varied stimulus proportions in controlling effects of neural refractoriness on human auditory event-related brain potentials. Neurosci Lett. 2003;344:79–82. doi: 10.1016/s0304-3940(03)00408-7.
- Javitt DC, Steinschneider M, Schroeder CE, Arezzo JC. Role of cortical N-methyl-D-aspartate receptors in auditory sensory memory and mismatch negativity generation: implications for schizophrenia. Proc Natl Acad Sci U S A. 1996;93:11962–11967. doi: 10.1073/pnas.93.21.11962.
- Kvale MN, Schreiner CE. Short-term adaptation of auditory receptive fields to dynamic stimuli. J Neurophysiol. 2004;91:604–612. doi: 10.1152/jn.00484.2003.
- Malone BJ, Scott BH, Semple MN. Context-dependent adaptive coding of interaural phase disparity in the auditory cortex of awake macaques. J Neurosci. 2002;22:4625–4638. doi: 10.1523/JNEUROSCI.22-11-04625.2002.
- May PJC, Tiitinen H. Mismatch negativity (MMN), the deviance-elicited auditory deflection, explained. Psychophysiology. 2010;47:66–122. doi: 10.1111/j.1469-8986.2009.00856.x.
- Mill R, Coath M, Wennekers T, Denham SL. A neurocomputational model of stimulus-specific adaptation to oddball and Markov sequences. PLoS Comput Biol. 2011;7:e1002117. doi: 10.1371/journal.pcbi.1002117.
- Näätänen R, Gaillard AW, Mäntysalo S. Early selective-attention effect on evoked potential reinterpreted. Acta Psychol (Amst). 1978;42:313–329. doi: 10.1016/0001-6918(78)90006-9.
- Näätänen R, Paavilainen P, Tiitinen H, et al. Attention and mismatch negativity. Psychophysiology. 1993;30:436–450. doi: 10.1111/j.1469-8986.1993.tb02067.x.
- Pérez-González D, Malmierca MS, Covey E. Novelty detector neurons in the mammalian auditory midbrain. Eur J Neurosci. 2005;22:2879–2885. doi: 10.1111/j.1460-9568.2005.04472.x.
- Rao RP, Ballard DH. Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nat Neurosci. 1999;2:79–87. doi: 10.1038/4580.
- Squires NK, Squires KC, Hillyard SA. Two varieties of long-latency positive waves evoked by unpredictable auditory stimuli in man. Electroencephalogr Clin Neurophysiol. 1975;38:387–401. doi: 10.1016/0013-4694(75)90263-1.
- Sussman E. A New View on the MMN and Attention Debate: The Role of Context in Processing Auditory Events. J Psychophysiol. 2007;21:164–175.
- Sussman E, Ritter W, Vaughan HG. Predictability of stimulus deviance and the mismatch negativity. Neuroreport. 1998;9:4167–4170. doi: 10.1097/00001756-199812210-00031.
- Sussman E, Winkler I, Huotilainen M, et al. Top-down effects can modify the initially stimulus-driven auditory organization. Cogn Brain Res. 2002;13:393–405. doi: 10.1016/s0926-6410(01)00131-8.
- Tiitinen H, May P, Reinikainen K, Näätänen R. Attentive novelty detection in humans is governed by pre-attentive sensory memory. Nature. 1994;372:90–92. doi: 10.1038/372090a0.
- Todorovic A, de Lange FP. Repetition suppression and expectation suppression are dissociable in time in early auditory evoked fields. J Neurosci. 2012;32:13389–13395. doi: 10.1523/JNEUROSCI.2227-12.2012.
- Ulanovsky N, Las L, Nelken I. Processing of low-probability sounds by cortical neurons. Nat Neurosci. 2003;6:391–398. doi: 10.1038/nn1032.
- Wacongne C, Labyt E, van Wassenhove V, et al. Evidence for a hierarchy of predictions and prediction errors in human cortex. Proc Natl Acad Sci U S A. 2011;108:20754–20759. doi: 10.1073/pnas.1117807108.