Abstract
How can we concentrate on relevant sounds in noisy environments? A “gain model” suggests that auditory attention simply amplifies relevant and suppresses irrelevant afferent inputs. However, it is unclear whether this suffices when attended and ignored features overlap to stimulate the same neuronal receptive fields. A “tuning model” suggests that, in addition to gain, attention modulates feature selectivity of auditory neurons. We recorded magnetoencephalography, EEG, and functional MRI (fMRI) while subjects attended to tones delivered to one ear and ignored opposite-ear inputs. The attended ear was switched every 30 s to quantify how quickly the effects evolve. To produce overlapping inputs, the tones were presented alone vs. during white-noise masking notch-filtered ±1/6 octaves around the tone center frequencies. Amplitude modulation (39 vs. 41 Hz in opposite ears) was applied for “frequency tagging” of attention effects on maskers. Noise masking reduced early (50–150 ms; N1) auditory responses to unattended tones. In support of the tuning model, selective attention canceled out this attenuating effect but did not modulate the gain of 50–150 ms activity to nonmasked tones or steady-state responses to the maskers themselves. These tuning effects originated at nonprimary auditory cortices, purportedly occupied by neurons that, without attention, have wider frequency tuning than ±1/6 octaves. The attentional tuning evolved rapidly, during the first few seconds after attention switching, and correlated with behavioral discrimination performance. In conclusion, a simple gain model alone cannot explain auditory selective attention. In nonprimary auditory cortices, attention-driven short-term plasticity retunes neurons to segregate relevant sounds from noise.
Keywords: processing negativity, cocktail party phenomenon, event-related potentials, negative difference, auditory scene analysis
Humans have a remarkable capacity for auditory selective attention in noisy environments. Somehow our brains are able to pick relevant information from inputs that overlap spatially, spectrally, and temporally, for example, when one concentrates on a particular speaker among a chattering crowd. We may perform this task even when the competing sounds are delivered through a single point of space, such as a radio loudspeaker. This ability is, presumably, supported by top-down modulations of stimulus-evoked brain activations (1–11). How selective attention exactly affects auditory processing is, however, still under debate. “Early selection” theories suggest that auditory attention is explained by enhancement of relevant and suppression of irrelevant inputs (6, 10). Neurophysiologically, this is analogous to gain models of visual spatial attention (12, 13). Contrasting “late selection” theories, such as the processing negativity model (8, 9), maintain that attention does not affect the earliest sound representations, per se. Instead of sensory gain, response modulations such as the attentional EEG negative difference (7) are attributed to endogenous processing that overlaps with afferent activations (8, 9). This purportedly involves a specific set of auditory cortex neurons that support attentional traces of task-relevant sounds, activated independently from stimulus-dependent neurons (8, 9). Evidence supporting a distinction between stimulus- and attention-dependent auditory cortex neurons has also been obtained by functional MRI (fMRI) (3, 4).
These two classic theories have recently been complemented by results supporting a more detailed “tuning model” of auditory attention (14–18). Analogously to its visual counterparts (19, 20), the tuning model suggests that, in addition to gain, attention may also enhance feature selectivity of auditory neurons. These tuning changes could be viewed as attentional traces that are, instead of purely attentional units (8, 9), represented by the same neurons that also respond to afferent inputs (15). The most compelling evidence so far has been obtained from studies on behaving ferrets, demonstrating task-related plasticity that retunes receptive fields of auditory cortex neurons to relevant frequencies or temporal cues (21–23). In most tested neurons these effects evolved, and vanished, within the timescales that the experimental setup could resolve (approximately a few minutes) (22). However, it has not been tested whether these kinds of effects are rapid enough to explain human auditory selective attention, which can be engaged and disengaged quite swiftly (<1 s) (24) in everyday listening situations.
In humans, tuning properties of neuron populations can be noninvasively studied by examining neuronal adaptation (i.e., suppression of responses to a given stimulus as a function of its similarity and temporal proximity to preceding stimulation). In addition to purely serial designs (e.g., the oddball or paired-pulse paradigms), frequency-specific adaptation can be induced by continuous notch-filtered noise that occupies all but the task-relevant sound frequencies. Adaptation of responses to tones centered at the frequency notch of such masking helps infer tuning properties of activated neurons (25) and their modulation by attention (16–18). Here, we combined notch-filtered masking with the classic dichotic paradigm (Fig. 1A). To focus on tuning effects rapid enough to explain auditory attention, we asked subjects to switch attention from one ear to the other every 30 s. The resulting activations were localized using a multimodal neuroimaging technique (14, 26–28) that combines temporally precise electromagnetic [magnetoencephalography (MEG)/EEG] and spatially accurate hemodynamic (fMRI) information. Using this multimodal technique, we tested the hypothesis that segregation of relevant sounds from noise is supported by attentional short-term plasticity (i.e., retuning) of auditory cortex neurons (Fig. 1B).
Fig. 1.
Task design and hypotheses. (A) Auditory selective attention task. Two asynchronous standard-tone streams (0.5 vs. 2 kHz) were presented to different ears, in separate blocks with or without notch-filtered white-noise masking. Subjects were instructed to press a button upon hearing a target stimulus (“difficult” 1/24- or “easy” 1/12-octave frequency increase) in the designated ear and to ignore the opposite-ear stimulation. To distinguish short-term “tuning” effects from longer-term learning, the attended ear was shifted after every 30 s, as signaled by a buzzer sound in the designated ear (during fMRI, attention was shifted after every other TR). (B) Tuning hypothesis: Notch-filtered noise results in adaptation/lateral inhibition that decreases response amplitudes. Left: Attention increases single neurons’ selectivity to the attended frequency. In the presence of notch-filtered masking noise (Upper), the response of the single neuron is attenuated due to adaptation in the ignored condition but not in the attended condition. Center: Consequently, in the attended condition, a smaller proportion of neurons responding to the relevant tone frequency become stimulated, and subsequently adapted, by the masker. In contrast, in the ignored condition, the number of neurons responding to the relevant tone is reduced in the presence of the masking noise. Right: In MEG/EEG, the attentional tuning effect is observable as a release from adaptation that counterbalances the attenuating effects of noise masking on N1 activity that is evidenced during the ignored condition. (According to an alternative “gain” hypothesis, attention would significantly increase N1 amplitude also without masking.)
Results
Behavioral Data.
During MEG/EEG and fMRI acquisitions, subjects (n = 10, 5 female) were instructed to attend to tones delivered to one ear and to ignore the opposite-ear stimulation, presented with or without notch-filtered masking (Fig. 1A). There was no behavioral evidence of significantly increased perceptual difficulty of differentiating target tones from nontargets in the presence of masking: the main effects of noise masking, attended ear, and imaging modality on hit rates and reaction times were nonsignificant. Without masking, hit rates (pooled mean ± SEM) were 69% ± 7% vs. 83% ± 5% and reaction times 598 ± 29 ms vs. 569 ± 28 ms for the difficult vs. easy targets, respectively. During noise masking, the respective values were 66% ± 8% vs. 84% ± 6% and 620 ± 29 ms vs. 570 ± 30 ms. Both with and without masking, the subjects discriminated difficult targets more slowly (F1,8 = 19.7, P < 0.01) and less accurately (F1,8 = 20.1, P < 0.01) than easy targets.
Dynamic MEG/EEG/fMRI Estimates.
Our results showed modulations of MEG/EEG/fMRI activity, which could be explained by attentional retuning of auditory cortex neurons to task-relevant sound frequencies, to counterbalance attenuating effects of noise masking (as hypothesized in Fig. 1B). Specifically, Fig. 2 shows group MEG sensor data (n = 8; MEG/EEG data of 2 subjects were excluded because of technical reasons), showing a masking effect that significantly (F1,6 = 10.1, P < 0.05) reduced auditory responses elicited 50–150 ms after unattended tones. This attenuating effect of masking was canceled out by an early attention effect: during masking, auditory responses 50–150 ms after stimulus were significantly (F1,6 = 6.8, P < 0.05) larger to attended than ignored tones (for details, see Table S1). These attention effects, which were selectively present during masking, were observable already in responses to the very first task-relevant tones after attention shifting cues (Fig. S1), suggesting that the underlying neuronal changes evolve and wash out rapidly after task engagement.
Fig. 2.
Event-related MEG responses. (A) MEG signals to attended and ignored sounds from sensors directly above the left and right auditory cortices, averaged across eight subjects. (B) Average MEG activity 50–150 ms after stimulus in six gradiometer pairs encompassing the left and the right temporal areas. The vector sums of signals of each gradiometer pair were averaged and normalized within each hemisphere and then pooled across hemispheres for the display (Table S1 shows mean amplitudes within each hemisphere). Taken together (A and B), these data show a significant masking-related reduction of early auditory cortex N1 activity (50–150 ms after stimulus) to unattended tones. Selective attention canceled out this attenuation effect, as shown by the significant enhancement of N1 activity that was observable only in the masking condition. *P < 0.05; error bars represent SEM. See also Fig. S1.
Fig. 3 shows the corresponding fMRI-weighted MEG/EEG minimum-norm estimates (MNE) of the attentional difference response [i.e., attended vs. ignored right-ear standard tones (7)], calculated to localize the cortical origins of attention effects. A significant [false discovery rate (FDR) < 0.05] enhancement of inward currents was observed at 50–150 ms after stimulus during masking in nonprimary auditory cortex, possibly reflecting attention-driven release from adaptation/lateral inhibition caused by the noise masker. No such effect was observed without masking at this latency. We also observed a later broadly distributed attention effect, being equally present with and without masking, possibly corresponding to MEG/EEG processing negativity (8, 9) (Fig. 3; see also Fig. S2). Fig. 4 shows a subsequent regions-of-interest (ROI) analysis of MEG/EEG/fMRI data, suggesting that the attention-driven enhancement of early anterior nonprimary auditory cortex activity was significantly stronger (F1,6 = 12.5, P = 0.01) with than without masking in both hemispheres. The effects of masking on the intensity of the later sustained “processing negativity” component were statistically nonsignificant. Analysis of oscillatory steady-state responses (SSR) synchronized to the amplitude modulation (AM) frequencies of the masker sound revealed no selective attention effects (Fig. S3), supporting an interpretation that the above-mentioned early attention effects during masking (Figs. 2–4) cannot be explained by nonspecific gain modulations of auditory-cortex activity (SI Materials and Methods provides a discussion on SSR as an attentional marker).
Fig. 3.
MEG/EEG/fMRI estimates of attentional modulation of auditory cortex activation. The MNE was computed for the difference in the response to attended vs. ignored right-ear tones, with and without masking. Attention-driven inward currents are shown at locations where the corresponding group t statistics were significant at FDR < 0.05. The earlier effect (50–150 ms; approximately N1 peak latency) was only present during masking, which supports our hypothesis that attention enhances feature tuning in auditory cortex. The later “processing negativity” effect (150–350 ms) was not modulated by masking and could reflect more sustained and nonspecific attentional feedback to the auditory cortex. The effects were similar in the right auditory cortex (Fig. S2). PT, planum temporale; HG, Heschl's gyrus; STG, superior temporal gyrus; PP, planum polare.
Fig. 4.
MEG/EEG/fMRI ROI analysis of attentional enhancement of activations in the left and right auditory cortices 50–150 ms after stimulus. The attention effect was significantly stronger with noise masking than without it. Error bars indicate SEM between subjects. **P = 0.01.
We then examined whether attentional modulation of auditory cortex activity, measured from standard tone responses, predicts behavioral discrimination of target tones. MEG/EEG/fMRI estimates of attentional difference responses were correlated with behavioral variables, normalized by calculating hit rate and reaction time differences between responses to the easy vs. difficult targets. In other words, behavioral responses to easy targets provided a baseline of attentional vigilance for measuring the speed and accuracy of sound-frequency discrimination. Significant correlations emerged between the masking-related early attention effect in the left anterior auditory cortex and the hit rate difference to easy vs. difficult targets (Fig. 5). This observation was supported by a significant correlation between the left auditory cortex ROI activity at 50–150 ms and the hit rate tuning measure during masking (Spearman ρ = 0.74, P < 0.05). All other ROI correlations were nonsignificant.
Fig. 5.
Correlations between attentional modulation of auditory cortex activation and behavioral discrimination of target tones (as measured from the difference in the hit rate for easy vs. difficult targets delivered to the right ear). Only statistically significant MEG/EEG/fMRI activations were considered in this analysis. The clearest behavioral correlations were observed during noise masking 50–150 ms after stimulus: improved accuracy of target discrimination correlated with the attentional tuning (release from adaptation caused by masking).
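The brain-behavior analysis described above can be illustrated with a minimal sketch of a Spearman rank correlation between a per-subject behavioral index and a per-subject ROI estimate. The function names are hypothetical, the easy-minus-difficult sign convention is an assumption (the text states only that a difference was taken), and no significance test is included.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    return pearson_r(average_ranks(x), average_ranks(y))


def hit_rate_difference(easy_hit_rates, difficult_hit_rates):
    """Per-subject easy-minus-difficult hit rate (sign convention assumed)."""
    return [e - d for e, d in zip(easy_hit_rates, difficult_hit_rates)]
```

In use, `spearman_rho` would receive one value per subject for each variable, for example the output of `hit_rate_difference` and the ROI-averaged attentional difference response at 50–150 ms.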
fMRI Results.
Unimodal fMRI analyses (without MEG/EEG) were conducted to determine sustained activations within and beyond auditory areas during auditory selective attention. These analyses showed activations extending from auditory areas to posterior parietal as well as to medial and lateral frontal regions (Figs. S4 and S5). However, these frontal and parietal activities, putatively including areas related to top-down control of attention (e.g., refs. 29 and 30), were not significantly affected by masking (Figs. S4 and S5 and Table S2). This observation is consistent with our behavioral results, suggesting that the perceptual difficulty of differentiating targets from nontargets did not significantly increase during masking. These results argue against an interpretation that the early (50–150 ms) attention effects could be explained simply by effort-related increases, such as a gain effect that is enhanced during masking because of increased top-down feedback or an effort-related “exogenous” activity component that overlaps with “stimulus-dependent” responses when the perceptual difficulty is increased.
Discussion
Here, we demonstrate evidence for a short-term plasticity mechanism of auditory selective attention, by using a multimodal technique that combines anatomical MRI, fMRI, MEG, and EEG information to estimate human brain activations. Our results suggested that selective attention can mitigate the attenuating effects of background noise on early (50–150 ms) activity of human nonprimary auditory cortex. This effect occurred during the first few seconds after task engagement (Fig. S1) and correlated with behavioral sound-discrimination accuracy during noise masking.
The early attentional effects during masking could be explained by modulation of neuronal adaptation that results from background noise. Nonprimary “belt” auditory cortices, where the masking-specific early attention effect originated, are believed to consist of “complex” neurons having relatively wide-frequency tuning curves (31). In these nonprimary auditory cortex areas, unattended tones probably activated neurons whose receptive fields overlapped with the masker spectrum and which were, consequently, adapted by masking. This notion is supported by the fact that our masker's notch width fell within typical estimates of critical bands of auditory cortex neurons (32, 33). Animal models suggest that selective attention can, through center excitation/surround inhibition, fine-tune receptive fields of auditory cortex neurons to the attended frequency (21). Such a mechanism could have enhanced the frequency specificity of neurons representing the attended tones, subsequently decreasing the receptive-field overlap and increasing population responses because of reduced adaptation. This interpretation is supported by previous visual (19) and auditory (14) neuroimaging findings suggesting that attention enhances stimulus-specificity of neuronal adaptation.
Although our alternative hypothesis, a simple gain model, was challenged by the lack of significant early auditory cortex attention effects without masking, such a mechanism could have been selectively activated during masking to increase the gain of neurons sharply pretuned to the attended frequency. This would presume that the activities modulated by attention were generated by neurons having distinct feature-tuned receptive fields [analogously to gain control of retinotopically organized thalamic visual neurons by spatial attention (34)]. However, in contrast to the human primary auditory cortex (35), the nonprimary areas where the masking-specific early attention effects originated are believed to be populated by neurons with relatively wide-frequency tuning curves and complex receptive fields (31). In such populations, it is likely that any given complex neuron, integrating the lower-level activations through spatial convergence, is stimulated by more than one of the competing sound frequencies presented to our subjects. Recent neurophysiological models (36) suggest that more than a simple gain effect is needed for attentional selection in such cases, when the competing stimuli occur within the same receptive field. It is also noteworthy that no significant attention effects were observed in oscillatory SSR synchronized to the masker's AM frequencies (Fig. S3). A gain-based mechanism, which increases neuronal activities without affecting the tuning curves, should have modulated such activities in populations with wider tuning curves than the masker's frequency notch (SI Materials and Methods provides a discussion of SSR as an attentional marker).
Another issue that should be taken into account when considering our alternative hypothesis is the triggering of gain effects during masking. Indices of early gain effects occurring before (10, 37) or during the N1 latency window (6–10) have been suggested to be strongly effort related, being most significant when the competition for attentional resources is intensive (e.g., because of rapid stimulation rates) or when the stimulus salience is reduced to a barely detectable level [e.g., by wideband noise masking (38)]. One could thus assume that, in the present study, noise masking increased the perceptual difficulty to a level that needed to be compensated by significantly increased attentional effort, to activate an effort-related attentional gain mechanism specific to noisy conditions. However, on the basis of previous studies (30, 39), including a very recent auditory fMRI investigation (40), compensating for stimulus degradation can be expected to increase activations in frontal and parietal areas related to attentional control. Here, we found no evidence of enhancement of frontoparietal fMRI activity attributable to increased task effort during masking. At the same time, there were no differences in behavioral performance with vs. without masking either. Taken together, these findings provide little direct support for a gain mechanism triggered selectively during masking only, following from increased effort and consequently strengthened top-down feedback to auditory areas. This is not to say that task-related changes in auditory neurons occur independently of effort (e.g., ref. 23) but that the present task required a similar amount of effort with or without our notch-filtered noise masking. As outlined in our simplified model in Fig. 1B, we presume that attentional optimization of neuronal receptive fields occurred similarly with and without notch-filtered masking; masking was needed only to tease out the measurable effect.
A longer-latency (150–350 ms) auditory cortex pattern of attention-elicited inward currents was observed both with and without masking. This effect overlapped with the late processing negativity event-related potentials component, which has been proposed to reflect “endogenous” processing in prefrontal cortices (7–9). A sustained pattern of prefrontal activations was, indeed, observed in our unimodal fMRI data. However, as suggested by the original theory (9), processing negativity essentially involves top-down communications from prefrontal to auditory cortices, observable as dynamic MEG/EEG changes when this feedback is transiently increased (8, 9). Because MEG/EEG signals reflect input to a given area, activations reflecting such top-down communications should be observed at the receiving end, that is, the auditory cortex. Further, because cortico–cortical feedback targets primarily the supragranular top layers, this feedback should transiently enhance inward currents in the postsynaptic neurons (and frontocentral negativities in EEG). Hence, the later (150–350 ms) patterns of inward currents could be associated with transient feedback to auditory cortex subsequent to sound detection.
According to behavioral studies, refocusing auditory attention occurs very rapidly, in less than 1 s (24). However, in most traditional studies (for a review, see refs. 8 and 9) auditory attention effects have been examined by data obtained during much longer blocks (typically, approximately tens of minutes). In comparison, attention was being shifted at a much faster pace, every 30 s, in the present study. Moreover, analyses of changes in response amplitudes within these 30-s periods, during which the task was held constant, suggested that the early attentional tuning effects are evident already in the first few responses after a switching cue. It is thus conceivable that the putative early tuning effects reflect short-term changes instead of longer-term learning effects, which may have confounded previous analogous studies (16, 17) with longer block durations (e.g., in ref. 17 the blocks were obtained on different days). However, such shorter-term tuning effects can be modulated by individual differences: Previous studies have shown that extensive musical training can aid selective attention to nonmusical auditory information in noisy conditions (41). This notion is also in line with our data in Fig. 5, showing that individual differences in the tuning effect correlated with the subjects’ capability to distinguish targets from nontargets during masking.
In summary, our results suggest that segregation of relevant sounds from noise is supported by short-term (approximately seconds) tuning changes of neurons, based on short-term plasticity of auditory cortex. These transient tuning changes could be viewed as an “attentional trace” (see also ref. 9), an interface between top-down and bottom-up processes underlying auditory attention. As a whole, auditory selective attention is probably supported by a combination of gain and tuning effects (16, 18, 37). A simple gain mechanism may suffice at hierarchically lower levels, when the competing stimuli are represented by neurons having distinct receptive fields. Tuning may be the predominant selection mechanism in higher-order auditory areas when multiple spectrally and spatially overlapping stimuli occur in the same neuronal receptive fields.
Materials and Methods
Subjects and Design.
During fMRI and MEG/EEG recordings, healthy right-handed subjects with normal hearing (n = 10, age 23–43 y, 5 female) were presented with a train of 500-Hz “standard” pure tones to one ear and a train of 2,000-Hz “standard” pure tones to the other ear [tone duration 100 ms, 10-ms rise and fall time, 80-dB sound pressure level (SPL)] (Fig. 1A) at a randomly varying interstimulus interval (average 1.6 s per ear). Each tone train was occasionally interrupted by “difficult” (1/24-octave increase) or “easy” (1/12-octave increase) deviant tones. In four separate runs, the tones were delivered on top of a continuous binaural white-noise masker (13 dB below the level of tone trains) or with no masking. The opposite streams of the noise masker were separately band-stop filtered (60 dB/semitone roll off) to create a notch of ±1/6 octaves around each ear's standard-tone frequency. The width of the notch was optimized according to ref. 16. To enhance the adaptation effect, the wide-band masker was amplitude modulated (depth 50%) at an average frequency of 40 Hz (ΔfAM = 2 Hz across the ears, for enhancing spatial separation and for measuring oscillatory activations to masking in attended vs. ignored channel; SI Materials and Methods). During fMRI, 46.9-s tone-stimulation periods were interleaved with 23.4-s baseline blocks, with no auditory stimulation or with the noise masking only depending on the type of the run. Stimuli were delivered by using an fMRI-compatible stereo headset (MR Confon) or MEG-compatible plastic tubes and earpieces.
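For orientation, the frequencies implied by the octave-based stimulus parameters can be worked out directly. The following is a minimal sketch of that arithmetic only; the helper names are hypothetical, and the printed values are nominal frequencies, not measured properties of the delivered stimuli.

```python
def shift_octaves(freq_hz: float, octaves: float) -> float:
    """Return freq_hz shifted by the given number of octaves (negative = down)."""
    return freq_hz * 2.0 ** octaves


STANDARDS_HZ = (500.0, 2000.0)   # standard-tone frequencies from the text
NOTCH_HALF_WIDTH = 1.0 / 6.0     # notch spans +/- 1/6 octave around each standard
DEVIANT_STEPS = {"difficult": 1.0 / 24.0, "easy": 1.0 / 12.0}


def notch_edges(center_hz: float):
    """Lower and upper nominal edges of the masker notch around center_hz."""
    return (shift_octaves(center_hz, -NOTCH_HALF_WIDTH),
            shift_octaves(center_hz, NOTCH_HALF_WIDTH))


def deviant_frequencies(center_hz: float):
    """Nominal frequencies of the difficult and easy deviants."""
    return {name: shift_octaves(center_hz, step)
            for name, step in DEVIANT_STEPS.items()}


if __name__ == "__main__":
    for f in STANDARDS_HZ:
        lo, hi = notch_edges(f)
        dev = deviant_frequencies(f)
        print(f"{f:.0f} Hz standard: notch {lo:.1f}-{hi:.1f} Hz, "
              f"difficult {dev['difficult']:.1f} Hz, easy {dev['easy']:.1f} Hz")
```

For the 500-Hz standard this gives a nominal notch of roughly 445–561 Hz, with deviants near 515 Hz (difficult) and 530 Hz (easy); both deviants therefore fall inside the notch rather than in the masked band.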
The subjects were instructed to press a button upon hearing a deviant in the designated ear and ignore the inputs to the other ear. To distinguish transient tuning effects from longer-term learning, the attended ear was shifted after every 30 s during MEG/EEG, as signaled by a buzzer sound. For more detailed analyses of evolution of attention effects, tone responses were pooled into five consecutive bins (responses to three subsequent tones per bin) within each 30-s period after attention shifting cues. During fMRI, attention was shifted in between the sparse-sampling echo-planar imaging acquisitions [repetition time (TR), 11.7 s] after every other TR. To avoid prolonged measurements, the monaural 500-Hz and 2,000-Hz trains were presented to the same ears in each participant, in an order counterbalanced across the group: This property was used as a covariate in statistical analyses of auditory cortex ROI activity.
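The pooling of tone responses into consecutive bins can be sketched as follows. This is an illustration under stated assumptions: the input is a hypothetical list of per-tone response amplitudes ordered by position after the attention-shifting cue, tones beyond the fifteenth are ignored (the text does not say how they were handled), and averaging across the many 30-s periods is left out.

```python
from statistics import mean

TONES_PER_BIN = 3   # from the text: three subsequent tones per bin
N_BINS = 5          # from the text: five consecutive bins


def bin_responses(amplitudes):
    """Average per-tone amplitudes into N_BINS consecutive bins.

    amplitudes: responses ordered by tone position after the cue.
    Returns a list with one mean amplitude per bin.
    """
    needed = TONES_PER_BIN * N_BINS
    if len(amplitudes) < needed:
        raise ValueError(f"need at least {needed} tone responses")
    return [
        mean(amplitudes[b * TONES_PER_BIN:(b + 1) * TONES_PER_BIN])
        for b in range(N_BINS)
    ]
```

The first bin then reflects the first three tones after a cue, which is the window in which the early attention effect was already observable.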
Data Acquisition.
Human subjects’ approval was obtained, and voluntary consent forms were signed before each measurement. The 306-channel MEG (Elekta-Neuromag) and 74-channel EEG data were recorded simultaneously (1,000 samples per second, passband 0.01–325 Hz) in a magnetically shielded room. Electrooculogram (EOG) was also recorded to monitor eye artifacts. All epochs exceeding 150 μV or 3,000 fT/cm at any EEG/EOG or MEG channel, respectively, were discarded. Whole-head 3T fMRI (Siemens Tim Trio) was acquired in a separate session. To circumvent response contamination by scanner noise, we used a sparse-sampling gradient-echo blood oxygen level-dependent (BOLD) sequence [TR/echo time (TE) = 11,700/30 ms, flip angle 90°] with 48 slices along the anterior–posterior commissure line (2.25-mm slices, 0.75-mm gap, 3 × 3 mm² in-plane resolution), with the coolant pump switched off. T1-weighted 3D MRIs (TR/TE = 2,750/3.9 ms, 1.3 × 1 × 1.3 mm³, 256 × 256 matrix) were obtained for combining anatomical and functional data.
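The epoch rejection rule can be expressed as a simple check. The sketch below assumes that "exceeding" refers to the absolute value of any sample; the text does not say whether an absolute or a peak-to-peak criterion was used, and the function name and data layout are hypothetical.

```python
EEG_EOG_LIMIT_UV = 150.0        # rejection limit for EEG/EOG channels, in microvolts
MEG_LIMIT_FT_PER_CM = 3000.0    # rejection limit for MEG channels, in fT/cm


def epoch_is_clean(eeg_eog_uv, meg_ft_per_cm):
    """Return True if no sample on any channel exceeds its limit.

    eeg_eog_uv, meg_ft_per_cm: lists of channels, each a list of samples.
    Assumption: the limit applies to the absolute sample value.
    """
    def exceeds(channels, limit):
        return any(abs(sample) > limit
                   for channel in channels
                   for sample in channel)

    return not (exceeds(eeg_eog_uv, EEG_EOG_LIMIT_UV)
                or exceeds(meg_ft_per_cm, MEG_LIMIT_FT_PER_CM))
```

An epoch is dropped as a whole as soon as a single channel violates its limit, which matches the "at any ... channel" wording.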
Data Analysis.
Behavioral data collected during fMRI and MEG/EEG were entered into a four-way ANOVA to determine the effects of imaging modality, attended ear, masking condition, and target type on reaction times and hit rates.
In addition to sensor-level analyses, auditory evoked activities were localized using a combined MEG/EEG/fMRI approach. Although fMRI allows spatially accurate whole-brain sampling of activations, it lacks the temporal resolution for determining dynamic neuronal processes. MEG and EEG, in turn, have a millisecond temporal resolution, but a constraining model is needed for localizing the cerebral sources of measured signals. Therefore, we used information from the different imaging modalities to reduce the number of potential solutions. First, the simultaneous MEG and EEG provide complementary information about neural activity (26, 42). Second, because MEG and EEG signals are mainly generated by currents in the cerebral gray matter, the source locations can be restricted to the cortex by using anatomical MRI constraints (43). Finally, the MEG/EEG inverse solution can be constrained even further by fMRI information on the BOLD changes within the gray matter (28, 44). Use of fMRI to constrain MEG/EEG source models can be justified by previous studies showing that the BOLD signal correlates closely with the postsynaptic neuronal events (45) that also generate the MEG/EEG signal (46).
For sensor-level and source analyses, stimulus-locked 800-ms MEG/EEG epochs were averaged offline (200-ms prestimulus baseline). Attentional difference responses were then determined from epochs to attended minus ignored standard tones (7–10). Auditory cortex activations were estimated using fMRI-guided depth-weighted ℓ2 MNE (28, 46). The information from structural segmentation of the individual MRIs and the MEG sensor and EEG electrode locations were used to compute the forward solutions for all source locations using a three-compartment boundary element model (46). For inverse computations, cortical surfaces extracted (47) with Freesurfer software (http://surfer.nmr.mgh.harvard.edu/) were decimated to approximately 5,000 vertices per hemisphere. The individual forward solutions for current dipoles placed at these vertices comprised the columns of the gain matrix (A). A noise covariance matrix (C) was estimated from the raw MEG/EEG data during the baseline and scaled according to the number of averaged epochs. These two matrices, along with the source covariance matrix R, were used to calculate the MNE inverse operator $W = RA^{T}(ARA^{T} + C)^{-1}$. The MEG/EEG data at each time point were multiplied by W to yield the estimated source activity, as a function of time, on the cortical surface: $s(t) = Wx(t)$ (26–28, 44). A loose orientation constraint was used to prefer currents perpendicular to the cortical surface (27).
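The inverse operator and its application to the sensor data can be sketched numerically with NumPy. This is a toy illustration of the two formulas as written in the text, not the analysis pipeline: depth weighting, the loose orientation constraint, and any regularization scaling are omitted, and the function names are hypothetical.

```python
import numpy as np


def mne_inverse_operator(A, R, C):
    """Return W = R A^T (A R A^T + C)^{-1}.

    A: gain matrix, shape (n_sensors, n_sources)
    R: source covariance, shape (n_sources, n_sources)
    C: noise covariance, shape (n_sensors, n_sensors)
    """
    M = A @ R @ A.T + C
    # W M = R A^T  <=>  M^T W^T = A R^T, so solve for W^T instead of inverting M.
    return np.linalg.solve(M.T, A @ R.T).T


def estimate_sources(W, x):
    """Apply the inverse operator: s(t) = W x(t).

    x: sensor data, shape (n_sensors, n_times).
    Returns source estimates, shape (n_sources, n_times).
    """
    return W @ x
```

Because W depends only on A, R, and C, it can be computed once per subject and then applied to every time point of the averaged attended-minus-ignored response.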
Each vertex point in the cortical surface was assigned an fMRI significance value, calculated by using FSL (48) (www.fmrib.ox.ac.uk/fsl): movement-corrected, spatially smoothed (Gaussian kernel of FWHM 5 mm), and intensity-normalized fMRI time-series were entered into a general linear model with the task conditions as explanatory variables. Each subject's functional volumes were then registered to their anatomical images and resampled onto a cortical surface representation (49) to obtain the fMRI priors. fMRI weighting was set to 90% (44). That is, diagonal elements in R corresponding to vertices with below-threshold (P < 0.05, corrected according to the Gaussian random field theory) significance values were multiplied by 0.1.
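The 90% fMRI weighting amounts to a simple rescaling of the diagonal of R. The sketch below assumes that R is diagonal and is passed as a list of prior variances, and that a per-vertex boolean flag marks vertices whose fMRI significance passed the threshold; both the function name and this data layout are hypothetical.

```python
def fmri_weighted_diagonal(prior_variances, is_significant, fmri_weighting=0.9):
    """Scale the diagonal of the source covariance matrix R by the fMRI prior.

    prior_variances: diagonal elements of R before fMRI weighting.
    is_significant: one boolean per vertex, True where the fMRI
        significance value passed the threshold.
    fmri_weighting: 0.9 means nonsignificant vertices keep 10% of
        their prior variance.
    """
    if len(prior_variances) != len(is_significant):
        raise ValueError("one significance flag is needed per vertex")
    scale = 1.0 - fmri_weighting
    return [v if sig else v * scale
            for v, sig in zip(prior_variances, is_significant)]
```

Sources outside fMRI-active regions are thus penalized rather than excluded, so MEG/EEG activity without a hemodynamic counterpart can still appear in the estimate.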
In addition to the individual-level analyses needed to guide MEG/EEG source modeling, unimodal fMRI group analyses were conducted within the Talairach standard space (Montreal Neurological Institute's MNI-152 template) by using a clustered FSL mixed-effects analysis (48) and, additionally, by using an ROI analysis. Nine ROIs per hemisphere were selected by masking the FSL MNI structural atlas (50) with the main effect of the group fMRI mixed-effects results. fMRI-ROI data were entered into a repeated-measures ANOVA to test the main effect of noise masking. Additional a priori contrasts between the masking and no-masking conditions were calculated within each ROI using paired t tests. (An additional confirmatory uncorrected voxel-by-voxel group fMRI analysis is described in Fig. S5.)
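The within-ROI contrast between masking and no-masking conditions amounts to a paired t test across subjects. A minimal sketch follows, using made-up per-subject ROI values and a hypothetical helper `paired_t`. It returns only the t statistic and degrees of freedom, and leaves the p value to a statistics package.

```python
import numpy as np


def paired_t(masking, no_masking):
    """Paired t statistic and degrees of freedom for one ROI."""
    d = np.asarray(masking, dtype=float) - np.asarray(no_masking, dtype=float)
    n = d.size
    # Mean within-subject difference divided by its standard error
    t = d.mean() / (d.std(ddof=1) / np.sqrt(n))
    return float(t), n - 1


# Hypothetical per-subject ROI signal changes (not data from the study)
masking = [1.2, 1.5, 1.1, 1.8]
no_masking = [1.0, 1.1, 1.0, 1.3]
t_stat, dof = paired_t(masking, no_masking)
```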
For group-level statistical analyses of MEG/EEG/fMRI data, individual subjects’ MNEs of attentional difference responses were normalized into a spherical standard brain representation (51). Noise-normalized dynamic statistical parameter maps (dSPM) were then calculated as t statistics across the subjects, within time windows encompassing the early N1 (50–150 ms) and later PN (150–350 ms) responses. A common FDR threshold was calculated for these dSPMs to control for multiple comparisons (52). Finally, MEG/EEG/fMRI estimates were examined within ROIs determined from the group estimates, separately for each time window, as $A_{ROI} = A_{M} \cup A_{NM}$, where $A_{M}$ and $A_{NM}$ represent peak activation areas contralateral to the attended ear during masking and no masking, respectively. The ROI-average MNE dipole moment values were entered into a repeated-measures ANOVA to determine the effects of masking on attentional effects in each hemisphere and stimulation condition. For sensor-level analyses, the average magnitude of the MEG field gradients in six gradiometer pairs per hemisphere, covering the temporal regions including the auditory cortices, was calculated 50–150 ms after stimulus. The resulting data were entered into a hemisphere × attention condition × masking condition repeated-measures ANOVA. Confirmatory analyses of oscillatory activities to the AM frequencies of noise maskers are described in SI Materials and Methods.
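A common FDR threshold of the kind described above can be sketched with the standard Benjamini-Hochberg step-up procedure applied to the p values pooled across maps. The p values below are invented for illustration, the q level of 0.05 is an assumption, and `fdr_threshold` is a hypothetical helper rather than a function from any specific package.

```python
import numpy as np


def fdr_threshold(p_values, q=0.05):
    """Benjamini-Hochberg p-value threshold.

    Returns the largest sorted p(i) satisfying p(i) <= (i / m) * q,
    or None when no p value meets the criterion.
    """
    p = np.sort(np.asarray(p_values, dtype=float).ravel())
    m = p.size
    below = p <= (np.arange(1, m + 1) / m) * q
    if not below.any():
        return None
    return float(p[np.nonzero(below)[0].max()])


# Hypothetical p values pooled across vertices and time windows
p_vals = np.array([0.001, 0.008, 0.039, 0.041, 0.20, 0.60])
thr = fdr_threshold(p_vals, q=0.05)
if thr is None:
    significant = np.zeros(p_vals.shape, dtype=bool)
else:
    significant = p_vals <= thr
```

Pooling all maps before computing the threshold is what makes it "common": every map is then thresholded at the same p value.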
To test our hypothesis that attentional tuning of auditory cortex neurons predicts behavioral sound-discrimination performance, standard-tone MEG/EEG/fMRI difference maps obtained from each subject's depth-weighted MNEs (representing dipole moment values) were correlated with behavioral measures of “tuning accuracy” in target-sound discrimination. The behavioral tuning accuracy, determined on the basis of hit rate and reaction time differences between responses to the easy 1/12-octave deviants and difficult 1/24-octave deviants, was correlated with the dipole strength at vertex locations showing a statistically significant auditory attention response.
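The brain-behavior analysis can be pictured as a Pearson correlation computed across subjects at each vertex. The sketch below assumes one behavioral score per subject and a subjects-by-vertices array of dipole strengths. The numbers are fabricated for illustration, and the exact way hit rates and reaction times were combined into the tuning-accuracy score is not reproduced here.

```python
import numpy as np


def vertexwise_pearson(source, behavior):
    """Pearson r between a behavioral score and source strength per vertex.

    source: array of shape (n_subjects, n_vertices)
    behavior: array of shape (n_subjects,)
    """
    source = np.asarray(source, dtype=float)
    behavior = np.asarray(behavior, dtype=float)
    sc = source - source.mean(axis=0)      # center each vertex across subjects
    bc = behavior - behavior.mean()        # center the behavioral scores
    num = bc @ sc
    den = np.sqrt((sc ** 2).sum(axis=0) * (bc ** 2).sum())
    return num / den


# Hypothetical tuning-accuracy scores for four subjects
behavior = np.array([1.0, 2.0, 3.0, 4.0])
# Hypothetical dipole strengths at three vertices
source = np.column_stack([
    2.0 * behavior + 1.0,          # perfectly correlated vertex
    -behavior,                     # perfectly anticorrelated vertex
    np.array([1.0, 3.0, 2.0, 4.0]),
])
r = vertexwise_pearson(source, behavior)
```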
Acknowledgments
We thank Deirdre Foxe, Natsuko Mori, and Dan Wakeman for their help. This work was supported by National Institutes of Health Awards R01MH083744, R21DC010060, R01HD040712, R01NS037462, R01NS057500, R01NS048279, S10RR014798, and P41RR14075, the National Center for Research Resources, and the Academy of Finland.
Footnotes
*This Direct Submission article had a prearranged editor.
The authors declare no conflict of interest.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1016134108/-/DCSupplemental.
References
- 1. Grady CL, et al. Attention-related modulation of activity in primary and secondary auditory cortex. Neuroreport. 1997;8:2511–2516. doi: 10.1097/00001756-199707280-00019.
- 2. Jäncke L, Specht K, Shah JN, Hugdahl K. Focused attention in a simple dichotic listening task: An fMRI experiment. Brain Res Cogn Brain Res. 2003;16:257–266. doi: 10.1016/s0926-6410(02)00281-1.
- 3. Petkov CI, et al. Attentional modulation of human auditory cortex. Nat Neurosci. 2004;7:658–663. doi: 10.1038/nn1256.
- 4. Woods DL, et al. Functional maps of human auditory cortex: Effects of acoustic features and attention. PLoS ONE. 2009;4:e5183. doi: 10.1371/journal.pone.0005183.
- 5. Zatorre RJ, Mondor TA, Evans AC. Auditory attention to space and frequency activates similar cerebral systems. Neuroimage. 1999;10:544–554. doi: 10.1006/nimg.1999.0491.
- 6. Hillyard SA, Hink RF, Schwent VL, Picton TW. Electrical signs of selective attention in the human brain. Science. 1973;182:177–180. doi: 10.1126/science.182.4108.177.
- 7. Hansen JC, Hillyard SA. Endogenous brain potentials associated with selective auditory attention. Electroencephalogr Clin Neurophysiol. 1980;49:277–290. doi: 10.1016/0013-4694(80)90222-9.
- 8. Alho K. Selective attention in auditory processing as reflected by event-related brain potentials. Psychophysiology. 1992;29:247–263. doi: 10.1111/j.1469-8986.1992.tb01695.x.
- 9. Näätänen R. Attention and Brain Function. Hillsdale, NJ: Lawrence Erlbaum; 1992.
- 10. Woldorff MG, et al. Modulation of early sensory processing in human auditory cortex during auditory selective attention. Proc Natl Acad Sci USA. 1993;90:8722–8726. doi: 10.1073/pnas.90.18.8722.
- 11. Alain C, Arnott SR. Selectively attending to auditory objects. Front Biosci. 2000;5:D202–D212. doi: 10.2741/alain.
- 12. Desimone R, Duncan J. Neural mechanisms of selective visual attention. Annu Rev Neurosci. 1995;18:193–222. doi: 10.1146/annurev.ne.18.030195.001205.
- 13. Hillyard SA, Vogel EK, Luck SJ. Sensory gain control (amplification) as a mechanism of selective attention: Electrophysiological and neuroimaging evidence. Philos Trans R Soc Lond B Biol Sci. 1998;353:1257–1270. doi: 10.1098/rstb.1998.0281.
- 14. Ahveninen J, et al. Task-modulated “what” and “where” pathways in human auditory cortex. Proc Natl Acad Sci USA. 2006;103:14608–14613. doi: 10.1073/pnas.0510480103.
- 15. Jääskeläinen IP, Ahveninen J, Belliveau JW, Raij T, Sams M. Short-term plasticity in auditory cognition. Trends Neurosci. 2007;30:653–661. doi: 10.1016/j.tins.2007.09.003.
- 16. Kauramäki J, Jääskeläinen IP, Sams M. Selective attention increases both gain and feature selectivity of the human auditory cortex. PLoS ONE. 2007;2:e909. doi: 10.1371/journal.pone.0000909.
- 17. Okamoto H, Stracke H, Wolters CH, Schmael F, Pantev C. Attention improves population-level frequency tuning in human auditory cortex. J Neurosci. 2007;27:10383–10390. doi: 10.1523/JNEUROSCI.2963-07.2007.
- 18. Okamoto H, Stracke H, Zwitserlood P, Roberts LE, Pantev C. Frequency-specific modulation of population-level frequency tuning in human auditory cortex. BMC Neurosci. 2009;10:1–14. doi: 10.1186/1471-2202-10-1.
- 19. Murray SO, Wojciulik E. Attention increases neural selectivity in the human lateral occipital complex. Nat Neurosci. 2004;7:70–74. doi: 10.1038/nn1161.
- 20. Womelsdorf T, Anton-Erxleben K, Pieper F, Treue S. Dynamic shifts of visual receptive fields in cortical area MT by spatial attention. Nat Neurosci. 2006;9:1156–1160. doi: 10.1038/nn1748.
- 21. Fritz J, Shamma S, Elhilali M, Klein D. Rapid task-related plasticity of spectrotemporal receptive fields in primary auditory cortex. Nat Neurosci. 2003;6:1216–1223. doi: 10.1038/nn1141.
- 22. Fritz JB, Elhilali M, David SV, Shamma SA. Does attention play a role in dynamic receptive field adaptation to changing acoustic salience in A1? Hear Res. 2007;229:186–203. doi: 10.1016/j.heares.2007.01.009.
- 23. Atiani S, Elhilali M, David SV, Fritz JB, Shamma SA. Task difficulty and performance induce diverse adaptive patterns in gain and shape of primary auditory cortical receptive fields. Neuron. 2009;61:467–480. doi: 10.1016/j.neuron.2008.12.027.
- 24. Mondor TA, Zatorre RJ. Shifting and focusing auditory spatial attention. J Exp Psychol Hum Percept Perform. 1995;21:387–409. doi: 10.1037//0096-1523.21.2.387.
- 25. Sams M, Salmelin R. Evidence of sharp frequency tuning in the human auditory cortex. Hear Res. 1994;75:67–74. doi: 10.1016/0378-5955(94)90057-4.
- 26. Liu AK, Dale AM, Belliveau JW. Monte Carlo simulation studies of EEG and MEG localization accuracy. Hum Brain Mapp. 2002;16:47–62. doi: 10.1002/hbm.10024.
- 27. Lin FH, Belliveau JW, Dale AM, Hämäläinen MS. Distributed current estimates using cortical orientation constraints. Hum Brain Mapp. 2006;27:1–13. doi: 10.1002/hbm.20155.
- 28. Dale AM, et al. Dynamic statistical parametric mapping: combining fMRI and MEG for high-resolution imaging of cortical activity. Neuron. 2000;26:55–67. doi: 10.1016/s0896-6273(00)81138-1.
- 29. Ross B, Hillyard SA, Picton TW. Temporal dynamics of selective attention during dichotic listening. Cereb Cortex. 2010;20:1360–1371. doi: 10.1093/cercor/bhp201.
- 30. Duncan J, Owen AM. Common regions of the human frontal lobe recruited by diverse cognitive demands. Trends Neurosci. 2000;23:475–483. doi: 10.1016/s0166-2236(00)01633-7.
- 31. Rauschecker JP. Parallel processing in the auditory cortex of primates. Audiol Neurootol. 1998;3:86–103. doi: 10.1159/000013784.
- 32. Glasberg BR, Moore BC. Derivation of auditory filter shapes from notched-noise data. Hear Res. 1990;47:103–138. doi: 10.1016/0378-5955(90)90170-t.
- 33. Recanzone GH, Guard DC, Phan ML. Frequency and intensity response properties of single neurons in the auditory cortex of the behaving macaque monkey. J Neurophysiol. 2000;83:2315–2331. doi: 10.1152/jn.2000.83.4.2315.
- 34. Kastner S, Schneider KA, Wunderlich K. Beyond a relay nucleus: Neuroimaging views on the human LGN. Prog Brain Res. 2006;155:125–143. doi: 10.1016/S0079-6123(06)55008-3.
- 35. Bitterman Y, Mukamel R, Malach R, Fried I, Nelken I. Ultra-fine frequency tuning revealed in single neurons of human auditory cortex. Nature. 2008;451:197–201. doi: 10.1038/nature06476.
- 36. Lee J, Maunsell JH. A normalization model of attentional modulation of single unit responses. PLoS ONE. 2009;4:e4651. doi: 10.1371/journal.pone.0004651.
- 37. Luo F, Wang Q, Kashani A, Yan J. Corticofugal modulation of initial sound processing in the brain. J Neurosci. 2008;28:11615–11621. doi: 10.1523/JNEUROSCI.3972-08.2008.
- 38. Schwent VL, Hillyard SA, Galambos R. Selective attention and the auditory vertex potential. Effects of signal intensity and masking noise. Electroencephalogr Clin Neurophysiol. 1976;40:615–622. doi: 10.1016/0013-4694(76)90136-x.
- 39. Barch DM, et al. Dissociating working memory from task difficulty in human prefrontal cortex. Neuropsychologia. 1997;35:1373–1380. doi: 10.1016/s0028-3932(97)00072-9.
- 40. Westerhausen R, et al. Identification of attention and cognitive control networks in a parametric auditory fMRI study. Neuropsychologia. 2010;48:2075–2081. doi: 10.1016/j.neuropsychologia.2010.03.028.
- 41. Parbery-Clark A, Skoe E, Kraus N. Musical experience limits the degradative effects of background noise on the neural processing of sound. J Neurosci. 2009;29:14100–14107. doi: 10.1523/JNEUROSCI.3256-09.2009.
- 42. Sharon D, Hämäläinen MS, Tootell RB, Halgren E, Belliveau JW. The advantage of combining MEG and EEG: Comparison to fMRI in focally stimulated visual cortex. Neuroimage. 2007;36:1225–1235. doi: 10.1016/j.neuroimage.2007.03.066.
- 43. Dale A, Sereno M. Improved localization of cortical activity by combining EEG and MEG with MRI cortical surface reconstruction: A linear approach. J Cogn Neurosci. 1993;5:162–176. doi: 10.1162/jocn.1993.5.2.162.
- 44. Liu A, Belliveau J, Dale A. Spatiotemporal imaging of the human brain activity using functional MRI constrained magnetoencephalography data: Monte Carlo simulations. Proc Natl Acad Sci USA. 1998;95:8945–8950. doi: 10.1073/pnas.95.15.8945.
- 45. Logothetis NK. The neural basis of the blood-oxygen-level-dependent functional magnetic resonance imaging signal. Philos Trans R Soc Lond B Biol Sci. 2002;357:1003–1037. doi: 10.1098/rstb.2002.1114.
- 46. Hämäläinen M, Hari R, Ilmoniemi R, Knuutila J, Lounasmaa O. Magnetoencephalography—theory, instrumentation, and applications to noninvasive studies of the working human brain. Rev Mod Phys. 1993;65:413–497.
- 47. Dale AM, Fischl B, Sereno MI. Cortical surface-based analysis. I. Segmentation and surface reconstruction. Neuroimage. 1999;9:179–194. doi: 10.1006/nimg.1998.0395.
- 48. Woolrich MW, et al. Bayesian analysis of neuroimaging data in FSL. Neuroimage. 2009;45(1 Suppl):S173–S186. doi: 10.1016/j.neuroimage.2008.10.055.
- 49. Fischl B, Sereno MI, Dale AM. Cortical surface-based analysis. II: Inflation, flattening, and a surface-based coordinate system. Neuroimage. 1999;9:195–207. doi: 10.1006/nimg.1998.0396.
- 50. Mazziotta J, et al. A probabilistic atlas and reference system for the human brain: International Consortium for Brain Mapping (ICBM). Philos Trans R Soc Lond B Biol Sci. 2001;356:1293–1322. doi: 10.1098/rstb.2001.0915.
- 51. Fischl B, Sereno MI, Tootell RB, Dale AM. High-resolution intersubject averaging and a coordinate system for the cortical surface. Hum Brain Mapp. 1999;8:272–284. doi: 10.1002/(SICI)1097-0193(1999)8:4<272::AID-HBM10>3.0.CO;2-4.
- 52. Genovese CR, Lazar NA, Nichols T. Thresholding of statistical maps in functional neuroimaging using the false discovery rate. Neuroimage. 2002;15:870–878. doi: 10.1006/nimg.2001.1037.