Abstract
Auditory working memory (WM) processing in everyday acoustic environments depends on our ability to maintain relevant information online in our minds, and to suppress interference caused by competing incoming stimuli. A challenge in communication settings is that the relevant content and irrelevant inputs may emanate from a common source, such as a talkative conversationalist. An open question is how the WM system deals with such interference. Will the distracters become inadvertently filtered before processing for meaning because the primary WM operations deplete all available processing resources? Or are they suppressed post-perceptually, through an active control process? We tested these alternative hypotheses by measuring magnetoencephalography (MEG), EEG, and functional MRI (fMRI) during a phonetic auditory continuous performance task. Contextual WM maintenance load was manipulated by adjusting the number of “filler” letter sounds between the cue and target letter sounds. Trial-to-trial variability of pre- and post-stimulus activations in fMRI-informed cortical MEG/EEG estimates was analyzed within and across 14 subjects using generalized linear mixed effect (GLME) models. High contextual WM maintenance load suppressed left auditory cortex (AC) activations around 250–300 ms after the onset of irrelevant phonetic sounds. This effect coincided with increased 10–14 Hz alpha-range oscillatory functional connectivity between the left dorsolateral prefrontal cortex (DLPFC) and left AC. Suppression of AC responses to irrelevant sounds during active maintenance of the task context also correlated with increased pre-stimulus 7–15 Hz alpha power. Our results suggest that under high auditory WM load, irrelevant sounds are suppressed through a “late” active suppression mechanism, which prevents short-term consolidation of irrelevant information without affecting the initial screening of potentially meaningful stimuli. The results also suggest that AC alpha oscillations play an inhibitory role during auditory WM processing.
Introduction
Our ability to suppress irrelevant sounds from interfering with auditory working memory (WM), while also maintaining the capacity to monitor the acoustic environment, is crucial in everyday communication settings. For example, during a conversation, a listener may hear an interesting detail that needs to be held online and protected from auditory interference until the talker has finished. Despite many previous WM studies involving auditory stimuli, several fundamental questions on how the human brain achieves these feats are still open.
It has been long debated whether irrelevant inputs are filtered before (“early selection”; Broadbent, 1958; Hillyard et al., 1973; Woldorff et al., 1993) or after processing for meaning (“late selection”; Deutsch and Deutsch, 1963; Näätänen, 1992). A potential resolution is offered by the “load theory” of visual attention (Lavie, 2005), which suggests that the selection stage depends on contextual factors. For example, conditions with multiple distractors, i.e., high “perceptual load” will inherently promote “early selection” because no processing resources are left for irrelevant stimuli. In the auditory domain, this idea is supported by studies of the auditory cortex (AC) change detection response, the mismatch negativity (MMN). During easy conditions such as movie viewing MMN is elicited even to semantic changes in the unattended stream, allowing late selection (Näätänen et al., 2001; Pulvermuller et al., 2001), but MMN to task-irrelevant deviants is virtually abolished during difficult dichotic listening (i.e., “early selection”) (Woldorff et al., 1991; Woldorff et al., 1998). Interestingly, recent studies suggest that auditory distracters could be filtered at an early stage, not only under perceptual, but also under high visual WM load (Halin et al., 2015; SanMiguel et al., 2008). If true, these findings would also imply that suppression of irrelevant information is a “byproduct” of the primary WM maintenance and manipulation operations, requiring no dedicated control processes.
However, in many everyday settings, relevant and irrelevant sounds occur in the same stream, and all inputs may need to be initially screened to allow selective WM processing (i.e., “late selection”). For example, in the above-mentioned social conversation example, it would not be constructive for the listener to completely shut off the other person, even though the talker’s voice interferes with their auditory WM. Consistent with these notions, experiments that emulate conditions where relevant and irrelevant sounds are embedded in overlapping streams suggest prominent auditory ERP (Berti and Schroger, 2003; Muller-Gass and Schroger, 2007) and behavioral (Dalton et al., 2009) effects of irrelevant distracters also under high WM load. This has led to an interpretation that distracter suppression requires WM resources and active control (Berti and Schroger, 2003; Dalton et al., 2009). According to visual studies, one of the areas controlling sensory gating specifically during WM maintenance could be the dorsolateral prefrontal cortex (DLPFC) (Postle, 2005).
Previous studies suggest that suppression of irrelevant information during attention and WM tasks is supported by neuronal oscillations at the alpha range (7–15 Hz) (Cooper et al., 2003; Klimesch et al., 2007; Palva and Palva, 2007; Pfurtscheller, 2003). Alpha-band oscillations have been suggested to regulate cortical excitability to prioritize relevant and disregard irrelevant visual field locations (Foxe et al., 1998; Jensen and Mazaheri, 2010; Worden et al., 2000) and to suppress anticipated WM distracters (Bonnefond and Jensen, 2012). Similar effects are yet to be confirmed in the auditory domain. There is some evidence of lateralized alpha-power increases in ACs ipsilateral to the attended ear (Müller and Weisz, 2012). However, only a few previous studies have been able to show a direct relationship between AC alpha oscillations and distracter-elicited neuronal responses (Wöstmann et al., 2016). The alpha inhibition hypothesis is contrasted by an alternative interpretation that alpha synchronization enhances stimulus maintenance at the network level (Leiberg et al., 2006; Obleser et al., 2012; Palva and Palva, 2007; Wilsch and Obleser, 2015). This idea is based on correlations between alpha power and increased WM load, which has been observed both in visual (Palva and Palva, 2007) and auditory domains (Leiberg et al., 2006; Obleser et al., 2012; Wilsch and Obleser, 2015).
Here, we examined whether irrelevant sounds become inadvertently filtered before becoming fully processed during high auditory WM load (“filtering hypothesis”), or whether they are actively suppressed post-perceptually (“active suppression hypothesis”). First, in addition to examining the correlates of trial-to-trial variation during auditory WM, we measured functional coupling between frontal and AC areas after filler sounds by analyzing long-range synchronization of neuronal oscillations, which has been suggested to play a crucial role in WM (Jensen et al., 2007). Second, we examined the role of alpha oscillations in suppression of irrelevant sensory information in ACs. Alternative hypotheses were compared using a multimodal approach, which combines information from MEG, EEG, and fMRI to achieve high-resolution estimates of stimulus processing and interregional connectivity in the cortical “source space”.
In previous studies, the effects of WM load on auditory stimulus processing have often been studied using intermodal tasks (e.g., Halin et al., 2015; SanMiguel et al., 2008). However, in real-life settings, such as our initial conversation example, the distracters and relevant stimuli can be embedded into the same auditory stream that emanates from a common source. Here, to examine challenges under such conditions, we used an auditory “AX” continuous performance task, modified from a paradigm sensitive to auditory WM changes in disorders such as schizophrenia (Seidman et al., 1998; Seidman et al., 2012; Seidman et al., 2016) (Fig. 1). During this task, the subjects monitor a sequence of spoken letters for a target letter, and respond to the target letter only when it follows a cue letter and a pre-specified number of “filler” sounds (Seidman et al., 1998; Seidman et al., 2012; Seidman et al., 2016). This requires active maintenance of the cue and sequence information and suppression of the irrelevant fillers during the delay period. In other words, instead of “item load” in the classic sense, the current task requires active maintenance of the sequential task context. The longer the sequence during which this task context needs to be held online, the more it presumably burdens the WM system. In certain blocks, the cue–target period could also be interleaved with additional cues, ensuring that the subject needed to continuously monitor the letter sounds also during the cue–target periods, instead of resorting to a simple counting strategy.
Methods
Task and design
Sixteen right-handed, college-educated adults with self-reported normal hearing and no neurological disorders, psychiatric conditions, or learning disabilities participated. Two subjects were excluded from the final sample because they failed to perform the task (hit rate <75% in a simple vigilance condition), leaving a total of 14 subjects for analysis (age 19–28 years, mean 21 years; 8 females). During an auditory continuous performance task (Seidman et al., 1998; Seidman et al., 2012) (Figure 1), the subjects monitored a sequence of voiced letter sounds for the presence of a target letter “A” that followed a cue letter “Q” and a pre-specified number of “filler” sounds that consisted of the other letters of the English alphabet. In vigilance control task blocks, the subjects were asked to respond to each “A” stimulus that occurred immediately after a “Q” event. In “low WM load” blocks, they were asked to respond to each “A” that occurred after a “Q” event and one other letter stimulus. Finally, in “high WM load” blocks, the subjects were to respond to each “A” stimulus that occurred after a “Q” event and three other letters. In this study, we were specifically interested in how the maintenance of the auditory WM task set during this Q–A memory period modulates responses to task-irrelevant “filler” sounds. This interest is motivated by a recent multicenter study (Seidman et al., 2016), which suggested that poor performance in auditory “AX” tasks manipulating the WM maintenance load is among the best neuropsychological predictors of clinical psychosis in individuals at high risk for schizophrenia.
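The response rule of the three block types can be illustrated with a minimal sketch. The function name `find_targets` and the example letter sequence are hypothetical and are not taken from the study materials. The sketch also assumes that any letter may fill the positions between the cue and the target; how interleaved cues were scored in the actual experiment is not specified here.

```python
def find_targets(letters, n_fillers, cue="Q", target="A"):
    """Return indices of target letters that follow a cue and n_fillers other letters.

    n_fillers = 0 mimics the vigilance blocks, 1 the low-load blocks,
    and 3 the high-load blocks described in the text.
    """
    hits = []
    for i, letter in enumerate(letters):
        cue_pos = i - n_fillers - 1  # where the cue must have occurred
        if letter == target and cue_pos >= 0 and letters[cue_pos] == cue:
            hits.append(i)
    return hits


# Hypothetical sequence, for illustration only.
sequence = list("QAXQBAQCDEA")
vigilance_hits = find_targets(sequence, n_fillers=0)
low_load_hits = find_targets(sequence, n_fillers=1)
high_load_hits = find_targets(sequence, n_fillers=3)
```

The same letter stream thus yields different targets depending on the block instruction, which is why the instruction has to be maintained throughout each block.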
In addition to the number of filler sounds between the “Q” and “A”, our factorial task also manipulated the level of higher-level control demands: In certain task blocks, the Q–A maintenance period was occasionally interrupted by distracter Q events or non-target A letters. In reference to our conversation example, this would be analogous to a significant event that occurs while the listener is holding an important thought and only cursorily monitoring the talker’s voice. The task was divided into 90-s blocks during which the number of filler letters between the cue “Q” and target “A” remained the same. These task blocks started with a visual instruction cue. This instruction was identical before task blocks with or without interfering Q events. This ensured that the subject needed to keep track of the filler sounds in all task conditions.
The voiced letter sounds were presented at a comfortable level and at an average inter-stimulus interval (ISI) of 1 s, jittered ±50 ms to prevent a buildup of the subject’s expectation. The original sound tokens, obtained from the Psychology Experiment Building Language (PEBL) Sound Archive version 0.1, had been edited to ensure that the duration of each sound file was 400 ms. Similar tasks and stimuli were used in separate MEG/EEG and fMRI localizer sessions. The MEG/EEG sessions consisted of two 30-minute runs. The different tasks were presented as randomly ordered 90-s blocks, each starting with a 2-s visual instruction. The fMRI tasks were otherwise similar, but the task instruction was changed after every third 10.5-s sparse sampling acquisition to avoid overly long analysis blocks in the fMRI analyses. It is important to note that, here, the fMRI data were used merely to confine the MEG/EEG source space, based on a pooled all vs. baseline contrast. In other words, the comparisons of MEG/EEG source activities across conditions were independent of any differences in the corresponding fMRI effects.
Data acquisitions
Human subjects’ approval was obtained and voluntary consents were signed before each measurement. 306-channel MEG (Elekta-Neuromag, Helsinki, Finland) and 74-channel EEG data were recorded simultaneously (600 samples/s, passband 0.01–192 Hz) in a magnetically shielded room. Common-average reference was utilized for all analyses of EEG data. The position of the head relative to the sensor array was monitored continuously using four Head-Position Indicator (HPI) coils attached to the scalp. Electro-oculogram (EOG) was also recorded to monitor eye artifacts. Whole-head 3T fMRI was acquired in a separate session using a 32-channel coil (Siemens TimTrio, Erlangen, Germany). To circumvent response contamination by scanner noise, we used a sparse-sampling gradient-echo BOLD sequence (TR/TE = 10,500/30 ms, 8.32 s silent period between acquisitions, flip angle 90°, FOV 192 mm) with 36 axial slices aligned along the anterior-posterior commissure line (3-mm slices, 0.75-mm gap, 3×3 mm² in-plane resolution), with the coolant pump switched off. T1-weighted anatomical images were obtained for combining anatomical and functional data using a multi-echo MPRAGE pulse sequence (TR = 2510 ms; 4 echoes with TEs = 1.64 ms, 3.5 ms, 5.36 ms, 7.22 ms; 176 sagittal slices with 1×1×1 mm³ voxels, 256×256 mm² matrix; flip angle = 7°). A field mapping sequence (TR = 500 ms, flip angle 55°; TE1 = 2.83 ms, TE2 = 5.29 ms) with slice and voxel parameters similar to those of the EPI sequence was used to obtain the phase and magnitude maps needed for unwarping B0 distortions of the functional data.
Data analysis
Behavioral responses occurring within 1.3 s after the target letter were accepted as correct. A nonparametric two-way Friedman ANOVA was used to examine the main effects of load and interference on performance. Because the subjects were advised to emphasize accuracy over the speed of performance, these analyses concentrated on the hit rate (HR) measures. The medians and standard errors of medians of HRs, and of reaction times (RT) for the correct detections, were estimated using bootstrapping with the data resampled 100,000 times.
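As an illustration of the bootstrap estimate of the median and its standard error, a minimal NumPy sketch is given below. The function name `bootstrap_median`, the seed, and the example hit rates are hypothetical. The sketch assumes simple resampling of subjects with replacement and takes the standard deviation of the resampled medians as the standard error; the exact resampling scheme of the original analysis may have differed.

```python
import numpy as np


def bootstrap_median(values, n_boot=100_000, seed=0):
    """Return the sample median and a bootstrap standard error of the median."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    # Each row of idx is one resample (with replacement) of the observations.
    idx = rng.integers(0, values.size, size=(n_boot, values.size))
    resampled_medians = np.median(values[idx], axis=1)
    return np.median(values), resampled_medians.std(ddof=1)


# Hypothetical per-subject hit rates, for illustration only.
hit_rates = [0.90, 0.80, 0.95, 0.85, 0.70]
median_hr, se_hr = bootstrap_median(hit_rates, n_boot=2000)
```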
Neuronal bases of auditory WM were studied using an fMRI-informed MEG/EEG approach, analogous to our previous studies (Ahveninen et al., 2013; Huang et al., 2014). External MEG noise was suppressed and subject movements, estimated continuously at 200-ms intervals, were compensated for using the signal-space separation method (Taulu et al., 2005) (Maxfilter, Elekta-Neuromag, Helsinki, Finland). The MEG/EEG data were then downsampled (300 samples/s, passband 0.5–100 Hz). Epochs coinciding with over 150 μV EOG, 100 μV EEG, 3000 fT/cm MEG gradiometer, or 4 pT MEG magnetometer changes were excluded from further analyses. To calculate fMRI-guided depth-weighted ℓ2 minimum-norm estimates (MNE) (Hämäläinen et al., 1993; Lin et al., 2006), the information from structural segmentation of the individual MRIs and the MEG sensor and EEG electrode locations were used to compute the forward solutions for all putative sources in the cortex using a three-compartment boundary element model (Hämäläinen and Sarvas, 1989). The shapes of the surfaces separating the scalp, skull, and brain compartments were determined from the anatomical MRI data using FreeSurfer 5.1 (http://surfer.nmr.mgh.harvard.edu/). For whole-brain inverse computations, cortical surfaces extracted with FreeSurfer were decimated to ~5,000 vertices per hemisphere. The individual forward solutions for current dipoles placed at these vertices comprised the columns of the gain matrix (A). A noise covariance matrix (C) was estimated from the raw MEG/EEG data during a 20–200 ms pre-stimulus baseline during non-active periods of the vigilance task. These two matrices, along with the source covariance matrix R, were used to calculate the MNE inverse operator $W = RA^{T}(ARA^{T} + C)^{-1}$.
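The inverse operator, i.e., the source covariance R times the transposed gain matrix A, times the inverse of (A R Aᵀ + C), can be written in a few lines of NumPy. The sketch below is illustrative only: the function name `mne_inverse_operator` and the toy matrix sizes are assumptions, R and C are assumed to be symmetric, and depth weighting, regularization, and noise normalization are omitted.

```python
import numpy as np


def mne_inverse_operator(A, R, C):
    """Compute W = R A^T (A R A^T + C)^{-1} for symmetric R and C.

    A: gain matrix (n_sensors x n_sources)
    R: source covariance (n_sources x n_sources)
    C: noise covariance (n_sensors x n_sensors)
    """
    M = A @ R @ A.T + C
    # Because M and R are symmetric, W^T = M^{-1} A R, so a linear solve
    # can replace the explicit matrix inverse.
    return np.linalg.solve(M, A @ R).T


# Toy dimensions for illustration (the real problem has ~10,000 sources).
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 20))
R = np.eye(20)
C = 0.1 * np.eye(5)
W = mne_inverse_operator(A, R, C)  # shape (n_sources, n_sensors)
```

Multiplying a sensor-space data vector by W then gives one estimated amplitude per source location.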
To obtain an fMRI prior, i.e., an fMRI-weighted source covariance matrix, each vertex point in the cortical surface was assigned an fMRI significance value using FreeSurfer-FSFAST 5.1. Individual functional volumes were motion corrected, unwarped, coregistered with each subject’s structural MRI, intensity normalized, resampled into cortical surface space, smoothed using a 2-dimensional Gaussian kernel with an FWHM of 5 mm, and entered into a general-linear model (GLM) with the task conditions as explanatory variables. The fMRI weighting was set to 90%. That is, diagonal elements in R corresponding to vertices with below-threshold (P < 0.05, all conditions vs. baseline) significance values were multiplied by 0.1. The fMRI prior was defined for each subject as the union of the group average and their individual contrast.
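The 90% fMRI weighting amounts to scaling the source variances of vertices outside the fMRI prior by 0.1. A minimal sketch follows, assuming the group and individual contrasts have already been thresholded into boolean masks; the function name `apply_fmri_prior` and the example values are hypothetical.

```python
import numpy as np


def apply_fmri_prior(source_var, group_sig, indiv_sig, fmri_weight=0.9):
    """Scale the diagonal of R: vertices outside the fMRI prior get (1 - fmri_weight).

    source_var: diagonal elements of the source covariance R
    group_sig, indiv_sig: boolean masks of vertices significant in the
        group-average and individual contrasts, respectively
    """
    source_var = np.asarray(source_var, dtype=float)
    # The prior is the union of the group-level and individual-level maps.
    prior = np.asarray(group_sig, dtype=bool) | np.asarray(indiv_sig, dtype=bool)
    return np.where(prior, source_var, source_var * (1.0 - fmri_weight))


# Four hypothetical vertices: the first two fall inside the prior.
weighted = apply_fmri_prior(
    [1.0, 1.0, 2.0, 4.0],
    group_sig=[True, False, False, False],
    indiv_sig=[False, True, False, False],
)
```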
One-second MEG/EEG raw data epochs, ranging from 400 ms before to 600 ms after the onset of the filler sounds, were multiplied by the inverse operator W and noise normalized to yield the estimated source activity as a function of time (Lin et al., 2006). The trial-specific source estimates were averaged within a priori regions-of-interest (ROI), including one AC and one DLPFC ROI per hemisphere that were determined based on the group results of our recent publication (Huang et al., 2013) (Figs. 2, 4). The AC ROIs included locations showing significant fMRI response suppression due to auditory WM load or interference in superior temporal locations, as determined based on Desikan et al. (2006). The DLPFC ROIs, which were used only for the connectivity analyses, were determined based on the significant (positive) main effect of WM load in Huang et al. (2013). In ROI-to-ROI connectivity analyses, the waveform signs of sources were aligned based on the surface-normal orientations to avoid phase cancellations. Pre-stimulus oscillatory power was estimated in a 400-ms window preceding the filler sounds between and within the Q–A memory periods, during low and high-load conditions, using a fast Fourier transform (FFT), implemented in MNE-Python (Gramfort et al., 2013). In pre-stimulus power analyses, our hypotheses considered alpha-band (7–15 Hz) oscillations, whose role in WM and distracter suppression is supported by several studies in other sensory domains. Analyses at the beta (15–30 Hz) and gamma (30–100 Hz) bands are, however, also cursorily reported.
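The pre-stimulus power estimate can be sketched with plain NumPy rather than the MNE-Python routines used in the study. The function name `prestim_band_power`, the Hann taper, the mean removal, and the synthetic test signals are assumptions made for illustration. At 300 samples/s, a 400-ms window contains 120 samples, which gives a 2.5-Hz frequency resolution.

```python
import numpy as np


def prestim_band_power(epochs, sfreq, t_min, band=(7.0, 15.0)):
    """Mean FFT power within `band` in the pre-stimulus part of each epoch.

    epochs: array (n_trials, n_times); the first sample is at t_min seconds
            (negative), and stimulus onset is at 0 s.
    """
    n_pre = int(round(-t_min * sfreq))           # samples before stimulus onset
    pre = epochs[:, :n_pre]
    pre = pre - pre.mean(axis=1, keepdims=True)  # remove the DC offset
    taper = np.hanning(n_pre)                    # assumed taper, reduces leakage
    spectrum = np.fft.rfft(pre * taper, axis=1)
    freqs = np.fft.rfftfreq(n_pre, d=1.0 / sfreq)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return (np.abs(spectrum[:, in_band]) ** 2).mean(axis=1)


# Synthetic epochs from -400 to 600 ms: a 10-Hz and a 40-Hz oscillation.
sfreq = 300.0
times = np.arange(300) / sfreq - 0.4
epochs = np.vstack([np.sin(2 * np.pi * 10 * times),
                    np.sin(2 * np.pi * 40 * times)])
alpha_power = prestim_band_power(epochs, sfreq, t_min=-0.4)
```

In this toy example, the 10-Hz trial yields much larger alpha-band power than the 40-Hz trial.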
To compute dynamic functional connectivity estimates between the DLPFC and AC ROIs, 1.6-s trial epochs (600-ms pre-stimulus baseline) were convolved with a Morlet wavelet (width 5 cycles, 5–30 Hz, 1-Hz intervals). Phase locking between the four ROIs was then calculated at each time point using the pairwise phase consistency (PPC) measure (Vinck et al., 2010), which is not biased by the number of available trials.
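PPC is the average cosine of the difference between the relative phases of all trial pairs. It can be computed without an explicit loop over pairs, because the squared length of the summed unit phasors equals N plus twice the sum of the pairwise cosines. The sketch below assumes that the phases of the two ROI signals have already been extracted (e.g., from the wavelet coefficients); the function name `pairwise_phase_consistency` and the simulated phases are hypothetical.

```python
import numpy as np


def pairwise_phase_consistency(phase_a, phase_b):
    """PPC across trials between two phase series.

    phase_a, phase_b: arrays (n_trials, ...) of phases in radians.
    Returns an array with the trial axis removed.
    """
    rel = np.exp(1j * (np.asarray(phase_a) - np.asarray(phase_b)))
    n = rel.shape[0]
    # |sum_j exp(i*theta_j)|^2 = n + 2 * sum_{j<k} cos(theta_j - theta_k)
    resultant_sq = np.abs(rel.sum(axis=0)) ** 2
    return (resultant_sq - n) / (n * (n - 1))


# Simulated phases: 50 trials x 100 time points.
rng = np.random.default_rng(0)
phase_ac = rng.uniform(-np.pi, np.pi, size=(50, 100))
locked = pairwise_phase_consistency(phase_ac, phase_ac - 0.7)  # constant lag
unlocked = pairwise_phase_consistency(
    phase_ac, rng.uniform(-np.pi, np.pi, size=(50, 100)))      # no coupling
```

A constant phase lag gives a PPC of 1, whereas independent phases give values scattered around 0 regardless of the number of trials.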
Trial-based changes in AC-ROI activation patterns, pooled to bins of three raw-data time points, and their correlations with band-normalized pre-stimulus oscillatory power were analyzed using generalized linear mixed effects models (GLME, Matlab 9.0). These analyses modeled individual epochs in each task condition/subject, and brought longitudinal temporal dependencies into the statistical model, which allowed controlling for autocorrelations and other biases in aggregated estimates (Baayen et al., 2008; Baayen and Milin, 2010). The dependent variable, the amplitude of ROI-average responses, was modeled separately for each time bin, using a restricted maximum pseudo-likelihood GLME with fixed effects for the pre-stimulus power, memory load (low vs. high), level of interference, and the trial number, and random effects for the intercept grouped by the subject identity. The GLME also controlled for all possible interactions between the pre-stimulus power, memory load, and interference. We controlled for potential biases caused by non-normality and inhomogeneity of variance by examining the model residuals against fitted values. Trial-to-trial autocorrelations were examined by determining the autocorrelation function for the residuals. Statistical significance was determined by comparing the summed t statistics of continuous positive or negative clusters of at least three consecutive time points (cluster-forming threshold p<0.05) derived from the initial GLME to a surrogate distribution calculated based on 5000 within-subject permutations. Comparisons of ROI-to-ROI time-frequency representations (TFR) of PPC were conducted using the Fieldtrip cluster-based randomization test (cluster-forming threshold p<0.05) (Maris and Oostenveld, 2007). As for the active Q–A periods, only periods preceding a correctly identified target were considered.
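The cluster-forming step and the comparison against a surrogate distribution can be sketched as follows. This is not the original Matlab implementation: the function names, the critical t value, and the example numbers are hypothetical, and the surrogate t-value time courses are assumed to come from refitting the model to permuted data (simulated here with random numbers). The sketch uses the largest absolute cluster sum of each surrogate as the null statistic, in the spirit of Maris and Oostenveld (2007).

```python
import numpy as np


def cluster_masses(t_values, t_crit, min_len=3):
    """Find runs of same-sign supra-threshold t values; return (start, stop, sum_t)."""
    t_values = np.asarray(t_values, dtype=float)
    signs = np.where(np.abs(t_values) > t_crit, np.sign(t_values), 0.0)
    clusters, start = [], None
    for i in range(len(signs) + 1):
        current = signs[i] if i < len(signs) else 0.0
        if start is not None and current != signs[start]:
            if i - start >= min_len:  # keep clusters of at least min_len points
                clusters.append((start, i, float(t_values[start:i].sum())))
            start = None
        if start is None and current != 0.0:
            start = i
    return clusters


def cluster_p_value(observed_mass, surrogate_t, t_crit, min_len=3):
    """Share of surrogates whose largest |cluster sum| reaches the observed one."""
    null = np.array([
        max((abs(m) for _, _, m in cluster_masses(t, t_crit, min_len)), default=0.0)
        for t in surrogate_t
    ])
    return (np.sum(null >= abs(observed_mass)) + 1) / (len(null) + 1)


# Hypothetical t-value time course with one positive cluster at bins 1-3.
t_obs = [0.5, 2.5, 3.0, 2.2, 0.1, -2.4, -2.6, 1.0, 2.1, 2.3]
found = cluster_masses(t_obs, t_crit=2.0)
rng = np.random.default_rng(0)
p = cluster_p_value(found[0][2], rng.standard_normal((200, 10)), t_crit=2.0)
```

Runs shorter than three bins, such as the negative run at bins 5–6 in the example, are discarded before the cluster sums are compared with the surrogate distribution.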
Results
The behavioral results validated our assumptions of the WM task manipulations: The hit rate decreased significantly as a function of increasing WM load (Friedman ANOVA, χ2=16.41, p<0.001) and attentional interference (Friedman ANOVA χ2=13.99, p<0.001). The median ± standard error of median hit rates was 94% ± 1% during the vigilance condition, 88% ± 2% under low WM load, 81% ± 4% under high WM load, 83% ± 5% during blocks with low WM load and interference, and 72% ± 6% during blocks with high WM load and interference. The median ± standard error of median reaction times was 478 ± 28 ms during the vigilance condition, 478 ± 37 ms under low WM load, 472 ± 46 ms under high WM load, 483 ± 12 ms under low WM load and interference, and 467 ± 24 ms under high WM load and interference.
Figs. 2a–c show the results of a GLME that analyzed the effects of WM load on auditory activations to filler sounds during the active Q–A memory periods and those during passive non-maintenance periods (Fig. 2, “other fillers”). The results suggest that high vs. low memory load decreased negative MEG/EEG dipolar currents in the left AC to filler sounds that occurred during the active Q–A memory period. The effect was statistically significant 250–300 ms after stimulus onset (cluster-based GLME randomization test, p<0.01; Fig. 2c). Importantly, no similar effects emerged during the periods between the active Q–A periods. GLME model diagnostics at the latency showing the largest initial t-value are shown in Fig. 2c, which demonstrates that the residuals had no relationship with the fitted values, that the residuals were roughly normally distributed, and that there were no significant autocorrelations of the residuals. Individual-level estimates of the effects of WM load on left AC activations to filler sounds are shown in Inline Supplementary Material (Suppl. Fig. 1).
To examine the hypothesis that dealing with irrelevant sounds reflects “active suppression” through a “late selection” mechanism, we conducted an interregional functional connectivity analysis between the AC ROIs and two frontal ROIs (the left and right DLPFC), which had been determined, analogously to the AC ROIs, based on a significant fMRI group effect of WM load in our previous publication (Huang et al., 2013). The main result (Fig. 2d) of this analysis suggests that WM load significantly increases 10–14 Hz alpha PPC between the left DLPFC and left AC at around 200–400 ms after the filler-sound onset (p<0.01, cluster-based randomization test). This effect, which was significant only during the active Q–A periods, was thus highly consistent with the timing of response suppression in the left AC (Fig. 2c). Individual-level data of left DLPFC–AC connectivity differences during high vs. low WM load are shown in Inline Supplementary Material (Suppl. Fig. 2). No significant WM-related PPC effects were observed between the right DLPFC and either AC, or between the left DLPFC and right AC.
The load of auditory WM is hard to manipulate without changing the stimulus sequence, including the relative numbers of events across the conditions. To control for the resulting inherent differences in the low vs. high-load sound sequences, we conducted two alternative GLME analyses to verify the load-related suppression in left AC during the active Q–A periods (Fig. 3). The first control GLME considered only the first filler events after each Q. The second consisted of the main model with an additional factor controlling for the type of the event that immediately preceded each filler sound (see Fig. 1). The results of these control analyses were highly similar to the main analysis in Fig. 2c, which supports the conclusion that the result reflects a true load effect instead of a sequential bias.
We then examined hypotheses regarding the role of AC alpha activations in suppression of irrelevant sounds during WM processing. The results of GLME models that tested the correlation between pre-stimulus oscillatory power and post-stimulus activation time courses are shown in Fig. 4. Similar to the load-related analyses (Figs. 2–3), significant correlations between response amplitudes and pre-stimulus alpha power were evidenced only during the active Q–A periods. High pre-stimulus alpha correlated with suppressed surface negative responses to filler sounds in the left AC 190–220 ms after the filler sound onset (cluster-based GLME randomization test, p<0.05). In the right AC, an analogous significant suppression of surface negativity was observed at 100–160 ms (cluster-based GLME randomization test, p<0.05). GLME model diagnostics during the latencies showing the largest cluster-forming t-values (Fig. 4c) demonstrate that the residuals had no relationship with the fitted values, that they were roughly normally distributed, and that there were no significant autocorrelations. Individual-level estimates of the effects of pre-stimulus alpha on AC activations to filler sounds are shown in Inline Supplementary Material (Suppl. Fig. 3).
Finally, we also analyzed correlations between pre-stimulus oscillations at the beta and gamma ranges and post-stimulus filler responses. No significant correlations were observed in these analyses.
Discussion
Our GLME analyses suggest that increased contextual maintenance load during an auditory WM task suppresses left AC activations at 250–300 ms after irrelevant phonetic “filler” stimuli. Interestingly, this effect coincided with increased alpha-range functional connectivity, quantified as 10–14 Hz PPC, between the left DLPFC and left AC ROIs. Although the load-related response suppression to irrelevant sounds, per se, would be consistent with a “filtering” hypothesis, the late time window of these suppression effects and the increasing DLPFC–AC functional connectivity are more consistent with a late selection or “active suppression” hypothesis. Finally, our results also demonstrate a trial-to-trial correlation between pre-stimulus alpha oscillations and post-stimulus suppression of AC responses to irrelevant filler sounds.
The way auditory WM demands affect auditory processing has remained unclear. Certain recent studies that employed intermodal distraction paradigms suggest that high visual WM load reduces auditory distractibility (Halin et al., 2015; SanMiguel et al., 2008). This could be speculated to mean that under high WM load, auditory distracters become filtered already at an early processing stage (i.e., “early selection”). However, evidence from ERP (Berti and Schroger, 2003; Muller-Gass and Schroger, 2007) and behavioral (Dalton et al., 2009) studies that utilized purely auditory paradigms supports an alternative view that distracter suppression occurs at a later stage and requires active control (Berti and Schroger, 2003; Dalton et al., 2009). At the same time, recent behavioral studies also suggest that during auditory WM, irrelevant speech sounds are suppressed only after initial feature processing (Ellermeier et al., 2015; Wöstmann and Obleser, 2016). The present results seem to be more consistent with the latter group of ERP and behavioral studies: The aspect of left AC time course that was statistically significantly modulated by the contextual WM maintenance load was relatively late, clearly after the P50/N1/P2 pattern that is thought to reflect the emergence of conscious sound percept and the initiation of stimulus encoding in ACs (Parasuraman and Beatty, 1980). The latency of this effect would be more consistent with the “active suppression” hypothesis.
We also found evidence of load-related increases of alpha-band PPC between the left DLPFC and AC, which coincided with the suppression of AC responses to irrelevant sounds during the Q–A periods. Previous visual fMRI studies suggest that, in contrast to the more ventral aspects of PFC that deal with proactive interference (Jonides and Nee, 2006), DLPFC might support active WM maintenance by gating the incoming irrelevant stimuli during the retention delay period (Postle, 2005; Sakai et al., 2002). Notably, auditory ERP studies on neurological patients suggest that DLPFC already gates the earliest (10–50 ms) inputs to primary AC (Knight et al., 1989), providing a potential mechanism of “early selection”. However, the present load-related suppression and connectivity effects occurred at a relatively late stage of AC processing. This could mean that the DLPFC gating systems are able to regulate different stages of information processing depending on the task goals. That is, here, the subjects were given an instruction that required monitoring of the sound stream also during the Q–A periods. This involved not only the easier task of counting the fillers, but also being prepared for occasional intervening Q or A events. (Note that the task instruction was similar before interference and non-interference task blocks.) Much like in our introductory conversation example, it would not have been beneficial for the subject to completely halt auditory processing during the maintenance periods. We thus speculate that the increased alpha-range connectivity between the left DLPFC and AC is related to top-down modulation of post-perceptual processing, that is, after the subject becomes aware of the phonetic event onset (noting the 400-ms sound duration). The time window of this modulation is consistent with the period during which sensory representations have been suggested to be consolidated to more lasting WM traces (Chun and Potter, 1995; Jolicoeur and Dell’Acqua, 1998).
A working hypothesis for future studies is that during auditory WM tasks, in which the competing events are embedded within the same sound stream, distracters are suppressed through a mechanism that allows initial evaluation but modulates consolidation of representations that would cause mnemonic interference at later WM processing stages.
The interpretation that the DLPFC–AC connectivity is related to AC suppression is consistent with previous theories that suggest a primarily inhibitory role for alpha-band connectivity patterns. For example, a recent auditory study suggested that alpha-range connectivity increases between frontal areas and the AC ipsilateral to the ear that the subject anticipates to be stimulated by a relevant event (Müller and Weisz, 2012). However, because the present oscillatory phase locking analyses do not provide information about the directionality of information flow, future studies using techniques such as transcranial magnetic stimulation (TMS) are needed to causally verify these interpretations.
It is important to note that in previous MEG and EEG studies of auditory selective attention, evidence for “early selection” has been obtained by using, for example, dichotic listening paradigms, in which the irrelevant stream contains little or no useful information (Woldorff et al., 1993). Because the present task contained relevant information also during the Q–A maintenance periods, our results do not fully exclude the possibility that “early selection” mechanisms could also be utilized during auditory WM. Another potential source of bias is that the number of filler sounds and their specific serial positions during the Q–A periods were slightly different between the high and low load conditions. This concern is mitigated by the robustness of GLME approaches in analyses with unequal numbers of observations (Baayen, 2008), as well as the fact that our control GLME models, which also considered the type of event that preceded each filler sound, made the overall results even clearer than the initial analysis. Another level of confirmation is offered by the aforementioned fact that the load-related suppression was also evident in analyses considering only the first fillers after each cue stimulus.
The present task consisted of 90 s blocks, during which the subject was prepared to maintain the task set for either shorter or longer series of events. At each Q stimulus in the high-load blocks, the subject could be assumed to have “pre-allocated” more memory resources for task-set and sound maintenance than in the low-load blocks. Such an anticipatory strategy would be sensible in this task, because the instruction was identical for high-load blocks with or without interference. A pre-allocation strategy could help explain why the difference between the high and low load conditions was also evident in analyses limited to the first filler sounds following the cues in low and high-load Q–A periods.
Our results also suggest that suppression of AC responses to irrelevant sounds during WM load is predicted by increased pre-stimulus alpha power. This negative correlation between pre-stimulus alpha power and sound-evoked responses is consistent with previous observations in the visual domain (Brandt and Jansen, 1991). Similar to the load-related effects, the present correlation effects were selective to task-irrelevant sounds that occurred during cue–target maintenance periods. One might thus speculate that AC alpha oscillations are related to suppression of irrelevant inputs. This speculation receives support from a recent MEG finding that alpha power is increased in ACs ipsilateral to the ear receiving task-irrelevant stimulation during dichotic listening (Wöstmann et al., 2016).
A potential problem in the present alpha-correlation analysis, however, is that the event-related AC response that follows sound onsets also contains considerable energy in the alpha range. There is therefore a risk that fluctuations in “background” alpha that relate to neither anticipation nor active AC suppression produce false positive autocorrelations. Here, we attempted to control for these biases by using GLME approaches that also considered the temporal dependencies, including the exact timing of each trial and the type of preceding events. We believe this helped us regress out effects related to background oscillatory fluctuations, such as alpha-power fluctuations due to reduced alertness during prolonged experiments. With these controls in place, our GLME did, indeed, provide evidence of suppression of AC activities to irrelevant “filler” sounds during auditory WM processing. These effects were, however, earlier than the effects related to WM load modulations, specifically in the right AC (100–140 ms).
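The logic of regressing out slow background fluctuations can be sketched with a deliberately simplified simulation. The example below uses ordinary least squares for a single simulated subject rather than a GLME with random effects, and every quantity in it (trial count, drift size, noise levels, the assumed true slope, the helper name `ols_coefficients`) is hypothetical. A slow drift over the session raises pre-stimulus alpha power and lowers evoked responses at the same time, which inflates the apparent alpha-response relationship unless trial timing is included as a covariate.

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials = 2000
trial_time = np.linspace(0.0, 1.0, n_trials)  # normalized position within the session

true_slope = -0.3  # assumed genuine effect of pre-stimulus alpha on the evoked response

# Slow drift (e.g., declining alertness) increases alpha power over the session ...
alpha_power = 1.0 * trial_time + rng.normal(0.0, 0.2, n_trials)
# ... and independently reduces the evoked response, on top of the genuine effect.
evoked = true_slope * alpha_power - 1.0 * trial_time + rng.normal(0.0, 0.2, n_trials)


def ols_coefficients(predictors, outcome):
    """Least-squares fit with an intercept; returns [intercept, slope_1, ...]."""
    design = np.column_stack([np.ones(len(outcome))] + list(predictors))
    coefficients, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coefficients


# Model without the timing covariate: the drift is absorbed into the alpha slope.
naive_slope = ols_coefficients([alpha_power], evoked)[1]
# Model with trial timing as a nuisance covariate.
adjusted_slope = ols_coefficients([alpha_power, trial_time], evoked)[1]

print("naive slope:", naive_slope)
print("adjusted slope:", adjusted_slope)
```

In this toy setting the unadjusted slope is considerably more negative than the simulated true effect, whereas the model that includes trial timing recovers a value close to it. A mixed-effects model would additionally account for between-subject variability, which this sketch omits.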
In conclusion, our GLME models suggest that activities elicited by irrelevant sound events are modulated at a relatively late processing stage in the left AC under high contextual maintenance load of auditory WM. The load-related AC modulations coincided with increased phase synchronization of alpha oscillations between the left DLPFC and ACs. These effects could reflect a control mechanism that helps suppress short-term consolidation of irrelevant information, without affecting the perceptual encoding that is needed to determine whether a particular sound input is relevant or not. Our results also support the view that pre-stimulus alpha oscillations correlate with suppression of sound-evoked responses in ACs. Our results might inform clinical investigation of disorders involving auditory WM dysfunctions, such as schizophrenia (Seidman et al., 2016).
Supplementary Material
Highlights.
We studied distracter suppression under low/high auditory working memory (WM) load
High WM load reduced auditory cortex (AC) activations 250–300 ms after distracters
This involved increased prefrontal–AC alpha-band connectivity and AC alpha activity
Under WM load, irrelevant sounds are suppressed through a “late”, active process
This allows initial screening but prevents consolidation of irrelevant information
Acknowledgments
We thank Nao Suzuki and Stephanie Rossi. This work was supported by the National Institutes of Health (NIH) Awards R01MH083744, R21DC014134, R01HD040712, R56NS037462, 5R01EB009048, and by the Massachusetts Department of Mental Health Commonwealth Research Center SCDMH82101008006 (LJS). The research environment was supported by the NIH awards P41EB015896, S10RR014978, S10RR021110, S10RR019307, S10RR014798, and S10RR023401.
References
- Ahveninen J, Huang S, Belliveau JW, Chang WT, Hämäläinen M. Dynamic oscillatory processes governing cued orienting and allocation of auditory attention. J Cogn Neurosci. 2013;25:1926–1943. doi: 10.1162/jocn_a_00452.
- Baayen RH. Analyzing Linguistic Data: A Practical Introduction to Statistics Using R. Cambridge University Press; Cambridge: 2008.
- Baayen RH, Davidson DJ, Bates DM. Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language. 2008;59:390–412.
- Baayen RH, Milin P. Analyzing reaction times. Int J Psychol Res. 2010:12–28.
- Berti S, Schroger E. Working memory controls involuntary attention switching: evidence from an auditory distraction paradigm. Eur J Neurosci. 2003;17:1119–1122. doi: 10.1046/j.1460-9568.2003.02527.x.
- Bonnefond M, Jensen O. Alpha oscillations serve to protect working memory maintenance against anticipated distracters. Curr Biol. 2012;22:1969–1974. doi: 10.1016/j.cub.2012.08.029.
- Brandt ME, Jansen BH. The relationship between prestimulus-alpha amplitude and visual evoked potential amplitude. Int J Neurosci. 1991;61:261–268. doi: 10.3109/00207459108990744.
- Broadbent DE. Perception and Communication. Pergamon Press; London: 1958.
- Chun MM, Potter MC. A two-stage model for multiple target detection in rapid serial visual presentation. J Exp Psychol Hum Percept Perform. 1995;21:109–127. doi: 10.1037//0096-1523.21.1.109.
- Cooper NR, Croft RJ, Dominey SJ, Burgess AP, Gruzelier JH. Paradox lost? Exploring the role of alpha oscillations during externally vs. internally directed attention and the implications for idling and inhibition hypotheses. Int J Psychophysiol. 2003;47:65–74. doi: 10.1016/s0167-8760(02)00107-1.
- Dalton P, Santangelo V, Spence C. The role of working memory in auditory selective attention. The Quarterly Journal of Experimental Psychology. 2009;62:2126–2132. doi: 10.1080/17470210903023646.
- Desikan R, Segonne F, Fischl B, Quinn B, Dickerson B, Blacker D, Buckner R, Dale A, Maguire R, Hyman B, Albert M, Killiany R. An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. Neuroimage. 2006;31:968–980. doi: 10.1016/j.neuroimage.2006.01.021.
- Deutsch JA, Deutsch D. Some theoretical considerations. Psychol Rev. 1963;70:80–90. doi: 10.1037/h0039515.
- Ellermeier W, Kattner F, Ueda K, Doumoto K, Nakajima Y. Memory disruption by irrelevant noise-vocoded speech: Effects of native language and the number of frequency bands. J Acoust Soc Am. 2015;138:1561–1569. doi: 10.1121/1.4928954.
- Foxe JJ, Simpson GV, Ahlfors SP. Parieto-occipital approximately 10 Hz activity reflects anticipatory state of visual attention mechanisms. Neuroreport. 1998;9:3929–3933. doi: 10.1097/00001756-199812010-00030.
- Gramfort A, Luessi M, Larson E, Engemann DA, Strohmeier D, Brodbeck C, Goj R, Jas M, Brooks T, Parkkonen L, Hämäläinen M. MEG and EEG data analysis with MNE-Python. Front Neurosci. 2013;7:267. doi: 10.3389/fnins.2013.00267.
- Halin N, Marsh JE, Sorqvist P. Central load reduces peripheral processing: Evidence from incidental memory of background speech. Scand J Psychol. 2015;56:607–612. doi: 10.1111/sjop.12246.
- Hämäläinen M, Hari R, Ilmoniemi R, Knuutila J, Lounasmaa O. Magnetoencephalography-theory, instrumentation, and applications to noninvasive studies of the working human brain. Rev Mod Phys. 1993;65:413–497.
- Hämäläinen MS, Sarvas J. Realistic conductivity geometry model of the human head for interpretation of neuromagnetic data. IEEE Trans Biomed Eng. 1989;36:165–171. doi: 10.1109/10.16463.
- Hillyard S, Hink R, Schwent V, Picton T. Electrical signs of selective attention in the human brain. Science. 1973;182:177–180. doi: 10.1126/science.182.4108.177.
- Huang S, Chang WT, Belliveau JW, Hämäläinen M, Ahveninen J. Lateralized parietotemporal oscillatory phase synchronization during auditory selective attention. Neuroimage. 2014;86:461–469. doi: 10.1016/j.neuroimage.2013.10.043.
- Huang S, Seidman LJ, Rossi S, Ahveninen J. Distinct cortical networks activated by auditory attention and working memory load. Neuroimage. 2013;83:1098–1108. doi: 10.1016/j.neuroimage.2013.07.074.
- Jensen O, Kaiser J, Lachaux JP. Human gamma-frequency oscillations associated with attention and memory. Trends Neurosci. 2007;30:317–324. doi: 10.1016/j.tins.2007.05.001.
- Jensen O, Mazaheri A. Shaping functional architecture by oscillatory alpha activity: gating by inhibition. Front Hum Neurosci. 2010;4:186. doi: 10.3389/fnhum.2010.00186.
- Jolicoeur P, Dell’Acqua R. The demonstration of short-term consolidation. Cogn Psychol. 1998;36:138–202. doi: 10.1006/cogp.1998.0684.
- Jonides J, Nee DE. Brain mechanisms of proactive interference in working memory. Neuroscience. 2006;139:181–193. doi: 10.1016/j.neuroscience.2005.06.042.
- Klimesch W, Sauseng P, Hanslmayr S. EEG alpha oscillations: the inhibition-timing hypothesis. Brain Res Rev. 2007;53:63–88. doi: 10.1016/j.brainresrev.2006.06.003.
- Knight RT, Scabini D, Woods DL. Prefrontal cortex gating of auditory transmission in humans. Brain Research. 1989;504:338–342. doi: 10.1016/0006-8993(89)91381-4.
- Lavie N. Distracted and confused?: selective attention under load. Trends Cogn Sci. 2005;9:75–82. doi: 10.1016/j.tics.2004.12.004.
- Leiberg S, Lutzenberger W, Kaiser J. Effects of memory load on cortical oscillatory activity during auditory pattern working memory. Brain Res. 2006;1120:131–140. doi: 10.1016/j.brainres.2006.08.066.
- Lin FH, Belliveau JW, Dale AM, Hämäläinen MS. Distributed current estimates using cortical orientation constraints. Hum Brain Mapp. 2006;27:1–13. doi: 10.1002/hbm.20155.
- Maris E, Oostenveld R. Nonparametric statistical testing of EEG-and MEG-data. J Neurosci Methods. 2007;164:177–190. doi: 10.1016/j.jneumeth.2007.03.024.
- Müller N, Weisz N. Lateralized auditory cortical alpha band activity and interregional connectivity pattern reflect anticipation of target sounds. Cereb Cortex. 2012;22:1604–1613. doi: 10.1093/cercor/bhr232.
- Muller-Gass A, Schroger E. Perceptual and cognitive task difficulty has differential effects on auditory distraction. Brain Res. 2007;1136:169–177. doi: 10.1016/j.brainres.2006.12.020.
- Näätänen R. Attention and Brain Function. Lawrence Erlbaum; Hillsdale: 1992.
- Näätänen R, Tervaniemi M, Sussman E, Paavilainen P, Winkler I. “Primitive intelligence” in the auditory cortex. Trends Neurosci. 2001;24:283–288. doi: 10.1016/s0166-2236(00)01790-2.
- Obleser J, Wostmann M, Hellbernd N, Wilsch A, Maess B. Adverse listening conditions and memory load drive a common alpha oscillatory network. J Neurosci. 2012;32:12376–12383. doi: 10.1523/JNEUROSCI.4908-11.2012.
- Palva S, Palva JM. New vistas for alpha-frequency band oscillations. Trends Neurosci. 2007;30:150–158. doi: 10.1016/j.tins.2007.02.001.
- Parasuraman R, Beatty J. Brain events underlying detection and recognition of weak sensory signals. Science. 1980;210:80–83. doi: 10.1126/science.7414324.
- Pfurtscheller G. Induced oscillations in the alpha band: functional meaning. Epilepsia. 2003;44(Suppl 12):2–8. doi: 10.1111/j.0013-9580.2003.12001.x.
- Postle BR. Delay-period activity in the prefrontal cortex: one function is sensory gating. J Cogn Neurosci. 2005;17:1679–1690. doi: 10.1162/089892905774589208.
- Pulvermuller F, Kujala T, Shtyrov Y, Simola J, Tiitinen H, Alku P, Alho K, Martinkauppi S, Ilmoniemi RJ, Näätänen R. Memory traces for words as revealed by the mismatch negativity. Neuroimage. 2001;14:607–616. doi: 10.1006/nimg.2001.0864.
- Sakai K, Rowe JB, Passingham RE. Active maintenance in prefrontal area 46 creates distractor-resistant memory. Nat Neurosci. 2002;5:479–484. doi: 10.1038/nn846.
- SanMiguel I, Corral MJ, Escera C. When loading working memory reduces distraction: behavioral and electrophysiological evidence from an auditory-visual distraction paradigm. J Cogn Neurosci. 2008;20:1131–1145. doi: 10.1162/jocn.2008.20078.
- Seidman LJ, Breiter HC, Goodman JM, Goldstein JM, Woodruff PW, O’Craven K, Savoy R, Tsuang MT, Rosen BR. A functional magnetic resonance imaging study of auditory vigilance with low and high information processing demands. Neuropsychology. 1998;12:505–518. doi: 10.1037//0894-4105.12.4.505.
- Seidman LJ, Meyer EC, Giuliano AJ, Breiter HC, Goldstein JM, Kremen WS, Thermenos HW, Toomey R, Stone WS, Tsuang MT, Faraone SV. Auditory working memory impairments in individuals at familial high risk for schizophrenia. Neuropsychology. 2012;26:288–303. doi: 10.1037/a0027970.
- Seidman LJ, Shapiro DI, Stone WS, Woodberry KA, Ronzio A, Cornblatt BA, Addington J, Bearden CE, Cadenhead KS, Cannon TD, Mathalon DH, McGlashan TH, Perkins DO, Tsuang MT, Walker EF, Woods SW. Association of Neurocognition With Transition to Psychosis: Baseline Functioning in the Second Phase of the North American Prodrome Longitudinal Study. JAMA Psychiatry. 2016;73:1239–1248. doi: 10.1001/jamapsychiatry.2016.2479.
- Taulu S, Simola J, Kajola M. Applications of the Signal Space Separation Method. IEEE Trans Signal Proc. 2005;53:3359–3372.
- Vinck M, van Wingerden M, Womelsdorf T, Fries P, Pennartz CM. The pairwise phase consistency: a bias-free measure of rhythmic neuronal synchronization. Neuroimage. 2010;51:112–122. doi: 10.1016/j.neuroimage.2010.01.073.
- Wilsch A, Obleser J. What works in auditory working memory? A neural oscillations perspective. Brain Res. 2015 doi: 10.1016/j.brainres.2015.10.054.
- Woldorff MG, Gallen CC, Hampson SA, Hillyard SA, Pantev C, Sobel D, Bloom FE. Modulation of early sensory processing in human auditory cortex during auditory selective attention. Proceedings of the National Academy of Sciences of the United States of America. 1993;90:8722–8726. doi: 10.1073/pnas.90.18.8722.
- Woldorff MG, Hackley SA, Hillyard SA. The effects of channel-selective attention on the mismatch negativity wave elicited by deviant tones. Psychophysiology. 1991;28:30–42. doi: 10.1111/j.1469-8986.1991.tb03384.x.
- Woldorff MG, Hillyard SA, Gallen CC, Hampson SR, Bloom FE. Magnetoencephalographic recordings demonstrate attentional modulation of mismatch-related neural activity in human auditory cortex. Psychophysiology. 1998;35:283–292. doi: 10.1017/s0048577298961601.
- Worden MS, Foxe JJ, Wang N, Simpson GV. Anticipatory biasing of visuospatial attention indexed by retinotopically specific alpha-band electroencephalography increases over occipital cortex. J Neurosci. 2000;20:RC63. doi: 10.1523/JNEUROSCI.20-06-j0002.2000.
- Wöstmann M, Herrmann B, Maess B, Obleser J. Spatiotemporal dynamics of auditory attention synchronize with speech. Proc Natl Acad Sci U S A. 2016;113:3873–3878. doi: 10.1073/pnas.1523357113.
- Wöstmann M, Obleser J. Acoustic Detail But Not Predictability of Task-Irrelevant Speech Disrupts Working Memory. Front Hum Neurosci. 2016;10:538. doi: 10.3389/fnhum.2016.00538.