Abstract
Attention is required during speech perception to focus processing resources on critical information. Previous research has shown that bilingualism modifies attentional processing in nonverbal domains. The current study used event-related potentials (ERPs) to determine whether bilingualism also modifies auditory attention during speech perception. We measured attention to word onsets in spoken English for monolinguals and Chinese-English bilinguals. Auditory probes were inserted at four times in a continuous narrative: concurrent with word onset, 100 ms before or after onset, and at random control times. Greater attention was indexed by an increase in the amplitude of the early negativity (N1). Among monolinguals, probes presented after word onsets elicited a larger N1 than control probes, replicating previous studies. For bilinguals, there was no N1 difference for probes at different times around word onsets, indicating less specificity in allocation of attention. These results suggest that bilingualism shapes attentional strategies during English speech comprehension.
Keywords: auditory attention, ERP, bilingual, speech perception, word onsets
1. Introduction
Individual differences in linguistic experience confer profound changes on language and cognitive processing. There is substantial evidence that mastering two languages shapes cognitive processing, as bilinguals show an advantage on both verbal and nonverbal executive control tasks, particularly those involving selective attention (Bialystok, Craik, & Luk, 2012). These advantages are believed to arise from the cognitive load incurred by managing two competing language systems, a situation that leads to functional and anatomical changes in executive control networks (Abutalebi & Green, 2008; Garbin et al., 2010). Convincing evidence has confirmed that the non-target language for bilinguals remains activated to some extent during language production tasks involving the other language (Jared & Kroll, 2001; Lee & Williams, 2001; van Heuven, Dijkstra, & Grainger, 1998), so a mechanism is required to enable bilinguals to select the target language. Various proposals for this mechanism include inhibition of the non-target language (Green, 1998) and increased attention to the target language (Costa, Santesteban, & Ivanova, 2006). Regardless of the details of this mechanism, most accounts of the bilingual advantages in executive control focus on the need to manage competing languages during language production. Much less consideration has been devoted to how bilingualism may shape attentional mechanisms involved in receptive language processing.
In order to master two spoken languages, individuals must be attuned to separate sets of phonetic, syntactic, semantic, and prosodic features. This attention to perceptual details is particularly important during early bilingual language acquisition when infants must categorize inconsistent input to learn the architecture of two language systems. Evidence from infant studies demonstrates that bilingual infants outperform monolinguals in a task requiring discrimination of newly learned speech structures (Kovács & Mehler, 2009). In addition, 8-month old bilingual infants show advantages in language discrimination based on visual information alone (Weikum et al., 2007), even among languages to which they have not been exposed (Sebastián-Gallés, Albareda-Castellot, Weikum, & Werker, 2012). In this study, infants raised in bilingual homes were able to detect when an adult shown in a silent video switched languages, even if neither of those languages was part of the infant’s home environment; infants raised in monolingual homes did not detect this switch. This effect demonstrates that bilingualism shapes attention to spoken language even without specific training, suggesting that bilinguals may simply be more attuned to the defining features of multiple languages. Bilingual language development may thus require enhanced perceptual attentiveness in order to track relevant information, categorize perceptual input, and establish two language systems.
There is some evidence that adult bilinguals with two well-elaborated language systems process speech differently from monolinguals. Bilingual adults outperform monolinguals on statistical learning tasks that require novel words to be segmented from continuous sounds (Bartolotti, Marian, Schroeder, & Shook, 2011; Wang & Saffran, 2014), a process that requires attention to spoken word forms. Moreover, data from the complex auditory brainstem response (cABR) demonstrate that compared to monolinguals, bilingual adolescents show more consistent brainstem and cortical responses to speech sounds (Krizman, Skoe, Marian, & Kraus, 2014) and enhancements in subcortical representations of fundamental frequency of speech sounds, particularly when presented in a noisy background containing multi-talker babble (Krizman, Marian, Shook, Skoe, & Kraus, 2012). These sensory improvements are positively correlated with measures of attentional control and language proficiency. Taken together, these results suggest that the complex auditory input faced by bilinguals improves both basic auditory processing and higher-level attention skills.
Differences in low-level auditory perception of speech sounds may profoundly shape the way that bilinguals listen to continuous speech. Although attentional patterns during speech perception remain unexplored among bilinguals, a growing body of event-related potential (ERP) evidence demonstrates that selective attention supports speech perception in monolingual English speakers by allowing them to focus on important moments in the speech stream, such as word onsets (Astheimer & Sanders, 2009, 2012). Attention to word onsets enhances early perceptual processing, as indexed by an increase in N1 amplitude in response to attention probes presented in conjunction with word onsets compared to other times in the speech stream. This N1 enhancement during speech perception resembles both classic ERP effects of auditory attention in the spatial domain (Hillyard, Hink, Schwent, & Picton, 1973; Hink & Hillyard, 1976) and more recent measures of auditory attention in the temporal domain (Lange, Rösler, & Röder, 2003; Sanders & Astheimer, 2008), suggesting that listeners use attention to select for critical information during speech perception. Word onsets are particularly important for word recognition (Marslen-Wilson & Zwitserlood, 1989), and based on transitional probabilities they are relatively unpredictable and therefore highly informative (Saffran, Newport, & Aslin, 1996). In English, a stress-timed language, word onsets are usually stressed and therefore particularly salient (Cutler & Norris, 1988), so that the prosodic patterns of English may enhance attention to word onsets via rhythmic entrainment (Barnes & Jones, 2000).
Although attending to word onsets may be a good listening strategy for monolingual English speakers, little is known about how bilinguals attend to English speech. The question is important because the research described above demonstrates that there is a significant impact of bilingualism on attentional control. Do these attentional differences apply to natural speech processing? Some preliminary evidence comes from two studies by Sanders and Neville (2003a, b) that compared ERP responses elicited by word-initial versus acoustically matched word-medial syllables. Among native English speakers, word-initial syllables elicited a larger N1 response than word-medial syllables (Sanders & Neville, 2003b), but native Japanese speakers who learned English after the age of twelve did not show the same “word onset negativity”, instead showing only a later N400 effect in response to word onsets (Sanders & Neville, 2003a). The authors suggest that because native Japanese speech segmentation relies on morae, sublexical units that often correspond to syllables, Japanese speakers may not enhance early perceptual processing of word onsets in a stress-timed language like English. However, this differential processing of word onsets between groups could also be attributed to the fact that the bilingual group learned English later in life. Although these results suggest there is less attention to word onsets during speech perception for late English learners, the precise attentional patterns of bilinguals listening to English have not been characterized.
To investigate this question, the current study employed an attention probe paradigm introduced by Astheimer and Sanders (2009). This paradigm is analogous to the classic Hillyard spatial attention paradigm (Hink & Hillyard, 1976) in which auditory probes index attention to speech presented at attended versus unattended locations by providing an abrupt acoustic onset that drives auditory evoked potentials. The paradigm relies on the notion that when attention is directed to speech at a particular location, responses to probes at the same location will be enhanced. In the temporally selective attention version, probes index attention to speech at different times relative to word onsets, including 100 ms before word onsets, concurrently with word onsets, 100 ms after word onsets, and at random control times, providing a more detailed characterization of attention over time during speech perception. These probes provide abrupt acoustic onsets within a continuous speech stream, so that increased attention to times within the speech stream can be observed in the auditory evoked potential.
To begin to understand how bilingualism shapes attention to speech, we compared attention to different points in continuous English speech among native English speakers and Chinese-English bilinguals. Chinese-English bilinguals were chosen because several characteristics of the Chinese language are likely to encourage a different attentional strategy than English. First, English and Chinese have different prosodic patterns. English is a stress-timed language that emphasizes word onsets prosodically (Cutler & Norris, 1988) and may thus entrain attention accordingly. Stress patterns in spoken Chinese show more variability, as the phonological foot is the basic rhythmic unit (Duanmu, 2007), and this prosodic pattern may entrain attention differently. Second, although monosyllabic words are common, disyllabic compounds are the predominant word type in modern conversational Chinese. In disyllabic compounds, information about word identity is equally distributed across both syllables (Zhou & Marslen-Wilson, 1994). Finally, Chinese is a tonal language in which changes in pitch over time convey critical information about syllable identity (Howie, 1976). All of these features should encourage a more evenly distributed allocation of attention than has been reported in English, with equal attention directed to all syllables rather than word onsets specifically. Among Chinese-English bilinguals, the different attentional demands of Chinese may influence attention during English speech perception, but the way Chinese-English bilinguals attend to English speech has not yet been explored.
In the current study, English monolinguals and Chinese-English bilinguals were matched on English proficiency in order to assess attention in highly proficient bilinguals. Because it would be unsurprising for less proficient second-language learners to use different attentional strategies than monolinguals, fluent bilinguals were chosen. This matching allows any differences in attentional patterns to be attributed more directly to bilingualism rather than differences in English proficiency. For bilinguals managing multiple language systems, it may be advantageous to adopt a more generalizable attention strategy rather than maintaining multiple language-specific strategies. We therefore hypothesized that Chinese-English bilinguals would show a different attentional pattern than monolingual English speakers, as indexed by early ERP responses including the N1. More specifically, we predicted that bilinguals would demonstrate a more diffuse attentional pattern rather than focusing specifically on word onsets. Assessing the attentional pattern of bilingual speakers will allow us to understand how linguistic experience shapes the way individuals listen to spoken language.
2. Method
2.1. Participants
Thirty-eight participants (23 female; age range 18–35 years, M = 20.9), recruited from introductory Psychology classes, provided data for the final analyses. All participants were right-handed, and reported no neurological, language, or learning disorders, or any psychoactive medication use. The monolingual group consisted of 19 native English speakers with no other language experience. The bilingual group consisted of 19 native Chinese speakers (11 Cantonese, five Mandarin, three undisclosed dialect) who learned English between the ages of one and 11 years (M = 5.8, SD = 3.4) while maintaining native-like proficiency in Chinese. Groups were matched on socioeconomic status measured on a five-point scale of maternal education (1 = no high school diploma, 5 = graduate or professional degree). Two additional participants (one monolingual, one bilingual) participated in the experiment but were excluded from the analyses due to excessive low-frequency drift in their EEG recordings caused by skin potentials. All participants provided written informed consent prior to participation, and were either paid $10/h for their time or given extra course credit for their participation.
2.2. Background Measures
The Peabody Picture Vocabulary Test (PPVT-III; Dunn & Dunn, 1997) was administered as a test of English receptive vocabulary. Participants choose which of four visually presented pictures best represents a word spoken by the experimenter. The items are graduated for difficulty; participants begin at an age-determined baseline and continue until they make at least eight errors in 12 consecutive responses. Standard scores were converted from raw scores based on participants’ ages; the test has a mean of 100 and a standard deviation of 15.
The Shipley Abstraction Test (Zachary, 1986) assesses nonverbal intelligence. The 20-item test contains patterns of letters and numbers, each of which ends with one to four blank spaces. Participants have four minutes to complete as many patterns as possible. The number of correct responses out of 20 was converted to a percentage correct for each participant.
2.3. ERP Task
2.3.1. Stimuli
The attention probe narrative paradigm was adapted from Astheimer and Sanders (2009). A one-hour recording of Edward Abbey’s reading of Freedom and Wilderness (1987) was divided at sentence boundaries into 385 10–20 s segments. Each segment was saved in the left channel of a stereo WAV file with a 44.1 kHz sampling rate. Linguistic attention probes were created by extracting a 50 ms excerpt from the story of the narrator producing the syllable “ba.” A total of 200 attention probes were added to the right channel of the sound files in each of four conditions: concurrent with a word onset, 100 ms before word onset, 100 ms after word onset, and at random control times that were not systematically associated with word onsets. Word onsets were defined as the earliest indication of a new phoneme based on visual inspection of sound waves and listening using a gating procedure. Only words for which the onset times, as determined by three independent coders, fell within a 16 ms range were assigned attention probes. To keep the rate of probe presentation consistent throughout the narrative, an additional 300 probes were added, but ERP responses to these probes were not recorded. The word onsets to which probes were assigned were selected such that the acoustic properties of the narrative were similar across all conditions. Specifically, there were no significant differences in average intensity, peak intensity, average pitch, or pitch change in the 200 ms surrounding each probe onset (see Astheimer & Sanders, 2009 for a detailed acoustic analysis).
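The probe-placement rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, the 16 ms criterion is assumed to be inclusive, and taking the mean of the three coders' estimates as the word onset is an assumption (the text does not say how a single onset was derived). Control probes, which fall at random times unrelated to word onsets, are not modeled here.

```python
# Illustrative sketch only; names and the use of the coder mean are assumptions.

MAX_CODER_RANGE_MS = 16  # from the text: coder onsets must fall within a 16 ms range
PROBE_OFFSETS_MS = {"before": -100, "onset": 0, "after": 100}  # relative to word onset


def eligible_onset(coder_onsets_ms):
    """Return the mean coded onset if the coders agree within 16 ms, else None."""
    if max(coder_onsets_ms) - min(coder_onsets_ms) > MAX_CODER_RANGE_MS:
        return None
    return sum(coder_onsets_ms) / len(coder_onsets_ms)


def probe_time(coder_onsets_ms, condition):
    """Probe onset (ms from segment start) for one word in one condition."""
    onset = eligible_onset(coder_onsets_ms)
    if onset is None:
        return None  # word is not assigned a probe
    return onset + PROBE_OFFSETS_MS[condition]
```

For example, `probe_time([1000, 1006, 1012], "after")` gives a probe 100 ms after the agreed onset, whereas a word coded at `[1000, 1010, 1020]` spans 20 ms across coders and would receive no probe.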
The narrative with attention probes was presented over two Logitech speakers placed directly in front of participants and connected to a Dell computer running E-Prime software. The peak intensity of the narrative and probes was 65 dB SPL (A-weighted), measured at the location of the participants. Small photographs related to the narrative were presented with a visual angle of 3.5° at the center of a black background on a computer monitor 150 cm in front of the participant. A new image was presented with the onset of each sound file, and picture changes never occurred less than 2000 ms before or 500 ms after an attention probe.
2.3.2. EEG Recording
Continuous electroencephalogram (EEG) was recorded from active Ag/AgCl electrodes (Biosemi Active Two system, Amsterdam, Netherlands) located at 64 standard scalp sites (International 10/20 system) as well as the left and right mastoid. All signals were recorded with a Common Mode Sense (CMS) reference and a bandwidth of .01–80 Hz, digitized at 512 Hz. Electrolytic gel was used to maintain impedances below 20 kΩ throughout the recording session. To detect eye movements and blinks, electrooculogram (EOG) was recorded from four additional electrodes placed below and at the outer canthi of each eye.
All analyses were conducted using the ERPLAB and EEGLAB toolboxes in MATLAB. Continuous EEG was referenced to the average of the left and right mastoids and segmented into 1000 ms epochs from 200 ms before to 800 ms after probe onset, baseline corrected to the 200 ms pre-stimulus interval. Eyeblinks and eye movements were modeled using Infomax independent components analysis (ICA) decomposition in EEGLAB and removed from the recording. Additionally, trials with extreme voltage values, as determined by individual maximum amplitude criteria and visual inspection, were excluded from individual subject averages. Only data from participants with at least 100 artifact-free trials were included in the final analyses.
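The segmentation and baseline-correction step can be illustrated with a short Python sketch for a single channel. It is not the EEGLAB/ERPLAB pipeline used in the study: the function names are hypothetical, the signal is assumed to be a plain list of voltages already re-referenced and digitized at 512 Hz, and rounding millisecond limits to the nearest sample is an assumption.

```python
# Illustrative single-channel sketch; not the toolbox code used in the study.

SAMPLING_RATE_HZ = 512       # digitization rate reported in the text
PRE_MS, POST_MS = 200, 800   # epoch limits reported in the text


def ms_to_samples(ms, rate_hz=SAMPLING_RATE_HZ):
    """Convert a duration in ms to the nearest whole number of samples."""
    return int(round(ms * rate_hz / 1000))


def extract_epoch(signal, probe_sample):
    """Cut one epoch around a probe and subtract the pre-stimulus mean."""
    n_pre = ms_to_samples(PRE_MS)
    n_post = ms_to_samples(POST_MS)
    start, stop = probe_sample - n_pre, probe_sample + n_post
    if start < 0 or stop > len(signal):
        raise ValueError("epoch extends beyond the recording")
    epoch = signal[start:stop]
    baseline = sum(epoch[:n_pre]) / n_pre  # mean of the 200 ms pre-probe interval
    return [v - baseline for v in epoch]
```

At 512 Hz the 200 ms and 800 ms limits round to 102 and 410 samples, so each epoch holds 512 samples, matching the stated 1000 ms duration.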
Mean amplitude was measured in time windows surrounding early auditory onset components that are typically modulated by attention, based on visual inspection of the waveforms: P1 (60–100 ms), N1 (100–160 ms), P2 (160–250 ms), as well as during a late negativity (275–600 ms). Measurements were made at 15 electrode sites, distributed across the anterior central regions where auditory evoked potentials were largest, arranged in a 5 (left-right, or LR) × 3 (anterior-posterior, or AP) grid. Because our primary interest was in detecting different attentional patterns for the two groups, separate 4 (probe time) × 5 (LR) × 3 (AP) mixed-measures ANOVAs were conducted for each group. Planned comparisons (Bonferroni corrected) were conducted for all significant (p < .05) main effects and interactions.
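As a rough illustration of the mean-amplitude measure, the following Python sketch averages a baseline-corrected epoch over each of the reported windows. The window limits come from the text; the function name, the epoch layout (a list that starts 200 ms before probe onset, sampled at 512 Hz), and the treatment of window edges (start inclusive, end exclusive, rounded to the nearest sample) are assumptions.

```python
# Illustrative sketch; window limits are from the text, everything else is assumed.

WINDOWS_MS = {
    "P1": (60, 100),
    "N1": (100, 160),
    "P2": (160, 250),
    "late_negativity": (275, 600),
}


def mean_amplitude(epoch, window_ms, rate_hz=512, pre_ms=200):
    """Mean voltage of one epoch within a post-probe window given in ms."""
    lo_ms, hi_ms = window_ms
    lo = int(round((pre_ms + lo_ms) * rate_hz / 1000))  # first sample in window
    hi = int(round((pre_ms + hi_ms) * rate_hz / 1000))  # one past the last sample
    samples = epoch[lo:hi]
    return sum(samples) / len(samples)
```

An attention effect of the kind reported here would then correspond to the difference between two such means, for instance the N1 mean amplitude for probes after word onset minus that for control probes, computed on each participant's averaged waveform.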
3. Results
3.1. Background Measures
Background measures for both groups are presented in Table 1. One monolingual participant included in the ERP analysis did not provide Shipley or PPVT data. Independent samples t-tests revealed that groups did not differ significantly in their age (p > 0.1), socioeconomic status (p > 0.3), receptive vocabulary (p > 0.1), or nonverbal intelligence (p > 0.2).
Table 1.
Mean scores (and standard deviations) on background measures for monolingual and bilingual participants.
| Group | N | Age | SES | Shipley Nonverbal Percentage | PPVT Standard Score |
|---|---|---|---|---|---|
| Monolinguals | 19* | 19.8 (2.6) | 3.3 (1.1) | 65.8 (15.1) | 105.5 (9.4) |
| Bilinguals | 19 | 21.9 (5.2) | 2.9 (1.3) | 71.7 (15.7) | 101 (6.0) |
*One monolingual participant included in the ERP analysis did not provide Shipley or PPVT data.
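The group comparisons can be approximately checked from the summary statistics in Table 1. The sketch below computes an independent-samples t statistic from means, standard deviations, and group sizes. It assumes the pooled-variance (Student) form, which the text does not specify, and the function name is hypothetical. Because the table values are rounded, the result only approximates what an analysis of the raw data would give.

```python
import math


def t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Pooled-variance t statistic and degrees of freedom from group summaries."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Age row of Table 1: monolinguals 19.8 (2.6), bilinguals 21.9 (5.2), n = 19 each
t_age, df_age = t_from_summary(19.8, 2.6, 19, 21.9, 5.2, 19)
```

For age this gives a t of about −1.57 on 36 degrees of freedom, which is smaller in magnitude than the two-tailed critical value for p = .10 (roughly 1.69) and is therefore consistent with the reported p > 0.1.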
3.2. ERPs
As seen in Figures 1 and 2, attention probes elicited a typical positive-negative-positive series of peaks that was largest over central anterior electrodes. The first positive deflection (P1) peaked around 85 ms, the first negative deflection (N1) peaked around 130 ms, and the second positive deflection (P2) peaked around 200 ms. These auditory evoked potentials were followed by an extended late negativity that reached maximal amplitude around 450 ms. There were no differences in peak latency between groups (ps > 0.5).
Figure 1.
Auditory evoked potentials elicited by probes presented at four times relative to word onset: 100 ms before (long dashed line), concurrent with word onset (short dashed line), 100 ms after word onset (dotted line), and random control times (solid line) for (a) monolingual and (b) bilingual participants. Data are shown from three representative anterior/central electrodes (FC1, FCZ, FC2). Shaded regions with asterisks indicate time windows in which probes after word onset elicited a more negative response than control probes.
Figure 2.
Scatterplots showing the magnitude of observed ERP differences in bilinguals as a function of age of English acquisition (AOA). Amplitude values represent the difference between probes after word onset versus control probes during the (a) P1, (b) N1, and (c) P2 time windows, and (d) the difference between probes at word onset versus control probes during the late negativity.
Figure 1(a) shows that among monolingual participants there was no effect of probe time in the P1 window (p > .15). In the N1 time window, however, there was a main effect of probe time (F(3, 54) = 3.52, p < .05), with probes 100 ms after word onset eliciting a larger N1 than control probes (p < .05). This pattern of a larger negativity for probes after word onset versus control continued into the P2 time window (F(3, 54) = 5.63, p < .005). The N1 and P2 effects did not interact with electrode position, as they were present across all central anterior electrodes included in the analysis. There was no effect of probe time during the late negativity (p > 0.1).
Bilinguals showed a different attentional pattern, with a main effect of probe time appearing as early as the P1 window (F(3, 54) = 3.36, p < .05), in which probes presented after word onset elicited a smaller P1 than control probes (Figure 1(b)). The effect of probe time was only marginally significant in the N1 time window (F(3, 54) = 2.2, p < .1), in which probes presented before, at, or after word onset appeared to elicit a larger N1 than control probes. In the P2 window, probes presented after word onset elicited a smaller P2 than control probes (F(3, 54) = 3.21, p < .05). Finally, there was a large effect of probe time during the late negativity (F(3, 54) = 5.34, p < .005), when probes presented at and after word onset elicited a larger negativity than control probes (p < .05). There were no interactions of probe time and electrode position in any of these time windows, as these differences were observed across all central anterior electrodes included in the analysis.
To further explore whether the attentional effects observed in bilinguals could be explained by their level of experience with English, the magnitudes of the observed ERP differences were correlated with the age of English acquisition. There was no correlation between age of English acquisition and the P1 reduction for probes after word onset versus control (r = −.23, p = .35; Figure 2(a)), the marginal N1 enhancement for probes just before (r = .13, p = .60), at (r = .24, p = .32), and after (r = −.37, p = .12; Figure 2(b)) word onset, or the P2 reduction for probes after word onset (r = −.14, p = .57; Figure 2(c)). Age of English acquisition was also not correlated with size of the late negativity for probes presented after word onset (r = −.10, p = .68), but it was marginally correlated with the size of the negativity for probes at word onset (r = .41, p < .08; Figure 2(d)) such that bilinguals who learned English at an earlier age tended to show a more negative response to probes presented at word onset. Similar to the lack of correlation with age of acquisition, bilingual ERP differences were also not significantly correlated with PPVT performance in any time window (ps > .2).
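The correlational analysis can be sketched as follows, assuming Pearson correlations (the text reports r values but does not name the coefficient). The function names are hypothetical; `pearson_r` would take one age-of-acquisition value and one ERP difference score per bilingual participant, and `t_from_r` converts a correlation into the t statistic used to test it against zero with n − 2 degrees of freedom.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two sequences of equal length (at least 3 values)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic (df = n - 2) for testing a correlation against zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))
```

Applied to the reported late-negativity correlation, `t_from_r(0.41, 19)` is about 1.85 on 17 degrees of freedom, which falls between the two-tailed critical values for p = .10 (1.740) and p = .05 (2.110), in line with the description of this effect as marginal.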
4. Discussion
The current study used ERPs to measure the allocation of attention over time during natural speech perception in native English monolingual and Chinese-English bilingual young adults. Among monolingual participants, attention probes presented after word onsets elicited a larger N1 response than control probes, indicating that native English speakers selectively attended to times that contained word-initial segments (Figure 1). This finding replicates previous research with this paradigm (Astheimer & Sanders, 2009). In contrast, Chinese-English bilinguals who were matched with the monolinguals on English proficiency demonstrated a different attentional pattern. Bilinguals showed a larger N1 response to probes presented before, at, or after word onset compared to control probes, suggesting that attention is allocated in a more diffuse pattern around word onsets (Figure 1). In both groups, this N1 enhancement was often accompanied by a more negative P2 response, and so the differences within these two time windows should not be considered separate responses but rather reflect a prolonged negative difference (Nd) often observed in auditory attention paradigms (Näätänen & Michie, 1979). Taken together, these results indicate that bilingualism causes measurable changes in the way people listen to English speech, which affects early perceptual processing of spoken words.
The ERP patterns observed in monolingual English speakers largely replicate previous findings (Astheimer & Sanders, 2009). Monolinguals pay more attention at times that contain word-initial segments compared to random control times, a strategy that is beneficial for listening to English because these times contain information critical for word identification (Marslen-Wilson & Zwitserlood, 1989; Salasoo & Pisoni, 1985). Given that English is a stress-timed language, word-initial stress may make these segments more salient, thus entraining attention to select for word onsets. Importantly, acoustic properties known to influence N1 amplitude were matched across the different time conditions, so differences observed can be attributed to attention rather than physical differences in the acoustic environment. The emphasis on word onsets in English may encourage native listeners to pay more attention to word beginnings to allow them to glean maximal information from the speech stream.
To our knowledge, attention to continuous English speech has not been previously characterized in bilingual speakers. Although there is some evidence that late-learning Japanese-English bilinguals do not preferentially process word-initial syllables compared to word-medial syllables (Sanders & Neville, 2003a), the attention probe paradigm in the current study provided a more precise temporal profile of attention around word onsets in a group of bilinguals that had learned English at a younger age. Our results indicate that Chinese-English bilinguals do not attend specifically to word-initial syllables; instead, they appear to pay more attention to all times around a word onset. The observation of a larger N1 for times even before word onsets can be interpreted as sustained attention throughout the previous word.
Another key finding among Chinese-English bilinguals is that the earliest effects of attention are observed in the P1 time window, indicating that attention modulates very early perceptual processing during bilingual speech perception. In other temporal attention paradigms, earlier latency effects were observed in more perceptually demanding tasks (Correa, Lupiáñez, Madrid, & Tudela, 2006; Griffin, Miniussi, & Nobre, 2002), so this P1 difference may reflect increased difficulty faced by bilinguals. Similarly, bilinguals also showed a larger late negativity in response to probes presented in conjunction with word onsets. This difference resembles an N400 response, and may therefore reflect the bilinguals’ greater disruption of semantic processing when attention probes were presented in conjunction with critical information at a word onset. Interestingly, the magnitude of this late effect was marginally correlated with age of English acquisition, suggesting that bilinguals who learned English at a younger age may have experienced more disruption when additional acoustic information was presented in conjunction with word onsets. This increased response may even reflect a more sensitive encoding of speech sounds, as recently observed in bilingual infants (Kovács & Mehler, 2009) and adolescents (Krizman et al., 2012).
As a first attempt at characterizing the use of attention during speech perception in bilinguals, the current study raises questions about the mechanism by which bilingualism influences attention to speech. For example, it is unclear whether differences in attention would be observed in any bilingual group, or if specific knowledge of Chinese drives the attentional differences observed in the current study. Recent ERP evidence demonstrates that native Chinese listeners modulate temporally selective attention while listening to Chinese speech according to interacting factors such as accentuation, semantic congruence, and lexical predictability (Li, Lu, & Zhao, 2014; Li & Ren, 2012). It is not clear from these studies whether Chinese listeners select specifically for word onsets versus other time points in their native language, and in fact several aspects of the language are likely to encourage different attentional strategies during Chinese speech perception. As described above, Chinese has different prosodic patterns and segmentation cues than English, and information is more temporally dispersed across tonal syllables and disyllabic compounds. All of these linguistic characteristics may encourage a more widely distributed attentional pattern during Chinese speech perception.
If the structure of Chinese language encourages a more diffuse attentional pattern, then attention strategies during Chinese comprehension may influence the way Chinese-English listeners process English speech. There is evidence that bilinguals apply speech segmentation strategies from their L1 during L2 comprehension (Cutler, Mehler, Norris, & Segui, 1992). For example, Mandarin-English bilinguals automatically attend to consonant and tone information simultaneously when listening to English, even though tone is not lexically relevant in English as it is in Mandarin (Lin & Francis, 2014). Taken together, these results suggest that attentional strategies from the native language may persist when listening to L2. Alternatively, knowledge of multiple languages may simply encourage a more generalizable attention strategy that can be applied to any language, suggesting that bilingualism itself rather than knowledge of Chinese is driving our results.
A closer look at the linguistic experience of the bilinguals in the current study provides insight into the way bilingualism affects attention to speech. Although there was variability in the age of English acquisition within our bilingual sample, all participants showed a similar attention pattern, and as seen in Figure 2, age of acquisition was not correlated with the magnitude of any early ERP effects. Among Chinese-English bilinguals who learned English later in childhood, attentional biases from their knowledge of Chinese may have influenced the way they attended to English. Importantly, four of the bilinguals in our sample reported learning Chinese and English simultaneously, yet they appeared to show the same diffuse attention pattern. Thus, concurrent exposure to a second language may have caused them to attend to English differently than monolingual English speakers. This suggests that bilinguals are not simply applying a previously learned attention strategy to a new language, but rather that the experience of bilingualism is altering the use of attention across multiple languages.
Although attending to word onsets is an ideal listening strategy for English, bilinguals may adopt a more generic attention strategy that can be applied to multiple languages. A more diffuse attentional pattern would allow bilinguals to glean critical information from multiple languages rather than having to maintain separate attention strategies across languages. It is also possible that, given previous evidence for enhanced attentional control and sensory encoding of speech sounds in bilinguals (Krizman et al., 2012, 2014), they are better able to focus attentional resources solely on critical information within the speech stream while filtering the irrelevant probe sounds completely. Another possible explanation for our findings is that during speech comprehension, bilinguals must use attentional control to activate a single language (Costa et al., 2006), so resources are already limited within the target language, preventing attentional enhancement at critical times during English speech. However, given that bilinguals in the current study all spoke Chinese as a first language, it is difficult to conclude definitively whether the observed attentional patterns are driven by specific experience with Chinese or whether they would persist in any multilingual listener regardless of language type. Measuring attention across multiple languages and bilingual groups could help to distinguish among these possibilities.
The attentional system engaged during speech perception demonstrates considerable plasticity as a function of language experience. These results add to a body of evidence that bilingualism causes profound changes in executive control and attention systems. These changes stem not only from the need to manage language competition during speech production but also from the variety of speech input that bilinguals receive. Thus, the diverse auditory information encountered by bilinguals shapes the way speech sounds are processed in the bilingual brain.
Acknowledgments
This work was partially supported by grant R01HD052523 from the US National Institutes of Health and by grant A2559 from the Natural Sciences and Engineering Research Council of Canada to EB. We thank Mike Rakoczy for his assistance with experiment preparation and data collection.
References
- Abbey E. Freedom and wilderness. Minocqua, WI: North Word Audio Press; 1987.
- Abutalebi J, Green D. Control mechanisms in bilingual language production: Neural evidence from language switching studies. Language and Cognitive Processes. 2008;23(4):557–582.
- Astheimer LB, Sanders LD. Listeners modulate temporally selective attention during natural speech processing. Biological Psychology. 2009;80(1):23–34. doi: 10.1016/j.biopsycho.2008.01.015.
- Astheimer LB, Sanders LD. Temporally selective attention supports speech processing in 3- to 5-year-old children. Developmental Cognitive Neuroscience. 2012;2(1):120–128. doi: 10.1016/j.dcn.2011.03.002.
- Barnes R, Jones MR. Expectancy, attention, and time. Cognitive Psychology. 2000;41(3):254–311. doi: 10.1006/cogp.2000.0738.
- Bartolotti J, Marian V, Schroeder SR, Shook A. Bilingualism and inhibitory control influence statistical learning of novel word forms. Frontiers in Psychology. 2011;2. doi: 10.3389/fpsyg.2011.00324.
- Bialystok E, Craik FIM, Green DW, Gollan TH. Bilingual minds. Psychological Science in the Public Interest. 2009;10(3):89–129. doi: 10.1177/1529100610387084.
- Correa A, Lupiáñez J, Madrid E, Tudela P. Temporal attention enhances early visual processing: a review and new evidence from event-related potentials. Brain Research. 2006;1076(1):116–128. doi: 10.1016/j.brainres.2005.11.074.
- Costa A, Santesteban M, Ivanova I. How do highly proficient bilinguals control their lexicalization process? Inhibitory and language-specific selection mechanisms are both functional. Journal of Experimental Psychology: Learning, Memory, and Cognition. 2006;32(5):1057–1074. doi: 10.1037/0278-7393.32.5.1057.
- Cutler A, Mehler J, Norris D, Segui J. The monolingual nature of speech segmentation by bilinguals. Cognitive Psychology. 1992;24(3):381–410. doi: 10.1016/0010-0285(92)90012-q.
- Cutler A, Norris D. The role of strong syllables in segmentation for lexical access. Journal of Experimental Psychology: Human Perception & Performance. 1988;14:113–121.
- Duanmu S. The Phonology of Standard Chinese. 2nd ed. The Phonology of the World’s Languages; 2007.
- Dunn LM, Dunn LM. Peabody picture vocabulary test. 3rd ed. Circle Pines, MN: American Guidance Service; 1997.
- Garbin G, Sanjuan A, Forn C, Bustamante JC, Rodriguez-Pujadas A, Belloch V, Ávila C. Bridging language and attention: Brain basis of the impact of bilingualism on cognitive control. NeuroImage. 2010;53(4):1272–1278. doi: 10.1016/j.neuroimage.2010.05.078.
- Green DW. Mental control of the bilingual lexico-semantic system. Bilingualism: Language and Cognition. 1998;1(02):67–81.
- Griffin IC, Miniussi C, Nobre AC. Multiple mechanisms of selective attention: differential modulation of stimulus processing by attention to space or time. Neuropsychologia. 2002;40(13):2325–2340. doi: 10.1016/s0028-3932(02)00087-8.
- Hillyard SA, Hink RF, Schwent VL, Picton TW. Electrical signs of selective attention in the human brain. Science. 1973;182(4108):177–180. doi: 10.1126/science.182.4108.177.
- Hink RF, Hillyard SA. Auditory evoked potentials during listening to dichotic speech messages. Perception & Psychophysics. 1976;20:236–242.
- Howie JM. Acoustical studies of Mandarin vowels and tones. Cambridge, UK: Cambridge University Press; 1976.
- Jared D, Kroll JF. Do Bilinguals Activate Phonological Representations in One or Both of Their Languages When Naming Words? Journal of Memory and Language. 2001;44(1):2–31.
- Kovács ÁM, Mehler J. Flexible Learning of Multiple Speech Structures in Bilingual Infants. Science. 2009;325(5940):611–612. doi: 10.1126/science.1173947.
- Krizman J, Marian V, Shook A, Skoe E, Kraus N. Subcortical encoding of sound is enhanced in bilinguals and relates to executive function advantages. Proceedings of the National Academy of Sciences. 2012;109(20):7877–7881. doi: 10.1073/pnas.1201575109.
- Krizman J, Skoe E, Marian V, Kraus N. Bilingualism increases neural response consistency and attentional control: Evidence for sensory and cognitive coupling. Brain and Language. 2014;128(1):34–40. doi: 10.1016/j.bandl.2013.11.006.
- Lange K, Rösler F, Röder B. Early processing stages are modulated when auditory stimuli are presented at an attended moment in time: an event-related potential study. Psychophysiology. 2003;40(5):806–817. doi: 10.1111/1469-8986.00081.
- Lee M-W, Williams JN. Lexical access in spoken word production by bilinguals: evidence from the semantic competitor priming paradigm. Bilingualism: Language and Cognition. 2001;4(03):233–248.
- Lin M, Francis AL. Effects of language experience and expectations on attention to consonants and tones in English and Mandarin Chinese. The Journal of the Acoustical Society of America. 2014;136(5):2827–2838. doi: 10.1121/1.4898047.
- Li X, Lu Y, Zhao H. How and when predictability interacts with accentuation in temporally selective attention during speech comprehension. Neuropsychologia. 2014;64C:71–84. doi: 10.1016/j.neuropsychologia.2014.09.020.
- Li X, Ren G. How and when accentuation influences temporally selective attention and subsequent semantic processing during on-line spoken language comprehension: an ERP study. Neuropsychologia. 2012;50(8):1882–1894. doi: 10.1016/j.neuropsychologia.2012.04.013.
- Marslen-Wilson WD, Zwitserlood P. Accessing spoken words: the importance of word onsets. Journal of Experimental Psychology: Human Perception & Performance. 1989;15(3):576–585.
- Näätänen R, Michie PT. Early selective-attention effects on the evoked potential: a critical review and reinterpretation. Biological Psychology. 1979;8(2):81–136. doi: 10.1016/0301-0511(79)90053-x.
- Saffran J, Newport E, Aslin R. Word segmentation: The role of distributional cues. Journal of Memory and Language. 1996;35:606–621.
- Salasoo A, Pisoni DB. Interaction of knowledge sources in spoken word identification. Journal of Memory and Language. 1985;24(2):210–231. doi: 10.1016/0749-596X(85)90025-7.
- Sanders LD, Astheimer LB. Temporally selective attention modulates early perceptual processing: event-related potential evidence. Perception & Psychophysics. 2008;70(4):732–742. doi: 10.3758/pp.70.4.732.
- Sanders LD, Neville HJ. An ERP study of continuous speech processing II. Segmentation, semantics, and syntax in non-native speakers. Cognitive Brain Research. 2003a;15(3):214–227. doi: 10.1016/s0926-6410(02)00194-5.
- Sanders LD, Neville HJ. An ERP study of continuous speech processing I. Segmentation, semantics, and syntax in native speakers. Cognitive Brain Research. 2003b;15(3):228–240. doi: 10.1016/s0926-6410(02)00195-7.
- Sebastián-Gallés N, Albareda-Castellot B, Weikum WM, Werker JF. A Bilingual Advantage in Visual Language Discrimination in Infancy. Psychological Science. 2012;23(9):994–999. doi: 10.1177/0956797612436817.
- Van Heuven WJB, Dijkstra T, Grainger J. Orthographic neighborhood effects in bilingual word recognition. Journal of Memory and Language. 1998;39(3):458–483.
- Wang T, Saffran JR. Statistical learning of a tonal language: the influence of bilingualism and previous linguistic experience. Frontiers in Psychology. 2014;5:953. doi: 10.3389/fpsyg.2014.00953.
- Weikum WM, Vouloumanos A, Navarra J, Soto-Faraco S, Sebastián-Gallés N, Werker JF. Visual Language Discrimination in Infancy. Science. 2007;316(5828):1159–1159. doi: 10.1126/science.1137686.
- Zachary RA. Shipley Institute of Living Scale: Revised Manual. Los Angeles: Western Psychological Services; 1986.
- Zhou X, Marslen-Wilson W. Words, morphemes and syllables in the Chinese mental lexicon. Language and Cognitive Processes. 1994;9(3):393–422.


