Author manuscript; available in PMC 2021 Jul 1.
Published in final edited form as: J Phon. 2020 Jun 5;81:100981. doi: 10.1016/j.wocn.2020.100981

Factors modulating cross-linguistic co-activation in bilinguals

Margarethe McDonald a,*, Margarita Kaushanskaya a
PMCID: PMC7375413  NIHMSID: NIHMS1600906  PMID: 32699456

Abstract

Activation of both of a bilingual’s languages during auditory word recognition has been widely documented. Here, we argue that if parallel activation in bilinguals is the result of a bottom-up process where phonetic features that overlap across two languages activate both linguistic systems, then the robustness of such parallel activation is in fact surprising. This is because phonemes across two different languages are rarely perfectly matched to each other in phonetic features. For instance, across Spanish and English, a “voiced” stop is realized in phonetically-distinct ways, and therefore, words that begin with voiced stops in English do not in fact fully overlap in phonetic features with words in Spanish. In two eye-tracking experiments using a visual world paradigm, we examined the effect of a phonemic match (English /b/ matched to Spanish /b/) vs. a phonetic match (English /b/ matched to Spanish /p/) on cross-linguistic co-activation (English words co-activating Spanish) in Spanish L1 and in Spanish L2 speakers. We found that while phonemic matching induced co-activation in both Spanish L1 and Spanish L2 speakers, phonetic matching did not. Together, these results indicate that co-activation of two languages in bilinguals may proceed through activation of categorical phonemic information rather than through activation of phonetic features.

Keywords: cross-linguistic, co-activation, bilingual, VOT, word recognition, eye-tracking

1.0. Introduction

It is currently taken for granted that the two languages of a bilingual are activated in parallel during auditory language comprehension (Blumenfeld & Marian, 2007; Canseco-Gonzalez et al., 2010; Chambers & Cooke, 2009; Lagrou, Hartsuiker & Duyck, 2011; Marian, Blumenfeld, & Boukrina, 2008; Marian & Spivey, 1999, 2003a, 2003b; Shook & Marian, 2012, 2016; Weber & Cutler, 2004). The phenomenon is so prevalent that tonal languages can be activated by adding tones to non-tonal languages, and verbal modalities can activate signed modalities (Shook & Marian, 2012, 2016). Most theories suggest that auditory input of one language leads to activation of the other language through shared phonological and semantic levels across the two languages. However, exactly how the initial representation of the auditory input enters into the language system is not agreed upon. One reason this has not been extensively studied is that research on cross-linguistic co-activation, typically examined using eye-tracking visual world paradigms, has not focused on small-scale phonetic differences between languages. In the present study, we examined whether parallel activation of bilinguals’ two languages is influenced by the way in which words across the two languages are matched in terms of their specific phonetic features. In two experiments, we contrasted phonemic (but not phonetic) matching across two languages (English and Spanish) with phonetic (but not phonemic) matching, and tested how such matching influenced activation of Spanish upon hearing English in Spanish-English bilingual speakers. Our findings speak to theories of bilingual language processing, which are largely seen as non-language-specific but do not agree upon the linguistic unit or level where co-activation of two languages takes place. 
Our findings are also relevant to the basic methodology targeting bilingual auditory language processing, which often considers bilingual language proficiency but not the degree of phonetic overlap between languages.

1.1. Language experience in co-activation

Linguistic co-activation is the activation of multiple lexical items that overlap at the onset or coda during the process of word recognition. For example, when hearing a word like candle while being presented pictures which include a candle and candy, English monolinguals initially look to both candy and candle before settling on the target word (Allopenna, Magnuson, & Tanenhaus, 1998). This indicates that the similarity in the onset phonemes causes activation of both words, and likely all words that have similar onsets (Marslen-Wilson, 1987). Cross-linguistic co-activation is a similar phenomenon that occurs when a listener has knowledge of more than one language. For example, a Spanish-English bilingual who hears the English word leaf may also co-activate book because the Spanish label for book (libro) is phonemically similar at the onset to leaf (Fricke, Kroll, & Dussias, 2016). Eye-tracking studies have been invaluable to the field of auditory processing, and have led to insights both for monolingual and bilingual populations (e.g. Allopenna, Magnuson & Tanenhaus, 1998; Magnuson, Dixon, Tanenhaus, Aslin, 2007; Marian & Spivey, 2003a; Lagrou, Hartsuiker & Duyck, 2011).

The first bilingual studies using eye-tracking to examine cross-linguistic activation were conducted by Marian and Spivey (1999). In a series of studies, Russian-English bilinguals were presented with 4 images: one target image in English (marker), one cross-linguistic competitor in Russian (marka – stamp), and two filler images which were unrelated to the target or competitor. Control trials, where there was no overlap among the 4 images, were also presented throughout. An eye-tracker indexed eye-movements to each of the 4 images following the auditory presentation of the target word in English. Fixations to the Russian competitor image were significantly higher than fixations to a control image in the same location on control trials. When participants were tested in the other language direction (co-activation of L2 English during L1 Russian processing), co-activation was also found.

However, follow-up studies have not found co-activation equally consistently in each language direction. Instead, many studies indicate that while the first language (L1) of a bilingual is activated when listening to the second language (L2), the reverse is less reliably the case. That is, the L2, which is often the less dominant and/or less proficient language, is not activated when bilinguals listen to their L1 (Blumenfeld & Marian, 2007; Canseco-Gonzalez et al., 2010; Chambers & Cooke, 2009; Marian & Spivey, 2003b; Weber & Cutler, 2004). For instance, Canseco-Gonzalez et al. (2010) tested 3 types of Spanish-English bilinguals: those who acquired Spanish from birth, those who acquired English from birth, and simultaneous bilinguals. In a task conducted in English testing co-activation of Spanish, both groups that acquired Spanish from birth showed co-activation of Spanish, but the group that acquired Spanish later (after age 6 years) did not. Similarly, Blumenfeld and Marian (2007) found that when presented with auditory targets in English, both English- and German-native bilinguals showed cross-linguistic co-activation for cognate words, while only German-native bilinguals showed cross-linguistic co-activation for non-cognate words.

In general, there are only a few studies that find strong L2-activation when processing in the L1 (Marian & Spivey, 1999, 2003a; Lagrou, Hartsuiker, & Duyck, 2011). In a study with Russian-English bilinguals, Marian and Spivey (2003a) found significant co-activation of the L2 when listening to the L1, but no significant co-activation in the other direction. The authors cite the amount of overlap between linguistic stimuli as a possible explanation for why the results of this study conflict with their own previous findings. Indeed, there is evidence suggesting that the amount of shared phonemes across languages affects the efficiency of lexical retrieval. Marian, Blumenfeld and Boukrina (2008) tested Russian-English bilinguals on an auditory lexical decision task, and examined how unique and shared phonemes of the two languages affected lexical retrieval. When tested in their native language, participants were slower and less accurate at making lexical decisions on words which contained phonemes that existed in both languages than on words which contained phonemes that were specific to their native language. On the other hand, when tested in their L2, participants were faster and more accurate for words with more shared phonemes between their two languages than for words that were specific to their L2. It seems that shared phonemes between languages have an effect on L1 language processing such that greater phonemic overlap with the L2 is more likely to hinder L1 processing. It is likely that co-activation of the L2 during L1 processing leads to this slowing. For eye-tracking studies, this means that the more shared phonemes there are in a word, the more co-activation may occur, and that co-activation may not operate the same way in each language direction.

1.2. Phonemic and phonetic overlap in co-activation

Participant language proficiency profiles have been extensively examined in the co-activation literature; however, another important factor has received less attention: the type of cross-linguistic matching of stimuli in co-activation experiments. In addition to the number of shared phonemes in a stimulus, the type of matching across the languages may affect co-activation, and there is evidence that a mismatch in the number of categories available to classify similar sounds in two languages has a significant effect on co-activation. For example, Cutler, Weber and Otake (2006) tested co-activation of English /l/- and /r/-initial words in Japanese-English bilinguals. Japanese has only one liquid consonant category, which is more similar to the English /l/ than the English /r/. This leads to difficulty in differentiating /l/ and /r/ for Japanese-native speakers during English comprehension. In an eye-tracking co-activation task, performed completely in English, Japanese-English bilinguals were more likely to co-activate /l/-initial words (e.g. locker) when hearing /r/-initial words (e.g. rocket) than they were to co-activate /r/-initial words when hearing /l/-initial words. Weber and Cutler (2004) found similar results for vowels in Dutch-English bilinguals. While there is a phonemic distinction between English /ε/ and /æ/, only /ε/ exists in Dutch. When tested in English, Dutch-English bilinguals co-activated English words that contained an /ε/ when hearing an English word containing an /æ/, but not English words containing /æ/ when hearing English words containing an /ε/. Weber and Cutler attempted to extend this effect to cross-language co-activation, but found a similar degree of co-activation for words with vowels that overlapped better cross-linguistically and those that did not.

In summary, these studies indicate that the amount of overlap in the phonemic inventories across the two languages can have a significant effect on bilingual language processing. More specific to study design, this suggests that the types of stimuli chosen to probe cross-linguistic co-activation may have a profound effect on the results, depending on participants’ native phonemic inventory. Although the phoneme inventories of the two languages being tested can affect how much cross-linguistic activation is manifested in bilinguals, even when phonemes are seemingly similar (i.e., represented by the same IPA symbols), phonetic features can be language-specific. A hallmark bilingual eye-tracking study by Ju and Luce (2004) exemplifies this. They tested Spanish-English bilinguals listening to Spanish words (playa, beach) while looking at pictures which included English competitors (pliers). They manipulated the voice onset time (VOT) of the first phoneme of the words to make some more Spanish-like and some more English-like, and found that co-activation only occurred when the onsets were English-like. However, past studies that found co-activation in this direction did not have to manipulate the acoustics of the words to find significant effects (e.g. Marian & Spivey, 2003a). In addition to language direction/proficiency factors, one reason Ju and Luce did not find activation with unmanipulated stimuli may be that they used only voiceless stops (/p, t, k/) as the manipulated initial phoneme, while other studies have used a wider range of sounds. Voiceless stops in Spanish and English do not have identical phonetic properties, although they are generally represented by the same linguistic symbols.

Stop consonants include /p, t, k, b, d, g/, and while these are generally referred to as voiced (/b, d, g/) and voiceless (/p, t, k/) stops, the labels are an acoustic misnomer for English phonemes. In English, in word-initial position, all of these sounds tend to be voiceless; the two categories differ instead in aspiration and voice onset time (VOT). In word-onset position, /p, t, k/ are produced with aspiration and a long-lag VOT of roughly 70–100ms, while /b, d, g/ are produced without aspiration and a short-lag VOT of roughly 0–30ms. In Spanish, by contrast, voicing does differentiate the two categories, and none of these sounds are aspirated. In word-onset position in Spanish, /p, t, k/ are produced with a short-lag VOT of roughly 0–30ms, but /b, d, g/ are often produced with a negative VOT known as pre-voicing (Lisker & Abramson, 1964). This means that while an English /b/ and a Spanish /b/ are phonemically similar, both lying on the lower end of their language’s VOT continuum, phonetically, based on VOT measurements, an English /b/ and a Spanish /p/ are more similar to each other. See Figure 1 for a comparison of English and Spanish VOT.

Figure 1. A comparison of voice onset time in English and Spanish stops.
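As a concrete illustration of this asymmetry, the VOT ranges above can be sketched as a toy classifier. This is a simplification for illustration only: the cut-offs are the approximate range boundaries from Lisker and Abramson (1964) as summarized in the text, not claims about listeners' actual perceptual boundaries.

```python
def classify_stop(vot_ms, language):
    """Map a word-initial stop's VOT (ms) onto a phonemic category."""
    if language == "English":
        # English contrasts short-lag (~0-30 ms) /b d g/ with
        # long-lag, aspirated (~70-100 ms) /p t k/.
        return "voiced (/b d g/)" if vot_ms < 30 else "voiceless (/p t k/)"
    elif language == "Spanish":
        # Spanish contrasts prevoiced (negative VOT) /b d g/ with
        # short-lag (~0-30 ms) /p t k/.
        return "voiced (/b d g/)" if vot_ms < 0 else "voiceless (/p t k/)"
    raise ValueError(language)

# A short-lag token (VOT = 15 ms) is phonemically /b/-like in English
# but /p/-like in Spanish -- the phonemic vs. phonetic mismatch at issue.
print(classify_stop(15, "English"))  # voiced (/b d g/)
print(classify_stop(15, "Spanish"))  # voiceless (/p t k/)
```

The same acoustic token thus receives different phonemic labels depending on the language whose categories are applied, which is exactly why phonemic and phonetic matching can dissociate across Spanish and English.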

While phonetic mismatch of VOT for Spanish and English stops may be one reason for lack of co-activation with unmanipulated stimuli in the Ju and Luce (2004) study, a study by McMurray, Tanenhaus, Aslin & Spivey (2003) indicates that this may not be the case. In their study, phonetically mismatched VOTs induced co-activation within the same language. An auditory 4-image visual world eye-tracking paradigm was used with English monolinguals where the target and competitor differed only on onset phoneme VOT (e.g. bear and pear). Audio targets were presented with a range of VOT onsets including those far and close to the categorical /b/-/p/ boundary. Mouse-click data were collected to examine the categorical boundary between /b/ and /p/, while eye-tracking data were sensitive to possible differences between the processing of items that were categorized the same way by mouse clicks. Eye fixations showed that the similar onset competitors were fixated significantly more than an unrelated image, and when the VOT was close to the categorical boundary, competitor fixations were higher than when it was far from the boundary. This indicates that within a language, mismatched VOTs can still lead to co-activation. Thus, the lack of co-activation found for mismatched VOTs in Ju and Luce (2004) may be due to the fact that the stimuli from different VOT categories came from two different languages.

Ju and Luce’s study suggests that phonetic matching between cross-linguistic cohort stimuli may be more effective at inducing co-activation than phonemic matching. Yet, many other studies have observed co-activation with only phonemic matching of stimuli (Blumenfeld & Marian, 2007; Canseco-Gonzalez et al., 2010; Chambers & Cooke, 2009; Marian, Blumenfeld, & Boukrina, 2008; Marian & Spivey, 1999, 2003a, 2003b). Due to differences in the language pairs tested and the proficiency profiles of participants, it is difficult to directly compare phonemic and phonetic matching across studies. The purpose of this study therefore was to compare co-activation induced by phonetic versus phonemic matching of stimuli, in order to address inconsistencies in previous findings. The level of matching (phonetic vs. phonemic) that induces co-activation also has implications for theoretical models of bilingualism.

Theoretical accounts of bilingual lexical processing presume overlap at varying levels of the bilingual’s two languages. The Bilingual Language Interaction Network for Comprehension of Speech (BLINCS; Shook & Marian, 2013) and the Bilingual Model of Lexical Access (BIMOLA; Grosjean, 1997) both explain bilingual co-activation as the result of interactive mapping within and between language levels. The two models differ in that the BIMOLA model posits a shared set of phonetic features that are activated as the first step in the auditory processing chain, while the BLINCS model posits a shared phonological level where phonemes are activated. This first level of activation (phonetic features for BIMOLA; phonemes for BLINCS) is shared across languages and is key to resulting co-activation at higher lexical levels. If, as in the BIMOLA model, individual features were activated first, then words with initial phonemes that share more features would result in more co-activation than words with initial phonemes that share fewer features. In contrast, phonemes in the BLINCS model are made up of three-dimensional feature vectors and are positioned such that phonemes with similar vectors, for example stops, are placed near each other. However, the mere presence of a feature does not activate all phonemes which have the feature in their vectors; rather, phonemes that are close to each other activate each other. Crucially, the two models lead to fundamentally different predictions with respect to phonetic overlap and its effect on co-activation of bilinguals’ two languages. BIMOLA, but not BLINCS, would predict that a higher degree of overlap in phonetic features between words across two languages should yield higher levels of co-activation.

Interestingly, although the linguistic level of overlap in stimuli (phonetic vs. phonemic) meant to probe cross-linguistic co-activation is important to both theory-building and experimental design, it has not received much attention in the empirical literature. Most studies report the number of overlapping phonemes (Canseco-Gonzalez et al., 2010; Blumenfeld & Marian, 2007; Ju & Luce, 2004; Marian & Spivey, 2003a, 2003b; Shook & Marian, 2017) and/or phonetic features (Canseco-Gonzalez et al., 2010; Blumenfeld & Marian, 2007; Marian & Spivey, 2003b) to quantify the overlap and to keep the amount of overlap constant between experimental conditions. However, the precise way in which overlap is defined, and its extent, may fundamentally dictate the degree of cross-linguistic activation observed.

1.3. Current Study

Research on cross-linguistic co-activation has mainly matched cross-language stimuli phonemically despite the fact that the same phonemic category does not necessarily overlap phonetically across languages. Two experiments were conducted to examine whether cross-linguistic co-activation is influenced by the type of matching of word-onsets across languages. Pairing words phonemically was tested in Experiment 1, and pairing words phonetically was tested in Experiment 2. In both experiments, we examined cross-linguistic co-activation in two types of bilinguals: Spanish-L1 bilinguals who acquired Spanish from birth and English after age 3, and Spanish-L2 bilinguals who acquired English from birth and Spanish after age 3. All participants performed the cross-linguistic activation task in English. This design allows for a comparison between groups on a single controlled set of stimuli, an advantage over a within-subject design in which bilinguals are tested on different stimulus sets in their two languages. In line with predictions based upon BIMOLA and the results of Ju and Luce (2004), one hypothesis was that phonetic matching would lead to more co-activation than phonemic matching, since phonetic matching would yield more similarity between words on an auditory task. In contrast, in line with predictions based upon BLINCS, an alternative hypothesis was that phonetic matching would not necessarily lead to more co-activation than phonemic matching. Whatever the pattern of results, it was expected to clarify how differences in stimulus matching in past studies might have affected the strength of observed co-activation. The experiments were also expected to clarify the effect of proficiency on the strength of co-activation. We hypothesized that the Spanish-L1 bilinguals would exhibit more co-activation than the Spanish-L2 bilinguals, given previous research showing that an L1 is more likely to be co-activated than an L2 during cross-linguistic tasks.

2.0. Experiment 1: Co-activation of phonemically matched onsets

2.1. Method

2.1.1. Participants

Three groups of participants were tested: monolingual English controls (n = 23), English native speakers with L2 Spanish knowledge (Spanish-L2 bilinguals, n = 31), and Spanish native speakers with L2 English knowledge (Spanish-L1 bilinguals, n = 23). All participants were adults between ages 18 and 40 who lived in Madison, WI at the time of testing. Inclusionary criteria for the monolingual group included acquisition of English as a native language, and self-rated Spanish abilities of 0 or 1 on a scale of 0 (no Spanish abilities) to 10 (perfect Spanish speaker). Inclusionary criteria for the Spanish-L2 bilinguals included acquisition of English before age 3, acquisition of Spanish after age 3, and self-rated Spanish abilities of 5 or above. Inclusionary criteria for the Spanish-L1 bilinguals included acquisition of Spanish at or before age 3, acquisition of English at or after age 3, and self-rated English abilities of 5 or above. The majority of the Spanish-L1 bilinguals used a Mexican dialect of Spanish. Exclusionary criteria included diagnosis of a language, learning, or hearing disorder, and a self-rating above 5 of spoken language abilities in a language other than English or Spanish. For those with experience with additional languages (2 monolinguals, 6 Spanish-L1 bilinguals, 2 Spanish-L2 bilinguals), proficiency ratings averaged 3.6 (SD = 1.4).

Participant characteristics are shown in Table 1. The English abilities of all participants were indexed by the standard score on the Peabody Picture Vocabulary Test, Fourth Edition (PPVT-4, Dunn & Dunn, 2007). The Spanish abilities of the bilingual participants were indexed by raw scores on the Test de Vocabulario en Imagenes Peabody (TVIP, Dunn, Padilla, Lugo, & Dunn, 1986). Aspects of language acquisition and exposure history were obtained from the Language Experience and Proficiency Questionnaire (LEAP-Q, Marian, Blumenfeld, & Kaushanskaya, 2007). The monolingual and Spanish-L2 groups did not differ significantly in age (t (26.83) = 1.30, p = .20) or English abilities (t (41.70) = 1.88, p = .07). Spanish-L1 bilinguals were significantly older than the monolinguals (t (32.99) = 14.49, p < .001) and the Spanish-L2 bilinguals (t (23.29) = 5.68, p < .001), and their English language scores were significantly lower than those of both monolinguals (t (30.15) = 2.47, p < .05) and Spanish-L2 bilinguals (t (25.84) = 3.64, p < .01). The Spanish-L2 bilingual group had significantly lower scores in Spanish than the Spanish-L1 group (t (45.14) = 7.76, p < .001).

Table 1.

Experiment 1 Participant Characteristics

                     Monolingual English   Spanish-L2 Bilingual   Spanish-L1 Bilingual   t(1)
N                    23                    31                     23
Age                  21.33 (3.57)          20.30 (1.37)           28.59 (6.89)           5.68***
AOA English(2)       Birth                 Birth                  10.13 (7.39)
AOA Spanish(2)       -                     10.13 (3.68)           Birth
English Vocab(3)     107.57 (9.96)         112.37 (8.07)          95.64 (20.43)          3.64**
Spanish Vocab(4)     -                     75% (8)                91% (6)                7.76***

Note: (1) Comparison of bilingual groups; (2) AOA = age of acquisition; (3) Indexed by standard score on the Peabody Picture Vocabulary Test (PPVT-4); (4) Indexed by raw score on the Test de Vocabulario en Imagenes Peabody (TVIP). ** p < .01; *** p < .001.

2.1.2. Stimuli

Two hundred and eighty picturable nouns with high lexical frequency and rated as having an early age of acquisition in both English and Spanish were used to create the visual world paradigm displays. Stop-initial target words (/b, d, g, p, t, k/) were matched with a Spanish competitor with the same stop-initial phonemic onset, and with two other unrelated words that did not overlap phonemically or semantically in either English or Spanish. For example, the English target ‘desk’ was matched with the Spanish competitor desarmador (screwdriver) and the unrelated words ring and caterpillar. The first unrelated word was matched as closely as possible to the target word in age of acquisition and lexical frequency in both English and Spanish, and the second unrelated word was matched similarly to the competitor word. Since high-frequency words are more likely to be fixated (Dahan, Magnuson, & Tanenhaus, 2001), this method was used to ensure that fixations to the competitor and the second unrelated image could be directly compared. All analyses focused on comparing looks to the competitor image and the frequency-matched unrelated image within a trial to index co-activation (following Mirman, 2014). Target and competitor items had on average 1.4 overlapping phonemes (SD = 0.6) and 4.7 overlapping phonetic features (SD = 1.5) at the onset (defined as the first 3 phonemes). Consonant features included place, manner, and voicing, and vowel features included height, fronting, and tenseness. Average lexical frequency1 of target words was 3.3 (SD = 1.0), of the target-matched unrelated items was 3.3 (SD = 0.8), of competitors was 2.7 (SD = 1.5), and of the competitor-matched unrelated items was 2.6 (SD = 1.3). There was no significant difference between the competitor and competitor-matched unrelated item frequency (t (21) = .31, p = .76).
Average imageability ratings of the words from the MRC database (Coltheart, 1981; on a scale of 100–700) for target items was 585 (SD = 34) and for competitor items was 594 (SD = 24). Black-and-white line drawings of the words were gathered from the International Picture Naming Database (Szekely et al., 2004) and Snodgrass & Vanderwart (1980) and were used to create visual displays. The name agreement of the images was normed on twenty-one adults in English to ensure they matched our target word list. The average name agreement of the target items was 0.91 (SD = .12) and the agreement of the competitor items was 0.96 (SD = .07). Each display included the four images randomly placed in one of the four set locations equally far from the center (see Figure 2). Filler trials, where none of the 4 images on the screen were related in any way were also created to present randomly between the trials of interest. There were 22 trials of interest, and 43 filler trials2. See Appendix A for a full list of stimuli. Each word was recorded by a 20-year-old native-English female speaker from Midwest USA in a sound-attenuated booth. Average VOT of the target word-initial voiced stops (M = 18.7 ms, SD = 15.9) and voiceless stops (M = 82.7 ms, SD = 17.2) fell within typical ranges for English speakers. No digital manipulation was applied to the VOT produced by the native speaker. Digital recordings were performed in Adobe Audition at a 44.1kHz sampling rate and all items were intensity-normalized. Words were presented without a carrier phrase since phrase-level coarticulation might affect bilingual groups in different ways (Chang, 2016).
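One possible operationalization of the onset-overlap counts reported above (with the onset defined as the first 3 phonemes) is a position-by-position comparison of phoneme strings. This is our sketch, not necessarily the exact procedure used to norm the stimuli, and the transcriptions below are rough illustrations rather than entries from the stimulus list.

```python
# Hypothetical sketch: count position-matched phonemes in the onset
# (first n phonemes), one way the overlap statistics could be computed.

def onset_overlap(target, competitor, n=3):
    """Count positions among the first n phonemes where the two words match."""
    return sum(a == b for a, b in zip(target[:n], competitor[:n]))

# English 'desk' vs. Spanish 'desarmador' (screwdriver), rough transcriptions;
# English /E/ and Spanish /e/ are treated as distinct phonemes here.
desk = ["d", "E", "s", "k"]
desarmador = ["d", "e", "s", "a", "r"]
print(onset_overlap(desk, desarmador))  # 2
```

Under this scheme the desk/desarmador pair contributes 2 overlapping onset phonemes, consistent in spirit with the average of 1.4 reported for the stimulus set.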

Figure 2. The eye-tracking display for a trial in which the target is desk, the competitor is screwdriver (desarmador), and the unrelated images are ring and caterpillar.

2.1.3. Procedure

After consenting in English, participants performed the experimental task, followed by all other tests and questionnaires. All experimenters were native English speakers. Participants were seated in front of a 60 Hz Tobii Pro T60XL eye-tracker, which was calibrated to ensure good tracking. Following 6 practice trials which did not include any experimental stimuli, participants completed 70 test trials. During a trial, the 4-image display appeared, and after 750ms a black dot appeared in the center of the screen with the 4 images still present. Participants were instructed to click on the dot to hear the target word. This was done to ensure that fixations at the onset of the auditory stimulus were not on any of the images. Immediately upon the mouse click, the auditory stimulus played and participants were instructed to click on the picture of the word they heard. Each trial was followed by a 500ms interstimulus interval. Participants were given two 15-second breaks to close their eyes during the course of the task to avoid fatigue due to the brightness and proximity of the screen.

2.1.4. Analyses

Growth curves were modeled to examine differences between looks to the competitor and the frequency-matched unrelated image. Data cleaning and analyses were performed with the eyetrackingR (Dink & Ferguson, 2018), lme4 (Bates, Maechler, Bolker, & Walker, 2015), and afex (Singmann, Bolker, Westfall & Aust, 2017) packages in R. Only trials where the correct answer was chosen in the mouse-click data were included in the growth curve analyses. Following Mirman (2014), the empirical logit of looks to each image in 50ms bins was calculated for each trial. Curves were modeled over the time window from 200ms to 1000ms: 200ms is approximately how long it takes to initiate an eye-movement based on an auditory stimulus, and most participants responded in about 1000ms (Matin, Shao, & Boff, 1993; McMurray, Samelson, Lee, & Tomblin, 2010).
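The empirical-logit transform on binned looks can be sketched as follows. This is a minimal illustration rather than the study's own R code: the sample counts are invented, and the half-sample correction follows the standard formulation for aggregated binomial eye-tracking data.

```python
import math

def empirical_logit(looks, total):
    """Empirical logit of fixating an image within a bin:
    log((y + 0.5) / (N - y + 0.5)), which stays finite at y = 0 or y = N."""
    return math.log((looks + 0.5) / (total - looks + 0.5))

# With a 60 Hz tracker, a 50 ms bin contains roughly 3 samples per trial.
samples_per_bin = 3
looks_per_bin = [0, 1, 2, 3]  # hypothetical counts across four consecutive bins
elogits = [empirical_logit(y, samples_per_bin) for y in looks_per_bin]
# Values rise monotonically as looks to the image increase.
```

The transformed values, rather than raw proportions, are then submitted to the growth curve models, which avoids the boundedness problems of modeling proportions directly.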

The area of interest for each of the 4 images was defined as a 450 by 450 pixel border around the image. Looks to areas of the screen other than the 4 images, such as the fixation circle, were treated as missing. Although participants served as their own control within trials (competitor vs. unrelated image), and despite our efforts to match image complexity, once the monolingual data were collected, it became apparent that some pictures were inherently more interesting than others. This factor could potentially obscure true co-activation effects or yield false positives. Therefore, items with more than 60% looks to a competitor or an unrelated image at any point during the 200–1000ms time window for monolinguals were removed from analyses for all participants. The determination of the cut-off was data-driven to enable the inclusion of many trials while ensuring exclusion of obvious outliers. The same cutoff was used in both experiments. As a result of this cleaning procedure, two target trials which elicited extremely high looks to the unrelated image from monolinguals were removed from the analyses for all participants. Both human and computer error led to missing data. Data cleaning led to the elimination of any trials where 50% of the eye-tracking data was missing across the time window, and exclusion of participants where more than 50% of their total data points were missing. As a result, 818 trials and 8 participants were excluded: 3 monolinguals, 2 Spanish-L1 bilinguals, and 3 Spanish-L2 bilinguals (these participants were also excluded from Table 1). On average, 17% of data from remaining participants was missing after these cleaning procedures.
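The trial- and participant-level missing-data rules described above can be sketched as follows. This is our illustration, not the study's code: the data structures are invented, and treating the trial-level 50% threshold as "at least half missing" is an assumption about how the cut-off was applied.

```python
# Hypothetical sketch of the missing-data exclusion rules described above.
# A trial is a list of gaze samples; None marks a missing sample.

def missing_fraction(samples):
    """Fraction of samples in a trial that are missing."""
    return sum(s is None for s in samples) / len(samples)

def keep_trial(samples, threshold=0.5):
    # Drop trials where at least `threshold` of the samples are missing.
    return missing_fraction(samples) < threshold

def keep_participant(trials, threshold=0.5):
    # Drop participants missing more than `threshold` of all data points.
    total = sum(len(t) for t in trials)
    missing = sum(sum(s is None for s in t) for t in trials)
    return missing / total <= threshold
```

Applying the trial rule first and the participant rule to the remaining data reproduces the two-stage cleaning order described in the text.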

Orthogonal polynomial time variables including linear, quadratic, and cubic time were included in the model. In the model of interest, the empirical logit of looks to images was regressed on fixed effects of the orthogonal polynomial time variables, the effect of image (unrelated, competitor), the effect of group (Spanish-L2, Spanish-L1), and all higher-order interactions between the variables of interest and the set of time variables. Only effects including the image term were interpreted, as they were the only ones that indexed differences between looks to the competitor and the unrelated image. Although there were some significant effects of group in the models, they were not interpreted because they reflected the expected differences in English language ability across groups and not co-activation. Random effects included in the model were the by-participant random effects for all the orthogonal time variables, image, and the interaction between image and all time variables, but due to difficulty with model convergence, covariances between random effects were not included. The two dichotomous variables, image and group, were contrast coded (−0.5, 0.5) such that the unrelated image and the Spanish-L2 bilinguals were assigned the lower value, respectively. Parameter estimates can be used as quasi-effect-size measures, while model comparisons are better tests of significance.
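In R, the orthogonal time terms described above correspond to poly(time, 3). As a sketch of what those predictors look like, they can be constructed from a raw polynomial basis via QR decomposition; this is an illustration under that construction, not the study's code.

```python
import numpy as np

def orthogonal_time_terms(time, degree=3):
    """Return orthogonal linear, quadratic, and cubic time predictors
    (ot1-ot3), analogous to R's poly(time, degree)."""
    t = np.asarray(time, dtype=float)
    basis = np.vander(t, degree + 1, increasing=True)  # columns: 1, t, t^2, t^3
    q, _ = np.linalg.qr(basis)                         # orthonormalize in order
    return q[:, 1:]                                    # drop the intercept column

bins = np.arange(200, 1001, 50)  # the 50 ms bins spanning 200-1000 ms
ot = orthogonal_time_terms(bins)
# Columns are mutually orthogonal and orthogonal to the intercept, so the
# linear, quadratic, and cubic effects do not trade off with one another.
```

Contrast coding the dichotomous predictors as −0.5/0.5, as in the text, likewise centers them so that the time effects are estimated at the average of the two levels rather than at a reference level.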

A separate model was run for monolinguals to ensure that any significant results found for bilinguals could be interpreted as indexing cross-linguistic co-activation. See the Monolingual graph in Figure 3 for the time course of looks. None of the effects related to image was significant, including the image intercept (χ2(1) = .02, p = .88), image by slope (χ2(1) = .82, p = .37), image by quadratic (χ2(1) = .12, p = .73), and image by cubic (χ2(1) = .21, p = .65) effects. This pattern ensures that should any effects of image be observed for bilinguals, they could be reliably interpreted as effects of cross-linguistic co-activation.

Figure 3.


Time course of looks to the target, competitor, and matched unrelated image for each group in Experiment 1. Stops are matched phonemically.

2.2. Results

See the middle and right graphs in Figure 3 for a visual representation of the looks to each image over time for bilingual groups. Significant effects related to the lowest order terms (those that do not include a time variable) indicate differences in average fixations, or average height of the curve across the 200ms-1000ms time period. The addition of the image term significantly improved the model (χ2(1) = 5.33, p < .05) such that overall there were more looks to the competitor than the unrelated image (β = .07, SE = .03). No other low-order terms related to image significantly improved the model. Significant effects in the linear time domain (any term including ot1) represent differences in the slope of the eye-tracking curve and can be interpreted as the average velocity of eye-movements across the time period. The addition of the interaction between linear time and image did not significantly improve the model (χ2(1) = 2.32, p = .13). Quadratic time effects (ot2) indicate a steeper change in the curve at the middle of the eye-tracking curves and can be interpreted as the acceleration of the eye-movements over the time window. There was no significant model improvement with the addition of the interaction between quadratic time and image (χ2(1) = 2.42, p = .12). Cubic time terms indicate the difference in steepness of the two peaks at the extremes of the curve and can be interpreted as the change in acceleration over the time period. There was no significant improvement in the model with the addition of the interaction between cubic time and image (χ2(1) = 2.11, p = .15). No other effects related to image were significant. See Table 2 for the full model results.

Table 2.

Experiment 1 Growth Curve Model

b SE df χ2 95% CI

Intercept −1.35 0.02 1 249.52*** [−1.39, −1.32]
ot1 −1.36 0.06 1 129.65*** [−1.47, −1.24]
ot2 0.17 0.05 1 9.56** [0.07, 0.27]
ot3 0.22 0.03 1 32.31*** [0.15, 0.28]
Image 0.07 0.03 1 5.33* [0.01, 0.13]
ot1 x Image −0.21 0.14 1 2.32 [−0.47, 0.08]
ot2 x Image 0.19 0.12 1 2.42 [−0.06, 0.43]
ot3 x Image 0.14 0.10 1 2.11 [−0.04, 0.32]
Group 0.05 0.04 1 1.51 [−0.03, 0.12]
ot1 x Group 0.01 0.11 1 0.00 [−0.21, 0.21]
ot2 x Group −0.20 0.11 1 3.45 [−0.41, −0.02]
ot3 x Group 0.06 0.06 1 0.96 [−0.07, 0.18]
Image x Group 0.04 0.06 1 0.37 [−0.07, 0.16]
ot1 x Image x Group 0.04 0.27 1 0.02 [−0.50, 0.55]
ot2 x Image x Group 0.23 0.25 1 0.87 [−0.29, 0.69]
ot3 x Image x Group −0.33 0.19 1 3.04 [−0.70, 0.07]

Note. ot1=linear orthogonal time, ot2 = quadratic orthogonal time, ot3= cubic orthogonal time, χ2 represents likelihood ratio test for each fixed effect compared to the full model. The syntax of the R-code for the fixed and random effects of the model is: Elog~ (ot1 + ot2 + ot3) * Image * Group + ((ot1+ot2+ot3) * Image | | Participant).

* p<.05, ** p<.01, *** p<.001

2.3. Discussion

The goal of Experiment 1 was to determine whether phonemically matched English-Spanish lexical items would co-activate bilinguals’ two languages, and whether bilinguals with different patterns of language experience would display different patterns of co-activation. Both bilingual groups looked at the competitor image more than the unrelated image, indicating co-activation of Spanish. Group differences in co-activation were not captured by the model. Visually, however, the Spanish-L2 bilinguals showed increased co-activation of the Spanish competitor immediately after presentation of the English target, but this activation returned to baseline by about 500ms. Spanish-L1 bilinguals, on the other hand, showed dampened but constant co-activation of the Spanish competitor up to 1000ms after presentation of the English target. Group differences in the levels of co-activation were expected, in that those with native-like Spanish experience should exhibit more activation of Spanish during English auditory processing. However, there was no indication that the Spanish-L1 bilinguals experienced stronger co-activation than the other group: average co-activation during the initial second was similar in both groups.

Most importantly, our findings, indicating that a phonemic match (in the absence of a perfect match in phonetic features) induces cross-linguistic co-activation in bilinguals, contrast sharply with the findings of Ju and Luce (2004). In their study, items highly similar to ours were only co-activated when phonetic features of the target item were edited to match the phonetic features of the competing language. Further, Ju and Luce (2004) did not observe co-activation when testing only voiceless stop-initial words. To test whether our results differed for voiced and voiceless stop-initial words, a post-hoc comparison of voiced and voiceless stop trials was performed. As there were not enough trials to perform a growth-curve analysis, the time variables were removed from the model in order to simply compare looks across the whole time window. The effect of voicing (voiced vs. voiceless trials, contrast coded −0.5, 0.5) was added to the model, along with its interactions with image and group. The results of the model showed that image significantly interacted with voicing (β = .03, SE = .01, χ2(1) = 4.43, p < .05) such that there was a significant difference in overall looks to the competitor and unrelated image for the voiceless (β = .03, SE = .01, χ2(1) = 11.07, p < .001) but not the voiced trials (β = .002, SE = .01, χ2(1) = .08, p = .78). There was no significant three-way interaction between group, image, and voicing (β = −.01, SE = .03, χ2(1) = 0.07, p = .79). Therefore, the majority of the co-activation in our study came from voiceless stop trials.

These results lend partial support to the BLINCS model (Shook & Marian, 2013) which posits a shared phonemic layer. However, this overlap does not seem to engender the same level of co-activation for every phoneme. In contrast with Ju and Luce (2004) who did not find co-activation for voiceless stops, we found that co-activation in our study was driven by voiceless stop trials. It is possible that the difference is conditioned by the fact that Ju and Luce’s study was performed in Spanish while our participants performed the task in English. When viewed from this perspective, the results from both studies are consistent: in both cases, the long-lag stop induced co-activation of another stop—in our case of a short-lag stop (in Spanish) and in their case of a long-lag stop (in English). Since the driving force behind co-activation in the Ju and Luce study seemed to be phonetic overlap, in our second experiment we tested if phonetic overlap in the absence of phonemic overlap would induce higher levels of co-activation. This is indeed the prediction of the BIMOLA model of bilingual lexical processing.

3.0. Experiment 2: Co-activation of phonetically matched onsets

3.1. Method

3.1.1. Participants

A new set of monolinguals (n = 24) and Spanish-L2 bilinguals (n = 32) was recruited for this task. However, the same group of Spanish-L1 bilinguals performed both Experiments 1 and 2, counterbalanced across 2 sessions scheduled about a week apart. This is because recruitment of Spanish-L1 bilinguals proved quite challenging, and it was more expedient to test this group on both experiments to ensure timely data collection. Three participants in this group did not return for Experiment 2, and computer error led to one participant’s data from this group not being recorded; therefore, data from 19 participants in the Spanish-L1 group were used. Participant characteristics for Experiment 2 are shown in Table 3. All group differences were similar to those obtained in Experiment 1. The monolingual and Spanish-L2 groups did not differ significantly in age (t (53.99) = 0.27, p = .78), but Spanish-L2 bilinguals had slightly higher English abilities (t (42.24) = 2.11, p < .05). Spanish-L1 bilinguals were significantly older than the monolinguals (t (22.24) = 4.13, p < .001) and Spanish-L2 bilinguals (t (23.64) = 4.19, p < .001), and their English language scores were significantly lower than those of both monolinguals (t (27.91) = 2.80, p < .01) and Spanish-L2 bilinguals (t (23.07) = 4.38, p < .001). The Spanish-L2 bilingual group had significantly lower scores in Spanish than the Spanish-L1 group (t (46) = 7.99, p < .001). For those with experience with a third language (5 monolinguals, 4 Spanish-L1 bilinguals, 3 Spanish-L2 bilinguals), ratings of language abilities in the other languages averaged 3.3 (SD = 1.5).

Table 3.

Experiment 2 Participant Characteristics

Monolingual English Spanish-L2 Bilingual Spanish-L1 Bilingual t1

N 24 32 19
Age 21.64 (2.79) 21.40 (3.69) 28.88 (7.24) 4.19***
AOA English2 Birth Birth 10.89 (7.97)
AOA Spanish2 - 10.03 (3.37) Birth
English Vocab3 109.09 (13.93) 116.67 (11.50) 92.89 (21.27) 4.38***
Spanish Vocab4 - 76% (8) 91% (5) 7.99***

Note: 1 Comparison of bilingual groups; 2 AOA = age of acquisition in years; 3 Indexed by standard score on the Peabody Picture Vocabulary Test (PPVT-4); 4 Indexed by raw score on the Test de Vocabulario en Imagenes Peabody (TVIP).

*** p<.001

3.1.2. Stimuli

Words and images were chosen in a similar way as in Experiment 1, but stops were matched phonetically rather than phonemically. English /b, d, g/ initial words were matched with Spanish /p, t, k/ initial words. Target and competitor items had on average 0.4 overlapping phonemes (SD = 0.6) and 5.8 overlapping features (SD = 1.5) at the onset. Average lexical frequency of target words was 3.6 (SD = 0.8), of target-matched unrelated words was 3.4 (SD = 1.0), of competitors was 3.3 (SD = 1.6), and of competitor-matched unrelated words was 3.0 (SD = 1.4). There was no significant difference between the competitor and competitor-matched unrelated item frequency (t (13) = 1.6, p = .12). Average imageability ratings were 581 (SD = 39) for target items and 591 (SD = 37) for competitor items. The average name agreement was 0.94 (SD = .11) for target items and 0.94 (SD = .10) for competitor items. The VOT of target word-initial voiced stops was within the typical range for English (M = 12.9 ms, SD = 6.3). Filler trials came from the same set as in Experiment 1. The same speaker recorded the words for Experiment 2. Target stimuli were overwhelmingly bilabial-initial stops due to the availability of picturable /b/-initial English and /p/-initial Spanish words, but a near absence of similar alveolar and velar stop pairs in the two languages.

3.1.3. Procedure

The procedure followed the same protocol as Experiment 1, except there were only 48 total testing trials (14 stop trials and 34 filler trials). See Appendix B for a list of stimuli.

3.1.4. Analyses

The same analytical approach was taken as in Experiment 1. One trial with a high proportion of looks to the unrelated image in the monolingual group was excluded from analyses. As a result of the data cleaning procedures, 437 trials and 6 participants were excluded: 1 monolingual, 2 Spanish-L2 bilinguals, and 3 Spanish-L1 bilinguals (not included in the participant characteristics). On average, 16% of data from the remaining participants was eliminated after these cleaning procedures. See the Monolingual graph in Figure 4 for the time course of looks to the images of interest. For the monolinguals, none of the effects related to image was significant, including the image intercept (χ2(1) = .18, p = .67), image by slope (χ2(1) = .00, p = .99), image by quadratic (χ2(1) = 1.12, p = .29), and image by cubic (χ2(1) = .00, p = .97) time effects. Therefore, monolingual data in Experiment 2 suggested a well-controlled stimulus set that would be unlikely to reveal false co-activation in bilinguals.

Figure 4.


Time course of looks to the target, competitor, and matched unrelated image for each group in Experiment 2. Stops are matched phonetically.

3.2. Results and Discussion

See Figure 4 for the time course of fixations for bilinguals. The addition of image did not significantly improve the model (χ2(1) = .86, p = .35). The addition of the interaction between linear time and image did not significantly improve the model (χ2(1) = .80, p = .37). The addition of the interaction between quadratic time and image did not significantly improve the model (χ2(1) = 3.48, p = .06). There was no significant model improvement with the addition of the interaction between cubic time and image (χ2(1) = 2.24, p = .13). None of these effects significantly interacted with group. See Table 4 for the full model results.

Table 4.

Experiment 2 Growth Curve Model

b SE df χ2 95% CI

Intercept −1.41 0.02 1 250.68*** [−1.44, −1.37]
ot1 −1.30 0.08 1 110.68*** [−1.42, −1.18]
ot2 0.21 0.07 1 9.97** [0.09, 0.33]
ot3 0.27 0.06 1 18.95*** [0.15, 0.38]
Image −0.04 0.04 1 0.86 [−0.11, 0.05]
ot1 x Image 0.15 0.16 1 0.80 [−0.17, 0.47]
ot2 x Image −0.23 0.12 1 3.48 [−0.45, 0.02]
ot3 x Image 0.13 0.08 1 2.24 [−0.04, 0.29]
Group 0.12 0.03 1 10.57** [0.05, 0.18]
ot1 x Group −0.10 0.13 1 0.62 [−0.33, 0.14]
ot2 x Group −0.32 0.13 1 5.88* [−0.57, −0.07]
ot3 x Group 0.18 0.11 1 2.49 [0.07, 0.41]
Image x Group −0.04 0.08 1 0.22 [−0.21, 0.12]
ot1 x Image x Group 0.13 0.32 1 0.16 [−0.51, 0.79]
ot2 x Image x Group 0.08 0.24 1 0.10 [−0.38, 0.52]
ot3 x Image x Group 0.06 0.17 1 0.13 [−0.27, 0.38]

Note. ot1=linear orthogonal time, ot2 = quadratic orthogonal time, ot3= cubic orthogonal time, χ2 represents likelihood ratio test for each fixed effect compared to the full model. The syntax of the R-code for the fixed and random effects of the model is: Elog~ (ot1 + ot2 + ot3) * Image * Group + ((ot1+ot2+ot3) * Image | | Participant).

* p<.05, ** p<.01, *** p<.001

Although we expected phonetic matching to elicit greater co-activation than phonemic matching, in Experiment 2, we did not find any evidence of cross-linguistic co-activation. Furthermore, the lack of co-activation for the phonetically matched words was observed across both of our bilingual groups. Language background did not modulate co-activation of phonetically matched stops in any way.

3.3. Cross-experiment comparison

In order to determine whether the differences in the magnitude of co-activation found for phonemic and phonetic overlap were statistically significant, we combined all our data to examine the effect of experiment (Experiment 1, Experiment 2). Since experiment was within-participants for the Spanish-L1 group but between-participants for the Spanish-L2 group, we ran separate models for each group. The model structure was the same as in the initial analyses, but the effect of group was replaced by the effect of experiment. A random by-participant slope for experiment was included for the Spanish-L1 model but not the Spanish-L2 model. We also removed the random by-participant effect of cubic time and all its interactions from the Spanish-L1 model to avoid singularity warnings. We only examined effects which included an interaction between image and experiment, because these indexed differences in co-activation across experiments.

For the Spanish-L1 group, there was no significant interaction between image and experiment (β = −.14, SE = .01, χ2(1) = 2.48, p = .12), among image, experiment, and linear time (β = .39, SE = .36, χ2(1) = 1.16, p = .28), among image, experiment, and quadratic time (β = − .49, SE = .26, χ2(1) = 3.33, p = .07) or among image, experiment, and cubic time (β = 1.46, SE = .16, χ2(1) = .87, p = .35).

For the Spanish-L2 group, we found that there were no significant interactions between image and experiment (β = −.09, SE = .05, χ2(1) = 2.95, p = .09), among image, experiment and linear time (β = .39, SE = .24, χ2(1) = 2.69, p = .10), among image, experiment and quadratic time (β = −.30, SE = .21, χ2(1) = 2.04, p = .15), or among image, experiment, and cubic time (β = −.24, SE = .16, χ2(1) = 2.40, p = .12).

Together, the cross-experimental comparisons revealed that the difference in magnitude of co-activation observed for Experiment 1 but not Experiment 2 was not robust enough to indicate stronger co-activation in Experiment 1 than in Experiment 2.

4.0. General Discussion

Across two eye-tracking experiments, we examined the effect of phonemic vs. phonetic onset matching of stimuli on cross-linguistic co-activation in two bilingual groups (Spanish-L1 and Spanish-L2 bilinguals). We hypothesized that if, as predicted by BIMOLA, phonetic features were first activated in a shared language level, then phonetically matched words would lead to the most co-activation. If, as BLINCS predicts, the first shared language level is phonological, then the phonemically matched words would lead to the most co-activation. We found significant co-activation in both bilingual groups when stimuli were matched phonemically (English /p/-Spanish /p/), but found no indication of co-activation when stimuli were matched phonetically (English /b/-Spanish /p/). Further, we found that voiceless stops were driving phonemic co-activation. In a cross-experiment comparison, however, the difference in magnitude between phonemic and phonetic co-activation was not significant. Although these results are more in line with the predictions of BLINCS than those of BIMOLA, an exploration of different types of phonemes would be necessary to definitively favor the BLINCS model. We also hypothesized that Spanish-L1 bilinguals would demonstrate a stronger co-activation pattern than Spanish-L2 bilinguals, since both groups performed the task in English. We did not observe differences in the average amount of co-activation between the two groups across the full analysis window; rather, we found that both groups co-activated phonemically matched stimuli but not phonetically matched stimuli. Though the difference was not significant, the time course of activation in the phonemically matched experiment differed slightly between groups, in that co-activation appeared early and robustly in word processing for the Spanish-L2 bilinguals and was dampened but constant for Spanish-L1 bilinguals.

Our results contrast sharply with Ju and Luce’s (2004) findings. Similar to our Experiment 1, Ju and Luce’s target and competitor items were from the same phonemic category; however, in their study, phonemic overlap alone was not enough to induce co-activation of voiceless stops, but phonemic in addition to phonetic overlap was. Since the addition of phonetic overlap did lead to co-activation for Ju and Luce (2004), it can be inferred that phonetic overlap drove co-activation in their study. In contrast, in our study, phonemic overlap induced co-activation while phonetic overlap alone did not, and the results of our Experiment 2 contraindicate that phonetic overlap drives cross-linguistic co-activation. How then do we reconcile the findings across the two studies?

If we examine VOT as three separate categories (short-lag, long-lag, and pre-voicing) rather than ‘voiced’ and ‘voiceless’ across two languages, our findings may not be inconsistent after all. Ju and Luce (2004) found that short-lag stops (Spanish voiceless) did not co-activate long-lag stops (English voiced) but that long-lag stops (Spanish manipulated) did co-activate other long-lag stops (English voiced). Because our test language was English and not Spanish as in Ju and Luce (2004), we tested different types of VOT matches. We found that long-lag stops co-activate short-lag stops (Experiment 1), but that short-lag stops do not co-activate short-lag stops (Experiment 2) or pre-voiced stops (Experiment 1). Taken together, this indicates that long-lag stops can induce co-activation across both a similar and different category while short-lag stops do not induce co-activation across any categories. This conclusion is surprising since short-lag stops are the only category that exist in both English and Spanish, and their existence was our motivation for testing phonetic co-activation to begin with. If the absence of a phonetically similar category across languages is driving co-activation, it is possible that both long-lag and pre-voiced stops would lead to co-activation while short-lag stops would not. However, there does not seem to be a compelling explanation of why this would be the case. A more comprehensive comparison of all types of VOT categories and their co-activation of each other would be necessary to move this line of inquiry further. It would also be imperative to test co-activation in both directions within a single study. 
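The three-way VOT partition used in this discussion can be made concrete with a small sketch. The boundary values below (0 ms and 30 ms) are rough, textbook-style approximations chosen for illustration, not measurements from our stimuli.

```python
# Illustrative three-way VOT classification (pre-voiced / short-lag /
# long-lag). Boundaries are approximate, not values from this study.
def vot_category(vot_ms):
    if vot_ms < 0:
        return "pre-voiced"  # e.g., Spanish voiced stops
    if vot_ms <= 30:
        return "short-lag"   # e.g., Spanish voiceless / English voiced stops
    return "long-lag"        # e.g., English voiceless (aspirated) stops
```

Under this scheme, the mean target VOT reported for Experiment 2 (12.9 ms) falls in the short-lag band shared by English voiced and Spanish voiceless stops, which is exactly the overlap the experiment exploited.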
In our study, we attempted to take language direction into account by testing bilinguals with both L1 and L2 experience in the language of testing; however, testing different populations in the same language is methodologically and conceptually non-equivalent to testing the same population of bilinguals in both of their two languages. Both approaches are likely necessary in order to understand how co-activation proceeds in each language direction.

One important note with relation to our stimuli is that they were produced naturally by a native English speaker and the recordings were not manipulated in any way. Thus, our stimuli followed typical English norms on more than just VOT. For example, aspiration also differentiates English voiced and voiceless stops but is not present in any Spanish stops. When Ju and Luce (2004) manipulated their stimuli to have English-like VOTs, they also maintained English-like amplitude and aspiration of the spliced stop, meaning more than just VOT differed between the manipulated and un-manipulated Spanish stimuli. Because there were multiple cues to the other language, it should have made it easier for co-activation to take place. On the other hand, since we did not splice or manipulate our stimuli, we made it very difficult for co-activation to emerge. Had we isolated VOT to be the only ‘English-like’ portion of the target word, we may have observed more co-activation. This would also be an interesting route for future research.

We interpret our finding that phonemic matching yielded stronger co-activation than phonetic matching as lending more support to the BLINCS model of bilingual auditory comprehension (Shook & Marian, 2013) than to the BIMOLA model (Grosjean, 1997). However, before embracing this theoretical explanation, it is important to highlight that our statistical comparison of the two experiments did not reveal a significant difference in the magnitude of co-activation between Experiments 1 and 2. Therefore, we only cautiously interpret our findings as favoring the BLINCS over the BIMOLA model. Nevertheless, given that both groups of participants were shown to co-activate phonemic but not phonetic cross-linguistic matches, it is important to consider whether the different results for the two experiments could have arisen for spurious reasons.

First, we consider whether factors specific to the stimulus sets chosen for the two experiments may have led to more robust co-activation in one experiment than the other. Although lexical frequency did not significantly differ between the competitor and the competitor-matched unrelated item within each experiment, the magnitude of the difference between them may have varied across experiments. A 2 × 2 ANOVA revealed no significant interaction between word type (competitor, unrelated) and experiment (F (1, 71) = .46, p = .50) for lexical frequency. Similarly, the difference in lexical frequency between the target and competitor item did not significantly interact with experiment (F (1, 71) = .01, p = .90). The number of overlapping phonemes and features did differ between experiments by design, such that target-competitor sets in Experiment 1 had more overlapping phonemes than those in Experiment 2, while sets in Experiment 2 had more overlapping features than those in Experiment 1. In order to examine whether the phonemes and features beyond the first phoneme could account for differences in the results of Experiments 1 and 2, we compared the number of overlapping phonemes and features for the second and third phonemes across experiments. There were no significant differences between experiments in the number of overlapping phonemes beyond the first (Exp 1: M = .41, SD = .59, Exp 2: M = .43, SD = .65, t(25.9) = .09, p = .93), nor in the number of overlapping features beyond the first phoneme (Exp 1: M = 2.73, SD = 1.45, Exp 2: M = 2.79, SD = 1.48, t(27.5) = .12, p = .91). Therefore, differences in stimulus characteristics across the two experiments are likely not the reason for our findings.
Consequently, we interpret our results as suggesting that during auditory word recognition, a cross-linguistic overlap in some phonemic categories induces stronger lexical competition between the two languages in a bilingual lexical-semantic system than a cross-linguistic overlap in phonetic properties. In the context of theoretical models which have been used to explain cross-linguistic co-activation, our study lends more support to the Bilingual Language Interaction Network for Comprehension of Speech model (BLINCS, Shook & Marian, 2013) than to the Bilingual Model of Lexical Access (BIMOLA, Grosjean, 1997).

The BLINCS model suggests a shared phoneme level between a bilingual’s two languages that does not take into account the differences in phonetic features across similar phonemes cross-linguistically. The BIMOLA model on the other hand suggests a shared feature level in which similar phonemes in the two languages receive graded activation that reflects the number of phonetic features shared. In Experiment 2, our target and competitor pairs shared more phonetic features than in Experiment 1, and yet they did not yield any co-activation. This is counter to what the BIMOLA model would predict. If, as in the BLINCS model, phonemes rather than phonetic features are shared across languages, this may explain why overlapping phonemes (that may differ in their specific phonetic features) work better at inducing co-activation. However, it is important to note that the co-activation that we found in Experiment 1 was mainly driven by co-activation of voiceless stops. Therefore, phonemic overlap alone does not guarantee cross-linguistic co-activation. Even with a shared phoneme level, there may be some phonemes that have stronger cross-linguistic bonds than others. Future experiments will be required to determine why this may be the case.

An alternative, or additional, possibility is that although not visually present, orthography played a role in auditory processing, such that an overlap in orthography strengthened phonemic activation across the two languages. The letters used to represent the initial sounds of our target-competitor matches in Experiment 1 are the same in both English and Spanish, whereas this is not the case for the target-competitor matches in Experiment 2. A lifetime of seeing the letter P represent both Spanish /p/ and English /p/ may strengthen co-activation of the two phonemes in the auditory domain, overriding a lack of overlap in phonetic features. A counter-argument to this hypothesis comes from the original work by Spivey and Marian (1999), who tested Russian-English bilinguals and obtained strong evidence of cross-linguistic co-activation in response to auditory stimuli despite a lack of full orthographic overlap between Russian and English. Further, Shook and Marian (2016) found co-activation of lexical tone in English by Chinese-English bilinguals, despite an absence of any orthographic overlap. Further experiments that manipulate orthographic vs. phonological overlap across languages within a tightly controlled stimulus set would be crucial for testing the role of orthographic overlap in bilingual auditory word recognition.

Another explanation of why a phonemic match seemingly induced more cross-linguistic co-activation may have to do with variability in how bilingual listeners represent their phonemic categories as compared to monolinguals. Experiments are often devised with the assumption that prototypical voiced vs. voiceless stop categorizations in English cleanly align with short- and long-lag stops. However, bilingual adults tend to have category boundaries that fall somewhere in between those of monolingual speakers of each of their languages (Flege & Eefting, 1987). Therefore, a bilingual’s representation of English may include both short- and long-lag stops as valid productions of an English /p/. If a listener’s category of English voiceless stops includes both long- and short-lag stops, this may lead to co-activation of voiceless Spanish stops upon hearing English. Variability in which VOT values belong to the voiced and voiceless categories for bilinguals may have contributed to the co-activation we observed in our study.

Our results yielded only subtle differences between the two bilingual groups, suggesting similar levels of co-activation of a native vs. second language in bilinguals during auditory processing. Both Spanish-L1 and Spanish-L2 bilinguals co-activated Spanish when hearing phonemically matched targets. Neither group co-activated Spanish when hearing phonetically matched targets. The average amount of co-activation was the same for the two types of bilinguals in our first experiment, but the time course of co-activation differed slightly. Spanish-L2 bilinguals showed strong initial co-activation that they were able to resolve quickly; in contrast, Spanish-L1 bilinguals showed dampened but continuous co-activation. The finding that the average total amount of looks to the competitor did not differ between groups is in line with Marian and Spivey (2003a), who also found similar degrees of overall co-activation within participants regardless of language direction. Although they did not conduct time-course analyses, their description states that the competitor in between-language trials was fixated about 475ms after the onset of the target. In contrast, our participants began fixating the competitor as early as 200ms after the presentation of the target. For our Spanish-L2 bilinguals, competition was almost completely resolved by the 500ms mark. Between-language competitor resolution by the 500ms mark is common in studies that do provide time-course data (e.g., Canseco-Gonzalez et al., 2010; Blumenfeld & Marian, 2007). This difference may in part be attributed to the fact that Marian and Spivey’s items were embedded in sentences while ours were single words.

Our study does conflict with findings that co-activation of the native language tends to be stronger than co-activation of a second language (Blumenfeld & Marian, 2007; Canseco-Gonzalez et al., 2010; Marian & Spivey, 2003b; Weber & Cutler, 2004). This could be attributed to the fact that we tested different groups of participants on the same task rather than the same group of participants on two different tasks, as in Spivey and Marian (1999). We did this to avoid the possibility of confounds associated with failing to control stimulus sets in two languages. Canseco-Gonzalez et al. (2010) used an approach similar to ours, in which the same target language and different bilingual populations were tested, and found that only bilinguals who acquired Spanish from birth co-activated Spanish when hearing English words. Differences between our study and Canseco-Gonzalez et al. (2010) include the number of items visible on the eye-tracking screen (3 vs. our 4), targets preceded by a carrier phrase vs. presented as single words, and the characteristics of the stimulus sets used (only stop-initial items in our study vs. inclusion of other word-initial sounds in theirs). The method of analysis also differs, in that Canseco-Gonzalez et al. (2010) grouped looks to the images into epochs, of which the most relevant are Epoch 2 (200–500ms) and Epoch 3 (500–800ms). Their Spanish-native bilingual groups exhibited co-activation during Epoch 2, but not Epoch 3. Compared to our Experiment 1, the visual trends are similar for our Spanish-L2 bilingual group, but our Spanish-L1 bilingual group showed dampened co-activation in what would be classified as both Epoch 2 and Epoch 3. This is likely due to differences in bilingual characteristics between the two studies. A comparison of the experience and use of the second language across our two bilingual groups sheds some light on this factor.
Since both groups were residing in the US at the time of testing, the Spanish-L1 bilinguals likely had more experience with everyday bilingualism. In fact, maintaining a certain level of activation of both languages may be beneficial to this group of bilinguals if they are often in environments where they must switch between their two languages (Fricke, Kroll, & Dussias, 2016). In contrast, the Spanish-L2 bilinguals were likely not in situations where maintaining activation of both languages would be beneficial. We have some support for hypothesizing that the language environments occupied by the two groups differed: for Spanish-L1 bilinguals in Experiment 1, current exposure was 58% English and 40% Spanish, whereas for Spanish-L2 bilinguals it was 89% English and 6% Spanish. Although the Spanish-L2 bilinguals were highly proficient in Spanish, they likely did not have to use Spanish in as many contexts, or as extensively, as the Spanish-L1 bilinguals did. These differences in language experience between the two groups may have affected the time course of activation.

5.0. Conclusions

To conclude, our results fall partly in line with the BLINCS model of bilingual auditory comprehension and indicate that word processing in bilinguals begins with a shared phonological level of representation, which leads to activation of words that share phonemes, but not necessarily phonetic features, across the two languages. Further, this phoneme-level activation does not affect all phonemes in the same manner, with some being more prone to cross-linguistic co-activation than others. Methodologically, our findings suggest that matching stimuli by phoneme may indeed be the best approach when designing cross-linguistic co-activation studies. If phonemic categories are similar enough, activation of both languages is possible even without a perfect phonetic match; in contrast, a close phonetic match in the absence of a phonemic match does not necessarily induce co-activation. The lesson, then, is that stimuli for lexical activation studies must be chosen with great care, especially in studies examining cross-linguistic effects, since very subtle changes in stimulus matching can lead to tangible differences in results.

Highlights.

  • Activation of bilinguals’ languages is driven by phonemic but not phonetic overlap

  • The pattern of activation is similar in both first and second language listeners

  • Some phonemes may be more susceptible to co-activation than others

Acknowledgements

This work was supported by NIH grants R01 DC011750, T32 DC005359, and P30 HD03352. Thanks to Jan Edwards and Bob McMurray for input in the development of Experiment 2. Thanks to all the adults who participated and to the members of the Language Acquisition and Bilingualism Lab and the Learning to Talk Lab for help with piloting, data collection, and feedback.

Appendix A.

Stimulus sets for Experiment 1, shaded items were not included in analyses

Target Competitor Unrelated1 (target) Unrelated2 (competitor)
bowl boca (mouth) fan dog
balloon basura (trash) couch ant
beans bigote (moustache) chicken cake
bone bolsa (purse) roof ladder
bumblebee bombero (fireman) grapes squirrel
desk desarmador (screwdriver) ring caterpillar
duck dulce (candy) bat drawer
drum durazno (peach) mouse brush
goat gorra (hat) witch onion
ghost gusano (worm) shell backpack
gum gancho (hanger) watermelon antler
pot pala (shovel) fence light switch
peanut pie (foot) owl butterfly
pan pelota (ball) cat finger
pillow pinzas (tweezers) hammer ostrich
truck trineo (sled) present rocking chair
toilet tornillo (screw) clock waffle
teeth tijeras (scissors) fish mailbox
cup caballo (horse) neck wheel
carrot queso (cheese) fork soap
comb columpio (swing) broom shoe
corn corbata (tie) flag iron

Appendix B.

Stimulus sets for Experiment 2, shaded items were not included in analyses

Target Competitor Unrelated1 (target) Unrelated2 (competitor)
box pala (shovel) star vase
barn papalote (kite) stove hammer
belt pecho (chest) nail wheel
bowl pollo (chicken) fan milk
balloon pañal (diaper) ant snowman
beans pinzas (tweezers) ice cream dustpan
bell perro (dog) apple shoe
bone pozo (well) chair fence
bed pescado (fish) rain shoulder
bee pie (foot) glasses cloud
desk techo (roof) pitcher helmet
deer tijeras (scissors) backpack ladder
gum caballo (horse) watermelon bridge
ghost corbata (tie) whistle thumb

Footnotes

1

Lexical frequency was measured using the 2015–2017 frequency-per-million measure from the Corpus of Contemporary American English (COCA; Davies, 2008) and the Corpus del Español (Davies, 2016). The measure was transformed by adding one to the cited frequency and then taking the natural log.
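The footnoted transformation can be sketched as follows. The original analyses were conducted in R; this Python version, with made-up frequency values, is only an illustration of the add-one-then-log computation:

```python
import math

def log_frequency(freq_per_million):
    """Natural log of (frequency-per-million + 1), as described in footnote 1."""
    return math.log(freq_per_million + 1)

# Hypothetical corpus frequencies, for illustration only.
print(log_frequency(0))     # a word absent from the corpus maps to 0.0
print(log_frequency(25.3))
```

The add-one correction keeps the transform defined for words that do not occur in the corpus at all.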

2

Five target trials in which the target and competitor matched phonemically, but not orthographically, across languages (e.g. box–vaca /baka/ 'cow') were removed from analyses for all participants, as they did not elicit any co-activation.

3

The syntax of the R code for the fixed and random effects of the combined model for the Spanish-L1 group is: Elog ~ (ot1 + ot2 + ot3) * Image * Experiment + ((ot1 + ot2) * Image * Experiment || Participant). The model for the Spanish-L2 group is: Elog ~ (ot1 + ot2 + ot3) * Image * Experiment + ((ot1 + ot2 + ot3) * Image || Participant).
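The dependent variable Elog in these models is presumably an empirical-logit transform of fixation counts within each time bin, as is standard in growth-curve analyses of visual-world data (Mirman, 2014). A common definition can be sketched in Python; the 0.5 correction term and the example bin size are assumptions for illustration, not values taken from the paper:

```python
import math

def empirical_logit(looks_to_image, total_samples):
    """Empirical logit of fixating an image within one time bin.

    The 0.5 additive correction (an assumed, but conventional, choice)
    keeps the ratio finite when the image receives all, or none, of the
    eye-tracking samples in the bin.
    """
    return math.log((looks_to_image + 0.5) /
                    (total_samples - looks_to_image + 0.5))

# A hypothetical bin in which 30 of 50 samples land on the competitor image:
elog = empirical_logit(30, 50)
```

Unlike raw proportions, this quantity is unbounded, which makes it a more suitable dependent variable for linear mixed-effects models.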


Reference List

  1. Allopenna PD, Magnuson JS, & Tanenhaus MK (1998). Tracking the time course of spoken word recognition using eye movements: Evidence for continuous mapping models. Journal of Memory and Language, 38(4), 419–439.
  2. Bates D, Maechler M, & Bolker B. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48.
  3. Blumenfeld HK, & Marian V. (2007). Constraints on parallel activation in bilingual spoken language processing: Examining proficiency and lexical status using eye-tracking. Language and Cognitive Processes, 22(5), 633–660.
  4. Canseco-Gonzalez E, Brehm L, Brick CA, Brown-Schmidt S, Fischer K, & Wagner K. (2010). Carpet or carcel: The effect of age of acquisition and language mode on bilingual lexical access. Language and Cognitive Processes, 25(5), 669–705.
  5. Chambers CG, & Cooke H. (2009). Lexical competition during second-language listening: Sentence context, but not proficiency, constrains interference from the native lexicon. Journal of Experimental Psychology: Learning, Memory, and Cognition, 35(4), 1029–1040.
  6. Chang CB (2016). Bilingual perceptual benefits of experience with a heritage language. Bilingualism: Language and Cognition, 19(4), 791–809.
  7. Coltheart M. (1981). The MRC psycholinguistic database. The Quarterly Journal of Experimental Psychology A: Human Experimental Psychology, 33A(4), 497–505.
  8. Cutler A, Weber A, & Otake T. (2006). Asymmetric mapping from phonetic to lexical representations in second-language listening. Journal of Phonetics, 34(2), 269–284.
  9. Dahan D, Magnuson JS, & Tanenhaus MK (2001). Time course of frequency effects in spoken-word recognition: Evidence from eye movements. Cognitive Psychology, 42(4), 317–367.
  10. Davies M. (2008–). The Corpus of Contemporary American English (COCA): 560 million words, 1990–present. Available online at https://corpus.byu.edu/coca/.
  11. Davies M. (2016–). Corpus del Español: Two billion words, 21 countries. Available online at http://www.corpusdelespanol.org/web-dial/.
  12. Dink J, & Ferguson B. (2018). eyetrackingR. R package version 0.1.7. URL: http://www.eyetracking-R.com.
  13. Dunn LM, & Dunn DM (2007). PPVT-4: Peabody Picture Vocabulary Test. Minneapolis, MN: Pearson Assessments.
  14. Dunn LM, Padilla ER, Lugo DE, & Dunn LM. (1986). TVIP: Test de Vocabulario en Imagenes Peabody: Adaptacion Hispanoamericana = Peabody Picture Vocabulary Test [Revised]: Hispanic-American adaptation. Circle Pines, MN: American Guidance Service.
  15. Flege JE, & Eefting W. (1987). Production and perception of English stops by native Spanish speakers. Journal of Phonetics, 15, 67–83.
  16. Fricke M, Kroll JF, & Dussias PE (2016). Phonetic variation in bilingual speech: A lens for studying the production–comprehension link. Journal of Memory and Language, 89, 110–137.
  17. Grosjean F. (1997). Processing mixed language: Issues, findings, and models. In de Groot AMB & Kroll JF (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 225–254). Mahwah, NJ: Erlbaum.
  18. Ju M, & Luce PA (2004). Falling on sensitive ears: Constraints on bilingual lexical activation. Psychological Science, 15(5), 314–318.
  19. Lagrou E, Hartsuiker RJ, & Duyck W. (2011). Knowledge of a second language influences auditory word recognition in the native language. Journal of Experimental Psychology: Learning, Memory, and Cognition, 37(4), 952–965.
  20. Lisker L, & Abramson AS (1964). A cross-language study of voicing in initial stops: Acoustical measurements. Word, 20(3), 384–422.
  21. Magnuson JS, Dixon JA, Tanenhaus MK, & Aslin RN (2007). The dynamics of lexical competition during spoken word recognition. Cognitive Science, 31(1), 133–156.
  22. Marian V, Blumenfeld HK, & Boukrina OV (2008). Sensitivity to phonological similarity within and across languages. Journal of Psycholinguistic Research, 37(3), 141–170.
  23. Marian V, Blumenfeld HK, & Kaushanskaya M. (2007). The Language Experience and Proficiency Questionnaire (LEAP-Q): Assessing language profiles in bilinguals and multilinguals. Journal of Speech, Language, and Hearing Research, 50(4), 940–967.
  24. Marian V, & Spivey M. (1999). Activation of Russian and English cohorts during bilingual spoken word recognition. In Proceedings of the twenty-first annual conference of the Cognitive Science Society (pp. 349–354). Lawrence Erlbaum Associates.
  25. Marian V, & Spivey M. (2003a). Bilingual and monolingual processing of competing lexical items. Applied Psycholinguistics, 24(2), 173–193.
  26. Marian V, & Spivey M. (2003b). Competing activation in bilingual language processing: Within- and between-language competition. Bilingualism: Language and Cognition, 6(2), 97–115.
  27. Marslen-Wilson WD (1987). Functional parallelism in spoken word-recognition. Cognition, 25(1–2), 71–102.
  28. Matin E, Shao KC, & Boff KR (1993). Saccadic overhead: Information-processing time with and without saccades. Attention, Perception, & Psychophysics, 53, 372–380.
  29. McMurray B, Samelson VM, Lee SH, & Tomblin JB (2010). Individual differences in online spoken word recognition: Implications for SLI. Cognitive Psychology, 60(1), 1–39.
  30. McMurray B, Tanenhaus MK, Aslin RN, & Spivey MJ (2003). Probabilistic constraint satisfaction at the lexical/phonetic interface: Evidence for gradient effects of within-category VOT on lexical access. Journal of Psycholinguistic Research, 32(1), 77–97.
  31. Mirman D. (2014). Growth curve analysis and visualization using R. Boca Raton, FL: CRC Press.
  32. Moreno Bilbao MA, & Mariño Acebal JB (1998). Spanish dialects: Phonetic transcription. In ICSLP 1998: International Conference on Spoken Language Processing, Sydney, Australia, November 30–December 4, 1998 (pp. 189–192). International Speech Communication Association (ISCA).
  33. Shook A, & Marian V. (2012). Bimodal bilinguals co-activate both languages during spoken comprehension. Cognition, 124(3), 314–324.
  34. Shook A, & Marian V. (2013). The bilingual language interaction network for comprehension of speech. Bilingualism: Language and Cognition, 16(2), 304–324.
  35. Shook A, & Marian V. (2016). The influence of native-language tones on lexical access in the second language. The Journal of the Acoustical Society of America, 139(6), 3102–3109.
  36. Shook A, & Marian V. (2017). Covert co-activation of bilinguals' non-target language: Phonological competition from translations. Linguistic Approaches to Bilingualism. Advance online publication.
  37. Singmann H, Bolker B, Westfall J, & Aust F. (2017). afex: Analysis of Factorial Experiments. R package version 0.18-0. https://CRAN.R-project.org/package=afex.
  38. Snodgrass JG, & Vanderwart M. (1980). A standardized set of 260 pictures: Norms for name agreement, image agreement, familiarity, and visual complexity. Journal of Experimental Psychology: Human Learning and Memory, 6(2), 174.
  39. Szekely A, Jacobsen T, D'Amico S, Devescovi A, Andonova E, Herron D, … & Federmeier K. (2004). A new on-line resource for psycholinguistic studies. Journal of Memory and Language, 51(2), 247–250.
  40. Weber A, & Cutler A. (2004). Lexical competition in non-native spoken-word recognition. Journal of Memory and Language, 50(1), 1–25.