Author manuscript; available in PMC: 2024 Jun 13.
Published in final edited form as: Biling (Camb Engl). 2020 Oct 23;24(2):333–343. doi: 10.1017/s1366728920000565

Word learning in monolingual and bilingual children: The influence of speaker eye-gaze

Ishanti Gangopadhyay 1, Margarita Kaushanskaya 2
PMCID: PMC11175166  NIHMSID: NIHMS1996620  PMID: 38873085

Abstract

The current study examined the impact of a speaker’s gaze on novel-word learning in 4–5-year old monolingual (N = 23) and bilingual children (N = 24). Children were taught novel words when the speaker looked at the object both times while labeling it (consistent) and when the speaker looked at the object only the first time (inconsistent). During teaching, bilingual children differentiated between the target object (that matched the label) and non-target object (that did not match the label) earlier than the monolingual children on trials without eye-gaze information. However, during testing, monolingual children showed more robust retention of novel words than bilingual children in both conditions. Findings suggest that bilingualism shapes children’s attention to eye-gaze during word learning, but that, ultimately, there is no bilingual advantage for utilizing this cue in the service of word retention.

Keywords: social-pragmatic, word learning, speaker eye-gaze, eye-tracking

Introduction

From a very early age, children are sensitive to social-pragmatic cues (Baldwin, 1991). One widely studied social cue is gaze direction or eye-gaze of an adult while interacting with the child (Butterworth, 2001; Carpenter, Nagell, Tomasello, Butterworth & Moore, 1998; Flom, Lee & Muir, 2007; Paulus, 2011; Thoermer & Sodian, 2001). This cue conveys information about the person’s current focus of attention and interest and serves an important function in early language learning. Studies have shown that children use cues like speaker gaze and gestures in joint attention contexts to support acquisition of novel words (Baldwin, 1993; Hirotani, Stets, Striano & Friederici, 2009; Hollich, Hirsh-Pasek, Golinkoff, Brand, Brown, Chung, Hennon, Rocroi & Bloom, 2000; Houston-Price, Plunkett & Duffy, 2006), and that this skill may mature over time (Hollich et al., 2000). Even more importantly, research has demonstrated that children actively search for a speaker’s eye-gaze and rely on it to disambiguate between several possible referents when mapping novel words (Baldwin, 1993; Bloom, 2000, 2002). However, there are at least two gaps in our current understanding of how a speaker’s eye-gaze contributes to children’s novel word learning.

First, the contribution of a speaker’s eye-gaze to word learning has mainly been inferred from performance on a single trial in referent selection type tasks (Baldwin, Bill & Ontai, 1996; Houston-Price et al., 2006; Moore, Angelopoulos & Bennett, 1999; Paulus & Fikkert, 2014). Thus, it has yet to be ascertained whether eye-gaze impacts learning beyond this initial referent selection and how it shapes children’s ability to retain the novel words, even over a short term. Second, studies that have examined the influence of speaker eye-gaze on word-learning have primarily tested English-speaking monolingual children, and investigations of this nature are sparse in populations characterized by different experiences (but see Brojde, Ahmed & Colunga, 2012; Yow & Markman, 2015 for exceptions). It is possible that children with different socio-linguistic experiences may attend to social-pragmatic cues, like a speaker’s eye-gaze, differently (Yow & Markman, 2011a). In the current study, we aimed to inform these gaps in the existing literature by examining the influence of eye-gaze on novel word retention using a visual world eye-tracking paradigm in two groups of children: monolingual English-speaking children and Spanish–English bilingual children.

Speaker eye-gaze and word learning

One crucial aspect of language learning is the ability to link words with relevant objects (Matatyaho-Bullaro, Gogate, Mason, Cadavid & Abdel-Mottaleb, 2014; Tincoff & Jusczyk, 2012). The formation of new links between words and objects is social in kind – children believe that speakers are intending to talk about objects to which the speakers are attending. Conversely, if speakers show no signs of an intention to refer to the objects at hand, children will likely disregard those potential links. The social-pragmatic theory of word learning posits that children learn the meanings of words by exploiting pragmatic and social cues the speaker provides (Akhtar, Carpenter & Tomasello, 1996; Baldwin, 1993; Clark & Grossman, 1998; Diesendruck, 2005; Grassmann & Tomasello, 2010; Jaswal & Hansen, 2006). An important assumption of the social-pragmatic account when applied to word learning is that children can infer the speaker’s intended meaning for the novel words, and one cue that has been investigated extensively in the context of word learning is a speaker’s gaze direction or eye-gaze (Butterworth, 2001; Carpenter et al., 1998; Flom et al., 2007; Paulus, 2011; Thoermer & Sodian, 2001).

Undoubtedly, speaker eye-gaze is a robust pragmatic cue that children are highly sensitive to, and a large literature attests to children’s success in spontaneously following a speaker’s eye-gaze to map a novel word to an object (Baldwin et al., 1996; Brooks & Meltzoff, 2002; Carpenter et al., 1998; Hollich et al., 2000; Meltzoff, Gopnik & Repacholi, 1999; Moore et al., 1999; Morales, Mundy & Rojas, 1998; Paulus & Fikkert, 2014; Woodward, 2005). For instance, Baldwin and colleagues (1996) taught 16–19 month infants new labels for novel toys in two conditions – one where the experimenter looked at and labeled a toy which the infants were already looking at, and another where the experimenter looked at and labeled a different toy than the one occupying the infants’ focus of attention. Results revealed that not only did infants learn the labels of toys that were part of their focus, but they also successfully mapped the labels of new toys that the experimenter was looking at. In another study with 24-month-olds, adult gaze direction was pitted against object salience (Moore et al., 1999). In the critical condition, an adult looked at and labeled one toy while another toy was made more salient by being lit up. Although the salience captured the children’s attention briefly, comprehension tests later on revealed that children consistently chose the object that the adult had been looking at. The authors interpreted these results to suggest that gaze direction is a highly significant cue that can override other cues at 24 months.

Crucially, while multiple studies have established that children readily pay attention to a speaker’s gaze during word learning, it is unclear how a speaker’s gaze shapes children’s subsequent retention of the novel words. That is, the vast majority of previous studies examined children’s referent selection preferences following a presentation of a single trial that cued children with eye-gaze information. But does a speaker’s eye-gaze information shape children’s ability to recognize the novel word when given the choice between two novel objects over the course of multiple testing trials? Another notable gap in the literature is that although prior investigations of eye-gaze and its role in word learning have focused largely on monolingual children (Baldwin et al., 1996; Brooks & Meltzoff, 2002; Carpenter et al., 1998; Hollich et al., 2000; Meltzoff et al., 1999; Moore et al., 1999; Paulus & Fikkert, 2014; Woodward, 2005), only a handful of studies have done so in bilingual children (Brojde et al., 2012; Yow & Markman, 2015).

Bilingualism and social-pragmatics

A growing body of literature suggests that bilinguals may be more attentive to socio-pragmatic factors in the language-learning environment than monolinguals (Ben-Zeev, 1977; Buac, Tauzin-Larché, Weisberg & Kaushanskaya, 2018; Comeau, Genesee & Mendelson, 2007; Fan, Liberman, Keysar & Kinzler, 2015; Siegal, Iozzi & Surian, 2009; Yow & Markman, 2011a, 2011b, 2015). Researchers have attributed this difference to the different communicative situations in which monolingual and bilingual children tend to grow up (Fan et al., 2015; Yow & Markman, 2015). Bilingual children routinely have the opportunity to track speakers in their environment in order to determine who speaks a given language, the speaker’s language proficiency, and with whom the speaker is conversing. Thus, bilingual children have to monitor multiple cues from speakers, understand their intentions, and respond appropriately. As a result of navigating these demanding social situations, bilingual children may have a heightened sensitivity to social-pragmatic cues surrounding language use (Yow & Markman, 2011a, 2015), which might be advantageous to them during the word-learning process.

Indeed, a number of studies have demonstrated that bilingual children have greater pragmatic awareness than monolingual children (Ben-Zeev, 1977; Comeau et al., 2007; Cummins & Mulcahy, 1978; Diesendruck, 2005; Fan et al., 2015; Genesee, Tucker & Lambert, 1975; Siegal et al., 2009; Yow & Markman, 2011a, 2011b). Particularly, within the context of eye-gaze, a series of studies by Yow and Markman (2011a) explored monolingual and bilingual preschoolers’ use of nonverbal referential gestures (pointing and gaze direction) to figure out a speaker’s intent to refer. They found that, compared to monolingual children, 3- and 4-year old bilingual children were better able to use referential gestures to locate a hidden toy in the face of conflicting body-distal information (i.e., when the experimenter sat behind one box but gestured to another box). They also found that this bilingual advantage was observed in children as young as 2 years, and that monolingual children caught up to the bilingual children by age 5. The authors concluded that the experience of growing up in a bilingual environment fosters the development of understanding referential intent.

Although past research has shown that, compared to monolinguals, bilingual children have greater sensitivity to social-pragmatic cues, there is a general dearth of research on bilingual children’s use of speaker gaze in the context of word learning. To our knowledge, only two studies have examined the impact of speaker eye-gaze on learning novel words in bilingual children compared to monolingual children. Brojde et al. (2012) targeted eye-gaze as a pragmatic cue paired with object property (e.g., shape, color, texture) as an object cue. The authors taught children (24–36 months) the names of novel objects under different conditions – one where the object cue and pragmatic cue were congruent and another where the object cue and pragmatic cue were incongruent (i.e., the cues gave conflicting information). Results showed that when information from the two cues conflicted, bilingual children attended more to the pragmatic cue while monolingual children attended more to the object cue. The authors concluded that bilingual and monolingual children tend to weigh cues differently depending on their informativeness. Yow and Markman (2015) tested 3-year old monolingual and bilingual children using a procedure in which they saw two novel objects while the experimenter could see only one. The experimenter either looked at the object she could see and said, “There’s the glorp” or looked at the object she could not see and said, “Where’s the flurg?”. They found that while all children picked the visible object equally often when the speaker said “there”, bilingual children were more likely than monolingual children to pick the other object the speaker could not see when she said “where”. The authors interpreted this result to suggest that bilingual children were better at integrating multiple cues (context, eye-gaze, and semantics) to understand the speaker’s referential intent.

Together, the prior studies of bilingual children’s word learning suggest that bilingual children may be more attentive to a speaker’s gaze in word-learning situations (Brojde et al., 2012; Yow & Markman, 2015). However, neither study tested retention of novel words and both presented the eye-gaze cue in combination (and in conflict) with other cues. Therefore, it is still unclear what impact speaker eye-gaze, on its own, has on novel word retention in children who come from different backgrounds.

Current study

In the present study, we examined the influence of speaker eye-gaze on word learning in 4- to 5-year old monolingual vs. bilingual children using a visual world eye-tracking paradigm. An older age range was chosen to ensure that bilingual children had a sufficient level of English proficiency to be able to engage in a word-learning task involving English-like stimuli (younger children would have been less likely to experience significant exposure to English). It was also important, given the nature of our word-learning task, to ensure that children would not be fatigued by the task, as younger children likely would have been.

To test the hypothesis that bilinguals may be more sensitive to social-pragmatic cues, the consistency of the eye-gaze cue was manipulated. Children were taught novel words under two conditions – consistent (where the speaker looked at the novel object every time while labeling it) and inconsistent (where the speaker only looked at the novel object the first time while labeling it but not the second time). The inconsistent condition enabled us to examine whether group differences would be stronger in a more difficult teaching scenario where only a single exposure to eye-gaze was present. Retention of novel labels was probed via multiple testing trials by presenting children with two novel objects side by side on the screen and asking them to look at the object being labeled. The presentation of multiple testing trials following all the teaching trials enabled us to assess children’s ability to recognize the novel words after a period of time, thus tapping short-term memory processes. This first step is necessary in order to begin considering whether cues like a speaker’s gaze can stimulate longer-term consolidation of words into a child’s lexicon. Finally, teaching trials were analyzed to assess children’s immediate mapping of novel words in the two conditions – consistent and inconsistent.

We hypothesized that all children should perform better (i.e., have greater and quicker recognition rates) in the consistent condition than in the inconsistent condition because the latter is a more challenging word-learning situation. Furthermore, if bilingual children have better pragmatic awareness than monolingual children, they might outperform monolingual children on the word learning task, especially in the condition where the eye-gaze cue is less consistent.

Method

Participants

Four- and 5-year old English-speaking monolingual children and Spanish–English-speaking bilingual children were tested. All children were recruited from local daycares and schools in Madison, Wisconsin, USA. Children were also recruited by contacting families that have participated in prior studies in the lab who had indicated interest in participating in future studies. A total of 25 monolingual and 25 bilingual children were tested over two 1-hour sessions. All children passed a hearing screening at 20 dB at 1000, 2000 and 4000 Hz. Monolingual children were exposed only to English from birth and had no significant exposure to any other language (defined as >5% weekly exposure). Bilingual children were exposed to both English and Spanish by the age of 36 months and did not have significant exposure to a third language. Because the experimental tasks were administered in English, only children with at least 20% of English exposure (during a typical week) were included in the study (per Pearson, Fernández, Lewedeg & Oller, 1997). Exclusionary criteria for all children included hearing loss, psychological or behavioral disorders, neurological impairments, other developmental disabilities, and scoring 1.5 standard deviations below the mean on the non-verbal IQ test. Children were also excluded if they met any two of the following criteria: standardized vocabulary scores below 85 (English for monolinguals and English-Spanish composite for bilinguals), diagnosis of a language impairment, or parent concerns regarding their child’s language development/skills. After all data-cleaning procedures (described below), the final sample of children included 23 monolinguals (MeanAge: 4.92 years; SD = 0.54) and 24 bilinguals (MeanAge: 5.20 years; SD = 0.47). See Table 1 for participant characteristics.

Table 1. Participant characteristics

Variables                             Monolingual (N = 23)   Bilingual [a] (N = 24)
                                      Mean (SD)              Mean (SD)
Age (years)                           4.92 (0.54)            5.20 (0.47)
KBIT-2 [b]                            104.22 (8.42)          106.63 (12.05)
SES [c]                               18.67 (2.32)           15.54 (4.48)**
ROWPVT-4 [d]                          115.50 (8.76)          117.42 (8.03)
English Exposure [e]                  98.94 (2.34)           43.18 (20.11)
Spanish Exposure [f]                  1.06 (2.35)            54.61 (18.61)
English Age of Acquisition (months)   0.00 (0.00)            5.67 (12.55)
Spanish Age of Acquisition (months)   -                      0.88 (3.13)

** Significance level <0.01
[a] Bilingual children acquired both English and Spanish before the age of 36 months
[b] Kaufman Brief Intelligence Test - 2nd Edition, Matrices Subtest, Std. Score
[c] Socio-economic status indexed by years of maternal education
[d] Receptive One Word Picture Vocabulary Test - 4th Edition, Std. Score
[e] % English exposure = [(Hours of English heard on a weekday × 5 days/week) + (Hours of English heard on Saturday and Sunday)] / (Hours child is awake per week)
[f] % Spanish exposure = 100 - percent of English exposure
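
The exposure formulas in the Table 1 footnotes can be illustrated with a short calculation. This is only a sketch of one reading of those formulas: the function and parameter names are hypothetical, the weekend term is assumed to be the total English hours across Saturday and Sunday, and the ratio is assumed to be multiplied by 100 to give a percentage.

```python
def english_exposure_pct(weekday_hours, weekend_hours_total, awake_hours_per_week):
    """Percent of weekly waking hours spent hearing English.

    Assumptions (not stated in the source): weekend_hours_total is the sum over
    Saturday and Sunday, and the ratio is scaled by 100.
    """
    if awake_hours_per_week <= 0:
        raise ValueError("awake_hours_per_week must be positive")
    english_hours = weekday_hours * 5 + weekend_hours_total
    return 100.0 * english_hours / awake_hours_per_week


def spanish_exposure_pct(english_pct):
    """Spanish exposure is defined as the remainder of the English percentage."""
    return 100.0 - english_pct


# Hypothetical child: 4 h of English per weekday, 8 h across the weekend,
# awake 84 h per week, so English exposure is (20 + 8) / 84 of waking time.
eng = english_exposure_pct(4, 8, 84)
spa = spanish_exposure_pct(eng)
print(round(eng, 2), round(spa, 2))
```

Under these assumptions, a child would meet the 20% English-exposure inclusion criterion whenever `english_exposure_pct(...)` returns at least 20.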

Measures

Parent questionnaires

A comprehensive history of each child’s language development, language acquisition, and language exposure was obtained through an interview conducted with the parent. Interviews were conducted in the parents’ native or preferred language. Additionally, parents of all children were asked to fill out a background questionnaire about the child’s family, medical and educational histories. This questionnaire also yielded mother’s years of education, which was used as a proxy for the child’s socioeconomic status (SES). The two participant groups differed significantly in SES (p < .01). Finally, parents completed the Language Experience and Proficiency Questionnaire (LEAP-Q; Marian, Blumenfeld & Kaushanskaya, 2007), which asked questions about the parents’ language history and proficiency.

Standardized assessments

To assess English vocabulary knowledge, monolingual children were administered the Receptive One-Word Picture Vocabulary Test, 4th Edition (ROWPVT-4; Martin & Brownell, 2010). Bilingual children were administered the Receptive One-Word Picture Vocabulary Test, Spanish-Bilingual Edition (ROWPVT-4: SBE; Brownell, 2012). The Spanish-Bilingual Edition permits examinees to respond in either language; therefore, the test measures conceptual rather than language-specific vocabulary. Since the experimental word-learning task only tested children’s comprehension, a receptive vocabulary measure was chosen to assess language skills. Finally, all children were administered the Visual Matrices subtest of the Kaufman Brief Intelligence Test (KBIT-2; Kaufman & Kaufman, 2004), a test of non-verbal intelligence.

Experimental task

Stimuli

Six English-like novel words were created. The novel words had a CVC structure to resemble common, early-acquired English words, and contained phonemes that are common to both English and Spanish. The novel words were split into two lists that matched on developmental sound class of the consonants (Storkel, 2013), English and Spanish biphone probability (from the CLEARPOND database; Marian, Bartolotti, Chabal & Shook, 2012), and English and Spanish neighborhood density (also from the CLEARPOND database). The final lists of nonwords comprised “dep”, “geed”, and “sook” (List A) and “seech”, “faig”, and “moop” (List B). Pictures of novel objects were selected from the NOUN (Novel Object and Unusual Name; Horst & Hout, 2016) database such that they were equally salient and complex across lists. See Figure 1 for the list of novel word and novel object pairings.

Fig. 1.

Novel word-novel object pairings. Object-word pairings in each list were balanced such that “dep” was paired with each type of novel object in List 1.

Recordings

Videos of a Caucasian Midwestern adult female were recorded labeling all 6 novel objects. Recordings included the speaker looking towards the bottom left, bottom right, or straight ahead while labeling the objects. Adobe® Premiere® was used to clean all video recordings to reduce background noise and the audio was normalized at 70 dB. The recordings were then spliced such that each video recording was around 3 seconds long. For test trials, the novel words and carrier phrases were audio-recorded by a second Midwestern adult female; these recordings were also spliced and normalized at 70 dB using the acoustic analysis software Praat (Boersma & Weenink, 2016). All items were recorded using child-directed speech.

Design

A visual world eye-tracking paradigm was used via an E-Prime 2.0 experiment. Children were seated in front of a Tobii T60XL eye tracker with a screen resolution of 1920 × 1200. They were situated within a suitable range from the eye tracker and then a calibration sequence (TETCalibRegular, an E-Prime extension of Tobii) was run to ensure accurate and precise tracking. The eye tracker was calibrated using five locations on the screen (four corners plus center). Calibration was accepted if all the eye movements were contained in the pre-defined calibration regions for all 5 points. If calibration was poor (i.e., the eye movements were not contained in the calibration regions for at least 4 of the 5 points), the experimenter reran the calibration. In the present study, calibration was successful for all children. Children were tested under two conditions: consistent and inconsistent.

In the teaching phase, children were taught 3 novel words in association with 3 novel objects when the eye-gaze cue was always present (consistent), and 3 different novel words in association with 3 different novel objects when the eye-gaze cue was present only half the time (inconsistent). Presentation of the conditions was counterbalanced across children, and novel word and object pairings were counterbalanced across conditions. All children were given the following instructions in English: “This is going to be a listening game. In this game you will see new pictures and learn their names.”

In each condition, children were presented with two novel objects simultaneously on the screen, side by side, and a video of the speaker appeared on top labeling one of the objects (see Figure 2 for a visual depiction of the task). For each trial in the consistent condition, the speaker always looked at one of the novel objects while labeling it with its assigned novel name (“dep!”). Children were exposed to the novel words twice, resulting in 6 trials. Trial presentation order was randomized for each child. In the inconsistent condition, half the trials included speaker gaze (i.e., the speaker looked at the object while labeling) while the other half of the trials did not (i.e., the speaker looked straight ahead while labeling). Therefore, although there were two trials exposing children to the novel words, children were only exposed to speaker eye-gaze once. The trials that included speaker gaze were always presented first, followed by the trials without speaker gaze. This was done because presenting trials without gaze first would not have been informative to the child, which may have resulted in children losing interest in the task.

Fig. 2.

Example of a teaching trial. Children were taught 3 new labels in each condition (See Figure 1 for stimuli list). Presentation of conditions was counterbalanced across children and object-word pairings were counterbalanced across conditions. In the Inconsistent condition, teaching trials with gaze were always presented first, followed by no-gaze trials.

Each teaching trial was 4000 ms long. Images of the two novel objects and the video of the speaker appeared at the same time (0 ms). The images remained on the screen for the entire trial time. The onset of the target label (e.g., “dep”) occurred, on average, around 1086 ms for all trials (range = 1000–1160 ms). After the video ended (~3 seconds), a still image of the video/speaker remained on the screen until the end of the trial. Gaze data were collected for all teaching trials.

Each teaching condition was followed by the testing phase, where children were instructed in English as follows – “Now you will see two pictures and hear a word. Your job is to look at the picture that matches the word.” Two novel objects (target and distractor) were shown on the screen side by side, and an unfamiliar female speaker labeled one of the objects (e.g., “Look at the dep!”). The speaker was not shown on the screen, and only audio was presented. Each novel label was tested 3 times (12 total test trials in each condition), position of the novel object was counterbalanced across children, and all trials were presented in random order.

In test trials, the two novel objects appeared first, in silence, on the screen for 2000 ms, followed by the carrier phrase “Look at the…”. An animated fixation video then appeared in the middle of the screen. A gaze-contingent stimulus presentation was used: the target novel word prompt (“…dep!”) was heard only after the children looked at the fixation video. Following this, the images remained on the screen for an additional 4 seconds, thus making each trial 4000 ms long from word onset. Gaze data were collected for test trials to assess looking patterns towards the target and distractor images.

Analyses

Data cleaning and analyses were done with the eyetrackingR (Dink & Ferguson, 2018), lme4 (Bates, Kliegl, Vasishth & Baayen, 2015), and afex (Singmann, Bolker, Westfall & Aust, 2017) packages in R. The eye tracker recorded the x-y locations of the child’s gaze at a rate of 60 Hz. Gaze coordinates from each eye were averaged together, and these averages were mapped onto the regions of interest (i.e., target and distractor images for test trials, and target, distractor and speaker regions for teaching trials).
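
The preprocessing step described above (averaging the two eyes, then mapping the averaged point onto regions of interest) can be sketched as follows. This is an illustration only, not the authors' pipeline: the region coordinates are hypothetical placements on the 1920 × 1200 screen, the names `AOIS`, `average_eyes`, and `classify` are invented for the example, and treating a sample as missing when either eye is untracked is an assumption.

```python
from typing import Dict, Optional, Tuple

Point = Optional[Tuple[float, float]]  # None marks an eye that was not tracked

# Hypothetical regions of interest as (left, top, right, bottom) in pixels.
AOIS: Dict[str, Tuple[int, int, int, int]] = {
    "speaker": (660, 0, 1260, 500),
    "left_object": (100, 650, 700, 1150),
    "right_object": (1220, 650, 1820, 1150),
}


def average_eyes(left: Point, right: Point) -> Point:
    """Average the x-y coordinates of the two eyes for one 60 Hz sample."""
    if left is None or right is None:
        return None  # assumption: a sample needs both eyes to count
    return ((left[0] + right[0]) / 2, (left[1] + right[1]) / 2)


def classify(point: Point) -> Optional[str]:
    """Return the name of the region containing the point, or None."""
    if point is None:
        return None
    x, y = point
    for name, (l, t, r, b) in AOIS.items():
        if l <= x <= r and t <= y <= b:
            return name
    return None


sample = average_eyes((300.0, 900.0), (320.0, 910.0))
print(sample, classify(sample))
```

Whether the left or right object counts as the target on a given trial would then be looked up from the trial's counterbalancing record.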

Teaching trials

Two types of analyses were conducted to examine children’s looking patterns during teaching trials. First, children’s proportions of time spent looking at the speaker, the target object, and the distractor object (out of the total time spent looking at all three regions of interest) were calculated for each condition across the entire length of the trial (i.e., 4 seconds). Linear regression models were estimated for the consistent condition in which children’s looks to target, distractor, and speaker were separately regressed on Group. Another set of linear regression models was estimated for the inconsistent condition, in which children’s looks to target, distractor, and speaker were separately regressed on Group and Gaze (gaze or no-gaze). All models included the maximal random effect structure justified by the data (Jaeger, 2009), and included SES (maternal years of education) as a covariate.

Second, divergence analyses were conducted between looks to target and looks to distractor within each condition and each participant group to determine when in the time course the children looked more to the target than the distractor (to show correct mapping of the novel name to the novel object) and to measure looking duration to the target versus the distractor. Because divergence analyses are able to incorporate only two variables at a time, the target and the distractor images were chosen for these analyses to determine when children would disambiguate between the two objects after word onset. This was most important for the no-gaze condition, where, in the absence of the speaker’s gaze, the target and the distractor were the only viable areas of interest. The time-course data for children’s looks to target and distractor were resampled (with replacement) in a bootstrapping procedure in which 5000 bootstrapped samples were fit using R’s boot_splines function (Dink & Ferguson, 2018). The process estimated whether the confidence interval for the difference between target and distractor excluded zero, to indicate the point at which the two curves diverged (corresponding to an α of 0.05) and for how long the divergence lasted.
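
The logic of the divergence analysis can be sketched with a plain percentile bootstrap. Note that this is a simplified stand-in: it resamples participants and averages their target-minus-distractor differences in each time bin, but it omits the spline smoothing that the eyetrackingR procedure applies. The function name, the input layout, and the example values are all hypothetical.

```python
import random


def bootstrap_divergence(diffs, n_boot=5000, alpha=0.05, seed=1):
    """Flag time bins where the bootstrap CI of the mean difference excludes zero.

    diffs[p][t] is participant p's looking proportion to the target minus the
    proportion to the distractor in time bin t (hypothetical input layout).
    """
    rng = random.Random(seed)
    n_participants = len(diffs)
    n_bins = len(diffs[0])
    boot_means = [[] for _ in range(n_bins)]
    for _ in range(n_boot):
        # Resample participants with replacement, keeping each time course intact.
        sample = [diffs[rng.randrange(n_participants)] for _ in range(n_participants)]
        for t in range(n_bins):
            boot_means[t].append(sum(row[t] for row in sample) / n_participants)
    lo_idx = round((alpha / 2) * n_boot)
    hi_idx = round((1 - alpha / 2) * n_boot) - 1
    diverged = []
    for t in range(n_bins):
        ordered = sorted(boot_means[t])
        lower, upper = ordered[lo_idx], ordered[hi_idx]
        diverged.append(lower > 0 or upper < 0)
    return diverged


# Made-up data: six participants, three time bins. Differences hover around
# zero in the first bin and are positive for everyone in the later bins.
example = [
    [-0.10, 0.20, 0.30],
    [0.10, 0.15, 0.25],
    [-0.05, 0.30, 0.35],
    [0.05, 0.10, 0.20],
    [0.02, 0.25, 0.40],
    [-0.02, 0.20, 0.30],
]
flags = bootstrap_divergence(example)
onset_bin = flags.index(True) if True in flags else None
print(flags, onset_bin)
```

The first flagged bin gives the estimated divergence onset, and the run of consecutive flagged bins gives its duration.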

Test trials

Growth Curve Analysis (GCA) was used to analyze changes in children’s fixations to the target object, following target word onset, throughout the critical time window of 500 ms to 2600 ms. This window was derived empirically: overall looking patterns (across groups and conditions) revealed that children’s looks to target started increasing at 500 ms. However, this approach did not reveal a clear peak accuracy for children’s looks to target. The visual graphs were then split by Condition (across groups). Although the inconsistent condition did not reach peak accuracy until much later (closer to 3500 ms), the consistent condition first peaked, then plateaued around 2600 ms. Data cleaning led to the elimination of any trials where 50% of the eye-tracking data was missing, and exclusion of participants where more than 50% of their total data points were missing. As a result of these cleaning procedures, three participants (2 monolinguals and 1 bilingual) were excluded and 15.40% of trials were eliminated from the analyses.

To model growth curves, the empirical logit of looks to each image in 50 ms bins was calculated for each trial (Mirman, 2014). The empirical logit transformation was used to accommodate the binary nature of the data (i.e., fixations were either to the target or the distractor). The orthogonal polynomial time variables of intercept, linear time, quadratic time, and cubic time were included in the growth curve model. The intercept is centered and reflects the average overall fixation proportion (analogous to mean accuracy). The linear time term reflects the monotonic change in fixation proportion (the average change in fixation to the target for each unit increase in time). The quadratic time term captures the rate of the symmetric rise and fall around the central inflection point of the curve (the bowing around the peak in fixation proportions). Finally, the cubic time term captures the steepness of the rates of rise and fall for the inflections around the tails (changes in fixation proportions at the beginning and end of the window).
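
Two ingredients of the growth curve model can be sketched briefly: the empirical logit of looks within a time bin and the orthogonal polynomial time terms. The sketch assumes the commonly used form of the empirical logit, log((y + 0.5) / (n − y + 0.5)), where y is the number of gaze samples on the target and n the number on either object, and it builds the time terms by Gram-Schmidt orthogonalization. The function names and the choice of 42 bins are illustrative assumptions, not details taken from the study.

```python
import math


def empirical_logit(target_samples, total_samples):
    """Empirical logit of looks to target within one time bin."""
    return math.log((target_samples + 0.5) / (total_samples - target_samples + 0.5))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def orthogonal_polynomials(n_bins, degree=3):
    """Return [linear, quadratic, cubic, ...] time terms over n_bins bins.

    Each term is orthogonal to the constant and to every other term, and is
    scaled to unit length (Gram-Schmidt on powers of centered time).
    """
    center = (n_bins - 1) / 2
    times = [i - center for i in range(n_bins)]
    basis = [[1.0] * n_bins]  # constant column, used only for orthogonalization
    for d in range(1, degree + 1):
        col = [t ** d for t in times]
        for b in basis:
            coef = dot(col, b) / dot(b, b)
            col = [c - coef * bi for c, bi in zip(col, b)]
        norm = math.sqrt(dot(col, col))
        basis.append([c / norm for c in col])
    return basis[1:]


# Assumed binning: 50 ms bins from 500 ms up to (not including) 2600 ms.
n_bins = (2600 - 500) // 50
linear, quadratic, cubic = orthogonal_polynomials(n_bins)

# A bin with 3 samples, all on the target: log(3.5 / 0.5) = log(7).
print(empirical_logit(3, 3), n_bins)
```

Because the terms are mutually orthogonal, the estimate for each one can be interpreted independently of the others, which is why the intercept, slope, and curvature terms described above map onto separate aspects of the fixation curve.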

The empirical logit of looks to target was regressed on the fixed effects of the orthogonal polynomial time variables, the effect of Condition (contrast coded as −0.5 for Consistent and 0.5 for Inconsistent) and the effect of Group (contrast coded as −0.5 for Monolingual and 0.5 for Bilingual), along with all higher-order interactions between the variables of interest and the set of time variables. Further, SES was included as a covariate. Random effects, as suggested by Mirman (2014), included participant and participant-by-condition effects for each orthogonal time variable. The two dichotomous variables, Condition and Group, were contrast coded similar to the mean accuracy analyses. All models were fit using Maximum Likelihood Estimation and compared with a likelihood ratio test using the difference in −2 log-likelihood (−2LL), which is distributed as χ² with the degrees of freedom corresponding to the difference in the number of parameters included in each model.
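
The model-comparison step can be illustrated with a small likelihood ratio test. This sketch uses only the standard library, so the χ² tail probability is computed from a closed-form recurrence for integer degrees of freedom instead of a statistics package. The log-likelihood values and parameter counts are invented for the example, and the function names are hypothetical.

```python
import math

# Contrast codes as described in the text.
CONDITION_CODES = {"Consistent": -0.5, "Inconsistent": 0.5}
GROUP_CODES = {"Monolingual": -0.5, "Bilingual": 0.5}


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    h = x / 2.0
    if df % 2 == 0:
        q, a = math.exp(-h), 1.0          # exact tail for df = 2
    else:
        q, a = math.erfc(math.sqrt(h)), 0.5  # exact tail for df = 1
    # Step up two degrees of freedom at a time until reaching df.
    while a < df / 2.0:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(loglik_reduced, loglik_full, n_params_reduced, n_params_full):
    """Compare nested models fit by maximum likelihood."""
    df = n_params_full - n_params_reduced
    if df <= 0:
        raise ValueError("the full model must have more parameters")
    chi2 = max(0.0, -2.0 * (loglik_reduced - loglik_full))
    return chi2, df, chi2_sf(chi2, df)


# Invented example: adding two parameters raises the log-likelihood by 3.
chi2, df, p = likelihood_ratio_test(-1000.0, -997.0, 10, 12)
print(chi2, df, round(p, 4))
```

With ±0.5 contrast codes, each main effect is estimated at the average of the other factor's levels, which keeps lower-order terms interpretable when interactions are in the model.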

To evaluate whether children performed above chance (i.e., if performance was significantly above 50%) in each condition, monolingual and bilingual children’s proportion of time spent looking at the target object (out of the total time spent looking at both the target and distractor objects) was calculated for each condition. One-sample t-tests were conducted to analyze if looks to target were significantly above μ=0.50 for each group in each condition.
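
The above-chance comparison can be sketched as a one-sample t statistic against 0.50. The looking proportions below are made up, the function names are hypothetical, and the sketch stops at the t statistic and its degrees of freedom; a p-value would come from a t distribution in a statistics package.

```python
import math
import statistics


def target_proportion(target_ms, distractor_ms):
    """Share of object-looking time spent on the target."""
    return target_ms / (target_ms + distractor_ms)


def one_sample_t(values, mu=0.50):
    """Return (t, df) for a one-sample t-test of the mean against mu."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return (mean - mu) / (sd / math.sqrt(n)), n - 1


# Made-up per-child proportions for one group in one condition.
props = [0.60, 0.70, 0.50, 0.65, 0.55]
t_stat, t_df = one_sample_t(props)
print(round(t_stat, 3), t_df)
```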

Results

Teaching trials

Linear regression models

In the consistent condition, there was no main effect of Group for the proportion of looks to target (F(1,44) = 0.54, p = .47), speaker (F(1,44) = 1.25, p = .27), or distractor (F(1,44) = 1.05, p = .32). This indicates that when the eye-gaze cue was present in all trials, monolingual and bilingual children had a similar number of looks to all three regions of interest. Moreover, SES did not predict performance in any of the models (all ps > .05).

In the inconsistent condition, there was no effect of SES (F(1,89) = 1.98, p = .16), Group (F(1,89) = 2.87, p = .10), or Gaze (F(1,89) = 1.45, p = .23), and no interaction (F(1,89) = 0.17, p = .68), for the proportion of looks to target. Thus, all children had a similar number of looks to the target across gaze and no-gaze trials in the inconsistent condition. A main effect of SES was observed for looks to speaker (F(1,89) = 7.02, p < .01), such that children with higher SES looked more often to the speaker than children with lower SES. Above and beyond SES, a main effect of Group (F(1,89) = 8.34, p < .01) was also observed, such that monolingual children (M = 0.54, SD = 0.16) looked significantly more often to the speaker than bilingual children (M = 0.44, SD = 0.17) across both gaze and no-gaze trials. No other effects were observed for looks to the speaker. Finally, analyses on the proportion of looks to distractor revealed a main effect of SES (F(1,89) = 6.21, p = .02), such that children with lower SES looked more often to the distractor than children with higher SES. Main effects of Group (F(1,89) = 6.17, p = .02) and Gaze (F(1,89) = 19.60, p < .001) were also observed for the proportion of looks to the distractor. Bilinguals (M = 0.19, SD = 0.12) looked significantly more often to the distractor than monolinguals (M = 0.15, SD = 0.09), and all children looked more often to the distractor in the no-gaze trials (M = 0.21, SD = 0.11) than in the gaze trials (M = 0.13, SD = 0.09). No other effects were observed for looks to the distractor.

Divergence analysis

In the consistent condition, analyses revealed that the looks to the target and distractor reliably diverged from 700 ms until 3300 ms for both monolinguals and bilinguals. In other words, all children started looking more often to the target than the distractor starting around 700 ms. In the inconsistent condition for the trials without gaze, looks to the target and distractor diverged at two time windows for monolinguals: 1700 ms–1900 ms and then again at 2400 ms–4000 ms. The bilinguals showed a slightly different pattern. Their looks to the target and distractor diverged around 1500 ms–1700 ms initially and then again from 2600 ms–4000 ms. However, they also exhibited an initial burst of looking at the distractor significantly more often than the target from 400 ms–700 ms. This early group effect is uninterpretable because the target label (e.g., “dep”) had not yet been uttered by the speaker.
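Divergence windows of the kind reported above can be read off a series of per-bin test outcomes by collecting runs of consecutive bins in which looks to target and distractor differ reliably. The sketch below illustrates only this bookkeeping step. The function name, the 100 ms bin width, the minimum run length, and the example flags are hypothetical and are not taken from the study's analysis.

```python
def divergence_windows(bin_starts_ms, significant, bin_width_ms=100, min_bins=2):
    """Return (start_ms, end_ms) windows of consecutive significant bins.

    bin_starts_ms: onset of each contiguous, equally spaced time bin.
    significant:   one boolean per bin (target vs. distractor differ).
    Runs shorter than min_bins are discarded.
    """
    windows = []
    run_start = None
    run_length = 0
    for start, flag in zip(bin_starts_ms, significant):
        if flag:
            if run_start is None:
                run_start = start
                run_length = 0
            run_length += 1
        else:
            if run_start is not None and run_length >= min_bins:
                windows.append((run_start, run_start + run_length * bin_width_ms))
            run_start = None
    # Close a run that extends to the final bin.
    if run_start is not None and run_length >= min_bins:
        windows.append((run_start, run_start + run_length * bin_width_ms))
    return windows


# Hypothetical per-bin outcomes for ten 100 ms bins.
starts = list(range(0, 1000, 100))
flags = [False, False, True, True, True, False, True, False, True, True]
windows = divergence_windows(starts, flags)
```

In this example, the isolated significant bin at 600 ms is discarded because it does not meet the minimum run length.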

Test trials

See Table 2 for the full model results. GCA revealed that the addition of linear time (b = 0.57, χ2(1) = 5.56, SE = 0.24, p = .02) and quadratic time (b = −0.30, χ2(1) = 3.75, SE = 0.15, p < .05) significantly improved the model. The addition of cubic time did not (b = 0.18, χ2(1) = 2.75, SE = 0.11, p = .10). There was also no significant effect of Condition (b = −0.09, χ2(1) = 0.79, SE = 0.11, p = .37). Therefore, the probability that children fixated the target object was similar across both conditions. However, Condition significantly interacted with linear time (b = −0.87, χ2(1) = 5.69, SE = 0.35, p = .02) and cubic time (b = 0.42, χ2(1) = 3.86, SE = 0.21, p < .05). Follow-up analyses revealed that the consistent condition (b = 1.00, t = 3.42, SE = 0.30, p < .001) had a more positive slope than the inconsistent condition (b = 0.14, t = 0.48, SE = 0.29, p = .63), suggesting greater increases in children’s fixations to target over time in the consistent vs. the inconsistent condition. Additionally, the interaction with cubic time indicated that the two conditions differed in the steepness of the two peaks of their curves. The cubic term was significant for the inconsistent condition (b = 0.40, t = 2.58, SE = 0.15, p < .01) but not for the consistent condition (b = −0.03, t = −0.21, SE = 0.15, p = .83), suggesting a more prominent rise, fall and then rise of fixations to target in the inconsistent condition than in the consistent condition. See Figure 3 for fixations to target in each condition across all children.
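With ±0.5 contrast coding, the time effect within one condition can be approximated from the full-model coefficients as the time coefficient plus the contrast code multiplied by the interaction coefficient. The sketch below applies this arithmetic to the Table 2 estimates. It is an illustration under that assumption, with a hypothetical function name; the follow-up models reported above may have been fit separately, so small differences from the reported values are expected.

```python
def effect_within_level(b_time, b_interaction, contrast_code):
    """Time effect at one level of a contrast-coded factor."""
    return b_time + contrast_code * b_interaction


CONSISTENT, INCONSISTENT = -0.5, 0.5

# Linear time: b = 0.57, linear time x Condition: b = -0.87 (Table 2).
linear_consistent = effect_within_level(0.57, -0.87, CONSISTENT)      # about 1.005
linear_inconsistent = effect_within_level(0.57, -0.87, INCONSISTENT)  # about 0.135

# Cubic time: b = 0.18, cubic time x Condition: b = 0.42 (Table 2).
cubic_consistent = effect_within_level(0.18, 0.42, CONSISTENT)        # about -0.03
cubic_inconsistent = effect_within_level(0.18, 0.42, INCONSISTENT)    # about 0.39
```

These values are close to the follow-up estimates reported in the text (1.00 and 0.14 for linear time; −0.03 and 0.40 for cubic time).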

Table 2.

Full model of Growth Curve Analysis

Effect                    b       SE     df   χ2
Intercept                 0.34    0.07   1    17.10***
ot1                       0.57    0.24   1    5.56*
ot2                       −0.30   0.15   1    3.75*
ot3                       0.18    0.11   1    2.75
SES                       −0.01   0.02   1    0.40
Condition                 −0.09   0.11   1    0.79
ot1 x Condition           −0.87   0.35   1    5.69*
ot2 x Condition           −0.14   0.25   1    0.31
ot3 x Condition           0.42    0.21   1    3.86*
Group                     −0.37   0.16   1    5.17*
ot1 x Group               −0.64   0.47   1    1.79
ot2 x Group               0.63    0.30   1    4.37*
ot3 x Group               −0.15   0.21   1    0.47
Condition x Group         −0.08   0.21   1    0.17
ot1 x Condition x Group   0.56    0.71   1    0.62
ot2 x Condition x Group   −0.14   0.51   1    0.08
ot3 x Condition x Group   −0.00   0.43   1    0.00

Note. ot1 = linear orthogonal time, ot2 = quadratic orthogonal time, ot3 = cubic orthogonal time. χ2 represents the likelihood ratio test for each fixed effect compared to the full model. SES is indexed by years of maternal education.

* p < .05, ** p < .01, *** p < .001

Fig. 3.

Time course (500–2600 ms) of the proportion of looks to target from word onset for each condition across all children. Solid lines represent raw observed means and surrounding ribbons represent standard errors. Dashed horizontal lines mark chance performance (0.50).

The analyses also revealed a main effect of Group (b = −0.37, χ2(1) = 5.17, SE = 0.16, p = .02), where monolingual children had a significantly higher probability of fixations to target than bilingual children. Group also significantly interacted with quadratic time (b = 0.63, χ2(1) = 4.37, SE = 0.29, p = .04), which captures the quicker increases in fixations to target in monolingual children (b = −0.61, t = −2.88, SE = 0.21, p < .01) compared to bilingual children (b = 0.02, t = 0.10, SE = 0.21, p = .92). In other words, monolingual children showed faster word recognition than the bilingual children. See Figure 4 for looks to target in each group across both conditions.

Fig. 4.

Time course (500–2600 ms) of the proportion of looks to target from word onset for each group across both conditions. Solid lines represent raw observed means and surrounding ribbons represent standard errors. Dashed horizontal lines mark chance performance (0.50).

One-sample t-tests revealed that monolingual children performed above chance in both the consistent (t(22) = 5.21, p < .001; M = 0.64) and inconsistent (t(22) = 3.99, p < .001; M = 0.63) conditions. Similarly, bilingual children also performed above chance in the consistent condition (t(23) = 1.88, p = .03; M = 0.56) and in the inconsistent condition (t(23) = 1.75, p = .04; M = 0.54).

Discussion

The purpose of this study was to examine the influence of speaker eye-gaze on novel word learning in monolingual and bilingual children. We found that children demonstrated similar levels of retention of novel words taught in the consistent and inconsistent conditions. This suggests that a single exposure to the eye-gaze cue was enough to induce learning in all children. However, children’s temporal patterns of novel-word recognition during testing were different for the two conditions, suggesting that the consistency of the eye-gaze cue influenced how children retrieved the novel words. Monolingual children were more accurate and faster to identify target novel words at test than bilingual children, across both conditions. However, during teaching, bilingual children’s mapping of the novel words in trials without speaker gaze information occurred earlier than monolingual children’s mapping. Together, these findings suggest that bilingualism may enhance children’s attention to a speaker’s gaze during word-learning episodes, but that, ultimately, it does not appear to affect their capacity to retain the novel words.

Influence of eye-gaze on novel-word retention

We found that monolingual and bilingual children demonstrated above-chance performance levels on the test trials in both conditions of the word-learning task. The noteworthy finding here is that children, irrespective of group membership, demonstrated similar levels of novel word retention in both the consistent and the inconsistent eye-gaze conditions. This finding indicates that children tuned in to a speaker’s gaze direction from the outset, and that a more difficult word-learning situation (single eye-gaze exposure) had no bearing on the ultimate retention of novel words. It appears that the eye-gaze cue is highly salient, and that a single exposure to a reliable eye-gaze cue is sufficient to engender learning.

Although overall looks to target in the test trials suggested equal learning in the two conditions, time-course data indicated subtle differences between the words learned in the consistent condition vs. the inconsistent condition. These differences were captured in the linear and cubic orthogonal time variables. Notably, children exhibited a more positive slope for looks to target in the consistent condition (vs. the inconsistent condition), indicating greater increases in looks to target for each unit increase in time in the consistent condition than in the inconsistent condition. Furthermore, the rates of rise and fall for the inflections around the tails were steeper for the inconsistent condition (vs. the consistent condition). This difference is clearly observed when comparing the looks to target for the two conditions in Figure 3. Children started looking toward the target at 500 ms for both conditions. However, around 1300 ms children began looking towards the distractor in the inconsistent condition, and then went back to looking at the target at the end of the time window. The consistent condition, on the other hand, did not show this cubic pattern of looks to target, and instead had a steady linear trend. Therefore, it appears that providing children with two exposures to the eye-gaze cue reinforced children’s learning and led them to a more confident retrieval of novel words at testing, as demonstrated by the steady increase in looks to target over time. On the other hand, after a single exposure to the adult’s eye-gaze cue, children appeared to be less confident in selecting the target at testing, and demonstrated a pattern of looks where they first identified the target but then shifted their fixations to the non-target before ultimately settling on the target again. This suggests that cue consistency affects how children retrieve words.

One caveat to this interpretation of the findings is the possibility that eye-gaze cue was not the only cue available to the children during learning. That is, we designed the teaching trials so that the target novel object was presented with two different distractors during the two teaching trials. For example, the novel word dep was presented with a distractor object geed in Trial 1, and with distractor object sook in Trial 2. This design was implemented in order to provide children with visually distinct learning trials, and thus, reduce the possibility of boredom. However, this design also created a possibility for cross-situational learning to occur. In other words, it is possible that children learned the name for dep, not only because of the eye-gaze information, but also because of the distributional information. Given the vast literature demonstrating infants’ and children’s sensitivity to cross-situational co-occurrences (Akhtar & Montague, 1999; Smith, Smith & Blythe, 2011; Smith & Yu, 2008; Yu & Smith, 2007), it is plausible that children’s success on our task was at least partially driven by the availability of this additional distributional information regarding the mappings between novel words and novel objects. It may have been the availability of this additional cue that induced robust learning in both the consistent and the inconsistent eye-gaze condition. Therefore, future experiments would benefit from utilizing the same distractor object in all trial types, such that eye-gaze would be the only cue available to disambiguate between objects.

Nevertheless, the current results provide important insights into the role of eye-gaze cue in word learning. The findings signify the strength of the eye-gaze cue during the learning process. Prior work has shown that infants and children are highly sensitive to such a cue, and pay attention to a speaker’s gaze while mapping new labels to objects (Butterworth, 2001; Carpenter et al., 1998; Flom et al., 2007; Paulus, 2011; Thoermer & Sodian, 2001). However, these studies mainly addressed children’s selection of a potential referent immediately after a single trial (Baldwin et al., 1996; Moore et al., 1999; Paulus & Fikkert, 2014). Here, we tested whether eye-gaze cue affects children’s ability to retain the novel words over a period of time. The findings of the present study suggest that eye-gaze information has consequences beyond referent selection and can influence downstream processes of word learning. Indeed, the present study demonstrated that children were able to learn the novel words cued via a speaker’s gaze direction, suggesting that eye-gaze is a robust cue that influences learning at multiple levels. But interestingly, although cue consistency did not affect ultimate retention of novel words, it did yield different temporal patterns of word retrieval. Critically, we found that all children, bilingual and monolingual, were able to utilize the speaker’s eye-gaze to its full extent from a single exposure, suggesting the robustness of the cue for word learning.

The effect of bilingualism on word-learning

Overall performance differences

The current study revealed better novel word retention in monolingual children than bilingual children, such that overall the monolingual children were more accurate and quicker at identifying the target object than the bilingual children, even after taking into account SES differences between the two groups. These findings contrast sharply with the general consensus in the field regarding the positive effects of bilingualism on pragmatic functioning (e.g., Comeau et al., 2007; Genesee et al., 1975; Fan et al., 2015), and on word learning (see Hirosh & Degani, 2018, for a review).

With respect to the word-learning literature, previous studies have demonstrated that bilinguals tend to outperform monolinguals on word-learning tasks (Bartolotti & Marian, 2012; Kalashnikova, Mattock & Monaghan, 2014; Kaushanskaya & Marian, 2009a, 2009b; Kaushanskaya & Rechtzigel, 2012; Keshavarz & Astaneh, 2004; Yoshida, Tran, Benitez & Kuwabara, 2011). Although this literature has mostly focused on adults, there is also evidence that bilingual advantages for word learning can be observed in childhood (Abu-Rabia & Sanitsky, 2010; Kalashnikova et al., 2014; Kaushanskaya, Gross & Buac, 2014; Yoshida et al., 2011). Similar levels of word-learning performance between bilingual and monolingual children have also been reported (Byers-Heinlein, Fennell & Werker, 2013). The unexpected finding in the current study was that bilingual children showed worse word-learning performance than monolingual children. This finding, although unusual within the broader literature, is consistent with recent evidence suggesting that monolingual children can outperform bilingual children on some word-learning tasks (Gogate & Maganti, 2020; Kalashnikova, Escudero & Kidd, 2018). Besides SES discrepancies between groups – a factor we have taken into account in our analyses – we considered a number of possible explanations for this finding.

First, we hypothesized that certain novel words may have been more difficult for bilingual children to learn than monolingual children (perhaps because of cross-linguistic conflict that we did not account for during stimulus design). This led us to examine the performance patterns for the two groups for each novel word across both conditions. We calculated the proportion of looks to target for each item (novel word) for monolingual and bilingual children. Independent-sample t-tests were conducted for looks to target between the two groups. Analyses did not reveal significant group differences for any item (all ps > .05). However, bilingual children did have difficulty (less than 50% looks to target) with one specific word, “geed”, in both conditions. Analyses were re-run without this item and the general pattern of results was maintained. Thus, it does not appear that item characteristics are at the root of bilingual/monolingual performance differences in the current study.1
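An item-wise group comparison of this kind can be sketched as follows. The text does not specify the variant of the independent-sample t-test, so Welch's unequal-variance form is used here purely as an assumption. The function name and the example proportions are hypothetical.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom for two groups."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group.
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t_stat = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, df


# Hypothetical proportions of looks to target for one novel word.
monolingual = [0.6, 0.7, 0.8]
bilingual = [0.4, 0.5, 0.6]
t_stat, df = welch_t(monolingual, bilingual)
```

Running the function once per novel word yields one comparison per item, which mirrors the item-level check described above.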

A second possibility we considered was that the novel words in the current study were more English-like (versus Spanish-like), following English phonotactic rules, and that the task instructions were in English. Together, these factors may have resulted in bilingual children’s reduced understanding of the task and disadvantaged the bilingual children during learning. However, this possibility is unlikely for two reasons. First, the same monolingual and bilingual children participated in a different word-learning study where the task instructions were in English and the novel words were also English-like (although different from the words used in the present study; Gangopadhyay & Kaushanskaya, under preparation). In that study, we observed no group differences in word-learning performance. Second, the same lists of novel words and objects were used in another study (although with a different set of participants; Gangopadhyay & Kaushanskaya, 2020). Once again, we found no group differences in word-learning performance. Nonetheless, it would be important for future studies to attempt to replicate the present findings with the same set of stimuli and the exact same paradigm. It would also be useful for future studies to include Spanish-like novel words and task instructions in Spanish to benefit bilingual children who may prefer Spanish.

Finally, it is conceivable that the bilingual children in this particular study experienced more boredom during retention testing than the monolingual children. In other words, the bilingual children may have successfully mapped the novel labels to the novel objects during teaching, but they may have been less attentive than monolinguals during testing. Therefore, future work may consider including attention-grabbing trials during testing to keep children more engaged in the task. Altogether, it is difficult to pinpoint what may have caused bilingual children’s poorer performance on this word-learning task, and we hesitate to over-interpret this finding. Instead, we highlight the finding that, independent of the overall differences in performance, both bilingual and monolingual children performed above chance, and that all children performed similarly in the two word-learning conditions.

Sensitivity to eye-gaze cue

We also hypothesized that bilingual children might outperform monolingual children in the condition where the eye-gaze cue was less consistent. This prediction was driven by an extensive literature suggesting that bilingual children have heightened sensitivity to social-pragmatic cues (Ben-Zeev, 1977; Brojde et al., 2012; Buac et al., 2018; Comeau et al., 2007; Fan et al., 2015; Siegal et al., 2009; Yow & Markman, 2011a), and that they prioritize pragmatic cues like eye-gaze over other cues (Brojde et al., 2012; Yow & Markman, 2015). One likely reason for not finding this effect of bilingualism in the current study is that the “bilingual advantage” may more readily be revealed by studies utilizing naturalistic communication situations (e.g., conversational understanding or perspective-taking) or conflicting situations (e.g., pitting eye-gaze against other cues). The present study, which examined the impact of a single pragmatic cue on novel word learning without embedding conflict into the task, may be less likely to capture the positive effects of bilingualism on pragmatic functioning.

Our findings also conflict with results of prior studies that have focused specifically on eye-gaze and word-learning (Brojde et al., 2012; Yow & Markman, 2015). We conjecture that differences in the ages of the participants, the specific demographic characteristics of the participants, and task-design choices are all likely contributors to the discrepant patterns of results observed here. For instance, these prior studies tested 2- and 3-year-olds whereas we tested 4- and 5-year-old children. It is possible that effects of bilingualism on sensitivity to eye-gaze during learning may show up more readily in younger children (e.g., Gogate & Maganti, 2020), who have less robust vocabulary and cognitive skills.

Notably, the experimental design of our study differed significantly from the other studies in its focus on children’s ability to recognize the novel words at test rather than only on their ability to map the novel words to novel objects at teaching. Brojde et al. (2012) and Yow and Markman (2015) examined the eye-gaze cue in combination with other cues, taking children’s preference for a referent as an indicator of word learning. Both studies suggested a bilingual advantage in utilizing a speaker’s gaze to aid in word learning. We would instead argue that these two studies say less about the possible “bilingual advantage” in using eye-gaze during word learning, and more about the possibility that bilingualism may influence how children treat eye-gaze in the moment of learning to make referent selections.

To this point, results of the teaching trials from the present study provide some corroboration of the possibility that bilingualism may influence children’s attention to a speaker’s eye-gaze during word learning but not necessarily their retention of the novel words. We found that, during teaching, monolingual children looked significantly more to the speaker than bilingual children (all trials collapsed), but that bilingual children looked significantly more to the distractor than monolingual children (all trials collapsed). For the trials without gaze, monolinguals showed divergence between their looks to target vs. distractor from 1700 ms–1900 ms and then again from 2400 ms–4000 ms. This pattern of looks indicates that when the speaker labeled the object with no gaze information present, monolingual children made an initial burst of accurate differentiation between the target and distractor at 1700 ms, followed by a more extended divergence at 2400 ms. The bilinguals showed divergence around 1500 ms–1700 ms initially and then again from 2600 ms–4000 ms. Thus, it appears that bilingual children were able to make an initial accurate differentiation between the target and distractor roughly 200 ms before the monolingual children. This finding supports our earlier conclusion that bilingual children’s sensitivity to a speaker’s gaze may show up more readily in referent selection tasks, rather than retention tasks. This result, however, should be interpreted with caution because these timing windows and divergences are influenced by children’s looks to the speaker. That is, because there were three regions of interest in the teaching phase, the divergence between children’s looks to the target and distractor is confounded by looks to the speaker. Moreover, there were only 6 teaching trials, of which only 3 were without gaze in the inconsistent condition, thereby reducing the power of the analyses.

Conclusion

In conclusion, the present study revealed that the consistency of the speaker eye-gaze cue did not impact the ultimate retention of novel words, but it did impact how children retrieved them. A single exposure to the eye-gaze cue appears to be sufficient to enable short-term retention of the novel word but does result in a slightly less robust retrieval process than two exposures to the eye-gaze cue. Contrary to our expectations, monolingual children outperformed the bilingual children on word-learning during retention testing. However, bilingual children identified the target earlier than monolingual children during the teaching trials that tapped into the ability to establish mappings between novel words and novel objects in the moment. These findings seem to indicate that bilingualism may provide children with an advantage in attending to a speaker’s gaze cue during learning, but that, ultimately, these advantages do not translate to retention benefits. Most importantly, monolingual and bilingual children responded to the manipulation of the eye-gaze cue similarly, suggesting that sensitivity to eye-gaze is not conditioned by differences in socio-linguistic environment.

Acknowledgements.

This research was supported by NSF grant BCS-1749371 (awarded to Ishanti Gangopadhyay and Margarita Kaushanskaya) and NIH grants R01 DC011750 (awarded to Susan Ellis Weismer and Margarita Kaushanskaya), R01 DC016015 (awarded to Margarita Kaushanskaya) and U54 HD090256 (Waisman Center). The authors would like to thank all of the families who participated in this study as well as the school and day-care personnel who generously aided in participant recruitment. We are grateful to the members of the Language Acquisition and Bilingualism Lab, especially Sarah Diel, Queila Griffin, Gloria Lee, Emily Sebranek, and Zoe Sweeney, for their assistance with recruitment, data collection and data coding, and Katie Denzen, Anna Finch, Caroline Larson, and Janine Mathee, for their assistance with stimulus recordings. Particular thanks to Milijana Buac, Ron Pomper and Margarethe McDonald for their valuable input on study design and analysis.

Footnotes

1

GCA revealed a significant group effect, indicating that bilingual children performed worse than monolingual children when aggregating all items across both conditions.

References

  1. Abu-Rabia S and Sanitsky E (2010) Advantages of bilinguals over monolinguals in learning a third language. Bilingual Research Journal 33, 173–199. [Google Scholar]
  2. Akhtar N and Montague L (1999) Early lexical acquisition: The role of cross-situational learning. First Language 19, 347–358. [Google Scholar]
  3. Akhtar N, Carpenter M and Tomasello M (1996) The role of discourse novelty in early word learning. Child Development, 635–645. [Google Scholar]
  4. Baldwin DA (1991) Infants’ contribution to the achievement of joint reference. Child Development 62, 874–890. [PubMed] [Google Scholar]
  5. Baldwin DA (1993) Early referential understanding: Infants’ ability to recognize referential acts for what they are. Developmental Psychology 29, 832. [Google Scholar]
  6. Baldwin DA, Bill B and Ontai LL (1996) Infants’ tendency to monitor others’ gaze: Is it rooted in intentional understanding or a result of simple orienting? Infant Behavior and Development 19, 270. [Google Scholar]
  7. Bartolotti J and Marian V (2012) Language learning and control in monolinguals and bilinguals. Cognitive Science 36, 1129–1147. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Bates D, Kliegl R, Vasishth S and Baayen RH (2015) Parsimonious mixed models. Retrieved from http://arxiv.org/abs/1506.04967
  9. Ben-Zeev S (1977) The influence of bilingualism on cognitive strategy and cognitive development. Child Development, 1009–1018. [Google Scholar]
  10. Bloom P (2000) How children learn the meanings of words. Cambridge, MA: MIT press, pp. 1–23. [Google Scholar]
  11. Bloom P (2002) Mindreading, communication and the learning of names for things. Mind & Language 17(1–2), 37–54. [Google Scholar]
  12. Boersma P and Weenink D (2016) Praat: doing phonetics by computer [Computer program]. Version 6.0.22, retrieved 1 July 2017 from http://www.praat.org/
  13. Brojde CL, Ahmed S and Colunga E (2012) Bilingual and monolingual children attend to different cues when learning new words. Frontiers in Psychology 3, 1–11. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Brooks R and Meltzoff AN (2002) The importance of eyes: how infants interpret adult looking behavior. Developmental Psychology 38, 958. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Brownell R (2012) Expressive One-Word Picture Vocabulary Test Fourth Edition: Spanish- English Edition (ROWPVT-4: SBE). Novato: Academic Therapy Publications. [Google Scholar]
  16. Buac M, Tauzin-Larché A, Weisberg E and Kaushanskaya M (2018) Effect of speaker certainty on novel word learning in monolingual and bilingual children. Bilingualism: Language and Cognition, 1–13. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Butterworth G (2001) Joint visual attention in infancy. In Bremner G & Fogel A (eds), Blackwel handbook of infant development. Oxford: Blackwell, pp. 213–240. [Google Scholar]
  18. Byers-Heinlein K and Werker JF (2013) Lexicon structure and the disambiguation of novel words: Evidence from bilingual infants. Cognition 128, 407–416. [DOI] [PubMed] [Google Scholar]
  19. Carpenter M, Nagell K, Tomasello M, Butterworth G and Moore C (1998) Social cognition, joint attention, and communicative competence from 9 to 15 months of age. Monographs of the Society for Research in Child Development, i–174. [PubMed] [Google Scholar]
  20. Clark EV and Grossman JB (1998) Pragmatic directions and children’s word learning. Journal of Child Language 25, 1–18. [DOI] [PubMed] [Google Scholar]
  21. Comeau L, Genesee F and Mendelson M (2007) Bilingual children’s repairs of breakdowns in communication. Journal of Child Language 34, 159–174. [DOI] [PubMed] [Google Scholar]
  22. Cummins J and Mulcahy R (1978) Orientation to language in Ukrainian-English bilingual children. Child Development, 1239–1242. [Google Scholar]
  23. Diesendruck G (2005) The principles of conventionality and contrast in word learning: an empirical examination. Developmental Psychology 41, 451. [DOI] [PubMed] [Google Scholar]
  24. Dink J and Ferguson B (2018) _eyetrackingR_. R package version 0.1.7, <URL: http://www.eyetracking-R.com.
  25. Fan SP, Liberman Z, Keysar B and Kinzler KD (2015) The exposure advantage: Early exposure to a multilingual environment promotes effective communication. Psychological Science 26, 1090–1097. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Flom R, Lee K and Muir D (2007) Gaze-following: Its development and significance. Mahwah, NJ: Erlbaum. [Google Scholar]
  27. Gangopadhyay I and Kaushanskaya M (2020) The role of speaker eye-gaze and mutual exclusivity in novel word learning by monolingual and bilingual children. Journal of Experimental Child Psychology 197. 10.1016/j.jecp.2020.104878 [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Gangopadhyay I and Kaushanskaya M (under preparation) The effect of speaker reliability on word learning in monolingual and bilingual children. [DOI] [PMC free article] [PubMed]
  29. Genesee F, Tucker GR and Lambert WE (1975) Communication skills of bilingual children. Child Development, 1010–1014. [Google Scholar]
  30. Gogate L and Maganti M (2020) Bilingual versus monolingual infants’ novel word-action mapping before and after first-word production: Influence of developing noun-dominance on perceptual narrowing. Bilingualism: Language and Cognition 23, 148–157. [Google Scholar]
  31. Grassmann S and Tomasello M (2010) Young children follow pointing over words in interpreting acts of reference. Developmental Science 13, 252–263. [DOI] [PubMed] [Google Scholar]
  32. Hirosh Z and Degani T (2018) Direct and indirect effects of multilingualism on novel language learning: An integrative review. Psychonomic bulletin & review 25, 892–916. [DOI] [PubMed] [Google Scholar]
  33. Hirotani M, Stets M, Striano T and Friederici AD (2009) Joint attention helps infants learn new words: event-related potential evidence. Neuroreport 20, 600–605. [DOI] [PubMed] [Google Scholar]
  34. Hollich GJ, Hirsh-Pasek K, Golinkoff RM, Brand RJ, Brown E, Chung HL, Hennon E, Rocroi C and Bloom L (2000) Breaking the language barrier: An emergentist coalition model for the origins of word learning. Monographs of the Society for Research in Child Development, i–135. [PubMed] [Google Scholar]
  35. Horst JS and Hout MC (2016) The Novel Object and Unusual Name (NOUN) Database: A collection of novel images for use in experimental research. Behavior Research Methods 48, 1393–1409. [DOI] [PubMed] [Google Scholar]
  36. Houston-Price C, Plunkett K and Duffy H (2006) The use of social and salience cues in early word learning. Journal of Experimental Child Psychology 95, 27–55. [DOI] [PubMed] [Google Scholar]
  37. Jaeger TF (2009) Post to HLP/Jaeger lab blog. May 14, 2009, http://hlplab.wordpress.com/2009/05/14/random-effect-structure.
  38. Jaswal VK and Hansen MB (2006) Learning words: Children disregard some pragmatic information that conflicts with mutual exclusivity. Developmental Science 9, 158–165. [DOI] [PubMed] [Google Scholar]
  39. Kalashnikova M, Escudero P and Kidd E (2018) The development of fast-mapping and novel word retention strategies in monolingual and bilingual infants. Developmental Science 21, e12674. [DOI] [PubMed] [Google Scholar]
  40. Kalashnikova M, Mattock K and Monaghan P (2014) The effects of linguistic experience on the flexible use of mutual exclusivity in word learning. Bilingualism: Language and Cognition 18, 626–638. [Google Scholar]
  41. Kaufman AS and Kaufman NL (2004) Kaufman Brief Intelligence Test, Second Edition. Bloomington, MN: Pearson, Inc. [Google Scholar]
  42. Kaushanskaya M, Gross M and Buac M (2014) Effects of classroom bilingualism on task-shifting, verbal memory, and word learning in children. Developmental Science 17, 564–583. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Kaushanskaya M and Marian V (2009a) The bilingual advantage in novel word learning. Psychonomic Bulletin & Review 16, 705–710. [DOI] [PubMed] [Google Scholar]
  44. Kaushanskaya M and Marian V (2009b) Bilingualism reduces native-language interference during novel-word learning. Journal of Experimental Psychology. Learning, Memory, and Cognition 35, 829–835. [DOI] [PubMed] [Google Scholar]
  45. Kaushanskaya M and Rechtzigel K (2012) Concreteness effects in bilingual and monolingual word learning. Psychonomic Bulletin & Review 19, 935–941. [DOI] [PubMed] [Google Scholar]
  46. Keshavarz MH and Astaneh H (2004) The impact of bilinguality on the learning of English vocabulary as a foreign language (L3). International Journal of Bilingual Education and Bilingualism 7, 295–302. [Google Scholar]
  47. Marian V, Bartolotti J, Chabal S and Shook A (2012) CLEARPOND: Cross-linguistic easy-access resource for phonological and orthographic neighborhood densities. PloS One 7, e43230. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Marian V, Blumenfeld HK and Kaushanskaya M (2007) The Language Experience and Proficiency Questionnaire (LEAP-Q): Assessing language profiles in bilinguals and multilinguals. Journal of Speech, Language, and Hearing Research 50, 940–967. [DOI] [PubMed] [Google Scholar]
  49. Martin NA and Brownell R (2010) Receptive One-Word Picture Vocabulary Test Fourth Edition (EOWPVT-4). Novato: Academic Therapy Publications. [Google Scholar]
  50. Matatyaho-Bullaro DJ, Gogate L, Mason Z, Cadavid S and Abdel-Mottaleb M (2014) Type of object motion facilitates word mapping by preverbal infants. Journal of Experimental Child Psychology 118, 27–40. [DOI] [PubMed] [Google Scholar]
  51. Meltzoff AN, Gopnik A and Repacholi BM (1999) Toddlers’ understanding of intentions, desires and emotions: Explorations of the dark ages. In Zelazo PD, Astington JW and Olson DR (eds), Developing theories of intention: Social understanding and self-control. Mahwah, NJ: Erlbaum, pp. 17–41. [Google Scholar]
  52. Mirman D (2014) Growth curve analysis and visualization using R. Boca Raton, FL: Chapman & Hall/CRC. [Google Scholar]
  53. Moore C, Angelopoulos M and Bennett P (1999) Word learning in the context of referential and salience cues. Developmental Psychology 35, 60. [DOI] [PubMed] [Google Scholar]
  54. Morales M, Mundy P and Rojas J (1998) Following the direction of gaze and language development in 6-month-olds. Infant Behavior and Development 21, 373–377. [Google Scholar]
  55. Paulus M (2011) How infants relate looker and object: Evidence for a perceptual learning account of gaze following in infancy. Developmental Science 14, 1301–1310. [DOI] [PubMed] [Google Scholar]
  56. Paulus M and Fikkert P (2014) Conflicting social cues: Fourteen- and 24-month-old infants’ reliance on gaze and pointing cues in word learning. Journal of Cognition and Development 15, 43–59. [Google Scholar]
  57. Pearson BZ, Fernández SC, Lewedeg V and Oller DK (1997) The relation of input factors to lexical learning by bilingual infants. Applied Psycholinguistics 18, 41–58. [Google Scholar]
  58. Siegal M, Iozzi L and Surian L (2009) Bilingualism and conversational understanding. Cognition 110, 115–122. [DOI] [PubMed] [Google Scholar]
  59. Singmann H, Bolker B, Westfall J and Aust F (2017) afex: Analysis of Factorial Experiments. R Package Version. [Google Scholar]
  60. Smith K, Smith AD and Blythe RA (2011) Cross‐situational learning: An experimental study of word‐learning mechanisms. Cognitive Science 35, 480–498. [Google Scholar]
  61. Smith L and Yu C (2008) Infants rapidly learn word-referent mappings via cross-situational statistics. Cognition 106, 1558–1568. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Storkel HL (2013) A corpus of consonant–vowel–consonant real words and nonwords: Comparison of phonotactic probability, neighborhood density, and consonant age of acquisition. Behavior Research Methods 45, 1159–1167. [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Thoermer C and Sodian B (2001) Preverbal infants’ understanding of referential gestures. First Language 21, 245–264. [Google Scholar]
  64. Tincoff R and Jusczyk PW (2012) Six-month-olds comprehend words that refer to parts of the body. Infancy 17, 432–444. [DOI] [PubMed] [Google Scholar]
  65. Woodward AL (2005) The infant origins of intentional understanding. Advances in Child Development and Behavior 33, 229–262. [DOI] [PubMed] [Google Scholar]
  66. Yoshida H, Tran DN, Benitez V and Kuwabara M (2011) Inhibition and adjective learning in bilingual and monolingual children. Frontiers in Psychology 2, 210. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Yow WQ and Markman EM (2011a) Young bilingual children’s heightened sensitivity to referential cues. Journal of Cognition and Development 12, 12–31. [Google Scholar]
  68. Yow WQ and Markman EM (2011b) Bilingualism and children’s use of paralinguistic cues to interpret emotion in speech. Bilingualism: Language and Cognition 14, 562–569. [Google Scholar]
  69. Yow WQ and Markman EM (2015) A bilingual advantage in how children integrate multiple cues to understand a speaker’s referential intent. Bilingualism: Language and Cognition 18, 391–399. [Google Scholar]
  70. Yu C and Smith LB (2007) Rapid word learning under uncertainty via cross-situational statistics. Psychological Science 18, 414–420. [DOI] [PubMed] [Google Scholar]
