Author manuscript; available in PMC: 2023 May 1.
Published in final edited form as: Dev Sci. 2021 Aug 19;25(3):e13166. doi: 10.1111/desc.13166

Flexible fast-mapping: Deaf children dynamically allocate visual attention to learn novel words in American Sign Language

Amy M Lieberman 1,*, Allison Fitch 2, Arielle Borovsky 3
PMCID: PMC8818049  NIHMSID: NIHMS1730810  PMID: 34355837

Abstract

Word learning in young children requires coordinated attention between language input and the referent object. Current accounts of word learning are based on spoken language, where the association between language and objects occurs through simultaneous and multimodal perception. In contrast, deaf children acquiring American Sign Language (ASL) perceive both linguistic and non-linguistic information through the visual mode. In order to coordinate attention to language input and its referents, deaf children must allocate visual attention optimally between objects and signs. We conducted two eye-tracking experiments to investigate how young deaf children allocate attention and process referential cues in order to fast-map novel signs to novel objects. Participants were deaf children learning ASL between the ages of 17–71 months. In Experiment 1, participants (n = 30) were presented with a novel object and a novel sign, along with a referential cue that occurred either before or after the sign label. In Experiment 2, a new group of participants (n = 32) were presented with two novel objects and a novel sign, so that the referential cue was critical for identifying the target object. Across both experiments, participants showed evidence for fast-mapping the signs regardless of the timing of the referential cue. Individual differences in children’s allocation of attention during exposure were correlated with their ability to fast-map the novel signs at test. This study provides first evidence for fast-mapping in sign language, and contributes to theoretical accounts of how word learning develops when all input occurs in the visual modality.

Keywords: American Sign Language, novel word learning, fast-mapping, referential cues, visual attention, deaf children

Introduction

Children learn words by attending to the input they perceive and connecting it to the objects and events around them. While this task may sound simple, young word learners are faced with the daunting challenge of linking an ever-changing array of linguistic and referential cues which may not align in space and time. The synchronous, multisensory alignment of the child’s attention to linguistic information in the auditory channel along with social referential cues in the visual channel (e.g. gaze and pointing) can make the referent mapping transparent and support word learning (Baldwin, 1991; Gogate et al., 2000; Suanda et al., 2016, 2019). Yet, for deaf children learning American Sign Language (ASL), the task of mapping input to its referents is necessarily asynchronous, as linguistic input, referential cues, and objects are all perceived sequentially via the visual modality. Thus, children’s ability to allocate attention in a way that enables them to connect language and its referents sequentially, rather than simultaneously, is a normal and necessary aspect of word learning in ASL. Deaf children learning ASL with early and rich exposure show vocabulary growth at rates that match hearing peers (Caselli et al., 2020), thus raising the question of how children learning ASL succeed at connecting asynchronous linguistic and visual information to learn new words. We address this question by focusing on how timing of referential cues in the input impacts fast-mapping of novel ASL signs, and whether individual differences in children’s ability to allocate attention between multiple visual inputs predict their ability to learn new signs.

Allocation of attention to language and referential cues

Infants learn to interpret the social referential cues of their adult interlocutors as meaningful connections between object labels and their referents (Baldwin, 1993; Booth et al., 2008). Although infants will follow gaze from a very early age, it is not until about 12 months that infants reliably make the connection between an adult’s gaze and the referential intent of the object of their gaze (Woodward, 2003); and not until 18 months that children use gaze reliably as an entry point to joint attention (Baldwin, 1991; Moore & Corkum, 1998). Gaze continues to be a powerful social cue for children across the first several years of development (Yurovsky & Frank, 2017), and children rely on such cues to resolve referential ambiguity.

Deaf infants learning ASL have enhanced abilities to follow the gaze of their interlocutors relative to hearing infants (Brooks et al., 2020). By the age of two, deaf children acquiring ASL shift gaze meaningfully and frequently when interacting with adult interlocutors to connect language and visual information (Lieberman et al., 2014). This enhanced gaze cueing persists into childhood (Pavani et al., 2019). However, children learning ASL must learn when and how to best allocate their own visual attention to perceive both linguistic and non-linguistic input, particularly when following adult cues requires a shift in attention from one locus to another.

Most current theories of joint attention describe children’s ability to follow cues as an important milestone in development, but cannot as easily account for children’s developing ability to evaluate multiple sources of input and make in-the-moment decisions about whether to maintain or shift gaze. Observing children who are learning ASL can shine a light on this feature of development. Optimal attention allocation would enable children to perceive linguistic input when it is produced, and to attend to objects and events during pauses in linguistic input. We do not currently know how attention allocation develops, nor whether or how individual differences in children’s ability to allocate attention predict word learning, in much the same way that individual differences in gaze following predict later vocabulary and other linguistic outcomes (Markus et al., 2000). These gaps are addressed in the current study.

Timing of referential cues relative to object labels

The timing of referential cues is critical to infants’ ability to connect the cue, the linguistic input, and the referent object. In child-directed speech, parents often align their own speech with the toddler’s gaze (Gogate et al., 2000), and this multisensory synchrony between linguistic labelling and visual attention can support word learning (Yu & Smith, 2012). Pereira and colleagues (Pereira et al., 2014) found that toddlers were more likely to learn the names of objects when the object being labelled was more centrally located and looming large in the child’s view. Further, parent gesture towards an object just prior to object labelling, and continuing after, appears to increase referential transparency (Gogate & Hollich, 2010; Trueswell et al., 2016). Labelling an object while the infant is directing attention to it, manipulating it, touching it, or fixating it can provide optimal moments for word learning (Yu & Smith, 2012).

While deaf children are adept at responding to referential cues, the timing of such cues that would most support word learning in ASL is unknown. In sign language, simultaneous perception of the label and the object is theoretically possible, but only when the parent has carefully positioned the object between themselves and the child. In addition, the “single, looming, object” described by Pereira et al. (2014) as optimal for labelling is not compatible with ASL-based interactions—a child cannot simultaneously have a single looming object in their visual field and perceive the ASL label for that object. Similarly, the shift in attention towards an object just prior to labelling, which has been found to be an important predictor of referential transparency in spoken language (Trueswell et al., 2016) would be problematic in an ASL interaction if the shift in attention led the child to look away from the interlocutor. On the other hand, if children only fixate on the interlocutor, they will miss the opportunity to map input to its referents. Instead, the child must perceive the object label and referent in sequence.

Word learning in ASL

Deaf children acquiring ASL from birth acquire words at a rate and in a manner that are largely parallel to spoken language development (Anderson & Reilly, 2002; Caselli et al., 2020). Yet only two previous studies have examined novel word learning in ASL. Lederberg and colleagues (Lederberg et al., 2000) studied word learning as a function of children’s use of sign and spoken language. They found that learning through direct reference with overt social pragmatic cues preceded learning through indirect mapping that draws on mutual exclusivity constraints. In a follow-up study with deaf and hard-of-hearing preschoolers (Lederberg & Spencer, 2009), they found a similar developmental progression in which children who showed evidence for learning via indirect mapping had already mastered mapping via direct reference. In the direct reference conditions, the experimenter gazed at, pointed at, and manipulated the object while labelling it, and then handed it to the child to explore. The child’s gaze or attention to the objects or the experimenter was not explicitly coded, though the researchers “used eye gaze, pointing, and manipulation to make reference clear while labeling the objects, and care was taken to ensure the child attended to the object and to the linguistic stimulus either simultaneously or in quick succession” (Lederberg & Spencer, 2009, p. 51).

Current study

We sought to shed light on the conditions under which temporal alignment of attention supports word learning by carrying out two experiments in deaf children acquiring ASL. We measured fast-mapping skills (Carey & Bartlett, 1978; Halberda, 2003) by presenting children with novel objects and novel ASL labels. During exposure, we varied whether the referential cue occurred before or after the object label. We reasoned that children might learn novel words better when the sign label preceded the point, as this would enable them to perceive both the word and the point prior to shifting gaze from the signer to the object. We then assessed children’s fast-mapping abilities. Here we hypothesized that children would successfully fast-map a novel sign to a novel object, and that there would be a positive correlation between children’s age and performance both in terms of allocating attention at exposure and recognizing the target at test. Finally, we asked whether deaf children’s allocation of attention between the novel sign and object during the exposure phase would predict their performance on the fast-mapping task. We predicted that children who allocated attention most efficiently would show more robust mapping of the target signs. Experiments 1 and 2 varied in the degree of referential uncertainty during exposure, by presenting an array of one (Experiment 1) or two (Experiment 2) novel objects. We hypothesized that novel word learning in two-object arrays would not only be more challenging, but also might be more sensitive to the timing of referential cues with respect to the sign label, as the learner would be more reliant on the signer’s referential cues to map the label and object.

Experiment 1: Fast-mapping novel ASL signs to novel objects

Methods

Participants

We recruited 41 deaf children between the ages of 17 and 69 months (Mage = 43 months) who were learning ASL to participate in the study. Participants were recruited from the Northeast and Midwestern United States, through programs serving deaf children and through social media. Of the recruited participants, nine children did not complete any trials of the eye-tracking task. Two participants were excluded for excessive trackloss (described below). The final sample of participants (n=30) was between 17 and 69 months (Mage = 44 months), with 22 males and 8 females. Reported race was White (n =22), Asian (n = 2), and more than one race (n = 4), with 2 participants not reporting. Three participants reported Hispanic/Latinx ethnicity, 25 reported not Hispanic/Latinx, and 4 did not report ethnicity. Twenty-five children had deaf parents and five children had hearing parents. The children with deaf parents were exposed to ASL from birth; children with hearing parents were exposed to ASL between 0–36 months. All children attended an early intervention program that used ASL at the time of testing.

Stimuli

Novel signs and objects

Stimuli consisted of six novel objects that were paired with six novel signs. The novel signs were developed by deaf signing researchers and obeyed the phonotactic constraints of ASL. Six novel objects were selected from a novel object database (Horst & Hout, 2016), and set against a white background square that measured 400 × 400 pixels (approximately 4 inches square). Objects and signs were paired arbitrarily, though we ensured that there was no iconic relationship between the novel signs and the objects they labelled. The six novel objects were then divided into three pairs. The assignment of object pairs to exposure condition was counterbalanced across versions of the experiment. In addition to the novel sign trials, children also saw trials with familiar sign-object pairs, and these were interleaved with the novel sign trials.

Video stimuli
Exposure videos.

All videos featured a deaf, native ASL user signing the stimuli sentences. Exposure consisted of an attention-getting sign, a labelling utterance, a filler period, and then a second labelling utterance (Figure 1). After the first labelling utterance, the signer returned their hands to a resting position, then produced a filler sign (e.g. COOL or WOW), then returned their hands to a resting position again before producing the second labelling utterance. Each exposure video was edited to be exactly 10 seconds long by removing extraneous frames from the beginning and end of the video. We then identified a unique onset and offset point for each novel sign label. We defined sign onset as the first frame where the signer’s hands moved away from the previous sign, and sign offset as the first frame where the signer’s hands transitioned to the next sign. Similarly, we defined the onset of the filler period as the first frame where the signer’s hands moved away from the final sign in the first labelling utterance, and the offset of the filler period as the first frame where the signer’s hands transitioned to the first sign in the second labelling utterance. Each video was cropped to 400 × 400 pixels with the signer appearing against a white background.

Figure 1:

Structure of exposure trial sign video. The signer first produced an attention-getting sign, followed by a labelling utterance, a filler period, and then a second identical labelling utterance. The sequence unfolded over ten seconds.

Exposure trials consisted of three conditions. In the point-word condition, the signer simultaneously pointed and gazed at the novel object first, and then labelled it, e.g. HEY! POINT WHAT? KOBA. COOL! POINT WHAT? KOBA (“Hey! What’s that? That [point and gaze] is a KOBA! Cool! What’s that? That [point and gaze] is a KOBA!”). In the word-point condition, the gaze and point cue occurred after the object label, as follows: HEY! KOBA WHAT? POINT. WOW! KOBA WHAT? POINT (“Hey! A KOBA, what is it? It’s that [point and gaze]! Wow! A KOBA, what is it? It’s that [point and gaze]!”). Finally, in the word only condition, the signer did not produce a gaze and point cue, but instead used the word “see” while looking forward, as follows: HEY! SEE WHAT? KOBA. WOW! SEE WHAT? KOBA (“Hey! Do you see it? It’s a KOBA. Wow! Do you see it? It’s a KOBA.”). Note that ASL does not include determiners, and the demonstrative pronoun “that” is realized as a point.

Test videos.

The test video stimuli were identical across conditions. The video consisted of the signer producing the sentence WHERE KOBA? (“Where is the KOBA?”). Stimuli were edited so that the WHERE sign was 1000 ms and the object label was 1000 ms. The onset of the sign was identified as the first frame when the signer’s hand transitioned away from the final movement of the WHERE sign. The videos were 400 × 400 pixels (4 inches square) with the signer appearing against a white background.

Procedure

The protocol was approved by the Institutional Review Board at Boston University. Parents provided informed consent. Parents and children were seated in front of a 17-inch monitor and SR-Research Eyelink 1000+ eye-tracker which recorded eye movements at 500 Hz. A short, animated movie was played to attract the child’s attention while the experimenter affixed a small sticker on the child’s forehead and focused the camera. Next, a five-point calibration sequence was carried out. The experiment consisted of alternating blocks of eight exposure trials from a single condition followed by eight test trials. Both exposure and test blocks included four novel object trials used in the current experiment and four familiar object trials (not analyzed here). During exposure, each object was labelled four times (twice in each of two trials). During test trials, the same object pairs were presented four times, such that each object in the pair appeared twice as the target and twice as the distractor. This allowed us to control for the possibility that children could employ a strategy of attaching novel labels to objects based on amount of exposure, as each object was seen and labelled equally.
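The counterbalancing of test trials described above can be illustrated with a short Python sketch. This is not the authors' experiment-control code: the helper name and the placeholder object names are hypothetical, and presentation order and side placement are not modelled. It only shows that four presentations of one pair give each object two appearances as target and two as distractor, and that each object receives four labels during exposure.

```python
from collections import Counter


def build_test_trials(object_a, object_b, times_as_target=2):
    """Hypothetical helper: test trials for one object pair.

    Each object serves as target `times_as_target` times, with the
    other member of the pair as distractor.
    """
    trials = []
    for _ in range(times_as_target):
        trials.append({"target": object_a, "distractor": object_b})
        trials.append({"target": object_b, "distractor": object_a})
    return trials


# Placeholder object names; the study used six novel objects in three pairs.
trials = build_test_trials("object_A", "object_B")
target_counts = Counter(t["target"] for t in trials)
distractor_counts = Counter(t["distractor"] for t in trials)

# Exposure: each object is labelled twice per trial, in each of two trials.
labels_per_object = 2 * 2

print(len(trials), dict(target_counts), dict(distractor_counts), labels_per_object)
```

Because both members of a pair are seen and labelled equally often, relative familiarity alone cannot identify the target at test.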

In the exposure trials, children first saw the picture of the novel object on one side of the screen for 2000 ms. Then, the exposure video appeared on the opposite side of the screen, while the novel object picture remained on screen (Figure 2a). The picture and video were 480 pixels (approximately five inches) apart on the monitor. Side placement of the picture and video was counterbalanced across trials. In the test trials, pairs of pictures consisting of a target and distractor first appeared on each side in the lower quadrant of the screen. Next, a fixation cross appeared centered in the top half of the screen. The cross was gaze-contingent, such that when children fixated on the cross, this triggered the onset of the video. The test video then played, and children had two seconds to fixate the pictures. The trial ended when the experimenter pressed a key on the keyboard after the two-second period following the offset of the test video (Figure 2b). Following each block of trials there was a break, during which children saw an engaging video. Between each trial, a small central image appeared on screen, and the experiment only advanced when the experimenter clicked on the image, allowing children to take breaks as needed. There were three blocks in total, one for each exposure condition. The order of blocks was counterbalanced.

Figure 2:

Trial structure for a) exposure trials and b) test trials in Experiment 1. In exposure trials, participants first saw a single object on one side of the screen for 2000ms. Then, the sign video appeared on the opposite side of the screen. In test trials, participants saw pairs of pictures comprising a target and distractor. After 2000ms, a gaze-contingent fixation cross appeared. Once the child fixated the cross, the test video appeared, directing the child to the target object.

Results

Approach to analysis

Analysis focused on gaze patterns during exposure and test trials, first independently and then in relation to one another. All analyses were done using R Version 3.6.2 (R Core Team, 2019) using the lme4 package (Bates et al., 2015) and EyetrackingR (Dink & Ferguson, 2015). Following a trackloss analysis, we first examined gaze patterns during the exposure trials to investigate how children allocated attention between the novel sign and novel object, and whether attention to the novel sign varied as a function of condition. Next, we looked at the test trials to determine whether children showed evidence of fast-mapping novel signs. Finally, we analyzed whether children’s gaze patterns during exposure predicted their performance on the fast-mapping task. While we present data visualizations as fixation proportions, prior to analyzing fixations we first calculate log-gaze transformations, which address specific limitations of raw proportions: first, raw proportions are not linearly independent (i.e. greater looks to one AOI means fewer looks to other AOIs); second, as raw proportions are fixed between zero and one, they violate homogeneity of variance assumptions. This approach has been used in our prior eye-movement studies with both hearing (Borovsky, 2020; Borovsky et al., 2016) and deaf (Lieberman & Borovsky, 2020; Wienholz & Lieberman, 2019) children.

Trackloss

We removed trials in which the child was not looking at any AOI for at least 20% of the time across the trial, which led to exclusion of 38 exposure trials and 53 test trials. We excluded any participants whose data did not include at least 25% of the experimental trials, which led to exclusion of two participants. All subsequent analyses were carried out on the remaining 322 exposure and 307 test trials from 30 participants.
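As a consistency check, the trial counts reported above can be reconciled with the design in a few lines of Python. This sketch assumes that the reported exclusions refer to the novel-object trials of the final 30 participants, and that every retained participant contributed three blocks of four novel-object trials in each phase. Neither assumption is stated explicitly in the text, although the arithmetic is consistent with both.

```python
# Design constants taken from the Procedure section.
PARTICIPANTS = 30            # final sample in Experiment 1
BLOCKS = 3                   # one block per exposure condition
NOVEL_TRIALS_PER_BLOCK = 4   # novel-object trials per exposure or test block


def remaining_trials(excluded):
    """Trials left after trial-level trackloss exclusion (illustrative helper)."""
    total = PARTICIPANTS * BLOCKS * NOVEL_TRIALS_PER_BLOCK
    return total - excluded


exposure_remaining = remaining_trials(excluded=38)
test_remaining = remaining_trials(excluded=53)
print(exposure_remaining, test_remaining)
```

Under these assumptions the totals match the 322 exposure trials and 307 test trials entered into the analyses.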

Allocation of attention during exposure

We first plotted fixations to the video and image during the full 10-second exposure video, to obtain a broad look at how children divided their gaze between the novel object and ASL input. We then homed in on two critical points during the trial: the sign label windows, during which children needed to be fixating the sign video to perceive the sign, and the filler period between labeling utterances, during which children could fixate the novel object without missing critical linguistic input.
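The window-based measures used in the following analyses can be sketched in Python. The sketch assumes a simplified data format in which each 500 Hz sample (one every 2 ms) is coded as "video", "object", or None when the child is not looking at any area of interest (AOI). The function name, the data layout, and the window boundaries are illustrative assumptions, not the authors' pipeline, which used eyetrackingR in R.

```python
SAMPLE_RATE_HZ = 500
MS_PER_SAMPLE = 1000 / SAMPLE_RATE_HZ  # 2 ms per sample


def window_proportion(samples, onset_ms, offset_ms, aoi):
    """Proportion of AOI-directed samples in [onset_ms, offset_ms) on `aoi`.

    The denominator counts only samples on some AOI, so the measure is
    relative to total looking at the screen regions of interest.
    Returns None when the window contains no AOI-directed samples.
    """
    start = int(onset_ms / MS_PER_SAMPLE)
    end = int(offset_ms / MS_PER_SAMPLE)
    on_any_aoi = [s for s in samples[start:end] if s is not None]
    if not on_any_aoi:
        return None
    return sum(1 for s in on_any_aoi if s == aoi) / len(on_any_aoi)


# Made-up 10-second trial (5000 samples), for illustration only.
trial = ["video"] * 1000 + ["object"] * 500 + [None] * 500 + ["video"] * 3000

# Hypothetical window boundaries; in the study they were identified per video.
sign_window = window_proportion(trial, 0, 4000, "video")
filler_window = window_proportion(trial, 2000, 3000, "object")
print(sign_window, filler_window)
```

Restricting the denominator to AOI-directed samples is what allows the later analyses to control for a child's overall likelihood of fixating the screen.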

Full trial

We plotted the time course across the full exposure trial window (Figure 3) by condition. Children largely fixated the sign video throughout the trial, with a sharp increase in looks to the novel object that coincided with the filler period between labelling utterances. Children spent an average of 60% of the time fixating the sign video and 12.8% of the time fixating the novel object across conditions.

Figure 3:

Experiment 1 timecourse of looking to video (signer) and target during exposure trials, by condition. Error bars indicate +/− 1 standard error. Sample exposure sentences are overlaid, with shaded areas showing the analyzed sign labels (blue shading) and filler period (orange shading).

Sign Label

We analyzed fixations to the video during the novel sign label, i.e. the window of time between sign onset and sign offset, which was uniquely identified for each trial. There were two instances of the novel sign during each trial, and these were included as separate data points because they occurred at different points in the trial. We fit a linear mixed-effects model of sign fixations as a log-transformed proportion of total fixations to the sign label [log(PropSign / (1 − PropSign))], with fixed effects for condition and age and random effects for participant and item (Table 1). There were no significant effects of condition, but there were significant age effects. Older children fixated more on the sign than younger children.
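The log-transformed proportion used as the dependent measure can be illustrated with a minimal Python sketch. The clamping constant `eps` is an assumption added here so that proportions of exactly 0 or 1 remain finite; the text does not say how such boundary values were handled, and the function name is hypothetical.

```python
import math


def logit_gaze(prop_sign, eps=0.01):
    """log(p / (1 - p)) for a fixation proportion p.

    `eps` is an illustrative assumption: proportions are clamped to
    [eps, 1 - eps] so that 0 and 1 do not produce infinite values.
    """
    p = min(max(prop_sign, eps), 1 - eps)
    return math.log(p / (1 - p))


for p in (0.2, 0.5, 0.8, 1.0):
    print(p, round(logit_gaze(p), 3))
```

The transformed value is zero when looking is evenly split, is symmetric around that point, and is unbounded, which addresses the limits of raw proportions described in the Approach to analysis.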

Table 1:

Experiment 1 model output: Fixations (LogGaze transformed) to the video during novel sign label as a function of condition and age during exposure.

Predictor      Estimate   Standard Error   df      t       p
Intercept      2.07       0.81             32.71   2.55    0.02
Point-Word*    0.37       0.81             12.86   0.46    0.66
Word-Point*    −0.59      0.81             12.75   −0.73   0.48
Age            0.11       0.04             27.10   2.90    0.007

* Compared to word-only condition.

Filler period

The other critical window was the filler period, i.e. the window of time between the two labelling utterances during which the signer returned her hands to a resting position, then produced a filler sign (e.g. COOL!). Fixations to the novel object during this window would enable children to map the recently provided label onto the object. We fit a linear mixed-effects model of fixations to the novel object with fixed effects for condition and age and random effects for participant and item (Table 2). There were no significant effects of condition, but there were significant age effects. Older children fixated more on the target picture during the filler period than younger children.

Table 2:

Experiment 1 model output: Fixations (LogGaze transformed) to target object during filler period as a function of condition and age.

Predictor      Estimate   Standard Error   df      t       p
Intercept      −2.86      0.42             14.67   −9.19   <0.001
Point-Word*    0.97       0.59             15.01   1.64    0.12
Word-Point*    −0.46      0.59             14.81   −0.78   0.45
Age            0.03       0.01             27.73   2.19    0.04

* Compared to word-only condition.

Allocation of attention during test

To determine whether children had successfully mapped the novel signs onto the novel objects during test trials, we plotted the timecourse of looks to the target and distractor pictures starting from the onset of the target word and continuing through the length of the trial (Figure 4). Children generally fixated the sign video until approximately 600ms following sign onset, and then shifted gaze to the target picture. Following convention from previous studies of ASL lexical recognition (Lieberman & Borovsky, 2020; MacDonald et al., 2018), we analyzed looks from 600–2500ms following sign onset, which we call the “sign recognition window.” However, given that recognition of novel signs is a more difficult task than recognition of familiar signs and thus often occurs later in the timecourse (Bion et al., 2013; Booth & Waxman, 2009; Borovsky et al., 2016; Houston-Price et al., 2010; Mather & Plunkett, 2010), we analyzed a second, late window, from 2500–3500ms following sign onset. For each window, we fit a linear mixed-effects model of fixations as a log-transformed proportion of looking to the target relative to distractor [log10(ProportionTarget/ProportionDistractor)], with fixed effects for trial condition and child age and random effects for participant and item. In the sign recognition window, children looked to the target more than the distractor as indicated by a log gaze greater than zero (M = .41, SD = .5), suggesting they had successfully mapped the novel signs [t(29) = 4.47, p < .001, 95% CI (.25, ∞)]. In the late window, log gaze to the target was again greater than zero (M = .23, SD = .74), although the effect was smaller relative to the early window [t(29) = 1.71, p = .049, 95% CI (.002, ∞)]. There were no effects of exposure condition nor child age on target fixations.
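The test-phase measure and the one-sided comparison against zero can be sketched in Python. This is an illustration, not a reanalysis: it works from the rounded means and standard deviations reported above, so the resulting statistics differ slightly from the published values, and the function names are hypothetical. The critical value 1.699 is the one-sided 5% cutoff for a t distribution with 29 degrees of freedom. The ratio function assumes both proportions are greater than zero.

```python
import math


def log_gaze_ratio(prop_target, prop_distractor):
    """log10(target / distractor); zero means no preference for either picture."""
    return math.log10(prop_target / prop_distractor)


def one_sample_t(mean, sd, n):
    """t statistic and standard error for a test of the mean against zero."""
    se = sd / math.sqrt(n)
    return mean / se, se


T_CRIT_DF29 = 1.699  # one-sided 5% critical value, df = 29
N = 30

# Sign recognition window (reported M = .41, SD = .50).
t_early, se_early = one_sample_t(0.41, 0.50, N)
lower_early = 0.41 - T_CRIT_DF29 * se_early

# Late window (reported M = .23, SD = .74).
t_late, se_late = one_sample_t(0.23, 0.74, N)
lower_late = 0.23 - T_CRIT_DF29 * se_late

print(round(t_early, 2), round(lower_early, 3))
print(round(t_late, 2), round(lower_late, 3))
```

Because the alternative hypothesis is directional (log gaze greater than zero), the confidence interval has only a lower bound, which is why the upper limit is reported as infinite.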

Figure 4:

Experiment 1 timecourse of looks to target and distractor images during test trials starting at target sign onset. Boxes indicate the sign recognition window (600–2500ms after onset) and the late window (2500–3500ms after onset).

Relationship between gaze patterns at exposure and test

Finally, we conducted an analysis to determine whether children who allocated attention between the sign label and the object most efficiently would show greater evidence of word learning. We predicted that children who were more adept at allocating attention during exposure would show greater fixations to the target during test. For each child, we created two exposure trial predictors: (1) log-adjusted proportion of total looking time to the sign video during the sign label window, for all instances of sign labelling across trials and conditions, and (2) log-adjusted proportion of total looking time to the target picture during the filler period, across trials and conditions. In the test trials, we averaged the log-adjusted proportion of total looking time to the target picture across trials and conditions in the sign recognition window. As fixations to specific locations are calculated as a proportion of total fixations to any AOI, this analysis controls for children’s overall likelihood of fixating the screen. We conducted a linear regression using each of the two critical exposure windows to predict proportion looking time to the target during test (Table 3). Proportion of looking to the video during the sign label, but not to the object during the filler period, was a significant predictor of looking at test (full model R²(27) = .23, p = .01). Children who allocated gaze such that they were fixating the signer when the novel label was produced were better at mapping the novel sign to the novel object at test.

Table 3:

Experiment 1 model output: Proportion of fixations (LogGaze transformed) to the video during sign label windows and to the novel object during the filler period as predictors of target looking (600–2500ms window) at test.

Predictor                                   Estimate   SE     t       p
Intercept                                   −3.2       0.99   −3.21   0.003
Fixations to video during sign label        0.25       0.09   2.87    0.007
Fixations to target object during filler    0.24       0.24   0.99    0.33

Summary of Experiment 1

Our results illustrate how deaf children learning ASL divide attention between ASL input and visual stimuli. Children were drawn to the dynamic ASL input, and waited for a natural pause in the input to shift gaze away from the signer and towards the object. Contrary to our initial hypotheses, the timing of the referential gaze and point cue relative to the object label did not have a significant impact on children’s gaze patterns. Children instead waited past the cue until the filler period, and then shifted gaze to the object. Our analysis of target fixations during test trials revealed that children successfully mapped the novel object onto the novel label. Critically, the only significant predictor of target looking at test was children’s fixations to the signer when the novel sign was produced during exposure; neither age nor exposure condition predicted fixations at test. This pattern suggests that children are rapidly assessing the structure of the input sentences, which are consistent in overall prosodic cadence, and then strategically allocating attention during natural prosodic breaks. Children who do this most effectively—i.e. those who allocate attention such that they perceive the novel sign label—are better able to learn the novel words.

Experiment 1 allowed us to determine how children allocate visual attention in a highly simplified visual scene. However, the fact that there was only a single object presented with the novel label raises the possibility that children were simply ignoring the referential cue. That is, the object-sign pairing was such that children did not need to process the direction of the gaze and point, as there were no competing objects on the screen. If children are indeed ignoring the referential cues, then we would expect that, when there is more than one potential referent during the exposure phase, children would not map the novel sign onto the novel object. In contrast, if children are processing the referential cues, then they would still show evidence of fast-mapping even when more than one potential referent is presented during exposure. To address this possibility, in Experiment 2 we investigated the effects of referential cue timing when the referential cue is necessary in order to identify the target object.

Experiment 2: Fast-mapping novel ASL signs with referential ambiguity

We used an identical paradigm to Experiment 1, with one exception: during exposure to the novel sign, children saw two novel objects on the screen. Thus in order to map the novel sign label onto the correct object, children had to attend to the referential cues. This modification allows us to determine what factors children consider when deciding when and where to shift gaze. We consider two possibilities: If children strategically allocate their attention based solely on moments of semantically light linguistic content and prosodic pauses, then they might not make use of the referential cue to guide their looking behavior. Such a strategy would potentially negatively impact mapping of the novel object-label pairing. In contrast, if children are strategically using the best cues that are available in the moment, as we predict, then they’ll make use of the signer’s point to reduce referential uncertainty in a two-referent condition, and will successfully map the novel sign onto the novel object.

Methods

Participants

We recruited 35 deaf children between the ages of 18 and 71 months (Mage = 43.5 months) to participate in the study. Of these, two participants did not complete the eye-tracking task, and one participant was excluded for excessive trackloss. The age range of the remaining 32 children was 18 to 71 months (Mage = 44 months). There were 21 males and 11 females. Reported race was White (n = 27) or more than one race (n = 1), with 4 participants not reporting. Eight participants reported Hispanic/Latinx ethnicity, 22 reported not Hispanic/Latinx, and 2 did not report ethnicity. Twenty-eight children had deaf parents and 4 children had hearing parents. Children with hearing parents were first exposed to ASL between 0 and 36 months. All children attended an early intervention program that used ASL. Seven children had previously participated in Experiment 1, with at least six months in between testing sessions.

Stimuli

Object signs and videos

The novel object signs and videos were a subset (four out of six) of those used in Experiment 1. In Experiment 2, there were two conditions: word-point and point-word.

Procedure

The procedure and trial structure were identical to Experiment 1, except for the following: exposure trials consisted of two novel objects (a target and a distractor) that were displayed on either side of the screen, and the sign video was displayed in the center. In order to accommodate two pictures, we used a 24-inch monitor. During exposure, the images and video, each of which was 400 × 400 pixels (4 inches square), were displayed on the same horizontal plane, spaced approximately 7 inches apart (Figure 5). The object pairs were consistent between exposure and test, i.e., the target and distractor objects that appeared during exposure were the same pairs that appeared during test. There were two blocks each of exposure and test trials. The block order was counterbalanced such that half the children saw the point-word condition first, and the other half saw the word-point condition first.

Figure 5:

Schematic of exposure trials in Experiment 2. During the preview period, the target and distractor were presented on opposite sides of the screen for 2000ms. This was followed by the exposure video, which was presented in the center of the screen.

Results

Approach to analysis

Analysis followed the same structure as Experiment 1.

Trackloss

We removed any individual trials in which the child was not looking at any AOI at least 20% of the time across the trial, which resulted in exclusion of 18 exposure trials and 27 test trials. Next, we excluded any participants whose data did not include at least 25% of the experimental trials, which led to exclusion of one participant. All subsequent analyses were carried out on the remaining 238 exposure and 229 test trials from 32 participants.
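The two-step exclusion described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline (the reference list points to R and the eyetrackingR package): the data layout, the AOI names, the function names, and the value of `trials_per_participant` are all hypothetical. It also assumes one reading of the criterion, namely that a trial is kept when at least 20% of its samples fall on some AOI.

```python
from collections import defaultdict

# Thresholds taken from the text; everything else is a hypothetical layout.
MIN_AOI_PROPORTION = 0.20    # a trial needs at least 20% of samples on some AOI
MIN_TRIAL_PROPORTION = 0.25  # a participant needs at least 25% of trials retained
AOI_LABELS = {"video", "target", "distractor"}  # assumed AOI names


def aoi_proportion(samples):
    """Return the fraction of gaze samples that land on any AOI.

    `samples` is a list of AOI labels; None marks trackloss or a look
    outside every AOI.
    """
    if not samples:
        return 0.0
    on_aoi = sum(1 for s in samples if s in AOI_LABELS)
    return on_aoi / len(samples)


def filter_trackloss(trials, trials_per_participant):
    """Apply the trial-level exclusion, then the participant-level one."""
    kept = [t for t in trials
            if aoi_proportion(t["samples"]) >= MIN_AOI_PROPORTION]

    # Count how many trials each participant still contributes.
    counts = defaultdict(int)
    for t in kept:
        counts[t["participant"]] += 1

    retained = {
        p for p, n in counts.items()
        if n / trials_per_participant >= MIN_TRIAL_PROPORTION
    }
    return [t for t in kept if t["participant"] in retained]


# Toy data: child "A" keeps three trials, child "B" keeps only one.
toy_trials = [
    {"participant": "A", "samples": ["video"] * 8 + [None] * 2},
    {"participant": "A", "samples": ["target"] * 5 + [None] * 5},
    {"participant": "A", "samples": ["video"] * 2 + [None] * 8},
    {"participant": "A", "samples": [None] * 10},
    {"participant": "B", "samples": ["video"] * 6 + [None] * 4},
    {"participant": "B", "samples": ["video"] * 1 + [None] * 9},
]

# Assumed design size of eight trials per child (hypothetical).
clean = filter_trackloss(toy_trials, trials_per_participant=8)
```

In the toy data, child "A" contributes three of eight trials and is retained, while child "B" contributes one of eight and is dropped at the participant level.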

Allocation of attention during exposure phase

Full trial.

As in Experiment 1, children largely fixated on the sign video throughout the trial, with a shift to the novel object that coincided with the filler period (Figure 6). Over the full trial window, children spent an average of 72% of the time fixating the video, 10% of the time fixating the novel object, and 4% of the time fixating the distractor object [target vs. distractor: t(31) = 5.87, p < .001, 95% CI = (.04, .08)]. Children were clearly perceiving and processing the referential cue despite the fact that they did not immediately shift their gaze to follow the cue.

Figure 6:

Experiment 2 timecourse of looking to sign video, target, and distractor during exposure trials, by condition. Error bars indicate +/− 1 standard error. Sample exposure sentences are overlaid, with shaded areas showing the analyzed sign labels (blue shading) and filler period (orange shading).

Sign label.

To analyze fixations to the novel signs, we fit a linear mixed-effects model of sign fixations as a log-transformed proportion of total fixations to the video, with fixed effects for condition and age and random effects for participant and item (Table 4). There were no significant effects of condition or age.
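For readers who prefer to see the model written out, one plausible form is given below. It is a sketch under stated assumptions: the text says only that participant and item were random effects, so the random-intercepts-only structure and all symbol names are assumptions rather than the reported specification.

```latex
% y_{pi}: log-transformed proportion of video fixations for participant p, item i
% WordPoint_{pi}: 1 in the word-point condition, 0 in point-word (reference level)
% Age_p: age of participant p in months
\begin{align}
  y_{pi} &= \beta_0 + \beta_1\,\mathrm{WordPoint}_{pi} + \beta_2\,\mathrm{Age}_p
            + u_p + w_i + \varepsilon_{pi}, \\
  u_p &\sim \mathcal{N}(0, \sigma_u^2), \quad
  w_i \sim \mathcal{N}(0, \sigma_w^2), \quad
  \varepsilon_{pi} \sim \mathcal{N}(0, \sigma^2).
\end{align}
```

Under this reading, the rows of Table 4 correspond to the estimates of \(\beta_0\) (Intercept), \(\beta_1\) (Word-Point), and \(\beta_2\) (Age).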

Table 4:

Experiment 2 model output: Fixations (LogGaze transformed) to the video during novel sign label as a function of condition and age during exposure.

Predictor | Estimate | Standard Error | df | t | p
Intercept | 3.46 | 0.53 | 12.67 | 6.49 | <0.001
Word-Point* | −1.08 | 0.59 | 5.39 | −1.83 | 0.12
Age | 0.04 | 0.03 | 27.04 | 1.64 | 0.11

* Compared to point-word condition

Filler period.

To analyze fixations to the target object during the filler period between labelling utterances, we fit a linear mixed-effects model of log-transformed proportion of total fixations to the target picture, with fixed effects for condition and age and random effects for participant and item (Table 5). There were no significant effects of condition, but there were significant age effects. Older children fixated more on the novel object during the filler period than younger children.

Table 5:

Experiment 2 model output: Fixations (LogGaze transformed) to target object during filler period as a function of condition and age.

Predictor | Estimate | Standard Error | df | t | p
Intercept | −3.8 | 0.3 | 230 | −12.57 | <0.001
Word-Point* | 0.54 | 0.42 | 230 | 1.29 | 0.2
Age | 0.04 | 0.01 | 230 | 2.99 | 0.003

* Compared to point-word condition

Allocation of attention during test.

We plotted the time course of looks to the target and distractor pictures starting from the onset of the target word and continuing for 3500ms (Figure 7). The data visualization revealed that looks to the target and distractor did not diverge until almost 1000ms after sign onset, and that looks to the target picture continued to increase throughout the trial, with greatest looks occurring between 2000–3500ms following sign onset. We conducted a linear mixed-effects regression with log-transformed ratios of target to distractor looking, with fixed effects for exposure condition and child age, and random effects for participant and item, for both the sign recognition (600–2500ms) and late (2500–3500ms) windows. In both the sign recognition (early) and late windows, children looked to the target more than the distractor [early: t(31) = 4.2, p < .001, 95% CI (.29, Inf), M = .49, SD = .66; late: t(31) = 3.36, p = .001, 95% CI (.24, Inf), M = .48, SD = .81], suggesting they had successfully mapped the novel signs. There were no effects of exposure condition or age in either window.
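The reported test statistics can be checked against the reported means and standard deviations. The sketch below assumes that each t value is a one-sample, one-sided test of the mean log ratio against zero with n = 32 (consistent with df = 31 and an upper confidence bound of Inf); that reading, the function name, and the hard-coded critical value are assumptions, not details given in the text. Because the inputs are rounded to two decimals, the recomputed values only approximate the reported ones.

```python
import math

N_PARTICIPANTS = 32            # from the text (df = 31)
T_CRIT_ONE_SIDED_95 = 1.696    # assumed tabled value of t(.95, df = 31)


def one_sample_t(mean, sd, n):
    """Return (t statistic, standard error) for a test of the mean against 0."""
    se = sd / math.sqrt(n)
    return mean / se, se


# Summary statistics as reported for the two analysis windows.
windows = {"early": (0.49, 0.66), "late": (0.48, 0.81)}

results = {}
for label, (mean, sd) in windows.items():
    t_stat, se = one_sample_t(mean, sd, N_PARTICIPANTS)
    # Lower bound of a one-sided 95% confidence interval on the mean.
    lower = mean - T_CRIT_ONE_SIDED_95 * se
    results[label] = (t_stat, lower)
    print(f"{label}: t({N_PARTICIPANTS - 1}) = {t_stat:.2f}, lower bound = {lower:.2f}")
```

Under these assumptions the recomputed statistics are approximately 4.20 and 3.35, with lower bounds of about .29 and .24, in line with the values reported above.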

Figure 7:

Experiment 2 timecourse of looks to target and distractor images during test trials starting at target sign onset. Boxes indicate the sign recognition window (600–2500ms after onset) and the late window (2500–3500ms after onset).

Relationship between gaze patterns at exposure and test

To determine whether children’s looking patterns during exposure predicted performance at test, we used two predictors from the exposure trials: fixations to the sign video during the sign label windows, and fixations to the target picture during the filler period. We conducted a linear regression using the log-transformed proportion looking to sign during the sign window, and log-transformed proportion looking to target picture during the filler period, to predict log-transformed proportion looking time to the target picture during test trials (Table 6). Looking to the target picture during the filler period was particularly important as children had now been provided the referential information necessary to identify which of the two pictures was the target, which they did not have during the preview period. As target looking at test was significant in both the sign recognition window and the late window, we repeated the model with both outcome measures (i.e., target looking in sign recognition and in late windows). In the sign recognition window, looking to the object during the filler period, but not looking to the video during the sign label, predicted proportion looking time to the target picture at test (full model R²(29) = .16, p = .033). In the late window, the results were even more pronounced: both exposure trial looking variables were robust predictors of target recognition at test (full model R²(29) = .38, p < .001). Children who allocated attention in a way that enabled them to perceive the target label and follow the referential cue to map the label onto the correct object were more successful at novel word learning as evidenced by target recognition during test.

Table 6:

Experiment 2 model output: Proportion of fixations (LogGaze transformed) to the video during sign label windows and to the novel object during the filler period as predictors of target looking at test in a) sign recognition (600–2500ms) and b) late (2500–3500ms) windows.

a) Sign recognition window (600–2500ms)

Predictor | Estimate | SE | t | p
Intercept | −1.85 | .95 | −1.95 | 0.06
Fixations to video during sign label | .06 | .12 | .52 | 0.61
Fixations to target object during filler | .56 | .22 | 2.56 | 0.02

b) Late window (2500–3500ms)

Predictor | Estimate | SE | t | p
Intercept | −0.09 | 1.13 | −.08 | 0.94
Fixations to video during sign label | .38 | .15 | 2.58 | 0.02
Fixations to target object during filler | .82 | .26 | 3.15 | 0.004
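Written as equations, the estimates in Table 6 give the following fitted regressions. This is only a restatement of the tabled coefficients; the symbol names are introduced here for illustration, and all three variables are on the log-transformed scale described in the text.

```latex
% x_sign:   log-transformed proportion of looks to the video during the sign label
% x_filler: log-transformed proportion of looks to the target during the filler period
% \hat{y}:  predicted log-transformed proportion of target looking at test
\begin{align}
  \hat{y}_{\text{recognition}} &= -1.85 + 0.06\,x_{\text{sign}} + 0.56\,x_{\text{filler}}, \\
  \hat{y}_{\text{late}}        &= -0.09 + 0.38\,x_{\text{sign}} + 0.82\,x_{\text{filler}}.
\end{align}
```

In both windows the coefficient on filler-period target looking is the larger of the two, and only in the late window is the sign-label coefficient reliably different from zero.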

Summary of Experiment 2

Children successfully mapped a novel ASL sign onto a novel object, as illustrated by increased looks to the target vs. distractor pictures in the test phase. As with Experiment 1, the timing of the referential cues that accompanied the novel sign labels (point-word vs. word-point) did not have an effect on looking patterns during either exposure or test. Although the timing of children’s gaze shifts did not differ by condition, the children were clearly attending to and interpreting the referential cue, as indicated by increased fixations to the target vs. the unnamed distractor during the exposure trial. Once again, children’s ability to successfully allocate and alternate attention between the sign video and target picture robustly predicted their ability to learn new signs. Specifically, in the exposure phase children’s attention to the signer during the object labels and children’s attention to the target picture during the filler period each predicted overall attention to the target picture at test.

Discussion

There is a growing consensus that early word learning is supported by aligning synchronous cues in the visual and auditory domain (Gogate et al., 2000; Tomasello & Todd, 1983; Yu & Smith, 2012, 2013). However, it is clear that children can show typical rates of vocabulary growth even when these cues appear sequentially in the visual domain only, as is the case in children learning ASL (Caselli et al., 2020; Lillo-Martin, 1999). What consequences does de-coupling synchronous social referential and linguistic cues into sequential but temporally adjacent events have on early language learning? We sought to gain insight into this question in two studies, where we measured how deaf children learn words by allocating visual attention between linguistic and referential cues that varied in sequential ordering. We presented children with a novel object and paired it with a novel ASL sign, to determine whether and how children would map the label to the object. Children successfully mapped the novel sign onto the novel object in both experiments, providing a first demonstration that fast-mapping can be used to map ASL signs onto their referents. Age was a significant predictor of fixation patterns during exposure but not at test.

We varied the timing of the referential cues relative to the object label during exposure. Contrary to our expectations, timing of the referential cue did not impact gaze patterns during exposure nor target fixations at test. Our results revealed a different story: regardless of when the cue appeared, children maintained fixation to the signer throughout the labelling utterance and only shifted gaze to the target object during a prosodic pause that contained minimal input. Crucially, children varied in how effectively they allocated attention during exposure to perceive the target sign, and—in Experiment 2—the target object, and these individual differences in allocation predicted performance at test. Thus, while children as a whole showed evidence for fast-mapping novel words, individual children’s effective allocation of attention when exposed to new words predicted their success in mapping those words onto their referents.

Allocation of attention to signers and objects

Children learning new words in ASL must alternate attention strategically to perceive the sign label, referential cues, and the referent object. Although linguistic input is dynamic, structured, and highly salient, children must either shift away from ongoing input or wait for pauses in the input to connect the linguistic content to the surrounding visual scene. Recent work has shown that young ASL-learning children rapidly appraise the incoming language and shift gaze away from linguistic input and towards a target precisely when they have sufficient information to identify the target (MacDonald et al., 2020). Our work adds to the growing body of evidence that children learning ASL are exquisitely attuned to the temporal dynamics of the communicative context, and that they can leverage this attentional flexibility to support language acquisition. Children’s ability to allocate attention improved with age; we speculate that vocabulary would be an equally if not more powerful predictor of this ability.

Children used a unique strategy to decide when to shift gaze: children encoded the referential cue when it was presented, but maintained gaze on the unfolding linguistic input until there was a prosodic break in the input. Only at this point did children shift gaze from the signer to the referent object. In the case of Experiment 1, a possible explanation for the observed pattern of results is that children ignored the referential cue completely. However, this is unlikely, given the salience of these referential cues for both hearing (Baldwin, 1993; Deák et al., 2014; Houston-Price et al., 2006; Moore, 2008) and deaf children (Brooks et al., 2020). In addition, our findings from the exposure phase of Experiment 2, in which children fixated the target more than the distractor, indicate that children did in fact encode the referential cue. Thus, unlike in spoken language, where children use visual referential cues to make an immediate gaze shift to the target (Bertenthal et al., 2014; Brooks & Meltzoff, 2005), children with exposure to ASL waited for optimal moments to shift gaze such that they did not miss any meaningful linguistic input, potentially demonstrating inhibition of gaze shift.

The fact that children successfully learned words even when the linguistic and referential cues were decoupled and presented sequentially requires an expanded understanding of the necessary conditions for language learning. There is robust evidence from spoken language that children learn words best when labels and their referents line up in time and space (Chen et al., 2020; Yu & Smith, 2013, 2017). The current results provide initial evidence that children may learn equally well when linguistic and referential cues occur in direct sequence. Children who are learning language under typical circumstances, as the majority of deaf children in the current study were, can adapt to variations in sensory experience: instead of sustained visual attention to objects, children in the current study alternated visual attention, and did so in a way that enabled them to encode and process all of the meaningful information in the input.

Individual differences

Children’s gaze patterns during exposure trials were correlated with their ability to fast-map the target item at test. In Experiment 1, children’s fixation to the novel sign label within the labelling utterance predicted target looking at test. In Experiment 2, fixation to the novel sign label and fixation to the target picture during the filler period each explained unique variance in target looking at test. Our findings reveal that it is not just attending to the input, but rather attending to the right type of stimulus (language vs. object) at the right moments that helps children map the referents effectively.

While previous work on deaf children’s development of visual attention has typically included small participant samples (Chasin & Harris, 2008; Harris & Mohay, 1997; Lieberman et al., 2014), the larger sample here allows us to explore individual differences in attention allocation. Our results provide initial evidence that word learning is enhanced when children strategically alternate attention between language and objects. These findings are a first step in understanding how children vary in their ability to allocate attention optimally. The study highlights several avenues for future inquiry as to when and how language learning and attentional allocation interact in development. Children may exhibit flexible development of attention based on the specific demands of their input modality. One likely source of this flexibility is parental scaffolding during early interactions. In particular, deaf parents are known to structure early interactions in a way that supports children’s perception of linguistic input. At the earliest ages, such scaffolding takes the form of modified signs and overt attention-getters to direct the child’s attention (Lartz & Lestina, 1995; Pizer & Meier, 2008; Spencer & Harris, 2005). However, by 18 months, parents are less likely to rely on these directive behaviors and instead wait for children’s spontaneous gaze shifts to provide input (Swisher, 2000). With this scaffolding, children become adept at shifting gaze between linguistic and visual input such that they make rapid gaze shifts and respond to subtle linguistic and prosodic cues to decide when to do so (Lieberman et al., 2014; Pavani et al., 2019). We speculate that experience participating in interactions with skilled adult interlocutors, along with the child’s developing skills in evaluating and responding to cues in the input, are two likely contributors to children’s ability to allocate attention optimally.

Word learning in ASL across development

While fast-mapping is a known and robust phenomenon among children learning spoken language (Bion et al., 2013; Carey & Bartlett, 1978; Halberda, 2003; Horst & Samuelson, 2008), the current study is the first to our knowledge to demonstrate that children can rapidly map and retain word meanings, irrespective of the sensory modality of linguistic input. Across both experiments, there were increased fixations to the target picture relative to the distractor picture at test, both in a sign recognition window that has been established in previous studies of sign recognition in deaf children (Lieberman & Borovsky, 2020; MacDonald et al., 2018), as well as in a later window. The late window provided more robust target fixations in Experiment 2, which likely reflects the more difficult nature of the task relative to Experiment 1: in Experiment 1, children only had to encode that an object and label were paired, whereas in Experiment 2 children saw the same object pairs during exposure and test trials, and thus had to retain not only an association between the sign and object, but the mapping of the specific sign-object pair through the use of referential cues. The pattern of fixations in Experiment 2 is also consistent with prior studies that observe a more extended time course of novel word recognition in spoken language (Bion et al., 2013; Booth & Waxman, 2009; Borovsky et al., 2016; Houston-Price et al., 2006; Mather & Plunkett, 2010). Our findings shed light on mapping over the course of the experiment; future work should focus on retention of these associations which support robust word learning.

Given the broad age range of the children in the present study, it is likely that children can leverage additional word-learning strategies as they get older and gain more experience linking labels to their referents across situations. Specifically, though even very young children are capable of encoding and retrieving long-term memories, these skills improve tremendously across childhood (Lukowski & Bauer, 2013). These cognitive mechanisms may interact with children’s performance on the task, and likely account for some of the age-related differences observed in the current study. Similarly, the child’s own experiences with ASL input might influence their abilities. Deaf children, particularly those with hearing parents, have highly variable early language learning environments, as parents may be learning ASL alongside their children. Emerging evidence suggests that deaf children with hearing parents can acquire ASL at age-appropriate rates (Caselli et al., 2021), yet it remains unknown how the processes and mechanisms underlying word learning are affected by variable exposure to non-native input. We speculate that many of the mechanisms that support word learning in spoken language would be equally supportive of word learning in ASL.

Conclusion

Word learning is a complex phenomenon, and yet children seemingly accomplish this task with ease. Sign language input, unlike spoken language, competes with the visual input through which children encounter objects in the world. Instead of perceiving multi-modal information simultaneously, children acquiring ASL must interpret and connect sequential information to map labels onto objects. Therefore, the study of word learning processes in deaf children acquiring ASL can enrich our understanding of how all learners adapt fundamental mechanisms of word learning. The findings presented here demonstrate that children can evaluate and connect the unfolding input to the surrounding visual world by strategically and actively allocating visual attention across time and space. Simultaneity of linguistic and attentional cues is therefore not a requirement for successful word learning. Our findings suggest that recent accounts of language acquisition that prioritize temporal synchrony of linguistic and referential cues may need refinement. We find that learning may be similarly supported by sequential and temporally adjacent cues and the learner’s own ability to optimize attentional flexibility given the sensory landscape of the learning environment. That is, word learning can proceed successfully even in a case where language input and the objects to which it refers are perceived sequentially, attesting to the remarkable adaptability of children’s ability to divide attention in order to connect word to world.

Research Highlights.

  • Investigated how young deaf children learning American Sign Language learn novel words by flexibly allocating visual attention between sign labels and objects.

  • Deaf children between 17–71 months old participated in two eye-tracking experiments in which they were exposed to novel signs and variably-timed referential cues.

  • Children showed evidence for fast-mapping novel signs to objects, regardless of the timing of referential cues. Individual differences in attention allocation predicted word learning.

  • Findings advance understanding of word learning in the visual modality and contribute to theoretical accounts of the coordination of cues that effectively support word learning.

Acknowledgements:

We are grateful to the many individuals who helped with stimuli creation, recruitment, and data collection: Conrad Baer, Aiken Bottoms, Justin Bergeron, Kerianna Chamberlain, Brittany Farr, Zoe Fieldsteel, Deanna Gagne, Michael Higgins, and Erin Spurgeon. We thank the families for their participation. This work was supported by the National Institutes of Health/National Institute on Deafness and other Communication Disorders grants DC015272 and DC013638, and the Eunice Kennedy Shriver National Institute of Child Health and Human Development grant HD052120.

Footnotes

Conflicts of interest: The authors declare no conflict of interest.

Data availability: The data that support the findings of this study are available upon reasonable request from the corresponding author (alieber@bu.edu).

References

  1. Anderson D, & Reilly J (2002). The MacArthur Communicative Development Inventory: Normative Data for American Sign Language. Journal of Deaf Studies and Deaf Education, 7(2), 83–106. 10.1093/deafed/7.2.83 [DOI] [PubMed] [Google Scholar]
  2. Baldwin DA (1991). Infants’ Contribution to the Achievement of Joint Reference. Child Development, 62(5), 875–890. 10.1111/j.1467-8624.1991.tb01577.x [DOI] [PubMed] [Google Scholar]
  3. Baldwin DA (1993). Early referential understanding: Infants’ ability to recognize referential acts for what they are. Developmental Psychology, 29(5), 832–843. 10.1037/0012-1649.29.5.832 [DOI] [Google Scholar]
  4. Bates D, Maechler M, & Bolker B (2015). Fitting linear mixed-effects models using lme4. J Stat Softw, 67(1), 1–48. [Google Scholar]
  5. Bertenthal BI, Boyer TW, & Harding S (2014). When do infants begin to follow a point? Developmental Psychology, 50(8), 2036–2048. 10.1037/a0037152 [DOI] [PubMed] [Google Scholar]
  6. Bion RH, Borovsky A, & Fernald A (2013). Fast mapping, slow learning: Disambiguation of novel word–object mappings in relation to vocabulary learning at 18, 24, and 30 months. Cognition, 126(1), 39–53. 10.1016/j.cognition.2012.08.008 [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Booth AE, McGregor KK, & Rohlfing KJ (2008). Socio-Pragmatics and Attention: Contributions to Gesturally Guided Word Learning in Toddlers. Language Learning and Development, 4(3), 179–202. 10.1080/15475440802143091 [DOI] [Google Scholar]
  8. Booth AE, & Waxman SR (2009). A Horse of a Different Color: Specifying With Precision Infants’ Mappings of Novel Nouns and Adjectives. Child Development, 80(1), 15–22. 10.1111/j.1467-8624.2008.01242.x [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Borovsky A (2020). When slowing down processing helps learning: Lexico-semantic structure supports retention, but interferes with disambiguation of novel object-label mappings. Developmental Science, 23(6), e12963. 10.1111/desc.12963 [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Borovsky A, Ellis EM, Evans JL, & Elman JL (2016). Lexical leverage: Category knowledge boosts real-time novel word recognition in 2-year-olds. Developmental Science, 19(6), 918–932. 10.1111/desc.12343 [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Brooks R, & Meltzoff AN (2005). The development of gaze following and its relation to language. Developmental Science, 8(6), 535–543. 10.1111/j.1467-7687.2005.00445.x [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Brooks R, Singleton JL, & Meltzoff AN (2020). Enhanced gaze-following behavior in Deaf infants of Deaf parents. Developmental Science, 23(2), e12900. 10.1111/desc.12900 [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Carey S, & Bartlett E (1978). Acquiring a Single New Word.
  14. Caselli NK, Lieberman AM, & Pyers JE (2020). The ASL-CDI 2.0: An updated, normed adaptation of the MacArthur Bates Communicative Development Inventory for American Sign Language. Behavior Research Methods. 10.3758/s13428-020-01376-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Caselli N, Pyers J, & Lieberman AM (2021). Deaf Children of Hearing Parents Have Age-Level Vocabulary Growth When Exposed to ASL by Six-Months. The Journal of Pediatrics. 10.1016/j.jpeds.2021.01.029 [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Chasin J, & Harris M (2008). The development of visual attention in deaf children in relation to mother’s hearing status. Polish Psychological Bulletin, 39(1), 1–8. 10.2478/v10059-008-0001-z [DOI] [Google Scholar]
  17. Chen C, Castellanos I, Yu C, & Houston DM (2020). What leads to coordinated attention in parent–toddler interactions? Children’s hearing status matters. Developmental Science, 23(3), e12919. 10.1111/desc.12919 [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Deák GO, Krasno AM, Triesch J, Lewis J, & Sepeta L (2014). Watch the hands: Infants can learn to follow gaze by seeing adults manipulate objects. Developmental Science, 17(2), 270–281. 10.1111/desc.12122 [DOI] [PubMed] [Google Scholar]
  19. Dink JW, & Ferguson B (2015). Eye-trackingR: An R Library for Eye-tracking Data Analysis. http://www.eyetracking-r.com/
  20. Gogate LJ, Bahrick LE, & Watson JD (2000). A Study of Multimodal Motherese: The Role of Temporal Synchrony between Verbal Labels and Gestures. Child Development, 71(4), 878–894. 10.1111/1467-8624.00197 [DOI] [PubMed] [Google Scholar]
  21. Gogate LJ, & Hollich G (2010). Invariance detection within an interactive system: A perceptual gateway to language development. Psychological Review, 117(2), 496–516. 10.1037/a0019049 [DOI] [PubMed] [Google Scholar]
  22. Halberda J (2003). The development of a word-learning strategy. Cognition, 87(1), B23–B34. 10.1016/S0010-0277(02)00186-5 [DOI] [PubMed] [Google Scholar]
  23. Harris M, & Mohay H (1997). Learning to Look in the Right Place: A Comparison of Attentional Behavior in Deaf Children With Deaf and Hearing Mothers. The Journal of Deaf Studies and Deaf Education, 2(2), 95–103. 10.1093/oxfordjournals.deafed.a014316 [DOI] [PubMed] [Google Scholar]
  24. Horst JS, & Hout MC (2016). The Novel Object and Unusual Name (NOUN) Database: A collection of novel images for use in experimental research. Behavior Research Methods, 48(4), 1393–1409. 10.3758/s13428-015-0647-3 [DOI] [PubMed] [Google Scholar]
  25. Horst JS, & Samuelson LK (2008). Fast Mapping but Poor Retention by 24-Month-Old Infants. Infancy, 13(2), 128–157. 10.1080/15250000701795598 [DOI] [PubMed] [Google Scholar]
  26. Houston-Price C, Caloghiris Z, & Raviglione E (2010). Language Experience Shapes the Development of the Mutual Exclusivity Bias. Infancy, 15(2), 125–150. 10.1111/j.1532-7078.2009.00009.x [DOI] [PubMed] [Google Scholar]
  27. Houston-Price C, Plunkett K, & Duffy H (2006). The use of social and salience cues in early word learning. Journal of Experimental Child Psychology, 95(1), 27–55. 10.1016/j.jecp.2006.03.006 [DOI] [PubMed] [Google Scholar]
  28. Lartz MN, & Lestina LJ (1995). Strategies deaf mothers use when reading to their young deaf or hard of hearing children. American Annals of the Deaf, 140(4), 358–362. 10.1353/aad.2012.0358
  29. Lederberg AR, Prezbindowski AK, & Spencer PE (2000). Word-Learning Skills of Deaf Preschoolers: The Development of Novel Mapping and Rapid Word-Learning Strategies. Child Development, 71(6), 1571–1585. 10.1111/1467-8624.00249
  30. Lederberg AR, & Spencer PE (2009). Word-Learning Abilities in Deaf and Hard-of-Hearing Preschoolers: Effect of Lexicon Size and Language Modality. The Journal of Deaf Studies and Deaf Education, 14(1), 44–62. 10.1093/deafed/enn021
  31. Lieberman AM, & Borovsky A (2020). Lexical Recognition in Deaf Children Learning American Sign Language: Activation of Semantic and Phonological Features of Signs. Language Learning. 10.1111/lang.12409
  32. Lieberman AM, Hatrak M, & Mayberry RI (2014). Learning to Look for Language: Development of Joint Attention in Young Deaf Children. Language Learning and Development, 10(1), 19–35. 10.1080/15475441.2012.760381
  33. Lillo-Martin D (1999). Modality effects and modularity in language acquisition: The acquisition of American Sign Language. In Ritchie W & Bhatia T (Eds.), Handbook of child language acquisition (pp. 531–567). Academic Press.
  34. Lukowski AF, & Bauer PJ (2013). Long-Term Memory in Infancy and Early Childhood. In The Wiley Handbook on the Development of Children’s Memory (pp. 230–254). John Wiley & Sons, Ltd. 10.1002/9781118597705.ch11
  35. MacDonald K, LaMarr T, Corina D, Marchman VA, & Fernald A (2018). Real-time lexical comprehension in young children learning American Sign Language. Developmental Science, 21(6), e12672. 10.1111/desc.12672
  36. MacDonald K, Marchman VA, Fernald A, & Frank MC (2020). Children flexibly seek visual information to support signed and spoken language comprehension. Journal of Experimental Psychology: General, 149(6), 1078–1096. 10.1037/xge0000702
  37. Markus J, Mundy P, Morales M, Delgado CEF, & Yale M (2000). Individual Differences in Infant Skills as Predictors of Child-Caregiver Joint Attention and Language. Social Development, 9(3), 302–315. 10.1111/1467-9507.00127
  38. Mather E, & Plunkett K (2010). Novel labels support 10-month-olds’ attention to novel objects. Journal of Experimental Child Psychology, 105(3), 232–242. 10.1016/j.jecp.2009.11.004
  39. Moore C (2008). The Development of Gaze Following. Child Development Perspectives, 2(2), 66–70. 10.1111/j.1750-8606.2008.00052.x
  40. Moore C, & Corkum V (1998). Infant gaze following based on eye direction. British Journal of Developmental Psychology, 16(4), 495–503. 10.1111/j.2044-835X.1998.tb00767.x
  41. Pavani F, Venturini M, Baruffaldi F, Caselli MC, & van Zoest W (2019). Environmental Learning of Social Cues: Evidence From Enhanced Gaze Cueing in Deaf Children. Child Development, 90(5), 1525–1534. 10.1111/cdev.13284
  42. Pereira AF, Smith LB, & Yu C (2014). A bottom-up view of toddler word learning. Psychonomic Bulletin & Review, 21(1), 178–185. 10.3758/s13423-013-0466-4
  43. Pizer G, & Meier RP (2008). Child-directed signing in ASL and children’s development of joint attention. Proceedings of the 9th International Conference on Theoretical Issues in Sign Language Research, Florianopolis, Brazil. Petrópolis/Rio de Janeiro, Brazil: Editora Arara Azul, 459–474.
  44. R Core Team. (2019). R: A language and environment for statistical computing. https://www.R-project.org/
  45. Spencer PE, & Harris M (2005). Patterns and effects of language input to deaf infants and toddlers from deaf and hearing mothers. In Schick B, Marschark M, & Spencer PE (Eds.), Advances in the Sign-Language Development of Deaf Children (pp. 71–101). Oxford University Press. https://www.oxfordscholarship.com/view/10.1093/acprof:oso/9780195180947.001.0001/acprof-9780195180947
  46. Suanda SH, Barnhart M, Smith LB, & Yu C (2019). The Signal in the Noise: The Visual Ecology of Parents’ Object Naming. Infancy, 24(3), 455–476. 10.1111/infa.12278
  47. Suanda SH, Smith LB, & Yu C (2016). The Multisensory Nature of Verbal Discourse in Parent–Toddler Interactions. Developmental Neuropsychology, 41(5–8), 324–341. 10.1080/87565641.2016.1256403
  48. Swisher MV (2000). Learning to converse: How deaf mothers support the development of attention and conversational skills in their young deaf children. In Spencer P, Erting CJ, & Marschark M (Eds.), The deaf child in the family and at school: Essays in honor of Kathryn P. Meadow-Orlans (pp. 21–39). Lawrence Erlbaum Associates Publishers.
  49. Tomasello M, & Todd J (1983). Joint attention and lexical acquisition style. First Language, 4(12, Pt 3), 197–211. 10.1177/014272378300401202
  50. Trueswell JC, Lin Y, Armstrong B, Cartmill EA, Goldin-Meadow S, & Gleitman LR (2016). Perceiving referential intent: Dynamics of reference in natural parent–child interactions. Cognition, 148, 117–135. 10.1016/j.cognition.2015.11.002
  51. Wienholz A, & Lieberman AM (2019). Semantic processing of adjectives and nouns in American Sign Language: Effects of reference ambiguity and word order across development. Journal of Cultural Cognitive Science, 3(2), 217–234. 10.1007/s41809-019-00024-6
  52. Woodward AL (2003). Infants’ developing understanding of the link between looker and object. Developmental Science, 6(3), 297–311. 10.1111/1467-7687.00286
  53. Yu C, & Smith LB (2012). Embodied attention and word learning by toddlers. Cognition, 125(2), 244–262. 10.1016/j.cognition.2012.06.016
  54. Yu C, & Smith LB (2013). Joint Attention without Gaze Following: Human Infants and Their Parents Coordinate Visual Attention to Objects through Eye-Hand Coordination. PLOS ONE, 8(11), e79659. 10.1371/journal.pone.0079659
  55. Yu C, & Smith LB (2017). Multiple Sensory-Motor Pathways Lead to Coordinated Visual Attention. Cognitive Science, 41(S1), 5–31. 10.1111/cogs.12366
  56. Yurovsky D, & Frank MC (2017). Beyond naïve cue combination: Salience and social cues in early word learning. Developmental Science, 20(2), e12349. 10.1111/desc.12349
