Author manuscript; available in PMC: 2023 Mar 1.
Published in final edited form as: Dev Psychol. 2022 Mar;58(3):405–416. doi: 10.1037/dev0001285

Mothers Talk About Infants’ Actions: How Verbs Correspond to Infants’ Real-Time Behavior

Kelsey L West 1, Katelyn K Fletcher 2, Karen E Adolph 1, Catherine S Tamis-LeMonda 2
PMCID: PMC9012493  NIHMSID: NIHMS1792493  PMID: 35286106

Abstract

Infants learn nouns during object-naming events, moments when caregivers name the object of infants’ play (e.g., ball as infant holds a ball). Do caregivers also label the actions of infants’ play (e.g., roll as infant rolls a ball)? We investigated connections between mothers’ verb inputs and infants’ actions. We video-recorded 32 infant-mother dyads for 2 hr at home (13 month olds, n = 16; 18 month olds, n = 16; girls, n = 16; White, n = 23; Asian, n = 2; Black, n = 1; other, n = 1; multiple races, n = 5; Hispanic/Latinx, n = 2). Dyads were predominantly from middle-class to upper middle-class households. We identified each manual verb (e.g., press, shake) and whole-body verb (e.g., kick, go) that mothers directed to infants. We coded whether infants displayed manual and/or whole-body actions during a 6-s window surrounding the verb (i.e., 3 s prior and 3 s after the named verb). Mothers’ verbs and infant actions were largely congruent: Whole-body verbs co-occurred with whole-body actions, and manual verbs co-occurred with manual actions. Moreover, half of mothers’ verbs corresponded precisely to infants’ concurrent action (e.g., infant pressed button as mother said, “Press the button”). In most instances, mothers commented on rather than instigated infants’ actions. Findings suggest that verb learning is embodied, such that infants’ motor actions offer powerful cues to verb meanings. Furthermore, our approach highlights the value of cross-domain research integrating infants’ developing motor and language skills to understand word learning.

Keywords: language input, verb learning, dyadic interactions, motor development, word learning


Verbs are notoriously difficult to learn. Infants must attend to unfolding events, extract common features among them, and connect the inferred meaning to a word (Gentner, 2006; Gleitman & Gleitman, 1992; Golinkoff & Hirsh-Pasek, 2008; Tomasello, 1992). For example, animals, toddlers, parents, and professional athletes can all jump, but they do so in perceptually different ways. To learn the word jump, infants must identify commonalities among different jumping events, even though the action is fleeting, and connect the jumping action to the verb. Nevertheless, infants learn hundreds of verbs by age two (Bloom, 1993; Braunwald, 1995; Clark, 1996; Fenson et al., 1994; Tomasello, 1992). How do infants learn verbs despite their varied and dynamic nature?

Likely, social and contextual cues are critical. Language learning does not occur in a vacuum. Instead, infants learn words during everyday back-and-forth exchanges with caregivers. For example, as infants manipulate objects, caregivers respond by naming and describing the objects of infants’ action (Custode & Tamis-LeMonda, 2020; Tamis-LeMonda et al., 2013; West & Iverson, 2017; Yu & Smith, 2012). These object-naming events, in which caregivers use nouns to reference the objects of infants’ attention, facilitate noun learning (e.g., Pereira et al., 2014; Yu & Smith, 2012). The predominant focus on nouns and object-naming events begs for extension to verbs and action-naming events—that is, moments when verbs temporally align with infant actions. We hypothesized that infants’ motor actions predictably frame caregivers’ verb inputs and provide cues to verb meaning. For instance, when an infant stacks blocks as a caregiver says, “Can you build?”, the connection between the word build and the stacking action is likely reinforced (e.g., Yu & Smith, 2007). If our hypothesis is correct, then infants—through their spontaneous motor actions—are agents in the verbs they hear. Synchrony between what infants do and the words they hear may be useful for verb learning (e.g., Shatz, 1978).

Here, we took a critical step to test whether caregivers’ verbs, like nouns, are connected to infants’ behavior. Specifically, we asked whether mothers’ infant-directed verbs correspond in time and meaning to infants’ actions based on observations of natural activity in the home.

Connections Between Words and Infant Actions

Caregivers are highly responsive to infant actions. Approximately 50% to 70% of caregivers’ utterances occur in contingent response to infant behaviors (e.g., Bornstein et al., 2008; Tamis-LeMonda et al., 2013). As infants play with toys and objects, caregivers often contingently respond with the corresponding noun (i.e., object-naming events unfold; Custode & Tamis-LeMonda, 2020; Elmlinger et al., 2019; Suanda et al., 2019; Tamis-LeMonda et al., 2013; West & Iverson, 2017; Yu & Smith, 2012). Thus, infant actions with objects often direct the conversation. Importantly, object-naming events increase in frequency and complexity across development (Chang et al., 2015, 2016; West & Iverson, 2017). As infants develop new ways to act on objects (e.g., learning to stack blocks as opposed to shaking or mouthing them), caregivers increasingly offer corresponding nouns, and also decrease their use of irrelevant nouns (West & Iverson, 2017).

Laboratory-based research indicates that object-naming events facilitate infants’ noun learning (Pereira et al., 2014; Smith et al., 2011; Yu & Smith, 2012). For example, infants learn novel nouns most successfully when the referent noun corresponds to a held, visually prominent object (Yu & Smith, 2012). Such object-naming events may facilitate learning because the held object dominates infants’ view, occluding other competing objects and highlighting the nouns’ referent (e.g., Pereira et al., 2014; Smith et al., 2011; Yoshida & Smith, 2008). The connection between infant actions with objects and the nouns they elicit (and subsequently learn) prompted researchers to consider noun learning as an “embodied” process that unfolds within the constraints of, and is influenced by, infants’ own motor actions and abilities (e.g., Iverson, 2010; Needham & Libertus, 2011; Smith & Gasser, 2005).

It is unclear whether a similar process unfolds for verb learning. To what extent do action-naming events characterize infants’ language learning environment? According to our “action-naming hypothesis,” infants learn verbs the same way that they learn nouns: in the context of their own actions as they move their bodies and manipulate objects. Yet, relative to work on nouns, less is known about whether—or how often—caregivers’ verb inputs correspond to infant actions. Prior research yielded valuable information about the number and types of verbs that adults direct to infants and the semantic categories and sentences that frame verb inputs (Hsu et al., 2017; Laakso & Smith, 2007; Naigles & Hoff-Ginsberg, 1995, 1998; Tamis-LeMonda et al., 2019). Most existing descriptions of natural verb input draw from language transcriptions, without using video to connect verbs to infants’ actions. However, two studies show that verb inputs are often grounded in the “here and now,” describing actions as they unfold, either by the speaker, the infant, or another agent (Liu et al., 2019; Nomikou et al., 2017). Given that infants are constantly in action, verb inputs may often align with infants’ own spontaneous motor behaviors.

Infants spend their days exuberantly performing whole-body and manual actions. Whole-body actions include coordinated movements of large muscle groups such as locomotion, postural shifts, and leg movements. Infants crawl, cruise, walk, climb, sit down, stand up, kick, and jump. Additionally, infants perform manual actions with their hands and fingers—they point, clap, touch, reach for things, and play with objects. Perhaps, infants’ whole-body and manual actions provide the backdrop for caregivers’ verb inputs. We hypothesized that as infants interact with objects and spaces in their environment, caregivers offer words for infants’ ongoing behaviors. Reciprocally, hearing verbs may spur new actions in infants (e.g., as infants walk from room to room, mothers may offer verbs like come and bring; as infants play with crayons, mothers may offer verbs like draw or write). Of course, coordinating verbs with actions is likely far more difficult to pull off than coordinating nouns with objects. Objects are tangible and concrete, and they are relatively constant in a child’s environment. In contrast, actions are dynamic and fleeting, and precise temporal coordination between verbs and action requires a high degree of coordination between infant and caregiver.

Meaningful verb–action correspondence could support verb learning in the same way that noun-object correspondence supports noun learning. Indeed, infants’ earliest-learned verbs are actions that they perform multiple times a day. Verbs such as drink, eat, dance, hug and give enter infants’ vocabularies far earlier than do verbs that infants do not regularly perform (e.g., cook, drive), a pattern observed in parent-report diary studies (Clark, 1995; Tomasello, 1992), and large cross-sectional samples (Fenson et al., 1994; Frank et al., 2017).

Multiple theoretical accounts support the hypothesis that action-naming events benefit word learning. From an intersensory perspective, self-generated actions are especially conducive for learning because they generate rich multimodal visual, haptic, and proprioceptive information (e.g., Bahrick et al., 2004; Lewkowicz & Kraebel, 2004). Therefore, the target action is likely more salient when it is self-generated (and multimodal) compared with viewed actions. Relatedly, from an embodied attention perspective, infants’ actions and visual attention are frequently aligned (e.g., infants look at their hands while stacking blocks; Abney et al., 2018; Yu & Smith, 2013). Infants may visually attend to self-produced actions more often than actions produced by others, which infants could more easily overlook or ignore. In addition, from a cross-situational learning perspective, infants may accrue partial knowledge of verb meanings over repeated exposures to predictable pairings of verbs with their own actions (e.g., Scott & Fisher, 2012; Siskind, 1996; Smith & Yu, 2008; Yu & Smith, 2007). Or alternatively, verb–action correspondence could support a “propose-but-verify” mechanism of verb learning, wherein infants rapidly match words to referents and then refine or revise their knowledge with subsequent exposures (e.g., Trueswell et al., 2013). Thus, regardless of the underlying mechanism, multiple theories of language learning converge on the conclusion that repeated and predictable verb–action co-occurrence benefits language acquisition.

Connections Among Verbs, Objects, and Object Locations

In English, verbs are frequently transitive—that is, they refer to actions with objects (e.g., “Press the button”). Verbs with different meanings vary in the extent to which they are transitive or intransitive (Fisher et al., 1991; Jackendoff, 1992). Generally speaking, causative verbs (make) are frequently transitive, and motion verbs (swim) are typically intransitive—they do not have a direct object. As a result, children use syntactic frames to help decipher verb meanings (Hohenstein et al., 2004; Naigles et al., 1986). Do caregivers’ verbs about infants’ whole-body and manual actions also differ in their transitivity? If so, the presence or absence of an object may provide clues to verb meaning.

We hypothesized that manual verbs are transitive more often than are whole-body verbs. Common sense suggests that manual actions frequently involve objects. Infants pull apart nesting cups, stack blocks, shake rattles, and pick up bites of cheese. Recent estimates suggest that infants spend upward of 60% of their daily routines at home manipulating toys and household objects (Herzberg et al., 2021). In contrast, whole-body actions may be variable in their implication of objects. Whole-body actions sometimes include objects (e.g., carrying a doll, kicking a ball), but sometimes do not: Infants regularly sit down, stand up, and locomote without objects (Heiman et al., 2019; Hoch et al., 2020).

Additionally, the location of referent objects may differ for manual and whole-body verbs. Objects must be within reach for infants to perform a manual action on them. However, whole-body actions may refer to nearby (kicking a ball) or far-off objects (walking to retrieve a distant toy). Accordingly, we hypothesized that manual actions consistently involve nearby objects, whereas whole-body actions are heterogenous, involving both near and far-off objects, and sometimes performed without objects at all. If our hypotheses are correct, the location of referent (and known) objects may provide clues to verb meaning (e.g., if caregiver says “Get your ball,” referring to a far-off ball, the infant may infer the target action and retrieve the ball).

Current Study

We embarked on the first study of temporal correspondence between mothers’ verb inputs, infants’ actions, and the objects of infants’ actions. We observed 32 infant-mother dyads during two hours of naturalistic activity at home. We tested three hypotheses:

Hypothesis 1: Caregivers’ verb inputs are connected from moment-to-moment with infants’ actions. We identified whole-body and manual verbs that mothers directed to infants. Then, we coded infants’ whole-body and manual actions during a six-second window surrounding the verb. We predicted that whole-body verbs such as come, go, and bring co-occur with infants’ whole-body actions. Likewise, we predicted that manual verbs such as stack, build, and shake co-occur with infants’ manual actions. Moreover, we predicted that action-naming events (e.g., stacking blocks when mothers say “Stack” or locomoting when mothers say “Come”) are commonplace in infants’ language learning environment.

Hypothesis 2: Action-naming events are largely driven by caregivers’ responses to infants’ ongoing actions. We assessed whether action-naming events preceded (e.g., mother says, “Clap your hands” and infant does so) or followed (e.g., infant claps hands and mother says, “Clap, clap!”) infants’ actions. We predicted that mothers’ verbs would be more likely to follow rather than precede infant actions.

Hypothesis 3: Manual verbs primarily refer to objects located near the infant whereas whole-body verbs refer to objectless actions and to objects located both near and far from the infant. We identified the referent objects of mothers’ verbs and further specified the objects’ location. We predicted that manual verbs would consistently refer to objects within infants’ reach because infants’ manual actions consistently involve nearby objects. We expected heterogeneity in the location of objects specified by whole-body verbs: Sometimes objects would be close (“Bring me the car”), other times objects would be far (“Go get the truck”), and at still other times no object would be specified at all (“Come here”).

We tested 13- and 18-month-old infants, representing substantially different points in language development. Thirteen month olds are just beginning to produce words (Carey, 1978; Fenson et al., 1994), whereas 18 month olds say words often and rapidly acquire new words (Bates et al., 1991; Bloom, 1973; Nelson, 1973). The occurrence of action-naming events might differ for 13 and 18 month olds because older infants understand many more verbs than do younger infants, and thus may elicit a greater variety of verbs from caregivers. Moreover, 13 and 18 month olds differ substantially in their motor capabilities. Eighteen month olds locomote and maneuver with greater proficiency and play with objects in more sophisticated ways than do 13 month olds (Belsky & Most, 1981; Lee et al., 2018; Lockman & Tamis-LeMonda, 2021). If mothers’ verbs are tailored to infants’ ongoing actions, then infants’ motor repertoires may influence the verbs they elicit (e.g., “climb” may be relevant only for older infants who can climb).

Method

Study procedures were approved by the New York University Institutional Review Board (Project no.: IRB-FY2019–3295; Project title: Motor Development in Infants, Children, and Adults).

Participants

Two groups of infants and their mothers participated: 13 month olds (n = 16, eight boys: M = 13.07 months, SD = .18 months) and 18 month olds (n = 16, eight boys: M = 18.02 months, SD = .17 months). All infants were first-born, full term at birth, from monolingual English-speaking families, and absent birth complications or disabilities. Families were recruited through pediatric groups in New York City. Twenty-three infants were White, two were Asian, one was Black, one was “other” race, and five were multiple races. Two infants were Hispanic/Latinx. Infants were mostly from middle- to upper middle-class households. Families received $75 gift cards for participation.

Procedure

Infants and their mothers were visited at home and video-recorded (30 frames per s) for 2 hr during everyday activity. Visits were scheduled between naptimes (so that infants were rested and alert) and mealtimes, at times when only the mother was expected to be present. However, for three dyads an additional family member arrived home briefly but stayed in a separate room until the session concluded. Mothers were instructed to ignore the experimenter and go about their routine as they typically would. The camera was equipped with an external microphone to ensure that mothers’ utterances were audible. The experimenter focused on the infant, attempting to keep the infant’s full body in frontal view, kept out of the way, and did not interact with the infant or mother while recording.

Data Coding, Reduction, and Reliability

Coders used Datavyu software (datavyu.org) to transcribe and code the videos in three passes. In the first pass, maternal speech was transcribed verbatim. The second pass identified mothers’ manual and whole-body verbs. The third pass coded the correspondence between mother verbs and infant motor actions.

Transcription

A primary coder transcribed mother speech word-for-word at the utterance level. Transcription procedures followed the Codes for the Human Analysis of Transcripts (MacWhinney, 2000). A second transcriber reviewed 20% of the transcripts to identify errors in timing, segmentation of utterances, and content. Exact agreement was 96.7% of utterances. Identified errors were corrected, and the cleaned transcripts were used for further coding.

To ensure interobserver reliability for mothers’ verbs and verb–action correspondence, a primary coder scored all of the data and a second coder independently scored 25% of each infant’s data. To prevent coder drift, the coders reviewed disagreements after every few sessions. Cohen’s κs were .88 for whole-body verbs, .91 for manual verbs, .87 for infant action surrounding the verb (whole-body, manual, both, or neither action), .83 for degree of verb–action correspondence, .92 for verb reference to objects, and .92 for object distance, ps < .001. Although the number of disagreements was low, to avoid propagating known errors (typos, careless errors) into the final dataset, such errors were corrected.
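
For readers less familiar with the reliability statistic, Cohen's kappa compares the observed agreement between two coders with the agreement expected by chance from each coder's category frequencies. The following is a minimal sketch under stated assumptions, not the authors' analysis code: the function name `cohens_kappa` and the toy labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equal-length lists of categorical codes."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(coder_a)
    # Proportion of items on which the two coders gave the same code.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement from each coder's marginal category frequencies.
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes for "infant action surrounding the verb".
primary = ["manual", "whole-body", "both", "neither", "manual", "manual"]
secondary = ["manual", "whole-body", "both", "manual", "manual", "manual"]
kappa = cohens_kappa(primary, secondary)
```

In this toy case the coders agree on five of six windows, but kappa is lower than the raw agreement because some of that agreement would be expected by chance.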

Action Verbs Mothers Direct to Infants

Based on transcripts, a primary coder identified verb phrases that referred to infants’ manual actions (movements of the hands, arms, and fingers such as stir, stack, open, turn, shake), and whole-body actions (locomotion, postural changes, and large movements of the legs such as kick, stomp, bring, come, go, sit down). Note that a specific verb could appear in distinct verb phrases. For instance, “Get the ball” referring to locomotion and “Get up” referring to standing share the same core verb, but were treated as distinct verb phrases because the target actions are distinct. For ease of reading, we will refer to verb phrases as verbs.

Manual and whole-body verbs were included regardless of whether they were past, present, or future tense (e.g., “You pressed the button,” “Are you running?,” “Go get your book”). Coders identified only verbs that referred to specific actions. Thus, verbs that applied to a broad range of actions (e.g., “What do you wanna do next?”) were excluded. We also excluded prohibitives (e.g., “Do not touch that”), song lyrics, and frequently used jargon (e.g., “C’mon”). For instances when both a whole-body and manual verb occurred in the same utterance (e.g., “Come here and press the button”), each verb phrase was coded separately. However, such instances were exceptionally rare, and occurred in only 38 out of the 5,156 total utterances with action verbs identified. Final variables included the frequency and variety of mothers’ manual verbs and whole-body verbs.

Connections Between Verbs and Infant Actions

The next coding pass investigated the correspondence between mother verbs and infant motor actions. First, we measured whether infants performed a manual and/or whole-body action within a 6-s window surrounding the verb—3 s before and 3 s after verb onset (Bornstein et al., 2008, 2020; Custode & Tamis-LeMonda, 2020). Infants received credit for a whole-body action if they locomoted, changed posture (e.g., sat down, went from a sit to a squat), or demonstrated a large movement of their legs like kicking or jumping. Manual actions were credited for actions of the hands, arms, or fingers (e.g., clapping, waving, stacking, shaking). Infants could receive credit for one, both, or neither action type within each window.
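
The window rule can be sketched in code. This is an illustrative reconstruction rather than the coders' actual Datavyu procedure: it assumes each infant action is stored as a hypothetical `(kind, start_s, end_s)` tuple and that an action counts if any part of it overlaps the window, which the text does not spell out.

```python
HALF_WINDOW_S = 3.0  # 3 s before and 3 s after verb onset


def actions_in_window(verb_onset_s, actions, half_window_s=HALF_WINDOW_S):
    """Classify a verb window as 'whole-body', 'manual', 'both', or 'neither'.

    actions: iterable of (kind, start_s, end_s), kind in {'whole-body', 'manual'}.
    Assumption: any overlap between an action and the window earns credit.
    """
    lo = verb_onset_s - half_window_s
    hi = verb_onset_s + half_window_s
    kinds = {kind for kind, start, end in actions if start <= hi and end >= lo}
    if kinds == {"whole-body", "manual"}:
        return "both"
    if kinds:
        return next(iter(kinds))  # exactly one action type was present
    return "neither"


# Hypothetical timeline: stacking at 10-12 s, walking at 14-15 s.
timeline = [("manual", 10.0, 12.0), ("whole-body", 14.0, 15.0)]
code_at_13 = actions_in_window(13.0, timeline)  # window spans 10-16 s
code_at_8 = actions_in_window(8.0, timeline)    # window spans 5-11 s
code_at_30 = actions_in_window(30.0, timeline)  # window spans 27-33 s
```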

Second, we assessed the degree of correspondence; that is, how closely the infant action matched the action verb (e.g., did the infant stack blocks as mother said “Stack” or kick the ball as mother said “Kick”?). Coders watched each six-second window and specified whether the infant action corresponded to the action verb utterance precisely, imprecisely, or not at all. Precise correspondence was coded when the infant action matched the mothers’ utterance exactly (e.g., mother said “Did you shake your rattle?” after infant shook the rattle; mother said “Go get the book,” and the infant retrieved the book). Imprecise correspondence was coded when the infants’ action adhered somewhat to what the mother said, but the match was not exact. There were two ways infants’ actions could correspond imprecisely. Infants could perform the target action with a different object (e.g., mother said, “Go get your car,” and infant retrieved a ball), or infants could act on the target object, but with a different action (e.g., mother said, “Are you building a tower?” and infant banged the blocks but did not stack them). No correspondence was coded if the infants’ action bore no relation to what mothers said (e.g., mother said, “Stack your blocks,” and infant stood up and walked away). Thus, for whole-body and manual verbs separately, we calculated the percentage of verbs that co-occurred with a whole-body action, manual action, or neither action and the percentage of verbs that corresponded precisely, imprecisely, or not at all to the infant action.
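
Because the Results report means and standard deviations for these percentages, they appear to be computed per dyad and then summarized across dyads. The sketch below illustrates that two-step calculation under this assumption; the dyad identifiers and codes are invented for illustration and are not data from the study.

```python
from collections import Counter
from statistics import mean, stdev

LEVELS = ("precise", "imprecise", "none")


def percent_by_level(codes):
    """Percentage of one dyad's verbs at each correspondence level."""
    if not codes:
        raise ValueError("dyad has no coded verbs")
    counts = Counter(codes)
    return {level: 100.0 * counts[level] / len(codes) for level in LEVELS}


# Hypothetical correspondence codes, one list per dyad.
dyads = {
    "dyad_01": ["precise", "precise", "imprecise", "none"],
    "dyad_02": ["precise", "none", "none", "none", "imprecise"],
}

# Step 1: percentages within each dyad.
per_dyad = {dyad: percent_by_level(codes) for dyad, codes in dyads.items()}

# Step 2: summarize the per-dyad percentages across dyads.
precise_pcts = [pcts["precise"] for pcts in per_dyad.values()]
group_mean = mean(precise_pcts)
group_sd = stdev(precise_pcts)  # sample standard deviation
```

Averaging per-dyad percentages weights every dyad equally, so a mother who produced many verbs does not dominate the group estimate.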

For instances in which mothers’ verbs corresponded precisely to infant actions, coders further identified whether the target infant action occurred during the 3 s before the verb (i.e., mother commented on an infant action already underway), or only began after the verb utterance (i.e., the mothers’ verb prompted the infants’ action).
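
The ordering decision for a precise match can be expressed as a small rule. This is a sketch under assumptions, not the authors' procedure: it uses verb onset as the reference point, represents the target action by hypothetical start and end times in seconds, and returns `None` for actions that fall outside the 6-s window.

```python
def event_order(verb_onset_s, action_start_s, action_end_s, half_window_s=3.0):
    """Return 'action-first', 'verb-first', or None for one action-naming event.

    'action-first': the action began before the verb and was under way at some
    point during the 3 s preceding verb onset (mother commented on it).
    'verb-first': the action began at or after verb onset, within 3 s.
    """
    began_before_verb = action_start_s < verb_onset_s
    in_prior_window = action_end_s >= verb_onset_s - half_window_s
    if began_before_verb and in_prior_window:
        return "action-first"
    if verb_onset_s <= action_start_s <= verb_onset_s + half_window_s:
        return "verb-first"
    return None


# Hypothetical events with the verb spoken at 10 s.
commented = event_order(10.0, 8.0, 9.0)    # infant claps, then mother names it
prompted = event_order(10.0, 11.0, 12.0)   # mother names it, then infant claps
too_early = event_order(10.0, 2.0, 5.0)    # action ended before the window
```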

Connections Between Verbs and Objects

Finally, coders specified whether each manual and whole-body verb referred to an object. Verbs that explicitly referred to an object with a noun or pronoun (e.g., “Bring your book to me,” “Can you stack them?”) were coded as transitive. Verbs that implied an object, but did not explicitly state the object (e.g., shake shake! referring to a rattle, climb up referring to a stool) were coded as pseudotransitive. If the verb had no stated or implied object, it was coded as intransitive (e.g., clap clap, come here).

For transitive and pseudotransitive verbs, coders specified whether the target object was near or far from the infant. Nearby objects were within arm’s reach (i.e., infant could reach out and touch object without locomoting). Far objects were out of reach. Thus, we calculated the percentage of manual and whole-body verbs that were transitive, pseudotransitive, and intransitive, and the percentage of target objects that were near and far from the infant.

Results

Mothers regularly directed whole-body and manual verbs to their infants. In total, 5,156 utterances contained either whole-body or manual verbs (out of the 55,434 total utterances). On average, 5.88% (SD = 3.90) of mothers’ utterances contained manual verbs, and 4.45% (SD = 3.12) of utterances contained whole-body verbs. Every mother produced whole-body and manual verbs during the visit. However, mothers varied widely in the frequency of their verb utterances (range = 36–348), as evidenced by the individual bars in Figure 1A. The frequency of manual verbs increased with infant age (see Figure 1B, left panel). Mothers of older infants used manual verbs (M = 122.88, SD = 50.92) nearly twice as often as did mothers of younger infants (M = 66.31, SD = 43.41). Whole-body verbs also increased with infant age: M = 77.31 (SD = 36.56) for older infants, and M = 55.75 (SD = 31.33) for younger infants. A 2 (age) × 2 (verb type) analysis of variance (ANOVA) indicated main effects of age and verb type on verb frequency, qualified by an Age × Verb Type interaction, F(1, 30) = 4.87, p = .035. Post-hoc simple contrasts with Bonferroni corrections confirmed that mothers of older infants used more manual verbs (p = .002). However, whole-body verb frequency did not differ for younger and older infants (p = .083).

Figure 1. Mothers’ Whole-Body and Manual Verbs.

Note. Whole-body verbs are depicted in red and manual verbs in blue. (A) Whole-body and manual verb frequency for each mother. Distribution shows that each mother produced whole-body and manual verbs, but frequency ranged from 36 to 348 verbs. (B) Means (solid dots) and individual data (open dots) for the frequency (left) and variety (right) of mothers’ verbs to their 13- and 18-month-old infants.

* p <.05.

Mothers used a wide variety of verbs, and they did so in distinct ways. Mothers used 92 unique whole-body verbs and 175 unique manual verbs. Most mothers used idiosyncratic verbs (i.e., verbs not spoken by other mothers). In fact, 46 whole-body verbs (e.g., chase, hop, and tackle) and 74 manual verbs (e.g., write, unzip, and squish) were idiosyncratic. Nonetheless, several popular verbs emerged: seven whole-body verbs (come, go, sit, get, bring, stand, and hug) and 11 manual verbs (put, hold, give, close, take, open, clean, push, throw, and pull) were spoken by more than half of mothers.

The variety of mothers’ verbs increased with infant age (see Figure 1B, right panel). Mothers of older infants used a greater variety of verbs (M = 15.75, SD = 4.51 whole-body; M = 31.13, SD = 9.53 manual) than did mothers of younger infants (M = 11.94, SD = 4.37 whole-body; M = 20.00, SD = 9.90 manual). A 2 (age) × 2 (verb type) ANOVA on verb variety revealed main effects of age and verb type, which were qualified by an Age × Verb Type interaction, F(1, 30) = 7.78, p = .009. Post hoc simple contrasts confirmed that older infants heard a greater variety of both whole-body and manual verbs than did younger infants (ps = .023 and .003). However, the magnitude of the difference between younger and older infants was greater for manual (d = 1.16) than whole-body verbs (d = .85).
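
The reported effect sizes can be approximated from the summary statistics above. The sketch assumes Cohen's d was computed with a pooled standard deviation for two equal-sized groups (n = 16 each); the text does not state the formula, so this is an illustration rather than a reproduction of the authors' calculation.

```python
from math import sqrt


def cohens_d(mean_1, sd_1, mean_2, sd_2):
    """Cohen's d for two equal-sized groups using the pooled SD."""
    pooled_sd = sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    return (mean_1 - mean_2) / pooled_sd


# Verb variety: older infants' mothers vs. younger infants' mothers.
d_manual = cohens_d(31.13, 9.53, 20.00, 9.90)
d_whole_body = cohens_d(15.75, 4.51, 11.94, 4.37)
```

Under this assumption the values come out near 1.15 for manual verbs and 0.86 for whole-body verbs, close to the reported 1.16 and .85. The small differences may reflect rounding in the reported means and SDs or a different formula.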

Taken together, mothers produced a variety of whole-body and manual verbs—and verb use increased with infant age. Some verbs were ubiquitous (e.g., all mothers said come and go), but most were idiosyncratic and unique to the dyad.

Mothers’ Verbs and Infants’ Actions

Mothers’ verbs and infant actions were largely congruent: Whole-body verbs co-occurred with infants’ whole-body actions, and manual verbs co-occurred with infants’ manual actions (see Figure 2A). On average, 80.45% (SD = 11.71) of whole-body verbs co-occurred with whole-body actions (the combined red and purple bars in Figure 2A). Thus, as infants engaged in whole-body play (e.g., kicking a ball), mothers offered verbs to describe or suggest whole-body actions. Likewise, 88.05% (SD = 6.43) of manual verbs co-occurred with manual actions (the combined blue and purple bars in Figure 2A). As infants engaged in manual play (e.g., nesting cups), mothers offered verbs to describe or suggest manual actions. In contrast, mismatches were less frequent: On average, 37.55% (SD = 14.55) of whole-body verbs co-occurred only with manual actions (e.g., mother says, “Come here” while infant sits and draws with markers) and 40.67% (SD = 18.78) of manual verbs co-occurred only with whole-body actions (e.g., mother says, “Can you play your guitar?” as infant climbs on the couch). Congruence between verb type and action type was confirmed by a 2 (age) × 2 (verb type) × 2 (action type) ANOVA, which indicated a Verb Type × Action Type interaction, F(1, 30) = 457.66, p < .001, and a main effect of age, F(1, 30) = 12.03, p = .002, indicating that older infants performed more actions overall than did younger infants. Post hoc simple contrasts confirmed that verb–action matches were more prevalent than mismatches for both whole-body and manual categories (ps < .001).

Figure 2. Mother Verbs and Infant Actions.

Note. (A) The percentage of whole-body verbs (left) and manual verbs (right) that co-occurred with infants’ whole-body actions (red), manual actions (blue), both actions (purple), and neither action (grey). Comparison of right and left bars shows that whole-body actions are prevalent during whole-body verbs, and manual actions are prevalent during manual verbs. (B) The percentage of whole-body verbs (left) and manual verbs (right) that corresponded precisely (dark grey), imprecisely (medium grey), or not at all (light grey) to infants’ concurrent action. Bars show that approximately 50% of whole-body and manual verbs correspond precisely to infant actions.

Action-Naming Events: Correspondence Between Verb and Action

We further documented action-naming events—that is, instances when mothers’ verb and infants’ action aligned precisely (e.g., the infant walked as mother said “Walk,” or stacked as mother said “Stack”). Approximately half of verbs corresponded precisely with infants’ actions, as shown in Figure 2B for both whole-body (M = 52.45%, SD = 15.62) and manual verbs (M = 49.13%, SD = 16.03), and both younger (M = 49.45%, SD = 9.52) and older infants (M = 50.05%, SD = 14.13). A 2 (age) × 2 (verb type) ANOVA showed no effect of age or verb type on precise verb–action correspondence.

However, because older infants received more frequent and varied verb inputs overall than younger infants did (Figure 1B), achieving ~50% precise verb–action correspondence meant that older infants experienced higher base-rates of action-naming events than younger infants (see Figure 3A). Action-naming events occurred more frequently for older infants (M = 103.94, SD = 55.27) than for younger infants (M = 61.56, SD = 36.27). Furthermore, the variety of verbs that comprised action-naming events was greater for older infants (M = 45.19, SD = 13.20) than for younger infants (M = 30.63, SD = 12.33). Indeed, 2 (age) × 2 (verb type) ANOVAs confirmed main effects of age on the frequency, F(1, 30) = 6.57, p = .016, and variety, F(1, 30) = 10.27, p = .003, of action-naming events. In addition, the ANOVAs indicated main effects of verb type, which reflected a greater frequency, F(1, 30) = 6.11, p = .019, and variety, F(1, 30) = 49.16, p <.001, of manual action-naming events compared with whole-body naming events.

Figure 3. Action-Naming Events.

Note. (A) The frequency (left) and variety (right) of action-naming events are shown for individual 13 and 18 month olds. Distribution shows that 18 month olds experience more frequent and heterogeneous verb events than do 13 month olds. (B) The percentage of whole-body action-naming events (red) and manual action-naming events (blue) to 13 month olds (left graph) and 18 month olds (right graph) in which the infant action preceded the verb (left column) or followed the verb (right column). Distribution shows that for most action-naming events, the infant action occurred before the verb utterance.

The remaining verbs that did not correspond precisely to infant action (i.e., nonaction-naming events) showed either imprecise correspondence (e.g., mother said, “Stack your blocks,” and infant banged the blocks on the table but did not stack) or no correspondence (e.g., mother said, “Stack your blocks,” and infant climbed on couch). On average, M = 30.81% (SD = 11.25) of manual verbs corresponded imprecisely to infant actions. However, imprecise correspondence was less common among whole-body (M = 13.01%, SD = 8.25) than manual verbs (M = 30.01%, SD = 8.25). A 2 (age) × 2 (verb type) ANOVA confirmed that imprecise correspondence was more prevalent for manual than whole-body verbs, as evidenced by a main effect of verb type, F(1, 30) = 46.25, p <.001.

Timing of Action-Naming Events: Which Came First, the Action or the Verb?

The correspondence between mothers’ verbs and infants’ actions does not reveal the temporal sequence of behaviors. Thus, for each action-naming event (i.e., the instances when mothers’ verb aligned with infants’ action), we documented whether the verb preceded or followed the infant action (see Figure 3B). For most action-naming events, mothers tended to “follow in” and name infants’ already ongoing actions: On average, mothers produced 74.20% (SD = 14.73) of whole-body and 67.71% (SD = 14.41) of manual verbs after the infant performed the target action (e.g., infant throws a ball and then mother says, “Did you throw it?”). Reciprocally, for a minority of action-naming events, mothers’ verbs preceded infants’ action for 25.80% (SD = 14.72) of whole-body and 32.29% (SD = 14.41) of manual verbs (e.g., mother says, “Stir your soup” and then the infant stirs). A 2 (age) × 2 (verb type) × 2 (order; verb followed or preceded action) ANOVA with frequency of action-naming events as the dependent variable confirmed that infant actions preceded mothers’ verbs more often than the reverse, as evidenced by a main effect of order, F(1, 30) = 40.33, p <.001. Once again, main effects of verb type and age reflected that manual action-naming events were more frequent than whole-body naming events, F(1, 30) = 6.60, p = .015, and action-naming events were more frequent for older infants compared with younger infants, F(1, 30) = 6.59, p = .016. Temporal ordering did not differ for older and younger infants, or for manual and whole-body verbs.
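The ordering tally described in this section can be sketched in a few lines. This is a minimal illustration and not the authors' Datavyu coding scripts: the record fields (`verb_onset_s`, `action_onset_s`), the function name, and the toy timestamps are all hypothetical, and ties are arbitrarily counted as action-first.

```python
from dataclasses import dataclass


@dataclass
class ActionNamingEvent:
    """One hypothetical action-naming event (times in seconds from session start)."""
    verb: str
    verb_onset_s: float    # when the mother began the verb utterance
    action_onset_s: float  # when the infant began the matching action


def percent_action_first(events):
    """Return the percentage of events in which the infant's action began
    at or before the mother's verb (i.e., the mother 'followed in')."""
    if not events:
        raise ValueError("need at least one action-naming event")
    action_first = sum(1 for e in events if e.action_onset_s <= e.verb_onset_s)
    return 100.0 * action_first / len(events)


# Toy data, invented for illustration only.
events = [
    ActionNamingEvent("throw", verb_onset_s=12.0, action_onset_s=10.5),  # action first
    ActionNamingEvent("stir", verb_onset_s=30.0, action_onset_s=31.2),   # verb first
    ActionNamingEvent("open", verb_onset_s=45.0, action_onset_s=44.0),   # action first
    ActionNamingEvent("walk", verb_onset_s=60.0, action_onset_s=58.0),   # action first
]
pct = percent_action_first(events)
print(f"action preceded verb in {pct:.1f}% of events")
```

Computing this percentage separately for each dyad and verb type would yield per-dyad values of the kind that the means and standard deviations above summarize.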

Mothers’ Verbs in Reference to Objects

Manual verbs nearly always specified objects (M = 93.38%, SD = 6.84; see Figure 4). On average, 74.63% (SD = 16.34) of manual verbs were transitive (i.e., the verb explicitly stated an object; e.g., “Close the lid,” “Turn it”) and 18.75% (SD = 14.26) were pseudotransitive (i.e., the verb implied, but did not state, an object; e.g., “Shake!” referring to a rattle). Conversely, whole-body verbs typically did not involve objects. On average, 64.18% (SD = 15.32) of whole-body verbs were intransitive, with mothers using phrases that solely focused on the action, such as “Come here” and “Sit down.” A 2 (age) × 2 (verb type) × 2 (transitive type: transitive, pseudotransitive) ANOVA confirmed main effects of verb type and transitive type, qualified by a Verb Type × Transitive Type interaction, F(1, 30) = 62.71, p <.001. Post hoc simple contrasts confirmed that manual verbs referred to objects (i.e., were transitive and pseudotransitive) more often than did whole-body verbs (ps < .001).

Figure 4. Mothers’ Verbs Reference Objects.

Note. The x-axes show the percentage of whole-body verbs (left) and manual verbs (right) that are transitive (green outline), pseudotransitive (orange outline), and intransitive (black outline). The y-axes show the percentage of transitive (green fill) and pseudotransitive (orange fill) verbs that refer to objects near infants. Comparison of the right and left bars shows that a greater percentage of manual verbs were transitive and pseudotransitive compared with whole-body verbs. Inspection of the bar heights shows that manual verbs referred to objects near infants, whereas whole-body verbs referred to both near and far objects.

Manual verbs typically involved objects within infants’ reach (see Figure 4). In fact, target objects were close to infants for most manual verbs (M = 88.64%, SD = 8.45 for transitive verbs; M = 93.50%, SD = 9.72 for pseudotransitive verbs). In contrast, whole-body verbs involved objects located both near (M = 57.31%, SD = 24.67) and far (M = 42.70%, SD = 24.76) from infants. However, the prevalence of nearby objects diverged for transitive (M = 52.88%, SD = 24.63) and pseudotransitive whole-body verbs (M = 80.66%, SD = 32.63). Thus, when objects were far away from infants, mothers explicitly specified the object (e.g., “Go get the ball”). But when objects were nearby, mothers sometimes implied the object (e.g., “Climb up” as infant climbs on a stool). A 2 (age) × 2 (verb type) × 2 (transitive type) ANOVA on the percentage of nearby objects confirmed main effects of verb type and transitive type, qualified by a Verb Type × Transitive Type interaction, F(1, 23) = 11.47, p = .003. Post hoc simple contrasts verified that manual transitive verbs referred to nearby objects more often than did whole-body transitive verbs (p <.001). However, manual and whole-body pseudotransitive verbs similarly referred to nearby objects (p = .114).

Post Hoc Power Analysis

We assessed whether our sample size was sufficient to detect group differences by conducting a post hoc power analysis using G*Power 3.1 (Faul et al., 2009). The sample size of 32 was used to assess statistical power for repeated measures ANOVAs for two independent samples (13- and 18-month-old groups). We considered f² = .02 a small effect, f² = .15 a medium effect, and f² = .35 a large effect (Cohen, 1988). With alpha set at .05, power was .97 to detect large effects, .38 for medium effects, and .06 for small effects. Thus, our sample size was adequately powered to detect the moderate to large effects we found. However, we cannot rule out the possibility that small to medium effects went undetected.
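To relate the reported F statistics to Cohen's benchmarks, one can convert an F value and its degrees of freedom to partial eta-squared and Cohen's f-squared. The sketch below uses the standard identities partial eta-squared = F·df_effect / (F·df_effect + df_error) and f-squared = eta-squared / (1 − eta-squared). The function names are ours, and the authors did not report these conversions, so the outputs are illustrative rather than results from the paper.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared implied by an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def cohens_f_squared(f_value, df_effect, df_error):
    """Cohen's f-squared, i.e., eta_p^2 / (1 - eta_p^2).

    Algebraically this simplifies to F * df_effect / df_error.
    """
    eta_sq = partial_eta_squared(f_value, df_effect, df_error)
    return eta_sq / (1.0 - eta_sq)


# Two F values reported in the Results, both with (1, 30) degrees of freedom.
for label, f_value in [("order", 40.33), ("verb type (frequency)", 6.11)]:
    eta_sq = partial_eta_squared(f_value, 1, 30)
    f_sq = cohens_f_squared(f_value, 1, 30)
    print(f"{label}: partial eta^2 = {eta_sq:.3f}, f^2 = {f_sq:.3f}")
```

Under these identities, an F(1, 30) of 6.11 corresponds to an f-squared of roughly .20, which falls between the medium (.15) and large (.35) benchmarks.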

Discussion

How do infants map words to referents? Context is key. Infants hear words in predictable settings at predictable times. For example, caregivers name and describe objects while infants play with them, as in red car as the infant pushes a toy car (Custode & Tamis-LeMonda, 2020; Tamis-LeMonda et al., 2013, 2019; West & Iverson, 2017). Consequently, contextual cues help infants match words to referents (e.g., Pereira et al., 2014; Roy et al., 2015; Yu & Smith, 2012). However, research on contextual cues primarily focuses on nouns and object-naming events. We found that caregivers likewise use verbs in predictable contexts. Action-naming events are commonplace: Infants’ motor actions temporally align with mothers’ verbs. Thus, infants’ own actions may serve as valuable contextual cues to decipher verb meanings.

Action-Naming Events Are Commonplace

Mothers in our sample talked about infants’ actions more than 80 times per hr, on average. If mothers maintain this rate, infants are exposed to verbs that refer to their whole-body and manual actions approximately 800 times in a 10-hr waking day. And this figure underestimates the number of verbs referring to infant actions given our focus on manual and whole-body verbs (e.g., verbs such as say, play, and look were not considered). The sheer frequency and variety of verb input suggests that infants have ample opportunities to hear—and potentially learn—verbs pertaining to their own actions.

As predicted, verbs unfolded in the context of infants’ moment-to-moment actions: Manual verbs co-occurred with infants’ manual actions, and whole-body verbs co-occurred with whole-body actions. In fact, half of mothers’ verb inputs aligned precisely to infants’ actions. Like object-naming events, action-naming events likely offer opportunities for infants to disambiguate word meaning. Self-generated actions are salient, yielding multimodal visual, haptic, and proprioceptive information, which highlights the target even more than passively viewed events do. And consistent with prior work on object-naming events, action-naming events primarily unfolded because mothers followed in on infants’ actions (i.e., infant actions led the dance).

The temporal distribution of caregivers’ verb inputs likely also influences infant learning. Rapid repetition (i.e., “bursts”) of action-naming events may offer recurring opportunities for infants to connect verb to action, with reduced demands on infants’ memory for recall. Prior work suggests that successive noun repetitions are linked to successful word learning (e.g., Frank et al., 2013; Newman et al., 2016; Schwab & Lew-Williams, 2016). Similarly, mothers in our study tended to repeat verbs in quick succession as infants performed the target action (e.g., mother says “open, open, open” while infant lifts the lid of a container). In addition, action-naming events that vary across time and context (e.g., open in reference to opening toothpaste caps, doors, Tupperware, and dresser drawers) may highlight the common feature among perceptually distinct events. That is, such varied action-naming events may facilitate invariance detection (see Gogate & Hollich, 2010 for review). Future work should test how the temporal structure of action-naming events relates to verb learning.

Nonetheless, verb–action correspondence is far from perfect. Half of mothers’ action verbs did not coincide with infants’ actions. And even when real-time correspondence occurs, connections between words and actions might still be ambiguous (e.g., infant bangs a container, mother says “Twist,” infant twists, mother says “Give it to me,” and infant twists again). Actions and words are fluid and unfold in real time, and unlike nouns, verbs are intangible.

So how do infants know that twist refers to the motion of turning the jar, rather than other temporally adjacent actions (e.g., banging the jar)? Temporal regularities likely characterize patterns of verb exposure. In line with statistical learning theory (e.g., Saffran & Wilson, 2003; Yu & Smith, 2007), twist likely co-occurs with a twisting action more often than with other competing actions. Laboratory-based studies indicate that infants can learn new verbs under ambiguous contexts (even in the presence of irrelevant competing actions) if the verb and referent action co-occur across multiple exposures (Howard et al., 2019; Monaghan et al., 2015; Scott & Fisher, 2012). Thus, infants’ motor actions may provide valuable contextual cues that ultimately aid infants’ mapping of verbs to actions.
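The co-occurrence logic can be illustrated with a toy tally. The sketch below is not a model taken from the cited studies; it simply counts how often each verb is heard alongside each candidate action across exposures and keeps the most frequent pairing. The data and the function name `best_referent` are invented for illustration.

```python
from collections import Counter, defaultdict


def best_referent(exposures):
    """Map each verb to the action it co-occurred with most often.

    `exposures` is a list of (verb, actions) pairs, where `actions` holds
    every infant action observed around that utterance of the verb.
    Verbs that never co-occurred with any action are left out.
    """
    counts = defaultdict(Counter)
    for verb, actions in exposures:
        counts[verb].update(set(actions))  # count each action once per exposure
    return {
        verb: tally.most_common(1)[0][0]
        for verb, tally in counts.items()
        if tally
    }


# Invented exposures: some moments are ambiguous (several actions occur near
# the verb), but twisting accompanies every utterance of "twist".
exposures = [
    ("twist", ["bang", "twist"]),
    ("twist", ["twist", "give"]),
    ("twist", ["twist"]),
    ("shake", ["shake", "bang"]),
    ("shake", ["shake"]),
]
print(best_referent(exposures))
```

In this toy example, banging appears alongside both verbs but never as the most frequent partner of either, so aggregating over exposures resolves an ambiguity that any single exposure leaves open.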

Action-naming events may be necessary, but insufficient, for infant verb learning. Compared with learning nouns, learning verbs poses unique and complex challenges. Verbs can have concrete or abstract meanings (e.g., grasp refers to the action of grasping an object, and also the mental state of understanding a concept). Verbs vary in their degree of specificity (e.g., play applies to a vast set of actions, whereas kick is more specific). Verbs can refer to distinct characteristics of perceptually identical events (e.g., come refers to the path of locomotion, whereas run refers to the manner of locomotion). And, in English, verbs have distinct inflections that denote their tense (e.g., ran, run, and running), which requires children to extract the verb’s core meaning from different surface structures. Likely, multiple learning mechanisms allow infants to leverage social, contextual, and syntactic cues (i.e., bootstrapping) to decipher verb meaning.

Verbs and Objects Provide Redundant Cues to Word Meaning

Mothers’ verb utterances often included objects (e.g., “press the button,” “bring me your cup”). Understanding verb meanings requires a holistic understanding of the entire context—action, object of the action, and relation between the two (e.g., Mandler, 2006). Although research on bootstrapping in language learning typically focuses on syntax, infants’ prior knowledge of nouns may also bootstrap verb learning (Hirsh-Pasek & Golinkoff, 1999; Naigles et al., 1986; Naigles & Hoff-Ginsberg, 1995). The affordances and locations of objects may serve as a springboard for infants to decipher verb meaning. Considering that nouns are generally learned earlier than verbs, infants can leverage a large repertoire of nouns to map verbs to actions (Gentner, 2006).

Different objects afford different actions, which may likewise offer clues to verb meaning. Balls elicit throwing, rattles shaking, and strollers pushing. In this way, object properties may provide the glue for mapping words like throw, shake, and push to their implied actions, further reinforcing connections between verbs and actions. Perceiving affordances may be helpful for disambiguating verbs that reference the designed actions of specific objects (e.g., manner verbs like twist in reference to a bottle cap or shake in reference to a rattle). Additionally, object locations may bootstrap verb learning. If teddy is located across the room and mother says, “Go get teddy,” the infant may retrieve teddy, despite not knowing the verb phrase “go get.” In turn, the meaning of “go get” is implied by the space between infant and object.

Nearly all manual verbs referred to objects located within infants’ reach, which is consistent with past work on caregivers’ object labeling. That is, prototypical cases of multimodal joint attention include caregivers’ naming nearby objects (e.g., Tomasello & Farrar, 1986; Yu & Smith, 2012). However, we observed a different pattern for whole-body verbs. Whole-body verbs frequently did not include objects (e.g., “Sit down”). When whole-body verbs did reference objects, the object was sometimes close and sometimes far from the infant. Notably, mothers specified far-off objects more often than they specified nearby objects. Presumably, the target object is more obvious when it is close by (e.g., “Climb up,” referring to a stepping stool at the infant’s feet), but more ambiguous when distant (e.g., “Go get your book” requires specification of which object the infant should retrieve).

Infants’ Developing Motor Skills Might Influence Action-Naming Events

As infants develop, caregivers increasingly offer nouns that correspond to the objects of infants’ play (Chang et al., 2015, 2016; West & Iverson, 2017). We found a similar developmental trend for verb inputs. Compared with younger infants, older infants received a richer variety of verbs and more frequent action-naming events. Older infants have more sophisticated motor and language skills than do younger infants, which may shape their verb environments. New motor skills create new opportunities for caregivers to talk about infant actions. For instance, learning to climb stairs likely prompts words like climb and step up and learning to put shapes in a shape-sorter may spur words like put in and get out. Reciprocally, as infants’ motor skills develop, mothers may up the ante and suggest increasingly advanced motor actions.

Additionally, 13 and 18 month olds differ dramatically in their language development. Thirteen month olds are just beginning to produce words (Mdn = 9.8 words), whereas 18 month olds typically have large productive vocabularies (Mdn = 76.0 words; Frank et al., 2017). Older infants’ prior knowledge of verbs may facilitate more frequent action-naming events, as infants comply with mothers’ suggestions for actions. However, the increasing complexity of mothers’ verb inputs suggests that as infants’ word knowledge grows, mothers scaffold learning of novel verbs.

Future Directions

Collectively, our findings highlight the potential of verb–action correspondence for infant language learning and raise questions for future research. Specifically, do infants leverage verb–action correspondences to learn verbs? Although infant actions are available as cues to verb meaning, infants may or may not actually exploit them to learn verbs. Prior laboratory-based study designs that linked object-naming events to noun learning could be adapted to test whether action-naming events likewise relate to verb learning (e.g., Yu & Smith, 2012). In addition, do individual infants experience different rates of verb–action correspondence in their language environments? And if so, how do variations in verb–action correspondence relate to the pace of infants’ verb acquisition? Researchers should connect characteristics of caregivers’ verb input (and how often input corresponds to infants’ actions) to patterns of vocabulary growth over time. Future work should also investigate patterns of verb–action correspondence within richly inflected languages. For example, Spanish verb inflections denote both tense and person (e.g., “I walked” is expressed caminé, whereas “You walked” is expressed caminaste). Distinct verb inflections may predictably correspond to unfolding actions (Clark & De Marneffe, 2012). That is, infants may be more likely to hear caminaste (“You walked”) following their own locomotion, but caminé (“I walked”) following their mothers’ locomotion. Such correspondences may help infants decipher verb meanings and verb inflections. Finally, do new motor skills (like learning to walk, clap, or climb steps) have a cascading effect on the verbs that caregivers direct to infants (like go, clap, or climb)? If so, infants’ developing motor repertoires provide an important context for understanding processes of developmental cascades broadly and verb learning more specifically.

Conclusions

Research on verb learning is largely conducted in laboratories, where experimenters present infants with preselected verbs and sentence frames via video displays or standardized scripts. Such studies provide insight into learning mechanisms under the rigor of controlled conditions. However, in everyday life, infants actively shape their language experiences. Infants’ motor actions provide a rich contextual backdrop for the language they hear. As infants move through space, they elicit words about their movements. As infants manually interact with objects, they elicit words about their manipulations. Focus on infants’ active role in eliciting language input illuminates an understanding of language learning as an embodied process. The reliable connection between language inputs and motor action might be precisely what infants need to learn words.

Acknowledgments

This research was supported by a grant from the National Institute of Child Health and Human Development to Karen E. Adolph and Catherine S. Tamis-LeMonda (R01HD094830), a grant from the LEGO Foundation to Catherine S. Tamis-LeMonda and Karen E. Adolph, and by a postdoctoral training grant from the National Institutes of Health to Kelsey L. West (F32DC017903). Portions of this work were presented virtually at the International Congress on Infant Studies in July 2020 and the Society for Research in Child Development in April 2021. Study materials are shared on the Databrary library (databrary.org). Videos and demographic data are shared with authorized investigators at databrary.org/volume/563. The coding manual, Datavyu coding spreadsheets, scripts for coding reliability and exporting data, and processed data are publicly shared at databrary.org/volume/1144. This study was not preregistered. We thank Orit Herzberg, Catalina Suarez-Rivera, Emily Linn, and Jake McCallum for help with recruitment, and Aria Xiao for help with transcription.

References

  1. Abney DH, Karmazyn H, Smith LB, & Yu C. (2018). Hand-eye coordination and visual attention in infancy. Proceedings of the 40th Annual Meeting of the Cognitive Science Society.
  2. Bahrick LE, Lickliter R, & Flom R. (2004). Intersensory redundancy guides the development of selective attention, perception, and cognition in infancy. Current Directions in Psychological Science, 13(3), 99–102. 10.1111/j.0963-7214.2004.00283.x
  3. Bates E, Bretherton I, & Snyder LS (1991). From first words to grammar: Individual differences and dissociable mechanisms (Vol. 20). Cambridge University Press.
  4. Belsky J, & Most RK (1981). From exploration to play: A cross-sectional study of infant free play behavior. Developmental Psychology, 17(5), 630–639. 10.1037/0012-1649.17.5.630
  5. Bloom L. (1973). One word at a time: The use of single word utterances before syntax. Mouton.
  6. Bloom L. (1993). The transition from infancy to language: Acquiring the power of expression. Cambridge University Press. 10.1017/CBO9780511752797
  7. Bornstein MH, Putnick DL, Hahn CS, Tamis-LeMonda CS, & Esposito G. (2020). Stabilities of infant behaviors and maternal responses to them. Infancy, 25(3), 226–245. 10.1111/infa.12326
  8. Bornstein MH, Tamis-Lemonda CS, Hahn CS, & Haynes OM (2008). Maternal responsiveness to young children at three ages: Longitudinal analysis of a multidimensional, modular, and specific parenting construct. Developmental Psychology, 44(3), 867–874. 10.1037/0012-1649.44.3.867
  9. Braunwald SR (1995). Differences in the acquisition of early verbs: Evidence from diary data from sisters. In Tomasello M. & Merriman WE (Eds.), Beyond names for things: Young children’s acquisition of verbs (Vol. 1, pp. 81–111). Psychology Press.
  10. Carey S. (1978). The child as word learner. MIT Press.
  11. Chang L, de Barbaro K, & Deák GO (2015, August 13–16). To hear and to hold: Maternal naming and infant object exploration. Development and Learning and Epigenetic [ICDL-EpiRob]. 2015 Joint IEEE International Conference on (pp. 112–113). IEEE. 10.1109/DEVLRN.2015.7346125
  12. Chang L, de Barbaro K, & Deák G. (2016). Contingencies between infants’ gaze, vocal, and manual actions and mothers’ object-naming: Longitudinal changes from 4 to 9 months. Developmental Neuropsychology, 41(5–8), 342–361. 10.1080/87565641.2016.1274313
  13. Clark EV (1995). The lexicon in acquisition (No. 65). Cambridge University Press.
  14. Clark EV (1996). Early verbs, event-types, and inflections. In Johnson CE & Gilbert JHV (Eds.), Children’s language (Vol. 9, pp. 61–73). Lawrence Erlbaum.
  15. Clark EV, & De Marneffe MC (2012). Constructing verb paradigms in French: Adult construals and emerging grammatical contrasts. Morphology, 22(1), 89–120. 10.1007/s11525-011-9193-6
  16. Cohen J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates.
  17. Custode S, & Tamis-LeMonda CS (2020). Cracking the code: Social and contextual cues to language input in the home environment. Infancy, 25(6), 809–826. 10.1111/infa.12361
  18. Elmlinger SL, Suanda SH, Smith LB, & Yu C. (2019, August). Toddlers’ hands organize parent-toddler attention across different social contexts [Paper presentation]. Ninth Joint IEEE International Conference on Development and Learning and Epigenetic Robotics in Oslo, Norway. 10.1109/DEVLRN.2019.8850682
  19. Faul F, Erdfelder E, Buchner A, & Lang AG (2009). Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods, 41(4), 1149–1160. 10.3758/BRM.41.4.1149
  20. Fenson L, Dale PS, Reznick JS, Bates E, Thal DJ, Pethick SJ, Tomasello M, Mervis CB, & Stiles J. (1994). Variability in early communicative development. Monographs of the Society for Research in Child Development, 59(5), v–173. 10.2307/1166093
  21. Fisher C, Gleitman H, & Gleitman LR (1991). On the semantic content of subcategorization frames. Cognitive Psychology, 23(3), 331–392. 10.1016/0010-0285(91)90013-E
  22. Frank MC, Braginsky M, Yurovsky D, & Marchman VA (2017). Wordbank: An open repository for developmental vocabulary data. Journal of Child Language, 44(3), 677–694. 10.1017/S0305000916000209
  23. Frank MC, Tenenbaum JB, & Fernald A. (2013). Social and discourse contributions to the determination of reference in cross-situational word learning. Language Learning and Development, 9(1), 1–24. 10.1080/15475441.2012.707101
  24. Gentner D. (2006). Why verbs are hard to learn. In Hirsh-Pasek K. & Golinkoff RM (Eds.), Action meets world: How children learn verbs (Vol. 1, pp. 544–564). Oxford University Press. 10.1093/acprof:oso/9780195170009.003.0022
  25. Gleitman LR, & Gleitman H. (1992). A picture is worth a thousand words, but that’s the problem: The role of syntax in vocabulary acquisition. Current Directions in Psychological Science, 1(1), 31–35. 10.1111/1467-8721.ep10767853
  26. Gogate LJ, & Hollich G. (2010). Invariance detection within an interactive system: A perceptual gateway to language development. Psychological Review, 117(2), 496–516. 10.1037/a0019049
  27. Golinkoff RM, & Hirsh-Pasek K. (2008). How toddlers begin to learn verbs. Trends in Cognitive Sciences, 12(10), 397–403. 10.1016/j.tics.2008.07.003
  28. Heiman CM, Cole WG, Lee DK, & Adolph KE (2019). Object interaction and walking: Integration of old and new skills in infant development. Infancy, 24(4), 547–569. 10.1111/infa.12289
  29. Herzberg O, Fletcher KK, Schatz JL, Adolph KE, & Tamis-LeMonda CS (2021). Infant exuberant object play at home: Immense amounts of time-distributed, variable practice. Child Development. Advance online publication. 10.1111/cdev.13669
  30. Hirsh-Pasek K, & Golinkoff RM (1999). The origins of grammar: Evidence from early language comprehension. MIT Press.
  31. Hoch JE, Rachwani J, & Adolph KE (2020). Where infants go: Real-time dynamics of locomotor exploration in crawling and walking infants. Child Development, 91(3), 1001–1020. 10.1111/cdev.13250
  32. Hohenstein J, Naigles L, & Eisenberg A. (2004). Keeping verb acquisition in motion: A comparison of English and Spanish. In Hall DG & Waxman SR (Eds.), Weaving a lexicon (pp. 569–602). MIT Press.
  33. Howard TJ, Porter BM, & Childers JB (2019). Can young children ignore irrelevant events, or subevents, during verb learning? Journal of Cognition and Development, 20(3), 411–432. 10.1080/15248372.2019.1607861
  34. Hsu N, Hadley PA, & Rispoli M. (2017). Diversity matters: Parent input predicts toddler verb production. Journal of Child Language, 44(1), 63–86. 10.1017/S0305000915000690
  35. Iverson JM (2010). Developing language in a developing body: The relationship between motor development and language development. Journal of Child Language, 37(2), 229–261. 10.1017/S0305000909990432
  36. Jackendoff R. (1992). Semantic structures (Vol. 18). MIT Press.
  37. Laakso A, & Smith LB (2007). Pronouns and verbs in adult speech to children: A corpus analysis. Journal of Child Language, 34(4), 725–763. 10.1017/S0305000907008136
  38. Lee DK, Cole WG, Golenia L, & Adolph KE (2018). The cost of simplifying complex developmental phenomena: A new perspective on learning to walk. Developmental Science, 21(4), e12615. 10.1111/desc.12615
  39. Lewkowicz DJ, & Kraebel KS (2004). The value of multisensory redundancy in the development of intersensory perception. In Calvert GA, Spence C, & Stein BE (Eds.), The handbook of multisensory processes (pp. 655–678). MIT Press.
  40. Liu S, Zhang Y, & Yu C. (2019, July–August). Why some verbs are harder to learn than others: A micro-level analysis of everyday learning contexts for early verb learning [Paper presentation]. Proceedings of the 42nd Annual Meeting of the Cognitive Science Society in Toronto, Canada.
  41. Lockman JJ, & Tamis-LeMonda CS (2021). Young children’s interactions with objects: Play as practice and practice as play. Annual Review of Developmental Psychology. Advance online publication. 10.1146/annurev-devpsych-050720-102538
  42. MacWhinney B. (2000). The CHILDES project: Tools for analyzing talk (3rd ed.). Lawrence Erlbaum.
  43. Mandler JM (2006). Actions organize the infants’ world. In Hirsh-Pasek K. & Golinkoff RM (Eds.), Action meets world: How children learn verbs (pp. 111–133). Oxford University Press. 10.1093/acprof:oso/9780195170009.003.0005
  44. Monaghan P, Mattock K, Davies RA, & Smith AC (2015). Gavagai is as Gavagai does: Learning nouns and verbs from cross-situational statistics. Cognitive Science, 39(5), 1099–1112. 10.1111/cogs.12186
  45. Naigles LR, & Hoff-Ginsberg E. (1998). Why are some verbs learned before other verbs? Effects of input frequency and structure on children’s early verb use. Journal of Child Language, 25(1), 95–120. 10.1017/S0305000997003358
  46. Naigles L, & Hoff-Ginsberg E. (1995). Input to verb learning: Evidence for the plausibility of syntactic bootstrapping. Developmental Psychology, 31(5), 827–837. 10.1037/0012-1649.31.5.827
  47. Naigles L, Gleitman LR, & Gleitman H. (1986). Children acquire word meaning components from syntactic evidence. In Dromi E. (Ed.), Language and development (Vol. 1). Ablex.
  48. Needham A, & Libertus K. (2011). Embodiment in early development. Wiley Interdisciplinary Reviews: Cognitive Science, 2(1), 117–123. 10.1002/wcs.109
  49. Nelson K. (1973). Structure and strategy in learning to talk. Monographs of the Society for Research in Child Development, 38(1/2), 1–135. 10.2307/1165788
  50. Newman RS, Rowe ML, & Bernstein Ratner N. (2016). Input and uptake at 7 months predict toddler vocabulary: The role of child-directed speech and infant processing skills in language development. Journal of Child Language, 43(5), 1158–1173. 10.1017/S0305000915000446
  51. Nomikou I, Koke M, & Rohlfing KJ (2017). Verbs in mothers’ input to six month olds: Synchrony between presentation, meaning, and actions is related to later verb acquisition. Brain Sciences, 7(12), 52. 10.3390/brainsci7050052
  52. Pereira AF, Smith LB, & Yu C. (2014). A bottom-up view of toddler word learning. Psychonomic Bulletin & Review, 21(1), 178–185. 10.3758/s13423-013-0466-4
  53. Roy BC, Frank MC, DeCamp P, Miller M, & Roy D. (2015). Predicting the birth of a spoken word. Proceedings of the National Academy of Sciences of the United States of America, 112(41), 12663–12668. 10.1073/pnas.1419773112
  54. Saffran JR, & Wilson DP (2003). From syllables to syntax: Multilevel statistical learning by 12-month-old infants. Infancy, 4, 273–284. 10.1207/S15327078IN0402_07 [DOI] [Google Scholar]
  55. Schwab JF, & Lew-Williams C. (2016). Repetition across successive sentences facilitates young children’s word learning. Developmental Psychology, 52(6), 879–886. 10.1037/dev0000125 [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Scott RM, & Fisher C. (2012). 2.5-year-olds use cross-situational consistency to learn verbs under referential uncertainty. Cognition, 122(2), 163–180. 10.1016/j.cognition.2011.10.010 [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Shatz M. (1978). On the development of communicative understandings: An early strategy for interpreting and responding to messages. Cognitive Psychology, 10(3), 271–301. 10.1016/0010-0285(78)90001-4 [DOI] [Google Scholar]
  58. Siskind JM (1996). A computational study of cross-situational techniques for learning word-to-meaning mappings. Cognition, 61(1–2), 39–91. 10.1016/S0010-0277(96)00728-7 [DOI] [PubMed] [Google Scholar]
  59. Smith LB, Yu C, & Pereira AF (2011). Not your mother’s view: The dynamics of toddler visual experience. Developmental Science, 14(1), 9–17. 10.1111/j.1467-7687.2009.00947.x [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Smith L, & Gasser M. (2005). The development of embodied cognition: Six lessons from babies. Artificial Life, 11(1–2), 13–29. 10.1162/1064546053278973 [DOI] [PubMed] [Google Scholar]
  61. Smith L, & Yu C. (2008). Infants rapidly learn word-referent mappings via cross-situational statistics. Cognition, 106(3), 1558–1568. 10.1016/j.cognition.2007.06.010 [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Suanda SH, Barnhart M, Smith LB, & Yu C. (2019). The signal in the noise: The visual ecology of parents’ object naming. Infancy, 24(3), 455–476. 10.1111/infa.12278 [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Tamis-LeMonda CS, Custode S, Kuchirko Y, Escobar K, & Lo T. (2019). Routine language: Speech directed to infants during home activities. Child Development, 90(6), 2135–2152. 10.1111/cdev.13089 [DOI] [PubMed] [Google Scholar]
  64. Tamis-LeMonda CS, Kuchirko Y, & Tafuro L. (2013). From action to interaction: Infant object exploration and mothers’ contingent responsiveness. IEEE Transactions on Autonomous Mental Development, 5(3), 202–209. 10.1109/TAMD.2013.2269905 [DOI] [Google Scholar]
  65. Tomasello M. (1992). First verbs: A case study of early grammatical development. Cambridge University Press. 10.1017/CBO9780511527678 [DOI] [Google Scholar]
  66. Tomasello M, & Farrar MJ (1986). Joint attention and early language. Child Development, 57(6), 1454–1463. 10.2307/1130423 [DOI] [PubMed] [Google Scholar]
  67. Trueswell JC, Medina TN, Hafri A, & Gleitman LR (2013). Propose but verify: Fast mapping meets cross-situational word learning. Cognitive Psychology, 66(1), 126–156. 10.1016/j.cogpsych.2012.10.001 [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. West KL, & Iverson JM (2017). Language learning is hands-on: Exploring links between infants’ object manipulation and verbal input. Cognitive Development, 43, 190–200. 10.1016/j.cogdev.2017.05.004 [DOI] [Google Scholar]
  69. Yoshida H, & Smith LB (2008). What’s in view for toddlers? Using a head camera to study visual experience. Infancy, 13(3), 229–248. 10.1080/15250000802004437 [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Yu C, & Smith LB (2007). Rapid word learning under uncertainty via cross-situational statistics. Psychological Science, 18(5), 414–420. 10.1111/j.1467-9280.2007.01915.x [DOI] [PubMed] [Google Scholar]
  71. Yu C, & Smith LB (2012). Embodied attention and word learning by toddlers. Cognition, 125(2), 244–262. 10.1016/j.cognition.2012.06.016 [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Yu C, & Smith LB (2013). Joint attention without gaze following: Human infants and their parents coordinate visual attention to objects through eye-hand coordination. PLoS ONE, 8(11), e79659. 10.1371/journal.pone.0079659 [DOI] [PMC free article] [PubMed] [Google Scholar]
