Linguistic context in verb learning: Less is sometimes more

Angela Xiaoxue He; Maxwell Kon; Sudha Arunachalam

doi:10.1080/15475441.2019.1676751

Author manuscript; available in PMC: 2021 Jan 1.

Published in final edited form as: Lang Learn Dev. 2019 Oct 21;16(1):22–42. doi: 10.1080/15475441.2019.1676751

PMCID: PMC7531761 NIHMSID: NIHMS1541410 PMID: 33013240

Abstract

Linguistic contexts provide useful information about verb meanings by narrowing the space of candidate concepts. Intuitively, the more information, the better. For example, “the tall girl is fezzing,” as compared to “the girl is fezzing,” provides more information about which event, out of multiple candidate events, is being labeled; thus, we may expect it to better facilitate verb learning. However, we find evidence to the contrary: in a verb learning study, preschoolers (N = 60, mean age = 38 months) performed above chance only when the subject was an unmodified determiner phrase, and not when it was modified (Experiment 1). Experiment 2 replicated this pattern with a different set of stimuli and a wider age range (N = 60, mean age = 45 months). Further, in Experiment 2, we looked at both learning outcomes, by evaluating pointing responses at Test, and the learning process, by tracking eye gaze during Familiarization. The results suggest that children’s limited processing abilities are to blame for poor learning outcomes, but that a nuanced understanding of how processing affects learning is required.

Introduction

Identifying the meaning of a new word is not a straightforward task, because the context of language use makes available infinitely many possible meanings (Chomsky, 1959; Quine, 1960). For nouns that label common objects, this problem may be mitigated by shared human abilities such as attention to social cues or biases that constrain hypotheses about word meaning (e.g., Clark, 1987; Markman, 1991; Markman & Hutchinson, 1984). But these resources may be markedly less useful for learning verb meanings. In a task where adults were asked to “simulate” child language acquisition by observing muted videos of parent-child interactions and guessing what words the parents were uttering, Gillette, Gleitman, Gleitman, and Lederer (1999) found that adults correctly identified nouns three times more often than verbs, identifying verbs only 7.7% of the time. They inferred that determining the target referent of a verb by observing the world alone is profoundly difficult.

The verb’s linguistic context, however, can play an important role in alleviating this difficulty. Gillette et al. (1999) also found that adults given linguistic information along with the muted videos did much better. A list of the verb’s co-occurring nouns (e.g., “Daddy,” “cookie”) increased correct identification to 29% of the time, the verb’s syntactic frame with nonsense morphemes replacing the content morphemes (e.g., Ver gonna __ the telfa) increased it to 51.7%, and both cues together (e.g., Daddy’s gonna __ the cookie) increased it to 75%. Gleitman and her colleagues argue that linguistic context serves as a “zoom lens” to help learners narrow the search space of potential event referents for an unfamiliar verb (Gleitman, 1990; Landau & Gleitman, 1985). Structurally, the verb’s argument structure offers a cue to the broad type of event it describes; for example, “it’s gorping it” describes a 2-participant event. Semantically, the lexical content of the arguments reveals potential referents of those participants; for example, “the duck is gorping the bunny” describes an event involving those two participants, and “she is gorping him,” while less specific about their identities, cues their genders. The zoom-lens effect will be particularly helpful when the observational situation is cluttered. For example, on a crowded playground with multiple people engaged in different actions, a richly informative linguistic context (e.g., “The girl in the red jacket is swinging”) may allow the listener to identify a specific agent, and thus, the event being labeled.

Therefore, for successful verb learning, the information contained in linguistic context is essential, not only for adults, but also for children (see Piccin & Waxman (2007) for a replication of Gillette et al.’s (1999) task with 7-year-olds). Decades of research have shown that young learners can use linguistic context in the service of verb learning, both structural information such as the number of arguments (e.g., Fisher, 1996, 2002; Fisher, Hall, Rakowitz, & Gleitman, 1994; Hirsh-Pasek & Golinkoff, 1996; Naigles, 1990; Yuan, Fisher, & Snedeker, 2012) and semantic information such as lexical content (e.g., Arunachalam & Waxman, 2011, 2015; Fisher et al., 1994; Imai, Haryu, & Okada, 2005; Imai et al., 2008; Syrett, Arunachalam, & Waxman, 2014).

But what kind of linguistic context is most supportive for verb learning in young children? With respect to semantic information, we might intuitively think that the more information the linguistic context carries (as long as that information is truthful), the more focused the zoom lens will be. Some findings are consistent with this intuition. For example, Arunachalam and Waxman (2011, 2015) taught English-acquiring 2-year-olds novel transitive verbs describing events in which an actor acted on an object. At Test, two new events were presented simultaneously, one depicting the same actor and action but a new object, and the other depicting the same actor and object but a new action. They found that children learned verbs appearing in semantically richer contexts, namely sentences with lexical determiner phrases (DPs) as subjects and objects (e.g., “The boy is pilking a balloon”), but failed when given semantically sparser contexts with pronominal arguments (e.g., “He is pilking it”). Similarly, Imai et al. (2008) found that older children, 5-year-old English learners, could succeed with pronominal arguments, but not with null arguments (e.g., “Pilking”), again indicating that more information was better than less. These studies, taken together, document that young children require substantial semantic support from linguistic context to discover the meaning of a novel verb, although the specific amount of semantic information they need may decrease with development.

However, other findings suggest that more information may not always be better. In a replication of Arunachalam and Waxman (2011) with 24-month-old Korean learners, the pattern was the opposite of the English results: Korean learners performed better when the novel transitive verb occurred with its arguments elided (e.g., “Pilking”) than when both arguments were overt, though performance was not above chance in either condition (Arunachalam, Leddon, Song, Lee, & Waxman, 2013). Again, Imai and colleagues found a similar pattern with older children: Japanese-learning 5-year-olds succeeded with elided arguments but not with pronouns (Imai et al., 2005, 2008).

Similar evidence also comes from intransitive verbs. Lidz, Bunger, Leddon, Baier, and Waxman (2009) showed that younger English-learning children (22 months) learned novel intransitive verbs with a pronominal subject (e.g., “It’s blicking”) but not with a lexical DP subject (e.g., “The flower is blicking”). He and Lidz (2016) replicated this finding and further showed that the difference was due to semantic content rather than syntactic complexity, because children also successfully learned the verbs from sentences like, “That thing is blicking,” where the subject was semantically lighter but structurally complex.

These findings all suggest a seemingly counter-intuitive conclusion: that sometimes less information is better. But why is that so? We suggest that informativity, which we define as the amount of (truthful) information in the utterance, interacts with other factors and that the balance among these factors determines how supportive a linguistic context may be for verb learning.¹ Specifically, we must also consider how informativity interacts with factors that determine how useful (i.e., how relevant for identifying the verb’s referent), or how usable (i.e., how interpretable or decodable), that information is.

With respect to usefulness, one relevant factor may be pragmatic felicity of the sentence containing the novel verb. The verb learning studies reviewed above presented novel verbs in visual contexts with only one event (e.g., a boy acting on a balloon, a flower moving) and thus there was only one salient candidate referent for the novel verb, and only minimal disambiguating information was needed in the linguistic context. Extra information may have rendered the sentence “overinformative” and therefore pragmatically infelicitous. This would explain why learning intransitive verbs may require less linguistic information than transitive verbs: to acquire transitive verbs, learners must identify who is doing what to whom, requiring some specificity in the linguistic context, which may be overinformative for learning intransitive verbs whose event structure is much simpler. It may also explain why learners of argument-drop languages may do better with less information––overtly mentioning arguments that are often omitted in their language may be pragmatically disfavored.

However, there is no direct evidence that child learners are sensitive to this kind of pragmatic infelicity. Child-directed speech is replete with redundancy (e.g., Newport, 1975; Snow, 1972), and in experimental studies, children up to 5 years of age are tolerant of overinformative statements (Davies & Katsos, 2010; Morisseau, Davies, & Matthews, 2013). For example, Morisseau et al. (2013) found that when provided an overinformative instruction like “Find the bird with the feather” when there was only one normal-looking bird in the display, 3-year-olds did not appear to notice the infelicity and were not delayed in their response.

With respect to usability of information, young learners may be sensitive to the processing load associated with the linguistic context. To benefit from the information in the linguistic context, children must be able to process it, but young children are still honing their processing skills (e.g., Trueswell, Sekerina, Hill, & Logrip, 1999); they must parse to learn while they are still learning to parse (Kidd, Bavin, & Brandt, 2013; Omaki & Lidz, 2015; Trueswell & Gleitman, 2007). Could more information hurt verb learning by virtue of incurring a heavier processing load? In the case of the intransitive verbs studied by Lidz et al. (2009) and He and Lidz (2016), the lexical DP may have incurred too great a processing load for these quite young learners, while the pronominal subject was sufficiently informative (given that there was only one entity in the scene) and easier to process. In support of this interpretation, these authors found that children performed better with the lexical DP if they were exposed to it beforehand (e.g., “Look at the flower…. The flower is blicking”). They surmised that this pre-exposure lessened the processing burden associated with retrieving the lexical DP’s referent. In the case of verb learning in Korean (Arunachalam, Leddon, et al., 2013) and Japanese (Imai et al., 2005, 2008) reviewed above, the availability of argument drop in these languages may mean that learners have less experience, and therefore more difficulty, processing overt arguments (Arunachalam & Waxman, 2011, 2015).

Independent evidence from language comprehension studies outside the domain of verb learning supports the hypothesis that limitations on the developing parser affect children’s use of linguistic context. Choi & Trueswell (2010) demonstrated that 4- and 5-year-olds had difficulty inhibiting misinterpretations of garden-path sentences despite highly informative cues from the sentence, attributable to their limited processing abilities. Huang and Arnold (2016) found decreased sensitivity to syntactic cues in 5-year-olds when the sentence imposed a particularly difficult parsing challenge (i.e., passive sentences that require revising an initial thematic role assignment). More relevantly to the current study, Fernald, Marchman, and Hurtado (2008) found that children who were faster to process through a modified DP were more likely to learn the meaning of a novel noun downstream in the sentence (as in, “There’s a blue cup on the deebo,” given a scene with a blue cup on one novel object and a red cup on another). Thus, the ability to process familiar words in a linguistic context facilitates acquisition of new words encountered later. In the case of verbs, whose referents are often dynamic events, more efficient processing of familiar words early in the sentence may be even more important, allowing the child to shift attention to the correct event in time to see it unfold.

Therefore, informativity, pragmatic felicity, and processability may all play a role in determining whether a particular linguistic context will be supportive for verb learning. While young children usually need significant semantic support to discover the meanings of novel verbs, it may be that “less information is more” when the linguistic information is either not useful or not usable by the young learner.

In the current study, we explore these issues using an experimental verb learning task similar to those used in the studies reviewed above. However, in these prior studies, the visual context depicting the verb was so simple (i.e., one event with one salient action) that there were very few candidate referents. The linguistic zoom lens is likely to play a severely diminished role in such contexts. For an adult, at least, the barest of linguistic contexts (e.g., “Pilking”) would have been sufficient. Given that in the real world, the extralinguistic context is likely to be much more complex than that in a laboratory setting (Gillette et al., 1999; Gleitman, 1990; Landau & Gleitman, 1985), we take a first step toward a more complex learning scenario by introducing novel verbs with two simultaneous events (e.g., girl waving, boy clapping), only one of which is labeled by the verb. Thus, the linguistic context is crucial for identifying the referent event. We ask: when learners encounter novel verbs in a setting where they must use the linguistic context to identify the verb’s referent event, is more information better? Or is there still a tradeoff between informativity and pragmatic felicity and/or processing load? A second goal of the current study is to tease apart pragmatics and processing. Above, we interpreted examples of children’s relatively poorer performance with more information to be either about pragmatics or about processing. Here, we include a manipulation to test both possibilities.

Current Study

In the current study, we study young children’s acquisition of novel intransitive verbs in linguistic contexts that vary in informativity, pragmatic felicity, and processability. We focus on intransitive verbs both because their simple argument structure allows us to manipulate only one argument, and because their simple event structure means that linguistic context can be used primarily to identify which referent event is being labeled rather than how multiple event participants relate to each other.

We vary the verbs’ linguistic contexts by manipulating the semantic content contained in the subject DP. As reviewed above, one salient way in which linguistic contexts can differ in the amount of semantic content is how the verb’s arguments are realized (e.g., as pronouns, content nouns, or null arguments). In the current study, we present subject arguments with contentful DPs either with or without an adjectival modifier, as in (a) and (b) below. Sentence (b) contains more semantic content––that is, a more detailed description of the event participant––than sentence (a) and is thus higher in informativity. (Recall that we define “informativity” as the amount of semantic information contained, regardless of whether that information is useful for identifying the target.)

The girl is fezzing.
The tall girl is fezzing.

The novel verb is presented alongside two visual scenes that play simultaneously on either side of the screen, requiring learners to attend to the linguistic context to learn which is being referred to, and in turn to learn the verb’s meaning. We manipulate the visual context in order to study the potential effects of pragmatic felicity. In one condition, the agents in the two scenes come from different basic-level categories but have contrasting salient properties (e.g., a tall girl in one scene and a short boy in another), such that the two scenes are distinguishable by the noun alone in the DP while the modifier (if any) also contributes true information. In another, the agents also have contrasting salient properties but come from the same basic-level category (e.g., a tall girl in one scene and a short girl in another), such that the two scenes are distinguishable only by the modifier. This allows us to see how verb learning is affected by overly informative linguistic contexts (when the modifier is unnecessary) as compared to appropriately informative contexts (when the modifier is necessary).

Given these two types of linguistic context and two types of visual context, we designed three verb learning conditions,² varying only in the Familiarization phase. In the first condition, which we call the Light condition (i.e., semantically lighter), children hear novel verbs in sentences like (a), with visual scenes with different basic-level-category agents. In the second condition, which we call the Heavy-Unnecessary condition (i.e., semantically heavier, and the modifier is not necessary for identifying the target event), children hear sentences like (b) but see scenes with different basic-level-category agents. In the third condition, which we call the Heavy-Necessary condition (i.e., semantically heavier, and the modifier is necessary for identifying the target event), children hear the verbs in sentences like (b) but see scenes with agents from the same basic-level category. (The fourth logical possibility given our visual and linguistic manipulations is a “light” linguistic stimulus with two agents from the same basic-level category, but we do not include it because it would be unresolvable.) The Test phase in all conditions is identical: children see two new scenes, each depicting a new agent performing one of the actions seen during Familiarization, and they are asked, for example, to “point to fezzing.”
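
The logic that yields three usable conditions out of the four logically possible pairings can be sketched in a few lines of Python. This is only an illustration of the design reasoning described above, not part of the study's materials: the `Agent` type, the `classify` function, and the way agents are encoded as a modifier plus a noun are all hypothetical.

```python
from typing import NamedTuple, Optional


class Agent(NamedTuple):
    modifier: str  # salient contrasting property, e.g. "tall" (hypothetical encoding)
    noun: str      # basic-level category, e.g. "girl"


def matches(agent: Agent, noun: str, modifier: Optional[str]) -> bool:
    """True if the subject DP (optional modifier + noun) truthfully describes the agent."""
    return agent.noun == noun and (modifier is None or agent.modifier == modifier)


def classify(scene: list[Agent], noun: str, modifier: Optional[str]) -> str:
    """Label one pairing of a two-agent Familiarization display with a subject DP."""
    full_matches = [a for a in scene if matches(a, noun, modifier)]
    if len(full_matches) != 1:
        # The DP does not pick out exactly one agent, so the target event is ambiguous.
        return "unresolvable"
    if modifier is None:
        return "Light"
    # The modifier is unnecessary when the noun alone already picks out one agent.
    noun_only = [a for a in scene if matches(a, noun, None)]
    return "Heavy-Unnecessary" if len(noun_only) == 1 else "Heavy-Necessary"


different_category = [Agent("tall", "girl"), Agent("short", "boy")]
same_category = [Agent("tall", "girl"), Agent("short", "girl")]

for scene_name, scene in [("different", different_category), ("same", same_category)]:
    for modifier in (None, "tall"):
        print(scene_name, modifier, classify(scene, "girl", modifier))
```

Under this encoding, the unmodified DP paired with two agents from the same category is the only pairing that fails to pick out a single agent, which corresponds to the fourth cell that the design omits.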

In all three conditions, because there are two candidate events when the verb is first introduced, the linguistic context is needed for identifying the verb’s referent. If it is the case that more information is better, we would expect better verb learning in the two Heavy conditions than in the Light condition. However, the two Heavy conditions differ in whether the modifier is necessary for identifying the verb’s referent. If children are sensitive to pragmatic felicity, we would expect better performance in the Heavy-Necessary condition than the Heavy-Unnecessary condition. Alternatively, because the two Heavy conditions, by virtue of having “heavier” subject DPs, impose a higher processing load as compared to the Light condition, children may only succeed in the Light condition and not in the two Heavy conditions if processing load limits their use of the linguistic context.

We report two experiments. In Experiment 1, like prior verb learning work, we looked at children’s verb learning outcomes, that is, their choice of the target versus the distractor scene during the Test phase. Results suggested an intriguing trend: more information might hinder, rather than facilitate, verb learning, and processing load, rather than pragmatics, might be responsible. To replicate and better understand the results of Experiment 1, in Experiment 2 we looked not only at learning outcomes but also at the learning process, by tracking eye gaze during the Familiarization phase to make inferences about which part of the learning task children struggled with. Data and analysis code for both experiments can be found at osf.io/tv7kh.

Experiment 1

Participants

Sixty English-learning children (30 boys, 30 girls) with a mean age of 38.2 months (range: 30.4–48.0 months) were included in the final sample. They were all reported by their parent to be exposed to English at least 70% of the time, and to have no known language, communication, or uncorrected hearing or vision problems. Data from 13 additional children were excluded due to failure to point correctly on at least one of two pointing training trials, or failure to point at all on at least 2 of the 4 experimental trials.

Stimuli

Stimuli consisted of two pointing training trials and four experimental trials. The visual stimuli for training trials were video clips of familiar characters (e.g., Big Bird), and auditory prompts were provided by an experimenter. For experimental trials, the videos were recordings of actors engaging in simple actions (e.g., a girl waving) or objects manipulated by a hand (e.g., a truck rotating).³ In the latter case, the hand was backgrounded so that the object, not the agent, was the only salient event participant. Auditory stimuli were recorded by a female native English speaker using child-directed speech, edited in Praat (Boersma & Weenink, 2014), and combined with the visual stimuli in Final Cut Pro X.

Apparatus and Procedure

The child and parent were first welcomed into a playroom where the child played with an experimenter and the parent provided consent and completed the MacArthur Communicative Development Inventory Level III (MCDI) together with a supplementary vocabulary checklist including the adjectival modifiers used in the experiment. MCDI scores were not significantly different across conditions; see Table 1. Then, in the testing room, the child sat in a car seat or on the parent’s lap 18 inches from a 24-inch monitor. Stimuli were presented from a desktop PC (Dell Precision T5500). The parent was asked not to talk, and to wear a blindfold if the child was on the parent’s lap. An experimenter sat next to the child to elicit and record pointing responses; another experimenter also recorded pointing from behind a curtain via integrated webcam feed.

Table 1.

Summary of MCDI-III vocabulary scores of children in Experiment 1

	Light (n = 20)		Heavy-Unnecessary (n = 19)¹⁰		Heavy-Necessary (n = 20)
	Mean	SD	Mean	SD	Mean	SD
Total vocabulary	74.05	12.60	79.32	17.26	74.45	21.13
Nouns	39.20	6.50	41.68	6.91	38.85	9.48
Verbs	9.90	1.71	10.00	2.75	8.95	3.43

Pointing Training.

The pointing training session introduced familiar characters engaged in familiar actions. Children saw two video clips side-by-side and were asked to point to one (e.g., “Show me dancing.”).

Experimental Trials.

For experimental trials, children were randomly assigned to one of three conditions in a between-subject design: Light, Heavy-Unnecessary, or Heavy-Necessary (each N = 20). They participated in four trials, each comprising a Familiarization phase and a Test phase. Trials differed by condition only in the Familiarization phase. All children completed four trials in the same order. See Table 2 for an example trial and Table 3 for a summary of stimuli in all trials.

Table 2.

An example trial (Experiment 1)

		Familiarization			Test
		6 sec	.5 sec	6 sec	.5 sec	6 sec	1 sec	6 sec
Visual stimuli	Light & Heavy-Unnecessary conditions
Visual stimuli	Heavy-Necessary conditions
Auditory stimuli	Light condition	The girl is fezzing.	(silence)	(silence)	Take a look!	(silence)	Show me fezzing!	Point to fezzing!
Auditory stimuli	Heavy-Unnecessary & Heavy-Necessary conditions	The tall girl is fezzing.	(silence)	(silence)	Take a look!	(silence)	Show me fezzing!	Point to fezzing!

Table 3.

A summary of stimuli used in all trials in Experiment 1.

Trial	Actors (Light & Heavy-Unnecessary conditions)	Actors (Heavy-Necessary condition)	Actions	Sentence
1	tall girl, short boy	tall girl, short girl	clap, wave	“The (tall) girl is fezzing”
2	happy boy, sad girl	happy boy, sad boy	march, crouch	“The (happy) boy is mooping”
3	round ball, rectangular dump truck	round ball, football	tilt up/down, roll back/forth	“The (round) ball is leaming”
4	big baby doll, little pig	big baby doll, little baby doll	lean forward, bounce	“The (big) baby is gopping”

Familiarization Phase.

On each trial, children first saw two side-by-side video clips, each depicting a different event participant and a different action (e.g., a tall girl waving). Children heard a novel verb in a sentence describing one of the video clips. The sentence either had an unmodified subject DP (Light condition, e.g., “The girl is fezzing”) or one with a modifier (Heavy conditions, e.g., “The tall girl is fezzing”). We chose adjectives that we expected children to know, avoiding color terms because they are notoriously slow to be acquired (e.g., Backscheider & Shatz, 1993; Wagner, Dobkins, & Barner, 2012). Parental report confirmed that these adjectives were in most participants’ productive vocabularies; see Table 4.⁴ The videos played for 6 seconds along with the sentence, and then again for 6 seconds in silence after a brief blank screen. Children heard the sentence with the novel verb only once, making this quite a demanding task.

Table 4.

Percentage of participants reported to produce the adjectives used in Experiment 1

	Light (n = 20)	Heavy-Unnecessary (n = 19)¹¹	Heavy-Necessary (n = 20)
tall	90	90	85
happy	95	90	95
round	na	na	na
big	100	95	100

Test Phase.

Two new scenes were then presented, featuring a different actor from the one seen previously, but performing the same two actions. After the attention-getting audio, “Take a look!”, the child was prompted to point to the scene labeled by the verb, twice (“Show me fezzing” and “Point to fezzing”). If the child did not point after the two pre-recorded prompts, a third prompt was offered by the experimenter seated next to the child. The scene displaying the same action as the one labeled by the novel verb during Familiarization was the target, and the other was the distractor. The target scene for each trial was the same for all children (e.g., for all children, the clapping action was the target on the fez trial). The left-right position of the target and distractor at Test, as well as whether it appeared on the same or opposite side as it had during Familiarization, was counterbalanced across trials.

Coding

Both experimenters independently recorded whether the child pointed to the left, right or not at all. Disagreements were resolved by discussion immediately after the session (specifically, the coder behind the curtain sometimes coded left and right from his/her own perspective, which was the mirror image of the child’s; this systematic discrepancy was easily corrected). The first point (if more than one) produced on each trial served as the dependent measure: 0 for pointing to the distractor, and 1 for the target. The 7.5% (18 out of 240) of trials on which children did not point were excluded from analysis.
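
The scoring rule just described can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names and the example child are hypothetical, and it is not the authors' analysis code (which is available at the OSF link above).

```python
from typing import Optional, Sequence


def score_trial(first_point: Optional[str], target_side: str) -> Optional[int]:
    """Return 1 if the first point was to the target, 0 if to the distractor,
    and None if the child did not point (such trials are excluded)."""
    if first_point is None:
        return None
    return 1 if first_point == target_side else 0


def child_score(scores: Sequence[Optional[int]]) -> Optional[float]:
    """Proportion of target choices over the trials on which the child pointed."""
    valid = [s for s in scores if s is not None]
    return sum(valid) / len(valid) if valid else None


# Hypothetical child: (side of first point, side of target) on four trials.
trials = [("left", "left"), ("right", "left"), (None, "right"), ("right", "right")]
scores = [score_trial(point, target) for point, target in trials]
print(scores, child_score(scores))

# Exclusion rate reported in the text: 18 no-point trials out of 60 children x 4 trials.
print(18 / (60 * 4))
```

The last line reproduces the reported proportion of excluded trials, 18 of 240.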

Results

The results are illustrated in Figure 1. Mean pointing score in the Light condition was 0.65 (SD = 0.29), in the Heavy-Unnecessary condition, 0.53 (SD = 0.30), and in the Heavy-Necessary condition, 0.52 (SD = 0.26). In the Light condition, 18 children chose the target on at least half of the trials, among which 6 did so on all four trials. In the Heavy-Unnecessary condition, only 12 chose the target on at least half of the trials, and 3 did so on all four. In the Heavy-Necessary condition, 15 chose the target on at least half of the trials, and 2 did so on all four. For all statistical analyses, we adopted a significance level of 0.05.⁵ Analyses were carried out in R (version 3.3.3) using the lme4 package (version 1.1–12).

To determine whether children learned the novel verbs in any condition, we compared their performance to chance level (i.e., 0.5). Data from each condition were submitted separately to mixed-effects logistic regression models, which allowed us to add participant and trial as random factors and age (in months; centered around its mean) as a continuous predictor. We compared pointing responses to chance by evaluating the intercept parameter. Only the Light condition yielded above-chance learning (intercept parameter = 0.74, z = 2.03, p < 0.05); the Heavy-Unnecessary condition (intercept parameter = 0.17, z = 0.45, p = 0.65) and the Heavy-Necessary condition (intercept parameter = 0.11, z = 0.47, p = 0.64) did not. There was no effect of age in any condition. Correlation analyses were conducted to see whether there was any correlation between vocabulary (total MCDI vocabulary, as well as number of modifiers known) and performance. However, we found no such correlation: all p-values for Pearson correlations were greater than 0.23, with one exception. In the Heavy-Necessary condition, the correlation between MCDI vocabulary and performance approached significance (R² = 0.19, p = 0.055).
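
To see why evaluating the intercept amounts to a comparison against chance, note that a logistic model's intercept is on the log-odds scale, where 0 corresponds to a probability of 0.5. The sketch below converts the reported intercepts to probabilities and recomputes two-sided p-values from the reported z statistics. It assumes these are Wald z tests of the kind lme4 reports by default. The converted probabilities describe a child of mean age with random effects at zero, so they need not equal the raw condition means.

```python
import math


def inv_logit(log_odds: float) -> float:
    """Convert log-odds to a probability; log-odds of 0 gives chance (0.5)."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal z statistic (Wald test)."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Intercepts and z values as reported in the text for each condition.
reported = {
    "Light": (0.74, 2.03),
    "Heavy-Unnecessary": (0.17, 0.45),
    "Heavy-Necessary": (0.11, 0.47),
}

for condition, (intercept, z) in reported.items():
    print(f"{condition}: p(target) = {inv_logit(intercept):.3f}, "
          f"two-sided p = {two_sided_p(z):.3f}")
```

The recomputed p-values agree with those reported once rounded to two decimals, and only the Light condition falls below the 0.05 threshold.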

Next, we asked whether children improved across trials by adding block (first two trials vs. last two trials) as a fixed effect to the models. We found no main effect of block in either the Heavy-Unnecessary condition (z = 1.57, p = 0.12) or the Heavy-Necessary condition (z = 1.42, p = 0.15), but a marginally significant effect (z = 1.91, p = 0.056) in the Light condition. A closer look at each trial individually, however, does seem to reveal some across-trial improvement. In particular, children in all conditions performed at chance level on Trial 1, and above chance on Trial 4; on Trials 2 and 3, the pattern resembled the overall one: children showed above-chance performance in the Light condition but not in either of the two Heavy conditions. In other words, although there were no significant effects of block, the trends suggest that children may have improved over the course of the study, but more quickly in the Light condition than in the Heavy conditions. See Table 5 for a summary of performance by trial.

Table 5.

Summary of by-trial performance in Experiment 1

	Light		Heavy-Unnecessary		Heavy-Necessary
	Mean	SD	Mean	SD	Mean	SD
Trial 1	0.47	0.51	0.42	0.51	0.42	0.51
Trial 2	0.63	0.50	0.47	0.51	0.47	0.51
Trial 3	0.80	0.41	0.47	0.51	0.50	0.52
Trial 4	0.70	0.47	0.79	0.42	0.72	0.46
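
A note on reading the SD columns in Table 5: because each trial score is binary (0 or 1), the standard deviation is fully determined by the mean and the number of children contributing a point. The sketch below illustrates that relationship. The value n = 20 is an assumption made for illustration only; the number of children who pointed on each trial is not given in the table.

```python
import math


def binary_sd(mean: float, n: int) -> float:
    """Sample SD (n - 1 denominator) of n binary scores whose mean is `mean`."""
    return math.sqrt(mean * (1.0 - mean) * n / (n - 1))


# Light-condition means for Trials 3 and 4, with an assumed n of 20 per cell.
for mean in (0.80, 0.70):
    print(mean, round(binary_sd(mean, 20), 2))
```

With that assumed n, the computed values match the tabled SDs for those two cells (0.41 and 0.47), so the SD columns carry no information beyond the means and cell sizes.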

Finally, we entered the data from all conditions into a single mixed-effects logistic regression model to look for significant differences among conditions, but we found none: there was no significant difference between the Light and Heavy-Unnecessary conditions (z = −1.29, p = 0.20), nor between the Light and Heavy-Necessary conditions (z = −1.42, p = 0.15), nor between the two Heavy conditions (z = 0.13, p = 0.90).

Discussion

Taken together, the analyses that compared performance in each condition to chance level suggest that children learned novel verbs introduced with unmodified DPs, but not with modified DPs. This suggests that more information may not be better, even when information in the linguistic context is helpful for determining the meaning of the novel verb. However, the lack of a significant difference between conditions prevents us from drawing strong conclusions. This null result may be due to the small range of values possible in a binary pointing task, and the relatively low “ceiling” with young children.⁶ Nevertheless, replication of these results is needed, which we take up in Experiment 2.

We considered two possible explanations for children’s difficulty with the modified conditions. One is pragmatics––perhaps children found the modified DP pragmatically infelicitous. We tested this possibility by including two conditions, each with a modified DP, but one with a visual scene that made the modifier necessary (Heavy-Necessary condition) and another with a visual scene that made the modifier unnecessary and therefore infelicitous (Heavy-Unnecessary condition). Children struggled equally in both, indicating that pragmatics is not a likely explanation for their difficulty with the heavy conditions. Instead, the results are consistent with an account taking into consideration the processing load that information imposes on children’s developing parsers. Specifically, the extra information carried by the modified DPs may have overloaded children’s parsers, preventing them from learning the novel verbs downstream.

There are at least two ways in which modified DPs can hinder parsing. One possibility is that children simply fail to parse the modified DPs. This seems unlikely because Thorpe and Fernald (2006) demonstrated that even 24-month-olds are able to process DPs with one modifier. A second possibility, then, is that children can parse the modified DP, but are left with insufficient resources to learn the novel verb. Fernald et al. (2008) found in a noun learning study that 36-month-olds were able to process modified DPs and use that information to learn novel nouns downstream. However, prior work on verb learning has established that it is typically even more demanding than noun learning (e.g., Gleitman, Cassidy, Nappa, Papafragou, & Trueswell, 2005), for reasons that apply both in general and specifically to our experimental manipulations. First, verbs’ referents are often dynamic events and were depicted with dynamic video scenes in our study, while the referents in Fernald et al. (2008) were novel objects depicted with static images. Dynamic events require children to process the visual scene over time, and may suggest many possible interpretations as the event unfolds. Second, verb acquisition requires encoding argument structure and event structure information, while the referents of object nouns (like those tested in Fernald et al., 2008) are inherently less complex. Finally, learning any kind of word requires generalizing from a single exemplar to other members of the same category, but there is substantial evidence that children find this more difficult for verbs that label dynamic events than for nouns that label objects (e.g., Behrend, 1990; Forbes & Farrar, 1993, 1995; Imai et al., 2005, 2008; Kersten & Smith, 2002). Whereas children in Fernald et al.’s (2008) study saw the same novel objects from familiarization to test, our design required children to generalize the new verb to a new agent.
Therefore, we suspect that the difficulty of verb learning, both in general and in the specific task children faced in our study, is responsible for children’s difficulty in the Heavy conditions. On this account, after processing the modified DPs, children lacked sufficient processing resources to tackle the challenging verb learning task. In Experiment 2, we sought to replicate Experiment 1, and also to pursue this hypothesis about the reason for children’s failure to learn verbs with modified DP subjects.

Experiment 2

To replicate the findings of Experiment 1, we used the same design, but with a new set of stimuli and a slightly broader age range. To obtain insight into why children struggled with the Heavy conditions in Experiment 1, we asked whether, as in Thorpe and Fernald (2006) and Fernald et al. (2008), children were able to process the subject DPs. If so, we have indirect evidence that the barrier to verb learning was not identifying the referent of the modified DPs, but rather the accumulated processing load that doing so incurred. To accomplish this, we looked not only at children’s pointing responses at Test as an index of their verb learning outcomes, but also at their eye gaze during Familiarization as an index of their learning process.