Effects of prosodic and lexical constraints on parsing in young children (and adults)

Jesse Snedeker

doi:10.1016/j.jml.2007.08.001

Author manuscript; available in PMC: 2009 Feb 1.

Published in final edited form as: J Mem Lang. 2008 Feb;58(2):574–608. doi: 10.1016/j.jml.2007.08.001


PMCID: PMC2390868 NIHMSID: NIHMS41621 PMID: 19190721

Abstract

Prior studies of ambiguity resolution in young children have found that children rely heavily on lexical information but persistently fail to use referential constraints in online parsing (Trueswell, Sekerina, Hill & Logrip, 1999; Snedeker & Trueswell, 2004). This pattern is consistent with either a modular parsing system driven by stored lexical information or an interactive system which has yet to acquire low-validity referential constraints. In two experiments we explored whether children could use a third constraint, prosody, to resolve globally ambiguous prepositional-phrase attachments (“You can feel the frog with the feather”). Four- to six-year-olds and adults were tested using the visual world paradigm. In both groups the fixation patterns were influenced by lexical cues by around 200 ms after the onset of the critical PP-object noun (“feather”). In adults the prosody manipulation had an effect in this early time window. In children the effect of prosody was delayed by approximately 500 ms. The effects of lexical and prosodic cues were roughly additive: prosody influenced the interpretation of utterances with strong lexical cues and lexical information had an effect on utterances with strong prosodic cues. We conclude that young children, like adults, can rapidly use both of these information sources to resolve structural ambiguities.

Keywords: modularity, parsing, prosody, syntactic ambiguity resolution, children’s language comprehension, eye movements

Effects of prosodic and lexical constraints on parsing in young children (and adults)

The present study explores how adults and children combine information about the prosodic structure of an utterance and information from individual words to guide syntactic analysis. Several recent studies have demonstrated that prosody has rapid effects on parsing in adults (see e.g., Kjelgaard & Speer, 1999; Steinhauer, Alter, & Friederici, 1999; Snedeker & Trueswell, 2003). But prior research has found little or no effect of prosody on children’s interpretation of ambiguous sentences (e.g., Snedeker & Trueswell, 2001; Vogel & Raimy, 2002; Choi & Mazuka, 2003). In these prior studies, however, the lexical content of the utterances and its effects on parsing were not assessed, raising the possibility that children might be sensitive to prosody, but consult it only when lexical constraints are weak. The experiments that follow explore the interaction of prosodic and lexical information in the comprehension of ambiguous sentences in both children and adults.

Our interest in this question is three-fold. First, determining whether young children use prosody in online sentence comprehension could illuminate the architecture of language processing at a critical stage of development. Second, clarifying the role of prosody in children’s online processing may help us understand the developmental origins of the links between prosody and syntax. Finally, while there is ample evidence that adults use prosody to parse spoken utterances, there is little data on how prosody is integrated with other cues (but see Pynte & Prieur, 1996). Does prosody chop up utterances for further processing, overriding subsequent sources of information? Or is it merely used to revise or strengthen analyses that were originally proposed by other information sources? In the remainder of the introduction we briefly review the relevant aspects of adult sentence processing, and then explore each of these issues in turn.

Setting the stage

For decades psycholinguists have explored how adult listeners (and readers) recover the syntactic structure of a sentence from a string of words. Much of this research has focused on understanding the kinds of information that are used in the process, when they become available, and how they interact. These questions have primarily been examined by investigating the way readers initially interpret, and misinterpret, syntactically ambiguous phrases. For example, consider the sentence fragment (1) below:

(1) Allison ate the cake with the…

At this point in the utterance the prepositional phrase (PP) beginning with “with” is ambiguous: it could be attached to the verb “ate” (VP-attachment), indicating an instrument (e.g., “with the fork”); or it could be attached to the definite noun phrase “the cake” (NP-attachment), indicating a modifier (e.g., “with the pink icing”). In adults, several kinds of information rapidly influence the interpretation of ambiguous phrases. Three of these are particularly relevant for our discussion. First, knowledge about the particular words in the sentence constrains online interpretation (see e.g., Taraban & McClelland, 1988; Trueswell, Tanenhaus & Kello, 1993). For instance, the sentence in (2) favors the VP-attachment, but if we change the verb from “hit” to “liked” (as in 3) the preference flips and the modifier analysis, or NP-attachment, is favored.

(2) Allison hit the cake with the…

(3) Allison liked the cake with the…

Second, under some circumstances adults can use intonation to resolve attachment ambiguities. The presence of a prosodic break or a pause before the preposition will support the VP-attachment (4), while the presence of a break before the direct object favors an NP-attachment (5) (Pynte & Prieur, 1996; Schafer, 1997).

(4) Allison ate the cake / with a butcher knife

(5) Allison ate / the cake with the chocolate ganache

Finally, the situation or referential context in which the utterance is used can have an effect (Crain & Steedman, 1985). While the context can constrain interpretation in many ways, research in this area has focused on manipulating the number of potential referents that are available for a definite noun phrase. In the example above, if only one cake is present in the discourse then the VP-attachment is often preferred, but if multiple cakes are available then readers are more likely to initially interpret the ambiguous phrase as a modifier specifying the cake in question (e.g., Altmann & Steedman, 1988; van Berkum, Brown & Hagoort, 1999, but see Ferreira & Clifton, 1986).

In reading studies, such referential constraints typically take a back seat to strong lexical constraints (e.g., Britt, 1994; Spivey-Knowlton & Sedivy, 1995). However, Tanenhaus, Spivey and colleagues found that, in a world-situated spoken-language comprehension task, referential cues prevailed over strong countervailing lexical biases (Tanenhaus, Spivey-Knowlton, Eberhard & Sedivy, 1995; Spivey, Tanenhaus, Eberhard & Sedivy, 2002). When participants heard utterances like (6) in the presence of just one apple, they initially interpreted the first prepositional phrase (“on the napkin”) as a destination. But when two apples were provided (one of which was on a napkin), the participants were able to immediately use the referential context to overcome the strong bias of the verb and avoid this garden path, resulting in eye movements similar to those for unambiguous controls (e.g., “Put the apple that’s on the napkin in the box”).

(6) Put the apple on the napkin in the box.

Much of the research on adult sentence processing has focused on questions about time course: Are initial structural hypotheses influenced by all of these information sources? Or, does the architecture of the comprehension system or the nature of the data source force us to exclude some of this information during the earliest stages of processing? Currently the bulk of the evidence suggests that adults rapidly integrate these different information sources to arrive at the analysis that best meets the constraints they have encountered (for reviews see Tanenhaus & Trueswell, 1995; Altmann, 1998). But disputes continue about how this integration occurs: Do some sources of information establish candidate analyses while other sources of information weigh in at a later stage (see e.g., Boland & Cutler, 1996; Pynte & Prieur, 1996)?

What we know about children’s parsing

Our experiments focus on the parsing abilities of children between four and six years of age. We have chosen this age for several reasons. First, by age four, children’s language comprehension and production appear, to the naked eye, to be almost adult-like. Yet on a number of cognitive dimensions children this age are quite different from adults. For example, they are notoriously poor at tasks which invoke multiple representations of a single entity or require the inhibition of prior or dominant responses (Piaget, 1946; Flavell, 1986; Welsh, Pennington & Groisser, 1991; Perner & Wimmer, 1985). They also have smaller memory spans than adults or older children (for review see Schneider & Bjorklund, 1998). The increase in memory span across development is accompanied by an increase in processing speed across a wide range of tasks, raising the possibility of a causal link (Kail, 1991; Kail & Park, 1994).

These cognitive differences could have profound implications for syntactic parsing. For example, in adults, individual differences in working memory performance are correlated with qualitative differences in parsing during reading (Just & Carpenter, 1992; MacDonald, Just & Carpenter, 1992). Readers with low memory spans are less able to integrate contextual cues or consider multiple analyses of an ambiguity. Presumably parallel limitations in children could shape their spoken language comprehension. Similarly, a global slowdown in processing speed, in the face of a constant speech rate, might well limit the amount of processing that is possible as each word is spoken and integrated into the sentence.

The present experiment builds on two prior studies exploring syntactic ambiguity resolution in children (Trueswell, Sekerina, Hill & Logrip, 1999; Snedeker & Trueswell, 2004). Taken together they indicate that children’s parsing is rapidly influenced by lexically specific information but is relatively impervious to referential context. In the first of these studies, Trueswell and colleagues explored the use of referential constraints in a study which closely paralleled the Tanenhaus and Spivey experiment described above. In contrast to the adults, five-year-olds (but not eight-year-olds) blindly pursued the VP-attachment analysis, ignoring referential information.

Trueswell and colleagues offered two explanations for the overwhelming VP-attachment preference on the part of young children. First, this preference could be driven by the children’s statistical knowledge of the verb “put”, which strongly supports the presence of a PP-argument. This explanation would be consistent with lexicalist theories and constraint-satisfaction theories more generally (e.g., MacDonald, Pearlmutter & Seidenberg, 1994; Trueswell & Tanenhaus, 1994). Second, five-year-olds might have been exhibiting a general structural preference for VP-attachment. Such a preference would be predicted by theories of acquisition and parsing that favor simple syntactic structures (i.e., a Minimal Attachment strategy, Goodluck & Tavakolian, 1982; Frazier & Fodor, 1978) or ban complex syntactic operations entirely in the early stages of development (Frank, 1998).

Snedeker and Trueswell (2004; S&T 2004 hereafter) explored these possibilities by fully crossing the statistical preferences of the verb and the number of potential referents for the direct-object noun. The target sentences contained globally ambiguous prepositional phrase attachments, like (7) below. These sentences were presented with sets of toys that provided distinct referents for the prepositional object under each of the two possible analyses. For example, in (7b) both a large feather and a frog holding a feather were provided.

(7) a. Choose the cow with the fork

b. Feel the frog with the feather

c. Tickle the pig with the fan

Both adults and five-year-old children were strongly swayed by the type of verb that was used in the instructions. When the verb was one that frequently appeared with an instrument phrase (7c), participants began looking at the potential instrument (e.g., a large fan) shortly after the onset of the prepositional object. When the verb was strongly biased to a modifier analysis (7a), participants focused in on the animal holding the object instead. These lexical biases largely determined the ultimate interpretation that the participants assigned to the prepositional phrase and hence their actions (see Kidd & Bavin, 2005, for converging evidence). While adults also incorporated referential information into their analyses, children showed little sensitivity to this manipulation.

This strong reliance on lexical information clearly rules out the possibility that children’s interpretation relies on a general parsing heuristic (e.g., minimal attachment) that diminishes with age or experience. Instead the work to date demonstrates a near-exclusive role for lexical evidence in informing children’s parsing decisions. This is compatible with three different accounts of the development of parsing.

1. The lexical modularity hypothesis

The observed differences between children and adults could reflect architectural changes brought on by expansions in processing ability. For instance, a limited, single-cue, or encapsulated parsing system might become more interactive as processing ability grows with age. Indeed, several theories of parsing grant an architectural privilege to lexical cues. For example, Boland and colleagues have argued that the lexicon alone proposes syntactic and semantic structures while other cues are used at a later stage to select between the proposed analyses (Boland & Cutler, 1996).

2. The cue validity hypothesis

Alternatively, it is possible that children have a probabilistic multiple-cue comprehension system from the start, but the order in which the cues are acquired depends largely on their relative reliability (for a related proposal see Bates & MacWhinney, 1987). Under this account, children might show an initial reliance on lexical cues simply because they are a highly reliable source of information about syntactic structure. Work in computational linguistics demonstrates that lexical cues are highly predictive of local structure (e.g., Collins, 1997; Collins & Brooks, 1995; Marcus, 1994), while studies of infant-directed speech demonstrate that this information is robustly present in children’s input (Lederer, Gleitman & Gleitman, 1995). Other constraints on syntactic structure, such as the need to resolve referential ambiguity, could simply take longer to acquire because they are less reliable in the input database as a whole, and arguably more difficult to track than lexico-syntactic contingencies. Both experimental and theoretical work suggests that the number of referents in a scene is a poor predictor of structure. Although adults understand that a definite NP almost always requires a unique (and agreed-upon) referent, disambiguation of the referent need not be accomplished linguistically, since the local discourse and the goals of the interlocutors often provide the necessary information (e.g., Hawkins, 1978; Prince, 1981). In a referential communication task, Brown-Schmidt and colleagues found that almost half of all definite NPs uttered (e.g., “Now, move the triangle”) did not have a unique referent in the scene (Brown-Schmidt, Campana & Tanenhaus, 2005). Participants, however, often had no difficulty identifying the correct referent, presumably because prior statements and the goals of the task had narrowed the field of possibilities down to one. Conversely, Engelhardt and colleagues found that speakers describing unique referents provided unnecessary post-nominal modifiers on almost one third of the trials (Engelhardt, Bailey & Ferreira, 2006). Surprisingly, naïve listeners judged these over-informative descriptions to be as adequate as the more concise simple nouns.
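
The notion of cue validity invoked here can be made concrete with a small numerical sketch. It assumes one common formalization from the Competition Model literature, in which a cue's overall validity is the product of its availability (how often the cue is present when the ambiguity arises) and its reliability (how often it points to the correct analysis when present). All names and counts below are hypothetical, invented only for illustration; they are not drawn from any corpus or from the studies cited above.

```python
from dataclasses import dataclass


@dataclass
class CueCounts:
    """Hypothetical tallies for one cue over a set of ambiguous utterances."""
    total_utterances: int  # utterances in which the ambiguity arises
    cue_present: int       # of those, utterances in which the cue is present
    cue_correct: int       # of the cue-present utterances, those where the
                           # cue points to the intended analysis


def availability(c: CueCounts) -> float:
    """Proportion of ambiguous utterances in which the cue is present."""
    return c.cue_present / c.total_utterances


def reliability(c: CueCounts) -> float:
    """Proportion of cue-present utterances in which the cue is correct."""
    return c.cue_correct / c.cue_present if c.cue_present else 0.0


def validity(c: CueCounts) -> float:
    """Overall validity, assumed here to be availability * reliability."""
    return availability(c) * reliability(c)


# Invented counts (NOT data): a lexical cue that is always present and
# usually correct, and a referential cue that is present less often and
# is less predictive of structure when it is present.
lexical = CueCounts(total_utterances=1000, cue_present=1000, cue_correct=800)
referential = CueCounts(total_utterances=1000, cue_present=500, cue_correct=300)

for name, cue in (("lexical", lexical), ("referential", referential)):
    print(name, availability(cue), reliability(cue), validity(cue))
```

Under these invented counts the lexical cue has the higher overall validity, which is the ordering the cue validity hypothesis would need in order to explain why lexical cues are acquired first.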

3. The bottom-up hypothesis

Finally, the results to date are consistent with a hypothesis that lies between the two extremes discussed so far. Children could have a probabilistic multiple-cue comprehension system but be unable to make use of some information types—regardless of their validity—due to architectural or processing constraints. While constraint satisfaction models propose that multiple types of information have rapid and converging effects on parsing, many such models recognize that different kinds of constraints stem from distinct levels of representation which emerge from distinct processing paths (e.g., Trueswell & Tanenhaus, 1994). Thus developmental differences in the use of different cues could arise from differences in the maturation or efficiency of these processing paths or differences in the relation between the constraining representation and the syntactic representation itself.

Referential context is most readily conceived of as a top-down constraint on syntactic structure. The relevance of the reference world depends on several aspects of the linguistic analysis that is under construction: 1) whether a definite determiner was used with the noun (“a bird” is sufficient even when two are present); 2) which noun was produced (“the heron” may be adequate even when “the bird” is not); 3) whether a prenominal modifier has already disambiguated the referent; and 4) whether the ambiguous phrase is in a syntactic context where NP-attachment is possible (e.g., “pick the dog up with the hat” vs. “pick up the dog with the hat”). In short, it is difficult to imagine how referential constraints could be calculated without some assembly, and semantic evaluation, of the structural hypotheses under consideration.

Top-down constraints may pose problems for young children. To make use of top-down information the child must rapidly construct the syntactic alternatives, information about these alternatives must propagate up to the higher-level representation, and then constraints from this higher-level representation must filter back down to the syntax. If children are slower to activate the alternatives, or to pass information from one level to the next, then the syntactic competition is likely to be resolved before the top-down information has arrived. There is ample evidence that processing speed increases in middle childhood across a variety of tasks (see e.g., Kail, 1991). Admittedly, there is limited experimental evidence to support our suggestion that top-down influences on bottom-up processing are generally late to develop (but see Chase & Tallal, 1990; James, 2001). Our analysis, however, receives some support from research on the effects of time manipulations in adult language processing. A global slowdown in processing speed (in the presence of a constant speech rate) is roughly parallel to an experimental manipulation in which time pressure is increased by creating a response deadline or increasing the speech rate. In both cases less processing can occur before the next word is encountered or production is initiated. Dell (1986) found that time pressure resulted in a decrease in the effect of a higher-level representation (lexical items) on a lower-level process (phoneme selection). These findings were tidily captured by a system of word production that shares many of the features of the constraint satisfaction models described above (e.g., distinct levels of representation, interactive processing, and bi-directional connections between levels).

The experiments that follow test the lexical-modularity hypothesis by exploring children’s use of a third constraint on syntactic structure: the prosodic structure of the utterance. There are several reasons why prosodic cues might become available earlier in ontogenetic time than the referential cues explored in the prior experiments. First, while informative prosodic breaks are often absent from short utterances, when they are present they are highly reliable, making them a useful predictor of structure (Snedeker & Trueswell, 2003; see discussion below). Second, while referential context exerts a top-down influence on syntactic parsing, prosodic cues are arguably a bottom-up constraint. By this we merely mean that, from the perspective of the listener, the prosodic structure may constrain syntactic structure but it is not dependent upon it (for discussion see Kjelgaard & Speer, 1999). This asymmetry could influence the parser’s ability to gain timely access to this information during comprehension. Finally, referential and prosodic representations may develop at different rates, which could influence the age at which they become integrated with online parsing. Many have argued that five-year-olds are still struggling to understand the referential demands and goals of various communicative situations (Glucksberg & Krauss, 1967; Robinson & Robinson, 1982). In contrast, even young infants show a well-developed sensitivity to the prosodic structure of their language.

Do children use prosodic structure to resolve syntactic ambiguities?

Prosody clearly plays a central role in infant speech perception. Newborns discriminate between languages on the basis of their rhythmic properties (Mehler, Jusczyk, Lambertz, Halsted, Bertoncini & Amiel-Tison, 1988; Jusczyk, Friederici, Wessels, Svenkerud, & Jusczyk, 1993). Half a year later, infants rely heavily on prosodic cues to segment the speech stream into words (Jusczyk, Cutler, & Redanz, 1993; Morgan & Saffran, 1995; Johnson & Jusczyk, 2001). Critically, infants are also sensitive to the coalition of cues that mark the prosodic boundaries between groups of words (Hirsh-Pasek, Kemler Nelson, Jusczyk, Wright-Cassidy, Druss & Kennedy, 1987; Jusczyk, Hirsh-Pasek, Kemler Nelson, Kennedy, Woodward & Piwoz, 1992; Christophe, Mehler & Sebastián-Gallés, 2001). Because the prosody of an utterance depends in part on its syntactic structure, many theorists have suggested that infants might use their extensive experience with prosody to bootstrap their way into syntax (see e.g., Hirsh-Pasek et al., 1987; Morgan, 1996; Christophe, Guasti, Nespor & van Ooyen, 2003). If so, we might expect that prosody would continue to serve as a parsing cue for preschoolers (Morgan, 1996; Choi & Mazuka, 2003). However, others have pointed out that the relation between prosody and syntax may be too variable or weak to support a prosodic comprehension strategy (Gerken, 1996; Fernald & McRoberts, 1996).

Recent research favors the skeptics; several studies have found little or no effect of prosody on children’s interpretation of structurally ambiguous sentences (Halbert, Crain, Shankweiler & Woodams, 1995; Snedeker & Trueswell, 2001; Vogel & Raimy, 2002; Choi & Mazuka, 2003).¹ For example, Choi and Mazuka tested Korean-speaking children on two kinds of ambiguous utterances: word-segmentation ambiguities and syntactic grouping ambiguities. Both types of utterances were disambiguated with the same kind of prosodic cues (an intonational phrase boundary marked by a pause and a boundary tone). Three- to four-year-olds performed well on the word-segmentation ambiguities but were at chance on the syntactic ambiguities, while adults performed well in both tasks. Choi and Mazuka concluded that while children can clearly detect prosodic boundary cues, they fail to use them to resolve syntactic ambiguities.

These results are consistent with our own initial exploration of children’s ability to use prosody to resolve syntactic ambiguity (Snedeker & Trueswell, 2001, see Appendix A). In our study, a referential communication task was used to simultaneously explore whether mothers would provide systematic prosodic cues to the structure of ambiguous utterances and whether their children (ages 4–6) would use them in comprehension. The mothers, like the college students in our previous studies, varied their prosody systematically depending on the intended attachment of the prepositional phrase. Their children, however, performed at chance, suggesting that they were unable to use prosody to constrain parsing (see Appendix A).

But two features of this experiment lead us to question our results, and by extension, the prior findings. First, all of the children in our study showed a systematic preference for a single response type—some preferred the instrument analysis, others the modifier analysis, but they were all quite consistent. In our study prosody was manipulated within each subject and was not blocked. In such a design, a strong tendency to perseverate across trials could easily wipe out a small or fragile effect of prosody. Second, in this initial study the lexical biases of the target sentences were not systematically controlled, raising the possibility that strong biases in individual items may have overwhelmed the effects of prosody. To the best of our knowledge these two features were shared by the other studies that have failed to find an effect of prosody on children’s interpretation of globally ambiguous utterances (Halbert et al., 1995; Vogel & Raimy, 2002; Choi & Mazuka, 2003). Experiment 1 examines children’s use of prosody when these two factors are controlled.

Prosody in adult sentence processing

In contrast with children, adults have a robust ability to use prosodic cues to interpret globally ambiguous utterances (Lehiste, 1973; Lehiste, Olive & Streeter, 1976; Cooper & Paccia-Cooper, 1980; Price, Ostendorf, Shattuck-Hufnagel, & Fong, 1991; Schafer, 1997; Carlson, Clifton & Frazier, 2001; see Cutler, Dahan & van Donselaar, 1997, for review). In fact, there is now a substantial body of evidence that prosody has a rapid effect on online sentence processing as well (Marslen-Wilson, Tyler, Warren, Grenier, & Lee, 1992; Nagel, Shapiro, Tuller, & Nawy, 1996; Pynte & Prieur, 1996; Kjelgaard & Speer, 1999; Steinhauer, Alter & Friederici, 1999; Snedeker & Trueswell, 2003; Weber, Grice & Crocker, 2006).

Despite this robust evidence that prosody can affect parsing, there is considerable controversy over the exact role that it plays. One controversy centers on the question of whether the prosodic manipulations used in psycholinguistic experiments are reflective of the prosodic cues provided in natural speech. Naïve readers often fail to prosodically disambiguate globally ambiguous sentences (Allbritton, McKoon & Ratcliff, 1996; Wales & Toner, 1979), raising the possibility that prosodic cues to structure are infrequent and perhaps unreliable. More recent studies using referential communication tasks have been somewhat more optimistic (Snedeker & Trueswell, 2003; Schafer, Speer & Warren, 2005; Kraljic & Brennan, 2005). All of these studies explored the disambiguation of PP-attachment ambiguities. While they differ in their conclusions, the results converge in three respects. First, all three studies find evidence for reliable prosodic disambiguation when the situational context supports both readings of the ambiguous utterance. Second, despite differences in the length and structure of the utterances there is remarkable consistency in the nature of the prosodic cues that the speakers produce. VP-attachments were generally accompanied by a strong prosodic break immediately before the prepositional phrase. These breaks were reduced or absent in NP-attachments. In both the Schafer and Snedeker studies, NP-attachments were often produced with a substantial prosodic break earlier in the utterance, though the location of this break was variable in the Schafer study and its presence was less reliable in the Snedeker study. Finally, all three studies demonstrate that adult listeners can use these prosodic cues to arrive at the correct interpretation of these otherwise ambiguous utterances. In referential contexts in which the utterance is unambiguous, however, the findings of the three studies diverge. Snedeker and Trueswell found that prosodic cues were substantially weaker in unambiguous contexts (perhaps because the speakers failed to notice the structural ambiguity). In contrast, the other two studies found reliable prosodic disambiguation even when the utterance was unambiguous in context. These divergent findings may be attributable to differences in the length of the utterances that were used, their syntactic complexity, the nature of the communication task, and the way in which referential ambiguity was manipulated (for discussion see Snedeker & Trueswell, 2003). Nevertheless, the research to date clearly indicates that in referentially ambiguous contexts, like those in the current study, naïve speakers will produce predictable patterns of prosodic phrasing which effectively disambiguate PP-attachment ambiguities, even when those ambiguities are in short simple utterances, much like those in the current study.

A second controversy centers on the relation between prosodic cues and structural parsing preferences or lexical biases. While some observers suggest that prosody is used in the earliest stages of syntactic analysis (Marslen-Wilson et al., 1992; Kjelgaard & Speer, 1999), others claim that it is used at a later stage (Pynte & Prieur, 1996; Marcus & Hindle, 1990). Only one study has directly explored the interaction of prosodic and lexical information in online parsing (Pynte & Prieur, 1996). In these experiments, a word detection task was used to examine the processing of structurally-ambiguous prepositional phrases. The utterances contained either ditransitive or monotransitive verbs, and verb type was fully crossed with both the disambiguation of the utterance (NP-attached or VP-attached) and the presence or absence of a prosodic break before the preposition. Both prosody and argument-structure preferences affected the ease with which participants interpreted NP-attached and VP-attached prepositional phrases. The prosodic effects, however, occurred only when the argument structure cues conflicted with the resolution of the ambiguity. When the lexical cues were consistent with the disambiguation, the effects of prosody disappeared. This led the authors to suggest that lexical information proposes an analysis, while prosody merely plays a role in revision.

There are two reasons to be cautious in accepting this conclusion. First, the conditions in which lexical cues were consistent with the disambiguation had substantially lower reaction times than the others, suggesting that the interaction may have been caused by a floor effect. Reaction times in a dual task paradigm like this depend both on the difficulty of the comprehension task and on the time it takes to complete the overt task. When syntactic processing is relatively easy, variations in difficulty could be absorbed into the slack introduced by the demands of the word detection task. Second, subsequent studies have found early effects of prosody on both the preferred and dispreferred interpretation of closure ambiguities, suggesting that the role of prosody cannot be limited to revision (Kjelgaard & Speer, 1999).

As this example illustrates, interpreting the literature on prosody and online processing is complicated by the nature of the paradigms that are used. The most common are cross-modal lexical decision, cross-modal naming, and speeded judgment tasks, all of which have been criticized for their poor temporal resolution and artificiality (Carlson et al., 2001). Even those experiments using more naturalistic tasks (e.g. ERPs, Steinhauer et al., 1999) employ designs which provide limited information about the time course of prosodic influence. The experiments to date manipulate the consistency of the prosodic contour with subsequent morphosyntactic information, and then measure effects of prosody at or after the disambiguation point (three to ten syllables after the onset of the ambiguity). While these measures can clearly support inferences about the outcome of processes that occurred in the ambiguous region, they are mute about the temporal structure of those processes.

The current study uses the visual world paradigm to explore the interaction between prosodic and lexical constraints in online processing (Tanenhaus et al., 1995). This technique has its own limitations (most notably, all utterances must refer to depictable objects and events in a tightly constrained reference world), but it has the advantage of providing a fine-grained measure of online interpretation that can be continuously monitored from the onset of the ambiguous region through the completion of the sentence. In earlier work, we employed this technique to examine the use of prosodic cues by listeners in a referential communication task (Snedeker & Trueswell, 2003). Surprisingly, effects of prosody on online interpretation appeared before the onset of the ambiguous prepositional phrase, suggesting that in some circumstances prosody can be used to predict the content of an upcoming phrase.

Experiment 1 explores whether children can use prosodic cues to syntactic structure when lexical biases are controlled and their perseveratory tendencies are harnessed. In Experiment 2 we return to the question of how adult and child listeners combine lexical and prosodic information in online comprehension.

Experiment 1

The initial goal of this experiment was quite modest. We planned on giving prosody one last chance by testing children with a more sensitive measure (eye movements), in a design that would allow us to distinguish between a failure to use prosodic cues and a strong bias to perseverate across trials, and with materials that contained no strong lexical constraints that might compete with prosodic constraints. The critical sentences contained structurally ambiguous with-PPs (“You can feel the frog with the feather”), which could be interpreted as either VP-attached instrument phrases or NP-attached modifiers.

We recorded two versions of each sentence, one with Instrument Prosody (an intonational phrase break after the noun) and one with Modifier Prosody (an intonational phrase break after the verb). Participants received one block of trials in each prosody condition, with the order of the blocks counterbalanced across participants. Thus prosody was manipulated between subjects in the first block, but within subjects across the two blocks. This design was used so we could eliminate possible interference effects in the first block and investigate them in the second block.

Because the sentences used in this study are never definitively disambiguated, we expect continuity between the listeners’ online attachment preferences and their ultimate interpretations. If listeners can use prosody, then the placement of the intonational break should affect whether they interpret the with-phrase as an instrument or a modifier and this preference should be reflected in both their eye movements and their actions.

Methods

Participants

Twenty-four English-speaking children participated in the study. The children were divided into two age groups. The preschool group was approximately the same age as the five-year-olds in Snedeker and Trueswell (4;2 to 5;8, M = 4;10), while the kindergarten group was about a year older (5;10 to 6;7, M = 6;2). Parents were contacted from schools and daycares in the Cambridge area and from a database of children who had participated in research at the Laboratory for Developmental Studies. All the children who began the experiment completed it and were included in the analyses. Half were male.

Procedure

Children were tested individually in a quiet room in our lab or their school. They were told that they were going to play a game about following instructions. During the experiment the child was seated in front of an inclined podium. At the center of the podium was a hole for a camera which was focused on the participant’s face. In each quadrant of the podium was a shelf where one of the props could be placed. At the beginning of each trial, the experimenter laid out the props and introduced each one using indefinite noun phrases. Any object held by a toy animal was introduced separately rather than as part of a complex NP to ensure that we did not prime the modifier analysis of the target sentences. For instance, the objects shown in Figure 1 would have been introduced by saying: “This bag contains a candle, a feather, a leopard, another candle [referring to the miniature one], a frog and another feather [the miniature one].” This procedure ensured that the participant knew the labels for the toys and that subsequent reference to the objects using definite noun phrases (e.g., “the frog”) was felicitous.

Example of the referential context used in the experiments (for the target sentence “You can feel the frog with the feather”).

After each object had been labeled twice, the experimenter played prerecorded sound files from a computer connected to external speakers. The trial began with an instruction to look at a fixation point at the center of the display. This was followed by two commands. The child heard the first command, performed that action, and then heard the second (an unambiguous filler). A camera placed behind the child recorded her actions and the locations of the props, while the camera under the podium recorded her gaze direction. The experimenter moved out of the child’s view before the first sentence began and remained there until the action was completed. If the child refused to respond, the sound file was played again but the eye movements were taken from the initial presentation of the sentence. Children were praised for all responses.

Stimuli

On the critical trials, the first command contained an ambiguous prepositional phrase (8).

(8) You can feel the frog with the feather.

All the ambiguous prepositional phrases were headed by with and could be interpreted as either a modifier of the noun or an instrument (and hence an adjunct of the verb, or perhaps an argument, see Koenig, Mauner & Bienvenue, 2003 for discussion). To increase the probability that prosody would play a decisive role in the interpretation of these utterances, we used sentences in which the with-phrase was equally apt as a modifier or an instrument. In S&T 2004, prosodically neutral versions of these sentences, presented with the same referential contexts, resulted in a mix of instrument and modifier responses (37% and 33% instrument responses for five-year olds and adults respectively). These eight sentences contained verbs that had been found to have no strong bias in a sentence completion study and had prepositional objects (“feather”) which had been rated as moderately plausible instruments for the action in question (see S&T, 2004, Appendices B & C). In this earlier study the root sentences were produced as direct commands rather than indirect commands. In the present study we added the carrier phrase for the purpose of making the disambiguating prosodic break in the Modifier condition more natural, by creating more balanced intonational phrases (contrast “Feel… the frog with the feather” with “You can feel… the frog with the feather”).

As Figure 1 illustrates, the set of toys that accompanied the critical sentences always contained the following objects: 1) a Target Instrument, a full-scale object that could be used to carry out the action (for Figure 1 the large feather); 2) a Target Animal, a stuffed animal carrying a small replica of the Target Instrument (the frog holding a little feather); 3) a Distractor Instrument, a second full-scale object (the candle); and 4) a Distractor Animal, a stuffed animal of a different kind carrying a replica of the Distractor Instrument (the leopard carrying a candle). Contexts with just one potential referent for the direct-object noun (one frog) were used because they allow us to directly compare the performance of children and adults (see Experiment 2). In prior studies manipulations of referential context have not had reliable effects on attachment preferences in five-year olds: in one-referent contexts children have attachment preferences that are similar to adults, but in two-referent contexts children produce fewer modifier responses (S&T, 2004).

Two versions of each sentence were digitally recorded by a female actor, one with Instrument Prosody and one with Modifier Prosody. The utterance with Instrument Prosody had an intonational phrase break after the direct-object noun. The utterance with Modifier Prosody had an intonational phrase break after the verb. This prosody manipulation was modeled on the utterances produced by the mothers in the referential communication task (Appendix A) and by college-aged adults in a parallel experiment (Snedeker & Trueswell, 2003) and is consistent with other production studies using prepositional phrase attachment ambiguities (Schafer, et al., 2005) as well as theoretical descriptions of the relation between prosody and syntax (Watson & Gibson, 2004). In the instrument condition, the presence of a prosodic break before the with-phrase suggests that there is a major syntactic break between the noun and the prepositional phrase, and increases the likelihood that the with-phrase is a VP-attached instrument phrase. In contrast, in the modifier condition the presence of a prosodic break before the noun phrase suggests that there is a heavy constituent after the verb (Ferreira, 1991; Watson & Gibson, 2004), while the absence of a break between the noun and the preposition provides evidence that the prepositional phrase is a part of this constituent, and hence a modifier of the noun. Breaks in these locations have been shown to influence prepositional phrase attachment in adults in both online and offline tasks (Pynte & Prieur, 1996; Snedeker & Trueswell, 2003; Schafer, et al. 2005).

The digital waveforms were examined to verify the phrase break and ensure that there were no other detectable pauses in the utterance. The length of each word was measured and paired t-tests were conducted to verify the differences between the two types of utterances (see Figure 2 and Table 1). Instrument utterances had shorter verbs, shorter post-verbal pauses, longer direct-object nouns, longer post-nominal pauses, and longer prepositions.

Time course for the critical utterances in Experiment 1.

Table 1.

Duration analyses for the stimuli in Experiment 1.

Dependent Variable	Mean for Instrument Prosody	Mean for Modifier Prosody	Analysis
verb length	372 ms (CI_.95 = ± 54 ms)	625 ms (CI_.95 = ± 79 ms)	t(7) = 8.57, p < .001**
verb pause	3 ms (CI_.95 = ± 5 ms)	115 ms (CI_.95 = ± 26 ms)	t(7) = 6.98, p < .001**
direct-object noun	478 ms (CI_.95 = ± 60 ms)	236 ms (CI_.95 = ± 35 ms)	t(7) = 13.86, p < .001**
noun pause	235 ms (CI_.95 = ± 43 ms)	0 ms	t(7) = 9.43, p < .001**
"with"	191 ms (CI_.95 = ± 15 ms)	157 ms (CI_.95 = ± 12 ms)	t(7) = 3.46, p < .01*
prepositional object	455 ms (CI_.95 = ± 92 ms)	473 ms (CI_.95 = ± 58 ms)	t(7) < 1, p > .5


Prosody was manipulated within participants but was blocked. This allowed us to explore whether response perseveration might explain prior failures to find effects of prosody on children’s syntactic parsing. Two counterbalanced presentation lists were constructed. The first half of one list contained sentences with Instrument Prosody while the first half of the other list contained sentences with Modifier Prosody. The critical trials were interspersed with 10 filler trials. Both filler and target trials consisted of two commands and the second command was always an unambiguous filler sentence. Thus, each child heard 28 unambiguous sentences (the first instruction of the 10 filler trials and the second instruction of all 18 trials) and 8 ambiguous ones. Each list was presented in two orders (forward and reverse). The filler sentences contained a variety of constructions but the same fillers were used in all lists and all conditions. They were selected with the goal of not biasing the participants’ response on the target trials. For example, half the filler sentences requested actions involving one object (like the modifier reading), while half requested actions involving two objects (like the instrument reading).

Coding

Trained coders watched the videotape of the participant’s actions and coded them into four separate categories: (1) Instrument Responses: participant used the Target Instrument to perform action on Target Animal; (2) Mini-Instrument Responses: participant used miniature object attached to the Target Animal to perform action on Target Animal; (3) Modifier Responses: participant performed the action on Target Animal without using the target or mini-instrument; (4) Other: participant failed to perform the target action or performed it on the wrong entity. Because Mini-Instrument and Modifier Responses should both lead to exclusive fixation on the Target Animal, these responses would weaken any effect of attachment on eye movements. To minimize this problem we explicitly discouraged participants from manipulating the miniature objects during the demonstration trials and filler trials (no feedback was given after target sentences). As a result, these responses were infrequent in this experiment (4.2% of target trials). Instrument and Mini-Instrument Responses were combined for all analyses.

Eye movements were coded from the videotape of the participant’s face, using frame-by-frame viewing on a digital VCR. The coder noted the onset of the sentence and the onset of each change in gaze, and the direction of the subsequent fixation. The direction of a fixation was coded as being in one of the quadrants, at center, or away from the display. If the subject’s eyes were closed or not visible, the frame was coded as missing and the data were excluded from the analysis (only 1.9% of the coded frames were missing). Twenty-five percent of the trials were checked by a second coder, who was given the list of onset times for the eye movements. The two coders agreed on the direction of fixation for 95.7% of the coded frames. Disagreements were resolved by a third coder. One test trial was excluded from further analysis due to experimental errors. With displays of this kind, this method of collecting and coding eye movements produces data that are comparable to those produced by a head-mounted eye-tracker. For example, S&T 2004 simultaneously collected data using both methods and found that the coded location was identical for 93% of the video frames.
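The frame-level agreement statistic described above can be illustrated with a minimal sketch. The region labels, the `MISSING` marker, and the decision to skip frames that either coder marked as missing are assumptions made for illustration; the text does not specify how such frames entered the agreement calculation.

```python
MISSING = "missing"  # hypothetical marker for frames with closed or hidden eyes


def frame_agreement(coder_a, coder_b):
    """Percentage of frames on which two coders report the same gaze location.

    Frames that either coder marked as missing are skipped (an assumption).
    """
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must code the same number of frames")
    pairs = [(a, b) for a, b in zip(coder_a, coder_b)
             if a != MISSING and b != MISSING]
    if not pairs:
        raise ValueError("no frames were coded by both coders")
    agreed = sum(a == b for a, b in pairs)
    return 100.0 * agreed / len(pairs)


# Invented example: eight frames, one disagreement (frame 5).
first = ["center", "center", "target_animal", "target_animal",
         "target_instrument", "target_instrument", "away", "target_animal"]
second = ["center", "center", "target_animal", "target_animal",
          "target_animal", "target_instrument", "away", "target_animal"]
print(frame_agreement(first, second))  # 87.5
```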

Results and Discussion

The results are divided into four sections below. First, we present the children’s actions in response to the target instructions, analyzing whether an instrument was used to carry out an action. This measure reflects their final interpretation of the ambiguous phrase. Second, we present data on the proportion of trials that included looks to the target instrument. This provides a coarse-grained measure of eye movements that could presumably reflect both early and late interpretive processes. In the third section we analyze how the children’s fixations change over time to explore how prosodic information is used online. Finally, we explore alternate explanations for the fixation patterns.

For all measures, we initially conducted an analysis of variance (ANOVA) on the participant means containing three between-participant factors (Age, List, and Order) and one within-participant factor (Prosody). Equivalent ANOVAs were conducted on item means containing one between-item factor (Item Group) and three within-item factors (Prosody, Age and Order). Because our manipulation of prosody was blocked, we were also able to examine the effects of prosody in a between-subjects design by limiting our analysis to the initial block of trials. In these ANOVAs, Age and Prosody were between-participant and within-item factors.

Actions

Figure 3 plots the proportion of trials in which the participants performed instrument actions, thus indicating that they had interpreted the ambiguous prepositional phrase as a VP-attached argument or adjunct. Table 2 lists the results of the ANOVA for the critical variables. Prosody had a moderate but reliable effect on interpretation; participants performed instrument actions more often when Instrument Prosody was used. The performance of the kindergarteners and preschoolers was similar, resulting in no effect of Age or interaction between Age and Prosody. However, there were interactions between Prosody and Order (F1(1,16) = 6.42, p < .05; F2(1,6) = 6.23, p < .05; minF′(1,17) = 3.16, p = .09), and between Prosody, Order and List (F1(1,16) = 9.80, p < .01), suggesting that the pattern of performance changed over the course of the experiment.
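The minF′ statistics reported with the by-participant (F1) and by-item (F2) analyses are not defined in the text. The sketch below assumes the conventional formula, minF′ = F1·F2 / (F1 + F2), with denominator degrees of freedom (F1 + F2)² / (F1²/n2 + F2²/n1), where n1 and n2 are the error degrees of freedom of F1 and F2. The function name is hypothetical. Under this assumption the computation reproduces the Prosody × Order values given above.

```python
def min_f_prime(f1, df1_error, f2, df2_error):
    """Return (minF', denominator df) from by-participant and by-item F ratios.

    Assumes the conventional minF' formula; inputs are the two F values and
    their error (denominator) degrees of freedom.
    """
    if f1 <= 0 or f2 <= 0:
        raise ValueError("F ratios must be positive")
    value = (f1 * f2) / (f1 + f2)
    denom_df = (f1 + f2) ** 2 / (f1 ** 2 / df2_error + f2 ** 2 / df1_error)
    return value, denom_df


# Prosody x Order interaction: F1(1,16) = 6.42, F2(1,6) = 6.23
value, denom_df = min_f_prime(6.42, 16, 6.23, 6)
print(f"minF'(1,{round(denom_df)}) = {value:.2f}")  # minF'(1,17) = 3.16
```

The numerator degrees of freedom are carried over unchanged from the two input tests (1 in every analysis reported here).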

Action analysis. The proportion of instrument responses from children in Experiment 1 by trial block.

Table 2.

Action analysis for Experiment 1 (children). The dependent variable is the percentage of instrument actions.

	All Blocks	Block 1	Block 2

Mean Instrument Prosody	61% (CI_.95 = ± 10%)	58% (CI_.95 = ± 18%)	65% (CI_.95 = ± 19%)

Mean Modifier Prosody	40% (CI_.95 = ± 10%)	21% (CI_.95 = ± 12%)	58% (CI_.95 = ± 18%)

Prosody	F1(1,16) = 9.80, p < .01*	F1(1,20) = 10.39, p < .005**	F1(1,20) < 1, p > .5
	F2(1,6) = 8.82, p < .05*	F2(1,6) = 7.25, p < .05*	F2(1,6) < 1, p > .5
	minF′(1,17) = 4.64, p< .05*	minF′(1,15) = 4.27, p = .06	minF′(1,18) < 1, p > .5

Age (K or Pre-K)	F1(1,16) < 1, p > .5	F1(1,20) < 1, p > .5	F1(1,20) < 1, p > .5
	F2(1,6) < 1, p > .5	F2(1,6) < 1, p > .5	F2(1,6) < 1, p > .5
	minF′(1,17) < 1, p > .5	minF′(1,18) < 1, p > .5	minF′(1,18) < 1, p > .5

Prosody * Age	F1(1,16) < 1, p > .4	F1(1,20) < 1, p > .5	F1(1,20)= 1.15, p > .25
	F2(1,6) = 2.46, p > .15	F2(1,6) < 1, p > .5	F2(1,6) = 2.54, p > .15
	minF′(1,22) < 1, p > .4	minF′(1,18) < 1, p > .5	minF′(1,25) < 1, p > .3


As Figure 3 illustrates, there was a strong effect of Prosody in the first block of trials but no effect in the second block. A comparison of the two blocks suggests an intriguing asymmetry between the Modifier and Instrument Prosody conditions. Participants who received Modifier Prosody in the first half of the experiment switched to instrument responses when the prosody of the utterances changed (compare the inside bars). But those who started out with Instrument Prosody perseverated after the switch to Modifier Prosody, continuing to produce instrument actions (compare the outside bars). This resulted in a reliable effect of Block for the modifier utterances (F1(1,20) = 10.13, p < .005; F2(1,7) = 23.25, p < .005; minF′(1,27) = 7.06, p < .05) but not the instrument utterances (F’s < 1, all p’s > .5). These findings suggest that the prior failures to find effects of prosody on parsing may be attributable to perseveration across trials.

Coarse Grained Analysis of Fixations

For each trial we determined whether the participant looked at the Target Instrument any time between the onset of the prepositional object and the beginning of their action (or 1.5 seconds after the prepositional object onset, whichever came first). Figure 4 plots the proportion of trials with instrument fixations in each of the conditions, while Table 3 lists the results of the ANOVAs. Participants tended to look at the Target Instrument when they were going to use it to perform the action but seldom fixated on it otherwise. Thus the results for the fixation analysis closely echo those of the action analysis.
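The trial-level measure described above can be sketched as follows. The data layout (a list of `(start_ms, end_ms, region)` fixations timed from sentence onset), the region labels, and the choice to count any fixation that overlaps the window, including one that began before it, are illustrative assumptions rather than details given in the text.

```python
def looked_at_target_instrument(fixations, pp_onset_ms, action_onset_ms,
                                cap_ms=1500):
    """True if any Target Instrument fixation overlaps the scoring window.

    The window runs from the prepositional-object onset to the action onset
    or to cap_ms after the prepositional-object onset, whichever comes first.
    """
    window_end = min(action_onset_ms, pp_onset_ms + cap_ms)
    return any(
        region == "target_instrument"
        and start < window_end
        and end > pp_onset_ms
        for start, end, region in fixations
    )


# Invented trial: prepositional object begins at 1800 ms, action at 3600 ms,
# so the window is 1800-3300 ms (the 1.5 s cap applies).
trial = [
    (0, 1400, "center"),
    (1400, 2300, "target_animal"),
    (2300, 2900, "target_instrument"),
    (2900, 4000, "target_animal"),
]
print(looked_at_target_instrument(trial, 1800, 3600))  # True
```

Averaging this Boolean over a participant's trials in a condition gives the proportion of trials with instrument fixations.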

Table 3.

Coarse grained analysis of fixations for Experiment 1 (children). The dependent variable is the proportion of trials with looks to the Target Instrument.

	All Blocks	Block 1	Block 2

Prosody	F1(1,16) = 12.05, p < .005**	F1(1,20) = 6.33, p < .05*	F1(1,20) < 1, p > .5
	F2(1,6) = 5.23, p = .06	F2(1,6) = 3.04, p = .125	F2(1,6) < 1, p > .3
	minF′(1,12) = 3.65, p = .08	minF′(1,12) = 2.05, p = .18	minF′(1,18) < 1, p > .5

Age (K vs. Pre-K)	F1(1,16) < 1, p > .5	F1(1,20) < 1, p > .5	F1(1,20) < 1, p > .3
	F2(1,6) < 1, p > .3	F2(1,6) < 1, p > .5	F2(1,6) = 1.13, p > .3
	minF′(1,17) < 1, p > .5	minF′(1,18) < 1, p > .5	minF′(1,20) < 1, p > .4

Prosody * Age	F1(1,16) = 2.15, p > .15	F1(1,20) < 1, p > .5	F1(1,20)< 1, p > .5
	F2(1,6) < 1, p > .4	F2(1,6) < 1, p > .5	F2(1,6) < 1, p > .3
	minF′(1,17) < 1, p > .5	minF′(1,18) < 1, p > .5	minF′(1,18) < 1, p > .5


In the analysis of both blocks, the performance of the kindergarteners and preschoolers was similar, resulting in no effect of Age or interaction between Age and Prosody. Prosody had a modest effect on Target Instrument fixations. But once again there were interactions between Prosody and Order (F1(1,16) = 18.01, p < .001; F2(1,6) = 6.59, p < .05; minF′(1,11) = 4.82, p = .05), and Prosody, Order and List (F1(1,16) = 10.01, p < .01), suggesting that the effect of Prosody changed over the course of the experiment.

As Figure 4 illustrates, there was a strong effect of Prosody in the first block of trials but no effect in the second block. As in the action analysis, this reflected a difference between the Modifier and Instrument conditions. Participants who received Instrument Prosody in the first half of the experiment persisted in looking at the Target Instrument after the switch to Modifier Prosody in the second block. This resulted in a reliable effect of Block for the modifier utterances (F1(1,22) = 5.51, p < .05; F2(1,7) = 19.70, p < .005; minF′(1,29) = 4.30, p < .05) but not the instrument utterances (F’s < 1, all p’s > .5).

The perseveration of instrument responses and looks indicates that some representation or process is being primed by the child’s experiences earlier in the study. But we cannot determine the level at which this priming occurs: the effect could be mediated by syntactic priming of the VP-attached prepositional phrase, priming of a semantic category like instrument, or priming of an action plan that incorporates both an animal and an object (see Thothathiri & Snedeker, in press for evidence of structural priming in preschoolers’ spoken language comprehension). The lack of perseveration when participants switch from modifier to instrument utterances suggests either that the complementary category (complex noun-phrase, modifier, or action on a single object) is less readily primed or that the prosodic cues for VP-attachment are more potent than cues for NP-attachment and thus more apt to override the effects of perseveration. This apparent asymmetry is explored further in Experiment 2. However, since our primary interest is in children’s ability to use prosodic cues in online comprehension, and not in the nature of this perseveration, subsequent figures and analyses will focus on the data from the first block of trials.

The dashed line in Figure 4 indicates the proportion of trials during which participants looked to the Distractor Instrument. This object was not mentioned in the sentence and thus provides a rough baseline for the Target Instrument looks. During the first block of trials, participants who received Modifier Prosody were no more likely to look at the Target Instrument than they were to look at the Distractor Instrument (F’s < 1, all p’s > .5). Thus this analysis provides no evidence that participants in the modifier condition are initially considering the VP attachment of the ambiguous prepositional phrase. In contrast those who received Instrument Prosody were far more likely to look at the Target Instrument (F1(1,11) = 21.62, p < .001; F2(1,7) = 12.78, p < .01; minF′(1,14) = 8.03, p < .05).

Temporal Analysis of Eye-Movements

To explore the relation between the unfolding utterances and the participant’s evolving interpretation, we analyzed how the distribution of eye movements changes over time (see Figures 5a & 5b). In each figure, time is displayed along the x-axis in increments of 1/30th of a second (equivalent to a single video frame). Time is measured relative to the onset of the object of the preposition (e.g., “feather” in “You can feel the frog with the feather”). The lines represent the proportion of fixations to each of the four types of objects that the subject could look at: the Target Animal, the Distractor Animal, the Target Instrument, and the Distractor Instrument. We expected that prosody would affect the proportion of fixations to the Target Instrument, since this object is the referent of the prepositional object if and only if the phrase is VP-attached. In contrast we expected to see little or no difference in looking time to the Target Animal, since this object is the referent of the direct-object noun phrase regardless of how the ambiguity is resolved.

Fixation probabilities relative to the onset of the prepositional object for children in Experiment 1.

There are two obvious differences between the instrument and modifier conditions (Figures 5a and 5b). First, participants in the modifier condition shift their gaze from the fixation point to the Target Animal in the 300 ms preceding the PP-Object Onset. In contrast, those in the instrument condition are already looking at the Target Animal prior to this time. This reflects necessary differences in the timing of the two types of utterances. Instrument utterances contain elongated direct-object nouns followed by substantial pauses, giving participants plenty of time to identify this first noun and shift their gaze before the onset of the second noun (M = 998 ms, from the onset of the direct-object noun to the PP-Object Onset). In the modifier utterances, however, there is relatively little time between the two nouns (M = 477 ms from onset to onset). Second and more critically, there is also an increase in looks to the Target Instrument in the instrument condition beginning roughly 300 ms after the PP-Object Onset and plateauing about 900 ms later. In the modifier condition, in contrast, there are few looks to the Target Instrument during this time period.

To determine when the prosody manipulation began to influence the children’s eye movements, we analyzed Target Instrument fixations in two 500 ms time windows following the PP-Object Onset. Previous research demonstrates that lexical information begins to influence eye movements about 200ms after word onset (Allopenna, Magnuson & Tanenhaus, 1998), so we began our Early PP-Object window at 200ms after the onset of the prepositional object. The Late PP-Object window began 700ms after the prepositional object onset. The results of these analyses are presented in Table 4. The effect of prosody was not reliable for the Early PP-Object window, despite a trend towards a greater proportion of Target Instrument looks in the instrument condition. In the Late PP-Object window, however, there was a significant difference between the two types of utterances. Participants looked at the Target Instrument more when the sentence occurred with Instrument Prosody than when it occurred with Modifier Prosody. There was no reliable effect of age group in either time window and no reliable interaction between age and prosody.
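The two analysis windows can be expressed in video frames. The sketch below assumes 30 frames per second (one frame per 1/30 s, as in the time-course figures), frame 0 aligned to the prepositional-object onset, and a hypothetical per-trial mapping from frame index to coded region, with `None` standing for a missing frame that is dropped from the denominator. The window boundaries (200 to 700 ms and 700 to 1200 ms) follow from the two 500 ms windows described above.

```python
FPS = 30  # one video frame = 1/30 s

EARLY_WINDOW_MS = (200, 700)   # Early PP-Object window
LATE_WINDOW_MS = (700, 1200)   # Late PP-Object window


def window_frames(start_ms, end_ms, fps=FPS):
    """Frame indices (relative to PP-object onset) covering [start_ms, end_ms)."""
    return range(round(start_ms * fps / 1000), round(end_ms * fps / 1000))


def proportion_looking(frames, window_ms, region="target_instrument"):
    """Proportion of non-missing frames in the window coded as `region`.

    `frames` maps frame index -> region label, or None for a missing frame.
    Returns None when the window contains no usable frames.
    """
    codes = [frames.get(i) for i in window_frames(*window_ms)]
    valid = [code for code in codes if code is not None]
    if not valid:
        return None
    return sum(code == region for code in valid) / len(valid)


# Invented trial: Target Animal for frames 0-14, Target Instrument for 15-35.
example = {i: "target_animal" for i in range(0, 15)}
example.update({i: "target_instrument" for i in range(15, 36)})

print(proportion_looking(example, EARLY_WINDOW_MS))  # 6 of 15 frames
print(proportion_looking(example, LATE_WINDOW_MS))   # 15 of 15 frames
```

Each window spans 15 frames at this frame rate, so a single frame contributes roughly 7% to a trial's proportion.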

Table 4.

Temporal analyses of fixations for Experiment 1 (children). The dependent variable is the proportion of looking time to the Target Instrument.

	Early PP-Object	Late PP-Object

Mean Inst Prosody	19% (CI_.95 = ± 15%)	29% (CI_.95 = ± 16%)

Mean Mod Prosody	6% (CI_.95 = ± 8%)	8% (CI_.95 = ± 6%)

Prosody	F1(1,20) = 2.22, p = .15	F1(1,20) = 5.02, p < .05*
	F2(1,6) = 2.46, p = .17	F2(1,6) = 7.56, p < .05*
	minF′(1,19) = 1.17, p > .25	minF′(1,22) = 3.02, p = .10

Age (K or Pre-K)	F1(1,20) < 1, p > .4	F1(1,20) < 1, p > .4
	F2(1,6) < 1, p > .4	F2(1,6) < 1, p > .4
	minF′(1,18) < 1, p > .5	minF′(1,18) < 1, p > .5

Prosody * Age	F1(1,20) < 1, p > .4	F1(1,20) < 1, p > .4
	F2(1,6) = 4.38, p = .081	F2(1,6) < 1, p > .4
	minF′(1,26) < 1, p > .3	minF′(1,18) < 1, p > .5


Are These Effects of Prosody or Side-Effects of Time?

In this experiment, we manipulated the prosody of the utterance by having the speaker shift the placement of an intonational phrase boundary. This resulted in systematic changes in the lengths of words and placement of pauses (see Table 1 and Figure 2). Consequently, in our study, as in previous experiments, the effects of prosody could be attributable to differences in the timing of words. There are two ways in which this might occur. First, differences in time could be part of the mechanism by which prosodic variation influences parsing. This hypothesis is consistent with our data and will be explored further in the General Discussion. The second possibility is more worrisome. Perhaps the effects of our prosodic manipulation are attributable to effects of time on our dependent measures that are independent of parsing or prosodic structure.

This alternative provides a prima facie explanation for the effects of prosody on the temporal analysis of Target Instrument looks. In this study, as in prior studies, the children tended to look at each object as it was mentioned, beginning with the direct-object noun (frog). In the Modifier Prosody condition, the prepositional object rapidly follows the direct-object noun, while in the Instrument condition there is a substantial pause between these words. Thus even if participants had similar parsing preferences for the two types of sentences, we might expect decreased looking to the Target Instrument in the Modifier condition immediately after the PP-Object Onset, simply because participants might still be planning and executing their initial looks to the referent of the direct-object noun (the Target Animal). While this account cannot explain the effects of prosody on the participants’ actions, it does raise questions about the interpretation of our online measures.

To explore this possibility we conducted an additional analysis on the trials in which subjects had already shifted their gaze to the Target Animal before encountering the ambiguous prepositional object. If the effects of prosody on instrument fixations in the previous analysis were simply artifacts of delays in looks to the Target Animal, then we should see the effects disappear or diminish in this analysis. If, however, the differences in instrument fixations reflect the influence of prosody on the interpretation of the ambiguous prepositional phrase, then these effects should persist in this subset of trials. We selected all Block 1 trials on which the participant was gazing at the Target Animal 100ms before the onset of the prepositional object (or 300ms before our first analysis window, the time at which we would expect to see shifts related to the prepositional object). We calculated the proportion of Target Instrument fixations during the Early and Late PP-Object Time Windows and conducted one-way ANOVAs with Prosody as a between-subjects and within-item variable. These analyses, presented in Table 5, closely parallel the analyses for the full data set (Table 4). Once again, there was no reliable effect of prosody during the Early PP-Object window. During the Late PP-Object window participants in the instrument condition spent more than twice as much time looking at the Target Instrument as those in the modifier condition. However, the effect of prosody is only marginal in both the subject and item analyses. This is a divergence from the primary analysis, where this effect reached the conventional significance level (see Table 4). Since the means for each condition are quite similar in the two analyses, this difference may reflect an increase in variability in the data due to the reduction in the number of trials used in this comparison (59 of the original 94).
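The trial-selection step for this control analysis can be sketched as a simple filter. The trial records, field names, and region labels below are hypothetical; the only details taken from the text are the criterion (gaze on the Target Animal 100 ms before the prepositional-object onset) and the 1/30 s frame duration, which places that moment three frames before onset.

```python
FPS = 30  # one video frame = 1/30 s


def on_target_animal_before_onset(frames, lead_ms=100, fps=FPS):
    """True if the frame `lead_ms` before PP-object onset is on the Target Animal.

    `frames` maps frame index (0 = PP-object onset) -> coded region.
    """
    index = round(-lead_ms * fps / 1000)  # -3 for 100 ms at 30 fps
    return frames.get(index) == "target_animal"


# Invented Block 1 trials (only the frame at -100 ms is shown).
block1_trials = [
    {"id": "t01", "prosody": "instrument", "frames": {-3: "target_animal"}},
    {"id": "t02", "prosody": "modifier", "frames": {-3: "center"}},
    {"id": "t03", "prosody": "modifier", "frames": {-3: "target_animal"}},
]

subset = [t for t in block1_trials
          if on_target_animal_before_onset(t["frames"])]
print([t["id"] for t in subset])  # ['t01', 't03']
```

The window proportions for the Early and Late PP-Object windows would then be computed on `subset` alone, which is why fewer trials contribute to this comparison than to the primary analysis.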

Table 5.

Analysis of trials in which participants were fixating on the Target Animal prior to the prepositional object onset for Experiment 1 (children). The dependent variable is the proportion of looking time to the Target Instrument.

	Early PP-Object	Late PP-Object

Mean Inst Prosody	18%, CI_.95 = ± 15%	26%, CI_.95 = ± 15%

Mean Mod Prosody	6%, CI_.95 = ± 12%	10%, CI_.95 = ± 9%

Prosody	F1(1,21) = 1.57, p > .2	F1(1,21) = 3.27, p = .085
	F2(1,6) = 1.22, p > .3	F2(1,6) = 4.27, p = .084
	minF′(1,19) = 1.17, p > .25	minF′(1,21) = 1.85, p = .19
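For readers unfamiliar with the minF′ statistic reported in the table, the following is a minimal sketch of the standard computation (Clark, 1973), which combines the by-subjects (F1) and by-items (F2) tests. The function and argument names are illustrative only, and the inputs in the example are the rounded Late PP-Object values from the table, not the underlying data.

```python
def min_f_prime(f1, df1_error, f2, df2_error):
    """Compute minF' and its denominator degrees of freedom.

    f1, df1_error: by-subjects F ratio and its error (denominator) df
    f2, df2_error: by-items F ratio and its error (denominator) df
    """
    min_f = (f1 * f2) / (f1 + f2)
    # Denominator df: (F1 + F2)^2 / (F1^2 / n2 + F2^2 / n1),
    # where n1 and n2 are the error df of F1 and F2 respectively.
    df_denominator = (f1 + f2) ** 2 / (f1 ** 2 / df2_error + f2 ** 2 / df1_error)
    return min_f, df_denominator

# Late PP-Object window: F1(1,21) = 3.27, F2(1,6) = 4.27
min_f, df = min_f_prime(3.27, 21, 4.27, 6)
print(round(min_f, 2))  # 1.85
```

Because minF′ can never exceed the smaller of F1 and F2, it is a conservative test, which is why an effect that is marginal by subjects and by items can fall further from significance on this measure.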


Summary of Experiment 1

Experiment 1 demonstrates that prosody has an effect on children’s interpretation of an ambiguous prepositional-phrase attachment. This effect was apparent both in their actions and in the proportion of trials on which they looked at the Target Instrument. When children heard Instrument Prosody there was a reliable increase in eye movements to the Target Instrument about 700ms after the onset of the prepositional object. While this effect could be an artifact of differences in timing and their influence on Target Animal looking time, additional analyses suggest that the difference persists even when participants have already looked at the Target Animal before the onset of the critical prepositional object.

However, this study also suggests that the effects of prosody are somewhat fragile; they disappear in the second block of trials, swamped by the perseveration of instrument actions. These results raise questions about the relative contribution of prosodic and lexical cues to online sentence processing in children and adults. In a parallel study, S&T 2004 found that lexical biases had a robust influence on the interpretation of prosodically neutral prepositional-phrase attachments. In the present study we found that prosody shaped the interpretation of lexically neutral sentences. In Experiment 2 we explore how these cues interact by simultaneously manipulating intonation and lexical biases in children and adults.

This allows us to do three things. First, we can directly compare the relative strength of prosodic and lexical cues to prepositional phrase attachment ambiguity when both information sources are available. Second, manipulating both cues will allow us to explore the time course of prosodic and lexical influences. Third, by examining the full paradigm in both children and adults we can explore whether the relative influence of prosodic and lexical information changes across a period of development in which executive functions blossom, processing speed increases across a variety of tasks, and reading becomes a primary source of linguistic input.

Experiment 2

Experiment 2 examines the effects of prosody and lexical bias in children’s and adults’ online interpretation of ambiguous prepositional-phrase attachments. While there is ample evidence that lexical biases and prosodic phrasing rapidly influence adult parsing, there is little work exploring how these information sources interact over time. Verb Bias was manipulated between subjects. Prosody was blocked so that it could be analyzed as a between-subjects variable in the first block but as a within-subjects variable across the two blocks.