THE BACON not the bacon: How children and adults understand accented and unaccented noun phrases

Jennifer E Arnold

doi:10.1016/j.cognition.2008.01.001

. Author manuscript; available in PMC: 2009 Jul 1.

Published in final edited form as: Cognition. 2008 Mar 20;108(1):69–99. doi: 10.1016/j.cognition.2008.01.001

THE BACON not the bacon: How children and adults understand accented and unaccented noun phrases

Jennifer E Arnold ¹

PMCID: PMC2518042 NIHMSID: NIHMS55201 PMID: 18358460

Abstract

Two eye-tracking experiments examine whether adults and 4 and 5 year old children use the presence or absence of accenting to guide their interpretation of noun phrases (e.g., the bacon) with respect to the discourse context. Unaccented nouns tend to refer to contextually accessible referents, while accented variants tend to be used for less accessible entities. Experiment 1 confirms that accenting is informative for adults, who show a bias toward previously-mentioned objects beginning 300 msec after the onset of unaccented nouns and pronouns. But contrary to findings in the literature, accented words produced no observable bias. In Experiment 2, 4 and 5 year olds were also biased toward previously-mentioned objects with unaccented nouns and pronouns. This builds on findings of limits on children’s on-line reference comprehension (Arnold, Brown-Schmidt, & Trueswell, in press), showing that children’s interpretation of unaccented nouns and pronouns is constrained in contexts with one single highly accessible object.

Learning to understand language involves more than just words and grammatical rules. Children must learn to interpret words and sentences by connecting them with the preceding discourse and the larger context – and to do so very rapidly, as each word and sentence comes at them. This study investigates young children’s ability to generate on-line hypotheses about the referent of expressions like the bagel, with a focus on understanding whether children utilize the presence or absence of an accent to guide these hypotheses. Unaccented words tend to refer to information that is highly accessible in the discourse, while accented words tend to refer to less accessible information (e.g., Venditti & Hirschberg, 2003). Research has shown that adults are highly sensitive to this information, and use it rapidly to guide their interpretation of the nominal referring expression (Dahan, Tanenhaus, & Chambers, 2002). It is not known how accenting is used by children during reference comprehension. Furthermore, what is known about reference comprehension in children presents conflicting information about their ability to integrate linguistic referring expressions with the discourse context.

Reference comprehension and interpretation of accents: Adults

When adults interpret spoken referential expressions, they rapidly utilize detailed information about the linguistic expression to identify the most likely referent, This process is embedded in discourse processing mechanisms whereby adults maintain a mental representation of the entities in the current discourse situation (e.g., van Dijk & Kintsch, 1983; Kintsch, 1988; Johnson-Laird, 1983; Bransford, Barclay & Franks, 1972, Bower & Morrow, 1990; Sanford & Garrod 1981; Zwaan & Radvansky 1998). Those entities that are more central, or salient to the situation are represented as more cognitively accessible (see Arnold, in press; Gundel, Hedberg, & Zacharski, 1993; Sanford & Garrod, 1981), possibly by means of greater activation in the mental model of the discourse (Arnold, 1998). Modulations in referent accessibility have been explained in terms of how people allocate attention differentially to discourse characters (e.g., Arnold & Lao, 2007; Morrow & Bower, 1990; see also Foraker & McElree, 2007). Those entities that attract the listeners’ attentional resources are often termed “in focus”. However, this term should not be confused with the linguistic term focus (as opposed to topic/theme).

Referent accessibility is a critical factor guiding reference interpretation. It is easier to resolve expressions when the referent is contextually accessible, in particular when the expression is lexically or acoustically attenuated. For example, pronominal expressions are initially assumed to refer to the most accessible entity in the discourse that matches their features (e.g., Arnold, Eisenband, Brown-Schmidt, & Trueswell, 2000; Gordon, Grosz, & Gilliom, 1993; Grosz, Joshi, & Weinstein, 1995).

There are a variety of discourse and non-discourse factors that influence the accessibility of discourse entities, but in general listeners focus on things that have been mentioned recently, in particular those mentioned in prominent syntactic or thematic positions (see Arnold, 1998; for a review); acoustic prominence has also been argued to increase the activation of entity representations in the discourse model for subsequent reference (Foraker, Nusbaum, & Schoeneman, 2007). One well-established tendency is for adults to perceive the entity appearing in first-mentioned or subject position as more accessible (e.g., Gernsbacher & Hargreaves, 1989; Kaiser & Trueswell, 2007). For example, Arnold et al. (2000) monitored participants’ eye movements as they viewed a picture and decided if it matched a story, e.g. Donald is bringing some mail to Minnie…. She’s carrying an umbrella…. In situations like this example, where the pronoun only matched one character’s gender, adults began looking at the referent of the pronoun around 200 msec after the pronoun’s offset, indicating a rapid use of gender information to interpret the pronoun. In another condition, Minnie was replaced with Mickey, which required listeners to use information from the discourse context to infer which character was more prominent in the story, and assign the pronoun to that character. Adults looked at the target character just as quickly as in the gender-disambiguated case, but only when it referred to the first-mentioned/subject character from the context sentence (Arnold et al., 2000). This first-mentioned/subject bias is a robust finding with adults (Gernsbacher, 1989; Gordon, et al., 1993; Järvikivi, van Gompel, & Hyönä, 2005; Kaiser & Trueswell, 2007; see Arnold, 1998, for a review).

A similar bias occurs when adults interpret unaccented nominal referring expressions. Spoken words can be pronounced with or without a pitch accent, which is a phonological feature that signals prominence, usually with pitch movement and a local pitch maximum or minimum; in English accents also correlate with longer durations and greater acoustic intensity (e.g., Ladd, 1996). Although the location of accents in an utterance is heavily determined by the linguistic focus structure of the sentence, it also correlates with discourse status. It is frequently claimed that accented words refer to things that are new to the discourse, whereas unaccented words are for given (previously mentioned) information (e.g., Brown, 1983; Chafe, 1987). However, recent evidence suggests that a more precise characterization is that accenting occurs with relatively inaccessible referents, both given and new, and unaccented forms are reserved for highly accessible referents (Hirschberg, 1993; Hirschberg & Pierrehumbert, 1986; for a review see Venditti & Hirschberg, 2003). For example, unaccented variants tend to occur when the referent was mentioned in the previous clause in a parallel syntactic position to the current referring expression (Terken & Hirschberg, 1994). The effect of discourse status on the acoustic properties of a word is not limited to accenting, per se. Even accented words are systematically acoustically attenuated when they refer to entities that have been previously mentioned (Bard & Aylett, 1999, see also Fowler & Housum, 1987; Bard et al., 2000; Bard & Aylett, 2004), especially those in salient discourse positions (Watson & Arnold, 2005).

The above patterns mean that accenting and acoustic prominence could signal the listener about the discourse status of the referent – an unaccented and attenuated expression is likely to have a highly accessible referent, while an accented expression is likely to refer to something less accessible, or discourse new.¹ Thus, unaccented variants can direct the listener to look for the referent in the discourse model, whereas accented tokens may suggest the construction of a new discourse representation.

There is substantial evidence that adults do use accenting and acoustic prominence during reference comprehension. Terken & Nooteboom (1987) reported faster comprehension for unaccented words with given (previously-mentioned) referents, and for accented words with new referents. Similarly, listeners in Bock & Mazzella’s (1983) study understood sentences faster when the new information was accented, and the given information was unaccented. In both of these studies, the preference for unaccented tokens occurred when the referent had been mentioned in a parallel syntactic position as the referring expression, for both subject and nonsubject positions. Listeners can also have more specific interpretations for different kinds of pitch accents, as has been found for German (Baumann & Hadelich, 2003; Baumann & Grice, 2004, in press.).

Moreover, adults can use accenting information extremely rapidly. Dahan, et al. (2002) monitored participants’ eye movements using a visual world paradigm (Tanenhaus et al., 1995). In their experiment 1, participants saw displays similar to Figure 1, and followed instructions like Put the candle/candy below the triangle. Now put the CANDLE/candle above the square. The target instruction was the second one, in which the theme noun was either accented or unaccented. The expression was either anaphoric, in which case it referred to the object previously-mentioned in the highly salient position of theme in the first utterance, or it was nonanaphoric, referring to a previously unmentioned entity.

Dahan et al.’s (2002) study capitalized on the fact that objects like candy/candle and bacon/bagel have names termed cohort competitors, which are words that overlap at their onset, creating a temporary ambiguity (Marslen-Wilson, 1987). This causes listeners to fixate the competitor objects at the onset of the target word on some proportion of the trials (e.g., Allopenna et al., 1998; Tanenhaus et al., 1995). Critically, the level of competition depended on the accenting of the expression and the discourse status of the referent: If the competitor had been previously mentioned, participants looked at it more often when the expression was unaccented, and if the competitor was new, participants looked at it more often in the accented condition. This difference began to emerge around 300 msec after the onset of the referring expression, revealing that adults detected the presence or absence of an accent extremely rapidly, and used it to guide their first hypotheses about the word’s referent.

It is important to note that the interpretation of accented and unaccented expressions never occurs in a vacuum. Variation in accenting necessarily co-occurs with other prosodic changes to an utterance, since accenting is realized partially in comparison with other nearby elements. For example, in Dahan et al’s (2002) study the unaccented condition always used a prominently accented preposition (Now put the candle ABOVE the square). Thus, the effects of accenting (or lack of it) may be in fact the result of a combination of acoustic features in an utterance. Likewise, Birch and Clifton (1995) demonstrate that listeners consider the entire phrase when assessing the meaningful interpretation of accenting.

Reference comprehension in young children

In contrast with adults, the literature offers mixed evidence about whether preschoolers use the discourse context to guide their initial interpretation of referring expressions. There are no prior studies of how accenting guides children’s reference comprehension. The most relevant information about preschoolers’ reference comprehension skills comes from pronoun studies, which provide mixed findings about children’s sensitivity to the discourse context.

Arnold, et al. (2007, experiment 2) examined how children interpret pronouns on-line, examining their earliest hypotheses about the pronoun referent. In the same task as used by Arnold et al. (2000), the eye movements of 4 and 5-year-olds were monitored as they viewed a picture and listened to stories (e.g., Donald is bringing some mail to Mickey…He…). When the pronoun was disambiguated by gender, children identified the referent just as quickly as adults. However, they were not systematically biased toward the first-mentioned character in same-gender items. Results from an offline task with 3 to 5 year olds (experiment 1) were consistent with the online results, revealing no tendency to associate the pronoun with the first-mentioned character.

These results initially appear to be at odds with Song and Fisher’s (2005) findings. In their eyetracking experiments, 3-year-olds viewed pictures and listened to stories. The context segment of the stories were longer, and established a clear discourse topic by mentioning the first-mentioned character more than once and, in most experiments, by pronominalizing reference to the discourse topic prior to the critical reference, e.g.: Meet the crocodile and the toad. The crocodile went on vacation with the toad. And she swam in the sea with the toad. This was followed by a target sentence, She/The crocodile walked along the beach with the toad. Their participants showed a bias to look at a picture in which the pronoun referred to the first-mentioned character, although this bias did not show up until a full second after pronoun onset. Similar findings were reported by Pyykkönen, Matthews, and Järvikivi (2007), whose 3-year-old subjects showed evidence of a subject/first-mention bias but not until 2 seconds after pronoun onset.

One interpretation for these contrasting results builds on the observation that discourse accessibility varies along a continuum. In Song and Fisher’s (2005) study, multiple mechanisms clearly established one character as highly accessible, for example repeated mention, first mention, and pronominalization. In this situation, children linked the pronoun to the more accessible character (although not as quickly as adults). By contrast, Arnold et al. (2007) manipulated accessibility through a simple order-of-mention contrast. Although this is a robust cue for adults, it is a probabilistic cue and by itself did not guide children’s initial interpretations (for further discussion, see Arnold et al., in press).

Thus, the literature on children’s pronoun comprehension suggests that children have some ability to interpret referential expressions with respect to the discourse context. At the same time, their ability to do so is limited to situations where a single character is clearly and redundantly marked as the most accessible one, and they may not be able to use accessibility to guide their very earliest interpretations.

Does accenting guide preschoolers’ reference interpretation?

The current study seeks to extend our understanding of preschoolers’ moment-by-moment processes of interpreting spoken referential expressions, by investigating their understanding of accented and unaccented referential expressions. There is no currently available data about children’s use of accenting during reference comprehension. The only evidence of children’s sensitivity to the relationship between accenting and discourse status comes from production studies, which show that English-speaking preschoolers produce adult-like accenting in their own speech, preferring accented tokens for new referents, and unaccented ones for given referents (Wieman, 1976; Hornby & Hass, 1970; MacWhinney & Bates, 1978). However, these findings do not mean that children can also use accenting during comprehension. If children produce unaccented variants for accessible referents because of production-internal facilitation (cf claims by Bard et al., 2000), they may not know that other speakers use accenting systematically as well.

The literature on children’s pronoun comprehension suggests that children would stand the best chance of utilizing accenting if the discourse situation established a clear distinction in accessibility. The following study therefore examines accenting in a context where one candidate referent is previously-mentioned and highly accessible, and the other is new (unmentioned). Previously mentioned referents are usually more accessible than unmentioned ones, in that discourse participants can presume such information to be known and accessible to all other discourse participants (Clark & Marshall, 1981). By contrast, there is less information about whether one’s interlocutors are focusing their attention on unmentioned objects, even if they are visible in the discourse context. The performance of 4 and 5 year old children is examined, given the contrasting predictions for this age group that emerge from the literature on pronoun comprehension.

The following two experiments examined adults’ and children’s use of accenting during on-line reference comprehension, using the same experimental design as Dahan et al. (2002, experiment 1). Participants viewed a display with four objects, two of which had names that were cohort competitors (e.g., bagel/bacon), and followed instructions like in Table 1. The context sentence (e.g., Put the bacon on the star) established a clear contrast in accessibility: the bacon was previously mentioned and in the highly accessible theme position. Also, since the theme of the first instruction is the only object manipulated, we know with relative certainty that the participant is focusing on it at the onset of the second instruction. The bagel, by contrast, is unmentioned and therefore far less accessible than the bacon. If accenting guides on-line comprehension as in Dahan et al.’s experiment, unaccented expressions should result in faster target looks in the anaphoric (given target) condition, and accented expressions should result in faster target looks in the nonanaphoric (new target) condition. As a control condition, we also examined the comprehension of pronominal instructions, Now put it…. Experiment 1 establishes adult performance in this task, and Experiment 2 investigates performance on the same task by 4 and 5 year old children.

Table 1.

Example auditory instructions for Experiments 1 and 2. Capitalization indicates the critical accenting manipulation. Accents also fell on the theme and destination in the context sentence, and on the words Now and the destination shape in the second sentence.

	Instructions
Nonanaphoric, Accented	Put the bacon on the star. Now put the BAGEL on the square.
Nonanaphoric, Unaccented	Put the bacon on the star. Now put the bagel on the square.
Anaphoric, Accented	Put the bacon on the star. Now put the BACON on the square.
Anaphoric, Unaccented	Put the bacon on the star. Now put the bacon on the square.
Anaphoric, Pronominal	Put the bacon on the star. Now put it on the square.

Open in a new tab

NOTE: In Experiment 1 accent on Now was manipulated as a third variable, but only the

accented-Now conditions are reported here, for direct comparison with Experiment 2.

The experiment used cohort competitors to establish a temporary ambiguity, providing an ideal method for identifying listeners’ earliest hypotheses about the referent. If listeners can identify the word as accented or unaccented during the first syllable of the word, they may integrate the accenting with the temporarily ambiguous input. The first looks after the onset of the target word thus indicate listeners’ biases during reference interpretation.

Predictions

Adults prefer to interpret both pronouns and unaccented nouns as coreferential with highly accessible information. This predicts that children’s interpretation of unaccented noun phrases should use similar mechanisms as their interpretation of pronouns. If the above interpretation of the pronoun literature is right, then children should be able to link unaccented expressions with the more accessible referent, if there is only one highly accessible entity in the context. Furthermore, on some views of language development, the order in which children acquire processing skills is related to the amount of information available in the input, where stronger patterns, with more substantial and reliable evidence, are learned earlier (Arnold et al., 2007; Trueswell & Gleitman, 2004). This would predict an early use of accenting patterns for on-line comprehension if these patterns are robust in child-directed speech. Indeed, there is substantial information available in the speech input about the distribution of unaccented and accented expressions. The adult pattern of using unaccented expressions for given and accessible referents is present in child-directed speech as well (Fisher and Tokura, 1995). More generally, speech to children tends to have attenuated pronunciations for words that are predictable from the discourse or physical context (Bard & Anderson, 1983,1994). If children can detect and categorize tokens by acoustic prominence, they should have amassed a large database of accented and unaccented words at a very young age.

However, evidence from pronoun comprehension suggests that 4–5 year old children may not be able to integrate pragmatic biases with discourse accessibility quickly enough to affect their initial interpretation of the referential expression. While children interpret gender-disambiguated pronouns as quickly as adults (Arnold et al., 2007), manipulations of accessibility either have not influenced young children’s interpretations (Arnold et al., 2007), or have done so only a second or two after the critical expression (Pyykkönen et al., 2007; Song & Fisher, 2005).

Experiment 1: Adults

Method

Participants

49 native English-speaking students at the University of North Carolina at Chapel Hill participated in exchange for course credit. Data from 13 were excluded: 7 because of technical problems, 5 because of calibration problems of track loss, and 1 because the participant had to leave before finishing the experiment. This left 36 participants in the analysis.

Method and Materials

Participants were asked to wear a visor for the purposes of monitoring their eye movements. They viewed pictures on a computer screen, as in Figure 1. 135 of 176 pictures were drawn from a colorized version of the Snodgrass and Vanderwart (1980) database of pictures (Rossion & Purtois, 2001); the rest were from other clipart databases. On each trial, two of the objects had names that were cohort competitors, meaning that they overlapped during the initial segments, like bagel/bacon or candle/candy. The pictures always appeared on a grid, with the same four shapes in the corners on all trials. Participants followed recorded instructions to move objects onto the shapes with the mouse. There were two instructions for each visual stimulus, e.g. Put the bacon on the star. Now put the bacon on the square (see Table 1).

The object in the second instruction was the referring expression of interest, e.g. bacon in this example. The other cohort object (i.e., the one with an overlapping name, e.g., the bagel) was the competitor. The first instruction mentioned either the target (the anaphoric condition) or the competitor (the nonanpahoric condition). We also manipulated the form of the target referring expression, which was accented, unaccented, or pronominal (Now put it…); the pronoun was always anaphoric. For an example of auditory stimuli, see Table 1. There were two sets of items with identical visual stimuli, but the opposite mapping of objects to target and competitor roles (e.g., for Figure 1 the target was bagel in set a and bacon in set b). Thus, any idiosyncratic characteristics of the stimuli were counterbalanced across lists.

The auditory stimuli were recorded by the author, and the same soundfiles were used for both experiments. A single context instruction was recorded for each anaphoric and nonanaphoric condition for each item in cohort set a. These same context instructions were used for cohort set b, but swapped (i.e., the anaphoric context instruction from set a became the nonanaphoric one for set b, and the nonanaphoric one for set a became the anaphoric one in set b). A single recording was created for each target instruction in each condition. All context sentences ended with rising intonation, following Dahan et al.’s (2002) stimuli. This signaled that the speaker wasn’t finished, encouraging participants to interpret the second instruction in the context of the first.

The target instruction sentences were analyzed with Praat (Boersma & Weenik, 2007) to identify the average duration and pitch of each critical word, and the accent pattern of the target words was transcribed in the ToBI labeling system (Beckman & Elam, 1997). In the accented condition, the target word (the theme in the second instruction) carried a pitch accent, had greater pitch movement, and was acoustically prominent and relatively long (avg. 701 ms). In the unaccented condition, the target word carried no pitch accent, and was acoustically attenuated, with a shorter duration (avg. 337 ms), and no boundary tone. The pronoun was also unaccented and acoustically attenuated, average duration 92 ms. In all accented conditions, the target word had a L+H* accent, followed by an L-H% boundary tone. The result was an extremely prominent sounding accent, giving the impression that the speaker was being deliberate and explicit. This established a clear contrast between the accented and unaccented conditions. Table 2 presents the ToBI transcription of the typical accent pattern; Table 3 presents the average acoustic properties of the target words in each condition. Sample acoustic files can be found at www.unc.edu/~jarnold/pages/publications.html.

Table 2.

ToBI transcription of accent pattern in context and stimulus sentences.

	Instructions
Context sentence	Put the bacon on the star.
	H* L+H* H−H%
Pronoun target sentence	Now put it on the square.
	L+H* H* L−L%
Unaccented target sentence	Now put the bacon on the square.
	L+H* H* L−L%
Accented target sentence	Now put the BACON on the square.
	L+H* L+H* L−H% H* L−L%

Open in a new tab

Table 3.

Experiment 1: Acoustic characteristics of the target word in each condition

	Accented	Unaccented	Pronoun
Target word duration	701 ms	337 ms	92 ms
Pause after target word	191 ms	14 ms	0 ms
Maximum pitch	289 Hz.	190 Hz.	185 Hz.
Minimum pitch	145 Hz.	164 Hz.	167 Hz.

Open in a new tab

The focus here is on the different acoustic properties of the target word, but it is important to note that accenting on any particular word is not independent from the prosodic characteristics of the rest of the utterance. In all conditions the destination location also received a prominent pitch excursion. Apart from the presence of pitch accents, other acoustic characteristics of stimuli with accented targets were also systematically different. The target instructions in all conditions had an accent on the initial word Now. However, as shown in Table 4, the initial three words (Now put the) were longer and/or more likely to have a following pause in the accented condition compared with both unaccented and pronominal conditions. The accented and unaccented conditions both used a prominent accent on the destination shape (e.g., on the TRIANGLE). This pattern was deemed most natural because the most contrastive element of the sentence (apart from the target) was the destination shape, which was always different across the two instruction. This pattern differed from Dahan et al.’s unaccented condition, which instead used a prominent accent on the preposition following the target word (2002, p. 298). However, Dahan et al.’s accented condition was very similar to the one used here, with a prominent accent and boundary tone on the target word.

Table 4.

Experiment 1: Average duration for the initial words in each condition (ms)

	Accented	Unaccented	Pronoun
Now duration	310	257	314
Pause after Now	53	49	0
Put duration	105	88	108
Pause after Put	65	38	0
The duration	121	85	--

Open in a new tab

This experiment also included an additional manipulation of the prosody on the word Now. In half the items Now carried a large acoustic prominence, and in half the items it was relatively de-emphasized. This variable did not affect the results substantially, and therefore will not be discussed further. The experiment with children (Experiment 2) only used the prominent Now conditions, so only those items are presented here (i.e., 15 out of the 30 items viewed by each participant).

The basic experimental design was thus 2 (anaphoric vs. nonanaphoric) x 2 (accented vs. unaccented target), plus a pronominal/anaphoric condition. These 5 conditions occurred either with an accented Now or an unaccented Now, for a total for 10 conditions. There were 30 experimental items in 10 conditions, so each participant heard 3 critical items in each experimental condition. The critical items were combined with 12 fillers into 10 lists. As an additional control, there was a second set of 10 lists in which the items that were assigned to target and competitor were swapped; e.g. the example in Table 1 became Put the bagel on the circle. Now put the bacon on the square. Before each list there were two practice items, in one the target expression was accented, in the other it was unaccented; both referred to a different object than was mentioned in the first instruction.

All 12 filler items also included a pair of objects whose names were members of a cohort set. These were never mentioned in the context instruction, and were mentioned in six fillers as the theme of the second instruction. The fillers thus served to reduce the expectation that one of the two objects with overlapping names would be mentioned in the first instruction, as well as the expectation that if one of these objects was mentioned, the other would be too. Out of the 12 fillers and 2 practice items, the target noun was anaphoric in 4 items and nonanaphoric in 10; 7 target expressions were accented, and 7 were unaccented. Across all stimulus and filler/practice items, there were an equal number of anaphoric and nonanaphoric items on each list.

On critical items, the target and competitor objects occurred equally in all four positions in the display. On all items, the target and competitor were placed either diagonally or vertically from each other on the initial display. The experiment was designed so that the first instruction would always result in the object being moved to the shape immediately next to it, because of methodological constraints on coding direction of eye gaze in Experiment 2. However, due to a programming error, all the objects on the bottom half of the screen were moved to the bottom shape on the opposite half of the screen. This did not affect our ability to identify eye movements for this experiment.

Procedure and Apparatus

We monitored participants’ eye movements with an Eyelink II head mounted eyetracker. After the task was explained, the visor with cameras was arranged on the participant’s head, and calibrated. Participants completed two practice items, were given a chance to ask questions, and then went on to do the experimental task.

The visual and auditory stimuli were presented on a PC computer running the ExBuilder software (Longhurst, 2006). Each trial was preceded by a screen with a dot, which participants needed to fixate on and click. This enabled the eyetracker to perform a drift correction, and encouraged participants to attend as each trial began. As soon as the participants clicked on the dot, the visual stimulus appeared and the soundfile began to play. The instruction began within 150 msec of the onset of the soundfile. The Eyelink II sampled participants’ eye movements once every 2 or 4 milliseconds; the data were analyzed with a 4-msec time window for each data point.

If participants behave as those in Dahan et al.’s (2002) experiment 1, the unaccented condition should produce more and earlier looks to the previously-mentioned object than the accented condition. The pronominal condition can only be interpreted as anaphoric, and is expected to produce similar results as the unaccented/anaphoric condition.

Results and Discussion

Three types of analyses are reported: 1) Descriptive data establishing the speed and frequency of eye movements in this task, for comparison with children in Experiment 2; 2) Action errors in moving the objects, and 3) Eye movements to target objects. Eye movements are analyzed in terms of the average number of looks; each look begins at the onset of a saccade to an object, and lasts until the participant makes a saccade to a new object. Saccades within the same object are categorized as a single look. Saccades are grouped together with following fixations because they are ballistic, so the saccade reflects the participant’s decision to look at that object.

In both this and experiment 2, the pronoun condition is presented for comparison with the unaccented condition, but the primary analysis is the comparison of the four conditions resulting from a cross between accent and anaphoricity. All analyses of variance were conducted over both participant and item means. The items analyses always included Item Group (i.e., the class of items that rotates together through the conditions across lists) as a predictor variable; this captures differences between the participants who contribute to the different conditions for a particular item. There were no effects or interactions with Item Group except where noted. All analyses using proportion data were performed twice, once with the raw data and once with arcsin-transformed proportions (arcsine (2*p − 1). The raw analyses are reported for transparency of interpretation; the arcsine-transformed analyses provided the same pattern of results unless noted. Trials were excluded from time-window analyses if there was greater than 33% track loss during the critical period; 11 trials (3%) were excluded from Exp. 1, and 3 trials (1.3%) were excluded from Exp. 2.

Descriptive Data

Two dimensions were measured with the primary purpose of comparing adults and children in terms of the speed with which they moved their eyes and responded to the linguistic input in this task. The speed with which adults change their point of regard was measured in terms of the average amount of time spent on each look, which was 530 msec (SE = 12). That is, on average every 530 msec adults launched a saccade to look at a new region on the screen. Their response to the linguistic input was measured as the time lapse between the onset of the target word and the next saccade to a new object, which averaged 330 msec (SE = 17).

Errors

The presentation software recorded trials on which the participant initially moved the wrong object. Six participants made one error, and always in the unaccented nonanaphoric condition; these errors occurred on four different items (average 5.6% across participant means). There were no errors in any of the other conditions.

Eye movements

Eye movements to display objects were time-locked with the onset of the target referring expression for each item, for each participant. Figure 2 presents the average looks to display objects in both the pronominal and unaccented/anaphoric conditions, starting at the onset of the target expression. Of primary interest are eye movements occurring during the target expression and immediately after. These are the eye movements that are likely to reflect listeners’ initial hypotheses about what the word refers to, as they hear the phonetic input accrue over the course of the word. As predicted, both pronouns and unaccented anaphors yielded an early target preference: looks to the target began to diverge from looks to the competitor very rapidly, about 300 msec after the onset of the referring expression. Although there were slightly more looks to the target in the pronominal condition, the conditions were not reliably different from one another, with either target or competitor looks as the dependent variable (all F’s < 1). This similarity occurs despite the fact that pronouns are substantially shorter, suggesting suggests that the first syllable of the unaccented noun provides enough information for listeners to begin to resolve the reference. This is consistent with the expected anaphoric bias for the unaccented condition.

Experiment 1 results: Proportion looks to target and competitor objects in the unaccented/anaphoric and pronominal conditions. The graph begins at the onset of the critical expression for each item. Proportions are calculated out of all looks, including track loss.

But the critical question is whether the unaccented condition would have a greater anaphoric bias than the accented condition. As Figure 3 illustrates, it does. The proportion of looks to the previously-mentioned (given) objects, both targets and competitors, is calculated out of all looks to cohort objects, collapsing across target condition. This takes advantage of the temporary ambiguity of the target word, which initially is consistent with both target and competitor objects. Thus, early eye movements may have been programmed before the disambiguating information was available. Beginning around 300 msec after the onset of the unaccented referring expression, there is an increase in looks to the previously mentioned object. In the accented condition, by contrast, the rate of looking at the previously mentioned cohort steadily decreases.

Experiment 1 results: proportion looks to the given (previously-mentioned) object out of all looks to both the given and new cohort objects. The vertical line represents the average offset of the target word. The graph begins at the onset of the critical expression. Proportions are calculated out of all looks, including trackloss.

The reliability of this contrast was assessed in two ways. The simplest way to observe the difference between accented and unaccented conditions is to analyze the average looks to the previously-mentioned object following accented and unaccented target words. This analysis uses the common technique of synchronizing items at the onset of the target word, and analyzing a time window shortly thereafter. The window used here was 300–1000 msec after the onset of the target expression, following Dahan et al. (2002). Looks to the previously-mentioned object were reliably greater in the accented condition (M = 37%, SE = 3), than the unaccented condition (M = 28%, SE = 3). These data were submitted to analyses of variance with participants and items as random effects. Results revealed a main effect of accenting (F1(1,35) = 7.69, p <.01; F2 (1,20) = 7.53, p <.05). This supports the prediction that unaccented expressions have a greater bias toward previously-mentioned referents than accented expressions do.

This contrast is further supported by an examination of participants’ first look after the onset of the target word, which is a likely indication of their initial hypothesis about the referent. As shown in Table 5, unaccented expressions yielded more initial looks to the target in the anaphoric than unanaphoric condition, whereas accented expressions produced equal initial target looks in the two conditions. The measurement of first looks is categorical, analyzed here as looks to target vs. other. These data were therefore modeled with a multilevel logistic regression model, with random effects for both participants and items, using SAS proc glimmix with a Penalized Quasi-Likelihood (PQL) estimator (see Bauer & Curran, 2006). The model included three independent variables: accent, anaphoricity, and accent x anaphoricity. As shown in table 6, the t-value for the odds ratio was significant at the.05 level for the main effect of anaphoricity, and for the accent × anaphoricity interaction.

Table 5.

Experiment 1: Eye movement results across the four conditions: Adult participant means in each condition. Standard Error of the Mean is in parentheses.

	Unaccented Anaphoric	Unaccented Nonanaphoric	Accented Anaphoric	Accented NonAnaphoric
Proportion of first looks after target expression onset to the target object	37% (6)	16% (4)	24% (4)	23% (4)
Proportion looks to target object (300–1000 after target onset)	51% (4)	38% (3)	35% (4)	39% (3)
Proportion looks to competitor object (300–1000 after target onset)	12% (2)	23% (3)	22% (3)	22% (3)

Open in a new tab

Table 6.

Experiment 1 statistical data for analyses across all four conditions. Nonsignificant and marginal effects are shaded.

	Main effect of accenting	Main effect of anaphoricity	Interaction (anaphoricity × accenting)
Proportion of first looks after target expression onset to the target onset	Odds ratio = 0.45 S.E. =.36 t = 1.26 DF = 347 p =.21	Odds ratio = 1.15 S.E. =.34 t = 3.36 DF = 347 p <.001	Odds ratio = -1.05 S.E. =.47 t = -2.22 DF = 347 p <.05
Proportion looks to target object (300–1000 after target onset) ⁺ ^*	F1 (1,35) = 6.05 p <.05	F1(1,35) = 1.27 p =.127	F1(1,35) = 8.08, p <.01
	F2(1,20) = 2.18 p =.156	F2 (1,20) = 4.19 p =.054	F2 (1,20) = 15.74 p =.001
Proportion looks to competitor object (300–1000 after target onset) ⁺	F1 (1,35) = 1.86 p =.181	F1(1,35) = 2.80 p =.103	F1 (1,35) = 6.07 p <.05
	F2 (1,20) = 1.05 p =.319	F2 (1,20) = 5.07 p <.05	F2 (1,20) = 6.19, p <.05

Open in a new tab

There was also a three-way interaction itemgroup x accent x anaphoricity in the F2 analysis

⁺

One cell was missing in the F1 analysis and was replaced by the participant mean.

Comparison with Dahan et al.’s (2002) findings

Since this experiment was closely modeled on that of Dahan et al. (2002), it is worth considering whether their effects were replicated. Table 5 presents the average proportion looks to target and competitor over the time window from 300 to 1000 msec, which was the measure reported by Dahan et al. The means show more target and fewer competitor looks in the unaccented/anaphoric condition than in the other three critical conditions. These data were submitted to a 2(accented vs. unaccented) x 2(anaphoric vs. nonanaphoric) ANOVA, which revealed a significant interaction between anaphoricity and accent. See Table 6 for statistical details.

Our findings replicated the interaction between anaphoricity and accenting that was found by Dahan et al. (2002). However, our findings only clearly replicated Dahan et al.’s anaphoric bias with unaccented nouns. By contrast, participants in the current experiment did not exhibit a strong new-object bias with the accented condition. Separate analyses of target looks in the accented and unaccented conditions revealed a significant effect of anaphoricity for unaccented items (F(1, 25) = 8.08, p <.01; F2 (1, 20) = 18, p <.001)², but no effect for accented items (F’s < 1)³.

The only evidence that could possibly be interpreted as a new advantage for accented stimuli occurred fairly late. Figure 4 presents the proportion of target looks (out of all looks) for two seconds after the onset of the critical expression. Beginning around 800 msec after the onset of an accented noun, there were more looks to the nonanaphoric than anaphoric target. Since the accented stimuli were longer in duration than the unaccented stimuli, it is worth considering whether reactions to the input simply occur later in time, and that this late new-object bias reflects listeners’ initial biases when hearing accented referring expressions. However, there are two reasons that this conclusion is not justified. First, participants were only slightly slower to launch a new eye movement following the target word onset in the accented (M = 382 msec) than unaccented (M = 328 msec) conditions, but the new-target advantage in the accented condition emerges roughly 500 msec after the advantage for previously-mentioned objects in the unaccented condition.

Experiment 1: proportion looks to the target object in the four critical conditions. The vertical line represents the average offset of the target word. The graph begins at the onset of the critical expression. Proportions are calculated out of all looks, including trackloss.

Second, the new-target advantage also occurs in the unaccented conditions. Thus, this pattern seems to reflect a general tendency to spend less time fixating targets when they have been previously mentioned, probably because they had already visually examined the previously mentioned object. We assessed this by examining the length of time spent on the first look at the target object after the onset of the target expression onset, which revealed that people look longer at nonanaphoric than anaphoric targets, in both unaccented and accented conditions (unaccented/anaphoric: M = 790 msec, SE = 38; unaccented/nonanaphoric; M = 946, SE, = 49; accented/anaphoric: M = 892, SE = 73; accented/nonanaphoric: M = 1135, SE = 49). This contrast is evident in the results of a 2×2 ANOVA, which revealed a main effect of anaphoricity (F1 (1,35) = 21.46, p <.001; F2 (1,20) = 19.5, p <.001), a main effect of accenting (F1 (1,35) = 4.97, p <.05; F2(1,20) = 5.96, p <.05), and no interaction (F’s < 1).

In sum, the accented condition produced less of a bias toward previously-mentioned objects than the unaccented condition did, but there was little evidence that adults have an initial bias towards low-accessible referents when they hear accented expressions. While there may be a later bias, it is indistinguishable from a general interest in looking at previously unmentioned objects.

This lack of a clear nonanaphoric bias in the accented condition is surprising, given claims in the literature that the L+H* invokes a contrastive interpretation (e.g., Pierrehumbert & Hirschberg, 1990), and can lead to the inference of a contrast set (e.g., Sedivy et al., 1995). Stimuli like Put the bacon…. Now put the BA-, should suggest a contrast with the object of the previous put action. One possibility is that the contrastive interpretation is there, but that unrelated factors lead participants to fixate previously-mentioned entities more quickly (for example, as suggested by an anonymous reviewer, if such items are better represented in visual memory), irrespective of accenting condition. However, it is notable that Dahan et al. (2002) did find a nonanaphoric bias for interpreting accented expressions with the same task. Thus, while the current results are not inconsistent with claims that accenting creates a contrastive interpretation, they suggest that in the context used here, this interpretation is either weaker, or occurs relatively late compared with the anaphoric bias in the unaccented condition.

This asymmetry in the results is consistent with how accenting occurs in speech. Unaccented expressions have a strong probability of being used to refer to something highly accessible and given in the discourse context. Accented definite NPs, on the other hand, seem to be less specialized. They can felicitously be used to refer to something that has not been previously mentioned, as long as the referent is identifiable (see Chafe, 1976, Prince, 1992; Gundel et al., 1993). At the same time, they can refer to something given, particularly if its referent is not highly accessible in memory. For example, 52% of words with given referents were accented in Hirschberg’s (1993) sample (and 87% of words with new referents). Related evidence comes from Watson and Arnold’s (2005) experiment 1, in which acoustically prominent tokens were frequently produced for reference to both unmentioned and given but relatively inaccessible entities. Similarly, Terken & Hirschberg (1994) found accenting when the referent was given but not in a syntactically parallel position. It is also worth noting that Dahan et al’s (2002) data also support the conclusion that the anaphoric bias with unaccented forms is stronger than the nonanaphoric bias with accented forms. The new bias reported in their experiment 1 only affected competitor looks, and their experiment 2 revealed a preference to interpret accented expressions as co-referential with previously mentioned but less accessible entities.

Nevertheless, the important finding from experiment 1 was that adults distinguished between accented and unaccented referring expressions. Adults were more biased toward an anaphoric interpretation with unaccented expressions than accented expressions. Indeed, unaccented expressions were interpreted very much like pronouns. Experiment 2 investigated whether 4 and 5 year old children would also use the presence or absence of an accent to guide their initial referential interpretations.