Standing alone with prosodic help

Lyn Frazier; Charles Clifton, Jr; Katy Carlson; Jesse A Harris

doi:10.1080/01690965.2013.828095

Author manuscript; available in PMC: 2015 Jan 1.

Published in final edited form as: Lang Cogn Process. 2013 Aug 28;29(4):459–469. doi: 10.1080/01690965.2013.828095

PMCID: PMC3979625 NIHMSID: NIHMS512716 PMID: 24729648

Abstract

Two partially independent issues are addressed in two auditory rating studies: under what circumstances is a sub-string of a sentence identified as a stand-alone sentence, and under what circumstances do globally ill-formed but ‘locally coherent’ analyses (Tabor, Galantucci, & Richardson, 2004) emerge? A new type of locally coherent structure is established in Experiment 1, where a that-less complement clause is at least temporarily analyzed as a stand-alone sentence when it corresponds to a prosodic phrase. In Experiment 2, reduced relative clause structures like those in Tabor et al. were investigated. As in Experiment 1, the root sentence (mis-)analyses emerged most frequently when the locally coherent clause corresponded to a prosodic phrase. However, a substantial number of locally coherent analyses emerged even without prosodic help, especially in examples with for-datives (which do not grammatically permit a reduced relative clause structure for some speakers). Overall, the results suggest that prosodic grouping of constituents encourages analysis of a sub-string as a root sentence, and raise the question of whether all local coherence structures involve analysis of an utterance-final sub-string as a root sentence.

Keywords: local coherence effects, prosody, identifying root structures, sentence processing

In casual speech at least, it may often be unclear when a clause can be analyzed as an independent English sentence. In some sense, any string of words that forms a clause and is not marked as embedded is a potential ‘root’ structure (a stand-alone utterance). Whether it is in fact a root clause presumably depends on syntactic, semantic and intonational properties of the clause¹ and of surrounding linguistic material, because the surrounding remnant material must itself make up a potential root structure for the analysis of the input to be grammatical.

We hypothesize that, during the processing of a sentence, material that is prosodically separated by an ip or an IPh boundary (intermediate phrase or Intonational Phrase; Beckman & Elam, 1997) from previous material can be identified as beginning a new root structure. Given a string like “abcde,” the parser may identify (cde) as a root structure if it forms a potential root clause, even if (ab) does not form a stand-alone utterance. Such an analysis is particularly likely if the string of words is prosodically grouped as (ab) (cde), and it may be facilitated if, as we hypothesize, attention is largely directed at newly-arriving material. When parsing (ab), the processor will not be concerned if (ab) is missing some constituent that may be upcoming, and the presence of a prosodic break might be justified by the length of the upcoming missing constituent. Later, when the parser receives (cde), it might analyze this as a stand-alone (root) clause if (cde) makes up a sentence. So the experience of analyzing the input as it arrives might be relatively smooth: each part of the input receives a reasonable analysis as it is being processed.

We suspect that this is part of what goes on in the ‘local coherence’ structures identified by Tabor, Galantucci, & Richardson (2004). These authors report self-paced reading time effects indicating that the underlined part of (1), the little girl brought a toy, was mis-identified as a clause even though the analysis would be globally ungrammatical (see below for details).

(1)
The kindergartners liked the little girl brought a toy…

It seems possible that readers assigned an implicit prosody to at least some of these sentences (Fodor, 2002). If they put an implicit prosodic boundary before the ambiguous region (underlined in (1)), they might be particularly likely to misidentify it as a stand-alone clause. In most of the Tabor et al. materials, the ambiguous string was fairly long and appeared in direct object position, which could encourage readers to consider it a prosodic phrase in their mental representation of the sentence.

Rather than attempting to manipulate implicit prosodic boundaries, we examined how listeners analyzed sentences that varied in how they were overtly prosodically divided. We first used a new structure to test the general hypothesis that material that constituted a prosodic unit would more often be analyzed as a root clause, and then in Experiment 2 tested the hypothesis in structures like those investigated by Tabor et al. Before turning to the experiments, we discuss existing evidence on the identification of root structures.

The problem of how root structures are identified prosodically has not received much attention in the literature.² Despite the existence of a vast literature on intonation (e.g., Jun, 2005) and its role in processing (Wagner & Watson, 2010), this particular question has rarely been addressed directly (though see Silverman, 1987 and Carlson, Frazier, & Clifton, 2012). It has been assumed since the pioneering work of Beckman and Pierrehumbert (1986) that a high boundary tone after a low phrasal accent (L-H%), often called a ‘continuation rise,’ is most compatible with the ends of sentence-internal phrases, and that a steeply falling final boundary tone (L%) marks the end of an utterance.³ Impressionistically there are clear examples of each that have the appropriate interpretation (though see Carlson et al., 2012, for surprising difficulty confirming these assumptions experimentally in examples which are ambiguous between a sentence-internal conjoined clause boundary and a sentence boundary).

Marcus and Hindle (1990) do discuss the problem of how to identify root clauses, emphasizing parenthetical root clauses. They presented a processing account of intonation boundaries (formulated in Description Theory), claiming that intonational boundary ‘morphemes’ (‘unknown “lexical” items’ in Marcus and Hindle’s terminology) are inserted into the analysis of the input lexical string at the point when an intonational boundary is heard. All material until the next intonational boundary would be packaged together, with later unification of matching chunks. For instance, in (2) (where the parentheses indicate prosodic boundaries), the words in the phrase they all knew are packaged together. If that phrase is pronounced with a compressed pitch range, it is interpreted as a parenthetical and treated as a root sentence. If it is pronounced with contrastive pitch accents on we, suspected, they, and knew, it is given a right-node raised interpretation in which we only suspected and they all knew are given parallel interpretations, each with the complement that a burglary had been committed. In this system, root structure identification is an after-the-fact decision.

(2)
We only suspected (they all knew) that a burglary had been committed.

The new studies we report were designed to test the hypothesis that structures which are actually embedded are at times analyzed as root structures, particularly when they form a prosodic unit on their own.

(3)
Prosodic local coherence hypothesis: When material that could stand alone as a root clause exhausts a prosodic phrase, it is more likely to receive a root clause analysis than when it fails to exhaust a prosodic phrase, even when such an analysis is only locally coherent, not globally well-formed.

When global ill-formedness occurs due to a root clause mis-analysis, listeners will presumably recognize it and report that a sentence sounds unnatural. In Experiment 1 we test the hypothesis in an auditory rating study using sentences containing a final complement clause that may or may not be analyzed locally as a root clause. In Experiment 2 we investigate the reduced relative structures tested in reading studies by Tabor et al. (2004) to further explore the various sources of the local coherence effect in those studies. In what follows we only test examples where the locally coherent structure is utterance-final; we do not contrast such cases with examples where a non-final prosodic phrase might on its own be analyzed as a root clause – a point to which we return in the General Discussion.

Experiment 1

The first experiment obtained judgments of the naturalness of sentences like those in (4). The position of a prosodic boundary (indicated by parentheses) was varied so that it immediately preceded either the final complement clause (4a, b) or the earlier complement to the matrix verb (4c, d), and the presence of the complementizer that was also varied (appearing in (4a, c)).

(4)
- a.
  Martin says that Louise believes (that her boss is an alien.)
- b.
  Martin says that Louise believes (her boss is an alien.)
- c.
  Martin says (that Louise believes that her boss is an alien.)
- d.
  Martin says (that Louise believes her boss is an alien.)

Listeners may find these sentences difficult to comprehend, and rate them as ‘sounding bad,’ if they misanalyze the final complement clause as a root clause, initially leaving the matrix clause without a required complement, especially if recovering from this misanalysis proves difficult for the processor. The prosodic local coherence hypothesis in (3) predicts that misanalysis will be likely in (4a) and (4b), given that the embedded clause exhausts a prosodic phrase, but not in (4c) and (4d), resulting in lower ratings for the former than the latter. Since the complementizer that in (4a) blocks the possibility of (inappropriately) interpreting the final clause as a root structure, the penalty will be greater in (4b) than in (4a). In other words, the processor may be tempted to treat stand-alone prosodic phrases as root clauses at some point during analysis, a temptation which, we predict, might attenuate in the presence of inconsistent grammatical cues. We thus predict there should be a main effect of boundary placement and an interaction with the presence of that.

Method

Materials

To test the prosodic local coherence hypothesis in (3), 24 sentences like (4) were constructed. Each sentence contained three clauses, permitting a possible prosodic boundary between the first and second, or the second and third, clauses. The sentences contained either an overt complementizer (4a, c), prohibiting a root structure analysis of the final clause, or no overt complementizer (4b, d), thus locally allowing the final clause to be analyzed as a root structure, by hypothesis. The second verb, the verb embedding the critical final clause, was always nonfactive, so that the truth of the critical clause would not be presupposed (Kiparsky & Kiparsky, 1970). Each sentence was recorded with a relatively prominent prosodic boundary placed late before the final clause as in (4a, b), or early before the intermediate clause, as in (4c, d) (see below for acoustic analysis).

The sentences, together with 122 other sentences from unrelated experiments, were recorded by a phonologist with extensive training in prosody. Pitchtracks of an illustrative item appear in Figure 1.

Figure 1. Example pitchtracks, Experiment 1. Panel 1, late boundary, no complementizer; Panel 2, late boundary, complementizer; Panel 3, early boundary, no complementizer; Panel 4, early boundary, complementizer.

Acoustic analyses were conducted using the program PRAAT (Boersma, 2001). The total duration of the first verb (says in (4)), including any silent pause following it, averaged 664 ms in the early boundary condition (averaging over presence vs. absence of that, which had no systematic effects) as compared to 379 ms in the late boundary condition. The corresponding values for the second verb (believes in (4)) were 359 ms in the early boundary condition and 726 ms in the late boundary condition. In addition to the greater length preceding a boundary, the pre-boundary verbs had generally greater pitch excursion than the verbs that did not precede a boundary (V1: 136 vs. 80 Hz, respectively; V2: 94 vs. 57 Hz). The prosodic boundaries varied in phonological type, with around a third of the examples in each condition using IPh boundaries with slight continuation rises and the remaining two-thirds having smaller ip boundaries with low (L-) phrase accents marking their ends. In all cases, these were the largest prosodic boundaries in the sentences. Two items in condition (a) were found to have ip boundaries at the early position as well as ip boundaries at the intended late position, but removing these two items from the analysis did not affect the pattern of results.

To evaluate the possibility that the final clause (her boss is an alien in (4)) varied prosodically across conditions, it was also analyzed phonologically and acoustically. In all cases, the subject had a non-prominent high (H*) accent, and the final phrase (alien) had a more prominent H* accent, as seen in Figure 1. The duration and the pitch range of the entire clause (high – low frequency in Hz) did not differ substantially across conditions. Pitch range varied from 75 to 80 Hz across the four conditions, with an average standard deviation of 14 Hz; duration varied from 1270 to 1328 ms, with an average standard deviation of 258 ms. In sum, the local coherence condition (4b) did not stand out from the other conditions prosodically; the major prosodic differences between sentences were due to the different boundary locations.

Subjects and procedures

Forty-eight University of Massachusetts undergraduates, who participated for course extra credit, were tested in individual half-hour sessions. They received instructions that they were to rate or answer questions about sentences they would hear. After hearing and responding to five practice sentences, each subject heard all 146 sentences (24 experimental sentences plus 122 from other experiments) in an individually-randomized order. Sentences were played at a comfortable listening level over speakers in an acoustically-isolated booth. A subject pressed a key on a keyboard when s/he had heard a sentence, and then a 'question' appeared on a monitor. For the experimental sentences (plus 30 of the other sentences), the question was the presentation of a 5-point scale, where 1 = "sounded fine" and 5 = "sounded bad." (For ease of comprehension, these values will be reversed in reporting the results.) A computer recorded responses and reaction times.

Results

The means and standard errors for each condition are presented in Table 1. The data were analyzed as a linear mixed-effects regression model, using the lme4 package in R (R Development Core Team, 2012). Two fixed effect factors were initially included for analysis: (i) presence of complementizer, (ii) position of break. Each of these factors was sum-coded, so that, as in ANOVA, the effect of each factor can be interpreted as its effect pooled over the levels of the other factor(s). Model comparison indicated that the simplest model that was not significantly (p < .10) worse than any more complicated model had random intercepts and non-interacting random slopes for both subjects and items.

Table 1.

Mean ratings (5 = “sounded fine”), Experiment 1, plus standard errors in parentheses.

Boundary position	Complementizer present	Complementizer absent	Overall mean
Late	3.42 (0.76)	3.25 (0.76)	3.34
Early	3.64 (0.71)	3.77 (0.69)	3.71
Overall mean	3.53	3.51

The results of the model appear in Table 2. Because estimation of df is uncertain (but very large) in our linear mixed-effects model, and because Markov chain Monte Carlo (MCMC) estimation is not implemented in R when random slopes are used, a t with absolute value >2.0 will be considered significant. Although there was no overall main effect of presence or absence of the that-complementizer, sentences with a late boundary were rated significantly worse overall than sentences with an early boundary (3.34 vs. 3.71, where 5 was ‘sounded fine’), and this difference was significantly larger when the complementizer was absent than when it was present (.52 vs. .22 scale points).

Table 2.

Parameters of linear mixed-effects regression model, Experiment 1.

Fixed effects	Estimate	Standard error	t-value

Intercept	3.52	0.13	27.28
Boundary position	−0.18	0.04	−4.87
Complementizer	−0.01	0.04	−0.27
Boundary*Complementizer	−0.07	0.03	−2.74
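
As a rough arithmetic check, the sum-coded fixed effects in Table 2 can be approximated from the cell means in Table 1. The sketch below is only an illustration, not the analysis reported here: it assumes a coding of late = +1, early = -1 and complementizer absent = +1, present = -1 (the direction of coding is not stated in the text and is chosen because it reproduces the signs in Table 2), and it ignores the random effects of the fitted model, so only approximate agreement should be expected.

```python
# Cell means from Table 1 (5 = "sounded fine").
cell_means = {
    ("late", "present"): 3.42,
    ("late", "absent"): 3.25,
    ("early", "present"): 3.64,
    ("early", "absent"): 3.77,
}

# Assumed sum (+1/-1) coding; the direction is a guess, not stated in the paper.
BOUNDARY_CODE = {"late": 1, "early": -1}
COMP_CODE = {"absent": 1, "present": -1}


def sum_coded_effects(means):
    """Return intercept, main effects and interaction under +1/-1 coding.

    With a balanced 2x2 design, each coefficient is the average of the
    cell means weighted by the product of the relevant contrast codes.
    """
    n = len(means)
    intercept = sum(means.values()) / n
    boundary = sum(BOUNDARY_CODE[b] * m for (b, _), m in means.items()) / n
    complementizer = sum(COMP_CODE[c] * m for (_, c), m in means.items()) / n
    interaction = sum(
        BOUNDARY_CODE[b] * COMP_CODE[c] * m for (b, c), m in means.items()
    ) / n
    return {
        "intercept": intercept,
        "boundary": boundary,
        "complementizer": complementizer,
        "interaction": interaction,
    }


effects = sum_coded_effects(cell_means)
for name, value in effects.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions each coefficient is half the difference between the relevant level means, and the values come out close to the estimates in Table 2. The boundary effect of roughly -0.18 corresponds to the 3.34 vs. 3.71 difference between late and early boundaries.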

Discussion

The central predictions of the prosodic local coherence hypothesis (3) were confirmed: participants rated sentences with a late prosodic boundary as sounding particularly bad when they contained a final clause which was not marked as embedded with a that-complementizer, relative to those that were overtly marked as embedded. In other words, the late boundary position set the final clause in its own prosodic phrase, tempting the parser to take a root clause analysis when such an analysis was grammatically licensed – i.e., in the absence of a complementizer. The results highlight a potentially pervasive problem confronting the processor: when to assume it is parsing a root structure. If attention is allocated locally to new material and local context, the processor may not be prioritizing the relation between distinct prosodic inputs, giving rise at times to a separate sentence illusion. This leads us to expect that prosody also plays a role in the structures that have already been identified as local coherence structures (Tabor et al., 2004).

In sum, it is not helpful to prosodically separate the final clause from the immediately preceding clause when it needs to be attached into the nearest clause (an effect which we suspect will be especially true when this prosodic boundary is the largest in the sentence; cf. the discussion of the Informative Boundary Hypothesis in Frazier, Carlson, & Clifton, 2006).

Experiment 2

Tabor et al. (2004) reported three self-paced reading studies of reduced relative sentences like (5). They found that the locally possible root clause analysis of the final underlined phrase in (5), the player tossed a frisbee, disrupted processing, even though analysis of that phrase as a sentence is not consistent with the global structure of the sentence. (See also Konieczny et al., 2009.)

(5)
The coach smiled at the player tossed a frisbee.

Tabor et al. offered three possible accounts of the result. One was that the disruption might be due to the operation of a self-organizing network that is heavily influenced by local analyses. This was dubbed the Local Coherence Account 1 (LCA1). The second possibility was that analysis of the input takes place chunk-by-chunk as in, for example, the Sausage Machine model (Frazier & Fodor, 1978), dubbed LCA2. The third possibility is that the sentences tested, or some subset of them, are not grammatical for the participants, possibly because of their temporarily ambiguous reduced relative clauses, and the disturbance in processing is due to failure to repair or to find a grammatical analysis (LCA3). In this case, the local coherence effect might be a consequence of a parsing failure, rather than its cause. Of course, these accounts are not mutually exclusive – a point to which we return below.

The results of Experiment 1 suggest that locally coherent structures, analyses that are ONLY locally coherent (in that they leave ungrammatical remnants), may be constructed more readily when they subsume an entire prosodic phrase. To test this possibility further, Experiment 2 tested the prosodic local coherence hypothesis using sentences adapted from Tabor et al. spoken with a prosodic boundary after the subject or after the main verb. In this experiment, subjects simply answered a question about the ambiguous string (underlined in (5)), indicating whether their interpretation was the illicit but locally coherent root clause analysis or the licensed and presumably grammatical reduced relative clause analysis. Explicitly testing how subjects interpreted the locally coherent phrases is crucial in order to rule out an alternate hypothesis in which subjects simply give up entirely when faced with an incongruent fragment, eliciting lower acceptability ratings for reasons orthogonal to the hypothesis in (3). If, as predicted, however, a late prosodic boundary makes a merely locally coherent parse more tempting as a stand-alone utterance, then subjects should likewise be tempted into an interpretation supporting the locally coherent parse.

The prosodic local coherence hypothesis predicts that listeners will be most likely to compute the ‘main clause’ analysis of the final clause when it corresponds to a prosodic phrase (i.e., when a prosodic boundary follows the preceding verb). Other sources of the globally ungrammatical main clause analysis will also be tested by looking at the number of main clause analyses when the prosodic phrasing does not encourage a main clause analysis, and by looking separately at examples where all native English speakers’ grammars permit the global analysis (to-datives) and cases where the global analysis is not grammatical for some speakers (for-datives, see below).

Method

Materials

The original Tabor et al. materials included seven sentences where the relative clause modified the object of a preposition (as in (5), with the preposition at), and thirteen where it modified a direct object of the verb. There were five containing for-datives, but most contained to-datives. The difference is whether a non-passive main clause version of the relative clause would use the preposition to, as in Someone tossed a frisbee to the player for (5), or for, as in Someone painted a picture for the artist for (7). Many examples had stereotypical agents as head of the reduced relative clause, which increases the temptation to interpret them as main clauses, such as the teacher taught, the carpenter cut, the professor taught.

The materials for the present experiment adapted many examples from Tabor et al., with the following specifications. Most DPs were definite, and the matrix subject was lengthened in order to make the presence of a prosodic break after the subject or after the verb natural. Only examples involving a direct object modified by a reduced relative were tested; we omitted relative clauses that modified objects of prepositions, because it is infelicitous to place a prosodic break between a preposition and its object. New examples were constructed along the same lines. A total of 16 sentences were constructed; eight of the examples were to-datives and eight were for-datives. Examples are shown in (6) and (7), respectively.

(6)
- a.
  (The kindergartners at school) (liked the little girl brought a toy)
- b.
  (The kindergartners at school liked) (the little girl brought a toy)
(7)
- a.
  (The people at the reception) (just met an artist painted a picture.)
- b.
  (The people at the reception just met) (an artist painted a picture.)

Each example was recorded in two forms. In the ‘early boundary’ versions (6a) and (7a), the subject corresponded to one prosodic phrase and the whole predicate appeared as its own prosodic phrase. In the ‘late boundary’ versions, (6b) and (7b), the subject and verb together appeared as one prosodic phrase, and the object was phrased by itself as a prosodic phrase.

The sentences were recorded and acoustically analyzed as in Experiment 1. The total duration of the final noun in the subject (school in (6), reception in (7)), including any silent pause following it, averaged 608 ms in the early boundary condition as compared to 409 ms in the late boundary condition. The corresponding values for the verb (liked in (6), met in (7)) were 505 ms in the early boundary condition and 688 ms in the late boundary condition. In addition to the greater length preceding a boundary, the pre-boundary items had generally greater pitch excursion than the items that did not precede a boundary (noun: 107 vs. 97 Hz, respectively; verb: 165 vs. 83 Hz). All prosodic boundaries were ip boundaries with a L- end. As in Experiment 1, the final clause was analyzed acoustically. In all cases, the clause was produced with a non-prominent H* accent on the subject and a more prominent H* accent on the final word or phrase, as shown in the pitch tracks in Figure 2. This time, there was some prosodic difference between conditions. The total pitch range averaged 98 Hz (SD = 16) for the early break condition, and 85 Hz (SD = 10 Hz) for the late break condition. Although the former is significantly (p < .01) larger than the latter by a t-test, footnote 4, in the Results section, presents evidence that this difference was not the source of the effects we will report. Durations of the final clause did not differ significantly across conditions, averaging 1507 ms (SD = 181 ms) and 1470 ms (SD = 157 ms) respectively (p > 0.15).

Figure 2. Example pitch tracks, Experiment 2. Panel 1, early boundary; Panel 2, late boundary.

Twenty-eight filler sentences and three practice items were also constructed and recorded. They included ten examples in which the input could only be analyzed as two separate sentences, plus 18 multi-clause single-sentence fillers, 12 of which had a ‘visiting relatives’ type ambiguity that permits questions to be sensibly asked about the thematic relations involved (see footnote 6 for results).

Subjects and Procedures

Sixty-five University of Massachusetts undergraduates receiving course extra credit were tested individually, in 15-minute sessions. Subjects were tested in a fashion similar to that used in Experiment 1, except that the program Linger was used (Rohde, 2003). The experimental items were combined with the filler items in two counterbalanced lists, so that each list contained equal numbers of early and late boundary items, and each item was tested in each form in one list. After listening to an item, the subject saw a question on a computer screen, and was given two paraphrases to choose between by pressing a computer keyboard key, as illustrated in (8). One paraphrase indicated the relations that would hold if the final DP-V-DP string was analyzed as a root sentence (8a), and one paraphrase corresponded to the relations in a reduced relative clause analysis (8b). The positions of the paraphrases were counterbalanced.

(8)
What happened?
- a.
  … the little girl brought a toy to someone.
- b.
  … someone brought a toy to the little girl.
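
The two-list counterbalancing described above can be sketched as a simple rotation of conditions across items. This is a hypothetical illustration only: the text does not say how items were actually assigned to lists, and the item indices and the alternation scheme used here are assumptions.

```python
N_ITEMS = 16  # number of experimental items in Experiment 2
CONDITIONS = ("early", "late")  # boundary positions


def build_lists(n_items=N_ITEMS, conditions=CONDITIONS):
    """Build one list per condition by rotating conditions across items.

    Hypothetical scheme: item i gets condition (i + list_index) mod k,
    so each item appears in each form in exactly one list, and each
    list has equal numbers of items per condition when k divides n_items.
    """
    k = len(conditions)
    return [
        {item: conditions[(item + list_index) % k] for item in range(n_items)}
        for list_index in range(k)
    ]


lists = build_lists()
for list_index, assignment in enumerate(lists, start=1):
    n_early = sum(1 for cond in assignment.values() if cond == "early")
    n_late = sum(1 for cond in assignment.values() if cond == "late")
    print(f"List {list_index}: {n_early} early, {n_late} late")
```

A rotation like this guarantees the two properties stated in the procedure (balanced conditions within a list, each item in each form across lists); any other assignment with those properties would serve equally well.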

Results

The percentages of root sentence (main clause) interpretations of the final DP-V-DP string appear in Table 3, together with their standard errors. Nearly three quarters of the time, people chose a main clause analysis of the final DP-V-DP string even when the prosodic break was early rather than immediately before the final string. A logistic mixed effects regression analysis of the proportions of interpretations was conducted with the fixed factors of boundary position and preposition type (to vs. for). Sum (ANOVA-style) contrasts were used, as in Experiment 1. Model comparison indicated that random intercepts of subjects and items provided the optimal model. The results of the analysis appear in Table 4. There were significantly more root sentence interpretations (8a) in the late than the early boundary condition, and more such interpretations for sentences with for-datives than with to-datives. The interaction did not approach significance.⁴

Table 3.

Mean proportions of root sentence interpretations of final clause, Experiment 2; overall and separated into to vs. for-dative sentences, standard error in parentheses.

Boundary	to-dative	for-dative	Overall mean
Early	0.62 (0.030)	0.81 (0.024)	0.72 (0.020)
Late	0.78 (0.026)	0.88 (0.020)	0.83 (0.016)

Table 4.

Parameters of logistic mixed-effects model analysis of proportions, Experiment 2.

Fixed effects	Estimate	Standard error	z-value	p-value

Intercept	2.03	0.32	6.42	< 0.001
Boundary	0.32	0.13	2.47	< 0.05
Preposition	−0.9	0.42	−2.14	< 0.05
Boundary*Preposition	−0.17	0.17	−0.97	n.s.
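
Because the estimates in Table 4 are on the log-odds scale, it can help to convert them to probabilities. The sketch below is a minimal illustration under stated assumptions: it takes the coding to be late = +1, early = -1 and to = +1, for = -1 (the coding direction is not given in the text; this choice matches the reported direction of the two main effects), and it evaluates only the fixed effects. The resulting values describe a subject and item with random intercepts of zero, so they need not match the raw proportions in Table 3.

```python
import math

# Fixed-effect estimates from Table 4 (log-odds scale).
COEF = {
    "intercept": 2.03,
    "boundary": 0.32,
    "preposition": -0.9,
    "interaction": -0.17,
}

# Assumed sum (+1/-1) coding; the direction is a guess, not stated in the paper.
BOUNDARY_CODE = {"late": 1, "early": -1}
PREP_CODE = {"to": 1, "for": -1}


def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def predicted_probability(boundary, preposition):
    """Fixed-effects-only probability of a root sentence interpretation."""
    b = BOUNDARY_CODE[boundary]
    p = PREP_CODE[preposition]
    log_odds = (
        COEF["intercept"]
        + COEF["boundary"] * b
        + COEF["preposition"] * p
        + COEF["interaction"] * b * p
    )
    return inv_logit(log_odds)


predictions = {
    (b, p): predicted_probability(b, p)
    for b in BOUNDARY_CODE
    for p in PREP_CODE
}
for (b, p), prob in predictions.items():
    print(f"{b} boundary, {p}-dative: {prob:.2f}")
```

Under this assumed coding the predicted probabilities are higher for late than for early boundaries and higher for for-datives than for to-datives, which is the ordering described in the Results.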

Discussion

Why should listeners compute ungrammatical analyses that allow the constituents of an entire prosodic phrase to correspond to a stand-alone sentence? Grouping in the input may guide syntactic analysis, as envisioned in the Sausage Machine or Tabor et al.’s Local Coherence Account 2 (see in particular Fodor & Frazier, 1980; see also Kjelgaard, Speer, & Dobroth, 1999). Further, when processing becomes difficult, the processor may focus on finding a correct analysis of the current material while previous material may decay in memory. The current prosodic phrase is likely to be an easy unit to retain in memory and analyze, enhancing the effect of preceding material decaying.

In addition to demonstrating an effect of prosodic phrasing, the data of Experiment 2 show a notably high percentage of ungrammatical root clause interpretations, regardless of the presence of a prosodic boundary, consistent with the observations of Tabor et al. (2004). A locally coherent analysis, which results in the ungrammatical root clause interpretation, may be attractive for a variety of reasons. These include the three proposed by Tabor et al., as discussed earlier: the weight given to local coherence in a self-organizing network, the possibility that a sentence is parsed in chunks, and the difficulty of reanalyzing the locally coherent analysis into a globally grammatical analysis. Additional possible reasons include the decay of earlier material after the final clause is read, the possible (mis)analysis of the matrix verb in some sentences as intransitive,⁵ and the encouragement presumably given by the presence of two-sentence filler items.

It is even possible that the reduced relative clause structure of the presumably-grammatical analysis is not available in the grammar of some participants. This possibility is supported by the observation that there were more root clause interpretations for the for-dative than for the to-dative sentences. Some speakers do not allow extraction of the first object of a double object structure for verbs that take a for-marked Beneficiary/Goal (Beckman, 1996), as shown in (9), while they do for to-marked goals, as seen in (10).

(9)
Pattern for many speakers for for-datives (Goal extraction ungrammatical in b–c.)
- a.
  The chef baked the girl a muffin.
- b.
  *Who did the chef bake a muffin?
- c.
  *The girl baked a muffin didn’t eat it.
(10)
Pattern for to-datives (Goal extraction grammatical in b–c)
- a.
  The manager awarded the salesman a prize.
- b.
  Who did the manager award a prize?
- c.
  The employee awarded a prize didn’t accept it.

The acceptability of the final DP-V-DP sequence as a sentence may have influenced how often people assigned the main clause analysis to this string. Unacceptability of sentences like (9c), where goal extraction is required to allow a relative clause to modify the head noun girl, implicates unacceptability of the putatively grammatical sentences used in Experiment 2. The effect of type of dative is brought home by looking at individual sentence data. The only examples that received fewer than 50% root clause interpretations were the early prosodic boundary renditions of: …addressed the woman offered a beer, …questioned the guest brought a drink, and …interviewed the manager awarded a prize. Compare these to the sentences with the most root clause interpretations: …applauded the chef broiled a steak, …just met an artist painted a picture, …watched the boy prepared a sandwich, …admonished the student baked a muffin. Part of this difference appears to be due to the fact that the former sentences are to-datives while the latter ones take for-phrases (as well as containing a possibly-intransitive verb in two cases; see footnote 5). The reduced relative clause analysis may be unavailable for some speakers in the latter case, and thus the locally coherent root clause structure is no worse than the intended reduced relative structure.

The main conclusion of Experiment 2, however, is that when the prosodic boundary was late, and thus the sentential analysis of the final DP-V-DP sequence would exhaust the entire prosodic phrase, there were significantly more root clause interpretations of the string than when the prosodic boundary was earlier. This result is consistent with the hypothesis in (3): comprehenders are more likely to consider a root clause analysis when the entire sequence is contained in a stand-alone prosodic phrase.
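The condition at issue in this conclusion, whether the candidate root clause exhausts the final prosodic phrase, can be made concrete with a minimal sketch. It assumes the parenthesis notation for prosodic phrases used in Appendix 2; the function names are hypothetical and introduced only for illustration. The sketch checks the structural condition alone and makes no claim about how listeners compute it.

```python
import re


def prosodic_phrases(bracketed: str) -> list[str]:
    """Return the contents of each parenthesized prosodic phrase, in order."""
    return [p.strip() for p in re.findall(r"\(([^()]*)\)", bracketed)]


def exhausts_final_phrase(bracketed: str, candidate: str) -> bool:
    """True if `candidate` is exactly the final prosodic phrase (case-insensitive)."""
    phrases = prosodic_phrases(bracketed)
    return bool(phrases) and phrases[-1].lower() == candidate.lower()


# Early- and late-boundary renditions of the first Appendix 2 item.
early = "(The kindergartners at school) (liked the little girl brought a toy)."
late = "(The kindergartners at school liked) (the little girl brought a toy)."
candidate = "the little girl brought a toy"  # final DP-V-DP sequence

print(exhausts_final_phrase(early, candidate))  # False
print(exhausts_final_phrase(late, candidate))   # True
```

Only the late-boundary rendition satisfies the condition, which is the rendition that drew significantly more root clause interpretations in Experiment 2.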

General Discussion

The results of Experiment 1 indicate that final embedded clauses (in a three-clause structure) may be misidentified as stand-alone sentences when the prosody encourages such an analysis and the analysis appears to be grammatically licensed (i.e., not prohibited by an overt complementizer signaling the embedding). These results thus establish the existence of a previously unknown local coherence structure, one that does not share the considerable processing complexity of reduced relative clause structures.

The results of Experiment 2 show that prosodic structuring is relevant to the local coherence effects reported by Tabor et al. (2004). The fact that the effects of prosodic phrasing appeared in Experiment 1, where there were few if any other sources of processing difficulty, as well as in Experiment 2, shows that the extreme difficulty of the Tabor et al. type sentences is not required for prosodic structuring to have its effect. We submit that the prosodic local coherence hypothesis applies to both experiments. However, we do not claim that Tabor et al.’s own results can be attributed totally to prosody. Those results were obtained using visual presentation, which entails that prosodic phrasing would have to be implicit (Fodor, 2002). They also involved a mixture of different kinds of sentences, including examples where the critical phrase was not likely to be preceded by a prosodic boundary because the critical phrase followed a preposition rather than being a direct object of the matrix verb.

The prosodic effects we report seem to us most naturally consistent with the account of local coherence effects that Tabor et al. (2004) dubbed the ‘Local Coherence Account 2.’ We believe that prosodic units may correspond to processing units, with priority given to finding analyses that are coherent within a unit when there is doubt or extreme difficulty. Prosodic grouping of material is a property of the input to a sentence processor, making it a likely candidate for a way to naturally break down large units like sentences into manageable sections. Further, Schafer (1997) has shown that Intonational Phrases, the larger prosodic units in English, are domains for semantic processing. However, the results of comparing for- and to-datives in our Experiment 2 suggest that Tabor et al.’s Local Coherence Account 3, on which some materials are outright ungrammatical, has some validity, at least for the for-dative results in Tabor et al.’s materials. But the boundary effect cannot be attributed solely to judgments of ungrammaticality, because a prosodic boundary effect appears at different levels of variables, like the type of dative and transitivity of verbs, that affect the proportion of root-clause interpretations. Further, the local coherence effects in Experiment 1 cannot be explained by this account alone, due to the relative ease and grammaticality of the sentences in that experiment.

Our data do not rule out a self-organizing network account (Tabor et al.’s Local Coherence Account 1), nor do they rule out any of a variety of other putative accounts for local coherence effects, including Bicknell, Levy, and Demberg’s (2010) extension of surprisal, or Hale’s (2011) rational parser. However, any such accounts would have to be extended in some way to include effects of prosodic phrasing.

Further, these accounts do not address the possibility that there are differences between ungrammatical remnants in different sentence positions. It may be important that locally coherent structures in existing demonstrations are sentence-final, while the ungrammatical remnants left behind when a locally coherent analysis is accepted are all sentence non-final. A listener can overlook the missing constituent in a non-final phrase by anticipating that it could arrive in the next phrase. If this next phrase appears to be a complete constituent on its own, the listener receives no evidence that triggers a reanalysis of the preceding incomplete phrase.

To take an example from Tabor et al. (2004), a listener might postpone completing the main clause structure The kindergartners liked, especially if it is terminated by a prosodic boundary, and assume that the missing constituent will appear soon. However, the listener’s attention could shift to the final constituent, the little girl brought a toy. If this material is contained in a prosodic phrase and can stand alone, the listener (following the prosodic local coherence hypothesis) could parse it as a root clause, overlooking the previously-noted missing constituent, possibly attributing its absence to a speech or memory error. However, if a sentence-final constituent is missing an obligatory constituent that is not available from earlier in the sentence, the listener cannot assume that later material will correct the problem. In this case, the listener will not accept a sentence as grammatical, even if a self-organizing network could create locally coherent analyses of earlier phrases.

The question also arises whether it is critical that the locally coherent analysis is a stand-alone root clause structure, as in the current materials, rather than some other syntactic unit. Consider (12):

(12)
(The men at the expensive shop) (adored ladies).

The final string, adored ladies, could be interpreted as a complete noun phrase. We suspect that this is not a tempting analysis even when it lines up with a prosodic phrase. If it turns out that locally coherent structures are limited to potential root structures, this would be revealing since the grammar for the most part constrains syntax within a sentence, not between different sentences. Hence, locally coherent analyses may emerge only in situations involving a potential root sentence precisely because the parser does not expect syntactic constraints to hold beyond a root structure. Of course, far more testing is required before we would have any confidence in our conjecture that locally coherent structures should only be potential root structures⁶.

In sum, the present results establish an effect of prosodic grouping on the prevalence of locally coherent analyses, but they also raise questions about whether locally coherent analyses may be restricted to examples involving a final substring that may be analyzed as a root clause. If it is true that local coherence analyses are limited primarily to string-final potential root clauses, this would fit naturally with a prosodic phrase-based parser: fit of the grammatical analysis of the current prosodic phrase with the analysis of prior prosodic phrases would be checked except possibly when the current phrase corresponded to a potential root sentence. Although a self-organizing system might behave in this fashion, we do not see why it would be limited in this way.

The existence of locally coherent analyses is interesting for a variety of reasons. It imposes some limit on the amount of top-down global analysis the processor engages in. It suggests that the parser does not necessarily attach each new input word into the current partial phrase marker before building new bottom-up structure connecting the new item with items subsequent to it. Further, the limits on locally coherent analyses are likely to inform us about the weaknesses and ‘blind spots’ of the sentence processing mechanism. Perhaps even more important is that locally coherent analyses highlight the problems which the processing mechanism solves effortlessly under most conditions. Identifying stand-alone structures may be one of those.

Acknowledgments

This project was supported in part by Grant Number HD18708 from NICHD to the University of Massachusetts. The contents of this paper are solely the responsibility of the authors and do not necessarily represent the official views of NICHD or NIH. We would like to thank Adina Galili for assistance in collecting the data and acoustically analyzing the materials.

Appendix 1

Martin says that Louise believes (that) her boss is an alien.
Emily fears that Mark thinks (that) the dog needed new clothes.
Debby reported that Robert claimed (that) telepathy is possible.
Isaac said that Olivia felt (that) the mayor was crooked.
Peter implied that Susan thought (that) the milk turned into wine.
Marge heard that Duncan imagined (that) the squirrels were talking.
Terry believes that Bruce fears (that) the bills won’t get paid.
Roseanne worried that Steve hinted (that) the roof was damaged.
Eddie suspects that Maria pretends (that) the neighbors are movie stars.
Tim said that Rachel suggested (that) communism was a good idea.
Monica implied that Trent claimed (that) his wife is a monkey.
Sharon reports that Bob doubts (that) his diet will work.
Fred worries that his behavior implies (that) Mary is a fool.
Tom thinks that Anne suspects (that) her husband is cheating.
George believed that May fantasized (that) her daughter was pregnant.
Angela worried that Fred argued (that) Obama is naïve.
Lucy assumed that Brian worried (that) his girlfriend wanted to get married.
Melinda feared that Charles imagined (that) Sue would fail.
Carlo said that Maria dreamed (that) Hillary won the election.
Tom mentioned that Laura hopes (that) Luis will marry soon.
Hugo says that Bella suspects (that) Alex is lying.
Phillip assumed that Anita hopes (that) Carla will leave soon.
Tamara implied that Paul dreamed (that) his job was on the moon.
Annie suspects that Jason assumes (that) his sister is a moron.
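
Each item above abbreviates two written forms, one with and one without the complementizer introducing the final clause. A minimal sketch of the expansion follows, assuming that the parenthesized "(that)" marks the optional complementizer; the function name and dictionary keys are hypothetical, and the prosodic manipulation is not represented.

```python
def expand_complementizer(item: str) -> dict[str, str]:
    """Expand an item containing '(that)' into its two written versions."""
    return {
        # Overt complementizer: drop only the parentheses.
        "that": item.replace("(that)", "that"),
        # That-less version: drop the marker and the space before it.
        "that-less": item.replace(" (that)", ""),
    }


versions = expand_complementizer(
    "Martin says that Louise believes (that) her boss is an alien."
)
for label, sentence in versions.items():
    print(f"{label}: {sentence}")
```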

Appendix 2

(The kindergartners at school) (liked the little girl brought a toy).

(The kindergartners at school liked) (the little girl brought a toy).
An elderly gentleman addressed the woman offered a beer.
Dr. Sanderson praised the professor taught Swahili.
The hotel owner questioned the guest brought a drink.
The football coach chastised the player tossed a frisbee.
The FBI questioned the congressman mailed a letter.
The math instructor scolded the teacher assigned a high grade.
The tv reporters interviewed the manager awarded a prize.
The people at the reception just met the artist painted a picture.
The preschool teacher congratulated the little boy bought a hat.
The old janitor really liked the young man rented an apartment.
The new school nurse admonished the student baked a muffin.
The anthropologist in Rome interviewed the woman knitted a shawl.
The Fed Ex deliveryman teased the accountant saved a coupon.
The appreciative co-workers applauded the chef broiled a steak.
The kindly policeman watched the boy prepared a sandwich.

Footnotes

¹

The same problem arises for sentence fragments (see, for example, Merchant et al., to appear). Indeed, the problem of whether some phrase or clause can or should be analyzed as a root structure is an extremely general one. Simple phrases can be root structures in contexts licensing ellipsis of the rest of the sentence, such as question answers: Who left? Mary.

²

However, there is a sizeable literature on the syntax of (embedded) root structures generally (e.g., Emonds, 1970, 1976; Hooper & Thompson, 1973, among many others). This literature concentrates on lexical items (like speaker-oriented adverbs, interjections, etc.) and grammatical rules (like Subject-Aux Inversion) that are licensed only in root structures, rather than on the prosodic cues of interest here.

³

This paper assumes the ToBI system of prosodic analysis and notation for English, as in Beckman & Elam (1997). The minus sign indicates the tone associated with the end of an intermediate phrase (ip), and the percent sign the one associated with the end of an Intonational Phrase (IPh).

⁴

When the centered pitch range of the final clause was added to the analysis, it did not have an effect on proportion of choices (z < 1.0) and the effect of boundary position remained fully significant.

⁵

As noted by a reviewer, three of the matrix verbs could have been intransitive (items 9, 15, and 16). Such an analysis could support a root clause interpretation of the final clause, with a missing period and initial capital. These items did have a notably high proportion of root sentence interpretations, averaging 91%. However, even after eliminating these three items, the pattern of data reported in the text remained.

⁶

Potential evidence for this claim comes from 12 filler items in Experiment 1 like (i). In one form (a), a prosodic break appeared early so that the prosodic phrase included material beyond the ambiguous phrase (visiting relatives at work). In the other form (b), the prosodic phrase included only the ambiguous material.

(i)
- a.
  (Everyone I talked to) (knew visiting relatives at work) can be difficult.
- b.
  (Everyone I talked to knew) (visiting relatives at work) can be difficult.

Participants favored the verbal analysis where relatives is the object of visit 67% of the time. If listeners are more likely to consider alternative analyses for material that exhausts a prosodic phrase than material that does not, then the unpreferred analysis of visiting relatives should be reported more often in (ib) than in (ia). Instead, there were 64% verbal interpretations of sentences like (ia) and 69% for sentences like (ib), a non-significant difference. This suggests that alternative, unlicensed, analyses of prosodically-delimited material may be limited to potential root sentences.

References

Beckman J. Double objects, definiteness, and extraction: A processing perspective. In: Dickey MW, Tunstall S, editors. University of Massachusetts Occasional Papers in Linguistics 19: Linguistics in the laboratory. Amherst, MA: GLSA, University of Massachusetts; 1996. pp. 27–70.
Beckman ME, Elam GA. Guidelines for ToBI labeling, version 3.0. Manuscript and accompanying voice materials. Ohio State University; 1997. <http://www.ling.ohio-state.edu/~tobi/ame_tobi/labelling_guide_v3.pdf>.
Beckman ME, Pierrehumbert JB. Intonational structure in Japanese and English. Phonology Yearbook. 1986;3:266–309.
Bicknell K, Levy R, Demberg V. Correcting the incorrect: Local coherence effects modeled with prior belief update. In: Proceedings of the 35th Annual Meeting of the Berkeley Linguistics Society. Berkeley, CA: Berkeley Linguistics Society; 2010. pp. 13–24.
Boersma P. PRAAT, a system for doing phonetics by computer. Glot International. 2001;5:341–345.
Carlson K, Frazier L, Clifton C., Jr. Intonational Phrase boundaries: A puzzle. In: Borowsky T, Kawahara S, Shinya T, Sugahara M, editors. Prosody Matters: Essays in Honour of Elisabeth Selkirk. London: Equinox Publishing; 2012. pp. 397–419.
Emonds J. Root and Structure-Preserving Transformations. Unpublished doctoral dissertation, MIT; 1970.
Emonds J. A Transformational Approach to English Syntax: Root, Structure-Preserving and Local Transformations. New York: Academic Press; 1976.
Fodor JD. Prosodic disambiguation in silent reading. In: Hirotani M, editor. Proceedings of the North East Linguistics Society. Vol. 32. Amherst, MA: GLSA; 2002. pp. 112–132.
Fodor JD, Frazier L. Is the human sentence parsing mechanism an ATN? Cognition. 1980;8:418–459. doi: 10.1016/0010-0277(80)90003-7.
Frazier L, Carlson K, Clifton C., Jr. Prosodic phrasing is central to language comprehension. Trends in Cognitive Sciences. 2006;10:244–249. doi: 10.1016/j.tics.2006.04.002.
Frazier L, Fodor JD. The sausage machine: A new two-stage parsing model. Cognition. 1978;6:291–326.
Hale J. What a rational parser would do. Cognitive Science. 2011;35:399–443.
Hooper J, Thompson S. On the Applicability of Root Transformations. Linguistic Inquiry. 1973;4:465–497.
Jun S-A. Prosodic typology. In: Jun S-A, editor. Prosodic Typology: The phonology of intonation and phrasing. Oxford: Oxford University Press; 2005. pp. 430–458.
Kiparsky P, Kiparsky C. Fact. In: Bierwisch M, Heidolph K, editors. Progress in Linguistics: A collection of papers. The Hague: Mouton; 1970. pp. 143–173.
Kjelgaard MM, Speer SR. Prosodic facilitation and interference in the resolution of temporary syntactic closure ambiguity. Journal of Memory and Language. 1999;40:153–194.
Konieczny L, Müller D, Hachmann W, Schwarzkopf S, Wolfer SA. Local syntactic coherence interpretation. Evidence from a visual world study. In: Taatgen N, van Rijn H, editors. Proceedings of the 31st Annual Conference of the Cognitive Science Society. Austin, TX: 2009. pp. 1133–1138.
Marcus M, Hindle D. Description theory and intonation boundaries. In: Altmann G, editor. Cognitive models of speech processing: Psycholinguistic and computational perspectives. Cambridge, MA: MIT Press; 1990. pp. 483–512.
Merchant J, Frazier L, Clifton C, Jr., Weskott T. Fragment answers to questions: A case of inaudible syntax. In: Goldstein L, editor. Context and Communication. Oxford: Oxford University Press; to appear.
R Development Core Team. A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2012.
Rohde D. Linger: A flexible program for language processing experiments. 2003. Retrieved from http://tedlab.mit.edu/~dr/Linger/
Schafer AJ. Prosodic parsing: The role of prosody in sentence comprehension. Unpublished doctoral dissertation, University of Massachusetts Amherst; 1997.
Silverman K. The Structure and Processing of Fundamental Frequency Contours. Unpublished doctoral dissertation, University of Cambridge; 1987.
Tabor W, Galantucci B, Richardson D. Effects of merely local syntactic coherence on sentence processing. Journal of Memory and Language. 2004;50:355–370.
Wagner M, Watson D. Experimental and theoretical advances in prosody: A review. Language and Cognitive Processes. 2010;25:905–945. doi: 10.1080/01690961003589492.
