Abstract
Three experiments tested theories of syntactic representation by assessing stem exchange errors (“hates the record” -> “records the hate”). Previous research has shown that in stem exchanges, speakers pronounce intended nouns (“REcord”) as verbs (“reCORD”), yielding syntactically well-formed utterances. By lexically based theories, resulting utterances are well-formed because speakers originally selected verbal forms (“reCORD”). By frame-based theories, resulting utterances are well-formed because independent syntactic frames compel conversion of intended nouns into verbs. Lexically based theories predict stem exchange errors should occur independently of syntactic context. Experiment 1 showed that speakers pronounced nouns as verbs only in utterances that required verbs; when utterances allowed nouns or verbs (“record and hate”), speakers pronounced nouns as nouns. Experiment 2 showed this was not an artifact of requiring specific utterance types. Experiment 3 ruled out a phonological influence over syntactic production. Consistent with frame-based theories, this evidence suggests that syntactic frames are abstract and independent.
Speakers produce speech remarkably accurately. What’s more, even when speakers’ production mechanisms do not perform accurately -- when speakers make speech errors -- the structures of their erroneous utterances are more often than not still grammatically well-formed. For example, consider an error like, “that log could use another fire,” where a speaker exchanges the position of two intended words. The utterance distorts the speaker’s intended meaning (fires need logs, not the reverse), but because the exchanged words are both nouns, the erroneous utterance has the same syntactic structure as the intended utterance and is therefore syntactically well-formed. In fact, at least 80% of such word-movement errors exchange words from the same syntactic category (Garrett, 1975; Stemberger, 1985) and so are syntactically well-formed despite the error. Similarly, when speakers make non-movement (sometimes called non-contextual) word-substitution errors (e.g., “White Anglo-Saxon prostitute”; Fromkin, 1973), the intended and substituting words come from the same syntactic category in about 95% of cases (Garrett, 1975; Stemberger, 1985). Thus, overwhelmingly, when speakers err in their production, it is their intended meanings that are compromised; their intended syntactic structures almost always remain intact.
The syntactic well-formedness of stem-exchange errors
A recent experimental demonstration reveals the power of the drive for syntactically well-formed production in the face of error. V. S. Ferreira and Humphreys (2001) explored a class of errors sometimes termed stem-exchange or stranding errors (Garrett, 1975). Stem-exchange errors often involve the exchange of words from different syntactic categories, sometimes stranding a suffix or inflection from one or both words. For example, a speaker who says “I roasted a cook” has exchanged an intended noun (“roast”) and verb (“cook”), stranding the past-tense marker (“-ed”). Because such errors involve exchanging words from different syntactic categories, they seemingly result in syntactically ill-formed utterances; at least with respect to speakers’ intended utterance, “I roasted a cook” has a noun spoken where a verb should have been and vice versa. V. S. Ferreira and Humphreys noted, however, that in many such cases, the intended words may belong to multiple syntactic categories; “roast” may have been intended as a noun, but it can also be a verb, and vice versa for “cook.” This raises the possibility that “I roasted a cook” is actually syntactically well-formed, provided that speakers produced the verb form of “roast” in the verb position and the noun form “cook” in the noun position.
Of course, it is impossible in such an error to determine whether the speaker produced the verb or noun form of “roast” (or “cook”), because the noun and verb forms of these words are pronounced identically. However, some words in English are pronounced differently when produced as nouns than as verbs. In particular, stress-shifting words like “record” are pronounced with first-syllable (or trochaic) stress when produced as nouns -- “REcord” (we use capitalization to indicate stress) -- but with second-syllable (or iambic) stress when produced as verbs -- “reCORD.” Thus, if speakers can be induced to make stem exchanges with such stress-shifting words, the syntactic category membership of the produced words -- and thus the syntactic well-formedness of the resulting utterance -- can be determined. To test this, V. S. Ferreira and Humphreys (2001) presented speakers with pairs of words like “tape” and “REcord” (note that auditory presentation ensures that speakers hear the noun form “REcord”), which they were to form into phrases like “taped the REcord.” Like the experiments reported here, the task included several features like time pressure and biased ordering of items specifically to encourage speakers to sometimes misorder the heard words, thereby yielding stem exchanges like “recorded the tape.” Results showed that when speakers produced such stem exchanges, they pronounced “record” with second-syllable stress (“reCORDed the tape”), indicating its use as a verb, much more often than with first-syllable stress (“REcorded the tape”), indicating its use as a noun, even though speakers originally heard “REcord.” In Experiment 1 of V. S. Ferreira and Humphreys, “reCORDed” pronunciations outnumbered “REcorded” 78-5, in Experiment 2, 27-3, and in Experiment 3, 42-8, for a total of 147-16, or a 90%-10% split. 
In short, despite exchanging words from different syntactic categories, speakers’ productions were still syntactically well-formed, because they produced words with syntactic category memberships that fit their syntactic context.
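The pooled split reported above follows directly from the per-experiment counts. As a minimal illustrative sketch (the variable names are ours and are not part of the original study), the tally can be reproduced as follows:

```python
# Counts of second-syllable ("reCORDed") vs. first-syllable ("REcorded")
# pronunciations in stem exchanges, as reported in the text for
# V. S. Ferreira and Humphreys (2001), Experiments 1-3.
counts = {
    "Experiment 1": (78, 5),
    "Experiment 2": (27, 3),
    "Experiment 3": (42, 8),
}

verb_total = sum(verb for verb, _ in counts.values())
noun_total = sum(noun for _, noun in counts.values())
verb_share = verb_total / (verb_total + noun_total)

print(f"{verb_total}-{noun_total}")      # 147-16
print(f"{verb_share:.0%} verb stress")   # 90% verb stress
```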
The objective of the experiments reported here is to determine the cognitive basis for the stress-shifting pattern observed by V. S. Ferreira and Humphreys (2001), which in turn should reveal the nature of the representational basis of speakers’ well-formed productions. Roughly speaking, errors like “I roasted a cook” or “reCORDed the tape” could have two root causes. One is that speakers may have intended to choose the noun form of “roast” or “record,” but ended up choosing the verb form, and having done so, committed themselves to using that form and producing a stem exchange. This is consistent with what are here termed lexically based theories of syntactic representation in production. The other possible cause of such errors is that speakers in fact chose the noun form of “roast” or “record,” but attempted to mention it in a position in a syntactic structure that can only accept verbs. The resulting mismatch could then compel a search or conversion of the originally retrieved noun into a verb form, yielding the stem exchange. This is consistent with what are here termed frame-based theories of syntactic representation in production. These two classes of theories are contrasted next.
Theories of syntactic representation in production
Theories of language production agree on the general outline of the language production process (for a recent review, see V. S. Ferreira & Slevc, 2007). Speakers begin with a nonlinguistic representation of what they intend to express, termed a message. This serves as input to a process termed grammatical encoding that selects and retrieves the linguistic features that can express the message. These linguistic features then guide phonological, phonetic, and eventually articulatory processes that ultimately create an utterance.
Within this general outline, it is grammatical encoding that determines the words in an utterance and the syntactic structures that will organize those words. And so, it is grammatical encoding that operates with either lexically based or frame-based syntactic representations. According to lexically based theories (e.g., F. Ferreira, 2000; MacDonald, Pearlmutter, & Seidenberg, 1994, describe a comprehension version), the syntactic organization of speakers’ utterances is determined by syntactic representations that are inherent parts of the lexical forms that license them. For example, the verb “put” requires a noun phrase and prepositional phrase argument, as in “put [the book] [on the table].” In lexically based theories, this requirement is represented by including as a part of the representation of the verb “put” these noun phrase and prepositional phrase arguments. So here, production proceeds with lexical forms being retrieved on the basis of speakers’ messages. As lexical forms are retrieved, they make available their associated syntactic features. The syntactic features of the different forms to be used in a particular utterance are then integrated through a unification-like process, such that the syntactic requirements of some lexical forms are satisfied by the syntactic features of other lexical forms. The result of this unification is a syntactically well-formed structure that can be sent to phonological, phonetic, and articulatory encoding.
In contrast, according to frame-based theories, the syntactic organization of speakers’ utterances is structured by lexically (as well as conceptually and phonologically) independent representations of syntactic structure sometimes called syntactic frames. Production proceeds with syntactic frames and lexical forms being separately retrieved on the basis of speakers’ messages, as well as mutual constraints among frames and lexical forms. Once retrieved, frames and lexical forms are integrated; this is typically viewed as an insertion process, such that lexical representations are inserted into slots in the selected syntactic frame. Once integrated, the syntactically structured lexical forms can be sent to phonological, phonetic, and articulatory encoding. [Note that the time course described for both types of theories is not meant to preclude interactive influences. For example, phonological processes can influence the retrieval of lexical forms (V. S. Ferreira & Griffin, 2003) or syntactic frames (Lee & Gibbons, 2007). Rather, this time course reflects the ‘global’ information flow (Dell & O'Seaghdha, 1991), still allowing for local cascading or feedback influences.]
The differences between lexically based and frame-based theories of syntactic production have a number of consequences. Lexically based theories easily explain certain lexical dependencies that syntactic production is subject to that require additional assumptions on the part of frame-based theories (e.g., speakers prefer to produce or must produce certain structures with certain lexical items; F. Ferreira, 1994). Frame-based theories easily explain certain lexically independent influences over syntactic production that require additional assumptions on the part of lexically based theories (e.g., speakers repeat syntactic structures from sentence to sentence even when the sentences do not share content words, Bock, 1986; this is so even from idiomatic phrases, Konopka & Bock, 2009). As such, neither type of theory clearly provides the best explanation for the range of syntactic production effects in the literature. Support from the nature of stem exchange errors should thus be informative.
Lexically based versus frame-based stem-exchange errors
As noted, given the evidence reported by V. S. Ferreira and Humphreys (2001), lexically based and frame-based theories can view stem exchange errors as coming about in different ways. Given the assumption of lexically based theories that syntactic retrieval is a form of lexical retrieval, they claim that stem exchanges like “I roasted a cook” or “reCORDed the tape” are ultimately lexical selection errors: A speaker who intended to retrieve the noun forms of “roast” or “record” erroneously selected the verb form. These verb forms bring with them the transitive structures they are produced with. Once the noun forms of the intended verbs are retrieved (“cook” and “tape”), the direct-object requirement of the verbs is satisfied by the noun phrase features of the noun, and the well-formed production results. (Note that an additional requirement is that the second to-be-produced word be a noun; within this account, this may require an influence from the syntactic features of the verb over the selection of the form class of the next word.)
Frame-based theories proceed differently. Here, the representation in the message that a transitive event is to be described causes retrieval of a transitive syntactic frame, consisting of a verb phrase including a verb slot with a direct object noun phrase, the latter including a noun slot. Meanwhile, on the basis of message features, lexical forms like “cook” (the verb) and “roast” (the noun) or “tape” (the verb) and “REcord” (the noun) are retrieved. At this point, an error can occur as a result of mistiming, if grammatical encoding aims to insert the noun (“roast” or “REcord”) into the first position of the syntactic frame. The point of a syntactic frame is to forbid such insertion (thereby enforcing well-formedness). However, to explain the well-formedness of stem-exchange errors that V. S. Ferreira and Humphreys (2001) observed, frame-based theories must claim that rather than simply halt production, grammatical encoding invokes a morphological conversion mechanism that retrieves a syntactically compatible form of the thwarted lexical form. So, the nouns “roast” and “REcord” cannot be inserted into verb slots, but if they can be morphologically converted into their verb forms (“roast” and “reCORD”), the insertion can proceed, resulting in the syntactically well-formed stem-exchange error.
Of course, this frame-based processing account is not meant only to allow for well-formed erroneous production. A similar series of processing steps may give grammatical encoding in general an important strategy for flexibly producing error-free utterances. Imagine a speaker who utters, “I am waiting for the…,” while, for whatever reason, retrieving the verb form “deliver.” Rather than producing a syntactic violation (“I am waiting for the deliver”) or reformulating (“I am waiting for the… for them to deliver it”), the speaker can invoke a morphological mechanism to convert “deliver” into “delivery,” allowing a well-formed utterance to result (“I am waiting for the delivery”). In general, lexical retrieval is likely to be an especially challenging aspect of language production; strategies that allow for flexible well-formed production in the face of lexical retrieval challenges are therefore likely to be cognitively valuable (Bock, 1982).
This description makes clear that the frame-based account of stem-exchange errors involves a number of additional mechanisms that the lexically based account does not. Without supporting evidence, the simpler lexically based alternative should be preferred. The experiments here were designed to seek such evidence. In all experiments, speakers were auditorily presented with pairs of words, and instructed to form them into sensible phrases. Speakers were instructed how to order the to-be-produced words with a visual instruction to REPEAT or SWAP, thereby mentioning them in the same or opposite order that speakers heard them (thereby providing an unambiguous correct response). The procedure was speeded, with speakers having one second to provide responses. Also, fillers heavily outnumbered critical trials, and fillers and criticals used opposite ordering instructions; this led to an overall bias against the critical-trial correct ordering. Together, the time pressure and order bias encouraged stem exchanges.
There are two important differences between the current experiments and those reported in V. S. Ferreira and Humphreys (2001). First, Ferreira and Humphreys asked speakers to produce phrases like “taped the record,” where the verb is a past-tense form. This introduces a confound such that upon exchanging, “reCORDed” is a legal English word (and so is lexically well-formed), but “REcorded” is not. This raises the possibility that at least some part of the effect observed by Ferreira and Humphreys may have been due to lexical well-formedness -- that speakers avoided (or edited out) the production of illegal forms like “REcorded.” Ferreira and Humphreys included an additional experimental condition to assess this possibility (which suggested that lexical well-formedness influences could not explain the entire set of effects that were observed). Here, we avoided this issue by removing the past tense feature from speakers’ productions. That is, speakers were instructed to produce expressions like “hate the record,” which would stem-exchange to “record the hate,” which includes only legal English words regardless of how it is pronounced (“REcord the hate” or “reCORD the hate”). Nonetheless, evidence from the first two experiments presented here will point to the possibility that the difference between “record the hate” and “recorded the hate” might influence the rate of stem-exchange errors we observe; a third experiment explored this further.
Also, speakers here were asked to produce expressions like “hate the record” rather than “tape the record” (used by V. S. Ferreira & Humphreys, 2001). The primary difference between these utterances is that the stem-exchanged form -- “record the hate” -- is not especially sensible, compared to “record the tape.” V. S. Ferreira and Humphreys wanted to maximize the likelihood of erroneous utterances, so when possible, they made the stem-exchanged forms seem sensible. However, this may have confused speakers as to what form they were actually supposed to produce, compromising the degree to which the exchanged forms were truly unintended. To avoid this issue, we designed utterances so that the stem-exchanged forms were relatively neutral in terms of overall sensibility.
Finally, it is well established that in English, two-syllable verbs tend to be pronounced with second-syllable stress and nouns with first-syllable stress, that English pronunciation patterns change in accord with this pressure, and that when pronouncing nonwords, speakers’ pronunciations follow this pattern (Kelly, 1989; Kelly & Bock, 1988). This implies that there might be a general tendency to produce any word with stress on the second syllable when it is used in the verb position of an utterance. To assess this, all experiments included both stress-shifting words (“REcord” vs. “reCORD”) and stress-constant words (compare the noun “PAtent” to the verb “PAtent”). The stress-constant words were meant to measure any general tendency to shift the stress of words erroneously produced in the verb position of sentences. Thus, the extent to which speakers shift stress more with stress-shifting versus stress-constant words should specifically assess speakers’ propensity to select the verb form of a stress-shifting word rather than the noun form.
Experiment 1
Experiment 1 determined whether stem exchanges are the result of a lexical misselection, consistent with lexically based theories, or of a syntactically compelled syntactic category shift, consistent with frame-based theories. Speakers in Experiment 1 produced phrases in two types of utterances. Syntactically constrained utterances were of the form “____ the ____” discussed thus far, where the first word must be a verb and the second a noun. Syntactically agnostic utterances were of the form “____ and ____,” where the first word can either be a verb or a noun, and then the second word must be of the same type. If speakers say “reCORD the hate” in syntactically constrained utterances because they misselected the verb form, they should do so to a similar extent in syntactically agnostic utterances (producing “reCORD and hate”), because speakers have the same opportunity to make the erroneous selection in either condition. But if speakers say “reCORD the hate” in syntactically constrained utterances because a syntactic frame compelled the conversion of “REcord” or the retrieval of “reCORD,” then in syntactically agnostic utterances, in which the frame need not compel such conversions or retrievals, speakers should produce the originally retrieved noun (producing “REcord and hate”).
Method
Participants
Forty-eight undergraduate students participated for course credit. One additional participant was excluded because he participated in the experiment twice, and two more were excluded for not following instructions.
Apparatus
All experiments were administered using PsyScope 1.2.5 (Cohen, MacWhinney, Flatt, & Provost, 1993) on Macintosh computers with 17-inch CRT monitors. Auditory stimuli were digitally recorded and presented through amplified external speakers adjacent to the monitors. Responses were recorded with headset microphones connected to cassette recorders.
Design and materials
Two manipulations were used: word type (stress-shifting and stress-constant) and utterance type (syntactically constrained and syntactically agnostic). Both stress-shifting words and stress-constant words were presented to participants with first-syllable stress (i.e., REcord; PAtent). Four experimental conditions were created by crossing the levels of the word type and the utterance type conditions. Word type was manipulated within subjects and between items and utterance type within subjects and within items, both in counterbalanced fashion. Each participant heard each experimental item in only one utterance-type condition. However, each experimental item appeared in both utterance-type conditions across subjects. Subjects were presented with 36 critical trials and 64 filler trials.
On each trial, participants heard two words followed by an instruction that indicated the order in which the words should be produced relative to the order in which they were presented. The order instructions were either SWAP or REPEAT. The SWAP instruction indicated that speakers should produce the two words in the opposite order from that in which they heard them. The REPEAT instruction indicated that speakers should produce the two words in the same order in which they heard them. Order of presentation of blocks (see below) was counterbalanced, resulting in four distinct groups (in all experiments, analyses of the effects of block order on stress shifting revealed no significant effects of or interactions with block order, all ps > .1).
Procedure
Participants were instructed that they would hear two words and that their task would be to produce those words in one of the two utterance formats. Utterance format was blocked such that speakers used one format for the first half of the experiment and the other format for the second half, allowing for faster presentation of stimuli (word type was not blocked). A computer-presented instruction of the form “Use _____ the _____” or “Use _____ and _____” was presented at the beginning of the experiment and halfway through to inform participants of which format to use.
On each trial, speakers first viewed a fixation point (a ‘+’ in the middle of the computer monitor) that appeared immediately at the start of a trial and remained on the monitor for 20 milliseconds. Two words (such as “hate” followed by “REcord”) were auditorily presented. The first word began immediately at the start of the trial. The second word began 250 milliseconds after the end of the first word. Next speakers were presented with an ordering instruction on the computer monitor (“SWAP” or “REPEAT”). The instruction began immediately after the end of the second word. It remained on the monitor for one second, and then the fixation point reappeared and the next trial began immediately.
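The trial timing just described can be summarized as a small schedule. The sketch below is illustrative only: the function and field names are ours, and the word durations passed in are hypothetical, since the durations of the recorded stimuli are not reported here.

```python
from dataclasses import dataclass

FIXATION_MS = 20          # fixation duration stated in the Procedure
INTER_WORD_GAP_MS = 250   # gap between the end of word 1 and the onset of word 2
INSTRUCTION_MS = 1000     # time the SWAP/REPEAT instruction stays on screen


@dataclass
class TrialTimeline:
    fixation_offset: int    # ms from trial start
    word2_onset: int
    instruction_onset: int
    trial_end: int


def trial_timeline(word1_ms: int, word2_ms: int) -> TrialTimeline:
    """Event times in ms from trial start; word durations are hypothetical inputs."""
    word2_onset = word1_ms + INTER_WORD_GAP_MS       # word 1 starts at time 0
    instruction_onset = word2_onset + word2_ms       # instruction follows word 2
    return TrialTimeline(
        fixation_offset=FIXATION_MS,
        word2_onset=word2_onset,
        instruction_onset=instruction_onset,
        trial_end=instruction_onset + INSTRUCTION_MS,
    )


# Hypothetical durations: 400 ms for "hate", 600 ms for "REcord".
print(trial_timeline(400, 600))
```

Under these assumed durations, the response window (the one second during which the instruction is visible) opens 1,250 ms into the trial.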
In all critical items as well as filler items, the visual instruction guided speakers to produce utterances such that the word designed to be a verb in the syntactically constrained unexchanged utterance (e.g., “hate”) was produced first and the word designed to be a noun second (e.g., “REcord”). Thus, the order instructions always compelled sensible and syntactically well-formed productions. All filler items used the SWAP instruction, whereas all critical items used the REPEAT instruction. The experiment began with ten practice items, five with a syntactically constrained format (“___ the ___”) and five with a syntactically agnostic format (“___ and ___”).
Scoring and analysis
Tape recordings were transcribed and the utterances coded for accuracy and stress pattern. A response was coded as correct if the speaker produced both words in the correct order in the correct format. Any trial where the second word was produced first (e.g., “record the/and hate”) was coded as an exchange regardless of whether the entire phrase was produced, as long as the entire word was produced (so that its stress pattern could be discerned). Trials in which speakers did not complete the first word, did not use the correct format, or simply said the two words one after the other without any additional words were coded as unanalyzable and not included in the analyses.
All trials coded as exchanges were later digitized and the stress-shifting words and stress-constant words were excised from their inflectional and syntactic context to remove any material that might influence a stress judgment. The first author and a research assistant who was blind to the experimental hypotheses listened to the excised words and independently coded whether they were produced with first or second syllable stress. Occurrences of exchanges were sufficiently obvious and are not theoretically informative for current questions, so a second set of codings on this measure was not collected.
Statistical analyses
We conducted 2 × 2 ANOVAs using subjects (F1) and items (F2) as random factors. The independent variables were target word type (stress-shifting or stress-constant) and utterance type (syntactically constrained or syntactically agnostic). In the primary analysis, the dependent variable was the mean number of stems in stem exchanges per speaker or per item that were pronounced with second-syllable stress. In a secondary analysis, the dependent variable was the mean number of stem exchanges per speaker or per item that were produced. These ANOVAs were conducted twice, once on each judge’s codings. In every case, patterns of significance and nonsignificance were identical in the two sets of codings, and so we report results only from author codings. We report variability with repeated-measures 95% confidence-interval halfwidths (CIs) based on single degree-of-freedom pairwise comparisons (Loftus & Masson, 1994) and a min-F’ criterion (Clark, 1973) in Table 2. We report subject CIs in the text, and subject and item CIs in tables.
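For readers unfamiliar with the min-F’ criterion, the following sketch shows the standard computation from Clark (1973), which combines the by-subject and by-item F ratios. The function name and arguments are ours, and we assume (rather than know) that the tabled values were obtained this way; applied to the rounded F values for the Experiment 1 word-type effect on stress shifts, it yields the corresponding Table 2 entry.

```python
def min_f_prime(f1, df1_error, f2, df2_error):
    """Return (min F', denominator df) following Clark (1973).

    f1, df1_error: by-subject F and its error degrees of freedom
    f2, df2_error: by-item F and its error degrees of freedom
    """
    value = (f1 * f2) / (f1 + f2)
    denom_df = (f1 + f2) ** 2 / (f1 ** 2 / df2_error + f2 ** 2 / df1_error)
    return value, denom_df


# Word-type effect on stress shifts, Experiment 1 (Table 2):
# F1(1,47) = 9.6 and F2(1,34) = 4.7.
value, denom_df = min_f_prime(9.6, 47, 4.7, 34)
print(f"min F'(1,{denom_df:.0f}) = {value:.1f}")   # min F'(1,64) = 3.2
```

Because min-F’ can never exceed the smaller of the two F ratios, it is a conservative test, which is why some effects that are significant by subjects and by items are only marginal by min-F’.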
Table 2.
ANOVAs on Numbers of Stress Shifts and Stem Exchanges for All Experiments
| | Stress Shifts: df | Stress Shifts: F | Stress Shifts: p | Stress Shifts: CI | Exchanges: df | Exchanges: F | Exchanges: p | Exchanges: CI |
|---|---|---|---|---|---|---|---|---|
| Experiment 1 | | | | | | | | |
| Word type | | | | | | | | |
| By subject | 1,47 | 9.6 | <.01 | 0.10 | 1,47 | 9.4 | <.01 | 0.21 |
| By item | 1,34 | 4.7 | <.03 | 0.37 | 1,34 | 2.5 | n.s. | 1.10 |
| MinF' | 1,64 | 3.2 | <.08 | | 1,52 | 1.97 | n.s. | |
| Frame type | | | | | | | | |
| By subject | 1,47 | 8.3 | <.01 | 0.09 | 1,47 | 12.8 | <.01 | 0.33 |
| By item | 1,34 | 6.4 | <.02 | 0.27 | 1,34 | 10.9 | <.01 | 0.98 |
| MinF' | 1,75 | 3.6 | <.06 | | 1,76 | 5.89 | <.02 | |
| Word × frame | | | | | | | | |
| By subject | 1,47 | 4.5 | <.04 | 0.14 | 1,47 | 2.5 | n.s. | 0.39 |
| By item | 1,34 | 4.5 | <.04 | 0.38 | 1,34 | 1.5 | n.s. | 1.38 |
| MinF' | 1,79 | 2.3 | n.s. | | 1,69 | | n.s. | |
| Experiment 2 | | | | | | | | |
| Word type | | | | | | | | |
| By subject | 1,71 | 16.3 | <.01 | 0.13 | 1,71 | 19.7 | <.01 | 0.18 |
| By item | 1,34 | 7.7 | <.01 | 0.74 | 1,34 | 2.09 | n.s. | 2.31 |
| MinF' | 1,79 | 5.2 | <.03 | | 1,41 | 1.89 | n.s. | |
| Frame type | | | | | | | | |
| By subject | 1,71 | <1 | n.s. | 0.10 | 1,71 | 3.62 | <.07 | 0.23 |
| By item | 1,34 | <1 | n.s. | 0.38 | 1,34 | 4.08 | <.06 | 0.87 |
| MinF' | 1,164 | <1 | n.s. | | 1,96 | 1.92 | n.s. | |
| Word × frame | | | | | | | | |
| By subject | 1,71 | 1.5 | n.s. | 0.13 | 1,71 | <1 | n.s. | 0.20 |
| By item | 1,34 | 1.1 | n.s. | 0.54 | 1,34 | <1 | n.s. | 2.49 |
| MinF' | 1,74 | <1 | n.s. | | 1,97 | <1 | n.s. | |
| Experiment 3 | | | | | | | | |
| Word type | | | | | | | | |
| By subject | 1,86 | 43.6 | <.01 | 0.10 | 1,86 | 11.8 | <.01 | 0.20 |
| By item | 1,34 | 19.5 | <.01 | 0.73 | 1,34 | 1.9 | n.s. | 2.49 |
| MinF' | 1,66 | 13.4 | <.01 | | 1,45 | <1 | n.s. | |
| Frame type | | | | | | | | |
| By subject | 1,86 | <1 | n.s. | 0.09 | 1,86 | <1 | n.s. | 0.30 |
| By item | 1,34 | <1 | n.s. | 1.10 | 1,34 | <1 | n.s. | 1.18 |
| MinF' | 1,113 | <1 | n.s. | | 1,113 | <1 | n.s. | |
| Word × frame | | | | | | | | |
| By subject | 1,86 | <1 | n.s. | 0.13 | 1,86 | <1 | n.s. | 0.30 |
| By item | 1,34 | 1.2 | n.s. | 0.50 | 1,34 | <1 | n.s. | 1.67 |
| MinF' | 1,113 | <1 | n.s. | | 1,88 | <1 | n.s. | |
Note. “CI” refers to 95% confidence interval halfwidths.
Results and discussion
The results of Experiment 1 are shown in Figure 1. Each total column in the graph indicates the total number of stem exchanges speakers produced in each experimental condition. Within a column, the different shaded areas reveal the total number of critical (stress-shifting or stress-constant) words that speakers pronounced with first- and second-syllable stress (and when stress could not be ascertained).
Figure 1.
Number of stress-shifting and stress-constant stems pronounced with first- and second-syllable stress when produced in syntactically constrained (“___ the ___”) and syntactically agnostic (“___ and ___”) frames. “Other” indicates stems that could not be assessed for stress. Values in parentheses indicate first-block totals.
With respect to distinguishing lexically based versus frame-based theories, the most important results concern the patterns of stress speakers produced in stem exchanges. The key observation to take from Figure 1 is that, considering stem exchanges for which stress could be measured, speakers made exchanges like “reCORD the hate” in 14 of 32 measurable exchanges (the black area in the first bar), but they made exchanges like “reCORD and hate” in just 2 of 20 measurable exchanges (the black area in the third bar). This shows that speakers often pronounced exchanged stems heard as nouns with stress patterns indicating they were verbs, but only in syntactically constrained utterances. Note too that speakers almost never pronounced exchanged stress-constant stems with second-syllable stress, even in syntactically constrained utterances: They produced utterances like “PAtent the award” 56 times, whereas they produced utterances like “paTENT the award” only twice. This shows that the second-syllable pronunciations observed with stress-shifting stems (“reCORD”) were not due to any general tendency to shift stress of any produced word in verb position.
Statistical analyses of the mean numbers of stem-exchanges pronounced with second-syllable stress in each experimental condition confirm this pattern (the means are shown in Table 1 and the statistical analyses in Table 2). Speakers overall pronounced exchanged stress-shifting stems with second-syllable stress (0.18 stems per subject) more than stress-constant stems (0.03 stems per subject, CI = ±0.10 stems). They also pronounced exchanged stems with second-syllable stress overall more in syntactically constrained utterances (0.17 stems per subject) than in syntactically agnostic utterances (0.04 stems per subject, CI = ±0.09 stems). Most important is that the difference in second-syllable pronunciations between exchanged stress-shifting and stress-constant stems was significantly greater in syntactically constrained utterances (0.29 vs. 0.04, a difference of 0.25 stems per subject) than syntactically agnostic utterances (0.06 vs 0.02, a difference of 0.04 stems per subject; interaction CI =±0.14 stems).
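The marginal means and the interaction contrast quoted above can be recovered from the four Experiment 1 cell means in Table 1. The following sketch is illustrative (the names are ours); because it starts from cell means that are already rounded to two decimals, it agrees with the reported marginals only up to rounding (for example, 0.175 versus the reported 0.18).

```python
# Mean second-syllable pronunciations per subject, Experiment 1 (Table 1).
cells = {
    ("constrained", "stress-shifting"): 0.29,
    ("constrained", "stress-constant"): 0.04,
    ("agnostic", "stress-shifting"): 0.06,
    ("agnostic", "stress-constant"): 0.02,
}


def marginal(level, position):
    """Average the cell means whose key has `level` at index `position`."""
    values = [m for key, m in cells.items() if key[position] == level]
    return sum(values) / len(values)


def word_type_difference(utterance):
    """Stress-shifting minus stress-constant mean within one utterance type."""
    return (cells[(utterance, "stress-shifting")]
            - cells[(utterance, "stress-constant")])


# Main-effect marginals (word type is index 1, utterance type is index 0).
print("stress-shifting:", marginal("stress-shifting", 1))
print("stress-constant:", marginal("stress-constant", 1))
print("constrained:", marginal("constrained", 0))
print("agnostic:", marginal("agnostic", 0))

# Interaction: difference of the word-type differences across utterance types.
interaction = word_type_difference("constrained") - word_type_difference("agnostic")
print("interaction contrast:", interaction)
```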
Table 1.
Mean Number of Stress Shifts and Stem Exchanges per Subject for Each Word Type and Frame Type Condition in All Experiments
| Condition | Stress Shifts: Stress-Shift | Stress Shifts: Stress-Constant | Exchanges: Stress-Shift | Exchanges: Stress-Constant |
|---|---|---|---|---|
| Exp. 1 | | | | |
| Syntactically Constrained | 0.29 | 0.04 | 1.02 | 1.56 |
| Syntactically Agnostic | 0.06 | 0.02 | 0.65 | 0.75 |
| Exp. 2 | | | | |
| Syntactically Constrained | 0.32 | 0.01 | 1.07 | 1.47 |
| No Frame | 0.29 | 0.11 | 0.85 | 1.26 |
| Exp. 3 | | | | |
| Bare Frame | 0.37 | 0.00 | 0.91 | 1.29 |
| Inflected Frame | 0.31 | 0.02 | 1.03 | 1.36 |
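The Experiment 1 interaction can be recomputed from the reported means as a difference of differences. The following is a minimal sketch and not the analysis code used in the study: the variable and function names are illustrative, the means are copied from the Experiment 1 rows of Table 1, and the decision rule (an effect is treated as reliable when it exceeds the reported ± interval) is an assumption about how the reported confidence intervals are meant to be read.

```python
# Experiment 1 mean stress shifts per subject, copied from Table 1.
means = {
    ("constrained", "stress_shifting"): 0.29,
    ("constrained", "stress_constant"): 0.04,
    ("agnostic", "stress_shifting"): 0.06,
    ("agnostic", "stress_constant"): 0.02,
}

# Reported in the text as "interaction CI = ±0.14 stems".
INTERACTION_CI_HALF_WIDTH = 0.14


def word_type_difference(frame):
    """Stress-shifting minus stress-constant mean for one utterance format."""
    shifting = means[(frame, "stress_shifting")]
    constant = means[(frame, "stress_constant")]
    return round(shifting - constant, 2)


constrained_diff = word_type_difference("constrained")
agnostic_diff = word_type_difference("agnostic")

# Interaction contrast: how much larger the word-type effect is
# in constrained utterances than in agnostic utterances.
interaction = round(constrained_diff - agnostic_diff, 2)

# Assumed decision rule: the contrast is reliable if it exceeds the half-width.
exceeds_interval = interaction > INTERACTION_CI_HALF_WIDTH

print(constrained_diff, agnostic_diff, interaction, exceeds_interval)
```

Under these assumptions, the word-type effect is 0.25 stems per subject in constrained utterances and 0.04 in agnostic utterances, and the resulting contrast is larger than the reported interval.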
The pattern of stress production thus confirms the prediction of frame-based theories and disconfirms the prediction of lexically based theories. According to lexically based theories, stress-shifting stems should be pronounced with verb (second-syllable) stress as often in syntactically agnostic utterances as in syntactically constrained utterances, because in either case, the exchange results from incorrectly selecting the verb form of the stress-shifting stem. Instead, it appears that syntactic constraint -- a syntactically constraining frame -- is needed to observe a substantial number of verb productions when a stem exchange occurs.
Figure 1 also reveals differences in the overall numbers of stem exchanges observed in the four experimental conditions (regardless of stress pronunciation). These differences do not distinguish between the competing classes of theories under consideration here, but in any case are likely due to uncontrolled differences among the experimental conditions. Most critical about such differences for present purposes is that speakers can only pronounce a stem with second-syllable stress if there is an actual stem exchange; thus, the more exchanges observed in a condition, the more opportunities speakers have to produce second-syllable stress. This ends up not being an issue here (nor in the other experiments), however. First, critical comparisons are within syntactically constrained and syntactically agnostic utterances; thus, differences in the rate of stem exchanges with the different utterance formats do not compromise the primary comparisons of interest. Second, note that if anything, speakers exchanged stems less in the stress-shifting conditions than in the stress-constant conditions. This implies that speakers have more opportunities for second-syllable pronunciations of exchanged stems with stress-constant stems. Yet, at least in syntactically constrained utterances, the opposite is observed, with exchanged stems pronounced with second-syllable stress more with stress-shifting stems. (One way to equalize for opportunity to shift stress is to compute the proportion of stem exchanges that were produced with a stress shift. Most problematic about this approach, however, is that any subject who did not produce at least one stem exchange in each experimental condition must be excluded from the analysis, as they will have an undefined proportion [0 out of 0] in one condition of the within-subjects design. This would exclude 40 out of 48 subjects in the current experiment, so we opted for the present approach.)
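The problem with the proportion-based alternative can be made concrete. The sketch below is illustrative only: the subject identifiers, condition labels, and counts are hypothetical and are not data from the experiment. It shows how a subject with zero exchanges in any one condition has an undefined proportion (0 out of 0) and therefore drops out of a fully within-subjects analysis.

```python
def stress_shift_proportions(subject_counts):
    """Return per-condition proportions for each subject.

    A subject maps to None when any condition has zero exchanges,
    because the proportion for that condition (0 of 0) is undefined.
    """
    results = {}
    for subject, conditions in subject_counts.items():
        if any(exchanges == 0 for _, exchanges in conditions.values()):
            results[subject] = None  # excluded from the within-subjects analysis
            continue
        results[subject] = {
            condition: shifts / exchanges
            for condition, (shifts, exchanges) in conditions.items()
        }
    return results


# Hypothetical counts per condition: (stress shifts, stem exchanges).
example_counts = {
    "s01": {
        "constrained_shifting": (1, 2),
        "constrained_constant": (0, 3),
        "agnostic_shifting": (0, 1),
        "agnostic_constant": (0, 1),
    },
    "s02": {
        "constrained_shifting": (1, 1),
        "constrained_constant": (0, 2),
        "agnostic_shifting": (0, 0),  # no exchanges, so no defined proportion
        "agnostic_constant": (0, 1),
    },
}

proportions = stress_shift_proportions(example_counts)
excluded = [subject for subject, value in proportions.items() if value is None]
print(excluded)
```

Analyzing raw counts per subject, as done here, avoids this exclusion because a subject with no exchanges in a condition simply contributes a count of zero.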
For completeness, we analyzed the number of exchanges per subject (with means in Table 1, statistical analyses in Table 2). Speakers exchanged stems more in syntactically constrained (1.29 exchanges per subject) than in syntactically agnostic utterances (0.70 exchanges per subject), a difference that was significant (CI = ±0.33 exchanges per subject). Speakers tended to exchange stress-constant stems (1.16 exchanges per subject) more than stress-shifting stems (0.83 exchanges per subject), a difference that was significant only by speakers (CI = ±0.21 exchanges per subject). The difference in exchange rate between word types was not statistically different in syntactically constrained utterances (1.02 stress-shifting vs. 1.56 stress-constant exchanges per subject, a difference of 0.54) and syntactically agnostic utterances (0.65 stress-shifting vs. 0.75 stress-constant exchanges per subject, a difference of 0.10; CI = ±0.39 exchanges per subject).
The results of Experiment 1 thus support frame-based theories and fail to support lexically based theories of syntactic production. However, three points about Experiment 1 are important to note. First, it is important to emphasize that Experiment 1 confirms the frame-based prediction (and does not confirm the lexically based prediction) under the assumption (stated in the introduction to Experiment 1) that factors other than the restrictiveness of the frames did not influence the likelihood of selecting the verb form “reCORD.” One possibility is that in the syntactically agnostic condition, speakers might be biased to select the noun form of the non-critical word (“hate”), which would require that the noun form of the critical word (“REcord”) be selected (because conjunctions require two like conjuncts). Another possibility is that the instruction to use a “verb the noun” format in the syntactically constrained condition may have increased the likelihood of verb selection across the entire experimental session, meaning that the syntactically agnostic condition would come with a lower likelihood of verb selection. Yet another possibility is that “noun and noun” conjunctions are more common than “verb and verb” conjunctions in general, which could influence speakers’ selection likelihoods. All of these possibilities amount to different top-down influences on the likelihood of noun versus verb selection of the critical forms. However, it is important to recall the lexically based account that Experiment 1 was designed to assess: A speech error occurs because an erroneous lexical selection is sufficient to violate the presumably strongest of top-down influences, namely, a speaker’s intention to produce a particular (accurate) utterance. 
These other task-based top-down influences may operate as well, but it is an open question whether they would be powerful enough to overcome the force of the lexical misselection that the lexically based account hypothesizes in the first place.
Second, though about half of the stems speakers exchanged in the stress-shifting condition were pronounced with second-syllable stress (for stems for which stress could be assessed), this is less than was observed by V. S. Ferreira and Humphreys (2001), where stress-shifting words in comparable utterances were pronounced with second-syllable stress about 90% of the time. We set aside this difference for the moment, discussing it below when we introduce Experiment 3.
Third, note that in some sense, speakers in Experiment 1 were specifically instructed to use syntactic frames. That is, within the experiment, speakers were instructed to produce utterances of the form “____ the ____” or “____ and ____.” The explicit instruction to produce utterances with such a format may have artifactually compelled frame-based behavior. Experiment 2 assessed this possibility by comparing the critical syntactically constrained condition in Experiment 1 to a no-frame condition. In the no-frame condition, speakers were instructed simply to assemble grammatical utterances of any type. Furthermore, to promote syntactic variation (so that speakers would not settle into a single syntactic frame anyway), we included new fillers that required different sorts of utterance formats (e.g., “chair” and “sit” followed by SWAP, requiring “sit on the chair”). Thus, in the syntactically constrained condition, as in Experiment 1, the experimenter (hypothetically) provided the syntactic frame; in the no-frame condition, speakers “brought their own” syntactic frame, assuming they have them. If the pattern of stress production observed in Experiment 1 was due to the experimenter-provided frame, we should see a smaller difference in second-syllable-stress pronunciations in the no-frame condition than in the syntactically constrained condition.
Experiment 2
Method
Participants
Seventy-two undergraduate students participated. One additional participant was excluded because she or he could not react quickly enough to perform the task. (More subjects were run in Experiment 2 and Experiment 3 to maximize power for detecting an interaction. Note Experiment 1 did detect an interaction even with fewer subjects.)
Materials and design
Two manipulations were used: word type (stress-shifting and stress-constant) and utterance format (syntactically constrained frame versus no frame). Four experimental conditions were created by crossing the levels of the word type and the utterance format conditions. Word type was manipulated within subjects and between items, and utterance format within subjects and within items, all in counterbalanced fashion. The proportion of critical trials to filler trials was the same as in Experiment 1.
Procedure
Experiment 2 employed the same procedure as the previous experiment. In the no-frame condition, speakers were instructed to “say the words in a complete phrase, such as ‘run in an experiment,’” in the order instructed.
Statistical Analyses
The same sets of analyses were conducted as in the previous experiment.
Results and discussion
The results of Experiment 2 are shown in Figure 2. Speakers pronounced exchanged stress-shifting stems with second-syllable stress about equally in syntactically constrained and in no-frame utterances, producing utterances like “reCORD the hate” in 22 of 60 measurable exchanges in syntactically constrained utterances versus 22 of 51 measurable exchanges in no-frame utterances. This suggests that the (hypothetical) experimenter-provided frame does not unduly exaggerate speakers’ tendency to pronounce stress-shifting words with the second-syllable stress indicative of their use as verbs. Again, the tendency to produce exchanged stems with second-syllable stress is not general, as it did not occur with stress-constant stems: Speakers rarely pronounced exchanged stress-constant stems with second-syllable stress, saying “paTENT the award” in 1 of 89 measurable exchanges in syntactically constrained utterances and 8 of 82 measurable exchanges in no-frame utterances.
Figure 2.
Number of stress-shifting and stress-constant stems pronounced with first- and second-syllable stress when produced in experimenter-determined (“___ the ___”) or undetermined (“produce a complete phrase”) frames. “Other” indicates stems that could not be assessed for stress. Values in parentheses indicate first-block totals.
Statistical analyses of the mean number of second-syllable pronunciations of exchanged stems confirmed these observations (means are shown in Table 1 and statistical results in Table 2). Speakers pronounced exchanged stems with second-syllable stress significantly more when stems were stress shifting (0.31 stems per subject) than when stems were stress-constant (0.06 stems per subject; CI = ±0.13 stems per subject). They pronounced exchanged stems with second-syllable stress about equally overall in syntactically constrained (0.17 stems per subject) and in no-frame utterances (0.20 stems per subject; CI = ±0.10 stems per subject). Speakers’ tendency to pronounce stress-shifting stems with second-syllable stress more than stress-constant stems was not statistically different in syntactically constrained utterances (0.32 vs. 0.01, a difference of 0.31 stems per subject) and in no-frame utterances (0.29 vs. 0.11, a difference of 0.18 stems per subject; CI = ±0.13 stems per subject).
Though numerically, the difference in number of second-syllable pronunciations between stress-shifting and stress-constant words in the syntactically constrained condition was larger (0.31 stems per subject) than in the no-frame condition (0.18 stems per subject), two considerations suggest that this difference does not reflect any actual influence of the utterance format on performance. First, the interaction assessing this effect did not approach significance (with an F1 of 1.5 and an F2 of 1.1). Second, note that with stress-shifting words, speakers pronounced exchanged stress-shifting stems with second-syllable stress about equally in the syntactically constrained (0.32 stems per subject) and the no-frame conditions (0.29 stems per subject). Thus, the above-noted difference is due to differences in the control condition (0.01 vs. 0.11), not in the stress-shifting condition. Recall that the point of testing the stress-constant stems is to assess any general tendency to shift stress when exchanging stems. It is not clear why such a tendency would be greater in the no-frame condition, suggesting that this difference is due to random variation. Accordingly, it seems reasonable to conclude that performance is similar in the two utterance-format conditions.
As in Experiment 1, speakers tended to exchange stress-constant stems overall more often (197 exchanges) than stress-shifting stems (138 exchanges). Again however, this difference only allowed more opportunities for stress-constant exchanged stems to be pronounced with second-syllable stress, yet they were still almost never pronounced with second-syllable stress. Overall exchange rates were about equal in the two utterance format conditions (183 exchanges in the syntactically constrained condition vs. 152 in the no-frame condition), suggesting no appreciable opportunity differences in the two conditions. Statistical analyses on number of exchanges per subject confirm that speakers exchanged stress-constant stems significantly more often (1.37 exchanges per subject) than stress-shifting ones (0.96 exchanges per subject; CI = ±0.18 exchanges per subject), and that they exchanged stems about equally in the syntactically constrained condition (1.27 exchanges per subject) and the no-frame condition (1.06 exchanges per subject; CI = ±0.23 exchanges per subject). The difference in number of stress-constant vs. stress-shifting stem exchanges was about equal in the syntactically constrained condition (1.47 vs. 1.07 exchanges per subject, a difference of 0.40) and no-frame condition (1.26 vs. 0.85 exchanges per subject, a difference of 0.41; interaction CI = ±0.20 exchanges per subject), leading the interaction between these factors to be nonsignificant.
In the no-frame condition, speakers produced utterances other than “____ the ____” on 141 of 1296 trials. Many of these substituted “a” for “the.” Some inserted prepositions (“call on the audit”). Some were even more creative (“try to have an accent”). Departures from “____ the ____” occurred more when the no-frame block was first, suggesting that when the syntactically constrained block was first, it primed (consciously or otherwise) the use of the frame in the second block. Note, however, that the critical pattern of stress shifting is still evident even in the first block in the no-frame condition (see Figure 2).
Thus, overall, speakers showed no greater tendency to pronounce exchanged stress-shifting stems with second-syllable stress when the experimenter provided a (hypothetical) frame, compared to when there was no experimenter-provided frame. This suggests that the similar level of second-syllable stress production in the syntactically constrained condition of Experiment 1 was not an artifact of the experimentally provided instructions. And so the support for the frame-based over the lexically based prediction from Experiment 1 remains.
Again in Experiment 2, however, the rate of pronunciation of exchanged stems with second-syllable stress (for measurable stems), though significantly greater than corresponding control conditions, was appreciably less than in V. S. Ferreira and Humphreys (2001). The reason for this difference likely lies in the different utterance types speakers produced in the two series of experiments. Specifically, in the current experiments, speakers produced utterances like “record the hate,” where verbs had bare inflections, whereas in V. S. Ferreira and Humphreys, they produced utterances like “recorded the tape,” where verbs had past-tense inflections. Thus, only in the latter utterance type is there a phonological cue to the verbal status of the initial slot in the utterance -- the past-tense marker ‘ed.’ This raises the possibility that when a phonological cue to the grammatical class of a word is available, more well-formed syntactic production might result. This is broadly consistent with what has been termed the special-status hypothesis (for discussion, see Dell, 1990), which argues that function morphemes (including inflections like the past-tense marker) are treated differently during phonological encoding, including allowing for direct retrieval of their phonological content. In turn, this direct retrieval of the phonological content of function morphemes may lead production mechanisms to rely at least to some extent on that phonological content as a cue to syntactic structure. Of course, if phonology per se is a critical component of syntactic knowledge, this would undermine a core claim of the frame-based approach described above, namely, that the syntactic frames that guide well-formed syntactic production are abstract (i.e., syntactically independent structures) in nature.
On the other hand, a different explanation for the difference between the overall rate of verb-stress production within exchanges in V. S. Ferreira and Humphreys (2001) versus here might have to do with the lexical well-formedness factor raised above. That is, most theories of word production postulate a monitoring component that assesses formulated utterances prior to articulation for well-formedness along a variety of dimensions (for review, see Postma, 2000). One property which monitoring is claimed to assess is lexical status, such that if an illegal form is to be produced, speakers halt production prior to articulation and reformulate. Thus, the greater ratio of utterances like “reCORDed the tape” to “REcorded the tape” observed in the original Ferreira and Humphreys experiments may be because formulated forms like “REcorded” may have been filtered out by a lexical monitor in those experiments. Meanwhile, in Experiments 1 and 2 here, because “REcord the hate” includes legal English forms, “REcord” is not filtered, and so a larger number of noun-stress exchanged forms is observed. Critically, note that this monitoring-based explanation does not posit that the monitor causes “reCORDed” to be selected for production, only that the monitor blocks “REcorded” from being articulated.
Experiment 3 determined if a suffix that has explicit phonological content indeed influences syntactic production. Speakers produced two types of utterances: Utterances with bare inflections, like those in Experiments 1 and 2, which resulted in exchanges like “record the hate,” and utterances with the phonologically realized third-person singular inflection “-s,” which resulted in exchanges like “records the hate.” Note that in the latter utterance, the stress-shifting stem can be pronounced with first- or second-syllable stress without resulting in an illegal lexical form (“REcords the hate” or “reCORDs the hate”). This is because the third-person singular inflection is homophonous with the plural inflection, which can legally apply to the noun pronunciation of stress-shifting words. Thus, lexical well-formedness considerations do not differ between conditions. Rather, what differs is whether inflections are phonologically realized. If phonological content per se can influence syntactic production, speakers should pronounce exchanged stress-shifting stems with second-syllable stress more when utterances have phonologically realized inflections versus bare inflections. But if phonological content does not influence syntactic production, as claimed by frame-based theories, speakers should pronounce exchanged stress-shifting stems with second-syllable stress about equally in the two utterance formats.
Experiment 3
Method
Participants
Eighty-seven undergraduate students participated. Eighteen additional participants were excluded, eleven because their sessions were not recorded, two because of inaudible recordings, two because of a programming error, two participants because they did not follow directions, and one because she or he had already participated in the experiment.
Design and materials
Two manipulations were used: word type (stress-shifting and stress-constant) and utterance format (bare versus phonologically realized). Four experimental conditions were created by crossing the levels of the word type and the frame type conditions. Word type was manipulated within subjects and between items, and utterance format within subjects and within items in counterbalanced fashion. Subjects were presented with thirty-six critical trials and sixty-four filler trials.
Procedure and analyses were as in Experiments 1 and 2.
Results and discussion
The results of Experiment 3 are shown in Figure 3. Speakers pronounced exchanged stress-shifting stems with second-syllable stress about equally in utterances with phonologically bare versus phonologically realized inflections. Of the 66 exchanged stress-shifting stems in utterances with phonologically bare inflections that had measurable stress, 31 were pronounced with second-syllable stress; of the 70 exchanged stress-shifting stems in utterances with phonologically realized inflections that had measurable stress, 26 were pronounced with second-syllable stress. These tendencies to pronounce exchanged stems with second-syllable stress are not general to producing an exchanged word in verb position, as speakers almost never pronounced exchanged stress-constant stems with second-syllable stress, either in utterances with phonologically bare inflections (0 of 89 measurable stems) or in utterances with phonologically realized inflections (2 of 107 measurable stems). Together, this pattern suggests that phonological content per se does not influence syntactic production, as speakers selected the verb forms of exchanged stems equally when inflections were or were not phonologically realized.
Figure 3.
Number of stress-shifting and stress-constant stems pronounced with first- and second-syllable stress when produced in bare (“___ the ___”) or phonologically realized (“___s the ___”) frames. “Other” indicates stems that could not be assessed for stress. Values in parentheses indicate first-block totals.
These observations were confirmed by statistical analyses performed on speakers’ mean numbers of second-syllable pronunciations (means are shown in Table 1, statistical analyses in Table 2). Speakers pronounced exchanged stems with second-syllable stress significantly more overall when stems were stress-shifting (0.34 stems per subject) than when stems were stress-constant (0.02 stems per subject, CI = ±0.10 stems per subject). They pronounced exchanged stems with second-syllable stress about equally overall when utterances had phonologically bare inflections (0.19 stems per subject) or phonologically realized ones (0.17 stems per subject; CI = ±0.09 stems per subject). The degree to which speakers pronounced stems with second-syllable stress more with stress-shifting than with stress-constant stems was about equal in utterances with phonologically bare inflections (0.37 vs. 0.01 stems per subject, a difference of 0.36) and in utterances with phonologically realized inflections (0.31 vs. 0.02 stems per subject, a difference of 0.29; CI = ±0.13 stems per subject), so that the interaction between these two factors was nonsignificant.
Once again, speakers tended to exchange stress-constant stems more often overall (231 exchanges) than stress-shifting stems (169 exchanges). As before, with respect to the result of primary interest, this is conservative: Despite having more opportunities with stress-constant items for a second-syllable pronunciation to be observed (because stress-constant stems exchanged more), speakers did so less, compared to stress-shifting stems. Speakers exchanged stems about equally in utterances with phonologically bare inflections (191 exchanges) and in utterances with phonologically realized inflections (209 exchanges).
These observations are supported by statistical analyses performed on mean number of exchanges per speaker (means are shown in Table 1, statistical analyses in Table 2). Speakers exchanged stress-constant stems more overall (1.32 exchanges per subject) than stress-shifting stems (0.97 exchanges per subject; CI = ±0.20 exchanges per subject), a difference that was significant by subjects. Speakers exchanged stems about equally in utterances with phonologically bare inflections (1.10 exchanges per subject) and in utterances with phonologically realized inflections (1.20 exchanges per subject; CI = ±0.30 exchanges per subject). The tendency for more stress-constant than stress-shifting exchanges was about equal in utterances with phonologically bare inflections (1.29 vs. 0.91 exchanges per subject, a difference of 0.38) and in utterances with phonologically realized inflections (1.36 vs. 1.03 exchanges per subject, a difference of 0.33; CI = ±0.30 exchanges per subject).
The results of Experiment 3 thus suggest that phonological content per se does not influence speakers’ syntactic productions. Two implications follow from this result. First, the syntactic frames that influence speakers’ syntactic productions appear to indeed be abstract, at least in the sense that they are lexically independent (as shown by Experiment 1’s results) and phonologically independent (as shown by Experiment 3’s results). Second, the difference between the current experiments and V. S. Ferreira and Humphreys’s (2001) results yet remains: In Experiment 3, as in Experiments 1 and 2, just under half of speakers’ measurable stress-shifting exchanged stems were pronounced with second-syllable stress, compared to about 90% of such stems in V. S. Ferreira and Humphreys. Given that phonological content is not directly responsible for this difference, it is instead likely an effect of lexical well-formedness. On top of the syntactic influences revealed here and by V. S. Ferreira and Humphreys, speakers’ syntactic production can (likely through monitoring) be affected by whether an utterance will contain an illegal lexical form. This is an interesting observation we leave for future research.
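The claim that just under half of measurable stress-shifting exchanges carried second-syllable stress can be checked against the counts reported in the three Results sections. The sketch below is a simple tally and not part of the original analyses: the dictionary keys and the pooling across utterance formats within an experiment are illustrative choices, and only the “verb the noun” style formats are included (the syntactically agnostic condition of Experiment 1 is left out).

```python
# (second-syllable pronunciations, measurable exchanges) for stress-shifting
# stems, copied from the counts reported in the Results sections.
measurable_counts = {
    "Exp. 1": {"syntactically constrained": (14, 32)},
    "Exp. 2": {"syntactically constrained": (22, 60), "no frame": (22, 51)},
    "Exp. 3": {"bare inflection": (31, 66), "realized inflection": (26, 70)},
}


def pooled_proportion(conditions):
    """Pool counts across the conditions of one experiment."""
    shifted = sum(second for second, _ in conditions.values())
    measurable = sum(total for _, total in conditions.values())
    return shifted / measurable


pooled = {
    experiment: pooled_proportion(conditions)
    for experiment, conditions in measurable_counts.items()
}

for experiment, proportion in pooled.items():
    print(f"{experiment}: {proportion:.2f}")
```

Pooled this way, each experiment yields a proportion below one half, compared with the roughly 90% reported by V. S. Ferreira and Humphreys (2001).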
General Discussion
The experiments presented here carry four theoretically important implications, three of which are directly relevant to the question of syntactic representation during production:
1. Exchanged stress-shifting stems were pronounced with second-syllable stress more in syntactically constrained than in syntactically agnostic utterances. This suggests that the production of verb forms in stem exchanges is not due to lexical selection, in turn suggesting that syntactic structures are represented in production as abstract, lexically independent frames rather than as lexically linked syntactic features.
2. Exchanged stress-shifting stems were pronounced with second-syllable stress about equally when speakers were instructed to use a particular utterance format as when speakers determined the utterance format themselves. This suggests that the first result is not an artifact of requiring speakers to produce utterances in a particular way.
3. Exchanged stress-shifting stems were pronounced with second-syllable stress about equally regardless of whether utterances included phonologically bare or phonologically realized inflections. This suggests that phonological content does not itself influence the syntactic effects revealed by speakers’ stem-exchanging behavior.
4. Overall, speakers appeared to pronounce exchanged stress-shifting stems with second-syllable stress proportionally less than in previous demonstrations. This is likely a lexical well-formedness effect, because in the current experiments, first-syllable pronunciations were not lexically illegal, whereas in previous experiments, they were. This suggests that over and above syntactic effects, production mechanisms can filter utterances based on lexical well-formedness considerations.
A possible counter-explanation for why we failed to find evidence for lexically based theories is that the syntactically constrained frame (in Experiment 1) encouraged a thematic influence relative to the syntactically agnostic frame (we thank Gerard Kempen for bringing this possibility to our attention). In other words, the requirement to produce "verb the noun" might have made speakers think "say action the object," and the effort to start a phrase with an action word could have boosted the activation of all verbs, thereby promoting the verbal form of the heard noun (i.e., reCORD when REcord was heard) relative to the syntactically agnostic condition. If this explanation holds, we expect that the boost of activation to action words would cause increased activation of the heard verb (i.e., tape). Consequently, the heard verb in the syntactically constrained frame should be more likely to be chosen for production (because of the increased activation) as compared to the syntactically agnostic frame. Because the heard verb is the correct word to place in the initial slot, this predicts that there should be fewer lexical exchanges with syntactically constrained frames. However, contrary to this prediction, speakers actually produced more lexical exchanges with syntactically constrained frames as compared to syntactically agnostic frames. This suggests that the greater rate of stress-shifts observed in Experiment 1 in syntactically constrained utterances was unlikely to have been due to the production system boosting the accessibility of all verbs across the board in the syntactically constrained condition, but instead was due to the abstract syntactic frame compelling the use of the stress-shifted form only when an error was impending.
Relatedly, the results might be explained by a weaker version of a frame-based theory, whereby grammatical encoding does not postulate full syntactic frames consisting of (say) a verb phrase and its direct object. Instead, grammatical encoding mechanisms might require speakers to begin the current utterance with a verb -- any verb -- as determined by message-level constraints. Such an account would be less frame-based and more lexically based if the subsequent syntactic structure in the utterance (e.g., the direct object) is retrieved as part of the accessed verb representation, thus making that direct-object structure lexically based rather than having been independently retrieved. Nonetheless, it is important to note that this approach maintains an important feature of frame-based accounts in that the production of at least some key parts of utterances (e.g., the selection of the verb) is constrained by syntactic features independently of lexical (as well as semantic and phonological) features. An advantage of this alternative account is that it can better accommodate piecemeal production (itself known to be important; see Levelt, 1989 for argumentation and V. S. Ferreira, 1996 for evidence). A disadvantage is that it would need to further specify how message-level features select the correct form of a verb so that an appropriate syntactic structure is available for the utterance. For example, if the message specifies that grammatical encoding express what would be the direct object of an optionally transitive verb, there must be a way for message-level features to ensure that the transitive rather than intransitive form of the verb is selected. Such influences may amount to the fully abstract syntactic frames specified here.
It is important to acknowledge that the task used in the current experiments differs markedly from natural, extemporaneous production. The task involves minimal conceptual preparation, and lexical content is given to subjects. However, it is reasonable to draw conclusions from the current task for the purpose of distinguishing the competing theories under consideration. For example, unless lexically based theories place specific theoretical importance on, say, conceptual involvement during production, they predict that speakers should have shifted stress equally in the syntactically constrained and agnostic frames in Experiment 1. Put another way, our intent here was not to generalize task performance to real-world production; it was to test competing theories of production mechanisms, and those theories can then be generalized to real-world production (Mook, 1983). With appropriate caution, this strategy suggests the operation of frame-based rather than lexically based representations of syntactic knowledge.
All experiments revealed that speakers produced more stem exchanges with stress-constant stems than with stress-shifting stems. Note that completely different phrases were used in the two conditions. The difference in exchange rates may therefore reflect properties of the items themselves rather than the manipulations at hand. Because there are so few transitive stress-shifting words in English (we used all 18 that we could find), we were unable to control for factors such as frequency and sensibility of the words in the exchanged utterances (see V. S. Ferreira & Humphreys, 2001 for further discussion). Experiment 1 revealed a further overall difference: Speakers tended to exchange stems more in syntactically constrained than in syntactically agnostic utterances. This may be an effect of syntactic flexibility. V. S. Ferreira (1996) showed that speakers began utterances more quickly and produced them with fewer errors and disfluencies when they had more syntactic options for the utterances. The difference between utterance formats in Experiment 1 might be similar: With syntactically agnostic utterances, speakers can choose one of two available syntactic frames (“N and N” or “V and V”), which may have eased production relative to the syntactically constrained utterances, where speakers can choose only one (“V the N”). Easier production of syntactically agnostic utterances in turn may have reduced the rate of stem exchanges.
The support for frame-based theories of sentence production implies that ultimately, stem exchanges are syntactic mismatch errors: The reason speakers sometimes say “record the hate” or “I roasted a cook” is that grammatical encoding selected a lexical form (“record” or “roast”) for a sentence position that it mismatched in syntactic category. To reconcile this mismatch, a process that exploits morphological productivity to maintain syntactic well-formedness could operate, converting an item into a form that can be mentioned in a particular syntactic position. There seem to be two ways that such a process could enact a shift from nominal to verbal forms. One is that grammatical encoding mechanisms could abandon the selected noun when they encounter a syntactically incompatible verb frame slot. After abandoning the noun, whichever verb is most highly activated could be selected to be placed into that slot. So, grammatical encoding might initially select “REcord,” but then will not be able to place “REcord” into the initial slot because of a syntactic mismatch. Grammatical encoding might then abandon “REcord” and select a different word that syntactically matches the open slot. Presumably, the verbal form “reCORD” will sometimes be highly activated due to its semantic and segmental similarity to the initially selected “REcord.” Consequently, “reCORD” will (at least sometimes) be selected. Once selected, “reCORD” is compatible with the open slot position and so can be slotted into that position. (Also, the intended verb “hate” will be highly activated, because it was presented to subjects; of course, if subjects select “hate” rather than “reCORD,” the trial would be coded as having been produced accurately.) Note that this account requires an additional step: Speakers in our task never produced utterances such as “reCORD the REcord,” producing the same root form of a word in both slots. This suggests that if this mechanism underlies performance in this task, then when a form is eventually inserted into the produced phrase (e.g., when “reCORD” is selected), all alternative forms of that same root must be ‘tagged’ as also having been selected (so that those alternative forms cannot be reselected).
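The abandonment account can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the `Form` class, the activation values, and the `fill_slot` and `produce_by_abandonment` helpers are hypothetical and do not correspond to any implemented model or to data from these experiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Form:
    """A lexical form; all values below are illustrative, not fitted."""
    spoken: str        # pronunciation, capitals mark the stressed syllable
    root: str          # root shared by the noun and verb variants
    category: str      # "N" or "V"
    activation: float  # hypothetical activation level


# Hypothetical lexicon for one trial ("hate the record").
LEXICON = [
    Form("REcord", "record", "N", 0.9),
    Form("reCORD", "record", "V", 0.8),
    Form("hate", "hate", "V", 0.7),
    Form("hate", "hate", "N", 0.6),
]


def fill_slot(selected, slot_category, used_roots):
    """Keep `selected` if it fits the slot; otherwise abandon it and take
    the most active form of the required category whose root is untagged."""
    if selected.category != slot_category:
        candidates = [f for f in LEXICON
                      if f.category == slot_category
                      and f.root not in used_roots]
        selected = max(candidates, key=lambda f: f.activation)
    # Tag every variant of this root as already selected.
    used_roots.add(selected.root)
    return selected


def produce_by_abandonment(first_pick):
    """Fill the verb slot, then the noun slot, of a 'V the N' frame."""
    used_roots = set()
    verb = fill_slot(first_pick, "V", used_roots)
    nouns = [f for f in LEXICON
             if f.category == "N" and f.root not in used_roots]
    noun = max(nouns, key=lambda f: f.activation)
    return f"{verb.spoken} the {noun.spoken}"
```

With these assumed activation values, an erroneous first pick of the noun “REcord” yields “reCORD the hate.” The root tagging in `fill_slot` is what keeps “REcord” out of the noun slot; without it, the sketch would be free to produce “reCORD the REcord.”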
Another possibility, and one that strikes us as especially interesting, is that production processes may include a subprocess or component process that has the function of taking lexical forms and finding morphological variants that are compatible with intended syntactic positions. In this case, this process might take “REcord” and actively return the verbal form “reCORD”. Performing a morphological conversion of this kind allows the selected word to be placed into the initial verb slot. Unlike the abandonment possibility, a morphological conversion process has the advantage that it could also underlie speakers’ ability to productively create new word forms, such as “incentivize,” when a known noun (such as “incentive”) is selected and speakers need a verb in order to produce a statement that accurately reflects their intention. Also, though this second account requires an additional morphological conversion mechanism, this mechanism itself can explain why speakers never produce utterances like “reCORD the REcord,” as the noun form (“REcord”) will be unavailable for insertion because it was entered into this morphological conversion process.
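The conversion account can be sketched in the same toy fashion. The `VARIANTS` table, the `convert` function, and `produce_by_conversion` are hypothetical names introduced only for illustration; the sketch assumes that each selected form is consumed by the conversion step and that variants can be looked up directly.

```python
# Hypothetical table of morphological variants, keyed by
# (selected form, category required by the slot).
VARIANTS = {
    ("REcord", "V"): "reCORD",
    ("reCORD", "N"): "REcord",
    ("hate", "V"): "hate",
    ("hate", "N"): "hate",
    ("incentive", "V"): "incentivize",
}


def convert(form, category, slot_category):
    """Return a variant of `form` that is compatible with the slot."""
    if category == slot_category:
        return form
    return VARIANTS[(form, slot_category)]


def produce_by_conversion(selected):
    """`selected` holds two (form, category) pairs in the order in which
    they were picked for the verb slot and the noun slot of 'V the N'."""
    (first, first_cat), (second, second_cat) = selected
    # Each form enters conversion exactly once, so the original noun
    # form is no longer available for insertion into the other slot.
    verb = convert(first, first_cat, "V")
    noun = convert(second, second_cat, "N")
    return f"{verb} the {noun}"
```

For instance, `produce_by_conversion([("REcord", "N"), ("hate", "V")])` returns “reCORD the hate,” and no separate tagging step is needed to rule out “reCORD the REcord,” because the noun form has already been passed through conversion.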
In addition to supporting frame-based theories of syntactic production, the experiments also provide indirect support for the influence of lexical well-formedness on stem exchanges. Specifically, the observation that speakers appear to be more likely to produce “REcord(s) the hate” in these experiments than they were to say “REcorded the tape” in V. S. Ferreira and Humphreys (2001) suggests that the lexical ill-formedness of “REcorded” blocks its production. As mentioned, such an influence fits well with different theories of production monitoring (e.g., Levelt, 1989), which describe processes that evaluate the appropriateness and accuracy of formulated speech prior to (as well as after) articulation. Most monitoring proposals argue that lexical well-formedness is one aspect of speakers’ utterances that monitoring assesses.
That said, it is important to emphasize that lexical well-formedness cannot explain the entire pattern of results, either here or in V. S. Ferreira and Humphreys (2001). V. S. Ferreira and Humphreys showed that speakers rarely produce exchanges like “talented the desire,” even though they include only lexically legal forms. Such exchanges are likely rare because they are syntactically ill-formed: Though “talented” is a legal English word, it is an adjective and so cannot be produced in the verb position in an utterance. Here, the use of bare or third-person singular inflections ensured that regardless of whether speakers said “REcord” or “reCORD,” the utterance again was lexically well-formed. Thus, lexical well-formedness can explain some aspects of performance in these tasks, but syntactic well-formedness operates as well; the present experiments suggest such syntactic well-formedness operates specifically through abstract representations of syntactic structure.
The influence of lexical well-formedness over the original V. S. Ferreira and Humphreys (2001) results invites one important revision of their conclusions. Specifically, V. S. Ferreira and Humphreys argued that because stress-shifting stems in stem exchanges were overwhelmingly pronounced with second-syllable stress, the syntactic category constraint was categorical -- as categorical as it is with word substitution errors. If part of the large effect observed by V. S. Ferreira and Humphreys was due to (syntactically independent) monitoring for lexical well-formedness, then the syntactic category constraint may not be so categorical.
The possible involvement of monitoring in general in task performance raises an interesting possibility: Might syntactic (as well as lexical) well-formedness constraints exert their influence via monitoring? That is, perhaps grammatical encoding mechanisms formulate “REcord the hate,” monitoring mechanisms detect its syntactic ill-formedness, and repair it to “reCORD the hate.” It has been proposed that monitoring mechanisms assess syntactic well-formedness (Levelt, 1989), so this is a possibility.
Ultimately, our results reveal the representational feature that underlies how speakers adapt their selection and production of words to result in well-formed production: Abstract syntactic frames, independent of lexical or phonological knowledge, compel speakers to shift words from one syntactic category to another. Such abstract syntactic frames thus not only inform the production system about the legal configurations of words in their languages (e.g., that verbs precede their direct objects in languages like English), but further compel lexical-processing mechanisms to select the forms of words that are compatible with those legal configurations. These results reveal that production operates with some amount of ‘lexical category flexibility,’ as driven by abstract syntactic knowledge.
Appendix A.
Stimuli used in Experiments 1–3 (adapted from V. S. Ferreira & Humphreys, 2001)
| Stress Constant | | Filler | |
|---|---|---|---|
| Target Word | Verb | Target Word | Verb |
| Accent | Try | Ankle | Twist |
| Audit | Call | Apartment | Rent |
| Blanket | Test | Bank | Rob |
| Boycott | Watch | Barrel | Roll |
| Budget | Report | Berry | Pick |
| Carpet | Love | Boat | Dock |
| Comfort | Design | Bomb | Explode |
| Contact | List | Bottle | Recycle |
| Contract | Grant | Bridge | Cross |
| Credit | Ruin | Bubble | Pop |
| Harvest | Pawn | Bullet | Dodge |
| Limit | Place | Cabinet | Close |
| Mandate | | Cage | Lock |
| Orbit | Cause | Cake | Decorate |
| Patent | Award | Canoe | Row |
| Pilot | Support | Canon | Load |
| Profit | State | Card | Flip |
| Warrant | | Cave | Explore |
| Stress Shifting | | Channel | Tape |
| Target Word | Verb | Check | Pinch |
| Compound | Drop | Chestnut | Roast |
| Contest | Master | Cigarette | Smoke |
| Convert | Like | Closet | Tidy |
| Convict | Chase | Cow | Tip |
| Digest | Read | Cup | Fill |
| Escort | Name | Dart | Toss |
| Export | Question | Egg | Scramble |
| Insult | Shout | Engine | Repair |
| Permit | | Envelope | Seal |
| Pervert | Fear | Fence | Paint |
| Present | Buy | Flag | Wave |
| Protest | Witness | Hand | Wash |
| Record | Hate | Lock | Bolt |
| Refund | Guard | Parcel | |
| Reject | Flunk | Spoon | Polish |
| Repeat | Desire | Statue | Admire |
| Subject | Trick | Umbrella | Replace |
| Suspect | Frame | Weekend | Waste |
Acknowledgments
This research was supported by National Institutes of Health grants R01 MH64733 and R01 HD051030. We thank Mai Ebata, Michelle Groisman, and Shelly Dutt for assistance with data collection, Katie Doyle for help with manuscript preparation, Tamar Gollan and Bob Slevc for discussion, and Merrill Garrett and two anonymous reviewers for comments on a previous version of the manuscript.
References
- Bock JK. Toward a cognitive psychology of syntax: Information processing contributions to sentence formulation. Psychological Review. 1982;89:1–47.
- Bock JK. Syntactic persistence in language production. Cognitive Psychology. 1986;18:355–387.
- Clark HH. The language-as-fixed-effect fallacy: A critique of language statistics in psychological research. Journal of Verbal Learning and Verbal Behavior. 1973;12:335–359.
- Cohen JD, MacWhinney B, Flatt M, Provost J. PsyScope: An interactive graphic system for designing and controlling experiments in the psychology laboratory using Macintosh computers. Behavior Research Methods, Instruments, and Computers. 1993;25:257–271.
- Dell GS. Effects of frequency and vocabulary type on phonological speech errors. Language and Cognitive Processes. 1990;5:313–349.
- Dell GS, O'Seaghdha PG. Mediated and convergent lexical priming in language production: A comment on Levelt et al. (1991). Psychological Review. 1991;98:604–614. doi: 10.1037/0033-295x.98.4.604.
- Ferreira F. Choice of passive voice is affected by verb type and animacy. Journal of Memory and Language. 1994;33:715–736.
- Ferreira VS. Is it better to give than to donate? Syntactic flexibility in language production. Journal of Memory and Language. 1996;35:724–755.
- Ferreira VS, Griffin ZM. Phonological influences on lexical (mis)selection. Psychological Science. 2003;14:86–90. doi: 10.1111/1467-9280.01424.
- Ferreira VS, Humphreys KR. Syntactic influences on lexical and morphological processing in language production. Journal of Memory and Language. 2001;44:52–80.
- Ferreira VS, Slevc LR. Grammatical encoding. In: Gaskell MG, editor. The Oxford Handbook of Psycholinguistics. Oxford: Oxford University Press; 2007. pp. 453–469.
- Fromkin VA, editor. Speech errors as linguistic evidence. The Hague: Mouton; 1973.
- Garrett MF. The analysis of sentence production. In: Bower GH, editor. The psychology of learning and motivation. Vol. 9. New York: Academic Press; 1975. pp. 133–177.
- Kelly MH. Rhythm and language change in English. Journal of Memory and Language. 1989;28:690–710.
- Kelly MH, Bock JK. Stress in time. Journal of Experimental Psychology: Human Perception and Performance. 1988;14:389–403. doi: 10.1037//0096-1523.14.3.389.
- Konopka A, Bock K. Lexical or syntactic control of sentence formulation? Structural generalizations from idiom production. Cognitive Psychology. 2009;58:68–101. doi: 10.1016/j.cogpsych.2008.05.002.
- Lee M-W, Gibbons J. Rhythmic alternation and the optional complementiser in English: New evidence of phonological influence on grammatical encoding. Cognition. 2007;105:446–456. doi: 10.1016/j.cognition.2006.09.013.
- Levelt WJM. Speaking: From intention to articulation. Cambridge, MA: MIT Press; 1989.
- Loftus GR, Masson MEJ. Using confidence intervals in within-subject designs. Psychonomic Bulletin & Review. 1994;1:476–490. doi: 10.3758/BF03210951.
- Mook DG. In defense of external invalidity. American Psychologist. 1983;38:379–387.
- Postma A. Detection of errors during speech production: A review of speech monitoring models. Cognition. 2000;77:97–131. doi: 10.1016/s0010-0277(00)00090-1.
- Stemberger JP. An interactive activation model of language production. In: Ellis A, editor. Progress in the psychology of language. Vol. 1. London: Erlbaum; 1985. pp. 143–186.



