Author manuscript; available in PMC: 2007 May 2.
Published in final edited form as: J Exp Psychol Learn Mem Cogn. 2002 Nov;28(6):1187–1199.

Central Bottleneck Influences on the Processing Stages of Word Production

Victor S Ferreira 1, Harold Pashler 1
PMCID: PMC1864932  NIHMSID: NIHMS3779  PMID: 12450341

Abstract

Does producing a word slow performance of a concurrent, unrelated task? In two experiments, 108 subjects named pictures and discriminated tones. In Experiment 1, pictures were named after cloze sentences; the durations of the word-production stages of lemma selection and phonological word-form selection were manipulated with high- and low-constraint cloze sentences and with high- and low-frequency-name pictures, respectively. In Experiment 2, pictures were presented with simultaneous distractor words; the durations of lemma selection and phoneme selection were manipulated with conceptually and phonologically related distractors. All manipulations except the phoneme-selection manipulation delayed tone-discrimination responses as much as picture-naming responses. These results suggest that the early word-production stages (lemma selection and phonological word-form selection) are subject to a central processing bottleneck, while the later stage (phoneme selection) is not.

A fundamental issue in psychology concerns how easily we can do more than one thing at the same time. Not only is this issue of practical significance (e.g., is it scientifically justifiable to prohibit talking on cellular phones while driving?), but it is also of scientific interest. Specifically, to the extent that a task can be performed without hindering other simultaneously performed tasks, it implies that it is carried out by separate, dedicated processing mechanisms. In contrast, if performance of one task interferes with performance of another, it implies that at least some components of that task are carried out by shared processing mechanisms.

One set of abilities that seems especially likely to be based on dedicated processing mechanisms is our linguistic abilities. Linguistic processes may be highly specialized, as they may be based on a substrate that is cognitively, anatomically, and genetically distinct from that of other, non-linguistic processes (e.g., Pinker, 1994). Furthermore, language performance is highly practiced, and practice may lead linguistic processing mechanisms to operate automatically (e.g., Cohen, Dunbar, & McClelland, 1990). The dedicated mechanisms that might underlie the performance of a single kind of task are here termed modular processing mechanisms.

However, it also may be that linguistic abilities are based on mechanisms that are shared with the processes that underlie the performance of other, nonlinguistic tasks. This is suggested by the fact that people typically have difficulty performing another task when linguistic demands are heavy (e.g., driving during an intense discussion). This has been demonstrated experimentally in that even simple detection- or discrimination-task performance is hindered when people concurrently produce or comprehend sentences (e.g., Ford & Holmes, 1978; Dell & Newman, 1980). Processing mechanisms that are shared among distinct, unrelated tasks are here termed central processing mechanisms.

To investigate the modular versus central nature of linguistic processing, we looked at word production. Word production is well suited to exploring this issue, as it is as highly specialized and practiced as any linguistic task, involving the processing of language-specific grammatical and phonological representations (e.g., Dell, 1986; Levelt, Roelofs, & Meyer, 1999) that are used any time a speaker produces language. Furthermore, unlike more complex linguistic tasks such as sentence comprehension or production, word production is punctate, so that once a speaker has conceptually specified a word to be produced, a modular linguistic system could carry out remaining word-production processing without the further use of central nonlinguistic processing mechanisms. If the specialized, highly practiced, and punctate processes of word production hinder the performance of a concurrently performed, unrelated task, then it would suggest that linguistic processing generally involves central processing mechanisms. On the other hand, if the processes of word production do not hinder the performance of a concurrently performed task, it would be consistent with the possibility that language processes generally operate with dedicated, language-specific mechanisms.

Surprisingly, little research has explored whether the production of an individual word hinders processing in another task that is performed at the same time. Some research has shown that when people name pictures, their eye movements among those pictures can be affected by lexical factors such as frequency (Meyer et al., 1998), length (Zelinsky & Murphy, 2000), phonological priming (Meyer & van der Meulen, 2000), or codability (Griffin, 2001). Such effects only show, however, that eye-movements that are performed in service of a linguistic task are affected by linguistic variables, and do not show that an independent, unrelated non-linguistic task must be directly affected by processing effects within the linguistic processing system. The two experiments below investigate the word-production process in detail during performance of a concurrent unrelated task, to determine whether any aspects of word production interfere with the performance of that task, and if so, which ones.

Processing Stages of Word Production

Retrieving and producing even a single word is a rich process. To produce a word, a speaker begins with an idea they intend to convey, and ends by articulating the sequence of sounds that constitutes the phonological content of a word that expresses that idea. Along the way, a speaker must retrieve grammatical and morphological information (so that the word can be used in a sentence and correctly with prefixes and suffixes), syllabic and metrical structure (so the word is pronounced with the correct lexical stress, etc.), and all of this must occur quickly enough so that the word can be selected and mentioned in an appropriate sentence position.

To account for this multi-faceted word-production process, most theories of word production have adopted a general model like that illustrated in Figure 1 (with similarities to Cutting & Ferreira, 1999; Dell, 1986; Dell, Schwartz, Martin, Saffran, & Gagnon, 1997; Levelt et al., 1999; but see Caramazza, 1997; Rapp & Goldrick, 2000). According to these models, when a speaker produces a word, she or he begins with a set of active conceptual features that constitute the message to be expressed (e.g., from a picture stimulus). Activation then spreads from these conceptual features to connected lemma representations, which primarily encode the syntactic features of words. Lemma representations in turn spread activation to connected phonological word-form (or sometimes, lexeme) representations, which represent the whole-word sound properties of words and the features necessary for morphological processes like suffixation and compounding. Finally, phonological word-forms spread activation to connected phoneme representations, which represent the segmental information necessary to form syllables and eventually drive articulation. Thus, in the course of producing a single word, a speaker goes through a sequence of three processing stages: first, the speaker selects an appropriate lemma (lemma selection), then an appropriate phonological word-form (phonological word-form selection), and finally the appropriate sequence of phonemes (phoneme selection).

Figure 1.

A model of part of the word production lexicon. Information flows from top to bottom.

Given this framework, the question of whether speakers can produce words as they perform another concurrent task can be broken down into specific questions about whether each of the stages of lemma selection, phonological word-form selection, and phoneme selection can be performed at the same time as those other processes.1 The experiments below will explore these questions by having speakers produce words at the same time as they perform another unrelated task. In the domain of attention research, such dual-task situations have been investigated in detail. A description of one kind of dual-task methodology and logic is provided next.

Concurrent Task Performance

Attention researchers have explored dual-task performance using a wide variety of non-linguistic tasks (for review, see Pashler, 1998). When separate tasks that require independent responses are performed together, dual-task interference results, referring to the fact that speed and accuracy generally suffer relative to task performance in isolation. One effect that provides a fine-grained view of the nature of dual-task interference is the psychological refractory period (PRP) effect (e.g., Telford, 1931; Vince, 1949). Here, subjects are tested on two discrete tasks (Task 1 and Task 2), where the onset of the second task stimulus follows the onset of the first task stimulus by varying intervals (referred to as the stimulus onset asynchrony or SOA). The PRP effect refers to the fact that as SOA decreases, Task 2 response latencies increase. That is, with increasing task overlap, responses slow.

Research over several decades has provided strong support for the theory that the PRP effect reflects a postponement of central processing stages in the second task, that is, a processing bottleneck. According to this account, central stages in Task 2 cannot commence until the corresponding stages of Task 1 have been completed, whereas perceptual and motoric stages in either of the two tasks can overlap without constraint. How a central bottleneck explains PRP effects is illustrated in Figure 2. The left panel illustrates how decreasing SOA causes greater slowing in Task 2 responses. Rectangles represent processing stages (or collections thereof), and shading represents stages that are subject to the central bottleneck. In the top situation, Task 2 processing is postponed due to bottleneck limitations, illustrated by the fact that the shaded rectangles do not overlap in time. If the Task 2 stimulus is presented earlier (bottom situation), the amount of postponement increases correspondingly, and the Task 2 response latency therefore increases.

Figure 2.

Left panel: Central bottleneck account of the basic PRP effect. Right panel: Isolating processes that are sensitive to the central processing bottleneck. Time is from left to right. (S1, R1, S2, and R2 designate stimulus and response for Task 1 and Task 2. Shaded rectangles represent processes in each task that are subject to a central processing bottleneck, and therefore cannot occur simultaneously.)
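The postponement logic in the left panel of Figure 2 can be written out as a small calculation. The following is a minimal sketch rather than the authors' model: it assumes each task consists of one pre-bottleneck stage, one bottleneck stage, and one post-bottleneck stage, and all stage durations (`a1`, `b1`, `a2`, `b2`, `c2`) are hypothetical values chosen only for illustration.

```python
def rt2_bottleneck(soa, a1, b1, a2, b2, c2):
    """Task 2 latency, measured from Task 2 stimulus onset, under a central bottleneck.

    a1, b1: pre-bottleneck and bottleneck stage durations of Task 1 (ms).
    a2, b2, c2: pre-bottleneck, bottleneck, and post-bottleneck durations of Task 2 (ms).
    All times inside the function are measured from Task 1 stimulus onset.
    """
    task1_central_done = a1 + b1   # moment the bottleneck is released by Task 1
    task2_ready = soa + a2         # moment Task 2 is ready to enter the bottleneck
    central_start = max(task1_central_done, task2_ready)
    return central_start + b2 + c2 - soa


# Hypothetical stage durations in ms (assumptions, not estimates from the experiments).
params = dict(a1=200, b1=500, a2=100, b2=150, c2=100)

for soa in (50, 150, 900):
    print(f"SOA {soa:>3} ms -> Task 2 latency {rt2_bottleneck(soa, **params)} ms")
```

With these assumed durations, Task 2 is postponed at both short SOAs, so shortening the SOA by 100 ms lengthens the Task 2 latency by the same 100 ms; at the long SOA the bottleneck is already free when Task 2 needs it, and no postponement occurs.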

This central bottleneck theory, first proposed by Welford (1967), is supported by various kinds of evidence. One is the existence of PRP slowing in tasks that do not share input or output modalities. Another is the tendency of trial-to-trial variability in latencies for the first response to “propagate” onto the second latency, producing a characteristic relationship between the latency distributions in the two tasks (Pashler, 1989). Even more compelling are the findings of chronometric studies in which the durations of different stages of the two tasks are manipulated (see Pashler, 1998, for a review). When experimental manipulations retard Task 2 stages that are located at or after the bottleneck (e.g., retarding response selection in Task 2 by reducing stimulus-response compatibility), the theory predicts that the slowing caused by this variable on the Task 2 response latencies will combine additively with the effects of SOA. This has been confirmed in most studies. On the other hand, when the duration of pre-bottleneck stages in Task 2 is manipulated (e.g., by increasing perceptual difficulty in Task 2), the theory predicts that the slowing will be reduced as the SOA is shortened (i.e., an underadditive interaction will result); this has also been confirmed (most recently by Dell’Acqua et al., 2000). Experiments involving manipulations of Task 1 latency also provide critical tests of the theory, and the logic of these will be discussed below.
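The additive and underadditive predictions for Task 2 manipulations follow from the same postponement arithmetic. The sketch below is illustrative only: it assumes a three-stage decomposition of each task, and the default stage durations and the 80 ms slowing are hypothetical.

```python
def rt2(soa, a1=200, b1=500, a2=100, b2=150, c2=100):
    """Task 2 latency from its own stimulus onset, assuming a single central bottleneck.

    a1 + b1 is when Task 1 releases the bottleneck; soa + a2 is when Task 2
    has finished its pre-bottleneck (e.g., perceptual) processing.
    """
    return max(a1 + b1, soa + a2) + b2 + c2 - soa


def slowing(soa, **changed):
    """Extra Task 2 latency produced by changing one or more stage durations."""
    return rt2(soa, **changed) - rt2(soa)


for soa in (50, 900):
    central = slowing(soa, b2=150 + 80)      # slow a stage at the bottleneck
    perceptual = slowing(soa, a2=100 + 80)   # slow a pre-bottleneck stage
    print(f"SOA {soa:>3} ms: bottleneck-stage slowing {central} ms, "
          f"pre-bottleneck slowing {perceptual} ms")
```

Under these assumptions, slowing the bottleneck stage of Task 2 costs the same 80 ms at both SOAs (additivity), whereas slowing the pre-bottleneck stage is absorbed into the waiting period at the short SOA and appears in full only at the long SOA (underadditivity).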

While the occurrence of central postponement in many dual-task situations now seems uncontroversial, debate continues about why such a bottleneck should arise. Some authors suggest that the slowing is in some sense strategic or optional, perhaps undertaken in response to implicit or explicit task demands (Meyer & Kieras, 1997). However, recent experiments in which subjects are given every incentive to perform the central stages of two concurrent tasks simultaneously have continued to find evidence for queuing of central stages (Ruthruff, Pashler, & Klaasen, 2001; Levy & Pashler, 2001), implying that the phenomenon is not strategic. On the other hand, when very simple tasks are used or extensive practice is provided, central processing may come to operate simultaneously in some cases (Schumacher et al., 1999; Pashler, Carrier, & Hoffman, 1993), although in other cases it fails to do so (Ruthruff, Johnston, & Van Selst, 2001). In sum, the central bottleneck is stubborn but not entirely ubiquitous; for dual-task performance with tasks that involve even modest levels of complexity or novelty, however, the central bottleneck model seems well confirmed. Parallel central processing is probably confined to laboratory experiments or performance of highly routinized activities (or both).

Thus, it is clear that a bottleneck generally affects the central processes involved in performance of a task. To determine which specific stages are sensitive to the central bottleneck, we can manipulate the duration of a given processing stage in a dual-task situation, and observe how that affects performance in each task. This is illustrated in the right panel of Figure 2. If a given Task 1 stage, labeled “Stage X” in Figure 2, is subject to the central bottleneck, then an experimental manipulation that slows Stage X (the middle situation) should slow Task 2 responses to the same degree. This propagation occurs because the critical stages of Task 2 must wait for the completion of the critical stage of Task 1 before they can commence processing. (Note that the same prediction follows for manipulations of pre-bottleneck processes, that is, ones that operate before any process that is subject to the central bottleneck.) In contrast, if a given Task 1 processing stage is not subject to the central bottleneck but instead operates after any bottleneck stage (labeled “Stage Y” in Figure 2), then slowing that stage (the bottom situation) should not slow Task 2 responses (even though it will slow Task 1 responses).
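The propagation logic for Task 1 manipulations can be sketched in a few lines. This is a hypothetical illustration, not a fitted model: the `Task1` class, its three stages, the Task 2 stage durations, and the 60 ms slowing are all assumed values.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Task1:
    pre: float = 200.0      # pre-bottleneck stages (assumed duration, ms)
    stage_x: float = 400.0  # stage subject to the central bottleneck
    stage_y: float = 150.0  # stage that follows the bottleneck


def latencies(task1, soa, a2=100.0, b2=150.0, c2=100.0):
    """Return (Task 1 latency, Task 2 latency), each from its own stimulus onset."""
    rt1 = task1.pre + task1.stage_x + task1.stage_y
    bottleneck_released = task1.pre + task1.stage_x
    rt2 = max(bottleneck_released, soa + a2) + b2 + c2 - soa
    return rt1, rt2


base = Task1()
slow_x = replace(base, stage_x=base.stage_x + 60)  # slow the bottleneck stage
slow_y = replace(base, stage_y=base.stage_y + 60)  # slow the post-bottleneck stage

for label, task in (("baseline", base), ("Stage X + 60 ms", slow_x), ("Stage Y + 60 ms", slow_y)):
    rt1, rt2 = latencies(task, soa=50)
    print(f"{label:<16} Task 1: {rt1:.0f} ms   Task 2: {rt2:.0f} ms")
```

In this sketch, slowing Stage X adds 60 ms to both latencies, while slowing Stage Y adds 60 ms to the Task 1 latency only, which is the contrast the experiments rely on.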

Figure 3.

Picture-naming and tone-discrimination latencies as a function of task SOA, cloze constraint, and lexical frequency, from Experiment 1. Error bars indicate standard errors.

The experiments reported below used the logic of Figure 2 to assess whether any of the stages of word production – lemma selection, phonological word-form selection, or phoneme selection – is subject to the central processing bottleneck that has been identified in attention research. In Experiment 1, we manipulated the duration of lemma selection and phonological word-form selection, and in Experiment 2, we manipulated the duration of lemma selection and phoneme selection, both in picture-naming tasks (Task 1). As Task 2, subjects performed a three-tone auditory discrimination task. If a given word production stage is subject to the central bottleneck, then slowing that stage should delay tone-discrimination responses as much as picture-naming responses; but if a word production stage is not subject to the bottleneck, then slowing that stage should slow tone-discrimination responses less than picture-naming responses. If none of the stages of word production show bottleneck effects, then it would suggest that word production does not cause dual-task interference and thus operates modularly with respect to other ongoing processes. But if some or all of the processing stages of word production are sensitive to the central bottleneck, it would suggest that even highly practiced and specialized processes like those involved in language production do not operate modularly, but rather impose central processing demands that hinder the performance of other concurrently performed tasks.

Experiment 1

In Experiment 1, subjects named pictures in a rebus-style task, based on the materials and procedure of Experiment 2 in Griffin and Bock (1998). Subjects read full sentences one word at a time from the computer screen, where the last word of the sentence was replaced by a picture that the subject was to name as quickly as possible. We manipulated cloze constraint by preceding the pictures to be named with sentences that strongly constrained (high-constraint cloze sentences) or weakly constrained (low-constraint cloze sentences) the identity of the following picture (e.g., bed is strongly constrained by Bob was tired, so he went to…, but is weakly constrained by She saw a picture of a…). We also manipulated whether subjects named pictures with high-frequency names (e.g., bed) or low-frequency names (e.g., bone). Griffin and Bock (1998) observed an interaction between cloze constraint and lexical frequency, such that pictures were named faster after high-constraint cloze sentences than after low, and after low-constraint cloze sentences only, pictures with high frequency names were named faster than pictures with low frequency names.

Within a model like that shown in Figure 1, cloze constraint should primarily affect the efficiency of lemma selection. A high-constraint cloze sentence allows the meaning of upcoming material to be anticipated; in the model shown in Figure 1, this implies that a high-constraint cloze sentence will preactivate a larger number of the conceptual features for a to-be-produced name. If more of the conceptual features that connect to a given lemma are active at the time that the picture is to be named, the target lemma should accrue activation more quickly and therefore be selected sooner.2

Supporting evidence for this comes from the fact that speakers hesitate less before producing a word that is more predictable from its sentence context (Goldman-Eisler, 1968; Schacter, Christenfeld, Ravina, & Bilous, 1991; Schacter, Rauscher, Christenfeld, & Crone, 1994). If we assume that hesitations reflect a momentary inability to select an intended lemma, this suggests that highly predictable sentence contexts make lemma selection easier. Other supporting evidence comes from a study by Federmeier and Kutas (2001), who had subjects view pictures at the end of cloze sentences while measuring brain responses with event-related potentials (ERPs). They tested three kinds of pictures: those that were expected on the basis of the cloze sentence; those that were unexpected but came from the same conceptual category as the expected completion; and those that were equally unexpected but came from a different conceptual category than the expected completion. They found that the ERP response to the same-category unexpected pictures was closer to the ERP response to the expected pictures than was the ERP response to the different-category unexpected pictures. This suggests that cloze sentences lead to the expectation of information that is organized along conceptual lines.

In contrast, lexical frequency seems to affect phonological word-form selection. Evidence for this comes from the homophone frequency-inheritance effect (Jescheniak & Levelt, 1994). Homophones like the high frequency week and the low frequency weak have distinct lemma representations, since they are different words with different meanings that can have different syntactic features (e.g., one word is a noun while the other is an adjective). However, they share a phonological word form, since they are phonologically identical. Notice that with this representational organization, the weak lemma representation is accessed only when a speaker produces the specific word weak. Thus, if frequency affects the speed of lemma selection, weak should be produced as slowly as other comparable low frequency words. In contrast, the shared phonological word form /wiːk/ is accessed each time a speaker produces weak or week; thus, if frequency affects the speed of phonological word-form selection, a low frequency homophone like weak should be produced as quickly as would be predicted from the combined (high) frequency of both weak and week. In fact, the latter result was found by Jescheniak and Levelt (1994), suggesting that lexical frequency specifically affects how quickly speakers access phonological word forms (see also del Viso et al., 1991 and Dell, 1990 for supporting speech-error evidence).

Picture-naming performance was assessed in a dual-task paradigm. On each trial, subjects named pictures aloud and discriminated one of three different-pitched tones that began 50, 150, or 900 ms after picture onset. The 50 and 150 ms SOAs were chosen so that postponement was likely to occur on all trials, so Task 2 responses should be approximately 100 ms slower in the 50 ms SOA condition than in the 150 ms SOA condition; the 900 ms SOA condition was chosen to show diminished slowing as task overlap decreases markedly. Subjects were instructed to name the picture as quickly as possible while promptly identifying the tone as low, medium, or high with a button press. Thus, Task 1 was picture naming (where the picture is presented at the end of a cloze sentence), while Task 2 was tone discrimination. If the tasks generally are subject to the central bottleneck, then a standard PRP effect (illustrated in the left panel of Figure 2 above) should be observed: tone responses (but not picture-naming responses) should slow as the difference between picture- and tone-onsets decreases. Furthermore, following the logic illustrated in the right panel of Figure 2, if lemma selection specifically is subject to the central bottleneck, then the cloze constraint manipulation should not only affect the speed of picture-naming responses, but it should also affect tone-discrimination latencies by about the same amount. Similarly, if phonological word-form selection is subject to the central bottleneck, then lexical frequency should affect the speed of picture-naming responses and tone-discrimination responses by about the same amount.

Method

Subjects

Sixty members of the UCSD community participated in Experiment 1. Subjects received class credit for participation. All subjects reported learning English as their first language.

Apparatus

Stimuli were presented and responses collected using PsyScope 1.2.5 (Cohen, MacWhinney, Flatt, & Provost, 1993). The software was run on Macintosh 6500/250 computers with 17-inch color monitors. Auditory stimuli were presented through integrated speakers set immediately below the monitor screen. Voice responses were collected with a Shure SM10A unidirectional headworn microphone, which provided input to a Marantz PMD201 standard cassette recorder (for recording voice responses) and a PsyScope response box (for measuring voice onset latencies). The voice key was calibrated separately for each subject. Button-press responses were measured using the three buttons on the PsyScope response box (which are colored red, yellow, and green from left to right). Picture-naming accuracy was evaluated with paper and pen by an experimenter, who monitored experiment performance in the same room with the subject. The tape recording was used to recover any responses that were missed by the experimenter.

Materials

For the picture-naming task, materials were taken from Griffin and Bock (1998). The picture set included 30 pictures with high frequency names and 30 with low frequency names. The high- and low-frequency-name pictures were matched for picture-name agreement and for object-decision latencies, and the high- and low-frequency names themselves were closely matched for length in syllables, length in phonemes and initial phoneme (see Griffin & Bock, 1998). The mean lexical frequency of the high frequency pictures was 110 occurrences per million according to the CELEX spoken frequency word count (Baayen, Piepenbrock, & Van Rijn, 1993) and 183 occurrences per million according to Francis and Kucera (1982), while the lexical frequency of the low frequency pictures was 15 and 28 occurrences per million according to the two counts respectively.

The cloze sentence frames were the same as those in Griffin and Bock (1998). The high-constraint cloze-sentence frames for the high- and low-frequency-name pictures were matched on cloze probability, both when measured as first-response probability (.85 vs. .93) and when measured as first-three-response probability (.97 vs. .98). The low-constraint cloze sentences were designed to be compatible with almost any imageable entity. The complete set of materials is reported in Griffin and Bock (1998).

Design and analysis

Experiment 1 included three independent variables: cloze constraint (high and low), lexical frequency (high and low), and tone SOA (50, 150, or 900 ms). All independent variables were manipulated within subjects (counterbalanced across items). Cloze constraint and tone SOA were manipulated within items (counterbalanced across subjects), and lexical frequency was manipulated between items. Each subject saw each picture stimulus once, so that across the 60 stimuli, each subject saw 5 stimuli in each of the 12 experimental conditions. The picture stimuli were rotated through the within-item factors, such that across the 60 subjects, each picture was presented in each of its 6 within-item conditions (cloze constraint × tone SOA) 10 times.
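The cell counts in this design can be checked with a simple rotation scheme. The sketch below is an assumption about how such counterbalancing could be implemented; the source does not describe the actual rotation, and the `condition` rule and item indexing are hypothetical.

```python
from collections import Counter
from itertools import product

CLOZE = ("high", "low")
TONE_SOA_MS = (50, 150, 900)
WITHIN_ITEM = list(product(CLOZE, TONE_SOA_MS))  # 6 within-item conditions
N_SUBJECTS = 60
N_ITEMS_PER_FREQUENCY = 30                        # 30 high- and 30 low-frequency pictures


def condition(subject, item_index):
    """Hypothetical rotation: shift each item's condition by one step per subject."""
    return WITHIN_ITEM[(subject + item_index) % len(WITHIN_ITEM)]


def subject_list(subject):
    """All 60 trials for one subject as (frequency, item_index, cloze, soa) tuples."""
    trials = []
    for frequency in ("high", "low"):
        for item_index in range(N_ITEMS_PER_FREQUENCY):
            cloze, soa = condition(subject, item_index)
            trials.append((frequency, item_index, cloze, soa))
    return trials


# Items per condition for one subject: 12 cells (2 x 2 x 3).
per_subject = Counter((f, c, s) for f, _, c, s in subject_list(0))
# Presentations of one picture per within-item condition across all subjects.
per_item = Counter(condition(subject, 7) for subject in range(N_SUBJECTS))

print(sorted(per_subject.values()))
print(sorted(per_item.values()))
```

Under this rotation every subject contributes 5 items to each of the 12 cells, and every picture appears 10 times in each of its 6 within-item conditions, matching the counts stated above.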

Response latencies and accuracies were measured for each task. Any trial on which the subject did not accurately name the picture and discriminate the tone was removed from response-latency analysis (a total of 11.4% of trials). The picture-naming latency analysis did not include any trial on which the voice key did not accurately detect the picture-naming response (2.1% of correct trials), or where the picture-naming latency was greater than 2000 ms (an additional 0.8% of correct trials), and the tone response-latency analysis did not include any button-press latency greater than 3000 ms (0.7% of correct observations). We report error performance below in Tables 1 and 2, with 95% confidence interval halfwidths to assist comparison. Because no theoretical implications follow specifically from the error rates, we do not discuss them further.
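The exclusion rules for the latency analyses can be expressed as two filters. This is a sketch of one reading of the criteria, not the authors' analysis code: the `Trial` record and its field names are hypothetical, and only the 2000 ms and 3000 ms cutoffs come from the text.

```python
from dataclasses import dataclass

NAMING_CUTOFF_MS = 2000  # naming latencies above this are excluded
TONE_CUTOFF_MS = 3000    # button-press latencies above this are excluded


@dataclass
class Trial:
    naming_correct: bool
    tone_correct: bool
    voice_key_ok: bool   # voice key detected the naming response accurately
    naming_rt: float     # ms
    tone_rt: float       # ms


def both_correct(trial):
    """A trial enters latency analysis only if both responses were accurate."""
    return trial.naming_correct and trial.tone_correct


def naming_trials(trials):
    """Trials retained for the picture-naming latency analysis."""
    return [t for t in trials
            if both_correct(t) and t.voice_key_ok and t.naming_rt <= NAMING_CUTOFF_MS]


def tone_trials(trials):
    """Trials retained for the tone-discrimination latency analysis."""
    return [t for t in trials if both_correct(t) and t.tone_rt <= TONE_CUTOFF_MS]


# Made-up trials for illustration only.
sample = [
    Trial(True, True, True, 850, 1200),    # kept in both analyses
    Trial(True, False, True, 900, 1100),   # tone error: dropped from both
    Trial(True, True, False, 700, 1300),   # voice-key failure: tone analysis only
    Trial(True, True, True, 2400, 2600),   # slow naming: tone analysis only
    Trial(True, True, True, 1500, 3200),   # slow button press: naming analysis only
]
print(len(naming_trials(sample)), len(tone_trials(sample)))
```

The sketch treats the voice-key and naming-latency criteria as applying only to the naming analysis, and the button-press cutoff as applying only to the tone analysis, which is how the paragraph above reads.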

Table 1.

Mean number of errors per subject for each level of tone SOA, lexical frequency, and cloze probability in Experiment 1.

Task                  Tone SOA   Lexical frequency   Low cloze probability   High cloze probability
Picture-naming        50 ms      Low                 0.43                    0.08
Picture-naming        50 ms      High                0.18                    0.12
Picture-naming        150 ms     Low                 0.37                    0.07
Picture-naming        150 ms     High                0.20                    0.20
Picture-naming        900 ms     Low                 0.27                    0.10
Picture-naming        900 ms     High                0.23                    0.15
Tone-discrimination   50 ms      Low                 0.45                    0.30
Tone-discrimination   50 ms      High                0.37                    0.35
Tone-discrimination   150 ms     Low                 0.67                    0.42
Tone-discrimination   150 ms     High                0.37                    0.33
Tone-discrimination   900 ms     Low                 0.42                    0.38
Tone-discrimination   900 ms     High                0.47                    0.32

Note. The three-way interaction had a 95% confidence-interval halfwidth of 0.17 errors per subject for picture-naming and 0.21 errors per subject for tone-discrimination.

Table 2.

Mean number of errors per subject for each level of tone SOA, distractor relatedness, and distractor SOA in Experiment 2.

Task                  Tone SOA   Distractor relatedness   Distractor SOA 0 ms   Distractor SOA 100 ms
Picture-naming        50 ms      Conceptual               0.31                  0.29
Picture-naming        50 ms      Phonological             0.06                  0.15
Picture-naming        50 ms      Unrelated                0.13                  0.00
Picture-naming        150 ms     Conceptual               0.29                  0.23
Picture-naming        150 ms     Phonological             0.08                  0.00
Picture-naming        150 ms     Unrelated                0.08                  0.06
Picture-naming        900 ms     Conceptual               0.21                  0.27
Picture-naming        900 ms     Phonological             0.02                  0.02
Picture-naming        900 ms     Unrelated                0.04                  0.15
Tone-discrimination   50 ms      Conceptual               0.92                  0.92
Tone-discrimination   50 ms      Phonological             0.94                  0.77
Tone-discrimination   50 ms      Unrelated                0.81                  0.75
Tone-discrimination   150 ms     Conceptual               1.10                  0.81
Tone-discrimination   150 ms     Phonological             0.96                  0.81
Tone-discrimination   150 ms     Unrelated                0.73                  0.69
Tone-discrimination   900 ms     Conceptual               0.58                  0.92
Tone-discrimination   900 ms     Phonological             0.63                  0.69
Tone-discrimination   900 ms     Unrelated                0.52                  0.83

Note. The three-way interaction had a 95% confidence-interval halfwidth of 0.18 errors per subject for picture-naming and 0.33 errors per subject for tone-discrimination.

Both measures for both tasks were analyzed with three-way 2 × 2 × 3 analyses of variance (ANOVAs), using both subjects (F1) and items (F2) as random variables. The ANOVA designs for each analysis correspond to the materials design described above. The effect of frequency was evaluated with planned comparisons of the high- and low-frequency conditions within each level of cloze constraint. We report variability with 95% confidence-interval halfwidths (CIs) based on single degree-of-freedom comparisons, for subjects and for items. All effects reported as significant reached the .05 significance level, unless noted otherwise. All reported means are calculated across subject condition means.

Procedure

Each trial began with the message “<yellow button>.” When the subject pressed the yellow button on the response box, the screen blanked for 1000 ms, and then each word of the cloze sentence was presented in the center of the screen for 285 ms in immediate succession, as in rapid serial visual presentation (RSVP) paradigms. The sentence was presented in bold Courier 14-point font. The picture stimulus immediately replaced the final word of the cloze sentence, which subjects were instructed to name as quickly as possible. The picture remained on the screen until the voice key detected a response. Pictures were presented in a different randomly determined order for each subject.

The auditory stimulus for tone discrimination was presented 50, 150, or 900 ms after picture onset. The tone was 285 ms in duration, and was either low (180 Hz), medium (500 Hz), or high (1200 Hz) in pitch. The pitch of the tone that subjects heard varied randomly from trial to trial, though each subject was presented with an equal number of each pitch across the 60 trials. Subjects were instructed to identify the pitch of the tone promptly, while still naming the picture as quickly as possible. The response buttons were labeled “low,” “medium,” and “high” from left to right, which subjects pressed with the index finger of the left hand and the index and middle fingers of the right hand, respectively. Each trial ended when both a voice-key response and a button-press response were registered. The next trial began following a 500 ms delay. All 60 trials in the experimental session were presented in a single block.
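A pitch sequence that is random from trial to trial but balanced across the session could be generated as follows. This is a hypothetical sketch: the experiment was run in PsyScope, and the function name, seeding, and shuffling approach are assumptions; only the three pitches and the 60-trial session length come from the text.

```python
import random
from collections import Counter

PITCH_HZ = {"low": 180, "medium": 500, "high": 1200}


def pitch_schedule(n_trials=60, seed=None):
    """Return a shuffled list of pitch labels with equal counts of each pitch."""
    per_pitch, remainder = divmod(n_trials, len(PITCH_HZ))
    if remainder:
        raise ValueError("n_trials must be divisible by the number of pitches")
    schedule = [label for label in PITCH_HZ for _ in range(per_pitch)]
    random.Random(seed).shuffle(schedule)  # seeded generator keeps the order reproducible
    return schedule


schedule = pitch_schedule(60, seed=1)
print(Counter(schedule))
print([PITCH_HZ[label] for label in schedule[:5]])
```

Shuffling a pre-balanced list, rather than sampling a pitch independently on each trial, guarantees 20 trials of each pitch per subject.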

Each experimental session began with interactive instructions and a practice session. Subjects first practiced tone discrimination alone for 45 trials (15 of each pitch). They then were given 30 dual-task practice trials, which were identical in structure to the experimental trials described above, but with different pictures and different moderately constraining cloze sentences. Including practice, the experimental session lasted approximately 35 minutes.

Results

Figure 3 shows the mean picture-naming and tone-discrimination response latencies as a function of tone SOA, cloze constraint, and lexical frequency. Error rates are shown in Table 1.

Task 1 performance

The solid lines illustrate picture-naming (Task 1) performance. Subjects named pictures 158 ms more slowly after low-constraint cloze sentences (filled symbols) than after high-constraint cloze sentences (open symbols). In the low-constraint condition, subjects named pictures with low-frequency names (filled squares) 60 ms more slowly than pictures with high frequency names (filled circles), while in the high-constraint condition, subjects named low- and high-frequency-name pictures about equally quickly (an 18 ms difference in the reverse direction). Tone SOA had little effect on picture-naming latencies.

Statistical analyses of picture-naming response-times confirmed these observations. The main effect of cloze constraint was significant (F1(1,59) = 187, CI = ±23 ms; F2(1,58) = 125, CI = ±29 ms), while the main effect of frequency was significant by subjects only (F1(1,59) = 7.9, CI = ±15 ms; F2(1,58) = 1.8, CI = ±45 ms). The interaction between cloze constraint and lexical frequency was also significant (F1(1,59) = 16.4, CI = ±27 ms; F2(1,58) = 8.6, CI = ±31 ms). In the low cloze-constraint condition, low frequency-name pictures were named significantly more slowly than high (F1(1,59) = 19.6, F2(1,58) = 12.5), while in the high cloze-constraint condition, the naming-time difference between the two frequency conditions was not significant (F1(1,59) = 1.7, F2(1,58) < 1). The effect of Tone SOA was only significant by items (F1(2,118) = 1.4, CI = ±25 ms; F2(2,116) = 4.4, CI = ±22 ms), and it did not interact with any other factor (all ps > .1). Although the cloze-constraint by lexical-frequency interaction appears to be weaker in the 150-ms tone SOA condition (see Figure 3), the three-way interaction between tone SOA, cloze constraint, and lexical frequency was not significant (F1(2,118) = 2.4, p < .1, CI = ±38 ms, F2(2,116) = 2.3, CI = ±41 ms).

Task 2 performance

The dashed lines in Figure 3 illustrate tone-discrimination (Task 2) performance. A substantial PRP effect was evident (the slope of the dashed lines), as subjects discriminated tones 472 ms more slowly when tone onset followed picture onset by 50 ms, compared to when it followed by 900 ms (the difference between the 50 ms and the 150 ms SOA conditions was 74 ms, consistent with virtually complete postponement occurring on the great majority of trials). Importantly, the interaction between cloze constraint and lexical frequency observed with picture-naming times occurred also with tone-response times: Subjects’ tone responses were 180 ms slower in the low-constraint cloze-sentence condition than in the high (compared to a 158 ms difference in picture-naming times). In the low-constraint cloze-sentence condition, tone responses were 74 ms slower in the low-frequency picture-name condition than in the high (compared to a 60 ms difference in picture-naming times), whereas in the high-constraint cloze-sentence condition, tone responses showed a 29 ms reverse frequency-effect (compared to an 18 ms difference in the same direction with picture-naming times).

Statistical analyses of the tone response-times confirmed these observations. The main effect of Tone SOA was significant (F1(2,118) = 458, CI = ±33 ms; F2(2,116) = 474, CI = ±34 ms). Tone SOA interacted significantly with constraint (F1(2,118) = 7.9, CI = ±42 ms; F2(2,116) = 4.9, CI = ±52 ms), as the difference between the constraint conditions decreased with increasing tone SOA. No other interaction with tone SOA was significant (all ps > .1). The main effect of cloze constraint was significant (F1(1,59) = 137, CI = ±31 ms; F2(1,58) = 75.5, CI = ±44 ms), while the main effect of frequency was significant only by subjects (F1(1,59) = 4.8, CI = ±21 ms; F2(1,58) = 1.3, CI = ±65 ms). The interaction between cloze constraint and lexical frequency was significant (F1(1,59) = 13.5, CI = ±40 ms; F2(1,58) = 9.1, CI = ±62 ms). In the low-constraint cloze-sentence conditions, tone responses were significantly slower in the low-frequency picture-name condition than in the high (F1(1,59) = 13.9; F2(1,58) = 11.0), whereas in the high-constraint cloze-sentence conditions, tone response-times in the low and high frequency picture-name conditions did not differ (F1(1,59) = 2.1; F2(1,58) < 1).

Discussion

The results of Experiment 1 are straightforward: Picture naming (Task 1) showed a characteristic interaction (Griffin & Bock, 1998) between cloze constraint (a lemma-selection factor) and lexical frequency (a phonological word-form selection factor). The critical finding was that both components of this interaction in Task 1 propagated onto Task 2 latencies: Task 2 latencies were slower when pictures were named in the low-constraint cloze sentence condition compared to the high, and in the low-constraint cloze condition, Task 2 latencies were slower when pictures with low frequency names were named. The propagation of both a lemma-selection effect and a phonological word-form selection effect onto Task 2 performance suggests that both lemma selection and phonological word-form selection are subject to the central bottleneck. That is, critical processing stages in Task 2 cannot begin at least until both lemma selection and phonological word-form selection are complete; any delays incurred by the word production system up to that point are passed on to an unrelated second task.
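The propagation logic can be stated compactly with the standard central-bottleneck equations. The notation here is ours rather than the authors': $A_i$, $B_i$, and $C_i$ denote the durations of the pre-bottleneck, bottleneck, and post-bottleneck stages of Task $i$, and the sketch assumes strictly serial stages with Task 1 always claiming the bottleneck first.

```latex
\begin{align}
RT_1 &= A_1 + B_1 + C_1 \\
RT_2 &= \max\bigl(A_1 + B_1,\; \mathrm{SOA} + A_2\bigr) + B_2 + C_2 - \mathrm{SOA}
\end{align}
```

At short SOAs, where $A_1 + B_1 > \mathrm{SOA} + A_2$, the second line reduces to $RT_2 = A_1 + B_1 + B_2 + C_2 - \mathrm{SOA}$. Any lengthening of $A_1$ or $B_1$ then adds equally to $RT_1$ and $RT_2$, whereas a change in $C_1$ alters only $RT_1$. On this reading, the cloze-constraint and frequency effects behave as if they lengthen $A_1 + B_1$.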

An unusual aspect of the results in both tasks was the slight reversal of the frequency effect in the high-constraint cloze sentence condition. Griffin and Bock (1998) found this same reversal, and suggested that it is due to the fact that the high-constraint cloze sentences for the low-frequency-name pictures are slightly more constraining than the high-constraint cloze sentences for the high frequency-name pictures.

Next, in Experiment 2, we determined whether the stage of phoneme selection is also subject to a central processing bottleneck. At the same time, Experiment 2 was designed to provide converging evidence about the bottleneck effects that come from lemma selection, by manipulating the speed of lemma selection differently in Experiment 2 than in Experiment 1.

Experiment 2

Another way to manipulate the efficiency of the separate processing stages of word production is to use the Stroop-like picture-word interference task. Here, speakers name line-drawn pictures at the same time that they are presented with (and are instructed to ignore) a near-simultaneous distractor word (which can be presented auditorily or visually, though we used visual presentation to avoid conflicts with the Task 2 tone stimuli). Any distractor word presented with a picture slows picture naming, compared to when pictures are named alone. If the distractor word is similar in meaning, or is conceptually related to the picture (e.g., couch for bed), then picture naming is slowed even more than with an unrelated distractor (e.g., Cutting & Ferreira, 1999; Damian & Martin, 1999; Rayner & Springer, 1986; Roelofs, 1992; Schriefers, Meyer, & Levelt, 1990). On the other hand, if the distractor word is similar in sound, or is phonologically related to the name of the picture (e.g., bend for bed), then picture naming is slowed less than with an unrelated distractor (e.g., Damian & Martin, 1999; Meyer & Schriefers, 1991; Schriefers et al., 1990).

The slowing that occurs with conceptually related distractors (compared to unrelated distractors) has been taken to reflect interference in the process of lemma selection (e.g., Cutting & Ferreira, 1999; Levelt et al., 1999; Roelofs, 1992; Schriefers et al., 1990). This follows from the model shown in Figure 1. In such models, a conceptually related distractor shares conceptual features with the to-be-named picture, while an unrelated distractor does not. Thus, when a conceptually related distractor is presented along with a target picture, the distractor’s lemma representation receives activation from two sources: the distractor presentation itself, and the conceptual representation of the target picture. On the other hand, when an unrelated distractor is presented along with a target picture, its lemma representation receives activation only from distractor presentation, and not from the conceptual representation of the target picture. Thus, as the name of the picture is produced, the lemma of a conceptually related distractor word will be more active than the lemma of an unrelated distractor. This additional activation of the conceptually related distractor lemma will especially slow target lemma selection, for example because of lateral inhibition among coactive lemmas (e.g., Cutting & Ferreira, 1999) or because of a choice-ratio selection threshold (e.g., Roelofs, 1992). Support for this characterization comes from the fact that while conceptually related distractors slow production compared to unrelated distractors, the effect of associatively related distractors (sleep for bed) is much less consistent (Cutting & Ferreira, 1999; La Heij, Dirkx, & Kramer, 1990; Lupker, 1979; Wheeldon & Monsell, 1994). Furthermore, even though conceptually related distractors slow picture naming, they have no effect on the speed of object recognition (Humphreys, Lloyd-Jones, & Fias, 1995; Lupker, 1988; Schriefers et al., 1990).
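The competition account can be illustrated with a toy choice-ratio calculation. This is a deliberately simplified sketch, not the implemented model of Roelofs (1992) or of Cutting and Ferreira (1999): the activation values are hypothetical, and we assume that the target lemma is selected on each time step with a probability equal to its share of the total lemma activation.

```python
def choice_ratio(target_activation, competitor_activations):
    """Share of total lemma activation that belongs to the target."""
    total = target_activation + sum(competitor_activations)
    return target_activation / total


def expected_steps_to_selection(ratio):
    """Mean waiting time when selection succeeds with probability `ratio` per step."""
    # Waiting time for a constant per-step probability is geometric, mean 1 / p.
    return 1.0 / ratio


# Hypothetical activation levels (arbitrary units).
TARGET = 1.0
FROM_DISTRACTOR_PRESENTATION = 0.5
FROM_SHARED_CONCEPTUAL_FEATURES = 0.25  # only for a conceptually related distractor

unrelated_steps = expected_steps_to_selection(
    choice_ratio(TARGET, [FROM_DISTRACTOR_PRESENTATION])
)
related_steps = expected_steps_to_selection(
    choice_ratio(
        TARGET,
        [FROM_DISTRACTOR_PRESENTATION + FROM_SHARED_CONCEPTUAL_FEATURES],
    )
)
```

Because the related distractor's lemma receives activation from two sources, the target's share of activation falls and the expected selection time rises, which is the qualitative pattern described above.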

On the other hand, the facilitation that occurs with phonologically related distractors (compared to unrelated distractors) has been taken to reflect faster phoneme selection (Cutting & Ferreira, 1999; Levelt et al., 1999; Meyer & van der Meulen, 2000). Evidence for the claim that phoneme-selection specifically is involved in such facilitation (rather than, say, phonological word-form selection) comes from the fact that picture-naming times decrease not only with the presentation of a whole word that is similar in sound, but also with the presentation of a word fragment (e.g., Schriefers & Teruel, 1999). Since word fragments are not lexically represented in the production lexicon, this suggests that phonologically similar distractors are effective during phoneme selection, since phonemes are the first representations that phonologically related words and word fragments share.

In Experiment 2, we used the procedure of Damian and Martin (1999, Experiment 2). We presented the distractor words at two different times (relative to picture onset), so that we would have a better chance of uncovering both conceptual interference and phonological facilitation (since as noted above, the strength of each kind of effect varies with distractor onset). To assess whether lemma and phoneme selection are subject to a central processing bottleneck, we had subjects perform this picture-naming task as Task 1; Task 2 was the same tone-discrimination task as in Experiment 1. If lemma selection is subject to the central bottleneck (as suggested by the results of Experiment 1), then tone-discrimination responses should be slowed by conceptually related distractors as much as picture-naming responses are. The new question is whether phoneme selection is also subject to the central bottleneck. If so, then the speed of tone responses should be affected by phonologically related distractors as much as the speed of picture naming responses.3 On the other hand, if phonologically related distractors do not affect the speed of tone-discrimination responses (while still affecting the speed of picture-naming responses), it would imply that phoneme selection is not subject to the central bottleneck.

Method

Subjects

Forty-eight new subjects taken from the same population as Experiment 1 participated in Experiment 2.

Apparatus

The same apparatus was used as in Experiment 1, except auditory stimuli were presented through separate pre-amplified speakers (Optimus Model no. PRO-X55AV) placed immediately beside the monitor.

Materials

The picture-naming materials were taken from Damian and Martin (1999) Experiment 2, except that one picture (ring) was eliminated at random, so that the number of conditions would divide evenly into the number of presented pictures. The picture set consisted of 27 highly nameable line-drawn pictures taken from the Snodgrass and Vanderwart (1980) picture set, each measuring approximately 3 in. by 3 in. Three distractor words were assigned to each picture: a conceptually related distractor word, which was taken from the same semantic category as the picture; a phonologically related distractor word, which shared at least the two initial phonemes of the picture name; and an unrelated distractor word, which bore no obvious relationship to the picture meaning or the picture name. The words in the three distractor conditions were matched in terms of length in phonemes and length in letters. All materials are reported in Damian and Martin (1999).

Design and analysis

Experiment 2 included three independent variables: distractor relatedness (conceptually related, phonologically related, or unrelated), distractor SOA (0 ms, with simultaneous picture and distractor-word onset, or 100 ms, where the distractor-word onset followed picture onset by 100 ms), and tone SOA (50, 150, or 900 ms, as in Experiment 1). As in Damian and Martin (1999), distractor relatedness and distractor SOA were manipulated completely within subjects and within items. Tone SOA was also manipulated within subjects, but counterbalanced with the other two factors across speakers and items (to avoid tripling the number of presentations of each picture). Thus, subjects saw each picture six times in the main experiment: twice in each distractor relatedness condition, and three times in each distractor SOA condition. The counterbalancing of tone SOA was such that subjects saw each picture twice in each tone SOA condition, though the tone SOA manipulation was not confounded with any component of the distractor relatedness by distractor SOA interaction. In all, each subject saw 9 items in each cell of the three-factor design, and across the 48 subjects, each picture appeared in each cell 16 times.
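The counts stated in this design can be checked with simple arithmetic. The sketch below uses only the numbers given in the text; the variable names are ours.

```python
N_PICTURES = 27
N_SUBJECTS = 48
RELATEDNESS = ("conceptual", "phonological", "unrelated")
DISTRACTOR_SOA_MS = (0, 100)
TONE_SOA_MS = (50, 150, 900)

# Each picture appears once per relatedness x distractor-SOA combination.
presentations_per_picture = len(RELATEDNESS) * len(DISTRACTOR_SOA_MS)

# How often a subject sees a picture at each level of each factor.
per_relatedness = presentations_per_picture // len(RELATEDNESS)
per_distractor_soa = presentations_per_picture // len(DISTRACTOR_SOA_MS)
per_tone_soa = presentations_per_picture // len(TONE_SOA_MS)

trials_per_subject = N_PICTURES * presentations_per_picture
n_cells = len(RELATEDNESS) * len(DISTRACTOR_SOA_MS) * len(TONE_SOA_MS)

# Items per cell for one subject, and appearances of one picture in one
# cell summed over all subjects.
items_per_cell = trials_per_subject // n_cells
picture_appearances_per_cell = (N_SUBJECTS * presentations_per_picture) // n_cells
```

With 27 pictures and 18 design cells, each subject contributes 162 trials, which is what makes 9 items per cell possible. With the original 28 pictures the trials would not divide evenly, which is one way to read the decision to drop a picture.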

The dependent variables, statistical analyses, and reporting conventions were the same as in Experiment 1, except that the three-way ANOVA designs were 2 × 3 × 3 in Experiment 2. We use pairwise comparisons to evaluate conceptual interference (conceptually related versus unrelated) and phonological facilitation (phonologically related versus unrelated) within each distractor-SOA condition. We excluded 11.9% of trials from the latency analyses because subjects were not accurate on picture naming and on tone discrimination, 2.1% of correct trials because the voice key did not accurately measure the voice response (for the picture-naming latency analyses), 0.4% of correct trials because the picture-naming latency was greater than 2000 ms (for the picture-naming latency analyses), and 0.2% of trials because the tone-discrimination latency was greater than 3000 ms (for the tone-discrimination latency analyses).
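The exclusion rules can be sketched as two filters, one per latency analysis. The `Trial` record and its field names are hypothetical, and we assume that a trial enters the latency analyses only if both the naming response and the tone response were correct; the cutoffs are the ones stated above.

```python
from dataclasses import dataclass

NAMING_CUTOFF_MS = 2000  # from the text
TONE_CUTOFF_MS = 3000    # from the text


@dataclass
class Trial:
    """Hypothetical per-trial record; field names are illustrative only."""
    naming_correct: bool
    tone_correct: bool
    voice_key_ok: bool
    naming_rt_ms: float
    tone_rt_ms: float


def both_correct(trial):
    # Assumption: accuracy on both tasks is required for either analysis.
    return trial.naming_correct and trial.tone_correct


def in_naming_analysis(trial):
    """Voice-key failures and slow naming are dropped from naming latencies."""
    return (both_correct(trial)
            and trial.voice_key_ok
            and trial.naming_rt_ms <= NAMING_CUTOFF_MS)


def in_tone_analysis(trial):
    """Only the tone-latency cutoff applies to the tone latencies."""
    return both_correct(trial) and trial.tone_rt_ms <= TONE_CUTOFF_MS


clean = Trial(True, True, True, 750.0, 1100.0)
voice_key_failure = Trial(True, True, False, 750.0, 1100.0)
slow_tone = Trial(True, True, True, 750.0, 3200.0)
naming_error = Trial(False, True, True, 750.0, 1100.0)
```

Keeping the two filters separate mirrors the text: a voice-key failure removes a trial from the naming analysis but, under these assumptions, leaves its tone latency usable.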

Procedure

The picture-naming procedure followed Damian and Martin (1999, Experiment 2). Each trial began with a fixation point in the center of the screen for 1000 ms. The picture stimulus appeared 500 ms after fixation point offset, and remained on the screen until the voice key detected a response. The distractor word was presented in bold Courier 14-point font simultaneous with or 100 ms after the onset of the picture. The distractor remained on the screen for 200 ms, and was then replaced by a 500 ms visual mask, consisting of seven uppercase Xs. The distractor SOA manipulation was blocked across trials, so that the first half of the experiment occurred with one distractor SOA and the second half with the other. Half of the subjects saw each order of distractor-SOA blocks. Subjects were instructed to name the picture as quickly as possible and to ignore the visual distractor word. Pictures were presented in a fixed, randomly generated order, constrained so that subjects never saw more than three consecutive trials in any distractor relatedness condition.

The details of tone presentation were the same as in Experiment 1. After the subject provided a picture-naming response and a button-press response, the experimenter recorded picture-naming accuracy and voice-key accuracy by pressing a key on the computer keyboard.
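The event timing within a trial can be laid out as a small timeline calculation. This is a sketch built only from the durations stated in the procedure (with tone SOAs as in Experiment 1); the function and key names are ours, and all times are in ms from fixation onset.

```python
FIXATION_MS = 1000
FIXATION_OFFSET_TO_PICTURE_MS = 500
DISTRACTOR_MS = 200
MASK_MS = 500
DISTRACTOR_SOAS_MS = (0, 100)
TONE_SOAS_MS = (50, 150, 900)


def trial_timeline(distractor_soa_ms, tone_soa_ms):
    """Return stimulus event times (ms from fixation onset) for one trial."""
    if distractor_soa_ms not in DISTRACTOR_SOAS_MS:
        raise ValueError("distractor SOA must be 0 or 100 ms")
    if tone_soa_ms not in TONE_SOAS_MS:
        raise ValueError("tone SOA must be 50, 150, or 900 ms")
    picture_on = FIXATION_MS + FIXATION_OFFSET_TO_PICTURE_MS
    distractor_on = picture_on + distractor_soa_ms
    mask_on = distractor_on + DISTRACTOR_MS  # mask replaces the distractor
    return {
        "fixation_on": 0,
        "fixation_off": FIXATION_MS,
        "picture_on": picture_on,
        "distractor_on": distractor_on,
        "mask_on": mask_on,
        "mask_off": mask_on + MASK_MS,
        "tone_on": picture_on + tone_soa_ms,  # both SOAs are relative to picture onset
    }


timeline = trial_timeline(distractor_soa_ms=100, tone_soa_ms=50)
```

One consequence visible in this example is that, with a 100 ms distractor SOA and a 50 ms tone SOA, the tone begins before the distractor word appears.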

At the beginning of each session, subjects were presented with all 27 pictures from the experiment, one at a time, each along with its name. Subjects were instructed to study each picture and its name long enough so that they could accurately name the picture. Subjects were then given practice at tone discrimination, as in Experiment 1. After interactive instructions, subjects proceeded through two practice phases, in which they saw the same 27 pictures to name as in the main experiment, in different fixed, randomly determined orders, and heard the tones to discriminate. In the first phase, no distractors were presented, while in the second, a new unrelated distractor was presented 0 ms or 100 ms after picture onset. The entire session lasted approximately 30 minutes.

Results

Figure 4 shows the mean picture-naming and tone-discrimination response latencies as a function of distractor type and tone SOA separately for the 0-ms distractor-SOA condition (top graph) and the 100-ms distractor-SOA condition (bottom graph). Error rates are reported in Table 2.

Figure 4.

Picture-naming and tone-discrimination latencies as a function of task SOA, distractor relatedness, and distractor SOA, from Experiment 2. Error bars indicate standard errors.

Task 1 performance

Picture-naming latencies are illustrated with solid lines. At both distractor SOAs, conceptually related distractors (marked with squares) slowed picture-naming latencies compared to unrelated distractors (marked with Xs), though the effect was larger in the 0 ms distractor-SOA condition (72 ms) than in the 100 ms distractor-SOA condition (28 ms). Phonological distractors (marked with circles), on the other hand, facilitated picture-naming times compared to unrelated distractors both in the 0 ms distractor-SOA condition (by 47 ms) and in the 100 ms distractor-SOA condition (by 56 ms). Tone SOA had little effect on picture naming.

Statistical analyses of the picture-naming response-latencies confirm these observations. The main effect of distractor relatedness was significant (F1(2,94) = 97.5, CI = ±14 ms; F2(2,52) = 49.9, CI = ±19 ms), while the main effect of distractor SOA was significant by items only (F1(1,47) = 2.3, CI = ±16 ms; F2(1,26) = 9.9, CI = ±9 ms). The interaction between these two factors was significant (F1(2,94) = 12.5, CI = ±13 ms; F2(2,52) = 6.5, CI = ±20 ms), reflecting the fact that only conceptually related distractors were sensitive to the distractor SOA manipulation (a 38 ms difference; the phonologically related and unrelated distractors showed 4 ms and 5 ms differences respectively). Conceptual interference (conceptually related versus unrelated distractor conditions) was significant in the 0 ms distractor-SOA condition (t1(94) = 10.9, t2(52) = 6.8), and was significant by subjects only in the 100 ms distractor-SOA condition (t1(94) = 4.2, t2(52) = 2.0). Phonological facilitation (phonologically related versus unrelated distractor conditions) was significant both in the 0 ms distractor-SOA condition (t1(94) = 7.1, t2(52) = 4.7) and in the 100 ms distractor-SOA condition (t1(94) = 8.5, t2(52) = 5.6). The effect of tone SOA was significant only by items (F1(2,94) = 0.73, CI = ±32 ms; F2(2,52) = 3.9, CI = ±13 ms), and did not interact with any other factor (all ps > .1).

Task 2 performance

Tone-discrimination performance is illustrated with dashed lines in Figure 4. A substantial PRP effect was observed (the slope of the dashed lines), as tone-discrimination responses were 484 ms slower when tone onset followed picture onset by 50 ms compared to when it followed by 900 ms (the difference between the 50 ms and 150 ms SOA conditions was 84 ms). As observed in the picture-naming task, conceptually related distractors affected the speed of tone-discrimination responses (compared to unrelated distractors). In the 0 ms distractor-SOA condition, tone-discrimination response-times were 126 ms slower in the conceptually related than in the unrelated distractor conditions (compared to a 72 ms difference in picture naming), and in the 100 ms distractor-SOA condition, tone-discrimination response-times were 66 ms slower in the conceptually related than in the unrelated distractor condition (compared to a 28 ms difference in picture naming). On the other hand, unlike in the picture-naming task, phonologically related distractors had little effect on tone-discrimination responses. In the 0 ms distractor SOA condition, phonological distractors slowed tone-discrimination responses by 14 ms (at the same time as they facilitated picture-naming by 47 ms), and in the 100 ms distractor SOA condition, phonological distractors facilitated tone-discrimination responses by only 20 ms (compared to a 56 ms facilitatory effect in picture naming).

Statistical analyses of tone-discrimination response-latencies confirm these observations. The PRP effect was supported by a significant effect of tone SOA (F1(2,94) = 690, CI = ±28 ms; F2(2,52) = 808, CI = ±26 ms). Tone SOA also interacted significantly with distractor relatedness (F1(4,188) = 6.6, CI = ±33 ms; F2(4,104) = 5.6, CI = ±36 ms), reflecting the fact that conceptually related distractors slowed tone-discrimination responses less (compared to unrelated distractors) as tone SOA increased. The main effect of distractor relatedness was significant (F1(2,94) = 59.8, CI = ±20 ms; F2(2,52) = 35.9, CI = ±26 ms) while the main effect of distractor SOA was only marginally significant by items (F1(1,47) = 1.0, CI = ±26 ms; F2(1,26) = 3.1, p < .1, CI = ±14 ms). Distractor relatedness and distractor SOA interacted significantly (F1(2,94) = 4.3, CI = ±29 ms; F2(2,52) = 3.2, CI = ±32 ms), as conceptually related distractors were sensitive to the distractor-SOA manipulation (a 42 ms difference), while phonologically related and unrelated distractors were not (16 ms and 18 ms differences respectively). Conceptual interference was significant both in the 0 ms distractor-SOA condition (t1(94) = 8.7; t2(52) = 7.5) and in the 100 ms distractor-SOA condition (t1(94) = 4.5, t2(52) = 4.0). Phonological facilitation, on the other hand, was not significant in either distractor-SOA condition (0 ms: t1(94) = 1.0, t2(52) = 0.7; 100 ms: t1(94) = 1.4, t2(52) = 1.4). The only other significant effect was the distractor relatedness by tone SOA interaction (F1(4,188) = 6.6, CI = ±33 ms; F2(4,104) = 5.6, CI = ±36 ms), which is due to the small crossover of the phonologically related and the unrelated conditions from the 150-ms tone SOA condition to the 900-ms tone SOA condition. No other effects approached significance (all ps > .1).

Discussion

Experiment 2 shows that when picture naming was slowed by conceptually related distractors, tone-discrimination latencies too were slowed by a comparable amount. On the other hand, when picture naming was facilitated by phonologically related distractors, tone-discrimination latencies were unaffected. Hence, these results suggest that lemma selection is subject to the central processing bottleneck (converging with the conclusion from Experiment 1), while phoneme selection is not. The implications of this result for the modular versus central nature of word production are explored in the General Discussion.

An unexpected aspect of the results from Experiment 2 is that conceptually related distractors appeared to affect the speed of tone-discrimination responses substantially more than they affected the speed of picture-naming responses (by 89 ms in tone discrimination vs. 48 ms in picture naming, when collapsed across all other factors). This is not due simply to the fact that tone discrimination response-latencies were slower overall, as it occurred also in the 900-ms SOA condition (69 ms in tone discrimination vs. 40 ms in picture naming), where overall response latencies for the two measures are comparable. (Note that the larger overall effects in tone discrimination in Experiment 1 are likely to be due to such scaling factors, as is apparent from performance in the 900 ms tone-SOA condition.) This effect might be due to post-response monitoring processes that are subject to the central bottleneck (e.g., Welford, 1952). That is, if on some trials, naming responses are subsequently monitored by subjects for accuracy, and if the monitoring processes are subject to the central bottleneck and begin before the bottleneck-sensitive processing stages of the tone-discrimination task, then that post-response monitoring would slow tone discrimination beyond what is reflected by picture naming latencies. It would not be surprising if the conceptually related distractor condition made post-response monitoring especially likely, as conceptually related distractors should engender the most uncertainty about the accuracy of the picture-naming response (and indeed, the conceptually related distractors led to the largest number of picture-naming errors; see Table 2). However, more research is necessary to determine whether this explanation is correct.

Another notable aspect of the results from both experiments concerns the propagation effects that appeared when the Tone SOA was 900 ms. Specifically, since mean picture-naming times are less than 900 ms, propagation is observed even when a significant proportion of trials involved tones that began after the picture-naming response was initiated. Note, however, that PRP effects (and propagation effects) might be expected to outlive response times, either because of post-response monitoring (Welford, 1952, 1980), because of a post-response refractory period (Rabbitt, 1969), or because the process of switching from a Task 1 response to a Task 2 response is relatively slow (Pashler, 1998).4

Finally, one concern about both experiments comes from the possibility that tone-discrimination might be linguistically mediated (especially since the button responses were labeled with words). If so, then the dual-task interference observed here might not reflect the influence of a linguistic upon a nonlinguistic task. However, this explanation is rendered unlikely by the fact that the response mapping was compatible (low, medium, and high proceeded from left to right), so that after practice, subjects almost never looked at the button labels. It is also notable that in the 900 ms SOA condition in Experiment 2, tone-discrimination response times were slightly faster than picture naming times, suggesting that tone-discrimination performance is unlikely to involve an extra processing step (tone percept to verbal label to button press), compared to picture-naming performance (picture percept to verbal label).

General Discussion

The experiments presented here make three points: (a) The word production stage of lemma selection is subject to a central processing bottleneck, as picture-naming manipulations that slowed lemma selection also slowed tone-discrimination. In Experiment 1, pictures presented after low-constraint cloze sentences were not only named more slowly than those presented after high-constraint cloze sentences, but Task 2 tone-discrimination responses were affected similarly. Also, in Experiment 2, conceptually related distractor words presented in the picture-word interference task not only slowed picture naming compared to unrelated distractor words, but they slowed Task 2 tone-discrimination responses as well. (b) Phonological word-form selection is also subject to the central bottleneck, as manipulating the speed of phonological word-form selection also affected tone-discrimination performance. In Experiment 1, not only were low frequency picture names produced more slowly than high frequency picture names (in the low-constraint cloze sentence condition), but tone-discrimination latencies were also slower when speakers named low- rather than high-frequency-name pictures. (c) In contrast, phoneme selection is not subject to the central bottleneck, as manipulating the speed of phoneme selection did not affect tone-discrimination responses. In Experiment 2, pictures were named more quickly when a phonologically related distractor word was presented, but tone-discrimination responses were not systematically affected. This pattern of sensitivity of the processing stages of word production to the central bottleneck is illustrated in PRP terms in Figure 5.

Figure 5.

Schematic of the stages of word production in terms of their sensitivity to a central processing bottleneck (shaded rectangles represent processes in each task that are subject to a central processing bottleneck, and therefore cannot occur simultaneously).
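The pattern summarized in this schematic can be reproduced with a minimal simulation of a central-bottleneck account. This is a sketch under stated assumptions, not the authors' model: every stage duration is hypothetical, picture perception is assumed to precede the bottleneck, and stages are assumed to run strictly in series with the naming task claiming the bottleneck first.

```python
# Hypothetical stage durations in ms; none of these values come from the paper.
BASE = {
    "picture_perception_ms": 150,  # Task 1, assumed pre-bottleneck
    "lemma_ms": 150,               # Task 1, bottleneck
    "word_form_ms": 150,           # Task 1, bottleneck
    "phoneme_ms": 150,             # Task 1, assumed post-bottleneck
    "tone_perception_ms": 100,     # Task 2, pre-bottleneck
    "tone_selection_ms": 150,      # Task 2, bottleneck
    "tone_execution_ms": 100,      # Task 2, post-bottleneck
}


def simulate(stages, tone_soa_ms):
    """Return (naming RT, tone RT) in ms, each timed from its own stimulus onset."""
    bottleneck_free = (stages["picture_perception_ms"]
                       + stages["lemma_ms"] + stages["word_form_ms"])
    naming_rt = bottleneck_free + stages["phoneme_ms"]
    tone_ready = tone_soa_ms + stages["tone_perception_ms"]
    # Tone response selection waits for whichever of the two comes later.
    selection_start = max(bottleneck_free, tone_ready)
    tone_rt = (selection_start + stages["tone_selection_ms"]
               + stages["tone_execution_ms"] - tone_soa_ms)
    return naming_rt, tone_rt


def effect(stage, delta_ms, tone_soa_ms=50):
    """Change in (naming RT, tone RT) when one stage is lengthened by delta_ms."""
    changed = dict(BASE, **{stage: BASE[stage] + delta_ms})
    base_rts = simulate(BASE, tone_soa_ms)
    new_rts = simulate(changed, tone_soa_ms)
    return new_rts[0] - base_rts[0], new_rts[1] - base_rts[1]


lemma_effect = effect("lemma_ms", 100)          # slowed lemma selection
word_form_effect = effect("word_form_ms", 60)   # slowed word-form selection
phoneme_effect = effect("phoneme_ms", -50)      # speeded phoneme selection
```

At the 50 ms tone SOA, lengthening either bottleneck stage shifts both response times by the same amount, while speeding the post-bottleneck stage changes naming time only. In this toy version nothing propagates at the 900 ms SOA because the stages never overlap; the observed data did show some propagation there, for reasons discussed earlier in the text.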

These results thus show that some of the processing stages of word-production use central processing mechanisms. That is, while lemma-selection processes narrow down an active set of conceptual features to an optimal single word, and return the syntactic properties of that word, and while phonological word-form selection processes retrieve a phonologically defined representation that specifies the morphological features of that word, central processes in unrelated tasks cannot operate. This point is emphasized especially by the fact that a variable as language-specific as lexical frequency affected not only performance of a linguistic task but also a concurrently performed non-linguistic task. However, not all processing stages of word production are dependent on central mechanisms. In particular, phoneme selection seems to operate in a more modular fashion, such that while phoneme selection processes take the activated phonological word form, and retrieve the individual sounds of that word, central processing in unrelated tasks can take place (though the independence may be general to relatively late processes; see also van Galen & ten Hoopen, 1976).

Why might this pattern of central processing and modularity have emerged? Dual-task investigations have revealed that processes that give rise to central bottleneck effects generally involve response selection rather than response execution. Response-selection processes involve determining an appropriate response for some input representation (e.g., Karlin & Kestenbaum, 1968; Hawkins, Church, & de Lemos, 1978). Response-execution processes, on the other hand, execute a response once it is selected (e.g., Pashler & Christian, 1994). The results of these experiments thus suggest that both lemma selection and phonological word-form selection are response-selection processes: they involve the selection of an appropriate response in a fashion that is critically similar to the response selection demands that arise in a task such as tone-discrimination. Phoneme selection, on the other hand, is a response execution process, so that once a phonological word-form is selected, response selection is complete, implying that the intended picture-naming response has been fully determined.

Given that word-production is sensitive to bottleneck effects at all, it is not too surprising that lemma selection involves significant response-selection demands, since it is the first stage at which a speaker decides which word to use to express a particular concept. The fact that phonological word-form selection involves response selection may come from the fact that selecting a phonological word form determines important aspects of a lexical response that can vary from one lexical response to the next (especially considering the prefixation, suffixation, and compounding possibilities that phonological word-form representations underlie, though these are not relevant to task performance here). Finally, the weak response-selection demands of phoneme selection may come from the fact that the mapping from a phonological word form to a sequence of phonemes is highly consistent in a language user’s experience, especially if the syllabic structure of that sequence of phonemes is determined at a different processing stage (Levelt et al., 1999).

One general observation reinforces the conclusion that the processing dynamics of lemma selection and phonological word-form selection act as response-selection processes, whereas phoneme selection does not. In everyday speech, it is not unusual for lemma selection processes to halt in the middle of the speech stream. This occurs when a speaker does not know what the right word is to express some concept. Similarly, it is not uncommon for phonological word-form selection processes to halt, in the familiar tip-of-the-tongue state (e.g., Brown & McNeill, 1966; Jones & Langford, 1987; Meyer & Bock, 1992; see Levelt, 1989 for discussion). Here, a speaker knows the word she or he wishes to say (i.e., the lemma has been selected), but does not know how it sounds (the phonological word-form is inaccessible). In contrast, it seems uncommon for phoneme selection to halt in this way. Given that a speaker knows the overall phonological shape of the word to be produced, it seems unusual for a speaker to stop mid-speech stream, unable to retrieve a particular sound. This observation suggests that both lemma selection and phonological word-form selection involve decision processes that are sufficiently open-ended that they sometimes do not run to completion, but that the process of phoneme selection is a more closed process that, while still error prone, nevertheless consistently operates to conclusion (see Pashler, 1998, for related discussion).

One further implication of this investigation is that the dual-task methodology used here can determine not only the nature of the relationship between linguistic and nonlinguistic processes, but also the nature of the word-production process itself. For example, the phonological facilitation observed in Experiment 2 could in principle have reflected either a phoneme-selection effect or a phonological word-form selection effect (since phonological word forms are claimed to have phonological content). The results here, however, show that phonological facilitation must affect a processing stage different from that affected by lexical frequency, since lexical frequency propagated dual-task interference but phonological facilitation did not. Since lexical frequency has been taken to affect phonological word-form selection, these results imply that phonological facilitation effects indeed arise from phoneme selection rather than from phonological word-form selection (converging with the observation that comes from the Schriefers & Teruel, 1999, experiments discussed above). Importantly, the locus of other word-production factors and other time-course issues could similarly be addressed with dual-task paradigms like the ones used here.
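The inference in the preceding paragraph rests on the central bottleneck logic: lengthening a Task 1 stage at or before the bottleneck should delay Task 2 by the same amount at short SOAs, while lengthening a post-bottleneck stage should not. A minimal sketch of that logic follows, assuming a simple three-stage task (pre-bottleneck, central, post-bottleneck). The class and function names, the stage durations, and the SOA value are all hypothetical and are not taken from the experiments.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical three-stage task; all durations in ms."""
    pre: int      # pre-bottleneck stage (e.g., perception)
    central: int  # stage that occupies the central bottleneck
    post: int     # post-bottleneck stage (e.g., response execution)


def dual_task_rts(task1: Task, task2: Task, soa: int) -> tuple[int, int]:
    """Return (RT1, RT2), each measured from its own stimulus onset."""
    rt1 = task1.pre + task1.central + task1.post
    # Task 1 releases the bottleneck once its central stage ends.
    bottleneck_free = task1.pre + task1.central
    # Task 2's central stage waits for its own pre-bottleneck stage
    # and for the bottleneck to be released, whichever is later.
    central2_start = max(soa + task2.pre, bottleneck_free)
    rt2 = central2_start + task2.central + task2.post - soa
    return rt1, rt2


def propagation(baseline: Task, slowed: Task, task2: Task, soa: int) -> tuple[int, int]:
    """Effect of a Task 1 manipulation on (RT1, RT2): slowed minus baseline."""
    rt1_base, rt2_base = dual_task_rts(baseline, task2, soa)
    rt1_slow, rt2_slow = dual_task_rts(slowed, task2, soa)
    return rt1_slow - rt1_base, rt2_slow - rt2_base


# Illustrative (made-up) durations: naming is Task 1, tone discrimination is Task 2.
tone = Task(pre=100, central=150, post=100)
naming = Task(pre=150, central=400, post=200)
central_slowed = Task(pre=150, central=450, post=200)  # +50 ms in the bottleneck stage
post_slowed = Task(pre=150, central=400, post=250)     # +50 ms after the bottleneck

central_effect = propagation(naming, central_slowed, tone, soa=50)
post_effect = propagation(naming, post_slowed, tone, soa=50)
print(central_effect)  # (50, 50): the effect propagates to Task 2
print(post_effect)     # (50, 0): the effect stays within Task 1
```

With these made-up durations, the 50 ms added to the bottleneck stage shows up in both response times, whereas the same 50 ms added after the bottleneck shows up only in Task 1. That is the pattern the argument above relies on when it separates the locus of lexical frequency from that of phonological facilitation.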

It should be noted that our conclusions rest on the assumption that the manipulations in each experiment target the processing stages in the manner described above. A special concern in this regard is that some models of word production (e.g., Caramazza, 1997; Caramazza & Miozzo, 1997) question whether lexical frequency affects a stage of phonological word-form selection that is separate from lemma selection. While there is good evidence that lexical frequency affects phonological word-form selection (like that described above; for further discussion, see Griffin & Bock, 1998; Levelt et al., 1999; Roelofs, Meyer, & Levelt, 1998), our results are compatible with the possibility that lexical frequency (like cloze constraint and conceptual-distractor relatedness) affects a first stage of lexical access, provided that this first stage (a) is subject to a central processing bottleneck, and (b) can account for the cloze-constraint-by-lexical-frequency interaction initially observed by Griffin and Bock (1998) and replicated here.

Overall, the results of the above experiments show that linguistic processes, even those involved in a task as highly specialized and efficient as word production, are not carried out in a modular, cognitively independent fashion. As speakers produce words, their ability to perform nonlinguistic tasks is briefly hampered. But looking at word production in detail reveals that not all aspects of word production are central in this manner: word production includes some processing stages that impose little or no central processing demand.

Acknowledgments

We thank Kay Bock, Markus Damian, Gary Dell, Zenzi Griffin, Scott Watter, and three anonymous reviewers for helpful discussions and comments, and Scott Baldwin, Carla Firato, Erin Rogers, and Josh Wilson for assistance in collecting and coding data. Address requests for reprints to Vic Ferreira at the Department of Psychology, University of California at San Diego, La Jolla, CA 92093-0109. Email: ferreira@psy.ucsd.edu.

Footnotes

1

Note that some theories of word production claim that these processing levels are discrete (e.g., Levelt et al., 1999), whereas others claim they are interactive (e.g., Dell, 1986). Importantly, the logic ahead does not depend on this distinction, so long as a model of word production involves an explicit selection of a representation at each stage, which involves modular or central processing mechanisms.

2

This account can be made compatible with approaches to conceptual structure that assume non-decompositional representations by claiming that a high-constraint cloze sentence activates a lexical item’s conceptual representation more strongly than a low-constraint cloze sentence does.

3

Note that the logic of Figure 2 does not depend on whether an experimental manipulation speeds or slows a particular processing stage.

4

Also, in the 900 ms SOA condition, Task 2 response times are faster than Task 1’s, even though propagation is still evident. Note, however, that the logic of Figure 2 does not depend on Task 2 RTs being longer than Task 1’s (for example, if the final processing stages of Task 2 are very short).
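The arithmetic behind this footnote can be made explicit under the standard three-stage bottleneck account. This is a sketch, not the authors' own formalization: the symbols A_i, B_i, and C_i (the pre-bottleneck, bottleneck, and post-bottleneck stage durations of Task i) and the numbers plugged in below are illustrative assumptions, with each RT measured from its own stimulus onset.

```latex
\begin{align*}
RT_1 &= A_1 + B_1 + C_1,\\
RT_2 &= \max(A_1 + B_1,\ \mathrm{SOA} + A_2) + B_2 + C_2 - \mathrm{SOA}.
\end{align*}
% Illustrative values (ms): A_1 + B_1 = 1050, C_1 = 200,
% A_2 = 100, B_2 = C_2 = 50, SOA = 900.
\begin{align*}
RT_1 &= 1050 + 200 = 1250,\\
RT_2 &= \max(1050,\ 1000) + 50 + 50 - 900 = 250.
\end{align*}
```

In this example Task 2 is much faster than Task 1, yet because A_1 + B_1 still exceeds SOA + A_2, any lengthening of A_1 or B_1 adds equally to both response times.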

Portions of this work were presented at the 41st Annual Meeting of the Psychonomic Society and at AMLaP-2001 in Saarbrücken, Germany.

This work was supported by grants from the National Institute of Mental Health (R01-MH64733 and R01-MH45584).

References

  1. Baayen RH, Piepenbrock R, van Rijn H. The CELEX lexical database [CD-ROM]. Philadelphia, PA: Linguistic Data Consortium, University of Pennsylvania; 1993.
  2. Brown R, McNeill D. The “tip of the tongue” phenomenon. Journal of Verbal Learning and Verbal Behavior. 1966;5:325–337.
  3. Caramazza A. How many levels of processing are there in lexical access? Cognitive Neuropsychology. 1997;14:177–208.
  4. Caramazza A, Miozzo M. The relation between syntactic and phonological knowledge in lexical access: Evidence from the ‘tip-of-the-tongue’ phenomenon. Cognition. 1997;64:309–343. doi: 10.1016/s0010-0277(97)00031-0.
  5. Cohen JD, Dunbar K, McClelland JL. On the control of automatic processes: A parallel distributed processing account of the Stroop effect. Psychological Review. 1990;97:332–361. doi: 10.1037/0033-295x.97.3.332.
  6. Cohen JD, MacWhinney B, Flatt M, Provost J. PsyScope: An interactive graphic system for designing and controlling experiments in the psychology laboratory using Macintosh computers. Behavior Research Methods, Instruments, and Computers. 1993;25:257–271.
  7. Cutting JC, Ferreira VS. Semantic and phonological information flow in the production lexicon. Journal of Experimental Psychology: Learning, Memory, & Cognition. 1999;25(2):318–344. doi: 10.1037//0278-7393.25.2.318.
  8. Damian MF, Martin RC. Semantic and phonological codes interact in single word production. Journal of Experimental Psychology: Learning, Memory, & Cognition. 1999;25(2):345–361. doi: 10.1037//0278-7393.25.2.345.
  9. del Viso S, Igoa JM, García-Albea JE. On the autonomy of phonological encoding: Evidence from slips of the tongue in Spanish. Journal of Psycholinguistic Research. 1991;20:161–185.
  10. Dell GS. A spreading-activation theory of retrieval in sentence production. Psychological Review. 1986;93:283–321.
  11. Dell GS. Effects of frequency and vocabulary type on phonological speech errors. Language and Cognitive Processes. 1990;5:313–349.
  12. Dell GS, Newman JE. Detecting phonemes in fluent speech. Journal of Verbal Learning and Verbal Behavior. 1980;19:608–623.
  13. Dell GS, Schwartz MF, Martin N, Saffran EM, Gagnon DA. Lexical access in aphasic and nonaphasic speakers. Psychological Review. 1997;104:801–838. doi: 10.1037/0033-295x.104.4.801.
  14. Dell’Acqua R, Pascali A, Peressotti F. Modulazione di una variabile percettiva in un contesto di doppio-compito. Giornale Italiano di Psicologia. 2000;27(4):843–848.
  15. Federmeier KD, Kutas M. Meaning and modality: Influences of context, semantic memory organization, and perceptual predictability on picture processing. Journal of Experimental Psychology: Learning, Memory, & Cognition. 2001;27(1):202–224.
  16. Ford M, Holmes VM. Planning units and syntax in sentence production. Cognition. 1978;6:35–53.
  17. Francis WN, Kucera H. Frequency analysis of English usage: Lexicon and grammar. Boston: Houghton Mifflin Co.; 1982.
  18. Goldman-Eisler F. Psycholinguistics: Experiments in spontaneous speech. London: Academic Press; 1968.
  19. Griffin ZM. Gaze durations during speech reflect word selection and phonological encoding. Cognition. 2001;82(1):B1–B14. doi: 10.1016/s0010-0277(01)00138-x.
  20. Griffin ZM, Bock K. Constraint, word frequency, and levels of processing in spoken word production. Journal of Memory and Language. 1998;38:313–338.
  21. Hawkins HL, Church M, de Lemos S. Time-sharing is not a unitary ability (2). Center for Cognitive and Perceptual Research, University of Oregon; 1978 June.
  22. Humphreys GW, Lloyd-Jones TJ, Fias W. Semantic interference effects on naming using a post-cue procedure: Tapping the links between semantics and phonology with words and pictures. Journal of Experimental Psychology: Learning, Memory, & Cognition. 1995;21:961–980.
  23. Jescheniak JD, Levelt WJM. Word frequency effects in speech production: Retrieval of syntactic information and of phonological form. Journal of Experimental Psychology: Learning, Memory, & Cognition. 1994;20:824–843.
  24. Jones GV, Langford S. Phonological blocking in the tip of the tongue state. Cognition. 1987;26:115–122. doi: 10.1016/0010-0277(87)90027-8.
  25. Karlin L, Kestenbaum R. Effects of number of alternatives on the psychological refractory period. Quarterly Journal of Experimental Psychology. 1968;20:167–178. doi: 10.1080/14640746808400145.
  26. la Heij W, Dirkx J, Kramer P. Categorical interference and associative priming in picture naming. British Journal of Psychology. 1990;81(4):511–525.
  27. Levelt WJM. Speaking: From intention to articulation. Cambridge, MA: MIT Press; 1989.
  28. Levelt WJM, Roelofs A, Meyer AS. A theory of lexical access in speech production. Behavioral & Brain Sciences. 1999;22(1):1–75. doi: 10.1017/s0140525x99001776.
  29. Levy J, Pashler H. Is dual-task slowing instruction dependent? Journal of Experimental Psychology: Human Perception & Performance. 2001;27(4):862–869.
  30. Lupker SJ. The semantic nature of response competition in the picture-word interference task. Memory & Cognition. 1979;7:485–495.
  31. Lupker SJ. Picture naming: An investigation of the nature of categorical priming. Journal of Experimental Psychology: Learning, Memory, & Cognition. 1988;14:444–455.
  32. Meyer AS, Bock JK. The tip-of-the-tongue phenomenon: Blocking or partial activation? Memory & Cognition. 1992;20:715–726. doi: 10.3758/bf03202721.
  33. Meyer AS, Schriefers H. Phonological facilitation in picture-word interference experiments: Effects of stimulus onset asynchrony and types of interfering stimuli. Journal of Experimental Psychology: Learning, Memory, & Cognition. 1991;17(6):1146–1160.
  34. Meyer AS, Sleiderink AM, Levelt WJM. Viewing and naming objects: Eye movements during noun phrase production. Cognition. 1998;66(2):B25–B33. doi: 10.1016/s0010-0277(98)00009-2.
  35. Meyer AS, van der Meulen FF. Phonological priming effects on speech onset latencies and viewing times in object naming. Psychonomic Bulletin & Review. 2000;7:314–319. doi: 10.3758/bf03212987.
  36. Meyer DE, Kieras DE. A computational theory of executive cognitive processes and multiple-task performance: I. Basic mechanisms. Psychological Review. 1997;104:3–65. doi: 10.1037/0033-295x.104.1.3.
  37. Pashler H. Dissociations and dependencies between speed and accuracy: Evidence for a two-component theory of divided attention in simple tasks. Cognitive Psychology. 1989;21:469–514.
  38. Pashler H. The psychology of attention. Cambridge, MA: MIT Press; 1998.
  39. Pashler H, Christian CL. Bottlenecks in planning and producing vocal, manual, and foot responses. Unpublished manuscript; 1994.
  40. Pashler H, Carrier M, Hoffman J. Saccadic eye movements and dual-task interference. Quarterly Journal of Experimental Psychology: Human Experimental Psychology. 1993;46(1):51–82. doi: 10.1080/14640749308401067.
  41. Pinker S. The language instinct. New York: Morrow; 1994.
  42. Rabbitt P. Psychological refractory delay and response-stimulus interval duration in serial, choice-response tasks. Acta Psychologica, Amsterdam. 1969;30:195–219.
  43. Rapp B, Goldrick M. Discreteness and interactivity in spoken word production. Psychological Review. 2000;107(3):460–499. doi: 10.1037/0033-295x.107.3.460.
  44. Rayner K, Springer CJ. Graphemic and semantic similarity effects in the picture-word interference task. British Journal of Psychology. 1986;77(2):207–222. doi: 10.1111/j.2044-8295.1986.tb01995.x.
  45. Roelofs A. A spreading-activation theory of lemma retrieval in speaking. Cognition. 1992;42:107–142. doi: 10.1016/0010-0277(92)90041-f.
  46. Roelofs A, Meyer AS, Levelt WJM. A case for the lemma/lexeme distinction in models of speaking: Comment on Caramazza and Miozzo (1997). Cognition. 1998;69:219–230. doi: 10.1016/s0010-0277(98)00056-0.
  47. Ruthruff E, Johnston JC, Van Selst M. Why practice reduces dual-task interference. Journal of Experimental Psychology: Human Perception & Performance. 2001;27(1):3–21.
  48. Ruthruff E, Pashler HE, Klaassen A. Processing bottlenecks in dual-task performance: Structural limitation or strategic postponement? Psychonomic Bulletin & Review. 2001;8(1):73–80. doi: 10.3758/bf03196141.
  49. Schachter S, Christenfeld N, Ravina B, Bilous F. Speech disfluency and the structure of knowledge. Journal of Personality and Social Psychology. 1991;60:362–367.
  50. Schachter S, Rauscher F, Christenfeld N, Crone KT. The vocabularies of academia. Psychological Science. 1994;5:37–41.
  51. Schriefers H, Meyer AS, Levelt WJM. Exploring the time course of lexical access in language production: Picture-word interference studies. Journal of Memory and Language. 1990;29:86–102.
  52. Schriefers H, Teruel E. Phonological facilitation in the production of two-word utterances. European Journal of Cognitive Psychology. 1999;11(1):17–50.
  53. Schumacher EH, Lauber EJ, Glass JM, Zurbriggen EL, Gmeindl L, Kieras DE, Meyer DE. Concurrent response-selection processes in dual-task performance: Evidence for adaptive executive control of task scheduling. Journal of Experimental Psychology: Human Perception & Performance. 1999;25(3):791–814.
  54. Snodgrass JG, Vanderwart M. A standardized set of 260 pictures: Norms for name agreement, image agreement, familiarity, and visual complexity. Journal of Experimental Psychology: Human Learning and Memory. 1980;6:174–215. doi: 10.1037//0278-7393.6.2.174.
  55. Telford CW. The refractory phase of voluntary and associative responses. Journal of Experimental Psychology. 1931;14:1–36.
  56. van Galen GP, ten Hoopen G. Speech Control and Single Channelness. Acta Psychologica. 1976;40(3):245–255.
  57. Vince MA. Rapid response sequences and the psychological refractory period. British Journal of Psychology. 1949;40:23–40. doi: 10.1111/j.2044-8295.1949.tb00225.x.
  58. Welford AT. The ‘psychological refractory period’ and the timing of high-speed performance--a review and a theory. British Journal of Psychology. 1952;43:2–19.
  59. Welford AT. Single-channel operation in the brain. Acta Psychologica. 1967;27:5–22. doi: 10.1016/0001-6918(67)90040-6.
  60. Welford AT. On the nature of higher-order skills. Journal of Occupational Psychology. 1980;53(2):107–110.
  61. Wheeldon LR, Monsell S. Inhibition of spoken word production by priming a semantic competitor. Journal of Memory and Language. 1994;33:332–356.
  62. Zelinsky GJ, Murphy GL. Synchronizing visual and language processing: An effect of object name length on eye movements. Psychological Science. 2000;11(2):125–131. doi: 10.1111/1467-9280.00227.
