Abstract
This paper presents an experiment investigating attention allocation in four tasks requiring varied degrees of lexical processing of 1-4 simultaneously displayed words. Response times and eye movements were only modestly affected by the number of words in the asterisk-detection task but increased markedly with the number of words in the letter-detection, rhyme-judgment, and semantic-judgment tasks, suggesting that attention may not be serial for tasks that do not require significant lexical processing (e.g., detecting visual features), but is approximately serial for tasks that do (e.g., retrieving word meanings). The implications of these results for models of readers’ eye movements are discussed.
Our understanding of how perception, cognition, and action are coordinated during reading has benefited from the recent development of computational models of readers’ eye-movement behavior (for an overview of these models, see the 2006 special issue of Cognitive Systems Research). These models have highlighted a number of theoretical issues, such as: How is attention allocated during reading? Existing models of eye-movement control provide a continuum of alternative answers to this question (Reichle, 2006). At one end of this spectrum, serial-attention shift models (e.g., E-Z Reader; Pollatsek, Reichle, & Rayner, 2006; Reichle, Pollatsek, Fisher, & Rayner, 1998; Reichle, Rayner, & Pollatsek, 2003; Reichle, Warren, & McConnell, 2008) posit that attention is allocated strictly serially, with lexical processing being completed on only one word at a time. Near the other end of this spectrum, attention-gradient models (e.g., SWIFT: Engbert, Nuthmann, Richter, & Kliegl, 2005; Glenmore: Reilly & Radach, 2006) posit that attention is allocated as a gradient, with lexical processing distributed across several words simultaneously. This attention gradient is asymmetrical, extending further to the right than to the left, with faster processing of words near the center of the gradient than in the periphery. The present article reports an eye-tracking experiment designed to determine whether the type of lexical processing that occurs during natural reading is more consistent with the assumptions of serial-attention or attention-gradient models. Our results suggest that, although different kinds of lexical processing may be more consistent with serial or parallel assumptions, the type of lexical processing that is necessary to retrieve the pronunciations and/or meanings of words requires that attention be allocated in a manner that is better approximated by serial-attention shift models.
To examine attention allocation during reading, we ran an eye-movement experiment in which participants engaged in four tasks requiring them to indicate the presence of specific target features in arrays of one to four simultaneously displayed words. Participants had to detect an embedded “*” (asterisk-detection), the letter “q” (letter-detection), whether a word rhymed with “blue” (rhyme-judgment), or whether a word referred to an animal (semantic-judgment). The tasks thus manipulated the “depth” to which the words had to be processed (Craik & Lockhart, 1972), with asterisk-detection being a relatively “shallow” task (i.e., it can be performed using only visual information) and semantic-judgment being a relatively “deep” task (i.e., it requires the retrieval of word meanings). Although none of these tasks are equivalent to natural reading, all but the asterisk-detection task required participants to do some degree or type of lexical processing on each of the displayed words. The logic of having participants perform each of these tasks on 1-4 simultaneously displayed words was to determine how the observed response latencies and patterns of eye movements would be affected by the number of words that had to be concurrently processed in order to perform the tasks. Because word identification is the central component of natural reading, our tasks provide an index of how the allocation of attention among 1-4 simultaneously displayed words might be expected to influence natural reading. In other words, our tasks were designed to look at how lexical processing—which is arguably the primary processing bottleneck in natural reading—is affected by the number of words that have to be processed.
Our predictions were as follows: First, if attention is allocated serially in any of the tasks, then the response latencies in those tasks should increase with each additional word to be concurrently processed. This increase will be due to attention being focused on only one word for the amount of time that is necessary to process that word, with each additional word requiring a shift of attention. In contrast, if attention is allocated as a gradient, then the response latencies should be less affected by the number of words because attention will be distributed across more than one word at a time 1. Second, the patterns of eye movements (e.g., the number of fixations per trial and/or their durations) should mirror the response latencies, with serial processing being indicated by a strong linear relationship between these measures and the number of words displayed per trial, and parallel processing being indicated by a weaker (or absent) relationship. Of course, these predictions must be qualified by the possibility that the task demands may lead to more or less parallel processing; for example, a relatively shallow task like detecting asterisks might be less affected by the number of words than a deeper task like making semantic judgments. Such an interaction would suggest that tasks that require deeper (lexical) processing also require attention to be allocated in a more serial manner than tasks that require only shallow (visual) processing.
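The contrast between the two predictions can be illustrated with a minimal sketch. The functions, the parameter names (`base_ms`, `per_word_ms`, `slowdown`), and the numerical values below are illustrative assumptions and are not taken from E-Z Reader, SWIFT, or Glenmore; the sketch only shows how a serial account yields a steep linear increase in latency with the number of words, whereas a gradient account yields a shallower one.

```python
def serial_latency(n_words, base_ms, per_word_ms):
    """Strictly serial allocation: every displayed word costs one full
    processing step (hypothetical, exhaustive-scan simplification)."""
    return base_ms + per_word_ms * n_words


def gradient_latency(n_words, base_ms, per_word_ms, slowdown=0.0):
    """Gradient-style allocation: words are processed together.

    `slowdown` (0..1) is an assumed fraction of the per-word cost added by
    each extra word; 0 means no capacity limit, 1 mimics the serial case.
    """
    return base_ms + per_word_ms * (1 + slowdown * (n_words - 1))


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, serial_latency(n, 300, 150), gradient_latency(n, 300, 150, 0.1))
```

Setting `slowdown` to 1 makes the two functions coincide, which is one way to see why a parallel model with a fully restrictive capacity limit would be hard to distinguish from a serial one.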
Method
Participants
Fifty-nine undergraduates from the University of Pittsburgh with normal or corrected-to-normal vision performed four tasks in blocks of 200 trials per task: (1) asterisk-detection; (2) letter-detection; (3) rhyme-judgment; and (4) semantic-judgment. Participants completed the experiment to fulfill partial course credit in an introductory psychology course and all participants gave informed consent that had been approved by the University of Pittsburgh’s Institutional Review Board prior to their participation. The data from one participant were lost due to experimenter error and thus were not included in our analyses.
Experimental Design
Tasks were blocked, with each block consisting of 50 one-word trials, then 50 two-word trials, and so on. Both task and number of words per trial were blocked in this manner to encourage participants to use whatever strategies they might find most effective—including parallel processing to the extent that it might facilitate task performance. Task blocks were presented in random order. The eye movements of twenty-seven of these participants were recorded. Between blocks, participants took short breaks and the eye-tracker was recalibrated (when necessary).
Materials
Words 20-100 per million in frequency (Francis & Kučera, 1982) and 4-10 letters in length were selected from the MRC Psycholinguistic Database (Coltheart, 1981) as distractors; these words were divided into four sets of 460, with the sets being rotated through each of the task conditions using a Latin-square design. Forty target words were also selected for each task; these words were divided into four sets of 10, with the sets being rotated through each of the number-of-words conditions using a Latin-square design. The mean frequency (and range) of the target words in each of the tasks were: asterisk-detection = 16.53 (10-40); letter-detection = 14.88 (1-143); rhyme-judgment = 89.25 (0-1,791); and semantic-judgment = 10.38 (0-117). The mean length (and range) of the target words were: asterisk-detection = 6.68 (4-11); letter-detection = 7.25 (4-12); rhyme-judgment = 5.08 (4-8); and semantic-judgment = 6.15 (4-11). The mean orthographic neighborhood density (and range) of the target words (Balota et al., 2007) were: asterisk-detection = 3.03 (0-19); letter-detection = 0.63 (0-6); rhyme-judgment = 2.58 (0-10); and semantic-judgment = 4.10 (0-20). Finally, the mean number of morphemes (and range) were: asterisk-detection = 1.48 (1-3); letter-detection = 1.50 (1-3); rhyme-judgment = 1.15 (1-2); and semantic-judgment = 1.13 (1-3). Although pair-wise comparisons did indicate a few reliable differences in the properties of target words across tasks, these differences always worked against the predicted depth-of-processing effects 2. Such differences are also not unexpected because the assignment of words to conditions is by definition a quasi-experimental manipulation (e.g., see Kliegl, Nuthmann, & Engbert, 2006).
The selection of target and non-target words was exclusive; i.e., a non-target word in one task could not be a target word in another task. Target locations in two-, three-, and four-word trials were equally distributed within and between subjects. Stimuli were presented using E-Builder software (SR Research Ltd.).
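The Latin-square rotation of word sets described above can be sketched as follows. This is only an illustration: the cyclic construction, the function names, and the idea of indexing participants by a rotation `group` are assumptions, because the manuscript does not state how participants were assigned to rotations.

```python
def latin_square(n):
    """Cyclic n x n Latin square: row r is the sequence 0..n-1 rotated by r."""
    return [[(row + col) % n for col in range(n)] for row in range(n)]


TASKS = ["asterisk-detection", "letter-detection",
         "rhyme-judgment", "semantic-judgment"]


def distractor_sets_for(group):
    """Map each task to a distractor-set index for a (hypothetical)
    rotation group; every set appears once per group and once per task."""
    square = latin_square(len(TASKS))
    row = square[group % len(TASKS)]
    return {task: row[i] for i, task in enumerate(TASKS)}


if __name__ == "__main__":
    for group in range(4):
        print(group, distractor_sets_for(group))
```

The same rotation could be applied to the four sets of 10 target words across the number-of-words conditions.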
Procedure
The sequence of events for two sample trials of the letter-detection task is presented in Figure 1. At the beginning of each trial, a fixation cross appeared in the center of the screen for 350 ms. Because participants were not required to maintain word order or complete any higher-level language processing (e.g., syntactic parsing) that is necessary to understand real text, the centrally displayed fixation cross should have been conducive to optimal task performance by allowing lexical processing from a viewing location that afforded both maximal visual acuity and maximal flexibility in how attention was allocated to the words that were displayed. The fixation cross was then followed by the stimuli (1-4 words displayed simultaneously, with the word(s) displayed on a single line, centered on the screen) for up to 3000 ms or until a response was made. Participants were instructed to press the spacebar as quickly as possible after locating a target (e.g., a word containing the letter “q”). The trial sequences for the other three tasks were structured in exactly the same manner, with only the task (e.g., press the spacebar if any of the words rhymes with “blue”) being different.
Figure 1.
Schematic diagram of the sequence of events for two trials of the letter-detection task.
Equipment
Participants viewed the stimuli binocularly on a 23-inch monitor 63 cm from their eyes with approximately two letters per 1° of visual angle. An EyeLink 1000 eye-tracker (SR Research Ltd.) recorded the gaze location of participants’ right eyes. The eye tracker had a spatial resolution of 0.01° and sampled gaze location every millisecond.
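For readers who wish to translate the viewing geometry into physical size, the following sketch computes the on-screen width that subtends a given visual angle. The function name is hypothetical, and the calculation assumes a flat screen viewed head-on; only the 63 cm viewing distance and the figure of two letters per degree come from the text above.

```python
import math


def size_on_screen_cm(angle_deg, distance_cm):
    """Width on a flat screen that subtends `angle_deg` when viewed head-on."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


one_degree_cm = size_on_screen_cm(1.0, 63.0)  # roughly 1.1 cm
letter_width_cm = one_degree_cm / 2           # two letters per degree

if __name__ == "__main__":
    print(round(one_degree_cm, 2), round(letter_width_cm, 2))
```

Under these assumptions, each letter would have occupied a little over half a centimeter of horizontal space.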
Results
Behavioral Results
Panel A of Figure 2 shows the reaction times (in ms) for trials during which a target was correctly identified, as a function of both the task being performed and the number of words displayed per trial. These data were examined using a mixed-factorial Analysis of Variance (ANOVA) with session type (with vs. without eye-tracking) as a between-participants factor and both task type (asterisk-detection vs. letter-detection vs. rhyme-judgment vs. semantic-judgment) and number of words (1 vs. 2 vs. 3 vs. 4) as within-participant factors. The results of this analysis indicated main effects of task type [F(3, 165) = 262.87, p < .001] and the number of words [F(3, 165) = 509.83, p < .001], and an interaction between them [F(9, 495) = 38.51, p < .001]. Session type had no reliable effects [Main effect: F(1, 55) = .61, p = .438; Task Type × Session Type: F(3, 165) = 1.16, p = .325; Number of Words × Session Type: F(3, 165) = .84, p = .474; Task Type × Number of Words × Session Type: F(9, 495) = .91, p = .515], suggesting that whether participants’ eye movements were recorded or not had no effect on task performance.
Figure 2.
Mean response latencies (in ms) for correct trials (Panel A) and mean response accuracies (% correct; Panel B), as a function of the task being performed and the number of words being simultaneously displayed.
To further examine the Task Type × Number of Words interaction, pair-wise comparisons were performed using a Bonferroni adjustment for multiple comparisons. The asterisk-detection task elicited significantly shorter response times than the other three tasks [asterisk-detection vs. letter-detection: t(56) = 21.55, p < .001; vs. rhyme-judgment: t(56) = 23.44, p < .001; vs. semantic-judgment: t(56) = 17.73, p < .001], and the rhyme-judgment task elicited longer response times than both letter-detection [t(56) = 8.45, p < .001] and semantic-judgment [t(56) = 8.22, p < .001]. The difference between letter-detection and semantic-judgment was not reliable [t(56) = .05, p = .959]. Additional pair-wise comparisons indicated that the number of words had reliable effects in all comparisons [all t(56)’s > 9, all p’s < .001]. The pattern of latencies suggests that the rate of asterisk detection was only weakly affected by the number of words that were concurrently displayed (approximately 14 ms per additional word), but that the rate of completion of the other tasks slowed with each additional word (approximately 140-172 ms per additional word).
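A per-word cost of the kind reported above can be estimated as the least-squares slope of mean latency on the number of displayed words. The sketch below is illustrative only: the manuscript does not state how the per-word estimates were obtained, and the latencies in the example are hypothetical values chosen to give a slope of 14 ms, not data from the experiment.

```python
def slope_per_word(n_words, mean_rts):
    """Ordinary least-squares slope of mean RT (ms) on number of words."""
    n = len(n_words)
    mean_x = sum(n_words) / n
    mean_y = sum(mean_rts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(n_words, mean_rts))
    sxx = sum((x - mean_x) ** 2 for x in n_words)
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical condition means (ms) for 1-4 words; not the reported data.
    shallow = [500, 514, 528, 542]
    print(slope_per_word([1, 2, 3, 4], shallow))
```

A slope near zero is what a parallel account predicts, whereas a slope on the order of a single word's processing time is what a serial account predicts.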
Panel B of Figure 2 shows the response accuracies (i.e., the total percentage of hit and correct-rejection trials), again as a function of both the task and the number of words. Overall, participants performed all four tasks very accurately (more than 96% correct in all four tasks). A mixed-factorial ANOVA using session type, task type, and number of words indicated reliable main effects of task type and number of words [F(3, 165) = 22.20, p < .001; and F(3, 165) = 12.78, p < .001, respectively], and an interaction between them [F(9, 495) = 6.18, p < .001]. As with response latencies, whether or not eye movements were recorded did not affect performance [Main effect: F(1, 55) = 2.10, p = .153; Task Type × Session Type: F(3, 165) = 2.41, p = .069; Number of Words × Session Type: F(3, 165) = 1.13, p = .340; Task Type × Number of Words × Session Type: F(9, 495) = 1.32, p = .224]. Furthermore, participants did not trade accuracy for speed in performing the asterisk-detection task; pair-wise comparisons indicated that participants performed this task more accurately than the letter-detection [t(1, 56) = 3.47, p < .01], rhyme-judgment [t(1, 56) = 5.66, p < .001], and semantic-judgment tasks [t(1, 56) = 3.18, p < .01]. Pair-wise comparisons also indicated that the number of words reliably affected all comparisons other than that for the 3- versus 4-word conditions [1- vs. 2-word: t(1, 56) = 2.08, p < .05; 1- vs. 3-word: t(1, 56) = 3.80, p < .001; 1- vs. 4-word: t(1, 56) = 5.33, p < .001; 2- vs. 3-word: t(1, 56) = 2.60, p < .05; 2- vs. 4-word: t(1, 56) = 4.18, p < .001; and 3- vs. 4-word: t(1, 56) = 1.06, p = .293]. This pattern of results suggests that accuracy in detecting asterisks was only weakly affected by the number of words that were concurrently displayed, whereas accuracy in the other tasks (especially making semantic judgments) decreased as the number of concurrently displayed words increased.
Eye-Tracking Results
Figure 3 shows three eye-movement measures that were computed up until the target was identified for trials in which targets were correctly identified, as a function of both the task being performed and the number of words being displayed. Panel A shows the mean number of fixations per trial (including the fixation that occurred during the display change when the fixation cross was replaced by the stimuli). A repeated-measures ANOVA with task type and number of words as within-participant factors indicated reliable main effects of task type and number of words [F(3, 75) = 117.54, p < .001; and F(3, 75) = 392.31, p < .001, respectively], and a reliable interaction [F(9, 225) = 28.73, p < .001]. Pair-wise comparisons indicated that participants made fewer fixations performing the asterisk-detection task than the letter-detection [t(25) = 18.51, p < .001], rhyme-judgment [t(25) = 14.51, p < .001], and semantic-judgment tasks [t(25) = 11.96, p < .001]. The rhyme-judgment task resulted in more fixations than the semantic-judgment task [t(25) = 4.40, p < .001] and letter-detection task [t(25) = 2.70, p < .05], while the numbers of fixations in the letter-detection and semantic-judgment tasks were not significantly different [t(25) = 1.78, p = .088]. Pair-wise comparisons also indicated that with each additional word presented, participants tended to make more fixations; all comparisons between number of words displayed were reliable [all t(25)’s > 6, all p’s < .001]. Together, these results indicate that participants tended to make more fixations when processing more concurrently displayed words, but that this trend was much more pronounced for the three “deeper,” more lexically demanding tasks than for the “shallower,” asterisk-detection task.
Figure 3.
Mean number of fixations per trial (Panel A), mean first-fixation durations per trial (in ms; Panel B), and mean dwell time per word per trial (in ms; Panel C), as a function of the task being performed and the number of words being simultaneously displayed.
Panel B of Figure 3 shows the mean first-fixation durations per trial, excluding whatever time was spent fixating that location prior to the display change. An ANOVA of these data indicated there were main effects of task type and number of words [F(3, 75) = 22.37, p < .001; and F(3, 75) = 132.60, p < .001, respectively], and an interaction between them [F(9, 225) = 20.85, p < .001]. Pair-wise comparisons to examine this interaction indicated that participants’ first fixations were shorter in the letter-detection than asterisk-detection task [t(25) = 7.96, p < .001], and shorter in the letter-detection task than in either the rhyme-judgment or semantic-judgment tasks [t(25) = 6.20, p < .001; and t(25) = 7.72, p < .001, respectively]. Neither the asterisk-detection nor the semantic-judgment task was reliably different from the rhyme-judgment task [t(25) = .19, p = .855; and t(25) = 1.28, p = .213, respectively], nor were the first-fixation durations reliably different in the asterisk-detection and semantic-judgment tasks [t(25) = 1.87, p = .073]. Furthermore, comparisons indicated that first-fixation durations decreased with each additional word presented [1- vs. 2-word: t(25) = 11.92, p < .001; 1- vs. 3-word: t(25) = 11.35, p < .001; 1- vs. 4-word: t(25) = 11.88, p < .001; 2- vs. 4-word: t(25) = 3.09, p < .01; 3- vs. 4-word: t(25) = 4.44, p < .001]. Only the comparison of the two- versus three-word conditions was not reliable [t(25) = .95, p = .351]. This pattern suggests that, in the conditions involving a single word, the time required to perform each task is reflected in the mean first-fixation durations; with additional words, this was less true, with more of the time required to perform the tasks being instead distributed across additional fixations.
Panel C of Figure 3 shows the mean dwell times per word per trial after the display change (i.e., the sum of all the fixation durations on that word), as a function of task and number of words. An ANOVA using task type and number of words as within-participant factors indicated that there were main effects of task type and number of words [F(3, 75) = 103.72, p < 0.001; and F(3, 75) = 1497.86, p < 0.001, respectively], and an interaction between these two factors [F(9, 225) = 9.56, p < 0.001]. Predictably, the asterisk-detection task was associated with shorter dwell times than the letter-detection [t(25) = 11.68, p < .001], rhyme-judgment [t(25) = 14.28, p < .001], and semantic-judgment tasks [t(25) = 10.55, p < .001], and the rhyme-judgment task was associated with longer dwell times than both the letter-detection [t(25) = 7.02, p < .001] and semantic judgment tasks [t(25) = 7.01, p < .001]. A comparison of the letter-detection and semantic-judgment tasks was not reliable [t(25) = .32, p =.750]. Further comparisons indicated that dwell time declined as participants had to perform the tasks with an increasing number of words [all t(25)’s > 15, all p’s < .001]. This pattern of results suggests that participants divided their viewing time across words in the multiple-word conditions, and that viewing time was also affected by task difficulty, as illustrated by the shorter dwell times for the asterisk task and the longer dwell times for the rhyme-judgment task.
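The dwell-time measure defined above (the sum of all fixation durations on a word) can be illustrated with a short sketch. The fixation record format, a list of `(word_index, duration_ms)` pairs, and the function name are assumptions made for illustration; they do not describe the EyeLink output format or the analysis scripts used in the experiment.

```python
from collections import defaultdict


def dwell_times(fixations):
    """Sum fixation durations (ms) per word for one trial.

    `fixations` is an assumed list of (word_index, duration_ms) pairs in
    temporal order; refixations of a word add to that word's total.
    """
    totals = defaultdict(int)
    for word_index, duration_ms in fixations:
        totals[word_index] += duration_ms
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical trial: word 0 is fixated, then word 1, then word 0 again.
    trial = [(0, 200), (1, 180), (0, 120)]
    print(dwell_times(trial))
```

On this definition, a word that is refixated contributes all of its fixations to a single dwell time, which is why dwell time per word can fall while the number of fixations per trial rises.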
Finally, Figure 4 shows the percentage of initial saccades that were directed from the fixation cross (at the beginning of the trials) towards each of the possible viewing locations (i.e., each of the displayed words) in the three- (Panel A) and four-word (Panel B) conditions. The logic of examining where participants directed their initial saccades was based on the assumption that, if the processing that was necessary to perform the letter-detection, rhyme-judgment, and semantic-judgment tasks was completed in a more serial manner than the asterisk-detection task, then participants in the former three tasks might be more likely to adopt a strategy of first moving their eyes to the left and then scanning from left to right. To determine whether participants did this, chi-square tests were completed to compare the percentage of initial left-directed saccades in the letter-detection, rhyme-judgment, and semantic-judgment tasks to the percentage of such saccades in the asterisk-detection task. In the 3-word conditions, all three contrasts were statistically reliable (letter-detection vs. asterisk-detection: χ² = 21.3, p < .01; rhyme-judgment vs. asterisk-detection: χ² = 14.5, p < .01; and semantic-judgment vs. asterisk-detection: χ² = 25.6, p < .01). The same was true of the same three contrasts in the 4-word conditions (letter-detection vs. asterisk-detection: χ² = 16.9, p < .01; rhyme-judgment vs. asterisk-detection: χ² = 27.2, p < .01; and semantic-judgment vs. asterisk-detection: χ² = 33.6, p < .01). The results of these analyses therefore support our hypothesis that, in the “deeper” tasks, participants were more likely to adopt the general strategy of first moving their eyes to the left-most word and then scanning from left to right, looking for whatever target they were instructed to detect.
Figure 4.
The initial saccade target as a function of task being performed and the number of words being simultaneously displayed (Panel A = 3 words; Panel B = 4 words).
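A contrast of the kind reported for the initial saccades can be sketched as a Pearson chi-square test on a 2 × 2 table (task × saccade direction). This is an assumption about the form of the test: the manuscript does not specify the contingency table or whether a continuity correction was applied, and the counts below are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
        [[a, b],
         [c, d]]
    e.g., rows = two tasks, columns = left-directed vs. other saccades."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


if __name__ == "__main__":
    # Hypothetical counts: 20 of 30 initial saccades go left in one task,
    # 10 of 30 in the other.
    print(round(chi_square_2x2(20, 10, 10, 20), 2))
```

With one degree of freedom, a statistic above 6.635 corresponds to p < .01.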
General Discussion
The results of our experiment can be summarized as follows. First, performance in the asterisk-detection task was much less affected by the number of concurrently displayed words than was performance in the other three tasks. Analyses indicated that the number of words displayed had only modest effects on how long it took participants to detect the asterisks, the number of fixations that they made in performing this task, or the duration of the first fixation in performing this task. Because the accuracy of asterisk-detection did not suffer when the number of words to be processed increased, the relatively flat slope of reaction time for the task cannot be attributed to participants trading accuracy for speed. Results also indicated that participants in this task were less likely to employ the viewing strategy of directing their initial gaze to the left of the display and then scanning from left to right in order to locate the asterisk. These results collectively suggest that participants’ ability to detect the presence of an asterisk was only weakly affected by the number of words being displayed, perhaps because the target seemed to “pop out” of the display 3.
In contrast, performance in the rhyme-judgment task was strongly affected by the number of concurrently displayed words. Analyses indicated that the number of words displayed had pronounced effects on how long it took participants to detect words rhyming with “blue,” as indexed by both the number of fixations and the first-fixation durations. In contrast with what was observed during the asterisk-detection task, participants were more likely to shift their initial gaze towards the left and then scan from left to right. These results suggest that the processing required for the rhyme-judgment task was probably completed one word at a time. This makes sense, given that completing this task required the generation of each word’s pronunciation. It is difficult to imagine how this process could be completed on more than one word at a time given that the processing assumptions of all current models of word identification prohibit the phonological processing of more than one word at a time 4.
Finally, performance in both the letter-detection and semantic-judgment tasks more closely resembled the performance in the rhyme-judgment task, with response times, number of fixations, and first-fixation durations being markedly affected by the number of concurrently displayed words. This suggests that these three tasks were completed one word at a time and required a similar depth of lexical processing. Making semantic judgments entailed some degree of semantic processing—at least enough to determine if the words referred to animate things. Letter-detection likely involved some orthographic processing at the whole-word level because performance in this task was so similar to that in the semantic-judgment task. (This conclusion about the letter-detection task is also consistent with previous results showing that tasks that require the processing of individual letters often result in whole-word processing; e.g., the classic word-superiority effect; Reicher, 1969; Wheeler, 1970.)
Our results also have important ramifications for the debate about the nature of attention allocation during reading (Rayner & Juhasz, 2004). The principal result is that participants’ eye-movement behavior in tasks that required a significant amount of lexical processing (e.g., detecting specific letters, generating pronunciations, and/or retrieving word meanings) is more congruent with serial-attention (Pollatsek et al., 2006; Reichle et al., 1998, 2003, 2007) than attention-gradient (Engbert et al., 2005; Reilly & Radach, 2006) models of eye-movement control. The increases in the response times, number of fixations, and first-fixation durations that were observed with an increasing number of concurrently displayed words suggest that there is some additional “cost” associated with processing each additional word. This is not unexpected in the rhyme-judgment task because this task necessitates the generation of word pronunciations, which can only be done one word at a time. The fact that the eye movements of participants performing the letter-detection and semantic-judgment tasks resemble those of participants making rhyme judgments strongly suggests that the former two tasks also require significant amounts of serial lexical processing. Although advocates of attention-gradient models might disagree and argue that their models can handle our results, we believe that the burden of proof is on them to demonstrate how 5.
Also in support of our argument is the fact that the asterisk-detection task did not seem to require the serial allocation of attention. The processing costs associated with detecting the presence of a simple (and presumably easy to discriminate) visual feature like an asterisk were instead only weakly affected by the number of words that had to be processed. (Furthermore, these small costs may have been due to visual acuity limitations; Rayner & Bertera, 1979; Rayner & Morrison, 1981.) The eye movements of participants performing the asterisk-detection task are thus more consistent with what one might predict with the parallel allocation of attention. That is, the patterns of eye movements that were observed in the asterisk-detection task are more congruent with what one might predict in all of our tasks if attention had been allocated in the manner described by attention-gradient models (Engbert et al., 2005; Reilly & Radach, 2006).
Because our conclusions are general and are meant to apply to natural reading, it is important to acknowledge that the tasks that were used in our experiment only approximate the actual complexity of natural reading. However, three of our tasks did require significant amounts of lexical processing, which is arguably the critical component of natural reading. Furthermore, our tasks were actually designed to allow (and perhaps even encourage) the parallel processing of words because participants were free to view and process the words in whatever order most facilitated task performance. For example, both the specific tasks and the number of words being displayed per trial were blocked. This effectively meant that our participants could take advantage of the design of our experiment and learn to attend to multiple words in parallel in the service of performing each of the tasks as rapidly and accurately as possible. Our tasks may have also encouraged parallel processing because they did not require participants to do any type of higher-level processing of the words (e.g., syntactic parsing, integrating the individual word meanings into the larger meaning of a sentence, etc.). In natural reading, such higher-level processing might be facilitated by (or require) the serial processing of words. For example, because word order conveys syntactic and other types of important linguistic information (e.g., topic or focus; Kaiser & Trueswell, 2004), and because word-order information is available “for free” with serial processing (Pollatsek & Rayner, 1999), it is conceivable that natural reading requires a greater degree of serial processing than do our tasks. Thus, one might argue that our tasks underestimate the degree to which the serial allocation of attention facilitates (or is required during) natural reading.
Finally, because we have argued that lexical processing necessitates that words be processed one at a time during reading, it follows that serial-attention models should provide a more accurate description of what happens during the “default” reading situation (i.e., when the reader is reading with full comprehension). However, it is important to emphasize again that, regardless of which model ultimately provides a better description of the perceptual, cognitive, and motor processes involved in reading, the present results indicate that not all types of lexical processing are equally demanding. This suggests that, rather than couching the current theoretical debate in terms of attention allocation being strictly serial or strictly parallel, our understanding of reading might benefit from re-framing the issue to examine the conditions under which attention is allocated in a more-or-less serial versus parallel manner.
Footnotes
The attention-gradient models that our predictions are based on (Engbert et al., 2005; Reilly & Radach, 2006) do not assume a completely restrictive capacity limit, so that the processing of a single word proceeds at a full rate, but the simultaneous processing of two words proceeds at exactly half that rate, etc. Models that do assume such a restrictive capacity limit might be expected to mimic serial-attention models (Townsend & Wenger, 2004), with the rate of lexical processing being inversely proportional to the number of words being concurrently processed. Although attention-gradient models like SWIFT and Glenmore do assume that attention is limited in capacity, this capacity is typically sufficient to allow lexical processing of three or four words, with a full rate of processing for the words near the center of the attention gradient and a reduced rate of processing for more peripheral words.
Target words in the rhyme-judgment task were shorter than those in the other three tasks, and target words in the letter-detection task were longer than those in the semantic-judgment task (all t’s > 2.8, all p’s < 0.01). The orthographic neighborhoods of target words in the letter-detection task were also less dense than in the other three tasks (all t’s > 3, all p’s < 0.01). Finally, target words in the asterisk-detection and letter-detection tasks contained more morphemes than those in both the rhyme-judgment and semantic-judgment tasks (all t’s > 2.7, all p’s < 0.01). In all cases, these differences worked against the predicted depth-of-processing effects. For example, because longer words typically take more time to identify than shorter words (Rayner, Sereno, & Raney, 1996), any effect of word length should attenuate any effect of processing depth (e.g., any slowdown that resulted from processing slightly longer words in the letter-detection task should have reduced any difference between this task and the semantic-judgment task that would have resulted from the latter being a “deeper” task).
One might argue that such “pop-out” effects are also consistent with the task being performed in a pre-attentive manner (Treisman & Gelade, 1980). For our purposes, it is not important to discriminate between this interpretation and ours because the critical point is that the letter-detection, rhyme-judgment, and semantic-judgment tasks cannot be performed in this manner, and that performance in these tasks is instead consistent with the classic pattern indicative of serial processing.
According to the triangle model and its variants (Plaut, McClelland, Seidenberg, & Patterson, 1996; Seidenberg & McClelland, 1989), a word’s pronunciation is generated by propagating activation from a set of orthographic input units in parallel to another set of phonological output units. Because each word is represented as a distributed pattern of activation across the input and output units, the simultaneous processing of two or more words would produce a considerable amount of “crosstalk” between the words and result in significant pronunciation errors. Although dual-route models (e.g., Coltheart, 1978; Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001) provide an alternative way of generating a word’s pronunciation (i.e., via the application of non-lexical grapheme-to-phoneme correspondence rules), this process operates in a strictly serial manner, and at any given time converts only a single grapheme into its corresponding phoneme. Therefore, to explain how two or more words would be processed in parallel, current word-identification models would either have to posit the existence of multiple (redundant) lexical-processing mechanisms (e.g., a separate set of input and output units for each word being processed), or make some additional assumptions to explain how the orthographic, phonological, and semantic codes of each word are kept separate from those of the others.
For example, although the SWIFT model of eye-movement control (Engbert et al., 2005) can provide a qualitative account of the observed interaction between task type and number of words, it does so by assuming a fairly sharp attention gradient (i.e., one in which 90% of the gradient is centered on one word) that becomes even sharper with the deeper, more lexically demanding tasks (Richter, personal communication). Thus, although the model can account for the results reported in this article, it does so by narrowing the “focus” of its attention gradient, thereby becoming much more like a serial-attention model.
References
- Balota DA, Yap MJ, Cortese MJ, Hutchinson KA, Kessler B, Loftis B, Neely JH, Nelson DL, Simpson GB, Treiman R. The English Lexicon Project. Behavior Research Methods. 2007;39:445–459. doi: 10.3758/bf03193014.
- Coltheart M. The MRC Psycholinguistic Database. Quarterly Journal of Experimental Psychology. 1981;33A:497–505.
- Coltheart M. Lexical access in simple reading tasks. In: Underwood G, editor. Strategies of information processing. Academic Press; London: 1978. pp. 151–216.
- Coltheart M, Rastle K, Perry C, Langdon R, Ziegler J. DRC: A dual route cascaded model of visual word recognition. Psychological Review. 2001;108:204–256. doi: 10.1037/0033-295x.108.1.204.
- Craik FIM, Lockhart RS. Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior. 1972;11:671–684.
- Engbert R, Nuthmann A, Richter E, Kliegl R. SWIFT: A dynamical model of saccade generation during reading. Psychological Review. 2005;112:777–813. doi: 10.1037/0033-295X.112.4.777.
- Francis W, Kučera H. Frequency analysis of English usage: Lexicon and grammar. Houghton Mifflin; Boston: 1982.
- Kaiser E, Trueswell J. The role of discourse context in the processing of a flexible word-order language. Cognition. 2004;94:113–147. doi: 10.1016/j.cognition.2004.01.002.
- Kliegl R, Nuthmann A, Engbert R. Tracking the mind during reading: The influence of past, present, and future words on fixation durations. Journal of Experimental Psychology: General. 2006;135:12–35. doi: 10.1037/0096-3445.135.1.12.
- Plaut DC, McClelland JL, Seidenberg MS, Patterson K. Understanding normal and impaired word reading: Computational principles in quasi-regular domains. Psychological Review. 1996;103:56–115. doi: 10.1037/0033-295x.103.1.56.
- Pollatsek A, Rayner K. Is attention really unnecessary? Behavioral and Brain Sciences. 1999;22:695–696.
- Pollatsek A, Reichle ED, Rayner K. Tests of the E-Z Reader model: Exploring the interface between cognition and eye-movement control. Cognitive Psychology. 2006;52:1–56. doi: 10.1016/j.cogpsych.2005.06.001.
- Rayner K, Bertera JH. Reading without a fovea. Science. 1979;206:468–469. doi: 10.1126/science.504987.
- Rayner K, Morrison RM. Eye movements and identifying words in parafoveal vision. Bulletin of the Psychonomic Society. 1981;17:135–138.
- Rayner K, Juhasz BJ. Eye movements in reading: Old questions and new directions. European Journal of Cognitive Psychology. 2004;16:340–352.
- Rayner K, Sereno SC, Raney GE. Eye movement control in reading: A comparison of two types of models. Journal of Experimental Psychology: Human Perception and Performance. 1996;22:1188–1200. doi: 10.1037//0096-1523.22.5.1188.
- Reicher GM. Perceptual recognition as a function of meaningfulness of stimulus material. Journal of Experimental Psychology. 1969;81:274–280. doi: 10.1037/h0027768.
- Reichle ED. Theories of the “eye-mind” link: Computational models of eye-movement control during reading. Cognitive Systems Research. 2006;7:2–3.
- Reichle ED, Warren T, McConnell K. Using E-Z Reader to model the effects of higher-level language processing on eye movements during reading. 2008. Manuscript submitted for review.
- Reichle ED, Pollatsek A, Fisher DL, Rayner K. Toward a model of eye movement control in reading. Psychological Review. 1998;105:125–157. doi: 10.1037/0033-295x.105.1.125.
- Reichle ED, Rayner K, Pollatsek A. The E-Z Reader model of eye movement control in reading: Comparisons to other models. Behavioral and Brain Sciences. 2003;26:445–476. doi: 10.1017/s0140525x03000104.
- Reilly R, Radach R. Some empirical tests of an interactive activation model of eye movement control in reading. Cognitive Systems Research. 2006;7:34–55.
- Richter E. 2007. Personal communication.
- Seidenberg MS, McClelland JL. A distributed, developmental model of word recognition and naming. Psychological Review. 1989;96:523–568. doi: 10.1037/0033-295x.96.4.523.
- Townsend JT, Wenger MJ. The serial-parallel dilemma: A case study in a linkage of theory and method. Psychonomic Bulletin & Review. 2004;11:391–418. doi: 10.3758/bf03196588.
- Treisman A, Gelade G. A feature integration theory of attention. Cognitive Psychology. 1980;12:97–136. doi: 10.1016/0010-0285(80)90005-5.
- Wheeler DD. Processes in word recognition. Cognitive Psychology. 1970;1:59–85.


