Abstract
Contextual cueing experiments show that when displays are repeated, reaction times (RTs) to find a target decrease over time even when observers are not aware of the repetition. It has been thought that the context of the display guides attention to the target. We tested this hypothesis by comparing the effects of guidance in a standard search task to the effects of contextual cueing. Firstly, in standard search, an improvement in guidance causes search slopes (derived from RT × Set Size functions) to decrease. In contrast, we found that search slopes in contextual cueing did not become more efficient over time (Experiment 1). Secondly, when guidance is optimal (e.g. in easy feature search) we still found a small, but reliable contextual cueing effect (Experiments 2a and 2b), suggesting that other factors, such as response selection, contribute to the effect. Experiment 3 supported this hypothesis by showing that the contextual cueing effect disappeared when we added interference to the response selection process. Overall, our data suggest that the relationship between guidance and contextual cueing is weak and that response selection can account for part of the effect.
Keywords: Contextual Cueing, Attention, Guidance, Response selection, Visual Search
Introduction
In everyday life we are inundated by a glut of visual stimuli. Visual scenes are often complex, containing a large amount of irrelevant information. In a commonplace search for something like a particular student in an auditorium, we would be overwhelmed if we were to attempt to attend to every stimulus at once. In response to the inability to process all visual stimuli simultaneously, the visual system has attentional mechanisms that permit us to search for the student by deploying attention to one or a few objects at a time out of the crowded world. Given the inherent complexity of the task, the visual system has evolved a variety of mechanisms to optimize this selection process. Many of these mechanisms come under the rubric of “attentional guidance” (see Wolfe & Horowitz, 2004). Guidance processes speed search by directing attention to items more likely to be targets. Thus, the student is likely to be human-sized and elongated. Attention is guided to objects with those attributes in preference to, for example, small, cubic objects. Spatial configuration of items is a candidate source of guidance. The visual system appears to be sensitive to the predictive value of repeated spatial configurations. In this paper, we ask whether this contextual cueing (Chun & Jiang, 1998) is a form of guidance. Our answer will be that contextual cueing is, at best, a very weak form of guidance and that there are other mechanisms involved in the beneficial impact of repeated configuration on response times.
It has long been known that context speeds object recognition (Biederman, 1972). For example, we would be faster to name a potato masher on a kitchen countertop than the same implement on a workbench. Similarly, a student easily recognized in the classroom might be difficult to place if we ran into her at the mall. But does context affect our ability to search for a specific target? Intuition suggests that it should be easier to find the potato masher if it were habitually stored to the right of the fridge than if it could appear anywhere in the kitchen, and that we would have a better chance of finding our student if she always sat in the same seat than if we had to search the entire auditorium.
Research by Chun and Jiang (1998; 2003) seemed to confirm these intuitive predictions. They demonstrated that the spatial layout of a search display could influence how quickly participants found a target. In a series of studies they found that if the target item was embedded in an invariant configuration that was repeated across the experiment, reaction times (RTs) to find the target were quicker than when it appeared in a novel or unrepeated configuration; this is the basic contextual cueing phenomenon. Further research has found that contextual cueing can be based on implicit memory, is learned after only 5 repetitions of the display (Chun & Jiang, 1998), and can persist for up to a week (Chun & Jiang, 2003).
In their initial paper, Chun and Jiang (1998) suggested that contextual cueing occurs because the visual context can guide spatial attention towards the target. In fact, the notion that contextual cueing helps guidance is repeated throughout the literature (e.g., Chun, 2000; Chun & Jiang, 1998; Chun & Jiang, 1999; Chun & Jiang, 2003; Endo & Takeda, 2004; Hoffmann & Sebald, 2005; Jiang & Chun, 2001; Jiang & Leung, 2005; Jiang, Song & Rigas, 2005; Jiang & Wagner, 2004; Lleras & Von Mühlenen, 2004; Olson & Chun, 2002; Tseng & Li, 2004). This fits with our intuitive notion that when we know where to expect a target we do not need to search extensively but can instead, to return to our example of looking for a student in an auditorium, deploy our attention directly to the expected seat. However, one might observe faster search times without improving the search process at all. For example, it might take just as long to search for a target in a repeated configuration, but once found, the target in the expected location might be recognized and/or responded to more quickly, just as the student is more readily identified in the classroom than in the mall. In this paper we ask whether contextual cueing really guides the search process itself, making the search more efficient, or whether other factors such as facilitation in response selection play a part in contextual cueing.
RTs in visual search experiments can be affected by any processing stage between the retina and the hand (Wolfe et al., 2002). In order to isolate the cost of search proper from perceptual, decision, and response factors, researchers studying search behavior in the RT domain typically vary the number of items (set size), and fit a line to the RT × set size function. The slope of this line can be taken as a measure of the efficiency of search, while non-search factors, such as initial perceptual processing and response selection processes, contribute to the intercept. A wide range of slopes have been observed in the literature (Wolfe, 1998). A slope of 0 msec/item shows that RT is independent of the number of distractors, indicating that attention is directed immediately to the target. Such highly efficient search is characteristic of “feature search” (Treisman & Gelade, 1980), where the target differs markedly from distractors along some basic feature dimension, such as search for a red letter among green letters or for a horizontal bar among verticals (see Wolfe & Horowitz, 2004, for a review). In less efficient search tasks, each additional distractor is associated with an increase in mean RT. For example, conjunction search, in which the target is defined by a combination of features each of which is present separately in the distractors (e.g. finding a red vertical bar among red horizontals and green verticals), is generally less efficient than feature search (Treisman & Gelade, 1980), with slopes averaging 10–15 msec/item (Wolfe, 1998). More difficult spatial configuration searches, where the target is defined by the spatial arrangement of elements (e.g. finding a digital 2 among digital 5s) might produce slopes of 20–40 msec/item (Wolfe, 1998). Many differences in search efficiency can be attributed to differences in guidance. 
To give one example, the inefficient search for a 2 among 5s becomes more efficient if it is a search for a red 2 among red and black 5s. Attention would be guided to red items, reducing the effective set size.
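The slope and intercept logic described above can be illustrated with a minimal sketch. It fits an ordinary least-squares line to mean RT as a function of set size. The function name and the RT values are hypothetical, invented purely for illustration, and are not data from any experiment reported here.

```python
from statistics import mean

def fit_search_function(set_sizes, mean_rts):
    """Fit an ordinary least-squares line to (set size, mean RT) points.

    Returns (slope in msec/item, intercept in msec).
    """
    x_bar = mean(set_sizes)
    y_bar = mean(mean_rts)
    sxx = sum((x - x_bar) ** 2 for x in set_sizes)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(set_sizes, mean_rts))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical mean RTs (msec), made up for illustration only.
set_sizes = [4, 8, 12]
inefficient_rts = [620.0, 740.0, 860.0]  # RT grows with every added item
efficient_rts = [450.0, 450.0, 450.0]    # RT independent of set size

inefficient_slope, inefficient_intercept = fit_search_function(set_sizes, inefficient_rts)
efficient_slope, efficient_intercept = fit_search_function(set_sizes, efficient_rts)
```

In this sketch the slope carries the cost of each additional item, while anything that adds a constant to every RT (for example, slower response selection) would shift only the intercept.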
If contextual cueing were the result of guiding attention to the target, there are several predictions we could make based on the extensive visual search literature. For example, contextual cueing ought to result in improved search efficiency, so we should see a decrease in search slope over the course of a contextual cueing experiment. To take an extreme example, if contextual cueing produced perfect guidance, attention would go directly to the target item and the search slope would drop to zero. While such perfect guidance is unlikely, search slopes for repeated displays should, at the very least, be markedly reduced compared to those from unrepeated displays. We tested this prediction in Experiment 1 and found little, if any, improvement in search efficiency¹.
If guidance cannot account for the whole contextual cueing effect, then what can? Experiments 2 and 3 tested the hypothesis that response priming contributes to contextual cueing. Experiment 2 showed that small but reliable contextual cueing effects occur even in tasks where there is already ‘perfect’ guidance (i.e. displays with a single item and feature search tasks). However, contextual cueing disappeared in these tasks when we introduced interference at the level of response selection (Experiment 3). Taken together, these experiments support a role for response factors in contextual cueing. We conclude that several factors, including but probably not limited to response selection, contribute to contextual cueing. Attentional guidance makes, at best, a small contribution.
Experiment 1
If the benefit found in contextual cueing experiments were a result of improved attentional guidance then we would expect to find an improvement in search efficiency when the display was repeated, as well as a benefit in reaction time. Previous contextual cueing studies, with the exception of Chun and Jiang (1998), have not varied set size, and so could not measure search efficiency. Here we ran a contextual cueing experiment in which set size was either 8 or 12 items, allowing us to compute the RT × set size slope.
Method
Participants
Twelve observers between the ages of 18 and 55 years served as participants. Each participant passed the Ishihara test for color blindness and had normal or corrected to normal vision. All participants gave informed consent and were paid for their time.
Apparatus and Stimuli
This experiment, and all subsequent experiments, were conducted on a Macintosh G4 computer using Matlab 5.2.1 software with the PsychToolbox (Brainard, 1997; Pelli, 1997). The distractor items were L shapes presented randomly in one of four orientations (0°, 90°, 180° or 270°). The target item was a T shape rotated 90° either to the left or to the right with equal probability. There was always a single target present. A blue dot at the center of the screen served as a fixation point. The background color of the screen was a uniform gray. Three black concentric circles surrounded the fixation point with diameters of 9.5°, 15.5°, and 25° visual angle. Sixteen black lines radiated out from the fixation point roughly equidistant from one another to form a radial lattice. On every trial, either eight or twelve (depending on the set size) circular “placeholders” appeared at the intersections of the concentric circles and the spokes. To compensate for the decline in visual acuity with distance from the fixation point, the size of the place-holding circles and of the Ts and Ls increased with eccentricity. Those on the closest concentric circle were 2° in diameter, those on the middle concentric circle were 3.3°, and those on the furthest concentric circle were 5.4°.
All stimuli were made up of two lines of equal length (forming either an L or a T) and appeared within the circular placeholders. Stimuli enclosed in the smallest placeholders subtended a visual angle of 1° × 1°, those enclosed in the middle placeholders subtended 1.5° × 1.5°, and those enclosed in the largest placeholders subtended 2.5° × 2.5°. A tone sounded at the start of each trial, at which point the items appeared on the screen. The color of the items and the placeholders varied for each participant (either yellow, red, blue, orange, cyan, green, purple, or white) but remained constant throughout the experiment. Participants were asked to respond to the direction of the target letter T by pressing the letter ‘a’ if the stem of the T was pointing right and ‘l’ if the stem of the T was pointing left. Error feedback was given after each trial. Example displays are shown in Figure 1.
Procedure
Participants were given a practice block of 10 trials, followed by 512 experimental trials divided into 8 epochs of 64 trials. Approximately half of the trials in each epoch had a set size of 8. The remaining trials had a set size of 12.
Within each set size, for epochs 1 to 7, approximately half the trials had fixed placeholder configurations that were repeated throughout the experiment (predictive displays). These consisted of 4 fixed displays that were repeated 4 times within an epoch for each set size. Overall, each repeated display was shown approximately 28 times throughout the experiment. The other half of the trials had a novel configuration that was generated at random. In order to ensure that participants were not simply learning absolute target locations from the predictive displays, in the random displays targets appeared equally often in 4 randomly selected locations but these appearances were not correlated with any pattern of distractor locations. In epoch 8 the absolute target locations for predictive and random trials remained the same, but all configurations were now made random, so that the context was no longer predictive on any of the trials. This was implemented as a secondary check to make sure any benefit observed for predictive displays was due to the learning of display context rather than the learning of the absolute target locations. If participants were learning the context, then epoch 8 should produce slower RTs than epoch 7, even on trials where the target locations were identical to those used in the predictive displays of epochs 1–7.
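As a rough check on the trial counts described above, the design can be sketched as a trial schedule. This is a minimal illustration under stated assumptions, not the software used in the experiment: it assumes an exact half/half split (the text says "approximately half"), a random trial order within each epoch, and hypothetical field names such as `target_slot` and `context_repeated`.

```python
import random
from collections import Counter

N_EPOCHS = 8
SET_SIZES = (8, 12)
N_DISPLAYS = 4   # fixed displays (and target locations) per set size
N_REPEATS = 4    # presentations of each one per epoch

def build_schedule(seed=0):
    """Return a list of trial dicts covering all 8 epochs."""
    rng = random.Random(seed)
    schedule = []
    for epoch in range(1, N_EPOCHS + 1):
        epoch_trials = []
        for set_size in SET_SIZES:
            for slot in range(N_DISPLAYS):
                for _ in range(N_REPEATS):
                    # Predictive trial: the fixed layout is reused in epochs 1-7;
                    # in epoch 8 only the target location is kept.
                    epoch_trials.append({
                        "epoch": epoch, "set_size": set_size,
                        "trial_type": "predictive", "target_slot": slot,
                        "context_repeated": epoch < N_EPOCHS,
                    })
                    # Random trial: one of 4 target locations reserved for
                    # random trials, always shown in a novel layout.
                    epoch_trials.append({
                        "epoch": epoch, "set_size": set_size,
                        "trial_type": "random", "target_slot": slot,
                        "context_repeated": False,
                    })
        rng.shuffle(epoch_trials)  # assumed: random order within an epoch
        schedule.extend(epoch_trials)
    return schedule

schedule = build_schedule()
# Count how often each fixed display appears with its context intact.
repeats = Counter(
    (t["set_size"], t["target_slot"])
    for t in schedule
    if t["context_repeated"]
)
```

Under these assumptions the schedule contains 512 trials, and each of the 8 fixed displays is shown with its repeated context 7 × 4 = 28 times, matching the figure given in the text.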
Data analysis
In the literature, there have been many ways to formally define contextual cueing. Chun & Jiang (1998) suggested that the contextual cueing effect should be measured as the difference between predictive and random configurations across the last three epochs (see also Jiang, Leung & Burks, submitted, and Kunar, Flusberg & Wolfe, in press). This procedure focuses on the asymptotic benefit for having learned a predictive context over a random one. Following their reasoning, we collapsed the data across the last 3 predictive epochs (here epochs 5 to 7) and used this as our standard measure of contextual cueing.
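The standard measure described above can be sketched as follows. This is a minimal illustration, assuming per-epoch mean RTs are already available for one observer. The function name, the data layout, and the RT values are hypothetical and chosen only to show the computation.

```python
from statistics import mean

def contextual_cueing_effect(epoch_means, last_epochs=(5, 6, 7)):
    """Random-minus-predictive mean RT (msec) over the chosen epochs.

    epoch_means maps epoch number -> (predictive mean RT, random mean RT).
    A positive value means predictive displays were answered faster.
    """
    predictive = mean(epoch_means[e][0] for e in last_epochs)
    random_rt = mean(epoch_means[e][1] for e in last_epochs)
    return random_rt - predictive

# Hypothetical per-epoch mean RTs (msec) for one observer, for illustration only.
example_means = {
    1: (1200.0, 1210.0),
    2: (1100.0, 1180.0),
    3: (1040.0, 1150.0),
    4: (990.0, 1120.0),
    5: (950.0, 1100.0),
    6: (940.0, 1090.0),
    7: (930.0, 1080.0),
}

effect = contextual_cueing_effect(example_means)
```

Epoch 8 is deliberately excluded from the measure, since all configurations are random in that epoch.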
Results and Discussion
Figures 2a and 2b show RTs for both predictive and random configurations for set sizes 8 and 12, respectively. RTs below 200 msec and above 4000 msec were removed. This led to the removal of less than 1% of the data. Examining the RTs, we see that both set sizes showed a contextual cueing effect. For set size 8, there was a main effect of configuration and epoch (for epochs 1 to 7), where RTs in the predictive display were faster than those in the random, F(1, 11) = 12.2, p < 0.01, and RTs became faster over time, F(6, 66) = 2.3, p < 0.05. There was also a significant configuration x epoch interaction, F(6, 66) = 3.4, p < 0.01. RTs decreased more across epoch when the display was predictive than when it was random. Comparing the “predictive” RTs between epoch 7 and epoch 8 (where the predictive configurations were no longer valid) we see that RTs increased when the configuration was no longer predictive, t(11) = 3.0, p < 0.05. This suggests that it is the context that is important rather than the absolute target locations². When we collapsed the data across epochs 5–7, the results showed a positive contextual cueing effect: predictive RTs were 152 msec faster than random ones, t(11) = 3.8, p < 0.01.
A similar pattern could be seen for set size 12. Here there was a main effect of configuration and epoch (for epochs 1 to 7), where RTs in the predictive display were faster than those in the random, F(1, 11) = 23.3, p < 0.01, and RTs became faster over time, F(6, 66) = 3.4, p < 0.01. However, there was no configuration x epoch interaction. Collapsing the data across epochs 5 to 7 again showed a valid contextual cueing effect. RTs for predictive trials were 174 msec faster than those for random, t(11) = 4.2, p < 0.01.
Overall error rates were quite low at 3%. There was a significant effect of configuration, F(1, 11) = 5.2, p < 0.05; random trials showed a higher error rate than predictive. None of the other main effects or interactions proved reliable.
The RT data for both set sizes showed a reliable contextual cueing effect. For present purposes, the critical question is the effect of contextual cueing on search slope. Slopes for predictive and random displays are shown as a function of epoch in Figure 3. While there may be some effect, it is not very robust and certainly never yields efficient search for contextually cued targets. There was a main effect of context: over epochs 1 to 7, search slopes were shallower (i.e., search was more efficient) when the displays were predictive than when they were random, F(1, 11) = 6.5, p < 0.05. The effect of epoch was not reliable, F(6, 66) = 0.4, p = n.s. Nor was there a reliable configuration x epoch interaction, F(6, 66) = 0.7, p = n.s. If we take our standard measure and collapse the data across epochs 5 to 7, there was no contextual cueing effect on slope, t(11) = 0.6, p = n.s. If anything, more learning makes the contextual cueing effect on slope less reliable.
Another way to look at this question is to see whether the difference in slope between predictive and random displays can account for the size of the contextual cueing benefit. For example, at set size 12, contextual cueing speeded responses by 174 msec (as calculated from epochs 5–7). In order to account for an effect of this magnitude, slopes in the predictive case would have to be 174 ÷ 12, or approximately 15 msec/item, shallower than in the random case. The observed slope difference, however, was only 5 msec/item (and not reliably different from 0 msec/item). It seems that guidance on its own cannot account for the contextual cueing effect.
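The arithmetic behind this argument can be written out as a short sketch. The function names are hypothetical. The inputs (174 msec, set size 12, and an observed slope difference of 5 msec/item) are the values quoted in the text, and the final step simply applies the same logic in reverse.

```python
def required_slope_advantage(rt_benefit_msec, set_size):
    """Slope difference (msec/item) needed if guidance alone produced the RT benefit."""
    return rt_benefit_msec / set_size

def rt_benefit_from_slope(slope_difference, set_size):
    """RT benefit (msec) that a given slope difference would produce at this set size."""
    return slope_difference * set_size

required = required_slope_advantage(174, 12)   # 14.5 msec/item
explained = rt_benefit_from_slope(5, 12)       # 60 msec
unexplained = 174 - explained                  # 114 msec
```

On this reasoning, the observed slope difference would account for roughly 60 msec of the 174 msec benefit at set size 12, leaving the remainder to be explained by factors other than guidance.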
If there were any effect of contextual cueing on search efficiency, it was very modest. Instead of seeing a marked improvement in search efficiency, search slopes from repeated displays hovered around 30 msec/item, suggesting, at best, that observers can only eliminate a few items from search³. This is similar to data reported by Chun & Jiang (1998). Since it is hard to interpret essentially negative findings, over the course of our research we have replicated this experiment nine other times (see Figure 4). Table 1 gives a brief description of each of the nine new experiments. None of these experiments yielded a reliable difference between predictive and random slopes (again collapsing the data across epochs 5–7), although two experiments did show a marginal benefit (p = 0.09 in both cases). Furthermore, unlike Experiment 1, eight out of nine of these new experiments showed that there was no reliable main effect of predictive versus random configuration on slope (see Table 1). This again suggests that there was little guidance benefit from having a repeated display. A meta-analysis across all 118 participants in all ten experiments showed that the overall RT contextual cueing effect (as measured from the last three epochs) for set size 12 was 172 msec. Using the logic introduced above, we would predict a 14.4 msec/item slope advantage for the predictive displays, if guidance were to account for the contextual cueing effect. However, the average observed benefit was only about half this, at 6.9 msec/item (again not reliably different from 0 msec/item, t(117) = 1.4, p = n.s.). Predictive displays produce, at best, weak slope benefits. Guidance seems to account for, if anything, only a small part of the contextual cueing effect.
Table 1.
Experiment | N | SS | Stimuli | Background Lattice | Main effect of Configuration (Slope) | Main effect of Configuration (RT) |
---|---|---|---|---|---|---|
2 | 12 | 8, 12 | Letters | Yes | No | Marginal |
3 | 8 | 8, 12 | T vs L | No | No | Yes |
4 | 12 | 4, 8, 12 | T vs L | Yes | No | Yes |
5 | 12 | 8, 12 | V vs H | Yes | No | Yes |
6 | 12 | 8, 12 | V vs H | Yes | No | Yes |
7 | 12 | 8, 12 | T vs L* | Yes | Marginal | Yes |
8 | 12 | 8, 12 | V vs H | Yes | No | Yes |
9 | 13 | 8, 12 | T vs L | Yes | No | Yes |
10 | 13 | 8, 12 | V vs H | Yes | Yes | Yes |
Where N = Number of participants and SS = Set Sizes
Stimuli Description:
Letters = Stimuli were heterogeneous letters. The task was to respond to the mirror reversed letter
T vs L = The task was to respond to the orientation of the letter T among rotated distractor Ls (n.b., this was the same task as that of Experiment 1)
T vs L* = The task was to respond to the color of the T among Ls. All stimuli were randomly colored red or green
V vs H = The task was to report whether the target was a vertical or horizontal line. The distractors were oblique lines oriented at 30, 60, −30 or −60 degrees from vertical
Learning appeared rapidly over the first few epochs in Experiment 1 (see Figure 2b). Therefore, one could argue that any slope difference should have emerged early on, perhaps over the first few repetitions. In fact, Chun and Jiang (1998) reported that learning could occur within the first two repeats of a display. To investigate this, we compared the data over the first four repetitions (Block 1) and the next four repetitions of the display (Block 2). If learning occurred after a few trials and resulted in improved guidance, we would expect to find a slope benefit within the first few blocks. However, we did not. There was no difference between search slopes for predictive trials versus random for either Block 1 or Block 2 (t(11) = 1.2, p = n.s., and t(11) = 1.6, p = n.s., respectively). Thus even if learning occurred early on in the experiment, this did not result in improved search slopes. Even if we extend these analyses to look at search slopes across all subsequent blocks (i.e., groups of four successive predictive versus random displays), we see that throughout the experiment there was no reliable benefit (all ts < 2.0, ps = n.s.). Furthermore, a meta-analysis on all ten experiments shown in Figure 4 found no difference between predictive and random search slopes for Blocks 1 or 2 (t(117) = 1.0, p = n.s., and t(117) = 0.1, p = n.s., respectively). This analysis argues that the contextual cueing effect involves, at best, a limited improvement in guidance.
Perhaps we did not find an effect on search slope because of differential contextual cueing effects across set size. It has been suggested that contextual cueing does not occur in crowded displays (e.g., see Hodsoll & Humphreys, 2005), as the context loses some of its distinctiveness. If this were the case, and a display of set size 12 was less distinct than set size 8, there would be less contextual cueing with the former than the latter. Thus a reduction in distinctiveness with increasing set size might offset the benefit of guidance, leading to no net change in slope. We find this explanation unlikely. In Hodsoll and Humphreys’ experiments, displays of set size 10 produced strong contextual cueing, while displays of set size 20 did not. Displays of set size 12 have been shown to produce a robust contextual cueing effect throughout the literature, indicating that they are seen to provide unique and distinct contexts. The difference in distinctiveness between set size 8 and set size 12 seems unlikely to offset any but the weakest of potential guidance effects. However, in the absence of further data we cannot rule out this possibility.
If factors other than attentional guidance were involved in contextual cueing, then we would expect to see reliable differences in intercepts between predictive and random displays. Intercept effects are thought to reflect perceptual processes and/or response selection processes. Figure 5a shows intercept effects across epoch for all of the ten experiments reported above, and Figure 5b shows the difference in predictive versus random displays over the last three epochs. As can be seen there was a clear difference between predictive and random intercepts, reflected in a reliable main effect between predictive and random displays, F(1, 117) = 4.3, p < 0.05, and a significant difference across the last three epochs, t(117) = 2.4, p < 0.05. These data suggest that processes other than guidance must account for some portion of the contextual cueing benefit. Presumably these will be either a facilitation of early processing stages or a facilitation of response selection processes. Experiment 3 investigated the role of this latter component, while Experiment 2 explored whether a contextual cueing effect can still occur when guidance is already optimal.
Experiment 2a
If contextual cueing improved search by guiding attention to the target then it should be of little use when the guidance signal is already strong enough to attract attention to the target location with near certainty. In Experiment 2a, a single letter was presented on each trial. Empty circular placeholders provided the context. There were no distractor items. In this case, standard guidance should direct attention straight to the target. Any guidance by contextual cueing would be redundant.
Method
Participants
Twelve observers between the ages of 18 and 55 years served as participants. Each participant passed the Ishihara test for color blindness and had normal or corrected to normal vision. All participants gave informed consent and were paid for their time.
Apparatus and Stimuli
The apparatus and stimuli were the same as Experiment 1, except that twelve placeholders were presented on every trial, and placeholders and the target were white for all participants.
Procedure
The search task was the same as in Experiment 1. There were 10 practice trials followed by 512 experimental trials divided, for analysis purposes, into 8 epochs of 64 trials. Approximately half of the trials in each epoch had a set size of 1, where the target appeared in one of the placeholders while the remaining placeholders were empty (i.e. no distractor items). The remaining trials had a set size of 12, where the target was in one placeholder and distractors filled the remaining 11 placeholders. An example of set size 1 is shown in Figure 6. The rest of the procedure was the same as that in Experiment 1: half the displays from epochs 1–7 were predictive whereas the rest were random. In epoch 8 every display was random.
Results and Discussion
Overall error rates were quite low at 2%, with no significant effects of set size, configuration or epoch. Neither were any of the interactions reliable. RTs below 200 msec and above 4000 msec were removed. This led to the removal of less than 1% of the data. Figures 7a and 7b show RTs for predictive and random configurations for set size 12 and 1 respectively. RTs at set size 12 showed a similar pattern to those of Experiment 1. There was a main effect of configuration and epoch, where RTs to find targets in predictive configurations (for the first 7 epochs) were faster than those in random, F(1, 11) = 51.3, p < 0.01, and RTs became faster over time, F(6, 66) = 4.3, p < 0.01. The configuration x epoch interaction was not significant. Collapsing the data across epochs 5 to 7 showed a contextual cueing effect: predictive RTs were faster than random ones, t(11) = 6.7, p < 0.01.
For present purposes, the important finding is the small, but reliable, contextual cueing effect at set size 1. There was an RT benefit when the configuration of the display was predictive rather than random, F(1, 11) = 13.6, p < 0.01. There was no effect of epoch, nor a reliable configuration x epoch interaction. Comparing RTs collapsed across epochs 5 to 7 (i.e. the last 3 epochs), however, showed a positive contextual cueing effect. Even when the target was presented in isolation, predictive RTs were faster than random, t(11) = 4.0, p < 0.01. We have replicated this effect two other times: each experiment produced a reliable contextual cueing effect of at least 30 msec (see Figure 8).
Participants should not have had to search for the target at set size 1, because there was only a single item, which was quite obvious in the display. However, we cannot verify that guidance was optimal when only one item was on the screen, because search efficiency can only be measured across set sizes⁴. Therefore, however unlikely, it could be argued that participants were searching the background placeholders for the target. In order to investigate this, we conducted a pilot study where the number of placeholders of each condition varied from 8 to 12. If participants were searching each placeholder for the target then we should expect to find that the RT x placeholder function would have a slope greater than zero. However, the data show that search slopes did not differ from 0 msec/item for either predictive or random trials (0.2 and 0.9 msec/item respectively)⁵. Furthermore the RT data replicated the basic contextual cueing finding. Participants were faster at finding a target in a predictive display compared to a random one (t(9) = 2.1, p = 0.06 and t(9) = 1.9, p < 0.09 for set size 8 and 12 respectively). Attention was directed to the target item when participants were searching for that target among empty placeholders. Experiment 2b shows converging evidence for this using a feature search task. Here the target was so salient that it did not require search to find it and guidance was essentially ‘perfect’, an assumption that can again be tested by computing the search slope.
Experiment 2b
In Experiment 2b, we repeated the basic contextual cueing design using a feature search task instead of a spatial configuration search task. In feature search, the target is known to “pop out” of the display without need for any search at all. Treisman (1985) has shown that explicitly pre-cueing the location of a feature target does not improve detection. Therefore, contextual cueing should provide little or no benefit in a feature search task if it serves only to guide attention to the target location.
Method
Participants
Twelve observers between the ages of 18 and 55 years served as participants. Each participant passed the Ishihara test for color blindness and had normal or corrected to normal vision. All participants gave informed consent and were paid for their time.
Apparatus and Stimuli
The apparatus and stimuli were the same as Experiment 1, except that the target item, T, and its placeholder were always red, whereas the distractor Ls and their placeholders were always green. The target could be immediately identified on the basis of its color, producing a ‘pop-out’ single feature task.
Procedure
There were 10 practice trials followed by 512 experimental trials that were divided into 8 epochs of 64 trials for analysis purposes. Approximately half of the trials in each epoch had a set size of 8 while the remaining trials had a set size of 12. Within each set size, for epochs 1 to 7, half the trials in an epoch had a spatial configuration predicting the target location, whereas the remaining trials had a random, non-predictive configuration. In epoch 8, all of the configurations were randomized.
Results and Discussion
Overall error rates were again low at 4%. There was a significant effect of set size, F(1, 11) = 10.7, p < 0.01, with higher errors at set size 12 than set size 8. The configuration x epoch interaction was also significant, F(7, 77) = 4.7, p < 0.01. Errors at epoch 2 were lower in the predictive condition than errors at other epochs, whereas errors at epoch 2 were higher in the random condition than at other epochs. None of the other main effects or interactions proved reliable. RTs below 200 msec and above 4000 msec were removed. This led to the removal of 1% of the data. Figures 9a and 9b show RTs for both predictive and random configurations for set sizes 8 and 12, respectively. Examining the overall search slopes for both predictive and random trials we see that both slopes are shallow (1.4 msec/item for predictive trials; 1.3 msec/item for random trials) and neither slope is reliably different from 0 msec/item.
While the RT slopes did not differ from 0, there was a significant increase in errors with set size. Does this mean that a speed/accuracy tradeoff might be masking steeper slopes? We think not. The effect of set size, while significant, was quite small, amounting to 0.004 additional errors per item. As a rough measure, we can divide RT by accuracy (Townsend & Ashby, 1978), which yields slopes of 4.6 msec/item for predictive trials and 3.5 msec/item for random trials. These values are well within the range of slopes typically observed for feature search (Wolfe, 1998). Thus, we are confident in describing this experiment as a highly efficient “pop-out” search task. Participants did not have to search the display to find the target.
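The accuracy correction mentioned above (dividing RT by accuracy) can be illustrated with a short Python sketch. The mean RTs and error rates below are hypothetical values chosen only to show the arithmetic. They are not the data from this experiment, and the function names are our own.

```python
def inverse_efficiency(mean_rt_ms, error_rate):
    """RT divided by proportion correct; larger values mean less efficient."""
    accuracy = 1.0 - error_rate
    if accuracy <= 0.0:
        raise ValueError("accuracy must be positive")
    return mean_rt_ms / accuracy

def two_point_slope(score_small, score_large, small=8, large=12):
    """Slope per item between two set sizes."""
    return (score_large - score_small) / (large - small)

# Hypothetical cell means: errors rise by 0.004 per item across 4 extra items.
rt_8, err_8 = 550.0, 0.030
rt_12, err_12 = 555.0, 0.046

raw_slope = two_point_slope(rt_8, rt_12)
adjusted_slope = two_point_slope(
    inverse_efficiency(rt_8, err_8),
    inverse_efficiency(rt_12, err_12),
)
```

Because errors rise with set size in this example, the adjusted slope is steeper than the raw RT slope, yet it stays in the shallow range expected for feature search.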
As in the previous experiment, the important finding is that there is a small contextual cueing effect for this highly efficient search task. Taking set size 8 first, we see that although there was no main effect of configuration or epoch, and no reliable configuration x epoch interaction, RTs for predictive trials were marginally faster than random ones across epochs 5 to 7, t(11) = 1.9, p = 0.09. With set size 12 we find that overall RTs for predictive configurations were significantly faster than those for random, F(1, 11) = 9.4, p = 0.01. There was no effect of epoch and no reliable configuration x epoch interaction. However, collapsing the data across epochs 5 to 7 showed a positive contextual cueing effect, t(11) = 2.4, p < 0.05. Taken together, these results and those from Experiment 2a show that even with a task that requires no search, and thus already has a ‘perfect’ guidance signal, a predictive display still benefited reaction times.
The contextual cueing benefit from Experiment 2a and 2b is interesting and could reflect the contribution of any of several factors. For example, it could plausibly reflect a benefit in figure-ground segmentation, early processing stages or perhaps in response selection (see Experiment 3). It may even occur as the predictive context adds an "extra" guidance signal to feature search so that the target is found faster. In other words, although guidance is "perfect" in a color feature search task, perhaps the repeated configuration provides a small but significant additional boost to the color guidance signal, producing the contextual cueing effect in Experiment 2b. If this were indeed the case, the RT benefit should also be seen in other feature searches where an additional "guidance" signal has been added. Take, for example, an orientation feature search for a vertical line among horizontal lines. Would the addition of a redundant color signal produce an RT benefit even if search was already highly efficient?
A control study revealed that this is not, in fact, what occurs. In a present/absent search task, participants searched for a vertical line among horizontal lines. On half the trials, the target and distractors were all red. In the remaining trials, half of the distractors were red while the other half were green. If it were possible to add an "extra" guidance signal to a "perfect guidance" task, we would expect an RT benefit on the trials with green and red distractors, as participants would be able to use the color information as extra guidance away from green distractors and towards the (red) target. However, the results indicated that there was no difference in RT between the two types of trials. These data suggest that the RT benefit found in the contextual cueing Experiments 2a and 2b was not due to any "extra" guidance signal provided by the spatial context.
If the small but reliable contextual cueing effect in efficient search is unlikely to be due to guidance, what processes might benefit from the repeated, predictive context? In the introduction we suggested that a familiar environment might aid response selection; in particular it may reduce the threshold needed in order to respond to a target. If we implicitly learn that the potato masher is always located next to the fridge in our mother’s kitchen, then we might be more ready to respond to the presence of the masher in that location than we would be if we were in a novel kitchen. We explored this possibility in Experiment 3. If contextual cueing really does facilitate response selection, then interfering with response selection would be expected to interfere with the contextual cueing effect seen under perfect guidance conditions.
Experiment 3
Experiment 3 investigated whether the small contextual cueing benefits found in Experiment 2 could have been due to response selection. Here we used a search task where distractor items elicited either a congruent or incongruent response (Starreveld et al., 2004), a manipulation known to produce interference at the level of response selection (e.g. Cohen & Magen, 1999; Cohen & Shoup, 1997; Eriksen & Eriksen, 1974). If the contextual cueing benefit in Experiments 2a and 2b were due to facilitation within response selection, we would expect to find a similar contextual cueing effect when there was no response selection interference (i.e., when the distractors and target were congruous) but not when interference was added to this process (i.e., on incongruent trials). On the other hand, if contextual cueing occurred at the guidance stage of search or at an early, perceptual stage, then standard additive factor analysis of RT would predict that the effects of a response selection manipulation would be additive with the contextual cueing effect. As shown in Experiment 3, response selection manipulations interact with the contextual cueing effect, suggesting that they occur at the same stage of processing.
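The additive-factors reasoning above can be made concrete with a small Python sketch. The cell means are hypothetical and serve only to contrast the two predictions. The function names are our own and do not correspond to any analysis reported here.

```python
def cueing_effect(random_rt, predictive_rt):
    """Contextual cueing effect in msec: random minus predictive RT."""
    return random_rt - predictive_rt

def interaction_contrast(cells):
    """Cueing effect on congruent trials minus cueing effect on incongruent trials."""
    congruent = cueing_effect(cells[("congruent", "random")],
                              cells[("congruent", "predictive")])
    incongruent = cueing_effect(cells[("incongruent", "random")],
                                cells[("incongruent", "predictive")])
    return congruent - incongruent

# Hypothetical mean RTs (msec). Additive pattern: the congruency cost and the
# cueing benefit simply sum, as predicted if they arise at different stages.
additive = {
    ("congruent", "random"): 600.0, ("congruent", "predictive"): 580.0,
    ("incongruent", "random"): 660.0, ("incongruent", "predictive"): 640.0,
}

# Interactive pattern: the cueing benefit vanishes on incongruent trials,
# as predicted if both factors act on response selection.
interactive = {
    ("congruent", "random"): 600.0, ("congruent", "predictive"): 580.0,
    ("incongruent", "random"): 660.0, ("incongruent", "predictive"): 660.0,
}

additive_contrast = interaction_contrast(additive)        # 0.0
interactive_contrast = interaction_contrast(interactive)  # 20.0
```

A contrast near zero is the additive outcome, whereas a contrast reliably different from zero indicates that congruency and configuration interact.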
Method
Participants
Twelve observers between the ages of 18 and 55 years served as participants. Each participant passed the Ishihara test for color blindness and had normal or corrected to normal vision. All participants gave informed consent and were paid for their time.
Apparatus and Stimuli
The apparatus and stimuli were the same as Experiment 2b, except that here the target was either a red A or a red R. The distractors were either all green As or all green Rs. There were no placeholders around the target or distractor items. Like Experiment 2b, the target could be immediately identified on the basis of its color, producing a ‘pop-out’ single feature task.
Procedure
There were 10 practice trials followed by 448 experimental trials that were divided into 7 epochs of 64 trials for analysis purposes. Half of the trials in each epoch were predictive whereas the other half were random. Within the predictive trials, half of the displays were congruent (a red A among green As or a red R among green Rs) whereas the other half were incongruent (a red A among green Rs or a red R among green As). For predictive trials a configuration was always either congruent or incongruent. It was never both. The set size was always 12 and participants had to respond to whether the red letter was an A or an R.
Results and Discussion
As is typical in experiments of this sort, error rates were higher for incongruent trials (7%) than they were for congruent trials (3%), F(1, 11) = 15.2, p < 0.01. However, none of the other main effects or interactions were reliable. RTs below 200 msec and above 4000 msec were removed. This led to the removal of less than 1% of the data.
Figures 10a and 10b show a comparison of contextual cueing effects for congruent and incongruent trials, respectively, while Figure 11 shows a comparison of contextual cueing effects over the last 3 epochs for both the congruent and incongruent trials. Consistent with previous work, there was an overall effect of congruency, F(1, 11) = 76.1, p < 0.01. Participants were faster at responding to congruent targets than to incongruent ones. To examine how congruency affects contextual cueing, we computed the contextual cueing effect separately for congruent and incongruent trials. As predicted, there was a contextual cueing effect for the congruent trials. RTs were faster when the display was predictive than when it was random, F(1, 11) = 9.9, p < 0.01. This effect was also observed when we collapsed the data over the last 3 epochs (epochs 5–7), t(11) = 3.8, p < 0.01. Examining incongruent trials, however, there was no evidence of a contextual cueing effect. There was no benefit of predictive displays over random, either overall, F(1, 11) = 0.4, n.s., or in the last 3 epochs, t(11) = 0.6, n.s. None of the other main effects or interactions were significant.
It is generally accepted that the slowing of RTs in incongruent displays is due to interference at the response selection level (e.g. Cohen & Magen, 1999; Cohen & Shoup, 1997; Eriksen & Eriksen, 1974). Here, all elements in the visual field are processed up to the level in which their associated response has been activated. On incongruent trials, the target item and distractor items activate competing responses, slowing RTs. Experiment 3 demonstrates that interference at the response selection level negates the contextual cueing benefit, at least for feature search displays. This suggests that contextual cueing acts, at least in part, by speeding responses to targets in a familiar context.
General Discussion
Chun and Jiang (1998) found that if a target was embedded in a repeated display where the configuration predicted its location, RTs to find the target were faster than conditions where the display configuration did not predict the target location. We present three experiments suggesting that attentional guidance cannot account for the entire contextual cueing benefit. Experiment 1 examined the effect of contextual cueing on search slopes. Search slopes are assumed to reflect search efficiency: improving the attentional guidance signal reduces search slopes. We found that contextual cueing mainly reduced RTs (and intercepts), and produced, at best, a weak reduction in search slopes. Furthermore, we failed to find a reliable search slope difference between predictive and random displays across nine other experiments. Even when the results were pooled across all these experiments, there was no benefit in search slope. Interestingly, we found similar effects in a study where we investigated whether contextual cueing could occur as a result of global background features (Kunar, Flusberg & Wolfe, in press). Although repeating global background features provided a reliable RT benefit, there was no improvement in search efficiency and hence no evidence for an improvement in attentional guidance.
Experiments 2 and 3 investigated contextual cueing in tasks where attention was deployed directly to the target item, leaving little role for further guidance by contextual cueing. Nevertheless, Experiment 2a showed that a small contextual cueing effect occurred even when the target was presented in isolation. The same result was obtained when the target was defined by a color singleton in Experiment 2b. Search slopes were not different from 0 msec/item, indicating that the deployment of attention here was already perfect. Thus, the additional benefit found was unlikely to be due to guidance. Adding interference to the response selection stage, however, eliminated contextual cueing (Experiment 3). Here the predictive display could either be made up of distractors eliciting a congruent response to the target or those eliciting an incongruent one. In the incongruent case, the target item and the distractor items both activated competing responses, interfering with response selection. With this interference, the contextual cueing effect disappeared. The combination of these experiments argues against contextual cueing occurring solely as a result of attentional guidance and suggests a contribution from other factors, including response selection.
A role for guidance in contextual cueing?
Do these results mean that attentional guidance plays no role in contextual cueing? Although our data suggest that there is little effect on guidance, we do not want to rule out the possibility that contextual cueing could contribute to attentional guidance on some trials. Although their search slopes did not reach that expected by perfect guidance, Chun & Jiang (1998) observed a small but significant slope reduction in their studies. Likewise, two of our ten experiments investigating search efficiency in contextual cueing showed a marginal trend towards search slope improvement (see Figure 4). Peterson and Kramer (2001) also investigated the role of guidance in contextual cueing by measuring eye movements and noting when the eyes went to the target. They found that, although contextual cueing increased the proportion of trials where gaze went to the target on the first saccade from 7.1% to 11.3%, it did not provide perfect guidance on every trial. However, the overall number of saccades necessary to find the target was lower in repeated (predictive) displays than in unrepeated (random) displays. They concluded that recognition of the context is highly imperfect. Sometimes the context is recognized immediately, other times recognition does not occur for some time (if at all). Once the display has been recognized, then guidance can take place. This result argues that contextual cueing does improve guidance on a fraction of trials. However, there must also be other mechanisms that are responsible for the robustness of the contextual cueing effect.
Given enough time, context seems bound to improve search efficiency. Indeed, in recent work, we have found a decrease in slopes in versions of contextual cueing tasks that involve much longer RTs produced by higher set sizes or more complicated display backgrounds (Kunar, Flusberg & Wolfe, in press and Kunar, Flusberg & Wolfe, submitted, respectively). Similarly, an improvement of search efficiency can be seen if participants are presented with the display context substantially prior to the search stimuli (Kunar, Flusberg & Wolfe, submitted) or when participants are given explicit knowledge about the context (Kunar, Flusberg & Wolfe, in press). Therefore it seems that under certain circumstances, context can guide the deployment of attention. However, this form of guidance is relatively slow. In faster search tasks like the ones reported here and in the classic Chun and Jiang work, the robust contextual cueing effect seems to involve contributions from factors other than guidance.
Recent work by Brady and Chun (submitted) offers another reason why any guidance by contextual cueing is likely to be limited. They found that, in contextual cueing, participants only learned the association between the target location and its immediately surrounding distractors (see also Olson & Chun, 2002). In this case only the local context of the display would be available to guide search. If only a few items in the display serve to guide attention, this might explain why the slope benefit is so small. However, in our experiments, a small (statistically unreliable) slope benefit is accompanied by a large net RT benefit. Thus, the Brady and Chun proposal can only account for part of the contextual cueing effect. Some other factor must be posited to account for the large set-size independent effect.
Response selection in contextual cueing
Our results suggest that there is a response selection component to contextual cueing. How might this work? One possibility is that having a predictive display allows you to respond to the target faster once it has been found. To return to the example we used earlier, it may not be easier to find the potato masher in its habitual place by the fridge, but you may be faster to respond to it if you know its location. This benefit may arise in a number of ways. For example, if the target is in a familiar place, any need to ‘double check’ that you have found the target will be eliminated, leading to faster RTs. Similarly, perhaps the response threshold is lower when the target appears in a familiar context than when it appears in a novel one (see Figure 12). Imagine that an observer normally requires a certain amount of information, X, in favor of a given target identity before committing to that response. If the target appears in a habitual location, however, he/she might reduce this threshold by some amount, N. This allows the threshold to be crossed sooner, leading to the contextual cueing effect.
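The threshold account can be sketched with a deliberately simple model. The Python code below assumes that evidence for the target identity accumulates at a constant rate, which is our simplifying assumption and not a claim made in the text. X and N follow the notation above, and the numerical values are hypothetical.

```python
def time_to_threshold(threshold, rate_per_ms):
    """Time (msec) for linearly accumulating evidence to reach `threshold`."""
    if rate_per_ms <= 0.0:
        raise ValueError("accumulation rate must be positive")
    return threshold / rate_per_ms

# Hypothetical parameters (arbitrary evidence units).
X = 100.0     # evidence normally required before responding
N = 5.0       # threshold reduction in a familiar context
RATE = 0.25   # evidence units gained per msec (assumed constant)

novel_rt = time_to_threshold(X, RATE)         # 400.0 msec
familiar_rt = time_to_threshold(X - N, RATE)  # 380.0 msec
benefit = novel_rt - familiar_rt              # 20.0 msec, equal to N / RATE
```

Under this assumption the benefit equals N divided by the accumulation rate. It does not depend on how long it took to find the target, which fits an effect on the intercept of the RT × Set Size function and not on its slope.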
Another possibility is that when the target is processed, the context is to some extent encoded along with the target itself (e.g. Fazl, Grossberg, & Mingolla, 2005). Thus, when the target is attended in a repeated display, the memory traces of prior episodes with similar contexts are retrieved. Since these traces are associated with responses, retrieval speeds response selection or execution.
Floor effects or components of contextual cueing?
One might ask why the contextual cueing effects for set size one and feature search tasks are smaller than contextual cueing effects for larger set sizes. For example, in the standard contextual cueing task of Experiment 1, with set size 12 the RT benefit was 174 msec. For set size 1 and the feature search tasks in Experiment 2a, 2b and the congruent trials of Experiment 3, the benefits were 33 msec, 12 msec, and 20 msec respectively. There are at least two possible reasons why these latter effects are smaller. Firstly, there may be floor effects. If it is easier to make a response in the pop-out searches, then there will be less room for contextual cueing improvements. Indeed, the surprising finding is that even when RTs were already almost at floor, a predictive display could further reduce response times. A second possibility may be that these small effects reflect response selection factors alone, which are only partly responsible for the contextual cueing benefit. When participants have to search through larger set sizes, repeated displays may recruit additional processes (these could be early perceptual processes, see below). Alternatively, as noted above (see Kunar, Flusberg & Wolfe, submitted) if set sizes are large enough and the time taken to respond to the display is long enough then some guidance processes may come into play.
Other factors in Contextual Cueing
Given that evidence for attentional guidance in contextual cueing is weak and that response selection seems to account for only a part of the contextual cueing effect, what other factors might be involved? One possibility is that contextual cueing helps in the initial processing of the display. That is, a predictive display may help us parse the stimuli from the background. Pilot work in our lab has shown that this might be the case. A greater RT benefit for predictive displays is found in complex displays, where it is more difficult to separate the distractor and targets from the background, than in displays where this segregation is easy. It seems that if the segregation between background and display is more difficult, a predictive context will help display parsing more, which in turn will lead to faster response times. It is up to future work to examine whether, together with weak guidance and facilitated response selection, these contributors can account for the contextual cueing effect in full. In the meantime, however, data from these experiments suggest that attentional guidance is not, by itself, an adequate account of contextual cueing.
Acknowledgements
This research was supported by a grant from the National Institute of Mental Health to JMW. We wish to thank Kristin Michod for her assistance with data collection and Yuhong Jiang, Marvin Chun and one anonymous reviewer for their helpful comments.
Footnotes
Publisher's Disclaimer: The following manuscript is the final accepted manuscript. It has not been subjected to the final copyediting, fact-checking, and proofreading required for formal publication. It is not the definitive, publisher-authenticated version. The American Psychological Association and its Council of Editors disclaim any responsibility or liabilities for errors or omissions of this manuscript version, any version derived from this manuscript by NIH, or other third parties. The published version is available at http://www.apa.org/journals/xhp/
Chun & Jiang (1998) found that search slopes did become more efficient across time. However, slopes for their spatial configuration task never reached the efficiency of feature or even conjunction searches, showing instead a modest improvement from 40 msec/item to a still inefficient 30 msec/item. To our knowledge no other contextual cueing experiment could measure slope, since none varied set size. However, several studies in our lab (see Experiment 1 and Figure 4) have failed to replicate even the modest effect observed by Chun & Jiang (1998).
This same general pattern occurs in Experiments 2a and 2b reported here. However, in the interest of saving space we do not report these statistics further.
This agrees with work by Brady and Chun (submitted) who suggested that as contextual cueing emerges as a result of learning the association between the target and a few local distractors, the reduction in search slopes should be limited.
Please note that we did not derive search slopes in this experiment, as set size 1 is a special case that does not reflect search in general. Instead we rely on the data from Experiment 1 and its replications to address the effects of slope in contextual cueing.
Since we did not run a control condition without placeholders, we cannot rule out the possibility that the placeholders themselves may have slowed RTs. Nevertheless, the relevant point is that participants were not searching among the placeholders; attention was directed immediately to the target.
References
- Biederman I. Perceiving real-world scenes. Science. 1972;177:77–80. doi: 10.1126/science.177.4043.77.
- Brady TF, Chun MM. Spatial Constraints on Learning in Visual Search: Modeling Contextual Cueing. Manuscript submitted for publication. doi: 10.1037/0096-1523.33.4.798. (Submitted)
- Brainard DH. The Psychophysics Toolbox. Spatial Vision. 1997;10:443–446.
- Chun MM. Contextual cueing of visual attention. Trends in Cognitive Science. 2000;4:170–178. doi: 10.1016/s1364-6613(00)01476-5.
- Chun MM, Jiang Y. Contextual cueing: implicit learning and memory of visual context guides spatial attention. Cognitive Psychology. 1998;36:28–71. doi: 10.1006/cogp.1998.0681.
- Chun MM, Jiang Y. Top-down Attentional Guidance Based on Implicit Learning of Visual Covariation. Psychological Science. 1999;10:360–365.
- Chun MM, Jiang Y. Implicit, long-term spatial contextual memory. Journal of Experimental Psychology: Learning, Memory, & Cognition. 2003;29:224–234. doi: 10.1037/0278-7393.29.2.224.
- Cohen A, Magen H. Intra- and cross-dimensional visual search for single feature targets. Perception & Psychophysics. 1999;61:291–307. doi: 10.3758/bf03206889.
- Cohen A, Shoup R. Perceptual dimensional constraints in response selection processes. Cognitive Psychology. 1997;32:128–181. doi: 10.1006/cogp.1997.0648.
- Endo N, Takeda Y. Selective learning of spatial configuration and object identity in visual search. Perception & Psychophysics. 2004;66(2):293–302. doi: 10.3758/bf03194880.
- Eriksen BA, Eriksen CW. Effects of noise letters upon the identification of a target letter in a nonsearch task. Perception & Psychophysics. 1974;16:143–149.
- Fazl A, Grossberg S, Mingolla E. Invariant object learning and recognition using active eye movements and attentional control [Abstract]. Journal of Vision. 2005;5(8):738a.
- Hodsoll JP, Humphreys GW. Preview Search and Contextual Cuing. Journal of Experimental Psychology: Human Perception and Performance. 2005;31(6):1346–1358. doi: 10.1037/0096-1523.31.6.1346.
- Hoffmann J, Sebald A. Local contextual cuing in visual search. Experimental Psychology. 2005;52(1):31–38. doi: 10.1027/1618-3169.52.1.31.
- Jiang Y, Chun MM. Selective Attention Modulates Implicit Learning. The Quarterly Journal of Experimental Psychology (A). 2001;54(4):1105–1124. doi: 10.1080/713756001.
- Jiang Y, Leung AW. Implicit learning of ignored visual context. Psychonomic Bulletin & Review. 2005;12(1):100–106. doi: 10.3758/bf03196353.
- Jiang Y, Leung A, Burks S. Source of individual differences in spatial context learning. Memory & Cognition. (submitted)
- Jiang Y, Song J-H, Rigas A. High-capacity spatial contextual memory. Psychonomic Bulletin & Review. 2005;12(3):524–529. doi: 10.3758/bf03193799.
- Jiang Y, Wagner LC. What is learned in spatial contextual cuing – configuration or individual locations? Perception & Psychophysics. 2004;66(3):454–463. doi: 10.3758/bf03194893.
- Kunar MA, Flusberg SJ, Wolfe JM. Contextual cueing by global features. Perception & Psychophysics. doi: 10.3758/bf03193721. (in press)
- Kunar MA, Flusberg SJ, Wolfe JM. Time to Guide: Evidence for Delayed Attentional Guidance in Contextual Cueing. Visual Cognition. doi: 10.1080/13506280701751224. (submitted)
- Lleras A, Von Mühlenen A. Spatial context and top-down strategies in visual search. Spatial Vision. 2004;17(4–5):465–482. doi: 10.1163/1568568041920113.
- Logan G. Attention and preattention in theories of automaticity. American Journal of Psychology. 1992;105:3127–3339.
- Olson IR, Chun MM. Perceptual constraints on implicit learning of spatial context. Visual Cognition. 2002;9(3):273–302.
- Pelli DG. The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision. 1997;10:437–442.
- Peterson MS, Kramer AF. Attentional guidance of the eyes by contextual information and abrupt onsets. Perception & Psychophysics. 2001;63:1239–1249. doi: 10.3758/bf03194537.
- Starreveld PA, Theeuwes J, Mortier K. Response selection in visual search: The influence of response compatibility of nontargets. Journal of Experimental Psychology: Human Perception and Performance. 2004;30:56–78. doi: 10.1037/0096-1523.30.1.56.
- Townsend JT, Ashby FG. Methods of modeling capacity in simple processing systems. In: Castellan NJ Jr, Restle F, editors. Cognitive Theory. Vol. 3. Hillsdale, NJ: Lawrence Erlbaum; 1978. pp. 199–239.
- Treisman A. Preattentive processing in vision. Computer Vision, Graphics and Image Processing. 1985;31:156–177.
- Treisman A, Gelade G. A feature-integration theory of attention. Cognitive Psychology. 1980;12:97–136. doi: 10.1016/0010-0285(80)90005-5.
- Tseng Y, Li CR. Oculomotor correlates of context-guided learning in visual search. Perception & Psychophysics. 2004;66(8):1363–1378. doi: 10.3758/bf03195004.
- Wolfe JM. What do 1,000,000 trials tell us about visual search? Psychological Science. 1998;9(1):33–39.
- Wolfe JM, Horowitz TS. What attributes guide the deployment of visual attention and how do they do it? Nature Reviews Neuroscience. 2004;5(6):495–501. doi: 10.1038/nrn1411.
- Wolfe JM, Oliva A, Horowitz TS, Butcher SJ, Bompas A. Segmentation of objects from backgrounds in visual search tasks. Vision Research. 2002;42:2985–3004. doi: 10.1016/s0042-6989(02)00388-7.