Learning in repeated visual search

Michael C Hout; Stephen D Goldinger

doi:10.3758/APP.72.5.1267

Author manuscript; available in PMC: 2014 Nov 23.

Published in final edited form as: Atten Percept Psychophys. 2010 Jul;72(5):1267–1282. doi: 10.3758/APP.72.5.1267

PMCID: PMC4241378 NIHMSID: NIHMS642333 PMID: 20601709

Abstract

Visual search (e.g., finding a specific object in an array of other objects) is performed most effectively when people are able to ignore distracting nontargets. In repeated search, however, incidental learning of object identities may facilitate performance. In three experiments, with over 1,100 participants, we examined the extent to which search could be facilitated by object memory and by memory for spatial layouts. Participants searched for new targets (real-world, nameable objects) embedded among repeated distractors. To make the task more challenging, some participants performed search for multiple targets, increasing demands on visual working memory (WM). Following search, memory for search distractors was assessed using a surprise two-alternative forced choice recognition memory test with semantically matched foils. Search performance was facilitated by distractor object learning and by spatial memory; it was most robust when object identity was consistently tied to spatial locations and weakest (or absent) when object identities were inconsistent across trials. Incidental memory for distractors was better among participants who searched under high WM load, relative to low WM load. These results were observed when visual search included exhaustive-search trials (Experiment 1) or when all trials were self-terminating (Experiment 2). In Experiment 3, stimulus exposure was equated across WM load groups by presenting objects in a single-object stream; recognition accuracy was similar to that in Experiments 1 and 2. Together, the results suggest that people incidentally generate memory for nontarget objects encountered during search and that such memory can facilitate search performance.

Consider a task in which you search for real-world objects. For example, imagine that you are asked to determine whether some target object (e.g., a hammer) is present among a set of distractor objects (e.g., a phone, a computer, a shoe). As you search for the hammer, what might you learn about the nontargets? Although the task is most efficiently completed by ignoring the distractors, people must process them to some degree and may incidentally learn their identities (Williams, Henderson, & Zacks, 2005). Indeed, if the same set of distractors was used repeatedly, such learning might facilitate search performance (Chun & Jiang, 1999; Endo & Takeda, 2004; Mruczek & Sheinberg, 2005). For instance, knowledge of the appearance or locations of the phone, computer, and shoe may increase your speed in locating a new target or determining its absence.

In the present investigation, we assessed whether visual search could be facilitated through incidental learning of repeated distractors and whether such learning involved information about both object identities and spatial layouts. In all the experiments, the stimuli were real-world, nameable objects. On each trial in the first experiment, participants searched for new target objects embedded within sets of repeated distractors, indicating whether the target(s) was present in or absent from the display. The location and identity of the target(s) were unpredictable. Search was conducted among fixed spatial locations with fixed object-to-location mapping (Experiment 1A), fixed spatial locations with random object-to-location mapping (Experiment 1B), random locations with fixed object identities (Experiment 1C), and fixed locations with random object identities (Experiment 1D). In Experiment 2, each of these conditions was examined again, but search targets were present on all trials. In both experiments, we also assessed whether a visual working memory (WM) load interacted with the learning of search arrays. Search facilitation was measured, and incidental recognition memory was later assessed using surprise two-alternative forced choice (2AFC) token discrimination tests. Finally, in Experiment 3, participants were given a passive search task in which the search items were presented centrally in rapid succession; incidental memory for distractors was again tested following all search trials.

Incidental Acquisition of Visual Information

It is clear that substantial information is incidentally acquired during visual search. For example, Castelhano and Henderson (2005; see also Hollingworth & Henderson, 2002) had participants view photographs of real-world scenes in two different tasks: intentional memorization and visual search. In both conditions, the participants performed reliably above chance for basic-level token discrimination (i.e., “which of these two different objects did you see previously?”) and for mirror image discrimination (i.e., “which manifestation of this image did you see previously?”), suggesting that detailed long-term visual representations were incidentally generated during scene perception.

Williams et al. (2005) investigated incidental retention of visual details for real-world objects encountered during a search task. Participants counted the number of targets present in an array of 12 photographs, consisting of four image types: targets, distractors matched for category but not for color (category distractors), distractors matched for color but not for category (color distractors), and unrelated distractors. Search arrays were presented twice, once per block of trials. Results showed that all classes of objects were viewed less frequently upon second presentation of the search arrays, suggesting that memory for their visual details was acquired during the first presentation. Incidental recognition memory for search objects was tested with foils matching the semantic labels of the presented objects (e.g., if the presented object was a black phone, the foil was a new black phone). Memory performance was above chance for each class of objects and was best for search targets (83%) and distractors related to the target (≈60%); both types of related distractors were remembered better than unrelated distractors. Williams et al. concluded that detailed visual information for objects is incidentally encoded during visual search.

Search Efficiency and Facilitation

If visual information is acquired during search, it would be reasonable to expect such learning to enhance search efficiency (as indexed by the slope of the line relating response time [RT] to set size). However, this does not seem to be the case. The long-term benefits of repeated visual search were investigated by Wolfe and colleagues in several experiments. Wolfe, Klempen, and Dahlen (2000) found that repeated presentations of an unchanged display elicited shorter RTs as a function of trial number but did not increase search efficiency; stimuli included letters, shapes, simple conjunctions (e.g., a horizontal black bar), and compound conjunctions (e.g., a red circle topped with a yellow vertical line). If search was inefficient on the first presentation, it remained inefficient even after 350 presentations, suggesting that the perceptual effects of attention vanish once attention has been redeployed elsewhere. Wolfe, Oliva, Butcher, and Arsenio (2002) presented participants with more complex stimuli: realistically colored, computer-generated real-world objects (e.g., a coffee machine, a laptop, a fruit bowl). The participants were first shown a number of search objects. At the start of the trial, each of the objects was rotated slightly, with 50% of the trials also replacing one object with a scrambled version of itself. The participants indicated the presence or absence of a scrambled object. In the repeated search condition, the same objects were continuously present on the screen for an entire block of 288 trials; the unrepeated condition consisted of a random selection of objects on each trial. The repeated and unrepeated conditions were not reliably different in mean RTs or search slopes. In another experiment, Wolfe et al. (2002) found that repeated visual search was no more efficient than unrepeated search when participants looked for a scrambled object among normal objects in a realistic scene. 
Wolfe and colleagues argued that repeated search may lead an observer to acquire visual information but does not enable the observer to perceive the display more efficiently; objects appear to be recognized one at a time. They suggested that when attention is moved away from an object, it no longer affects visual perception and that the attentional guidance used to find a target in a repeated scene is therefore quite similar to the guidance used in a novel scene.

Although Wolfe and colleagues have consistently shown that search efficiency is not enhanced as a scene is repeated, it seems that memory for search displays may nevertheless affect performance. Numerous studies have complemented Wolfe et al.’s (2002) finding that, as search is conducted within a repeated scene, search decisions may be reached more quickly, reducing RTs as a function of experience with the display. In the contextual cuing paradigm (Chun & Jiang, 2003; Jiang & Leung, 2005; Jiang & Song, 2005), repeated presentations of spatial configurations allow participants to more quickly locate targets after only a few repetitions. Chun and Jiang (1998) had participants search for a rotated T among a set of rotated Ls. Half of the search configurations were repeated, and targets appeared in consistent locations within these displays. Targets in repeated configurations were located more quickly than were targets in randomly configured displays, presumably due to learned associations between spatial configurations and target locations. Chun and Jiang (1999) investigated the extent to which search could be facilitated when target identities were cued by the presence of consistent distractor identities. They presented people with novel objects: Search targets were symmetrical around the vertical axis (0°); distractor objects were symmetrical around other orientations (30°, 60°, 90°, 120°, 150°). On consistent-mapping trials, a given target shape was paired with the same distractor identities. On varied-mapping trials, random assortments of distractors were paired with targets. Unlike the prior experiments (in which spatial configurations predicted target location), the locations of targets and distractors were randomized on each trial. RTs were nevertheless shorter in the consistent-mapping condition, suggesting that sensitivity to the distractor context facilitated search performance by cuing the identity of the target.

The contextual-cuing experiments suggest that visual search is improved when people can learn associations between distractor configurations and target locations or between distractor identities and target identities. Taking this further, Endo and Takeda (2004) reported that people can also learn associations between distractor identities and target locations. Using a closed contour search task (i.e., the targets were abstract shapes with closed contours, shown among nonclosed distractors), they investigated each of the possible correlations between target information and distractor regularity. In their fourth experiment, distractor configurations were correlated with consistent target identities (configuration repetition condition), and distractor identities were correlated with consistent target locations (identity repetition condition), in a mixed-block design. Thus, in the configuration repetition condition, the spatial layout cued what the target was but not its location; in the identity repetition condition, distractor identities cued where the target was but not its identity. The results showed a contextual-cuing effect in the identity repetition condition, but not in the configuration repetition condition: Participants could use distractor identities to locate targets but did not learn associations between distractor configurations and target identities.

Considerations of spatial layouts aside, familiarity with targets and distractors also influences search performance. For instance, Frith (1974) reported that it is more difficult to visually scan for a letter among mirrored letters than the converse (see also Reicher, Snyder, & Richards, 1976; Richards & Reicher, 1978). More recently, Mruczek and Sheinberg (2005) examined the effect of target and distractor familiarity on visual search for heterogeneous stimuli. They manipulated levels of familiarity with large sets of targets and distractors by engaging participants in prolonged search for photographs of real-world objects. As the participants gained more experience with the images, RTs and search slopes decreased. Familiarity effects were indicated by faster search among familiar distractors, relative to unfamiliar distractors, and by faster location of familiar targets. They argued that incidental encoding of the images allowed people to more efficiently analyze and dismiss objects, supporting a role for item memory in visual search.

The Present Investigation

In the present investigation, we tested the extent to which visual search performance would benefit from incidental learning of spatial information and object identities. We employed an unpredictable search task for complex visual stimuli; the location and identity of targets were randomized on every trial, and potential targets were seen only once. Thus, the participants were given a difficult serial search task that required careful attentional scanning. Our fundamental question was, to what extent can incidental learning of background information aid in the location (or absence determination) of a previously unseen object?

We sought to answer four specific questions. First, when search is unpredictable, can incidental learning of complex distractor objects facilitate performance over a short period of time? Second, if such learning occurs, does it rely on a correlation between distractor identities and spatial locations, or can performance be improved entirely by memory for the objects? Third, how well are distractors remembered when participants are given no reason to encode them? And fourth, will a concurrent visual WM load interact with distractor learning? We examined search performance and incidental memory (indicated by search times and recognition performance, respectively) as a function of WM load, which was manipulated by requiring people to search for varying numbers of potential targets. Previous work by Menneer, Barrett, Phillips, Donnelly, and Cave (2007; Menneer, Cave, & Donnelly, 2009) indicated that multiple-target search is less accurate than single-target search. We varied the number of search targets such that WM load would be germane to the task at hand. As WM load increased, the participants were required to maintain more potential target images in memory, any (or none) of which might appear in the search array. Task demands not only required the maintenance of visual information in memory, but also tapped executive functions by tacitly requiring the participants to compare each viewed object with the visual objects held in memory.

In all the experiments, the participants repeatedly searched for new targets embedded within sets of repeated distractors. In Experiment 1, they indicated whether targets were present or absent, and in Experiment 2 (in which targets were always present), they indicated which of several potential targets was located. In both experiments, we examined the contributions of spatial consistency and object repetition by systematically decoupling these sources of information. Our first condition (Experiments 1A and 2A) consisted of fixed spatial locations with fixed object-to-location mapping. That is, the same distractor objects appeared in the same places across trials, with targets replacing one distractor per trial (on target-present trials). It was expected, and found, that these highly stable displays would promote the strongest learning across search trials.

The second condition (Experiments 1B and 2B) employed fixed spatial locations with random object-to-location mapping. In this case, the spatial layout was constant, but the distractors randomly traded positions on every trial. The participants could potentially benefit from repeated objects or repeated layouts but could not use any correlations between objects and positions. The third condition (Experiments 1C and 2C) used the same repeated objects, now in random spatial locations across trials. The participants could benefit from seeing repeated distractors but could not use spatial information. Finally, the fourth condition (Experiments 1D and 2D) consisted of fixed spatial locations (as in Condition 1), but object identities were less predictable. Specifically, in the first three conditions, one set of objects was used repeatedly in one block of trials, and a second set of objects was used repeatedly in a second block. In the fourth condition, we spread the use of all objects across both blocks, in randomly generated sets, with the restriction that all the objects must be repeated as often as in the previous three conditions. In this manner, we reduced the likelihood of repeated objects by 50%, while holding spatial layouts constant, making these conditions most naturally comparable to the “A” and “B” conditions in each experiment. Figure 1 provides examples of all these conditions.

Figure 1. Examples of each level of spatial and object consistency employed across experiments. Note that actual search displays consisted of 20 items.
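The four consistency conditions can be illustrated with a minimal sketch. It is not the authors' stimulus-generation code: the cell pool size (`N_CELLS`), the object labels, and the function names are hypothetical, target replacement is omitted, and the scheme used for the fourth condition (dealing a shuffled pool into two consecutive displays) is only one assumed way to satisfy the constraint that every object is repeated equally often.

```python
import random

N_ITEMS = 20    # items per search display (from the text)
N_TRIALS = 40   # trials per block (from the text)
N_CELLS = 36    # hypothetical number of candidate screen cells


def block_displays(condition, objects, rng):
    """Build one block of displays for condition "A", "B", or "C".

    Each display is a dict mapping a screen cell to the object shown there.
    """
    fixed_cells = rng.sample(range(N_CELLS), N_ITEMS)
    displays = []
    for _ in range(N_TRIALS):
        if condition == "A":    # fixed layout, fixed object-to-location mapping
            cells, items = fixed_cells, list(objects)
        elif condition == "B":  # fixed layout, objects trade places every trial
            cells, items = fixed_cells, rng.sample(objects, len(objects))
        elif condition == "C":  # same objects, new random layout every trial
            cells, items = rng.sample(range(N_CELLS), N_ITEMS), list(objects)
        else:
            raise ValueError(f"unknown condition: {condition}")
        displays.append(dict(zip(cells, items)))
    return displays


def condition_d_displays(all_objects, rng):
    """Condition D: layout fixed within a block, object sets recomposed per trial.

    Assumed scheme: shuffle the pooled objects (expected to number 2 * N_ITEMS)
    and deal them into two consecutive displays, so every object appears once
    per pair of trials.
    """
    displays = []
    for _block in range(2):
        fixed_cells = rng.sample(range(N_CELLS), N_ITEMS)
        for _ in range(N_TRIALS // 2):
            pool = rng.sample(all_objects, len(all_objects))
            for subset in (pool[:N_ITEMS], pool[N_ITEMS:]):
                displays.append(dict(zip(fixed_cells, subset)))
    return displays


rng = random.Random(1)
set_one = [f"obj{i:02d}" for i in range(20)]   # hypothetical object labels
pooled = [f"obj{i:02d}" for i in range(40)]
displays_a = block_displays("A", set_one, rng)
displays_b = block_displays("B", set_one, rng)
displays_c = block_displays("C", set_one, rng)
displays_d = condition_d_displays(pooled, rng)
```

Under this sketch, each pooled object in Condition D appears in 40 of the 80 displays, matching the repetition count of the other three conditions.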

In Experiment 3, visual search was passive; the participants viewed a centrally presented stream of images, one at a time, and indicated their search decisions after all the items had been shown. This experiment was conducted to allow assessment of the effects of WM load on object encoding, while controlling the viewing time for each object.

We made three main predictions. First, we expected that the participants would benefit from repeated exposure to distractor stimuli, with shorter search RTs being elicited in each of the first three conditions and with diminished (or eliminated) learning in the final condition. In Experiments 1A/2A, learning should be robust, due to the consistency of both object identities and spatial layouts. If object memory and spatial memory are independent of each other, we should find similar improvement in Experiments 1B/2B, despite the imperfect correlation of objects and layouts. Conversely, if object and spatial memory are interdependent, we should observe an interference effect, since familiar spatial layouts would be repeatedly populated by rearranged objects. In Experiments 1C/2C, no valid spatial information was available, since object locations were randomized on each trial. Thus, if RTs improved across trials, it would provide compelling evidence that object memory alone can facilitate search performance. Critically, in Experiments 1D/2D, spatial information was consistent, but the coherence of the distractor sets was reduced across trials. If performance is driven largely by incidental learning of distractor identities, we should find diminished (or eliminated) learning in this condition, relative to the others.

Second, we expected participants (in all the experiments) to incidentally generate memory for the distractor objects and, therefore, to discriminate between search distractors and foils at levels exceeding chance. (Note that, in all the conditions [even Experiments 1D and 2D], every object was shown equally often, making all the recognition tests comparable to each other.) Third, our manipulation of WM load motivated an interesting prediction. The load manipulation was mainly intended to challenge the participants, requiring them to search more slowly and carefully, thus providing a greater opportunity to encode distractor objects across trials. We also predicted, however, that despite holding more visual information in memory, high-load participants would generate stronger incidental memory for the distractors. We expected this because high-load participants would have to analyze all distractors more carefully, allowing “deeper” encoding (Craik & Lockhart, 1972). We return to this prediction in Experiment 3.

EXPERIMENT 1

In Experiment 1, participants indicated the presence or absence of a new target object(s) embedded among repeated distractors. When present, the target replaced a single distractor object (occupying its same spatial location in Experiments 1A/1B/1D). No target object appeared more than one time in any condition. In Experiment 1A, spatial layouts and object identities were held constant; each distractor was placed in the same location throughout a block of 40 trials. In Experiment 1B, we held the set of object locations fixed throughout each block but varied the object-to-location mapping of distractors. In this way, the participants could benefit from the consistent layout of objects across the screen but could not predict which image would appear in any given location. In Experiment 1C, we removed the ability to guide search by spatial memory by randomizing the layout on each trial; the participants could use experience with the distractors to improve search but could not use spatial information. In Experiment 1D, we held spatial layout constant within a block of trials, but distractors were not coherently grouped within sets and could appear in either search block. The participants could therefore use experience with the spatial layout to improve search but could not as easily learn object identities. In each condition, following all search trials, we administered a surprise 2AFC recognition memory test. The participants were shown previously seen distractor images, each with a semantically matched foil (e.g., if the old image was a coffee pot, the matched foil would be a new, visually distinct coffee pot), and they indicated which one was old.

Method

Participants

Three hundred seventy students from Arizona State University participated in Experiment 1 in partial fulfillment of a course requirement. Approximately 45 students participated in each of eight between-subjects conditions. All the participants had normal or corrected-to-normal vision.

Design

Four levels of spatial and object consistency (see Figure 1) and two levels of WM load (low, high) were manipulated between subjects. Presence of the target during search (absent, present) was a within-subjects variable in equal proportions.

Stimuli

The stimuli were real-world objects of various image types, including line drawings, detailed sketches, small-scale photographs, and clip art. Images were selected to avoid any obvious categorical relationships among the stimuli, and approximately equal proportions of each image type were used. Most categories were represented by a single image (e.g., the category computer represented by a single laptop). Categories with multiple representations consisted of images that were visually distinct (e.g., the category animals represented by a sitting cat and a standing dog). Images were resized (while maintaining original proportions) to a maximum of 2.5° in visual angle (horizontal or vertical) from a viewing distance of 55 cm. Images were no smaller than 2.0° in visual angle along either dimension, were converted to grayscale, and contained little or no background. A single object or entity was present in each image (e.g., an ice cream cone or a pair of shoes). Although stimulus characteristics such as luminance or contrast were not directly manipulated, the variation across image types and categories would have made it extremely difficult for the participants to select targets on the basis of any such feature.
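For readers who want the stated visual angles as physical sizes, the standard relation size = 2 × distance × tan(angle / 2) can be applied to the 55-cm viewing distance. The sketch below is only a convenience calculation; the function name is hypothetical, and the conversion to pixels is left out because it would depend on monitor dimensions not fully specified here.

```python
import math


def angle_to_size_cm(angle_deg, distance_cm):
    """Physical extent that subtends `angle_deg` when viewed from `distance_cm`."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


max_extent = angle_to_size_cm(2.5, 55)  # upper bound on image extent
min_extent = angle_to_size_cm(2.0, 55)  # lower bound on image extent
print(f"{min_extent:.2f} cm to {max_extent:.2f} cm")  # prints "1.92 cm to 2.40 cm"
```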

Apparatus

Data were collected on up to eight computers simultaneously; each was equipped with identical software and hardware (Gateway E4610 PC, 1.8 GHz, 2 GB RAM). Dividing walls separated participant stations on either side to reduce distraction. Each display was a 17-in. NEC (16.0 in. viewable) CRT monitor, with resolution set to 1,280 × 1,024 pixels and a refresh rate of 60 Hz. Display was controlled by an NVIDIA GeForce 7300 GS video card (527 MB). E-Prime v1.2 software (Schneider, Eschman, & Zuccolotto, 2002) was used to control stimulus presentation and collect responses.

Procedure

In visual search, the participants completed two 40-trial blocks, each with 20 target-absent and 20 target-present trials, randomly dispersed, for a total of 80 trials. Each block entailed repeated presentation of the same distractor images for all 40 trials (except in Experiment 1D). A 1-min break was placed between blocks to allow the participants to rest their eyes. A new distractor set was introduced in the second block of trials and was again used for all 40 trials. Order of presentation of distractor sets was counterbalanced across participants. In Experiment 1D, distractor sets were randomly composed on each trial, with the constraint that each image be presented equally often, again with 40 repetitions per object.

The participants were instructed that, at the beginning of each trial, they would see either one or three different potential targets (low and high WM load conditions, respectively) that should be kept in mind. Low-load participants tried to determine whether the target was present in the display. High-load participants tried to determine whether any of the three potential targets were present in the display or whether all were absent; they were informed that only one target would appear on any trial. Given this procedure, we used the same search targets across load conditions, making them directly comparable. Target images were randomized and were not repeated. The participants were also informed that the target image, if present in the display, would be mirrored along its vertical axis, and sample stimuli were shown to demonstrate this point. Gray-scaling of all stimuli and mirroring of the targets were performed to minimize pop-out effects (Treisman & Gelade, 1980) and to avoid potential template-matching strategies that would circumvent visual search. Instructions emphasized accuracy over speed. Two practice trials were administered. None of the practice stimuli were used in the rest of the experiment.

Search trials (see Figure 2) began by showing the target image(s). When the participant was ready, he/she pressed the space bar to clear the screen and initiate an array of 20 images (either 20 distractors or 19 distractors and 1 target), all shown on a blank white background. On target-present trials, a single distractor was replaced by the target; different distractors were replaced across trials (the participants were not informed of this regularity). Spacing and size of the images minimized parafoveal identification of the objects.

Figure 2. Timeline showing the progression of events for a single visual search trial. Participants were shown a target image(s) and progressed to the next screen upon a keypress. The visual search array was then presented and was terminated upon a space bar keypress. Target presence was then queried, followed by 1-sec accuracy feedback and a 1-sec delay prior to the start of the next trial.

The participants rested their fingers on the space bar during search. Once a target was found or it was determined that no target was present in the display, the participants pressed the space bar to terminate search. The search array was then immediately cleared from the screen, and the participants were prompted to press “f” or “j” (for present and absent, respectively), which were labeled on the query screen. Using the space bar to terminate the display (instead of requiring an immediate presence decision) allowed measures of search time to reflect termination of the search process, without additional time for response selection. Brief accuracy feedback was given, followed by a 1-sec delay screen before the next trial.

After visual search, the participants were given a surprise 2AFC recognition test for the distractor images encountered during search. Two images were shown per trial on a white background: one prior distractor and one semantically matched foil, equally mapped to each side of the screen. The participants indicated their selection on the keyboard, and feedback was provided. All 40 distractor images were tested; the images were pooled and presented in random order to minimize any effect of time elapsed since learning.
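The structure of the recognition test can be sketched as follows. This is an illustration under assumptions, not the E-Prime procedure itself: the image labels and function name are hypothetical, and "equally mapped to each side" is taken to mean that the old image appears on the left for half of the trials and on the right for the other half.

```python
import random


def build_2afc_trials(pairs, rng):
    """Turn (old_image, foil_image) pairs into shuffled 2AFC trials.

    The old image is assigned to the left on half of the trials and to the
    right on the other half (one extra left trial if the count is odd).
    """
    sides = ["left", "right"] * (len(pairs) // 2) + ["left"] * (len(pairs) % 2)
    rng.shuffle(sides)
    trials = []
    for (old, foil), side in zip(pairs, sides):
        left, right = (old, foil) if side == "left" else (foil, old)
        trials.append({"left": left, "right": right, "correct_side": side})
    rng.shuffle(trials)  # pool both distractor sets and randomize test order
    return trials


# Hypothetical labels standing in for the 40 distractors and their matched foils.
pairs = [(f"old_{i:02d}", f"foil_{i:02d}") for i in range(40)]
trials = build_2afc_trials(pairs, random.Random(7))
```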

Results

Seventeen of the 370 participants (5%) in Experiment 1 were excluded from analysis, for several reasons. Three were lost because of data corruption. One was removed because visual search accuracy was >2.5 standard deviations below the group mean. Five were removed because their mean visual search times were >2.5 standard deviations above their group means, and 8 were removed because recognition accuracy was >2.5 standard deviations below their group means (all were below chance). Overall, error rates for visual search were very low (7%). For the sake of brevity, we do not discuss the analysis of search accuracy, but the results are shown in Table A1.¹
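The 2.5 standard deviation exclusion rule can be expressed in a few lines. The sketch below assumes the sample standard deviation computed within a single group, and the data values are made up for illustration; the text does not state which estimator was used.

```python
from statistics import mean, stdev


def flag_outliers(values, direction, cutoff=2.5):
    """Return indices of values more than `cutoff` SDs from the group mean.

    direction is "above" (e.g., slow search times) or "below" (e.g., low accuracy).
    """
    m, sd = mean(values), stdev(values)  # sample SD: an assumption
    if direction == "above":
        return [i for i, v in enumerate(values) if v > m + cutoff * sd]
    if direction == "below":
        return [i for i, v in enumerate(values) if v < m - cutoff * sd]
    raise ValueError(f"unknown direction: {direction}")


# Made-up mean search times (msec) for one hypothetical group of ten participants.
search_times = [3000, 3100, 2900, 3050, 2950, 3000, 3100, 2900, 3000, 9000]
slow = flag_outliers(search_times, "above")
```

Here only the last participant exceeds the cutoff, so `slow` contains the single index 9.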

Visual search RTs

For visual search, RTs were divided into epochs, with each epoch comprising 25% of the block’s trials. Following Endo and Takeda (2004), we examined RTs as a function of experience with the search display; main effects of epoch would indicate that RTs reliably decreased as the trials progressed. Although Experiment 1 had many conditions, the key results are easily summarized: Figures 3 and 4 present mean search times (for target-present and target-absent search, respectively); RTs are plotted across epochs, as a function of experiment and load. As is shown, search times were consistently longer for the participants under high WM load and were longer in target-absent trials. Of greater interest, reliable learning (effects of epoch) was observed in every condition, although not to equivalent degrees.
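The epoch division can be sketched for a single participant and block. The function below is a simplified illustration: it assumes that epochs are consecutive quarters of the block in presentation order and that only accurate trials contribute to the mean, and it does not split trials by target presence as the reported analyses do. The RT values are invented.

```python
def epoch_means(rts, correct, n_epochs=4):
    """Mean RT of correct trials in each epoch of one block.

    `rts` and `correct` are parallel lists in presentation order. An epoch
    with no correct trials yields None.
    """
    size = len(rts) // n_epochs
    means = []
    for e in range(n_epochs):
        window = zip(rts[e * size:(e + 1) * size], correct[e * size:(e + 1) * size])
        kept = [rt for rt, ok in window if ok]
        means.append(sum(kept) / len(kept) if kept else None)
    return means


# Toy block of 8 trials (2 per epoch); real blocks had 40 trials (10 per epoch).
rts = [4000, 3800, 3600, 3600, 3400, 3500, 3300, 3100]
correct = [True, True, True, False, True, True, True, True]
per_epoch = epoch_means(rts, correct)  # [3900.0, 3600.0, 3450.0, 3200.0]
```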

Figure 3. Time series analysis showing mean visual search response times (RTs) on target-present trials, as a function of epoch, in Experiment 1. The results are plotted separately for each working memory load group.

Figure 4. Time series analysis showing mean visual search response times (RTs) on target-absent trials, as a function of epoch, in Experiment 1. The results are plotted separately for each working memory load group.

RTs for accurate search trials were entered into a five-way repeated measures ANOVA, with experiment (1A, 1B, 1C, and 1D), WM load (low, high), trial type (target present, target absent), block (1 or 2), and epoch (1–4) as factors. We found an effect of experiment [F(3,345) = 5.42, p = .001, $\eta_{p}^{2} = .05$], with the shortest overall RTs in Experiment 1A (3,203 msec), followed by Experiment 1C (3,465 msec), Experiment 1D (3,473 msec), and Experiment 1B (3,756 msec). There was an effect of load [F(1,345) = 590.41, p < .001, $\eta_{p}^{2} = .63$], with faster search among low-load groups (2,344 msec) than among high-load groups (4,605 msec).

There was an effect of trial type [F(1,345) = 1,790.19, p < .001, $\eta_{p}^{2} = .84$], with participants responding more quickly to target presence (2,297 msec) than to target absence (4,652 msec). We found an effect of block [F(1,345) = 39.06, p < .001, $\eta_{p}^{2} = .10$], with slower search in Block 1 (3,584 msec) than in Block 2 (3,364 msec). Of key interest, there was a main effect of epoch [F(3,343) = 45.12, p < .001, $\eta_{p}^{2} = .28$], indicating that search RTs decreased significantly within blocks of trials (3,726, 3,507, 3,378, and 3,285 msec for Epochs 1–4, respectively).
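As a reading aid, partial eta squared can be recovered from an F ratio and its degrees of freedom through the standard identity (F × df_effect) / (F × df_effect + df_error). The short sketch below applies it to the five main effects reported above; the function and variable names are illustrative, and the check only confirms that the reported effect sizes agree with the reported F values to rounding.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error, reported partial eta squared), copied from the text.
reported = {
    "experiment": (5.42, 3, 345, 0.05),
    "load": (590.41, 1, 345, 0.63),
    "trial type": (1790.19, 1, 345, 0.84),
    "block": (39.06, 1, 345, 0.10),
    "epoch": (45.12, 3, 343, 0.28),
}

for name, (f, df1, df2, eta) in reported.items():
    computed = partial_eta_squared(f, df1, df2)
    print(f"{name}: computed {computed:.3f}, reported {eta:.2f}")
```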

There were several two-way interactions. Of particular interest, we found an experiment × block interaction [F(3,345) = 5.16, p = .002, $\eta_{p}^{2} = .04$]. The greatest decrease in RTs across blocks was elicited in Experiment 1D (435 msec), followed by Experiment 1A (219 msec), Experiment 1C (160 msec), and Experiment 1B (66 msec). The largest benefit in Experiment 1D reflects the fact that distractor objects were shared across blocks. We found an experiment × trial type interaction [F(3,345) = 4.85, p = .003, $\eta_{p}^{2} = .04$]; the disparity between trial types (target present, absent) was greatest in Experiment 1B (2,683 msec), followed by Experiment 1C (2,380 msec), Experiment 1D (2,285 msec), and Experiment 1A (2,093 msec). There was a load × trial type interaction [F(1,345) = 211.90, p < .001, $\eta_{p}^{2} = .38$], indicating a larger disparity between trial types among the high-load groups (3,166 msec) than among the low-load groups (1,545 msec).

A load × block interaction [F(1,345) = 14.09, p < .001, $η_{p}^{2} = .04$] indicated a larger decrease in RTs across blocks for the high-load groups (352 msec) than for the low-load groups (88 msec). Importantly, a load × epoch interaction [F(3,343) = 11.53, p < .001, $η_{p}^{2} = .09$] indicated steeper learning slopes for the high-load groups (−216 msec/epoch) than for the low-load groups (−74 msec/epoch).² Lastly, we found a block × epoch interaction [F(3,343) = 4.32, p = .005, $η_{p}^{2} = .04$]. Learning slopes were steeper in Block 1 (−156 msec/epoch) than in Block 2 (−134 msec/epoch). We also found four higher order interactions: experiment × trial type × block, F(3,345) = 7.46, p < .001, $η_{p}^{2} = .06$; trial type × block × epoch, F(3,343) = 7.79, p < .001, $η_{p}^{2} = .06$; experiment × load × trial type × block, F(3,345) = 5.46, p = .001, $η_{p}^{2} = .05$; and load × trial type × block × epoch, F(3,343) = 3.03, p < .03, $η_{p}^{2} = .03$.

Because the effects of epoch were of particular interest, we tested for simple effects, finding significant epoch effects in each experiment: Experiment 1A, F(3,68) = 15.39, p < .001, $η_{p}^{2} = .40$ (with mean RTs of 3,557, 3,196, 3,078, and 2,979 msec for Epochs 1–4, respectively); Experiment 1B, F(3,84) = 14.39, p < .001, $η_{p}^{2} = .34$ (mean RTs of 4,055, 3,792, 3,636, and 3,541 msec); Experiment 1C, F(3,97) = 11.11, p < .001, $η_{p}^{2} = .26$ (mean RTs of 3,681, 3,496, 3,373, and 3,309 msec); Experiment 1D, F(3,88) = 8.78, p < .001, $η_{p}^{2} = .23$ (mean RTs of 3,612, 3,544, 3,425, and 3,312 msec).

As was noted earlier, Experiment 1A employed complete consistency in both spatial and object identity information within a block of trials; Experiments 1B, 1C, and 1D degraded these information sources in unique manners. Accordingly, we performed three final analyses, comparing Experiment 1A with each of the other experiments. We were specifically interested in potential experiment × epoch interactions, which would indicate different learning slopes across experiments. Neither Experiment 1B nor Experiment 1C exhibited different learning slopes, relative to Experiment 1A [Experiment 1A vs. 1B, F(3,154) = 0.49, p = .123; Experiment 1A vs. 1C, F(3,167) = 1.65, p = .18]. However, the learning slope in Experiment 1A (−185 msec/epoch) was steeper than the slope in Experiment 1D (−102 msec/epoch) [F(3,158) = 3.94, p = .01, $η_{p}^{2} = .07$]. Figure 5 presents search RTs as a function of experiment and epoch, collapsed across all other factors. In Figure 5, RTs are scaled, shown as proportions of Epoch 1 means for each experiment.
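The learning slopes and the scaling used in Figure 5 can be illustrated from the per-epoch means reported for each experiment. The sketch below assumes that a learning slope is the ordinary least-squares slope of mean RT regressed on epoch number (1–4); the text does not state how slopes were computed, so this is an assumption, and the function and variable names are hypothetical.

```python
def ols_slope(values):
    """Least-squares slope of values regressed on epoch index 1..n."""
    n = len(values)
    xs = range(1, n + 1)
    x_mean = sum(xs) / n
    y_mean = sum(values) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
    denominator = sum((x - x_mean) ** 2 for x in xs)
    return numerator / denominator


# Mean RTs (msec) for Epochs 1-4, as reported in the simple-effects analysis.
epoch_means = {
    "1A": [3557, 3196, 3078, 2979],
    "1B": [4055, 3792, 3636, 3541],
    "1C": [3681, 3496, 3373, 3309],
    "1D": [3612, 3544, 3425, 3312],
}

for name, rts in epoch_means.items():
    slope = ols_slope(rts)
    # Scale each epoch mean to a proportion of that experiment's Epoch 1 mean.
    scaled = [round(rt / rts[0], 3) for rt in rts]
    print(f"Experiment {name}: slope = {slope:.1f} msec/epoch, scaled RTs = {scaled}")
```

Under this assumption, the slopes for Experiments 1A and 1D come out at about −185 and −102 msec/epoch, in line with the values reported above.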

Figure 5. Mean visual search response times (RTs) as a function of experiment and epoch in Experiment 1. The results are scaled to the proportion of mean search RTs in Epoch 1 per group.

Recognition

Although chance performance on 2AFC tests should be 50%, we were concerned that our materials may have contained selection biases. That is, some characteristics of the true distractors may have increased the likelihood that we would have originally chosen them for the search trials, rather than the foils used in the recognition tests. If so, the participants could potentially have guessed which images were old on the basis of visual characteristics, irrespective of memory (e.g., “this coffee cup looks more like it would have been used in this task”). We therefore established an empirical baseline: Forty-five naive participants saw the same 2AFC pairs with no prior exposure. The search experiment was described, and these participants were asked to guess which image (per pair) was more likely to be chosen by an experimenter for use in such a task. Mean guessing accuracy was 59% (SD = 8%), which reliably exceeded 50% [t(44) = 7.46, p < .01], verifying a potential selection bias. To be conservative, we therefore evaluated recognition performance relative to this empirical baseline of 59%, rather than 50%. As we describe next, all the groups produced recognition well in excess of this baseline, with superior performance among high-load groups.
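The baseline comparison above is a one-sample t test of mean guessing accuracy against the nominal chance level of 50%. As a rough illustration only, the statistic can be approximated from the rounded summary values in the text (mean = .59, SD = .08, N = 45). The function name is hypothetical, and because the inputs are rounded, the result (roughly 7.5) is close to, but not identical with, the reported t(44) = 7.46.

```python
import math


def one_sample_t(mean: float, sd: float, n: int, mu0: float) -> float:
    """One-sample t statistic computed from summary statistics."""
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error


n_participants = 45
t_value = one_sample_t(mean=0.59, sd=0.08, n=n_participants, mu0=0.50)
print(f"t({n_participants - 1}) = {t_value:.2f}")
```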

Recognition accuracy was entered into a two-way ANOVA with experiment and WM load as factors. We found an effect of experiment [F(3,345) = 3.98, p = .008, $η_{p}^{2} = .03$], with the best performance in Experiment 1B (80%), followed by Experiment 1D (78%), Experiment 1C (77%), and Experiment 1A (75%). There was also an effect of load [F(1,345) = 89.25, p < .001, $η_{p}^{2} = .21$], with better memory among the high-load groups (83%) than among the low-load groups (72%). The interaction was not significant (F < 1). Figure 6 presents mean recognition accuracy as a function of experiment and load.

Figure 6. Recognition memory performance in Experiment 1. The results are plotted as a function of experiment and WM load.

Discussion

The participants in Experiment 1 demonstrated incidental learning of repeated search arrays, with steadily decreasing search RTs over trials. In Experiment 1A, two sources of information could have been used by the participants to improve search performance: distractor identities and their consistent locations. In Experiment 1B, the participants could benefit from the consistent layout of objects across the screen but could not predict which image would appear in any given location. In both experiments, significant learning effects occurred, indicating that people could benefit from spatial and object memory, even when these information sources were imperfectly correlated. In Experiment 1C, spatial consistency was eliminated, but familiarity with the distractors still allowed the participants to improve performance across trials. Conversely, the search displays in Experiment 1D were spatially consistent but were inconsistent with respect to the identities of distractor items: The participants once more improved performance over trials, but the benefit was reduced, relative to the complementary condition in which distractor sets were consistent within blocks of trials (Experiment 1A). Given the results, it appears that the contribution of object learning to search performance outweighed that of spatial learning. When object information was inconsistent, performance was more disrupted, relative to conditions wherein spatial information was disrupted.

It is important to note that we observed reliable practice effects; RTs in Block 2 were consistently shorter than those in Block 1. However, it is unlikely that practice can fully account for our learning effects. The main effect of block accounted for only 10% of the variance in RTs, in contrast with 28% explained by the main effect of epoch. Moreover, in the first three conditions, mean RTs decreased by 148 msec across blocks; but the effect was much larger across epochs. In Experiments 1A, 1B, and 1C, mean RTs dropped by 487 msec across Epochs 1–4, more than triple the change across blocks. Experiment 1D was the only condition in which certain information could be useful in both blocks (i.e., individual distractors could appear in Blocks 1 and 2), and only in this condition was the pattern of results different: The decrease in RTs across blocks (435 msec) was larger than that across epochs (300 msec). That some performance enhancement was produced is not altogether surprising, since the distractors were repeated often across the experiment; they were merely unpredictable on any given trial. In sum, a general practice effect improved the participants’ performance with experience, but this effect was overshadowed by the benefit provided from repeating distractor identities within blocks.
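The comparison between block-level and epoch-level improvement in this paragraph can be retraced from the rounded means reported in the Results. The sketch below is only an arithmetic check, with hypothetical variable names. Because it starts from rounded values, the mean epoch drop comes out at 488 msec rather than the 487 msec quoted above.

```python
# Decrease in mean RT (msec) from Block 1 to Block 2, per experiment.
block_drop = {"1A": 219, "1B": 66, "1C": 160, "1D": 435}

# Mean RTs (msec) for Epochs 1-4, per experiment.
epoch_means = {
    "1A": [3557, 3196, 3078, 2979],
    "1B": [4055, 3792, 3636, 3541],
    "1C": [3681, 3496, 3373, 3309],
    "1D": [3612, 3544, 3425, 3312],
}

# Decrease in mean RT from Epoch 1 to Epoch 4, per experiment.
epoch_drop = {name: rts[0] - rts[-1] for name, rts in epoch_means.items()}

# The first three conditions, whose distractor sets were not shared across blocks.
first_three = ["1A", "1B", "1C"]
mean_block_drop = sum(block_drop[name] for name in first_three) / len(first_three)
mean_epoch_drop = sum(epoch_drop[name] for name in first_three) / len(first_three)

print(f"Mean drop across blocks: {mean_block_drop:.0f} msec")
print(f"Mean drop across epochs: {mean_epoch_drop:.0f} msec")
print(f"Epoch-to-block ratio: {mean_epoch_drop / mean_block_drop:.2f}")
print(f"Experiment 1D: blocks {block_drop['1D']} msec, epochs {epoch_drop['1D']} msec")
```

The ratio is a little over 3, consistent with the statement that the epoch effect was more than triple the block effect, and Experiment 1D is the only condition in which the block drop exceeds the epoch drop.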

Given the residual learning that occurred in Experiment 1D, we should note that such improvements in search RTs do not appear to reflect a general practice effect. In a separate experiment (N = 38), we replicated the experiments reported above, with one key difference. Participants again searched for targets under low or high WM load, with the same numbers of objects per display. In this case, however, every trial presented all new objects and random spatial layouts. We again observed a robust effect of WM load on search times, with slower search among high-load participants. However, we observed no evidence of learning across epochs. Although this is not surprising (since there was little information for people to learn), it does suggest that general practice has little impact in our procedure.

Although search would have been performed most efficiently by ignoring distracting nontarget objects, people nevertheless benefited from their repetition and retained detailed information about them, without instruction to do so. Consistent with Williams et al. (2005), our participants discriminated previously seen distractors from semantically matched foils at levels greater than chance. Paradoxically, the high-load groups outperformed the low-load groups. Clearly, loaded participants had a more difficult search task, performing more slowly and producing more errors. One might presume that people experiencing greater difficulty during search would evince less memory for the displays. Indeed, our load manipulation forced people to maintain extra images in visual WM, which might be expected to interfere with distractor encoding. However, our method required people to make more frequent and careful mental comparisons, pitting each distractor against three potential targets with distinct visual details. It is possible that such careful, repeated mental comparisons resulted in deeper encoding of distractor identities, akin to verbal depth of processing (Craik & Lockhart, 1972). Similarly, Conway, Cowan, and Bunting (2001) reported that decreased WM resources are associated with difficulty in inhibiting distracting information. Accordingly, loaded participants may have been less able to block distractor identities during search, driving up RTs and, concurrently, increasing retention for their visual details. Experiment 1 could not fully resolve this issue, since visual search RTs were confounded with WM load. That is, the high-load groups also had greater viewing opportunities than did the low-load groups, because of their slower search process. We later address this issue in Experiment 3 by equating stimulus exposure across WM load groups in an altered search procedure.

EXPERIMENT 2

In Experiment 2, we sought to complement the previous findings by answering two questions. First, does distractor learning require frequent exhaustive searches, or will it occur when search targets are always present? Second, are differences in learning among WM load groups graded? That is, is there a qualitative difference between holding one object in memory, relative to numerous objects? Or is the phenomenon continuous, so that the number of objects held in memory is directly related to subsequent search and memory performance? The participants in Experiment 2 performed search with targets present on every trial. Now, rather than an absence versus presence decision, the participants’ task was to search for several targets and (upon finding one) indicate which potential target was found. In Experiments 2A–2D, we varied spatial and object consistency in a fashion analogous to that in Experiments 1A–1D. We also introduced a medium-load group to examine graded differences in visual search behavior and incidental recognition. Finding that RTs decrease across trials would suggest that learning occurs not only during exhaustive search, but also during more variable, self-terminating search. Conversely, if RTs are stable across trials, it would suggest that learning is driven by exhaustive searches, wherein every item is examined on every trial. With respect to WM load, we expected the intermediate-load group to exhibit learning that would fall between the performance levels of the low- and high-load groups. If, however, performance for the medium-load group fell in line with that for the high-load group, it would suggest a qualitative difference between holding one object in memory and holding several objects in memory. The stimuli and apparatus were identical to those in Experiment 1.