Published in final edited form as: J Exp Psychol Gen. 2008 Feb;137(1):163–181. doi: 10.1037/0096-3445.137.1.163

Understanding the Function of Visual Short-Term Memory: Transsaccadic Memory, Object Correspondence, and Gaze Correction

Andrew Hollingworth, Ashleigh M. Richard, and Steven J. Luck

Abstract

Visual short-term memory (VSTM) has received intensive study over the past decade, with research focused on VSTM capacity and representational format. Yet, the function of VSTM in human cognition is not well understood. Here we demonstrate that VSTM plays an important role in the control of saccadic eye movements. Intelligent human behavior depends on directing the eyes to goal-relevant objects in the world, yet saccades are very often inaccurate and require correction. We hypothesized that VSTM is used to remember the features of the current saccade target so that it can be rapidly reacquired after an errant saccade, a fundamental task faced by the visual system thousands of times each day. In four experiments, memory-based gaze correction was found to be accurate, fast, automatic, and largely unconscious. In addition, a concurrent VSTM load was found to interfere with memory-based gaze correction, but a verbal short-term memory load did not. These findings demonstrate that VSTM plays a direct role in a fundamentally important aspect of visually guided behavior, and they suggest the existence of previously unknown links between VSTM representations and the oculomotor system.


Human vision is active and selective. In the course of viewing a natural scene, the eyes are reoriented approximately three times each second to bring the projection of individual objects onto the high-resolution, foveal region of the retina (for reviews, see Henderson & Hollingworth, 1998; Rayner, 1998). Periods of eye fixation, during which visual information is acquired, are separated by brief saccadic eye movements, during which vision is suppressed and we are virtually blind (Matin, 1974). The input for vision is therefore divided into a series of discrete episodes. To span the perceptual gap between individual fixations, a transsaccadic memory for the visual properties of the scene must be maintained across each eye movement.

Transsaccadic Memory and Visual Short-Term Memory (VSTM)

Early theories proposed that transsaccadic memory integrates low-level sensory representations (i.e., iconic memory) across saccades to construct a global image of the external world (McConkie & Rayner, 1976). However, a large body of research demonstrates conclusively that this is false; participants cannot integrate sensory information presented on separate fixations (Irwin, Yantis, & Jonides, 1983; O'Regan & Lévy-Schoen, 1983; Rayner & Pollatsek, 1983). Recent work using naturalistic scene stimuli has arrived at a similar conclusion. Relatively large changes to a natural scene can go undetected if the change occurs during a saccadic eye movement or other visual disruption (Grimes, 1996; Henderson & Hollingworth, 1999, 2003b; Rensink, O'Regan, & Clark, 1997; Simons & Levin, 1998), an effect that has been termed change blindness. For example, Henderson and Hollingworth (2003b) had participants view scene images that were partially occluded by a set of vertical gray bars (as if viewing the scene from behind a picket fence). During eye movements, the bars were shifted so that the occluded portions of the scene became visible and the visible portions became occluded. Despite the fact that every pixel in the image changed, participants were almost entirely insensitive to these changes, demonstrating that low-level sensory information is not preserved from one fixation to the next.

Although transsaccadic memory does not support low-level sensory integration, visual representations are nonetheless retained across eye movements. In transsaccadic object identification studies, participants are faster to identify an object when a preview of that object has been available in the periphery before the saccade (Henderson, Pollatsek, & Rayner, 1987), and this benefit is reduced when the object undergoes a visual change during the saccade, such as substitution of one object with another from the same basic-level category (Henderson & Siefert, 2001; Pollatsek, Rayner, & Collins, 1984) or mirror reflection (Henderson & Siefert, 1999). In addition, object priming across saccades is governed primarily by visual similarity rather than by conceptual similarity (Pollatsek et al., 1984). Finally, structural descriptions of simple visual stimuli can be retained across eye movements (Carlson-Radvansky, 1999; Carlson-Radvansky & Irwin, 1995). Thus, memory across saccades appears to be limited to higher level visual codes, abstracted away from precise sensory representation but detailed enough to specify individual object tokens and viewpoint.

Multiple lines of converging evidence indicate that visual memory across saccades depends on the VSTM system originally identified by Phillips (1974) and investigated extensively over the last decade (see Luck, in press, for a review).1 On all dimensions tested, transsaccadic memory exhibits properties similar to those found for VSTM. Both VSTM and transsaccadic memory have a capacity of 3–4 objects (Irwin, 1992a; Luck & Vogel, 1997; Pashler, 1988) and lower spatial precision than sensory memory (Irwin, 1991; Phillips, 1974). Both VSTM and transsaccadic memory maintain object-based representations, with capacity determined primarily by the number of objects to remember and not by the number of visual features to remember (Irwin & Andrews, 1996; Luck & Vogel, 1997). Finally, both VSTM and transsaccadic memory are sensitive to higher-order pattern structure and grouping (Hollingworth, Hyun, & Zhang, 2005; Irwin, 1991; Jiang, Olson, & Chun, 2000). The logical inference is that VSTM is the memory system underlying transsaccadic memory.

A final strand of evidence about VSTM and transsaccadic memory is particularly germane to the present study. Before a saccade, visual attention is automatically and exclusively directed to the target of that saccade (Deubel & Schneider, 1996; Hoffman & Subramaniam, 1995). Attention also supports the selection of perceptual objects for consolidation into VSTM (Averbach & Coriell, 1961; Hollingworth & Henderson, 2002; Irwin, 1992a; Schmidt, Vogel, Woodman, & Luck, 2002; Sperling, 1960). Thus, the saccade target object is preferentially encoded into VSTM and stored across the saccade. Indeed, memory performance is higher for objects at or near the target position of an impending or just-completed saccade (Henderson & Hollingworth, 1999, 2003a; Irwin, 1992a; Irwin & Gordon, 1998).

The Function of VSTM across Saccades

Although VSTM can be used to store object information across saccades, it is not exactly clear what purpose that storage serves. More generally, the functional role of VSTM in real-world cognition is not well understood. VSTM research has typically investigated the capacity (Alvarez & Cavanagh, 2004; Luck & Vogel, 1997) and representational format (Hollingworth et al., 2005; Irwin & Andrews, 1996; Jiang et al., 2000; Luck & Vogel, 1997; Phillips, 1974) of VSTM, with the question of VSTM function relatively neglected.

One plausible function for VSTM is to establish correspondence between objects visible on separate fixations. With each saccade, the retinal positions of objects change, generating a correspondence problem: How does the visual system determine that an object projecting to one retinal location on fixation N is the same object as the one projecting to a different retinal location on fixation N + 1? This is a fundamental problem the visual system must solve. Researchers have proposed that transsaccadic VSTM may be used to compute object correspondence across saccades. Memory for the visual properties of a few objects—particularly the saccade target—is stored across the saccade and compared with visual information available after the saccade to determine which post-saccade objects correspond with the remembered pre-saccade objects (Currie, McConkie, Carlson-Radvansky, & Irwin, 2000; Henderson & Hollingworth, 1999, 2003a; Irwin, McConkie, Carlson-Radvansky, & Currie, 1994; McConkie & Currie, 1996).

The issue of object correspondence across saccades has often been framed as a problem of visual stability: How do we consciously perceive the world to be stable when the image projected to the retina shifts with each saccade? Irwin, McConkie, and colleagues (Currie et al., 2000; Irwin et al., 1994; McConkie & Currie, 1996; see also Deubel & Schneider, 1994) proposed a saccade target theory to explain visual stability. In this view, an object is selected as the saccade target before each saccade. Perceptual features of the target are encoded into VSTM and maintained across the saccade. After completion of the saccade, the visual system searches for an object that matches the target information in VSTM, with search limited to a spatial region around the landing position. If the saccade target is located within the search region, the experience of visual stability is maintained. If the saccade target is not found, the observer becomes consciously aware of a discrepancy between pre- and post-saccade visual experience. Saccade target theory was proposed to account for the phenomenology of visual stability, but the evidence reviewed above demonstrating preferential encoding of the saccade target into VSTM raises the possibility that VSTM supports the more general function of establishing correspondence between the saccade target object visible before and after the saccade. Such correspondence might produce the experience of visual stability, but that need not be its only purpose. We will argue that an important function of VSTM across saccades is to support the correction of gaze when the eyes fail to land on the target of a saccade. Such gaze corrections are required thousands of times each day. A VSTM representation of the saccade target allows the target to be found after an inaccurate eye movement, and a corrective saccade can be generated to that object.

Gaze Correction and VSTM

Real-world tasks require orienting the eyes to goal-relevant objects in the world (Hayhoe, 2000; Henderson & Hollingworth, 1998; Land & Hayhoe, 2001; Land, Mennie, & Rusted, 1999), but eye movements are prone to error, with the eyes often failing to land on the target of the saccade. Even under highly simplified conditions in which participants generate saccades to single targets on blank backgrounds (Frost & Pöppel, 1976; Kapoula, 1985), saccade errors are very common, occurring on at least 30–40% of trials (for a review, see Becker, 1991). Saccade errors presumably occur thousands of times each day as people go through their normal activities. During natural scene viewing, inter-object saccades occur about once per second, on average. Thus, in a 16-hour day, assuming that 30–40% of saccades fail to land on the target object, gaze correction could be required as many as 17,000 times. When the eyes fail to land on the saccade target object, gaze must be corrected to bring that object to the fovea. Accurate, rapid gaze corrections are therefore critical to ensure that the eyes are efficiently directed to goal-relevant objects.
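For concreteness, the arithmetic behind this estimate, using the once-per-second rate of inter-object saccades and the lower 30% error bound stated above, is:

$$16\ \text{h} \times 3600\ \tfrac{\text{s}}{\text{h}} \times 1\ \tfrac{\text{saccade}}{\text{s}} \times 0.30 \approx 17{,}280 \approx 17{,}000\ \text{corrections per day}.$$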

Unlike other motor actions, such as visually guided reaching, saccadic eye movements are ballistic; the short duration of the saccade and visual suppression prevent any correction based on visual input during the eye movement itself. Thus, correction of gaze must depend on perceptual information available after the eyes have landed and target information retained in memory across the saccade. When the eyes miss the target object, there is typically a short fixation followed by a corrective saccade to the target (Becker, 1972; Deubel, Wolf, & Hauske, 1982). The duration of the fixation before the corrective saccade (correction latency) varies inversely with correction distance, with small corrections (< 2°) requiring 150–200 ms to initiate and longer corrections requiring shorter latencies (Becker, 1972; Deubel et al., 1982; Kapoula & Robinson, 1986). Previous research on corrective saccades has used single targets displayed in isolation. When the eyes miss the target in these experiments, finding the target is trivial, because there is only one visible object near the saccade landing position. The complexity of real-world environments makes this problem much more interesting and challenging. If the eyes miss the saccade target, there will likely be multiple objects near fixation. For example, if a saccade to a phone on a cluttered desk misses the phone, other objects (e.g., pen, scissors, notepad) will lie near the landing position of the eye movement. To correct gaze to the appropriate object (and efficiently execute a phone call), correspondence must be established between the remembered pre-saccade target object and visual object information available after the saccade.

There are two ways in which pre-saccade information could be used to correct gaze after an errant saccade. First, the general goals that led to the initial saccade could be used to find the target. In the example of making a saccade to a telephone, the general goal of making a telephone call or the specific goal of looking at the telephone might allow an observer to re-acquire the original target (or another object that is worth fixating). A second possibility is that the specific visual details of the target—which may have been irrelevant to the observer’s goal—could be stored in VSTM and used to re-acquire the target. For example, although the color of the telephone might be irrelevant to the observer’s task, this information may be automatically stored in VSTM prior to the saccade and used to re-acquire the telephone if the eyes fail to land on it. It is certainly possible that both of these mechanisms operate in natural vision, but memory for the exact features of the target may allow a more rapid re-acquisition than general task goals. Indeed, Wolfe, Horowitz, Kenner, Hyle, and Vasan (2004) found that visual search is substantially more efficient when observers are shown a picture of the target (e.g., a black vertical bar) than when they are given a verbal description of the target (e.g., the words “black vertical”). In the present study, we therefore focus on whether memory for an incidental feature of a saccade target can be used to guide gaze corrections.

There is one study in the existing literature suggesting that memory across saccades might support gaze correction. Deubel, Wolf, and Hauske (1984) presented a stimulus composed of densely packed vertical bars of different widths. During a saccade, the entire array was shifted either in the direction of the saccade or in the reverse direction. The final position of the eyes was systematically related to the shift direction and distance, with a secondary saccade generated to correct for the array displacement. This evidence of memory-guided gaze correction must be considered preliminary, however, because Deubel et al. did not report many details of the experiment (stimulus parameters, number of participants, number of corrective saccades on a trial, latency of the corrections, and so on), making it difficult to assess the results. Thus, the question of whether memory can drive gaze correction is largely open. The present study focused on whether memory for visual objects in VSTM can guide gaze correction to the appropriate saccade target object.

In summary, we hypothesize that VSTM maintains target information across the saccade, so that when the eyes fail to land on the target, the target can be discriminated from other, nearby objects and gaze efficiently corrected. The use of VSTM to correct gaze is likely to be an important factor governing the efficiency of visual perception and behavior. If VSTM were not available across eye movements, saccade errors might not be successfully corrected, leading to significant delays in the perceptual processing of goal-relevant objects and slowing performance of the complex visual tasks (e.g., making tea, driving, air traffic control, to name just a few) that comprise much of waking life. Considering that we make approximately three saccades each second and many of these fail to land on the target object, establishing object correspondence and supporting gaze correction is likely to be a central function of the VSTM system.

The Gaze Correction Paradigm

To examine the role of VSTM in object correspondence and gaze correction, we developed a new procedure that simulates target ambiguity after an inaccurate eye movement, illustrated in Figure 1. An array of objects was presented in a circular configuration so that each object was equally distant from fixation. After a brief delay, one object was cued by rapid expansion and contraction. The participant’s task was simply to generate a saccade to the cued object and fixate it. On a critical subset of trials (⅓ of the total), the array rotated ½ object position during the saccade to the cued object, while vision was suppressed. The eyes typically landed between two objects, the target object and a distractor object adjacent to the target. On average, this landing position was equidistant from the two objects. By artificially inducing saccade errors, we could precisely control the sensory input that followed the saccade.

Figure 1. Sequence of events in a rotation trial of Experiment 1. The top row shows the full-array condition and the bottom row the single-object condition.

When the array rotated, the sensory input once the eyes landed was not by itself sufficient to determine which of the two nearby objects was the original saccade target. In addition, subjects could not correct gaze based on direct perception of the rotation, as the rotation occurred during the period of saccadic suppression. The only means to determine which object was the original target—and to make a corrective saccade—was to remember properties of the array from before the saccade and compare this memory trace to the new perceptual input after the eyes landed. Consider, for example, the case in which the saccade target was a yellow disk, and the ½ position rotation of the array caused the eyes to land midway between the yellow disk and a violet disk (Figure 1). To make a corrective saccade to the yellow disk and not to the violet disk, the visual system must retain some information about the pre-saccade array (e.g., the color of the saccade target) and compare this information with the post-saccade array.
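This comparison process can be expressed as a minimal sketch (illustrative only; the function name and data structures are hypothetical, and the paper makes no claims about implementation):

```python
# Minimal sketch of memory-based object correspondence: compare the
# saccade-target color held in VSTM with the objects visible near the
# post-saccade landing position. All names here are hypothetical.

def choose_correction_target(remembered_color, nearby_objects):
    """Return the nearby object whose color matches the remembered target."""
    for obj in nearby_objects:
        if obj["color"] == remembered_color:
            return obj
    return None  # no match: correspondence cannot be established

# Example from the text: the eyes land midway between the yellow target
# and a violet distractor.
nearby = [{"id": "disk_7", "color": "yellow"},
          {"id": "disk_8", "color": "violet"}]
goal = choose_correction_target("yellow", nearby)  # -> the yellow disk
```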

Two measures of gaze correction efficiency were examined. Gaze correction accuracy was the percentage of trials on which the target was fixated first after landing between target and distractor. Gaze correction latency was the duration of the fixation before the corrective saccade when a single corrective saccade took the eyes to the target object. This latter measure reflects the speed with which the target object was identified and a corrective saccade computed and initiated. Gaze correction accuracy and latency under conditions of target ambiguity were compared with a single-object control condition in which only a single object was presented before and after the saccade (Figure 1). This condition is analogous to the conditions of previous experiments on saccade correction (Becker, 1972; Deubel et al., 1982), and it allowed us to assess the efficiency of saccade correction when correction did not require memory.

Experiments 1 and 2 tested whether participants could reliably and efficiently correct gaze to the target object when correction required memory. Experiment 3 provided a direct test of the role of VSTM in gaze correction, adding a concurrent VSTM load that should interfere with gaze correction if gaze correction depends on VSTM. Experiment 4 examined the automaticity of gaze correction by placing gaze correction and task instructions at odds. Together, these experiments demonstrated that visual memory can support rapid and accurate gaze correction, that this ability depends specifically on the VSTM system, and that VSTM-based gaze correction is largely automatic.

In addition to illuminating the role of VSTM in gaze correction, the gaze correction paradigm addressed two additional issues central to understanding the function of VSTM in human cognition. First, the gaze correction task is a variant of a visual search task: After the saccade, the target object must be found among a set of distractors. Thus, the present study provided direct evidence regarding the role of VSTM in visual search (Chelazzi, Miller, Duncan, & Desimone, 1993; Desimone & Duncan, 1995; Duncan & Humphreys, 1989; Woodman & Luck, 2004; Woodman, Vogel, & Luck, 2001). Second, general theories of object correspondence (Kahneman, Treisman, & Gibbs, 1992) have held that only spatiotemporal properties of an object (i.e., object positions and trajectories) are used to establish object correspondence. In the gaze correction paradigm used in the present study, spatial information is not informative, and successful gaze correction requires that object correspondence be computed on the basis of a surface feature match (e.g., finding the object that matches the target color). Accurate and rapid gaze correction in the present study indicates that memory for surface features can be used to establish object correspondence, and therefore that object correspondence operations are not limited to spatiotemporal information. These two topics are addressed in the General Discussion.

Experiment 1

In Experiment 1, we examined whether memory for a simple visual feature—color—can be used to establish object correspondence across saccades and support efficient gaze correction. Color patches are commonly used in VSTM studies (e.g., Luck & Vogel, 1997), allowing us to draw inferences about the relationship between VSTM and transsaccadic memory.

In addition, we tested an alternative, nonmemorial hypothesis regarding the information used to correct gaze in complex environments. When multiple objects are present after an inaccurate saccade, the saccade target object is quite likely to be the object nearest to the saccade landing position. Thus, the object nearest the landing position might be preferentially selected as the target of the corrective saccade, and this means of selection need not consult memory at all. To test this hypothesis, we exploited natural variability in the saccade landing position in the rotation trials of Experiment 1. We examined the relationship between correction accuracy and the distance of the initial saccade landing position from the target. In addition, we tested whether saccades that landed closer to the target than to the distractor were more likely to be corrected to the target than saccades that landed closer to the distractor than to the target. Although nonmemorial distance effects and memory-based effects are not mutually exclusive possibilities, it is possible that distance can trump memory, with memory-based gaze correction occurring only when the target and a distractor are approximately equidistant from the initial saccade landing position. Alternatively, it is possible that memory can trump distance, allowing accurate gaze correction to the remembered item even when it is substantially farther from the saccade landing position than a distractor.

Methods

Participants

Twelve University of Iowa students participated for course credit. Each reported normal, uncorrected vision.

Stimuli and Apparatus

For the full-array condition, object arrays consisted of 12 color disks (Figure 1, top panel) displayed on a gray background. Two initial array configurations were possible, one with the objects at each of the 12 clock positions and another rotated by 15°. The color of each disk was randomly chosen from a set of seven (red, green, blue, yellow, violet, black, and white), with the constraint that color repetitions be separated by at least two objects. The x, y, and luminance values for each color were measured with a Tektronix model J17 colorimeter using the 1931 CIE color coordinate system, and were as follows: red (x = .64, y = .33; 14.79 cd/m2), green (x = .31, y = .57; 9.08 cd/m2), blue (x = .15, y = .06; 9.20 cd/m2), yellow (x = .42, y = .49; 69.58 cd/m2), violet (x = .27, y = .12; 25.77 cd/m2), black (< .001 cd/m2), and white (75.5 cd/m2). Disks subtended 1.6° and were centered 5.9° from central fixation. The distance between the centers of adjacent disks was 3.0°. The target object was equally likely to appear at each of the 12 possible locations. When the target was cued, it expanded to 140% of its original size and contracted back to the original size over 50 ms of animation. The angular difference between adjacent patches was 30°. For rotation trials, the array was rotated 15° clockwise on half the trials and 15° counterclockwise on the other half. The single-object control condition was identical to the full-array condition, except only the target object was displayed (Figure 1, bottom panel).
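As an illustration of the array geometry just described, the following sketch computes the disk centers (an assumed reconstruction for clarity, not the authors' stimulus code):

```python
import math

ECCENTRICITY = 5.9         # deg of visual angle from central fixation
N_OBJECTS = 12
SPACING = 360 / N_OBJECTS  # 30 deg between adjacent disk centers

def disk_positions(offset_deg=0.0):
    """Return (x, y) disk centers in degrees for one array configuration."""
    return [(ECCENTRICITY * math.cos(math.radians(i * SPACING + offset_deg)),
             ECCENTRICITY * math.sin(math.radians(i * SPACING + offset_deg)))
            for i in range(N_OBJECTS)]

clock_config = disk_positions(0.0)    # objects at the 12 clock positions
offset_config = disk_positions(15.0)  # alternative starting configuration
# A rotation trial shifts the current configuration by +/- 15 deg, i.e.,
# half the 30 deg spacing, so the eyes land between two objects.
```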

Stimuli were displayed on a 17-in CRT monitor with a 120 Hz refresh rate. Eye position was monitored by a video-based ISCAN ETL-400 eyetracker sampling at 240 Hz. A chin and forehead rest (with clamps resting against the temples) was used to maintain a constant viewing distance of 70 cm and to minimize head movement. The experiment was controlled by E-prime software. Gaze position samples were streamed in real time from the eyetracker to the computer running E-prime. E-prime then used the gaze position data to control trial events (such as transsaccadic rotation) and saved the raw position data to a file that mapped eye events to stimulus events.

Array rotation during the saccade to the target was accomplished using a boundary technique. Participants initially fixated the center of the array. An invisible, circular boundary was defined with a radius of 1.3° from central fixation. After the cue, E-prime monitored for an eye position sample beyond the circular boundary, and on array rotation trials, the rotated array was then written to the screen (on no-rotation trials, a new image was also written to the screen during the saccade, but it was the same as the preview image). It was important to ensure that the rotation was completed during the eye movement so that the change was not directly visible. With the present apparatus, the maximum total delay between boundary crossing and completion of the screen change was 19 ms. In pilot work, the mean actual delay between boundary crossing and the first fixation after the change was 29 ms, and the shortest actual delays were 22 ms. Thus, the rotation occurred during the period of saccadic suppression and was completed before the eyes landed.
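The gaze-contingent logic can be sketched as follows; this is a simplified stand-in for the ISCAN/E-prime pipeline, with hypothetical tracker and display interfaces:

```python
# Sketch of the boundary technique: swap in the rotated array once a gaze
# sample crosses the invisible boundary, i.e., while the saccade is in
# flight and vision is suppressed. Interfaces are hypothetical.

BOUNDARY_RADIUS = 1.3  # deg from central fixation

def run_rotation_trial(tracker, display, preview_img, rotated_img):
    display.show(preview_img)
    while True:
        x, y = tracker.next_sample()  # gaze position in deg, sampled at 240 Hz
        if (x * x + y * y) ** 0.5 > BOUNDARY_RADIUS:
            # Saccade in flight: write the rotated array to the screen
            # within a single refresh cycle.
            display.show(rotated_img)
            break
```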

Procedure

Each trial was initiated by the experimenter after eyetracker calibration was checked. Following a delay of 500 ms, the preview array was presented for 1000 ms as participants maintained central fixation. Next, the target cue animation was presented for 50 ms. Participants were instructed to generate an eye movement to the target as quickly as possible. The array was rotated during this saccade on 1/3 of trials, typically causing the eyes to land between the target and an adjacent distractor. The rotation was accomplished by replacing the original array (within a single refresh cycle) with a copy that was rotated 15° clockwise or counterclockwise. Once the target had been fixated, it was outlined by a box for 500 ms to indicate successful completion of the trial.

The full-array and single-object conditions were blocked, and block order was counterbalanced across participants. In each block, participants first completed 12 practice trials, followed by 144 experimental trials, 48 of which were rotation trials. Trial order within a block was determined randomly. Each participant completed a total of 288 experimental trials. The entire session lasted approximately 45 min.

Data Analysis

Eye tracking data analysis was conducted offline using dedicated software. A velocity criterion (eye rotation > 31°/s) was used to define saccades. Because the eyes are not perfectly still during a fixation, fixation position was calculated as the mean position during the fixation period, weighted by the proportion of time spent at each sub-location within the fixation. These data were then analyzed with respect to critical regions in the image, such as the target and distractor regions, allowing us to determine whether the eyes were directed first to the target region or first to the distractor region, as well as the latency of the correction. Object scoring regions were circular and had a diameter of 1.9°, 20% larger than the color disks themselves.
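The scoring logic can be approximated by the following sketch (an assumed reconstruction of the velocity criterion and time-weighted fixation position, not the authors' dedicated software):

```python
# Segment 240 Hz gaze samples into fixations using a velocity criterion,
# then compute each fixation's center as the mean of its samples (with
# equal-duration samples, the time-weighted mean reduces to a simple mean).

VELOCITY_CRITERION = 31.0  # deg/s; faster motion counts as a saccade
DT = 1.0 / 240.0           # seconds per sample

def mean_position(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def segment_fixations(samples):
    """samples: list of (x, y) gaze positions in degrees at 240 Hz.
    Returns a list of fixation centers."""
    fixations, current = [], []
    for (x0, y0), (x1, y1) in zip(samples, samples[1:]):
        velocity = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5 / DT
        if velocity > VELOCITY_CRITERION:
            if current:  # a saccade sample ends the current fixation
                fixations.append(mean_position(current))
                current = []
        else:
            current.append((x1, y1))
    if current:
        fixations.append(mean_position(current))
    return fixations
```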

Rotation trials were eliminated from analysis if the eyes initially landed on an object rather than between objects, if more than one saccade was required to bring the eyes from central fixation to the general region of the object array, or if the eyetracker lost track of the eye. The large majority of eliminated trials were those in which the eyes landed on an object, reflecting the fact that saccades are often inaccurate, a basic assumption of this study. To equate the proportion of eliminated trials in the full-array and single-object conditions, trials were eliminated in the single-object condition if the eyes would have landed on an object had there been a full array of objects. A total of 28% of the trials was eliminated across the experiment.

Results

Rotation trials

The rotation trials were of central interest for examining memory-based gaze correction. Figure 2 shows the eye movement scan paths for a single, representative participant in the full-array rotation condition.

Figure 2. Eye movement scan paths for the full-array rotation trials of a single participant in Experiment 1. Panel A shows all 12 trials on which the array started at the clock positions and rotated clockwise during the saccade. Panel B shows all 12 trials on which the array started at the clock positions and rotated counterclockwise during the saccade. The white disks indicate object positions after the rotation. White dotted circles indicate the scoring regions used for data analysis. Black lines represent saccades and small black dots fixations. The initial saccade directed from central fixation to the array typically landed between target and distractor, as expected given the rotation of the array during the saccade. Then, gaze was corrected. Numerical values indicate correction latency (duration of the fixation before the corrective saccade) in ms for each depicted trial. Note that corrective saccades were directed to the appropriate clockwise (Panel A) or counterclockwise (Panel B) target object.

We first examined the basic question of whether visual memory can support accurate gaze correction. In the full-array condition, which required memory to discriminate target from distractor, mean gaze correction accuracy was 98.1%. That is, after landing between target and distractor, the eyes were directed first to the target on 98.1% of trials. Unsurprisingly, gaze correction accuracy in the single-object condition, which did not require memory, was 100%. The accuracy difference between the full-array and single-object conditions was statistically reliable, t(11) = 2.31, p < .05.

To assess the speed of memory-based gaze correction, we examined gaze correction latency (the duration of the fixation before an accurate corrective saccade) in the full-array and single-object conditions. When comparing gaze correction latencies, it is important to ensure that the distance of the correction was the same in the two conditions. Indeed, mean corrective saccade distance did not differ between the single-object (1.38°) and full-array conditions (1.38°). Mean gaze correction latency was 240 ms in the full-array condition and 201 ms in the single-object condition, t(11) = 4.1, p < .005. The memory requirements in the full-array condition added only 39 ms, on average, to gaze correction latency. Figure 3 shows the distributions of correction latencies for the single-object and full-array conditions. The use of memory to correct gaze was observed as a shift in the peak of the distribution of latencies and an increase in variability. In summary, memory-based gaze correction was highly accurate and introduced only a relatively small increase in gaze correction latency.

Figure 3. Distributions of correction latencies for the full-array and single-object conditions in Experiment 1.

Correction latencies in this experiment might appear fairly long given existing evidence that corrective saccade latencies can be as short as 110–150 ms in studies presenting a single object (Becker, 1991). However, the longer latencies in the present experiment are likely to reflect the fact that the eyes landed relatively close to the contours of the array objects (less than 1° from the nearest object contour, on average). When the eyes land near an object, gaze correction latency is similar to latencies observed for primary saccades, in the range of 150–205 ms (Becker, 1991). In the present experiment, baseline correction latency in the single-object control condition (201 ms) fell within the range of correction latency observed in prior studies. Thus, we can be confident that the present experiment examined typical corrective saccades (in the single-object condition) and the effect of memory demands on the latency of typical corrective saccades (in the full-array condition). Note also that the peaks of the distributions of saccade latency for the single-object and full-array conditions diverged in the range of 100–200 ms (see Figure 3).

In addition to examining memory-based gaze correction, we tested the hypothesis that the relative distance of objects from the saccade landing position influences gaze correction. Given that correction accuracy was 98.1% in the full-array condition, distance could not have exerted a major influence on which object was selected as the goal of the corrective saccade. There was no relationship between correction accuracy and the distance of the initial saccade landing position from the target object (rpb = −.05, p = .35).2 In addition, gaze correction was no more accurate for saccades that landed closer to the target than to the distractor (98.8%) than for saccades that landed closer to the distractor than to the target (97.8%), t(11) = .58, p = .57. Thus, relative distance appears to play little role in gaze correction in the present paradigm, which was dominated by memory for color.

No-rotation trials

We also assessed naturally occurring saccade errors on the no-rotation trials of Experiment 1. A significant proportion of saccades directed from central fixation to the target failed to land on the target, with undershoots the most common error. On 35.9% of trials, the eyes did not land on the 1.6° target object. On 16.8% of trials, the eyes did not land within the 1.9° target scoring region. We can be confident that, in the latter trials, the saccade failed to land on the target. For these trials, gaze correction accuracy was similar to the results for experimentally induced errors on the rotation trials. Mean gaze correction accuracy for naturally occurring errors was 99.2% in the full-array condition and 99.4% in the single-object condition, F < 1.

We were also able to examine correction latency for naturally occurring saccade errors, but these data must be treated with caution. The analysis could be performed only over the subset of natural error trials on which a single corrective saccade took the eyes to the target object (a total of only 298 trials across 12 participants), and there was significant variability in the number of natural error trials per subject, leaving some subjects with very few observations. The numerical pattern of correction latency data was similar to the data from the rotation condition. Mean gaze correction latency was 295 ms in the full-array condition and 271 ms in the single-object condition, F < 1. The longer overall latencies observed for the correction of natural errors were driven by two participants who had very few observations and very high mean correction latencies, greater than 400 ms. In addition, the eyes tended to land closer to the target object for naturally occurring saccade errors than for experimentally induced errors. Longer latencies for natural error correction reflected the typical inverse relationship between correction distance and correction latency (Deubel et al., 1982; Kapoula & Robinson, 1986).

As a whole, the data for the correction of natural errors produced a pattern of results similar to that for experimentally induced errors in the rotation condition, validating the use of array rotation during the saccade as a means to examine gaze correction.

Discussion

The results from Experiment 1 demonstrated that visual memory can guide saccade correction in a manner that is highly accurate and efficient. In the full-array condition—which required pre-saccade memory encoding, transsaccadic retention, and post-saccade comparison of memory with objects lying near the landing position—correction accuracy was nearly perfect, and correction latency increased by only 39 ms, on average, compared with a single-object control condition in which correction did not require memory. This remarkably efficient use of visual memory for oculomotor control supports the general hypothesis that VSTM is used to establish object correspondence across saccades (Currie, McConkie, Carlson-Radvansky, & Irwin, 2000; Henderson & Hollingworth, 1999, 2003a; Irwin, McConkie, Carlson-Radvansky, & Currie, 1994; McConkie & Currie, 1996) and supports our specific proposal that VSTM across saccades is used to correct gaze in the common circumstance that the eyes fail to land on the target object.

Experiment 2

The goal of the present study was to understand the functional role of VSTM in real-world perception and behavior. Experiment 1 used relatively simple objects of the kind commonly employed in VSTM experiments, but objects in the world typically are much more complex. In Experiment 2, we sought to demonstrate that the oculomotor system can solve the correspondence problem for objects similar in complexity to those visible in the real world. However, we wanted to avoid using familiar, real-world objects as targets and distractors, because accurate correction for familiar objects could be due to non-visual representations such as conceptual or verbal codes. We therefore created a set of complex novel objects, which simulated the complexity of natural objects without activating conceptual representations or names. As illustrated in Figure 4, object arrays consisted of eight novel objects selected from a set of 48 objects. The basic method and logic of the experiment were the same as in Experiment 1.

Figure 4. Sample full array of novel objects in Experiment 2.

Methods

Participants

A new group of 12 University of Iowa students participated for course credit. Each reported normal, uncorrected vision.

Stimuli, Apparatus, and Procedure

Forty-eight complex novel objects were created using 3D-modeling and rendering software. Each object was designed to approximate the complexity of natural objects without closely resembling any common object type. Novel objects were presented in grayscale. Grayscale objects were used in anticipation of Experiment 3, in which novel-object gaze correction was combined with a concurrent color memory load.

Object arrays consisted of 8 complex novel objects drawn randomly without replacement from the set of 48 (see Figure 4). The target object was equally likely to appear at each of the 8 possible locations. Two initial configurations were possible, offset by 22.5°. Objects subtended 3.1° on average and were centered 7.5° from fixation. The distance between the centers of adjacent objects was 5.8°. As in Experiment 1, the saccade cue was the expansion of the target to 140% of its original size and contraction to the original size over 50 ms of animation. For rotation trials, the array was rotated 22.5° clockwise on half the trials and 22.5° counterclockwise on the other half. The apparatus was the same as in Experiment 1.

The sequence of events in a trial was the same as in Experiment 1. The full-array and single-object conditions were blocked, and block order was counterbalanced across participants. In each block, participants first completed 12 practice trials, followed by 192 experimental trials, 64 of which were rotation trials. Trial order within a block was determined randomly. Each participant completed a total of 384 experimental trials. The entire session lasted approximately 55 min.

Data Analysis

The eyetracking data were coded relative to scoring regions surrounding each object. Scoring regions were square, 3.3° × 3.3°, so as to encompass the variable shapes of the novel objects. A total of 7.9% of the rotation trials was eliminated from the analysis for the reasons discussed in Experiment 1.

Results and Discussion

The analyses below focused on gaze correction in the rotation trials. Due to variation in object shape, an analysis of naturally occurring saccade errors in the no-rotation trials was not feasible. Gaze correction on rotation trials was again accurate, both in the full-array condition (87.0%) and in the single-object control condition (100%), t(11) = 5.2, p < .001. The lower accuracy in the full-array condition in Experiment 2 compared with Experiment 1 could have been caused by VSTM limitations on the encoding, maintenance, or comparison of complex object representations. However, it could also have been caused simply by the difficulty of perceiving complex shape-defined objects in the periphery prior to the saccade. In any case, gaze correction was quite accurate, though it was less than perfect.

Memory-based gaze correction was also quite efficient. Mean gaze correction latency in the full-array condition (221 ms) was only 46 ms longer than mean correction latency in the single-object condition (175 ms), t(11) = 5.1, p < .001. Figure 5 shows the distributions of correction latencies for the single-object and full-array conditions. As in Experiment 1, the memory requirements in the full-array condition led to a shift in the peak of the distribution of latencies and an increase in variability. Correction latencies in the single-object condition again fell within the range of correction latencies observed in previous research.

Figure 5. Distributions of correction latencies for the full-array and single-object conditions in Experiment 2.

Unlike Experiment 1, there was a reliable difference in the mean distance of the corrective saccade between the full-array (2.40°) and single-object (2.48°) conditions, t(11) = 2.65, p < .05. The source of this difference is not clear, and the numerical magnitude of the effect was very small. However, the effect should, if anything, have led to overestimation of the latency difference between the single-object and full-array conditions. With longer corrections requiring less time to initiate, the longer corrections in the single-object condition would have led to slightly faster correction latencies based solely on the distance of the correction.

It is tempting to compare the correction latencies for color patches and complex objects across Experiments 1 and 2, especially as corrections were faster for complex objects (221 ms) than for color patches (240 ms). However, differences in the distance of the correction (1.38° in Experiment 1; 2.40° in Experiment 2) make any cross-experiment comparison difficult. Nevertheless, memory-based correction for complex objects was highly efficient, especially considering the computational and memorial demands of encoding, maintaining, and comparing complex novel object representations (Alvarez & Cavanagh, 2004).

As in Experiment 1, we examined whether the distance of the initial saccade landing position from the target object influenced the selection of that object as the goal of the corrective saccade. Unlike Experiment 1, there was a reliable correlation between distance and correction accuracy (rpb = −.17, p < .001), with the probability of correction to the target decreasing with increasing distance of the saccade landing position from the target. In addition, gaze correction was more accurate for saccades that landed closer to the target than to the distractor (91.6%) than for saccades that landed closer to the distractor than to the target (82.4%), t(11) = 4.14, p < .005. Previous research has shown that distance influences correction latency to single targets (Deubel et al., 1982; Kapoula & Robinson, 1986). The present finding is the first demonstration that distance influences which of several objects is chosen as the target of the correction. However, even when the eyes landed closer to the distractor than to the target, the rate of accurate correction to the target (82.4%) was reliably higher than chance correction of 50%, t(11) = 10.6, p < .001, indicating that memory was still the primary factor determining which object was selected as the goal of the corrective saccade.

To investigate more thoroughly the effect of relative distance from the initial saccade landing position, a difference score was computed for each trial by subtracting the distance of the closer of the two objects (target or distractor) from the distance of the farther of the two objects. For example, if on a particular trial the initial saccade landed 3.1° from the center of the target object and 2.3° from the center of the distractor, the difference score would be 0.8°. Larger values represent cases in which the initial saccade landed much closer to one object than to the other. Smaller values represent cases in which the initial saccade landed approximately midway between the two objects. Figure 6 plots correction accuracy as a function of this difference score, with trials on which the eyes landed closer to distractor than to target plotted on the left and trials on which the eyes landed closer to target than to distractor plotted on the right. Two features of the data are notable. The probability of correcting to the target fell (i.e., the probability of correcting to the distractor rose) when the distance of the distractor from the landing position was relatively small compared with the distance of the target from the landing position. However, even for trials on which the eyes landed much closer to the distractor than to the target (far left data point of Figure 6), gaze was still more likely to be corrected to the target object (73% of trials) than to the distractor (27% of trials); 73% accurate correction to the target was reliably higher than chance correction of 50%, t(11) = 5.52, p < .001.
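The difference score can be made concrete with a short sketch (an illustrative reconstruction of the analysis, not the authors' code):

```python
# Distance difference score: how much closer the initial saccade landed
# to one object than to the other. Names are illustrative.

def dist(a, b):
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

def difference_score(landing, target_pos, distractor_pos):
    """Return the unsigned distance difference and which object was closer.
    Example from the text: landing 3.1 deg from the target and 2.3 deg
    from the distractor yields (0.8, 'distractor')."""
    d_target = dist(landing, target_pos)
    d_distractor = dist(landing, distractor_pos)
    closer = "target" if d_target < d_distractor else "distractor"
    return abs(d_target - d_distractor), closer
```

For Figure 6, trials on each side (closer to target vs. closer to distractor) were then sorted by this score and split into thirds.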

Figure 6. Gaze correction accuracy for full-array, rotation trials in Experiment 2 as a function of the difference in distance of the target and distractor from the landing position of the initial saccade. Small differences in distance (center of the figure) represent trials on which the eyes landed near the midpoint between target and distractor. Large differences in distance (far right and left of figure) represent trials on which the eyes landed much closer to one object than to the other. Trials on which the eyes landed closer to the distractor than to the target are plotted on the left. Trials on which the eyes landed closer to the target than to the distractor are plotted on the right. For each type of trial, the distance difference data were split into thirds. Mean distance difference in each third is plotted against mean gaze correction accuracy in that third.

In summary, memory-based gaze correction for complex novel objects was both accurate and efficient, demonstrating that transsaccadic memory is capable of controlling gaze correction for objects of similar complexity to those found in the real world. In addition, the most plausible nonmemorial cue to correction, relative distance of objects from the saccade landing position, also influenced gaze correction accuracy, but this effect was small compared with the effect of target memory. Even when the eyes landed closer to the distractor than to the target, gaze was still corrected to the remembered target object on the large majority of trials.

Experiment 3

The timing parameters used in Experiments 1 and 2 were designed to induce the use of VSTM rather than other memory systems in computing object correspondence and gaze correction. However, it is possible that some other memory system was responsible for the accurate and efficient gaze correction performance observed in these experiments. Consequently, Experiment 3 directly tested whether gaze correction depends on the VSTM system that has been studied extensively over the last decade (Alvarez & Cavanagh, 2004; Hollingworth, 2004; Irwin & Andrews, 1996; Vogel, Woodman, & Luck, 2001; Wheeler & Treisman, 2002; Xu & Chun, 2006). We combined the gaze-correction paradigm used in Experiment 2 with a secondary task—color change detection—that is known to require VSTM (Luck & Vogel, 1997). If VSTM plays a central role in controlling corrective saccades, then a secondary VSTM task should interfere with gaze correction (and vice versa), as the two tasks will compete for limited VSTM resources. Similar methods have been used to examine the role of VSTM in visual search (Woodman & Luck, 2004; Woodman, Luck, & Schall, in press; Woodman et al., 2001) and in mental rotation (Hyun & Luck, in press).

Arrays consisted of eight grayscale novel objects (the single-object condition was eliminated). In the dual-task condition (Figure 7), participants saw a to-be-remembered array of five color patches before the gaze correction target was cued. After the participants made an eye movement to this target (and possibly a corrective saccade), a test array of colors was presented, with all colors remaining the same or one changed. Participants responded “same” or “changed” by manual button press. Thus, memory was required both to perform the color change-detection task and to make a memory-guided gaze correction. Performance in the dual-task condition was compared with performance in two single-task conditions. In the gaze-correction-only condition, participants were instructed to ignore the initial set of colors, and color memory was not tested at the end of the trial. In the color-memory-only condition, participants were instructed to ignore the novel objects and performed only the color-memory task.

Figure 7. Sequence of events in a rotation trial of the dual-task condition of Experiment 3.

If the same memory system is used both to make memory-guided gaze corrections and to perform the color change-detection task, then it should be more difficult to perform these two tasks together than to perform each task alone. If different memory systems are used, however, then the two tasks should not interfere and performance should be equivalent in the dual-task and single-task conditions. Although one might expect some dual-task interference even if different memory systems are used, previous research shows that interference in this type of procedure is highly specific. For example, a spatial change-detection task interferes with visual search whereas an object change-detection task often does not (Woodman & Luck, 2004; Woodman et al., 2001), and an object change-detection task interferes with mental rotation whereas a spatial change-detection task does not (Hyun & Luck, in press). Thus, the presence or absence of interference in this paradigm can provide specific information about the use of shared cognitive resources. A follow-up experiment will be described later to provide more direct evidence about the specificity of the interference.

Methods

Participants

A new group of 12 University of Iowa students participated for course credit. Each reported normal, uncorrected vision.

Stimuli and Apparatus

The gaze correction stimuli were the same grayscale novel objects used in Experiment 2. The stimuli for the color memory task were square patches, 1.3° × 1.3°. Five color patches were presented on each trial. The center of each patch was 1.8° from central fixation, and patches were evenly spaced around central fixation. Colors were chosen randomly without replacement from a set of nine: red, green, blue, yellow, violet, light-green, light-blue, brown, and pink. Red, green, blue, yellow, and violet were the same as in Experiment 1. The x, y, and luminance values for the new colors were as follows: light-green (x = .30, y = .56; 52.61 cd/m2), light-blue (x = .21, y = .25; 62.96 cd/m2), brown (x = .57, y = .38; 3.79 cd/m2), and pink (x = .32, y = .27; 48.98 cd/m2).

The apparatus was the same as in the previous experiments. Button responses for the color change detection task were collected using a serial button box.

Procedure

The three tasks were blocked (96 trials per block) and block order counterbalanced across participants. Each block was preceded by 12 practice trials. Trial order within a block was determined randomly.

On each rotation trial of the dual-task block, the sequence of events was as follows (Figure 7). The novel object array was presented for 1000 ms as the participant maintained central fixation. Then, the color memory array was added to the display for 200 ms. The colors were removed for 500 ms. The target object in the novel object array was cued for 50 ms by expansion and contraction. The array was rotated as the participant generated a saccade to the cued object. After landing, gaze was corrected to the target. Once the target was fixated, it was outlined by a box for 400 ms. This signaled successful fixation of the target and also cued the participant to return gaze to the central fixation point. The test array of color patches was presented until the participant registered the same/changed button response. Trials were evenly divided between same and changed. When a color changed, it was replaced by a new color, randomly selected from the set of four colors that had not appeared in the memory array. The time required to complete the gaze correction task varied from trial to trial, and thus so did the delay between the color memory and test arrays. Mean delay between the offset of the color memory array and the onset of the color test array was 1882 ms.

The sequence of events in the gaze-correction-only condition was identical to that in the dual-task condition, except the test array of colors was not presented.

In the color-memory-only condition, the sequence of events was the same as in the preceding conditions through the presentation of the color memory array. The color memory array was followed by an ISI in which only the novel-object array was presented (no object was cued, as this would likely have led to a reflexive saccade). The color test array was then presented until response.

To ensure that the color memory demands were equivalent in the color-memory-only and dual-task conditions, the duration of the ISI in the color-memory-only condition was yoked to the actual delays between color arrays observed in the dual-task condition. For half of the participants, the dual-task block preceded the color-memory-only block. For these participants, the ISI between color memory and test arrays in the color-memory-only block was yoked, trial by trial, to the actual delay observed on the corresponding trial of the participant’s own dual-task block. This method could not be used for participants who completed the color-memory-only block before the dual-task block. For these participants, the ISI duration was randomly selected on each trial from the set of actual delays observed across all preceding participants’ dual-task trials. Thus, we equated, to the closest possible approximation, not only the mean delay between color arrays in the two conditions but also the variability in that delay. The results did not differ between the two yoking methods, and the data were combined for analysis.
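The two yoking schemes can be summarized in a short sketch (hypothetical function and variable names; only the logic follows the text):

```python
import random

def yoked_isis(dual_task_first, own_delays=None, pooled_delays=None,
               n_trials=96):
    """Return one ISI (ms) per color-memory-only trial.

    own_delays:    this participant's observed dual-task delays (if the
                   dual-task block came first)
    pooled_delays: delays observed across all preceding participants'
                   dual-task trials (otherwise)
    """
    if dual_task_first:
        # Trial-by-trial yoking to the participant's own dual-task block.
        return list(own_delays)
    # Otherwise, sample each ISI from the pooled distribution of delays.
    return [random.choice(pooled_delays) for _ in range(n_trials)]
```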

Data Analysis

Eyetracking data were analyzed with respect to the scoring regions described in Experiment 2. A total of 19.5% of trials was eliminated for the reasons described in Experiment 1.

Results

Performance in the gaze correction and color memory tasks is reported in Table 1.

Table 1.

Experiment 3: Gaze correction and color memory performance under dual-task and single-task conditions.

Measure                        Dual Task    Gaze Correction Only    Color Memory Only
Gaze Correction
  Accuracy (% correct)*          80.8             90.2                    --
  Latency (ms)*                   228              203                    --
Color Memory
  Accuracy (% correct)**         70.7              --                    77.2

* Single-task/dual-task contrast p < .05.
** p < .005.

Gaze correction performance

The central question was whether gaze correction performance on rotation trials would be impaired in the dual-task condition compared with the gaze-correction-only condition. Gaze correction performance was significantly impaired by the color memory task. Gaze correction accuracy was reliably lower in the dual-task condition (80.8%) than in the gaze-correction-only condition (90.2%), t(11) = 2.74, p < .05. And gaze correction latency was reliably longer in the dual-task condition (228 ms) than in the gaze-correction-only condition (203 ms), t(11) = 2.33, p < .05. Thus, we can infer that gaze correction depends directly on the VSTM system.

In addition to examining the effect of a VSTM load on gaze correction, we again examined the effect of target distance from the landing position of the initial saccade. For the gaze-correction-only condition, the probability of accurate correction decreased with increasing distance (rpb = −.12, p < .05), and saccades that landed closer to the target than to the distractor were more likely to be corrected to the target (95.4%) than saccades that landed closer to the distractor than to the target (85.8%), t(11) = 5.35, p < .001. Similar results were obtained in the dual-task condition, though the effects did not reach statistical significance. The correlation between distance and correction accuracy approached reliability (rpb = −.11, p = .07), and saccades that landed closer to the target than to the distractor were more likely to be corrected to the target (85.4%) than saccades that landed closer to the distractor than to the target (75.5%), t(11) = 2.07, p = .06. Note, however, that even when the eyes landed closer to the distractor than to the target, the rate of accurate correction to the target was still reliably higher than chance correction of 50% in both the gaze-correction-only condition [85.8%, t(11) = 13.8, p < .001] and the dual-task condition [75.5%, t(11) = 5.97, p < .001], indicating that memory was the primary factor determining selection of the goal of the corrective saccade.

For each trial, we computed the distance difference score by subtracting the distance of the landing position to the closer of the two objects from the distance of the landing position to the farther of the two objects. Figure 8 shows gaze correction accuracy as a function of the magnitude of this difference, with trials on which the eyes landed closer to distractor than to target plotted on the left and trials on which the eyes landed closer to target than to distractor plotted on the right. The overall pattern of data was similar to that found in Experiment 2 (see Figure 6). In addition, distance had a larger effect in the dual-task condition than in the gaze-correction-only condition, with correction accuracy falling considerably (to 68% correct) in the dual-task condition when the eyes landed relatively close to the distractor and far from the target. [Even in this case, the rate of accurate correction to the target (68%) was marginally higher than chance, t(11) = 2.20, p = .05.] These data suggest that nonmemorial distance information might play a larger role in the selection of the goal of the corrective saccade when VSTM information is degraded due to the dual-task interference.
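
The distance difference score lends itself to a direct computation. A minimal sketch, assuming 2-D screen coordinates (math.dist requires Python 3.8+):

    import math

    def distance_difference(landing, target, distractor):
        # Distance from the landing position to the farther object minus
        # the distance to the closer object (always non-negative).
        return abs(math.dist(landing, target) - math.dist(landing, distractor))

    def landed_closer_to_target(landing, target, distractor):
        # Sign used to split trials into the two halves plotted in Figure 8.
        return math.dist(landing, target) < math.dist(landing, distractor)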

Figure 8. Gaze correction accuracy for full-array, rotation trials in Experiment 3 as a function of the difference in distance of the target and distractor from the landing position of the initial saccade. For each type of trial, the distance difference data were split into halves. Mean distance difference in each half is plotted against mean gaze correction accuracy in that half. Data are plotted separately for the gaze-correction-only and dual-task conditions.

Color memory performance

The gaze correction task produced reciprocal interference with the color memory task. When including both rotation and no-rotation trials of the dual-task condition, color change detection accuracy for the dual-task condition (70.7%) was reliably lower than for the color-memory-only condition (77.2%), t(11) = 3.58, p < .005.

A comparison of color change detection in the rotation and no-rotation trials of the dual-task condition is potentially informative of the locus of VSTM interference. Both rotation and no-rotation trials would have led to the encoding of saccade target information into VSTM prior to the saccade (because the need for correction became evident only after the saccade was completed). However, only in the rotation condition did participants actually need to use VSTM to correct gaze on most trials. There was no accuracy difference for color change detection between the rotation (70.6%) and no-rotation (70.8%) trials of the dual-task condition, t(11) = .04, p = .97. And both were reliably worse than change detection in the color-memory-only condition (rotation versus color-memory-only, t(11) = 2.27, p < .05; no-rotation versus color-memory-only, t(11) = 2.54, p < .05). Although preliminary, this result suggests that interference is first introduced at the stage of encoding saccade target information into VSTM.

Overall magnitude of interference

Interference in the dual-task condition was observed on both the color memory task and the gaze correction task. Given that the gaze correction task required, at minimum, the encoding and maintenance of one object across the saccade (the saccade target), it is important to demonstrate that there was at least one object’s worth of interference in the dual-task condition. For color memory, we calculated Cowan’s K [(hit rate − false alarm rate) × set size], which provides an estimate of the number of objects retained in memory (Cowan et al., 2005). In the color-memory-only condition, K was 2.73. In the dual-task condition, K dropped to 1.96. Thus, there was a mean decrement of 0.77 objects for color memory in the dual-task condition. This effect constitutes only one component of interference, however, because gaze correction accuracy also was impaired in the dual-task condition. It is difficult to estimate the latter interference in terms of the number of objects retained. However, the numerical magnitude of gaze correction interference (9.4%) was larger than the magnitude of color memory interference (6.5%). Thus, we can conclude, conservatively, that the dual-task condition introduced approximately one object’s worth of interference, consistent with the need to encode the saccade target into VSTM for accurate gaze correction.
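
For reference, Cowan’s K and the decrement reported above reduce to simple arithmetic. A minimal sketch (the K values below are those reported in the text, not recomputed from raw hit and false-alarm rates):

    def cowans_k(hit_rate, false_alarm_rate, set_size):
        # Cowan's K: estimated number of objects retained in memory
        # (Cowan et al., 2005).
        return (hit_rate - false_alarm_rate) * set_size

    dual_task_decrement = 2.73 - 1.96   # 0.77 objects lost under dual-task load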

Discussion

Performance on both the gaze-correction and color-memory tasks was impaired when they were performed concurrently. This is the strongest evidence to date that memory across saccades depends on the same memory system engaged by conventional VSTM tasks. Whereas prior studies have shown that VSTM and transsaccadic memory have similar representational properties (for a review, see Irwin, 1992b), the present experiment is the first to demonstrate that VSTM actually plays a functional role in gaze control.

We conducted an additional dual-task experiment (N = 12) to ensure that the interference observed in Experiment 3 was the result of competition for VSTM resources rather than a general consequence of performing two tasks concurrently. The color-memory task was replaced by a verbal short-term memory task. In the dual-task condition, participants heard five consonants before the gaze correction task. (Letters were selected randomly without replacement from the set of English consonants; stimuli were digitized recordings of a female voice speaking each letter; letters were presented at a rate of 700 ms/letter). After gaze was corrected, a single consonant was played, and participants reported whether it had or had not been present in the original set (mean accuracy was 87.9%). The gaze-correction-only condition was the same, except participants were instructed to ignore the initial set of letters, and verbal memory was not tested at the end of the trial. We did not conduct a verbal-memory-only condition, because we were specifically interested in gaze correction interference. In all other respects, however, the method was identical to that in Experiment 3.

The verbal task did not require VSTM resources, and therefore it should not have interfered with gaze correction. Indeed, the addition of the verbal memory task did not significantly impair gaze correction accuracy or latency. Mean correction accuracy was 87.4% in the dual-task condition and 84.3% in the gaze-correction-only condition, t(11) = 1.67, p = .12. Mean gaze correction latency was 204 ms in the dual-task condition and 208 ms in the gaze-correction-only condition, t(11) = .41, p = .69.

Thus, interference with gaze correction was specific to a secondary task (color change detection) that competed with gaze correction for VSTM resources.

Experiment 4

The speed of VSTM-based correction suggests that correction to the remembered saccade target might be a largely automatized skill, made efficient by the use of VSTM-based correction thousands of times each day. Experiment 4 examined whether gaze correction requires awareness of rotation and whether participants can voluntarily control VSTM-based corrective saccades. The basic method of Experiment 1 was used, except that an outer ring was added to the array (Figure 9). Participants were instructed that if they noticed that the array rotated, they should immediately shift their eyes to the outer ring without looking at any of the objects in the array. We predicted that gaze corrections would occur whether or not the participants were aware that the array had rotated, and that the participants would be unable to suppress corrections even when they were aware of the rotation.

Figure 9. Object array illustrating the outer ring in Experiment 4.

Methods

Participants

A new group of 12 University of Iowa students participated for course credit. Each reported normal, uncorrected vision.

Stimuli, Apparatus, and Procedure

The stimuli, apparatus, and procedure were identical to those of Experiment 1, with the following exceptions.

A continuously visible, light-gray ring was added to the stimulus images (Figure 9). The width of the ring was 1.9°, and the inner contour of the ring was 7.7° from central fixation. In addition, the initial object array was offset randomly (between 0 and 29°) from the clock positions to minimize the possible strategy of detecting rotations by encoding the array configuration categorically. Participants were told that the array would rotate during their eye movement to the target object on some trials. They were instructed that if they noticed that the array rotated, they should direct gaze immediately to the outer ring without fixating any of the objects. If they did not notice that the array rotated, they should simply direct gaze to the target object. Half of the 12 participants were given an additional instruction. (Only after analyzing the data from the first six participants did it become evident that this additional task would be informative.) After the completion of each trial on which the outer ring was fixated, these participants were asked to report, via button press, whether they had or had not fixated the target object. This allowed us to gain preliminary evidence about whether participants were aware of their corrective saccades to the target object. Gaze correction performance did not differ between the two subgroups of participants.

The single-object control condition was eliminated. Participants completed 288 full-array trials (96 rotation trials) in a single session. Trial order was determined randomly. The entire session lasted approximately 45 minutes.

Data Analysis

A total of 15.7% of the data was eliminated for the reasons described in Experiment 1.

Results and Discussion

The critical trials were rotation trials on which the eyes landed between the target and distractor. Participants fixated the outer ring, indicating correct detection of rotation, on 77% of the rotation trials. On no-rotation trials, participants fixated the outer ring on only 2% of trials, demonstrating that they were moderately sensitive to the rotations in the rotation condition. Because the rotation occurred during the saccade, when vision is suppressed, detection of rotation could not have been based on direct perception of displacement. Thus, these data indicate that participants can remember information about the pre-saccade array (such as the locations of array objects) sufficient to detect that the array has rotated after the saccade.

We first examined the 23% of rotation trials on which the outer ring was not fixated; on these trials, we assume that the participant did not detect the rotation. Gaze was corrected to the target object first on 96% of these trials, and mean correction latency was 228 ms (SE = 18 ms). These corrections were just as fast as the memory-guided gaze corrections in Experiment 1. Thus, efficient gaze correction does not require explicit awareness of the rotation.

On the other 77% of rotation trials, participants detected the rotation and fixated the outer ring. Despite the instruction to avoid fixating any of the objects, participants made a corrective saccade to the target before fixating the ring on 56% of these trials. The distractor was fixated before the ring on only 4% of the detected rotation trials. The mean latency of corrections to the target was 225 ms (SE = 25 ms), which was just as efficient as the corrections observed in Experiment 1. Thus, gaze was corrected to the target object efficiently on a significant proportion of trials despite the task instruction to shift gaze immediately to the outer ring. On the remaining 40% of these trials, participants fixated the outer ring without fixating either the target or distractor, as instructed, but these saccades were quite inefficient. Mean saccade latency to the ring on these trials was 483 ms (SE = 51 ms), which was more than twice as long as the mean latency for corrective saccades. A plausible explanation for these long saccade latencies is that participants programmed a corrective saccade to the target but inhibited it before programming a saccade to the outer ring.

Some caution is always necessary when trying to determine whether an observer was aware of a stimulus or behavior, and proving a lack of awareness is at best an uphill battle (see Greenwald & Draine, 1998), especially because awareness is likely to be variable rather than categorical in most cases (for an exception, see Sergent & Dehaene, 2004). In the present study, it is likely that the failure of the participants to make a saccade to the outer ring on approximately ¼ of the rotation trials occurred because the perceptual evidence for a rotation failed to exceed some threshold on these trials and not because they were completely unaware of the rotation on these trials (and fully aware of the rotation when they shifted to the outer ring). Thus, the present results do not indicate that corrective saccades occur in the complete absence of awareness of rotation. Rather, they demonstrate that corrective saccades can occur even when the level of awareness is so low that an observer reports being unaware of the rotation.

To assess awareness of gaze correction, six of the participants in this experiment were required to report whether they had or had not fixated the target object after each trial on which they fixated the outer ring. For each participant, trials were divided by whether the participant had or had not actually fixated the target object. When the target was fixated, participants correctly reported target fixation on 87% of trials. When the target was not fixated, participants correctly reported that they did not fixate the target on only 29% of trials. Thus, participants were biased to report that they fixated the target, and the most common error was reporting target fixation when the target had not been fixated. This effect could derive from the same source as the long saccade latencies to the ring discussed above. If, upon landing, attention was covertly shifted to the target object and a saccade to the target inhibited, participants would have been quite likely to confuse attending to the target with fixating the target (Deubel, Irwin, & Schneider, 1999), generating false reports of target fixation. Collapsing across trial type, overall awareness of correction was quite poor, with a mean accuracy of 65% correct (chance = 50%). Accounting for guessing, participants had information sufficient to correctly report target fixation on approximately 30% of trials.
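
This final estimate can be reconstructed under the standard correction for guessing in a two-alternative report; a minimal sketch under that assumption:

    # Two-alternative correction for guessing: the proportion of trials on
    # which the report reflected genuine information about target fixation.
    p_informed = 2 * (0.65 - 0.50)   # = 0.30, i.e., approximately 30% of trials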

In summary, VSTM-based corrective saccades did not depend on having explicitly detected the rotation; corrections were automatic in the sense that they were often generated despite the instruction to shift gaze immediately to the outer ring; and participants often could not report whether they had or had not made a correction to the target. These results, together with evidence that the use of VSTM to correct gaze is highly efficient, suggest that VSTM-based gaze correction is a largely automatized skill. In addition, these results constitute some of the first evidence that VSTM, typically equated with conscious cognition, can operate implicitly to support automated visual behavior.

General Discussion

In the present study we sought to understand the functional role of VSTM in real-world cognition and behavior. We tested the hypothesis that VSTM plays a central role in memory across saccades, in the computation of object correspondence, and in successful gaze correction following an inaccurate eye movement. The results of four experiments demonstrate that (a) memory can be used to correct saccade errors; (b) these memory-guided corrections are nearly as fast and accurate as stimulus-guided corrections; (c) VSTM is the memory system that underlies these corrections; (d) memory-guided gaze corrections are automatic rather than being a strategic response to task demands; and (e) these corrections typically occur with minimal awareness. The accuracy, speed, and automaticity of VSTM-based gaze correction are characteristics one would expect of a system that has been optimized, through extensive experience, to ensure that the eyes are efficiently directed to goal-relevant objects in the world.

The Function of VSTM

If gaze errors could not be corrected quickly, visually guided behavior would be significantly slowed. This can be observed directly on the occasional trials in Experiment 2 when the corrective saccade was made first to a distractor instead of to the target. On these trials, fixation of the target object was delayed by an average of 482 ms compared with trials on which gaze was first corrected to the target. Such a delay would significantly impair performance of real-world tasks that require rapid acquisition of perceptual information. For example, in a basketball game in which one’s own team wears yellow and the other red, if gaze was inaccurately corrected to a red-shirted player when attempting to make a pass to a yellow-shirted player, a delay of half a second could be the difference between a successful pass and a turnover. In driving, the delay introduced by inaccurate correction to a blue car rather than to a red car could be disastrous if the red car was the one being driven erratically and dangerously. It is also important to note that many everyday tasks—such as making a meal or giving directions from a map—unfold over the course of minutes and require literally hundreds of saccades to goal-relevant objects (Hayhoe, 2000; Land et al., 1999). Given that saccade errors occur on approximately 30–40% of trials under the simplest conditions, the accumulated effect of inaccurate gaze correction could be very large in day-to-day human activities. Thus, VSTM-based gaze correction is likely to be an important factor governing efficient visual behavior.

Although the present study demonstrates that VSTM is used to perform gaze correction, VSTM across saccades may have additional functions. Irwin, McConkie, and colleagues have argued that VSTM for the saccade target object supports the phenomenon of visual stability across saccades (i.e., that the world does not appear to change across an eye movement despite changes in retinal projection) (Currie et al., 2000; McConkie & Currie, 1996). In this view, features of the saccade target object are stored across the saccade. If an object matching the remembered target is found close to the landing position, stability is maintained. If not, one becomes consciously aware of a discrepancy between pre- and post-saccade visual experience. Although transsaccadic VSTM might certainly support perceptual stability, we observed efficient gaze correction independently of whether participants perceived stability or change. In Experiment 4, gaze was corrected quickly to the target object whether or not the participant was aware of the change introduced by rotation. This suggests a more fundamental role for VSTM across saccades: establishing correspondence between the saccade target object visible before and after the saccade.

Another plausible function for VSTM across saccades is to integrate perceptual information visible on separate fixations. In fact, the earliest work on VSTM identified transsaccadic integration as the likely purpose of a short-term visual memory (Phillips, 1974), and much of the literature on transsaccadic memory has assumed that integration is a central function of that system. Yet, there are reasons to be cautious about positing a role for VSTM in integration across saccades. First, although there is considerable evidence that visual information can be retained in VSTM across saccades (Henderson & Hollingworth, 1999, 2003a; Irwin, 1992a; Irwin & Andrews, 1996), there is no clear evidence for the integration of that information with new perceptual information after the saccade. Direct tests of whether VSTM supports visual integration have shown that VSTM is typically used for the comparison of successive inputs rather than their integration (Hollingworth et al., 2005; Jiang & Kumar, 2004). Second, the representational capacity of VSTM dictates that if integration occurs in VSTM, it must be very limited. The capacity of VSTM is a few objects at most, and for complex natural objects, capacity appears to be only 1–2 objects (Alvarez & Cavanagh, 2004; Hollingworth, 2004). Thus, VSTM could possibly support the integration of information about the saccade target object and perhaps one other object, but significantly more elaborate integration would appear to be beyond the capability of VSTM. Instead, the integration of visual information to form larger-scale representations of scenes depends not on VSTM but on high-capacity LTM (Hollingworth, 2004, 2005; Hollingworth & Henderson, 2002; Zelinsky & Loschky, 2005). VSTM might play a role in the consolidation of visual information into LTM, but this possibility has yet to be tested.

A more general conception of VSTM function is that VSTM enables a variety of perceptual comparison operations, of which transsaccadic correspondence and gaze correction would be examples (for a review, see Luck, in press). Most perceptual comparisons involve objects that are spatially separated, and thus sequential attention is required to process the perceptual details of each object. VSTM likely spans the gap between sequentially attended objects, allowing for the maintenance of information about a previously attended object (Hollingworth, 2004; Johnson, Hollingworth, & Luck, in press) so that this information can be compared with the perceptual properties of the currently attended object. For example, to determine which cookie to choose from two alternatives, properties of cookie A (e.g., size, number of chocolate chips) would be encoded into VSTM and stored across a shift of attention (and a saccade) to cookie B, so that the properties of cookie A could be compared with those of cookie B. Perceptual comparison also occurs in the realm of object recognition and in the learning of object categories. When a child learns the difference between, say, apples and pears, the maintenance of perceptual information in VSTM as gaze is directed from an apple to a pear would support the detection of differences that could enable subsequent categorization operations to discriminate between the two classes of object.

Gaze Correction and Visual Search

One type of perceptual comparison operation has received particular attention in the literature on VSTM. Researchers have proposed that a central function of VSTM is to remember visual properties of a target object during visual search, so that target memory can be compared with search elements and the object matching the search target detected (e.g., Duncan & Humphreys, 1989). Moreover, spatial attention (and the eyes) is proposed to be biased during search toward locations containing an object that matches the perceptual features of the target maintained in VSTM (Chelazzi et al., 1993; Desimone & Duncan, 1995).

The present data speak directly to the role of VSTM in visual search. Gaze correction is, at heart, a visual search task—the remembered saccade target must be found among a set of distractor objects—and it is certainly one of the most frequent forms of visual search behavior. In Experiment 3, a concurrent VSTM load interfered with search for the saccade target, significantly elevating search time (both through a greater probability of inaccurate correction and through longer correction latencies when a single saccade was directed to the target). Therefore, the results of Experiment 3 provide substantial support for the hypothesis that VSTM is used during search to maintain search target (saccade target) properties, albeit in a different form of search paradigm than is commonly considered in the literature.

The present results are consistent with recent work examining the role of VSTM in traditional search tasks. Early studies indicated that VSTM might play little or no role in search. For example, Woodman et al. (2001) found that a concurrent VSTM load of colors did not interfere significantly with visual search efficiency. However, the use of static search targets in the Woodman et al. study minimized the need to represent the search target in VSTM (and instead, participants were likely to have used LTM to perform this function). In a follow-up study (Woodman et al., in press), the search target varied from trial to trial, placing greater demand on VSTM to maintain the current search target template. Under these conditions, a concurrent VSTM load of colors did interfere with search efficiency. The present gaze correction task also placed strong demands on VSTM, as the target of the saccade (and subsequent search) varied from trial to trial, and a concurrent VSTM load likewise generated significant interference with search.

In summary, VSTM plays a central role in computing object correspondence across saccades and in correcting gaze. This function can be seen as reflecting a general role for VSTM in perceptual comparison. In addition, gaze correction is a very common form of visual search, raising the possibility that interactions between VSTM, perception, and attention in search have developed, at least in part, to guide attention and the eyes to the remembered saccade target after an inaccurate saccade.

Implications for Understanding the Computation of Object Correspondence

The problem of object correspondence is fundamental to visual cognition and arises whenever visual input is disrupted briefly, such as across eye movements, blinks, and occlusion. Given the frequency of these disruptions, the computation of object correspondence is required almost constantly during natural vision. Theoretical accounts of object correspondence have been dominated by the object-file theory of Kahneman et al. (1992). Object-file theory holds that object representations are referenced by their spatial position. A spatial index is assigned to a visible object, and memory for an object’s properties (e.g., visual form and identity information) is bound to the spatial index. This composite representation is then used to bridge disruptions, such as saccades, and to establish the correspondence between objects visible before and after the disruption. The object file framework has been adopted broadly within the literature on transsaccadic memory (Henderson, 1994; Henderson & Anes, 1994; Henderson & Siefert, 2001; Hollingworth & Henderson, 2002; Irwin, 1992a, 1996; Irwin & Zelinsky, 2002; Zelinsky & Loschky, 2005). In addition, recent work demonstrates that the object-file system and VSTM are likely to be one and the same (Hollingworth & Sacks, 2006).

Object-file theory makes a strong claim about how object correspondence is computed: Object correspondence is established by spatiotemporal continuity. That is, an object is treated as corresponding to a previously viewed object if the object’s spatiotemporal properties are consistent with the interpretation of a continuous, persisting entity. Moreover, object-file theory holds that the computation of object correspondence does not consult non-spatial properties of the object, such as shape, color, or meaning (Kahneman et al., 1992; Mitroff & Alvarez, in press). As long as spatiotemporal information is consistent with the interpretation of a continuous object, object correspondence will be established despite inconsistency in surface features or meaning. For example, a red square that travels behind an occluder and emerges as a blue disk will be perceived as a single object if its appearance from behind the occluder happens at the correct time and in the correct location given the speed and direction of the object before occlusion (e.g., Flombaum & Scholl, 2006).

The present study provided evidence that is difficult to reconcile with the claim that object correspondence operations consult only spatiotemporal information. In the gaze correction paradigm, spatial information was entirely non-informative. The rotation during the saccade generated spatial ambiguity after the saccade that could be successfully resolved only by comparing memory for object surface features (such as color) with perceptual information after the saccade. Yet, object correspondence was established accurately and efficiently. Clearly, the gaze correction task differs significantly from the standard object reviewing paradigm of Kahneman et al. (1992). In particular, the method used by Kahneman et al. assessed object correspondence only indirectly by probing memory for an object (typically a letter) associated with a different object whose continuity is being manipulated (typically a box in which the letter appeared). In the gaze correction paradigm, the properties relevant for performing the task are properties of the saccade target object itself (such as its color), providing a more direct test of how the features of an object are used to establish correspondence. It is certainly possible, however, that correspondence is computed differently for objects in motion (as in the standard Kahneman et al. task) and objects across saccades (as in the present task). Thus, we can conclude from the present results that in the domain of transsaccadic memory, object correspondence consults surface feature information and is not limited to spatiotemporal information. Further research will be required to probe whether the different results observed in the two paradigms arise from the effects of two different correspondence operations or are caused simply by methodological differences, such as differences in the relative salience of spatial and surface feature information.

The Effect of Target Distance from the Saccade Landing Position

Memory for surface features dominated gaze correction in the present experiments; however, there was a small but consistent effect of the relative distance of objects from the saccade landing position in Experiments 2 and 3. In these experiments, the probability of correcting gaze to the target decreased with increasing distance between the target and the landing position of the initial saccade. How might distance interact with memory to guide object correspondence and gaze correction? First, the relative distance of objects from the saccade landing position could inform gaze correction when VSTM information is degraded or non-informative. In Experiments 2 and 3, the difficulty of perceiving complex objects presented in the periphery might have led to an imprecise VSTM representation of the saccade target on some trials. In the face of poor VSTM information, gaze correction mechanisms would rely on nonmemorial information, such as the relative distance of objects from the saccade landing position. At an extreme, if VSTM was entirely noninformative (e.g., all array objects were identical), then gaze correction presumably would depend entirely on nonmemorial cues. Thus, just as various monocular and binocular cues are combined in a weighted manner when the visual system performs depth perception, various memorial and nonmemorial cues could be combined in a weighted manner when the visual system selects an object for a corrective saccade.

A second alternative is that the use of VSTM information is limited to a spatial region fairly close to the saccade landing position (Currie et al., 2000). That is, only objects whose distance from the landing position makes them plausible candidates for the saccade target would be compared with memory. The fairly large inter-object distances in Experiments 2 and 3 raise the possibility that on some trials, the target was simply too far from the landing position to be considered as a candidate for correction, and it was not compared with memory (in which case it would be less likely to be selected as the goal of the correction). Such a distance heuristic would streamline search and comparison processes after the saccade, limiting search for the saccade target to a relatively small portion of the visual field.
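
These two alternatives can be made concrete in a single selection rule. The following is an illustrative sketch with hypothetical weights and gating radius, not a fitted model:

    import math

    def select_correction_target(candidates, landing, memory_match,
                                 w_memory=0.8, w_distance=0.2, gate_radius=None):
        # candidates:   {name: (x, y)} positions of post-saccade objects
        # memory_match: {name: similarity in [0, 1] to the remembered target}
        # gate_radius:  if set, objects farther than this from the landing
        #               position are never compared with memory (alternative 2)
        # Returns the name of the selected object, or None if every object
        # falls outside the gate.
        max_dist = max(math.dist(landing, pos) for pos in candidates.values()) or 1.0
        best_name, best_score = None, -math.inf
        for name, pos in candidates.items():
            d = math.dist(landing, pos)
            if gate_radius is not None and d > gate_radius:
                continue  # implausible candidate: skip the memory comparison
            proximity = 1.0 - d / max_dist  # closer objects score higher
            score = w_memory * memory_match[name] + w_distance * proximity
            if score > best_score:
                best_name, best_score = name, score
        return best_name

Setting w_distance to zero recovers purely memory-based correction, whereas degraded memory (low, undifferentiated memory_match values) shifts the decision toward proximity, consistent with the pattern observed in the dual-task condition of Experiment 3.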

Conclusions

The present study demonstrated that an important function of the VSTM system is to establish object correspondence across saccades and enable the efficient correction of gaze when the eyes fail to land on the saccade target object. When combined with previous research on gaze control, the present study suggests the following sequence of events. Before a saccade, visual attention is shifted covertly to the saccade target object (Deubel & Schneider, 1996; Hoffman & Subramaniam, 1995). The attended saccade target receives preferential perceptual processing (Hillyard, Vogel, & Luck, 1998; Yeshurun & Carrasco, 1999) and is consolidated into VSTM (Henderson & Hollingworth, 2003a; Irwin & Gordon, 1998; Schmidt et al., 2002). During the saccade, the target representation is maintained in VSTM (Irwin, 1992a; Irwin & Andrews, 1996). When the eyes land, perceptual information around the landing position is compared with target memory (Currie et al., 2000; Henderson & Hollingworth, 2003a). In the common circumstance that the eyes miss the saccade target, the target must be found, and when found, a corrective saccade is executed to that object. VSTM-based visual search for the saccade target is highly accurate and efficient, with the use of VSTM producing only a 40–50 ms increase in gaze correction latency. In addition, such memory-guided gaze correction is largely automatic. The speed and automaticity of VSTM-based correction suggests that VSTM plays a central role in ensuring that the eyes are directed efficiently to goal-relevant objects in the world.

Acknowledgments

The research was supported by NIH grants R01EY017356, R01MH063001, R01MH076226, and R03MH65456.

Footnotes

1. We use the term “VSTM” to refer to the subcomponent of the working memory system that retains information about ventral-stream features such as color and form (Luck, in press). In addition, we treat the terms “VSTM” and “visual working memory” as synonymous.

2. In this and in subsequent regression analyses, we regressed the distance of the target from the saccade landing position against the dichotomous correction accuracy variable, yielding a point-biserial coefficient. Each trial was treated as an observation. Because each participant contributed more than one sample to the analysis, variation due to differences in participant means was removed by including participant as a categorical factor (implemented as a dummy variable) in the model.
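
One way this analysis could be set up, as a minimal sketch assuming trial-level data in a pandas DataFrame with hypothetical column names (the t-test on the distance slope in this dummy-coded model corresponds to a test of the within-participant point-biserial association):

    import pandas as pd
    import statsmodels.formula.api as smf

    # One row per trial: correct (0/1), distance of target from the saccade
    # landing position (deg), and participant identifier (illustrative values).
    trials = pd.DataFrame({
        'correct':     [1, 1, 0, 1, 0, 1],
        'distance':    [1.2, 2.5, 4.8, 0.9, 5.1, 3.0],
        'participant': ['p1', 'p1', 'p1', 'p2', 'p2', 'p2'],
    })

    # Dummy-coded participant factor removes between-participant mean differences.
    model = smf.ols('correct ~ distance + C(participant)', data=trials).fit()
    print(model.params['distance'])   # slope of distance on correction accuracy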

Parts of this study were reported at the 45th Annual Meeting of the Psychonomic Society, Toronto, Canada, 2005.

Contributor Information

Andrew Hollingworth, Department of Psychology, University of Iowa.

Ashleigh M. Richard, Department of Psychology, University of Iowa.

Steven J. Luck, Center for Mind & Brain and Department of Psychology, University of California, Davis.

References

1. Alvarez GA, Cavanagh P. The capacity of visual short-term memory is set both by visual information load and by number of objects. Psychological Science. 2004;15:106–111. doi: 10.1111/j.0963-7214.2004.01502006.x.
2. Averbach E, Coriell AS. Short-term memory in vision. The Bell System Technical Journal. 1961;40:309–328.
3. Becker W. The control of eye movements in the saccadic system. Bibliotheca Ophthalmologica. 1972;82:233–243.
4. Becker W. Saccades. In: Carpenter RHS, editor. Vision and visual dysfunction, Vol. 8: Eye movements. London: MacMillan; 1991. pp. 93–137.
5. Carlson-Radvansky LA. Memory for relational information across eye movements. Perception & Psychophysics. 1999;61:919–934. doi: 10.3758/bf03206906.
6. Carlson-Radvansky LA, Irwin DE. Memory for structural information across eye movements. Journal of Experimental Psychology: Learning, Memory, and Cognition. 1995;21:1441–1458.
7. Chelazzi L, Miller EK, Duncan J, Desimone R. A neural basis for visual search in inferior temporal cortex. Nature. 1993;363:345–347. doi: 10.1038/363345a0.
8. Cowan N, Elliott EM, Saults JS, Morey CC, Mattox S, Ismajatulina A, et al. On the capacity of attention: Its estimation and its role in working memory and cognitive aptitudes. Cognitive Psychology. 2005;51:42–100. doi: 10.1016/j.cogpsych.2004.12.001.
9. Currie C, McConkie G, Carlson-Radvansky LA, Irwin DE. The role of the saccade target object in the perception of a visually stable world. Perception & Psychophysics. 2000;62:673–683. doi: 10.3758/bf03206914.
10. Desimone R, Duncan J. Neural mechanisms of selective visual attention. Annual Review of Neuroscience. 1995;18:193–222. doi: 10.1146/annurev.ne.18.030195.001205.
11. Deubel H, Irwin DE, Schneider WX. The subjective direction of gaze shifts long before the saccade. In: Becker W, Deubel H, Mergner T, editors. Current Oculomotor Research: Physiological and Psychological Aspects. New York, London: Plenum; 1999.
12. Deubel H, Schneider WX. Can man bridge a gap? Behavioral and Brain Sciences. 1994;17:259–260.
13. Deubel H, Schneider WX. Saccade target selection and object recognition: Evidence for a common attentional mechanism. Vision Research. 1996;36:1827–1837. doi: 10.1016/0042-6989(95)00294-4.
14. Deubel H, Wolf W, Hauske G. Corrective saccades: Effect of shifting the saccade goal. Vision Research. 1982;22:353–364. doi: 10.1016/0042-6989(82)90151-1.
15. Deubel H, Wolf W, Hauske G. The evaluation of the oculomotor error signal. In: Gale AG, Johnson F, editors. Theoretical and Applied Aspects of Eye Movement Research. North-Holland: Elsevier; 1984.
16. Duncan J, Humphreys G. Visual search and stimulus similarity. Psychological Review. 1989;96:433–458. doi: 10.1037/0033-295x.96.3.433.
17. Flombaum JI, Scholl BJ. A temporal same-object advantage in the tunnel effect: Facilitated change detection for persisting objects. Journal of Experimental Psychology: Human Perception and Performance. 2006;32:840–853. doi: 10.1037/0096-1523.32.4.840.
18. Frost D, Pöppel E. Different programming modes of human saccadic eye movements as a function of stimulus eccentricity: Indications of a functional subdivision of the visual field. Biological Cybernetics. 1976;23:39–48. doi: 10.1007/BF00344150.
19. Greenwald AG, Draine SC. Distinguishing unconscious from conscious cognition—Reasonable assumptions and replicable findings: Reply to Merikle and Reingold (1998) and Dosher (1998). Journal of Experimental Psychology: General. 1998;127:320–324. doi: 10.1037//0096-3445.127.3.320.
20. Grimes J. On the failure to detect changes in scenes across saccades. In: Akins K, editor. Perception: Vancouver studies in cognitive science. Vol. 5. Oxford, England: Oxford University Press; 1996. pp. 89–110.
21. Hayhoe MM. Vision using routines: A functional account of vision. Visual Cognition. 2000;7:43–64.
22. Henderson JM. Two representational systems in dynamic visual identification. Journal of Experimental Psychology: General. 1994;123:410–426. doi: 10.1037//0096-3445.123.4.410.
23. Henderson JM, Anes MD. Effects of object-file review and type priming on visual identification within and across eye fixations. Journal of Experimental Psychology: Human Perception and Performance. 1994;20:826–839. doi: 10.1037//0096-1523.20.4.826.
24. Henderson JM, Hollingworth A. Eye movements during scene viewing: An overview. In: Underwood G, editor. Eye guidance in reading and scene perception. Oxford, England: Elsevier; 1998. pp. 269–283.
25. Henderson JM, Hollingworth A. The role of fixation position in detecting scene changes across saccades. Psychological Science. 1999;10:438–443.
26. Henderson JM, Hollingworth A. Eye movements and visual memory: Detecting changes to saccade targets in scenes. Perception & Psychophysics. 2003a;65:58–71. doi: 10.3758/bf03194783.
27. Henderson JM, Hollingworth A. Global transsaccadic change blindness during scene perception. Psychological Science. 2003b;14:493–497. doi: 10.1111/1467-9280.02459.
28. Henderson JM, Pollatsek A, Rayner K. Effects of foveal priming and extrafoveal preview on object identification. Journal of Experimental Psychology: Human Perception and Performance. 1987;13:449–463. doi: 10.1037//0096-1523.13.3.449.
29. Henderson JM, Siefert AB. The influence of enantiomorphic transformation on transsaccadic object integration. Journal of Experimental Psychology: Human Perception and Performance. 1999;25:243–255.
30. Henderson JM, Siefert ABC. Types and tokens in transsaccadic object identification: Effects of spatial position and left-right orientation. Psychonomic Bulletin & Review. 2001;8:753–760. doi: 10.3758/bf03196214.
31. Hillyard SA, Vogel EK, Luck SJ. Sensory gain control (amplification) as a mechanism of selective attention: Electrophysiological and neuroimaging evidence. Philosophical Transactions of the Royal Society of London Series B: Biological Sciences. 1998;353:1257–1270. doi: 10.1098/rstb.1998.0281.
32. Hoffman JE, Subramaniam B. The role of visual attention in saccadic eye movements. Perception & Psychophysics. 1995;57:787–795. doi: 10.3758/bf03206794.
33. Hollingworth A. Constructing visual representations of natural scenes: The roles of short- and long-term visual memory. Journal of Experimental Psychology: Human Perception and Performance. 2004;30:519–537. doi: 10.1037/0096-1523.30.3.519.
34. Hollingworth A. The relationship between online visual representation of a scene and long-term scene memory. Journal of Experimental Psychology: Learning, Memory, and Cognition. 2005;31:396–411. doi: 10.1037/0278-7393.31.3.396.
35. Hollingworth A, Henderson JM. Accurate visual memory for previously attended objects in natural scenes. Journal of Experimental Psychology: Human Perception and Performance. 2002;28:113–136.
36. Hollingworth A, Hyun JS, Zhang W. The role of visual short-term memory in empty cell localization. Perception & Psychophysics. 2005;67:1332–1343. doi: 10.3758/bf03193638.
37. Hollingworth A, Sacks DL. The updating of object-position binding in visual short-term memory. Poster presented at the Annual Meeting of the Vision Sciences Society; Sarasota, FL; 2006.
38. Hyun JS, Luck SJ. Visual working memory as the substrate for mental rotation. Psychonomic Bulletin & Review. (in press). doi: 10.3758/bf03194043.
39. Irwin DE. Information integration across saccadic eye movements. Cognitive Psychology. 1991;23:420–456. doi: 10.1016/0010-0285(91)90015-g.
40. Irwin DE. Memory for position and identity across eye movements. Journal of Experimental Psychology: Learning, Memory, and Cognition. 1992a;18:307–317.
41. Irwin DE. Visual memory within and across fixations. In: Rayner K, editor. Eye movements and visual cognition: Scene perception and reading. New York: Springer-Verlag; 1992b. pp. 146–165.
42. Irwin DE. Integrating information across saccadic eye movements. Current Directions in Psychological Science. 1996;5:94–100.
43. Irwin DE, Andrews R. Integration and accumulation of information across saccadic eye movements. In: Inui T, McClelland JL, editors. Attention and performance XVI: Information integration in perception and communication. Cambridge, MA: MIT Press; 1996. pp. 125–155.
44. Irwin DE, Gordon RD. Eye movements, attention, and transsaccadic memory. Visual Cognition. 1998;5:127–155.
45. Irwin DE, McConkie GW, Carlson-Radvansky L, Currie C. A localist evaluation solution for visual stability across saccades. Behavioral and Brain Sciences. 1994;17:265–266.
46. Irwin DE, Yantis S, Jonides J. Evidence against visual integration across saccadic eye movements. Perception & Psychophysics. 1983;34:35–46. doi: 10.3758/bf03205895.
47. Irwin DE, Zelinsky GJ. Eye movements and scene perception: Memory for things observed. Perception & Psychophysics. 2002;64:882–895. doi: 10.3758/bf03196793.
48. Jiang Y, Kumar A. Visual short-term memory for two sequential arrays: One integrated representation or two separate representations? Psychonomic Bulletin & Review. 2004;11:495–500. doi: 10.3758/bf03196601.
49. Jiang Y, Olson IR, Chun MM. Organization of visual short-term memory. Journal of Experimental Psychology: Learning, Memory, and Cognition. 2000;26:683–702. doi: 10.1037//0278-7393.26.3.683.
50. Johnson JS, Hollingworth A, Luck SJ. The role of attention in the maintenance of feature bindings in visual short-term memory. Journal of Experimental Psychology: Human Perception and Performance. (in press). doi: 10.1037/0096-1523.34.1.41.
51. Kahneman D, Treisman A, Gibbs BJ. The reviewing of object files: Object-specific integration of information. Cognitive Psychology. 1992;24:175–219. doi: 10.1016/0010-0285(92)90007-o.
52. Kapoula Z. Evidence for a range effect in the saccadic system. Vision Research. 1985;25:1155–1157. doi: 10.1016/0042-6989(85)90105-1.
53. Kapoula Z, Robinson DA. Saccadic undershoot is not inevitable: Saccades can be accurate. Vision Research. 1986;26:735–743. doi: 10.1016/0042-6989(86)90087-8.
54. Land MF, Hayhoe M. In what ways do eye movements contribute to everyday activities? Vision Research. 2001;41:3559–3565. doi: 10.1016/s0042-6989(01)00102-x.
55. Land MF, Mennie N, Rusted J. Eye movements and the roles of vision in activities of daily living: Making a cup of tea. Perception. 1999;28:1311–1328. doi: 10.1068/p2935.
56. Luck SJ. Visual short-term memory. In: Luck SJ, Hollingworth A, editors. Visual Memory. New York: Oxford University Press; (in press).
57. Luck SJ, Vogel EK. The capacity of visual working memory for features and conjunctions. Nature. 1997;390:279–281. doi: 10.1038/36846.
58. Matin E. Saccadic suppression: A review and an analysis. Psychological Bulletin. 1974;81:899–917. doi: 10.1037/h0037368.
59. McConkie GW, Currie CB. Visual stability across saccades while viewing complex pictures. Journal of Experimental Psychology: Human Perception and Performance. 1996;22:563–581. doi: 10.1037//0096-1523.22.3.563.
60. McConkie GW, Rayner K. Identifying the span of the effective stimulus in reading: Literature review and theories of reading. In: Singer H, Ruddell RB, editors. Theoretical Models and Processes in Reading. Newark, DE: International Reading Association; 1976. pp. 137–162.
61. Mitroff SR, Alvarez GA. Space and time, not surface features, guide object persistence. Psychonomic Bulletin & Review. (in press). doi: 10.3758/bf03193113.
62. O'Regan JK, Lévy-Schoen A. Integrating visual information from successive fixations: Does trans-saccadic fusion exist? Vision Research. 1983;23:765–768. doi: 10.1016/0042-6989(83)90198-0.
63. Pashler H. Familiarity and the detection of change in visual displays. Perception & Psychophysics. 1988;44:369–378. doi: 10.3758/bf03210419.
64. Phillips WA. On the distinction between sensory storage and short-term visual memory. Perception & Psychophysics. 1974;16:283–290.
65. Pollatsek A, Rayner K, Collins WE. Integrating pictorial information across eye movements. Journal of Experimental Psychology: General. 1984;113:426–442. doi: 10.1037//0096-3445.113.3.426.
66. Rayner K. Eye movements in reading and information processing: 20 years of research. Psychological Bulletin. 1998;124:372–422. doi: 10.1037/0033-2909.124.3.372.
67. Rayner K, Pollatsek A. Is visual information integrated across saccades? Perception & Psychophysics. 1983;34:39–48. doi: 10.3758/bf03205894.
68. Rensink RA, O'Regan JK, Clark JJ. To see or not to see: The need for attention to perceive changes in scenes. Psychological Science. 1997;8:368–373.
69. Schmidt BK, Vogel EK, Woodman GF, Luck SJ. Voluntary and automatic attentional control of visual working memory. Perception & Psychophysics. 2002;64:754–763. doi: 10.3758/bf03194742.
70. Sergent C, Dehaene S. Is consciousness a gradual phenomenon? Psychological Science. 2004;15:720–728. doi: 10.1111/j.0956-7976.2004.00748.x.
71. Simons DJ, Levin DT. Failure to detect changes to people during a real-world interaction. Psychonomic Bulletin & Review. 1998;5:644–649.
72. Sperling G. The information available in brief visual presentations. Psychological Monographs. 1960;74(11, Whole No. 498).
73. Vogel EK, Woodman GF, Luck SJ. Storage of features, conjunctions, and objects in visual working memory. Journal of Experimental Psychology: Human Perception and Performance. 2001;27:92–114. doi: 10.1037//0096-1523.27.1.92.
74. Wheeler ME, Treisman AM. Binding in short-term visual memory. Journal of Experimental Psychology: General. 2002;131:48–64. doi: 10.1037//0096-3445.131.1.48.
75. Wolfe JM, Horowitz TS, Kenner N, Hyle M, Vasan N. How fast can you change your mind? The speed of top-down guidance in visual search. Vision Research. 2004;44:1411–1426. doi: 10.1016/j.visres.2003.11.024.
76. Woodman GF, Luck SJ. Visual search is slowed when visuospatial working memory is occupied. Psychonomic Bulletin & Review. 2004;11:269–274. doi: 10.3758/bf03196569.
77. Woodman GF, Luck SJ, Schall JD. The role of working memory representations in the control of attention. Cerebral Cortex. (in press). doi: 10.1093/cercor/bhm065.
78. Woodman GF, Vogel EK, Luck SJ. Visual search remains efficient when visual working memory is full. Psychological Science. 2001;12:219–224. doi: 10.1111/1467-9280.00339.
79. Xu YD, Chun MM. Dissociable neural mechanisms supporting visual short-term memory for objects. Nature. 2006;440:91–95. doi: 10.1038/nature04262.
80. Yeshurun Y, Carrasco M. Spatial attention improves performance in spatial resolution tasks. Vision Research. 1999;39:293–306. doi: 10.1016/s0042-6989(98)00114-x.
81. Zelinsky GJ, Loschky LC. Eye movements serialize memory for objects in scenes. Perception & Psychophysics. 2005;67:676–690. doi: 10.3758/bf03193524.
