The role of visual working memory in attentive tracking of unique objects

Tal Makovski; Yuhong V Jiang

doi:10.1037/a0016453

Author manuscript; available in PMC: 2009 Dec 12.

Published in final edited form as: J Exp Psychol Hum Percept Perform. 2009 Dec;35(6):1687–1697. doi: 10.1037/a0016453

PMCID: PMC2792568 NIHMSID: NIHMS110962 PMID: 19968429

Abstract

When tracking moving objects in space, humans usually attend to the objects’ spatial locations and update this information over time. To what extent do surface features assist attentive tracking? In this study we asked participants to track identical or uniquely colored objects. Tracking was enhanced when objects were unique in color. The benefit was greater when the distance between distractors and targets was smaller, but was eliminated when the objects changed colors 1 to 4 times per second, even though at any instant they were always uniquely colored. Additionally, tracking uniquely colored objects impaired a secondary color-memory task more than tracking identical objects, and holding several colors in working memory eliminated the advantage of tracking uniquely colored objects. Contrary to previous studies showing that feature information is poorly retained during tracking, these findings indicate that surface properties are stored in visual working memory to facilitate tracking performance.

Keywords: attentive tracking, multiple-object tracking, visual working memory

Introduction

Tracking moving objects with attention and remembering their identities in visual working memory are two mechanisms that allow people to maintain temporal continuity in a constantly changing environment. Although both attentive tracking and visual working memory can be used to serve similar functions, they are rarely studied together (Song & Jiang, 2006). Researchers interested in working memory usually investigate memory of static displays (Luck & Vogel, 1997), whereas researchers interested in attentive tracking typically examine tracking of identical objects (Pylyshyn & Storm, 1988). In this study we explore how individuals track moving items that have unique identities. Specifically, we examine the extent to which surface features, such as unique colors, contribute to attentive tracking. This is an important yet largely neglected question in tracking studies. Previous studies have shown that observers usually track objects without remembering their identities (Pylyshyn, 2004). Contrary to this conclusion, here we show that unique features can facilitate tracking. We present four experiments to elucidate how visual working memory interacts with tracking of uniquely colored objects.

Standard multiple-object tracking (MOT) research typically overlooks contributions from surface features. In the MOT task, observers track a subset of pre-specified items among identical distractors. Because the objects are not unique in identity, object memory plays no role in this task. The standard MOT research has left out the component of surface properties for several reasons. First, there is evidence that object identities are poorly retained in attentive tracking. For example, when observers are occasionally probed during tracking of unique objects, they can usually report the targets’ location and motion direction, but not their shape or color (Scholl, Pylyshyn, & Franconeri, 1999). This finding suggests that tracking is achieved primarily by updating an object’s spatiotemporal history, rather than by tagging the object’s identity. Second, evidence from developmental psychology has contributed to the idea that spatiotemporal properties, rather than surface features, are crucial for maintaining object continuity. For instance, Xu and Carey (1996) showed that ten-month-old infants are not surprised when a toy duck disappears behind an occluder and reemerges as a toy truck, even though they are highly sensitive to disruptions in motion continuity. Similarly, visually deprived adults who recently recovered vision rely primarily on movement information to individuate and segregate visual displays. They are initially impaired at using discontinuities in surface features, such as the presence of ‘T’-junctions, in segmenting a static display (Sinha, 2007).

However, when requested, observers are able to remember surface features of moving items. For example, when two unique identities (e.g., ‘A’ and ‘B’) are assigned to two placeholders and then removed, observers readily update the identities inside the placeholders after the placeholders rotate 90° (Kahneman, Treisman, & Gibbs, 1992). Consequently, observers are faster at recalling the letters when they reappear at their original placeholders than when they switch placeholders. Although it is possible to update object identities after simple rotational motion, such updating is more challenging in standard MOT tasks where many objects undergo complex motion. For example, Pylyshyn (2004) placed four digits, one in each of four target circles, before the circles started moving among four distractor circles. The digits were then removed and the target and distractor circles began moving freely on the display. When the motion stopped, participants were able to select the target circles from distractors, but unable to report the associated digit in each target circle. These results suggest that humans often track objects without remembering what they are.

Similar conclusions have been reached using the multiple-identity tracking task. In this task, objects with unique shapes or colors maintain their unique identities while moving, and observers are asked to remember which objects moved where. At the end of the motion sequence, the objects disappear and observers are asked to report the identity of a probed target (Oksama & Hyönä, 2004, 2008). Performance in this task was quite poor, supporting the idea that object identities are not well remembered during tracking. For example, Horowitz et al. (2007) asked participants to track cartoon animals with unique identities. When the animals stopped moving and hid behind cactuses, participants were asked to report either the locations of the target animals (standard question) or the location of a specific animal (specific question). The specific question requires participants to remember the individual identities of the target animals. Horowitz et al. estimated that the capacity for recalling the correct location of a specific animal was between 1 and 2, much lower than the number of locations that participants knew concealed targets (Horowitz et al., 2007). In Horowitz et al.’s study the animals moved in straight lines, so the motion was less complex than that used in standard MOT. The low capacity estimated for the specific question is therefore perhaps a generous estimate of object memory in MOT.

The literature reviewed above may imply that surface features are rarely, if ever, used in attentive tracking. However, it is also possible that previous research has underestimated the importance of tracking objects with unique identities. MOT studies often use identical objects, an approach that precludes the role of surface features. On the other hand, multiple-identity-tracking studies are often too complex for revealing an effect of unique features. The multiple-identity-tracking task requires participants to simultaneously track the targets and remember their features. The dual-task requirements may induce interference between tracking the targets and remembering the targets’ features (Fougnie & Marois, 2006). In addition, the multiple-identity tracking task usually involves a small number of objects, almost all of which are targets. This task minimizes the demand for individuating targets from distractors, but maximizes the demand for object memory. While this task is useful for characterizing feature memory of moving objects, it is not well suited for examining attentive tracking of unique objects. If one wishes to test how tracking is affected by the uniqueness of objects, it is necessary to combine elements of the multiple-object-tracking and multiple-identity-tracking tasks. Specifically, one needs a task where participants are not required to remember object properties. Instead, observers simply track targets in displays that involve objects with unique identities, similar to real-life experiences.

We conducted four experiments using this paradigm in order to address two questions. First, to what extent can surface properties facilitate attentive tracking of moving objects? Experiments 1 and 2 were designed to establish the basic advantage of tracking unique objects over tracking identical objects. Second, what accounts for the effect of unique features on attentive tracking? Experiments 3 and 4 examine the contribution of visual working memory to tracking.

Experiment 1

Experiment 1 aimed to test whether unique identities affect attentive tracking. On each trial participants were cued to track 4 targets among a total of 8 objects. The 8 objects either had the same color (homogeneous condition), 8 different colors (all-unique), or intermediate levels of color heterogeneity (e.g., 2 or 4 total colors for the 8 objects). Importantly, participants were informed that color was irrelevant and that their memory for colors would not be tested. Sample displays from all experiments can be viewed at: http://jianglab.psych.umn.edu/MOTunique/MOTunique.htm. If surface properties are usually disregarded in an intrinsically spatial task (e.g., Jiang, Olson, & Chun, 2000), and if tracking exclusively relies on spatiotemporal properties, then performance should be equivalent between the homogeneous and heterogeneous color conditions. Alternatively, heterogeneity among tracked targets may disrupt the grouping of targets (Makovski & Jiang, submitted; Yantis, 1992), lowering performance in the heterogeneous color conditions. A third possibility is that tracking may be enhanced by uniqueness in object identity, either because tracking ability is improved by feature differences between targets and distractors, or because unique identities can be retained in visual working memory. The intermediate levels of heterogeneity (e.g., 4 colors for the 8 objects) allow us to test whether performance changes monotonically as a function of object heterogeneity.

Method

Participants

Participants in all experiments were students from the University of Minnesota. They were 18 to 35 years old and had normal color vision and normal or corrected-to-normal visual acuity. The experiments were conducted with the participants’ written consent. Participants received $10/hour or course credit for their time.

There were 20 participants in Experiment 1. Half of them (mean age 24.3 years) completed Experiment 1a and the other half (mean age 22.5 years) completed Experiment 1b.

Equipment

Participants were tested individually in a room with normal interior lighting. They sat approximately 57 cm away from a 19” computer monitor. The experiment was programmed with the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997) implemented in MATLAB (http://www.mathworks.com).
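As a point of reference for stimulus sizes reported in degrees of visual angle: at a viewing distance of about 57 cm, 1 cm on the screen subtends approximately 1°. The following is a minimal sketch of that conversion in Python, not the code used in the study; the function name and the default distance argument are illustrative assumptions.

```python
import math


def visual_angle_to_cm(angle_deg, viewing_distance_cm=57.0):
    """Physical extent (cm) that subtends `angle_deg` at the given distance.

    Uses the standard relation size = 2 * d * tan(angle / 2).
    The 57 cm default mirrors the approximate viewing distance in the text.
    """
    return 2.0 * viewing_distance_cm * math.tan(math.radians(angle_deg) / 2.0)


if __name__ == "__main__":
    for angle in (0.6, 1.0, 1.2, 21.0):
        print(f"{angle:5.1f} deg -> {visual_angle_to_cm(angle):6.2f} cm")
```

Under this conversion, a 21° extent corresponds to roughly 21 cm on the screen. Converting centimeters to pixels would additionally require the monitor's resolution and physical dimensions, which are not reported here.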

Stimuli

The moving objects were circles (diameter = 0.6°) presented against a gray background. There were eight colors (red, green, blue, yellow, orange, azure, brown, and pink) sampled randomly according to a trial’s condition.

Procedure

Participants completed 125 trials, divided randomly and evenly into five conditions. On each trial, participants pressed the spacebar to initiate the cue period, which brought up 8 stationary objects presented at randomly selected locations within an imaginary square (21° × 21°). Each of the four targets was cued by a white outline square (1.0° × 1.0°). The cues lasted for 1330 ms, after which the white squares disappeared and the objects moved at a constant speed of 17.5 deg/s. Participants were asked to track the cued objects and were encouraged to maintain fixation at the center of the display during tracking. The objects bounced off the edges of the imaginary square or repelled one another at a minimal center-to-center distance of 1.2°. After a few seconds of motion (see Design for specific motion duration), the objects turned black and stopped moving. Participants responded by clicking on four items, after which the correctly selected targets turned green and the missed targets turned red for 1 s to provide feedback.
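The motion rules described above (constant speed, reflection at the edges of the square, and repulsion at a minimal center-to-center distance) can be illustrated with a short Python sketch. This is not the authors' MATLAB implementation. The field size, speed, and minimal distance are taken from the text; the 60 Hz frame rate, the random initial headings, and the repulsion rule (two objects that come too close while approaching simply exchange velocities, which keeps every object's speed constant) are assumptions made for illustration.

```python
import math
import random

FIELD_DEG = 21.0       # side of the imaginary square (from the text)
SPEED_DEG_S = 17.5     # constant object speed (from the text)
MIN_DIST_DEG = 1.2     # minimal center-to-center distance (from the text)
FRAME_RATE_HZ = 60.0   # ASSUMPTION: refresh rate is not reported


def init_objects(n, rng):
    """Place n object centers at random, at least MIN_DIST_DEG apart,
    and give each a random heading at the common speed."""
    positions = []
    while len(positions) < n:
        p = [rng.uniform(0.0, FIELD_DEG), rng.uniform(0.0, FIELD_DEG)]
        if all(math.hypot(p[0] - q[0], p[1] - q[1]) >= MIN_DIST_DEG
               for q in positions):
            positions.append(p)
    velocities = []
    for _ in range(n):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        velocities.append([SPEED_DEG_S * math.cos(theta),
                           SPEED_DEG_S * math.sin(theta)])
    return positions, velocities


def step(positions, velocities, dt=1.0 / FRAME_RATE_HZ):
    """Advance all objects by one frame, updating the lists in place."""
    # Move each object and reflect it at the edges of the square.
    for p, v in zip(positions, velocities):
        for axis in range(2):
            p[axis] += v[axis] * dt
            if p[axis] < 0.0:
                p[axis] = -p[axis]
                v[axis] = -v[axis]
            elif p[axis] > FIELD_DEG:
                p[axis] = 2.0 * FIELD_DEG - p[axis]
                v[axis] = -v[axis]
    # ASSUMED repulsion rule: approaching pairs that are too close swap velocities.
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            dx = positions[i][0] - positions[j][0]
            dy = positions[i][1] - positions[j][1]
            if math.hypot(dx, dy) < MIN_DIST_DEG:
                dvx = velocities[i][0] - velocities[j][0]
                dvy = velocities[i][1] - velocities[j][1]
                if dx * dvx + dy * dvy < 0.0:  # pair is still approaching
                    velocities[i], velocities[j] = velocities[j], velocities[i]


if __name__ == "__main__":
    rng = random.Random(1)
    pos, vel = init_objects(8, rng)
    for _ in range(int(10 * FRAME_RATE_HZ)):  # a 10 s motion period
        step(pos, vel)
    print(pos)
```

Because all objects share the same speed, exchanging velocities leaves each object's speed unchanged, but this rule only approximates the minimal-distance constraint: two objects may briefly come slightly closer than 1.2° within a single frame.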

Design

The five conditions differed in the heterogeneity of item colors used on a tracking trial. Table 1 illustrates the different conditions. In the homogeneous condition, all eight objects were identical in color; the exact color was randomly selected on each trial. In the all-unique condition, the eight objects had 8 different colors. In the paired-four condition, the four targets were unique in color and the four distractors were unique in color, but each target shared its color with one of the distractors. In the paired-two condition, two targets and two distractors were in one color while the other items were in another color. Finally, in the four-unique condition, two targets were one color while two other targets were another color, and two distractors were a third color while two other distractors were a fourth color. All trials were presented in a randomly intermixed order.

Table 1. A schematic illustration of the five conditions tested in Experiment 1.

Condition	Number of colors overall	Number of target colors
Homogeneous	1	1
All-unique	8	4
Four-unique	4	2
Paired-four	4	4
Paired-two	2	2
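The color assignments summarized in Table 1 can be expressed as a short Python sketch. It is an illustration of the design as described, not the experiment code; the function name, the return format, and the use of Python's `random` module are assumptions.

```python
import random

# The eight colors listed under Stimuli.
COLORS = ["red", "green", "blue", "yellow", "orange", "azure", "brown", "pink"]


def assign_colors(condition, rng):
    """Return (target_colors, distractor_colors), each a list of four colors."""
    if condition == "homogeneous":
        c = rng.choice(COLORS)                 # one color for all eight objects
        return [c] * 4, [c] * 4
    if condition == "all-unique":
        c = rng.sample(COLORS, 8)              # eight different colors
        return c[:4], c[4:]
    if condition == "four-unique":
        a, b, c, d = rng.sample(COLORS, 4)     # targets and distractors share no color
        return [a, a, b, b], [c, c, d, d]
    if condition == "paired-four":
        c = rng.sample(COLORS, 4)              # each target matches one distractor
        return list(c), list(c)
    if condition == "paired-two":
        a, b = rng.sample(COLORS, 2)           # two colors, split evenly across sets
        return [a, a, b, b], [a, a, b, b]
    raise ValueError(f"unknown condition: {condition}")


if __name__ == "__main__":
    rng = random.Random(0)
    for name in ("homogeneous", "all-unique", "four-unique",
                 "paired-four", "paired-two"):
        targets, distractors = assign_colors(name, rng)
        print(name, len(set(targets + distractors)), len(set(targets)))
```

Counting the distinct colors overall and among the targets for each condition reproduces the two numeric columns of Table 1.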

Experiment versions

The two versions of the experiment (each involving half of the participants) were identical except for the duration of a trial’s motion period and the use of articulatory suppression. In Experiment 1a, the objects moved for 10 s on each trial. In Experiment 1b, the objects moved for an unpredictable amount of time, randomly selected between 4 and 8 s. The randomized trial duration used in Experiment 1b served to discourage participants from anticipating the trial ending. Participants in Experiment 1b also engaged in articulatory suppression, in which they repeated a three-letter word as quickly as they could throughout a trial. Articulatory suppression minimized the possibility that participants would verbally recode the color of objects.

Results

Despite differences in motion duration (fixed or random) and articulatory suppression, the two versions of the experiment produced remarkably similar results (Figure 1). An ANOVA on stimulus type and experimental version produced no effect of experimental version, F(1, 18) = 1.60, p > .22, and no interaction, F(4, 72) = 1.20, p > .33. For the rest of the analyses data were collapsed across the two versions of the experiment.

Figure 1. Tracking accuracy in Experiment 1a (left) and 1b (right). Error bars show ±1 S.E. Trial motion duration was fixed in Experiment 1a and random in Experiment 1b, which employed articulatory suppression.

Tracking accuracy was significantly affected by the heterogeneity of tracked objects, F(4, 72) = 104.80, p < .01, η_p² = .85. Post-hoc contrasts using Bonferroni corrections for multiple comparisons showed that color distinction between targets and distractors enhanced performance. Accuracy was significantly higher in the all-unique condition than in the homogeneous condition, p < .01, η_p² = .59, and higher in the four-unique condition than in the homogeneous condition, p < .01, η_p² = .89. Accuracy was also higher in the four-unique condition than in the all-unique condition, p < .01, η_p² = .88, possibly because it was easier to group the targets when they were composed of two rather than four colors. In contrast, when targets were distinctive from one another but not distinctive from distractors, no enhancement was found. The paired-four and paired-two conditions were comparable in accuracy, p > .90, and neither was better than the homogeneous condition, ps > .35.
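For readers who wish to check the reported effect size, partial eta squared can be recovered from an F statistic and its degrees of freedom through the identity η_p² = (F × df_effect) / (F × df_effect + df_error). The Python sketch below applies this identity to the omnibus test; the function name is an illustrative assumption, and no values beyond those reported in the text are used.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F ratio and its degrees of freedom.

    Follows from eta_p^2 = SS_effect / (SS_effect + SS_error) and
    F = (SS_effect / df_effect) / (SS_error / df_error).
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # Omnibus effect of color heterogeneity: F(4, 72) = 104.80
    print(round(partial_eta_squared(104.80, 4, 72), 2))
```

Applied to F(4, 72) = 104.80, the identity gives a value of about .85, in agreement with the reported η_p².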

Discussion

Experiment 1 established the finding that surface features can be used to enhance attentive tracking. The critical factor for obtaining an advantage is not the heterogeneity of targets among themselves, but the distinction between targets and distractors (Horowitz et al., 2007; Makovski & Jiang, in press). In the paired-four condition, the targets were distinctive from one another, but each target shared the same color as one of the distractors. This condition did not yield any advantage compared with the homogeneous condition. In contrast, in the all-unique condition, the targets and distractors were all different from one another, and a clear advantage for tracking was observed. This advantage should not be treated as a grouping effect (Yantis, 1992), as the targets were distinctive from one another and thus cannot be easily grouped. Nonetheless, grouping clearly interacts with tracking of unique objects, as the uniqueness advantage was eliminated when targets were similar to the distractors (the paired-four condition).

Thus, contrary to the idea that surface features are rarely used during tracking, we have shown that tracking is enhanced when the tracked objects are unique rather than homogeneous. This advantage raises the question of whether motion tracking was used at all when the tracked objects were all unique. For example, could participants rely on a strategy of “remembering and re-identifying”, where the target colors were remembered in the cue period and re-identified at the end of a trial’s motion? As we will show in Experiments 3 and 4, visual memory is clearly involved in tracking of unique identities. However, performance in the all-unique condition cannot be supported solely by the remembering and re-identifying strategy. Because of the random trial duration (Experiment 1b), participants could not precisely identify the moment at which the colors should be re-identified (note that all items turned black at the end of a trial). This means that tracking is necessary to establish the basis for re-identification or to establish motion continuity. To further strengthen the idea that participants tracked the moving items in our experiments, we conducted a control experiment where color information was available only during the cue phase and the second half of the motion period. In the critical condition, the motion sequence was shuffled to produce incoherent jumps. Participants thus had to rely on a remembering and re-identifying strategy to establish the correspondence between the cued colors and the colors on the final displays. Yet, performance in this condition dropped to 58%, much lower than its smooth-motion counterpart (73%, p < .01, η_p² = .55), suggesting that participants cannot exclusively rely on color memory to re-identify targets in the all-unique condition.

An interesting finding in Experiment 1a was the lower accuracy in the paired-four and paired-two conditions in comparison to the homogeneous condition, ps < .06, η_p² > .34. This trend toward lower performance involving identity-paired targets and distractors was first observed in Horowitz et al. (2007), but the trend was unstable (Horowitz et al., 2007; Makovski & Jiang, in press). Similarly in our study, the trend was not observed in Experiment 1b, and it was eliminated in a follow-up experiment where different conditions were tested in different blocks. Due to the unreliable nature of this finding, we will not focus on the comparison between the paired and homogeneous conditions in subsequent experiments.

Experiment 2

Psychologists often conceptualize attention as not only limited in the number of foci, but also in the spatial resolution of each focus (He, Cavanagh, & Intriligator, 1996). Attention is said to have a finite resolution: it is not restricted to the precise location of a target, but can spread to its neighboring space. When the distance between a distractor and a target is smaller than the spatial resolution of attention, the distractor receives attention and may intrude into target perception. The view of attention as limited in spatial resolution has been instrumental in explaining attentive tracking errors (Intriligator & Cavanagh, 2001) and visual crowding (Chakravarthi & Cavanagh, 2007). In attentive tracking, as distractors get closer to the targets it becomes harder to select just the targets. Consequently, tracking accuracy declines as the minimal target-distractor distance decreases (Shim, Alvarez, & Jiang, 2008).

The main purpose of Experiment 2 is to characterize the interaction between attentional resolution and object uniqueness. We tested whether the advantage afforded by tracking unique objects is constant at different target-distractor distances, or whether it is greater at closer target-distractor distances.