Attention-driven discrete sampling of motion perception

Rufin VanRullen; Leila Reddy; Christof Koch

doi:10.1073/pnas.0409172102

2005 Mar 25;102(14):5291–5296.

PMCID: PMC555984 PMID: 15793010

Abstract

In movies or on TV, a wheel can seem to rotate backwards, due to the temporal subsampling inherent in the recording process (the wagon wheel illusion). Surprisingly, this effect has also been reported under continuous light, suggesting that our visual system, too, might sample motion in discrete “snapshots.” Recently, these results and their interpretation have been challenged. Here, we investigate the continuous wagon wheel illusion as a form of bistable percept. We observe a strong temporal frequency dependence: the illusion is maximal at alternation rates around 10 Hz but shows no spatial frequency dependence. We introduce an objective method, based on unbalanced counterphase gratings, for measuring this phenomenon and demonstrate that the effect critically depends on attention: the continuous wagon wheel illusion was almost abolished in the absence of focused attention. A motion-energy model, coupled with attention-dependent temporal subsampling of the perceptual stream at rates between 10 and 20 Hz, can quantitatively account for the observed data.

Keywords: discrete processing, wagon wheel illusion, bistable percept, temporal subsampling, motion energy model

Does the visual system process the stream of sensory information continuously, or in discrete snapshots, like the successive frames of a video camera? Although numerous speculations and some discoveries have been made over the last century (1–7), this important question is still unresolved (8). One critical piece of evidence for the “discrete snapshot” hypothesis is the recent observation by Purves et al. (9) that the well-known wagon wheel illusion (an impression of reversed motion occurring in movies or on TV owing to the temporal subsampling of movie and video cameras) can also be observed in broad daylight, under continuous conditions of illumination. This continuous version of the wagon wheel illusion (which will be referred to as c-WWI), first described experimentally by Schouten (10), can be interpreted as a manifestation of discrete subsampling occurring within the visual system of the observer, rather than within a camera or due to stroboscopic light. More recently, however, these results (11) and their interpretation (12) have been challenged. Kline, Holcombe, and Eagleman (12) emphasize that the subjective appearance of the c-WWI is very distinct from its cinematographic cousin: the c-WWI has a bistable, ambiguous nature. Furthermore, these authors found that, when two rotating patterns were viewed simultaneously, reversed motion tended to be perceived in only one pattern at a time. This result is incompatible with the notion of discrete sampling over the entire visual field but might be explained by object-based or attention-based discrete sampling.

To constrain the range of possible explanations, we investigated this bistable phenomenon in a quantitative manner, by varying the spatial and temporal frequency of the stimulus: the illusion was found to be maximal at alternation rates around 10 Hz, independent of spatial frequency. The c-WWI occurred for first-order (luminance-modulated) as well as second-order (contrast-modulated) motion. This pattern of results can be accounted for by an energy-based model of motion perception (13) with a temporal subsampling component (at a rate of 10–20 Hz). To validate this model, we constructed ambiguous motion stimuli for which the model predicts a specific impairment of motion direction judgments around 10 Hz. This prediction was verified experimentally. Critically, this impairment at 10 Hz vanished when attention was made unavailable by a secondary task: thus, the postulated discrete perceptual snapshots seem to be attention-driven.

Methods

Motor-Mounted Disks. Sunburst patterns with 8, 16, or 32 spokes were printed on cardboard disks, mounted on the shaft of a motor, and rotated in the fronto-parallel plane under constant illumination (i.e., in daylight, without any artificial illumination). The direction and rotation rate of the motor were adjusted by the experimenter by means of a speed controller, coupled with a stepping board that sent square-pulse trains at the appropriate frequency to the motor (Minarik, Glendale, CA). The actual rotation speed of the motor was read out in real time and fed back to the controller, which automatically corrected for potential deviations from the desired speed. This closed-loop system ensured optimal accuracy of our temporal frequency sensitivity measurements. For each sunburst pattern, the rotation speed was varied systematically between 15 and 150 rpm.

Fast-Refresh-Rate Computer Monitor Version. Another version of the illusion used computer-generated stimuli displayed on a computer monitor (Dell, Round Rock, TX) at fast refresh rates. Stimulation was programmed by using the Psychophysics Toolbox in MATLAB, and the temporal accuracy was controlled within one frame (≈6–8 ms) over 60-s-long stimulation periods. One experiment used rotating radial sinusoidal patterns (8, 16, and 32 cycles) with a refresh rate of 120 Hz. Another, later experiment used horizontally drifting luminance-defined (first-order motion) or contrast-defined (second-order motion) gratings with a refresh rate of 160 Hz. The second-order gratings had static random-noise texture carriers at half the contrast used for first-order gratings, so as to minimize luminance contamination (14). First-order gratings were shown at spatial frequencies of 2, 4, and 8 cycles per degree (cpd), and second-order gratings at spatial frequencies of 1, 2, and 4 cpd; for each spatial frequency, temporal frequencies varied from 2.5 to 40 Hz.

c-WWI Reports. Subjects had to fixate all stimuli. These stimuli subtended a visual angle of <2°, so as to minimize potential spatial aliasing due to less dense retinal cell sampling in the periphery, which could be one basis for the c-WWI. For each condition (i.e., for a given motion direction and temporal and spatial frequency), the moving stimulus was viewed continuously for 60 s, while subjects reported the perceived motion direction by pressing the corresponding arrow on a computer keyboard (and holding it down until the percept alternated or disappeared). This procedure is a standard one for reporting bistable percepts (15). Subjects were encouraged to report reversals, even if weak or only transient. They were also given the option to provide no response at all when the motion direction was too ambiguous to be determined. In particular, they were instructed to refrain from responding when motion was so fast that the spatial pattern became a blur. Thus, we can be confident that any illusory reversal occurred when the spokes could be clearly distinguished. We analyzed the distributions of alternation durations (real vs. illusory direction perceived) and found, as in ref. 12, that they were well fitted by γ distributions, with a consistently smaller mean for the illusory motion direction. This finding confirms that the c-WWI effect shares certain aspects of bistable perception, with a heavy bias in favor of the real motion direction (12). We tested the effects of spatial and temporal frequency (unbalanced two-way ANOVA) on the strength of the illusory percept, taking into account only those trials where motion (either real or illusory) was reported >90% of the time. This criterion ensured that the observed effects were not a simple consequence of the low-pass behavior of motion perception in space and time.

Unbalanced Counterphase Gratings and Motion Direction Judgments. To test the c-WWI effect under more objective conditions (i.e., using physical stimulus reversals), we created moving stimuli in which ambiguity regarding the direction of motion could be manipulated. Our unbalanced counterphase gratings were composed of two luminance-modulated sinusoidal gratings of identical spatial and temporal frequencies, but of distinct contrasts, drifting in opposite directions. The contrast balance between these component gratings could be modified to manipulate ambiguity. We used a contrast balance of 60–40%, spatial frequencies of 1 and 2 cpd, and temporal frequencies of 1, 5, 10, 15, and 20 Hz. We presented these stimuli continuously for 60 s and, as previously, asked subjects to track the perceived direction of motion by using keyboard arrows. In this case, however, the direction of motion was physically reversed unpredictably (simply by reversing the contrast balance). Reversals took place gradually, by using an exponential decay with a time constant of 250 ms. We applied a γ distribution of interreversal intervals, with parameters α (mode) of 3 s and β (deviation) of 1.5 s. The distribution was truncated so that no interreversal could be smaller than 1 s. We scored the subjects' responses by using the following scheme: for each 200-ms time bin of the 60-s trial (excluding ≈1 s after each reversal, to allow for reaction time), the response was counted as +1 if subjects pressed the right arrow, –1 if they pressed the left arrow, and 0 if no arrow was pressed. We then averaged these responses separately over the stimulation periods during which the dominant direction was leftwards or rightwards: above-chance responses should thus be significantly lower in the former case than the latter. Overall motion discriminability was computed for each condition by subtracting these two scores.
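The scoring scheme described above can be sketched in Python as follows. This is a minimal illustration, not the authors' analysis code: the function name `motion_discriminability`, the per-bin input lists, and the sign convention (rightward score minus leftward score, so that a perfect observer scores 2 and an observer who never responds scores 0) are assumptions made for the example.

```python
def motion_discriminability(responses, dominant, bin_s=0.2, exclude_s=1.0):
    """Score one trial of direction tracking.

    responses: per-bin key state, +1 (right arrow), -1 (left arrow), 0 (none).
    dominant:  per-bin physical dominant direction, +1 (right) or -1 (left).
    Bins falling within `exclude_s` after each physical reversal are skipped
    to allow for reaction time.
    """
    exclude_bins = round(exclude_s / bin_s)
    since_reversal = exclude_bins  # the trial start is not treated as a reversal
    right, left = [], []
    previous = None
    for response, direction in zip(responses, dominant):
        if previous is not None and direction != previous:
            since_reversal = 0  # a physical reversal just occurred
        previous = direction
        if since_reversal >= exclude_bins:
            (right if direction > 0 else left).append(response)
        since_reversal += 1
    mean_right = sum(right) / len(right) if right else 0.0
    mean_left = sum(left) / len(left) if left else 0.0
    # Assumed sign convention: rightward-period mean minus leftward-period mean.
    return mean_right - mean_left
```

With 200-ms bins, a 60-s trial corresponds to 300 entries in each list.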

Attentional Manipulation: Rapid Serial Visual Presentation (RSVP) Letter Stream. We modified the previous paradigm to include an RSVP stream of randomly rotated letters, superimposed at the center of the unbalanced counterphase gratings (in this case the contrast balance was on average 56–44%, accounting for the fact that some of the subjects had already experienced these stimuli in the previous study and had become less sensitive to ambiguity). The letters were updated every 120 ms. Most letters were Ts, but an L could occur at each update with a probability of 2%. This Poisson distribution was further constrained by imposing a 2-s refractory period (no two Ls could appear within 2 s of each other) and a minimum of 4 Ls within each 60-s trial. Subjects tracked the dominant motion direction as before, either while ignoring this central stream (single task), or while monitoring the stream to report each occurrence of the letter L by pressing the space bar within 2 s (dual task). Performance of the letter task was measured with d′, normalized to the maximum d′ value expected given the relative numbers of targets (L) and distractors (T). To ensure that attention was focused on the letter stream equally in all conditions, we verified that the performance on this letter task was comparable at all temporal frequencies of the motion stimulus (F_4,20 = 0.4, P = 0.8). Furthermore, we report motion discrimination performance only for dual-task trials in which letter performance was higher than 40% (chance being at 0%).
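The constraints on the letter stream can be illustrated with a short Python sketch. It is not the original stimulus code (which was written with the Psychophysics Toolbox); the function name `make_rsvp_stream`, the onset-to-onset reading of the 2-s refractory period, and the choice to redraw the whole stream when fewer than four targets occur are all assumptions.

```python
import math
import random


def make_rsvp_stream(duration_s=60.0, update_s=0.120, p_target=0.02,
                     refractory_s=2.0, min_targets=4, rng=None):
    """Return a list of 'T' (distractor) and 'L' (target) letters, one per update."""
    rng = rng or random.Random()
    n_updates = round(duration_s / update_s)          # 500 updates for a 60-s trial
    min_gap = math.ceil(refractory_s / update_s)      # 17 updates span at least 2 s
    while True:
        stream = []
        last_target = None
        for i in range(n_updates):
            allowed = last_target is None or (i - last_target) >= min_gap
            if allowed and rng.random() < p_target:
                stream.append("L")
                last_target = i
            else:
                stream.append("T")
        # Assumed handling of the minimum-target rule: redraw the whole stream.
        if stream.count("L") >= min_targets:
            return stream
```

Passing a seeded generator, for example `make_rsvp_stream(rng=random.Random(0))`, makes a given stream reproducible.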

Subjects. Six subjects (including one author) participated in the motor version of the c-WWI report experiment (Fig. 1a). Four of them (including the author) also viewed the computer-generated radial stimuli (Fig. 1b). Five subjects (including two authors, plus three new naive subjects) performed this experiment with first-order and second-order moving gratings (Fig. 1 c and d). For these c-WWI experiments, each new observer was allowed a few minutes of habituation to discover the phenomenon. As reported by Kline et al. (12) and as for many other bistable percepts (15), we believe that such previous exposure is a necessary condition for bistability to arise, which might explain previous failures to replicate the c-WWI (11). Over the course of these experiments, 2 of 12 subjects were not tested because they reported at this point that they did not experience the phenomenon (one with the motor, the other with the computer versions of the illusion; one female, one male, ages 32 and 20; one trained and one untrained in psychophysics experiments). For those who were tested, the same physical direction of motion was used throughout the entire experiment (counterbalanced across subjects).

Six subjects participated in the unbalanced counterphase grating experiment (two authors, two of the previous subjects, and two new naive subjects), and five of them were further tested by using the RSVP letter stream. For these experiments, only minimal prior observation of the unbalanced counterphase grating was required to ensure that subjects perceived one dominant motion direction at the slowest temporal frequency (1 Hz). One 60-s trial of practice was also given for the central letter task, first by itself, then under dual-task conditions.

Results

Spatial and Temporal Frequency Dependence. We asked six subjects to fixate a sunburst pattern printed on a cardboard disk mounted on the shaft of a motor, continuously rotating in the fronto-parallel plane, under sunlight. We treated the c-WWI effect as a bistable percept (12) and instructed the subjects to continuously report the perceived direction of motion by using the left and right arrows on a computer keyboard. Subjects were encouraged to report illusory reversals, even if weak or only temporary. The spatial and temporal frequencies of the stimulus were varied systematically by the experimenter, and, for each 60-s trial, we measured the amount of time that each percept occurred (real or illusory direction). As shown in Fig. 1a, the actual direction of motion was clearly the dominant percept, but reversed motion was also reported for a considerable fraction of the time, confirming previous reports (9, 12). Illusory reversals tended to occur preferentially around 10 Hz (F_12,107 = 2.14, P = 0.02), but variations of spatial frequency had no significant effect on the illusion (Fig. 5, which is published as supporting information on the PNAS web site; F_2,107 = 1.79, P = 0.2). This finding implies that the c-WWI cannot be accounted for by a purely spatial subsampling artifact (such as spatial aliasing due to uneven retinal cell coverage) and restricts the range of possible explanations to those involving temporally specific mechanisms.

The c-WWI on a Computer Monitor. Using true rather than apparent motion and continuous rather than stroboscopic illumination is a compulsory step when investigating the c-WWI (9–12), but this requirement unfortunately limits the range of motion stimuli that can be used. This limitation can be overcome, however, if motion is displayed with fast enough refresh rates (i.e., with a refresh rate more than twice the highest temporal frequency to be displayed, so that the Nyquist criterion is met): on a 120-Hz monitor, for example, each cycle of a motion pattern up to 40 Hz would be defined unambiguously, by three or more frames. Thus, it should be possible to emulate the c-WWI on such a monitor, without the artifacts usually associated with motion pictures. To verify this possibility, we replicated the previous experiment on four subjects using computer-generated motion stimuli. The data, presented in Fig. 1b, are comparable with those obtained previously: a higher rate of illusory reversals for motion around 10 Hz (F_7,78 = 9.41, P < 0.0001) but no spatial frequency preference (Fig. 6, which is published as supporting information on the PNAS web site; F_2,78 = 0.5, P = 0.6). This finding opens up the possibility of investigating the c-WWI with computer-controlled stimuli. Accordingly, all following experiments were performed on a computer monitor with a 160-Hz refresh rate.
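The frame-count argument can be checked with a few lines of arithmetic. The sketch below is illustrative only; the helper names `frames_per_cycle` and `is_unambiguous` are hypothetical, and the "more than two frames per cycle" test is the standard Nyquist condition rather than anything specific to this study.

```python
def frames_per_cycle(refresh_hz, temporal_hz):
    """Number of monitor frames available to draw one cycle of the motion stimulus."""
    return refresh_hz / temporal_hz


def is_unambiguous(refresh_hz, temporal_hz):
    """True when the display itself does not alias the stimulus (more than 2 frames per cycle)."""
    return frames_per_cycle(refresh_hz, temporal_hz) > 2


# Fastest stimulus used here (40 Hz) on the two refresh rates mentioned in the text.
for refresh in (120, 160):
    print(refresh, frames_per_cycle(refresh, 40), is_unambiguous(refresh, 40))
```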

First-Order and Second-Order Motion Systems. First-order motion (a drifting modulation of pattern luminance) and second-order motion (a drifting modulation of pattern contrast) are processed in the brain by separate systems (14, 16–19). We asked whether both systems were equally affected by the c-WWI effect. To isolate spatio-temporal components of the Fourier spectrum, we used horizontally drifting vertical gratings defined by luminance (first-order, Fig. 1c) or contrast (second-order, Fig. 1d) modulation. Five subjects viewed these stimuli on a 160-Hz computer monitor under the same conditions as described. Here again, illusory reversals occurred preferentially at temporal frequencies around 10 Hz (F_4,58 = 12.7, P < 0.0001 for first-order motion; F_4,53 = 2.8, P < 0.05 for second-order motion), without a significant effect of varying the grating's spatial frequency (Fig. 7, which is published as supporting information on the PNAS web site; F_2,58 = 1.61, P = 0.2 for first-order motion; F_2,53 = 0.07, P = 0.9 for second-order motion). Because the first-order motion system is incapable of discerning second-order stimuli, the occurrence of a c-WWI with second-order motion implies that the first-order motion system is not, or not solely, responsible for this illusion. In turn, this result indicates that the mechanism underlying the c-WWI is either more general, or is generated at a relatively higher level, than first-order motion-sensitive processes.

A Temporal Subsampling Model of the c-WWI. A sinusoidal grating drifting at a constant speed gives rise to a single pair of spatio-temporal components in the Fourier domain (Fig. 2a). When the same input stream is subsampled in time, however, spurious motion components appear in the Fourier spectrum (Fig. 2b). This is the basis of the classical wagon wheel illusion usually observed in movies. For the same reason, illusory motion reversals are expected to occur if the visual system itself processes the perceptual stream in a sequence of discrete snapshots. To quantitatively account for the data reported in Fig. 1 c and d, and in particular for the illusory reversal peaks around 10 Hz, we calculated that these snapshots must be taken at a rate of ≈15 Hz. To simulate intersubject and intertrial variability, we presume that this rate is not fixed but is drawn randomly from a Gaussian distribution with a mean of 15 Hz and a standard deviation of 3 Hz. This distribution implies that >90% of these snapshots occur at a rate between 10 and 20 Hz. Simulations were repeated several hundred times, each time by using a new sampling frequency, and the results were averaged together. We use the low-pass envelope of the data reported in Fig. 1 c and d (filled diamonds; see also Fig. 7) to estimate our subjects' overall motion sensitivity to first-order and second-order stimuli, respectively. These envelopes are then applied as a “diamond-shaped damping” (based on an arc-tangent function) in Fourier space to limit the range of spatio-temporal components that can participate in a given percept (13). For the range of spatial frequencies used here, the limiting spatial frequency was not a critical parameter in the model: we used a limit frequency of 16 cpd. The limiting temporal frequency was derived from our experimental data to match that observed for first-order (limit frequency 36 Hz) and second-order motion (limit frequency 18 Hz), respectively.
For a particular temporal frequency, this discrete sampling model can predict the respective intensities of the percepts corresponding to the actual and the opposite direction of motion: we simply sum motion energy over the corresponding quadrants of the Fourier spectrum (Fig. 2 a and b). By assuming that this discrete sampling mechanism contributes 50% of the entire motion percept (the remaining 50% arising from a similar motion energy model with no subsampling), we obtain theoretical predictions of the subjects' percepts for first- and second-order motion that closely resemble the experimental data (Fig. 2c; compare with Fig. 1 c and d). It is worth noting that the same underlying model was used here to predict first-order and second-order motion percepts, the only difference residing in the low-pass envelopes derived from the experimental data.
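The aliasing that underlies this model can be illustrated with a much simpler calculation than the full motion-energy simulation. The sketch below is a deliberate simplification, not the model used in the paper: it assumes a single fixed sampling rate and returns only the lowest-frequency alias of a drifting grating, ignoring the sensitivity envelopes and the summation of energy over Fourier quadrants. The function names are hypothetical.

```python
def aliased_frequency(stimulus_hz, sampling_hz):
    """Lowest-magnitude temporal frequency consistent with the sampled stimulus.

    A negative result means the sampled pattern is consistent with motion in
    the direction opposite to the physical one.
    """
    return stimulus_hz - sampling_hz * round(stimulus_hz / sampling_hz)


def is_reversed(stimulus_hz, sampling_hz):
    """True when the lowest-frequency alias points in the reversed direction."""
    return aliased_frequency(stimulus_hz, sampling_hz) < 0


# Example: snapshots at 15 Hz, a few of the stimulus frequencies used here.
for f in (2.5, 5, 10, 20):
    print(f, aliased_frequency(f, 15), is_reversed(f, 15))
```

Under these assumptions, a 10-Hz stimulus sampled at 15 Hz has its lowest alias at 5 Hz in the reversed direction, whereas 2.5-Hz and 5-Hz stimuli are not reversed.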

Fig. 2. — Motion energy and temporal subsampling. (a) A 1D horizontally drifting sinusoidal grating as shown in the space-time plot on the left is represented in the Fourier domain by a pair of spatio-temporal components, in diagonally opposite quadrants (*Right*). (b) When the same motion stimulus is subsampled in time, spurious motion components appear in the Fourier spectrum. We estimate the intensity of motion perception in the actual and opposite directions as the sum of motion energies over the corresponding quadrants of the Fourier spectrum (marked in the figure as “real” and “illusory” motion, respectively). (c) Assuming that this simple subsampling mechanism contributes half of the total motion percept, and using low-pass motion sensitivity envelopes derived from experimental data, the model predicts rates of illusory reversals for 1st-order and 2nd-order motion that closely resemble those observed experimentally (compare with Fig. 1 c and d).

Why should our discrete sampling process contribute only half of the final motion percept? One possibility might be that only one of the two or three commonly postulated motion systems (14, 16, 17) is temporally subsampled. Because the c-WWI was observed even with second-order motion, only the second- and/or third-order motion systems remain as viable candidates. Alternatively, motion perception might be affected by another factor altogether: attention, which was not controlled in our previous experiments, constitutes an excellent candidate in this case. These two explanations might in fact be congruent, because second-order (20, 21) and third-order (14, 22) motion processes have been found to engage attentional resources.

Unbalanced Counterphase Gratings: An Objective Measure of the c-WWI. So far, our measurements of the c-WWI have relied on subjective reports: our subjects were perfectly aware that motion was to remain constant for the duration of each trial and were encouraged to report even weak impressions of reversed motion. The consistency of these reports within and between subjects is a fair indication that the phenomenon is genuine, and not a byproduct of our subjects' imagination or benevolence. Nevertheless, subjects could always accurately report the actual motion direction: actual reversals of the moving stimulus are easily distinguishable from illusory reversals. This difference might be because, as shown by our motion energy model, the intensity of the actual motion percept is always larger than the strongest illusory motion percept. Under these conditions, bistable percepts are heavily biased toward the dominant interpretation, and much competition (e.g., longer viewing times) is needed to induce reversals (15, 23). However, this bias might be reduced or removed if further ambiguity is added to the stimulus. We thus constructed stimuli within which motion direction ambiguity could be directly manipulated.

A counterphase grating denotes a pair of superimposed luminance gratings of the same spatial frequency, temporal frequency, and contrast, but drifting in opposite directions. Such gratings are generally perceived as stationary flicker, with no dominant direction of motion (24, 25). However, by increasing the contrast of one member of the pair, its direction of motion can be made dominant (Fig. 3a). Ambiguity can be manipulated by changing the relative contrast of the two component gratings (26, 27). We label this class of stimuli “unbalanced counterphase gratings.” This stimulus-induced ambiguity is now superimposed onto the ambiguity due to temporal subsampling (if any). According to our motion energy model, there should exist a range of ambiguities for which motion direction judgments are specifically impaired around 10 Hz, even when using forced-choice, objective reports.
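One way to see why the stronger component determines the dominant direction is to note that an unbalanced counterphase grating is mathematically identical to a balanced counterphase grating (pure flicker) plus a weaker grating drifting in the dominant direction. The Python sketch below checks this trigonometric identity numerically. It assumes that a "60–40%" balance means component amplitudes of 0.6 and 0.4; the function names and default frequencies are illustrative choices.

```python
import math


def unbalanced_counterphase(x, t, c_right=0.6, c_left=0.4, sf=1.0, tf=10.0):
    """Sum of a rightward and a leftward drifting sinusoid (x in deg, t in s)."""
    px = 2 * math.pi * sf * x
    pt = 2 * math.pi * tf * t
    return c_right * math.cos(px - pt) + c_left * math.cos(px + pt)


def drift_plus_flicker(x, t, c_right=0.6, c_left=0.4, sf=1.0, tf=10.0):
    """Same stimulus written as net drift plus stationary counterphase flicker."""
    px = 2 * math.pi * sf * x
    pt = 2 * math.pi * tf * t
    net_drift = (c_right - c_left) * math.cos(px - pt)       # drifts rightward
    flicker = 2 * c_left * math.cos(px) * math.cos(pt)       # no net direction
    return net_drift + flicker


# The two formulations agree at arbitrary points in space and time.
for x, t in [(0.0, 0.0), (0.13, 0.021), (0.5, 0.4), (1.7, 0.333)]:
    assert abs(unbalanced_counterphase(x, t) - drift_plus_flicker(x, t)) < 1e-12
```

With amplitudes of 0.6 and 0.4, the net drifting component has an amplitude of only 0.2, against a flicker component of amplitude 0.8.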

Fig. 3. — Objective measure of the c-WWI. (a) Unbalanced counterphase gratings allow for a direct manipulation of motion ambiguity. The physical stimulus is composed of two superimposed Gabor patches of the same spatial and temporal frequency, but slightly different contrasts (here with a balance of 60–40%), drifting in opposite directions. It is evident from the space-time plot and the corresponding Fourier transform that the motion direction is intrinsically ambiguous, with one direction of motion dominating. (b) This unbalanced counterphase grating was presented continuously for 60 s, physically reversing its dominant direction every 4.5 s on average (γ-distributed). Subjects (n = 6) could easily follow the dominant direction at low or high temporal frequencies: they reliably pressed the left arrow when the dominant direction was leftwards, and the right arrow when it was rightwards (*Left*). However, at temporal frequencies around 10 Hz, direction judgments were unreliable for both dominant directions. The summary in *Right* (a subtraction of the two curves in *Left*) shows that motion direction judgments are more vulnerable to ambiguity around 10 Hz, as predicted by our discrete sampling account.

We asked six subjects to continuously follow, using keyboard arrows, the dominant direction of an unbalanced counterphase grating (where one component grating had two-thirds of the other one's contrast), which physically reversed unpredictably, on average every 4.5 s (γ-distributed). For this particular level of ambiguity, all subjects performed well at low (1 Hz) or high (20 Hz) temporal frequencies. However, when the same stimulus drifted at around 10 cycles per second, motion direction judgments were dramatically impaired, as predicted (Fig. 3b; F_4,50 = 7.6, P < 0.0001). The same result was replicated with a spatial frequency one octave higher (Fig. 8, which is published as supporting information on the PNAS web site). There was no significant effect of spatial frequency on motion direction judgments (F_1,50 = 0.4, P = 0.5), and no interaction between spatial and temporal frequency (F_4,50 = 0.6, P = 0.6). Thus, the observed impairment cannot be explained by the relative inefficiency of one particular spatial or spatio-temporal (e.g., velocity-tuned) motion channel (28, 29). We suggest instead that this impairment is due to the proposed discrete sampling mechanism and constitutes another, more objective manifestation of the c-WWI phenomenon.
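The unpredictable reversal schedule can be sketched as follows. This is an assumption-laden illustration rather than the original stimulus code: it treats the two γ parameters given in Methods (3 and 1.5 s) as the shape and scale of a standard gamma distribution, whose untruncated mean of 3 × 1.5 = 4.5 s matches the average quoted above, and it implements the 1-s truncation by redrawing short intervals. The function name `reversal_times` is hypothetical.

```python
import random


def reversal_times(trial_s=60.0, shape=3.0, scale_s=1.5,
                   min_interval_s=1.0, rng=None):
    """Return the times (s) at which the dominant direction reverses in one trial."""
    rng = rng or random.Random()
    times = []
    t = 0.0
    while True:
        interval = rng.gammavariate(shape, scale_s)  # shape, scale parameterization
        if interval < min_interval_s:
            continue  # truncation: redraw intervals shorter than the minimum
        t += interval
        if t >= trial_s:
            return times
        times.append(t)
```

Because short intervals are redrawn, the mean interval under this scheme is slightly longer than the untruncated 4.5 s.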

Attention-Dependent Motion Illusion. We superimposed on our unbalanced counterphase gratings a central stream (RSVP) of randomly rotated single letters, replaced every 120 ms (Fig. 4a). Most of these letters were Ts, but an L occurred occasionally. We replicated the previous motion judgment experiment with these stimuli under two conditions: observers either ignored the central RSVP stream (and in this case they were free to attend to the motion stimulus) or they were required to monitor the RSVP stream and report each occurrence of the letter L, while simultaneously following the dominant motion direction. In this latter case, the RSVP task was prioritized, and thus it is reasonable to assume that fewer attentional resources were available to the motion stimulus. Accordingly, motion direction judgments were slightly impaired in this dual-task condition for low (1 Hz) and high (20 Hz) temporal frequencies (Fig. 4b). However, for motion alternation rates around 10 Hz, the large impairment observed previously almost vanished when motion was unattended (Fig. 4b). Statistical tests revealed, as before, a significant main effect of temporal frequency on the accuracy of motion direction judgments (F_4,40 = 7.3, P < 0.001), along with the absence of a main effect of attentional condition (motion attended vs. unattended, F_1,40 = 0.4, P = 0.5). Critically, attention interacted significantly with the effects of temporal frequency (F_4,40 = 3.1, P = 0.02). Motion direction judgments at 10 Hz were actually worse when focal attention was directed to the moving stimulus (paired t test, t₄ = 4.5, P = 0.01): to our knowledge, this is one of very few known instances (30) where focal visual attention is found to reduce psychophysical performance.

Fig. 4. — Discrete sampling depends on focal attention. (a) We added a rapid stream of randomly rotated letters to our unbalanced counterphase gratings (here, with a contrast balance of 56–44%). Most letters were Ts, but an L was presented occasionally. (b) Notations as in Fig. 3b. Subjects (n = 5) were told to follow the dominant direction of motion under two conditions: either while ignoring the central letter stream (“motion attended” condition) or while monitoring the letter stream to report occurrences of the letter L (“motion unattended” condition). The selective impairment of motion direction judgments around 10 Hz predicted by the discrete sampling model was replicated here in the motion attended condition (open circles). However, this impairment was still visible but much decreased when motion was unattended (filled squares). At 10 Hz, motion direction judgments were more accurate in the absence of focal attention.

Discussion

It was proposed (10, 12) that the c-WWI effect arises not as a result of discrete sampling, but as an emergent property of Reichardt-based motion detectors (31) in the visual system. Each Reichardt motion detector selectively responds to a particular conjunction of spatial separation and temporal delay; a periodic stimulus (such as a grating) moving in its antipreferred direction, but at the appropriate speed, can thus trigger the spurious activation of the detector, and this activation might explain the illusion of reversed motion (10, 12). It can be demonstrated that a motion system in which space and time are uniformly tiled by elaborated Reichardt detectors is functionally equivalent to a motion energy model (13, 32), and can account for human performance in several psychophysical situations. However, Reichardt detectors are tuned to velocity, rather than temporal frequency. Therefore, this account would have no reason a priori to predict a preferred temporal frequency for the c-WWI effect; given the existence of such a temporal dependence, it would predict that this preferred temporal frequency should increase as the spatial frequency of the stimulus is increased (so as to keep velocity constant). Both these predictions are incompatible with the present data. In addition, motion computation in Reichardt detectors is assumed to be an automatic process, difficult to reconcile with the strong attentional effect observed here.

It is also unlikely that our data represent a simple artifact due to our subjects' eye movements. First, in a pilot experiment on two subjects (IScan IR eye-tracker, 120-Hz sampling rate), we found no systematic shift in eye position or in the frequency distribution of horizontal eye movements around the time of illusory motion reversals of a single vertical drifting grating (data not shown). Second, the temporal frequency at which the discrete sampling was most visible (10 Hz) is about an order of magnitude lower than eye movement-induced artifacts typically observed (33) for moving luminance gratings (which peak at temporal frequencies higher than 50 Hz). Furthermore, Kline et al. (12) also reported little or no correlation between eye movements and the occurrence of the illusory motion reversals in the c-WWI.

Our dual-task experiment suggests that selective attention is a critical factor in the c-WWI: when focal attention is not directed to a moving stimulus, the manifestations of discrete sampling are almost abolished. Thus, the postulated discrete sampling of the perceptual stream seems to be attention-driven. Finding that discrete sampling is driven by attention resolves several open issues regarding the c-WWI phenomenon. First, it provides a simple explanation for the fact that discretely sampled motion contributes only half of the global motion percept in our model. Only the attention-based motion-processing system(s), i.e., the second-order (20, 21) and/or third-order (14, 22) systems, but not the first-order system (34), would be affected by discrete sampling. In addition, this finding elucidates why illusory reversals tend to occur in only one object at a time (12): this would be the object currently under the focus of attention. Finally, this dependence on selective, focal attention might explain the ethereal and evanescent appearance of the c-WWI, and why it differs so much, subjectively, from the simpler stroboscopic version of the wagon wheel illusion.

What could be the neural substrates of the proposed discrete snapshots? Cortical oscillations are an obvious candidate (8, 35, 36), but the estimated rate of 10–20 Hz does not correspond to any conventional frequency band of the cortical oscillations. However, we must emphasize that our model used a unimodal distribution of subsampling rates (with a mean at 15 Hz) only for the sake of simplicity. Similar simulations using a more complex, bimodal sampling rate distribution (with peaks at 11 Hz and 19 Hz, i.e., in the α and β frequency bands, respectively) can also account for the quantitative features of the c-WWI. Both the α and β frequency bands have previously been associated with discrete perceptual sampling (2–4, 7). Selective attention has complex effects on these oscillatory rhythms: a decrease (37–39) or an increase (40, 41) of oscillatory power in the α band, and increases in the β (42, 43) and γ bands (44). It remains to be seen how these global and local changes of oscillatory activity could interact with motion perception within the focus of attention. Our simple model will hopefully constitute a stepping stone for future exploration of these interactions.
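To illustrate why sampling at 10–20 Hz could produce reversals for stimuli alternating near 10 Hz, the following sketch applies the standard aliasing relation from sampling theory to a periodic stimulus. It is not the motion-energy model used in this study: the function names are hypothetical, and a single fixed sampling rate is assumed, whereas the model drew rates from a distribution.

```python
import math


def aliased_frequency(stimulus_hz: float, sampling_hz: float) -> float:
    """Apparent temporal frequency of a periodic stimulus after sampling.

    The stimulus frequency is folded onto the nearest multiple of the
    sampling rate. A negative result means the apparent motion is reversed.
    At exactly half the sampling rate the direction is ambiguous; this
    sketch arbitrarily reports it as negative.
    """
    nearest_multiple = math.floor(stimulus_hz / sampling_hz + 0.5)
    return stimulus_hz - nearest_multiple * sampling_hz


def snapshot_interval_ms(sampling_hz: float) -> float:
    """Time between successive snapshots, in milliseconds."""
    return 1000.0 / sampling_hz


if __name__ == "__main__":
    stimulus_hz = 10.0  # alternation rate at which the illusion is maximal
    for rate in (11.0, 15.0, 19.0):  # sampling rates mentioned in the text
        alias = aliased_frequency(stimulus_hz, rate)
        direction = "reversed" if alias < 0 else "veridical"
        print(f"sampling {rate:4.1f} Hz "
              f"(every {snapshot_interval_ms(rate):5.1f} ms): "
              f"apparent {alias:+5.1f} Hz ({direction})")
```

In this simplified picture a 10-Hz stimulus sampled at 15 Hz aliases to an apparent 5 Hz in the opposite direction, whereas a 5-Hz stimulus sampled at the same rate keeps its true direction.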

To conclude, the c-WWI is a consistent phenomenon observed at alternation rates around 10 Hz for a wide range of stimuli, including second-order motion. It is also visible in objective psychophysical measurements using ambiguous motion stimuli. These effects can be explained by assuming that a large part of our motion perception is derived from discrete attentional “snapshots” taken every 50–100 ms. To what extent such discrete processing occurs in other visual processing modules that require selective, focal attention, as suggested by some visual search experiments (45), remains an open question. Nonetheless, the observation that attention-mediated perception operates in discrete epochs, if generalized, could have far-reaching implications for human psychology as well as for everyday life (46).

Supplementary Material

Supporting Figures

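pnas_102_14_5291__.html (3.5 KB, html)

pnas_102_14_5291__1.pdf (403.4 KB, pdf)

pnas_102_14_5291__2.pdf (403.4 KB, pdf)

pnas_102_14_5291__3.pdf (396.9 KB, pdf)

pnas_102_14_5291__4.pdf (88 KB, pdf)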

Acknowledgments

This work was greatly inspired and enriched by ideas from the late Francis Crick. We thank Caitlin Berry for substantial help with pilot studies. This research was supported by the Centre National de la Recherche Scientifique, the National Science Foundation Engineering Research Center, the W. M. Keck Foundation Fund, the Gordon Moore Foundation, and the Swartz Foundation for Computational Neuroscience.

Author contributions: R.V., L.R., and C.K. designed research; R.V. and L.R. performed research; R.V. and L.R. analyzed data; and R.V. wrote the paper.

This paper was submitted directly (Track II) to the PNAS office.

Abbreviations: c-WWI, continuous wagon wheel illusion; RSVP, rapid serial visual presentation.

References

1. James, W. (1890) The Principles of Psychology (Holt, New York).
2. Stroud, J. M. (1956) in Information Theory in Psychology, ed. Quastler, H. (Free Press, Chicago), pp. 174–205.
3. Harter, M. R. (1967) Psychol. Bull. 68, 47–58.
4. Allport, D. A. (1968) Br. J. Psychol. 59, 395–406.
5. Crick, F. & Koch, C. (2003) Nat. Neurosci. 6, 119–126.
6. Andrews, T. J., White, L. E., Binder, D. & Purves, D. (1996) Proc. Natl. Acad. Sci. USA 93, 3689–3692.
7. Arnold, D. H. & Johnston, A. (2003) Nature 425, 181–184.
8. VanRullen, R. & Koch, C. (2003) Trends Cogn. Sci. 7, 207–213.
9. Purves, D., Paydarfar, J. A. & Andrews, T. J. (1996) Proc. Natl. Acad. Sci. USA 93, 3693–3697.
10. Schouten, J. F. (1967) in Models for the Perception of Speech and Visual Form, ed. Wathen-Dunn, I. (MIT Press, Cambridge, MA), pp. 44–45.
11. Pakarian, P. & Yasamy, M. T. (2003) Perception 32, 1307–1310.
12. Kline, K., Holcombe, A. O. & Eagleman, D. M. (2004) Vision Res. 44, 2653–2658.
13. Adelson, E. H. & Bergen, J. R. (1985) J. Opt. Soc. Am. A 2, 284–299.
14. Lu, Z. L. & Sperling, G. (2001) J. Opt. Soc. Am. A 18, 2331–2370.
15. Blake, R. & Logothetis, N. K. (2002) Nat. Rev. Neurosci. 3, 13–21.
16. Lu, Z. L. & Sperling, G. (1995) Vision Res. 35, 2697–2722.
17. Seiffert, A. E. & Cavanagh, P. (1998) Vision Res. 38, 3569–3582.
18. Baker, C. L., Jr. (1999) Curr. Opin. Neurobiol. 9, 461–466.
19. Vaina, L. M. & Cowey, A. (1996) Proc. R. Soc. Lond. Ser. B 263, 1225–1232.
20. Cavanagh, P. (1992) Science 257, 1563–1565.
21. Ashida, H., Seiffert, A. E. & Osaka, N. (2001) J. Opt. Soc. Am. A 18, 2255–2266.
22. Lu, Z. L. & Sperling, G. (1995) Nature 377, 237–239.
23. Levelt, W. (1965) On Binocular Rivalry (Institute for Perception RVO–TNO, Soesterberg, The Netherlands).
24. Kelly, D. H. (1971) J. Opt. Soc. Am. 61, 632–640.
25. Levinson, E. & Sekuler, R. (1975) J. Physiol. (London) 250, 347–366.
26. Stromeyer, C. F., 3rd, Kronauer, R. E., Madsen, J. C. & Klein, S. A. (1984) J. Opt. Soc. Am. A 1, 876–884.
27. Ledgeway, T. (1994) Vision Res. 34, 2879–2889.
28. Pantle, A. J. & Sekuler, R. W. (1968) Vision Res. 8, 445–450.
29. Pantle, A., Lehmkuhle, S. & Caudill, M. (1978) Perception 7, 261–267.
30. Yeshurun, Y. & Carrasco, M. (1998) Nature 396, 72–75.
31. Reichardt, W. (1961) in Sensory Communication, ed. Rosenblith, W. A. (MIT Press, Cambridge, MA), pp. 303–317.
32. van Santen, J. P. & Sperling, G. (1985) J. Opt. Soc. Am. A 2, 300–321.
33. Kelly, D. H. (1990) J. Opt. Soc. Am. A 7, 2237–2244.
34. Allen, H. A. & Ledgeway, T. (2003) Vision Res. 43, 2927–2936.
35. Salinas, E. & Sejnowski, T. J. (2001) Nat. Rev. Neurosci. 2, 539–550.
36. Llinas, R. (2001) I of the Vortex: From Neurons to Self (MIT Press, Cambridge, MA).
37. Walsh, E. G. (1953) J. Physiol. (London) 120, 155–159.
38. Bunnell, D. E. (1982) Int. J. Neurosci. 17, 39–42.
39. Pfurtscheller, G., Neuper, C. & Mohl, W. (1994) Int. J. Psychophysiol. 16, 147–153.
40. Klimesch, W. (1999) Brain Res. Brain Res. Rev. 29, 169–195.
41. Yamagishi, N., Callan, D. E., Goda, N., Anderson, S. J., Yoshida, Y. & Kawato, M. (2003) NeuroImage 20, 98–113.
42. Wrobel, A. (2000) Acta Neurobiol. Exp. (Wars) 60, 247–260.
43. Vazquez Marrufo, M., Vaquero, E., Cardoso, M. J. & Gomez, C. M. (2001) Brain Res. Cogn. Brain Res. 12, 315–320.
44. Fries, P., Reynolds, J. H., Rorie, A. E. & Desimone, R. (2001) Science 291, 1560–1563.
45. Dehaene, S. (1993) Psychol. Sci. 4, 264–270.
46. Sacks, O. (August 23, 2004) The New Yorker, pp. 78–87.
