Author manuscript; available in PMC: 2015 Apr 13.
Published in final edited form as: Psychol Sci. 2011 Aug 9;22(9):1132–1137. doi: 10.1177/0956797611418346

Common-Fate Grouping as Feature Selection

Brian R Levinthal 1, Steven L Franconeri 1
PMCID: PMC4395002  NIHMSID: NIHMS675329  PMID: 21828350

Abstract

The visual system groups elements within the visual field that are physically separated yet similar to each other. Although grouping processes have been intensely studied for a century, the mechanisms of grouping remain elusive. We propose that a primary mechanism for grouping by common fate is attentional selection of a direction of motion. A unique prediction follows from this account: that the visual system must be limited to forming only a single common-fate group at a time, and that attempts to find a particular common-fate group among other groups, or among nongroups, should therefore be highly inefficient. We show that this is true in searches for vertically oriented groups of moving dots among horizontally oriented groups (Experiment 1) and in searches for motion-linked groups among nonlinked objects (Experiment 2). Feature selection may limit the visual system to the construction of only one common-fate group at a time, and thus the experience of simultaneous grouping may be an illusion.

Keywords: visual perception, perceptual organization, attention, selection


The human visual system breaks an incoming image into separate elements, but also reassembles these elements into groups. Grouping processes have been studied for a century, and much of this work has focused on classifying different types of groups, determining when grouping occurs (S. E. Palmer, 2002), and measuring the relative strength of different forms of grouping (Kubovy & van den Berg, 2008). Although neurally plausible mechanisms have been proposed for some types of grouping (e.g., contour grouping; Roelfsema, 2006), for many other types of grouping, such explanations remain elusive. We propose a mechanism for the Gestalt principle of common fate, according to which objects appear grouped when they display the same pattern of motion. We argue that a primary mechanism underlying this powerful form of grouping is the selection of a direction of motion.

This parsimonious explanation is consistent with previous demonstrations that the selection of a direction of motion (as well as of features such as color and orientation) can occur in parallel across the visual field (Maunsell & Treue, 2006; Saenz, Buracas, & Boynton, 2002). Selecting a direction of motion could produce a spatially organized map of locations in the visual field that correspond to the selected direction (see Fig. 1); “peaks” in this map would correspond to the locations of all elements moving in that direction (see Huang & Pashler, 2007, for a similar account). The initial selection of a direction of motion could be based on a statistical summary of the motion patterns within a display (Williams & Sekuler, 1984) or on the direction of motion of a single selected object. Once a group has been created, changes to its direction of motion could be updated by a simple feedback loop that modifies the selected direction of motion (see Martinez-Trujillo, Cheyne, Gaetz, Simine, & Tsotsos, 2007, for evidence of motion-direction updating during the tracking of translating objects).

Fig. 1.


Illustration of a feature-selection mechanism for common-fate grouping. The flashlights at the top illustrate whether or not a direction of motion has been selected. If an observer did not select a direction of motion (illustration at the left), all moving elements in a display would be processed together, and there would be no common-fate grouping. If an observer selected rightward motion (illustration at the right), target elements would be enhanced and form a common-fate group. Changes in the direction of selected motion could be updated via feedback connections within motion-sensitive areas of the visual system (e.g., Martinez-Trujillo, Cheyne, Gaetz, Simine, & Tsotsos, 2007).

Such a map would provide two critical components of the grouping process. First, it would help isolate processing of features and identities to the objects at the locations of the peaks in the map (Treisman & Gelade, 1980). Simultaneously selecting these locations might lead them to appear to belong together, even though they are spatially separated (Xu, 2008). Simultaneous selection might also provide a summary representation (e.g., Alvarez & Oliva, 2009) of other features at these locations, leading to a holistic representation of the otherwise separate objects. For example, selecting a set of rightward-moving objects could facilitate judgments about the number of those objects (Halberda, Sires, & Feigenson, 2006) or the distribution of their sizes (Chong & Treisman, 2005). Second, because any area of the visual field containing the selected feature would result in a peak in the spatially organized map, the complex distribution of the selected locations would convey the shape of the group (Huang & Pashler, 2007).
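The map described above can be illustrated with a minimal sketch. It assumes each element is summarized by a position and a direction of motion, and that selection admits directions within a tolerance of the selected one. The `Dot` class, the `selection_map` function, and the 15° tolerance are hypothetical names and values chosen for illustration; they are not part of the proposed account.

```python
from dataclasses import dataclass

@dataclass
class Dot:
    x: float          # horizontal position (deg of visual angle)
    y: float          # vertical position (deg of visual angle)
    direction: float  # direction of motion (deg, 0 = rightward)

def angular_difference(a, b):
    """Smallest absolute difference between two directions, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def selection_map(dots, selected_direction, tolerance=15.0):
    """Return the locations ("peaks") of dots moving in the selected direction.

    With selected_direction=None no direction is selected, so no group forms.
    The tolerance is an illustrative assumption, not a value from the article.
    """
    if selected_direction is None:
        return []
    return [(d.x, d.y) for d in dots
            if angular_difference(d.direction, selected_direction) <= tolerance]

dots = [Dot(0, 0, 0), Dot(3, 1, 355), Dot(1, 4, 90), Dot(5, 5, 180), Dot(2, 2, 5)]
rightward_group = selection_map(dots, 0.0)  # locations of the rightward-moving dots
```

Because the returned peaks carry only locations, two directions selected together would yield one undifferentiated set of locations; in this sketch a second group requires a second call with a different direction.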

The mechanism we propose is consistent with a powerful property of common-fate grouping: A group can be constructed in a massively parallel operation in which all elements across the visual field that share a feature are selected. However, our account also suggests that common-fate grouping should have a salient weakness: Peaks on a spatially organized map should be distinguishable only by their location, so the visual system should be able to construct only one group at a time. This prediction contrasts with observers’ experience that they see multiple common-fate groups simultaneously. We report the results of two visual search experiments that demonstrate that there are severe limitations on the number of common-fate groups that can be constructed at once.

Experiment 1

In Experiment 1, we asked participants to perform a visual search for a vertically oriented group among horizontally oriented groups; groups consisted of pairs of dots that were moving in the same direction. Because searches for objects with a unique orientation are typically efficient (e.g., Treisman & Gelade, 1980), this search task is well suited for determining whether common-fate groups can be formed in parallel. To evaluate participants’ efficiency in performing this task, we included a control condition, in which the elements in each common-fate group were connected by a 1-pixel line.

Method

Ten participants (ages 18 to 21) took part in this study. All were either volunteers from the Northwestern University student population or paid participants from the Evanston, Illinois, community.

Stimuli were presented on a 17-in. CRT monitor (resolution of 1024 × 768 pixels, 85-Hz refresh rate) and were generated using MATLAB (The MathWorks, Natick, MA) and the Psychophysics Toolbox (Brainard, 1997).

Displays consisted of pairs of moving dots against a static background of 250 randomly positioned dots (approximately 0.3 dots per square degree). The two dots in each pair moved within a square region measuring 5.2° on each side, maintained a constant distance (2°) from each other throughout the trial, and moved at a constant velocity (2.65°/s). Each dot pair was assigned an initial random angle of motion and moved along a straight path until one of the dots reached the invisible boundary of the square region, at which point the pair’s angle of motion was reflected. On each trial, two, four, or eight dot pairs were placed in adjacent positions along an imaginary circle with an 8° radius.
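The motion rule for a single dot pair can be sketched as follows. This is a rough illustration, not the original MATLAB stimulus code: it assumes that "reflected" means a mirror reflection of the velocity component perpendicular to the boundary that was reached, and it updates the pair once per frame at the 85-Hz refresh rate. The function names are hypothetical.

```python
import math

REGION = 5.2        # side of the square region (deg), from the text
SEPARATION = 2.0    # constant distance between the two dots (deg)
SPEED = 2.65        # speed of the pair (deg/s)
FRAME_RATE = 85.0   # monitor refresh rate (Hz)

def step_pair(cx, cy, angle, vertical, dt=1.0 / FRAME_RATE):
    """Advance the pair's midpoint by one frame, reflecting at the region boundary.

    (cx, cy) is the midpoint, measured from the region's lower-left corner;
    angle is the direction of motion in radians.
    """
    half_w = 0.0 if vertical else SEPARATION / 2.0
    half_h = SEPARATION / 2.0 if vertical else 0.0
    vx, vy = SPEED * math.cos(angle), SPEED * math.sin(angle)
    nx, ny = cx + vx * dt, cy + vy * dt
    # Assumed rule: flip the velocity component that would carry a dot past an edge.
    if nx - half_w < 0.0 or nx + half_w > REGION:
        vx = -vx
        nx = cx + vx * dt
    if ny - half_h < 0.0 or ny + half_h > REGION:
        vy = -vy
        ny = cy + vy * dt
    return nx, ny, math.atan2(vy, vx)

def dot_positions(cx, cy, vertical):
    """Positions of the two dots of a vertically or horizontally oriented pair."""
    h = SEPARATION / 2.0
    if vertical:
        return (cx, cy - h), (cx, cy + h)
    return (cx - h, cy), (cx + h, cy)
```

Calling `step_pair` repeatedly from a starting midpoint inside the region keeps both dots inside the square while preserving their 2° separation and constant speed.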

Trials were blocked into two conditions. In the grouped condition, participants were asked to determine whether a target (i.e., a vertically oriented dot pair) was present or absent in each display; the dots in a display continued to move until participants responded via key press. In the connected (control) condition, the task was identical, except that the dots within each pair were connected by a 1-pixel line. Figure 2a depicts the experimental task. In target-absent trials (50% of trials), all dot pairs were arranged horizontally. In target-present trials (50% of trials), a single pair of dots was arranged vertically. Movies of sample trials are available online at http://viscog.psych.northwestern.edu/.

Fig. 2.


Experimental stimuli and results. In Experiment 1, each display contained two, four, or eight dot pairs defined by common movement (a), and participants searched for a vertically oriented pair (highlighted here by the gray shading) among horizontally oriented pairs; the graph (b) shows response time in the two conditions as a function of the number of dot pairs in the display. In Experiment 2, each display contained five or nine dot pairs; the dots in distractor dot pairs moved 180° out of phase, and the dots in target pairs moved in phase (c). Participants indicated whether a target dot pair was absent or present on each trial. The graph (d) shows response time in the two conditions as a function of the number of dot pairs in the display. Error bars indicate standard errors of the mean.

Results

Figure 2b depicts the results for Experiment 1. According to our account, if only one common-fate group is created at a time, adding more groups to the display should substantially increase response times. Indeed, response times in the grouped condition increased with the number of distractor pairs, and the search slope of 55 ms per pair was significantly different from 0, t(9) = 17.3, p < .001. When common-fate groups were replaced by connectivity groups (connected condition), which can be processed in parallel (Franconeri, Bemis, & Alvarez, 2009; Rensink & Enns, 1995), adding more groups to the display did not substantially affect response times (search slope = 6 ms/pair), t(9) = 2.7, p = .023. Furthermore, searches among common-fate groups were significantly less efficient than searches among connectivity groups, t(9) = 12.3, p < .001. Thus, the costs of searching through the common-fate groups were attributable to how they were constructed, as the two conditions did not differ in display complexity or the shape-identification requirements of the search task.
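For readers unfamiliar with search-slope analyses, the following sketch shows one conventional way such statistics can be computed: a least-squares slope of response time against set size for each participant, followed by a one-sample t test of the slopes against zero (a flat search function). The article does not describe its analysis code, so this procedure is an assumption, and the response times below are invented for illustration; they are not data from the study.

```python
import math
from statistics import mean, stdev

def search_slope(set_sizes, rts_ms):
    """Least-squares slope of response time (ms) against set size (ms per pair)."""
    mx, my = mean(set_sizes), mean(rts_ms)
    sxy = sum((x - mx) * (y - my) for x, y in zip(set_sizes, rts_ms))
    sxx = sum((x - mx) ** 2 for x in set_sizes)
    return sxy / sxx

def one_sample_t(values, mu=0.0):
    """t statistic and degrees of freedom for the null hypothesis mean == mu."""
    n = len(values)
    t = (mean(values) - mu) / (stdev(values) / math.sqrt(n))
    return t, n - 1

SET_SIZES = [2, 4, 8]  # dot pairs per display, as in Experiment 1

# Hypothetical mean RTs (ms) at each set size for three made-up participants.
# These numbers are NOT data from the study.
hypothetical_rts = [
    [600, 680, 840],
    [650, 750, 950],
    [700, 820, 1060],
]

slopes = [search_slope(SET_SIZES, rts) for rts in hypothetical_rts]
t, df = one_sample_t(slopes)
```

Comparing two conditions within participants, as in the grouped versus connected contrast, would apply the same t statistic to the per-participant differences between slopes.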

Experiment 2

Experiment 1 showed that the capacity for extracting the shape of a common-fate group is severely limited. However, even if shape information is not available, the visual system may be able to detect coarse properties of groups in a display (e.g., the number of clusters of grouped elements; see Trick & Enns, 1997). Although discrimination of the shape of common-fate groups may be inefficient, observers may nonetheless be able to efficiently detect the existence of such groups. To test this possibility, we altered the visual search task so that participants judged whether a single common-fate group was present in a display filled with nongrouped objects (see Fig. 2c).

Method

Twelve participants (ages 18 to 21) took part in this experiment. All were either volunteers from the Northwestern University student population or paid participants from the Evanston, Illinois, community.

The displays in Experiment 2 consisted of pairs of moving dots against a black background. Either five or nine dot pairs were displayed on each trial. Each dot orbited an imaginary circle with a 1° radius at a rate of 6.28°/s. In distractor dot pairs, the two dots revolved around separate loci and moved 180° out of phase. Paired distractor dots never approached closer than 0.5° and were never separated by more than 3.5°. The dots in a target pair also revolved around separate loci, but their movement was in phase, and therefore the dots maintained a constant separation of 2°. A target pair was assigned a unique phase, and distractor dots in the same display were always at least 75° out of phase from the dots in the target pair. Trials were blocked into two conditions. In the grouped condition, all dot pairs were white. In the color-cue condition, distractor dot pairs were white, and the target dot pair was magenta (on each target-absent trial, a random distractor dot pair was magenta). Participants were instructed to determine whether a target was present or absent in each display; motion persisted in the display until participants indicated their response via key press. Movies of sample trials are available at http://viscog.psych.northwestern.edu. Figure 2c depicts the experimental task.
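The geometry of the target and distractor pairs can be checked with a short sketch. Two points here are inferences rather than statements from the article. First, 6.28°/s is read as speed along the circular path, which on a 1° radius (circumference 2π°) corresponds to roughly one revolution per second. Second, a distance of 1.5° between the two loci of a distractor pair is assumed because it reproduces the stated 0.5° and 3.5° separation limits for dots moving 180° out of phase. Function names are hypothetical.

```python
import math

RADIUS = 1.0       # orbit radius (deg of visual angle), from the text
PATH_SPEED = 6.28  # assumed to be speed along the path (deg of visual angle per s)

# Under that reading, one revolution takes about one second.
period_s = 2.0 * math.pi * RADIUS / PATH_SPEED

def dot_position(center, phase_deg):
    """Position of a dot on its orbit at the given phase angle."""
    cx, cy = center
    a = math.radians(phase_deg)
    return cx + RADIUS * math.cos(a), cy + RADIUS * math.sin(a)

def pair_separations(locus_distance, phase_offset_deg, steps=360):
    """Distance between the two dots of a pair at evenly spaced orbit phases."""
    separations = []
    for k in range(steps):
        phase = 360.0 * k / steps
        x1, y1 = dot_position((0.0, 0.0), phase)
        x2, y2 = dot_position((locus_distance, 0.0), phase + phase_offset_deg)
        separations.append(math.hypot(x2 - x1, y2 - y1))
    return separations

target = pair_separations(2.0, 0.0)        # in phase: separation stays at 2 deg
distractor = pair_separations(1.5, 180.0)  # out of phase; 1.5 deg is an inferred value
```

In this sketch the in-phase pair keeps a fixed separation equal to the distance between its loci, whereas the out-of-phase pair's separation oscillates between the two stated limits.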

Results

Response times in the grouped condition increased with the number of distractor pairs, and the search slope of 82 ms per pair was significantly different from 0, t(11) = 12.3, p < .001 (Fig. 2d); this result suggests that the capacity for detecting the presence of common-fate groups is severely restricted. Results for participants in the color-cue condition (Fig. 2d) confirmed that the search costs associated with additional distractors were not simply due to increased display complexity: When the target group was cued with a unique color, adding more distractors to the display did not affect response times (search slope = 7 ms/pair), t(11) = 1.49, p = .16. Also, search slopes in the two conditions differed significantly from each other, t(11) = 7.45, p < .001.

Although Experiment 2 showed that the capacity for detecting common-fate groups is severely limited, there may be ways in which an observer can detect groups efficiently in certain kinds of displays. For example, consider a common-fate group consisting of coherently moving elements among randomly moving elements. As the number of objects within the group increases (relative to the total number of objects on the screen), the group’s direction of motion should begin to dominate the global distribution of motion directions in the entire display. In this case, competition among multiple directions of motion might be resolved through simple feedback mechanisms in the visual system (Chey, Grossberg, & Mingolla, 1997; Martinez-Trujillo et al., 2007) such that the dominant direction of motion would tend to be selected first. This selection would result in an efficient and seemingly automatic representation of a common-fate group. By contrast, Experiment 2 shows that when a common-fate group’s direction of motion does not dominate that of other groups by a significant ratio, the common-fate group cannot be efficiently detected.
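The idea that a sufficiently large group comes to dominate the global distribution of motion directions can be illustrated with a simple histogram. This is a stand-in for the feedback mechanisms cited above, not an implementation of them; the 45° bin width and the function name are arbitrary assumptions.

```python
from collections import Counter

def dominant_direction(directions_deg, bin_width=45.0):
    """Return the center of the most populated direction bin and its share of elements."""
    n_bins = int(round(360.0 / bin_width))
    bins = Counter(
        int(round((d % 360.0) / bin_width)) % n_bins for d in directions_deg
    )
    best_bin, count = bins.most_common(1)[0]
    return best_bin * bin_width, count / len(directions_deg)

# Eight background elements with evenly spread directions (illustrative values).
background = [0.0, 45.0, 90.0, 135.0, 180.0, 225.0, 270.0, 315.0]

# Adding six elements that all move at 90 deg makes that direction dominant.
direction, share = dominant_direction(background + [90.0] * 6)
```

When no direction holds a clear majority, as with the background elements alone, the histogram offers no single winner, which loosely parallels the inefficient searches observed in Experiment 2.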

Discussion

We have argued that a core mechanism of common-fate grouping is the selection of a direction of motion. This selection would enhance the neural activity associated with similarly moving elements across the visual field, providing the visual system with a map of locations for further processing. If this account is correct, it should be possible to form only one common-fate group at a time. We found evidence consistent with this prediction in two experiments. In Experiment 1, participants searched for a vertically oriented group among horizontally oriented groups. When common-fate grouping was required to perform the task, searches were highly inefficient. However, when elements were connected by lines, the vertical group could be identified efficiently. Similarly, in Experiment 2, when participants searched for a single common-fate group among nongrouped objects, searches were highly inefficient.

We argue that the inefficiency of these searches was due to the need to sequentially select the current direction of motion of each group. We note that two alternative explanations are theoretically possible. First, it is possible that common-fate groups are formed in parallel, but the search process cannot access these representations efficiently. This alternative is unlikely because previous studies have shown performance on visual search tasks to be a sensitive measure of visual grouping processes (e.g., Trick & Enns, 1997; Rensink & Enns, 1995). Second, searches for common-fate groups might be inefficient because the parallel grouping process has a limited capacity (see J. Palmer, 1995, for discussion). This alternative is also unlikely, because it is unclear why such a resource limitation might arise; this account would also require additional mechanisms to maintain the representations of a potentially unrestricted number of common-fate groups. Compared with these alternatives, our account of sequential selection has the powerful advantage of parsimony. Because previous work has shown that directions of motion can be selected in a global fashion (Saenz, Buracas, & Boynton, 2002), we can concretely specify how this known mechanism could support common-fate grouping. The burden of proof therefore falls on alternative accounts to identify experimental effects that cannot be explained by serial feature selection.

The feature-selection account, at first glance, appears to contradict prior observations that feature-based grouping can be accomplished outside the focus of attention (Kimchi & Razpurker-Apfeld, 2004; Moore & Egeth, 1997; Russell & Driver, 2005). Those prior studies showed that when participants performed a demanding primary task at fixation, the perceptual organization of an irrelevant background (groupings of luminance, color, or orientation) influenced performance, even though participants reported no awareness or memory of the background’s content. However, any ostensible conflict between our results and those of previous studies would be due to different definitions of attention. Prior studies have shown a processing bottleneck that prevents perceptual groups from reaching the processing stages required for awareness or memory. Grouping by feature selection should still be possible as long as the selection of features is not incompatible with the primary task.

Although the present study focused on common-fate grouping, feature selection may be a more generalized mechanism for grouping by any type of similarity, including not only similarity of motion, but also similarity of color, shape, or orientation (Huang & Pashler, 2007). If so, grouping by similarity may be subject to the same limits as common-fate grouping: The visual system may be able to create only one group at a time. This prediction is supported by recent studies in which observers were asked to determine whether multicolored patterns were symmetrical. In these studies, judgments were faster when the displays consisted of few colors and slower when the displays consisted of more colors; these results suggest that symmetry judgments can be made for only a single color subpattern at a time (Morales & Pashler, 1999). The proposal that feature selection is the mechanism for grouping by similarity implies a substantial difference between grouping by similarity and other types of grouping. Specifically, grouping by proximity, connectivity, or common region may rely on a different mechanism that can produce discrete units in parallel (Franconeri et al., 2009; S. Palmer & Rock, 1994; Rensink & Enns, 1995). The term grouping is likely too broad to capture the diversity of mechanisms that cause some elements of the visual field to be associated with others.

Selection of a direction of motion may not be the only mechanism for grouping moving objects. There are almost certainly long-term representations for more specific patterns of motion, such as those produced by walking bodies, flapping wings, or mouths moving during speech (Cavanagh, Labianca, & Thornton, 2001), and these representations could be used to facilitate perception. A challenge for our proposed account, which predicts that only a single surface is available for perceptual processing at a given time, is to explain more complex grouping abilities related to common fate, such as the mental construction of rigid three-dimensional objects from a dense array of moving dots (the structure-from-motion phenomenon; Wallach & O’Connell, 1953). Although such percepts might require complex local processing of the pattern of motion (Ullman, 1979), we argue that the relatively simple mechanism described in this article may produce surprisingly rich percepts, especially when combined with other cues, such as statistical information (Alvarez & Oliva, 2009; Balas, Nakano, & Rosenholtz, 2009) about the distribution of directions of motion and edges created by motion contrast (Regan & Beverley, 1984).

For simple common-fate groups, and possibly for more complex patterns of motion, feature selection presents a parsimonious account of grouping by motion. Observers may construct a single such group at a time, and what seems like the construction of multiple groups may be an illusion of perceptual detail.

Acknowledgments

We thank Hyunyoung Park for assistance in data collection.

Funding

This work was partially supported by National Science Foundation CAREER Grant BCS-1056730 (to S. L. F.) and National Institutes of Health Grant T32-NS047987 (to B. R. L.).

Footnotes

Declaration of Conflicting Interests

The authors declared that they had no conflicts of interest with respect to their authorship or the publication of this article.

References

  1. Alvarez GA, Oliva A. Spatial ensemble statistics are efficient codes that can be represented with reduced attention. Proceedings of the National Academy of Sciences, USA. 2009;106:7345–7350. doi: 10.1073/pnas.0808981106.
  2. Balas B, Nakano L, Rosenholtz R. A summary-statistic representation in peripheral vision explains visual crowding. Journal of Vision. 2009;9(12):Article 13. doi: 10.1167/9.12.13. Retrieved from http://www.journalofvision.org/content/9/12/13.
  3. Brainard DH. The Psychophysics Toolbox. Spatial Vision. 1997;10:433–436.
  4. Cavanagh P, Labianca AT, Thornton IM. Attention-based visual routines: Sprites. Cognition. 2001;80:47–60. doi: 10.1016/s0010-0277(00)00153-0.
  5. Chey J, Grossberg S, Mingolla E. Neural dynamics of motion grouping: From aperture ambiguity to object speed and direction. Journal of the Optical Society of America. 1997;14:2570–2594.
  6. Chong SC, Treisman A. Statistical processing: Computing the average size in perceptual groups. Vision Research. 2005;45:891–900. doi: 10.1016/j.visres.2004.10.004.
  7. Franconeri SL, Bemis DK, Alvarez GA. Number estimation relies on a set of segmented objects. Cognition. 2009;113:1–13. doi: 10.1016/j.cognition.2009.07.002.
  8. Halberda J, Sires SF, Feigenson L. Multiple spatially overlapping sets can be enumerated in parallel. Psychological Science. 2006;17:572–576. doi: 10.1111/j.1467-9280.2006.01746.x.
  9. Huang L, Pashler H. A Boolean map theory of visual attention. Psychological Review. 2007;114:599–631. doi: 10.1037/0033-295X.114.3.599.
  10. Kimchi R, Razpurker-Apfeld I. Perceptual grouping and attention: Not all groupings are equal. Psychonomic Bulletin & Review. 2004;11:687–696. doi: 10.3758/bf03196621.
  11. Kubovy M, van den Berg M. The whole is equal to the sum of its parts: A probabilistic model of grouping by proximity and similarity in regular patterns. Psychological Review. 2008;115:131–154. doi: 10.1037/0033-295X.115.1.131.
  12. Martinez-Trujillo JC, Cheyne D, Gaetz W, Simine E, Tsotsos JK. Activation of area MT/V5 and the right inferior parietal cortex during the discrimination of transient direction changes in translational motion. Cerebral Cortex. 2007;17:1733–1739. doi: 10.1093/cercor/bhl084.
  13. Maunsell JHR, Treue S. Feature-based attention in visual cortex. Trends in Neurosciences. 2006;29:317–322. doi: 10.1016/j.tins.2006.04.001.
  14. Moore CM, Egeth HE. Perception without attention: Evidence of grouping under conditions of inattention. Journal of Experimental Psychology: Human Perception and Performance. 1997;23:339–352. doi: 10.1037//0096-1523.23.2.339.
  15. Morales D, Pashler H. No role for colour in symmetry perception. Nature. 1999;399:115–116. doi: 10.1038/20103.
  16. Palmer J. Attention in visual search: Distinguishing four causes of set-size effects. Current Directions in Psychological Science. 1995;4:118–123.
  17. Palmer S, Rock I. Rethinking perceptual organization: The role of uniform connectedness. Psychonomic Bulletin & Review. 1994;1:29–55. doi: 10.3758/BF03200760.
  18. Palmer SE. Perceptual grouping: It’s later than you think. Current Directions in Psychological Science. 2002;11:101–106.
  19. Regan D, Beverley KI. Figure-ground segregation by motion contrast and by luminance contrast. Journal of the Optical Society of America. 1984;1:433–442. doi: 10.1364/josaa.1.000433.
  20. Rensink RA, Enns JT. Preemption effects in visual search: Evidence for low-level grouping. Psychological Review. 1995;102:101–130. doi: 10.1037/0033-295x.102.1.101.
  21. Roelfsema PR. Cortical algorithms for perceptual grouping. Annual Review of Neuroscience. 2006;29:203–227. doi: 10.1146/annurev.neuro.29.051605.112939.
  22. Russell C, Driver J. New indirect measures of “inattentive” visual grouping in a change-detection task. Perception & Psychophysics. 2005;64:606–623. doi: 10.3758/bf03193518.
  23. Saenz M, Buracas GT, Boynton GM. Global effects of feature-based attention in human visual cortex. Nature Neuroscience. 2002;5:631–632. doi: 10.1038/nn876.
  24. Treisman AM, Gelade G. A feature-integration theory of attention. Cognitive Psychology. 1980;12:97–136. doi: 10.1016/0010-0285(80)90005-5.
  25. Trick LM, Enns JT. Clusters precede shapes in perceptual organization. Psychological Science. 1997;8:124–129.
  26. Ullman S. The interpretation of structure from motion. Proceedings of the Royal Society B: Biological Sciences. 1979;203:405–426. doi: 10.1098/rspb.1979.0006.
  27. Wallach H, O’Connell DN. The kinetic depth effect. Journal of Experimental Psychology. 1953;45:205–217. doi: 10.1037/h0056880.
  28. Williams DW, Sekuler R. Coherent global motion percepts from stochastic local motions. Vision Research. 1984;24:55–62. doi: 10.1016/0042-6989(84)90144-5.
  29. Xu Y. Representing connected and disconnected shapes in intraparietal sulcus. NeuroImage. 2008;40:1849–1856. doi: 10.1016/j.neuroimage.2008.02.014.
