Vision is subserved by a hierarchical system in which increasingly complex representations are formed at each processing stage (Van Essen et al., 1992). This collection of interconnected brain regions is understood to work in concert to build a meaningful representation of our surroundings. Beginning with point-wise light detection by retinal ganglion cells, representations are pieced together in successive stages, forming scene segmentation and object representations. The earliest processing stages, involving the retina, lateral geniculate nucleus (LGN) and early visual cortices, are thought to act as filters selective for various stimulus dimensions, giving rise to selectivity for features such as orientation in the LGN and V1 (Cheong et al., 2013; De Valois et al., 1982; Hubel & Wiesel, 1968). These early visual areas have been heavily studied and are understood to encode information about our surroundings in a retinotopic coordinate system, which preserves information in retinal coordinates centered on the focus of gaze. However, at some stage in the process, our visual system must transition away from purely retinotopic frames of reference to those that are of use in actually interacting with the objects around us: spatiotopic and body-centered reference frames, that is, world coordinates, which do not necessarily match retinotopic ones. Higher visual areas, whose cells’ receptive fields can be so large as to cover the entire visual field, likely operate in such reference frames (Melcher & Morrone, 2015).
How is the transition from retinotopic to world coordinates achieved? Perhaps we can draw from what we know about the formation of representations within the early visual system of neurotypical humans. Even within V1, we begin to see departures from the one-to-one correspondence of a strict retinotopic coordinate system. Indeed, one of the hallmark transformations within V1 is the transition from simple cells to complex cells, in which selectivity goes from depending on spatial phase to being phase-invariant. This primitive conversion of spatial coordinate systems involves gains and losses in information. In V1, the gains are clear: complex, phase-invariant representations endow these populations with the ability to encode motion direction and stereoscopic depth, both essential building blocks for an understanding of our environment. But there is a loss in information with the construction of these representations as well. In gaining the ability to process motion and depth, complex cells lose the ability to identify the precise spatial phase of visual stimulation. And yet, we are generally still able to consciously access that information when asked to identify the precise spatial phase of a visual stimulus, presumably by relying on information that resides within simple cells, or even the LGN. This principle of gains and losses likely repeats itself throughout the visual hierarchy: as abstract, higher-level representations are gained in extrastriate cortices, specific, lower-level information is lost. In terms of spatial representations, the losses along the visuocortical hierarchy are well known, with retinotopically defined receptive fields increasing substantially in size along the hierarchy (Dumoulin & Wandell, 2008; Yoshor et al., 2007). These growing receptive fields effectively act as increasingly aggressive low-pass spatial frequency filters, which limit knowledge of an item’s precise retinotopic location.
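The trade-off between simple and complex cells described above can be illustrated with a small numerical sketch. It is not taken from the work discussed here. It assumes a textbook-style "energy" description in which a complex cell sums the squared outputs of two Gabor-shaped linear filters that are 90 degrees apart in phase, and a simple cell is a single half-wave rectified filter. The function names and parameter values (`freq`, `sigma`, `half_width`) are hypothetical choices made only for illustration.

```python
import math


def gabor_response(stim_phase, rf_phase, freq=1.0, sigma=0.5,
                   n=2001, half_width=3.0):
    """Linear response of a 1-D Gabor receptive field to a sinusoidal grating.

    The response is the numerically integrated product of the receptive
    field (Gaussian envelope times a cosine carrier) and the stimulus.
    All parameter values are illustrative assumptions.
    """
    dx = 2 * half_width / (n - 1)
    total = 0.0
    for i in range(n):
        x = -half_width + i * dx
        envelope = math.exp(-x * x / (2 * sigma * sigma))
        rf = envelope * math.cos(2 * math.pi * freq * x + rf_phase)
        stim = math.cos(2 * math.pi * freq * x + stim_phase)
        total += rf * stim * dx
    return total


def simple_cell(stim_phase):
    """Phase-sensitive unit: one linear filter, half-wave rectified."""
    return max(0.0, gabor_response(stim_phase, 0.0))


def complex_cell(stim_phase):
    """Phase-invariant unit: summed squares of a quadrature filter pair."""
    even = gabor_response(stim_phase, 0.0)
    odd = gabor_response(stim_phase, math.pi / 2)
    return even ** 2 + odd ** 2


if __name__ == "__main__":
    for k in range(8):
        phase = k * math.pi / 4
        print(f"phase={phase:5.2f}  simple={simple_cell(phase):.4f}  "
              f"complex={complex_cell(phase):.4f}")
```

In this sketch the simple-cell output rises and falls with the phase of the grating, whereas the complex-cell output stays essentially constant. The pooled unit therefore signals that a grating of the preferred frequency is present but no longer carries its spatial phase, which remains recoverable only from the phase-sensitive units feeding it.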
While the loss of retinotopic spatial precision along the hierarchy is clear, owing to larger receptive fields, we lack a good understanding of what information is gained in these areas, particularly in intermediate stages of visual processing. One suggestion is that a transitional stage of processing exists where the visual scene is divided into mutually exclusive, bounded 2-D shapes based on segmentation processes guided by principles of “uniform connectedness” (Palmer & Rock, 1994), and that at this stage shapes may be represented in shape-based coordinates, as opposed to retinotopic or spatiotopic ones (Olson, 2003). The intermediate stages of processing and their hypothesized object-based reference frames have proven more difficult to study than either early or high-level visual processes. Most of the evidence comes from neuropsychological studies of patients with specific deficits in object representation. For example, one patient was found whose visual neglect consisted of ignoring one side of individual objects in scenes, as opposed to one side of space (Driver & Halligan, 1991). There is also evidence that these shape representations are formed in multiple parallel pathways, each processing different types of visual cues, such as motion or color; patients have been reported who had difficulty perceiving shapes from some cues but not others (Cowey & Vaina, 2000; Rizzo et al., 1995).
Vannuscorps and colleagues (2021) report a patient, Davida, whose disorder is consistent with these ideas and offers additional information about some of the properties of shape-centered representations in intermediate vision. Davida’s disorder appears to be highly selective for the perception of 2-D shapes whose characteristics strongly suggest involvement of the parvocellular pathway (sharp edges, medium-to-high contrast) and a processing stage in which shape-centered representations have been formed and need to be mapped onto body-centered or spatiotopic frames of reference. For instance, she can correctly judge the tilt magnitude of a line on a computer screen and match it to a neighboring line in a staircase procedure, but when asked to grasp the outer points of a line drawn on a board, most of the time she responds as if the line were rotated by 90 degrees. In general, her visual deficit is characterized by perceiving sharp-edged 2-D objects (shapes) as if they were inverted, mirrored across their axes or rotated by 90, 180 or 270 degrees around their center, as the authors demonstrate in numerous tasks.
Interestingly, Davida reports that her perception of different shape orientations is often unstable, alternating gradually between shape-centered-axis representations, akin to the perceptual oscillations that define binocular rivalry (Vannuscorps et al., 2021). This perceived fading in and out of the various orientations, combined with Davida’s inability to correctly interact with these shapes, is intriguing, as it suggests that the precise information about the orientation of the shape is inaccessible to whatever process is responsible for incorporating shape information into higher-complexity reference frames. At the same time, this information has clearly been computed by and is present in earlier visual areas, since Davida matches controls on some tasks which require correct orientation perception and likely rely on retinotopic representations (e.g. tilt magnitude discrimination). This suggests that the visual areas where shape-centered representations are mapped onto world coordinates do not themselves represent information about the orientation computed in the preceding stages. Instead, Davida’s case suggests that when shape information is being pieced together, it is in pursuit of orientation invariance in its own axis coordinates. This is reminiscent of what occurs in V1, with the transition from phase-variant (simple cells) to phase-invariant (complex cells) preferences, losing the retinotopic information obtained in the previous stage. In intermediate stages, while gaining object-centered representations and the ability to interact with objects in the real world by virtue of mapping the axes of the shape representation onto higher coordinate frames, evidently some areas lose the ability to construct the specific orientation of that exemplar shape. 
And while neurotypical individuals can access the preceding stages to know specifically what the orientation of the exemplar is, in Davida’s case, these areas appear to have lost the ability to explicitly access this information, making the formation of stable 2-D shape representations impossible.
The high specificity of Davida’s visual disorder helps shed light on some of the processing that occurs in intermediate stages of object perception and the properties of intermediate shape-centered representations (ISCRs), while presenting a range of new questions. Specifically, what are the neural areas representing ISCRs and mapping them onto higher order frames? And why, in this case, do they lack access to retinotopic orientation information? Previous research suggests that human visual areas LO1-LO2 (V4d in monkey) are well situated within the visual processing stream and could be suitable candidates for computing ISCRs, since they can encode information about shapes in both retinotopic and object-centered reference frames (El-Shamayleh & Pasupathy, 2016; Nandy et al., 2013; Pasupathy & Connor, 2001), and maintain segregation between groups of neurons which respond to shapes from different cues (Tanigawa et al., 2010; Tootell & Nasr, 2017). Vannuscorps and colleagues further suggest that the parietal dorsal stream is critical in providing information about shape axis correspondence and object location (Friedman-Hill et al., 1995; Harris et al., 2001; Priftis et al., 2003), both essential for mapping shape representations. However, the neural underpinnings of Davida’s unique perceptual experience have not been directly tested. There are a variety of functional neuroimaging studies that could be deployed with Davida to help identify the neural loci of ISCR processing. One could, for instance, relate the perceptual switches Davida experiences when viewing 2-D shapes with brain activity, to test whether the switches in subjective percept are correlated with neural oscillations in specific regions along the visual stream (Lumer et al., 1998; Polonsky et al., 2000; Wunderlich et al., 2005). 
Experiments such as these may help localize the deficit within the visual processing stream, more directly testing the hypothesis put forward by Vannuscorps and colleagues that lateral occipital and dorsal stream regions play a role in intermediate shape processing.
References
- Cheong SK, Tailby C, Solomon SG, & Martin PR (2013). Cortical-Like Receptive Fields in the Lateral Geniculate Nucleus of Marmoset Monkeys. Journal of Neuroscience, 33(16), 6864–6876. doi:10.1523/JNEUROSCI.5208-12.2013
- Cowey A, & Vaina LM (2000). Blindness to form from motion despite intact static form perception and motion detection. Neuropsychologia, 38(5), 566–578. doi:10.1016/S0028-3932(99)00117-7
- De Valois RL, Yund EW, & Hepler N (1982). The orientation and direction selectivity of cells in macaque visual cortex. Vision Research, 22(5), 531–544. doi:10.1016/0042-6989(82)90112-2
- Driver J, & Halligan PW (1991). Can Visual Neglect Operate in Object-centred Co-ordinates? An Affirmative Single-case Study. Cognitive Neuropsychology, 8(6), 475–496. doi:10.1080/02643299108253384
- Dumoulin SO, & Wandell BA (2008). Population receptive field estimates in human visual cortex. NeuroImage, 39(2), 647–660. doi:10.1016/j.neuroimage.2007.09.034
- El-Shamayleh Y, & Pasupathy A (2016). Contour Curvature As an Invariant Code for Objects in Visual Area V4. Journal of Neuroscience, 36(20), 5532–5543. doi:10.1523/JNEUROSCI.4139-15.2016
- Friedman-Hill SR, Robertson LC, & Treisman A (1995). Parietal Contributions to Visual Feature Binding: Evidence from a Patient with Bilateral Lesions. Science, 269(5225), 853–855. doi:10.1126/science.7638604
- Harris IM, Harris JA, & Caine D (2001). Object Orientation Agnosia: A Failure to Find the Axis? Journal of Cognitive Neuroscience, 13(6), 800–812. doi:10.1162/08989290152541467
- Hubel DH, & Wiesel TN (1968). Receptive fields and functional architecture of monkey striate cortex. The Journal of Physiology, 195(1), 215–243. doi:10.1113/jphysiol.1968.sp008455
- Lumer ED, Friston KJ, & Rees G (1998). Neural Correlates of Perceptual Rivalry in the Human Brain. Science, 280(5371), 1930–1934. doi:10.1126/science.280.5371.1930
- Melcher D, & Morrone MC (2015). Nonretinotopic visual processing in the brain. Visual Neuroscience, 32, E017. doi:10.1017/S095252381500019X
- Nandy AS, Sharpee TO, Reynolds JH, & Mitchell JF (2013). The Fine Structure of Shape Tuning in Area V4. Neuron, 78(6), 1102–1115. doi:10.1016/j.neuron.2013.04.016
- Olson CR (2003). Brain Representation of Object-Centered Space in Monkeys and Humans. Annual Review of Neuroscience, 26(1), 331–354. doi:10.1146/annurev.neuro.26.041002.131405
- Palmer S, & Rock I (1994). Rethinking perceptual organization: The role of uniform connectedness. Psychonomic Bulletin & Review, 1(1), 29–55. doi:10.3758/BF03200760
- Pasupathy A, & Connor CE (2001). Shape Representation in Area V4: Position-Specific Tuning for Boundary Conformation. Journal of Neurophysiology, 86(5), 2505–2519. doi:10.1152/jn.2001.86.5.2505
- Polonsky A, Blake R, Braun J, & Heeger DJ (2000). Neuronal activity in human primary visual cortex correlates with perception during binocular rivalry. Nature Neuroscience, 3(11), 1153–1159. doi:10.1038/80676
- Priftis K, Rusconi E, Umiltà C, & Zorzi M (2003). Pure agnosia for mirror stimuli after right inferior parietal lesion. Brain, 126(4), 908–919. doi:10.1093/brain/awg075
- Rizzo M, Nawrot M, & Zihl J (1995). Motion and shape perception in cerebral akinetopsia. Brain, 118(5), 1105–1127. doi:10.1093/brain/118.5.1105
- Tanigawa H, Lu HD, & Roe AW (2010). Functional organization for color and orientation in macaque V4. Nature Neuroscience, 13(12), 1542–1548. doi:10.1038/nn.2676
- Tootell RBH, & Nasr S (2017). Columnar Segregation of Magnocellular and Parvocellular Streams in Human Extrastriate Cortex. The Journal of Neuroscience, 37(33), 8014–8032. doi:10.1523/JNEUROSCI.0690-17.2017
- Van Essen DC, Anderson CH, & Felleman DJ (1992). Information Processing in the Primate Visual System: An Integrated Systems Perspective. Science, 255(5043), 419–423. doi:10.1126/science.1734518
- Vannuscorps G, Galaburda A, & Caramazza A (2021). Shape-centered representations of bounded regions of space mediate the perception of objects. Cognitive Neuropsychology, 1–50. doi:10.1080/02643294.2021.1960495
- Wunderlich K, Schneider KA, & Kastner S (2005). Neural correlates of binocular rivalry in the human lateral geniculate nucleus. Nature Neuroscience, 8(11), 1595–1602. doi:10.1038/nn1554
- Yoshor D, Bosking WH, Ghose GM, & Maunsell JHR (2007). Receptive Fields in Human Visual Cortex Mapped with Surface Electrodes. Cerebral Cortex, 17(10), 2293–2302. doi:10.1093/cercor/bhl138
