Abstract
As we move our gaze through a complex scene, the retinal image is constantly shifted and overwritten. A new study using human intracranial recordings offers a fresh perspective on how the brain creates a sense of perceptual continuity through natural visual behavior.
As we explore the world around us, our brain enlists its subordinate sensory organs to gather information about the scene. In humans, directing eye gaze is particularly important, because our impressions of objects, spatial relations and social information are largely determined by vision. Sensory input is by nature imperfect and fragmented. Thus, a major task of the brain is to build a plausible perceptual scenario from incomplete sensory details, a process Helmholtz called ‘unconscious inference’ [1]. In the case of vision, our restless gaze adds another twist: each fraction of a second, a saccadic eye movement abruptly shifts our focus, dragging the image across the retina and overwriting the previous pattern of photoreceptor stimulation (Figure 1A). Nonetheless, our brain protects us from these raw sensory events, and we perceive a seamless world across space and time, with only a vague awareness that our eyes are always moving. Since the time of Helmholtz, researchers have puzzled over how the brain is able to ignore, or perhaps compensate for, the image disruptions caused by eye movements (for a recent review, see [2]). In this issue of Current Biology, Podvalny et al. [3] approach natural gaze behavior from a new angle, recording its impact on neural activity in the visual object pathway during a social exchange.
How should one think about perceptual continuity in vision? Historically, researchers have focused on the spatial problem of saccadic displacement and the brain’s capacity to maintain perceptual stability despite large image shifts on the retina. These studies have typically employed tightly controlled paradigms, for example in which a trained monkey or human is cued to make saccades between bright dots on a dark background. Psychophysical and electrophysiological findings suggest that the brain counteracts such retinal image displacement through several tricks. For one, it keeps track of its own oculomotor actions and sends this information to perceptual centers via a ‘corollary discharge’ signal [2]. Based in part on such signals, neurons in certain cortical areas then appear to anticipate the sensory consequences of each saccade through an active remapping of visual space [4]. Further, the visual brain is able to use landmark stimuli in order to establish spatial correspondences before and after each eye movement [5]. As these tricks pertain to space and action, they fit with the idea that perceptual stability is a problem of the dorsal visual stream. Related questions, however, such as how the brain maintains a persistent representation of objects across fixations, may not reduce to a spatial problem. In their study, Podvalny et al. [3] focus on object-selective responses in the ventral visual pathway and the extent to which they are shaped by spontaneous eye movements.
While a handful of previous studies [6–10] in monkeys and humans have investigated the interaction between gaze behavior and neural responses to complex objects, most work in this area has aimed to minimize the contribution of eye movements by flashing images briefly during periods of steady gaze fixation. The natural viewing approach applied by Podvalny et al. [3] goes beyond previous free-viewing studies by allowing subjects to turn their head, move their eyes, and engage in natural social exchanges. The subjects were patients whose brains were covered with dozens of subdural electrocorticogram (ECoG) electrodes that had been temporarily implanted to monitor epileptic activity. The patients additionally wore two types of head-mounted camera: one to record the contents of the scene in front of them, and the other to track their eye position within the scene. As is common in ECoG studies, the authors took electrophysiological power in the high-frequency gamma range as an instantaneous measure of local neural activity.
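As a rough illustration of what "power in the high-frequency gamma range" means in practice, the following minimal sketch band-limits a single-electrode trace and takes the squared magnitude of its analytic signal. It is not the authors' pipeline: the function name `high_gamma_power`, the 70–150 Hz band limits, and the brick-wall frequency-domain filter are all assumptions chosen for brevity.

```python
import numpy as np

def high_gamma_power(signal, fs, band=(70.0, 150.0)):
    """Instantaneous power of `signal` within `band` (Hz).

    Assumption: the band limits are illustrative, not taken from the study.
    Band-pass filtering and the Hilbert transform are combined in the
    frequency domain: keep only positive frequencies inside the band,
    double them, and invert the FFT to obtain the analytic signal.
    """
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.fft(signal)
    freqs = np.fft.fftfreq(signal.size, d=1.0 / fs)
    # The mask selects positive frequencies only, since band[0] > 0.
    mask = (freqs >= band[0]) & (freqs <= band[1])
    analytic = np.fft.ifft(2.0 * spectrum * mask)
    return np.abs(analytic) ** 2

# Synthetic trace: a 10 Hz rhythm throughout, plus a 100 Hz burst in the second half.
fs = 1000.0
t = np.arange(0, 2.0, 1.0 / fs)
trace = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 100 * t) * (t >= 1.0)
hg = high_gamma_power(trace, fs)
```

In this synthetic trace the returned power stays near zero during the first second and rises once the 100 Hz burst begins, while the 10 Hz rhythm is ignored. Real recordings would typically call for a smoother filter and some form of baseline normalization.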
This novel combination of methods allowed Podvalny et al. [3] to track neural activity at sites across the brain during a natural conversation, and at the same time determine precisely the evolving image content on the retina. With this, they investigated the extent to which self-generated gaze behavior shaped visual responses in two broadly defined cortical areas: early retinotopic cortex (Brodmann's areas 17 and 18), and later object-selective cortex, including regions selective for faces. In analyzing the data, they concentrated on periods of stable fixation, rather than the saccades themselves, framing their question as one of visual sampling rather than responses to eye movements per se.
The most salient finding was that gaze behavior had very different effects on neural responses in the early and late cortical areas. In the early visual areas, details of the gaze and low-level stimulus properties strongly influenced responses. Once the eye had landed, responses were strongly shaped by the gaze dwell time, or fixation duration (Figure 1B, left). The main excitatory response depended strongly on the luminance contrast between the pre- and post-saccadic image content. In addition to this main excitatory response, there was an initial dip in activity that appeared to begin even while the eye was in motion — a neural response feature that may relate to the eye movement itself. This dip was absent in control experiments in which the subject was shown flashed stimuli in a more conventional manner.
Further along in the ventral visual pathway, responses were very different. Podvalny et al. [3] focused mainly on ventral cortical areas known to be specialized in the processing of socially relevant stimuli such as faces [11,12]. Unlike in the early visual cortex, the fixation parameters in object-selective cortex did not affect the structure of neural responses. Individual sites exhibited stereotypical patterns of neural responses that, while selective for complex object content, were uncoupled from the details of each fixation: brief and long fixations led to the same response profiles (Figure 1B, right). Responses here did not depend strongly on low-level image contrast, and did not display an initial dip in their fixation response. Instead, the responses appeared to be governed by an internal dynamic that was invariant to the fixation sampling process itself.
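The comparison between brief and long fixations can be pictured as a fixation-locked average. The sketch below is a hypothetical illustration, not the analysis used in the study: it cuts a power trace into windows around each fixation onset and averages them separately for the two groups. The function name, the 0.3 s split between "brief" and "long", and the window limits are all assumed values.

```python
import numpy as np

def fixation_locked_average(power, fs, onsets_s, durations_s,
                            split_s=0.3, window_s=(-0.1, 0.5)):
    """Average `power` around fixation onsets, split by fixation duration.

    Assumptions: `split_s` and `window_s` are illustrative choices.
    Returns a dict with the mean epoch for "brief" and "long" fixations
    (None for a group that has no usable fixations).
    """
    power = np.asarray(power, dtype=float)
    start = int(round(window_s[0] * fs))
    stop = int(round(window_s[1] * fs))
    epochs = {"brief": [], "long": []}
    for onset, duration in zip(onsets_s, durations_s):
        i = int(round(onset * fs))
        if i + start < 0 or i + stop > power.size:
            continue  # skip fixations whose window runs off the recording
        label = "brief" if duration < split_s else "long"
        epochs[label].append(power[i + start:i + stop])
    return {k: (np.mean(v, axis=0) if v else None) for k, v in epochs.items()}

# Toy trace in which activity lasts exactly as long as each fixation,
# the duration-dependent pattern described for early visual cortex.
fs = 100.0
onsets = [1.0, 3.0, 5.0, 7.0]
durations = [0.2, 0.2, 0.6, 0.6]
toy = np.zeros(1000)
for on, dur in zip(onsets, durations):
    a = int(round(on * fs))
    toy[a:a + int(round(dur * fs))] = 1.0
avg = fixation_locked_average(toy, fs, onsets, durations)
```

With this toy input the two averages differ after the brief fixations end, which is the signature reported for early visual cortex. A site whose two averages overlapped would instead resemble the duration-invariant profiles described for object-selective cortex.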
These findings highlight a characteristic of high-level visual responses in the human brain that was previously unknown and was revealed only during natural visual behavior. Evidently, along the ventral visual pathway there is a transition in which object-selective neural responses gain invariance to the details of oculomotor sampling. One interpretation of this finding, supported by the authors, is that this transition may be an important step for achieving perceptual stability and continuity of complex visual information. This is a fascinating prospect, and one that calls for more investigation through future experiments. The findings also raise several new questions. For example, what internal principles govern the stereotypical temporal dynamics observed in object-selective cortex, and why would different sites exhibit such distinct temporal profiles? It would be of great interest to learn whether such invariant dynamics are present among individual neurons, or perhaps in local populations of interacting neurons. As the present study [3] did not break down the fixation content in great detail, in part because of the challenges and limitations of recordings in human patients, more work is needed to understand whether these response pulses are associated explicitly with foveation, whether their strength is affected by the recent fixation history, and how broadly the principles extend to stimuli beyond faces. These issues are important for understanding the extent to which the invariant response profiles observed in the present study might serve as a basis to support the continued perception of visual objects more generally.
In many ways, this study [3] is on the leading edge of a recent trend in systems neuroscience to complement traditional stimulus presentation with more naturalistic paradigms [10,13,14]. The obvious risk in taking such a step is the loss of control over quantifiable stimulus parameters. But temporary relief from tight testing paradigms can also be liberating. Experimentally, the brain is comfortable, if not eager, to step into its more natural set of conditions, and corresponding neurophysiological data are often straightforward to acquire. Importantly, recent work has shown that under the right conditions neural signals can show a high degree of reliability when faced with complex stimuli during free viewing [10,15], with naturalistic paradigms recently inviting a range of new neuroscientific questions [16–18]. The new study [3] is in that category, and the results may be an important step for understanding how the brain processes complex sensory information in the context of natural actions, all the while maintaining the continuous sense of a stable and meaningful world.
References
- 1. von Helmholtz H (1925). Treatise on physiological optics. 3rd ed. New York, NY: (translated J.P.C. Southall).
- 2. Wurtz RH, Joiner WM, and Berman RA (2011). Neuronal mechanisms for visual stability: progress and problems. Phil. Trans. R. Soc. Lond. B 366:492–503.
- 3. Podvalny E, Yeagle E, Megevand P, Sarid N, Harel M, Chechik G, Mehta AD, and Malach R (2016). Invariant temporal dynamics underlie perceptual stability in human visual cortex. Curr. Biol (this issue).
- 4. Duhamel JR, Colby CL, and Goldberg ME (1992). The updating of the representation of visual space in parietal cortex by intended eye movements. Science 255:90–2.
- 5. Deubel H, Koch C, and Bridgeman B (2010). Landmarks facilitate visual space constancy across saccades and during fixation. Vision Res 50:249–59.
- 6. DiCarlo JJ and Maunsell JH (2000). Form representation in monkey inferotemporal cortex is virtually unaltered by free viewing. Nat. Neurosci 3:814–21.
- 7. Marsman JBC, Renken R, Haak KV, and Cornelissen FW (2013). Linking cortical visual processing to viewing behavior using fMRI. Front. Syst. Neurosci 7:109.
- 8. Russ BE and Leopold DA (2015). Functional MRI mapping of dynamic visual features during natural viewing in the macaque. NeuroImage 109:84–94.
- 9. Sheinberg DL and Logothetis NK (2001). Noticing familiar objects in real world scenes: the role of temporal cortical neurons in natural vision. J. Neurosci 21:1340–50.
- 10. Hasson U, Nir Y, Levy I, Fuhrmann G, and Malach R (2004). Intersubject synchronization of cortical activity during natural vision. Science 303:1634–40.
- 11. Kanwisher N (2010). Functional specificity in the human brain: a window into the functional architecture of the mind. Proc. Nat. Acad. Sci. USA 107:11163–70.
- 12. Bentin S, Allison T, Puce A, Perez E, and McCarthy G (1996). Electrophysiological studies of face perception in humans. J. Cogn. Neurosci 8:551–65.
- 13. Bartels A and Zeki S (2004). The chronoarchitecture of the human brain: natural viewing conditions reveal a time-based anatomy of the brain. NeuroImage 22:419–33.
- 14. McMahon DBT, Russ BE, Elnaiem HD, Kurnikova AI, and Leopold DA (2015). Single-unit activity during natural vision: diversity, consistency, and spatial sensitivity among AF face patch neurons. J. Neurosci 35:5537–48.
- 15. Çukur T, Nishimoto S, Huth AG, and Gallant JL (2013). Attention during natural vision warps semantic representation across the human brain. Nat. Neurosci 16:763–70.
- 16. Huth AG, Lee T, Nishimoto S, Bilenko NY, Vu AT, and Gallant JL (2016). Decoding the semantic content of natural movies from human brain activity. Front. Syst. Neurosci 10:81.
- 17. Killian NJ, Jutras MJ, and Buffalo EA (2012). A map of visual space in the primate entorhinal cortex. Nature 491:761–4.
- 18. Russ BE, Kaneko T, Saleem KS, Berman RA, and Leopold DA (2016). Distinct fMRI responses to self-induced versus stimulus motion during free viewing in the macaque. J. Neurosci 36:9580–9.