Proceedings of the National Academy of Sciences of the United States of America. 2019 May 8;116(20):9699–9700. doi: 10.1073/pnas.1905456116

A neural basis of the serial bottleneck in visual word recognition

Lars Strother
PMCID: PMC6525531  PMID: 31068471

Written language is a hallmark of cultural and technological development. The ability to read written language is a testament to the effects of learning on human behavior and brain function. However, even highly practiced readers exhibit fundamental neural constraints. The fact that you are unable to read the collection of words comprising this text all at once, as desirable as that may be, draws attention to a defining property of the human brain: its limited information-processing capacity. A study by White et al. (1) published in PNAS highlights an extreme case of capacity-limited visual information processing—our inability to read more than one word at a time—and reveals the neural basis of this limitation using functional magnetic resonance imaging (fMRI).

Participants in White et al.’s (1) study performed a semantic categorization task while viewing pairs of words presented simultaneously to the right and left of fixation. On each trial, participants viewed two briefly displayed words (nouns), one displayed to the left of fixation, and the other to the right of fixation, and categorized one of the words as either living or nonliving. In a “focal cue” condition, participants performed the task on either the left or the right word, according to a precue. In a “distributed cue” condition, participants paid attention to both words and subsequently reported the semantic category of one of the words, but without knowing which in advance. The authors, in a previous behavioral study (2), used a similar task to show that even highly skilled readers are able to recognize only one word at a time. In the current study, participants performed the task during fMRI scanning, which measures blood oxygen level-dependent (BOLD) signals with millimeter-level spatial resolution.

BOLD-Based Spatial Channels

Due to the neural architecture of the human visual system, input from the left visual field (LVF) and the right visual field (RVF) is initially transmitted to retinotopic visual cortex in the opposite (contralateral) cerebral hemisphere. In contrast to retinotopic cortex, more anterior regions of ventral occipitotemporal cortex (VOTC) in the left cerebral hemisphere receive and process visual input from words viewed in either the LVF or the RVF (3). With this in mind, White et al. (1) model BOLD-based spatial channels corresponding to either LVF or RVF input, which allows them to implement a neuronal attention operating characteristic (AOC) as a measure of visual processing capacity. Their main goal was to identify the neural basis of a serial bottleneck revealed in the behavioral study mentioned previously (2), with a specific focus on retinotopic visual cortex and VOTC.

As a starting point, White et al. (1) show that retinotopic cortex in each cerebral hemisphere shows a contralateral hemifield–hemisphere relationship such that retinotopic cortex in the right hemisphere processes LVF words, and retinotopic cortex in the left hemisphere processes RVF words. They reason that if a serial bottleneck exists in retinotopic cortex, then depending on the word location that participants focused on to perform the task, BOLD responses would be greater in retinotopic cortex contralateral to this location relative to BOLD responses in a divided-attention condition (which did not require participants to focus on either the LVF or the RVF word). In contrast to other fMRI studies of divided attention (4, 5), White et al. find that retinotopic cortex processes contralateral words in parallel when viewed simultaneously (see figure 6 of ref. 1), albeit in separate hemispheres and not necessarily to the same degree in each hemisphere. While this finding is important in its own right, it is critical in the current study because it allows the authors to rule out retinotopic cortex as the most likely neural basis of our inability to recognize more than one word at the same time. It is also critical because it allows them to assert that information from the two channels converges in left VOTC.
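The logic of an attention operating characteristic can be illustrated with a minimal sketch. It assumes a generic AOC convention rather than the specific analysis of White et al. (1): each channel's distributed-cue response is rescaled so that 0 corresponds to the response when that side is ignored and 1 to the response when it is the focal target. Under that assumption, fully parallel processing places the distributed-cue point near (1, 1), whereas all-or-none serial switching places it on the line where the two coordinates sum to 1. The function names, the tolerance, and the example values are hypothetical and are not taken from the study.

```python
def aoc_point(attended: float, ignored: float, distributed: float) -> float:
    """Rescale a distributed-cue response so 0 = ignored and 1 = attended.

    Generic normalization assumed for illustration only; not necessarily
    the procedure used by White et al.
    """
    span = attended - ignored
    if span == 0:
        raise ValueError("attended and ignored responses must differ")
    return (distributed - ignored) / span


def classify(left: float, right: float, tol: float = 0.15) -> str:
    """Label a normalized AOC point by the prediction it lies closest to.

    parallel: both channels keep their full focal response, so the
              coordinates sum to about 2 (the point is near (1, 1)).
    serial:   all-or-none switching trades one channel against the other,
              so the coordinates sum to about 1.
    The tolerance `tol` is an arbitrary, hypothetical choice.
    """
    total = left + right
    if abs(total - 2.0) <= tol:
        return "parallel"
    if abs(total - 1.0) <= tol:
        return "serial"
    return "intermediate"


if __name__ == "__main__":
    # Made-up response amplitudes (arbitrary units), for illustration only.
    lvf = aoc_point(attended=1.0, ignored=0.5, distributed=0.75)
    rvf = aoc_point(attended=1.2, ignored=0.6, distributed=0.9)
    print(lvf, rvf, classify(lvf, rvf))
```

In this toy example each channel retains half of its attentional gain under the distributed cue, so the point falls on the serial-switching line. A region in which both channels retained their full focal response would instead be labeled parallel.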

The ability of left VOTC to visually process words viewed at different locations is the topic of ongoing fMRI research (3, 6, 7). Like other types of visual object recognition (8), word recognition involves the transformation of retinotopically organized visual features into increasingly complex representations of letters and words by neural circuits in VOTC (9). However, unlike other types of visual object recognition, word recognition engages distinct populations of VOTC neurons in the left cerebral hemisphere (10, 11). Despite progress in our understanding of the involvement of left VOTC in word recognition, the emergence of bilateral visual field representation in left VOTC—the representation of both RVF and LVF visual input in distinct spatial channels—remains somewhat mysterious.

To some degree, the ability of left VOTC to process visual inputs from both the RVF and the LVF is explained by its direct inheritance of contralateral (RVF) inputs via retinotopic cortex in the same hemisphere, and by the transfer of ipsilateral (LVF) inputs from homotopic regions of the opposite hemisphere via the corpus callosum (12). Nevertheless, White et al.’s (1) finding that LVF and RVF channels originating in retinotopic cortex in each hemisphere converge in left VOTC is surprising because of the specific anatomical location of this convergence. The visual word form area (VWFA) of left VOTC was originally defined as having bilateral visual field (LVF and RVF) sensitivity to words (3). However, more recent fMRI findings suggest that bilateral word representation occurs in two distinct regions of left VOTC: the VWFA and a more posterior region of left VOTC (13). Consistent with this possibility, White et al. (1) delineate two VWFAs in left VOTC, only one of which shows parallel processing in both LVF and RVF channels.

The delineation of two VWFAs is of particular importance in White et al.’s (1) study because it enables them to identify a transition between two functionally and anatomically distinct stages of visual processing, and thus identify the neural basis of the serial bottleneck in word recognition. The first stage of processing entails a funneling of LVF and RVF information into a posterior region of left VOTC (VWFA-1) via distinct spatial channels. The second stage of processing entails the selective filtering of information conveyed by VWFA-1 to an anterior but adjacent VWFA-2, a serial bottleneck in visual processing of two simultaneously viewed words (see figure 6 of ref. 1). The relative anatomical locations of VWFA-1 and VWFA-2 are worth noting because the location of VWFA-2 is consistent with that of the VWFA originally reported to support both RVF and LVF word recognition (3) and sometimes referred to as the brain’s visual dictionary (14). White et al. (1), therefore, reveal an important functional property of this neural dictionary: its limited processing capacity. Their study demonstrates the power of the neuronal AOC mentioned at the beginning of this commentary. The neuronal AOC is an exciting development in the analysis of fMRI data that has enormous potential to advance our understanding of the interface of vision and language in the human brain.

Open Questions

The results of White et al. (1) open the door to a number of questions that could be answered using BOLD-based channel modeling and the neuronal AOC. For instance, centrally viewed words entail hemifield splitting of visual information between the RVF and LVF (15). Do RVF and LVF channels converge in posterior VOTC (VWFA-1) for centrally viewed words? Findings from a related fMRI study of centrally viewed words suggest they do (13). However, in contrast to peripherally viewed words, LVF and RVF channels corresponding to hemifield-split parts of a word should both contribute to further processing in anterior VOTC (VWFA-2), since this region represents whole words. This highlights an important prospective difference in the convergence of spatial channels in left VOTC for words viewed in the periphery compared with words fixated during normal reading. The results of White et al. (1) also raise interesting questions about the relationship between word recognition and other types of object recognition. For instance, does left VOTC contain separable channels for words versus other types of nonword objects? Given the possibility of an opponent cerebral relationship between word recognition and face recognition (16, 17), does a similar bottleneck exist for face recognition by neural circuits in the right hemisphere? The study by White et al. (1) is a promising reminder that the answers to these and other important questions concerning the neural basis of visual word recognition are within reach of cognitive neuroscience.

Footnotes

The author declares no conflict of interest.

See companion article on page 10087.

References

1. White AL, Palmer J, Boynton GM, Yeatman JD (2019) Parallel spatial channels converge at a bottleneck in anterior word-selective cortex. Proc Natl Acad Sci USA 116:10087–10096.
2. White AL, Palmer J, Boynton GM (2018) Evidence of serial processing in visual word recognition. Psychol Sci 29:1062–1071.
3. Cohen L, et al. (2002) Language-specific tuning of visual cortex? Functional properties of the visual word form area. Brain 125:1054–1069.
4. Pestilli F, Carrasco M, Heeger DJ, Gardner JL (2011) Attentional enhancement via selection and pooling of early sensory responses in human visual cortex. Neuron 72:832–846.
5. McMains SA, Somers DC (2005) Processing efficiency of divided spatial attention mechanisms in human visual cortex. J Neurosci 25:9444–9448.
6. Rauschecker AM, Bowen RF, Parvizi J, Wandell BA (2012) Position sensitivity in the visual word form area. Proc Natl Acad Sci USA 109:E1568–E1577.
7. Gomez J, Natu V, Jeska B, Barnett M, Grill-Spector K (2018) Development differentially sculpts receptive fields across early and high-level human visual cortex. Nat Commun 9:788.
8. Grill-Spector K (2003) The neural basis of object perception. Curr Opin Neurobiol 13:159–166.
9. Vinckier F, et al. (2007) Hierarchical coding of letter strings in the ventral stream: Dissecting the inner organization of the visual word-form system. Neuron 55:143–156.
10. Dehaene S, Cohen L (2011) The unique role of the visual word form area in reading. Trends Cogn Sci 15:254–262.
11. Seghier ML, Price CJ (2011) Explaining left lateralization for words in the ventral occipitotemporal cortex. J Neurosci 31:14745–14753.
12. Berlucchi G (2014) Visual interhemispheric communication and callosal connections of the occipital lobes. Cortex 56:1–13.
13. Strother L, Coros AM, Vilis T (2016) Visual cortical representation of whole words and hemifield-split word parts. J Cogn Neurosci 28:252–260.
14. Glezer LS, Kim J, Rule J, Jiang X, Riesenhuber M (2015) Adding words to the brain’s visual dictionary: Novel word learning selectively sharpens orthographic representations in the VWFA. J Neurosci 35:4965–4972.
15. Lavidor M, Walsh V (2004) The nature of foveal representation. Nat Rev Neurosci 5:729–735.
16. Behrmann M, Plaut DC (2013) Distributed circuits, not circumscribed centers, mediate visual recognition. Trends Cogn Sci 17:210–219.
17. Dehaene S, et al. (2010) How learning to read changes the cortical networks for vision and language. Science 330:1359–1364.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
