Author manuscript; available in PMC: 2025 Sep 25.
Published before final editing as: Cogn Neurosci. 2025 Aug 20:1–28. doi: 10.1080/17588928.2025.2543890

Rethinking category-selectivity in human visual cortex

J Brendan Ritchie a, Susan G Wardle a, Maryam Vaziri-Pashkam a,b, Dwight J Kravitz c,d, Chris I Baker a
PMCID: PMC12458057  NIHMSID: NIHMS2111622  PMID: 40836402

Abstract

A wealth of studies report evidence that occipitotemporal cortex tessellates into ‘category-selective’ brain regions that are apparently specialized for representing ecologically important visual stimuli like faces, bodies, scenes, and tools. Here, we argue that while valuable insights have been gained through the lens of category-selectivity, a more complete view of visual function in occipitotemporal cortex requires centering the behavioral relevance of visual properties in real-world environments rather than stimulus category. Focusing on behavioral relevance challenges a simple mapping between stimulus and visual function in occipitotemporal cortex because the environmental properties relevant to a behavior are visually diverse and how a given property is represented is modulated by our goals. Grounding our thinking in behavioral relevance rather than category-selectivity raises a host of theoretical and empirical issues that we discuss while providing proposals for how existing tools can be harnessed in this light to better understand visual function in occipitotemporal cortex.

Keywords: Category-selectivity, occipitotemporal cortex, behavioral relevance, real-world scenes, vision, tasks, neuroimaging

1. Introduction

How does visual cortex help us make sense of what we see? One popular answer is that it does so in part via a tessellation of occipitotemporal cortex (OTC) into regions specialized for representing complex stimulus categories, including faces, bodies, scenes, tools, words, and object forms more generally (Box 1). This topography of cortical activation, often referred to as category-selectivity, is one of the great success stories of visual neuroscience – indeed, it has been central to much of our own research, both past and present. Theoretically, it is fundamental to how the field thinks about clustering of visual function in OTC and dovetails with the assumption, common to computational models of vision, that categorization and the ability to label what we see is a major information-processing goal of the ventral visual stream (DiCarlo et al., 2012; Grill-Spector & Weiner, 2014). Empirically, category-selectivity has proven to be highly consistent across individuals and robust to considerable variation in experimental design and stimulus manipulations. Further, many exciting current research directions in visual neuroscience focus on the developmental origins of category-selectivity in both humans and non-human primates (Arcaro & Livingstone, 2021; Op de Beeck et al., 2019) and its computational basis as modeled with deep neural networks (DNNs) (Kanwisher et al., 2023; Margalit et al., 2024).

Box 1: What is ‘category-selectivity’?


‘Category-selectivity’ is most strongly associated with clustering of neural responses in OTC for certain region-defining stimuli, as revealed with human neuroimaging (primarily fMRI). However, its origins are in neuropsychology and computer vision. For example, reports of face recognition impairment following lesions to the fusiform gyrus date back to the 19th century (Bodamer, 1947), and early work on semantic dementia described focal brain lesions that disrupted the ability to recognize or name specific types of objects (for a review, see Capitani et al., 2003). Similarly, the first primate electrophysiology studies of inferior temporal cortex found cells that responded most strongly to certain objects such as faces, bodies, and hands (Desimone et al., 1984; Gross et al., 1972; Perrett et al., 1982). Early human recordings from the cortical surface also revealed category-specific responses, such as for faces (Allison, 1999; Allison et al., 1994) and letter strings (Nobre et al., 1994). These findings dovetailed with an interest, which remains strong, in visual recognition for categories within the computer vision community (Hoffman & Richards, 1987; Marr, 1982).

To date, ‘category-selective’ responses have been reported for stimulus classes such as faces (Kanwisher et al., 1997; Puce et al., 1995), objects (Kourtzi & Kanwisher, 2001; Malach et al., 1995), scenes (R. Epstein & Kanwisher, 1998), tools (Chao et al., 1999; Stevens et al., 2015), bodies (Downing et al., 2001; Peelen & Downing, 2005), words (L. Cohen et al., 2000; Dehaene et al., 2002), numbers (Grotheer et al., 2016), and, as recently proposed, food (Jain et al., 2023; Khosla et al., 2022; Pennock et al., 2023). When the neural responses to these stimulus classes are contrasted with each other, they produce reliable peaks of activation in predictable locations that can be schematically mapped across the ventral surface of OTC (see the figure, panel a). While early reports described individual loci, subsequent work has revealed larger networks of regions, especially for faces (Duchaine & Yovel, 2015), bodies (Peelen & Downing, 2007), scenes (R. A. Epstein & Baker, 2019), and tools (Mahon & Almeida, 2024), with additional peak responses on the lateral OTC surface (not shown).

Although such findings of ‘category-selectivity’ are ubiquitous in visual neuroscience, the effect is not operationalized consistently, and the construct often remains obscure despite some attempts at a clearer definition (Bracci, Ritchie, et al., 2017). Here we briefly elaborate on both issues.

First, minimally, a neural response or region of the brain is ‘selective’ if it responds differentially and preferentially to a particular stimulus. Showing this usually involves comparing the neural responses between stimuli, but in practice many non-equivalent definitions are used that contrast the response to a target condition with one or more control conditions (figure, part b: A (red) = target; B (blue) = control condition(s)) (Downing et al., 2006; Op de Beeck et al., 2008; Stigliani et al., 2015). However, the contrasted stimulus conditions are often selected because of historical precedent rather than a clear theoretical rationale. For example, much of the evidence for distinct face- and body-selective areas comes from studies that do not directly compare the responses between faces and bodies, even though they are part of the same behaviorally relevant object: persons. In this respect, face- and body-selectivity may in fact reflect broader selectivity for persons, but this is obscured by standard procedures which fail to directly contrast them (Hu et al., 2020; Taubert et al., 2022) (but see Engell & McCarthy, 2014; Kliger & Yovel, 2024). This type of concern is not new (Friston et al., 2006; Gauthier & Nelson, 2001) but as we examine further in Section 4 (Figure 2), it has important theoretical consequences for understanding visual function in OTC.
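To make the non-equivalence of these definitions concrete, the sketch below computes three commonly used selectivity measures on the same simulated response data (all values and condition labels are hypothetical; this is an illustration of the general point, not a reanalysis of any cited study):

```python
import numpy as np

def selectivity_index(a, b):
    """Normalized difference, (A - B) / (A + B); bounded by [-1, 1] for positive responses."""
    return (a - b) / (a + b)

def response_ratio(a, b):
    """Simple ratio of target to control response."""
    return a / b

def dprime(a_trials, b_trials):
    """Mean difference scaled by the pooled standard deviation."""
    pooled_sd = np.sqrt((np.var(a_trials, ddof=1) + np.var(b_trials, ddof=1)) / 2)
    return (np.mean(a_trials) - np.mean(b_trials)) / pooled_sd

rng = np.random.default_rng(0)
target = rng.normal(1.2, 0.4, size=40)   # hypothetical betas, condition A (e.g., faces)
control = rng.normal(0.8, 0.4, size=40)  # hypothetical betas, condition B (e.g., objects)

print(selectivity_index(target.mean(), control.mean()))  # ~0.2 on a [-1, 1] scale
print(response_ratio(target.mean(), control.mean()))     # ~1.5, a "50% stronger" response
print(dprime(target, control))                           # ~1, in units of trial variability
```

The same pair of responses can thus be reported as weak, moderate, or strong ‘selectivity’ depending on the definition adopted, which is part of why the construct resists consistent operationalization.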

Second, the sense in which region-defining stimuli form comparable ‘categories’ is rarely stated clearly. The localized neural responses to faces now associated with category-selectivity have also been termed ‘domain-specific’ (e.g., Kanwisher, 2000), and sometimes ‘faces’ are still referred to as both a ‘domain’ and a ‘category’ (Arcaro & Livingstone, 2021; Bracci & Op de Beeck, 2023), without any clear articulation of either construct. By comparison, when referring to whole-brain networks that are domain-specific, the term ‘domain’ refers to the information-processing resources of the brain that serve a broad class of behavioral goals, like object manipulation (Mahon, 2022; Mahon & Caramazza, 2011).

To make the issue more salient, consider that substantive notions of ‘category’ do not readily group different region-defining stimuli together as comparable stimulus types (figure, part c). A category can arise from a discontinuous response to stimuli that vary continuously (e.g., left vs. right orientation) (Ashby & Maddox, 1993; Goldstone, 1998). Yet, stimuli like faces and scenes do not vary along visual feature dimensions such that they can easily be compared (Op de Beeck et al., 2008). A category can also be seen as a component of a taxonomic system of knowledge (e.g., for fruit) (McRae et al., 2005; Mirman et al., 2017). However, there is no obvious taxonomic system that applies to both faces and scenes at different levels. Finally, categories can also be arbitrary, and so defined based on any rule or property that we can represent (Lupyan & Thompson-Schill, 2012; Mirman et al., 2017). But this conflicts with the idea that OTC represents faces and scenes because they are ecologically important. Thus, it remains unclear what puts the ‘category’ in ‘category-selectivity.’

For all the progress the field has made from the study of category-selectivity, we believe it is worth asking whether this framework is still leading the field in the right direction (Box 1). In this Perspective we scrutinize a variety of theoretical and empirical assumptions of the category-selectivity framework. These assumptions are often implicit in our scientific practices such as experimental design and data analysis, which can make them entrenched and difficult to relinquish. Here, we attempt to bring them to the surface. We suggest a different course for the investigation of visual function in OTC that centers on the flexible use of visual information toward behavioral goals, rather than stimulus categories.

A commonly accepted explanation for why OTC exhibits category-selectivity is that region-defining stimuli are ‘ecologically important’ to different forms of natural behavior that are well adapted to our environment (Bracci & Op de Beeck, 2023; Conway, 2018; Malcolm et al., 2016; Peelen & Downing, 2017). For example, faces and bodies are important sources of social information for interacting with con- and allo-specifics; scenes include path and location information for navigation; and tools are graspable objects that we use in manipulating our environment. However, if the apparent selectivity for these stimuli is explained by the complex behaviors they facilitate, then we propose the topography of visual function in OTC may be better described as directly coalescing around these associated natural behaviors themselves, rather than region-defining stimulus categories as such. In other words, the functional organization of OTC is optimized for representing properties of the visible environment relevant to planning natural behaviors – or more simply, their behavioral relevance (Cisek & Kalaska, 2010; Treue, 2001).

In proposing a framework centered on how OTC codes for behavioral relevance, we seek to build on many themes found in prior work. As human neuroscience has progressed, there have been many calls for the adoption of more ethological approaches to studying the brain (Cisek & Kalaska, 2010; Krakauer et al., 2017; Pessoa et al., 2022; Shamay-Tsoory & Mendelsohn, 2019; Nastase et al., 2020). In the case of visual neuroscience, there have also long been pleas that the visual system should be investigated based on its contributions to natural behavior (Churchland et al., 1993; Ingle, 1967). Many theories have also situated visual function in OTC relative to larger whole-brain networks that subserve the planning of different natural behaviors (Behrmann & Plaut, 2013; Kravitz et al., 2013; Mahon & Caramazza, 2011; A. Martin, 2016; Milner & Goodale, 2008). Of particular note, many have recently argued that the visual function of category-selective brain regions in OTC is not exclusively centered on visual recognition, but on the representation of many properties of region-defining stimuli that are important to natural behaviors (Bracci & Op de Beeck, 2023; Cox, 2014; Malcolm et al., 2016; Peelen & Downing, 2017). However, what has been lacking is a direct interrogation of the framework of category-selectivity itself, or the articulation of a viable, ethologically inspired alternative.

Where we depart from prior work is in arguing that the explanation of how visual function in OTC is organized around behavioral relevance ultimately requires moving beyond the limiting framework of category-selectivity, even as we continue to build upon its hard-won insights. More specifically, our discussion proceeds as follows. We first elaborate on our characterization of behavioral relevance and the unique explanatory challenges that it poses (Section 2). We then make the case for moving beyond the framework of category-selectivity. We show how the default framework of category-selectivity is in tension with prioritizing behavioral relevance in our understanding of visual function in OTC (Section 3). We also argue that the evidence for category-selectivity, while robust and reliable, has led much of the field toward a limiting view of visual function in OTC (Section 4). Given these critiques, we propose an alternative model for explaining how OTC codes behaviorally relevant visual properties, which also accommodates findings associated with category-selectivity (Section 5). Finally, we suggest practical ways that existing tools can be harnessed to study behavioral relevance more directly (Section 6).

2. Putting behavior first

In our view, a fuller understanding of visual function in OTC will only be achieved by adopting a more ethological framework that prioritizes behavior in our thinking. A foundational theme in the fields of both ethology and neuroscience is that studying a complex biological system requires asking not just what an organism is trying to do, but also why (Marr, 1982; Mayr, 1961; Ritchie, 2019; Tinbergen, 1963). From an ethological perspective the ‘why’ of visual processing is the facilitation of different forms of complex, adaptive, natural behavior in a dynamic environment. This perspective requires conceptualizing the functional organization of OTC in terms of its role in processing properties of the visible environment in the service of goal-directed behavior (Ingle, 1967; Milner & Goodale, 2008).

The behavioral relevance of any given visual property is determined by the interaction between what we see and what we are trying to do. Imagine going for a jog in a park and encountering a woman walking her dog (Figure 1(a)). In such a situation you may pursue a number of different goals, such as continuing with your run or stopping to pet the dog. For either action the visible environment offers a wealth of useful information. If you choose to jog past the woman, you might consider the distance to the tree, the wetness of the grass, and how fast she is walking. The gaze direction of the woman, the firmness of her grip on the leash, or the friendliness of the dog, may also be important sources of information. These very same visual features may also be important, but to different degrees, if you wish to pet the dog. In which case, the apparent friendliness of the dog, the gaze of the woman, and her grip on the leash may take on greater significance compared to how fast she is walking, the state of the grass, or the position of the tree line. This simple example highlights why behavioral relevance is not a fixed feature of the visual environment, but results from the interaction between the environment and our behavioral goals (Cisek & Kalaska, 2010; Treue, 2001).

Figure 1.

Behavioral relevance as a framework for understanding visual function in occipitotemporal cortex. (a) While we frequently experience digital renderings of visually complex natural images, our visual system is structured (through evolution and development) to encounter visual environments only during natural behavior. Many different aspects of the visual environment can be behaviorally relevant depending on the interaction between our behavioral goals and what we are looking at. (b) Within the traditional framework of category selectivity, the stimulus is parsed into visual categories (e.g., faces, tools, scenes) which are each associated with different broadly defined classes of behaviors (e.g., social interaction, object manipulation, visual navigation). In contrast, an ethologically inspired framework puts (classes of) natural behavior first (e.g., navigating by running around an obstacle), rather than the stimulus. Based on the behavioral goal, the relevant visual information is extracted from the stimulus— it is a product of both the visual stimulus and the natural behavior.

The interactive nature of behavioral relevance poses two important challenges for visual function in OTC (Malcolm et al., 2016; Nau et al., 2024). The first is visual diversity: a wide range of properties in the visible environment can be relevant to the same behavioral goal (Figure 1(a)). For example, if you choose to jog around the woman or pet the dog, relevant visual information may come from material (wetness of the grass), spatial (distance to the trees), social (friendliness), relational (gaze direction), affordance (secureness of leash), or dynamic (speed of walking) properties of the environment. However, the specific list of behaviorally relevant visual properties might look quite different if our goal stays the same, but we are in a different visible environment (e.g., jogging on an indoor track or petting a dog at a shelter). Thus, neural coding in OTC must be able to cope with the visual diversity of behaviorally relevant properties we encounter in our environment when engaging in natural behavior.

The second is goal-dependence: although the properties of the visual environment may remain stable, our goals are not fixed, and shifting them changes how we represent and evaluate what we see. In other words, although the same properties of a scene may be relevant to different goals, what will change is the significance of those properties. For example, whether a dog appears friendly is important whether your goal is to avoid it or pet it (Figure 1(a)). In the former case, you may wish to know whether it will lunge as you pass by, and, if it does, whether it is motivated by excitement or aggression; otherwise, the dog’s friendliness may be of low importance to your goal. If you are stopping to pet it, its temperament is of central importance to how you approach. Thus, neural coding in OTC must be sensitive to this goal dependence and differentially process potentially behaviorally relevant sources of visual information.

If the visual function of OTC reflects the need to facilitate different types of natural, adaptive behaviors, then our focus should be to explain how it addresses these two challenges and bridges the gap between our goals and incoming visual information. In this respect, there is an important difference between the way in which behavior is commonly incorporated into the existing framework of category-selectivity and the course correction that we are proposing here in terms of behavioral relevance (Figure 1(b)). While the category-selectivity framework initially stemmed from considerations of ecological importance, the focus of study is primarily on the stimuli (e.g., faces, scenes), with associated types of natural behaviors (e.g., social interaction, navigation) assumed to follow category-specific processing (e.g., recognition) of those stimuli. In contrast, our proposal involves prioritizing the behavioral relevance of visual properties, so that we consider goal-directed natural behavior first, and the visual information relevant to the specific type of behavior second.

It may seem that shifting focus to behavioral relevance invites an approach to visual function in OTC that is context-sensitive to the point of making any generalizations impossible. After all, behavioral relevance results from the interaction between our behavioral goals and our visual environment, and both can vary in myriad ways. However, it is worth remembering that cognitive ethologists face the same apparent problem, and for this reason study broad classes of natural behaviors such as foraging or threat detection (Cisek, 2007; Mobbs et al., 2018; Pessoa et al., 2022). Although we have illustrated the idea using specific examples (e.g., jogging and petting), we have in mind coding for behavioral relevance at the level of broad classes of adaptive, natural behaviors (e.g., navigation and object manipulation). These types of behavior have generally been associated with whole-brain domain-specific networks of information-processing (Bi, 2020; Mahon & Caramazza, 2011; A. Martin, 2016), as we return to in Section 5.3.

In the next two sections we show how the dominant framework of category-selectivity has both helped and hindered our understanding of how OTC meets the twin challenges of visual diversity and goal-dependence.

3. Studying category-selectivity is theoretically limiting

Recent work on category-selectivity has to some extent emphasized the importance of behavior to visual function in OTC. On the one hand, it has been proposed that OTC represents a variety of behaviorally relevant visual properties of region-defining stimuli beyond their category membership (Bracci & Op de Beeck, 2023; Cox, 2014; Peelen & Downing, 2017). On the other hand, networks of purportedly category-selective brain regions in OTC have also been (re)described in terms of functional profiles directly related to classes of natural behavior. For example, tool-selectivity has been associated with a network of regions specialized for representing visual properties for object-directed action (Mahon & Almeida, 2024; Mahon et al., 2007; Wurm & Caramazza, 2022), and more recently face- and body-selectivity has been characterized as recruiting visual pathways specialized for social interaction (Pitcher & Ungerleider, 2021).

These are important preliminary steps in changing how the field thinks about visual function in OTC. However, we believe further progress requires moving beyond the construct of category-selectivity (Box 1). As we argue in this section, category-selectivity: (i) invites us to view behavioral relevance through the lens of region-defining stimuli; and (ii) assumes categorization is a core aspect of information-processing in OTC. These two features of category-selectivity encourage a narrow view of visual function in OTC that obscures the twin challenges of visual diversity and goal dependence posed by behavioral relevance.

3.1. The stimulus does not determine the behavior

It has been proposed that how category-selective brain regions represent their preferred stimuli is shaped by the behaviors they facilitate (Bracci & Op de Beeck, 2023; Conway, 2018; Peelen & Downing, 2017). In support of this, many studies have provided evidence that these regions represent the behaviorally relevant features of their region-defining stimuli. For example, so-called ‘scene-selective’ regions have been shown to represent many scene properties highly relevant to navigation, such as the typical arrangement of objects in scenes (Kaiser et al., 2019), distance and openness (Kravitz et al., 2011; Lescroart & Gallant, 2019), and navigational affordances (Bonner & Epstein, 2017; Persichetti & Dilks, 2018), among other properties (for review, see Dilks et al., 2022; R. A. Epstein & Baker, 2019). In this way, category-selectivity emphasizes specific associations between specific region-defining stimuli (e.g., scenes) and natural behavior (e.g., navigation). However, this obscures the two most challenging features of behavioral relevance.

First, category-selectivity draws attention away from the visual diversity of behaviorally relevant properties in real-world visible environments. For example, when going for a run, we may deliberate about the best way to avoid an obstruction on the path, such as a woman walking her dog (Figure 1(a)). In such a circumstance, the woman, her dog, and the dynamics of the leash are all highly relevant to the action we are taking, which is a form of navigation. Because category-selectivity specifically associates navigation with (non-agentive) scene properties, it limits consideration of navigationally relevant visual properties from other ‘categories’ such as faces and manipulable objects because of their default associations with types of behavior such as social interaction and object manipulation (Figure 1(b)). However, for natural behavior there is no simple mapping of visual stimuli to behavioral relevance as implied by category-selectivity.

Second, category-selectivity implies that behavioral relevance is a fixed feature of visual properties and not goal dependent. However, as we have emphasized, the very same visual properties can be important to different behaviors (Figure 1) (Malcolm et al., 2016; Peelen & Downing, 2017). For example, because faces, bodies, and animals all tend to be prelabeled as stimuli important to social interaction, they are typically ignored as sources of navigationally relevant information, as illustrated by the ubiquity of studies that use ‘scene’ stimuli that are intentionally designed to exclude people and animals. Of course, some stimulus-behavior connections are immutable – we can only socially engage with other agentive living things, for example. However, the problem is that the flexible and goal-dependent nature of behavioral relevance is obscured by rigidly associating stimuli (e.g., agentless scenes) with a broadly defined behavior (e.g., navigation), as is the case with category-selectivity.

3.2. Categorization does not determine the behavior

Constructing viewpoint-invariant representations of object identity and category membership has often been characterized as a primary function of the ventral stream (DiCarlo et al., 2012; Gauthier & Tarr, 2016; Margalit et al., 2024), with category-selectivity a byproduct of constructing such representations for ecologically important stimuli (Bracci, Ritchie, et al., 2017; Grill-Spector & Weiner, 2014). Recently it has been argued that, because OTC plays a central role in many behavioral tasks besides recognition, category-selective brain regions represent many properties that are distinct from, and even orthogonal to, stimulus identity or category membership (Bracci & Op de Beeck, 2023; Cox, 2014; Peelen & Downing, 2017). For example, head orientation and body configuration are important social cues but are incidental to determining individual identity (Gelder & Solanas, 2021; Perrett et al., 1992). Similarly, visual field position informs how we foveate or reach toward behaviorally relevant aspects of visual scenes, but retinotopic position is orthogonal to object category (Groen et al., 2022; Kravitz et al., 2008). Under this broader view, categorization is just one of the functions implemented by OTC. However, this more expansive view still generates a distorted picture of the two main challenges posed by behavioral relevance.

First, categorization is sometimes considered a ‘task’ of OTC comparable in status to forms of natural behavior like social interaction, navigation, and object manipulation (Bracci & Op de Beeck, 2023; Malcolm et al., 2016; Peelen & Downing, 2017). In which case, it would delineate another type of goal dependence of visual function in OTC. However, putative forms of natural behavior, like navigation, are reflected in actions we plan and execute in the service of our goals (e.g., jogging along a path requires us to find a clear path and represent visual properties that suggest different navigational affordances). In contrast, visual categorization does not directly map to behavior in the real world outside of experimental or basic educational settings any more than other forms of visual processing do. When looking at an object like a dog, for example, few would claim that representing it as oblong, brown, or furry is a form of natural behavior. Instead, in the context of natural behavior, categorization is primarily a cognitive process that sometimes serves our goals and not a form of natural behavior in and of itself (Milner & Goodale, 2008).

Second, and in response to this problem, it is sometimes suggested that in order to plan behavior we have to first categorize what we are looking at (Bracci & Op de Beeck, 2023; Peelen & Downing, 2017). In which case, the visual diversity of behavioral relevance first depends on categorization as a filter. However, even if categorization helps us make sense of the visible world, it does not follow that categorization is required to determine which signals are behaviorally relevant (Bi, 2020; Cox, 2014; Gelder & Solanas, 2021; Groen et al., 2017). Material properties are often important to behavior (e.g., whether the grass is slick or the leash is pliable in Figure 1(a)) and are represented in OTC (Hiramatsu et al., 2011; Schmid & Doerschner, 2019; Wada et al., 2014). But we can readily infer such properties directly when representing real-world environments without relying on categorization, even if material property estimation and categorization often interact (Schmid et al., 2023). For example, when we pick up an unfamiliar object, we perceive its apparent rigidity before it is grasped, and may only come to know what it is we are looking at through visually guided manual exploration. Further, the information needed to inform behavior typically spans the putative region-defining categories (Figure 1(a)), making physiologically distinct representations of those categories inefficient. Thus, the diverse visual properties available during action planning need not be categorized in order to successfully inform behavior.

3.2.1. Summary

OTC represents the visible environment to inform natural behavior (Kravitz et al., 2013; Milner & Goodale, 2008). Given this, category-selectivity paints a misleading picture of the scope of behaviorally relevant properties that OTC represents, as it places undue functional importance on (i) region-defining stimuli; and (ii) visual categorization. These characteristics of category-selectivity obscure visual diversity and goal dependence, which are the main challenges facing visual processing in the service of behavioral relevance. Consequently, from an ethological perspective, it may be theoretically misleading to treat category-selectivity as central to the organization of visual function in OTC.

4. Studying category-selectivity is empirically limiting

By emphasizing the importance of behavioral relevance to visual function in OTC, it seems we are making a bold prediction: that there should be overlapping selectivity for a wide variety of stimuli, so long as they are relevant to similar natural behaviors. For example, regions characterized as face-selective should also respond to stimuli associated with other forms of category-selectivity, such as bodies, places, or tools, if they are relevant to social interaction. Yet, in apparent contradiction to this prediction, it is generally held that there is overwhelming evidence that these region-defining stimuli elicit dissociable loci of selectivity in OTC.

We do not dispute the robustness and reliability of category-selective effects per se (though see Box 1), which have also provided a key frame of reference for comparing the topography of OTC across participants, experimental designs, and research teams (Rosenke et al., 2021; Saxe et al., 2006). However, that robustness has led to a strong focus on finding category-selectivity and interpreting results within its context, leading to two key problems. First, many designs drastically undersample the diversity of possibly behaviorally relevant stimuli in favor of enabling simple contrasts to define and investigate brain regions. Second, even when results do show reliable responses that are more complex (e.g., graded rather than categorical selectivity), interpretations tend to overemphasize apparent categorical differences. Thus, the empirical base for category-selectivity has led to a restricted view of what drives clustering of visual function in OTC.

4.1. Limitations in experimental design

Since studies of category-selectivity tend to selectively sample stimuli in their experimental designs, any results they report underdetermine the visual selectivity exhibited by OTC (Friston et al., 2006; Gauthier & Nelson, 2001). This is best illustrated by classic univariate analyses used to functionally localize category-selective regions of interest (ROIs) in OTC based on the neural response to region-defining stimuli during passive viewing or attentional tasks (e.g., one-back) unrelated to natural behavior. Typically, this involves comparing the magnitude of response between pairs of stimuli such as faces vs. objects, which results in a map of stimulus preference across ventral OTC – some cortical areas respond more to faces, and other areas to objects (Figure 2(a)).

Figure 2.

How focusing on category-selectivity limits the investigation of behavioral relevance. (a) Illustration of a common metric for measuring the visual selectivity of brain regions or neurons by comparing a small number of conditions based on their response magnitude between a target condition A/red and control condition(s) B/blue. The schematic charts illustrate hypothetical responses to a large number of stimuli ranked by magnitude, with the different underlying response distributions shown in the inset histograms (red and blue highlights on the x-axis indicate portions of the distribution that the target and control conditions might be sampled from). Only separate or multimodal distributions are indicative of a possible categorical division, which may also vary depending on behavioral goals. (b) Data from the THINGS fMRI dataset (Hebart et al., 2023). The response magnitude for 720 object concepts in the face-selective region of bilateral fusiform gyrus (n = 3). Top-left, a bar graph shows the mean beta averaged across voxels for the average of nine “face” concepts and nine “object” concepts. Error bars are ± 1 standard deviation. Top-right, the distribution of responses for all 720 concepts, ordered along the x-axis by response magnitude. The distribution is unimodal and skewed, rather than separate or bimodal. The locations of the nine face and nine object concepts are marked by red and blue vertical lines. Bottom, the top 10 of the 720 concepts that produced the strongest average response in the fusiform gyrus, which are not obviously groupable by the label “face”. (c) Top-left, 2D multi-dimensional scaling solution for the group-averaged patterns of neural responses to 92 stimuli, color coded by object type, in human ventral occipitotemporal cortex, which shows no bimodal clustering for the division between animate and inanimate objects (Kriegeskorte et al., 2008). Top-right, mean beta weights for 20 stimulus types in face-selective right fusiform gyrus (Downing et al., 2006). Stimulus types are color coded based on whether they are faces (red) or not (blue). The responses suggest a skewed, unimodal distribution. Bottom-left, group-averaged t-SNE embedding of the neural responses for 10,000 images of complex scenes in anterior ventral occipitotemporal cortex, color coded based on the (co-)occurrence of object types (Allen et al., 2022). Coding of the embedding points by stimulus type does not show separate categorical clusters. Bottom-right, group-averaged response magnitudes for two neural components for 515 images of complex scenes, color coded based on the saliency of “scenes” and “faces” in the images (Khosla et al., 2022). Response profiles are unimodal and skewed, not separate or bimodal.

This ubiquitous approach to defining functional ROIs presumes that these experimental conditions sample from different random distributions (e.g., separate normal distributions), and that the relative selectivity is not impacted by changes in observer goals. However, when only two (or a few) conditions are considered, an apparent categorical difference between faces and objects could originate from many selectivity profiles in ventral OTC, including continuous distributions (e.g., multimodal, skewed, platykurtic, or normal) that have no categorical boundary (Figure 2(a)). Furthermore, what form the distribution takes could depend on what task observers are performing. Thus, the apparent categorical response difference between faces and objects in ventral OTC may be an artifact of undersampling the underlying selectivity distribution based on both stimulus and task. Simply put: one cannot infer selectivity from selective sampling – especially if we wish to characterize the diversity of visual properties in the environment relevant to natural behaviors.
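A toy simulation makes the undersampling problem concrete (this is our own illustrative construction, not data from any cited study): responses to many stimuli are drawn from a single skewed, unimodal distribution with no categorical structure, yet a two-condition contrast sampled from different parts of that distribution yields an apparently categorical difference.

```python
import numpy as np

rng = np.random.default_rng(42)

# One continuous, skewed, unimodal selectivity profile across 720 stimuli --
# the generating process contains no categorical boundary at all.
responses = rng.lognormal(mean=0.0, sigma=0.6, size=720)
ranked = np.sort(responses)[::-1]

# Emulate a localizer-style contrast by sampling the "target" condition from
# the upper tail and the "control" condition from the middle of the ranking.
target = ranked[:9]        # e.g., nine "face" exemplars
control = ranked[300:309]  # e.g., nine "object" exemplars

print(f"target mean:  {target.mean():.2f}")   # large
print(f"control mean: {control.mean():.2f}")  # much smaller
# The contrast looks like a step difference between two categories, even
# though the full 720-stimulus profile is graded from end to end.
```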

Selective stimulus sampling is pervasive in the evidential base for category-selectivity, which includes a large body of work that has used a variety of methods to corroborate univariate findings. These include decoding the differences between patterns of neural responses for region-defining stimuli using machine learning classifiers (Haxby et al., 2001; O’Toole et al., 2005), identifying differential patterns of functional and structural connectivity for region-defining stimuli (Osher et al., 2016; Saygin et al., 2012), or selectively inhibiting neural responses for region-defining stimuli using transcranial magnetic stimulation (TMS) or microstimulation (Jonas et al., 2015; Parvizi et al., 2012; Pitcher et al., 2009; Schalk et al., 2017). The same limitation even applies to clinical neuropsychology studies, which historically have been interpreted as providing some of the most compelling evidence for category-selectivity. For example, in both acquired and developmental prosopagnosia (‘face blindness’), when behavioral performance is studied systematically in relation to other stimuli, face-specific deficits tend to be the exception rather than the rule (Geskin & Behrmann, 2018; Rice et al., 2021).

Selective sampling also persists in newer research on category-selectivity. This is best illustrated by two exciting research directions in visual neuroscience: the developmental basis of category-selectivity (Arcaro & Livingstone, 2021; Op de Beeck et al., 2019) and the modeling of category-selectivity with DNNs (Kanwisher et al., 2023). In the case of development, the same restricted set of stimuli, and comparisons, are still used whether they are studies of category-selectivity in infants and children (Cabral et al., 2022; Kamps et al., 2020; Kosakowski et al., 2022), the congenitally blind (Ratan Murty et al., 2020; van den Hurk et al., 2017), or non-human primates that have been reared with selective visual experience (Arcaro et al., 2017). In the case of DNNs, several recent studies purport to show that topographic differences similar to those exhibited in OTC emerge in DNNs when trained to categorize region-defining stimuli (Blauch et al., 2022; Dobs et al., 2022; Margalit et al., 2024). However, despite the cutting-edge nature of these AI architectures, when it comes to the evidential base for category-selectivity these DNNs are employed to verify, rather than test, the same simple picture introduced by traditional univariate neuroimaging methods based on selective stimulus sampling. Because of this, they offer limited insight into the computational basis of clustering of visual function in OTC.

Again, we do not dispute the importance of these findings; indeed, it is only because of the empirical foundation that they provide that we have evidence that OTC codes for behavioral relevance. However, if we wish to understand how OTC might code for behaviorally relevant stimuli that are visually diverse, or how this coding is modulated by our behavioral goals, then the selective sampling that is paradigmatic of studies on category-selectivity is ill-suited to this explanatory endeavor.

4.2. Limitations in interpretation

The obvious solution to the selective sampling problem is to use larger, more diverse stimulus sets. Going further, one could also try to minimize bias in the interpretation of results by adopting data-driven approaches to analysis. Indeed, studies that adopt these approaches purport to provide confirmation of category-selectivity in OTC. We agree wholeheartedly with the importance of both methodological shifts – although they do not directly address the challenge of goal dependence, a point we return to at the end of this subsection (Nau et al., 2024; Kay et al., 2023). However, it is our contention that these studies do not provide clear confirmatory evidence of category-selectivity. Instead, the framework of category-selectivity has limited what analyses are performed and how findings are interpreted, even if results are more in line with predictions of the ethological framework we are proposing.

First, examining the response profile in putative category-selective brain areas for a much broader range of stimuli supports the idea that apparent categorical differences are frequently a result of undersampling. For example, when we plotted fMRI data from the THINGS dataset (Hebart et al., 2023) in terms of the average response to nine ‘face’ concepts and nine ‘object’ concepts in a face-selective region of the bilateral fusiform gyrus, we obtained the classic (apparently) categorical result of a much stronger response to faces over objects (Figure 2(b), left panel). However, when we plotted the response to all 720 THINGS concepts separately and ordered them in terms of response magnitude, it revealed a skewed unimodal distribution with no discernible categorical boundary (Figure 2(b), right panel). Interestingly, the top 10 stimulus concepts with the strongest response magnitude include some faces, but also animals and body parts (Figure 2(b), bottom panel). These data are consistent with a higher response magnitude to stimuli relevant to social interaction (persons and other agents) in OTC, but not with a categorical difference in response to faces versus other objects (Hu et al., 2020; Taubert et al., 2022).
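Schematically, the analysis behind Figure 2(b) amounts to a few steps, sketched below with simulated stand-in data (the array shapes, values, and index assignments are placeholders, not the actual THINGS-fMRI pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the real data: GLM betas for 720 concepts x 150 ROI voxels
# in a face-selective fusiform region. Values are simulated for illustration.
betas = rng.lognormal(mean=0.0, sigma=0.5, size=(720, 150))

roi_response = betas.mean(axis=1)   # mean beta across voxels, per concept
face_idx = np.arange(9)             # hypothetical rows for the nine "face" concepts
object_idx = np.arange(9, 18)       # hypothetical rows for the nine "object" concepts

# The two-condition contrast (Figure 2(b), left panel)
print(roi_response[face_idx].mean(), roi_response[object_idx].mean())

# The full ranked profile (Figure 2(b), right panel): order all 720 concepts
order = np.argsort(roi_response)[::-1]
print(order[:10])  # indices of the ten strongest concepts (Figure 2(b), bottom)
```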

Other studies using ‘condition-rich’ designs often select stimuli with the framework of category-selectivity in mind, resulting in a disproportionate number of exemplars from region-defining ‘categories’ (Bao et al., 2020; Kriegeskorte et al., 2008; Ratan Murty et al., 2021). Even with this sampling bias, it is striking to note that, as with the THINGS dataset, the results of these studies do not clearly support strong categorical differences even in brain regions defined by selectivity to a particular stimulus, such as ‘faces’ (Figure 2(c)). When the response magnitudes for multiple stimuli are plotted, the distribution is again typically graded, often without any clear category boundaries in the representational space or a step difference in response magnitude (Downing et al., 2006; Mur et al., 2012; Ratan Murty et al., 2021; Vinken et al., 2023). One concrete example of this is object animacy, which is thought of as a categorical property associated with face, body, and animal stimuli, but is represented in a graded and continuous fashion in regions of OTC (Proklova & Goodale, 2022; Ritchie et al., 2021; Sha et al., 2015; Thorat et al., 2019). Thus, even with broader stimulus sets, the interpretation of results is often influenced by the assumptions of the category-selectivity framework, even when the results match the illustrative analysis in Figure 2(b).
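The graded-versus-categorical question can also be posed as an explicit model comparison. The sketch below (simulated data, not a reanalysis of the cited studies) shows that when a response is driven by a continuous property such as animacy, a graded predictor explains more variance than a binarized category label:

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated regional responses driven by a *continuous* animacy score in [0, 1];
# the effect size, noise level, and sample size are illustrative only.
animacy = rng.random(200)
response = 2.0 * animacy + rng.normal(0.0, 0.5, size=200)

def r_squared(predictor, y):
    """R^2 of an ordinary least-squares fit with an intercept term."""
    design = np.column_stack([np.ones_like(predictor), predictor])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    return 1.0 - residuals.var() / y.var()

print(r_squared(animacy, response))                        # graded predictor
print(r_squared((animacy > 0.5).astype(float), response))  # binary "category" label
# The graded predictor wins whenever the underlying code is continuous,
# which is one way to adjudicate between the two interpretations.
```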

Second, data-driven methods have the potential to avoid methodological biases when they are used to analyze large neuroimaging datasets consisting of neural responses to thousands of visual stimuli that were not selected based on prior assumptions of category-selectivity. A striking example of this is provided by recent studies using the Natural Scenes Dataset (NSD), which consists of high-resolution fMRI responses and behavioral annotations for thousands of complex real-world scene images (Figure 2(c)) (Allen et al., 2022). These studies found that the variation in the fMRI signal was explained by components or response profiles that seemed to match preexisting forms of category-selectivity in OTC (e.g., for faces), and identified a less commonly recognized form of selectivity for food images (Jain et al., 2023; Khosla et al., 2022; Pennock et al., 2023; though see van der Laan et al., 2011). These results have been interpreted as confirming the existence of category-selective brain regions by combining big data with unsupervised analysis methods (Bannert & Bartels, 2022).
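As a rough illustration of this class of data-driven methods, the sketch below applies off-the-shelf non-negative matrix factorization, one of several decomposition techniques used in this literature, to a simulated image-by-voxel response matrix (the matrix, component count, and interpretation step are placeholders, not the actual NSD analyses):

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(1)

# Simulated stand-in for an (images x voxels) response matrix; the real NSD
# analyses decompose responses to tens of thousands of scene images.
X = rng.random((1000, 400))

# Factorize into k components: W holds per-image response profiles,
# H holds per-voxel loadings (i.e., where each component lives on cortex).
model = NMF(n_components=5, init="nndsvda", random_state=0, max_iter=500)
W = model.fit_transform(X)  # (1000 images x 5 components)
H = model.components_       # (5 components x 400 voxels)

# Components are interpreted post hoc, e.g., by inspecting which images load
# highest on each one -- the step where prior category labels can creep back in.
top_images = np.argsort(W[:, 0])[::-1][:10]
print(top_images)
```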

We believe the results of these NSD studies are consistent with a different interpretation, as illustrated by the supposedly ‘discovered’ food-selectivity. Inspection of the graded response profiles reported by these studies shows no categorical step (Khosla et al., 2022), and all three studies localized food-selectivity both medial and lateral to face-selectivity in ventral OTC (Jain et al., 2023; Khosla et al., 2022; Pennock et al., 2023). These expanses of cortex are also associated with clustering of selectivity for tools (Mahon & Almeida, 2024; Wurm & Caramazza, 2022). This similar clustering of selectivity for tools and food items makes sense from an ethological perspective, since in both cases we manipulate them with our hands in stereotyped ways during actions such as hammering a nail or stuffing our faces. In light of this, overlap in selectivity is to be expected, much as has been observed for hands and tools (Bracci et al., 2012; Schone et al., 2021). A recent study also provided direct evidence of this overlap for food- and tool-selectivity (Ritchie, Andrews, et al., 2024). Thus, the NSD results are compatible with an alternative interpretation in terms of behavioral relevance rather than category-selectivity and demonstrate that, even when using data-driven methods, prior assumptions of category-selectivity can influence how subsequent results are interpreted.

Given all of the above discussion, it is worth reemphasizing that datasets like THINGS and NSD did not incorporate any task manipulations, as data were collected in each case using only a single task. As has recently been emphasized, it is not obvious how well we can generalize about the organization of neural function when we ignore how task demands modulate neural responses (Nau et al., 2024; Kay et al., 2023). Studies that have looked at task effects have shown how patterns of activity within different regions can be modulated by whether we attend to different simple or complex visual properties, or affordance properties related to how objects are grasped (Bracci, Daniels, et al., 2017; Fabbri et al., 2016; Harel et al., 2014; Vaziri-Pashkam & Xu, 2017). Similarly, changing what stimulus properties we attend to can change how different stimulus properties predict neural responses across the cortical surface (Çukur et al., 2013). Although these studies did not use tasks tailored to behavioral relevance (a topic we return to in Section 6.2), they have important implications for the studies and datasets we have discussed. For example, in the case of the THINGS dataset, the shape of the distribution of responses in ventral visual cortex (Figure 2(b), right) could have taken on a different form under a different task (Figure 2(a)). Similarly, there is no reason to assume that response loadings for the NSD stimuli onto dimensions extracted from modeling voxel time courses would stay constant if the task were varied (Figure 2(c)). Given these considerations, we cannot make strong inferences about visual function in OTC if our designs are made more stimulus rich, while remaining task poor.

4.2.1. Summary

Studies of category-selectivity offer robust and replicable empirical findings; however, the apparently categorical nature of responses in OTC is likely a byproduct of undersampling the underlying distribution of selectivity for behavioral relevance. Recent approaches leveraging larger stimulus sets and data-driven methods have the potential to avoid the issues inherent in selective sampling, as long as category-selectivity is not an implicit assumption in the experimental design or the interpretation of subsequent results. To understand how OTC supports natural behavior requires developing an alternative and more ethological framework, which we outline in the following sections.

5. Charting behavioral relevance in OTC

Despite our criticisms, there are several reasons the field has progressed using category-selectivity as a model of visual function in OTC (Box 1). First, it clearly delineates what kinds of complex visual properties are prioritized by OTC when representing real-world environments (e.g., faces and scenes). Second, localizing the neural representation of these visual properties to networks of relatively discrete brain regions allows for their isolation and direct probing through causal interventions. Finally, the topography of these purported brain regions is seemingly constrained by the functional organization of the visual system and the brain more generally (Arcaro & Livingstone, 2021; Op de Beeck et al., 2019). If category-selectivity paints a theoretically and empirically limiting picture of how behavioral relevance drives visual function in OTC, as we have argued, then what is the ethologically inspired alternative model of visual function in OTC?

So far, we have suggested that the spatial clustering of neural responses typically described in terms of category-selectivity instead reflects the role of OTC in facilitating different forms of natural behavior. Furthermore, we have proposed that this clustering does not reflect coding for properties of region-defining stimuli but rather coding for behavioral relevance. In this section we build on this sketch by proposing a model of: (i) what behaviorally relevant visual properties are represented; (ii) how the representations of these properties are implemented in OTC; and (iii) which factors constrain the topography of these representations in OTC. In each case, we also suggest how to reinterpret the compelling picture offered by category-selectivity in light of these alternatives.

5.1. What visual properties are behaviorally relevant?

With respect to what is represented in OTC, the answer is simple: look to the behavior. We believe the forms of natural behavior associated with category-selective brain regions, such as social interaction, environment navigation, and object manipulation, are fitting starting points for inquiry (Bi, 2020; Mahon, 2022). But whatever natural behavior one starts with, whether jogging on a path or petting a dog (Figure 3(a)), the range of behaviorally relevant visual properties in the environment can only be identified in the course of specific actions (Churchland et al., 1993; Hayhoe, 2017). Recognizing the active nature of vision is part of the impetus for emphasizing the two explanatory challenges posed by behavioral relevance that we have highlighted throughout (Figure 3(a)): visual diversity and goal dependence. These two features of behavioral relevance not only provide a critical grounding for how we should study visual function in OTC, they also make two crucial predictions: namely that there is graded, broad selectivity for a wide range of visual properties in portions of OTC previously associated with category-selectivity; and further, that the spatial extent and magnitude of this selectivity is modulated by task goals that recruit processes specialized for different natural behaviors.

Figure 3.

A proposal for how visual cortex represents behavioral relevance. (a) A heterogeneous collection of visual signals can be collectively relevant to the same natural behavior (e.g., jogging on a path). These same signals may also be relevant to a very different behavioral goal in the same environmental context (e.g., petting a dog), but to different degrees. (b) Behaviorally relevant visual dimensions are likely to be represented in a widely distributed fashion throughout OTC, and potentially visual cortex more generally, which is here depicted by color patches across the cortical surface showing where, hypothetically, the dimensions significantly explain some variance in the neural response. However, at any cortical locus, only some of these dimensions will be represented, which is depicted using a radial plot in which bars indicate the hypothetical degree of explained variance for different dimensions at cortical loci in OTC. In this respect the code is globally distributed but can be locally sparse. (c) Factors that determine cortical organization and that may constrain the sparse coding of behaviorally relevant dimensions in OTC.

How then should we think about the sorts of stimuli commonly associated with category-selectivity effects (e.g., faces and scenes)? As we have alluded to, these stimuli may be inherently important for certain forms of behavior, regardless of goal context (Bi, 2020). For example, persons and animals are what we socially interact with, whatever our specific goals or present environment. Thus, labels like ‘face’ and ‘body’ point to stimuli of high relevance for (social) interaction, but as we have emphasized, the visual function of clustering in OTC is not to represent only these specific stimuli as such, but their behaviorally relevant properties and other parts of the visible environment. Furthermore, these same properties can be represented in different ways in the service of goals that fall within different behavioral domains. For example, gaze direction may be an important cue when we initiate social interaction, but it also constrains our navigational affordances when running on a crowded path (Figure 3(a)). Crucially, there is no sense in which a label like ‘face’ or ‘body’ reflects the default starting point for how we parse real-world scenes for socially relevant cues. Instead, the ‘basic’ or entry level at which we discern what visual properties are relevant to our goal is mutable, and influenced by context and experience, as is the case with basic levels in taxonomic hierarchies (Schyns, 1998; Tanaka & Taylor, 1991). For example, a group of people interacting, or an agent performing a behavior, will produce many important visual properties relevant to our social goals (Figure 3(a)). The full range of visible properties we observe in order to interpret the behavior of social agents is more indicative of the role of OTC in the larger domain-specific network for social cognition than whether they have a ‘face’ or ‘body’ or their category membership (Pitcher & Ungerleider, 2021; Nastase et al., 2017).

5.2. How are behaviorally relevant visual properties represented?

If behaviorally relevant properties are visually diverse, and their relevance comes in degrees modulated by our behavioral goals, then how are they represented? Minimally, an appropriate neural code would require representations of sufficient dimensionality to represent the requisite diversity of visual signals, but in a way that can be reshaped to fit our differing behavioral goals. Here we adapt a model of visual function based on the analysis of the THINGS-fMRI data that originates in our lab (Hebart et al., 2023). Previous work has shown that cognitive representations of the similarity relations between the objects depicted in the THINGS images can be reduced to a small number of dimensions (Hebart et al., 2020). When mapped to the brain, these dimensions predict neural responses across large swaths of the cortical surface (Contier et al., 2024). However, the cortical extent of this coding is not uniform across the dimensions, since at any given locus only some of the dimensions are represented (Figure 3(b)). In other words, there is heterogeneity due to varying levels of sparse coding across the cortical surface; i.e., the degree to which selectivity in a region is driven by only a few dimensions (Contier et al., 2024).
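The ‘globally distributed but locally sparse’ idea can be made quantitative with a toy simulation (ours, not the Contier et al. (2024) analysis): regress each voxel’s response on a stimulus embedding and measure how concentrated its dimension weights are, for example with a standard sparseness index (Hoyer, 2004).

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated stand-in: a 66-d stimulus embedding (cf. the THINGS similarity
# dimensions) driving 500 voxels, each via only a handful of dimensions.
n_stim, n_dim, n_vox = 720, 66, 500
dims = rng.normal(size=(n_stim, n_dim))
true_w = rng.normal(size=(n_dim, n_vox)) * (rng.random((n_dim, n_vox)) < 0.05)
voxels = dims @ true_w + 0.5 * rng.normal(size=(n_stim, n_vox))

# Fit each voxel's response as a weighted sum of the embedding dimensions.
w_hat, *_ = np.linalg.lstsq(dims, voxels, rcond=None)  # (n_dim x n_vox)

def hoyer_sparseness(w):
    """0 = loading spread evenly over dimensions, 1 = a single dimension."""
    n, l1, l2 = len(w), np.abs(w).sum(), np.sqrt((w ** 2).sum())
    return (np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1)

sparseness = np.array([hoyer_sparseness(w_hat[:, v]) for v in range(n_vox)])
print(sparseness.mean())
# High mean sparseness, with every dimension represented somewhere in the
# voxel population: a code that is globally distributed yet locally sparse.
```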

These results for the THINGS-fMRI data are similar in spirit to those of other studies that also suggest widely distributed, sparse coding across visual cortex, but for different dimensions related to simple visual features or semantic properties associated with stimulus categories (Huth et al., 2012; Jozwik et al., 2023). However, we are proposing that the dimensions represented by this type of code are not derived from judged similarity, specific visual features, or semantic properties, but rather reflect degrees of freedom among many correlated visual properties in real-world scenes relevant to planning different forms of natural behavior (Figure 3(a)). For example, as mentioned earlier, evidence suggests ‘animacy’ is not represented in visual cortex as a binary attribute of a stimulus, but a dimension that may encode several properties such as the agency or humanness of an object in graded fashion across the cortical surface (Konkle & Caramazza, 2013; Ritchie et al., 2021; Thorat et al., 2019). Under our view, this dimension may be one of many that capture sources of variation relevant to different behavioral domains for which ‘animacy’ provides a convenient, if potentially misleading, label (Contier et al., 2024). We believe this type of sparse-distributed coding model has the potential to meet both of the challenges presented by coding for behaviorally relevant signals in OTC (Figure 3(b)).

First, this nascent model can explain how OTC codes for visually diverse properties. On the one hand, very different kinds of visual properties (low-, mid-, and high-level) might be jointly coded along different dimensions of behavioral relevance. On the other hand, visually distinct objects (e.g., faces and bodies, or hands and tools) show similar distributions of activity because they score highly on the coded dimensions important to domains like social interaction and object manipulation. Second, this model can account for how behavioral relevance is modulated by our goals. A widely distributed representation allows for a behaviorally relevant dimension to be represented in portions of OTC that are part of many behavior-specific networks. A change in goal context will manifest as a shift in the distribution of the coding for the dimensions, as suggested by effects of task-based attention (Çukur et al., 2013; Nastase et al., 2017). For example, when visual signals normally associated with social interaction become relevant to navigation (e.g., where an agent is looking), this may shift the coding so that agent-related dimensions are now more strongly represented in regions of OTC that sparsely code for configuration information about scenes (Figure 3(a)). Evidence supporting this prediction comes from studies that trained subjects to parse fonts made from face and scene images as opposed to more conventional orthography. Viewing the image fonts produced greater responses in regions of left ventral OTC already associated with reading words composed using conventional orthographies (L. Martin et al., 2019; Moore et al., 2014).
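One way to picture this goal-dependent reshaping, under deliberately simple assumptions, is as gain modulation: if a voxel’s response is approximated as a weighted sum of behaviorally relevant dimension values, a change in task context can be modeled as a gain vector that re-weights those dimensions, in the spirit of the attentional warping reported by Çukur et al. (2013). All variable names and values below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_voxels, n_dims = 500, 8
W = rng.normal(size=(n_voxels, n_dims))  # baseline dimension weights per voxel
scene = rng.normal(size=n_dims)          # dimension values of the current scene

# Hypothetical task gain: amplify an 'agent-related' dimension (index 0)
# when gaze direction becomes relevant to navigation; leave others at 1.
gain = np.ones(n_dims)
gain[0] = 2.5

resp_baseline = W @ scene
resp_navigation = (W * gain) @ scene

# The same underlying dimensions yield a shifted response distribution
# across the (simulated) cortical surface under the new goal context.
print(f"r = {np.corrcoef(resp_baseline, resp_navigation)[0, 1]:.2f}")
```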

How does the idea of distributed-sparse coding compare to the standard picture afforded by category-selectivity? On the one hand, according to the distributed-sparse coding model, purported category-selective regions can be described as clusters of relatively sparse coding for dimensions that are canonically relevant to particular classes of natural behavior (Contier et al., 2024), as opposed to stimulus categories and their properties (Op de Beeck et al., 2008). For example, results obtained with the THINGS-fMRI data already show that sparse coding of similarity-judgment dimensions recovers the locations of purported category-selective brain regions (Contier et al., 2024). On the other hand, within the model, any representation of these dimensions outside of sparse coding regions, like that suggested by the graded responses in Figure 2(b,c), is not dismissed as sampling noise, but treated as signal to be understood (Westlin et al., 2023).

Our proposed model recalls the earlier debate concerning whether selectivity for stimuli like faces and scenes is locally ‘modular’ or widely ‘distributed’ outside of category-selective regions (J. D. Cohen & Tong, 2001; Haxby et al., 2001). In our view, there may be no tension between these alternatives, since the proposed code is both widely distributed and locally sparse. Coding for different dimensions of behavioral relevance will not be uniform in the extent to which it is distributed or sparse, and this variation may be driven by different developmental trajectories (Arcaro & Livingstone, 2021; Behrmann & Plaut, 2013; Op de Beeck et al., 2019). At one extreme, some visual properties may be represented so widely as to contribute little to the sparsity exhibited in any particular cortical region. Representing ‘domain-general’ visual properties like shape or color could be characterized in this way. At the other extreme, some dimensions may only be represented locally because of their unique contributions to particular behavioral domains. For example, coding for word forms and other conventional symbols (e.g., numerals) may be especially local (and sparse) because they are complex visual stimuli created by human convention for a specialized, late-developing behavior (reading) that connects with downstream linguistic and conceptual systems (Dehaene-Lambertz et al., 2018; Hannagan et al., 2015).

Our model is also similar in spirit to other characterizations of the organization of visual function in OTC. First, it has been hypothesized that there is graded selectivity across hemispheres that helps to explain specialization for different behavioral domains, such as the right vs left lateralization for social interaction vs reading (Behrmann & Plaut, 2013). We conjecture that the same graded selectivity occurs within hemispheres as well, across multiple classes of natural behavior, in line with the premise that selectivity gradients are a ubiquitous organizing principle in the brain (Huntenburg et al., 2018; Kravitz et al., 2013). Second, there is evidence of extensive cross-talk between the dorsal and ventral visual streams depending on stimulus and task demands (Milner, 2017; Ritchie, Montesinos, et al., 2024). For example, behaviorally relevant visual properties for reaching (e.g., complex shapes with more than one principal axis) are represented by the ventral stream, and that information is passed on to the dorsal stream to guide action. We conjecture that the same sort of information sharing may occur between different clusters of sparse coding, which may help explain shifts in the distribution of responses for different dimensions depending on stimulus and task conditions.

Both graded selectivity and extensive connectivity and cross-talk are also relevant to the question of how our model can capture the balance between canonical vs ad hoc stimulus-behavior relationships (e.g., the distance to a tree vs the gaze direction of a person in front of you when going for a jog). First, the gradient of selectivity within particular clusters of sparse coding will reflect the consistency of the relationship between certain stimuli and behavior (e.g., the distance to possible obstacles is more frequently relevant to determining navigational affordances than where an agent is looking). When a signal has infrequent or ad hoc relevance to a behavioral goal, this may boost a response that is otherwise relatively weak (e.g., to cues to the spatial awareness of another agent, like gaze direction). Second, at the same time, not all dimensions of behavioral relevance are represented everywhere equally – hence the sparsity of coding in the model. In such cases, clusters of selectivity for particular behavioral domains may rely on information sharing, or ‘cross-talk,’ with other clusters associated with different behavioral domains (e.g., information about gaze direction coming from regions representing social cues). Which alternative better captures the response profile in a portion of OTC is an empirical question, ultimately dependent on the behavioral domain and visual property in question.

5.3. How are the representations constrained?

Three factors have been proposed to constrain the locations of purported category-selective brain regions: maps of visual feature selectivity, the information-processing hierarchy of the visual system, and patterns of large-scale connectivity related to different behavioral domains (Op de Beeck et al., 2019) – though for an alternative perspective see Arcaro and Livingstone (2021). For example, according to this multi-factor proposal, face-selectivity emerges during development in portions of feature maps that have foveal and convex-curvature biases, are at a holistic representational stage of the processing hierarchy, and connect with social cognition regions elsewhere in the brain. Within our proposed model, so-called category-selective regions instead reflect clustering of sparse coding of behaviorally relevant dimensions. Furthermore, we have already pointed out that the evidence for these three factors, which comes from studies on the development of category-selectivity, faces the same selective sampling problem that plagues the evidential base for category-selectivity (Section 4.1). Still, we conjecture that when characterized appropriately, each factor will likely play a role in explaining why broadly distributed but sparsely coded dimensions overlap to drive apparent clusters of selectivity.

First, patterns of (innately determined) connectivity between OTC and the rest of the brain provide the foundation for the emergence of whole-brain domain-specific networks, where ‘domain’ refers to different types of natural behavior (or tasks) and the sort of information-processing required to carry them out – not a type of stimulus (Amedi et al., 2017; Bi et al., 2016; Heimler et al., 2015; Mahon, 2022; Mahon & Caramazza, 2011). These networks have been identified for the different behavioral domains associated with forms of category-selectivity, including navigation, social interaction, object manipulation, and reading, and are thought to constrain the location of putative category-selective brain regions in OTC (Hannagan et al., 2015; Kamps et al., 2020; A. Martin, 2016; Op de Beeck et al., 2019; Powell et al., 2018; Stevens et al., 2017; Wurm & Caramazza, 2022). However, unlike the framework of category-selectivity, we do not think that clustering of selectivity in OTC reflects stimulus-centered regions that interface with whole-brain domain-specific networks. Rather, the clustering reflects regions that represent visual signals based on their relevance to a specific behavioral domain – as such, these regions are simply the visual-system component of the larger domain-specific networks for particular types of behavior. We would even go so far as to suggest that this picture is far more in line with the very idea of whole-brain domain-specific networks, which does not presuppose any form of stimulus-specific selectivity (Mahon, 2022). Furthermore, the goal dependence of behavioral relevance implies that these networks also share information. For example, navigating a crowded room of people may recruit neural pathways that connect sparsely coded regions specialized for spatial processing to networks for social interaction.

Second, characterizations of the information-processing factor tend to presume a hierarchical characterization of the ventral stream focused on object and scene recognition (McGugin et al., 2023; Op de Beeck et al., 2019), which is contrasted with the more action-oriented function of the dorsal stream (Goodale & Milner, 1992; Kravitz et al., 2011). Despite the ubiquity of this characterization, there are two ways it misconstrues the functional profile of OTC within the dual-stream model of the visual system, which was also initially motivated by ethological considerations (Ingle, 1967; Schneider, 1969). One is that a more encompassing characterization of the visual function of the ventral stream is that it represents a wide variety of stable visual properties of real-world environments in the service of cognitive processes like reward learning, memory, and decision-making, facilitating deliberation as well as action planning (Cisek, 2007; Kravitz et al., 2013; Milner & Goodale, 2008). The other is that the broad information-processing architecture of the ventral stream may be better described as heterarchical (Baker & Kravitz, 2024; Barlow, 1997; Ritchie, Montesinos, et al., 2024). For example, supposedly ‘low-level’ properties, like aspect ratio, are represented in many regions of visual cortex associated with ‘high-level’ properties like object category (Bracci & Op de Beeck, 2016; Hong et al., 2016). This alternative conception of the ventral stream may also be better suited to explaining sparse coding for a wide variety of potentially behaviorally relevant signals, which are represented in multiple ways depending on our goals (Figure 3(a)).

Third, we believe visuospatial coding, in the form of overlapping visual feature maps (e.g., for spatial frequency, curvature, or retinotopy), likely provides a crucial scaffold for the representation of behaviorally relevant visual properties in real-world environments (Groen et al., 2022). What we dispute is that the visuospatial coding is tailored specifically to region-defining stimuli (Arcaro & Livingstone, 2021; Bao et al., 2020; Op de Beeck et al., 2008). In fact, the evidence that visual features explain selectivity differences between category-selective regions is relatively weak (Bracci et al., 2017; Op de Beeck et al., 2019; Yargholi & Op de Beeck, 2017). Rather, it is plausible that visuospatial biases in sparsely coded regions are driven by statistical regularities in how stimuli appear in the visual field (e.g., graspable objects in the lower periphery) during natural behavior, which are captured by different dimensions of behavioral relevance. The idea of visuospatial coding for behavioral relevance is compatible with early-life experience with certain highly behaviorally relevant stimuli (e.g., the faces of caregivers) recruiting the protomaps of selectivity on which subsequent coding of behaviorally relevant dimensions is scaffolded (Arcaro & Livingstone, 2021; Konkle & Alvarez, 2022). Such scaffolding likely begins while infants are still incapable of carrying out much overt behavior, but are nonetheless learning a self-supervised model of real-world environments (Cusack et al., 2024). For example, the strongest evidence for the developmental role of visuospatial coding is that monkeys do not develop patches of face-selectivity if they are deprived of exposure to faces early in life (Arcaro & Livingstone, 2021; Arcaro et al., 2017). However, this lack of stimulus exposure also deprives them of typical face-directed social interaction, and so is consistent with any feature-selectivity arising from learned associations between stimulus types and different behavioral domains (Kravitz et al., 2013; Op de Beeck et al., 2019).

5.3.1. Summary

Behavioral relevance encourages us to search for alternative models of visual function in OTC. We have proposed that OTC represents a diversity of visual properties based on their (goal-modulated) behavioral relevance, and that this may be implemented in distributed representations of behaviorally relevant dimensions with varying levels of sparsity in their topography, rather than in circumscribed category-selective brain regions. Furthermore, functional organization in OTC will likely be constrained by the same factors (visuospatial coding, the ventral stream information-processing heterarchy, and whole-brain domain-specific connectivity) that have been posited to explain the locations of purported category-selective brain regions. We now turn to considering how this model can be explored empirically.

6. Behavioral relevance in practice

Given the above sketch of a coding model for behavioral relevance, how should we study visual function in OTC? Ultimately there must be a trade-off between ecological validity, practicality, and generalizability. Many of the methods best suited to studying how OTC codes for behavioral relevance require skills and resources available to only a few labs, and so will be implementable at scale only through continued fostering of collaboration and Open Science (Poldrack et al., 2017). However, taking a more ethological approach does not preclude the need for controlled experiments (Nastase et al., 2020). Nor does it require entirely new tools. The sort of naturalistic and data-driven methods that are increasingly being developed are an important first step toward determining how behavioral relevance shapes visual function in OTC. But considerations of behavioral relevance still need to inform how we deploy these and other tools. In this final section we discuss several aspects of experimental design and how they should be shaped by the two main features of behavioral relevance: its visual diversity and goal dependence. These aspects include: (i) stimulus design; (ii) tasks; (iii) participant selection; and (iv) data modeling and analysis. In each case, we highlight approaches that range in their feasibility.

6.1. Naturalistic stimuli

Studying how OTC codes for behavioral relevance requires focusing on stimuli that are naturalistic, broadly sampled, and related to specific behavioral domains (Figure 4(a)).

Figure 4.

Suggested approaches for investigating the neural representation of behaviorally relevant signals in visual cortex. Behavioral relevance can be targeted through (a) stimulus design, (b) the nature of in-scanner tasks, and (c) participant selection, in each case by enhancing the naturalness of experimental conditions. However, different approaches vary in the logistical challenges they present: while some practices are ideal, others are more practically feasible. (d) Similarly, analysis and modeling methods vary in how ideal vs feasible they are for capturing the sources of variation resulting from the design choices illustrated in (a)–(c).

6.2. Behavior during acquisition

Since the behavioral relevance of visual properties is modified by our goals, it is critical that task manipulations during acquisition become the norm, rather than the exception they currently are in visual neuroscience (Kay et al., 2023; Nau et al., 2024). To date, when task effects have been explored, the tasks utilized are often behaviorally arbitrary (rather than naturalistic), limited in number, and focused on categorization or similarly constrained forms of judgment (Bracci, Daniels, et al., 2017; Bugatus et al., 2017; Duan et al., 2024; Harel et al., 2014). Generally, we should aspire to ‘task rich’ designs that mirror the stimulus-rich designs exemplified by recent large-scale datasets like NSD and THINGS, which will allow for systematic study of the effects of task focus and switching on neural coding in OTC (Egner, 2023).

  • An ideal task manipulation would involve real-time recording while subjects engage in a wide range of natural behaviors that may crosscut some intuitive divisions between behavioral domains (e.g., navigating through a room of people) (Matthis et al., 2018).

  • Although the forms of behavior possible during human neuroimaging are typically limited, it is possible to develop apparatuses that allow for overt behaviors like reaching and grasping (Fabbri et al., 2016; Snow & Culham, 2021), or to simulate environment exploration and navigation with in-scanner virtual reality headsets (Nau et al., 2020).

  • Live camera feedback during acquisition and ‘hyper-scanning’ also make it possible for volunteers to socially interact with a confederate in different scenarios (Caruana et al., 2017; Hamilton, 2021).

  • Eye movements have largely been treated as a source of noise when studying category-selectivity. However, eye movements are modulated by stimulus and task demands, and can be used to obtain information about how we sequentially sample the environment during natural viewing and to predict neural responses (Çukur et al., 2013; Liu et al., 2017, 2020; Nau et al., 2018; Xiao et al., 2024).

  • Even more common methods (e.g., button presses) can be used to target different behaviorally relevant aspects of complex natural stimuli, and they provide the simplest approach for implementing task-rich designs in which subjects alternate not between one or two tasks, but among potentially dozens of tasks tailored to the same or different behavioral domains (Bracci, Daniels, et al., 2017; Bugatus et al., 2017; Harel et al., 2014; Vaziri-Pashkam & Xu, 2017); a design sketch follows this list.
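As an indication of how little overhead the last of these options requires, the sketch below specifies a hypothetical task-rich event list that crosses one naturalistic image set with many button-press tasks spanning several behavioral domains; the task labels and file names are placeholders, not a validated battery.

```python
import itertools
import random

images = [f"scene_{i:03d}.png" for i in range(50)]
# Placeholder task battery spanning several behavioral domains.
tasks = ["navigable?", "graspable?", "agent present?", "edible?",
         "readable text?", "familiar place?", "social interaction?"]

# Every image is probed under every task, then trials are shuffled so
# subjects switch among tasks rather than holding one fixed per run.
trials = list(itertools.product(images, tasks))
random.seed(0)
random.shuffle(trials)

run_length = 70
runs = [trials[i:i + run_length] for i in range(0, len(trials), run_length)]
print(f"{len(runs)} runs of {run_length} trials; first trial: {runs[0][0]}")
```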

6.3. Selecting participants

The perceived behavioral relevance of stimuli varies across individuals, so it is critical to also consider individual and group differences in experimental designs. Variation in experience and knowledge will shape the neural response to behaviorally relevant signals, and even when overt behavior seems similar, individuals may recruit different representations reflecting the many possible solutions to the same task (Figure 4(c)).

  • Ideally, longitudinal studies during development would allow for extensive comparisons within and between individuals as to how the visual system changes to accommodate increasing flexibility and variation in natural behavior (Casey et al., 2018).

  • More achievable is to study behavioral relevance at different stages of development from infancy to adulthood, but without restricting stimuli to those associated with category-selectivity (Cabral et al., 2022; Deen et al., 2017; Kosakowski et al., 2022). One option is even to re-annotate data already acquired during continuous viewing in children for behavioral relevance (Kamps et al., 2022).

  • Specific groups of adults who vary in their personal experience can be targeted, as has been the case in work on visual expertise (e.g., chess players or bird watchers) (Bilalic et al., 2011; Bukach et al., 2006; Martens et al., 2018). The same group-level approach could be taken with respect to forms of experience relevant to a natural behavior of interest, and to how this experience is shaped by culture-specific knowledge (Henrich et al., 2010) or personal familiarity (Taylor et al., 2009; Visconti di Oleggio Castello et al., 2021).

  • The topography of neural responses in OTC is assumed to be relatively stable within individuals. However, the phenomenon of ‘representational drift’ (Roth & Merriam, 2023; Rule et al., 2019) suggests that even seemingly stable behavioral responses may be underwritten by within-subject changes in neural response over time, which may further be influenced by stimulus and task (Westlin et al., 2023).

  • In addition to selecting participants based on known differences in behaviorally relevant experience, it is crucial to study individual-level variation in the topography of neural responses within a single population. This has sometimes been investigated within the context of category-selectivity, but ideally requires behaviorally relevant stimuli and task selection (Charest et al., 2014; Haxby et al., 2020; McGugin et al., 2023) or direct links to natural behaviors like eye movements (Borovska & de Haas, 2024).

6.4. Modeling and analysis

Explaining how coding in OTC addresses the twin challenges of behavioral relevance (visual diversity and goal dependence) will also require us to change our perspective with respect to what sources of variation in neural data we are trying to explain. Many common and innovative models and analysis methods are already well-suited to this purpose (Figure 4(d)).

  • Since planning action is a dynamic, continuous process, the ultimate goal should be to model the ongoing coupling between the brain and environment during action without distinct stimulus or task events (Murty et al., 2023; Stangl et al., 2023; Wise et al., 2024).

  • We should strive for more ambitious predictive models derived from multiple datasets, across tasks, that allow for generalization to new contexts (Varoquaux & Poldrack, 2019). This would in principle entail developing models at a scale that captures large variation in stimulus conditions and tasks far beyond our current practices.

  • Deep neural networks (DNNs) have provided an important tool for exploring hypotheses about visual processing and cortical specialization (Blauch et al., 2022; Dobs et al., 2022; Kanwisher et al., 2023; Konkle & Alvarez, 2022). However, DNNs have often been benchmarked against datasets from category-selective brain regions (Cichy et al., 2019; Schrimpf et al., 2020), and so do not capture how OTC is specialized for behavioral relevance (Bracci & Op de Beeck, 2023). Examples include benchmarks such as Brain-Score (https://www.brain-score.org/) and the Algonauts Project (http://algonauts.csail.mit.edu/index.html). Going forward, models should be empirically checked against neural responses for datasets using richer stimulus and task designs (Figure 4(a)), such as free viewing by infants (Orhan & Lake, 2024). Indeed, we believe that DNN models trained only on object or scene classification will not fare well on such comprehensive benchmark comparisons, and that new artificial neural network models will need to be developed that engage with the richer variety of behaviorally relevant signals we have emphasized (Bowers et al., 2023).

  • We should not simply use behavioral models as separate predictors, but rather harness formal models to predict behavioral measures (e.g., reaction times, eye movements, and reach trajectories) from neural responses, as is common in so-called ‘model-based’ approaches to neuroimaging (Ritchie & Carlson, 2016; Turner et al., 2017).

  • We can use existing approaches to capture the multidimensional nature of behavioral relevance. For example, encoding models have been applied to data collected during continuous viewing and to large-scale image sets (Contier et al., 2024; Çukur et al., 2013; Popham et al., 2021; Tarhan & Konkle, 2020), and representational similarity analysis allows for comparing second-order similarities in patterns of neural response across a wide variety of models, tasks, and participants (Charest et al., 2014; Harel et al., 2014). However, predictors in these models should be based on behavioral relevance, not just familiar visual or categorical features; a minimal example follows this list.
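To illustrate the final point, here is a minimal encoding-model sketch (referenced in the last item above): cross-validated ridge regression predicting a voxel’s response from hypothetical behavioral-relevance predictors rather than categorical labels. The predictors and responses are simulated assumptions, not any published dataset.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
n_stimuli, n_dims = 400, 6

# Hypothetical behavioral-relevance predictors per stimulus, e.g. rated
# navigational affordance, graspability, agent salience, and so on.
X = rng.normal(size=(n_stimuli, n_dims))

# Simulated voxel response driven by two of the dimensions plus noise.
y = 1.5 * X[:, 0] - 0.8 * X[:, 3] + rng.normal(scale=1.0, size=n_stimuli)

model = RidgeCV(alphas=np.logspace(-2, 3, 20))
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print(f"cross-validated R^2: {scores.mean():.2f} +/- {scores.std():.2f}")
```

The same design extends voxel-wise across OTC, and the fitted weights per voxel are exactly the kind of input the sparsity index sketched in Section 5.2 would summarize.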

6.4.1. Summary

As these suggestions make clear, developing a model of how OTC represents behaviorally relevant properties, and what neural processes delineate those properties, will require changes and innovations on multiple methodological fronts. This includes everything from stimulus design to the choice of tasks during data acquisition, through the selection of participants and the choice of models and analyses. Several recent advances are already moving the field in the right direction, and we are optimistic that the adoption of a more ethological framework for understanding visual function in OTC is well within our reach.

7. Conclusion

We have argued that recognizing the importance of behavioral relevance requires a shift in how we approach the study of visual function in OTC. In making our case, we have reevaluated the (often implicit) theoretical and empirical foundations of the category-selectivity framework and taken lessons from the wealth of research it has produced. Though we have offered strong critiques and voiced misgivings (many of which also apply to our own work), we want to emphasize that the focus on category-selectivity has been incredibly successful, even if it has also been limiting – hence, we have characterized our discussion as a rethinking of the framework, and of where it is taking us. After all, the correct inference to draw from our discussion is not that we ‘know nothing’ about visual function in OTC. Far from it. Rather, we believe the appropriate conclusion emphasizes not ignorance, but optimism: we are close to an improved understanding of how the brain makes sense of what we see, and focusing on how vision facilitates natural behavior offers a route to get there.

Funding

This research was supported by the Intramural Research Program of the National Institute of Mental Health (NIMH) (ZIAMH002909 to C.I.B). The contributions of the NIH author(s) were made as part of their official duties as NIH federal employees, are in compliance with agency policy requirements, and are considered Works of the United States Government. However, the findings and conclusions presented in this paper are those of the author(s) and do not necessarily reflect the views of the NIH or the U.S. Department of Health and Human Services.

Footnotes

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  1. Allen EJ, St-Yves G, Wu Y, Breedlove JL, Prince JS, Dowdle LT, Nau M, Caron B, Pestilli F, Charest I, Hutchinson JB, Naselaris T, & Kay K (2022). A massive 7T fMRI dataset to bridge cognitive neuroscience and artificial intelligence. Nature Neuroscience, 25(1), 116–126. 10.1038/s41593-021-00962-x
  2. Allison T (1999). Electrophysiological studies of human face perception. I: Potentials generated in occipitotemporal cortex by face and non-face stimuli. Cerebral Cortex, 9(5), 415–430. 10.1093/cercor/9.5.415
  3. Allison T, Ginter H, McCarthy G, Nobre AC, Puce A, Luby M, & Spencer DD (1994). Face recognition in human extrastriate cortex. Journal of Neurophysiology, 71(2), 821–825. 10.1152/jn.1994.71.2.821
  4. Alreja A, Ward MJ, Ma Q, Russ BE, Bickel S, Van Wouwe NC, González-Martínez JA, Neimat JS, Abel TJ, Bagić A, Parker LS, Richardson RM, Schroeder CE, Morency L, & Ghuman AS (2022). A new paradigm for investigating real-world social behavior and its neural underpinnings. Behavior Research Methods, 55(5), 2333–2352. 10.3758/s13428-022-01882-9
  5. Amedi A, Hofstetter S, Maidenbaum S, & Heimler B (2017). Task selectivity as a comprehensive principle for brain organization. Trends in Cognitive Sciences, 21(5), 307–310. 10.1016/j.tics.2017.03.007
  6. Arcaro MJ, & Livingstone MS (2021). On the relationship between maps and domains in inferotemporal cortex. Nature Reviews Neuroscience, 22(9), 573–583. 10.1038/s41583-021-00490-4
  7. Arcaro MJ, Schade PF, Vincent JL, Ponce CR, & Livingstone MS (2017). Seeing faces is necessary for face-domain formation. Nature Neuroscience, 20(10), 1404–1412. 10.1038/nn.4635
  8. Ashby FG, & Maddox WT (1993). Relations between prototype, exemplar, and decision bound models of categorization. Journal of Mathematical Psychology, 37(3), 372–400. 10.1006/jmps.1993.1023
  9. Baker C, & Kravitz D (2024). Insights from the evolving model of two cortical visual pathways. Journal of Cognitive Neuroscience, 36(12), 2568–2579. 10.1162/jocn_a_02192
  10. Bannert MM, & Bartels A (2022). Visual cortex: Big data analysis uncovers food specificity. Current Biology, 32(19), R1012–R1015. 10.1016/j.cub.2022.08.068
  11. Bao P, She L, McGill M, & Tsao DY (2020). A map of object space in primate inferotemporal cortex. Nature, 583(7814), 103–108. 10.1038/s41586-020-2350-5
  12. Barlow HB (1997). The knowledge used in vision and where it comes from. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 352(1358), 1141–1147. 10.1098/rstb.1997.0097
  13. Behrmann M, & Plaut DC (2013). Distributed circuits, not circumscribed centers, mediate visual recognition. Trends in Cognitive Sciences, 17(5), 210–219. 10.1016/j.tics.2013.03.007
  14. Bi Y (2020). Concepts and object domains. In Poeppel D, Mangun GR, & Gazzaniga MS (Eds.), The cognitive neurosciences (pp. 785–792). The MIT Press.
  15. Bi Y, Wang X, & Caramazza A (2016). Object domain and modality in the ventral visual pathway. Trends in Cognitive Sciences, 20(4), 282–290. 10.1016/j.tics.2016.02.002
  16. Bilalic M, Langner R, Ulrich R, & Grodd W (2011). Many faces of expertise: Fusiform face area in chess experts and novices. Journal of Neuroscience, 31(28), 10206–10214. 10.1523/JNEUROSCI.5727-10.2011
  17. Blauch NM, Behrmann M, & Plaut DC (2022). A connectivity-constrained computational account of topographic organization in primate high-level visual cortex. Proceedings of the National Academy of Sciences, 119(3), e2112566119. 10.1073/pnas.2112566119
  18. Bodamer J (1947). Die Prosop-Agnosie. Archiv für Psychiatrie und Nervenkrankheiten Vereinigt mit Zeitschrift für die Gesamte Neurologie und Psychiatrie, 179(1–2), 6–53. 10.1007/BF00352849
  19. Bonner MF, & Epstein RA (2017). Coding of navigational affordances in the human visual system. Proceedings of the National Academy of Sciences, 114(18), 4793–4798. 10.1073/pnas.1618228114
  20. Borovska P, & de Haas B (2024). Individual gaze shapes diverging neural representations. Proceedings of the National Academy of Sciences, 121, e2405602121.
  21. Bowers JS, Malhotra G, Adolfi F, Dujmović M, Montero ML, Biscione V, Puebla G, Hummel JH, & Heaton RF (2023). On the importance of severely testing deep learning models of cognition. Cognitive Systems Research, 82, 101158. 10.1016/j.cogsys.2023.101158
  22. Bracci S, Cavina-Pratesi C, Ietswaart M, Caramazza A, & Peelen MV (2012). Closely overlapping responses to tools and hands in left lateral occipitotemporal cortex. Journal of Neurophysiology, 107(5), 1443–1456. 10.1152/jn.00619.2011
  23. Bracci S, Daniels N, & Op de Beeck H (2017). Task context overrules object- and category-related representational content in the human parietal cortex. Cerebral Cortex, 27, 310–321. 10.1093/cercor/bhw419
  24. Bracci S, & Op de Beeck H (2016). Dissociations and associations between shape and category representations in the two visual pathways. Journal of Neuroscience, 36(2), 432–444. 10.1523/JNEUROSCI.2314-15.2016
  25. Bracci S, & Op de Beeck HP (2023). Understanding human object vision: A picture is worth a thousand representations. The Annual Review of Psychology, 74(1), 113–135. 10.1146/annurev-psych-032720-041031
  26. Bracci S, Ritchie JB, & de Beeck HO (2017). On the partnership between neural representations of object categories and visual features in the ventral visual pathway. Neuropsychologia, 105, 153–164. 10.1016/j.neuropsychologia.2017.06.010
  27. Bugatus L, Weiner KS, & Grill-Spector K (2017). Task alters category representations in prefrontal but not high-level visual cortex. Neuroimage, 155, 437–449. 10.1016/j.neuroimage.2017.03.062
  28. Bukach CM, Bub DN, Gauthier I, & Tarr MJ (2006). Perceptual expertise effects are not all or none: Spatially limited perceptual expertise for faces in a case of prosopagnosia. Journal of Cognitive Neuroscience, 18(1), 48–63. 10.1162/089892906775250094
  29. Cabral L, Zubiaurre-Elorza L, Wild CJ, Linke A, & Cusack R (2022). Anatomical correlates of category-selective visual regions have distinctive signatures of connectivity in neonates. Developmental Cognitive Neuroscience, 58, 101179. 10.1016/j.dcn.2022.101179
  30. Capitani E, Laiacona M, Mahon B, & Caramazza A (2003). What are the facts of semantic category-specific deficits? A critical review of the clinical evidence. Cognitive Neuropsychology, 20(3–6), 213–261. 10.1080/02643290244000266
  31. Caruana N, McArthur G, Woolgar A, & Brock J (2017). Simulating social interactions for the experimental investigation of joint attention. Neuroscience and Biobehavioral Reviews, 74, 115–125. 10.1016/j.neubiorev.2016.12.022
  32. Casey BJ, Cannonier T, Conley MI, Cohen AO, Barch DM, Heitzeg MM, Soules ME, Teslovich T, Dellarco DV, Garavan H, Orr CA, Wager TD, Banich MT, Speer NK, Sutherland MT, Riedel MC, Dick AS, Bjork JM, Thomas KM, … Fair DA (2018). The adolescent brain cognitive development (ABCD) study: Imaging acquisition across 21 sites. Developmental Cognitive Neuroscience, 32, 43–54. 10.1016/j.dcn.2018.03.001
  33. Chao LL, Haxby JV, & Martin A (1999). Attribute-based neural substrates in temporal cortex for perceiving and knowing about objects. Nature Neuroscience, 2(10), 913–919. 10.1038/13217
  34. Charest I, Kievit RA, Schmitz TW, Deca D, & Kriegeskorte N (2014). Unique semantic space in the brain of each beholder predicts perceived similarity. Proceedings of the National Academy of Sciences, 111(40), 14565–14570. 10.1073/pnas.1402594111
  35. Churchland PS, Ramachandran VS, & Sejnowski TJ (1993). A critique of pure vision. In Koch C & Davis JL (Eds.), Large-scale neuronal theories of the brain (pp. 23–60). MIT Press.
  36. Cichy RM, Roig G, & Oliva A (2019). The Algonauts Project. Nature Machine Intelligence, 1(12), 613. 10.1038/s42256-019-0127-z
  37. Cisek P (2007). Cortical mechanisms of action selection: The affordance competition hypothesis. Philosophical Transactions of the Royal Society B: Biological Sciences, 362(1485), 1585–1599. 10.1098/rstb.2007.2054
  38. Cisek P, & Kalaska JF (2010). Neural mechanisms for interacting with a world full of action choices. Annual Review of Neuroscience, 33(1), 269–298. 10.1146/annurev.neuro.051508.135409
  39. Cohen JD, & Tong F (2001). The face of controversy. Science, 293(5539), 2405–2407. 10.1126/science.1066018
  40. Cohen L, Dehaene S, Naccache L, Lehéricy S, Dehaene-Lambertz G, Hénaff M-A, & Michel F (2000). The visual word form area: Spatial and temporal characterization of an initial stage of reading in normal subjects and posterior split-brain patients. Brain, 123(2), 291–307. 10.1093/brain/123.2.291
  41. Contier O, Baker CI, & Hebart MN (2024). Distributed representations of behaviour-derived object dimensions in the human visual system. Nature Human Behaviour, 8(11), 2179–2193. 10.1038/s41562-024-01980-y
  43. Conway BR (2018). The organization and operation of inferior temporal cortex. Annual Review of Vision Science, 4(1), 381–402. 10.1146/annurev-vision-091517-034202
  44. Cox DD (2014). Do we understand high-level vision? Current Opinion in Neurobiology, 25, 187–193. 10.1016/j.conb.2014.01.016
  45. Cusack R, Ranzato M, & Charvet CJ (2024). Helpless infants are learning a foundation model. Trends in Cognitive Sciences, 28(8), 726–738. 10.1016/j.tics.2024.05.001
  46. Çukur T, Nishimoto S, Huth AG, & Gallant JL (2013). Attention during natural vision warps semantic representation across the human brain. Nature Neuroscience, 16(6), 763–770. 10.1038/nn.3381
  47. Deen B, Richardson H, Dilks DD, Takahashi A, Keil B, Wald LL, Kanwisher N, & Saxe R (2017). Organization of high-level visual cortex in human infants. Nature Communications, 8(1), 13995. 10.1038/ncomms13995
  48. Dehaene S, Le Clec’H G, Poline J-B, Le Bihan D, & Cohen L (2002). The visual word form area: A prelexical representation of visual words in the fusiform gyrus. Neuroreport, 13(3), 321. 10.1097/00001756-200203040-00015
  49. Dehaene-Lambertz G, Monzalvo K, Dehaene S, & Grill-Spector K (2018). The emergence of the visual word form: Longitudinal evolution of category-specific ventral visual areas during reading acquisition. PLoS Biology, 16(3), e2004103. 10.1371/journal.pbio.2004103
  50. Desimone R, Albright TD, Gross CG, & Bruce C (1984). Stimulus-selective properties of inferior temporal neurons in the macaque. Journal of Neuroscience, 4(8), 2051–2062. 10.1523/JNEUROSCI.04-08-02051.1984
  51. DiCarlo JJ, Zoccolan D, & Rust NC (2012). How does the brain solve visual object recognition? Neuron, 73(3), 415–434. 10.1016/j.neuron.2012.01.010
  52. Dilks DD, Kamps FS, & Persichetti AS (2022). Three cortical scene systems and their development. Trends in Cognitive Sciences, 26(2), 117–127. 10.1016/j.tics.2021.11.002
  53. Dobs K, Martinez J, Kell AJE, & Kanwisher N (2022). Brain-like functional specialization emerges spontaneously in deep neural networks. Science Advances, 8(11), eabl8913. 10.1126/sciadv.abl8913
  54. Downing PE, Chan AW-Y, Peelen MV, Dodds CM, & Kanwisher N (2006). Domain specificity in visual cortex. Cerebral Cortex, 16(10), 1453–1461. 10.1093/cercor/bhj086
  55. Downing PE, Jiang Y, Shuman M, & Kanwisher N (2001). A cortical area selective for visual processing of the human body. Science, 293(5539), 2470–2473. 10.1126/science.1063414
  56. Duan Y, Zhan J, Gross J, Ince RAA, & Schyns PG (2024). Pre-frontal cortex guides dimension-reducing transformations in the occipito-ventral pathway for categorization behaviors. Current Biology, 34(15), 3392–3404.e5. 10.1016/j.cub.2024.06.050
  57. Duchaine B, & Yovel G (2015). A revised neural framework for face processing. Annual Review of Vision Science, 1(1), 393–416. 10.1146/annurev-vision-082114-035518
  58. Egner T (2023). Principles of cognitive control over task focus and task switching. Nature Reviews Psychology, 2(11), 702–714. 10.1038/s44159-023-00234-4
  59. Engell AD, & McCarthy G (2014). Face, eye, and body selective responses in fusiform gyrus and adjacent cortex: An intracranial EEG study. Frontiers in Human Neuroscience, 8, 642. 10.3389/fnhum.2014.00642
  60. Epstein RA, & Baker CI (2019). Scene perception in the human brain. Annual Review of Vision Science, 5(1), 373–397. 10.1146/annurev-vision-091718-014809
  61. Epstein R, & Kanwisher N (1998). A cortical representation of the local visual environment. Nature, 392(6676), 598–601. 10.1038/33402
  62. Fabbri S, Stubbs KM, Cusack R, & Culham JC (2016). Disentangling representations of object and grasp properties in the human brain. Journal of Neuroscience, 36(29), 7648–7662. 10.1523/JNEUROSCI.0313-16.2016
  63. Fleming RW (2017). Material perception. Annual Review of Vision Science, 3(1), 365–388. 10.1146/annurev-vision-102016-061429
  64. Friston KJ, Rotshtein P, Geng JJ, Sterzer P, & Henson RN (2006). A critique of functional localisers. Neuroimage, 30(4), 1077–1087. 10.1016/j.neuroimage.2005.08.012
  65. Gauthier I, & Nelson CA (2001). The development of face expertise. Current Opinion in Neurobiology, 11(2), 219–224. 10.1016/S0959-4388(00)00200-2
  66. Gauthier I, & Tarr MJ (2016). Visual object recognition: Do we (finally) know more now than we did? Annual Review of Vision Science, 2(1), 377–396. 10.1146/annurev-vision-111815-114621
  67. Gelder BD, & Solanas MP (2021). A computational neuroethology perspective on body and expression perception. Trends in Cognitive Sciences, 25(9), 744–756. 10.1016/j.tics.2021.05.010
  68. Geskin J, & Behrmann M (2018). Congenital prosopagnosia without object agnosia? A literature review. Cognitive Neuropsychology, 35(1–2), 4–54. 10.1080/02643294.2017.1392295
  69. Goldstone RL (1998). Perceptual learning. The Annual Review of Psychology, 49(1), 585–612. 10.1146/annurev.psych.49.1.585
  70. Goodale MA, & Milner AD (1992). Separate visual pathways for perception and action. Trends in Neurosciences, 15(1), 20–25. 10.1016/0166-2236(92)90344-8
  71. Grill-Spector K, & Weiner KS (2014). The functional architecture of the ventral temporal cortex and its role in categorization. Nature Reviews Neuroscience, 15(8), 536–548. 10.1038/nrn3747
  72. Groen IIA, Dekker TM, Knapen T, & Silson EH (2022). Visuospatial coding as ubiquitous scaffolding for human cognition. Trends in Cognitive Sciences, 26(1), 81–96. 10.1016/j.tics.2021.10.011
  73. Groen IIA, Silson EH, & Baker CI (2017). Contributions of low- and high-level properties to neural processing of visual scenes in the human brain. Philosophical Transactions of the Royal Society B: Biological Sciences, 372(1714), 20160102. 10.1098/rstb.2016.0102
  74. Gross CG, Rocha-Miranda CE, & Bender DB (1972). Visual properties of neurons in inferotemporal cortex of the macaque. Journal of Neurophysiology, 35(1), 96–111. 10.1152/jn.1972.35.1.96
  75. Grotheer M, Herrmann K-H, & Kovács G (2016). Neuroimaging evidence of a bilateral representation for visually presented numbers. Journal of Neuroscience, 36(1), 88–97. 10.1523/JNEUROSCI.2129-15.2016
  76. Hamilton AFDC (2021). Hyperscanning: Beyond the hype. Neuron, 109(3), 404–407. 10.1016/j.neuron.2020.11.008
  77. Hanke M, Baumgartner FJ, Ibe P, Kaule FR, Pollmann S, Speck O, Zinke W, & Stadler J (2014). A high-resolution 7-tesla fMRI dataset from complex natural stimulation with an audio movie. Scientific Data, 1(1), 140003. 10.1038/sdata.2014.3
  78. Hannagan T, Amedi A, Cohen L, Dehaene-Lambertz G, & Dehaene S (2015). Origins of the specialization for letters and numbers in ventral occipitotemporal cortex. Trends in Cognitive Sciences, 19(7), 374–382. 10.1016/j.tics.2015.05.006
  79. Harel A, Kravitz DJ, & Baker CI (2014). Task context impacts visual object processing differentially across the cortex. Proceedings of the National Academy of Sciences, 111, E962–E971.
  80. Haxby JV, Gobbini MI, Furey ML, Ishai A, Schouten JL, & Pietrini P (2001). Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science, 293(5539), 2425–2430. 10.1126/science.1063736
  81. Haxby JV, Guntupalli JS, Nastase SA, & Feilong M (2020). Hyperalignment: Modeling shared information encoded in idiosyncratic cortical topographies. eLife, 9, e56601. 10.7554/eLife.56601
  82. Hayhoe MM (2017). Vision and action. Annual Review of Vision Science, 3(1), 389–413. 10.1146/annurev-vision-102016-061437
  83. Hebart MN, Contier O, Teichmann L, Rockter AH, Zheng CY, Kidder A, Corriveau A, Vaziri-Pashkam M, & Baker CI (2023). THINGS-data, a multimodal collection of large-scale datasets for investigating object representations in human brain and behavior. eLife, 12, e82580. 10.7554/eLife.82580
  84. Hebart MN, Zheng CY, Pereira F, & Baker CI (2020). Revealing the multidimensional mental representations of natural objects underlying human similarity judgements. Nature Human Behaviour, 4(11), 1173–1185. 10.1038/s41562-020-00951-3
  85. Heimler B, Striem-Amit E, & Amedi A (2015). Origins of task-specific sensory-independent organization in the visual and auditory brain: Neuroscience evidence, open questions and clinical implications. Current Opinion in Neurobiology, 35, 169–177. 10.1016/j.conb.2015.09.001
  86. Henrich J, Heine SJ, & Norenzayan A (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33(2–3), 61–83. 10.1017/S0140525X0999152X
  87. Hiramatsu C, Goda N, & Komatsu H (2011). Transformation from image-based to perceptual representation of materials along the human ventral visual pathway. Neuroimage, 57(2), 482–494. 10.1016/j.neuroimage.2011.04.056
  88. Hoffman DD, & Richards WA (1987). Parts of recognition. In Fischler MA & Firschein O (Eds.), Readings in computer vision (pp. 227–242). Morgan Kaufmann. 10.1016/B978-0-08-051581-6.50028-3
  89. Hong H, Yamins DLK, Majaj NJ, & DiCarlo JJ (2016). Explicit information for category-orthogonal object properties increases along the ventral stream. Nature Neuroscience, 19(4), 613–622. 10.1038/nn.4247
  90. Hu Y, Baragchizadeh A, & O’Toole AJ (2020). Integrating faces and bodies: Psychological and neural perspectives on whole person perception. Neuroscience and Biobehavioral Reviews, 112, 472–486. 10.1016/j.neubiorev.2020.02.021
  91. Huntenburg JM, Bazin P-L, & Margulies DS (2018). Large-scale gradients in human cortical organization. Trends in Cognitive Sciences, 22(1), 21–31. 10.1016/j.tics.2017.11.002
  92. Huth AG, Nishimoto S, Vu AT, & Gallant JL (2012). A continuous semantic space describes the representation of thousands of object and action categories across the human brain. Neuron, 76(6), 1210–1224. 10.1016/j.neuron.2012.10.014
  93. Ingle D (1967). Two visual mechanisms underlying the behavior of fish. Psychologische Forschung, 31(1), 44–51. 10.1007/BF00422385
  94. Jain N, Wang A, Henderson MM, Lin R, Prince JS, Tarr MJ, & Wehbe L (2023). Selectivity for food in human ventral visual cortex. Communications Biology, 6(1), 175. 10.1038/s42003-023-04546-2
  95. Jonas J, Rossion B, Brissart H, Frismand S, Jacques C, Hossu G, Colnat-Coulbois S, Vespignani H, Vignal J-P, & Maillard L (2015). Beyond the core face-processing network: Intracerebral stimulation of a face-selective area in the right anterior fusiform gyrus elicits transient prosopagnosia. Cortex, 72, 140–155. 10.1016/j.cortex.2015.05.026
  96. Jozwik KM, Kietzmann TC, Cichy RM, Kriegeskorte N, & Mur M (2023). Deep neural networks and visuo-semantic models explain complementary components of human ventral-stream representational dynamics. Journal of Neuroscience, 43(10), 1731–1741. 10.1523/JNEUROSCI.1424-22.2022
  97. Kaiser D, Quek GL, Cichy RM, & Peelen MV (2019). Object vision in a structured world. Trends in Cognitive Sciences, 23(8), 672–685. 10.1016/j.tics.2019.04.013
  98. Kamps FS, Hendrix CL, Brennan PA, & Dilks DD (2020). Connectivity at the origins of domain specificity in the cortical face and place networks. Proceedings of the National Academy of Sciences, 117, 6163–6169.
  99. Kamps FS, Richardson H, Murty NAR, Kanwisher N, & Saxe R (2022). Using child-friendly movie stimuli to study the development of face, place, and object regions from age 3 to 12 years. Human Brain Mapping, 43(9), 2782–2800. 10.1002/hbm.25815
  100. Kanwisher N (2000). Domain specificity in face perception. Nature Neuroscience, 3(8), 759–763. 10.1038/77664
  101. Kanwisher N, Khosla M, & Dobs K (2023). Using artificial neural networks to ask ‘why’ questions of minds and brains. Trends in Neurosciences, 46(3), 240–254. 10.1016/j.tins.2022.12.008
  102. Kanwisher N, McDermott J, & Chun MM (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17(11), 4302–4311. 10.1523/JNEUROSCI.17-11-04302.1997
  103. Kay K, Bonnen K, Denison RN, Arcaro MJ, & Barack DL (2023). Tasks and their role in visual neuroscience. Neuron, 111(11), 1697–1713. 10.1016/j.neuron.2023.03.022
  104. Khosla M, Ratan Murty NA, & Kanwisher N (2022). A highly selective response to food in human visual cortex revealed by hypothesis-free voxel decomposition. Current Biology, 32(19), 4159–4171.e9. 10.1016/j.cub.2022.08.009
  105. Kliger L, & Yovel G (2024). Distinct yet proximal face- and body-selective brain regions enable clutter-tolerant representations of the face, body, and whole person. Journal of Neuroscience, 44(24), e1871232024. 10.1523/JNEUROSCI.1871-23.2024
  106. Konkle T, & Alvarez GA (2022). A self-supervised domain-general learning framework for human ventral stream representation. Nature Communications, 13(1), 491. 10.1038/s41467-022-28091-4
  107. Konkle T, & Caramazza A (2013). Tripartite organization of the ventral stream by animacy and object size. Journal of Neuroscience, 33(25), 10235–10242. 10.1523/JNEUROSCI.0983-13.2013
  108. Kosakowski HL, Cohen MA, Takahashi A, Keil B, Kanwisher N, & Saxe R (2022). Selective responses to faces, scenes, and bodies in the ventral visual pathway of infants. Current Biology, 32(2), 265–274.e5. 10.1016/j.cub.2021.10.064
  109. Kourtzi Z, & Kanwisher N (2001). Representation of perceived object shape by the human lateral occipital complex. Science, 293(5534), 1506–1509. 10.1126/science.1061133
  110. Krakauer JW, Ghazanfar AA, Gomez-Marin A, MacIver MA, & Poeppel D (2017). Neuroscience needs behavior: Correcting a reductionist bias. Neuron, 93(3), 480–490. 10.1016/j.neuron.2016.12.041
  111. Kravitz DJ, Peng CS, & Baker CI (2011). Real-world scene representations in high-level visual cortex: It’s the spaces more than the places. Journal of Neuroscience, 31(20), 7322–7333. 10.1523/JNEUROSCI.4588-10.2011
  112. Kravitz DJ, Saleem KS, Baker CI, Ungerleider LG, & Mishkin M (2013). The ventral visual pathway: An expanded neural framework for the processing of object quality. Trends in Cognitive Sciences, 17(1), 26–49. 10.1016/j.tics.2012.10.011
  113. Kravitz DJ, Vinson LD, & Baker CI (2008). How position dependent is visual object recognition? Trends in Cognitive Sciences, 12(3), 114–122. 10.1016/j.tics.2007.12.006
  114. Kriegeskorte N, Mur M, Ruff DA, Kiani R, Bodurka J, Esteky H, Tanaka K, & Bandettini PA (2008). Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron, 60(6), 1126–1141. 10.1016/j.neuron.2008.10.043
  115. Lee Masson H, Van De Plas S, Daniels N, & Op de Beeck H (2018). The multidimensional representational space of observed socio-affective touch experiences. Neuroimage, 175, 297–314. 10.1016/j.neuroimage.2018.04.007
  116. Lescroart MD, & Gallant JL (2019). Human scene-selective areas represent 3D configurations of surfaces. Neuron, 101(1), 178–192.e7. 10.1016/j.neuron.2018.11.004
  117. Liu Z-X, Rosenbaum RS, & Ryan JD (2020). Restricting visual exploration directly impedes neural activity, functional connectivity, and memory. Cerebral Cortex Communications, 1(1), tgaa054. 10.1093/texcom/tgaa054
  118. Liu Z-X, Shen K, Olsen RK, & Ryan JD (2017). Visual sampling predicts hippocampal activity. Journal of Neuroscience, 37(3), 599–609. 10.1523/JNEUROSCI.2610-16.2016
  119. Lupyan G, & Thompson-Schill SL (2012). The evocative power of words: Activation of concepts by verbal and nonverbal means. Journal of Experimental Psychology. General, 141(1), 170–186. 10.1037/a0024904
  120. Mahon BZ (2022). Domain-specific connectivity drives the organization of object knowledge in the brain. In Miceli G, Bartolomeo P, & Navarro V (Eds.), Handbook of clinical neurology (Vol. 187, pp. 221–244). Elsevier.
  121. Mahon BZ, & Almeida J (2024). Reciprocal interactions among parietal and occipito-temporal representations support everyday object-directed actions. Neuropsychologia, 198, 108841. 10.1016/j.neuropsychologia.2024.108841
  122. Mahon BZ, & Caramazza A (2011). What drives the organization of object knowledge in the brain? Trends in Cognitive Sciences, 15(3), 97–103. 10.1016/j.tics.2011.01.004
  123. Mahon BZ, Milleville SC, Negri GAL, Rumiati RI, Caramazza A, & Martin A (2007). Action-related properties shape object representations in the ventral stream. Neuron, 55(3), 507–520. 10.1016/j.neuron.2007.07.011 [DOI] [PMC free article] [PubMed] [Google Scholar]
  124. Malach R, Reppas JB, Benson RR, Kwong KK, Jiang H, Kennedy WA, Ledden PJ, Brady TJ, Rosen BR, & Tootell RB (1995). Object-related activity revealed by functional magnetic resonance imaging in human occipital cortex. Proceedings of the National Academy of Sciences, 92(18), 8135–8139. 10.1073/pnas.92.18.8135 [DOI] [PMC free article] [PubMed] [Google Scholar]
125. Malcolm GL, Groen IIA, & Baker CI (2016). Making sense of real-world scenes. Trends in Cognitive Sciences, 20(11), 843–856. 10.1016/j.tics.2016.09.003
126. Margalit E, Lee H, Finzi D, DiCarlo JJ, Grill-Spector K, & Yamins DL (2024). A unifying framework for functional organization in early and higher ventral visual cortex. Neuron, 112(14), 2435–2451.
127. Marr D (1982). Vision: A computational investigation into the human representation and processing of visual information. MIT Press.
128. Martens F, Bulthé J, van Vliet C, & Op de Beeck H (2018). Domain-general and domain-specific neural changes underlying visual expertise. Neuroimage, 169, 80–93. 10.1016/j.neuroimage.2017.12.013
129. Martin A (2016). GRAPES - Grounding representations in action, perception, and emotion systems: How object properties and categories are represented in the human brain. Psychonomic Bulletin & Review, 23(4), 979–990. 10.3758/s13423-015-0842-3
130. Martin L, Durisko C, Moore MW, Coutanche MN, Chen D, & Fiez JA (2019). The VWFA is the home of orthographic learning when houses are used as letters. eNeuro, 6(1), ENEURO.0425-17.2019. 10.1523/ENEURO.0425-17.2019
131. Matthis JS, Yates JL, & Hayhoe MM (2018). Gaze and the control of foot placement when walking in natural terrain. Current Biology, 28(8), 1224–1233.e5. 10.1016/j.cub.2018.03.008
132. Mayr E (1961). Cause and effect in biology. Science, 134(3489), 1501–1506. 10.1126/science.134.3489.1501
133. McGugin RW, Sunday MA, & Gauthier I (2023). The neural correlates of domain-general visual ability. Cerebral Cortex, 33(8), 4280–4292. 10.1093/cercor/bhac342
134. McRae K, Cree GS, Seidenberg MS, & McNorgan C (2005). Semantic feature production norms for a large set of living and nonliving things. Behavior Research Methods, 37(4), 547–559. 10.3758/BF03192726
135. Mehrer J, Spoerer CJ, Jones EC, Kriegeskorte N, & Kietzmann TC (2021). An ecologically motivated image dataset for deep learning yields better models of human vision. Proceedings of the National Academy of Sciences, 118(8), e2011417118. 10.1073/pnas.2011417118
136. Milner AD (2017). How do the two visual streams interact with each other? Experimental Brain Research, 235(5), 1297–1308. 10.1007/s00221-017-4917-4
137. Milner AD, & Goodale MA (2008). Two visual systems re-viewed. Neuropsychologia, 46(3), 774–785. 10.1016/j.neuropsychologia.2007.10.005
138. Mirman D, Landrigan J-F, & Britt AE (2017). Taxonomic and thematic semantic systems. Psychological Bulletin, 143(5), 499–520. 10.1037/bul0000092
139. Mobbs D, Trimmer PC, Blumstein DT, & Dayan P (2018). Foraging for foundations in decision neuroscience: Insights from ethology. Nature Reviews Neuroscience, 19(7), 419–427. 10.1038/s41583-018-0010-7
140. Moore MW, Durisko C, Perfetti CA, & Fiez JA (2014). Learning to read an alphabet of human faces produces left-lateralized training effects in the fusiform gyrus. Journal of Cognitive Neuroscience, 26(4), 896–913. 10.1162/jocn_a_00506
141. Mur M, Ruff DA, Bodurka J, De Weerd P, Bandettini PA, & Kriegeskorte N (2012). Categorical, yet graded - single-image activation profiles of human category-selective cortical regions. Journal of Neuroscience, 32(25), 8649–8662. 10.1523/JNEUROSCI.2334-11.2012
142. Murty DV, Song S, Surampudi SG, & Pessoa L (2023). Threat and reward imminence processing in the human brain. Journal of Neuroscience, 43(16), 2973–2987. 10.1523/JNEUROSCI.1778-22.2023
143. Nastase SA, Connolly AC, Oosterhof NN, Halchenko YO, Guntupalli JS, Visconti di Oleggio Castello M, Gors J, Gobbini MI, & Haxby JV (2017). Attention selectively reshapes the geometry of distributed semantic representation. Cerebral Cortex, 27(8), 4277–4291. 10.1093/cercor/bhx138
144. Nastase SA, Goldstein A, & Hasson U (2020). Keep it real: Rethinking the primacy of experimental control in cognitive neuroscience. Neuroimage, 222, 117254. 10.1016/j.neuroimage.2020.117254
145. Nau M, Julian JB, & Doeller CF (2018). How the brain’s navigation system shapes our visual experience. Trends in Cognitive Sciences, 22(9), 810–825. 10.1016/j.tics.2018.06.008
146. Nau M, Navarro Schröder T, Frey M, & Doeller CF (2020). Behavior-dependent directional tuning in the human visual-navigation network. Nature Communications, 11(1), 3247. 10.1038/s41467-020-17000-2
147. Nau M, Schmid AC, Kaplan SM, Baker CI, & Kravitz DJ (2024). Centering cognitive neuroscience on task demands and generalization. Nature Neuroscience, 27(9), 1656–1667. 10.1038/s41593-024-01711-6
148. Nishimoto S, Vu A, Naselaris T, Benjamini Y, Yu B, & Gallant J (2011). Reconstructing visual experiences from brain activity evoked by natural movies. Current Biology, 21(19), 1641–1646. 10.1016/j.cub.2011.08.031
149. Nobre AC, Allison T, & McCarthy G (1994). Word recognition in the human inferior temporal lobe. Nature, 372(6503), 260–263. 10.1038/372260a0
150. Op de Beeck HP, Haushofer J, & Kanwisher NG (2008). Interpreting fMRI data: Maps, modules and dimensions. Nature Reviews Neuroscience, 9(2), 123–135. 10.1038/nrn2314
151. Op de Beeck HP, Pillet I, & Ritchie JB (2019). Factors determining where category-selective areas emerge in visual cortex. Trends in Cognitive Sciences, 23(9), 784–797. 10.1016/j.tics.2019.06.006
152. Orhan AE, & Lake BM (2024). Learning high-level visual representations from a child’s perspective without strong inductive biases. Nature Machine Intelligence, 6(3), 271–283. 10.1038/s42256-024-00802-0
153. Osher DE, Saxe RR, Koldewyn K, Gabrieli JDE, Kanwisher N, & Saygin ZM (2016). Structural connectivity fingerprints predict cortical selectivity for multiple visual categories across cortex. Cerebral Cortex, 26(4), 1668–1683. 10.1093/cercor/bhu303
154. O’Toole AJ, Jiang F, Abdi H, & Haxby JV (2005). Partially distributed representations of objects and faces in ventral temporal cortex. Journal of Cognitive Neuroscience, 17(4), 580–590. 10.1162/0898929053467550
155. Parvizi J, Jacques C, Foster BL, Withoft N, Rangarajan V, Weiner KS, & Grill-Spector K (2012). Electrical stimulation of human fusiform face-selective regions distorts face perception. Journal of Neuroscience, 32(43), 14915–14920. 10.1523/JNEUROSCI.2609-12.2012
156. Peelen MV, & Downing PE (2005). Selectivity for the human body in the fusiform gyrus. Journal of Neurophysiology, 93(1), 603–608. 10.1152/jn.00513.2004
157. Peelen MV, & Downing PE (2007). The neural basis of visual body perception. Nature Reviews Neuroscience, 8(8), 636–648. 10.1038/nrn2195
158. Peelen MV, & Downing PE (2017). Category selectivity in human visual cortex: Beyond visual object recognition. Neuropsychologia, 105, 177–183. 10.1016/j.neuropsychologia.2017.03.033
159. Peeters R, Simone L, Nelissen K, Fabbri-Destro M, Vanduffel W, Rizzolatti G, & Orban GA (2009). The representation of tool use in humans and monkeys: Common and uniquely human features. Journal of Neuroscience, 29(37), 11523–11539. 10.1523/JNEUROSCI.2040-09.2009
160. Pennock IML, Racey C, Allen EJ, Wu Y, Naselaris T, Kay KN, Franklin A, & Bosten JM (2023). Color-biased regions in the ventral visual pathway are food selective. Current Biology, 33(1), 134–146.e4. 10.1016/j.cub.2022.11.063
161. Perrett DI, Hietanen JK, Oram MW, & Benson PJ (1992). Organization and functions of cells responsive to faces in the temporal cortex. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 335, 23–30.
162. Perrett DI, Rolls ET, & Caan W (1982). Visual neurones responsive to faces in the monkey temporal cortex. Experimental Brain Research, 47(3). 10.1007/BF00239352
163. Persichetti AS, & Dilks DD (2018). Dissociable neural systems for recognizing places and navigating through them. Journal of Neuroscience, 38(48), 10295–10304. 10.1523/JNEUROSCI.1200-18.2018
164. Pessoa L, Medina L, & Desfilis E (2022). Refocusing neuroscience: Moving away from mental categories and towards complex behaviours. Philosophical Transactions of the Royal Society B: Biological Sciences, 377(1844), 20200534. 10.1098/rstb.2020.0534
165. Pitcher D, Charles L, Devlin JT, Walsh V, & Duchaine B (2009). Triple dissociation of faces, bodies, and objects in extrastriate cortex. Current Biology, 19(4), 319–324. 10.1016/j.cub.2009.01.007
166. Pitcher D, & Ungerleider LG (2021). Evidence for a third visual pathway specialized for social perception. Trends in Cognitive Sciences, 25(2), 100–110. 10.1016/j.tics.2020.11.006
167. Poldrack RA, Baker CI, Durnez J, Gorgolewski KJ, Matthews PM, Munafò MR, Nichols TE, Poline J-B, Vul E, & Yarkoni T (2017). Scanning the horizon: Towards transparent and reproducible neuroimaging research. Nature Reviews Neuroscience, 18(2), 115–126. 10.1038/nrn.2016.167
168. Popham SF, Huth AG, Bilenko NY, Deniz F, Gao JS, Nunez-Elizalde AO, & Gallant JL (2021). Visual and linguistic semantic representations are aligned at the border of human visual cortex. Nature Neuroscience, 24(11), 1628–1636. 10.1038/s41593-021-00921-6
169. Powell LJ, Kosakowski HL, & Saxe R (2018). Social origins of cortical face areas. Trends in Cognitive Sciences, 22(9), 752–763. 10.1016/j.tics.2018.06.009
170. Proklova D, & Goodale MA (2022). The role of animal faces in the animate-inanimate distinction in the ventral temporal cortex. Neuropsychologia, 169, 108192. 10.1016/j.neuropsychologia.2022.108192
171. Puce A, Allison T, Gore JC, & McCarthy G (1995). Face-sensitive regions in human extrastriate cortex studied by functional MRI. Journal of Neurophysiology, 74(3), 1192–1199. 10.1152/jn.1995.74.3.1192
172. Ratan Murty NA, Teng S, Beeler D, Mynick A, Oliva A, & Kanwisher N (2020). Visual experience is not necessary for the development of face-selectivity in the lateral fusiform gyrus. Proceedings of the National Academy of Sciences, 117, 23011–23020.
173. Ratan Murty NA, Bashivan P, Abate A, DiCarlo JJ, & Kanwisher N (2021). Computational models of category-selective brain regions enable high-throughput tests of selectivity. Nature Communications, 12(1), 5540. 10.1038/s41467-021-25409-6
174. Rice GE, Kerry SJ, Robotham RJ, Leff AP, Lambon Ralph MA, & Starrfelt R (2021). Category-selective deficits are the exception and not the rule: Evidence from a case-series of 64 patients with ventral occipito-temporal cortex damage. Cortex, 138, 266–281. 10.1016/j.cortex.2021.01.021
175. Ritchie JB (2019). The content of Marr’s information-processing framework. Philosophical Psychology, 32(7), 1078–1099. 10.1080/09515089.2019.1646418
176. Ritchie JB, Andrews ST, Vaziri-Pashkam M, & Baker CI (2024). Graspable foods and tools elicit similar responses in visual cortex. Cerebral Cortex, 34(9), bhae383. 10.1093/cercor/bhae383
177. Ritchie JB, & Carlson TA (2016). Neural decoding and “inner” psychophysics: A distance-to-bound approach for linking mind, brain, and behavior. Frontiers in Neuroscience, 10, 190. 10.3389/fnins.2016.00190
178. Ritchie JB, Montesinos S, & Carter MJ (2024). What is a visual stream? Journal of Cognitive Neuroscience, 36(12), 2627–2638. 10.1162/jocn_a_02191
179. Ritchie JB, Zeman AA, Bosmans J, Sun S, Verhaegen K, & Op de Beeck HP (2021). Untangling the animacy organization of occipitotemporal cortex. Journal of Neuroscience, 41(33), 7103–7119. 10.1523/JNEUROSCI.2628-20.2021
180. Rosenke M, van Hoof R, van den Hurk J, Grill-Spector K, & Goebel R (2021). A probabilistic functional atlas of human occipito-temporal visual cortex. Cerebral Cortex, 31(1), 603–619. 10.1093/cercor/bhaa246
181. Roth ZN, & Merriam EP (2023). Representations in human primary visual cortex drift over time. Nature Communications, 14(1), 4422. 10.1038/s41467-023-40144-w
182. Rule ME, O’Leary T, & Harvey CD (2019). Causes and consequences of representational drift. Current Opinion in Neurobiology, 58, 141–147. 10.1016/j.conb.2019.08.005
183. Saxe R, Brett M, & Kanwisher N (2006). Divide and conquer: A defense of functional localizers. Neuroimage, 30(4), 1088–1096. 10.1016/j.neuroimage.2005.12.062
184. Saygin ZM, Osher DE, Koldewyn K, Reynolds G, Gabrieli JDE, & Saxe RR (2012). Anatomical connectivity patterns predict face selectivity in the fusiform gyrus. Nature Neuroscience, 15(2), 321–327. 10.1038/nn.3001
185. Schalk G, Kapeller C, Guger C, Ogawa H, Hiroshima S, Lafer-Sousa R, Saygin ZM, Kamada K, & Kanwisher N (2017). Facephenes and rainbows: Causal evidence for functional and anatomical specificity of face and color processing in the human brain. Proceedings of the National Academy of Sciences, 114(46), 12285–12290. 10.1073/pnas.1713447114
186. Schmid AC, Barla P, & Doerschner K (2023). Material category of visual objects computed from specular image structure. Nature Human Behaviour, 7(7), 1152–1169. 10.1038/s41562-023-01601-0
187. Schmid AC, & Doerschner K (2019). Representing stuff in the human brain. Current Opinion in Behavioral Sciences, 30, 178–185. 10.1016/j.cobeha.2019.10.007
188. Schneider GE (1969). Two visual systems. Science, 163(3870), 895–902. 10.1126/science.163.3870.895
189. Schone HR, Maimon-Mor RO, Baker CI, & Makin TR (2021). Expert tool users show increased differentiation between visual representations of hands and tools. Journal of Neuroscience, 41(13), 2980–2989. 10.1523/JNEUROSCI.2489-20.2020
190. Schrimpf M, Kubilius J, Lee MJ, Ratan Murty NA, Ajemian R, & DiCarlo JJ (2020). Integrative benchmarking to advance neurally mechanistic models of human intelligence. Neuron, 108(3), 413–423. 10.1016/j.neuron.2020.07.040
191. Schyns PG (1998). Diagnostic recognition: Task constraints, object information, and their interactions. Cognition, 67(1–2), 147–179. 10.1016/S0010-0277(98)00016-X
192. Sha L, Haxby JV, Abdi H, Guntupalli JS, Oosterhof NN, Halchenko YO, & Connolly AC (2015). The animacy continuum in the human ventral vision pathway. Journal of Cognitive Neuroscience, 27(4), 665–678. 10.1162/jocn_a_00733
193. Shamay-Tsoory SG, & Mendelsohn A (2019). Real-life neuroscience: An ecological approach to brain and behavior research. Perspectives on Psychological Science, 14(5), 841–859. 10.1177/1745691619856350
194. Smith LB, & Slone LK (2017). A developmental approach to machine learning? Frontiers in Psychology, 8, 2124. 10.3389/fpsyg.2017.02124
195. Snow JC, & Culham JC (2021). The treachery of images: How realism influences brain and behavior. Trends in Cognitive Sciences, 25(6), 506–519. 10.1016/j.tics.2021.02.008
196. Stangl M, Maoz SL, & Suthana N (2023). Mobile cognition: Imaging the human brain in the ‘real world’. Nature Reviews Neuroscience, 24(6), 347–362. 10.1038/s41583-023-00692-y
197. Steel A, Garcia BD, Goyal K, Mynick A, & Robertson CE (2023). Scene perception and visuospatial memory converge at the anterior edge of visually responsive cortex. Journal of Neuroscience, 43(31), 5723–5737. 10.1523/JNEUROSCI.2043-22.2023
198. Stevens WD, Kravitz DJ, Peng CS, Tessler MH, & Martin A (2017). Privileged functional connectivity between the visual word form area and the language system. Journal of Neuroscience, 37(21), 5288–5297. 10.1523/JNEUROSCI.0138-17.2017
199. Stevens WD, Tessler MH, Peng CS, & Martin A (2015). Functional connectivity constrains the category-related organization of human ventral occipitotemporal cortex. Human Brain Mapping, 36(6), 2187–2206. 10.1002/hbm.22764
200. Stigliani A, Weiner KS, & Grill-Spector K (2015). Temporal processing capacity in high-level visual cortex is domain specific. Journal of Neuroscience, 35(36), 12412–12424. 10.1523/JNEUROSCI.4822-14.2015
201. Sullivan J, Mei M, Perfors A, Wojcik E, & Frank MC (2021). SAYCam: A large, longitudinal audiovisual dataset recorded from the infant’s perspective. Open Mind, 5, 20–29. 10.1162/opmi_a_00039
202. Tanaka JW, & Taylor M (1991). Object categories and expertise: Is the basic level in the eye of the beholder? Cognitive Psychology, 23(3), 457–482. 10.1016/0010-0285(91)90016-H
203. Tarhan L, & Konkle T (2020). Sociality and interaction envelope organize visual action representations. Nature Communications, 11(1), 3002. 10.1038/s41467-020-16846-w
204. Taubert J, Ritchie JB, Ungerleider LG, & Baker CI (2022). One object, two networks? Assessing the relationship between the face- and body-selective regions in the primate visual system. Brain Structure & Function, 227(4), 1423–1438. 10.1007/s00429-021-02420-7
205. Taylor MJ, Arsalidou M, Bayless SJ, Morris D, Evans JW, & Barbeau EJ (2009). Neural correlates of personally familiar faces: Parents, partner and own faces. Human Brain Mapping, 30(7), 2008–2020. 10.1002/hbm.20646
206. Thorat S, Proklova D, & Peelen MV (2019). The nature of the animacy organization in human ventral temporal cortex. eLife, 8, e47142. 10.7554/eLife.47142
207. Tinbergen N (1963). On aims and methods of ethology. Zeitschrift für Tierpsychologie, 20(4), 410–433. 10.1111/j.1439-0310.1963.tb01161.x
208. Topalovic U, Barclay S, Ling C, Alzuhair A, Yu W, Hokhikyan V, Chandrakumar H, Rozgic D, Jiang W, Basir-Kazeruni S, Maoz SL, Inman CS, Stangl M, Gill J, Bari A, Fallah A, Eliashiv D, Pouratian N, Fried I, … Suthana N (2023). A wearable platform for closed-loop stimulation and recording of single-neuron and local field potential activity in freely moving humans. Nature Neuroscience, 1–11. 10.1038/s41593-023-01260-4
209. Treue S (2001). Neural correlates of attention in primate visual cortex. Trends in Neurosciences, 24(5), 295–300. 10.1016/S0166-2236(00)01814-2
210. Turner BM, Forstmann BU, Love BC, Palmeri TJ, & Van Maanen L (2017). Approaches to analysis in model-based cognitive neuroscience. Journal of Mathematical Psychology, 76, 65–79. 10.1016/j.jmp.2016.01.001
211. van den Hurk J, Van Baelen M, & Op de Beeck HP (2017). Development of visual category selectivity in ventral visual cortex does not require visual experience. Proceedings of the National Academy of Sciences, 114, E4501–E4510.
212. van der Laan LN, de Ridder DTD, Viergever MA, & Smeets PAM (2011). The first taste is always with the eyes: A meta-analysis on the neural correlates of processing visual food cues. Neuroimage, 55(1), 296–303. 10.1016/j.neuroimage.2010.11.055
213. Varoquaux G, & Poldrack RA (2019). Predictive models avoid excessive reductionism in cognitive neuroimaging. Current Opinion in Neurobiology, 55, 1–6. 10.1016/j.conb.2018.11.002
214. Vaziri-Pashkam M, & Xu Y (2017). Goal-directed visual processing differentially impacts human ventral and dorsal visual representations. Journal of Neuroscience, 37(36), 8767–8782. 10.1523/JNEUROSCI.3392-16.2017
215. Vinken K, Prince JS, Konkle T, & Livingstone MS (2023). The neural code for “face cells” is not face-specific. Science Advances, 9(35), eadg1736. 10.1126/sciadv.adg1736
216. Visconti di Oleggio Castello M, Haxby JV, & Gobbini MI (2021). Shared neural codes for visual and semantic information about familiar faces in a common representational space. Proceedings of the National Academy of Sciences, 118(45), e2110474118. 10.1073/pnas.2110474118
217. Wada A, Sakano Y, & Ando H (2014). Human cortical areas involved in perception of surface glossiness. Neuroimage, 98, 243–257. 10.1016/j.neuroimage.2014.05.001
218. Westlin C, Theriault JE, Katsumi Y, Nieto-Castanon A, Kucyi A, Ruf SF, Brown SM, Pavel M, Erdogmus D, Brooks DH, Quigley KS, Whitfield-Gabrieli S, & Barrett LF (2023). Improving the study of brain-behavior relationships by revisiting basic assumptions. Trends in Cognitive Sciences, 27(3), 246–257. 10.1016/j.tics.2022.12.015
219. Wise T, Emery K, & Radulescu A (2024). Naturalistic reinforcement learning. Trends in Cognitive Sciences, 28(2), 144–158. 10.1016/j.tics.2023.08.016
220. Wurm MF, & Caramazza A (2022). Two ‘what’ pathways for action and object recognition. Trends in Cognitive Sciences, 26(2), 103–116. 10.1016/j.tics.2021.10.003
221. Xiao W, Sharma S, Kreiman G, & Livingstone MS (2024). Feature-selective responses in macaque visual cortex follow eye movements during natural vision. Nature Neuroscience, 27(6), 1157–1166. 10.1038/s41593-024-01631-5
222. Yargholi E, & Op de Beeck HP (2023). Category trumps shape as an organizational principle of object space in the human occipitotemporal cortex. Journal of Neuroscience, 43(16), 2960–2972. 10.1523/JNEUROSCI.2179-22.2023
