Proceedings of the National Academy of Sciences of the United States of America
2020 Nov 23;117(47):29354–29362. doi: 10.1073/pnas.1912333117

Large-scale dissociations between views of objects, scenes, and reachable-scale environments in visual cortex

Emilie L Josephs a,1, Talia Konkle a
PMCID: PMC7703543  PMID: 33229533

Abstract

Space-related processing recruits a network of brain regions separate from those recruited in object processing. This dissociation has largely been explored by contrasting views of navigable-scale spaces to views of close-up, isolated objects. However, in naturalistic visual experience, we encounter spaces intermediate to these extremes, like the tops of desks and kitchen counters, which are not navigable but typically contain multiple objects. How are such reachable-scale views represented in the brain? In three human functional neuroimaging experiments, we find evidence for a large-scale dissociation of reachable-scale views from both navigable scene views and close-up object views. Three brain regions were identified that showed a systematic response preference to reachable views, located in the posterior collateral sulcus, the inferior parietal sulcus, and superior parietal lobule. Subsequent analyses suggest that these three regions may be especially sensitive to the presence of multiple objects. Further, in all classic scene and object regions, reachable-scale views dissociated from both objects and scenes with an intermediate response magnitude. Taken together, these results establish that reachable-scale environments have a distinct representational signature from both scene and object views in visual cortex.

Keywords: objects, scenes, reachspaces, fMRI, visual cortex


Scene-based and object-based representations form a major joint in the organization of the visual system. Scene-selective brain regions are broadly concerned with performing global perceptual analysis of a space (1–4), computing its navigational affordances (5, 6), and linking the present view to stored memory about the overall location (7, 8). In contrast, object-selective regions represent bounded entities, robust to confounding low-level contours and minor changes in size or position (9, 10). Are these two systems, one for processing spatial layout and another for bounded objects, together sufficient to represent any view of the physical environment?

Consider views of reachable-scale environments—the countertops where we combine ingredients for a cake or the worktables where we assemble the components of a circuit board. These views are intermediate in scale to scenes and objects and are the locus of many everyday actions (Fig. 1A). How are they represented in the visual system?

Fig. 1.

Fig. 1.

Experiment 1 stimuli and results. (A) Examples of object, reachspace, and scene views. (B) Preference mapping results. Colored regions have preference for objects (yellow), reachspaces (blue), and scenes (green). Color saturation indicates the magnitude of the preference relative to the next most preferred category.

One possibility is that reachable-scale environments are represented similarly to navigable-scale scenes, driving similar activations across the ventral and dorsal streams. Views of reachable environments are spatially extended, have three-dimensional layout, and need to be situated within larger environments, all of which are hypothesized functions of scene-selective regions. However, everyday views of reachable-scale environments also prominently feature collections of multiple objects and differ meaningfully from scenes by affording object-centered actions rather than navigation. Thus, a second possibility is that reachable-scale views will strongly drive object-preferring cortex.

A third and not mutually exclusive possibility is that visual responses to reachable-scale environments might recruit distinct brain regions, separate from object- and scene-preferring cortex. There are both action-related and perception-related arguments for this hypothesis. First, it is clear that near-scale spaces have different behavioral demands than far-scale spaces (11–13). Indeed, there are well-known motor dissociations between reach-related frontoparietal circuits vs. navigation-related medial networks (14–16). Second, low-level statistics of visual images differ as a function of environment scale (17). We recently showed that the human perceptual system is sensitive to these differences: observers performing a visual search task were faster at finding an image of a reachable environment among distractor scenes or objects than among reachspaces, and vice versa (18). These results show that the scale of the depicted environment is a major factor in perceptual similarity computations.

These prior studies suggest that reachable-scale views dissociate from singleton object views and navigable-scale scene views in both their input-related image statistics and output-related action requirements. Such input and output pressures have been proposed to be jointly essential for the large-scale functional clustering observed in visual cortex for different kinds of visual domains [e.g., faces, scenes (19–23)]. Thus, it is possible that views of reachable environments are distinct enough in form and purpose to require distinct visual processing regions.

In the present work, we examined how views of reachable-scale environments are represented in the human brain using functional MRI (fMRI). We find clear evidence that reachspace representations dissociate from those of scenes and objects. Specifically, views of reachable environments elicited greater activity than both scenes and objects in regions of ventral and dorsal occipitoparietal cortex, across variations in luminance and global spatial frequency, and variations in the semantic category depicted (e.g., kitchen vs. office reachspaces). Reachable-scale environments also elicited differential responses in classic object- and scene-preferring regions, generally leading to intermediate levels of activation between scene and object views. Regions preferring reachable-scale environments showed a peripheral eccentricity bias but also responded particularly strongly to images of multiple objects, a functional signature that is distinct from both scene and object regions. Taken together, these results suggest that the visual processing of near-scale environments is functionally and topographically dissociable from that of objects and scenes.

Results

Preferential Responses to Reachable-Scale Spaces in Visual Cortex.

To examine the neural representation of reachable-scale environments compared with navigable-scale scenes and singleton objects, we created a stimulus set with images from each of the three environment types (Fig. 1A and SI Appendix, Fig. S1). Object images depicted close-scale views of single objects (within 8 to 12 inches) on their natural background. Reachable-scale images, which we will refer to as “reachspaces,” depicted near-scale environments that were approximately as deep as arm’s reach (3 to 4 feet) and consisted of multiple small objects arrayed on a horizontal surface (18). Scene images depicted views of the interior of rooms. Images were drawn from six semantic categories (bar, bathroom, dining room, kitchen, office, art studio). Note that we use the term “environment scale” to refer to the distinction between conditions but caution the reader against interpreting our results in terms of subjective distance only. Rather, differences observed here likely reflect differences across a constellation of dimensions that co-occur with scale (e.g., number of objects, number of surfaces, action affordances, perceived reachability). Two stimulus sets were collected, with 90 images each (Image Set A, Image Set B; 30 images per environmental scale per set) (Materials and Methods).

In Experiment 1, 12 participants viewed images of objects, reachspaces, and scenes, in a standard blocked fMRI design. All three stimulus conditions drove strong activations throughout visually responsive cortex, particularly in early visual and posterior inferotemporal regions, with progressively weaker responses anteriorly through the ventral and dorsal stream (SI Appendix, Fig. S2). To help visualize the differences between these response topographies, voxels were colored according to the condition that most strongly activated them, with the saturation of the color reflecting the strength of the response preference (early visual regions excluded) (SI Appendix, Supplementary Methods). This analysis revealed that different parts of cortex had reliable preferences for each stimulus type, both at the group level (Fig. 1B) and at the single-subject level (SI Appendix, Fig. S3). Reachspace preferences (blue) were evident in three distinct zones: posterior ventral cortex, occipital–parietal cortex, and superior parietal cortex. These zones of preference lay adjacent to known object-preference zones (yellow) and scene-preference zones (green). Thus, while all three conditions extensively drive visual cortex, the activation landscapes differ in a systematic manner.
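The winner-take-all coloring described above can be sketched as follows. This is an illustrative Python reconstruction with made-up beta values, not the authors' analysis code, and `preference_map` is a hypothetical helper:

```python
import numpy as np

def preference_map(betas):
    """For a (n_voxels, 3) array of betas for [objects, reachspaces,
    scenes], return each voxel's preferred condition and its margin
    over the next most preferred condition (the color saturation)."""
    order = np.argsort(betas, axis=1)             # ascending sort of conditions
    winner, runner_up = order[:, -1], order[:, -2]
    rows = np.arange(len(betas))
    margin = betas[rows, winner] - betas[rows, runner_up]
    return winner, margin

# Toy betas: one voxel preferring each condition
betas = np.array([[1.2, 0.4, 0.3],
                  [0.2, 1.1, 0.9],
                  [0.1, 0.5, 1.4]])
winner, margin = preference_map(betas)
```

Voxels are then colored by `winner` (yellow/blue/green) with saturation proportional to `margin`.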

To estimate the magnitude of reachspace preferences, we defined reachspace-preferring regions of interest (ROIs) around the peaks in reachspace preference appearing in anatomically consistent locations across subjects. Half of the data (activations from Image Set A) were submitted to a conjunction analysis to find voxels with a preference for reachspaces over objects and reachspaces over scenes. This procedure yielded three reachspace-preferring ROIs: a ventral region of interest (vROI), primarily located in the posterior collateral sulcus; an occipitoparietal region of interest (opROI), variably located in the middle or superior occipital gyri; and a superior parietal region of interest (spROI), in the anterior portion of the superior parietal lobe. Talairach (TAL) coordinates for these ROIs are given in SI Appendix, Table S1.
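The conjunction step can be sketched as a logical AND over thresholded contrast maps. The t-threshold of 2.0 and the voxel values below are illustrative only, not those used in the paper:

```python
import numpy as np

def conjunction_roi(t_rs_vs_o, t_rs_vs_s, threshold=2.0):
    """Voxels passing BOTH contrasts (reachspaces > objects AND
    reachspaces > scenes) at the given t-threshold."""
    return (t_rs_vs_o > threshold) & (t_rs_vs_s > threshold)

t_rs_o = np.array([3.1, 1.0, 2.5, 4.0])   # RS > O t-values per voxel
t_rs_s = np.array([2.8, 3.0, 1.2, 3.5])   # RS > S t-values per voxel
mask = conjunction_roi(t_rs_o, t_rs_s)    # only voxels 0 and 3 pass both
```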

Next, we examined activation magnitude in the remaining half of the data (Image Set B) and found that reachspace views elicited significantly higher activations than both scenes and objects in all three ROIs (Fig. 2) [vROI: reachspace (RS) > singleton object (O): t(8) = 5.33, P < 0.001; RS > S: t(8) = 4.66, P = 0.001; opROI: RS > O: t(6) = 5.20, P = 0.001; RS > S: t(6) = 4.55, P = 0.002; spROI: RS > O: t(7) = 6.16, P < 0.001; RS > S: t(7) = 5.22, P = 0.001]. These results also held when swapping the image set used to define the ROIs and test for activation differences (see SI Appendix, Table S2 for all statistics). This preference for reachspace images was not driven by any particular semantic category, as all six reachspace categories drove the highest responses in these regions (Fig. 2).
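Comparisons of this kind are paired t-tests over per-subject mean betas from the held-out image set. A minimal sketch with hypothetical numbers (nine subjects, matching the df = 8 reported for the vROI):

```python
import numpy as np
from scipy import stats

# Hypothetical per-subject mean betas from the held-out image set in one ROI
rs  = np.array([1.10, 0.95, 1.20, 1.05, 0.98, 1.15, 1.02, 1.08, 1.12])
obj = np.array([0.60, 0.55, 0.70, 0.58, 0.52, 0.65, 0.61, 0.59, 0.63])

t, p = stats.ttest_rel(rs, obj)   # paired test; df = n - 1 = 8
```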

Fig. 2.

Fig. 2.

Locations and activations of reachspace-preferring ROIs. ROI locations are shown in the volume and on the inflated surface of an example subject. Bar plots show beta activations for objects, reachspaces, and scenes, averaged over semantic category (3-bar plot) or with semantic category displayed separately (18-bar plot). Error bars represent the within-subject SEM, and asterisks indicate statistical significance.

Taken together, these analyses show that there are portions of cortex with systematically stronger responses to images of reachable-scale environments than to navigable-scale scenes and single-object images.

Low-Level Control and Replication.

In Experiment 2, we aimed to replicate the finding that reachspaces elicit greater activity than scenes and objects in some regions and to test whether the response preferences for reachspaces are attributable to factors beyond very simple feature differences. Twelve participants (two of whom had completed Experiment 1) viewed Image Set A (“original” images) and a version of Image Set B that was matched in mean luminance, contrast, and global spatial frequency content (“controlled” images) (Fig. 3A and SI Appendix, Fig. S4 show examples).
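Luminance and contrast matching amounts to rescaling every image to shared statistics. The sketch below shows that step only (full matching would additionally equate the average amplitude spectrum via the FFT) and uses random arrays in place of images; it is not the authors' actual pipeline:

```python
import numpy as np

def match_luminance_contrast(images):
    """Rescale each grayscale image to the set's mean luminance and
    mean contrast (SD). One step of a matching procedure."""
    target_mean = np.mean([im.mean() for im in images])
    target_std = np.mean([im.std() for im in images])
    out = []
    for im in images:
        z = (im - im.mean()) / im.std()   # zero mean, unit variance
        out.append(z * target_std + target_mean)
    return out

rng = np.random.default_rng(0)
imgs = [rng.uniform(0.0, 1.0, (8, 8)), rng.uniform(0.3, 0.9, (8, 8))]
matched = match_luminance_contrast(imgs)
```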

Fig. 3.

Fig. 3.

Stimuli and results for Experiment 2. (A) Illustration of matching luminance, contrast, and global spatial frequency. (B) A comparison of the group-average preference maps obtained for the original and controlled images, plotted on the same scale, and projected onto an inflated brain. Color saturation indicates the magnitude of the preference relative to the next most preferred category. (C) Activations in reachspace ROIs (defined in original images) in response to controlled images. Error bars represent the within-subject SEM, and asterisks indicate statistical significance.

Preference maps elicited by original and controlled images had highly similar spatial organization (Fig. 3B; SI Appendix, Fig. S5 shows single-subject maps). At the group level, 69.9% of visually responsive voxels preferred the same condition across original and controlled image formats (chance = 33.3%, 50.3 ± 1.5% match at the single-subject level) (SI Appendix, Fig. S6). Further, the topographies found in Experiment 2 with original images also match those found in Experiment 1 (67.4% of voxels in group-level preference maps had the same preference).
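The match percentage is simply the fraction of voxels whose preferred condition agrees across the two maps; a toy sketch with 10 hypothetical voxels:

```python
import numpy as np

def preference_match(pref_a, pref_b):
    """Percentage of voxels whose preferred condition (0 = object,
    1 = reachspace, 2 = scene) agrees across two preference maps.
    Chance is 100/3, about 33.3%, for unrelated three-way preferences."""
    pref_a, pref_b = np.asarray(pref_a), np.asarray(pref_b)
    return 100.0 * np.mean(pref_a == pref_b)

a = [0, 1, 2, 1, 0, 2, 1, 2, 0, 1]   # preferences, original images
b = [0, 1, 2, 1, 0, 2, 1, 0, 2, 1]   # preferences, controlled images
pct = preference_match(a, b)          # 8 of 10 voxels agree
```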

Additionally, the ROI results replicated with controlled images. Specifically, ROIs were defined in Experiment 2 subjects using original images, and activations were extracted for controlled images (Fig. 3C). Preferential responses to reachspaces were generally maintained [vROI: RS > O: t(9) = 2.08, P = 0.034; RS > S: t(9) = 2.72, P = 0.012; opROI: RS > O: t(5) = 2.38, P = 0.032; spROI: RS > O: t(5) = 3.61, P = 0.008; RS > S: t(5) = 2.02, P = 0.05; although RS > S in opROI was not significant: t(5) = 0.79, P = 0.234]. Note that in most of these ROIs, controlled images generally elicited lower overall activation magnitude than original images, and in some cases, the strength of the reachspace preference was slightly weaker than in the original image set (SI Appendix, Table S3).

In summary, Experiment 2 found that the controlled image set elicited weaker but similar responses to object, reachspace, and scene images, indicating that these brain responses are not solely driven by stimulus differences in luminance, contrast, or global spatial frequency content.

Responses to Reachable-Scale Environments in Scene- and Object-Preferring Regions.

We next evaluated reachspace-evoked activity in scene- and object-selective regions using data from both Experiment 1 (original images) and Experiment 2 (controlled images). All category-selective ROIs were defined using independent localizer runs (SI Appendix, Supplementary Methods).

In scene-preferring regions (parahippocampal place area [PPA], occipital place area [OPA], retrosplenial cortex [RSC]), reachspaces elicited an intermediate level of activation for both original and controlled images (Fig. 4A). That is, reachspace images evoked stronger activation than object images [original images: PPA: t(11) = 11.29, P < 0.001; OPA: t(10) = 9.16, P < 0.001; RSC: t(11) = 9.15, P < 0.001; controlled images: PPA: t(11) = 8.43, P < 0.001; OPA: t(10) = 9.32, P < 0.001; RSC: t(11) = 5.24, P < 0.001] and weaker activation than scene images, although this difference was marginal in OPA for original images [original image set: PPA: t(11) = 4.50, P < 0.001; OPA: t(10) = 1.63, P = 0.067; RSC: t(11) = 6.80, P < 0.001; controlled images: PPA: t(11) = 9.69, P < 0.001; OPA: t(10) = 4.25, P = 0.001; RSC: t(11) = 6.48, P < 0.001] (SI Appendix, Table S2 has results in original images where the ROI-defining and activation-extracting runs were swapped, and SI Appendix, Table S3 has comparisons of activations evoked by original and controlled images).

Fig. 4.

Fig. 4.

Result for classic category-selective ROIs. (A) Univariate response to objects, scenes, and reachspaces in scene-selective regions for both original and controlled images (i.e., Experiment 1 and Experiment 2, respectively). (B) Same analysis for object-selective regions.

In object-preferring regions (lateral occipital [LO] and posterior fusiform sulcus [pFs]), reachspaces also showed intermediate levels of activation in most comparisons (Fig. 4B). Specifically, reachspace images elicited significantly more activity than scene images [original images: LO: t(10) = 5.55, P < 0.001; pFs: t(10) = 4.86, P < 0.001; controlled images: LO: t(11) = 8.10, P < 0.001; pFs: t(11) = 6.04, P < 0.001]. Additionally, reachspace images elicited significantly weaker activation than objects for controlled images [LO: t(11) = 11.20, P < 0.001; pFs: t(11) = 12.19, P < 0.001] but showed a similar overall activation to object images in their original format [LO: t(10) = 0.86, P = 0.204; pFs: t(10) = −0.12, P = 0.547] (SI Appendix, Table S3 has all comparisons between activations to original vs. controlled images).

Taken together, these analyses show that reachspaces elicit an intermediate degree of activity in both scene- and object-preferring ROIs. These results provide further evidence that views of near-scale environments evoke different cortical responses than both scene and object images.

Functional Signatures of Reachspace-Preferring Cortex.

Next, we examined how object-, scene-, and reachspace-preferring ROIs differ in their broader functional signatures. We first report two opportunistic analyses from Experiment 1, which leverage stimulus conditions present in our localizer runs; then, we report data from Experiment 3, with planned functional signature analyses.

In our first opportunistic analysis, we examined the responses of regions with object, reachspace, and scene preferences to the eccentricity conditions present in the Experiment 1 retinotopy protocol (Fig. 5A and SI Appendix, Fig. S7). Reachspace-preferring regions showed a peripheral bias, which was significant at a conservative post hoc statistical level for the ventral and occipital reachspace regions but not in the superior parietal region (Fig. 5A) [vROI: t(8) = 3.90, P = 0.005; opROI: t(6) = 4.82, P = 0.003; spROI: t(7) = 3.29, P = 0.013; two-tailed post hoc paired t test with Bonferroni-corrected alpha = 0.006]. Similarly, scene regions were strongly peripherally biased [PPA: t(11) = 17.59, P < 0.001; OPA: t(10) = 9.27, P < 0.001; RSC: t(11) = 12.49, P < 0.001]. In contrast, object regions showed mixed biases, which did not reach significance after Bonferroni correction [LO: foveal bias, t(10) = 2.68, P = 0.023; pFs: peripheral bias, t(10) = 2.26, P = 0.047]. These results show that regions that responded preferentially to reachspaces, like scene-selective regions, are most sensitive to peripheral stimulation.

Fig. 5.

Fig. 5.

Response properties of reachspace regions, compared with scene and object regions. (A) Stimuli and results showing the eccentricity bias of object, reachspace, and scene-preferring areas. Error bars show the within-subjects SEM, and asterisks indicate statistical significance. (B) Stimuli and results showing the profile of responses across a range of categories for reachspaces regions and regions corresponding to the anatomical locations of object- and scene-selective areas. Beta values are plotted for each condition in a polar plot; negative values were thresholded to zero for visibility.

In our second opportunistic analysis, we investigated how ROIs differed in their response profile to a broad selection of categories present in the Experiment 1 localizer: faces, bodies, hands, objects, multiple objects, white noise, and scenes. Activations were extracted from reachspace-preferring ROIs. Since localizer runs were no longer available to define scene and object ROIs, these ROIs were approximated using a 9-mm radius sphere around their average TAL location, estimated based on a literature review. Activations for all regions are plotted as fingerprint profiles in Fig. 5B.

In all three reachspace-preferring ROIs, images of multiple objects elicited the highest activation [the difference between the multiple objects condition and the next highest condition was significant in vROI and spROI: vROI: multiple objects > scenes, t(8) = 3.49, P < 0.01; spROI: multiple objects > bodies, t(7) = 5.54, P < 0.01; marginal in opROI: multiple objects > scenes, t(6) = 2.32, P = 0.03]. In contrast, scene- and object-preferring ROIs showed different functional signatures. The approximated PPA and RSC regions preferred scenes over all other conditions, including multiple objects [scenes > multiple objects in PPA: t(11) = 12.02, P < 0.001; in RSC: t(11) = 7.87, P < 0.001; one-tailed paired t test, post hoc alpha level = 0.02]. This difference was not significant for approximated OPA [t(11) = −0.18, P = 0.57]. Finally, approximated LO and pFs regions showed maximal responses to bodies, with broad tuning to hands, faces, objects, and multiple objects and no differences between single objects and multiple objects in either ROI [LO: t(11) = −0.15, P = 0.56; pFs: t(11) = −0.85, P = 0.79]. Overall, these exploratory analyses suggest that all three reachspace-preferring regions show a similar response profile to each other (i.e., a preference for multiple objects and tuning to more peripheral stimulation), despite their anatomical separation, and that this profile is distinct from that of scene- and object-preferring regions.

To test this formally, Experiment 3 probed responses in all ROIs to a broad range of conditions (Fig. 6; stimuli are in SI Appendix, Fig. S8). These conditions included views of standard reachspaces, objects, and scenes, as well as four different multiobject conditions (all depicting multiple objects with no background) and two different minimal object conditions (depicting near-scale spatial layouts with one or no objects). A final condition depicted vertical reachspaces, where the disposition of objects was vertical rather than horizontal (e.g., shelves, pegboards). Experiment 3 was conducted in the same session as Experiment 2 and involved the same participants and functionally defined ROIs. Activations from all conditions were extracted from each ROI, and the fingerprints were compared.

Fig. 6.

Fig. 6.

Experiment 3 results. (A) Fingerprint profile of responses over all conditions in object, reachspace, and scene ROIs. MO-big, multiple big objects; MO-sm, multiple small objects; RS-e, empty reachspace images; RS-nbgs, reachspaces with no background, object positions scrambled; RS-nobg, reachspace images with only objects with background removed; RS-so, reachspace images with only one object; RS-v, vertical reachspaces. (B) Responses in reachspace-preferring ROIs across all Experiment 3 conditions, plotted in order from highest to lowest activations. Images with orange borders indicate stimuli dominated by multiple objects, and images with teal borders highlight images of reachable space with low object content. SI Appendix has all stimuli used in the experiment, in a larger format.

Across these 10 conditions, reachspace-preferring regions had a different fingerprint of activation than scene and object regions (Fig. 6A). To test the significance of the difference in fingerprint profiles, responses across all conditions were averaged over the reachspace ROIs to create a reachspace-ROI fingerprint and then compared with the scene-ROI fingerprint (averaged over scene regions) and the object-ROI fingerprint (averaged over object regions) using a two-way ANOVA. An omnibus test of ROI type (object, reachspace, or scene) by condition revealed an ROI type by condition interaction [F(9, 329) = 65.55, P < 0.001], showing that the patterns of activations across the 10 conditions varied as a function of ROI type. This difference held when reachspace ROIs were compared with scene and object ROIs separately [interaction effect for reachspace vs. scene ROIs: F(9, 219) = 32.20, P < 0.001; for reachspace vs. object ROIs: F(9, 219) = 47.89, P < 0.001]. These results further corroborate the conclusion that reachspace-preferring regions have representational signatures distinct from those of object- and scene-preferring cortex.
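The logic of the interaction test can be illustrated with a from-scratch balanced two-way ANOVA (ROI type × condition, with subjects as replicates). All numbers below are synthetic, and in practice a stats package would be used, so treat this strictly as a sketch:

```python
import numpy as np

def interaction_F(data):
    """F statistic for the A x B interaction in a balanced two-way
    design. `data` has shape (levels_A, levels_B, n_reps), e.g.
    (ROI type, stimulus condition, subject)."""
    a, b, n = data.shape
    grand = data.mean()
    row = data.mean(axis=(1, 2))                 # ROI-type means
    col = data.mean(axis=(0, 2))                 # condition means
    cell = data.mean(axis=2)                     # cell means
    ss_inter = n * ((cell - row[:, None] - col[None, :] + grand) ** 2).sum()
    ss_error = ((data - cell[:, :, None]) ** 2).sum()
    df_inter = (a - 1) * (b - 1)
    df_error = a * b * (n - 1)
    return (ss_inter / df_inter) / (ss_error / df_error), df_inter, df_error

# Two ROI types with different profiles over three conditions
rng = np.random.default_rng(1)
base = np.array([[1.0, 0.2, 0.1],    # "reachspace ROI" profile
                 [0.2, 0.1, 1.0]])   # "scene ROI" profile
data = base[:, :, None] + rng.normal(0.0, 0.05, (2, 3, 10))
F, df1, df2 = interaction_F(data)    # large F: profiles differ by ROI type
```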

Examining this response profile in more detail, in all three reachspace-preferring ROIs, responses were higher to all multiobject conditions (Fig. 6B, orange outline) than to empty or single-object reachspaces (teal outline). To quantify this, responses to all multiobject conditions were averaged, as were responses to empty reachspaces and single-object reachspaces, and the two resulting activation levels were compared with a post hoc t test [vROI: t(9) = 7.75, P < 0.01; opROI: t(5) = 4.57, P < 0.01; spROI: t(5) = 4.50, P < 0.01]. This pattern of data suggests that the presence of multiple easily individuated objects may be particularly critical for driving the strong response to reachspace images relative to full-scale scenes, where object content may be less prominent than layout information. In contrast, in scene-preferring regions, the empty reachspace images generated higher responses than multiple-object arrays, although this difference was marginal in OPA (SI Appendix, Fig. S9) [PPA: t(11) = −8.16, P < 0.01; OPA: t(10) = −1.49, P = 0.08; RSC: t(11) = −7.28, P < 0.01]. This result is consistent with prior work showing that scene regions strongly prefer empty full-scale rooms over multiple objects and generally reflect responses to spatial layout (1).

Fig. 7.

Fig. 7.

Depiction of the location of the reachspace ROIs in relation to scene-, object-, and face-preferring ROIs, shown in the right hemispheres of three example subjects.

These activation profiles also illustrate how the stimuli used to define a region do not allow us to directly infer what specific information is encoded there. For example, scene images depict both spatial layout and multiple objects, but scene ROIs are relatively more sensitive to the spatial layout content of the images. Analogously, reachspace images depict both spatial layout and multiple objects, but reachspace ROIs are relatively more sensitive to the multiobject content of the images. Thus, the claim is not necessarily that these are "reachspace-selective" regions. Rather, the claim is that these regions are responsive to some content that is relatively more present in naturalistic reachspace images than in scene and object images, and we suggest that the presence of multiple individuated objects is likely to be an important factor. Future work will be required to further articulate the distinctive roles of these regions.

New Territory vs. New Subdivisions of Scene-Preferring Regions.

Finally, we conducted several targeted analyses aimed at understanding whether reachspace-preferring regions are truly separate regions of cortex from scene- and object-preferring ROIs or whether they are simply new subdivisions. First, we subdivided classically localized PPA into anterior and posterior regions (24, 25) and found that neither subdivision showed a reachspace preference (SI Appendix, Fig. S10). These analyses indicate that the ventral reachspace-preferring ROI does not correspond to this known subdivision of PPA.

Next, we quantified the overlap between all ROIs, given that it was statistically possible for scene-preferring regions (defined with a standard scene > object contrast in localizer runs) to overlap with reachspace-preferring regions (defined as RS > O and RS > S conjunction contrast in experimental runs). However, we found relatively little overlap among the ROIs (e.g., for the vROI, there was a 4.4 ± 1.8% overlap with PPA, 4.6 ± 2.1% overlap with pFs, and 0.1 ± 0.1% overlap with FFA) (SI Appendix, Tables S4 and S5 have all overlap results). The relationship among these ROIs is visualized for three individual participants in Fig. 7 and for all participants in SI Appendix, Fig. S11. Overall, reachspace-preferring ROIs largely occupy different regions of cortex than object-, scene-, face-, and hand-selective cortex.
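Overlap percentages of this kind are computed from binary ROI masks on a shared voxel grid; a toy sketch with hypothetical masks:

```python
import numpy as np

def percent_overlap(roi_a, roi_b):
    """Percentage of roi_a's voxels that also belong to roi_b, given
    boolean masks defined over the same voxel grid."""
    roi_a = np.asarray(roi_a, dtype=bool)
    roi_b = np.asarray(roi_b, dtype=bool)
    return 100.0 * np.logical_and(roi_a, roi_b).sum() / roi_a.sum()

vroi = np.zeros(100, dtype=bool)
vroi[:20] = True                    # a 20-voxel ROI
ppa = np.zeros(100, dtype=bool)
ppa[18:40] = True                   # a neighboring ROI
pct = percent_overlap(vroi, ppa)    # voxels 18 and 19 are shared: 2/20
```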

Finally, we examined whether reachspace regions could be an artifact of population mixing. For example, it is possible that the ventral reachspace-preferring region actually reflects an intermixing of object-preferring neurons (similar to nearby pFs) and scene-preferring neurons (similar to nearby PPA) whose competing responses to object and scene images average out at the scale of fMRI, creating the appearance of reachspace tuning. If this were the case, then we would expect that the functional profile of the ventral reachspace region over the 10 conditions in Experiment 3 could be predicted by a weighted combination of responses in scene- and object-preferring regions. However, this was not evident in the data (SI Appendix, Fig. S12): no mixture of pFs and PPA tuning could predict the preference for all four multiobject conditions over both single-object conditions. Further, the spROI is also informative, as this region shows both a reachspace preference and a functional fingerprint similar to other reachspace-preferring ROIs but is anatomically far from object- or scene-selective regions.
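The mixing test amounts to asking whether any weighted sum of the object- and scene-region fingerprints reproduces the reachspace fingerprint. A least-squares sketch with hypothetical 5-condition profiles (the actual analysis used the 10 Experiment 3 conditions):

```python
import numpy as np

def best_mixture(target, comp_a, comp_b):
    """Least-squares weights w such that w[0]*comp_a + w[1]*comp_b
    best approximates `target`; returns the weights and the residual
    sum of squares of that best fit."""
    X = np.column_stack([comp_a, comp_b])
    w, _, _, _ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ w
    return w, float(np.sum(resid ** 2))

pfs = np.array([1.0, 0.9, 0.8, 0.2, 0.1])   # object-like fingerprint
ppa = np.array([0.1, 0.2, 0.2, 1.0, 0.9])   # scene-like fingerprint
rs  = np.array([1.2, 1.1, 1.3, 0.3, 0.2])   # reachspace fingerprint
w, err = best_mixture(rs, pfs, ppa)         # even the best fit leaves error
```

A nonzero residual for every weighting is the signature that the reachspace profile is not a simple mixture of its neighbors' profiles.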

Discussion

The aim of this study was to characterize how the visual system responds to views of reachable environments relative to views of full-scale scenes and singleton objects. We found that 1) reachable environments activate distinct response topographies from both scenes and objects; 2) regions with reachspace preferences are present in consistent locations across participants, allowing us to define ROIs in the posterior collateral sulcus, in dorsal occipitoparietal cortex, and in the superior parietal lobule; 3) the response topographies of reachspace preferences are maintained in an image set equating luminance, contrast, and global spatial frequency; 4) reachspaces elicit dissociable activity in scene and object ROIs, driving these regions to an intermediate degree; reachspace-preferring regions 5) have peripheral biases and 6) have distinctly higher responses to the presence of multiple isolated objects over near-scale spatial layout with minimal object content, a combination that is unique among the ROIs explored here; and 7) the reachspace-preferring regions do not appear to be a subdivision of classic category-selective areas.

Situating Reachspace-Preferring Cortex.

Activations across a similar constellation of regions were found when participants attended to the reachability of objects vs. their color or location (26) and when participants attended to a ball approaching their hand vs. a more distant target (27). In addition, the three reachspace-preferring ROIs appear to overlap a subset of parcels involved in making predictions about the physical behavior of objects (28). Taken together, the correspondence between these results suggests that the ROIs that preferred reachable-scale views may be generally important for reachability judgments and suggests a potentially broader role in behaviors that rely on accurate predictions regarding objects in the physical world.

The ventral reachspace ROI lies near a swath of cortex sensitive to features of object ensembles (29) and to the texture and surface properties of single objects (30), near regions sensitive to videos of actions performed in near space (31), and near the posterior edge of a color-biased band running along ventral IT cortex (23, 32). The occipital reachspace ROI lies in the vicinity of inferior parietal regions associated with the maintenance of multiple objects in working memory (33). Additionally, the superior parietal reachspace ROI falls near territory thought to contain information about the reachability of an object (34) and the type of object-directed hand movement that is planned (35). Interestingly, this ROI also appears to overlap the posterior locus of the multiple-demand network, a network of frontoparietal regions associated with the control of visual attention and the sequencing of cognitive operations (36, 37). Future studies with targeted comparisons will be required to map these functions together and assess the degree to which they draw on common representations.

Finally, it was recently found (31) that tuning of ventral and dorsal stream responses to videos of people performing actions was related to the “interaction envelope” (38) of the depicted action and was sensitive to whether the actions were directed at objects in near space or far space. This result is also broadly consistent with the present results, where the scale of depicted space seems to be an important factor in the structure of responses across the entire visual system.

Implications for the Visual Representation of Reachable Space.

The existence of reachspace-preferring cortex suggests that near-scale environments require some distinctive processing relative to navigable-scale scenes and close-scale objects. Part of these differences may relate to differences in scale between the views: perceived depth has been shown to affect activation strength in scene regions (39–41). However, it is clear that the ROIs that prefer reachspaces do not do so on the basis of scale alone: environments that were near scale but contained one or no objects elicited low responses in these regions. Instead, the regions responded strongly to images of multiobject arrays, suggesting a role for object-related content. Is it possible, then, that these regions are best characterized as "multiple object regions"? How important are background spatial components, such as desktops, and the texture cues that convey the perceived depth of the scene? Future work will be needed to characterize the effects of scale, the number of objects, and their interactions in order to clarify the tuning of these regions.

Finally, it is possible to extend theoretical frameworks for the large-scale organization of (isolated) object information and apply them to the large-scale organization of object, reachspace, and scene views. For example, some have argued that the visual world is divided into domains linked to behavioral relevance, which are separately arrayed along the cortical sheet (20, 23). Consistent with this action-based perspective, objects, reachspaces, and scenes differ in the kinds of high-level goals and behaviors they afford: objects afford grasping, reachspaces afford the coordinated use of multiple objects, and scenes afford locomotion and navigation. Others have argued that the large-scale organization is more of an emergent property that follows from experienced eccentricity and aggregated differences in midlevel image statistics (2123, 42). Consistent with this input-based perspective, reachspace images as a class are perceptually distinct from both scene and object images, a distinction that is also evident in the learned representations of deep neural networks (18). In sum, there are both action-based and image feature properties that can jointly motivate a large-scale division of objects, reachspaces, and scenes across the visual system.

Materials and Methods

In-text methods provide details about subjects, stimuli, and ROI definitions. All other method details are available in SI Appendix.

Subjects.

Twelve participants were recruited for Experiment 1 and Experiment 2. Two participants completed both. Experiment 3 was conducted in the same session as Experiment 2. All participants gave informed consent and were compensated for their participation. All procedures were approved by the Harvard University Human Subjects Institutional Review Board.

Stimuli.

All stimuli are available on the Open Science Framework (https://osf.io/g9aj5/). For Experiment 1, we collected views of objects, scenes, and reachable environments, each with 10 images from six semantic categories (bar, bathroom, dining room, kitchen, office, art studio), yielding 60 images per scale. These images were divided into two equal sets—Image Set A and Image Set B. Object images depicted close-scale views (within 8 to 12 inches from the object) on their natural background (e.g., a view of a sponge with a small amount of granite countertop visible beyond it). Reachspace images depicted near-scale environments that were approximately as deep as arm’s reach (3 to 4 feet) and consisted of multiple small objects arrayed on a horizontal surface (e.g., a knife, cutting board, and an onion arrayed on a kitchen counter). Scene images depicted views of the interior of rooms (e.g., a view of a home office). For Experiment 2, we created a controlled version of Image Set B in which all images were gray-scaled and matched in average luminance, contrast, and global spatial frequency content using the SHINE toolbox (43).
Experiment 3 included 10 stimulus conditions: 1) reachspace images with the background removed in Photoshop, yielding images of multiple objects in realistic spatial arrangements; 2) reachspace images with the background removed and the remaining objects scrambled, where the objects from the previous condition were moved around the image to disrupt the realistic spatial arrangement; 3) six objects with large real-world size (e.g., trampoline, dresser) arranged in a 3 × 2 grid on a white background; 4) six objects with small real-world size (e.g., mug, watch) arranged in a 3 × 2 grid on a white background and presented at the same visual size as the previous image condition; 5) reachable environments with all objects removed except the support surface; 6) reachspaces containing only a single object on the support surface; 7) vertical reachspaces, where the disposition of objects was vertical rather than horizontal (e.g., shelves, pegboards); 8) regular reachspaces (i.e., horizontal) as in earlier experiments; 9) objects (i.e., close-up views of single objects on their natural background); and 10) scenes (i.e., navigable-scale environments). Further details on stimulus selection and controls are available in SI Appendix.
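The Experiment 2 image-equating step (performed with the MATLAB SHINE toolbox) can be illustrated with a minimal Python sketch. This is not the authors' code: it imposes the set-average amplitude spectrum on each image (keeping each image's phase) and then rescales each image to the set-average mean luminance and contrast, which approximates SHINE's specMatch/lumMatch operations under simplifying assumptions.

```python
import numpy as np

def equate_images(images):
    """Roughly equate gray-scale images on mean luminance, contrast (SD),
    and global amplitude spectrum. Illustrative analog of the SHINE
    toolbox's specMatch + lumMatch steps; not the authors' actual code."""
    imgs = [np.asarray(im, dtype=float) for im in images]
    # Target statistics: grand averages across the whole image set
    target_mean = np.mean([im.mean() for im in imgs])
    target_std = np.mean([im.std() for im in imgs])
    target_amp = np.mean([np.abs(np.fft.fft2(im)) for im in imgs], axis=0)
    out = []
    for im in imgs:
        # Impose the average amplitude spectrum, keeping this image's phase
        phase = np.angle(np.fft.fft2(im))
        im = np.real(np.fft.ifft2(target_amp * np.exp(1j * phase)))
        # Rescale to the target mean luminance and contrast
        im = (im - im.mean()) / (im.std() + 1e-12) * target_std + target_mean
        out.append(np.clip(im, 0, 255))
    return out
```

After this step, every image in the set shares (approximately) the same average luminance, RMS contrast, and global spatial frequency content, so univariate response differences cannot be attributed to these low-level factors.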

Defining ROIs with Reachspace Preferences.

For Experiment 1, reachspace-preferring ROIs were defined manually in BrainVoyager by applying the conjunction contrast RS > O and RS > S, using four experimental runs with the same image set. We decided a priori to define all reachspace ROIs using Image Set A runs and to extract all activations for further analysis from Image Set B runs. These results are reported in the paper, but we also validated all analyses by reversing which image set was used to localize vs. extract activations; those results are reported in SI Appendix. For the ROIs used in Experiments 2 and 3 (run in the same session), we designed an automatic ROI-selection algorithm, guided by the anatomical locations of these regions in Experiment 1. This method allowed the precise location of ROIs to vary across individuals while still requiring them to fall within anatomically constrained zones. The algorithm located the largest patch in the vicinity of the average location of the ROIs from Experiment 1 where the univariate preference for reachspaces over the next most preferred category exceeded a beta value of 0.2 (more details are in SI Appendix). This automated procedure was developed using a separate pilot dataset, and all parameters were decided a priori (SI Appendix, Fig. S8 shows a visualization of the consequences of this parameter choice).
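The core of the automated selection step can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: `select_roi`, its arguments, and the use of 6-connectivity are assumptions for the sketch; the logic is simply "threshold the reachspace preference map within an anatomical search zone, then keep the largest connected patch."

```python
import numpy as np
from scipy import ndimage

def select_roi(betas, search_zone, threshold=0.2):
    """Illustrative sketch of automated ROI selection (not the authors' code).

    betas       : dict mapping condition -> 3D beta map,
                  e.g. {'RS': ..., 'O': ..., 'S': ...}
    search_zone : boolean 3D mask around the group-average ROI location
    threshold   : required beta advantage of reachspaces over the next
                  most preferred condition (0.2 in the paper)
    Returns a boolean mask of the largest qualifying connected patch.
    """
    # Preference of reachspaces over the next most preferred condition
    next_best = np.maximum(betas['O'], betas['S'])
    suprathreshold = (betas['RS'] - next_best) > threshold
    candidate = suprathreshold & search_zone
    # Label connected components (default: 6-connectivity in 3D)
    labeled, n = ndimage.label(candidate)
    if n == 0:
        return np.zeros_like(candidate)  # no qualifying patch in this zone
    # Keep only the largest connected patch
    sizes = ndimage.sum(candidate, labeled, index=range(1, n + 1))
    return labeled == (np.argmax(sizes) + 1)
```

In this formulation, the anatomical constraint enters through `search_zone`, so the procedure can select slightly different voxels per participant while all selected ROIs remain within the zone derived from Experiment 1.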

Supplementary Material

Supplementary File
pnas.1912333117.sapp.pdf (22.3MB, pdf)

Acknowledgments

We thank Leyla Tarhan, Dan Janini, Ruosi Wang, and John Mark Taylor for their help scanning participants. We acknowledge the University of Minnesota Center for Magnetic Resonance Research for use of the multiband-EPI pulse sequences. This work took place at the Harvard Center for Brain Science and is supported by NIH Shared Instrumentation Grant S10OD020039 to the Center for Brain Science.

Footnotes

The authors declare no competing interest.

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “Brain Produces Mind by Modeling,” held May 1–3, 2019, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. NAS colloquia began in 1991 and have been published in PNAS since 1995. From February 2001 through May 2019, colloquia were supported by a generous gift from The Dame Jillian and Dr. Arthur M. Sackler Foundation for the Arts, Sciences, & Humanities, in memory of Dame Sackler’s husband, Arthur M. Sackler. The complete program and video recordings of most presentations are available on the NAS website at http://www.nasonline.org/brain-produces-mind-by.

This article is a PNAS Direct Submission. D.S.B. is a guest editor invited by the Editorial Board.

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.1912333117/-/DCSupplemental.

Data Availability.

Anonymized stimuli and fMRI data have been deposited in the Open Science Framework (https://osf.io/g9aj5/).

References

1. Epstein R., Kanwisher N., A cortical representation of the local visual environment. Nature 392, 598–601 (1998).
2. Kravitz D. J., Peng C. S., Baker C. I., Real-world scene representations in high-level visual cortex: It’s the spaces more than the places. J. Neurosci. 31, 7322–7333 (2011).
3. Lescroart M. D., Gallant J. L., Human scene-selective areas represent 3D configurations of surfaces. Neuron 101, 178–192 (2019).
4. Park S., Brady T. F., Greene M. R., Oliva A., Disentangling scene content from spatial boundary: Complementary roles for the parahippocampal place area and lateral occipital complex in representing real-world scenes. J. Neurosci. 31, 1333–1340 (2011).
5. Bonner M. F., Epstein R. A., Coding of navigational affordances in the human visual system. Proc. Natl. Acad. Sci. U.S.A. 114, 4793–4798 (2017).
6. Kamps F. S., Julian J. B., Kubilius J., Kanwisher N., Dilks D. D., The occipital place area represents the local elements of scenes. Neuroimage 132, 417–424 (2016).
7. Marchette S. A., Vass L. K., Ryan J., Epstein R. A., Outside looking in: Landmark generalization in the human navigational system. J. Neurosci. 35, 14896–14908 (2015).
8. Vass L. K., Epstein R. A., Abstract representations of location and facing direction in the human brain. J. Neurosci. 33, 6133–6142 (2013).
9. Grill-Spector K., Kourtzi Z., Kanwisher N., The lateral occipital complex and its role in object recognition. Vis. Res. 41, 1409–1422 (2001).
10. Grill-Spector K., Kushnir T., Edelman S., Itzchak Y., Malach R., Cue-invariant activation in object-related areas of the human occipital lobe. Neuron 21, 191–202 (1998).
11. Previc F. H., The neuropsychology of 3-D space. Psychol. Bull. 124, 123–164 (1998).
12. Grüsser O.-J., “Multimodal structure of the extrapersonal space” in Spatially Oriented Behavior, A. Hein, M. Jeannerod, Eds. (Springer, New York, NY, 1983), pp. 327–352.
13. Rizzolatti G., Camarda R., “Neural circuits for spatial attention and unilateral neglect” in Advances in Psychology, M. Jeannerod, Ed. (North-Holland, Amsterdam, the Netherlands, 1987), vol. 45, pp. 289–313.
14. di Pellegrino G., Làdavas E., Peripersonal space in the brain. Neuropsychologia 66, 126–133 (2015).
15. Graziano M. S. A., Gross C. G., “The representation of extrapersonal space: A possible role for bimodal, visual-tactile neurons” in The Cognitive Neurosciences, M. S. Gazzaniga, Ed. (MIT Press, Cambridge, MA, 1994), pp. 1021–1034.
16. Maguire E., The retrosplenial contribution to human navigation: A review of lesion and neuroimaging findings. Scand. J. Psychol. 42, 225–238 (2001).
17. Torralba A., Oliva A., Depth estimation from image structure. IEEE Trans. Pattern Anal. Mach. Intell. 24, 1226–1238 (2002).
18. Josephs E. L., Konkle T., Perceptual dissociations among views of objects, scenes, and reachable spaces. J. Exp. Psychol. Hum. Percept. Perform. 45, 715–728 (2019).
19. de Beeck H. P. O., Pillet I., Ritchie J. B., Factors determining where category-selective areas emerge in visual cortex. Trends Cognit. Sci. 23, 784–797 (2019).
20. Mahon B. Z., Caramazza A., What drives the organization of object knowledge in the brain? Trends Cognit. Sci. 15, 97–103 (2011).
21. Konkle T., Oliva A., A real-world size organization of object responses in occipitotemporal cortex. Neuron 74, 1114–1124 (2012).
22. Arcaro M. J., Livingstone M. S., A hierarchical, retinotopic proto-organization of the primate visual system at birth. Elife 6, e26196 (2017).
23. Conway B. R., The organization and operation of inferior temporal cortex. Annu. Rev. Vision Sci. 4, 381–402 (2018).
24. Bar M., Aminoff E., Cortical analysis of visual context. Neuron 38, 347–358 (2003).
25. Baldassano C., Beck D. M., Fei-Fei L., Differential connectivity within the parahippocampal place area. Neuroimage 75, 228–237 (2013).
26. Bartolo A., et al., Contribution of the motor system to the perception of reachable space: An fMRI study. Eur. J. Neurosci. 40, 3807–3817 (2014).
27. Makin T. R., Holmes N. P., Zohary E., Is that near my hand? Multisensory representation of peripersonal space in human intraparietal sulcus. J. Neurosci. 27, 731–740 (2007).
28. Fischer J., Mikhael J. G., Tenenbaum J. B., Kanwisher N., Functional neuroanatomy of intuitive physical inference. Proc. Natl. Acad. Sci. U.S.A. 113, E5072–E5081 (2016).
29. Cant J. S., Xu Y., Object ensemble processing in human anterior-medial ventral visual cortex. J. Neurosci. 32, 7685–7700 (2012).
30. Cant J. S., Arnott S. R., Goodale M. A., FMR-adaptation reveals separate processing regions for the perception of form and texture in the human ventral stream. Exp. Brain Res. 192, 391–405 (2009).
31. Tarhan L., Konkle T., Sociality and interaction envelope organize visual action representations. Nat. Commun. 11, 1–11 (2020).
32. Lafer-Sousa R., Conway B. R., Kanwisher N. G., Color-biased regions of the ventral visual pathway lie between face and place-selective regions in humans, as in macaques. J. Neurosci. 36, 1682–1697 (2016).
33. Xu Y., Distinctive neural mechanisms supporting visual object individuation and identification. J. Cognit. Neurosci. 21, 511–518 (2009).
34. Gallivan J. P., Cavina-Pratesi C., Culham J. C., Is that within reach? fMRI reveals that the human superior parieto-occipital cortex encodes objects reachable by the hand. J. Neurosci. 29, 4381–4391 (2009).
35. Gallivan J. P., McLean D. A., Valyear K. F., Pettypiece C. E., Culham J. C., Decoding action intentions from preparatory brain activity in human parieto-frontal networks. J. Neurosci. 31, 9599–9610 (2011).
36. Duncan J., The multiple-demand (MD) system of the primate brain: Mental programs for intelligent behaviour. Trends Cognit. Sci. 14, 172–179 (2010).
37. Fedorenko E., Duncan J., Kanwisher N., Broad domain generality in focal regions of frontal and parietal cortex. Proc. Natl. Acad. Sci. U.S.A. 110, 16616–16621 (2013).
38. Bainbridge W. A., Oliva A., Interaction envelope: Local spatial representations of objects at all scales in scene-selective regions. Neuroimage 122, 408–416 (2015).
39. Persichetti A. S., Dilks D. D., Perceived egocentric distance sensitivity and invariance across scene-selective cortex. Cortex 77, 155–163 (2016).
40. Lescroart M. D., Stansbury D. E., Gallant J. L., Fourier power, subjective distance, and object categories all provide plausible models of bold responses in scene-selective visual areas. Front. Comput. Neurosci. 9, 135 (2015).
41. Henderson J. M., Larson C. L., Zhu D. C., Full scenes produce more activation than close-up scenes and scene-diagnostic objects in parahippocampal and retrosplenial cortex: An fMRI study. Brain Cognit. 66, 40–49 (2008).
42. Long B., Yu C.-P., Konkle T., Mid-level visual features underlie the high-level categorical organization of the ventral stream. Proc. Natl. Acad. Sci. U.S.A. 115, E9015–E9024 (2018).
43. Willenbockel V., et al., The SHINE toolbox for controlling low-level image properties. J. Vision 10, 653 (2010).

