Abstract
When visual objects are located in the lower visual field, human observers perceive objects to be nearer than their real physical location. Conversely, objects in the upper visual field are viewed farther than their physical location. This bias may be linked to the statistics of natural scenes, and perhaps the ecological relevance of objects in the upper and lower visual fields (Previc, 1990; Yang & Purves, 2003). However, the neural mechanisms underlying such perceptual distortions have remained unknown.
To test for underlying brain mechanisms, we presented visual stimuli at different perceptual distances, while measuring high-resolution fMRI in human subjects. First, we localized disparity-selective thick stripes and thick-type columns in secondary and third visual cortical areas, respectively. Consistent with the perceptual bias, we found that the thick stripe/columns that represent the lower visual field also responded more selectively to near rather than far visual stimuli. Conversely, thick stripe/columns that represent the upper visual field show a complementary bias, i.e. selectively higher activity to far rather than near stimuli. Thus, the statistics of natural scenes may play a significant role in the organization of near- and far-selective neurons within V2 thick stripes and V3 thick-type columns.
Keywords: Statistics of Natural Scenes, Stereopsis, Depth Perception, Extrastriate cortex, 7T fMRI
1. Introduction
In humans and many other terrestrial animals, visual objects that appear below the line of sight (i.e. in the lower visual field) are typically located closer than objects appearing in the upper visual field (Yang & Purves, 2003). This typical difference in object distance can affect human judgments. Consistent with these statistics of natural scenes, humans are known to systematically underestimate the distance of objects below the line of sight, perceiving them nearer than their actual distance (Ooi, Wu, & He, 2001; Philbeck & Loomis, 1997; Wallach & O’Leary, 1982; Yang & Purves, 2003). Analogously, observers overestimate object distance when such objects are located in the upper visual field (Breitmeyer, Battaglia, & Bridge, 1977). Thus far, the neural mechanisms underlying these behavioral biases have been obscure.
A main cue for estimating visual object distance is binocular disparity. Images from the two eyes are ‘crossed’ for objects located nearer than the center of gaze (‘near’ distances), or ‘uncrossed’ for objects located farther than that (‘far’ distances). At least in macaque monkeys, this distinction is fundamental enough that neurons that respond selectively to such ‘near’ and ‘far’ disparities are grouped together in segregated columns within visual cortex (Adams & Zeki, 2001; Chen, Lu, & Roe, 2008; Tanabe, Doi, Umeda, & Fujita, 2005). Consistent with these results from monkeys, a recent fMRI study also suggested that near and far selective neurons are clustered within one area in human visual cortex, named V3A (Goncalves et al., 2015). However, it has not been tested whether near and far columns are located preferentially in the cortical representation of the lower vs. upper visual fields (respectively), i.e. consistent with the bias in depth perception.
Here we show evidence for such a neural bias. We conducted high-resolution, high field (7T) fMRI measurements in human subjects during presentation of visual stimuli in near vs. far conditions (see Methods). Consistent with the reported bias in human depth perception, we found that near stimuli evoked stronger activity in disparity selective columns in the lower (compared to upper) visual field representations, within each of the two most retinotopically-organized extrastriate cortical areas.
2. Methods
2.1. Participants
Six human subjects (3 females), aged 21–32 years, participated in this study. All subjects had normal or corrected-to-normal visual acuity and radiologically normal brains, without history of neuropsychological disorder. All experimental procedures conformed to NIH guidelines and were approved by Massachusetts General Hospital protocols. Written informed consent was obtained from all subjects prior to the experiments.
2.2. General Procedures
Each subject was scanned in multiple sessions, on different days, in a high field scanner (Siemens 7T whole-body system, Siemens Healthcare, Erlangen, Germany). Initial sessions localized stereo-selective (‘thick’) and color-selective (‘thin’) stripes/columns, in each subject. Subsequent scans measured fMRI activity evoked by random dot stereograms (RDS; Anzai, Chowdhury, & DeAngelis, 2011; Bela Julesz, 1971; Minini, Parker, & Bridge, 2010; Nasr, Polimeni, & Tootell, 2016; Tsao, Vanduffel, et al., 2003) of either crossed (‘near’) or uncrossed (‘far’) binocular disparity (see below). All subjects were also scanned in a 3T scanner (Tim Trio, Siemens Healthcare) in one additional session, for structural and retinotopic mapping.
2.3. Visual Stimuli
Stimuli were presented via an LCD projector (1024 × 768 pixel resolution, 60 Hz refresh rate) focused on a rear-projection screen, viewed through a mirror mounted on the receive coil array. Matlab 2013a (MathWorks, Natick, MA, USA) and Psychophysics Toolbox (Brainard, 1997; Pelli, 1997) were used to control stimulus presentation.
During all experiments, stimuli were presented in a blocked-design procedure. Subjects were required to maintain fixation on a small (0.1° × 0.1°) central spot. To control the level of attention during the scans, subjects were required to simultaneously perform an unrelated (‘dummy’) task, reporting changes in color (red-to-green or vice versa) and shape (square-to-circle or vice versa) of the fixation spot during ‘near vs. far’ and localizer scans (see below), by pressing a key on a keypad.
2.3.1. Near vs. Far Disparity
Disparity-varying stimuli were sparse (5% bright) RDS based on red or green dots (0.09° × 0.09°) presented against a black background, extending 20° × 20° in the visual field. Subjects viewed the two RDS (each either red or green) through custom anaglyph spectacles, using Kodak Wratten filter No. 25 (red) over one eye, and 44A (cyan) over the other. Two RDS were overlaid and fused within all experiment blocks. In ‘near’ and ‘far’ conditions, stimuli formed a stereoscopic percept of a regular array of cuboids that varied sinusoidally in depth between 0–0.22°, either ‘in front’ or ‘behind’ a fronto-parallel plane that intersected the fixation target. In a control condition, the fused percept was limited to that fronto-parallel plane (i.e. zero depth).
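To give a sense of scale for the 0–0.22° disparity range, the following minimal sketch converts an angular disparity into a simulated viewing distance for a point on the midline. The fixation distance and interocular distance are not reported in this section, so the values used here (1.0 m and 63 mm) are purely hypothetical, and the function names are illustrative only.

```python
import math


def vergence_deg(distance_m, ipd_m):
    """Vergence angle (degrees) needed to fixate a midline point at distance_m."""
    return math.degrees(2.0 * math.atan(ipd_m / (2.0 * distance_m)))


def distance_for_vergence(vergence, ipd_m):
    """Inverse of vergence_deg: midline distance (m) for a vergence angle in degrees."""
    return ipd_m / (2.0 * math.tan(math.radians(vergence) / 2.0))


def depth_from_disparity(disparity_deg, fixation_m, ipd_m):
    """Simulated distance (m) of a point with the given disparity relative to fixation.

    Sign convention used in this sketch only: positive disparity means the
    point is nearer than fixation (larger vergence angle), negative means farther.
    """
    v = vergence_deg(fixation_m, ipd_m) + disparity_deg
    if v <= 0.0:
        raise ValueError("disparity places the point beyond optical infinity")
    return distance_for_vergence(v, ipd_m)


# Hypothetical viewing geometry (NOT taken from the paper).
FIXATION_M = 1.0
IPD_M = 0.063
# Maximum disparity amplitude reported for the stimuli.
MAX_DISPARITY_DEG = 0.22

near_m = depth_from_disparity(+MAX_DISPARITY_DEG, FIXATION_M, IPD_M)
far_m = depth_from_disparity(-MAX_DISPARITY_DEG, FIXATION_M, IPD_M)
print(f"nearest simulated distance: {near_m:.3f} m")
print(f"farthest simulated distance: {far_m:.3f} m")
```

Under these assumptions the same angular disparity maps to a slightly larger physical offset behind fixation than in front of it, which is a general property of the geometry rather than of this particular stimulus.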
Each experimental run included 9 stimulus blocks (24 s per block). Additionally, each run began and ended with control conditions of 12 s of uniform gray (‘blank’). Each subject participated in two separate scan sessions, with 12 runs (960 functional volumes) per session.
2.3.2. Localizing Thick and Thin Stripes/Columns
Details of the stimuli and experimental procedure used to localize thin and thick stripes are reported elsewhere (Nasr et al., 2016). Briefly, V2 thick stripes and V3 thick-type columns were localized using RDSs based on red or green dots (0.09° × 0.09°) presented against a black background, extending 20° × 20° in the visual field. As described above, subjects viewed the stimulus through custom anaglyph spectacles. Stimuli formed a stereoscopic percept of a regular array of cuboids that varied sinusoidally in depth, with independent phase. However, in contrast to the main experiment in which stimuli were presented either in front or behind the fixation target, here the stimuli spanned the full depth range (i.e. ± 0.22°) within each experimental block. As a control, in separate blocks, RDS stimuli were presented at zero disparity. Each experimental run began and ended with 12 s of uniform gray (‘blank’) and included 8 stimulus blocks (24 s per block). Each subject participated in three scan sessions (12 runs per session) during which 2592 functional volumes were collected.
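The volume counts quoted for the ‘near vs. far’ runs and for the disparity localizer can be checked against the run structure with a few lines of arithmetic. The sketch below assumes that every volume was acquired at the 7T repetition time of 3000 ms given in the imaging section, and that a run consists only of the stimulus blocks plus the two 12 s blank periods; the function name is illustrative.

```python
TR_S = 3          # 7T repetition time in seconds (TR = 3000 ms)
BLOCK_S = 24      # duration of each stimulus block
BLANK_S = 12      # blank period at the start and at the end of each run
RUNS_PER_SESSION = 12


def volumes_per_run(n_blocks, block_s=BLOCK_S, blank_s=BLANK_S, tr_s=TR_S):
    """Number of functional volumes in one run, assuming one volume per TR."""
    run_s = n_blocks * block_s + 2 * blank_s
    if run_s % tr_s != 0:
        raise ValueError("run duration is not a whole number of TRs")
    return run_s // tr_s


# Main 'near vs. far' experiment: 9 blocks per run, volumes counted per session.
main_per_session = volumes_per_run(9) * RUNS_PER_SESSION

# Disparity localizer: 8 blocks per run, volumes counted over three sessions.
localizer_total = volumes_per_run(8) * RUNS_PER_SESSION * 3

print(main_per_session)   # compare with the 960 volumes per session in the text
print(localizer_total)    # compare with the 2592 volumes in the text
```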
Color-selective (‘thin’) stripes and columns were localized in V2 and V3 in separate scan sessions, using sinusoidal gratings (20° × 20° of visual angle) which varied in either color or achromatic luminance, in independent blocks (Nasr et al., 2016). Grating stimuli were also presented in systematically varied orientations (either 0°, 45°, 90° or 135°), drifting in orthogonal directions (reversed every 6 s) at 4°/s. Each run included 9 stimulus blocks (24 s per block), and began and finished with an additional block (12 s) of uniform gray of equal mean luminance. Each subject participated in 1–2 scan sessions (12 runs per session). 1008 functional volumes were collected in each scan session.
2.3.3. Retinotopic Mapping
Details of retinotopic mapping are reported elsewhere (Nasr et al., 2011). Briefly, stimuli were colored images of scenes and faces, which were presented within retinotopically limited apertures, against a gray background. The retinotopic apertures included wedges aligned along the horizontal and vertical meridians (radius = 10°, polar angle = 30°), a foveal disk (radius = 1.5°) and a peripheral ring (inner-outer radius = 5–10°). For one subject, we also mapped retinotopic areas using counterphased, radially scaled checkerboard stimuli, rather than scenes and faces.
To confirm the V1/V2/V3 borders, in 2 subjects we also used phase-encoded, continuously rotating rays or continuously expanding/contracting ring stimuli for retinotopic mapping, each filled with contrast-reversing (1 Hz) checkerboards that were scaled in size with eccentricity. Details of this procedure have been described previously (Sereno et al., 1995).
2.4. Imaging
2.4.1. 7T Sessions
The main experiments were conducted in a 7T Siemens whole-body scanner equipped with SC72 body gradients (70 mT/m maximum gradient strength and 200 T/m/s maximum slew rate) using a custom-built 32-channel helmet receive coil array and a birdcage volume transmit coil (Keil, Triantafyllou, Hamm, & Wald, 2010). Voxel dimensions were nominally 1.0 mm, isotropic, except as noted below. Single-shot gradient-echo EPI was used to acquire functional images with the following protocol parameter values: TR=3000 ms, TE=28 ms, flip angle=78°, matrix=192×192, BW=1184 Hz/pix, echo-spacing=1 ms, 7/8 phase partial Fourier, FOV=192×192 mm, 44 oblique-coronal slices, acceleration factor R=4 with GRAPPA reconstruction and FLEET-ACS data (Polimeni et al., 2015) with 10° flip angle. The field of view included occipital cortical areas V1, V2, V3, and usually the posterior portion of V4.
2.4.2. 3T Sessions
High spatial resolution was not necessary to map the borders of retinotopic areas. Instead, retinotopic mapping was conducted using a 3T Siemens scanner (Tim Trio) and the vendor-supplied 32-channel receive coil array. Those functional data were acquired using single-shot gradient-echo EPI with nominally 3.0 mm isotropic voxels using the following protocol parameters: TR=2000 ms, TE=30 ms, flip angle=90°, matrix=64×64, BW=2298 Hz/pix, echo-spacing=0.5 ms, no partial Fourier, FOV=192×192 mm, 33 axial slices covering the entire brain, and no acceleration. For one subject, we also compared these retinotopic maps (3T) to the maps collected in a 7T scanner. As expected, results of this comparison showed relatively higher spatial resolution in the 7T retinotopic maps, but basically identical borders at both field strengths. This similarity was expected, since most retinotopic borders are topographically smooth (see also Olman et al., 2010).
Structural (anatomical) data were also acquired in a 3T scanner using a 3D T1-weighted MPRAGE sequence with the following protocol parameter values: TR=2530 ms, TE=3.39 ms, TI=1100 ms, flip angle=7°, BW=200 Hz/pix, echo spacing=8.2 ms, voxel size=1.0 × 1.0 × 1.33 mm³, FOV=256 × 256 × 170 mm³.
2.5. General Data Analysis
Functional and anatomical MRI data were pre-processed and analyzed using FreeSurfer and FS-FAST (version 5.3; http://surfer.nmr.mgh.harvard.edu/) (Fischl, 2012). For each subject, inflated and flattened cortical surfaces were reconstructed based on the high-resolution anatomical data (Dale, Fischl, & Sereno, 1999; Fischl et al., 2002; Fischl, Sereno, & Dale, 1999).
All functional images were corrected for motion artifacts. 3T functional data were spatially smoothed (Gaussian filtered with a 5 mm FWHM). However, no spatial smoothing was applied to the main imaging data acquired at 7T (i.e. 0 mm FWHM). For each subject, functional data from each run were rigidly aligned (6 DOF) relative to his/her own structural scan using Boundary-Based Registration (Greve & Fischl, 2009). This procedure enabled us to average data collected across multiple scan sessions, for each subject.
A standard hemodynamic model based on a gamma function was fit to the fMRI signal to estimate the amplitude of the BOLD response. For each subject, the average BOLD response maps were calculated for each condition (Friston, Holmes, Price, Buchel, & Worsley, 1999). Finally, voxel-wise statistical tests were conducted by computing contrasts based on a univariate general linear model, and the resultant significance maps were projected onto the subject’s anatomical volumes and reconstructed cortical surfaces.
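The general logic of this step (a block-design regressor built by convolving a boxcar with a gamma-shaped hemodynamic response, followed by a least-squares fit) can be illustrated with a minimal, single-regressor sketch. This is not the FS-FAST implementation: the gamma parameters (delta, tau, alpha), the block onsets and the simulated voxel time course are all assumptions chosen for illustration, and a real analysis would include one regressor per condition plus nuisance terms.

```python
import math

TR_S = 3.0  # repetition time of the 7T acquisitions, in seconds


def gamma_hrf(t, delta=2.25, tau=1.25, alpha=2.0):
    """Gamma-shaped hemodynamic response; parameter values are assumed."""
    if t <= delta:
        return 0.0
    s = (t - delta) / tau
    return (s ** alpha) * math.exp(-s)


def block_regressor(n_vols, onsets_s, block_s, tr_s=TR_S):
    """Boxcar (1 during stimulus blocks) convolved with the HRF, one sample per TR."""
    boxcar = [0.0] * n_vols
    for onset in onsets_s:
        for v in range(n_vols):
            if onset <= v * tr_s < onset + block_s:
                boxcar[v] = 1.0
    # Sample the HRF over 30 s and normalise it to unit sum.
    kernel = [gamma_hrf(k * tr_s) for k in range(int(30 / tr_s) + 1)]
    total = sum(kernel)
    kernel = [h / total for h in kernel]
    regressor = []
    for v in range(n_vols):
        acc = 0.0
        for k, h in enumerate(kernel):
            if v - k >= 0:
                acc += boxcar[v - k] * h
        regressor.append(acc)
    return regressor


def fit_glm(y, x):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1


# Hypothetical run: 80 volumes, three 24 s blocks of one condition.
x = block_regressor(n_vols=80, onsets_s=[12.0, 84.0, 156.0], block_s=24.0)
# Simulated noiseless voxel: baseline 1000 (arbitrary units), response amplitude 20.
y = [1000.0 + 20.0 * xi for xi in x]

b0, b1 = fit_glm(y, x)
percent_signal_change = 100.0 * b1 / b0
print(round(percent_signal_change, 3))
```

Expressing the fitted amplitude relative to the baseline term is one common way to obtain a percent signal change; contrasts such as ‘far – near’ would then be differences between the amplitudes fitted for the two conditions.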
2.5.1. Specific data analysis and tests for 7T data
Multiple studies have shown that BOLD activity sampled near the superficial surface is stronger but spatially more distorted, compared to BOLD activity in the deeper layers (De Martino et al., 2013; Nasr et al., 2016; Polimeni, Fischl, Greve, & Wald, 2010). To reduce the impact of the pial surface veins, evoked BOLD activity was sampled from the deepest cortical depth. Specifically, for each subject the gray-white matter interface was generated from their own high-resolution structural scans (see above) using FreeSurfer (Dale et al., 1999; Fischl et al., 2002; Fischl et al., 1999). To measure the fMRI activity at the deepest cortical depth, the percent fMRI signal change was calculated for each functional voxel intersecting these surfaces, and projected onto the corresponding vertices of the surface mesh. As a control, additional analyses were made from superficial depths (e.g. Fig. 1).
To quantify the consistency between selectivity maps acquired across different scan sessions, we measured the fMRI signal change evoked by the ‘far vs. near’ contrast across the two scan sessions, for each vertex within each region of interest, i.e. retinotopically activated zones in V2 and V3. The results were tested for a significant correlation between the selective activity levels evoked across the two sessions by applying a Pearson correlation test.
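As a minimal sketch of this test, the code below computes a Pearson correlation between two per-vertex maps and the corresponding t statistic (with n − 2 degrees of freedom). The vertex values and the vertex count used in the last line are hypothetical; the number of vertices per ROI is not stated in this section.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length (at least 3 values)")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical 'far - near' signal change (%) at the same vertices in two sessions.
session_1 = [0.4, -0.2, 0.1, 0.6, -0.5, 0.0, 0.3, -0.1]
session_2 = [0.3, -0.1, -0.2, 0.5, -0.3, 0.2, 0.1, -0.4]

r = pearson_r(session_1, session_2)
print(round(r, 3), round(t_statistic(r, len(session_1)), 3))

# With many vertices, even a modest r gives a large t value
# (10002 vertices is a hypothetical count).
print(round(t_statistic(0.10, 10002), 2))
```

Because the t statistic grows with the square root of the number of vertices, a small correlation coefficient can still be highly significant when each ROI contains thousands of vertices.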
2.5.2. Region of Interest (ROI) Analysis
Regions of interest (ROIs) were bounded partly by retinotopic borders. Areas V1, V2 and V3 were defined for each subject based on her/his own retinotopic map (see above). Borders of the ‘thick’ stripes and columns within V2/V3 were defined based on an independent set of stimuli/scans (see above). Sites showing overlapping selectivity for both color and disparity were excluded from the ROIs. Subsequently, each stripe/column type was divided into three groups (central vs. dorsal vs. ventral) based on each subject’s retinotopic mapping (see above and (Engel, Glover, & Wandell, 1997; Sereno et al., 1995)). To improve sensitivity, data from the left and right hemispheres were averaged together, for each subject.
All statistical analyses were based on repeated measures ANOVA. Results were corrected for violation of the sphericity assumption (using the Greenhouse-Geisser method) whenever necessary.
3. Results
Our main goal was to study the representation of near and far binocular disparity in 6 human subjects. As a pre-requisite for this goal, we localized disparity-selective thick stripes/columns in these subjects in a separate set of scans, as described earlier (see Methods and (Nasr et al., 2016)). Panels 1a–c show activity maps evoked in response to the basic disparity-selective localizer (‘3D – 2D’ contrast) (Nasr et al., 2016) across deep, middle and superficial layers respectively.
Consistent with the expected columnar organization in V2 and V3 areas (Nasr et al., 2016) and despite spatial blurring in the superficial layers (De Martino et al., 2013; Nasr et al., 2016; Polimeni et al., 2010) stripes were evident across all three cortical depths. Panel 1d shows the level of correlation between activity radially sampled from the deep and superficial depths (i.e. within columns) of areas V2 and V3, in one subject. By contrast, panel 1e shows the level of correlation between activity sampled in two adjacent points in the cortical map with 1–3 mm distance from the deep layers (i.e. across columns). As expected from a columnar organization, the level of correlation was higher ‘within’ rather than ‘across’ columns (Nasr et al., 2016). Other subjects showed similar results (not illustrated here).
Figure 2 shows the activity map in one individual hemisphere, evoked by ‘far – near’ binocular disparity contrast in the portions of V2 and V3 that represent upper vs. lower visual fields (i.e. ventral vs. dorsal visual cortex respectively) (see Methods and (Engel et al., 1997; Sereno et al., 1995)). Consistent with a prior study in macaque monkeys (Chen et al., 2008), we found fine scale ‘near’ and ‘far’ disparity selective sites that were located within the ‘thick’ stripes (as defined independently by an overall selectivity to binocular disparity (see Methods)) in human V2. Here, we also found a similar organization of ‘near’ and ‘far’ disparity sensitivity in V3, consistent with previous results in macaque V3 (Adams & Zeki, 2001). In contrast to V2 and V3, we did not find any clear near- or far-selective clustering in area V1, at the current scanning resolution (Fig. 2).
Next, we compared the distribution of ‘near vs. far’ sites in the retinotopic representation of ‘lower vs. upper’ visual fields (Fig. 2), within areas V2 and V3. Consistent with the known under-and-over-estimated perception of object distance in the lower/upper visual fields (respectively), we found more near-selective disparity columns within the lower (compared to the upper) visual field representation in V2 and V3, while far-selective disparity columns were located more frequently in the upper field representation.
Figure 3a shows the level of activity evoked by near and far stimuli in V2 and V3 in the upper and lower visual field representations (i.e. regions of interest (ROIs)) (see Methods) measured relative to the zero disparity RDS, in all 12 hemispheres. Consistent with the activity map (Fig. 2) we found relatively stronger BOLD responses to near- and far-disparity stimuli within the lower and upper visual field representations, respectively. Application of a two-factor repeated measures ANOVA (‘stimulus-type’ (near vs. far) and ‘visual hemifield’ (upper vs. lower vs. central)) on the level of fMRI activity measured within deep cortical layers (see Methods) yielded a significant interaction between ‘stimulus-type’ × ‘visual hemifield’ effects on activity within V2 thick stripes (F(2, 10)=11.05, p<0.01) and V3 thick-type columns (F(2, 10)=8.33, p<0.01). In both areas, the main effects of ‘stimulus-type’ and ‘visual hemifield’ were not significant (F<0.99, p>0.4). As an important control comparison, we also found no significant bias for either near or far disparities in the representation of the central visual field (Fig. 4) within either V2 (t(5)=−0.56, p=0.60) or V3 (t(5)=0.65, p=0.55).
Since V2 and V3 disparity stripes are columnar in 3-D shape (Nasr et al., 2016; Tootell & Hamilton, 1989), one may expect a similar ‘near vs. far’ bias in the superficial layers. Despite the spatial blurring of fMRI activity in the superficial compared to deep layers (Polimeni et al., 2015), this same analysis of the level of fMRI activity measured within superficial layers yielded similar results: a significant interaction between ‘stimulus-type’ × ‘visual hemifield’ effects on the level of fMRI activity within V2 thick stripes (F(2, 10)=8.99, p<0.01) and V3 thick-type columns (F(2, 10)=6.97, p=0.01). Here again, we found no significant bias for either near or far disparities in the representation of the central visual field within the superficial layers of either V2 (t(5)= −1.67, p=0.16) or V3 (t(5)= −0.70, p=0.52).
One question is whether these biases in BOLD activity reflect a change in amplitude or surface area. Specifically, are the results produced by: 1) a change in amplitude of fMRI response to near and far RDS between upper and lower visual field representations, or 2) a change in amplitude of fMRI response driven by a few more localized sites within these ROIs? To address this question, we repeated our analysis, now measuring the ‘number of vertices’ in each ROI that showed a significant (p<0.05) bias for either ‘near’ or ‘far’ RDS (rather than the ‘level of activity’). The results are illustrated in Figure 3b. Application of a two-factor repeated measures ANOVA (‘stimulus-type’ (near vs. far) and ‘visual hemifield’ (upper vs. lower)) to the number of selective vertices in the deep cortical layer yielded a significant interaction between the effects of ‘stimulus-type’ × ‘visual hemifield’ in V2 (F(1, 5)=16.18, p=0.01) and V3 (F(1, 5)=12.54, p=0.02). Thus in both V2 and V3, vertices showing a significant bias for near and far disparities were more numerous within the lower and upper field representations, respectively. The main effects of ‘stimulus-type’ and ‘visual hemifield’ were not significant (F<0.80, p>0.4). Here again, application of this analysis to the number of vertices measured within the superficial layers yielded similar results: a significant interaction between the effects of ‘stimulus-type’ × ‘visual hemifield’ in V2 (F(1, 5)=15.94, p=0.01) and V3 (F(1, 5)=11.23, p=0.02). The central visual field was excluded from these analyses because the ‘far – near’ contrast did not evoke any selective response in this region (Fig. 4). Thus, our analysis suggests that the near vs. far bias (in upper vs. lower visual fields) is reflected in both the amplitude and distribution of neural activity.
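For a 2 × 2 within-subject design such as this one, the interaction term of a repeated measures ANOVA is equivalent to a one-sample t-test on each subject's ‘difference of differences’, with F(1, n − 1) = t². The sketch below illustrates that equivalence; the vertex counts are invented for six hypothetical subjects and are not the study's data.

```python
import math


def interaction_f_2x2(near_lower, far_lower, near_upper, far_upper):
    """Interaction F for a 2 x 2 repeated measures design.

    Each argument holds one value per subject. Returns (F, df1, df2).
    """
    n = len(near_lower)
    if not (n == len(far_lower) == len(near_upper) == len(far_upper)) or n < 2:
        raise ValueError("need the same number (>= 2) of subjects in every cell")
    # Per-subject interaction contrast: (near - far) in lower minus (near - far) in upper.
    contrast = [
        (nl - fl) - (nu - fu)
        for nl, fl, nu, fu in zip(near_lower, far_lower, near_upper, far_upper)
    ]
    mean = sum(contrast) / n
    var = sum((c - mean) ** 2 for c in contrast) / (n - 1)
    t = mean / math.sqrt(var / n)
    return t * t, 1, n - 1


# Invented vertex counts for six hypothetical subjects (illustration only).
near_lower = [210, 180, 250, 190, 230, 205]
far_lower = [150, 160, 170, 140, 175, 165]
near_upper = [140, 150, 165, 135, 160, 150]
far_upper = [200, 170, 240, 185, 215, 190]

f_value, df1, df2 = interaction_f_2x2(near_lower, far_lower, near_upper, far_upper)
print(f"F({df1}, {df2}) = {f_value:.2f}")
```

With six subjects the degrees of freedom are (1, 5), matching the form of the statistics reported for this analysis. The three-level analysis (upper vs. lower vs. central) does not reduce to a single contrast and would need a full repeated measures ANOVA with a sphericity correction.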
Next we tested the reproducibility of this effect. We compared the level of fMRI response (i.e. the signal change as a percentage) in individual vertices evoked by the ‘far – near’ contrast in independent scan sessions, which were acquired on different days (see Methods). In all subjects, activity evoked by the ‘far – near’ contrast was significantly correlated across scan sessions in both V2 (r>0.10, p<10⁻¹⁶) and V3 (r>0.12, p<10⁻¹⁴). This significant correlation indicated that the overall pattern of activity remained consistent across scan sessions, i.e. our findings were reproducible.
4. Discussion
Many results from animal studies have shown that cortical columns reflect important information processing steps in a given cortical area (Mountcastle, 1997; Tanaka, 1996, 2003). However evidence for such columns in human cortex is limited, partly due to the low spatial resolution of non-invasive neuroimaging techniques, e.g. conventional fMRI. A related challenge to studying columnar organization by using gradient echo BOLD (as used in most conventional fMRI studies) is the presence of diving vessels that extend radially throughout cortex (Duvernoy, Delon, & Vannson, 1983). These diving vessels may emphasize the radial similarity of BOLD activity maps across cortical layers (e.g. in Figure 1). Such concerns are discussed elsewhere (Cheng, 2011; Harel, Bolan, Turner, Ugurbil, & Yacoub, 2010; Huber et al., 2015; Nasr et al., 2016; Polimeni et al., 2010).
Despite such challenges, by taking advantage of higher field strength (7T) and related modifications, several fMRI studies have shown evidence for columns in striate cortex V1 (Cheng, Waggoner, & Tanaka, 2001; Yacoub, Harel, & Ugurbil, 2008; Yacoub, Shmuel, Logothetis, & Ugurbil, 2007), perhaps V3A (Goncalves et al., 2015), and visual area MT/V5 (Zimmermann et al., 2011). Understandably, such initial studies typically focused on benchmark demonstrations of a given set of columns based on fMRI, rather than investigating columnar function further.
Using high field fMRI, we recently showed evidence for interdigitating columnar organizations in the second and third visual cortical areas in humans (Nasr et al., 2016); one set of these columns is involved in disparity coding (Figure 1). Here we extended these previous findings by showing that the organization of these disparity columns matches a bias in the statistics of natural scenes, which may also reflect a psychophysical bias in human depth perception.
4.1. Supporting Evidence from Electrophysiological Studies in NHPs
Our findings are also supported by previous electrophysiological (Adams & Zeki, 2001; Tanabe et al., 2005) and optical recording (Chen et al., 2008) studies in macaque monkeys, which reported a bias for near stimuli in the representation of the lower visual field (i.e. the dorsal, more accessible portion) in areas V2, V3 and/or V4. For instance, one study reported a ratio of 61% in the number of patches that responded selectively to near compared with far stimuli across 6 tested monkeys (Chen et al., 2008). Interestingly, the level of this bias is consistent with the bias that we found in the level of response evoked by near stimuli relative to the far ones, in the lower visual field of V2 (55.4 ± 31%, mean ± S.D.; Fig. 3). However, those electrophysiological and optical studies did not sample near vs. far cells in the upper visual field representation, perhaps due to the technical difficulties of sampling from ventral (compared to dorsal) visual cortex.
4.2. Possible Link between Biased Perception and V2 Responses
Evidence for biased depth perception is not limited to depth underestimation and overestimation in lower and upper visual fields, respectively. For instance, human psychophysical studies using RDS have shown that humans perceive near stimuli in the lower visual field more rapidly than far ones. Conversely, far stimuli can be perceived faster in the upper visual field (Breitmeyer, Julesz, & Kropfl, 1975; B. Julesz, Breitmeyer, & Kropfl, 1976; but see also Manning, Finlay, Neill, & Frost, 1987).
Our results imply (but do not directly prove) a causal link between the biased depth perception in humans, and a corresponding activity bias in V2 and V3 for near/far stimuli presented in the lower/upper visual field. Such a link is supported by previous reports of V2 activity during depth perception, in macaque monkeys (Nienborg & Cumming, 2006, 2007). Those studies found a correlation between the activity of V2 disparity selective neurons and disparity discrimination in each subject. Interestingly, while disparity selective neurons can also be found in area V1 (Nienborg & Cumming, 2006; Prince, Pointon, Cumming, & Parker, 2002; Tsao, Conway, & Livingstone, 2003), studies have found no correlation between the V1 responses and individual subjects’ perceptual judgments (Nienborg & Cumming, 2006). Consistent with these results, we did not find any systematic bias for either near or far disparity in area V1 (Fig. 2).
In macaque monkeys, much less is known about binocular disparity processing in area V3, compared to what is known in V2. This situation may have arisen partly because macaque V3 is quite thin in the cortical map, and difficult to access in invasive experiments. Nevertheless, increasing evidence suggests an important role for V3 in binocular processing (Adams & Zeki, 2001; Felleman, Burkhalter, & Van Essen, 1997; Poggio & Fischer, 1977), and especially in humans (Nasr et al., 2016).
4.3. Speculations on the Role of Learning vs. Evolution
A broader question raised by these results is whether the V2/V3 activity bias develops gradually with increasing exposure to natural scenes (i.e. within each lifespan), or whether it is generated during evolution (i.e. across many lifespans). Although current evidence does not allow us to answer this question directly, a few studies in humans and macaques suggest that learning can influence subjects’ behavior along with neuronal responses. In macaques, training can influence the level of correlation between neuronal responses in V2 and an individual’s disparity discrimination (Nienborg & Cumming, 2007). In humans, the strength of the depth underestimation can be manipulated by using base-up prisms (Ooi et al., 2001), or by brief exposure to a virtual reality (Messing & Durgin, 2005). Thus, while these studies do not rule out possible long-term evolutionary mechanisms of (re)structuring the visual system, they do suggest that such evolutionary mechanisms may not be necessary to produce the bias found here.
5. Conclusion
These results suggest that the neural processing underlying depth perception in humans shows a significant bias, which is consistent with a bias in the statistics of natural scenes (Yang & Purves, 2003). Presumably, this adaptation improves the efficiency and sensitivity of neural coding by recruiting more neurons to encode more frequently encountered visual features (Olshausen & Field, 1996). Analogous cortical mechanisms may underlie the increased sensitivity for encoding objects with cardinal orientation (i.e. ‘carpentered environments’ or the ‘oblique effect’ (Furmanski & Engel, 2000; Nasr & Tootell, 2012; Orban & Kennedy, 1981)) and/or rectilinear shapes (Nasr, Echavarria, & Tootell, 2014).
Highlights.
Within human visual cortex, preferential responses to near vs. far stimuli are organized within thick stripes and thick-type columns in V2 and V3, respectively.
Consistent with a known perceptual bias which ‘exaggerates’ the perception of near and far distances in lower vs. upper fields (respectively), we found that ‘near’ and ‘far’ clusters are preferentially located in the retinotopic representation of the lower and upper visual fields (respectively).
Acknowledgments
This study was supported by Massachusetts General Hospital Executive Committee on Research (ECOR) fund Grant 2015A051305 to S.N. and R.B.H.T. Crucial support was also provided by the Martinos Center for Biomedical Imaging and National Institutes of Health (Grant 5P41-EB-015896-17), and by the help and cooperation of all of our subjects.
References
- Adams DL, Zeki S. Functional organization of macaque V3 for stereoscopic depth. J Neurophysiol. 2001;86(5):2195–2203. doi: 10.1152/jn.2001.86.5.2195.
- Anzai A, Chowdhury SA, DeAngelis GC. Coding of stereoscopic depth information in visual areas V3 and V3A. J Neurosci. 2011;31(28):10270–10282. doi: 10.1523/JNEUROSCI.5956-10.2011.
- Brainard DH. The Psychophysics Toolbox. Spat Vis. 1997;10(4):433–436.
- Breitmeyer B, Battaglia F, Bridge J. Existence and implications of a tilted binocular disparity space. Perception. 1977;6(2):161–164. doi: 10.1068/p060161.
- Breitmeyer B, Julesz B, Kropfl W. Dynamic random-dot stereograms reveal up-down anisotropy and left-right isotropy between cortical hemifields. Science. 1975;187(4173):269–270.
- Chen G, Lu HD, Roe AW. A map for horizontal disparity in monkey V2. Neuron. 2008;58(3):442–450. doi: 10.1016/j.neuron.2008.02.032.
- Cheng K. Recent progress in high-resolution functional MRI. Curr Opin Neurol. 2011;24(4):401–408. doi: 10.1097/WCO.0b013e3283489711.
- Cheng K, Waggoner RA, Tanaka K. Human ocular dominance columns as revealed by high-field functional magnetic resonance imaging. Neuron. 2001;32(2):359–374. doi: 10.1016/s0896-6273(01)00477-9.
- Dale AM, Fischl B, Sereno MI. Cortical surface-based analysis. I. Segmentation and surface reconstruction. Neuroimage. 1999;9(2):179–194. doi: 10.1006/nimg.1998.0395.
- De Martino F, Zimmermann J, Muckli L, Ugurbil K, Yacoub E, Goebel R. Cortical depth dependent functional responses in humans at 7T: improved specificity with 3D GRASE. PLoS One. 2013;8(3):e60514. doi: 10.1371/journal.pone.0060514.
- Duvernoy H, Delon S, Vannson JL. The vascularization of the human cerebellar cortex. Brain Res Bull. 1983;11(4):419–480. doi: 10.1016/0361-9230(83)90116-8.
- Engel SA, Glover GH, Wandell BA. Retinotopic organization in human visual cortex and the spatial precision of functional MRI. Cereb Cortex. 1997;7(2):181–192. doi: 10.1093/cercor/7.2.181.
- Felleman DJ, Burkhalter A, Van Essen DC. Cortical connections of areas V3 and VP of macaque monkey extrastriate visual cortex. J Comp Neurol. 1997;379(1):21–47. doi: 10.1002/(sici)1096-9861(19970303)379:1<21::aid-cne3>3.0.co;2-k. [DOI] [PubMed] [Google Scholar]
- Fischl B. FreeSurfer. Neuroimage. 2012;62(2):774–781. doi: 10.1016/j.neuroimage.2012.01.021. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fischl B, Salat DH, Busa E, Albert M, Dieterich M, Haselgrove C, … Dale AM. Whole brain segmentation: automated labeling of neuroanatomical structures in the human brain. Neuron. 2002;33(3):341–355. doi: 10.1016/s0896-6273(02)00569-x. [DOI] [PubMed] [Google Scholar]
- Fischl B, Sereno MI, Dale AM. Cortical surface-based analysis. II: Inflation, flattening, and a surface-based coordinate system. Neuroimage. 1999;9(2):195–207. doi: 10.1006/nimg.1998.0396.
- Friston KJ, Holmes AP, Price CJ, Buchel C, Worsley KJ. Multisubject fMRI studies and conjunction analyses. Neuroimage. 1999;10(4):385–396. doi: 10.1006/nimg.1999.0484.
- Furmanski CS, Engel SA. An oblique effect in human primary visual cortex. Nat Neurosci. 2000;3(6):535–536. doi: 10.1038/75702.
- Goncalves NR, Ban H, Sanchez-Panchuelo RM, Francis ST, Schluppeck D, Welchman AE. 7 tesla FMRI reveals systematic functional organization for binocular disparity in dorsal visual cortex. J Neurosci. 2015;35(7):3056–3072. doi: 10.1523/JNEUROSCI.3047-14.2015.
- Greve DN, Fischl B. Accurate and robust brain image alignment using boundary-based registration. Neuroimage. 2009;48(1):63–72. doi: 10.1016/j.neuroimage.2009.06.060.
- Harel N, Bolan PJ, Turner R, Ugurbil K, Yacoub E. Recent Advances in High-Resolution MR Application and Its Implications for Neurovascular Coupling Research. Front Neuroenergetics. 2010;2:130. doi: 10.3389/fnene.2010.00130.
- Huber L, Goense J, Kennerley AJ, Trampel R, Guidi M, Reimer E, … Moller HE. Cortical lamina-dependent blood volume changes in human brain at 7 T. Neuroimage. 2015;107:23–33. doi: 10.1016/j.neuroimage.2014.11.046.
- Julesz B. Foundations of cyclopean perception. Chicago: University of Chicago Press; 1971.
- Julesz B, Breitmeyer B, Kropfl W. Binocular-disparity-dependent upper-lower hemifield anisotropy and left-right hemifield isotropy as revealed by dynamic random-dot stereograms. Perception. 1976;5(2):129–141. doi: 10.1068/p050129.
- Keil B, Triantafyllou C, Hamm M, Wald LL. Design optimization of a 32-channel head coil at 7T. Proceedings of the 18th annual meeting of ISMRM; Stockholm, Sweden. 2010. p. 1493.
- Manning ML, Finlay DC, Neill RA, Frost BG. Detection threshold differences to crossed and uncrossed disparities. Vision Res. 1987;27(9):1683–1686. doi: 10.1016/0042-6989(87)90174-x.
- Messing R, Durgin FH. Distance perception and the visual horizon in head-mounted displays. ACM Transactions on Applied Perception (TAP). 2005;2(3):234–250.
- Minini L, Parker AJ, Bridge H. Neural modulation by binocular disparity greatest in human dorsal visual stream. J Neurophysiol. 2010;104(1):169–178. doi: 10.1152/jn.00790.2009.
- Mountcastle VB. The columnar organization of the neocortex. Brain. 1997;120(Pt 4):701–722. doi: 10.1093/brain/120.4.701.
- Nasr S, Echavarria CE, Tootell RB. Thinking outside the box: rectilinear shapes selectively activate scene-selective cortex. J Neurosci. 2014;34(20):6721–6735. doi: 10.1523/JNEUROSCI.4802-13.2014.
- Nasr S, Liu N, Devaney KJ, Yue X, Rajimehr R, Ungerleider LG, Tootell RB. Scene-selective cortical regions in human and nonhuman primates. J Neurosci. 2011;31(39):13771–13785. doi: 10.1523/JNEUROSCI.2792-11.2011.
- Nasr S, Polimeni JR, Tootell RB. Interdigitated Color- and Disparity-Selective Columns within Human Visual Cortical Areas V2 and V3. J Neurosci. 2016;36(6):1841–1857. doi: 10.1523/JNEUROSCI.3518-15.2016.
- Nasr S, Tootell RB. A cardinal orientation bias in scene-selective visual cortex. J Neurosci. 2012;32(43):14921–14926. doi: 10.1523/JNEUROSCI.2036-12.2012.
- Nienborg H, Cumming BG. Macaque V2 neurons, but not V1 neurons, show choice-related activity. J Neurosci. 2006;26(37):9567–9578. doi: 10.1523/JNEUROSCI.2256-06.2006.
- Nienborg H, Cumming BG. Psychophysically measured task strategy for disparity discrimination is reflected in V2 neurons. Nat Neurosci. 2007;10(12):1608–1614. doi: 10.1038/nn1991.
- Olman CA, Van de Moortele PF, Schumacher JF, Guy JR, Ugurbil K, Yacoub E. Retinotopic mapping with spin echo BOLD at 7T. Magn Reson Imaging. 2010;28(9):1258–1269. doi: 10.1016/j.mri.2010.06.001.
- Olshausen BA, Field DJ. Natural image statistics and efficient coding. Network. 1996;7(2):333–339. doi: 10.1088/0954-898X/7/2/014.
- Ooi TL, Wu B, He ZJ. Distance determined by the angular declination below the horizon. Nature. 2001;414(6860):197–200. doi: 10.1038/35102562.
- Orban GA, Kennedy H. The influence of eccentricity on receptive field types and orientation selectivity in areas 17 and 18 of the cat. Brain Res. 1981;208(1):203–208. doi: 10.1016/0006-8993(81)90633-8.
- Pelli DG. The VideoToolbox software for visual psychophysics: transforming numbers into movies. Spat Vis. 1997;10(4):437–442.
- Philbeck JW, Loomis JM. Comparison of two indicators of perceived egocentric distance under full-cue and reduced-cue conditions. J Exp Psychol Hum Percept Perform. 1997;23(1):72–85. doi: 10.1037//0096-1523.23.1.72.
- Poggio GF, Fischer B. Binocular interaction and depth sensitivity in striate and prestriate cortex of behaving rhesus monkey. J Neurophysiol. 1977;40(6):1392–1405. doi: 10.1152/jn.1977.40.6.1392.
- Polimeni JR, Bhat H, Witzel T, Benner T, Feiweier T, Inati SJ, … Wald LL. Reducing sensitivity losses due to respiration and motion in accelerated echo planar imaging by reordering the autocalibration data acquisition. Magn Reson Med. 2015. doi: 10.1002/mrm.25628.
- Polimeni JR, Fischl B, Greve DN, Wald LL. Laminar analysis of 7T BOLD using an imposed spatial activation pattern in human V1. Neuroimage. 2010;52(4):1334–1346. doi: 10.1016/j.neuroimage.2010.05.005.
- Previc FH. Functional Specialization in the lower and upper visual fields in humans: Its ecological origins and neurophysiological implications. Behavioral and Brain Sciences. 1990;13:519–575.
- Prince SJ, Pointon AD, Cumming BG, Parker AJ. Quantitative analysis of the responses of V1 neurons to horizontal disparity in dynamic random-dot stereograms. J Neurophysiol. 2002;87(1):191–208. doi: 10.1152/jn.00465.2000.
- Sereno MI, Dale AM, Reppas JB, Kwong KK, Belliveau JW, Brady TJ, … Tootell RB. Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science. 1995;268(5212):889–893. doi: 10.1126/science.7754376.
- Tanabe S, Doi T, Umeda K, Fujita I. Disparity-tuning characteristics of neuronal responses to dynamic random-dot stereograms in macaque visual area V4. J Neurophysiol. 2005;94(4):2683–2699. doi: 10.1152/jn.00319.2005.
- Tanaka K. Representation of Visual Features of Objects in the Inferotemporal Cortex. Neural Netw. 1996;9(8):1459–1475. doi: 10.1016/s0893-6080(96)00045-7.
- Tanaka K. Columns for complex visual object features in the inferotemporal cortex: clustering of cells with similar but slightly different stimulus selectivities. Cereb Cortex. 2003;13(1):90–99. doi: 10.1093/cercor/13.1.90.
- Tootell RB, Hamilton SL. Functional anatomy of the second visual area (V2) in the macaque. J Neurosci. 1989;9(8):2620–2644. doi: 10.1523/JNEUROSCI.09-08-02620.1989.
- Tsao DY, Conway BR, Livingstone MS. Receptive fields of disparity-tuned simple cells in macaque V1. Neuron. 2003;38(1):103–114. doi: 10.1016/s0896-6273(03)00150-8.
- Tsao DY, Vanduffel W, Sasaki Y, Fize D, Knutsen TA, Mandeville JB, … Tootell RB. Stereopsis activates V3A and caudal intraparietal areas in macaques and humans. Neuron. 2003;39(3):555–568. doi: 10.1016/s0896-6273(03)00459-8.
- Wallach H, O’Leary A. Slope of regard as a distance cue. Percept Psychophys. 1982;31(2):145–148. doi: 10.3758/bf03206214.
- Yacoub E, Harel N, Ugurbil K. High-field fMRI unveils orientation columns in humans. Proc Natl Acad Sci U S A. 2008;105(30):10607–10612. doi: 10.1073/pnas.0804110105.
- Yacoub E, Shmuel A, Logothetis N, Ugurbil K. Robust detection of ocular dominance columns in humans using Hahn Spin Echo BOLD functional MRI at 7 Tesla. Neuroimage. 2007;37(4):1161–1177. doi: 10.1016/j.neuroimage.2007.05.020.
- Yang Z, Purves D. A statistical explanation of visual space. Nat Neurosci. 2003;6(6):632–640. doi: 10.1038/nn1059.
- Zimmermann J, Goebel R, De Martino F, van de Moortele PF, Feinberg D, Adriany G, … Yacoub E. Mapping the organization of axis of motion selective features in human area MT using high-field fMRI. PLoS One. 2011;6(12):e28716. doi: 10.1371/journal.pone.0028716.