Abstract
Key points
Just as a portrait painting can come from a collection of coarse and fine details, natural vision can be decomposed into coarse and fine components.
Previous studies have shown that the early visual areas in the brain represent these components in a map‐like fashion.
Other studies have shown that these same visual areas can be sensitive to how coarse and fine features line up in space.
We found that the brain actually jointly represents both the scale of the feature (fine, medium, or coarse) and the alignment of these features in space.
The results suggest that the visual cortex has an optimized representation particularly for the alignment of fine details, which are crucial in understanding the visual scene.
Abstract
Complex natural scenes can be decomposed into their oriented spatial frequency (SF) and phase relationships, both of which are represented locally at the earliest stages of cortical visual processing. The SF preference map in the human cortex, obtained using synthetic stimuli, is orderly and correlates strongly with eccentricity. In addition, early visual areas show sensitivity to the phase information that describes the relationship between SFs and thereby dictates the structure of the image. Taken together, two possibilities arise for the joint representation of SF and phase: either the entirety of the cortical SF map is uniformly sensitive to phase, or a particular set of SFs is selectively phase sensitive – for example, greater phase sensitivity for higher SFs that define fine‐scale edges in a complex scene. To test between these two possibilities, we constructed a novel continuous natural scene video whereby phase information was maintained in one SF band but scrambled elsewhere. By shifting the central frequency of the phase‐aligned band in time, we mapped the phase‐sensitive SF preference of the visual cortex. Using functional magnetic resonance imaging, we found that phase sensitivity in early visual areas is biased toward higher SFs. Compared to an SF map of the same scene obtained using linear‐filtered stimuli, a much larger patch of areas V1 and V2 is sensitive to the phase alignment of higher SFs. The results in early areas cannot be explained by attention. Our results suggest non‐uniform sensitivity to phase alignment in population‐level SF representations, with phase alignment being particularly important for fine‐scale edge representations of natural scenes.
Keywords: edges, efficient coding, fmri, natural scene, phase, phase coherence, spatial frequency, visual cortex
Abbreviations
- 3‐D: three dimensional
- cpd: cycles per degree
- fMRI: functional magnetic resonance imaging
- LO: lateral occipital cortex
- RMS contrast: root mean squared contrast
- ROI: region of interest
- SF: spatial frequency
- T1: longitudinal relaxation
- TE: echo time
- TI: time of inversion
- TR: time of repeat
- V1: primary visual cortex
- V2: secondary visual cortex
- V3: tertiary visual cortex
Introduction
Complex natural scenes can be decomposed into their oriented spatial frequency and phase relationships, both of which are represented locally at the earliest stages of cortical visual processing (Sasaki et al. 2001; Felsen et al. 2005; Henriksson et al. 2009). Spatial frequency maps in the human visual cortex vary smoothly and correlate with eccentricity representations in early visual cortex (Sasaki et al. 2001; Henriksson et al. 2008; Hess et al. 2009). Natural scenes consist of a large number of different spatial frequency components at different orientations, with structure imposed by their complex phase relationship (Thomson, 2001). Sensitivity to phase structure appears to be uniform across all human visual areas – every visual area as a whole appears to distinguish between images with identical amplitude spectra but scrambled phase spectra (Felsen et al. 2005; Perna et al. 2008; Henriksson et al. 2009; Hansen et al. 2011, 2012; Castaldi et al. 2013), though primary visual cortex (V1) may be less sensitive to phase (Freeman et al. 2013).
The juxtaposition of these two sets of findings – that spatial frequency maps vary smoothly and correlate with eccentricity and that the retinotopic cortex as a whole is sensitive to phase information – leads to the prediction that phase sensitivity is constant across spatial frequency representations. Such a perspective would be in line with a complete representation of spatial frequency and phase across the visual field (De Valois & De Valois, 1980). Alternatively, phase sensitivity may be specifically important for just a band of spatial frequencies, particularly frequencies critical for edges in natural scenes. This would be in agreement with a sparse‐coding view of early visual cortex, with a strong bias towards representation of edges in complex scenes (Olshausen & Field, 1996). Using novel band‐stop phase‐scrambled stimuli, we sought to test between these two predictions.
We modulated the phase alignment of a natural scene video while keeping the power spectrum of each frame constant. Using band‐stop scrambling in the phase spectrum, we maintained the phase alignment of a band of spatial frequencies while scrambling the phases in the other frequencies. At consecutive time points, the phase‐aligned band was shifted in spatial frequency. The resulting stimulus encodes bands of phase‐aligned spatial frequencies. By measuring the cortical response to each band, we were able to construct a map of phase‐sensitive spatial frequency representations. Surprisingly, we found that in this map, the representation for higher spatial frequencies is exaggerated compared to a map generated from linear‐filtered stimuli. We also found that this bias towards higher spatial frequencies disappears beyond secondary visual cortex (V2), highlighting the importance of local phase in the initial coding of fine‐scale edges of natural scenes.
Methods
Subjects
Five subjects (2 females, mean age: 31 years, range: 21–60 years) participated in this study, three of them being naive to its purpose. All observers had normal or corrected‐to‐normal visual acuity. The experimental procedures were performed with the informed consent of the subjects and were approved by the Ethics Review Board of the Montreal Neurological Institute, consistent with the Declaration of Helsinki.
Visual stimuli
Using a high definition (HD) video camcorder (Sony DCR‐SR1), we recorded a continuous video while walking on the footpath on Mont Royal (Montreal, Canada), to capture natural scene conditions. The scene included foliage, people and animals, as well as depth from motion and looming. The video was recorded in 1080i resolution and in colour, with minimal compression using the Sony proprietary AVCHD algorithm. The video was then transferred to a computer and decompressed using Apple's iMovie software. The uncompressed video was then resized with bi‐cubic interpolation, cropped to 512 × 512 pixels, and converted to 8‐bit greyscale for further processing in MATLAB (Mathworks, Natick, MA, USA) using the Signal Processing and Image Processing Toolboxes. Subsequent to processing, the images were never recompressed in order to maintain fidelity of the spatial frequency filters.
We generated band‐stop phase‐scrambled stimuli by applying a sliding ideal ‘preservation’ filter to different target spatial frequency bands of each video frame's phase spectra (in polar coordinates). The filter leaves phase angles within its pass‐band unaltered and adds, to the phase angles outside its pass‐band, random angles in the range [−π, π] obtained by Fourier transforming a noise image (random values from −1 to 1 sampled from a uniform distribution) of the same dimensions (e.g. Hansen & Hess, 2007). Thus, the phase spectrum for components outside of the filter's pass‐band was the sum of the original phase spectrum and a random‐value matrix, thereby ensuring conjugate symmetry of the entire phase spectrum. We used a single random‐value matrix for the entire movie, which maintained temporal phase offsets between components. The amplitude spectrum for each frame was replaced with an isotropic spectrum with the amplitude fall‐off typical of natural scenes (e.g. Hansen & Hess, 2006). Lastly, each frame was assigned a fixed mean luminance of 128 (pixel greyscale value), and root mean square (RMS) contrast (RMS = 0.15). A sample video of the main condition, albeit with compression, is available for viewing (see Video S1 in the online Supporting information).
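The band‐stop phase scrambling described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the original MATLAB code: the function names, the band limits expressed in cycles per image, the amplitude exponent (`alpha = 1.0`, i.e. a 1/f fall‐off) and the definition of RMS contrast as standard deviation divided by mean luminance are all choices made for the example.

```python
import numpy as np

def radial_frequency(n):
    """Radial spatial frequency (cycles per image) on an n x n FFT grid."""
    f = np.fft.fftfreq(n) * n
    fx, fy = np.meshgrid(f, f)
    return np.hypot(fx, fy)

def random_phase_matrix(n, rng):
    """Random angles in [-pi, pi] taken from the FFT of a uniform noise image.
    The noise is real-valued, so these angles are odd-symmetric and adding
    them to a phase spectrum keeps the spectrum conjugate-symmetric."""
    noise = rng.uniform(-1.0, 1.0, size=(n, n))
    return np.angle(np.fft.fft2(noise))

def band_stop_phase_scramble(frame, lo, hi, rand_phase,
                             alpha=1.0, mean_lum=128.0, rms=0.15):
    """Keep the phase inside [lo, hi] cycles/image, scramble it elsewhere.
    alpha, and RMS defined as std/mean, are assumptions of this sketch."""
    n = frame.shape[0]
    r = radial_frequency(n)
    phase = np.angle(np.fft.fft2(frame))
    keep = (r >= lo) & (r <= hi)                  # ideal 'preservation' band
    new_phase = np.where(keep, phase, phase + rand_phase)
    amp = np.zeros_like(r)
    amp[r > 0] = r[r > 0] ** (-alpha)             # isotropic fall-off, zero DC
    img = np.real(np.fft.ifft2(amp * np.exp(1j * new_phase)))
    img = img / img.std()                         # unit standard deviation
    return mean_lum * (1.0 + rms * img)           # fixed mean and RMS contrast
```

Passing the same `rand_phase` matrix for every frame mirrors the use of a single random‐value matrix for the entire movie.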
The identical algorithm and parameters used for the main stimuli were used to generate the control stimuli, with the only exception that the phase spectrum of the individual movie frames was fully randomized and then subjected to the same pipeline as the band‐stop phase‐scrambled movie. This approach controlled for several critical aspects of the stimulus, namely the dynamics of the natural scene, as well as the sudden changes in the images corresponding to shifts of the filter at each 2‐s step.
An attention control condition used the exact same stimuli as the main condition, but the fixation dot was replaced with a demanding attention task adopted from Chaudhuri (1990). Ten symbols alternated at a rate of 4 Hz and subjects were asked to detect the presence of the diamond shape. The order of the 10 symbols was continuously randomized during the duration of the video, and thus the presence of the diamond shape was unpredictable.
Finally, to map spatial frequency selectivity independent of phase alignment, we generated a third movie that utilized standard linear filtering. We first fully randomized the phase spectrum of each movie frame of the entire video, and then applied a band‐pass linear filter with the same filter size and centre frequency as in the main stimulus, with the power outside of the band set to zero. This manipulation is analogous to drawing, at each step, a large number of oriented sinusoidal patterns with random phases, while maintaining the same overall spatial phase shifts over time as the main condition.
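For contrast with the phase‐domain manipulation, the linear‐filtered condition can be sketched as follows. This is an illustrative approximation only: the function name is hypothetical, band limits are given in cycles per image, ‘fully randomized’ is assumed to mean adding a random phase matrix to every component, and the frame's own amplitude spectrum is retained because the text does not state which amplitude spectrum was used for this condition.

```python
import numpy as np

def band_pass_scrambled(frame, lo, hi, rand_phase):
    """Scramble every phase, then zero all power outside [lo, hi] cycles/image.
    rand_phase should come from the FFT of a real noise image so that the
    result stays real-valued (an assumption of this sketch)."""
    n = frame.shape[0]
    f = np.fft.fftfreq(n) * n
    fx, fy = np.meshgrid(f, f)
    r = np.hypot(fx, fy)                              # radial frequency grid
    spec = np.fft.fft2(frame)
    scrambled = np.abs(spec) * np.exp(1j * (np.angle(spec) + rand_phase))
    scrambled[(r < lo) | (r > hi)] = 0.0              # ideal band-pass filter
    return np.real(np.fft.ifft2(scrambled))
```

Unlike the band‐stop phase‐scrambled stimulus, the output here contains energy only in the selected band, so any map it produces reflects spatial frequency content alone.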
All stimulus conditions are illustrated in Fig. 1 for the test image ‘Lena’. The top row contains the band‐stop phase‐scrambled versions where the peak frequency band in which the phase alignment is maintained moves from low to high. The middle row depicts the control stimuli, with identical filter shifts to the band‐stop phase‐scrambled stimuli, but on a fully phase‐scrambled movie (single random‐value matrix for the entire movie). In the bottom row the phase‐scrambled linear‐filtered versions are shown with the same peak frequency of the pass‐band incrementing from low to high frequencies. The video depicting the main manipulation, albeit with compression, can be viewed online (see Video S1 in the Supporting information).
We utilized a phase‐encoded functional magnetic resonance imaging (fMRI) design, akin to the standard retinotopic mapping protocol. Phase‐encoded designs are efficient, producing robust results in relatively short scan durations (Engel, 2012). The stimulation protocol we employed had 10 steps that either ascended or descended the spatial frequency domain, with each step being 2 s long. The stop‐band shifted linearly from central frequencies of 1.4–5.5 cycles per degree (cpd) with a bandwidth of 1.4 cpd (full‐width at half height). In each run, subjects viewed 13 cycles of the stimuli in either increasing or decreasing spatial frequency steps. We repeated each stimulus condition 3 times (i.e. 3 runs forward, 3 runs reverse). Thus each subject completed 18 runs, with 6 runs per condition (band‐stop phase‐scrambling, control, and linear filter stimuli).
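The timing of this protocol can be written out explicitly. The sketch below assumes that the 10 centre frequencies are evenly spaced and include both 1.4 and 5.5 cpd, and that the 1.4 cpd bandwidth of the ideal filter extends 0.7 cpd to either side of the centre; the variable names are illustrative.

```python
import numpy as np

N_STEPS = 10        # steps per cycle (from the text)
STEP_S = 2.0        # duration of each step in seconds (from the text)
N_CYCLES = 13       # cycles per run (from the text)
BANDWIDTH_CPD = 1.4

# Assumed: linear spacing with both end points included.
centres_cpd = np.linspace(1.4, 5.5, N_STEPS)
bands_cpd = [(c - BANDWIDTH_CPD / 2, c + BANDWIDTH_CPD / 2) for c in centres_cpd]

cycle_s = N_STEPS * STEP_S      # duration of one sweep through the bands
run_s = N_CYCLES * cycle_s      # stimulation time per run

# Reverse runs present the same bands in descending order.
reverse_centres_cpd = centres_cpd[::-1]
```

Under these assumptions one cycle lasts 20 s and a run contains 260 s of stimulation, which corresponds to 130 volumes at a TR of 2 s and agrees with the 140 images acquired per run once the first 10 are discarded.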
Display
The uncompressed greyscale videos were back‐projected onto a translucent screen mounted at the bore of the MRI scanner using a NEC VT580 projector (with a resolution of 1024 × 768 pixels and a 60 Hz refresh rate). The images, displayed with a linearized luminance profile (linear gamma) and viewed at a distance of 140 cm, subtended 11 deg vertically and horizontally. The observers were instructed to view the movies while maintaining their gaze fixed on a grey fixation cross in the main condition (0.4 deg wide) or on the array of 10 symbols changing at a rate of 4 Hz while having to detect the diamond symbol (attention condition).
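The viewing geometry links the stimulus frequencies in cycles per degree to frequencies in the image. The following sketch works through that conversion; it assumes that the 512‐pixel frame spans the full 11 deg, which the text does not state explicitly, and the helper names are hypothetical.

```python
import math

VIEW_DIST_CM = 140.0   # viewing distance (from the text)
FIELD_DEG = 11.0       # stimulus extent in degrees (from the text)
IMAGE_PX = 512         # frame width in pixels; assumed to span all 11 deg

def extent_cm(angle_deg, distance_cm):
    """Physical size on the screen of a centred stimulus of given visual angle."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg / 2.0))

def cpd_to_cycles_per_image(cpd, field_deg=FIELD_DEG):
    """Convert cycles per degree to cycles across the whole image."""
    return cpd * field_deg

size_cm = extent_cm(FIELD_DEG, VIEW_DIST_CM)   # about 27 cm
px_per_deg = IMAGE_PX / FIELD_DEG              # about 46.5 pixels per degree
nyquist_cpd = px_per_deg / 2.0                 # upper limit of representable SF
```

Under this assumption the 1.4 to 5.5 cpd range of centre frequencies corresponds to roughly 15 to 60 cycles per image, well below the sampling limit of the frame.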
Data acquisition
The magnetic resonance images were acquired on a Siemens Trio 3T MRI with a 32 channel whole‐head coil. Head position was fixed using foam cushions.
T2*‐weighted gradient echo‐echo planar imaging (GE‐EPI) images with 2 mm isotropic voxel resolution (time of repeat/echo time, TR/TE = 2000 ms/30 ms; flip angle = 76 deg; slice number = 35 with no gap; slice thickness = 2 mm) were acquired with a 100 × 100 acquisition matrix, a 200 mm × 200 mm rectangular field of view and Generalized Autocalibrating Partially Parallel Acquisitions (GRAPPA; acceleration factor PE = 3, reference lines = 24). The slices were horizontally oriented and covered the entire occipital and temporal lobes. In each run, we acquired 140 images. The 18 fMRI runs were collected in three separate sessions, with each session lasting approximately 2 h. Thus in total we acquired 12,600 images across 90 runs in five subjects.
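The acquisition figures quoted above are mutually consistent, as the short calculation below illustrates. All values are taken from the text; only the variable names are introduced here.

```python
TR_S = 2.0                     # repetition time in seconds
IMAGES_PER_RUN = 140
RUNS_PER_SUBJECT = 18
N_SUBJECTS = 5
FOV_MM, MATRIX = 200.0, 100    # field of view and acquisition matrix
N_SLICES, SLICE_MM = 35, 2.0

in_plane_mm = FOV_MM / MATRIX                   # in-plane voxel size
slab_mm = N_SLICES * SLICE_MM                   # coverage along the slice axis
run_s = IMAGES_PER_RUN * TR_S                   # duration of one run
total_runs = RUNS_PER_SUBJECT * N_SUBJECTS
total_images = IMAGES_PER_RUN * total_runs
```

The in‐plane voxel size evaluates to 2 mm, matching the isotropic resolution, and the totals reproduce the 90 runs and 12,600 images reported.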
For anatomical registration and surface‐based reconstruction, we acquired two to three high resolution longitudinal relaxation (T1)‐weighted anatomical MR images covering the entire brain prior to the functional scans (3D‐MPRAGE sequence, TR/TE = 2300 ms/2.98 ms, time of inversion TI = 900 ms, 176 sagittally oriented slices, slice thickness = 1 mm, 256 × 240 acquisition matrix).
Finally, standard retinotopic mapping (e.g. rotating wedges, expanding rings; Sereno et al. 1995) was used for the identification of cortical visual areas in a separate session.
Data analysis
Processing of the anatomical images
Volumetric segmentation was performed with the Freesurfer image analysis suite (http://surfer.nmr.mgh.harvard.edu/). This processing includes intensity normalization of the multiple individual T1‐weighted images, motion correction and averaging of the images and cortical segmentation (Dale et al. 1999; Fischl & Dale, 2000). The resulting T1‐weighted image and its corresponding segmented image were then further processed in Brainvoyager QX (version 1.26; Brain Innovation, Maastricht, The Netherlands). Both images were normalized to Talairach stereotaxic space (Talairach & Tournoux, 1988) and the segmented image was used to create the two hemispherical cortical surfaces. Both surfaces were successively smoothed, inflated and flattened.
Preprocessing of functional images
The fMRI data were preprocessed using Brainvoyager. The first 10 fMRI measurements of each functional run were discarded. The functional data underwent a series of preprocessing steps including slice‐time correction, three‐dimensional motion correction, linear trend removal, and high‐pass filtering (removing frequencies lower than 6 cycles per session). Finally, the images were co‐registered to the reference anatomy image and normalized to Talairach stereotaxic space (Talairach & Tournoux, 1988), and smoothed with a 4 mm full‐width at half maximum (FWHM) Gaussian kernel.
Statistical analysis
The fMRI data were analysed using Brainvoyager QX and its MATLAB toolbox (BVQXtools, v0.8; Brain Innovation). We analysed our data in the exact same manner as phase‐encoded designs used for retinotopic mapping (Engel et al. 1994; Sereno et al. 1995). Cross‐correlation analysis was used to identify the time point (lag) at which a region responds maximally in order to map the voxel sensitivity in the visual cortex (Linden et al. 1999; Goebel et al. 2001). We took the lag at maximal correlation as indicative of the preferred stimulus lag of that voxel. The obtained lag values at each voxel, corresponding to the preferred stimulation, were encoded in pseudocolours on surface patches (triangles) of the reconstructed cortical sheet. Pixels were included in the statistical map if the obtained cross‐correlation value r was greater than or equal to 0.22 (P < 0.001, uncorrected). The individual maps of the three runs in the reverse direction were corrected for lag ordering (by flipping the polarity of the map) and averaged with the three runs in the forward direction. Figure 2 depicts the analysis pipeline for generation of the smooth surface maps and estimates of preferred lag in each visual area.
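The logic of the lag‐based cross‐correlation analysis can be illustrated with a small NumPy sketch. It is not the Brainvoyager implementation: the gamma‐shaped haemodynamic response, the function names and the way reverse‐run lags are flipped are assumptions made for the example, while the 10 steps, the 2 s TR, the 130 retained volumes and the r ≥ 0.22 threshold come from the text.

```python
import numpy as np

def hrf(t):
    """Assumed gamma-shaped haemodynamic response peaking near 5 s."""
    return (t ** 5) * np.exp(-t) / 120.0

def reference(lag, n_vols=130, n_steps=10, tr=2.0):
    """Predicted time course for a voxel that prefers step `lag`:
    a boxcar that is on during that step of every cycle, convolved with the HRF.
    One volume per step is assumed (step duration equals the TR)."""
    box = (np.arange(n_vols) % n_steps == lag).astype(float)
    kernel = hrf(np.arange(0.0, 30.0, tr))
    return np.convolve(box, kernel)[:n_vols]

def preferred_lag(ts, n_steps=10, tr=2.0, r_min=0.22):
    """Return (lag, r) for the best-correlated reference, or (None, r)
    when the peak correlation falls below the inclusion threshold."""
    r = np.array([np.corrcoef(ts, reference(l, len(ts), n_steps, tr))[0, 1]
                  for l in range(n_steps)])
    best = int(np.argmax(r))
    return (best, r[best]) if r[best] >= r_min else (None, r[best])

def flip_reverse_lag(lag, n_steps=10):
    """Map a lag from a descending run onto the ascending-run ordering
    (one plausible reading of 'flipping the polarity of the map')."""
    return n_steps - 1 - lag
```

Each voxel's lag index can then be translated into the centre frequency of the corresponding step to obtain its preferred spatial frequency.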
To visualize the group‐average maps, we utilized the curvature‐driven cortex‐based alignment (CBA) procedure (Frost & Goebel, 2012) in order to minimize spatial smoothing and anatomical variability across subjects.
Region of interest (ROI) analysis
Cortical visual field maps were identified for every subject using phase encoding mapping (Engel et al. 1994; Sereno et al. 1995). The borders of the visual areas V1, V2, the tertiary visual cortex (V3) and lateral occipital cortex (LO) were identified by combining eccentricity and polar‐angle phase maps (Sereno et al. 1995; Wandell et al. 2007). In the current article, LO corresponds to the combination of LO‐1 and LO‐2 (Larsson & Heeger, 2006). The ROI boundaries were drawn on the individual inflated surfaces and were then projected to the corresponding volume to create individual subject volume ROIs. Inside the ROIs we calculated the number of voxels showing a preference for each lag for each subject. To quantify the peak SF for each subject, condition and visual area, we fitted a Gaussian model to the SF preference data defined as
where ‘Amplitude’ represents the height of the SF preference peak, ‘PeakSF’ represents the position of the peak in the SF domain and ‘Bandwidth’ represents the spread of the bandwidth preference.
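A Gaussian with these three parameters can be fitted to voxel counts per SF step as sketched below. The exact parameterization is not reproduced in this text, so the form used here, Amplitude × exp(−(SF − PeakSF)² / (2 × Bandwidth²)), is an assumption in which ‘Bandwidth’ plays the role of the standard deviation. The grid‐search fitting routine and its search ranges are likewise illustrative and are not the fitting procedure of the study.

```python
import numpy as np

def gaussian(sf, amplitude, peak_sf, bandwidth):
    """Assumed tuning model: Bandwidth is treated as the Gaussian sigma."""
    return amplitude * np.exp(-((sf - peak_sf) ** 2) / (2.0 * bandwidth ** 2))

def fit_gaussian(sf, counts):
    """Least-squares fit by grid search over PeakSF and Bandwidth.
    For each candidate pair the best Amplitude has a closed-form solution."""
    sf = np.asarray(sf, dtype=float)
    counts = np.asarray(counts, dtype=float)
    best = None
    for peak in np.linspace(sf.min(), sf.max(), 201):
        for bw in np.linspace(0.1, 5.0, 99):      # hypothetical search range (cpd)
            g = gaussian(sf, 1.0, peak, bw)
            amp = (g @ counts) / (g @ g)          # optimal amplitude for this shape
            err = np.sum((counts - amp * g) ** 2)
            if best is None or err < best[0]:
                best = (err, amp, peak, bw)
    return best[1], best[2], best[3]              # Amplitude, PeakSF, Bandwidth
```

Here `sf` would hold the 10 centre frequencies and `counts` the number of voxels in an ROI preferring each one; the returned PeakSF is the quantity compared across conditions.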
Results
We mapped population‐level spatial frequency representations that were jointly phase sensitive and compared the results with maps of spatial frequency obtained through linear filtering of the same images that were phase scrambled.
In each 2 s time step, the phase alignment of one spatial frequency band was kept constant while the phases outside of the band were scrambled – essentially a band‐stop scrambling in the phase spectrum. Throughout an fMRI run, we shifted the centre frequency of the band in 10 time steps (i.e. 10 centre frequencies) while measuring fMRI signal, and analysed the results in terms of peak correlation to the stimulus convolved by the haemodynamic response – in effect, identical to the standard retinotopic mapping method (Fig. 2). To exclude the effect of attention to one or more lags, participants carried out a demanding central attention task adopted from Chaudhuri (1990).
If sensitivity to phase alignment is uniform across the range of spatial frequencies, we would expect our band‐stop phase‐scrambling stimulation to result in a map identical to the linear filter map. In other words, we would expect no modification to the standard spatial frequency map. On the other hand, an efficient coding strategy of fine edges in a scene would predict non‐uniform sensitivity of phase alignment. This second view predicts that the band‐stopped phase‐scrambled stimuli would result in an orderly map, but one different in its emphasis of particular spatial frequencies.
Early visual areas
Figure 3 shows the surface‐based group average of five subjects, while the individual subject maps are depicted in Fig. 4. The band‐stop phase‐scrambled movie resulted in an orderly map of spatial frequencies across V1, V2 and V3 (Fig. 3A). The spatial frequency preference of each area is quantitatively depicted in Fig. 5.
The orderly map of phase‐sensitive spatial frequency differs from that of a standard spatial frequency map. Figure 3C depicts the spatial frequency map obtained from linearly filtering the phase‐scrambled movie (i.e. a standard spatial frequency map). This control stimulus has the same dynamic properties as the main movie and therefore the map can only be attributed to the spatial frequency filtering. The ordered map we obtained from linear filtering closely matches those previously reported (Sasaki et al. 2001; Hess et al. 2009) but, critically, this map differs from our map of Fig. 3A which encodes phase‐sensitive frequency bands. The shift in the spatial frequency preference for V1 is evident in the preference plots shown in Fig. 5, which depict a substantial shift in frequency selectivity under the phase‐aligned as opposed to phase‐scrambled condition.
To quantify the peak SF shifts visualized in Fig. 5, we fitted a Gaussian tuning model to the SF preference data for each subject, condition and area. The fitted mean peak SFs for each condition and area are depicted in Fig. 6. Area V1 exhibited a selectivity for higher SF bands when stimulated by the main stimuli rather than the linear filter stimuli (t(4) = 11.08, P < 0.0001), and peak SF to the main stimuli was unchanged with divided attention (t(4) = 2.16, P = 0.1). The same pattern was observed in area V2, where the main stimuli elicited a higher peak SF than the linear filter condition (t(4) = 4.825; P = 0.008), and peak SF for the main condition was unchanged by divided attention (t(4) = 2.466; P = 0.07). While for area V3 the main stimuli did still elicit a higher peak SF than for linear‐filtered stimuli (t(4) = 3.937, P < 0.02), the peak SF was significantly reduced by divided attention (t(4) = 3.752; P < 0.02). In area LO, there were no differences in the peak SF between the three conditions. These results are in line with the notion that invariant/tolerant representations achieved in high‐level areas such as LO arise from the integration across multiple spatial frequencies (e.g. Riesenhuber & Poggio, 2000).
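Statistics of the form t(4) with five subjects are what a within‐subject (paired) comparison of fitted peak SFs would produce. The sketch below shows such a test under that assumption; the function names are hypothetical, any input values are placeholders rather than data from the study, and the p‐value uses the closed‐form Student's t distribution for 4 degrees of freedom.

```python
import math

def paired_t(a, b):
    """Paired t statistic and degrees of freedom for two matched samples."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / (n - 1)   # sample variance of differences
    return mean / math.sqrt(var / n), n - 1

def two_tailed_p_df4(t):
    """Two-tailed p-value for Student's t with exactly 4 degrees of freedom."""
    denom = 1.0 + t * t / 4.0
    u = abs(t) / math.sqrt(denom)
    cdf = 0.5 + 0.375 * u * (1.0 - (t * t / denom) / 12.0)
    return 2.0 * (1.0 - cdf)
```

For example, `two_tailed_p_df4(4.825)` evaluates to approximately 0.008, in agreement with the value reported for area V2.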
The control movie – which was identical to the band‐stop phase‐scrambled stimuli but with fully randomized phase – did not result in any discernible map (Fig. 3D). This is an important control, given that natural scenes have unpredictable variations that may inadvertently time‐lock with our phase‐encoded stimulus. The absence of a resultant map from the control stimuli suggests that the maps obtained from the band‐stop phase‐scrambled stimuli are meaningful.
Higher visual areas
Band‐stopped phase‐scrambled stimuli were much more effective in driving activity beyond the early visual cortex. This was anticipated, since meaningful features of the scene could be discerned in this stimulus, and higher‐level areas are modulated by complex scene contents. A striking feature of the band‐stop phase‐scrambled map was the increased engagement of regions along the posterior angular gyrus and superior occipital gyrus. In the posterior angular gyrus, we observed a coarse map of higher to mid‐range spatial frequencies. The phase‐sensitive spatial frequency map here was large and had a ventral‐to‐dorsal orientation that mapped onto mid‐range to higher spatial frequencies. The map was completely absent in both the linear‐filtered and the control stimuli, suggesting that this region prefers the higher spatial frequency structure and that it may have a fine‐to‐coarse spatial organization on the cortex.
In general, high‐level ventral visual areas had much lower phase‐alignment‐sensitive spatial frequency preferences when mapped with our stimuli, and no local gradation of spatial frequencies was observed that could be indicative of an organized representation. The spatial frequency preference of the posterior and middle fusiform gyrus, for example, was fairly consistent and similar to the lateral occipital complex, suggesting that these ventral‐pathway areas have similar phase‐sensitive spatial frequency preferences. The parietal pattern of phase‐alignment‐sensitive spatial frequency representation was consistent with that observed along the ventral areas – an absence of an organized cluster of spatial frequency selectivity and a preference for mid‐range spatial frequencies that was consistent throughout this area.
Discussion
We show that the phase sensitivity in the early visual cortex is not uniform and that there is a bias to higher spatial frequency representations. We further found that this bias held in both areas V1 and V2, but was lost in later stages of the ventral pathway, namely area LO.
Our study tested two possibilities for joint representation of phase and spatial frequency information. Phase sensitivity could be a general feature of early visual cortex and jointly represented across all spatial frequencies. Alternatively, phase sensitivity may be biased to a subset of spatial frequencies, particularly higher spatial frequencies for which alignment is important in the formation of edges and contours. Our results lend support to the second idea.
Expanded map of phase sensitivity for high spatial frequencies in V1 and V2
Early visual areas do appear to be sensitive to the phase structure of stimuli (Felsen et al. 2005; Perna et al. 2008; Henriksson et al. 2009; Castaldi et al. 2013). While Felsen et al. (2005) found that V1 complex cells had enhanced selectivity for phase information in natural scenes, Henriksson et al. (2009) found that sensitivity to phase coherence is present at the population level (i.e. fMRI voxel) in V1 and all visual areas. Perna et al. (2008) and Castaldi et al. (2013) further showed enhanced V1 response to synthetic patterns with phase alignment (lines and edges) as opposed to phase‐scrambled versions of the stimuli. Interestingly, they did not find a difference between line and edge stimuli, despite large differences in these stimuli in terms of the structure of their phase spectra, suggesting the presence of odd and even symmetric detectors in early cortical processing of vision.
With the band‐stop phase‐scrambled stimuli, we found both that the spatial frequency preference is shifted towards higher frequencies and that a larger patch of V1 and V2 represents the phase‐aligned high spatial frequency components. It is not immediately evident why a larger portion of the cortex would prefer phase‐aligned high spatial frequency components. One possibility is that the expanded high spatial frequency preference simply represents the importance of fine details in scene recognition. This would make sense for the foveal representation, but why would it need to also involve more peripheral regions? A likely explanation is that the phase‐aligned map ensures edge coherence for extended contours across the visual field.
Despite the substantial drop‐off in acuity outside of the fovea, our perception of the world is surprisingly coherent. Contours and edges of natural scenes do not suddenly bend and warp as their images extend beyond the fovea, and this consistency remains through substantial scene changes brought by eye movements, head movement, lighting changes, etc. Such a consistent percept demands coherence of edges across the visual field. Given the importance of phase alignment in the perception of edges and contours, we speculate that coherent perception of these features across the visual field may demand an extended cortical representation of phase‐aligned higher spatial frequencies across the retinotopic maps of the early visual areas.
The role of phase alignment in natural vision
Phase spectra of natural images have complex statistical structures (Thomson, 2001) which define the layout of the scene as well as fine edges that may be valuable for detecting and identifying constitutive objects in a scene. Phase alignment is important for scene recognition and disruption of the phase spectrum leads to a degradation of performance on object recognition tasks such as animal recognition (Wichmann et al. 2006). In addition to its contribution to contour perception and object recognition from edges and contours, phase alignment appears to be important for depth perception from texture gradients (Thaler et al. 2007), whereby disruption of the texture phase affects the magnitude of depth perceived from texture cues.
High‐level vision and phase alignment
We did not observe clusters of phase‐aligned spatial frequency selectivity in higher visual areas, such as those along the fusiform gyrus or the posterior parietal cortex. In higher visual areas it is known that neuronal receptive fields enlarge and sensitivity to high spatial frequencies decreases (Dumoulin & Wandell, 2008), and these areas also exhibit sensitivity to phase (Henriksson et al. 2009). Our results suggest that this phase sensitivity is less frequency dependent in later areas than earlier ones.
The observation of a coarse phase‐sensitive spatial frequency map in the posterior angular gyrus was unexpected, as this region is not generally considered a visual area. Multiple and seemingly unrelated roles have been attributed to this region, including reading, spatial attention, memory recall, action awareness and saccade planning (Masland, 1958; Schlaggar & McCandliss, 2007; Crawford et al. 2011). An early report (Holmes & Horrax, 1919) of a patient with a bullet wound through the angular gyrus suggested a role in visual depth perception – the patient exhibited a total incapacity to perceive depth in anything. The link between our findings and previous results is at present unclear.
Phase alignment at high spatial frequencies may be important for the perception of depth from texture. Thaler et al. (2007) compared depth judgments of curved surfaces for different textures including textures that had blob‐like elements (dots and flagstones) which require phase alignment across multiple spatial frequencies, as well as line textures which are minimally disrupted by phase scrambling. They found that depth judgments are most affected by phase scrambling of blob‐like textures and only minimally for line textures. Given that natural textures are broad‐band, the results of Thaler et al. (2007) suggest that depth from texture in natural scenes is dependent on phase alignment.
We speculate that the absence of non‐uniform phase sensitivity in later areas such as LO reflects the importance of multiple spatial frequencies in representing natural three‐dimensional (3‐D) objects. Natural scene objects are visually defined by many spatial frequency components as well as the phase alignment of these components. It is likely that when the stimuli are filtered – linearly or with band‐stop phase‐scrambled approaches as done here – the representation of the 3‐D object is degraded. Hence, the non‐uniform phase sensitivity that is tuned to high SF is relevant to early areas where edges and contours are first extracted, but in later areas the relevance of all SF components is more balanced.
One map of spatial frequency or many?
Our two resultant maps for spatial frequency – one phase sensitive and the other not – can be simply understood in terms of overlapping neural population preferences. The linear‐filtered map merely represents the peak frequency preference of a given voxel and does not denote exclusive representation of any spatial frequency. Spatial frequency is represented in a structured manner in the macaque V1 at the scale approaching that of a column (Nauhaus et al. 2012), where gradients of spatial frequency selectivity run perpendicular to gradients of orientation selectivity. These results imply that each voxel contains neurons with a range of spatial frequency preferences. Phase alignment at higher spatial frequencies is therefore probably engaging the high‐spatial‐frequency‐preferring neurons distributed across V1 and V2. The distribution of such cells is still correlated with eccentricity, but at a much more coarse level.
What types of cells in V1 and V2 would exhibit such phase sensitivity? Simple cells, while sensitive to the phase of a signal within their receptive field, are insensitive to the phase alignment of multiple frequency components as is present in natural scenes, so their responses would not be expected to be modulated by our phase‐alignment manipulation (Felsen et al. 2005; Touryan et al. 2005). Complex cells, on the other hand, have a particular preference for phase‐aligned spatial frequency components in natural images (Felsen et al. 2005) and are known to respond to higher spatial frequencies than simple cells (Movshon et al. 1978). These cells would therefore be expected to be differentially modulated by the phase‐aligned high‐spatial‐frequency components in our stimulus. Thus the phase‐sensitive spatial frequency map may be describing the spatial frequency preference of complex cells across V1 and V2.
Additional information
Competing interests
None declared.
Author contributions
R.F., R.F.H. and B.T. conceived the study; R.F., B.T. and B.C.H. created the stimuli; R.F., B.T. and S.C. collected the data; S.C. analysed the data; R.F., R.F.H. and S.C. interpreted the data; S.C. and R.F. created the figures; R.F. prepared the manuscript and revisions; R.F.H. edited the manuscript. All authors have approved the final version of the manuscript and agree to be accountable for all aspects of the work. All persons designated as authors qualify for authorship, and all those who qualify for authorship are listed.
Funding
R.F. was supported by internal funds from the Research Institute of the McGill University Health Centre and a Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant (RGPIN 419235‐2013). R.F.H. was supported by Canadian Institutes of Health Research (CIHR) grants (nos MT108‐18 and 53346).
Supporting information
R. Farivar and S. Clavagnier are joint first authors.
References
- Castaldi E, Frijia F, Montanaro D, Tosetti M & Morrone MC (2013). BOLD human responses to chromatic spatial features. Eur J Neurosci 38, 2290–2299.
- Chaudhuri A (1990). Modulation of the motion aftereffect by selective attention. Nature 344, 60–62.
- Crawford JD, Henriques DY & Medendorp WP (2011). Three‐dimensional transformations for goal‐directed action. Annu Rev Neurosci 34, 309–331.
- Dale AM, Fischl B & Sereno MI (1999). Cortical surface‐based analysis. I. Segmentation and surface reconstruction. Neuroimage 9, 179–194.
- De Valois RL & De Valois KK (1980). Spatial vision. Annu Rev Psychol 31, 309–341.
- Dumoulin SO & Wandell BA (2008). Population receptive field estimates in human visual cortex. Neuroimage 39, 647–660.
- Engel SA (2012). The development and use of phase‐encoded functional MRI designs. Neuroimage 62, 1195–1200.
- Engel SA, Rumelhart DE, Wandell BA, Lee AT, Glover GH, Chichilnisky EJ & Shadlen MN (1994). fMRI of human visual cortex. Nature 369, 525.
- Felsen G, Touryan J, Han F & Dan Y (2005). Cortical sensitivity to visual features in natural scenes. PLoS Biol 3, e342.
- Fischl B & Dale AM (2000). Measuring the thickness of the human cerebral cortex from magnetic resonance images. Proc Natl Acad Sci USA 97, 11050–11055.
- Freeman J, Ziemba CM, Heeger DJ, Simoncelli EP & Movshon JA (2013). A functional and perceptual signature of the second visual area in primates. Nat Neurosci 16, 974–981.
- Frost MA & Goebel R (2012). Measuring structural‐functional correspondence: spatial variability of specialised brain regions after macro‐anatomical alignment. Neuroimage 59, 1369–1381.
- Goebel R, Muckli L, Zanella FE, Singer W & Stoerig P (2001). Sustained extrastriate cortical activation without visual awareness revealed by fMRI studies of hemianopic patients. Vision Res 41, 1459–1474.
- Hansen BC & Hess RF (2006). Discrimination of amplitude spectrum slope in the fovea and parafovea and the local amplitude distributions of natural scene imagery. J Vis 6, 696–711.
- Hansen BC & Hess RF (2007). Structural sparseness and spatial phase alignment in natural scenes. J Opt Soc Am A Opt Image Sci Vis 24, 1873–1885.
- Hansen BC, Jacques T, Johnson AP & Ellemberg D (2011). From spatial frequency contrast to edge preponderance: the differential modulation of early visual evoked potentials by natural scene stimuli. Vis Neurosci 28, 221–237.
- Hansen BC, Johnson AP & Ellemberg D (2012). Different spatial frequency bands selectively signal for natural image statistics in the early visual system. J Neurophysiol 108, 2160–2172.
- Henriksson L, Hyvarinen A & Vanni S (2009). Representation of cross‐frequency spatial phase relationships in human visual cortex. J Neurosci 29, 14342–14351.
- Henriksson L, Nurminen L, Hyvarinen A & Vanni S (2008). Spatial frequency tuning in human retinotopic visual areas. J Vis 8, 1–13.
- Hess RF, Li X, Mansouri B, Thompson B & Hansen BC (2009). Selectivity as well as sensitivity loss characterizes the cortical spatial frequency deficit in amblyopia. Hum Brain Mapp 30, 4054–4069.
- Holmes G & Horrax G (1919). Disturbances of spatial orientation and visual attention, with loss of stereoscopic vision. Arch NeurPsych 1, 385–407.
- Larsson J & Heeger DJ (2006). Two retinotopic visual areas in human lateral occipital cortex. J Neurosci 26, 13128–13142.
- Linden DE, Kallenbach U, Heinecke A, Singer W & Goebel R (1999). The myth of upright vision. A psychophysical and functional imaging study of adaptation to inverting spectacles. Perception 28, 469–481.
- Masland RL (1958). Higher cerebral functions. Annu Rev Physiol 20, 533–558.
- Movshon JA, Thompson ID & Tolhurst DJ (1978). Receptive field organization of complex cells in the cat's striate cortex. J Physiol 283, 79–99.
- Nauhaus I, Nielsen KJ, Disney AA & Callaway EM (2012). Orthogonal micro‐organization of orientation and spatial frequency in primate primary visual cortex. Nat Neurosci 15, 1683–1690.
- Olshausen BA & Field DJ (1996). Natural image statistics and efficient coding. Network 7, 333–339.
- Perna A, Tosetti M, Montanaro D & Morrone MC (2008). BOLD response to spatial phase congruency in human brain. J Vis 8, 11–15.
- Riesenhuber M & Poggio T (2000). Models of object recognition. Nat Neurosci 3, Suppl., 1199–1204.
- Sasaki Y, Hadjikhani N, Fischl B, Liu AK, Marrett S, Dale AM & Tootell RB (2001). Local and global attention are mapped retinotopically in human occipital cortex. Proc Natl Acad Sci USA 98, 2077–2082.
- Schlaggar BL & McCandliss BD (2007). Development of neural systems for reading. Annu Rev Neurosci 30, 475–503.
- Sereno MI, Dale AM, Reppas JB, Kwong KK, Belliveau JW, Brady TJ, Rosen BR & Tootell RB (1995). Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science 268, 889–893.
- Talairach J & Tournoux P (1988). Co‐Planar Stereotaxic Atlas of the Human Brain. 3‐Dimensional Proportional System: An Approach to Cerebral Imaging. Georg Thieme, Stuttgart, New York.
- Thaler L, Todd JT & Dijkstra TM (2007). The effects of phase on the perception of 3D shape from texture: psychophysics and modeling. Vision Res 47, 411–427.
- Thomson MG (2001). Beats, kurtosis and visual coding. Network 12, 271–287.
- Touryan J, Felsen G & Dan Y (2005). Spatial structure of complex cell receptive fields measured with natural images. Neuron 45, 781–791.
- Wandell BA, Dumoulin SO & Brewer AA (2007). Visual field maps in human cortex. Neuron 56, 366–383.
- Wichmann FA, Braun DI & Gegenfurtner KR (2006). Phase noise and the classification of natural images. Vision Res 46, 1520–1529.