Abstract
Statistics of natural scenes are not uniform—their structure varies dramatically from ground to sky. It remains unknown whether these nonuniformities are reflected in the large-scale organization of the early visual system and what benefits such adaptations would confer. Here, by relying on the efficient coding hypothesis, we predict that changes in the structure of receptive fields across visual space increase the efficiency of sensory coding. Using the mouse (Mus musculus) as a model species, we show that receptive fields of retinal ganglion cells change their shape along the dorsoventral retinal axis, with a marked surround asymmetry at the visual horizon, in agreement with our predictions. Our work demonstrates that, according to principles of efficient coding, the panoramic structure of natural scenes is exploited by the retina across space and cell types.
Subject terms: Neural circuits, Retina, Neural encoding
The statistics of natural scenes are not uniform—their structure varies dramatically from ground to sky. A combination of theory and experiments revealed that the visual system has adapted to these inhomogeneities to increase coding efficiency.
Main
The idea that sensory neurons exploit the statistical structure of natural stimuli to minimize the metabolic cost of information transmission has been a guiding principle in neuroscience for over half a century1–3. This conceptual framework, known as the efficient coding hypothesis4, has provided successful theoretical accounts of sensory coding across species and sensory systems5–8 with the retina being the paramount example9. Most of the work in the retina has focused on retinal ganglion cells (RGCs), the neurons that relay visual information from the eye to the brain. It has been demonstrated that multiple properties of RGCs—the shape of receptive fields (RFs)10–13, organization of RF mosaics14,15 and the ratio of ON to OFF RGC cell types16—can be explained as adaptations to the natural sensory environment. In all the mentioned cases, ab initio theoretical predictions about efficient encoding of natural scenes have led to a better understanding of the physiological and anatomical properties of the retina.
One way a sensory neuron could implement an efficient code is by removing predictable (or redundant) components from sensory stimuli, in a transformation known as predictive coding (PC). This prominent hypothesis suggests that the center–surround structure of RGC RFs is a manifestation of such a design principle10. According to this hypothesis, the surround computes a prediction of the stimulus value in the center of the RF. The predicted value is then ‘subtracted’ from the center through inhibition, which dramatically reduces the amount of neural resources used to convey the stimulus downstream. PC and related information-theoretic principles11–13,17 typically assume that the structure of natural scenes is uniform across the visual field. However, as demonstrated recently, local contrast and luminance vary prominently across elevation within the natural visual field of a mouse18,19. Such systematic variation affects the signal-to-noise ratio (SNR) of the input to RGCs. To understand how this inhomogeneous noise structure should shape RGC RFs, we developed a simple predictive coding model. When adapted to the natural statistics of mouse vision, our model generates three key predictions linking the shape of optimal RFs and their position within the visual field. First, the relative surround strength should increase with increasing elevation, due to a consistent increase in brightness from the dim ground to the bright sky. Second, the center size should decrease along the same axis. Third, due to a rapid change of signal intensity between the lower and upper visual fields, RFs centered on the horizon should have strongly asymmetric surrounds, with the upper half being stronger than the lower one.
To test these predictions experimentally, we established a new system that enables recording and characterization of the RF structure at high resolution, at the scale of thousands of RGCs in a single retina. Such technological development enabled us to collect a dataset of 31,135 RGC RFs covering the entire central retina, which was crucial to test our theory. We found a close agreement between theoretically optimal RF architecture and the variation of RF shapes across the retina, suggesting that RGCs exploit global asymmetries of natural scenes for maximizing coding efficiency. Furthermore, we explored these adaptations across the diversity of functional RGC types20, each thought to share the same physiology, morphology and intraretinal connectivity21–24. We identified a systematic dorsoventral variation of the RF shape, regardless of the functional type. Finally, we show that these global adaptations are preserved in awake-behaving animals with intact eyes. Our results thus indicate that adaptations to the panoramic natural statistics structure retinal representations used by the brain.
Results
Efficient coding predicts receptive field shapes across the visual field
To understand how the statistical structure of natural scenes shapes RFs across the visual field, we developed a model of sensory coding in RGCs (Fig. 1a). Our approach is closely related to the PC theory, which postulates that RGCs recode outputs of photoreceptor cells to minimize the metabolic cost of sensory information transmission10. Following this theory, we modeled neural responses as a linear combination of the RF and natural stimuli, distorted by different sources of constant noise, for example, biochemical or synaptic9,25,26 (Fig. 1a). The computation performed by such RFs can be understood as the difference of the weighted center of the stimulus and its surrounding neighborhood. Our model generates predictions consistent with PC (Extended Data Fig. 1) as well as related theories of efficient sensory coding11,13.
Predictive coding theory of the retina assumes that statistics of natural stimuli are stationary across the visual field10. However, natural scenes are spatially inhomogeneous. To understand this inhomogeneity, we examined a set of natural images collected specifically to study mouse vision18. In agreement with previous studies18,19, we found that the power of the light intensity decreases gradually with the elevation and drops off suddenly close to the simulated horizon line (Fig. 1b,c). Under the assumption of constant noise level, the SNR of photoreceptor outputs (that is, RGC inputs) should therefore follow the analogous pattern (Fig. 1c). Moreover, due to an abrupt change of the stimulus power, stimuli centered at the horizon yield a highly nonuniform SNR pattern (Fig. 1d). To find how such inhomogeneities could affect sensory representations across the retina, we numerically optimized RFs to minimize the strength of neural responses averaged across a set of natural image stimuli (Methods).
The shape of the optimal RF depends on the relative strength and the structure of noise. When the SNR decreases, the center of the optimal RF broadens, and the surround becomes more diffuse (Fig. 1e). These trends are dependent on the relative change but not on the absolute SNR. This qualitative change is manifested in increasing relative surround strength (Fig. 1f) and decreasing center sizes (Fig. 1f). The optimal RF shape is additionally modulated by the spatial pattern of the SNR (Fig. 1g). When the SNR is spatially nonuniform (for example, when the signal is stronger in the upper half of the stimulus), the optimal RF becomes asymmetric (Fig. 1g). This effect is particularly visible in the increasing asymmetry of the surround as a function of SNR asymmetry (Fig. 1h). Because of such systematic variation in the stimulus power across the visual field, the PC model predicted three qualitative links between the position of a neuron in the retina and the shape of its RF. First, the strength of the RF surround relative to the center should increase with elevation across the visual field. Second, the size of the center should increase in the opposite direction. Third, RFs located at the horizon should have surrounds that are substantially stronger in the upper half than in the lower half. Such a distribution of RF shapes would indicate that RGCs exploit global statistics of the visual field to maximize the efficiency of sensory coding. These three predictions stand in contrast to the dominant view that RGC RFs are uniform across the retinal surface. Furthermore, predictions of the PC model are reproducible across different ranges of the light spectrum (Fig. 1f,h) and sets of natural stimuli (Extended Data Fig. 1) and depend primarily on weak assumptions about the correlation structure of natural images10 (Supplementary Note 1 and Extended Data Fig. 2). We thus consider them to be a robust consequence of the efficient coding hypothesis.
Large-scale characterization of receptive fields across the retina
Testing these theoretical predictions requires a high-resolution characterization of RGC RFs from extended regions of the retinal surface. Currently, however, it is not practical to perform such large-scale characterizations with any of the existing methods. Multiphoton imaging approaches can measure large numbers of RGCs20,27, but only at a moderate throughput (~150 RGCs at a time20,27). Multielectrode array recording approaches have improved this number but are limited to the recording area placed on top of the electrode array15. Moreover, RF estimates generated by current approaches lack a clear surround structure20. To circumvent these limitations, we designed a high-throughput and low-cost epifluorescence approach that enables imaging a larger field of view (FOV; 1.7 mm2 sampling at ~1 μm per pixel) and permits >1 h-long recordings of the same FOV while avoiding artifacts caused by small retina wrinkles and laser scanning. Our method takes advantage of red calcium sensors (for example, RCaMP1.07)28 that separate the Ca2+ indicator’s red-shifted excitation light from the opsin absorption spectrum (Fig. 2a,b and Methods), and allows robust responses to ultraviolet (UV) visual stimulation. We used the VGluT2-cre driver line to specifically target RGCs (Fig. 2c), leading to a uniform expression across the entire retina. All RCaMP1.07-positive somata correspond to RGCs, as seen by the RGC-specific marker RBPMS29 (Fig. 2d). Double-positive cells accounted for ~40% of all RGCs. This expression pattern appears to be RGC-type specific, as seen by co-labeling with SMI32, a marker of alpha-RGCs. Alpha-RGCs were consistently excluded from the expression profile, apart from a single, sparse and spatially distributed type (Fig. 2e). Using this line, we were able to reproduce and expand previous large-scale imaging results in single retinas, as seen in direction-selective (DS) and non-DS responses (Fig. 2f), the cardinal DS response distributions (Fig. 2g) and clustering and reproducibility of responses to changes in frequency, contrast and luminance, known as the ‘chirp’ stimulus20 (Fig. 2h and Extended Data Fig. 3). By sequentially recording 3–7 FOVs (Fig. 2c), each for approximately 25 min, we could record neural activity from up to ~6 mm2 of retinal surface (~40% of the total retinal area). By experimental design, the position of each FOV was random. Moreover, the strength of the functional responses in each consecutive session was unaltered (Fig. 2i). Importantly, using a new ‘shifting’ white-noise approach, where the checker positions are randomly shifted to increase the RF spatial resolution (Extended Data Fig. 4a–f, Methods and ref. 30), we were able to estimate high-resolution and high-SNR spatiotemporal RFs for ~85% of recorded cells (Fig. 2j). The quality of these RF estimates allowed for automatic parametrization of the spatial RF into the center and surround using a difference of Gaussians model (Fig. 2j,k, Extended Data Fig. 4g–l and Methods). In total, we recorded 11 retinas, reconstructing and parametrizing 31,135 spatiotemporal RFs, providing an unprecedented opportunity to index RGC responses across single retinas. This methodology will enable functional developmental screens and circuit dissections due to its simplicity, efficiency and affordability, extending the current retinal research toolbox.
Receptive fields are adapted to anisotropic natural scene statistics
Taking advantage of the high-resolution RFs, we examined variations in RF shapes and strengths across the retina. Given that the PC model does not determine the polarity of the optimal RF and to globally compare all retinal RFs, ON-center RFs were flipped in sign, such that all centers were negative, and all surrounds were positive. This allowed us to pool across all cells within small bins on the retinal surface and visualize the average spatial RF at different locations of the retina (Fig. 3a and Extended Data Fig. 5a). We observed a general and reproducible trend across 11 retinas: a streak-shaped area where all RF surrounds were oriented toward the optic nerve, and below which, hardly any RF surrounds were observed (Fig. 3a and Extended Data Fig. 5a). To compare these spatial variations of RFs with our theoretical predictions, we oriented six of the recorded retinas to a common coordinate system using the immunohistochemically determined S-opsin gradient. Average RFs in the ventral, centrodorsal and peripheral-dorsal retina (Fig. 3b) qualitatively matched model RFs predicted for the upper, medial and lower visual fields, respectively (Fig. 3b). To confirm the change of relative surround strengths independently from surround asymmetry, we computed the radial profiles of RFs and these also strongly resembled the radial profiles for model RFs (Fig. 3c). Overall, model RFs qualitatively reproduced all aspects of average spatial RFs across different elevations with remarkable detail.
To measure these phenomena quantitatively, we made use of the RF parametrizations and pooled RF parameters for all cells in different two-dimensional (2D; Fig. 3d–h) or one-dimensional (1D; Fig. 3e–i) bins across the retina. In line with our theoretical predictions (Fig. 1f), our analysis shows that the relative surround strength increases gradually along the dorsoventral axis (Fig. 3d), a trend visible in every single retina (Fig. 3e). Next, we explored whether we could observe any global change in RF center size. As predicted (Fig. 1f), center sizes decreased only across the dorsoventral axis (Fig. 3f,g and Extended Data Fig. 6). While examining the spatial distribution of differences between the upper and lower halves of the RF surrounds, we identified a consistent and prominent asymmetric streak in the dorsal retina, between 700 and 900 μm dorsally from the optic nerve (Fig. 3h,i), as one would expect from asymmetric visual inputs (Fig. 1e). Accordingly, linear regression weights were substantially stronger in elevation, but not azimuth, for all three trends (Fig. 3e–i and Extended Data Fig. 6). Overlaying the measured RF asymmetries with the opsin gradient indicated that the asymmetry is pronounced in the opsin transition zone (Fig. 3j). To test whether this streak corresponds to the horizon line within the animal’s visual field, we transformed the retinal coordinates to visual coordinates31 and used the S-opsin gradient32 to define the dorsoventral axis (Fig. 3j,k and Methods). In visual coordinates, the center of this asymmetric streak is located at 0° elevation, spanning the entire azimuth of our imaged FOVs (Fig. 3k), in line with our theoretical predictions (Fig. 1c). Finally, these trends also aligned in five additionally imaged retinas, where the true orientation could not be determined by the opsin gradient (Extended Data Fig. 7).
Adaptations to natural scene statistics across retinal pathways
It has long been assumed that specific RGC pathways have stereotyped response properties, shaped by the interactions between direct excitation in their center and indirect inhibition in their surround33. Thus, one would expect that these center–surround interactions are uniform across visual space for most RGC pathways. To assess the specificity of the observed RF adaptations across functional RGC pathways, we clustered the 31,135 RGCs into functional groups based on their temporal RF profiles using a Gaussian mixture model (GMM), as done previously20. Consistent with the proportion of RGCs labeled in our line (Fig. 2d), we defined ten functional clusters (Fig. 4a). Chirp responses were not used for clustering because we observed reliable responses only in RGCs with weak surrounds. Therefore, for RGCs with strong surrounds, the chirp responses did not help us match cluster identities (Extended Data Fig. 3c,d), despite revealing new response properties that would have aided classification (for example, clusters 3, 11 and 14; Extended Data Fig. 3b). Each cluster had distinctly shaped temporal filters, corresponding to different functional properties such as ON or OFF selectivity, transient or sustained responses, and monophasic or biphasic selectivity, as seen in their average profiles (Fig. 4a). Cluster membership statistics were conserved across different retinas (Extended Data Fig. 8). Moreover, the relative positions of RGCs belonging to individual clusters tile the retina in a mosaic-like arrangement in many cases (Fig. 4b), confirming that some clusters are indeed functionally distinct and irreducible RGC types15. As expected, many clusters represent a combination of RGC types that cannot be identified solely by their temporal profiles (Fig. 4b, cluster 10; see Supplementary Table 1 for tiling statistics). We next used this classification to examine the relative surround strength, center size and surround asymmetry across clusters (Fig. 4c). As with the global pooling of RFs (Fig. 3), cells in each functionally defined cluster increase their relative surround strengths gradually along the dorsoventral axis and decrease their center sizes accordingly. Moreover, all clusters contribute to the asymmetric streak (Extended Data Fig. 8c), consistent with the distribution of asymmetries in the opsin transition zone, where most cells have a strongly oriented surround with a ventral bias (Fig. 3k). All three trends were statistically significant for all clusters in elevation, but not azimuth (Supplementary Table 1). These results indicate that a substantial proportion of RGC pathways adapt to the constraints imposed by natural statistics.
In vivo panoramic receptive field anisotropies match predictions
To test whether the adaptations to the panoramic visual scene statistics affect sensory coding during behavior in naturalistic conditions, we decided to map RFs of the RGC axonal terminals in the superior colliculus (SC; Fig. 5a,b). These experiments have the advantage of testing our hypothesis in retinas that retain the complete adaptation machinery, from an attached pigment epithelium to functioning pupil constriction. Moreover, they provide important additional insights into the effects of the M-cone and rod pathways, which are saturated in our ex vivo retinal imaging system. For this purpose, we expressed the calcium indicator GCaMP8m34 in RGCs, using adeno-associated viruses (AAVs) in three mice. Subsequent implantation of a cranial window above the SC allows for visualization of RGC axonal terminal activity with two-photon calcium imaging in awake, behaving mice. The FOV varied for each recording from 0.32 to 1.85 mm2 (median of 0.68 mm2) of superficial SC surface (Fig. 5b), encompassing 23 to 57 (median of 41) visual degrees in elevation. GCaMP8m expression spread homogeneously across the SC (Fig. 5c). Using the previously described ‘shifting checker’ stimulus, we recorded 53,000 terminals in total, reconstructing 10,000 RFs above our quality threshold (Methods). Compared to the RFs recorded from ex vivo retinas, RFs measured in vivo were blurred along the main axis of saccadic movements (Figs. 2d and 5d; RFs). Thus, to compare across animals, we aligned the RFs of each animal to their respective saccadic plane, which, due to the head fixation, was consistently parallel to the ground for each animal, as shown previously35 (Extended Data Fig. 9a–c), and corrected the mouse head position to match the retinal coordinates (Methods). To avoid any bias due to eye movements, we then used the 1D profiles (1D RFs) along the orthogonal axis for further analysis (Fig. 5e). By binning and averaging 1D RFs along the lateromedial axis, spanning from the lower to the upper visual field (Fig. 5g–i and Extended Data Fig. 9d), the three predicted trends became visible: (1) the surround strength increased, (2) the center size decreased and (3) the surround became more symmetric. Similarly to our retinal results (Fig. 3b), the mean 2D RFs qualitatively matched model RFs predicted for the upper, medial and lower visual fields (Fig. 5f). Next, we analyzed the RF parameters on mean 1D RFs across visual space. As with previous results, the RF properties were significantly different above and below the horizon and had substantial regression weights on elevation (Fig. 5g–i) but not on azimuth (Extended Data Fig. 9h–l). Consistently, the average visual maps (Extended Data Fig. 5b) resembled the ones measured ex vivo (Fig. 3d–h). Thus, our in vivo results provide independent corroboration that the visual system is adapted to the constraints imposed by the panoramic natural statistics.
Discussion
In this study, we leveraged a new, high-throughput neural imaging setup (Fig. 2) to identify a novel kind of adaptation to panoramic scene statistics in the retina. In agreement with theoretical predictions derived from the efficient coding framework (Fig. 1), our experimental results indicate that the global RF architecture is adapted to encode panoramic natural scenes efficiently (Figs. 3 and 4). These results were further corroborated in RGC terminals of awake animals (Fig. 5), indicating that panoramic, efficient representations impact downstream processing during behavior. Classically, RGCs are known to dynamically change the strength of the RF surround in response to varying light levels36,37, which is thought to further increase the efficiency of sensory coding9,38,39. Our findings demonstrate that, in addition to such dynamic effects, RF shapes are also determined by static factors, namely their position within the visual field. In that way, the retina simultaneously exploits the large-scale spatial and fine-scale temporal structure of the visual space.
How could the visual system establish such a global RF architecture without fine-tuning each RGC pathway independently? One partial mechanism would be the nonuniform distribution of spectral sensitivity across the retina32. Such a distribution has been proposed to be relevant for color vision40, contrast coding19,41 and the detection of aerial predators in the sky42, but could simultaneously influence the static RF adaptations. For example, whereas the mouse retina has green light-sensing photoreceptors across the entire retina (M-opsin and Rod-opsin), UV sensitivity follows a sharp dorsoventral gradient (S-opsin)32. From the RGC’s perspective, both inputs will be summed under mesopic conditions, leading to a net enhancement of the UV sensitivity from the ground to the sky. Our in vivo results support this perspective by corroborating the ex vivo findings in an intact eye. Intriguingly, in vivo and ex vivo measured RFs differ subtly. Whereas ex vivo RFs show a clear asymmetrical peak at the horizon, in vivo RFs are more asymmetric across large proportions of the visual field (Fig. 5j). This is consistent with PC predictions because natural image patches located above the horizon tend to be vertically asymmetric (Fig. 1d) due to a gradient of stimulus power (Fig. 1b). Such an RF pattern indicates that other mechanisms, apart from the S-opsin gradient, have to be involved. One possibility could be the circuitry mediating the asymmetric surround of J-RGCs33,40, which is ventrally displaced and sensitive to M-cones and rods. Interestingly, the vertical gradient of stimulus power will flatten at lower ambient light levels, for example, at dusk and dawn. In conjunction, the relative strength of UV sensitivity and the antagonistic surround would also decrease36,37. In these conditions, the efficient coding hypothesis would predict a more homogeneous RF distribution across the dorsoventral axis. Conversely, the horizon will become more prominent in photopic conditions, where rods are less active. In such situations, the in vivo RF architecture should have a localized asymmetric streak, as measured in our ex vivo data. It would be revealing to test whether the retina-wide RF organization is dynamically reshaped under scotopic and photopic conditions. Finally, to fully benefit from this panoramic retinal code, the eye should maintain a relatively constant position on the horizon. In agreement with this idea, eye and head movements stabilize the retinal image remarkably well during behavior, to within ~10° in azimuthal angle on average35,43.
A distinct, yet related question can be asked about the emergence of DS computations in retinorecipient areas, such as the SC, where neurons integrate input from the entire retina, including the asymmetric streak. DS encoding can emerge as a consequence of an asymmetric and time-shifted surround, as shown before33,44. Consistent with the measured center–surround asymmetry, some studies have described these neurons as sensitive to upward motion45, whereas others do not find such specificity46,47. The efficient coding interpretation, such as the one adopted here, suggests that the key to resolving this puzzle might be understanding the statistics of what the animal ought to see in nature.
Our theoretical predictions established qualitative links between properties of RFs and their elevation within the visual field. They can therefore be thought of as a first-order approximation of how the retina is adapted to large-scale, spatial statistics of natural scenes. The exact pattern of global retinal adaptation should vary across species occupying diverse environments. It has been found that dorsoventral opsin gradients are present in many different mammalian species. For example, the rabbit, the Chilean subterranean rodent cururo, the European mole, the shrew and even the spotted hyena42,48 show higher S-opsin expression in the ventral retina. However, not all ecological niches are identical. For example, in dense forests, the vertical gradients of luminance and contrast are less prominent, and a clear horizon line might not be apparent. Interestingly, forest mouse species whose opsin distribution has been described show a spatially uniform opsin distribution49. This further strengthens our hypothesis, which relates the global organization of the retina to the statistics of the ecological visual field. Understanding this adaptation in more detail will require a careful analysis of stimuli from the ecological sensory niche, as well as an estimation of biophysical parameters such as biological SNR, RF size and tiling to refine our theoretical predictions. The combination of these approaches will be a critical requirement for building a more general theory of vision across the animal kingdom50.
Methods
Theory
We modeled neural responses r_t as a dot product of the model RF w and noisy stimulus vectors (image patches) s_t, that is, r_t = wᵀs_t, where ᵀ denotes vector transposition. RFs (filters) were optimized to minimize the following cost function:
L = (1/T) Σ_t |r_t| + λ Σ_i d(i) w_i²
where d(i) is the squared distance between the i-th value of the RF and the one with the peak absolute value, and λ is the strength of the spatial locality constraint. This form of the locality constraint was introduced in ref. 11, and it has been demonstrated that it is consistent with RGC properties11,13. We note that the activity-related term in the cost function is equivalent to maximizing the sparsity of the neural activity quantified as the average absolute value of neural responses8. Without the spatial locality constraint (that is, λ = 0), the optimal RF is an oriented, Gabor-like filter. During optimization, to avoid convergence to trivial solutions, the norm of the RF was constant and equal to 1. Overall, this cost function enforces minimization of activity conveyed downstream, while preserving the information about the image and meeting the locality constraints. Conceptually, this goal is equivalent to that of the PC model10. Our model generates predictions consistent with the PC model proposed in ref. 10 (Extended Data Fig. 1). It is, however, more flexible, and enables us to capture changes in the center size.
We modeled the output of photoreceptor cells (stimuli s_t) as natural image patches x_t distorted with additive Gaussian noise with variance σ², that is, s_t = x_t + ε_t, where ε_t ~ N(0, σ²I) is the noise term. To simulate different SNR conditions, we manipulated the noise variance level, and optimized RFs for each of the noise levels separately.
We optimized RFs by numerically minimizing the cost function L via gradient descent. For training, we used a dataset of 50,000 square image patches of 27 × 27 pixels in size taken from a dataset of natural images from the mouse visual environment18. We sampled images uniformly across the upper and lower visual fields. Each image patch was normalized to have a zero mean and unit variance. To simulate the impact of changing SNR, we normalized images with added noise. During optimization, the dimensionality of natural image data was reduced with principal-component analysis (PCA) to 128 dimensions. For each noise level, dimensionality reduction was performed using the same matrix of PCA components computed on noiseless data. To simulate the impact of changing SNR homogeneity, before normalization we multiplied the bottom half of each image by a scaling factor of less than one, resulting in the range of vertical surround asymmetry values reported in Fig. 1h. After such scaling, we normalized the data and added noise of constant variance. During optimization of RFs on asymmetric stimuli, we computed PCA for each level of SNR asymmetry separately.
In all cases, before optimization, in order to enforce that the RF is centered in the image patch, we initialized RFs with random Gaussian noise with variance equal to 0.1 but set the central pixel value to −1. We independently optimized RFs using images taken in the UV and green parts of the spectrum, as well as in the ‘joint’ spectrum where intensity of each pixel was the average of green and UV values. To ensure that results do not depend on the choice of natural image dataset, we performed the simulations with the images of the African savanna from the van Hateren repository used in ref. 51 (Extended Data Fig. 2).
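A minimal sketch of this optimization in Python, assuming synthetic spatially correlated patches in place of the natural-image dataset and illustrative values for the noise level, locality strength and learning rate (the PCA dimensionality-reduction step is omitted, and all variable names are ours, not from the original code):

import numpy as np

rng = np.random.default_rng(0)
P = 27                                   # patch edge length in pixels, as in the Methods
D = P * P

# Stand-in for natural image patches: spatially correlated Gaussian noise.
# The actual model was trained on patches from a mouse natural-image dataset.
def make_patches(n, p, k=3):
    xs = rng.standard_normal((n, p, p))
    kernel = np.ones(k) / k              # crude spatial correlation via a separable box blur
    xs = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="same"), 1, xs)
    xs = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="same"), 2, xs)
    xs = xs.reshape(n, -1)
    xs -= xs.mean(axis=1, keepdims=True)
    xs /= xs.std(axis=1, keepdims=True) + 1e-9
    return xs

X = make_patches(5000, P)
sigma = 0.5                                        # noise s.d., sets the SNR (illustrative)
S = X + sigma * rng.standard_normal(X.shape)       # noisy photoreceptor output s_t

# Locality penalty d(i): squared distance of every pixel from the patch center.
yy, xx = np.meshgrid(np.arange(P), np.arange(P), indexing="ij")
c = (P - 1) / 2
d = ((yy - c) ** 2 + (xx - c) ** 2).ravel()

lam = 1e-3                                         # locality strength (illustrative)
w = 0.1 * rng.standard_normal(D)
w[D // 2] = -1.0                                   # initialize the central pixel
w /= np.linalg.norm(w)

lr = 1e-2
for step in range(2000):
    r = S @ w                                      # responses r_t = w^T s_t
    grad = S.T @ np.sign(r) / len(r) + 2 * lam * d * w   # gradient of mean |r_t| + locality term
    w -= lr * grad
    w /= np.linalg.norm(w)                         # keep ||w|| = 1, as in the Methods

rf = w.reshape(P, P)   # at high SNR, a compact center-surround-like filter should emerge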
To evaluate RF properties, we defined the size of the RF to be the smallest circle that included 90% of the energy (that is, Σ_i w_i²) of optimal RFs averaged across all noise levels (Fig. 1b,c). Within that circle, we defined the center to be all values smaller than 0, and the surround to be those larger than or equal to 0. The strength of the surround was thus equal to Σ_{i: w_i ≥ 0} w_i and that of the center to Σ_{i: w_i < 0} |w_i|. Sizes of the center were simply the numbers of entries that were smaller than 0.
To characterize changes in contrast and luminance across the visual field, we used natural images published in ref. 18. We limited our analysis to UV images only; however, light statistics of the green channel do not differ qualitatively. The images provided in ref. 18 were divided into two classes—upper visual field and lower visual field. To simulate the visual horizon, we concatenated pairs of images randomly selected from the upper and lower visual fields. We created a dataset of 1,000 such concatenated images and used them to compute the mean and variance of light intensity as estimates of local luminance and contrast, as well as to estimate the SNR as a function of elevation. To estimate the vertical asymmetry of the SNR pattern, we used a square stimulus window and a fixed noise variance. We note that key, qualitative aspects of our predictions do not depend on these choices. For each position y of the window along the vertical dimension of the visual field, we computed the vertical SNR asymmetry as (S_u − S_l)/(S_u + S_l), where S_u and S_l are the sums of the SNR values within the stimulus window above and below its midline, respectively.
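A short sketch of how such an asymmetry profile can be computed, assuming a synthetic image whose high-power upper half stands in for the concatenated natural images (window size and noise variance are illustrative):

import numpy as np

rng = np.random.default_rng(1)

# Synthetic 'panoramic' image: high-variance sky over a dim ground,
# standing in for the concatenated upper/lower visual field images from ref. 18.
H, W = 200, 200
img = np.vstack([3.0 * rng.standard_normal((H // 2, W)),    # upper visual field
                 1.0 * rng.standard_normal((H // 2, W))])   # lower visual field

noise_var = 1.0          # fixed noise variance, as in the Methods
win = 40                 # square stimulus window (pixels), illustrative

def vertical_snr_asymmetry(image, y, win, noise_var):
    """SNR asymmetry of a window centered at row y: (S_u - S_l) / (S_u + S_l)."""
    patch = image[y - win // 2: y + win // 2, :win]
    snr = patch ** 2 / noise_var                  # local signal power over noise power
    s_u = snr[: win // 2].sum()                   # SNR above the window midline
    s_l = snr[win // 2:].sum()                    # SNR below the window midline
    return (s_u - s_l) / (s_u + s_l)

elevations = np.arange(win // 2, H - win // 2)
asym = [vertical_snr_asymmetry(img, y, win, noise_var) for y in elevations]
# asym peaks where the window straddles the simulated horizon (around row H // 2)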
Animals
Mouse protocols were reviewed by the institutional preclinical core facility at IST Austria. All breeding and experimentation were performed under a license approved by the Austrian Federal Ministry of Science and Research in accordance with the Austrian and EU animal laws (BMF-66.018/0017-WF/V/3b/2017). For retinal experiments, triple-transgenic male and female mice (n = 8 mice, 3 males, 5 females; n = 11 retinas, 5 from the left eye, 6 from the right) aged 5–12 weeks were used for this study (Vglut2-ires-cre (JAX 28863), TITL-R-CaMP1.07-D (JAX 030217) and ROSA26-ZtTA (JAX 012266)). Original strains were obtained from Jackson Laboratories. For in vivo imaging experiments, C57BL/6J (JAX, 000664; n = 3, 2 males, 1 female), aged 6–11 weeks at eye injection, were used. The mice were housed in a standard (in vivo inverted) 12-h day–night cycle and euthanized by cervical dislocation before in vitro imaging.
Statistics and reproducibility
No statistical method was used to predetermine sample size. Low SNR RFs were excluded from analysis, as described below. As done previously20, only chirp responses that passed a quality criterion were used for further analysis. Given the nature of the retinal experiments, the location of the recordings was randomized to prevent any biases in the outcome.
Ex vivo imaging
The dark-adapted mouse retina was isolated under far-red light (LED peak 735 nm, additionally filtered with a 735-nm LP filter eliciting an isomerization rate of ~17 R s−1) in oxygenated Ames’ medium (Sigma) with constant bubbling (95% O2, 5% CO2) at room temperature. Left and right retinas were kept separate for identification. Four incisions were made to flat-mount the retina, with ganglion cells facing up, on an 18-mm coverslip (VWR, 631-0153), and held down with a filter paper (Merck, GSWP01300) with a ~2.5 mm × 2.5 mm imaging window cut out. The preparation was then placed in a heated (32 °C) superfusion chamber on the stage of a custom-built upright fluorescence microscope. The retina was left to recover for a minimum of 10 min with the excitation light of the microscope turned on. An amber LED (Thorlabs, M595L4) filtered with a BP filter (Thorlabs, FB580-10) was used for excitation and a BP filter (Thorlabs, 641-75) in series with a 600-nm LP filter (Thorlabs, FEL0600) was used for collection. Background excitation light intensity was at a constant mean photopic intensity of 10⁵ R s−1 per rod (at 585 ± 5 nm). Isomerization rates were determined using opsin templates52 and assuming that the mouse rod has an optical density at peak absorption wavelength of 0.015 μm−1, a length of 24 μm, a diameter of 1.4 μm and a quantum efficiency of 0.67 (refs. 53,54). Each retina was tiled by recording 3–7 different FOVs at ×10 magnification (Olympus XLUMPLFLN20XW objective) using an sCMOS camera (Photometrics Prime 95B) at 10 frames per second and 1.1-μm pixel resolution. The setup was controlled and data were acquired using custom-built LabVIEW software (National Instruments, version 2019).
Visual stimuli for retinal experiments
Light stimuli were delivered from a modified Texas Instruments DLPLCR4500EVM DLP projector through a custom-made lens system and focused onto the photoreceptors (frame rate of 60 Hz, magnification of 2.5 μm per pixel, total area of 3.2 mm × 2 mm). The projector’s blue LED was replaced with a high-power UV LED (ProLight 1 W UV LED, peak 405 nm), to improve the differential stimulation of S pigments. Two SP filters in series (Thorlabs, FESH0550) were put in the stimulus path to block green light from entering the camera. Intensities and spectra were measured using a calibrated spectrometer (Thorlabs, CCS-100) and a digital power meter (Thorlabs, S130C sensor). A shifting spatiotemporal white-noise stimulus was presented using a binary pseudorandom sequence, in which the two primary lights (green and UV) varied dependently. All white-noise stimuli were presented at a 6-Hz update for 15 min. The checker size was 100 × 100 μm and the entire grid was shifted by random multiples of 10 μm in both x axis and y axis after every frame. In comparison experiments, static checkers (without shifts) of 100 × 100 μm and 25 × 25 μm were interleaved with the shifting checkers in chunks of 5 min for a total of 20 min for each of the three checker types. A ‘chirp’ stimulus with a 1-s bright step, increasing amplitude (0 to 127 over 8 s) and increasing frequency (0 to 4 Hz over 8 s) was repeated for five trials to reproduce clustering of responses20. Moving square gratings (temporal frequency of 0.6 cycles per second and spatial frequency of 0.025 cycles per micron) or a wide bright bar (1 mm s−1 speed, 2 mm width) in eight directions, repeated for five trials, were used for assessing direction selectivity. All visual stimuli were generated using the Psychtoolbox (version 3)55.
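A sketch of the shifting white-noise logic in Python, assuming an illustrative projector frame size; the checker size, shift step and pixel magnification follow the values above, while rendering, intensity calibration and the yoked green/UV channels are omitted:

import numpy as np

rng = np.random.default_rng(2)

px_um = 2.5                     # projector magnification: 2.5 um per pixel
frame_w, frame_h = 1280, 800    # projector resolution (illustrative)
checker_um = 100                # checker size, 100 x 100 um
shift_step_um = 10              # shifts in random multiples of 10 um
checker_px = int(checker_um / px_um)
shift_px = int(shift_step_um / px_um)
n_shifts = checker_um // shift_step_um

def shifting_checker_frame():
    """One binary white-noise frame with a random sub-checker offset."""
    n_x = frame_w // checker_px + 2                       # pad by one checker on each side
    n_y = frame_h // checker_px + 2
    grid = rng.integers(0, 2, size=(n_y, n_x))            # binary checker values
    frame = np.kron(grid, np.ones((checker_px, checker_px)))
    dx = rng.integers(0, n_shifts) * shift_px             # random multiple of 10 um in x
    dy = rng.integers(0, n_shifts) * shift_px             # random multiple of 10 um in y
    return frame[dy: dy + frame_h, dx: dx + frame_w], (dx, dy)

frames = [shifting_checker_frame() for _ in range(10)]    # updated at 6 Hz in the experiment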
Histology
After the ex vivo recordings, some of the retinas were fixed with 4% paraformaldehyde (PFA) for 30 min and stained for S-opsin and RFP. Retinas were incubated for 7 d at 4 °C in PBS, containing 5% donkey serum, 0.5% Triton X-100, goat anti-S-opsin (1:500 dilution; Rockland, 600-101-MP7) and rabbit anti-RFP (1:1,000 dilution; Rockland, 600-401-379). After washing thrice in PBS for 15 min each, retinas were incubated overnight in secondary antibodies, donkey anti-goat Alexa Fluor 488 (1:1,000 dilution; Abcam, ab150129) and donkey anti-rabbit Alexa Fluor 594 (1:1,000 dilution; Invitrogen, R37119). Retinas were then mounted and imaged with an Olympus VS120 Slidescanner with a ×20 objective. For cell-type characterization, Vglut2-ires-cre; TITL-R-CaMP1.07-D; ROSA26-ZtTA mice were euthanized and perfused intracardially, followed by retina extraction and staining for RBPMS or SMI32, along with RFP (primary antibodies: guinea pig anti-RBPMS (1:500 dilution; Sigma, ABN1376), mouse anti-SMI32 (1:500 dilution; BioLegend, 801701), rabbit anti-RFP (1:1,000 dilution; Rockland, 600-401-379) or mouse anti-RFP (1:500 dilution; MBL, M155-3); secondary antibodies: goat anti-guinea pig Alexa Fluor 647 (1:1,000 dilution; Invitrogen, A21450), donkey anti-mouse Alexa Fluor 647 (1:1,000 dilution; Abcam, A-31571) and donkey anti-rabbit Alexa Fluor 594 (1:1,000 dilution; Invitrogen, R37119)). The staining protocol was the same as above and these retinas were imaged with a Leica SP8 confocal microscope.
After the final in vivo recording, mice were terminally anesthetized with ketamine/xylazine (100 mg per kg body weight/10 mg per kg body weight) intraperitoneally (i.p.) and transcardially perfused with PBS, followed by 4% PFA. Brains were extracted and post-fixed in 4% PFA at 4 °C overnight. Brains were then washed and transferred to 30% sucrose solution for cryoprotection overnight at 4 °C and subsequently frozen and the midbrain coronally sliced into 40-µm sections on a Leica SM2010R sliding microtome. Sections were washed and then incubated in PBS, containing 5% donkey serum, 0.3% Triton X-100 and goat anti-GFP (1:2,000 dilution; Abcam, ab6673) overnight at 4 °C. After washing thrice in PBS for 15 min each, brain sections were incubated for 1 h in secondary antibody solution, donkey anti-goat Alexa Fluor 488 (1:1,000 dilution; Abcam, ab150129), washed thrice again in PBS and mounted on slides, where they were stained with DAPI (not shown) and coverslipped with custom-made Mowiol. Brain sections were imaged with a Nikon CSU-W1 spinning disk confocal microscope at ×20 tile stack acquisition. Shading correction was performed on image stacks with BaSiC56 in ImageJ and, finally, maximum projection was performed over the whole stack.
Preprocessing
Regions of interest (ROIs) were detected automatically from raw calcium movies using Suite2p54. Fluorescence traces, F, were detrended by computing dF/F, where the 8th percentile in a 20-s sliding window centered around each time point was taken as the baseline fluorescence. Different FOVs from the same retina were stitched together based on coordinates from the stage motors and repeated ROIs in overlapping regions were manually annotated using an open-source tool57. Repeated ROIs with the highest score in Suite2p’s built-in classifier were kept for analysis. The deconvolved signal from Suite2p (with tau = 1.0 s) was used for calculating RFs.
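A sketch of this detrending step, assuming a 10 frames-per-second trace; scipy's percentile_filter stands in for the sliding-window baseline described above:

import numpy as np
from scipy.ndimage import percentile_filter

fs = 10.0                       # imaging rate, frames per second
win = int(20 * fs)              # 20-s sliding window

def dff(f_trace):
    """dF/F with the 8th percentile of a 20-s window around each time point as baseline."""
    f0 = percentile_filter(f_trace, percentile=8, size=win, mode="nearest")
    return (f_trace - f0) / f0

# example: a slowly bleaching trace with occasional transients
rng = np.random.default_rng(1)
t = np.arange(0, 600, 1 / fs)
f = 100 * np.exp(-t / 500) + 5 * (rng.random(t.size) < 0.01) + rng.standard_normal(t.size)
trace_dff = dff(f)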
The median dF/F response across trials was taken as the response to the chirp and normalized by dividing by the maximum of the absolute values across time. Quality Index was computed as in ref. 20, and only responses with a score > 0.45 were kept for clustering (8,019 of 30,798 neurons). For moving gratings and bar, the mean across trials and maximum across time was taken as the response in any one direction.
Receptive field mapping
The RF for each neuron was computed as a calcium-triggered average. The spatiotemporal RF at latency τ and position (x,y) for neuron i was computed as
RF_i(x,y,τ) = (1/T) Σ_t r(i,t) s(x,y,t−τ) − (1/N) Σ_j (1/T) Σ_t r(j,t) s(x,y,t−τ)
where r(i,t) is the deconvolved response of neuron i at time t, s(x,y,t) is the white-noise stimulus, T is the length of the recording and N is the total number of neurons in the recording. The second term in this equation subtracts away the residual distribution of the stimulus and the contribution of slow bleaching that is common to all neurons in that recording and leads to RFs that had noticeably less noise. Only the UV channel of the stimulus was used for RF mapping. The latency τ was varied in increments of 0.025 s (40 Hz), and the stimulus was interpolated by using the last frame before a particular time, t−τ. The RF for each neuron was normalized between −1 and 1 by subtracting the mean value of the RF at latencies τ < 0, and dividing by the maximum absolute value of the entire RF.
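A sketch of this calcium-triggered average in Python on synthetic arrays (the stimulus is assumed to be already resampled onto the response time base, the negative-latency baseline subtraction is simplified away, and all names are ours):

import numpy as np

rng = np.random.default_rng(3)

# Synthetic data: N neurons, T time points, a coarse X x Y white-noise stimulus.
N, T, X, Y = 50, 2000, 20, 20
resp = rng.random((N, T))                              # deconvolved responses r(i, t)
stim = rng.integers(0, 2, (X, Y, T)).astype(float)     # white-noise stimulus s(x, y, t)

def spatiotemporal_rf(resp, stim, lags):
    """RF_i(x, y, tau) = <r_i(t) s(x, y, t - tau)>_t minus the population average of the same."""
    N, T = resp.shape
    rfs = np.zeros((N, stim.shape[0], stim.shape[1], len(lags)))
    for k, lag in enumerate(lags):
        s = stim[:, :, : T - lag]                      # stimulus shifted by the latency
        r = resp[:, lag:]
        cta = np.einsum("nt,xyt->nxy", r, s) / (T - lag)
        rfs[:, :, :, k] = cta - cta.mean(axis=0)       # subtract the term common to all neurons
    return rfs

rfs = spatiotemporal_rf(resp, stim, lags=np.arange(0, 8))

# normalize each RF by its maximum absolute value
rfs /= np.abs(rfs).reshape(N, -1).max(axis=1)[:, None, None, None]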
The location of the center of an RF, Pvar, was estimated by finding the pixel that varied the most across time, Pvar = argmax_(x,y) σ²_t(x,y), where σ²_t(x,y) is the variance across time for position (x,y). Each neuron’s RF was cropped within a square window of edge 1 mm centered on this pixel. The SNR of an RF was computed as the peak-to-noise ratio, where the power of noise was estimated in regions with distance >0.5 mm from the point Pvar. Only RFs with a peak SNR > 15 dB were kept for analysis (31,135 selected RFs of 37,086 recorded neurons). The location of the RF in time, Tvar, was found in a similar way; here, Tvar = argmax_τ σ²_s(τ), where σ²_s(τ) is the variance across space at latency τ.
Parametrization of receptive fields
We parametrized spatiotemporal properties of the center and surround of the RFs as a sum of two 2D Gaussian distributions (Gaussians) G1 + G2. The first Gaussian represents the center of the RF; its amplitude can be either negative or positive, corresponding to an OFF or an ON cell. The amplitude of the second Gaussian is required to be of the opposite sign to model the properties of the surround. We computed a spatial representation of the RF, denoted as RF2D, as the median of the RF within a small time window around Tvar. To reduce noise, we excluded the pixels that were weakly correlated with Pvar across time. The sum of 2D Gaussians G1 + G2 was then fitted to RF2D. Each Gaussian is defined by its amplitude A, center μ = (μx, μy), widths (σx, σy) and orientation angle θ. We first fit only one Gaussian G1 to parametrize the location μ1, the size (σx1, σy1) and the orientation θ1 of the center. In the next step, we fit the sum of Gaussians, where we fixed the parameters of G1 while allowing all other parameters to be fitted anew.
Here we differentiated two types of RFs: an RF with a strong center and a weaker surround that largely overlap, and an RF where both center and surround components are strong and well separated. For the first case, we imposed a constraint such that the center of G2 is within the distance d = 2·min(σx1, σy1) from the center μ1. We implemented this constraint as a penalty sigmoid function of the distance between the locations of the center and surround components. We added this penalty to the Gaussian mixture and allowed it to be prohibitively large for dist(G1, G2) > d. Encoding the constraint in this manner allowed us to remove the bias for the location of the surround on the diagonals of the RF2D, which otherwise happens when fitting the Gaussian mixture on the square bound-constrained region. There were no constraints for the second type of RFs, where the surround component is strong and more distant from the center. To decide the type of the RF, we found the maximum and minimum points of the RF, and we computed the distance between them and the ratio of their absolute values. If the ratio of the smaller to the bigger value was less than 0.75 or if the distance between the extrema points was less than d, then we classified such an RF as the first type, and as the second type otherwise. Experimentally, we found that imposing such a constraint on the location of the second Gaussian leads to a better fitted sum of Gaussians for RFs with largely overlapping center and surround components. All the fitting procedures were implemented using the nonlinear least-squares solver lsqcurvefit in MATLAB.
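A minimal sketch of the two-stage center–surround fit (the original used MATLAB's lsqcurvefit; here scipy.optimize.least_squares stands in, and the sigmoid distance penalty for the first RF type is omitted):

import numpy as np
from scipy.optimize import least_squares

def gauss2d(params, xx, yy):
    """Single elliptical 2D Gaussian defined by amplitude, center, widths and orientation."""
    A, mx, my, sx, sy, theta = params
    xr = (xx - mx) * np.cos(theta) + (yy - my) * np.sin(theta)
    yr = -(xx - mx) * np.sin(theta) + (yy - my) * np.cos(theta)
    return A * np.exp(-(xr ** 2 / (2 * sx ** 2) + yr ** 2 / (2 * sy ** 2)))

def fit_center_surround(rf2d):
    ny, nx = rf2d.shape
    yy, xx = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")
    iy, ix = np.unravel_index(np.argmax(np.abs(rf2d)), rf2d.shape)
    a0 = rf2d[iy, ix]

    # Stage 1: fit the center Gaussian G1 alone.
    res1 = least_squares(lambda p: (gauss2d(p, xx, yy) - rf2d).ravel(),
                         x0=[a0, ix, iy, 2.0, 2.0, 0.0])
    g1 = res1.x

    # Stage 2: fix G1 and fit the surround Gaussian G2 with opposite-sign amplitude.
    def resid(p):
        a2 = -np.sign(g1[0]) * abs(p[0])              # surround amplitude has opposite sign
        g2 = [a2, p[1], p[2], p[3], p[4], p[5]]
        return (gauss2d(g1, xx, yy) + gauss2d(g2, xx, yy) - rf2d).ravel()

    res2 = least_squares(resid, x0=[abs(a0) / 4, g1[1], g1[2], 3 * g1[3], 3 * g1[4], g1[5]])
    g2 = np.array([-np.sign(g1[0]) * abs(res2.x[0]), *res2.x[1:]])
    return g1, g2

# example on a synthetic OFF-center / ON-surround RF
rng = np.random.default_rng(0)
yy, xx = np.meshgrid(np.arange(41), np.arange(41), indexing="ij")
truth = gauss2d([-1.0, 20, 20, 3, 3, 0], xx, yy) + gauss2d([0.3, 20, 18, 8, 8, 0], xx, yy)
g1, g2 = fit_center_surround(truth + 0.02 * rng.standard_normal((41, 41)))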
Using this parametrization, we computed various RF characteristics. We found two sets of pixels corresponding to the center and the surround. Center pixels are the pixels within two standard deviations from the center of G1. Surround pixels are the pixels within two standard deviations from the center of G2 that are not center pixels. The center size is the number of pixels in the center set, converted to mm2 for display. The relative surround strength is the ratio of the absolute value of the sum of surround pixels to the absolute value of the sum of center pixels. Vertical surround asymmetry is defined as (u − l)/(u + l), where u and l denote the absolute values of the sums of pixels in the upper and lower halves of the surround pixels, respectively. The distance between the center and the surround is the distance between the center of mass (COM) of the center pixels and the COM of the surround pixels. The orientation is the angle between the horizontal axis and the line connecting the two COMs. Radial profiles were computed as the average values of the pixels in RF2D within rings of increasing radii centered on the point Pvar. The average value of the center or surround pixels across time was taken to be the RF center or surround temporal dynamics, respectively. The R² goodness of fit was computed as
R² = 1 − Σ_(x,y) (RF2D(x,y) − M(x,y))² / Σ_(x,y) (RF2D(x,y) − mean(RF2D))²
where M = G1 + G2 is the RF parametrization. The values of the above RF parameters are reported for a few representative neurons in Extended Data Fig. 4.
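A sketch of how the main characteristics can be read out from the two fitted Gaussians of the previous sketch, with the surround midline approximated by the mean row of the center pixels (an assumption on our part; the 'upper' direction depends on how the RF array is oriented):

import numpy as np

def rf_characteristics(rf2d, g1, g2):
    """Relative surround strength, center size and vertical surround asymmetry
    from a spatial RF and the two fitted Gaussians (g = [A, mx, my, sx, sy, theta])."""
    ny, nx = rf2d.shape
    yy, xx = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")

    def within_2sd(g):
        _, mx, my, sx, sy, theta = g
        xr = (xx - mx) * np.cos(theta) + (yy - my) * np.sin(theta)
        yr = -(xx - mx) * np.sin(theta) + (yy - my) * np.cos(theta)
        return (xr / (2 * sx)) ** 2 + (yr / (2 * sy)) ** 2 <= 1.0

    center_px = within_2sd(g1)                       # pixels within 2 s.d. of G1
    surround_px = within_2sd(g2) & ~center_px        # within 2 s.d. of G2, excluding the center

    center_size = int(center_px.sum())               # in pixels
    rel_surround = abs(rf2d[surround_px].sum()) / abs(rf2d[center_px].sum())

    mid = yy[center_px].mean()                       # midline: mean row of the center pixels
    u = abs(rf2d[surround_px & (yy < mid)].sum())    # 'upper' half of the surround
    l = abs(rf2d[surround_px & (yy >= mid)].sum())   # 'lower' half of the surround
    asymmetry = (u - l) / (u + l) if (u + l) > 0 else 0.0
    return center_size, rel_surround, asymmetry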
Retina alignment
All functional imaging experiments were performed with randomized retina orientations. For retinas that were co-stained against S-opsin and RCaMP1.07 (n = 6 retinas), the direction with the highest density of S-opsin was taken to be the ventral direction58. The stitched maximal projection images of the functional imaging experiments were aligned to the RCaMP channel using the retinal vasculature. In each of these retinas, we observed that a streak of asymmetric surrounds was consistently present across the dorsal retina. Thus, we assumed this to be a reproducible feature, which we then used to manually align the remaining n = 5 retinas that did not have an S-opsin staining (and hence no ground truth orientation). To avoid any potential circular arguments, we present the location of neurons from these five retinas only in Extended Data Fig. 7, with a clear indication that the retinal orientation is presumed.
The coordinates of cells from each retina were then translated to make the optic nerve the zero of the coordinate system and rotated such that the positive y axis denoted the ventral direction. Left retinas were flipped in the nasotemporal axis such that the positive x axis denoted the nasal direction for all retinas. All spatial RFs were also translated and rotated accordingly. The Cartesian retinal coordinates of cells in the stained retinas were converted to spherical visual coordinates using the R package Retistruct, assuming the optical axis of the mouse eye to be oriented at an azimuth of 64° and an elevation of 22° from the long axis of the animal31.
Binning of receptive field properties
For 2D bins, the retinal space from −1,500 μm to 1,500 μm in both nasotemporal and dorsoventral axes was divided into a square grid and all neurons within each bin were collected. Only 2D bins with at least five neurons were analyzed to minimize sampling bias. The spatial RF values of all neurons within each bin of 300 μm in size were averaged and plotted at the location of the bin in Fig. 3a and Extended Data Fig. 6. The RF parameter values of neurons within each bin of size 50 μm were averaged to yield a 2D map of the parameter, and this map was visualized (without smoothing) in Fig. 3d–h. Owing to its area-preserving property, the sinusoidal projection of visual space was binned in the same way as the retinal surface, and the fraction of cells in each bin that had ventrally oriented surrounds were plotted in Fig. 3k.
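A sketch of this 2D binning of a single RF parameter, using random cell positions and a fabricated dorsoventral gradient purely for illustration (bin size and the five-cell minimum follow the values above):

import numpy as np

def bin_parameter_2d(x_um, y_um, values, bin_um=50, lim=1500, min_cells=5):
    """Average one RF parameter in square bins across the retinal surface."""
    edges = np.arange(-lim, lim + bin_um, bin_um)
    n = len(edges) - 1
    grid = np.full((n, n), np.nan)
    ix = np.digitize(x_um, edges) - 1
    iy = np.digitize(y_um, edges) - 1
    for i in range(n):
        for j in range(n):
            sel = (ix == i) & (iy == j)
            if sel.sum() >= min_cells:             # only bins with at least five cells
                grid[j, i] = values[sel].mean()
    return grid, edges

# example: random cell positions with an invented ventral increase in surround strength
rng = np.random.default_rng(4)
x = rng.uniform(-1500, 1500, 5000)
y = rng.uniform(-1500, 1500, 5000)
surround_strength = 0.2 + 0.0002 * (y + 1500) + 0.05 * rng.standard_normal(5000)
grid, edges = bin_parameter_2d(x, y, surround_strength)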
For 1D analysis, the bins were defined along the dorsoventral or nasotemporal axis based on the range of coordinates in a particular group (by retina (Fig. 3) or cluster (Fig. 4)). The range was divided into six equally spaced bins and the mean parameter value of neurons within each bin was plotted at the coordinate of the center of the bin. As a summary statistic, two-sample Kolmogorov–Smirnov tests were performed between ventral and dorsal samples of these binned values (n = 56 (Fig. 3) and n = 60 (Fig. 4) samples). Two-sample Kolmogorov–Smirnov tests (with Bonferroni correction for multiple comparisons) were also performed between raw values of RF parameters for Extended Data Fig. 6. In addition, the weights of linear regressions of RF parameters in elevation (dorsoventral) and azimuth (nasotemporal) orientations are reported.
Clustering into functional types
The GMM procedure developed in ref. 20 was used for clustering temporal RFs and chirp responses, separately. In brief, after normalization, the trace was first reduced in dimension using PCA (10 components for temporal RFs and 20 components for chirp responses) and then GMM models with diagonal covariance matrices were fitted while increasing the number of clusters. The numbers of clusters were identified to be 10 for temporal RFs and 20 for chirp responses based on elbow points in the respective Bayesian information criteria curves. One chirp cluster (n = 68 cells) lacked stimulus-evoked responses and was discarded on visual inspection. To assess the degree of overlap between the RFs of neurons belonging to each of the clusters, we defined the tiling index (TI) of the cluster K as the area of the union of all RF centers in a cluster, divided by the sum of their individual areas:
TI(K) = Area(∪_{i∈K} C_i) / Σ_{i∈K} Area(C_i)
where C_i denotes the RF center of cell i in cluster K.
The value of this index for each cluster was computed separately for each retina and the mean and s.d. across retinas are reported in Supplementary Table 1.
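A sketch of this tiling index, approximating RF centers as circles on a rasterized grid (the rasterized union is our stand-in for whatever geometric routine was actually used; parameters are illustrative):

import numpy as np

def tiling_index(centers_um, radii_um, extent=3000, px=5):
    """Area of the union of all RF centers divided by the sum of their individual areas."""
    n = int(2 * extent / px)
    yy, xx = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    xx = xx * px - extent
    yy = yy * px - extent
    union = np.zeros((n, n), dtype=bool)
    summed_area = 0.0
    for (cx, cy), r in zip(centers_um, radii_um):
        disk = (xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2   # one circular RF center
        union |= disk
        summed_area += disk.sum() * px ** 2
    union_area = union.sum() * px ** 2
    return union_area / summed_area          # 1 = no overlap (perfect tiling), <1 = overlap

# example: a loose mosaic of 30 RF centers with a 120-um radius
rng = np.random.default_rng(5)
centers = rng.uniform(-1000, 1000, (30, 2))
ti = tiling_index(centers, radii_um=np.full(30, 120.0))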
Viral eye injections
For viral-mediated gene transfer, 6- to 11-week-old wild-type C57BL/6J mice (JAX, 000664) were anesthetized with ketamine/xylazine by i.p. injection. A 1/2-inch, 30-gauge needle was used to make a small hole in the temporal eye, below the cornea. Then, 1 μl of vitreous fluid was withdrawn and 1 μl of AAV2.7M8-syn-GCaMP8m viral vector solution (at a titer of ~1 × 10¹³ genome copies per ml, ISTA viral facility) was injected into the subretinal space with a Hamilton syringe and a 33-gauge blunt-ended needle.
Mouse surgery for in vivo imaging
Two to three weeks after viral eye injections, mice were injected with meloxicam (20 mg per kg body weight, subcutaneous (s.c.), 3.125 mg ml−1 solution) and dexamethasone (0.2 mg per kg body weight i.p., 0.02 mg ml−1 solution). Anesthesia was induced by 2.5% isoflurane in oxygen in an anesthesia chamber. The mouse was subsequently fixed in a stereotaxic device (Kopf) with a constant isoflurane supply at 0.7% to 1.2% in O2 and body temperature controlled by a heating pad to 37.5 °C. After confirming that reflexes had subsided, the cranium was exposed and cleaned of periosteum and connective tissue. A circular craniotomy of 4 mm in diameter was drilled above the left SC and from this point onwards, the exposed brain was constantly irrigated with artificial cerebrospinal fluid. The exposed dura mater was removed, and subsequently, the left transverse sinus was sutured twice with 9-0 monofil surgical suture material (B. Braun) and cut between the sutures. Cortical areas covering the left SC were aspirated with a cell culture vacuum pump (Accuris) connected to a blunt needle of 0.5 mm in diameter. A 3-mm circular coverslip was glued (Norland optical adhesives 61) to a thin-walled custom-made conical ring, made from stainless steel. The coverslip ring was inserted into the cavity left by the aspirated cortex, so that the glass was sitting flush on the surface of the SC. Slight pressure was applied with the help of a thinned toothpick, fixed to the stereotaxic arm. The space around the insert was filled with Dura-Gel (Cambridge Neurotech) and the insert was fixed in place with VetBond (3M). After cleaning and drying the surrounding cranium, a multilayer of glues was applied. First, to provide adhesion to the bone, All-in-One Optibond (Kerr) was applied and hardened by blue light (B.A. Optima 10). Second, Charisma Flow (Kulzer) was applied to cover the exposed bone and fix the metal ring in place by also applying blue light. After removal of the fixation toothpick, a custom-designed and manufactured (RPD) headplate, selectively laser-sintered from the medical alloy TiAl6V4 (containing a small bath chamber and micro-ridges for repeatable fixation in the setup), was positioned in place and glued to the Charisma on the cranium with Paladur (Kulzer). Mice were given 300 µl of saline and 20 mg per kg body weight meloxicam (s.c.), before removing them from the stereotaxic frame and letting them wake up while keeping them warm on a heating pad. Another dose of 20 mg per kg body weight meloxicam s.c. and 0.2 mg per kg body weight i.p. dexamethasone was further injected 24 h after conclusion of the surgery. After the implantation surgery, animals were allowed to recover for 1 week.
In vivo visual stimulation and eye movements
Mice were head-fixed while awake using a custom-manufactured clamp, connected to a three-axis motorized stage (8MT167-25LS, Standa). Mice could run freely on a custom-designed spherical treadmill (20-cm diameter). Visual stimuli were projected by a modified LightCrafter (Texas Instruments) at 60 Hz, reflected by a quarter-sphere mirror (Modulor) below the mouse and presented on a custom-made spherical dome (80 cm in diameter) with the mouse’s head at its center. The green and blue LEDs in the projector were replaced by cyan (LZ1-00DB00-0100, Osram) and UV (LZ1-00UB00-01U6, Osram) LEDs respectively. A double bandpass filter (387/480 HD Dualband Filter, Semrock) was positioned in front of the projector to not contaminate the imaging. The reflected red channel of the projector was captured by a transimpedance photo-amplifier (PDA36A2, Thorlabs) and digitized for synchronization. Cyan and UV LED powers were adjusted so that the reflectance on the screen matched the relative excitation of M-cones and S-cones during an overcast day, determined and calibrated using opsin templates52 and a spectrometer (CCS-100, Thorlabs). Stimuli were designed and presented with Psychtoolbox (version 3)55, running on MATLAB 2020b (MathWorks). Stimulus frames were morphed on the GPU using a customized projection map and an OpenGL shader to counteract the distortions resulting from the spherical mirror and dome. The dome setup allows the presentation of mesopic stimuli from circa 100° on the left to circa 135° on the right in azimuth and from circa 50° below to circa 50° above the equator in elevation.
Visual stimuli were similar to those in the ex vivo retinal imaging experiments: a shifting spatiotemporal white-noise stimulus was presented using a binary pseudorandom sequence, in which the two primary lights (cyan and UV) varied dependently. All pseudo white-noise stimuli were presented at a 5-Hz update in 5-min episodes, interleaved with different stimuli (for example, gray screen, moving gratings (not shown)) with a total pseudo white-noise duration of 15–60 min (median of 25 min) per recording. The checker size was a visual angle of 8 × 8° and the entire grid was shifted by random multiples of a 0.4° visual angle along both the elevation and azimuth axes after every frame. Eye movements of the right eye were recorded with a camera (Basler acA1920-150um, 18–108 mm macro zoom lens (MVL7000, ThorLabs), set at 100 mm, and infrared illumination of 830 nm) via an infrared mirror at 50 frames per second.
In vivo retinal terminal imaging
Two-photon axonal terminal imaging was performed on a custom-built microscope, controlled by ScanImage (Vidrio Technologies) running on MATLAB 2020b (MathWorks) and a PXI system (National Instruments). The beam from a pulsed Ti:Sapphire laser (Mai-Tai DeepSee, Spectra-Physics) was scanned by a galvanometric-resonant (8 kHz) mirror combination (Cambridge Scientific) and expanded to underfill the back aperture of the objective (×16, 0.8-NA water immersion, Nikon), yielding a 1.9 × 1.9-mm FOV at a 30-Hz frame rate. Fast volumetric imaging was performed with a piezo actuator (P-725.4CA, Physik Instrumente). Emitted light was collected (FF775-Di01, Semrock), split (580 nm long-pass, FF580-FDi01, Semrock), bandpass filtered (green, FF03-525/50; red, FF01-641/75; Semrock), measured (GaAsP photomultiplier tubes, H10770B-40, Hamamatsu), amplified (TIA60, Thorlabs) and digitized (PXIe-7961R NI FlexRIO FPGA, NI 5734 16-bit, National Instruments). The laser wavelength was set between 920 and 950 nm. Average laser output power at the objective ranged from 57 to 101 mW (median of 69 mW)56. A FOV of 0.32–1.85 mm2 (median of 0.68 mm2) was imaged over 3–7 planes (median of 6 planes) with a plane distance of 14–40 µm (median of 25 µm) at a pixel size of 0.6–1.9 µm (median of 1.3 µm) and a volume rate of 4.3–9.5 Hz (median of 5.0 Hz). Each mouse was recorded in 2–4 imaging sessions on different days. In a subset of mice (n = 2), in separate imaging sessions, the absence of substantial z-motion was verified by injecting 40 µl of Texas Red dextran (3000 MW, X 14.3 mg ml−1, diluted in saline, Thermo Fisher Scientific) s.c. and imaging brightly red-labeled blood vessels at 980 nm59.
In vivo eye movement analysis
Behavior videos were analyzed with DeepLabCut60, labeling eight points around the pupil. An ellipse was then fitted to the eight points, and the ellipse center position was transformed to rotational coordinates under the assumption of an eyeball radius of 1.5 mm61, using custom Python scripts. The median of all eye positions was set to zero azimuth and elevation, that is, all eye coordinates were relative to the median position. The individual horizontal axis, which varied slightly between mice due to differences in the positioning of the head plate, was corrected by leveraging a behavioral feature of head-fixed mice: saccadic movements occur almost exclusively in one plane35. Saccades were extracted by detecting events of fast position changes on a median-filtered position trace (median filter window of 0.7 s, minimal saccadic speed of 45° per second, minimal saccade amplitude of 3°, minimal saccade interval of 0.25 s). The preferred saccadic orientation and orientation tuning were determined in a similar fashion as for neuronal visual orientation tuning, based on circular variance:

$$\bar{r}\, e^{2i\bar{\alpha}} = \frac{\sum_t r_t\, e^{2i\alpha_t}}{\sum_t r_t}$$

with ᾱ as the saccadic orientation angle, r̄ as the saccadic orientation tuning, r_t as the amplitude and α_t as the direction of saccade t (Extended Data Fig. 9a–c). Saccade orientation tuning was very high, with a mean selectivity of 0.8.
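A minimal Python sketch of this saccade extraction and of the amplitude-weighted circular statistic is given below. The event-segmentation logic is a simplification and the function names are not those of the original analysis scripts; only the numerical thresholds are taken from the text.

```python
import numpy as np
from scipy.signal import medfilt

def detect_saccades(az, el, fs, win_s=0.7, min_speed=45.0, min_amp=3.0, min_isi=0.25):
    """Median-filter the eye-position traces, find contiguous runs of above-threshold
    speed and keep events satisfying the amplitude and inter-saccade-interval criteria."""
    k = max(3, int(win_s * fs) // 2 * 2 + 1)          # odd median-filter kernel
    az_f, el_f = medfilt(az, k), medfilt(el, k)
    speed = np.hypot(np.diff(az_f), np.diff(el_f)) * fs
    fast = speed > min_speed

    amps, dirs, times = [], [], []
    i, last_t = 0, -np.inf
    while i < fast.size:
        if fast[i]:
            j = i
            while j < fast.size and fast[j]:
                j += 1                                 # end of the fast segment
            d_az, d_el = az_f[j] - az_f[i], el_f[j] - el_f[i]
            amp, t = np.hypot(d_az, d_el), i / fs
            if amp >= min_amp and (t - last_t) >= min_isi:
                amps.append(amp)
                dirs.append(np.arctan2(d_el, d_az))    # saccade direction (rad)
                times.append(t)
                last_t = t
            i = j
        else:
            i += 1
    return np.array(amps), np.array(dirs), np.array(times)

def orientation_tuning(amps, dirs):
    """Amplitude-weighted circular mean on doubled angles (orientation, not direction):
    rbar * exp(2i*alphabar) = sum_t r_t exp(2i*alpha_t) / sum_t r_t."""
    z = np.sum(amps * np.exp(2j * dirs)) / np.sum(amps)
    return np.abs(z), np.angle(z) / 2.0                # tuning rbar, preferred orientation
```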
In vivo axonal terminal analysis
Functional calcium imaging data were first analyzed with suite2p (v0.10.0)62 for motion correction and ROI extraction. ROIs were then curated manually based on their morphology and activity. Further analysis was performed with custom scripts in MATLAB R2021a (MathWorks): dF/F0 was estimated based on published procedures63 by first subtracting neuropil contamination (from suite2p, fluorescence signal of 350 pixels surrounding the ROI, excluding other ROIs) with a factor of 0.5 (estimated from the fluorescence of small capillaries, as reported previously). From the neuropil-corrected ROI fluorescence, the baseline F0 was defined as the 8th percentile of a moving window of 15 s64. dF/F0 was then calculated by first subtracting and then dividing the fluorescence trace by the median of the same 15-s window63. Fluorescence SNR was defined for each neuron by dividing the 99th percentile of the dF/F0 trace (‘signal’) by the standard deviation of its negative values after baseline correction (‘noise’). Only axonal segments with a fluorescence SNR ≥ 5 were included in further analysis. The deconvolved signal from suite2p (with tau = 0.7 s) was used for calculating RFs. Note that multiple axonal ROIs can originate from the same RGC. Spatiotemporal RF analysis for in vivo retinal terminals was conducted as for ex vivo RGC imaging, but on visual stimuli downsampled to a resolution of a 1° visual angle. The resulting 50 × 50° RFs were contaminated by eye movements and exhibited a lower SNR (determined as the temporal variance of the most varying pixel over the temporal variance of pixels more than 50° of visual angle away) than ex vivo soma recordings, requiring further inclusion criteria: SNR > 15 (as in ex vivo data) and peak variance over time located at tau values between −0.1 and 0.6 s. Additionally, the retinotopic projection pattern of RGCs to the SC was utilized by fitting a map from visual coordinates to collicular space. For each recording, RF center azimuth and elevation values were fitted separately to the location of the ROI in the SC using the ‘poly22’ fit option in MATLAB, using only the 15% of ROIs with the highest SNR and with SNR as the fitting weight. Boutons whose RF peak location deviated by more than a 20° visual angle from the location expected from the retinotopy fits were removed from further analysis (828 boutons removed).
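The dF/F0 and fluorescence-SNR computation can be sketched as follows. This is a simplified Python illustration, not the original MATLAB code; window edge handling is approximate, and the rolling 8th-percentile baseline is used both to subtract and to normalize, whereas the original procedure normalizes by the window statistic described in the text.

```python
import numpy as np

def dff_and_snr(f_roi, f_neuropil, fs, r=0.5, win_s=15.0, q=8):
    """Neuropil-corrected dF/F0 and fluorescence SNR for one ROI trace (sketch)."""
    f = f_roi - r * f_neuropil                        # neuropil correction, factor 0.5
    half = int(win_s * fs) // 2
    f0 = np.array([np.percentile(f[max(0, i - half):i + half + 1], q)
                   for i in range(f.size)])           # rolling baseline F0 (8th percentile)
    dff = (f - f0) / f0
    signal = np.percentile(dff, 99)                   # 'signal': 99th percentile of dF/F0
    noise = np.std(dff[dff < 0])                      # 'noise': s.d. of negative values
    return dff, signal / noise

# Inclusion criterion from the text: keep only segments with fluorescence SNR >= 5.
# dff, snr = dff_and_snr(f_roi, f_neuropil, fs=5.0); keep = snr >= 5
```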
The main saccadic axis was used to rotate the computed spatial RFs around their respective centers, and the RF center positions in visual space were rotated spherically around the approximate eye axis (65° from the frontal direction in the horizontal plane). Finally, while freely moving mice hold their head at an approximate pitch angle of 30° downwards65, in vivo imaging only allowed a pitch angle of 10° downwards. To compensate, the center positions of all RFs were spherically rotated 20° downwards around the main pitch axis (90° from the frontal direction in the horizontal plane). Note that these calculations only provide an estimate of the position of the horizon during free locomotion.
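A minimal sketch of this 20° pitch compensation, assuming an x-frontal, y-lateral, z-up coordinate frame (the frame choice and function name are illustrative, not taken from the original code):

```python
import numpy as np

def rotate_pitch(azim_deg, elev_deg, pitch_down_deg=20.0):
    """Spherically rotate an RF center position downward around the lateral (pitch) axis."""
    az, el, th = np.radians(azim_deg), np.radians(elev_deg), np.radians(pitch_down_deg)
    # (azimuth, elevation) -> unit vector in the x-frontal / y-lateral / z-up frame
    v = np.array([np.cos(el) * np.cos(az),
                  np.cos(el) * np.sin(az),
                  np.sin(el)])
    # rotation about the y-axis; a positive angle moves the frontal direction downward
    R = np.array([[ np.cos(th), 0.0, np.sin(th)],
                  [ 0.0,        1.0, 0.0       ],
                  [-np.sin(th), 0.0, np.cos(th)]])
    x, y, z = R @ v
    return np.degrees(np.arctan2(y, x)), np.degrees(np.arcsin(z))

# An RF centered straight ahead at the equator maps to 20 deg below it:
# rotate_pitch(0.0, 0.0) -> (0.0, -20.0)
```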
To avoid biasing the analyses by eye movements, RF parametrization was conducted on mean vertical 1D profiles of the extracted 50 × 50° 2D RF crops at the peak azimuthal position ± 1°, with ON-center bouton RFs inverted. To extract parameters from the 1D RFs, they were fitted with a difference of two Gaussians, initialized with the central peak magnitude (M_peak) and width (W_peak). The fitting procedure was constrained with amplitude_center ∈ [M_peak/2, inf], amplitude_surround ∈ [0, inf], location_center ∈ [−20, 20]°, location_surround ∈ [−25, 25]° (edge of crop), sigma_center ∈ [W_peak/4, inf] and sigma_surround ∈ [W_peak, inf]. Boutons with a center fit location in the 1D RFs more than 5°, or a surround fit location more than 25°, distant from the peak estimated from variance in the 2D RFs were excluded from further analysis (1,609 boutons removed). Extraction of parameters was identical to the ex vivo RF parametrization, except for center size, for which 2·sigma_center of the fit was used in vivo.
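A minimal Python sketch of such a constrained difference-of-Gaussians fit is shown below. The bounds and initialization follow the values given above; the derived quantities at the end (for example, the surround-to-center amplitude ratio) are illustrative and not necessarily the parameters reported in the paper.

```python
import numpy as np
from scipy.optimize import curve_fit

def diff_of_gaussians(x, a_c, mu_c, sig_c, a_s, mu_s, sig_s):
    """Center-minus-surround difference of two Gaussians."""
    return (a_c * np.exp(-(x - mu_c) ** 2 / (2 * sig_c ** 2))
            - a_s * np.exp(-(x - mu_s) ** 2 / (2 * sig_s ** 2)))

def fit_1d_rf(x_deg, rf, m_peak, w_peak):
    """Constrained DoG fit of a 1D RF profile (sketch of the procedure, not the exact code)."""
    p0    = [m_peak,      0.0,   w_peak,     0.1 * m_peak, 0.0,   2 * w_peak]
    lower = [m_peak / 2, -20.0,  w_peak / 4, 0.0,         -25.0,  w_peak]
    upper = [np.inf,      20.0,  np.inf,     np.inf,       25.0,  np.inf]
    popt, _ = curve_fit(diff_of_gaussians, x_deg, rf, p0=p0, bounds=(lower, upper))
    a_c, mu_c, sig_c, a_s, mu_s, sig_s = popt
    return {"center_size_deg": 2 * sig_c,       # in vivo center size = 2 * sigma_center
            "surround_to_center": a_s / a_c,    # one possible relative-surround measure
            "params": popt}
```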
To present the RF characteristics of RGC axonal boutons in the SC, the centered 1D RFs were binned and averaged within each bin. RF parameters varying over elevation and azimuth are presented as the parameters of the fit to the mean 1D RF of the respective bin. Linear regression weights were computed from the parameters and location of each individual bouton.
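A short sketch of this presentation step, assuming a placeholder bin width; the binning here averages the centered 1D RFs (the DoG fit would then be re-run on each bin average), and the regression weights are estimated by ordinary least squares across individual boutons.

```python
import numpy as np

def bin_rfs_and_regress(elev_deg, azim_deg, rfs_1d, param, bin_width=10.0):
    """Average centered 1D RFs within elevation bins and regress an RF parameter
    against bouton location (sketch; bin width is a placeholder value)."""
    edges = np.arange(np.floor(elev_deg.min()), elev_deg.max() + bin_width, bin_width)
    which = np.digitize(elev_deg, edges)
    bin_mean_rfs = {k: rfs_1d[which == k].mean(axis=0) for k in np.unique(which)}

    # per-bouton linear regression: param ~ w0 + w_elev * elevation + w_azim * azimuth
    X = np.column_stack([np.ones_like(param), elev_deg, azim_deg])
    weights, *_ = np.linalg.lstsq(X, param, rcond=None)
    return bin_mean_rfs, weights
```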
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at 10.1038/s41593-023-01280-0.
Acknowledgements
We thank H. Asari for sharing the dataset of naturalistic images, Y. B. Simon for initial explorative work with the generation of AAVs, R. Beattie for the custom retina and T. Vega-Zuñiga for help with immunostainings. We also thank G. Tkačik and members of the Neuroethology group for their comments on the manuscript. This research was supported by the Scientific Service Units of IST Austria through resources provided by Scientific Computing, MIBA Machine Shop, the Preclinical Facility, the Lab Support Facility and the Imaging and Optics Facility. This work was supported by European Union Horizon 2020 Marie Skłodowska-Curie grant 665385 (to D.G.), Austrian Science Fund (FWF) stand-alone grant P 34015 (to W.M.), Human Frontiers Science Program LT000256/2018-L (to A.S.), EMBO ALTF 1098-2017 (to A.S.) and the European Research Council Starting Grant 756502 (to M.J.).
Author contributions
D.G., W.M. and M.J. designed the study. D.G. and J.S. performed the retina imaging experiments with help from M.J. D.G. and M.J. developed the retinal imaging system. J.S. and M.J. developed the shifting stimuli. A.S. performed the in vivo imaging experiments and developed the imaging system with help from M.J. D.G., A.S. and O.S. analyzed the experimental data with inputs from W.M. and M.J. W.M. analyzed natural scene statistics and developed the normative model. D.G., W.M. and M.J. wrote the manuscript with help of the other authors.
Peer review information
Nature Neuroscience thanks Philipp Berens, Olivier Marre and Stephanie Palmer for their contribution to the peer review of this work.
Data availability
Data used in the analysis can be found at ISTA data repository: 10.15479/AT:ISTA:12370.
Code availability
Code used to generate the results is available at GitHub: https://github.com/joesch-lab/panoramic-retina/.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Divyansh Gupta, Wiktor Młynarski, Anton Sumser.
Extended data is available for this paper at 10.1038/s41593-023-01280-0.
Supplementary information
The online version contains supplementary material available at 10.1038/s41593-023-01280-0.
References
- 1. Shannon CE. A mathematical theory of communication. Bell Syst. Tech. J. 1948;27:379–423. doi: 10.1002/j.1538-7305.1948.tb01338.x.
- 2. Barlow, H. B. Possible principles underlying the transformation of sensory messages. In Sensory Communication (ed. W. A. Rosenblith) (MIT Press, 1961).
- 3. Attneave F. Some informational aspects of visual perception. Psychol. Rev. 1954;61:183–193. doi: 10.1037/h0054663.
- 4. Laughlin S. A simple coding procedure enhances a neuron’s information capacity. Z. Naturforsch. C. Biosci. 1981;36:910–912. doi: 10.1515/znc-1981-9-1040.
- 5. Nemenman I, Lewen GD, Bialek W, de Ruyter van Steveninck RR. Neural coding of natural stimuli: information at sub-millisecond resolution. PLoS Comput. Biol. 2008;4:e1000025. doi: 10.1371/journal.pcbi.1000025.
- 6. Fairhall AL, Lewen GD, Bialek W, de Ruyter van Steveninck RR. Efficiency and ambiguity in an adaptive neural code. Nature. 2001;412:787–792. doi: 10.1038/35090500.
- 7. Rieke F, Bodnar DA, Bialek W. Naturalistic stimuli increase the rate and efficiency of information transmission by primary auditory afferents. Proc. Biol. Sci. 1995;262:259–265. doi: 10.1098/rspb.1995.0204.
- 8. Olshausen BA, Field DJ. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature. 1996;381:607–609. doi: 10.1038/381607a0.
- 9. Rieke F, Rudd ME. The challenges natural images pose for visual adaptation. Neuron. 2009;64:605–616. doi: 10.1016/j.neuron.2009.11.028.
- 10. Srinivasan MV, Laughlin SB, Dubs A. Predictive coding: a fresh view of inhibition in the retina. Proc. R. Soc. Lond. B Biol. Sci. 1982;216:427–459. doi: 10.1098/rspb.1982.0085.
- 11. Doi E, Lewicki MS. A simple model of optimal population coding for sensory systems. PLoS Comput. Biol. 2014;10:e1003761. doi: 10.1371/journal.pcbi.1003761.
- 12. Atick JJ, Redlich AN. What does the retina know about natural scenes? Neural Comput. 1992;4:196–210. doi: 10.1162/neco.1992.4.2.196.
- 13. Karklin Y, Simoncelli EP. Efficient coding of natural images with a population of noisy linear–nonlinear neurons. Adv. Neural Inf. Process. Syst. 2011;24:999–1007.
- 14. Balasubramanian V, Sterling P. Receptive fields and functional architecture in the retina. J. Physiol. 2009;587:2753–2767. doi: 10.1113/jphysiol.2009.170704.
- 15. Roy S, Jun NY, Davis EL, Pearson J, Field GD. Inter-mosaic coordination of retinal receptive fields. Nature. 2021;592:409–413. doi: 10.1038/s41586-021-03317-5.
- 16. Ratliff CP, Borghuis BG, Kao Y-H, Sterling P, Balasubramanian V. Retina is structured to process an excess of darkness in natural scenes. Proc. Natl Acad. Sci. USA. 2010;107:17368–17373. doi: 10.1073/pnas.1005846107.
- 17. Chalk M, Marre O, Tkačik G. Toward a unified theory of efficient, predictive, and sparse coding. Proc. Natl Acad. Sci. USA. 2018;115:186–191. doi: 10.1073/pnas.1711114115.
- 18. Abballe L, Asari H. Natural image statistics for mouse vision. PLoS ONE. 2022;17:e0262763. doi: 10.1371/journal.pone.0262763.
- 19. Qiu, Y. et al. Natural environment statistics in the upper and lower visual field are reflected in mouse retinal specializations. Curr. Biol. (2021). doi: 10.1016/j.cub.2021.05.017.
- 20. Baden T, et al. The functional diversity of retinal ganglion cells in the mouse. Nature. 2016;529:345–350. doi: 10.1038/nature16468.
- 21. Sümbül U, et al. A genetic and computational approach to structurally classify neuronal types. Nat. Commun. 2014;5:3512. doi: 10.1038/ncomms4512.
- 22. Helmstaedter M, et al. Connectomic reconstruction of the inner plexiform layer in the mouse retina. Nature. 2013;500:168–174. doi: 10.1038/nature12346.
- 23. Macosko EZ, et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell. 2015;161:1202–1214. doi: 10.1016/j.cell.2015.05.002.
- 24. Masland RH. The neuronal organization of the retina. Neuron. 2012;76:266–280. doi: 10.1016/j.neuron.2012.10.002.
- 25. Baylor DA, Nunn BJ, Schnapf JL. The photocurrent, noise and spectral sensitivity of rods of the monkey Macaca fascicularis. J. Physiol. 1984;357:575–607. doi: 10.1113/jphysiol.1984.sp015518.
- 26. Schneeweis DM, Schnapf JL. The photovoltage of macaque cone photoreceptors: adaptation, noise, and kinetics. J. Neurosci. 1999;19:1203–1216. doi: 10.1523/JNEUROSCI.19-04-01203.1999.
- 27. Sabbah S, et al. A retinal code for motion along the gravitational and body axes. Nature. 2017;546:492–497. doi: 10.1038/nature22818.
- 28. Bethge P, et al. An R-CaMP1.07 reporter mouse for cell-type-specific expression of a sensitive red fluorescent calcium indicator. PLoS ONE. 2017;12:e0179460. doi: 10.1371/journal.pone.0179460.
- 29. Tran NM, et al. Single-cell profiles of retinal ganglion cells differing in resilience to injury reveal neuroprotective genes. Neuron. 2019;104:1039–1055. doi: 10.1016/j.neuron.2019.11.006.
- 30. Pamplona D, et al. Receptive field estimation in large visual neuron assemblies using a super-resolution approach. J. Neurophysiol. 2022;127:1334–1347. doi: 10.1152/jn.00076.2021.
- 31. Sterratt DC, Lyngholm D, Willshaw DJ, Thompson ID. Standard anatomical and visual space for the mouse retina: computational reconstruction and transformation of flattened retinae with the Retistruct package. PLoS Comput. Biol. 2013;9:e1002921. doi: 10.1371/journal.pcbi.1002921.
- 32. Nadal-Nicolás FM, et al. True S-cones are concentrated in the ventral mouse retina and wired for color detection in the upper visual field. Elife. 2020;9:e56840. doi: 10.7554/eLife.56840.
- 33. Kim I-J, Zhang Y, Yamagata M, Meister M, Sanes JR. Molecular identification of a retinal cell type that responds to upward motion. Nature. 2008;452:478–482. doi: 10.1038/nature06739.
- 34. Zhang, Y. et al. Fast and sensitive GCaMP calcium indicators for imaging neural populations. Preprint at bioRxiv (2021). doi: 10.1101/2021.11.08.467793.
- 35. Meyer AF, O’Keefe J, Poort J. Two distinct types of eye-head coupling in freely moving mice. Curr. Biol. 2020;30:2116–2130. doi: 10.1016/j.cub.2020.04.042.
- 36. Barlow HB, Fitzhugh R, Kuffler SW. Change of organization in the receptive fields of the cat’s retina during dark adaptation. J. Physiol. 1957;137:338–354. doi: 10.1113/jphysiol.1957.sp005817.
- 37. Farrow K, et al. Ambient illumination toggles a neuronal circuit switch in the retina and visual perception at cone threshold. Neuron. 2013;78:325–338. doi: 10.1016/j.neuron.2013.02.014.
- 38. Młynarski, W. F. & Hermundstad, A. M. Efficient and adaptive sensory codes. Nat. Neurosci. (2021). doi: 10.1038/s41593-021-00846-0.
- 39. Laughlin SB. The role of sensory adaptation in the retina. J. Exp. Biol. 1989;146:39–62. doi: 10.1242/jeb.146.1.39.
- 40. Joesch M, Meister M. A neuronal circuit for colour vision based on rod–cone opponency. Nature. 2016;532:236–239. doi: 10.1038/nature17158.
- 41. Baden T, et al. A tale of two retinal domains: near-optimal sampling of achromatic contrasts in natural scenes through asymmetric photoreceptor distribution. Neuron. 2013;80:1206–1217. doi: 10.1016/j.neuron.2013.09.030.
- 42. Peichl L, Künzle H, Vogel P. Photoreceptor types and distributions in the retinae of insectivores. Vis. Neurosci. 2000;17:937–948. doi: 10.1017/S0952523800176138.
- 43. Wallace DJ, et al. Rats maintain an overhead binocular field at the expense of constant fusion. Nature. 2013;498:65–69. doi: 10.1038/nature12153.
- 44. Rossi LF, Harris KD, Carandini M. Spatial connectivity matches direction selectivity in visual cortex. Nature. 2020;588:648–652. doi: 10.1038/s41586-020-2894-4.
- 45. Dräger UC, Hubel DH. Responses to visual stimulation and relationship between visual, auditory, and somatosensory inputs in mouse superior colliculus. J. Neurophysiol. 1975;38:690–713. doi: 10.1152/jn.1975.38.3.690.
- 46. de Malmazet D, Kühn NK, Farrow K. Retinotopic separation of nasal and temporal motion selectivity in the mouse superior colliculus. Curr. Biol. 2018;28:2961–2969. doi: 10.1016/j.cub.2018.07.001.
- 47. Chen H, Savier EL, DePiero VJ, Cang J. Lack of evidence for stereotypical direction columns in the mouse superior colliculus. J. Neurosci. 2021;41:461–473. doi: 10.1523/JNEUROSCI.1155-20.2020.
- 48. Heukamp AS, Warwick RA, Rivlin-Etzion M. Topographic variations in retinal encoding of visual space. Annu. Rev. Vis. Sci. 2020;6:237–259. doi: 10.1146/annurev-vision-121219-081831.
- 49. Szél Á, et al. Different patterns of retinal cone topography in two genera of rodents, Mus and Apodemus. Cell Tissue Res. 1994;276:143–150. doi: 10.1007/BF00354793.
- 50. Baden T, Euler T, Berens P. Understanding the retinal basis of vision across species. Nat. Rev. Neurosci. 2020;21:5–20. doi: 10.1038/s41583-019-0242-1.
- 51. Cadieu CF, Olshausen BA. Learning intermediate-level representations of form and motion from natural movies. Neural Comput. 2012;24:827–866. doi: 10.1162/NECO_a_00247.
- 52. Govardovskii VI, Fyhrquist N, Reuter T, Kuzmin DG, Donner K. In search of the visual pigment template. Vis. Neurosci. 2000;17:509–528. doi: 10.1017/S0952523800174036.
- 53. Carter-Dawson LD, LaVail MM. Rods and cones in the mouse retina. I. Structural analysis using light and electron microscopy. J. Comp. Neurol. 1979;188:245–262. doi: 10.1002/cne.901880204.
- 54. Penn JS, Williams TP. A new microspectrophotometric method for measuring absorbance of rat photoreceptors. Vis. Res. 1984;24:1673–1676. doi: 10.1016/0042-6989(84)90325-0.
- 55. Brainard DH. The psychophysics toolbox. Spat. Vis. 1997;10:433–436. doi: 10.1163/156856897X00357.
- 56. Peng T, et al. A BaSiC tool for background and shading correction of optical microscopy images. Nat. Commun. 2017;8:14836. doi: 10.1038/ncomms14836.
- 57. ROIMatchPub: match suite2p ROIs between imaging sessions. GitHub. https://github.com/ransona/ROIMatchPub
- 58. Sondereker, K. B., Stabio, M. E., Jamil, J. R., Tarchick, M. J. & Renna, J. M. Where you cut matters: a dissection and analysis guide for the spatial orientation of the mouse retina from ocular landmarks. J. Vis. Exp. (2018). doi: 10.3791/57861.
- 59. Ryan TM, Hinojosa AJ, Vroman R, Papasavvas C, Lagnado L. Correction of z-motion artefacts to allow population imaging of synaptic activity in behaving mice. J. Physiol. 2020;598:1809–1827. doi: 10.1113/JP278957.
- 60. Mathis A, et al. DeepLabCut: markerless pose estimation of user-defined body parts with deep learning. Nat. Neurosci. 2018;21:1281–1289. doi: 10.1038/s41593-018-0209-y.
- 61. Hammel, B. & Sullivan-Molina, N. bdhammel/least-squares-ellipse-fitting: v2.0.0. Zenodo (2020).
- 62. Pachitariu, M. et al. Suite2p: beyond 10,000 neurons with standard two-photon microscopy. Preprint at bioRxiv (2017). doi: 10.1101/061507.
- 63. Keller, G. B., Bonhoeffer, T. & Hübener, M. Sensorimotor mismatch signals in primary visual cortex of the behaving mouse. Neuron (2012). doi: 10.1016/j.neuron.2012.03.040.
- 64. Dombeck DA, Khabbaz AN, Collman F, Adelman TL, Tank DW. Imaging large-scale neural activity with cellular resolution in awake, mobile mice. Neuron. 2007;56:43–57. doi: 10.1016/j.neuron.2007.08.003.
- 65. Stabio ME, et al. A novel map of the mouse eye for orienting retinal topography in anatomical space. J. Comp. Neurol. 2018;526:1749–1759. doi: 10.1002/cne.24446.