eLife. 2024 Oct 4;13:e86860. doi: 10.7554/eLife.86860

A chromatic feature detector in the retina signals visual context changes

Larissa Höfling 1,2, Klaudia P Szatko 1,2, Christian Behrens 1, Yuyao Deng 1,2, Yongrong Qiu 1,2, David Alexander Klindt 3, Zachary Jessen 4, Gregory W Schwartz 4, Matthias Bethge 2,5, Philipp Berens 1,2,5,6, Katrin Franke 1, Alexander S Ecker 7,8, Thomas Euler 1,2
Editors: Fred Rieke9, Lois EH Smith10

Abstract

The retina transforms patterns of light into visual feature representations supporting behaviour. These representations are distributed across various types of retinal ganglion cells (RGCs), whose spatial and temporal tuning properties have been studied extensively in many model organisms, including the mouse. However, it has been difficult to link the potentially nonlinear retinal transformations of natural visual inputs to specific ethological purposes. Here, we discover a nonlinear selectivity to chromatic contrast in an RGC type that allows the detection of changes in visual context. We trained a convolutional neural network (CNN) model on large-scale functional recordings of RGC responses to natural mouse movies, and then used this model to search in silico for stimuli that maximally excite distinct types of RGCs. This procedure predicted centre colour opponency in transient suppressed-by-contrast (tSbC) RGCs, a cell type whose function is being debated. We confirmed experimentally that these cells indeed responded very selectively to Green-OFF, UV-ON contrasts. This type of chromatic contrast was characteristic of transitions from ground to sky in the visual scene, as might be elicited by head or eye movements across the horizon. Because tSbC cells performed best among all RGC types at reliably detecting these transitions, we suggest a role for this RGC type in providing contextual information (i.e. sky or ground) necessary for the selection of appropriate behavioural responses to other stimuli, such as looming objects. Our work showcases how a combination of experiments with natural stimuli and computational modelling allows discovering novel types of stimulus selectivity and identifying their potential ethological relevance.

Research organism: Mouse

Introduction

Sensory systems evolved to generate representations of an animal’s natural environment useful for survival and procreation (Lettvin et al., 1959). These environments are complex and high dimensional, and different features are relevant for different species (reviewed in Baden et al., 2020). As a consequence, the representations are adapted to an animal’s needs: features of the world relevant for the animal are represented with enhanced precision, whereas less important features are discarded. Sensory processing is thus best understood within the context of the environment an animal evolved in and that it interacts with (reviewed in Turner et al., 2019; Simoncelli and Olshausen, 2001).

The visual system is well suited for studying sensory processing, as the first features are already extracted at its experimentally well-accessible front-end, the retina (reviewed in Kerschensteiner, 2022; Baden et al., 2020). In the mouse, this tissue gives rise to around 40 parallel channels that detect different features (Goetz et al., 2022; Baden et al., 2016; Bae et al., 2018; Rheaume et al., 2018), represented by different types of retinal ganglion cells (RGCs), whose axons send information to numerous visual centres in the brain (Martersteck et al., 2017). Some of these channels encode basic features, such as luminance changes and motion, that are only combined in downstream areas to support a range of behaviours such as cricket hunting in mice (Johnson et al., 2021). Other channels directly extract specific features from natural scenes necessary for specific behaviours. For instance, transient OFF α cells trigger freezing or escape behaviour in response to looming stimuli (Münch et al., 2009; Yilmaz and Meister, 2013; Kim et al., 2020; Wang et al., 2021).

For many RGC types, however, we lack an understanding of the features they encode and how these link to behaviour (Schwartz and Swygart, 2020). One reason for this is that the synthetic stimuli commonly used to study retinal processing fail to drive retinal circuits ‘properly’ and, hence, cannot uncover critical response properties triggered in natural environments. This was recently illustrated at the example of spatial nonlinear processing, which was found to be more complex for natural scenes than for simpler synthetic stimuli (Karamanlis and Gollisch, 2021). Such nonlinearities, which are crucial for the encoding of natural stimuli, cannot be captured by linear-nonlinear (LN) models of retinal processing, and several alternative methods have been proposed for the analysis of natural stimulus responses (reviewed in Sharpee, 2013).

One approach to modelling nonlinear visual processing is to train a convolutional neural network (CNN) to predict neuronal responses. This approach has gained popularity in recent years, both in the retina (McIntosh et al., 2016; Maheswaranathan et al., 2023; Tanaka et al., 2019; Batty et al., 2017) and in higher visual areas (Yamins et al., 2014; Cadena et al., 2019; Ustyuzhaninov et al., 2024). The resulting models, also referred to as ‘digital twins’, offer a number of analysis techniques that have been used to investigate, for example, the contributions of different interneurons to a cell’s response (Maheswaranathan et al., 2023), or the effects of stimulus context (Fu et al., 2024; Goldin et al., 2022) and behavioural state (Franke et al., 2022) on neural coding. In particular, feature visualisations (Olah et al., 2017) can be used to quickly generate stimuli that would maximally excite the modelled neurons (Walker et al., 2019; Bashivan et al., 2019), which in turn can serve as interpretable short-hand descriptions of nonlinear neuronal selectivities. In visual cortex, the resulting maximally exciting inputs (MEIs) revealed more complex and diverse neuronal selectivities than were expected based on previous results obtained with synthetic stimuli and linear methods (Walker et al., 2019; Bashivan et al., 2019).

Here, we combined the power of CNN-based modelling with large-scale recordings from RGCs to investigate colour processing in the mouse retina under natural stimulus conditions. Colour is a salient feature in nature, and the mouse visual system dedicates intricate circuitry to the processing of chromatic information (Szél et al., 1992; Joesch and Meister, 2016; Baden et al., 2013; Szatko et al., 2020; Khani and Gollisch, 2021; Mouland et al., 2021). Studies using simple synthetic stimuli have revealed nonlinear and centre-surround (i.e. spatial) interactions between colour channels, but it is not clear how these are engaged in retinal processing of natural, temporally varying environments. Since mouse photoreceptors are sensitive to green and UV light (Jacobs et al., 2004), we recorded RGC responses to stimuli capturing the chromatic composition of natural mouse environments in these two chromatic channels. A model-guided search for MEIs in chromatic stimulus space predicted a novel type of chromatic tuning in tSbC RGCs, a type whose function is being debated (Mani and Schwartz, 2017; Tien et al., 2015; Tien et al., 2022).

A detailed in silico characterisation followed up by experimental validation ex vivo confirmed this cell type’s pronounced and unique selectivity for dynamic full-field changes from green-dominated to UV-dominated scenes, a type of visual input that matches the scene statistics of transitions across the horizon (Qiu et al., 2021; Abballe and Asari, 2022; Gupta et al., 2022). We therefore suggest a role for tSbC RGCs in detecting behaviourally relevant changes in visual context, such as transitions from ground (i.e. below the horizon) to sky (i.e. above the horizon).

Results

Here, we investigated colour processing in the mouse retina under natural stimulus conditions. To this end, we trained a CNN model on RGC responses to a movie covering both achromatic and chromatic contrasts occurring naturally in the mouse environment, and then performed a model-guided search for stimuli that maximise the responses of RGCs.

Mouse RGCs display diverse responses to a natural movie stimulus

Using two-photon population Ca2+ imaging, we recorded responses from 8388 cells (in 72 recording fields across 32 retinae) in the ganglion cell layer (GCL) of the isolated mouse retina (Figure 1a) to a range of visual stimuli. Since complex interactions between colour channels have been mostly reported in the ventral retina and opsin-transitional zone, we focused our recordings on these regions (Szatko et al., 2020; Khani and Gollisch, 2021).

Figure 1. Mouse retinal ganglion cells (RGCs) display diverse responses to a natural movie stimulus.

(a) Illustration of a flat-mounted retina, with recording fields (white circles) and stimulus area centred on the red recording field indicated (cross marks optic disc; d, dorsal; v, ventral; t, temporal; n, nasal). (b) Natural movie stimulus structure (top) and example frames (bottom). The stimulus consisted of 5 s clips taken from UV-green footage recorded outside (Qiu et al., 2021), with 3 repeats of a 5-clip test sequence (highlighted in grey) and a 108-clip training sequence (see Methods). (c) Representative recording field (bottom; marked by red square in (a)) showing somata of ganglion cell layer (GCL) cells loaded with Ca2+ indicator OGB-1. (d) Ca2+ responses of exemplary RGCs (indicated by circles in (c)) to chirp (left), moving bar (centre), and natural movie (right) stimulus. (e) Same recording field as in (c) but with cells colour-coded by functional RGC group (left; see Methods and Baden et al., 2016) and group responses (coloured, mean ± SD across cells; trace of example cells in (d) overlaid in black).


Figure 1—figure supplement 1. Additional information about the dataset, model performance, and response quality filtering pipeline.


(a) Distribution across cell types for this dataset, and for the dataset described in Baden et al., 2016, which was the basis for our classifier (Qiu et al., 2023). (b) Mean ± SD of model performance, evaluated as correlation between model prediction and retinal ganglion cell (RGC) response on the 25 s long test sequence, averaged across three repetitions of the test sequence, for each cell type. (c) Response quality, RGC group assignment, and model performance filtering pipeline showing the consecutive steps and the fraction of cells remaining after each step. Black bars and numbers indicate cells from all experiments (i.e. all RGCs for which we recorded chirp, moving bar [MB], and movie responses), red bars and numbers indicate the subset of cells recorded in the maximally exciting input (MEI) validation experiments (i.e. those RGCs for which we additionally recorded MEI stimuli responses). Dotted bars indicate the number of cells before the current filtering step. The filtering steps were as follows: (1) Keep only cells that pass the chirp OR MB quality criterion (QI_MB > 0.6 OR QI_chirp > 0.35). (2) Keep only cells that the classifier assigns to a group with confidence ≥0.25. (3) Keep only cells assigned to a ganglion cell group (groups 1–32; groups 33–46 are amacrine cell groups). (4) Keep only cells with sufficiently high model performance (⟨C(r̂(n), r_i(n))⟩_i > 0.3). All cells passing steps 1–3 were included in the horizon detection analysis (Figure 7); all cells passing steps 1–4 were included in the MEI analysis (Figure 3); the ‘red’ cells passing steps 1–4 were included in the MEI validation analysis (Figure 4). All quality criteria are described in the Methods section.

The stimuli included two achromatic synthetic stimuli – a contrast and frequency modulation (‘chirp’ stimulus) and a bright-on-dark bar moving in eight directions (‘moving bar’, MB) – to identify the functional cell type (see below), as well as a dichromatic natural movie (Figure 1b–d). The latter was composed of footage recorded outside in the field using a camera that captured the spectral bands (UV and green; Qiu et al., 2021) to which mouse photoreceptors are sensitive (λ_peak,S = 360 nm and λ_peak,M = 510 nm for S- and M-cones, respectively; Jacobs et al., 2004). We used 113 different movie clips, each lasting 5 s, that were displayed in pseudo-random order. Five of these constituted the test set and were repeated three times: at the beginning, in the middle, and at the end of the movie presentation, thereby allowing us to assess the reliability of neuronal responses across the recording (Figure 1b, top).

The responses elicited by the synthetic stimuli and the natural movie were diverse, displaying ON (Figure 1d, rows 4–9), ON-OFF (row 3), and OFF (rows 1 and 2), as well as sustained and transient characteristics (e.g. rows 8 and 4, respectively). Some responses were suppressed by temporal contrast (generally, rows 10, 11; at high contrast and frequency, row 9). A total of 6984 GCL cells passed our response quality criteria (see Methods); 3527 cells could be assigned to 1 of 32 previously characterised functional RGC groups (Baden et al., 2016) based on their responses to the chirp and MB stimuli using our recently developed classifier (Figure 1e; Figure 1—figure supplement 1; Qiu et al., 2021). Cells assigned to any of groups 33–46 were considered displaced amacrine cells and were not analysed in this study (for detailed filtering pipeline, see Figure 1—figure supplement 1c).

CNN model captures diverse tuning of RGC groups and predicts MEIs

We trained a CNN model on the RGCs’ movie responses (Figure 2a) and evaluated model performance as the correlation between predicted and trial-averaged measured test responses, C(r̂(n), ⟨r_i(n)⟩_i) (Figure 2b). This metric can be interpreted as an estimate of the achieved fraction of the maximally achievable correlation (see Methods). The mean correlation per RGC group ranged from 0.32 (G14) to 0.79 (G24) (Figure 1—figure supplement 1) and reached an average of 0.48 (for all N=3527 cells passing filtering steps 1–3, Figure 1—figure supplement 1). We also tested the performance of our nonlinear model against a linearised version (see Methods; equivalent to an LN model, and from here on ‘LN model’) and found that the nonlinear CNN model achieved a higher test set correlation for all RGC groups (average correlation LN model: 0.38; G14: 0.2, G24: 0.65, Figure 2c).
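
As an illustration, this evaluation metric can be sketched in a few lines of Python (a minimal sketch; array shapes and variable names are assumptions for illustration, not the published implementation):

    import numpy as np

    def test_correlation(pred, trials):
        """Correlation between model prediction and trial-averaged test response.

        pred:   (T,) predicted response of one model neuron to the test sequence
        trials: (R, T) recorded responses across R repeats of the test sequence
        """
        avg = trials.mean(axis=0)               # trial-averaged test response
        return np.corrcoef(pred, avg)[0, 1]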

Figure 2. Convolutional neural network (CNN) model captures diverse tuning of retinal ganglion cell (RGC) groups and predicts maximally exciting inputs (MEIs).


(a) Illustration of the CNN model and its output. The model takes natural movie clips as input (1), performs 3D convolutions with space-time separable filters (2) followed by a nonlinear activation function (ELU; 3) in two consecutive layers (2–4) within its core, and feeds the output of its core into a per-neuron readout. For each RGC, the readout convolves the feature maps with a learned RF modelled as a 2D Gaussian (5), and finally feeds a weighted sum of the resulting vector through a softplus nonlinearity (6) to yield the firing rate prediction for that RGC (7). Numbers indicate averaged single-trial test set correlation between predicted (red) and recorded (black) responses. (b) Test set correlation between model prediction and neural response (averaged across three repetitions) as a function of response reliability (see Methods) for N=3527 RGCs. Coloured dots correspond to example cells shown in Figure 1c–e. Dots in darker grey correspond to the N=1947 RGCs that passed the model test correlation and movie response quality criterion (see Methods and Figure 1—figure supplement 1). (c) Test set correlation (as in (b)) of CNN model vs. test set correlation of an LN model (for details, see Methods). Coloured dots correspond to means of RGC groups 1–32 (Baden et al., 2016). Dark and light grey dots as in (b). (d) Illustration of model-guided search for MEIs. The trained model captures neural tuning to stimulus features (far left; heat map illustrates landscape of neural tuning to stimulus features). Starting from a randomly initialised input (second from left; a 3D tensor in space and time; only one colour channel illustrated here), the model follows the gradient along the tuning surface (far left) to iteratively update the input until it arrives at the stimulus (bottom right) that maximises the model neuron’s activation within an optimisation time window (0.66 s, grey box, top right).
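
To make the architecture described in the caption concrete, the following is a minimal PyTorch sketch of a core with two space-time separable convolutional layers (ELU) and a per-neuron 2D Gaussian readout with a softplus output. Kernel sizes, channel counts, and the 18 × 16 pixel input resolution are illustrative assumptions, not the exact hyperparameters of the published model:

    import torch
    import torch.nn as nn

    class SeparableConv3d(nn.Module):
        """Space-time separable 3D convolution: a temporal filter followed by a spatial filter."""
        def __init__(self, c_in, c_out, t, k):
            super().__init__()
            self.temporal = nn.Conv3d(c_in, c_out, kernel_size=(t, 1, 1))
            self.spatial = nn.Conv3d(c_out, c_out, kernel_size=(1, k, k), padding=(0, k // 2, k // 2))

        def forward(self, x):                      # x: (batch, channels, time, y, x)
            return self.spatial(self.temporal(x))

    class GaussianReadout(nn.Module):
        """Per-neuron readout: pool core feature maps with a learned 2D Gaussian RF mask."""
        def __init__(self, n_channels, h, w, n_neurons):
            super().__init__()
            self.mu = nn.Parameter(torch.zeros(n_neurons, 2))          # RF centre (normalised coords)
            self.log_sigma = nn.Parameter(torch.zeros(n_neurons, 2))   # RF width
            self.weights = nn.Parameter(0.01 * torch.randn(n_neurons, n_channels))
            ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
            self.register_buffer("grid", torch.stack([xs, ys], dim=-1))  # (h, w, 2)

        def forward(self, feats):                  # feats: (batch, channels, time, h, w)
            d = (self.grid[None] - self.mu[:, None, None]) / self.log_sigma.exp()[:, None, None]
            mask = torch.exp(-0.5 * (d ** 2).sum(-1))                   # (n_neurons, h, w)
            pooled = torch.einsum("bcthw,nhw->bntc", feats, mask)       # Gaussian spatial pooling
            rate = torch.einsum("bntc,nc->bnt", pooled, self.weights)   # weighted sum over features
            return nn.functional.softplus(rate)                         # non-negative firing rate

    class RGCModel(nn.Module):
        def __init__(self, n_neurons, h=18, w=16):
            super().__init__()
            self.core = nn.Sequential(
                SeparableConv3d(2, 16, t=21, k=11), nn.ELU(),   # 2 input colour channels (green, UV)
                SeparableConv3d(16, 16, t=11, k=5), nn.ELU(),
            )
            self.readout = GaussianReadout(16, h, w, n_neurons)

        def forward(self, movie):                  # movie: (batch, 2, time, y, x)
            return self.readout(self.core(movie))  # -> (batch, n_neurons, time)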

Next, we wanted to leverage our nonlinear CNN model to search for potentially nonlinear stimulus selectivities of mouse RGC groups. Towards this goal, we aimed to identify stimuli that optimally drive RGCs of different groups. For linear systems, the optimal stimulus is equivalent to the linear filter and can be identified with classical approaches such as reverse correlation (Chichilnisky, 2001). However, since both the RGCs and the CNN model were nonlinear, a different approach was necessary. Other recent modelling studies in the visual system have leveraged CNN models to predict static MEIs for neurons in monkey visual area V4 (Bashivan et al., 2019; Willeke et al., 2023) and mouse visual area V1 (Walker et al., 2019; Franke et al., 2022). We adopted this approach to predict dynamic (i.e. time-varying) MEIs for mouse RGCs. We used gradient ascent on a randomly initialised, contrast- and range-constrained input to find the stimulus that maximised the mean activation of a given model neuron within a short time window (0.66 s; see Methods; Figure 2d).
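
The gradient ascent procedure itself can be sketched as follows (a hedged illustration; the exact norm and range constraints, learning rate, and optimisation window length are assumptions, and the model is assumed to return a (batch, neuron, time) response as in the sketch above):

    import torch

    def optimise_mei(model, neuron_idx, shape=(2, 50, 18, 16), n_steps=1000,
                     lr=10.0, max_norm=30.0, pixel_range=(-1.0, 1.0), window=20):
        """Gradient ascent on a contrast- and range-constrained input (illustrative sketch)."""
        mei = torch.randn(1, *shape, requires_grad=True)        # (batch, colour, time, y, x)
        optimiser = torch.optim.SGD([mei], lr=lr)
        for _ in range(n_steps):
            optimiser.zero_grad()
            # mean activation of the target model neuron within the optimisation window
            response = model(mei)[0, neuron_idx, -window:].mean()
            (-response).backward()                              # ascend by minimising the negative
            optimiser.step()
            with torch.no_grad():                               # project back onto the constraints
                norm = mei.norm()
                if norm > max_norm:
                    mei.mul_(max_norm / norm)
                mei.clamp_(*pixel_range)
        return mei.detach()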

It is important to note that MEIs should not be confused with, or interpreted as, the linear filters derived from classical approaches such as reverse correlation (Chichilnisky, 2001; Schwartz et al., 2006). While both MEIs and linear filters offer simplified views of a neuron’s nonlinear response properties, they emphasise different aspects. The linear filter is optimised to provide the best possible linear approximation of the response function, identifying the stimulus direction to which the cell is most sensitive on average across the stimulus ensemble. In contrast, the MEI maximises the neuron’s response by finding the single stimulus that activates the cell most strongly. Consequently, MEIs can differ significantly from linear filters, often exhibiting greater complexity and higher frequency components (Walker et al., 2019).

MEIs reflect known functional RGC group properties

The resulting MEIs were short, dichromatic movie clips; their spatial, temporal, and chromatic properties and interactions thereof are best appreciated in lower-dimensional visualisations (Figure 3a–c; more example MEIs in Figure 3—figure supplement 1).

Figure 3. Spatial, temporal, and chromatic properties of maximally exciting inputs (MEIs) differ between retinal ganglion cell (RGC) groups.

(a) Spatial component of three example MEIs for green (top), UV (middle), and overlay (bottom). Solid and dashed circles indicate MEI centre and surround fit, respectively. For display, spatial components s in the two channels were re-scaled to a similar range and displayed on a common grey-scale map ranging from black for −max(|s|) to white for +max(|s|), i.e., symmetric about 0 (grey). (b) Spatiotemporal (y-t) plot for the three example MEIs (from (a)) at a central vertical slice for green (top), UV (middle), and overlay (bottom). Grey-scale map analogous to (a). (c) Trajectories through colour space over time for the centre of the three MEIs. Trajectories start at the origin (grey level); direction of progress indicated by arrow heads. Bottom right: Bounding boxes of the respective trajectory plots. (d) Calculation of MEI centre size, defined as σ_x + σ_y, with σ_x and σ_y the s.d. in horizontal and vertical direction, respectively, of the difference-of-Gaussians (DoG) fit to the MEI. (e) Calculation of MEI temporal frequency: Temporal components are transformed using fast Fourier transform, and MEI frequency is defined as the amplitude-weighted average frequency of the Fourier-transformed temporal component. (f) Calculation of centre contrast, which is defined as the difference in intensity at the last two peaks (indicated by t1 and t2, respectively, in (c)). For the example cell (orange markers and lines), green intensity decreases, resulting in OFF contrast, and UV intensity increases, resulting in ON contrast. (g) Distribution of green and UV MEI centre sizes across N=1613 cells (example MEIs from (a–c) indicated by arrows; symbols as shown on top of (a)). 95% of MEIs were within an angle of ±8° of the diagonal (solid and dashed lines); MEIs outside of this range are coloured by cell type. (h) As (g) but for distribution of green and UV MEI temporal frequency. 95% of MEIs were within an angle of ±11.4° of the diagonal (solid and dashed lines). (i) As (g) but for distribution of green and UV MEI centre contrast. MEI contrast is shifted away from the diagonal (dashed line) towards UV by an angle of 33.2° due to the dominance of UV-sensitive S-opsin in the ventral retina. MEIs at an angle >45° occupy the upper left, colour-opponent (UVON-greenOFF) quadrant. (j, k) Fraction of MEIs per cell type that lie outside the angle about the diagonal containing 95% of MEIs for centre size and temporal frequency. Broad RGC response types indicated as in Baden et al., 2016. (l) Fraction of MEIs per cell type in the upper-left, colour-opponent quadrant for contrast.


Figure 3—figure supplement 1. Example maximally exciting inputs (MEIs) for example cell types.


Rows in each panel as in Figure 4a.
Figure 3—figure supplement 2. Illustration of how different time windows for optimisation affect maximally exciting input (MEI) temporal properties.


(a) MEIs (bottom panels) and model neuron responses (top panels) for a short optimisation window of 2 frames (≈0.066 s, indicated by grey shaded area). The top row shows the responses of a more transient retinal ganglion cell (RGC) to its own MEI (left stimulus) and to the MEI of a more sustained RGC (right stimulus). The bottom row shows the responses of the more sustained RGC to its own MEI (right stimulus) and to the MEI of the more transient RGC (left stimulus). (b) MEIs (bottom panels) and model neuron responses (top panels) for a longer optimisation window of 20 frames (≈0.66 s, indicated by grey shaded area) as used throughout the paper. The top row shows the responses of a more transient RGC to its own MEI (left stimulus) and to the MEI of a more sustained RGC (right stimulus). The bottom row shows the responses of the more sustained RGC to its own MEI (right stimulus) and to the MEI of the more transient RGC (left stimulus). Same cells as in (a).

To analyse the MEIs in terms of these properties, we decomposed them into their spatial and temporal components, separately for green and UV, and parameterised the spatial component as a difference-of-Gaussians (DoG) (Gupta et al., 2022) (N=1613 out of 1947, see Methods). We then located MEIs along the axes in stimulus space corresponding to three properties: centre size, mean temporal frequency, and centre contrast, separately for green and UV (Figure 3d–f). These MEI properties reflect RGC response properties classically probed with synthetic stimuli, such as spots of different sizes (Goetz et al., 2022), temporal frequency modulations (Baden et al., 2016), and stimuli of varying chromatic contrast (Szatko et al., 2020; Khani and Gollisch, 2021). Using the MEI approach, we were able to reproduce known properties of RGC groups (Figure 3g–i). For example, sustained ON α RGCs (G24), which are known to prefer large stimuli (Baden et al., 2016; Mani and Schwartz, 2017), had MEIs with large centres (G24, N=20 cells: green centre size, mean ± SD: 195 ±82 μm; UV centre size 178 ±45 μm; average across all RGC groups: green 148 ±42 μm, UV 141 ±42 μm; see Figure 3g).
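
As an illustration of the first step, the decomposition of an MEI colour channel into spatial and temporal components can be sketched as a rank-1 factorisation (the exact factorisation and sign convention used in the paper are not specified here and are assumptions):

    import numpy as np

    def decompose_mei_channel(mei):
        """Rank-1 decomposition of one colour channel of an MEI into spatial and temporal parts.

        mei: (T, H, W) array for a single colour channel.
        """
        T, H, W = mei.shape
        U, S, Vt = np.linalg.svd(mei.reshape(T, H * W), full_matrices=False)
        temporal = U[:, 0] * S[0]              # temporal component, shape (T,)
        spatial = Vt[0].reshape(H, W)          # spatial component, shape (H, W)
        # one possible sign convention: make the spatial component match the last frame
        if np.dot(spatial.ravel(), mei[-1].ravel()) < 0:
            spatial, temporal = -spatial, -temporal
        return spatial, temporal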

The MEI’s temporal frequency relates to the temporal frequency preference of an RGC: MEIs of G20 and G21, termed ON high frequency and ON low frequency (Baden et al., 2016), had high and low average temporal frequency, respectively (G20, N=40 cells, green, mean ± SD: 2.71 ±0.16 Hz, UV 2.86 ±0.22 Hz; G21, N=50 cells, green, mean ± SD: 2.32 ±0.63 Hz, UV 1.98 ± 0.5 Hz; see Figure 3h). Some MEIs exhibit fast oscillations (Figure 3e and Figure 3—figure supplement 1). This is not an artefact but rather a consequence of optimising a stimulus to maximise activity over a 0.66 s time window (Figure 2d). To maximise the response of a transient RGC over several hundred milliseconds, it has to be stimulated repetitively, hence the oscillations in the MEI. Maximising the response over a shorter time period results in MEIs without oscillations (Figure 3—figure supplement 2).
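
The amplitude-weighted frequency measure (Figure 3e) can be sketched as follows (the 30 Hz stimulus frame rate is inferred from the 20 frames ≈ 0.66 s optimisation window and is otherwise an assumption):

    import numpy as np

    def mei_temporal_frequency(temporal, frame_rate=30.0):
        """Amplitude-weighted mean frequency of an MEI's temporal component (cf. Figure 3e)."""
        amplitudes = np.abs(np.fft.rfft(temporal - temporal.mean()))
        freqs = np.fft.rfftfreq(len(temporal), d=1.0 / frame_rate)   # frequencies in Hz
        return np.sum(freqs * amplitudes) / np.sum(amplitudes)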

Finally, the contrast of an MEI reflects what is traditionally called a cell’s ON vs. OFF preference: MEIs of ON and OFF RGCs had positive and negative contrasts, respectively (Figure 3i). An ON-OFF preference can be interpreted as a tuning map with two optima – one in the OFF- and one in the ON-contrast regime. For an ON-OFF cell, there are hence two stimuli that are approximately equally effective at eliciting responses from that cell. Consequently, for the ON-OFF RGC groups, optimisation resulted in MEIs with ON or OFF contrast, depending on the relative strengths of the two optima and on the initial conditions (Figure 3—figure supplement 1, G10, and see Discussion).

MEIs were also largely consistent within functional RGC groups (Figure 3—figure supplement 1). Where this was not the case, the heterogeneity of MEIs could be attributed to a known heterogeneity of cells within that group. For example, MEIs of G31 RGCs were diverse (Figure 3—figure supplement 1), and the cells that were originally grouped to form G31 probably spanned several distinct types, as suggested by the group’s unusually high coverage factor (Baden et al., 2016). Together, these results provided strong evidence that RGCs grouped based on responses to synthetic stimuli (chirp and MB) also form functional groups in natural movie response space.

CNN model predicts centre colour opponency in RGC group G28

Our goal was to explore chromatic tuning of RGCs and to identify novel stimulus selectivities related to chromatic contrast. Therefore, we specifically focused on regions in stimulus space where a given stimulus property differs for green and UV. For centre size and temporal frequency, we asked which RGC groups contributed to the MEIs outside of the 95th percentile around the diagonal (Figure 3g, h, j, and k). These 5% MEIs furthest away from the diagonal were almost exclusively contributed by ON cells; and among these, more so by slow than by fast ON cells.

MEI contrast needed to be analysed differently than size and temporal frequency for two reasons. First, due to the dominance of UV-sensitive S-opsin in the ventral retina (Szél et al., 1992), stimuli in the UV channel were much more effective at eliciting RGC responses. As a result, the contrast of most MEIs is strongly shifted towards UV (Figure 3i). Second, contrast in green and UV can not only vary along positive valued axes (as is the case for size and temporal frequency), but can also take on opposite signs, resulting in colour-opponent stimuli. Whereas most MEIs had the same contrast polarity in both colour channels (i.e. both ON or OFF, Figure 3c, blue and turquoise trajectories), some MEIs had opposing contrast polarities in UV and green (Figure 3c, orange trajectory, and Figure 3i, upper left quadrant). Thus, for contrast, we asked which RGC groups contributed to colour-opponent MEIs (i.e. MEIs in the colour-opponent, upper left or lower right quadrant in Figure 3i). Again, slow ON RGCs made up most of the cells with colour-opponent MEIs. Here, G28 stood out: 66% (24/36) of all cells of this group had colour-opponent MEIs (UVON-greenOFF), followed by G27 with 42% colour-opponent MEIs.

The colour opponency we found in G28 was not centre-surround, as described before in mice (Szatko et al., 2020), but rather a centre opponency (‘co-extensive’ colour-opponent RF; reviewed in Schwartz, 2021), as can be seen in the lower-dimensional visualisations (Figure 3a and b, right column; Figure 3c, orange trajectory).

In conclusion, our model-guided in silico exploration of chromatic stimulus space revealed a variety of preferred stimuli that captured known properties of RGC groups, and revealed a preference of G28 RGCs for centre colour-opponent, UVON-greenOFF stimuli, a feature previously unknown for this RGC group.

Experiments confirm selectivity for chromatic contrast

Next, we verified experimentally that the MEIs predicted for a given RGC group actually drive cells of that group optimally. To this end, we performed new experiments in which we added to our battery of stimuli a number of MEIs chosen according to the following criteria: We wanted the MEIs to (i) span the response space (ON, ON-OFF, OFF, transient, sustained, and contrast-suppressed) and (ii) to represent both well-described RGC types, such as α cells (i.e. G5,24), as well as poorly understood RGC types, such as suppressed-by-contrast cells (G28,31,32) (Figure 4a). We therefore chose MEIs of RGCs from groups G1 (OFF local), G5 (OFF α sustained), G10 (ON-OFF local-edge), G18 (ON transient), G20 (ON high frequency), G21 (ON low frequency), G23 (ON mini α), G24 (sustained ON α), G28 (ON contrast suppressed), G31 (OFF suppressed 1), and G32 (OFF suppressed 2). For simplicity, in the following we refer to the MEI of an RGC belonging to group g as group g’s MEI, or MEI g.

Figure 4. Experiments confirm maximally exciting inputs (MEIs) predicted by model.

(a) MEIs shown during the experiment, with green and UV spatial components (top two rows), as well as green and UV temporal components (third row) and a spatiotemporal visualisation (fourth row). For display, spatial components s in the two channels were re-scaled to a similar range and displayed on a common grey-scale map ranging from black for −max(|s|) to white for +max(|s|), i.e., symmetric about 0 (grey). Relative amplitudes of UV and green are shown in the temporal components. (b) Illustration of spatial layout of MEI experiment. White circles represent 5 × 5 grid of positions where MEIs were shown; red shading shows an example RF estimate of a recorded G32 retinal ganglion cell (RGC), with black dot indicating the RF centre position (Methods). (c) Responses of example RGC from (b) to the 11 different MEI stimuli at 25 different positions. (d) Recorded [top, r(n)] and predicted [bottom, r̂(n)] responses to the 11 different MEIs for example RGC n from (b, c). Left: Responses are averaged across the indicated dimensions x, y (different MEI locations); black bar indicates MEI stimulus duration (from 0 to 1.66 s), grey rectangle marks optimisation time window (from 1 to 1.66 s). Right: Response to different MEIs, additionally averaged across time (t; within optimisation time window). (e, f) Same as in (d), but additionally averaged across all RGCs (n) of G5 (N=6) (e) and of G28 (N=12) (f). Error bars show SD across cells. (g) Confusion matrix, each row showing the z-scored response magnitude of one RGC group (averaged across all RGCs of that group) to the MEIs in (a). Confusion matrix for recorded cells (top; ‘Data’) and for model neurons (bottom; ‘Model’). Black squares highlight broad RGC response types according to Baden et al., 2016: OFF cells (G1,5), ON-OFF cells (G10), fast ON cells (G18,20), slow ON (G21,23,24) and ON contrast suppressed (G28) cells, and OFF suppressed cells (G31,32).


Figure 4—figure supplement 1. Recorded and predicted responses of example RGC groups to the MEI stimuli.


(a) Recorded (top, r) and predicted (bottom, r^) responses to the 11 different maximally exciting inputs (MEIs) for all example cell types. Left: Responses are averaged across the indicated dimensions x, y, n: different MEI locations (x, y) and retinal ganglion cells (RGCs) in a group (n); black bar indicates stimulus duration (from 0 to 1.66 s), grey rectangle marks optimisation time window (from 1 to 1.66 s). Right: Responses to different MEIs, additionally averaged across time (t) within the optimisation time window. Error bars indicate SD across cells. (b) Correlation between the measured and predicted response magnitudes to the MEI stimuli per example cell type. Cumulative histogram is across all N=788 cells; 50% of cells have a correlation between measured and predicted response magnitude of ≥0.8. (c) Mean ± SD of selectivity index (see Methods) for the example cell groups, indicating the difference in response to MEI 28 vs. the average response to all other MEIs in units of standard deviation of the response.

We presented these MEIs on a regularly spaced 5 × 5 grid to achieve approximate centring of stimuli on RGC RFs in the recording field (Figure 4b and c). For these recordings, we fit models whose readout parameters allowed us to estimate the RGCs’ RF locations. We used these RF location estimates to calculate a spatially weighted average of the responses to the MEIs displayed at different locations, weighting the response at each location proportional to the RF strengths at those locations (Figure 4b, red highlight, and Figure 4d, top). We then performed the same experiment in silico, confirming that the model accurately predicts responses to the MEIs (Figure 4d, bottom; Figure 4—figure supplement 1). These experiments allowed us to evaluate MEI responses at the RGC group level (Figure 4e and f; Figure 3—figure supplement 1).
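
The spatially weighted averaging can be sketched as follows (grid size and array shapes are assumptions for illustration):

    import numpy as np

    def rf_weighted_response(responses, rf_weights):
        """Average MEI responses across grid positions, weighted by the cell's RF strength.

        responses:  (5, 5, n_meis, T) responses at each of the 5 x 5 grid positions
        rf_weights: (5, 5) estimated RF strength of the cell at each grid position
        """
        w = rf_weights / rf_weights.sum()                            # normalise weights
        return np.tensordot(w, responses, axes=([0, 1], [0, 1]))     # -> (n_meis, T)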

We expected RGCs to show a strong response to their own group’s MEI, a weaker response to the MEIs of functionally related groups, and no response to MEIs of groups with different response profiles. Indeed, most RGC groups exhibited their strongest (G5,20,21,28,32) or second-strongest (G1,10,23) response to their own group’s MEI (Figure 4g, top). Conversely, RGC groups from opposing regions in response space showed no response to each others’ MEIs (e.g. G1,5 [OFF cells] vs. G21-28 [slow ON cells]). The model’s predictions showed a similar pattern (Figure 4g, bottom), thereby validating the model’s ability to generalise to the MEI stimulus regime.

Notably, G28 RGCs responded very selectively to their own MEI 28, displaying only weak responses to most other MEIs (Figure 4f and g, selectivity index G28 to MEI 28, SI_G28(28), defined as the average difference in response between MEI 28 and all other MEIs in units of standard deviation of the response, mean ± SD: 2.58±0.76; see Methods). This was in contrast to other RGC groups, such as G23 and G24, that responded strongly to MEI 28, but also to other MEIs from the slow ON response regime (Figure 4g, top; Figure 4—figure supplement 1, SI_G23(28), mean ± SD: 1.04±0.69, SI_G24(28), mean ± SD: 1.01±0.46). Hence, our validation experiments confirm the model’s prediction that RGC group G28 is selective for centre colour-opponent, UVON-greenOFF stimuli.
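
A sketch of the selectivity index as described above (whether the standard deviation is taken over all MEIs or only the non-target MEIs is an assumption here):

    import numpy as np

    def selectivity_index(responses, target_idx):
        """Response to the target MEI vs. the mean response to all other MEIs,
        in units of the standard deviation of the responses.

        responses: (n_meis,) time-averaged response magnitude of one cell to each MEI
        """
        others = np.delete(responses, target_idx)
        return (responses[target_idx] - others.mean()) / responses.std()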

G28 corresponds to the tSbC RGC type

Next, we sought to identify which RGC type G28 corresponds to. In addition to its unique centre colour opponency, the responses of G28 displayed a pronounced transient suppression to temporal contrast modulations (chirp response in Figure 1e). Therefore, we hypothesised that G28 corresponds to the tSbC RGC type (Tien et al., 2015; Tien et al., 2016; Tien et al., 2022), which is one of the three SbC RGC types identified so far in the mouse and is also referred to as ON delayed (OND) because of its delayed response onset (Jacoby and Schwartz, 2018).

To test this hypothesis, we performed cell-attached electrophysiology recordings (Figure 5) targeting tSbC/OND cells (N=4), identified by their responses to spots of multiple sizes (Goetz et al., 2022), and later confirmed by their distinctive morphology (Jacoby and Schwartz, 2018; type 73 in Bae et al., 2018; Figure 5c and d). We recorded spikes while presenting the MEI stimuli (Figure 5a, top). Just like G28 RGCs in the Ca2+ imaging, tSbC/OND cells exhibited a pronounced selectivity for MEI 28, and were suppressed by most other MEIs (Figure 5a, middle and bottom). Notably, the characteristic delayed response onset was visible in both the Ca2+ (Figure 4f, top) and electrical (Figure 5a) responses but was not predicted by the model (Figure 4f, bottom).

Figure 5. Electrical single-cell recordings of responses to maximally exciting input (MEI) stimuli confirm chromatic selectivity of transient suppressed-by-contrast (tSbC) retinal ganglion cells (RGCs).


(a) Spiking activity (top, raster plot; middle, estimated firing rate) of an OND RGC in response to different MEI stimuli (black bar indicates MEI stimulus duration; grey rectangle marks optimisation time window, from 1 to 1.66 s). Bottom: z-scored activity as a function of MEI stimulus, averaged across cells (solid circles w/ left y-axis, from Ca2+ imaging, N=11 cells; open circles w/ right y-axis, from electrical spike recordings, N=4). Error bars show SD across cells. Colours as in Figure 4. (b) Like (a) but for a sustained ON α cell (G24; N=4 cells, both for electrical and Ca2+ recordings). (c) Different ON delayed (OND/tSbC, G28) RGC (green) dye-loaded by patch pipette after cell-attached electrophysiology recording (z-projection; x–y plane). (d) Cell from (c, green) as side projection (xz), showing dendritic stratification pattern relative to choline-acetyltransferase (ChAT) amacrine cells (tdTomato, red) within the inner plexiform layer (IPL).

As a control, we also recorded MEI responses of a different, well-characterised RGC type, sustained (s) ON α (G24; Krieger et al., 2017; Figure 5b, top; N=4). Again, the electrical recordings of the cells’ MEI responses yielded virtually the same results as the Ca2+ imaging (Figure 5b, middle and bottom; Figure 4—figure supplement 1). Crucially, sON α cells were not selective for MEI 28. The fact that these experiments with precise positioning of stimuli on the cells’ RFs yielded the same results as the 2P imaging experiments confirms the validity of the grid approach for stimulus presentation used in the latter.

Chromatic contrast selectivity derives from a nonlinear transformation of stimulus space

Next, we asked whether G28 (tSbC) RGC’s selectivity is a linear feature, as could be achieved by two linear filters with opposite signs for the two colour channels, or whether it is a nonlinear feature. To address this question, we tested whether an LN model (implemented using convolutions; see Methods) could recover the chromatic selectivity of G28 by predicting MEIs using the LN model. We found that the LN model predicted colour-opponent MEIs for only 9 out of 36 (25%) G28 RGCs (nonlinear CNN: 24 out of 36 [66%] colour-opponent MEIs; Figure 6a–c). This finding argues against the possibility that G28’s colour opponency can be explained on the computational level by two opposite-sign linear filters operating on the two colour channels, which could be recovered by an LN model. Instead, it suggests the presence of a nonlinear dependency between chromatic contrast (of the stimulus) and chromatic selectivity (of the cell). In other words, G28 RGCs process stimuli differently depending on their chromatic contrast. This is a nonlinear property that cannot be accurately captured by an LN model that makes a single estimate of the linear filter for the whole stimulus space.

Figure 6. Chromatic contrast selectivity of G28 retinal ganglion cells (RGCs) derives from a nonlinear transformation of stimulus space.

(a) Distribution of green and UV maximally exciting input (MEI) centre contrast for a linear-nonlinear (LN) model (red) and a nonlinear convolutional neural network (CNN) model (black). Colour-opponent cells highlighted by filled marker. (b, c) Left: MEIs for an example cell of RGC group G28, generated with the LN model (b) or the CNN model (c). The cell’s MEI centre contrast for both models is marked in (a) by asterisks. Right: Respective tuning maps of example model neuron in chromatic contrast space. Contour colours and background greys represent responses in % of maximum response; arrows indicate the direction of the response gradient across chromatic contrast space. The tuning maps were generated by evaluating the model neurons on stimuli that were generated by modulating the contrast of the green (x-axis) and UV (y-axis) component of the MEI. In these plots, the original MEI is located at (–1, 1). More details in the Methods section. (d) Difference in response predicted between LN and CNN model (in % of maximum response). (e) Contour plot as in (b, c) but of activity vs. green and UV contrast for an example transient suppressed-by-contrast (tSbC) G28 RGC measured in whole-cell current-clamp mode. Labels on the contour plot indicate spike count along isoresponse curves. (f) Traces are examples of responses at the 8 extremes of –100%, 0, or 100% contrast in each colour channel. Scale bars: (b), vertical 200 µm, horizontal 0.5 s; MEI scaling in (c) as in (b). (g) Same as (e) for a second example tSbC RGC.


Figure 6—figure supplement 1. Both LN and CNN model predict colour-opponency for a strongly colour-opponent G28 RGC.


(a) Distribution of green and UV maximally exciting input (MEI) centre contrast for a linear-nonlinear (LN) model (red) and a convolutional neural network (CNN) model (black); from Figure 6a. (b, c) Left: MEIs for a second example cell of retinal ganglion cell (RGC) group G28, generated with the LN model (b) or the CNN model (c). The cell’s MEI centre contrast for both models is marked in (a) by cross. Right: Respective tuning maps of example neuron in chromatic contrast space. Colours represent responses in % of maximum response; arrows indicate the direction of the gradient across chromatic contrast space. (d) Difference in response between LN and CNN model (in % of maximum response). Scale bars: (b), vertical 200 µm, horizontal 0.5 s; MEI scaling in (c) same as in (b).

To understand the nature of this dependency, we expanded the estimate of the model RGCs’ tuning to colour contrast around the maximum (the MEI). We did this by mapping the model neurons’ response and its gradient in 2D chromatic contrast space (Figure 6b and c). This analysis revealed that, indeed, G28 RGCs have a nonlinear tuning for colour contrast: they are strongly UV-selective at lower contrasts, but become colour-opponent, i.e., additionally inhibited by green, for higher contrasts. For individual neurons with very strong colour opponency that extends over a large region of chromatic contrast space, the LN model’s approximation also reflects this colour opponency, demonstrating that the LN model can, in principle, capture colour opponency, too (Figure 6—figure supplement 1).
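
The mapping of a model neuron’s response across chromatic contrast space can be sketched as follows (normalisation of the MEI components, grid spacing, and the model’s output shape are assumptions for illustration):

    import numpy as np
    import torch

    def chromatic_tuning_map(model, neuron_idx, green_unit, uv_unit,
                             contrasts=np.linspace(-1, 1, 21), window=20):
        """Response of one model neuron on a grid of green/UV contrasts (cf. Figure 6b, c).

        green_unit, uv_unit: the MEI's green and UV components (T, H, W), each normalised
        to unit ON contrast, so that a UV-ON/green-OFF MEI sits at (-1, +1).
        """
        tuning = np.zeros((len(contrasts), len(contrasts)))
        with torch.no_grad():
            for i, c_uv in enumerate(contrasts):
                for j, c_green in enumerate(contrasts):
                    stim = torch.stack([float(c_green) * green_unit,
                                        float(c_uv) * uv_unit], dim=0)[None]
                    tuning[i, j] = model(stim)[0, neuron_idx, -window:].mean().item()
        return tuning   # rows: UV contrast, columns: green contrast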

We confirmed the model’s predictions about G28’s nonlinear tuning for colour contrast using electrical recordings as described above. The example G28 (tSbC) cells shown in the figure exhibit similar nonlinear tuning in chromatic contrast space (Figure 6e–g). The first example cell’s firing rate (Figure 6f) and, consequently, the tuning curve (Figure 6e) peak for UVON-greenOFF stimuli (top left in panel e; upper row, second from left in f), and are lower for UVON-greenON stimuli (top right in panel e; upper row, far right in f), reflecting the suppressive effect of green contrast on the cell’s response. The same is true for the second example cell (Figure 6g).

The nonlinearity in tuning to colour contrast of G28 RGCs leads to a transformation of stimulus space (Figure 6) that amplifies the distance of colour-opponent stimuli from non-colour-opponent stimuli and is thereby expected to increase their discriminability. We therefore hypothesised that the representation of visual input formed by G28 might serve to detect an ethologically relevant, colour-opponent feature from the visual scene.

Visual context changes are characterised by changes in chromatic contrast

Chromatic contrast changes at the horizon (Khani and Gollisch, 2021; Qiu et al., 2021; Gupta et al., 2022; Abballe and Asari, 2022), and so does visual context: from sky to ground or vice versa. We therefore hypothesised that G28 might leverage chromatic contrast as a proxy for detecting changes in visual context, such as might be caused by head or eye movements that cross the horizon. Detecting these changes in visual context could provide information that is crucial for interpreting signals in other RGC channels.

In our natural movie stimulus, the transitions between movie clips (inter-clip transitions; Figure 1b) can be categorised into those with and without change in visual context: ground-to-sky and sky-to-ground transitions for vertical movements with a change in visual context, and as controls, ground-to-ground and sky-to-sky transitions for horizontal movements without change in visual context. We calculated the contrast of these transitions in the green and UV channel to map them to chromatic contrast stimulus space (Figure 7a). We found that ground-to-ground and sky-to-sky transitions were distributed along the diagonal, indicating that they reflect largely achromatic contrast changes. The two transitions resembling visual input elicited by vertical movements crossing the horizon fell into the two colour-opponent quadrants of the stimulus space: sky-to-ground transitions in the lower right quadrant, and ground-to-sky transitions in the upper left quadrant (Figure 7a and b). The UVON-greenOFF MEIs 28, the preferred stimuli of G28, shared a location in stimulus space with ground-to-sky transitions, indicating that these two stimuli are similar in terms of chromatic contrast (Figure 3i).
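
For orientation, the chromatic contrast of an inter-clip transition could be computed roughly as follows (the exact contrast definition and the 1 s averaging windows are assumptions based on the description in Figure 7a, not the paper’s precise formula):

    import numpy as np

    def transition_contrast(pre_clip, post_clip, frames_per_second=30):
        """Green and UV contrast of an inter-clip transition (hedged sketch).

        pre_clip, post_clip: (T, H, W, 2) movie frames; last axis = (green, UV).
        Contrast is taken here as the normalised difference between the mean full-field
        intensity in the first 1 s after and the last 1 s before the transition.
        """
        pre = pre_clip[-frames_per_second:].mean(axis=(0, 1, 2))    # (2,) mean green/UV before
        post = post_clip[:frames_per_second].mean(axis=(0, 1, 2))   # (2,) mean green/UV after
        return (post - pre) / (post + pre + 1e-9)                   # (green contrast, UV contrast)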

Figure 7. Chromatic contrast tuning allows detection of ground-to-sky transitions.

(a) Distribution of green and UV contrasts of all movie inter-clip transitions (centre), separately for the four transition types, for each of which an example is shown: ground-to-sky (N=525, top left, red triangle), ground-to-ground (N=494, top right, green disk), sky-to-ground (N=480, bottom left, black downward triangle), and sky-to-sky (N=499, bottom right, purple square). Images show last and first frame of pre- and post-transition clip, respectively. Traces show mean full-field luminance of green and UV channels in last and first 1 s of pre- and post-transition clip. Black trace shows luminance averaged across colour channels. (b) Distributions as in (a), but shown as contours indicating isodensity lines of inter-clip transitions in chromatic contrast space. Density of inter-clip transitions was estimated separately for each type of transition from histograms within 10 × 10 bins that were equally spaced within the coloured boxes. Four levels of isodensity for each transition type shown, with density levels at 20% (outermost contour, strongest saturation), 40%, 60%, and 80% (innermost contour, weakest saturation) of the maximum density observed per transition: 28 sky-to-ground (black), 75 ground-to-ground (green), 42 sky-to-sky (purple), and 45 ground-to-sky (red) transitions per bin. Orange markers indicate locations of N=36 G28 maximally exciting inputs (MEIs) in chromatic contrast space (Figure 3i). (c) Tuning map of G28 retinal ganglion cells (RGCs) (N=78), created by averaging the tuning maps of the individual RGCs, overlaid with outermost contour lines from (b) (Figure 7—figure supplement 2). (d, e) Same as (c) for G21 ((d), N=97) and G5 ((e), N=33). (f) Top: Illustration of receiver operating characteristic (ROC) analysis for two RGCs, a G21 (left) and a G28 (right). For each RGC, responses to all inter-clip transitions were binned, separately for ground-to-sky (red) and all other transitions (grey). Middle: Sliding a threshold d across the response range, classifying all transitions with response >d as ground-to-sky, and registering the false-positive rate (FPR) and true-positive rate (TPR) for each threshold yields an ROC curve. Numbers in brackets indicate (FPR, TPR) at the threshold indicated by vertical line in histograms. Bottom: Performance for each cell, quantified as area under the ROC curve (AUC), plotted as distribution across AUC values for all cells (black), G21 (grey), G5 (blue), and G28 (orange); AUC mean ± SD indicated as dots and horizontal lines above histograms. (g) Boxplot of AUC distributions per cell type. Boxes extend from first quartile (Q1) to third quartile (Q3) of the data; line within a box indicates median, whiskers extend to the most extreme points still within [Q1 − 1.5×IQR, Q3 + 1.5×IQR], IQR = inter-quartile range. Diamonds indicate points outside this range. All plot elements (upper and lower boundaries of the box, median line, whiskers, diamonds) correspond to actual observations in the data. Numbers of RGCs for each type are indicated in the plot. (h) Illustration of stimulus with transitions with (Sky-Ground, Ground-Sky) and without (Sky-Sky, Ground-Ground) context change at different velocities (50, 150, 250, and 350 °/s) used in in silico experiments in (i, j). (i) Like (g) but for model cells and stimuli illustrated in (h) at 50°/s (see (h)). (j) AUC as a function of transition velocity for G28 (orange) vs. example RGC groups (‘Uncertain’, G31,32; Slow-ON, G21,23,24; Fast-ON, G18,20; ON-OFF, G10; OFF, G1,5).


Figure 7—figure supplement 1. Example response traces to inter-clip transitions with and without context changes.


(a) Traces of example cells of different cell groups (G10, G18, G23, G24, G28) from a single recording field, responding to 33 (of 122) inter-clip transitions. Inter-clip transitions are colour-coded by transition type (red: ground-to-sky, purple: sky-to-sky, green: ground-to-ground, black: sky-to-ground). (b) The resulting tuning maps in chromatic contrast space.
Figure 7—figure supplement 2. Chromatic contrast tuning in the dorsal retina allows detection of ground-to-sky transitions.


(a) Illustration of a flat-mounted retina, with recording fields in the dorsal (black circles) and ventral (white circles) retina (cross marks optic disc; d, dorsal; v, ventral; t, temporal; n, nasal). (b) Left: Distribution of green and UV contrasts of N=122 inter-clip transitions seen by a ventral group 28 (G28) RGC, coloured by transition type (red triangle, ground-to-sky; green disk, ground-to-ground; black downward triangle, sky-to-ground; purple square, sky-to-sky). Middle: Responses of example RGC in the 1 s following an inter-clip transition, averaged across transitions within the bins indicated by the grid. Right: Responses transformed into a tuning map by averaging within bins as defined by grid; responses are z-scored (μ=0, σ=1). (c) Like (b) but for a dorsal G28 RGC. (d) Tuning map of N=9 dorsal G28 RGCs, created by averaging the tuning maps of the individual RGCs. (e) Same as (d) for N=13 G21 RGCs. (f) Same as (d) for N=4 G5 RGCs. (g) Top: Illustration of receiver operating characteristic (ROC) analysis for two dorsal RGCs, a G21 (left) and a G28 (right). For each RGC, responses were binned to all inter-clip transitions, separately for ground-to-sky (red) and all other transitions (grey). Middle: Sliding a threshold d across the response range, classifying all transitions with response >d as ground-to-sky, and registering the false-positive rate (FPR) and true-positive rate (TPR) for each threshold yields an ROC curve (middle). Numbers in brackets indicate (FPR, TPR) at the threshold indicated by black vertical line in histogram plots. Bottom: We evaluated performance for each cell as the area under the ROC curve (AUC), and plotted the distribution across AUC values for all cells (black), for G5 (blue), for G21 (grey), and for G28 (orange). Among the dorsal RGCs, G28 RGCs achieved the highest AUC on average (mean ± SD AUC, G28 (N=9 cells): 0.62±0.07; all other groups (N=720): 0.49±0.09; ΔAUC=0.13, bootstrapped 95% confidence interval CI95=[0.08, 0.18], Cohen’s d=1.45; two-sample permutation test G28 vs. all other groups (see Methods): p=0 with 100,000 permutations; next-best performing G24 (N=6): 0.54±0.12; ΔAUC=0.08, bootstrapped 95% confidence interval CI95=[0.01, 0.18], Cohen’s d=0.87; two-sided t-test G28 vs. G24: p=0.15 with 100,000 permutations (not significant)). AUC mean ± SD indicated as dots and horizontal lines above histograms. (h) Boxplot of AUC distributions per cell type (dorsal). The box extends from the first quartile (Q1) to the third quartile (Q3) of the data; the line within a box indicates the median. The whiskers extend to the most extreme points still within [Q1 − 1.5×IQR, Q3 + 1.5×IQR], IQR = inter-quartile range. Diamonds indicate points outside this range. All elements of the plot (upper and lower boundaries of the box, median line, whiskers, diamonds) correspond to actual observations in the data. Numbers of RGCs for each type are indicated in the plot.
Figure 7—figure supplement 3. Simulations predict tSbC cells robustly detect context changes across different speeds.


(a) Illustration of the transition stimulus paradigm (from Figure 7h). (b) Structure of stimuli for different velocities, using a ground-to-sky transition as an example. (c) Statistics of the area under the receiver operating characteristic (ROC) curve (AUC) for the sky-ground detection task in the simulation for different velocities (G28 vs. the next-best retinal ganglion cell [RGC] group). Columns (from left): mean ± standard deviation of AUC values (top: G28; bottom: the respective best next RGC type); difference in mean AUC and corresponding bootstrapped 95% confidence intervals; Cohen’s d and p-value of a two-sample permutation test with 100,000 repeats. (d) Boxplots of AUC distributions per cell type for the different velocities (plots like in Figure 7g and j).

Chromatic contrast selectivity allows detecting visual context changes

Next, we tested if G28 RGCs indeed respond strongly to visual context changes as occur in ground-to-sky transitions. To this end, we extracted the RGC responses to the inter-clip transitions, mapping their tuning to chromatic contrasts (Figure 7—figure supplement 1, Figure 7—figure supplement 2), and then averaged the resulting single-cell tuning maps for each RGC group (e.g. see Figure 7c–e). G28 is most strongly tuned to full-field transitions in the upper left quadrant containing mostly ground-to-sky inter-clip transitions (Figure 7c) – unlike, for example, non-colour-opponent reference RGC groups from the slow ON and OFF response regime (Figure 7d and e).

Could a downstream visual area detect ground-to-sky visual context changes based on input from G28 RGCs? To answer this question, we performed a linear signal detection analysis for each RGC by sliding a threshold across its responses to the inter-clip transitions, classifying all transitions that elicited an above-threshold response as ground-to-sky, and evaluating false-positive and true-positive rates (FPR and TPR, respectively) for each threshold (Figure 7f). Plotting the resulting TPRs for all thresholds as a function of FPRs yields a receiver operating characteristic (ROC) curve (Fawcett, 2006; Figure 7f, middle). The area under this curve (AUC) can be used as a measure of detection performance: it is equivalent to the probability that a given RGC will respond more strongly to a ground-to-sky transition than to any other type of transition. Indeed, G28 RGCs achieved the highest AUC on average (Figure 7f, bottom, and Figure 7g; G28, mean ± SD AUC (N=78 cells): 0.68 ± 0.08; two-sample permutation test G28 vs. all other groups with at least N=4 cells (see Methods), significant for each group, with α=0.0017 Bonferroni-corrected for 30 multiple comparisons).
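
The per-cell ROC analysis can be sketched as follows (a minimal illustration; variable names and the threshold grid are assumptions):

    import numpy as np

    def roc_auc(responses, is_ground_to_sky):
        """Single-cell ROC analysis for detecting ground-to-sky transitions (cf. Figure 7f).

        responses:        (n_transitions,) response of one cell to each inter-clip transition
        is_ground_to_sky: (n_transitions,) boolean labels of the transitions
        """
        thresholds = np.sort(np.unique(responses))[::-1]        # slide threshold from high to low
        tpr = [np.mean(responses[is_ground_to_sky] > d) for d in thresholds]
        fpr = [np.mean(responses[~is_ground_to_sky] > d) for d in thresholds]
        fpr, tpr = np.r_[0.0, fpr, 1.0], np.r_[0.0, tpr, 1.0]   # add trivial operating points
        auc = np.trapz(tpr, fpr)                                # area under the ROC curve
        return fpr, tpr, auc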

Ground-to-sky transitions, and therefore visual context changes, can also appear in the lower visual field, which is processed by the dorsal retina, where RGCs receive weaker UV input (Szatko et al., 2020). We recorded additional fields in the dorsal retina (Figure 7—figure supplement 2) and found that here, too, G28 (tSbC) RGCs displayed the strongest tuning to ground-to-sky transitions among all dorsal RGCs (Figure 7—figure supplement 2c-h; for statistics, see legend).

Visual context changes triggered by different behaviours, such as locomotion and head or eye movements, will differ strongly with respect to their statistics – in particular with respect to their speed. Therefore, for G28 (tSbC) RGCs to play a role in detecting context changes, their detection performance should be robust across velocities. To test whether this is the case, we conducted additional in silico experiments where we ran model inference on stimuli that simulate transitions across the visual field with and without context change (Figure 7h) at different velocities: 50, 150, 250, and 350 visual degrees per second (°/s; see Methods; Figure 7—figure supplement 3a and b). The slowest speed simulated visual input such as could be elicited by locomotion, and the fastest speed approached that of saccades (Meyer et al., 2020). We then performed an ROC analysis on the model cell responses. It should be noted that, because model predictions are noise-free, results from the ROC analysis based on simulated responses will overestimate detection performance. However, under the assumption of approximately equal noise levels across RGC groups, we can still draw conclusions about the relative performance of different RGC groups. This analysis confirmed that G28 RGCs could distinguish ground-to-sky context changes from all other types of transitions robustly across different speeds (Figure 7i and j). Interestingly, the advantage of G28 over other RGC groups in performing this detection task diminished with increasing speed (Figure 7—figure supplement 3c and d; see also Discussion).

Together, these analyses demonstrate that a downstream area, reading out from a single RGC group, would achieve the best performance in detecting ground-to-sky context changes if it based its decisions on inputs from G28 RGCs, robustly across different lighting conditions (transitions between movie snippets), retinal location (ventral and dorsal), and speeds.

Discussion

We combined large-scale recordings of RGC responses to natural movie stimulation with CNN-based modelling to investigate colour processing in the mouse retina. By searching the stimulus space in silico to identify maximally exciting inputs (MEIs), we found a novel type of chromatic tuning in tSbC RGCs. We revealed this RGC type’s pronounced and unique selectivity for full-field changes from green-dominated to UV-dominated scenes, a stimulus that matches the chromatic statistics of ground-to-sky transitions in natural scenes. Therefore, we suggest that tSbC cells may signal context changes within their RF. Beyond our focus on tSbC cells, our study demonstrates the utility of an in silico approach for generating and testing hypotheses about the ethological relevance of sensory representations.

Nonlinear approaches for characterising neuronal selectivities

We leveraged image-computable models in combination with an optimisation approach to search in dynamic, chromatic stimulus space for globally optimal inputs for RGCs, the MEIs. The resulting MEI represents the peak in the nonlinear loss landscape that describes the neuron’s tuning in high-dimensional stimulus space. This approach has also been used to reveal the complexities and nonlinearities of neuronal tuning in monkey visual cortex area V4 (Bashivan et al., 2019; Willeke et al., 2023) and mouse area V1 (Walker et al., 2019; Franke et al., 2022).

Finding optimal stimuli via predictive models is by no means the only way to reveal nonlinear selectivity. Several alternative approaches exist (Schwartz et al., 2006; Sharpee et al., 2004; Liu et al., 2017; Globerson et al., 2009; Maheswaranathan et al., 2023) that could in principle also recover the type of tuning we report (although it is not a trivial question whether and under what conditions they would). More importantly, these approaches are not readily applicable to our data. Some approaches, such as spike-triggered covariance (Schwartz et al., 2006) or spike-triggered non-negative matrix factorisation (Liu et al., 2017), typically make assumptions about the distribution of input stimuli that are violated, or at least not guaranteed, for naturalistic stimuli, which consequently precludes using these methods to probe cells under their natural stimulus statistics (adaptations of these methods for naturalistic stimuli exist, e.g., see Aljadeff et al., 2013; Aljadeff et al., 2016). Other approaches, including maximally informative dimensions (Sharpee et al., 2004) or maximum noise entropy (Globerson et al., 2009), use information theory as a framework. They require estimating mutual information between stimulus and responses, which is a challenge when dealing with high-dimensional stimuli and continuous responses.

Predictive models, on the other hand, can handle high-dimensional input distributions well and can easily be adapted to different response modalities. However, one important limitation of our approach for identifying neuronal stimulus selectivities is that searching for the maximally exciting input will return a single input – even when there are several inputs that would elicit an equal response, such as ON and OFF stimuli for ON-OFF cells (see Figure 3—figure supplement 1, G10 MEIs). A remedy for this limitation is to search for diverse exciting inputs by generating stimuli that are both highly effective at eliciting neuronal responses and at the same time distinct from one another. Ding et al., 2023 used this approach to study bipartite invariance in mouse V1 (see also Cadena et al., 2018).

Another limitation is that identifying the MEI does not immediately provide insight into how the different stimulus dimensions contribute to the neuron’s response, i.e., how varying the stimulus along these dimensions affects the neuron’s response. However, differentiable models readily lend themselves to explore the interactions and contributions of different stimulus dimensions in generating the neuronal response, e.g., by inspecting the gradient field along dimensions of interest as done here, or by searching for locally optimal stimulus perturbations (Goldin et al., 2022). These models can also be used to understand better the properties that distinguish cell types from each other by generating most discriminative stimuli, as recently demonstrated for RGCs in mouse and marmoset (Burg et al., 2023).

Together, these studies showcase the versatility of the toolkit of optimisation-based approaches at characterising nonlinear neuronal operations in high-dimensional, natural stimulus spaces. We add to this toolkit by first searching for a globally optimal stimulus, and then searching locally in its vicinity to map the cells’ loss landscape around the peak.

Circuit mechanisms for colour opponency in tSbC RGCs

Most previous studies of colour opponency in the mouse retina have identified sparse populations of colour-opponent RGCs that have not been systematically assigned to a particular functional type (Szatko et al., 2020; Khani and Gollisch, 2021; Gouras and Ekesten, 2004). The only studies that have examined the mechanisms of colour opponency in identified mouse RGC types showed a centre-surround organisation, with RF centre and surround having different chromatic preferences (Chang et al., 2013; Joesch and Meister, 2016; Stabio et al., 2018, but see Sonoda et al., 2020). While we do not specifically analyse centre-surround opponency in this study, we see a similar trend as described previously in many RGC types, with stronger surrounds in the green channel relative to the UV channel (see Figure 4a, Figure 3—figure supplement 1). tSbC RGCs, in contrast, respond to spatially co-extensive colour-opponent stimuli, functionally reminiscent of colour-opponent RGCs in guinea pig (Yin et al., 2009) and ground squirrels (Michael, 1968).

In mice, centre-surround opponency has been attributed to the opsin gradient (Chang et al., 2013) and rod contributions in the outer retina (Joesch and Meister, 2016; Szatko et al., 2020), whereas the circuitry for spatially co-extensive opponency remains unknown. It seems unlikely, though, that the opsin gradient plays a major role in the tSbC cell’s colour opponency, because both ventral and dorsal tSbC cells preferentially responded to full-field green-to-UV transitions. In primates, spatially co-extensive colour opponency in small bistratified RGCs is thought to arise from the selective wiring of S-ON and M/L-OFF bipolar cells onto the inner and outer dendritic strata, respectively (Dacey and Lee, 1994, but see Field et al., 2007). A similar wiring pattern seems unlikely for tSbC RGCs, since their inner dendrites do not co-stratify with the S-ON (type 9) bipolar cells, nor do their outer dendrites co-stratify with the candidate M-OFF bipolar cell (type 1) (Behrens et al., 2016). The bistratified dendritic arbour also distinguishes the mouse tSbC from the colour-opponent ON RGC type in guinea pig, which is monostratified (Yin et al., 2009).

The large RF centres of the tSbC cells, extending well beyond their dendritic fields, come from a non-canonical circuit, in which tonic inhibition onto the RGC via GABAB receptors is relieved via serial inhibition from different amacrine cells using GABAC receptors (Mani and Schwartz, 2017). An intriguing possibility is that a colour-selective amacrine cell is part of this circuit, perhaps supporting chromatically tuned disinhibition in the absence of selective wiring from the aforementioned cone-selective bipolar cells onto the RGC.

A new functional role for tSbC RGCs

Suppressed-by-contrast responses have been recorded along the early visual pathway in dorsal lateral geniculate nucleus (dLGN), superior colliculus (SC), and primary visual cortex (V1) (Niell and Stryker, 2010; Piscopo et al., 2013; Ito et al., 2017), with their function still being debated (Masland and Martin, 2007). In the retina, three types of SbC RGCs have so far been identified (reviewed in Jacoby and Schwartz, 2018), among them the tSbC cell (Mani and Schwartz, 2017; Tien et al., 2015; Tien et al., 2022). Despite their relatively recent discovery, tSbC RGCs have been suggested to play a role in several different visual computations. The first report of their light responses in mice connected them to the SbC RGCs previously discovered in rabbit, cat, and macaque, and suggested a role in signalling self-generated stimuli, perhaps for saccade suppression (Tien et al., 2015). Aided by a new intersectional transgenic line to selectively label tSbC RGCs (Tien et al., 2022), their projections were traced to areas in SC, v- and dLGN, and nucleus of the optic tract (NOT). The latter stabilises horizontal eye movements; however, as the medial terminal nucleus, which serves stabilisation of vertical eye movements, lacks tSbC innervation, it is unclear whether and how these RGCs contribute to gaze stabilisation.

A retinal study identified the circuit mechanisms responsible for some of the unique spatial and temporal response properties of tSbC cells and suggested a possible role in defocus detection to drive emmetropization in growing eyes and accommodation in adults (Mani and Schwartz, 2017; Baden et al., 2017). Here, we identified another potential role for these RGCs in vision based on the chromatic properties of their RFs: signalling visual context changes (see next section). These different possible functional roles are not mutually exclusive, and might even be complementary in some cases, highlighting the difficulty in assigning single features to distinct RGC types (Schwartz and Swygart, 2020). In particular, the centre colour opponency that we discovered in tSbC RGCs could serve to enhance their role in defocus detection by adding a directional signal (myopic vs. hyperopic) based on the chromatic aberration of lens and cornea (Gawne and Norton, 2020). Future studies may test these theories by manipulating these cells in vivo using the new transgenic tSbC mouse line (Tien et al., 2022).

Behavioural relevance of context change detection

The horizon is a prominent landmark in visual space: it bisects the visual field into two regions, ground and sky. This is particularly relevant in animals like mice, where eye movements largely compensate for head movements and keep the visual field stable with respect to the horizon (Meyer et al., 2020). Visual stimuli carry different meaning depending on where they occur relative to the horizon, and context-specific processing of visual inputs is necessary for selecting appropriate behavioural responses (reviewed in Evans et al., 2019). For example, it is sensible to assume that a looming stimulus above the horizon is a predator, the appropriate response to which would be avoidance (i.e. escape or freezing). A similar stimulus below the horizon, however, is more likely to be harmless or even prey. To allow for time-critical perceptual decisions – predator or prey – and corresponding behavioural response selection – avoidance or approach – it might be useful for stimulus information (e.g. dark moving spot) and contextual information to converge early in the visual circuitry.

Notably, VGluT3-expressing amacrine cells (a ‘hub’ for distributing information about motion) represent a shared element in upstream circuitry, providing opposite-sign input to tSbC and to RGCs implicated in triggering avoidance behaviour, such as tOFF α (Krieger et al., 2017; Münch et al., 2009) and W3 cells (Zhang et al., 2012). In downstream circuitry, SbC inputs have been found to converge with ‘conventional’ RGC inputs onto targets in dLGN and NOT; whether tSbC axons specifically converge with tOFF α or W3 axons remains to be tested. Such convergence may allow ‘flagging’ the activity of these RGCs with their local context (sky/threat or ground/no threat).

Depending on the behaviour that elicits a context change – be it a head or eye movement or locomotion – the parameters of the incoming stimulus, such as illumination level and velocity, may change. To be behaviourally useful, a context-change-flagging signal needs to be reliable and robust across these different stimulus parameters. While many slow-ON RGCs achieve high detection performance at higher transition velocities, probably reacting to the increasingly flash-like stimuli, tSbC/G28 RGCs were the only type with robustly high performance across different levels of illumination and all simulated speeds.

In silico approaches to linking neural tuning and function

Modelling studies have advanced our understanding of the complexity and organisation of retinal processing in recent years. It is helpful to consider the contributions of different studies in terms of three perspectives on the retinal encoding of natural scenes: the circuit perspective (‘how?’), the normative perspective (‘why?’), and the coding perspective (‘what?’) (Marr, 2010; Karamanlis et al., 2022). For example, an in silico dissection of a CNN model of the retina offered explanations on how the surprisingly complex retinal computations, such as motion reversal, omitted stimulus response, and polarity reversal, emerge from simpler computations within retinal circuits (Maheswaranathan et al., 2023; Tanaka et al., 2019). From the normative perspective, networks trained on an efficient coding objective accurately predicted the coordination of retinal mosaics (Roy et al., 2021).

Here, we proposed an approach that allows investigating the complexity of retinal processing simultaneously from the coding and the normative perspectives: A global search for most exciting mouse RGC inputs in dynamic, chromatic stimulus space provides answers to the question of what it is that retinal neurons encode. Interpreting the abstract features extracted by the retina against the backdrop of natural stimulus space points to why these features might be behaviourally relevant. And finally, classifying individual RGCs into types then allows to bring in the circuit perspective through targeted experiments aimed at dissecting how specific retinal computations are implemented.

Methods

Animals and tissue preparation

All imaging experiments were conducted at the University of Tübingen; the corresponding animal procedures were approved by the governmental review board (Regierungspräsidium Tübingen, Baden-Württemberg, Konrad-Adenauer-Str. 20, 72072 Tübingen, Germany) and performed according to the laws governing animal experimentation issued by the German Government. All electrophysiological experiments were conducted at Northwestern University; the corresponding animal procedures were performed according to standards provided by Northwestern University Center for Comparative Medicine and approved by the Institutional Animal Care and Use Committee (IACUC).

For all imaging experiments, we used 4- to 15-week-old C57Bl/6J mice (N=23; JAX 000664) of either sex (10 male, 13 female). These animals were housed under a standard 12 hr day/night rhythm at 22°C and 55% humidity. On the day of the recording experiment, animals were dark-adapted for at least 1 hr, then anaesthetised with isoflurane (Baxter) and killed by cervical dislocation. All following procedures were carried out under very dim red (>650 nm) light. The eyes were enucleated and hemisected in carboxygenated (95% O2, 5% CO2) artificial cerebrospinal fluid (ACSF) solution containing (in mM): 125 NaCl, 2.5 KCl, 2 CaCl2, 1 MgCl2, 1.25 NaH2PO4, 26 NaHCO3, 20 glucose, and 0.5 L-glutamine at pH 7.4. Next, the retinae were bulk-electroporated with the fluorescent Ca2+ indicator Oregon-Green BAPTA-1 (OGB-1), as described earlier (Briggman and Euler, 2011). In brief, the dissected retina was flat-mounted onto an Anodisc (#13, 0.2 μm pore size, GE Healthcare) with the RGCs facing up, and placed between a pair of 4 mm horizontal plate electrodes (CUY700P4E/L, Nepagene/Xceltis). A 10 µl drop of 5 mM OGB-1 (hexapotassium salt; Life Technologies) in ACSF was suspended from the upper electrode and lowered onto the retina. Next, nine pulses (≈9.2 V, 100 ms pulse width, at 1 Hz) from a pulse generator/wide-band amplifier combination (TGP110 and WA301, Thurlby Thandar/Farnell) were applied. Finally, the tissue was placed into the microscope’s recording chamber, where it was perfused with carboxygenated ACSF (at ≈36°C) and left to recover for ≥30 min before recordings started. To visualise vessels and damaged cells in the red fluorescence channel, the ACSF contained ≈0.1 µM Sulforhodamine-101 (SR101, Invitrogen) (Euler et al., 2009). All procedures were carried out under dim red (>650 nm) light.

For electrophysiology experiments, we used ChAT-Cre (JAX 006410) × Ai14 (JAX 007914) mice on a C57Bl/6J background (N=2, male, aged 27 and 30 weeks). Mice were housed with siblings in groups up to 4, fed normal mouse chow and maintained on a 12:12 hr light/dark cycle. Before the experiment, mice were dark-adapted overnight and sacrificed by cervical dislocation. Retinal tissue was isolated under infrared illumination (900 nm) with the aid of night-vision goggles and IR dissection scope attachments (BE Meyers). Retinal orientation was identified using scleral landmarks (Wei et al., 2010), and preserved using relieving cuts in cardinal directions, with the largest cut at the dorsal retina. Retinas were mounted on 12 mm poly-D-lysine-coated glass affixed to a recording dish with grease, with the GCL up. Oxygenation was maintained by superfusing the dish with carboxygenated Ames medium (US Biological, A1372-25) warmed to 32°C. For cell-attached single-cell recordings, we used Symphony software (Cafaro et al., 2019) with custom extensions (Schwartz and Ala-Laurila, 2024).

Owing to the exploratory nature of our study, we did not use randomisation and blinding. No statistical methods were used to predetermine sample size.

Two-photon calcium imaging

We used a MOM-type two-photon microscope (designed by W Denk, purchased from Sutter Instruments) (Euler et al., 2009; Euler et al., 2019), which was equipped with a mode-locked Ti:sapphire laser (MaiTai-HP DeepSee, Newport Spectra-Physics) tuned to 927 nm, two fluorescence detection channels for OGB-1 (HQ 510/84, AHF/Chroma) and SR101 (HQ 630/60, AHF), and a water immersion objective (CF175 LWD ×16/0.8 W, DIC N2, Nikon, Germany). Image acquisition was performed with custom-made software (ScanM by M Müller and TE) running under IGOR Pro 6.3 for Windows (Wavemetrics), taking time-lapsed 64 × 64 pixel image scans (≈(100 μm)²) at 7.8125 Hz (Figure 1c). For simplicity, we refer to such a time-lapsed scan of a local population of GCL cells as a ‘recording’. Despite the low frame rate, the Ca2+ responses can be related to the spike rate (Trapani et al., 2023; Román Rosón et al., 2019; Berens et al., 2018; Theis et al., 2016). For documenting the position of the recording fields, the retina under the microscope was oriented such that the most ventral edge always pointed towards the experimenter. In addition, higher resolution images (512 × 512 pixels) were acquired and recording field positions relative to the optic nerve were routinely logged.

Data preprocessing

Ca2+ traces were extracted for individual ROIs as described previously (Baden et al., 2016; Szatko et al., 2020). Extracted traces $\mathbf{c}_{\mathrm{raw}}$ were then detrended to remove slow drifts in the recorded signal that were unrelated to changes in the neural response. First, a smoothed version of the traces, $\mathbf{c}_{\mathrm{smooth}}$, was calculated by applying a Savitzky-Golay filter of third polynomial order and a window length of 60 s using the SciPy implementation scipy.signal.savgol_filter. This smoothed version was then subtracted from the raw traces to yield the detrended traces.

$$\mathbf{c}_{\mathrm{detrend}} = \mathbf{c}_{\mathrm{raw}} - \mathbf{c}_{\mathrm{smooth}}$$

To make traces non-negative ($\mathbf{c}_{+}$), we then clipped all values smaller than the 2.5th percentile, $\eta_{2.5}$, to that value, and then subtracted $\eta_{2.5}$ from the detrended traces:

$$\mathbf{c}_{+} = \mathbf{c}_{\mathrm{detrend}} - \eta_{2.5}$$

This procedure (i.e. clipping to, and subtracting η2.5) was more robust than simply subtracting the minimum.

Finally, traces were then divided by the standard deviation within the time window before stimulus start at t0:

$$\mathbf{c} := \mathbf{c}_{\mathrm{final}} = \frac{\mathbf{c}_{+}}{\mathrm{SD}\!\left(\mathbf{c}_{+}[:t_0]\right)}$$

For training the model on movie responses, we then estimated firing rates $\mathbf{r}$ from the detrended Ca2+ traces $\mathbf{c}$ using the package C2S (https://github.com/lucastheis/c2s, copy archived at lucastheis, 2016; Theis et al., 2016).
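For illustration, the detrending, clipping, and normalisation steps above can be summarised as follows (a minimal sketch, not the original analysis code; the sampling rate of 7.8125 Hz and the stimulus-onset index `t0_idx` are assumptions for this example, and the firing-rate estimation with C2S is omitted):

```python
import numpy as np
from scipy.signal import savgol_filter

def preprocess_trace(c_raw, fs=7.8125, t0_idx=100):
    """Detrend, clip, and normalise a raw Ca2+ trace (sketch of the steps above)."""
    win = int(60 * fs)                         # 60 s window in samples ...
    win = win + 1 if win % 2 == 0 else win     # ... must be odd for the Savitzky-Golay filter
    c_smooth = savgol_filter(c_raw, window_length=win, polyorder=3)
    c_detrend = c_raw - c_smooth               # remove slow drifts
    eta = np.percentile(c_detrend, 2.5)        # 2.5th percentile
    c_plus = np.clip(c_detrend, eta, None) - eta
    return c_plus / np.std(c_plus[:t0_idx])    # divide by pre-stimulus SD
```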

Inclusion criteria

We applied a sequence of quality filtering steps to recorded cells before analysis illustrated in Figure 1—figure supplement 1. As a first step, we applied a general response quality criterion, defined as a sufficiently reliable response to the MB stimulus (as quantified by a quality index QIMB>0.6), or a sufficiently reliable response to the chirp stimulus (as quantified by a quality index QIchirp>0.35). The quality index is defined as in Baden et al., 2016:

$$QI = \frac{\mathrm{Var}\left[\left\langle \mathbf{r}\right\rangle_i\right]_t}{\left\langle \mathrm{Var}\left[\mathbf{r}\right]_t\right\rangle_i}$$

where $\mathbf{r}$ is the $T$-by-$I$ response matrix (time samples by stimulus repetitions), and $\langle\cdot\rangle_x$ and $\mathrm{Var}[\cdot]_x$ denote the mean and variance across the indicated dimension $x$, respectively.
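As a sketch of how this index can be computed from a response matrix (an illustration, not the original code):

```python
import numpy as np

def quality_index(r):
    """Response quality index QI for a response matrix r of shape (T, I):
    time samples by stimulus repetitions."""
    var_of_mean = np.var(r.mean(axis=1))     # Var[<r>_i]_t
    mean_of_var = np.mean(r.var(axis=0))     # <Var[r]_t>_i
    return var_of_mean / mean_of_var
```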

The second and third step made sure only cells were included that were assigned to a ganglion cell group (i.e. group index between 1 and 32) with sufficient confidence. Confidence is defined as the probability assigned to the predicted class by the random forest classifier (see Qiu et al., 2023), and the threshold was set at ≥0.25.

The fourth step made sure only cells with a sufficient model prediction performance, defined as an average single-trial test set correlation of $\langle C(\hat{r}^{(n)}, r_i^{(n)})\rangle_i > 0.3$, were included.

All cells passing steps 1–3 were included in the horizon detection analysis (Figure 7); all cells passing steps 1–4 were included in the MEI analysis (Figure 3); the ‘red’ cells passing steps 1–4 were included in the MEI validation analysis (Figure 4). In the process of analysing MEIs, we fitted DoGs to their green and UV spatial component (see Methods section Concentric anisotropic 2D DoG fit). For the analysis of MEI properties (temporal frequency, centre size, chromatic contrast), we only included cells with a sufficient DoG goodness-of-fit, determined as a value of the cost function of <0.11 for both green and UV on the resulting DoG fit. This threshold was determined by visual inspection of the DoG fits and led to the inclusion of 1613 out of 1947 RGCs in the MEI property analysis.

Visual stimulation

For light stimulation (imaging experiments), we projected the image generated by a digital light processing projector (lightcrafter DPM-FE4500MKIIF, EKB Technologies Ltd) through the objective onto the tissue. The lightcrafter featured a light-guide port to couple in external, band-pass-filtered UV and green LEDs (light-emitting diodes) (green: 576 BP 10, F37-576; UV: 387 BP 11, F39-387; both AHF/Chroma) (Franke et al., 2019). To optimise spectral separation of mouse M- and S-opsins, LEDs were band-pass-filtered (390/576 dual-band, F59-003, AHF/Chroma). LEDs were synchronised with the microscope’s scan retrace. Stimulator intensity (as photoisomerisation rate, in 10³ P* s⁻¹ per cone) was calibrated to range from ≈0.5 (black image) to ≈20 for M- and S-opsins, respectively. Additionally, we estimated a steady illumination component of ≈10⁴ P* s⁻¹ per cone to be present during the recordings because of two-photon excitation of photopigments (Euler et al., 2009; Euler et al., 2019). Before data acquisition, the retina was adapted to the light stimulation by presenting a binary noise stimulus (20 × 15 matrix, (40 μm)² pixels, balanced random sequence) at 5 Hz for 5 min to the tissue. Stimuli were presented using the software QDSpy (RRID:SCR_016985; https://github.com/eulerlab/QDSpy).

For electrophysiology experiments, stimuli were presented using a digital projector (DPM-FE4500MKII, EKB Technologies Ltd) at a frame rate of 60 Hz and a spatial resolution of 1140 × 912 pixels (1.3 μm per pixel) focused on the photoreceptor layer. Neutral density filters (Thorlabs), a triple-band pass filter (405 BP 20, 485 BP 20, 552 BP 16; 69,000×, Chroma), and a custom LED controller circuit were used to attenuate the light intensity of stimuli either to match that of the Ca2+ imaging experiments (for MEI presentation) or to range from ≈0 to 200 P* s⁻¹ per rod (for cell identification). Stimuli were presented using Symphony software (https://symphony-das.github.io/) with custom extensions (https://github.com/Schwartz-AlaLaurila-Labs/sa-labs-extension, copy archived at Schwartz-AlaLaurila-Labs, 2024).

Identifying RGC types

To functionally identify RGC groups in the Ca2+ imaging experiments, we used our default ‘fingerprinting’ stimuli, as described earlier (Baden et al., 2016). These stimuli included a full-field (700 μm in diameter) chirp stimulus, and a 300 × 1000 μm bright bar moving at 1000 μm/s in eight directions across the recording field (with the shorter edge leading; Figure 1b).

The procedure and rationale for identifying cells in the electrophysiological recordings is presented in Goetz et al., 2022. Cells with responses that qualitatively matched that of the OND and ON α types were included in the study. Following recording, cells were filled with Alexa Fluor-488 by patch pipette and imaged under a two-photon microscope. Dendrites were traced in Fiji (NIH) using the SNT plugin (Arshadi et al., 2021). Dendritic arbours were computationally flattened using a custom MATLAB tool (https://doi.org/10.5281/zenodo.6578530) based on the method in Sümbül et al., 2014, to further confirm their identity as morphological type 73 from Bae et al., 2018.

Mouse natural movies

The natural movie stimulus consisted of clips of natural scenes recorded outside in the field with a specialised, calibrated camera (Qiu et al., 2021). This camera featured a fish-eye lens and two spectral channels, UV (band-pass filter F37-424, AHF, >90% transmission at 350–419 nm) and green (F47-510, >90%, 470–550 nm, AHF), approximating the spectral sensitivities of mouse opsins (Jacobs et al., 2004). In mice, eye movements often serve to stabilise the image on the retina during head movements (Meyer et al., 2020). Therefore, the camera was also stabilised by mounting it on a gimbal. As a result, the horizon bisected the camera’s visual field.

A mouse cam movie frame contained a circular field of view of 180°, corresponding to 437 pixels along the diameter. To minimise the influence of potential chromatic and spatial aberrations introduced by the lenses, we focused on image cut-outs (crops; 30° × 26°, equivalent to 72 × 64 pixels in size) from the upper and lower visual field, centred at [28°, 56°] and [−42°, −31°], respectively, relative to the horizon (for details, see Qiu et al., 2021). Our stimulus movie consisted of 113 movie clips, each 150 frames (= 5 s) long. 108 clips were randomly reordered for each recording and split into two 54 clips-long training sequences. The remaining 5 clips formed a fixed test sequence that was presented before, in between, and after the training sequences (Figure 1b). To keep intensity changes at clip transitions small, we only used clips with mean intensities between 0.04 and 0.22 (for intensities in [0, 1]). For display during the experiments, intensities were then mapped to the range covered by the stimulator, i.e., [0, 255].

CNN model of the retina

We trained a CNN model to predict responses of RGCs to a dichromatic natural movie. The CNN model consisted of two modules, a convolutional core that was shared between all neurons, and a readout that was specific for each neuron (Klindt et al., 2017).

The core module was modelled as a two-layer CNN with 16 feature channels in each layer. Both layers consisted of space-time separable 3D convolutional kernels followed by a batch normalisation layer and an ELU (exponential linear unit) nonlinearity. In the first layer, sixteen 2 × 11 × 11 × 21 (c=#input channels (green and UV) × h=height ×w=width × t=#frames) kernels were applied as valid convolution; in the second layer, sixteen 16 × 5 × 5 × 11 kernels were applied with zero padding along the spatial dimensions. We parameterised the temporal kernels as Fourier series and added one time stretching parameter per recording to account for inter-experimental variability affecting the speed of retinal processing. More precisely, every temporal kernel was represented by the first k sine and cosine functions, with trainable weights and phases, on an evenly spaced temporal grid, where k=7 for the first layer, and k=3 for the second layer. Additionally, we introduced a trainable stretch parameter for every recording to account for faster and slower response kernels. For example, the first layer temporal kernels are 21 steps long. Then, in order to stay well under the Nyquist limit, we parameterise the kernels with k=21/3=7 sines and cosines.

For each of those sines and cosines, a weight ($\alpha_j$, $\beta_j$) is learned to represent the shape of the temporal response kernel (shared among cells within a recording). Per scan $i$, the time grid $t$ (21 steps from 0 to 1) is stretched by a factor $\tau_i$ to account for different response speeds. To avoid adding additional cycles (e.g. for stretch factors $\tau>1$), this is masked by an exponential envelope

$$\epsilon(\tau)=\frac{1}{1+\exp\!\left(-t+21\cdot 0.95\,\tau\right)} \quad (1)$$

Thus,

$$w_i=\sum_{j}^{k}\alpha_j\sin\!\left(2\pi\,\tau_i\,t\,\epsilon(\tau_i)\right)+\beta_j\cos\!\left(2\pi\,\tau_i\,t\,\epsilon(\tau_i)\right) \quad (2)$$

is the temporal kernel parameterisation, which allows the model to learn a shared temporal filter that is made faster or slower for each specific scan (Zhao et al., 2020).
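A schematic sketch of such a Fourier-parameterised, stretchable temporal kernel is shown below (illustration only, not the actual model code; it uses a plain harmonic series on the stretched time grid and omits the exponential envelope ε(τ) and other implementation details):

```python
import numpy as np

def fourier_temporal_kernel(alphas, betas, stretch, length=21):
    """Temporal kernel as a weighted sum of k sines and cosines on a time grid
    from 0 to 1, with a per-recording stretch factor (schematic sketch)."""
    k = len(alphas)
    t = np.linspace(0, 1, length)                       # time grid
    j = np.arange(1, k + 1)[:, None]                    # harmonic index, shape (k, 1)
    phase = 2 * np.pi * j * stretch * t[None, :]        # (k, length)
    return (np.asarray(alphas)[:, None] * np.sin(phase)
            + np.asarray(betas)[:, None] * np.cos(phase)).sum(axis=0)
```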

In the readout, we modelled each cell’s spatial receptive field (RF) as a 2D isotropic Gaussian, parameterised as 𝒩(μx,μy;σ). We then modelled the neural response as an affine function of the core feature maps weighted by the spatial RF, followed by a softplus nonlinearity.

For the linearised version of the model, the architecture was exactly the same except for the fact that there was no ELU nonlinearity after both convolutional layers. The resulting CNN was therefore equivalent to an LN model.

Model training and evaluation

We trained our network by minimising the Poisson loss

$$\sum_{n=1}^{N}\left(\hat{r}^{(n)}-r^{(n)}\log \hat{r}^{(n)}\right)$$

where N is the number of neurons, 𝒓(n) is the measured and 𝒓^(𝒏) the predicted firing rate of neuron n for an input of duration t=50 frames. We followed the training schedule of Lurz et al., 2021. Specifically, we used early stopping (Prechelt, 1998) on the correlation between predicted and measured neuronal responses on the validation set, which consisted of 15 out of the 108 movie clips. If the correlation failed to increase during any five consecutive passes through the entire training set (epochs), we stopped the training and restored the model to the best performing model over the course of training. We went through four cycles of early stopping, restoring the model to the best performing, and continuing training, each time reducing the initial learning rate of 0.01 by a learning rate decay factor of 0.3. Network parameters were iteratively optimised via stochastic gradient descent (SGD) using the Adam optimiser (Kingma and Ba, 2015) with a batch size of 32 and a chunk size (number of frames for each element in the batch) of 50. For all analyses and MEI generation, we used an ensemble of models as described in Franke et al., 2022. Briefly, we trained five instances of the same model initialised with different random seeds. Inputs to the ensemble model were passed to each member and the final ensemble model prediction was obtained by averaging the outputs of the five members. For ease of notation, we thus redefine 𝒓^(𝒏) to be the ensemble model prediction.
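For illustration, the Poisson objective above can be written in a few lines of PyTorch (a sketch, not the exact training code; the small constant added inside the logarithm is our addition for numerical stability):

```python
import torch

def poisson_loss(pred_rate, target_rate, eps=1e-8):
    """Poisson loss, summed over neurons and time bins."""
    return (pred_rate - target_rate * torch.log(pred_rate + eps)).sum()
```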

After training, we evaluated model performance for each modelled neuron n as the correlation to the mean, i.e., the correlation between predicted response $\hat{r}^{(n)}$ and measured response $r^{(n)}$ to the held-out test sequence, the latter averaged across three repetitions $i=\{1,2,3\}$: $C(\hat{r}^{(n)}, \langle r_i^{(n)}\rangle_i)$. Unlike the single-trial correlation $C(\hat{r}^{(n)}, r_i^{(n)})$, which is always limited to values <1 by inherent neuronal noise, a perfect model can in theory achieve a value of 1 for the correlation to the mean, in the limit of infinitely many repetitions, when the sample average $\langle r_i^{(n)}\rangle_i$ is a perfect estimate of the true underlying response $\rho^{(n)}$. The observed correlation to the mean can thus be interpreted as an estimate of the fraction of the maximally achievable correlation achieved by our model. For deciding which cells to exclude from analysis, we used the average single-trial correlation $\langle C(\hat{r}^{(n)}, r_i^{(n)})\rangle_i$, since this measure reflects both model performance as well as reliability of the neuronal response to the movie stimulus for neuron n (see also Methods section on Inclusion criteria).

Synthesising MEIs

We synthesised MEIs for RGCs as described previously (Walker et al., 2019). Formally, for each model neuron n we wanted to find

$$x^{*(n)}=\underset{x}{\arg\max}\,\left\langle \hat{r}^{(n)}(x)\right\rangle_{t=30:50}, \quad (3)$$

i.e., the input $x^{*(n)}$ for which the model neuron’s response $\langle \hat{r}^{(n)}(x)\rangle_{t=30:50}$, averaged across frames 30–50, attains a maximum, subject to norm and range constraints (see below). To this end, we randomly initialised an input $x_0^{(n)} \in \mathbb{R}^{c\times w\times h\times t}$ of duration $t=50$ frames with Gaussian white noise, and then iteratively updated $x_i^{(n)}$ according to the gradient of the model neuron’s response:

$$x_{i+1}^{(n)}=x_i^{(n)}+\lambda\,\frac{\delta}{\delta x_i^{(n)}}\left\langle \hat{r}^{(n)}\!\left(x_i^{(n)}\right)\right\rangle_{t=30:50}, \quad (4)$$

where $\lambda=10$ was the learning rate. The optimisation was performed using SGD, and was subject to a norm and a range constraint. The norm constraint was applied jointly across both channels and ensured that the L2 norm of each MEI did not exceed a fixed budget $b$ of 30. The norm-constrained MEI $\tilde{x}_i^{(n)}$ was calculated at each iteration as

$$\tilde{x}_i^{(n)}=\frac{b}{\left\lVert x_i^{(n)}\right\rVert_2}\times x_i^{(n)} \quad (5)$$

The range constraint was defined and applied for each colour channel separately and ensured that the range of the MEI values stayed within the range covered by the training movie. This was achieved by clipping values of the MEI exceeding the range covered by the training movie to the minimum or maximum value. Optimisation was run for at least 100 iterations, and then stopped when the number of iterations reached 1000, or when it had converged (whichever occurred first). Convergence was defined as 10 consecutive iterations with a change in model neuron activation of less than 0.001; model neuron activations ranged from ≈1 to ≈10. We denote the resulting MEI for neuron n as 𝒙*(n).
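The optimisation loop of Equations 3–5 can be sketched as follows (assumptions for this illustration: `model_neuron` maps a stimulus tensor to the response averaged over frames 30–50, and `vmin`/`vmax` stand in for the per-channel range of the training movie; the actual procedure applies the range constraint per colour channel and includes the convergence check described above):

```python
import torch

def synthesise_mei(model_neuron, shape=(2, 18, 16, 50), lr=10.0, budget=30.0,
                   n_iter=1000, vmin=-1.0, vmax=1.0):
    """Gradient-ascent MEI synthesis with joint L2-norm and range constraints (sketch)."""
    x = torch.randn(shape, requires_grad=True)     # Gaussian white-noise initialisation
    optimiser = torch.optim.SGD([x], lr=lr)
    for _ in range(n_iter):
        optimiser.zero_grad()
        (-model_neuron(x)).backward()              # ascend the model neuron's response
        optimiser.step()
        with torch.no_grad():
            x.mul_(budget / x.norm(p=2))           # norm constraint (Eq. 5)
            x.clamp_(vmin, vmax)                   # range constraint (simplified)
    return x.detach()
```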

Analysing MEIs

We analysed MEIs to quantify their spatial, temporal, and chromatic properties.

Spatial and temporal components of MEIs

For each colour channel c, we decomposed the spatiotemporal MEIs into a spatial component and a temporal component by singular value decomposition:

$$U, S, V=\mathrm{svd}\!\left(x_c^{*(n)}\right)$$

with $x_c^{*(n)} \in \mathbb{R}^{50\times288}$ for $c \in$ {green, UV} being the MEI of neuron n in a given colour channel with its spatial dimensions (18 × 16 = 288) flattened out. As a result, any spatiotemporal dependencies are removed and we only analyse spatial and temporal properties separately. The following procedures were carried out in the same manner for the green and the UV component of the MEI, and we drop the colour channel index c for ease of notation. The temporal component is then defined as the first left singular vector, $U_{:1}$, and the spatial component is defined as the first right singular vector, $V_{:1}^{T}$, reshaped to the original dimensions 18 × 16.
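A minimal sketch of this decomposition for one colour channel (illustration only; assumes the channel is stored as a (50, 18, 16) array):

```python
import numpy as np

def split_mei_channel(mei_channel):
    """Split a single-channel MEI (frames x height x width) into its first
    temporal and spatial components via SVD."""
    t, h, w = mei_channel.shape
    U, S, Vt = np.linalg.svd(mei_channel.reshape(t, h * w), full_matrices=False)
    temporal = U[:, 0]                  # first left singular vector
    spatial = Vt[0].reshape(h, w)       # first right singular vector, reshaped
    return temporal, spatial, S[0]
```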

Concentric anisotropic 2D DoG fit

We modelled the spatial component as a concentric anisotropic DoG using the nonlinear least-squares solver scipy.optimize.least_squares with a soft-L1 loss function (Gupta et al., 2022). The DoGs were parameterised by a location $(\mu_x, \mu_y)$ shared between centre and surround, amplitudes $A_c$, $A_s$, variances $(\sigma_x^c, \sigma_y^c)$, $(\sigma_x^s, \sigma_y^s)$, and rotation angles $\theta_c$, $\theta_s$ separately for centre and surround:

$$\mathrm{DoG}=G_c-G_s$$

with

$$G_c(x,y)=A_c\exp\!\left(-\left(f_c\,(x-\mu_x)^2+2g_c\,(x-\mu_x)(y-\mu_y)+h_c\,(y-\mu_y)^2\right)\right)$$

and

$$f_c=\frac{\cos^2\theta_c}{2\sigma_x^c}+\frac{\sin^2\theta_c}{2\sigma_y^c},\qquad g_c=-\frac{\sin 2\theta_c}{4\sigma_x^c}+\frac{\sin 2\theta_c}{4\sigma_y^c},\qquad h_c=\frac{\sin^2\theta_c}{2\sigma_x^c}+\frac{\cos^2\theta_c}{2\sigma_y^c},$$

and likewise for $G_s$. We initialised $(\mu_x,\mu_y)$ in the following way: since we set the model readout’s location parameters to (0, 0) for all model neurons when generating their MEIs, we expected the MEIs to be centred at (0, 0) as well. Hence, we determined the location of the minimum and the maximum value of the MEI; whichever was closer to the centre (0, 0) provided the initial values for the parameters $(\mu_x,\mu_y)$. Starting from there, we then first fit a single Gaussian to the MEI, and took the resulting parameters as initial parameters for the DoG fit. This was a constrained optimisation problem, with lower and upper bounds on all parameters; in particular, such that the location parameter would not exceed the canvas of the MEI, and such that the variance would be strictly positive.
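A condensed sketch of this fitting procedure is given below (illustration only; the parameter bounds and the initialisation strategy described above are omitted, and the σ values are treated as variances, as in the parameterisation above):

```python
import numpy as np
from scipy.optimize import least_squares

def dog_model(p, xx, yy):
    """Concentric anisotropic difference-of-Gaussians evaluated on a pixel grid."""
    mx, my, a_c, a_s, sxc, syc, sxs, sys, thc, ths = p
    def gauss(a, sx, sy, th):
        f = np.cos(th) ** 2 / (2 * sx) + np.sin(th) ** 2 / (2 * sy)
        g = -np.sin(2 * th) / (4 * sx) + np.sin(2 * th) / (4 * sy)
        h = np.sin(th) ** 2 / (2 * sx) + np.cos(th) ** 2 / (2 * sy)
        return a * np.exp(-(f * (xx - mx) ** 2
                            + 2 * g * (xx - mx) * (yy - my)
                            + h * (yy - my) ** 2))
    return gauss(a_c, sxc, syc, thc) - gauss(a_s, sxs, sys, ths)

def fit_dog(spatial_mei, p0):
    """Least-squares DoG fit with a soft-L1 loss, as described above."""
    yy, xx = np.mgrid[0:spatial_mei.shape[0], 0:spatial_mei.shape[1]]
    residuals = lambda p: (dog_model(p, xx, yy) - spatial_mei).ravel()
    return least_squares(residuals, p0, loss='soft_l1')
```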

MEI properties

Centre size

We defined the diameter of the centre of the MEI in the horizontal and the vertical orientation, respectively, as $d_x^c=2\sigma_x^c$ and $d_y^c=2\sigma_y^c$. The centre size was calculated as $\frac{1}{2}(d_x^c+d_y^c)$. We then estimated a contour outlining the MEI centre as the line that is defined by all points at which the 2D centre Gaussian $G_c$ attains the value $G_c(x,y)$ with $(x,y)=(\mu_x+\sigma_x^c,\mu_y+\sigma_y^c)$. The centre mask $m$ was then defined as a binary matrix with all pixels within the convex hull of this contour being 1 and all other pixels set to 0. This mask is used for calculating centre chromatic contrast (see below).

Temporal frequency

To estimate temporal frequency of the MEIs, we estimated the power spectrum of the temporal components using a fast Fourier transform after attenuating high frequency noise by filtering with a fifth-order low-pass Butterworth filter with cutoff frequency 10 Hz. We then estimated the mean frequency of the temporal component by calculating an average of the frequency components, each weighted with its relative power.

Contrast

The contrast of the MEIs in the two channels, $\gamma(x_c^{(n)})$ for $c \in$ {green, UV}, was defined as the difference between the mean value within the centre mask $m$ at the last two peaks of the temporal component of the MEI in the UV channel, at time points $t_2$ and $t_1$:

$$\gamma\!\left(x_c^{(n)}\right)=\left(x_c^{(n)}\odot m\right)\!(t_2)-\left(x_c^{(n)}\odot m\right)\!(t_1),$$

where ⊙ denotes the element-wise multiplication of the MEI and the binary mask (see Figure 3f). The peaks were found with the function scipy.signal.find_peaks, and the peaks found for the UV channel were used to calculate contrast both in the green and the UV channel.
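A sketch of this contrast measure for one cell (illustration only; it assumes the MEI channels are (50, 18, 16) arrays, a binary centre mask of shape (18, 16), and that the mean is taken over the masked pixels):

```python
import numpy as np
from scipy.signal import find_peaks

def centre_contrast(mei_green, mei_uv, temporal_uv, centre_mask):
    """Chromatic centre contrast: difference of the centre-masked mean between
    the last two peaks (t1 < t2) of the UV temporal component."""
    peaks, _ = find_peaks(temporal_uv)
    t1, t2 = peaks[-2], peaks[-1]
    masked_mean = lambda mei, t: mei[t][centre_mask > 0].mean()
    return {c: masked_mean(mei, t2) - masked_mean(mei, t1)
            for c, mei in [('green', mei_green), ('uv', mei_uv)]}
```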

Validating MEIs experimentally

Generating MEI stimuli

To test experimentally whether the model correctly predicts which stimuli would maximally excite RGCs of different RGC groups, we performed a new set of experiments (numbers indicated in red in Figure 1—figure supplement 1), where we complemented our stimulus set with MEI stimuli. For the MEI stimuli, we selected 11 RGCs, chosen to span the response space and to represent both well-described and poorly understood RGC groups, for which we generated MEIs at different positions on a 5 × 5 grid (spanning 110 μm in vertical and horizontal direction). We decomposed the MEIs as described above, and reconstructed MEIs as rank 1 tensors by taking the outer product of the spatial and temporal components:

$$\bar{x}=S_{11}\,U_{:1}\,V_{:1}^{T}$$

The MEI stimuli, lasting 50 frames (1.66 s) were padded with 10 frames (0.34 s) of inter-stimulus grey, and were randomly interleaved. With 11 stimuli, presented at 25 positions and lasting 2 s each, the total stimulus duration was 11 × 25 × 2 s = 550 s. Since the model operated on a z-scored (0 mean, 1 SD) version of the movie, MEIs as predicted by the model lived in the same space and had to be transformed back to the stimulator range ([0, 255]) before being used as stimuli in an experiment by scaling with the movie’s SD and adding the movie’s mean. The MEIs’ green channel was then displayed with the green LED, and the UV channel was displayed with the UV LED. For experiments at Northwestern University, an additional transform was necessary to achieve the same levels of photoreceptor activation (photoisomerisation rates) for M- and S-cones with different LEDs. To ensure proper chromatic scaling between the different experimental apparatuses with different spectral profiles, we described the relative activation of M- and S-cones by the green and UV LEDs in the stimulation setup used in the two-photon imaging experiments (setup A) by a matrix

$$\mathbf{A}=\begin{bmatrix}a_{mg} & a_{mu}\\ a_{sg} & a_{su}\end{bmatrix}=\begin{bmatrix}1 & 0\\ 0.19 & 1\end{bmatrix}$$

and the relative activation of M- and S-cones by the stimulation setup used in the patch-clamp experiments (setup B) by a matrix

$$\mathbf{B}=\begin{bmatrix}b_{mg} & b_{mu}\\ b_{sg} & b_{su}\end{bmatrix}=\begin{bmatrix}1 & 0.035\\ 0.9 & 1\end{bmatrix},$$

where diagonal entries describe the activation of M-cones by the green LED and of S-cones by the UV LED, and off-diagonal entries describe the cross-activation (i.e. M-cones by the UV LED and S-cones by the green LED). The activation of M- and S-cones, $\mathbf{e}^T=(e_m, e_s)$, by a stimulus $x \in \mathbb{R}^{2\times1}$ displayed on a given stimulation setup was approximated as $\mathbf{e}=\mathbf{A}x$ (Christenson et al., 2022). Hence, a stimulus $x'$ displayed on setup B, defined as $x'=\mathbf{B}^{-1}\mathbf{A}x$, will achieve the same photoreceptor activation as stimulus $x$ displayed on setup A. Since the solution exceeded the valid range of the stimulator ([0, 255]), we added an offset and multiplied with a scalar factor to ensure all stimuli were within the valid range.
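A sketch of this chromatic re-scaling (the values of A and B are as reconstructed above, with rows corresponding to M- and S-cones and columns to the green and UV LED; the final offset and scaling back into [0, 255] is omitted):

```python
import numpy as np

A = np.array([[1.0,  0.0],      # setup A: activation of (M, S) cones by (green, UV) LEDs
              [0.19, 1.0]])
B = np.array([[1.0,  0.035],    # setup B
              [0.9,  1.0]])

def to_setup_b(x_setup_a):
    """Map an LED stimulus (green, UV) from setup A to setup B so that
    cone activations match: x' = B^-1 A x."""
    return np.linalg.solve(B, A @ np.asarray(x_setup_a))
```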

Analysing RGC responses to MEI stimuli

We wanted to evaluate the responses of RGCs to the MEI stimuli in a spatially resolved fashion, i.e., weighting responses to MEIs displayed at different locations proportional to the strength of the RGC’s RF at that location. In order to be able to meaningfully compare MEI responses between RGCs and across groups, for each RGC, we first centred and scaled the responses to zero mean and a standard deviation of 1. Then, for each RGC n, we computed a spatial average of its responses, weighting its responses at each spatial location $(x,y)$ proportional to the Gaussian density $\mathcal{N}_{\mu_n,\sigma_n}(x,y)$, where the parameters of the Gaussian, $\mu_n=(\mu_x,\mu_y)$ and $\sigma_n$, were the model’s estimated readout parameters for neuron n (Figure 4b, c, and d, left):

$$\left\langle r^{(n)}\right\rangle_{x,y}=\sum_{x=1}^{5}\sum_{y=1}^{5} r_{x,y}^{(n)}\,\mathcal{N}_{\mu_n,\sigma_n}(x,y)$$

where $r_{x,y}^{(n)} \in \mathbb{R}^{11\times60}$ is the 60 frames (2 s) long response of neuron n to each MEI at position $(x,y)$, resampled from the recording frame rate of 7.81 Hz to 30 Hz. We then averaged $\langle r^{(n)}\rangle_{x,y}$ across time in the optimisation time window, i.e., frames 30–50, to get a scalar response $\tilde{r}^{(n)}=\langle r^{(n)}\rangle_{x,y,t}$ for each MEI stimulus (Figure 4d).

Selectivity index

To quantify the selectivity of the response $\tilde{r}^{(n)}(x_i^*)$ of an RGC n to an MEI $x_i^*$, we defined a selectivity index as follows. First, we standardised the responses $\tilde{r}^{(n)}$ across all MEIs by subtracting the mean and dividing by the standard deviation. The selectivity index of RGC group $G_g$ to MEI $x_i^*$ was then defined as

$$SI_g(x_i^*)=\left\langle \tilde{r}^{(n)}(x_i^*)-\frac{1}{10}\sum_{j=1}^{11}\left(1-\delta_{ij}\right)\tilde{r}^{(n)}(x_j^*)\right\rangle_{n},$$

where $\delta_{ij}$ is the Kronecker delta. In words, the SI is the difference (in units of SD response) between the response to the MEI of interest ($x_i^*$) and the mean response to all other (10) MEIs, $\frac{1}{10}\sum_{j=1}^{11}(1-\delta_{ij})\,\tilde{r}^{(n)}(x_j^*)$, averaged across all cells n belonging to the group of interest $G_g$.
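A sketch of this computation for the cells of one group (illustration only; `responses` is assumed to be a cells × 11 array of spatially averaged MEI responses):

```python
import numpy as np

def selectivity_index(responses):
    """SI per MEI, averaged over the cells of one group.
    responses: array of shape (n_cells, 11)."""
    mu = responses.mean(axis=1, keepdims=True)
    sd = responses.std(axis=1, keepdims=True)
    z = (responses - mu) / sd                                  # standardise per cell
    n_mei = z.shape[1]
    others = (z.sum(axis=1, keepdims=True) - z) / (n_mei - 1)  # mean over the other 10 MEIs
    return (z - others).mean(axis=0)                           # average across cells
```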

Characterising nonlinear processing of chromatic contrast

We wanted to analyse the tuning of G28/tSbC RGCs to chromatic contrast and, to this end, we mapped the model response and its gradient across a range of chromatic contrasts (Figure 6). Specifically, the MEIs have $2\times18\times16\times50=28{,}800$ pixels, and hence dimensions, with 14,400 for each colour channel. Now let $x^{*(n)} \in \mathbb{R}^{1\times28800}$ be the cell’s MEI estimated using the CNN model, with the first $d=14{,}400$ dimensions defining the green pixels and the remaining dimensions defining the UV pixels. Then, for each cell, we first consider a two-dimensional subspace spanned by two basis vectors $\mathbf{e}_1, \mathbf{e}_2$, where

$$\mathbf{e}_1=s_1\left[x_1^{(n)},\,x_2^{(n)},\,\ldots,\,x_d^{(n)},\,0,\,\ldots,\,0\right]^{T}\qquad \mathbf{e}_2=s_2\left[0,\,\ldots,\,0,\,x_{d+1}^{(n)},\,\ldots,\,x_{2d}^{(n)}\right]^{T}$$

In words, the first basis vector $\mathbf{e}_1$ consists of the green component of the MEI in the green channel, multiplied by $s_1$, and of 0s in the UV channel; the second basis vector $\mathbf{e}_2$ consists of the UV component of the MEI in the UV channel, multiplied by $s_2$, and of 0s in the green channel. $s_1$ and $s_2$ are two scaling factors chosen to equalise contrast (as measured by L2 norm) across colour channels while preserving the contrast of the stimulus as a whole. In the subspace spanned by these basis vectors, the point [1, 1] represents a contrast-scaled version of the original MEI. We then sampled 11 points along each dimension, equally spaced between [−1, 1], which resulted in stimuli that are identical in terms of their spatial and temporal properties and only differ in their contrast. We then evaluated the model neuron response at these points in the subspace (Figure 6d). We also evaluated the gradient of the model neuron response at these points and plotted the direction of the gradient projected into the subspace spanned by $\mathbf{e}_1, \mathbf{e}_2$ (Figure 6b and c).

Detection performance analysis

To test the performance of individual RGCs of different groups in detecting the target class of inter-clip transitions (ground-to-sky) from all other classes of inter-clip transitions, we performed an ROC analysis (Fawcett, 2006). For each RGC, we calculated its response to an inter-clip transition occurring at time $t_0$ as the baseline-subtracted average response within 1 s following the transition, i.e., $\frac{1}{T}\sum_{t=0}^{T} r(t) - r(t_0)$, with $T=30$ frames at 30 Hz. For all N=40 equally spaced thresholds within the response range of an RGC, we then calculated the TPR and FPR of a hypothetical classifier classifying all transitions eliciting an above-threshold response as a positive, and all other transitions as negative. Plotting the TPR as a function of FPR yields an ROC curve, the area under which (AUC) is equivalent to the probability that the RGC will respond more strongly to a randomly chosen inter-clip transition of the target class than to a randomly chosen inter-clip transition of a different class. The AUC thus is a measure of performance for RGCs in this detection task.
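A sketch of this ROC analysis for a single cell (illustration only; `responses` holds the baseline-subtracted transition responses and `is_target` marks the ground-to-sky transitions):

```python
import numpy as np

def transition_auc(responses, is_target, n_thresholds=40):
    """Area under the ROC curve for detecting target transitions by thresholding."""
    thresholds = np.linspace(responses.min(), responses.max(), n_thresholds)
    tpr = np.array([(responses[is_target] > th).mean() for th in thresholds])
    fpr = np.array([(responses[~is_target] > th).mean() for th in thresholds])
    order = np.argsort(fpr)                          # integrate TPR over increasing FPR
    fpr, tpr = fpr[order], tpr[order]
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))  # trapezoidal rule
```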

Detection task in simulation

We simulated the four types of transitions (sky-sky, sky-ground, ground-ground, ground-sky) in natural scenes at different velocities, which could be triggered by different behaviours such as locomotion or eye movements. With the simulated context-changing stimuli, we predicted model neuron responses in silico and then determined if G28 could perform the detection task robustly well across speeds.

For generating the stimuli, 500 frames were randomly extracted from the same mouse natural movies used for the 2P imaging experiments. For each frame, we simulated visual transitions by moving a 72 × 64 pixel-large window along a fixed trajectory (Figure 7h, bottom) at four different angular velocities: 50, 150, 250, and 350° s⁻¹, corresponding to 4, 12, 20, and 28 pixels per frame, respectively (Figure 7—figure supplement 3a and b). Each edge of the trajectory is 220 pixels long, covering 90.6° of visual angle. Each selected scene frame was sampled eight times (i.e. twice per velocity). To avoid potential biases due to asymmetries in the mouse natural movie, we sampled each frame for each velocity both in clockwise and in counterclockwise direction. The stimuli were then down-sampled to 18 × 16 pixels and shown to the model at a frame rate of 30 Hz. Because the trajectories contained different numbers of moving frames for the four velocities, we ‘padded’ the stimuli at the beginning and the end of each transition stimulus by duplicating the start and end frames, resulting in a total of 60 frames each (see illustration in Figure 7—figure supplement 3).

Statistical analysis

Permutation test

We wanted to test how likely the differences in AUC observed for different RGC groups are to occur under the null hypothesis that the underlying distributions they are sampled from are equal. To this end, we performed a permutation test. We generated a null distribution for our test statistic, the absolute difference in AUC values ΔAUC, by shuffling the RGC group labels of the two groups of interest (e.g. G28 and G24) and calculating the test statistic with shuffled labels 100,000 times. We only included RGC groups with at least N=4 cells in this analysis. We then obtained a p-value for the ΔAUC observed with true labels as the proportion of entries in the null distribution larger than this observed ΔAUC.
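A sketch of this test for two groups of AUC values (illustration, not the original code):

```python
import numpy as np

def permutation_test(auc_a, auc_b, n_perm=100_000, seed=0):
    """Two-sample permutation test on the absolute difference in mean AUC."""
    rng = np.random.default_rng(seed)
    observed = abs(np.mean(auc_a) - np.mean(auc_b))
    pooled = np.concatenate([auc_a, auc_b])
    n_a = len(auc_a)
    null = np.empty(n_perm)
    for i in range(n_perm):
        perm = rng.permutation(pooled)
        null[i] = abs(perm[:n_a].mean() - perm[n_a:].mean())
    return float((null > observed).mean())        # p-value
```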

Bootstrapped confidence intervals

We bootstrapped confidence intervals for ΔAUC (Figure 7 and Figure 7—figure supplement 2). For ΔAUC, we generated a bootstrapped distribution by sampling 100 times with replacement from the AUC values of the two groups that were being compared and calculated ΔAUC. We then estimated the 95% confidence interval for ΔAUC as the interval defined by the 2.5th and 97.5th percentile of the bootstrapped distribution of ΔAUC.

For Γ(ϕs,ϕνg), we generated a bootstrapped distribution by sampling 100 times with replacement from the MEI responses of RGC group g and then calculating RDMϕνg and Γ(ϕs,ϕνg) for each sample. We then estimated the 95% confidence interval for Γ(ϕs,ϕνg) as the interval defined by the 2.5th and 97.5th percentile of the bootstrapped distribution of Γ(ϕs,ϕνg).

Estimating effect size

The effect size of difference in AUC observed for different RGC groups l and k, ΔAUC (Figure 7 and Figure 7—figure supplement 2), was estimated as Cohen’s d (Cohen, 1988; Goulet-Pelletier and Cousineau, 2018):

$$\frac{|m_k-m_l|}{s},$$

with

$$s=\sqrt{\frac{(N_k-1)\,s_k^2+(N_l-1)\,s_l^2}{N_k+N_l-2}}$$

and $m_k$ and $s_k$ the sample mean and standard deviation, respectively, of the AUC observed for the $N_k$ RGCs of group k.

Estimating linear correlation

Wherever the linear correlation between two paired samples x and y of size N was calculated for evaluating model performance (Figure 2, Figure 1—figure supplement 1, Figure 4—figure supplement 1), we used Pearson’s correlation coefficient:

$$C_{xy}=\frac{\sum_{i}^{N}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i}^{N}(x_i-\bar{x})^2}\,\sqrt{\sum_{i}^{N}(y_i-\bar{y})^2}}$$

Acknowledgements

We thank Jonathan Oesterle and Dominic Gonschorek for feedback on the manuscript, Jan Lause for statistical consulting, and Merle Harrer for general assistance. We also thank all members of the Sinz lab for regular discussions on the project. This work was supported by the German Research Foundation (DFG; CRC 1233 'Robust Vision: Inference Principles and Neural Mechanisms', project number 276693517 to PB, MB, TE, KF; Heisenberg Professorship, BE5601/8-1 to PB; Excellence Cluster EXC 2064, project number 390727645 to PB; CRC 1456 'Mathematics of Experiment' project number 432680300 to ASE), the Federal Ministry of Education and Research (FKZ 01IS18039A to PB), National Institutes of Health (NIH; NEI EY031029, NEI EY031329 to GWS; NEI F30EY031565, NIGMS T32GM008152 to ZJ), and the European Research Council (ERC) under the European Union’s Horizon Europe research and innovation programme (ASE, grant agreement No. 101041669). We acknowledge support from the Open Access Publication Fund of the University of Tübingen.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Thomas Euler, Email: thomas.euler@cin.uni-tuebingen.de.

Fred Rieke, University of Washington, United States.

Lois EH Smith, Boston Children's Hospital, United States.

Funding Information

This paper was supported by the following grants:

  • Deutsche Forschungsgemeinschaft CRC 1233 project number 276693517 to Matthias Bethge, Philipp Berens, Katrin Franke, Thomas Euler.

  • Deutsche Forschungsgemeinschaft Heisenberg Professorship BE5601/8-1 to Philipp Berens.

  • Deutsche Forschungsgemeinschaft EXC 2064 390727645 to Philipp Berens.

  • Deutsche Forschungsgemeinschaft CRC 1456 project number 432680300 to Alexander S Ecker.

  • Bundesministerium für Bildung und Forschung FKZ 01IS18039A to Philipp Berens.

  • National Institutes of Health NEI EY031029 to Gregory W Schwartz.

  • National Institutes of Health NEI F30EY031565 to Zachary Jessen.

  • European Research Council grant agreement No. 101041669 to Alexander S Ecker.

  • National Institutes of Health NEI EY031329 to Gregory W Schwartz.

  • National Institutes of Health NIGMS T32GM008152 to Zachary Jessen.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Data curation, Software, Formal analysis, Validation, Visualization, Methodology, Writing - original draft, Writing – review and editing.

Data curation, Validation, Investigation.

Software, Formal analysis, Methodology.

Validation, Visualization, Methodology, Writing – review and editing.

Resources, Software.

Software, Methodology, Writing – review and editing.

Validation, Investigation, Visualization, Methodology, Writing - original draft.

Supervision, Funding acquisition, Validation, Investigation, Methodology, Writing - original draft.

Conceptualization, Supervision, Funding acquisition, Writing – review and editing.

Conceptualization, Supervision, Funding acquisition, Writing – review and editing.

Conceptualization, Supervision, Funding acquisition, Writing – review and editing.

Conceptualization, Supervision, Funding acquisition, Writing – review and editing.

Conceptualization, Supervision, Funding acquisition, Visualization, Writing - original draft, Project administration, Writing – review and editing.

Ethics

All imaging experiments were conducted at the University of Tübingen; the corresponding animal procedures were approved by the governmental review board (Regierungspräsidium Tübingen, Baden-Württemberg, Konrad-Adenauer-Str. 20, 72072 Tübingen, Germany) and performed according to the laws governing animal experimentation issued by the German Government. All electrophysiological experiments were conducted at Northwestern University; the corresponding animal procedures were performed according to standards provided by Northwestern University Center for Comparative Medicine and approved by the Institutional Animal Care and Use Committee (IACUC).

Additional files

MDAR checklist

Data availability

The data and the movie stimulus are available at https://gin.g-node.org/eulerlab/rgc-natstim. Code for training models and reproducing analyses and figures is available at https://github.com/eulerlab/rgc-natstim-model, (copy archived at Hoefling et al., 2024).

The following dataset was generated:

Höfling L, Szatko KP, Behrens C, Deng Y, Qiu Y, Klindt DA, Jessen Z, Schwartz GW, Bethge M, Berens P, Franke K, Ecker AS, Euler T. 2024. rgc-natstim-model. G-Node.

References

  1. Abballe L, Asari H. Natural image statistics for mouse vision. PLOS ONE. 2022;17:e0262763. doi: 10.1371/journal.pone.0262763. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Aljadeff J, Segev R, Berry MJ, Sharpee TO. Spike triggered covariance in strongly correlated gaussian stimuli. PLOS Computational Biology. 2013;9:e1003206. doi: 10.1371/journal.pcbi.1003206. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Aljadeff J, Lansdell BJ, Fairhall AL, Kleinfeld D. Analysis of neuronal spike trains, deconstructed. Neuron. 2016;91:221–259. doi: 10.1016/j.neuron.2016.05.039. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Arshadi C, Günther U, Eddison M, Harrington KIS, Ferreira TA. SNT: a unifying toolbox for quantification of neuronal anatomy. Nature Methods. 2021;18:374–377. doi: 10.1038/s41592-021-01105-7. [DOI] [PubMed] [Google Scholar]
  5. Baden T, Schubert T, Chang L, Wei T, Zaichuk M, Wissinger B, Euler T. A tale of two retinal domains: near-optimal sampling of achromatic contrasts in natural scenes through asymmetric photoreceptor distribution. Neuron. 2013;80:1206–1217. doi: 10.1016/j.neuron.2013.09.030. [DOI] [PubMed] [Google Scholar]
  6. Baden T, Berens P, Franke K, Román Rosón M, Bethge M, Euler T. The functional diversity of retinal ganglion cells in the mouse. Nature. 2016;529:345–350. doi: 10.1038/nature16468. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Baden T, Schaeffel F, Berens P. Visual neuroscience: a retinal ganglion cell to report image focus? Current Biology. 2017;27:R139–R141. doi: 10.1016/j.cub.2016.12.022. [DOI] [PubMed] [Google Scholar]
  8. Baden T, Euler T, Berens P. Understanding the retinal basis of vision across species. Nature Reviews. Neuroscience. 2020;21:5–20. doi: 10.1038/s41583-019-0242-1. [DOI] [PubMed] [Google Scholar]
  9. Bae JA, Mu S, Kim JS, Turner NL, Tartavull I, Kemnitz N, Jordan CS, Norton AD, Silversmith WM, Prentki R, Sorek M, David C, Jones DL, Bland D, Sterling ALR, Park J, Briggman KL, Seung HS, Eyewirers Digital museum of retinal ganglion cells with dense anatomy and physiology. Cell. 2018;173:1293–1306. doi: 10.1016/j.cell.2018.04.040. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Bashivan P, Kar K, DiCarlo JJ. Neural population control via deep image synthesis. Science. 2019;364:eaav9436. doi: 10.1126/science.aav9436. [DOI] [PubMed] [Google Scholar]
  11. Batty E, Merel J, Brackbill N, Heitman A, Sher A, Litke A, Chichilnisky EJ, Paninski L. Multilayer recurrent network models of primate retinal ganglion cell responses. Proceedings of the 5th International Conference on Learning Representations.2017. [Google Scholar]
  12. Behrens C, Schubert T, Haverkamp S, Euler T, Berens P. Connectivity map of bipolar cells and photoreceptors in the mouse retina. eLife. 2016;5:e20041. doi: 10.7554/eLife.20041. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Berens P, Freeman J, Deneux T, Chenkov N, McColgan T, Speiser A, Macke JH, Turaga SC, Mineault P, Rupprecht P, Gerhard S, Friedrich RW, Friedrich J, Paninski L, Pachitariu M, Harris KD, Bolte B, Machado TA, Ringach D, Stone J, Rogerson LE, Sofroniew NJ, Reimer J, Froudarakis E, Euler T, Román Rosón M, Theis L, Tolias AS, Bethge M. Community-based benchmarking improves spike rate inference from two-photon calcium imaging data. PLOS Computational Biology. 2018;14:e1006157. doi: 10.1371/journal.pcbi.1006157. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Briggman KL, Euler T. Bulk electroporation and population calcium imaging in the adult mammalian retina. Journal of Neurophysiology. 2011;105:2601–2609. doi: 10.1152/jn.00722.2010. [DOI] [PubMed] [Google Scholar]
  15. Burg MF, Zenkel T, Vystrčilová M, Oesterle J, Höfling L, Willeke KF, Lause J. Most Discriminative Stimuli for Functional Cell Type Identification. arXiv. 2023 http://arxiv.org/abs/2401.05342
  16. Cadena SA, Weis MA, Gatys LA, Bethge M, Ecker AS. In: Computer Vision - ECCV 2018. Cadena SA, Weis MA, editors. Springer Link; 2018. Diverse feature visualizations reveal invariances in early layers of deep neural networks; pp. 225–240. [DOI] [Google Scholar]
  17. Cadena SA, Denfield GH, Walker EY, Gatys LA, Tolias AS, Bethge M, Ecker AS. Deep convolutional models improve predictions of macaque V1 responses to natural images. PLOS Computational Biology. 2019;15:e1006897. doi: 10.1371/journal.pcbi.1006897. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Cafaro M, Midgley F, Neward T, Wark B. Symphony-DAS. v2.6.3.0. GitHub. 2019 https://symphony-das.github.io
  19. Chang L, Breuninger T, Euler T. Chromatic coding from cone-type unselective circuits in the mouse retina. Neuron. 2013;77:559–571. doi: 10.1016/j.neuron.2012.12.012. [DOI] [PubMed] [Google Scholar]
  20. Chichilnisky EJ. A simple white noise analysis of neuronal light responses. Network. 2001;12:199–213. doi: 10.1080/713663221. [DOI] [PubMed] [Google Scholar]
  21. Christenson MP, Mousavi SN, Oriol E, Heath SL, Behnia R. Exploiting colour space geometry for visual stimulus design across animals. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences. 2022;377:20210280. doi: 10.1098/rstb.2021.0280. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Cohen JN. Statistical Power Analysis for the Behavioral Sciences. Routledge; 1988. [DOI] [Google Scholar]
  23. Dacey DM, Lee BB. The “blue-on” opponent pathway in primate retina originates from a distinct bistratified ganglion cell type. Nature. 1994;367:731–735. doi: 10.1038/367731a0. [DOI] [PubMed] [Google Scholar]
  24. Ding Z, Tran DT, Ponder K, Cobos E, Ding Z, Fahey PG, Wang E, Muhammad T, Fu J, Cadena SA, Papadopoulos S, Patel S, Franke K, Reimer J, Sinz FH, Ecker AS, Pitkow X, Tolias AS. Bipartite Invariance in Mouse Primary Visual Cortex. bioRxiv. 2023 doi: 10.1101/2023.03.15.532836. [DOI]
  25. Euler T, Hausselt SE, Margolis DJ, Breuninger T, Castell X, Detwiler PB, Denk W. Eyecup scope--optical recordings of light stimulus-evoked fluorescence signals in the retina. Pflugers Archiv. 2009;457:1393–1414. doi: 10.1007/s00424-008-0603-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Euler T, Franke K, Baden T. In: In Multiphoton Microscopy. Hartveit Espen., editor. Humana Press; 2019. Studying a light sensor with light: multiphoton imaging in the retina; pp. 225–250. [DOI] [Google Scholar]
  27. Evans DA, Stempel AV, Vale R, Branco T. Cognitive control of escape behaviour. Trends in Cognitive Sciences. 2019;23:334–348. doi: 10.1016/j.tics.2019.01.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Fawcett TN. An introduction to ROC analysis. Pattern Recognition Letters. 2006;27:861–874. doi: 10.1016/j.patrec.2005.10.010. [DOI] [Google Scholar]
  29. Field GD, Sher A, Gauthier JL, Greschner M, Shlens J, Litke AM, Chichilnisky EJ. Spatial properties and functional organization of small bistratified ganglion cells in primate retina. The Journal of Neuroscience. 2007;27:13261–13272. doi: 10.1523/JNEUROSCI.3437-07.2007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Franke K, Maia Chagas A, Zhao Z, Zimmermann MJ, Bartel P, Qiu Y, Szatko KP, Baden T, Euler T. An arbitrary-spectrum spatial visual stimulator for vision research. eLife. 2019;8:e48779. doi: 10.7554/eLife.48779. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Franke K, Willeke KF, Ponder K, Galdamez M, Zhou N, Muhammad T, Patel S, Froudarakis E, Reimer J, Sinz FH, Tolias AS. State-dependent pupil dilation rapidly shifts visual feature selectivity. Nature. 2022;610:128–134. doi: 10.1038/s41586-022-05270-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Fu J, Shrinivasan S, Baroni L, Ding Z, Fahey PG, Pierzchlewicz P, Ponder K, Froebe R, Ntanavara L, Muhammad T, Willeke KF, Wang E, Ding Z, Tran DT, Papadopoulos S, Patel S, Reimer J, Ecker AS, Pitkow X, Antolik J, Sinz FH, Haefner RM, Tolias AS, Franke K. Pattern Completion and Disruption Characterize Contextual Modulation in the Visual Cortex. bioRxiv. 2024 doi: 10.1101/2023.03.13.532473. [DOI]
  33. Gawne TJ, Norton TT. An opponent dual-detector spectral drive model of emmetropization. Vision Research. 2020;173:7–20. doi: 10.1016/j.visres.2020.03.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Globerson A, Stark E, Vaadia E, Tishby N. The minimum information principle and its application to neural code analysis. PNAS. 2009;106:3490–3495. doi: 10.1073/pnas.0806782106. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Goetz J, Jessen ZF, Jacobi A, Mani A, Cooler S, Greer D, Kadri S, Segal J, Shekhar K, Sanes JR, Schwartz GW. Unified classification of mouse retinal ganglion cells using function, morphology, and gene expression. Cell Reports. 2022;40:111040. doi: 10.1016/j.celrep.2022.111040. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Goldin MA, Lefebvre B, Virgili S, Pham Van Cang MK, Ecker A, Mora T, Ferrari U, Marre O. Context-dependent selectivity to natural images in the retina. Nature Communications. 2022;13:5556. doi: 10.1038/s41467-022-33242-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Goulet-Pelletier JC, Cousineau D. A review of effect sizes and their confidence intervals, part I: the cohen’s d family. The Quantitative Methods for Psychology. 2018;14:242–265. doi: 10.20982/tqmp.14.4.p242. [DOI] [Google Scholar]
  38. Gouras P, Ekesten B. Why do mice have ultra-violet vision? Experimental Eye Research. 2004;79:887–892. doi: 10.1016/j.exer.2004.06.031. [DOI] [PubMed] [Google Scholar]
  39. Gupta D, Młynarski W, Sumser A, Symonova O, Svatoň J, Joesch M. Panoramic visual statistics shape retina-wide organization of receptive fields. Neuroscience. 2022;01:1–31. doi: 10.1101/2022.01.11.475815. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Hoefling L, Deng Y, D’Agostino F. rgc-natstim-model. swh:1:rev:fab17f3c34167effb97f2fbbf534ea5f5cfcb8a8. Software Heritage. 2024 https://archive.softwareheritage.org/swh:1:dir:9b10b35ed7b1ff364637c849296fd407e66d8dc9;origin=https://github.com/eulerlab/rgc-natstim-model;visit=swh:1:snp:c7e52be290f24dba2e99afe2ddc8634c365a4bf9;anchor=swh:1:rev:fab17f3c34167effb97f2fbbf534ea5f5cfcb8a8
  41. Ito S, Feldheim DA, Litke AM. Segregation of visual response properties in the mouse superior colliculus and their modulation during locomotion. The Journal of Neuroscience. 2017;37:8428–8443. doi: 10.1523/JNEUROSCI.3689-16.2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Jacobs GH, Williams GA, Fenwick JA. Influence of cone pigment coexpression on spectral sensitivity and color vision in the mouse. Vision Research. 2004;44:1615–1622. doi: 10.1016/j.visres.2004.01.016. [DOI] [PubMed] [Google Scholar]
  43. Jacoby J, Schwartz GW. Typology and circuitry of suppressed-by-contrast retinal ganglion cells. Frontiers in Cellular Neuroscience. 2018;12:269. doi: 10.3389/fncel.2018.00269. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Joesch M, Meister M. A neuronal circuit for colour vision based on rod-cone opponency. Nature. 2016;532:236–239. doi: 10.1038/nature17158. [DOI] [PubMed] [Google Scholar]
  45. Johnson KP, Fitzpatrick MJ, Zhao L, Wang B, McCracken S, Williams PR, Kerschensteiner D. Cell-type-specific binocular vision guides predation in mice. Neuron. 2021;109:1527–1539. doi: 10.1016/j.neuron.2021.03.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Karamanlis D, Gollisch T. Nonlinear spatial integration underlies the diversity of retinal ganglion cell responses to natural images. The Journal of Neuroscience. 2021;41:3479–3498. doi: 10.1523/JNEUROSCI.3075-20.2021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Karamanlis D, Schreyer HM, Gollisch T. Retinal encoding of natural scenes. Annual Review of Vision Science. 2022;8:171–193. doi: 10.1146/annurev-vision-100820-114239. [DOI] [PubMed] [Google Scholar]
  48. Kerschensteiner D. Feature detection by retinal ganglion cells. Annual Review of Vision Science. 2022;8:135–169. doi: 10.1146/annurev-vision-100419-112009. [DOI] [PubMed] [Google Scholar]
  49. Khani MH, Gollisch T. Linear and nonlinear chromatic integration in the mouse retina. Nature Communications. 2021;12:1900. doi: 10.1038/s41467-021-22042-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Kim T, Shen N, Hsiang JC, Johnson KP, Kerschensteiner D. Dendritic and parallel processing of visual threats in the retina control defensive responses. Science Advances. 2020;6:eabc9920. doi: 10.1126/sciadv.abc9920. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Kingma DP, Ba J. Adam: a method for stochastic optimization. Proceedings of the 3rd International Conference on Learning Representations. 2015;01:1–15. doi: 10.1145/1830483.1830503. [DOI] [Google Scholar]
  52. Klindt DA, Ecker AS, Euler T, Bethge M. Neural System Identification for Large Populations Separating ‘What’ and ‘where. arXiv. 2017 http://arxiv.org/abs/1711.02653
  53. Krieger B, Qiao M, Rousso DL, Sanes JR, Meister M. Four alpha ganglion cell types in mouse retina: Function, structure, and molecular signatures. PLOS ONE. 2017;12:e0180091. doi: 10.1371/journal.pone.0180091. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Lettvin JY, Maturana HR, McCulloch WS, Pitts WH. What the frog’s eye tells the frog’s brain. Proceedings of the IRE. 1959;47:1940–1951. doi: 10.1109/JRPROC.1959.287207. [DOI] [Google Scholar]
  55. Liu JK, Schreyer HM, Onken A, Rozenblit F, Khani MH, Krishnamoorthy V, Panzeri S, Gollisch T. Inference of neuronal functional circuitry with spike-triggered non-negative matrix factorization. Nature Communications. 2017;8:149. doi: 10.1038/s41467-017-00156-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. lucastheis. c2s. swh:1:rev:9f12398a33a17e557b4a689f1dce902123c6f2eb. Software Heritage. 2016 https://archive.softwareheritage.org/swh:1:dir:2434f9554d743b4715f82ce6a26b3a9e07368212;origin=https://github.com/lucastheis/c2s;visit=swh:1:snp:507ec9a7abe956e3292ecfc341605b14b0eb47ff;anchor=swh:1:rev:9f12398a33a17e557b4a689f1dce902123c6f2eb
  57. Lurz KK, Bashiri M, Willeke K, Jagadish AK, Wang E, Walker EY, Cadena SA, Muhammad T, Cobos E, Tolias AS, Ecker AS, Sinz FH. Generalization in data-driven models of primary visual cortex. Neuroscience. 2021;01:9. doi: 10.1101/2020.10.05.326256. [DOI] [Google Scholar]
  58. Maheswaranathan N, McIntosh LT, Tanaka H, Grant S, Kastner DB, Melander JB, Nayebi A, Brezovec LE, Wang JH, Ganguli S, Baccus SA. Interpreting the retinal neural code for natural scenes: from computations to neurons. Neuron. 2023;111:2742–2755. doi: 10.1016/j.neuron.2023.06.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Mani A, Schwartz GW. Circuit mechanisms of a retinal ganglion cell with stimulus-dependent response latency and activation beyond its dendrites. Current Biology. 2017;27:471–482. doi: 10.1016/j.cub.2016.12.033. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Marr D. Vision: A Computational Investigation Into the Human Representation and Processing of Visual Information. The MIT Press; 2010. [Google Scholar]
  61. Martersteck EM, Hirokawa KE, Evarts M, Bernard A, Duan X, Li Y, Ng L, Oh SW, Ouellette B, Royall JJ, Stoecklin M, Wang Q, Zeng H, Sanes JR, Harris JA. Diverse central projection patterns of retinal ganglion cells. Cell Reports. 2017;18:2058–2072. doi: 10.1016/j.celrep.2017.01.075. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Masland RH, Martin PR. The unsolved mystery of vision. Current Biology. 2007;17:R577–R582. doi: 10.1016/j.cub.2007.05.040. [DOI] [PubMed] [Google Scholar]
  63. McIntosh LT, Maheswaranathan N, Nayebi A, Ganguli S, Baccus SA. Deep learning models of the retinal response to natural scenes. Neural Information Processing Systems; 2016. pp. 1–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Meyer AF, O’Keefe J, Poort J. Two distinct types of eye-head coupling in freely moving mice. Current Biology. 2020;30:2116–2130. doi: 10.1016/j.cub.2020.04.042. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Michael CR. Receptive fields of single optic nerve fibers in a mammal with an all-cone retina. I: contrast-sensitive units. Journal of Neurophysiology. 1968;31:249–256. doi: 10.1152/jn.1968.31.2.249. [DOI] [PubMed] [Google Scholar]
  66. Mouland JW, Pienaar A, Williams C, Watson AJ, Lucas RJ, Brown TM. Extensive cone-dependent spectral opponency within a discrete zone of the lateral geniculate nucleus supporting mouse color vision. Current Biology. 2021;31:3391–3400. doi: 10.1016/j.cub.2021.05.024. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Münch TA, da Silveira RA, Siegert S, Viney TJ, Awatramani GB, Roska B. Approach sensitivity in the retina processed by a multifunctional neural circuit. Nature Neuroscience. 2009;12:1308–1316. doi: 10.1038/nn.2389. [DOI] [PubMed] [Google Scholar]
  68. Niell CM, Stryker MP. Modulation of visual responses by behavioral state in mouse visual cortex. Neuron. 2010;65:472–479. doi: 10.1016/j.neuron.2010.01.033. [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Olah C, Mordvintsev A, Schubert L. Feature visualization. Distill. 2017;2:e0007. doi: 10.23915/distill.00007. [DOI] [Google Scholar]
  70. Piscopo DM, El-Danaf RN, Huberman AD, Niell CM. Diverse visual features encoded in mouse lateral geniculate nucleus. The Journal of Neuroscience. 2013;33:4642–4656. doi: 10.1523/JNEUROSCI.5187-12.2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Prechelt L. In: Neural Networks:Tricks of the Trade. Lecture Notes in Computer Science. Prechelt L, editor. Springer; 1998. Early stopping - but when? pp. 55–69. [DOI] [Google Scholar]
  72. Qiu Y, Zhao Z, Klindt D, Kautzky M, Szatko KP, Schaeffel F, Rifai K, Franke K, Busse L, Euler T. Natural environment statistics in the upper and lower visual field are reflected in mouse retinal specializations. Current Biology. 2021;31:3233–3247. doi: 10.1016/j.cub.2021.05.017. [DOI] [PubMed] [Google Scholar]
  73. Qiu Y, Klindt DA, Szatko KP, Gonschorek D, Hoefling L, Schubert T, Busse L, Bethge M, Euler T. Efficient coding of natural scenes improves neural system identification. PLOS Computational Biology. 2023;19:e1011037. doi: 10.1371/journal.pcbi.1011037. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Rheaume BA, Jereen A, Bolisetty M, Sajid MS, Yang Y, Renna K, Sun L, Robson P, Trakhtenberg EF. Single cell transcriptome profiling of retinal ganglion cells identifies cellular subtypes. Nature Communications. 2018;9:2759. doi: 10.1038/s41467-018-05134-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Román Rosón M, Bauer Y, Kotkat AH, Berens P, Euler T, Busse L. Mouse dLGN receives functional input from a diverse population of retinal ganglion cells with limited convergence. Neuron. 2019;102:462–476. doi: 10.1016/j.neuron.2019.01.040. [DOI] [PubMed] [Google Scholar]
  76. Roy S, Jun NY, Davis EL, Pearson J, Field GD. Inter-mosaic coordination of retinal receptive fields. Nature. 2021;592:409–413. doi: 10.1038/s41586-021-03317-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Schwartz O, Pillow JW, Rust NC, Simoncelli EP. Spike-triggered neural characterization. Journal of Vision. 2006;6:484–507. doi: 10.1167/6.4.13. [DOI] [PubMed] [Google Scholar]
  78. Schwartz GW, Swygart D. In: The Senses: A Comprehensive Reference. Schwartz GW, editor. Elsevier; 2020. Circuits for feature selectivity in the inner retina; pp. 275–292. [DOI] [Google Scholar]
  79. Schwartz GW. In: In Retinal Computation. Schwartz GW, editor. Academic Press; 2021. Chapter 16 - color processing; pp. 288–317. [DOI] [Google Scholar]
  80. Schwartz G, Ala-Laurila P. sa-labs-extension. ccde6eb. GitHub. 2024 https://github.com/Schwartz-AlaLaurila-Labs/sa-labs-extension
  81. Schwartz-AlaLaurila-Labs. sa-labs-extension. swh:1:rev:6ffbbc63bd3b1f558db2dd8bf604ae12983491e9. Software Heritage. 2024 https://archive.softwareheritage.org/swh:1:dir:aaf6ba71844a9da2a3433a5d9594e04b61f973c1;origin=https://github.com/Schwartz-AlaLaurila-Labs/sa-labs-extension;visit=swh:1:snp:5f2147554b856b650bf9b3e7b7c7048ad4bf6624;anchor=swh:1:rev:6ffbbc63bd3b1f558db2dd8bf604ae12983491e9
  82. Sharpee T, Rust NC, Bialek W. Analyzing neural responses to natural signals: maximally informative dimensions. Neural Computation. 2004;16:223–250. doi: 10.1162/089976604322742010. [DOI] [PubMed] [Google Scholar]
  83. Sharpee TO. Computational identification of receptive fields. Annual Review of Neuroscience. 2013;36:103–120. doi: 10.1146/annurev-neuro-062012-170253. [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Simoncelli EP, Olshausen BA. Natural image statistics and neural representation. Annual Review of Neuroscience. 2001;24:1193–1216. doi: 10.1146/annurev.neuro.24.1.1193. [DOI] [PubMed] [Google Scholar]
  85. Sonoda T, Okabe Y, Schmidt TM. Overlapping morphological and functional properties between M4 and M5 intrinsically photosensitive retinal ganglion cells. The Journal of Comparative Neurology. 2020;528:1028–1040. doi: 10.1002/cne.24806. [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Stabio ME, Sabbah S, Quattrochi LE, Ilardi MC, Fogerson PM, Leyrer ML, Kim MT, Kim I, Schiel M, Renna JM, Briggman KL, Berson DM. The M5 cell: a color-opponent intrinsically photosensitive retinal ganglion cell. Neuron. 2018;97:150–163. doi: 10.1016/j.neuron.2017.11.030. [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Sümbül U, Song S, McCulloch K, Becker M, Lin B, Sanes JR, Masland RH, Seung HS. A genetic and computational approach to structurally classify neuronal types. Nature Communications. 2014;5:3512. doi: 10.1038/ncomms4512. [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. Szatko KP, Korympidou MM, Ran Y, Berens P, Dalkara D, Schubert T, Euler T, Franke K. Neural circuits in the mouse retina support color vision in the upper visual field. Nature Communications. 2020;11:3481. doi: 10.1038/s41467-020-17113-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  89. Szél A, Röhlich P, Caffé AR, Juliusson B, Aguirre G, Van Veen T. Unique topographic separation of two spectral classes of cones in the mouse retina. The Journal of Comparative Neurology. 1992;325:327–342. doi: 10.1002/cne.903250302. [DOI] [PubMed] [Google Scholar]
  90. Tanaka H, Nayebi A, Maheswaranathan N, McIntosh LT, Baccus SA, Ganguli S. From deep learning to mechanistic understanding in neuroscience: the structure of retinal prediction. Proceedings of the 33rd Conference on Neural Information Processing Systems; 2019. pp. 1–11. [DOI] [PMC free article] [PubMed] [Google Scholar]
  91. Theis L, Berens P, Froudarakis E, Reimer J, Román Rosón M, Baden T, Euler T, Tolias AS, Bethge M. Benchmarking spike rate inference in population calcium imaging. Neuron. 2016;90:471–482. doi: 10.1016/j.neuron.2016.04.014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  92. Tien NW, Pearson JT, Heller CR, Demas J, Kerschensteiner D. Genetically identified suppressed-by-contrast retinal ganglion cells reliably signal self-generated visual stimuli. The Journal of Neuroscience. 2015;35:10815–10820. doi: 10.1523/JNEUROSCI.1521-15.2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  93. Tien NW, Kim T, Kerschensteiner D. Target-specific glycinergic transmission from VGluT3-expressing amacrine cells shapes suppressive contrast responses in the retina. Cell Reports. 2016;15:1369–1375. doi: 10.1016/j.celrep.2016.04.025. [DOI] [PMC free article] [PubMed] [Google Scholar]
  94. Tien NW, Vitale C, Badea TC, Kerschensteiner D. Layer-specific developmentally precise axon targeting of transient suppressed-by-contrast retinal ganglion cells. The Journal of Neuroscience. 2022;42:7213–7221. doi: 10.1523/JNEUROSCI.2332-21.2022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  95. Trapani F, Spampinato GLB, Yger P, Marre O. Differences in nonlinearities determine retinal cell types. Journal of Neurophysiology. 2023;130:706–718. doi: 10.1152/jn.00243.2022. [DOI] [PubMed] [Google Scholar]
  96. Turner MH, Sanchez Giraldo LG, Schwartz O, Rieke F. Stimulus- and goal-oriented frameworks for understanding natural vision. Nature Neuroscience. 2019;22:15–24. doi: 10.1038/s41593-018-0284-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  97. Ustyuzhaninov I, Burg MF, Cadena SA, Fu J, Muhammad T, Ponder K, Froudarakis E, Ding Z, Bethge M, Tolias AS, Ecker AS. Digital Twin Reveals Combinatorial Code of Non-Linear Computations in the Mouse Primary Visual Cortex. bioRxiv. 2024 doi: 10.1101/2022.02.10.479884. [DOI]
  98. Walker EY, Sinz FH, Cobos E, Muhammad T, Froudarakis E, Fahey PG, Ecker AS, Reimer J, Pitkow X, Tolias AS. Inception loops discover what excites neurons most using deep predictive models. Nature Neuroscience. 2019;22:2060–2065. doi: 10.1038/s41593-019-0517-x. [DOI] [PubMed] [Google Scholar]
  99. Wang F, Li E, De L, Wu Q, Zhang Y. OFF-transient alpha RGCs mediate looming triggered innate defensive response. Current Biology. 2021;31:2263–2273. doi: 10.1016/j.cub.2021.03.025. [DOI] [PubMed] [Google Scholar]
  100. Wei W, Elstrott J, Feller MB. Two-photon targeted recording of GFP-expressing neurons for light responses and live-cell imaging in the mouse retina. Nature Protocols. 2010;5:1347–1352. doi: 10.1038/nprot.2010.106. [DOI] [PMC free article] [PubMed] [Google Scholar]
  101. Willeke KF, Restivo K, Franke K, Nix AF, Cadena SA, Shinn T, Nealley C, Rodriguez G, Patel S, Ecker AS, Sinz FH, Tolias AS. Deep Learning-Driven Characterization of Single Cell Tuning in Primate Visual Area V4 Unveils Topological Organization. bioRxiv. 2023 doi: 10.1101/2023.05.12.540591. [DOI]
  102. Yamins DLK, Hong H, Cadieu CF, Solomon EA, Seibert D, DiCarlo JJ. Performance-optimized hierarchical models predict neural responses in higher visual cortex. PNAS. 2014;111:8619–8624. doi: 10.1073/pnas.1403112111. [DOI] [PMC free article] [PubMed] [Google Scholar]
  103. Yilmaz M, Meister M. Rapid innate defensive responses of mice to looming visual stimuli. Current Biology. 2013;23:2011–2015. doi: 10.1016/j.cub.2013.08.015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  104. Yin L, Smith RG, Sterling P, Brainard DH. Physiology and morphology of color-opponent ganglion cells in a retina expressing a dual gradient of S and M opsins. The Journal of Neuroscience. 2009;29:2706–2724. doi: 10.1523/JNEUROSCI.5471-08.2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  105. Zhang Y, Kim IJ, Sanes JR, Meister M. The most numerous ganglion cell type of the mouse retina is a selective feature detector. PNAS. 2012;109:E2391–E2398. doi: 10.1073/pnas.1211547109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  106. Zhao Z, Klindt DA, Maia Chagas A, Szatko KP, Rogerson L, Protti DA, Behrens C, Dalkara D, Schubert T, Bethge M, Franke K, Berens P, Ecker AS, Euler T. The temporal structure of the inner retina at a single glance. Scientific Reports. 2020;10:4399. doi: 10.1038/s41598-020-60214-z. [DOI] [PMC free article] [PubMed] [Google Scholar]

Editor's evaluation

Fred Rieke 1

This study presents a fundamental and very technically strong dataset of mouse ganglion cells responding to natural stimuli that include more natural chromatic properties. Fits of convolutional neural networks to experimental measurements highlighted a novel form of color opponency in suppressed-by-contrast ganglion cells. More generally, the work provides a compelling example of how modern experimental and computational tools can be used to generate and test hypotheses about sensory function under natural conditions.

Decision letter

Editor: Fred Rieke1
Reviewed by: Fred Rieke2, Stephen A Baccus3, Wei Wei4

Our editorial process produces two outputs: i) public reviews designed to be posted alongside the preprint for the benefit of readers; ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "A chromatic feature detector in the retina signals visual context changes" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Lois Smith as the Senior Editor.

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

All reviewers were impressed by the technical aspects of the work. Three key issues emerged in review and were emphasized in discussions among the reviewers. First, the MEI consists of a single linear visual feature to represent a neuron's spatial and temporal stimulus selectivity. Such a linear feature cannot capture many key nonlinear aspects of retinal signaling (e.g. direction selectivity among many others). This limitation needs to be clearer in the paper. It is, for example, important that a reader does not come away thinking that the paper provides a general approach to identifying stimulus selectivity. Second, the MEIs in some ways resemble more standard measures of receptive field properties and in others differ strongly. A more complete comparison is needed of MEIs vs. standard measures, as well as a discussion of why they might differ (which could be related to the first main point above). Third, the paper emphasizes the possible role of these cells in horizon detection, but the analyses that support this role are based on ground-to-sky transitions, and the horizon itself was excluded from the stimulus set. Hence it does not seem possible to evaluate the role of horizon detection. It is important that discussions of ethological function are more grounded in the analyses provided in the paper.

Reviewer #1 (Recommendations for the authors):

Line 170-171: Suggest modifying the sentence "This established that …" to be more objective. It's not clear that a correlation of 0.5 supports that sentence.

Line 173: Isn't the linearized CNN equivalent to a straight linear filter model? If so I would state that – it's a simpler description. If not, I would clarify how it is different.

Line 357-359: What is the indication that the warping is in the color opponent region?

Reviewer #2 (Recommendations for the authors):

1. Line 856. Why were clips restricted to a certain mean intensity range, and how did this influence the analysis? The concern is that allowing a more natural range of means would then violate the sky-ground transition specificity, as cells might fire in a broader range of conditions.

2. Figure S2. Add UV / Green labels to the two grayscale maps as in Figure 4a.

3. Figure 2A. Number of channels should be added to the figure.

4. Figure 2A. 'Receptive fields' label is only indicated in the last layer but is defined by the whole network. This is confusing; perhaps, for the last layer, the label should be something more specifically relating to spatial localization?

5. Figure 2A. Is the Kronecker (or 'tensor') product symbol really intended? If so, please state so in the legend and clarify in methods how this relates to convolution implementation. If it is just generally meant to be convolution, a different symbol should be used and stated. In other words, it helps understanding to be formally correct with symbols, with more formalized mathematics or physics usage preferred to less regulated machine learning usage.

6. Figure 2B. How can so many cells be above the maximum set by the cell's response reliability?

7. Figure 2c. A 'linear model' is stated, but from the methods, it seems like an LN model (the final threshold is still present). If it really is a linear model, this is an inappropriate straw man (a threshold should be included) but if it is an LN model, that should be stated correctly.

8. Line 240 -241. Due to "the fixed contrast budget across channels", the contrast of most MEIs is shifted toward the UV. Please explain what is meant by the fixed contrast budget and its effects.

9. Figure 4G. The block structure of Figure 4G is hard to interpret, because it is not clear what the expectation is. For many of the cells, the MEI is not the most effective input, not even for the model. Attention is focused on the one cell that has the most distinctive response to its MEI, but is this just a random occurrence? Some guidance clarifying the interpretation of this confusion matrix would be helpful, or whether it is just interesting but we are not supposed to have any real interpretation of it.

10. Line 255. If only 2/3 of the cells have color opponent MEIs, are they really a cell class? Does this call into question the classification approach or the MEI approach?

11. Related to the difficulties of the representation analysis, it is not clear what 'sacrifice' occurs in favor of chromatic discrimination. If cell 28 did not exist, some chromatic discrimination would be lost, but other cell types would still be there to represent other features.

12. Figure 7d. the retinal cross-section image is confusing. Due to retinal warping, the ChAT bands droop below the IPL boundaries indicated, and therefore the arborization level seen in the maximal projection is misleading. It would be better to use the ChAT bands to correct for warping at each spatial location so that the dendritic arborization could be interpreted.

13. Line 609. This section relates to a previous claim, summarized as 'the dumber the animal the smarter the retina'. This is regarded as a myth, and experimental evidence does not support this (see Gollisch and Meister, 2010 discussion, point 2). It would be better for the field not to revive this idea.

14. Line 624. There is a misstatement about how this paper contributes to 'how' a computation is implemented, as there is no mechanistic information here about the circuitry.

15. Line 871. Please make clear that the two channels are chromatic, and a different word should be used than 'channel' because the layer has 16 channels by the usual meaning.

16. Line 874. More detail should be given about the Fourier parameterization.

17. The limitations of the slow frame rate should be more clearly acknowledged.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "A chromatic feature detector in the retina signals visual context changes" for further consideration by eLife. Your revised article has been evaluated by Lois Smith (Senior Editor), a Reviewing Editor and the three original reviewers.

All the reviewers agreed that the manuscript has improved in revision and appreciated the new data and analyses that were added. There are a few remaining issues that need to be addressed, as outlined in the reviews below. Most important is clarifying the limitations of the MEI analysis earlier in the paper. These are detailed in the individual reviews below.

Reviewer #1 (Recommendations for the authors):

This paper has improved with the new analyses and revisions added in response to reviews. These changes have clarified several issues, provided stronger evidence for others, and brought the results and text closer in alignment. A few issues have either emerged or been highlighted in the revision:

MEI and temporal tuning. The text at times could be read as saying that the MEIs reflect a better way to measure tuning properties. A concern about this terminology is the dependence of the MEI on the approach used to optimize responses (e.g. on the time window used in that approach). For example, the temporal frequencies that contribute to the time course of the MEI are quite low – likely reflecting the optimization procedure. One suggestion is to clarify in the section starting on line 204 that the MEIs reflect a combination of a cell's own properties and the optimization procedure (as you have already for the oscillations in MEIs for some cells).

Figure 6b, c: What do the contrasts beyond -1 or 1 mean?

Figure 6e, f: This cell does not look opponent, with both UV and green producing On responses. Is this the only cell this experiment was performed on? If not, were other cells more in alignment with the predictions from the model?

Figure 7i: The colors are hard to see and match to those in the rest of the figure. In addition, the gray bars and labels should be defined in the caption.

Reviewer #2 (Recommendations for the authors):

The paper has improved but still does not acknowledge its limitations sufficiently.

1. With respect to the difference between the MEI and the linear receptive field, there was no misunderstanding in the previous review, and the current manuscript still does not acknowledge the limitations of the MEI analysis. This point could have been made more explicit in the previous review, and so the following is an attempt to do so.

By analogy, in a topographic map of the world, the MEI is the location of Mt. Everest. The linear receptive field is a plane fitted to the world map along the direction of steepest ascent, additionally including an average slope. A Linear-Nonlinear model allows variation in height along that single direction. But a full understanding of the relationship between change in spatial position and height requires the full two-dimensional map.

In the retina, and in a CNN model of the retina, parallel rectified pathways create sensitivity to many more different directions in stimulus space, commonly referred to as features. Each separate interneuron with a threshold potentially creates a different feature or dimension, with firing rate and sensitivity varying in each point of the high-dimensional space spanned by the set of features.

Linear-Nonlinear models and MEIs only encode sensitivity to a single feature. However, multiple distinct features are required to produce a wide set of important phenomena, including nonlinear subunits that cause sensitivity to fine textures, direction selectivity for both light and dark objects (On-Off), object motion sensitivity, latency coding, the omitted stimulus response, responses to motion reversal, and pattern adaptation (reviewed in Gollisch and Meister, 2010). All of these properties rely on multiple distinct features and their interactions. Any analysis based solely on an MEI necessarily abandons consideration of how sensitivity along these different features combines to produce a computation. The new Figure 6c, which examines two stimulus directions near the MEI, is an improvement, but it is only two directions in one region of stimulus space.

There is substantial concern that readers will misinterpret the 'Most Exciting' input to mean the 'Most Important'. It would therefore serve the field for a clear statement to be made that analysis of MEIs alone (1) will not capture interactions of multiple nonlinear neural pathways, (2) therefore will not capture nonlinear interactions between multiple stimulus features, and (3) will consequently not explain the set of phenomena that rely on these neural pathways and stimulus features.

In response to the authors' rebuttal, to clarify the nonlinear analyses that one might do beyond the analysis of MEIs to identify multiple dimensions: these would include Spike-Triggered Covariance (Fairhall et al., 2006), Maximally Informative Dimensions (Sharpee et al., 2003), Maximum Noise Entropy (Globerson et al., 2009), Nonnegative Matrix Factorization (Liu et al., 2017), proximal algorithms (Maheswaranathan et al., 2018), and model reduction approaches to reduce the CNN model to fewer dimensions (Maheswaranathan et al., 2023), among others.

2. It should be made explicit for each analysis whether it is based on measured responses or stimuli presented to the model. In Figure 7, a proper signal detection analysis cannot be performed without accounting for noise on single trials. It is unclear whether analyzed responses are directly from data, or from stimuli presented to the model. For extrapolation to different speeds, these are clearly from new stimuli presented to the model. It should be stated that without accounting for measured noise, ROC analyses will overestimate detection of ground-sky transitions.

eLife. 2024 Oct 4;13:e86860. doi: 10.7554/eLife.86860.sa2

Author response


Essential revisions:

All reviewers were impressed by the technical aspects of the work. Three key issues emerged in review and were emphasized in discussions among the reviewers.

We thank editors and reviewers for their constructive input.

First, the MEI consists of a single linear visual feature to represent a neuron's spatial and temporal stimulus selectivity. Such a linear feature cannot capture many key nonlinear aspects of retinal signaling (e.g. direction selectivity among many others). This limitation needs to be clearer in the paper. It is, for example, important that a reader does not come away thinking that the paper provides a general approach to identifying stimulus selectivity.

Second, the MEIs in some ways resemble more standard measures of receptive field properties and in others differ strongly. A more complete comparison is needed of MEIs vs. standard measures, as well as a discussion of why they might differ (which could be related to the first main point above).

We think that there is a misunderstanding about the differences between the MEI approach and the more standard measures of receptive field (RF) properties the reviewers refer to. We apologize if this was not clear enough in the original manuscript.

In the following, we will first explain the differences between MEIs and linear RF estimates in more detail. We then show a new analysis (also included in the revised manuscript) that shows more clearly that for G28/tSbC cells (the chromatic feature detectors described in our paper) the MEI is not the same as the linear filter. In fact, the linear filter does not recover the chromatic selectivity reliably, and hence a nonlinear approach such as the MEIs is necessary to characterize G28/tSbC RGCs adequately.

The fundamental difference between more standard measures of receptive field (RF) properties and MEIs is the following: Classical approaches for estimating the RF typically aim at capturing a cell’s linear filter. The estimated linear filter is an approximation of how the cell processes incoming stimuli; RF estimates can therefore be used as linear filters, for example in Linear-Nonlinear (LN) models that predict cell responses to stimuli. The MEI approach, conversely, is an approach for estimating the optimal stimulus for a cell: an optimisation procedure is leveraged to maximize an objective function, which, in the simplest case, can be the model neuron response prediction. However, the objective function can be any function of the predicted response. Importantly, we would like to point out that MEIs do not simply approximate nonlinear stimulus selectivity by a linear filter, but instead are the image that leads to the peak in the nonlinear activation landscape, which is why they are well-suited for characterizing nonlinear systems, such as the retina. For a truly linear system, the two coincide, but not for the retina, because, as the reviewers rightly point out, the retina is nonlinear. Crucially, the MEI approach can also recover stimulus selectivity if a cell is selective for a combination of stimulus features, since the combination of features can be described as a point in high-dimensional stimulus space.
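
As an illustrative sketch of such an optimisation procedure (assuming a trained PyTorch model that maps a stimulus tensor of shape batch × colour × time × height × width to predicted responses; all names, shapes, and hyperparameters below are placeholders rather than the code used in the study):

```python
import torch

def optimize_mei(model, neuron_idx, stim_shape=(1, 2, 50, 18, 16),
                 n_steps=1000, lr=10.0, norm_budget=30.0):
    """Gradient-ascent search for a most exciting input (MEI).

    stim_shape: (batch, colour, time, height, width) -- placeholder values.
    norm_budget: a single contrast (norm) budget shared across both colour channels.
    """
    mei = torch.randn(stim_shape, requires_grad=True)   # random initialisation
    optimizer = torch.optim.SGD([mei], lr=lr)
    for _ in range(n_steps):
        optimizer.zero_grad()
        response = model(mei)[0, neuron_idx]            # predicted response of one model neuron
        (-response).backward()                          # ascend the response surface
        optimizer.step()
        with torch.no_grad():                           # project back onto the contrast budget
            norm = mei.norm()
            if norm > norm_budget:
                mei.mul_(norm_budget / norm)
    return mei.detach()
```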

To demonstrate the necessity of a non-linear approach (i.e., MEIs) for recovering the chromatic opponency of RGC type G28, we tested whether we could recover this feature using an LN model (i.e., a CNN model with no nonlinearities between convolutional layers, but with a final thresholding nonlinearity in place). Indeed, with the LN model, only 9 out of 36 G28 cells were identified as color-opponent (nonlinear model: 24 out of 36; see new Figure 6a). This result demonstrates that the preference for chromatic contrast of G28 cells does not result from a color opponent linear filter (i.e. negative weights for green, and positive weights for UV), which could also be captured by an LN-type model. Rather, there must be a nonlinear dependence between chromatic contrast (of the stimulus) and chromatic selectivity (of the cell).

To understand the nature of this dependence, we further expanded the analysis by estimating the model neurons’ tuning to color contrast around the maximum (the MEI). To this end, we mapped the model neurons’ response in 2D chromatic contrast space (new Figure 6b-d), revealing that, indeed, G28/tSbC RGCs have a nonlinear tuning for color contrast: their selectivity for color contrast changes depending on the location in the space of color contrast. We verified these model predictions by electrically recording from morphologically identified G28/tSbC cells (see new Figure 6e,f, also above).

In summary, in the revised manuscript, we (a) added a more detailed explanation of the difference between more conventional linear approaches for estimating stimulus selectivity on the one hand, and the MEI approach as a more general approach to identify nonlinear stimulus selectivity on the other hand. We also offered intuitions on how to interpret the MEIs, especially for those features that differ from linear RF measures (e.g. the temporal oscillations). We also (b) discussed the limitations of the MEI approach (see also replies to individual reviewer comments below). Further, we (c) added an analysis (and new electrophysiological data) of the chromatic contrast selectivity in tSbC RGC, demonstrating that this is a nonlinear feature, which could only be recovered using a nonlinear model.

Third, the paper emphasizes the possible role of these cells in horizon detection, but the analyses that support this role are based on ground-to-sky transitions, and the horizon itself was excluded from the stimulus set. Hence it does not seem possible to evaluate the role of horizon detection. It is important that discussions of ethological function are more grounded in the analyses provided in the paper.

We acknowledge that we were not clear enough about the proposed function for this cell type. What we suggest is that G28/tSbC cells contribute to detecting context changes which are marked by a change in color contrast, such as transitions across the horizon and specifically, from ground to sky. We do not think that the cells are detecting the horizon itself. We improved the revised manuscript in this respect.

To back up our hypothesis that G28/tSbC cells play a role in detecting context changes, we performed an additional simulation analysis. We sampled 1,000 stimuli from the original mouse movie (Qiu et al. Curr Biol 2021) by moving a window across a scene (Figure 7h), mimicking the kind of stimulus elicited by eye, head, and/or whole body movement. As before, each stimulus mimicked one of four transition types: ground->ground, sky->sky, sky->ground, and ground->sky. The former two transitions correspond to changes in visual input without context change, whereas the latter two reflect a context change. Importantly, the ground->sky and sky->ground transitions in these simulations contain the horizon.

We then simulated responses for all 32 RGC groups and performed an ROC analysis (analogous to the one already described in the manuscript) to determine how well the different RGC groups could distinguish ground->sky transitions from the other transitions.
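
A minimal sketch of such an ROC analysis, assuming per-stimulus response scores and transition labels are already available (`roc_auc_score` is the scikit-learn helper; variable names and the toy numbers are illustrative, not the study's data):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def transition_detection_auc(responses, transition_labels):
    """AUC for discriminating ground->sky transitions from all other transitions.

    responses: per-stimulus response scores of one RGC group (e.g. the response
               summed over a window after the transition).
    transition_labels: strings such as 'ground->sky', 'sky->ground',
               'ground->ground', 'sky->sky'.
    """
    is_target = np.asarray(transition_labels) == 'ground->sky'
    return roc_auc_score(is_target, responses)

# Toy example: a group whose responses are consistently larger for
# ground->sky transitions yields an AUC close to 1.
labels = ['ground->sky'] * 5 + ['sky->ground'] * 5 + ['ground->ground'] * 5
scores = np.concatenate([np.random.normal(3, 1, 5), np.random.normal(0, 1, 10)])
print(transition_detection_auc(scores, labels))
```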

To test robustness of detection performance across different speeds of transitions, we performed this analysis at different angular velocities (50, 150, 250, and 350°/s; see Figure 6-supplement 3a,b). Interestingly, while most slow ON RGC types improve their performance with increasing speeds, G28 performs robustly across all four speeds.

In the revised manuscript, we added this new analysis (cf. Figure 7h-j, Figure 6-supplement 3) and discuss the implications of these results in the Discussion.

Reviewer #1 (Recommendations for the authors):

Line 170-171: Suggest modifying the sentence "This established that …" to be more objective. It's not clear that a correlation of 0.5 supports that sentence.

We removed the sentence.

Line 173: Isn't the linearized CNN equivalent to a straight linear filter model? If so I would state that – it's a simpler description. If not, I would clarify how it is different.

The linearised CNN is equivalent to an LN model. We now state this in the manuscript and refer to this model as the “LN model”.

Line 357-359: What is the indication that the warping is in the color opponent region?

We removed the referenced section from the manuscript (see also above). However, it is visible from Figure 6b-d, and Figure 5-supplement 1 that the warping is in the color opponent region.

Reviewer #2 (Recommendations for the authors):

1. Line 856. Why were clips restricted to a certain mean intensity range, and how did this influence the analysis? The concern is that allowing a more natural range of means would then violate the sky-ground transition specificity, as cells might fire in a broader range of conditions.

In a pilot study with unrestricted mean intensity range, we found that strong differences in mean intensity between clips dominated RGC responses, so we restricted the mean intensity range. For our additional analysis (Figure 7h-j), however, we did not restrict the range of intensities, and still found G28/tSbC cells to be selective for ground-sky transitions.

2. Figure S2. Add UV / Green labels to the two grayscale maps as in Figure 4a.

Done.

3. Figure 2A. Number of channels should be added to the figure.

Done.

4. Figure 2A. 'Receptive fields' label is only indicated in the last layer but is defined by the whole network. This is confusing, perhaps for the last layer something should be labelled with something more specifically relating to spatial localization?

We changed the label to “spatial RF”.

5. Figure 2A. Is the Kronecker (or 'tensor') product symbol really intended? If so, please state so in the legend and clarify in methods how this relates to convolution implementation. If it is just generally meant to be convolution, a different symbol should be used and stated. In other words, it helps understanding to be formally correct with symbols, with more formalized mathematics or physics usage preferred to less regulated machine learning usage.

Thanks for pointing that out; we replaced the symbol by that for a convolution.

6. Figure 2B. How can so many cells be above the maximum set by the cell's response reliability?

This is because the test set correlation and the response reliability are calculated in different ways (see Methods) such that their absolute values are not directly comparable. We expect a correlation between the two metrics, which the panel is meant to illustrate, but not a direct correspondence in terms of absolute values.

7. Figure 2c. A 'linear model' is stated, but from the methods, it seems like an LN model (the final threshold is still present). If it really is a linear model, this is an inappropriate straw man (a threshold should be included) but if it is an LN model, that should be stated correctly.

We apologize for the confusion, as stated above, it is indeed a CNN model equivalent to an LN model. We now clearly state that in the paper and changed the label to “LN model”.

8. Line 240 -241. Due to "the fixed contrast budget across channels", the contrast of most MEIs is shifted toward the UV. Please explain what is meant by the fixed contrast budget and its effects.

This statement was slightly misleading; the underlying reason for the UV-shift in MEI contrast is the dominance of UV-sensitive S-opsin in the ventral retina, making stimuli in the UV channel much more effective at eliciting RGC responses. The sharing of the fixed contrast budget across channels (see also Franke et al. 2022, Nature) was the methodological feature that led to this biological fact being reflected in the MEIs. In other words, the MEIs generated like this are the optimal stimuli, conditioned on having a shared fixed contrast budget. The optimal stimuli conditioned on having separate contrast budgets for the two color channels are expected to look different, i.e. with more contrast in the green channel.
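
The difference between the two constraints can be illustrated with a small sketch (NumPy; channel ordering, shapes, and budget handling are assumptions for illustration only):

```python
import numpy as np

def project_shared_budget(stim, budget):
    """One norm budget for the whole stimulus (green + UV together): contrast
    can then be allocated freely, e.g. mostly to the more effective UV channel."""
    norm = np.linalg.norm(stim)
    return stim if norm <= budget else stim * (budget / norm)

def project_per_channel_budget(stim, budget):
    """Separate norm budget per colour channel (axis 0: green, UV): some
    contrast is forced to remain in the green channel."""
    out = stim.copy()
    for c in range(out.shape[0]):
        norm = np.linalg.norm(out[c])
        if norm > budget:
            out[c] *= budget / norm
    return out
```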

We updated the Results section accordingly.

9. Figure 4G. The block structure of Figure 4G is hard to interpret, because it is not clear what the expectation is. For many of the cells, the MEI is not the most effective input, not even for the model. Attention is focused on the one cell that has the most distinctive response to its MEI, but is this just a random occurrence? Some guidance clarifying the interpretation of this confusion matrix would be helpful, or whether it is just interesting but we are not supposed to have any real interpretation of it.

We have a paragraph in the manuscript (original as well as revised) dedicated to this panel. In brief, we wanted to illustrate that RGCs show a strong response to their own group’s MEI, a weaker response to the MEIs of functionally related groups, and no response to MEIs of groups with different response profiles (Figure 4g top). Conversely, RGC groups from opposing regions in response space showed no response to each other’s MEIs. The model’s predictions showed a similar pattern (Figure 4g, bottom), thereby validating the model’s ability to generalize to the MEI stimulus regime.

10. Line 255. If only 2/3 of the cells have color opponent MEIs, are they really a cell class? Does this call into question the classification approach or the MEI approach?

We think that this is due to the classifier we employed, which did not use chromatic selectivity for classification (but chirp and moving bar responses, as well as soma size and direction/orientation selectivity). It was trained on the original dataset (Baden et al., Nature 2016), where we decided to merge two clusters (38 and 39) into G28 (see Figure 2a there). This was done because the clusters only differed slightly in their direction and orientation selectivity index and we wanted to define functional RGC groups conservatively. Therefore, it is possible that G28 may include other cells with similar chirp and moving bar responses but no color opponency. Similarly, the fact that G27 – with very similar chirp and moving bar responses – contains some color-opponent cells may also be explained by the original classifier.

11. Related to the difficulties of the representation analysis, it is not clear what 'sacrifice' occurs in favor of chromatic discrimination. If cell 28 did not exist, some chromatic discrimination would be lost, but other cell types would still be there to represent other features.

Our argument here is in terms of resources spent on a cell that monitors a particular region of stimulus space, while not monitoring the rest of stimulus space. In other words, we agree with the reviewer: If G28/tSbC cells did not exist, chromatic discrimination would be lost – but its existence suggests that the resources spent on it are relevant. However, as discussed above, we removed the referenced section from the manuscript.

12. Figure 7d. the retinal cross-section image is confusing. Due to retinal warping, the ChAT bands droop below the IPL boundaries indicated, and therefore the arborization level seen in the maximal projection is misleading. It would be better to use the ChAT bands to correct for warping at each spatial location so that the dendritic arborization could be interpreted.

We replaced the image by one of a different tSbC cell, where the tissue displayed less warping.

13. Line 609. This section relates to a previous claim, summarized as 'the dumber the animal the smarter the retina'. This is regarded as a myth, and experimental evidence does not support this (see Gollisch and Meister, 2010 discussion, point 2). It would be better for the field not to revive this idea.

There may not be experimental data supporting the idea that a larger “cortical” computational capacity results in “simpler” retinal receptive fields, but we think that recent modeling results (Lindsay et al. ICLR 2019) are compelling enough to consider this idea. Nonetheless, we amended this part of the discussion.

14. Line 624. There is a misstatement about how this paper contributes to 'how' a computation is implemented, as there is no mechanistic information here about the circuitry.

Changed.

15. Line 871. Please make clear that the two channels are chromatic, and a different word should be used than 'channel' because the layer has 16 channels by the usual meaning.

Thanks for pointing this out – we changed it to “input channels (green and UV)”, since the two color channels are the input channels in our case.

16. Line 874. More detail should be given about the Fourier parameterization.

Thanks for pointing this out – we changed it to “input channels (green and UV)”, since the two color channels are the input channels in our case.

17. The limitations of the slow frame rate should be more clearly acknowledged.

We are not sure to which specific limitations the reviewer refers. In any case, despite the low frame rate, the Ca2+ responses can be related to the spike rate (e.g., Román Rosón et al. Neuron 2019; Trapani et al. J Neurophysiol 2023, see their Figure Suppl. 4). We added a sentence to the Methods section.

[Editors’ note: what follows is the authors’ response to the second round of review.]

All the reviewers agreed that the manuscript has improved in revision and appreciated the new data and analyses that were added. There are a few remaining issues that need to be addressed, as outlined in the reviews below. Most important is clarifying the limitations of the MEI analysis earlier in the paper. These are detailed in the individual reviews below.

Reviewer #1 (Recommendations for the authors):

This paper has improved with the new analyses and revisions added in response to reviews. These changes have clarified several issues, provided stronger evidence for others, and brought the results and text closer in alignment. A few issues have either emerged or been highlighted in the revision:

We thank the reviewer for appreciating our revisions.

MEI and temporal tuning. The text at times could be read as saying that the MEIs reflect a better way to measure tuning properties. A concern about this terminology is the dependence of the MEI on the approach used to optimize responses (e.g. on the time window used in that approach). For example, the temporal frequencies that contribute to the time course of the MEI are quite low – likely reflecting the optimization procedure. One suggestion is to clarify in the section starting on line 204 that the MEIs reflect a combination of a cell's own properties and the optimization procedure (as you have already for the oscillations in MEIs for some cells).

Thank you for pointing out that we came across as implying that MEIs are somehow “better” than other approaches. That is not what we wanted to imply. It is one method of many that can reveal nonlinear tuning properties and we used it to reveal a novel such tuning property. In the revised version of the manuscript, we have carefully rephrased all places where this impression might have been created.

Figure 6b, c: What do the contrasts beyond -1 or 1 mean?

For this analysis, we created visual stimuli in the two-dimensional subspace spanned by the MEI’s green and UV channels. In other words, we used the MEI’s green and UV channels as basis vectors and took linear combinations with different weights. These weights represent green and UV contrast, respectively (relative to the MEI’s contrast). We then computed the predicted response of the model to these stimuli, which is shown in the figure. Note that we flipped the sign of the green channel’s basis vector, such that a positive weight corresponds to an ON-type stimulus and a negative weight to an OFF-type stimulus. That means the MEI is at (–1, 1) in this plot.
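
To illustrate the construction, here is a minimal numpy sketch (illustrative only: the array shapes and the model.predict call are hypothetical stand-ins; the published code repository contains the actual implementation):

    import numpy as np

    # Hypothetical placeholders for the MEI's two colour channels (time x height x width);
    # in the actual analysis these come from the optimised MEI of a model neuron.
    mei_green = np.random.randn(30, 18, 16)
    mei_uv = np.random.randn(30, 18, 16)

    def make_subspace_stimulus(w_green, w_uv):
        # Linear combination of the MEI's colour channels. The green basis vector is
        # sign-flipped so that positive weights correspond to ON-type and negative
        # weights to OFF-type stimuli; the MEI itself then lies at (w_green, w_uv) = (-1, 1).
        green = w_green * (-mei_green)
        uv = w_uv * mei_uv
        return np.stack([green, uv], axis=0)  # channels x time x height x width

    # Sample the two-dimensional contrast plane; model.predict() stands in for
    # querying the trained CNN with each stimulus (hypothetical API).
    weights = np.linspace(-1, 1, 21)
    # responses = [[model.predict(make_subspace_stimulus(wg, wu)) for wu in weights]
    #              for wg in weights]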

In the previous visualisation, the axes were labelled to indicate the absolute contrast of the resulting stimuli, which could be outside the range (-1, 1), depending on the original MEI’s contrast. We now realise that contrasts outside the -1…1 range may be confusing. We have therefore changed the axis labels to indicate the multiplicative scalars, which hence range from -1 to 1. We clarify this in the legend and explain it in more detail in the methods section. In addition, we now use the same colour scheme for panels (b, c) and (e).

Note that, in the previous version of the manuscript, we chose a slightly different subspace (spanned by the UV component of the MEI both in the green and UV channel). For ease of interpretability, we now changed to the subspace described above. As a consequence, the panels in Figure 6 look slightly different.

Figure 6e, f: This cell does not look opponent, with both UV and green producing On responses. Is this the only cell this experiment was performed on? If not, were other cells more in alignment with the predictions from the model?

We thank the reviewer for this question and the opportunity to clarify. The colour opponency of this cell derives from a suppressive effect of green contrast, which can be best understood when comparing responses to a UV-ON, green-OFF stimulus (top left in panel (e); upper row, second from left in (f)) with the responses to a UV-ON, green-ON stimulus (top right in Figure 6 panel (e); upper row, far right in Figure 6. panel (f)).

In the manuscript, we previously wrote:

“This analysis revealed that, indeed, G28 RGCs have a nonlinear tuning for colour contrast: they are strongly UV-selective at lower contrasts, but become colour-opponent, i.e. additionally inhibited by green, for higher contrasts.”

We now add further explanation about the nature of the colour opponency:

“We confirmed the model’s predictions about G28’s nonlinear tuning for colour contrast using electrical recordings as described above. The example G28 (tSbC) cells shown in the figure exhibit similar nonlinear tuning in chromatic contrast space. The first example cell’s firing rate, and consequently its tuning curve, peak for UV ON-green OFF stimuli (top left in panel e; upper row, second from left in f) and are lower for UV ON-green ON stimuli (top right in panel e; upper row, far right in f), reflecting the suppressive effect of green contrast on the cell’s response.”

Figure 7i: The colors are hard to see and match to those in the rest of the figure. In addition, the gray bars and labels should be defined in the caption.

We changed that panel in Figure 7: instead of having all AUC curves in one plot, we now show the curves separately for the example RGC groups (vs. the curve of G28). We hope that this improves the readability of the panel.

Reviewer #2 (Recommendations for the authors):

The paper has improved but still does not acknowledge its limitations sufficiently.

We thank the reviewer for appreciating our revisions.

1. With respect to the difference between the MEI and the linear receptive field, there was no misunderstanding in the previous review, and the current manuscript still does not acknowledge the limitations of the MEI analysis. This point could have been made more explicit in the previous review, and so the following is an attempt to do so.

By analogy, in a topographic map of the world, the MEI is the location of Mt. Everest. The linear receptive field is a plane fit to the world map along the direction of steepest ascent, additionally including an average slope. A Linear-Nonlinear model allows variation in height along that single direction. But a full understanding of the relationship between change in spatial position and height requires the full two-dimensional map.

In the retina, and in a CNN model of the retina, parallel rectified pathways create sensitivity to many more different directions in stimulus space, commonly referred to as features. Each separate interneuron with a threshold potentially creates a different feature or dimension, with firing rate and sensitivity varying at each point of the high-dimensional space spanned by the set of features.

Linear-Nonlinear models and MEIs only encode sensitivity to a single feature. However, multiple distinct features are required to produce a wide set of important phenomena, including nonlinear subunits that cause sensitivity to fine textures, direction selectivity for both light and dark objects (On-Off), object motion sensitivity, latency coding, the omitted stimulus response, responses to motion reversal, and pattern adaptation (reviewed in Gollisch and Meister, 2010). All of these properties rely on multiple distinct features and their interactions. Any analysis based solely on an MEI necessarily abandons consideration of how sensitivity along these different features combines to produce a computation. The new Figure 6c, which examines two stimulus directions near the MEI, is an improvement, but it is only two directions in one region of stimulus space.
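
As a concrete illustration of this point, consider a minimal toy sketch (not part of the manuscript’s analyses; all values are made up): a model neuron combining two rectified features, for which reverse correlation yields a single linear receptive-field direction and gradient ascent under a norm budget yields an MEI, while neither summary captures how the two pathways interact elsewhere in stimulus space.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy neuron: two rectified "pathway" features in a two-dimensional stimulus space.
    w1, w2 = np.array([1.0, 0.5]), np.array([-0.3, 1.0])

    def response(s):
        return np.maximum(w1 @ s, 0.0) + np.maximum(w2 @ s, 0.0)

    # Linear receptive field: a single direction recovered by reverse correlation
    # from white-noise stimuli (response-weighted average stimulus).
    stims = rng.normal(size=(10000, 2))
    resps = np.array([response(s) for s in stims])
    linear_rf = (resps[:, None] * stims).mean(axis=0)

    # MEI: the single stimulus, here under a unit-norm budget, that maximises the
    # nonlinear response, found by projected gradient ascent with numerical gradients.
    s = rng.normal(size=2)
    s /= np.linalg.norm(s)
    for _ in range(200):
        eps = 1e-4
        grad = np.array([(response(s + eps * e) - response(s - eps * e)) / (2 * eps)
                         for e in np.eye(2)])
        s = s + 0.1 * grad
        s /= np.linalg.norm(s)  # project back onto the norm budget

    print("linear RF direction:", linear_rf / np.linalg.norm(linear_rf))
    print("MEI:", s)
    # Neither single direction describes how the two rectified pathways interact
    # across the rest of stimulus space.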

There is substantial concern that readers will misinterpret the 'Most Exciting' input to mean the 'Most Important'. It would therefore serve the field for a clear statement to be made that analysis of MEIs alone (1) will not capture interactions of multiple nonlinear neural pathways, (2) therefore will not capture nonlinear interactions between multiple stimulus features, and (3) will consequently not explain the set of phenomena that rely on these neural pathways and stimulus features.

In response to the authors’ rebuttal, to clarify the nonlinear analyses that one might do beyond the analysis of MEIs to identify multiple dimensions, these would include Spike-Triggered Covariance (Fairhall et al., 2006), Maximally Informative Dimensions (Sharpee et al., 2003), Maximum Noise Entropy (Globerson et al., 2009), Nonnegative Matrix Factorization (Liu et al., 2017), proximal algorithms (Maheswaranathan et al., 2018), and model reduction approaches to reduce the CNN model to fewer dimensions (Maheswaranathan et al., 2023), among others.
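
For orientation, here is a minimal sketch of the first of these approaches, spike-triggered covariance, on synthetic data (illustrative only; not an analysis from the manuscript):

    import numpy as np

    rng = np.random.default_rng(1)

    # Synthetic white-noise stimuli and spike counts from a toy cell driven by two
    # rectified features (so a single linear filter cannot capture its tuning).
    stims = rng.normal(size=(50000, 20))
    f1, f2 = rng.normal(size=20), rng.normal(size=20)
    rate = np.maximum(stims @ f1, 0.0) + np.maximum(stims @ f2, 0.0)
    spikes = rng.poisson(0.05 * rate)

    # Spike-triggered average: the single linear-receptive-field-like feature.
    sta = (spikes @ stims) / spikes.sum()

    # Spike-triggered covariance: eigenvectors of the difference between the
    # spike-weighted stimulus covariance and the raw stimulus covariance expose
    # additional stimulus dimensions the cell is sensitive to.
    centred = stims - sta
    stc = (spikes[:, None] * centred).T @ centred / spikes.sum()
    prior_cov = np.cov(stims, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(stc - prior_cov)

    # Eigenvectors with the largest-magnitude eigenvalues are candidate extra features.
    order = np.argsort(np.abs(eigvals))[::-1]
    candidate_features = eigvecs[:, order[:2]]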

We thank the reviewer for clarifying the nature of their concerns. We did not mean to imply that MEIs are the only approach, that they are somehow the best, or that they provide a complete picture of neuronal tuning. It is one method among many that can reveal nonlinear tuning properties, and we used it here to reveal one such novel tuning property. In the revised version of the manuscript, we have carefully rephrased all places where this impression might have been created. Specifically:

We rewrote parts of the Introduction and the Discussion to more clearly position the MEI approach as one of many potential analysis approaches, point out its limitations and discuss other approaches as well.

We added further guidance on how to interpret MEIs (as compared to Linear Receptive Fields) in the Results section.

2. It should be made explicit for each analysis whether it is based on measured responses or stimuli presented to the model. In Figure 7, a proper signal detection analysis cannot be performed without accounting for noise on single trials. It is unclear whether analyzed responses are directly from data, or from stimuli presented to the model. For extrapolation to different speeds, these are clearly from new stimuli presented to the model. It should be stated that without accounting for measured noise, ROC analyses will overestimate detection of ground-sky transitions.

Thank you for catching that this was not clear everywhere. We added labels to Figure 7 making explicit which results are based on measured responses and which are based on model predictions. This is also clearly stated in the text.

We now additionally state that, without accounting for measured noise, ROC analyses will overestimate detection of ground-sky transitions. Still, assuming roughly the same noise level across RGC groups, the ROC results based on model predictions can be considered an upper bound on the expected detection performance for each RGC group, which allows us to interpret relative performance across groups.
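
As a rough illustration of this point, the following sketch (with made-up response values, not data from the manuscript) shows how such an ROC analysis could be set up and how adding trial-to-trial noise reduces the resulting AUC:

    import numpy as np
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(2)

    # Hypothetical model-predicted responses of one RGC group: noise-free predictions
    # to ground->sky transitions versus all other transitions (values are made up).
    resp_ground_to_sky = rng.normal(loc=1.5, scale=1.0, size=200)
    resp_other = rng.normal(loc=0.0, scale=1.0, size=200)

    labels = np.concatenate([np.ones(resp_ground_to_sky.size), np.zeros(resp_other.size)])
    scores = np.concatenate([resp_ground_to_sky, resp_other])
    print(f"AUC without trial-to-trial noise: {roc_auc_score(labels, scores):.2f}")

    # Adding (hypothetical) trial-to-trial noise lowers the AUC, so the noise-free
    # value is best read as an upper bound on detection performance for this group.
    noisy = scores + rng.normal(scale=1.0, size=scores.size)
    print(f"AUC with added noise: {roc_auc_score(labels, noisy):.2f}")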

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Data Citations

    1. Höfling L, Szatko KP, Behrens C, Deng Y, Qiu Y, Klindt DA, Jessen Z, Schwartz GW, Bethge M, Berens P, Franke K, Ecker AS, Euler T. 2024. rgc-natstim-model. g-node.

    Supplementary Materials

    MDAR checklist

    Data Availability Statement

    The data and the movie stimulus are available at https://gin.g-node.org/eulerlab/rgc-natstim. Code for training models and reproducing analyses and figures is available at https://github.com/eulerlab/rgc-natstim-model (copy archived at Hoefling et al., 2024).

    The following dataset was generated:

    Höfling L, Szatko KP, Behrens C, Deng Y, Qiu Y, Klindt DA, Jessen Z, Schwartz GW, Bethge M, Berens P, Franke K, Ecker AS, Euler T. 2024. rgc-natstim-model. g-node.

