2024 Oct 4;13:e86860. doi: 10.7554/eLife.86860

Figure 2. Convolutional neural network (CNN) model captures diverse tuning of retinal ganglion cell (RGC) groups and predicts maximally exciting inputs (MEIs).

(a) Illustration of the CNN model and its output. The model takes natural movie clips as input (1), performs 3D convolutions with space-time separable filters (2) followed by a nonlinear activation function (ELU; 3) in two consecutive layers (2–4) within its core, and feeds the output of its core into a per-neuron readout. For each RGC, the readout convolves the feature maps with a learned RF modelled as a 2D Gaussian (5), and finally feeds a weighted sum of the resulting vector through a softplus nonlinearity (6) to yield the firing rate prediction for that RGC (7). Numbers indicate averaged single-trial test set correlation between predicted (red) and recorded (black) responses. (b) Test set correlation between model prediction and neural response (averaged across three repetitions) as a function of response reliability (see Methods) for N=3527 RGCs. Coloured dots correspond to example cells shown in Figure 1c–e. Dots in darker grey correspond to the N=1947 RGCs that passed the model test correlation and movie response quality criterion (see Methods and Figure 1—figure supplement 1). (c) Test set correlation (as in (b)) of CNN model vs. test set correlation of an LN model (for details, see Methods). Coloured dots correspond to means of RGC groups 1–32 (Baden et al., 2016). Dark and light grey dots as in (b). (d) Illustration of model-guided search for MEIs. The trained model captures neural tuning to stimulus features (far left; heat map illustrates landscape of neural tuning to stimulus features). Starting from a randomly initialised input (second from left; a 3D tensor in space and time; only one colour channel illustrated here), the model follows the gradient along the tuning surface (far left) to iteratively update the input until it arrives at the stimulus (bottom right) that maximises the model neuron’s activation within an optimisation time window (0.66 s, grey box, top right).
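The model-guided search in (d) can be illustrated with a minimal sketch. This is not the CNN described in (a): it assumes a hypothetical toy neuron with a single space-time separable filter, an ELU, and a softplus output, so that the gradient can be written by hand in NumPy. All function names, the filter shape, the gain and bias values, the step size, and the norm budget on the stimulus are assumptions made for illustration. The toy neuron also returns one scalar per stimulus, so the 0.66 s optimisation window is not modelled here.

```python
import numpy as np


def elu(z):
    """ELU nonlinearity for a scalar input."""
    return z if z > 0 else np.expm1(z)


def elu_grad(z):
    """Derivative of ELU with respect to its scalar input."""
    return 1.0 if z > 0 else np.exp(z)


def make_toy_kernel(n_frames=12, size=9):
    """Hypothetical space-time separable filter: Gaussian blob x temporal bump."""
    yy, xx = np.mgrid[:size, :size] - size // 2
    spatial = np.exp(-(xx**2 + yy**2) / (2 * 1.5**2))
    temporal = np.sin(np.linspace(0.0, np.pi, n_frames))
    kernel = temporal[:, None, None] * spatial[None, :, :]
    return kernel / np.linalg.norm(kernel)  # unit norm, shape (time, y, x)


def response(x, kernel, gain=2.0, bias=-1.0):
    """Toy firing rate: softplus(gain * ELU(<kernel, x>) + bias)."""
    z = float(np.sum(kernel * x))
    return float(np.logaddexp(0.0, gain * elu(z) + bias))


def response_grad(x, kernel, gain=2.0, bias=-1.0):
    """Gradient of the toy firing rate with respect to the stimulus x."""
    z = float(np.sum(kernel * x))
    u = gain * elu(z) + bias
    sigmoid = 1.0 / (1.0 + np.exp(-u))  # derivative of softplus at u
    return sigmoid * gain * elu_grad(z) * kernel


def find_mei(kernel, n_steps=200, lr=1.0, norm_budget=5.0, seed=1):
    """Gradient ascent on the stimulus, starting from random noise.

    The norm budget is an assumption of this sketch: without it the
    toy response could be increased indefinitely by scaling the input.
    """
    rng = np.random.default_rng(seed)
    x0 = rng.normal(scale=0.1, size=kernel.shape)  # random 3D start tensor
    x = x0.copy()
    for _ in range(n_steps):
        x = x + lr * response_grad(x, kernel)  # step uphill on the response
        norm = np.linalg.norm(x)
        if norm > norm_budget:  # project back onto the norm ball
            x = x * (norm_budget / norm)
    return x0, x


if __name__ == "__main__":
    kernel = make_toy_kernel()
    x0, mei = find_mei(kernel)
    print("response at random start:", response(x0, kernel))
    print("response at optimised input:", response(mei, kernel))
```

In this toy setting the optimised input simply converges towards the filter itself, because there is only one filter. With a multi-layer model the gradient would normally be obtained by automatic differentiation, and the resulting stimulus cannot be read off the weights directly.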