eLife. 2020 Oct 27;9:e55130. doi: 10.7554/eLife.55130

VIP interneurons in mouse primary visual cortex selectively enhance responses to weak but specific stimuli

Daniel J Millman 1, Gabriel Koch Ocker 1, Shiella Caldejon 1, India Kato 1, Josh D Larkin 1, Eric Kenji Lee 1, Jennifer Luviano 1, Chelsea Nayan 1, Thuyanh V Nguyen 1, Kat North 1, Sam Seid 1, Cassandra White 1, Jerome Lecoq 1, Clay Reid 1, Michael A Buice 1, Saskia EJ de Vries 1
Editors: Martin Vinck2, Kate M Wassum3
PMCID: PMC7591255  PMID: 33108272

Abstract

Vasoactive intestinal peptide-expressing (VIP) interneurons in the cortex regulate feedback inhibition of pyramidal neurons through suppression of somatostatin-expressing (SST) interneurons and, reciprocally, SST neurons inhibit VIP neurons. Although VIP neuron activity in the primary visual cortex (V1) of mouse is highly correlated with locomotion, the relevance of locomotion-related VIP neuron activity to visual coding is not known. Here we show that VIP neurons in mouse V1 respond strongly to low contrast front-to-back motion that is congruent with self-motion during locomotion but are suppressed by other directions and contrasts. VIP and SST neurons have complementary contrast tuning. Layer 2/3 contains a substantially larger population of low contrast preferring pyramidal neurons than deeper layers, and layer 2/3 (but not deeper layer) pyramidal neurons show bias for front-to-back motion specifically at low contrast. Network modeling indicates that VIP-SST mutual antagonism regulates the gain of the cortex to achieve sensitivity to specific weak stimuli without compromising network stability.

Research organism: Mouse

Introduction

Inhibitory interneurons play a major role in establishing the dynamics of cortical microcircuits (Roux and Buzsáki, 2015; Cardin, 2018). In the superficial layers of the cortex, vasoactive intestinal peptide-expressing (VIP) interneurons regulate feedback inhibition of pyramidal neurons through suppression of Martinotti-type somatostatin-expressing (SST) interneurons (Pfeffer et al., 2013). Through this disinhibitory mechanism, VIP interneurons are believed to modulate network dynamics based on the behavioral state of the animal; for instance, VIP neurons in mouse primary visual cortex (V1) are reliably active during periods of locomotion (Fu et al., 2014). Moreover, VIP neurons in V1 are a target of top-down inputs and mediate enhancement of local pyramidal cell activity in response to activation of those inputs (Zhang et al., 2014). Behaviorally, mouse V1 is necessary for the detection of low contrast visual stimuli (Glickfeld et al., 2013), and the optogenetic activation of VIP neurons in mouse V1 lowers contrast detection thresholds whereas the activation of SST or PV neurons raises it (Cone et al., 2019). This suggests that the perception of low contrast stimuli is strongly enhanced by VIP neuron activity in V1. Although the activity of VIP neurons has been shown to be suppressed below baseline in response to high contrast full-field grating stimuli of all tested spatial and temporal frequencies (de Vries et al., 2020), the responses of VIP neurons to low contrast visual stimuli are not known. To this end, we investigated the influence of stimulus contrast and locomotion on the visual responses of VIP, SST, and pyramidal neurons in mouse V1. SST neurons responded exclusively at high contrast whereas VIP neurons responded exclusively at low contrast with a strong preference for front-to-back motion that is congruent with self-motion during locomotion. 
As a population, layer 2/3 – but not deeper layer – pyramidal neurons responded more strongly at low contrast than high contrast and showed a slight, but significant, bias for front-to-back motion. Finally, we made novel extensions of stabilized supralinear network (SSN) models to incorporate the diversity of inhibitory interneuron types and used these models to demonstrate that VIP-driven disinhibition at low contrast can drive large increases in pyramidal neuron activity, despite the relatively low activity of both SST and pyramidal neurons in this contrast regime. The selective enhancement of front-to-back motion could increase the detection of obstacles approaching head-on during locomotion. Based on these results, we conclude that VIP neurons amplify responses of pyramidal neurons to weak but behaviorally-relevant stimuli.

Results and discussion

We recorded responses to full-field (approximately 120° x 90° of visual space) drifting gratings at eight directions (with a spatial frequency of 0.04 cpd and temporal frequency of 1 Hz) and six contrasts (5–80%) during calcium imaging of mouse Cre lines for Vip and Sst as well as pyramidal neurons across cortical layers (Cux2: layer 2/3; Rorb: layer 4; Rbp4: layer 5; Ntsr1: layer 6) transgenically expressing GCaMP6f (see Figure 1—source data 1 for numbers of neurons, sessions, and mice in the dataset). The four Cre lines used to image pyramidal neurons were chosen to limit GCaMP expression to neurons in a single layer such that fluorescence contamination from processes (e.g. axons or dendrites) of neurons with somata in different layers is minimized, while providing broad coverage of excitatory neuron types within the target layer (see Materials and methods: Experimental Animals). Although Cux2 is expressed in both layers 2/3 and 4, it was only imaged in layer 2/3. Vip mice were imaged in layer 2/3 where VIP neurons are most abundant (Tremblay et al., 2016), whereas Sst mice were imaged in layer 4 where SST neurons are most abundant. Notably, most, if not all, SST neurons in layer 4 of V1 are Martinotti cells (Scala et al., 2019).

Figure 1 shows fluorescence traces for four example neurons, of the key Cre lines, as well as stepwise transformations to ‘events’ in the fluorescence traces and, finally, stimulus-response magnitudes and tuning curves. Events in the fluorescence trace for each neuron were detected using a changepoint detection algorithm with an L0-regularization penalty (de Vries et al., 2020; Jewell and Witten, 2018; Jewell et al., 2019). The result is a time series of event onset times and magnitudes proportional to the change in GCaMP fluorescence; individual events likely do not correspond to single action potentials but have a bias toward bursts (Ledochowitsch et al., 2019; Huang et al., 2019). The response for each trial was computed as the mean event magnitude per second and averaged across trials for each condition.
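The trial-response computation described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes that "mean event magnitude per second" means the summed magnitude of events falling inside the stimulus window divided by the window length, and it uses the 2 s presentation window stated for the gratings. The function names and the `trials` tuple layout are hypothetical.

```python
from collections import defaultdict

def trial_response(event_times, event_magnitudes, onset, duration=2.0):
    """Summed event magnitude inside [onset, onset + duration), per second.

    Assumption: 'mean event magnitude per second' is the total magnitude of
    detected events in the stimulus window divided by the window length.
    """
    total = sum(m for t, m in zip(event_times, event_magnitudes)
                if onset <= t < onset + duration)
    return total / duration

def condition_means(event_times, event_magnitudes, trials, duration=2.0):
    """Average trial responses per stimulus condition.

    trials: list of (onset_seconds, direction_deg, contrast_percent) tuples
    (a hypothetical layout chosen for this sketch).
    Returns a dict mapping (direction, contrast) to the mean response.
    """
    by_condition = defaultdict(list)
    for onset, direction, contrast in trials:
        by_condition[(direction, contrast)].append(
            trial_response(event_times, event_magnitudes, onset, duration))
    return {cond: sum(vals) / len(vals) for cond, vals in by_condition.items()}

# Toy usage: three events, two trials of one condition and one of another.
means = condition_means(
    event_times=[0.5, 1.0, 5.2],
    event_magnitudes=[1.0, 3.0, 2.0],
    trials=[(0.0, 0, 10), (4.0, 0, 10), (8.0, 90, 80)],
)
```

Tuning curves such as those in Figure 1 would then be slices of this dictionary at the peak direction or peak contrast.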

Figure 1. Single neurons are tuned for stimulus direction and contrast.

Figure 1.

(a) A single VIP interneuron recorded in layer 2/3 of a Vip mouse responds to low contrast with a preference for motion with a direction of 0 degrees (front-to-back). Top: In blue, 20 s of the dF/F trace for this neuron and, in black, the corresponding events extracted from the dF/F trace. Left: Event rasters for each contrast at the peak direction (0 degrees), each direction at the peak contrast (10%), and blank (i.e. 0% contrast) trials. Middle: Contrast tuning curve at the peak direction and direction tuning curve at the peak contrast; mean ± SEM. Right: Heatmap shows the mean response for all stimulus contrasts and directions. (b) Same as a, for a single SST neuron recorded in layer 4 of an Sst mouse. This neuron is tuned for high contrast with a preference for motion with a direction of ±90 degrees (up/down). (c) Same as a, for a pyramidal neuron recorded in layer 2/3 of a Cux2 mouse. This neuron is tuned for low contrast with a preference for motion with a direction of -45 degrees. (d) Same as a, for a pyramidal neuron recorded in layer 4 of a Rorb mouse. This neuron is tuned for high contrast with a preference for motion with a direction of 180 degrees (back-to-front).

Figure 1—source data 1. The total number of cells, experimental sessions, and mice per Cre line.

We observed direction- or orientation-tuned neurons that responded preferentially either to high contrast gratings or low contrast gratings. The majority of neurons were responsive to the stimulus set (Figure 2a), measured as a statistically significant bias in responses depending on grating contrast and direction (bootstrapped χ² test, p<0.01; see Materials and methods). Substantial differences in contrast and direction tuning were apparent across Cre lines (Figure 2b–h). Virtually all VIP neurons responded only at low (<20%) contrast to front-to-back motion (0 degrees; nasal-to-temporal) or an adjacent direction (Figure 2b), yielding the greatest direction bias among Cre lines as quantified by the vector sum of direction preferences (Figure 2c). The direction of bias was consistent across all Vip mice (n = 6 sessions, 3 mice; Figure 2—figure supplement 1A) and did not result from stimulus direction-selective running behavior (Figure 2—figure supplement 1B). High contrast gratings of all directions significantly suppressed activity in a substantial fraction of VIP neurons whereas such suppression was rare in other Cre lines (Figure 2d; Figure 2—figure supplement 2). SST neurons had high contrast selectivity, weak direction and orientation selectivity, and varied direction preference (Figure 2b,e,g,h), resulting in an average population response that was strong at high contrast across all directions, complementing the non-direction selective suppression at high contrast observed in VIP neurons. Unlike inhibitory interneurons, pyramidal neurons exhibited substantial direction and orientation selectivity and tiled all eight possible direction preferences (Figure 2b,g,h).
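As an illustration of the direction-bias measure, the vector sum of direction preferences can be sketched as below. The normalization by the number of cells (so that the magnitude reads as a fraction between 0 and 1) is an assumption made here for readability; the function name is hypothetical.

```python
import math

def direction_vector_sum(preferred_directions_deg):
    """Vector sum of unit vectors pointing at each cell's preferred direction.

    Returns (magnitude, angle_deg). The magnitude is divided by the number of
    cells (an assumption of this sketch), so 0 means no population bias and 1
    means every cell prefers the same direction.
    """
    n = len(preferred_directions_deg)
    x = sum(math.cos(math.radians(d)) for d in preferred_directions_deg)
    y = sum(math.sin(math.radians(d)) for d in preferred_directions_deg)
    magnitude = math.hypot(x, y) / n
    angle = math.degrees(math.atan2(y, x)) % 360.0
    return magnitude, angle

# A population tiling all eight directions has essentially no bias,
# whereas a population dominated by 0 degrees has a strong bias toward 0.
tiled = direction_vector_sum([0, 45, 90, 135, 180, 225, 270, 315])
biased = direction_vector_sum([0, 0, 0, 90])
```

Under this sketch, a population in which nearly all cells prefer 0 degrees would give a magnitude near 1, and one that tiles the eight directions evenly would give a magnitude near 0.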

Figure 2. Contrast and direction preferences are cell-type and layer specific.

(a) The fraction of imaged cells that were significantly responsive to the gratings stimulus (bootstrapped χ² test, p<0.01). (b) Waterfall plots showing the response significance at each contrast and direction of all responsive cells (χ² test; p < 0.01) from mice of each Cre line. Each row is one neuron and neurons are ordered by direction preference at the cell’s peak contrast. The responses to each stimulus condition are normalized per neuron to be $R_N = (R_{d,c} - R_b)/(\bar{R}_{d,c} + R_b)$, where $R_N$ is the normalized response, $R_{d,c}$ is the mean response to a grating with direction d and contrast c, $R_b$ is the mean blank (0% contrast) response, and $\bar{R}_{d,c}$ is the mean response to gratings across all directions and contrasts. (c) Radial plot of the average direction preference of cells of each Cre line at each contrast. Arrows are the vector sum of all responsive cells at a given contrast. Gray shaded region indicates a 95% confidence interval of the vector sum for a population with uniformly-distributed direction preferences, multiple comparisons corrected for the six contrasts. Scale: The distance between each pair of concentric dashed rings is 25%. N: Nasal, T: Temporal, U: Up, D: Down. (d) Fraction of all cells of each Cre line that are suppressed by contrast. The mean response to all grating directions at 80% contrast must be significantly below the mean blank response (bootstrapped distribution of mean response differences; family-wise type 1 error < 0.05; see Materials and methods). (e) Distribution of contrast response types by Cre line determined by fitting of rising sigmoid (high contrast preferring), falling sigmoid (low contrast preferring), or the product of rising and falling sigmoids (intermediate contrast preferring; not shown due to a very small percentage of neurons tuned for intermediate contrasts). 
P-values are shown for pairwise comparisons of the fraction of high contrast preferring pyramidal neurons in each layer (bootstrap test of difference of sample proportions). See Materials and methods. (f) Cumulative distribution of contrast preferences (center-of-mass of a cell’s contrast response function; CoM) across Cre lines. (g) Cumulative distribution of global orientation selectivity indices (gOSI) across Cre lines. (h) Cumulative distribution of direction selectivity indices across Cre lines.
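The per-neuron normalization used for the waterfall plots in panel (b) can be sketched as follows, assuming the numerator is the blank-subtracted response, i.e. the normalized value is (condition mean minus blank mean) divided by (grand mean over all gratings plus blank mean). The function name and dictionary layout are hypothetical.

```python
def normalize_responses(responses, blank):
    """Normalize one neuron's condition-mean responses.

    responses: dict mapping (direction, contrast) -> mean response
    blank:     mean response on blank (0% contrast) trials
    Assumed form: (R_dc - R_b) / (mean over all gratings + R_b).
    """
    grand_mean = sum(responses.values()) / len(responses)
    denom = grand_mean + blank
    if denom == 0:
        raise ValueError("neuron has no activity; normalization undefined")
    return {cond: (r - blank) / denom for cond, r in responses.items()}

# Toy usage: a neuron driven at 5% contrast and at blank level at 80%.
normalized = normalize_responses({(0, 5): 3.0, (0, 80): 1.0}, blank=1.0)
```

With this form, a condition that evokes exactly the blank response maps to 0, and responses below blank map to negative values.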

Figure 2.

Figure 2—figure supplement 1. The direction of VIP neuron bias was consistent across mice and did not result from stimulus direction-selective running behavior.

Figure 2—figure supplement 1.

(a) Vector sums for each of the six Vip-Cre experiments. N: Nasal, T: Temporal, U: Up, D: Down. (b) Performance of a linear support vector classifier trained to decode the direction of grating (1-of-8 classification) from the running speed of the mouse. The average validation performance for three-fold cross-validation is shown. Each dot is the performance for one experiment; bars are the mean across experiments of a given Cre line.

Figure 2—figure supplement 2. VIP neurons have evoked responses to low contrast gratings but response suppression to high contrast gratings.

Figure 2—figure supplement 2.

Average fluorescence responses of all VIP neurons to low (left) and high (right) contrast gratings are shown at the neuron’s preferred direction (n=63 neurons; mean ± SEM). Gray shading indicates the stimulus presentation window of 2 s; time is relative to stimulus onset.

We statistically validated the contrast tuning of neurons with a model selection procedure (see Materials and methods: Contrast response function fitting and model comparison) comparing low contrast preference, high contrast preference, and intermediate contrast preference. This analysis confirmed that nearly all VIP neurons were low contrast-preferring and nearly all SST neurons were high contrast-preferring (Figure 2e). Contrast preference among pyramidal neurons systematically varied across cortical layers, exhibiting a progression from a mixture of low and high contrast-preferring neurons in layer 2/3 to almost exclusively high contrast-preferring neurons in layers 5 and 6. Like VIP neurons, pyramidal neurons in layer 2/3 showed direction bias toward front-to-back motion at 5% and 10% contrast but not at higher contrasts (Figure 2c); pyramidal neurons in deeper layers did not have direction bias. Taken together, concerted changes in response magnitude near 20% contrast across all Cre lines and layers indicate the presence of a phase transition in cortical dynamics between a low contrast regime exemplified by relatively inactive SST neurons and a high contrast regime exemplified by highly active SST neurons.
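To make the idea of the model comparison concrete, a simplified sketch is given below. It is not the procedure from Materials and methods: it swaps in a coarse grid search with closed-form amplitude and offset fitting, and compares the three shapes (rising sigmoid, falling sigmoid, and their product) by AIC. The contrast values, the grids, the shared slope for the product model, and all function names are assumptions of this sketch.

```python
import math

CONTRASTS = [5, 10, 20, 40, 60, 80]            # percent; assumed six values in 5-80%
C50_GRID = [5, 7, 10, 14, 20, 28, 40, 57, 80]  # hypothetical midpoint grid
SLOPE_GRID = [0.1, 0.3, 1.0]                   # hypothetical slope grid (log-contrast units)

def rising(c, c50, slope):
    """Rising sigmoid of log contrast (high contrast preferring)."""
    return 1.0 / (1.0 + math.exp(-(math.log(c) - math.log(c50)) / slope))

def falling(c, c50, slope):
    """Falling sigmoid of log contrast (low contrast preferring)."""
    return 1.0 - rising(c, c50, slope)

def band(c, lo, hi, slope):
    """Product of rising and falling sigmoids (intermediate preferring)."""
    return rising(c, lo, slope) * falling(c, hi, slope)

def affine_sse(shape, y):
    """Least-squares fit y ~ a*shape + b with a >= 0; return the squared error."""
    n = len(y)
    ms, my = sum(shape) / n, sum(y) / n
    var = sum((s - ms) ** 2 for s in shape)
    cov = sum((s - ms) * (v - my) for s, v in zip(shape, y))
    a = max(cov / var, 0.0) if var > 0 else 0.0
    b = my - a * ms
    return sum((v - (a * s + b)) ** 2 for s, v in zip(shape, y))

def classify_contrast_tuning(y, contrasts=CONTRASTS):
    """Return 'high', 'low', or 'intermediate' for a contrast response curve."""
    n = len(y)
    sse = {"high": math.inf, "low": math.inf, "intermediate": math.inf}
    for slope in SLOPE_GRID:
        for c50 in C50_GRID:
            sse["high"] = min(sse["high"], affine_sse(
                [rising(c, c50, slope) for c in contrasts], y))
            sse["low"] = min(sse["low"], affine_sse(
                [falling(c, c50, slope) for c in contrasts], y))
            for hi in C50_GRID:
                if hi > c50:
                    sse["intermediate"] = min(sse["intermediate"], affine_sse(
                        [band(c, c50, hi, slope) for c in contrasts], y))
    n_params = {"high": 4, "low": 4, "intermediate": 5}
    # Small constant keeps the logarithm finite when a fit is exact.
    aic = {k: n * math.log(sse[k] / n + 1e-12) + 2 * n_params[k] for k in sse}
    return min(aic, key=aic.get)
```

The extra parameter of the product model is penalized by the AIC term, so a neuron is labeled intermediate only when the band-pass shape fits clearly better than either monotonic sigmoid.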

A previous survey of transcriptomic neuron types using single-cell RNA sequencing identified 16 VIP neuron subtypes, 21 SST neuron subtypes, 3 excitatory neuron subtypes in layer 2/3, 1 excitatory type in layer 4, 12 excitatory types in layer 5, and 17 excitatory types in layer 6 (Tasic et al., 2018). That study also investigated the transcriptomic neuron types labeled by the Cre lines used in the present study (see Extended Data Figure 8 of Tasic et al., 2018). The Vip and Sst Cre lines label all transcriptomic subtypes of VIP and SST neurons, respectively, suggesting that common subtypes of SST neurons, such as Martinotti-type SST neurons, are all high contrast-preferring and common subtypes of VIP neurons are all low contrast-preferring. Furthermore, the Cux2 and Rorb Cre lines label all transcriptomic excitatory neuron types in layers 2/3 and 4, respectively. Our finding of substantial populations of both high contrast-preferring and low contrast-preferring neurons in layer 4, where there is only a single transcriptomic excitatory neuron type, demonstrates that neurons of the same transcriptomic subtype can differ in contrast preference. In other layers, whether all neurons of a particular transcriptomic type have the same contrast tuning and, conversely, all neurons with the same contrast tuning correspond to the same transcriptomic type, are important open questions.

Studies of stimulus tuning in the visual system have long reported (Levick, 1967; Rodieck, 1967) a small but consistent fraction (1–5%) of neurons, termed ‘suppressed-by-contrast’ (SbC) neurons, that exhibit firing rate suppression in response to all stimuli presented, which typically comprised high contrast gratings. Consistent with a recent report (de Vries et al., 2020), these results identify VIP neurons as a major source of SbC neurons in V1. Surprisingly, we observe that not only are these SbC neurons not suppressed at low contrast but that they exhibit robust visual responses to front-to-back motion in such conditions. This contributes new information to our understanding of SbC neurons in the visual circuit. The finding that VIP neurons are suppressed below baseline in response to high contrast gratings, rather than suppressed to baseline, might be due to the high spontaneous activity of VIP neurons that is available to be suppressed compared to the other neuron types measured here (see Figure 3 as well as Extended Data Figure 1 of de Vries et al., 2020). Our measurements of contrast tuning suggest that the high spontaneous activity of VIP neurons enables the cortical circuit to raise or lower the amount of disinhibition of pyramidal neurons depending on stimulus contrast.

To assess the circuit-wide effects of locomotion on cortical dynamics, we examined the average activity of each neuron population as a whole. We focused here on the responses at low contrast in layers 2/3 and 4, but not layers 5 and 6 which did not respond at low contrast. Pyramidal neurons in layers 2/3 and 4, as well as VIP and SST interneurons, had increased activity during stimulus presentations when the mouse was running compared with stimulus presentations when the mouse was stationary (Figure 3; Figure 3—figure supplement 1). During locomotion, the low contrast and front-to-back direction selectivity that was common to nearly all VIP neurons resulted in an average VIP population response that had tuning closely resembling the tuning of any individual VIP neuron (Figure 3, first column). By comparison, the VIP population only weakly responded to front-to-back motion at low contrast when the mice were stationary and did not respond to gratings of any other direction or contrast. Running also increased the SST population response to high contrast gratings, which also had the highest average response to front-to-back motion but responded strongly as a population to other directions as well (Figure 3, second column). The pyramidal population in layer 2/3 (CUX2) responded broadly across directions but more strongly at low than high contrast (Figure 3, third column), whereas the pyramidal population in layer 4 (RORB) had comparable response magnitude and running enhancement across contrasts (Figure 3, fourth column). This analysis demonstrates a substantial enhancement of responses to low contrast visual stimuli during locomotion that is specific to layer 2/3 pyramidal neurons and VIP neurons.

Figure 3. Average population responses of inhibitory, but not excitatory, cells are strongly biased toward front-to-back visual motion which is enhanced during locomotion.

(a) Mean blank-subtracted event magnitude (a.u.; extracted events derived from dF/F trace) of all neurons from mice of each superficial Cre line during stationary periods. Gray boxes in Rorb plots indicate insufficient run and stationary data. (b) Same as a, for running periods. (c) Mean population contrast response tuning at peak direction during stationary (faint lines) and running (bold lines) periods. (d) Mean population direction response tuning at low (5-10%) contrast. Insets: mean population direction response tuning at high (60-80%) contrast. (e) Mean single-neuron direction tuning (i.e. aligned to each neuron’s peak direction). Insets: mean single-neuron direction tuning at high (60-80%) contrast. All error bars are SEM. Sample size indicates number of neurons with number of experiments in parentheses.

Figure 3.

Figure 3—figure supplement 1. Distributions of single neuron response magnitudes across stimulus conditions for key Cre lines.

Figure 3—figure supplement 1.

(a) The distribution of responses of all VIP neurons in layer 2/3 of Vip mice by grating direction and contrast. Each dot is one neuron. Box plots show lower quartile, median (red bar), and upper quartile. (b) Same as a but for all SST neurons in layer 4 of Sst mice. (c) Same as a but for all pyramidal neurons in layer 2/3 of Cux2 mice. (d) Same as a but for all pyramidal neurons in layer 4 of Rorb mice. (e) Same as a but shown for stimulus direction relative to the cell’s peak direction (i.e. stimulus direction – peak direction). (f) Same as e but for all SST neurons in layer 4 of Sst mice. (g) Same as e but for all pyramidal neurons in layer 2/3 of Cux2 mice. (h) Same as e but for all pyramidal neurons in layer 4 of Rorb mice.

We built a Generalized Linear Model of VIP, SST, and layer 2/3 pyramidal neuron responses to investigate the contribution of stimulus contrast, stimulus direction, locomotion, and the interactions between these terms to the average activity of each neuron population using a Poisson model to predict responses (Figure 4a). To identify only the terms that significantly contribute to activity, we included an L1-regularization penalty in the cost function which resulted in relatively few non-zero terms (12–15 non-zero out of 126 total terms). VIP neurons had the highest weights for blank sweep, low contrasts (5–20%), running, directions of 0° and 180°, running by direction interactions at 0° and 45°, and direction by contrast interactions at ±45° and low contrasts (Figure 4b). SST neurons had the highest weights for high contrasts (40–80%), direction of 0°, and running by direction interactions at all directions (Figure 4c). Layer 2/3 pyramidal neurons have significant weights only for running, low contrasts (5–20%), and all directions (Figure 4d). Overall, this analysis confirms the influence of running, stimulus direction, and stimulus contrast but suggests that interactions among these variables are limited.
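The structure of the 126-term model can be sketched as follows. This illustrates only how the terms are enumerated and how a prediction is formed (exponential of the summed active weights); it does not fit the model or apply the L1 penalty. The specific direction and contrast values, the treatment of the blank term as an indicator for blank sweeps, and all names are assumptions of this sketch.

```python
import math
from itertools import product

DIRECTIONS = [0, 45, 90, 135, 180, 225, 270, 315]  # degrees; assumed labeling
CONTRASTS = [5, 10, 20, 40, 60, 80]                # percent; assumed six values

def term_names():
    """Enumerate all model terms: 1 + 1 + 8 + 6 + 8 + 6 + 48 + 48 = 126."""
    names = ["blank", "run"]
    names += [f"dir{d}" for d in DIRECTIONS]
    names += [f"con{c}" for c in CONTRASTS]
    names += [f"run*dir{d}" for d in DIRECTIONS]
    names += [f"run*con{c}" for c in CONTRASTS]
    names += [f"dir{d}*con{c}" for d, c in product(DIRECTIONS, CONTRASTS)]
    names += [f"run*dir{d}*con{c}" for d, c in product(DIRECTIONS, CONTRASTS)]
    return names

def active_terms(running, direction=None, contrast=None):
    """Terms equal to 1 for a given trial; all other terms are 0."""
    if direction is None or contrast is None:      # blank sweep (assumed indicator)
        return ["blank"] + (["run"] if running else [])
    terms = [f"dir{direction}", f"con{contrast}", f"dir{direction}*con{contrast}"]
    if running:
        terms += ["run", f"run*dir{direction}", f"run*con{contrast}",
                  f"run*dir{direction}*con{contrast}"]
    return terms

def predict(weights, running, direction=None, contrast=None):
    """Poisson-GLM mean: exponential of the sum of active weights.

    weights: sparse dict of term name -> weight; missing terms count as zero,
    mirroring the sparsity produced by an L1 penalty.
    """
    return math.exp(sum(weights.get(t, 0.0)
                        for t in active_terms(running, direction, contrast)))

# Toy usage with made-up weights (not values from the paper).
toy_weights = {"con5": 0.7, "dir0": 0.5, "run": 0.3}
rate_running = predict(toy_weights, running=True, direction=0, contrast=5)
rate_stationary = predict(toy_weights, running=False, direction=0, contrast=5)
```

Because the weights add inside the exponential, main effects combine multiplicatively in the predicted response, and an interaction term is needed only when the joint effect departs from that product.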

Figure 4. Generalized Linear Models reveal the contribution of stimulus direction, stimulus contrast, locomotion, and the interactions between these terms, to the activity of neuronal populations.

Figure 4.

(a) Schematic of the Poisson GLM consisting of a blank term, a binary run state term (1 for running, 0 for stationary), 8 direction terms, 6 contrast terms, 8 run x direction interaction terms, 6 run x contrast interaction terms, 48 direction x contrast interaction terms, and 48 run x direction x contrast interaction terms. The responses are predicted by summing these 126 terms and exponentiating the sum. (b) GLM results for the population of layer 2/3 VIP neurons recorded from Vip mice. Left: The model weights are shown as heatmaps (top) as well as means and 95% confidence intervals (bottom). Sparse weights were obtained using an L1-regularization penalty, so that the majority of weights are zero. For direction x contrast and run x direction x contrast interaction terms, means and confidence intervals are only shown for terms with non-zero weights. Right: Predicted responses to stimulus conditions minus predicted blank response when the mouse is stationary (top) and running (bottom). (c) Same as b, but for the population of layer 4 SST neurons recorded from Sst mice. (d) Same as b, but for the population of layer 2/3 pyramidal neurons recorded from Cux2 mice.

Anatomical and optogenetic perturbation experiments suggest that VIP neurons disinhibit pyramidal neurons through their inhibition of SST neurons (Pfeffer et al., 2013; Zhang et al., 2014; Pi et al., 2013). However, VIP neurons only respond to one direction of low contrast grating and SST neurons have very weak responses to low contrast gratings of any direction, potentially limiting the magnitude of SST activity that is available to be inhibited by VIP neurons and, consequently, limiting the magnitude of disinhibition of pyramidal neurons. Evidence that visual cortex has higher gain at low contrast than high contrast (Heuer and Britten, 2002; Cavanaugh et al., 2002; Carandini and Heeger, 2012) suggests that a small reduction in feedback inhibition (e.g. disinhibition) is capable of driving a large increase in pyramidal neuron activity (Hertäg and Sprekeler, 2019). We hypothesized that VIP neurons are essential to establishing the high gain regime at low contrast as a result of VIP-mediated disinhibition forming a positive feedback loop (i.e. Pyr → VIP → SST → Pyr) that depends upon, and contributes to, network dynamics. Stabilized supralinear network (SSN) models have been proposed to account for a variety of contrast-dependent response properties in visual cortex (Rubin et al., 2015; Ahmadian et al., 2013), including the transition from a high gain regime at low contrast to a feedback inhibition dominated low gain regime at high contrast (Adesnik, 2017; Sanzeni et al., 2020), as well as cortical noise correlations (Hennequin et al., 2018), surround suppression (Liu et al., 2018), and effects of feature and spatial attention on neural activity (Lindsay et al., 2020). In SSNs, high gain is achieved through supralinear single-neuron transfer functions (e.g. 
f-I curve) and strong recurrent excitatory connections but the gain is eventually reduced as external input strength increases due to the recruitment of inhibitory neurons which also have supralinear transfer functions (Miller and Troyer, 2002; Priebe and Ferster, 2008; Margrie et al., 2002; Linaro et al., 2019). The ability of SSNs to account for a wide variety of phenomenology by utilizing only a few simplified but universal features of cortical circuits (e.g. recurrent excitation, feedback inhibition, and supralinear f-I curves) has established them as attractive models for explaining cortical dynamics (Kraynyukova and Tchumatchenko, 2018). However, the impact of interneuron diversity on the behavior of SSNs is largely unknown.

To investigate the distinct roles of each interneuron type, we extended the SSN model from one homogeneous population of interneurons to three populations corresponding to VIP, SST, and parvalbumin-expressing (PV) neurons to model layer 2/3 of mouse V1 (Figure 5a; see Materials and methods for further details). Briefly, the network is a ring model in which each layer 2/3 pyramidal neuron (‘CUX2’) receives external (‘sensory’) excitatory input that has Gaussian tuning with mean (i.e. peak/preferred direction) corresponding to the neuron’s position on the ring and standard deviation of 30 degrees; PV neurons also receive external input which is not tuned (Kerlin et al., 2010). SST neurons do not receive external input in our model to incorporate the finding of weak or no thalamocortical input to SST neurons in mouse somatosensory cortex (Cruikshank et al., 2010). VIP neurons also do not receive external input in our model since experimental measurements of this input to VIP neurons are lacking, and eliminating this potential source of excitatory input to VIP neurons is the most conservative assumption for reproducing the strong responses of VIP neurons to weak stimuli. The strength of external input is intended to represent a monotonically-increasing function of stimulus contrast, though no specific relationship is claimed here. Connections from CUX2 neurons (i.e. excitatory connections) also have Gaussian tuning that depends on the difference between the orientation preferences of the pre- and post-synaptic neurons (Figure 5b top), with broader tuning of connections targeting SST and PV neurons (standard deviation of 100 degrees) than those targeting pyramidal and VIP neurons (standard deviation of 30 degrees) to reflect the relative tuning of the postsynaptic neuron types (de Vries et al., 2020; Kerlin et al., 2010). Connections from inhibitory neurons (i.e. inhibitory connections) were broadly tuned as well (standard deviation of 100 degrees; Figure 5b bottom). 
To incorporate the bias we measured in direction tuning, we included a minor (~2%) over-representation of pyramidal neurons that prefer the zero degrees direction as well as a bias in the direction tuning of external input for zero degrees to account for the known direction bias of thalamocortical inputs (Marshel et al., 2012; Zhang et al., 2020). All neurons are modeled as rate units with rectified quadratic transfer function, the simplest supralinear polynomial.
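The building blocks of the ring model described above can be sketched as follows. This is a minimal illustration rather than the model itself: it shows the rectified quadratic transfer function, the Gaussian dependence of connection strength on tuning difference (30 degrees for narrow and 100 degrees for broad connections, on a ring assumed here to span 180 degrees as in Figure 5a), and a single Euler update of generic rate units. The gain constant `k`, the step size, and all function names are hypothetical, and no weights from the paper are reproduced.

```python
import math

def rect_quad(x, k=1.0):
    """Rectified quadratic transfer function: k * max(x, 0)**2."""
    return k * max(x, 0.0) ** 2

def circ_diff(a_deg, b_deg, period=180.0):
    """Signed difference a - b wrapped onto [-period/2, period/2)."""
    return (a_deg - b_deg + period / 2.0) % period - period / 2.0

def gaussian_weight(pre_deg, post_deg, sd_deg, period=180.0):
    """Unnormalized Gaussian connection profile on the ring."""
    d = circ_diff(pre_deg, post_deg, period)
    return math.exp(-d * d / (2.0 * sd_deg ** 2))

def external_input(stim_deg, pref_deg, strength, sd_deg=30.0):
    """Tuned external drive to a pyramidal unit with preference pref_deg."""
    return strength * gaussian_weight(stim_deg, pref_deg, sd_deg)

def euler_step(rates, weights, external, dt_over_tau=0.1, k=1.0):
    """One Euler step of tau * dr/dt = -r + f(W r + h) for generic rate units.

    weights[i][j] is the signed weight from unit j onto unit i
    (negative for inhibitory presynaptic units).
    """
    new_rates = []
    for i, r in enumerate(rates):
        drive = external[i] + sum(w * rj for w, rj in zip(weights[i], rates))
        new_rates.append(r + dt_over_tau * (-r + rect_quad(drive, k)))
    return new_rates

# Narrow (30 degree) versus broad (100 degree) connectivity at a 60 degree offset.
narrow = gaussian_weight(0.0, 60.0, sd_deg=30.0)
broad = gaussian_weight(0.0, 60.0, sd_deg=100.0)
```

In a full model, untuned PV input would be a constant drive, and SST and VIP units would receive zero external drive, as stated in the text.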

Figure 5. A stabilized supralinear network (SSN) model with three interneuron populations reproduces contrast and direction tuning of multiple neuron types and implicates VIP neurons in enhancement of network gain for weak inputs.

Figure 5.

(a) Top: The network architecture is a ring corresponding to the peak of each L2/3 pyramidal (“CUX2”) neuron’s direction tuning curve. The entire ring spans 180 degrees of direction. Bottom: A schematic illustrates the connectivity among neuron types. (b) Top: The distribution of excitatory connection strength from CUX2 pyramidal neurons onto each neuron type is Gaussian with mean equal to the difference in orientation preference of pre- and post-synaptic neurons. The distributions of recurrent connections onto CUX2 neurons and connections onto VIP neurons are narrow (standard deviation of 30 degrees) compared to the distributions onto PV and SST neurons (standard deviation of 100 degrees). Bottom: Inhibitory connection weights are all broadly tuned (standard deviation of 100 degrees). (c) The average population responses across direction and contrast conditions qualitatively reproduce experimental data for CUX2, SST, and VIP neurons shown in Figure 3. (d) Left: The steady state firing rates are shown for model neurons of each type with peak direction tuning of zero degrees in response to an external input of zero degrees. Right: The steady state firing rates of the same model neurons in response to an external input of zero degrees with the VIP-to-SST connection strength set to zero demonstrates that this connection is necessary for a high gain of CUX2 and PV neurons at the low input levels for which VIP neurons are most responsive. (e) Currents to the pyramidal neurons in panel d show that most additional external excitatory input above 15 is offset by the recruitment of inhibition. Inhibition from PV neurons dominates at weak external input strengths while inhibition from SST neurons dominates at strong external input strengths. (f) The relative fraction of currents that pyramidal neurons receive from other pyramidal neurons, rather than inhibitory neurons, decreases as external input strength increases. 
This shows the relative dominance of inhibition over excitation in the network. (g) The linear stability of the E-E subnetwork shows a transition from non-ISN dynamics (E-E stability < 0) to ISN dynamics (E-E stability > 0) at an external input strength of ~55 for networks with connection weight from VIP to SST neurons below a critical value ($W_{VIP \to SST} \approx -0.6$). Above the critical value, the E-E subnetwork is highly unstable for all external input strengths greater than ~20. (h) The firing rates of all four neuron types remain similar below the critical value of $W_{VIP \to SST}$, except VIP neurons which increase substantially with increasing $W_{VIP \to SST}$ but remain most active at weak external input strengths. Above the critical value, rates of pyramidal, PV and VIP neurons increase substantially, and SST neuron rates are near zero, for all external input strengths. (i) The effect of the VIP-to-SST neuron connection on pyramidal neuron gain shows that the increase in gain occurs only at weak external input strengths. The gain effect increases with increasing $W_{VIP \to SST}$ below the critical value. $W/W_{critical}$ for the networks shown in panels b-f is 0.99, where W is -0.6. (j-l) Same as (g-i), except varying the weight of PV inputs onto pyramidal neurons relative to the total weight of inhibitory (i.e. PV and SST) inputs onto pyramidal neurons. The external input strength at which the E-E stability (panel j) transitions from non-ISN to ISN decreases as the relative weight from PV neurons increases, but the stability behavior, firing rates (panel k), and gain effect (panel l) remain similar until a bifurcation near $W_{PV}/(W_{PV}+W_{SST}) = 0.8$. The network becomes unstable at very high relative PV weights ($W_{PV}/(W_{PV}+W_{SST}) > 0.95$). $W_{PV}/(W_{PV}+W_{SST})$ for the networks shown in panels b-f is 0.5.

This model qualitatively reproduces the population direction and contrast tuning we observed for VIP, SST, and layer 2/3 pyramidal (CUX2) neurons and makes a prediction for the tuning of PV neurons (Figure 5c). Model VIP neurons are suppressed at high levels of external input regardless of stimulus direction, reproducing the suppressed-by-contrast behavior we observed in our imaging experiments, and are active for all stimulus directions at low contrast but most active for the zero degrees direction, again reproducing VIP neuron tuning (Figure 3b). The external input strength for which VIP neurons are most active (~10 a.u.) corresponds to the highest gain (‘supralinear’) regime for L2/3 pyramidal and PV activity (Figure 5d: left). Ablating the VIP-to-SST inhibitory connection, the only output of VIP neurons contained in the model, results in a large reduction in the gain and activity of VIP, L2/3 pyramidal, and PV populations at low input (Figure 5d: right). Even in the absence of inhibition from VIP neurons, SST neurons have relatively low activity at a low level of external input, demonstrating that suppression of a relatively small amount of SST neuron activity can drive large increases in pyramidal neuron gain. These results indicate that VIP-mediated disinhibition is capable of producing substantial increases in gain at weak inputs, despite low activity of the intermediate SST neuron population, in networks with supralinear single neuron transfer functions and recurrent excitation.

The introduction of a positive feedback loop into the SSN model in the form of VIP-mediated disinhibition could have a destabilizing effect on network dynamics. A key aspect of the stability of network dynamics is whether recurrent excitation requires feedback inhibition to prevent runaway activity; that is, whether the network is inhibitory stabilized (‘ISN’) or non-inhibitory stabilized (‘non-ISN’). In this context, achieving a high gain might push networks to the brink of instability. Conversely, suppressing VIP neuron activity below baseline, rather than to baseline, for high external input strength could be an important component of ensuring network stability. We assessed the stabilization regime by a linear stability analysis of the network’s response to a perturbation of the inputs that uniformly targeted all locations on the ring (Materials and methods). The excitatory-excitatory (E-E) component of the network’s linear response matrix exposes inhibitory stabilization: if it is negative, the excitatory-excitatory subnetwork has weak effective coupling and does not require inhibition to stabilize it. On the other hand, if the E-E subnetwork’s linear response is positive, but the network as a whole is stable (e.g. converges to stable steady-state rates), inhibition is required to stabilize the recurrent excitation and the network is inhibitory-stabilized, an ISN. We find that the network transitions from non-ISN to ISN as external input strength increases with a transition between stability regimes around an external input strength of 50 a.u. (Figure 5g), as long as the VIP-to-SST connection strength is below a critical value (normalized to be 1.0 in Figure 5g–i). Above the critical VIP-to-SST connection strength, the network becomes highly unstable even for low external input strengths and firing rates of pyramidal, PV, and VIP neurons explode while SST neuron activity is suppressed (Figure 5h). 
To determine the impact of VIP neurons on circuit gain, we measured the pyramidal neuron gain (i.e. the slope with respect to external input strength in Figure 5h) and examined the difference between that gain and the gain in a network with VIP-to-SST connection strength set to zero (but otherwise identical). This difference is greatest at the same low external input strengths for which VIP neuron activity is high and monotonically increases as a function of the VIP-to-SST connection strength until the transition to instability at the critical value (Figure 5i). The gain enhancement is present only at low external input strengths for which the network is in a non-ISN regime, ensuring that the impact of VIP neurons on gain does not disrupt inhibitory stabilization. Still, the external input strength at which VIP rates explode above the critical VIP-to-SST connection strength (Figure 5h) closely matches the external input strength at which VIP rates and network gain are the highest below the critical VIP-to-SST connection strength (Figure 5i), emphasizing the delicate balance of gain and stability in the cortical network. Although VIP-mediated disinhibition and increase in network gain can be destabilizing above a critical strength, this analysis demonstrates that a substantial increase in gain can be achieved over a wide range of VIP-to-SST strengths.

Finally, we also investigated the impact of the relative strength of the weight of PV inputs versus SST inputs to pyramidal neurons (WPV/(WPV+WSST); Figure 5j–l). Along with the strength of the VIP-to-SST connection, this ratio is a key determinant of how inhibition is recruited in the SSN. Although the external input strength at which the network transitions from non-ISN to ISN decreases as the relative weight from PV neurons increases, the stability behavior (Figure 5j), firing rates (Figure 5k), and gain effect (Figure 5l) remain similar over a broad range of the WPV/(WPV+WSST) ratio. Only when input from PV neurons greatly outweighs input from SST neurons does a bifurcation occur (near WPV/(WPV+WSST)=0.8), and the network ultimately becomes unstable (near WPV/(WPV+WSST)=0.95). These results demonstrate that the stability behavior and gain effects we observe in SSNs with three interneuron populations are robust over a wide range of values for key model parameters.

This survey of contrast tuning in mouse V1 revealed two distinct regimes of cortical dynamics in superficial layers of cortex. At high contrast, SST neuron activity is high, VIP neuron activity is suppressed, and layer 2/3 pyramidal neuron activity is lower than it is at low contrast; at low contrast, SST neuron activity is low, VIP neuron activity is direction tuned and gated by locomotion, and layer 2/3 pyramidal neuron activity is higher and more enhanced by locomotion. Measurements of size tuning with high contrast gratings have shown that SST neurons prefer large gratings, suggestive of a role mediating surround suppression, whereas VIP neurons only respond to gratings smaller than those that drive SST neurons (Adesnik et al., 2012; Dipoppa et al., 2018). Interestingly, the receptive fields of VIP neurons are larger than those of SST or pyramidal neurons when measured with sparse noise stimuli (de Vries et al., 2020), indicating that the selectivity of VIP neurons for small stimuli does not arise simply from having small linear receptive fields. This complementary size tuning parallels the complementary contrast tuning observed here, suggesting that VIP and SST neurons in V1 are tuned for weak and strong inputs, respectively, across multiple stimulus dimensions. Indeed, this relationship appears to hold across sensory modalities as VIP neurons in mouse primary auditory cortex are selective for lower sound intensities than SST or PV neurons (Mesik et al., 2015). Taken together, a parsimonious explanation of these results is that VIP neuron activity supports a high gain regime that increases sensitivity to weak inputs, whereas SST neuron activity promotes a low gain regime that decreases sensitivity to strong inputs and maintains network stability. Heightened sensitivity to detect low contrast objects or obstacles approaching head-on during locomotion might be more behaviorally relevant than other directions of motion. 
This ability of VIP neurons to promote a high gain in the local microcircuit might be indicative of a more general role at the nexus of top-down (e.g. attention) and bottom-up (e.g. saliency) processes.

Materials and methods

Key resources table.

Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information
Genetic reagent (M. musculus) | Vip-IRES-Cre | Jackson Laboratory | Stock #: 010908; RRID:MGI:4436915 | Dr. Z Josh Huang (Cold Spring Harbor Laboratory)
Genetic reagent (M. musculus) | Sst-IRES-Cre | Jackson Laboratory | Stock #: 013044; RRID:IMSR_JAX:013044 | Dr. Z Josh Huang (Cold Spring Harbor Laboratory)
Genetic reagent (M. musculus) | Cux2-CreERT2 | MMRRC | RRID:MMRRC_032779-MU | PMID:22879516
Genetic reagent (M. musculus) | Rorb-IRES2-Cre | Jackson Laboratory | Stock #: 023526; RRID:IMSR_JAX:023526 | PMID:25071457
Genetic reagent (M. musculus) | Rbp4-Cre_KL100 | MMRRC | RRID:MMRRC_031125-UCD | PMID:24360541
Genetic reagent (M. musculus) | Ntsr1-Cre_GN220 | Jackson Laboratory | Stock #: 017266; RRID:MMRRC_030648-UCD | PMID:24360541
Genetic reagent (M. musculus) | CaMKII-tTA x Ai93-GCaMP6f | Jackson Laboratory | Stock #: 024108; RRID:IMSR_JAX:024108 | PMID:22855807; PMID:25741722
Genetic reagent (M. musculus) | Ai148-GCaMP6f | Jackson Laboratory | Stock #: 030328; RRID:IMSR_JAX:030328 | PMID:30007418
Software, algorithm | NumPy | NumPy | RRID:SCR_008633 |
Software, algorithm | Matplotlib | MatPlotLib | RRID:SCR_008624 |
Software, algorithm | pandas | pandas | DOI:10.5281/zenodo.3509134 |
Software, algorithm | statsmodel | statsmodel | RRID:SCR_016074 |
Software, algorithm | scipy | SciPy | RRID:SCR_008058 |
Software, algorithm | scikit-learn | scikit-learn | RRID:SCR_002577 |

Experimental animals

All animal procedures were approved by the Institutional Animal Care and Use Committee (IACUC) at the Allen Institute for Brain Science. Six double or triple transgenic mouse lines were used to drive expression of GCaMP6f in genetically-defined cell types, including four excitatory (Cux2-CreERT2;Camk2a-tTA;Ai93, Rorb-IRES2-Cre;Camk2a-tTA;Ai93, Rbp4-Cre_KL100;Camk2a-tTA;Ai93, and Ntsr1-Cre_GN220;Ai148) and two inhibitory (Vip-IRES-Cre;Ai148 and Sst-IRES-Cre;Ai148) mouse lines. Mice were habituated to head fixation and visual stimulus presentation for 2 weeks before data collection. Post-surgical experimental mice were housed in cages individually and maintained on a reverse dark-light cycle with experiments conducted during the dark phase. (See de Vries et al., 2020 for further Cre line, surgical, and habituation details). Sample size was determined qualitatively to balance repeated experiments for each layer/Cre-line and to preserve the breadth of the survey.

The correspondence between Cre lines (including all six Cre lines used in this study) and transcriptomic neuron subtypes as measured with single-cell RNA sequencing has been reported in Extended Figure 8 of Tasic et al., 2018. Vip-Cre and Sst-Cre lines provide broad coverage of VIP and SST neuron transcriptomic subtypes (16 and 21 subtypes, respectively). In layer 2/3, Cux2-CreERT2 labels all three excitatory neuron transcriptomic subtypes. Layer 4 contains only a single transcriptomic neuron type, which is sampled by the Rorb-Cre line. Rbp4-Cre_KL100 labels all twelve layer 5 neuron transcriptomic subtypes; note that layer 5 was imaged at a single depth in this study, which might result in sampling only a subset of the layer 5 transcriptomic types. Ntsr1-Cre labels all six layer 6 corticothalamic neuron transcriptomic subtypes.

Two-photon imaging platform and image processing

Data were collected using the same data collection pipeline as the Allen Brain Observatory and processed using the same image processing and event detection methods (See de Vries et al., 2020 for further imaging and image processing details). Calcium imaging was performed with Nikon A1R MP+ two-photon microscopes adapted to provide space to accommodate the running disc. Laser excitation with a wavelength of 910 nm was provided by a Ti:Sapphire laser (Chameleon Vision, Coherent). Precompensation was fixed at 10,000 fs². Movies were recorded at 30 Hz with resonant scanners over a 400 μm field of view with a resolution of 512 × 512 pixels. Temporal synchronization of calcium imaging, visual stimulation, and running wheel movement was achieved by recording all experimental clocks on a single NI PCI-6612 digital IO board at 100 kHz. PMT gain and laser power were chosen for each experiment to maximize dynamic range while saturating fewer than 1000 pixels in the field of view. Two z-stacks, one local (±30 μm from imaging depth in 0.1 μm steps) and one full-depth of the cortex (~700 μm total depth in 5 μm steps), were acquired at the end of each imaging session. Z-drift was calculated from the local z-stack and experiments with z-drift of more than 10 μm during the experiment were excluded. The imaging depth of the field of view was confirmed from the full-depth cortical z-stack.

Calcium fluorescence movies were motion corrected for rigid translational errors using an algorithm based on phase correlation. ROI masks of neuronal somata were segmented from motion-corrected movies by (1) creating initial binarized masks using an adaptive fluorescence threshold, (2) applying a succession of morphological operations to fill closed holes and concave shapes, (3) computing a feature vector of each mask that included morphological attributes such as location, area, perimeter, and compactness, (4) combining or eliminating ROIs based on heuristic decisions, including attributes from the feature vectors, and (5) applying a final discrimination step using a binary relevance classifier fed by experimental metadata (e.g. Cre line and imaging depth) as well as the morphological feature vectors. Fluorescence traces were then extracted for each final ROI, neuropil subtracted, and corrected for overlapping ROIs by demixing traces. Contamination of each ROI by the surrounding neuropil was estimated by modeling the measured ROI fluorescence as the sum of the true ROI fluorescence and a weighting of the surrounding neuropil fluorescence, $F_M=F_C+rF_N$, where $F_M$ is the measured fluorescence trace, $F_C$ is the unknown true ROI fluorescence trace that we are trying to estimate, $F_N$ is the fluorescence of the surrounding neuropil and $r$ is the contamination ratio. The contamination ratio was estimated for each ROI by selecting the value for $r$ that minimizes the cross-validation error, $E=\sum_t\left|F_C-F_M+rF_N\right|^2$, over four folds. Overlapping ROIs were demixed by modeling the measured fluorescence $F_{it}$ of each pixel $i$ at time $t$ as $F_{it}=\sum_k W_{kit}T_{kt}$, where $W_{kit}$ are time-dependent weighted masks that describe how much of each neuron $k$’s fluorescence is contained in each pixel at each timestep, and $T_{kt}$ is the fluorescence trace of the neurons that we seek to estimate. 
Reconstruction of calcium movies is modeled as $\sum_i A_{ki}F_{it}=\sum_{k,i}A_{ki}W_{kit}T_{kt}$, where $A_{ki}$ are the binary spatial masks obtained in the earlier segmentation step in which $A_{ki}$ equals 1 if pixel $i$ is in ROI $k$ and equals 0 otherwise. To solve for $T_{kt}$ at each time $t$, we first estimated the weighted masks $W_{kit}$ by projection of the recorded fluorescence $F_{it}$ onto the binary masks $A_{ki}$, then computed the linear least-squares solution $\hat{T}_{kt}$ to extract each ROI trace’s value. To calculate ΔF/F traces from each fluorescence trace, a fluorescence baseline was determined by median filtering the fluorescence trace with a window of 180 s (5401 samples); the ΔF/F trace was then produced by subtracting the fluorescence baseline from the original trace and then dividing by the fluorescence baseline. To prevent very small or negative baselines, we set the baseline as the maximum of the median filter-estimated baseline and the standard deviation of the estimated noise of the fluorescence trace. All analyses of cell responses were performed on L0-penalized detected events (Jewell and Witten, 2018; Jewell et al., 2019).
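The ΔF/F step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline code: the function name `delta_f_over_f`, the truncated window at the edges of the trace, and the toy trace are assumptions made for the example, and the noise estimate is simply passed in as a number.

```python
import statistics

def delta_f_over_f(trace, window, noise_std):
    """Return dF/F using a running-median baseline floored at noise_std.

    Assumption: the median window is truncated at the trace edges.
    """
    half = window // 2
    dff = []
    for t in range(len(trace)):
        lo, hi = max(0, t - half), min(len(trace), t + half + 1)
        baseline = statistics.median(trace[lo:hi])
        # Floor the baseline to avoid very small or negative values.
        baseline = max(baseline, noise_std)
        dff.append((trace[t] - baseline) / baseline)
    return dff

# Toy trace: flat fluorescence of 100 with a single transient of 150.
trace = [100.0] * 20
trace[10] = 150.0
dff = delta_f_over_f(trace, window=5, noise_std=1.0)
```

In the study the window was 180 s at 30 Hz (5401 samples); the short window here only keeps the example small.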

Two-photon imaging data were collected from the retinotopic center of primary visual cortex that was identified through mapping during widefield intrinsic signal imaging. Cux2-CreERT2;Camk2a-tTA;Ai93 and Vip-IRES-Cre;Ai148 mice were imaged at 175 μm below the cortical surface in layer 2/3; Sst-IRES-Cre;Ai148 mice and Rorb-IRES2-Cre;Camk2a-tTA;Ai93 mice were imaged at 275 μm below the cortical surface in layer 4; Rbp4-Cre_KL100;Camk2a-tTA;Ai93 mice were imaged at 375 μm below the cortical surface in layer 5; and Ntsr1-Cre_GN220;Ai148 mice were imaged at 550 μm below the cortical surface in layer 6. (These Cre lines and imaging depths match those used in the Allen Brain Observatory.) Some mice were imaged in two different fields of view at the same depth; the sample sizes for number of imaging sessions and mice are given in Figure 1—source data 1. Some mice were imaged in multiple sessions; in cases in which a subset of cells was imaged in multiple sessions, only data from the first imaging session for each cell was analyzed. Mice were excluded for evidence of epileptiform activity, and individual imaging sessions were failed if there were signs of bleaching, saturation, excessive z-drift, or animal stress, among other factors.

Visual stimulus

As experimental sessions took place on the same data collection pipeline as the Allen Brain Observatory, visual stimulus monitor calibration and positioning (ASUS PA248Q LCD monitor with 1920×1200 pixels; center of monitor was 118.6 mm lateral, 86.2 mm anterior, and 31.6 mm dorsal to the right eye; normal distance from the right eye to center of monitor was 15 cm) were identical. Each monitor was gamma corrected and had a mean luminance of 50 cd m−2. Spherical warping was applied to all stimuli to ensure constant spatial and temporal frequencies across the monitor as seen from the mouse’s perspective. See de Vries et al., 2020 for further visual stimulus presentation details. The stimulus consisted of a full field drifting sinusoidal grating that was presented at a single spatial frequency (0.04 cycles/degree) and temporal frequency (1 Hz), eight directions uniformly distributed in 45 degree increments (0 degrees = horizontal front-to-back motion), and six contrasts (5%, 10%, 20%, 40%, 60%, and 80%). Direction of motion was always orthogonal to the orientation of the grating. Each grating was presented for 2 s, followed by 1 s of mean luminance gray before the next grating. Each grating condition (direction, contrast combination) was presented 15–24 times. Trials were randomized with 30 randomly interleaved blank (i.e. mean luminance gray, zero contrast) trials.
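As a concrete illustration of the stimulus set, the sketch below enumerates the direction-by-contrast conditions and builds one randomized trial sequence. The names `build_trial_sequence` and `BLANK`, the choice of 15 repeats (the lower end of the 15–24 range), and the fixed random seed are assumptions for the example; the actual stimulus code is not reproduced here.

```python
import itertools
import random

DIRECTIONS_DEG = list(range(0, 360, 45))  # 0 = horizontal front-to-back motion
CONTRASTS_PCT = [5, 10, 20, 40, 60, 80]
BLANK = None  # hypothetical marker for a mean-luminance gray trial

def build_trial_sequence(n_repeats=15, n_blank=30, seed=0):
    """Return a shuffled list of (direction, contrast) tuples plus blanks."""
    gratings = list(itertools.product(DIRECTIONS_DEG, CONTRASTS_PCT))
    trials = gratings * n_repeats + [BLANK] * n_blank
    random.Random(seed).shuffle(trials)
    return trials

trials = build_trial_sequence()
n_conditions = len(set(trials))  # 48 grating conditions plus the blank
```

With eight directions, six contrasts, and the blank, this gives the 49 conditions used in the responsiveness test.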

Analysis

Statistical test for responsiveness

A chi-square test for independence was used to identify cells that responded significantly to the drifting grating stimulus set. The chi-square test statistic was computed as $\chi^2=\sum_{i=0}^{n}\frac{(E_i-O_i)^2}{E_i}$, where $O_i=\frac{1}{m_i}\sum_{j=0}^{m_i}R_{i,j}$ is the observed average response ($R$) of the neuron over $m_i$ presentations of a grating stimulus of a particular condition (i.e. direction-by-contrast pair or blank, n = 49 total conditions), and $E_i=\frac{\sum_{i}^{n}\sum_{j}^{m_i}R_{i,j}}{\sum_{i}^{n}m_i}$ is the expected (grand average) response per stimulus presentation. A p-value was then calculated for each cell by comparing the test statistic against a null distribution of 200,000 test statistics, each computed from the cell’s responses after shuffling (with replacement) cell responses across all presentations.
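The responsiveness test can be sketched as follows. This is an illustrative reading of the procedure, not the analysis code: the function names are hypothetical, "shuffling with replacement" is interpreted as resampling the pooled responses with `random.choices`, and the default number of resamples is far smaller than the 200,000 used in the study.

```python
import random

def chi_square_stat(responses_by_condition):
    """Sum over conditions of (E - O_i)^2 / E, with E the grand mean response."""
    pooled = [r for cond in responses_by_condition for r in cond]
    expected = sum(pooled) / len(pooled)
    if expected == 0:
        return 0.0  # no events at all: treat as no evidence of tuning
    return sum((expected - sum(c) / len(c)) ** 2 / expected
               for c in responses_by_condition)

def responsiveness_p_value(responses_by_condition, n_shuffles=1000, seed=0):
    """Fraction of resampled statistics at least as large as the observed one."""
    rng = random.Random(seed)
    observed = chi_square_stat(responses_by_condition)
    pooled = [r for cond in responses_by_condition for r in cond]
    sizes = [len(c) for c in responses_by_condition]
    n_at_least = 0
    for _ in range(n_shuffles):
        resampled = iter(rng.choices(pooled, k=len(pooled)))
        null = [[next(resampled) for _ in range(m)] for m in sizes]
        if chi_square_stat(null) >= observed:
            n_at_least += 1
    return n_at_least / n_shuffles

# Toy cell: responds on every trial of one condition and never otherwise.
tuned = [[1.0] * 10] + [[0.0] * 10 for _ in range(4)]
stat = chi_square_stat(tuned)
p_value = responsiveness_p_value(tuned)
```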

Response significance by stimulus condition and test for suppression by contrast

The distribution of responses to stimulus presentations varied substantially across cells, so a statistical measure was used to normalize response magnitudes. The mean blank-subtracted response to a given stimulus condition was calculated as $\bar{R}=\frac{1}{m_i}\sum_{j=0}^{m_i}R_{i,j}-\frac{1}{m_{blank}}\sum_{j=0}^{m_{blank}}R_{blank,j}$. Then, a bootstrapped null distribution of such mean (blank-subtracted) condition responses was generated by sampling with replacement from all of the cell’s responses across all stimulus presentations. The percentile of each cell’s observed mean condition response within its own bootstrapped distribution was then computed. Cells were determined to be suppressed by high contrast if this percentile for the peak direction grating condition at 80% contrast was below 0.05.
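A minimal sketch of the percentile calculation is given below. The function name `response_percentile` is hypothetical, and one detail is an assumption: each bootstrap sample has the same number of trials as the tested condition and is blank-subtracted using the observed blank mean.

```python
import random

def response_percentile(cond_resp, blank_resp, all_resp, n_boot=2000, seed=0):
    """Percentile (0-1) of the observed blank-subtracted mean response
    within a bootstrap distribution drawn from all of the cell's responses."""
    rng = random.Random(seed)
    blank_mean = sum(blank_resp) / len(blank_resp)
    observed = sum(cond_resp) / len(cond_resp) - blank_mean
    m = len(cond_resp)
    n_below = 0
    for _ in range(n_boot):
        sample = rng.choices(all_resp, k=m)  # sampling with replacement
        if sum(sample) / m - blank_mean < observed:
            n_below += 1
    return n_below / n_boot

# Toy cell that is silent at 80% contrast but active on most other trials.
cond_80 = [0.0] * 15
blank = [0.5] * 30
all_responses = [1.0] * 80 + [0.0] * 20
pct = response_percentile(cond_80, blank, all_responses)
suppressed_by_contrast = pct < 0.05
```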

Orientation and direction selectivity metrics

Global orientation selectivity was computed from mean extracted event responses to drifting gratings, at the cell’s preferred contrast as,

$$gOSI=\frac{\left|\sum_{\theta}R_{\theta}e^{i\theta}\right|}{\sum_{\theta}R_{\theta}}$$

where θ is the direction of grating movement, and Rθ is the mean response to that direction of motion.

Direction selectivity was computed from mean extracted event responses to drifting gratings, at the cell’s preferred contrast, as

$$DSI=\frac{R_{pref}-R_{null}}{R_{pref}+R_{null}}$$

where Rpref is a cell’s mean response in its preferred direction (i.e. largest response-evoking direction) and Rnull is its mean response to the opposite direction.
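For example, the direction selectivity index can be computed from a table of mean responses per direction as in the sketch below. The function name and the dictionary layout are assumptions for illustration, as is returning zero when both responses are zero.

```python
def direction_selectivity(resp_by_dir):
    """DSI from a dict mapping direction (degrees) to mean response."""
    pref = max(resp_by_dir, key=resp_by_dir.get)  # largest response-evoking direction
    null = (pref + 180) % 360                     # opposite direction
    r_pref, r_null = resp_by_dir[pref], resp_by_dir[null]
    if r_pref + r_null == 0:
        return 0.0  # assumption: an unresponsive cell is assigned DSI = 0
    return (r_pref - r_null) / (r_pref + r_null)

mean_resp = {0: 3.0, 45: 1.0, 90: 0.5, 135: 0.5,
             180: 1.0, 225: 0.5, 270: 0.5, 315: 1.0}
dsi = direction_selectivity(mean_resp)
```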

Contrast preference metric

Contrast preference was computed from mean extracted event responses to drifting gratings, at the cell’s preferred direction, as

$$c_{CoM}=\exp\left(\frac{\sum_{c}R_{c}\ln c}{\sum_{c}R_{c}}\right)$$

where c is the contrast of the drifting grating, Rc is a cell’s mean response at contrast c, and cCoM is the log-scaled center of mass of the cell’s contrast response tuning.
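A short sketch of the contrast center of mass follows. It assumes non-negative mean responses with at least one non-zero value and contrast expressed in percent; the function name is hypothetical.

```python
import math

def contrast_center_of_mass(resp_by_contrast):
    """Log-scaled center of mass of a contrast response function.

    resp_by_contrast maps contrast (percent) to mean response.
    """
    total = sum(resp_by_contrast.values())
    weighted_log = sum(r * math.log(c) for c, r in resp_by_contrast.items())
    return math.exp(weighted_log / total)

# Equal responses at 5% and 80% only: the center of mass is their geometric mean.
ccom = contrast_center_of_mass({5: 1.0, 10: 0.0, 20: 0.0,
                                40: 0.0, 60: 0.0, 80: 1.0})
```

Because the average is taken over log contrast, equal responses at 5% and 80% give a center of mass of 20% rather than the arithmetic midpoint of 42.5%.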

Bias in population direction preference

The direction and magnitude of bias in direction preference for a population of cells (e.g. all cells recorded from one mouse or all cells recorded from all mice of a particular Cre line) was calculated as the direction and magnitude of the vector sum of the direction preferences of the cells that comprise the population, at a particular contrast as,

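$$\theta_{bias}=\tan^{-1}\left(\frac{\sum_i\sin\theta_i}{\sum_i\cos\theta_i}\right)$$
$$r_{bias}=\frac{1}{n_{cells}}\sqrt{\left(\sum_i\cos\theta_i\right)^2+\left(\sum_i\sin\theta_i\right)^2}$$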

where θi is the preferred direction of cell i, ncells is the number of cells in the population, θbias is the direction of the vector sum over the population, and rbias is the magnitude of the vector sum over the population.
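The vector-sum calculation can be sketched as below. The function name is hypothetical, and `math.atan2` is used in place of a plain inverse tangent so that the quadrant of the bias direction is resolved; that is an implementation choice for this example.

```python
import math

def direction_bias(pref_dirs_deg):
    """Direction (degrees) and magnitude of the population vector sum."""
    s = sum(math.sin(math.radians(d)) for d in pref_dirs_deg)
    c = sum(math.cos(math.radians(d)) for d in pref_dirs_deg)
    theta_bias = math.degrees(math.atan2(s, c)) % 360
    r_bias = math.hypot(c, s) / len(pref_dirs_deg)
    return theta_bias, r_bias

# Three cells prefer front-to-back motion (0 degrees), one prefers 180 degrees.
theta_bias, r_bias = direction_bias([0, 0, 0, 180])
```

A population with uniformly distributed preferences gives a magnitude near zero, whereas a population in which every cell prefers the same direction gives a magnitude of one.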

Stimulus tuning conditioned on locomotion behavior

As part of the standardized pipeline for the Allen Brain Observatory, mice were held on a running wheel during experimental sessions and locomotion behavior was recorded (See de Vries et al., 2020 for further run speed measurement details). The mean running speed was calculated for each trial over the same time window as the mean cellular response was calculated. Trials for which the mean running speed was greater than or equal to 1 cm/s were categorized as running trials, whereas trials for which the mean running speed was below 1 cm/s were categorized as stationary trials. The mean and standard error of the mean event magnitude for each contrast and direction condition shown in Figure 3 was calculated separately for running and stationary trials. The criterion for a cell to be included in the calculation for a given direction-by-contrast condition was that the mouse had to be running for a minimum of four trials and be stationary for a minimum of four trials of that condition. At least three responsive neurons needed to be present to include an experiment in this analysis.
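The trial classification and inclusion rule can be illustrated with the following sketch for a single cell and a single direction-by-contrast condition. The function name and the return convention (`None` when the condition fails the minimum-trial criterion) are assumptions for the example.

```python
def split_by_locomotion(responses, speeds_cm_s, threshold=1.0, min_trials=4):
    """Mean response on running and stationary trials of one condition.

    Returns None if either state has fewer than min_trials trials.
    """
    running = [r for r, s in zip(responses, speeds_cm_s) if s >= threshold]
    stationary = [r for r, s in zip(responses, speeds_cm_s) if s < threshold]
    if len(running) < min_trials or len(stationary) < min_trials:
        return None
    return sum(running) / len(running), sum(stationary) / len(stationary)

responses = [2.0, 4.0, 2.0, 4.0, 1.0, 1.0, 1.0, 1.0]
speeds = [5.0, 12.0, 1.0, 3.0, 0.0, 0.2, 0.9, 0.5]  # mean speed per trial, cm/s
means = split_by_locomotion(responses, speeds)
```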

Contrast response function fitting and model comparison

Event responses as a function of contrast, at a cell’s preferred direction, were fit to a rising sigmoid (‘high pass’), a falling sigmoid (‘low pass’), and the product of one rising and one falling sigmoid (‘band pass’).

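$$R_{high\ pass}(c;h,b,s,c_{50r})=b+h\,\frac{1}{1+e^{-s(c-c_{50r})}}$$
$$R_{low\ pass}(c;h,b,s,c_{50f})=b+h\,\frac{1}{1+e^{s(c-c_{50f})}}$$
$$R_{band\ pass}(c;h,b,s,c_{50r},c_{50f})=b+h\left(\frac{1}{1+e^{-s(c-c_{50r})}}\right)\left(\frac{1}{1+e^{s(c-c_{50f})}}\right)$$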

where c is the contrast, c50r is the contrast at which the response rises halfway between the base and height, c50f is the contrast at which the response falls halfway between the base and height, b is the lowest response, h is the response amplitude, and s is the slope of the sigmoid (fixed at s=10). The best fit model was determined by calculating the Akaike Information Criterion (AIC) for each model and selecting the model with the lowest AIC.

The AIC can be calculated as:

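$$AIC=2k-2\ln L$$
$$L=\prod_{contrasts}\prod_{trials}\mathcal{N}\left(R_{ci}\,|\,\mu=\hat{R}_c,\sigma_R^2\right)$$
$$\ln L=-\frac{1}{2\sigma_R^2}\sum_{contrasts}\sum_{trials}\left(R_{ci}-\hat{R}_c\right)^2+constant$$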

where $k$ is the number of parameters fit in the model, $L$ is the likelihood of observing the responses given the fitted model and response distribution, $R_{ci}$ is the cell’s response to a grating stimulus of contrast $c$ (at the cell’s preferred direction) on trial $i$, $\hat{R}_c$ is the response predicted by the model to a grating stimulus of contrast $c$, $\sigma_R^2$ is the variance of all of the cell’s responses, and $\mathcal{N}$ is the normal distribution. In practice, it is more convenient to directly calculate the log-likelihood than to calculate the likelihood and subsequently take the log, and the constant can be ignored for model selection since the same constant applies to all models being compared.
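The three contrast response models and the Gaussian form of the AIC can be sketched as follows. Several details are assumptions for illustration: contrast is expressed as a fraction between 0 and 1, the signs of the exponents are chosen so that the sigmoids rise and fall as described, the parameters are supplied by hand rather than fit, and the variance is an arbitrary value. The sketch uses the normal log-likelihood, not the bootstrapped likelihood described in the next paragraph.

```python
import math

S = 10.0  # slope, fixed at s = 10 as in the text

def high_pass(c, h, b, c50r):
    """Rising sigmoid."""
    return b + h / (1.0 + math.exp(-S * (c - c50r)))

def low_pass(c, h, b, c50f):
    """Falling sigmoid."""
    return b + h / (1.0 + math.exp(S * (c - c50f)))

def band_pass(c, h, b, c50r, c50f):
    """Product of one rising and one falling sigmoid."""
    rise = 1.0 / (1.0 + math.exp(-S * (c - c50r)))
    fall = 1.0 / (1.0 + math.exp(S * (c - c50f)))
    return b + h * rise * fall

def gaussian_aic(n_params, observed, predicted, variance):
    """AIC = 2k - 2 ln L, with the additive constant of ln L dropped."""
    ss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    log_likelihood = -ss / (2.0 * variance)
    return 2 * n_params - 2 * log_likelihood

contrasts = [0.05, 0.1, 0.2, 0.4, 0.6, 0.8]
observed = [low_pass(c, 1.0, 0.0, 0.2) for c in contrasts]  # synthetic low-pass cell
aic_low = gaussian_aic(3, observed, [low_pass(c, 1.0, 0.0, 0.2) for c in contrasts], 0.05)
aic_high = gaussian_aic(3, observed, [high_pass(c, 1.0, 0.0, 0.2) for c in contrasts], 0.05)
best_model = "low pass" if aic_low < aic_high else "high pass"
```

With the slope fixed, the high-pass and low-pass models each have three free parameters and the band-pass model has four, so the AIC penalizes the band-pass model unless it reduces the residuals enough to compensate.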

Due to the non-normal response distribution, possibly arising from calcium imaging as well as an underlying non-normal spiking distribution, we bootstrapped the log-likelihood rather than assume normality. Therefore, the likelihood was calculated numerically by shuffling responses across trials 1000 times and calculating the sum of squared residuals from the predicted responses as $SS=\sum_{contrasts}\sum_{trials}\left(R_{ci}-\hat{R}_c\right)^2$ for each shuffle. The likelihood was taken as the fraction of shuffles for which SS was greater than the observed SS.

Generalized linear model

We constructed Generalized Linear Models, specifically Poisson (i.e. exponential function) models, to predict the population response of each neuron type (e.g. VIP neurons) on each trial from stimulus contrast, stimulus direction, locomotion state (i.e. binary run or not run variable), and the interactions between these terms. The model was

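$$\hat{R}_{b,r,d,c}=\exp\left(w_b a_b+w_r a_r+\sum_d a_d\left(w_d+w_{d,r}a_r\right)+\sum_c a_c\left(w_c+w_{c,r}a_r\right)+\sum_d\sum_c a_d a_c\left(w_{d,c}+w_{d,c,r}a_r\right)+k\right)$$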

where $\hat{R}_{b,r,d,c}$ is the predicted response for a trial, $w$ terms are the weights of the model, $a$ terms are binary variables that equal 1 if the trial attribute is true and equal 0 otherwise; the trial attributes are blank ($b$), run state ($r$), stimulus direction ($d$), and stimulus contrast ($c$); and $k$ is a constant. The weights of the model were computed by minimizing the cost function $L$ using iteratively reweighted least squares,

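$$L=SSE+\lambda l_1$$

where SSE is the reconstruction error,

$$SSE=\sum_{b,r,d,c}\left(R_{b,r,d,c}-\hat{R}_{b,r,d,c}\right)^2$$

and $l_1$ is an L1-regularization penalty that serves to identify only weights that significantly contribute to neuronal responses,

$$l_1=\left|w_b\right|+\left|w_r\right|+\sum_d\left(\left|w_d\right|+\left|w_{d,r}\right|\right)+\sum_c\left(\left|w_c\right|+\left|w_{c,r}\right|\right)+\sum_d\sum_c\left(\left|w_{d,c}\right|+\left|w_{d,c,r}\right|\right)$$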

The strength of regularization, λ, was determined through leave-one-out cross validation in which one experimental session was left out for each fold.
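To make the structure of the model concrete, the sketch below evaluates the predicted response for one trial and the L1 penalty for a set of weights. It covers prediction only and does not implement the fitting or cross-validation. The dictionary-based weight layout, the key names, and the example weight values are assumptions for illustration; missing keys stand for weights that regularization has set to zero.

```python
import math

def glm_predict(weights, k, blank, running, direction=None, contrast=None):
    """Predicted response exp(linear predictor) for a single trial."""
    a_r = 1.0 if running else 0.0
    eta = k + weights.get("r", 0.0) * a_r
    if blank:
        eta += weights.get("b", 0.0)
    else:
        d, c = direction, contrast
        eta += weights.get(("d", d), 0.0) + weights.get(("d,r", d), 0.0) * a_r
        eta += weights.get(("c", c), 0.0) + weights.get(("c,r", c), 0.0) * a_r
        eta += (weights.get(("d,c", d, c), 0.0)
                + weights.get(("d,c,r", d, c), 0.0) * a_r)
    return math.exp(eta)

def l1_penalty(weights):
    """Sum of absolute values of all model weights."""
    return sum(abs(w) for w in weights.values())

# Hypothetical sparse model: running doubles the response, and the
# (0 degrees, 5% contrast) condition triples it.
weights = {"r": math.log(2.0), ("d,c", 0, 5): math.log(3.0)}
stationary_pred = glm_predict(weights, 0.0, blank=False, running=False,
                              direction=0, contrast=5)
running_pred = glm_predict(weights, 0.0, blank=False, running=True,
                           direction=0, contrast=5)
blank_pred = glm_predict(weights, 0.0, blank=True, running=False)
```

Because the weights are summed inside the exponential, their effects combine multiplicatively: the running trial in this example is predicted to be twice the stationary one.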

Stabilized supralinear network (SSN) model

The SSN was modeled as a ring network, largely maintaining the basic architecture and dynamics described in Rubin et al., 2015 but deviating primarily in the diversity of inhibitory neurons and distributions of connections between neuron populations (including untuned inhibitory connections, described below). Our network consisted of one excitatory population (representing layer 2/3 CUX2 pyramidal neurons) and three inhibitory populations (representing PV, SST, and VIP interneurons, respectively). The ring network structure was imposed by providing each excitatory neuron with external (‘sensory’) excitatory input that had Gaussian tuning with the mean (i.e. peak/preferred direction) corresponding to the neuron’s position on the ring and standard deviation of 30 degrees; PV neurons also received external input which was not tuned (i.e. all PV cells receive input of equal strength). The entire network covered 180 degrees of orientation (or direction). The strength of external input was intended to represent a monotonically-increasing function of stimulus contrast, though no specific relationship between input magnitude and contrast is claimed here.

Connections between neurons also had Gaussian tuning that depended on the difference between the orientation preferences of the pre- and post-synaptic neurons (Figure 5b). The distributions of recurrent excitatory connections onto CUX2 cells and excitatory connections onto VIP cells were narrow (standard deviation of 30 degrees) compared to the distributions of connections to and from PV and SST cells (standard deviation of 100 degrees).

The network consisted of 184 excitatory neurons, 40 PV neurons, 15 SST neurons, and 15 VIP neurons. The excitatory population had 180 neurons with uniform 1-degree spacing of peak directions to tile the ring, plus four extra neurons with peak direction of zero degrees to capture the slight bias of the CUX2 neurons. All model VIP neurons had a peak direction of zero degrees to capture the strong bias for front-to-back motion observed for VIP neurons. In addition, all SST and PV model neurons also had a peak direction of zero degrees, though the very broadly-tuned inputs to these neurons result in a much weaker bias of net input to these neurons than the bias to VIP neurons. All neurons were implemented as rate models with firing rate that was a rectified quadratic function of the summed input to the neuron,

$$r_{ss}(I)=\begin{cases}kI^2 & I>0\\0 & I\le 0\end{cases}$$

where I is the input strength, rss is the steady state firing rate, and k is a constant of proportionality. For ease of comparison with the SSN models developed by Rubin et al., 2015, we used k=0.04 for all models.

For a given external input, the firing rates of all neurons in the network were obtained by evolving the network in time, with dynamics:

$$\dot{r}=r_{ss}\left(I_{sum}(t)\right)-r(t)$$
$$I_{sum}^{j}(t)=I_{sp}^{j}+\sum_i W_{i,j}\,r_i(t)$$

where $r(t)$ is the time-dependent firing rate, $\dot{r}$ is the time derivative of the neuron’s firing rate, $r_{ss}$ is the steady state firing rate that varies in time based on the inputs to the neuron, $I_{sum}^{j}$ is the net input to neuron $j$, $I_{sp}^{j}$ is a constant spontaneous input to neuron $j$, and $W_{i,j}$ is the connection strength from presynaptic neuron $i$ onto postsynaptic neuron $j$. To provide spontaneous activity to the network, and account for the higher spontaneous activity of VIP neurons (Roux and Buzsáki, 2015), we set $I_{sp}^{CUX2}=I_{sp}^{PV}=I_{sp}^{SST}=2$ and $I_{sp}^{VIP}=10$. The network is evolved with Euler integration with updates of $\Delta r_j=\frac{\Delta t}{\tau_j}\dot{r}_j$ at each time step of $\Delta t=0.1$ ms, where the time constants of the different neuron types are $\tau_{CUX2}=\tau_{SST}=\tau_{VIP}=20$ ms and $\tau_{PV}=10$ ms.
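The rate dynamics and the Euler update can be illustrated with a deliberately reduced sketch. This is not the ring network: it contains only one VIP unit and one SST unit, with a single VIP-to-SST weight of -0.6 chosen for illustration, no external input, and hypothetical function names. It shows only how the rectified quadratic transfer function and the update rule combine.

```python
def steady_state_rate(total_input, k=0.04):
    """Rectified quadratic transfer function."""
    return k * total_input ** 2 if total_input > 0 else 0.0

def simulate(weights, spont_input, tau_ms, dt_ms=0.1, n_steps=4000):
    """Euler-integrate dr/dt = r_ss(I_sum) - r for every unit.

    weights[i][j] is the connection strength from unit i onto unit j.
    """
    n = len(spont_input)
    rates = [0.0] * n
    for _ in range(n_steps):
        new_rates = []
        for j in range(n):
            net = spont_input[j] + sum(weights[i][j] * rates[i] for i in range(n))
            drdt = steady_state_rate(net) - rates[j]
            new_rates.append(rates[j] + dt_ms / tau_ms[j] * drdt)
        rates = new_rates
    return rates

# Unit 0 = VIP (spontaneous input 10), unit 1 = SST (spontaneous input 2).
weights = [[0.0, -0.6],   # VIP inhibits SST (illustrative weight)
           [0.0, 0.0]]
vip_rate, sst_rate = simulate(weights, spont_input=[10.0, 2.0], tau_ms=[20.0, 20.0])
```

In this toy case the VIP unit settles at $kI^2 = 0.04 \times 10^2 = 4$, which drives the net input to the SST unit below zero and silences it.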

We calculated the stability of the steady state of activity at zero degrees with respect to a spatially homogeneous perturbation of the inputs. The stability matrix is, in the spatial Fourier domain,

$$J_{ij}(x,n)=g_i(x)\,W_{ij}\,G_{ij}(x,n)$$

where x is the postsynaptic location in degrees, n is the spatial frequency corresponding to the orientation difference between the presynaptic and postsynaptic cells, gi(x) is the postsynaptic gain at its steady-state rate, W is the weight matrix and Gij(x,n) is the Fourier transform of the wrapped Gaussian connectivity profile:

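$$G(x,n)=\frac{30}{2\pi}\begin{pmatrix}e^{-2(n\pi\sigma_e)^2} & e^{-2(n\pi\sigma_b)^2} & e^{-2(n\pi\sigma_b)^2} & e^{-2(n\pi\sigma_b)^2}\\e^{-2(n\pi\sigma_b)^2} & e^{-2(n\pi\sigma_e)^2} & e^{-2(n\pi\sigma_b)^2} & e^{-2(n\pi\sigma_e)^2}\\e^{-2(n\pi\sigma_b)^2} & e^{-2(n\pi\sigma_b)^2} & e^{-2(n\pi\sigma_e)^2} & e^{-2(n\pi\sigma_b)^2}\\e^{-2(n\pi\sigma_b)^2} & e^{-2(n\pi\sigma_e)^2} & e^{-2(n\pi\sigma_b)^2} & e^{-2(n\pi\sigma_e)^2}\end{pmatrix}$$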

where $\sigma_e=30$ degrees is the projection width for E→E, E→VIP and VIP projections and $\sigma_b=100$ degrees is the projection width for the remaining inhibitory projections. If $\text{E-E stability}=J_{00}(x,n)-1$ is greater than zero, the network at orientation $x$ is in an inhibitory-stabilized state with respect to perturbations at spatial frequency $n$.

Acknowledgements

The authors wish to thank the Allen Institute founder, Paul G Allen, for his vision, encouragement, and support.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Daniel J Millman, Email: danielm@alleninstitute.org.

Martin Vinck, Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, Germany.

Kate M Wassum, University of California, Los Angeles, United States.

Funding Information

This paper was supported by the following grant:

  • Allen Institute for Brain Science to Daniel J Millman, Gabriel Koch Ocker, Shiella Caldejon, India Kato, Josh D Larkin, Eric Kenji Lee, Jennifer Luviano, Chelsea Nayan, Thuyanh V Nguyen, Kat North, Sam Seid, Cassandra White, Jerome Lecoq, Clay Reid, Michael A Buice, Saskia EJ de Vries.

Additional information

Competing interests

No competing interests declared.

Author contributions

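Daniel J Millman: Formal analysis, Methodology, Writing - original draft, Writing - review and editing.

Gabriel Koch Ocker: Formal analysis, Methodology, Writing - review and editing.

Shiella Caldejon: Investigation.

India Kato: Investigation.

Josh D Larkin: Investigation.

Eric Kenji Lee: Investigation.

Jennifer Luviano: Investigation.

Chelsea Nayan: Investigation.

Thuyanh V Nguyen: Investigation.

Kat North: Investigation.

Sam Seid: Investigation.

Cassandra White: Investigation.

Jerome Lecoq: Supervision, Methodology.

Clay Reid: Supervision.

Michael A Buice: Conceptualization, Supervision, Writing - review and editing.

Saskia EJ de Vries: Conceptualization, Formal analysis, Supervision, Methodology, Writing - review and editing.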

Ethics

Animal experimentation: All animal procedures were approved by the Institutional Animal Care and Use Committee (IACUC) at the Allen Institute for Brain Science.

Additional files

Transparent reporting form

Data availability

The data generated and analyzed in this study are available on DANDI: Distributed Archives for Neurophysiology Data Integration. All analyses were performed using custom scripts written in Python 2.7, using NumPy, SciPy, Pandas, Matplotlib, statsmodels, and Scikit-learn. Analysis code is available at https://github.com/AllenInstitute/Contrast_Analysis (copy archived at https://archive.softwareheritage.org/swh:1:rev:c7ddda11647093e8a0173dbd2a1986ac6239c821/). Event extraction was performed using FastLZeroSpikeInference available at https://github.com/jewellsean/FastLZeroSpikeInference.

The following dataset was generated:

Millman DJ, de Vries SE. 2020. Allen Institute – Contrast tuning in mouse visual cortex with calcium imaging. DANDI. 000039

References

  1. Adesnik H, Bruns W, Taniguchi H, Huang ZJ, Scanziani M. A neural circuit for spatial summation in visual cortex. Nature. 2012;490:226–231. doi: 10.1038/nature11526.
  2. Adesnik H. Synaptic mechanisms of feature coding in the visual cortex of awake mice. Neuron. 2017;95:1147–1159. doi: 10.1016/j.neuron.2017.08.014.
  3. Ahmadian Y, Rubin DB, Miller KD. Analysis of the stabilized supralinear network. Neural Computation. 2013;25:1994–2037. doi: 10.1162/NECO_a_00472.
  4. Carandini M, Heeger DJ. Normalization as a canonical neural computation. Nature Reviews Neuroscience. 2012;13:51–62. doi: 10.1038/nrn3136.
  5. Cardin JA. Inhibitory interneurons regulate temporal precision and correlations in cortical circuits. Trends in Neurosciences. 2018;41:689–700. doi: 10.1016/j.tins.2018.07.015.
  6. Cavanaugh JR, Bair W, Movshon JA. Nature and interaction of signals from the receptive field center and surround in macaque V1 neurons. Journal of Neurophysiology. 2002;88:2530–2546. doi: 10.1152/jn.00692.2001.
  7. Cone JJ, Scantlen MD, Histed MH, Maunsell JHR. Different inhibitory interneuron cell classes make distinct contributions to visual contrast perception. Eneuro. 2019;6:ENEURO.0337-18.2019. doi: 10.1523/ENEURO.0337-18.2019.
  8. Cruikshank SJ, Urabe H, Nurmikko AV, Connors BW. Pathway-specific feedforward circuits between thalamus and neocortex revealed by selective optical stimulation of axons. Neuron. 2010;65:230–245. doi: 10.1016/j.neuron.2009.12.025.
  9. de Vries SEJ, Lecoq JA, Buice MA, Groblewski PA, Ocker GK, Oliver M, Feng D, Cain N, Ledochowitsch P, Millman D, Roll K, Garrett M, Keenan T, Kuan L, Mihalas S, Olsen S, Thompson C, Wakeman W, Waters J, Williams D, Barber C, Berbesque N, Blanchard B, Bowles N, Caldejon SD, Casal L, Cho A, Cross S, Dang C, Dolbeare T, Edwards M, Galbraith J, Gaudreault N, Gilbert TL, Griffin F, Hargrave P, Howard R, Huang L, Jewell S, Keller N, Knoblich U, Larkin JD, Larsen R, Lau C, Lee E, Lee F, Leon A, Li L, Long F, Luviano J, Mace K, Nguyen T, Perkins J, Robertson M, Seid S, Shea-Brown E, Shi J, Sjoquist N, Slaughterbeck C, Sullivan D, Valenza R, White C, Williford A, Witten DM, Zhuang J, Zeng H, Farrell C, Ng L, Bernard A, Phillips JW, Reid RC, Koch C. A large-scale standardized physiological survey reveals functional organization of the mouse visual cortex. Nature Neuroscience. 2020;23:138–151. doi: 10.1038/s41593-019-0550-9.
  10. Dipoppa M, Ranson A, Krumin M, Pachitariu M, Carandini M, Harris KD. Vision and locomotion shape the interactions between neuron types in mouse visual cortex. Neuron. 2018;98:602–615. doi: 10.1016/j.neuron.2018.03.037.
  11. Fu Y, Tucciarone JM, Espinosa JS, Sheng N, Darcy DP, Nicoll RA, Huang ZJ, Stryker MP. A cortical circuit for gain control by behavioral state. Cell. 2014;156:1139–1152. doi: 10.1016/j.cell.2014.01.050.
  12. Glickfeld LL, Histed MH, Maunsell JH. Mouse primary visual cortex is used to detect both orientation and contrast changes. Journal of Neuroscience. 2013;33:19416–19422. doi: 10.1523/JNEUROSCI.3560-13.2013.
  13. Hennequin G, Ahmadian Y, Rubin DB, Lengyel M, Miller KD. The dynamical regime of sensory cortex: stable dynamics around a single Stimulus-Tuned attractor account for patterns of noise variability. Neuron. 2018;98:846–860. doi: 10.1016/j.neuron.2018.04.017.
  14. Hertäg L, Sprekeler H. Amplifying the redistribution of somato-dendritic inhibition by the interplay of three interneuron types. PLOS Computational Biology. 2019;15:e1006999. doi: 10.1371/journal.pcbi.1006999.
  15. Heuer HW, Britten KH. Contrast dependence of response normalization in area MT of the rhesus macaque. Journal of Neurophysiology. 2002;88:3398–3408. doi: 10.1152/jn.00255.2002.
  16. Huang L, Knoblich U, Ledochowitsch P, Lecoq J, Reid RD, de Vries SEJ. Relationship between simultaneously recorded spiking activity and fluorescence signal in GCaMP6 transgenic mice. bioRxiv. 2019 doi: 10.1101/788802.
  17. Jewell SW, Hocking TD, Fearnhead P, Witten DM. Fast nonconvex deconvolution of calcium imaging data. Biostatistics. 2019;10:kxy083. doi: 10.1093/biostatistics/kxy083.
  18. Jewell S, Witten D. Exact spike train inference via $\ell_{0}$ optimization. The Annals of Applied Statistics. 2018;12:2457–2482. doi: 10.1214/18-AOAS1162.
  19. Kerlin AM, Andermann ML, Berezovskii VK, Reid RC. Broadly tuned response properties of diverse inhibitory neuron subtypes in mouse visual cortex. Neuron. 2010;67:858–871. doi: 10.1016/j.neuron.2010.08.002.
  20. Kraynyukova N, Tchumatchenko T. Stabilized supralinear network can give rise to Bistable, Oscillatory, and persistent activity. PNAS. 2018;115:3464–3469. doi: 10.1073/pnas.1700080115.
  21. Ledochowitsch P, Huang L, Knoblich U, Oliver M, Lecoq J, Reid C, Li L, Zeng H, Koch C, Water J, de Vries SEJ, Buice MA. On the correspondence of electrical and optical physiology in in vivo population-scale two-photon calcium imaging. bioRxiv. 2019 doi: 10.1101/800102.
  22. Levick WR. Receptive fields and trigger features of ganglion cells in the visual streak of the rabbits retina. The Journal of Physiology. 1967;188:285–307. doi: 10.1113/jphysiol.1967.sp008140.
  23. Linaro D, Ocker GK, Doiron B, Giugliano M. Correlation transfer by layer 5 cortical neurons under recreated synaptic inputs In Vitro. The Journal of Neuroscience. 2019;39:7648–7663. doi: 10.1523/JNEUROSCI.3169-18.2019.
  24. Lindsay GW, Rubin DB, Miller KD. A unified circuit model of attention: neural and behavioral effects. bioRxiv. 2020 doi: 10.1101/2019.12.13.875534.
  25. Liu LD, Miller KD, Pack CC. A unifying motif for spatial and directional surround suppression. The Journal of Neuroscience. 2018;38:989–999. doi: 10.1523/JNEUROSCI.2386-17.2017.
  26. Margrie T, Brecht M, Sakmann B. In vivo, low-resistance, whole-cell recordings from neurons in the anaesthetized and awake mammalian brain. Pflugers Archiv European Journal of Physiology. 2002;444:491–498. doi: 10.1007/s00424-002-0831-z.
  27. Marshel JH, Kaye AP, Nauhaus I, Callaway EM. Anterior-posterior direction opponency in the superficial mouse lateral geniculate nucleus. Neuron. 2012;76:713–720. doi: 10.1016/j.neuron.2012.09.021.
  28. Mesik L, Ma W, Li L, Ibrahim LA, Huang ZJ, Zhang LI, Tao HW. Functional response properties of VIP-expressing inhibitory neurons in mouse visual and auditory cortex. Frontiers in Neural Circuits. 2015;09:1–14. doi: 10.3389/fncir.2015.00022.
  29. Miller KD, Troyer TW. Visual cortex neurons of monkeys and cats: temporal dynamics of the contrast response function. Journal of Neurophysiology. 2002;88:652–659. doi: 10.1152/jn.2002.88.2.888.
  30. Pfeffer CK, Xue M, He M, Huang ZJ, Scanziani M. Inhibition of inhibition in visual cortex: the logic of connections between molecularly distinct interneurons. Nature Neuroscience. 2013;16:1068–1076. doi: 10.1038/nn.3446.
  31. Pi HJ, Hangya B, Kvitsiani D, Sanders JI, Huang ZJ, Kepecs A. Cortical interneurons that specialize in disinhibitory control. Nature. 2013;503:521–524. doi: 10.1038/nature12676.
  32. Priebe NJ, Ferster D. Inhibition, spike threshold, and stimulus selectivity in primary visual cortex. Neuron. 2008;57:482–497. doi: 10.1016/j.neuron.2008.02.005.
  33. Rodieck RW. Receptive fields in the cat retina: a new type. Science. 1967;157:90–92. doi: 10.1126/science.157.3784.90.
  34. Roux L, Buzsáki G. Tasks for inhibitory interneurons in intact brain circuits. Neuropharmacology. 2015;88:10–23. doi: 10.1016/j.neuropharm.2014.09.011.
  35. Rubin DB, Van Hooser SD, Miller KD. The stabilized supralinear network: a unifying circuit motif underlying multi-input integration in sensory cortex. Neuron. 2015;85:402–417. doi: 10.1016/j.neuron.2014.12.026.
  36. Sanzeni A, Akitake B, Goldbach HC, Leedy CE, Brunel N, Histed MH. Inhibition stabilization is a widespread property of cortical networks. eLife. 2020;9:e54875. doi: 10.7554/eLife.54875.
  37. Scala F, Kobak D, Shan S, Bernaerts Y, Laturnus S, Cadwell CR, Hartmanis L, Froudarakis E, Castro JR, Tan ZH, Papadopoulos S, Patel SS, Sandberg R, Berens P, Jiang X, Tolias AS. Layer 4 of mouse neocortex differs in cell types and circuit organization between sensory Areas. Nature Communications. 2019;10:4174. doi: 10.1038/s41467-019-12058-z.
  38. Tasic B, Yao Z, Graybuck LT, Smith KA, Nguyen TN, Bertagnolli D, Goldy J, Garren E, Economo MN, Viswanathan S, Penn O, Bakken T, Menon V, Miller J, Fong O, Hirokawa KE, Lathia K, Rimorin C, Tieu M, Larsen R, Casper T, Barkan E, Kroll M, Parry S, Shapovalova NV, Hirschstein D, Pendergraft J, Sullivan HA, Kim TK, Szafer A, Dee N, Groblewski P, Wickersham I, Cetin A, Harris JA, Levi BP, Sunkin SM, Madisen L, Daigle TL, Looger L, Bernard A, Phillips J, Lein E, Hawrylycz M, Svoboda K, Jones AR, Koch C, Zeng H. Shared and distinct transcriptomic cell types across neocortical Areas. Nature. 2018;563:72–78. doi: 10.1038/s41586-018-0654-5.
  39. Tremblay R, Lee S, Rudy B. GABAergic interneurons in the neocortex: from cellular properties to circuits. Neuron. 2016;91:260–292. doi: 10.1016/j.neuron.2016.06.033.
  40. Zhang S, Xu M, Kamigaki T, Hoang Do JP, Chang WC, Jenvay S, Miyamichi K, Luo L, Dan Y. Long-range and local circuits for top-down modulation of visual cortex processing. Science. 2014;345:660–665. doi: 10.1126/science.1254126.
  41. Zhang J, Larsen RS, Takasaki KT, Ouellette ND, Daigle TL, Tasic B, Waters J, Zeng H, Reid RC. The spatial structure of feedforward information in mouse primary visual cortex. bioRxiv. 2020 doi: 10.1101/2019.12.24.888156.

Decision letter

Editor: Martin Vinck

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

Millman et al. show new insights into the function and visual response properties of a main subclass of vasoactive intestinal peptide-expressing (VIP) interneurons in area V1 of the mouse. They find that VIP interneurons are most strongly activated by low-contrast stimuli with front-to-back motion, which is congruent with self-motion that occurs during locomotion, another main factor that is known to drive VIP interneuron activity. The authors develop a model suggesting that this accounts for the suppression of SSt+ (Somatostatin) interneurons at low luminance-contrasts, and the surprising preference of Layer 2/3 pyramidal cells for low-contrast stimuli.

Decision letter after peer review:

Thank you for submitting your article "VIP interneurons selectively enhance weak but behaviorally-relevant stimuli" for consideration by eLife. Your article has been reviewed by four peer reviewers, one of whom, Martin Vinck, is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Kate Wassum as the Senior Editor. The other reviewers have opted to remain anonymous.

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

VIP interneurons are a class of GABAergic interneurons that can disinhibit pyramidal neurons via inhibition of SST interneurons. Millman et al. describe a novel role for VIP-driven disinhibition of pyramidal cells in the mouse primary visual cortex (V1). They report surprising and novel cell type-specific responses in the mouse visual cortex. First, VIP-positive interneurons show stereotyped direction tuning, strikingly different from the pyramidal cell population, suggesting that inputs to these neurons are related to a top-down input that signals behavioral relevance of the stimulus. The finding that VIP responses are enhanced by locomotion fits with this hypothesis. Second, VIP responses are suppressed at high contrast. Finally, a model based on the SSN architecture can reproduce some of the major features of the data. Together, this work supports the notion that VIP neuron activity increases sensitivity to weak inputs, whereas SST neuron activity decreases sensitivity to strong inputs and maintains network stability. This interesting conjunction leads to heightened sensitivity to detect low-contrast objects or obstacles approaching head-on during locomotion. Thus, VIP neurons promote high gain in the local microcircuit at low contrast conditions.

General assessment:

All reviewers agreed that the study has substantial merits and makes an important, novel contribution to the field.

Essential revisions:

The reviewers raise a number of concerns that must be adequately addressed before the paper can be accepted. Some of the required revisions will require further data analysis.

1) Data representation and interpretation of calcium signals

The data shown are highly overprocessed. The authors should include raw data traces and images, and describe their methods comprehensively in the paper. Data in Figure 1 should be shown with proper normalization (percent change) as well as raw data, and not as percentiles. While the authors use an established "pipeline" for data collection, they should include raw calcium imaging traces and images (at least in the Materials and methods). They should describe their methods comprehensively in the paper. The paper should be self-contained, and the authors should provide a representation of how fluorescence images were converted into the color maps in Figure 1. Inclusion of raw data is necessary for readers to get a sense of what is measured and possible caveats. For example, it could be possible that VIP neurons have a saturating response at high contrast and the calcium does not decrease to baseline between trials, resulting in what looks like a suppressed response. A further consideration is the fact that imaging depths vary throughout the manuscript, and GABAergic neuronal activity can be substantially higher than pyramidal cell activity, which should have a "smoothing" impact on calcium traces. The authors should better explain how they took this into consideration in their estimation of various metrics. The authors should further motivate why data was averaged over running and stationary trials in Figure 1.

2) Authors should discuss how GCaMP signals relate to spikes across different cell types (for example, see Khan et al., Nat Neurosci 2018, Figure 2E and Supplementary Figure 6, where a comparison is made between calcium imaging and extracellular recording in Vip, PV and Sst neurons; see also Huang et al., bioRxiv).

3) It is unclear why the authors used Cre lines to image different pyramidal neuron populations, along with imaging at different depths. A possible reason could be to avoid contamination from dendrites of neurons in different layers. If this is the reason, it should be stated in the paper. In addition, the reviewers asked the authors to comment on the idiosyncrasies of the Cre lines in light of recent scRNAseq data.

4) Response percentiles in Figure 1B are reported, but these are never corrected for the many multiple comparisons, such that the significance is merely a normalized measure of strength. However, percentiles are deceiving because they hide the amplitude of the responses. Data should be represented as percent change to obtain an idea about the effect sizes shown in this study, because a small effect might correspond to a large normalized measure of strength. In general, the color maps in Figures 1 and 2 are unconventional and need to be better justified. It is hard to judge how strong the responses are overall, and what the differences between cell types mean.

With respect to questions about effect size: reviewers asked whether the small numbers throughout Figure 1B are significant, and pointed out that some of these cells appear to be barely responsive. Reviewers were surprised about the seemingly tiny responses in Figure 2O and P, and questioned whether this can be a meaningful code for direction. What do these magnitudes mean in terms of spike numbers?

5) Averaging the raw magnitudes in Figure 2 across neurons could be a highly insensitive analysis and lead to conclusions that are dominated by a few neurons with high baseline firing rates. A standard way to analyze this kind of data would be to normalize the responses per neuron by, for example, dividing by the number of baseline events. Further, why are the response magnitudes in Figure 2 not normalized as in Figure 1B?

6) Figure 1C: It appears that no multiple-comparison corrected statistics are provided on the differences in population orientation tuning between different lines. Given that this study is explorative, inferential statistics should be performed to compare mouse lines and should be corrected for multiple comparisons (contrasts x Cre lines). The text section, "CUX2 neurons in layer 2/3 showed direction bias toward front-to-back motion at 5% and 10% contrast but not at higher contrasts (Figure 1C); pyramidal neurons in deeper layers did not have direction bias," does not seem to be supported by statistics. A statistical test should be provided for the difference between low and high contrasts. Furthermore, the statistical difference between superficial layer and deep layer is suggested by the text, but appears to be never explicitly tested. The authors should carefully check that all of their claims on differences between groups or conditions are supported by multiple-comparison corrected statistics. A general problem in this study is that claims on differences between lines are made, but never made explicit through testing. An example is the description: "Contrast preference among pyramidal neurons systematically varied across cortical layers, exhibiting a progression from a mixture of low and high contrast-preferring neurons in layer 2/3 to almost exclusively high contrast-preferring neurons in layers 5 and 6 (Figure 1F, G)." The claim of differences among cell lines should be substantiated with statistics that test whether the different mouse lines are significantly different (the unit of analysis should be the mouse).
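The reviewers do not name a specific correction procedure. As one possible illustration only, the sketch below applies Holm's step-down method to a family of p-values, such as one p-value per contrast and Cre line combination. The choice of Holm's method, the function name, and the p-values are all assumptions made for this example and are not taken from the manuscript.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return one boolean per test: True where Holm's rule rejects the null.

    The p-values are sorted in ascending order and the k-th smallest
    (k = 0, 1, ...) is compared with alpha / (m - k). Testing stops at the
    first failure, and every larger p-value is then also not rejected.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break
    return reject


if __name__ == "__main__":
    # Hypothetical p-values for four contrast x Cre line comparisons.
    hypothetical_p = [0.001, 0.04, 0.03, 0.2]
    print(holm_bonferroni(hypothetical_p))
```

With these hypothetical inputs only the smallest p-value survives correction, although three of the four would pass an uncorrected 0.05 threshold.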

7) Claims on interactions between variables

Claims about the interactions between variables need to be demonstrated by statistics. In principal cells, direction tuning is canonically thought to be largely invariant to contrast tuning. The authors should explicitly analyse whether the direction tuning for VIP interneurons is invariant to contrast, and whether the front-to-back preference is indeed specific for low contrast (as the authors suggest), i.e. test against the alternative that contrast simply gain-modulates the direction tuning. A similar question pertains to the interaction between locomotion and direction tuning. The manuscript description of Figure 2 suggests an interaction between the Locomotion/Stationary variable with contrast and direction tuning. However, this does not appear to be statistically quantified, and it is possible that contrast tuning curves are modulated in a multiplicative way (i.e. invariant to locomotion). A similar comment pertains to, "This analysis demonstrates a substantial enhancement of responses to low contrast visual stimuli during locomotion that is specific to layer 2/3 pyramidal neurons and VIP neurons." There could be a general multiplicative increase in firing across cell lines and contrasts, and Figure 2 does not demonstrate that low contrast visual stimuli are enhanced specifically during locomotion.

8) Imaging depths and layers

The authors need to carefully check their depths and laminar assignments and update the text on this. The authors should more carefully discuss in the manuscript what type of SSt and VIP cells were imaged and discuss the implications of interneuron heterogeneity. Is there any particular reason the mice from the SSt-Cre mouse line were imaged in layer 4 as stated in the Materials and methods? As the authors know, dendrite-targeting Martinotti cells are more likely to be found in Layers 2/3 and 5 (Munoz et al., 2017, Science), and the Agmon Lab found that SSt-expressing neurons from layer 4 barrel cortex had unique and different electrophysiological properties from other SSt neurons (Ma et al., 2006, J.Neurosci). If, indeed, this is the only imaging plane used for the SSt-Cre mouse line, how come the authors describe the activity of SSt neurons in layers 2/3 (main text, fourth paragraph)? The same applies to layer 4 VIP cell activity (same lines). Generally, there is a bit of confusion with the reported imaging depths and the authors should comment on this. In particular, the laminar assignment of the SSt neurons needs to be clarified. According to the depth of 275 micrometers, one would say these are layer 2/3, but the authors write layer 4. In general, the laminar assignment based on depth should be justified if it is not based on a layer-specific mouse line.

9) Generalizability

Due to the focus on very specific stimuli, the scope of the study is limited and the results may not generalize beyond large drifting gratings. For instance, it is unclear whether these findings will hold up for smaller stimuli. Other stimulus parameters are not explored systematically (spatial frequency or temporal frequency). Stimulus size is ignored, even though stimulus size could have dramatic effects on the findings presented here. The authors need to discuss these limitations in the manuscript. In particular, they should comment on the issue of size tuning, which is missing in the manuscript. Previous studies have shown a very clear size dependence of VIP and SSt neurons (Adesnik and Scanziani, 2012; DiPoppa et al., 2018). The contrast dependence of the VIP and SSt neurons is likely strongly dependent on stimulus size. For instance, VIP neurons have small receptive fields (DiPoppa et al., 2018). Because surround modulation is contrast dependent, suppression of PC L2/3 cell activity and VIP activity likely depends on the stimulus size. The authors should discuss these limitations.

10) Interpretation of SSt direction tuning and VIP suppression

The authors should discuss the interpretation of some of the main effects. In particular, reviewers wondered why VIP neurons are suppressed at high contrast. They also wondered why the response of SST neurons at low contrast is not direction modulated, given that the VIP neurons have strong direction tuning. Do VIP interneurons indeed suppress the SSt neurons recorded here? This assumption is being made but it is never tested or argued for these specific laminar recordings from the literature.

11) Optic Flow

The interpretation of optic flow appears to be problematic and needs to be discussed. The authors interpret their data in terms of optic flow related to locomotion. However, the peak of VIP neurons occurs at 45 degrees rather than 0 degrees, but the authors write that 0 degrees corresponds to front-to-back motion (main text, second paragraph). Does the interpretation of the authors make sense given this discrepancy?

12) Modelling:

There are major concerns about parameter and model selection. Several findings are hard to account for with the model. The description of the model needs to be substantially improved. The tuning of VIP cells in 3b is much wider than that observed in the data (1b), and this could be a major problem for the model. The authors should explain this. The authors should make a greater effort to explain the model in the text. For the modelling, authors need to provide more explanation on why SSNs were selected, how they work, etc. In the current version of the paper, it is necessary to refer to the very difficult Rubin et al., 2015 paper to understand what the model does. Also, reviewers would like to see that the canonical results from Rubin et al. still work in the revised model, and commented that this is an essential addition to the paper. The authors should include a discussion of the process of parameter selection for the model (how much fine tuning was required, what happens when connection weights deviate somewhat, etc.). In general, the scope of the model appears to be somewhat limited given that model parameters are not systematically explored or fitted based on the data. Further, the model makes strong assumptions on the inputs that these different neuron types presumably receive from external drive. These limitations need to be discussed. The main text should be improved with a more in-depth explanation for using SSNs (stabilized supra-linear networks). Finally, rectified quadratic response functions for neurons are not common in RNNs, and should be motivated. Also, no explanation/citation is provided for the choice of parameters (input, connections) for PV neurons.

With respect to the conclusion – "These results indicate that Vip-mediated disinhibition is capable of producing substantial increases in gain at low contrast despite low activity of the intermediate SST neuron population." The authors should explain why this is not a circular argument and a strange way to summarize the results of the model. Reviewers pointed out that it is clear from Figure 3D that the suppression of Sst activity by Vip neurons at low contrast is what enables supra-linear responses of Cux2 neurons at low contrast.

The bias of direction tuning in L2/3 neurons is very weak. How can the authors account for the contrast enhancement of these neurons in their VIP model? Should this enhancement not be extremely specific to the 45 degrees angle?

eLife. 2020 Oct 27;9:e55130. doi: 10.7554/eLife.55130.sa2

Author response


Essential revisions:

The reviewers raise a number of concerns that must be adequately addressed before the paper can be accepted. Some of the required revisions will require further data analysis.

1) Data representation and interpretation of calcium signals

The data shown are highly overprocessed. The authors should include raw data traces and images, and describe their methods comprehensively in the paper. Data in Figure 1 should be shown with proper normalization (percent change) as well as raw data, and not as percentiles. While the authors use an established "pipeline" for data collection, they should include raw calcium imaging traces and images (at least in the Materials and methods). They should describe their methods comprehensively in the paper. The paper should be self-contained, and the authors should provide a representation of how fluorescence images were converted into the color maps in Figure 1. Inclusion of raw data is necessary for readers to get a sense of what is measured and possible caveats. For example, it could be possible that VIP neurons have a saturating response at high contrast and the calcium does not decrease to baseline between trials, resulting in what looks like a suppressed response. A further consideration is the fact that imaging depths vary throughout the manuscript, and GABAergic neuronal activity can be substantially higher than pyramidal cell activity, which should have a "smoothing" impact on calcium traces. The authors should better explain how they took this into consideration in their estimation of various metrics.

We have made several substantial changes to the main text, Materials and methods section, and figures to address these points. Together, we believe that these additions will resolve these issues.

– The revised manuscript includes a new Figure 1 that shows fluorescence traces for four example neurons from the key Cre lines, as well as stepwise transformations to “events” in the fluorescence traces and, finally, stimulus response magnitudes and tuning curves.

– We changed the normalization used in Figure 2B (previously Figure 1B), which we discuss further in our response to point 4 below.

– We have expanded the description of the processing of calcium imaging data in the Materials and methods section to make this paper self-contained.

– We have added additional detail on the imaging depths in the Materials and methods section and have carefully documented the imaged layer in the text and figures. See our reply to point 8 below for an extended discussion of imaging depths.

– We show that the fluorescence traces of VIP neurons are substantially suppressed in response to high contrast gratings, not saturated, in Figure 2—figure supplement 2. Therefore, the suppression of VIP responses as measured in events does not result from fluorescence saturation during high contrast grating presentations.

– As can be seen in the example dF/F traces in Figure 1, we observe clear calcium transients for inhibitory interneurons that are not substantially “smoother” than those of excitatory neurons. We are aware that the high spike rates and calcium buffering in PV+ interneurons can lead to highly smoothed traces, but that does not appear to be the case for the VIP and SST interneurons imaged in this study.

Based on this, we do not believe that our conclusions based on the tuning metrics we calculate (e.g. DSI, gOSI, etc.) are likely to be impacted by higher spontaneous rates of interneurons.

The authors should further motivate why data was averaged over running and stationary trials in Figure 1.

The motivation for this study was to investigate the influence of stimulus contrast and direction on neuron responses across Cre lines and layers. The findings regarding contrast and direction tuning described in Figure 2 (previously Figure 1), which is already a very dense full-page figure, motivate further examination of locomotion. Therefore, we address the influence of locomotion on contrast and direction tuning in the subsequent two figures. In particular, related to reviewers’ point 7 below, we have added a Generalized Linear Model analysis to specifically address the relationship between locomotion and stimulus direction and contrast.

2) Authors should discuss how GCaMP signals relate to spikes across different cell types (for example see Khan et al., Nat Neurosci 2018, Figure 2E and Supplementary Figure 6, where a comparison is made with calcium imaging and extracellular recording in Vip, PV and Sst neurons) (and Huang et al., bioRxiv).

Khan et al., 2018, show the relationship between DF/F transients and spikes for combined loose patch and 2P recordings in cortical slices, using injected GCaMP6 and imaged at an unknown magnification. These show that VIP and SST neurons exhibit the same relationship between DF/F and spikes, while the slope of that relationship is different for PV neurons.

Huang et al., 2019, shows the relationship between spikes and DF/F for multiple excitatory Cre lines in layer 2/3. The paper focuses mostly on differences between GCaMP6s and GCaMP6f, comparing transgenically expressed GCaMP with data for virally expressed GCaMP6 from Chen et al. The different Cre lines used for GCaMP6f (the reporter used in our study) are Cux2-CreERT2 (which we use) and Emx1-IRES-Cre. Within layer 2/3, these are both considered to be pan-excitatory, and the results in Huang et al. reveal that they show the same relationship as each other between spikes and DF/F amplitude.

Further, as Huang et al., 2019, alludes to, and is further examined in Ledochowitsch et al., 2019, the spatial and temporal imaging resolution is a critical parameter in this relationship. The analysis in Ledochowitsch et al., 2019, approximates the imaging resolutions used in our study, but again only considers two pan-excitatory Cre lines within layer 2/3.

Since the two studies use different imaging paradigms (slice vs. in vivo, unknown magnifications, etc.), we are unable to meaningfully compare the data in Khan et al. with Huang et al. Further, neither dataset speaks to potential differences in the relationship of spikes and DF/F between the VIP and SST inhibitory populations and pyramidal neurons, or between different pyramidal Cre lines at different imaging depths. As these are the questions we believe the reviewers are asking, and are most relevant to this study, it is unclear how we can address this without doing a separate study on this specific issue. While we agree that this is an important question for the field, it is beyond the scope of this paper.

3) It is unclear why the authors used Cre lines to image different pyramidal neuron populations, along with imaging at different depths. A possible reason could be to avoid contamination from dendrites of neurons in different layers. If this is the reason it should be stated in the paper.

The reviewers are precisely correct that the Cre lines were used in order to target neurons in specific layers without contamination from processes of neurons in different layers. We have now stated this point in the second paragraph of the main text.

In addition, they asked for something to be said about the idiosyncrasies of the Cre lines in light of recent scRNA-seq data.

Extended Figure 8 of Tasic et al., 2018, provides a summary table of the correspondence between Cre lines (including all 6 used here) and transcriptomic neuron subtypes. We have now discussed the transcriptomic types labeled by the Cre lines used in this study in the Materials and methods under the “Experimental Animals” section (second paragraph) and discuss the potential relationships between transcriptomic types and contrast tuning in the sixth paragraph of the main text. Briefly, scRNA-seq data have shown that the Vip-Cre and Sst-Cre lines provide broad coverage of VIP and SST neuron transcriptomic subtypes (16 and 21 subtypes, respectively). In layer 2/3, Cux2-CreERT2 labels all three excitatory neuron transcriptomic subtypes. Layer 4 contains only a single transcriptomic neuron type, which is sampled by the Rorb-Cre line. Rbp4-Cre_KL100 labels all twelve layer 5 neuron transcriptomic subtypes. Ntsr1-Cre labels all six layer 6 corticothalamic neuron transcriptomic subtypes.

4) Response percentiles in Figure 1B are reported, but these are never corrected for the many multiple comparisons, such that the significance is merely a normalized measure of strength. However, percentiles are deceiving because they hide the amplitude of the responses. Data should be represented as percent change to obtain an idea about the effect sizes shown in this study, because a small effect might correspond to a large normalized measure of strength. In general, the color maps in Figures 1 and 2 are unconventional and need to be justified better. It is hard to judge how strong the responses are overall, and what the differences between cell types mean.

We understand that the metric we had used, the percentile of the response in a bootstrapped distribution, was a source of confusion. The motivation behind this normalization is analogous to z-scoring the responses, with the additional step of generating a baseline distribution through bootstrapping rather than assuming a normal distribution since the responses with calcium imaging are non-Gaussian and highly skewed. Given the confusion, however, we have followed the reviewers’ suggestion and changed the metric to be fractional change.
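To make the difference between the two normalizations concrete, the following is a minimal sketch, not the analysis code used in the paper. The function names, the toy baseline, and the trial count are hypothetical and chosen only for illustration. It contrasts the percentile of a response within a bootstrapped baseline distribution with the fractional change from baseline.

```python
import random

def bootstrap_percentile(response, baseline_samples, n_trials, n_boot=2000, seed=0):
    """Percentile of a mean response within a bootstrapped null distribution.

    The null distribution is built by resampling `n_trials` baseline values
    with replacement and averaging, so no Gaussian assumption is needed.
    """
    rng = random.Random(seed)
    null_means = []
    for _ in range(n_boot):
        draw = [rng.choice(baseline_samples) for _ in range(n_trials)]
        null_means.append(sum(draw) / n_trials)
    below = sum(1 for m in null_means if m < response)
    return 100.0 * below / n_boot

def fractional_change(response, baseline):
    """Response expressed as a fractional change from the mean baseline."""
    return (response - baseline) / baseline

# Hypothetical skewed baseline: mostly zeros with occasional events.
baseline = [0.0] * 90 + [1.0] * 10   # mean = 0.1
evoked_mean = 0.3                     # hypothetical mean over 15 trials

pct = bootstrap_percentile(evoked_mean, baseline, n_trials=15)
frac = fractional_change(evoked_mean, sum(baseline) / len(baseline))
```

The percentile saturates near 100 for any response well above the null distribution, which is why it conveys reliability but hides amplitude, whereas the fractional change keeps growing with the size of the response.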

The color map used is “RdBu” from the Matplotlib Python package. We chose this color map specifically because it is perceptually uniform.

With respect to questions about effect size: reviewers asked whether the small numbers throughout Figure 1B are significant, and pointed out that some of these cells appear to be barely responsive. Reviewers were surprised about the seemingly tiny responses in Figure 2O and P, and questioned whether this can be a meaningful code for direction. What do these magnitudes mean in terms of spike numbers?

Note that this figure shows population average responses. The panels in question are for pyramidal neurons which have narrow direction tuning (relative to SST neurons) and varied peak directions. Therefore, we expect the response strength averaged over neurons to any direction in particular to be small. Also, we suspect that the potentially unfamiliar units we used (event magnitude per frame) led to the interpretation that the responses were very small. We have changed the units to “event magnitude per second” and show it as a percent rather than fraction. The relationship between these response magnitudes and the calcium fluorescence traces is now illustrated for example neurons in a new Figure 1.

5) Averaging the raw magnitudes in Figure 2 across neurons could be a highly insensitive analysis and lead to conclusions that are dominated by a few neurons with high baseline firing rates. A standard way to analyze this kind of data would be to normalize the responses per neuron by, for example, dividing by the number of baseline events.

We have added a new supplementary figure (Figure 3—figure supplement 1) in which we show the full distribution of single neuron response magnitudes as well as box plots to show the quartiles of the distributions.

Further why are the response magnitudes in Figure 2 not normalized as in Figure 1B?

The normalization in Figure 2B (previously Figure 1B) has changed in response to point 4.

6) Figure 1C: It appears that no multiple-comparison corrected statistics are provided on the differences in population orientation tuning between different lines. Given that this study is explorative, inferential statistics should be performed to compare mice lines and should be corrected for multiple comparisons (contrasts x cre lines). The text section, "CUX2 neurons in layer 2/3 showed direction bias toward front-to-back motion at 5% and 10% contrast but not at higher contrasts (Figure 1C); pyramidal neurons in deeper layers did not have direction bias." seems not to be motivated by statistics. A statistical test should be provided for the difference between low and high contrasts.

The 95% confidence intervals shown in Figure 2C (previously Figure 1C) are now corrected for multiple comparisons and the figure legend has been updated to reflect this change.

Furthermore, the statistical difference between superficial layer and deep layer is suggested by the text, but appears to be never explicitly tested. The authors should carefully check that all of their claims on differences between groups or conditions are supported by multiple-comparison corrected statistics. A general problem in this study is that claims on differences between lines are made, but never made explicit through testing. An example is the description: "79: Contrast preference among pyramidal neurons systematically varied across cortical layers, exhibiting a progression from a mixture of low and high contrast-preferring neurons in layer 2/3 to almost exclusively high contrast-preferring neurons in layers 5 and 6 (Figure 1F, G)." The claim of differences among cell lines should be substantiated with statistics that test whether the different mouse lines are significantly different (the unit of analysis should be the mouse).

The revised manuscript includes a new Figure 2E that shows the results of a new statistical analysis of contrast tuning across Cre lines and layers. In this panel, we show bootstrapped confidence intervals on the fractions of low contrast preferring neurons and high contrast preferring neurons as well as pairwise tests between the fractions of low contrast preferring pyramidal neurons across layers. We use the mouse as the unit of analysis for the bootstrapping.
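As an illustration of what bootstrapping with the mouse as the unit of analysis can look like, here is a minimal sketch. It is not the code used for Figure 2E; the function name, the data layout (a dictionary of per-mouse boolean lists), and the example values are all hypothetical.

```python
import random

def bootstrap_fraction_ci(cells_by_mouse, n_boot=2000, alpha=0.05, seed=0):
    """Confidence interval on a fraction, resampling mice (not cells).

    cells_by_mouse: dict mapping a mouse ID to a list of booleans, one per
    neuron, True if the neuron is (for example) low contrast preferring.
    """
    rng = random.Random(seed)
    mice = list(cells_by_mouse)
    fractions = []
    for _ in range(n_boot):
        # Resample whole mice with replacement so that neurons from the
        # same animal are never treated as independent samples.
        sample = [rng.choice(mice) for _ in mice]
        flags = [flag for m in sample for flag in cells_by_mouse[m]]
        fractions.append(sum(flags) / len(flags))
    fractions.sort()
    lo = fractions[int((alpha / 2) * n_boot)]
    hi = fractions[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical data: three mice with ten neurons each.
example = {
    "mouse_a": [True] * 6 + [False] * 4,
    "mouse_b": [True] * 3 + [False] * 7,
    "mouse_c": [True] * 5 + [False] * 5,
}
ci_low, ci_high = bootstrap_fraction_ci(example)
```

Resampling at the level of the mouse typically gives wider intervals than resampling individual neurons, because between-animal variability is retained.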

7) Claims on interactions between variables

Claims about the interactions between variables need to be demonstrated by statistics. In principal cells, direction tuning is canonically thought to be largely invariant to contrast tuning. The authors should explicitly analyse whether the direction tuning for VIP interneurons is invariant to contrast, and whether the front-to-back preference is indeed specific for low contrast (as the authors suggest) – i.e. test against the alternative that contrast simply gain-modulates the direction tuning. A similar question pertains to the interaction between locomotion and direction tuning. The manuscript description of Figure 2 suggests an interaction between the Locomotion/Stationary variable with contrast and direction tuning. However, this does not appear to be statistically quantified, and it is possible that contrast tuning curves are modulated in a multiplicative way (i.e. invariant to locomotion). A similar comment pertains to, "This analysis demonstrates a substantial enhancement of responses to low contrast visual stimuli during locomotion that is specific to layer 2/3 pyramidal neurons and VIP neurons." There could be a general multiplicative increase in firing across cell lines and contrasts, and Figure 2 does not demonstrate that low contrast visual stimuli are enhanced specifically during locomotion.

The revised manuscript includes a new section and Figure 4 in which we model responses of VIP, SST, and L2/3 pyramidal neurons with a Generalized Linear Model to investigate the contribution of stimulus direction, stimulus contrast, running, and the interactions between these terms. We did not find strong interactions between stimulus direction and contrast for any neuron type, but we did find significant interactions between running and stimulus contrast as well as running and stimulus direction. We have edited the text of the manuscript to more precisely describe the interactions among these variables.
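The distinction the reviewers draw between a purely multiplicative gain change and a genuine interaction can be illustrated with a small worked example. This is a sketch only: it assumes responses are compared on a logarithmic scale (as in a model with a log link, which is an assumption here and not a statement about the model in Figure 4), and the function name and response values are hypothetical.

```python
import math

def log_interaction(r_stat_low, r_run_low, r_stat_high, r_run_high):
    """Running-by-contrast interaction on a log scale for a 2x2 design.

    Returns log(run/stationary) at low contrast minus the same log ratio
    at high contrast. The value is zero when running scales responses by
    the same factor at both contrasts (pure multiplicative gain).
    """
    gain_low = math.log(r_run_low / r_stat_low)
    gain_high = math.log(r_run_high / r_stat_high)
    return gain_low - gain_high

# Hypothetical mean responses (arbitrary units), not values from the paper.
pure_gain = log_interaction(1.0, 2.0, 4.0, 8.0)      # running doubles both
low_specific = log_interaction(1.0, 4.0, 4.0, 4.0)   # only low contrast changes
```

In the first case the interaction is zero even though running changes every response; in the second it is positive, which is the pattern that would indicate an enhancement specific to low contrast.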

8) Imaging depths and layers

The authors need to carefully check their depths and laminar assignments and update the text on this. The authors should more carefully discuss in the manuscript what type of SSt and VIP cells were imaged and discuss the implications of interneuron heterogeneity. Is there any particular reason the mice from the SSt-Cre mouse line were imaged in layer 4 as stated in the Materials and methods? As the authors know, dendrite-targeting Martinotti cells are more likely to be found in Layers 2/3 and 5 (Munoz et al., 2017, Science), and the Agmon Lab found that SSt-expressing neurons from layer 4 barrel cortex had unique and different electrophysiological properties from other SSt neurons (Ma et al., 2006, J.Neurosci). If, indeed, this is the only imaging plane used in the SSt-Cre mouse line, how come the authors describe the activity of SSt neurons in layers 2/3 (main text, fourth paragraph)? Similarly for layer 4 VIP cell activity (same lines). Generally, there is a bit of confusion with the reported imaging depths and the authors should comment on this. In particular, the laminar assignment of the SSt neurons needs to be clarified. According to the depth of 275 micrometers one would say these are layer 2/3, but the authors write layer 4. In general, the laminar assignment based on depth should be justified if it is not based on a layer-specific mouse line.

The cranial window used for 2P imaging puts pressure on the brain, which results in a compression of the cortical layers. The cortical thickness is compressed to ~700 μm, and the compression is not consistent across layers, such that, for example, layer 5 shows more compression and layer 4 shows less compression, on average (see de Vries, Lecoq, Buice et al. Supplementary Figure 12D and E). Based on this compression, as well as the density of neurons labeled by layer 4 specific excitatory Cre lines (e.g. Rorb, Scnn1a), we estimate that layer 4 begins at ~250 μm, and layer 5 begins at ~375 μm, below the surface.

The inhibitory neurons were imaged where they are most densely labeled. For VIP, this is shallower than for SST which we find to be densest at 275-375 μm below the surface. Based on our depth estimate, this puts the SST somata in layer 4, while the VIP neurons that are densest at ~175 μm are in layer 2/3. We believe, related to lines 96-98, that ambiguous wording led to the misinterpretation that VIP and SST neurons were imaged in both layers 2/3 and 4 – this wording has now been changed to clarify that pyramidal neurons were imaged in layers 2/3 and 4.
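The depth-based laminar assignment described above can be summarized in a few lines. This is a minimal sketch that uses only the two approximate boundaries stated in this response (~250 μm and ~375 μm); the function name and the coarse labels are hypothetical, and the boundaries are estimates rather than sharp transitions.

```python
# Approximate layer boundaries under the cranial window, in micrometers
# below the pial surface, taken from the estimates stated in the text.
LAYER4_START_UM = 250
LAYER5_START_UM = 375

def assign_layer(depth_um):
    """Map an imaging depth to a coarse laminar label.

    Only the two boundaries given in the text are used, so everything at or
    below the layer 5 boundary is reported as "5 or deeper".
    """
    if depth_um < 0:
        raise ValueError("depth must be non-negative")
    if depth_um < LAYER4_START_UM:
        return "2/3"   # layer 1 is not separated out in this sketch
    if depth_um < LAYER5_START_UM:
        return "4"
    return "5 or deeper"
```

With these boundaries, an imaging plane at 175 μm is assigned to layer 2/3 and a plane at 275 μm is assigned to layer 4, matching the assignments for VIP and SST neurons given above.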

Although the referees are correct to point out that some properties of layer 4 SST neurons differ from SST neurons in other layers in barrel cortex, recent work finds a clear difference between mouse V1 and S1 – in particular that the vast majority of layer 4 SST neurons in V1 are Martinotti cells (Scala et al., 2019, now stated and cited in the second paragraph of the main text). Furthermore, all of the L4 SST neurons in our study prefer high contrast and have weak direction selectivity, suggesting that these properties are very likely to apply to the subset of L4 SST neurons that are Martinotti cells. Finally, we have also observed in previous work (de Vries et al., 2020) robust reliable responses to high contrast full-field drifting gratings for SST neurons in both layers 4 and 5, suggesting that at least this aspect of SST neuron tuning in layer 4 is not different from layer 5.

9) Generalizability

Due to the focus on very specific stimuli, the scope of the study is limited and the results may not generalize beyond large drifting gratings. For instance, it is unclear whether these findings will hold up for smaller stimuli. Other stimulus parameters are not explored systematically (spatial frequency or temporal frequency). Stimulus size is ignored, even though stimulus size could have dramatic effects on the findings presented here. The authors need to discuss these limitations in the manuscript. In particular, they should comment on the issue of size tuning, which is missing in the manuscript. Previous studies have shown a very clear size dependence of VIP and SSt neurons (Adesnik and Scanziani, 2012; DiPoppa et al., 2018). The contrast dependence of the VIP and SSt neurons is likely strongly dependent on stimulus size. For instance, VIP neurons have small receptive fields (DiPoppa et al., 2018). Because surround modulation is contrast dependent, suppression of PC L2/3 cell activity and VIP activity likely depends on the stimulus size. The authors should discuss these limitations.

Although we have not explored spatial or temporal frequency in this study, we note that suppression of VIP neurons by high-contrast large gratings has been observed across spatial and temporal frequencies (de Vries et al., 2020). For size tuning, we explicitly address this question in the final paragraph with the sentences, “Measurements of size tuning have shown that SST neurons prefer large gratings, suggestive of a role mediating surround suppression, whereas VIP neurons only respond to gratings smaller than those that drive SST neurons. This complementary size tuning parallels the complementary contrast tuning observed here, suggesting that VIP and SST neurons in V1 are tuned for weak and strong inputs, respectively, across multiple stimulus dimensions.” Following those sentences, we also now point to our previous finding that VIP neuron receptive fields as measured with sparse noise stimuli are larger than those of SST and pyramidal neurons (de Vries et al., 2020). This result shows that the relatively small size tuning of VIP neurons does not arise simply from having small linear receptive fields; instead, we believe that VIP neurons have small size tuning due to the weaker stimulus energy of small versus large features. We certainly agree with the reviewers’ point that both size and contrast contribute to VIP and SST responses. Our discussion acknowledges this point and goes further to propose that the parsimonious explanation is that both size and contrast determine the strength of input to the cortical circuit, which is the determining factor.

10) Interpretation of SSt direction tuning and VIP suppression

The authors should discuss the interpretation of some of the main effects. In particular, reviewers wondered why VIP neurons are suppressed at high contrast?

We have added a discussion of this point in the seventh paragraph of the main text. Given that VIP neurons have the highest spontaneous activity among the neuron types sampled here (Figure 3; see also Extended Data Figure 1 in de Vries et al., 2020), one possibility is that VIP neurons simply have non-zero spontaneous activity whereas the other neuron types are already close to zero making suppression impossible. Another, related possibility is that suppression of other neuron types occurs but goes undetected because both the suppressed and non-suppressed activity levels are below the detection threshold of calcium imaging. Functionally, the high spontaneous activity of VIP neurons enables the cortical circuit to raise or lower the amount of disinhibition of pyramidal neurons depending on stimulus contrast. Our SSN modeling results demonstrate that suppression of VIP neurons below baseline helps the network to maintain stability in response to strong external inputs.

They also wondered why the response of SST neurons at low contrast is not direction modulated – given that the VIP neurons have strong direction tuning?

This is a very interesting question. Note that the SST neurons have a bias toward zero degrees at contrasts greater than 20% whereas SST neurons have weak or no response to any direction at contrasts less than or equal to 20%. One possibility is that SST neurons would also respond to zero degrees at low contrast in the absence of inhibition from VIP neurons. Another possibility is that the responses of SST neurons are simply too small at low contrast to detect direction tuning.

Do VIP interneurons indeed suppress the SSt neurons recorded here? This assumption is being made but it is never tested or argued for these specific laminar recordings from the literature.

Layer 4 SST neurons in V1 are mostly, but not entirely, Martinotti cells (Scala et al., 2019). Although we cannot be certain which of the SST neurons that we recorded are Martinotti cells, the tuning curves of all of the SST neurons look very similar suggesting that our findings apply to Martinotti cells and possibly also non-Martinotti SST neurons. We have added a statement to the main text clarifying this distinction between subtypes of SST neurons and motivating the imaging of Sst mice in layer 4.

11) Optic Flow

The interpretation of optic flow appears to be problematic and needs to be discussed. The authors interpret their data in terms of optic flow related to locomotion. However, the peak of VIP neurons occurs at 45 degrees rather than 0 degrees, but the authors write that 0 degrees corresponds to front-to-back motion (main text, second paragraph). Does the interpretation of the authors make sense given this discrepancy?

We do not believe that the discrepancy is large enough to warrant an interpretation that the mice perceive the direction of motion to be substantially different from front-to-back. First, the peak of VIP neurons is fairly centered between 0 and 45 degrees (new Figure 3D), suggesting a more modest discrepancy of 20 to 25 degrees. Second, our imaging experiments were performed on head-fixed mice that were standing on a wheel that permits them to run in place. The running wheel is angled slightly upward a few degrees to facilitate running during head-fixation which might also influence the mouse’s perception of egocentric angle of visual motion.

12) Modelling:

There are major concerns about parameter and model selection. Several findings are hard to account for with the model. The description of the model needs to be substantially improved.

These points are addressed where they are raised in detail below.

The tuning of VIP cells in 3b is much wider than that observed in the data (1b), and this could be a major problem for the model. The authors should explain this.

Figure 5B (previously Figure 3B) shows the tuning of pyramidal to VIP connections. Perhaps this comment was intended to refer to Figure 5C (previously Figure 3C) which shows the direction (and contrast) tuning of VIP neuron activity. The tuning width of the VIP responses in the SSN shown in Figure 5C (previously Figure 3C) is on the order of 10 to 20 degrees, which is not much wider than the tuning of VIP neurons in mouse V1 shown in Figure 2 (previously Figure 1). Our experiments measured direction responses at increments of 45 degrees and we do not claim that the width of direction tuning for VIP neurons is substantially narrower than this sampling permits us to discern.

The authors should make a better effort to explain the model in the text. For the modelling, authors need to provide more explanation on why SSNs were selected, how they work, etc. In the current version of the paper, it is necessary to refer to the very difficult Rubin et al., 2015 paper to understand what the model does.

The revised manuscript has a more thorough motivation and introduction to SSNs. In brief, SSNs were selected because a few universal features of cortical circuits (e.g. recurrent excitation, feedback inhibition, and supralinear f-I curves) can account for a wide variety of contrast-dependent phenomenology (Rubin et al., 2015) in addition to a variety of other phenomenology that we now state and cite.

Also, reviewers would like to see that the canonical results from Rubin et al. still work in the revised model, and commented that this is an essential addition to the paper.

In addition to further exploration of network behavior across model parameters (see next point), we have added a section (main text) and Figure 5G-L on the analysis of the inhibitory stabilization behavior of the network as a function of the model parameters. Specifically, we performed linear stability analysis of the excitatory portion of the network (by computing the eigenvalues of the excitatory-to-excitatory submatrix of the Jacobian for the linearized network dynamics; full details in the Materials and methods section) to determine whether or not the steady state firing rates of the network would be stable in the absence of feedback inhibition. In other words, this analysis shows whether the network is inhibitory stabilized or not inhibitory stabilized. It is possible that achieving a high gain makes the recurrent excitation unstable, requiring strong inhibitory stabilization to prevent runaway excitation. Surprisingly, we found that the high gain at low contrast due to VIP neurons does not require the network to be in an inhibitory stabilized regime.
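The logic of this stability test can be sketched in a reduced setting. The example below is an illustration only, not the model in Figure 5: it uses a two-population (one excitatory, one inhibitory) rate network with a rectified quadratic transfer function, and all weights, time constants, and input strengths are hypothetical values chosen for the sketch. With a single excitatory unit, the excitatory-to-excitatory submatrix of the Jacobian reduces to one number, so the eigenvalue test becomes a check of whether the excitatory loop gain exceeds 1.

```python
def simulate_ssn(h, w_ee=1.25, w_ei=0.65, w_ie=1.2, w_ii=0.5,
                 k=0.04, tau_e=20.0, tau_i=10.0, dt=0.1, t_max=1000.0):
    """Euler-integrate a two-population (E, I) rate model to steady state.

    tau * dr/dt = -r + k * max(net_input, 0) ** 2, with the same external
    input h to both populations. All parameter values are illustrative.
    """
    r_e, r_i = 0.0, 0.0
    for _ in range(int(t_max / dt)):
        in_e = w_ee * r_e - w_ei * r_i + h
        in_i = w_ie * r_e - w_ii * r_i + h
        r_e += dt / tau_e * (-r_e + k * max(in_e, 0.0) ** 2)
        r_i += dt / tau_i * (-r_i + k * max(in_i, 0.0) ** 2)
    return r_e, r_i

def excitatory_loop_gain(r_e, r_i, h, w_ee=1.25, w_ei=0.65, k=0.04):
    """Return w_ee * f'(net input to E) at the given state.

    With inhibition frozen, the linearized E dynamics have eigenvalue
    (-1 + w_ee * f') / tau_e, so a value above 1 means the excitatory
    subnetwork alone is unstable, i.e. the network is inhibition-stabilized.
    """
    in_e = w_ee * r_e - w_ei * r_i + h
    slope = 2.0 * k * max(in_e, 0.0)   # derivative of k * x**2 for x > 0
    return w_ee * slope

weak = excitatory_loop_gain(*simulate_ssn(h=2.0), h=2.0)
strong = excitatory_loop_gain(*simulate_ssn(h=20.0), h=20.0)
```

For these illustrative parameters the loop gain is below 1 for the weak input and above 1 for the strong input, so the same network moves from a non-stabilized to an inhibition-stabilized regime as input strength grows. In a network with many excitatory units, the corresponding test is whether any eigenvalue of the excitatory-to-excitatory block has a positive real part.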

The authors should include a discussion of the process of parameter selection for the model (how much fine tuning was required, what happens when connection weights deviate somewhat, etc.). In general, the scope of the model appears to be somewhat limited given that model parameters are not systematically explored or fitted based on the data.

The revised manuscript includes a wider exploration of model parameters which has yielded additional insights into the behavior of the model. In particular, we have examined the network dynamics as a function of two crucial parameters: 1) the strength of the VIP to SST connection (Figure 5G-I and main text) and 2) the relative amount of inhibition that pyramidal neurons receive from PV versus SST neurons (Figure 5J-L and main text). We show in Figure 5 (previously Figure 3) that our main findings (specifically, increased gain at low contrast due to VIP neurons) hold across a wide range of values for these parameters as well as how the magnitude of the gain effect varies with these parameters.

Further, the model makes strong assumptions on the inputs that these different neuron types presumably receive from external drive. These limitations need to be discussed.

Correct, our model assumes that pyramidal and PV neurons receive direct external (e.g. thalamic) input whereas VIP and SST neurons do not. The lack of direct thalamocortical input to SST neurons (and confirmation of direct thalamocortical input to pyramidal and fast-spiking putative PV neurons) has been observed in mouse somatosensory cortex (Cruikshank et al., 2010). While we also assume that VIP neurons do not receive direct external input, the responses of VIP neurons to weak stimuli would be expected to increase if this additional excitatory input to VIP neurons were added to the model; therefore, our assumption of no direct external input to VIP neurons is conservative in relation to our main conclusion. We have now elaborated on this reasoning in the eleventh paragraph of the main text.

The main text should be improved with a more in-depth explanation for using SSNs (stabilized supra-linear network).

Again, the revised manuscript has a more thorough motivation and introduction to SSNs.

Finally, rectified quadratic response functions for neurons are not common in RNNs, and should be motivated.

Rectified quadratic single neuron transfer functions have been used in previous studies of SSNs (Ahmadian et al., 2013, Rubin et al., 2015) for simplicity of mathematical analysis (e.g. the derivative of x² is simply 2x) while still having the supralinear property. We use the same single neuron transfer function to maintain consistency with the existing SSN literature.
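For readers unfamiliar with this transfer function, a minimal sketch follows. The function names and the scale factor `k` are hypothetical and used only for illustration.

```python
def rectified_quadratic(x, k=1.0):
    """Supralinear transfer function: k * x**2 for x > 0, else 0."""
    return k * max(x, 0.0) ** 2

def rectified_quadratic_slope(x, k=1.0):
    """Derivative of the transfer function: 2 * k * x for x > 0, else 0."""
    return 2.0 * k * max(x, 0.0)
```

Because the slope grows linearly with the input, doubling a positive input quadruples the output. This input-dependent gain is the supralinear property referred to above.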

Also, no explanation/citation is provided for the choice of parameters (input, connections) for PV neurons.

We thank the reviewers for pointing out that a citation was not provided. We have now cited work that found the orientation tuning of PV neurons to be broad (Kerlin et al., 2010) in the eleventh paragraph of the main text.

With respect to the conclusion – "These results indicate that Vip-mediated disinhibition is capable of producing substantial increases in gain at low contrast despite low activity of the intermediate SST neuron population." The authors should explain why this is not a circular argument and a strange way to summarize the results of the model. Reviewers pointed out that it is clear from Figure 3D that the suppression of Sst activity by Vip neurons at low contrast is what enables supra-linear responses of Cux2 neurons at low contrast.

SST neurons have very weak responses to low contrast gratings. One might (mistakenly) conclude that even full suppression of a small amount of SST activity can only result in a small increase in pyramidal neuron activity. Our intention with this sentence was to emphasize that, due to recurrent excitation and supralinear single neuron transfer functions, the suppression of a relatively small amount of SST neuron activity can drive large increases in pyramidal neuron activity. We have now elaborated this reasoning in the tenth and twelfth paragraphs of the main text.

The bias of direction tuning in L2/3 neurons is very weak. How can the authors account for the contrast enhancement of these neurons in their VIP model? Should this enhancement not be extremely specific to the 45 degrees angle?

Approximately half of the L2/3 pyramidal neurons are low contrast preferring (Figure 2E) whereas the over-representation of 0 and 45 degrees-preferring L2/3 pyramidal neurons is relatively small. Our measurements of VIP neuron contrast and direction tuning are consistent with the hypothesis that VIP neurons can cause a larger over-representation based on (low) contrast preference than (front-to-back) direction preference. Indeed, VIP neuron activity is higher at low contrast than high contrast across all directions (i.e. the suppressed by high contrast effect) and VIP neuron activity is even higher for front-to-back motion than other directions of motion at low contrast (Figure 3D). We now discuss this aspect of the model explicitly in the main text.

Associated Data

    Data Citations

    1. Millman DJ, de Vries SE. 2020. Allen Institute – Contrast tuning in mouse visual cortex with calcium imaging. DANDI. 000039

    Supplementary Materials

    Figure 1—source data 1. The total number of cells, experimental sessions, and mice per Cre line.
    Transparent reporting form

    Data Availability Statement

    The data generated and analyzed in this study are available on DANDI: Distributed Archives for Neurophysiology Data Integration. All analyses were performed using custom scripts written in Python 2.7, using NumPy, SciPy, Pandas, Matplotlib, statsmodels, and Scikit-learn. Analysis code is available at https://github.com/AllenInstitute/Contrast_Analysis (copy archived at https://archive.softwareheritage.org/swh:1:rev:c7ddda11647093e8a0173dbd2a1986ac6239c821/). Event extraction was performed using FastLZeroSpikeInference available at https://github.com/jewellsean/FastLZeroSpikeInference.


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
