eLife. 2021 Apr 7;10:e54858. doi: 10.7554/eLife.54858

Stimulus-dependent relationships between behavioral choice and sensory neural responses

Daniel Chicharro 1,2, Stefano Panzeri 1, Ralf M Haefner 3,†
Editors: Joshua I Gold4, Kristine Krug5
PMCID: PMC8184215  PMID: 33825683

Abstract

Understanding perceptual decision-making requires linking sensory neural responses to behavioral choices. In two-choice tasks, activity-choice covariations are commonly quantified with a single measure of choice probability (CP), without characterizing their changes across stimulus levels. We provide theoretical conditions for stimulus dependencies of activity-choice covariations. Assuming a general decision-threshold model, which comprises both feedforward and feedback processing and allows for a stimulus-modulated neural population covariance, we analytically predict a very general and previously unreported stimulus dependence of CPs. We develop new tools, including refined analyses of CPs and generalized linear models with stimulus-choice interactions, which accurately assess the stimulus- or choice-driven signals of each neuron, characterizing stimulus-dependent patterns of choice-related signals. With these tools, we analyze CPs of macaque MT neurons during a motion discrimination task. Our analysis provides preliminary empirical evidence for the promise of studying stimulus dependencies of choice-related signals, encouraging further assessment in wider data sets.

Research organism: Rhesus macaque

Introduction

How perceptual decisions depend on responses of sensory neurons is a fundamental question in systems neuroscience (Parker and Newsome, 1998; Gold and Shadlen, 2001; Romo and Salinas, 2003; Gold and Shadlen, 2007; Siegel et al., 2015; van Vugt et al., 2018; O’Connell et al., 2018; Steinmetz et al., 2019). The seminal work of Britten et al., 1996 showed that responses from single cells in area MT of monkeys during a motion discrimination task covaried with behavioral choices. Similar activity-choice covariations have been found in many sensory areas during a variety of both discrimination and detection two-choice tasks (see Nienborg et al., 2012; Cumming and Nienborg, 2016, for a review). Identifying which cells encode choice, and how and when they encode it, is essential to understand how the brain generates behavior based on sensory information.

In two-choice tasks, Choice Probability (CP) has been the most prominent measure (Britten et al., 1996; Parker and Newsome, 1998; Nienborg et al., 2012) used to quantify activity-choice covariations. Although early studies (Britten et al., 1996; Dodd et al., 2001) explored potential dependencies of the CP on the stimulus content, no significant evidence of a CP stimulus dependency was found. Accordingly, it has become common to report for each neuron a single CP value to quantify the strength of activity-choice covariations. This scalar CP value has typically been calculated either only from trials with a single, non-informative stimulus level (e.g. Dodd et al., 2001; Parker et al., 2002; Krug et al., 2004; Wimmer et al., 2015; Katz et al., 2016; Wasmuht et al., 2019), or by pooling trials across stimulus levels (so-called grand CP [Britten et al., 1996]) under the assumption that choice-related neural signals are separable from stimulus-driven responses (e.g. Verhoef et al., 2015; Pitkow et al., 2015; Smolyanskaya et al., 2015; Bondy et al., 2018). Alternatively, a single CP is sometimes obtained by simply averaging CPs across stimulus levels (e.g. Cai and Padoa-Schioppa, 2014; Latimer et al., 2015; Liu et al., 2016). Even when activity-choice covariations are modeled jointly with other covariates of the neural responses using Generalized Linear Models (GLMs) (Truccolo et al., 2005; Pillow et al., 2008), the stimulus level and the choice value are usually used as separate predictors of the responses (Park et al., 2014; Runyan et al., 2017; Scott et al., 2017; Pinto et al., 2019; Minderer et al., 2019).

This focus on characterizing a neuron by a single CP value is mirrored in the existing theoretical studies. Existing theoretical results rely on a standard feed-forward model of decision making in which a neural representation of the stimulus is converted by a threshold mechanism into a behavioral choice (Shadlen et al., 1996; Cohen and Newsome, 2009b; Haefner et al., 2013) assuming a single, zero-signal stimulus level, and hence ignoring stimulus dependencies of CPs. Furthermore, so far no analytical mechanistic model accounts for feedback contributions to activity-choice covariations known to be important empirically (Nienborg and Cumming, 2009; Cumming and Nienborg, 2016; Bondy et al., 2018).

The main contribution of this work is to extend CP analysis from reporting a single CP value for each cell to a more complete characterization of within-cell patterns of choice-related activity across stimulus levels. First, we extended the analytical results of Haefner et al., 2013 to the general case of informative stimuli and to include both feedforward and feedback sources of the covariation between the choice and each cell. Our results predict that CP stimulus dependencies can appear in a cell-specific way because of stimulus dependencies of cross-neuronal correlations. We show that they can also appear for all neurons because of the transformation of the neural representation of the stimulus into a binary choice, if the decision-making process relies on a threshold mechanism (or threshold criterion) to convert a continuous decision variable into a binary choice. Second, we developed two new analytical methods (a refined CP analysis and a new generalized linear model with stimulus-choice interactions) with increased power to detect stimulus dependencies in activity-choice covariations. Our new CP analysis isolates within-cell stimulus dependencies of activity-choice covariations from across-cells heterogeneity in the magnitude of the CP values, which may hinder their detection (Britten et al., 1996). Third, we applied this analysis framework to the classic dataset of Britten et al., 1996 containing recordings from neurons in visual cortical area MT and found evidence for our predicted population-level threshold-induced dependency but also additional interesting cell-specific dependencies. We found consistent results on the existence of stimulus-choice interactions in neural activity both with our refined CP analysis and using generalized linear models with interaction terms. Finally, we show that the main properties of the additional dependencies found can be explained by modeling the cross-neuronal correlation structure induced by gain fluctuations (Goris et al., 2014; Ecker et al., 2014; Kayser et al., 2015; Schölvinck et al., 2015), which have been shown to explain a substantial amount of response variability in MT visual cortex (Goris et al., 2014).

Results

We will first present the analysis of a theoretical model of how informative stimuli modulate choice probabilities. We will then analyze MT visual cortex neuronal responses from Britten et al., 1996, applying new methods developed to quantify stimulus-dependent activity-choice covariations with CPs and GLMs. This analysis provides preliminary empirical evidence in support of using these new methods for studying stimulus dependencies of activity-choice covariations.

A general account for choice-related neural signals in the presence of informative stimuli

In a two-choice psychophysical task, such as a stimulus discrimination or detection task, a neuron is said to contain a ‘choice-related signal’, or ‘decision-related signal’ when its activity carries information about the behavioral choice above and beyond the information that it carries about the stimulus (Britten et al., 1996; Parker and Newsome, 1998; Nienborg et al., 2012). The interpretation of choice-related signals in terms of decision-making mechanisms is however difficult. Much progress in our understanding of their meaning has relied on using models to derive mathematically the relationship between the underlying decision-making mechanisms and different measures of activity-choice covariation (Haefner et al., 2013; Pitkow et al., 2015) usually used to quantify choice-related signals.

The most widely used measure of activity-choice covariation for tasks involving two choices is choice probability, CP. The CP is defined as the probability that a random sample of neural activity from all trials with behavioral choice D equal to 1 is larger than one sample randomly drawn from all trials with choice D=-1 (Britten et al., 1996; Parker and Newsome, 1998; Nienborg et al., 2012; Haefner et al., 2013):

$$\mathrm{CP} \equiv \int_{-\infty}^{\infty} dr\, p(r \mid D=1) \int_{-\infty}^{r} dr'\, p(r' \mid D=-1), \tag{1}$$

where r is any measure of the neural activity, which we will here consider to be the neuron’s per-trial spike count. Another prominent measure of choice-related signals is choice correlation, CC (Pitkow et al., 2015). This quantity is defined under the assumption that the binary choice D is mediated by an intermediate continuous decision value, d. This value may represent the brain’s estimate of the stimulus, or an internal belief about the correct choice. The definition of CC further assumes that the categorical choice D is related to d via a thresholding operation such that the choice depends on whether d is smaller or larger than a threshold θ (Gold and Shadlen, 2007). Its expression is as follows:

$$\mathrm{CC} \equiv \mathrm{corr}(r,d) = \frac{\mathrm{cov}(r,d)}{\sqrt{\mathrm{var}\,r \; \mathrm{var}\,d}}, \tag{2}$$

where cov(r,d) is the covariance of the neural responses with d, and var r and var d are their variances across trials. Perhaps the simplest measure of activity-choice covariation, which has been used in empirical studies (Mante et al., 2013; Ruff et al., 2018), is what we call the choice-triggered average, CTA, defined as the difference between a neuron’s average spike count r across trials with behavioral decision D=1 and the average spike count in trials with decision D=-1:

$$\mathrm{CTA} \equiv \langle r \rangle_{D=1} - \langle r \rangle_{D=-1}. \tag{3}$$
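The three measures defined in Equations 1-3 can be estimated directly from per-trial data. The following is a minimal sketch, not the authors' code: the function names are hypothetical, ties in the CP are counted as one half (the usual ROC convention, assumed here), and the decision variable `d` passed to `choice_correlation` is assumed to be available, which is normally true only in simulations.

```python
from math import sqrt
from statistics import mean, pstdev


def choice_probability(r_pos, r_neg):
    """Sample estimate of Equation 1: P(r | D=1 exceeds r | D=-1).

    Ties are counted as 1/2 (an assumption, following ROC practice).
    """
    wins = 0.0
    for x in r_pos:
        for y in r_neg:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(r_pos) * len(r_neg))


def choice_triggered_average(r_pos, r_neg):
    """Sample estimate of Equation 3: mean response difference between choices."""
    return mean(r_pos) - mean(r_neg)


def choice_correlation(r, d):
    """Sample estimate of Equation 2: Pearson correlation between r and d."""
    mr, md = mean(r), mean(d)
    cov = sum((x - mr) * (y - md) for x, y in zip(r, d)) / len(r)
    return cov / (pstdev(r) * pstdev(d))


# Toy spike counts (made-up numbers) for trials with D=1 and D=-1.
r_pos = [12, 15, 11, 14]
r_neg = [10, 12, 9, 13]
print(choice_probability(r_pos, r_neg))        # 0.78125
print(choice_triggered_average(r_pos, r_neg))  # 2
```

The double loop is quadratic in the number of trials, which is acceptable for the few tens of trials per stimulus level that are typical of such data sets.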

The CP and CTA quantify activity-choice covariations without assumptions about the underlying decision-making mechanisms. However, their interpretation has commonly (Nienborg et al., 2012) been informed in previous analytical and computational studies by assuming a specific feedforward decision-threshold model of choice-related signals (Shadlen et al., 1996; Cohen and Newsome, 2009b). Haefner et al., 2013 used that model to derive an analytical expression for CP valid under two assumptions that are often violated in practice: first, the model assumes a causally feedforward structure in which sensory responses caused the decision, and second, it is assumed that both decisions are equally likely. However, the presence of informative stimuli leads to one choice being more likely than the other, hampering the application of the analytical results to Grand CPs and to detection tasks (Bosking and Maunsell, 2011; Smolyanskaya et al., 2015), which involve informative stimuli. Furthermore, decision-related signals have empirically been shown to reflect substantial feedback components (Nienborg and Cumming, 2009; Nienborg et al., 2012; Macke and Nienborg, 2019). We will next extend this previous model (Haefner et al., 2013) to obtain a general expression of the CP valid for informative stimuli and regardless of the feedforward or feedback origin of the dependencies between the neural responses and the decision variable.

We first consider the most generic model, in which we simply assume that the response ri of the i-th sensory neuron covaries with the behavioral decision D, without making any assumption about the origin of that covariation (Figure 1A). We find that to a first approximation (exact solution provided in Methods), the CP of cell i captures the difference between the distributions p(ri|D=1) and p(ri|D=-1) resulting from a difference in their means, and hence is related to the CTA:

$$\mathrm{CP}_i \approx \frac{1}{2} + \frac{1}{2\sqrt{\pi}} \frac{\mathrm{CTA}_i}{\sqrt{\mathrm{var}\,r_i}}. \tag{4}$$
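As a quick sanity check of Equation 4, one can compare it with a case where the CP is known in closed form. The sketch below assumes, purely for illustration, that the two choice-conditioned response distributions are Gaussian with equal standard deviation `sd` and means differing by `cta`; in that case the CP is Φ(cta/(sd√2)), and Equation 4 is its first-order expansion around cta = 0. The function names are hypothetical.

```python
from math import pi, sqrt
from statistics import NormalDist


def cp_gaussian_exact(cta, sd):
    """CP for two equal-variance Gaussians whose means differ by cta."""
    return NormalDist().cdf(cta / (sd * sqrt(2.0)))


def cp_linear(cta, sd):
    """Linear approximation of Equation 4: 1/2 + CTA / (2 sqrt(pi) sd)."""
    return 0.5 + cta / (2.0 * sqrt(pi) * sd)


for cta in (0.1, 0.2, 0.5):
    print(cta, cp_gaussian_exact(cta, 1.0), cp_linear(cta, 1.0))
```

For weak choice-related signals the two values agree closely, and the gap widens as the normalized CTA grows.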

Figure 1. Models of choice probabilities.

Arrows indicate causal influences. Undirected edges indicate relationships that may be due to feedforward, feedback, and/or common inputs. (a) A model agnostic to the causal origin of the choice–response covariation: the response of sensory neurons encoding a stimulus s covaries with choice D. (b) Threshold model with a continuous decision variable d mediating the relationship between responses and choice. The binary decision is made comparing d to a threshold θ. (c) The threshold mechanism (vertical dashed black line) dichotomizes the d-space, resulting in a difference between the means of the conditional distributions associated with D=±1 (red and blue vertical dashes on top of figure). This difference is quantified by CTAd (horizontal thick black line) and implies a non-zero difference between the choice-triggered average responses (CTAi) in the presence of a correlation, CCi, between d and ri.

The CTA generically quantifies the linear dependencies between responses and choice, and this approximation of the CP does not depend on their feedforward or feedback origin (Figure 1A). We next add the assumption that the relationship between a neuron’s response and the choice is mediated by the continuous variable d, as commonly assumed by previous studies and described above (Figure 1B). This splits any correlation between the neural response ri and choice D into the product of the two respective correlations: corr(ri,D)=corr(ri,d)corr(d,D) = CCicorr(d,D), where CCi=corr(ri,d) is the choice correlation as defined in Equation 2. It follows (see Methods) that:

$$\mathrm{CTA}_i = \mathrm{CC}_i \frac{\sqrt{\mathrm{var}\,r_i}}{\sqrt{\mathrm{var}\,d}}\, \mathrm{CTA}_d, \tag{5}$$

where CTAd is the average difference in d between the two choices, in analogy to the CTAi for neuron i. Equation 5 describes how activity-choice covariations appear in the model (Figure 1C): the threshold mechanism dichotomizes the space of the decision variable, resulting in a different mean of d for each choice, which is quantified in CTAd. If the activity of cell i is correlated with the decision variable d (nonzero CCi), the CTAd is then reflected in the CTAi of the cell. In previous theoretical work (Haefner et al., 2013), the distribution over d was assumed to be fixed and centered on the threshold value θ. Here, we remove that assumption and consider that d may not be centered on the threshold if the stimulus is informative, containing evidence in favor of one of the two choices, or if the choice is otherwise biased. In those cases, the normalized CTAd in Equation 5, namely CTAd/√(var d), can be determined (see Methods) in terms of the probability of choosing choice 1, pCR ≡ p(D=1) = p(d>θ), which we call the ‘choice rate’. Since the decision variable is determined as the combination of the responses of many cells, its distribution is well approximated by a Gaussian distribution, but now with a nonzero mean determined by the stimulus content. With this assumption, the normalized CTAd for pCR=0.5 is equal to 4/√(2π), and for each other pCR value differs by a scaling factor

$$h(p_{CR}) = \frac{\sqrt{2\pi}\, \phi\!\left(\Phi^{-1}(p_{CR})\right)}{4\, p_{CR}\, (1 - p_{CR})}, \tag{6}$$

where ϕ(x) is the density function of a zero-mean, unit-variance Gaussian distribution, and Φ⁻¹ is the corresponding inverse cumulative distribution function. By construction, h(pCR)=1 for pCR=0.5, where it has its minimum. Given the factor h(pCR), combining Equations 4 and 5 we can relate CP and CC across different choice rates pCR, corresponding to different stimulus levels, irrespective of whether CP is caused by feedforward or feedback signals. In the linear approximation (see Methods for the exact formula and derivation with the decision-threshold model), this relationship reads:
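The threshold-induced factor of Equation 6 is straightforward to evaluate numerically. The following is a small sketch using only the Python standard library; the function name `h` mirrors the symbol in the text, and the argument name `p_cr` is our own.

```python
from math import pi, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # zero mean, unit variance


def h(p_cr):
    """Threshold-induced scaling factor of Equation 6, for 0 < p_cr < 1."""
    if not 0.0 < p_cr < 1.0:
        raise ValueError("p_cr must lie strictly between 0 and 1")
    z = _STD_NORMAL.inv_cdf(p_cr)  # Phi^{-1}(p_cr)
    return sqrt(2.0 * pi) * _STD_NORMAL.pdf(z) / (4.0 * p_cr * (1.0 - p_cr))


for p in (0.5, 0.7, 0.9, 0.99):
    print(p, round(h(p), 3))
```

Evaluating it on a grid shows the properties stated in the text: the factor equals 1 at pCR=0.5, is symmetric around that point, and stays close to 1 until pCR approaches 0 or 1.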

$$\mathrm{CP}_i(p_{CR}) \approx \frac{1}{2} + \frac{\sqrt{2}}{\pi}\, h(p_{CR})\, \mathrm{CC}_i(p_{CR}). \tag{7}$$

For equal fractions of choices, pCR=0.5, this CP expression corresponds to the linear approximation derived in Haefner et al., 2013. Note that extending the CP formula to pCR ≠ 0.5 required us to also make explicit the dependency of the choice correlations on the choice rate, CCi(pCR). Unlike h(pCR), which is an effect of the decision-making threshold mechanism and shared by all neurons, CCi(pCR) is specific to and generally different for each neuron, reflecting its role in the perceptual decision-making process. A CC stimulus dependence may arise as a result of stimulus-dependent decision feedback (Haefner et al., 2016; Bondy et al., 2018; Lange and Haefner, 2017), or other sources of stimulus-dependent cross-neuronal correlations (Ponce-Alvarez et al., 2013; Orbán et al., 2016) such as shared gain fluctuations (Goris et al., 2014). In fact, we will show below that gain-induced stimulus-dependent cross-neuronal correlations account for observed features in our empirical data. Note that we do not distinguish between CC stimulus dependencies and a dependence of the CC on pCR. We do not make this distinction here because most generally a change in the stimulus level results in a change of pCR, and the two cannot be disentangled. However, the pCR more generally depends on other factors such as the reward value, attention level, or arousal state, and in Equation 7 the separate dependencies on the stimulus and pCR can be explicitly indicated as CCi(pCR,s) when the experimental paradigm allows these two influences to be separated.

For simplicity, we presented above only the general relationship between the CP and CC in Equation 7 derived as a linear approximation for weak activity-choice covariations, as this is the regime relevant for single sensory neurons. See Methods for the exact analytical solution from the threshold model (Equation 16) and Appendix 1 for its derivation. Despite the assumption of weak activity-choice covariations, this approximation is very close to the exact solution over the empirically relevant range of CCs (Figure 2A–B). Below we will focus on a concrete type of CC stimulus dependence, namely that originating from gain fluctuations, but it is clear from Equation 7 that any CC stimulus dependence will modify the CP(pCR) shape induced purely by the threshold effect. A summary of the overall relation between the CP, CTA, and CC is provided in Figure 2D.
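The linear relationship of Equation 7 can be checked against a Monte Carlo simulation of the threshold model. The sketch below is illustrative only and makes several assumptions not taken from the text: the response r and the decision variable d are jointly Gaussian with unit variance and a stimulus-independent correlation `cc`, the threshold is θ = 0, and the mean of d is shifted so that p(d > θ) equals the desired choice rate. All function names and the trial count are hypothetical.

```python
import random
from bisect import bisect_left, bisect_right
from math import pi, sqrt
from statistics import NormalDist

STD = NormalDist()


def h(p_cr):
    """Threshold-induced factor of Equation 6."""
    z = STD.inv_cdf(p_cr)
    return sqrt(2.0 * pi) * STD.pdf(z) / (4.0 * p_cr * (1.0 - p_cr))


def cp_linear(cc, p_cr):
    """Linear approximation of Equation 7."""
    return 0.5 + sqrt(2.0) / pi * h(p_cr) * cc


def simulate_cp(cc, p_cr, n_trials=100_000, seed=0):
    """Empirical CP from simulated trials of the threshold model."""
    rng = random.Random(seed)
    mu = STD.inv_cdf(p_cr)  # mean of d such that P(d > 0) = p_cr
    r_pos, r_neg = [], []
    for _ in range(n_trials):
        z = rng.gauss(0.0, 1.0)  # centred decision variable
        r = cc * z + sqrt(1.0 - cc * cc) * rng.gauss(0.0, 1.0)
        (r_pos if z + mu > 0.0 else r_neg).append(r)
    # Rank-based evaluation of Equation 1; ties count as 1/2.
    r_neg.sort()
    wins = sum(0.5 * (bisect_left(r_neg, x) + bisect_right(r_neg, x))
               for x in r_pos)
    return wins / (len(r_pos) * len(r_neg))


for p in (0.5, 0.8):
    print(p, round(simulate_cp(0.2, p), 4), round(cp_linear(0.2, p), 4))
```

With a weak choice correlation such as 0.2, the simulated and predicted values should agree to within sampling error, and both increase as pCR moves away from 0.5.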

Figure 2. Predictions for stimulus dependencies from the threshold model.

(a) CP dependence on pCR through the threshold-induced factor h(pCR). Results are shown for three values of a stimulus-independent choice correlation, CCi, isolating the shape of h(pCR) from other stimulus dependencies. Solid curves represent the exact solution of the CP obtained from our model (see Methods, Equation 16) and dashed curves its linear approximation (Equation 7). (b) Comparison of the exact solution of the CP (solid) and its linear approximation (dashed), as a function of the magnitude of a stimulus-independent choice correlation. Results are shown for two values of pCR, 0.5 and 0.9. (c) CP dependence on pCR when together with the factor h(pCR) stimulus dependencies also appear through stimulus-dependent choice correlations induced by response gain fluctuations (Equation 11). Results are shown for five values of CC(pCR=0.5) (dotted horizontal lines) and in each case for two values of λi, the fraction of the variance of a cell i caused by the gain fluctuations (Methods). (d) Summary of the derived relationships as provided by Equations 4-7.

The model provides a concrete prediction of a stereotyped dependence of CP on pCR through h(pCR) when the choice-related signals are mediated by an intermediate decision variable d, which is testable using data. First, under the assumption that CC is constant and therefore h(pCR) is the only source of CP dependence on pCR, for a positive CC (CP>0.5), the CP(pCR) should have a minimum at pCR=0.5 and increase symmetrically as pCR deviates from 0.5 as the result of a change in the stimulus in either direction (Figure 2A). When the CC is negative (CP<0.5), then CP(pCR) should have a maximum at pCR=0.5 and analogously decrease symmetrically as pCR deviates from 0.5. Second, since the influence of h(pCR) is multiplicative, it creates higher absolute differences in the CP across different stimulus levels for cells with a stronger CP (either larger or smaller than 0.5). Third, the dependence on h(pCR) is weak for a wide range of pCR values (Figure 2A), making it empirically detectable only when including highly informative stimuli in the analysis to obtain pCR values very different from 0.5. However, for those pCR values, CP estimates are less reliable, because only for few trials the choice is expected to be inconsistent with the sensory information, meaning that one of the two distributions p(ri|D=1) or p(ri|D=-1) is poorly sampled. This means that to detect the h(pCR) modulation for single cells, many trials would be needed for each value of pCR to obtain good estimates. Because h(pCR) is common to all cells, averaging CP(pCR) profiles across cells can also improve the estimation. This averaging may also help to isolate the h(pCR) modulation, assuming that cell-specific CPi stimulus dependencies introduced through choice correlations CCi are heterogeneous across cells and average out. 
We refer to Appendix 1 for a detailed analysis of the statistical power for the detection of h(pCR) as a function of the number of trials and cells used to estimate an average CP(pCR) profile. We will present below (Section ‘Stimulus dependencies of choice-related signals in the responses of MT cells’) evidence for the h(pCR) modulation from a re-analysis of the data in Britten et al., 1996.

The structure of CP stimulus dependencies induced by response gain fluctuations

We will now focus on a concrete source of stimulus-dependent correlations that leads to a non-constant CC(pCR), namely the effect of gain fluctuations on the stimulus-response relationship (Goris et al., 2014; Ecker et al., 2014; Kayser et al., 2015; Schölvinck et al., 2015). Goris et al., 2014 showed that 75% of the variability in the responses of monkey MT cells when presented with drifting gratings could be explained by gain fluctuations. We derive the CP dependencies on pCR in a feedforward model of decision-making (Shadlen et al., 1996; Haefner et al., 2013) that also models the effect of gain fluctuations in the responses. The feedforward model considers a population of sensory responses, r=(r1,…,rn), with tuning functions f(s)=(f1(s),…,fn(s)), responses ri=fi(s)+ξi, and a covariance structure Σ of the neurons’ intrinsic variability ξi. The responses are read out into the decision variable with a linear decoder

$$d = \mathbf{w}^\top \mathbf{r} = \sum_{i=1}^{n} w_i r_i, \tag{8}$$

where w are the read-out weights. The categorical choice D is made by comparing d to a threshold θ. With this model, the general expression of Equation 7 reduces to

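$$\mathrm{CP}_i(p_{CR}) \approx \frac{1}{2} + \frac{\sqrt{2}}{\pi}\, h(p_{CR})\, \frac{(\Sigma(s)\mathbf{w})_i}{\sqrt{\Sigma_{ii}(s)\; \mathbf{w}^\top \Sigma(s)\, \mathbf{w}}}, \tag{9}$$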

where (Σ(s)w)i=cov(ri,d) and var d=wᵀΣ(s)w. This expression corresponds to the one derived by Haefner et al., 2013, except for h(pCR) and for the fact that we now explicitly indicate the dependence of the correlation structure Σ(s) on the stimulus. The expression relates the CP magnitude to single-unit properties such as the neurometric sensitivity, as well as to population properties, such as the decoder pooling size and the magnitude of the cross-neuronal correlations, which determine CC (Shadlen et al., 1996; Haefner et al., 2013). In particular, if the decoding weights are optimally tuned to the structure of the covariability Σ(pCR=0.5) at the decision boundary, this results in a proportionality between CPi(pCR=0.5) and the neurometric sensitivity of the cells: CPi(pCR=0.5) ∝ f′i/σri (Haefner et al., 2013), as has been experimentally observed (Britten et al., 1996; Parker and Newsome, 1998). While this feedforward model is generic, we concretely study CC stimulus dependencies induced by the effect of global gain response fluctuations in cross-neuronal correlations. Following Goris et al., 2014, we modeled the responses of cell i in trial k as fik(s)=gkfi(s), where gk is a gain modulation factor shared by the population. We assume that the readout weights w are stimulus-independent. As a consequence, the covariance of population responses Σ has a component due to the gain fluctuations:

$$\Sigma(s) = \bar{\Sigma} + \sigma_G^2\, \mathbf{f}(s)\, \mathbf{f}^\top(s), \tag{10}$$

where σG² is the variance of the gain g and Σ̄ is the covariance not associated with the gain, which for simplicity we assume to be stimulus independent. The component of the cross-neuronal covariance matrix Σ induced by gain fluctuations is proportional to the outer product of the tuning curves, f(s)fᵀ(s). A deviation Δs ≡ s−s0 of the stimulus from the uninformative stimulus s0 produces a change Δf=f′(s0)Δs in the population firing rates, which affects the variability of the responses, the variability of the decoder, and their covariance, which all vary with Δs. Because the variance of the decoder var d=wᵀΣ(s)w and the covariance cov(ri,d)=(Σ(s)w)i both depend on the concrete form of the read-out weights, the effect of gain-induced stimulus dependencies on the CP is specific for each decoder. Under the assumption of an optimal linear decoder at the decision boundary s0 (w ∝ Σ⁻¹f′(s0)), we obtain an approximation of the CC dependence on the stimulus deviation Δs from s0 (see Methods for details):

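$$\mathrm{CC}_i(p_{CR}) = \mathrm{CC}_i(p_{CR}=0.5) + \sigma_G \sqrt{\lambda_i}\, \left[1 - \mathrm{CC}_i^2(p_{CR}=0.5)\right] \frac{\Delta s}{\sqrt{\mathrm{var}\,d}}, \tag{11}$$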

where the slope is determined by the coefficient βpCR=σG√λi[1−CCi²(pCR=0.5)], with λi being the fraction of the variance of cell i caused by the gain fluctuations (Methods). The choice rate pCR is determined by the stimulus Δs as characterized by the psychometric function. For this form of the slope coefficient βpCR obtained with an optimal decoder, all the factors contributing to it are positive (Figure 2C). In Appendix 4 we further analytically describe how gain fluctuations introduce CP stimulus dependencies not only for an optimal decoder, but also for any unbiased decoder. In contrast to the factor h(pCR), the pattern of CP(pCR) profiles produced by the gain fluctuations is cell-specific, with a stronger asymmetric component for cells with higher λi (Figure 2C). Furthermore, while the sign of the multiplicative modulation h(pCR) changes when CC>0 or CC<0, the gain-induced contribution in Equation 11 is additive. As seen in Figure 2C, for cells with a weak activity-choice covariation for uninformative stimuli (CP(pCR=0.5) close to 0.5), this implies that the CP of a neuron can actually change from below 0.5 to above 0.5 across the stimulus range presented in the experiment.
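The gain-induced dependence of the choice correlation can be illustrated with a toy population. The sketch below is built on simplifying assumptions that are ours, not the paper's: linear tuning curves with a common baseline and slopes that sum to zero, a diagonal gain-independent covariance Σ̄ with equal variances, and weights normalized so that wᵀf′(s0)=1. Under these assumptions the weights are proportional to Σ⁻¹f′(s0), the exact CC follows from Equations 9 and 10, and it can be compared with a first-order expression of the form of Equation 11 in which the slope contains √λi = σG fi(s0)/σri. All names are hypothetical.

```python
from math import sqrt


def gain_model_cc(i, delta_s, slopes, baseline, noise_var, gain_sd):
    """Exact CC_i from Equations 9-10 for the toy population (assumptions above)."""
    norm = sum(a * a for a in slopes)
    w = [a / norm for a in slopes]                # w proportional to f'(s0), w.f' = 1
    f = [baseline + a * delta_s for a in slopes]  # tuning curves at s0 + delta_s
    wf = sum(wj * fj for wj, fj in zip(w, f))     # w^T f(s)
    # Sigma(s) = noise_var * I + gain_sd^2 * f f^T  (Equation 10)
    cov_rd = noise_var * w[i] + gain_sd ** 2 * f[i] * wf
    var_d = noise_var * sum(wj * wj for wj in w) + gain_sd ** 2 * wf ** 2
    var_r = noise_var + gain_sd ** 2 * f[i] ** 2
    return cov_rd / sqrt(var_r * var_d)


def gain_model_cc_linear(i, delta_s, slopes, baseline, noise_var, gain_sd):
    """First-order approximation with the slope coefficient of Equation 11."""
    cc0 = gain_model_cc(i, 0.0, slopes, baseline, noise_var, gain_sd)
    var_r0 = noise_var + (gain_sd * baseline) ** 2
    lam = (gain_sd * baseline) ** 2 / var_r0      # gain fraction of var(r_i)
    var_d0 = noise_var / sum(a * a for a in slopes)
    return cc0 + gain_sd * sqrt(lam) * (1.0 - cc0 ** 2) * delta_s / sqrt(var_d0)


slopes = [1.0, -1.0] * 5  # balanced population, slopes sum to zero
args = dict(slopes=slopes, baseline=10.0, noise_var=10.0, gain_sd=0.1)
for i in (0, 1):          # one cell with positive and one with negative CC
    print(i, gain_model_cc(i, 0.5, **args), gain_model_cc_linear(i, 0.5, **args))
```

In this toy case the shift in CC has the same sign for cells with positive and negative choice correlation, which is the additive behavior described in the text.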

Stimulus dependencies of choice-related signals in the responses of MT cells

In the light of our findings above, we re-analyzed the classic Britten et al., 1996 data containing responses of neurons in area MT in a coarse motion direction discrimination task (see Methods for a description of the data set). Our objective is to identify any patterns of CP dependence on the choice rate/stimulus level. First, we describe our results testing for the threshold-induced CP stimulus dependence, h(pCR), and then more generally we characterize the CP(pCR) patterns found in the data using clustering analysis. Finally, as an alternative to CP analysis, we show how to extend Generalized Linear Models (GLMs) of neural activity to include stimulus-choice interaction terms that incorporate the stimulus dependencies of activity-choice covariations derived with our theoretical approach and found above in the MT data.

Testing the presence of a threshold-induced CP stimulus dependence in experimental data

We start by describing how to analyze within-cell CP(pCR) profiles to test the existence of the threshold-induced modulation. The theoretically derived properties of h(pCR) suggest several empirical signatures that will be reflected in the within-cell CP(pCR) profiles. First, because h(pCR) introduces a multiplicative modulation of the choice correlation, for informative stimuli it leads to an increase of the CP for cells with positive choice correlation (CP>0.5) and to a decrease for cells with negative choice correlation (CP<0.5). Second, because h(pCR) is multiplicative, the absolute magnitude of the modulation will be higher for cells with stronger choice correlation, that is, CPs most different from 0.5. Third, the effect of h(pCR) is strongest when one choice dominates and hence most noticeable for highly informative stimuli.

These properties of h(pCR) indicate that, to detect this modulation, it is necessary to examine within-cell CP(pCR) profiles isolated from across-cells heterogeneity in the magnitude of the CP values. Ideally, we would like to calculate a CP(pCR) profile for each cell and analyze the shape of these single-cell profiles. However, given the available number of trials, estimates of CP(pCR) profiles for single cells are expected to be noisy. The estimation error of the CP is higher when pCR is close to 0 or 1, the same pCR values for which the h(pCR) modulation would be most noticeable. The standard error of the CP estimate, ĈP, can be approximated as SEM(ĈP) ≈ 1/√(12 K pCR(1−pCR)) (Bamber, 1975; Hanley and McNeil, 1982; see Methods), where K is the number of trials. In the Britten et al. data set the number of trials varies for different stimulus levels, and most frequently K=30 for highly informative stimuli. In that case, for pCR=0.9, only three trials for choice D=-1 are expected, and SEM(ĈP) ≈ 0.18. As can be seen from Figure 2A, this error surpasses the order of magnitude of the CP modulations expected from h(pCR). This means that we need to combine CP estimates of adjacent pCR values, and/or combine estimated CP(pCR) profiles across neurons, to reduce the standard error (see Appendix 1 for a detailed analysis of the statistical power for the detection of h(pCR)).
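The numbers quoted above follow directly from the standard-error approximation. A short sketch (function names are our own) that reproduces the K=30, pCR=0.9 case:

```python
from math import sqrt


def cp_standard_error(n_trials, p_cr):
    """Approximate SEM of a CP estimate: 1 / sqrt(12 K p_cr (1 - p_cr))."""
    return 1.0 / sqrt(12.0 * n_trials * p_cr * (1.0 - p_cr))


def expected_minority_trials(n_trials, p_cr):
    """Expected number of trials for the less frequent choice."""
    return n_trials * min(p_cr, 1.0 - p_cr)


print(expected_minority_trials(30, 0.9))     # about 3 trials
print(round(cp_standard_error(30, 0.9), 2))  # 0.18
print(round(cp_standard_error(30, 0.5), 2))  # 0.11
```

The same number of trials gives a noticeably smaller error at pCR=0.5, which is why the most informative stimulus levels are the hardest ones for estimating the CP.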

When averaging CPs across neurons, two considerations are important. First, cells that for pCR=0.5 have a CP higher or lower than 0.5 should be separated, given that the sign of the CC leads to an inversion of the profile resulting from h(pCR) (Equation 7). If not separated, the h(pCR)-dependence would average out, or the average CP(pCR) profile would reflect the proportion of cells with CPs higher or lower than 0.5 in the data set. Second, the average should correspond to an average -across cells- of within-cell CP(pCR) profiles, and hence it should only include cells for which a full CP(pCR) profile can be calculated. This is important because for each cell i the h(pCR) modulation is relative to the value of CPi(pCR=0.5). If a different subset of cells was included in the average of the CP at each pCR value, the resulting shape across pCR values of the averaged CPs would not be an average of within-cell CP(pCR) profiles. Conversely, in that case, the resulting shape would reflect the heterogeneity in the magnitude of the CP values across the subsets of cells averaged at each pCR value. In the single-cell recordings from Britten et al., the range of stimulus levels used varies across neurons, and for a substantial part of the cells a full CP(pCR) profile cannot be constructed. Following the second consideration, those cells were excluded from the analysis to avoid that they only contributed to the average at certain pCR values.

We derived the following refined procedure to analyze CP(pCR) profiles. As a first step, we constructed a CP(pCR) profile for each cell. First, for each cell and each stimulus coherence level we calculated a CP estimate if at least four trials were available for each decision. For the experimental data set, CPs are always estimated from their definition (Equation 1), and we will only use the theoretical expression of h(pCR) to fit the modulation of the experimentally estimated CP(pCR) profiles. Second, as a first way to improve the CP estimates, we binned pCR values into five bins and assigned stimulus coherence levels to the bins according to the psychometric function that maps stimulus levels to pCR, with the central bin containing the trials from the zero-signal stimulus. A single CP value per bin for each cell was then obtained as a weighted average of the CPs from stimulus levels assigned to each bin. The weights were calculated as inversely proportional to the standard error of the estimates, giving more weight to the most reliable CPs (see Methods). The results that we present hereafter are all robust to the selection of the minimum number of trials and the binning intervals. Unless otherwise stated, in all following analyses we included all the cells (N=107) for which we had data to compute CPs in all five bins, thus allowing us to estimate a full within-cell CP(pCR) profile. As a second step, we averaged the within-cell CP(pCR) profiles across cells, taking into account the two considerations above. As before, averages were weighted by inverse estimation errors.
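The binning step of this procedure can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the bin edges are placeholders chosen only for the example (the actual intervals are specified in Methods), the standard error uses the approximation 1/√(12 K pCR(1−pCR)), and the input is assumed to contain only stimulus levels that already passed the minimum-trials criterion, with pCR strictly between 0 and 1.

```python
from math import sqrt


def cp_standard_error(n_trials, p_cr):
    """Approximate SEM of a CP estimate from n_trials trials at choice rate p_cr."""
    return 1.0 / sqrt(12.0 * n_trials * p_cr * (1.0 - p_cr))


def binned_cp_profile(levels, bin_edges):
    """Inverse-SEM weighted average of CP estimates within each p_cr bin.

    levels: iterable of (p_cr, cp_estimate, n_trials), one per stimulus level.
    bin_edges: ascending inner edges; len(bin_edges) + 1 bins are produced.
    Returns one value per bin, or None for a bin without any stimulus level.
    """
    n_bins = len(bin_edges) + 1
    num = [0.0] * n_bins
    den = [0.0] * n_bins
    for p_cr, cp, n_trials in levels:
        b = sum(p_cr > edge for edge in bin_edges)  # index of the bin holding p_cr
        weight = 1.0 / cp_standard_error(n_trials, p_cr)
        num[b] += weight * cp
        den[b] += weight
    return [n / d if d > 0.0 else None for n, d in zip(num, den)]


# Hypothetical cell: (p_cr, CP estimate, trials) for three stimulus levels.
levels = [(0.5, 0.55, 60), (0.10, 0.60, 30), (0.15, 0.70, 30)]
edges = [0.2, 0.4, 0.6, 0.8]  # placeholder edges defining five bins
print(binned_cp_profile(levels, edges))
```

A cell would enter the across-cell average only if none of its five bins is empty, matching the requirement of a full within-cell profile.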

Figure 3A shows the averaged CP(pCR) profiles. To assess the statistical significance of the CP dependence on pCR, we developed a surrogates method to test whether a pattern consistent with the predicted CP-increase for informative stimuli could appear under the null hypothesis that the CP has a constant value independent of pCR (see Methods). For the cells with average CP higher than 0.5, we found that the modulation of the CP was significant (p=0.0006), with higher CPs obtained for pCR close to 0 or 1 in agreement with the model. For cells with average CP lower than 0.5, the modulation was not significant (p=0.26). While the actual absence of a modulation would imply that the choice-related signals in these neurons are not mediated by a continuous intermediate decision-variable but may be, for example, due to categorical feedback, we point out the lower power of this statistical test due to fewer neurons being in the CP<0.5 group and the expected effect size being lower, too. First, there were 74 cells with CP higher than 0.5 but only 33 with CP lower than 0.5, meaning that the estimation error is larger for the average CP(pCR) profile of the cells with CP<0.5. Second, as the modulation predicted by h(pCR) is multiplicative, its impact is expected to be smaller when the magnitude of CP-0.5 is smaller. Figure 3A shows that CP values are on average closer to 0.5 for the cells with CP<0.5, in agreement with Figure 5 of Britten et al., 1996. This means that fewer cells classified in the group with CP<0.5 have choice-related responses. Therefore, the fact that we cannot validate the prediction of an inverted symmetric h(pCR) modulation for the cells with CP<0.5 with respect to the cells with CP>0.5 is not strong evidence against the existence of a threshold-induced CP stimulus dependence. We further confirmed the robustness of the results in a wider set of cells. For this purpose, we repeated the analysis forming subsets separately including cells with a computable CP for the three bins with pCR lower than or equal to 0.5, and the three with pCR higher than or equal to 0.5. Also in this case the observed CP(pCR) pattern was significant (p=0.0013) for cells with average CP higher than 0.5 (Figure 3B, N=171), and non-significant for cells with CP lower than 0.5 (p=0.20).

Figure 3. Choice probability as a function of the choice rate for MT cells during a motion direction discrimination task (Britten et al., 1996).

(a) Average CP as a function of pCR ≡ p(D=1). The average across N=107 cells was calculated separately for cells with average CP higher or lower than 0.5. Dotted curves reflect the relationship predicted by the factor h(pCR) (Equation 6). Significance of the stimulus dependencies was evaluated against the null hypothesis of a constant CP value using surrogate data (see Methods). (b) Same analysis but with a less strict inclusion criterion (see main text). (c) CP(pCR) profile for four example cells with average CP lower and higher than 0.5, respectively. (d) Standard errors of the estimated CP for the example cells as a function of pCR.

Interestingly, the identified significant CP(pCR) dependence for the cells with CP>0.5 goes beyond the symmetric threshold-induced shape predicted by h(pCR), both in magnitude and shape (Figure 2A), since the increase is bigger for pCR values close to 1 than to 0. This implies that the choice correlation for each neuron, CCi(pCR), must systematically change with pCR as well, contributing to the overall CP stimulus dependency observed. In particular, the observed average CP(pCR) profile indicates that the CP increase appears to be higher for pCR>0.5. The finding of this asymmetry is consistent with results reported in Britten et al., 1996, who found a significant but modest effect of coherence direction on the CP (see their Figure 3). By experimental design, the direction of the dots corresponding to choice D=1 was tuned for each cell separately to coincide with their most responsive direction. The asymmetry therefore indicates that CPs tend to increase more when the stimulus provides evidence for the direction eliciting a higher response. However, Britten et al., 1996 found no significant relation between the global magnitude of the firing rate and the CP (see their Figure 3), and we confirmed this lack of relation specifically for the subset of N=107 cells (no significant correlation coefficient between average rate and average CP values, p=0.33). This eliminates the possibility that higher CPs for pCR>0.5 are due only to higher responses, and suggests a richer underlying structure of CP(pCR) patterns, which we will investigate next using cluster analysis to identify the predominant patterns shared by the within-cell CP(pCR) profiles.

Characterizing the experimental patterns of CP stimulus dependencies with cluster analysis

We carried out unsupervised k-means clustering (Bishop, 2006) to examine the patterns of CP(pCR) without a priori assumptions about a modulation h(pCR) associated with the threshold effect. Clustering was performed on CP(pCR)-0.5, with each cell represented as a vector in a five-dimensional space, where five is the number of pCR bins used to summarize the data as described above. To consider both the shape and sign of the modulation, distances between neurons were calculated with the cosine distance between their CP(pCR) profiles (one minus the cosine of the angle between the two vectors). Clustering was performed for a range of specified numbers of clusters. Specifying the existence of two clusters, we naturally recovered the distinction between cells with CP higher or lower than 0.5 (Figure 4A). The statistical significance of any pCR-modulation was again assessed by constructing surrogate CP(pCR) profiles and repeating the clustering analysis on those surrogates. As before, a significant dependence of the CP on pCR was found only for the cluster associated with CP higher than 0.5 (p=0.0007 for CP>0.5 and p=0.21 for CP<0.5).
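A minimal sketch of k-means with the cosine distance on CP(pCR)-0.5 is given below. It is not the implementation used for the analysis: the deterministic farthest-point initialization, the function names, and the assumption that no profile is exactly 0.5 in all bins (which would have zero norm) are all illustrative choices.

```python
import numpy as np

def _init_centers(x, k):
    """Deterministic start: first profile, then the profiles least
    similar (in cosine similarity) to the centers chosen so far."""
    centers = [x[0]]
    for _ in range(1, k):
        similarity = np.max(x @ np.array(centers).T, axis=1)
        centers.append(x[np.argmin(similarity)])
    return np.array(centers)

def cosine_kmeans(cp_profiles, k, n_iter=100):
    """Cluster CP(pCR) - 0.5 profiles using the cosine distance
    (one minus the cosine of the angle between two profiles)."""
    x = np.asarray(cp_profiles, dtype=float) - 0.5
    # Unit-length rows: cosine similarity becomes a dot product.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    centers = _init_centers(x, k)
    labels = np.full(len(x), -1)
    for _ in range(n_iter):
        new_labels = np.argmax(x @ centers.T, axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
        for j in range(k):
            members = x[labels == j]
            if len(members):
                mean_direction = members.sum(axis=0)
                centers[j] = mean_direction / np.linalg.norm(mean_direction)
    return labels, centers

# Hypothetical profiles over five pCR bins (not data from the study).
example_profiles = np.array([
    [0.55, 0.53, 0.52, 0.53, 0.56],
    [0.60, 0.56, 0.55, 0.57, 0.62],
    [0.58, 0.55, 0.54, 0.55, 0.59],
    [0.45, 0.47, 0.48, 0.47, 0.44],
    [0.42, 0.45, 0.46, 0.44, 0.41],
    [0.44, 0.46, 0.47, 0.46, 0.43],
])
example_labels, example_centers = cosine_kmeans(example_profiles, k=2)
```

Because profiles are normalized before clustering, this distance is sensitive to the sign and shape of CP(pCR)-0.5 but not to its overall magnitude, which is why two clusters separate cells with CP above and below 0.5 in this toy example.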

Figure 4. Clustering analysis of choice probability as a function of pCR.

(a–b) CP as a function of pCR for clusters of the MT cells determined by k-means clustering. Each CP(pCR) profile corresponds to the center of a cluster. Significance of the modulation was quantified as in Figure 3. (a) Two clusters (Nc=2) for all cells. (b) Further subclustering of cells with average CP>0.5 into two subclusters. (c) Representation of the CP(pCR) profiles in a two-dimensional space spanned by the cluster means. The horizontal axis is defined by clusters 1 and 2 and closely aligned with CP-0.5. Vertical axis is defined as perpendicular to horizontal axis in the plane defined by the subcluster means. Colors correspond to the clusters of panel b, with blue and cyan further indicating subclusters of cells with average CP<0.5 (see Appendix 3—figure 1A). (d) Space defined by projection onto two templates: a constant relationship (x-axis) representing the magnitude of CP-0.5, and a monotonic relationship with slope 1 (y-axis) representing CP asymmetry. Colors correspond to the clusters of panel b and numbers indicate example cells shown in Figure 3C. (e) Modeling the influence of neuronal gain modulation on CP(pCR) profiles. CP(pCR) profiles for different combinations of strength of the gain fluctuations, σG², and the choice correlation that would be obtained for the uninformative stimulus s0 with no gain fluctuations, CCi0(s0). We display CP(pCR) for four values of CCi0(s0) (curves vertically separated) and two values of σG² (solid vs dashed). Each curve corresponds to a point in the two-dimensional space defined by the symmetric and asymmetric templates introduced in panel d. See Methods for model details.

As mentioned above, the divergence from h(pCR) of the average CP(pCR) profile for cells with CP>0.5 suggests that cell-specific modulations are introduced through CCi(pCR). While the variability of individual CPi(pCR) profiles (Figure 3C) is expected to reflect substantially the high estimation errors of the CP estimates for the single cells (Figure 3D), the presence of subclusters can identify CP(pCR) patterns common across cells.

We proceed to examine subclusters within the CP>0.5 cluster with a significant CP(pCR) profile, excluding from our analysis cells within the CP<0.5 cluster (analogous results were found when increasing the number of clusters in a nonhierarchical way, without a priori excluding these cells, see Appendix 3—figure 1A). Average CP(pCR) profiles obtained when inferring two subclusters of cells with CP>0.5 are shown in Figure 4B. For both subclusters the CP(pCR) dependence is significant (p=0.0008 for cluster 2 and p=0.0026 for cluster 3, respectively, in Figure 4B). The larger cluster has a more symmetric shape of dependence on pCR, with an increase of CP in both directions when the stimulus is informative, consistent with the prediction of a threshold-induced CP stimulus dependence h(pCR). For the smaller cluster the dependence is asymmetric, with a CP increase when the stimulus direction is consistent with the preferred direction of the cells and a decrease in the opposite direction. We verified that no significant difference exists between the firing rates of the cells in the two subclusters (Wilcoxon rank-sum test, p=0.23). The monotonic shape of the second subcluster mirrors the dependency produced by response gain fluctuations as predicted by the gain model described above. This suggests that the neurons in this subcluster differ from the neurons in the other subcluster by a substantially larger gain-induced variability, a testable prediction for future experiments that is further discussed below.

Introducing a second cluster allows for representing each neuron’s CP(pCR)-dependency in the two-dimensional space (Figure 4C) spanned by the mean profiles for each of the three clusters. The horizontal axis corresponds to the separation between the two initial clusters, and is closely aligned to the departure of the average CP from 0.5. The vertical axis is defined by the vectors corresponding to the centers of the two subclusters and hence is determined separately for the cells with average CP higher and lower than 0.5 (see Methods for details, and Appendix 3—figure 1A). The vertical axis is associated with the degree to which the CP(pCR) dependence is symmetric or asymmetric with respect to pCR=0.5. Cells for which the CP increases consistently with their preferred direction of motion coherence lie on the upper half-plane. To further support this interpretation of the axis, we repeated the clustering procedure replacing the nonparametric k-means procedure with a parametric procedure that defines the subclusters with a symmetric and an asymmetric template, respectively. The data are distributed approximately equally in both spaces (Figure 4C–D).

Similar results were also obtained when increasing the number of clusters non-hierarchically. Introducing a third cluster for all cells leaves almost unaltered the cluster of cells with CP lower than 0.5 (Appendix 3—figure 1B). The cluster of cells with CP higher than 0.5 splits into two subclusters analogous to the ones found from cells with CP higher than 0.5 alone. The distinction between cells with more symmetric and asymmetric CP(pCR) dependencies is robust to the selection of a larger number of clusters, that is, clusters with this type of dependencies remain large when allowing for the discrimination of more patterns (Appendix 3—figure 1C). However, we do not mean to claim that the variety of CP(pCR) profiles across cells can be reduced to three separable clusters. As reflected in the distributions in Figure 4C–D, the clusters are not neatly separable. Indeed, a richer variety of profiles would be expected if the properties of CP(pCR) profiles across cells were associated with their tuning properties and the structure of feedback projections, as we further argue in the Discussion. The predominance of a symmetric and asymmetric pattern would only reflect which are the predominant CP(pCR) shapes shared across cells.

This clustering analysis confirms the presence of shared patterns of CP stimulus-dependence across cells, whose shape is compatible with the analytical predictions from the threshold- and gain-related dependencies. The symmetric component of CP stimulus dependence is congruent with h(pCR) (Equation 6), albeit with a larger magnitude than predicted (Figures 2A and 3A, and additional analysis of the statistical power in Appendix 1). This stronger modulation suggests an additional symmetric contribution of the choice correlation CC(pCR) and/or a dynamic feedback reinforcing the stronger modulation for highly informative stimuli. However, while the cluster analysis separates the predominant CP(pCR) patterns, the Britten et al. data lacks the statistical power to further distinguish between h(pCR) and symmetric CC(pCR) contributions with a similar shape.

Gain-induced CP stimulus dependencies in the MT responses

Three key features of the CP(pCR) dependencies observed for the MT cells are qualitatively explained by introducing shared gain fluctuations in the decision threshold model described above (Figure 4E), the first two manifesting themselves on the population (cluster) level and the third one on an individual neuron level. First, a shared gain variability predicts the existence of the asymmetric CP stimulus dependence seen in cluster 3 (Equation 11 and Figure 2C). Second, the average CP of the asymmetric cluster 3 is lower than the average CP of the symmetric cluster 2 (compare red and green profiles in Figure 4B+E). Third, if gain variability is indeed a driving factor for the observed asymmetry in cluster 3, then within this cluster, neurons with a higher amount of gain variability should also have a steeper CP(pCR) profile, a prediction we could confirm as described in the next paragraph.

In order to test this prediction, for each neuron in cluster 3, we first computed the degree of asymmetry of its CP(pCR) profile from the data directly, by simply fitting a quadratic function to CP(pCR) (Methods). Next, and independently of this, we used the method of Goris et al., 2014 to estimate the amount of gain variability for each neuron. Knowing each neuron’s gain variability allowed us to predict each neuron’s degree of asymmetry (slope of CP(pCR) as determined by βpCR, using Equation 11). We indeed found a significant correlation between the predicted and the observed slopes (r=0.58, p=0.0018) supporting the conclusion that shared gain variability underlies the observed asymmetric shape of CP(pCR) for the neurons in cluster 3. For cluster 2, in which the symmetric pattern is predominant, no analogous correlation was found (r=0.15, p=0.35). It is important to note that the asymmetry predicted by the gain variability overestimates the actually observed one by an order of magnitude (average observed slope of 0.002±0.0003 compared to an average predicted slope of 0.034±0.008). However, this is not surprising given our simplifying assumption of a single global gain factor across the whole population whereas in practice the gain fluctuations are likely inhomogeneous across the population. Furthermore, the actual read-out used by the brain may deviate from the optimal one, further reducing the expected match between predictions and observations. A more precise modeling of CP–stimulus dependencies would require measurements of the cross-neuronal correlation structure, which is not available from the single unit recordings of Britten et al., 1996 but will be available from future population recordings.
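The observed slope can be obtained from a quadratic fit as in the following sketch. It assumes that the degree of asymmetry is the linear coefficient of a quadratic in pCR-0.5 and that observed and predicted slopes are compared with a Pearson correlation. The exact parameterization is given in Methods and may differ, and the function names are illustrative.

```python
import numpy as np

def asymmetry_slope(cp_profile, pcr_centers):
    """Linear coefficient of a quadratic fit of CP against pCR - 0.5.

    The quadratic term absorbs the symmetric modulation around
    pCR = 0.5, so the linear term measures the asymmetry.
    """
    p = np.asarray(pcr_centers, dtype=float) - 0.5
    cp = np.asarray(cp_profile, dtype=float)
    # np.polyfit returns coefficients from highest to lowest degree.
    quadratic, slope, offset = np.polyfit(p, cp, deg=2)
    return float(slope)

def pearson_r(observed, predicted):
    """Pearson correlation between observed and predicted slopes."""
    return float(np.corrcoef(observed, predicted)[0, 1])

# Hypothetical profile: symmetric modulation plus a small tilt.
pcr = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
example_profile = 0.55 + 0.02 * (pcr - 0.5) + 0.3 * (pcr - 0.5) ** 2
example_slope = asymmetry_slope(example_profile, pcr)
```

In this example the fit recovers the tilt of 0.02 regardless of the size of the symmetric component, which is the property needed to compare slopes across cells with different threshold-induced modulations.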

Modeling stimulus-dependent choice-related signals with GLMs

The implications of a stimulus-dependent relationship between the behavioral choice and sensory neural responses are not restricted to measuring them as CPs, for which activity-choice covariations are quantified without incorporating other explanatory factors of neural responses. To further substantiate the existence of this stimulus-dependent relationship in MT data, and to understand how our model predictions could help to refine other analytical approaches, we examined how representing that relationship can improve statistical models of neural responses. In particular, we study how the stimulus-dependent choice-related signals that we discovered may inform the refinement of Generalized Linear Models (GLMs) of neural responses (Truccolo et al., 2005; Pillow et al., 2008). In the last few years, GLMs have been used for modelling choice dependencies together with the dependence on other explanatory variables, such as the external stimulus, response memory, or interactions across neurons (Park et al., 2014; Runyan et al., 2017). Typically, in a GLM of firing rates each explanatory variable contributes with a multiplicative factor that modulates the mean of a Poisson process. In their classical implementation, the choice modulates the firing rate as a binary gain factor, with a different gain for each of the two choices (Park et al., 2014; Runyan et al., 2017; Pinto et al., 2019). The multiplicative nature of this factor already introduces some covariation between the impact of the choice on the rate and the one of the other explanatory variables. However, using a single regression coefficient to model the effect of the choice on the neural responses may be insufficient if choice-related signals are stimulus dependent, as suggested by our theoretical and experimental analysis.

We developed a GLM (see Methods) that can model stimulus-dependencies of choice signals (or, in other words, stimulus-choice interactions) by including multiple choice-related predictors that allow for a different strength of dependence of the firing rate on the choice for different subsets of stimulus levels (via the choice rate, pCR). We fitted this model, which we call the stimulus-dependent-choice GLM, to MT data and we compared its cross-validated performance against two traditional GLMs. In the first type, called the stimulus-only GLM, the rate in each trial is predicted only based on the external stimulus level. In the second type, which we call the stimulus-independent-choice GLM and which corresponds to the traditional way to include choice signals in a GLM (Park et al., 2014; Runyan et al., 2017; Scott et al., 2017; Pinto et al., 2019; Minderer et al., 2019), the effect of choice is additionally included, but using only a single, stimulus-independent choice predictor.
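The structure of a GLM with stimulus-choice interactions can be sketched as follows. This is an illustration under several assumptions that are not taken from Methods: a Poisson GLM with a log link, a single linear stimulus term, the choice coded as 0 or 1, one choice coefficient per pCR bin, and a plain Newton fit without regularization. All names are hypothetical.

```python
import numpy as np

def choice_design_matrix(stimulus, choice, pcr_bin, n_bins):
    """Columns: intercept, stimulus, and one choice predictor per pCR bin.

    The choice predictor of bin b equals the choice (0 or 1) on trials
    whose stimulus level belongs to bin b, and 0 on all other trials.
    """
    stimulus = np.asarray(stimulus, dtype=float)
    choice = np.asarray(choice, dtype=float)
    pcr_bin = np.asarray(pcr_bin, dtype=int)
    n_trials = len(stimulus)
    interaction = np.zeros((n_trials, n_bins))
    interaction[np.arange(n_trials), pcr_bin] = choice
    return np.column_stack([np.ones(n_trials), stimulus, interaction])

def fit_poisson_glm(X, y, n_iter=50, tol=1e-10):
    """Maximum-likelihood Poisson GLM (log link) by Newton's method.
    Assumes column 0 of X is the intercept and X has full column rank."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())  # start from the mean rate
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        gradient = X.T @ (y - mu)
        hessian = X.T @ (X * mu[:, None])
        step = np.linalg.solve(hessian, gradient)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Toy check: three stimulus levels, one pCR bin each, both choices.
stimulus = np.repeat([-1.0, 0.0, 1.0], 4)
pcr_bin = np.repeat([0, 1, 2], 4)
choice = np.tile([0.0, 1.0], 6)
X = choice_design_matrix(stimulus, choice, pcr_bin, n_bins=3)
beta_true = np.array([1.0, 0.4, 0.3, 0.0, -0.2])
expected_counts = np.exp(X @ beta_true)  # noiseless responses
beta_hat = fit_poisson_glm(X, expected_counts)
```

In this sketch the stimulus-independent-choice GLM corresponds to the special case of a single bin (n_bins=1 with all trials assigned to bin 0), and the stimulus-only GLM to dropping the choice columns altogether. The toy coefficients change sign across bins, a pattern that a single choice coefficient cannot represent.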

To compare the models, we separated the trials recorded from each MT cell (Britten et al., 1996) into training and testing sets, and calculated the average cross-validated likelihood for each type of model on the held-out testing set. To quantify the increase in predictability when adding the choice as a predictor we defined the relative increase in likelihood (RIL) as the increase in likelihood from further adding the choice as a predictor, relative to the increase from previously adding the stimulus as a predictor. RIL measures the relative influence of the choice and the sensory input in the neural responses. Figure 5A compares the cross-validated RIL values obtained on MT neural data when fitting either the stimulus-independent-choice or the stimulus-dependent-choice GLMs. We found that RIL values were mostly higher when allowing for multiple choice parameters, both in terms of average RIL values (Figure 5C) and in terms of the proportion of cells in each cluster for which the RIL was higher than a certain threshold, here selected to be at 10% (Figure 5B).
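One way to write the RIL is sketched below. The sketch assumes that the increase from adding the stimulus is measured against a null model without stimulus or choice predictors, and that likelihoods are compared as held-out Poisson log-likelihoods averaged over trials. Both choices, and the function names, are assumptions for illustration (see Methods for the actual definition).

```python
import numpy as np
from math import lgamma

def poisson_loglik(counts, rates):
    """Average Poisson log-likelihood of held-out spike counts
    given the rates predicted by a fitted model."""
    counts = np.asarray(counts, dtype=float)
    rates = np.asarray(rates, dtype=float)
    log_factorial = np.array([lgamma(c + 1.0) for c in counts])
    return float(np.mean(counts * np.log(rates) - rates - log_factorial))

def relative_increase_in_likelihood(ll_null, ll_stimulus, ll_choice):
    """Gain from adding the choice, relative to the gain from
    previously adding the stimulus (assumed baseline: null model)."""
    return (ll_choice - ll_stimulus) / (ll_stimulus - ll_null)

# Hypothetical held-out log-likelihoods for one cell.
example_ril = relative_increase_in_likelihood(-2.0, -1.5, -1.4)
```

With these hypothetical values the choice adds one fifth of the improvement contributed by the stimulus, so the cell would exceed the 10% threshold used in Figure 5B.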

Figure 5. Modeling stimulus-dependent choice-related signals with GLMs.

(a) Scatter plot of the cross-validated relative increase in likelihood (RIL), with respect to a stimulus-only model, of the stimulus-dependent-choice GLMs (multiple choice parameters) versus the stimulus-independent-choice GLMs (a single choice parameter). (b) Proportion of cells with RIL>0.1 for the two types of models, grouped by the clusters as in Figure 4B. Cells not included in the set of 107 cells for which a CP value could be estimated for each bin of pCR are labeled as ‘Others’. (c) Average RIL values, grouped as in b. (d) CP(pCR) profiles of the three cells with the highest RIL in the stimulus-dependent-choice GLMs, as numbered in panel a.

GLMs that include stimulus-choice interaction terms can be used not only to better describe the firing rate of neural responses, but also to identify more precisely the neurons or areas by their choice signals. To illustrate this point, we show how adding the interaction terms may change the relative comparison of cells by their RIL values. Consider the three neurons with highest RIL for the stimulus-dependent-choice GLM (Figure 5A, and with corresponding CP(pCR) profiles shown in Figure 5D). The ranking of cells 1 and 2 by RIL flips with respect to the stimulus-independent-choice GLM because of the higher CP(pCR) modulation of cell 2. Similarly, while the RIL with multiple choice parameters for cells 1 and 3 are close, the RIL of cell 3 is substantially lower with a single choice parameter, indicating that its pattern of stimulus dependence is less well captured by a single parameter. The degree to which a model with interaction terms improves the predictability will depend on the shape of the CP(pCR) patterns, which themselves are expected to vary across areas or across cells with different tuning properties. For example, we see in Figure 5C that for the cluster with an asymmetric CP(pCR) profile (cluster 3), the average RIL with only one choice parameter suggests that cells of this type are not choice driven. The reason is that for the cells in this cluster the sign of the choice influence on the rate can be stimulus dependent, which is impossible to model by a single choice parameter. Furthermore, the profile of the GLM choice parameters across stimulus levels provides a characterization of stimulus-dependent choice-related signals analogous to the CP(pCR) profile, in this case within the GLM framework, hence allowing efficient inference including principled regularization and the ability to account for a range of factors beyond choices and stimuli. Overall, we expect that accounting for stimulus-choice interactions in GLMs will allow for a more accurate assessment of the relative importance of stimulus and choice on neural responses.

Discussion

Our work makes several contributions to the understanding of how choice and stimulus signals in neural activity are coupled. The first is that we derived a general analytical model of perceptual decision-making predicting how the relationship between sensory responses and choice should depend on stimulus strength, regardless of whether this relationship is due to feedforward or feedback choice-related signals. The key model assumption is that the link between sensory responses and choices is mediated by a continuous decision variable and a thresholding mechanism. Second, we designed new, more powerful methods to measure within-cell dependencies of choice probabilities (CPs) on stimulus strength. Third, we studied CP stimulus dependencies in the classic dataset by Britten et al., 1996. Interestingly, we found a rich and previously unknown structure in how CPs in MT neurons depend on stimulus strength. In addition to a symmetric dependence predicted by the thresholding operation, we found an asymmetric dependence which we could explain by incorporating previously proposed gain fluctuations (Goris et al., 2014) in our model, thereby introducing a stimulus-dependent component in the cross-neuronal covariance. Finally, we showed that generalized linear models (GLMs) that account for stimulus-choice interactions better explain sensory responses in MT and allow for a more accurate characterization of how stimulus-driven and how choice-driven a cell’s response is.

Advances on analytical solutions of choice probabilities

Previous work has demonstrated that analytically solving models of perceptual decision-making can lead to important new insights on the interpretation of the relationship between neural activity and choice in terms of decision-making computations (Bogacz et al., 2006; Gold and Shadlen, 2007; Haefner et al., 2013). In particular, previous analytical work on CPs has shown how experimentally measured CPs relate to the read-out weights by which sensory neurons contribute to the internal stimulus decoder in a feedforward model, assuming both choices are equally likely (Haefner et al., 2013; Pitkow et al., 2015). Here, we provided a general analytical solution of CPs in a more general model, with informative stimuli resulting in an unbalanced choice rate, and valid both for feedforward and feedback choice signals. We derived the analytical dependency of CP on the probability of one of the choices (pCR ≡ p(choice=1)), which mediates the dependence of the CP on the stimulus strength. Our model is therefore directly applicable to both discrimination and detection tasks, for any stimulus strength that elicits both choices. As we demonstrated, these advances in the analytical solution of the decision-threshold model allowed for detecting and interpreting stimulus dependencies of choice-related signals in neural activity.

Characterization of patterns of choice probability stimulus-dependencies from sensory neurons

Characterizing within-cell stimulus dependencies of activity-choice covariations at the population level requires isolating these dependencies from across-cells heterogeneity in the magnitude of the CP values. Our theoretical analysis suggests possible reasons why previous attempts failed to find stimulus dependencies of CPs in real neural data. First, the magnitude of the CP dependence on pCR is proportional to the magnitude of choice-related signals (i.e. to how different CPs are from 0.5). This implies that neuron-specific dependencies need to be characterized for each cell individually, relative to the CP obtained with the uninformative stimulus. Only neurons for which a full individual CP profile can be estimated should be averaged to determine stimulus dependencies at the population level, or otherwise the overall average CP profile of stimulus dependence will be dominated by variability associated with the different subsets of neurons contributing to the CP estimate at each stimulus level. Second, the threshold-induced predicted direction of CP dependence on pCR is different for neurons with CP larger or smaller than 0.5, that is, neurons more responsive to opposite choices. This opposite modulation can cancel out the magnitude of the overall threshold-induced dependence of the CP on stimulus strength when averaging over all neurons, as done in previous analyses (Britten et al., 1996). Informed by these insights we characterized the within-cell dependencies of choice-related signals on stimulus strength. The application of our refined methods to the classic neural data from MT neurons during a perceptual decision-making task of Britten et al., 1996 allowed us to find stimulus dependencies of CPs, while previous analyses had not detected a significant effect.

Our understanding of how CP-stimulus dependencies may arise within the decision-making process, and the methods we used to measure these dependencies in existing data, will allow future studies to perform more fine-grained analyses and interpret choice-related signals more appropriately. Traditional analyses computed a single CP value for each neuron by either concentrating on zero-signal trials (e.g. Dodd et al., 2001; Parker et al., 2002; Krug et al., 2004; Wimmer et al., 2015; Katz et al., 2016; Wasmuht et al., 2019) or calculating grand CPs (Britten et al., 1996) across stimulus levels (e.g. Cai and Padoa-Schioppa, 2014; Verhoef et al., 2015; Latimer et al., 2015; Pitkow et al., 2015; Smolyanskaya et al., 2015; Liu et al., 2016; Bondy et al., 2018). Grand CPs are calculated directly as a weighted average of the CPs estimated for each stimulus level, or by pooling the responses from trials of all stimulus levels, after subtracting an estimate of the stimulus-related component (Kang and Maunsell, 2012). Our theoretical CP analysis shows that the latter procedure also corresponds to a specific type of weighted average (Appendix 2). Using the individual CP values computed in this way for each cell, areas or populations were then often ranked in terms of their averaged CP values per neuron. Areas with higher CP values are then identified as areas key for decision-making (e.g. Nienborg and Cumming, 2006; Cai and Padoa-Schioppa, 2014; Pitkow et al., 2015; Yu et al., 2015).

However, if CPs depend on pCR, it is clear that a single grand CP value cannot summarize this dependence. The use of average single CPs may thus introduce confounds in their comparison and miss important cell-specific information. For example, CP(pCR) patterns with different sign for different pCR values will result in lower average CP values. Similarly, the comparison of the grand CP of a cell across tasks may mostly reflect changes in the sampling in each task of stimulus levels, leading to a change in how much the CP(s) associated with each stimulus level contributes to the grand CP. As a result, the change in the grand CP may be interpreted as indicating the existence of task-dependent choice-related signals, even if the CP(s) profile is invariant. In the same way, if the structure of CP(pCR) patterns covaries with the tuning properties, the comparison of the grand CP across cells with different tuning properties may mostly depend on the sampling of stimulus levels. This limitation is not specific to average CP values, and applies to other measures that consider choice-related and stimulus-driven components of the response as separable, such as partial correlations (e.g. Zaidel et al., 2017). Our work instead indicates that the shape of the CP(pCR) patterns cannot be summarized in the average, and this shape may be informative about the role of the activity-choice covariations, when comparing across cells with different tuning properties, cells from different areas, or across tasks (e.g. Romo and Salinas, 2003; Nienborg and Cumming, 2006; Nienborg et al., 2012; Krug et al., 2016; Sanayei et al., 2018; Shushruth et al., 2018; Jasper et al., 2019; Steinmetz et al., 2019). Our new methods allow quantifying these CP patterns to better characterize the covariations between neural activity and choice across neurons and populations.
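The confounds described above can be made concrete with a small numerical sketch. It assumes that the grand CP is a weighted average of per-level CPs with weights proportional to the number of trials at each stimulus level. The profiles and trial counts are hypothetical and are not taken from any data set.

```python
import numpy as np

def grand_cp(cp_per_level, trials_per_level):
    """Grand CP as a trial-weighted average of per-level CPs
    (the weighting scheme is an assumption of this sketch)."""
    cp = np.asarray(cp_per_level, dtype=float)
    n = np.asarray(trials_per_level, dtype=float)
    return float(np.sum(n * cp) / np.sum(n))

# The same symmetric CP profile sampled differently in two tasks.
profile = [0.62, 0.56, 0.54, 0.56, 0.62]
grand_cp_task_a = grand_cp(profile, [10, 10, 60, 10, 10])  # mostly zero-signal
grand_cp_task_b = grand_cp(profile, [40, 10, 0, 10, 40])   # mostly informative

# A profile that changes sign around 0.5, sampled uniformly.
grand_cp_sign_change = grand_cp([0.44, 0.47, 0.50, 0.53, 0.56], [20] * 5)
```

In this example the grand CP is about 0.56 in the first task and about 0.61 in the second although the underlying profile is identical, and the sign-changing profile yields a grand CP of 0.5 despite a clear choice-related signal at informative stimulus levels.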

A key novelty introduced in our study is the development of a model-inspired methodological procedure for identifying genuine within-cell CP(pCR) profiles that would otherwise be masked by across-cells heterogeneity in the magnitude of the CP values. As representative examples of how our procedure may find previously unnoticed patterns of CP dependencies, we discuss the previous analyses in Britten et al., 1996 and in Dodd et al., 2001. Britten et al., 1996 analyzed the dependence of the CP on the stimulus strength at the population level (see their Figure 3). In particular, for each stimulus level they averaged the CP of all cells for which an estimate of the CP was calculated, without separating cells with CP higher or lower than 0.5. Furthermore, in their data set, the stimulus levels vary across cells, and hence in their analysis different subsets of cells contribute to the CP average at each stimulus level. Dodd et al., 2001 presented a scatter plot of the CPs for all cells and stimulus levels (see their Figure 6). Although this analysis did not average cells with CP>0.5 and CP<0.5, in the scatter plot the cell-identity of each dot is not represented. This means that it is not possible to trace the within-cell CP(s) profiles. As in the case of Britten et al., also in Dodd et al., 2001 the sampled stimulus levels varied across cells, further confounding the within-cell CP(s) profiles with heterogeneity of CP magnitudes across cells. As shown by our analysis of the data of Britten et al., 1996, our analytical tools can add extra discoveries from these data, by removing some potential confounds that may have obscured the presence of within-cell CP patterns. It is important to note however that our model-based results do not imply in any way that these previous papers reached inaccurate conclusions, as these analyses were done for purposes other than discovering the within-cell patterns predicted by our models. In particular, most of the analysis of Dodd et al., 2001 used only CPs calculated from trials with non-informative stimuli, and their main results did not rely on the evaluation of CP stimulus dependencies. Similarly, while Britten et al., 1996 used z-scoring to calculate grand CPs combining all stimulus levels, their analysis did not involve the comparison of grand CPs across areas or types of cells with different tuning properties. As discussed above, it is for this kind of comparison, when the patterns of CP(pCR) profiles may themselves vary across the groups of cells compared, that reducing CP(pCR) profiles to a single CP value may confound the comparison.

Generalized linear models with stimulus-choice interactions

Our work also has implications for improving generalized linear models (GLMs) of neural activity, which are very widely used to describe neural responses in the presence of many explanatory variables that could predict the neuron’s firing rate, such as the external stimulus, motor variables, autocorrelations or refractory periods, and the interaction with other neurons (Truccolo et al., 2005). While usually the stimulus and the choice are treated as separate explanatory variables (e.g. Park et al., 2014; Runyan et al., 2017; Scott et al., 2017; Pinto et al., 2019; Minderer et al., 2019), we used GLMs including explicit interactions between choice and stimulus to show that, consistent with the finding of non-constant CP(pCR) patterns, these models improved the goodness of fit for the responses of MT cells. Importantly, making the choice term depend on the choice rate, pCR, affected the quantification of how stimulus-driven or choice-driven different cells are, measured as the increase in predictive power when further adding the choice as a predictor after the stimulus. This suggests a more fine-grained way to compare the degree of a neuron’s association with the behavioral choice or the stimulus, for example across neuron types or brain areas (Runyan et al., 2017; Pinto et al., 2019; Minderer et al., 2019). Our GLMs with multiple choice parameters associated with subsets of stimulus levels also allow characterizing the patterns in the vector of choice parameters analogously to our analysis of CP(pCR) patterns. Furthermore, our approach can be extended straightforwardly to GLMs that model the influence of the choice across the time course of the trials (Park et al., 2014), by making the stimulus-choice interaction terms time-dependent. GLMs with time-dependent stimulus-choice interaction terms can also be useful for experimental settings with multiple sensory cues presented at different times (e.g. Romo and Salinas, 2003; Sanayei et al., 2018) or a continuous time-dependent stimulus (Nienborg and Cumming, 2009), to account for a difference in the interaction of stimuli with the choice depending on the time they are presented. Similarly, the interaction terms may also help to model the influence of choice history on the processing of sensory evidence in subsequent trials (Tsunada et al., 2019; Urai et al., 2019), in which case the interaction terms would be between the stimulus and the choice from the previous trial.

Patterns of stimulus-choice interactions as a signature of mechanisms of perceptual decision-making

Theoretical and experimental evidence suggests that the patterns of stimulus dependence of choice-related signals may be informative about the mechanisms of perceptual decision-making. Activity-choice covariations have been characterized in terms of the structure of cross-neuronal correlations and of feedforward and feedback weights (Shadlen et al., 1996; Cohen and Newsome, 2009b; Nienborg and Cumming, 2010; Haefner et al., 2013; Cumming and Nienborg, 2016). Stimulus dependencies may be inherited from the dependence of cross-neuronal correlations on the stimulus (Kohn and Smith, 2005; Ponce-Alvarez et al., 2013), or from decision-related feedback signals (Bondy et al., 2018). Experimental (Nienborg and Cumming, 2009; Cohen and Maunsell, 2009a; Bondy et al., 2018) and theoretical (Lee and Mumford, 2003; Maunsell and Treue, 2006; Wimmer et al., 2015; Haefner et al., 2016; Ecker et al., 2016) work indicates that top-down modulations of sensory responses play an important role in the perceptual decision-making process. In particular, feedback signals are expected to show cell-specific stimulus dependencies associated with the tuning properties (Lange and Haefner, 2017). Different coding theories attribute different roles to the feedback signals, for example, conveying prediction errors (Rao and Ballard, 1999) or prior information for probabilistic inference (Lee and Mumford, 2003; Fiser et al., 2010; Haefner et al., 2016; Tajima et al., 2016; Bányai and Orbán, 2019; Bányai et al., 2019; Lange and Haefner, 2020). Accordingly, characterizing the stimulus dependencies of activity-choice covariations in connection with the tuning properties of cells is expected to provide insights into the role of feedback signals and may help to discriminate between alternative proposals.
Such an analysis would require simultaneous recordings of populations of neurons tiling the space of receptive fields, and the joint characterization of the cross-neuronal correlations and tuning properties. Although this is beyond the scope of this work, we have shown that the analysis methods we proposed are capable of identifying a nontrivial structure in the stimulus-dependencies of choice-related signals. A better understanding of their differences across brain areas, across cells with different tuning properties, or for different types of sensory stimuli, promises further insights into the mechanisms of perceptual decision-making.

While we here analyzed single-cell recordings, our conclusions hold for any type of recordings used to study activity-choice covariations. This spans the range from single units (Britten et al., 1996) and multiunit activity (Sanayei et al., 2018) to measurements resulting from different imaging techniques at different spatial scales, such as intrinsic imaging or fMRI (Choe et al., 2014; Thielscher and Pessoa, 2007; Runyan et al., 2017; Michelson et al., 2017). Given the increasing availability of population recordings, the larger numbers of trials afforded by chronic recordings, and the advent of stimulation techniques that help to discriminate the origin of the choice-related signals (Cicmil et al., 2015; Tsunada et al., 2016; Yang et al., 2016; Lakshminarasimhan et al., 2018; Fetsch et al., 2018; Yu and Gu, 2018), we expect our tools to help gain new insights into the mechanisms of perceptual decision-making.

Materials and methods

We here describe the derivations of the CP analytical solutions, our new methods to analyze stimulus dependencies in choice-related responses, and the data set from Britten et al., 1996 in which we test for the existence of stimulus dependencies.

An exact CP solution for the threshold model

We first derive our analytical CP expression valid in the presence of informative stimuli, decision-related feedback, and top-down sources of activity-choice covariation, such as prior bias, trial-to-trial memory, or internal state fluctuations. We follow Haefner et al., 2013 and assume a threshold model of decision making, in which the choice D is triggered by comparing a decision variable d with a threshold θ, so that if d>θ choice D=1 is made, and D=-1 otherwise. The identification of the binary choices as D=±1 is arbitrary and an analogous expression would hold with another mapping of the categorical variable. The choice probability (Britten et al., 1996) of cell i is defined as

$$CP_i = p\left(r_i|_{D=1} > r_i|_{D=-1}\right) = \int_{-\infty}^{\infty} dr_i\, p(r_i|D=1) \int_{-\infty}^{r_i} dr_i'\, p(r_i'|D=-1) \tag{12}$$

and measures the separation between the two choice-specific response distributions p(ri|D=-1) and p(ri|D=1). It quantifies the probability of responses to choice D=1 to be higher than responses to D=-1. If there is no dependence between the choice and the responses this probability is CP=0.5. To obtain an exact solution of the CP, we assume that the distribution p(ri,d) of the responses ri of cell i and the decision variable d can be well approximated by a bivariate Gaussian. Under this assumption, following Haefner et al., 2013 (see their Supplementary Material) the probability of the responses for choice D=1 follows the distribution 

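$$p(z_i|D=1) = \frac{1}{p_{CR}}\,\phi(z_i;0,1)\,\Phi\!\left(\frac{\langle d\rangle + \frac{\sigma_{r_i,d}}{\sigma_{r_i}}\,z_i - \theta}{\sigma_{d|r_i}};0,1\right), \tag{13}$$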

where a more parsimonious expression is obtained using the z-score $z_i = (r_i - \langle r_i\rangle)/\sigma_{r_i}$. This distribution is a skew-normal (Azzalini, 1985), where $\phi(\cdot;0,1)$ is the standard normal density with zero mean and unit variance, and $\Phi(\cdot;0,1)$ is its cumulative distribution function. Furthermore, $\sigma_{r_i,d}$ is the covariance of ri and d, $\sigma_{d|r_i}$ is the conditional standard deviation of d given ri, and the probability of D=1 is

$$p_{CR} \equiv p(d>\theta) = \Phi\!\left(\frac{\langle d\rangle-\theta}{\sigma_d}\right), \tag{14}$$

which determines the rate of each choice over trials. The choice D=-1 could equally be taken as the choice of reference, resulting in an analogous formulation. Intuitively, pCR increases when the mean of the decision variable d is higher than the threshold θ, and decreases when its standard deviation $\sigma_d$ increases. The form of the distribution of Equation 13 can be expressed in terms of pCR and the correlation coefficient $\rho_{r_i d}$, which Pitkow et al., 2015 named choice correlation (CCi). In particular, defining $\alpha \equiv \rho_{r_i d}/\sqrt{1-\rho_{r_i d}^2}$ and $c \equiv \Phi^{-1}(p_{CR})/\sqrt{1-\rho_{r_i d}^2}$,

$$p(z_i|D=1) = \frac{1}{p_{CR}}\,\phi(z_i;0,1)\,\Phi(\alpha z_i + c;0,1). \tag{15}$$

The CP is completely determined by p(zi|D=-1) and p(zi|D=1), and these distributions depend only on pCR and the correlation coefficient ρrid. Plugging the distribution of Equation 15 into the definition of the CP (Equation 12) an analytical solution is obtained:

$$CP_i = \frac{1}{2} + \frac{T\!\left(\Phi^{-1}(p_{CR}),\ \frac{\rho_{r_i d}}{\sqrt{2-\rho_{r_i d}^2}}\right)}{p_{CR}(1-p_{CR})}, \tag{16}$$

where T is Owen’s T function (Owen, 1956). In Appendix 1, we provide further details of how this expression is derived. For an uninformative stimulus (pCR=0.5), the function T reduces to the arctangent and the exact result obtained in Haefner et al., 2013 is recovered. The dependence on $\rho_{r_i d}$ arises because, under the Gaussian assumption, the linear correlation captures all the dependence between the responses and the decision variable d. The dependence on pCR reflects the influence of the threshold mechanism, which maps the dependence of ri on d into a dependence on the choice D by partitioning the space of d into two regions.
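As a numerical illustration of Equation 16, the following minimal Python sketch evaluates the exact CP from pCR and the choice correlation. It is not the authors' Matlab code: the function names `owens_t` and `exact_cp` are hypothetical, and Owen's T function is computed here by a simple Simpson integration of its standard integral definition rather than by a library routine.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for the inverse CDF


def owens_t(h, a, n=2000):
    """Owen's T function by composite Simpson integration of
    T(h, a) = (1 / 2pi) * int_0^a exp(-h^2 (1 + x^2) / 2) / (1 + x^2) dx.
    n must be even. A negative a gives T(h, -a) = -T(h, a)."""
    if a == 0.0:
        return 0.0

    def integrand(x):
        return math.exp(-0.5 * h * h * (1.0 + x * x)) / (1.0 + x * x)

    step = a / n
    total = integrand(0.0) + integrand(a)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * integrand(k * step)
    return total * step / 3.0 / (2.0 * math.pi)


def exact_cp(p_cr, rho):
    """Exact CP of the threshold model (Equation 16) for choice rate p_cr
    and choice correlation rho, under the bivariate Gaussian assumption."""
    h = _STD.inv_cdf(p_cr)
    a = rho / math.sqrt(2.0 - rho * rho)
    return 0.5 + owens_t(h, a) / (p_cr * (1.0 - p_cr))


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(p, round(exact_cp(p, 0.2), 4))
```

For pCR=0.5 the sketch reproduces the arctangent solution of Haefner et al., 2013, and for a fixed choice correlation the CP moves further from 0.5 as pCR departs from 0.5.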

While Equation 16 provides an exact solution of the CP, in the Results section we present and mostly focus on a linear approximation to understand how the stimulus content modulates the choice probability. This approximation is derived (Appendix 1) in the limit of a small ρrid, which leads to CPs close to 0.5 as usually measured in sensory areas (Nienborg et al., 2012). However, as we show in the Results and further justify in the Appendix this approximation is robust for a wide range of ρrid values. The linear approximation relates the choice probability to the Choice Triggered Average (CTA) (Haefner, 2015; Chicharro et al., 2017), defined as the difference of the mean responses for each choice (Equation 3). Given the binary nature of choice D, the CTA is directly proportional to the covariance of the responses and the choice: CTAi=cov(ri,D)/[2p(D=1)p(D=-1)]. [Note: This relation holds for the covariance between any variable x and a binary variable D, and independently of the convention adopted for the values of D: the factor 2 has to be replaced by a-b in general for D=a,b instead of D=1,-1.] This relation between CTAi and cov(ri,D), given the factorization corr(ri,D)=CCicorr(d,D) resulting from the mediating decision variable d in the threshold model, allows expressing the CTAi as in Equation 5, connecting CTAi to the choice-triggered average of d, CTAd. This connection indicates that in the threshold model CTAi is expected to be stimulus dependent, since an informative stimulus s shifts the mean of d, thus altering the dichotomization of d produced by the threshold θ. The exact form of CTAd depends on the distribution p(d). However, since d is determined by a whole population of neurons, its distribution is expected to be well approximated by a Gaussian distribution, even if the distribution of neural responses for any single neuron is not Gaussian. 
With this Gaussian approximation, the normalized CTAd in Equation 5, namely $CTA_d/\sqrt{\mathrm{var}\,d}$, is specified in terms of the probability of choosing choice 1, $p_{CR} \equiv p(D=1) = p(d>\theta)$, by the factor h(pCR) (Equation 6). In more detail, the CTA is

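$$CTA_i \equiv \langle r_i\rangle_{D=1} - \langle r_i\rangle_{D=-1} = \frac{4h(p_{CR})}{\sqrt{2\pi}}\,\rho_{r_i d}\,\sigma_{r_i} = \frac{4h(p_{CR})}{\sqrt{2\pi}\,\sigma_d}\,\sigma_{r_i,d}. \tag{17}$$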
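The two relations used above can be checked numerically with a short Python sketch. The first function pair verifies that the CTA equals cov(ri,D)/[2p(D=1)p(D=-1)] for any sample (using the population covariance). The second part compares a simulated CTA with the Gaussian prediction of Equation 17. The function names are hypothetical, and the closed form used for h(pCR) in `h_gauss` is an assumption: it is the form implied by a Gaussian decision variable and Equation 17, not a quotation of Equation 6.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()


def cta(r, D):
    """Choice-triggered average: mean response for D=+1 minus mean for D=-1."""
    pos = [x for x, c in zip(r, D) if c == 1]
    neg = [x for x, c in zip(r, D) if c == -1]
    return sum(pos) / len(pos) - sum(neg) / len(neg)


def cta_from_cov(r, D):
    """CTA computed as cov(r, D) / [2 p(D=1) p(D=-1)], population covariance."""
    n = len(r)
    p1 = sum(1 for c in D if c == 1) / n
    mean_r = sum(r) / n
    mean_D = sum(D) / n
    cov = sum((x - mean_r) * (c - mean_D) for x, c in zip(r, D)) / n
    return cov / (2.0 * p1 * (1.0 - p1))


def h_gauss(p_cr):
    """Assumed form of h(pCR) for a Gaussian decision variable; h(0.5) = 1."""
    c = STD.inv_cdf(p_cr)
    return math.sqrt(2.0 * math.pi) * STD.pdf(c) / (4.0 * p_cr * (1.0 - p_cr))


def simulate(rho=0.3, theta=0.8, n=20000, seed=0):
    """Bivariate Gaussian (r, d) with unit variances and correlation rho;
    the choice is D=+1 when d exceeds the threshold theta."""
    rng = random.Random(seed)
    r, D = [], []
    for _ in range(n):
        d = rng.gauss(0.0, 1.0)
        r.append(rho * d + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0))
        D.append(1 if d > theta else -1)
    return r, D


r, D = simulate()
p_cr = 1.0 - STD.cdf(0.8)  # probability that d > theta for standard normal d
predicted = 4.0 * h_gauss(p_cr) / math.sqrt(2.0 * math.pi) * 0.3 * 1.0  # rho * sigma_r
```

The identity between `cta` and `cta_from_cov` is exact up to floating-point error, whereas the agreement between the simulated CTA and `predicted` is only statistical and depends on the number of simulated trials.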

Neuronal data

To study stimulus dependencies in the relationship between the responses of sensory neurons and the behavioral choice, we analyzed the data from Britten et al., 1996 publicly available in the Neural Signal Archive (http://www.neuralsignal.org). In particular, we analyzed data from file nsa2004.1, which contains single unit responses of macaque MT cells during a random dot discrimination task. This file contains 213 cells from three monkeys. We also used file nsa2004.2, which contains paired single unit recordings from 38 sites from one monkey. In the experimental design, for the single unit recordings the direction tuning curve of each neuron was used to assign a preferred-null axis of stimulus motion, such that opposite directions along the axis yield a maximal difference in responsiveness (Bair et al., 2001). For paired recordings, the direction of stimulus motion was selected based on the direction tuning curves of the two neurons, and the criterion used to assign it varied depending on the similarity between the tuning curves. For cells with similar tuning, a compromise between the preferred directions of the two neurons was made. For cells with different tuning, the axis was chosen to match the preference of the most responsive cell. To minimize the influence of the selection of the direction of motion on our analysis, we only analyzed the most responsive cell from each site. Accordingly, our initial data set consisted of a total of 251 cells. The same qualitative results were obtained when limiting the analysis to data from nsa2004.1 alone. Further criteria regarding the number of trials per stimulus level were used to select the cells. As discussed below, unless otherwise indicated, we present the results from 107 cells that fulfilled all the required criteria.

Analysis of stimulus-dependent choice probabilities

Our analysis of the stimulus dependencies of choice probabilities is based on examining the patterns in the CP(pCR) profile as a function of the probability $p_{CR} \equiv p(D=1)$. We here describe how these profiles are constructed, the surrogate-based method used to assess the significance of stimulus dependencies, and the clustering analysis used to identify different stimulus dependence patterns. Matlab functions are available at https://github.com/DanielChicharro/CP_DP (Chicharro, 2021; copy archived at swh:1:rev:5850c573860eb04317e7dc550f96b1f47ca91c6a) to calculate weighted average CPs, to obtain CP profiles, and to generate surrogates consistent with the null hypothesis of a constant CP.

Profiles of CP as a function of the choice rate

We constructed CP(pCR) profiles instead of CP(s) profiles based on the prediction from the theoretical threshold model of the modulatory factor h(pCR). We estimated the pCR value associated with each random-dot coherence level using the psychophysical function for each monkey separately. For each coherence level, we calculated a CP value if at least 15 trials were available in total, and at least four for each choice. In the original analysis of Britten et al., 1996, stimulus dependencies CP(s) were examined by averaging across cells the CP at each coherence level. This analysis did not separate the within-cell stimulus dependencies CP(s) from variability due to changes in choice probabilities across cells. In particular, in the data set the stimulus levels presented vary across cells, which means that for each coherence level the average CP reflects not only any potential stimulus dependence of the CP but also which subset of cells contributes to the average at that level. Therefore, we binned the range of pCR such that for each cell at least one stimulus level mapped to each bin of pCR. We here present the results using five bins, covering pCR from 0 to 0.3, from 0.3 to 0.5-ε, from 0.5-ε to 0.5+ε, from 0.5+ε to 0.7, and from 0.7 to 1, where ε was selected such that only trials with the uninformative (zero coherence) stimulus were included in the central bin. Results are robust to the exact definition of the bins. We selected larger bins for highly informative stimulus levels for two reasons. First, the stimulus levels used in the experimental design do not uniformly cover the range of pCR: there are more stimulus levels corresponding to pCR values close to pCR=0.5. Second, the CP estimates are worse for highly informative stimuli. In particular, the standard error of the CP estimates depends on the magnitude of the CP itself (Bamber, 1975; Hanley and McNeil, 1982) but for small |CP-0.5| can be approximated as

$$\mathrm{SEM}(\widehat{CP}) \approx \frac{1}{\sqrt{12\,K\,p_{CR}(1-p_{CR})}}, \tag{18}$$

where K is the number of trials. The product pCR(1-pCR) is maximal at pCR=0.5 and decreases quadratically as pCR approaches 0 or 1. Furthermore, in the data set the number of trials K is higher for stimuli with low information, while most frequently K=30 for highly informative stimuli. We used these estimates of the error of $\widehat{CP}$ to combine the CPs of $M_k$ different stimulus levels assigned to the same bin k of pCR. The average CP for bin k was calculated as $CP(p_{CR,k}) = \sum_{j=1}^{M_k} w_j\,CP(s_{jk})$, with normalized weights $w_j$ proportional to $K_j\,p_{CR,j}(1-p_{CR,j})$. A full profile CP(pCR) could be constructed for 107 cells, while for the rest a CP value could not be calculated for at least one of the bins because of the criteria on the number of trials. Together with the profile CP(pCR), we also obtained an estimate of its error as a weighted average of the errors, which corresponds to

$$\mathrm{SEM}\big(\widehat{CP}(p_{CR,k})\big) = \frac{1}{\sqrt{12\,M_k\,\langle w_U\rangle}}, \tag{19}$$

where $\langle w_U\rangle$ is the average of the unnormalized weights $w_{U,j} \equiv K_j\,p_{CR,j}(1-p_{CR,j})$. Following this procedure, we can iteratively calculate weighted averages of the CPs across different sets. In particular, we used this same type of average to obtain averaged CP(pCR) profiles across cells. Importantly, in contrast to the analysis of Britten et al., 1996, we first separated the cells into two groups, with a positive or negative average CP-0.5 value, given that the effect of h(pCR) predicts an inverse modulation by pCR for the two groups.
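The construction of one bin of a CP(pCR) profile can be sketched in Python as follows. This is an illustrative sketch, not the published Matlab implementation: the function names are hypothetical, ties between responses are counted as one half (an assumed convention), and the combined error assumes that Equation 18 holds for each stimulus level and that the levels are independent.

```python
import math


def enough_trials(n_pos, n_neg, min_total=15, min_per_choice=4):
    """Trial-count criterion described in the text for computing a CP."""
    return (n_pos + n_neg >= min_total
            and n_pos >= min_per_choice and n_neg >= min_per_choice)


def choice_probability(r_pos, r_neg):
    """Probability that a response from D=+1 trials exceeds a response from
    D=-1 trials (area under the ROC curve); ties count one half."""
    wins = 0.0
    for a in r_pos:
        for b in r_neg:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(r_pos) * len(r_neg))


def sem_cp(K, p_cr):
    """Approximate standard error of a CP estimate for small |CP - 0.5|."""
    return 1.0 / math.sqrt(12.0 * K * p_cr * (1.0 - p_cr))


def weighted_bin_cp(cps, Ks, p_crs):
    """Combine the CPs of the stimulus levels assigned to one pCR bin,
    weighting each level by K_j * pCR_j * (1 - pCR_j)."""
    unnormalized = [K * p * (1.0 - p) for K, p in zip(Ks, p_crs)]
    total = sum(unnormalized)  # equals M_k times the mean unnormalized weight
    cp = sum(w / total * c for w, c in zip(unnormalized, cps))
    sem = 1.0 / math.sqrt(12.0 * total)
    return cp, sem
```

For example, `weighted_bin_cp([0.55, 0.60], [60, 30], [0.45, 0.35])` gives more weight to the first level, which has more trials and a choice rate closer to 0.5.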

Surrogates to test the significance of CP stimulus dependencies

Given a certain average profile CP(pCR), we want to assess whether the pattern observed is compatible with the null hypothesis of a constant CP value for all pCR values. In particular, because the error of the CP estimates is sensitive to the number of trials K and to pCR (Equation 18), we want to rule out the possibility that any structure observed is only a consequence of changes of K and pCR across the bins used to calculate the CP(pCR) profiles. For this purpose, we developed a procedure to build surrogate data sets that are compatible with the hypothesis of a flat CP(pCR) and that preserve, at each stimulus level, the number of trials for each choice in the original data. The surrogates are built by shuffling the trials across stimulus levels to destroy any stimulus dependence of the CP. However, because the responsiveness of the cell changes across levels according to its direction tuning curve, responses need to be normalized before the shuffling. Kang and Maunsell, 2012 showed that, to avoid underestimating the CPs, this normalization should take into account that mean responses at each level are determined by the conditional mean response for each choice and also by the choice rate. Under the assumption of a constant CP, they proposed an alternative z-scoring, which estimates the mean and standard deviation correcting for the different contribution of trials corresponding to the two choices (see Appendix 2 for details of their method).

We applied the z-scoring of Kang and Maunsell, 2012 to pool the responses within an interval of stimulus levels with low information, preserving only the separation of trials corresponding to each choice. We selected the interval from −1.6% to 1.6% of coherence values, which comprises a third of the informative coherence levels used in the experiments. Because these stimuli have low information, they lead to pCR values close to pCR=0.5 and hence we can approximate the CP as constant within this interval. The fact that the factor h(pCR) is almost constant around pCR=0.5 (see Figure 2A) further supports this approach. We used this pool of neural responses to sample responses for all stimulus levels in the surrogate data set. For each stimulus level of the surrogate data, the number of trials for each choice was preserved as in the original data. In these surrogates, apart from random fluctuations, any structure in the CP(pCR) profiles can only be produced by the changes in K and pCR across bins. To test the existence of significant stimulus dependencies in the original CP(pCR) profiles, we calculated the differences $\Delta CP_k = CP(p_{CR,k+1}) - CP(p_{CR,k})$ for the bins k=1,…,4. To test for an asymmetric pattern with respect to pCR=0.5, the average of $\Delta CP_k$ across bins was calculated. To test for a symmetric pattern, the sign of the difference was flipped for the bins corresponding to pCR<0.5 before averaging. When testing for a pattern consistent with the modulation predicted by the threshold model, the shape was inverted for cells with average CP lower than 0.5. The same procedure was applied to each surrogate CP(pCR) profile. We generated 8000 surrogates and estimated the p-value as the fraction of surrogates for which the average ΔCP was higher than for the original data.
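The test statistics described above can be sketched in Python as follows, taking as input a five-bin CP(pCR) profile and a list of surrogate profiles generated elsewhere. The function names are hypothetical. Two readings of the text are assumed and should be checked against the published Matlab code: the first two differences are treated as those corresponding to pCR<0.5, and the p-value is computed as the fraction of surrogates whose statistic exceeds the observed one.

```python
def delta_cp(profile):
    """Differences between consecutive bins of a CP(pCR) profile."""
    return [profile[k + 1] - profile[k] for k in range(len(profile) - 1)]


def asymmetric_stat(profile):
    """Average difference across bins: sensitive to a monotonic trend in pCR."""
    d = delta_cp(profile)
    return sum(d) / len(d)


def symmetric_stat(profile, n_low=2):
    """Average difference after flipping the sign of the first n_low
    differences (assumed to be those with pCR < 0.5): sensitive to a
    pattern that is symmetric around pCR = 0.5."""
    d = delta_cp(profile)
    flipped = [-x if k < n_low else x for k, x in enumerate(d)]
    return sum(flipped) / len(flipped)


def invert_for_low_cp(profile):
    """Mirror a profile around 0.5, as done for cells with average CP < 0.5."""
    return [1.0 - cp for cp in profile]


def surrogate_p_value(stat_fn, observed_profile, surrogate_profiles):
    """Fraction of surrogate profiles whose statistic exceeds the observed one."""
    observed = stat_fn(observed_profile)
    exceed = sum(1 for s in surrogate_profiles if stat_fn(s) > observed)
    return exceed / len(surrogate_profiles)
```

A profile such as `[0.56, 0.53, 0.52, 0.53, 0.56]` yields an asymmetric statistic close to zero and a positive symmetric statistic, which is the signature expected from the modulation h(pCR) for a cell with CP higher than 0.5.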

Clustering analysis

We used clustering analysis to examine the patterns in the CP(pCR) profiles beyond the stereotyped shape h(pCR) predicted from the threshold model. We first used nonparametric k-means clustering for an exploratory analysis of which patterns are more common among the 107 cells for which a complete CP(pCR) profile could be constructed. The clustering was implemented calculating cosine distances between vectors defined as CP(pCR)-0.5. The selection of this distance is consistent with the prediction of the threshold model that a different pattern is expected for cells with a CP higher or lower than 0.5. We examined the patterns associated with the clusters as a function of the number of clusters to identify robust patterns of dependence (see Appendix 3—figure 1B,C). We then focused on a symmetric and an asymmetric pattern of CP(pCR) with respect to pCR=0.5, for cells with average CP higher than 0.5. To better interpret these two clusters, we complemented the analysis with a parametric clustering approach in which a symmetric and asymmetric template were a priori selected to cluster the CP(pCR) profiles. To assess the significance of the CP(pCR) patterns we repeated the same clustering procedure for surrogate data generated as described above. We refer to Appendix 3 for a more detailed description of the construction, visualization, and significance assessment of the CP(pCR) patterns.

The effect of response gain fluctuations on choice probabilities

To model the effect on the CP of response gain fluctuations we adopted a classic feedforward encoding/decoding model (Shadlen et al., 1996; Haefner et al., 2013), with a linear decoder $d = \mathbf{w}^\top\mathbf{r}$ (Equation 8), for which the CP depends on cross-neuronal correlations and the read-out weights $\mathbf{w}$ following Equation 9. This expression can be derived from Equation 7 by directly calculating the choice correlation from its definition (Equation 2). The expressions $\mathrm{cov}(r_i,d) = (\Sigma(s)\mathbf{w})_i$ and $\sigma_d^2 = \mathbf{w}^\top\Sigma(s)\mathbf{w}$ are obtained as derived in the Supplementary Material S1 of Haefner et al., 2013. For this model, if the read-out weights are optimized to the form of covariability for the uninformative stimulus s0 at the decision boundary, the CPs are proportional to the neurometric sensitivity of the cells (Haefner et al., 2013; Pitkow et al., 2015), a relationship for which there is some experimental support (e.g. Britten et al., 1996; Parker et al., 2002, reviewed in Nienborg et al., 2012). In more detail, modeling the responses as $r_i = f_i(s) + \xi_i$, with tuning functions $\mathbf{f}(s) = (f_1(s),\ldots,f_n(s))$ and a covariance structure Σ of the neurons’ intrinsic variability $\xi_i$, the optimal read-out weights have the form

$$\mathbf{w} = \frac{\Sigma^{-1}(s_0)\,\mathbf{f}'(s_0)}{\mathbf{f}'^\top(s_0)\,\Sigma^{-1}(s_0)\,\mathbf{f}'(s_0)}, \tag{20}$$

where $\mathbf{f}'(s_0)$ and $\Sigma(s_0)$ are the derivative of the tuning curves and the response covariance matrix, respectively, for s=s0. With these optimal weights, the covariability of the population responses is unbundled, with $\Sigma^{-1}(s_0)$ canceling the effect of $\Sigma(s_0)$ in $\mathrm{cov}(r_i,d) = (\Sigma(s)\mathbf{w})_i$, and for each cell the CC is proportional to its own neurometric sensitivity $f_i'/\sigma_{r_i}$, namely

$$CC_i(s_0) = \frac{f_i'(s_0)\,\sigma_d(s_0)}{\sigma_{r_i}(s_0)}. \tag{21}$$
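The proportionality between choice correlation and neurometric sensitivity under the optimal read-out can be verified numerically. The Python sketch below does so for a two-neuron population, using only the relations cov(ri,d) = (Σw)i and σd² = wᵀΣw together with Equation 20. The covariance matrix and tuning-curve derivatives are arbitrary illustrative values, and the helper names are hypothetical.

```python
import math


def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def optimal_weights(cov, fprime):
    """Read-out weights Sigma^{-1} f' / (f'^T Sigma^{-1} f')."""
    x = matvec(inv2(cov), fprime)
    norm = dot(fprime, x)
    return [xi / norm for xi in x]


def choice_correlations(cov, w):
    """Correlation of each response r_i with d = w^T r, and the std of d."""
    cov_rd = matvec(cov, w)             # cov(r_i, d) = (Sigma w)_i
    sigma_d = math.sqrt(dot(w, cov_rd))  # sigma_d^2 = w^T Sigma w
    cc = [c / (math.sqrt(cov[i][i]) * sigma_d) for i, c in enumerate(cov_rd)]
    return cc, sigma_d


cov = [[4.0, 1.0], [1.0, 9.0]]  # illustrative covariance at s0
fprime = [1.0, 0.5]             # illustrative tuning-curve derivatives at s0
w = optimal_weights(cov, fprime)
cc, sigma_d = choice_correlations(cov, w)
predicted = [fp * sigma_d / math.sqrt(cov[i][i]) for i, fp in enumerate(fprime)]
```

With these weights `cc` and `predicted` coincide, and the decoder is unbiased in the sense that the weights satisfy wᵀf'(s0) = 1.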

While this expression is valid for the uninformative stimulus s0, we examined how this CP expression is perturbed for other informative stimuli s in the presence of gain fluctuations that make the covariance structure Σ(s) stimulus-dependent, altering the structure for which the read-out weights are optimized. Goris et al., 2014 estimated that in MT gain fluctuations accounted for more than 75% of the variance in the responses to sinusoidal gratings, and we found that in the data set of Britten et al., 1996 gain fluctuations also explain a large fraction of trial-to-trial variability of the neurons (62 ± 25% across neurons). Trial-to-trial excitability fluctuations are modeled as a gain modulatory factor $g_k$, such that the tuning function for cell i in trial k is $f_{ik}(s) = g_k f_i(s)$. In general, the magnitude of the gain may vary across cells, as well as the degree to which the gain co-fluctuates across cells. We here modeled a global gain fluctuation affecting the response of the whole population. Given the gain variability, the covariance structure can be partitioned as in Equation 10, as the sum of a component $\bar{\Sigma}$ unrelated to the gain fluctuations (which for simplicity we consider to be stimulus-independent) and the gain-induced covariance $\sigma_G^2\,\mathbf{f}(s)\mathbf{f}^\top(s)$. In a first-order approximation, a change Δs=s-s0 in the stimulus leads to a change in the covariance structure such that

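$$\Sigma(s) \approx \Sigma(s_0) + \sigma_G^2\left[\mathbf{f}'(s_0)\,\mathbf{f}^\top(s_0) + \mathbf{f}(s_0)\,\mathbf{f}'^\top(s_0)\right]\Delta s, \tag{22}$$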

where $\Sigma(s_0) = \bar{\Sigma} + \sigma_G^2\,\mathbf{f}(s_0)\mathbf{f}^\top(s_0)$ is the covariance structure for which the weights are optimized. Combining this covariance structure with the form of the optimal read-out weights (Equation 20), we derive the changes in cov(ri,d), $\sigma_{r_i}^2$, and $\sigma_d^2$ with Δs and, given Equation 9, determine the perturbation of the CP, leading to the following expressions for the choice correlation

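$$CC_i(p_{CR}=0.5) = \sqrt{1-\lambda_i^2}\;CC_i^0(p_{CR}=0.5) \tag{23a}$$
$$CC_i(p_{CR}) \approx CC_i(p_{CR}=0.5)\left[1-\beta_{\sigma_r}\Delta s(p_{CR})\right] + \frac{\sigma_G^2\,f_i(s_0)\,\Delta s(p_{CR})}{\sqrt{\mathbf{w}^\top\Sigma(s_0)\mathbf{w}}\,\sqrt{\Sigma_{ii}(s_0)}}, \tag{23b}$$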

where $\sigma_d^2(s_0) = \mathbf{w}^\top\Sigma(s_0)\mathbf{w}$, $\sigma_{r_i}^2(s_0) = \Sigma_{ii}(s_0)$, $\beta_{\sigma_r} \equiv \sigma_G^2 f_i(s_0) f_i'(s_0)/\sigma_{r_i}^2(s_0)$, and $\lambda_i^2 \equiv \sigma_{r_i|g}^2(s_0)/\sigma_{r_i}^2(s_0)$, as introduced in Equation 11, with s0 resulting in pCR=0.5 for an unbiased decoder. Equation 23a relates the choice correlation for pCR=0.5 with the choice correlation $CC_i^0(p_{CR}=0.5)$ that cell i would have if there were no gain fluctuations ($\sigma_G^2=0$). The coefficient $\beta_{CC} \equiv \sqrt{1-\lambda_i^2}$ indicates a decrease in the CC in the presence of gain fluctuations, because of the increase in the response variability produced by the fluctuations, namely $\sigma_{r_i|g}^2 = \sigma_G^2 f_i^2$. Equation 23b describes the CC(pCR) profile induced by the gain fluctuations. The second summand corresponds to the increase in the choice correlation due to a new component of cov(ri,d) proportional to Δs, given that the whole population response determining d is jointly modulated by the gain. In the first summand, the factor $[1-\beta_{\sigma_r}\Delta s(p_{CR})]$ indicates an attenuation of CCi(pCR=0.5) analogous to $\sqrt{1-\lambda_i^2}$ in Equation 23a, associated with an increase of the variance in the responses ri due to Δs. Rearranging the terms in Equation 23b, and taking into account the form of the CC for pCR=0.5 (Equation 21), the expression in Equation 11 is obtained, which indicates that the overall effect of the gain fluctuations is an increase of the choice correlation for the stimuli to which the cell is more responsive. A more general form of this expression is derived in Appendix 4, valid for any unbiased decoder.

Apart from producing an asymmetric CP(pCR) pattern, the gain fluctuations also create a negative covariation between the CP at pCR=0.5 and the degree of asymmetry of the CP(pCR) pattern. This covariation appears because the cells with a higher portion of their variability driven by the gain fluctuations (higher $\lambda_i$) have a higher attenuation of CCi(pCR=0.5), given Equation 23a, while both a higher $\lambda_i$ and a smaller CCi(pCR=0.5) lead to an increase in the slope $\beta_{pCR} \equiv \sigma_G\lambda_i\left(1-CC_i^2(p_{CR}=0.5)\right)$ of the dependence on Δs(pCR) in Equation 23b. Furthermore, a smaller CCi(pCR=0.5) also leads to a smaller effect of the multiplicative symmetric modulation h(pCR), further contributing to the negative covariation between the magnitude of the CP and the predominance of the symmetric or asymmetric pattern.

To illustrate the properties common to the model and to the CP(pCR) patterns from the MT data, Figure 4E shows CPs from Equation 23 as a function of pCR for examples combining four values of $CC_i^0(p_{CR}=0.5)$ and two values of $\sigma_G^2$, while the other parameters of the cell responses are kept constant. In particular, to determine $\lambda_i^2$ only in terms of the strength of the gain, we fixed the rate to fi(pCR=0.5)=10 spikes/s and considered the variance not associated with the gain to be equal to that rate, so that $\lambda_i^2 = 1/(1+0.1/\sigma_G^2)$. Accordingly, the values of $\sigma_G^2$ in Figure 4E, namely $\sigma_G^2=0.1$ and $\sigma_G^2=0.01$, correspond to $\lambda_i^2=0.5$ and $\lambda_i^2=0.09$, respectively. Further analysis of the model is provided in Appendix 4, where we also discuss the form of the CP(pCR) pattern produced by gain fluctuations when the decoder is composed of two pools of opposite choice preference (Shadlen et al., 1996).
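The mapping between the gain variance and the fraction of response variance driven by the gain can be reproduced with a few lines of Python. The sketch assumes, as stated above, a rate of 10 spikes/s, a gain-independent variance equal to that rate, and a gain-induced variance equal to the gain variance times the squared rate. The function name `lambda_sq` is hypothetical.

```python
def lambda_sq(sigma_g_sq, rate=10.0, private_var=None):
    """Fraction of the response variance produced by gain fluctuations.

    The gain-induced variance is sigma_G^2 * rate^2; the remaining variance
    defaults to the rate itself (Poisson-like), as assumed in the text."""
    if private_var is None:
        private_var = rate
    gain_var = sigma_g_sq * rate ** 2
    return gain_var / (gain_var + private_var)


for sigma_g_sq in (0.1, 0.01):
    print(sigma_g_sq, round(lambda_sq(sigma_g_sq), 2))
```

With the default rate this expression reduces to 1/(1+0.1/σG²), giving 0.5 for σG²=0.1 and approximately 0.09 for σG²=0.01.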

To experimentally estimate the coefficients $\beta_{pCR}^{(exp)}$ we fitted a quadratic regression of the CPs on the stimulus levels. To theoretically estimate the coefficients $\beta_{pCR}^{(th)}$, we used the negative binomial model of Goris et al., 2014 to estimate $\sigma_G^2$ for each cell and used the form $\beta_{pCR} \equiv \sigma_G\lambda_i\left(1-CC_i^2(p_{CR}=0.5)\right)$ predicted by the gain model (Equation 11).

Generalized linear models modeling the interaction between stimulus and choice predictors

We implemented a new GLM, called stimulus-dependent-choice GLM, that includes regression coefficients quantifying the effect on the firing rate of interactions between stimulus and choice. This model of the firing rate of each neuron was compared to two simpler and traditional models: a stimulus-only GLM, which includes only stimulus predictors of the neuron’s firing rate, and a stimulus-independent-choice GLM, which includes together with the stimulus predictor a single, stimulus-independent choice predictor.

In more detail, all three GLMs were Poisson models in which the mean firing rate μ(ri) of cell i was generally expressed by the following equation:

$$\log(\mu(r_i)) = \sum_{j=0}^{4} a_j s^j + \sum_{j=1}^{N_c} I_{P_j}(p_{CR})\,b_j D. \tag{24}$$

The terms $\sum_{j=0}^{4} a_j s^j$ are present in all three types of GLM, and model the stimulus influence with a fourth-order polynomial function of the stimulus level. These are the only terms of the stimulus-only GLM.

The choice dependence is modeled by $\sum_{j=1}^{N_c} I_{P_j}(p_{CR})\,b_j D$, with the parameter $N_c$ ($N_c \in \{1,2,3\}$) setting the number of possible different levels of stimulus-dependent choice (we restricted the fitting to up to three different choice levels for simplicity, and because we found empirically that this works well for the MT data analyzed here). $I_{P_j}(p_{CR})$ is an indicator function which equals one if a pCR value belongs to the subset $P_j$ of values selected to be associated with the choice parameter $b_j$, and is zero otherwise. For the stimulus-independent-choice model, we set $N_c=1$ so that the choice affects the predicted responses equally for all stimulus levels. For the stimulus-dependent-choice GLM, we set $N_c>1$. For this stimulus-dependent-choice GLM, we determined the subsets of stimulus levels associated with each of those parameters using the CP(pCR) profiles for a first characterization of the stimulus dependencies. As in the CP analysis, for each cell we determined which coherence values could be included in the analysis given a criterion requiring a minimum number of trials for each choice (at least 4). The existence of non-monotonic CP(pCR) profiles, such as the symmetric pattern around pCR=0.5, indicated that it would be sub-optimal to tile the domain of pCR with $N_p$ bins and assign a different choice-parameter level to each bin. Accordingly, we first estimated the CP(pCR) profile of each cell and then used k-means clustering with a Euclidean distance to cluster the components of CP(pCR), corresponding to the bins of pCR, into $N_c$ subsets. A different GLM choice parameter $b_j$ was then assigned to each choice-parameter level $j=1,\ldots,N_c$.
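To make the structure of Equation 24 concrete, the Python sketch below builds the predictor vector for a single trial and evaluates the corresponding Poisson mean rate. It is an illustration only: the function names are hypothetical, the subsets $P_j$ are represented as sets of pCR bin indices (an assumed encoding), and the coefficients in the usage example are arbitrary rather than fitted values.

```python
import math


def design_row(s, D, p_cr_bin, subsets):
    """Predictors of Equation 24 for one trial.

    s        : stimulus level (e.g. signed coherence)
    D        : choice, +1 or -1
    p_cr_bin : index of the pCR bin of the trial's stimulus level
    subsets  : list of N_c sets of bin indices, one per choice parameter b_j
    """
    stimulus_terms = [s ** j for j in range(5)]  # s^0 ... s^4
    choice_terms = [float(D) if p_cr_bin in P else 0.0 for P in subsets]
    return stimulus_terms + choice_terms


def predicted_rate(row, coefficients):
    """Mean rate of the Poisson GLM with a log link."""
    return math.exp(sum(c * x for c, x in zip(coefficients, row)))


# Stimulus-independent-choice model: one subset containing all five bins.
independent = [{0, 1, 2, 3, 4}]
# Stimulus-dependent-choice model with N_c = 2: outer bins versus inner bins.
dependent = [{0, 4}, {1, 2, 3}]

row = design_row(2.0, -1, 3, dependent)
rate = predicted_rate(row, [1.0, 0.1, 0.0, 0.0, 0.0, 0.3, 0.1])
```

Because each bin belongs to exactly one subset, only one choice parameter is active for a given trial, so a non-monotonic profile can be captured by grouping non-adjacent bins under the same parameter.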

We compared the predictive power of the three types of models using cross-validation. To prevent the fitted choice parameters from being affected by the ratio of trials with each choice, we matched the number of trials of each choice used to fit the model at each choice-parameter level. We first merged the trials of all stimulus levels assigned to the same choice-parameter level into two pools, one for each choice. We then set the number of trials from each pool to be included in the fitting set to 80% of the trials available in the smaller pool, hence matching the number of trials selected from each choice. The remaining trials were left for the testing set. This procedure was repeated for each choice-parameter level, and a GLM was fitted on the fitting set obtained by combining the selected trials for all levels. This random separation between fitting and testing sets was repeated 50 times and the average predictive power was calculated. Performance was then quantified by comparing the increase in the likelihood of the data in the testing set with respect to the likelihood of the null model, which assumes a constant firing rate ($L_0$). To determine whether incorporating the choice as a predictor improved the prediction, we examined the relative increase in likelihood (RIL), defined as the ratio of the likelihood increase $L(\mathrm{choice},\mathrm{stimulus})-L(\mathrm{stimulus})$ to the increase $L(\mathrm{stimulus})-L_0$. For the stimulus-dependent-choice models, we selected the most predictive model among $N_c=2,3$. To evaluate the improvement when considering stimulus-dependent choice influences, we compared the RIL obtained for the stimulus-dependent-choice and stimulus-independent-choice models.
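The trial-matching step and the RIL can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the 80% fraction is rounded down, and the likelihoods $L$ are taken to be Poisson log-likelihoods, which the text does not state explicitly.

```python
import math
import random


def balanced_split(pool_choice_pos, pool_choice_neg, fraction=0.8, rng=None):
    """Split two choice pools into fitting and testing sets with matched choices.

    The same number of trials, a fraction of the smaller pool (rounded down,
    an assumption), is drawn at random from each pool for the fitting set.
    """
    rng = rng or random.Random(0)
    n_fit = int(fraction * min(len(pool_choice_pos), len(pool_choice_neg)))
    fit, test = [], []
    for pool in (pool_choice_pos, pool_choice_neg):
        shuffled = list(pool)
        rng.shuffle(shuffled)
        fit.extend(shuffled[:n_fit])
        test.extend(shuffled[n_fit:])
    return fit, test


def poisson_log_likelihood(counts, rates):
    """Log-likelihood of spike counts under Poisson rates (assumed form of L)."""
    return sum(k * math.log(mu) - mu - math.lgamma(k + 1)
               for k, mu in zip(counts, rates))


def relative_increase_in_likelihood(l_choice_stimulus, l_stimulus, l_null):
    """RIL = [L(choice, stimulus) - L(stimulus)] / [L(stimulus) - L0]."""
    return (l_choice_stimulus - l_stimulus) / (l_stimulus - l_null)
```

In the procedure described above, the split would be drawn 50 times per choice-parameter level and the test-set RIL averaged across repetitions.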

Code availability

The code for the analysis of choice probability stimulus dependencies and for the GLMs with stimulus-choice interaction terms is available at https://github.com/DanielChicharro/CP_DP (copy archived at swh:1:rev:5850c573860eb04317e7dc550f96b1f47ca91c6a).

Acknowledgements

This work was supported by the BRAIN Initiative (grants No. R01 NS108410 and No. U19 NS107464 to SP, U19 NS118246 to RMH), the Fondation Bertarelli, and the CRCNS program (grant R01 EY028811 to RMH). We thank KH Britten, WT Newsome, MN Shadlen, S Celebrini, and JA Movshon for making their data available in the Neural Signal Archive (http://www.neuralsignal.org/), and W Bair for maintaining this database.

Appendix 1

The analytical threshold model of choice probability

We here provide details of how the CP analytical expression of Equation 16 is obtained from the definition of Choice Probability (Equation 12) when the probability of the responses for each choice has the form of Equation 15, derived from the threshold model. We subsequently characterize the statistical power for the detection of the threshold-induced modulation h(pCR), as a function of the magnitude of the CP, the number of trials, and the number of cells used to estimate an average CP(pCR) profile.

Derivation of the CP analytical expression

Plugging the distribution of Equation 15 into the definition of the CP we get

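$$CP_i = \frac{1}{2\,p_{CR}(1-p_{CR})}\left[1 - \alpha\int_{-\infty}^{\infty} dx\, \phi(\alpha x + c)\,\Phi^2(x) - \left(\int_{-\infty}^{\infty} dx\, \phi(x)\,\Phi(\alpha x + c)\right)^2\right]. \tag{A1}$$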

This expression is derived analogously to Equation S1.2 in Haefner et al., 2013, and generalizes the case examined there, which corresponds to c=0. To solve this expression, we use some results involving integrals of normal distributions:

$$\int_{-\infty}^{\infty} dx\, x^q\, \phi(x)\,\Phi^n(\alpha x + c) = (1-q)\,\Phi\!\left(\frac{c}{b}\right) + q\,\frac{\alpha}{b}\,\phi\!\left(\frac{c}{b}\right) + 2(1-n)\,T\!\left(\frac{c}{b}, \frac{1}{\sqrt{1+2\alpha^2}}\right), \tag{A2}$$

where $b=\sqrt{1+\alpha^2}$ and $T$ is Owen's T function (Owen, 1956). The equality above is valid for the cases $q=0$, $n=1,2$, and $q=1$, $n=1$, which we used to derive the expressions of the CP and CTA. Inserting the equality for $q=0$, $n=1,2$ into Equation (A1), we obtain the CP expression of Equation 16. The case $q=1$, $n=1$ in turn allows deriving the expression of the CTA (Equation 17), by calculating $\langle r_i\rangle_{D=1}$ and $\langle r_i\rangle_{D=-1}$ using the form of Equation 15 for $P(r_i|D=1)$.
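The three cases of Equation (A2) can be checked numerically. The sketch below is only an illustration: it uses composite Simpson integration on a truncated interval and evaluates Owen's T function from its standard integral definition, $T(h,a)=\frac{1}{2\pi}\int_0^a \frac{e^{-h^2(1+x^2)/2}}{1+x^2}\,dx$. The function names and the test values of $\alpha$ and $c$ are arbitrary choices for this example.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def simpson(f, a, b, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def owens_t(h, a):
    """Owen's T function from its integral definition."""
    integrand = lambda x: math.exp(-0.5 * h * h * (1.0 + x * x)) / (1.0 + x * x)
    return simpson(integrand, 0.0, a) / (2.0 * math.pi)


def a2_left(q, n, alpha, c):
    """Left-hand side of (A2), truncating the integral to [-10, 10]."""
    return simpson(lambda x: x ** q * phi(x) * Phi(alpha * x + c) ** n, -10.0, 10.0)


def a2_right(q, n, alpha, c):
    """Right-hand side of (A2)."""
    b = math.sqrt(1.0 + alpha ** 2)
    return ((1 - q) * Phi(c / b)
            + q * (alpha / b) * phi(c / b)
            + 2 * (1 - n) * owens_t(c / b, 1.0 / math.sqrt(1.0 + 2.0 * alpha ** 2)))


# Largest discrepancy over the three valid (q, n) cases for arbitrary alpha, c.
max_error = max(abs(a2_left(q, n, 0.7, 0.3) - a2_right(q, n, 0.7, 0.3))
                for q, n in [(0, 1), (0, 2), (1, 1)])
```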

The CP linear approximation of Equation 4 is generically valid when the activity-choice covariations are well captured by the linear dependence between the responses and the choice. It can be derived generically with a first-order approximation in which $d$ or $D$ only affects the mean of the distribution of the responses $r_i$. We here present only a restricted derivation, starting from the exact CP solution resulting from the threshold model. It can be checked that the same approximation follows, for example, from the exact solution of the CP obtained when taking the conditional distributions $p(r_i|D)$ to be Gaussians (Dayan and Abbott, 2001; Carnevale et al., 2013) rather than skew normals (Equation 15) as in the threshold model. Expanding Equation 16 in terms of $\rho_{r_i d}$ we get a polynomial approximation

$$CP_i = \frac{1}{2} + \frac{\sqrt{2}\,h(p_{CR})}{\pi}\,\rho_{r_i d} + \sum_{k=1}^{\infty} \left.\frac{d^{2k}}{d\rho_{r_i d}^{2k}}\left(\frac{\exp\left[-\frac{(\Phi^{-1}(p_{CR}))^2}{2}\,\frac{2}{2-\rho_{r_i d}^2}\right]}{2\pi\, p_{CR}(1-p_{CR})\sqrt{2-\rho_{r_i d}^2}}\right)\right|_{\rho_{r_i d}=0} \frac{\rho_{r_i d}^{2k+1}}{(2k+1)!}. \tag{A3}$$

This expansion contains only odd-order terms, so that $CP-0.5$ is antisymmetric with respect to the sign of $\rho_{r_i d}$. This explains why Haefner et al., 2013 found that the linear approximation was accurate for a wide range of $\rho_{r_i d}$ values, since the choice correlation needs to be high for the contribution of $\rho_{r_i d}^3$ to be relevant. Up to third order,

$$CP_i \approx \frac{1}{2} + \frac{\sqrt{2}\,h(p_{CR})}{\pi}\left[\rho_{r_i d} + \frac{1-(\Phi^{-1}(p_{CR}))^2}{12}\,\rho_{r_i d}^3\right]. \tag{A4}$$

Since $\Phi^{-1}(0.5)=0$, for $p_{CR}$ values for which $|1-(\Phi^{-1}(p_{CR}))^2|<1$ the third-order term makes a smaller contribution than for the uninformative case. This is true for $\Phi^{-1}(p_{CR})\in(-\sqrt{2},\sqrt{2})$, which leads to $p_{CR}\in(0.08,0.92)$. This means that the linear approximation is expected to be an even better approximation in this range than for $p_{CR}=0.5$. Furthermore, for $(\Phi^{-1}(p_{CR}))^2<1$ the third-order contribution is positive, so that for the $p_{CR}$ values fulfilling this constraint, $p_{CR}\in(0.16,0.84)$, the linear approximation is expected to underestimate the CP. The range of $p_{CR}$ in which the linear approximation underestimates or overestimates the CP can be seen in Figure 2 of the main article.
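The ranges quoted above and the sign of the third-order correction can be reproduced with a few lines of code. In this sketch the factor $h(p_{CR})$ is not copied from the main text: it is taken to be the value implied by matching the first-order term of Equation (A3), $h(p_{CR})=\exp[-(\Phi^{-1}(p_{CR}))^2/2]/(4p_{CR}(1-p_{CR}))$, which is an assumption of this example and gives $h(0.5)=1$. Function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def h_factor(p_cr):
    """h(p_CR) implied by the first-order term of (A3); assumed form."""
    z = _STD_NORMAL.inv_cdf(p_cr)
    return math.exp(-0.5 * z * z) / (4.0 * p_cr * (1.0 - p_cr))


def cp_linear(rho, p_cr):
    """Linear approximation: 1/2 + sqrt(2) h(p_CR) rho / pi."""
    return 0.5 + math.sqrt(2.0) / math.pi * h_factor(p_cr) * rho


def cp_third_order(rho, p_cr):
    """Third-order approximation of Equation (A4)."""
    z = _STD_NORMAL.inv_cdf(p_cr)
    correction = (1.0 - z * z) / 12.0 * rho ** 3
    return 0.5 + math.sqrt(2.0) / math.pi * h_factor(p_cr) * (rho + correction)


# Boundaries of the two p_CR ranges discussed in the text.
upper_smaller_third_order = _STD_NORMAL.cdf(math.sqrt(2.0))   # about 0.92
upper_positive_third_order = _STD_NORMAL.cdf(1.0)             # about 0.84
```

For example, for a choice correlation of 0.3 the third-order value exceeds the linear one at $p_{CR}=0.5$ and falls below it at $p_{CR}=0.95$, in agreement with the ranges above.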

Statistical power for the detection of threshold-induced CP stimulus dependencies

In Figure 2, we showed the shape and magnitude of the threshold-induced CP modulation for different values of CP(pCR=0.5). The magnitude of this modulation is small and is most noticeable for the extreme pCR values, for which the estimation of the CPs is also the poorest. As discussed in the Results, we calculated weighted average CPs both across stimulus levels and across cells to reduce the standard error of the resulting averaged within-cell CP(pCR) profiles. We here characterize the statistical power for the detection of this CP modulation as a function of the magnitude of CP(pCR=0.5), the number of cells, and the number of trials per stimulus level. For this purpose, we generated responses following the probability distribution analytically derived in the model (Equation 15). We selected five CP(pCR=0.5) levels, namely {0.55,0.6,0.65,0.7,0.75}, and for each we simulated cell responses corresponding to different pCR values. We simulated responses for a collection of 5000 cells, selecting their CP(pCR=0.5) values from a uniform distribution centered at each CP(pCR=0.5) level, and a range of width 0.1. We repeated the same procedure using different numbers of trials. For each selection of the number of trials, we generated three times that number of trials for the pCR bin corresponding to an uninformative stimulus, to allow the shuffling of trials used in the construction of surrogate data, as for the experimental data. We estimated averaged CP(pCR) profiles for different numbers N of cells. We repeated this estimation 500 times, independently randomly sampling from the 5000 cells the N cells used for the average. For each of these repetitions, we generated the surrogate data required to implement the surrogates test described in Methods.
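A simplified version of such a simulation, for a single cell, can be sketched as follows. The sketch assumes that the threshold model amounts to a response $r$ and a decision variable $d$ that are jointly Gaussian with correlation $\rho$, with the choice given by thresholding $d$; it does not reproduce the surrogate test or the averaging across cells, and all names and parameter values are illustrative.

```python
import math
import random
from statistics import NormalDist


def area_under_roc(responses_pos, responses_neg):
    """CP estimate: probability that a D=1 response exceeds a D=-1 response."""
    labeled = sorted([(r, 1) for r in responses_pos] + [(r, 0) for r in responses_neg])
    rank_sum = sum(rank for rank, (_, label) in enumerate(labeled, start=1) if label == 1)
    n_pos, n_neg = len(responses_pos), len(responses_neg)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def simulate_cp(rho, p_cr, n_trials, seed=0):
    """Simulate one cell under the assumed threshold model and estimate its CP."""
    rng = random.Random(seed)
    threshold = NormalDist().inv_cdf(1.0 - p_cr)   # so that P(d > threshold) = p_cr
    responses_pos, responses_neg = [], []
    for _ in range(n_trials):
        d = rng.gauss(0.0, 1.0)
        r = rho * d + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        (responses_pos if d > threshold else responses_neg).append(r)
    return area_under_roc(responses_pos, responses_neg)


# Uninformative stimulus (p_CR = 0.5) and a choice correlation of 0.3.
cp_estimate = simulate_cp(rho=0.3, p_cr=0.5, n_trials=20000)
```

With these settings the estimate should lie close to the linear prediction $1/2+\sqrt{2}\,\rho/\pi$, up to sampling error.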

Appendix 1—figure 1 shows the p-values obtained when testing for a symmetric increase of the CP with extreme pCR values. As expected, p-values decrease with the number of cells, the number of trials, and the magnitude of the CP. This characterization of the p-values indicates the utility of our method for calculating averaged within-cell CP(pCR) profiles, which combines CP estimates across neighboring stimulus levels (analogous to increasing the number of trials) and averages CP estimates across cells. We can compare these predicted p-values with the p-values obtained when analyzing the experimental data. Concretely, for cluster 2 in Figure 4B, which contains N=48 cells, the average value is CP(pCR=0.5)=0.57 and, for each of the two pCR bins most distant from pCR=0.5, the average number of trials available is K=147, comprising all stimulus levels assigned to those bins. These values of CP(pCR=0.5), N, and K resulted in a p-value p=0.0008 (Figure 4B), which is substantially smaller than that predicted in the simulations of Appendix 1—figure 1A. Accordingly, while the model correctly predicts the existence of a symmetric CP modulation with higher CP values for extreme pCR values, it underestimates its actual strength. As discussed in Results, this stronger symmetric modulation may be due to other symmetric contributions from CC(pCR) in addition to h(pCR), or to a dynamic amplification of the threshold-induced modulation, for example due to reinforcing decision-related feedback signals. Independently of the origin of this larger effect size, the method developed to better characterize within-cell CP(pCR) dependencies allowed us to detect the presence of CP stimulus dependencies in the Britten et al., 1996 data, and promises to be a useful tool to characterize the properties of these dependencies.

Appendix 1—figure 1. Statistical power for the detection of threshold-induced CP stimulus dependencies.

For cell responses generated from the threshold model (see text for details), the figure characterizes the p-values obtained in the surrogates test used to assess the presence of a symmetric modulation of the CP, with increased CP values for unbalanced choice rates. Each panel presents the p-values as a function of the number of cells included to calculate an averaged CP(pCR) profile, and of the CP(pCR=0.5) value. Curves corresponding to different CP values are shifted for a better visualization of the error bars (standard deviation of the p-value across 500 simulations). The horizontal dashed line indicates the threshold of significance p=0.05.

Appendix 2

The relation between the weighted average CP and the grand CP of z-scored responses

We here describe the connection between a weighted average CP and the corrected z-scoring procedure that Kang and Maunsell, 2012 used to calculate a grand CP pooling the responses across stimulus levels. We will use $z_c$ to refer to the responses with the corrected z-scoring, as opposed to the standard z-scoring $z$. For each stimulus level $s$, the corrected z-score is calculated as $z_c=(r-\tilde{\mu}_{r|s})/\tilde{\sigma}_{r|s}$ with

$$\tilde{\mu}_{r|s} \equiv \frac{\mu_{r|D=1,s}+\mu_{r|D=-1,s}}{2}, \qquad \tilde{\sigma}_{r|s} \equiv \sqrt{\frac{\sigma^2_{r|D=1,s}+\sigma^2_{r|D=-1,s}}{2}+\frac{\Delta\mu^2_{r|D,s}}{4}}, \tag{A5}$$

where $\mu_{r|D=\pm1,s}$ are the mean responses for each choice at stimulus $s$, $\sigma^2_{r|D=\pm1,s}$ are the variances of the responses for each choice at stimulus $s$, and $\Delta\mu_{r|D,s}=\mu_{r|D=1,s}-\mu_{r|D=-1,s}$ is the CTA for the fixed stimulus level $s$.
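For a single stimulus level, Equation (A5) can be implemented in a few lines. The sketch below is illustrative only: the function name is hypothetical, and the use of population variances (rather than sample variances) is an assumption not specified in the text.

```python
from statistics import fmean, pvariance


def corrected_zscore(responses, choices):
    """Corrected z-scores of Equation (A5) for trials of one stimulus level.

    responses : spike counts or rates, one per trial
    choices   : choices coded as +1 or -1, one per trial
    """
    r_pos = [r for r, D in zip(responses, choices) if D == 1]
    r_neg = [r for r, D in zip(responses, choices) if D == -1]
    mu_pos, mu_neg = fmean(r_pos), fmean(r_neg)
    delta_mu = mu_pos - mu_neg                      # CTA at this stimulus level
    mu_tilde = 0.5 * (mu_pos + mu_neg)
    sigma_tilde = (0.5 * (pvariance(r_pos) + pvariance(r_neg))
                   + delta_mu ** 2 / 4.0) ** 0.5
    return [(r - mu_tilde) / sigma_tilde for r in responses]


example_responses = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0, 9.0]
example_choices = [-1, -1, -1, 1, 1, 1, 1]
example_zc = corrected_zscore(example_responses, example_choices)
```

By construction, the conditional means of the corrected z-scores for the two choices are equal in magnitude and opposite in sign, irrespective of the ratio of trials with each choice.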

The relation between the CP and the CTA of Equation 4 holds due to the definition of the measures and does not depend on the nature of the response variable, and hence also holds when the CP is not calculated for the raw responses but for their corrected z-scores. Furthermore, the relation holds when the CP is calculated for a fixed stimulus level, or when the responses are pooled across stimulus levels to calculate the grand CP. For the latter case

$$\mathrm{grand}\,CP_{z_c} \approx \frac{1}{2}+\frac{1}{2\sqrt{\pi}}\,\frac{\Delta\mu_{z_c|D}}{\sigma_{z_c}}, \tag{A6}$$

where we drop the cell index $i$ for simplicity. We here use the notation $\Delta\mu_{z_c|D}=\mathrm{CTA}_{z_c}$ and $\sigma_{z_c}=\sqrt{\mathrm{var}\,z_c}$, in comparison to Equation 4. Furthermore, in this section we need to explicitly distinguish the measures calculated for a fixed stimulus, indicated by the conditioning on $s$ in the subindex, from the measures calculated from the distribution of the pooled responses, which carry no stimulus subindex, for example $\Delta\mu_{z_c|D}$. The grand $CP_{z_c}$ and $\Delta\mu_{z_c|D}$ are calculated from the distributions $p(z_c|D=\pm1)$ obtained after pooling, while for a fixed stimulus $CP_{z_c}(s)$ and $\Delta\mu_{z_c|D,s}$ are calculated from $p(z_c|D=\pm1,s)$. The choice rate is defined as in the main text as $p_{CR}\equiv p(D=1|s)$, although previously the dependence on the fixed stimulus was implicit.

We use the definition of $z_c$ according to Equation (A5) to calculate $\Delta\mu_{z_c|D}$. Given the definitions of $\tilde{\mu}_{r|s}$ and $\tilde{\sigma}_{r|s}$, the conditional means are $\mu_{z_c|D=1,s}=\Delta\mu_{r|D,s}/(2\tilde{\sigma}_{r|s})$ and $\mu_{z_c|D=-1,s}=-\Delta\mu_{r|D,s}/(2\tilde{\sigma}_{r|s})$. We calculate $\Delta\mu_{z_c|D}$ as follows:

$$\Delta\mu_{z_c|D}=\mu_{z_c|D=1}-\mu_{z_c|D=-1}=\int ds\, p(s|D=1)\,\mu_{z_c|D=1,s}-\int ds\, p(s|D=-1)\,\mu_{z_c|D=-1,s}=\int ds\, \frac{1}{2}\left[p(s|D=1)+p(s|D=-1)\right]\frac{\Delta\mu_{r|D,s}}{\tilde{\sigma}_{r|s}}. \tag{A7}$$

The first equality corresponds to the definition of $\Delta\mu_{z_c|D}$. In the second equality, we have expressed the means $\mu_{z_c|D=\pm1}$ in terms of the conditional means $\mu_{z_c|D=\pm1,s}$ using the general property that the mean of a variable is equal to the average of its conditional means. The third equality results from inserting the actual values of $\mu_{z_c|D=\pm1,s}$. Given Equation (A7), the choice-triggered average obtained after pooling the normalized responses $z_c$ across stimulus levels corresponds to a weighted average of $\Delta\mu_{r|D,s}/\tilde{\sigma}_{r|s}$ across stimulus levels. Indeed, the factor $[p(s|D=1)+p(s|D=-1)]/2$ is properly normalized and plays the role of a weight $w_{z_c}(s)$, since $\int ds\, p(s|D=\pm1)=1$ and hence $\int ds\, w_{z_c}(s)=1$. Moreover, $\tilde{\sigma}_{r|s}$ only introduces a second-order correction with respect to the standard normalization with $\sigma_{r|s}$. In particular, given the skew-normal distributions (Equation 15) resulting from the threshold model, both $\Delta\mu^2_{r|D,s}$ and $\sigma^2_{r|D=\pm1,s}$ depend quadratically on the strength of the activity-choice covariations, as determined by the choice correlation (Arnold and Beaver, 2000; Azzalini, 2005). Neglecting this second-order correction we have that $\Delta\mu_{r|D,s}/\tilde{\sigma}_{r|s}\approx\Delta\mu_{r|D,s}/\sigma_{r|s}$ and $\sigma_{z_c}\approx\sigma_z=1$. Furthermore, taking into account the general relation between the CP and CTA (Equation 4), $\Delta\mu_{r|D,s}/\sigma_{r|s}$ approximates $2\sqrt{\pi}\,(CP_r(s)-1/2)$, and analogously, as indicated in Equation (A6), $\Delta\mu_{z_c|D}/\sigma_{z_c}$ approximates $2\sqrt{\pi}\,(CP_{z_c}-1/2)$. Altogether, Equation (A7) can be expressed for the CP as

$$CP_{z_c}\approx\int ds\, \frac{1}{2}\left[p(s|D=1)+p(s|D=-1)\right]CP_r(s). \tag{A8}$$

As mentioned above, the weights $w_{z_c}(s)\equiv[p(s|D=1)+p(s|D=-1)]/2$ are properly normalized, $\int w_{z_c}(s)\,ds=1$, and hence $CP_{z_c}$ is approximated as a weighted average of the CPs of the responses at each stimulus level, $CP_r(s)$. This shows that using corrected z-scores to pool responses across stimulus levels to calculate a grand CP is, in the linear approximation, equivalent to calculating a weighted average of the CPs at each stimulus level with a specific selection of the weights, namely $w_{z_c}(s)$. An analogous derivation with the uncorrected z-scoring shows that in that case the pooling across stimulus levels is associated with an improper use of unnormalized weights, which analytically confirms the arguments and simulations in Kang and Maunsell, 2012 indicating that a grand CP calculated with the standard z-scoring provides a biased estimate of an underlying stimulus-independent CP (see a detailed derivation in section S2 of Chicharro et al., 2019). The weights $w_{z_c}(s)$ differ from the ones inversely proportional to the standard error of the CP estimates (Equation 18). In particular, if the stimulus set is designed such that across all stimulus levels the rate of the choices is balanced, i.e., if $p(D=1)=p(D=-1)$, then these weights simplify to $w_{z_c}(s)=p(s)$, that is, the CPs are averaged according to the relative number of trials available for each stimulus level. When there are no CP stimulus dependencies, the weights related to the estimation error are preferable since the grand CP will provide a better estimate of the underlying constant CP. In the presence of CP stimulus dependencies, any grand CP calculated as a weighted average across stimulus levels may introduce confounds in the comparison of grand CPs across cell types, areas, or tasks. For example, if the distribution of the presented stimuli $p(s)$ is not uniform, the weights $w_{z_c}(s)=p(s)$ will assign a higher weight to the $CP(s)$ of certain stimulus levels, and differences in the grand CP across cells may reflect for which stimulus levels the compared cells have a higher $CP(s)$. Accordingly, characterizing the $CP(s)$ patterns can also help to understand whether differences in grand CPs reflect functionally meaningful differences or are produced by the weighted-average estimation of the grand CP.
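The weights of Equation (A8) can be computed for a discrete set of stimulus levels from $p(s)$ and the choice rates $p_{CR}(s)=p(D=1|s)$ using Bayes' rule. The following sketch uses hypothetical function names and made-up example values, and illustrates that the weights are normalized and reduce to $p(s)$ when the overall choice rate is balanced.

```python
def zc_weights(p_s, p_cr):
    """Weights w_zc(s) = [p(s|D=1) + p(s|D=-1)] / 2 for discrete stimulus levels.

    p_s  : probabilities of presenting each stimulus level (sum to 1)
    p_cr : choice rates p(D=1|s) for each stimulus level
    """
    p_d_pos = sum(ps * pc for ps, pc in zip(p_s, p_cr))     # p(D=1)
    p_d_neg = 1.0 - p_d_pos                                 # p(D=-1)
    weights = []
    for ps, pc in zip(p_s, p_cr):
        p_s_given_pos = pc * ps / p_d_pos                   # Bayes' rule
        p_s_given_neg = (1.0 - pc) * ps / p_d_neg
        weights.append(0.5 * (p_s_given_pos + p_s_given_neg))
    return weights


def weighted_average_cp(cp_s, weights):
    """Weighted average of the per-stimulus CPs, as in Equation (A8)."""
    return sum(cp * w for cp, w in zip(cp_s, weights))


# Illustrative stimulus set with balanced overall choice rate, p(D=1) = 0.5.
example_p_s = [0.1, 0.2, 0.4, 0.2, 0.1]
example_p_cr = [0.1, 0.3, 0.5, 0.7, 0.9]
example_weights = zc_weights(example_p_s, example_p_cr)
```

For an unbalanced design the weights remain normalized but no longer coincide with $p(s)$.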

Appendix 3

Clustering analysis

We here provide further details about the alternative procedures used to cluster the CP(pCR) profiles, about the visualization of the clusters, and about how to assess the significance of the CP(pCR) patterns. As a first step, we implemented a nonparametric k-means clustering analysis to cluster the CP(pCR) profiles of the 107 cells for which a full profile could be constructed. We started with C=2 clusters (Figure 4A) and found that this nonparametric approach, when using the cosine distance, recovered qualitatively the same patterns obtained when separating the cells a priori into those with an average CP higher or lower than 0.5 (Figure 3A). Of the patterns of the two clusters, only that of the cells with an average CP higher than 0.5 was found to be significant (see below for details on the significance analysis). Given this difference in significance, we subsequently increased the number of clusters in two alternative ways. In a first approach, we separated a priori the cells with an average CP higher or lower than 0.5 and continued the clustering analysis separately for these two groups. Appendix 3—figure 1A shows the subclusters obtained with C=2 for the two groups separately. As a second approach, we increased the number of clusters to C=3 without any previous separation into two groups. The resulting clusters (Appendix 3—figure 1B) indicated that the separation of the two subclusters for the cells with average CP higher than 0.5 appears naturally without enforcing the separation. Increasing the number of clusters without any a priori separation provided evidence that the two main patterns for cells with average CP higher than 0.5 are robust and still contain a substantial portion of the cells even for C=6 (Appendix 3—figure 1C). We therefore focused our subsequent analysis on characterizing the features of these symmetric and asymmetric patterns.

Appendix 3—figure 1. Subclustering of CP(pCR) dependencies.

(a) Analogous to Figure 4B but showing also the average profiles for the two subclusters obtained from cells with average CP<0.5. (b) Nonparametric k-means clustering with three clusters determined from all cells. (c) Nonparametric k-means clustering with six clusters determined from all cells. The clusters more similar to the ones of Figure 4B are correspondingly coloured.

To evaluate the significance of the CP(pCR) patterns found with the clustering analysis, we repeated the same clustering procedure for the surrogate data generated as described in Methods. For each surrogate, each of the C clusters found was associated with the most similar of the original patterns being tested. For example, in Figure 4B, when testing the significance of the symmetric and asymmetric patterns for cells with average CP higher than 0.5, each of the two surrogate cluster patterns was assigned to the more similar of the symmetric and asymmetric patterns. Subsequently, the average of ΔCPk across bins was calculated for the original and surrogate profiles as explained in the Methods. The p-value corresponding to each original pattern was calculated from the number of surrogate patterns associated with it for which the average ΔCPk was higher.
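The assignment of surrogate patterns and the resulting p-values can be sketched as follows. Several details are assumptions of this illustration and are not specified in the text: similarity is measured with the cosine similarity of profiles expressed as deviations from 0.5, the p-value is the fraction of the surrogates assigned to a pattern whose statistic is higher, and all names and numbers are made up.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def assign_to_pattern(profile, original_patterns):
    """Index of the original pattern most similar to a surrogate profile."""
    similarities = [cosine_similarity(profile, pattern) for pattern in original_patterns]
    return similarities.index(max(similarities))


def surrogate_p_values(original_stats, original_patterns, surrogate_profiles, surrogate_stats):
    """Fraction of the surrogates assigned to each pattern with a higher statistic."""
    p_values = []
    for k, stat in enumerate(original_stats):
        assigned = [s for profile, s in zip(surrogate_profiles, surrogate_stats)
                    if assign_to_pattern(profile, original_patterns) == k]
        p_values.append(sum(1 for s in assigned if s > stat) / len(assigned)
                        if assigned else float("nan"))
    return p_values


# Made-up symmetric and asymmetric patterns (CP - 0.5 across five p_CR bins).
symmetric = [0.06, 0.02, 0.0, 0.02, 0.06]
asymmetric = [-0.04, -0.02, 0.0, 0.02, 0.04]
```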

To visualize the clusters in Figure 4 and Appendix 3—figure 1A, we constructed orthonormal axes using either the vectors corresponding to the center of the clusters or the selected templates, for nonparametric and parametric clustering, respectively. In the case of nonparametric clustering, the x-axis corresponds to the separation between the two initial clusters, and is closely aligned to the departure of the average CP from 0.5. The y-axis was built as a projection orthogonal to the x-axis of the vector connecting the center of the two subclusters. When templates were used, the x-axis corresponds to the template with a constant CP and the y-axis was built as an orthogonal projection of the template with an asymmetric profile (a vector with a positive unit slope).

Appendix 4

The effect of gain fluctuations on the CP

We here derive a general CP expression that accounts for the effect of response gain fluctuations and is valid for a feedforward encoding/decoding decision-making model with any unbiased weights. We subsequently focus on the decoder based on two pools of cells with opposite choice preference, as previously studied in Shadlen et al., 1996. As described in Methods, we consider a decoder $d=\mathbf{w}^\top\mathbf{r}$ (Equation 8), estimating the decision variable $d$ from the responses $r_i=f_i(s)+\xi_i$, with tuning functions $\mathbf{f}(s)=(f_1(s),\dots,f_n(s))$ and a covariance structure $\Sigma(s)$ of the neurons' intrinsic variability $\xi_i$. The general expression of the CP, valid for any stimulus level and agnostic about the source of the activity-choice covariations, is

$$CP_i(s)\approx\frac{1}{2}+\frac{\sqrt{2}\,h(p_{CR})}{\pi}\,CC_i(s), \tag{A9}$$

and, as described in Equation 9, the CC is determined by the covariance matrix Σ(s) and the read-out weights w as

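$$CC_i(s)=\frac{(\Sigma(s)\mathbf{w})_i}{\sqrt{\mathbf{w}^\top\Sigma(s)\mathbf{w}\;\sigma^2_{r_i}(s)}}, \tag{A10}$$

where $\mathrm{cov}(r_i,d)(s)=(\Sigma(s)\mathbf{w})_i$ and $\sigma_d^2(s)=\mathbf{w}^\top\Sigma(s)\mathbf{w}$. Given that the covariance matrix has the structure $\Sigma(s)=\bar{\Sigma}+\sigma_G^2\,\mathbf{f}(s)\mathbf{f}^\top(s)$ (Equation 10), with the component $\sigma_G^2\,\mathbf{f}(s)\mathbf{f}^\top(s)$ induced by the gain fluctuations, a change $\Delta s=s-s_0$ in the stimulus level departing from the uninformative stimulus $s_0$ alters the covariance structure as indicated in Equation 22. This leads to the following perturbation of the CC

$$CC_i(s)\approx CC_i(s_0)\left[1-\frac{1}{2}\left(\frac{\Delta\sigma_d^2(s)}{\sigma_d^2(s_0)}+\frac{\Delta\sigma^2_{r_i}(s)}{\sigma^2_{r_i}(s_0)}\right)\Delta s\right]+\frac{\Delta\mathrm{cov}(r_i,d)(s)\,\Delta s}{\sigma_{r_i}(s_0)\,\sigma_d(s_0)}, \tag{A11}$$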

which is a generalization of Equation 23b, where $\Delta\sigma^2_{r_i}(s)$, $\Delta\sigma^2_d(s)$, and $\Delta\mathrm{cov}(r_i,d)(s)$ are the changes per unit $\Delta s$ produced in the variances and covariance. From Equations (A10) and 22, the change in the variability of the responses is $\Delta\sigma^2_{r_i}(s)=2\sigma_G^2 f_i(s_0)f_i'(s_0)$, the change in the variability of the decision variable is $\Delta\sigma_d^2(s)=2\sigma_G^2\,\mathbf{w}^\top\mathbf{f}(s_0)\,\mathbf{w}^\top\mathbf{f}'(s_0)$, and the covariance changes by $\Delta\mathrm{cov}(r_i,d)(s)=\sigma_G^2 f_i'(s_0)\,\mathbf{w}^\top\mathbf{f}(s_0)+\sigma_G^2 f_i(s_0)\,\mathbf{w}^\top\mathbf{f}'(s_0)$. Furthermore, for any unbiased decoder $\mathbf{w}^\top\mathbf{f}'(s_0)=1$, so that $d=\mathbf{w}^\top\mathbf{r}$ properly estimates $d(s)-d(s_0)=\mathbf{w}^\top\mathbf{f}'(s_0)\Delta s=\Delta s$ (Moreno-Bote et al., 2014). Taking this into account, Equation (A11) can be expressed as

$$CC_i(s)\approx CC_i(s_0)\left[1-\sigma_G(\lambda_d\eta_d+\lambda_i\eta_i)\Delta s\right]+\sigma_G(\lambda_d\eta_i+\lambda_i\eta_d)\Delta s. \tag{A12}$$

Here $\lambda_i\equiv\sigma_{r_i g}(s_0)/\sigma_{r_i}(s_0)$ quantifies the portion of the response variance of cell $i$ caused by the gain fluctuations, as defined in Equation 11, with $\sigma_{r_i g}(s_0)=\sigma_G f_i(s_0)$ the standard deviation induced by the gain component of the covariance, and $\lambda_d\equiv\sigma_G\,\mathbf{w}^\top\mathbf{f}(s_0)/\sigma_d(s_0)$ quantifies the portion of the variance of the decision variable caused by the gain fluctuations. We define $\eta_i\equiv f_i'(s_0)/\sigma_{r_i}(s_0)$, which quantifies the neurometric sensitivity of the cell, and $\eta_d\equiv1/\sigma_d(s_0)$, which quantifies the behavioral sensitivity. Note that Pitkow et al., 2015 defined the so-called neural threshold $\theta_i$ and behavioral threshold $\theta_d$ such that $\eta_i=1/\theta_i$ and $\eta_d=1/\theta_d$, but in our case the measures of sensitivity are better suited to describe the dependence of the CC. In particular, the $CC_i(s_0)$ of the uninformative stimulus has an attenuation factor that depends on the relative increase in variability in the responses and in the decision variable, each determined by the product $\lambda_k\eta_k$ ($k\in\{i,d\}$) of the sensitivity to the change in the stimulus and the relative magnitude of the variance produced by the gain fluctuations. On the other hand, the additional contribution to $CC_i(s)$ depends on the cross-products $\lambda_k\eta_{k'}$ ($k\neq k'$) of the relative magnitude of the gain-related variance for the cell and the sensitivity of the decision variable, or vice versa. Rearranging the terms, this equation can also be expressed as

$$CC_i(s)\approx CC_i(s_0)+\sigma_G\left[\lambda_i\eta_d\left(1-CC_i(s_0)\frac{\eta_i}{\eta_d}\right)+\lambda_d\eta_i\left(1-CC_i(s_0)\frac{\eta_d}{\eta_i}\right)\right]\Delta s. \tag{A13}$$

From this expression, Equation 23b in Methods is recovered for the case of the optimal decoder, since the CC with the optimal decoder at $s_0$ is equal to the ratio of the behavioral to the neural threshold (Pitkow et al., 2015), that is, $CC_i(s_0)=\eta_i/\eta_d$, as indicated in Equation 21. For this optimal decoder the second summand of Equation (A13) cancels out, while in the first $CC_i(s_0)\,\eta_i/\eta_d$ equals $CC_i^2(s_0)$, ensuring that the slope of the dependence on $\Delta s$ is positive. More generally, $\lambda_d\propto\mathbf{w}^\top\mathbf{f}(s_0)$ is zero when the decoder is uncorrelated with the magnitude of the tuning curves. In this case the performance of the decoder is not affected by global gain fluctuations when presenting a non-informative stimulus (Moreno-Bote et al., 2014; Ecker et al., 2016) and the gain does not contribute to the variability of $d$ or its covariance with the cell responses. This additional assumption was used to determine Equation 23a, such that only the variance of the cell changes with respect to the case of no gain fluctuations.
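The first-order expression can be compared numerically with the exact choice correlation of Equation (A10). The sketch below makes several simplifying assumptions that are not part of the derivation: the gain-independent covariance $\bar{\Sigma}$ is diagonal, the tuning functions are linear around $s_0$, and the parameter values for three cells are arbitrary. The read-out weights are rescaled so that the decoder is unbiased.

```python
import math


def cc_exact(f, w, sigma_bar2, sigma_g2):
    """Exact CC_i for Sigma = diag(sigma_bar2) + sigma_g2 * f f^T (Equation A10)."""
    wf = sum(wi * fi for wi, fi in zip(w, f))
    cov = [sb * wi + sigma_g2 * fi * wf for sb, wi, fi in zip(sigma_bar2, w, f)]
    var_d = sum(wi * ci for wi, ci in zip(w, cov))
    var_r = [sb + sigma_g2 * fi * fi for sb, fi in zip(sigma_bar2, f)]
    return [ci / math.sqrt(var_d * vr) for ci, vr in zip(cov, var_r)]


def cc_slope_first_order(f0, f_prime, w, sigma_bar2, sigma_g2):
    """Slope of CC_i with respect to Delta s predicted by the first-order expression."""
    sigma_g = math.sqrt(sigma_g2)
    wf = sum(wi * fi for wi, fi in zip(w, f0))
    var_d = sum(wi * wi * sb for wi, sb in zip(w, sigma_bar2)) + sigma_g2 * wf * wf
    sigma_d = math.sqrt(var_d)
    lam_d, eta_d = sigma_g * wf / sigma_d, 1.0 / sigma_d
    slopes = []
    for fi, fpi, sb, cc0 in zip(f0, f_prime, sigma_bar2, cc_exact(f0, w, sigma_bar2, sigma_g2)):
        sigma_r = math.sqrt(sb + sigma_g2 * fi * fi)
        lam_i, eta_i = sigma_g * fi / sigma_r, fpi / sigma_r
        slopes.append(-cc0 * sigma_g * (lam_d * eta_d + lam_i * eta_i)
                      + sigma_g * (lam_d * eta_i + lam_i * eta_d))
    return slopes


# Arbitrary example: rates and derivatives at s0, diagonal variances, gain strength.
F0, F_PRIME = [10.0, 12.0, 8.0], [1.0, 0.5, -0.5]
SIGMA_BAR2, SIGMA_G2 = [4.0, 5.0, 3.0], 0.01
raw_w = [0.5, 0.2, -0.3]
scale = sum(wi * fp for wi, fp in zip(raw_w, F_PRIME))
W = [wi / scale for wi in raw_w]                  # unbiased decoder: w . f'(s0) = 1
```

A central finite difference of `cc_exact` around $s_0$, with $\mathbf{f}(s_0\pm\Delta s)=\mathbf{f}(s_0)\pm\mathbf{f}'(s_0)\Delta s$, can then be compared with `cc_slope_first_order`.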

We now additionally examine the decoder formed by two pools of cells with opposite choice preference, because of the role it has played in previous understanding of activity-choice covariations (Shadlen et al., 1996; Cohen and Newsome, 2009b). We consider the particular configuration examined in Haefner et al., 2013, such that the decoder is formed by $n$ neurons divided into two pools of $n/2$ neurons, all with the same variance $\sigma^2_{r_i}(s_0)$, and with the same intra-pool covariance $\mathrm{cov}_{\parallel}(r_i,r_j)$ for all pairs of cells within the same pool and the same inter-pool covariance $\mathrm{cov}_{\perp}(r_i,r_j)$ for all pairs across pools. The read-out weights all have the same magnitude, with opposite sign for the cells of the two pools. For this configuration, Haefner et al., 2013 derived (see their Suppl. Material S5) that $CC_i(s_0)\approx\pm\sqrt{1/n+\Delta\varrho/2}$, where $\Delta\varrho\equiv(1-2/n)\rho_{\parallel}-\rho_{\perp}$ is the difference between the intra- and inter-pool correlations, in the limit of a large pool, and the sign of the choice correlation is opposite across pools. They also showed that $\sigma_d(s_0)=n\,\sigma_{r_i}(s_0)|CC_i(s_0)|/c$, with $c$ being a normalization factor of the weights that ensures that the decoder is unbiased ($\mathbf{w}^\top\mathbf{f}'(s_0)=1$). Accordingly, for this decoder $\eta_i/\eta_d=n f_i'(s_0)|CC_i(s_0)|/c$. Assuming that for a neuron/antineuron pair in the two pools the firing rates have derivatives of the same magnitude and opposite sign, $c=n\langle|f'(s_0)|\rangle$, where $\langle|f'(s_0)|\rangle$ is the average magnitude of the derivatives. This leads to $\eta_i/\eta_d=f_i'(s_0)|CC_i(s_0)|/\langle|f'(s_0)|\rangle$. Further following the idea that the two pools contain neurons and antineurons with the same response properties but opposite choice preference (Cohen and Newsome, 2009b), $\lambda_d\propto\mathbf{w}^\top\mathbf{f}(s_0)=0$, since for the uninformative stimulus each neuron with tuning curve $f_k(s_0)$ is paired with a cell in the other pool with the same firing rate and a weight of equal magnitude but opposite sign. Accordingly, for this decoder Equation (A13) takes the form

$$CC_i(s)\approx CC_i(s_0)+\sigma_G\lambda_i\eta_d\left[1-CC_i^2(s_0)\frac{f_i'(s_0)}{\langle|f'(s_0)|\rangle}\right]\Delta s. \tag{A14}$$

Given the structure of the covariance matrix, this decoder is in fact optimal if, furthermore, all cells have the same derivative $f_i'(s_0)$, in which case Equation (A14) equals Equation 11 and for all cells the $CP(s)$ pattern has a positive slope as a function of $\Delta s$. More generally, with this two-pool decoder and covariance matrix, the $CP(s)$ pattern can have a negative slope for those cells with larger derivatives. Indeed, only for the optimal decoder does the general model of Equation (A13) guarantee a positive slope.

Finally, in Appendix 4—figure 1 we further characterize the dependencies predicted by the model when using an optimal decoder (Equations 11 and 23). Appendix 4—figure 1A shows the covariation of the coefficients $\beta_{p_{CR}}\equiv\sigma_G\lambda_i(1-CC_i^2(p_{CR}=0.5))$ from Equation 23b and $\beta_{CC}\equiv\sqrt{1-\lambda_i^2}$ from Equation 23a, which modulate the strength of the $CP(p_{CR})$ dependence and the magnitude of $CP_i(p_{CR}=0.5)$, respectively. We determined $\lambda_i^2$ only in terms of the strength of the gain fluctuations, as for Figure 4E, namely as $\lambda_i^2=1/(1+0.1/\sigma_G^2)$. The range $\sigma_G^2\in[0,0.5]$ corresponds to $\lambda_i^2\in[0,0.83]$. Appendix 4—figure 1B-C further illustrate how combinations of different $CC_i^0(p_{CR}=0.5)$ and $\sigma_G^2$ populate the 2-D space of $CP(p_{CR})$ profiles as in Figure 4D,E. $CP(p_{CR})$ profiles were simulated by randomly sampling the average CP values from the ones observed for the MT cells. For Appendix 4—figure 1B,D, the strengths of the gain fluctuations were uniformly sampled from the interval $\sigma_G^2\in[0,0.1]$, corresponding to $\lambda_i^2\in[0,0.5]$. For Appendix 4—figure 1C,E, the 2-D space was not evenly sampled, simulating a further dependence between $CC_i^0(p_{CR}=0.5)$ and $\sigma_G^2$ values, which determines the exact balance between the symmetric and asymmetric dependencies observed in the average profiles associated with each cluster.
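The mapping between the strength of the gain fluctuations and the coefficients can be reproduced with the short sketch below. It assumes the forms $\lambda_i^2=1/(1+0.1/\sigma_G^2)$, $\beta_{CC}=\sqrt{1-\lambda_i^2}$, and $\beta_{p_{CR}}=\sigma_G\lambda_i(1-CC_i^2)$; the square root in $\beta_{CC}$ is an interpretation consistent with only the variance of the cell changing in Equation 23a. Function names are hypothetical.

```python
import math


def lambda_squared(sigma_g2, baseline=0.1):
    """Fraction of response variance due to gain fluctuations: 1 / (1 + 0.1 / sigma_G^2)."""
    return 0.0 if sigma_g2 == 0 else 1.0 / (1.0 + baseline / sigma_g2)


def beta_cc(sigma_g2):
    """Attenuation of the choice correlation at the uninformative stimulus (assumed form)."""
    return math.sqrt(1.0 - lambda_squared(sigma_g2))


def beta_pcr(sigma_g2, cc):
    """Coefficient of the asymmetric dependence: sigma_G * lambda_i * (1 - CC_i^2)."""
    return math.sqrt(sigma_g2) * math.sqrt(lambda_squared(sigma_g2)) * (1.0 - cc ** 2)


# Upper ends of the two ranges of gain strength quoted in the text.
lambda2_at_0p5 = lambda_squared(0.5)   # about 0.83
lambda2_at_0p1 = lambda_squared(0.1)   # 0.5
```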

Appendix 4—figure 1. Modeling the influence of neuronal gain modulation on CP(pCR) profiles.

(a) Dependence of the gain coefficients $\beta_{CC}$ and $\beta_{p_{CR}}$ (Equations 23) on the strength of the gain fluctuations, $\sigma_G^2$. $\beta_{CC}$ determines their effect on the choice correlation $CC_i(s_0)$ for the uninformative stimulus $s_0$. $\beta_{p_{CR}}$ determines the degree of asymmetry of the dependence of the choice correlation on $p_{CR}$. (b) $CP(p_{CR})$ profiles, represented in the same 2-D space as in Figure 4D,E, generated with a uniform sampling of $CC_i^0(s_0)$ consistent with the observed average CPs of the MT cells, and with a uniform sampling of the gain ($\sigma_G^2\sim\mathcal{U}(0,0.1)$). (c) Analogous to b, but with a nonuniform distribution in the 2-D space, reflecting structure in the covariation of $CC_i^0(s_0)$ and $\sigma_G^2$. (d–e) $CP(p_{CR})$ profiles corresponding to the cluster centers obtained when sampling the space according to panels b and c, respectively.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Daniel Chicharro, Email: Daniel_Chicharro@hms.harvard.edu.

Ralf M Haefner, Email: ralf.haefner@rochester.edu.

Joshua I Gold, University of Pennsylvania, United States.

Kristine Krug, University of Oxford, United Kingdom.

Funding Information

This paper was supported by the following grants:

  • National Institute of Neurological Disorders and Stroke R01 NS108410 to Stefano Panzeri.

  • National Institute of Neurological Disorders and Stroke U19 NS107464 to Stefano Panzeri.

  • National Eye Institute R01 EY028811 to Ralf M Haefner.

  • Fondation Bertarelli to Daniel Chicharro.

  • National Institute of Neurological Disorders and Stroke U19 NS118246 to Ralf M Haefner.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Data curation, Formal analysis, Investigation, Visualization, Methodology, Writing - original draft.

Conceptualization, Supervision, Funding acquisition, Writing - review and editing.

Conceptualization, Formal analysis, Supervision, Funding acquisition, Writing - review and editing.

Additional files

Transparent reporting form

Data availability

No data was collected as part of this study.

The following previously published dataset was used:

Britten KH, Newsome WT, Shadlen MN, Celebrini S, Movshon JA. 1996. A relationship between behavioral choice and the visual responses of neurons in macaque MT. The Neural Signal Archive. Macaque

References

  1. Arnold BC, Beaver RJ. Hidden truncation models. The Indian Journal of Statistics. 2000;62:23–35.
  2. Azzalini A. A class of distributions which includes the normal ones. Scandinavian Journal of Statistics. 1985;12:171–178.
  3. Azzalini A. The skew-normal distribution and related multivariate families. Scandinavian Journal of Statistics. 2005;32:159–188. doi: 10.1111/j.1467-9469.2005.00426.x.
  4. Bair W, Zohary E, Newsome WT. Correlated firing in macaque visual area MT: time scales and relationship to behavior. The Journal of Neuroscience. 2001;21:1676–1697. doi: 10.1523/JNEUROSCI.21-05-01676.2001.
  5. Bamber D. The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. Journal of Mathematical Psychology. 1975;12:387–415. doi: 10.1016/0022-2496(75)90001-2.
  6. Bányai M, Lazar A, Klein L, Klon-Lipok J, Stippinger M, Singer W, Orbán G. Stimulus complexity shapes response correlations in primary visual cortex. PNAS. 2019;116:2723–2732. doi: 10.1073/pnas.1816766116.
  7. Bányai M, Orbán G. Noise correlations and perceptual inference. Current Opinion in Neurobiology. 2019;58:209–217. doi: 10.1016/j.conb.2019.09.002.
  8. Bishop CM. Pattern Recognition and Machine Learning. New York: Springer; 2006.
  9. Bogacz R, Brown E, Moehlis J, Holmes P, Cohen JD. The physics of optimal decision making: a formal analysis of models of performance in two-alternative forced-choice tasks. Psychological Review. 2006;113:700–765. doi: 10.1037/0033-295X.113.4.700.
  10. Bondy AG, Haefner RM, Cumming BG. Feedback determines the structure of correlated variability in primary visual cortex. Nature Neuroscience. 2018;21:598–606. doi: 10.1038/s41593-018-0089-1.
  11. Bosking WH, Maunsell JH. Effects of stimulus direction on the correlation between behavior and single units in area MT during a motion detection task. Journal of Neuroscience. 2011;31:8230–8238. doi: 10.1523/JNEUROSCI.0126-11.2011.
  12. Britten KH, Newsome WT, Shadlen MN, Celebrini S, Movshon JA. A relationship between behavioral choice and the visual responses of neurons in macaque MT. Visual Neuroscience. 1996;13:87–100. doi: 10.1017/S095252380000715X.
  13. Cai X, Padoa-Schioppa C. Contributions of orbitofrontal and lateral prefrontal cortices to economic choice and the good-to-action transformation. Neuron. 2014;81:1140–1151. doi: 10.1016/j.neuron.2014.01.008.
  14. Carnevale F, de Lafuente V, Romo R, Parga N. An optimal decision population code that accounts for correlated variability unambiguously predicts a subject's choice. Neuron. 2013;80:1532–1543. doi: 10.1016/j.neuron.2013.09.023.
  15. Chicharro D, Panzeri S, Haefner RM. Decision-related signals in the presence of nonzero signal stimuli, internal bias, and feedback. bioRxiv. 2017 doi: 10.1101/118398.
  16. Chicharro D, Panzeri S, Haefner RM. Stimulus dependent relationships between behavioral choice and sensory neural responses. bioRxiv. 2019 doi: 10.1101/2019.12.27.889550.
  17. Chicharro D. CP_DP. Software Heritage. 2021. swh:1:rev:5850c573860eb04317e7dc550f96b1f47ca91c6a. https://archive.softwareheritage.org/swh:1:dir:a5c4ee4746c91be8de89003c5726214169478922;origin=https://github.com/DanielChicharro/CP_DP;visit=swh:1:snp:3f6aa343203d36b93379d41973c206d360588a1f;anchor=swh:1:rev:5850c573860eb04317e7dc550f96b1f47ca91c6a
  18. Choe KW, Blake R, Lee SH. Dissociation between neural signatures of stimulus and choice in population activity of human V1 during perceptual decision-making. Journal of Neuroscience. 2014;34:2725–2743. doi: 10.1523/JNEUROSCI.1606-13.2014.
  19. Cicmil N, Cumming BG, Parker AJ, Krug K. Reward modulates the effect of visual cortical microstimulation on perceptual decisions. eLife. 2015;4:e07832. doi: 10.7554/eLife.07832.
  20. Cohen MR, Maunsell JH. Attention improves performance primarily by reducing interneuronal correlations. Nature Neuroscience. 2009a;12:1594–1600. doi: 10.1038/nn.2439.
  21. Cohen MR, Newsome WT. Estimates of the contribution of single neurons to perception depend on timescale and noise correlation. Journal of Neuroscience. 2009b;29:6635–6648. doi: 10.1523/JNEUROSCI.5179-08.2009.
  22. Cumming BG, Nienborg H. Feedforward and feedback sources of choice probability in neural population responses. Current Opinion in Neurobiology. 2016;37:126–132. doi: 10.1016/j.conb.2016.01.009.
  23. Dayan P, Abbott LF. Theoretical Neuroscience, Computational and Mathematical Modeling of Neural Systems. Cambridge, Massachusetts: The MIT Press; 2001.
  24. Dodd JV, Krug K, Cumming BG, Parker AJ. Perceptually bistable three-dimensional figures evoke high choice probabilities in cortical area MT. The Journal of Neuroscience. 2001;21:4809–4821. doi: 10.1523/JNEUROSCI.21-13-04809.2001.
  25. Ecker AS, Berens P, Cotton RJ, Subramaniyan M, Denfield GH, Cadwell CR, Smirnakis SM, Bethge M, Tolias AS. State dependence of noise correlations in macaque primary visual cortex. Neuron. 2014;82:235–248. doi: 10.1016/j.neuron.2014.02.006.
  26. Ecker AS, Denfield GH, Bethge M, Tolias AS. On the structure of neuronal population activity under fluctuations in attentional state. The Journal of Neuroscience. 2016;36:1775–1789. doi: 10.1523/JNEUROSCI.2044-15.2016.
  27. Fetsch CR, Odean NN, Jeurissen D, El-Shamayleh Y, Horwitz GD, Shadlen MN. Focal optogenetic suppression in macaque area MT biases direction discrimination and decision confidence, but only transiently. eLife. 2018;7:e36523. doi: 10.7554/eLife.36523.
  28. Fiser J, Berkes P, Orbán G, Lengyel M. Statistically optimal perception and learning: from behavior to neural representations. Trends in Cognitive Sciences. 2010;14:119–130. doi: 10.1016/j.tics.2010.01.003.
  29. Gold JI, Shadlen MN. Neural computations that underlie decisions about sensory stimuli. Trends in Cognitive Sciences. 2001;5:10–16. doi: 10.1016/S1364-6613(00)01567-9.
  30. Gold JI, Shadlen MN. The neural basis of decision making. Annual Review of Neuroscience. 2007;30:535–574. doi: 10.1146/annurev.neuro.29.051605.113038.
  31. Goris RL, Movshon JA, Simoncelli EP. Partitioning neuronal variability. Nature Neuroscience. 2014;17:858–865. doi: 10.1038/nn.3711.
  32. Haefner RM, Gerwinn S, Macke JH, Bethge M. Inferring decoding strategies from choice probabilities in the presence of correlated variability. Nature Neuroscience. 2013;16:235–242. doi: 10.1038/nn.3309.
  33. Haefner RM. A note on choice and detect probabilities in the presence of choice bias. arXiv. 2015 https://arxiv.org/abs/1501.03173
  34. Haefner RM, Berkes P, Fiser J. Perceptual decision-making as probabilistic inference by neural sampling. Neuron. 2016;90:649–660. doi: 10.1016/j.neuron.2016.03.020.
  35. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143:29–36. doi: 10.1148/radiology.143.1.7063747.
  36. Jasper AI, Tanabe S, Kohn A. Predicting perceptual decisions using visual cortical population responses and choice history. The Journal of Neuroscience. 2019;39:6714–6727. doi: 10.1523/JNEUROSCI.0035-19.2019.
  37. Kang I, Maunsell JH. Potential confounds in estimating trial-to-trial correlations between neuronal response and behavior using choice probabilities. Journal of Neurophysiology. 2012;108:3403–3415. doi: 10.1152/jn.00471.2012.
  38. Katz LN, Yates JL, Pillow JW, Huk AC. Dissociated functional significance of decision-related activity in the primate dorsal stream. Nature. 2016;535:285–288. doi: 10.1038/nature18617.
  39. Kayser C, Wilson C, Safaai H, Sakata S, Panzeri S. Rhythmic auditory cortex activity at multiple timescales shapes stimulus-response gain and background firing. Journal of Neuroscience. 2015;35:7750–7762. doi: 10.1523/JNEUROSCI.0268-15.2015.
  40. Kohn A, Smith MA. Stimulus dependence of neuronal correlation in primary visual cortex of the macaque. Journal of Neuroscience. 2005;25:3661–3673. doi: 10.1523/JNEUROSCI.5106-04.2005.
  41. Krug K, Cumming BG, Parker AJ. Comparing perceptual signals of single V5/MT neurons in two binocular depth tasks. Journal of Neurophysiology. 2004;92:1586–1596. doi: 10.1152/jn.00851.2003.
  42. Krug K, Curnow TL, Parker AJ. Defining the V5/MT neuronal pool for perceptual decisions in a visual stereo-motion task. Philosophical Transactions of the Royal Society B: Biological Sciences. 2016;371:20150260. doi: 10.1098/rstb.2015.0260.
  43. Lakshminarasimhan KJ, Pouget A, DeAngelis GC, Angelaki DE, Pitkow X. Inferring decoding strategies for multiple correlated neural populations. PLOS Computational Biology. 2018;14:e1006371. doi: 10.1371/journal.pcbi.1006371.
  44. Lange RD, Haefner RM. Characterizing and interpreting the influence of internal variables on sensory activity. Current Opinion in Neurobiology. 2017;46:84–89. doi: 10.1016/j.conb.2017.07.006.
  45. Lange RD, Haefner RM. Task-induced neural covariability as a signature of approximate Bayesian learning and inference. bioRxiv. 2020 doi: 10.1101/081661.
  46. Latimer KW, Yates JL, Meister ML, Huk AC, Pillow JW. NEURONAL MODELING. Single-trial spike trains in parietal cortex reveal discrete steps during decision-making. Science. 2015;349:184–187. doi: 10.1126/science.aaa4056.
  47. Lee TS, Mumford D. Hierarchical Bayesian inference in the visual cortex. Journal of the Optical Society of America A. 2003;20:1434–1448. doi: 10.1364/JOSAA.20.001434.
  48. Liu LD, Haefner RM, Pack CC. A neural basis for the spatial suppression of visual motion perception. eLife. 2016;5:e16167. doi: 10.7554/eLife.16167.
  49. Macke JH, Nienborg H. Choice (-history) correlations in sensory cortex: cause or consequence? Current Opinion in Neurobiology. 2019;58:148–154. doi: 10.1016/j.conb.2019.09.005.
  50. Mante V, Sussillo D, Shenoy KV, Newsome WT. Context-dependent computation by recurrent dynamics in prefrontal cortex. Nature. 2013;503:78–84. doi: 10.1038/nature12742.
  51. Maunsell JH, Treue S. Feature-based attention in visual cortex. Trends in Neurosciences. 2006;29:317–322. doi: 10.1016/j.tins.2006.04.001.
  52. Michelson C, Pillow JW, Seidemann E. Majority of choice-related variability in perceptual decisions is present in early sensory cortex. bioRxiv. 2017 doi: 10.1101/207357.
  53. Minderer M, Brown KD, Harvey CD. The spatial structure of neural encoding in mouse posterior cortex during navigation. Neuron. 2019;102:232–248. doi: 10.1016/j.neuron.2019.01.029.
  54. Moreno-Bote R, Beck J, Kanitscheider I, Pitkow X, Latham P, Pouget A. Information-limiting correlations. Nature Neuroscience. 2014;17:1410–1417. doi: 10.1038/nn.3807.
  55. Nienborg H, Cohen MR, Cumming BG. Decision-related activity in sensory neurons: correlations among neurons and with behavior. Annual Review of Neuroscience. 2012;35:463–483. doi: 10.1146/annurev-neuro-062111-150403.
  56. Nienborg H, Cumming BG. Macaque V2 neurons, but not V1 neurons, show choice-related activity. Journal of Neuroscience. 2006;26:9567–9578. doi: 10.1523/JNEUROSCI.2256-06.2006.
  57. Nienborg H, Cumming BG. Decision-related activity in sensory neurons reflects more than a neuron's causal effect. Nature. 2009;459:89–92. doi: 10.1038/nature07821.
  58. Nienborg H, Cumming B. Correlations between the activity of sensory neurons and behavior: how much do they tell us about a neuron's causality? Current Opinion in Neurobiology. 2010;20:376–381. doi: 10.1016/j.conb.2010.05.002.
  59. Orbán G, Berkes P, Fiser J, Lengyel M. Neural variability and sampling-based probabilistic representations in the visual cortex. Neuron. 2016;92:530–543. doi: 10.1016/j.neuron.2016.09.038.
  60. Owen DB. Tables for computing bivariate normal probabilities. The Annals of Mathematical Statistics. 1956;27:1075–1090. doi: 10.1214/aoms/1177728074.
  61. O’Connell RG, Shadlen MN, Wong-Lin K, Kelly SP. Bridging neural and computational viewpoints on perceptual decision-making. Trends in Neurosciences. 2018;41:838–852. doi: 10.1016/j.tins.2018.06.005.
  62. Park IM, Meister ML, Huk AC, Pillow JW. Encoding and decoding in parietal cortex during sensorimotor decision-making. Nature Neuroscience. 2014;17:1395–1403. doi: 10.1038/nn.3800.
  63. Parker AJ, Krug K, Cumming BG. Neuronal activity and its links with the perception of multi–stable figures. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences. 2002;357:1053–1062. doi: 10.1098/rstb.2002.1112.
  64. Parker AJ, Newsome WT. Sense and the single neuron: probing the physiology of perception. Annual Review of Neuroscience. 1998;21:227–277. doi: 10.1146/annurev.neuro.21.1.227.
  65. Pillow JW, Shlens J, Paninski L, Sher A, Litke AM, Chichilnisky EJ, Simoncelli EP. Spatio-temporal correlations and visual signalling in a complete neuronal population. Nature. 2008;454:995–999. doi: 10.1038/nature07140.
  66. Pinto L, Rajan K, DePasquale B, Thiberge SY, Tank DW, Brody CD. Task-dependent changes in the large-scale dynamics and necessity of cortical regions. Neuron. 2019;104:810–824. doi: 10.1016/j.neuron.2019.08.025.
  67. Pitkow X, Liu S, Angelaki DE, DeAngelis GC, Pouget A. How can single sensory neurons predict behavior? Neuron. 2015;87:411–423. doi: 10.1016/j.neuron.2015.06.033.
  68. Ponce-Alvarez A, Thiele A, Albright TD, Stoner GR, Deco G. Stimulus-dependent variability and noise correlations in cortical MT neurons. PNAS. 2013;110:13162–13167. doi: 10.1073/pnas.1300098110.
  69. Rao RP, Ballard DH. Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience. 1999;2:79–87. doi: 10.1038/4580.
  70. Romo R, Salinas E. Flutter discrimination: neural codes, perception, memory and decision making. Nature Reviews Neuroscience. 2003;4:203–218. doi: 10.1038/nrn1058.
  71. Ruff DA, Ni AM, Cohen MR. Cognition as a window into neuronal population space. Annual Review of Neuroscience. 2018;41:77–97. doi: 10.1146/annurev-neuro-080317-061936.
  72. Runyan CA, Piasini E, Panzeri S, Harvey CD. Distinct timescales of population coding across cortex. Nature. 2017;548:92–96. doi: 10.1038/nature23020.
  73. Sanayei M, Chen X, Chicharro D, Distler C, Panzeri S, Thiele A. Perceptual learning of fine contrast discrimination changes neuronal tuning and population coding in macaque V4. Nature Communications. 2018;9:4238. doi: 10.1038/s41467-018-06698-w.
  74. Schölvinck ML, Saleem AB, Benucci A, Harris KD, Carandini M. Cortical state determines global variability and correlations in visual cortex. Journal of Neuroscience. 2015;35:170–178. doi: 10.1523/JNEUROSCI.4994-13.2015.
  75. Scott BB, Constantinople CM, Akrami A, Hanks TD, Brody CD, Tank DW. Fronto-parietal cortical circuits encode accumulated evidence with a diversity of timescales. Neuron. 2017;95:385–398. doi: 10.1016/j.neuron.2017.06.013.
  76. Shadlen MN, Britten KH, Newsome WT, Movshon JA. A computational analysis of the relationship between neuronal and behavioral responses to visual motion. The Journal of Neuroscience. 1996;16:1486–1510. doi: 10.1523/JNEUROSCI.16-04-01486.1996.
  77. Shushruth S, Mazurek M, Shadlen MN. Comparison of decision-related signals in sensory and motor preparatory responses of neurons in area LIP. The Journal of Neuroscience. 2018;38:6350–6365. doi: 10.1523/JNEUROSCI.0668-18.2018.
  78. Siegel M, Buschman TJ, Miller EK. Cortical information flow during flexible sensorimotor decisions. Science. 2015;348:1352–1355. doi: 10.1126/science.aab0551.
  79. Smolyanskaya A, Haefner RM, Lomber SG, Born RT. A modality-specific feedforward component of choice-related activity in MT. Neuron. 2015;87:208–219. doi: 10.1016/j.neuron.2015.06.018.
  80. Steinmetz NA, Zatka-Haas P, Carandini M, Harris KD. Distributed coding of choice, action and engagement across the mouse brain. Nature. 2019;576:266–273. doi: 10.1038/s41586-019-1787-x.
  81. Tajima CI, Tajima S, Koida K, Komatsu H, Aihara K, Suzuki H. Population code dynamics in categorical perception. Scientific Reports. 2016;6:22536. doi: 10.1038/srep22536.
  82. Thielscher A, Pessoa L. Neural correlates of perceptual choice and decision making during fear-disgust discrimination. Journal of Neuroscience. 2007;27:2908–2917. doi: 10.1523/JNEUROSCI.3024-06.2007.
  83. Truccolo W, Eden UT, Fellows MR, Donoghue JP, Brown EN. A point process framework for relating neural spiking activity to spiking history, neural ensemble, and extrinsic covariate effects. Journal of Neurophysiology. 2005;93:1074–1089. doi: 10.1152/jn.00697.2004.
  84. Tsunada J, Liu AS, Gold JI, Cohen YE. Causal contribution of primate auditory cortex to auditory perceptual decision-making. Nature Neuroscience. 2016;19:135–142. doi: 10.1038/nn.4195.
  85. Tsunada J, Cohen Y, Gold JI. Post-decision processing in primate prefrontal cortex influences subsequent choices on an auditory decision-making task. eLife. 2019;8:e46770. doi: 10.7554/eLife.46770.
  86. Urai AE, de Gee JW, Tsetsos K, Donner TH. Choice history biases subsequent evidence accumulation. eLife. 2019;8:e46331. doi: 10.7554/eLife.46331.
  87. van Vugt B, Dagnino B, Vartak D, Safaai H, Panzeri S, Dehaene S, Roelfsema PR. The threshold for conscious report: signal loss and response bias in visual and frontal cortex. Science. 2018;360:537–542. doi: 10.1126/science.aar7186.
  88. Verhoef BE, Michelet P, Vogels R, Janssen P. Choice-related activity in the anterior intraparietal area during 3-D structure categorization. Journal of Cognitive Neuroscience. 2015;27:1104–1115. doi: 10.1162/jocn_a_00773.
  89. Wasmuht DF, Parker AJ, Krug K. Interneuronal correlations at longer time scales predict decision signals for bistable structure-from-motion perception. Scientific Reports. 2019;9:11449. doi: 10.1038/s41598-019-47786-1.
  90. Wimmer K, Compte A, Roxin A, Peixoto D, Renart A, de la Rocha J. Sensory integration dynamics in a hierarchical network explains choice probabilities in cortical area MT. Nature Communications. 2015;6:6177. doi: 10.1038/ncomms7177.
  91. Yang H, Kwon SE, Severson KS, O'Connor DH. Origins of choice-related activity in mouse somatosensory cortex. Nature Neuroscience. 2016;19:127–134. doi: 10.1038/nn.4183.
  92. Yu XJ, Dickman JD, DeAngelis GC, Angelaki DE. Neuronal thresholds and choice-related activity of otolith afferent fibers during heading perception. PNAS. 2015;112:6467–6472. doi: 10.1073/pnas.1507402112.
  93. Yu X, Gu Y. Probing sensory readout via combined choice-correlation measures and microstimulation perturbation. Neuron. 2018;100:715–727. doi: 10.1016/j.neuron.2018.08.034.
  94. Zaidel A, DeAngelis GC, Angelaki DE. Decoupled choice-driven and stimulus-related activity in parietal neurons may be misrepresented by choice probabilities. Nature Communications. 2017;8:715. doi: 10.1038/s41467-017-00766-3.

Decision letter

Editor: Kristine Krug1
Reviewed by: Jochen Braun2, Andrew J Parker3, Klaus Wimmer4

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

Perceptual decisions rely on neural activity of sensory neurons. This work presents new model-based analytical tools to understand the relationship between sensory stimulus, sensory neural activity and perceptual choice. Dependencies between neural and behavioural observables in studies of decision making can be quite complex and non-intuitive, as different observables depend in different ways on unobserved hidden states (e.g., decision variable, decision criterion). This paper derives these dependencies for standard assumptions (Gaussian variability) and predicts how dependencies will change under more realistic assumptions (gain modulation). This is a thoughtful reworking of previously published data, making the suggestion that there is a functional relationship between ongoing neural activity and behavioral decisions. This new analysis is, however, still limited by the available data. Ultimately, this paper suggests new avenues that should be explored by neuroscientists when modelling perceptual decisions.

Decision letter after peer review:

Thank you very much for submitting your article "Stimulus dependent relationships between behavioral choice and sensory neural responses" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Joshua Gold as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Jochen Braun (Reviewer #1); Andrew J Parker (Reviewer #3).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

As the editors have judged that your manuscript is of interest, but as described below that additional experiments/analysis are required before it is published, we would like to draw your attention to changes in our revision policy that we have made in response to COVID-19 (https://elifesciences.org/articles/57162). First, because many researchers have temporarily lost access to the labs, we will give authors as much time as they need to submit revised manuscripts. We are also offering, if you choose, to post the manuscript to bioRxiv (if it is not already there) along with this decision letter and a formal designation that the manuscript is "in revision at eLife". Please let us know if you would like to pursue this option. (If your work is more suitable for medRxiv, you will need to post the preprint yourself, as the mechanisms for us to do so are still in development.)

Summary:

The manuscript "Stimulus dependent relationships between behavioral choice and sensory neural responses" by Haefner and colleagues addresses interpretations of choice probability, which is the capacity of sensory neurons to predict the decisions made by experimental subjects in sensory decision tasks. Variability in the responses of individual neurons is predictive of the subject's choice. This paper further develops previous theoretical findings concerning this "choice probability". At the centre of the new modelling approach is the important relationship between these variable responses and the strength of the stimulus-related component of neuronal firing and how this can be formalised in an algorithm.

Essential revisions:

This manuscript further develops previous theoretical findings concerning "choice probability", the relationship between the trial-to-trial variation of firing rates of a (sensory) neuron and the behavioral choice of a subject in perceptual decision making experiments. CPs and their interpretation have a long history, including the debate about their origin (do CPs reflect a causal effect of sensory neural variability on the decision or do they arise from top-down feedback signals?). Here, the authors identify an additional factor by showing that CP depends on the "choice rate" (the fraction of left vs. right choices). Since this choice rate depends on stimulus strength, CP itself will depend on stimulus strength. They also show experimental evidence for this effect (in the classical Britten et al. dataset), which has been overlooked until now, and present methods for refined analysis for linking the responses of sensory neurons to choices.

The reviewers raised a number of central concerns about the relationship between the model and the available empirical data. They agree that the relationship between pCR and CP must be firmly established as a general experimental fact, or reasons given why it is apparent in some conditions and not others. Also, the theoretical interpretation of the relationship between pCR and CP needs to deliver accurate predictions of the experimental results and also to demonstrate consistency with theoretical models of how other factors affect CP (interneuronal correlation and stimulus sensitivity, being the primary variables of concern here).

1. The relationship between pCR and CP appears to be more central to this work than the related dependence of CP on stimulus strength, but there is a quantitative mismatch between model prediction and empirically observed data, especially in Figure 3a and Figure 4b. Regarding the interpretation of Figure 3a: our initial concern when reading the paper was that the predicted effect of choice rate on CP (dotted line in Figure 3a) is so small that there will not be enough statistical power to detect it in the available experimental data. Moreover, the analysis is complicated by the fact that the effect of h(pCR) is partly masked by the gain modulation effect (Figure 4b). However, for the n=48 neurons (Figure 4b) that show the predicted U-shape of CP vs. pCR, the effect is actually much larger than the theoretical prediction (e.g., for pCR ~0.1 the CP is close to 0.65 when in theory it should be ~0.58). What could be the reason for this quantitative mismatch?

We would like to see in this regard:

a. A clear discussion of the results that situates them within the available empirical literature on CP, especially with regard to the quantitative effect sizes in Britten et al. 1996 and Dodd et al. 2001.

We agree that in some previous papers CP is computed from z-scored spike counts across different stimulus conditions (and thus different choice rate values). However, there are several other papers (including Britten et al. 1996, Nienborg and Cumming 2009, Wimmer et al. 2015, Katz et al. 2016) that compute CP from "zero coherence trials" with choice rate ~ 0.5 (or the equivalent for binocular disparity in V5/MT: Dodd et al. 2001; Krug et al. 2004; Wasmuht et al. 2019). These papers should be mentioned in order to avoid the impression that all previous work suffers from a (slightly) incorrect way of computing CP.

b. Re-analysis of the main effect in different subsets of cells. We are concerned that modelling the group for which CP < 0.5 includes a lot of noise, because the members of this group are almost always non-significant, as shown in Dodd et al. and in Britten et al. (see their Figure 5). If you look at Dodd et al., there is no sign of this relationship between pCR and CP. Moreover, in Dodd et al. there are very few 'wrong way' choice probabilities: so few, in fact, that they are within the statistical bounds of repeated measures. So the procedure of separating the CPs into >0.5 and <0.5 is not likely to alter the result at all from the published figure in Dodd et al. It might be helpful to recalculate the results for Figure 3A but only including those neurons that show an individually significant CP > 0.5.

c. A simulation of the factors that affect expected effect size. The simulation should show what size of the effect is expected given the limited amount of noisy data (number of neurons, trials, etc.). This would allow you to determine whether the experimental result lies within the confidence bounds of the theoretical prediction.

There is apparently no such relationship between CP and pCR in the Dodd et al. data (but it could perhaps show up when selecting and binning the neurons as described in the manuscript, line 288). In any case, a possible explanation for the discrepancy could be that the expected relationship is so small that it cannot reliably be detected from ~100 neurons. We also think this issue can be clarified by simulating the size of the effect for limited data.
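A finite-sample simulation of the kind requested in point 1c could be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis: each neuron's response r and the decision variable d are drawn from a standard bivariate Gaussian with correlation cc, a choice is made when d exceeds a threshold set by the target pCR, neurons are simulated independently, and gain fluctuations are ignored. The function names and all parameter values (cc = 0.2, 100 neurons, 100 trials, 20 repeated experiments) are hypothetical and are not taken from the Britten et al. data.

```python
import random
from bisect import bisect_left, bisect_right
from statistics import NormalDist, mean, stdev


def empirical_cp(pos, neg):
    """Mann-Whitney estimate of P(pos > neg); ties count one half."""
    neg_sorted = sorted(neg)
    wins = 0.0
    for x in pos:
        lo = bisect_left(neg_sorted, x)   # number of neg values below x
        hi = bisect_right(neg_sorted, x)  # lo plus number of ties with x
        wins += lo + 0.5 * (hi - lo)
    return wins / (len(pos) * len(neg))


def population_mean_cp(cc, p_cr, n_neurons, n_trials, rng, min_per_choice=5):
    """Mean empirical CP over independently simulated neurons (assumed model)."""
    theta = NormalDist().inv_cdf(1.0 - p_cr)  # threshold giving P(d > theta) = p_cr
    noise_sd = (1.0 - cc * cc) ** 0.5
    cps = []
    for _ in range(n_neurons):
        pos, neg = [], []
        for _ in range(n_trials):
            d = rng.gauss(0.0, 1.0)
            r = cc * d + noise_sd * rng.gauss(0.0, 1.0)  # corr(r, d) = cc
            (pos if d > theta else neg).append(r)
        # Skip neurons with too few trials for one of the two choices.
        if len(pos) >= min_per_choice and len(neg) >= min_per_choice:
            cps.append(empirical_cp(pos, neg))
    return mean(cps)


rng = random.Random(0)
for p_cr in (0.5, 0.1):
    runs = [population_mean_cp(0.2, p_cr, 100, 100, rng) for _ in range(20)]
    # Mean and spread of the population-average CP across repeated experiments.
    print(p_cr, round(mean(runs), 3), round(stdev(runs), 4))
```

The spread across repeated experiments gives a rough sense of how much the population-average CP at a given pCR can vary by chance alone, which is the quantity against which a predicted change in CP between choice rates would need to be compared.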

2. Further clarification is needed in terms of how the fluctuations in the stimulus-related gain of neuronal firing are responsible for the emergence of stronger CPs at higher performance levels. As we understand it, this is required both in terms of the formal implementation in the model and in terms of the proposed neurobiological implementation.

The authors present a detailed analysis of abstract decision-making models. They relate noisy neuronal responses ri, a hypothetical covert decision variable d and decision threshold θ, and an overt behavioural choice D. The authors assume a bivariate Gaussian association between ri and d, with a certain correlation coefficient, and from this minimal basis derive exact or approximate expressions for choice rate pCR, choice probability CPi, choice correlation CCi, between ri and d, and choice-triggered averages of ri, CTAi , and of d, CTAd. The treatment extends previous work in that it covers the entire range of choice rates pCR, not only the special case pCR = 0.5.

A key result is that choice probability CPi changes multiplicatively as a function of pCR, increasing as the decision grows more consistent in either direction, with the baseline level set by choice correlation CCi. An important implication is that the dependence of CPi on pCR, which is shared by all cells and expected to be U-shaped, can be averaged over cells with different choice correlation CCi, provided that cells with positive and negative choice correlations are distinguished.
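The U-shaped dependence of CP on pCR summarized above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a standard bivariate Gaussian for a response r and the decision variable d with correlation cc (standing in for CCi), a choice made when d exceeds a threshold chosen to give the requested pCR, and a simple grid integration of P(r1 > r0) over the two choice-conditioned response distributions. The function name, the grid settings and the value cc = 0.2 are illustrative assumptions.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def choice_probability(cc, p_cr, n_grid=4001, r_max=8.0):
    """CP for a standard bivariate Gaussian (r, d) with correlation cc.

    Choice 1 is made when d > theta, with theta set so that P(choice 1) = p_cr.
    CP is P(r1 > r0), with r1 and r0 drawn from the choice-conditioned
    distributions of r, evaluated by numerical integration on a grid.
    """
    theta = STD_NORMAL.inv_cdf(1.0 - p_cr)
    s = math.sqrt(1.0 - cc * cc)       # sd of d given r
    dr = 2.0 * r_max / (n_grid - 1)
    cp = 0.0
    cum0 = 0.0                         # running P(r0 < r) for choice-0 trials
    for i in range(n_grid):
        r = -r_max + i * dr
        p_choice0_given_r = STD_NORMAL.cdf((theta - cc * r) / s)
        w = STD_NORMAL.pdf(r) * dr
        f0 = w * p_choice0_given_r / (1.0 - p_cr)    # mass of r | choice 0
        f1 = w * (1.0 - p_choice0_given_r) / p_cr    # mass of r | choice 1
        cp += f1 * (cum0 + 0.5 * f0)   # half weight for the shared grid bin
        cum0 += f0
    return cp


for p_cr in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p_cr, round(choice_probability(0.2, p_cr), 4))
```

Under these assumptions the computed CP is symmetric around pCR = 0.5 and grows toward the extremes. At pCR = 0.5 the result can be compared with the closed form CP = 1/2 + (2/π) arcsin(cc/√2) that holds for this bivariate Gaussian setting.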

To test these predictions, the authors re-analyze MT recordings from Britten et al. (1996) and were able to confirm a U-shaped dependence of average CP on pCR, which was statistically significant for cells with positive CCi. However, contrary to predictions, the U-shaped dependence was asymmetric and more pronounced when the more frequent choice (pCR > 0) is consistent with the preferred stimulus of the cell (positively correlated cells CCi>0). A cluster analysis of empirical individual cell dependencies of CPi on pCR revealed, in addition to the predicted U-shaped dependencies, the presence of unexpected monotonically increasing dependencies.

To clarify the origin of these unexpected dependencies, the authors consider the effect of trial-to-trial response gain fluctuations (Goris et al., 2014) and, with the help of the model of these authors, confirm that gain fluctuations account for 62% of the observed trial-to-trial variance. The authors point out that gain fluctuations add a stimulus-dependent component to the noise covariance of neural responses, which is inherited by choice correlation CCi and by choice probability CPi. In Methods and Supplementary Text S4, the authors derive that this stimulus-dependent component can itself depend asymmetrically on pCR, to an extent that is specific to each cell (i.e., the specific coupling between response and gain). Unfortunately, the authors do not offer an intuitive argument about the origin of this asymmetry.

Please explain how gain fluctuations lead to rising dependencies on pCR in some cells. Which cells are these? Do they combine strong stimulus modulation with weak choice correlation? Without such an explanation, the entire cluster analysis appears incomplete and ultimately pointless.

3. Following on from this, there is also the question of how the stronger CP with increasing pCR could be implemented in different ways in actual neural terms. Would, for example, the pool size of neurons contributing to the decision change? With regard to the model, how does the effect of pCR x CP intersect with the experimentally much stronger interaction effect of neurometric sensitivity and size of the CP?

The authors analyse this from the classical Britten et al. [1] data set, producing a framework for analysis in Figure 2 and the outcome of analysis in Figures 3 and 4 of the paper. This type of analysis is not new. The specific form of the plots in Figures 2, 3 and 4 appears in Dodd et al. [2] in Figure 6, while Figure 3 in Britten et al. delivers almost the same information.

In regard to the clustering analysis, it seems that the big driver for the formation of clusters is the division between CP >0.5 and CP<0.5. For values of CP<0.5, there is not really a functional account of these, as they do not relate to the tuning of the cells for motion. The lack of functional meaning is highlighted by the fact that cluster 1 in Figure 4a (blue==CP <0.5) is statistically non-significant. Not unrelated to this lack of significance is the fact that Figure 13 of Britten et al. and Figure 6 of Parker et al. [3] show that CPs are stronger for neurons that are more sensitive to the visual task. The usual interpretations of this are either the intuitive claim that more sensitive neurons are more tightly involved in the task and therefore have higher CPs or, more subtly, that neurons with weaker sensitivity have lower degrees of interneuronal correlation.

The rest of the analysis in this paper advances the idea that fluctuations in the stimulus-related gain of neuronal firing are responsible for the emergence of stronger CPs at higher performance levels. The authors write on lines 243-4 that "Briefly, the contribution of gain modulations to the covariance of the responses increases with neuronal firing rates, which in turn are stimulus-modulated as determined by tuning functions." However, the lead author has already published a nice theoretical summary [4] showing that CP is related to the level of interneuronal correlations in the pool. Indeed, the analysis showed that under some conditions (large pool, correlated noise and at least one or two members of the pool contributing significantly to perceptual read-out) one might take CP as a substitute indicator for the interneuronal correlation of the decision pool. In the light of the earlier analysis, the present paper does not address the very relevant question of changes in the membership of the neuronal pool with stimulus strength. In effect, if read-out weights change with stimulus strength, then CP will be expected to change. Equally, if pool membership changes then interneuronal correlation may be expected to change. We did not see anything in this analysis that definitively ties down the change in CP to stimulus-related gain changes.

l 242-243: " We derived the specific form of CC𝑖(𝑝CR) predicted from gain modulations in the threshold decision model to explain additional CP stimulus dependencies beyond the symmetric modulation by h(𝑝CR ). Briefly, the contribution of gain modulations to the covariance of the responses increases with neuronal firing rates, which in turn are stimulus-modulated as determined by tuning functions. This leads to an asymmetric component of CP(𝑝CR), with higher CPs for 𝑝CR values associated with stimuli preferred by the cell. Furthermore, while stronger gain fluctuations increase this asymmetric stimulus dependence, they also decrease the magnitude of the cell-specific CP because they add variability to the responses unrelated to the choice "

As neuronal sensitivity has normally been measured, the change in response due to a change in the stimulus is assessed relative to the variability of neuronal firing. The description above implies that stimulus modulations in the Haefner models translate into response changes on top of which random gain modulations are applied. At first sight, there does not seem to be any room in the model for low firing rate, low variability neurons to contribute to CP, even though such neurons may have high neurometric sensitivity. On the other hand, it may well be that the new Haefner model all shakes down to give the established experimental result that CP is linked to neuronal sensitivity. If that's correct, then the paper is currently rather obscure on this point and it will be useful for the paper to lay this out clearly.
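The mechanism under discussion, a gain-related variance component that grows with firing rate, can be illustrated with a short simulation. This is a sketch under stated assumptions, not the model fitted in the manuscript: it assumes a modulated-Poisson form in the spirit of Goris et al. (2014), with a gamma-distributed multiplicative gain of mean 1, for which the count variance is expected to be mean + gain_variance * mean^2. The function names, the gain variance of 0.1, and the two mean counts are hypothetical.

```python
import math
import random
import statistics


def poisson_sample(lam, rng):
    """Draw one Poisson count with mean lam (Knuth's method; fine for moderate lam only)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def gain_modulated_counts(mean_count, gain_variance, n_trials, rng):
    """Spike counts from a Poisson process whose rate is scaled by a random gain of mean 1."""
    shape = 1.0 / gain_variance  # gamma(shape, scale=gain_variance) has mean 1, variance gain_variance
    counts = []
    for _ in range(n_trials):
        gain = rng.gammavariate(shape, gain_variance)
        counts.append(poisson_sample(gain * mean_count, rng))
    return counts


def excess_variance(counts):
    """Variance beyond the Poisson expectation (sample variance minus sample mean)."""
    return statistics.variance(counts) - statistics.mean(counts)


if __name__ == "__main__":
    rng = random.Random(0)
    gain_variance = 0.1
    for mean_count in (5.0, 20.0):  # e.g. non-preferred vs preferred stimulus
        counts = gain_modulated_counts(mean_count, gain_variance, 5000, rng)
        predicted = gain_variance * mean_count ** 2
        print(f"mean={mean_count}: excess variance={excess_variance(counts):.2f}, "
              f"predicted={predicted:.2f}")
```

Because the gain term scales with the squared mean count, the same gain fluctuations contribute far more variance at stimulus levels that drive the cell strongly, which is the source of the stimulus-dependent covariance component referred to above.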

4. Editing for accessibility and readability

Unfortunately, our genuine enthusiasm for the manuscript is somewhat dampened by its length, by its opacity in places, and by the high degree of topic familiarity that it presupposes. For example, the discussion of grand CP on page 7 and in section S2 of the supplementary material, is difficult to follow even for someone in the field. Accordingly, if readability could be improved, the usefulness would be even greater.

We appreciate the theoretical advance of the paper. It is very useful to have equations that clarify the relationship between previously used measures (CP, CTA, CC) and the effect of informative stimuli. Overall, by the very nature of the topic, the paper is rather technical. Our impression was that the paper is not easy to read, in particular when we think about a broader readership that is not familiar with details of the theory of CP and the typical interpretation of CP measurements. We understand that this is not an easy job because the interpretation of CP is complicated (bottom-up vs. top-down contributions, relationship to spike count correlations, etc) but it would help to revise the Results section and add some more details and (where possible) intuitive interpretation of the theoretical and experimental results to guide the reader. As an example, it turns out that gain fluctuations are an important factor to explain CP vs pCR but this is only treated very briefly in the results (gain fluctuation model is not explained, no figure is shown about how well the model explains the variability observed in the data, etc).

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you very much for submitting your article "Stimulus-dependent relationships between behavioral choice and sensory neural responses" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Joshua Gold as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Jochen Braun (Reviewer #1); Klaus Wimmer (Reviewer #2).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a final revised submission.

All the reviewers agreed on the significant theoretical advance of your manuscript. However, there were some concerns about the generality of the proposed model given that the data supporting it have limitations. Therefore, we propose a small but essential revision that makes it clear to the reader that the empirical support for the new model requires further explorations. Additionally, we have added the reviewers' further suggestions for you as recommendations.

Essential Revision:

1. Overall, the empirical part looks a little bit like an attempt to drag meaning out of a weak relationship, at least as measured experimentally in the single, but classical data set of Britten et al. The more detailed explanation of the clustering analysis reveals that the main effect of interest is driven by one critical data point from the original Britten et al. data. This is the value of CP for pCR = 0.85, in the data set for which CP > 0.5. Much of the paper depends on just how confident we are about a difference in CP that rises from 0.57 for pCR=0.5 to 0.64 for pCR=0.85. The SEMs calculated for individual neurons in Figure 3d (lower right panel) are not encouraging in this regard.

We think that further empirical tests are needed to clarify the significance of key elements of the proposed model, but this is beyond the scope of the current manuscript. It should be flagged to the reader, though.

In order to clarify to the reader that the empirical part provides only an initial analysis – as was also explicitly mentioned by the authors in their response to the previous reviews – we ask the authors to insert something close to the following statement in the abstract and in the first paragraph of the results:

This paper provides preliminary empirical evidence for the promise of studying stimulus dependencies of choice-related signals, which requires further exploration in a wider data set.

2. The revisions are acceptable in that the presentation is indeed streamlined in many respects (thank you!). However, one large reservation remains: the main findings are NOT explained in an intuitive manner.

Basically, when we distinguish neural responses associated with choice D=1 from those associated with choice D=0, we obtain two more or less distinct distributions, P(R|D=1) and P(R|D=0). When these distributions are well separated, then the choice probability CP (average probability that an R|D=1 is larger than an R|D=0) is somewhat larger than when these distributions are more overlapping.
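The link between separation and CP can be made concrete in the simplest case. The sketch below assumes two unit-variance Gaussian response distributions whose means differ by `separation`; in that case R|D=1 minus R|D=0 is Gaussian with mean `separation` and variance 2, so CP equals the standard normal CDF evaluated at separation / sqrt(2). The function names are hypothetical, and the conditional distributions in the full threshold model are not exactly Gaussian, so this is only an illustration of the intuition.

```python
import math
import random


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cp_equal_variance_gaussians(separation):
    """CP for two unit-variance Gaussians whose means differ by `separation`."""
    # The difference R1 - R0 is Gaussian with mean `separation` and variance 2.
    return normal_cdf(separation / math.sqrt(2.0))


def cp_monte_carlo(separation, n_pairs=20000, seed=0):
    """Estimate CP by drawing independent pairs (R1, R0) and counting how often R1 > R0."""
    rng = random.Random(seed)
    wins = sum(rng.gauss(separation, 1.0) > rng.gauss(0.0, 1.0) for _ in range(n_pairs))
    return wins / n_pairs


if __name__ == "__main__":
    for separation in (0.0, 0.25, 0.5, 1.0):
        print(f"separation={separation}: analytic CP={cp_equal_variance_gaussians(separation):.3f}, "
              f"simulated CP={cp_monte_carlo(separation):.3f}")
```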

However, both qualitative and quantitative aspects of the relation between choice probability CP and choice rate PCR are highly dependent on assumptions. For example, the symmetric U-shape depends on Gaussian variability of responses. When responses are assumed to be Poisson-variable, for example, the symmetric U-shape is replaced by a monotonic decline (Matlab code available upon request).

Gain modulation changes the picture in exactly this way. After gain modulation, response variability is no longer Gaussian and, additionally, the set of responses with the larger average is more affected than the set with the smaller average. In other words, gain modulation alters the shape of response distributions and in consequence also the shape of CP = f(PCR).

In spite of this reservation, I have to confess that this paper has been quite useful to me, because it forced me to think through these issues.

3. With regards to the dependence of the results on one critical data point in the Britten et al. data set (CP for pCR = 0.85) and the question of significance to other data sets, various suggestions for further data worth testing were made.

We wonder, for instance, whether Geoff Ghose's lab (https://pubmed.ncbi.nlm.nih.gov/30067123/ ; https://pubmed.ncbi.nlm.nih.gov/19109454/ ) is a better source of data to support this exercise. In particular, his recordings could be searched for signs of noisy gain modulations, which is the mechanism that lies at the core of this analysis. Other suggestions were, as mentioned before, the Dodd et al. 2001/Wasmuht et al. 2019 data.

But we agree this is beyond the current scope of the manuscript.

4. Line 10: "activity-choice covariations are traditionally quantified with a single measure of choice probability (CP), without characterizing their changes across stimulus levels" I appreciate the need to demonstrate novelty in the paper, but this statement does a disservice to earlier researchers who recognized the possibility of a stimulus-choice interaction but did not find any evidence of such an interaction. The earlier papers are clear on this point. So the novelty here is not the failure of earlier researchers to think through their results carefully; the novelty here is an apparent improvement in sensitivity of the analysis methods. This may sound less exciting but is still important, if correct.

5. Line 65 "if the decision process uses a decision threshold"

6. Line 96 "choice probability, CP, defined as the probability that a random sample from all trials"; the reader only learns on line 99 what this is a random sample of. It would better read as "a random sample of neural activity r " and then explain in a separate sentence what r might be in any particular experimental situation.

7. Line 147: "threshold value 𝜃" the authors insist on referring to this parameter 𝜃 as a threshold value. This usage will inflame debate as to whether there is a "high threshold assumption" baked into this model, where "high threshold" here means a classical high threshold in visual detection models, as opposed to a signal detection model. As classical high threshold theory is now rejected for detection models, I think it would be better here and throughout to refer to 𝜃 as a criterion value, which is what it is and better aligns with the language of signal detection theory.

8. Lines 129-169: this development is still very difficult to follow and labors over what are some fairly basic points. It would better be rewritten with better structure at about half the length. For example, compare lines 111-112 and 134-135, which could be combined into a single point that is made once at the right stage in the argument.

9. Lines 170-209: the writing continues in a stilted manner with multiple cross-references to other material.

10. Lines 173-4 "We here will refer to CP stimulus dependencies and CP(𝑝CR ) patterns interchangeably". We rather fear that this is going to cause a lot of readers to trip up. I can see that these two are interchangeable from the theoretical perspective of these authors, but many will think that behavior pCR is dependent on a number of factors other than the stimulus. In the field of monkey neurophysiology, pCR will depend on reward, attention, arousal and so forth, in a way that the stimulus does not. What we think the authors actually mean is something like "Within the structure of our model, there is a fixed relationship between the dependence of CP on the stimulus strength and the dependence of CP on the choice rate pCR, for each threshold."

11. Lines 263-265 "we show how to extend Generalized Linear Models (GLMs), a popular model to characterize the factors modulating neural activity, to include a stimulus-choice interaction terms" As a statistical procedure, this is fairly routine stuff and could be abbreviated considerably.

12. Lines 376-377 "Specifying the existence of two clusters, we naturally recovered the distinction between cells with CP higher or lower than 0.5" We struggled to find a clear and unambiguous summary of the statistics associated with this. We can see that a consistent pattern emerges when the cluster number is increased from 2 to 3, but looking at the distributions in Figure 4c and Figure 4d does not appear to reveal clusters. The significance values in Figures 4a and 4b relate to the significance of the modulation effect for each cluster, not the significance of the cluster separations. The methods section and supplementary analysis section cross-reference each other but neither seems to answer the simple question of whether 1 versus 2 clusters is statistically justified, let alone the step from 2 to 3.

13. Lines 612-638 This discussion suggests that there may be within-cell changes in CP as a function of pCR that may have been hidden by pooling across populations of cells. But in the end, this paper has problems in detecting real changes in this relationship at the level of recordings from single cells. The small SEMs that attach to the data in Figure 3a do indeed reflect pooling across a population sample; they do not relate to changes in individual neurons. The panel in Figure 3d shows the true picture for individual cells. So this discussion begs the question as to why the population analyses in Dodd et al. do not show the predicted relationships. The point made by the authors about within-cell changes does not appear to be material in this regard.

eLife. 2021 Apr 7;10:e54858. doi: 10.7554/eLife.54858.sa2

Author response


Essential revisions:

This manuscript further develops previous theoretical findings concerning "choice probability", the relationship of the trial-to-trial variation of firing rates of a (sensory) neuron and the behavioral choice of a subject in perceptual decision making experiments. CPs and their interpretation have a long history, including the debate about their origin (do CPs reflect a causal effect of sensory neural variability on the decision or do they arise from top-down feedback signals?) Here, the authors identify an additional factor by showing that CP depends on the "choice rate" (the fraction of left vs. right choices). Since this choice rate depends on stimulus strength, CP itself will depend on stimulus strength. They also show experimental evidence for this effect (in the classical Britten et al. dataset) that has been overlooked until now and present methods for refined analysis for linking the responses of sensory neurons to choices.

The reviewers raised a number of central concerns about the relationship between the model and the available empirical data. They agree that the relationship between pCR and CP must be firmly established as a general experimental fact, or reasons given why it is apparent in some conditions and not others. Also, the theoretical interpretation of the relationship between pCR and CP needs to deliver accurate predictions of the experimental results and also to demonstrate consistency with theoretical models of how other factors affect CP (interneuronal correlation and stimulus sensitivity, being the primary variables of concern here).

1. The relationship between pCR and CP appears more at the core of this rather than the related stimulus strength vs CP, but there is a quantitative mismatch between model prediction and empirically observed data, especially in Figure 3a and Figure 4b. Interpretation of Figure 3a. Our initial concern when reading the paper was that the predicted effect of choice rate on CP (dotted line in Figure 3a) is so small that there will not be enough statistical power to detect it in the available experimental data. Moreover, the analysis is complicated by the fact that the effect of h(pCR) is partly masked by the gain modulation effect (Figure 4b). However, for the n=48 neurons (Figure 4b) that show the predicted U-shape of CP vs. pCR, the effect is actually much larger than the theoretical prediction (e.g. for pCR ~0.1 the CP is close to 0.65 when in theory it should be ~0.58). What could be the reason for this quantitative mismatch?

We now better clarify that the CP(pCR) dependency is the result of two factors: h(pCR) shared by the entire population, and CC(pCR) which is cell-specific. Our model predicts the former and allows us to infer the latter. These revisions are reported in lines 160-164 and 170-174.

We thank the Reviewers for the opportunity to clarify these important points. As we now discuss in lines 170-174, we refer to a CP(s) stimulus dependence or CP(pCR) dependence interchangeably, since most often -and in particular in our case- pCR values change due to changes in stimulus levels, and hence the two quantities are intertwined and monotonically related. As discussed in lines 160-164, according to the model in Equation 7, a dependence of the CP on the pCR -and by extension on the stimulus level- may appear in two ways. First, through the modulation factor h(pCR) which arises due to the thresholding operation when converting an internal, continuous estimate of the stimulus into a binary decision. Independent of that, it can appear in a second way, due to a stimulus-dependent choice correlation CC(s), for instance as the result of stimulus-dependent cross-neuronal correlations. Importantly, the U-shape due to the factor h(pCR) is common to all cells, while the CP stimulus dependencies induced by CC are cell-specific (as motivated in lines 164-167 and lines 672-680). This makes both effects dissociable empirically – at least in principle.
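To give a feel for the size of a thresholding-induced U-shape, the following sketch computes a first-order modulation factor for a bivariate Gaussian threshold model. It is not taken from Equation 7 of the manuscript, which is not reproduced here. It assumes that, to first order in the choice correlation, CP - 0.5 scales with the difference of the choice-conditioned means of the decision variable, which for a standard normal variable split at a threshold giving choice rate p is pdf(z) / (p (1 - p)) with z the normal quantile of p. Normalizing by the value at p = 0.5 gives the factor below. The function names and the linear approximation CP ≈ 0.5 + (sqrt(2)/pi) * CC at p = 0.5 are assumptions of this sketch.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and SD 1


def threshold_modulation(p_cr):
    """First-order factor scaling (CP - 0.5) at choice rate p_cr relative to p_cr = 0.5."""
    if not 0.0 < p_cr < 1.0:
        raise ValueError("p_cr must lie strictly between 0 and 1")
    z = _STD_NORMAL.inv_cdf(p_cr)
    return _STD_NORMAL.pdf(z) / (4.0 * _STD_NORMAL.pdf(0.0) * p_cr * (1.0 - p_cr))


def approximate_cp(choice_corr, p_cr):
    """Linear approximation of CP for a given choice correlation and choice rate."""
    return 0.5 + (math.sqrt(2.0) / math.pi) * choice_corr * threshold_modulation(p_cr)


if __name__ == "__main__":
    for p_cr in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(f"p_cr={p_cr}: modulation={threshold_modulation(p_cr):.3f}, "
              f"approximate CP={approximate_cp(0.15, p_cr):.3f}")
```

Under these assumptions the factor equals 1 at a choice rate of 0.5, is symmetric about it, and rises only modestly toward extreme choice rates, which is consistent with the point that the shared U-shape is small and hard to detect.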

We agree with the Reviewer that the modulation induced by h(pCR) is small, and it is therefore challenging to detect in existing experimental data. This is why we developed our refined methods, which allow examining within-cell CP modulations by the pCR. As we argue in lines 195-202, for each individual cell i the CPi(pCR) profile jointly reflects h(pCR) and any potential dependence CCi(pCR). The average across a population of neurons will therefore average out cell-specific CPi(pCR) dependencies associated with CCi(pCR). This will make the U-shape dependency due to the h(pCR), which is shared by the entire neural population and does not average away, more visible.

Empirically, we found that the average <CP(pCR)> in Figure 3 has an asymmetric dependency that goes beyond that predicted by the h(pCR) factor, which motivated us to use a cluster analysis to further separate h(pCR) from cell-specific contributions through CC(pCR) and to shed further light on general patterns of CCi(pCR) dependencies in the data.

Still, the cluster analysis cannot by itself separate h(pCR) from other CC(pCR) patterns that have the same shape (that is only possible by changes in the experimental design to dissociate stimulus strength from pCR). It only separates the most predominant profiles of CP(pCR) dependencies across cells. This means that the symmetric dependency of cluster 2 in Figure 4B may result not only from h(pCR), but also from other symmetric CC(pCR) patterns shared by the cells. Its symmetric shape is compatible with the symmetric shape of h(pCR) predicted by the model. However, as pointed out by the Reviewers, there is a quantitative mismatch in its magnitude. We now address this mismatch in the revised text (lines 424-431). First, the mismatch can be due to additional symmetric dependencies through CC(pCR) patterns that are also shared across the cells. Second, the quantitative mismatch can be due to the abstraction of the decision-making dynamics necessary to derive the analytical model. The analytical framework of the model, as developed by the work of Shadlen et al., 1996, Haefner et al. 2013, and others, constitutes an abstraction of any biological implementation of the decision process. The quantitative deviations could be caused by effects neglected by the model, such as a dynamic feedback contribution amplifying the U-shape dependence. In this regard, we consider that the qualitative prediction of the U-shape of h(pCR) is itself remarkable, given the degree of abstraction of the CP model.

Furthermore, our empirical characterization of existing patterns of CP-stimulus dependencies and our theoretical analyses of their contributing factors and predicted shapes are themselves original contributions, because these dependencies have mostly been unnoticed in previous analyses or have been ignored because no interpretation was available. Although motivated by the challenge of testing the prediction of the U-shape modulation, our new refined methods to examine within-cell CP(pCR) profiles are agnostic about their origin, and as we have shown provide a powerful tool to characterize this dimension of the cells choice-related neural responses that has not been thoroughly explored yet.

Importantly, despite focusing on the two main predominant patterns of the cluster analysis, we do not claim that those are the only patterns present. Conversely, as we discussed in lines 672-683, previous theoretical and experimental work would support the presence of a richer structure of CP(pCR) dependencies, which we would not be able to characterize because of the limitations of the data in terms of number of trials and number of cells. However, with rapidly improving recording technology allowing for chronic recordings (i.e. many more trials than currently) in addition to recording many neurons concurrently, we expect future studies to map this structure in much greater detail resulting in insights into the functional role and place in the microcircuit of the different clusters of neurons exposed by our analysis.

We would like to see in this regard:

a. A clear discussion of and situation of the results within the available empirical literature on CP, especially with regards to quantitative effect size in Britten et al. 1996 and Dodd et al. 2001.

We agree that in some previous papers CP is computed from z-scored spike counts across different stimulus conditions (and thus different choice rate values). However, there are several other papers (including Britten et al. 1996, Nienborg and Cumming 2009, Wimmer et al. 2015, Katz et al. 2016) that compute CP from "zero coherence trials" with choice rate ~ 0.5 (or the equivalent for binocular disparity in V5/MT: Dodd et al. 2001; Krug et al. 2004; Wasmuht et al. 2019). These papers should be mentioned in order to avoid the impression that all previous work suffers from a (slightly) incorrect way of computing CP.

We now discuss prior studies more accurately, by better delineating their respective contributions and by distinguishing between those reporting CP(s=0) and those computing a Grand CP. We clarify that our primary goal is to advance and promote the analysis of the CP dependence on pCR/stimulus level based on a deeper understanding of the nature of decision-related signals (both choice and detect probabilities) in sensory neurons. Both Grand CPs (computed in various ways) and CP(s=0) are scalar summaries of the richer and more informative underlying function that considers the CP changes across stimulus values. We clarified the importance of isolating within-cell modulations of the CP by pCR from the across-cell heterogeneity of CP magnitudes, which is key for studying the within-cell patterns of CP as a function of pCR. These revisions are reported in lines 37-44 and 57-59.

Thanks to the reviewer for pointing out the necessity to further clarify how we propose to study CP(pCR) patterns building on previous studies. We now cite this previous work in lines 37-44, by better recognizing the importance of previous contributions and by better specifying that in many cases CPs have been calculated only from noninformative stimuli, or alternatively weighted averages or corrected z-scoring are used to combine data across stimulus levels. We emphasize the novelty with respect to these alternatives in our consideration of CP stimulus dependencies (lines 57-59). We also emphasize that our proposal to construct within-cell CP(pCR) profiles is not meant to serve only -not even primarily- as a way to improve the estimation of a single grand CP value per cell. The calculation of a single CP from trials with noninformative stimuli is meant to study choice driven neural responses isolated from stimulus-driven signals. Conversely, as discussed in lines 57-59, 683-690, our aim is to (a) theoretically motivate the need to study the shape of the CP(pCR) patterns, and (b) demonstrate our power to do so with refined methods using a classic and public dataset as an example.

We have now improved the description of how our refined analysis differs from previous analysis in Britten et al. 1996 (lines 616-621). We have also incorporated to this comparison the important work of Dodd et al. 2001 (lines 621-633). In the original submission we explicitly referred to the key differences with respect to the analysis in Figure 3 of Britten et al. only in Discussion. Now we expanded the description of the key steps of our analysis (lines 271-307) to more clearly indicate why it should improve the characterization of within cell CP dependencies on pCR. The key difference with respect to Figure 3 of Britten et al. 1996 and Figure 6 of Dodd et al. 2001 is that in our analysis we separate the within-cell modulations of the CP by pCR from the across-cell heterogeneity of the CP magnitudes. Conversely, in Britten et al. 1996 an average of CP values across cells was carried out at each stimulus level without ensuring that the same cells were contributing to each of these averages. Similarly, in a scatter plot of the form presented in Figure 6a of Dodd et al. 2001, the cell-identity of each dot is lost, and hence it is not possible to trace the within-cell profile CP(pCR), making it harder to discriminate within-cell modulations from across cells CP heterogeneity. The construction of CP(pCR) profiles for each cell isolates within-cell modulations. The fact that the modulation h(pCR) is multiplicative, as we indicate in lines 189-190 and 275-277, makes the detection of modulations even more sensitive to the isolation of within-cell comparisons, since the magnitude of any modulation is relative to the magnitude of the CP value obtained with the noninformative stimulus. Furthermore, CP(pCR) profiles also have the advantage that they can subsequently be used -with cluster analysis as we do, but also with other pattern recognition methods- to characterize which patterns of CP stimulus dependencies are predominant across cells.

As we explain in lines 297-307, an important step when averaging across cells is that we need to ensure that our average CP profiles -as presented in Figure 3- really correspond to an average of the individual within-cell CP profiles of the cells included in the analysis, and are not the result of different subpopulation of neurons contributing to the average at each dot of the curve. That is, our average should correspond to an average of the within-cell CP modulations, and be isolated from heterogeneity in the CP magnitude across cells. This distinguishes our refined analysis from the type of previous analysis used in Britten et al. 1996 (Figure 3) or Dodd et al. 2001 (Figure 6), in which across-cell CP variability may hinder within-cell modulations. We have discussed these differences between these previous works and our new method in detail in lines 616-638. In addition, we also have improved the comparison of our analysis with the one of Britten et al. in the Introduction (lines 67-69). In particular, we corrected a key problem in the old text (old lines 71-74 in the previous submission) that misleadingly suggested that the key feature of our novel method was the separation of cells of opposite choice preferences. We now clarified that the key feature is, more broadly, the characterization of within-cell CP profiles.

b. Re-analysis of the main effect in different subsets of cells. We are concerned that modelling the group for which CP < 0.5 includes a lot of noise, because the members of this are almost always non-significant, as shown in Dodd et al. and in Britten et al. (See their Figure 5). If you look at Dodd et al., there is no sign of this relationship between pCR and CP. Moreover, in Dodd et al. there are very few 'wrong way' choice probabilities: so few in fact that they are within the statistical bounds of repeated measures. So the procedure of separating the CPs into >0.5 and <0.5 is not likely to alter the result at all from the published Figure in Dodd et al. It might be helpful to recalculate the results for Figure 3A but only including those neurons that show an individually significant CP > 0.5.

We completely agree with these suggestions. We note that cells with CP<0.5 were already excluded from our further cluster-based analysis, and this is better specified in the revision. We also agree that it is useful to better discuss the implications of Dodd’s work for our results, and the suggestions that arise from our work on how to extend Dodd’s previous work. These issues have been clarified in the revision. These revisions are reported in lines 330-343, 297-307, and 616-633.

This is a good suggestion, and we agree that this issue needs clarifications. We agree with the Reviewers that "a separation of CPs into >0.5 and <0.5 is not likely to alter the result at all from the published Figure 6 in Dodd et al. 2001". However, as discussed above and now emphasized in lines 297-307 and 616-629, apart from the separation between cells with CPs >0.5 and <0.5 another important component of our refined analysis is to isolate within-cell CP(pCR) profiles from the across cells CP heterogeneity. In Figure 3A of Britten et al. 1996 the fact that the average of CPs at each coherence level is calculated without a separation of CPs into >0.5 and <0.5 affects the identification of the U-shape, but this lack of separation is not the only factor hiding the CP(pCR) patterns. In more detail, in the data of Britten et al. 1996 the set of coherence levels used varies across cells, which means that a different subpopulation of cells is contributing to each dot of their Figure 3A, so that the analysis of within-cell CP(pCR) profiles is lost. This is the reason why in our analysis we take special care that the same neurons are averaged at each pCR level, even at the cost of excluding a portion of the cells from the analysis because a CP value cannot be calculated for all pCR levels.
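The effect of keeping the same neurons at every pCR level can be illustrated with a minimal sketch. The function name `average_cp_profile`, the NaN convention for missing CP values, and the toy numbers are assumptions made for illustration only; this is not the analysis code used for the paper.

```python
import numpy as np

def average_cp_profile(cp, same_cells_only=True):
    """Average CP profiles across cells.

    cp: array (n_cells, n_levels); NaN marks levels where no CP could be computed.
    same_cells_only: if True, keep only cells that have a CP value at every level.
    """
    cp = np.asarray(cp, dtype=float)
    if same_cells_only:
        keep = ~np.isnan(cp).any(axis=1)  # cells with a complete profile
        return cp[keep].mean(axis=0)
    return np.nanmean(cp, axis=0)         # per-level average over whichever cells exist

# Toy data (hypothetical): both cells have flat profiles,
# but the low-CP cell lacks the two extreme levels.
cp = np.array([[0.7, 0.7, 0.7],
               [np.nan, 0.5, np.nan]])
pooled = average_cp_profile(cp, same_cells_only=False)  # mixes subpopulations per level
within = average_cp_profile(cp, same_cells_only=True)   # same cells at every level
```

In this toy case the per-level average shows an apparent U-shape (0.7, 0.6, 0.7) that arises only from across-cell heterogeneity, whereas the average restricted to complete profiles stays flat at 0.7.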

In the case of Dodd et al. 2001, all the results of the paper rely on the analysis of CPs from zero disparity trials, and hence their main conclusions stand and are not affected by our new advances. We make this point clear in our reference to Dodd et al. 2001 (lines 629-633). The point of their work that could be revisited in light of our advances is that in the specific case of their Figure 6A the capacity to identify CP(pCR) dependencies may have been affected by the lack of isolation of within-cell CP(pCR) modulations. From our understanding of the experimental procedure of Dodd et al. 2001, the range of disparity levels changes from unit to unit, being usually five or seven. This means that different cells may be contributing more dots in a certain range of the y-axis (% Choice PREF). Moreover, it is not possible to identify the within-cell CP(pCR) profiles from their published scatter plot. Furthermore, in our understanding, Figure 6A includes a dot for each stimulus condition in which the animal made at least one incorrect response. As Dodd et al. indicate, this is reflected in the increasing spread of the values at the top and the bottom of the ordinate axis of Figure 6A. We attenuated this kind of problem in our refined method by constructing pCR bins within which a weighted average of several CPs is calculated (lines 308-318). Because the CP(pCR) profile is a vector constructed using 5 bins, this means that the estimate for each bin is already more reliable than individual estimates for single stimulus levels.
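The binning step can be sketched as follows. The equal-width bins on [0, 1], the choice of weights (for example, the number of trials behind each CP estimate), and the name `binned_cp_profile` are assumptions for illustration; the bins and weights actually used are those described in the manuscript.

```python
import numpy as np

def binned_cp_profile(pcr, cp, weights, n_bins=5):
    """Weighted average of one cell's CP values within pCR bins.

    pcr, cp, weights: 1-D arrays with one entry per stimulus level of the cell.
    Returns a length-n_bins profile; bins without data are NaN.
    """
    pcr = np.asarray(pcr, dtype=float)
    cp = np.asarray(cp, dtype=float)
    weights = np.asarray(weights, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)  # assumed equal-width bins
    # Map each pCR value to a bin index; pCR = 1 falls into the last bin.
    idx = np.clip(np.digitize(pcr, edges) - 1, 0, n_bins - 1)
    profile = np.full(n_bins, np.nan)
    for b in range(n_bins):
        in_bin = idx == b
        if in_bin.any() and weights[in_bin].sum() > 0:
            profile[b] = np.average(cp[in_bin], weights=weights[in_bin])
    return profile

# Hypothetical cell with four stimulus levels.
profile = binned_cp_profile(pcr=[0.10, 0.15, 0.50, 0.95],
                            cp=[0.60, 0.70, 0.55, 0.80],
                            weights=[10, 30, 20, 5])
```

A cell would then enter the across-cell average only if its profile has no NaN entries, consistent with averaging the same neurons in every bin.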

Altogether, our method was specifically designed to better estimate within-cell CP(pCR) profiles, and this analysis may reveal some structure difficult to appreciate from the joint scatter plot of CPs from all cells. We now indicate this point also with regard to the discussion of Figure 6 of Dodd et al. 2001 (lines 621-633).

[Speculative note: We think it would be very interesting to collect more data in the experimental paradigm of Dodd et al., ideally with chronic recordings of many neurons at the same time. Comparing the results of our analysis of the CP(pCR) dependency for such a dataset with one collected in a more conventional discrimination task (e.g. motion direction discrimination) might provide deeper insights into the perceptual decision-making mechanisms underlying both, and how they might differ. For instance, one might find that the h(pCR) dependency is absent in the rotating cylinder task suggesting that the decision in that task – unlike in a motion direction discrimination task – is not mediated by a continuous internal estimate that is being thresholded.]

We also agree with the reviewers that "modelling the group for which CP < 0.5 includes a lot of noise". Indeed, in lines 330-339 we indicate that the fact that we do not find a modulation consistent with the factor h(pCR) significant for the cells with CP<0.5 can be explained by two factors. One of the factors is that fewer cells are included in this group, so that the estimated average CP profile is noisier. The other factor is related to the point of the Reviewers, namely the fact that the h(pCR) modulation is multiplicative to CP-0.5, and hence if few cells in the group of CP<0.5 have a CP magnitude significantly different from 0.5, then CP-0.5 = 0+noise, and a consistent effect of the multiplicative factor h(pCR) cannot be isolated when averaging across cells. We now further highlighted (lines 339-343) this fact that as pointed out by the Reviewers "CP < 0.5 includes a lot of noise" and this explains why the prediction of the U-shape modulation could only be corroborated to be significant for the cells with CP>0.5. As a result, the lack of a significant inverted U-shape modulation for the cells with CP<0.5 does not constitute evidence against our model. Furthermore, note that the cells with CP <0.5 do not affect the subsequent discrimination between a symmetric and an asymmetric pattern in Figure 4b (cluster 1 is the same in Figure 4a and 4b), since they are already excluded for that analysis, as we now highlight (lines 386-387).
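The argument that a multiplicative modulation cannot be isolated when CP-0.5 is close to zero can be illustrated with a small Monte Carlo sketch of a threshold model. It assumes a unit-variance Gaussian response correlated with a Gaussian decision variable and a fixed threshold; the function names, trial count, and CC values are hypothetical and chosen only for illustration.

```python
import numpy as np
from statistics import NormalDist

def choice_probability(r, choice):
    """Area under the ROC curve separating responses by choice (Mann-Whitney form)."""
    r = np.asarray(r, dtype=float)
    choice = np.asarray(choice, dtype=bool)
    ranks = np.empty(r.size)
    ranks[np.argsort(r)] = np.arange(1, r.size + 1)  # no ties for continuous r
    n1 = int(choice.sum())
    n0 = r.size - n1
    return (ranks[choice].sum() - n1 * (n1 + 1) / 2) / (n1 * n0)

def simulate_cp(cc, pcr, n_trials=200_000, seed=0):
    """CP of a Gaussian response with correlation cc to the decision variable d."""
    rng = np.random.default_rng(seed)
    d = rng.standard_normal(n_trials)
    r = cc * d + np.sqrt(1.0 - cc**2) * rng.standard_normal(n_trials)
    theta = NormalDist().inv_cdf(1.0 - pcr)  # threshold giving P(d > theta) = pcr
    return choice_probability(r, d > theta)

for cc in (0.3, 0.0):
    print(cc, [round(simulate_cp(cc, p), 3) for p in (0.1, 0.5, 0.9)])
```

Under these assumptions the simulated cell with CC = 0.3 should show larger CPs at pCR = 0.1 and 0.9 than at 0.5, whereas for CC = 0 the CP stays near 0.5 at every level, so there is nothing for a multiplicative factor to modulate.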

c. A simulation of the factors that affect expected effect size. The simulation should show what size of the effect is expected given the limited amount of noisy data (number of neurons, trials, etc.). This would allow you to determine whether the experimental result lies with the confidence bounds of the theoretical prediction.

There is apparently no such relationship between CP and pCR in the Dodd et al. data (but it could perhaps show up when selecting and binning the neurons as described in the manuscript, line 288). In any case, a possible explanation for the discrepancy could be that the expected relationship is so small that it cannot reliably be detected from ~100 neurons. We also think this issue can be clarified by simulating the size of the effect for limited data.

We now include a power analysis based on simulations that provides a clearer intuition about the expectations for effect size as a function of the number of trials, the number of neurons, and the magnitude of the CPs. These simulations are reported in lines 1238-1276.

We thank the Reviewers for the suggestion to perform simulations that provide additional intuition of the expected statistical power for the detection of the CP modulation h(pCR). We now implemented simulations of how the detection of a significant modulation h(pCR) depends on the number of trials, the number of neurons, and the magnitude of the CPs (see new section S1.2 in the Supplementary Material). As expected, these simulations show that increasing the number of cells and of trials used to estimate an average CP(pCR) profile increases the statistical power for the detection of a CP modulation. This indicates the utility of our refined analysis of within-cell CP(pCR) dependencies, which can substantially improve the statistical power of the analysis by efficiently combining CP values, both across stimulus levels and across cells. Most importantly, in our method CP values are combined while preserving the characterization of within-cell CP(pCR) profiles.
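The qualitative logic of such a power analysis can be sketched as below. This is a simplified stand-in, not the simulation of section S1.2: the five-bin U-shaped factor `h`, the baseline CP, the noise level (the Mann-Whitney null standard error for balanced choices, sqrt(1/(3n))), and the normal approximation to the test statistic are all assumptions chosen for illustration.

```python
import numpy as np
from statistics import NormalDist

def detection_power(n_cells, n_trials, cp0=0.56, n_sim=500, alpha=0.05, seed=0):
    """Fraction of simulated experiments in which a U-shaped CP modulation is detected."""
    rng = np.random.default_rng(seed)
    h = np.array([1.2, 1.05, 1.0, 1.05, 1.2])    # hypothetical modulation, 1 at the center bin
    true_profile = 0.5 + (cp0 - 0.5) * h         # multiplicative modulation of CP - 0.5
    noise_sd = np.sqrt(1.0 / (3.0 * n_trials))   # approximate s.e. of a CP estimate (assumption)
    z_crit = NormalDist().inv_cdf(1.0 - alpha)   # one-sided test, normal approximation
    hits = 0
    for _ in range(n_sim):
        cp = true_profile + noise_sd * rng.standard_normal((n_cells, 5))
        # Within-cell contrast: outer bins minus center bin.
        contrast = cp[:, [0, 4]].mean(axis=1) - cp[:, 2]
        z = contrast.mean() / (contrast.std(ddof=1) / np.sqrt(n_cells))
        hits += int(z > z_crit)
    return hits / n_sim

power_few_cells = detection_power(n_cells=20, n_trials=100)
power_many_cells = detection_power(n_cells=200, n_trials=100)
```

Because the contrast is computed within each cell before averaging, heterogeneity in CP magnitude across cells does not enter the test; in this sketch power grows with both the number of cells and the number of trials per CP estimate.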

These simulations also indicate that the p-values obtained for the experimental data, in particular p = 0.0008 for the red curve of Figure 4B, are smaller than predicted from the model. This new analysis complements the comparison of the experimental effect size (red curve) and predicted effect size (dashed black curve) in Figure 3A, and in general the comparison of the experimental results (Figures 3-4) with the analytical effect sizes displayed in Figure 2, which already pointed to the higher magnitude of the experimental effect size. In the revised paper we now address this quantitative mismatch between the predictions of the h(pCR) modulation and the CP(pCR) pattern obtained for the symmetric cluster (red curve) in Figure 4b (lines 424-431).

As already discussed above in a previous reply, the quantitative mismatch can be explained either by the presence of further symmetrical sources of CP stimulus-dependence through CC(pCR), and/or dynamic feedback amplifying the threshold-induced U-shape dependence. Given the level of abstraction of our CP model, we think it is remarkable that qualitative evidence compatible with h(pCR) is found. Furthermore, independently of the quantitative fit to the prediction of the h(pCR) modulation, it is in itself an achievement of our method that the characterization of within-cell CP(pCR) allows identifying any existing pattern of CP stimulus dependence, since their existence had been previously unnoticed. This identification is a first step towards inferring the underlying mechanisms producing a characteristic structure of CP(pCR) patterns, as we discussed in lines 672-683.

Finally, we agree with the Reviewers that our analysis applied to the Dodd et al. data may show the h(pCR) dependence (but see the speculative note above).

2. Clarifications are further needed in term of how the fluctuations in the stimulus-related gain of neuronal firing are responsible for the emergence of stronger CPs at higher performance levels. As we understand it, this is required both in terms of the formal implementation in the model and in terms of the proposed neurobiological implementation.

The authors present a detailed analysis of abstract decision-making models. They relate noisy neuronal responses ri, a hypothetical covert decision variable d and decision threshold θ, and an overt behavioural choice D. The authors assume a bivariate Gaussian association between ri and d, with a certain correlation coefficient, and from this minimal basis derive exact or approximate expressions for choice rate pCR, choice probability CPi, choice correlation CCi, between ri and d, and choice-triggered averages of ri, CTAi , and of d, CTAd. The treatment extends previous work in that it covers the entire range of choice rates pCR, not only the special case pCR = 0.5.

A key result is that choice probability CPi changes multiplicatively as a function of pCR, increasing as the decision grows more consistent in either direction, with the baseline level set by choice correlation CCi. An important implication is that the dependence of CPi on pCR, which is shared by all cells and expected to be U-shaped, can be averaged over cells with different choice correlation CCi, provided that cells with positive and negative choice correlations are distinguished.

To test these predictions, the authors re-analyze MT recordings from Britten et al. (1996) and were able to confirm a U-shaped dependence of average CP on pCR, which was statistically significant for cells with positive CCi. However, contrary to predictions, the U-shaped dependence was asymmetric and more pronounced when the more frequent choice (pCR > 0) is consistent with the preferred stimulus of the cell (positively correlated cells CCi>0). A cluster analysis of empirical individual cell dependencies of CPi on pCR revealed, in addition to the predicted U-shaped dependencies, the presence of unexpected monotonically increasing dependencies.

To clarify the origin of these unexpected dependencies, the authors consider the effect of trial-to-trial response gain fluctuations (Goris et al., 2014) and, with the help of the model of these authors, confirm that gain fluctuations account for 62% of the observed trial-to-trial variance. The authors point out that gain fluctuations add a stimulus-dependent component to the noise covariance of neural responses, which is inherited by choice correlation CCi and by choice probability CPi. In Methods and Supplementary Text S4, the authors derive that this stimulus-dependent component can itself depend asymmetrically on pCR, to an extent that is specific to each cell (i.e., the specific coupling between response and gain). Unfortunately, the authors do not offer an intuitive argument about the origin of this asymmetry.

Please explain how gain fluctuations lead to rising dependencies on pCR in some cells. Which cells are these? Do they combine strong stimulus modulation with weak choice correlation? Without such an explanation, the entire cluster analysis appears incomplete and ultimately pointless.

We now provide a more complete description of the gain modulation model and how it relates to the results of the cluster analysis in the Results section. Some of the final questions raised by the Reviewers are excellent ones but will require data to characterize the cells and their responses (e.g. structure of cross-neuronal correlations? what layer? Excitatory/inhibitory? Target of feedback projections? Projecting to other cells within the same column/the same area/other cortical areas/thalamus/…?) beyond those available in Britten et al. data. We performed the cluster analysis to demonstrate that there is structure in the CP(pCR) dependencies, and that beyond the modulation h(pCR) common to all cells there is additional structure that is cell-specific and can be expected to be related to the cell properties mentioned above. Now we’re handing this back to the field to figure out how these structures correlate with other properties or structures since we believe that this will provide insights into the neural circuits underlying perceptual decision-making. Together with more fine-grained improvements reported below we specifically improved the presentation of the gain model in Results (lines 205-256), Methods (lines 882-957) and Suppl. Information S4.

Thanks to the Reviewers for the summary of our findings. We now better explain that the finding of an asymmetric component of the CP(pCR) dependence is not "contrary to predictions" of our model, as presented in Eq 7. We indicate in lines 160-164 that the CP(pCR) dependence can be caused by the threshold-induced factor h(pCR) but also by choice correlations CC(pCR). We indicate in lines 197-200 that averaging across neurons is expected to help to isolate the h(pCR) pattern because it is common to all cells, while cell-specific patterns are expected to be introduced by CCi(pCR). The identification of the U-shape of h(pCR) relies on the assumption that CC(pCR) is stimulus independent or that the CP(pCR) dependencies induced by CC(pCR) are sufficiently heterogeneous across cells so that when averaging across cells they average out and the predominant modulation observable in the average profile <CP(pCR)> is h(pCR). The fact that we find the asymmetric component of CP(pCR) is against this assumption, suggesting that there is a shared pattern of CC(pCR) among part of the cells (lines 349-353), but not against the predictions of our model. Indeed, as we discuss in lines 164-167 and 672-680, there is evidence from previous theoretical and experimental work suggesting that choice correlations CCs have cell-specific stimulus dependencies, and hence given Eq 7 it should be expected that also the CPs have cell-specific stimulus-dependent patterns. As we discuss in lines 683-692, our refined method to characterize within-cell CP(pCR) should help to examine this dimension of the data that has not been thoroughly explored yet, and which can provide insights into the nature of the underlying decision-making mechanisms beyond what is captured by a single CP value.

As mentioned above, the finding of the asymmetric pattern is evidence of a stimulus dependence of the CC. This raises the question of what the origin may be of this component of the CC stimulus dependence that is shared by the neurons. As we mention in lines 164-167 and 672-680, it can be expected that CC stimulus dependencies have a rich structure reflecting the structure, for example, of feedback connections. However, as indicated in lines 458-460, examining the structure of CP(pCR) patterns associated with the computational decision-making mechanisms would require a characterization of the cross-neuronal correlation structure of the responses, which is beyond the single cell recordings of Britten et al. This is why we chose to examine whether the asymmetric component of the CP(pCR) could be explained by a non-specific mechanism for which prior empirical support exists (Goris et al. 2014), such as the presence of trial-to-trial gain fluctuations.

We have now substantially improved the description of the gain model (lines 205-256) and add new analyses quantifying how well the gain model can predict the experimentally observed asymmetric CP pattern extracted from the cluster analysis (lines 443-460). In particular, we now report also that the theoretical coefficients estimated from the gain model significantly correlate with the experimentally estimated coefficients for the asymmetric cluster, although overestimating their magnitude.

The separation in a new Section for the gain model (lines 205-256) helps to clearly differentiate between the generic model introduced before, and the specific model that considers gain fluctuations as the source of a CC(pCR) contribution. In this section we describe the main components of the model and introduce the resulting analytical CP expression (Eq 11), which in the previous version was only provided in Methods. For this model of gain fluctuations, we specifically adopted a purely feedforward encoding/decoding model, as previously studied in Shadlen et al. (1996) and Haefner et al. (2013). Equation 9 is a generalization of the CP expression of Haefner et al. (2013) to all stimulus levels. We now further explain this extension in connection with the previous results of Haefner et al., which characterized the dependence between CPs and properties such as the neurometric sensitivity and the cross-neuronal correlation structure (lines 218-227).

In more detail, we consider that the neural decoder that determines the choice is tuned to be optimal at the decision boundary, that is, that the read-out weights are tuned to the structure of covariability in the population responses for the non-informative stimuli. For this optimal read-out, the CP is proportional to the neurometric sensitivity (Haefner et al. 2013, Pitkow 2015). However, in the presence of gain-fluctuations, this structure of covariability changes for different stimuli -due to the additional gain-related component in Eq 10. We now explain in more detail how the CC stimulus dependence appears as a consequence of the stimulus dependence of the gain-related component of the cross-neuronal correlations in Eq 10 (lines 234-243).
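How a shared gain makes the covariance structure stimulus dependent can be checked numerically with a short sketch. It assumes conditionally independent Poisson counts whose rates are scaled by a single gamma-distributed gain with mean 1, in the spirit of Goris et al. 2014; it is not meant to reproduce Eq 10, and the function name, rates, and gain variance are hypothetical.

```python
import numpy as np

def shared_gain_covariance(rates, gain_var=0.1, n_trials=200_000, seed=0):
    """Empirical covariance of Poisson counts modulated by one shared gain per trial."""
    rng = np.random.default_rng(seed)
    rates = np.asarray(rates, dtype=float)
    # Gamma gain with mean 1 and variance gain_var, shared by all cells on a trial.
    gain = rng.gamma(shape=1.0 / gain_var, scale=gain_var, size=n_trials)
    counts = rng.poisson(gain[:, None] * rates[None, :])
    return np.cov(counts, rowvar=False)

# Same two cells at a weakly and a strongly driving stimulus (hypothetical mean counts).
cov_weak = shared_gain_covariance([5.0, 10.0])
cov_strong = shared_gain_covariance([20.0, 40.0])
```

For this construction the gain adds gain_var * f_i * f_j to the covariance of cells i and j (and gain_var * f_i**2 to the variance of cell i), so the gain-related component grows with the stimulus-driven rates f_i even though the gain statistics themselves do not depend on the stimulus.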

We have also improved the description of the formal implementation of the model in Methods (lines 882-957) and in section S4 of the Suppl. Material. In Eq 23 and lines 924-934 we explain how the changes in the cross-neuronal correlation structure of Eq 22 create additional response variability that attenuates the CC but also create a new component of covariation between the decision variable and the single-cell responses due to the shared gain fluctuations common to the neural population. A more detailed derivation is now extended in section S4.

Moreover, to further understand which properties of the single-cell responses and of the decoder determine the form of the asymmetric gain-induced CP(pCR) component, in S4 we further generalize the model (lines 1468-1484). This generalization is valid not only for an optimal decoder but for any unbiased decoder. We describe how the stimulus dependencies of CC induced by the gain are determined in Equation S12. The shape of these dependencies depends on cell-specific properties, namely on the neurometric sensitivity of the cells and on the relative contribution of the gain to their variability, as well as on population properties, such as the behavioral sensitivity of the decoder to changes in the stimulus, and on the relative contribution of the gain to the variability of the decision variable. In the particular case of the model with an optimal decoder, on which we focus in the main article, as explained in lines 244-245, for each cell the strength of the asymmetric component is determined by the CC magnitude for pCR = 0.5 as well as by the relative contribution of gain to their variability (λi).

Additionally, to further connect our model to previous computational (Shadlen et al. 1996) and analytical (Haefner et al. 2013) studies of the relation between CPs and the structure of cross-neuronal correlations, we now analyzed in detail the effect of gain fluctuations when the decoder is formed by two pools of neurons/antineurons with opposite choice preference (lines 1427-1452).

While the gain model explains main features of the CP(pCR) patterns observed with the cluster analysis we would like to highlight that the value of the cluster analysis stands independently of modeling the observed patterns with the gain model. This is now clearer since the results of the cluster analysis and the gain model are in separate subsections. As we now discuss in lines 349-353 and 381-382, the cluster analysis identifies the existence of a statistically significant asymmetric CP(pCR) pattern together with the symmetric pattern that the threshold-induced modulation h(pCR) predicts. The presence of this asymmetric pattern is a signature of the existence of stimulus dependent choice correlations CC(pCR), something that had not been identified in previous studies of CPs, but that is consistent with both theoretical and experimental work (lines 672-680) which studies the structure of stimulus dependent decision-related feedback signals, and more broadly stimulus-dependent cross-neuronal correlations. In this sense, our cluster analysis resolves the apparent contradiction between expectations from this bulk of previous work and the lack of evidence of CC stimulus dependencies.

A full characterization of the CP(pCR) patterns in relation to cell types would require a joint characterization of properties such as the connectivity structure and cross-neuronal correlation structure, which are not available from the single unit recordings of Britten et al. 1996 (lines 458-460). Despite this limitation, the cluster analysis is useful because it reveals a significant structure of CP(pCR) previously unnoticed, and validates the use of within-cell CP(pCR) profiles to study the interaction between stimulus-driven and choice-driven signals in neural responses. Similarly, the gain model we present shows that an asymmetric component of CP stimulus dependence can be caused by a generic mechanism such as trial-to-trial gain fluctuations, and is informative about the relative strength of the asymmetric CP stimulus dependencies for the cluster in which the asymmetric dependence is predominant (lines 443-460), although overestimating its magnitude. We expect that the analysis of within-cell CP(pCR) patterns will in future work be a useful tool to identify a finer structure of CP stimulus dependencies across cells, distinctive of specific mechanisms of the decision-making process, such as the structure of stimulus-dependent decision-related feedback.

3. Following on from this, there is also the question how could the stronger CP with increasing pCR be implemented in different ways in actual neural terms. Would for example the pool size of neurons contributing to the decision change? With regards to the model, how does the effect of pCR x CP intersect with the experimentally much stronger interaction effect of neurometric sensitivity and size of the CP?

The h(pCR) modulation that we derive does not require any change in “implementation”, pool size or membership of the decision pool. This modulation is already embedded (undescribed) in the Shadlen et al. 1996 model and in the Haefner et al. 2013 model when considering a pCR different from 0.5. Our results extend the results in Haefner et al. to pCR ≠ 0.5 and thereby open the door to a principled study of the CP(pCR) relationship. In fact, without our results – both on h(pCR) and the impact of the known gain variability – a measurement of a U-shaped CP(pCR) or monotonic CP(pCR) relationship might lead to suggestions that pool size and/or pool membership must be changing with pCR/stimulus (which appears rather unlikely given that this would imply a change of the decoder driven precisely by the property to be decoded). These revisions are reported in lines 210-231 and 1405-1415.

The authors analyse this from the classical Britten et al. [1] data set, producing a framework for analysis in Figure 2 and the outcome of analysis in Figures 3 and 4 of the paper. This type of analysis is not new. The specific form of the plots in Figures 2, 3, 4 appears in Dodd et al. [2] in Figure 6, while Figure 3 in Britten et al. delivers almost the same information.

We respectfully disagree with this summary. All panels of Figure 2 represent novel mathematical results and insights that have not previously appeared anywhere. While Britten et al. and Dodd et al. present plots similar to those in Figure 3, our method introduces a key new refinement separating within-cell CP patterns of stimulus dependence from the across cell heterogeneity of CP magnitudes. Please find above in reply to point (1a) of the Reviewers a more detailed discussion of how we now improved the comparison with these previous studies. A comparison with these Figures from Dodd et al. and Britten et al. is now discussed in the Discussion, lines 612-638.

In regard to the clustering analysis, it seems that the big driver for the formation of clusters is the division between CP >0.5 and CP<0.5. For values of CP<0.5, there is not really a functional account of these, as they do not relate to the tuning of the cells for motion. The lack of functional meaning is highlighted by the fact that cluster 1 in Figure 4a (blue==CP <0.5) is statistically non-significant. Not unrelated to this lack of significance is the fact that Figure 13 of Britten et al. and Figure 6 of Parker et al. [3] show that CPs are stronger for neurons that are more sensitive to the visual task. The usual interpretations of this are either the intuitive claim that more sensitive neurons are more tightly involved in the task and therefore have higher CPs or, more subtly, that neurons with weaker sensitivity have lower degrees of interneuronal correlation.

Please see our previous answers regarding CP<0.5 in reply to point (1b) of the Reviewers. The reviewer’s points about the relationship across cells between the magnitude of the CP and the neurometric threshold (d’) are orthogonal to the analysis of the within-cell dependence of CP as a function of pCR that we focus on here. Put differently, a single neuron’s CP magnitude may be a function of this single neuron’s d’ and additionally is modulated by the pCR, which is shared by all neurons by virtue of being a behavioral quantity. Furthermore, as we now explain in our improved description of the gain model, the neurometric sensitivity also affects the strength of the CP stimulus dependence induced by the gain.

The rest of the analysis in this paper advances the idea that fluctuations in the stimulus-related gain of neuronal firing are responsible for the emergence of stronger CPs at higher performance levels. The authors write on ll243-4 that "Briefly, the contribution of gain modulations to the covariance of the responses increases with neuronal firing rates, which in turn are stimulus-modulated as determined by tuning functions.". However, the lead author has already published a nice theoretical summary [4] showing that CP is related to the level of interneuronal correlations in the pool. Indeed, the analysis showed that under some conditions (large pool, correlated noise and at least one or two members of the pool contributing significantly to perceptual read-out) one might take CP as substitute indicator for the interneuronal correlation of the decision pool. In the light of the earlier analysis, the present paper does not address the very relevant question of changes in the membership of the neuronal pool with stimulus strength. In effect, if read-out weights change with stimulus strength, then CP will be expected to change. Equally, if pool membership changes then interneuronal correlation may be expected to change. We did not see anything in this analysis here that definitively ties down the change in CP to stimulus-related gain changes.

l 242-243: " We derived the specific form of CC𝑖(𝑝CR) predicted from gain modulations in the threshold decision model to explain additional CP stimulus dependencies beyond the symmetric modulation by h(𝑝CR ). Briefly, the contribution of gain modulations to the covariance of the responses increases with neuronal firing rates, which in turn are stimulus-modulated as determined by tuning functions. This leads to an asymmetric component of CP(𝑝CR), with higher CPs for 𝑝CR values associated with stimuli preferred by the cell. Furthermore, while stronger gain fluctuations increase this asymmetric stimulus dependence, they also decrease the magnitude of the cell-specific CP because they add variability to the responses unrelated to the choice "

The gain model examines the effect on CP(pCR) of a particular structure of stimulus-dependent cross-neuronal correlations, namely the one produced by shared gain fluctuations across neurons. We clarified the advances with respect to Haefner et al. 2013. We assume that only the cross-neuronal correlations are stimulus-dependent, while the read-out weights are independent of the stimulus. We believe this assumption is reasonable since the cross-neuronal correlations depend on the dynamics of the network, while the weights are expected to be hardwired in the network structure and the decoder is not expected to depend on the very sensory stimulus it has to decode. We did not claim (nor could we given the available data, as explained in the ‘In Brief’ summary in reply to point 2 above) that gain variability is the only explanation for a monotonic increase of CP with stimulus strength. What we show is that the gain model explains main features of the patterns of CP stimulus dependence observed with the cluster analysis. We have reinforced these conclusions with additional analysis of the predictive power of the gain model (lines 443-460). The paper has been revised to make the gain model clearer (lines 205-256, 882-957).

As neuronal sensitivity has normally been measured, the change in response due to a change in the stimulus is assessed relative to the variability of neuronal firing. The description above implies that stimulus modulations in the Haefner models translate into response changes on top of which random gain modulations are applied. At first sight, there does not seem to be any room in the model for low firing rate, low variability neurons to contribute to CP, even though such neurons may have high neurometric sensitivity. On the other hand, it may well be that the new Haefner model all shakes down to give the established experimental result that CP is linked to neuronal sensitivity. If that's correct, then the paper is currently rather obscure on this point and it will be useful for the paper to lay this out clearly.

As mentioned above, our model does not amend the findings in Haefner et al. 2013 that support with their analytical CP model the experimental finding that CP is proportional to the neurometric sensitivity. Haefner et al. derived that relation for pCR = 0.5 and it equally holds in our extended model, hence supporting that neurons with low firing rate and low variability contribute to the internal decoder. Using this suggestion and other suggestions (see reply to point 2) we have improved the description of the gain model to better explain the role of the neurometric sensitivity both determining the CP magnitude at pCR = 0.5 and the strength of the modulation induced by the gain. The connection with the neurometric sensitivity can be found in lines 220-227 and 1405-1415.

We now provide an expanded discussion of all the points raised by the Reviewers in point 3.

As detailed in reply to point (2), we have extended the analysis of the CP stimulus dependency patterns extracted from the cluster analysis jointly with the predictions of the gain model (lines 443-460) and substantially rewritten the presentation of the CP gain induced model in the main text (lines 205-256), Methods (lines 882-957), and Suppl. Material (section S4). We now better explain the role of single cell properties, namely the neurometric sensitivity of the cells and the relative contribution of the gain to their variability, as well as population properties, such as the behavioral sensitivity of the decoder to changes in the stimulus, and the relative contribution of the gain to the variability of the decision variable.

As indicated in lines 59-61 and 125-128, our general model is agnostic with respect to the feedforward or feedback origins of activity-choice covariations. The only assumption is that "the link between sensory responses and choices is mediated by a continuous decision variable and a thresholding mechanism" (lines 529-530). This general model (Eq 7) identifies two sources of CP stimulus dependencies (lines 160-164): the stereotypical modulation h(pCR), which is threshold-induced and common to all cells, and, potentially, stimulus dependencies inherited from the choice correlation CC(pCR). To study the form of CC(pCR) associated with a specific source of cross-neuronal correlations we focused on the model of gain-induced correlations of Goris et al. 2014 (Eq 10). When studying these gain-induced CP stimulus dependencies we follow Haefner et al. 2013 and adopt a traditional purely feedforward encoding/decoding model. Eq 9 is equal to the expression derived by Haefner et al. 2013, except for the factor h(pCR) and for allowing the cross-neuronal correlation structure to be stimulus dependent.

We now highlight in lines 230-231 that we model the effect of these gain-induced stimulus dependent cross-neuronal correlations while assuming that the read-out weights -and hence the population size- are stimulus independent. We believe it is reasonable to assume that the read-out weights are independent of the stimulus value, since otherwise the form of the decoder would depend precisely on what the decoder estimates. On the other hand, we admit that the traditional linear decoder (Eq 8) used in the analytical model -following Shadlen et al. 1996 and Haefner et al. 2013, among others- is an approximation of the internal neural decoder, and that stimulus-dependent weights may reflect dynamic aspects of the decision-making process neglected when using the linear decoder. Either way, the assumption of stimulus independent weights can also be taken as a modeling choice to determine which CP stimulus dependencies can be derived purely from the stimulus dependent cross-neuronal correlations induced by gain fluctuations.

In reply to point (2) of the Reviewers above we already described our improvements connecting the extended model to the previous analysis of Haefner et al. 2013, which examined the interpretability of CPs in relation to properties such as the neurometric sensitivity and cross-neuronal correlations for the case of uninformative stimuli. We here address the more specific points of the Reviewers to complement that description. As we now describe in lines 210-229, the CP model including gain fluctuations is a generalization to all stimulus levels of the model of Haefner et al. 2013, and hence their characterization of the CP in relation to the neurometric sensitivity and cross-neuronal correlations for uninformative stimuli still holds and is consistent with our analysis. Indeed, all the analyses of Haefner et al. 2013 focus on the properties of the term CC(pCR = 0.5) in Eq 11, and not on the rest of this expression. The key difference and novelty with respect to this previous work is that we do not focus on characterizing which neural properties determine the magnitude of CPs across neurons, but on within-cell changes of the CP across stimulus levels.

In lines 218-227 we refer to the previous results from Haefner et al. 2013, indicating that the relation between the CP and the neurometric sensitivity for pCR = 0.5 also holds in our model. This is described in more detail in lines 1016-1018 and Equation 21 in Methods. This means that, as in the previous model of Haefner et al. 2013, neurons with low firing rate and low variability can have a high CP at pCR = 0.5 if they have high neurometric sensitivity. The strength of the asymmetric CP(pCR) pattern is also related to the neurometric sensitivity, since in Equation 11 the slope depends on the factor [1-CCi(pCR=0.5)] and CCi(pCR=0.5) is proportional to the neurometric sensitivity. Apart from the case of an optimal decoder studied in the main text, we now derive in the Suppl. Material S4 a more general gain model valid for any unbiased decoder, which explains more generally how the strength of the gain-induced asymmetric stimulus dependence depends on the neurometric sensitivity of each cell (denoted as \eta_i in Eqs S12 and S13).

The previous conclusions of the model of Haefner et al. 2013 regarding the connection between CPs and the cross-neuronal correlation structure also hold in our extended model, since they describe properties of the CP for pCR = 0.5, not properties of the dependence CP(pCR) across pCR values (or equivalently across stimulus levels). We believe this is clearer in this revised version since we now describe the gain model in a separate section (lines 205-256) and we explicitly show the extended feedforward model of Eq 9, which shows the dependence of the CP on the cross-neuronal correlation structure. In lines 218-234 we explain how this model extends the one of Haefner et al. 2013. To further facilitate the connection of our model with previous work studying the relation between CPs and cross-neuronal correlations, in the Suppl. Material S4 we now additionally derive the form of CP(pCR) dependencies for the paradigmatic model of a decoder formed by two pools of neurons/anti-neurons (lines 1427-1452). For this model, Haefner et al. 2013 found that, as indicated by the Reviewers, the CP for the uninformative stimulus is determined by the cross-neuronal correlations, in particular by the difference between within-pool and between-pool correlations (see now lines 1434-1437). We show that for this particular two-pool decoder, too, gain fluctuations are expected to produce an asymmetric CP stimulus dependence, as modeled in Equation S14.

Regarding the novelty of our methods of analysis with respect to Figure 3 of Britten et al. 1996 and Figure 6 of Dodd et al. 2001, please see our detailed reply above to point (1) raised by the Reviewers. Very briefly, the key difference is that our method isolates within-cell CP(pCR) profiles from the heterogeneity of the CP magnitude across cells. This is now repeatedly emphasized in the revised paper, and especially described in lines 297-307 and 612-638.

We agree with the description of the Reviewers of the reason why no significant modulation of the CP with pCR is found for the cells with CP<0.5. In lines 330-339 we provide two explanations for this lack of significance. Apart from a smaller power due to the smaller size of the group of cells with CP<0.5, we also indicated that these cells have a smaller CP magnitude. Because the factor h(pCR) multiplies CP-0.5, a smaller CP-0.5 results in a weaker CP modulation by pCR. We have now elaborated on this second point following the argument of the Reviewers that CPs <0.5 are significant in only a few cases (lines 339-343). We indicate that, therefore, the fact that we do not observe the inverted U-shape dependence predicted by our model for the cells with CP<0.5 is not strong evidence against the existence of a threshold-induced CP stimulus dependence.
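The multiplicative argument can be written out explicitly. The following is only a sketch under two simplifying assumptions that are not stated in the response: the choice correlation of cell i is taken as constant across stimulus levels, and h is normalized so that h(0.5) = 1.

```latex
\[
\mathrm{CP}_i(p_{CR}) - \tfrac{1}{2}
  \;\approx\; \bigl(\mathrm{CP}_i(0.5) - \tfrac{1}{2}\bigr)\, h(p_{CR})
\quad\Longrightarrow\quad
\mathrm{CP}_i(p_{CR}) - \mathrm{CP}_i(0.5)
  \;\approx\; \bigl(\mathrm{CP}_i(0.5) - \tfrac{1}{2}\bigr)\,\bigl(h(p_{CR}) - 1\bigr).
\]
```

Under these assumptions, a cell with CP(0.5) = 0.57 changes by 0.07 [h(pCR) - 1], whereas a hypothetical cell with CP(0.5) = 0.48 changes by -0.02 [h(pCR) - 1]: the pattern is inverted and has less than a third of the amplitude, which makes it correspondingly harder to detect.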

We agree with the Reviewers that the separation in two clusters reflects the division between cells with CP>0.5 and CP<0.5. However, beyond this separation, the cluster analysis naturally subdivides the group of cells with CP>0.5 into cells with a predominantly symmetric or asymmetric CP(pCR) pattern, and the identification of these patterns is not driven by the sign of the CP>0.5 or <0.5. We have now highlighted that the separation of clusters 2 and 3 in Figure 4b is a subdivision of the cells with CP>0.5 (lines 386-387), although in fact these results are robust and analogous results are found when identifying three clusters without a priori excluding the cells with CP<0.5 (lines 387-389).

4. Editing for accessibility and readability

Unfortunately, our genuine enthusiasm for the manuscript is somewhat dampened by its length, by its opacity in places, and by the high degree of topic familiarity that it presupposes. For example, the discussion of grand CP on page 7 and in section S2 of the supplementary material, is difficult to follow even for someone in the field. Accordingly, if readability could be improved, the usefulness would be even greater.

We appreciate the theoretical advance of the paper. It is very useful to have equations that clarify the relationship between previously used measures (CP, CTA, CC) and the effect of informative stimuli. Overall, by the very nature of the topic, the paper is rather technical. Our impression was that the paper is not easy to read, in particular when we think about a broader readership that is not familiar with details of the theory of CP and the typical interpretation of CP measurements. We understand that this is not an easy job because the interpretation of CP is complicated (bottom-up vs. top-down contributions, relationship to spike count correlations, etc) but it would help to revise the Results section and add some more details and (where possible) intuitive interpretation of the theoretical and experimental results to guide the reader. As an example, it turns out that gain fluctuations are an important factor to explain CP vs pCR but this is only treated very briefly in the results (gain fluctuation model is not explained, no figure is shown about how well the model explains the variability observed in the data, etc).

We did our very best to improve readability while at the same time addressing all concerns and suggestions of the Reviewers. We sincerely appreciate the effort of the Reviewers in helping us to improve every aspect of the paper.

We thank the Reviewers for indicating the necessity to make the paper more accessible. Throughout the paper we have worked on improving its readability. We simplified the description of the grand CP in the main text and moved it to the Discussion (old lines 245-266, now lines 593-603) to make clear the effect that CP stimulus dependencies may have on the interpretation of grand CPs and the additional information that CP(pCR) profiles can provide. We simplified section S2, focusing on the connection between the corrected z-score of Kang and Maunsell and a weighted CP average, which is the result relevant for our reasoning (lines 584-588). The additional description of how, for the standard z-score, the bias in the estimated grand CP (first pointed out by Kang and Maunsell) can be understood as a consequence of an unnormalized weighted average has been removed, since it is not necessary for our reasoning. See the reply to point (11) below for more details on our improvement of section S2.

We appreciate that in particular our explanation of the gain model in the main text was insufficient and it was unclear how to situate our contributions in relation to the previous work of Haefner et al. 2013. We now better explain the model in the Results in a separate subsection (lines 205-256) and we have also rewritten the description in Methods (lines 882-957) and Suppl. Material S4. As described in detail in reply to previous points of the Reviewers, we now much better describe the neuronal properties, such as the neurometric sensitivity, that determine the strength of an asymmetric CP(pCR) pattern induced by gain fluctuations.

Furthermore, we have expanded the analysis of the symmetric and asymmetric clusters extracted from the cluster analysis using the gain model (lines 443-453). We now also indicate that in cluster 3, when the asymmetric pattern is predominant, the coefficients estimated with the gain model are significantly correlated with the experimental estimates, although overestimating their magnitude. We also better explain the theoretical and experimental limitations in the analysis of the gain model (lines 454-460) despite the merits of this simple model explaining main features of the additional asymmetric CP pattern that could not be explained by the threshold-induced stimulus dependency.

Finally, we acknowledge that we have struggled with the trade-off between length and understandability. Invariably, any point that we cut from the text will have to be filled in by an interested reader. And we worry that the effort to do so for an interested reader, would exceed the annoyance from having to skim over seemingly irrelevant text for a reader interested in other aspects of our work. However, we welcome of course any specific suggestions of where to expand and what to cut beyond the changes in the current revision.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Essential Revision:

1. Overall, the empirical part looks a little bit like an attempt to drag meaning out of a weak relationship, at least as measured experimentally in the single, but classical data set of Britten et al. The more detailed explanation of the clustering analysis reveals that the main effect of interest is driven by one critical data point from the original Britten et al. data. This is the value of CP for pCR = 0.85, in the data set for which CP > 0.5. Much of the paper depends on just how confident we are about a difference in CP that rises from 0.57 for pCR=0.5 to 0.64 for pCR=0.85. The SEMs calculated for individual neurons in Figure 3d (lower right panel) are not encouraging in this regard.

We think that further empirical tests are needed to obtain clarity about significance about key elements of the proposed model but this is beyond the scope of the current manuscript. It should be flagged to the reader, though.

In order to clarify to the reader that the empirical part provides only an initial analysis – as was also explicitly mentioned by the authors in their response to the previous reviews – we ask the authors to insert something close to the following statement in the abstract and in the first paragraph of the results:

This paper provides preliminary empirical evidence for the promise of studying stimulus dependencies of choice-related signals, which requires further exploration in a wider data set.

We added the following sentence in the Abstract:

“Our analysis provides preliminary empirical evidence for the promise of studying stimulus dependencies of choice-related signals, encouraging further assessment in wider data sets.” (lines 20-22)

At the beginning of the Results section we also added these sentences (lines 85-86):

“This analysis provides preliminary empirical evidence in support of using these new methods for studying stimulus dependencies of activity-choice covariations.”

2. The revisions are acceptable in that the presentation is indeed streamlined in many respects (thank you!). However, one large reservation remains: the main findings are NOT explained in an intuitive manner.

Basically, when we distinguish neural responses associated with choice D=1 from those associated with choice D=0, we obtain two more or less distinct distributions, P(R|D=1) and P(R|D=0). When these distributions are well separated, then the choice probability CP (average probability that an R|D=1 is larger than an R|D=0) is somewhat larger than when these distributions are more overlapping.

However, both qualitative and quantitative aspects of the relation between choice probability CP and choice rate PCR are highly dependent on assumptions. For example, the symmetric U-shape depends on Gaussian variability of responses. When responses are assumed to be Poisson-variable, for example, the symmetric U-shape is replaced by a monotonic decline (Matlab code available upon request).

We really appreciate the Reviewer taking time to probe the robustness of our analytical results. Below, we include Matlab code for a simple simulation based on the neuron-antineuron model (Britten et al. 1992) showing that our results also hold for Poisson neurons.

As explained in lines 759-771, the shape of the factor h(pCR) in the choice-triggered average CTAi (Equation 17) is determined based only on the assumption of the distribution p(d) being Gaussian. This approximation is likely excellent due to the Central Limit Theorem since d is the combination of many sensory neurons. We now further justify this in lines 153-155. On the other hand, the relationship between the CTAi and CPi depends also on the Gaussian assumption for the neural response distribution. While this assumption is less accurate (e.g. when Poisson), the CTA-CP relationship derived by Haefner et al. 2013 is very robust to deviations, down to only a few spikes per trial (see Figure S2 in Haefner et al. 2013). As part of an earlier preliminary version of our study (Chicharro et al. bioRxiv cited below), we actually already performed numerical simulations with Poisson responses to verify the robustness of our analytical results and confirmed the above insights. In our simulations, we found divergencies that were minor and with a different shape than the additional asymmetries that the gain model could explain. See Figure 3C-D in:

Decision-related signals in the presence of nonzero signal stimuli, internal bias, and feedback Daniel Chicharro, Stefano Panzeri, Ralf M. Haefner bioRxiv 118398; doi: https://doi.org/10.1101/118398

Based on the insights in the manuscript and numerical checks in the above two references, we can only speculate that in the Reviewer’s simulation the choice correlation might not have been constant as a function of the stimulus, in which case the shape of the threshold-induced modulation h(pCR) cannot be isolated from CC(pCR). The threshold-induced U-shape, albeit slightly distorted, persists down to populations of only 2 Poisson neurons, firing only 5 spikes per trial on average (the more of either, the better our approximations). We also include in Author response image 1 a figure showing multiple simulations with only 30 neurons. Equivalent results are obtained by increasing both neuron number and within-pool noise correlations.

Author response image 1. 5 runs simulating 30 Poisson neurons, spiking an average of 5 spikes/trial.


For larger populations, and for higher spike counts, our approximations will be even better.

stim = (-0.6:0.2:0.6); % change in the firing rate with informative stimuli
for i = 1:length(stim)
    r1 = poissrnd(5+stim(i), 15, 1e4); % 15 neurons & 1e4 trials
    r2 = poissrnd(5-stim(i), 15, 1e4); % 15 anti-neurons & 1e4 trials
    choice = sign(mean(r1) - mean(r2)); % difference of average population responses
    choice(choice==0) = 2*binornd(1, 0.5, 1, sum(choice==0)) - 1; % break ties at random
    pcr(i) = sum(choice==1)/length(choice); % choice ratio
    for j = 1:size(r1,1)
        cp2(j,i) = ChoiceProbability(r1(j,choice==-1), r1(j,choice==1)); % CP of neuron j
    end
end
plot(pcr, mean(cp2), 'b.-'); % population-average CP as a function of pCR

(ChoiceProbability(x,y) is a function returning the choice probability value for the two sets of responses x, y)
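The ChoiceProbability helper itself is not listed in the response. As a rough illustration of what such a function computes, the following Python sketch returns the fraction of (x, y) pairs in which the y response exceeds the x response, counting ties as one half, which is the area under the ROC curve. The name choice_probability, the argument order (inferred from the MATLAB call), and the tie handling are assumptions, and the authors' implementation may differ in detail.

```python
from bisect import bisect_left, bisect_right


def choice_probability(x, y):
    """Probability that a random response from y exceeds a random response from x.

    Ties contribute one half, so identical response sets give 0.5.
    Hypothetical stand-in for the ChoiceProbability(x, y) helper.
    """
    if len(x) == 0 or len(y) == 0:
        raise ValueError("both response sets must be non-empty")
    xs = sorted(x)
    total = 0.0
    for v in y:
        below = bisect_left(xs, v)          # number of x values strictly below v
        ties = bisect_right(xs, v) - below  # number of x values equal to v
        total += below + 0.5 * ties
    return total / (len(xs) * len(y))
```

With the argument order used in the simulation, x would hold a neuron's spike counts on choice = -1 trials and y its counts on choice = +1 trials, so values above 0.5 indicate higher firing preceding choice +1.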

Gain modulation changes the picture in exactly this way. After gain modulation, response variability is no longer Gaussian and, additionally, the set of responses with the larger average is more affected than the set with the smaller average. In other words, gain modulation alters the shape of response distributions and in consequence also the shape of CP = f(PCR).

Equation 7 shows that the dependence of CP on pCR can be decomposed into two parts. The first part is the modulation h(pCR), shared by all neurons. As discussed above, this modulation appears in the CTAi under the assumption of p(d) being Gaussian, it is inherited by the CPi (Equation 7) and is robust for distributions p(r) that depart from Gaussianity. The second part is a modulation specific to each neuron that is due to a stimulus-dependence of the choice correlation CCi. To be precise: it is not the non-Gaussianity of p(r) that determines this dependence of CP on pCR, it is the stimulus-dependence of CCi (Equation 7).

These insights were the motivation for our gain model: the idea that the stimulus-dependent covariance of Equation 10 would likely result in a stimulus-dependent CC(s) that is then reflected in CP(pCR). It is through this additional contribution CC(pCR), and not by modifying the threshold-induced U-shape factor h(pCR), that the gain fluctuations contribute to the CP stimulus dependencies.
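The threshold-induced factor can be made concrete with a short numerical sketch. The response does not restate the expression for h(pCR); the form below is what the truncated-Gaussian argument gives when the decision variable d is Gaussian and the choice is set by whether d exceeds a threshold. In that case the difference between the mean of d on the two choices is proportional to φ(z) / [pCR (1 - pCR)], with z = Φ⁻¹(pCR). Normalizing so that h(0.5) = 1 is an assumption made here, and the paper's Equation 7 may use a different convention. The function name threshold_modulation is hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # zero mean, unit variance


def threshold_modulation(p_cr):
    """Threshold-induced modulation h(pCR) for a Gaussian decision variable.

    Sketch only: proportional to the difference between the conditional means
    of d above and below the threshold, normalized so that h(0.5) == 1.
    """
    if not 0.0 < p_cr < 1.0:
        raise ValueError("p_cr must lie strictly between 0 and 1")
    z = _STD_NORMAL.inv_cdf(p_cr)  # threshold position in units of the s.d. of d
    return _STD_NORMAL.pdf(z) / (4.0 * _STD_NORMAL.pdf(0.0) * p_cr * (1.0 - p_cr))


if __name__ == "__main__":
    for p in (0.15, 0.3, 0.5, 0.7, 0.85):
        print(f"pCR = {p:.2f}  h = {threshold_modulation(p):.4f}")
```

Under these assumptions the factor is symmetric around pCR = 0.5 and grows toward the extremes, which is the U-shape referred to above. Any asymmetry in CP(pCR) would then have to come from CC(pCR), consistent with the argument in the response.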

In spite of this reservation, I have to confess that this paper has been quite useful to me, because it forced me to think through these issues.

Thank you very much. Making these issues explicit, and shedding light on the deeper relationships underlying CP, has been a major motivation for our paper.

3. With regards to the dependence of the results on one critical data point in the Britten et al. data set (CP for pCR = 0.85) and the question of significance to other data sets, various suggestions for further data worth testing were made.

We wonder, for instance, whether Geoff Ghose's lab (https://pubmed.ncbi.nlm.nih.gov/30067123/ ; https://pubmed.ncbi.nlm.nih.gov/19109454/ ) is a better source of data to support this exercise. In particular, his recordings could be searched for signs of noisy gain modulations, which is the mechanism that lies at the core of this analysis. Other suggestions were, as mentioned before, the Dodd et al. 2001/Wasmuht et al. 2019 data.

But we agree this is beyond the current scope of the manuscript.

Thank you very much for these suggestions. We agree with the Reviewers that further data sets will need to be examined in the future to confirm the existence of CP stimulus dependencies. In fact, in particular the shape of any dependence associated with the Choice Correlation CC(pCR), can be expected to depend on the particular role of the cells and on the task, so it would be very interesting to examine different data sets and the CP stimulus dependencies therein and to understand how any difference of their shape across data sets can be explained in terms of the particularities of the tasks or the properties of the cells. We also have high hopes for new datasets that come with cell type and layer information, as well as from chronic recordings where the same neurons are held across multiple days allowing for more accurate CP estimates.

4. Line 10: "activity-choice covariations are traditionally quantified with a single measure of choice probability (CP), without characterizing their changes across stimulus levels" I appreciate the need to demonstrate novelty in the paper, but this statement does a disservice to earlier researchers who recognized the possibility of a stimulus-choice interaction but did not find any evidence of such an interaction. The earlier papers are clear on this point. So the novelty here is not the failure of earlier researchers to think through their results carefully; the novelty here is an apparent improvement in sensitivity of the analysis methods. This may sound less exciting but is still important, if correct.

We agree with the Reviewers. Indeed, in the original submission we were giving historical context to the focus on a single CP based on early works such as the one of Britten et al. 1996, which examined the possibility of CP stimulus dependencies but did not find a significant dependence. This was lost in the simplifications of the Introduction. We now have recovered this important explanation in lines 36-38. We have also substituted ‘traditionally’ by ‘commonly’ in the piece of text cited by the Reviewers.

5. Line 65 "if the decision process uses a decision threshold"

Please see our explanation under (7) below. We expanded this sentence to include

‘if the decision-making process relies on a threshold mechanism (or threshold criterion) to convert a continuous decision variable into a binary choice’ (lines 66-67)

6. Line 96 "choice probability, CP, defined as the probability that a random sample from all trials"; the reader only learns on line 99 what this is a random sample of. It would better read as "a random sample of neural activity r " and then explain in a separate sentence what r might be in any particular experimental situation.

Thank you, we now indicate that it is a sample of neural activity earlier in the explanation (line 99).

7. Line 147: "threshold value 𝜃 " the authors insist on referring to this parameter 𝜃 as a threshold value. This usage will inflame debate as to whether there is a "high threshold assumption" baked into this model, where "high threshold" here means a classical high threshold in visual detection models, as opposed to a signal detection model. As classical high threshold theory is now rejected for detection models, I think it would be better here and throughout to refer to 𝜃 as a criterion value, which is what it is and better aligns with the language of signal detection theory.

Thanks for pointing out this potential misunderstanding. We use ‘threshold’ as it is used for example in the literature of decision-making models that consider the accumulation of evidence until a threshold is reached, triggering a decision, and also often in the literature examining decisions in LIP. For example, the idea of a decision threshold is used in this way in the Review paper by Gold and Shadlen 2007. The model used in Shadlen et al. 1996, and in Haefner et al. 2013 can be viewed as a simplified analytically tractable version of this type of models.

To minimize misunderstandings, we added further explanation in the Introduction (see our answer to point (5)) that allows understanding the meaning of this decision threshold. We believe that this initial clarification, together with the detailed explanation of the meaning of the decision threshold early in the Results section and in Figure 1 should avoid confusion. The revised text reads as follows (lines 64-67):

‘We show that they can also appear for all neurons because of the transformation of the neural representation of the stimulus into a binary choice, if the decision-making process relies on a threshold mechanism (or threshold criterion) to convert a continuous decision variable into a binary choice.’

8. Lines 129-169: this development is still very difficult to follow and labors over what are some fairly basic points. It would better be rewritten with better structure at about half the length. For example, compare lines 111-112 and 134-135, which could be combined into a single point that is made once at the right stage in the argument.

We simplified this explanation removing the more technical details of the threshold model not yet required at this point (lines 122-123). However, we believe these are important conceptual points: carefully relating the neuron-specific quantities to the choice-specific one, and we worry that shortening would increase the risk of a first-time reader missing important information. Also, some level of redundancy between lines 111-112 (now 116-117) and 134-135 (now 136-137) is justified by the discursive structure of the text: lines 111-112 serve to contrast with the subsequent explanation, that indicates that despite the lack of additional assumptions in the definition of the CP and CTA, their interpretation has been mostly led by a feedforward model. On the other hand, in lines 134-135 we are already describing the assumptions of a threshold mechanism, introducing the continuous decision variable d.

9. Lines 170-209: the writing continues in a stilted manner with multiple cross-references to other material.

To improve readability, we now connected the explanations of lines 174-179 with the previous text. Nonetheless, we believe all of these are important points, and that the cross-references are important, too. We don’t know how to simplify without omissions, or risking comprehension by a first-time reader. However, we would gladly implement any specific suggestion.

This section accomplishes 3 things:

- Explaining the possible sources of variation in pCR.

- Justifying the use of the linear relationship for all our explanations and pointing to the exact formulas.

- Laying out the empirical predictions of our theoretical results.

10. Lines 173-4 "We here will refer to CP stimulus dependencies and CP(pCR) patterns interchangeably". We rather fear that this is going to cause a lot of readers to trip over. We can see that these two are interchangeable from the theoretical perspective of these authors, but many will think that behavior pCR is dependent on a number of factors other than the stimulus. In the field of monkey neurophysiology, pCR will depend on reward, attention, arousal and so forth, in a way that the stimulus does not. What we think the authors actually mean is something like "Within the structure of our model, there is a fixed relationship between the dependence of CP on the stimulus strength and the dependence of CP on the choice rate pCR, for each threshold."

We have now clarified this point. We now write in lines 174-179: “Note that we do not distinguish between CC stimulus dependencies and a dependence of the CC on pCR. We do not make this distinction here because most generally a change in the stimulus level results in a change of pCR, and the two cannot be disentangled. However, the pCR more generally depends on other factors such as the reward value, attention level, or arousal state, and in Equation 7 the separate dependencies on the stimulus and pCR can be explicitly indicated as CCi(pCR, s) when the experimental paradigm allows to separate these two influences.”

So the model is flexible enough to accommodate different dependencies of CC on the stimulus s and on pCR, because the details of these dependencies are irrelevant for the model; that is, the threshold model does not characterize the form of CC(pCR) or CC(pCR, s). The reason why we do not distinguish between the dependence on pCR and on s is that in most experimental settings the two are intertwined. This is the case, for example, in the Britten et al. data we reanalyze, in which a relation pCR(s) is estimated experimentally by the psychometric function. We hope that explicitly mentioning that Equation 7 can be rewritten with CCi(pCR, s) when these two effects are separable will avoid the misunderstanding that our model is limited to assuming a one-to-one relation between pCR and stimulus levels.

11. Lines 263-265 "we show how to extend Generalized Linear Models (GLMs), a popular model to characterize the factors modulating neural activity, to include a stimulus-choice interaction terms" As a statistical procedure, this is fairly routine stuff and could be abbreviated considerably.

Thanks, we have shortened this description, now in lines 270 and 478.
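
For readers unfamiliar with the idea, a GLM with a stimulus-choice interaction can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the fitting procedure used in the manuscript: it assumes Poisson spike counts with a log link, a design matrix with columns (intercept, stimulus, choice, stimulus x choice), a logistic psychometric function to generate synthetic choices, and a plain Newton solver. The function name fit_poisson_glm, the coefficient values, and the synthetic data are all hypothetical.

```python
import numpy as np

def fit_poisson_glm(X, y, n_iter=50, tol=1e-10):
    """Fit a Poisson GLM with log link by Newton's method.

    Assumes column 0 of X is the intercept.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean() + 1e-12)  # start from the mean rate
    for _ in range(n_iter):
        mu = np.exp(X @ beta)                 # predicted mean counts
        grad = X.T @ (y - mu)                 # gradient of the log-likelihood
        hess = X.T @ (X * mu[:, None])        # negative Hessian (Fisher information)
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Synthetic data (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
n_trials = 20000
stim = rng.choice([-2.0, -1.0, 0.0, 1.0, 2.0], size=n_trials)
p_choice = 1.0 / (1.0 + np.exp(-stim))        # assumed psychometric function
choice = (rng.random(n_trials) < p_choice).astype(float)

# Columns: intercept, stimulus, choice, stimulus-choice interaction.
X = np.column_stack([np.ones(n_trials), stim, choice, stim * choice])
true_beta = np.array([2.0, 0.15, 0.10, 0.05])
counts = rng.poisson(np.exp(X @ true_beta))

beta_hat = fit_poisson_glm(X, counts)
print(dict(zip(["intercept", "stimulus", "choice", "stim_x_choice"], beta_hat.round(3))))
```

In this sketch a nonzero coefficient on the last column means that the choice-related modulation of the firing rate changes with the stimulus level, which is the kind of effect the interaction term is meant to capture.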

12. Lines 376-377 "Specifying the existence of two clusters, we naturally recovered the distinction between cells with CP higher or lower than 0.5" We struggled to find a clear and unambiguous summary of the statistics associated with this. We can see that a consistent pattern emerges when the cluster number is increased from 2 to 3, but looking at the distributions in Figure 4c and Figure 4d does not appear to reveal clusters. The significance values in Figures 4a and 4b relate to the significance of the modulation effect for each cluster, not the significance of the cluster separations. The methods section and supplementary analysis section cross-reference each other but neither seems to answer the simple question of whether 1 versus 2 clusters is statistically justified, let alone the step from 2 to 3.

As correctly indicated by the Reviewers, the significance values relate to the modulation effects, and not to the cluster separations. The analysis in Supplementary figure S2c indicates that the distinction between the symmetric and asymmetric clusters is robust, in the sense that they still contain a substantial number of cells even when the number of clusters is allowed to be 6. However, our objective here was not to conclude that there is a specific number of separable clusters. Indeed, we argue in the discussion that, from considerations about the structure of stimulus-dependent feedback signals, a richer structure of CP stimulus dependencies could be expected to exist (lines 683-694). This richer uncharacterized structure may explain the lack of separability of the clusters in the distributions of Figure 4c and 4d, although it may also be due to noise in the estimated CP values. We limited the analysis to three clusters because of the limited statistical power of the data set. We considered a third cluster because it was qualitatively instructive to detect a CP dependence that was not symmetric as predicted by the threshold effect h(pCR), since the asymmetric CP dependence implies (according to Equation 7) that it must originate from a modulation of the choice correlation CC(pCR).

We now clarify in lines 425-431 that we do not claim that these clusters are the only existing ones, or even that a finite number of CP(pCR) patterns exists, leading to a finite number of clusters. For example, if the CP(pCR) profiles were associated with the structure of stimulus-dependent feedback across cells with different tuning functions, a continuum of CP(pCR) dependencies would be expected, in agreement with the continuum of tuning functions.

13. Lines 612-638 This discussion suggests that there may be within-cell changes in CP as a function of pCR that may have been hidden by pooling across populations of cells. But in the end, this paper has problems in detecting real changes in this relationship at the level of recordings from single cells. The small SEMs that attach to the data in Figure 3a do indeed reflect pooling across a population sample; they do not relate to changes in individual neurons. The panel in Figure 3d shows the true picture for individual cells. So this discussion begs the question as to why the population analyses in Dodd et al. do not show the predicted relationships. The point made by the authors about within-cell changes does not appear to be material in this regard.

The difference may lie in the fact that, as indicated in lines 302-303, our average is an average across cells of within-cell CP(pCR) profiles. The importance of this is discussed in lines 304-309 and further motivated in lines 569-576. When we average across cells, we are averaging their individual within-cell CP(pCR) profiles. Therefore, for each particular pCR value, all cells contribute in the same way.

As explained in lines 629-634, from Figure 6 of Dodd et al. 2001 we cannot directly read the within-cell CP(pCR) profiles. Given that, in our understanding, the sets of stimulus levels presented to each cell were different, it is not possible to think of an average of all the dots in Figure 6a with CP>0.5 as analogous to an average of the within-cell CP(pCR) profiles of the cells with CP>0.5. We do not know, though, how strong the variability in the set of stimulus levels presented to each cell was, or whether a clear CP(pCR) dependence would be found using our method. We do not rule out that there is really no significant CP stimulus modulation in that data set, but we believe that our argument about how to isolate within-cell CP(pCR) profiles, even if averaged, from across-cell heterogeneities is still applicable in this case.
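
The general argument about separating within-cell profiles from across-cell heterogeneity can be illustrated with a toy example. This sketch is purely hypothetical and makes no claim about the actual sampling in Dodd et al.: it assumes cells whose CP is flat across pCR bins but differs from cell to cell, and an invented sampling scheme in which cells with higher CP happen to be recorded only at extreme pCR bins. All variable names and values are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
pcr_bins = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
n_cells = 400

# Each cell has its own baseline CP and NO within-cell dependence on pCR.
baseline = rng.normal(0.55, 0.05, size=n_cells)
cp = np.repeat(baseline[:, None], pcr_bins.size, axis=1)  # shape (cells, bins)

# Case 1: every cell contributes to every pCR bin.
complete_avg = cp.mean(axis=0)

# Case 2 (invented confound): high-CP cells are observed only at extreme
# bins, low-CP cells only at central bins.
high = baseline > np.median(baseline)
extreme = np.array([True, False, False, False, True])
observed = np.where(high[:, None], extreme[None, :], ~extreme[None, :])
pooled_avg = np.nanmean(np.where(observed, cp, np.nan), axis=0)

print("average of complete within-cell profiles:", complete_avg.round(3))
print("pooled average with unequal sampling:    ", pooled_avg.round(3))
```

In this toy setting the average of complete within-cell profiles is flat, as it should be, whereas the pooled average shows an apparent dependence on pCR that only reflects which cells contribute to which bin.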

We do appreciate that a higher number of trials per neuron would be required to definitively answer the question about the nature of the CP(pCR) relationship, and that this manuscript provides theory and insights that will only be fully exploitable in the future. With regard to Dodd et al., it is entirely possible that CPs in MT during the rotating cylinder task do not depend on pCR, while they do depend on pCR during a classic motion discrimination task. This is exactly what would be expected if the CPs during the rotating cylinder task were mostly due to feedback from a binary variable, as is entirely plausible, rather than linked to the decision through a continuous decision variable. Only in the latter case does our theory predict the symmetric h(pCR) relationship.
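
The symmetric threshold effect referred to above can be illustrated with a small simulation. This is a minimal sketch, not the analysis code of the manuscript: it assumes a Gaussian decision variable with unit variance, a fixed threshold at zero, a single neuron whose response has a constant correlation cc with the decision variable, and it estimates CP as the area under the ROC curve. The function names (roc_area, simulate_cp) and all parameter values are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def roc_area(a, b):
    """Probability that a random sample of a exceeds a random sample of b
    (Mann-Whitney U statistic; assumes continuous data without ties)."""
    combined = np.concatenate([a, b])
    ranks = np.empty(combined.size)
    ranks[np.argsort(combined)] = np.arange(1, combined.size + 1)
    u = ranks[:a.size].sum() - a.size * (a.size + 1) / 2
    return u / (a.size * b.size)

def simulate_cp(p_cr, cc, n_trials, rng):
    """CP of one neuron under a threshold model at a fixed stimulus level.

    The mean of the decision variable d is set so that P(d > 0) = p_cr.
    """
    mean_d = NormalDist().inv_cdf(p_cr)
    z = rng.standard_normal(n_trials)          # trial-to-trial fluctuation of d
    d = mean_d + z
    # Neural response with correlation cc to the decision variable.
    r = cc * z + np.sqrt(1 - cc**2) * rng.standard_normal(n_trials)
    choice = d > 0                              # threshold at zero
    return roc_area(r[choice], r[~choice])

rng = np.random.default_rng(0)
cp_by_pcr = {
    p: simulate_cp(p, cc=0.3, n_trials=200_000, rng=rng)
    for p in (0.1, 0.3, 0.5, 0.7, 0.9)
}
for p, cp in cp_by_pcr.items():
    print(f"pCR = {p:.1f}  CP = {cp:.3f}")
```

Under these assumptions the simulated CP is lowest near pCR = 0.5 and rises approximately symmetrically toward extreme choice rates, even though cc is held constant. A choice signal fed back from a binary variable is not modeled in this sketch.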

Associated Data


    Data Citations

    1. Britten KH, Newsome WT, Shadlen MN, Celebrini S, Movshon JA. 1996. A relationship between behavioral choice and the visual responses of neurons in macaque MT. The Neural Signal Archive. Macaque

    Supplementary Materials

    Transparent reporting form

    Data Availability Statement

    No data was collected as part of this study.

    The following previously published dataset was used:

    Britten KH, Newsome WT, Shadlen MN, Celebrini S, Movshon JA. 1996. A relationship between behavioral choice and the visual responses of neurons in macaque MT. The Neural Signal Archive. Macaque


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
