Skip to main content
eLife logoLink to eLife
. 2015 Dec 24;4:e10056. doi: 10.7554/eLife.10056

Dendritic nonlinearities are tuned for efficient spike-based computations in cortical circuits

Balázs B Ujfalussy 1,2,3,4,*, Judit K Makara 4,5, Tiago Branco 3,6, Máté Lengyel 1,7
Editor: Frances K Skinner8
PMCID: PMC4912838  PMID: 26705334

Abstract

Cortical neurons integrate thousands of synaptic inputs in their dendrites in highly nonlinear ways. It is unknown how these dendritic nonlinearities in individual cells contribute to computations at the level of neural circuits. Here, we show that dendritic nonlinearities are critical for the efficient integration of synaptic inputs in circuits performing analog computations with spiking neurons. We developed a theory that formalizes how a neuron's dendritic nonlinearity that is optimal for integrating synaptic inputs depends on the statistics of its presynaptic activity patterns. Based on their in vivo preynaptic population statistics (firing rates, membrane potential fluctuations, and correlations due to ensemble dynamics), our theory accurately predicted the responses of two different types of cortical pyramidal cells to patterned stimulation by two-photon glutamate uncaging. These results reveal a new computational principle underlying dendritic integration in cortical neurons by suggesting a functional link between cellular and systems--level properties of cortical circuits.

DOI: http://dx.doi.org/10.7554/eLife.10056.001

Research Organism: Human, Mouse, Rat

eLife digest

Imagine that you are in the habit of checking three different weather forecasts each day, and then one day in early September the first forecast suddenly predicts snow. If you live in an area where it doesn’t normally snow in September, your initial reaction is likely to be surprise. However, you will not be quite so surprised to see a prediction of snow in the second forecast, and by the third forecast you will hardly be surprised at all.

In these three cases, you have responded to the same piece of information in a different way. In mathematics, this type of response is referred to as “nonlinear” because the output (varying degrees of surprise) is not directly proportional to the input (identical predictions of snow). In the case of the weather forecasts, the source of the nonlinearity was the fact that the three predictions were not truly independent. Instead, they corresponded with one another, or “correlated”, because all three depended on the weather itself.

In the brain, a single neuron can receive thousands of inputs from other cells. These are received via junctions called synapses that form between the cells. In many cases, the synapses form on the receiving neuron’s dendrites – the short branches that protrude from its cell body. Each dendrite can receive signals from hundreds of other neurons, and must combine these inputs to produce a single neuronal response. How dendrites do this is not clear.

Ujfalussy et al. have now developed a computational model that predicts the optimal response of dendrites to complex and realistic inputs from other neurons. The model shows that when dendrites receive inputs from neurons that independently respond to different stimuli, the optimal response is for the dendrites to average the inputs. This is a form of linear processing. By contrast, when the inputs are correlated – for example, because they come from neurons responding to the same stimulus – the optimal response is nonlinear processing. In this and other cases, the optimal response predicted by the model is similar to the response observed in real dendrites.

The model also makes a number of testable predictions; for example, that neurons with correlated activities will tend to form clusters of synapses close together on the dendrites of a target neuron, whereas neurons with unrelated activities will tend to form synapses that are further apart. Somewhat unexpectedly, Ujfalussy et al. show that compensating for input correlations accounts for almost all the nonlinearities that can be found in real neurons' dendrites – at least in response to relatively simple input patterns. Thus, it remains to be shown whether nonlinear dendritic responses to more complex input patterns can also be explained by this single principle. Further studies are also required to understand how different plasticity mechanisms enable neurons to achieve this close match between input correlations and dendritic processing.

DOI: http://dx.doi.org/10.7554/eLife.10056.002

Introduction

The dendritic tree of a cortical neuron performs a highly nonlinear transformation on the thousands of inputs it receives from other neurons, sometimes resulting in a markedly sublinear (Longordo et al., 2013) and often in strongly superlinear integration of synaptic inputs (Losonczy and Magee, 2006; Nevian et al., 2007; Branco and Häusser, 2011; Makara and Magee, 2013). These nonlinearities have been traditionally studied from the perspective of single-neuron computations, using a few well-controlled synaptic stimuli, revealing a remarkable repertoire of arithmetic operations that the dendrites of cortical neurons carry out (Poirazi and Mel, 2001; London and Häusser, 2005; Branco et al., 2010) including additive, multiplicative and divisive ways of combining individual synaptic inputs in the cell’s response (Silver, 2010). More recently, the role of nonlinear dendritic integration in actively shaping responses of single neurons under in vivo conditions has been demonstrated in several cortical areas including the hippocampus (Grienberger et al., 2014), as well as visual (Smith et al., 2013) and somatosensory cortices (Xu et al., 2012; Lavzin et al., 2012Palmer et al., 2014).

However, while many of the basic biophysical mechanisms underlying these nonlinearities are well understood (Stuart et al., 2007), it has proven a daunting task to include all these mechanisms in larger scale network models to understand their interplay at the level of the circuit (Herz et al., 2006). Conversely, studies of cortical computation and dynamics have largely ignored the complex and highly nonlinear information processing capabilities of the dendritic tree and concentrated on circuit-level computations emerging from interactions between point-like neurons with single, somatic nonlinearities (Hopfield, 1984; Seung and Sompolinsky, 1993; Gerstner and Kistler, 2002; Vogels et al., 2011). Therefore, it is unknown how dendritic nonlinearities in individual cells contribute to computations at the level of a neural circuit.

A limitation of most theories of nonlinear dendritic integration is that they focus on highly simplified input regimes (Mel et al., 1998; Poirazi et al., 2003; Archie and Mel, 2000; Poirazi and Mel, 2001; Ujfalussy et al., 2009), essentially requiring both the inputs and the output of a cell to have stationary firing rates. This approach thus ignores the effects and consequences of temporal variations in neural activities at the time scale of inter-spike intervals characteristic of in vivo states in cortical populations (Crochet et al., 2011; Haider et al., 2013). In contrast, we propose an approach which is specifically centered on these naturally occurring statistical patterns – in analogy to the principle of ‘adaptation to natural input statistics’ which has been highly successful in accounting for the input-output relationships of cells in a number of sensory areas at the systems level (Simoncelli and Olshausen, 2001). We pursued this principle in understanding the integrative properties of individual cortical neurons, for which the relevant statistical input patterns are those characterising the spatio-temporal dynamics of their presynaptic spike trains. Thus, rather than modelling specific biophysical properties of a neuron directly, our goal was to predict the phenomenological input integration properties that result from those biophysical properties and are matched to the statistics of the presynaptic activities.

Our theory is based on the observation that cortical neurons mainly communicate by action potentials, which are temporally punctate all-or-none events. In contrast, the computations cortical circuits perform are commonly assumed to involve the transformations of analog activities varying continuously in time, such as firing rates or membrane potentials (Rumelhart et al., 1986; Hopfield, 1984; Dayan and Abbott, 2001; Archie and Mel, 2000; London et al., 2010). This implies a fundamental bottleneck in cortical computations: the discrete and stochastic firing of spikes by neurons conveys only a limited amount of information about their rapidly fluctuating activities (Pfister et al., 2010; Sengupta et al., 2014). Formalising the implications of this bottleneck mathematically reveals that the robust operation of a circuit requires its neurons to integrate their inputs in highly nonlinear ways that specifically depend on two complementary factors: the computation performed by the neuron and the long-term statistics of the inputs it receives from its presynaptic partners.

To critically evaluate our theory, we first illustrate qualitatively the nonlinearities that most efficiently overcome the spiking bottleneck for different classes of presynaptic correlation structures. Next, to provide biophysical insight, we demonstrate that the form of optimal input integration for these presynaptic correlations can be efficiently approximated by a canonical, biophysically-motivated model of dendritic integration. Finally, we test the prediction that cortical dendrites are optimally tuned to their input statistics in in vitro experiments. For this, we use available in vivo data to characterize the presynaptic population activity of two different types of cortical pyramidal cells. Based on these input statistics, our theory accurately predicts the integrative properties of the postsynaptic dendrites measured in two-photon glutamate uncaging experiments. We also show that NMDA receptor activation is necessary for dendritic integration to approximate the optimal response. These results suggest a novel functional role for dendritic nonlinearities in allowing postsynaptic neurons to integrate their richly structured synaptic inputs near-optimally, thus making a key contribution to dynamically unfolding cortical computations.

Results

Suppose that every day you check your three favorite websites for the weather forecast. On a September day, the first website forecasts snow which you find hard to believe as it is highly unusual in your area – so you dismiss it as the forecaster’s mistake. However, when you read a similar forecast on the second site, you become convinced that snow is coming, and by the time the third site brings you the same news you are hardly surprised at all. Thus, even though all three sources conveyed the same information (snow), they had different impact on you – in other words, their cumulative effect was nonlinear. This nonlinearity was due to the fact that the information you get from these sites tends to be correlated as they are all related to a common cause, the actual weather. Below we argue that the same fundamental statistical principle, that correlated information sources require nonlinear integration, accounts for the dendritic nonlinearities of cortical pyramidal neurons.

Overcoming the spiking bottleneck in circuit computations

To introduce our theory, we consider a postsynaptic neuron computing some function, f, of the activity of its presynaptic partners, u (Figure 1A, top):

v˙=f(u) (1)

Figure 1. Spike-based implementation of analogue computations in neural circuits.

(A) Computation (top) is formalized as a mapping, f, from presynaptic activities, u1,uN (left), to the postsynaptic activity, v (right). As neurons communicate with spikes, the implementation (bottom) of any computation must be based on the spikes the presynaptic neurons emit, s1,sN (middle). Optimal input integration in the postsynaptic cells requires that the output of g is close to that of f. (B) The logic and plan of the paper. Grey box in the center shows theoretical framework, blue boxes around it show steps necessary to apply the framework to neural data. To compute the transformation from stimulation patterns (bottom left) to the optimal response (bottom right) we assumed linear computation (top right) and specified the presynaptic statistics based on cortical population activity patterns observed in vivo (top left). To demonstrate the validity of the approach, we studied the fundamental qualitative properties of the optimal response (Figure 2), compared it to biophysical models (Figures 34) and tested it in in vitro experiments (Figure 5). (C) The optimal postsynaptic response (purple line, bottom) linearly integrates spikes from different presynaptic neurons (top: rasters in shades of green; middle: membrane potential of one presynaptic cell) if their activities are statistically independent. (D) Optimal input integration becomes nonlinear (purple line, bottom) if the activities of the presynaptic neurons are correlated (rasters in shades of green, top), even though the long-term statistics and spiking nonlinearity of individual neurons remains the same as in (C). In this case, the best linear response (black line, bottom) is unable to follow the fluctuations in the signal.

DOI: http://dx.doi.org/10.7554/eLife.10056.003

Figure 1.

Figure 1—figure supplement 1. An example of supralinear input integration with firing rate-based rather than membrane potential-based computations.

Figure 1—figure supplement 1.

(A) Optimal response (dark red to pink) and linear predictions (black) to an increasing number of input spikes (130) arriving with a short delay (1 ms). Throughout the paper, we assumed that the variables relevant for the postsynaptic neurons were the sub-threshold membrane potential values of their presynaptic partners, and for consistency, that the function the postsynaptic neuron computed was also represented by its sub-threshold membrane potential. We chose membrane potentials as the time-dependent representational substrate because they can be directly measured and are well defined at all times in individual trials in electrophysiological experiments (in contrast to firing rates which require averaging over trials or time). Here, we demonstrate that our basic results apply equally if computations are instead based on firing rates. We used the same model of presynaptic statistics as before but defined the intended computation as a linear mapping between pre- and postsynaptic firing rates (cf. Equation 5): τpostr˙post(t)=-rpost(t)+i=1Nwiri(t) where ri(t)=geβui(t) is the firing rate of presynaptic neuron i as defined previously. This means that the optimal spike-based implementation is (cf. Equation 6): τpostr~˙post(t)=-r~post(t)+P(u(t)|s0:t) i=1Nwi geβui(t) du(t). Therefore, based on this equation, the equations of assumed density filtering (Equations 20–22) remained unchanged but the optimal response was computed as τpostr~˙post(t)=-r~post(t)+iwizζz(t) γi(z)(t) where γi(z)(t)=geβμi(z)(t)+12β2Σii(z)(t) is the posterior mean firing rate of neuron i in population state z, as before. Here we plot r~post(t) as the optimal response. (B) The amplitude of the measured response as a function of the linear expectations. (C) Nonlinearity of the response amplitude as a function of the number of input spikes. The nonlinearity in this case is comparable to that obtained in the case of membrane potential-based computations (cf. Figures 2 and 5 of the main text). In fact, it is even stronger because the firing rate is a highly supra-linear (here: exponential) function of the membrane potential. Parameters were tuned to hippocampal sharp wave dynamics (Table 1, HP) with N=500, τpost=5ms, and wi=w=1/N.
Figure 1—figure supplement 2. The range of total dendritic inputs in vitro and in vivo.

Figure 1—figure supplement 2.

(A) Input ranges to a single dendritic branch in our in vitro experiments (green histogram) are similar to the input ranges expected during natural, in vivo stimulus conditions (black curve). Throughout the paper, we assumed that the dynamics of the postsynaptic cell is governed by the total input arriving to the cell, g(s) (cf. Equation 6): τpostv~˙(t)=-v~(t)+g(s) where g(s)=1NP (u(t)|s0:t) i=1N ui(t) du(t) is the average of the estimated presynaptic membrane potentials. To illustrate the input distribution in our in vitro cortical experiment (green histogram), we computed the values of the total input g(s) for s that were the stimuli given to the cell shown in Figure 5B–C and, just as in Figure 5C–E, we used neocortical input statistics (Table 1, NC) to determine P(ut|s0:t). Next, to compute the input distribution to a dendritic branch relevant in vivo (black curve), we simulated 100 s of activity from a presynaptic population of N=120 neurons undergoing dynamics with the same neocortical statistics and estimated g(s) in a similar way, but now with s being the spikes generated using these in vivo statistics. We observed an excellent overlap between the ranges spanned by the two distributions suggesting that the relatively simple stimuli used in our in vitro experiments were appropriate to probe the physiologically relevant integrative properties of a single dendritic branch. However, note that the range of the input received in vivo by a whole neuron (orange curve), rather than a single branch, can be substantially wider than that probed in our experiments. We modelled the neuron as possessing 10 branches receiving statistically independent inputs, with the total input being an average across all synapses and branches. (B) Examples for single neuron computation, f(u). Note, that although f(u) can be highly nonlinear over the whole-neuron input range (orange distribution in A), in many cases it can be reasonably well approximated by a linear function over the single-branch input range used in our in vitro experiments (green box).
Figure 1—figure supplement 3. Nonlinear computation.

Figure 1—figure supplement 3.

(A) To demonstrate that the theory applies to the case when the computation is nonlinear, we assumed that the postsynaptic neuron computes a sigmoidal function of the weighted sum (specifically the average) of the presynaptic membrane potentials (cf. Equation 5): f(u)=11+eβfi=1Nwiui-θf where θf and βf are the threshold and the slope of the sigmoidal nonlinearity, respectively (see inset, which also shows the marginal distribution of the inputs). To demonstrate the need for nonlinear dendritic integration, we compared the optimal response (black, Equation 3, where f(u) is defined above, and the posterior P(u|s) is approximated by Equations 20–22) with a response of a linear model (grey), a model with nonlinear soma but linear dendrites (somatic, orange; Equation 31) and the clustered dendrite model with nonlinear dendrites and nonlinear soma (clustered, red). For the somatic nonlinearity, we chose a sigmoidal function to match the form of the required computation and fitted its parameters to training data. For the clustered dendrite model, the dendritic nonlinearities were fitted to the data but the somatic nonlinearity was assumed to be identical to the nonlinearity used for the computation, f(u). We cross-validated the quality of the fits on a separate set of test data. Similar test and training errors confirmed that the parameter optimisation found good, locally near-optimal solutions without significant overfitting (not shown). (B) The error (mean ± sd, Equation 33) of the clustered dendrite model (red) is similar to the error of the optimal response (black) and is substantially smaller than the error of the models with linear dendritic integration (grey and orange). Moreover, having a global nonlinearity does not provide substantial improvement over the purely linear response, further emphasising the importance of nonlinear dendritic integration. Parameters were Ω-=10 Hz, Ω+=4 Hz, u¯=2.3 mV, τ=20 ms, Σii=1 mV2, Σij=0 mV2, g=10 Hz, β=0.4 mV-1, τrefr=1 ms, prel=1, βf=1 and θf=1 mV.

where v˙ is the resultant temporal change of the activity of the postsynaptic neuron. We chose u and v to be analog variables, rather than for example digital spike trains, in line with the vast bulk of theories of network computations (Hopfield, 1984; Dayan and Abbott, 2001; Pouget et al., 2003) and experimental results suggesting analog coding in the cortex (London et al., 2010; Shadlen and Newsome, 1998). In particular, we considered these variables to correspond to the coarse-grained (low-pass filtered) somatic membrane potentials of neurons (in particular, excluding the action potentials themselves, as often reported in experimental data; Carandini and Ferster, 2000), although the theory can equally be formalized in terms of instantaneous firing rates (Materials and methods, Figure 1-figure supplement 1).

The standard description of neural circuit dynamics in Equation 1 hides an important informational bottleneck intrinsic to the operation of cortical circuits. While according to Equation 1, the postsynaptic neuron’s analog activity, v, is required to depend directly on the analog activities of its presynaptic partners, u, in reality it only accesses these presynaptic activities through the spikes the presynaptic population transmits, s, incurring a substantial loss of information (Alenda et al., 2010; Sengupta et al., 2014). Therefore, the function g a neuron actually implements on its inputs can only depend directly on the presynaptic spikes, not the underlying activities (Figure 1A, bottom):

v˙=g(s) (2)

Importantly, while f(u) is dictated by the computational function of the circuit, the actual transformation of the synaptic input to the postsynaptic response, expressed by g(s), is determined by the morphological and biophysical properties of the cell. (For these purposes, we regard the presynaptic side of synapses, transforming presynaptic spike trains to synaptic transmission events, as conceptually being part of the postsynaptic cell’s g(s) function.) How can then the neuron integrate the incoming presynaptic spikes, as formalized by g(s), such that the resulting postsynaptic response best matches the required computational function, f(u), thereby alleviating the fundamental informational bottleneck of spiking-based communication?

Determining the best g(s) is nontrivial because the same presynaptic spike train may be the result of many different underlying presynaptic activities (Paninski, 2006), each potentially implying a different output of the computational function. This ambiguity is formalized mathematically as a posterior probability distribution, P(u|s), expressing the probability that the analog activities of the presynaptic cells might currently be u given their spike trains, s (Pfister et al., 2010; Ujfalussy et al., 2011). The optimal response, i.e. the g(s) that minimizes the average squared error relative to f(u), is the expectation of f(u) under the posterior:

g(s)=f(u)P(u|s) du (3)

Crucially, the expression for the posterior, given by Bayes’ rule, is:

P(u|s)P(s|u)P(u) (4)

Note that while Equations 3–4 do not reveal directly the specific biophysical properties the postsynaptic cell should have, they tell us phenomenologically what signal integration properties should result from its biophysical properties. In particular, they make it explicit that the optimal g(s) depends fundamentally on two factors (Figure 1B, top):

  1. the function that needs to be computed, f(u), and

  2. the statistics of presynaptic activities: P(u), the prior probability distribution characterizing the long-run statistics of multi-neural activity patterns in the presynaptic ensemble, and the likelihood P(s|u), expressing the potentially probabilistic relationship between analog activities (e.g. somatic membrane potential trajectories) and emitted spike trains.

In the following, we show that the outcome of the integration of presynaptic spike trains in cortical neurons approximates very closely the optimal response, and that dendritic nonlinearities are crucial for achieving this near-optimality. For this, 1) we make an assumption about the computational function of the postsynaptic cell, f(u) (Figure 1B, top right); 2) we constrain presynaptic statistics, P(u) and P(s|u), by in vivo data about cortical population activity patterns (Figure 1B, top left); and with these 3) we compute the optimal response they jointly determine for various stimulation patterns (Figure 1B, bottom left and right).

Optimal input integration is nonlinear

To specify our model, we considered the case when f(u) itself is linear. Although networks with purely linear dynamics can perform non-trivial computations already (Dayan and Abbott, 2001; Hennequin et al., 2014), in the general case, we do expect f(u) to be nonlinear, e.g. sigmoidal (Hopfield, 1984). Nevertheless, in typical electrophysiological experiments only a small fraction of the full dynamic range of a neuron’s total input is stimulated (Figure 1—figure supplement 2), and so we approximate the computational function, f(u), as being linear on this limited input range without loss of generality. (See Figure 1—figure supplement 3 for the application of the theory to the case of nonlinear f.) Yet, as we show below, for physiologically realistic statistics of presynaptic activity patterns, the optimal response combines input spike trains in highly nonlinear ways even in the case of linear computation, predicting experimentally characterized nonlinearities in dendritic input integration. In particular, second- and higher-order prior presynaptic correlations, represented by P(u), will have a major role in determining the form of the corresponding optimal response. The likelihood, P(s|u), also influences the optimal response, but only in its quantitative details, as it does not involve correlations across neurons: each neuron’s firing is independent from the others’, given its own somatic membrane potential (Materials and methods).

Previous suggestions for how postsynaptic neurons achieve reliable computation despite the substantial ambiguity about the individual presynaptic activities relied on the linear averaging of inputs arriving from a sufficiently large pool of presynaptic neurons (Dayan and Abbott, 2001; Pfister et al., 2010). However, linear averaging is only guaranteed to produce the correct output, as dictated by Equations 3-4, if the activities of presynaptic neurons are statistically independent under the prior distribution, i.e. P(u)=iP(ui) (Materials and methods). In contrast, the membrane potential (Crochet et al., 2011) and spiking (Cohen and Kohn, 2011) of cortical neural populations often show complex patterns of correlations, which include both ‘spatial’ (cross-correlations between different neurons) and temporal components (auto-correlations, i.e. the correlation of the activity of the same cell with itself at different moments in time). Thus, in this more general case, we expect the optimal response to involve a nonlinear integration of spike trains. While temporal correlations alone do not require nonlinear dendritic integration across synapses, only local nonlinearities within each synapse, as brought about e.g. by short term synaptic plasticity (Pfister et al., 2010), spatial correlations require the non-linear integration of spikes emitted by different presynaptic neurons.

To illustrate that presynaptic spatial correlations require nonlinear integration across synapses, we compared the best linear response to a given presynaptic spike pattern with the optimal response (Equation 3, as approximated by Equation 23) for two different input statistics that differed only in the correlations between the presynaptic cells but not in the activity dynamics or spiking of individual neurons (temporal correlations). To compute the postsynaptic response, we assumed that dendritic integration in the postsynaptic cell was linear but, in order to dissect the role of dendritic integration across synapses from the effects of nonlinearities in individual synapses, we allowed spikes from the same presynaptic neuron still to be integrated nonlinearly (Pfister et al., 2010). In the first case (Figure 1C), when the presynaptic neurons were independent, the best linear response was identical to the optimal response. However, if presynaptic neurons became correlated, the optimal response became nonlinear and the best linear response was unable to accurately follow the fluctuations in the input (Figure 1D).

Thus, inputs from presynaptic neurons whose activity tends to be correlated need to be nonlinearly integrated, while inputs from uncorrelated sources need to be integrated linearly. This could be naturally achieved in the same dendritic tree by clustering synapses of correlated inputs to efficiently engage dendritic nonlinearities, while distributing the synapses of uncorrelated inputs on different dendritic branches (Larkum and Nevian, 2008). Crucially, for correlated inputs it is also necessary that the dendritic nonlinearities have just the appropriate characteristics for the particular pattern of correlations in presynaptic activities.

The form of the optimal nonlinearity depends on the statistics of presynaptic inputs

In order to systematically study the nonlinearities in the optimal response in the face of naturalistic input patterns, we derived and analyzed its behavior for a flexible class of richly structured, correlated inputs. Our statistical model for presynaptic activities, specifying the parametric forms of P(u) and P(s|u) (Materials and methods and Figure 2—figure supplement 1), was able to generate a variety of multi-neural activity patterns resembling the statistical properties described in in vitro and in vivo multielectrode recordings of neuronal population activities (Figure 2A and D show two representative examples). Once we have specified the statistical model of presynaptic activities, it uniquely determined the optimal response to any given input pattern (Equations 3–4). Thus, we used the same statistical model in two fundamentally different ways: first, to generate “naturalistic” in vivo-like patterns of presynaptic membrane potential traces and spike trains; and second, to compute the optimal response pattern to any stimulation pattern, be it “naturalistic” or parametrically varying “artificial” as used in typical in vitro experiments.

Figure 2. Nonlinearities in the optimal response.

(AC) Second order correlations between presynaptic neurons (A) imply sublinear integration (BC). (A) Membrane potentials and spikes of two presynaptic neurons with correlated membrane potential fluctuations. (B) The optimal response (solid line) to a single spike (left) and to a train of six presynaptic spikes (right, green colors correspond to different presynaptic cells, two of which are shown in A) when the long-run statistics of presynaptic neurons are like those shown in (A). Shaded areas highlight how response magnitudes to a single spike from the same presynaptic neuron differ in the two cases: the response to the sixth spike in the train (right, light blue shading) is smaller than the response to a solitary spike (left, gray shading) implying sublinear integration. Dashed line shows linear response. (C) Response amplitudes for 1–12 input spikes versus linear expectations. (DF) Same as (AC) but for presynaptic neurons exhibiting synchronized switches between a quiescent and an active state, introducing higher order correlations between the neurons (D, bottom). In this case, the optimal response shows supralinear integration (EF).

DOI: http://dx.doi.org/10.7554/eLife.10056.010

Figure 2.

Figure 2—figure supplement 1. Definition of the statistical model describing presynaptic activities and illustration of the inference process in the model.

Figure 2—figure supplement 1.

(A) Graphical model showing statistical dependencies between variables (quantities changing in time, circles), and the parameters (quantities constant on the time scale of our interest, above and beside arrows) controlling those dependencies. (B) Table showing the variables and parameters of the model. See Materials and methods for further details. (C) Validating assumed density filtering (Equations 20–22) with one presynaptic cell and two states. Black shows the true state variable (top), and membrane potential trace with the spikes indicated by vertical lines (bottom). Cyan shows the posterior mean state variable (top) and membrane potential (bottom) obtained by assumed density filtering, red shows the corresponding posterior means estimated by particle filtering. Parameters were N=1, β=0.33 mV-1, τ=20 ms, g=10 Hz, Σii=4 mV2, u¯=5 mV and Ω+=Ω-=5 Hz.

The optimal response determined by this statistical model, for essentially any setting of parameters, was inherently nonlinear because the additional effect of a presynaptic spike depended on the pattern of spikes that had been previously received from the presynaptic population. Temporal correlations in the presynaptic population caused the optimal response to depend on the spiking history of the same cell (Pfister et al., 2010), while crucially, the additional presence of spatial correlations introduced a dependency on the past spikes of other cells. Thus, the integrated effect of multiple spikes could not be computed as a simple linear sum of their individual effects in isolation. Specifically, a spike that was consistent with the information already gained from recent presynaptic spikes had only a small effect on the response (Figure 2B). Conversely, a spike that was unexpected based on the recent spiking history caused a larger change (Figure 2E).

As could be anticipated based on Equations 3-4, whether a spike counted as expected or unexpected relative to recently received spikes, and hence whether it had a small or large postsynaptic effect, depended on the long-run prior distribution of presynaptic activities, P(u). As a result, the same pattern of presynaptic spikes led to qualitatively different responses under different prior distributions. In particular, sublinear integration was optimal when presynaptic activities exhibited Gaussian random walks and thus they did not contain statistical dependencies beyond second order correlations (Figure 2A-C), as seen in the retina and cortical cultures (Schneidman et al., 2006). This was because with the activities of presynaptic neurons being positively correlated, successive spikes conveyed progressively less information about the presynaptic signal resulting in sublinear integration (Figure 2C) and the strength of the sublinearity depended on the magnitude of correlations (Ujfalussy and Lengyel, 2011). In contrast, supralinear integration was optimal when the presynaptic population exhibited coordinated switches between distinct states associated with large differences in the activity levels compared to activity-fluctuations within each state (Figure 2D–F). These switches led to higher order statistical dependencies as seen in the cortex in vivo, either due to population-wide modulation by cortical state (Gentet et al., 2010; Crochet et al., 2011), or due to stimulus-driven activation of particular cell ensembles (Harris et al., 2003; Ohiorhenuan et al., 2010; Miller et al., 2014). In this case, while observing a few spikes was consistent with random membrane potential fluctuations within the quiescent state, thus only warranting a small response, further spikes suggested that the presynaptic population was in the active state now and thus the response should be larger, leading to supralinear integration (Figure 2F).

Note, that nonlinearities in the optimal postsynaptic response needed not simply compensate for the nonlinearities in the presynaptic spike generation process, as captured by P(s|u), but they critically depended on the presynaptic correlations, as captured by P(u). Indeed, in Figures 1C, D and 2A–F, the same spiking nonlinearity was used and yet very different input integration was required depending on the form of the presynaptic statistics: linear integration for uncorrelated inputs (Figure 1B) and nonlinear integration for correlated inputs (Figure 1C), with sub- or supralinear integration being optimal depending on whether only second order (Figure 2A–C) or also higher order correlations were present in the presynaptic population (Figure 2D–F). Moreover, optimal input integration remained nonlinear even if the postsynaptic neuron computed a function of the presynaptic firing rates (rather than membrane potentials) which were linearly related to spikes (Figure 1—figure supplement 1).

Nonlinear dendrites can approximate the optimal response

The nonlinear input integration seen in the optimal response strongly resembled dendritic nonlinearities. Indeed, the basic biophysical mechanisms present in dendrites naturally yield nonlinearities that are qualitatively similar to those of the optimal response: purely passive properties lead to sublinear integration (Koch, 1999), whereas locally generated dendritic spikes endow dendrites with strong supralinearities (Nevian et al., 2007; Branco and Häusser, 2011). However, the full mathematical implementation of the optimal response is excessively complex (Materials and methods) and thus, there is unlikely to be a one-to-one mapping between the variables necessary for implementing it and the biophysical quantities available in dendrites. Therefore, we sought to establish a formal proof that dendritic-like dynamics can implement, even if approximately, the optimal response. For this, we considered two limiting cases of our statistical model of presynaptic activities, P(u) and P(s|u), and compared the properties of the corresponding optimal response to a well-established simplified model of nonlinear dendritic integration, using a combination of analytical and numerical techniques.

First, we considered a limiting case in which the statistics of a large presynaptic population were strongly dominated by the simultaneous switching of presynaptic neurons between a quiescent and an active state (as shown in Figure 2D). In this limiting case we were able to show mathematically (see Materials and methods) that a simple, biophysically-motivated, canonical model of nonlinear dendritic integration (Poirazi and Mel, 2001) is able to produce responses that are near-identical to the optimal response for any sequence of presynaptic spikes (Figure 3A, see also Figures 4C). In this simple dendritic model (Figure 3A, inset; Equations 24–25), inputs within a branch are integrated linearly and the local dendritic response is then obtained by transforming this linear combination through a sigmoidal nonlinearity, which is a hallmark of supralinear behavior in dendrites (Poirazi et al., 2003b).

Figure 3. A canonical model of dendritic integration approximates the optimal response.

(A) The optimal response (black) and the response of a canonical model of a dendritic branch, v (inset), with a sigmoidal nonlinearity (red, Equation 25) as functions of the linearly integrated input, vlin (inset, Equation 24), when the presynaptic population exhibits synchronized switches between a quiescent and an active state, as in Figure 2D. Black dots show optimal vs. linear postsynaptic response sampled at regular 2.5 ms intervals during a 3 s-long simulation of the presynaptic spike trains. (B) Optimal response (black) approximated by the saturating part of the sigmoidal nonlinearity (blue) when the presynaptic population is fully characterized by second-order correlations, as in Figure 2A. Inset shows the same data on a larger scale to reveal the sigmoidal nature of the underlying nonlinearity (gray box indicates area shown in the main plot).

DOI: http://dx.doi.org/10.7554/eLife.10056.012

Figure 3.

Figure 3—figure supplement 1. Reducing the optimal response with second order correlations to a canonical model of dendritic integration.

Figure 3—figure supplement 1.

(AC) Comparing the full (Equations 20–23) and the reduced model Equations A99-A100) of the optimal response. The estimates of the mean presynaptic membrane potential by the full model (A, grey) and the reduced model (A, black) are nearly identical. The error of the reduced model (quantified as the mean squared difference between the two models normalized by the variance of the full model) decreases monotonically with increasing correlations in the presynaptic population (B) and remains bounded as the number of neurons increases (C). (D) Steady state posterior variance, σ¯2, as a function of the posterior mean, μ¯, in the reduced model (Equation A100). (E) Comparing the linear-nonlinear model and the optimal response. Black dots: the optimal response against the output of the linear model, vlin (Equation 24). Blue line: sigmoidal nonlinearity operating in the linear-nonlinear model at the arrival of spikes, h(vlin) (Equation A103). Orange line: the result of numerically fitting a sigmoidal nonlinearity in the canonical model (Equation 25) to the optimal response. Parameters were N=10, g=2 Hz (A–C), or N=1, g=20 Hz (D–E), and β=2 mV-1, τ=20 ms, Σii=1 mV2 (A–E), and ρ=0.5 (A) or as indicated on the x-axis (B) or in the legend (C). For details, see Appendix B.
Figure 3—figure supplement 2. Adaptation without parameter change.

Figure 3—figure supplement 2.

(A) We simulated a presynaptic population with two different global activity states, a synchronized and a desynchronized state and first determined the optimal response in the two states separately (black). Next, we trained a linear (grey) and a nonlinear (red) dendrite to approximate the optimal response in both the synchronized and the desynchronized state (grey). (B) Green and black dots indicate the optimal response as a function of the best linear response respectively during the desynchronized and synchronized states. The same single dendritic nonlinearity (red line) can efficiently approximate the optimal response in both states simply because each state uses a different part of the input range of this nonlinearity: during the synchronized state the expansive supralinearity of the upstroke is being predominantly used, while during the desynchronized state the saturating sublinear-linear regime is dominating the response. (C) The error of the dendritic response is slightly larger than that of the optimal response but still substantially smaller than the error of the linear response. Parameters of the synchronized state were Ω-=10 Hz, Ω+=0.7 Hz, u¯=2.3 mV, τ=20 ms, Σii=1 mV2, Σij=0.5 mV2, g=5.3 Hz, β=0.5 mV-1, τrefr=1 ms and prel=1 and the desynchronized state was identical to a persistent active phase of the synchronized state.

Figure 4. A simple nonlinear dendritic model closely approximates the optimal response for realistic input patterns.

(A) Presynaptic spiking activity matching the statistics observed during hippocampal sharp waves. Spike trains (rows) belonging to four different assemblies are shown (colors), gray shading indicates assembly activations. (B) Different variants of the dendritic model, parts colored in yellow, orange, and red highlight the differences between successive variants (see text for details). (C) Estimating the mean of the presynaptic membrane potentials based on the observed spiking pattern (shown in A) by the optimal response (black) compared to the linear (dotted), somatic (yellow), random (orange) and clustered (red) models. (D) Performance of the four model variants compared to that of the optimal response. Gray lines show individual runs, squares show mean± s.d. Performance is normalized such that 0 is obtained by predicting only the time-average of the signal, and 1 means perfect prediction attainable only with infinitely high presynaptic rates (Materials and methods).

DOI: http://dx.doi.org/10.7554/eLife.10056.015

Figure 4.

Figure 4—figure supplement 1. Responses of different variants of the dendritic model compared to the true signal.

Figure 4—figure supplement 1.

Responses of different variants of the dendritic model to the spiking pattern shown in Figure 4A of the main paper compared to the true signal (green), that is the mean of the presynaptic membrane potentials. The models shown are the optimal response (A, black) and the linear (B, dotted), somatic (C, yellow), random dendrite (D, orange) and clustered dendrite (E, red) models (see also Figure 4B–D of the main paper). The error (gray areas, measured relative to the signal, cf. Figure 4C where it is measured relative to the optimal response) of the linear, somatic and random models is substantially larger than the error of the optimal response, whereas the response of the clustered model is nearly identical to the optimal response.
Figure 4—figure supplement 2. Performance of different neuron models over a wide range of input statistics.

Figure 4—figure supplement 2.

(A) Performance of the optimal (opt) linear (lin) and clustered dendritic (clust) model for presyanaptic statistics with only second-order correlations (Table 1, cor2) with different numbers of presynaptic neurons (N), while keeping firing rates fixed (g=0.5 Hz). (B) Same as (A), but changing firing rates (g) while keeping the number of presynaptic neurons fixed (at N=10). The performance of the dendritic neuron is always close to the performance of the optimal response. (With N=10 and g=20 Hz the dendritic model seems to be slightly better than the optimal response. This is because we derived the optimal response with the assumption that there were no more than one spike in each time bin and we did not enforce this condition in these simulations.) (C) Performance of the models using neocortical input statistics (Table 1, NC) with different numbers of presynaptic neurons. (D) Performance of the models using hippocampal-like input (Table 1, HP; with Σ¯ij=0 and τrefr=1 ms in D–F) with different numbers of presynaptic neurons. (E) The dendritic model performs significantly better than the linear model and nearly as well as the optimal response for different values of N and g. (F) The performance of the models depends mostly on the firing rate of the population, N×g, and not on N or g individually. (Range of N values used is shown in the legend, g was 1, 2, 4, or 10.) This can be useful because although the computational complexity of the optimal response scales as N2, and thus computing it for large N can be prohibitive, this scaling suggests that, for practical purposes, the large N limit can be studied by scaling g rather than N. Error bars show s.d. and were smaller than the symbols for the means in some cases.

Second, we considered another limiting case in which the statistics of the presynaptic population were fully characterized by second-order correlations (as shown in Figure 2A). In this case, the same type of dendritic model, but with a sublinear input-output mapping, was able to approximate the optimal response very closely. Although a closed-form solution for the optimal nonlinear mapping could not be obtained in this case, it could be shown to be sublinear (Appendix), and was well approximated by a sigmoidal nonlinearity parametrized to be dominantly saturating (Figures 3B and Figure 3—figure supplement 1).

We also noted that it was the same type of sigmoidal nonlinearity which could implement supralinear and sublinear integration depending on the input regime (low background, synchronous spikes: supralinear; high background, asynchronous spikes: sublinear integration, compare Figure 3A and B, inset). This suggests that dendritic integration may adapt to systematic changes in presynaptic statistics, such as those brought about by transitioning between the desynchronized and synchronized states of the neocortex, or sharp waves and theta activity in the hippocampus, without having to change the parameters of its nonlinearity (Borst et al., 2005) (Figure 3—figure supplement 2). Indeed, Gasparini and Magee (2006) demonstrated that dendritic integration in hippocampal pyramidal cells was supralinear when inputs were highly synchronized (as they are during sharp waves), while integration was linear if the input was asynchronous (such as during theta activity).

Nonlinear dendrites are computationally advantageous

While the foregoing analyzes proved that dendritic-like nonlinearities can closely approximate the optimal response in certain limiting cases, they do not address directly whether having such nonlinearities in input integration is crucial for attaining near-optimal computational performance for more realistic input statistics, or simpler forms of input integration could achieve similar computational power. To study this, we considered a scenario in which the presynaptic population consisted of four ensembles, such that neurons belonging to each ensemble underwent synchronized switches in their activity levels which were independent across the four ensembles, while there were also independent fluctuations in the activity of individual presynaptic neurons which were comparable in magnitude to those caused by these synchronized activity switches (Figure 4A). We then assessed the performance of four different variants of a simple dendritic model relative to that of the optimal response (Figure 4B): a model with linear dendrites and soma; a model in which only the soma was nonlinear, and two models in which nonlinearities resided in the dendrites with either random or clustered connectivity between the presynaptic assemblies and the dendritic branches.

We quantified the performance of each of the models based on how closely their output approximated the linear average of the analog presynaptic activities giving rise to the spike trains they were integrating (Figure 4—figure supplement 1, Materials and methods). For a fair comparison, we tuned the parameters of each variant of the dendritic model to obtain the best possible performance with it (Figure 4C). The model with nonlinear dendrites and clustered connectivity had near-optimal cross-validated performance (Figure 4D) while all other models performed significantly worse (n = 20 runs, t = 51, t = 35, t = 20, and P<10-15, P<10-15, P<10-13; respectively from left to right as shown in Figure 4D). This remained true when we varied the number and firing rate of presynaptic neurons over a wide range, and under a diverse set of qualitatively different population-level statistics, determining the dynamics of assembly switchings and within-assembly membrane potential correlations (Figure 4—figure supplement 2).

Taken together, these results demonstrate that the clustering of correlated inputs together with nonlinearities akin to those found in dendrites is necessary to achieve optimal estimation performance in the face of presynaptic correlations. However, in order to be tractable, our dendritic model was mathematically simplified and, as a result, only qualitatively reproduced the nonlinearities of real dendrites. Thus, we directly compared experimentally recorded responses in dendrites to the optimal response.

Nonlinear integration in cortical neurons is matched to their input statistics

A crucial prediction of our theory is that dendritic nonlinearities act to achieve near-optimal responses in a way that the form of the nonlinearity is specifically matched to the long-run statistics of the presynaptic population. We tested this prediction in experiments in which two different types of cortical pyramidal neurons, from layer 2/3 of the neocortex (Figure 5A–E) and from area CA3 of the hippocampus (Figure 5F–J), received patterned dendritic stimulation using two-photon glutamate uncaging, and compared their subthreshold somatic responses with the optimal responses predicted by the theory.

Figure 5. Nonlinear dendritic integration is matched to presynaptic input statistics.

(A) Sample membrane potential fluctuations (left, adapted from Gentet et al., 2010) and multineuron spiking patterns (right, adapted from Ji and Wilson, 2007) recorded from the neocortex (top), and matched in the model (bottom, see also Tables 13). (B) Two-photon image of a neocortical layer 2/3 pyramidal cell, numbers indicate individual dendritic spines stimulated in the experiment. (C) Responses to trains of seven stimuli using different inter-stimulus intervals (ISI, shown below traces) recorded in the cell shown in (B) (black; mean± s.d.) and predicted by the optimal response tuned to the presynaptic statistics shown in A (red). Parameters related to postsynaptic dendritic filtering were tuned for the specific dendrite ( Figure 5—figure supplement 1B–C). (D) Dependence of response amplitudes on ISI in the same dendrite shown in B-C (squares), and as predicted by the optimal response (filled circles) or linear integration (empty circles). (E) Average error of fitting dendritic recordings across all dendrites and conditions using the optimal response tuned to different presynaptic statistics (NC, HP, cor2, ind; see text for details) compared to within-data variability (var). Gray lines show individual dendrites. Rightmost bar (NC-AP5) shows fit using NC presynaptic statistics to responses obtained after pharmacological blockade of NMDA receptor activation. (FJ) Same as (AE) for presynaptic patterns characterized by hippocampal sharp waves (F) and recordings from hippocampal CA3 pyramidal cells (HJ) when stimulating synapses on its basal dendrites (G). In vivo data in (F) was adapted from (Ylinen et al., 1995) (left, membrane potential traces, not simultaneously recorded) and (O'Neill et al., 2006) (right, multineuron spike trains). Error bars show s.e.m.

DOI: http://dx.doi.org/10.7554/eLife.10056.018

Figure 5.

Figure 5—figure supplement 1. Best fit parameters for fitting dendritic responses.

Figure 5—figure supplement 1.

(AB) The parameters of the HP and NC models describing the activity of the presynaptic population were fitted to in vivo recordings from the corresponding presynaptic populations (see Materials and methods). When using these presynaptic models to fit individual hippocampal (orange) and neocortical (red) dendritic responses, respectively, we only tuned parameters that characterized the postsynaptic neuron for the optimal response: its membrane time constant, τpost (A), the response amplitude for a single spike (not shown), and the size of the dendritic sububit, i.e. the number of presynaptic neurons innervating a single dendritic branch, N (B). (C) When using the cor2 presynaptic model to fit hippocampal (hipp), neocortical (cort) and cerebellar (cbl) dendritic responses, we tuned the correlations between presynaptic neurons, ρ, instead of the size of the dendritic subunit (for an explanation, see also Table 1). Box plots show the range of the data (whiskers), the quartiles (box), and the median (center line). Green lines in (C) show mathematically possible largest negative correlations (-1N-1), where N is the number of presynaptic cells.
Figure 5—figure supplement 2. Dendritic integration in cerebellar stellate cells is not predicted by cortical presynaptic statistics.

Figure 5—figure supplement 2.

We used data recorded from cerebellar stellate cells in which input integration is sublinear (Abrahamsson et al., 2012) to test our predictions with a qualitatively different form of dendritic integration. Stellate cells receive input from cerebellar granule cells (Shepherd, 2004). (A) Top: Presynaptic membrane potential trace recorded intracellularly in a cerebellar granule cell (adapted from Duguid et al., 2012). Bottom: presynaptic membrane potentials in two model neurons (left) and spiking in the model population (right) with simple second-order correlations across cells. (No simultaneously recorded multineural experimental data was available.) Although it is clear from in vivo recordings from cerebellar granule cells that these neurons do not show state-switching dynamics under anesthesia (Duguid et al., 2012), the limited experimental data about their natural population activity made it unfeasible to fit our model quantitatively to data. (B) Responses to increasing number of stimuli (2 10) recorded in a cerebellar stellate cell (black lines, adapted from Abrahamsson et al., 2012) and predicted by the optimal response (thick colored lines) assuming second-order correlations (left) or independence (right) between the membrane potentials of presynaptic neurons. (C) Dependence of response amplitudes on the number of stimuli in the same dendrite shown in (B) (squares, adapted from Abrahamsson et al., 2012), and as predicted by the optimal response (assuming second order correlations, filled circles) or linear integration (empty circles). (D) Average error of fitting dendritic recordings across conditions using the optimal response tuned to different presynaptic statistics (HP, NC, cor2, ind; see main text, and Table 1 for details).

For generating our predictions of the optimal response in these two cell types, we fitted the parameters describing presynaptic statistics in our model, P(u) and P(s|u), to the statistical patterns in the activity of their respective presynaptic populations. For neocortical pyramidal cells, we fitted in vivo data available on the membrane potential fluctuations of layer 2/3 pyramidal cell-pairs in the barrel cortex during quiet wakefulness (Gentet et al., 2010; Crochet et al., 2011) (NC, Figure 5A, Table 2). For hippocampal pyramidal cells, we fitted presynaptic statistics to membrane potential fluctuations (Ylinen et al., 1995; English et al., 2014) and to multineuron spiking patterns of hippocampal pyramidal cells during sharp wave activity (Csicsvari et al., 1999; 2000) (HP, Figure 5F, Table 3). Due to the limitations of available hippocampal data sets, extracellular rather than intracellular data was used for fitting correlations. The motivation for our choice of the particular neocortical and hippocampal states used for fitting presynaptic statistics was two-fold. First, the general network state of the slice preparations in which we tested dendritic integration was likely most analogous to these states (A Gulyás, personal communication; see also Karlocai et al., 2014; Schlingloff et al., 2014), characterized by relatively suppressed neural excitability due to low levels of cholinergic modulation (Harris and Thiele, 2011; Eggermann et al., 2014). Second, the stimulation protocol used in our study (short bursts of synaptic stimuli following longer silent periods) was also most consistent with population activity during hippocampal sharp waves and quiet wakefulness in the cortex. In order to capture variability across the cells we recorded from, the parameters related to postsynaptic dendritic filtering (amplitude and decay of the response to a single stimulation, and the size of the dendritic subunit, Figure 5—figure supplement 1B-C) were tuned for the individual dendrites. Importantly, the parameters describing presynaptic statistics were fitted without regard to our dendritic experimental data, thus allowing a strong test of our predictions about dendritic integration (see Materials and methods).

Table 2.

Features of neocortical population activity during quiet wakefulness. Parameters of the model are given in column NC of Table 1.

DOI: http://dx.doi.org/10.7554/eLife.10056.008

Data Model (NC) Reference
duration of active states 130 ms 100 ms Gentet et al. (2010)
duration of quiescent states 200 ms 250 ms Gentet et al. (2010)
r+, firing rate during active states 2.5 Hz 2.86 Hz Gentet et al. (2010)
r-, firing rate during quiescent states 1/3 Hz 0.39 Hz Gentet et al. (2010)
2u¯, depolarisation during active states 20 mV 20 mV Gentet et al. (2010)
time constant 20 ms 20 ms Poulet and Petersen (2008)

Table 3.

Features of hippocampal population activity during sharp wave-ripple states. Parameters of the model are given in column HP of Table 1A recent intracellular study (English et al., 2014) recording from CA1 neurons in awake mice found parameters similar to our previous estimates. Using the parameters found in that study – r+=12.8 Hz, r-=2.85 Hz (Table 1 of English et al., 2014), 2u¯=5 mV and Σ¯ii=4 mV2 (Figure 3A of English et al., 2014) yielding g=5 Hz and β=0.3 mV-1 – did not influence our results (not shown).

DOI: http://dx.doi.org/10.7554/eLife.10056.009

Data Model (HP) Reference
activation rate of an ensemble 0.25 Hz 0.027 Hz Grosmark et al. (2012); Pfeiffer and Foster (2013)
duration of SPWs 105 ms 100 ms Csicsvari et al. (2000)
r+, firing rate during SPW 10 Hz 9.5 Hz Csicsvari et al. (2000)
r-, firing rate between SPWs 0.5 Hz 0.6 Hz Grosmark et al. (2012); Csicsvari et al. (2000)
2u¯, depolarisation during SPWs 0–10 mV 4.6 mV Ylinen et al. (1995)
time constant 8–22 ms 20 ms Epsztein et al. (2011)

We found that the non-linear integration of individual spike patterns in cortical neurons was remarkably well fit by the optimal response when it was tuned to the correct presynaptic statistics (Figure 5C,H). The systematic dependence of response amplitudes on the inter-stimulus interval (ISI) in individual cells (Figure 5D,I) was also well predicted by the optimal response. We quantified the quality of match between the predicted and experimentally recorded time course of responses across a population of n = 6 (neocortex) and n = 6 (hippocampus) dendrites under a range of conditions varying ISI or the number of stimuli, and found that the precision of our predictions was not statistically different from that expected from the inherent variability of responses in individual dendrites (Figure 5E,J; neocortex: t = 0.2, P = 0.85; hippocampus: t = 1.85, P = 0.12). In contrast, when the optimal response was tuned to unrealistic presynaptic statistics characterized purely by second-order correlations (cor2), or by a lack of any correlations implying statistically independent presynaptic firing (ind), the quality of fits became significantly worse (Figure 5E,J; neocortex: t = −4.6, P = 0.006 for cor2, and t = −4.9, P = 0.004 for ind; hippocampus: t = −4, P = 0.01 for cor2, and t = −4.9, P = 0.004 for ind).

Moreover, using realistic presynaptic statistics, but matching hippocampal rather than neocortical activities, also resulted in significantly worse fits for neocortical responses (Figures 5E; t= −3.6, P = 0.02). The converse was not observed in the case of hippocampal neurons (Figures 5J; t= 0.43, P = 0.68). This might be because hippocampal neurons also receive neocortical inputs (albeit on their apical not basal dendrites) that show similar population activity patterns to the ones we matched here for the neocortical cells (Isomura et al., 2006), while the primary sensory cortical pyramidal cells we recorded from do not receive direct input from the hippocampus. Nevertheless, when we analyzed the quality of fit between our predictions and recorded responses in hippocampal and neocortical data together, we found a small, but significant interaction between the source of the input statistics (neocortex or hippocampus) and the location of the postsynaptic neurons (ANOVA F = 5, P < 0.05). This suggests that dendritic nonlinearities in cortical pyramidal neurons are specifically tuned to the dynamics of their presynaptic cortical ensembles. Furthermore, the blockade of NMDA receptor activation by AP5 resulted in dendritic responses that afforded substantially poorer fits by the model, even after refitting the postsynaptic parameters (Figure 5E,J, AP5). This indicated that the fine tuning of dendritic nonlinearities to input statistics relied on the action of NMDA receptors.

As dendrites in both of our cortical cell types integrated inputs supralinearly, as a further control, we analyzed similar data available from cerebellar stellate cell dendrites, which are known to integrate their inputs sublinearly (Abrahamsson et al., 2012) (Figure 5—figure supplement 2). In this case, we fitted the statistics of individual presynaptic cells to those of cerebellar granule cells. The correlations between these cells are less known, but we found that assuming simple second-order correlations made the optimal response a close match to dendritic responses. In contrast, the hippocampal- or neocortical-like statistics that were crucial for matching responses in cortical dendrites (Figure 5D,H) resulted in a substantially poorer fit in this cerebellar cell type. This demonstrates a double dissociation in the matching of cortical and subcortical neuron types to cortical and non-cortical input statistics.

Discussion

We established a functional link between the statistics of the synaptic inputs impinging on the dendritic tree of a neuron and the way those inputs are integrated within the dendritic tree. We first demonstrated that efficient computation in spiking circuits requires nonlinear input integration if the activities of the neurons are correlated and that the structure of the presynaptic correlations determines the form of the optimal input integration. Second, we showed that the optimal response can be efficiently approximated by a canonical biophysically-motivated model of dendritic signal processing both for linearly correlated inputs and for cell-assembly dynamics. Third, we found that nonlinear dendrites with synaptic clustering carry significant benefits for decoding richly structured presynaptic spike trains. Finally, in vitro measurements of dendritic integration in two different types of cortical pyramidal neurons yielded postsynaptic responses that closely matched those predicted to be optimal given the in vivo input statistics of those particular cell types. These results suggest that nonlinear dendrites are essential to decode complex spatio-temporal spike patterns and thus play an important role in network-level computations in neural circuits.

Biophysical substrate

The central insight of our theory is the relationship between presynaptic statistics and postsynaptic response, formalized as the optimal response. The optimal response can be expressed as a set of nonlinear differential equations that requires storing and continuously updating ~N2 variables within the dendritic tree, where N is the number of synapses (Materials and methods). Thus, it is unlikely to be implemented by the postsynaptic neuron as such. Consequently, to demonstrate the biophysical feasibility of our theory, we derived a simple approximation to the optimal response that performs about equally well using just a few postsynaptic variables and that corresponds to a canonical descriptive model of dendritic integration (Poirazi et al., 2003; Poirazi et al., 2003b).

We found that simple second order correlations between presynaptic neurons imply sublinear integration which can be implemented by the saturating nonlinearity characteristic of passive dendrites. Conversely, the biophysical substrate for the type of supralinear integration that was optimal for state-switching dynamics likely involves NMDA receptors because the particular dendritic nonlinearites observed in the cortical cells in which we tested our theory are known to be mediated primarily through NMDA receptor activation (Branco et al., 2010; Makara and Magee, 2013; Major et al., 2013). Indeed, we found that pharmacological inactivation of NMDA receptors abolished the precise match between dendritic integration and presynaptic statistics in these neurons (Figure 5). Moreover, the local plateau potentials generated by NMDA currents have been shown to have graded response durations (Major et al., 2008), and the resulting nonlinearities could be continuously tuned between weaker and stronger forms (boosting and bistability, Major et al., 2013). These properties make NMDA receptor mediated dendritic nonlinearities ideally suited for being matched to presynaptic statistics, as the optimal response involves sustained dendritic depolarisations of varying duration (Figure 4) that depend parametrically on those statistics.

Input statistics and clustering

A central prediction of our theory that awaits confirmation is the existence of a tight relationship between the structure of correlations in the activity of presynaptic cells and the morphological clustering of their synapses on the postsynaptic dendrite. This is because our theory requires nonlinear integration of inputs from neurons with statistically dependent activity, while spikes from independent neurons need to be integrated linearly. Biophysical considerations suggest (Koch, 1999) and experimental data supports (Polsky et al., 2004; Losonczy and Magee, 2006) that, when synchronous, nearby inputs on the same dendritic branch are summed nonlinearly, whereas widely separated inputs are combined linearly. Consequently, our theory predicts that the correlation structure of the inputs will be mapped to the dendritic tree in a way that presynaptic neurons with strongly correlated activities target nearby locations while independent neurons innervate distinct dendritic subunits.

According to our theory, the kind of correlation relevant for determining synaptic clustering is the ‘marginal’ correlations between the membrane potentials of presynaptic neurons. Marginal correlations include both signal and noise correlations (Averbeck et al., 2006) and thus can reach substantial magnitudes even when noise correlations alone are small, as e.g. during desynchronized cortical states (Renart et al., 2010), especially for neurons with overlapping receptive fields (Froudarakis et al., 2014), and when measured between the membrane potentials of neurons rather than their spike counts (Dorn and Ringach, 2003; de la Rocha et al., 2007; Poulet and Petersen, 2008).

At the level of different dendritic regions, the segregation of different input pathways along the dendritic tree of hippocampal neurons supports this prediction (Witter et al., 1989; Druckmann et al., 2014). At the level of individual synapses, the degree and the existence of clustering among inputs showing correlated activity is currently debated. High resolution imaging revealed subcellular topography of sensory inputs in the tadpole visual system (Bollmann and Engert, 2009), clustered patterns of axonal activity in the parallel fibres that provide input to cerebellar Purkinje cells (Wilms and Häusser, 2015), and experience-driven synaptic clustering in the barn owl auditory localization pathway (McBride et al., 2008). Furthermore, it has been demonstrated that neighboring synapses are more likely to be coactive than synapses that are further away from each other in developing hippocampal pyramidal cells (Kleindienst et al., 2011) as well as in hippocampal cultures and in vivo in the barrel cortex during spontaneous activity (Takahashi et al., 2012). These results thus suggest clustering of correlated inputs.

In contrast, an interspersion of differently tuned orientation-, frequency- or whisker-specific synaptic inputs on the same dendritic segments was found in the mouse visual, auditory or somatosensory cortex, respectively, thus challenging the prevalence of synaptic clustering (Jia et al., 2010; Chen et al., 2011; Varga et al., 2011). However, in all these studies the stimuli used were non-naturalistic and varied along a single stimulus dimension only (direction of drifting gratings, pitch of pure tones, or the identity of the single whisker being stimulated), which may account for the apparent lack of clustering. In particular, our theory predicts clustering based on the long-term statistical dependencies between the responses of the presynaptic neurons for naturalistic inputs, which can be quite poorly predicted from their tuning properties for single stimulus dimensions (Harris et al., 2003; Fiser et al., 2004). In contrast, the statistical dependencies relevant for our theory are well represented by those found during spontaneous activity (Berkes et al., 2011). Indeed, studies finding evidence in favor of synaptic clustering analyzed the structure of synaptic input to dendritic branches during spontaneous network activity (McBride et al., 2008; Kleindienst et al., 2011; Makino and Malinow, 2011; Takahashi et al., 2012). Thus, presynaptic correlations for naturalistic stimulus sets may be predictive of synaptic clustering and providing more direct evidence for or against such clustering will offer a crucial test of our theory.

Linear vs. nonlinear postsynaptic computations

Although, in general, we expect single-neuron computations to be nonlinear (Zador, 2000), and our theory indeed applies to nonlinear computations (Figure 1—figure supplement 3), we assumed the postsynaptic computation to be linear for matching experimental data. This choice was justified by two reasons. First, it is difficult to determine, without making strong prior assumptions, what kind of nonlinear function the neuron actually computes; and so the choice of any particular such function would have been arbitrary. Note that even in relatively well-characterized cortical areas (such as the visual cortex) it is unknown how much of the computationally relevant output of individual neurons (such as orientation or direction selectivity) is brought about by specific nonlinearities in the input-output transformations of these neurons, or by multiple steps of feed-forward and recurrent processing carried out at various stages of the visual pathway between the retina and those neurons. Moreover, in some cases, even networks with linear dynamics can provide a remarkably close fit to experimentally observed cortical population dynamics (Hennequin et al., 2014). This issue may be best addressed in systems that are more specialized than the cortex so that there are well-supported hypotheses about the particular nonlinear computations individual neurons need to perform, such as the fly visual system (Single and Borst, 1998) or the mammalian and avian auditory brain stem (Agmon-Snir et al., 1998). In order to test our theory in these systems, in vivo multineural data will need to be collected from the afferent brain areas, preferably in the unanesthetized animal, for characterising the relevant statistical properties of the presynaptic population to which dendritic nonlinearities are adapted according to our prediction.

Second, any nonlinear function can be approximated to high precision by a linear function over a sufficiently limited input range. Currently available experimental techniques for systematically probing dendritic nonlinearities, including those used in our study, only provide data over such a very limited range (0.1% of the number of excitatory inputs impinging a neuron, Megías et al., 2001). Inputs in this small range do not sufficiently engage global nonlinearities brought about by active somatic conductances or global events such as Ca2+ spikes. Thus, we could assume linear computation over this range without loss of generality (Figure 1—figure supplement 2). In fact, from this perspective, it is a non-trivial phenomenon to account for on its own right that stimulating such a small fraction of inputs already leads to observable nonlinearities in the postsynaptic dendrite. By defining the computation to be linear, we could demonstrate that such strong dendritic nonlinearities arise naturally in our theory, entirely due to the correlations in the prior input statistics, thus providing a functional account for this remarkable phenomenon.

Once patterned dendritic stimulation over a broader and more realistic range of inputs becomes feasible, our theory will provide a principled method for dissecting the roles of presynaptic correlations vs. genuine nonlinear computations in shaping dendritic nonlinearities. A sufficiently rich set of such data will allow the fitting of presynaptic parameters, as we did here, followed by fitting postsynaptic transfer functions to dendritic responses without having to make strong prior assumptions about their (linear) nature.

Analog communication, stochastic synaptic transmission and short-term synaptic plasticity

Our formalism was based on the assumption that cortical neurons only influence each other’s membrane potentials via the action potentials they emit. While there exist other, more analog forms of communication, such as the modulation of the effects of action potentials by subthreshold potentials (Clark and Häusser, 2006), the propagation of voltage signals through gap junctions (Vervaeke et al., 2012), and ephaptic interactions between nearby cells (Anastassiou et al., 2011), these either require slow membrane potential dynamics, small distances between interacting cells, or large degrees of population synchrony, and are thus generally believed to have a supplementary role beside spike-based communication (Sengupta et al., 2014). Note that our theory is self-consistent even though it considers spiking only in the presynaptic population and not in the postsynaptic neuron. This is because we assumed that the computationally relevant mapping is that between the membrane potentials of the presynaptic neurons and the postsynaptic cell (Figure 1A, Equation 1), and so, by induction, the spikes of the postsynaptic neuron will effect the mapping from its membrane potential to those of its postsynaptic partners.

We also assumed that presynaptic spikes deterministically and uniformly impact the postsynaptic response, and thus apparently neglected the stochasticity in synaptic transmission, and in particular systematic variations in synaptic efficacy brought about by short-term synaptic plasticity. Nevertheless, these presynaptic features are compatible with our theory. The stochasticity of synaptic transmission, due to a baseline level of synaptic failures, is straight-forward to incorporate in the model by reducing the effective presynaptic firing rate, which can thus be interpreted as a ‘transmission rate’ instead. In fact, we have already done this while matching hippocampal and neocortical presynaptic statistics (Table 1).

Table 1.

Parameters used in Figures 15 of the main paper (see also Tables 23). Ω- (Ω+) is the rate of switching from the active to the quiescent (from the quiescent to the active) state. The resting potential corresponding to the active and quiescent states is u¯ and -u¯, respectively. Σ¯ii (Σ¯ij) is the posterior variance (covariance) of the presynaptic membrane potential fluctuations in a given state where Σ¯=Qτ2. τrefr is the length of the refractory period and prel is the baseline transmission probability in these synapses (13, 49).

DOI: http://dx.doi.org/10.7554/eLife.10056.007

Figure 1 Figure 2 Figure 3 Figure 4 Figure 5
parameter unit B,C A,B C,D A B A-D ind cor2 NC HP
Ω- Hz 10 10 10 10 10 10
Ω+ Hz 10 0.27 0.67 0.67 4 0.027
u¯ mV 2.4 0 2.3 2.3 0 2.3 0 0 10 2.3
τ ms 20 20 20 20 20 20 20 20 20 20
Σ¯ii mV2 1 16 4 1 1 1 1 1 10 1
Σ¯ij ij mV2 0 0.5 0.5 0 0.5 0 0 * 5 0.5
g Hz 1 1 1 5.3 0.5 5 0.5 0.5 1 2
β mV-1 1 0.4 0.4 0.5 0.4 0.4 1 2 0.1 0.6
τrefr ms 3 3 3 1 3 1 3 3 3 3
prel 1 1 1 1 1 1 1 1 0.5 0.2
N 20 70 20 10 10 10 +0 +20 * *
τpost ms 0 10 10 0 0 0 * * * *
wi 1/N 1/N 1/N 1/N 1/N 1/N * * * *

 *These parameters were fitted to experimentally recorded dendritic responses, see Figure 5—figure supplement 1. The numbers 0 and 20 indicated here are in addition to the number of stimulated synaptic sites in the experiment. For the ind model, this number does not affect the predictions, for the cor2 model its effects could phenomenologically be incorporated into which we chose to fit instead.

Short-term synaptic plasticity, resulting in dynamical changes in synaptic efficacy as a function of the recent spiking history of the presynaptic neuron, is not only a constraint in our framework, but as we have shown in related work, it can act itself as an optimal estimator of the membrane potentials of individual presynaptic neurons (Pfister et al., 2010). Thus, the effects of short-term plasticity can be regarded as a special case of what can be expected from our optimal response: when presynaptic neurons are statistically independent, spikes arriving at different synapses are integrated linearly, and local nonlinearities acting at individual synapses suffice (Figure 1C, see also Materials and methods). However, the importance of nonlinear interactions between inputs from different presynaptic neurons, brought about by dendritic nonlinearities, rapidly increases with the magnitude of presynaptic correlations, especially in large populations (Figure 1D, see also Ujfalussy et al., 2011).

These considerations suggest that short-term synaptic plasticity and dendritic nonlinearities have complementary roles in tuning the postsynaptic response to the statistics of the presynaptic population along the orthogonal dimensions of time and space. The former is useful in the face of temporal correlations private to individual presynaptic neurons (auto-correlations, e.g., brought about by spike frequency adaptation, Pfister and Surace, 2014), while the latter is matched to spatio-temporal correlation patterns present across the presynaptic population.

Inhibitory neurons

We focused on the nonlinear integration of excitatory inputs in the dendritic tree of cortical neurons that have been extensively studied and described over the past decades, giving rise to a strong body of converging evidence as to their characteristics and mechanisms (Spruston, 2008). Recent work studying the nonlinear interaction between inhibitory and excitatory inputs in active dendrites (Gidon and Segev, 2012; Jadi et al., 2012; Müller et al., 2012; Wilson et al., 2012; Lovett-Barron et al., 2012) demonstrated that local inhibition has a powerful control over the excitability of the dendritic tree.

However, it is not yet clear whether these inhibitory inputs are directly involved in the computation performed by the circuit, just as excitatory neurons but with negative signs (Koch et al., 1982), or, alternatively, they may have a more ancillary role in supporting computations carried out primarily by excitatory neurons (Vogels et al., 2011).

Our theory can be extended to include both possibilities, by allowing inhibitory inputs to contribute to the computational function, f(u), with negative weights, or by considering them as providing auxiliary information about the common state of the excitatory presynaptic ensemble, especially when this state is in the more suppressed regime. Indeed, our preliminary results suggest that such an extension of our theory (Ujfalussy and Lengyel, 2013) successfully accounts for the interaction of (excitatory) Schaffer collateral inputs with the feedforward inhibitory effects of the temporo-ammonic pathway (Remondes and Schuman, 2002), likely mediated by interneurons delivering dendritic inhibition (Dvorak-Carbone and Schuman, 1999).

In the present paper we focused on dendritic integration in pyramidal neurons because dendritic nonlinearities have traditionally been more extensively characterized in this cell type, but our theory equally applies to synaptic integration in other types of neurons, including inhibitory interneurons. Therefore, our theory predicts a qualitative similarity of dendritic integration in different neuron types (i.e. interneurons versus principal cells) when they receive inputs from overlapping presynaptic populations. Indeed, it has been recently found that inhibitory interneurons can exhibit dendritic NMDA spikes under certain experimental circumstances (Katona et al., 2011; Chiovini et al., 2014) in addition to standard sublinear integration. The differences between dendritic integration in excitatory and inhibitory neurons could be attributed to their different computational function, f(u), or differences in the specific presynaptic populations innervating them.

Adaptation of dendritic nonlinearities to presynaptic statistics

According to our theory, the optimal response depends on prior information about the input statistics. Consequently, for dendritic processing to approximate the optimal response, this prior information needs to be implicitly encoded in the form of the particular nonlinearity a dendrite expresses. Therefore, our theory predicts an ongoing adaptation of dendritic nonlinearities to presynaptic firing statistics across several time-scales.

First, there is a simple yet potent mechanism implicit in our theory that can ensure that a match of dendritic integration to presynaptic statistics is maintained as those statistics are changing over time. This is based on the observation that essentially instantaneous, albeit probably incomplete, adaptation can occur even without specific changes in the integrative properties of dendrites per se, simply due to the fact that a critical level of input synchrony is required to elicit dendritic spikes, and so the same sigmoid-looking dendritic transfer function can be used as superlinear, linear, or sublinear, depending on which part of its input range is being used (Figure 3—figure supplement 2).

Second, to match the more specific modulation of the statistics of presynaptic activities by global cortical states (Crochet et al., 2011; Mizuseki and Buzsaki, 2014), dendritic integration may also be modulated by these states. As different cortical states are typically accompanied by changes in the neuromodulatory milieu (Hasselmo, 1995; Harris and Thiele, 2011), neuromodulators may be the ideal substrates to ensure that dendritic integration also changes according to the current cortical activity mode. This may provide a functional account of changes in the excitability of the dendritic tree as dynamically regulated by acetylcholine and monoamines (Sjöström et al., 2008).

Third, experience-dependent synaptic plasticity can gradually change the statistics of the presynaptic population activity implying that the optimal form of input integration should also change as a function of experience. We propose that branch-specific forms of plasticity of dendritic excitability (Losonczy et al., 2008; Makara et al., 2009; Müller et al., 2012) may have a functional role in enabling dendrites to adjust the form of input integration to such slowly developing and long-lasting changes in the statistics of their inputs.

Finally, whether inputs from two presynaptic cells are integrated linearly or nonlinearly in a dendrite depends critically on the distance between their synapses within the dendritic tree (Polsky et al., 2004; Losonczy and Magee, 2006). Our theory requires nonlinear integration of inputs from neurons with statistically dependent activity, predicting a mapping of presynaptic correlations on the postsynaptic dendritic tree. Local electrical and biochemical signals can drive synaptic plasticity (Larkum and Nevian, 2008; Govindarajan et al., 2011; Winnubst et al., 2015) and rewiring (DeBello, 2008) leading to synaptic clustering of correlated inputs along the dendritic tree (Kleindienst et al., 2011; Takahashi et al., 2012).

A combination of all these mechanisms may be crucial for achieving and dynamically maintaining, at the level of individual neurons, a detailed matching of dendritic nonlinearities to presynaptic statistics. Thus, our theory provides a novel framework for studying a range of phenomena regarding the dynamical regulation of dendritic nonlinearities from the perspective of circuit-level computations.

Materials and methods

Source code for reproducing the analyses and simulations presented in the paper as well as the experimental data we used for testing our predictions are available online (https://bitbucket.org/bbu20/optimdendr).

Computing the optimal response

In order to study the optimal form of input integration with realistic input statistics, we need to make two important assumptions. First, we need to assume a particular algebraic form for the computation that a neuron performs. Second, we need to define what the relevant presynaptic statistics are, that is, the membrane potential and spiking dynamics of the presynaptic population under naturalistic in vivo circumstances. Given these two assumptions, the theory uniquely defines the optimal response of a neuron to any input pattern. The optimal response has qualitatively similar properties whether computations are defined as mappings between pre- or postsynaptic membrane potentials or firing rates (Figure 1—figure supplement 1).

Throughout the paper the term input refers to the spatio-temporal spiking pattern impinging the neuron while the response of a neuron refers to its (subthreshold) somatic membrane potential (or firing rate, see below). All parameters used in the paper are given in Table 1 or in the caption of the corresponding Figure.

Postsynaptic computation

We assumed the postsynaptic computation to be linear, i.e. the dynamics of the postsynaptic membrane potential v(t) evolves according to a weighted sum of the presynaptic membrane potential values, u(t) (cf. Equation 1):

τpostv˙(t)=-v(t)+i=1Nwiui(t) (5)

where τpost is the time constant of the postsynaptic neuron, N is the number of presynaptic neurons, and wi is the computational weight assigned to presynaptic neuron i. As the postsynaptic neuron cannot access presynaptic membrane potentials, u, directly only the spikes the presynaptic cells emit, s, (Figure 1, Equation 2), the optimal response (that minimizes mean squared error) is the expectation of Equation 5 under the posterior distribution of the presynaptic membrane potential at time t, u(t) given the history of presynaptic spiking up to that time, s(0:t) (cf. Equation 3):

τpostv~˙(t)=-v~(t)+P(u(t)|s(0:t))i=1Nwiui(t) du(t) (6)

Throughout the paper we call the output of Equation 6 the optimal response and compare its behavior to input integration in the dendrites of cortical pyramidal cells.

Table 1 shows the values of the parameters in Equation 5 (N, τpost, and wi) used in the simulations. In short, to illustrate the contributions of inference to Equation 6 (the term including the integral), we used τpost=0 in Figures 1, 3 and 4 as well as in all Supplemental Figures, unless otherwise stated. We used τpost=10 ms in Figure 2 to aid comparison with experimental data and fitted τpost to data for Figure 5. Throughout the paper we used wi=1/N, except in Figure 5 where we fit N and wi=w jointly to the data.

Presynaptic statistics

Computing the posterior, P(u(t)|s(0:t)) in Equation 6 requires a model for the joint membrane potential and spiking statistics of the presynaptic population, P(u,s) (see also Equation 4). For mathematical convenience, we present some of our results below in discrete time with time step size δt, which we will eventually take to zero to derive time-continuous equations. We distinguish discrete and continuous time results by using time as an index versus as an argument of the corresponding time-dependent quantities, e.g. ut vs. u(t).

We describe the joint statistics of presynaptic membrane potentials and spikes by a hierarchical generative model that has three layers of variables (Figure 2—figure supplement 1A,B). The global state of the system is described by a single binary variable, z that switches between a quiescent (-) and an active (+) state following first-order Markovian dynamics (see Appendix for the extension to an arbitrary number of states). The transition rates to the active and quiescent states are given by Ω+ and Ω-, respectively.

The dynamics of (subthreshold) membrane potentials u are modeled as a multivariate Ornstein-Uhlenbeck (mOU) process, which yields random walk-like behavior that (unlike simple Brownian motion) decays exponentially towards a baseline defined by the resting potential u¯, which in turn depends on the momentary global state of the system, zt:

P(ut|ut-δt,zt)=𝒩(ut;(1-δtτ)ut-δt+δtτ u¯(zt),δt Q) (7)

where τ is the presynaptic time constant of the exponential decay, and Q is the ‘process noise’ covariance matrix determining the variance of individual membrane potentials (together with τ) and, importantly, also the correlations between presynaptic neurons. It is straightforward to extend the model by also making these parameters state- (or in fact, neuron-) dependent.

Note that both the state switching and mOU components of this model introduce both spatial and temporal statistical dependencies in the membrane potentials and spike trains of presynaptic cells. In the rest of this paper, we informally refer to any statistical dependency (second or higher order) as ‘correlation’, and we write ‘auto-correlation’ when we refer to the correlations between the membrane potential (firing rate) values of the same neuron at different times, and ‘cross-correlation’ when referring to the correlation between the activities of two different cells (at the same time, or at different times). Also note that temporal and spatial correlations can not be studied in complete isolation in the case of smoothly varying signals, such as membrane potentials, as the cross-correlation between the activity of two presynaptic neurons always has a characteristic temporal profile. While it is possible to consider a presynaptic neuronal population completely lacking spatial correlations (i.e. independent presynaptic neurons, as in Figure 1C), having a population with only spatial but not temporal correlations would require the membrane potentials of the individual neurons to be temporally white noise – which is so far removed from reality that we did not consider this case worth pursuing.

More specifically, the timescale of temporal correlations (auto-correlations) in the model depend on the transition rates of the switching component, Ω+ and Ω-, and the presynaptic time constant of the mOU component, τ, such that cells are auto-correlated as long as τ, Ω+-1, and Ω--1>0. Spatial correlation (cross-correlations between different presynaptic neurons) also emerge from both components. First, the pairs of presynaptic neurons corresponding to the non-zero off-diagonal elements of Q matrix of the mOU component become correlated. Second, synchronous state transitions during state switching in the presynaptic ensemble introduce positive correlations. Importantly, in both the temporal and spatial domains, while the mOU process can only introduce second-order correlations (i.e. it makes membrane potentials be distributed according to a multivariate normal), the switching process introduces higher order correlations (such that membrane potentials are not normally distributed any more). These higher order correlations are stronger when the effect of state-switching is large relative to the membrane potential fluctuations within a single state.

Finally, instead of modeling the detailed dynamics of action potential generation, we model spiking phenomenologically by introducing a single discrete variable, si,t, for each presynaptic neuron that represents the number of spikes neuron i fires in time step t. (Note that in the limit δt0 this variable becomes binary, i.e. there can never be more than one spike fired in a δt time window.) Spiking in each cell only depends on the membrane potential of that cell, and follows an inhomogeneous Poisson process with the firing rate, r, being an exponential function of the membrane potential (Gerstner and Kistler, 2002):

P(st|ut)=i Poisson(si,t;δt ri,t),with ri,t=geβui,t (8)

where β describes how deterministically the cell switches to firing at threshold (u=0) and g is the firing rate at that threshold. We modeled the absolute refractory period by not allowing the generation of spikes (i.e. setting ri,t=0) within a time window of length τrefr following each spike in a cell, regardless of its membrane potential.

The parameters of the presynaptic statistics used in the paper are given in Table 1. Examples of neural dynamics generated by the model are shown in Figures 1, 2, 4, and 5.

Inference and the optimal response

Our goal was to infer the posterior distribution of the membrane potential based on the spiking pattern observed up to time t, P(ut|s0:t).

We first show that linear dendritic integration is sufficient when presynaptic neurons are statistically independent. We start by noting that by marginalising out the past membrane potential history of the presynaptic cells and using Bayes’ rule, the posterior can always be written as

P(ut|s0:t)=P(u0:t|s0:t)du0:t-δt P(s0:t|u0:t)P(u0:t)du0:t-δt (9)

and as the spikes of each neuron are independent from all other neurons conditioned on its own membrane potential history (Equation 8), this can be rewritten as

P(ut|s0:t)iP(si,0:t|ui,0:t)P(u0:t)du0:tδt (10)

In the special case when we assume that presynaptic neurons are statistically independent, i.e. their prior factorizes P(u0:t)=iP(ui,0:t), the posterior also becomes factorised

(11)P(ut|s0:t)iP(si,0:t|ui,0:t)P(ui,0:t)dui,0:tδt(12)=iP(ui,t|si,0:t)

which in continuous time reads simply as

P(u(t)|s(0:t))=iP(ui(t)|si(0:t)) (13)

Thus, taking our usual assumption that the postsynaptic computation is linear (Equation 5), the optimal response in Equation 6 can be written as

τpostv~˙(t)=-v~(t)+i=1Nwiui(t) P(ui(t)|si(0:t))dui(t) (14)

indicating that integration of inputs from different neurons is linear in this case (it is a weighted sum of terms each depending on just a single presynaptic neuron). However, even in this case, note that integration of input spikes from the same presynaptic neuron, i.e. the result of the integral over each ui(t) as a function of si(0:t), is still nonlinear in general (Pfister et al., 2010). Indeed, Equation 14 including these local nonlinearties was used to compute the linear response in Figure 1C–D.

In the general case inference can be performed using filtering such that in each step we update the inferred state of the hidden variables, zt and ut, using information from two different sources: the likelihood of emitting a particular spiking pattern (observation) and the dynamics of the hidden variables combined with the previous estimate (innovation):

P(ut=u,zt=z|s0:t)P(st=s|ut=u)==zP(zt=z|ztδt=z)duP(ut=u|utδt=u,zt=z) P(utδt=u,ztδt=z|s0:tδt) (15)

where the likelihood P(st|ut) is defined by Equation 8, the dynamics of the global state variable P(zt|zt-δt) is first order, Markovian (see above) and the state-dependent membrane potential dynamics P(ut|ut-δt,zt) is given by Equation 7. Equation 15 thus defines a mapping between the posterior distribution of the hidden variables in the previous time step (last term on RHS) and their current distribution (LHS). The posterior over membrane potentials can then be obtained by simply marginalising out the state variable:

P(ut|s0:t)=z P(ut,zt=z|s0:t)  (16)

For the following, it is useful to represent the posterior as a product of two terms:

P(ut=u,zt=z|s0:t)=P(zt=z|s0:t)P(ut=u|zt=z,s0:t) (17)

As the state variable is binary, its posterior is a Bernoulli distribution which we parametrize by ζ, without loss of generality:

P(zt=+|s0:t)=ζt (18)

However, in general, the posterior of the membrane potentials conditioned on the current state, zt, can be arbitrarily complex. To allow an analytical reduction of the inference process, we adopted an assumed density filtering approach in which this distribution is moment-matched in each time step by a multivariate normal distribution which is thus described by two (sets of) parameters, its mean, μt(z), and covariance, Σt(z):

P(ut=u|zt=+,s0:t) 𝒩(u;μt+,Σt+) (19)

with the analogous equation for the posterior of ut conditioned on zt being in the - state.

One advantage of this parametric approach is that inference (filtering) can be implemented by updating only the parameters describing the (approximate) posterior distribution (Equations 18–19): ζt, μt(z), and Σt(z). In the Appendix we derive an analytical form for these parameter updates resulting in the following set of differential equations:

(20)ζ˙=ζ(1ζ)(γ¯+γ¯)+ζ(1ζ)s(t)TΓ1(γ+γ)+(1ζ)Ω+ζΩ(21)μ˙+=u¯+μ+τ+βΣ+(s(t)γ+)+1ζζΩ+(μμ+)(22)Σ˙+=2τ(τ2QΣ+)β2Σ+Γ+Σ++1ζζΩ+[(ΣΣ+)+(μμ+)(μμ+)T]

where γ(z) (Γ(z)) is a state-dependent vector (diagonal matrix) of which the elements γi(z)=Γii(z)=geβμi(z)+12β2Σii(z) are the expected firing rates of the neurons in a given state, γ¯(z)=iγi(z) is the expected total population firing rate in state z, and Γ=ζ Γ++(1-ζ)Γ- is the expected firing rate of the cells averaged across states. In these equations, the spike trains of the presynaptic neurons are represented by the sum of Dirac-delta functions in continuous time and are denoted by s(t), to be distinguished from its discrete time analog, st, such that s(t)=limδt0st/δt (see Equation A62 in the Appendix). The differential equations for the conditional mean and variance in the - state, μ˙- and Σ˙-, are analogous to Equations 21–22. The absolute refractory period is taken into account by setting γi=Γii=0 after each observed spike for the duration of the refractory period, τrefr, thus omitting the effect of the likelihood (terms containing γ(z) or Γ(z)) from Equations 20–22.

The first term in Equation 20 captures the decay in ζt that is proportional to the difference in the state conditional firing rates in the absence of presynaptic spikes; the second term expresses the instantaneous change in ζt after observing a spike, proportional to both the state estimation uncertainty, ζ(1-ζ), and the differences in the conditional firing rates (γ+-γ-); and the last term captures the decay of ζt to its steady state in the absence of observations. The filtering equations for the conditional mean and covariance (Equations 21–22) are each composed of three terms: the first term expresses the decay of the variable towards its baseline in the absence of observations; the second term captures the effect of the current observation (i.e. the presence or absence of a spike) on the variable; and the third term describes the changes in the variable caused by potential state transitions. This can be viewed an extension and generalization of earlier work deriving the equivalents of Equations 21–22 for the special case of a single neuron without state-switching dynamics (Pfister et al., 2010).

Another advantage of the parametrization of the posterior we chose is that computing the optimal response, i.e. the posterior expectation of the simple linear functions that we consider in this paper, becomes straightforward (cf. Equation 6):

τpostv~˙(t)=v~(t)+iwi(ζ(t)μi+(t)+(1ζ(t))μi(t)) (23)

In order to verify the assumed density filtering approximations used above we numerically integrated the system of differential equations (Equations 20–22) using the software package R (R Core Team, 2012; Soetaert et al., 2010) and compared the results with those obtained using standard particle filters (Doucet et al., 2001). In these simulations we used 500,000 particles to evaluate Equation 15 with 1 neuron and 2 states. Figure 2—figure supplement 1C shows that the results of assumed density filtering are essentially identical to those of particle filtering, confirming that the approximations we used were valid.

Dendritic approximation of the optimal response

Here we first describe a simple canonical model of dendritic integration following Poirazi and Mel (Poirazi and Mel, 2001), and then show that it provides an approximation to the optimal response (Equations 20–23) in the limiting case in which presynaptic dynamics are dominated by simultaneous switching between a quiescent and an active state. In this simple dendritic model, inputs within a branch are integrated linearly:

v˙lin =-𝒜vlin+s(t)-𝒞 (24)

where vlin is the variable linearly integrating inputs with weight , dendritic time constant 1/𝒜 and steady state value -𝒞/𝒜 (in the absence of spikes), and s(t)=isi(t) denotes the spike train of presynaptic neurons, collecting all spikes from the presynaptic population. The actual dendritic response, vden, is then given by mapping this linear response through a sigmoidal nonlinearity, scaled to be between vmin and vmax (Figure 3A, inset):

vden(t)=vmin+(vmax-vmin)11+e-vlin(t) (25)

To demonstrate that this reduced model of dendritic integration closely approximates the optimal response, we first note that under appropriate conditions (τpost is small, N is large, Q is diagonal, and β is small relative to the diagonal elements of Q) the dynamics of the optimal response are dominated by the state switching process (Equation 20; see Appendix). Thus, the optimal response essentially follows the inference about the global state variable, ζ, up to linear rescaling and filtering:

v~(t)  u¯-+(u¯+-u¯-)ζ(t) (26)

with u¯+ and u¯- respectively denoting the resting membrane potential in the active and quiescent states. As Equation 26 is linear, all nonlinear interactions, corresponding to dendritic nonlinearities, must be contained in the temporal dynamics of the posterior probability of this global state variable being in the active state, ζ(t), which can be expressed as

ζ ˙ ζ(1-ζ)[B s(t)-C] (27)

where constants B and C depend on the parameters of the presynaptic statistics (see Appendix). Note that the fact that ζ(1-ζ) multiplies Equation 27 expresses the simple intuition that the size of the update to ζ (the posterior probability of z=+) in response to incoming information (presence, B term, or absence of a spike, C term) should be proportional to our current (posterior) uncertainty about z; and since the posterior is a Bernoulli distribution, the uncertainty associated with it is simply ζ(1-ζ).

The solution of Equation 27 can be expressed in a form that is similar, albeit not identical (see below), to the canonical model for dendritic integration (Equations 24–25). This form requires the linear integration of incoming spikes

ν˙=B s(t)-C (28)

and the temporal evolution of ζ is expressed as a sigmoidal function of the linearly integrated inputs ν:

ζ(t)=11+e-ν(t) (29)

Thus, dendrites with sigmoidal nonlinearity are near-optimal when their synaptic inputs switch between a quiescent and an active state.

The main difference between the dendritic integrator and the optimal response is that the dynamics of spike integration imply exponential decay towards a finite baseline in the former (there is a negative term in Equation 24 which is scaled by vlin) and steady decrease towards negative infinity in the latter (the only negative term in Equation 28 is a constant, independent of ν). This is because the approximations we used for deriving Equation 27 were accurate only in the quasi-static case, when the state switching dynamics are infinitely slow. In this case, remote and more recent observations should have identical effects on the current value of ζ as they all correspond to the same underlying state. In the more general case, when state switching occurs with non-zero probability, more remote observations likely correspond to a state which has changed in the meantime, and should thus count for less, such that their effect on the current value of ζ should decay with time – leading to leaky integration of incoming spikes, similar to that in Equation 24.

Simplified neuron models

To compare quantitatively the response of the linear and nonlinear dendrites to the optimal response in a computational task using realistic input statistics, we divided the presynaptic population into four groups (cell assemblies), where neurons within each group were statistically dependent (either through simple second-order correlations, Figure 4—figure supplement 2, or through sharing a common state variable, Figure 4 and Figure 4—figure supplement 2) while neurons from different groups were independent (Figure 4A). In this task we used 4 different versions of the simplified neuron model (Figure 4B):

  • The linear model responded to all incoming spikes with an identical postsynaptic potential (PSPs) characterized by its amplitude (w), decay time constant (τ) and the resting potential (v¯): 

v~˙(t)=v¯-v~τ+w s(t) (30)

where s(t) is the total incoming spike train (as before). This model had three parameters.

  • The model with a somatic nonlinearity had linear dendrites but a nonlinear soma. Motivated by our analytical calculations we used a sigmoidal nonlinearity: 

v~g(t)=ag1+e-βg(v~(t)-θg)-v¯g (31)

where we computed v~(t) as defined in Equation 30 above with w=1 and v¯=0 (as these parameters were interchangeable with βg and θg). This model had five free parameters.

  • The random dendrites model had four nonlinear dendritic subunits, each receiving inputs from a unique set of 10 neurons randomly selected from the four presynaptic assemblies, and integrating these inputs using a sigmoidal nonlinearity (Equations 30-31). Each subunit had its own set of five parameters, and the outputs of the subunits were simply averaged in the soma (without loss of generality), resulting in 20 parameters in total.

  • The clustered dendrites model was similar to the random dendrites model, with the important difference that neurons in each cell assembly selectively targeted a single nonlinear dendritic branch. Since the presynaptic statistics was the same for all four branches, we constrained the parameters of the four dendritic subunits to be identical and this model had only five parameters (the same performance was achieved when we relaxed this constraint and optimised all 20 parameters).

To fit the models we generated 240 s-long samples of presynaptic activity and optimized the parameters of the models to minimize the squared error between the signal (v, the true average of the stimulated presynaptic potentials, cf. Equation 5 with wi=1N) and their estimates (v^, the outputs of the models), averaged over the duration of the sample:

ϵestimation=1Tt=1T(v^t-vt)2 (32)

After training, we tested the different models in cross-validation, on a novel 120 s-long input sequence, and quantified their performance by the fraction of variance unexplained, i.e. the temporally averaged squared error, ϵestimation, normalized by the variance of the signal (as a sensible upper limit on the error – achievable by an estimator that predicts the prior mean, ignoring incoming spikes altogether):

ϵ¯estimation=ϵestimationVar[v] (33)

where Var[v]=1T-1t(vt-E[v])2 and E[v]=1Ttvt. (Normalization was unnecessary during training because the parameters of the models that we were optimizing obviously did not influence the variance of the signal.) Figure 4 shows 1-ϵ¯estimation, i.e. the fraction of variance explained as ‘performance’, and Figure 1—figure supplement 3, Figure 3—figure supplement 2, and Figure 4—figure supplement 2 show ϵ¯estimation as ‘estimation error’.

Fitting and predicting experimental data

To predict dendritic integration in hippocampal and neocortical neurons we fitted the parameters describing presynaptic statistics in our model, P(u) and P(s|u), to the statistical patterns in the activity of their respective presynaptic populations.

The basal dendrites of neocortical layer 2/3 pyramidal cells are targeted by neighbouring pyramidal neurons as well as by neurons from layer 4 (Douglas and Martin, 2004). We used in vivo intracellular paired recordings from layer 2/3 pyramidal neurons in the barrel cortex (Poulet and Petersen, 2008; Gentet et al., 2010; Crochet et al., 2011) to set the parameters of our model to reproduce the salient features of the presynaptic population dynamics during quiet wakefulness (Tables 12).

In the hippocampal experiments we stimulated synapses on the proximal dendrites of CA3 neurons targeted by recurrent collaterals of neighbouring pyramidal cells (Andersen et al., 2007). We fitted the presynaptic statistics to in vivo population activity patterns recorded from the hippocampus during quiet wakefulness, characterised by sharp wave (SPW) activity (Csicsvari et al., 2000). As intracellular recordings from CA3 pyramidal neurons during SPW activity in the awake animal are not available, we fitted the presynaptic statistics to awake extracellular data from CA3 (Csicsvari et al., 2000; Grosmark et al., 2012) and intracellular (Ylinen et al., 1995; English et al., 2014) data from CA1 pyramidal neurons (Tables 1 and 3).

Comparing the optimal response to dendritic integration

We used four different parameter sets (models) to describe the activity of the presynaptic population (Figure 1). The parameters of the HP and NC models were fitted to in vivo recordings from the corresponding presynaptic populations as described above. As a control, we used two simpler models with no state switching dynamics. The cor2 model had correlated membrane potential fluctuations with all cross-correlations between presynaptic neurons being the same, -1N-1ρ1. Neurons in the last model, ind, had independent membrane potential fluctuations. The HP, NC, and ind models had no free parameters, while parameter ρ of the cor2 model was left free and later tuned to fit dendritic data. Note that to fit supralinear cortical responses, ρ had to be tuned to unnaturally large negative values in this model (Figure 5—figure supplement 1A) – and it still produced significantly poorer fits than the HP and NC models (Figure 5).

After setting the parameters of the presynaptic population, we computed the optimal response by numerically integrating Equations 20–22 in the software package R (R Core Team, 2012; Soetaert et al., 2010). When comparing the optimal response to experimental data, we assumed that each uncaging event corresponded to a single spike at the presynaptic axon terminal, and spines not showing measurable gluEPSP were considered to be non-stimulated. For all four presynaptic parameter sets, we varied two postsynaptic parameters to fit the responses of the optimal estimator to dendritic integration data, i.e. the somatic membrane potential traces recorded in our in vitro experiments. These two parameters were the weight w of the presynaptic neurons and the time constant of the postsynaptic filtering, τpost (Equation 5). To avoid overfitting, we assumed that all presynaptic neurons had equal weight, i.e. i wi=w. A final free parameter that we had to consider was the number of synapses that were in the same functional cluster as the synapses we stimulated in our experiments – where the term ‘functional cluster’ refers to a set of synapses for which the presynaptic cells are correlated. This parameter was irrelevant for the ind model (by definition), it was fixed at 20 for the cor2 model (because its effects on the optimal response were largely indistinguishable from that of varying ρ, see above), and it was tuned to fit dendritic integration for the state-switching models (NC and HP). In sum, the number of free parameters used to fit dendritic integration data was 2 for the ind model and 3 for the cor2, NC, and HP models. We confirmed that the higher number of free parameters in the latter models did not result in an unfair advantage in fitting performance by using Bayesian Information Criterion (BIC), rather than squared error (see below), as our measure of performance. BIC includes an explicit term penalizing the number of parameters, and our results were not qualitatively affected by it: fitting the model using the relevant in vivo statistics resulted in 3500± 940 (NC, mean± s.d.) and 2000± 1100 (HP) higher BIC scores than when using independent statistics (where each unit of BIC difference corresponds to a likelihood that is higher by a factor of e2.71).

We fitted each recorded neuron independently using these parameters by minimizing the mean squared error between the predicted, v~, and the recorded postsynaptic membrane potential (averaged across repetitions of the same stimulation in the same cell), v¯*:

ϵfitting=1Tt=1T(v~t-v¯t*)2 (34)

To be able to compare results across different neurons and different stimulation protocols, we normalized the error by the total variance of the data:

ϵ¯fitting=ϵfittingVar[v¯*] (35)

where Var[v¯*]=1T-1t(v¯t*-E[v¯*])2 and E[v¯*]=1Ttv¯t*. A natural lower bound of our fitting error was the intrinsic variability of the data, so we computed the mean of the variance of the experimental data across repetitions, normalized by the total variance of the data:

ϵ¯min=1L-1l=1L1Tt(vt,l*-v¯t*)2Var[v¯*] (36)

where vt,l* is the raw data before averaging across repetitions, L is the number of repetitions using the same stimulation protocol in the same cell and v¯t*=1LLvt,l*. Figures 5D,H and Figure 5—figure supplement 1G show ϵ¯fitting as the ‘fitting error’ and ϵ¯min as ‘var’. The best fitting parameter values for the postsynaptic time constant, τpost, and total number of neurons in a functional cluster, N, are shown in Figure 5—figure supplement 1B,C.

Experimental methods

Neocortex

Slice preparation and electrophysiology

Acute sagittal brain slices (300 μm) incorporating both visual and somatosensory cortex were prepared from 3–6 week-old Sprague-Dawley rats as previously described (Sjöström and Häusser, 2006) and in accordance with institutional and national guidelines. Experiments were carried out at 32°C–35°C in artificial cerebrospinal fluid (ACSF) containing (in mM): NaCl 125, KCl 2.5, glucose 25, NaH2PO4 1.25, NaHCO3 25, MgCl2 1, CaCl2 2 (pH 7.3 when bubbled with 95% O2 and 5% CO2). Somatic whole-cell recordings were obtained with a Multiclamp 700B amplifier (Molecular Devices, Sunnyvale, CA), and data was acquired at 50 kHz using custom-written software in Matlab 7.2 (Mathworks, Natick, MA) interfacing with an ITC-18 A/D board (Instrutech, Holliston, MA). Patch pipettes had a resistance of 3–6 MΩ when filled with a solution containing (in mM): KMeSO4 130, HEPES 10, KCl 7, MgATP 2, Na2ATP 2, Na2GTP 0.3, EGTA 0.05 (pH 7.2) and Rseries was <30 MΩ. For visualization of cell morphology Alexa Fluor 594 (100 μM; Invitrogen, Carlsbad, CA) was added to the internal solution.

Two-Photon imaging and uncaging

Simultaneous 2-photon imaging and uncaging was performed using a dual galvanometer-based scanning system (Prairie Technologies, Middleton, WI) using two Ti:sapphire pulsed lasers (MaiTai, Spectra-Physics, Santa Clara, CA), one tuned to 840 nm for imaging cell morphology, and another tuned to 720 nm for photolysis of MNI-caged-L-glutamate. Neurons were visualized using an Olympus BX51WI objective (60x, 0.9 NA; Olympus, Melville, NY). Two-photon glutamate uncaging was carried out based on previously published methods (Gasparini and Magee, 2006; Losonczy and Magee, 2006; Branco and Häusser, 2011). MNI-caged-L-glutamate (12 mM, Tocris Cookson, UK) was dissolved in (in mM): NaCl 125, KCl 2.5, HEPES 10, CaCl2 2, MgCl2 1, glucose 25, and puffed locally. To block NMDA receptors (Figure 5E,J, blue squares), 500 μM D-AP5 was included in the glutamate puffing pipette.

Stimulation and data analysis

A short burst of presynaptic uncaging events equally spaced in time with inter stimulus interval (ISI) between 1 and 20 ms was applied on 1–7 visually identified dendritic spines. For each dendritic branch either the number of stimuli or the ISI was varied. Uncaging exposure time was 100–500 μs and the inter-trial interval was 10 s. All data was acquired using custom written software in Matlab 7.2 (Mathworks). The original data recorded at 50 kHz were averaged across identical trials, filtered with a Gaussian kernel with σ=0.2 ms and subsampled at 2 kHz for analyses. All spines were responsive in the neocortical experiments.

Hippocampus

Slice preparation and electrophysiology

Adult male Sprague-Dawley rats (8–12 week-old) were used to prepare transverse slices (400 μm) from the hippocampus similarly to that described previously (Losonczy and Magee, 2006), according to methods approved by the Janelia Farm Institutional Animal Care and Use Committee and the Animal Care and Use Committee (ACUC) of the Institute of Experimental Medicine, Hungarian Academy of Sciences, and in accordance with 86/609/EEC/2 and DIRECTIVE 2010/63/EU Directives of the EU. Experiments were carried out at 33°C–35°C in artificial cerebrospinal fluid (ACSF) containing (in mM): NaCl 125, KCl 3, glucose 25, NaH2PO4 1.25, NaHCO3 25, MgCl2 1, CaCl2 1.3, Na-pyruvate 3, and ascorbic acid 1 saturated with 95% O2 and 5% CO2. Somatic whole-cell recordings were obtained with BVC-700 amplifier (Dagan, Minneapolis, MN) in the active ‘bridge’ mode, filtered at 3 kHz and data was acquired at 50 kHz. Patch pipettes had a resistance of 2–6 MΩ when filled with a solution containing (in mM): K-gluconate 120, KCl 20, HEPES 10, NaCl 4, Mg2ATP 4, Tris2GTP 0.3, phosphocreatine 14, complemented with 100 μM Alexa Fluor 488 (Invitrogen-Molecular Probes, Eugene, OR) and 0.1–0.3% biocytin (Sigma), pH=7.25 and Rseries was <30 MΩ. Alexa Fluor 488 fluorescence or biocytin labeling with immunoperoxidase reaction was used for post hoc verification of the localization of neurons along the proximodistal axis of CA3. All CA3 neurons included in this study had resting membrane potentials between −62 and −72 mV. Cells were hyperpolarized when necessary to avoid action potential firing during synaptic stimulation.

Two-Photon imaging and uncaging

A dual galvanometer based two photon scanning system (Prairie Technologies, Middleton, WI) equipped with an Olympus BX-61 microscope (60X, 0.9 NA objective) was used to image Alexa 488-loaded neurons and to uncage glutamate at individual dendritic spines as described (Losonczy and Magee, 2006; Makara and Magee, 2013). Two ultrafast pulsed laser beams (Chameleon Ultra II; Coherent, Auburn, CA, USA) were used, one at 920 nm for imaging Alexa 488 and the other at 720 nm to photolyze MNI-caged-L-glutamate (Tocris Cookson, Ballwin, MO, USA; 10 mM applied through a pipette above the slice). Laser beam intensity was independently controlled with electro-optical modulators (Model 350–50, Conoptics, Danbury, CT, USA). Unitary gluEPSP amplitude and rise time was close to that of mEPSPs as measured by sucrose application at dendritic segments 70–168 μm from the soma as described before (Magee and Cook, 2000; Makara and Magee, 2013). To standardize these experiments, results were included in the analysis only if 1) at least 65% of the selected spines were responsive (see below), 2) the average amplitude of the successful unitary gluEPSPs was 0.2–0.6 mV and maximum unitary gluEPSP amplitude was 1.2 mV, 3) at least 5 mV expected amplitude was achieved, and 4) unitary responses were stabile with repeated stimulation. To block NMDA receptors 50 μM AP5 was added to both bath solution and the puffing pipette solution.

Stimulation and data analysis

Dendritic branches on basal dendritic segments 100–160 μm from the soma were stimulated by synchronous uncaging of MNI-glutamate at a spatially clustered set (1–32) of visually identified spines using 0.2 ms uncaging duration with different intervals (in the range of 0.1–5 ms) between synapses. For each dendritic branch either the number of stimuli (control) or the ISI (NMDA block) was varied. To detect non-responsive spines, we fitted the individual responses with a double-exponential function:

ϕ(t)=ϕ0(e-(t-t0)/τ1-e-(t-t0)/τ2) (37)

Spines were classified as non-responsive if the rise time of the gluEPSP (τ2) was slower than 15 ms or its start time (t0) was more than 5 ms (relative to the stimulation time). The original data recorded at 50 kHz were averaged across identical trials, filtered with a Gaussian kernel with σ=0.2 ms and subsampled at 2 kHz for analyses.

Acknowledgements

We thank J-P Pfister for valuable insights and comments on earlier versions of the manuscript, B Gutkin, Sz Káli, and A Gulyás for useful discussions, and J Magee, M Häusser and D Wolpert for support and helpful comments.

Appendix

A Derivation of the optimal response

Here we derive the filtering equations for the switching multivariate OU (mOU) process introduced in the main text, such that we generalise the two-state model presented there to an arbitrary number of hidden states.

A.1 The generative model of presynaptic activities

The generative model of presynaptic activities has two hidden and one visible layers (Figure 2—figure supplement 1). The state of the system is described by a global discrete state variable, z, and its dynamics is characterised by the state transition matrix, Ω:

P(zt=z|zt-δt=z') =δzz'(1-δtΩ¯z')+(1-δzz')δtΩzz' (A1)

where Ωzz' is the rate of transitioning from state z' to z, and

Ω¯z'=zz'Ωzz', Ωzz'0,δt1maxz' Ω¯z'

and δzz' is the Kronecker delta function. This description becomes identical to the state switching dynamics presented in the main text in the case of a binary state variable, z{-,+}.

The dynamics of the membrane potentials u are modeled as a mOU process where the resting potential, u¯(zt), the time constant of the neurons, τ(zt), and the covariance of the process noise, Q(zt), may depend on the current state, zt:

P(ut|ut-δt,zt)=𝒩(ut;(1-δtτ(zt)) ut-δt+δtτ(zt) u¯(zt),δt Q(zt)) (A2)

For simplicity, in the main text we assumed that only the resting potential depends on the state variable.

The observations are spike counts, s, which are given by an inhomogeneous Poisson process for each neuron independently from the other neurons given its own membrane potential:

P(st|ut)=i Poisson (st,i; δt g eβ ut,i) (A3)

where g is the baseline firing rate and β is a parameter controlling the nonlinearity of the spiking.

A.2 The inference problem

Our goal is to infer the posterior distribution of the current membrane potentials based on the spiking pattern observed up to time t:

P(ut|s0:t)=ztP(ut|zt,s0:t)P(zt|s0:t) (A4)
=ztP(ut,zt|s0:t) (A5)

Inference of the value of the hidden variables requires information from two different sources: the likelihood of emitting a particular spiking pattern (observation) and the dynamics of the hidden variables combined with the previous estimate (innovation).

P(ut=u,zt=z|s0:t)P(st=s|ut=u)fs(δt,u): observationzP(zt=z|ztδt=z)P(ztδt=z|s0:tδt)duP(ut=u|utδt=u,zt=z)P(utδt=u|ztδt=z,s0:tδt)h(δt,u,z)=P(ut=u,zt=z|s0:tδt): innovation (A6)

where the likelihood P(st|ut) is defined by Equation A3, the dynamics of the global state variable P(zt|zt-δt) is defined by Equation A1 and the state-dependent membrane potential dynamics P(ut=u|ut-δt=u',zt=z) is given by Equation A2. Equation A6 is the filtering equation, analogous to Equation 15 in the main text. Note that both the observation fs(δt,u) and the innovation h(δt,u,z) terms depend on δt, the time step used in the inference.

In general, computing Equation A6 is intractable so we take an assumed density filtering approach in which we approximate the posterior in each time step with a particular parametric form and derive how the parameters of this approximation change due to Equation A6. Specifically, we represent the posterior using the following factorisation (without approximation):

P(ut=u,zt=z|s0:t)=P(zt=z|s0:t)P(ut=u|zt=z,s0:t) (A7)

The first term of Equation A7 is the posterior over the current value of the discrete hidden state variable, zt, which is a discrete distribution that can be parametrised by a single (vector) parameter, ζt (still without approximation):

P(zt=z|s0:t)=ζt(z) (A8)

The second term of Equation A7 is the distribution of the membrane potentials conditioned on the current state, which we approximate by a multivariate Gaussian with parameters μt(z) and Σt(z) respectively describing the conditional mean and covariance of the presynaptic membrane potentials:

P(ut=u|zt=z,s0:t)𝒩(u;μt(z),Σt(z)) (A9)

In sum, this approximate posterior has three sets of parameters that fully characterise it: ζt, μt(z), and Σt(z). In Sections A.3-A.6 we derive the update dynamics for these parameters when observing arbitrary spiking patterns.

For notational convenience, we also introduce a further quantity, which can be derived from these parameters: the expected firing rate of the cells in each hidden state, either written as a vector γt(z) or as a diagonal matrix Γt(z), whose elements are

γt,i(z)=Γt,ii(z)=P(ut,i=u|zt=z,s0:t)geβudu=geβμt,i(z)+12 β2Σt,ii(z) (A10)

A.3 Parameter updates

The advantage of describing the elements of the posterior in a parametric form is that we only need to update the parameters of Equations A8-A9. As the parameter updates are derived using Equation A6, in which the innovation and the observation terms depended on δt, the parameter updates will also depend on δt. Eventually, to derive continuous-time dynamics, we will take the limit of δt0. Incidentally, the Gaussian approximation in Equation A9 is also more accurate with small δt.

We rewrite the parameters of the approximate posterior by using auxiliary variables (this also makes it explicit how the update rule depends on δt):

(A11)ζt(z)(δt)=ast(δt,z)bst(δt)(A12)μt(z)(δt)=mst(δt,z)ast(δt,z)(A13)Ξt(z)(δt)=Cst(δt,z)ast(δt,z)(A14)Σt(z)(δt)=Ξt(z)(δt)μt(z)(δt)(μt(z)(δt))T

where the auxiliaries are defined as:

(A15)as(δt,z)=fs(δt,u)h(δt,u,z)du(A16)bs(δt)=zas(δt,z)(A17)ms(δt,z)=ufs(δt,u)h(δt,u,z)du(A18)Cs(δt,z)=uuTfs(δt,u)h(δt,u,z)du

where fs(δt,u) and h(δt,u,z) were defined in Equation A6.

A.4 First-order approximations to parameter updates

Our goal is to derive differential equations for the parameter updates. Therefore we will assume that δt is sufficiently small (i.e. δt0) to use a first order approximation of Equations A11-A13 around δt=0:

ζt(z)(δt)ζt(z)(0)+δtζt(z)(0)(A19)=ast(0,z)bst(0)+δtast(0,z)bst(0)ast(0,z)bst(0)bst2(0)μt(z)(δt)μt(z)(0)+δtμt(z)(0)(A20)=mst(0,z)ast(0,z)+δtmst(0,z)ast(0,z)mst(0,z)ast(0,z)ast2(0,z)Ξt(z)(δt)=Ξt(z)(0)+δtΞt(z)(0)(A21)=Cst(0,z)ast(0,z)+δtCst(0,z)ast(0,z)Cst(0,z)ast(0,z)ast2(0,z)

where

(A22)as(δt,z)=fs(δt,u)h(δt,u,z)duaI+fs(δt,u)h(δt,u,z)duaII(A23)bs(δt)=zas(δt,z)(A24)ms(δt,z)=ufs(δt,u)h(δt,u,z)dumI+ufs(δt,u)h(δt,u,z)dumII(A25)Cs(δt,z)=uuTfs(δt,u)h(δt,u,z)duCI+uuTfs(δt,u)h(δt,u,z)duCII

In particular, in order to compute Equations A19-A21 we will need the value of h(δt,u,z) and h(δt,u,z) with δt=0 (see Section A.8.1):

(A26)h(0,u,z)=ζtδt(z)N(u;μtδt(z),Σtδt(z))h(0,u,z)=zζtδt(z)[(1δzz)ΩzzδzzΩ¯z]N(u;μtδt(z),Σtδt(z))==12ζtδt(z)N(u;μtδt(z),Σtδt(z))==[Tr[Σtδt(z)1Q(z)+2τ(z)I]+2τ(z)(u¯(z)μtδt(z))TΣtδt(z)1(uμtδt(z))(A27)==(uμtδt(z))TΣtδt(z)1(Q(z)2τ(z)Σtδt(z))Σtδt(z)1(uμtδt(z))]

Having derived the general form of the parameter updates we need to apply it to the possible observations, i.e. derive how observing a particular presynaptic spiking pattern at time t changes the posterior distribution over the state variable, P(zt=z|s0:t) and the presynaptic membrane potentials, P(ut=u|zt=z,s0:t). In the limit of δt0 we consider only two alternative outcomes of the observation process: the case of observing zero or one spike in a single time bin.

A.5 Observation: No spikes

We first calculate the likelihood (sec. A.5.1) and the auxiliaries (sec. A.5.2) with st=0 and then derive the corresponding parameter updates in sec. A.5.3.

A.5.1 Likelihood

(A28)f0(δt,u)=P(st=0|ut=u)(A29)=iPoisson(0;δtgeβui)(A30)=eδtgieβui(A31)f0(δt=0,u)=1(A32)f0(δt,u)=(gieβui)f0(δt,u)(A33)f0(δt=0,u)=gieβui

A.5.2 Auxiliaries

Here we describe the values of the auxiliaries required to compute Equations A19-A21 with the likelihood function given above:

(A34)a0(0,z)=ζtδt(z)(A35)b0(0)=1(A36)m0(0,z)=ζtδt(z)μtδt(z)(A37)C0(0,z)=ζtδt(z)(Σtδt(z)+μtδt(z)(μtδt(z))T)

The following quantities have been derived in Ujfalussy et al., 2011:

(A38)aI(0,z)=ζtδt(z)γ¯tδt(z)(A39)mI(0,z)=ζtδt(z)[γ¯tδt(z)μtδt(z)+βΣtδt(z)γtδt(z)](A40)CI(0,z)=ζtδt(z)[γ¯tδt(z)(μtδt(z)(μtδt(z))T+Σtδt(z))+2βμtδt(z)(γtδt(z))TΣtδt(z)+β2Σtδt(z)Γtδt(z)Σtδt(z)]

where γ¯t(z)=iγt,i(z) is the expected population firing rate in state z.

The derivation of the following auxiliaries is given in Section A.8.1:

(A41)aII(0,z)=zζtδt(z)[(1δzz)ΩzzδzzΩ¯z](A42)mII(0,z)=zζtδt(z)[(1δzz)ΩzzδzzΩ¯z]μtδt(z)+ζtδt(z)u¯(z)μtδt(z)τ(z)(A43)CII(0,z)=zζtδt(z)[(1δzz)ΩzzδzzΩ¯z](μtδt(z)(μtδt(z))T+Σtδt(z))+====+ζtδt(z)[(Q(z)2τ(z)Σtδt(z))+1τ(z)μtδt(z)(u¯(z)μtδt(z))T+1τ(z)(u¯(z)μtδt(z))(μtδt(z))T]

A.5.3 Parameter updates

Now we are ready to substitute Equation A34-A43 into Equation A19-A21 to derive the parameter updates.

State probabilities

ζt(z)(δt)=a0(0,z)b0(0)+δta0'(0,z)b0(0)-a0(0,z)b0'(0)b02(0)=ζt-δt(z)+δt[-ζt-δt(z)γ¯t-δt(z)+z'ζt-δt(z') [(1-δzz')Ωzz'-δzz'Ω¯z']--ζt-δt(z)(z''-ζt-δt(z'')γ¯t-δt(z'')+z',z''ζt-δt(z')[(1-δz''z')Ωz''z'-δz''z'Ω¯z'])]=ζt-δt(z)+δt[-ζt-δt(z)(γ¯t-δt(z)-z'ζt-δt(z')γ¯t-δt(z'))++z'ζt-δt(z')[(1-δzz')Ωzz'-δzz'Ω¯z']-ζt-δt(z)z',z''ζt-δt(z')[(1-δz''z') Ωz''z'-δz''z'Ω¯z']] (A44)

However, since  Ω¯z'=z''Ωz''z' the last terms in Equation A44 cancel each other and we can write that

ζt(z)(δt)=ζtδt(z)+δt[ζtδt(z)(γ¯tδt(z)γ¯tδt)+zζtδt(z)[(1δzz)ΩzzδzzΩ¯z]] (A45)

where denotes averaging over z', and so γ¯t= z'ζt-δt(z') γ¯t(z') is the average expected population firing rate. In Equation A45 two different processes are changing the posterior state probabilities. First, changes in ζ(z) are proportional to the difference between the total firing rate in state z, γ¯(z), and the average firing rate, γ¯. This process causes a decay in the posterior probability of those states in which the expected firing rate is high and thus incompatible with the current observation (no spikes). Second, posterior state probabilities are changing according to the prior state transition dynamics captured by Ω.

Posterior mean

μt(z)(δt)m0(0,z)a0(0,z)+δtm0(0,z)a0(0,z)m0(0,z)a0(0,z)a02(0,z)=ζtδt(z)μtδt(z)ζtδt(z)+δt1ζtδt(z)2[(ζtδt(z)[γ¯tδt(z)μtδt(z)+βΣtδt(z)γtδt(z)]+======+zζtδt(z)[(1δzz)ΩzzδzzΩ¯z]μtδt(z)+ζtδt(z)u¯(z)μtδt(z)τ(z))ζtδt(z)====ζtδt(z)μtδt(z)(ζtδt(z)γ¯tδt(z)+zζtδt(z)[(1δzz)ΩzzδzzΩ¯z])]=μtδt(z)+δt[u¯(z)μtδt(z)τ(z)βΣtδt(z)γtδt(z)+zζtδt(z)ζtδt(z)(μtδt(z)μtδt(z))[(1δzz)ΩzzδzzΩ¯z]]=μtδt(z)+δt[u¯(z)μtδt(z)τ(z)βΣtδt(z)γtδt(z)+zzΩzzζtδt(z)ζtδt(z)(μtδt(z)μtδt(z))] (A46)

The evolution of the conditional mean is governed by three factors. The first term in the bracket describes the decay of the posterior mean towards the prior mean with time constant τ(z). The second term captures the effect of the observation (absence of spikes) where the decrease in the posterior mean is proportional to the posterior covariance Σ(z) (because when more uncertain a priori, the observation should have a larger impact) and the expected firing rate γ(z) (because a higher expected firing rate is more incompatible with the absence of spikes and thus requires a larger correction). This term is omitted in the absence of observations, i.e. during the absolute refractory period. Finally, the third term expresses the effect of a possible transition from each of the other states z' to state z which is proportional to the product of the prior transition rate Ωzz' and the relative probabilities of the two states (i.e. the relative proportion of the probability mass coming from state z against that already existing in state z), and the difference between the respective posterior means (as the more different these are, the larger the bias contributed by the probability mass coming from state z is).

Posterior covariance

Ξt(z)(δt)=Ξt(z)(0)+δtΞt(z)(0)=C0(0,z)a0(0,z)+δtC0(0,z)a0(0,z)C0(0,z)a0(0,z)a02(0,z)=ζtδt(z)(Σtδt(z)+μtδt(z)(μtδt(z))T)ζtδt(z)+==+δt1ζtδt(z)2[ζtδt(z)2(γ¯tδt(z)[μtδt(z)(μtδt(z))T+Σtδt(z)]+2βμtδt(z)(γtδt(z))TΣtδt(z)+β2Σtδt(z)Γtδt(z)Σtδt(z))+====+ζtδt(z)zζtδt(z)[(1δzz)ΩzzδzzΩ¯z](μtδt(z)(μtδt(z))T+Σtδt(z))====+ζtδt(z)2[(Q(z)2τ(z)Σtδt(z))+1τ(z)μtδt(z)(u¯(z)μtδt(z))T+1τ(z)(u¯(z)μtδt(z))(μtδt(z))T]====ζtδt(z)(Σtδt(z)+μtδt(z)(μtδt(z))T)(ζtδt(z)γ¯tδt(z)+zζtδt(z)[(1δzz)ΩzzδzzΩ¯z])]=Σtδt(z)+μtδt(z)(μtδt(z))T+δt[2βμtδt(z)(γtδt(z))TΣtδt(z)β2Σtδt(z)Γtδt(z)Σtδt(z)+====+zζtδt(z)ζtδt(z)[(1δzz)ΩzzδzzΩ¯z](μtδt(z)(μtδt(z))T+Σtδt(z))+====+(Q(z)2τ(z)Σtδt(z))+1τ(z)μtδt(z)(u¯(z)μtδt(z))T+1τ(z)(u¯(z)μtδt(z))(μtδt(z))T====(Σtδt(z)+μtδt(z)(μtδt(z))T)(zζtδt(z)ζtδt(z)[(1δzz)ΩzzδzzΩ¯z])]=Σtδt(z)+μtδt(z)(μtδt(z))T+δt[Q(z)2τ(z)Σtδt(z)β2Σtδt(z)Γtδt(z)Σtδt(z)+====+zζtδt(z)ζtδt(z)[(1δzz)ΩzzδzzΩ¯z](μtδt(z)(μtδt(z))Tμtδt(z)(μtδt(z))T+Σtδt(z)Σtδt(z))+====+1τ(z)μtδt(z)(u¯(z)μtδt(z))T+1τ(z)(u¯(z)μtδt(z))(μtδt(z))T2βμtδt(z)(γtδt(z))TΣtδt(z)] (A47)

And finally, in order to compute the covariance, we will need:

μt(z)(μt(z))T=μtδt(z)(μtδt(z))T+δt[u¯(z)μtδt(z)τ(z)(μtδt(z))T+μtδt(z)(u¯(z)μtδt(z)τ(z))T2βΣtδt(z)γtδt(z)(μtδt(z))T+====zζtδt(z)ζtδt(z)(μtδt(z)μtδt(z))(μtδt(z))T[(1δzz)ΩzzδzzΩ¯z]+====zζtδt(z)ζtδt(z)μtδt(z)(μtδt(z)μtδt(z))T[(1δzz)ΩzzδzzΩ¯z]]+O(δt2) (A48)

Now we substitute Equation A47-A48 into Equation A14 to obtain the linear form for the change of the covariance matrix:

Σt(z)=Σtδt(z)+δt[Q(z)2τ(z)Σtδt(z)β2Σtδt(z)Γtδt(z)Σtδt(z)+====+zζtδt(z)ζtδt(z)[(1δzz)ΩzzδzzΩ¯z](Σtδt(z)Σtδt(z))====+zζtδt(z)ζtδt(z)[(1δzz)ΩzzδzzΩ¯z](μtδt(z)μtδt(z))(μtδt(z)μtδt(z))T]=Σtδt(z)+δt[Q(z)2τ(z)Σtδt(z)β2Σtδt(z)Γtδt(z)Σtδt(z)+====+zzΩzzζtδt(z)ζtδt(z)(Σtδt(z)Σtδt(z))+zzΩzzζtδt(z)ζtδt(z)(μtδt(z)μtδt(z))(μtδt(z)μtδt(z))T] (A49)

The evolution of the posterior covariance is governed by the same three factors as the evolution of the conditional mean. The first is the decay of the posterior covariance towards the prior covariance τ(z)2Q(z) with time constant τ(z)2. (Note that this decay is twice faster than that of the mean.) The second term captures the effect of the observation (absence of spikes) such that the decrease in the posterior covariance is proportional to the square of the covariance Σ(z) and the expected firing rate Γ(z) (because this observation is more informative, i.e. it causes a larger surprise, when the expected firing rate is higher). Finally, the last term expresses the effect of possible hidden transitions between state z and state z' on the posterior covariance (which is again proportional to the relative proportion of the probability masses and the differences between the two states).

A.6 Observation: cell i emits a spike

Now we assume that cell i emits a spike while all other neurons remain silent in the current time step. Note that as δt0 the probability of having multiple presynaptic neurons spiking in the same time bin, or the same cell firing twice, converges to zero.

A.6.1 Likelihood

In this case the current spiking pattern is denoted by st=s^(i), that is s^j(i)=δij and the likelihood is given by

fs^(i)(δt,u)=P(st=s^(i)|ut=u)=Poisson(1;δtgeβui)jiPoisson(0;δtgeβuj)(A50)=δtgeβuieδtgjeβuj(A51)fs^(i)(δt=0,u)=0

The derivative of the likelihood is

(A52)fs^(i)(δt,u)=geβuieδtgjeβujδtgeβuig(jeβuj)eδtgjeβuj=geβuieδtgjeβuj(1δtgjeβuj)(A53)fs^(i)(δt=0,u)=geβui

In the following, we follow the same steps as in the previous section, but replacing f0(0,u) and f0(0,u) with fs^(i)(0,u) and fs^(i)(0,u), respectively.

A.6.2 Auxiliaries

Again, the goal is to derive differential equations for the parameter updates. As the number of time bins with spikes does not grow with δt0 we only need to compute terms up to zeroth order (cf. Equations A19-A21). However, as the likelihood (Equation A51) is 0 when δt=0, the values of the auxiliaries in Equations A54-A56 are also 0 (Equations A15-A18), and so we shall use l’Hôpital’s rule to deal with the δt0 limit:

ζt(z)(δt)ζt(z)(0)=ast(0,z)bst(0)=ast'(0,z)bst'(0) (A54)
μt(z)(δt)μt(z)(0)=mst(0,z)ast(0,z)=mst'(0,z)ast'(0,z) (A55)
Ξt(z)(δt)=Ξt(z)(0)=Cst(0,z)ast(0,z)=Cst'(0,z)ast'(0,z) (A56)

Substituting Equation A53 into Equations A22-A25 gives the following values for the corresponding derivatives (for details see Ujfalussy et al., 2011):

(A57)as^(i)(0,z)=ζtδt(z)γtδt,i(z)(A58)bs^(i)(0,z)=zζtδt(z)γtδt,i(z)(A59)ms^(i)(0,z)=ζtδt(z)γtδt,i(z)(μtδt(z)+βΣtδt(z)s^(i))(A60)Cs^(i)(0,z)=ζtδt(z)γtδt,i(z)[(μtδt(z)+βΣtδt(z)s^(i))(μtδt(z)+βΣtδt(z)s^(i))T+Σtδt(z)]

A.6.3 Parameter updates

Finally, by substituting Equations A57-A60 back into Equations A54-A56, we can derive the parameter updates in the case of observing a spike.

State probabilities

ζt(z)(δt=0)=as^(i)'(0,z)z'as^(i)'(0,z')=ζt-δt(z)γt-δt,i(z)z'ζt-δt(z')γt-δt,i(z')=ζt-δt(z)+ζt-δt(z)(γt-δt,i(z)-γt-δt,i)γt-δt,i (A61)

where γt,i=z'ζt-δt(z')γt,i(z') is the average (over states) expected firing rate of cell i. Equation A61 shows that the posterior state probabilities change instantaneously after observing a spike, and the change is proportional to the normalised difference between the expected firing rate of cell i in state z and its state-averaged expected firing rate. Therefore if state z is more compatible with the observed spiking pattern than state z', i.e. γi(z)>γi(z'), then the posterior probability of being in state z, ζ(z), will increase while ζ(z') will decrease.

Equation A61 can be rewritten into vector form as follows:

ζt(z)-ζt-δt(z)δt=ζt-δt(z)stδtΓt-δt-1(γt-δt(z)-γt-δt)

which in turn can be rewritten in differential equation form by taking the δt0 limit:

ζ(z).=ζ(z)s(t)Γ-1(γ(z)-γ) (A62)

where s(t)=limδt0st/δt is the sum of Dirac-delta functions representing the presynaptic spike trains.

Posterior mean

μt(z)(δt=0)=ms^(i)'(0,z)as^(i)'(0,z)=μt-δt(z)+βΣt-δt(z)st (A63)

Thus, the posterior conditional means of the presynaptic membrane potentials also change instantaneously after observing a spike in the presynaptic population. The change is proportional to the posterior variance in the case of the cell that emitted the spike while for other presynaptic neurons the change is proportional to the posterior covariance between the given cell and the neuron that emitted the spike. Note that, similar to the change in the posterior state probabilities, the change of the mean is instantaneous and it does not depend on the time step.

The differential equation form of Equation A63 reads as

μt(z)μtδt(z)δt=βΣtδt(z)stδtμ˙(z)=βΣtδt(z)s(t) (A64)

where s(t) is, again, the sum of Dirac-delta functions representing the presynaptic spike trains.

Posterior covariance

(A65)Ξt(z)(0)=Cs^(i)(0,z)as^(i)(0,z)=Σtδt(z)+(μtδt(z)+βΣtδt(z)s^(i))(μtδt(z)+βΣtδt(z)s^(i))T(A66)Σt(z)(0)=Σtδt(z)

Thus, Equation A66 indicates that the posterior covariance does not change directly after observing a spike – only indirectly, through the increase in the posterior mean (Equation A63) and thus Γ (Equation A10), which in turn decreases the covariance in the silent period following the spike (Equation A49).

A.7 Differential equations

Here we summarize the results of Equations A45,A46,A49, and Equation A61-A66 as differential equations in continuous time:

(A67)ζ˙(z)=ζ(z)(γ¯(z)γ¯)+ζ(z)s(t)TΓ1(γ(z)γ)+zζ(z)[(1δzz)ΩzzδzzΩ¯z](A68)μ˙(z)=u¯(z)μ(z)τ(z)+βΣ(z)(s(t)γ(z))+zzζ(z)ζ(z)(μ(z)μ(z))ΩzzΣ˙(z)=Q(z)2τ(z)Σ(z)β2Σ(z)Γ(z)Σ(z)+(A69)====+zzζ(z)ζ(z)Ωzz[(Σ(z)Σ(z))+(μ(z)μ(z))(μ(z)μ(z))T]

Once again note that in the continuous time limit the presynaptic spike trains are represented by the sum of Dirac-delta functions and we denote it with s(t) instead of st.

In the case of two states (+ and -), we can write Equation A67 as

ζ+˙=ζ+(1ζ+)(γ¯+γ¯)+ζ+(1ζ+)s(t)TΓ1(γ+γ)+(1ζ+)Ω+ζ+Ω (A70)

where we emphasise that the observation of either a spike or the absence of a spike causes changes in the state probabilities that are proportional to their uncertainty, ζ+(1-ζ+).

A.8 Derivations of auxiliaries

A.8.1 Derivation of Equations A26-A27

After substituting Equations A1-A2 and Equation A9 into Equation A6 we can derive the δt dependence of the innovation term:

h(δt,u,z)=P(ut=u,zt=z|s0:t-δt)=z'P(zt=z|zt-δt=z')P(zt-δt=z'|s0:t-δt)du'P(ut=u|ut-δt=u',zt=z)P(ut-δt=u'|zt-δt=z',s0:t-δt)=z'ζt-δt(z')[δzz'(1-δtΩ¯z')+(1-δzz')δtΩzz']du'𝒩(u;(1-δtτ(z))u'+δtτ(z)u¯(z),δtQ(z))𝒩(u';μt-δt(z'),Σt-δt(z'))=z'ζt-δt(z')[δzz'(1-δtΩ¯z')+(1-δzz')δtΩzz']11-δtτ(z)du'𝒩(u';u-δtτ(z)u¯(z)1-δtτ(z),δt(1-δtτ(z))2Q(z))𝒩(u';μt-δt(z'),Σt-δt(z'))=z'ζt-δt(z')[δzz'(1-δtΩ¯z')+(1-δzz')δtΩzz']11-δtτ(z)·𝒩(u-δtτ(z)u¯(z)1-δtτ(z);μt-δt(z'),δt(1-δtτ(z))2Q(z)+Σt-δt(z'))=z'ζt-δt(z')[δzz'(1-δtΩ¯z')+(1-δzz')δtΩzz']·𝒩(u;(1-δtτ(z))μt-δt(z')+δtτ(z)u¯(z),δt Q(z)+(1-δtτ(z))2Σt-δt(z')) (A71)

Substituting δt=0 into Equation A71 yields Equation A26.

Next, we take the derivative of Equation A71 with respect to δt:

h(δt,u,z)=zζtδt(z)[(1δzz)ΩzzδzzΩ¯z]==N(u;(1δtτ(z))μtδt(z)+δtτ(z)u¯(z),δtQ(z)+(1δtτ(z))2Σtδt(z))+==+ζtδt(z)[δzz(1δtΩ¯z)+(1δzz)δtΩzz]==[N(u;μtδt(z)+δtu¯(z)μtδt(z)τ(z),Σtδt(z)+δt(Q(z)2τ(z)Σtδt(z))+δt2Σtδt(z)τ(z)2)] (A72)

where the derivative of the last term is provided below in general form.

[N(u;a0+δta1,B0+δtB1+δt2B2)]==[1(2π)D|B0+δtB1+δt2B2|e12(ua0δta1)T(B0+δtB1+δt2B2)1(ua0δta1)]=[1(2π)D|B0+δtB1+δt2B2|]e12(ua0δta1)T(B0+δtB1+δt2B2)1(ua0δta1)+==+1(2π)D|B0+δtB1+δt2B2|[e12(ua0δta1)T(B0+δtB1+δt2B2)1(ua0δta1)]=12N(u;a0+δta1,B0+δtB1+δt2B2)=[ln(|B0+δtB1+δt2B2|)+[(ua0δta1)T(B0+δtB1+δt2B2)1(ua0δta1)]]=12N(u;a0+δta1,B0+δtB1+δt2B2)==[Tr((B0+δtB1+δt2B2)1(B1+2δtB2))+===a1T(B0+δtB1+δt2B2)1(ua0δta1)+===(ua0δta1)T(B0+δtB1+δt2B2)1a1(A73)===(ua0δta1)T(B0+δtB1+δt2B2)1(B1+2δtB2)(B0+δtB1+δt2B2)1(ua0δta1)][N(u;a0+δta1,B0+δtB1+δt2B2)]δt=0=(A74)=12N(u;a0,B0)[Tr(B01B1)2a1TB01(ua0)(ua0)TB01B1B01(ua0)]

Substituting δt=0 into Equation A72 after using Equation A74 to express its last term yields Equation A27.

A.8.2 Derivation for Equations A41-A43

The above result can be used to compute terms appearing in the integrals of Equations A22-A25 which are involved in computing Equations A41-A43

[N(u;a0+δta1,B0+δtB1+δt2B2)]δt=0du==12N(u;a0,B0)[Tr(B01B1)2a1TB01(ua0)(ua0)TB01B1B01(ua0)]du=(A75)=12[Tr(B01B1)Tr(B01B1B01B0)]=0u[N(u;a0+δta1,B0+δtB1+δt2B2)]δt=0du=(A76)=12N(u;a0,B0)u[Tr(B01B1)2a1TB01(ua0)(ua0)TB01B1B01(ua0)]du==12[a0Tr(B01B1)2(N(u;a0,B0)ua1TB01uua1TB01a0du)(A77)===N(u;a0,B0)(uuTB01B1B01u2ua0TB01B1B01u+ua0TB01B1B01a0)du]=12[a0Tr(B01B1)2(B01a1(B0+a0a0T)a0a1TB01a0)=====a0Tr(B01B1)B0B01B1B01a0(B0+a0a0T)B01B1B01a0+(A78)=====+2(B0+a0a0T)B01B1B01a0a0a0TB01B1B01a0](A79)=a1u[N(u;a0+δta1,B0+δtB1+δt2B2)]|δt=0uTdu(A80)=12N(u;a0,B0)u[Tr(B01B1)2a1TB01(ua0)(ua0)TB01B1B01(ua0)]uTdu=12[(B0+a0a0T)Tr(B01B1)===2N(u;a0,B0)(ua1TB01uuTua1TB01a0uT)du(A81)===N(u;a0,B0)(uuTB01B1B01uuT2ua0TB01B1B01uuT+ua0TB01B1B01a0uT)]=12[(B0+a0a0T)Tr(B01B1)===2(a0a1TB01(B0+a0a0T)+(B0+a0a0T)B01a1a0T+a1TB01a0(B0a0a0T)a1TB01a0(B0+a0a0T))===(2(B0+a0a0T)B01B1B01(B0+a0a0T)+a0TB01B1B01a0(B0a0a0T)+(B0+a0a0T)Tr(B01B1))+===+2(a0a0TB01B1B01(B0+a0a0T)+(B0+a0a0T)B01B1B01a0a0T+a0TB01B1B01a0(B0a0a0T))(A82)===a0TB01B1B01a0(B0+a0a0T)](A83)=12[2(a0a1T+a1a0T)2B1]=B1+a0a1T+a1a0T

B The optimality of the sigmoidal nonlinearity

Neurons need to combine incoming spikes nonlinearly to perform efficient analog computations based on digital spikes. In the main text, and the previous sections of this Appendix, we derived the optimal way of combining incoming spikes (Equations 20-22), but the exact implementation of this optimal nonlinear mapping within the dendritic tree is unrealistic. A plausible alternative would be if each dendrite branch acted as a simple linear-nonlinear unit, computing a nonlinear function of the linear combination of its linearly filtered input spike trains (Poirazi et al., 2003), but it is unclear whether this could efficiently approximate the optimal response, and if so then what is the specific functional form of the nonlinearity that is able to make this approximation tight. In the main paper, we demonstrated that a sigmoidal nonlinearity approximates the optimal response remarkably well if the dynamics of the presynaptic population is dominated by simultaneous switching between a quiescent and an active state. In this section we provide additional details for this derivation.

B.1 Simplified presynaptic dynamics

In the following derivations, for convenience, we assume that the weights of the linear computation are roughly uniform and scale inversely with N, wi~1/N, although our derivations also remain valid with other scalings of wi as long as the number of weights significantly greater than 0 scales with N. We start by noting that the optimal response can always be trivially rewritten as requiring the inference of the state, z(t), and conditioned on the state, the average presynaptic membrane potential directly, m(t)=iwiui(t)=1/N iui(t) (instead of the individual membrane potentials, as in Equations 3 and 6):

(A84)v~(t)=zP(z(t)=z|s(0:t))P(m(t)=m|z(t)=z,s(0:t))mdm(A85)=zζ(z)(t)P(m(t)=m|z(t)=z,s(0:t))mdm

where, for simplicity, we took τpost=0, although it is not essential for our argument – which instead applies to the total input when τpost0. Note that we do not generally use this seemingly simpler form because the likelihood describing the spike generation process (Equation 8) becomes more complex, and notably non-factorised, when conditioned on m(t) instead of u(t). Nevertheless, it is instructive to write down the prior over m (cf. Equation 7):

P(mt|mt-δt,zt=z)=𝒩(mt;(1-δtτ) mt-δt+δtτm¯(z),δtq¯2) (A86)

where m¯(z)=1NiNu¯i(z) is the mean and q¯2=1N2ijQij is the process noise variance defined in terms of the parameters of the original mOU process. In the following sections, we derive simplified forms of inference, that can be readily related to dendritic processing, by taking different limits to the original model, and in particular Equation A86.

B.2 Sigmoidal nonlinearity with state switching dynamics

We start by assuming that the presynaptic neurons are equally variable and conditionally independent given the population state (Qi,j=q2δij). It is easy to see that in this case q¯2=q2/N and thus the prior variance of m (which is q¯2τ/2, as can be seen from Equation A86) also scales as 1/N. Thus, for a large presynaptic population, N, the prior variance of m diminishes which means that the posterior inference is also greatly simplified because there is no uncertainty any more about m given the population state, i.e. P(m(t)|z(t)=z,s(0:t))=δ(m(t)-m¯(z)), and so the optimal response (Equations A85) for two population states simplifies to

v~(t)=ζ(t)m¯++1-ζ(t)m¯- (A87)

Although it seems that the only quantity that needs updating in Equation A87 is ζ(t), in general this still requires keeping track of the membrane potentials of individual presynaptic neurons because the dynamics of ζ(t) (Equation 20) depends on the individual expected firing rates (conditioned on each population state). Therefore, to allow further simplification, we assume that all expected firing rates within a state are identical and constant, equal to their prior value, i.e. γi+γ+=geβu¯++12β2q2τ/2, and so the population rate is simply γ¯+Nγ+ (with corresponding definitions for γ- and γ¯-). With these assumptions the dynamics of the posterior estimate of population state reduces to

ζ˙=Ω+-(Ω++Ω-)ζA+γ+-γ-ζγ++(1-ζ)γ-B(ζ)ζ(1-ζ)s(t)-(γ¯+-γ¯-)Cζ(1-ζ) (A88)

As we do not track the membrane potential fluctuations of the individual neurons in this reduced model, we also replaced the s(t) vector by the total spike train of the population, s(t)=isi(t).

Amongst the parameters of Equation A88 only C scales with N, therefore the last term dominates Equation A88 during silent periods. Conversely, the second term (involving B) will dominate during spikes. Thus, the first term (involving A) can be ignored. Finally, we note that

B(ζ)=1ζ+γ-γ+-γ-=1ζ+1eβ(u¯+-u¯-)-1 (A89)

and therefore B(ζ) is nearly constant if the difference between γ+ and γ- is small (i.e. βIn2u¯+-u¯-), in which case

Beβ(u¯+-u¯-)1 (A90)

These considerations allow us to write the approximate form of Equation A88 as (cf. Equation 27):

ζ˙=ζ(1-ζ)[Bs(t)-C] (A91)

Solving Equation A91 for ζ(t) is possible by noting that it is separable, and so the solution can be written as

1ζ(1-ζ)dζ=[Bs(t)-C]dt (A92)

where solving the integral on the left yields

In (ζ1-ζ)+χ=[Bs(t)-C]dt (A93)

In general, the value of χ would need to be determined by solving the initial value problem, but for simplicity and without loss of generality here we take χ=0 and assume that ν(0) has the appropriate value to map to the required ζ(0).

From Equation A93, ζ can be expressed as

ζ=11+e-[Bs(t)-C]dt (A95)

which in turn can be rewritten as

ζ=11+e-ν (A96)

where we introduced the new variable ν=[Bs(t)-C]dt whose definition can be equally written as a differential equation:

ν˙=Bs(t)-C (A97)

The system defined by Equations A96-A97 has a form that is closely analogous to that of the standard linear-nonlinear model of dendritic processing with a logistic sigmoid nonlinearity, and is presented in the main text (Methods) as Equations 28-29.

B.3 Sublinear integration with second order correlations

We also considered another limiting case when the dynamics of the presynaptic population was fully characterized by purely second-order correlations between the neurons. This means that there is no state switching dynamics, so inference is simplified by taking ζ=1, but we also cannot assume (as we did in the previous section) that the posterior correlations and thus the posterior variance of m vanishes in the large N limit. Instead, we seek a formalism in which the population activity of N cells, represented by the variables {u,s}, is captured by the lower dimensional dynamics of {m,s} (where s(t)=isi(t) is still the ‘total’ input spike train, as before). Under this simplified model, the prior dynamics of m is as described by Equation A86, and the likelihood is analogous to the one we had for individual neurons (Equation 8):

P(st|mt)=Poisson(st;δt rt),  with rt=g¯ eβ¯ ut (A98)

As the likelihoods of the two models can not be exactly matched, we chose the parameters β¯ and g¯ to match the mean and the variance of the population firing rate. Thus, there is now a single variable that characterizes the whole population, m, whose dynamics (both the prior and likelihood) are analogous to those we had for a single membrane potential in the previous model. Consequently, inference in this collapsed model requires the manipulation of only two scalar variables – instead of the original 𝒪(N2) –, the posterior mean and variance of m, μ¯ and σ¯2 respectively, whose dynamics can be derived analogously to those obtained before for the full population (Equations 21-22):

(A99)μ¯˙=m¯μ¯τ+β¯σ¯2(s(t)γ¯)(A100)σ¯˙2=q¯22τσ¯2β¯2σ¯4γ¯

where, in line with previous notation, γ¯=g¯eβ¯μ¯+12β¯2σ¯2 is the posterior mean estimate of the average firing rate of the presynaptic population. (These equations only apply to the special case when wi=1/N and Qij=δijq+(1-δij)ρq, but an extension to the more general case is relatively straightforward.) We used this reduced model to infer the mean membrane potential of the presynaptic population – which is relevant in the special case when wi=1/N and τpost=0 (see above) so that v~=μ¯. Figure 3—figure supplement 1A–C demonstrates that our reduced model approximates the full model remarkably well provided that there are substantial correlations in the presynaptic population.

Next, we noted that during inference the dynamics of the posterior variance is faster than the dynamics of the posterior mean (Equations A99-A100) and therefore the posterior variance can be approximated by its (mean-dependent) steady state value, σ¯2(μ¯). If we replace the posterior variance in Equation A100 with its steady state value we obtain the following approximate one-dimensional dynamics:

μ¯˙m¯-μ¯τ+β¯σ¯2(μ¯) (s(t)-g¯eβ¯μ¯+12β¯2σ¯2(μ¯)) (A101)

As the optimal response to a single spike is proportional to the posterior variance, the functional form of the steady state is highly informative about the nonlinearity of the integration. Figure 3—figure supplement 1D shows that σ¯2(μ¯) is a decreasing sigmoidal function, thus responses to consecutive spikes will add sublinearly in this case, as each spike increases μ¯, which in turn reduces σ¯2, which results in a smaller response for the next spike.

Expressing Equation A101 directly in the form of linear-nonlinear dynamics, as in the case of switching dynamics, did not seem feasible, so we focussed instead on establishing a correspondence only at the times of presynaptic spikes, when s(t)>0. This required that there exists an instantaneous mapping h(vlin) such that

μ¯=h(vlin) (A102)

where vlin is the linearly filtered and integrated inputs, whose dynamics vlin˙ is given by Equation 24. Differentiating both sides of Equation A102 wrt. time, and substituting Equation A101 and Equation 24, respectively, to each side yields

β¯σ¯2(h(vlin))=h'(vlin) (A103)

after only keeping the dominating terms including s(t) on both sides and subsequently simplifying by s(t).

Although there was no closed-form analytical solution for h(vlin) from Equation A103, it could be obtained by numerical integration (Figure 3—figure supplement 1E, blue line). It is clear that the optimal mapping h is indeed sublinear in this case and it can be reasonably well approximated using the concave half of the logistic sigmoid function derived above, for the switching dynamics, even during silent periods (Figure 3—figure supplement 1E, red line versus black dots; Figure 3). There is a simple intuition for the sublinearity implied by h (following on from the intuition we gave for the sublinearity of the response after Equation A101): according to Equation A103, its derivative is proportional to the steady state posterior variance, which in turn is a non-negative monotonically decreasing function (Figure 3—figure supplement 1D), and hence the final non-linearity must be monotonically increasing and, importantly, concave – i.e. sublinear.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Funding Information

This paper was supported by the following grants:

  • Hungarian Scientific Research Fund OTKA MB08A 84471 to Balazs B Ujfalussy.

  • Hungarian Academy of Sciences LP 2011-012 to Judit K Makara.

  • Wellcome Trust 090915/Z/09/Z to Judit K Makara.

  • Medical Research Council to Tiago Branco.

  • Royal Society Fellowship to Tiago Branco.

  • Wellcome Trust 098400MA to Tiago Branco.

  • Wellcome Trust 095621/Z/11/Z to Máté Lengyel.

Additional information

Competing interests

The authors declare that no competing interests exist.

Author contributions

BBU, Concept and design, Acquisition of Data, Analysis and Interpretation of Data, Drafting and Revising the article.

JKM, Acquisition of Data, Drafting and Revising the article.

TB, Acquisition of Data, Drafting and Revising the article.

ML, Concept and design, Analysis and Interpretation of Data, Drafting and Revising the article.

References

  1. Abrahamsson T, Cathala L, Matsui K, Shigemoto R, Digregorio DA. Thin dendrites of cerebellar interneurons confer sublinear synaptic integration and a gradient of short-term plasticity. Neuron. 2012;73:1159–1172. doi: 10.1016/j.neuron.2012.01.027. [DOI] [PubMed] [Google Scholar]
  2. Agmon-Snir H, Carr CE, Rinzel J. The role of dendrites in auditory coincidence detection. Nature. 1998;393:268–272. doi: 10.1038/30505. [DOI] [PubMed] [Google Scholar]
  3. Alenda A, Molano-Mazón M, Panzeri S, Maravall M. Sensory input drives multiple intracellular information streams in somatosensory cortex. Journal of Neuroscience. 2010;30:10872–10884. doi: 10.1523/JNEUROSCI.6174-09.2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Anastassiou CA, Perin R, Markram H, Koch C. Ephaptic coupling of cortical neurons. Nature Neuroscience. 2011;14:217–223. doi: 10.1038/nn.2727. [DOI] [PubMed] [Google Scholar]
  5. Andersen P, Morris R, Amaral D, Bliss T, O’Keefe J. The Hippocampus Book. Oxford University Press; 2007. [Google Scholar]
  6. Archie KA, Mel BW. A model for intradendritic computation of binocular disparity. Nature Neuroscience. 2000;3:54–63. doi: 10.1038/71125. [DOI] [PubMed] [Google Scholar]
  7. Averbeck BB, Latham PE, Pouget A. Neural correlations, population coding and computation. Nature Reviews. Neuroscience. 2006;7:358–366. doi: 10.1038/nrn1888. [DOI] [PubMed] [Google Scholar]
  8. Berkes P, Orbán G, Lengyel M, Fiser J. Spontaneous cortical activity reveals hallmarks of an optimal internal model of the environment. Science. 2011;331:83–87. doi: 10.1126/science.1195870. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Bollmann JH, Engert F. Subcellular topography of visually driven dendritic activity in the vertebrate visual system. Neuron. 2009;61:895–905. doi: 10.1016/j.neuron.2009.01.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Borst A, Flanagin VL, Sompolinsky H. Adaptation without parameter change: dynamic gain control in motion detection. Proceedings of the National Academy of Sciences of the United States of America. 2005;102:6172–6176. doi: 10.1073/pnas.0500491102. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Branco T, Clark BA, Häusser M. Dendritic discrimination of temporal input sequences in cortical neurons. Science. 2010;329:1671–1675. doi: 10.1126/science.1189664. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Branco T, Häusser M. Synaptic integration gradients in single cortical pyramidal cell dendrites. Neuron. 2011;69:885–892. doi: 10.1016/j.neuron.2011.02.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Branco T, Staras K. The probability of neurotransmitter release: variability and feedback control at single synapses. Nature Reviews Neuroscience. 2009;10:373–383. doi: 10.1038/nrn2634. [DOI] [PubMed] [Google Scholar]
  14. Carandini M, Ferster D. Membrane potential and firing rate in cat primary visual cortex. Journal of Neuroscience. 2000;20:470–484. doi: 10.1523/JNEUROSCI.20-01-00470.2000. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Chen X, Leischner U, Rochefort NL, Nelken I, Konnerth A. Functional mapping of single spines in cortical neurons in vivo. Nature. 2011;475:501–505. doi: 10.1038/nature10193. [DOI] [PubMed] [Google Scholar]
  16. Chiovini B, Turi GF, Katona G, Kaszás A, Pálfi D, Maák P, Szalay G, Szabó MF, Szabó G, Szadai Z, Káli S, Rózsa B. Dendritic spikes induce ripples in parvalbumin interneurons during hippocampal sharp waves. Neuron. 2014;82:908–924. doi: 10.1016/j.neuron.2014.04.004. [DOI] [PubMed] [Google Scholar]
  17. Clark B, Häusser M. Neural coding: hybrid analog and digital signalling in axons. Current Biology. 2006;16:R585–R588. doi: 10.1016/j.cub.2006.07.007. [DOI] [PubMed] [Google Scholar]
  18. Cohen MR, Kohn A. Measuring and interpreting neuronal correlations. Nature Neuroscience. 2011;14:811–819. doi: 10.1038/nn.2842. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Crochet S, Poulet JFA, Kremer Y, Petersen CCH. Synaptic mechanisms underlying sparse coding of active touch. Neuron. 2011;69:1160–1175. doi: 10.1016/j.neuron.2011.02.022. [DOI] [PubMed] [Google Scholar]
  20. Csicsvari J, Hirase H, Czurkó A, Mamiya A, Buzsáki G. Fast network oscillations in the hippocampal CA1 region of the behaving rat. Journal of Neuroscience. 1999;19:RC20. doi: 10.1523/JNEUROSCI.19-16-j0001.1999. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Csicsvari J, Hirase H, Mamiya A, Buzsáki G. Ensemble patterns of hippocampal CA3-CA1 neurons during sharp wave–associated population events. Neuron. 2000;28:585–594. doi: 10.1016/S0896-6273(00)00135-5. [DOI] [PubMed] [Google Scholar]
  22. Dayan P, Abbott LF. Theoretical Neuroscience. The MIT press; 2001. [Google Scholar]
  23. de la Rocha J, Doiron B, Shea-Brown E, Josić K, Reyes A. Correlation between neural spike trains increases with firing rate. Nature. 2007;448:802–806. doi: 10.1038/nature06028. [DOI] [PubMed] [Google Scholar]
  24. DeBello WM. Micro-rewiring as a substrate for learning. Trends in Neurosciences. 2008;31:577–584. doi: 10.1016/j.tins.2008.08.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Dorn JD, Ringach DL. Estimating membrane voltage correlations from extracellular spike trains. Journal of Neurophysiology. 2003;89:2271–2278. doi: 10.1152/jn.000889.2002. [DOI] [PubMed] [Google Scholar]
  26. Doucet A, De Freitas N, Gordon N. Sequential Monte Carlo Methods in Practice. New York, NY: Springer New York; 2001. [DOI] [Google Scholar]
  27. Douglas RJ, Martin KAC. Neuronal circuits of the neocortex. Annual Review of Neuroscience. 2004;27:419–451. doi: 10.1146/annurev.neuro.27.070203.144152. [DOI] [PubMed] [Google Scholar]
  28. Druckmann S, Feng L, Lee B, Yook C, Zhao T, Magee JC, Kim J. Structured synaptic connectivity between hippocampal regions. Neuron. 2014;81:629–640. doi: 10.1016/j.neuron.2013.11.026. [DOI] [PubMed] [Google Scholar]
  29. Duguid I, Branco T, London M, Chadderton P, Hausser M. Tonic inhibition enhances fidelity of sensory information transmission in the cerebellar cortex. Journal of Neuroscience. 2012;32:11132–11143. doi: 10.1523/JNEUROSCI.0460-12.2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Dvorak-Carbone H, Schuman EM. Patterned activity in stratum lacunosum moleculare inhibits CA1 pyramidal neuron firing. Journal of Neurophysiology. 1999;82:3213–.3222. doi: 10.1152/jn.1999.82.6.3213. [DOI] [PubMed] [Google Scholar]
  31. Eggermann E, Kremer Y, Crochet S, Petersen CCH. Cholinergic signals in mouse barrel cortex during active whisker sensing. Cell Reports. 2014;9:1654–1660. doi: 10.1016/j.celrep.2014.11.005. [DOI] [PubMed] [Google Scholar]
  32. English DF, Peyrache A, Stark E, Roux L, Vallentin D, Long MA, Buzsaki G. Excitation and inhibition compete to control spiking during hippocampal ripples: intracellular study in behaving mice. Journal of Neuroscience. 2014;34:16509–16517. doi: 10.1523/JNEUROSCI.2600-14.2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Epsztein J, Brecht M, Lee AK. Intracellular determinants of hippocampal CA1 place and silent cell activity in a novel environment. Neuron. 2011;70:109–120. doi: 10.1016/j.neuron.2011.03.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Fiser J, Chiu C, Weliky M. Small modulation of ongoing cortical dynamics by sensory input during natural vision. Nature. 2004;431:573–578. doi: 10.1038/nature02907. [DOI] [PubMed] [Google Scholar]
  35. Froudarakis E, Berens P, Ecker AS, Cotton RJ, Sinz FH, Yatsenko D, Saggau P, Bethge M, Tolias AS. Population code in mouse V1 facilitates readout of natural scenes through increased sparseness. Nature Neuroscience. 2014;17:851–857. doi: 10.1038/nn.3707. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Gasparini S, Magee JC. State-dependent dendritic computation in hippocampal CA1 pyramidal neurons. Journal of Neuroscience. 2006;26:2088–2100. doi: 10.1523/JNEUROSCI.4428-05.2006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Gentet LJ, Avermann M, Matyas F, Staiger JF, Petersen CCH. Membrane potential dynamics of GABAergic neurons in the barrel cortex of behaving mice. Neuron. 2010;65:422–435. doi: 10.1016/j.neuron.2010.01.006. [DOI] [PubMed] [Google Scholar]
  38. Gerstner W, Kistler WM. Spiking Neuron Models: Single Neurons, Populations, Plasticity. Cambridge: Cambridge University Press; 2002. [DOI] [Google Scholar]
  39. Gidon A, Segev I. Principles governing the operation of synaptic inhibition in dendrites. Neuron. 2012;75:330–341. doi: 10.1016/j.neuron.2012.05.015. [DOI] [PubMed] [Google Scholar]
  40. Govindarajan A, Israely I, Huang S-Y, Tonegawa S. The dendritic branch is the preferred integrative unit for protein synthesis-dependent LTP. Neuron. 2011;69:132–146. doi: 10.1016/j.neuron.2010.12.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Grienberger C, Chen X, Konnerth A. NMDA receptor-dependent multidendrite Ca2+ spikes required for hippocampal burst firing in vivo. Neuron. 2014;81:1274–1281. doi: 10.1016/j.neuron.2014.01.014. [DOI] [PubMed] [Google Scholar]
  42. Grosmark AD, Mizuseki K, Pastalkova E, Diba K, Buzsáki G. REM sleep reorganizes hippocampal excitability. Neuron. 2012;75:1001–1007. doi: 10.1016/j.neuron.2012.08.015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Haider B, Häusser M, Carandini M. Inhibition dominates sensory responses in the awake cortex. Nature. 2013;493:97–100. doi: 10.1038/nature11665. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Harris KD, Csicsvari J, Hirase H, Dragoi G, Buzsáki G. Organization of cell assemblies in the hippocampus. Nature. 2003;424:552–556. doi: 10.1038/nature01834. [DOI] [PubMed] [Google Scholar]
  45. Harris KD, Thiele A. Cortical state and attention. Nature Reviews Neuroscience. 2011;12:509–523. doi: 10.1038/nrn3084. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Hasselmo ME. Neuromodulation and cortical function: modeling the physiological basis of behavior. Behavioural Brain Research. 1995;67:1–27. doi: 10.1016/0166-4328(94)00113-T. [DOI] [PubMed] [Google Scholar]
  47. Hennequin G, Vogels TP, Gerstner W. Optimal control of transient dynamics in balanced networks supports generation of complex movements. Neuron. 2014;82:1394–1406. doi: 10.1016/j.neuron.2014.04.045. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Herz AVM, Gollisch T, Machens CK, Jaeger D. Modeling single-neuron dynamics and computations: a balance of detail and abstraction. Science. 2006;314:80–85. doi: 10.1126/science.1127240. [DOI] [PubMed] [Google Scholar]
  49. Holderith N, Lorincz A, Katona G, Rózsa B, Kulik A, Watanabe M, Nusser Z. Release probability of hippocampal glutamatergic terminals scales with the size of the active zone. Nature Neuroscience. 2012;15:988–997. doi: 10.1038/nn.3137. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Hopfield JJ. Neurons with graded response have collective computational properties like those of two-state neurons. Proceedings of the National Academy of Sciences of the United States of America. 1984;81:3088–3092. doi: 10.1073/pnas.81.10.3088. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Isomura Y, Sirota A, Özen S, Montgomery S, Mizuseki K, Henze DA, Buzsáki G. Integration and segregation of activity in entorhinal-hippocampal subregions by neocortical slow oscillations. Neuron. 2006;52:871–882. doi: 10.1016/j.neuron.2006.10.023. [DOI] [PubMed] [Google Scholar]
  52. Jadi M, Polsky A, Schiller J, Mel BW. Location-dependent effects of inhibition on local spiking in pyramidal neuron dendrites. PLoS Computational Biology. 2012;8:e1002550. doi: 10.1371/journal.pcbi.1002550. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Ji D, Wilson MA. Coordinated memory replay in the visual cortex and hippocampus during sleep. Nature Neuroscience. 2007;10:100–107. doi: 10.1038/nn1825. [DOI] [PubMed] [Google Scholar]
  54. Jia H, Rochefort NL, Chen X, Konnerth A. Dendritic organization of sensory input to cortical neurons in vivo. Nature. 2010;464:1307–1312. doi: 10.1038/nature08947. [DOI] [PubMed] [Google Scholar]
  55. Karlocai MR, Kohus Z, Kali S, Ulbert I, Szabo G, Mate Z, Freund TF, Gulyas AI. Physiological sharp wave-ripples and interictal events in vitro: what's the difference? Brain. 2014;137:463–485. doi: 10.1093/brain/awt348. [DOI] [PubMed] [Google Scholar]
  56. Katona G, Kaszás A, Turi GF, Hájos N, Tamás G, Vizi ES, Rózsa B. Roller coaster scanning reveals spontaneous triggering of dendritic spikes in CA1 interneurons. Proceedings of the National Academy of Sciences of the United States of America. 2011;108:2148–2153. doi: 10.1073/pnas.1009270108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Kleindienst T, Winnubst J, Roth-Alpermann C, Bonhoeffer T, Lohmann C. Activity-dependent clustering of functional synaptic inputs on developing hippocampal dendrites. Neuron. 2011;72:1012–1024. doi: 10.1016/j.neuron.2011.10.015. [DOI] [PubMed] [Google Scholar]
  58. Koch C, Poggio T, Torres V. Retinal ganglion cells: a functional interpretation of dendritic morphology. Philosophical Transactions of the Royal Society B: Biological Sciences. 1982;298:227–263. doi: 10.1098/rstb.1982.0084. [DOI] [PubMed] [Google Scholar]
  59. Koch C. Biophysics of Computation. Oxford University Press; 1999. [Google Scholar]
  60. Larkum ME, Nevian T. Synaptic clustering by dendritic signalling mechanisms. Current Opinion in Neurobiology. 2008;18:321–331. doi: 10.1016/j.conb.2008.08.013. [DOI] [PubMed] [Google Scholar]
  61. Lavzin M, Rapoport S, Polsky A, Garion L, Schiller J. Nonlinear dendritic processing determines angular tuning of barrel cortex neurons in vivo. Nature. 2012;490:397–401. doi: 10.1038/nature11451. [DOI] [PubMed] [Google Scholar]
  62. London M, Häusser M. Dendritic computation. Annual Review of Neuroscience. 2005;28:503–532. doi: 10.1146/annurev.neuro.28.061604.135703. [DOI] [PubMed] [Google Scholar]
  63. London M, Roth A, Beeren L, Häusser M, Latham PE. Sensitivity to perturbations in vivo implies high noise and suggests rate coding in cortex. Nature. 2010;466:123–127. doi: 10.1038/nature09086. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Longordo F, To M-S, Ikeda K, Stuart GJ. Sublinear integration underlies binocular processing in primary visual cortex. Nature Neuroscience. 2013;16:714–723. doi: 10.1038/nn.3394. [DOI] [PubMed] [Google Scholar]
  65. Losonczy A, Magee JC. Integrative properties of radial oblique dendrites in hippocampal CA1 pyramidal neurons. Neuron. 2006;50:291–307. doi: 10.1016/j.neuron.2006.03.016. [DOI] [PubMed] [Google Scholar]
  66. Losonczy A, Makara JK, Magee JC. Compartmentalized dendritic plasticity and input feature storage in neurons. Nature. 2008;452:436–441. doi: 10.1038/nature06725. [DOI] [PubMed] [Google Scholar]
  67. Lovett-Barron M, Turi GF, Kaifosh P, Lee PH, Bolze F, Sun X-H, Nicoud J-F, Zemelman BV, Sternson SM, Losonczy A. Regulation of neuronal input transformations by tunable dendritic inhibition. Nature Neuroscience. 2012;15:423–430. doi: 10.1038/nn.3024. [DOI] [PubMed] [Google Scholar]
  68. Magee JC, Cook EP. Somatic EPSP amplitude is independent of synapse location in hippocampal pyramidal neurons. Nature Neuroscience. 2000;3:895–903. doi: 10.1038/78800. [DOI] [PubMed] [Google Scholar]
  69. Major G, Larkum ME, Schiller J. Active properties of neocortical pyramidal neuron dendrites. Annual Review of Neuroscience. 2013;36:1–24. doi: 10.1146/annurev-neuro-062111-150343. [DOI] [PubMed] [Google Scholar]
  70. Major G, Polsky A, Denk W, Schiller J, Tank DW. Spatiotemporally graded NMDA Spike/Plateau potentials in basal dendrites of neocortical pyramidal neurons. Journal of Neurophysiology. 2008;99:2584–2601. doi: 10.1152/jn.00011.2008. [DOI] [PubMed] [Google Scholar]
  71. Makara JK, Losonczy A, Wen Q, Magee JC. Experience-dependent compartmentalized dendritic plasticity in rat hippocampal CA1 pyramidal neurons. Nature Neuroscience. 2009;12:1485–1487. doi: 10.1038/nn.2428. [DOI] [PubMed] [Google Scholar]
  72. Makara JK, Magee JC. Variable dendritic integration in hippocampal CA3 pyramidal neurons. Neuron. 2013;80:1438–1450. doi: 10.1016/j.neuron.2013.10.033. [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Makino H, Malinow R. Compartmentalized versus global synaptic plasticity on dendrites controlled by experience. Neuron. 2011;72:1001–1011. doi: 10.1016/j.neuron.2011.09.036. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. McBride TJ, Rodriguez-Contreras A, Trinh A, Bailey R, DeBello WM. Learning drives differential clustering of axodendritic contacts in the barn owl auditory system. Journal of Neuroscience. 2008;28:6960–6973. doi: 10.1523/JNEUROSCI.1352-08.2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Megías M, Emri Z, Freund TF, Gulyás AI. Total number and distribution of inhibitory and excitatory synapses on hippocampal CA1 pyramidal cells. Neuroscience. 2001;102:527–540. doi: 10.1016/S0306-4522(00)00496-6. [DOI] [PubMed] [Google Scholar]
  76. Mel BW, Ruderman DL, Archie KA. Translation-invariant orientation tuning in visual "complex" cells could derive from intradendritic computations. Journal of Neuroscience. 1998;18:4325–.4334. doi: 10.1523/JNEUROSCI.18-11-04325.1998. [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Miller J.-e. K., Ayzenshtat I, Carrillo-Reid L, Yuste R. Visual stimuli recruit intrinsically generated cortical ensembles. Proceedings of the National Academy of Sciences of the United States of America. 2014;111:E4053. doi: 10.1073/pnas.1406077111. [DOI] [PMC free article] [PubMed] [Google Scholar]
  78. Mizuseki K, Buzsaki G. Theta oscillations decrease spike synchrony in the hippocampus and entorhinal cortex. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences. 2014;369 doi: 10.1098/rstb.2012.0530. [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Müller C, Beck H, Coulter D, Remy S. Inhibitory control of linear and supralinear dendritic excitation in CA1 pyramidal neurons. Neuron. 2012;75:851–864. doi: 10.1016/j.neuron.2012.06.025. [DOI] [PubMed] [Google Scholar]
  80. Nevian T, Larkum ME, Polsky A, Schiller J. Properties of basal dendrites of layer 5 pyramidal neurons: a direct patch-clamp recording study. Nature Neuroscience. 2007;10:206–214. doi: 10.1038/nn1826. [DOI] [PubMed] [Google Scholar]
  81. O'Neill J, Senior T, Csicsvari J. Place-selective firing of CA1 pyramidal cells during sharp Wave/Ripple network patterns in exploratory behavior. Neuron. 2006;49:143–155. doi: 10.1016/j.neuron.2005.10.037. [DOI] [PubMed] [Google Scholar]
  82. Ohiorhenuan IE, Mechler F, Purpura KP, Schmid AM, Hu Q, Victor JD. Sparse coding and high-order correlations in fine-scale cortical networks. Nature. 2010;466:617–621. doi: 10.1038/nature09178. [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Palmer LM, Shai AS, Reeve JE, Anderson HL, Paulsen O, Larkum ME. NMDA spikes enhance action potential generation during sensory input. Nature Neuroscience. 2014;17:383–390. doi: 10.1038/nn.3646. [DOI] [PubMed] [Google Scholar]
  84. Paninski L. The most likely voltage path and large deviations approximations for integrate-and-fire neurons. Journal of Computational Neuroscience. 2006;21:71–87. doi: 10.1007/s10827-006-7200-4. [DOI] [PubMed] [Google Scholar]
  85. Pfeiffer BE, Foster DJ. Hippocampal place-cell sequences depict future paths to remembered goals. Nature. 2013;497:74–79. doi: 10.1038/nature12112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Pfister J-P, Surace SC. Short-term facilitation as a normative consequence of presynaptic spike-rate adaptation. In Cosyne Abstracts; Salt Lake City, USA. 2014. [Google Scholar]
  87. Pfister JP, Dayan P, Lengyel M. Synapses with short-term plasticity are optimal estimators of presynaptic membrane potentials. Nature Neuroscience. 2010;13:1271–1275. doi: 10.1038/nn.2640. [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. Poirazi P, Brannon T, Mel BW. Arithmetic of subthreshold synaptic summation in a model CA1 pyramidal cell. Neuron. 2003;37:977–987. doi: 10.1016/S0896-6273(03)00148-X. [DOI] [PubMed] [Google Scholar]
  89. Poirazi P, Brannon T, Mel BW. Pyramidal neuron as two-layer neural network. Neuron. 2003b;37:989–999. doi: 10.1016/S0896-6273(03)00149-1. [DOI] [PubMed] [Google Scholar]
  90. Poirazi P, Mel BW. Impact of active dendrites and structural plasticity on the memory capacity of neural tissue. Neuron. 2001;29:779–796. doi: 10.1016/S0896-6273(01)00252-5. [DOI] [PubMed] [Google Scholar]
  91. Polsky A, Mel BW, Schiller J. Computational subunits in thin dendrites of pyramidal cells. Nature Neuroscience. 2004;7:621–627. doi: 10.1038/nn1253. [DOI] [PubMed] [Google Scholar]
  92. Pouget A, Dayan P, Zemel RS. Inference and computation with population codes. Annual Review of Neuroscience. 2003;26:381–410. doi: 10.1146/annurev.neuro.26.041002.131112. [DOI] [PubMed] [Google Scholar]
  93. Poulet JF, Petersen CC. Internal brain state regulates membrane potential synchrony in barrel cortex of behaving mice. Nature. 2008;454:881–885. doi: 10.1038/nature07150. [DOI] [PubMed] [Google Scholar]
  94. R Core Team . R: A Language and Environment for Statistical Computing. Austria: R Foundation for Statistical Computing, Vienna; 2012. [Google Scholar]
  95. Remondes M, Schuman EM. Direct cortical input modulates plasticity and spiking in CA1 pyramidal neurons. Nature. 2002;416:736–740. doi: 10.1038/416736a. [DOI] [PubMed] [Google Scholar]
  96. Renart A, de la Rocha J, Bartho P, Hollender L, Parga N, Reyes A, Harris KD. The asynchronous state in cortical circuits. Science. 2010;327:587–590. doi: 10.1126/science.1179850. [DOI] [PMC free article] [PubMed] [Google Scholar]
  97. Rumelhart DE, McClelland JL, Research Group PDP. MIT Press; 1986. Parallel distributed processing, volume 1: explorations in the microstructure of cognition: foundations. [Google Scholar]
  98. Schlingloff D, Kali S, Freund TF, Hajos N, Gulyas AI. Mechanisms of sharp wave initiation and ripple generation. Journal of Neuroscience. 2014;34:11385–11398. doi: 10.1523/JNEUROSCI.0867-14.2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  99. Schneidman E, Berry MJ, Segev R, Bialek W. Weak pairwise correlations imply strongly correlated network states in a neural population. Nature. 2006;440:1007–1012. doi: 10.1038/nature04701. [DOI] [PMC free article] [PubMed] [Google Scholar]
  100. Sengupta B, Laughlin SB, Niven JE. Consequences of converting graded to action potentials upon neural information coding and energy efficiency. PLoS Computational Biology. 2014;10:e1003439. doi: 10.1371/journal.pcbi.1003439. [DOI] [PMC free article] [PubMed] [Google Scholar]
  101. Seung HS, Sompolinsky H. Simple models for reading neuronal population codes. Proceedings of the National Academy of Sciences of the United States of America. 1993;90:10749–10753. doi: 10.1073/pnas.90.22.10749. [DOI] [PMC free article] [PubMed] [Google Scholar]
  102. Shadlen MN, Newsome WT. The variable discharge of cortical neurons: implications for connectivity, computation, and information coding. Journal of Neuroscience. 1998;18:3870–.3896. doi: 10.1523/JNEUROSCI.18-10-03870.1998. [DOI] [PMC free article] [PubMed] [Google Scholar]
  103. Shepherd GM. The Synaptic Organization of the Brain. Oxford University Press; 2004. [DOI] [Google Scholar]
  104. Silver RA. Neuronal arithmetic. Nature Reviews. Neuroscience. 2010;11:474–489. doi: 10.1038/nrn2864. [DOI] [PMC free article] [PubMed] [Google Scholar]
  105. Simoncelli EP, Olshausen BA. Natural image statistics and neural representation. Annual Review of Neuroscience. 2001;24:1193–1216. doi: 10.1146/annurev.neuro.24.1.1193. [DOI] [PubMed] [Google Scholar]
  106. Single S, Borst A. Dendritic integration and its role in computing image velocity. Science. 1998;281:1848–1850. doi: 10.1126/science.281.5384.1848. [DOI] [PubMed] [Google Scholar]
  107. Sjöström PJ, Häusser M. A cooperative switch determines the sign of synaptic plasticity in distal dendrites of neocortical pyramidal neurons. Neuron. 2006;51:227–238. doi: 10.1016/j.neuron.2006.06.017. [DOI] [PubMed] [Google Scholar]
  108. Sjöström PJ, Rancz EA, Roth A, Häusser M. Dendritic excitability and synaptic plasticity. Physiological Reviews. 2008;88:769–840. doi: 10.1152/physrev.00016.2007. [DOI] [PubMed] [Google Scholar]
  109. Smith SL, Smith IT, Branco T, Häusser M. Dendritic spikes enhance stimulus selectivity in cortical neurons in vivo. Nature. 2013;503:115–120. doi: 10.1038/nature12600. [DOI] [PMC free article] [PubMed] [Google Scholar]
  110. Soetaert K, Petzoldt T, Setzer RW. Solving differential equations in r : package deSolve. Journal of Statistical Software. 2010;33:pp. 1–pp. 25. doi: 10.18637/jss.v033.i09. [DOI] [Google Scholar]
  111. Spruston N. Pyramidal neurons: dendritic structure and synaptic integration. Nature Reviews Neuroscience. 2008;9:206–221. doi: 10.1038/nrn2286. [DOI] [PubMed] [Google Scholar]
  112. Stuart G, Spruston N, Häusser M. Dendrites. Oxford University Press; 2007. [DOI] [Google Scholar]
  113. Takahashi N, Kitamura K, Matsuo N, Mayford M, Kano M, Matsuki N, Ikegaya Y. Locally synchronized synaptic inputs. Science. 2012;335:353–356. doi: 10.1126/science.1210362. [DOI] [PubMed] [Google Scholar]
  114. Ujfalussy B, Kiss T, Érdi P. Parallel computational subunits in dentate granule cells generate multiple place fields. PLoS Computational Biology. 2009;5:e1000500. doi: 10.1371/journal.pcbi.1000500. [DOI] [PMC free article] [PubMed] [Google Scholar]
  115. Ujfalussy BB, Lengyel M. In: Advances in Neural Information Processing Systems 24. Shawe-Taylor J, Zemel R. S, Bartlett P, Pereira F. C. N, Weinberger Q. W, editors. 2011. Active dendrites: adaptation to spike-based communication; pp. 1188–1196. [Google Scholar]
  116. Ujfalussy BB, Lengyel M. Feed-forward inhibition in hippocampal microcircuits: adaptation to spike-based computation.. In Cosyne Abstracts; Salt Lake City, USA. 2013. [Google Scholar]
  117. Varga Z, Jia H, Sakmann B, Konnerth A. Dendritic coding of multiple sensory inputs in single cortical neurons in vivo. Proceedings of the National Academy of Sciences of the United States of America. 2011;108:15420–15425. doi: 10.1073/pnas.1112355108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  118. Vervaeke K, Lorincz A, Nusser Z, Silver RA. Gap junctions compensate for sublinear dendritic integration in an inhibitory network. Science. 2012;335:1624–1628. doi: 10.1126/science.1215101. [DOI] [PMC free article] [PubMed] [Google Scholar]
  119. Vogels TP, Sprekeler H, Zenke F, Clopath C, Gerstner W. Inhibitory plasticity balances excitation and inhibition in sensory pathways and memory networks. Science. 2011;334:1569–1573. doi: 10.1126/science.1211095. [DOI] [PubMed] [Google Scholar]
  120. Wilms CD, Häusser M. Reading out a spatiotemporal population code by imaging neighbouring parallel fibre axons in vivo. Nature Communications. 2015;6:6464. doi: 10.1038/ncomms7464. [DOI] [PMC free article] [PubMed] [Google Scholar]
  121. Wilson NR, Runyan CA, Wang FL, Sur M. Division and subtraction by distinct cortical inhibitory networks in vivo. Nature. 2012;488:343–348. doi: 10.1038/nature11347. [DOI] [PMC free article] [PubMed] [Google Scholar]
  122. Winnubst J, Cheyne JE, Niculescu D, Lohmann C. Spontaneous activity drives local synaptic plasticity in vivo. Neuron. 2015;87:399–410. doi: 10.1016/j.neuron.2015.06.029. [DOI] [PubMed] [Google Scholar]
  123. Witter MP, Groenewegen HJ, Lopes da Silva FH, Lohman AHM. Functional organization of the extrinsic and intrinsic circuitry of the parahippocampal region. Progress in Neurobiology. 1989;33:161–253. doi: 10.1016/0301-0082(89)90009-9. [DOI] [PubMed] [Google Scholar]
  124. Xu NL, Harnett MT, Williams SR, Huber D, O'Connor DH, Svoboda K, Magee JC. Nonlinear dendritic integration of sensory and motor input during an active sensing task. Nature. 2012;492:247–251. doi: 10.1038/nature11601. [DOI] [PubMed] [Google Scholar]
  125. Ylinen A, Bragin A, Nádasdy Z, Jandó G, Szabó I, Sik A, Buzsáki G. Sharp wave-associated high-frequency oscillation (200 hz) in the intact hippocampus: network and intracellular mechanisms. Journal of Neuroscience. 1995;15:30–46. doi: 10.1523/JNEUROSCI.15-01-00030.1995. [DOI] [PMC free article] [PubMed] [Google Scholar]
  126. Zador AM. The basic unit of computation. Nature Neuroscience. 2000;3:1167. doi: 10.1038/81432. [DOI] [PubMed] [Google Scholar]
eLife. 2015 Dec 24;4:e10056. doi: 10.7554/eLife.10056.022

Decision letter

Editor: Frances K Skinner1

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

[Editors’ note: a previous version of this study was rejected after peer review, but the authors submitted for reconsideration. The first decision letter after peer review is shown below.]

Thank you for choosing to send your work entitled "Dendritic nonlinearities are tuned for efficient spike-based computations in cortical circuits" for consideration at eLife. Your full submission has been evaluated by Eve Marder (Senior editor) and three peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the decision was reached after very extensive discussions between the reviewers. Based on our discussions and the individual reviews below, we have decided not to consider the present manuscript further at this time. However, we would be willing to consider a significantly revised resubmission (below).

While the reviewers all felt that the work was quite interesting, concern was raised in the ensuing discussions as to the unclear presentation of the essence of the work. In fact, the reviewers spent a considerable time discussing what they thought was the structure of the model, and what it means. The fact that the reviewers went back and forth over these issues indicates a lack of clarity in the Results section. Therefore, it was decided that the work needed significant revisions and significant effort would be required to make the work accessible to the general reader, not only in making the take home message clear, but also in explaining how and why detailed mathematical derivations are needed and used.

Overall, two major aspects require consideration, as summarized below with some suggestions. Additional detailed comments from the reviewers are appended below.

1) A clearer and simpler version of the paper is needed.

A clear line through the reasoning that will allow everyone to see the core of what has been done, and why, needs to be presented. Specifically, the paper needs to be transparent and math that obfuscates should be removed or simplified or explained. The authors are encouraged to present their essential results in a much more straightforward fashion to help elucidate their mathematical derivations.

A suggestion is to use a flow chart of the process/protocol (capturing presynaptic firing statistics, fitting, optimizing etc., and lay bare the limitations/assumptions – e.g., the output computation f(u) is assumed to be a linear function), describe the essential results in words, and point to the mathematical parts in methods/or a mathematical appendix of how and why. We understand and appreciate that you wish to publish the full mathematical treatment, but most eLife readers won't benefit from the math in its present form. The mathematical results could be made complete and readable in an Appendix, while freeing the Results to show the logic of the model and how it works. It looks complicated because of using the most general formulations (multi-state state transitions, and multi-variate normal with arbitrary state-dependent covariance), but the main results are obtained via somewhat simpler formulations. More detailed mathematical derivations could go into a mathematical appendix if necessary.

In this way, people (of diverse backgrounds) would be able to sink their teeth into the contribution, build on it, test it and so on.

2) Take-home message(s) and biological/biophysical intuition needs to be presented.

No biophysical model seems to be included, even simply. Everything seems to be folded into the optimal g(s), so that it is not obvious how best to make a biological/biophysical connection.

The derived 'optimal traces' do, in principle, bring a more stringent test to the theory, but in Figure 4 they have access to the whole pre-synaptic dynamics so the optimal trace can be replaced by the averaged pre-synaptic potential, while in Figure 5 the experimental pattern of stimulation is not sufficiently complex to really probe the detailed temporal structure predicted by the theory. The other figures support qualitative statements. This, for example, could be explicitly stated and rationalized for the reader.

Some aspects came through but because of the first point above, there was not always a clear consensus of what the authors intended as the main message(s) from the work. Could these intents (if appropriate) be better explained?

For example:i) Presynaptic neurons with correlated inputs should target neighboring regions of the dendritic tree so that the inputs can sum nonlinearly while presynaptic neurons with uncorrelated inputs should map onto different portions of the tree and the inputs sum linearly?ii) An optimizing aspect/principle that could ultimately predict the 'location structure' of synapses as a crucial test of their theory.iii) Optimally decoding the information present in the spikes to do the desired computation depends upon the correlationsiv) Addressing the question of why there are nonlinearities in the first place.v) An optimal way to recover the averaged presynaptic potential only from the pre-synaptic spike trains? Can the biophysics approximate this optimal 'decoding' operation?

Reviewer #1:

In this most interesting study, the authors propose a principle that dendritic nonlinearity is optimized to integrate synaptic inputs from the statistics of the presynaptic cells. This principle assumes that communication is mainly via spikes whereas computation is not 'punctate', thus identifying the bottleneck of dendritic processing. They go on to use models (clustered connectivity prediction) and experiment to show that this principle is followed.

The authors may have uncovered a fundamental principle, and as they state in the Discussion, "Once patterned dendritic stimulation over a broader and more realistic range of inputs becomes feasible, our theory will provide a principled method for dissecting the roles of presynaptic correlations vs. genuine nonlinear computations in shaping dendritic nonlinearities." Nice I think.

I have a few comments.

1) The authors could state that graded synaptic transmission exists (many examples in invertebrates), so that their principle may not apply there, I assume. Is it definitively known that graded synaptic transmission is not present in cortical circuits?

2) It would be helpful if the authors could expand on the discussion regarding biophysical substrate. That is, to fill in the blanks on how and why "NMDA spikes may provide a general solution". Was some modeling (not shown) done to be able to state this? Please provide some additional discussion/rationale/assumptions for making these statements.

3) Regarding inhibition, the authors discuss potential addition from the perspective of excitatory cells. However, perspectives from inhibitory cells are not expressed. Do the authors expect their principle to hold from the perspective of inhibitory cells (receiving excitatory and/or inhibitory inputs)? The authors should provide some discussion/rationale/assumptions of why this is not included/considered. I do not think it is a given that computation is only from the perspective of excitatory cells? For example, consider the reviews by Klausberger and Somogyi (Science 2008) and Chamberland and Topolnik (Frontiers in Neuroscience 2012).

Reviewer #2:

Nonlinear dendritic integration is often seen as a thorn in the foot. Following in vitro work by Larkum, Magee, Stuart, Hausser and others, it is now clear that dendrites introduce various types of nonlinearity to input integration. Further observations showed physiological patterns that appeared to tune specifically the dendritic nonlinearities. Computations that are carried out in the dendrites (as in Taylor et al. 2000) were then discovered. Thus, it has become increasingly difficult to ignore their computational role. The search for a theoretical role (mainly by Mel and coworkers), on the other hand, was mainly focused on rate-based computations. Yet, rate and spiking descriptions are often qualitatively different (as in plasticity or synfire chains). Understanding the key mechanisms for spike-based dendritic computations remains a fundamental problem, but a very difficult one.

In this article, Ujfalussy and coworkers address the hypothesis that nonlinear dendritic integration serves to decode the correlated pre-synaptic spike trains. They describe mainly three observations that back up this hypothesis. First, they show mathematically that the canonical model of nonlinear integration of Poirazi and Mel (P&M) approximates an optimal decoding of the switching between active and quiescent states. Second, they compared how different biophysically detailed models approximated an optimal decoding strategy. They found that synaptic input clustered on nonlinear dendrites performed better at this task than when the same input was lumped in the soma, or synapses randomly assigned to dendrites. Last, they show that in vitro measurements of dendritic integration in a specific cell type matched the optimal integration strategy expected to decode the input statistics of that particular cell in vivo. These are the three main observations, but a particular strength of this article is that it draws a clear top-down rhetoric and provides general estimates of (bayes) optimal spiking communication.

Although I don't subscribe to all the assumptions described here, they are sensible and worth a careful examination. Ujfalussy and coworkers offer a thorough and elegant treatment of the question relying substantially on most of the state-of-the-art experimental recordings relevant to this problem. The conclusions are well tested, often with very stringent criteria. The relevant literature is properly cited. The article is generally well written.

The main contributions will appeal to researchers coming from different approaches. The mathematical neuroscientist will respect the relation between the P&M model and optimal decoding. Those considering decoding spike train from a bayesian perspective will check the derivations in the supplementary material. Others coming from the literature on noise-correlation will appreciate that, with dendrites, some of the reasonings should be revised. Finally, the new theoretical role for dendritic nonlinearities can clearly be tested with present experimental techniques. For these reasons, I believe that even if the supposed role is later shown to be false or incomplete, the article will have had a strong impact on the community.

Reviewer #2 (Minor Comments):

Generally, confusion arises in the number of models considered. I think that, when a comparison between models is the main result, these different models should be mentioned at the beginning of the section. For instance in the last paragraph of the section “The form of the optimal nonlinearity depends on the statistics of presynaptic inputs” one suddenly learns of the actual comparisons being made.

On a first reading, I could not understand what was done in the first paragraph of the section” Nonlinear integration in cortical neurons is matched to their input statistics”. After reading the supplementary, it is clear, but I think a few sentences should be added to describe in a little more details what was actually compared. For instance, one can read from the first sentence that to test the theory, we need to measure presynaptic statistics and derive the optimal nonlinearity, then measure the nonlinearity in vitro and compare the nonlinearities. That is not what was done, I know now, but I could not rule that out from the paragraph as is. It is also difficult to understand the end of second paragraph of the same section on the first go.

What is the experimental stimulus protocol used in the uncaging experiments? How broad is the distribution of ISI? Evenly distributed on Figure 5c and g? Are each burst made of evenly spaced pre-synaptic encaging? Can you really say overfitting is small?

Why exactly the multi-state description in Eq. 11 and 19 if it is never used? Personally, it is only when I saw Eq. S67 that I finally reached the proper mathematical intuitions. Other than the need for simplicity, I am concerned that some of the conclusions would not hold in the multi-state scenario suggested by Eq. 11.

Figure 1: 1B is argued to offer a better fit to optimal than 1C. But there are two differences, correlation and scale. Could it be that in the uncorrelated case, the fit to optimal is not as good when the potential fluctuates on greater amplitudes? In fact, I don't understand why the uncorrelated case should not have a saturating dendritic nonlinearity to counteract the exponential spiking nonlinearity."

Reviewer #3:

The authors present a detailed mathematical analysis, computations and experimental results aimed at supporting the conclusion that dendritic nonlinearities used to integrate synaptic inputs in a neuron are optimized based upon the statistics of that neuron's presynaptic connections. In particular, the suggested result is that if a neuron's presynaptic connections are uncorrelated, sublinear responses are optimal. If, however, the firing of presynaptic neurons are correlated, supralinear responses are optimal for that neuron. If correct, I would find this conclusion to be interesting.

One of the main difficulties with the manuscript is that the details are not easily approachable. There are many pages of complicated mathematics that one must work through, and in many places there hasn't been much attempt to try to explain things in simpler terms to the general reader – I suspect that one will not be able to follow the details here unless one is already an expert.

While the major focus on presynaptic activity is in terms of spikes (and I will grant that a good explanation for this is given), I think it's a concern is that the focus on postsynaptic activity is in terms of subthreshold responses. Granted, the authors say that they can also perform their calculation if the firing rate is substituted for the subthreshold response. This seems like an inconsistency, however – if two or more neurons share synaptic inputs from overlapping sets of presynaptic neurons, and the correlations in the spiking patterns of the presynaptic neurons matter, why is it that correlations in the output spikes of those two or more neurons don't matter? This would seem to be a basic internal inconsistency in the manuscript.

In addition, while there is a huge amount of detail associated with the mathematical calculations, I found the details associated with the experiments to be a bit too sparse. It would have been useful, for example, to see an image of the neocortical pyramidal cell and the specific synapses which are being stimulated.

[Editors’ note: what now follows is the decision letter after the authors submitted for further consideration.]

Thank you for submitting your work entitled "Dendritic nonlinearities are tuned for efficient spike-based computations in cortical circuits" for peer review at eLife. Your submission has been favorably evaluated by Eve Marder (Senior Editor) and three reviewers, one of whom is a member of our Board of Reviewing Editors.

The reviewers have discussed the reviews with one another and the Reviewing editor has drafted this decision to help you prepare a revised submission.

All of the reviewers are strongly positive about this work but they also want to ensure that the work is accessible, understandable and widely appreciated. They would like all aspects be made as clear as possible. Thus, pending the following revisions, we expect to accept the paper.

Please respond directly to the revisions on a point-by-point basis, revising the paper where needed.

1) Double check all equations in the mathematical appendix – there was some concern given potential sign errors in other equations (see below). Additional explanatory guidance through the mathematical steps should be provided for those readers who may be interested in following these details. The math is still very complicated.

2) Check and correct potential inconsistencies related to the description of the function s(t). In the subsection “Presynaptic statistics” it is still stated that in the limit in which the time step goes to zero the discrete version becomes binary, i.e., zero or 1. In the subsection “Inference and the optimal response”, however it's stated that s(t) is a sum of Dirac-delta functions. The two descriptions can't be consistent, of course, unless there is a division by a time step somewhere. There does not appear to be a careful definition in the manuscript, however.

3) Check the solution of equation (22) given in (23) and (24) for potential sign errors.

4) The additional details about the experiments were appreciated. However, a question arose about the blocking of NMDA channels. It's stated that substantially poorer fits were obtained for this case. Please confirm/clarify whether the postsynaptic filtering parameters were re-fit in this case?

5) It seems like a bit of an inconsistency that while a big point is made that presynaptic neurons must communicate with spikes, no spiking is considered in the postsynaptic neurons. If spiking is the only way that neurons can communicate (the authors' point), why should one care about the subthreshold response of the postsynaptic neuron? Since so much of the manuscript covers the subthreshold response, something explicit (explanation-wise) should be said about this in the manuscript (and we realize that they consider the case of firing rates).

6) Include a 'toy model' explanation (without math, or if possible, with straightforward enough math equations that most can follow) that summarizes the link between pairwise correlation (temporal) and dendritic integration (spatial) before any specific statistical models are discussed. This would help the reader follow the more detailed case, and avoid confusion in consideration of later statements in the paper. A good place for this might be at the beginning of the Results section.

Specifically, the way the text reads is confusing about when spatial correlations (for lack of a better term) between different presynaptic neurons are being discussed, and when temporal correlations between inputs are being considered. Part of the issue may be that the mathematical formulation in the main text completely hides issues of temporal correlation in the details. Thus one can be left wondering if one of the two types of correlations might be more important and if the theory could shed light on this.

For example, consider these comments from the manuscript:

"However, if presynaptic neurons became correlated the optimal response became nonlinear and the best linear response was unable to accurately follow the fluctuations in the input (Figure 1D)."

"Same as (A-C) but for presynaptic neurons exhibiting synchronized switches between a quiescent and an active state (D, bottom), introducing higher order correlations between the neurons." (Figure 2 caption)

"In particular, sublinear integration was optimal when presynaptic activities exhibited simple Gaussian random walks and thus they could not contain statistical dependencies beyond second order correlations (Figure 2A-C)."

The authors here are trying to provide an intuitive explanation as to why input correlations lead to the observed behavior and this, of course, is good and to be encouraged. In each case, however, what type of correlation is being thought of – spatial or temporal? Further elaboration might provide additional insight.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for submitting your work entitled "Dendritic nonlinearities are tuned for efficient spike-based computations in cortical circuits" for consideration by eLife. Your article has been reviewed by three peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Eve Marder as the Senior Editor.

The reviewers have discussed the reviews with one another and the Reviewing editor has drafted this decision to help you prepare a revised submission.

The reviewers appreciated the revisions done by the authors, but it was felt that the intuition as provided was incomplete and the mathematical appendix was still not clearly presented. Specifically, it was noted that is unclear how the authors distinguish autocorrelation from cross-correlation (spatial) in the formalism and the text, and as this is a main point of the paper, it should be up front early on in the Discussion (and not just in caption of Figure 1 and later in Methods). Also, eq A91 has an exact solution (separable). In essence, the math appendix should only include what is needed to explain the methods – anything that is more extensive than needed should be either removed or simplified.

In summary, the authors should carefully go through the entire ms to ensure that these aspects are addressed in general. Please also note specific additional comments from the reviewers to address:

1) The manuscript still has trouble delineating spatial vs temporal correlations when it tries to explain different points. For example:

Introduction section: "We pursued this principle in understanding the integrative properties of individual cortical neurons, for which the relevant statistical input patterns are those characterising the dynamically evolving spike trains of their presynaptic partners." This sounds like the manuscript is concerned with temporal issues.

"The optimal response determined by this statistical model, for essentially any setting of parameters, was inherently nonlinear because the additional effect of a presynaptic spike depended on the pattern of spikes that had been previously received from the presynaptic population and thus the integrated effect of multiple spikes could not be computed as a simple linear sum of their individual effects in isolation." This also sounds to me as if a claim that the temporal pattern of inputs is what's leading to nonlinearity.

2) Regarding the errors in the mathematics, we thought that when the authors repaired the solution of Equation 27 they would also fix other instances of this equation in the manuscript. Equation 27 is recapitulated in the mathematical appendix as Equation A91, but the discussion around Equation A91 is clearly wrong. Somehow a different solution from the solution of Equation 27 is obtained; the answer should be equation A92 with Α = 0,Β = B and C = −A. Here’s how the solution goes: we start with

ζ˙=ζ(1ζ)[Α+Βs(t)].

This equation is separable:

dζζ(1ζ)=[A+Bs(t)]dt.

The left-hand size can be simplified by partial fractions:

1ζ(1ζ)=1ζ+11ζ,

so that

[1ζ+11ζ]dζ=[A+Bs(t)]dt.

Both sides of the equation can now be integrated. Assuming 0 < ζ < 1, we get

lnζln(1ζ)=[A+Bs(t)]dt.

or equivalently,

lnζln(1ζ)=v(t)

where

v˙=A+Bs(t).

Then solving

ζ1ζ=evζ=ev1+ev+11+ev.

We don’t know what the authors are trying to do with the rest of Section B.2.2, but it appears to be wrong. We also don’t know to what extent fixing this will affect the rest of the results.

eLife. 2015 Dec 24;4:e10056. doi: 10.7554/eLife.10056.023

Author response


[Editors’ note: the author responses to the first round of peer review follow.]

We included a thorough answer to all points raised by the reviewers. In particular, we substantially simplified the presentation of the paper by rewriting the introduction and the first part of the results section; adding a flowchart to Figure 1 giving a concise summary of the logic of the manuscript; and rewriting several parts of the Methods section to streamline it. Moreover, we highlighted the take-home messages at the beginning of the Discussion. In addition, we also included experiments using NMDA receptor blockade to provide more mechanistic insights; added a supplementary figure demonstrating that our stimulation protocol is sufficiently complex to probe dendritic integration in the relevant input range and added a supplementary figure where we compared the responses of different models to the averaged presynaptic potential instead of the optimal response.

[…] Overall, two major aspects require consideration, as summarized below with some suggestions. Additional detailed comments from the reviewers are appended below.

1) A clearer and simpler version of the paper is needed. A clear line through the reasoning that will allow everyone to see the core of what has been done, and why, needs to be presented. Specifically, the paper needs to be transparent and math that obfuscates should be removed or simplified or explained.

We substantially simplified the presentation of the math at the beginning of the paper and removed all math that is not essential to follow the logic of the presentation. We believe that the equations that remained in the main text (4 out of 7 in the previous version) are essential to introduce key concepts of the paper. These equations are now explained and illustrated in the text and presented in a highly simplified way.

The authors are encouraged to present their essential results in a much more straightforward fashion to help elucidate their mathematical derivations. A suggestion is to use a flow chart of the process/protocol (capturing presynaptic firing statistics, fitting, optimizing etc., and lay bare the limitations/assumptionse.g., the output computation f(u) is assumed to be a linear function),

As suggested, we added a flowchart presenting the logic of the paper to the first figure. This new panel also serves as a roadmap to the rest of the paper. In the flowchart, we highlighted our assumptions, illustrated the relationships between different parts of the paper and emphasized how specific figures in the paper support our conclusions. Moreover, we added a new paragraph describing the structure of the paper at the end of the Introduction and another paragraph in the Results that uses the flowchart to explain the logic of the Results section after introducing the key concepts at the beginning. We believe that these changes will substantially help the reader to follow the presentation of the paper.

[…] Describe the essential results in words, and point to the mathematical parts in methods/or a mathematical appendix of how and why.

We removed a substantial fraction of the equations from the main text (see above), and clarified the presentation in both the Results and Methods sections. Our main goal was to keep only those mathematical formulae in the main text that were absolutely essential to follow the logic of the paper. Those details that are only necessary to reproduce our results are included in the Methods section while further details of the derivations have been moved to the mathematical appendix.

We understand and appreciate that you wish to publish the full mathematical treatment, but most eLife readers won't benefit from the math in its present form. The mathematical results could be made complete and readable in an Appendix, while freeing the Results to show the logic of the model and how it works. It looks complicated because of using the most general formulations (multi-state state transitions, and multi-variate normal with arbitrary state-dependent covariance), but the main results are obtained via somewhat simpler formulations.

We changed the formula describing the state transition dynamics in the model as suggested by the Reviewers, and this, indeed, simplified substantially the presentation of both the model for the presynaptic statistics and the optimal response (Eq. 15-17). We also added an intuitive explanation of the equations describing the optimal response. We also greatly simplified the derivation of the dendritic approximation of the optimal response which provided a more intuitive justification for the sigmoidal nonlinearity (see subsection “Dendritic approximation of the optimal response”). In the representation of the optimal response, the multivariate (rather than univariate) normal and the state-dependent posterior covariance matrix are necessary consequences of having more than 1 presynaptic neuron and 1 state, respectively, which are both key to our results, and so we could not simplify further this part of the Methods section.

More detailed mathematical derivations could go into a mathematical appendix if necessary. In this way, people (of diverse backgrounds) would be able to sink their teeth into the contribution, build on it, test it and so on.

We have restructured the paper following this suggestion. We believe that the paper is now presented in a form that is suitable for readers with diverse backgrounds: researchers with experimental interest can follow the logic of the paper by reading the main text, while readers interested in the mathematical details can consult the Methods section and the Mathematical Appendix. We believe that the minimal amount of mathematical details that we left in the main text is necessary to clearly define the concepts used in the paper and to avoid misunderstandings due to the novelty of our approach, and that these formulae and mathematical concepts are sufficiently lightweight (with the corresponding explanations in the text) even for the non-specialist reader.

2) Take-home message(s) and biological/biophysical intuition needs to be presented. No biophysical model seems to be included, even simply. Everything seems to be folded into the optimal g(s), so that it is not obvious how best to make a biological/biophysical connection.

We note that the first paragraph of the section “Nonlinear dendrites closely approximate the optimal response” in the Results starts by considering potential biophysical substrates of the optimal g(s), and indeed that whole section is devoted to show how the optimal response can be (approximately) implemented by a simplified mechanistic, biophysically motivated model of dendritic integration that has enjoyed prominent success and wide popularity in the field (Poirazi & Mel, 2001). Crucially, this mechanistic model has been extensively linked to biophysical processes by previous studies so we felt that by way of linking our theory to this model we do provide biophysical insights. We also devote the first subsection of the Discussion (now expanded, see below) to discussing biophysical substrates.

Nevertheless, to better manage readers’ expectations, we included “disclaimers” upfront, already in the Introduction, and the first section of the Results, explaining how the inherently top-down nature of our approach does not allow us to make specific biophysical predictions, but how we are still able to provide some biophysical insights. In fact, to provide tighter links with biophysical factors, we also included, as an initial attempt, a novel set of experimental data directly probing the contribution of NMDA receptors to optimal input integration (Figure 5E and J). Specifically, we found that after blocking the NMDA currents dendritic integration became incompatible with the known presynaptic statistics, both in the neocortex and in the hippocampus, supporting the crucial role of NMDA currents in matching dendritic integration to the input statistics.

The derived 'optimal traces' do, in principle, bring a more stringent test to the theory, but in Figure 4 they have access to the whole pre-synaptic dynamics so the optimal trace can be replaced by the averaged pre-synaptic potential,

Indeed, in Figure 4 we could show the averaged pre-synaptic potential together with the optimal trace. However, even in this test the computation is spike based and the postsynaptic neuron does not have access to the presynaptic membrane potentials. In this case the optimal response is the theoretically best possible approximation to the presynaptic potentials that is achievable to the postsynaptic cell given the limited information conveyed by the irregular firing of the presynaptic neurons. Therefore, in Figure 4 we compared the models’ responses to the optimal response and included an additional supplementary figure (Figure 4—figure supplement 1) where we compare the responses to the averaged presynaptic potential.

[…] While in Figure 5 the experimental pattern of stimulation is not sufficiently complex to really probe the detailed temporal structure predicted by the theory.

We agree with the reviewers that the currently used experimental protocols are relatively simplistic compared to the highly structured spatio-temporal stimuli experienced by the dendrites under in vivo conditions. However, we found that despite their relative simplicity, our in vitro stimuli activated the dendritic branches in a similar range (in terms of the integrated postsynaptic effect of synaptic stimulations) as could be expected during in vivo-like situations. To illustrate this point, we prepared a supplementary figure (Figure 1—figure supplement 2) where we compare the total input to a given branch during an in vitro experiment with the total input when the stimuli is sampled from the model matched to in vivo data. We found an excellent match between these two distributions suggesting that our stimulation protocols probed integration in a single dendritic branch in the physiologically relevant input range.

Crucially, our approach will also allow us in the future to modify the in vitro stimulation protocols to make them similar also in their structural details to the patterns typical in vivo. Designing such novel, richer but experimentally still feasible protocols possibly optimised for gaining information about dendritic nonlinearities is a highly non-trivial endeavour on its own right (where collaboration between theory and experiments can indeed bring significant advances) and was thus beyond the scope of the current paper.

The other figures support qualitative statements. This, for example, could be explicitly stated and rationalized for the reader.

We feel that Figure 3 is also quantitative in that it shows a direct numerical comparison of the optimal response and the response of the canonical biophysically-motivated model. Figure 2 is indeed qualitative and we now make this clear upfront in our summary flowchart (Figure 1B). More generally, we included a roadmap to the paper at the end of the Introduction which also states explicitly that this first step is qualitative.

Some aspects came through but because of the first point above, there was not always a clear consensus of what the authors intended as the main message(s) from the work. Could these intents (if appropriate) be better explained? For example:i) Presynaptic neurons with correlated inputs should target neighboring regions of the dendritic tree so that the inputs can sum nonlinearly while presynaptic neurons with uncorrelated inputs should map onto different portions of the tree and the inputs sum linearly?ii) An optimizing aspect/principle that could ultimately predict the 'location structure' of synapses as a crucial test of their theory.iii) Optimally decoding the information present in the spikes to do the desired computation depends upon the correlationsiv) Addressing the question of why there are nonlinearities in the first place.v) An optimal way to recover the averaged presynaptic potential only from the pre-synaptic spike trains? Can the biophysics approximate this optimal 'decoding' operation?

We rewrote the first paragraph of the Discussion where we first highlight the main messages of the paper and then close the paragraph by a simple take-home message. We hope that these changes, together with the restructured Introduction of the paper will help the reader to identify the major contributions and the main messages of our work.

Reviewer #1:

1) The authors could state that graded synaptic transmission exists (many examples in invertebrates), so that their principle may not apply there, I assume. Is it definitively known that graded synaptic transmission is not present in cortical circuits?

Synaptic transmission can be modulated by graded signals, such as the somatic depolarisation of the presynaptic neuron (Shu et al., 2006), and indeed, this is why we wrote in the manuscript that “cortical neurons mainly communicate by action potentials”, but this effect is believed to have a minor importance relative to the coding of information by action potentials. We added a sentence about graded synaptic transmission in the Discussion.

2) It would be helpful if the authors could expand on the discussion regarding biophysical substrate. That is, to fill in the blanks on how and why "NMDA spikes may provide a general solution". Was some modeling (not shown) done to be able to state this? Please provide some additional discussion/rationale/assumptions for making these statements.

To thoroughly answer this question of the reviewer we made two distinct changes in the manuscript: First, in order to more directly demonstrate the role of NMDA receptors in tuning dendritic nonlinearities to the statistics of the input we tested how well our theory can predict experimental data recorded under their pharmacological blockade. We found that the prediction of our theory became substantially poorer in both the hippocampus and the neocortex after blocking the NMDA receptors supporting their crucial role in matching dendritic integration to the input statistics. Second, we expanded the Discussion regarding the possible role of NMDA receptors in tuning dendritic nonlinearities to state switching dynamics.

3) Regarding inhibition, the authors discuss potential addition from the perspective of excitatory cells. However, perspectives from inhibitory cells are not expressed. Do the authors expect their principle to hold from the perspective of inhibitory cells (receiving excitatory and/or inhibitory inputs)? The authors should provide some discussion/rationale/assumptions of why this is not included/considered.

We added a paragraph to the section “Inhibitory Neurons” discussing the possible application of the theory to inhibitory neurons. In short, our theory applies equally well to dendritic integration in inhibitory cells. We focused on pyramidal neurons because their nonlinearities have been much more extensively characterised.

I do not think it is a given that computation is only from the perspective of excitatory cells? For example, consider the reviews by Klausberger and Somogyi (Science 2008) and Chamberland and Topolnik (Frontiers in Neuroscience 2012).

We agree with the reviewer that computation is not necessarily exclusive to excitatory neurons. In fact we discuss that they might have a similar role in single neuron computation as excitatory neurons.

Reviewer #2:

Nonlinear dendritic integration is often seen as a thorn in the foot. Following in vitro work by Larkum, Magee, Stuart, Hausser and others, it is now clear that dendrites introduce various types of nonlinearity to input integration.

[…]

For these reasons, I believe that even if the supposed role is later shown to be false or incomplete, the article will have had a strong impact on the community.

We are grateful for reviewer 2’s constructive comments. The wording of several of his sentences motivated us to rewrite our paper in a clearer way potentially more understandable for people with diverse background.

Reviewer #2 (Minor Comments):

Generally, confusion arises in the number of models considered. I think that, when a comparison between models is the main result, these different models should be mentioned at the beginning of the section. For instance in the last paragraph of the section “The form of the optimal nonlinearity depends on the statistics of presynaptic inputs” one suddenly learns of the actual comparisons being made.

We agree with the reviewer that it is crucial to clearly introduce the models used in the manuscript when they are first mentioned. Therefore we carefully checked that not only models but also the important concepts are introduced and defined when first presented in the main text. As for the specific section to which the reviewer is referring, we specified the details of the models the first time they are introduced (sections “Optimal input integration is nonlinear” and” The form of the optimal nonlinearity depends on the statistics of presynaptic inputs”) such that when we refer to them later they do not come as a surprise.

On a first reading, I could not understand what was done in the first paragraph of the section” Nonlinear integration in cortical neurons is matched to their input statistics”. After reading the supplementary, it is clear, but I think a few sentences should be added to describe in a little more details what was actually compared. For instance, one can read from the first sentence that to test the theory, we need to measure presynaptic statistics and derive the optimal nonlinearity, then measure the nonlinearity in vitro and compare the nonlinearities. That is not what was done, I know now, but I could not rule that out from the paragraph as is. It is also difficult to understand end of the second paragraph of the same section on the first go.

We added a clause to the referred paragraph and clarified that when testing the theory we compared the actual and the predicted responses instead of the form of the nonlinearities themselves. We also clarified what we mean by the “parameters related to the postsynaptic dendritic filtering”.

What is the experimental stimulus protocol used in the uncaging experiments? How broad is the distribution of ISI? Evenly distributed on Figure 5c and g? Are each burst made of evenly spaced pre-synaptic encaging?

We added a separate paragraph about the stimulation protocols we used to the Methods section where we describe that in the uncaging experiments we targeted visually identified dendritic spines (1-7 in the cortex, 1-30 in the hippocampus) with a burst of presynaptic uncaging stimuli evenly spaced in time. We varied either the number of stimuli in the burst (1-7 in the neocortex, 1-30 in the hippocampus) or the delay between the uncaging events within the burst (1-20 ms in the neocortex and 0.1-5 ms in the hippocampus) for each dendrite.

Can you really say overfitting is small?

In order to demonstrate that the precise match between the predicted and the actual dendritic nonlinearities is not due to overfitting we performed model selection using Bayesian information criterion that penalises more complex models (see also Methods). We found that the extra flexibility of the neocortical and hippocampal models offered by the one additional parameter compared to the independent model was highly unlikely to account for the superior performance of the more complex models. Moreover, the model with second order correlations had the same number of free parameters as the neocortical and hippocampal models, yet it was unable to fit the data even remotely as well. These results indicate that the good performance of the cortical models is not simply due to the larger number of parameters.

To further demonstrate that overfitting is small, we now also conducted a set of validation simulations. In these simulations we first generated synthetic data matching the data recorded in our hippocampal and neocortical experiments (see below), and fitted this synthetic dataset following the same procedure that we used for fitting real experimental data in the paper, using hippocampal and neocortical input statistics. We expected that if the risk of overfitting were high, then we would be able to fit our simulated hippocampal dataset with neocortical parameters just as well as with hippocampal parameters and vice versa.

Specifically, to generate synthetic data matching the data recorded in our hippocampal experiments, we computed the optimal response for a stimulus (i.e. 30 input spikes with various delays) assuming general presynaptic statistics similar to our hippocampal model (Table 1, HP). We added 20% variability to the presynaptic firing rate and assembly activation rate to simulate a possible mismatch between the true population activity statistics and the statistics used for estimating the optimal response. This added variability within the hippocampal statistics that was small compared to the differences between hippocampal and neocortical population activity statistics. The postsynaptic parameters were sampled randomly from a uniform distribution in the realistic range (Figure 5—figure supplement 1B, C). This protocol provided us with a set of traces very similar to the original experimental data recorded from hippocampal CA3 cells. We followed the same protocol, using neocortical statistics (Table 1, NC), to generate synthetic data similar to our neocortical recordings.

Using these datasets we investigated the accuracy of our fitting procedure with both hippocampal and neocortical presynaptic statistics. We found that when the presynaptic statistics used for fitting matched the presynaptic statistics used to generate the data the fitting error was almost zero (Author response image 1A) and the estimated postsynaptic parameters were similar to the original postsynaptic parameters used to generate the data (Author response image 1B). (Note that fitting error was much smaller than with actual experimental data because synthetic data lacked intrinsic variability between responses to the same stimulus.) The non-zero error between the original and the estimated postsynaptic parameters is due to the variability of the presynaptic parameters within a given dataset that could be compensated by altering the postsynaptic parameters. However, the postsynaptic parameters could not compensate for differences in the presynaptic parameters as large as between hippocampal and neocortical statistics, resulting in substantially larger fitting errors (Author response image 1A) and greater differences between the original and fitted parameters (Author response image 1B). Taken together, these simulations suggest that the stimulation protocol used in our experiments is sufficiently complex to identify the postsynaptic parameters without the danger of overfitting.

Author response image 1. Results of the validation simulations.

Author response image 1.

(A) The fitting error was substantially smaller if we used similar presynaptic parameters to fit the validation data than the true parameters used to generate the data. In this case the fitting error was essentially zero since there is no intrinsic variability in the artificial data. (B) The error in the postsynaptic parameters (number of cells targeting the dendritic branch, and the postsynaptic filtering time constant) is substantially smaller when the true presynaptic parameters match those used for fitting the model (blue) than when the parameters are mismatched.

DOI: http://dx.doi.org/10.7554/eLife.10056.021

Why exactly the multi-state description in Eq. 11 and 19 if it is never used? Personally, it is only when I saw Eq. S67 that I finally reached the proper mathematical intuitions. Other than the need for simplicity, I am concerned that some of the conclusions would not hold in the multi-state scenario suggested by Eq. 11.

We simplified the equations in the main text and only use the two state description in the main text and moved the more general, multi-state equations into the mathematical appendix.

Figure 1: 1B is argued to offer a better fit to optimal than 1C. But there are two differences, correlation and scale. Could it be that in the uncorrelated case, the fit to optimal is not as good when the potential fluctuates on greater amplitudes?

In this figure we illustrated a case in which the nonlinearity of the optimal postsynaptic response depends only on the correlations between the presynaptic neurons and not their individual activity dynamics or spiking. Therefore we chose the same parameters for the individual presynaptic neurons in the two panels - in fact, the only difference between C and D (note, that the numbering of the panels has changed in the new version of the manuscript) is that all neurons in D are switching to the active state simultaneously while they are independent in C. To illustrate this point we added the membrane potential of one example presynaptic neuron to each panel, on the same scale.

The large difference between the scale of the postsynaptic response is due to the fact that individual fluctuations average out when neurons in the population are independent, while they do not if they are correlated. To control for the differences in the scale of the fluctuations we normalise the error between the actual and predicted responses by the variance of the true signal, whenever we quantify errors throughout the paper. Moreover, as we show in the Methods section, when presynaptic neurons are independent and the computation is linear (as we assume in panel B), then the optimal response is literally the linear combination of the responses to the spike trains of the individual neurons.

In fact, I don't understand why the uncorrelated case should not have a saturating dendritic nonlinearity to counteract the exponential spiking nonlinearity."

We argue here that optimal dendritic integration of spikes from different presynaptic neurons is linear if the presynaptic neurons are independent. We are not saying that the optimal response to the spikes from the same presynaptic neuron would be linear in this case. In fact, as the reviewer suggests, spikes from the same neuron should still be integrated in a nonlinear way even if presynaptic neurons are independent (see Pfister et al., 2010). In the Discussion we suggest that the nonlinearities that are private to the individual inputs should be implemented by nonlinear synapses (short term plasticity) instead of nonlinear dendrites. We have now made this point clear in the caption of Figure 1.

Reviewer #3:

The authors present a detailed mathematical analysis, computations and experimental results aimed at supporting the conclusion that dendritic nonlinearities used to integrate synaptic inputs in a neuron are optimized based upon the statistics of that neuron's presynaptic connections. In particular, the suggested result is that if a neuron's presynaptic connections are uncorrelated, sublinear responses are optimal. If, however, the firing of presynaptic neurons are correlated, supralinear responses are optimal for that neuron. If correct, I would find this conclusion to be interesting.

This is probably also what the reviewer meant, but just to be precise: presynaptic correlations imply nonlinear responses, which can be sublinear or supralinear depending on the nature of those correlations, see e.g. Figure 2.

One of the main difficulties with the manuscript is that the details are not easily approachable. There are many pages of complicated mathematics that one must work through, and in many places there hasn't been much attempt to try to explain things in simpler terms to the general reader – I suspect that one will not be able to follow the details here unless one is already an expert.

We simplified the math presented in the main text and provided an overview of the main concepts in Figure 1B. Our goal was to present the most important concepts of the paper in the main text with only the minimal amount of mathematical details necessary and by illustrating the concepts graphically. Therefore we moved the mathematical details necessary for reproducing the results to the Methods section. Additional details about the mathematical derivations are included in the mathematical appendix.

While the major focus on presynaptic activity is in terms of spikes (and I will grant that a good explanation for this is given), I think it's a concern is that the focus on postsynaptic activity is in terms of subthreshold responses. Granted, the authors say that they can also perform their calculation if the firing rate is substituted for the subthreshold response. This seems like an inconsistency, however – if two or more neurons share synaptic inputs from overlapping sets of presynaptic neurons, and the correlations in the spiking patterns of the presynaptic neurons matter, why is it that correlations in the output spikes of those two or more neurons don't matter? This would seem to be a basic internal inconsistency in the manuscript.

We agree with the reviewer that mathematical consistency is an important requirement for any theory. According to our theory, correlations between the activities of any groups of cells presynaptic to a given neuron must be handled by the postsynaptic neuron. In the example brought up by the reviewer, correlations between the neurons in the input neurons determine input integration in the output neurons, whereas correlations in the two output neurons will influence integration in neurons reading them. Thus, correlations in the output spikes of the two neurons DO matter, but only for cells that receive inputs from those particular two neurons. More generally, note that our formalism rests on the very objective of making a self-consistent (purely membrane potential-based, or, equivalently, purely firing rate-based) description of network dynamics work when neurons communicate with spikes.

In addition, while there is a huge amount of detail associated with the mathematical calculations, I found the details associated with the experiments to be a bit too sparse. It would have been useful, for example, to see an image of the neocortical pyramidal cell and the specific synapses which are being stimulated.

We added the image of the neurons we recorded from in Figure 5 and provided details about the experimental protocols used in this study in a new paragraph in the Methods section.

[Editors' note: the author responses to the re-review follow.]

All of the reviewers are strongly positive about this work but they also want to ensure that the work is accessible, understandable and widely appreciated. They would like all aspects be made as clear as possible. Thus, pending the following revisions, we expect to accept the paper. Please respond directly to the revisions on a point-by-point basis, revising the paper where needed.

1) Double check all equations in the mathematical appendixthere was some concern given potential sign errors in other equations (see below). Additional explanatory guidance through the mathematical steps should be provided for those readers who may be interested in following these details. The math is still very complicated.

We carefully checked the equations in the mathematical appendix and included additional explanatory text to all important steps.

2) Check and correct potential inconsistencies related to the description of the function s(t). In subsection “Presynaptic statistics” it is still stated that in the limit in which the time step goes to zero the discrete version becomes binary, i.e., zero or 1. In subsection “Inference and the optimal response”, however it's stated that s(t) is a sum of Dirac-delta functions. The two descriptions can't be consistent, of course, unless there is a division by a time step somewhere. There does not appear to be a careful definition in the manuscript, however.

We clarified the difference between the discrete and continuous versions of the presynaptic spike trains, st and s(t) in the Methods section (below Eq. 22.) and provided a formal derivation of s(t) from st in the mathematical appendix (Eq. A62).

3) Check the solution of equation (22) given in (23) and (24) for potential sign errors.

We are grateful to the reviewers for pointing out this sign error. We corrected the sign error in Eq. 22 (now Eq. 27) and in the definition of the constant A in the mathematical appendix (B.1).

4) The additional details about the experiments were appreciated. However, a question arose about the blocking of NMDA channels. It's stated that substantially poorer fits were obtained for this case. Please confirm/clarify whether the postsynaptic filtering parameters were re-fit in this case?

Yes, we confirm that the postsynaptic filtering parameters were re-fit in this case. We clarified this in the text.

5) It seems like a bit of an inconsistency that while a big point is made that presynaptic neurons must communicate with spikes, no spiking is considered in the postsynaptic neurons. If spiking is the only way that neurons can communicate (the authors' point), why should one care about the subthreshold response of the postsynaptic neuron? Since so much of the manuscript covers the subthreshold response, something explicit (explanation-wise) should be said about this in the manuscript (and we realize that they consider the case of firing rates).

We added a sentence clarifying this issue in the Discussion (section about analog communication) by explaining that our theory is fully self-consistent as we assume that the relevant mapping is between the membrane potentials of the presynaptic and the postsynaptic neurons. This implies that the spikes of the neuron called postsynaptic in the manuscript will influence input integration of its own postsynaptic partners, and the form of the integration depends on the statistical dependences in the new presynaptic population.

6) Include a 'toy model' explanation (without math, or if possible, with straightforward enough math equations that most can follow) that summarizes the link between pairwise correlation (temporal) and dendritic integration (spatial) before any specific statistical models are discussed. This would help the reader follow the more detailed case, and avoid confusion in consideration of later statements in the paper. A good place for this might be at the beginning of the Results section.

We were somewhat unsure as to what the Reviewers exactly had in mind here, so we added a simple toy explanation of the main idea of the paper from everyday life at the beginning of the Results section. We are happy to omit this paragraph if the reviewers/editor deem it unnecessary.

Moreover, our original intention with Figure 1C-D was precisely to show a simple, toy-model example where the requirements for linear versus nonlinear integration become evident for the reader without the need for diving into the mathematical details, as we introduce the specific mathematical model in the next section. Nevertheless, we found it impossible to do this before at least the main mathematical concepts (computation vs. implementation) are introduced at the beginning of the Results section.

We also added a short conclusion to summarise the link between input correlation and dendritic integration at the end of this section.

Specifically, the way the text reads is confusing about when spatial correlations (for lack of a better term) between different presynaptic neurons are being discussed, and when temporal correlations between inputs are being considered. Part of the issue may be that the mathematical formulation in the main text completely hides issues of temporal correlation in the details. Thus one can be left wondering if one of the two types of correlations might be more important and if the theory could shed light on this.

We added a new paragraph in the Methods (presynaptic statistics) explaining the source of spatial and temporal correlations. We also clarified that under correlations in the manuscript we always mean (cross-)correlations between the spike trains of different presynaptic neurons (spatial correlations). When we refer to the correlations between the firing rates of the same neuron at a particular time delay (temporal correlations) we use the term auto-correlation. We chose this terminology, rather than explicitly stating in every instance whether its spatial or temporal correlations, so as not to overburden the manuscript with too many technical terms while still keeping its terminology consistent.

For example, consider these comments from the manuscript: "However, if presynaptic neurons became correlated the optimal response became nonlinear and the best linear response was unable to accurately follow the fluctuations in the input (Figure 1D)." "Same as (A-C) but for presynaptic neurons exhibiting synchronized switches between a quiescent and an active state (D, bottom), introducing higher order correlations between the neurons." (Figure 2 caption) "In particular, sublinear integration was optimal when presynaptic activities exhibited simple Gaussian random walks and thus they could not contain statistical dependencies beyond second order correlations (Figure 2A-C)." The authors here are trying to provide an intuitive explanation as to why input correlations lead to the observed behavior and this, of course, is good and to be encouraged. In each case, however, what type of correlation is being thought of – spatial or temporal? Further elaboration might provide additional insight.

In all of these cases we were referring to statistical dependencies between different presynaptic neurons (spatial correlations in the reviewer’s terminology). This is clarified in the Methods section, where we included the following sentence: “In this paper we informally refer to any statistical dependency (second or higher order) between the spike trains of different neurons simply as ‘correlation’.”

[Editors' note: further revisions were requested prior to acceptance, as described below.]

The reviewers appreciated the revisions done by the authors, but it was felt that the intuition as provided was incomplete and the mathematical appendix was still not clearly presented. Specifically, it was noted that is unclear how that authors distinguish autocorrelation from cross-correlation (spatial) in the formalism and the text, and as this is a main point of the paper, it should be up front early on in the Discussion (and not just in caption of Figure 1 and later in Methods).

We now tried to clarify which type of correlations we are referring to wherever it was not absolutely clear from the context. We also added text to define auto- and cross-correlations, and the requirements for nonlinear postsynaptic input integration they respectively imply, at the beginning of the Results. We also explain that only correlations between neurons (i.e. spatial correlations) require nonlinear dendritic integration. We also rewrote parts of the Methods to further clarify this in technical detail. As we write there:

“Also note that temporal and spatial correlations can not be studied in complete isolation […] that we did not consider this case worth pursuing.”

Also, eq A91 has an exact solution (separable).

Yes, we do realise it, the results in the Methods were in fact based on this insight. We now provide the detailed derivations for this solution in the Appendix B.2 (Eqs. A92-96).

In essence, the math appendix should only include what is needed to explain the methodsanything that is more extensive than needed should be either removed or simplified.

We simplified the mathematical appendix, and in particular reduced the size of Appendix B.

1) The manuscript still has trouble delineating spatial vs temporal correlations when it tries to explain different points. For example:

Introduction section: "We pursued this principle in understanding the integrative properties of individual cortical neurons, for which the relevant statistical input patterns are those characterising the dynamically evolving spike trains of their presynaptic partners." This sounds like the manuscript is concerned with temporal issues.

We changed the text to “the relevant statistical input patterns are those characterising the spatio-temporal dynamics of their presynaptic spike trains” to highlight the importance of the correlations between the presynaptic population.

"The optimal response determined by this statistical model, for essentially any setting of parameters, was inherently nonlinear because the additional effect of a presynaptic spike depended on the pattern of spikes that had been previously received from the presynaptic population and thus the integrated effect of multiple spikes could not be computed as a simple linear sum of their individual effects in isolation." This also sounds to me as if a claim that the temporal pattern of inputs is what's leading to nonlinearity.

We changed the text to include the following, more explicit explanation:

“The optimal response determined by this statistical model, for essentially any setting of parameters, was inherently nonlinear because the additional effect of a presynaptic spike depended on the pattern of spikes that had been previously received from the presynaptic population. […] Thus, the integrated effect of multiple spikes could not be computed as a simple linear sum of their individual effects in isolation.”

We hope this clarifies the respective roles of temporal and spatial correlations. We note that, technically, spatial correlations only require the nonlinear integration of non-coincident spikes from different cells (and all spikes are non-coincident with probability 1 in continuous time) if there are also temporal correlations in the presynaptic population. Nevertheless, as we argue above (and in the text), it is meaningless to consider spatial correlations without temporal correlations in the context of neural signals, so we hope the reviewers agree with us that the above statement (that we included in the text) is sufficiently accurate.

2) Regarding the errors in the mathematics, we thought that when the authors repaired the solution of Equation 27 they would also fix other instances of this equation in the manuscript. Equation 27 is recapitulated in the mathematical appendix as Equation A91, but the discussion around Equation A91 is clearly wrong. Somehow a different solution from the solution of Equation 27 is obtained; the answer should be equation A92 with Α= 0,Β= B and C = −A. Here’s how the solution goes: we start with

ζ˙=ζ(1ζ)[Α+Βs(t)].

This equation is separable:

dζζ(1ζ)=[A+Bs(t)]dt.

The left-hand size can be simplified by partial fractions:

1ζ(1ζ)=1ζ+11ζ,

so that

[1ζ+11ζ]dζ=[A+Bs(t)]dt.

Both sides of the equation can now be integrated. Assuming 0 < ζ < 1, we get

lnζln(1ζ)=[A+Bs(t)]dt.

or equivalently,

lnζln(1ζ)=v(t)

where

v˙=A+Bs(t).

Then solving

ζ1ζ=evζ=ev1+ev+11+ev.

We thank the reviewers for providing a detailed derivation of the solution of our Eq. 27. As we noted above, our own solution in the Methods was in fact based on the same logic. We now included our detailed derivations (which we believe is essentially identical to that of the reviewers) in Appendix B.2 (Eqs. A92-96).

We believe that the reviewers meant that the discussion after A91, i.e. Section B.2.2 in the previous version (and therefore B.3.2, which had essentially the same logic) was wrong. We maintain that while the derivation we presented there might have seemed unnecessarily complicated, it was not wrong – though some ends were undeniably left slightly loose, and so we now enclose with this response a more detailed version to show that indeed it leads to the same results as the derivation favoured by the reviewers. The reason why we originally included this more complicated derivation for supralinear integration (Appendix B.2) was that we built on parts of its logic (the separation of time bins with or without a spike) in the derivation for the sublinear case (Appendix B.3) to which the more straightforward method could not be applied (as the relevant differential equation is not separable). Nevertheless, we also managed to rewrite that part in a simpler way, so that it stands alone without all the machinery of the original derivation, so we could dispose of our original derivation altogether and make the whole Appendix self consistent using only the simpler derivation (reducing the size of the B appendix).

We don’t know what the authors are trying to do with the rest of Section B.2.2, but it appears to be wrong. We also don’t know to what extent fixing this will affect the rest of the results.

See above for what we were trying to achieve and why we don’t think it was wrong. The new, simpler derivation has made the algebra and the logic easier to follow (we hope) but has not changed any of our results.


Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd

RESOURCES