SUMMARY
Decision-making in dynamic environments often involves accumulation of evidence, in which new information is used to update beliefs and select future actions. Using in vivo cellular resolution imaging in voluntarily head restrained rats, we examined the responses of neurons in frontal and parietal cortices during a pulse-based accumulation of evidence task. Neurons exhibited activity that predicted the animal’s upcoming choice, previous choice and graded responses that reflected the strength of the accumulated evidence. The pulsatile nature of the stimuli enabled characterization of the responses of neurons to a single quantum (pulse) of evidence. Across the population individual neurons displayed extensive heterogeneity in the dynamics of responses to pulses. The diversity of responses was sufficiently rich to form a temporal basis for accumulated evidence estimated from a latent variable model. These results suggest that heterogeneous, often transient sensory responses distributed across the fronto-parietal cortex may support working memory on behavioral timescales.
INTRODUCTION
Perceptual decision-making often requires integration of sensory information over time. Behaviorally, this is frequently modeled as an accumulation process, in which a subject’s perceptual judgment is described as a latent variable evolving under the influence of sensory events, comparable to the movement of particles in drift diffusion processes (Forstmann et al., 2016). Over the last several decades, this accumulation process has been investigated using electrophysiological recordings from single neurons in primate and rodent cortex. Continuous presentation of noisy sensory evidence produces ramping activity in neurons in posterior parietal cortex (PPC; Roitman and Shadlen, 2002) and frontal eye field (FEF; Kim and Shadlen, 1999; Figure 1A), while brief pulses of sensory evidence produce sustained firing rate increases in PPC (Hanks et al., 2015; Huk and Shadlen, 2005; Figure 1A). These findings have led to the hypothesis that the brain represents the accumulated sensory evidence in forebrain neuronal activity.
Recently, the frontal orienting field (FOF) and PPC have been implicated in accumulation of evidence and memory-guided decision making in rats. FOF and rat PPC share similar thalamocortical and corticothalamic projections with primate FEF and PPC respectively (Erlich et al., 2011; Whitlock et al., 2012) and inactivation of FOF causes disruptions in accumulation of evidence and memory-guided orienting tasks, but not in cued sensory-motor responses (Erlich et al., 2011; Erlich et al., 2015; Hanks et al., 2015; Kopec et al., 2015). Tetrode recordings in these regions revealed neurons tuned to the strength of evidence in a pulse-based auditory accumulation task (Hanks et al., 2015).
Most previous electrophysiological studies of fronto-parietal cortex during accumulation of evidence have focused on how firing rates of individual neurons correlate with the magnitude of sensory evidence in favor of a particular choice as predicted by an accumulator model. Typically, the responses of different neurons are analyzed as amplitude-scaled versions of a prototype temporal waveform representing the latent variable of the accumulator, implying relatively homogeneous dynamics across the neural population. However, theoretical work has suggested that neural networks could also encode the memory of sensory events using a population code with heterogeneous dynamics. For example, Goldman (2009) demonstrated that networks of simulated neurons with complex, temporally heterogeneous responses of increasing response width can encode memories of transient events in their combined activity. Based on two-photon imaging of calcium dynamics in mouse PPC during a memory-guided virtual navigation task, Harvey et al. (2012) suggested that sequential activation of neurons following transient sensory events could encode the memory of reward location. More recently, Morcos and Harvey (2016) have shown that more complex state transitions in PPC population activity can encode both the upcoming and past choices of the animal. However, the specific nature of the encoding in each state and the responses to individual pulses of evidence in the accumulation process were not evaluated. Each of the above models makes a distinct prediction about individual neural response functions following sensory events during accumulation (Figure 1C, B respectively). Whether such heterogeneous responses could also underlie an accumulation process has not been explored. Motivated by these conceptual ideas, we examined neural responses to single pulses of evidence.
We used cellular resolution imaging in behaving rats to record population dynamics in FOF and PPC during a visual pulse-based accumulation of evidence task (Scott et al., 2015). In this task, rats viewed two streams of light pulses, presented from a left and a right LED and were rewarded for orienting to the side with more pulses. Voluntary head restraint (Scott et al., 2013) provided the stability necessary for two-photon imaging of GCaMP6f-labeled neurons in PPC and FOF. Analysis of calcium dependent fluorescence transients in layer 2/3 neurons revealed activity that predicted the animal’s subsequent choice, the animal’s choice on the previous trial and graded responses reflecting the strength of accumulated evidence. Neurons responded, on average, to each evidence pulse with a sustained increase in activity. This increase mirrored the timescale of behavioral memory of each pulse as predicted by drift diffusion behavioral models (Figure S1; Brunton et al., 2013; Scott et al., 2015). However, analysis of the responses of individual neurons to pulses of evidence revealed extensive temporal heterogeneity across the population. Re-analysis of electrophysiological recordings from an auditory version of the task (Hanks et al., 2015) also revealed heterogeneous, often transient spiking responses of neurons to individual pulses of evidence, confirming the generality of this finding. These results are consistent with a model of evidence accumulation in which the combined activity of neurons with diverse time-varying responses represents the memory of sensory events.
RESULTS
Cellular resolution imaging during accumulation of evidence in voluntarily head restrained rats
We trained rats to perform an accumulation of evidence task during voluntary head-restraint. Seven adult male rats were surgically implanted with custom titanium headplates. After recovery, rats were trained to perform a two-alternative forced choice task for water reward. The details of this task have been previously described (Scott et al., 2015). Rats initiated a behavioral trial by sliding their headplate into a custom headport at one end of a behavioral chamber (Scott et al., 2013) (Figure 2A). Once inserted, the headplate activated a kinematic clamp that registered and stabilized the position of the plate and head. Rats were required to maintain head fixation for 2–3 seconds, until an auditory cue instructed them to remove their head from the headport and orient to a side reward port. During the head restraint period rats were presented with a series of pseudorandomly timed light pulses from a left and right LED (Figure 2B, 2C). Pulses on the same side were separated by a minimum of 240ms to prevent the perceptual fusion of consecutive flashes, leading to a maximum of 6 flashes on each side. Pulses on opposite sides could occur simultaneously, leading to a possible maximum of 12 total flashes. Following release from restraint, rats received a water reward (25–75 µl) for orienting to the side with more flashes.
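The stimulus constraints described above can be illustrated with a minimal sketch. It assumes a 1.5 s cue period and that tied trials are redrawn; neither detail is given in this section, and the function names are hypothetical. Only the 240 ms minimum separation, the per-side maximum of 6 flashes and the reward rule come from the text.

```python
import random

MIN_GAP_S = 0.240      # minimum same-side separation stated in the text
MAX_PULSES = 6         # per-side maximum stated in the text
CUE_PERIOD_S = 1.5     # ASSUMPTION: cue duration is not given in this section


def sample_pulse_times(n_pulses, rng, cue_period=CUE_PERIOD_S, min_gap=MIN_GAP_S):
    """Return n_pulses sorted times in [0, cue_period], at least min_gap apart."""
    if n_pulses == 0:
        return []
    slack = cue_period - (n_pulses - 1) * min_gap
    if slack < 0:
        raise ValueError("too many pulses for this cue period")
    # Draw sorted offsets within the slack, then push pulse i later by i * min_gap.
    offsets = sorted(rng.uniform(0.0, slack) for _ in range(n_pulses))
    return [offset + i * min_gap for i, offset in enumerate(offsets)]


def make_trial(rng):
    """Generate one trial: left/right pulse times and the rewarded side."""
    n_left = rng.randint(0, MAX_PULSES)
    n_right = rng.randint(0, MAX_PULSES)
    while n_right == n_left:          # ASSUMPTION: ties are simply redrawn
        n_right = rng.randint(0, MAX_PULSES)
    return {
        "left": sample_pulse_times(n_left, rng),
        "right": sample_pulse_times(n_right, rng),
        "rewarded_side": "right" if n_right > n_left else "left",
    }


trial = make_trial(random.Random(0))
```

Left and right trains are drawn independently, so pulses on opposite sides may coincide, as in the task.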
Semi-automated training facilitated the generation of a behavioral data set comprising 13,438 trials from 7 animals, which in turn enabled statistical characterization of the decision-making process (Figure 2D - 2F, S1). We previously determined that subjects were sensitive to a difference of a single pulse and accumulated all pulses (Scott et al., 2015). Behavioral modeling suggested that the memory of each flash exhibited long time constants, on the order of hundreds of milliseconds to seconds (Figures 2F, S1; Scott et al., 2015).
Once trained to perform accumulation during head restraint (see Scott et al., 2015 for shaping procedures), rats underwent a second surgery in which an optical window was implanted over either FOF (+2AP, 1.5 ML) or PPC (-3.8AP, 2.5 ML) (Figure 3A). We also performed imaging in medial V2 (mV2, -6.0AP, 2.0 ML) in two rats. mV2 is known to exhibit transient elevated firing rates in response to visual stimuli (Espinoza and Thomas, 1983), and is strongly inter-connected with PPC (Kolb and Walkey, 1987; Reep et al., 1994; Wilber et al., 2014). Neuronal expression of the genetically encoded calcium sensor GCaMP6f was accomplished using an adeno-associated viral vector (see Methods).
Following recovery from the optical window surgery and expression of the GCaMP6f indicator we performed cellular resolution imaging of calcium concentration dynamics via two-photon laser-scanning microscopy (Figure 3B-C). The laser shutter was synchronized to the animal’s behavior, opening at trial initiation and closing during the inter-trial interval, limiting laser illumination to the brain region under investigation. Activation of the kinematic clamp, synchronized with the head-fixation period, provided the stability necessary for cellular resolution imaging and registered the position of the headplate and head to within a few microns. This registration enabled imaging of the same field of view on subsequent head insertions, so that the same neurons could be imaged across trials in an imaging session (typically 100–200 trials).
Using this approach, we recorded from 500 (81 in mV2, 151 in PPC, 268 in FOF) morphologically identified neurons 150 to 200 µm below the cortical surface while rats (n=7) performed the visual accumulation of evidence task (Figure 3D-3F). On average, 10.5 (range = 2 to 28) GCaMP6f positive neurons could be visually detected per field of view. Overall, 21% of morphologically identified cells exhibited significantly modulated activity during the task (see Methods). The percentage of active cells was not significantly different between PPC (25%, 38/151) and FOF (23%, 62/268), but both were significantly greater than mV2 (7%, 6/81, binomial test p<0.05; Figure 3D). Across the population, neurons exhibited complex, time-varying responses which peaked at different times within the imaging period (Figure 3E,F). The number of active cells we observed was low compared with previous experiments in which neurons in PPC were imaged in head-fixed mice (for example 47% in Harvey et al., 2012). This difference may reflect the fact that imaging was restricted to behavioral trials (during voluntary head-restraint). Neurons were not recorded during inter-trial intervals, limiting the window in which activity could be observed.
We evaluated how neural populations in FOF and PPC encode choice and other task relevant parameters. We used a support vector machine (SVM)-based decoding approach to train a classifier to decode the upcoming choice of the animal based on simultaneously-recorded cells in PPC and FOF (Figure 3G). Decoders based on FOF neurons strongly predicted the upcoming choice of the animal, while population decoders based on PPC neurons did not perform well. We also examined how well each area encoded accumulated evidence (# right flashes - # left flashes) using a linear regression based approach (see Methods). Decoders for accumulated evidence based on the activity of FOF and PPC performed significantly better than chance and improved with increasing numbers of neurons.
Population dynamics in FOF predict upcoming choice
Previous tetrode recordings identified neurons in PPC and FOF, termed ‘side-selective neurons,’ whose firing rates were significantly different for left versus right choice trials (Figure 4A, 4B; Erlich et al., 2011; Hanks et al., 2015). Imaging from the superficial layers of FOF and PPC also revealed side-selectivity (Figure 4, S2). Consistent with previous reports (Erlich et al., 2011; Hanks et al., 2015), neurons with preferences for ipsilateral and contralateral movements were intermixed in a single hemisphere. Most side-selective neurons exhibited ramp-like increases in calcium responses throughout the cue and delay periods (Figure 4C,D). In these neurons, the slope of the ramp was different on left and right choice trials; the ramps began to separate during the cue period and continued to diverge throughout the trial until release from voluntary restraint. The percentage of side-selective neurons detected during the delay period was 16% (6/38) in PPC and 37% (23/62) in FOF, similar to the percentages observed in superficial recordings with tetrodes during an auditory accumulation of evidence task (Figure 4F). A smaller population of side-selective neurons that ramped down during the cue period was also observed in the calcium imaging experiments (Figure 4D).
To determine whether neurons that were not identified as side-selective collectively contain information about the upcoming choice, we trained a linear decoder to predict choice based on the activity of groups of neurons that were not side-selective (Figure S2E,F). Decoders based on individual non-side-selective neurons in FOF and PPC failed to predict the animal’s choice above chance (p>0.01). However, larger groups of simultaneously recorded neurons in both PPC and FOF that were not side-selective could be used to decode the upcoming choice of the animal (p<0.01).
Sensory evidence modulates activity in PPC and FOF
We next asked how the dynamics of individual PPC and FOF neurons were modulated by each pulse of sensory evidence (Figure 5). By averaging the activity of neurons aligned to the timing of the flashes (pulse-triggered average), we found that neurons exhibited transient responses to individual pulses of evidence, with varied dynamics (Figure 5A-C). Consistent with exhibiting sensory responses, individual neurons responded earlier on trials with earlier pulses and later on trials when pulses occurred later (Figure 5D,5E).
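As an illustration of the averaging step, a pulse-triggered average can be sketched as follows. This is a toy example on a made-up ΔF/F trace; the function name, the window length and the handling of pulses near the end of the trace are assumptions and not the analysis code used in the study.

```python
def pulse_triggered_average(trace, pulse_frames, window):
    """Average `window` frames of `trace` starting at each pulse frame.

    Pulses too close to the end of the trace to fill the window are skipped
    (ASSUMPTION: the source does not say how such pulses were handled).
    """
    snippets = [trace[f:f + window] for f in pulse_frames if f + window <= len(trace)]
    if not snippets:
        raise ValueError("no pulse leaves room for a full window")
    # zip(*snippets) walks the snippets frame by frame relative to the pulse.
    return [sum(frame_values) / len(snippets) for frame_values in zip(*snippets)]


# Toy trace: an identical transient follows pulses at frames 1 and 6.
toy_trace = [0, 0, 1, 2, 1, 0, 0, 1, 2, 1, 0, 0]
pta = pulse_triggered_average(toy_trace, pulse_frames=[1, 6, 10], window=4)
```

In this toy trace the two usable pulses evoke the same transient, so the average simply reproduces it; the pulse at frame 10 is dropped because fewer than four frames follow it.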
Next we examined the effect of sensory evidence on the magnitude of neural responses in PPC and FOF. We performed linear regression between evidence at the end of the trial and the average response of the neuron during the delay period, which was at least 740 ms, but often longer depending on when the last flash occurred (Figure 5F). Evidence was defined as either the difference between the number of right and left flashes (#R-#L), the total number of right flashes (#R) or the total number of left flashes (#L). A cell was defined as “evidence-tuned” if the slope of the line that represented the best fit linear relationship between evidence and delay period activity was significantly different from zero and the correlation coefficient (r²) was greater than 0.5 (Figure 5F; Figure S3). Thirty-nine percent (15/38) of active PPC neurons and 24% (15/62) of active FOF neurons had significant evidence tuning. These results demonstrate that layer 2/3 neurons in rat frontoparietal cortex reflect the upcoming choice of the animal with dynamics that are modulated by the magnitude and timing of sensory evidence.
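The evidence-tuning criterion can be sketched in a few lines. The r² > 0.5 threshold is taken from the text; the permutation test used here for the slope, the 0.05 significance level and all function names are assumptions standing in for the test described in Methods.

```python
import random


def linear_fit(x, y):
    """Ordinary least-squares slope, intercept and r^2 for paired samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    r2 = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, my - slope * mx, r2


def slope_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the slope (ASSUMED stand-in test)."""
    rng = random.Random(seed)
    observed = abs(linear_fit(x, y)[0])
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)          # break the evidence/activity pairing
        if abs(linear_fit(x, shuffled)[0]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def is_evidence_tuned(evidence, delay_activity, alpha=0.05, r2_min=0.5):
    """Significant slope (alpha is an ASSUMPTION) and r^2 above 0.5 (from the text)."""
    _, _, r2 = linear_fit(evidence, delay_activity)
    return r2 > r2_min and slope_p_value(evidence, delay_activity) < alpha
```

Here `evidence` could be #R-#L, #R or #L for each trial, and `delay_activity` the mean delay-period response on the same trials.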
Two types of networks, single-accumulator and two-accumulator networks, have been proposed to implement neural accumulation in decision-making tasks (Figure 5H). In single accumulator-based models, subjects would maintain a memory of the difference in pulses, while in two accumulator-based models, subjects would accumulate left and right pulses independently. To discriminate between these models, we repeated our regression analysis (shown in Figure 5F) using a model in which neurons encoded only the #R or #L pulses and compared the goodness-of-fit of both models to the data from each neuron (Figure 5I). The delay period activity of most neurons was better predicted by the number of flashes on a single side, rather than by the difference in the number of flashes (PPC: 63% (24/38) for #L, 13% (5/38) for #R, and 24% (9/38) for #R-#L; FOF: 27% (17/62) for #L, 37% (23/62) for #R, and 34% (21/62) for #R-#L; 1/62 was not best fit by a single model; Figure 5I). These data demonstrate the existence of a population of neurons that accumulate left and right evidence independently; however, the behavioral relevance of these neurons remains to be demonstrated.
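The per-neuron model comparison can be illustrated with a short sketch that fits delay-period activity against each candidate regressor and keeps the best fit. Using r² as the goodness-of-fit measure is an assumption, as are the function names; the three candidate regressors are those named in the text.

```python
def r_squared(x, y):
    """Squared Pearson correlation between a regressor and the response."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy) if sxx > 0 and syy > 0 else 0.0


def best_evidence_model(n_right, n_left, delay_activity):
    """Return the best-fitting regressor name and the r^2 of every candidate."""
    candidates = {
        "#R": n_right,
        "#L": n_left,
        "#R-#L": [r - l for r, l in zip(n_right, n_left)],
    }
    scores = {name: r_squared(x, delay_activity) for name, x in candidates.items()}
    return max(scores, key=scores.get), scores


# Toy neuron whose delay activity tracks only the number of left flashes.
pairs = [(r, l) for r in range(4) for l in range(4)]
n_right = [r for r, _ in pairs]
n_left = [l for _, l in pairs]
best, scores = best_evidence_model(n_right, n_left, [0.2 * l for l in n_left])
```

For this toy neuron the difference regressor still explains part of the variance, which is why the comparison is made between fits rather than by testing each regressor in isolation.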
Heterogeneous dynamics across individual FOF and PPC neurons in response to pulses of evidence
Next we exploited the pulsatile nature of the stimuli to determine the temporal profile of individual neural responses to single pulses of sensory evidence. In our previous analysis (e.g. Figure 5B,C), pulse-triggered responses were evaluated by averaging the ΔF/F aligned to the time of the pulse. However, since multiple left and right pulses could be presented in sequence, the response of a neuron on each trial likely reflects a combination of responses to multiple flashes. Moreover, our previous analysis revealed that many neurons exhibit responses that depend on the upcoming choice. Therefore, since the response of a neuron at each point in time is a complex combination of multiple task related variables which are themselves correlated in time, a simple pulse triggered average would be contaminated by responses to other events.
We implemented multiple linear regression to estimate each neuron’s average pulse-triggered response, independent of subsequent pulses and upcoming choice (Figure 6A, S4). We assumed that the response of an individual neuron, at each time point, could be described by the linear function:
ŷ = Xβ + ε
where ŷ is the response of the neuron at each time point, X is a matrix of regressors for task events, β are the weights of each regressor and ε is a Gaussian noise term. The task events we parameterized were the timing of the light pulses (left and right) and the time within the trial for either left or right choice trials. In order to capture potentially complex temporal dynamics, each of the four task events were parameterized by a set of regressors that formed a temporal sequence, each at a different lag from the time of the event. Weights were calculated using ridge regression. Once fit to the data, the sets of weights corresponding to each task event represent the average response of each neuron to that particular event. The regression model was evaluated by comparing data and model predictions for cross-validated test sets (see Methods). We subsequently focus on the regression weights corresponding to the different task events (e.g., pulses, choice), which we refer to as kernels (Figure S4, S5).
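The lagged-regressor construction and ridge fit can be sketched as follows. This is a minimal illustration for a single event type on noiseless synthetic data; the full model described above would concatenate one such block of columns for each of the four task events. The function names, the small Gaussian-elimination solver and the value of the ridge penalty are assumptions.

```python
def lagged_design(event_frames, n_frames, n_lags):
    """One column per lag: X[t][k] = 1 if an event occurred at frame t - k."""
    X = [[0.0] * n_lags for _ in range(n_frames)]
    for f in event_frames:
        for k in range(n_lags):
            if f + k < n_frames:
                X[f + k][k] += 1.0
    return X


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):       # back substitution
        w[r] = (M[r][n] - sum(M[r][c] * w[c] for c in range(r + 1, n))) / M[r][r]
    return w


def ridge_fit(X, y, alpha):
    """Ridge weights (X^T X + alpha I)^-1 X^T y."""
    p = len(X[0])
    gram = [[sum(row[i] * row[j] for row in X) + (alpha if i == j else 0.0)
             for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yt for row, yt in zip(X, y)) for i in range(p)]
    return solve(gram, xty)


# Overlapping responses to four pulses generated from a known 4-frame kernel.
true_kernel = [0.0, 1.0, 0.5, 0.25]
X = lagged_design([2, 4, 11, 13], n_frames=20, n_lags=4)
y = [sum(x * k for x, k in zip(row, true_kernel)) for row in X]
kernel_hat = ridge_fit(X, y, alpha=1e-6)
```

Because each lag has its own weight, the fitted weights read out directly as the estimated kernel, even though the responses to the pulses at frames 2 and 4 (and 11 and 13) overlap in time.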
Examination of the pulse kernels revealed the temporal dynamics of neuronal responses to a single pulse (Figure 6C,D, S5). On average, evidence-tuned neurons responded to a pulse on the preferred side with a large amplitude increase in fluorescence that persisted during the measurable imaging period (~2 seconds; Figure 6B). However, examination of the dynamics of individual cells revealed extensive heterogeneity in the time courses of pulse kernels across the population (Figure 6C,D).
To quantify this diversity, we fit the time course of each pulse kernel with a model that consisted of a lag followed by the difference of two exponentials (see Methods). This analysis revealed that neurons from PPC and FOF exhibited responses to flashes on the preferred side with a diversity of time courses and largely overlapping distributions of lags and rise times (Figure 6E). These dynamics were not correlated with the brightness of the neurons, indicating that the temporal heterogeneity was not an artifact of levels of GCaMP expression (p > 0.05, see Methods). Viewed in this framework, the neurons we previously defined as ‘evidence-tuned’ may represent a subset of a population of cells whose temporal dynamics vary in a continuous way.
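One way to write a kernel model of this form is sketched below. The exact parameterization is given in Methods and is not reproduced in this section, so the form used here (zero response until the lag, then a decaying exponential minus a faster rising exponential) and the parameter names are assumptions.

```python
import math


def pulse_kernel_model(t, amplitude, lag, tau_rise, tau_decay):
    """Zero until `lag`, then amplitude * (exp(-s/tau_decay) - exp(-s/tau_rise)),
    where s = t - lag. ASSUMED parameterization, for illustration only."""
    if t < lag:
        return 0.0
    s = t - lag
    return amplitude * (math.exp(-s / tau_decay) - math.exp(-s / tau_rise))


def peak_time(lag, tau_rise, tau_decay):
    """Time of the maximum, from setting the derivative to zero.

    Requires tau_decay > tau_rise so that the response is positive.
    """
    return lag + (tau_rise * tau_decay / (tau_decay - tau_rise)) * math.log(tau_decay / tau_rise)


# Example: 200 ms lag, 100 ms rise and 800 ms decay time constants (made-up values).
t_peak = peak_time(lag=0.2, tau_rise=0.1, tau_decay=0.8)
```

Under this form, the lag sets when the response begins and the two time constants set how quickly it rises and how long it persists.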
Are heterogeneous temporal dynamics unique to visual stimuli, or might they also play a role in representing accumulated evidence in tasks with different sensory modalities? To address this question we used multiple linear regression to evaluate the click kernel from the firing rates of neurons previously obtained by tetrode recordings in PPC and FOF during an auditory (“Poisson clicks”) accumulation of evidence task (Hanks et al., 2015). Across the population, FOF and PPC exhibited sustained responses to auditory pulses on the timescale predicted by behavioral models. Averaged click kernels were similar to previously reported “click-triggered averages”, which were estimated based on the analysis of residual responses after taking into account the average response of a neuron on each trial (Figure S6A,B; Hanks et al., 2015). As we observed in our imaging data, individual neurons exhibited a wide range of temporal dynamics in response to clicks (Figure 6F, S6I). These results demonstrate that heterogeneous temporal responses to sensory events, present in both layer 2/3 and putative deeper layers, may encode accumulated evidence across different modalities and behavioral tasks.
Diversity of flash responses could provide a temporal basis for encoding accumulated evidence
Previous behavioral analysis has suggested that the animals’ memory of sensory evidence is longer than the behavioral trial (>2 seconds; Scott et al., 2015). However, the pulse responses of individual cells in PPC and FOF were transient, but started at different times. How might the brain use transient, but temporally lagged responses to stably represent the memory of a flash over behaviorally relevant timescales? One possibility is that sufficiently diverse neuronal responses can be used as basis functions to construct a stable memory trace (Medina and Mauk, 2000; Goldman, 2009), or complex time-varying signals that can be used to predict sensory events (Kennedy et al., 2014). To determine whether the flash impulse response functions we observed were sufficiently diverse to represent possible behavioral accumulation time constants, we constructed a neural network model, in which the output of the model was a weighted sum of each neuron’s pulse kernel (Figure 7A and 7B). We trained this network to generate an output to match the time course of the sensory memory as predicted by behavioral analysis (Scott et al., 2015). The sensory memory was modeled by a step followed by the exponential function: da = dF + λl a dt, where λl is a leak term fit on a trial-by-trial basis to the behavior of each rat (Brunton et al., 2013). We previously found that λl values ranged from -1.7 (leaky) to 0.5 (unstable) across all rats with a mean of 0.006 (near perfect; Figures 2F, S1; Scott et al., 2015). Using a weighted sum of PPC and FOF impulse response functions, the network was able to match response functions of leaky, impulsive, and nearly perfect accumulators over the range of λl values observed in trained rats (Figure 7C,7D). This result suggests that the impulse response functions of PPC and FOF were sufficiently diverse to provide a temporal basis for a latent variable behavioral model of evidence accumulation.
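The target time courses can be illustrated by integrating the accumulator equation directly. The sketch below uses only the deterministic terms written above (a unit jump per pulse plus the leak term) with simple Euler steps; the step size and function name are assumptions, and any noise terms of the full behavioral model are omitted. After a single pulse at time zero, the continuous-time solution is exp(λl·t).

```python
import math


def simulate_accumulator(pulse_times, lam, duration, dt=0.001):
    """Euler integration of da = dF + lam * a * dt; each pulse adds 1 to a."""
    n_steps = int(round(duration / dt))
    pulse_steps = {}
    for t in pulse_times:
        step = int(round(t / dt))
        pulse_steps[step] = pulse_steps.get(step, 0) + 1
    a, trace = 0.0, []
    for step in range(n_steps + 1):
        a += pulse_steps.get(step, 0)   # dF: unit jump for each pulse
        trace.append(a)
        a += lam * a * dt               # decay if lam < 0, growth if lam > 0
    return trace


# Impulse responses for the extreme and mean leak values quoted in the text.
leaky = simulate_accumulator([0.0], lam=-1.7, duration=2.0)
near_perfect = simulate_accumulator([0.0], lam=0.006, duration=2.0)
unstable = simulate_accumulator([0.0], lam=0.5, duration=2.0)
```

With the leakiest value, the memory of a pulse falls to roughly 18% of its initial size after one second, while with the mean value it is essentially unchanged over the two-second trial.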
We next wondered what circuit mechanisms could generate the diversity of timescales that we observed in PPC and FOF. Recent theoretical work has suggested that effectively feedforward connected networks could produce responses similar to the heterogeneous dynamics we observe (Goldman, 2009; Rajan et al., 2016). Two popular feedforward-based circuit mechanisms are iterative convolution through sequential neural activity, which relies on the synaptic (Bullock et al., 1994) or membrane time constants (Tank and Hopfield, 1987), and delay lines, which rely on axonal conduction times (Carr and Konishi, 1988) or delayed transmitter release (Atluri and Regehr, 1998) to extend responses in time (Figure 7E). Each of these biophysically-inspired models makes a different prediction about the relationship between the peak firing time and the half-width of the impulse response function (Figure 7F,7G, blue and purple lines). We compared these model predictions to the time courses of neurons’ impulse response functions from PPC and FOF (Figure 7G). We observed significant correlation between peak time and half-width, consistent with iterative convolution (Pearson’s correlation p < 0.05). This correlation was also significant (p < 0.05) when we compared the lag and rise times obtained from the parametric fit to the impulse response functions (as in Figure 6E). However, with increasing lags, there was also increased variability in the time course of the flash responses (i.e., the spread of the red dots in Figure 7G). The diversity of responses across the population is not well fit by either the delay line or the iterative convolution model individually and may reflect a combination of these mechanisms or a recurrent-based mechanism.
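The qualitative difference between the two mechanisms can be seen in a toy simulation. The sketch below is not the model used for Figure 7; the exponential filter, its 100 ms time constant, the 200 ms delay per stage and all function names are assumptions chosen only to show how peak time and half-width relate under each scheme.

```python
import math

DT = 0.01     # s, sample spacing (illustrative)
TAU = 0.1     # s, ASSUMED time constant of each filtering stage
N = 300       # number of samples (3 s)


def exp_filter():
    """Sampled exponential filter with unit area (approximately)."""
    return [math.exp(-i * DT / TAU) * DT / TAU for i in range(N)]


def convolve(a, b):
    """Causal discrete convolution truncated to N samples."""
    return [sum(a[j] * b[i - j] for j in range(i + 1)) for i in range(N)]


def peak_and_half_width(resp):
    """Peak time and full width at half maximum, both in seconds."""
    peak = max(resp)
    above = [i for i, v in enumerate(resp) if v >= peak / 2]
    return resp.index(peak) * DT, (above[-1] - above[0] + 1) * DT


def iterative_convolution(n_stages):
    """Each stage filters the output of the previous stage once more."""
    stages, resp = [], exp_filter()
    for _ in range(n_stages):
        resp = convolve(resp, exp_filter())
        stages.append(resp)
    return stages


def delay_line(n_stages, delay=0.2):
    """Each stage is the same response shifted later by a fixed delay."""
    shift = int(round(delay / DT))
    base = convolve(exp_filter(), exp_filter())
    return [[0.0] * (k * shift) + base[:N - k * shift] for k in range(1, n_stages + 1)]


conv_stats = [peak_and_half_width(r) for r in iterative_convolution(4)]
line_stats = [peak_and_half_width(r) for r in delay_line(4)]
```

In this toy setting, later stages of the iterative convolution chain peak later and are also broader, whereas delay-line stages peak later with an unchanged half-width.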
Heterogeneous dynamics encode the memory of responses on previous trials
Rats make decisions based not only on events in the current trial, but also on their memory of previous choices and reward on previous trials (Scott et al., 2015). PPC has recently been shown to encode the subjects’ choices on previous trials in mice (Morcos and Harvey, 2016), and inactivations of rat PPC reduce the influence of trial history on decision-making in a working memory task (Akrami et al., SFN abstract 2016).
Consistent with studies in mice (Morcos and Harvey, 2016), we observed that populations of neurons in PPC encoded the choice of the animal on the previous trial (Figure 8A, S7). In addition, we found that FOF neurons also encoded the rat’s previous choice. Notably, we could decode the current choice more accurately than the previous choice in FOF, while we found the opposite trend in PPC. Next we examined the responses of individual neurons, which often exhibited different responses depending on the choice of the animal on the previous trial (Figure 8B, S7D-F). A neuron was defined as “previous choice selective” if the distributions of fluorescence values on trials in which the previous choice was left versus right were significantly different from each other (two-sided t-test p<0.05). Using this metric, 23% of neurons in FOF and 21% of neurons in PPC were previous choice selective. Across the population, previous choice selective neurons displayed heterogeneity in their temporal responses to trial history (Figure 8C). Similar analysis (see Methods) revealed the presence of previous choice selective neurons in our electrophysiological data set in the auditory version of the task (Figure 8D, 8E): 13.9% (55/397) of neurons in FOF and 17.8% (70/394) of neurons in PPC were previous choice selective. Based on these findings, we modified our linear regression analysis to include a set of regressors which represented the influence of previous choice on the response of the neuron (Figure S8). The addition of regressors that encoded the previous trial did not significantly affect the pulse kernels (Figure S8; also see Methods: Non-parametric permutation test). This is not surprising, as the response of the animal on the previous trial is uncorrelated with the timing of flashes on the current trial.
DISCUSSION
Behavioral analysis and neural recordings support the existence of two weakly coupled accumulators
Two general classes of accumulator-based behavioral models, relative and absolute accumulator models, have been proposed to describe the decision-making process during accumulation of evidence tasks. In relative accumulator models (e.g., the drift diffusion model or DDM) the decision is represented by a single value that reflects the difference in the evidence for and against a particular decision. In absolute accumulator models (e.g., race models) the decision is represented by multiple accumulators that encode the strength of evidence for each option independently. If two separate accumulators are negatively correlated (for example, by receiving the opposite of each other’s input), then the 2-dimensional absolute accumulator model reduces to a 1-dimensional relative accumulator model (Bogacz et al., 2006). For simplicity, many neurophysiological studies of evidence accumulation have assumed a relative accumulator framework (Brody and Hanks, 2016; Gold and Shadlen, 2007). However, we previously demonstrated that errors in both visual and auditory pulse-based accumulation of evidence tasks are correlated with both the total number of flashes and the difference in the number of flashes (Scott et al., 2015). In a limited regime of simple models, such a dependency was better explained by a behavioral model in which rats independently accumulated left and right flashes (in which case they would have access to both the total and the difference in flash number) rather than the difference between flashes. In the current study, we evaluated whether the activity of neurons is better predicted by the number of either left or right pulses, or by their difference. The activity of most evidence-tuned neurons in PPC and FOF was better predicted by the number of pulses on one side rather than the difference in pulses, consistent with an absolute evidence accumulator model.
However, examination of the pulse kernels revealed a small amplitude negative response to pulses on the non-preferred side (Figure 6B). This suggests that neurons tuned to evidence on one side may be weakly inhibited by evidence on the other, and consequently that the accumulators for left and right evidence are neither completely independent nor strongly coupled, but instead, weakly coupled (Usher and McClelland, 2001).
Temporally diverse sensory responses and evidence encoding in PPC and FOF
Two broad classes of dynamics, homogeneous and heterogeneous, have been proposed for how networks of neurons might represent sensory evidence during decision-making (Figure 1). (1) The homogeneous dynamics model predicts that neurons would respond similarly to pulses of evidence with a sustained increase in activity. While average population data has been used to support this idea (Huk and Shadlen, 2005), the temporal heterogeneity of responses of individual neurons in primate LIP during continuous accumulation of evidence has been taken as evidence against this model (Meister et al., 2013). (2) The heterogeneous dynamics model predicts that either individual neurons would respond transiently at different times relative to pulses (Harvey et al., 2012), or that individual neurons would respond with diverse, time-varying responses at different times relative to pulses (Brody et al., 2003; Goldman, 2009). Sequential and other forms of heterogeneous dynamics have previously been proposed for memory-based tasks, but whether these responses could also underlie an accumulation process has not been explored.
The diversity in the regression kernels of estimated pulse responses in FOF and PPC was consistent with heterogeneous dynamics, and responses were sufficiently diverse to potentially provide a basis for representing accumulated evidence over several seconds. In this framework, the observation that the responses of individual neurons can be linearly combined to reproduce the dynamics of the decision variable (Figure 7) suggests that FOF and PPC populations could represent sensory memory on behaviorally relevant timescales. Whether the representations reported here are specific to sensory evidence, or might also apply to accumulation of more abstract forms of evidence (Yang and Shadlen, 2007), is an interesting question that remains to be addressed in future experiments.
Recently, Morcos and Harvey (2016) recorded dynamics in the mouse PPC during an accumulation of evidence task requiring virtual navigation. They observed that, in a behavioral trial, PPC transitioned through a sequence of states defined by population activity, and the trajectory through this state space was correlated with the mouse’s upcoming choice. Further work is required to determine the relationship between the sequences observed in mouse PPC and the diversity of timescales in response to individual pulses that we observed in FOF and rat PPC. We note that the data presented in the current study are consistent with sequential or other, more complex heterogeneous dynamics (see Figure 1 and Figure 7G).
Network models involving heterogeneous dynamics
Neural network models that incorporate time-varying dynamics during working memory appear to confer a number of computational advantages, including increased robustness to perturbations and reduced sensitivity to tuning parameters (Goldman, 2009; Sussillo, 2014; Sussillo and Abbott, 2009). Networks that encode sensory memory in feedforward-like dynamics, in which the network progresses through a sequence of states with distinct activity patterns, have recently been shown to be particularly robust to perturbations and noise (Ganguli et al., 2008; Goldman, 2009; Lim and Goldman, 2012). In these models, sensory stimuli trigger activity in a population of cells that fire at different times relative to the event. Downstream neurons can decode or recover the history of past events by monitoring activity across the population (Klampfl and Maass, 2013; Laje and Buonomano, 2013; Sussillo and Abbott, 2009; Tank and Hopfield, 1987). The neuronal dynamics that we observed bear striking similarities to predictions of these feedforward networks, such as time varying responses to sensory stimuli and diversity in the timescales of these responses.
Previous studies have also demonstrated that neurons in other frontal cortical areas in rats, such as anterior cingulate cortex and prelimbic cortex, exhibit diverse, time-varying task-relevant responses, suggesting that temporally heterogeneous dynamics may underlie core computations in frontal cortex during behavior (Horst and Laubach, 2012; Jung et al., 1998; Lapish et al., 2008; Ma et al., 2016; Powell and Redish, 2014; Young and Shapiro, 2011). Moreover, Shankar and Howard (2012) have argued that a set of many neurons responding with different time courses can represent complete information about stimulus history, whereas homogeneous dynamics (Figure 1A) cannot (Shankar and Howard, 2012; Tiganj et al., 2015).
The long timescales we observed in the responses of PPC and FOF neurons to sensory evidence place constraints on mechanistic models of accumulation and working memory in these regions. Previous studies have proposed a variety of biophysically plausible mechanisms that generate temporal diversity in feedforward neural networks. For example, axonal conduction velocities (Braitenberg, 1961; Jeffress, 1948) and synaptic delays in chains of neurons (Licklider, 1951) can produce synaptic events at different times relative to a sensory event. Additionally, the time constant of the cell membrane (~10ms) or NMDA receptor activation (~100ms) could act as a convolutional filter to lengthen the timescale of synaptic events in feedforward networks (Tank and Hopfield, 1987; Wang, 2002). Theoretical work has demonstrated that sequential and other feedforward-like dynamics can also be produced by artificial neural networks with recurrent architecture. Given the strong recurrent connectivity frequently observed in rat cortical layer 2/3 (e.g., Barbour and Callaway, 2008; Mason et al., 1991), such artificial networks suggest a biologically plausible mechanism for the generation of heterogeneous dynamics.
Sequential activation of populations of neurons has been observed in both the rodent neocortex and hippocampus during navigation (Harvey et al., 2012; Wilson and McNaughton, 1993). However, similar dynamics have also been observed in tasks that require working memory but that do not involve navigation. For example, in response to discrete presentations of sensory stimuli, neurons in the rat hippocampus (MacDonald et al., 2013) and mouse somatosensory cortex (Peron et al., 2015) exhibit sequential activity that reflects the identity or location of the remembered stimulus during a delay period even when animals were not navigating. Similarly, in the cerebellum, temporally diverse responses of granule cells have been proposed to provide a basis set for sensory motor learning (Medina and Mauk, 2000) and the cancellation of self-generated sensory signals (Kennedy et al., 2014). Our work suggests that similar principles could allow populations of PPC and FOF neurons with diverse timescales to encode not just single sensory events, but also complex sequences of sensory events, as demonstrated with simulated networks (Klampfl and Maass, 2013; Laje and Buonomano, 2013; Sussillo and Abbott, 2009; Tank and Hopfield, 1987).
The roles of the rat PPC and FOF in decision commitment
It has been suggested that PPC and FOF may play distinct roles in the decision making process (Brody and Hanks, 2016). Hanks et al. (2015) reported that neurons in FOF exhibited a more categorical representation of accumulated evidence than neurons in PPC, which the authors suggested reflected FOF’s putative role in action selection (Erlich et al., 2015; Hanks et al., 2015; Kopec et al., 2015). While we observed a greater percentage of side-selective neurons in FOF than in PPC, we did not observe a statistical difference between PPC and FOF in terms of whether representations were categorical or graded (i.e., whether the relationship between activity and evidence was sigmoidal, or linear, respectively; see Methods). One important distinction is that in this study, rats were presented with fewer evidence pulses than in Hanks et al. (2015). FOF firing rates at the accumulator values in the current study (up to +/− 6 flashes) appear linear in the Hanks et al. (2015) dataset (see their figure 3C). The larger range of accumulator values examined in that study may have been critical to reveal the categorical nature of FOF responses. In other words, even if representations in FOF were more categorical, the stimuli in our task may have been in a regime in which they appeared linear. Another possibility is that the increased temporal resolution of the electrophysiological measurements of Hanks et al., combined with their use of a model-based estimate of the internal accumulator variable, enabled distinguishing linear versus sigmoidal encoding.
One striking difference between FOF and PPC that we did observe was related to the memory of the previous trial. We confirmed, as reported in a previous mouse study (Morcos and Harvey 2016), encoding of previous trial choice in PPC. We also found previous trial encoding in FOF. Furthermore, we found a difference between the two areas. While linear decoders based on FOF neurons were slightly better at predicting the upcoming choice than the previous choice, we found the opposite to be true in PPC: decoding accuracy based on PPC was higher for the previous choice than for the upcoming choice. Distinctions between neuronal responses in parietal and frontal cortices can be elusive (Chafee and Goldman-Rakic 1998; Hanks et al. 2015). Our results suggest a novel axis on which to explore functional differences between these two regions.
STAR METHODS
CONTACT FOR REAGENT AND RESOURCE SHARING
Further information and requests for reagents may be directed to and will be fulfilled by the Lead Contact, Dr. David Tank (dwtank@princeton.edu).
EXPERIMENTAL MODEL AND SUBJECT DETAILS
Seven adult male rats, four Long-Evans and three Sprague Dawley (obtained from Taconic Biosciences, Inc.), were used in this study. Rats were typically housed either in pairs or singly in a reverse light cycle room. Access to water was scheduled to within-box training, 2 hours per day, and between 0 and 1 hour ad lib following training.
METHODS DETAILS
Surgery
Rats underwent two different types of surgical procedures during the course of the experiment described here: headplate implantation and optical window implantation. Headplate implantation typically occurred before behavioral training, although in one rat it was necessary to move the headplate to a new location after the animal had become well-trained. In this surgery rats were surgically implanted with a custom-made, 5-gram, titanium headplate that was bonded to the skull using a dental cement adhesive (Metabond). The headplate contained two important design features (see Scott et al., 2013 for details). First, it contained a rectangular window, approximately 8.8 mm by 15 mm, centered in the headplate, that allowed access to the skull. Second, it contained a conical depression and V-groove that interacted with the kinematic clamp to provide mechanical registration and stability during head restraint. The headplate was implanted so that the center window lay over the stereotaxic coordinates for one of the three brain regions in this study.
After training, animals underwent a second surgery in which the optical window was implanted. In this surgery, a 3.5mm diameter circular trephination was made centered on the stereotaxic coordinates for PPC and FOF. The dura was dissected away and 25–50 nl of viral vector was injected. In some animals the gene for the red fluorescent protein marker mCherry was also expressed in neurons in order to determine the magnitude and timing of fluorescence changes due to brain motion and other imaging artifacts. In 4 rats the viral vector was a cocktail containing 1 part AAV-hsyn-2.1-GCAMP6f and 1 part AAV2.1-mCherry diluted in 1 part saline. In another 4 rats 1 part AAV cag 2.1 GCAMP6f was diluted in 1 part saline. All viruses were obtained from the Penn Vector Core (Gene Therapy Program, Perelman School of Medicine, University of Pennsylvania). After injection of the viral vector an optical implant, consisting of a 3.5 mm glass coverslip bonded to a 3.5 mm diameter, 0.8mm high, stainless steel ring using optical adhesive, was placed on the surface of the cortex with the glass side down. The implant was positioned so that it depressed the surface of the cortex by about 0.2mm in order to reduce brain motion and bonded to the skull using a combination of medical grade cyanoacrylate adhesive (Vetbond) and dental cement adhesive (Metabond). Extreme care was taken to minimize trauma to the dura and cortical surface during the surgical procedure and also to maintain sterile conditions. GCaMP6f and mCherry labeling could be observed beginning 1–2 weeks after surgery (Figure 2B) and calcium-dependent transients could be recorded in GCaMP6f-labeled neurons using two-photon imaging during voluntary head restraint, typically for 2–4 weeks and in one case, up to 6 months.
Finally, we note that although circular 3.5mm optical windows were implanted over the stereotaxic coordinates for PPC and FOF, clipping of the optical beam by the edges of the cannula made it very difficult, if not impossible, to image cell bodies more than a millimeter away from the center of the window (i.e., our coordinates).
Behavioral training
Following recovery, rats were trained in a high-throughput, computer-controlled behavioral-training chamber. Over the course of two months, rats acclimated to human handling and progressed through 3 initial stages of training. In the first stage rats learned to initiate a behavioral trial by inserting their nose into a center nose port. After trial initiation a water reward became available at either the left or right side reward port, indicated by a light illuminated at the port. Initially the nose port was flush with the front wall of the behavioral chamber, but after each trial the nose port, mounted on a linear translation stage, moved further away from the interior of the chamber so that the rat had to insert its head and headplate further into the head port to initiate a trial. Once the rat inserted its head far enough to trigger a pair of contact sensors at the end of the head port slot, it transitioned to the second stage of training. In the second stage rats were trained to wait in the head port for increasingly long durations before a go cue instructed them to break fixation. In the third stage of training rats learned to associate LED flashes with reward. For a more detailed description of behavioral training and analysis see Scott et al. (2015).
Image acquisition
Two-photon imaging was performed using a custom movable objective microscope that has been previously described (Scott et al., 2013). ScanImage software controlled the movement of the scan mirrors as well as digitization and recording of four analogue input channels. Two input channels were used to record signals from the two PMTs (one for green light, one for red light) and the other two channels were used to record the timing of task events (LED flashes and the choices of the animal) from the behavioral control software. To prevent light leakage from the LEDs to the PMTs we used blue shifted LEDs (470nm mean wavelength, Super Bright LEDs) combined with a 450nm lowpass filter.
Off-line motion correction
Following acquisition, image stacks were motion corrected using custom scripts written in MATLAB. First a template frame was identified manually. On sessions where expression was low, the template frame was typically composed of the average of up to 10 manually selected individual frames. Next, 2D cross correlation was performed between the template frame and each individual frame. Individual frames were aligned to the template based on the X and Y shifts that produced the highest frame-to-template cross correlation value. After correction we inspected each image stack for residual motion. Frames with evident Z motion were identified and their peak correlation value was recorded. We then excluded all frames with a peak correlation value less than the peak correlation value of the identified frame. This process was repeated until the resulting image stack contained no visible Z-motion.
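The alignment step described above can be sketched in Python as follows. This is a minimal illustration and not the authors' MATLAB scripts: the function names `overlap_corr` and `best_shift`, the exhaustive search over a small window of integer shifts, and the use of Pearson correlation over the overlapping pixels are all assumptions made for the example.

```python
import math

def overlap_corr(template, frame, dy, dx):
    """Pearson correlation between template[y][x] and frame[y+dy][x+dx] over their overlap."""
    h, w = len(template), len(template[0])
    a, b = [], []
    for y in range(h):
        for x in range(w):
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                a.append(template[y][x])
                b.append(frame[yy][xx])
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    if va == 0 or vb == 0:
        return 0.0  # no variance in the overlap: correlation undefined, treat as 0
    return cov / math.sqrt(va * vb)

def best_shift(template, frame, max_shift=2):
    """Return (dy, dx, peak_corr) for the shift with the highest frame-to-template correlation."""
    best = (0, 0, -2.0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            c = overlap_corr(template, frame, dy, dx)
            if c > best[2]:
                best = (dy, dx, c)
    return best

# Toy example: a bright spot at (row 2, col 2) in the template sits at (row 3, col 1) in the frame.
template = [[0.0] * 6 for _ in range(6)]
template[2][2] = 1.0
frame = [[0.0] * 6 for _ in range(6)]
frame[3][1] = 1.0
dy, dx, peak = best_shift(template, frame)
```

In a sketch like this, the returned `peak` value plays the role of the peak correlation used for frame exclusion: frames whose peak falls below a chosen threshold would be dropped.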
ROI identification, neuropil subtraction and fluorescence signal extraction
After motion correction, regions of interest (ROIs) corresponding to the neuronal somata were identified as previously described (Scott et al., 2013). We used visual inspection to identify the position of cell bodies in the image stack, demarcating pixels corresponding to the edges of cells using the ImageJ ROI manager feature. Fluorescence traces were obtained by averaging the intensity values of the pixels corresponding to each ROI for each frame. Linear interpolation was used to estimate the fluorescence values of frames that had been excluded by the motion correction algorithm. For neuropil analysis we selected a rectangular 10×10 pixel region centered on each neuron that excluded all pixels that had been identified as belonging to a neuronal ROI (Scott et al., 2013).
Identification of positive-going and negative-going transients
Once fluorescence traces were extracted, we calculated the baseline fluorescence level for each ROI. The baseline fluorescence (F) was defined as the mode fluorescence value across all time points. The standard deviation (σF) in the baseline was defined as:
where X are all values in the fluorescence trace less than F. A positive-going significant transient was defined as a portion of the fluorescence trace that exceeded the baseline fluorescence value by more than 3 standard deviations (>F+3σF) for more than 0.75 seconds. A negative-going transient was defined as a portion of the fluorescence trace that fell below the baseline by more than 3 standard deviations (<F-3σF) for more than 0.75 seconds.
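A minimal Python sketch of this transient criterion is given below. Several details are assumptions made for illustration only: the mode is estimated by rounding samples to a fixed number of decimals (`decimals`), the baseline standard deviation is taken as the root-mean-square deviation from F of the samples below F (one plausible reading of the definition above), and all function names are hypothetical.

```python
import math
from collections import Counter

def baseline_mode(trace, decimals=2):
    """Estimate the mode of a trace after rounding (rounding precision is an assumption)."""
    counts = Counter(round(v, decimals) for v in trace)
    return counts.most_common(1)[0][0]

def baseline_sigma(trace, f_base):
    """RMS deviation from baseline of all samples below baseline (assumed estimator)."""
    below = [v for v in trace if v < f_base]
    if not below:
        return 0.0
    return math.sqrt(sum((v - f_base) ** 2 for v in below) / len(below))

def find_transients(trace, f_base, sigma, frame_rate_hz,
                    n_sigma=3.0, min_dur_s=0.75, sign=+1):
    """Return (start, stop) frame indices of runs beyond baseline by n_sigma*sigma
    that last longer than min_dur_s. sign=+1: positive-going; sign=-1: negative-going."""
    thresh = n_sigma * sigma
    runs, start = [], None
    # A trailing baseline sample acts as a sentinel that closes a run ending at the last frame.
    for i, v in enumerate(list(trace) + [f_base]):
        beyond = sign * (v - f_base) > thresh
        if beyond and start is None:
            start = i
        elif not beyond and start is not None:
            if (i - start) / frame_rate_hz > min_dur_s:
                runs.append((start, i))
            start = None
    return runs

# Toy trace at 10 Hz: quiet baseline, a 1.0 s bump, then a 0.3 s bump.
base = [1.00, 0.99, 1.00, 1.01, 0.98]
trace = base * 4 + [2.0] * 10 + base * 2 + [2.0] * 3 + base * 2
f_base = baseline_mode(trace)
sigma = baseline_sigma(trace, f_base)
positive = find_transients(trace, f_base, sigma, frame_rate_hz=10.0, sign=+1)
negative = find_transients(trace, f_base, sigma, frame_rate_hz=10.0, sign=-1)
```

In this toy trace only the 1.0 s bump qualifies as a positive-going transient; the 0.3 s bump is too brief and no negative-going transients are found.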
Exclusion of frames due to Z-motion and identification of active cells
Brain motion, particularly Z-motion, can produce fluorescence changes as cells move into and out of the field of view. Previous studies (Dombeck et al., 2007; Scott et al., 2013) have used negative-going transients to estimate the rate of such artifacts. We compared the rate of positive-going and negative-going transients for each ROI. If greater than 5% of all transients for an ROI were negative-going, we raised the correlation threshold value and excluded frames whose peak correlation fell below that threshold. Linear interpolation was used to estimate the fluorescence values of the excluded frames. This process was repeated until negative-going transients constituted less than 5% of all transients. If a threshold correlation could be identified at which negative-going transients constituted less than 5% of all transients and positive-going transients occurred at a rate of more than one per minute, that ROI was defined as an active neuron.
Electrophysiological recordings
Electrophysiological recordings characterized here were presented previously (Hanks et al., 2015). Once trained, rats were implanted with custom microdrives containing tetrodes. Tetrode positions for FOF were centered at +2 anterior–posterior (AP), ±1.3 medio–lateral (ML) (mm from Bregma); and those for PPC were centered at -3.8 to -4.1 AP and ± 2.2 ML. Recordings were made with insulated platinum iridium wire (16.6 µm diameter; Neuralynx) twisted into tetrodes. Each tetrode was threaded into a polyimide tube (34 AWG triple wall) which was part of a movable bundle of eight tubes. Within each bundle, tetrodes were spaced ~250 µm from each other. The tetrodes could be advanced by turning a nut against a spring on a 0–80 threaded rod so that a 1/8 turn drove the tetrodes down about 40 µm. The tetrodes were advanced at the end of sessions so that the brain tissue had time to stabilize before recording the next day. Depth relative to the surface of the brain was estimated based on the number of turns of the screw required to reach that recording site, and recording sites were verified with post-mortem histology.
To fit the linear regression model to the electrophysiology data, data were pre-processed as described in Hanks et al. (2015). Briefly, trial firing rate functions were generated by smoothing the spike trains with a causal half-Gaussian filter, with a width of 25ms. The model was fit to this smoothed firing rate. Task-related model components (left and right clicks, left and right choice) were parameterized by temporal bases defined by delta functions at discrete lags (in steps of 25ms) relative to the event. Data collected before the stimulus start time was discarded.
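The smoothing step can be illustrated with the short Python sketch below. It is not the preprocessing code of Hanks et al. (2015): the 1 ms bin size, the interpretation of the 25 ms "width" as the Gaussian standard deviation, the truncation of the kernel at four standard deviations, and the function names are all assumptions.

```python
import math

def half_gaussian_kernel(sigma_s=0.025, dt_s=0.001, n_sigma=4.0):
    """One-sided Gaussian weights for lags 0, dt, 2*dt, ... normalized to sum to 1."""
    n = int(round(n_sigma * sigma_s / dt_s)) + 1
    k = [math.exp(-0.5 * (j * dt_s / sigma_s) ** 2) for j in range(n)]
    total = sum(k)
    return [v / total for v in k]

def causal_smooth(spike_counts, kernel, dt_s=0.001):
    """Firing rate (Hz) at each bin, using only the current and past bins."""
    rate = []
    for t in range(len(spike_counts)):
        acc = 0.0
        for j, w in enumerate(kernel):
            if t - j < 0:
                break
            acc += w * spike_counts[t - j]
        rate.append(acc / dt_s)
    return rate

# Toy example: a single spike in bin 50 of a 200 ms train binned at 1 ms.
spikes = [0] * 200
spikes[50] = 1
rate = causal_smooth(spikes, half_gaussian_kernel())
```

Because the filter is causal, the estimated rate is zero before the spike, peaks in the spike's own bin, and decays afterwards; the integral of the rate recovers the single spike.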
ΔF/F estimation, side selective neurons and evidence tuned neurons
ΔF/F for each ROI was defined as follows:
$$\Delta F/F(t) = \frac{F(t) - F_0}{F_0}$$
where F0 is the mean fluorescence value of all frames before the cue period, and F(t) is the fluorescence value at each time point. To identify side-selective neurons and evidence-tuned neurons, we examined the average ΔF/F during the delay period on all correct trials. If the distributions of ΔF/F values were significantly different between left and right trials, according to a two-tailed t-test with alpha = 0.05, that neuron was termed side-selective. Next, for each neuron we performed robust linear regression (MATLAB Statistics Toolbox robustfit.m) to determine the relationship between the average delay period response and the numerical difference between flashes (#R-#L). Evidence index (EI) was defined as the slope of the best-fit linear relationship between ΔF/F and #R-#L. A cell was defined as evidence tuned if its evidence index was significantly different from 0 and the squared correlation coefficient (r²) was greater than 0.5.
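As a rough illustration of these two quantities, the Python sketch below computes ΔF/F using the conventional form (F(t) − F0)/F0, consistent with the variables defined above, and estimates an evidence index as a regression slope. Note that it substitutes ordinary least squares for the robust regression (robustfit.m) used in the study, omits the significance test on the slope, and uses hypothetical function names.

```python
def delta_f_over_f(trace, n_baseline_frames):
    """Normalize a raw fluorescence trace by F0, the mean of the pre-cue frames."""
    f0 = sum(trace[:n_baseline_frames]) / n_baseline_frames
    return [(f - f0) / f0 for f in trace]

def evidence_index(evidence, response):
    """Ordinary least-squares slope and r^2 of delay-period response versus #R-#L.
    (Stand-in for robust regression; outliers are not down-weighted here.)"""
    n = len(evidence)
    mx = sum(evidence) / n
    my = sum(response) / n
    sxx = sum((x - mx) ** 2 for x in evidence)
    syy = sum((y - my) ** 2 for y in response)
    sxy = sum((x - mx) * (y - my) for x, y in zip(evidence, response))
    slope = sxy / sxx
    r2 = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, r2

# Toy data: four baseline frames followed by a rising signal.
dff = delta_f_over_f([100.0, 100.0, 100.0, 100.0, 150.0, 200.0], n_baseline_frames=4)
# Toy tuning curve: mean delay-period response grows linearly with #R-#L.
slope, r2 = evidence_index([-4, -2, 0, 2, 4], [0.1, 0.2, 0.3, 0.4, 0.5])
```

Under the criterion in the text, a cell like this toy example (nonzero slope, r² above 0.5) would be a candidate evidence-tuned neuron, pending the significance test on the slope.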
Linear decoder analysis
A support vector machine based linear decoder (Chang and Lin, 2011) was implemented in MATLAB using an existing library available online (http://www.csie.ntu.edu.tw/~cjlin/libsvm/). The decoder was trained to predict either the choice of the animal on the current trial (choice), the choice of the animal on the previous trial (previous choice) or the side with the greater number of flashes (correct side) on each trial using the mean activity during the delay period for multiple neurons. A randomly selected subset of half of the trials was used to train the decoder (training set) and the other half was used to evaluate the decoder (test set). The performance of the decoder for each group of cells reported in the text reflects the mean performance across 100 different training and test set combinations. Groups of cells were constructed from all possible combinations of simultaneously recorded neurons; for example, if cells A, B and C were recorded simultaneously then performance would be evaluated for the following combinations: A, B, C, AB, AC, BC and ABC.
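The bookkeeping around the decoder (enumerating cell groups and drawing random half splits of trials) can be sketched in Python as below. The SVM itself is not reproduced; the function names `cell_groups` and `split_half` and the fixed random seed are assumptions for the example.

```python
import random
from itertools import combinations

def cell_groups(cells):
    """All non-empty combinations of simultaneously recorded cells."""
    return [combo
            for k in range(1, len(cells) + 1)
            for combo in combinations(cells, k)]

def split_half(n_trials, seed=0):
    """Randomly assign half of the trial indices to training and the rest to testing."""
    rng = random.Random(seed)
    idx = list(range(n_trials))
    rng.shuffle(idx)
    half = n_trials // 2
    return sorted(idx[:half]), sorted(idx[half:])

groups = cell_groups(["A", "B", "C"])
train_idx, test_idx = split_half(10)
```

For three cells this yields seven groups (three single cells, three pairs and one triplet). Repeating `split_half` with 100 different seeds and averaging test accuracy would mirror the averaging over training and test set combinations described above.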
Evidence tuning model comparison
To evaluate whether the data were better fit by a single- or two-accumulator model, we performed the following statistical test. For each neuron we created 1000 surrogate datasets using a bootstrapping method where each surrogate dataset consisted of a randomly selected subset of 75% of the trials. For each surrogate dataset we computed the R-squared value for the best-fit linear relationship between the average activity of the neuron during the delay period and the accumulated evidence on each trial. Accumulated evidence was defined as the number of right flashes (#R), the number of left flashes (#L) or the difference between the number of right and left flashes (#R-#L). Next, we performed a pairwise permutation test (see below and Scott, Constantinople et al., 2015) to compare the distributions of R-squared values for the fits of the data to each of the three models. The model with the highest R-squared value was considered the best-fit model.
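The surrogate-dataset step might look like the following Python sketch. It is an illustration under stated assumptions: subsets are drawn without replacement, the R-squared is that of a simple linear fit, the pairwise permutation test is not implemented, and the function names and toy neuron are hypothetical.

```python
import random

def r_squared(x, y):
    """R^2 of the best-fit line relating y to x (squared Pearson correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)

def surrogate_r2(n_right, n_left, activity, n_surrogates=1000, frac=0.75, seed=0):
    """R^2 distributions for the #R, #L and #R-#L models over random 75% trial subsets."""
    rng = random.Random(seed)
    n = len(activity)
    k = int(round(frac * n))
    predictors = {
        "#R": n_right,
        "#L": n_left,
        "#R-#L": [r - l for r, l in zip(n_right, n_left)],
    }
    out = {name: [] for name in predictors}
    for _ in range(n_surrogates):
        idx = rng.sample(range(n), k)  # subset drawn without replacement (assumption)
        y = [activity[i] for i in idx]
        for name, x in predictors.items():
            out[name].append(r_squared([x[i] for i in idx], y))
    return out

# Toy neuron whose delay activity depends only on the flash difference.
n_right = [r for r in range(4) for l in range(4)] * 2
n_left = [l for r in range(4) for l in range(4)] * 2
activity = [0.1 * (r - l) for r, l in zip(n_right, n_left)]
dist = surrogate_r2(n_right, n_left, activity, n_surrogates=200)
mean_r2 = {name: sum(v) / len(v) for name, v in dist.items()}
```

For this toy neuron the #R-#L model explains essentially all of the variance in every surrogate dataset, whereas the single-side models explain only part of it.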
Multiple linear regression of neural activity
Each active neuron’s response (ΔF/F for calcium imaging; instantaneous firing rate, r, for tetrode recordings) was modeled as a linear combination of task-related events (assuming Gaussian noise). Importantly, randomness and variability in the timings of task-related events enabled dissociating their effects on the neural response. Task-related events (here called model components: left and right flashes, left and right choice) were parameterized by temporal bases defined by delta functions at discrete lags (in steps of frame rate, usually 23 Hz [50 ms], but sometimes 12 Hz [100 ms]) relative to each event, following each left and right flash, and preceding the end of the trial (or the choice epoch). Each model component covered a specified range of time: for flashes, the temporal basis extended to 2 s after flash onset, and for choice, bases covered the maximum trial duration in each imaging session, aligned to the end of the trial. The total number of parameters for each neuron depended on the frame rate and maximum trial duration in the imaging session. The median number of parameters was 212, and ranged from 94 to 222.
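To make the delta-function basis concrete, the Python sketch below builds the lagged design matrix for one event type and shows that multiplying it by a kernel is equivalent to placing a copy of the kernel after every event. Function names are hypothetical, and only post-event lags (as for flashes) are shown; choice components aligned to the end of the trial would be built analogously with lags preceding the event, and the blocks for all components concatenated column-wise.

```python
def lagged_design(event_frames, n_frames, n_lags):
    """X[t][k] = number of events that occurred exactly k frames before frame t."""
    X = [[0.0] * n_lags for _ in range(n_frames)]
    for e in event_frames:
        for k in range(n_lags):
            if e + k < n_frames:
                X[e + k][k] += 1.0
    return X

def predict(X, kernel):
    """Model prediction: each row of X dotted with the kernel coefficients."""
    return [sum(x * w for x, w in zip(row, kernel)) for row in X]

# Toy example: flashes at frames 2 and 4, and a 3-lag kernel.
X = lagged_design(event_frames=[2, 4], n_frames=8, n_lags=3)
prediction = predict(X, kernel=[1.0, 0.5, 0.25])
```

At frame 4 the tail of the first flash response (0.25) and the onset of the second (1.0) add linearly, which is what allows variable flash timing to dissociate overlapping responses.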
The coefficients were fit using ridge regression. The ridge regularizer (λ) was selected using a procedure called evidence optimization, or type II maximum likelihood estimation, which is a maximum-likelihood procedure for estimating the prior distribution of model parameters (governed by λ) from the data (see Park et al., 2014; Park and Pillow, 2011). Evidence optimization maximizes the marginal likelihood of the data given λ. The standard deviation estimates of the regressors were obtained by taking the square root of the diagonal of the posterior covariance matrix (which, in our linear regression model, was $(X^{T}X + \lambda I)^{-1}$).
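The closed-form ridge estimate, $w = (X^{T}X + \lambda I)^{-1}X^{T}y$, can be sketched in plain Python as below. This example fixes λ by hand; it does not implement evidence optimization, and the small Gaussian-elimination solver and function names are assumptions chosen to keep the example dependency-free.

```python
def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (M[r][n] - sum(M[r][k] * w[k] for k in range(r + 1, n))) / M[r][r]
    return w

def ridge_fit(X, y, lam):
    """Ridge coefficients from the normal equations (X^T X + lam*I) w = X^T y."""
    p = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(A, b)

# Toy data generated exactly by w = [1, 2].
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]
w_ols = ridge_fit(X, y, lam=0.0)    # no regularization recovers [1, 2]
w_ridge = ridge_fit(X, y, lam=1.0)  # regularization shrinks the coefficients
```

With λ = 1 the coefficients shrink toward zero (to 0.875 and 1.375 here), illustrating the effect that the data-driven choice of λ controls.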
Model predictions were generated by convolving the task-related model components with delta functions at the times of the task events. Goodness-of-fit (cross correlation, variance explained) was computed using the predicted and actual response (ΔF/F or r) at each time point, from the cross-validated test sets. The cross validation procedure was as follows: on each of ten iterations, 75% of the data was partitioned into a training set, 25% partitioned into a test set, and the model prediction was generated for the test set. The average model prediction over all iterations was computed, and used to evaluate model performance on the average test set over all iterations. To evaluate whether regression could recapitulate the evidence tuning of individual neurons, we performed 20-fold cross validation to ensure that sufficiently diverse trial types were represented in the test set.
Model predictions were compared against the responses of each of the neurons averaged across different trial types. The regression model reproduced the temporal profile of the average responses on left and right choice trials. To quantify this, we calculated the correlation between the average model prediction and corresponding trial-averaged data on left and right choice trials. For neurons in PPC the model and data exhibited a correlation coefficient of 0.71 (+/− 0.23), and for neurons in FOF, the model and data exhibited a correlation coefficient of 0.88 (+/− 0.16), compared to .0027 (+/−.01) for shuffled controls. We also found that the model faithfully predicted the graded response to accumulated evidence for each cell. While both the choice components and visual components were required to reproduce the average trial dynamics of each neuron, the visual components alone were sufficient to predict the evidence tuning.
We evaluated the visual components of the model in the mV2 neurons we recorded. Typical mV2 cells exhibited strong, reliable responses to visual pulses that were evident on individual trials. We fit the regression model to these cells, and compared the visual components to model-free estimates of the neurons’ flash-triggered averages. Model fits predicted visual components with similar time courses to those seen in the flash-triggered averages in mV2, and also in PPC and FOF.
Next we evaluated how well the model predicted the activity of each neuron at each time point. For PPC neurons the data and model at each time point exhibited an average correlation coefficient of 0.21 (+/− 0.17), and for FOF neurons, a correlation coefficient of 0.41 (+/− 0.16). Although these correlation values are comparable to what has been observed in similar encoding models (Pinto and Dan, 2015), we sought to evaluate the influence of trial-to-trial variability on model performance. We reasoned that the model could not perform better than the average response on identical trials (Vintch et al., 2012). Therefore, for identical trials, in which the exact same stimuli and behavioral choice were present, we compared the correlation between the data on each trial and the trial average, as well as the data and the model on those same trials. The average response on repeated trials exhibited a correlation coefficient of 0.78 with the data, compared to a correlation between the model and data of 0.39 on those same trials. The ratio of these values, 0.50, reflects the performance of the model normalized by the predictive power of the data on repeated trials.
Non-parametric permutation test
To evaluate whether trial history affected the estimated pulse responses, we performed the following permutation test for each neuron. On each of 100 iterations, we computed the R-squared values between the estimated pulse responses fit using randomly subsampled trials from all of the data and from subsampled trials following left choices (i.e., in which the choice history was fixed). We compared this distribution of R-squared values to R-squared values obtained from randomly subsampling all trials. The null hypothesis is that the difference between the estimated pulse responses following left choices and the estimated pulse responses fit to all trials is not significantly different from the pulse responses obtained from two different random subsamples of the data. Therefore, the area of the distribution of R-squared values from pulse responses fit to randomly subsampled trials at the average R-squared value for post-left-choice-pulse response and randomly subsampled pulse response corresponded to the p-value. For all neurons, the p-value was greater than 0.05, indicating that there was no significant difference between the estimated pulse responses obtained from all trials and from those with fixed trial history (i.e., following left choices).
Time course of the pulse kernels
To quantify the diversity in the time course of sensory responses we fit the following five-parameter Y(t | L,O,D,R,M) model to each neuron’s pulse triggered response: for t ≤ L
for t ≥ L
This difference of exponentials model includes a parameter for lag (L), decay time (D), rise time (R), magnitude (M) and offset (O). Model parameters were fit using an interior-point algorithm that attempted to minimize the mean squared error between the pulse-triggered response and the model.
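A difference-of-exponentials kernel with these five parameters could take the form sketched below in Python. The specific functional form is an assumption (a constant offset before the lag, then offset plus magnitude times the difference of a decaying and a rising exponential), as are the function names; the interior-point fit is not reproduced, only the mean-squared-error objective that such an optimizer would minimize.

```python
import math

def pulse_kernel(t, lag, offset, decay, rise, magnitude):
    """Assumed form: offset for t <= lag; afterwards
    offset + magnitude * (exp(-(t-lag)/decay) - exp(-(t-lag)/rise))."""
    if t <= lag:
        return offset
    s = t - lag
    return offset + magnitude * (math.exp(-s / decay) - math.exp(-s / rise))

def mse(params, times, response):
    """Mean squared error between a pulse-triggered response and the model."""
    lag, offset, decay, rise, magnitude = params
    errs = [(pulse_kernel(t, lag, offset, decay, rise, magnitude) - r) ** 2
            for t, r in zip(times, response)]
    return sum(errs) / len(errs)

# Toy kernel: 100 ms lag, fast rise (100 ms), slow decay (1 s).
true_params = (0.1, 0.2, 1.0, 0.1, 2.0)  # (L, O, D, R, M)
times = [i * 0.05 for i in range(41)]    # 0 to 2 s in 50 ms steps
response = [pulse_kernel(t, *true_params) for t in times]
error_at_truth = mse(true_params, times, response)
error_elsewhere = mse((0.3, 0.2, 1.0, 0.1, 2.0), times, response)
```

Under this assumed form, the response stays at the offset until the lag, rises on the timescale R, and relaxes back to the offset on the timescale D, so diversity in L, R and D across neurons translates directly into diversity in response time courses.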
Comparing fluorescence and time course of pulse kernels
To evaluate whether neurons with longer flash responses had different calcium buffer concentrations (because of GCaMP6f-expression), we computed the average raw fluorescence of the soma during the pre-cue (i.e., baseline) period, and the average fluorescence of the surrounding neuropil during the pre-cue period. The ratio of these values was not significantly correlated with the lag, rise time, or decay of the flash impulse response function (p > 0.05, Pearson’s correlation).
Categorical tuning of FOF and PPC neurons to accumulated evidence
For each evidence-tuned neuron we computed the average ΔF/F during the delay period pooled across trials with the same evidence, i.e. #R-#L. We then found the line and sigmoid function that best characterized the relationship between ΔF/F and evidence. The line was described by:
and the sigmoid was described by:
β1 … β4 represent the best-fit parameters, using the MATLAB functions robustfit.m for the linear model and nlinfit.m for the sigmoidal model. E is the difference in the number of flashes (#R-#L if the neuron preferred right evidence, #L-#R if the neuron preferred left evidence). Y is the model prediction of the neuronal response (ΔF/F).
Next, we compared the goodness of fit for these two models by computing the chi-squared probability, Q. Note that Q depends on both the chi-squared statistic and the degrees of freedom of the model, which enables a direct comparison of the two models even though the number of parameters is different for each model (2 for linear, 4 for sigmoid). The model that had the greater Q value was considered to be the better fit. Using this criterion, 100% (4/4) of evidence-tuned neurons in PPC were better fit by the linear model, and 94% (16/17) of evidence-tuned neurons in FOF were better fit by the linear model. We also computed the slope predicted by the sigmoid function, β4 in the equation above. The mean slope for PPC neurons was 8.3 and the mean for FOF was 2.6; however, this difference was not statistically significant (p>0.05).
QUANTIFICATION AND STATISTICAL ANALYSIS
The details of all statistical tests performed are described in the Methods Details section above. The number of cells, animals and behavioral trials used for each analysis are reported in the text and figure captions. Error bars and shaded regions on plots reflect the standard error of the mean unless otherwise noted in figure captions.
Highlights.
Voluntary restraint allows two-photon Ca2+ imaging during decision-making in rats.
Neurons with diverse dynamics collectively represent accumulated sensory evidence.
Cortical dynamics support the existence of multiple, weakly-coupled accumulators.
Neurons across fronto-parietal cortex encode the memory of past behavioral choices.
Acknowledgments
We thank Mikio Aoi and members of the Tank and Brody labs for useful conversations, and Jovanna Teran and Klaus Osorio for animal husbandry. Jonathan Pillow provided code for implementing evidence optimization, and helpful guidance for implementing and evaluating the multiple linear regression analysis. Marino Pagan provided helpful comments and code for the SVM-based decoder. Julia Kuhl provided the illustrations of the rat in Figure 1A and Figure 2C. B.B. Scott was supported by a post-doctoral NRSA fellowship, NIH award number 5F32NS78913; C.M. Constantinople was supported by a Helen Hay Whitney fellowship; T.D. Hanks was supported by a post-doctoral NRSA fellowship, NIH award number F32MH098572. The work was supported by NIH grant numbers R21NS082956 and U01NS090541 to C.D. Brody and D.W. Tank and by a grant from the Howard Hughes Medical Institute to C.D. Brody.
Footnotes
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Author contributions
B.B. Scott and C.M. Constantinople contributed equally to this work. B.B. Scott, C.D. Brody and D.W. Tank designed the study. B.B. Scott and C.M. Constantinople collected imaging data, performed analysis and wrote the initial draft of the manuscript. A. Akrami inspired the previous trial analysis. T.D. Hanks contributed electrophysiological recordings. All authors were involved in interpreting the data and revising the article.
References
- Atluri PP, Regehr WG. Delayed release of neurotransmitter from cerebellar granule cells. The Journal of neuroscience : the official journal of the Society for Neuroscience. 1998;18:8214–8227. doi: 10.1523/JNEUROSCI.18-20-08214.1998.
- Barbour DL, Callaway EM. Excitatory local connections of superficial neurons in rat auditory cortex. The Journal of neuroscience : the official journal of the Society for Neuroscience. 2008;28:11174–11185. doi: 10.1523/JNEUROSCI.2093-08.2008.
- Bogacz R, Brown E, Moehlis J, Holmes P, Cohen JD. The physics of optimal decision making: a formal analysis of models of performance in two-alternative forced-choice tasks. Psychological review. 2006;113:700–765. doi: 10.1037/0033-295X.113.4.700.
- Braitenberg V. Functional Interpretation of Cerebellar Histology. Nature. 1961;190:539.
- Brody CD, Hanks TD. Neural underpinnings of the evidence accumulator. Curr Opin Neurobiol. 2016 doi: 10.1016/j.conb.2016.01.003.
- Brody CD, Hernandez A, Zainos A, Romo R. Timing and neural encoding of somatosensory parametric working memory in macaque prefrontal cortex. Cerebral cortex. 2003;13:1196–1207. doi: 10.1093/cercor/bhg100.
- Brunton BW, Botvinick MM, Brody CD. Rats and humans can optimally accumulate evidence for decision-making. Science. 2013;340:95–98. doi: 10.1126/science.1233912.
- Bullock D, Fiala JC, Grossberg S. A Neural Model of Timed Response Learning in the Cerebellum. Neural Networks. 1994;7:1101–1114.
- Carr CE, Konishi M. Axonal delay lines for time measurement in the owl’s brainstem. Proceedings of the National Academy of Sciences of the United States of America. 1988;85:8311–8315. doi: 10.1073/pnas.85.21.8311.
- Chafee MV, Goldman-Rakic PS. Matching patterns of activity in primate prefrontal area 8a and parietal area 7ip neurons during a spatial working memory task. Journal of Neurophysiology. 1998;79:2919–2940. doi: 10.1152/jn.1998.79.6.2919.
- Chang CC, Lin CJ. LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST) 2011;2(3):27.
- Dombeck DA, Khabbaz AN, Collman F, Adelman TL, Tank DW. Imaging large-scale neural activity with cellular resolution in awake, mobile mice. Neuron. 2007;56(1):43–57. doi: 10.1016/j.neuron.2007.08.003.
- Erlich JC, Bialek M, Brody CD. A cortical substrate for memory-guided orienting in the rat. Neuron. 2011;72:330–343. doi: 10.1016/j.neuron.2011.07.010.
- Erlich JC, Brunton BW, Duan CA, Hanks TD, Brody CD. Distinct effects of prefrontal and parietal cortex inactivations on an accumulation of evidence task in the rat. eLife. 2015;4:e05457. doi: 10.7554/eLife.05457.
- Espinoza SG, Thomas HC. Retinotopic organization of striate and extrastriate visual cortex in the hooded rat. Brain Res. 1983;272:137–144. doi: 10.1016/0006-8993(83)90370-0.
- Forstmann BU, Ratcliff R, Wagenmakers EJ. Sequential Sampling Models in Cognitive Neuroscience: Advantages, Applications, and Extensions. Annual review of psychology. 2016;67:641–666. doi: 10.1146/annurev-psych-122414-033645.
- Ganguli S, Huh D, Sompolinsky H. Memory traces in dynamical systems. Proceedings of the National Academy of Sciences of the United States of America. 2008;105:18970–18975. doi: 10.1073/pnas.0804451105.
- Gold JI, Shadlen MN. The neural basis of decision making. Annu Rev Neurosci. 2007;30:535–574. doi: 10.1146/annurev.neuro.29.051605.113038.
- Goldman MS. Memory without feedback in a neural network. Neuron. 2009;61:621–634. doi: 10.1016/j.neuron.2008.12.012.
- Hanks TD, Kopec CD, Brunton BW, Duan CA, Erlich JC, Brody CD. Distinct relationships of parietal and prefrontal cortices to evidence accumulation. Nature. 2015;520:220–223. doi: 10.1038/nature14066.
- Harvey CD, Coen P, Tank DW. Choice-specific sequences in parietal cortex during a virtual-navigation decision task. Nature. 2012;484:62–68. doi: 10.1038/nature10918.
- Horst NK, Laubach M. Working with memory: evidence for a role for the medial prefrontal cortex in performance monitoring during spatial delayed alternation. Journal of neurophysiology. 2012;108:3276–3288. doi: 10.1152/jn.01192.2011.
- Huk AC, Shadlen MN. Neural activity in macaque parietal cortex reflects temporal integration of visual motion signals during perceptual decision making. The Journal of neuroscience : the official journal of the Society for Neuroscience. 2005;25:10420–10436. doi: 10.1523/JNEUROSCI.4684-04.2005.
- Jeffress LA. A place theory of sound localization. Journal of comparative and physiological psychology. 1948;41:35–39. doi: 10.1037/h0061495.
- Jung MW, Qin Y, McNaughton BL, Barnes CA. Firing characteristics of deep layer neurons in prefrontal cortex in rats performing spatial working memory tasks. Cerebral cortex. 1998;8:437–450. doi: 10.1093/cercor/8.5.437.
- Kennedy A, Wayne G, Kaifosh P, Alvina K, Abbott LF, Sawtell NB. A temporal basis for predicting the sensory consequences of motor commands in an electric fish. Nature neuroscience. 2014;17:416–422. doi: 10.1038/nn.3650.
- Kim JN, Shadlen MN. Neural correlates of a decision in the dorsolateral prefrontal cortex of the macaque. Nature neuroscience. 1999;2:176–185. doi: 10.1038/5739.
- Klampfl S, Maass W. Emergence of dynamic memory traces in cortical microcircuit models through STDP. The Journal of neuroscience : the official journal of the Society for Neuroscience. 2013;33:11515–11529. doi: 10.1523/JNEUROSCI.5044-12.2013.
- Kolb B, Walkey J. Behavioural and anatomical studies of the posterior parietal cortex in the rat. Behavioural brain research. 1987;23:127–145. doi: 10.1016/0166-4328(87)90050-7.
- Kopec CD, Erlich JC, Brunton BW, Deisseroth K, Brody CD. Cortical and Subcortical Contributions to Short-Term Memory for Orienting Movements. Neuron. 2015;88:367–377. doi: 10.1016/j.neuron.2015.08.033.
- Laje R, Buonomano DV. Robust timing and motor patterns by taming chaos in recurrent neural networks. Nature neuroscience. 2013;16:925–933. doi: 10.1038/nn.3405.
- Lapish CC, Durstewitz D, Chandler LJ, Seamans JK. Successful choice behavior is associated with distinct and coherent network states in anterior cingulate cortex. Proceedings of the National Academy of Sciences of the United States of America. 2008;105:11963–11968. doi: 10.1073/pnas.0804045105.
- Licklider JC. A duplex theory of pitch perception. Experientia. 1951;7:128–134. doi: 10.1007/BF02156143.
- Lim S, Goldman MS. Noise tolerance of attractor and feedforward memory models. Neural computation. 2012;24:332–390. doi: 10.1162/NECO_a_00234.
- Ma L, Hyman JM, Durstewitz D, Phillips AG, Seamans JK. A Quantitative Analysis of Context-Dependent Remapping of Medial Frontal Cortex Neurons and Ensembles. The Journal of neuroscience : the official journal of the Society for Neuroscience. 2016;36:8258–8272. doi: 10.1523/JNEUROSCI.3176-15.2016.
- MacDonald CJ, Carrow S, Place R, Eichenbaum H. Distinct hippocampal time cell sequences represent odor memories in immobilized rats. The Journal of neuroscience : the official journal of the Society for Neuroscience. 2013;33:14607–14616. doi: 10.1523/JNEUROSCI.1537-13.2013.
- Mante V, Sussillo D, Shenoy KV, Newsome WT. Context-dependent computation by recurrent dynamics in prefrontal cortex. Nature. 2013;503:78–84. doi: 10.1038/nature12742.
- Mason A, Nicoll A, Stratford K. Synaptic transmission between individual pyramidal neurons of the rat visual cortex in vitro. The Journal of neuroscience : the official journal of the Society for Neuroscience. 1991;11:72–84. doi: 10.1523/JNEUROSCI.11-01-00072.1991.
- Medina JF, Mauk MD. Computer simulation of cerebellar information processing. Nature neuroscience. 2000;(3 Suppl):1205–1211. doi: 10.1038/81486.
- Meister ML, Hennig JA, Huk AC. Signal multiplexing and single-neuron computations in lateral intraparietal area during decision-making. The Journal of neuroscience : the official journal of the Society for Neuroscience. 2013;33:2254–2261. doi: 10.1523/JNEUROSCI.2984-12.2013.
- Morcos AS, Harvey CD. History-dependent variability in population dynamics during evidence accumulation in cortex. Nature Neuroscience. 2016;19:1672–1681. doi: 10.1038/nn.4403.
- Park IM, Meister ML, Huk AC, Pillow JW. Encoding and decoding in parietal cortex during sensorimotor decision-making. Nature neuroscience. 2014;17:1395–1403. doi: 10.1038/nn.3800.
- Park M, Pillow JW. Receptive field inference with localized priors. PLoS computational biology. 2011;7:e1002219. doi: 10.1371/journal.pcbi.1002219.
- Peron SP, Freeman J, Iyer V, Guo C, Svoboda K. A Cellular Resolution Map of Barrel Cortex Activity during Tactile Behavior. Neuron. 2015;86:783–799. doi: 10.1016/j.neuron.2015.03.027.
- Powell NJ, Redish AD. Complex neural codes in rat prelimbic cortex are stable across days on a spatial decision task. Frontiers in behavioral neuroscience. 2014;8:120. doi: 10.3389/fnbeh.2014.00120.
- Rajan K, Harvey CD, Tank DW. Recurrent Network Models of Sequence Generation and Memory. Neuron. 2016;90:128–142. doi: 10.1016/j.neuron.2016.02.009.
- Reep RL, Chandler HC, King V, Corwin JV. Rat posterior parietal cortex: topography of corticocortical and thalamic connections. Experimental brain research. 1994;100:67–84. doi: 10.1007/BF00227280.
- Roitman JD, Shadlen MN. Response of neurons in the lateral intraparietal area during a combined visual discrimination reaction time task. The Journal of neuroscience : the official journal of the Society for Neuroscience. 2002;22:9475–9489. doi: 10.1523/JNEUROSCI.22-21-09475.2002.
- Scott BB, Brody CD, Tank DW. Cellular resolution functional imaging in behaving rats using voluntary head restraint. Neuron. 2013;80:371–384. doi: 10.1016/j.neuron.2013.08.002.
- Scott BB, Constantinople CM, Erlich JC, Tank DW, Brody CD. Sources of noise during accumulation of evidence in unrestrained and voluntarily head-restrained rats. eLife. 2015;4:e11308. doi: 10.7554/eLife.11308.
- Shankar KH, Howard MW. A scale-invariant internal representation of time. Neural computation. 2012;24:134–193. doi: 10.1162/NECO_a_00212.
- Sussillo D. Neural circuits as computational dynamical systems. Curr Opin Neurobiol. 2014;25:156–163. doi: 10.1016/j.conb.2014.01.008.
- Sussillo D, Abbott LF. Generating coherent patterns of activity from chaotic neural networks. Neuron. 2009;63:544–557. doi: 10.1016/j.neuron.2009.07.018.
- Tank DW, Hopfield JJ. Neural computation by concentrating information in time. Proceedings of the National Academy of Sciences of the United States of America. 1987;84:1896–1900. doi: 10.1073/pnas.84.7.1896.
- Tiganj Z, Hasselmo ME, Howard MW. A simple biophysically plausible model for long time constants in single neurons. Hippocampus. 2015;25:27–37. doi: 10.1002/hipo.22347.
- Vintch B, Zaharia AD, Movshon JA, Simoncelli EP. Efficient and direct estimation of a neural subunit model for sensory coding. Adv Neural Inf Process Syst. 2012;25:3113–3121.
- Wang XJ. Probabilistic decision making by slow reverberation in cortical circuits. Neuron. 2002;36:955–968. doi: 10.1016/s0896-6273(02)01092-9.
- Whitlock JR, Pfuhl G, Dagslott N, Moser MB, Moser EI. Functional split between parietal and entorhinal cortices in the rat. Neuron. 2012;73:789–802. doi: 10.1016/j.neuron.2011.12.028.
- Wilber AA, Clark BJ, Demecha AJ, Mesina L, Vos JM, McNaughton BL. Cortical connectivity maps reveal anatomically distinct areas in the parietal cortex of the rat. Frontiers in neural circuits. 2014;8:146. doi: 10.3389/fncir.2014.00146.
- Wilson MA, McNaughton BL. Dynamics of the hippocampal ensemble code for space. Science. 1993;261:1055–1058. doi: 10.1126/science.8351520.
- Yang T, Shadlen MN. Probabilistic reasoning by neurons. Nature. 2007;447:1075–1080. doi: 10.1038/nature05852.
- Young JJ, Shapiro ML. Dynamic coding of goal-directed paths by orbital prefrontal cortex. The Journal of neuroscience : the official journal of the Society for Neuroscience. 2011;31:5989–6000. doi: 10.1523/JNEUROSCI.5436-10.2011.