eLife. 2020 Jul 14;9:e51121. doi: 10.7554/eLife.51121

Stable task information from an unstable neural population

Michael E Rule 1, Adrianna R Loback 1, Dhruva V Raman 1, Laura N Driscoll 2, Christopher D Harvey 3, Timothy O'Leary 1
Editors: Stephanie Palmer4, Ronald L Calabrese5
PMCID: PMC7392606  PMID: 32660692

Abstract

Over days and weeks, neural activity representing an animal’s position and movement in sensorimotor cortex has been found to continually reconfigure or ‘drift’ during repeated trials of learned tasks, with no obvious change in behavior. This challenges classical theories, which assume stable engrams underlie stable behavior. However, it is not known whether this drift occurs systematically, allowing downstream circuits to extract consistent information. Analyzing long-term calcium imaging recordings from posterior parietal cortex in mice (Mus musculus), we show that drift is systematically constrained far above chance, facilitating a linear weighted readout of behavioral variables. However, a significant component of drift continually degrades a fixed readout, implying that drift is not confined to a null coding space. We calculate the amount of plasticity required to compensate drift independently of any learning rule, and find that this is within physiologically achievable bounds. We demonstrate that a simple, biologically plausible local learning rule can achieve these bounds, accurately decoding behavior over many days.

Research organism: Mouse

Introduction

A core principle in neuroscience is that behavioral variables are represented in neural activity. Such representations must be maintained to retain learned skills and memories. However, recent work has challenged the idea of long-lasting neural codes (Rumpel and Triesch, 2016). In our recent work (Driscoll et al., 2017), we found that neural activity–behavior relationships in individual posterior parietal cortex (PPC) neurons continually changed over many days during a repeated virtual navigation task. Similar ‘representational drift’ has been shown in other neocortical areas and hippocampus (Attardo et al., 2015; Ziv et al., 2013; Levy et al., 2019). Importantly, these studies showed that representational drift is observed in brain areas essential for performing the task long after the task has been learned.

These experimental observations raise the major question of whether drifting representations are fundamentally at odds with the storage of stable memories of behavioral variables (e.g. Ganguly and Carmena, 2009; Tonegawa et al., 2015). Theoretical work has proposed that a consistent readout of a representation can be achieved if drift in neural activity patterns occurs in dimensions of population activity that are orthogonal to coding dimensions - in a ‘null coding space’ (Rokni et al., 2007; Druckmann and Chklovskii, 2012; Ajemian et al., 2013; Singh et al., 2019). This can be facilitated by neural representations that consist of low-dimensional dynamics distributed over many neurons (Montijn et al., 2016; Gallego et al., 2018; Hennig et al., 2018; Degenhart et al., 2020). Redundancy could therefore permit substantial reconfiguration of tuning in single cells without disrupting neural codes (Druckmann and Chklovskii, 2012; Huber et al., 2012; Kaufman et al., 2014; Ni et al., 2018; Kappel et al., 2018). However, the extent to which drift is confined in such a null coding space remains an open question.

Purely random drift, as would occur if synaptic strengths and other circuit parameters follow independent random walks, would eventually disrupt a population code. Several studies have provided evidence that cortical synaptic weights and synaptic connections exhibit statistics that are consistent with a purely random process (Moczulska et al., 2013; Loewenstein et al., 2011; Loewenstein et al., 2015). Indeed, our previous experimental findings reveal that drift includes cells that lose representations of task relevant variables, suggesting that some component of drift affects coding dimensions (Driscoll et al., 2017).

Together, these observations raise fundamental questions that have not been directly addressed with experimental data, and which we address here. First, to what extent can ongoing drift in task representations be confined to a null coding space over extended periods while maintaining an accurate readout of behavioral variables in a biologically plausible way? Second, how might we estimate how much additional ongoing plasticity (if any) would be required to maintain a stable readout of behavioral variables, irrespective of specific learning rules? Third, is such an estimate of ongoing plasticity biologically feasible for typical levels of connectivity, and typical rates of change observed in synaptic strengths? Fourth, can a local, biologically plausible plasticity mechanism tune readout weights to identify a maximally stable coding subspace and compensate any residual drift away from this subspace?

We addressed these questions by modelling and analyzing data from Driscoll et al., 2017. This dataset consists of optical recordings of calcium activity from populations of hundreds of PPC neurons during repeated trials of a virtual-reality T-maze task (Figure 1a). Mice were trained to associate a visual cue at the start of the maze with turning left or right at a T-junction. Behavioral performance and kinematic variables were stable over time, with some per-session variability (mouse 4 exhibited a slight decrease in forward speed; Figure 2—figure supplement 1). Full experimental details can be found in the original study.

Figure 1. Neural population coding of spatial navigation reconfigures over time in a virtual-reality maze task.


(a) Mice were trained to use visual cues to navigate to a reward in a virtual-reality maze; neural population activity was recorded using Ca2+ imaging (Driscoll et al., 2017). (b) (Reprinted from Driscoll et al., 2017.) Neurons in PPC (vertical axes) fire at various regions of the maze (horizontal axes). Over days to weeks, individual neurons change their tuning, reconfiguring the population code. This occurs even at steady-state behavioral performance (after learning). (c) Each plot shows how location-averaged normalized activity changes for single cells over weeks. Missing days are interpolated from the nearest available sessions, and left and right turns are combined. Neurons show diverse changes in tuning over days, including instability, relocation, long-term stability, gain/loss of selectivity, and intermittent responsiveness.

© 2017 Elsevier

Panel B is reprinted from Driscoll et al., 2017, with permission from Elsevier. It is not covered by the CC-BY 4.0 licence, and further reproduction of this panel would need permission from the copyright holder.

Previous studies identified planning and choice-based roles for PPC in the T-maze task (Harvey et al., 2012), and stable decoding of such binary variables was explored in Driscoll et al., 2017. However, in primates PPC has traditionally been viewed as containing continuous motor-related representations (Andersen et al., 1997; Andersen and Buneo, 2002; Mulliken et al., 2008), and recent work (Krumin et al., 2018; Minderer et al., 2019) has confirmed that PPC has an equally motor-like role in spatial navigation in rodents (Calton and Taube, 2009). It is therefore important to revisit these data in the context of continuous kinematics encoding.

Previous analyses showed that PPC neurons activated at specific locations in the maze on each day. When peak activation is plotted as a function of (linearized) maze location, the recorded population tiles the maze, as shown in Figure 1b. However, maintaining the same ordering in the same population of neurons revealed a loss of sequential activity over days to weeks (top row of Figure 1b). Nonetheless, a different subset of neurons could always be found to tile the maze in these later experimental sessions. In all cases, the same gradual loss of ordered activation was observed (second and third rows, Figure 1b). Figure 1c shows that PPC neurons gain or lose selectivity and occasionally change tuning locations. Together, these data show that PPC neurons form a continually reconfiguring representation of a fixed, learned task.

PPC representations facilitate a linear readout

We asked whether precise task information can be extracted from this population of neurons, despite the continual activity reconfiguration evident in these data. We began by fitting a linear decoder for each task variable of interest (animal location, heading, and velocity) for each day. This model has the form x(t)=Mz(t), where x(t) is the time-binned estimate of position, velocity or heading (view angle) in the virtual maze, M is a vector of weights, and z(t) is the normalized time-binned calcium fluorescence (Materials and methods: Decoding analyses).
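
To make this concrete, a minimal sketch of such a decoder is below (not the authors' released code; the array shapes and the explicit offset term are our assumptions, and in the actual analysis signals were mean-centered per trial):

```python
# Minimal sketch: fit the linear readout x_hat(t) = M z(t) by ordinary
# least squares. Z is (T, N) normalized fluorescence; x is (T,) kinematics.
import numpy as np

def fit_linear_decoder(Z, x):
    Z1 = np.hstack([Z, np.ones((Z.shape[0], 1))])  # constant offset feature
    M, *_ = np.linalg.lstsq(Z1, x, rcond=None)     # OLS weight vector
    return M

def decode(Z, M):
    Z1 = np.hstack([Z, np.ones((Z.shape[0], 1))])
    return Z1 @ M                                  # x_hat(t) = M z(t)
```

In the analyses below, one such decoder is fit per behavioral variable and evaluated on held-out time-points.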

Example decoding results for two mice are shown in Figure 2a, and summaries of decoding performance for four mice in Figure 2b. Position, speed, and view angle can each be recovered with a separate linear model. The average mean absolute decoding error for all animals included in the analysis was 47.2 ± 8.8 cm (mean ± 1 standard deviation) for position, 9.6 ± 2.2 cm/s for speed, and 13.8° ± 4.0° for view angle (Materials and methods: Decoding analyses).

Figure 2. A linear decoder can extract kinematic information from PPC population activity on a single day.

(a) Example decoding performance for a single session for mice 4 and 5. Grey denotes held-out test data; colors denote the prediction for the corresponding kinematic variable. (b) Summary of the decoding performance on single days; each point denotes one mouse. Error bars denote one standard deviation over all sessions that had at least N=200 high-confidence PPC neurons for each mouse (mouse 2 is excluded due to an insufficient number of isolated neurons). Chance level is ∼1.5 m for forward position, and varies across subjects for forward velocity (∼0.2–0.25 m/s) and head direction (∼20–30°). (c) Extrapolation of the performance of the static linear decoder for decoding position as a function of the number of PPC neurons, via Gaussian process regression (Materials and methods). Red '×' marks denote data; the solid black line denotes the inferred mean of the GP. Shaded regions reflect ±1.96σ Gaussian estimates of the 95th and 5th percentiles. (d) Same as panel (c), but with neurons ranked such that the 'best' subset of size 1≤K≤N is chosen, selected by greedy search based on explained variance (Materials and methods: Best K-Subset Ranking).


Figure 2—figure supplement 1. Behavioral stability.


Statistics of forward motion show small daily variations. It is possible that changes in population codes relate to systematic changes in behavior over time. As described in Driscoll et al., 2017, these experiments were performed only after mice achieved asymptotic performance in speed and accuracy. Nevertheless, there is some behavioral variability. Each mouse's velocity in the initial (forward) segment of the 'T' maze varies slightly between days. Differences in means (black lines) are often statistically significant (p<0.05 in 91% of session pairs; Bonferroni multiple-comparison correction for a 0.05 false discovery rate), but are small (Δμ/σ, i.e. Cohen's d, ranges between 10% and 16% per animal). Systematic drift-like trends appear absent from mice 1 and 3. A statistically significant trend is present for mouse 4 (Pearson's ρ = −0.9, p<0.05). We show only forward velocity here, as other kinematic variables exhibited less variability. Daily fluctuations in behavior could be used to weakly predict the recording session: under cross-validation, linear discriminant analysis based on ten-second windows of kinematics predicted the recording session 9–17% above chance. This suggests that each mouse exhibited small but detectable daily variability in behavior. Most variability was unsystematic, and therefore unrelated to the slow changes in neural codes studied here. We expect changes in forward speed in mouse 4 to contribute to apparent drift in some cells. However, the results presented here generalize across mice 1, 3, and 5, which exhibited stable behavior.
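
The session-predictability check mentioned above could be implemented roughly as follows (a sketch, not the authors' code; feature extraction into ~10 s windows is assumed to have been done already, and the chance level assumes balanced classes):

```python
# Sketch: can short windows of kinematics predict which session they came from?
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

def session_predictability(windows, session_labels):
    # windows: (n_windows, n_features) kinematic features from ~10 s windows
    # session_labels: (n_windows,) integer session index
    lda = LinearDiscriminantAnalysis()
    accuracy = cross_val_score(lda, windows, session_labels, cv=10).mean()
    chance = 1.0 / len(np.unique(session_labels))  # assumes balanced classes
    return accuracy, accuracy - chance             # excess over chance
```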

We chose a linear decoder specifically because it can be interpreted biologically as a single ‘readout’ neuron that receives input from a few hundred PPC neurons, and whose activity approximates a linear weighted sum. The fact that a linear decoder recovers behavioral variables to reasonable accuracy suggests that brain areas with sufficiently dense connectivity to PPC can extract this information via simple weighted sums.

The recorded neurons constitute only a subset of the total PPC population. To assess whether additional neurons might improve decoding accuracy, we evaluated decoding performance on randomly drawn subsets of the recorded neurons (Figure 2c). Extrapolation of the decoding performance suggested that better performance might be possible with a larger population of randomly sampled PPC neurons than we recorded.

It is possible that a random sample of neurons misses the 'best' subset of cells for decoding task variables. When we restricted decoding to optimal subsets of neurons, we found that performance improved rapidly up to ∼30 neurons and saturated at ∼30% (50–100 neurons) of the neurons recorded (Figure 2d). On a given day, task variables could be decoded well with relatively few (∼10) neurons. However, the identity of the neurons in this optimal subset changed over days. For all subjects, no more than 1% of cells were consistently ranked in the top 10%, and no more than 13% in the top 50%. We confirmed that this instability was not due to under-regularization in training (Materials and methods: Best K-Subset Ranking).

Of the neurons with strong location tuning, Driscoll et al., 2017 found that 60% changed their location tuning over two weeks, and a total of 80% changed over the 30-day period examined. We find that even the small remaining 'stable' subset of neurons exhibited daily variations in their signal-to-noise ratio (SNR) with respect to task decoding, consistent with other studies (Carmena et al., 2005). For example, no more than 8% of neurons that were in the top 25% in terms of tuning-peak stability were also consistently in the top 25% in terms of SNR across all days. If a neuron becomes relatively less reliable, the weight assigned to it may become inappropriate for decoding. This affects our analyses, and would likewise affect a downstream neuron with fixed synaptic weights.

Representational drift is systematic and significantly degrades a fixed readout

Naively fitting a linear model to data from any given day shows that behavioral variables are encoded in a way that permits a simple readout, but there is no guarantee that this readout will survive long-term drift in the neural code. To illustrate this, we compared the decoding performance of models fitted on a given day with decoders optimized on data from earlier or later days. We restricted this analysis to those neurons that were identified with high confidence on all days considered. We found that decoding performance decreased as the separation between days grew (Figure 3a). This is unsurprising given the extent of reconfiguration reported in the original study (Driscoll et al., 2017) and depicted in Figure 1. Furthermore, because task-related PPC activity is distributed over many neurons, many different linear decoders can achieve similar error rates due to the degeneracy in the representation (Rokni et al., 2007; Kaufman et al., 2014; Montijn et al., 2016). Since the directions in population activity used for inter-area communication might differ from the directions that maximally encode stimulus information in the local population (Ni et al., 2018; Semedo et al., 2019), single-day decoders might overlook a long-term stable subspace used for encoding and communication. This motivates the question of whether a drift-invariant linear decoder exists and whether its existence is biologically plausible.

Figure 3. Single-day decoders generalize poorly to previous and subsequent days, but multi-day decoders exist with good performance.


(a) Blue: % increase in error over the optimal decoder for the testing day (mouse 3, 136 neurons; mouse 4, 166 neurons). Red: mean absolute error for decoders trained on a single day ('0') and tested on past/future days. (b) Fixed decoders M for multiple days d ∈ 1…D ('concatenated decoders') are fit to concatenated excerpts from several sessions. The inset equation reflects the objective function to be minimized (Methods). Due to redundancy in the neural code, many decoders can perform well on a single day. Although the single-day optimal decoders vary, a stable subspace with good performance can exist. (c) Concatenated decoders (cyan) perform slightly but significantly worse than single-day decoders (ochre; Mann-Whitney U test, p<0.01). They also perform better than expected if neural codes were unrelated across days (permutation tests; red). Plots show the mean absolute decoding error as a percentage of the chance-level error (points: median; whiskers: 5th–95th percentile). Chance-level error was estimated by shuffling kinematics traces relative to neural time-series (mean of 100 samples). For the permutation tests, 100 random samples were drawn with the neuronal identities randomly permuted. (d) Plots show the rate at which concatenated-decoder accuracy (normalized R²) degrades as the number of days increases. Concatenated decoders (black) degrade more slowly than expected for random drift (ochre). Shaded regions reflect the inner 95% of the data (generated by resampling for the null model). The null model statistics are matched to the within- and between-day variance and sparsity of the experimental data for each animal (Materials and methods).

To address this, we tested the performance of a single linear decoder optimized across data from multiple days. We concatenated data from different days using the same subset of PPC neurons (Figure 3b). In all four subjects, we found that such fixed multiple-day linear ‘concatenated’ decoders could recover accurate task variable information despite ongoing changes in PPC neuron tuning. However, the average performance of the multiple-day decoders was significantly worse than single-day linear decoders for each day (Figure 3c).

The existence of a fixed, approximate decoder implies a degenerate representation of task variables in the population activity of PPC neurons. In other words, there is a family of linear decoders that can recover behavioral variables while allowing weights to vary in some region of weight space. This situation is illustrated in Figure 3b, which depicts regions of good performance of single-day linear decoders as ellipsoids. The existence of an approximate concatenated decoder implies that these ellipsoids intersect over several days for some allowable level of error in the decoder. For a sufficiently redundant neural code, one might expect to find an invariant decoder for some specified level of accuracy even if the underlying code drifts. However, there are many qualitative ways in which drift can occur in a neural code: it could resemble a random walk, as some studies suggest (Moczulska et al., 2013; Loewenstein et al., 2011; Loewenstein et al., 2015), or there could be a systematic component. Is the accuracy we observe in the concatenated decoder expected for a random walk? In all subjects, we found that a concatenated decoder performed substantially better on experimental data than on randomly drifting synthetic data with matched sparseness and matched within/between-session variability (Figure 3d). This suggests that the drift in the neural data is not purely random.

We further investigated the dynamics of drift by quantifying its direction relative to neural population variability over time (Figure 4c,d, Materials and methods: Drift alignment). We found that drift is indeed aligned above chance with within-session neural population variability. This suggests that the biological mechanisms underlying drift are in part systematic and constrained by a requirement to keep a consistent population code over time. In comparison, the projection of drift onto behavior-coding directions was small, but still above chance. This is consistent with the hypothesis that ongoing compensation might be needed for a long-term stable readout.

Figure 4. A slowly-varying component of drift disrupts the behavior-coding subspace.

(a) The small error increase when training concatenated decoders (Figure 3) suggests that plasticity is needed to maintain good decoding in the long term. We assess the minimum rate for this plasticity by training a separate decoder Md for each day, while minimizing the change in weights across days. The parameter λ controls how strongly we constrain weight changes across days (the inset equation reflects the objective function to be minimized; Methods). (b) Decoders trained on all days (cyan) perform better than chance (red), but worse than single-day decoders (ochre). Black traces illustrate the plasticity-accuracy trade-off for adaptive decoding. Modest weight changes per day are sufficient to match the performance of single-day decoders (boxes: inner 50% of data; horizontal lines: median; whiskers: 5th–95th percentile). (c) Across days, the mean neural activity associated with a particular phase of the task changes (Δμ). We define an alignment measure ρ (Materials and methods) to assess the extent to which these changes align with behavior-coding directions in the population code (blue) versus directions of noise correlations (ochre). (d) Drift is more aligned with noise (ochre) than with behavior-coding directions (blue). Nevertheless, drift overlaps the behavior-coding subspace much more than chance (grey; dashed line: 95% Monte-Carlo sample). Each box reflects the distribution over all maze locations, with all consecutive pairs of sessions combined.


Figure 4—figure supplement 1. Concatenated decoder performance depends on the rank of the drift.


Sufficiently low-rank drift resembles the data in terms of the performance of a concatenated decoder. Here, we further explore the null model introduced in Figure 3d. As in Figure 3d, we simulated random drift in the neural readout. We matched the null model to the statistics of neural activity, the within-day decoding accuracy, and the performance degradation when generalizing between days. In these simulations, we explore the scenario in which drift is confined to a (randomly-selected) low-dimensional subspace. We evaluated a range of dimensionalities for the drift subspace (horizontal axes), and evaluated the performance of a concatenated decoder on simulated data. While unconstrained drift prevents the identification of a concatenated decoder with good performance (Figure 3d), sufficiently constrained drift does not. In these simulations, we found that constraining drift to a subspace of rank 14–26 (red vertical lines) led to performance similar to the data (dashed horizontal lines) in all subjects except mouse 5. We speculate that this is because mouse 5 had limited data and poor generalization of single-day decoders over time, but other scenarios are possible. Black traces reflect the mean over 20 random simulations, and shaded regions reflect one standard deviation.

To quantify the systematic nature of drift further, we modified the null model to make drift partially systematic by constraining the null-model drift within a low rank subspace (Figure 4—figure supplement 1). This reflects a scenario in which only a few components of the population code change over time. We found that the performance of a concatenated decoder for low-rank drift better approximated experimental data. For three of the four mice we could match concatenated decoder performance when the dimension of the drift process was constrained within a range of 14–26, a relatively small fraction (around 20%) of the components of the full population.

Biologically achievable rates of plasticity can compensate drift, independent of specific learning rules

Together, these analyses show that the observed dynamics of drift favor a fixed linear readout above what would be expected for random drift. However, our results also show that a substantial component of drift cannot be confined to the null space of a fixed downstream linear readout. We asked how much ongoing weight change would be needed to achieve the performance of single-day decoders while minimizing day-to-day changes in decoding weights. We first approached this without assuming a specific plasticity rule, by simultaneously optimizing linear decoders for all recorded days while penalizing the magnitude of weight change between sessions (Figure 4a, Materials and methods: Concatenated and constrained analyses). By varying the magnitude of the weight change penalty we interpolated between the concatenated decoder (no weight changes) and the single-day decoders (optimal weights for each day). The result of this is shown in Figure 4b. Performance improves rapidly once small weight changes are permitted (∼12–25% per session). Thus, relatively modest amounts of synaptic plasticity might suffice to keep a readout consistent with ongoing changes in representation, provided a mechanism exists to implement appropriate weight changes.

A biologically plausible local learning rule can compensate drift

The results in Figure 4b suggest that modest amounts of synaptic plasticity could compensate for drift, but do not suggest a biologically plausible mechanism for this compensation. Could neurons track slow reconfiguration using locally available signals in practice? To test this, we used an adaptive linear neuron model based on the least mean squares (LMS) learning rule (Widrow and Hoff, 1960; Widrow and Hoff, 1962) (Materials and methods). This algorithm is biologically plausible because it only requires each synapse to access its current weight and recent prediction error (Figure 5a, Materials and methods: Online LMS algorithm).

Figure 5. Local, adaptive decoders can track representational drift over multiple days.

(a) The Least Mean-Squares (LMS) algorithm learns to linearly decode a target kinematic variable based on error feedback. Continued online learning can track gradual reconfiguration in population representations. (b) As the average weight change per day (horizontal axis) increases, the average decoding error (vertical axis) of the LMS algorithm improves, shown here for three kinematic variables (Mouse 4, 144 units, 10 sessions over 12 days; Materials and methods: Online LMS algorithm). (Dashed line: error for a decoder trained on only the previous session without online learning; solid line: performance of a decoder trained over all testing days.) As the rate of synaptic plasticity is increased, LMS achieves error rates comparable to the concatenated decoder. (c) Example LMS decoding results for three kinematic variables. Ground truth is plotted in black, and the LMS estimate in color. Sample traces are taken from day six. Dashed traces indicate the performance of the decoder without ongoing re-training. (d) (top) Average percent weight-change per session for online decoding of forward position (learning rate: 4 × 10⁻⁴/sample). The horizontal axis reflects time, with vertical bars separating days. The average weight change is 10.2% per session. To visualize %Δw continuously in this plot, we use a sliding difference with a window reflecting the average number of samples per session. (bottom) LMS (black) performs comparably to the concatenated decoder (cyan); the LMS mean absolute error of 0.47 m is within 3% of the concatenated decoder error. Without ongoing learning, the performance of the initial decoder degrades (orange). Error traces have been averaged over ten-minute intervals within each session. Discontinuities between days reflect day-to-day variability and suggest a small transient increase in error for LMS decoding at the start of each day.


Figure 5—figure supplement 1. Online learning with LMS: additional subjects.


LMS results for mice 1, 3, 4, and 5, applying the online LMS algorithm with a learning rate of 4 × 10⁻⁴/sample. Errors reflect the mean absolute error over ten-minute intervals. LMS (black) achieves errors comparable to an offline decoder trained on all sessions ('concatenated', blue), and outperforms a fixed decoder trained on the initial day (red). Only times within a trial were used for training. For the LMS algorithm, we observed inter-day weight changes of 7.6–10.4%, consistent with observed rates of change in the volume of dendritic spines in other studies. We present two spans of time from mouse 3, reflecting two largely non-overlapping populations of tracked neurons on non-overlapping spans of days.
Figure 5—figure supplement 2. The plasticity level required to track drift varies with population size.


Smaller populations require more plasticity to achieve target error levels. These plots show the daily weight changes required to track drift when decoding forward position, as a function of population size, for mice 3 and 4. The target error (M3: 0.68 m, M4: 0.48 m) was set based on the performance of LMS on the full population (M3: 114 neurons, M4: 134 neurons). For each sub-population size, 50 random sub-populations were drawn, and the learning rate was optimized to achieve the target error level. Shaded regions reflect the inner 95% of values over all sampled sub-populations. Weight change was assessed as the change between the ends of consecutive sessions, normalized by the overall average weight magnitude.
Figure 5—figure supplement 3. Extrapolation to larger populations.


The plasticity required to achieve a fixed error level decreases for larger populations. Typically, the number of inputs to a neuron is much larger than the ∼100 neurons observed here. The ∼10% weight change per day reported for LMS could therefore over-estimate the plasticity needed to track drift. To address this, we combined data from multiple mice to extend the LMS analysis to a synthetic population of 1238 cells over six sessions. Trials were matched based on the current and previous cue, and converted to pseudotime based on the fraction of the maze completed (0–100%). We allowed up to two-day recording gaps between consecutive sessions from the same mouse. These synthetic populations are not equivalent to large recordings from a single mouse, but nevertheless reveal how plasticity scales with population size. We found that larger populations could achieve the same performance as ∼100 cells with a ∼4% weight change per day. (a) Trial pseudotime (% of trial complete; black) can be decoded from the synthetic pooled population (1238 cells) using the LMS algorithm (violet: prediction). (b) As in the single-subject results, LMS tracks changes in the population code over time. In this case, a learning rate of 8 × 10⁻⁴/sample achieved error comparable to a concatenated decoder. The larger population permits better decoding error of ∼5%, compared to the ∼15–20% error in forward position decoded from ∼100 neurons. (c) As population size increases, both the weight magnitudes (left) and the rates of weight change (middle) decrease. Small populations could not achieve the error rates possible with the full population, even with very large learning rates; we therefore set the target error slightly higher, at 13% of the chance level, comparable to the error rates seen in individual mice using ∼100 cells. Overall, the required percentage weight change decreased for larger populations (right).

Figure 5b shows that this online learning rule achieved decoding performance comparable to the offline constrained decoders. Over the timespan of the data, LMS allows a linear decoder to track the observed representational drift (Figure 5c), exhibiting weight changes of ∼10%/day across all animals (learning rate 4 × 10⁻⁴/sample, Figure 5—figure supplement 1). These results suggest that small weight changes could track representational drift in practice. In contrast, we found that LMS struggled to match the unconstrained drift of the null model explored in Figure 3d. Calibrating the LMS learning rate on the null model to match the mean performance seen on the true data required an average weight change of 93% per day. Conversely, when matched to the average weight change of 10% per day, the null model produced a normalized mean-squared error of 1.3σ² (averaged over all mice), worse than chance. This further indicates that drift is highly structured, facilitating online compensation with a local learning rule.

We stress that modeling assumptions mean that these results are necessarily a proxy for the rates of synaptic plasticity that are observed in vivo. Nonetheless, we believe these calculations are conservative. We were restricted to a sample of ∼100–200 neurons, at least an order of magnitude less than the typical number of inputs to a pyramidal cell in cortex. The per-synapse magnitude of plasticity necessarily increases when smaller subsets are used for a readout (Figure 5—figure supplement 2). One would therefore expect lower rates of plasticity for larger populations. Indeed, when we combined neurons across mice into a large synthetic population (1238 cells), we found that the plasticity required to achieve target error asymptotes at less than 4% per day (Figure 5—figure supplement 3). Together, these results show a conservatively achievable bound on the rate of plasticity required to compensate drift in a biologically plausible model.

Discussion

Several theories have been proposed for how stable behavior could be maintained despite ongoing changes in connectivity and neural activity. Here, we found that representational drift occurred in both coding and non-coding subspaces. On a timescale of a few days, redundancy in the neural population could accommodate a significant component of drift, assuming a biological mechanism exists for establishing appropriate readout weights. Simulations suggested that the existence of this approximately stable subspace was not simply a result of population redundancy, since random diffusive drift quickly degraded a downstream readout. Drift confined to a low-dimensional subspace is one scenario that could give rise to this, although we do not exclude other possibilities. Nevertheless, a non-negligible component of drift resides outside the null space of a linear encoding subspace, implying that drift will eventually destroy any fixed-weight readout.

However, we showed that this destructive component of drift could be compensated with small and biologically realistic changes in synaptic weights, independently of any specific learning rule. Furthermore, we provided an example of a simple and biologically plausible learning rule that can achieve such compensation over long timescales with modest rates of plasticity. If our modeling results are taken literally, this would suggest that a single unit with connections to ∼100 PPC neurons can accurately decode task information with modest changes in synaptic weights over many days. This provides a concrete and quantitative analysis of the implications of drift on synaptic plasticity and connectivity. Together, our findings provide some of the first evidence from experimental data that representational drift could be compatible with long-term memories of learned behavioral associations.

A natural question is whether a long-term stable subspace is supported by an unobserved subset of neurons with stable tuning (Clopath et al., 2017). We cannot exclude this possibility, because we measured only a subset of the neural population. However, over multiple samples from different animals, our analyses consistently suggest that drift will reconfigure the code entirely over months. Specifically, we found that past reliability in single cells is no guarantee of future stability. This, combined with an abundance of highly-informative cells on a single day, contributes to poor (fixed) decoder generalization, because previously reliable cells eventually drop out or change their tuning. Consistent with this, studies have shown that connectivity in mammalian cortex is surprisingly dynamic: connections between neurons change on a timescale of hours to days, with only a small number of stable connections (Holtmaat et al., 2005; Minerbi et al., 2009; Holtmaat and Svoboda, 2009; Attardo et al., 2015).

We stress that the kind of reconfiguration observed in PPC is not seen in all parts of the brain; primary sensory and motor cortices can show remarkable stability in neural representations over time (Gallego et al., 2020). However, even if stable representations exist elsewhere in the brain, PPC still must communicate with these areas. We suggest that a component of ongoing plasticity maintains congruent representations across different neural circuits. Such maintenance would be important in a distributed, adaptive system like the brain, in which multiple areas learn in parallel. How this is achieved is the subject of intense debate (Rule et al., 2019). We hypothesize that neural circuits have continual access to two kinds of error signals. One kind should reflect mismatch between internal representations and external task variables, and another should reflect prediction mismatch between one neural circuit and another. Our study therefore motivates new experiments to search for neural correlates of error feedback between areas, and suggests further theoretical work to explore the consequences of such feedback.

Materials and methods

Data acquisition

The behavioral and two-photon calcium imaging data analyzed here were provided by the Harvey lab. Details regarding the experimental subjects and methods are provided in Driscoll et al., 2017.

Virtual reality task

Details of the virtual reality environment, training protocol, and fixed-association navigation task are described in Driscoll et al., 2017. In brief, virtual reality environments were constructed and operated using the MATLAB-based ViRMEn (Virtual Reality Mouse Engine) software (Harvey et al., 2012). Data were obtained from mice that had completed the 4–8 week training program for the two-alternative forced-choice T-maze task. The virtual maze had a fixed total length of 4.5 m. The cues were patterns on the walls (black with white dots, or white with black dots), and were followed by a gray striped 'cue recall' segment (2.25 m long) that was identical across trial types.

Data preparation and pre-processing

Raw Ca2+ fluorescence videos (sample rate = 5.3 Hz) were corrected for motion artefacts, and individual sources of Ca2+ fluorescence were identified and extracted (Driscoll et al., 2017). Processed data consisted of normalized Ca2+ fluorescence transients ('ΔF/F') and behavioral variables (mouse position, view angle, and velocity). Inter-trial intervals (ITIs) were removed for all subsequent analyses. For offline decoding, we considered only correct trials, and all signals were centered to zero mean on each trial as a pre-processing step.

When considering sequences of days, we restricted analysis to units that were continuously tracked over all days. For Figures 3 and 4, we use the following data: M1: 7 sessions, 15 days, 101 neurons; M3: 10 sessions, 13 days, 114 neurons; M4: 10 sessions, 11 days, 146 neurons; M5: 7 sessions, 7 days, 112 neurons. We allowed up to two-day recording gaps between consecutive sessions from the same mouse.

Quantification and statistical analysis

Decoding analyses

We decoded a kinematic time-series x = {x₁, …, x_T} with T time-points from the vector of instantaneous neural population activity z = {z₁, …, z_T}, using a linear decoder with a fixed set of weights M, that is, x̂ = Mz. We used the ordinary least-squares (OLS) solution for M, which minimizes the squared (L2) prediction error ε = ‖x − Mz‖² over all time-points. For the 'same-day' analyses, we optimized a separate M_d for each day d (Figure 2), restricting analysis to sessions with at least 200 identified units. We assessed decoding performance using 10-fold cross-validation, and report the mean absolute error, defined as ⟨|x − x̂|⟩, where |·| denotes the element-wise absolute value and ⟨·⟩ denotes expectation.

Best K-Subset ranking

For Figure 2d, we ranked cells in order of explained variance using a greedy algorithm. Starting with the most predictive cell, we iteratively added the next cell that minimized the MSE under 10-fold cross-validated linear decoding. To accelerate this procedure, we pre-computed the mean and covariance structure for the training and testing datasets. MSE fits and decoding performance can be computed directly from these summary statistics, accelerating the several thousand evaluations required for greedy selection. We added L2 regularization to this analysis by adding a constant λI to the covariance matrix of the neural data. The optimal regularization strength (λ = 10⁻⁴ to 10⁻³) slightly reduced decoding error, but did not alter the ranking of cells.
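
A sketch of this greedy ranking is below (illustrative only; `cv_mse` stands in for the covariance-based cross-validated MSE shortcut described above):

```python
# Sketch: greedy forward selection of the 'best' K-cell subset for decoding.
import numpy as np

def greedy_rank(Z, x, cv_mse, K):
    # Z: (T, N) activity; x: (T,) kinematics;
    # cv_mse(Z_subset, x) -> cross-validated decoding MSE (assumed helper)
    remaining = list(range(Z.shape[1]))
    chosen = []
    for _ in range(K):
        errs = [cv_mse(Z[:, chosen + [j]], x) for j in remaining]
        best = remaining[int(np.argmin(errs))]     # largest error reduction
        chosen.append(best)
        remaining.remove(best)
    return chosen                                  # cells in rank order
```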

Extrapolation via GP regression

To qualitatively assess whether decoding performance saturates with the available number of recorded neurons, we computed decoding performance on a sequence of random subsets of the population of various sizes (Figure 2c,d). Results for all analyses are reported as the mean over 20 randomly-drawn neuronal sub-populations, and over all sessions that had at least N=150 units. Gaussian process (GP) regression was implemented in Python, using a combination of a Matérn kernel and an additive white noise kernel. Kernel parameters were optimized via maximum likelihood (Scikit-learn, Pedregosa et al., 2011).
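
A sketch of this extrapolation with scikit-learn is below (the kernel hyperparameters shown are illustrative starting values; as noted above, they are refined by maximum likelihood during fitting):

```python
# Sketch: extrapolate decoding error vs. population size with GP regression.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, WhiteKernel

def extrapolate_performance(n_cells, err, n_query):
    # n_cells: (k,) subset sizes; err: (k,) mean decoding error; n_query: grid
    kernel = Matern(length_scale=50.0, nu=1.5) + WhiteKernel(noise_level=1.0)
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    gp.fit(n_cells.reshape(-1, 1), err)            # ML kernel optimization
    mu, sd = gp.predict(n_query.reshape(-1, 1), return_std=True)
    return mu, mu - 1.96 * sd, mu + 1.96 * sd      # mean and ~95% band
```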

Concatenated and constrained analyses

For both the concatenated (Figure 3b,c) and constrained analyses (Figure 4a,b), we used the set of identified neurons included in all sessions considered. In the concatenated analyses, we solved for a single decoder Mc for all days:

$$\varepsilon = \sum_{d=1}^{D} \left\lVert \mathbf{x}_d - M_c \mathbf{z}_d \right\rVert^2 \qquad (1)$$

where ε denotes the quadratic objective function to be minimized. In the constrained analysis, we optimized a series of different weights M = {M₁, …, M_D}, one for each day d ∈ 1…D, and added an adjustable L2 penalty λ on the change in weights across days. This problem reduces to the 'same-day' analysis for λ = 0, and approaches the concatenated decoder as λ approaches 1:

$$\varepsilon = (1-\lambda) \sum_{d=1}^{D} \left\lVert \mathbf{x}_d - M_d \mathbf{z}_d \right\rVert^2 + \lambda \sum_{d=1}^{D-1} \left\lVert M_{d+1} - M_d \right\rVert^2 \qquad (2)$$

For the purposes of the constrained analysis, missing days were ignored and the remaining days treated as if they were contiguous. Two sessions were missing from the 10- and 14-day spans for mice 3 and 4, respectively (Figure 4b). Figure 3c also shows the expected performance of a concatenated decoder for completely unrelated neural codes. To assess this, we permuted neuronal identities within individual sessions, so that each day uses a different 'code'.
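
Because Equation 2 is quadratic in the stacked weights, the constrained problem can be solved in closed form as one augmented least-squares system. A minimal sketch is below (dense assembly for clarity rather than efficiency; intercepts are omitted because signals are mean-centered):

```python
# Sketch: jointly fit per-day decoders M_1..M_D under an L2 penalty on
# day-to-day weight change (Equation 2).
import numpy as np

def constrained_decoders(Zs, xs, lam):
    # Zs: list of (T_d, N) activity; xs: list of (T_d,) kinematics; 0 <= lam < 1
    D, N = len(Zs), Zs[0].shape[1]
    rows, targets = [], []
    for d, (Z, x) in enumerate(zip(Zs, xs)):           # data-fit terms
        block = np.zeros((Z.shape[0], D * N))
        block[:, d * N:(d + 1) * N] = np.sqrt(1 - lam) * Z
        rows.append(block)
        targets.append(np.sqrt(1 - lam) * x)
    for d in range(D - 1):                             # weight-change penalty
        block = np.zeros((N, D * N))
        block[:, d * N:(d + 1) * N] = -np.sqrt(lam) * np.eye(N)
        block[:, (d + 1) * N:(d + 2) * N] = np.sqrt(lam) * np.eye(N)
        rows.append(block)
        targets.append(np.zeros(N))
    A, b = np.vstack(rows), np.concatenate(targets)
    W, *_ = np.linalg.lstsq(A, b, rcond=None)
    return W.reshape(D, N)                             # row d is M_d
```

Sweeping λ from 0 toward 1 traces the plasticity-accuracy trade-off shown in Figure 4b.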

Null model

We developed a null model to assess whether the performance of the concatenated decoder was consistent with random drift. For this, we matched the amount of day-to-day drift based on the rate at which single-day decoders degrade. We also sampled neural states from the true data to preserve sparsity and correlation statistics. The null model related neural activity to a 'fake' observable readout (e.g. mouse position) via an arbitrary linear mapping. This mapping changed from day to day, reflecting drift in the neural code. The fidelity of single-day and across-day decoders in inferring a readout from the null model was matched to the true data.

For each animal, we take the matrix z ∈ ℝ^{n×d} of mean-centered neural activity on day one, where n is the number of recorded neurons and d is the number of datapoints. We relate this matrix to pseudo-observations of mouse position x_r via a null model of the form x_r = M_r z + ε_r, where M_r ∈ ℝ^{1×n} and r indexes days. The vector ε_r is generated as scaled i.i.d. Gaussian noise. We scale ε_r such that the accuracy of a linear decoder trained on the data (z, x_r) matches the average (over days) accuracy of a single-day decoder trained on the true data.

Next, we consider the choice of the randomly-drifting readout, M_r. On day one, M₁ is generated as a vector of uniform random variables on [0,1]. Given M_r, we desire an M_{r+1} that satisfies:

  • ‖M_{r+1}‖² = ‖M_r‖².

  • The expected coefficient of multiple correlation of x_{r+1} = M_{r+1} z against the predictive model M_r z (the between-day R²) matches the average (over days) of the equivalent statistic computed from the true data.

To do this, we first generate a candidate ΔM_r ∈ ℝ^{1×n} as a vector of i.i.d. white noise. The components of ΔM_r orthogonal and parallel to M_r are then scaled so that M_{r+1} = M_r + ΔM_r satisfies the constraints above.
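
A sketch of one such drift step is below (here we parameterize the update by a target cosine between consecutive readouts, an assumed stand-in for the between-day R² calibration described above):

```python
# Sketch: perturb readout M while preserving its norm and controlling
# how far it rotates from one day to the next.
import numpy as np

def drift_step(M, target_cos, rng):
    dM = rng.standard_normal(M.shape)
    u = dM - (dM @ M) / (M @ M) * M        # component orthogonal to M
    u /= np.linalg.norm(u)
    norm = np.linalg.norm(M)
    # rotate within the (M, u) plane: |M_next| = |M|, cos(angle) = target_cos
    return target_cos * M + np.sqrt(1 - target_cos**2) * norm * u
```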

In Figure 4—figure supplement 1, a modification of the null model that confined inter-day model drift to a predefined subspace was used. Before simulating the null model over days, we randomly chose k orthogonal basis vectors, representing a k-dimensional subspace. We then searched for a candidate ΔMr, on each inter-day interval, that was representable as a weighted sum of these basis vectors. This requirement was in addition to those previously posed. Finding such a ΔMr corresponds to solving a quadratically-constrained quadratic program. This is non-convex, and thus a solution will not necessarily be found. However, solutions were always found in practice. We used unit Gaussian random variables as our initial guesses for each component of ΔMr, before solving the quadratic program using the IPOPT toolbox (Wächter and Biegler, 2006).
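
A simplified sketch of the rank-constrained variant is below (for illustration we only confine the update to the chosen subspace with a fixed step size; the exact norm and R² constraints were enforced via the QCQP described above):

```python
# Sketch: candidate drift update confined to a k-dimensional subspace.
import numpy as np

def low_rank_drift_step(M, B, step, rng):
    # B: (N, k) orthonormal basis of the drift subspace
    dM = B @ rng.standard_normal(B.shape[1])   # update lies in span(B)
    return M + step * dM / np.linalg.norm(dM)
```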

Drift alignment

We examine how much drift aligns with noise correlations versus directions of neural activity that vary with the task ('behavior-coding directions'). We define an alignment statistic ρ that reflects how much drift projects onto a given subspace (i.e. noise vs. behavior). We normalize ρ so that 0 reflects chance-level alignment and 1 reflects perfect alignment of the drift with the largest eigenvector of a given subspace (e.g. the principal eigenvector of the noise covariance).

Let z(x) denote the neural population activity, where x is a normalized measure of maze location, akin to trial pseudotime. Define drift Δμ_z(x) as the change in the mean neural activity μ_z(x) across days. We examine how much drift aligns with noise correlations versus directions of neural activity that vary with task pseudotime (∂z(x)/∂x).

To measure the alignment of a drift vector Δμ with the distribution of inter-trial variability (i.e. noise), we consider the trial-averaged mean μ and covariance Σ of the neural activity (log calcium-fluorescence signals filtered between 0.03 and 0.3 Hz and z-scored), conditioned on trial location and the current/previous cue direction. We use the mean squared magnitude of the dot product between the change in trial-conditioned means across days (Δμ) and the directions of inter-trial variability (Δz = z − ⟨z⟩) on the first day, which is summarized by the product ΔμᵀΣΔμ:

$$\left\langle |\Delta\mu^\top \Delta z|^2 \right\rangle = \left\langle \Delta\mu^\top \Delta z \, \Delta z^\top \Delta\mu \right\rangle = \Delta\mu^\top \left\langle \Delta z \, \Delta z^\top \right\rangle \Delta\mu = \Delta\mu^\top \Sigma \, \Delta\mu. \qquad (3)$$

To compare pairs of sessions with different amounts of drift and variability, we normalize the drift vector to unit length, and normalize the trial-conditioned covariance by its largest eigenvalue λ_max:

$$\phi_{\text{trial}}^2 = \frac{\Delta\mu^\top \Sigma \, \Delta\mu}{|\Delta\mu|^2 \, \lambda_{\max}} \qquad (4)$$

The statistic φ_trial equals 1 if the drift aligns perfectly with the direction of largest inter-trial variability, and can be interpreted as the fraction of drift explained by the directions of noise correlations.

Random drift can still align with some directions by chance: the mean squared dot-product between two randomly-oriented D-dimensional unit vectors scales as 1/D. Accounting for the contribution from each dimension of Σ, the expected chance alignment is therefore φ₀² = tr(Σ)/(D λ_max). We normalize the alignment coefficient ρ_noise such that it is 0 for randomly oriented vectors, and 1 if the drift aligns perfectly with the direction of largest variability:

$$\rho_{\text{noise}} = \frac{\phi_{\text{trial}} - \phi_0}{1 - \phi_0} \qquad (5)$$

We define a similar alignment statistic ρ_coding to assess how drift aligns with directions of neural variability that encode location. We consider the root-mean-squared dot product between the drift Δμ and the directions of neural activity that vary with location x on a given trial, that is, ∂ₓz(x):

$$\left\langle |\Delta\mu^\top \partial_x z(x)|^2 \right\rangle = \left\langle \Delta\mu^\top [\partial_x z(x)][\partial_x z(x)]^\top \Delta\mu \right\rangle = \Delta\mu^\top \left[ \Sigma' + \mu' \mu'^\top \right] \Delta\mu \qquad (6)$$

In contrast to the trial-to-trial variability statistic, this statistic depends on the second moment Σ′ + μ′μ′ᵀ, where ∂ₓz(x) ∼ 𝒩(μ′, Σ′). We define normalized statistics φ_coding² and ρ_coding analogously to φ_trial² and ρ_noise. For the alignment of drift with behavior, we observed ρ_coding = 0.11–0.24 (µ = 0.15, σ = 0.03), which was significantly above chance for all mice. In contrast, the 95th percentile for chance alignment (i.e. random drift) ranged from 0.06 to 0.10 (µ = 0.07, σ = 0.02). Drift aligned substantially more with noise correlations, with ρ_noise = 0.29–0.43 (µ = 0.36, σ = 0.04).
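
A sketch of the alignment computation (Equations 4 and 5) is below, assuming the drift vector and the relevant covariance have already been estimated:

```python
# Sketch: normalized alignment of a drift vector dmu with a covariance Sigma.
import numpy as np

def alignment(dmu, Sigma):
    dmu = dmu / np.linalg.norm(dmu)                  # unit-length drift
    lam_max = np.linalg.eigvalsh(Sigma)[-1]          # largest eigenvalue
    phi = np.sqrt(dmu @ Sigma @ dmu / lam_max)       # Equation 4
    D = Sigma.shape[0]
    phi0 = np.sqrt(np.trace(Sigma) / (D * lam_max))  # chance-level alignment
    return (phi - phi0) / (1 - phi0)                 # rho, Equation 5
```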

Online LMS algorithm

The Least Mean-Squares (LMS) algorithm is an online approach to training and updating a linear decoder, and corresponds to stochastic gradient descent (Figure 5a). The algorithm was originally introduced in Widrow and Hoff, 1960; Widrow and Hoff, 1962; Widrow and Stearns, 1985. Briefly, LMS computes a prediction error for an affine decoder (i.e. a linear decoder with a constant offset feature or bias parameter) at every time-point, which is then used to update the decoding weights. We analyzed ten sessions spanning twelve contiguous days from mouse 4 (144 units in common), and initialized the decoder by training on the first two sessions using OLS.

By varying the learning rate, we obtained a trade-off (Figure 5b) between the rate of weight change and the decoding error, with the most rapid learning rates exceeding the performance of offline (static) decoders. In Figure 5d, we selected an example with a learning rate of η = 4 × 10⁻⁴. To provide a continuous visualization of the rate of weight change in Figure 5d, we used a sliding difference with a duration matching the average session length. This was normalized by the average weight magnitude to report percent weight change per day. In all other statistics, per-day weight change is assessed as the difference in weights at the end of each session, divided by the number of days between sessions.
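
A minimal sketch of the online update is below (illustrative; the actual analysis additionally handled session boundaries and restricted training to within-trial times):

```python
# Sketch: LMS (delta-rule) updates for an affine decoder, one time-step at
# a time; each weight needs only its current value and the prediction error.
import numpy as np

def lms_decode(Z, x, M0, b0, eta=4e-4):
    # Z: (T, N) activity; x: (T,) target; M0, b0: initial weights and bias
    M, b = M0.copy(), float(b0)
    x_hat = np.zeros_like(x, dtype=float)
    for t in range(len(x)):
        x_hat[t] = M @ Z[t] + b           # predict
        err = x[t] - x_hat[t]             # error feedback
        M += eta * err * Z[t]             # local weight update
        b += eta * err
    return x_hat, M, b
```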

Data and code availability

Datasets recorded in Driscoll et al., 2017 are available from the Dryad repository (https://doi.org/10.5061/dryad.gqnk98sjq). The analysis code generated during this study is available on GitHub (https://github.com/michaelerule/stable-task-information; copy archived at https://github.com/elifesciences-publications/stable-task-information; Rule, 2020).

Acknowledgements

We thank Fulvio Forni, Yaniv Ziv, and Alon Rubin for in-depth discussions. This work was supported by the Human Frontier Science Program (RGY0069), an ERC Starting Grant (StG FLEXNEURO 716643), and grants from the NIH (NS089521, MH107620, NS108410).

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Timothy O'Leary, Email: tso24@cam.ac.uk.

Stephanie Palmer, University of Chicago, United States.

Ronald L Calabrese, Emory University, United States.

Funding Information

This paper was supported by the following grants:

  • Human Frontier Science Program RGY0069 to Michael E Rule, Adrianna R Loback, Christopher D Harvey, Timothy O'Leary.

  • H2020 European Research Council FLEXNEURO 716643 to Dhruva Raman, Timothy O'Leary.

  • National Institutes of Health NS089521 to Christopher D Harvey.

  • National Institutes of Health MH107620 to Christopher D Harvey.

  • National Institutes of Health NS108410 to Christopher D Harvey.

Additional information

Competing interests

No competing interests declared.

Author contributions

Michael E Rule, Conceptualization, Formal analysis, Validation, Investigation, Visualization, Methodology.

Adrianna R Loback, Formal analysis, Investigation, Visualization, Methodology.

Dhruva V Raman, Conceptualization, Validation, Investigation, Methodology.

Laura N Driscoll, Data curation.

Christopher D Harvey, Data curation, Funding acquisition, Project administration.

Timothy O'Leary, Conceptualization, Supervision, Funding acquisition, Validation, Investigation, Visualization, Methodology, Project administration.

Additional files

Transparent reporting form

Data availability

Datasets recorded in Driscoll et al., 2017, are available from the Dryad repository (https://doi.org/10.5061/dryad.gqnk98sjq). The analysis code generated during this study is available on GitHub (https://github.com/michaelerule/stable-task-information; copy archived at https://github.com/elifesciences-publications/stable-task-information).

The following dataset was generated:

Driscoll LN. 2020. Data from: Stable task information from an unstable neural population. Dryad Digital Repository.

References

  1. Ajemian R, D'Ausilio A, Moorman H, Bizzi E. A theory for how sensorimotor skills are learned and retained in noisy and nonstationary neural circuits. PNAS. 2013;110:E5078–E5087. doi: 10.1073/pnas.1320116110.
  2. Andersen RA, Snyder LH, Bradley DC, Xing J. Multimodal representation of space in the posterior parietal cortex and its use in planning movements. Annual Review of Neuroscience. 1997;20:303–330. doi: 10.1146/annurev.neuro.20.1.303.
  3. Andersen RA, Buneo CA. Intentional maps in posterior parietal cortex. Annual Review of Neuroscience. 2002;25:189–220. doi: 10.1146/annurev.neuro.25.112701.142922.
  4. Attardo A, Fitzgerald JE, Schnitzer MJ. Impermanence of dendritic spines in live adult CA1 hippocampus. Nature. 2015;523:592–596. doi: 10.1038/nature14467.
  5. Calton JL, Taube JS. Where am I and how will I get there from here? A role for posterior parietal cortex in the integration of spatial information and route planning. Neurobiology of Learning and Memory. 2009;91:186–196. doi: 10.1016/j.nlm.2008.09.015.
  6. Carmena JM, Lebedev MA, Henriquez CS, Nicolelis MA. Stable ensemble performance with single-neuron variability during reaching movements in primates. Journal of Neuroscience. 2005;25:10712–10716. doi: 10.1523/JNEUROSCI.2772-05.2005.
  7. Clopath C, Bonhoeffer T, Hübener M, Rose T. Variance and invariance of neuronal long-term representations. Philosophical Transactions of the Royal Society B: Biological Sciences. 2017;372:20160161. doi: 10.1098/rstb.2016.0161.
  8. Degenhart AD, Bishop WE, Oby ER, Tyler-Kabara EC, Chase SM, Batista AP, Yu BM. Stabilization of a brain–computer interface via the alignment of low-dimensional spaces of neural activity. Nature Biomedical Engineering. 2020;4:672–685. doi: 10.1038/s41551-020-0542-9.
  9. Driscoll LN, Pettit NL, Minderer M, Chettih SN, Harvey CD. Dynamic reorganization of neuronal activity patterns in parietal cortex. Cell. 2017;170:986–999. doi: 10.1016/j.cell.2017.07.021.
  10. Druckmann S, Chklovskii DB. Neuronal circuits underlying persistent representations despite time varying activity. Current Biology. 2012;22:2095–2103. doi: 10.1016/j.cub.2012.08.058.
  11. Gallego JA, Perich MG, Naufel SN, Ethier C, Solla SA, Miller LE. Cortical population activity within a preserved neural manifold underlies multiple motor behaviors. Nature Communications. 2018;9:4233. doi: 10.1038/s41467-018-06560-z.
  12. Gallego JA, Perich MG, Chowdhury RH, Solla SA, Miller LE. Long-term stability of cortical population dynamics underlying consistent behavior. Nature Neuroscience. 2020;23:260–270. doi: 10.1038/s41593-019-0555-4.
  13. Ganguly K, Carmena JM. Emergence of a stable cortical map for neuroprosthetic control. PLOS Biology. 2009;7:e1000153. doi: 10.1371/journal.pbio.1000153.
  14. Harvey CD, Coen P, Tank DW. Choice-specific sequences in parietal cortex during a virtual-navigation decision task. Nature. 2012;484:62–68. doi: 10.1038/nature10918.
  15. Hennig JA, Golub MD, Lund PJ, Sadtler PT, Oby ER, Quick KM, Ryu SI, Tyler-Kabara EC, Batista AP, Yu BM, Chase SM. Constraints on neural redundancy. eLife. 2018;7:e36774. doi: 10.7554/eLife.36774.
  16. Holtmaat AJ, Trachtenberg JT, Wilbrecht L, Shepherd GM, Zhang X, Knott GW, Svoboda K. Transient and persistent dendritic spines in the neocortex in vivo. Neuron. 2005;45:279–291. doi: 10.1016/j.neuron.2005.01.003.
  17. Holtmaat A, Svoboda K. Experience-dependent structural synaptic plasticity in the mammalian brain. Nature Reviews Neuroscience. 2009;10:647–658. doi: 10.1038/nrn2699.
  18. Huber D, Gutnisky DA, Peron S, O'Connor DH, Wiegert JS, Tian L, Oertner TG, Looger LL, Svoboda K. Multiple dynamic representations in the motor cortex during sensorimotor learning. Nature. 2012;484:473–478. doi: 10.1038/nature11039.
  19. Kappel D, Legenstein R, Habenschuss S, Hsieh M, Maass W. A dynamic connectome supports the emergence of stable computational function of neural circuits through reward-based learning. eNeuro. 2018;5:ENEURO.0301-17.2018. doi: 10.1523/ENEURO.0301-17.2018.
  20. Kaufman MT, Churchland MM, Ryu SI, Shenoy KV. Cortical activity in the null space: permitting preparation without movement. Nature Neuroscience. 2014;17:440–448. doi: 10.1038/nn.3643.
  21. Krumin M, Lee JJ, Harris KD, Carandini M. Decision and navigation in mouse parietal cortex. eLife. 2018;7:e42583. doi: 10.7554/eLife.42583.
  22. Levy SJ, Kinsky NR, Mau W, Sullivan DW, Hasselmo ME. Hippocampal spatial memory representations in mice are heterogeneously stable. bioRxiv. 2019. doi: 10.1101/843037.
  23. Loewenstein Y, Kuras A, Rumpel S. Multiplicative dynamics underlie the emergence of the log-normal distribution of spine sizes in the neocortex in vivo. Journal of Neuroscience. 2011;31:9481–9488. doi: 10.1523/JNEUROSCI.6130-10.2011.
  24. Loewenstein Y, Yanover U, Rumpel S. Predicting the dynamics of network connectivity in the neocortex. Journal of Neuroscience. 2015;35:12535–12544. doi: 10.1523/JNEUROSCI.2917-14.2015.
  25. Minderer M, Brown KD, Harvey CD. The spatial structure of neural encoding in mouse posterior cortex during navigation. Neuron. 2019;102:232–248. doi: 10.1016/j.neuron.2019.01.029.
  26. Minerbi A, Kahana R, Goldfeld L, Kaufman M, Marom S, Ziv NE. Long-term relationships between synaptic tenacity, synaptic remodeling, and network activity. PLOS Biology. 2009;7:e1000136. doi: 10.1371/journal.pbio.1000136.
  27. Moczulska KE, Tinter-Thiede J, Peter M, Ushakova L, Wernle T, Bathellier B, Rumpel S. Dynamics of dendritic spines in the mouse auditory cortex during memory formation and memory recall. PNAS. 2013;110:18315–18320. doi: 10.1073/pnas.1312508110.
  28. Montijn JS, Meijer GT, Lansink CS, Pennartz CM. Population-level neural codes are robust to single-neuron variability from a multidimensional coding perspective. Cell Reports. 2016;16:2486–2498. doi: 10.1016/j.celrep.2016.07.065.
  29. Mulliken GH, Musallam S, Andersen RA. Forward estimation of movement state in posterior parietal cortex. PNAS. 2008;105:8170–8177. doi: 10.1073/pnas.0802602105. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Ni AM, Ruff DA, Alberts JJ, Symmonds J, Cohen MR. Learning and attention reveal a general relationship between population activity and behavior. Science. 2018;359:463–465. doi: 10.1126/science.aao0284. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V. Scikit-learn: machine learning in Python. Journal of Machine Learning Research. 2011;12:2825–2830. [Google Scholar]
  32. Rokni U, Richardson AG, Bizzi E, Seung HS. Motor learning with unstable neural representations. Neuron. 2007;54:653–666. doi: 10.1016/j.neuron.2007.04.030. [DOI] [PubMed] [Google Scholar]
  33. Rule ME, O'Leary T, Harvey CD. Causes and consequences of representational drift. Current Opinion in Neurobiology. 2019;58:141–147. doi: 10.1016/j.conb.2019.08.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Rule ME. Stable Task Information from an Unstable Neural Population. 3.0GitHub. 2020 doi: 10.7554/eLife.51121. https://github.com/michaelerule/stable-task-information [DOI] [PMC free article] [PubMed]
  35. Rumpel S, Triesch J. The dynamic connectome. E-Neuroforum. 2016;22:48–53. doi: 10.1515/s13295-016-0026-2. [DOI] [Google Scholar]
  36. Semedo JD, Zandvakili A, Machens CK, Yu BM, Kohn A. Cortical Areas interact through a communication subspace. Neuron. 2019;102:249–259. doi: 10.1016/j.neuron.2019.01.026. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Singh A, Peyrache A, Humphries MD. Medial prefrontal cortex population activity is plastic irrespective of learning. The Journal of Neuroscience. 2019;39:1370-17–1373483. doi: 10.1523/JNEUROSCI.1370-17.2019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Tonegawa S, Pignatelli M, Roy DS, Ryan TJ. Memory Engram storage and retrieval. Current Opinion in Neurobiology. 2015;35:101–109. doi: 10.1016/j.conb.2015.07.009. [DOI] [PubMed] [Google Scholar]
  39. Wächter A, Biegler LT. On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming. Mathematical Programming. 2006;106:25–57. doi: 10.1007/s10107-004-0559-y. [DOI] [Google Scholar]
  40. Widrow B, Hoff ME. Adaptive Switching Circuits. Stanford Univ Ca Stanford Electronics Labs; 1960. [Google Scholar]
  41. Widrow B, Hoff ME. Associative storage and retrieval of digital information in networks of adaptive ‘neurons’. In: Bernard E. E, Kare M. R, editors. Biological Prototypes and Synthetic Systems. Springer; 1962. pp. 160–161. [DOI] [Google Scholar]
  42. Widrow B, Stearns SD. Adaptive Signal Processing. Prentice-Hall, Inc; 1985. [Google Scholar]
  43. Ziv Y, Burns LD, Cocker ED, Hamel EO, Ghosh KK, Kitch LJ, El Gamal A, Schnitzer MJ. Long-term dynamics of CA1 hippocampal place codes. Nature Neuroscience. 2013;16:264–266. doi: 10.1038/nn.3329. [DOI] [PMC free article] [PubMed] [Google Scholar]

Decision letter

Editor: Stephanie Palmer

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

This work addresses how accurate readout in the brain can be maintained despite shifts in neural population tuning and variability. The work reanalyzes previous data from posterior parietal cortex and digs deeper to show that a simple linear readout can, in fact, recover kinematic variables like animal position and speed from this drifting population. While this simple readout works well, it does slowly degrade over days. This work also shows how to ameliorate this degradation: plasticity that operates via a biologically plausible mechanism can maintain accurate readout.

Decision letter after peer review:

[Editors’ note: the authors submitted for reconsideration following the decision after peer review. What follows is the decision letter after the first round of review.]

Thank you for submitting your work entitled "Stable task information from an unstable neural population" for consideration by eLife. Your article has been reviewed by three peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by a Senior Editor. The reviewers have opted to remain anonymous.

Our decision has been reached after consultation between the reviewers. Based on these discussions and the individual reviews below, we regret to inform you that your work will not be considered further for publication in eLife.

This is a clearly-presented initial study on how stable readout across days might be achieved despite shifting neural representations. The results have been judged to be sound, analytically, but the potential impact of the work falls short of threshold for a Short Report. Individual reviewer comments are listed below, but the main critiques are summarized here:

1) There are only data on 1 mouse to support the key result, which itself is not surprising given previous work from Driscoll et al., 2017.

2) The present work lacks a null model against which one can properly interpret the success of the concatenated decoder.

Reviewer #1:

This is a well-written, short manuscript about changes in neuronal activity patterns in PPC over days, and how stable readout can be achieved with a simple, linear decoder despite these shifting sands. The idea is that a single, best compromise, linear decoder can be found that is immune to the reconfiguration in the neural population. The work posits, but doesn't prove, that the reconfiguration exists in the "null space" of the task.

There are a number of theoretical papers (as nicely referenced in this document) about how accurate decoders might be maintained in changing neural populations, but the upside of this work is that:

a) The results are taken from experimental data with large enough N's and over enough days that decoding accuracy can be traced, and

b) This is the simplest of all possible theories of how performance is maintained, and it's reasonably plausible.

I have some substantive concerns:

1) Given that the consensus decoder had to perform better across days than any single day decoder, it's not clear how surprising these results are.

2) It wasn't clear how well this extrapolates across different mice. In some figures, 3 or 4 individuals are compared, others just 2, others yet, just 1 mouse (mouse 4) is mentioned. This is central to the generality of the paper and should be laid out more clearly. Do the concatenated decoder and LMS decoder results hold for more than one individual?

3) The arguments about the scaling of the biologically plausible weight adjustments seem a little problematic. It's not clear why the results here form an upper bound on the weight changes needed to maintain accurate decoding. Also, it wasn't clear how the interactions between networks, maintaining congruence, is achieved. That final part of the Discussion was a bit vague.

Reviewer #2:

Loback et al. re-analyze data from Driscoll et al., 2017, which had previously shown that PPC representations are unstable over days during a delayed VR T-maze task. Here, using linear decoders, they find that a static decoder can do a reasonable job if trained on data from all days, and that an old model of synaptic weight updates can be applied to maintain decoder performance. The analyses seem to have been done reasonably, but the results strike me as rather shallow and are based on limited data.

The first main result is simply that unstable representations cause single-day (linear) decoders to generalize poorly, but a multi-day decoder to perform somewhat better. I have two issues with this result:

1) Given that activity is sparse and does not have systematic shifts in tuning, this decoding result is very nearly a mathematical necessity. Because of sparsity, the decoder likely ends up built so that different units drive the decoder performance on different sessions. This would not be news. Further, there are no null models analyzed for what would happen under different patterns of tuning shifts, which would be a helpful comparison. It is therefore not clear whether there is anything to be surprised at here.

2) How stable is the behavior within and across sessions? Details of behavior matter all over the brain (Stringer, Pachitariu et al., 2019, Musall, Kaufman et al., 2019), so it is possible that drift in behavioral details could lead to these shifts as well. At least, it should really be shown whether the parameters the authors track are stable over time.

These points said, there is value in this section. The quantifications of instability and of how many neurons are required for good decoder quality are helpful, and the point that only 6% of neurons are even in the top 50% of informativeness is surprising and interesting.

The second major result is the application of the Widrow and Hoff, 1962 model. If I understand correctly, this is primarily a different way of quantifying how fast the tuning changes occur (requiring ~2%/minute weight changes), and secondarily a proof of principle for that model. However, unless I'm missing something, this is a one-mouse result. That would not meet the standard for the field. In addition, comment #2 above applies to this result as well, making it harder to interpret.

Reviewer #3:

In their report, Loback and colleagues reanalyze data from Driscoll et al., 2017. They confirm the finding of that paper, namely that neuronal representations in the parietal cortex of mice reorganize over the time scale of days, while the overall information content is preserved. The authors then more specifically study the dynamics of these changes and relate them to simplified synaptic plasticity rules.

Overall, I find that the paper is clearly written and everything seems technically correct. However, I also find that it lacks scientific novelty. While I find the idea of linking the observed reorganization of neural activity with synaptic plasticity exciting, I find that the paper does not quite achieve that. I think the authors would need to work out some concrete consequences/constraints on plasticity for this paper to become viable.

Broadly speaking, the current study is divided into two parts. The first part is a re-analysis of the data of Driscoll et al., 2017, which is performed in Figures 1-3. The authors use decoding methods to retrieve task information from the population activity. While some of the details of the population decoding methods are different to those used by the Driscoll et al., the overall conclusions are the same. The strongest point of the re-analysis is that the authors more clearly quantify the strength of the day-to-day changes using decoders that are constrained to change only little over days. That is a nice twist that was not performed in the Driscoll paper.

The second part of the paper is an attempt to relate these day-to-day changes to synaptic plasticity (Figures 3, 4). This part is rather brief and quite sketchy. Roughly, the authors simply reformulate the constrained decoder as an adaptive decoder. Conceptually, that is similar to the ideas brought forward by Rokni et al., Ajemian et al., and others. What could make this part interesting, is if this link could be made stronger, i.e., if it could really be a link to synaptic plasticity, rather than a link to a hypothetical readout. But even if the authors limit themselves to a single readout neuron, many questions are left unaddressed, e.g. how to extrapolate the adaptation rules for the decoder to realistic network sizes.

Other comments:

1) It was not clear to me what happens with the decoders within a session and between days. Do decoders 'jump' between days or stay roughly the same? How does that influence the adaptation rules?

2) Legend of Figure 4 and subsection “Biologically plausible weight adjustment can compensate for ongoing reconfiguration of PPC activity”. You repeatedly state that you approach the 'concatenated decoder.' I guess that should be the 'constrained decoder', otherwise it makes no sense to me.

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

Thank you for choosing to send your work entitled "Stable task information from an unstable neural population" for consideration at eLife. Your letter of appeal has been considered by a Senior Editor and a Reviewing editor, and we are prepared to consider a revised submission with no guarantees of acceptance.

Please address the following concerns that were raised in the discussion of your appeal and revised manuscript:

The null model added is a very nice one (Figure 3—figure supplement 2). It seems there has been a good effort to match it to properties of the data while incorporating a random walk. This is a crucial control. In addition, the new analysis of how drift aligns with coding vs. noise vs. chance (Figure 3—figure supplement 3) is also of substantial interest. Both of these new results are for 4 mice, which is excellent. The framing in the new manuscript also makes it somewhat clearer what the point of this paper is.

Points left to address in full:

1) Please go back and consider the more interesting null model in the other analyses and quantifications in this manuscript. This will improve many other parts of the paper. Please also place this new null model result in the main text of the paper.

2) Regarding the new null model:

Past evidence has clearly shown that neural tuning (or population activity) changes both randomly (assumed to be due to plasticity noise) and directionally (assumed to be due to feedback and learning). With the new null model, the analyses attempt to rule out a random walk. This is a valuable effort. However, please add commentary on how this null model is useful despite ignoring the influence of the systematic, directional changes which were already demonstrated in the past, including the authors' own data, and which have usually been related to ongoing learning.

3) Please address in full the expanded review comments sent during the initial appeal. That text is reproduced here:

Thank you for sending us your thoughts and questions about the reviewer comments. This is an excellent piece of work, and the rejection is in no way about whether or not this is solid and publishable. The debate amongst the reviewers revolved around whether it was a significant enough advance for eLife. I have consulted with the reviewers in question and have a more thorough explanation of their comments. Please feel free to reach out if you have further questions.

This manuscript does, indeed, have some basic controls / null models. The shuffle control shows that the decoding is better than chance, and the static same-day model gives an idea of how much the weights have to change per session to do as well as freshly retrained decoders. The null models we'd like to see would compare results with more specific models. This is explained below:

The issue that I think all of the reviewers had is that it wasn't clear how much we should be surprised by these results, and we weren't clear on what new beliefs we should have after reading this paper if we've already read Driscoll, 2017.

There are two basic results that are claimed to be original. First: we should be surprised by the success of a concatenated decoder. One reviewer commented:

"Given that activity is sparse and does not have systematic shifts in tuning, this decoding result is very nearly a mathematical necessity. Because of sparsity, the decoder likely ends up built so that different units drive the decoder performance on different sessions. This would not be news."

In other words, to believe that there's something novel here, we would want to see a null model that can recapitulate the changes seen in Driscoll, 2017, with similar sparsity in the responses, where there isn't an ability to obtain a good concatenated decoder. We'd like to see what's required to have a different result. Without that, we would have expected that the concatenated decoder would work well.

Second, as it was understood by the reviewers, the manuscript argues that we should be surprised that updating decoder weights with the Widrow and Hoff model works here. From Driscoll, 2017, we have an idea of how rapidly location selectivity changes, and how rapidly a static decoder decays. Given that we know this, how rapidly would you expect to have to change the decoder? We didn't see much in the paper that wasn't just a different way of quantifying the same tuning changes. One reviewer suggested adding more specific null models because they think this would let the authors answer these deeper questions. For example, are all of the neurons smoothly changing their tuning? Do some change fast and others slow, and is this a continuous distribution? Is there coordination between neurons' tuning changes or are neurons changing independently? The current null models are extremes: the shuffle is related to a model where everything changes instantly (obviously wrong), and the same-day decoder is equivalent to there being no changes ever (which we know is wrong from Driscoll, 2017). So, what new have we learned?

Finally, regarding the result that only 6% of neurons are in the top 50% all 10 days: again, we lack the context to know how surprised we should be. If we suppose that the informative neurons are chosen randomly each day, then we'd expect the number of neurons that are in the top 50% for 10 days to be 0.5 ^ 9 = ~0.2%. In that case, 6% is surprisingly high. Looking at Driscoll's Figure 2B, ~40% of neurons keep their place preference for 10 days. In that case, 6% is surprisingly low. In fact, why is it so low? Could this just mean the decoder is under-regularized?

eLife. 2020 Jul 14;9:e51121. doi: 10.7554/eLife.51121.sa2

Author response


[Editors’ note: The authors appealed the original decision. What follows is the authors’ response to the first round of review.]

This is a clearly-presented initial study on how stable readout across days might be achieved despite shifting neural representations. The results have been judged to be sound, analytically, but the potential impact of the work falls short of threshold for a Short Report. Individual reviewer comments are listed below, but the main critiques are summarized here:

1) There are only data on 1 mouse to support the key result, which itself is not surprising given previous work from Driscoll et al., 2017.

2) The present work lacks a null model against which one can properly interpret the success of the concatenated decoder.

We have completely addressed points (1) and (2) by extending the analysis across animals and by providing a null model for the concatenated decoder. We discuss details below. The outcome strengthens our conclusions. This, along with extensive additional analysis and rewriting to address remaining reviewers' comments, means that the manuscript is significantly improved.

There was broad agreement between reviewers that the study (as previously presented) lacked depth and the importance of the results was not clear. Our original goal was to provide a short, sharp analysis with easily digestible results. We concede that in trying to keep the presentation terse we were too glib and superficial.

We have performed extensive additional analyses that strengthen our results. We have also rewritten the manuscript with a more comprehensive Discussion and Introduction, and revised the text to more clearly state the purpose of the study and its contribution. We are open to the suggestion of changing the manuscript to a full report, as opposed to a Short Report, by bringing in the supplementary results/figures to the main text.

Additional results/analyses:

– Figure 2—figure supplement 1 quantifies behavioral stability

– Figure 3—figure supplement 1 shows that the constrained and concatenated results generalize across all four mice for which there was sufficient data

– Figure 3—figure supplement 2 tests concatenated decoder performance against a null model for drift

– Figure 3—figure supplement 3 shows that drift is not random, and instead aligns far above chance with fast fluctuations in neuronal activity

– Figure 4—figure supplement 1 shows that the LMS results generalize across animals

– Figure 4—figure supplement 2 shows that the relative plasticity rates scale with population size, for a fixed error level

– Figure 4—figure supplement 3 extrapolates the LMS results to a synthetic population of >1000 neurons, showing that very little plasticity would be needed to track the stable subspace as the number of neurons is increased

Reviewer #1:

There are a number of theoretical papers (as nicely referenced in this document) about how accurate decoders might be maintained in changing neural populations, but the upside of this work is that:

a) The results are taken from experimental data with large enough N's and over enough days that decoding accuracy can be traced, and

We appreciate the reviewer's feedback. We want to point out that we show that not all drift sits in a linear subspace. We expand on this in the responses below and in the revised manuscript.

b) This is the simplest of all possible theories of how performance is maintained, and it's reasonably plausible.

I have some substantive concerns:

1) Given that the consensus decoder had to perform better across days than any single day decoder, it's not clear how surprising these results are.

There is nothing to guarantee that a concatenated decoder would perform as well as observed in the data. In fact, taking such concerns on board, we tested performance against a null model with matched sparsity and within/between day variance. We find that a concatenated decoder performs substantially better on the data than on this null model, and we extended the analysis to show that this holds across all of the animals that could be analyzed over many days. This is included in a new figure supplement (Figure 3—figure supplement 2).

Secondly, we believe there may be some misunderstanding as to the purpose of constructing these decoders, possibly due to our decision to write a brief manuscript. Our goal is not to predict behavior reliably from data. Our goal is to analyze the dynamics of a drifting representation from the perspective of a system with similar properties and constraints as a downstream neuron/circuit, and then assess, quantitatively, whether these data pose a serious problem for understanding how the brain maintains consistent behavior. The first question we addressed in the paper was indeed a simple, but necessary one: does simple weighted readout work? An affirmative answer suggests a biologically plausible means of reading out the information that is hypothesized to reside in this brain area. The second, immediate, follow-on question is: could linear decoding continue to work despite drift? If so, how well, how many units are needed, and does drift induce changes that cannot be confined to a linear subspace? Thirdly, is there a way to quantify the demands placed by drift on connectivity and synaptic plasticity, and do so in a way that is independent of particular models of plasticity? Fourthly, given the actual numbers that emerge from answering the previous questions, is there a specific, parsimonious and biologically plausible model that can find an approximately stable coding subspace and continuously compensate changes that occur outside this subspace? We would argue that none of the follow-on questions have obvious answers and all of these questions are important. Reviewer 2 had similar concerns, and we have added Figure 3—figure supplements 1 and 2 to address this in more depth. We discuss this in more detail in our response to reviewer 2.
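
To make this comparison concrete, the sketch below contrasts the two decoder classes discussed above. It is illustrative only: function and variable names are ours, a ridge penalty stands in for the regularization documented in the Materials and methods, and cross-validation is omitted.

```python
import numpy as np
from sklearn.linear_model import Ridge

def concatenated_vs_single_day(days_X, days_y, alpha=1.0):
    """Compare one fixed readout fit to all days pooled against
    per-day decoders tested on every day.
    days_X: list of (T_d, N) activity arrays; days_y: list of (T_d,) targets."""
    concat = Ridge(alpha=alpha).fit(np.vstack(days_X), np.concatenate(days_y))
    # R^2 of the single concatenated decoder, evaluated day by day
    concat_r2 = [concat.score(X, y) for X, y in zip(days_X, days_y)]
    # Cross-day matrix: decoder fit on day i, tested on day j
    per_day = [Ridge(alpha=alpha).fit(X, y) for X, y in zip(days_X, days_y)]
    cross_r2 = [[m.score(X, y) for X, y in zip(days_X, days_y)] for m in per_day]
    return concat_r2, cross_r2
```

A concatenated decoder that holds up on every day, while single-day decoders degrade away from the diagonal of the cross-day matrix, is the pattern described above.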

2) It wasn't clear how well this extrapolates across different mice. In some figures, 3 or 4 individuals are compared, others just 2, others yet, just 1 mouse (mouse 4) is mentioned. This is central to the generality of the paper and should be laid out more clearly. Do the concatenated decoder and LMS decoder results hold for more than one individual?

We agree that this was a weakness and we have now addressed it. Overall, we examined five mice, four of which had sufficient neurons recorded for further analyses. We originally focused on two mice (M3, M4) because they had the largest number of tracked days, but the results appear consistent in the other subjects (M1 and M5). We now present supplementary figures for all four mice.

– Figure 3—figure supplement 1 shows that the concatenated decoding results are similar across these four subjects.

– Figure 4—figure supplement 1 shows that the LMS results are general across all four subjects.

3) The arguments about the scaling of the biologically plausible weight adjustments seem a little problematic. It's not clear why the results here form an upper bound on the weight changes needed to maintain accurate decoding.

We agree that this was stated in a glib way; we have clarified this point and substantiated it with further analysis. In essence, the argument is that if a biological neuron or circuit had access to even more neurons than we sampled (which we would expect), then the per-synapse rates of change in such a network would certainly be no larger than for a single readout unit with access to a limited population, and would likely be smaller. As the number of useful connections grows, the per-connection contribution shrinks.

We edited the text and added two supplementary figures to better convey how plasticity rate scales with population size.

Figure 4—figure supplement 2 examines scaling with population size in mice 3 and 4. Due to the limited population recorded, this figure does not address scaling to larger populations. Instead, we fix the required error level to match the performance of the full-population LMS (Figure 4—figure supplement 1). We then consider smaller sub-populations, and increase the learning rate to achieve this target error level. Smaller populations require more plasticity to achieve the same decoding performance.

Figure 4—figure supplement 3 extrapolates LMS performance to larger populations (>1000 neurons) by combining neurons from different mice and aligning behavior on each trial. The resulting population exhibits similar scaling relationships as in Figure 4—figure supplement 2. Both weight magnitude and the rate of weight change decrease for larger populations. We also find that the rate of weight change decreases faster than the weight sizes themselves. This confirms that larger (more redundant) populations can be tracked using less per-synapse plasticity.

Secondly, we realised that it might be difficult to directly interpret LMS weight adjustments in the existing model where we impose an upper limit on the change artificially. To simplify things, we removed the limit on the LMS weight change parameter, and now control plasticity using only the learning rate parameter η. Rather than analyzing the fast fluctuations, we consider only the slow-timescale changes in weights between days, which can be more clearly related (if only qualitatively) to long-term changes in spine sizes or density. These changes are reflected in the revised Figure 4 and associated supplementary figures.
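
For reference, the entire Widrow-Hoff (LMS) update is a one-line local rule in which η is the only plasticity parameter. A minimal sketch follows (our variable names; this is not the analysis code, which is available in the linked repository):

```python
import numpy as np

def lms_decode(X, y, eta=1e-3):
    """Online Widrow-Hoff (LMS) readout: one weight per neuron,
    updated by a local delta rule with learning rate eta.
    X: (T, N) neural activity; y: (T,) behavioral variable."""
    T, N = X.shape
    w = np.zeros(N)
    y_hat = np.empty(T)
    dw_norm = np.empty(T)
    for t in range(T):
        y_hat[t] = w @ X[t]        # linear readout
        err = y[t] - y_hat[t]      # local error signal
        dw = eta * err * X[t]      # delta-rule weight change
        w += dw
        dw_norm[t] = np.linalg.norm(dw)  # per-step plasticity
    return y_hat, w, dw_norm
```

Tracking dw_norm, or (as in the revised Figure 4) the slower day-to-day differences in w, gives a direct handle on how much plasticity the rule actually uses.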

Also, it wasn't clear how the interactions between networks, maintaining congruence, is achieved. That final part of the Discussion was a bit vague.

Thank you, this is useful feedback. We were referring to ideas that are more extensively and clearly articulated in a Current Opinion article that we published last year, which we cite. Nonetheless, our writing in the present manuscript was vague and we have rewritten this paragraph in the Discussion.

We believe that the revised Discussion better emphasizes the insights into drift in PPC population codes provided by our analysis, and more clearly states the questions we addressed. We have also re-written the Discussion to more clearly convey the limitations of our study, and to highlight new experimental and theoretical directions suggested by our results.

Reviewer #2:

Loback et al. re-analyze data from Driscoll et al., 2017, which had previously shown that PPC representations are unstable over days during a delayed VR T-maze task. Here, using linear decoders, they find that a static decoder can do a reasonable job if trained on data from all days, and that an old model of synaptic weight updates can be applied to maintain decoder performance. The analyses seem to have been done reasonably, but the results strike me as rather shallow and are based on limited data.

We appreciate the constructive comments; other reviewers noted similar concerns. We have substantially revised the text and extended the manuscript with deeper analyses across animals. We believe that this revised manuscript addresses these concerns.

We've added Figure 3—figure supplement 1 and Figure 4—figure supplement 1 to show that the results generalize over all four mice considered. Please see our response to reviewer 1, which goes into greater depth regarding results from additional subjects.

We also emphasize that we chose an older and simple model of plasticity (LMS) after considerable deliberation and exploration. The choice was not ad hoc: our goal was not to invent yet another model of plasticity, but instead to evaluate how difficult the problem of drift would be for a simple and biologically plausible learning rule. We evaluated several decoders and learning rules, including nonlinear methods, Gaussian process methods, etc. In all cases, more sophisticated methods required additional assumptions about the mechanism of plasticity and obscured any biological interpretation.

Although it was sometimes possible to get better decoding performance with more sophisticated approaches, this was not our goal. We felt that LMS was more appropriate for lower-bounding the required plasticity to achieve a target decoding performance. The simplicity (and limitations) of LMS made it a useful assay for determining how disruptive drift would be in a biological system. Our reasoning was that if a relatively under-powered local learning rule could track drift, then it would also be very likely that neurons in the brain could do the same (or better), especially with access to a larger PPC population.

We therefore do not believe that the choice of a simple, widely known and parsimonious model is a weakness, but rather a strength. There was no guarantee that such a simple model would work, and the fact that it does is important given the fundamental questions raised by the experimental observations.

The first main result is simply that unstable representations cause single-day (linear) decoders to generalize poorly, but a multi-day decoder to perform somewhat better. I have two issues with this result:

1) Given that activity is sparse and does not have systematic shifts in tuning, this decoding result is very nearly a mathematical necessity. Because of sparsity, the decoder likely ends up built so that different units drive the decoder performance on different sessions. This would not be news. Further, there are no null models analyzed for what would happen under different patterns of tuning shifts, which would be a helpful comparison. It is therefore not clear whether there is anything to be surprised at here.

We have taken on board this concern, especially the issue of sparse activity. We constructed a null model which we now present in Figure 3—figure supplement 2 and associated text in Results. In fact, the performance of a concatenated decoder is far above chance compared to a null model with matched variance, sparsity and random drift. Please also refer to our response to the similar issue raised in reviewer 1's first comment, especially our clarification on the purpose of this study and the non-obvious questions it addresses.
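
A minimal sketch of the idea behind this null model is below: tuning follows an unstructured random walk while the activation (sparsity) pattern from the data is enforced. This is illustrative only; the model in Figure 3—figure supplement 2 also matches within- and between-day variance, which is not reproduced here.

```python
import numpy as np

def random_walk_null(W0, active_mask, n_days, sigma, rng):
    """Sparsity-matched random-walk drift (illustrative).
    W0: (N, K) initial tuning over K task bins; active_mask: (N, K)
    binary activation pattern copied from the data; sigma sets the
    per-day drift magnitude."""
    days = [W0 * active_mask]
    for _ in range(n_days - 1):
        step = sigma * rng.standard_normal(W0.shape)
        days.append((days[-1] + step) * active_mask)  # drift only where active
    return days  # one (N, K) tuning matrix per simulated day

# usage sketch: random_walk_null(W0, mask, 10, 0.1, np.random.default_rng(0))
```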

We also now present further evidence that drift in the data is not random (Figure 3—figure supplement 3) and associated text in Results. The overlap of drift with behavior coding directions is significantly above chance, but a significant proportion of drift still lies in the null space for location coding.

Overall, our results now provide deeper insight and show that drift (partially) preserves important features of population tuning curve statistics. In light of this, we now feel that the original result is stronger: drift is constrained in a way that could make it more disruptive than chance, but we find that a stable subspace exists nonetheless.

After addressing these issues we feel even more strongly that these results are important for the community, especially since several other groups are now examining drift and stability in other brain areas and other model organisms.

2) How stable is the behavior within and across sessions? Details of behavior matter all over the brain (Stringer, Pachitariu et al., 2019, Musall, Kaufman et al., 2019), so it is possible that drift in behavioral details could lead to these shifts as well. At least, it should really be shown whether the parameters the authors track are stable over time.

Yes, we agree. Driscoll et al., 2017 verified that the overall task performance was stable, but behavioral details are also important.

To address this, we have added Figure 2—figure supplement 1. We assessed behavior changes over time, and found systematic changes only in the forward movement of mouse 4. In all other instances we found no systematic changes. While statistically significant, daily fluctuations in behavior were small. Importantly, all behavioral statistics recorded were stable for three of the four mice studied, suggesting that our results are general and do not arise from systematic changes in behavior.

We now refer to this figure in the main text:

"Behavioral variables were stable over time with some per-session variability (mouse 4 exhibited a slight decrease in forward speed over two weeks; Figure 2—figure supplement 1)."

These points said, there is value in this section. The quantifications of instability and of how many neurons are required for good decoder quality are helpful, and the point that only 6% of neurons are even in the top 50% of informativeness is surprising and interesting.

The second major result is the application of the Widrow and Hoff, 1962 model. If I understand correctly, this is primarily a different way of quantifying how fast the tuning changes occur (requiring ~2%/minute weight changes), and secondarily a proof of principle for that model. However, unless I'm missing something, this is a one-mouse result. That would not meet the standard for the field. In addition, comment #2 above applies to this result as well, making it harder to interpret.

Before addressing the LMS issue, we want to point out that the second major result is a quantification of how much drift occurs outside a linear subspace. We find that a non-negligible component does indeed lie outside a linear subspace, thus preventing long-term, reliable decoding by a fixed decoder. Before attempting to find an example of a biologically plausible model that could compensate for this, we quantified the expected per-synapse adjustment that would be required to compensate for this component of drift, independently of a specific learning rule. This result and analysis are presented in Figure 3D–E.
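
The logic of that rule-independent estimate can be sketched as follows: fit an optimal regularized decoder to each day separately, then measure how far its weights must move from one day to the next. Names and the ridge penalty are placeholders, not the exact procedure in the Materials and methods.

```python
import numpy as np
from sklearn.linear_model import Ridge

def required_plasticity(days_X, days_y, alpha=1.0):
    """Fractional day-to-day weight change an ideal linear readout
    would need, independent of any particular learning rule."""
    W = np.array([Ridge(alpha=alpha).fit(X, y).coef_
                  for X, y in zip(days_X, days_y)])  # (n_days, N)
    dW = np.diff(W, axis=0)                          # drift of the optimal weights
    return np.linalg.norm(dW, axis=1) / np.linalg.norm(W[:-1], axis=1)
```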

Turning to the issue of the LMS results, we agree that showing only one example was a weakness. We now provide results from LMS from all mice (Figure 4—figure supplement 1); the results generalise.

Reviewer #3:

Overall, I find that the paper is clearly written and everything seems technically correct. However, I also find that it lacks scientific novelty. While I find the idea of linking the observed reorganization of neural activity with synaptic plasticity exciting, I find that the paper does not quite achieve that. I think the authors would need to work out some concrete consequences/constraints on plasticity for this paper to become viable.

We appreciate this assessment, and share the reviewer's excitement regarding linking neural activity with synaptic plasticity. We need to immediately point out that this was not the sole aim of the paper. The other important aim was to characterise drift as it occurs experimentally and ask if drift is in any way structured or minimally disruptive with respect to a plausible readout mechanism. In doing so we are directly testing a well-known theoretical proposal that 'irrelevant' changes in a neural code can be confined to a null space. Our conclusions to this crucial question are more fully discussed elsewhere in this response and we have revised the manuscript substantially to further articulate them.

Turning back to the problem of relating drift to plasticity, the reviewer will appreciate that it is very difficult to directly connect population activity to synaptic plasticity. Where do we start? How do we avoid making too many assumptions and at the same time provide concrete, interpretable models and numbers that can be directly related to the biological system?

We feel that our revised analysis goes a long way to achieving this by considering several variations of a "plausible worst case" scenario that addresses the most pressing question raised by the data, namely, does activity drift pose an immediate problem for understanding the function of PPC and other cortical circuits? We would argue that our analysis does provide concrete constraints and consequences for plasticity, not just qualitatively, but down to actual numbers that are meaningful given known physiology and connectivity.

Specifically:

– We now quantify the effect of population size on long term decoder performance

– We now quantify how much drift occurs in a linear subspace, finding that a non-negligible component does not reside in a linear subspace, and will eventually degrade a fixed readout to chance levels

– We quantify the extent to which a 'best subset' of neurons exists in the population, finding that this subset turns over completely and surprisingly rapidly

– We find a way to estimate how much plasticity would be required, under reasonable and clear assumptions, to compensate for drift independently of a plasticity mechanism

– We provide a parsimonious, biologically plausible example of a specific learning rule that can, indeed, achieve this compensation

We would urge the reviewer to contemplate what an alternative approach would consist of that could better address these issues. This is not to say that our original manuscript did not have weaknesses. We failed to include analyses across animals and didn't go as deep as we could have in the analyses that we performed. We also didn't fully articulate the main questions and goals of the study in the very terse manuscript we originally submitted, so some of the above contributions were easy to overlook. We have addressed this as well as adding new results, as detailed below in this response. We believe the manuscript is now clearer and stronger.

Broadly speaking, the current study is divided into two parts. The first part is a re-analysis of the data of Driscoll et al., 2017, which is performed in Figures 1-3. The authors use decoding methods to retrieve task information from the population activity. While some of the details of the population decoding methods are different to those used by the Driscoll et al., the overall conclusions are the same. The strongest point of the re-analysis is that the authors more clearly quantify the strength of the day-to-day changes using decoders that are constrained to change only little over days. That is a nice twist that was not performed in the Driscoll paper.

The second part of the paper is an attempt to relate these day-to-day changes to synaptic plasticity (Figures 3 and 4). This part is rather brief and quite sketchy. Roughly, the authors simply reformulate the constrained decoder as an adaptive decoder. Conceptually, that is similar to the ideas brought forward by Rokni et al., Ajemian et al., and others. What could make this part interesting, is if this link could be made stronger, i.e., if it could really be a link to synaptic plasticity, rather than a link to a hypothetical readout. But even if the authors limit themselves to a single readout neuron, many questions are left unaddressed, e.g. how to extrapolate the adaptation rules for the decoder to realistic network sizes.

Although we cannot access synaptic plasticity directly in these data, we feel that our decoding-based analysis can provide a useful approach for studying constraints on plasticity from recordings of population activity alone.

As outlined in more detail in our responses to reviewers 1 and 2, we have added several new supplementary analyses. These analyses show that the results are general across subjects (Figure 3—figure supplement 1, Figure 4—figure supplement 1), and that observed drift is structured (Figure 3—figure supplements 2, 3). Drift aligns with neural activity, especially noise correlations (Figure 3—figure supplement 3).

We also now address scaling of these results with network size. In Figure 4—figure supplement 2 we show that the required rate of plasticity increases for smaller network sizes, for two mice (M3, M4).

Extrapolating the results to large networks was more challenging, but we were able to construct a synthetic population by aligning trials from different mice in pseudo-time (Figure 4—figure supplement 3). Although this analysis extends over only 6 days, it scales to >1000 neurons and shows that the required plasticity continues to decrease as more neurons are added.

We feel that this report is useful for the community, as many groups are beginning to study drift and plasticity in other brain areas and in other model organisms. We feel that the decoding-based approach to drift is a useful foundation, and that our results will contribute to further experimental and theoretical work on this topic.

Overall, we would summarize the contributions of our revised work as follows:

– While there has been speculation on how to reconcile stable representations with drift in neuronal tuning, this study tests these ideas against experimental data.

– Our work highlights that drift must be structured if it is to preserve population-coding statistics, and our analysis shows that drift dynamics are indeed structured far above chance.

– We find that drift consists of daily fluctuations around a more stable substructure, which nevertheless changes over weeks to months.

– We find that some, but not all, drift occurs in a linear coding subspace. This has immediate implications for existing theories of circuit function.

– Our modelling demonstrates that this structured drift could allow a readout neuron to readily compensate for changes in the neural code, and quantifies the constraints on plasticity and connectivity independently of specific learning rules while also providing a specific example of a plausible model that can operate within these constraints.

– Our results motivate further experiments to search for neural correlates of error signals between brain areas, which we believe would be required to maintain consistency between drifting representations.

– Our results also motivate future theoretical treatment of the underlying cause of drift, how it is related to plasticity, learning and biological noise and whether it is expected to be a universal feature of large, adaptive neural circuits.

Other comments:

1) It was not clear to me what happens with the decoders within a session and between days. Do decoders 'jump' between days or stay roughly the same? How does that influence the adaptation rules?

Agreed. We changed the plotting code so that Figure 4B, Figure 4—figure supplement 1, and Figure 4—figure supplement 3 show discontinuities between days. Per-day fluctuations are present, and can sometimes even lead to improvements across days.

Overall, sharp "jumps" in the LMS error are rare, since LMS tracks close to optimal performance.

2) Legend of Figure 4 and subsection “Biologically plausible weight adjustment can compensate for ongoing reconfiguration of PPC activity”. You repeatedly state that you approach the 'concatenated decoder.' I guess that should be the 'constrained decoder', otherwise it makes no sense to me.

Thanks; we have changed the caption in Figure 4 to read:

"a decoder trained over all testing days".

[Editors’ note: what follows is the authors’ response to the second round of review.]

[…] Points left to address in full:

1) Please go back and consider the more interesting null model in the other analyses and quantifications in this manuscript. This will improve many other parts of the paper. Please also place this new null model result in the main text of the paper.

1) We have included a new null model in the main figures with sparsity matched to the data (Figure 3D); this supersedes the original null model and it is discussed in further detail in response to Points 2 and 3 below.

2) We have used the null model to evaluate levels of plasticity and performance of local ongoing drift compensation with the LMS algorithm.

3) We have constructed a new rank-constrained model of drift in Figure 4—figure supplement 1, which quantifies the level of constraint needed to best match drift to data.

4) We have evaluated null models for head direction and velocity; results were similar but we have omitted the quantification because it would add numerous figure panels without adding any insight.

5) We have added text interpreting and discussing the new null models in the main text.

2) Regarding the new null model:

Past evidence has clearly shown that neural tuning (or population activity) changes both randomly (assumed to be due to plasticity noise) and directionally (assumed to be due to feedback and learning). With the new null model, the analyses attempt to rule out a random walk. This is a valuable effort. However, please add commentary on how this null model is useful despite ignoring the influence of the systematic, directional changes which were already demonstrated in the past, including the authors' own data, and which have usually been related to ongoing learning.

We have designed and analyzed a new variation of the (sparsity matched) null model that constrains drift to low rank subspaces and quantifies how rank affects the degradation of the code with respect to static readout weights. This is presented in a new Figure 4—figure supplement 1. As discussed in the revised manuscript, this new analysis shows that drift in the data can be quantified in terms of both a random and a systematic component and that drift is far more systematic than would be expected by chance. By modelling drift as confined to a subspace we are now able to provide and interpret a measure of how systematic the drift is in terms of the subspace rank that best matches the data.
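
The construction can be sketched as follows: pooled drift steps are projected onto their top-k principal directions, so that all simulated drift is confined to a rank-k subspace, and k quantifies how systematic (low-rank) versus diffuse the drift is. This is a sketch of the idea only; the full model in Figure 4—figure supplement 1 is also sparsity-matched as above.

```python
import numpy as np

def constrain_drift_rank(dW, k):
    """Project drift steps dW (n_steps, N) into the rank-k subspace
    spanned by their top-k right singular vectors."""
    U, s, Vt = np.linalg.svd(dW, full_matrices=False)
    Vk = Vt[:k]              # basis for the rank-k drift subspace
    return dW @ Vk.T @ Vk    # drift with all variance confined to that subspace
```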

We note that the review comments here neglect the alignment analysis in the previous version of the manuscript (now in main Figure 4C, D), which again shows evidence of (and quantifies) the systematic and random components of drift.

We have now extensively modelled, analyzed, interpreted and discussed systematic vs. random drift. Nonetheless, the null model in Figure 3D is useful precisely because it omits systematic changes in the population code. As we outline in the manuscript, the purpose is to illustrate that random, diffusive drift would rapidly degrade a downstream readout. The fact that the null model performs worse than the data confirms that the systematic structure present in the drift makes it far less destructive to a linear readout than expected by chance. The modifications we have made in the revision also now show that sparsity doesn’t make this finding trivial. Moreover, we still see a slow degradation of decoding within the data, which motivates the later analyses that quantify how much additional plasticity would be required of a downstream area. For the data we have, these analyses together show that in the long run, regardless of a systematic component, drift degrades an optimised static linear readout, indicating a need for ongoing plasticity.

Finally, to clarify once more: these data were explicitly gathered outside of ongoing learning. Behavioral performance had plateaued before imaging began. Any additional change in neural activity is not a feature of measurable behavioral improvement. We posit that systematic changes in activity are a feature of the maintenance of learned behaviors. The original 2017 paper alluded to these ideas but stopped short of demonstrating them in the analysis. The decoders used in the original paper simply demonstrated the utility of using a large number of cells for decoding a single binary variable: trial type (i.e. decoding a single bit of information). It is highly unlikely that the activity in PPC amounts to only 1 bit. The reviewers will therefore recognise that the success of decoding a single bit doesn’t say much, if anything, about the extent to which drift damages or preserves information in a static readout; it simply says that drift doesn’t completely destroy a small amount of information within a limited timescale. It also doesn’t say anything about how readout weights might be learned/maintained.

3) Please address in full the expanded review comments sent during the initial appeal. That text is reproduced here:

Thank you for sending us your thoughts and questions about the reviewer comments. […] Without that, we would have expected that the concatenated decoder would work well.

It turns out that the reviewer’s assertion about the concatenated decoder is incorrect. In the new null model we now match the sparsity of the activity in the data using the activation patterns themselves. A comparison with the previous (non-sparse) null model is shown in Author response image 1.

Author response image 1.

We see that the degradation of a concatenated decoder on the sparsity-matched null model is in fact more severe than on the original null model. This shows, contrary to the reviewers' intuition, that ‘sparse’ representation of the task variables does not make the existence of a multi-day decoder trivial. In fact, it makes its existence statistically less likely. Intuitively, this is because the type of sparseness in the data corresponds to only a small number of cells representing a given range of a task variable (e.g. a handful of cells active in a particular velocity range). Any drift that affects a significant proportion of these cells cannot be compensated by other cells in the population, unlike in a non-sparse case where a given cell may have activity spread over a large range of task space.

Nonetheless, the reviewer’s challenge was useful because it prompted us to construct a more relevant null model which shows that the structure of drift in the data is even less likely to occur by chance than one might suppose.

Second, as it was understood by the reviewers, the manuscript argues that we should be surprised that updating decoder weights with the Widrow and Hoff model works here. From Driscoll, 2017, we have an idea of how rapidly location selectivity changes, and how rapidly a static decoder decays. Given that we know this, how rapidly would you expect to have to change the decoder? We didn't see much in the paper that wasn't just a different way of quantifying the same tuning changes. One reviewer suggested adding more specific null models because they think this would let the authors answer these deeper questions. For example, are all of the neurons smoothly changing their tuning? Do some change fast and others slow, and is this a continuous distribution? Is there coordination between neurons' tuning changes or are neurons changing independently? The current null models are extremes: the shuffle is related to a model where everything changes instantly (obviously wrong), and the same-day decoder is equivalent to there being no changes ever (which we know is wrong from Driscoll, 2017). So, what new have we learned?

First we stress that these results aren’t the only two contributions of this study; we have enumerated the key contributions below.

Addressing this point, we have now used the (new) null model of drift to assess how well online compensation with the Widrow-Hoff LMS algorithm might be expected to perform. We find that the performance of LMS on a sparsity-matched null model of drift is substantially worse than on the data. Thus, the results in this section are far from trivial and cannot be taken for granted. We have quantified these results in the text that accompanies Figure 5:

"These results suggest that small weight changes could track representational drift in practice. […] This further indicates that drift is highly structured, facilitating online compensation with a local learning rule."

Finally, we would remind the reviewers that it is one thing to have a hunch or suspect that something may be possible. It is quite another to explicitly demonstrate it and to find a means for doing so. We thus believe the main value of this specific result is not that it has some kind of ‘shock’ value, but that it is a principled and informative scientific analysis: we established a way to place theoretical bounds on levels of plasticity required to compensate drift, independently of any learning rule (Figure 4); we then showed they could be achieved using a biologically plausible learning rule (Figure 5). Neither of these steps is obvious or trivial. Both are meaningful.

Finally, regarding the result that only 6% of neurons remain in the top 50% across all 10 days: again, we lack the context to know how surprised we should be. If we suppose that the informative neurons are chosen at random each day, then we would expect the fraction of neurons in the top 50% for all 10 days to be 0.5⁹ ≈ 0.2%. In that case, 6% is surprisingly high. Looking at Figure 2B of Driscoll et al., 2017, ~40% of neurons keep their place preference for 10 days. In that case, 6% is surprisingly low. In fact, why is it so low? Could this just mean the decoder is under-regularized?
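As a quick check of this arithmetic (assuming, as in the reviewer's framing, that top-50% membership after the first day amounts to nine independent coin flips):

    p = 0.5 ** 9
    print(f"chance level: {p:.4%}")  # 0.1953%, i.e. roughly 0.2%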

We used regularization in the linear models. This is now fully documented in the Materials and methods, and it does not affect the ranking of the ‘best subset’ of cells.
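For illustration, a ridge-regularized readout and the resulting cell ranking can be sketched as follows (synthetic data and placeholder penalties; this is not the exact procedure documented in the Materials and methods). With well-conditioned inputs, ridge shrinks all weights roughly uniformly, so the ranking of the largest weights is insensitive to the penalty:

    import numpy as np

    def ridge_weights(X, y, lam):
        # Closed-form ridge regression: w = (X'X + lam*I)^{-1} X'y
        return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

    rng = np.random.default_rng(2)
    X = rng.normal(size=(1000, 50))  # trials x cells (synthetic)
    y = X @ rng.normal(size=50) + rng.normal(size=1000)

    for lam in (0.1, 1.0, 10.0):
        ranking = np.argsort(-np.abs(ridge_weights(X, y, lam)))
        print(f"lam={lam}: top cells {ranking[:5]}")  # stable across penalties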

What is happening here is the following: neurons with stable tuning peaks can exhibit unstable signal-to-noise ratios. In other words, the location of maximum firing may change little, while the profile of firing away from the peak changes substantially. As a result, decoders that rely on previously informative or stable cells eventually suffer when those cells become less reliable.

Therefore, the measure of tuning-curve stability used in Driscoll et al. is not the correct measure for assessing stability with respect to a downstream neuron with fixed synaptic weights. This highlights the importance of the decoding perspective in Loback et al. We now clarify this in the text:

"For all subjects, no more than 1% of cells were consistently ranked in the top 10%, an no more than 13% in the top 50%. We confirmed that this instability was not due to under-regularization in training (Materials and methods: Best K-Subset Ranking)."

This instability might seem surprising, since Driscoll et al., 2017, found that ∼40% of cells were tuned to similar preferred locations over time. We find that even this ‘stable’ subset exhibited daily variations in signal-to-noise ratio (SNR) with respect to task decoding. For example, no more than 8% of the neurons in the top 25% for tuning-peak stability were also consistently in the top 25% for SNR across all days. If a neuron becomes relatively less reliable, the weight assigned to it may become inappropriate for decoding.
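A toy example (all numbers invented) illustrates the point: a cell can keep exactly the same preferred location while its decoding SNR fluctuates, because the tuning flanks and the noise level change even when the peak does not:

    import numpy as np

    rng = np.random.default_rng(3)
    x = np.arange(50)
    peak = 25  # preferred location, stable across days
    for day in range(5):
        width = rng.uniform(2, 10)     # flank shape varies day to day...
        noise = rng.uniform(0.1, 1.0)  # ...as does the noise level
        tuning = np.exp(-0.5 * ((x - peak) / width) ** 2)
        snr = tuning.var() / noise ** 2
        print(f"day {day}: peak at {tuning.argmax()}, decoding SNR ~ {snr:.2f}")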

Associated Data


    Data Citations

    1. Driscoll LN. 2020. Data from: Stable task information from an unstable neural population. Dryad Digital Repository.

    Supplementary Materials

    Transparent reporting form

    Data Availability Statement

    Datasets recorded in Driscoll et al., 2017, are available from the Dryad repository (https://doi.org/10.5061/dryad.gqnk98sjq). The analysis code generated during this study is available on GitHub (https://github.com/michaelerule/stable-task-information; copy archived at https://github.com/elifesciences-publications/stable-task-information; Rule, 2020).


    The following dataset was generated:

    Driscoll LN. 2020. Data from: Stable task information from an unstable neural population. Dryad Digital Repository.

