Author manuscript; available in PMC: 2017 Jan 11.
Published in final edited form as: Nat Neurosci. 2016 Jul 11;19(9):1234–1242. doi: 10.1038/nn.4342

Fast and slow transitions in frontal ensemble activity during flexible sensorimotor behavior

Michael J Siniscalchi 1, Victoria Phoumthipphavong 2, Farhan Ali 2, Marc Lozano 2, Alex C Kwan 1,2,3
PMCID: PMC5003707  NIHMSID: NIHMS796564  PMID: 27399844

Abstract

The ability to shift between repetitive and goal-directed actions is a hallmark of cognitive control. Previous studies have reported that adaptive shifts in behavior are accompanied by changes of neural activity in frontal cortex. However, neural and behavioral adaptations can occur at multiple time scales, and their relationship remains poorly defined. Here, we developed a novel adaptive sensorimotor decision-making task for head-fixed mice, requiring them to shift flexibly between multiple auditory-motor mappings. Two-photon calcium imaging of secondary motor cortex (M2) revealed different ensemble activity states for each mapping. Notably, when adapting to a conditional mapping, transitions in ensemble activity were abrupt and occurred before the recovery of behavioral performance. By contrast, gradual and delayed transitions accompanied shifts towards repetitive responding. These results demonstrate distinct ensemble signatures associated with the start versus end of sensory-guided behavior, and suggest that M2 leads in engaging goal-directed response strategies that require sensorimotor associations.


Operant behaviors are structured around stimuli, actions, and outcomes. Successful execution of a task requires selecting actions that are consistent with the contingencies between these task variables. Importantly, control of action selection in the brain should be both stable and flexible. On the one hand, stability allows a subject to sustain high performance to maximize reward. On the other hand, flexibility is essential for quickly adjusting behavior when a change in contingencies occurs. Striking the delicate balance between stability and flexibility is therefore a key requirement of adaptive decision-making. Moreover, a lack of balance between these opposing aspects of cognitive control is a hallmark of psychiatric disorder1.

How do we know when to be stable or flexible in a changing environment? In tasks without explicit contextual cues, subjects may adjust their response strategy through reward feedback. Prior studies have observed task-dependent differences in neuronal firing rates and selectivity in multiple frontal cortical regions2–4. Notably, during periods of behavioral adjustment, evolution of cortical activity was found to be gradual and late, occurring at time courses that generally match or lag the improvement in task performance5–8. However, neurons in the frontal cortex exhibit substantial cell-to-cell variability in such time courses5. Population activity may therefore be more useful for capturing circuit dynamics9–11. Using ensemble recordings, two studies examined reward-guided adaptations, and found the corresponding changes in network activity to be surprisingly abrupt12,13. Determining the functional significance of these findings, however, will require quantitative comparisons of ensemble activity transitions that differ in their dynamics. Transitions that are relatively gradual versus abrupt, or that differ in onset with respect to behavioral changes, could reflect distinct underlying mechanisms for cognitive control.

To study adaptive sensorimotor decision-making in mice, we designed a novel head-fixed task that requires animals to shift many times between three sets of stimulus-response contingencies. This task is a variant of arbitrary sensorimotor mapping, a classic paradigm in which subjects are required to follow “conditional rules”14,15, such as 'for stimulus A, perform one action; for stimulus B, perform another action.' Once learned, the stimulus-response contingencies can then be switched, requiring the learning of novel mappings or retrieval of familiar associations. Associations are made by linking non-spatial stimuli or conditions to actions, and are therefore termed arbitrary16. A number of brain regions are involved in arbitrary sensorimotor mapping, including the frontal lobe, striatum, hippocampus and thalamus16. Within the frontal lobe, the dorsal premotor cortex has been implicated in the selection of motor programs based on antecedent conditions, as evidenced by the results of lesion studies17–19, electrophysiology5,6, functional imaging20,21, and transcranial stimulation22 in humans and non-human primates.

Secondary motor cortex (M2) has been described as a potential rodent homolog of primate higher-order motor areas23,24. Its location, adjacent to the medial prefrontal and primary motor regions, suggests that it may function as a cognitive-motor interface. A long line of research has linked the premotor cortex and neighboring regions to the generation of volitional movements25–27. Recent studies in rodents have also focused on the role of M2 in driving motor actions. Random-ratio lever-pressing was shown to become insensitive to reward devaluation in M2-lesioned mice, suggesting a role in goal-directed actions28. Neural activity is modulated prior to movement, reflecting involvement in action preparation and initiation29–31. Moreover, M2 neurons not only encode current action, but also prior choice and outcome, indicating a broader role in decision-making29. One early study showed that rats with lesions of medial frontal cortex, a broader region including M2, had deficits in a visual conditional motor task32.

To elucidate the relationship between frontal ensemble activity and adaptive behavior, we used two-photon calcium imaging to record from M2 neurons in behaving mice. We found distinct population activity patterns associated with each of the three sets of stimulus-response contingencies. Moreover, following a contingency switch, transitions between ensemble patterns occurred earlier and were more abrupt when animals were required to abort repetitive actions and use a conditional rule. In fact, this change in ensemble activity state could be detected after only a few error trials, preceding the more gradual recovery of behavioral performance. Our results uncover distinct neural transitions associated with different phases of voluntary behavior, and identify a leading role for M2 in engaging actions that require the use of sensorimotor associations.

Results

An adaptive decision-making task for head-fixed mice

We trained head-fixed mice to perform a task requiring flexible sensorimotor mapping. In each trial, fluid-restricted mice were presented with one of two randomized auditory stimuli – either logarithmic frequency-modulated sweeps from 5 to 15 kHz (“upsweep”), or from 15 to 5 kHz (“downsweep”) – and had to respond with a lick to the left or right port (Fig. 1a, Supplementary Video 1). A correct response was rewarded with 2 μL of water, while an incorrect response resulted in white noise. Trials were organized into blocks (Fig. 1b), each with a distinct set of stimulus-response contingencies: “sound-guided” (upsweep-left; downsweep-right), “action-left” (upsweep-left; downsweep-left), and “action-right” (upsweep-right; downsweep-right). When performance reached a criterion of 85% correct over 20 trials, a new block began with different contingencies. Sound and action blocks alternated, and no contextual cue was given to signal the block transition. Therefore, performance beyond the first block required flexible response selection and outcome monitoring. Mice were prepared for this task by initial training to an expert level on two-choice auditory discrimination, i.e. ~30 days on a task with only sound-guided trials. Here, we present data from mice with fewer than six sessions of experience in the adaptive decision-making task.
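The block structure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' task-control code: the names `CONTINGENCIES`, `is_correct`, and `trials_to_criterion` are hypothetical, and the criterion is assumed to mean at least 17 of the last 20 trials correct.

```python
from collections import deque

# Stimulus-response contingencies for the three block types, as listed in the text.
CONTINGENCIES = {
    "sound": {"upsweep": "left", "downsweep": "right"},
    "action-left": {"upsweep": "left", "downsweep": "left"},
    "action-right": {"upsweep": "right", "downsweep": "right"},
}

CRITERION_PERCENT = 85  # percent correct required to trigger a block switch
WINDOW = 20             # number of most recent trials evaluated


def is_correct(block, stimulus, response):
    """Return True if the lick direction matches the block's mapping."""
    return CONTINGENCIES[block][stimulus] == response


def trials_to_criterion(outcomes):
    """Return the 1-based trial at which criterion is first met, or None.

    `outcomes` is a sequence of booleans (True = correct) for one block.
    Assumption: criterion means >= 85% correct over the last 20 trials.
    """
    recent = deque(maxlen=WINDOW)
    for trial, correct in enumerate(outcomes, start=1):
        recent.append(bool(correct))
        # Integer comparison avoids floating-point edge cases at exactly 85%.
        if len(recent) == WINDOW and sum(recent) * 100 >= CRITERION_PERCENT * WINDOW:
            return trial
    return None
```

For example, a block that opens with five errors followed by a run of correct responses reaches criterion once the sliding window contains 17 correct trials.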

Figure 1. Behavioral performance of head-fixed mice in an adaptive sensorimotor decision-making task.

(a) Schematic of experiment. Each trial begins with an auditory cue. A response window starts 0.5 s after cue onset, during which the first lick is recorded as the response for that trial. Water reward is delivered contingent on a correct response.

(b) There are three trial types, which vary by their cue-response mappings. In sound-guided trials, the correct response is left for the upsweep sound (5-to-15 kHz frequency-modulated) and right for the downsweep sound (15-to-5 kHz). For action-guided left and right trials, the correct responses are left and right, respectively, for either sound cue. Trials of the same type are presented in blocks. Block switches, in which a new trial type is introduced, occur when the correct rate reaches 85% for the last 20 trials.

(c) Behavioral performance surrounding a block switch, either from action to sound (top) or sound to action (bottom). Filled circle, hit rate. Open circle, perseverative error rate. Dotted line, other error rate. Mean±s.e.m. of 33 action to sound switches and 38 sound to action switches.

(d) Performance from one example behavioral session. Each trial results in 1 of 4 outcomes: correct (filled circle), perseverative error (open circle), other error (open triangle), or miss (cross). Vertical line, block switch.

(e) Lick rates detected at the left and right lick ports for upsweep or downsweep sound cues during all correct sound-guided (black), action-left (red), and action-right (blue) trials. For each choice direction, lick rates in action trials were compared with those in sound trials in 0.1 s bins, and the bars atop the panels denote significant differences (p<0.01, paired t-test). Line, mean. Shading, ±s.e.m.

n = 9 sessions from 5 mice.

As expected, a switch in contingencies was associated with an immediate drop in correct response rate (Fig. 1c, d). Most incorrect responses were perseverative errors, indicating a failure to update response strategy for ~20 trials after the switch. We obtained concurrent calcium imaging and behavioral data during 9 sessions from 5 mice (Supplementary Table 1). On average, these mice performed 418±49 trials per session, including 296±38 rewarded trials and 9±1 block switches (mean±s.e.m.; range: 6–19 switches; Supplementary Fig. 1a). To quantify motor output, we calculated the mean lick rates and the time of first lick for different trial types. Overall, licks were tightly locked to the time of auditory cue during correct trials (Fig. 1e). For congruent trials (in which stimulus-response contingencies match), lick rates were indistinguishable across sound and action blocks. For incongruent trials (e.g., left action for upsweep during sound block vs. downsweep during action-left block), there was a noticeable difference in mean lick rates and an increased latency to first lick (Supplementary Fig. 2). Nevertheless, the major determinant for the shape of the lick distribution was response direction, i.e. whether the animal chose left or right (Fig. 1e). Additionally, we used video tracking to monitor whisker and hindpaw positions, and found that their movements also depended mostly on response direction (Supplementary Fig. 3). Therefore, although tongue licks were the means for making operant responses in this head-fixed setup, mice performed more complex motor programs to indicate their choices.
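The four trial outcomes used in Fig. 1 can be made concrete with a short sketch. The definition of a perseverative error used here is an assumption: an incorrect response that would have been correct under the previous block's contingencies. The function name `classify_outcome` and the `MAPPINGS` table are hypothetical.

```python
# Contingencies for each block type, taken from the task description.
MAPPINGS = {
    "sound": {"upsweep": "left", "downsweep": "right"},
    "action-left": {"upsweep": "left", "downsweep": "left"},
    "action-right": {"upsweep": "right", "downsweep": "right"},
}


def classify_outcome(block, previous_block, stimulus, response):
    """Label a trial as correct, perseverative error, other error, or miss.

    `response` is "left", "right", or None when no lick was recorded.
    `previous_block` may be None for the first block of a session.
    """
    if response is None:
        return "miss"
    if MAPPINGS[block][stimulus] == response:
        return "correct"
    # Assumption: an error is perseverative when the response follows
    # the mapping that was rewarded in the preceding block.
    if previous_block is not None and MAPPINGS[previous_block][stimulus] == response:
        return "perseverative error"
    return "other error"
```

Under this definition, a left lick to a downsweep just after an action-left block gives way to a sound block counts as perseverative, whereas a right lick to an upsweep in the same block counts as an other error.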

Silencing M2 selectively impairs shift to sound-guided actions

To determine whether frontal cortical activity is necessary for adaptive decision-making in our task, we used the GABAA receptor agonist muscimol to inactivate M2 bilaterally. Muscimol (5 mM, 46 nL per hemisphere) or saline vehicle was injected ~1 hr before behavioral testing (n=11 mice; Fig. 2a, Supplementary Table 1). We injected low-molecular-weight fluorescein to estimate the extent of the affected region, which included M2 and Cg1, but not other neighboring regions (Supplementary Fig. 4). Compared to controls, muscimol-injected mice performed fewer trials (Fig. 2b; saline: 608±42, muscimol: 476±31, mean±s.e.m.; p=0.007, W=62, Wilcoxon signed-rank test), although there was no difference in the number of switches per 100 trials (saline: 2.7±0.2, muscimol: 2.7±0.1, mean±s.e.m.; p=0.96, W=34). Notably, separate analyses of sound and action blocks revealed selective impairments in the animals’ ability to engage sound-guided actions, evidenced by a marked (55%) increase in the number of perseverative errors per block (Fig. 2c; saline: 5.7±1.1, muscimol: 8.9±1.9, mean±s.e.m.; p=0.042, W=10, Wilcoxon signed-rank test), and a greater number of trials to reach criterion (saline: 38±4, muscimol: 48±6, mean±s.e.m.; p=0.042, W=10). Nevertheless, muscimol-injected mice eventually reached the criterion of >85% correct, indicating that the transition to a high level of performance was slowed, but not blocked, by M2 inactivation. Intriguingly, silencing had the opposite effect on shifts into action blocks, during which the mice required fewer trials to reach criterion, although this effect fell short of statistical significance (saline: 43±4, muscimol: 32±2, mean±s.e.m.; p=0.054, W=55). Inactivation had no effect on the timing or rates of lick motor output (Supplementary Fig. 5). These results indicate a causal role for M2 in the flexible control of action selection.
Additionally, the opposing effects of silencing are useful for understanding how the mouse performs the outlined task. One solution to the task would be to forget and re-learn the relevant stimulus-response associations after each contingency change, similar to a reversal task33. This approach predicts symmetric changes in behavior following perturbations. An alternative approach would be to rely on these associations for sound-guided trials, and then ignore them during action blocks to favor repeated selection of the same response. In this case, the mouse would perform the task by shifting the balance between conditional and non-conditional means of responding. The asymmetric deficits observed in our experiments are consistent with the second approach, and implicate M2 in the breaking of repetitive actions and biasing choices based on learned associations.

Figure 2. Bilateral inactivation of secondary motor cortex impairs adjustment to sound-guided trials.

(a) Schematic of experiment.

(b) Task performance after bilateral infusion of saline vehicle (Veh) or muscimol (Mus) into M2.

(c) Effects of muscimol infusion on action-to-sound and sound-to-action block shifts.

Gray lines, individual paired experiments. Bar, mean±s.e.m. Wilcoxon signed-rank test. n = 11 mice.

Imaging task-related activity at cellular resolution in M2

To characterize neural activity, we injected adeno-associated viruses encoding GCaMP6s into layer 2/3 of M2 (AAV1-Syn-GCaMP6s-WPRE-SV40; Fig. 3a). GCaMP6s is a genetically encoded calcium indicator that exhibits a ~25% rise in fluorescence intensity per action potential in cortical pyramidal neurons34. While mice performed the adaptive decision-making task, we used two-photon microscopy to record from 62±6 cells per field of view (mean±s.e.m.; range, 26–83 cells; n=9 sessions from 5 mice; Fig. 3b). Figure 3c shows four example M2 neurons with fluorescence transients (ΔF/F) concurrent with responses during sound-guided trials. To examine how the use of conditional rules affects the activity of individual neurons, we averaged ΔF/F across correct trials for the congruent upsweep-left and downsweep-right conditions, separately for sound and action blocks. Neural responses were diverse, even for neurons within the same field of view (Fig. 3d). During sound-guided trials, neurons could exhibit higher ΔF/F for specific associations, i.e. upsweep-left (cell 2) or downsweep-right (cells 1 and 3), or have no preference (cell 4). The use of conditional rules clearly modulated ΔF/F in some neurons (cells 1, 2, and 3), and in other cases had no effect (cell 4).

Figure 3. Two-photon calcium imaging of task-related activity in secondary motor cortex.

(a) Example post hoc and (b) in vivo two-photon images of GCaMP6s-expressing neurons in layer 2/3 of M2.

(c) Fractional fluorescence changes (ΔF/F) in example M2 neurons during performance of sound-guided trials. Vertical line indicates the time of response associated with correct left (solid black), correct right (dotted black), or incorrect (magenta) trials.

(d) Trial-averaged ΔF/F of four M2 neurons for correct left (solid line) and correct right (dotted line) responses in sound-guided (black) and action-guided (red) trials. Line, mean. Gray shading, 95% confidence intervals.

Neural transition is more rapid during shift to sound rule

The observed heterogeneity of neural responses opened the question of whether single-neuron activity in M2 reflects the components of an ensemble representation for specific task variables. If so, then population-level analyses might more effectively capture the content of such representations. Toward this end, we calculated population activity vectors from ΔF/F and used demixed principal component analysis35,36 to project the vectors in a reduced representational space (see Methods). Plotting these vectors over time generates trajectories describing the time-dependent evolution of ensemble activity during behavior. To determine how the ensemble activity evolved around block switches on a trial-by-trial basis, we calculated the Mahalanobis distances between population activity vectors of each trial and those of the 20 trials prior to the last or next block switch. Following a contingency switch, we found that the ensemble activity migrated away from the previous representational subspace, toward a new subspace associated with the new rule (Fig. 4a). Notably, comparisons of the transition dynamics following a switch into conditional versus non-conditional rules uncovered marked differences. Out of 33 action-to-sound and 38 sound-to-action transitions, 33 and 35 respective switches could be fitted to a logistic function to compare the onset and rate of shifts in population activity patterns (Fig. 4b, c). State transitions associated with the shift to sound-guided responses occurred after only several trials, much earlier than with shifts into repeated actions (Fig. 4d, g; sound: xo=4.0, action: xo=10.4, median; p=0.007, z=2.70, Wilcoxon rank-sum test). Furthermore, breaking from repetitive to sound-guided responding involved transitions that were more abrupt (sound: k=1.02, action: k=0.35, median; p=0.03, z= −2.17, Wilcoxon rank-sum test). 
These differences in neural dynamics were not due to behavioral differences, because in this set of experiments, trials to criterion were similar for the two rule types (sound: 39, action: 38, median; p=0.9, z=0.09, Wilcoxon rank-sum test; Supplementary Fig. 1a). Overall, these results suggest that ensemble activity patterns in M2 shift earlier and more steeply when animals are required to abort repetitive actions and engage conditional associations to perform sound-guided behavior.
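The trial-by-trial location measure built from Mahalanobis distances can be illustrated with a self-contained sketch. It is not the authors' analysis code: the helper names are hypothetical, the reference sets stand in for the 20 pre-switch trials of the previous and current blocks, and the input vectors are assumed to be already projected into a low-dimensional space. The covariance is estimated separately from each reference set, which is one plausible reading of the procedure.

```python
import math


def mean_vector(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def covariance(points, mu):
    """Sample covariance matrix (n - 1 normalization)."""
    n, d = len(points), len(mu)
    return [[sum((p[i] - mu[i]) * (p[j] - mu[j]) for p in points) / (n - 1)
             for j in range(d)] for i in range(d)]


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    d = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(d):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][d] / m[i][i] for i in range(d)]


def mahalanobis(x, reference):
    """Mahalanobis distance from vector x to the reference set of vectors."""
    mu = mean_vector(reference)
    diff = [xi - mi for xi, mi in zip(x, mu)]
    z = solve(covariance(reference, mu), diff)  # z = inverse(cov) @ diff
    return math.sqrt(sum(di * zi for di, zi in zip(diff, z)))


def distance_ratio(x, origin_trials, dest_trials):
    """d_origin / (d_origin + d_dest): 0 at the origin state, 1 at the destination."""
    d_origin = mahalanobis(x, origin_trials)
    d_dest = mahalanobis(x, dest_trials)
    return d_origin / (d_origin + d_dest)
```

A ratio near 0 means the current trial still resembles the previous block's activity, and a ratio near 1 means it resembles the new block's stabilized activity. The sketch requires a non-singular covariance, so the reference set must contain more trials than dimensions.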

Figure 4. Transitions in ensemble activity occur earlier and are more abrupt following switch to sound-guided trials.

(a) A schematic illustrating ensemble activity dynamics around a block switch. Each curved line represents a single-trial neural trajectory deduced from calcium imaging data. When the contingencies switch, neural trajectories move within the representational space. Trial-by-trial location of ensemble activity patterns was determined by calculating a ratio of Mahalanobis distances, d_origin/(d_origin + d_dest), where d_origin and d_dest are the Mahalanobis distances from the neural trajectory in the current trial to those of the 20 trials pre-switch for the last and current blocks, respectively.

(b) Trial-by-trial location of ensemble activity patterns surrounding two switches from action to sound block. Trial outcomes are plotted on the top row: correct (filled circle), perseverative error (open circle), and other error (open triangle). Filled circles, Mahalanobis distance ratios for individual trials. Line, fit to the logistic function. Upward arrow, behavioral transition trial. Downward arrow, neural transition trial.

(c) Same as (b) for two switches from sound to action block. Note the vertical axis is inverted for presentation purposes.

(d) Summary of parameters extracted by fitting action-to-sound neural transitions with the logistic function. Arrow, median value.

(e) Neural transition trials plotted against behavioral transition trials for action-to-sound switches (see Methods for definition of transition trials). Each symbol represents one block switch. Symbol shapes denote the different sessions. Large circle, median value.

(f) Mean hit and error rates at the behavioral trial corresponding to specific neural transition locations as estimated by the logistic fit for each action-to-sound transition. Circles, mean±s.e.m.

(g–i) Same as (d–f) for switches from sound to action block; black arrows in (f) shown for comparison with (c). *, p<0.05; **, p<0.01, Wilcoxon rank-sum test. Difference in range, L, was not significant (sound: 0.36, action: 0.38, median; p = 0.8, z = 0.28, Wilcoxon rank-sum test). Rightmost bar of the histogram includes all instances above the range.

n = 33 action-to-sound and 35 sound-to-action switches from 9 sessions from 5 mice.

To what extent must population activity resemble the final ensemble state in order to improve behavior? To address this question, we performed two analyses to compare the timing of neural and behavioral transitions. In the first analysis, we defined “transition trials” for behavior (trials to criterion minus 20, the sliding window for assessing criterion) and neural ensemble activity (Mahalanobis distance ratio equaling 75% L based on logistic fit, see Methods). Block-by-block paired comparisons of neural and behavioral transition trials showed that ensemble activity in M2 shifted prior to the recovery of behavioral performance when adapting to conditional rules (Fig. 4e and Supplementary Fig. 6; p=0.003, z= −2.96; Wilcoxon signed-rank test). By contrast, neural and behavioral changes occurred at around the same time for shifts to non-conditional responding (Fig. 4h; p=0.19, z=1.32; Wilcoxon signed-rank test). We should note, however, that the definitions used for transition trials were arbitrary. Therefore, we performed a second, more unbiased analysis in which we determined the mean performance at the behavioral trial corresponding to a series of different neural transition locations. Compared with shifts to action trials (Fig. 4i), transitions to sound-guided trials were associated with hit and error rates that diverged later (Fig. 4f), indicating that behavioral improvement occurred later along the time course of neural transitions. Taken together, these two analyses suggest that when shifting to sound-guided actions, neural ensemble transitions in M2 are nearly complete before behavioral performance improvement can be detected.
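The two transition-trial definitions can be written out explicitly. The sketch below assumes the standard three-parameter logistic form f(x) = L / (1 + exp(−k(x − x0))), which is consistent with the parameters reported here (onset x0, steepness k, range L) but is not spelled out in this section. The function names are hypothetical.

```python
import math


def logistic(x, L, k, x0):
    """Assumed logistic form: range L, steepness k, midpoint x0."""
    return L / (1.0 + math.exp(-k * (x - x0)))


def neural_transition_trial(k, x0, fraction=0.75):
    """Trial at which the fitted curve reaches `fraction` of its range L.

    Solving L / (1 + exp(-k (x - x0))) = fraction * L for x gives
    x = x0 + ln(fraction / (1 - fraction)) / k, independent of L.
    """
    return x0 + math.log(fraction / (1.0 - fraction)) / k


def behavioral_transition_trial(trials_to_criterion, window=20):
    """Trials to criterion minus the sliding window used to assess criterion."""
    return trials_to_criterion - window
```

Plugging in the reported median fit parameters purely as an illustration, the 75% point falls at about trial 5.1 for action-to-sound switches (k=1.02, x0=4.0) and about trial 13.5 for sound-to-action switches (k=0.35, x0=10.4). These are not the per-block values compared in Fig. 4e and 4h, because the median of a derived quantity need not equal the quantity derived from median parameters.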

Distinct activity patterns accompany rule implementations

Our results indicate that rule shifts are associated with distinct transitions in network activity. This leads naturally to the question of what ensemble dynamics accompany successful rule implementation. We examined trajectories associated with correct responses in the 20 trials pre-switch, when response strategies have stabilized (>85% correct by task design). Figure 5a shows the trajectories of a 56-cell ensemble for left and right responses during sound-guided trials. The trajectories are initially indistinguishable, and then diverge sharply after the animal has made a response. Expanding this analysis to include action blocks reveals population activity patterns that occupy additional, distinct subspaces within the same representational space (Fig. 5b). To quantify rule representations present in the population code, we asked how accurately block type could be predicted from individual population activity vectors. For each session, we constructed a classifier based on linear discriminant analysis (see Methods). Testing each classifier with five-fold cross-validation revealed that in all cases, trial type could be decoded well above chance (Fig. 5c; sound: 78±3%, action-left: 86±4%, action-right: 82±3%; versus chance level of 33%, p=1 × 10^−6, 1 × 10^−6, 5 × 10^−7, t(8)=13.2, 12.8, 14.3, one-sample t-test, n=9 sessions). Repetition of this analysis using a moving window yielded high decoding accuracy at all times during a trial (Fig. 5d), consistent with a global shift in engagement of the network, rather than a simple change in the processing of cue, action, or outcome related signals. Next, we asked whether accuracy of the ensemble classifier could have been driven by a few cells that were highly selective for rule. When classifiers were trained on ΔF/F of individual cells, we found that 27% of the cells could be used to decode block-type above chance; however, accuracies fell along a continuum and at levels below the accuracy of the ensemble (Fig. 5e).
To ensure that the differences in trajectories and decoding accuracies were not due to simple sensory or motor parameters, we computed trajectories with matched stimulus, prior choice, current choice, and outcome conditions. Analyses of these congruent trials that differed only by rule yielded similar results (Fig. 5f–i and Supplementary Fig. 7a–d). Taken together, these results indicate that the behavioral implementation of specific conditional and non-conditional rules is associated with distinct network activity patterns in M2, such that population activity from any time during behavior can be used to decode task contingencies with high accuracy.
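The cross-validated decoding procedure can be sketched with the standard library alone. Note the substitution: the analysis in the text uses linear discriminant analysis, whereas this sketch uses a nearest-centroid classifier as a simpler stand-in, so it illustrates only the five-fold train/test logic. All function names are hypothetical, and the deterministic fold assignment (trial index modulo the number of folds) is an assumption.

```python
def centroid(vectors):
    """Component-wise mean of a list of population activity vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(centroids, x):
    """Return the label whose class centroid lies closest to x."""
    return min(centroids, key=lambda label: squared_distance(centroids[label], x))


def cross_validated_accuracy(vectors, labels, n_folds=5):
    """Fraction of held-out trials whose block type is predicted correctly."""
    correct = 0
    for fold in range(n_folds):
        # Assumption: folds are assigned by trial index modulo n_folds.
        train = [i for i in range(len(vectors)) if i % n_folds != fold]
        test = [i for i in range(len(vectors)) if i % n_folds == fold]
        centroids = {
            label: centroid([vectors[i] for i in train if labels[i] == label])
            for label in set(labels[i] for i in train)
        }
        correct += sum(predict(centroids, vectors[i]) == labels[i] for i in test)
    return correct / len(vectors)
```

With three equally frequent block types, chance performance for such a decoder is about 33%, the reference level used for the ensemble classifier in the text.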

Figure 5. Multiple strategies are associated with distinct population activity patterns.

(a) Neuronal circuit trajectories for an ensemble of 56 simultaneously imaged cells in one experiment. Trajectories were determined from trial-averaged ΔF/F for 44 correct left (dotted line) and 59 correct right (solid line) responses in sound-guided trials. Open circle, time of response. Filled circles, 3 and 6 s after response. PC, principal component.

(b) Same axes as (a), with additional trajectories from 51 correct action-left (red) and 51 correct action-right (blue) trials. Left, trajectories calculated using trial-averaged ΔF/F. Right, three representative single-trial trajectories from each trial type.

(c) Median accuracy of decoding trial type from individual population activity vectors. S, sound; AL, action-left; AR, action-right. Open triangles, individual experiments. Filled triangles, mean±s.e.m. Dotted line, chance-level accuracy.

(d) Temporal dependence of ensemble decoding accuracy, calculated by repeating the decoding analysis separately for each 0.28-s-long window with step size of 0.28 s. The window duration is the inverse of imaging frame rate, which is 3.6 Hz. Gray, individual experiments. Black, mean. Dotted line, chance-level accuracy.

(e) Median accuracy of decoding trial type from fluorescence transients of single neurons. Green circles, trial type-selective cells, i.e. 95% percentile confidence intervals are above chance-level of 33.3%. Black circles, other cells. Black line, mean decoding accuracy using ensemble activity. Dotted line, chance-level accuracy.

(f–g) Same as (b–c) except restricting to trials matching these conditions: stimulus was upsweep, choice was left, and outcome was reward for the current trial, and choice was left for the prior trial. Trial type could be decoded well above chance (sound: 87±5%, action-left: 88±5%; versus chance level of 50%, p = 7 × 10^−5, 7 × 10^−5, t(8) = 7.50, 7.52, one-sample t-test).

(h–i) Same as (b–c) except restricting to trials matching these conditions: stimulus was downsweep, choice was right, and outcome was reward for the current trial, and choice was right for the prior trial. Trial type could be decoded well above chance (sound: 85±5%, action-right: 86±4%; versus chance level of 50%, p = 7 × 10^−5, 8 × 10^−6, t(8) = 7.54, 10.01, one-sample t-test).

n = 9 sessions from 5 mice.

Activity patterns toggle between rule-related configurations

When animals solve trials with the same contingencies a second time, do M2 ensembles revisit similar activity patterns, or alternatively, does population activity migrate to a previously uncharted region of state space? Our task was well suited to address this question because blocks of the same trial type were presented multiple times within the same behavioral session. Figure 6a shows an example set of neural circuit trajectories for the first 12 trial blocks within one behavioral session, in which trajectories could be clearly grouped by block-type, and not by their temporal order. To quantify the representational similarity of ensemble dynamics on a block-by-block basis, we calculated the mean Euclidean distances between all possible pairwise comparisons of trajectories within an experiment (see Methods). We found that neural circuit trajectories from blocks of the same type have a relatively small distance of separation, and are similarly compact (Fig. 6b and Supplementary Fig. 7e,f; for Sound (S), Action-left (AL), and Action-right (AR); S-S vs. AL-AL: p=0.6, W=3; S-S vs. AR-AR, p=0.5, W=13; Wilcoxon signed-rank test). By contrast, trajectories from different block types were represented by markedly different ensemble activity (S-S vs. S-AL, p=0.004, W=0; S-S vs. S-AR, p=0.004, W=0; S-S vs. AL-AR, p=0.004, W=0; corrected α =0.01, Wilcoxon signed-rank test with Bonferroni correction). These results indicate that during adaptive decision-making, M2 toggles between distinct functional configurations as the animal repeatedly engages corresponding changes in task demands.
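The block-by-block distance comparison can be illustrated as follows. This is a sketch under stated assumptions, not the procedure from Methods: each trajectory is taken to be a list of points sampled at matching time steps, the distance between two trajectories is taken to be the Euclidean distance averaged over time points, and no normalization is applied. The function names are hypothetical.

```python
import math
from itertools import combinations, product


def trajectory_distance(a, b):
    """Mean Euclidean distance between two time-aligned trajectories."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def mean_pairwise_distance(group_a, group_b=None):
    """Average distance over all pairs of trajectories.

    With one group, compare every pair of blocks of the same type
    (e.g. S-S). With two groups, compare every block of one type
    against every block of the other (e.g. S-AL).
    """
    if group_b is None:
        pairs = list(combinations(group_a, 2))
    else:
        pairs = list(product(group_a, group_b))
    return sum(trajectory_distance(a, b) for a, b in pairs) / len(pairs)
```

A same-type comparison needs at least two blocks of that type within a session, which is why some within-type distances are available for fewer sessions than others.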

Figure 6. M2 ensembles revisit previous activity patterns upon re-exposure to corresponding trial-type.

(a) Neural circuit trajectories, calculated from trial-averaged ΔF/F for each trial block during one behavioral session. Circled numbers denote temporal order in which trial blocks were presented. Open circles, time of response. Black, sound-guided. Blue, action-right. Red, action-left.

(b) Normalized distance between neural circuit trajectories from different trial types across all experiments (see Methods) for sound (S), action-left (AL), and action-right (AR). Open triangles, median distances from individual experiments. Solid triangles, mean±s.e.m. **, p<0.01, Wilcoxon rank-sum test, corrected α = 0.01. n = 9 sessions except for AL-AL (n = 4) and AR-AR (n = 8) because mice did not perform enough switches to experience the same block type again in some sessions.

n = 9 sessions from 5 mice.

Comparison of task-related neural dynamics in M2, ALM, and V1

Next, we sought to determine whether the observed neural dynamics are specific to M2, or may be found also in other brain regions. For this purpose, we imaged neural ensembles in layer 2/3 of anterior lateral motor cortex (ALM; 65±6 cells per field of view, mean±s.e.m.; 8 sessions from 4 mice; Supplementary Fig. 1b) and primary visual cortex (V1; 57±7 cells per field of view; 4 sessions from 2 mice; Supplementary Fig. 1c) to compare with the data from M2 (62±6 cells per field of view; 9 sessions from 5 mice). ALM has been implicated in motor planning and execution37,38; however, it is ~1.5 mm distant from M2, and the relationship between the two frontal cortical regions is not understood. V1 was chosen as a control region because the task is performed in the dark and involves no visual stimulus. Multiple linear regression analysis showed that M2 neurons robustly encode not only choice of the current trial, but also choices of the two prior trials (Fig. 7a). By contrast, a higher proportion of cells in ALM encode current choice; however, the signals decay faster, resulting in weaker encoding of prior choices (Fig. 7b). Activity of M2 and ALM neurons can prefer either the ipsilateral or contralateral direction (Supplementary Fig. 8a,b), consistent with prior studies30,38. Unexpectedly, choice signals were also observed in V1 (Fig. 7c). Choice selectivity in V1 was relatively weak, and ΔF/F was almost always higher when animals made an ipsilateral choice (Supplementary Fig. 8c). Because choice signals in V1 were transient and animals performed the task in the dark, we conjecture that the selectivity might relate to corollary discharge. To investigate ensemble activity, we employed the same dPCA and linear classifier analyses used for M2. 
We found that rule type could be decoded with high accuracy using ensemble activity from ALM (matched sound: action-left trials: 78±4%, t(7)=6.94; p=2 × 10⁻⁴; matched sound: action-right trials: 78±3%, t(7)=8.68; p=5 × 10⁻⁵; versus chance level of 50%, one-sample t-test; Fig. 7d), but at a much worse rate for V1 (matched sound: action-left trials: 58±5%, t(3)=1.88; p=0.2; matched sound: action-right trials: 67±4%, t(3)=3.76; p=0.03). Therefore, both ALM and M2 exhibit task-specific ensemble activity patterns. However, unlike M2, characterization of ensemble transitions in ALM did not reveal significant differences between switches to sound versus action blocks (sound: x0=8.2, action: x0=10.2, median, z=1.62, p=0.11; sound: k=0.37, action: k=0.56, median, z=0.89, p=0.4; Wilcoxon rank-sum test; Fig. 7e). There were also no detectable timing differences between neural and behavioral transitions in ALM (sound: p=0.5, z=0.61; action: p=0.13, z=−1.50; Wilcoxon signed-rank test; neural transition defined as 75% L). Taken together, these data indicate regionally specific ensemble dynamics associated with adaptive behavior.

Figure 7. Comparison between neural activity patterns in M2, ALM, and V1 during flexible sensorimotor behavior.

(a) Multiple linear regression analysis was used to evaluate the fraction of 562 M2 neurons encoding choice signals as a function of time. Regression was performed with a moving window (duration = 0.5 s, step = 0.5 s) to test for significance with α = 0.01 on all sound-guided trials where the current and prior outcomes are hits (i.e. R(n) = 1 and R(n-1) = 1). The bars atop the panels denote significant fractions (p<0.01, binomial test). Gray shading, significance level of 0.01. Dotted line in the middle panel, fraction of cells significant for the interaction term C(n)*C(n-1). n = 9 sessions from 5 mice.

(b) Same as (a) for 518 ALM neurons. n = 8 sessions from 4 mice.

(c) Same as (a) for 227 V1 neurons. n = 4 sessions from 2 mice.

(d) Median accuracy of decoding trial type from individual population activity vectors restricted to matched trials, comparing M2, ALM, and V1 ensembles. For the S:AL subset, sound and action-left trial types were decoded from trials where stimulus was upsweep, choice was left, and outcome was reward for the current trial, and choice was left for the prior trial. For the S:AR subset, sound and action-right trial types were decoded from trials where stimulus was downsweep, choice was right, and outcome was reward for the current trial, and choice was right for the prior trial. Trial type was decoded with high accuracy using ensemble activity from M2 (matched sound: action-left trials: 84±4%, t(8)=8.37; p=3 × 10⁻⁵; matched sound: action-right trials: 81±2%, t(8)=12.73; p=1 × 10⁻⁶; versus chance level of 50%, one-sample t-test). Open triangles, individual experiments. Filled triangles, mean±s.e.m. Dotted line, chance-level accuracy.

(e) Neural transition parameters obtained by fitting action-to-sound (black) and sound-to-action (red) transitions with the logistic function, comparing M2 and ALM ensembles. Difference in L for ALM was not significant (sound: 0.35, action: 0.31, median; p = 0.51, z = −0.65, Wilcoxon rank-sum test). Filled circle, median. Line, 25th and 75th percentiles. *, p<0.05; **, p<0.01, Wilcoxon rank-sum test.

Discussion

The results support two novel insights regarding the function of higher-order motor cortex in adaptive choice behavior. First, fast and slow ensemble transitions are neural signatures for distinct phases of voluntary behavior. Comparison between transitions was possible because our task design allows for multiple shifts between multiple contingencies within a single behavioral session. Second, the relative timing of neural and behavioral shifts, as well as the specific deficits following inactivation, highlight a leading role for this region in the engagement of sensory cue-guided actions (Fig. 8). This conclusion contrasts with previous studies of homologous or nearby prefrontal cortical regions, in which neural changes closely match or lag the time course of behavioral adaptation5,7,12. A key difference is that prior studies have focused on the learning of novel sensorimotor mappings or new rules, whereas our task requires animals to repeatedly disengage and re-engage the learned associations required for sound-guided trials. Likewise, although our paradigm shares important features with other assays for flexibility7,9,10,12,39–41, there are also crucial differences (see Methods).

We found that bilateral inactivation of M2 selectively impairs the shift into sound-guided actions. This observation is highly consistent with results of dorsal premotor lesions in primates, which disrupt both the learning of novel visuomotor associations and the engagement of previously learned mappings17–19. Interestingly, adaptation to action blocks was facilitated by M2 inactivation. This effect could result from a tendency to repeat the prior choice29: if M2 normally biases animals toward sensory cue-guided actions, then inactivation may remove an important brake on the non-conditional strategy. In our hands, M2 inactivation slowed but did not preclude the eventual transition to high performance on sound-guided trials. This suggests that at least for trained mice, two-choice auditory discrimination alone does not require M2 and may be subserved by other circuits42. Furthermore, we have taken the opposing effects of inactivation on shifts to sound-guided versus repeated actions as evidence that mice perform the task by balancing the use of conditional and non-conditional responses.

A key finding of this study regards how specific parameters of ensemble activity transitions may relate to behavior. We found that ensemble transitions are more abrupt when animals need to retrieve and begin using conditional associations. These fast transitions may be related to those observed in medial prefrontal cortex, which have been interpreted as neural correlates of insight12, or abandonment of an inadequate internal model at the onset of exploration13. On what quantitative basis should transitions be classified as abrupt or gradual? In our study, the steepness of these transitions was compared directly to the slower transitions that accompanied action blocks. Moreover, ensemble transitions occurred after only a few errors, whereas behavioral improvements took tens of sound-guided trials (Fig. 4e,f). The difference in neural and behavioral timing suggests that M2 neural activity has mostly adjusted while the animal still systematically responds in a non-conditional manner. M2 may facilitate the engagement of sound-guided behavior by biasing the use of sensory information, suppressing repetitive actions, or both. By contrast, prior studies have shown that when an animal must acquire novel arbitrary associations, changes in cortical activity track behavioral improvements5,43 and lag the more rapid remapping in the striatum7. A major difference between these studies of fast learning and our study is that the auditory-motor associations are already well learned in our task.

We found that multiple rules were each associated with a distinct subset of population activity patterns. Such task-dependent changes in neural activity have been reported in multiple frontal cortical regions across species2–4,12,44. By asking the animal to shift repeatedly during a single session, we found that the network can return to a previously employed functional configuration to meet similar behavioral demands. This back-and-forth toggling of ensemble activity is reminiscent of the ensemble remapping observed in CA1 of the hippocampus during repeated exposure to spatial contexts45,46. Interestingly, one study reported that changes in environmental context also caused network activity shifts in the rodent medial prefrontal cortex. However, the ensemble code was not identical upon re-exposure, potentially due to a systematic drift over time47. The divergent findings of repeatable versus drifting network states in the rodent frontal cortex could reflect regional differences, or differences in how frontal areas encode cognitive versus environmental variables.

Several lines of evidence support the idea that the neural dynamics in M2 reflect changes in internal processes (e.g. representation of task contingencies or motor planning and preparation), rather than differences in overt physical movements. First, three different ensemble analyses with matched, congruent trial conditions indicated distinct neural dynamics in sound and action blocks (Fig. 5f–i and Supplementary Fig. 7), despite a lack of observable difference in motor output for the same sets of trials (Fig. 1e and Supplementary Fig. 2). Second, neural signals related to motor execution should be strongest at the time of response. Instead, we found that the rule-specific separation of population activity patterns was significantly above chance at all times across a trial (Fig. 5d). Third, and perhaps the strongest evidence: muscimol inactivation of M2 had no detectable effect on motor output (Supplementary Fig. 5), while clearly affecting behavioral flexibility.

What is the purpose of functional reconfiguration during adaptive decision-making? Ensemble activity patterns within multiple network subspaces reflect the diversity of neural representations in M2. Recent studies have shown that rodent M2 sends long-range projections to sensory cortex48,49 and dorsal striatum50. Appropriate shifts in neural representations could allow M2 to exert differential top-down control in a task-dependent manner. Further study regarding the downstream impacts of frontal network transitions may yield important insights into neuropsychiatric disorders in which cognitive flexibility is impaired. Plausibly, the cognitive rigidity characteristic of disorders such as schizophrenia could result from an inability of frontal cortical networks to shift or maintain stable ensemble states.

Methods

Animals

Adult male mice with C57BL/6J genetic background were used. Mice were housed in groups of 3 – 5, in 12h/12h light-dark cycle (lights off at 7PM), and most experiments were performed in late afternoons and evenings (4PM – midnight). At the start of experiments, mice were P51 – 117. No statistical tests were used to pre-determine sample sizes, but sample sizes for this study are similar to those generally employed in the field. All experimental procedures were approved by the Institutional Animal Care and Use Committee, Yale University.

Surgery

Mice underwent two surgeries. For each surgery, the mouse was anesthetized with 2% isoflurane in oxygen during induction, and the concentration was then lowered to 1 – 1.5% for the remainder of the surgery. The mouse was placed over a water-circulating heating pad (TP-700, Gaymar Stryker) in a stereotaxic frame (David Kopf Instruments). Pre-operatively, the mouse was injected with carprofen (5 mg/kg, s.c.; #024751, Butler Animal Health) and dexamethasone (3 mg/kg, s.c.; Dexaject SP, #002459, Henry Schein Animal Health). Post-operatively, the mouse was injected with carprofen immediately after surgery (5 mg/kg, s.c.) and each day for 3 days following (5 mg/kg, s.c.). For the first surgery, an incision was made to expose the skull. Based on stereotaxic coordinates, the center location of the mouse secondary motor cortex (M2; AP = 1.5 mm, ML = 0.5 mm; relative to bregma) was marked in the right hemisphere. In other experiments, we targeted the anterior-lateral motor cortex37 (ALM; AP = 2.5 mm, ML = 1.5 mm) or the primary visual cortex (V1; AP = −3.8 mm, ML = 2 mm) on the right hemisphere. A stainless steel head plate (eMachineshop.com) was affixed to the skull with Metabond (C&B, Parkell, Inc.), and a thin layer of clear Metabond was then applied to cover the entire skull. Mice were given at least 1 week to recover prior to behavioral training. Head plate-implanted mice were then trained on behavioral tasks (see below). Once a mouse reached a performance criterion of >90% correct rate on three consecutive days and was ready for imaging experiments, a second surgery was performed under anesthesia. Using a dental drill, a 3 mm-diameter craniotomy was made at the targeted location, which had been marked previously and remained visible through the Metabond. Dura was left intact, and was irrigated with artificial cerebrospinal fluid (ACSF, in mM: 5 KCl, 5 HEPES, 135 NaCl, 1 MgCl2, 1.8 CaCl2; pH 7.3).
Using a glass micropipette attached to a microinjection system (Nanoject II, Drummond), 32 – 46 nL of AAV1-Syn-GCaMP6s-WPRE-SV40 (5 × 10¹³ titer; UPenn Vector Core) was injected at a depth of 400 μm below dura at each of 4 locations, vertices of a 200 μm-wide square centered at the targeted cortical region. The glass micropipette was left in place for 5 min after injection to reduce backflow. A drop of warmed agar (1.2% in ACSF, Type III-A, High EEO, A9793, Sigma-Aldrich) was then applied to the cortical surface. A two-layer glass window was fabricated by first etching out a 2-mm diameter circle from #0 thickness glass cover slip, then bonding with UV-activated polymer (61, Norland Optical Adhesive) to a #1 thickness, 3-mm diameter round glass cover slip (64-0720 CS-3R, Warner Instruments). This glass window was then placed against the cortical surface. While applying light pressure, super glue was added to the rim to attach the glass to the skull and Metabond. Mice were again given at least 1 week to recover before resuming behavioral training. Imaging experiments would begin when behavioral performance criterion was reached. Eight out of eleven mice went through this procedure involving two surgeries. For the remaining three mice, the head plate implant, viral injection, and window implant procedures were performed in the same surgery before behavioral training.

Behavioral setup

For head-fixed mouse behavior, we used a training apparatus that has two lick ports, thus enabling two alternative choices. The use of two lick ports was inspired by another study37. Two metal screws were used to affix the head plate of the mouse onto a stainless steel mount. The mouse was then restrained inside an acrylic tube, which restricted gross body movements but allowed postural adjustments. The lick ports were fabricated from stainless steel 20-gauge needles, which were positioned at 90 and 270 degrees with respect to the mouse’s head orientation, and held in place by a 3D-printed plastic part mounted on a micromanipulator for fine positional adjustment. Water was delivered at the ports by gravity feed and the liquid volume was controlled by pneumatic valves (EV-2-24, Clippard), calibrated with an intravenous dripper to deliver ~2 μL per pulse. A battery-operated touch detector circuit signaled when the mouse’s tongue contacted a lick port. Auditory stimuli were played through computer speakers placed directly in front of the animal. The intensity of the auditory stimuli was calibrated to ~85 dB peak amplitude. Water delivery, lick detection, and sound presentation were connected to a desktop computer via a data acquisition board (USB-201, Measurement Computing). Presentation software (Neurobehavioral Systems) controlled the entire behavioral system. An infrared webcam was used to monitor the animal while in the rig. Behavioral training was performed inside the closed compartment of an audio-visual cart that was dark and soundproofed with acoustic foams (5692T49, McMaster-Carr). For imaging, mice were tested using a replica of the behavioral training setup under a two-photon microscope.

Adaptive decision-making task

To motivate participation in the task, water consumption was restricted to behavioral sessions. Mice were trained for 1 session per day, 6 days a week. On the non-training day, water was provided ad libitum in the home cage for 15 min. The mice were trained through four phases to shape their behavior. Phase one (~2 days): mice were habituated to head fixation in the behavior box, and trained to lick either one of the two ports for water reward. Mice were advanced to the next phase when they made >100 responses in a session. Phase two (~2 days), mice were trained to sample both ports. Here, mice were required to lick the left port to obtain water rewards three times, followed by the right port for the next three rewards, and so on. Mice were advanced to the next phase when they made >100 correct responses in a session. Phase three (>15 days), animals underwent training for two-choice auditory discrimination. One of two auditory cues was presented to begin each trial: a 2 s-long train of 0.5 s-long logarithmic frequency modulated sweeps from 5-to-15 kHz (“upsweep”) or from 15-to-5 kHz (“downsweep”). The stimuli were interleaved randomly from trial to trial. At 0.5 s following the onset of the auditory cue, a response window would open lasting for a maximum duration of 2 s. The animal's first lick within this response window was registered as its response for the trial. All other licks were logged but had no consequences. Once a response was recorded, playback of the auditory cue was terminated. A correct response, i.e. a left lick for upsweep or a right lick for downsweep, resulted in immediate delivery of 2 μL of water from the corresponding port. The next trial would begin 6 s following response. Incorrect responses resulted in 2 s of white noise presentation, with the next trial beginning 4 s later. Each trial had a total duration within a range from 7.5 to 9 s. 
Animals were allowed to perform trials until satiated (20 consecutive misses), typically after ~60 minutes. Training continued daily until a correct rate of >90% was attained for 3 consecutive days. For imaging experiments, mice were then trained under the two-photon microscope (with laser turned off) for habituation to the recording setup. All the mice were able to discriminate at >90% correct rate after 1–3 days of re-training. Finally, mice were tested on the adaptive decision-making task. The task always began with a sound block (S) indistinguishable from the two-choice auditory discrimination task. However, once the mouse reached a performance criterion of >85% correct for the last 20 trials, the stimulus-response-outcome contingencies would change from sound- to action-guided trials. In action-guided trials, task structure was identical to sound-guided trials. However, the correct response became fixed to one response direction, e.g. always left, regardless of the stimulus identity. No cue signaled the change in contingencies. When the mouse reached performance criterion again, another block switch would occur. A sound block was always followed by an action block, and vice versa. The second block was randomly chosen for each experiment to be action-left (AL) or action-right (AR). However, once the first action block was chosen, the block sequence became fixed for the remainder of the session. Therefore, the sequence of blocks could be one of two possibilities: (S-AL-S-AR-S-AL-S-AR…) or (S-AR-S-AL-S-AR-S-AL-…). Each session was terminated after 20 consecutive misses (trials with no response). Mice typically performed the adaptive decision-making task for 60 – 90 min. Following each adaptive decision-making test, mice resumed daily two-choice auditory discrimination until the next recording session, up to a maximum of seven adaptive decision-making tests.
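The block-switching rule described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the Presentation script used in the experiments: the function names are hypothetical, and two details are assumptions rather than statements from the text, namely that miss trials do not enter the 20-trial criterion window and that the window is reset at each block switch.

```python
from collections import deque

def block_sequence(first_action):
    """Yield block labels S, first action, S, other action, ... indefinitely."""
    other = "AR" if first_action == "AL" else "AL"
    while True:
        yield "S"
        yield first_action
        yield "S"
        yield other

def correct_choice(block, stimulus):
    """Sound blocks: upsweep -> left, downsweep -> right. Action blocks: fixed side."""
    if block == "S":
        return "left" if stimulus == "upsweep" else "right"
    return "left" if block == "AL" else "right"

def run_session(trials, first_action="AL", window=20, criterion_pct=85):
    """trials: iterable of (stimulus, choice); choice None marks a miss.

    Returns a list of (block, outcome) pairs. A switch occurs once the last
    `window` performed trials exceed `criterion_pct` percent correct.
    """
    blocks = block_sequence(first_action)
    block = next(blocks)
    recent = deque(maxlen=window)  # assumption: only performed trials are counted
    log = []
    for stimulus, choice in trials:
        if choice is None:
            log.append((block, "miss"))
            continue
        hit = choice == correct_choice(block, stimulus)
        log.append((block, "hit" if hit else "error"))
        recent.append(hit)
        # Integer comparison avoids floating-point issues at exactly 85%.
        if len(recent) == window and sum(recent) * 100 > criterion_pct * window:
            block = next(blocks)
            recent.clear()  # assumption: the criterion window restarts per block
    return log
```

With a strict ">85%" threshold over 20 trials, a switch requires at least 18 correct responses in the window, so 17 of 20 is not sufficient in this sketch.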

Our behavioral paradigm consists of blocks of trials that require the animal to shift between conditional and non-conditional approaches to action selection. In principle, mice may solve this task by ignoring sensory information completely during action blocks. However, the temporally structured lick rates during action blocks (Fig. 1e) strongly suggest use of the stimulus for gating lick responses. Our task has similarities with other paradigms that test behavioral flexibility, but there are also crucial differences. In contrast to paradigms that use a contextual cue to instruct rapid executive control on a trial-by-trial basis9,10,39, animals adapt on a time scale of tens of trials in our task (Fig. 1c). This relatively slow rate of adaptation is akin to learning during arbitrary visuomotor mapping, where the animal’s basis for action selection is updated gradually based on reward feedback7,40. Our task also differs from other strategy- or set-shifting tasks for rodents12,41 because non-spatial stimuli were used to probe arbitrary sensorimotor associations that do not conform to classical definitions of exemplars or sets. Furthermore, analysis of the types of errors made during training suggests that mice perform two-choice auditory discrimination in part by suppressing a prepotent tendency to repeat a rewarded choice. Action trials could thus be considered a natural strategy to the animal, whereas sound-guided trials require weeks of training to achieve high performance. Therefore, one caveat for our task is that animals are likely to have different degrees of learned and intrinsic familiarity for sound versus action trials.

Two-photon calcium imaging

The two-photon microscope (Movable Objective Microscope, Sutter Instrument) was controlled using ScanImage software51. The excitation source was an ultrafast laser (Chameleon Ultra II, Coherent). Excitation intensity was controlled by a Pockels cell (350-80-LA-02, Conoptics) and focused onto the sample with a 20x, N.A. 0.95 water immersion objective (Olympus). The time-averaged excitation laser intensity was 90–100 mW after the objective. To image fluorescence transients from GCaMP6s-expressing neurons, excitation wavelength was set at 920 nm, and emission was collected from 475 – 550 nm with a GaAsP photomultiplier tube. Time-lapse images were acquired at a resolution of 256 x 256 pixels and a frame rate of 3.62 Hz using bidirectional scanning. To synchronize behavior with imaging, a TTL pulse was sent at the beginning of each trial from the data acquisition board of the behavioral system to the imaging system to act as an external trigger for initiating image acquisition.

Inactivation

Mice were implanted with a head plate. The locations of M2 were marked on both hemispheres (AP = 1.5 mm, ML = 0.5 mm), and then covered with a thin layer of clear Metabond. Mice were then trained as described above, in preparation for the adaptive decision-making test. Craniotomies were performed at the marked locations. Using a glass micropipette attached to a microinjection system (Nanoject II, Drummond), ACSF, with or without muscimol (5 mM, 46 nL per hemisphere; cat. #195336, MP Biomedical), was injected at a depth of 400 μm into M2 of both hemispheres. Behavioral testing began 1–3 hr following injection. The same mice were tested after saline and muscimol treatments on consecutive days in a counter-balanced design, with no blinding. The mice were randomized to receive either saline or muscimol first in an alternating manner depending on the order in which they reached the behavioral performance criterion. Twelve mice were allocated for this experiment; however, one was excluded due to equipment malfunction during testing.

Histology

Following experiments, mice underwent transcardial perfusion with chilled formaldehyde solution (4% in phosphate-buffered saline). The brains were sectioned with a vibratome and imaged with an inverted wide-field fluorescence microscope.

Analysis: behavioral data

Timestamps of stimulus presentation, licks, and water delivery were logged in a text file by Presentation software (Neurobehavioral Systems, Inc.). Scripts were written in MATLAB to parse the log files. For the adaptive decision-making task, a perseverative error was defined as an incorrect response that would have been correct according to the last trial block’s contingencies. For example, during an action-left block, the stimulus-response pairings of upsweep-left lick and downsweep-left lick would be “correct”. Downsweep-right lick would be a “perseverative error”, because this stimulus-response pairing would have been correct in the preceding sound-guided block. The remaining possible stimulus-response pairing, upsweep-right lick, would be classified as an “other error”. The number of trials performed included all correct and error trials, but excluded the miss trials when the mouse failed to lick within the response window. Miss trials typically occurred near the end of the session when the mouse was satiated. Trials-to-criterion was defined as the number of trials performed in a certain trial block before reaching a performance criterion of 85% correct for the last 20 trials. Therefore, the minimum value of this quantity is 20. Mean trials to criterion for each session was calculated excluding the first sound block, because contingency switches have not yet begun. Mean blocks per 100 trials, mean perseverative errors per block, and mean other errors per block were calculated excluding the last block (i.e. trials after the last block switch). For analysis, we often compared pre-switch and post-switch conditions, which were defined as the 20 trials prior to or following a block switch. The first lick time was defined as the time of the first lick after sound onset for each trial, which may occur prior to the start of the response window. The first lick time is thus a sum of the reaction time and movement time. 
For this measurement, we excluded trials in which the mouse licked within 0.5 s before cue onset, in which case the first lick may represent the continuation of a spontaneous lick bout rather than a reaction to the stimulus.
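As an illustration of the error definitions and of trials-to-criterion, the following Python sketch classifies a single response and counts trials to criterion within one block. It is not the authors' MATLAB parser; the function names are hypothetical, and the threshold of 18 correct out of 20 is an assumption corresponding to a strict ">85%" reading of the criterion.

```python
def correct_choice(block, stimulus):
    """Correct side under a block's contingencies ("S", "AL", or "AR")."""
    if block == "S":
        return "left" if stimulus == "upsweep" else "right"
    return "left" if block == "AL" else "right"

def classify_response(block, prev_block, stimulus, choice):
    """Label a performed trial as correct, perseverative error, or other error.

    A perseverative error is an incorrect response that would have been
    correct under the preceding block's contingencies.
    """
    if choice == correct_choice(block, stimulus):
        return "correct"
    if prev_block is not None and choice == correct_choice(prev_block, stimulus):
        return "perseverative error"
    return "other error"

def trials_to_criterion(hits, window=20, min_hits=18):
    """Number of performed trials in a block before criterion is reached.

    hits: list of booleans for performed trials only (misses excluded).
    min_hits=18 is an assumed reading of ">85% of 20 trials".
    Returns None if the criterion is never reached.
    """
    for n in range(window, len(hits) + 1):
        if sum(hits[n - window:n]) >= min_hits:
            return n
    return None
```

For the example given in the text (an action-left block following a sound block), `classify_response("AL", "S", "downsweep", "right")` gives a perseverative error and `classify_response("AL", "S", "upsweep", "right")` gives an other error.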

Analysis: imaging data

Time-lapse fluorescence images were corrected for x-y motion using the TurboReg plug-in for ImageJ (NIH). We wrote a GUI in MATLAB to select cell bodies as regions of interest (ROIs). Values of pixels within an ROI were averaged to generate FC(t). For each cell, we estimated the neuropil signal by drawing a doughnut52: we approximated the ROI area as a circle to estimate a radius r, then created an annulus-shaped neuropil area with inner and outer diameters of 2r and 3r. This neuropil area excluded pixels if they were part of the ROI of another cell body. Values of pixels within the annulus-shaped neuropil area were averaged to generate FN(t). To subtract the neuropil signal, we calculated F(t) = FC(t) - α FN(t), where α is a correction factor ranging from 0.2 to 0.6. The value of α was calibrated for each experiment to avoid over-correction, by making sure that F(t) > 0 for each cell. For each ROI, the fractional change in fluorescence, ΔF/F(t), was calculated as:

$$\frac{\Delta F}{F}(t) = \frac{F(t) - F_o(t)}{F_o(t)},$$

where Fo(t) is the baseline fluorescence as a function of time. To estimate baseline, we first obtained Fimage(t), the mean pixel intensity for the entire 256 pixel x 256 pixel field of view as a function of time. Fo(t) was then calculated as:

$$F_o(t) = F^{*} \times \frac{F_{o,\mathrm{image}}(t)}{F^{*}_{o,\mathrm{image}}},$$

where Fo,image(t) is the 10th percentile of Fimage(t) within a sliding window of 10 minute duration. F* and F*o,image are the 10th percentile of F(t) and Fo,image(t) within the first 10 minutes of the session, respectively. We verified that Fo,image(t)/F*o,image does not vary with specific choices or rule blocks, and thus serves the purpose of compensating for slow, full-field signal drifts due to non-physiological sources. We also repeated the ensemble analyses with two other methods for calculating baseline: first, estimating Fo(t) using the 10th percentile of F(t), on a per-cell basis, with a moving window of 10 minute duration; second, estimating Fo(t) using the 10th percentile of F(t) from the entire session, i.e. without a moving window. These different ways to estimate baseline led to qualitatively similar results for all the ensemble analyses.
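The neuropil subtraction and baseline estimate can be sketched as follows, using plain Python lists for one cell's traces. This is a minimal illustration under stated assumptions, not the authors' MATLAB code: the function names are hypothetical, the percentile uses linear interpolation between order statistics, and the sliding window is centered on each frame and clipped at the session edges, none of which is specified in the text.

```python
def percentile(values, q):
    """q-th percentile with linear interpolation (assumed definition)."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def subtract_neuropil(f_cell, f_neuropil, alpha):
    """F(t) = FC(t) - alpha * FN(t), with alpha between 0.2 and 0.6."""
    return [c - alpha * n for c, n in zip(f_cell, f_neuropil)]

def sliding_percentile(trace, window_frames, q=10):
    """10th percentile in a centered window, clipped at the edges (assumption)."""
    half = window_frames // 2
    n = len(trace)
    return [percentile(trace[max(0, i - half):min(n, i + half + 1)], q)
            for i in range(n)]

def delta_f_over_f(f, f_image, window_frames, first_frames):
    """dF/F(t) with the full-field drift correction described above.

    f: neuropil-subtracted trace F(t) of one cell
    f_image: mean pixel intensity of the whole field of view, Fimage(t)
    window_frames: frames in the 10-minute sliding window
    first_frames: frames in the first 10 minutes of the session
    """
    f_o_image = sliding_percentile(f_image, window_frames)
    f_star = percentile(f[:first_frames], 10)
    f_star_image = percentile(f_o_image[:first_frames], 10)
    f_o = [f_star * b / f_star_image for b in f_o_image]
    return [(x - b) / b for x, b in zip(f, f_o)]
```

At the reported frame rate of 3.62 Hz, a 10-minute window corresponds to roughly 2,170 frames.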

Analysis: task-related activity and choice encoding

To calculate trial-averaged fluorescence transients, we created time bins that were 0.5 s wide, and then assigned each ΔF/F(t) value at a particular time t to the corresponding time bin relative to the animal’s response. The binned ΔF/F(t) values were averaged to obtain trial-averaged ΔF/F. To estimate uncertainty of the trial-averaged ΔF/F, a bootstrap analysis was performed by drawing fluorescence transients per trial, with replacement, up to the same number used to construct the trial average. The median and 95% confidence intervals of trial-averaged ΔF/F were estimated from 1000 iterations of this bootstrap analysis. To quantify choice encoding, we performed multiple linear regression analysis on the ΔF/F(t) of each cell using the following equation:

$$\frac{\Delta F}{F}(t) = a_0 + a_1 C(n) + a_2 C(n-1) + a_3 C(n)\,C(n-1) + a_4 C(n-2) + \varepsilon(t),$$

where C(n) was the choice of current trial, C(n-1) was the choice of prior trial, C(n-2) was the choice two trials ago, ε(t) was the error term and a’s were regression coefficients. We coded a choice of left as 1 and right as −1. We used a non-overlapping 0.5 s-long moving window with step size of 0.5 s. A cell was deemed to encode one of the choice parameters or interaction if p < 0.01 for the corresponding regression coefficient. To avoid confounds from rule and reward signals, we analyzed only sound-guided trials in which R(n) = 1 (outcome of current trial = reward) and R(n-1) = 1 (outcome of prior trial = reward). We did not analyze action trials, because parameters such as C(n) and C(n-1) were highly correlated by virtue of the task structure, obviating a simple interpretation of the analysis.
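To make the regressors concrete, the sketch below builds one design-matrix row per analyzed trial, using the coding given above (left = 1, right = −1) and keeping only trials with R(n) = 1 and R(n-1) = 1. The function name and input format are hypothetical, and the inputs are assumed to be consecutive sound-guided trials; the least-squares fit and the per-coefficient significance test are not shown.

```python
def build_design(choices, rewards):
    """Build regressors for the choice-encoding model.

    choices: list of "left"/"right" for consecutive trials
    rewards: list of 0/1 outcomes for the same trials
    Returns (trial_indices, rows), where each row is
    [1, C(n), C(n-1), C(n)*C(n-1), C(n-2)], matching a0..a4.
    """
    code = {"left": 1, "right": -1}
    trial_indices, rows = [], []
    for n in range(2, len(choices)):
        # Keep only trials where current and prior outcomes were rewards.
        if rewards[n] != 1 or rewards[n - 1] != 1:
            continue
        c0 = code[choices[n]]
        c1 = code[choices[n - 1]]
        c2 = code[choices[n - 2]]
        rows.append([1, c0, c1, c0 * c1, c2])
        trial_indices.append(n)
    return trial_indices, rows
```

Each row would then be paired with the cell's ΔF/F in a given 0.5 s window of the same trial, and the fit repeated window by window.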

Analysis: neural circuit trajectories

Scripts for the ensemble analysis were written in MATLAB, and are available upon request. For state-space analysis, we used demixed principal component analysis36 (dPCA). To prepare the imaging data for dPCA, ΔF/F(t) for each cell for each trial was aligned in time, from 0 to 6 s from the time of the response in that trial. We have tried numerous other time windows and found similar results. This alignment led to an array with dimensions = cells x time x trials. Using this array, we averaged across 4 trial types: C(n) = 1, R(n) = 1, pre-switch sound trials; C(n) = −1, R(n) = 1, pre-switch sound trials; C(n) = 1, R(n) = 1, pre-switch action trials; C(n) = −1, R(n) = 1, pre-switch action trials. This trial-averaged array (cells x time x 4) was input into the dPCA algorithm36 to demix time- and task-dependent variances and obtain principal components (PCs). To calculate neuronal circuit trajectories, single-trial or trial-averaged ΔF/F were projected onto the first three PCs. To characterize similarities between the neuronal circuit trajectories across blocks, we calculated the neuronal circuit trajectory for each block by using the trial-averaged fluorescence across the 20 trials pre-switch. The similarity between a pair of trajectories was quantified by calculating the mean of the Euclidean distances between the trajectories at matching time points in state-space. In order to compare between different experiments, this distance was normalized for each experiment: the Euclidean distances were divided by the spread of all population vectors, calculated as the root mean square of distances between all population vectors and the centroid of the vectors. To quantify how the neuronal circuit trajectories evolve on a trial-to-trial basis, we used the Mahalanobis distance, which is a measure of distance between one point and another collection of points. 
We defined the origin as the 20 trials preceding a block switch, and the destination as the 20 trials preceding the next block switch. We were interested in the relative separation between the origin, an individual trial that occurred in between, and the destination. Therefore, for each time point of a trial, we calculated Mahalanobis distances, dorigin(t) and ddest(t), from the individual trial (1 three-dimensional value) to the origin and destination respectively (20 three-dimensional values). The dorigin(n) and ddest(n) for each individual trial is the median of dorigin(t) and ddest(t) of the ~30 time points within a trial. To estimate the location of an individual trial relative to the origin and destination, we calculated the ratio of Mahalanobis distances, dorigin(n) / (dorigin(n) + ddest(n)). For the Mahalanobis distance ratios, which are a function of trial number from switch, we fitted with a logistic function,

$$ f(x) = \frac{L}{1 + e^{-k(x - x_0)}} + L_{min}, $$

where x_0 is the midpoint trial, k is the steepness, L is the range, and L_min is the minimum value. The parameter L_min is not fitted, but rather estimated for each transition by calculating the mean of the Mahalanobis distance ratio using trials −5 to −1 from switch. We fitted every neural ensemble transition using this method, but excluded those in which the midpoint trial x_0 < −5 or x_0 > 200, indicating a poor fit. Based on this criterion, we excluded none (0/33) of the action-to-sound shifts and 8% (3/38) of the sound-to-action shifts in our analysis of M2 neural ensembles. For analysis of the ALM data set, we excluded 8% (2/26) of the action-to-sound shifts and 3% (1/32) of the sound-to-action shifts. When comparing behavioral and neural transitions, we defined the ‘behavioral transition trial’ as the trial to criterion (85% correct for 20 trials) minus 20, i.e. the first of the sequence of 20 trials leading to block switch. The ‘neural transition trial’ was defined as the trial at which the first term of the logistic fit of Mahalanobis distance ratios reached 75% of L. That is, the trial x that satisfies this equation:

$$ 0.75L = \frac{L}{1 + e^{-k(x - x_0)}}. $$
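
Because L cancels from both sides, this condition has a closed-form solution in terms of the fitted steepness and midpoint. The short derivation below is an added worked step rather than part of the original analysis description, and it assumes k ≠ 0:

```latex
\begin{align}
0.75 &= \frac{1}{1 + e^{-k(x - x_0)}} \\
e^{-k(x - x_0)} &= \frac{1}{0.75} - 1 = \frac{1}{3} \\
x &= x_0 + \frac{\ln 3}{k}
\end{align}
```

For a positive steepness k, the neural transition trial therefore lies ln(3)/k ≈ 1.1/k trials after the midpoint trial.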

This definition is arbitrary; it is unknown how much the population activity pattern must resemble the final pre-switch ensemble state in order to qualify as a ‘transition’. Therefore, in another analysis we first fitted each neural transition with the logistic function, and identified the behavioral trial corresponding to each 5% L step of neural transition from 10 to 90% L. We then calculated the mean hit and error rates at those corresponding behavioral trials, thus plotting the relationship between behavioral performance and neural transition without explicitly defining a transition trial.
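
The Mahalanobis distance ratio, the logistic fit with a fixed minimum, and the 75% transition trial described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation. All function names and array shapes are assumptions, and because the fitting routine is not specified in the text, the sketch substitutes a simple grid search over the midpoint and steepness with a closed-form least-squares solution for the range L.

```python
import numpy as np


def mahalanobis_to_set(point, reference):
    """Mahalanobis distance from one point to a set of reference points (rows)."""
    reference = np.asarray(reference, dtype=float)
    diff = np.asarray(point, dtype=float) - reference.mean(axis=0)
    cov = np.cov(reference, rowvar=False)  # sample covariance of the reference set
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))


def distance_ratio(trial, origin, dest):
    """d_origin / (d_origin + d_dest) for one trial.

    trial: (time, 3) projected activity; origin, dest: (n_trials, time, 3).
    Distances are computed per time point, then the median is taken over time.
    """
    n_time = trial.shape[0]
    d_origin = np.median(
        [mahalanobis_to_set(trial[t], origin[:, t]) for t in range(n_time)])
    d_dest = np.median(
        [mahalanobis_to_set(trial[t], dest[:, t]) for t in range(n_time)])
    return d_origin / (d_origin + d_dest)


def logistic(x, L, k, x0, L_min):
    """L / (1 + exp(-k (x - x0))) + L_min, written with tanh for stability."""
    z = 0.5 * k * (np.asarray(x, dtype=float) - x0)
    return L * 0.5 * (1.0 + np.tanh(z)) + L_min


def fit_logistic(x, y, L_min, x0_grid, k_grid):
    """Least-squares fit of (L, k, x0) with L_min held fixed.

    Grid search over x0 and k (an assumed substitute for the unspecified
    fitting routine); for each pair the best L is found in closed form
    because the model is linear in L.
    """
    x = np.asarray(x, dtype=float)
    target = np.asarray(y, dtype=float) - L_min
    best = None
    for x0 in x0_grid:
        for k in k_grid:
            g = logistic(x, 1.0, k, x0, 0.0)
            gg = g @ g
            if gg == 0.0:
                continue  # curve is zero over the data, so L is undefined
            L = (g @ target) / gg
            sse = np.sum((L * g - target) ** 2)
            if best is None or sse < best[0]:
                best = (sse, float(L), float(k), float(x0))
    return best[1:]


def neural_transition_trial(k, x0, fraction=0.75):
    """Trial at which the logistic term reaches `fraction` of L."""
    return x0 + np.log(fraction / (1.0 - fraction)) / k
```

In use, L_min would be set to the mean ratio over trials −5 to −1 before calling fit_logistic, and fits with a midpoint below −5 or above 200 would be discarded, following the exclusion criterion stated above.
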

Analysis: decoding

To determine how well ensemble dynamics could be used to predict trial type, we first selected those imaging frames that occurred between 0 and 6 s from the time of response out of the frame-by-frame imaging data (i.e., ΔF/F(t)). We then projected these ΔF/F(t) onto the PCs deduced from dPCA to obtain population activity vectors. This procedure reduced the dimensionality of our data from (frames × cells) to (frames × 3). Each population activity vector in this analysis came from one of four possible trial types: R(n)=1, pre-switch sound trials; R(n)=1, pre-switch action-left trials; R(n)=1, pre-switch action-right trials; other trial types were not considered for the decoding analysis. Using a randomly chosen fraction (80%) of the population activity vectors, we constructed a classifier based on linear discriminant analysis, using Mahalanobis distances with stratified covariance estimates (the “classify” function in MATLAB with “Mahalanobis” option). We then tested the performance of this classifier on the remaining 20% of the population activity vectors, comparing the classification results with actual trial types. This five-fold cross-validation process was repeated 1,000 times to obtain a median estimate of classifier accuracy. To investigate decoding accuracy across time, the timing information of each population activity vector relative to the time of response in each trial was retained. We then ran a separate decoding analysis on the population activity vectors measured during each time period, using a non-overlapping sliding window with a duration of 0.28 s and a step size of 0.28 s. This window duration is the inverse of the frame rate, which was 3.6 Hz. To decode from single-cell activity, ΔF/F(t) of each cell was used instead of population activity vectors as inputs to construct the classifier.
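
The decoding procedure can be sketched in Python as a minimum-Mahalanobis-distance classifier evaluated over repeated random 80/20 splits. This is an illustrative stand-in, not the MATLAB “classify” call used in the study: it assumes that the “Mahalanobis” option amounts to assigning each vector to the class whose mean is nearest in Mahalanobis distance, with a separate covariance estimated for each class. Function names and the random seed are hypothetical.

```python
import numpy as np


def fit_classes(X, y):
    """Per-class mean and covariance (one covariance estimate per class)."""
    return {c: (X[y == c].mean(axis=0), np.cov(X[y == c], rowvar=False))
            for c in np.unique(y)}


def predict(model, X):
    """Assign each row of X to the class with the smallest Mahalanobis distance."""
    classes = list(model)
    d2 = np.empty((len(X), len(classes)))
    for j, c in enumerate(classes):
        mean, cov = model[c]
        diff = X - mean
        # Squared Mahalanobis distance of every row to this class.
        d2[:, j] = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return np.array(classes)[np.argmin(d2, axis=1)]


def median_accuracy(X, y, n_repeats=1000, train_frac=0.8, seed=0):
    """Median test accuracy over repeated random train/test splits.

    X: (n_vectors, 3) population activity vectors; y: (n_vectors,) trial types.
    """
    rng = np.random.default_rng(seed)
    n_train = int(round(train_frac * len(X)))
    accuracy = np.empty(n_repeats)
    for i in range(n_repeats):
        order = rng.permutation(len(X))
        train, test = order[:n_train], order[n_train:]
        model = fit_classes(X[train], y[train])
        accuracy[i] = np.mean(predict(model, X[test]) == y[test])
    return float(np.median(accuracy))
```

For the time-resolved analysis, the same function would be called separately on the vectors falling in each 0.28 s window. The sketch expects at least two features per vector and enough training vectors per class for the covariance to be invertible.
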

Statistics

Statistical tests were performed in MATLAB, and are indicated in the main text or figure legends. Briefly, a Wilcoxon signed-rank test was used for all two-sample, paired comparisons. For two-sample, unpaired comparisons, a Wilcoxon rank-sum test was used. Paired t-tests were used for bin-wise analysis of lick rates. For quantification of choice signals as a function of time, multiple linear regression was first performed as detailed above; a binomial test was then applied to the proportion of cells significantly encoding choice within each time-bin. For ensemble decoding analyses, mean classification accuracy was tested against chance level using a one-sample t-test. For t-tests, the sampling distribution of the mean was assumed to be normal, but this was not formally tested. All t-tests were two-tailed. A statistics checklist is available in the Supplementary Materials.
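
As an illustration of the binomial test on the proportion of choice-encoding cells, the following Python sketch computes the probability of observing at least a given number of significant cells under a null proportion. The null proportion and the use of an upper-tail probability are assumptions made for the example, since neither is specified in this section, and the function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p).

    k: number of cells with significant choice encoding in a time bin.
    n: total number of cells tested.
    p: null probability that a cell is significant by chance (assumed).
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))
```

For example, binomial_upper_tail(12, 50, 0.05) would give the probability of finding 12 or more significant cells out of 50 if each cell had a 5% chance of reaching significance; these counts are hypothetical and not taken from the data.
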

Code availability

The custom MATLAB code used for this study is available upon request.

Data availability

The data that support the findings of this study are available from the corresponding author upon request.

Acknowledgments

We thank Daeyeol Lee, Marina Picciotto, and Jane Taylor for useful discussions, and Caroline Posner and Rachel Hannibal for assistance with behavioral training. This work was supported by National Institute on Aging center grant P50AG047270 (A.C.K.), NARSAD Young Investigator Award (A.C.K.), National Institute of Mental Health grant R21MH110712 (A.C.K.), National Institutes of Health training grant T32NS041228 (M.J.S.), National Science Foundation Graduate Research Fellowship DGE-1122492 (M.J.S.), and Brown-Coxe Postdoctoral Fellowship (F.A.).

Footnotes

Author contributions

M.J.S. and A.C.K. conceived the project. M.J.S. performed all experiments. V.P. assisted with mouse surgery and inactivation experiments. F.A. and M.L. assisted with behavioral training and histology. M.J.S. and A.C.K. analyzed the data and wrote the manuscript.

Competing financial interests

None.

References

  • 1. Griffiths KR, Morris RW, Balleine BW. Translational studies of goal-directed action as a framework for classifying deficits across psychiatric disorders. Front Syst Neurosci. 2014;8:101. doi: 10.3389/fnsys.2014.00101.
  • 2. Asaad WF, Rainer G, Miller EK. Task-specific neural activity in the primate prefrontal cortex. J Neurophysiol. 2000;84:451–459. doi: 10.1152/jn.2000.84.1.451.
  • 3. Rich EL, Shapiro M. Rat prefrontal cortical neurons selectively code strategy switches. J Neurosci. 2009;29:7208–7219. doi: 10.1523/JNEUROSCI.6068-08.2009.
  • 4. Rodgers CC, DeWeese MR. Neural correlates of task switching in prefrontal cortex and primary auditory cortex in a novel stimulus selection task for rodents. Neuron. 2014;82:1157–1170. doi: 10.1016/j.neuron.2014.04.031.
  • 5. Mitz AR, Godschalk M, Wise SP. Learning-dependent neuronal activity in the premotor cortex: activity during the acquisition of conditional motor associations. J Neurosci. 1991;11:1855–1872. doi: 10.1523/JNEUROSCI.11-06-01855.1991.
  • 6. Chen LL, Wise SP. Neuronal activity in the supplementary eye field during acquisition of conditional oculomotor associations. J Neurophysiol. 1995;73:1101–1121. doi: 10.1152/jn.1995.73.3.1101.
  • 7. Pasupathy A, Miller EK. Different time courses of learning-related activity in the prefrontal cortex and striatum. Nature. 2005;433:873–876. doi: 10.1038/nature03287.
  • 8. Antzoulatos EG, Miller EK. Differences between neural activity in prefrontal cortex and striatum during learning of novel abstract categories. Neuron. 2011;71:243–249. doi: 10.1016/j.neuron.2011.05.040.
  • 9. Mante V, Sussillo D, Shenoy KV, Newsome WT. Context-dependent computation by recurrent dynamics in prefrontal cortex. Nature. 2013;503:78–84. doi: 10.1038/nature12742.
  • 10. Stokes MG, et al. Dynamic Coding for Cognitive Control in Prefrontal Cortex. Neuron. 2013;78:364–375. doi: 10.1016/j.neuron.2013.01.039.
  • 11. Wilson RC, Takahashi YK, Schoenbaum G, Niv Y. Orbitofrontal cortex as a cognitive map of task space. Neuron. 2014;81:267–279. doi: 10.1016/j.neuron.2013.11.005.
  • 12. Durstewitz D, Vittoz NM, Floresco SB, Seamans JK. Abrupt transitions between prefrontal neural ensemble states accompany behavioral transitions during rule learning. Neuron. 2010;66:438–448. doi: 10.1016/j.neuron.2010.03.029.
  • 13. Karlsson MP, Tervo DGR, Karpova AY. Network resets in medial prefrontal cortex mark the onset of behavioral uncertainty. Science. 2012;338:135–139. doi: 10.1126/science.1226518.
  • 14. Bunge SA, et al. Neural circuitry underlying rule use in humans and nonhuman primates. J Neurosci. 2005;25:10347–10350. doi: 10.1523/JNEUROSCI.2937-05.2005.
  • 15. White IM, Wise SP. Rule-dependent neuronal activity in the prefrontal cortex. Exp Brain Res. 1999;126:315–335. doi: 10.1007/s002210050740.
  • 16. Wise SP, Murray EA. Arbitrary associations between antecedents and actions. Trends in Neurosciences. 2000;23:271–276. doi: 10.1016/s0166-2236(00)01570-8.
  • 17. Petrides M. Deficits on conditional associative-learning tasks after frontal- and temporal-lobe lesions in man. Neuropsychologia. 1985;23:601–614. doi: 10.1016/0028-3932(85)90062-4.
  • 18. Halsband U, Passingham RE. Premotor cortex and the conditions for movement in monkeys (Macaca fascicularis). Behavioural Brain Research. 1985;18:269–277. doi: 10.1016/0166-4328(85)90035-x.
  • 19. Nixon PD, McDonald KR, Gough PM, Alexander IH, Passingham RE. Cortico-basal ganglia pathways are essential for the recall of well-established visuomotor associations. European Journal of Neuroscience. 2004;20:3165–3178. doi: 10.1111/j.1460-9568.2004.03788.x.
  • 20. Toni I, Ramnani N, Josephs O, Ashburner J, Passingham RE. Learning arbitrary visuomotor associations: temporal dynamic of brain activity. NeuroImage. 2001;14:1048–1057. doi: 10.1006/nimg.2001.0894.
  • 21. Boettiger CA, D'Esposito M. Frontal networks for learning and executing arbitrary stimulus-response associations. J Neurosci. 2005;25:2723–2732. doi: 10.1523/JNEUROSCI.3697-04.2005.
  • 22. Rushworth MFS, Hadland KA, Paus T, Sipila PK. Role of the human medial frontal cortex in task switching: a combined fMRI and TMS study. J Neurophysiol. 2002;87:2577–2592. doi: 10.1152/jn.2002.87.5.2577.
  • 23. Murray EA, Bussey TJ, Wise SP. Role of prefrontal cortex in a network for arbitrary visuomotor mapping. Exp Brain Res. 2000;133:114–129. doi: 10.1007/s002210000406.
  • 24. Preuss TM. Do rats have prefrontal cortex? The Rose-Woolsey-Akert program reconsidered. J Cogn Neurosci. 1995;7:1–24. doi: 10.1162/jocn.1995.7.1.1.
  • 25. Nachev P, Kennard C, Husain M. Functional role of the supplementary and pre-supplementary motor areas. Nat Rev Neurosci. 2008;9:856–869. doi: 10.1038/nrn2478.
  • 26. Schall JD, Stuphorn V, Brown JW. Monitoring and control of action by the frontal lobes. Neuron. 2002;36:309–322. doi: 10.1016/s0896-6273(02)00964-9.
  • 27. Isoda M, Hikosaka O. Switching from automatic to controlled action by monkey medial frontal cortex. Nat Neurosci. 2007;10:240–248. doi: 10.1038/nn1830.
  • 28. Gremel CM, Costa RM. Premotor cortex is critical for goal-directed actions. Front Comput Neurosci. 2013;7:110. doi: 10.3389/fncom.2013.00110.
  • 29. Sul JH, Jo S, Lee D, Jung MW. Role of rodent secondary motor cortex in value-based action selection. Nat Neurosci. 2011;14:1202–1208. doi: 10.1038/nn.2881.
  • 30. Erlich JC, Bialek M, Brody CD. A cortical substrate for memory-guided orienting in the rat. Neuron. 2011;72:330–343. doi: 10.1016/j.neuron.2011.07.010.
  • 31. Murakami M, Vicente MI, Costa GM, Mainen ZF. Neural antecedents of self-initiated actions in secondary motor cortex. Nat Neurosci. 2014;17:1574–1582. doi: 10.1038/nn.3826.
  • 32. Passingham RE, Myers C, Rawlins N, Lightfoot V, Fearn S. Premotor cortex in the rat. Behavioral Neuroscience. 1988;102:101–109. doi: 10.1037//0735-7044.102.1.101.
  • 33. Fusi S, Asaad WF, Miller EK, Wang XJ. A neural circuit model of flexible sensorimotor mapping: learning and forgetting on multiple timescales. Neuron. 2007;54:319–333. doi: 10.1016/j.neuron.2007.03.017.
  • 34. Chen TW, et al. Ultrasensitive fluorescent proteins for imaging neuronal activity. Nature. 2013;499:295–300. doi: 10.1038/nature12354.
  • 35. Machens CK, Romo R, Brody CD. Functional, but not anatomical, separation of ‘what’ and ‘when’ in prefrontal cortex. J Neurosci. 2010;30:350–360. doi: 10.1523/JNEUROSCI.3276-09.2010.
  • 36. Brendel W, Romo R, Machens CK. Demixed principal component analysis. Advances in Neural Information …. 2011 doi: 10.7554/eLife.10989.
  • 37. Guo ZV, et al. Flow of cortical activity underlying a tactile decision in mice. Neuron. 2014;81:179–194. doi: 10.1016/j.neuron.2013.10.020.
  • 38. Li N, Chen TW, Guo ZV, Gerfen CR, Svoboda K. A motor cortex circuit for motor planning and movement. Nature. 2015;519:51–56. doi: 10.1038/nature14178.
  • 39. Duan CA, Erlich JC, Brody CD. Requirement of Prefrontal and Midbrain Regions for Rapid Executive Control of Behavior in the Rat. Neuron. 2015;86:1491–1503. doi: 10.1016/j.neuron.2015.05.042.
  • 40. Asaad WF, Rainer G, Miller EK. Neural activity in the primate prefrontal cortex during associative learning. Neuron. 1998;21:1399–1407. doi: 10.1016/s0896-6273(00)80658-3.
  • 41. Darrah JM, Stefani MR, Moghaddam B. Interaction of N-methyl-D-aspartate and group 5 metabotropic glutamate receptors on behavioral flexibility using a novel operant set-shift paradigm. Behav Pharmacol. 2008;19:225–234. doi: 10.1097/FBP.0b013e3282feb0ac.
  • 42. Znamenskiy P, Zador AM. Corticostriatal neurons in auditory cortex drive decisions during auditory discrimination. Nature. 2013;497:482–485. doi: 10.1038/nature12077.
  • 43. Brasted PJ, Wise SP. Comparison of learning-related neuronal activity in the dorsal premotor cortex and striatum. Eur J Neurosci. 2004;19:721–740. doi: 10.1111/j.0953-816x.2003.03181.x.
  • 44. Wallis JD, Anderson KC, Miller EK. Single neurons in prefrontal cortex encode abstract rules. Nature. 2001;411:953–956. doi: 10.1038/35082081.
  • 45. Wills TJ, Lever C, Cacucci F, Burgess N, O'Keefe J. Attractor dynamics in the hippocampal representation of the local environment. Science. 2005;308:873–876. doi: 10.1126/science.1108905.
  • 46. Leutgeb S, et al. Independent codes for spatial and episodic memory in hippocampal neuronal ensembles. Science. 2005;309:619–623. doi: 10.1126/science.1114037.
  • 47. Hyman JM, Ma L, Balaguer-Ballester E, Durstewitz D, Seamans JK. Contextual encoding by ensembles of medial prefrontal cortex neurons. PNAS. 2012;109:5086–5091. doi: 10.1073/pnas.1114415109.
  • 48. Schneider DM, Nelson A, Mooney R. A synaptic and circuit basis for corollary discharge in the auditory cortex. Nature. 2014;513:189–194. doi: 10.1038/nature13724.
  • 49. Manita S, et al. A Top-Down Cortical Circuit for Accurate Sensory Perception. Neuron. 2015;86:1304–1316. doi: 10.1016/j.neuron.2015.05.006.
  • 50. Rothwell PE, et al. Input- and Output-Specific Regulation of Serial Order Performance by Corticostriatal Circuits. Neuron. 2015;88:345–356. doi: 10.1016/j.neuron.2015.09.035.
  • 51. Pologruto TA, Sabatini BL, Svoboda K. ScanImage: flexible software for operating laser scanning microscopes. Biomed Eng Online. 2003;2:13. doi: 10.1186/1475-925X-2-13.
  • 52. Peron SP, Freeman J, Iyer V, Guo C, Svoboda K. A Cellular Resolution Map of Barrel Cortex Activity during Tactile Behavior. Neuron. 2015;86:783–799. doi: 10.1016/j.neuron.2015.03.027.
