Author manuscript; available in PMC: 2022 Sep 7.
Published in final edited form as: Nat Neurosci. 2022 Mar 7;25(3):330–344. doi: 10.1038/s41593-022-01025-5

Secondary auditory cortex mediates a sensorimotor mechanism for action timing

Jonathan R Cook 1,2,3, Hao Li 1, Bella Nguyen 1, Hsiang-Hsuan Huang 1,2, Payaam Mahdavian 1, Megan A Kirchgessner 2,4,5, Patrick Strassmann 1,2, Max Engelhardt 1, Edward M Callaway 4, Xin Jin 1,6,7
PMCID: PMC9288832  NIHMSID: NIHMS1819271  PMID: 35260862

Abstract

The ability to accurately determine when to perform an action is a fundamental brain function and vital to adaptive behavior. The behavioral mechanism and neural circuit for action timing, however, remain largely unknown. Using a new, self-paced action timing task in mice, we found that deprivation of auditory, but not somatosensory or visual input, disrupts learned action timing. The hearing effect was dependent on the auditory feedback derived from the animal’s own actions, rather than passive environmental cues. Neuronal activity in the secondary auditory cortex was found to be both correlated with and necessary for the proper execution of learned action timing. Closed-loop, action-dependent optogenetic stimulation of the specific task-related neuronal population within the secondary auditory cortex rescued the key features of learned action timing under auditory deprivation. These results unveil a previously underappreciated sensorimotor mechanism in which the secondary auditory cortex transduces self-generated audiomotor feedback to control action timing.


Action selection is a fundamental question in behavioral neuroscience. The brain has to choose not only what to do, but also when to start and stop an action1,2. Timing is thus a fundamental dimension of behavior that under many circumstances determines whether an animal’s actions achieve their intended outcome. At very short timescales, temporal control of action can be mediated via genetically preprogrammed sensorimotor feedback circuits, such as those in the spinal cord3–6. In contrast, the behavioral mechanism and neural locus for the timing of complex action patterns operating over longer durations, like those seen in hunting prey and human athletics, remain largely unknown.

Using a self-paced action timing task, we explored the role of a self-generated, action-dependent sensory feedback cue on a temporally defined, learned action pattern. Acute sensory deprivation screens revealed that auditory, but not somatosensory or visual, deprivation disrupted learned action timing, or the time when animals terminate action. This hearing effect was found to be specifically dependent on the auditory feedback derived from the animal’s own actions, rather than any passive environmental cues. cFos staining and electrophysiological recordings revealed selective neuronal activation of the secondary, but not primary, auditory cortex that was related to action timing in both a learning-dependent and performance-dependent manner. Pharmacological and optogenetic inactivation of the secondary auditory cortex was found to impair learned action timing and recapitulated the behavioral phenotypes of auditory deprivation. Using a transgenic, activity-dependent labeling system7 (targeted recombination of active populations, or TRAP), we were able to selectively target the neuronal population in the secondary auditory cortex that was active during the timing task performance. Further experiments using closed-loop optogenetics demonstrated that action-triggered, but not random, optical stimulation of this task-related neuronal subpopulation in the secondary auditory cortex was able to rescue the key features of learned action timing under auditory deprivation. These results unveil a previously unappreciated role for sensorimotor feedback in regulating action timing, and pinpoint a secondary sensory cortical field in mediating temporal control over learned action timing.

Results

Mice learn a self-paced start/stop action timing task.

Mice were trained in an operant chamber under a novel action timing paradigm in which the start and stop of the timing interval were under the animals’ control. In this task design, a single lever press triggers clock initiation and a reward is provided upon the first press 30 s after initiation press (Fig. 1a and Methods). Timing performance is assayed based on response patterns during randomly interleaved reward omission or probe trials that occur at 30% chance during a session. For both rewarded and probe trials, the lever remains extended for a total of 100 s after initiation press, making the two trial types indistinguishable in terms of the duration over which the lever is available for responding (Fig. 1a). Lever extension and retraction serve only to demarcate trial progression, providing no information on trial type or clock initiation/termination. This self-paced fixed-interval (SFI) reinforcement schedule thus encourages animals to act according to their internally monitored sense of the timing interval (Fig. 1b). Early in training, animals distributed lever pressing evenly across the 100-s probe trials (Fig. 1c; day 1). With training, however, pressing behavior was characterized by a dynamic pattern of responding reflecting fixed-interval learning (Fig. 1c; day 21). To quantify the response patterns in the probe trials, we used three metrics: the response rate at 30 s, the peak time and the half peak fall time (Fig. 1c; day 21). The response rate at 30 s in the probe trials serves as a readout of how strongly the animals respond around reinforcement time. The peak time and half peak fall time, aligned to the initiation press, reflect the animals’ temporal control of action. As training proceeded, animals gradually increased their response rate around 30 s during the probe trials (Fig. 1d,e). 
Notably, learning is characterized by subtle changes in the response profile from 0 to 30 s and dramatic reductions in relative responding occurring after 30 s (Fig. 1c,d). These changes in response distribution are evidenced by the large decrease in half peak fall time compared to the smaller decrease in peak time (Fig. 1f).
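The trial logic described above can be sketched in a few lines of Python. This is only an illustrative reading of the schedule in the text (30-s interval, 100-s lever window, 30% probe probability); the function names `run_sfi_trial` and `draw_is_probe` are hypothetical and are not taken from the authors' task code.

```python
import random

INTERVAL_S = 30.0       # fixed interval, from the text
LEVER_WINDOW_S = 100.0  # lever stays extended this long after the initiation press
PROBE_PROB = 0.30       # chance that a trial is a reward-omission (probe) trial


def draw_is_probe(rng):
    """Randomly decide whether the next trial is a probe trial."""
    return rng.random() < PROBE_PROB


def run_sfi_trial(press_times, is_probe):
    """Evaluate one trial.

    press_times: sorted press times (s) measured from lever extension.
    The first press starts the clock; on rewarded trials the first later
    press at or beyond 30 s (relative to that initiation press) earns the reward.
    """
    if not press_times:
        return {"initiation": None, "reward_time": None, "aligned_presses": []}
    t0 = press_times[0]
    # Re-align every press to the initiation press and keep the 100-s window.
    aligned = [t - t0 for t in press_times if t - t0 <= LEVER_WINDOW_S]
    reward_time = None
    if not is_probe:
        for t in aligned[1:]:  # skip the initiation press itself
            if t >= INTERVAL_S:
                reward_time = t
                break
    return {"initiation": t0, "reward_time": reward_time, "aligned_presses": aligned}


if __name__ == "__main__":
    rng = random.Random(0)
    presses = [5.0, 10.0, 34.0, 36.0, 40.0]
    print(run_sfi_trial(presses, is_probe=draw_is_probe(rng)))
```

Because the clock starts at the animal's own first press, every time in `aligned_presses` is relative to that press, which is the alignment used for the probe-trial metrics.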

Fig. 1 |. Self-paced fixed-interval timing task in mice.


a, The SFI schedule requires animals to both initiate and terminate the fixed interval. The first press 30 s after initiation yields a reward. Probe trials with omitted reward were randomly interleaved at 30% probability. b, Operant chamber setup for SFI. c, Behavior of an example mouse for probe trials at days 1, 4, 7 and 21 of SFI training. SFI performance was measured by the response rate at 30 s, peak time and the half peak fall time. d, Average PETHs (n = 10) for response rate (top) and the percentage maximum response rate (bottom). e,f, The presses per min at 30 s (e; main effect of treatment F(4,36) = 19.10, P < 0.0001; day 1 versus day 21, P < 0.0001), peak time (f; main effect of treatment F(4,36) = 5.870, P = 0.0010, day 1 versus day 21, P = 0.0056) and half peak fall time (f; main effect of treatment F(4,36) = 47.05, P < 0.0001, day 1 versus day 21, P < 0.0001) across training days (n = 10). Learning data were analyzed using repeated-measures one-way analysis of variance (ANOVA) followed by Tukey post hoc comparisons. Values for performance metrics are means. Shading for average PETHs and error bars denote the s.e.m. ****P < 0.0001; ** P < 0.01.

Detailed analyses of the learning reveal that the increase in responding at 30 s strongly correlates with the decrease in response rate at later times in the probe trial window (Extended Data Fig. 1a,b), further emphasizing the importance of the half peak fall time in shaping the response dynamics that give rise to the emergence of peak timing. This relationship was not an artifact of peri-event time histogram (PETH) construction and was present at both single-trial and population levels (Extended Data Fig. 1c,d). Additional experiments training animals on 100% rewarded trials for 21 d and then switching them to regular SFI (that is, 70% rewarded trials and 30% probe trials) demonstrate that these animals show an inability to stop pressing following 30 s in probes (Extended Data Fig. 1e,f), once again confirming the importance of the fall time in shaping the peak press distribution. Detailed analyses reveal that both the peak time and half peak fall time are independent of the latency to initiate the clock after lever extension (Extended Data Fig. 1g–i), further confirming the self-paced nature of action timing in the SFI task design. Furthermore, raster alignment to the start and stop of pressing bouts reveals that PETHs resemble a step function without a peak, indicating that trial-by-trial behavior is in fact characterized by low rates of responding early on, followed by an abrupt switch to a constant high rate, and then an abrupt return to no responding (Extended Data Fig. 1j). Similar step-function-like response dynamics have been previously observed across species trained under fixed-interval schedules8–11. Furthermore, mean bout stop time correlates with half peak fall time, demonstrating once again that the half peak fall time serves as an effective measure for capturing overall trial-by-trial action timing (Extended Data Fig. 1k). 
It is important to note, however, that by using the half peak fall time, we are not implying that there is a gradual decrease in responding at a trial-by-trial level, but rather we are utilizing this metric from the PETH as an overall characterization of the individual step-like, stop times for a given session. Together, these results suggest that mice learn to coordinate their actions in a temporally specific manner under the self-paced design of SFI, and that the half peak fall time serves as an effective measure for action timing.
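As a rough illustration of how the three probe-trial metrics could be read off a PETH, the sketch below bins presses aligned to the initiation press and extracts the response rate at 30 s, the peak time and the half peak fall time. The exact definitions are given in the Methods, which are not reproduced here, so the 1-s bin width, the absence of smoothing, and the rule "first bin after the peak at or below half the peak rate" are all assumptions; the names `peth` and `timing_metrics` are hypothetical.

```python
def peth(trials, bin_s=1.0, window_s=100.0):
    """Average press rate (presses per min) per time bin.

    trials: list of trials, each a list of press times (s) aligned to the
    initiation press. The initiation press itself (t = 0) is assumed to
    have been excluded by the caller.
    """
    n_bins = int(window_s / bin_s)
    counts = [0] * n_bins
    for presses in trials:
        for t in presses:
            if 0 <= t < window_s:
                counts[int(t // bin_s)] += 1
    scale = 60.0 / (bin_s * len(trials))  # counts per bin -> presses per min
    return [c * scale for c in counts]


def timing_metrics(rates, bin_s=1.0, interval_s=30.0):
    """Extract rate at the interval, peak time and half peak fall time."""
    peak_idx = max(range(len(rates)), key=rates.__getitem__)
    half_peak = rates[peak_idx] / 2.0
    fall_idx = None
    # Assumed definition: first bin after the peak at or below half the peak rate.
    for i in range(peak_idx + 1, len(rates)):
        if rates[i] <= half_peak:
            fall_idx = i
            break
    return {
        "rate_at_interval": rates[int(interval_s // bin_s)],
        "peak_time": peak_idx * bin_s,
        "half_peak_fall_time": None if fall_idx is None else fall_idx * bin_s,
    }


if __name__ == "__main__":
    # Two made-up probe trials, for illustration only (not study data).
    trials = [[20.5, 25.5, 28.5, 30.5, 31.5], [25.2, 28.2, 30.2, 30.8, 35.5]]
    print(timing_metrics(peth(trials)))
```

Note that this summary is computed on the session-averaged PETH, consistent with the point made above: a gradual fall in the average does not imply a gradual decrease within any single trial.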

Auditory input is required for performing learned action timing.

In the SFI task, mice coordinate their lever-pressing behavior with the internally estimated time interval. We hypothesized that a sensorimotor mechanism was being used to assist in interval timing. To address this question, we performed acute sensory deprivation experiments to test three modalities that could potentially contribute to action timing in the task: somatosensation/proprioception, vision, and audition, with the expectation that decoupling the performance-tuned modality would disrupt the learned timing behavior. Following 21 d of training on SFI, sensory deprivation experiments were performed over an entire session and compared to pre-control and post-control sessions. In the first experiment, we tested the contribution of somatosensation/proprioception by injecting a lidocaine/cyanquixaline (CNQX) cocktail into the brachial plexus behind the left and right scapula (Fig. 2a and Methods). This pharmacological manipulation has been previously shown to block neurotransmission of both afferent and efferent nerve fibers, resulting in the significant attenuation of somatosensation and proprioception12. Assessment of the nerve block effects using the von Frey method revealed decreased sensitivity and a significant increase in the nociceptive threshold of the forepaws (Fig. 2a and Methods). The somatosensation/proprioception manipulation did not lead to a significant change in response rate at 30 s (Fig. 2b), nor did it alter the peak time or half peak fall time compared to control sessions (Fig. 2b). We then tested the role of the visual system in SFI performance by turning off the operant chamber house light, leaving the animals to perform the learned SFI task in complete darkness (Fig. 2c and Methods). With the house light off, ambient light was reduced to a level that has been previously shown to result in a significant decrease in optomotor-based behaviors13. 
Similarly to the nerve block, the manipulation of visual input had no effects on SFI performance in terms of response rate at 30 s (Fig. 2d), peak time or half peak fall time (Fig. 2d).

Fig. 2 |. Auditory deprivation acutely disrupts self-paced fixed-interval performance.


a, Validation of acute somatosensory/proprioceptive manipulation. The von Frey threshold following nerve block (n = 8) of the forepaw (main effect of treatment F(2,12) = 7.55, P = 0.0075; nerve block versus pre-control and post-control, P = 0.0477 and P = 0.0069, respectively). b, Response rates at 30 s (left), peak time and half peak fall time (right) for somatosensory/proprioceptive deprivation (n = 10). c, Validation of acute visual manipulation. Ambient light levels resulting from turning the house light off. The solid line indicates ambient light levels whereby animals have been shown to have unaffected optomotor behavior. The dotted line indicates ambient light levels whereby animals have been shown to demonstrate ~25% decrement in optomotor performance. d, Same as b but for visual deprivation (n = 12). e, Validation of acute auditory deprivation. Startle amplitude during ear sealing (n = 7) in response to a 40-ms, 120-dB auditory stimulus (main effect of treatment F(2,10) = 8.66, P = 0.0066; ears sealed versus pre-control and post-control, P = 0.0060 and P = 0.0407, respectively). f, Same as b and d, but for auditory deprivation (n = 10; main effect of treatment for presses per min at 30 s F(2,18) = 10.88, P = 0.0008, ears sealed versus pre-control and post-control, P = 0.0008 and P = 0.0092, respectively; main effect of treatment for half peak fall time F(2,18) = 35.78, P < 0.0001, ears sealed versus pre-control and post-control, P < 0.0001 and P < 0.0001, respectively; main effect of treatment for peak time F(2,18) = 0.6923, P = 0.5132, ears sealed versus pre-control and post-control, P = 0.5636 and P = 0.5865, respectively). g, Response rasters (top), response-rate PETH (middle) and the percentage maximum response-rate PETH (bottom) of an exemplar for probe trials performing SFI under auditory deprivation. 
h, Average PETHs for response rate (top) and the percentage maximum response rate (bottom) for the pre-control session and the ear-seal manipulation (n = 10). Data were analyzed using repeated-measures one-way ANOVA followed by Tukey post hoc comparisons. Values for average PETHs, startle response, von Frey thresholds and SFI performance metrics are means. Error bars and shading for average PETHs denote the s.e.m. ****P < 0.0001; ***P < 0.001; **P < 0.01; *P < 0.05; NS, not significant.

Next, we tested the role of audition in SFI performance (Fig. 2e and Methods). Bilateral ear sealing profoundly disrupted the learned action temporal dynamics in trained animals (Fig. 2f–h), with the manipulation effects resembling the performance features of early SFI training (Fig. 1e,f). Notably, auditory deprivation resulted in a significant decrease in response rate at 30 s compared to control sessions (Fig. 2f). The temporal dynamics of responding were also dramatically altered under auditory deprivation, with a significant increase in the half peak fall time (Fig. 2f). A nonsignificant trend of an increase in peak time was also observed (Fig. 2f). The hearing blockade via ear sealing was confirmed by measuring the acoustic startle reflex behavior14 (Fig. 2e and Methods). This auditory deprivation effect was acute and specific to the manipulation session, with performance features largely returning to pre-manipulation levels in the post-control session (Fig. 2f). We also tested animals’ performance with ear sealing done for every day of training (Extended Data Fig. 2). While significant performance deficits were present at day 21 of ear-seal training, timing behavior was largely intact, indicating that while audition seems to be the default modality for SFI performance, other modalities can take its place.

One may argue that the ear-sealing effects on SFI performance are the result of a disruption in the animals’ ability to hear a reward-delivery-related auditory cue, such as the sound produced from the pellet dropping into the reward magazine. Disruption of an auditory cue such as this might impair SFI performance by delaying the reward retrieval time, leading to within-session learning of a longer interval and a delayed half peak fall time. In two separate experiments, (1) using a liquid reward in place of the pellet (Extended Data Fig. 3a–e), and (2) through extinction training (Extended Data Fig. 3f–h), however, we were able to demonstrate that the behavioral effects from auditory deprivation are independent of any reward-related cues (Methods).

Secondary auditory cortical activity is correlated with learned action.

Lever pressing produces a very salient clicking sound resulting from the lever tilting and making a mechanical contact (Supplementary Video 1, Fig. 3a–c and Methods). We hypothesized that this press-dependent clicking sound was the primary auditory source, which when blocked under auditory deprivation was causing the deficiencies in learned SFI performance. Brain regions, such as the auditory cortices, would be expected to exhibit sensorimotor-dependent activity if animals had indeed learned to utilize their own action-generated sounds to inform action timing15,16. To test this hypothesis, we performed an immunohistochemistry screen on trained animals that had just performed the SFI task, and looked at the expression patterns of the immediate-early protein, cFos, along the auditory pathway (Extended Data Fig. 4), including the auditory cortices. In order to have a within-subject control, we performed unilateral ear sealing in these animals, comparing cFos expression across brain hemispheres (Fig. 3e). Unilateral ear sealing results in significant changes in the press rate at 30 s and half peak fall times compared to controls, although to a lesser extent than the bilateral manipulation (Fig. 3d). Upon completion of an SFI behavioral session under random, unilateral auditory deprivation, animals were immediately euthanized for cFos analysis (Fig. 3e).

Fig. 3 |. Learning- and performance-dependent dorsal secondary auditory cortex cFos activation during self-paced fixed-interval task.


a, Response raster from a single probe trial (top) and audio recording signal (bottom) demonstrating responding faithfully elicits an auditory stimulus. b, Audio spectrogram for 0.2-s bins containing a press (left) and no press (right) from the probe trial recording in a. c, Power spectral density periodogram for the 0.2-s bins containing a press and no press in b shown zoomed in the inset. d, Comparison of press rate at 30 s (bottom left; main effect of treatment F(2,25) = 17.80, P < 0.0001, open versus unilateral and bilateral ear seal, P = 0.0030 and P < 0.0001, respectively), peak time and half peak fall times (bottom right; main effect of treatment F(2,25) = 27.01, P < 0.0001, open versus unilateral and bilateral ear seal, P = 0.0369 and P < 0.0001, respectively) for animals trained on SFI with unilateral ear seal (n = 8), bilateral ear seal (n = 10) versus control (n = 10). e, Schedule for cFos screening experiments using unilateral ear sealing during SFI performance. f, cFos immunohistochemistry showing AUDd, AUDp and AUDv cortical regions for a typical animal on day 21 of SFI training with one ear randomly sealed and another open. g, Comparison of the percentage activation according to cFos counts across hemispheres (effect of interaction F(2,24) = 3.687, P = 0.0402; AUDd open versus sealed, P = 0.0003) for each auditory cortical region ipsilateral versus contralateral to the sealed ear. h, Comparison of the percentage activation according to cFos counts across hemispheres for different cortical layers in AUDd (effect of interaction F(4,40) = 2.958, P = 0.0313; layer V: open versus sealed, P = 0.0005) ipsilateral versus contralateral to the sealed ear. i, cFos immunohistochemistry showing AUDd for an example mouse with one ear sealed on day 1 of SFI training (top) and another regular SFI-trained animal with one ear sealed on day 21 in a session without lever (bottom). 
j, Comparison of the percentage AUDd activation across hemispheres according to cFos counts for animals with one ear sealed trained on day 1, day 21, and day 21 without lever (effect of interaction F(2,24) = 7.535, P = 0.0029; AUDd open versus sealed, P = 0.0013). k, Activation ratios (ratio between numbers of cells in hemisphere ipsilateral to the open ear and the sealed ear) plotted against the press rate at 30 s of each individual animal on SFI training day 1, day 21 and day 21 without lever. For SFI behavioral performance under unilateral ear sealing, data were analyzed using one-way ANOVA followed by Tukey post hoc comparisons. Values for performance metrics are means and error bars denote the s.e.m. For cFos immunohistochemistry quantifications, data were analyzed using two-way ANOVA followed by Sidak post hoc comparisons. Bars are means and represent cFos percentages across hemispheres and layers. Scale bars for all immunohistochemical images denote 50 μm, and ‘D’ and ‘L’ denote dorsal and lateral, respectively. For the correlation plot, gray non-labeled points denote individual animals, and gray shading denotes the 95% confidence interval for regression. PCC, Pearson correlation coefficient. ****P < 0.0001; ***P < 0.001; **P < 0.01; *P < 0.05; NS, not significant.

cFos expression was detected along the auditory pathway (Extended Data Fig. 4), and in all three major auditory cortical regions, including the dorsal secondary auditory cortex (AUDd), primary auditory cortex (AUDp) and ventral secondary auditory cortex (AUDv; Fig. 3e). Cross-hemisphere comparison (open versus sealed ear) revealed a significantly higher activation in the hemisphere ipsilateral to the open ear in the AUDd (Fig. 3f,g). In contrast, both AUDp and AUDv showed no difference in cFos expression across the two hemispheres regardless of which ear was sealed (Fig. 3f,g). Furthermore, no hemispheric imbalance in cFos expression was observed in the primary visual cortex (VISp), which served as an independent brain region control (Extended Data Fig. 5a,b). Breaking down AUDd into different cortical layers revealed that layer V was primarily responsible for this cross-hemispheric difference in activation, with layer II/III showing a smaller, but nonsignificant difference (Fig. 3f,h). Because the sealed ear was randomly chosen, and higher cFos expression was always observed in the hemisphere ipsilateral to the open ear, the observed AUDd activation is likely associated with auditory input during SFI performance.

Our behavioral experiments pointed strongly toward the sound generated from the animal’s own lever pressing as the primary source underlying the auditory deprivation effects. We thus decided to examine the contribution of lever pressing on AUDd activation across SFI learning. Unilateral ear sealing on day 1 of SFI training, when the press rate at 30 s was low and the response distribution was temporally uniform (Fig. 1e,f), resulted in no significant difference in cFos expression across hemispheres in AUDd (Fig. 3i,j). Furthermore, we trained mice for 20 d on the SFI task and removed the lever on day 21. With the lever removed, trial progression was programmed to proceed automatically and animals still received the same number of rewards, occurring at 70% chance as in regular SFI (Methods). Again, unilateral ear sealing under this lever-omitted context did not result in any differences in cFos activation across hemispheres in AUDd (Fig. 3i,j). These data suggest that the AUDd activation was dependent on both learning and action. Indeed, the differences in hemispheric AUDd activation during unilateral ear sealing were found to be positively correlated with lever-press rates across learning, with larger differences associated with higher press rates (Fig. 3k). Together, these results underscore a learning-dependent sensorimotor mechanism involving the activation of the secondary auditory cortex by self-generated sound.
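The relationship summarized above, between the hemispheric activation ratio and the press rate, can be sketched as follows. The ratio follows the description in the Fig. 3k legend (cell counts in the hemisphere ipsilateral to the open ear divided by counts in the hemisphere ipsilateral to the sealed ear). The helper names and all numbers in the example are hypothetical placeholders for illustration and are not data from the study.

```python
import math


def activation_ratio(open_side_count, sealed_side_count):
    """cFos+ cells ipsilateral to the open ear / ipsilateral to the sealed ear."""
    return open_side_count / sealed_side_count


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical per-animal values, NOT measurements from the paper:
    # (cells on open-ear side, cells on sealed-ear side, presses per min at 30 s)
    animals = [(100, 98, 5.0), (120, 100, 12.0), (150, 100, 20.0), (180, 105, 28.0)]
    ratios = [activation_ratio(o, s) for o, s, _ in animals]
    press_rates = [p for _, _, p in animals]
    print("Pearson r:", round(pearson_r(press_rates, ratios), 3))
```

A ratio near 1 corresponds to no hemispheric imbalance, as reported for day 1 and for the lever-omitted session; ratios above 1 indicate stronger activation on the side ipsilateral to the open ear.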

Audiomotor feedback regulates dorsal secondary auditory cortex neuronal activity during timing behavior.

To investigate how the audiomotor feedback modulates the online auditory cortex activity at the single-neuron level, we used in vivo electrophysiology to record AUDd neuronal activity in trained mice while they performed the SFI task (Methods). Electrode arrays were implanted into the deep layers of the AUDd, and the neuronal responses to lever pressing were analyzed after 21 d of training2,17,18 (Fig. 4a). We observed that the AUDd neurons can be either inhibited (Fig. 4b) or excited (Fig. 4c) by the auditory feedback generated from the lever pressing (Fig. 3a–c). Of 197 neurons recorded from the AUDd (n = 6 mice), 40 showed significant firing rate change in response to the auditory feedback (Fig. 4d), with a response latency of 17.6 ms (mean) ± 9.5 ms (s.d.). Notably, 80% of the responsive neurons exhibited increased firing activity (Fig. 4e), suggesting that the auditory feedback generated from the lever pressing during the SFI task largely activates the secondary auditory cortex.

Fig. 4 |. Lever-pressing-related neuronal activity in dorsal secondary auditory cortex during the performance of self-paced fixed-interval task in trained mice.


a, In vivo recording of the AUDd (n = 6 animals). b, An example neuron showing phasically inhibited firing activity in response to the auditory feedback from lever pressing (0 s) at day 21 of SFI training. c, An example neuron showing phasically increased firing in response to the auditory feedback from lever pressing at day 21 of SFI training. d, The percentage of responsive versus nonresponsive neurons to auditory feedback from lever pressing. e, The percentage of neurons showing increased versus decreased firing response profiles to lever pressing. f, Activity of an example neuron in the AUDd with an inhibited response profile during lever pressing, before and after ear sealing. g, Firing index of an inhibited neuron during lever pressing (n = 5 neurons), before and after ear sealing. h, Activity of an example neuron in the AUDd with excited response profile during lever pressing, before and after ear sealing. i, Firing index of excited neurons (n = 11 neurons) during lever pressing, before and after ear sealing (two-tailed, paired t-test, t = 2.923, P = 0.0152). Firing indices are means, and error bars denote the s.e.m. Differences in firing index were analyzed using paired t-tests. *P < 0.05; NS, not significant.

To further determine how the auditory feedback regulates AUDd activity, we decided to monitor the effects of ear sealing on AUDd neuronal response. In a subgroup of SFI-trained mice with electrode implantation, we analyzed and compared the neuronal response in the AUDd before and after both ears were sealed during the SFI performance on the same day (Methods). While ear sealing had no detectable effect on the AUDd neurons inhibited by the audiomotor feedback (Fig. 4f,g), the excitation responses of AUDd neurons to audiomotor feedback were significantly reduced, at both the single-cell level (Fig. 4h) and the neuronal population level (Fig. 4i), an effect that was still detectable even without normalization to baseline firing rate (Extended Data Fig. 6a,b). Together, these results suggest that neurons in the secondary auditory cortex are largely activated by the auditory feedback generated from the lever pressing during the SFI performance, and that auditory deprivation significantly diminishes this neuronal audiomotor response.

Dorsal secondary auditory cortex neuronal activity is required for the performance of learned action timing.

We next tested whether the secondary auditory cortex is required for the performance of action timing in the SFI task. Inactivation of the AUDd in the trained mice via infusion of muscimol (Extended Data Fig. 7d), the selective GABA-A agonist, resulted in a reduction in the response rate at 30 s and an increase in the half peak fall time (Fig. 5a–c), closely resembling the effects observed with acute auditory deprivation (Fig. 2e–h). Targeting of the AUDd was confirmed via infusion of fast green dye (Extended Data Fig. 7d). Performance deficits were absent when muscimol was infused into the VISp (Fig. 5c), which is consistent with the results from our control experiments on visual deprivation (Fig. 2c,d) and the cFos screen done in the VISp (Extended Data Fig. 5a,b).

Fig. 5 |. Dorsal secondary auditory cortex activation is necessary in providing the sensorimotor feedback for action timing.


a–c, Muscimol infusion in the auditory cortex disrupted SFI performance. a, Behavior of an example mouse for probe trials with muscimol versus saline infusion in the auditory cortex during SFI performance. b, The response rate (top) and the percentage maximum response rate (bottom) in animals with muscimol versus saline infusion in the auditory cortex. c, Inactivation of the auditory cortex (top left; n = 5) via muscimol infusion resulted in a decreased response rate at 30 s (top middle; two-tailed, paired t-test, t = 3.728, P = 0.0203) and increased half peak fall time (top right; two-tailed, paired t-test, t = 3.404, P = 0.0272). Inactivation of the visual cortex via muscimol infusion (bottom left; n = 6) resulted in no change in response rate at 30 s (bottom middle), peak time or half peak fall time (bottom right). d–g, Press-dependent inhibition of the AUDd via stimulation of Vgat+ populations disrupts SFI performance. d, Exemplar ChR2 expression pattern in the auditory cortex of a Vgat-ires-Cre animal crossed to Ai32. Scale bars, 1 mm. e, Behavior of an example animal using closed-loop, press-triggered optical stimulation (1–2 mW, 500 ms per press) of Vgat+ populations in the AUDd during SFI performance (the blue line denotes stimulation and the gray line denotes no stimulation). f, The optogenetic stimulation effects on response rate (top) and the percentage maximum response rate (bottom) from Vgat+ stimulation in the AUDd of Vgat-Ai32 animals (n = 7) during SFI performance. g, The optogenetic stimulation effects on response rates at 30 s (left; paired t-test, t = 3.332, P = 0.0158), peak time (middle; two-tailed, paired t-test, t = 3.705, P = 0.0100) and half peak fall times (right; paired t-test, t = 9.925, P = 0.0078) for press-dependent optical stimulation of Vgat+ populations in AUDd (n = 7) during SFI performance with ears open. For muscimol infusion and press-triggered optical SFI experiments, data were analyzed using paired t-tests. 
Values for performance metrics are means and error bars denote the s.e.m. Shading for all average PETHs denotes the s.e.m. Scale bars for immunohistochemical images denote 1 mm, and ‘D’ and ‘L’ denote dorsal and lateral, respectively. **P < 0.01; *P < 0.05.

To further confirm this result in relation to action (that is, pressing), we expressed ChR2 in Vgat-positive neurons (Fig. 5d), and inhibited AUDd contingent on the animal pressing the lever. Optical stimulation occurred for both reward and probe trials, and was interleaved randomly at 50% chance with non-stimulated trials to preclude stimulation from being used to determine trial type. For a given stimulation trial, light was delivered for each lever press throughout the lever extension period to avoid stimulation from being used in temporal discrimination. This press-triggered optical SFI design thus provides closed-loop stimulation, in which each lever press triggers momentary light delivery to AUDd (Methods). Closed-loop, action-dependent inhibition of AUDd via stimulation of Vgat+ populations resulted in similar deficits in SFI performance to those seen with muscimol infusion and ear sealing (Fig. 5e–g). Additionally, to interfere with AUDd activity, we performed nonspecific activation of excitatory populations in the region utilizing the CaMKII-Cre driver line and found a similar impact on SFI performance using the same stimulation strategy described above (Extended Data Fig. 8).
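The press-triggered, closed-loop design can be sketched as a simple scheduling rule. The 50% trial interleaving comes from the text and the 500-ms pulse per press from the Fig. 5 legend; everything else, including the function names `assign_stim_trials` and `light_windows` and the choice to leave overlapping pulses unmerged, is an assumption made for illustration rather than a description of the authors' control software.

```python
import random

PULSE_S = 0.5    # light duration per press (500 ms, from the Fig. 5 legend)
STIM_PROB = 0.5  # chance that a trial is a stimulation trial


def assign_stim_trials(n_trials, rng):
    """Randomly interleave stimulation and no-stimulation trials,
    independently of whether a trial is rewarded or a probe."""
    return [rng.random() < STIM_PROB for _ in range(n_trials)]


def light_windows(press_times, stim_trial, pulse_s=PULSE_S):
    """Return one (on, off) light window per lever press on a stimulation trial.

    Every press in the trial triggers light, so light delivery carries no
    information about elapsed time. How presses falling inside an ongoing
    pulse were handled is not specified in the text; here each press simply
    gets its own window.
    """
    if not stim_trial:
        return []
    return [(t, t + pulse_s) for t in press_times]


if __name__ == "__main__":
    rng = random.Random(1)
    stim_flags = assign_stim_trials(4, rng)
    presses = [0.0, 12.0, 25.0, 31.0]
    for trial, stim in enumerate(stim_flags):
        print(trial, stim, light_windows(presses, stim))
```

Because stimulation is tied to presses rather than to a clock, the same rule can be reused for a random-stimulation control by replacing `press_times` with randomly drawn times.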

Optogenetic stimulation of the dorsal secondary auditory cortex rescues timing behavior under auditory deprivation.

We next asked whether activation of the AUDd is sufficient to subserve the sensorimotor feedback needed for action timing under deprivation of auditory input. To do this, we utilized FosTRAP, an activity-dependent labeling mouse line7, to selectively target the neuronal population activated in AUDd during SFI task performance. Specifically, the FosTRAP animal exploits the promoter of cFos to drive the expression of tamoxifen-sensitive Cre recombinase (CreERT2) in an activity-dependent manner7. Following 21 d of regular SFI training, FosTRAP animals were allowed to perform the task with both ears either sealed or open while also receiving an intraperitoneal injection of 4-hydroxytamoxifen (4-OHT; Fig. 6a and Methods). In this way, the active neuronal population in the AUDd was labeled with CreERT2. This activity-dependent expression of Cre recombinase was utilized to drive the expression of channelrhodopsin (ChR2) in either a Cre-ON- or a Cre-OFF-dependent manner depending on the viral construct present in the AUDd (Fig. 6a,b). Additional experiments were performed to verify the fidelity of the FosTRAP system to recapitulate endogenous cFos expression patterns (Extended Data Fig. 5c–g and Methods).

Fig. 6 |. Press-dependent dorsal secondary auditory cortex activation is sufficient in providing the sensorimotor feedback for action timing.

a, Closed-loop, action-dependent optical activation of specific neuronal populations in the AUDd during SFI performance under auditory deprivation, including the three condition groups: ears-open/Cre-ON, ears-open/Cre-OFF and ears-sealed/Cre-ON. b, Exemplar ChR2 expression patterns in the AUDd of the three experimental conditions tested with FosTRAP: ears-open/Cre-ON, ears-sealed/Cre-ON and ears-open/Cre-OFF. Scale bars, 1 mm. c, Behavior of an ears-open/Cre-ON example mouse during SFI performance under auditory deprivation without (red line) versus with (blue line) optogenetic stimulation. d, The optogenetic stimulation effects on response rate (top) and percentage maximum response rate (bottom) in the ears-open/Cre-ON group of animals (n = 14) during SFI performance under auditory deprivation. e, The optogenetic stimulation effects on response rates at 30 s (top; effect of interaction F(2,37) = 8.23, P = 0.0011; ears-open/Cre-ON: stimulation versus no stimulation, P = 0.0009), peak time (middle; effect of interaction F(2,37) = 1.80, P = 0.1791) and half peak fall times (bottom, effect of interaction F(2,37) = 4.13, P = 0.0241, ears-open/Cre-ON: stimulation versus no stimulation, P = 0.0194) for the three experimental groups (ears-open/Cre-ON, n = 14; ears-sealed/Cre-ON, n = 13; and ears-open/Cre-OFF, n = 13) during SFI performance under auditory deprivation. For press-triggered optical SFI experiments, data were analyzed using two-way ANOVA followed by Sidak post hoc comparisons, and bars are means. Shading for all average PETHs denotes the s.e.m. Scale bars for immunohistochemical images denote 1 mm, and ‘D’ and ‘L’ denote dorsal and lateral, respectively. ***P < 0.001; *P < 0.05; NS, not significant.

We prepared three groups of mice under the following conditions and tested each group with press-triggered optical SFI (described above) under auditory deprivation (Fig. 6a,b, Extended Data Fig. 5h and Methods): (1) 4-OHT induction while performing SFI with ears open in combination with a Cre-ON-dependent ChR2 construct, AAV-DIO-ChR2-EYFP, in the AUDd (ears-open/Cre-ON: selective expression of ChR2 in the activated AUDd neurons during normal performance), (2) 4-OHT induction while performing SFI with ears sealed in combination with a Cre-ON-dependent ChR2 construct, AAV-DIO-ChR2-EYFP, in the AUDd (ears-sealed/Cre-ON: selective expression of ChR2 in the activated AUDd neurons during ears sealed performance) and (3) 4-OHT induction while performing SFI with ears open in combination with a Cre-OFF-dependent ChR2 construct, AAV-DO-ChR2-mCherry, in the AUDd (ears-open/Cre-OFF: selective expression of ChR2 in the non-active AUDd neurons during normal performance). Following 4-OHT induction, animals were placed under auditory deprivation to perform press-triggered optical SFI (Extended Data Fig. 6d–l and Methods).

Using such a within-subject design, it was found that closed-loop optical stimulation significantly rescued SFI performance under the ear-sealed condition by increasing the lever-press rate and decreasing the half peak fall time (Fig. 6c–e). Notably, these rescue effects were specific to the ears-open/Cre-ON group, in which AUDd neurons previously active during normal SFI performance were optically stimulated (activating ~1,100 neurons; Extended Data Fig. 7a–c and Methods). This optogenetic effect was absent in both the ears-sealed/Cre-ON and ears-open/Cre-OFF (activating ~5,600 neurons; Extended Data Fig. 7a–c and Methods) groups (Fig. 6e), suggesting that the rescue was not simply due to stimulation light being used as a visual feedback cue and was actually dependent on the optical activation of the task-related population, but not other neurons, within the AUDd. Notably, the AUDd optogenetic stimulation in the ears-open/Cre-ON group did not result in a generalized increase in responding across the entire probe trial, but a selective increase in response rate around 30 s that decayed with time (Fig. 6c,d and Supplementary Video 2), further emphasizing a specific role for this sensorimotor feedback in forming interval timing response dynamics rather than being generally reinforcing. Furthermore, optically induced increases in press rates correlated with changes in half peak fall time. In other words, animals that experienced the largest increase in press rates as a result of optogenetic stimulation also experienced the largest decrease in the half peak fall time (Extended Data Fig. 6j–l). Ear sealing is characterized by a decrease in the response rate at 30 s, and a rightward shift in the time when animals terminate pressing, resulting in an increase in the PETH half peak fall time (Fig. 2e–h).
Notably, these two features improved with optogenetic stimulation of the active population in the AUDd under ear sealing, with an increase in response rate around 30 s, and the response distribution losing its rightward skew, as pressing stop times shifted to earlier times in probe trials. This optogenetic rescue effect thus underscores the essential role of self-generated actions and associated sensorimotor feedback mediated by the AUDd in action timing.

A computational model with sensorimotor feedback recapitulates the timing mechanism.

To further understand the neural mechanism underlying action timing, we constructed a computational model of action timing, modeling self-generated actions and their associated sensorimotor feedback. The model operates on a pacemaker–accumulator-like mechanism19,20 and the animal’s own operant behavior serves as a pacemaker (Fig. 7a and Methods). In this model, each operant lever press is a pulse, randomly emitted from a pacemaker based on the current motivational level (Fig. 7b). An integration center accumulates the pulses (that is, operant lever pressing) and compares this value with a reference memory formed during learning (Fig. 7a)19,20. When the integration in the accumulator exceeds the value in the reference memory, the model’s decision-making mechanism determines that it is the time to stop operant actions (Fig. 7c). Notably, the sensory outcome derived from the animal’s own operant action provides a positive feedback cue to increase the momentary motivation (Fig. 2e–h), that is, the probability of the operant action occurring again in the following time period (Fig. 7a,b and Methods). In this sense, consistent with the empirical data presented thus far, the model features a positive, sensory feedback loop layered on top of a pulse emitter, in which the driving force to sustain pulse emission (that is, pressing) is dependent on the result of previous emissions (that is, auditory feedback). Importantly, however, there is no variable added into the positive feedback loop in our model that influences variance in the timing behavior. Rather, this variance is specifically the result of the interaction of the comparison, accumulator and reference memory components (Methods).
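The following is a minimal sketch of a pacemaker–accumulator loop with positive sensory feedback, written only to make the verbal description concrete. It is not the model specified in the Methods: the functional forms (a linear feedback boost with exponential decay), the function name `simulate_trial` and every parameter value (`theta`, `p_base`, `decay`, `dt`, `t_max`, `feedback_gain`) are assumptions chosen for illustration. In particular, trial-to-trial variance here comes only from stochastic press emission, a simplification relative to the text.

```python
import random

def simulate_trial(feedback_gain, theta=30, p_base=0.05, decay=0.9,
                   dt=0.1, t_max=180.0, rng=None):
    """Simulate one probe trial; return (press_times, stop_time or None).

    feedback_gain > 0 mimics ears open; feedback_gain = 0 mimics ears sealed.
    All default values are illustrative assumptions, not fitted parameters.
    """
    rng = rng or random.Random()
    drive = 0.0   # f(t): feedback-driven boost to the press probability
    alpha = 0     # accumulator alpha(t): number of presses so far
    presses = []
    for step in range(int(t_max / dt)):
        t = step * dt
        p = min(1.0, p_base + drive)      # p(t): press probability in this step
        if rng.random() < p:
            presses.append(t)
            alpha += 1                    # each press is one pacemaker pulse
            drive += feedback_gain        # the press sound raises motivation
            if alpha >= theta:            # comparison with reference memory
                return presses, t         # decision: stop pressing
        drive *= decay                    # feedback fades between presses
    return presses, None

# Example: one ears-open-like and one ears-sealed-like trial
open_presses, open_stop = simulate_trial(0.05, rng=random.Random(1))
sealed_presses, sealed_stop = simulate_trial(0.0, rng=random.Random(1))
```

In this sketch, removing the feedback lowers the press rate, so the accumulator takes longer to reach the threshold and pressing stops later, which is the qualitative pattern the text attributes to ear sealing.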

Fig. 7 |. A computational model of action timing based on the integration of actions regulated by sensorimotor feedback.

a, Schematic of components in a computational model of action timing including: lever-pressing probability function (p(t)), auditory feedback function (f(t)), a pulse accumulator (α(t)), a reference memory accumulation threshold (θ) and a comparison calculator (α(t)/θ). Note, the dashed line denotes updating of reference memory. b, Probability function from the computational model (a) in which the sensory feedback from pressing (bottom) increases the likelihood of another press occurring. c, Example showing how pressing (bottom) leads to accumulation (α(t)) until a threshold (θ) is reached (top) whereby responding is terminated. d, Response rasters (top), response-rate PETHs (middle) and percentage maximum response-rate PETHs (bottom) of a simulation mouse for three conditions: f(t) enabled (left; ears open), f(t) disabled (middle; ears sealed), partial restoration of f(t) (right; ears sealed optogenetic rescue). e, Response rasters (top) and response-rate PETHs (bottom) of a simulation mouse with f(t) enabled (ears open) in which responding is aligned to the start (left) and stop (right) of lever-press sequence. f, Average PETHs for response rate (top) and percentage maximum response rate (bottom) for three simulation conditions (n = 10 each). g, Response rates at 30 s (left; n = 10; main effect of treatment for presses per min at 30 s F(2,27) = 3.35, P < 0.0001, ears sealed versus ears open/ears sealed optogenetic rescue, P < 0.0001 and P = 0.0160, respectively), half peak fall time (right; n = 10; main effect of treatment for half peak fall time F(2,27) = 11.04, P < 0.0001, ears sealed versus ears open/ears sealed optogenetic rescue, P < 0.0001 and P = 0.0003, respectively) and peak time (right, n = 10; main effect of treatment for peak time F(2,27) = 5.89, P = 0.0010, ears sealed versus ears open/ears sealed optogenetic rescue, P = 0.0019 and P = 0.0047, respectively) for three simulation conditions. 
Simulated performance data were analyzed using repeated-measures one-way ANOVA followed by Tukey post hoc comparisons. Values for performance metrics are means. Shading for average PETHs and error bars denote the s.e.m. ****P < 0.0001; ***P < 0.001; *P < 0.05; NS, not significant.

The lever-pressing sequences generated in the model follow a step function without any consistent peaks in the middle (Fig. 7e), matching our experimental observations (Extended Data Fig. 1j). Irrespective of the step-function response at a single-trial level, a peak emerges around 30 s in the averaged lever-pressing histogram in the model due to the jitter in the start/stop times across trials (Fig. 7d). Ear sealing, simulated by removal of the sensorimotor feedback from the model, decreases the peak response rate and delays both the peak time and the half peak fall time (Fig. 7d–g). This phenotype can be rescued by closed-loop optogenetic stimulation in the model to serve as the sensorimotor feedback during ear sealing (Fig. 7d–g), similarly to the experimental effects of optogenetic stimulation of the AUDd (Fig. 6c–e). These results suggest that a simple pacemaker–accumulator model with a sensorimotor feedback mechanism can replicate mouse behavior in the SFI task, and that animals’ self-generated actions along with the associated sensory feedback consequences can play a critical role in timing.
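A short sketch can show why averaging step-function trials produces a peaked histogram. The jitter distributions, the press rate and the helper names below are assumptions for illustration only. The half peak fall time is computed here as the first time after the peak at which the averaged curve falls to half its maximum, which is one plausible reading of the metric rather than the definition given in the Methods.

```python
import random

def step_trial(start, stop, rate, times):
    """Single-trial response: constant rate between start and stop, zero elsewhere."""
    return [rate if start <= t < stop else 0.0 for t in times]

def average_peth(n_trials, times, rng, rate=60.0):
    """Average many step-function trials whose start/stop times jitter across trials."""
    total = [0.0] * len(times)
    for _ in range(n_trials):
        start = rng.gauss(15.0, 8.0)   # assumed jitter in bout onset (s)
        stop = rng.gauss(45.0, 8.0)    # assumed jitter in bout offset (s)
        for i, r in enumerate(step_trial(start, stop, rate, times)):
            total[i] += r
    return [x / n_trials for x in total]

def half_peak_fall_time(peth, times):
    """First time at or after the peak where the PETH is at most half its maximum."""
    peak = max(peth)
    for i in range(peth.index(peak), len(peth)):
        if peth[i] <= peak / 2:
            return times[i]
    return None

times = list(range(0, 91))                       # 1 s bins over an assumed 90 s trial
peth = average_peth(500, times, random.Random(0))
peak_time = times[peth.index(max(peth))]
fall_time = half_peak_fall_time(peth, times)
```

No single trial in this sketch has a peak, yet the average rises and falls smoothly because fewer trials overlap at early and late times. The fall time tracks the typical stop time, which is why it is informative even when the single-trial profile is flat.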

Dorsal secondary auditory cortex corticostriatal projection neurons subserve the audiomotor feedback for action timing.

The results from our computational model imply that auditory feedback is derived specifically from self-generated actions, and optogenetic rescue effects can only be inferred when action and sensory feedback are perceptually linked. To validate this hypothesis, we performed additional experiments to test the rescue effects of optogenetic activation of task-related AUDd populations with stimulation either delivered contingent on each lever press or randomly delivered (Fig. 8a and Methods). Here we used another activity-dependent labeling technique known as enhanced synaptic activity–responsive element (ESARE), a system using an engineered synthetic promoter to drive activity-dependent gene expression21. Using ESARE, we were able to selectively label the active population in the AUDd during SFI task performance (Fig. 8b). Optogenetic stimulation contingent on each lever press rescued timing performance in mice with their ears sealed (Fig. 8c–e), further confirming the results of our FosTRAP experiments (Fig. 8d–h). However, when optogenetic stimulation was delivered randomly across probe trials, with the number of stimulation events matching the total number of presses exhibited in a probe trial under ear sealing (Methods), animals’ timing performance was not improved and remained impaired. Performance under this random stimulation protocol was similar to control trials with ear sealing (Fig. 8c,d), both in terms of press rate at 30 s and half peak fall time (Fig. 8f). These results thus suggest that the audiomotor feedback mediated by the AUDd has to be contingent on the animal’s self-generated actions to contribute to action timing.
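The contrast between the two stimulation protocols can be sketched as follows. This is an illustrative reconstruction, not the authors' procedure: the function names, the uniform distribution for random events, the 90 s trial length and the 50 ms contingency window are all assumptions. Only the matching of event counts between protocols comes from the text.

```python
import random

def press_triggered(press_times):
    """Contingent protocol: one stimulation event at the time of each press."""
    return list(press_times)

def random_matched(n_events, trial_s, rng):
    """Non-contingent control: n_events times drawn uniformly over the trial."""
    return sorted(rng.uniform(0.0, trial_s) for _ in range(n_events))

def fraction_contingent(stim_times, press_times, window_s=0.05):
    """Fraction of stimulation events falling within window_s after some press."""
    if not stim_times:
        return 0.0
    hits = sum(any(0.0 <= s - p <= window_s for p in press_times)
               for s in stim_times)
    return hits / len(stim_times)

presses = [5.0, 12.0, 20.5, 31.0]                      # made-up press times (s)
contingent = press_triggered(presses)
control = random_matched(len(presses), 90.0, random.Random(0))
```

Both protocols deliver the same amount of stimulation, so any difference in behavior can be attributed to the temporal link between press and stimulation rather than to the total number of light pulses.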

Fig. 8 |. Dorsal secondary auditory cortex provides sensorimotor feedback during action timing via layer V active populations.

a, Using ESARE activity-dependent labeling in combination with Cre-dependent ChR2 expression in C57BL6 animals, active populations were labeled in the AUDd resulting from SFI performance with ears open. Animals then had their ears sealed and stimulation trials (press-triggered stimulation or random stimulation) were randomly interleaved with no stimulation trials. b, Exemplar ChR2 expression patterns in the AUDd. c, Behavior of an example mouse during SFI performance under auditory deprivation without stimulation (red line) versus with stimulation (press triggered: blue line; random: gray line). d, The optogenetic stimulation effects (n = 6) on response rate (top) and percentage maximum response rate (bottom) comparing press-triggered (left) versus random (right) stimulation under auditory deprivation. e, The presses per minute at 30 s (left; two-tailed, paired t-test, t = 3.080, P = 0.0275), peak time (middle) and half peak fall time (right; two-tailed, paired t-test, t = 4.074, P = 0.0096) comparing no stimulation versus press-triggered stimulation (n = 6). f, Same as e but for random stimulation (n = 6). g, Strategy for labeling and manipulating layer V, striatal-projecting active populations in the AUDd using rAAV2-ESARE-ERCreER-PEST. h, ESARE expression patterns. Terminals in the dorsomedial striatum (left) from ESARE populations in layer V of the AUDd (right) expressing Cre-dependent ChR2-EYFP. i, Response rasters for stimulation and no-stimulation trials (top), response-rate PETHs (middle) and percentage maximum response-rate PETHs (bottom) of an exemplar animal with stimulation of layer V, ESARE, striatal-projecting populations in AUDd under auditory deprivation. j, Stimulation and no-stimulation average PETHs (n = 7) for response rate (top) and percentage maximum response rate (bottom) for press-triggered optical SFI omission trials under auditory deprivation using retrograde, ESARE, activity-dependent labeling.
k, Press-triggered optical SFI response rates at 30 s (left; two-tailed, paired t-test, t = 2.545, P = 0.0438), peak time (middle) and half peak fall time (right; two-tailed, paired t-test, t = 2.890, P = 0.0277) for optical stimulation and no-stimulation omission trials under auditory deprivation using retrograde ESARE, activity-dependent labeling (n = 7). For press-triggered and random optical stimulation SFI experiments, data were analyzed using paired t-tests, and bars are means. ‘D’ and ‘L’ denote dorsal and lateral, respectively. Shading for average PETHs and error bars denote the s.e.m. **P < 0.01; *P < 0.05; NS, not significant.

Our computational model implies that the clock mechanism sends a timing decision to a motor center for action stopping (Fig. 7a). The cortical projections from the auditory cortex to the dorsal striatum are known to mediate auditory-based action selection22. Furthermore, our cFos experiments showed selective layer V activation of the AUDd associated with lever pressing (Fig. 3h). We thus decided to test whether the striatum-projecting AUDd neurons activated in the SFI task are sufficient to provide the audiomotor feedback for action timing. The ESARE system is ideally suited for testing this because, unlike FosTRAP, it can be delivered virally, thus allowing use of the retrograde adeno-associated virus (rAAV2) serotype to label active populations projecting to a particular location. By injecting ESARE21 packaged in rAAV223 into the dorsal striatum, and a Cre-dependent AAV expressing ChR2 into the AUDd, we were able to selectively label the striatum-projecting AUDd neurons activated in the SFI task with ChR2 (Fig. 8g,h). It was found that action-contingent optogenetic stimulation of these striatum-projecting AUDd active populations (Fig. 8h) was sufficient to rescue timing performance under auditory deprivation (Fig. 8i–k). Together, our optogenetic experiments utilizing both the FosTRAP and ESARE activity-dependent labeling systems suggest that AUDd neurons are specifically activated during the SFI task, and causally involved in responding, emphasizing their role in an action feedback mechanism. Notably, action-contingent optogenetic stimulation of task-related AUDd neurons projecting to the striatum is sufficient to subserve the auditory, sensorimotor feedback and support the execution of learned action timing.

Discussion

The self-initiated/terminated design of SFI requires animals to maintain an internal representation of time to control their actions. While our task design does not require continuous responding after the initiation press, animals develop a step-function-like response profile in anticipation of the learned reward time during probe trials. This anticipatory responding is totally ‘unnecessary’ for getting a reward per se. However, previous behavioral studies have reported on the performance of repetitive, ritualistic action patterns in timing tasks24. Our results suggest that the seemingly ‘collateral’ lever presses in our SFI timing task, and their sensory feedback consequences, are actually a critical part of the timing estimation mechanism. Notably, unlike traditional fixed-interval schedules, in which clock initiation is signaled by an external stimulus19,20,25, SFI requires animals to self-initiate the clock. Neither peak time nor half peak fall time was related to the latency to initiate post-lever extension (Extended Data Fig. 1g–i), confirming that animals indeed track time from the initiation press. As there are no peaks in the step-function-like response during the anticipation period (Extended Data Fig. 1j and Fig. 7e), we have proposed half peak fall time, rather than the peak time, as a more effective and accurate measurement of action timing. The motivation to use this metric is further confirmed by experiments in which animals show an inability to stop pressing when trained on a version of SFI where 100% of the trials are rewarded (Extended Data Fig. 1e,f). When these animals are presented with probe trials following training, they show an inability to stop pressing after 30 s, demonstrating that the key performance feature learned in the SFI task, and presumably other fixed-interval peak timing tasks, is when to terminate responding.
While there is no gradual decay in responding with termination at an individual-trial level in the regular SFI task, the PETH half peak fall time metric serves good utility in capturing overall trial-by-trial stop times (Extended Data Fig. 1k). Furthermore, under learning and manipulation conditions (for example, ear sealing and muscimol infusion), trial-by-trial responding becomes more evenly distributed and less bout-like, making the PETH half peak fall time more suitable for capturing overall changes in responding, because precise determination of trial-by-trial stop times becomes less accurate.

Our sensory deprivation experiments demonstrate that animals utilize the sound derived from their own lever-pressing actions to shape response temporal dynamics, revealing a previously unappreciated role for auditory feedback in action timing. cFos screening (Fig. 3) and electrophysiological recordings (Fig. 4) identified the secondary auditory cortical field, AUDd, as a key brain structure involved in this sensorimotor feedback, which was further confirmed by pharmacological and optogenetic inactivation of the region (Fig. 5). Interestingly, we found AUDd cFos activation related to lever pressing was ipsilateral to the open ear (Fig. 3f). The auditory lemniscal pathway largely crosses over from the cochlear nucleus to the contralateral inferior colliculus on its way to the auditory cortex via the medial geniculate of the thalamus26, but significant ipsilateral projections do exist through the superior olivary complex27,28. In addition, there are less dense projections that travel ipsilaterally from the cochlear nucleus directly to the medial geniculate, bypassing the inferior colliculus29,30. So-called non-lemniscal projections have been shown to preferably innervate ipsilateral secondary cortical fields after passing through the medial geniculate31–33, providing an anatomical basis for isolated secondary cortical activation, ipsilateral to the open ear. Interestingly, while our preliminary screen did show the inferior colliculus had higher cFos activity in the hemisphere contralateral to the open ear (Extended Data Fig. 4), consistent with canonical auditory pathway anatomy, we also saw a relatively large amount of activity in the colliculus ipsilateral to the open ear, confirming the possibility of ipsilaterally ascending auditory input. Another interesting possibility is that auditory deprivation effects might be attributed to vestibular deficits.
However, we do know that our optogenetic rescue experiments (Figs. 6 and 8) completely bypass subcortical structures, implying a limited, if any, role for the vestibular system in the performance of the task. Closed-loop, optogenetic stimulation of the task-related neuronal population in the AUDd was sufficient to rescue timing behavior under auditory deprivation, suggesting that AUDd activity can mediate the sensorimotor feedback control in action timing (Fig. 6). Nonspecific stimulation of excitatory populations in the AUDd (Extended Data Fig. 8) or inhibition (Fig. 5) in a press-dependent manner when animals’ ears were open caused deficits in timing performance similar to those seen in ear sealing, further implicating the activation of a specific population in the AUDd in SFI performance. Importantly, rescue effects with the TRAPed population appeared with optical stimulation in a trial-by-trial manner, with randomly interleaved non-stimulation trials showing no rescue effects, underscoring the online nature of this sensorimotor feedback mechanism. Furthermore, additional activity-dependent labeling experiments confirmed that auditory feedback must be temporally contingent with action to be effective (Fig. 8a–f), emphasizing the idea that action and its sensory consequences have to be perceptually linked to assist in action timing. This result implies that a sensorimotor feedback mechanism underlies the performance improvements observed with optogenetic stimulation, and that the population found to be activated by responding is, at least in part, mediating this loop. Our cFos screen shows a modulation in labeling that appears to be specific to the AUDd and related to pressing, while our optogenetic results directly implicate the AUDd in this mechanism; however, we cannot conclusively rule out the involvement of the AUDp or the AUDv.

Previous studies have demonstrated how task engagement34 and self-generated action15,16 lead to the refinement of neuronal subsets in the auditory cortex. The idea is that the refinement of these populations allows for the encoding of relevant auditory features. In this context, we would suggest that responding drives activation of a specific group of neurons in the auditory cortex. Indeed, our electrophysiological data show that neurons in the auditory cortex are activated by the sound of pressing the lever (Fig. 4), which is dampened through ear sealing. Furthermore, when animals’ ears are sealed, and behavior is disrupted, only stimulating the task-relevant population (Cre-ON) under auditory deprivation rescues the behavior (Fig. 6), while stimulating non-task-relevant populations (Cre-OFF) has no impact on performance. Thus, our interpretation of these results aligns with the idea that a particular activity pattern in the cortex related to active task engagement is necessary for sensorimotor processing as it relates to the sound of lever pressing and timing.

State-dependent network models of sensory timing have described how simple, sub-second-tuned cortical motifs can act in concert to allow for the duration perception of stimuli of hundreds of milliseconds35,36. In addition, models for motor timing have described how spatiotemporal firing patterns in recurrent networks can produce timed actions by behaving in a largely self-sustaining manner37–40. Previous studies have found that the presentation of a brief stimulus preceding a reward through training can lead to the embodiment of modal cortical activity occurring at the reward target interval time, which may drive timed action in downstream motor structures41,42. Our results point toward a behavioral and neural mechanism whereby the brain utilizes audiomotor feedback from an animal’s own actions to assist in interval timing. While visual stimuli are processed largely according to their spatial dimensions, auditory stimuli are defined predominantly in accordance with their temporal features43–45. Indeed, physiological studies on the brain have suggested that the auditory system is privileged in its ability to process temporal information relevant to both sensory stimuli15,16,46–49 and movement control50–52. Secondary sensory cortices have been suggested to play a specific role in supporting sensory memory. Experiments inactivating secondary cortical fields (auditory and visual) have demonstrated how these regions specifically play a role in the retrieval of sensory information that has taken on some unique behavioral significance with experience53. Furthermore, secondary auditory cortical regions have been shown to have dense connectivity with the amygdala, in contrast to the primary cortex, further supporting the idea that this region is involved in associating stimuli with specific behavioral context54.
One possibility is that as animals learn the SFI task, the sound of the lever changes from initially being a meaningless collateral sensory aspect of performance to a unique sensorimotor cue. Furthermore, unlike the AUDp, the secondary auditory cortex exhibits less tonotopic organization and greater spectral integration properties to support the processing of complex sounds, rather than simple sound processing55. This is further supported by the fact that while the secondary cortex has connections with the AUDp, it also receives direct projections from the medial geniculate56. In fact, this auditory pathway has been implicated in identifying auditory changes57,58, and in the processing of unique temporal patterns59,60.

Our results point to an interaction between responding and when animals terminate action. Thus, based on the evidence presented here, a likely possibility is that an online, audiomotor feedback signal is utilized by the brain to register the passage of time, potentially serving as a signal to support a pacemaker–accumulator-like mechanism for interval timing19. Indeed, our computational model was able to replicate action timing by using a clear sensorimotor relationship in which the sound from responding impacts subsequent response probability, resulting in a temporally defined action sequence (Fig. 7d–g). In the model, actions in the sequence are integrated by an accumulator and used to monitor the passage of time, while sensory feedback impacts only the instantaneous probability of another lever press occurring. Ultimately, the comparison of the accumulation with reference memory is responsible for when animals actually stop pressing. This is consistent with our closed-loop, press-dependent optogenetic inhibition and activity-dependent stimulation experiments, which specifically implicate a role for the AUDd in a feedback mechanism, rather than accumulation. It is currently unclear which brain region implements the action integration and serves as the accumulator. Nevertheless, the results from our optogenetic experiments imply that the accumulator likely integrates action and sensation, but not sensation alone, as randomly timed stimulations (equal in number to the total presses typically performed under auditory deprivation) do not rescue timing (Fig. 8a–f). Indeed, our electrophysiological results failed to show any action-related accumulation activity (Fig. 4), suggesting that the accumulator likely lies somewhere beyond the auditory cortex. Further experiments are needed to identify the brain circuitry of the action integrator in our model, and unravel the neural implementation of comparison with the reference memory for timing actions.
We believe this model, which includes canonical components of the pacemaker–accumulator model, while uniquely ascribing actions as pulses and sensory feedback in sustaining pulse emission, will substantially help in future studies of timing by providing a basis for the interpretation of specific pharmacological and optogenetic manipulation effects on interval timing. Because many timing studies have focused on the striatum and dopamine25,61,62, which have powerful control over operant actions17,18, this might be especially important in deciphering the effects of manipulations as being related to clock computation (for example, integrator and memory) or pacemaker (motor behavior).

It is important to note that ear sealing had little effect on when animals began responding following initiation press, as measured by the half peak rise time (Extended Data Fig. 1n). This result was also recapitulated in our model (Fig. 7f). Unlike the termination in pressing seen following 30 s in probe trials, which we quantified with the half peak fall time, the onset of pressing bouts following initiation press is not a performance feature under operant conditioning (that is, there is no design feature in the task between 0 and 30 s that instructs the animal when to begin a pressing bout). Our data indicate that the performance of this aspect of SFI is not related to the audiomotor mechanism outlined here. However, it is possible that a mutually exclusive mechanism might control this dimension of SFI performance. It is also possible, however, that the onset of pressing could be controlled by a sensorimotor mechanism similar to the one described here that utilizes a different modality tuned to whatever collateral behavior24 the animal is performing between the initiation press and the onset of a pressing bout.

Layer V cortical projections from the auditory cortex to the striatum are known to mediate auditory-based action selection22, and striatal neurons have been shown to modulate response dynamics in various timing tasks63,64. Our preliminary cFos screen demonstrated activation in the striatum of the hemisphere ipsilateral to the open ear (Extended Data Fig. 4d), and unilateral ear sealing had a specific effect on layer V activation in the AUDd (Fig. 3h). Interestingly, unilateral ear sealing (Fig. 3d) had less impact on the response rate and half peak fall time than bilateral ear sealing. At a behavioral level, this might be expected based on our model (Fig. 7). Presumably, there would be increasingly less auditory feedback with each sealed ear to drive the increase in press rate, and concurrent decrease in half peak fall time. Furthermore, at the neuronal circuit level, our results from the ESARE experiments (Fig. 8g–k) demonstrate that layer V, striatal-projecting populations in the AUDd appear to play a role in the transformation of sensory feedback into time-dependent adjustments in motor responses. Within this context, less excitatory drive from the cortex to striatum might be responsible for deficiencies in response rate and temporal dynamics observed in unilateral versus bilateral ear sealing. It is well known that the striatum is a motor center for controlling the start and stop of action sequences2, and it is possible that different striatal pathways could utilize auditory feedback information for facilitating actions or switching motor programs17.

In the context of the SFI timing task, our results demonstrate that the auditory system is privileged in providing sensorimotor feedback, but do not exclude the possibility of other sensory modalities serving a similar role when auditory information is completely absent from training. In fact, training animals on the SFI task under auditory deprivation revealed that timing performance was only impaired early in training (Extended Data Fig. 2). This result implies that while the auditory system might be prioritized in mediating sensorimotor feedback, other sensory modalities could gradually replace its role and compensate for performance deficits under auditory deprivation. Indeed, in post-control days (Fig. 2g) following ear sealing, animals show some deficits in timing performance, which might imply some degree of relearning under auditory deprivation (that is, switching to using a different modality). For this reason, all optogenetic manipulation trials were interleaved with non-stimulation trials for a given session. Experiments have been performed testing audition, vision and touch on temporal perception and reproduction in humans. Results from these psychophysical experiments suggest that audition is the most temporally tuned modality followed by touch, and finally vision43. Thus, under ideal conditions of complete auditory deprivation across SFI training, one might expect that mice would gradually shift to rely on a somatosensory/proprioceptive feedback mechanism to perform the task. Nevertheless, our results have unveiled a previously underappreciated role of sensorimotor feedback in action timing. Future work is needed to deconstruct how this sensorimotor feedback is integrated and utilized by downstream neural networks to control and time actions19,65.

Methods

Experimental animals.

All procedures were approved by the Institutional Animal Care and Use Committee at the Salk Institute for Biological Studies, and were conducted in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals. Adult C57BL/6, FosTRAP7 (Jackson laboratory, stock no. 021882), Vgat-Cre; Ai32 (Jackson laboratory, stock nos. 028862 and 024109, respectively) and CaMKII-Cre (Jackson laboratory, stock no. 005359) male and female mice were group housed on a reverse light cycle with ad libitum access to food and water. Operant training began at 8 weeks of age. Animals were food deprived for 48 h to reach roughly 85% of their initial body weight and then underwent 3 d of a continuous reinforcement schedule before starting the SFI training schedule18 (see below).

Behavioral training and analysis.

Animals were trained in a modular operant chamber (Med Associates) fitted with ultrasensitive, retractable levers to the left of the reward magazine. The reward magazine was fitted with an infrared beam break sensor to monitor head entries. Following food deprivation, animals underwent 3 d on a continuous reinforcement training schedule (15 reinforcements per day) in which responding on the lever to the left of the reward magazine yielded a 20-mg nutritionally balanced pellet (BioServ). Following continuous reinforcement training, animals were trained for 21 d on the SFI schedule (see Fig. 1a and main text for specifics of schedule design), again utilizing the lever to the left of the reward magazine and 20-mg pellets for reinforcement. In addition to 50 reinforcement pellets daily from training, animals received 1.5–2.5 g of rodent chow per day to prevent body weight from dropping below 85% of initial body weight. The exact amount of chow provided each day was adjusted to keep deprivation weight stable. In the lever-omitted context, animals also received 50 pellets; however, trial progression through a session occurred automatically. To exclude the possibility that auditory deprivation effects were due to the animal listening to a reward-related auditory cue, we trained animals on the same SFI task using a liquid milk reward instead of a food pellet. For the condensed milk reward, 10 μl of 50% condensed milk (Carnation) diluted in water was delivered to a small bowl in the reward magazine. Unlike the pellet reward, the milk reward was delivered via a pump placed outside of the sound-attenuated chamber that directs liquid to a small, indented receptacle in the reward magazine. As such, animals tend to constantly engage in licking behavior throughout the trials, going back and forth between pressing and licking. This licking behavior seems to compete with lever pressing and influence the response profile of the animal. 
Unlike with the pellet reward, animals form a peak around 10 s, with an offset in responding around 30 s (that is, when the reward would appear) in probe trials. While the peak time and response dynamics are quantitatively different from the pellet task, the effects of auditory deprivation are the same. Auditory deprivation effects on response rate and half peak fall time in these milk-trained animals were consistent with those observed in the pellet-trained animals (Extended Data Fig. 3a–e), indicating that even with auditory cues of reward delivery completely removed, the effects of auditory deprivation on operant responding remain the same. Furthermore, we exposed another group of SFI-trained animals to a series of continuous probe trials to completely remove any potential sensory interference from reward delivery. The auditory deprivation effects on response rate, peak time and half peak fall time were also found to be evident under this extinction situation (Extended Data Fig. 3f–h). Reinforcement training schedules were written using the MED-PC programming language and behavioral data were analyzed offline using custom scripts written in MATLAB. The half peak fall time was calculated using the intersection point of the response-rate PETH with the line y = 50% maximum response rate. The response-rate PETH was constructed using a moving average function with a smoothing window of 15 s, the minimum duration found to reduce noise enough to yield a single peak and a single half peak fall time value. Peak values were calculated using the midpoint of the two intersection points of the response-rate PETH with the line y = 85% maximum response rate. For acute sensory deprivation and pharmacological/optogenetic experiments, a performance criterion was used to exclude animals that showed poor peak dynamics on the day before experimental manipulation. 
Animals’ half peak fall times reached an asymptote with training, indicating stable peak formation behavior (day 21 normal mean = 46 s, s.d. = 4 s). To ensure that animals properly learned peak timing behavior before being tested on sensory deprivation and pharmacological/optogenetic experiments, individuals that showed half peak fall times greater than 50 s (mean + s.d.) were excluded from manipulation results.
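
The peak measures described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code: the function names, the centered moving average, the 1-s bin width (so that a 15-sample window spans 15 s) and the use of linear interpolation to locate crossings are all assumptions made for illustration.

```python
def moving_average(values, window):
    """Centered moving average; near the edges only the available samples are used."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def _crossing(times, rates, level, i, j):
    """Linearly interpolate the time at which the curve equals `level` between samples i and j."""
    if rates[j] == rates[i]:
        return times[i]
    return times[i] + (level - rates[i]) * (times[j] - times[i]) / (rates[j] - rates[i])


def peak_metrics(times, rates, window=15):
    """Return half peak fall time (50% of max) and peak time (midpoint of the 85% crossings).

    `window` is in samples; with the assumed 1-s bins, 15 samples = 15 s.
    """
    smooth = moving_average(rates, window)
    peak_idx = max(range(len(smooth)), key=smooth.__getitem__)
    peak_rate = smooth[peak_idx]

    def fall(level):
        # First downward crossing of `level` after the maximum.
        for i in range(peak_idx, len(smooth) - 1):
            if smooth[i] >= level > smooth[i + 1]:
                return _crossing(times, smooth, level, i, i + 1)
        return None

    def rise(level):
        # Last upward crossing of `level` before the maximum.
        for i in range(peak_idx, 0, -1):
            if smooth[i - 1] < level <= smooth[i]:
                return _crossing(times, smooth, level, i - 1, i)
        return None

    r85, f85 = rise(0.85 * peak_rate), fall(0.85 * peak_rate)
    peak_time = None if r85 is None or f85 is None else (r85 + f85) / 2.0
    return {"half_peak_fall_time": fall(0.5 * peak_rate), "peak_time": peak_time}


def passes_criterion(half_peak_fall_time, cutoff_s=50.0):
    """Exclusion rule from the text: fall times greater than 50 s (mean + s.d.) are excluded."""
    return half_peak_fall_time is not None and half_peak_fall_time <= cutoff_s
```

Under these assumptions, an animal whose probe-trial PETH yields a half peak fall time of 46 s would pass the criterion, whereas one at 55 s would be excluded.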

Sensory deprivation experiments.

The week following training on the SFI schedule, animals underwent 3-d acute sensory manipulations (cutaneous/proprioceptive, visual and auditory) performing the same task. Cutaneous/proprioceptive sensory deprivation was achieved via a forelimb nerve block12, in which 50 μl of a lidocaine cocktail (0.5% lidocaine, 10 mM CNQX in 0.9% sterile saline) was injected on the internal side of the pectoral girdle of the left and right forelimb. The injected volume was adjusted such that animals could still press the lever, but exhibited no withdrawal reflex to pinching the forepaw toes. Animals received 0.9% saline injections in the same locations on the pre-control day and no treatment on the post-control day. For visual deprivation, the house light was turned off on the manipulation day and left on in the pre-control and post-control days. Finally, for auditory deprivation, a small amount of cotton was placed inside the ear cavity and the ear canal reversibly sealed using a small amount of Vetbond. For pre-control and post-control days, animals had their ears open. Auditory and cutaneous/proprioceptive sensory manipulations were performed under 3–4% isoflurane anesthesia 15–20 min before beginning behavior. For auditory deprivation experiments, Vetbond was removed using ethanol, also under isoflurane, and post-procedural monitoring of ears was performed for treatment of inflammation by veterinarian staff. For training under auditory deprivation, a hypoallergenic latex gel (Dermaflage) was placed inside the ear canal and allowed to set while animals were under isoflurane. This was done to prevent irritation from repetitively occluding the ear canal across training. The sensory experiments were performed to specifically test the effect of omitting specific sensory modalities on the performance of SFI. However, it was found that under auditory deprivation, reward retrieval time also changed. 
Delays in reward retrieval time can alter animals’ timing performance and consequently prevent an accurate interpretation of the manipulation results in isolation. A criterion of the minimum change in reward retrieval time allowed across the pre-control day and manipulation days was imposed to prevent animals from learning a new reward retrieval time on the manipulation day. Between the pre-control and post-control days for the ear-sealing experiment, animals show a skewed distribution of reward retrieval time changes (pre-control versus post-control, Δ reward retrieval time: median = 1 s, s.d. = 2 s). To reduce the impact of any changes in pellet retrieval time, animals that exhibited changes in reward retrieval time greater than 5 s (median + 2 × s.d.) compared to the pre-control day were omitted from the experimental groups.

Validation of sensory deprivations.

The von Frey assay was utilized to validate the effects of the lidocaine/CNQX nerve block. Mice were put on an elevated, wire mesh grid and the forepaw was stimulated with monofilaments of increasing force (0.008–1.4 g) until withdrawal was elicited. To ensure accuracy in the threshold determination, Dixon’s up–down method was used66. To validate the effects of auditory deprivation, ear-sealed animals were restrained within a small acrylic tube that was attached to an accelerometer, and placed within a sound-attenuating cubicle (Med Associates). Startle amplitude in response to a 40-ms, 120-dB auditory stimulus was recorded14. To determine the effectiveness of decreasing ambient light levels by turning the house light off, an ultrasensitive photometer (PR-810L Ultra-Sensitive Photometer, Photo Research) was placed inside the operant chamber and luminance density was recorded.

Activity-dependent labeling.

To label active populations with CreERT2, FosTRAP animals7 were given 4-OHT, which is a metabolized version of tamoxifen and as such allows for a narrower temporal window over which activity-dependent labeling can take place. It has been previously demonstrated in the visual cortex that maximum labeling occurs when sensory input is provided 1 h before injection of 4-OHT, with minimal to no labeling occurring around 5–6 h before and after injection7. For the day of labeling, animals with their ears sealed bilaterally or open were placed in the operant chamber 1 h before SFI was initialized. Furthermore, they were run on an extended SFI schedule in which they received 100 reward pellets. As in regular SFI training, rewarded trials occurred 70% of the time and probe trials occurred 30% of the time. Animals received a dose of 50 mg per kg body weight of 4-OHT in Chen oil after receiving roughly 50 rewards. Following completion of 100 rewards, animals were left in the chamber for an additional hour before being returned to their home cage. To acclimate animals to the disruption of receiving an intraperitoneal injection during SFI performance, animals received saline injections upon completion of the first 20 rewards in the week leading up to induction day. FosTRAP activity-dependent labeling was verified by confirming the AUDd laminar-specific cFosCreERT2 and cFos expression profiles in FosTRAP animals that performed SFI (Extended Data Fig. 5c–g), which effectively recapitulated the results observed in wild-type animals (Fig. 3f,h). For the striatal-projecting experiments using rAAV2-ESARE-ERCreER-PEST and AAV9-ESARE-ERCreER-PEST21, C57BL/6J mice had stereotaxic injections and implantations performed before training and were induced in the same way as FosTRAP animals on day 21 of training.

Stereotaxic surgery.

For AAV injections, fiber-optic implantation and cannulation, mice were administered a xylazine and ketamine cocktail (10 mg and 100 mg per kg body weight, respectively) intraperitoneally and placed inside a stereotaxic surgical frame (Kopf). Craniotomies were made using a drill (Drummond). For optogenetic experiments targeting the auditory cortex, 1 μl concentrated AAV (University of Pennsylvania Vector Core: AAV9-DIO-ChR2-EYFP; Salk Institute Vector Core: AAV9-DO-ChR2-mCherry) was bilaterally injected using a manual syringe (Hamilton) into the following coordinates: −2.3 AP, ±4.0 ML and −0.66 DV67 (Allen Brain Institute Mouse Atlas; https://mouse.brain-map.org/). Following viral injections, custom-made optrodes consisting of a ceramic ferrule and fiber-optic cable (200 μm, Thor Labs) were lowered into the brain 0.01 mm above the auditory cortex injection site18. The optrodes were affixed to the skull using OptiBond (Kerr, 35129) and light-activated dental cement (Ivoclar Vivadent, 595953WW). For the striatal-projecting experiments, C57BL/6J animals were injected with 0.3 μl of rAAV2-ESARE-ERCreER-PEST (Salk Institute Vector Core) into the dorsal striatum at the following coordinates: +0.34 AP, ±2.25 ML and −2.30 DV (Allen Brain Institute Mouse Atlas). For cannula implantation (see below), mice had guide cannulas (Plastics One, C313GS-4/SPC) affixed to the skull targeting the same location of the auditory cortex as with optrode implantation. For muscimol experiments targeting the VISp, the following coordinates were used: −3.20 AP, ±2.65 ML and −0.30 DV67 (Allen Brain Institute Mouse Atlas). Following all surgical procedures, mice received analgesia consisting of buprenorphine (1 mg per kg body weight) and were allowed to recover for 1 week before beginning training/retraining on SFI.

Three-day muscimol experiments.

Following 21 d of training on SFI and then recovery from cannula implantation surgery, C57BL/6J animals were retrained on SFI before undergoing a 3-d pharmacological manipulation experiment. As with the sensory experiments, the manipulation day in which muscimol was infused was flanked by two control days in which no pharmacological manipulation occurred. Dummy cannulas (Plastics One, C313DCS-4/SPC) cut to be flush with the end of the guide cannula remained inserted at all times, apart from infusions, to prevent debris from entering the brain. For the pre-control day, animals received a bilateral infusion of 0.5 μl saline delivered at a flow rate of 0.5 μl per min, 30 min before starting SFI. For the post-control day, no infusions were performed. On the manipulation day, animals received 0.5 μl of 1.0 μg μl⁻¹ muscimol in saline delivered bilaterally at a flow rate of 0.5 μl per min, 30 min before starting SFI17. This volume and concentration achieve delivery of 0.5 μg of muscimol, which has been demonstrated to result in 90% tissue inactivation 1–3 mm³ from the site of infusion68. Muscimol and saline were infused via an infusion cannula (Plastics One, C313IS-4/SPC) that was cut to protrude 0.5 mm beyond the base of the guide cannula. For all infusions, animals were physically restrained and infusion cannulas were left in place for 5 min following the completion of infusion before being removed and the dummy cannulas returned. Infusion cannulas were attached to 28-gauge polyethylene tubing (Plastics One, PE50). The tubing was attached to 1-ml syringes (BASi, MDN-0100) being depressed by a microinfusion pump system to control the flow rate (BASi, MD-1000–MD-1002).

Behavioral optogenetic experiments.

For FosTRAP and ESARE experiments, following surgery and 21 d of SFI training, implanted animals were retrained for 1 week on SFI before undergoing a single optical stimulation experiment occurring 10–14 d after 4-OHT induction. For retraining and stimulation, animals were tethered via fiber-optic cables connecting optrode ferrules to a commutator. On the day of optogenetic stimulation, animals had their ears sealed and received 100 ms of constant blue-light stimulation (Laserglow, 473 nm, 5 mW) with each lever press18. For CaMKII AUDd stimulation experiments, animals also received 100 ms of constant blue light per press (Laserglow, 473 nm, 5 mW) but with their ears open. For Vgat-Ai32 experiments, implanted animals received 500 ms of constant blue light (Laserglow, 473 nm, 1–2 mW) per press, also with ears open. In all cases, optical stimulation occurred whenever the lever was pressed and was not restricted to any time window in the probe or rewarded trials, thus preventing the stimulation itself from being temporally discriminative. The MED-PC software randomly selected stimulation trials for 50% of the rewarded trials and 50% of the probe trials18. For a normal SFI training session, animals perform 50 rewarded trials, with probe trials occurring 30% of the time and rewarded trials occurring 70% of the time. To ensure there were enough stimulated and non-stimulated probe trials, animals were allowed to perform 100 rewarded trials on the day of optical stimulation. For ESARE experiments testing whether optical stimulation must be press dependent to produce rescue effects, press-triggered stimulation trials and random stimulation trials each occurred at 25% chance, interleaved with non-stimulated trials. During a probe trial under auditory deprivation, animals on average press 54 times. 
For random stimulation trials, the program was designed to stimulate at a probability of 5.4% chance every 100 ms, so that a 100-s probe trial would have ~54 stimulation events.
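
The arithmetic behind the random stimulation schedule can be checked with a short Python sketch. This is an illustration only, not the MED-PC program used in the study; the function names and the use of Python's `random` module are assumptions.

```python
import random


def expected_events(trial_s=100.0, step_s=0.1, p_step=0.054):
    """Expected number of stimulation events: number of time steps times per-step probability."""
    return round(trial_s / step_s) * p_step


def random_stim_onsets(trial_s=100.0, step_s=0.1, p_step=0.054, rng=None):
    """Draw one random-stimulation trial: each 100-ms step triggers a pulse with probability p_step."""
    rng = rng or random.Random()
    n_steps = round(trial_s / step_s)
    return [i * step_s for i in range(n_steps) if rng.random() < p_step]


if __name__ == "__main__":
    # 1,000 steps of 100 ms at 5.4% per step gives about 54 events on average.
    print(expected_events())
    print(len(random_stim_onsets(rng=random.Random(1))))
```

Because each decision step is as long as the 100-ms light pulse, successive pulses in this sketch cannot overlap, and the event count in any single trial varies around the mean of 54.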

Head-fixed electrophysiological experiments.

At least 1 d before recording, animals were implanted with a headframe while under anesthesia with isoflurane. A circle of skin, approximately 10 mm in diameter, was removed to make room for the headframe implant. The skull was cleaned and the headframe implant was secured with Vetbond (Santa Cruz Biotechnology, sc-361931) and Metabond (Parkell, S380). The headframe was placed as centrally as possible with the right-hemisphere AUDd still accessible within the inner well of the headframe. For all animals, the inner well of the headframe implant was filled with Kwik-Cast sealant (WPI KWIK-CAST) until recording. Electrophysiological recordings were performed in the right hemisphere of the AUDd of two open/Cre-ON animals (two penetrations) and two open/Cre-OFF animals (four penetrations in total). Under 2% isoflurane anesthesia, a small craniotomy was made over the AUDd and filled with artificial cerebrospinal fluid. Subsequently, the animals were removed from anesthesia and head-fixed on a custom-made wheel. Under this setup, animals were free to run at will. A 64-channel silicon microprobe69, with channels spanning 1.05 mm, was lowered into the craniotomy. An optical fiber (1 mm diameter, 0.39 NA, Thor Labs) was positioned as close as possible to the craniotomy without touching the probe for optical stimulation. LED stimulation via a blue LED driver (T-Cube, 470 nm, Thor Labs) was controlled by Arduino. Trials consisted of a 500-ms pre-stimulus period (used for determining the baseline firing rate) and 100 ms of constant, 8–10-mW LED stimulation with 2–4 s interstimulus intervals. Spikes were extracted and sorted into clusters using Kilosort70 followed by manual cleaning and verification (phy template-GUI; https://github.com/kwikteam/phy/blob/master/README.md/). All single units with <1% refractory period violations and waveform amplitudes at least two standard deviations above the noise were included for subsequent analyses18.

Freely moving electrophysiological experiments.

For electrophysiological recording in freely moving mice, animals were first anesthetized using isoflurane (4% induction; 1–2% sustained) and placed in a stereotaxic frame. Each mouse was chronically implanted with an electrode array, which consists of 16 tungsten contacts (2 × 8), each 35 μm in diameter (Innovative Neurophysiology). Electrodes were spaced 150 μm apart in the same row, and 200 μm apart between two rows, with a length of 5 mm for each electrode2,17,18. The array targeting the AUDd (−2.3 AP, ±3.9 ML and −0.75 DV) was incrementally lowered into the deep layers of the AUDd. Silver grounding wire was attached to skull screws.

Neural activity was recorded using the MAP system (Plexon). The spike activities were initially online sorted using a sorting algorithm (Plexon). Only spikes with clearly identified waveforms and a relatively high signal-to-noise ratio were used for further analysis. After the recording session, the spike activities were further sorted to isolate single units by an offline sorting software (Plexon). Single units displayed a clear refractory period in the inter-spike interval histogram, with no spikes during the refractory period (larger than 1.3 ms). All the time stamps of the animal’s behavioral events were recorded as transistor–transistor logic pulses, which were generated by a Med Associates interface board and sent to the MAP recording system through an A/D board (Texas Instruments). The animal’s behavioral time stamps during the training session were synchronized and recorded together with the neural activity2,17,18.

Neuronal activities were aligned to lever pressing and averaged across trials in 1-ms bins, and then smoothed by a MATLAB built-in Gaussian filter to construct the firing PETH. Neurons showing significant response changes after lever pressing were defined as responsive neurons. The threshold for significance test was defined as mean (baseline firing) + 3 × s.d. (baseline firing) for excitation and mean (baseline firing) − 2 × s.d. (baseline firing) for inhibition, respectively17,18. The neuronal firing latency to lever pressing was defined as the beginning of significant firing rate change after lever pressing.

For those responsive neurons, we then calculated z-scores based on the PETH:

$$z\text{-score} = \frac{\mathrm{PETH} - \mathrm{mean}(\text{baseline firing})}{\mathrm{s.d.}(\text{baseline firing})}$$

The firing index was defined as the z-score value at the maximum (excitation) or the minimum (inhibition) after lever pressing18. Data analyses were performed in MATLAB with custom-written programs (MathWorks).
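
As an illustration of the responsiveness test and firing index described above, the following Python sketch applies the stated thresholds (mean + 3 × s.d. for excitation, mean − 2 × s.d. for inhibition) to a smoothed PETH. It is not the authors' MATLAB code. The function names, the use of the sample standard deviation, and the choice to test excitation before inhibition when both thresholds are crossed are assumptions.

```python
from statistics import mean, stdev


def zscore_peth(peth, baseline):
    """z-score each PETH bin against the baseline firing (sample s.d. is an assumption)."""
    mu, sd = mean(baseline), stdev(baseline)
    return [(x - mu) / sd for x in peth]


def firing_index(post_press_peth, baseline):
    """Classify a neuron and return (label, firing index).

    Excitation: post-press PETH exceeds mean + 3 * s.d. of baseline.
    Inhibition: post-press PETH falls below mean - 2 * s.d. of baseline.
    The firing index is the z-score at the maximum (excitation) or minimum (inhibition).
    """
    mu, sd = mean(baseline), stdev(baseline)
    z = zscore_peth(post_press_peth, baseline)
    if max(post_press_peth) > mu + 3 * sd:
        return "excited", max(z)
    if min(post_press_peth) < mu - 2 * sd:
        return "inhibited", min(z)
    return "non-responsive", None


if __name__ == "__main__":
    baseline = [4, 5, 6, 5, 4, 6, 5, 5]      # hypothetical baseline firing rates (Hz)
    print(firing_index([5, 6, 9, 12, 7], baseline))
    print(firing_index([5, 4, 2, 3], baseline))
```

The asymmetric thresholds follow the text; firing rates are bounded below by zero, so a looser criterion for inhibition is a common choice.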

Computational modeling.

A computational model of action timing was constructed based on the following hypotheses: (1) the model operates on a pacemaker–accumulator-like mechanism19,20; (2) the animal’s own operant behavior serves as the pacemaker, that is, each operant action, or lever press, is a pulse emitted from the pacemaker; (3) the sensory outcome derived from the animal’s own operant action provides positive feedback to increase the motivation for, or probability of, the same operant action occurring in the following time period; (4) an integration center accumulates the pulses (that is, operant lever presses) emitted from the pacemaker and compares the accumulated value with a reference memory formed during previous learning experiences. When the integration value in the accumulator exceeds the value defined by the reference memory, the decision-making mechanism stops operant actions.

More specifically, the model consists of the following components (Fig. 7a): after a random latency, T0, following trial initiation, lever pressing, P(t), occurs according to a Poisson process with a baseline probability, μ, for each time bin71. The auditory effect derived from each lever press serves as positive feedback to increase the motivation, or probability, of lever pressing occurring in the following time period, that is, p(t) = μ {1 + f(t)}. The integration center accumulates the lever presses, α(t) = ∫P(t) dt, and compares this value with a threshold, θ, defined in the reference memory. The model continues the process of generating lever presses until α(t)/θ ≥ 1, at which point in time the decision-making mechanism is engaged to stop lever pressing. During auditory deprivation, sensorimotor feedback is blocked, and so the positive feedback function becomes f(t) = 0, with p(t) = μ. During closed-loop optogenetic stimulation in ear-sealed mice, each lever press triggers positive feedback, fopto(t), generating the motivational effect of increasing lever pressing probability, or p(t) = μ {1 + fopto(t)}.

Inspired by our own experimental data in mice, the parameters used for the simulation are: T0, generated from a normal distribution with a mean of 12 s and s.d. of 12 s (T0 = 0 if the value is negative); μ = 0.15, the baseline lever press probability for each simulation time bin of 200 ms (~45 lever presses per min); f(t) = 1.2e^(−t/2), the sensorimotor feedback function, which increases the probability of the following lever press occurring up to ~10 s with a half-life of ~1.4 s. During closed-loop optogenetic stimulation in ear-sealed mice, the sensorimotor feedback is partially restored to ~50%, or fopto(t) = 0.60e^(−t/2). All lever presses occur according to the probability function, p(t), and are integrated by the accumulator, α(t), whose value in turn is momentarily compared to a noisy threshold reference memory, θ = 55 with an s.d. of 10, to decide when to stop or continue lever presses across trials for a given animal. For the control, auditory deprivation and optogenetic stimulation conditions, the model was run multiple times to simulate 50 probe trials for a given animal. For simulating a group of animals, the baseline reference memory, θ = 55, remained the same, but with an s.d. of 10 across animals to take into account that each animal might form a slightly different reference memory during learning. The lever-press rate, peak time and half peak fall time were characterized and quantified the same way as in the mouse experiments, and averages were obtained from multiple simulated animals. It is important to note that the lever-pressing sequences generated in this model do not contain any reward-time-related information, and, as in the actual behavioral data (Extended Data Fig. 1j), there is no peak timing of responses in individual trials (Fig. 7e). Nevertheless, by using a leaky integrator in this model and/or different parameter values, the simulation generates results qualitatively consistent with our experiments. 
The model program and all simulations were created and run on MATLAB R2019a (MathWorks).
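
The model described above can be sketched in a few lines of Python. This is an illustrative reimplementation under stated assumptions, not the authors' MATLAB program. In particular, the text does not specify how feedback from successive presses combines, so the sketch assumes that t in f(t) is the time since the most recent press, that feedback is zero beyond 10 s, that the threshold θ is redrawn for every trial, and that a probe trial lasts 100 s. All function and variable names are hypothetical.

```python
import math
import random

BIN_S = 0.2                        # simulation time bin (200 ms)
MU = 0.15                          # baseline press probability per bin
THETA_MEAN, THETA_SD = 55.0, 10.0  # noisy reference-memory threshold
GAINS = {"control": 1.2, "deprived": 0.0, "opto": 0.6}


def feedback(dt_s, gain):
    """f(t) = gain * exp(-t/2); assumed to act for up to 10 s after the most recent press."""
    return gain * math.exp(-dt_s / 2.0) if dt_s <= 10.0 else 0.0


def simulate_trial(gain, rng, trial_s=100.0):
    """Return lever-press times (s) for one simulated probe trial."""
    t0 = max(0.0, rng.gauss(12.0, 12.0))      # random start latency T0, clipped at 0
    theta = rng.gauss(THETA_MEAN, THETA_SD)   # threshold drawn per trial (assumption)
    presses = []
    for b in range(round(trial_s / BIN_S)):
        t = b * BIN_S
        if t < t0:
            continue
        if len(presses) >= theta:             # accumulator reached threshold: stop pressing
            break
        f = feedback(t - presses[-1], gain) if presses else 0.0
        if rng.random() < MU * (1.0 + f):     # p(t) = mu * {1 + f(t)}
            presses.append(t)
    return presses


def mean_stop_time(gain, n_trials=50, seed=0):
    """Average time of the last press across trials, a rough proxy for when pressing ends."""
    rng = random.Random(seed)
    stops = [p[-1] for p in (simulate_trial(gain, rng) for _ in range(n_trials)) if p]
    return sum(stops) / len(stops)


if __name__ == "__main__":
    for name, gain in GAINS.items():
        print(name, round(mean_stop_time(gain, n_trials=400), 1))
```

In this sketch, removing the feedback term lowers the press rate, so the accumulator takes longer to reach threshold and pressing ends later, while partial feedback gives an intermediate result. The last-press time is only a stand-in for the half peak fall time, which in the study was computed from the averaged PETH.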

Histological processing and stereological analysis.

Mice were administered a xylazine and ketamine cocktail (10 mg and 100 mg per kg body weight, respectively) intraperitoneally and were transcardially perfused with 8–10 ml of 0.9% saline, followed by 8–10 ml of 4% paraformaldehyde in PBS. Following extraction, brains were placed in 4% paraformaldehyde in PBS overnight, before being placed in 30% sucrose in PBS for 5–7 d in preparation for sectioning. Then, 40-μm coronal sections were prepared using a freezing microtome and tissue was labeled with the following antibodies: chicken anti-GFP (1:500 dilution; Aves Labs, GFP-1020), mouse anti-mCherry (1:500 dilution; Clontech, 632543), rabbit anti-cFos (1:500 dilution; Synaptic Systems, 226 003), Alexa-488 donkey anti-chicken (1:250 dilution; Jackson Immuno Research, 703-545-155), Cy3 donkey anti-rabbit (1:250 dilution; Jackson Immuno Research, 711-165-152) and Cy3 donkey anti-mouse (1:250 dilution; Jackson Immuno Research, 715-165-151). For secondary and primary incubations, antibody blocking was performed using normal horse serum (Sigma-Aldrich, H0146). Antibody amplification was used to visualize ChR2-mCherry and ChR2-EYFP. Images were taken at ×20 magnification using an Olympus VS120 slide scanner18.

Estimation of number of cells activated by optogenetics.

Using a formula that accounts for scattering, absorption and geometric loss of light through the brain72, we calculated the total illuminated volume that is within the intensity threshold (≥1 mW/mm²) to activate ChR2 and elicit spikes (Extended Data Fig. 7c). This region, which is based on the 5-mW initial intensity at the fiber tip (~159 mW/mm²), makes up a cone with an angle of 33° (determined by the numerical aperture of the fiber used) that stretches 0.87 mm downward. Using the trigonometric relationship between height and the angle of light divergence, we calculated the radius of the base of this cone, and determined the volume to be 0.061 mm³. Based on the known density of neurons in the auditory cortex73, and the percentage of Cre-ON and Cre-OFF cells from our head-fixed recordings (Extended Data Fig. 6f,i), we calculated that ~1,100 cells per hemisphere were activated in the Cre-ON condition, and ~5,600 cells per hemisphere in the Cre-OFF condition. Given the fiber placement and downward light penetrance of the activation cone, along with the fact that AUDd is the most dorsal region of the three auditory cortical fields, activation of the lower-density populations in the AUDp and the AUDv is unlikely because it is largely beyond the depth whereby light intensity is sufficiently high to elicit spikes.
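
The geometric part of this estimate can be reproduced with a short Python sketch. Two assumptions are made that the text does not state explicitly: the 33° figure is treated as the full apex angle of the cone (half-angle 16.5°), and the cone apex is placed at the fiber tip, ignoring the finite core radius. The 200-μm core diameter is taken from the Stereotaxic surgery section, and the function names are hypothetical.

```python
import math


def fiber_tip_irradiance(power_mw, core_diameter_mm):
    """Irradiance at the fiber tip (mW/mm^2): power divided by the core cross-sectional area."""
    radius = core_diameter_mm / 2.0
    return power_mw / (math.pi * radius ** 2)


def cone_volume(height_mm, full_angle_deg):
    """Volume of a right circular cone (mm^3) from its height and full apex angle."""
    base_radius = height_mm * math.tan(math.radians(full_angle_deg / 2.0))
    return math.pi * base_radius ** 2 * height_mm / 3.0


if __name__ == "__main__":
    print(fiber_tip_irradiance(5.0, 0.2))   # 5 mW through a 200-um core
    print(cone_volume(0.87, 33.0))          # cone reaching 0.87 mm below the tip
```

Under these assumptions the sketch gives roughly 159 mW/mm² at the tip and a cone volume of about 0.06 mm³, in line with the reported values. Multiplying that volume by a neuronal density and by the fraction of light-responsive units would then give a cell-count estimate of the kind reported above.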

cFos- and GFP-positive cell counting.

For cFos experiments measuring hemispheric differences in activation, C57BL/6 animals each had a single ear randomly sealed following 21 d of training on SFI, and were then allowed to perform a full session of SFI. At the completion of the session, animals were immediately euthanized. Before sectioning, brains were scored with a line across the cortex of the left hemisphere to ensure counts for a given hemisphere were correctly associated with the sealed or open ear. For cFos and GFP quantification, images were taken with a VS120 slide scanner and were saved as JPEG files using the Olympus offline proprietary software, VS-ASW. Images were then overlaid in Adobe Photoshop with coronal section maps from the Allen Brain Institute Mouse Atlas (https://mouse.brain-map.org/) and manually counted74. Cortical layers were determined by overlaying DAPI images from the same slices.

Quantification of activity-dependent cell labeling density across auditory cortical fields.

The DIO-ChR2 construct used for the stimulation experiments does not clearly fill somas, so direct quantification of ChR2-labeled cells was not possible. However, based on the Cre-positive cells labeled under the ears-open condition using AAV9-FLEX-GFP injected into the AUDd (Extended Data Fig. 5c,d,g), we determined the number of Cre-positive cells in each of the auditory cortical fields for the ears-open/Cre-ON condition.

Statistical analyses.

No statistical methods were used to predetermine sample sizes but our sample sizes are similar to those reported in previous studies in mice16,68. Data collection and analysis were performed blind to the conditions of the experiments. No randomization procedures were used. Any exclusion criteria have been already stated above. Data distribution was assumed to be normal, but this was not formally tested. Exemplar histological images were selected that closely resembled expression patterns seen overall in the experimental group. Learning performance, sensory deprivation performance (unilateral, bilateral ear seal with milk/pellet reward), model simulation results, acoustic startle experiments and von Frey test results were analyzed using one-way ANOVA with Tukey’s post hoc multiple-comparison tests. FosTRAP optogenetic manipulations, FosTRAP laminar validation, and auditory unilateral ear seal cFos expression experiments were analyzed using two-way ANOVA with Sidak post hoc multiple-comparison tests. Pharmacological inactivation, VISp unilateral ear-seal cFos expression and ESARE optogenetic experiments were analyzed using paired t-tests. Extinction experiments were analyzed using an unpaired t-test. Response-rate relationships across learning, relationship of optically induced changes in response rate versus fall time and the relationship between response rate with respect to AUDd cFos expression were analyzed by performing a linear regression and calculating the Pearson correlation coefficient. Statistical analyses were performed using MATLAB and Prism software.

Reporting Summary.

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Extended Data

Extended Data Fig. 1 |. Self-initiated learning and performance of SFI peak timing is characterized notably by half peak PETH fall time, which reflects individual step-like lever press responding, related to Figs. 1 and 2.

a, Correlation analysis of average response rate (n = 10) at 30 s versus 70 s based on PETH for omission trials across training. b, The Pearson correlation coefficient can be calculated for the average response rate at each second over the omission trial window versus the average response rate at 30 s across learning showing autocorrelation around 30 s and a cross-temporal period of correlation around 70 s. The arrow denotes the Pearson correlation coefficient for the correlation in (a). c-d, Trial-by-trial analysis for the number of presses from 20–30 s and 60–70 s confirms the relationship between response rate changes at 30 s and 70 s from PETHs in (a). c, The response raster of an exemplar for the 5th trial of training days 1, 4, 7 and 21 shows that as the number of responses increases around 30 s, the number of responses around 70 s decreases. d, The number of responses over 20–30 s for the same exemplar in (c) negatively correlates with the number of responses over 60–70 s across learning for the 5th trial (top). The average number of presses for all animals (n = 10) for the 5th trial over 20–30 s negatively correlates with the average number of presses over 60–70 s (bottom). e-f, Response rasters (top), response rate PETH (middle), and percent maximum response rate PETH (bottom) of exemplars for rewarded (left) and omission trials (right) performing regular SFI (70% rewarded trials) on day 21 after being trained 20 days on either SFI with 70% (e) or 100% (f) rewarded trials. g-i, The latency to initiate post-lever extension does not change across training days (g), and does not show a relationship to half peak fall time (h) or peak time (i) at day 21 of training (n = 10). j, Alignment to the start and stop of pressing bouts reveals that peak timing is an artifact of averaging many trials.
Bouts of pressing for an individual animal can be defined based on the trial-by-trial press rate to establish the start and stop (left, top, red hash marks) of pressing sequences that give rise to the overall PETH, which can be used to define the PETH peak time and half peak rise and fall time (left, bottom). Alignment of pressing bouts to the start (middle, top) and stop (right, top) reveals that PETHs resemble step functions (middle and right, bottom), indicating that trial-by-trial behavior is in fact characterized by low rates of responding early on, followed by an abrupt switch to a constant high rate, and then an abrupt return to no responding. k, Correlating PETH half peak fall time versus mean trial-by-trial stop time at day 21 of training reveals that the half peak fall time metric can serve as an accurate measure of overall individual trial stop times. l, Same as (k) but for PETH half peak rise time versus trial-by-trial start time. m-n, The PETH half peak rise time changes minimally across 21 days of training (m, main effect of treatment F(4, 36) = 11.54, P < 0.0001; day 1 vs. day 21, P < 0.0001), compared to half peak fall time (see Fig. 1), and shows nonsignificant changes with ear sealing (n). o-p, Analysis of interpress interval times between individual trial start and stop pressing times. o, The histogram of all interpress interval times between the start and stop times for all trials for the exemplar shown in (j) follows a Poisson-like distribution. p, Autocorrelation function coefficients calculated for the sequence of interpress intervals across the second (top) and 10th (bottom) trials for increasing lag times, again for the exemplar shown in (j). Grey, non-labeled points denote intermediate training days for learning correlation plots. Error bars denote s.e.m. For correlation plots analyzing latency to initiate post-lever extension, and trial-by-trial mean stop time, grey points denote individual animals.
Grey shading for all correlation plots denotes the 95% confidence interval for the regression, and PCC denotes Pearson correlation coefficient. **** P < 0.0001; NS, not significant. Blue dotted lines for autocorrelation functions denote rejection region bands for testing individual autocorrelations.
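The PETH metrics named in this legend (peak time, half peak rise time and half peak fall time) can be sketched as follows. This is a hedged illustration only: the exact definition used in the study is not given in this section, so the sketch assumes the half-peak times are the points nearest the peak where the PETH crosses 50% of its maximum, found by linear interpolation between bins. The function name `half_peak_times` and the example PETH are hypothetical.

```python
def half_peak_times(times, rates):
    """Return (peak_time, half_peak_rise_time, half_peak_fall_time) for a PETH.

    Assumption: each half-peak time is the crossing of 50% of the maximum
    rate closest to the peak, linearly interpolated between adjacent bins.
    A value of None means the PETH never drops below half peak on that side.
    """
    if len(times) != len(rates) or not rates:
        raise ValueError("times and rates must be non-empty and equal in length")
    peak_idx = max(range(len(rates)), key=rates.__getitem__)
    half = rates[peak_idx] / 2.0

    def crossing(i_above, i_below):
        # Interpolate between a bin at/above half peak and its neighbor below it
        t_a, r_a = times[i_above], rates[i_above]
        t_b, r_b = times[i_below], rates[i_below]
        frac = (r_a - half) / (r_a - r_b)
        return t_a + frac * (t_b - t_a)

    # Walk left from the peak until the next bin falls below half peak
    i = peak_idx
    while i > 0 and rates[i - 1] >= half:
        i -= 1
    rise = crossing(i, i - 1) if i > 0 else None

    # Walk right from the peak until the next bin falls below half peak
    j = peak_idx
    while j < len(rates) - 1 and rates[j + 1] >= half:
        j += 1
    fall = crossing(j, j + 1) if j < len(rates) - 1 else None

    return times[peak_idx], rise, fall

# Hypothetical PETH in 10-s bins (presses per minute); not data from the study
example_times = [0, 10, 20, 30, 40, 50, 60]
example_rates = [0.0, 2.0, 6.0, 10.0, 6.0, 2.0, 0.0]
peak_time, rise_time, fall_time = half_peak_times(example_times, example_rates)
```

Because the metric is computed on the trial-averaged PETH, it summarizes the distribution of individual-trial stop times rather than any single trial, which is the point made in panels (j) and (k).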

Extended Data Fig. 2 |. SFI ear seal learning, related to main Fig. 2.

a, Behavior of an example mouse for probe trials at day 1, 4, 7 and 21 of SFI ear seal training. b, Average training PETHs for ears open (left, n = 10) and ear sealed (right, n = 8) for response rate (top) and percent maximum response rate (bottom). c-e, The presses per minute at 30 s (c, effect of interaction F(4, 64) = 4.55, P = 0.0027; ears open vs. sealed: day 7, day 14, and day 21, P = 0.0009, P = 0.0008, and P = 0.0004, respectively), peak time (d), and half peak fall time (e, effect of interaction F(4, 64) = 3.43, P = 0.0133; ears open vs. sealed: day 4 and day 7, P < 0.0001 and P = 0.0139, respectively) across training days for ears open (n = 10) and sealed (n = 8). Learning data were analyzed using two-way ANOVA followed by Sidak post hoc comparisons. Values for performance metrics are means. Shading for average PETHs and error bars denote s.e.m. **** P < 0.0001, *** P < 0.001, * P < 0.05; NS, not significant.

Extended Data Fig. 3 |. Auditory deprivation effects on SFI performance are independent of any cues related to reward delivery, related to main Fig. 2.

a-e, Sweetened, condensed milk (n = 5) can be used as a reward for SFI training. a, Response rasters (top), response rate PETH (middle) and percent maximum response rate PETH (bottom) of an exemplar for omission trials performing SFI under auditory deprivation (ears sealed) between flanking control sessions (pre-control and post-control) using sweetened, condensed milk as a reward. SFI performance with milk reward can be measured by the response rate at 10 s (pre-control middle) and the half peak fall time (pre-control bottom) for omission trials. Ear sealing has no effect on the rewarded head entry time (b), or how often animals check for the milk reward by making a head entry (c), indicating auditory deprivation does not have an effect on a sensory cue related to reward availability. d, Auditory deprivation experiments on animals trained using the milk reward showed similar effects on response rate (left, main effect of treatment F(2,8) = 9.209, P = 0.0084; ears sealed versus pre-/post-control, P = 0.0115 and P = 0.0191, respectively) and half peak fall time (right, main effect of treatment F(2,8) = 9.414, P = 0.0079; ears sealed versus pre-/post-control, P = 0.0107 and P = 0.0183, respectively) as with the pellet reward. e, Average PETHs for response rate (top) and percent maximum response rate (bottom) for performance with sweetened, condensed milk reward on pre-control session and ear seal session. f-h, Separate groups of animals underwent extinction through exposure to continuous omission trials with ears open (n = 8) or sealed (n = 10) also demonstrating that the auditory deprivation effects on SFI performance were independent of a reward-related cue. f, Under auditory deprivation during extinction, response rate at 30 s decreased (left, two-tailed, unpaired t-test, t = 5.076, P = 0.0001), half peak fall time increased (right, two-tailed, unpaired t-test, t = 3.347, P = 0.0041), and no significant change was observed in the peak time. 
g, Response rasters (top), response rate PETH (middle), and percent maximum response rate PETH (bottom) of exemplars for omission trials performing extinction post-21 days of training on SFI with ears open (left) and sealed (right). h, Average PETHs for response rate (top) and percent maximum response rate (bottom) for ears sealed and open groups. SFI milk data were analyzed using repeated-measures one-way ANOVA followed by Tukey post hoc comparisons. Extinction results were analyzed using unpaired t-tests. Values for performance metrics are means, and error bars and shading for average PETHs denote s.e.m. *** P < 0.001; ** P < 0.01; * P < 0.05; NS, not significant.

Extended Data Fig. 4 |. Effects of unilateral auditory deprivation on cFos expression in auditory structures and striatum, related to main Fig. 3.

a, cFos expression in AUDd (left), ipsilateral to the sealed ear (middle), and ipsilateral to the open ear (right) of a unilateral ear sealed animal. b-d, Same as (a), but for the medial geniculate (b), inferior colliculus (c), and striatum (d). Scale bars for immunohistochemical images denote 200 μm for auditory structures, and 1 mm for striatum. ‘D’ and ‘L’ denote dorsal and lateral, respectively.

Extended Data Fig. 5 |. Effects of unilateral auditory deprivation on the visual system activation, and validation of FosTRAP labeling system in AUDd, related to main Figs. 3 and 4.

a-b, Random unilateral ear sealing does not disrupt cFos expression across hemispheres in VISp. a, cFos expression across hemispheres in VISp was quantified in the same group of animals that underwent random unilateral ear sealing while performing SFI and were then sacrificed. b, cFos immunohistochemistry for an exemplar showing the VISp cortical region for an animal sacrificed on day 21 of SFI training with one ear sealed (left) and one ear open (middle). Scale bars denote 50 μm. Comparison of percent activation according to cFos counts across hemispheres for the VISp cortical region (right) ipsilateral versus contralateral to the sealed ear of animals sacrificed upon completion of SFI at day 21 (n = 5) of training. c-g, FosTRAP expression recapitulates the cFos protein expression pattern in AUDd. c, FosTRAP animals were trained for 21 days on SFI and split into two groups. One group (top, n = 4) was induced with 4-OHT after being injected with a Cre-dependent AAV expressing GFP in AUDd and was later sacrificed immediately after performing SFI again with ears open. The other group (bottom, n = 4) performed the no lever version of the task, also with ears open, and was sacrificed immediately following session completion. d, Percent distribution across the cortical layers in AUDd of cFos protein and cFosCreERT2 (as visualized via Cre-dependent GFP expression) in FosTRAP animals that were induced (green) while performing SFI with their ears open and later, in another SFI session, immediately sacrificed upon completion (black), again with ears open. These distributions were compared to the cFos protein percent distribution of FosTRAP animals that were sacrificed upon session completion in the no lever context with their ears open (red; n = 4; effect of interaction F(8,45) = 2.335, P = 0.0343; no lever layer V cFos protein vs. SFI layer V cFosCreERT2/cFos protein, P = 0.0172 and P = 0.0496, respectively).
e-f, AUDd cortical layer expression pattern of endogenous cFos protein for FosTRAP animals sacrificed immediately after performing SFI (e) and being in the no lever context (f) with their ears open. g, FosTRAP cFosCreERT2 expression as visualized via Cre-dependent GFP expression in AUDd of an animal induced with its ears open while performing SFI. Scale bars denote 100 μm. h, Fiber placement in AUDd of FosTRAP animals injected with an AAV expressing Cre-dependent ChR2-EYFP and induced with their ears open while performing SFI. Scale bar denotes 500 μm. For VISp cFos expression analysis, data were analyzed using a paired t-test. Bars denote mean percentage across hemispheres. For FosTRAP validation and AUDd laminar analysis, data were analyzed using two-way ANOVA followed by Sidak post hoc comparisons. Values are mean percentages across layers. Error bars denote s.e.m. For all immunohistochemistry images, ‘D’ and ‘L’ denote dorsal and lateral, respectively. * P < 0.05; NS, not significant.

Extended Data Fig. 6 |. Firing properties of AUDd neurons with ears open versus sealed and photoactivation of AUDd FosTRAPed populations, related to main Figs. 4 and 6.

a, Baseline firing rates of AUDd neurons with lever press responses exhibit no significant difference between ears open and sealed (n = 83). b, Raw peak firing rate (not normalized to baseline firing rate) of neurons with activated responses upon lever pressing (left, n = 11) decreased significantly from ears open to ears sealed (open: 17.0 ± 4.7 Hz; sealed: 14.2 ± 5.0 Hz; two-tailed, paired t-test, t = 2.480, P = 0.0325), while neurons with inhibited responses (right, n = 5) showed no significant change with ear sealing. c, Latency of peak/dip responses upon lever pressing between ears open and sealed for activated (left, n = 11) and inhibited (right, n = 5) neurons showed no significant difference. d, Single units were recorded extracellularly from AUDd in FosTRAP animals (n = 2, 2 recording sessions) in which ChR2-EYFP was expressed in a Cre-ON dependent manner following 4-OHT induction. e, Raster plot (left) and PSTH (right) of an example photo-tagged unit (significant response within 10 ms of light stimulus onset; P < 0.01, Stimulus-Associated spike Latency Test (SALT)). f, Light modulation index (difference in light-evoked and baseline firing rate divided by their sum) of all single units recorded at different cortical depths. Filled dots indicate putatively photo-tagged units (P < 0.01, SALT). Bars indicate the mean light modulation index in 100 μm bins (unfilled: all units, filled: photo-tagged units). Pie chart: percent of single units that are putatively photo-tagged (P < 0.01, SALT). g, Single units were recorded extracellularly from AUDd in FosTRAP animals (n = 2, 3 recording sessions) in which ChR2-mCherry was expressed in a Cre-OFF dependent manner following 4-OHT induction. h-i, Same as for (e-f) but for the Cre-OFF population.
j-l, Correlation analysis of change in the press rate at 30 s and change in half peak fall time between stimulation and no stimulation trials (Δ = stimulation - no stimulation) for the three experimental conditions tested with FosTRAP: ears open/Cre-ON (j), ears sealed/Cre-ON (k), and ears open/Cre-OFF (l). Firing indices are means, and error is s.e.m. Differences in firing index were analyzed using paired t-tests. * P < 0.05; NS, not significant. Grey shading for correlation plots denotes the 95% confidence interval for the regression. PCC denotes Pearson correlation coefficient.
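The light modulation index defined in panel (f) is the difference between light-evoked and baseline firing rate divided by their sum. A minimal Python sketch of that formula follows. The function name `light_modulation_index`, the handling of a unit with no spikes in either window, and the example rates are assumptions made for illustration and are not taken from the study.

```python
def light_modulation_index(evoked_hz, baseline_hz):
    """(evoked - baseline) / (evoked + baseline), bounded between -1 and 1."""
    if evoked_hz < 0 or baseline_hz < 0:
        raise ValueError("firing rates must be non-negative")
    total = evoked_hz + baseline_hz
    if total == 0:
        # Assumption: the index is treated as undefined for a silent unit
        return float("nan")
    return (evoked_hz - baseline_hz) / total

# Hypothetical firing rates in Hz (not recorded values)
lmi_excited = light_modulation_index(15.0, 5.0)
lmi_unchanged = light_modulation_index(5.0, 5.0)
lmi_silenced = light_modulation_index(0.0, 4.0)
```

An index near 1 indicates a unit driven strongly by light, 0 indicates no change, and values near -1 indicate suppression during illumination.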

Extended Data Fig. 7 |. FosTRAPed populations, and assessment of muscimol spread, related to main Fig. 4.

a, Exemplar histology from a FosTRAP animal showing Cre-dependent expression of EYFP is primarily restricted to AUDd. Optrode placement is overlaid showing conical light spread from the fiber tip. b, Cell density of Cre-ON cells in the three auditory cortical fields (top), and percentage within each field (bottom). The highest density (main effect of treatment F(2, 9) = 4.305, P = 0.0488, AUDd vs. AUDv, P = 0.0439) and percentage (main effect of treatment F(2, 9) = 17.75, P = 0.0008, AUDd vs. AUDv, P = 0.0006) of cells are located within AUDd, with decreasing expression moving ventrally to AUDv (n = 4). c, Diagram depicting parameters used to calculate the conical volume of illumination sufficient to induce spiking. Using a formula that takes into account scattering, absorption, and geometric loss of light through the brain, we calculated the total illuminated volume that is within the intensity threshold (≥1 mW/mm²) to activate ChR2 and elicit spikes. This volume, which is based on the 5 mW initial intensity at the fiber tip (~159 mW/mm²), makes up a cone with an angle of θ = 33° that stretches z = 0.87 mm downward. Using the trigonometric relationship between z and θ, the radius of the base of this cone, r, can be calculated. Based on r and z, the volume of the cone, 0.061 mm³, can be determined. Using the neuronal density of the auditory cortex, and the percentage of Cre-ON and Cre-OFF cells from laminar recordings, we calculated ~1,100 cells/hemisphere were activated in the ears open/Cre-ON condition, and ~5,600 cells/hemisphere in the ears open/Cre-OFF condition. d, Exemplar histology of an animal implanted with a cannula in AUDd, and sacrificed following infusion of fast green, showing dye is largely restricted to AUDd. For FosTRAP cell density and regional percentage quantifications, data were analyzed using one-way ANOVA followed by Tukey post hoc comparisons, and bars are means.
Scale bars for all immunohistochemical images denote 1 mm, and ‘D’ and ‘L’ denote dorsal and lateral, respectively. ***P < 0.001; *P < 0.05; NS, not significant.
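The cone geometry described in panel (c) can be checked with a short Python sketch. It covers only the trigonometric step (radius and volume from z and θ) and does not reproduce the light-attenuation formula, which is not given in this legend. The sketch assumes θ is the full apex angle of the cone, and the function name `cone_radius_and_volume` is hypothetical.

```python
import math

def cone_radius_and_volume(depth_mm, apex_angle_deg):
    """Base radius (mm) and volume (mm^3) of a right circular cone.

    Assumption: apex_angle_deg is the full opening angle, so the
    half-angle is used in the tangent relationship r = z * tan(theta / 2).
    """
    half_angle = math.radians(apex_angle_deg / 2.0)
    radius = depth_mm * math.tan(half_angle)
    volume = math.pi * radius ** 2 * depth_mm / 3.0
    return radius, volume

# Values quoted in the legend: z = 0.87 mm, theta = 33 degrees
radius_mm, volume_mm3 = cone_radius_and_volume(0.87, 33.0)
```

Under this reading of θ, the sketch gives a base radius of about 0.26 mm and a volume of about 0.06 mm³, in line with the 0.061 mm³ value quoted in the legend. Multiplying such a volume by a neuronal density and by the fraction of labeled cells would give a cell count of the kind reported, but those inputs are not stated here and are not assumed.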

Extended Data Fig. 8 |. Optogenetic, press-dependent perturbation of AUDd via stimulation of CAMKII+ populations with ears open disrupts SFI timing performance, related to main Fig. 5.

a, Exemplar histology from a CAMKII-Cre animal showing Cre-dependent expression of ChR2-EYFP is primarily restricted to AUDd. b, Behavior of an example animal using closed-loop, press-triggered optical stimulation (5 mW, 100 ms per press) of CAMKII+ populations in AUDd during SFI performance (blue line denotes stimulation and grey denotes no stimulation). c, Stimulation and no stimulation average PETHs (n = 8) for response rate (top) and percent maximum response rate (bottom) for press-triggered optical stimulation of CAMKII+ populations in AUDd with ears open. d, The optogenetic stimulation effects on response rates at 30 s (left, two-tailed, paired t-test, t = 4.295, P = 0.0036), peak time, and half peak fall times (right, two-tailed, paired t-test, t = 5.261, P = 0.0012) for press-dependent optical stimulation of CAMKII+ populations in AUDd (n = 8) during SFI performance with ears open. The scale bar for the immunohistochemical image denotes 400 μm, and ‘D’ and ‘L’ denote dorsal and lateral, respectively. Shading for average PETHs denotes s.e.m. ** P < 0.01; NS, not significant.

Supplementary Material

Supplementary Video 1 (33.5 MB, mov)
Supplementary Video 2 (28.6 MB, mov)

Acknowledgements

The authors thank H. Bito for the ESARE viral construct, B. Sabatini for the DO-ChR2 viral construct, and M. Goulding, C. Kintner and J. Thomas for helpful discussion.

This study was supported by grants from the National Institutes of Health under award numbers R01NS083815 and R01AG047669 to X.J., and EY022577 to E.M.C. and the McKnight Memory and Cognitive Disorders Award to X.J.

Footnotes

Online content

Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41593-022-01025-5.

Code availability

All code is available upon request.

Competing interests

The authors declare no competing interests.

Extended data is available for this paper at https://doi.org/10.1038/s41593-022-01025-5.

Supplementary information The online version contains supplementary material available at https://doi.org/10.1038/s41593-022-01025-5.

Data availability

All data are available upon reasonable request from the corresponding author.

References

1. Gallistel CR. The organization of action: a new synthesis. Behav. Brain Sci 4, 609–619 (1981).
2. Jin X & Costa RM. Start/stop signals emerge in nigrostriatal circuits during sequence learning. Nature 466, 457–462 (2010).
3. Prochazka A & Ellaway P. Sensory systems in the control of movement. Compr. Physiol 2, 2615–2627 (2012).
4. Gossard JP, Cabelguen JM & Rossignol S. An intracellular study of muscle primary afferents during fictive locomotion in the cat. J. Neurophys 65, 914–926 (1991).
5. Rossignol S, Dubuc R & Gossard JP. Dynamic sensorimotor interactions in locomotion. Physiol. Rev 86, 89–154 (2006).
6. Zehr EP & Stein RB. What functions do reflexes serve during human locomotion? Prog. Neurobiol 58, 185–205 (1999).
7. Guenthner CJ, Miyamichi K, Yang HH, Heller HC & Luo L. Permanent genetic access to transiently active neurons via TRAP: targeted recombination in active populations. Neuron 78, 773–784 (2013).
8. Abner RT, Edwards T & Douglas A. Pharmacology of temporal cognition in two mouse strains. Int. J. Comp. Psychol 14, 189–210 (2001).
9. Cheng K & Westwood R. Analysis of single trials in pigeons’ timing performance. J. Exp. Psychol. Anim. Behav. Process 19, 56–67 (1993).
10. Church RM, Meck WH & Gibbon J. Application of scalar timing theory to individual trials. J. Exp. Psychol. Anim. Behav. Process 20, 135–155 (1994).
11. Gallistel CR, King A & McDonald R. Sources of variability and systematic error in mouse timing behavior. J. Exp. Psychol. Anim. Behav. Process 30, 3–16 (2004).
12. Koch SC et al. RORβ spinal interneurons gate sensory transmission during locomotion to secure a fluid walking gait. Neuron 96, 1419–1431 (2017).
13. Schmucker C, Seeliger M, Humphries P, Biel M & Schaeffel F. Grating acuity at different luminances in wild-type mice and in mice lacking rod or cone function. Invest. Ophthalmol. Vis. Sci 46, 398–407 (2005).
14. Crawley JN et al. Behavioral phenotypes of inbred mouse strains: implications and recommendations for molecular studies. Psychopharmacology 132, 107–124 (1997).
15. Eliades SJ & Wang X. Neural substrates of vocalization feedback monitoring in primate auditory cortex. Nature 453, 1102–1106 (2008).
16. Schneider DM, Sundararajan J & Mooney R. A cortical filter that learns to suppress the acoustic consequences of movement. Nature 561, 391–395 (2018).
17. Geddes CE, Li H & Jin X. Optogenetic editing reveals the hierarchical organization of learned action sequences. Cell 174, 32–43 (2018).
18. Howard CD, Li H, Geddes CE & Jin X. Dynamic nigrostriatal dopamine biases action selection. Neuron 93, 1436–1450 (2017).
19. Buhusi CV & Meck WH. What makes us tick? Functional and neural mechanisms of interval timing. Nat. Rev. Neurosci 6, 755–765 (2005).
20. Gibbon J, Church RM & Meck WH. Scalar timing in memory. Ann. N. Y. Acad. Sci 423, 52–77 (1984).
21. Kawashima T et al. Functional labeling of neurons and their projections using the synthetic activity-dependent promoter E-SARE. Nat. Meth 10, 889–895 (2013).
22. Znamenskiy P & Zador AM. Corticostriatal neurons in auditory cortex drive decisions during auditory discrimination. Nature 497, 482–485 (2013).
23. Tervo DGR et al. A designer AAV variant permits efficient retrograde access to projection neurons. Neuron 92, 372–382 (2016).
24. Skinner BF. ‘Superstition’ in the pigeon. J. Exp. Psychol 38, 168–172 (1948).
25. Mello GBM, Soares S & Paton JJ. A scalable population code for time in the striatum. Curr. Biol 25, 1113–1122 (2015).
26. Auerbach BD, Rodrigues PV & Salvi RJ. Central gain control in tinnitus and hyperacusis. Front. Neurol 5, 206 (2014).
27. Caspary DM, Ling L, Turner JG & Hughes LF. Inhibitory neurotransmission, plasticity and aging in the mammalian central auditory system. J. Exp. Biol 211, 1781–1791 (2008).
28. Tollin DJ. The lateral superior olive: a functional role in sound source localization. Neuroscientist 9, 127–143 (2003).
29. Malmierca MS, Merchán MA, Henkel CK & Oliver DL. Direct projections from cochlear nuclear complex to auditory thalamus in the rat. J. Neurosci 22, 10891–10897 (2002).
30. Schofield BR, Motts SD, Mellott JG & Foster NL. Projections from the dorsal and ventral cochlear nuclei to the medial geniculate body. Front. Neuroanat 8, 10 (2014).
31. Huang CL & Winer JA. Auditory thalamocortical projections in the cat: laminar and areal patterns of input. J. Comp. Neurol 427, 302–331 (2000).
32. Lee CC. Exploring functions for the non-lemniscal auditory thalamus. Front. Neural Circuits 9, 69 (2015).
33. Smith PH, Uhlrich DJ, Manning KA & Banks MI. Thalamocortical projections to rat auditory cortex from the ventral and dorsal divisions of the medial geniculate nucleus. J. Comp. Neurol 520, 34–51 (2012).
34. Otazu GH, Tai L-H, Yang Y & Zador AM. Engaging in an auditory task suppresses responses in auditory cortex. Nat. Neurosci 12, 646–654 (2009).
35. Buonomano DV. Temporal information transformed into a spatial code by a neural network with realistic properties. Science 267, 1028–1030 (1995).
36. Buonomano DV & Maass W. State-dependent computations: spatiotemporal processing in cortical networks. Nat. Rev. Neurosci 10, 113–125 (2009).
37. Goldman MS. Memory without feedback in a neural network. Neuron 61, 621–634 (2009).
38. Laje R. Robust timing and motor patterns by taming chaos in recurrent neural networks. Nat. Neurosci 16, 925–933 (2013).
39. Liu JK & Buonomano DV. Embedding multiple trajectories in simulated recurrent neural networks in a self-organizing manner. J. Neurosci 29, 13172–13181 (2009).
40. Miller A & Jin DZ. Potentiation decay of synapses and length distributions of synfire chains self-organized in recurrent neural networks. Phys. Rev. E Stat. Nonlin. Soft Matter Phys 88, 062716 (2013).
41. Namboodiri VMK, Huertas MA, Monk KJ, Shouval HZ & Shuler MGH. Visually cued action timing in the primary visual cortex. Neuron 86, 319–330 (2015).
42. Shuler MG. Reward timing in the primary visual cortex. Science 311, 1606–1609 (2006).
43. Goodfellow LD. An empirical comparison of audition, vision, and touch in the discrimination of short intervals of time. Am. J. Psychol 46, 243–258 (1934).
44. Kubovy M. Should we resist the seductiveness of the space:time::vision:audition analogy? J. Exp. Psychol. Hum. Percept. Perform 14, 318–320 (1988).
45. Repp BH & Penel A. Rhythmic movement is attracted more strongly to auditory than to visual rhythms. Psychol. Res 68, 252–270 (2003).
46. Brosch M & Schreiner CE. Time course of forward masking tuning curves in cat primary auditory cortex. J. Neurophys 77, 923–943 (1997).
47. He J, Hashikawa T, Ojima H & Kinouchi Y. Temporal integration and duration tuning in the dorsal zone of cat auditory cortex. J. Neurosci 17, 2615–2625 (1997).
48. Welch RB, DuttonHurt LD & Warren DH. Contributions of audition and vision to temporal rate perception. Percept. Psychophys 39, 294–300 (1986).
49. Zhou X, de Villers-Sidani E & Panizzutti R. Successive-signal biasing for a learned sound sequence. Proc. Natl Acad. Sci. USA 107, 14839–14844 (2010).
50. Nobre AC & van Ede F. Anticipated moments: temporal structure in attention. Nat. Rev. Neurosci 19, 34–48 (2018).
51. Pizzera A & Hohmann T. Acoustic information during motor control and action perception: a review. Open Psychol. J 8, 183–191 (2015).
52. Repp BH & Penel A. Auditory dominance in temporal processing: new evidence from synchronization with simultaneous visual and auditory sequences. J. Exp. Psychol. Hum. Percept. Proc 28, 1085–1099 (2002).
53. Sacco T & Sacchetti B. Role of secondary sensory cortices in emotional memory storage and retrieval in rats. Science 329, 649–656 (2010).
54. Tsukano H et al. Reciprocal connectivity between secondary auditory cortical field and amygdala in mice. Sci. Rep 9, 19610 (2019).
55. Issa JB et al. Multiscale optical Ca2+ imaging of tonal organization in mouse auditory cortex. Neuron 83, 944–959 (2014).
56. Duque D, Ayala YA & Malmierca MS. Deviance detection in auditory subcortical structures: what can we learn from neurochemistry and neural connectivity? Cell Tissue Res. 361, 215–232 (2015).
57. Kraus N et al. Discrimination of speech-like contrasts in the auditory thalamus and cortex. J. Acoust. Soc. Am 96, 2758–2768 (1998).
58. Malmierca MS, Cristaudo S, Pérez-González D & Covey E. Stimulus-specific adaptation in the inferior colliculus of the anesthetized rat. J. Neurosci 29, 5483–5493 (2009).
59. Kelly JB. The effects of insular and temporal lesions in cats on two types of auditory pattern discrimination. Brain Res. 62, 71–87 (1973).
60. Layton LS, Toga AW, Horenstein S & Davenport DG. Temporal pattern discrimination survives simultaneous bilateral ablation of suprasylvian cortex but not sequential bilateral ablation of insular-temporal cortex in the cat. Brain Res. 173, 337–340 (1979).
61. Meck WH. Neuroanatomical localization of an internal clock: a functional link between mesolimbic, nigrostriatal, and mesocortical dopaminergic systems. Brain Res. 1109, 93–107 (2006).
62. Soares S, Atallah B & Paton JJ. Midbrain dopamine neurons control judgement of time. Science 354, 1273–1277 (2016).
63. Drew MR et al. Transient overexpression of striatal D2 receptors impairs operant motivation and interval timing. J. Neurosci 27, 7731–7739 (2007).
64. Ward D et al. Impaired timing precision produced by striatal D2 receptor overexpression is mediated by cognitive and motivational deficits. Behav. Neurosci 123, 720–730 (2009).
65. Murray JM & Escola GS. Learning multiple variable-speed sequences in striatum via cortical tutoring. eLife 6, e26084 (2017).
66. Chaplan SR, Bach FW, Pogrel JW, Chung JM & Yaksh TL. Quantitative assessment of tactile allodynia in the rat paw. J. Neurosci. Methods 53, 55–63 (1994).
67. Hintiryan H et al. The mouse cortico-striatal projectome. Nat. Neurosci 19, 1100–1114 (2016).
68. Lewis MC & Gould TJ. Reversible inactivation of the entorhinal cortex disrupts the establishment and expression of latent inhibition of cued fear conditioning in C57BL/6 mice. Hippocampus 17, 462–470 (2007).
69. Du J, Blanche TJ, Harrison RR & Lester HA. Multiplexed, high density electrophysiology with nanofabricated neural probes. PLoS ONE 6, e26204 (2011).
70. Stringer C et al. Inhibitory control of correlated intrinsic variability in cortical networks. eLife 5, e19695 (2016).
71. Dayan P & Abbott LF. Theoretical neuroscience: computational and mathematical modeling of neural systems (MIT Press, 2001).
72. Aravanis AM et al. An optical neural interface: in vivo control of rodent motor cortex with integrated fiberoptic and optogenetic technology. J. Neural Eng 4, 143–156 (2007).
73. Keller D, Erö C & Markram H. Cell densities in the mouse brain: a systematic review. Front. Neuroanat 12, 83 (2018).
74. Klug JR et al. Differential inputs to striatal cholinergic and parvalbumin interneurons imply functional distinctions. eLife 7, e35657 (2018).
