Abstract
In classical conditioning, cerebellar Purkinje cells learn an adaptively timed pause in spontaneous firing. This pause reaches its maximum near the end of the interstimulus interval. Although this timing was thought to arise from temporal patterns in the input signal combined with selective changes in synapse strength, we have shown that Purkinje cells learn timed responses even when the conditional stimulus is delivered to their immediate afferents.1 This shows that Purkinje cells have a cellular timing mechanism. The cellular models of intrinsic timing that we are aware of are based on adapting the rise time of the concentration of a given ion. As an alternative, we here propose, in abstract terms, a selection mechanism by which a Purkinje cell could learn to respond at a particular time after an external trigger.
Keywords: cerebellum, eyeblink conditioning, glutamate transmission, Purkinje cell, temporal timing, control
In classical conditioning, a neutral conditional stimulus repeatedly precedes an unconditional blink-eliciting stimulus by a fixed temporal delay, the interstimulus interval. As a result, the conditional stimulus acquires the ability to elicit a blink that is timed to that interval: the blink occurs just before the unconditional stimulus.2 In this learning paradigm, cerebellar Purkinje cells that control the blink learn to respond with a timed pause3-5 in their tonic inhibition of cerebellar nuclear cells, leading to an excitatory signal that generates the overt blink.6-8 The conditional and unconditional blink-eliciting signals reach the Purkinje cell via the mossy-parallel fiber system and climbing fibers, respectively.9
The Timing Mechanism is Intrinsic to the Purkinje Cell
Virtually all neural timing models postulate that neurons learn to time their responses by altering the strength of synaptic connections for selected subpopulations of pre-synaptic neurons.10,11 Following the onset of a stimulus, different pre-synaptic neurons are assumed to have activity peaks at different times during the interval. The signals in the parallel fibers with a peak towards the end of the interstimulus interval would coincide with the unconditional stimulus and climbing fiber activity. The synapses active at that time would be selectively recruited for long-term depression or long-term potentiation. When learning is complete, those granule cells that peak at the appropriate time control the timing of the Purkinje cell output. Thus, the timing of conditioned Purkinje cell responses would depend on a time code in the parallel fiber afferents transmitting the conditional stimulus.
However, as we have recently shown, adaptively timed responses also occur when the conditional stimulus is direct stimulation of parallel fibers. This demonstrates that the response timing does not reflect a temporal code in the input signal but must be due to a cellular timing mechanism that cannot be explained by changes in synapse strength.1,12 The Purkinje cell pause response was also shown to be resistant to pharmacological blockade of inhibitory interneurons. Our finding that the adaptive time course of the Purkinje cell conditioned response depends on a mechanism in the cell itself suggests that a glutamate trigger from parallel fibers activates a cellular mechanism which, after a particular delay, turns on a hyperpolarizing response with a specific duration.
What is the Learning Mechanism?
How can a neuron learn to respond with a particular delay in the range of hundreds of milliseconds between receptor activation and voltage response? Notice that we need both a mechanism for “recording” the time interval between the conditional and unconditional stimuli and a mechanism for generating the delayed response itself.
The first mechanism could involve some cumulative biochemical process that is terminated by the unconditional stimulus. One may envision that neurotransmitter receptor activation leads to a gradual build-up in the concentration of a given ion or second messenger molecule until some threshold level is reached. The second mechanism could be that this accumulation somehow also acquires the ability to hyperpolarize the cell.
If a receptor is coupled directly or indirectly to a rise in the concentration of a substance x that can acquire the ability to trigger a voltage response, the time delay between receptor activation and voltage response will depend on the number of receptors that are activated. If there is an x-dependent feedback connection that adjusts the number of available receptors, the neuron can learn to adjust the delay.
In an implementation of this theory, Steuber and Willshaw13 proposed that Ca2+-dependent phosphorylation of receptors could implement adjustable delays in this way. Activation of many different receptors produces a temporary increase in post-synaptic [Ca2+], and the latency of this response can range widely depending on the number of available receptors and second messengers, the number of steps between receptor activation and Ca2+ rise, and the rate constants at the different steps. Decreasing the number of available receptors leads to an increase in the latency of the Ca2+ rise. Any delay to a threshold level of [Ca2+] can then be learnt if two antagonistic biochemical processes control the number of available receptors.
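The dependence of latency on receptor number can be illustrated with a deliberately crude sketch. It assumes that each available receptor raises the concentration of x at a constant rate and that the response is triggered when x crosses a fixed threshold. This linear accumulation, the function name and the parameter values are illustrative assumptions only; they are not the Steuber and Willshaw model, which involves several biochemical steps with their own rate constants.

```python
def latency_to_threshold(n_receptors, rate_per_receptor=0.1, threshold=100.0):
    """Time in ms for [x] to reach threshold.

    Assumption: each receptor adds x at a constant rate (arbitrary units
    per ms), so the total rate is rate_per_receptor * n_receptors.
    """
    if n_receptors <= 0:
        return float("inf")  # no receptors: threshold is never reached
    return threshold / (rate_per_receptor * n_receptors)


# Fewer available receptors give a slower rise and hence a longer delay.
latencies = {n: latency_to_threshold(n) for n in (10, 5, 2)}
for n, t in latencies.items():
    print(f"{n:2d} receptors -> about {t:.0f} ms")
```

In this toy version, halving the number of receptors doubles the delay, which is the sense in which a feedback process that adjusts receptor number could adjust the latency.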
Simplified, the specific mechanism here is that before training the number of available receptors is large and Ca2+ influx causes regular depolarization because most Ca2+ activated hyperpolarizing channels are inactivated. The conditional stimulus also evokes PKC synthesis, which increases the number of receptors available.
During training, presentation of the unconditional stimulus evokes PKG production that decreases the number of available receptors, resulting in a slower [Ca2+] rise. The conditional-stimulus-evoked Ca2+/PKC peak moves towards the unconditional-stimulus-evoked PKG peak until the two overlap and an equilibrium is reached between PKG-induced receptor decrease and PKC-induced receptor increase. The [Ca2+] rise latency now also matches the interstimulus interval. Coincident PKC and PKG activation further leads to phosphorylation and activation of Ca2+-activated K+ channels. The conditional stimulus response is thus gradually transformed into a hyperpolarization response around the time of the unconditional stimulus presentation.
This model, indeed any model that depends on an adjustable concentration rise, raises several difficulties.
First, a learning mechanism that depends on adjusting the latency of a [x] rise will be sensitive to the duration and frequency of the conditional stimulus. However, consistent with data on both overt and Purkinje cell conditioned responses,14,15 we showed that a conditioned response timed to a 150 ms interstimulus interval was the same on post-training probe trials whether we delivered eight pulses at 400 Hz (17.5 ms) or 81 pulses at 100 Hz (800 ms). It is difficult to see how such disparate receptor activations could produce the same [x] rise in the Purkinje cell.
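The two train durations quoted above follow from the pulse count and frequency if the duration is taken as the time from the first to the last pulse. A minimal sketch of that arithmetic (the function name is hypothetical):

```python
def train_duration_ms(n_pulses, freq_hz):
    """Time from the first to the last pulse of a regular train, in ms.

    n_pulses pulses are separated by (n_pulses - 1) inter-pulse intervals
    of 1000 / freq_hz ms each.
    """
    return (n_pulses - 1) * 1000.0 / freq_hz


short_train = train_duration_ms(8, 400)    # 7 intervals of 2.5 ms
long_train = train_duration_ms(81, 100)    # 80 intervals of 10 ms
print(short_train, long_train)
```

The two trains thus differ more than tenfold in pulse number and roughly 46-fold in duration, yet elicited the same conditioned response.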
Second, if the interstimulus interval is changed after learning, these models predict that the response will move in time to the new location of the unconditional stimulus. This is not what occurs, however. Both at the behavioral2 and at the Purkinje cell1,5 level, the old response is extinguished and the new response at the new time is acquired separately.
Third, double-peaked responses can be observed both in behaving animals trained with alternating interstimulus intervals16 and in Purkinje cells re-trained to a new interstimulus interval.1,5 As noted by Steuber and Willshaw, it is difficult to account for this with a model based on adjustable concentration rise latencies.
Fourth, whereas models such as these predict that conditioning should occur in Purkinje cells with short interstimulus intervals (<100 ms), we have shown that it does not.17
As an alternative to earlier timing models, we would like to propose a selection mechanism. Let us imagine the existence of what we might name “timer units” (receptor subunits, proteins, channels…) that would provide receptors (or molecular structures that are activated by them) with distinct temporal activation profiles. The learning process would then select, among a finite number of such units, a combination that matches the temporal interval. These timer units are the effector components that generate a response at the right time.
The time tracking that starts with the onset of the conditional stimulus need not be a rise in the concentration of a given ion. Instead, we can envision a cascade of second messengers, a protein changing its conformation over time, or a series of molecular switches. The logic of the hypothesis does not require specifying any one of these, so let us call it the ‘recorder’ and let it, for the sake of argument, be a protein changing its conformation over time.
At the onset of the conditional stimulus the ‘recorder’ proteins start changing in a predictable way. We assume four possible conformational states: ‘-’, A, B and C. Suppose that for the first 100 ms they are all in the ‘-’ state, that between, say, 100 and 250 ms most are in the A state, between 200 and 350 ms most are in the B state, and between 300 and 400 ms most are in the C state. Whether ‘-’, A, B and C are in fact different conformational states of a protein, different molecules transiently present in a second messenger cascade, or some form of hitherto unknown molecular switch does not matter.
Assume further that the recorder proteins interact with the unconditional stimulus in different ways depending on when it arrives. Suppose that when they are still in the ‘-’ state there is no effect, but when they are in either the A, B or C states, different activation sites are available for the unconditional stimulus. Activating the recorder proteins in one of the states may then cause translation or activation of particular timer units. In this way the learning mechanism selects appropriate timer units (the effector components that generate a response at the right time).
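The hypothetical recorder can be written down as a simple lookup from the time since conditional stimulus onset to the states that are available when the unconditional stimulus arrives. The sketch below uses the illustrative time windows given above. Treating the window boundaries as inclusive, and the names used, are our assumptions for the sake of the example.

```python
# Illustrative windows (ms after conditional stimulus onset) in which most
# recorder proteins are in a given state; windows overlap by design.
RECORDER_WINDOWS_MS = {"A": (100, 250), "B": (200, 350), "C": (300, 400)}


def recorder_states(t_ms):
    """States available to the unconditional stimulus at time t_ms.

    An empty set corresponds to the '-' state, in which the unconditional
    stimulus has no effect. Boundaries are treated as inclusive (assumption).
    """
    return {state for state, (start, end) in RECORDER_WINDOWS_MS.items()
            if start <= t_ms <= end}


for t in (50, 150, 215, 400):
    print(t, sorted(recorder_states(t)) or "-")
```

Because the windows overlap, an unconditional stimulus arriving at an intermediate time such as 215 ms meets recorder proteins in two states and can therefore select a mixture of timer units.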
Note that one would not need many different states of the recorder proteins nor a large number of timer units they select from when activated in particular states, in order to learn many different temporal intervals. Recall that in the retina, a combination of only three types of cones is enough to represent the entire visible color spectrum.
Suppose that the timer units A*, B* and C* generate responses with maximum amplitudes at 150 ms, 250 ms and 350 ms respectively. Training with an interval of 150 ms might only lead to activating the recorder in the A state, which translates/activates the pool of timer units A*A*A*A* that in turn produces a response with a maximum amplitude at 150 ms. Training with an interval of 300 ms would lead to a pool of units B*B*B*B* with a maximum at 250 ms. Training with 215 ms might lead to a pool of A*A*B*B* with a maximum somewhere between 150 ms and 250 ms, say 200 ms and training with 400 ms would lead to C*C*C*C* with a maximum at 350 ms.
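How a pool of selected timer units might translate into a response peak can be sketched as follows. The sketch assumes, purely for illustration, that the peak of the combined response is the mean of the peak times of the units in the pool. The hypothesis itself does not specify how units combine, and the names are hypothetical.

```python
# Peak response times (ms) of the three hypothetical timer units.
TIMER_PEAK_MS = {"A*": 150, "B*": 250, "C*": 350}


def pool_peak_ms(pool):
    """Peak time of the combined response of a pool of timer units.

    Assumption: the combined peak is the mean of the individual peaks.
    """
    return sum(TIMER_PEAK_MS[unit] for unit in pool) / len(pool)


for pool in (["A*"] * 4, ["A*", "A*", "B*", "B*"], ["B*"] * 4, ["C*"] * 4):
    print("".join(pool), "->", pool_peak_ms(pool), "ms")
```

Under this averaging assumption the mixed pool A*A*B*B* peaks at 200 ms, so three unit types are enough to cover intermediate intervals.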
Such a mechanism could explain more of the experimental data, such as the ability of very short conditional stimuli to elicit full responses. After learning, once a glutamate trigger has started a timer unit, it runs its course with a particular delay. The onset and offset of the response are the same regardless of variations in the conditional stimulus parameters. Concentration rise models would by necessity be affected by further input after the initial trigger. That is, however, not automatically the case here. If the timer units work like a kitchen timer, they would not necessarily be re-started by further input.
There is also no need for the conditioned response to gradually move in time when a cell is re-trained to a new temporal interval. During initial training selection of timer units A*A*A*A* leads to a response latency of 150 ms. When the unconditional stimulus is moved to 400 ms the learning mechanism starts selecting C*C*C*C* instead. A sufficient number of timer units with delays of 200-300 ms are never selected so the response does not gradually move in time from 150 ms to 400 ms. Furthermore, there is no reason why a Purkinje cell could not harbor multiple responses at once. If it is alternately trained with interstimulus intervals of 150 ms and 400 ms, every other trial will result in the recorder selecting timer units A*A*A*A* and C*C*C*C* respectively. Eventually two responses will appear. If the unconditional stimulus arrives in <100 ms, the ‘recorder’ is in the ‘-’ state and no timer units are selected.
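These predictions can be made concrete with a toy tally of which timer units are selected over a series of training trials. The sketch reuses the illustrative recorder windows from the text and assumes that each trial simply adds the units matching the recorder state at the time of the unconditional stimulus. Extinction of old responses is not modeled, and all names are hypothetical.

```python
from collections import Counter

# Illustrative recorder windows (ms); boundaries inclusive by assumption.
WINDOWS_MS = {"A": (100, 250), "B": (200, 350), "C": (300, 400)}


def select_units(isi_ms):
    """Timer units selected when the unconditional stimulus arrives at isi_ms."""
    return [state + "*" for state, (start, end) in WINDOWS_MS.items()
            if start <= isi_ms <= end]


def train(isi_sequence_ms):
    """Tally of timer units selected over a sequence of training trials."""
    pool = Counter()
    for isi in isi_sequence_ms:
        pool.update(select_units(isi))
    return pool


alternating = train([150, 400] * 4)   # alternating 150 ms and 400 ms trials
too_short = train([50] * 8)           # interval below 100 ms
print(dict(alternating), dict(too_short))
```

In this tally, alternating training yields separate A* and C* pools with no B* units in between, corresponding to two distinct responses rather than one intermediate response, and an interval below 100 ms selects nothing.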
At this point, we cannot speculate further on the exact nature of the hypothetical timer units but we suggest that it could be worthwhile to try to identify them. However, given the surprising existence of a temporal memory, we expect the explanation to have more surprises in store for us.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Funding
This work was supported by grants from the Swedish Research Council to The Linnaeus Centre for Cognition, Communication and Learning at Lund University (349-2007-8695) and to G. Hesslow (09899) and from the Krapperup Foundation.
References
- 1. Johansson F., Jirenhed D. A., Rasmussen A., Zucca R., Hesslow G. Memory trace and timing mechanism localized to cerebellar Purkinje cells. Proc Natl Acad Sci U S A (2014); http://dx.doi.org/10.1073/pnas.1415371111; PMID:25267641
- 2. Kehoe E. J., Macrae M. in A Neuroscientist's Guide to Classical Conditioning (ed J. W. Moore) 171-231 (New York: Springer-Verlag, 2002).
- 3. Hesslow G., Ivarsson M. Suppression of cerebellar Purkinje cells during conditioned responses in ferrets. Neuroreport 5, 649-652 (1994); PMID:8025262
- 4. Jirenhed D. A., Bengtsson F., Hesslow G. Acquisition, extinction, and reacquisition of a cerebellar cortical memory trace. J Neurosci 27, 2493-2502 (2007); PMID:17344387
- 5. Jirenhed D. A., Hesslow G. Learning Stimulus Intervals – Adaptive Timing of Conditioned Purkinje Cell Responses. Cerebellum 10, 523-535 (2011); PMID:21416378
- 6. Hesslow G. Inhibition of classically conditioned eyeblink responses by stimulation of the cerebellar cortex in the decerebrate cat. J Physiol 476, 245-256 (1994); PMID:8046641
- 7. Hesslow G. Correspondence between climbing fibre input and motor output in eyeblink-related areas in cat cerebellar cortex. J Physiol 476, 229-244 (1994); PMID:8046640
- 8. Heiney S. A., Kim J., Augustine G. J., Medina J. F. Precise control of movement kinematics by optogenetic inhibition of Purkinje cell activity. J Neurosci 34, 2321-2330 (2014); http://dx.doi.org/10.1523/JNEUROSCI.4547-13.2014; PMID:24501371
- 9. Hesslow G., Yeo C. H. in A Neuroscientist's Guide to Classical Conditioning (ed J. W. Moore) 86-146 (New York: Springer-Verlag, 2002).
- 10. Mauk M., Buonomano D. The neural basis of temporal processing. Annu Rev Neurosci 27, 307-40 (2004).
- 11. Yamazaki T., Tanaka S. Computational models of timing mechanisms in the cerebellar granular layer. Cerebellum 8, 423-32 (2009); PMID:19495900
- 12. Hesslow G., Jirenhed D. A., Rasmussen A., Johansson F. Classical conditioning of motor responses: what is the learning mechanism? Neural Netw 47, 81-7 (2013); PMID:23597758
- 13. Steuber V., Willshaw D. A Biophysical Model of Synaptic Delay Learning and Temporal Pattern Recognition in a Cerebellar Purkinje Cell. J Comput Neurosci 17, 149-64 (2004); PMID:15306737
- 14. Svensson P., Ivarsson M. Short-lasting conditioned stimulus applied to the middle cerebellar peduncle elicits delayed conditioned eye blink responses in the decerebrate ferret. Eur J Neurosci 11, 4333-40 (1999); PMID:10594659
- 15. Jirenhed D. A., Hesslow G. Time Course of Classically Conditioned Purkinje Cell Response is Determined by Initial Part of Conditioned Stimulus. J Neurosci 31, 9070-4 (2011); PMID:21697357
- 16. Millenson J. R., Kehoe E. J., Gormezano I. Classical conditioning of the rabbit's nictitating membrane response under fixed and mixed CS-US intervals. Learn Motiv 8, 351-66 (1977).
- 17. Wetmore D. Z., Jirenhed D. A., Rasmussen A., Johansson F., Schnitzer M. J., Hesslow G. Bidirectional plasticity of Purkinje cells matches temporal features of learning. J Neurosci 34, 1731-7 (2014); PMID:24478355
