Author manuscript; available in PMC: 2019 Jul 8.
Published in final edited form as: Annu Rev Neurosci. 2018 Jul 8;41:233–253. doi: 10.1146/annurev-neuro-080317-061948

Computational Principles of Supervised Learning in the Cerebellum

Jennifer L Raymond 1, Javier F Medina 2
PMCID: PMC6056176  NIHMSID: NIHMS981414  PMID: 29986160

Abstract

Supervised learning plays a key role in the operation of many biological and artificial neural networks. Analysis of the computations underlying supervised learning is facilitated by the relatively simple and uniform architecture of the cerebellum, a brain area that supports numerous motor, sensory, and cognitive functions. We highlight recent discoveries indicating that the cerebellum implements supervised learning using the following organizational principles: (a) extensive preprocessing of input representations (i.e., feature engineering), (b) massively recurrent circuit architecture, (c) linear input-output computations, (d) sophisticated instructive signals that can be regulated and are predictive, (e) adaptive mechanisms of plasticity with multiple timescales, and (f) task-specific hardware specializations. The principles emerging from studies of the cerebellum have striking parallels with those in other brain areas and in artificial neural networks, as well as some notable differences, which can inform future research on supervised learning and inspire next-generation machine-based algorithms.

Keywords: machine learning, decorrelation, consolidation, plasticity, climbing fiber, Purkinje cell

1. INTRODUCTION

Many of the computations that machines, animals, and humans must perform to effectively interact with the world can be implemented through supervised learning. The central feature of supervised learning is that feedback about a system’s performance is used to adjust internal parameters and thereby improve future performance. This is done through an iterative process whereby the system’s response to a given input is evaluated against a desired outcome, and deviations from the desired outcome (i.e., errors) are then used to adjust adaptive elements within the system. Engineers have used supervised learning to train adaptive filters for applications such as system identification and noise cancellation (Jenkins et al. 1996); mathematicians and computer scientists have demonstrated its power for solving a variety of classification and regression problems, from image and speech recognition to forecasting stock prices or energy consumption (Alpaydin 2014, Hagan et al. 2014). Supervised learning is also thought to play a key role in the development and maintenance of many brain functions (Doya 2000, Knudsen 1994), although the organizational principles governing the underlying neural computations have been elusive (Marblestone et al. 2016). In this review, we summarize recent discoveries about how supervised learning is implemented in the cerebellum, a neural system that is tightly interconnected with multiple brain areas and plays a key role in the adaptive control of motor, sensory, and cognitive functions (Sokolov et al. 2017).

The essential architecture for supervised learning comprises three core elements (Figure 1) in both artificial and biological neural networks: (a) The first element is an adaptive processor. To learn how to generate an appropriate response from each of its inputs, the network that performs the input-output transformation must have internal parameters that can be adaptively tuned. This tuning is achieved by adjusting connection weights in artificial neural networks (Alpaydin 2014, Hagan et al. 2014, Jenkins et al. 1996) or by adjusting the strength of synapses or other properties of neurons in the brain (i.e., neural plasticity) (Titley et al. 2017). (b) The second element is input preprocessing. Before inputs are fed into the adaptive processor of a supervised learning system, it is often necessary to transform the raw input. In artificial neural networks, the process of finding a suitable representation of the input data is called feature engineering and is a critical step that often determines whether the algorithm will succeed or fail (Bengio et al. 2013, LeCun et al. 2015). (c) Instructive signals compose the third element. The main distinction between supervised and unsupervised learning is that the supervised network receives feedback about its performance and uses these instructive signals to adjust the internal parameters of the adaptive processor. The instructive signals are computed by comparing the response of the network with the desired response; in other words, they signal error.
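The three elements can be made concrete with a minimal sketch. Everything specific here is an assumption chosen for illustration rather than something taken from the studies cited above: the polynomial features in `preprocess`, the toy target function, the learning rate, and the use of the classic delta (least-mean-squares) rule as the update.

```python
def preprocess(x):
    """Input preprocessing: expand a raw scalar input into fixed features (assumed basis)."""
    return [1.0, x, x * x]


def respond(weights, features):
    """Adaptive processor: a weighted sum of the preprocessed features."""
    return sum(w * f for w, f in zip(weights, features))


def train(samples, learning_rate=0.1, epochs=2000):
    """Iteratively adjust the weights, using the error as the instructive signal."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x, desired in samples:
            features = preprocess(x)
            error = desired - respond(weights, features)  # instructive signal
            weights = [w + learning_rate * error * f
                       for w, f in zip(weights, features)]
    return weights


# Toy task (assumed): the desired response is 2*x**2 - 1 for inputs in [-1, 1].
samples = [(k / 4.0, 2.0 * (k / 4.0) ** 2 - 1.0) for k in range(-4, 5)]
weights = train(samples)
```

In this toy case the weights approach -1, 0, and 2, because the desired response lies exactly within the span of the chosen features. With a poorly chosen `preprocess`, the same update rule would converge to a worse approximation, which is one way to see why the preprocessing stage matters.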

Figure 1.

Core architecture of supervised learning networks. The essential computational frameworks for implementing supervised learning are shown for (a) an artificial neural network and (b) the cerebellar network. A preprocessing stage (cyan) transforms the input signals to create representations that are a suitable substrate for supervised learning. Those representations are then sent to an adaptive processor (green) that generates responses. Errors are computed by comparing the actual response of the network with the desired response (orange) and are used as instructive signals to adjust the internal parameters (e.g., synaptic weights) of the adaptive processor until it learns to generate the desired response.

It has long been recognized that the cerebellum possesses these three core elements of supervised learning systems (Albus 1971, Marr 1969) (Figure 1). Input signals arrive in the cerebellum via mossy fibers and are preprocessed by local circuits in the granule cell layer of the cerebellar cortex. These signals are then sent to Purkinje cells, which are the sole output of the cerebellar cortex. The Purkinje cells are a key part of the adaptive processor that generates responses and distributes them, via the cerebellar nuclei, to numerous brain areas, including sensory, decision-making, and motor centers (Sokolov et al. 2017). The inferior olive nuclei in the brainstem compute instructive signals and convey them to the cerebellum via the climbing fibers. Importantly, Purkinje cells form closed loops with neurons in the cerebellar nuclei and the specific set of cells in the inferior olive that generate a particular instructive signal (Apps & Hawkes 2009, Chaumont et al. 2013) (Figure 1b).

In the following sections, we review recent research that has uncovered key elaborations of the basic functional architecture for supervised learning in the cerebellum. The general principles emerging from this work can inform studies of supervised learning in other areas of the brain and inspire next-generation machine-based algorithms.

2. PREPROCESSING OF INPUTS

The input layer of the cerebellum performs feature engineering by converting the mossy fiber signals it receives into granule cell representations that are suitable for supervised learning. More than half of the neurons in the mammalian brain are cerebellar granule cells; thus, the processing of mossy fiber inputs by the granule cells represents a dramatic expansion of coding space, which could support feature engineering functions such as pattern separation and the generation of temporal basis sets.

2.1. Pattern Separation

Early theories suggested that the highly divergent architecture of the granule cell layer is ideally suited to perform pattern separation (Albus 1971, Marr 1969): Similar input representations, encoded by overlapping groups of mossy fibers, would be transformed into linearly separable representations by activation of sparse and nonoverlapping groups of granule cells. Support for this idea comes from the demonstration that different mossy fiber streams converge onto individual granule cells (Chabrol et al. 2015, Huang et al. 2013, Ishikawa et al. 2015) (multimodal granule cell in Figure 2; see also Sawtell 2010), a process that would favor pattern separation through combinatorial sampling of the mossy fiber signals (Albus 1971, Marr 1969). However, in some parts of the cerebellar cortex, granule cells are driven by mossy fibers carrying similar information (Bengtsson & Jorntell 2009) (unimodal granule cell in Figure 2). Also, representations may not be as sparse as originally envisioned, because widespread activation of many granule cells can occur during natural behaviors and in response to sensory stimulation (Giovannucci et al. 2017, Jorntell & Ekerot 2006, Knogler et al. 2017, Ozden et al. 2012, van Beugen et al. 2013, Wagner et al. 2017). Lower levels of sparseness could improve the capacity of the network to generalize what it learns (Spanne & Jorntell 2015) without compromising pattern separation; the number of mossy fiber inputs per granule cell is low, which should support decorrelation of spatially overlapping mossy fiber patterns and the generation of high-dimensional representations in granule cells (Billings et al. 2014, Cayco-Gajic et al. 2017, Litwin-Kumar et al. 2017).
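A toy illustration of combinatorial sampling may help. This is a sketch under stated assumptions, not a model of real anatomy: there are 12 mossy fibers, one hypothetical granule cell for every distinct set of 4 fibers, and a granule cell fires only when all of its inputs are active. The value 4 and the all-inputs threshold are assumptions; the text above states only that the number of inputs per granule cell is low.

```python
from itertools import combinations

N_MOSSY = 12             # assumed toy size
INPUTS_PER_GRANULE = 4   # assumed; the text only says this number is low

# One hypothetical granule cell per distinct combination of mossy fibers.
GRANULE_INPUTS = list(combinations(range(N_MOSSY), INPUTS_PER_GRANULE))


def granule_response(active_mossy, threshold=INPUTS_PER_GRANULE):
    """Indices of granule cells whose number of active inputs reaches the threshold."""
    active = set(active_mossy)
    return {i for i, inputs in enumerate(GRANULE_INPUTS)
            if sum(m in active for m in inputs) >= threshold}


def overlap(a, b):
    """Shared active units divided by all active units (Jaccard index)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


pattern_a = {0, 1, 2, 3, 4, 5}
pattern_b = {2, 3, 4, 5, 6, 7}   # shares 4 of its 6 active fibers with pattern_a

mossy_overlap = overlap(pattern_a, pattern_b)
granule_overlap = overlap(granule_response(pattern_a), granule_response(pattern_b))
```

Under these assumptions the two mossy fiber patterns have an overlap of 0.5, whereas their granule cell representations share only 1 of 29 active cells. Lowering `threshold` makes the granule cell code denser, which is one way to explore the trade-off between sparseness and generalization discussed above.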

Figure 2.

Circuit diagram illustrating how supervised learning is implemented in the cerebellar network. The numbers 2–7 correspond to the computational principles described in Sections 2–7 of the main text and indicate the location(s) where each principle is implemented in the cerebellar network. Feedforward paths (black lines) and feedback paths (red lines) are indicated. Each mossy fiber pattern represents an information stream of a specific modality. Examples of a multimodal granule cell (filled with four patterns) and a unimodal granule cell (filled with a one-dot pattern) are shown. Abbreviations: CN, cerebellar nucleus; GrC, granule cell; GoC, Golgi cell; IO, inferior olive; MLI, molecular layer interneuron; PkC, Purkinje cell.

2.2. Dynamic Representations

Learning what response to make to a given input is useful but often insufficient. In many cases, learning when to make the correct response is just as important, because the desired response (Figure 1) often varies as a function of time. Artificial neural networks often solve this problem with a preprocessing step that transforms time-invariant inputs into dynamic representations that enable the adaptive system to approximate the desired response over its entire time course (Jenkins et al. 1996). Artificial network models based on the architecture of the cerebellar granule cell layer can generate temporal basis sets (i.e., time-varying patterns of granule cell activations in response to temporally impoverished mossy fiber inputs) (for reviews, see D’Angelo & De Zeeuw 2009, Medina & Mauk 2000). Such temporal basis sets in the granule cells of cerebellum-like circuits of electric fish have been described (Kennedy et al. 2014); however, technical challenges have impeded empirical studies of temporal patterning in the mammalian cerebellum. Moreover, recent work suggests that molecular mechanisms intrinsic to the Purkinje cells may learn to time a response (Johansson et al. 2014, 2015). Thus, the extent to which the dynamics of the granule cell layer contribute to the cerebellum’s ability to support precise learned timing remains unresolved.
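The idea of a temporal basis set can be sketched as follows. The sketch is hypothetical throughout: ten granule-like units with Gaussian activity profiles peaking at different times after input onset, a linear readout trained with the delta rule, and a desired response that peaks 30 time steps after onset. None of these choices is taken from the cited models, and the target was deliberately chosen to lie within the span of the basis so that learning converges cleanly.

```python
import math

N_STEPS = 60                      # time steps in one trial (assumed)
CENTERS = list(range(5, 55, 5))   # peak times of 10 hypothetical granule cells
WIDTH = 3.0                       # temporal width of each activation (assumed)


def bump(t, center, width=WIDTH):
    """Gaussian activity profile centered on a given time step."""
    return math.exp(-0.5 * ((t - center) / width) ** 2)


# Temporal basis set: BASIS[t][i] is the activity of unit i at time step t,
# even though the input itself carries no timing information beyond its onset.
BASIS = [[bump(t, c) for c in CENTERS] for t in range(N_STEPS)]


def readout(weights, t):
    """Linear readout of the basis at time step t."""
    return sum(w * g for w, g in zip(weights, BASIS[t]))


def train(desired, learning_rate=0.1, n_trials=1000):
    """Delta-rule learning of the readout weights, one update per time step."""
    weights = [0.0] * len(CENTERS)
    for _ in range(n_trials):
        for t in range(N_STEPS):
            error = desired[t] - readout(weights, t)
            weights = [w + learning_rate * error * g
                       for w, g in zip(weights, BASIS[t])]
    return weights


desired = [bump(t, 30) for t in range(N_STEPS)]   # a response timed to step 30
weights = train(desired)
response = [readout(weights, t) for t in range(N_STEPS)]
peak_time = max(range(N_STEPS), key=lambda t: response[t])
```

Because each basis unit is active in its own time window, adjusting a single set of weights is enough to place the learned response at the required time. If the timing were instead learned by mechanisms intrinsic to the Purkinje cell, as the work cited above suggests is possible, no such basis would be needed.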

2.3. Plasticity of Representations

Artificial deep networks revolutionized image recognition and related classification problems by allowing the network to learn the representations of inputs suitable for supervised learning, thereby automating the feature engineering function and making it highly adaptive (Bengio et al. 2006, Coates et al. 2011). Likewise, synaptic plasticity in the granule cell layer of the cerebellar cortex (D’Angelo & De Zeeuw 2009, Sgritta et al. 2017) could enable it to implement adaptive feature learning during preprocessing of mossy fiber inputs.

3. MASSIVELY RECURRENT ARCHITECTURE

Although the cerebellar circuit was historically characterized as a traditional, primarily feedforward perceptron network (Albus 1971, Marr 1969) (Figure 1b), recent work has uncovered ubiquitous, recurrent connections (Figure 2). In machine learning, recurrent neural networks, which incorporate feedback paths between the different layers of the network, have been particularly effective for tasks that require processing of sequences, such as natural language processing and time-series prediction (Medsker & Jain 1999). The highly recurrent architecture in the cerebellum may support similar functions, including language processing and the coordination of sequential movements (Hikosaka et al. 1999, Leggio & Molinari 2015, Penhune & Steele 2012, Sokolov et al. 2017).

3.1. Local Feedback Within the Cerebellar Cortex

It has long been recognized that negative feedback loops between excitatory granule cells and inhibitory Golgi cells (GoCs) could play a role in pattern separation and the generation of dynamic representations in the granule cell layer (Albus 1971, Billings et al. 2014, D’Angelo & De Zeeuw 2009, Litwin-Kumar et al. 2017, Marr 1969, Medina & Mauk 2000) (see Section 2). Moreover, feedback is now known to influence computation at multiple additional levels of the local circuit in the cerebellar cortex (Figure 2). First, there are reciprocal connections between groups of excitatory unipolar brush cells (UBCs) (Dino et al. 2000), between groups of inhibitory GoCs (Hull & Regehr 2012), and between groups of inhibitory molecular layer interneurons (MLIs) (Rieubland et al. 2014). This type of recurrent architecture can act as a temporal integrator to prolong the response of the system to transient inputs and build a short-term memory (Maex & Gutkin 2017, van Dorp & De Zeeuw 2015). Second, a positive feedback loop between the axons of granule cells and MLIs (Astorga et al. 2015) forms a recurrent local circuit that is predicted to sharpen the spatial contrast of the input signals delivered to Purkinje cells. Third, Purkinje cells are reciprocally connected to each other in the adult brain (Witter et al. 2016) and also send inhibitory feedback to granule cells (Guo et al. 2016) and MLIs (Witter et al. 2016).
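How reciprocal excitation can prolong a response is easy to see in a single-unit sketch. The rate equation, the time constant, and the feedback strengths below are illustrative assumptions, not measurements from UBC, GoC, or MLI networks.

```python
def simulate(recurrent_weight, n_steps=400, dt=1.0, tau=10.0, pulse_steps=20):
    """Rate unit with self-excitation: tau * dx/dt = -x + w * x + input (Euler steps)."""
    x = 0.0
    trace = []
    for step in range(n_steps):
        drive = 1.0 if step < pulse_steps else 0.0   # transient input
        x += (dt / tau) * (-x + recurrent_weight * x + drive)
        trace.append(x)
    return trace


def decay_time(trace, pulse_steps=20):
    """Steps after input offset until activity falls below half its value at offset."""
    level = trace[pulse_steps - 1]
    for step in range(pulse_steps, len(trace)):
        if trace[step] < 0.5 * level:
            return step - pulse_steps + 1
    return None


without_feedback = decay_time(simulate(recurrent_weight=0.0))
with_feedback = decay_time(simulate(recurrent_weight=0.9))
```

In this sketch the effective time constant is `tau / (1 - recurrent_weight)`, so a feedback weight of 0.9 prolongs the response to a transient input roughly tenfold. As the weight approaches 1 the unit approaches a perfect integrator, which also makes it increasingly sensitive to small errors in the weight.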

3.2. Long-Range Feedback from Cerebellar Nuclei to the Cerebellar Cortex

We have known for over 40 years that the final output of the cerebellum is sent from the cerebellar nuclei back to the cerebellar cortex (for a review, see Houck & Person 2014), but only recently have some of the details critical for understanding the function of this nucleocortical feedback connection come to light. In many (but not all) regions of the cerebellum (Trott et al. 1998a,b), nucleocortical feedback is part of a closed loop among granule cells (input layer), Purkinje cells (sole output of the cerebellar cortex), and neurons in the cerebellar nuclei (output to other brain areas). This recurrent path can be configured as either a positive or a negative feedback loop, with excitatory neurons in the cerebellar nuclei targeting both excitatory and inhibitory cells in the granule cell layer (Houck & Person 2015) and inhibitory neurons in the cerebellar nuclei targeting inhibitory cells selectively (Ankri et al. 2015). These nucleocortical feedback loops could play several computational roles in supervised learning by (a) enabling the cerebellum to work as a dynamical system to learn sequences of responses (Brandi et al. 2013) and to perform state estimation and time-series prediction using a forward model configuration (Miall et al. 1993, Wolpert et al. 1998), (b) stabilizing learning even in cases where only indirect information about the correct response is available for computing the error signal (Porrill & Dean 2007, Porrill et al. 2004), (c) providing a mechanism for regulating input signals and amplifying or dampening the gain of the system based on the current response (Gao et al. 2016, Lisberger & Sejnowski 1992), and (d) implementing an architecture for reservoir computing capable of generating dynamically rich input signals in the granule cell layer of the cerebellar cortex (Rossert et al. 2015).

4. LINEAR COMPUTING

The goal of supervised learning is to find a mapping function: a transformation of the input signals that generates the desired correct response (Figure 1). Although some machine learning applications have accomplished this with networks of linear units (i.e., units that produce an output that is a weighted sum of their inputs) (Hagan et al. 2014, Haykin 2013), units with nonlinear activation functions are more frequently used. In highly nonlinear networks, it is often extremely difficult to find any obvious relationships between the output of the individual units and any of the features of the input stream or the final response of the system. In contrast, the firing rate of many cerebellar neurons is a linear function of task-related parameters such as the direction of a moving stimulus or the kinematic properties of an action being performed (for reviews, see Ebner et al. 2011, Medina 2011). This linear coding of task-related parameters has been found at all levels of the cerebellar circuit, including mossy fibers (Laurens et al. 2013, Lisberger & Fuchs 1978), granule cells (Arenz et al. 2009, Chadderton et al. 2014, Chen et al. 2017, Powell et al. 2015), MLIs (Chen et al. 2017, Gaffield & Christie 2017, Jelitai et al. 2016, ten Brinke et al. 2015), Purkinje cells (Chen et al. 2016; Dugue et al. 2017; Herzfeld et al. 2015; Hong et al. 2016; Medina 2011; Medina & Lisberger 2007, 2009; Shidara et al. 1993; Sun et al. 2017), and the cerebellar nuclei (Ebner et al. 2011, Heiney et al. 2014, Kleine et al. 2003, ten Brinke et al. 2017). Recent studies have begun to shed light on the mechanisms underlying these linear transformations in the cerebellum (Park et al. 2012; Person & Raman 2011; Steuber & Jaeger 2013; Turecek et al. 2016, 2017; Walter & Khodakhah 2006, 2009). Although cerebellar neurons and synapses have nonlinear intrinsic properties (Hoebeek et al. 2010, Ishikawa et al. 2015, Jahnsen 1986, Loewenstein et al. 2005, Person & Raman 2011), these properties are disengaged or compensated in ways that keep the cerebellar network operating in a linear regime (Alvina et al. 2008, Rossert et al. 2014, Schonewille et al. 2006, Turecek et al. 2017, van Vreeswijk & Sompolinsky 1996). This linear processing would not preclude supervised learning of nonlinear mapping functions by the network if the granule cell layer provided an appropriate basis set (see Section 2.2), because it is possible to approximate any nonlinear function with a linear combination of basis functions (Dean & Porrill 2011, Pouget & Snyder 2000, Spanne & Jorntell 2013).

5. SMART INSTRUCTIVE SIGNALS

Errors play a critical role during supervised learning; they are the instructive signals that guide appropriate adjustments in the adaptive processor (Figure 1). In many machine learning applications, instructive signals are provided that precisely specify the desired network output for a set of training inputs (e.g., “Image 1 is a Welsh terrier, not an Airedale terrier.”). In contrast, the error signals available to guide supervised learning in the cerebellum and other biological networks are often more indirect and delayed (e.g., visual feedback about where the tennis ball landed rather than a specification of what the commands to each muscle should have been). Solutions to this distal teacher problem (Jordan & Rumelhart 1992, Miall et al. 1993) have been reviewed elsewhere (Ito 2013, Porrill et al. 2013). In this section, we focus on recent experimental work showing that instructive signals in the cerebellum are much richer than previously thought.

5.1. Modulation of Instructive Signals

Neurons of the inferior olive, which send their climbing fiber axons to Purkinje cells (Figure 2), are a key source of instructive signals for supervised learning in the cerebellum (De Zeeuw et al. 1998, Ito 2013, Simpson et al. 1996). Although early experiments suggested that climbing fibers signal errors by generating an all-or-none complex spike response in Purkinje cells (Eccles et al. 1966), subsequent work has revealed that the potency of the climbing fiber signal can be exquisitely regulated by several factors (Najafi & Medina 2013). These factors include (a) intrinsic network mechanisms that can influence the state of Purkinje cells (Callaway et al. 1995, Wang et al. 2000) and of the neurons in the inferior olive (Bazzigaluppi et al. 2012, De Gruijl et al. 2012, Lefler et al. 2014, Mathy et al. 2009, Tokuda et al. 2013); (b) signals from neuromodulatory centers in the brain (Carey & Regehr 2009); and (c) environmental parameters related to the current behavioral state (Apps 2000, Lawrenson et al. 2016), the training context (Kimpo et al. 2014), the salience of error-related cues (Najafi et al. 2014a,b), or the prior history of errors (Maruta et al. 2007, Yang & Lisberger 2017). Because climbing fibers trigger the induction of plasticity at multiple sites in the cerebellar cortex (for a review, see Gao et al. 2012), the modulation of their efficacy has important functional implications for determining how much the cerebellum learns from a given error signal (Carey & Regehr 2009; Mathy et al. 2009; Yang & Lisberger 2014a, 2017). Thus, mechanisms exist for regulating the potency of instructive signals in the cerebellum, and these mechanisms could be used to adjust the rate of learning on the basis of factors such as the relevance of the error and the degree of certainty that the current response of the network causally contributed to the error (Wei & Kording 2009) (for more on rate of learning, including its adaptive modulation, see Section 6).

5.2. Temporal-Difference Instructive Signals

In applications where the desired response varies as a function of time, a time-varying error signal can be computed by comparing the response of the network to the desired response at each time point. Neurons in the inferior olive are well positioned to implement this continuous error computation by adding excitatory inputs related to the desired response and inhibitory inputs related to the actual response generated by the cerebellar nuclei at each point in time (De Zeeuw et al. 1998, Ito 2013, Simpson et al. 1996) (Figure 1b). However, recent work indicates that the way the inferior olive computes dynamic error signals may be more consistent with the method of temporal differences (Ohmae & Medina 2015), whereby an error signal is generated whenever there is a difference between the response of the network at the current time and the prediction of the desired response of the network in the next time step (Sutton 1988). Neural signals resembling temporal-difference errors have been found in other parts of the brain known to play a critical role during reinforcement learning tasks, including the dopaminergic neurons of the midbrain (O’Doherty et al. 2003, Schultz 1998, Seymour et al. 2004), and are also used in machine learning applications to improve performance (Schultz et al. 1997).
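The method of temporal differences can be illustrated with a tabular sketch in the spirit of Sutton (1988). The trial length, learning rate, and outcome value are arbitrary assumptions, and the code is not intended as a model of the inferior olive. It keeps one prediction per time step and defines each error as the difference between the next prediction and the current one, with the actual outcome standing in for the prediction after the last step.

```python
def td_errors(predictions, outcome):
    """Temporal-difference errors for one trial: next prediction minus current one.

    The actual outcome plays the role of the prediction after the final step.
    """
    targets = predictions[1:] + [outcome]
    return [nxt - cur for cur, nxt in zip(predictions, targets)]


def run_trials(n_steps=5, outcome=1.0, learning_rate=0.5, n_trials=200):
    """Repeat the same trial, nudging each prediction by its own TD error."""
    predictions = [0.0] * n_steps
    history = []
    for _ in range(n_trials):
        deltas = td_errors(predictions, outcome)
        history.append(deltas)
        predictions = [p + learning_rate * d
                       for p, d in zip(predictions, deltas)]
    return predictions, history


predictions, history = run_trials()
```

On the first trial the only nonzero error occurs at the last time step, when the outcome arrives. On later trials the error appears at progressively earlier steps and shrinks at later ones, until every prediction matches the outcome and no error remains. This backward migration of the error signal is the signature that distinguishes a temporal-difference error from a pointwise comparison of actual and desired responses.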

5.3. Purkinje Cells as a Source of Instructive Signals

Climbing fibers are not the only neural instructive signal available in the cerebellum to guide learning; the output of Purkinje cells can also carry information about the learning that is required (Miles & Lisberger 1981, Popa et al. 2016, Raymond & Lisberger 1998). Cerebellar learning performance is better predicted by the instructive signals carried by both the climbing fibers and the Purkinje cell output during training than by the climbing fibers alone, and learning can occur under training conditions where the climbing fibers do not carry instructive signals (Ke et al. 2009, Raymond & Lisberger 1998). Moreover, direct stimulation of either the climbing fibers or the Purkinje cells can be used to induce learning in the absence of the normal sensory feedback about errors (Lee et al. 2015, Nguyen-Vu et al. 2013). Purkinje cells fire spontaneously at high rates and can thus carry information and influence the responses of the network by decreasing or increasing their firing. Pauses in Purkinje cell firing, by disinhibiting their targets in the cerebellar nuclei (Heiney et al. 2014), may be particularly effective at driving learning mechanisms at those sites (Lee et al. 2015, McElvain et al. 2010, Medina & Mauk 1999, Person & Raman 2010, Zheng & Raman 2010). The use of multiple neural instructive signals may enhance the dynamic range of the network by allowing learning under a broader range of conditions. The relative contribution of climbing fiber-triggered versus Purkinje cell-triggered plasticity to cerebellar learning may vary with the training conditions and over time (as discussed further in Section 6).

6. MULTIPLE TIMESCALES OF PLASTICITY

In machine learning, the choice of the learning rate for adjusting network parameters is critical—if changes are too big, the network may oscillate or otherwise fail to converge, but if they are too small, learning will take too long. A solution is to adjust the learning rate during training, increasing the rate to make bigger changes when the network is far from a solution or when the statistics of the inputs change and decreasing the learning rate under more stable conditions (Schaul et al. 2012). In the cerebellum, one mechanism for adjusting the learning rate is to modulate the potency of the instructive signals carried by climbing fibers (as described in Section 5.1). A second mechanism, observed in the cerebellum and in other brain regions, is to implement learning with distinct forms of neural plasticity that operate over different timescales (Dudai et al. 2015, Kukushkin & Carew 2017, Squire et al. 2015). This may allow the brain to both adapt and forget quickly if errors are transitory (e.g., in response to muscle fatigue) and to retain what it has learned if errors persist for long periods (e.g., if an increase in body weight requires long-lasting changes in the motor commands) (Kording et al. 2007).
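The benefit of combining fast and slow plasticity can be sketched with two hypothetical processes that learn from the same error. The retention factors and learning rates below are invented for illustration and do not correspond to measured cerebellar parameters, and this simple two-process scheme is a stand-in rather than the model of any study cited above.

```python
def simulate(schedule, fast=(0.60, 0.30), slow=(0.995, 0.02)):
    """Two hypothetical plasticity processes, each updated on every trial as
    state <- retention * state + rate * error.

    schedule holds the required change on each trial, or None for trials
    without error feedback (both states then simply decay).
    """
    x_fast = x_slow = 0.0
    outputs = []
    for required in schedule:
        error = 0.0 if required is None else required - (x_fast + x_slow)
        x_fast = fast[0] * x_fast + fast[1] * error
        x_slow = slow[0] * x_slow + slow[1] * error
        outputs.append(x_fast + x_slow)
    return outputs


def retained(n_training_trials, n_test_trials=20):
    """Learned change that remains after a block of trials without feedback."""
    schedule = [1.0] * n_training_trials + [None] * n_test_trials
    return simulate(schedule)[-1]


after_brief_error = retained(5)          # error present for only a few trials
after_persistent_error = retained(300)   # error present for many trials
```

With these assumed parameters, a brief error is absorbed mostly by the fast process and is almost entirely forgotten after 20 trials without feedback, whereas a persistent error gradually recruits the slow process and most of the learned change is retained.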

6.1. Seconds: One-Shot, Volatile Cerebellar Plasticity

Cerebellum-dependent learning is incremental, requiring hundreds or thousands of training trials to approach asymptote. However, the error signals carried by the climbing fibers not only trigger small changes that accumulate across many trials but also have a more powerful yet transient influence that supports rapid adaptation to recent experience. Following a single climbing fiber response to an error on one trial, there is an approximately 5% change in the output of the Purkinje cell targets of this climbing fiber and in the behavioral response on the next trial (Medina & Lisberger 2008). These changes are roughly 100-fold larger than expected from the cumulative changes that accrue across multiple trials, but they decay quickly, in approximately 6 s (Khilkevich et al. 2016, Kimpo et al. 2014, Yang & Lisberger 2014b). One candidate synaptic mechanism underlying these rapid, transient changes in the neural and behavioral responses is short-term associative plasticity of the granule cell-Purkinje cell synapses (Brenowitz & Regehr 2005, Suvrathan et al. 2016).

6.2. Minutes-Hours: Incremental Changes in Purkinje Cell Output

Over minutes to hours of training with repeated trials, there is a learned change in the response of the Purkinje cells (for a review, see Medina 2011). Early models attributed this learning process to climbing fiber-triggered, long-term plasticity of the granule cell-Purkinje cell synapses (Ito 2000), which is known to last for hours once it has been induced. However, the extent to which this mechanism, rather than other plasticity mechanisms, contributes to the changes in Purkinje cell output is still unresolved after many decades (Mauk et al. 1998, Schonewille et al. 2011, Yamaguchi et al. 2016), most likely because the recruitment of plasticity at the granule cell-Purkinje cell synapses is highly sensitive to the parameters of training (Boyden et al. 2006, Nguyen-Vu et al. 2013). Notably, different Purkinje cells exhibit a correlate of learning at different times within a training session (Yang & Lisberger 2014b), with some individual cells exhibiting changes only transiently for a few dozen trials and then apparently handing off the memory trace to other Purkinje cells (Li et al. 2011). Thus, there appear to be multiple learning time constants and mechanisms of plasticity within the minutes-hours range.

6.3. Hours-Days: Systems Consolidation

The cerebellar cortex is essential for generating the learned response in the first several hours after training, but it is not essential 24 h later (Kassardjian et al. 2005, Shutoh et al. 2006). A structural correlate of the transient dependence on the cerebellar cortex is a decrease in α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) receptors at the granule cell-Purkinje cell synapses 1 h after training, which recovers by 24 h after training (Aziz et al. 2014, Wang et al. 2014). Thus, the memory trace seems to be transferred from the cerebellar cortex to a different site, where more durable mechanisms of plasticity support long-term retention of the learned response; the prime candidate for this site is the immediately downstream cerebellar nuclei, where molecular, electrophysiological, and structural correlates of learning have been described (Boele et al. 2013, Gomi et al. 1999, Kleim et al. 2002, Medina et al. 2000b, ten Brinke et al. 2017). This systems consolidation of cerebellar learning depends on neural activity in the cerebellar cortex during the initial hours after training (Cooke et al. 2004) and is reminiscent of findings from other brain areas, such as the hippocampus or the amygdala, where memory traces are gradually transferred to different sites in the brain for long-term storage (Dudai et al. 2015, Medina et al. 2002, Squire et al. 2015). Notably, although the role of the cerebellar cortex is reduced over time, it is not completely eliminated, as persistent changes in synapse number a day or more after training have been reported (Aziz et al. 2014, Black et al. 1990), and certain aspects of learning, such as the learned timing of a response and, possibly, the pattern of generalization of the learning, retain a long-term dependence on the cerebellar cortex (Boyden et al. 2004, Medina et al. 2000a).

6.4. Interaction Between Different Timescales of Plasticity

Slower, long-term plasticity can be observed in the absence of faster, transient plasticity in the cerebellum (Kimpo et al. 2014, Yang & Lisberger 2014b), suggesting that the mechanisms underlying these two types of plasticity may be at least partially independent of each other, as has been described for other neural circuits (Sutton & Carew 2000). However, there is also evidence that, during cerebellar learning tasks, short-term changes can facilitate the induction of long-term changes (Joiner & Smith 2008, Yang & Lisberger 2010). One candidate link between short-term and long-term plasticity is Purkinje cell activity, which serves the dual functions of (a) adaptively adjusting the response of the system by serving as a conduit for the expression of rapid, shorter-term plasticity in the cerebellar cortex while (b) providing instructive signals for inducing slower, more durable changes that integrate and store the learned response over longer timescales in sites downstream of the cerebellar cortex (Lee et al. 2015, Medina et al. 2000b, Miles & Lisberger 1981, Nguyen-Vu et al. 2013, Popa et al. 2016, Raymond & Lisberger 1998). Finally, the interaction between different timescales is bidirectional—not only does rapid, short-term plasticity influence the induction of long-term plasticity, but the longer-term statistics of experience can also scale up or down the amount of rapid learning that takes place during cerebellar tasks (Herzfeld et al. 2014, Yang & Lisberger 2017), similar to strategies used in machine learning (Schaul et al. 2012).

7. HARDWARE SPECIALIZATION

A striking feature of the cerebellum is its highly regular circuit architecture, which is often described as crystalline because it comprises a series of modules of similar anatomical structure (Figure 2). These modules form closed loops among groups of parasagittally aligned Purkinje cells, neurons in the cerebellar nuclei, and the inferior olive and are repeated throughout the cerebellum. The uniform circuit structure suggests that the different cerebellar modules perform the same, canonical computation on the inputs they receive from different sensory, motor, and cognitive brain areas. However, this interpretation has been called into question by the observation of molecular, physiological, and even anatomical variations across cerebellar modules (Cerminara et al. 2015, Chan-Palay 1977, Urbano et al. 2006).

More than 50 genes are differentially expressed in the parasagittal bands of the cerebellar cortex, giving it a striped appearance in sections stained for any of their markers (Cerminara et al. 2015, Hawkes 2014). The cerebellar Purkinje cells, for example, are a well-defined type of neuron on the basis of their distinct morphology and their role as the sole output neurons of the cerebellar cortex; however, on the basis of differences in gene expression profiling, they could be divided into multiple different subgroups (Cerminara et al. 2015). Heterogeneity is not restricted to the Purkinje cells; there are additional differences in gene expression, packing density, and even the size of the cell bodies of the granule cells and interneurons in different cerebellar modules (Cerminara et al. 2015). There are also physiological differences, which correlate with the molecular variations across modules. For example, Purkinje cells expressing the molecule zebrin II have a lower basal firing rate (Xiao et al. 2014; Zhou et al. 2014, 2015) and lower propensity for associative plasticity of their granule cell inputs, as well as a larger response to their climbing fiber input (Tang et al. 2017), than Purkinje cells that are zebrin II negative (Hawkes 2014, Wadiche & Jahr 2005). This heterogeneity is not simply biological noise but rather systematic variation that is highly reproducible across individuals and that obeys the anatomical boundaries between different cerebellar modules (Cerminara et al. 2015, Tsutsumi et al. 2015, Valera et al. 2016).

Recent findings suggest that the variations between cerebellar modules may simply reflect a fine-tuning of the supervised learning computation rather than fundamentally different computational principles. For example, different regions of the cerebellar cortex have different timing requirements for associative synaptic plasticity at the granule cell-Purkinje cell synapses (Suvrathan et al. 2016). This heterogeneity allows the cerebellum to solve the same temporal credit assignment problem (Minsky 1963, Sutton 1984) across modules that receive instructive signals with different feedback delays: Plasticity is induced selectively in those synapses that were active when an error was generated rather than when the feedback about that error reached the cerebellum. Similarly, cerebellar modules engaged during oculomotor learning have anatomical specializations that could support a functional requirement of those behavioral tasks, namely the generation of long time constants for temporal integration. This could be supported by two characteristic features of oculomotor-related modules: (a) a higher density of UBCs that generate sustained responses to input (Dino et al. 2000, Mugnaini & Floris 1994, Rossi et al. 1995) and (b) a higher density of recurrent loops via feedback projections from Purkinje cells to granule cells (Guo et al. 2016). Other specializations across cerebellar modules might compensate for different mean firing rates in the mossy fiber inputs to keep the input layer of the cerebellar cortex in an optimal dynamic range for coding (Cayco-Gajic et al. 2017, Litwin-Kumar et al. 2017) or to prevent saturation of plasticity (Nguyen-Vu et al. 2017).
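The logic of delay-matched plasticity can be made concrete with a small sketch. It assumes a discrete-time setting in which each input is either active or silent at each step, and a plasticity rule that changes only the weights of inputs that were active a fixed number of steps before the instructive signal arrived. The function name, the sign of the weight change, and the parameter values are illustrative assumptions rather than measured properties of any synapse.

```python
def delay_tuned_update(weights, activity, feedback_step, preferred_delay,
                       learning_rate=0.1):
    """Adjust the weights of inputs active `preferred_delay` steps before feedback.

    activity[t][i] is 1.0 if input i was active at time step t, else 0.0.
    The rule and its parameters are hypothetical, for illustration only.
    """
    eligible_step = feedback_step - preferred_delay
    if not 0 <= eligible_step < len(activity):
        return list(weights)  # no recorded activity at the eligible time
    return [w - learning_rate * a
            for w, a in zip(weights, activity[eligible_step])]


# Three inputs, each active at a single time step:
# input 0 at step 0, input 1 at step 4, input 2 at step 8.
n_steps, n_inputs = 12, 3
activity = [[1.0 if step == 4 * i else 0.0 for i in range(n_inputs)]
            for step in range(n_steps)]

# The error is generated at step 4, but feedback about it arrives 4 steps later.
error_step, feedback_delay = 4, 4
feedback_step = error_step + feedback_delay

# Rule tuned to the feedback delay: changes input 1, active when the error occurred.
matched = delay_tuned_update([1.0] * n_inputs, activity,
                             feedback_step, preferred_delay=feedback_delay)
# Rule with no delay tuning: changes input 2, active when the feedback arrived.
unmatched = delay_tuned_update([1.0] * n_inputs, activity,
                               feedback_step, preferred_delay=0)
```

In this toy setting, a module whose plasticity window matches its feedback delay assigns credit to the input that was active when the error was generated, whereas a rule with no delay tuning would modify the input that happened to be active when the feedback arrived.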

Thus, variations on the crystalline circuit architecture of the cerebellum may reflect, not fundamentally different computations, but specializations that allow different cerebellar modules to implement the same supervised learning computation on inputs with different statistics and in a way that meets numerous different task requirements. The different cerebellar modules support many functions, including fine motor control and navigation (Lefort et al. 2015), learned fear and other emotions (Perciavalle et al. 2013, Strata 2015), reward (Mittleman et al. 2008), and cognitive processing (Hoche et al. 2016, Koziol et al. 2014, Sokolov et al. 2017). The cerebellum contributes to these different functions through reciprocal connections with most areas of the cerebral cortex as well as subcortical structures (for a review, see Bostan et al. 2013). In the context of these broader networks, the cerebellum may act as a universal neural chip for implementing supervised learning computations, serving different purposes depending on the specific inputs it receives and the way that it is embedded within the larger networks of the brain (Figure 3).

Figure 3.

Network configurations for supervised learning applications. Depending on the way the adaptive processor is embedded in the broader network, supervised learning can be used for (a) system identification, where the adaptive processor learns to generate a response that mimics that of an unknown system; (b) inverse system identification, where the adaptive processor learns to generate a response that mimics the original input signal before it was transformed or corrupted by an unknown system; (c) noise cancellation, where the adaptive processor learns to generate a response that is a clean version of a signal that was embedded in unwanted noise (in this application, the error signal converges to the signal itself, rather than converging to zero); and (d) prediction, where the adaptive processor takes past values of the input signal to learn to generate a response that predicts what the future values of the input signal will be. These different network configurations have been leveraged extensively in engineering applications. The cerebellum could be embedded in larger brain networks in similar configurations to help support a variety of sensory, motor, and cognitive functions, including supervised learning of inverse models for generating suitable motor commands, as well as supervised learning of forward models for predicting the sensory consequences of our actions and for cancelling out neural noise generated by self-motion.
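Two of the configurations in Figure 3 can be sketched with a standard least-mean-squares (LMS) adaptive filter, a common choice of adaptive processor in engineering. This is a minimal illustration and not a model of cerebellar circuitry: the same learning rule is reused unchanged, and only the signals supplied as input and as desired response differ between configurations. The helper names, the coefficients of the "unknown system", the signal parameters, and the step size are all assumptions made for the example.

```python
import math
import random


def lms_filter(inputs, desired, n_taps, step_size):
    """Least-mean-squares adaptive filter over the most recent n_taps inputs."""
    weights = [0.0] * n_taps
    outputs, errors = [], []
    for t in range(len(inputs)):
        # Tapped delay line: newest sample first, zero-padded at the start.
        taps = [inputs[t - k] if t >= k else 0.0 for k in range(n_taps)]
        response = sum(w * x for w, x in zip(weights, taps))
        error = desired[t] - response
        # Each weight changes in proportion to the error and its own input.
        weights = [w + step_size * error * x for w, x in zip(weights, taps)]
        outputs.append(response)
        errors.append(error)
    return weights, outputs, errors


def fir(signal, coefficients):
    """Hypothetical 'unknown system': a fixed finite impulse response filter."""
    return [sum(c * signal[t - k] for k, c in enumerate(coefficients) if t >= k)
            for t in range(len(signal))]


# (a) System identification: the processor and the unknown system receive the
# same input, and the unknown system's response serves as the desired outcome.
rng = random.Random(0)
drive = [rng.uniform(-1.0, 1.0) for _ in range(3000)]
unknown_system = [0.5, -0.3, 0.2]
learned, _, id_errors = lms_filter(drive, fir(drive, unknown_system),
                                   n_taps=3, step_size=0.1)

# (d) Prediction: the processor receives past values of a signal, and the
# current value of that signal serves as the desired outcome.
wave = [math.sin(1.0 * t) for t in range(3000)]
past = [0.0] + wave[:-1]
_, _, prediction_errors = lms_filter(past, wave, n_taps=2, step_size=0.1)
```

With these assumed signals, `learned` approaches the coefficients of the unknown system and the late entries of `prediction_errors` approach zero, although nothing in `lms_filter` itself was changed between the two uses.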

8. CONCLUSIONS AND OPEN QUESTIONS

A clear picture is beginning to emerge of the fundamental principles that allow the cerebellum to implement a general algorithm for supervised learning. (a) Extensive preprocessing of signals at the input layer is used to perform pattern separation and, possibly, to generate time-varying representations. (b) The architecture of the network is highly recurrent. (c) Computations are performed with neurons that have linear input-output functions. (d) Multiple neural elements provide error signals; these error signals encode predictive information and their potency can be modulated to control the rate of learning. (e) Adaptive changes occur at several nodes of the network during learning and differ with regard to both their rate of induction and their persistence. (f) Although the architecture of the cerebellum is generally uniform, there is also task-related specialization of certain properties of the cerebellar hardware.

The principles of supervised learning discovered in the cerebellum have many parallels with those in other biological and artificial neural networks, as well as some key differences. An important principle shared with many other vertebrate and invertebrate biological neural networks is that learning is not a unitary process: multiple instructive signals guide the induction of plasticity at distributed sites in the network, and there are different changes over different timescales. This flexibility could allow learning to be highly specific yet have a broad dynamic range. Massive recurrence is also a nearly ubiquitous feature of neural networks and is now known to be more prominent in the cerebellum than previously recognized. Meanwhile, molecular and functional heterogeneity among neurons of a given type, which has long been recognized in the cerebellum, has also been discovered in other regions of the brain (Cembrowski et al. 2016, Giocomo & Hasselmo 2008, Soltesz 2005). A key difference between the cerebellum and other brain areas is the extraordinary amount of neural hardware devoted to input preprocessing in the cerebellum, which is roughly equal to the number of neurons in the rest of the brain combined. Yet the computational functions that have been attributed to the cerebellar preprocessing stage are similar to those that have been described for other brain areas: decorrelation, pattern separation, and the generation of temporal basis sets (Eichenbaum 2014, Finnerty et al. 2015, Pitkow & Meister 2012, Yassa & Stark 2011). Additional work is needed to identify the computational principles that arise uniquely from the specific architecture of the cerebellum versus those shared with other neural circuits. The relatively simple and uniform circuit architecture of the cerebellum, combined with powerful new tools for visualizing and manipulating neural activity, makes the cerebellum an ideal system for studying general computational principles in the brain and discovering new ways to apply those principles to improve the performance of machine learning algorithms.

There are striking parallels between supervised learning in the cerebellum and the strategies used for supervised learning in artificial neural networks, including the regulation of the learning rate by modulating the potency of the instructive signals that induce changes in the network. One of the most notable differences is that machine learning has not widely employed passive decay of learned changes in the network, whereas this is prominent in the cerebellum and other vertebrate and invertebrate biological networks (Kukushkin & Carew 2017). Future work will determine how much this passive decay is a limitation of the biology (i.e., a bug) or a feature that allows the network to effectively adapt to a dynamic world. Finally, findings from artificial neural networks provide insights about potential functions of supervised learning in the cerebellum (Figure 3). A definitive answer to the question of what the cerebellum learns will need to take into account the possibility that different cerebellar regions may be implementing the same universal supervised learning algorithm but be specialized for different functions depending on the system configuration, the connections to other parts of the brain, and the kind of information that is carried by the input and output signals.
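One simple way to express the idea that the potency of an instructive signal can be scaled by the statistics of past experience is a learner whose sensitivity to error grows when successive errors agree in sign and shrinks when they disagree. The sketch below is an illustrative sign-consistency rule, not the model used in any of the studies cited here; the function name, gain factors, and bounds are assumptions.

```python
def adapt_with_error_memory(perturbations, base_sensitivity=0.1,
                            gain_up=1.2, gain_down=0.7,
                            min_sensitivity=0.01, max_sensitivity=0.9):
    """Track a perturbation with an error sensitivity that depends on error history.

    The sign-consistency rule and all parameter values are hypothetical.
    Returns the final estimate and the sensitivity used on each trial.
    """
    estimate = 0.0
    sensitivity = base_sensitivity
    previous_error = 0.0
    sensitivities = []
    for perturbation in perturbations:
        error = perturbation - estimate
        if error * previous_error > 0:
            # Consistent errors: learn more from each error.
            sensitivity = min(max_sensitivity, sensitivity * gain_up)
        elif error * previous_error < 0:
            # Errors that flip sign: learn less from each error.
            sensitivity = max(min_sensitivity, sensitivity * gain_down)
        estimate += sensitivity * error
        previous_error = error
        sensitivities.append(sensitivity)
    return estimate, sensitivities


# A stable environment (same perturbation on every trial) versus a
# volatile one (the perturbation flips sign on every trial).
_, stable = adapt_with_error_memory([1.0] * 15)
_, volatile = adapt_with_error_memory([1.0, -1.0] * 10)
```

In this toy example, the sensitivity ends above its starting value in the stable environment and below it in the volatile one, so the amount learned from a given error depends on the history of errors as well as on the current error.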

ACKNOWLEDGMENTS

The authors would like to thank Jay Bhasin for input on a draft of this manuscript. The writing of this article was supported by the following grants to J.F.M.: R01 MH114269, RF1 MH093727, and R21 AA025572.

Footnotes

DISCLOSURE STATEMENT

The authors are not aware of any affiliations, memberships, funding, or financial holdings that might be perceived as affecting the objectivity of this review.

LITERATURE CITED

  1. Albus JS. 1971. A theory of cerebellar function. Math. Biosci 10:25–61 [Google Scholar]
  2. Alpaydin E 2014. Introduction to Machine Learning. Cambridge, MA: MIT Press [Google Scholar]
  3. Alvina K, Walter JT, Kohn A, Ellis-Davies G, Khodakhah K. 2008. Questioning the role of rebound firing in the cerebellum. Nat. Neurosci 11:1256–58 [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Ankri L, Husson Z, Pietrajtis K, Proville R, Lena C, et al. 2015. A novel inhibitory nucleo-cortical circuit controls cerebellar Golgi cell activity. eLife 4:e06262. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Apps R 2000. Gating of climbing fibre input to cerebellar cortical zones. Prog. Brain Res 124:201–11 [DOI] [PubMed] [Google Scholar]
  6. Apps R, Hawkes R. 2009. Cerebellar cortical organization: a one-map hypothesis. Nat. Rev. Neurosci 10:670–81 [DOI] [PubMed] [Google Scholar]
  7. Arenz A, Bracey EF, Margrie TW. 2009. Sensory representations in cerebellar granule cells. Curr. Opin. Neurobiol 19:445–51 [DOI] [PubMed] [Google Scholar]
  8. Astorga G, Bao J, Marty A, Augustine GJ, Franconville R, et al. 2015. An excitatory GABA loop operating in vivo. Front. Cell Neurosci 9:275. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Aziz W, Wang W, Kesaf S, Mohamed AA, Fukazawa Y, Shigemoto R. 2014. Distinct kinetics of synaptic structural plasticity, memory formation, and memory decay in massed and spaced learning. PNAS 111:E194–202 [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Bazzigaluppi P, De Gruijl JR, van der Giessen RS, Khosrovani S, De Zeeuw CI, de Jeu MT. 2012. Olivary subthreshold oscillations and burst activity revisited. Front. Neural Circuits 6:91. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Bengio Y, Courville A, Vincent P. 2013. Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell 35:1798–828 [DOI] [PubMed] [Google Scholar]
  12. Bengio Y, Lamblin P, Popovici D, Larochelle H. 2006. Greedy layer-wise training of deep networks. Proc. Int. Conf. Neural Inf. Process. Syst., 19th, pp. 153–60. Cambridge, MA: MIT Press [Google Scholar]
  13. Bengtsson F, Jorntell H. 2009. Sensory transmission in cerebellar granule cells relies on similarly coded mossy fiber inputs. PNAS 106:2389–94 [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Billings G, Piasini E, Lorincz A, Nusser Z, Silver RA. 2014. Network structure within the cerebellar input layer enables lossless sparse encoding. Neuron 83:960–74 [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Black JE, Isaacs KR, Anderson BJ, Alcantara AA, Greenough WT. 1990. Learning causes synaptogenesis, whereas motor activity causes angiogenesis, in cerebellar cortex of adult rats. PNAS 87:5568–72 [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Boele HJ, Koekkoek SK, De Zeeuw CI, Ruigrok TJ. 2013. Axonal sprouting and formation of terminals in the adult cerebellum during associative motor learning. J. Neurosci 33:17897–907 [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Bostan AC, Dum RP, Strick PL. 2013. Cerebellar networks with the cerebral cortex and basal ganglia. Trends Cogn. Sci 17:241–54 [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Boyden ES, Katoh A, Pyle JL, Chatila TA, Tsien RW, Raymond JL. 2006. Selective engagement of plasticity mechanisms for motor memory storage. Neuron 51:823–34 [DOI] [PubMed] [Google Scholar]
  19. Boyden ES, Katoh A, Raymond JL. 2004. Cerebellum-dependent learning: the role of multiple plasticity mechanisms. Annu. Rev. Neurosci 27:581–609 [DOI] [PubMed] [Google Scholar]
  20. Brandi S, Herreros I, Sanchez-Fibla M, Verschure PFMJ. 2013. Learning of motor sequences based on a computational model of the cerebellum. In Living Machines: Conference on Biomimetic and Biohybrid Systems, ed. NF Lepora, A Mura, HG Krapp, PFMJ Verschure, TJ Prescott, pp. 356–58. Berlin: Springer-Verlag [Google Scholar]
  21. Brenowitz SD, Regehr WG. 2005. Associative short-term synaptic plasticity mediated by endocannabinoids. Neuron 45:419–31 [DOI] [PubMed] [Google Scholar]
  22. Callaway JC, Lasser-Ross N, Ross WN. 1995. IPSPs strongly inhibit climbing fiber-activated [Ca2+]i increases in the dendrites of cerebellar Purkinje neurons. J. Neurosci 15:2777–87 [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Carey MR, Regehr WG. 2009. Noradrenergic control of associative synaptic plasticity by selective modulation of instructive signals. Neuron 62:112–22 [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Cayco-Gajic NA, Clopath C, Silver RA. 2017. Sparse synaptic connectivity is required for decorrelation and pattern separation in feedforward networks. Nat. Commun 8:1116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Cembrowski MS, Bachman JL, Wang L, Sugino K, Shields BC, Spruston N. 2016. Spatial gene-expression gradients underlie prominent heterogeneity of CA1 pyramidal neurons. Neuron 89:351–68 [DOI] [PubMed] [Google Scholar]
  26. Cerminara NL, Lang EJ, Sillitoe RV, Apps R. 2015. Redefining the cerebellar cortex as an assembly of non-uniform Purkinje cell microcircuits. Nat. Rev. Neurosci 16:79–93 [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Chabrol FP, Arenz A, Wiechert MT, Margrie TW, DiGregorio DA. 2015. Synaptic diversity enables temporal coding of coincident multisensory inputs in single neurons. Nat. Neurosci 18:718–27 [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Chadderton P, Schaefer AT, Williams SR, Margrie TW. 2014. Sensory-evoked synaptic integration in cerebellar and cerebral cortical neurons. Nat. Rev. Neurosci 15:71–83 [DOI] [PubMed] [Google Scholar]
  29. Chan-Palay V 1977. Cerebellar Dentate Nucleus: Organization, Cytology, and Transmitters. Berlin: Springer-Verlag [Google Scholar]
  30. Chaumont J, Guyon N, Valera AM, Dugue GP, Popa D, et al. 2013. Clusters of cerebellar Purkinje cells control their afferent climbing fiber discharge. PNAS 110:16223–28 [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Chen S, Augustine GJ, Chadderton P. 2016. The cerebellum linearly encodes whisker position during voluntary movement. eLife 5:e10509. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Chen S, Augustine GJ, Chadderton P. 2017. Serial processing of kinematic signals by cerebellar circuitry during voluntary whisking. Nat. Commun 8:232. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Coates A, Lee H, Ng AY. 2011. An analysis of single-layer networks in unsupervised feature learning. Presented at Proc. Int. Conf. Artif. Intell. Stat., 14th, Fort Lauderdale, FL http://proceedings.mlr.press/v15/coates11a.html [Google Scholar]
  34. Cooke SF, Attwell PJ, Yeo CH. 2004. Temporal properties of cerebellar-dependent memory consolidation. J. Neurosci 24:2934–41 [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. D’Angelo E, De Zeeuw CI. 2009. Timing and plasticity in the cerebellum: focus on the granular layer. Trends Neurosci. 32:30–40 [DOI] [PubMed] [Google Scholar]
  36. De Gruijl JR, Bazzigaluppi P, de Jeu MT, De Zeeuw CI. 2012. Climbing fiber burst size and olivary subthreshold oscillations in a network setting. PLOS Comput. Biol 8:e1002814. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. De Zeeuw CI, Simpson JI, Hoogenraad CC, Galjart N, Koekkoek SK, Ruigrok TJ. 1998. Microcircuitry and function of the inferior olive. Trends Neurosci. 21:391–400 [DOI] [PubMed] [Google Scholar]
  38. Dean P, Porrill J. 2011. Evaluating the adaptive-filter model of the cerebellum. J. Physiol 589:3459–70 [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Dino MR, Schuerger RJ, Liu Y, Slater NT, Mugnaini E. 2000. Unipolar brush cell: a potential feedforward excitatory interneuron of the cerebellum. Neuroscience 98:625–36 [DOI] [PubMed] [Google Scholar]
  40. Doya K 2000. Complementary roles of basal ganglia and cerebellum in learning and motor control. Curr. Opin. Neurobiol 10:732–39 [DOI] [PubMed] [Google Scholar]
  41. Dudai Y, Karni A, Born J. 2015. The consolidation and transformation of memory. Neuron 88:20–32 [DOI] [PubMed] [Google Scholar]
  42. Dugue GP, Tihy M, Gourevitch B, Lena C. 2017. Cerebellar re-encoding of self-generated head movements. eLife 6:e26179. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Ebner TJ, Hewitt AL, Popa LS. 2011. What features of limb movements are encoded in the discharge of cerebellar neurons? Cerebellum 10:683–93 [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Eccles JC, Llinás R, Sasaki K. 1966. The excitatory synaptic action of climbing fibres on the Purkinje cells of the cerebellum. J. Physiol 182:268–96 [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Eichenbaum H 2014. Time cells in the hippocampus: a new dimension for mapping memories. Nat. Rev. Neurosci 15:732–44 [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Finnerty GT, Shadlen MN, Jazayeri M, Nobre AC, Buonomano DV. 2015. Time in cortical circuits. J. Neurosci 35:13912–16 [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Gaffield MA, Christie JM. 2017. Movement rate is encoded and influenced by widespread, coherent activity of cerebellar molecular layer interneurons. J. Neurosci 37:4751–65 [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Gao Z, Proietti-Onori M, Lin Z, Ten Brinke MM, Boele HJ, et al. 2016. Excitatory cerebellar nucleocortical circuit provides internal amplification during associative conditioning. Neuron 89:645–57 [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Gao Z, van Beugen BJ, De Zeeuw CI. 2012. Distributed synergistic plasticity and cerebellar learning. Nat. Rev. Neurosci 13:619–35 [DOI] [PubMed] [Google Scholar]
  50. Giocomo LM, Hasselmo ME. 2008. Time constants of h current in layer II stellate cells differ along the dorsal to ventral axis of medial entorhinal cortex. J. Neurosci 28:9414–25 [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Giovannucci A, Badura A, Deverett B, Najafi F, Pereira TD, et al. 2017. Cerebellar granule cells acquire a widespread predictive feedback signal during motor learning. Nat. Neurosci 20:727–34 [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Gomi H, Sun W, Finch CE, Itohara S, Yoshimi K, Thompson RF. 1999. Learning induces a CDC2-related protein kinase, KKIAMRE. J. Neurosci 19:9530–37 [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Guo C, Witter L, Rudolph S, Elliott HL, Ennis KA, Regehr WG. 2016. Purkinje cells directly inhibit granule cells in specialized regions of the cerebellar cortex. Neuron 91:1330–41 [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Hagan MT, Demuth HB, Beale M, De Jesus O. 2014. Neural Network Design. Stillwater, OK: Martin Hagan. 2nd ed. [Google Scholar]
  55. Hawkes R 2014. Purkinje cell stripes and long-term depression at the parallel fiber-Purkinje cell synapse. Front. Syst. Neurosci 8:41. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Haykin SS. 2013. Adaptive Filter Theory. London: Pearson [Google Scholar]
  57. Heiney SA, Kim J, Augustine GJ, Medina JF. 2014. Precise control of movement kinematics by optogenetic inhibition of Purkinje cell activity. J. Neurosci 34:2321–30 [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Herzfeld DJ, Kojima Y, Soetedjo R, Shadmehr R. 2015. Encoding of action by the Purkinje cells of the cerebellum. Nature 526:439–42 [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Herzfeld DJ, Vaswani PA, Marko MK, Shadmehr R. 2014. A memory of errors in sensorimotor learning. Science 345:1349–53 [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Hikosaka O, Nakahara H, Rand MK, Sakai K, Lu X, et al. 1999. Parallel neural networks for learning sequential procedures. Trends Neurosci. 22:464–71 [DOI] [PubMed] [Google Scholar]
  61. Hoche F, Guell X, Sherman JC, Vangel MG, Schmahmann JD. 2016. Cerebellar contribution to social cognition. Cerebellum 15:732–43 [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Hoebeek FE, Witter L, Ruigrok TJ, De Zeeuw CI. 2010. Differential olivo-cerebellar cortical control of rebound activity in the cerebellar nuclei. PNAS 107:8410–15 [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Hong S, Negrello M, Junker M, Smilgin A, Thier P, De Schutter E. 2016. Multiplexed coding by cerebellar Purkinje neurons. eLife 5:e13810. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Houck BD, Person AL. 2014. Cerebellar loops: a review of the nucleocortical pathway. Cerebellum 13:378–85 [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Houck BD, Person AL. 2015. Cerebellar premotor output neurons collateralize to innervate the cerebellar cortex. J. Comp. Neurol 523:2254–71 [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Huang CC, Sugino K, Shima Y, Guo C, Bai S, et al. 2013. Convergence of pontine and proprioceptive streams onto multimodal cerebellar granule cells. eLife 2:e00400. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Hull C, Regehr WG. 2012. Identification of an inhibitory circuit that regulates cerebellar Golgi cell activity. Neuron 73:149–58 [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Ishikawa T, Shimuta M, Hausser M. 2015. Multimodal sensory integration in single cerebellar granule cells in vivo. eLife 4:e12916. [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Ito M 2000. Mechanisms of motor learning in the cerebellum. Brain Res. 886:237–45 [DOI] [PubMed] [Google Scholar]
  70. Ito M 2013. Error detection and representation in the olivo-cerebellar system. Front. Neural Circuits 7:1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Jahnsen H 1986. Electrophysiological characteristics of neurones in the guinea-pig deep cerebellar nuclei in vitro. J. Physiol 372:129–47 [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Jelitai M, Puggioni P, Ishikawa T, Rinaldi A, Duguid I. 2016. Dendritic excitation-inhibition balance shapes cerebellar output during motor behaviour. Nat. Commun 7:13722. [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Jenkins WK, Hull AW, Strait JC, Schnaufer BA, Li X. 1996. Advanced Concepts in Adaptive Signal Processing. Alphen aan den Rijn, Neth.: Wolters Kluwer [Google Scholar]
  74. Johansson F, Carlsson HA, Rasmussen A, Yeo CH, Hesslow G. 2015. Activation of a temporal memory in Purkinje cells by the mGluR7 receptor. Cell Rep. 13:1741–46 [DOI] [PubMed] [Google Scholar]
  75. Johansson F, Jirenhed DA, Rasmussen A, Zucca R, Hesslow G. 2014. Memory trace and timing mechanism localized to cerebellar Purkinje cells. PNAS 111:14930–34 [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Joiner WM, Smith MA. 2008. Long-term retention explained by a model of short-term learning in the adaptive control of reaching. J. Neurophysiol 100:2948–55 [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Jordan M, Rumelhart D. 1992. Forward models: supervised learning with a distal teacher. Cogn. Sci 16:307–54 [Google Scholar]
  78. Jorntell H, Ekerot CF. 2006. Properties of somatosensory synaptic integration in cerebellar granule cells in vivo. J. Neurosci 26:11786–97 [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Kassardjian CD, Tan YF, Chung JY, Heskin R, Peterson MJ, Broussard DM. 2005. The site of a motor memory shifts with consolidation. J. Neurosci 25:7979–85 [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Ke MC, Guo CC, Raymond JL. 2009. Elimination of climbing fiber instructive signals during motor learning. Nat. Neurosci 12:1171–79 [DOI] [PMC free article] [PubMed] [Google Scholar]
  81. Kennedy A, Wayne G, Kaifosh P, Alvina K, Abbott LF, Sawtell NB. 2014. A temporal basis for predicting the sensory consequences of motor commands in an electric fish. Nat. Neurosci 17:416–22 [DOI] [PMC free article] [PubMed] [Google Scholar]
  82. Khilkevich A, Halverson HE, Canton-Josh JE, Mauk MD. 2016. Links between single-trial changes and learning rate in eyelid conditioning. Cerebellum 15:112–21 [DOI] [PubMed] [Google Scholar]
  83. Kimpo RR, Rinaldi JM, Kim CK, Payne HL, Raymond JL. 2014. Gating of neural error signals during motor learning. eLife 3:e02076. [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Kleim JA, Freeman JH Jr., Bruneau R, Nolan BC, Cooper NR, et al. 2002. Synapse formation is associated with memory storage in the cerebellum. PNAS 99:13228–31 [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Kleine JF, Guan Y, Buttner U. 2003. Saccade-related neurons in the primate fastigial nucleus: What do they encode? J. Neurophysiol 90:3137–54 [DOI] [PubMed] [Google Scholar]
  86. Knogler LD, Markov DA, Dragomir EI, Stih V, Portugues R. 2017. Sensorimotor representations in cerebellar granule cells in larval zebrafish are dense, spatially organized, and non-temporally patterned. Curr. Biol 27:1288–302 [DOI] [PubMed] [Google Scholar]
  87. Knudsen EI. 1994. Supervised learning in the brain. J. Neurosci 14:3985–97 [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. Kording KP, Tenenbaum JB, Shadmehr R. 2007. The dynamics of memory as a consequence of optimal adaptation to a changing body. Nat. Neurosci 10:779–86 [DOI] [PMC free article] [PubMed] [Google Scholar]
  89. Koziol LF, Budding D, Andreasen N, D’Arrigo S, Bulgheroni S, et al. 2014. Consensus paper: the cerebellum’s role in movement and cognition. Cerebellum 13:151–77 [DOI] [PMC free article] [PubMed] [Google Scholar]
  90. Kukushkin NV, Carew TJ. 2017. Memory takes time. Neuron 95:259–79 [DOI] [PMC free article] [PubMed] [Google Scholar]
  91. Laurens J, Heiney SA, Kim G, Blazquez PM. 2013. Cerebellar cortex granular layer interneurons in the macaque monkey are functionally driven by mossy fiber pathways through net excitation or inhibition. PLOS ONE 8:e82239. [DOI] [PMC free article] [PubMed] [Google Scholar]
  92. Lawrenson CL, Watson TC, Apps R. 2016. Transmission of predictable sensory signals to the cerebellum via climbing fiber pathways is gated during exploratory behavior. J. Neurosci 36:7841–51 [DOI] [PMC free article] [PubMed] [Google Scholar]
  93. LeCun Y, Bengio Y, Hinton G. 2015. Deep learning. Nature 521:436–44 [DOI] [PubMed] [Google Scholar]
  94. Lee KH, Mathews PJ, Reeves AM, Choe KY, Jami SA, et al. 2015. Circuit mechanisms underlying motor memory formation in the cerebellum. Neuron 86:529–40 [DOI] [PMC free article] [PubMed] [Google Scholar]
  95. Lefler Y, Yarom Y, Uusisaari MY. 2014. Cerebellar inhibitory input to the inferior olive decreases electrical coupling and blocks subthreshold oscillations. Neuron 81:1389–400 [DOI] [PubMed] [Google Scholar]
  96. Lefort JM, Rochefort C, Rondi-Reig L, Group L.R.R. Memb. Bio-Psy Labex ENP Found. 2015. Cerebellar contribution to spatial navigation: new insights into potential mechanisms. Cerebellum 14:59–62 [DOI] [PubMed] [Google Scholar]
  97. Leggio M, Molinari M. 2015. Cerebellar sequencing: a trick for predicting the future. Cerebellum 14:35–38 [DOI] [PubMed] [Google Scholar]
  98. Li JX, Medina JF, Frank LM, Lisberger SG. 2011. Acquisition of neural learning in cerebellum and cerebral cortex for smooth pursuit eye movements. J. Neurosci 31:12716–26 [DOI] [PMC free article] [PubMed] [Google Scholar]
  99. Lisberger SG, Fuchs AF. 1978. Role of primate flocculus during rapid behavioral modification of vestibuloocular reflex. II. Mossy fiber firing patterns during horizontal head rotation and eye movement. J. Neurophysiol 41:764–77 [DOI] [PubMed] [Google Scholar]
  100. Lisberger SG, Sejnowski TJ. 1992. Motor learning in a recurrent network model based on the vestibulo-ocular reflex. Nature 360:159–61 [DOI] [PubMed] [Google Scholar]
  101. Litwin-Kumar A, Harris KD, Axel R, Sompolinsky H, Abbott LF. 2017. Optimal degrees of synaptic connectivity. Neuron 93:1153–64.e7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  102. Loewenstein Y, Mahon S, Chadderton P, Kitamura K, Sompolinsky H, et al. 2005. Bistability of cerebellar Purkinje cells modulated by sensory stimulation. Nat. Neurosci 8:202–11 [DOI] [PubMed] [Google Scholar]
  103. Maex R, Gutkin B. 2017. Temporal integration and 1/f power scaling in a circuit model of cerebellar interneurons. J. Neurophysiol 118:471–85 [DOI] [PMC free article] [PubMed] [Google Scholar]
  104. Marblestone AH, Wayne G, Kording KP. 2016. Toward an integration of deep learning and neuroscience. Front. Comput. Neurosci 10:94. [DOI] [PMC free article] [PubMed] [Google Scholar]
  105. Marr D 1969. A theory of cerebellar cortex. J. Physiol 202:437–70 [DOI] [PMC free article] [PubMed] [Google Scholar]
  106. Maruta J, Hensbroek RA, Simpson JI. 2007. Intraburst and interburst signaling by climbing fibers. J. Neurosci 27:11263–70 [DOI] [PMC free article] [PubMed] [Google Scholar]
  107. Mathy A, Ho SS, Davie JT, Duguid IC, Clark BA, Häusser M. 2009. Encoding of oscillations by axonal bursts in inferior olive neurons. Neuron 62:388–99 [DOI] [PMC free article] [PubMed] [Google Scholar]
  108. Mauk MD, Garcia KS, Medina JF, Steele PM. 1998. Does cerebellar LTD mediate motor learning? Toward a resolution without a smoking gun. Neuron 20:359–62 [DOI] [PubMed] [Google Scholar]
  109. McElvain LE, Bagnall MW, Sakatos A, du Lac S. 2010. Bidirectional plasticity gated by hyperpolarization controls the gain of postsynaptic firing responses at central vestibular nerve synapses. Neuron 68:763–75 [DOI] [PMC free article] [PubMed] [Google Scholar]
  110. Medina JF. 2011. The multiple roles of Purkinje cells in sensori-motor calibration: to predict, teach and command. Curr. Opin. Neurobiol 21:616–22 [DOI] [PMC free article] [PubMed] [Google Scholar]
  111. Medina JF, Garcia KS, Nores WL, Taylor NM, Mauk MD. 2000a. Timing mechanisms in the cerebellum: testing predictions of a large-scale computer simulation. J. Neurosci 20:5516–25 [DOI] [PMC free article] [PubMed] [Google Scholar]
  112. Medina JF, Lisberger SG. 2007. Variation, signal, and noise in cerebellar sensory-motor processing for smooth-pursuit eye movements. J. Neurosci 27:6832–42 [DOI] [PMC free article] [PubMed] [Google Scholar]
  113. Medina JF, Lisberger SG. 2008. Links from complex spikes to local plasticity and motor learning in the cerebellum of awake-behaving monkeys. Nat. Neurosci 11:1185–92 [DOI] [PMC free article] [PubMed] [Google Scholar]
  114. Medina JF, Lisberger SG. 2009. Encoding and decoding of learned smooth-pursuit eye movements in the floccular complex of the monkey cerebellum. J. Neurophysiol 102:2039–54 [DOI] [PMC free article] [PubMed] [Google Scholar]
  115. Medina JF, Mauk MD. 1999. Simulations of cerebellar motor learning: computational analysis of plasticity at the mossy fiber to deep nucleus synapse. J. Neurosci 19:7140–51 [DOI] [PMC free article] [PubMed] [Google Scholar]
  116. Medina JF, Mauk MD. 2000. Computer simulation of cerebellar information processing. Nat. Neurosci 3(Suppl.):1205–11 [DOI] [PubMed] [Google Scholar]
  117. Medina JF, Nores WL, Ohyama T, Mauk MD. 2000b. Mechanisms of cerebellar learning suggested by eyelid conditioning. Curr. Opin. Neurobiol 10:717–24 [DOI] [PubMed] [Google Scholar]
  118. Medina JF, Repa JC, Mauk MD, LeDoux JE. 2002. Parallels between cerebellum- and amygdala-dependent conditioning. Nat. Rev. Neurosci 3:122–31 [DOI] [PubMed] [Google Scholar]
  119. Medsker L, Jain LC. 1999. Recurrent Neural Networks: Design and Applications. Boca Raton, FL: CRC Press [Google Scholar]
  120. Miall RC, Weir DJ, Wolpert DM, Stein JF. 1993. Is the cerebellum a Smith predictor? J. Mot. Behav 25:203–16 [DOI] [PubMed] [Google Scholar]
  121. Miles FA, Lisberger SG. 1981. Plasticity in the vestibulo-ocular reflex: a new hypothesis. Annu. Rev. Neurosci 4:273–99 [DOI] [PubMed] [Google Scholar]
  122. Minsky ML. 1963. Steps toward artificial intelligence. In Computers and Thought, ed. Feigenbaum EA, Feldman J, pp. 406–50. New York: McGraw-Hill [Google Scholar]
  123. Mittleman G, Goldowitz D, Heck DH, Blaha CD. 2008. Cerebellar modulation of frontal cortex dopamine efflux in mice: relevance to autism and schizophrenia. Synapse 62:544–50 [DOI] [PMC free article] [PubMed] [Google Scholar]
  124. Mugnaini E, Floris A. 1994. The unipolar brush cell: a neglected neuron of the mammalian cerebellar cortex. J. Comp. Neurol 339:174–80 [DOI] [PubMed] [Google Scholar]
  125. Najafi F, Giovannucci A, Wang SS-H, Medina JF. 2014a. Coding of stimulus strength via analog calcium signals in Purkinje cell dendrites of awake mice. eLife 3:e03663. [DOI] [PMC free article] [PubMed] [Google Scholar]
  126. Najafi F, Giovannucci A, Wang SS-H, Medina JF. 2014b. Sensory-driven enhancement of calcium signals in individual Purkinje cell dendrites of awake mice. Cell Rep 6:792–98 [DOI] [PMC free article] [PubMed] [Google Scholar]
  127. Najafi F, Medina JF. 2013. Beyond “all-or-nothing” climbing fibers: graded representation of teaching signals in Purkinje cells. Front. Neural Circuits 7:115 [Google Scholar]
  128. Nguyen-Vu TD, Kimpo RR, Rinaldi JM, Kohli A, Zeng H, et al. 2013. Cerebellar Purkinje cell activity drives motor learning. Nat. Neurosci 16:1734–36 [DOI] [PMC free article] [PubMed] [Google Scholar]
  129. Nguyen-Vu TDB, Zhao GQ, Lahiri S, Kimpo RR, Lee H, et al. 2017. A saturation hypothesis to explain both enhanced and impaired learning with enhanced plasticity. eLife 6:e20147. [DOI] [PMC free article] [PubMed] [Google Scholar]
  130. O’Doherty JP, Dayan P, Friston K, Critchley H, Dolan RJ. 2003. Temporal difference models and reward-related learning in the human brain. Neuron 38:329–37 [DOI] [PubMed] [Google Scholar]
  131. Ohmae S, Medina JF. 2015. Climbing fibers encode a temporal-difference prediction error during cerebellar learning in mice. Nat. Neurosci 18:1798–803 [DOI] [PMC free article] [PubMed] [Google Scholar]
  132. Ozden I, Dombeck DA, Hoogland TM, Tank DW, Wang SS-H. 2012. Widespread state-dependent shifts in cerebellar activity in locomoting mice. PLOS ONE 7:e42650. [DOI] [PMC free article] [PubMed] [Google Scholar]
  133. Park SM, Tara E, Khodakhah K. 2012. Efficient generation of reciprocal signals by inhibition. J. Neurophysiol 107:2453–62 [DOI] [PMC free article] [PubMed] [Google Scholar]
  134. Penhune VB, Steele CJ. 2012. Parallel contributions of cerebellar, striatal and M1 mechanisms to motor sequence learning. Behav. Brain Res 226:579–91 [DOI] [PubMed] [Google Scholar]
  135. Perciavalle V, Apps R, Bracha V, Delgado-Garcia JM, Gibson AR, et al. 2013. Consensus paper: current views on the role of cerebellar interpositus nucleus in movement control and emotion. Cerebellum 12:738–57 [DOI] [PubMed] [Google Scholar]
  136. Person AL, Raman IM. 2010. Deactivation of L-type Ca current by inhibition controls LTP at excitatory synapses in the cerebellar nuclei. Neuron 66:550–59 [DOI] [PMC free article] [PubMed] [Google Scholar]
  137. Person AL, Raman IM. 2011. Purkinje neuron synchrony elicits time-locked spiking in the cerebellar nuclei. Nature 481:502–5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  138. Pitkow X, Meister M. 2012. Decorrelation and efficient coding by retinal ganglion cells. Nat. Neurosci 15:628–35 [DOI] [PMC free article] [PubMed] [Google Scholar]
  139. Popa LS, Streng ML, Hewitt AL, Ebner TJ. 2016. The errors of our ways: understanding error representations in cerebellar-dependent motor learning. Cerebellum 15:93–103 [DOI] [PMC free article] [PubMed] [Google Scholar]
  140. Porrill J, Dean P. 2007. Recurrent cerebellar loops simplify adaptive control of redundant and nonlinear motor systems. Neural Comput. 19:170–93 [DOI] [PubMed] [Google Scholar]
  141. Porrill J, Dean P, Anderson SR. 2013. Adaptive filters and internal models: multilevel description of cerebellar function. Neural Netw. 47:134–49 [DOI] [PubMed] [Google Scholar]
  142. Porrill J, Dean P, Stone JV. 2004. Recurrent cerebellar architecture solves the motor-error problem. Proc. Biol. Sci 271:789–96 [DOI] [PMC free article] [PubMed] [Google Scholar]
  143. Pouget A, Snyder LH. 2000. Computational approaches to sensorimotor transformations. Nat. Neurosci 3(Suppl.):1192–98 [DOI] [PubMed] [Google Scholar]
  144. Powell K, Mathy A, Duguid I, Häusser M. 2015. Synaptic representation of locomotion in single cerebellar granule cells. eLife 4:e07290. [DOI] [PMC free article] [PubMed] [Google Scholar]
  145. Raymond JL, Lisberger SG. 1998. Neural learning rules for the vestibulo-ocular reflex. J. Neurosci 18:9112–29 [DOI] [PMC free article] [PubMed] [Google Scholar]
  146. Rieubland S, Roth A, Häusser M. 2014. Structured connectivity in cerebellar inhibitory networks. Neuron 81:913–29 [DOI] [PMC free article] [PubMed] [Google Scholar]
  147. Rossert C, Dean P, Porrill J. 2015. At the edge of chaos: how cerebellar granular layer network dynamics can provide the basis for temporal filters. PLOS Comput. Biol 11:e1004515. [DOI] [PMC free article] [PubMed] [Google Scholar]
  148. Rossert C, Solinas S, D’Angelo E, Dean P, Porrill J. 2014. Model cerebellar granule cells can faithfully transmit modulated firing rate signals. Front. Cell Neurosci 8:304. [DOI] [PMC free article] [PubMed] [Google Scholar]
  149. Rossi DJ, Alford S, Mugnaini E, Slater NT. 1995. Properties of transmission at a giant glutamatergic synapse in cerebellum: the mossy fiber-unipolar brush cell synapse. J. Neurophysiol 74:24–42 [DOI] [PubMed] [Google Scholar]
  150. Sawtell NB. 2010. Multimodal integration in granule cells as a basis for associative plasticity and sensory prediction in a cerebellum-like circuit. Neuron 66:573–84 [DOI] [PubMed] [Google Scholar]
  151. Schaul T, Zhang S, LeCun Y. 2012. No more pesky learning rates. arXiv:1206.1106 [stat.ML] [Google Scholar]
  152. Schonewille M, Gao Z, Boele HJ, Veloz MF, Amerika WE, et al. 2011. Reevaluating the role of LTD in cerebellar motor learning. Neuron 70:43–50 [DOI] [PMC free article] [PubMed] [Google Scholar]
  153. Schonewille M, Khosrovani S, Winkelman BH, Hoebeek FE, De Jeu MT, et al. 2006. Purkinje cells in awake behaving animals operate at the upstate membrane potential. Nat. Neurosci 9:459–61 [DOI] [PubMed] [Google Scholar]
  154. Schultz W. 1998. Predictive reward signal of dopamine neurons. J. Neurophysiol 80:1–27 [DOI] [PubMed] [Google Scholar]
  155. Schultz W, Dayan P, Montague PR. 1997. A neural substrate of prediction and reward. Science 275:1593–99 [DOI] [PubMed] [Google Scholar]
  156. Seymour B, O’Doherty JP, Dayan P, Koltzenburg M, Jones AK, et al. 2004. Temporal difference models describe higher-order learning in humans. Nature 429:664–67 [DOI] [PubMed] [Google Scholar]
  157. Sgritta M, Locatelli F, Soda T, Prestori F, D’Angelo EU. 2017. Hebbian spike-timing dependent plasticity at the cerebellar input stage. J. Neurosci 37:2809–23 [DOI] [PMC free article] [PubMed] [Google Scholar]
  158. Shidara M, Kawano K, Gomi H, Kawato M. 1993. Inverse-dynamics model eye movement control by Purkinje cells in the cerebellum. Nature 365:50–52 [DOI] [PubMed] [Google Scholar]
  159. Shutoh F, Ohki M, Kitazawa H, Itohara S, Nagao S. 2006. Memory trace of motor learning shifts transsynaptically from cerebellar cortex to nuclei for consolidation. Neuroscience 139:767–77 [DOI] [PubMed] [Google Scholar]
  160. Simpson JI, Wylie DR, De Zeeuw CI. 1996. On climbing fiber signals and their consequence(s). Behav. Brain Sci 19:384–98 [Google Scholar]
  161. Sokolov AA, Miall RC, Ivry RB. 2017. The cerebellum: adaptive prediction for movement and cognition. Trends Cogn. Sci 21:313–32 [DOI] [PMC free article] [PubMed] [Google Scholar]
  162. Soltesz I. 2005. Diversity in the Neuronal Machine: Order and Variability in Interneuronal Microcircuits. Oxford, UK: Oxford Univ. Press [Google Scholar]
  163. Spanne A, Jorntell H. 2013. Processing of multi-dimensional sensorimotor information in the spinal and cerebellar neuronal circuitry: a new hypothesis. PLOS Comput. Biol 9:e1002979. [DOI] [PMC free article] [PubMed] [Google Scholar]
  164. Spanne A, Jorntell H. 2015. Questioning the role of sparse coding in the brain. Trends Neurosci. 38:417–27 [DOI] [PubMed] [Google Scholar]
  165. Squire LR, Genzel L, Wixted JT, Morris RG. 2015. Memory consolidation. Cold Spring Harb. Perspect. Biol 7:a021766. [DOI] [PMC free article] [PubMed] [Google Scholar]
  166. Steuber V, Jaeger D. 2013. Modeling the generation of output by the cerebellar nuclei. Neural Netw. 47:112–19 [DOI] [PMC free article] [PubMed] [Google Scholar]
  167. Strata P. 2015. The emotional cerebellum. Cerebellum 14:570–77 [DOI] [PubMed] [Google Scholar]
  168. Sun Z, Smilgin A, Junker M, Dicke PW, Thier P. 2017. The same oculomotor vermal Purkinje cells encode the different kinematics of saccades and of smooth pursuit eye movements. Sci. Rep 7:40613. [DOI] [PMC free article] [PubMed] [Google Scholar]
  169. Sutton MA, Carew TJ. 2000. Parallel molecular pathways mediate expression of distinct forms of intermediate-term facilitation at tail sensory-motor synapses in Aplysia. Neuron 26:219–31 [DOI] [PubMed] [Google Scholar]
  170. Sutton RS. 1984. Temporal credit assignment in reinforcement learning. PhD Thesis, Univ. Mass., Amherst [Google Scholar]
  171. Sutton RS. 1988. Learning to predict by the methods of temporal differences. Mach. Learn 3:9–44 [Google Scholar]
  172. Suvrathan A, Payne HL, Raymond JL. 2016. Timing rules for synaptic plasticity matched to behavioral function. Neuron 92:959–67 [DOI] [PMC free article] [PubMed] [Google Scholar]
  173. Tang T, Xiao J, Suh CY, Burroughs A, Cerminara NL, et al. 2017. Heterogeneity of Purkinje cell simple spike-complex spike interactions: zebrin- and non-zebrin-related variations. J. Physiol 595:5341–57 [DOI] [PMC free article] [PubMed] [Google Scholar]
  174. ten Brinke MM, Boele HJ, Spanke JK, Potters JW, Kornysheva K, et al. 2015. Evolving models of Pavlovian conditioning: cerebellar cortical dynamics in awake behaving mice. Cell Rep. 13:1977–88 [DOI] [PMC free article] [PubMed] [Google Scholar]
  175. ten Brinke MM, Heiney SA, Wang X, Proietti-Onori M, Boele HJ, et al. 2017. Dynamic modulation of activity in cerebellar nuclei neurons during Pavlovian eyeblink conditioning in mice. eLife 6:e28132. [DOI] [PMC free article] [PubMed] [Google Scholar]
  176. Titley HK, Brunel N, Hansel C. 2017. Toward a neurocentric view of learning. Neuron 95:19–32 [DOI] [PMC free article] [PubMed] [Google Scholar]
  177. Tokuda IT, Hoang H, Schweighofer N, Kawato M. 2013. Adaptive coupling of inferior olive neurons in cerebellar learning. Neural Netw. 47:42–50 [DOI] [PubMed] [Google Scholar]
  178. Trott JR, Apps R, Armstrong DM. 1998a. Zonal organization of cortico-nuclear and nucleo-cortical projections of the paramedian lobule of the cat cerebellum. 1. The C1 zone. Exp. Brain Res 118:298–315 [DOI] [PubMed] [Google Scholar]
  179. Trott JR, Apps R, Armstrong DM. 1998b. Zonal organization of cortico-nuclear and nucleo-cortical projections of the paramedian lobule of the cat cerebellum. 2. The C2 zone. Exp. Brain Res 118:316–30 [DOI] [PubMed] [Google Scholar]
  180. Tsutsumi S, Yamazaki M, Miyazaki T, Watanabe M, Sakimura K, et al. 2015. Structure-function relationships between aldolase C/zebrin II expression and complex spike synchrony in the cerebellum. J. Neurosci 35:843–52 [DOI] [PMC free article] [PubMed] [Google Scholar]
  181. Turecek J, Jackman SL, Regehr WG. 2016. Synaptic specializations support frequency-independent Purkinje cell output from the cerebellar cortex. Cell Rep. 17:3256–68 [DOI] [PMC free article] [PubMed] [Google Scholar]
  182. Turecek J, Jackman SL, Regehr WG. 2017. Synaptotagmin 7 confers frequency invariance onto specialized depressing synapses. Nature 551:503–6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  183. Urbano FJ, Simpson JI, Llinas RR. 2006. Somatomotor and oculomotor inferior olivary neurons have distinct electrophysiological phenotypes. PNAS 103:16550–55 [DOI] [PMC free article] [PubMed] [Google Scholar]
  184. Valera AM, Binda F, Pawlowski SA, Dupont JL, Casella JF, et al. 2016. Stereotyped spatial patterns of functional synaptic connectivity in the cerebellar cortex. eLife 5:e09862. [DOI] [PMC free article] [PubMed] [Google Scholar]
  185. van Beugen BJ, Gao Z, Boele HJ, Hoebeek F, De Zeeuw CI. 2013. High frequency burst firing of granule cells ensures transmission at the parallel fiber to Purkinje cell synapse at the cost of temporal coding. Front. Neural Circuits 7:95. [DOI] [PMC free article] [PubMed] [Google Scholar]
  186. van Dorp S, De Zeeuw CI. 2015. Forward signaling by unipolar brush cells in the mouse cerebellum. Cerebellum 14:528–33 [DOI] [PMC free article] [PubMed] [Google Scholar]
  187. van Vreeswijk C, Sompolinsky H. 1996. Chaos in neuronal networks with balanced excitatory and inhibitory activity. Science 274:1724–26 [DOI] [PubMed] [Google Scholar]
  188. Wadiche JI, Jahr CE. 2005. Patterned expression of Purkinje cell glutamate transporters controls synaptic plasticity. Nat. Neurosci 8:1329–34 [DOI] [PubMed] [Google Scholar]
  189. Wagner MJ, Kim TH, Savall J, Schnitzer MJ, Luo L. 2017. Cerebellar granule cells encode the expectation of reward. Nature 544:96–100 [DOI] [PMC free article] [PubMed] [Google Scholar]
  190. Walter JT, Khodakhah K. 2006. The linear computational algorithm of cerebellar Purkinje cells. J. Neurosci 26:12861–72 [DOI] [PMC free article] [PubMed] [Google Scholar]
  191. Walter JT, Khodakhah K. 2009. The advantages of linear information processing for cerebellar computation. PNAS 106:4471–76 [DOI] [PMC free article] [PubMed] [Google Scholar]
  192. Wang SS-H, Denk W, Häusser M. 2000. Coincidence detection in single dendritic spines mediated by calcium release. Nat. Neurosci 3:1266–73 [DOI] [PubMed] [Google Scholar]
  193. Wang W, Nakadate K, Masugi-Tokita M, Shutoh F, Aziz W, et al. 2014. Distinct cerebellar engrams in short-term and long-term motor learning. PNAS 111:E188–93 [DOI] [PMC free article] [PubMed] [Google Scholar]
  194. Wei K, Kording K. 2009. Relevance of error: What drives motor adaptation? J. Neurophysiol 101:655–64 [DOI] [PMC free article] [PubMed] [Google Scholar]
  195. Witter L, Rudolph S, Pressler RT, Lahlaf SI, Regehr WG. 2016. Purkinje cell collaterals enable output signals from the cerebellar cortex to feed back to Purkinje cells and interneurons. Neuron 91:312–19 [DOI] [PMC free article] [PubMed] [Google Scholar]
  196. Wolpert DM, Miall RC, Kawato M. 1998. Internal models in the cerebellum. Trends Cogn. Sci 2:338–47 [DOI] [PubMed] [Google Scholar]
  197. Xiao J, Cerminara NL, Kotsurovskyy Y, Aoki H, Burroughs A, et al. 2014. Systematic regional variations in Purkinje cell spiking patterns. PLOS ONE 9:e105633. [DOI] [PMC free article] [PubMed] [Google Scholar]
  198. Yamaguchi K, Itohara S, Ito M. 2016. Reassessment of long-term depression in cerebellar Purkinje cells in mice carrying mutated GluA2 C terminus. PNAS 113:10192–97 [DOI] [PMC free article] [PubMed] [Google Scholar]
  199. Yang Y, Lisberger SG. 2010. Learning on multiple timescales in smooth pursuit eye movements. J. Neurophysiol 104:2850–62 [DOI] [PMC free article] [PubMed] [Google Scholar]
  200. Yang Y, Lisberger SG. 2014a. Purkinje-cell plasticity and cerebellar motor learning are graded by complex-spike duration. Nature 510:529–32 [DOI] [PMC free article] [PubMed] [Google Scholar]
  201. Yang Y, Lisberger SG. 2014b. Role of plasticity at different sites across the time course of cerebellar motor learning. J. Neurosci 34:7077–90 [DOI] [PMC free article] [PubMed] [Google Scholar]
  202. Yang Y, Lisberger SG. 2017. Modulation of complex-spike duration and probability during cerebellar motor learning in visually guided smooth-pursuit eye movements of monkeys. eNeuro 4:ENEURO.0115–17.2017 [DOI] [PMC free article] [PubMed] [Google Scholar]
  203. Yassa MA, Stark CE. 2011. Pattern separation in the hippocampus. Trends Neurosci. 34:515–25 [DOI] [PMC free article] [PubMed] [Google Scholar]
  204. Zheng N, Raman IM. 2010. Synaptic inhibition, excitation, and plasticity in neurons of the cerebellar nuclei. Cerebellum 9:56–66 [DOI] [PMC free article] [PubMed] [Google Scholar]
  205. Zhou H, Lin Z, Voges K, Ju C, Gao Z, et al. 2014. Cerebellar modules operate at different frequencies. eLife 3:e02536. [DOI] [PMC free article] [PubMed] [Google Scholar]
  206. Zhou H, Voges K, Lin Z, Ju C, Schonewille M. 2015. Differential Purkinje cell simple spike activity and pausing behavior related to cerebellar modules. J. Neurophysiol 113:2524–36 [DOI] [PMC free article] [PubMed] [Google Scholar]
