PLOS Computational Biology. 2021 Mar 25;17(3):e1008866. doi: 10.1371/journal.pcbi.1008866

Learning compositional sequences with multiple time scales through a hierarchical network of spiking neurons

Amadeus Maes 1, Mauricio Barahona 2, Claudia Clopath 1,*
Editor: Abigail Morrison
PMCID: PMC8023498  PMID: 33764970

Abstract

Sequential behaviour is often compositional and organised across multiple time scales: a set of individual elements developing on short time scales (motifs) are combined to form longer functional sequences (syntax). Such organisation leads to a natural hierarchy that can be used advantageously for learning, since the motifs and the syntax can be acquired independently. Despite mounting experimental evidence for hierarchical structures in neuroscience, models for temporal learning based on neuronal networks have mostly focused on serial methods. Here, we introduce a network model of spiking neurons with a hierarchical organisation aimed at sequence learning on multiple time scales. Using biophysically motivated neuron dynamics and local plasticity rules, the model can learn motifs and syntax independently. Furthermore, the model can relearn sequences efficiently and store multiple sequences. Compared to serial learning, the hierarchical model displays faster learning, more flexible relearning, increased capacity, and higher robustness to perturbations. The hierarchical model redistributes the variability: it achieves high motif fidelity at the cost of higher variability in the between-motif timings.

Author summary

The brain has the ability to learn and execute sequential behaviour on multiple time scales. This behaviour is often compositional: a set of simple behaviours is concatenated to create a complex behaviour. Technological improvements increasingly shine light on the building blocks of compositional behaviour, yet the underlying neural mechanisms remain unclear. Here, we propose a hierarchical model to study the learning and execution of compositional sequences, using bio-plausible neurons and learning rules. We compare the hierarchical model with a serial version of the model. We demonstrate that the hierarchical model is more flexible, efficient and robust by exploiting the compositional nature of the sequences.

Introduction

Many natural behaviours are compositional: complex patterns are built out of combinations of a discrete set of simple motifs [1–3]. Compositional sequences unfolding over time naturally lead to the presence of multiple time scales—a short time scale is associated with the motifs and a longer time scale is related to the ordering of the motifs into a syntax. How such behaviours are learnt and controlled is the focus of much current research. Broadly, there are two main strategies for the modelling of sequential behaviour: serial and hierarchical. In a serial model, the long-term behaviour is viewed as a chain of motifs proceeding sequentially, so that the first behaviour in the chain leads to the second and so on (a ‘domino effect’). Serial models present some limitations [4, 5]. Firstly, serial models have limited flexibility since relearning the syntax involves rewiring the chain. Secondly, such models lack robustness, e.g., breaking the serial chain halfway means that the later half of the behaviour is not produced. It has been proposed theoretically that hierarchical models can alleviate these problems, at the cost of extra hardware.

Evidence for the presence of hierarchical structures in the brain is mounting [6–8]. Furthermore, experiments are increasingly shining light on the hierarchical mechanisms of sequential behaviour. An example is movement sequences in multiple animal models, such as Drosophila [9–11], mice [12–14] and C. elegans [15, 16]. Simultaneous recordings of behaviour and neural activity are now possible in order to relate the two together [17, 18]. Songbirds are another example of animals that produce stereotypical sequential behaviour: short motifs are strung together to form songs. In this case, a clock-like dynamics is generated in the premotor nucleus HVC of the bird’s brain, such that neurons are active in sequential bursts of ∼ 10 ms [19]. This activity is thought to control the timing of the spectral content of the song (the within-motif dynamics). The between-motif dynamics has a different temporal structure [20, 21]; hence the ordering of the motifs into a song (the syntax) might be controlled by a different mechanism. Supporting this view, it has been found that learning the motifs and syntax involves independent mechanisms [22]. The computational study of hierarchical structures and compositional behaviour can also lead to insights into the development of human locomotion and language as there are striking conceptual parallels [23–26].

Here, we present a model for learning temporal sequences on multiple scales implemented through a hierarchical network of bio-realistic spiking neurons and synapses. In contrast to current models, which focus on acquiring the motifs and speculate on the mechanisms to learn a syntax [27–29], our spiking network model learns motifs and syntax independently from a target sequence presented repeatedly. Furthermore, the plasticity of the synapses is entirely local, and does not rely on a global optimisation such as FORCE-training [30–32] or backpropagation through time [33]. To characterise the effect of the hierarchical organisation, we compare the proposed hierarchical model to a serial version by looking at their learning and relearning behaviours. We show that, contrary to the serial model, the hierarchical model acquires the motifs independently from the syntax. In addition, the hierarchical model has a higher capacity and is more resistant to perturbations, as compared to a serial model. We also investigate the variability of the neural activity in both models, during spontaneous replay of stored sequences. The organisation of the model shapes the neural variability differently. The within-motif spiking dynamics is less variable in a hierarchical organisation, while the time between the execution of motifs is more variable.

The paper is organised as follows. We start by describing the proposed hierarchical spiking network model and the learning protocol. We then analyse the learning and relearning behaviour of the proposed model, and compare it to the corresponding serial model. Next, we investigate several properties of the model: (i) the performance and consistency of spontaneous sequence replays on a range of learnt sequences; (ii) capacity, i.e., how multiple sequences can be stored simultaneously; (iii) robustness of the sequence replays.

Results

Hierarchical model of spiking neurons with plastic synapses for temporal sequence learning

We design a hierarchical model by combining the following spiking recurrent networks (Fig 1): 1) A recurrent network exhibiting fast sequential dynamics (the fast clock); 2) a recurrent network exhibiting slow sequential dynamics (the slow clock); 3) a series of interneuron networks that store and produce the to-be-learnt ordering of motifs (the syntax networks); 4) a series of read-out networks that store and produce the to-be-learnt motif dynamics (the motif networks). We assume that there are a finite number of motifs and each motif is associated to a separate read-out network (e.g., in Fig 1 there are 2 read-out networks corresponding to motifs A and B). The goal of the model is to learn a complex sequence, with the motifs arranged in a certain temporal order, such that the motifs themselves and the temporal ordering of the motifs are learnt using local plasticity rules.

Fig 1. A cartoon of the model.


Dynamics in the read-out networks (A and B) is learnt and controlled on two time scales. The fast time scale network (fast clock) exhibits sequential dynamics that spans individual motifs. This acts directly on the read-out networks through plastic synapses. These synapses learn the motifs. The slow time scale network (slow clock) exhibits sequential dynamics that spans the entire sequence of motifs. This acts indirectly on the read-out networks through an interneuron network. The synapses from the slow clock to the interneurons are plastic and learn the right order of the motifs, or the syntax. The plastic synapses follow a simple symmetric STDP rule for potentiation, with a constant depression independent of spike time.

Neuronal network architecture

All neurons are either excitatory or inhibitory. Excitatory neurons follow an adaptive exponential integrate-and-fire dynamics and inhibitory neurons follow a standard integrate-and-fire dynamics (see Methods).

The model has two recurrent networks that exhibit sequential dynamics: the fast and slow clocks. The design of the clock networks follows Ref. [29]. Each clock is composed of clusters of excitatory neurons coupled in a cycle with a directional bias (i.e., neurons in cluster i are more strongly connected to neurons in cluster i + 1) together with a central cluster of inhibitory neurons coupled to all the excitatory clusters (Fig 1). This architecture leads to sequential dynamics propagating around the cycle and the period can be tuned by choosing different coupling weights. The individual motifs are not longer than the period of the fast clock and the total length of the sequence is limited to the period of the slow clock. In our case, we set the coupling weights of the fast clock such that a period of ∼ 200 ms is obtained, whereas the weights of the slow clock are set to obtain a period of ∼ 1000 ms.

The fast clock neurons project directly onto the read-out networks associated with each motif, which are learnt and encoded using a supervisor input. Hence the fast clock controls the within-motif dynamics. The slow clock neurons, on the other hand, project onto the interneuron network of inhibitory neurons. The interneuron network is also composed of clusters: there is a cluster associated with each motif, with coupling weights that inhibit all other motif networks, and one cluster associated with the ‘silent’ motif, with couplings that inhibit all motif networks and the fast clock. Hence the temporal ordering of the motifs (the syntax) can be encoded in the mapping that controls the activity of the interneurons driven by the slow clock. As a result of this hierarchical architecture, the model allows for a dissociation of within-motif dynamics and motif ordering. The two pathways, from the fast clock to the read-out and from the slow clock to the interneurons, each control a different time scale of the spiking network dynamics.

Plasticity

Learning is accomplished through plastic synapses under a simple biophysically plausible local STDP rule (see Methods) governing the synapses from the fast clock to the read-out networks (motif synapses) and from the slow clock to the interneurons (syntax synapses). The STDP rule has a symmetric learning window and implements a Hebbian ‘fire together, wire together’ mechanism.

All other weights in the model are not plastic and are fixed prior to the learning protocol. The weights in the fast and slow clocks and the interneuron wiring are assumed to originate from earlier processes during evolution or early development. Previous computational studies have shown that sequential dynamics can be learnt in recurrent networks, both in an unsupervised [34, 35] and supervised [29, 36] fashion.

Learning scheme

During learning, a target sequence is presented. We design a target sequence by combining motifs in any order, e.g., AAB. A time-varying external current, corresponding to the target sequence, projects to the excitatory neurons in the read-out networks. Additionally, a short external current activates the first cluster in the fast clock to signal the onset of a new motif (see Methods for more details). During the presentation of the target sequence, the plastic synapses change. When no target sequence is presented, spontaneous dynamics is simulated. Spontaneous dynamics replays the stored sequence. In this case, there is only random external input and no external input corresponding to a target sequence.
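To make the structure of such a target concrete, the following is a minimal sketch of how a binary AAB target could be laid out, assuming (as in Fig 2) 200 ms motifs, 150 ms silent gaps, read-out networks of 300 excitatory neurons (Table 2) and a 0.1 ms time step; the motif contents themselves are random placeholders, not the patterns used in the paper.

```matlab
% Illustrative layout of a binary AAB target (placeholder motif contents).
dt  = 0.1;                               % ms, simulation time step
Tm  = round(200/dt); Ts = round(150/dt); % motif and silence lengths in steps
motifA = rand(300, Tm) < 0.05;           % placeholder content of motif A
motifB = rand(300, Tm) < 0.05;           % placeholder content of motif B
off    = false(300, Tm);                 % a read-out network that is not driven
gap    = false(300, Ts);                 % silent period
targetA = [motifA gap motifA gap off    gap];   % drive to read-out network A
targetB = [off    gap off    gap motifB gap];   % drive to read-out network B
target  = [targetA; targetB];            % 600 x T binary target for sequence AAB
```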

The model allows for independent learning of motifs and syntax

We first show how a non-trivial sequence can be learned, emphasising the role that each network plays. As an example, consider the target sequence AAB. This sequence is non-trivial as both the within-motif dynamics and the syntax are non-Markovian (Fig 2A). Non-Markovian sequences are generally hard to learn, because they require a memory about past dynamics [37]. The sequential dynamics in the fast and slow clock provide a mechanism to overcome this challenge: by providing time indices, the non-Markovian sequence is essentially transformed into a Markovian sequence. First, we present the target sequence repeatedly to the read-out networks (as shown in S1 Fig). After learning is finished, we test whether learning was successful by checking that the sequence is correctly produced by spontaneous dynamics (Fig 2B–2E). Note that the slow clock spans the entire sequence (Fig 2C) and activates the interneurons in the correct order (Fig 2E), whereas the interneuron dynamics in turn determines the activation of the fast clock (Fig 2B) and the selection of a read-out network (Fig 2D). Through the learning phase, the motif weights (from the fast clock to the read-out networks) evolve to match the target motifs (Fig 2F), and, similarly, the syntax weights (from the slow clock to interneurons) evolve to match the ordering of the motifs in the target sequence (Fig 2G). Crucially, as shown below, these two sets of plastic weights are dissociated into separate pathways so that compositional sequences can be learnt efficiently through this model. The interneuron network coordinates the two pathways using non-plastic lateral inhibition (Fig 2H) and receives non-plastic excitatory input from the fast clock and motif networks (Fig 2I). Note that this conceptual model can be implemented in various ways (see Methods section Changing the slow clock into an all-inhibitory network) but can serve as a general framework for the learning and replay of stereotypical compositional behaviour.

Fig 2. Learning sequence AAB.


A. The target sequence is repeatedly presented to the read-out networks corresponding to motifs A and B. A and B are 200 ms long motifs. Between the motifs, we assume a silent period of 150 ms. B-E. Spontaneous dynamics after learning (50 target presentations). Red dots: excitatory neurons; blue dots: inhibitory neurons. B. The fast clock, controlled by interneurons 201 to 300. C. The slow clock, spanning and driving the entire sequence replay. D. The read-out networks, driven by the fast clock and controlled by the interneurons. E. The interneurons, driven by the slow clock. Neurons 1 − 100 inhibit motif B. Neurons 101 − 200 inhibit motif A. Neurons 201 − 300 shut down both the fast clock and read-out networks. F. The motif synapses show that the target motifs A (neurons 1 − 300 on the y-axis) and B (neurons 301 − 600 on the y-axis) are stored. The weights for motif A are stronger because there are two As in the target sequence and only one B. G. The syntax weights store the temporal ordering A-silent-A-silent-B-silent. H. Non-plastic inhibitory weights from the interneuron network to the read-out network and fast clock. I. Non-plastic excitatory weights from the read-out network and fast clock to the interneuron network.

The hierarchical model enables efficient relearning of the syntax

We next demonstrate the ability of the model to relearn the ordering of the motifs. In general, we wish relearning to be efficient, i.e., the model should relearn the syntax without changing the motifs themselves. To test this idea, we perform a re-learning scheme AAB → ABA (Fig 3). An efficient model would only learn the switch in the syntax without the need to relearn the two motifs A and B. Starting from a network where no sequence was stored, we begin with a learning phase where the sequence AAB is presented (as in Fig 2) until it is learnt. We then switch to presenting the sequence ABA in the relearning phase. To quantify the progress of learning throughout both phases, we simulate spontaneous dynamics after every fifth target sequence presentation and compute the error between the spontaneous dynamics and the target sequence (see Methods and Fig 3).

Fig 3. Relearning syntax: AAB → ABA.


Brown shaded areas: presentation of target sequence AAB; dark green shaded areas: presentation of target sequence ABA. Brown dots: spontaneous dynamics is simulated 3 times, and the error with respect to the target sequence AAB is measured; dark green dots: spontaneous dynamics is simulated 3 times, and the error with respect to the target sequence ABA is measured. Lines guide the eye and are averages of the dots. See the Methods section for the details of the error measurements. A. The within-motif error keeps decreasing independent of the motif ordering. B. The motif ordering error (syntax error) switches with a delay.

Our results show that the motifs are not re-learnt when switching between the first and second target sequences—the within-motif error keeps decreasing after we switch to the relearning phase indicating that there continues to be improved learning of the motifs common to both target sequences (Fig 3A). In contrast, the model relearns the temporal ordering of the motifs after the switch to the new target sequence—the syntax error relative to AAB decreases during the learning phase and then grows during relearning at the same time as the syntax error relative to ABA decreases (Fig 3B). Therefore, the hierarchy of the model allows for efficient relearning: previously acquired motifs can be reordered into new sequences without relearning the motifs themselves.

To investigate the role of the hierarchical organisation, we next studied how the relearning behaviour compares to a serial model with no dissociation between motifs and syntax. The serial model contains only one clock network and the read-out networks associated with each motif, with no interneurons (S2 Fig). In this serial architecture, motifs and syntax are both learnt and replayed by a single pathway (S2 Fig), and, consequently, when relearning the syntax, the motifs are also re-learnt from scratch even when there is no change within the individual motifs. This leads to a slower decay of the sequence error during learning and relearning in the serial model as compared to the hierarchical model (S3 Fig). The speed by which an old syntax is unlearned, and a new syntax is learned, depends on the learning rate of the syntax plasticity (S4 Fig).

The above results illustrate the increased efficiency of the hierarchical model to learn compositional sequences. The separation of motifs and syntax into two pathways, each of them associated with a different time scale and reflected in the underlying neuronal architecture, allows for the learning and control of the different aspects of the sequence independently.

The hierarchical organisation leads to improved learning speed and high motif fidelity

We now study the effects of having a hierarchical organisation on the speed and quality of learning. To do so, we consider three target sequences of increasing complexity, where each target sequence is comprised of a motif presented three times (Fig 4A).

Fig 4. Learning speed and performance of hierarchical and serial models on three target sequences of increasing temporal complexity.


A. Each target sequence consists of three presentations of the same motif (200 ms long) but with increasing complexity from left to right. Left: the simplest motif consists of five 40 ms stimulations. Middle: the motif consists of eight 25 ms stimulations. Right: the motif consists of ten 20 ms stimulations. B. Learning curves for the three target sequences for both the hierarchical and serial models. The same plasticity parameters are used for both models (see Methods). The shaded area indicates one standard deviation from the mean (50 trials). Note that the x-axis has two scales to show the three-fold increase in learning speed of the hierarchical model (i.e., for each learning iteration of the hierarchical model there are three iterations of the serial model). The performance degrades from left to right, as a more difficult target sequence is presented.

First, we studied the speed at which the pattern is learnt by the hierarchical model as compared to the serial model. The hierarchical model is roughly three times faster than the serial model in learning patterns consisting of three repetitions (Fig 4B). This is expected: in the hierarchical model, the same motif synapses are potentiated three times during a single target presentation, whereas no such repeated learning takes place in the serial model. Furthermore, the speed and quality of the learning also depends on the complexity of the target sequence, i.e., target sequences with rapid temporal changes are harder to learn. Learning target sequences with faster-changing, more complex temporal features leads to a degradation of the performance of both models, but the hierarchical model consistently learns roughly three times faster than the serial model for all patterns (Fig 4, left to right).

Another important quality measure of learning is the reliability and consistency of the pattern replayed by the model under spontaneous dynamics. To study this, we generated repeated trials in which the three target sequences learnt (in Fig 4) were replayed spontaneously, and we compared the variability of the read-out dynamics across the trials for both the hierarchical and serial models. We first computed the within-motif and between-motif variability in the spontaneous trials. The hierarchical model leads to low within-motif variability and higher variability in the between-motif timings. This follows from the spiking in the read-out networks, with highly variable time gaps between motifs in the hierarchical model (Fig 5A). On the other hand, the spike trains within the three motifs correlate strongly with each other for the hierarchical model (Fig 5B). This is the case for the three target sequences.

Fig 5. Measuring variability and performance in the read-out dynamics.


A. The time between motifs 1 and 2 and motifs 2 and 3 is measured during spontaneous dynamics. We plot the coefficient of variation of these times (50 trials) on the y-axis, for the three target sequences in Fig 4A. B. The cross correlation between the spike trains in the first motif and the second and third motif is measured, normalized by the auto-correlation of the spike trains in motif 1. The maximum of the cross correlation is recorded in each trial (50 trials). This is repeated for the three target sequences in Fig 4A. C. We measure the error between the target sequence with 25 ms stimulations in Fig 4A and spike trains in motif 1, 2 and 3. In both models, the performance degrades towards later occurring motifs. The degradation is significantly worse in the serial model: a linear regression yields a slope of 0.0163 for the serial model and a slope of 0.0048 for the hierarchical model (p < 10^−5 using a t-test). D. The serial clock (48 clusters) is obtained by adding the slow (28 clusters) and fast (20 clusters) clocks together. Sequential dynamics is simulated 50 times for each clock. The time at which each cluster is activated in the sequential dynamics is measured. The standard deviation of these activation times is plotted as a function of the cluster index. The serial clock has a maximal variability of about 9 ms. The fast and slow clock have a maximal variability of about 3 and 35 ms respectively.

We then studied the consistency of the motif as it is repeated (three times) within a target sequence. We observe a high degradation of the repeated motif towards the end of the sequence in the serial model, which is milder in the hierarchical model (Fig 5C). In summary, the hierarchical model produces accurate motifs that persist strongly over time, but with higher variability in the timing between them. The high reliability of the motifs is due to the stronger learning on the motif synapses discussed above. The higher variability in the inter-motif times is a result of the underlying variability of the periods of the clock networks. As discussed in Ref. [29], the sequential dynamics that underpins the clock networks operates by creating clusters of neurons that are active over successive periods of time. In that sense, the network uses neuron clusters to discretise a span of time (its period) into time increments. The variability of the period of the clock depends on the number of clusters, the number of neurons per cluster in the network, and the time span to discretise. A fast clock will thus have low variability in its period, whereas the slow clock is highly variable. The variability of the period of the serial clock is between the fast and slow clocks (Fig 5D). Consequently, within-motif temporal accuracy is maintained quasi-uniformly over the sequence in a hierarchical model. The price to pay is the addition of appropriately wired interneurons.

The hierarchical organisation reduces the resources needed to store multiple sequences

As shown above, the plasticity of the model allows it to relearn single sequences, yet the relearning process might be too slow for particular situations. In general, animals acquire and store multiple sequences to be used as needed. Motivated by this idea, we explore the capacity of the hierarchical model to learn, store and replay more than one sequence, and we compare it to the alternative serial model. We define capacity here as the number of neurons and synapses needed to store a number of sequences NS. First, we note that a new sequence can be stored in the hierarchical model by adding another interneuron network in parallel. The additional interneuron network is a replica of the existing one, with the same structure and connection weights to the rest of the system.

Each interneuron network learns one syntax, in the same way as one read-out network learns one motif. As an illustration, we learn the sequences AAB and BAAB (Fig 6A), by presenting the target sequences alternately. We then simulate spontaneous dynamics to test that the learning is successful. The spiking dynamics (Fig 6B–6E) show that the model is able to replay the two sequences. To select between the two sequences, we use an attentional external current to the interneuron networks during learning and spontaneous dynamics (shaded areas in Fig 6E). Depending on the interneuron activity, the fast clock (Fig 6B) and read-out networks (Fig 6D) are active. Note that the motifs are encoded in the motif weights (Fig 6F) and syntax weights encode both target motif orderings (Fig 6G). These results show that the hierarchical model can learn, store and replay multiple sequences. Importantly, the motifs are still efficiently re-used: when motifs A and B are learnt by presenting sequence AAB, they can immediately be re-used when a different syntax (e.g., BAAB) is presented.

Fig 6. Spontaneous dynamics after learning two sequences alternately (80 learning iterations).


A. The target sequences. B-E. Red dots: excitatory neurons; blue dots: inhibitory neurons. Brown shaded area: sequence AAB is played by inhibiting the interneurons related to the second sequence; light green shaded area: sequence BAAB is played by inhibiting the interneurons related to the first sequence. B. Spike raster of the fast clock. C. Spike raster of the slow clock. D. Spike raster of the two read-out networks. E. Spike raster of the interneurons. An external attentional inhibitory current selects which sequence is played. F. The motif weights encode the two motifs. Note the similarity with Fig 2F: the same motifs are re-used in both sequences. G. The syntax weights encode the two motif orderings. Note the difference with Fig 2G: an additional syntax is stored. All motif and syntax synapses are plastic at all times during the sequence presentations.

We then compare the efficiency of the hierarchical model to the serial model (S5 Fig). In the serial model, read-out networks have to be added in order to learn and store multiple sequences. This is inefficient for two reasons: 1) The same motif might be stored in different read-out networks, making learning slower; 2) The addition of new read-out networks in the serial model requires more ‘hardware’ (i.e. neurons and synapses) than the addition of an interneuron network in the hierarchical model. In the case where we have a number of sequences NS consisting of two motifs (and using network parameters as detailed in the Methods), we have the following capacities. For the serial model, we have 6000 neurons in the serial clock (of which 4800 are excitatory), and 750 neurons in each read-out network (of which 600 are excitatory). Then the number of neurons needed is 6000 + 750 ⋅ NS, and the number of plastic synapses needed is 4800 ⋅ NS ⋅ 600.

For the hierarchical model, on the other hand, there are 6000 neurons in the fast and slow clocks combined, 750 neurons in the read-out network, and 300 neurons in each interneuron network. Hence the number of neurons needed is 6750 + 300 ⋅ NS, and the number of plastic synapses and non-plastic lateral connections due to the interneuron network is 2000 ⋅ 600 + 2800 ⋅ 300 ⋅ NS + 750 ⋅ 300 ⋅ NS + 600 ⋅ 300 ⋅ NS + 2000 ⋅ 100 ⋅ NS + 200 ⋅ 100 ⋅ NS = (2000 + 2441.67 ⋅ NS) ⋅ 600. For two sequences (NS = 2), we then have 7500 neurons and 5,760,000 synapses for the serial model, whereas the hierarchical model requires 7350 neurons and 4,130,000 synapses, a significant reduction in resources. Even when NS = 1, the hierarchical model uses more neurons than the serial model, but still fewer synapses. In general, the hierarchical model scales substantially more favourably as NS is increased.
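As a quick check of the counts above, the following sketch reproduces them as a function of the number of stored two-motif sequences NS (a worked illustration, not code from the paper's repository):

```matlab
% Resource counts for N_S stored two-motif sequences (sizes from Methods).
N_S = 2;

% Serial model: one 6000-neuron clock (4800 excitatory) plus one set of
% read-out networks (750 neurons, 600 excitatory) per stored sequence.
neurons_serial  = 6000 + 750*N_S;                  % 7500 for N_S = 2
synapses_serial = 4800*600*N_S;                    % 5,760,000 for N_S = 2

% Hierarchical model: fast + slow clocks (6000 neurons), the read-out
% networks (750 neurons, shared), one 300-neuron interneuron network per
% stored sequence.
neurons_hier  = 6750 + 300*N_S;                    % 7350 for N_S = 2
synapses_hier = 2000*600 ...                       % plastic motif synapses (shared)
    + (2800*300 + 750*300 + 600*300 ...            % syntax and lateral connections
    +  2000*100 + 200*100) * N_S;                  % added per stored sequence
% synapses_hier = 4,130,000 for N_S = 2
```

For NS = 2 this returns exactly the values quoted above for both models.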

Finally, we extend the hierarchical model to learn two sequences consisting of six motifs in total (S6 Fig). We generalize the model to include multiple motif durations and observe that the hierarchical model is scalable. A serial model would need 10,500 neurons and 17,280,000 synapses. The hierarchical model uses instead 9650 neurons and 13,630,000 synapses, again a significant reduction in resources. Our results show that the hierarchical model can learn and store multiple sequences by adding more interneuron networks in parallel. This hierarchical organisation thus exploits the compositional nature of the sequences, in a way the serial model cannot, leading to increased capacity. The savings come primarily from the reduced number of synapses.

The hierarchical model displays increased robustness to perturbations of the sequential dynamics

We next investigate the role of the hierarchy in enhancing the robustness of the model to perturbations in the target sequence. Behavioural perturbation experiments have shown that individual motifs can be removed mid-sequence without affecting later-occurring motifs [13]. This is a useful feature which can dramatically improve the robustness of responses, since later-occurring behaviour does not critically depend on the successful completion of all previous behaviours. To examine this issue, we have carried out simulations on the serial and hierarchical models under perturbations in the firing of the neurons in the clock; specifically we remove the external input to excitatory neurons in the clock network. In the serial model, the perturbation leads to the breakdown of the remaining parts of the sequence (Fig 7A), whereas when the same perturbation is applied to the fast clock of the hierarchical model, we see that later motifs are preserved (Fig 7B). The reason is that the dynamics in the slow clock is intact and continues to drive the behaviour. Perturbing the slow clock, and keeping the fast clock intact, has less predictable outcomes for the dynamics. Random activity in the interneurons can cause motifs to be played in a random order (S7 Fig). Overall, the hierarchical model improves the robustness. Indeed, at any point in time, a single cluster of neurons is active in the clock of the serial model, whereas there are two active clusters of neurons (one in the fast clock and another in the slow clock) in the hierarchical model. This separation of time scales is fundamental to preserve the robustness of the model.

Fig 7. Perturbing the dynamics.


We learn sequence AAB and then apply a perturbation. Blue shade indicates the perturbation time, and neurons perturbed. A. 250 ms perturbation of the serial network clock. The targeted neurons (neurons 1000 to 2000) have no excitatory external input during the perturbation. The sequential activity breaks down completely. B. 250 ms perturbation of the fast clock in the hierarchical model. The targeted neurons (neurons 1 to 1000) have no excitatory external input during perturbation. The sequential activity breaks down but is reactivated for the final motif through the interneurons.

Discussion

Summary of results

We have presented here a hierarchical neuronal network model for the learning of compositional sequences. We demonstrated how motifs and syntax can be learnt independently of each other. The hierarchical structure has direct implications for learning, which we demonstrated by contrasting the model with a serial architecture. The hierarchical structure leads to an increased learning speed and the ability to efficiently relearn the ordering of individual motifs. The replays of individual motifs are more similar to each other as compared to replays in the serial model. Separating the motifs and syntax into two different pathways in the hierarchical model also has implications for the resources used and the robustness. The motifs can be re-used in the hierarchical model, leading to a significant reduction in hardware (i.e. neurons and synapses). Finally, the serial model has a single pathway, as opposed to two, and is therefore more prone to perturbations.

From serial to hierarchical modelling

Modelling studies so far have either focused on the study of sequential dynamics [38–41] or on motif acquisition [27–29]. This paper introduces an explicitly hierarchical model as a fundamental building block for the learning and replay of sequential dynamics of a compositional nature. Sequential dynamics is omnipresent in the brain and might be important in time-keeping during behaviour [19, 42–46]. When temporal sequences are compositional (i.e., formed by the ordering of motifs), they lead to the presence of different time scales associated with the motifs and their ordering (or syntax). From the perspective of learning, such multiscale temporal organisation lends itself naturally to a hierarchical organisation, where the different scales are associated with different groups of neurons in the network (see also [47]). While sequential dynamics has been observed, coordination between multiple sequences on different scales, as we propose in this paper, has not been observed experimentally. We thus present this as a prediction that sequences on different scales may organize compositional behaviour in the same brain region or across different brain regions.

Hierarchical temporal structures might arise during development in a variety of ways [23, 48]. One way is that a single protosequence is learnt first. The protosequence covers the entire behaviour, capturing only its crudest aspects. This might then be followed by splitting the protosequence into multiple sequences specialized to different aspects of the behaviour. A similar splitting of sequences has been observed in birdsong [49, 50]. Hierarchical motor control has also been studied in the artificial intelligence field [51]. A recent model works towards closing the gap between such machine systems and biological systems [52], but it remains non-trivial to implement using dynamics and plasticity rules that are considered biologically realistic.

Limitations of the hierarchical model

An important aspect of the hierarchical model is the interneuron network, which coordinates the different time scales. The specificity of the hardwired lateral connectivity to and from the interneuron network is a limitation of the model, but does not require extensive fine tuning, as seen in S8 Fig. An important aspect of sequential behaviour is the ability to vary the speed of execution. In the serial model, the speed can easily be controlled by playing the serial clock faster or slower (see also [29, 36]). In the hierarchical model, this is not as straightforward. One possibility is that the fast and slow clock coordinate the increase or decrease in speed. A different possibility could be that the speed is controlled in a network downstream of the read-out network.

A storage and replay device

The proposed model can be viewed as a biological memory device that stores sequences by means of supervised learning and replays them later by activating the device with spontaneous activity. However, it is important to note that during spontaneous activity there is no input to the device other than the random spike patterns that keep the dynamics of the system going. This mode of operation is therefore distinct from computational machines, such as the liquid state machine [53, 54] or the tempotron [55], where different input patterns are associated with and transformed into different output patterns. Such computational machines, where spiking patterns are inputs to be transformed or classified, are thus complementary to our autonomous memory device.

Hierarchies in other tasks

Hierarchies exist beyond the somewhat simple learning of compositional sequences, and it is expected that hierarchical models share common basic features despite solving highly distinct problems. For instance, a recent example of a hierarchical model for working memory uses two different networks: an associative network and a task-set network [56]. In our setting, the associative network could be identified with the motifs (fast clock+read-out) whereas the task-set network would correspond to the syntax (slow clock+interneurons). Navigation is another typical example of a task where hierarchy is used [57], and the discovery of structure in an environment is closely related to the presence of a hierarchy [58].

Relating the model to experiments

As mentioned above, there are qualitative similarities between the proposed hierarchical model and experimental studies. Experimental studies increasingly point to the importance of hierarchical organisation, both in structural terms and in the learning and execution of movement and auditory sequences. For example, behavioural re-learning has shown that birds can re-order motifs independently from the within-motif dynamics [22]. Optogenetic perturbation in the striatum of mice has shown that individual motifs can be deleted or inserted mid-sequence, without altering the later part of the behavioural sequence [13]. The proposed model aims to provide a conceptual framework to explain such behavioural observations while simultaneously using biophysically realistic spiking networks and plasticity rules.

However, a quantitative link between model and experiment is not trivial. This is true for behaviour, but even more so for neural activity. Indeed, our model has free parameters, including topology and plasticity, which need to be tuned to the task at hand. Nevertheless, there are two recent advances that may help future work in this direction. Firstly, there have been recent technological improvements in recording of behaviour [18, 59] and neural activity [60] along with the possibility to apply perturbations [13]. Secondly, there has been progress in decoding meaningful information from large observational datasets [61], e.g., the extraction of sequences from neural recordings [62] and the analysis of learning behaviour of songbirds [63]. In this vein, an interesting question to pursue is whether one could rediscover the hierarchical structure from temporal data generated by our model. For instance, one could observe a randomly chosen subset of neurons in the model: could the hierarchical organisation and function of the network be inferred from those partial observations by using data analysis?

Conclusion

Using realistic plasticity rules, we built a spiking network model for the learning of compositional temporal sequences of motifs over multiple time scales. We showed that a hierarchical model is more flexible, efficient and robust than the corresponding serial model for the learning of such sequences. The hierarchical model concentrates the variability in the inter-motif timings but achieves high motif fidelity.

Methods

Excitatory neurons (E) are modelled with the adaptive exponential integrate-and-fire model [64]. A classical integrate-and-fire model is used for the inhibitory neurons (I). Motifs and syntax are learnt using simple STDP-rules (see for example [65]) without need for additional fast normalization mechanisms.

Model architecture

The hierarchical model consists of four recurrent networks. Each network and its parameters are described below. Synaptic weights within each recurrent network are non-zero with probability p = 0.2. The synaptic weights in the recurrent networks which produce sequential dynamics are scaled using a scaling factor f ∝ 1/√N, i.e., the factor decreases with the corresponding network size N.

Fast clock (Fc)

The fast clock network has NFcE = 2000 excitatory and NFcI = 500 inhibitory neurons recurrently connected with parameters shown in Table 1. Sequential dynamics is ensured by dividing the excitatory neurons into 20 clusters of 100 neurons. The baseline excitatory weights wFcEE within the same cluster are multiplied by a factor of 25, whereas the excitatory weights from cluster i to cluster i + 1 mod 20 (i = 1…20) are multiplied by a factor of 12.5. Previous studies have shown that such a weight structure leads to sequential dynamics and can be learnt in a biologically plausible way [29, 35, 36]. The last cluster in the fast clock has a special role. It is not inhibited by the ‘silent’ interneurons and as such remains active during silent periods. Once the silent period is over, it restarts the fast clock by activating the first cluster. The fast clock receives excitatory external random Poisson input and inhibitory input from the interneurons, and projects to the read-out networks.

Table 1. Fast clock network parameters.
Constant Value Description
NFcE 2000 Number of recurrent E neurons
NFcI 500 Number of recurrent I neurons
f 0.6325 Scaling factor
wFcEE 5f pF Baseline E to E synaptic strength
wFcIE 3.5f pF E to I synaptic strength
wFcEI 110f pF I to E synaptic strength
wFcII 36f pF I to I synaptic strength
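As an illustration of the clustered weight structure described above, the following sketch builds a fast-clock E-to-E weight matrix with connection probability p = 0.2 and the multipliers from the text. The explicit form of the scaling factor, f = √(1000/N) with N the total number of neurons, is inferred from the values listed in the tables rather than stated in the text, and should be treated as an assumption.

```matlab
% Minimal sketch of the fast-clock E-to-E weight matrix.
% Convention: W(i, j) is the weight from presynaptic neuron j to postsynaptic i.
NE = 2000; NI = 500; nClu = 20; cluSize = 100; p = 0.2;
f   = sqrt(1000/(NE + NI));                 % ~0.6325, cf. Table 1 (assumed form)
wEE = 5*f;                                  % baseline E-to-E strength (pF)

W   = wEE .* (rand(NE, NE) < p);            % sparse baseline connectivity
clu = ceil((1:NE)'/cluSize);                % cluster index of each neuron
same = clu == clu';                         % pre and post in the same cluster
next = clu == mod(clu', nClu) + 1;          % post in the cluster after pre (mod 20)
W(same) = 25   * W(same);                   % within-cluster strengthening
W(next) = 12.5 * W(next);                   % feed-forward bias around the cycle
```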

Read-out networks (R)

Each read-out network codes for one individual motif. There are no overlaps or connections between the different read-out networks. The read-out networks are identical and balanced (see Table 2 for the parameters). The excitatory neurons in the read-out networks receive excitatory input from the plastic motif synapses coming from the fast clock. All read-out neurons receive inhibitory input from the interneurons. All read-out neurons also receive external inputs: a supervisor component (only during learning) and a random input. The results are not sensitive to the exact configuration of the read-out networks (see S9 Fig).

Table 2. Read-out network parameters.
Constant Value Description
NRE 300 Number of recurrent E neurons
NRI 75 Number of recurrent I neurons
wREE 3 pF E to E synaptic strength
wRIE 6 pF E to I synaptic strength
wREI 190 pF I to E synaptic strength
wRII 60 pF I to I synaptic strength

Slow clock (Sc)

The slow clock network has NScE = 2800 excitatory and NScI = 700 inhibitory neurons, recurrently connected. It is essentially a scaled copy of the fast clock. Table 3 shows the parameters of this network. Sequential dynamics is ensured by dividing the excitatory neurons into 28 clusters of 100 neurons. The baseline excitatory weights wScEE within the same cluster are multiplied by a factor of 25, and the excitatory weights from cluster i to cluster i + 1 mod 28 (i = 1…28) are multiplied by a factor of 4.7. The slow clock receives excitatory external random Poisson input and projects to the interneuron networks.

Table 3. Slow clock network parameters.
Constant Value Description
NScE 2800 Number of recurrent E neurons
NScI 700 Number of recurrent I neurons
f 0.5345 Scaling factor
wScEE 5f pF Baseline E to E synaptic strength
wScIE 3.5f pF E to I synaptic strength
wScEI 110f pF I to E synaptic strength
wScII 36f pF I to I synaptic strength

Interneuron networks (In)

Each interneuron network codes for one syntax. There are no overlaps between the interneuron networks. Each interneuron network is balanced with parameters given in Table 4. Neurons within each interneuron network are grouped into 3 groups of 100 neurons: one group per motif and one group for the ‘silent’ motif. The ‘silent’ motif inhibits all clusters in the fast clock except the last one, and also silences all read-out motifs. The interneuron networks receive excitatory input from all other networks. They also receive random excitatory external input.

Table 4. Interneuron network parameters.
Constant Value Description
NInI 300 Number of recurrent I neurons
wInII 25 pF I to I synaptic strength

Connections between recurrent networks

The recurrent networks are connected to each other to form the complete hierarchical architecture. All excitatory neurons from the fast clock project to all excitatory neurons in the read-out networks. These synapses, w0M, are plastic. All excitatory neurons from the slow clock project to all the interneurons. These synapses, w0S, are also plastic. To signal the end of a motif, the penultimate cluster of the fast clock activates the interneurons of the ‘silent’ motif. The last cluster is also connected to the ‘silent’ interneuron group, which silences all other clusters in the fast clock and all neurons in the read-out networks. Each read-out network gives excitatory input to its corresponding interneuron group. This interneuron group laterally inhibits the other read-out network(s). Table 5 gives all the parameters of the connections between the different networks. To understand the limitations of the model, we test a range of lateral connectivity parameters (S8 Fig).

Table 5. Connections between four networks.
Constant Value Description
w0M 0.3 pF Initial motifs synaptic strengths
w0S 0.1 pF Initial syntax synaptic strengths
wRIn 50 pF In to R synaptic strength of lateral inhibition
wRIn 20 pF In to R synaptic strength of silencing motif
wInR 0.4 pF R to In synaptic strength
wFcIn 20 pF In to Fc synaptic strength of silencing motif
wInFc 1.5 pF Penultimate Fc cluster to In synaptic strength
wInFc 0.4 pF Last Fc cluster to In synaptic strength
wInFc 0 pF Other Fc clusters to In synaptic strength

Serial model (Sm)

The hierarchical model is compared with a serial model. The serial model has one large clock (with the same number of neurons as the fast and slow clocks combined) and no interneurons. Sequential dynamics is generated by grouping the neurons in the network into 48 clusters of 100 neurons. The baseline excitatory weights wSmEE within the same cluster are multiplied by a factor of 25, and the excitatory weights from group i to group i + 1 mod 48 (i = 1…48) are multiplied by a factor of 6. Table 6 shows the network parameters. The read-out network is kept unchanged (Table 2).

Table 6. Serial model clock network parameters.
Constant Value Description
NSmE 4800 Number of recurrent E neurons
NSmI 1200 Number of recurrent I neurons
f 0.4082 Scaling factor
wSmEE 5f pF Baseline E to E synaptic strength
wSmIE 3.5f pF E to I synaptic strength
wSmEI 110f pF I to E synaptic strength
wSmII 36f pF I to I synaptic strength

Neural and synaptic dynamics

All neurons in the model are either excitatory (E) or inhibitory (I). The parameters of the neurons do not change depending on which network they belong to. Parameters are consistent with Ref. [66].

Membrane potential dynamics

The membrane potential of the excitatory neurons (VE) has the following dynamics:

\frac{dV_E(t)}{dt} = \frac{1}{\tau_E}\left(E_L^E - V_E(t) + \Delta_T^E \exp\left(\frac{V_E(t) - V_T^E}{\Delta_T^E}\right)\right) + \frac{g^{EE}\,(E_E - V_E(t))}{C} + \frac{g^{EI}\,(E_I - V_E(t))}{C} - \frac{a_E}{C} \qquad (1)

where τ_E is the membrane time constant, E_L^E is the resting potential, Δ_T^E is the slope of the exponential, C is the capacitance, g^{EE} and g^{EI} are the synaptic inputs from excitatory and inhibitory neurons respectively, and E_E and E_I are the excitatory and inhibitory reversal potentials respectively. When the membrane potential diverges and exceeds 20 mV, the neuron fires a spike and the membrane potential is reset to V_r. This reset potential is the same for all neurons in the model. There is an absolute refractory period of τ_abs. The parameter V_T^E is adaptive for excitatory neurons and is set to V_T^E + A_T after a spike, relaxing back to V_T with time constant τ_T:

\tau_T \frac{dV_T^E}{dt} = V_T - V_T^E \qquad (2)

The adaptation current aE for excitatory neurons follows:

\tau_a \frac{da_E}{dt} = -a_E + \alpha\left(V_E - E_L^E\right) \qquad (3)

where τ_a is the time constant for the adaptation current. The adaptation current is increased by a constant β when the neuron spikes.

The membrane potential of the inhibitory neurons (VI) has the following dynamics:

\frac{dV_I(t)}{dt} = \frac{E_L^I - V_I(t)}{\tau_I} + \frac{g^{IE}\,(E_E - V_I(t))}{C} + \frac{g^{II}\,(E_I - V_I(t))}{C} \qquad (4)

where τ_I is the inhibitory membrane time constant, E_L^I is the inhibitory resting potential, and E_E, E_I are the excitatory and inhibitory reversal potentials respectively. g^{IE} and g^{II} are the synaptic inputs from excitatory and inhibitory neurons respectively. Inhibitory neurons spike when the membrane potential crosses the threshold V_T, which is non-adaptive. After this, there is an absolute refractory period of τ_abs. There is no adaptation current (see Table 7 for the parameters of the membrane dynamics).

Table 7. Neuronal membrane dynamics parameters.
Constant Value Description
τE 20 ms E membrane potential time constant
τI 20 ms I membrane potential time constant
τabs 5 ms Refractory period of E and I neurons
EE 0 mV excitatory reversal potential
EI −75 mV inhibitory reversal potential
ELE −70 mV excitatory resting potential
ELI −62 mV inhibitory resting potential
Vr −60 mV Reset potential (both E and I)
C 300 pF Capacitance
ΔTE 2 mV Exponential slope
τT 30 ms Adaptive threshold time constant
VT −52 mV Membrane potential threshold
AT 10 mV Adaptive threshold increase constant
τa 100 ms Adaptation current time constant
α 4 nS Adaptation current factor
β 0.805 pA Adaptation current increase constant
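For concreteness, the following is a minimal sketch of a single forward-Euler step of the excitatory dynamics (Eqs 1–3), using the constants of Table 7 and the 0.1 ms time step mentioned in the Numerical simulations section. The synaptic conductances are placeholders and the refractory period is omitted for brevity; this is an illustration, not the repository code.

```matlab
% One Euler step of the excitatory AdEx neuron (Eqs 1-3, constants of Table 7).
dt = 0.1;                                   % ms
tauE = 20; EL = -70; DT = 2; C = 300; EE = 0; EI = -75;
VTbase = -52; AT = 10; tauT = 30; taua = 100; alpha = 4; beta = 0.805; Vr = -60;

V = EL; VT = VTbase; a = 0;                 % potential, adaptive threshold, adaptation
gE = 2; gI = 1;                             % placeholder synaptic conductances

dV = (EL - V + DT*exp((V - VT)/DT))/tauE ...
   + gE*(EE - V)/C + gI*(EI - V)/C - a/C;   % Eq 1
V  = V + dt*dV;
VT = VT + dt*(VTbase - VT)/tauT;            % Eq 2: relax threshold back to V_T
a  = a  + dt*(-a + alpha*(V - EL))/taua;    % Eq 3: adaptation current
if V > 20                                   % spike: reset, raise threshold, bump a
    V = Vr; VT = VT + AT; a = a + beta;
end
```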

Synaptic dynamics

The synaptic conductance g of a neuron i is time dependent: it is a convolution of a kernel with the total input to neuron i:

g_i^{XY}(t) = K^Y(t) \ast \left( W_{ext}^X\, s_{i,ext}^X + \sum_j W_{ij}^{XY}\, s_j^Y(t) \right) \qquad (5)

where X and Y can be either E or I. K is the difference of exponentials kernel:

K^Y(t) = \frac{e^{-t/\tau_d^Y} - e^{-t/\tau_r^Y}}{\tau_d^Y - \tau_r^Y},

with a decay time τ_d^Y and a rise time τ_r^Y dependent only on whether the neuron is excitatory or inhibitory. The conductance is a sum of recurrent input and external input. The externally incoming spike trains s_ext^X are generated from a Poisson process with rates r_ext^X. The externally generated spike trains enter the network through synapses W_ext^X (see Table 8 for the parameters of the synaptic dynamics).

Table 8. Synaptic dynamics parameters.
Constant Value Description
τdE 6 ms E decay time constant
τrE 1 ms E rise time constant
τdI 2 ms I decay time constant
τrI 0.5 ms I rise time constant
WextE 1.6 pF External input synaptic strength to E neurons
rextE 4.5 kHz Rate of external input to E neurons
WextI 1.52 pF External input synaptic strength to I neurons
rextI 2.25 kHz Rate of external input to I neurons
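A small sketch of the conductance computation in Eq 5 for a single excitatory input channel follows; the spike times are placeholders and the use of WextE as the weight is only an example.

```matlab
% Difference-of-exponentials kernel (Eq 5) applied to a toy spike train.
dt = 0.1; t = 0:dt:50;                         % ms
taud = 6; taur = 1;                            % E decay and rise times (Table 8)
K = (exp(-t/taud) - exp(-t/taur)) / (taud - taur);   % kernel

s = zeros(size(t));
s([100 250 400]) = 1/dt;                       % three presynaptic spikes (placeholder)
w = 1.6;                                       % e.g. WextE (Table 8), in pF
g = w * conv(s, K) * dt;                       % conductance trace
g = g(1:numel(t));                             % trim to the simulation window
```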

Plasticity

Motif plasticity

The synaptic weight from excitatory neuron j in the fast clock network to excitatory neuron i in the read-out network is changed according to the following differential equation:

\frac{dW_{ij}^M(t)}{dt} = -A_{dep}^M + A_{pot}^M\left(y_i(t)\, s_j(t) + y_j(t)\, s_i(t)\right) \qquad (6)

where A_pot^M and A_dep^M are the amplitudes of potentiation and depression, s_i(t) is the spike train of the postsynaptic neuron, and s_j(t) is the spike train of the presynaptic neuron. Both pre- and post-synaptic spike trains are low-pass filtered with time constant τ_M to obtain y(t):

\tau_M \frac{dy(t)}{dt} = s(t) - y(t) \qquad (7)

The synapses from the fast clock to the read-out network have a lower and upper bound [WminM,WmaxM]. Table 9 shows parameter values for the motif plasticity rule.

Table 9. Motif plasticity parameters.
Constant Value Description
ApotM 0.03 pFHz Amplitude of potentiation
AdepM 2/3 × 10^−6 pF Amplitude of depression
τM 5 ms Time constant of low pass filter
WminM 0 pF Minimum motif weight
WmaxM 1 pF Maximum motif weight
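The following is a minimal discrete-time sketch of how Eqs 6 and 7 could be applied at every time step (saved as its own file, stdp_step.m). The function name, the matrix convention (postsynaptic × presynaptic) and the sign of the depression term follow the description above, but the code is an illustration rather than the repository implementation.

```matlab
function [W, yPre, yPost] = stdp_step(W, sPre, sPost, yPre, yPost, p, dt)
% One Euler step of the symmetric STDP rule (Eqs 6-7). W is postsynaptic x
% presynaptic; sPre/sPost are 0/1 spike indicators (column vectors); yPre/yPost
% are the corresponding low-pass filtered traces. Illustrative sketch only.
    yPre  = yPre  + (sPre  - dt*yPre ) / p.tau;          % Eq 7, presynaptic traces
    yPost = yPost + (sPost - dt*yPost) / p.tau;          % Eq 7, postsynaptic traces
    dW = -p.Adep + p.Apot * (yPost*sPre' + sPost*yPre'); % Eq 6, Hebbian + depression
    W  = min(max(W + dt*dW, p.Wmin), p.Wmax);            % hard bounds [Wmin, Wmax]
end
```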

Syntax plasticity

Similar to the motif plasticity rule, the syntax plasticity rule has a symmetric window. The dynamics is therefore governed by the same equations, with slightly different parameters:

\frac{dW_{ij}^S(t)}{dt} = -A_{dep}^S + A_{pot}^S\left(y_i(t)\, s_j(t) + y_j(t)\, s_i(t)\right) \qquad (8)

where s_i(t) is the spike train of the postsynaptic neuron, and s_j(t) is the spike train of the presynaptic neuron. The spike trains are low-pass filtered with time constant τ_S to obtain y(t) (as in Eq 7). The synapses from the slow clock to the interneurons have a lower and upper bound [W_min^S, W_max^S]. Table 10 shows parameter values for the syntax plasticity rule. Note that the time constant is longer than in the motif plasticity.

Table 10. Syntax plasticity parameters.
Constant Value Description
ApotS 0.025 pFHz Amplitude of potentiation
AdepS 0.10 × 10^−5 pF Amplitude of depression
τS 20 ms Time constant of low pass filter
WminS 0 pF Minimum syntax weight
WmaxS 0.3 pF Maximum syntax weight
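Under the same assumptions, the illustrative stdp_step above can be reused for the syntax synapses by swapping in the parameters of Table 10; here WS, sSlowClock, sInterneurons, ySc and yIn are assumed to be the syntax weight matrix, the current spike indicators and their traces inside the simulation loop.

```matlab
% Reusing the illustrative stdp_step with the syntax parameters of Table 10.
pS = struct('Apot', 0.025, 'Adep', 0.10e-5, 'tau', 20, 'Wmin', 0, 'Wmax', 0.3);
[WS, ySc, yIn] = stdp_step(WS, sSlowClock, sInterneurons, ySc, yIn, pS, 0.1);
```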

Measuring the error

Motif error

The fast clock and motif networks are uncoupled from the slow clock and the interneuron network. We simulate spontaneous dynamics in the fast clock and motif networks by giving an input to the first cluster of the fast clock. We end the simulation after one fast clock sequence is completed. The spike trains of the excitatory neurons in the motif networks are compared to the individual target motifs (e.g. A or B), which are binary. The spike trains are first convolved using a Gaussian kernel of width ∼ 10 ms. This gives a proxy for the firing rates of the neurons. The firing rates are then normalized between 0 and 1. Dynamic time warping is finally used to compare the normalized spontaneous dynamics to the target sequence. Dynamic time warping is needed to remove the timing variability in the spontaneous dynamics. We computed dynamic time warping using the built-in Matlab function dtw. Dynamic time warping was not used to compute the error in Fig 5.
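A sketch of this error computation, assuming spikes and target are neurons-by-time matrices binned at the simulation time step; the ∼ 10 ms Gaussian and the built-in dtw (Signal Processing Toolbox) follow the text, while the remaining names and values are placeholders.

```matlab
% Smooth, normalise and compare spike trains to a binary target with DTW.
dt = 0.1; sigma = 10;                               % ms, kernel width from the text
spikes = rand(600, 5000) < 0.01;                    % placeholder spike raster
target = rand(600, 5000) < 0.01;                    % placeholder binary target

tk = -4*sigma:dt:4*sigma;
kernel = exp(-tk.^2/(2*sigma^2));
kernel = kernel / sum(kernel);

rates = conv2(1, kernel, double(spikes), 'same');   % smooth each neuron over time
lo = min(rates, [], 2); hi = max(rates, [], 2);
rates = (rates - lo) ./ max(hi - lo, eps);          % normalise each neuron to [0, 1]

err = dtw(rates, double(target));                   % dynamic time warping distance
```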

The ordering error

Spontaneous dynamics is simulated using the complete model. The target sequence is now the binary target dynamics of the interneurons. Similarly as described above, the spike trains of the interneurons are convolved and normalized to compute the error with the target using dynamic time warping.

Total error

Spontaneous dynamics is simulated using the complete model. The spike trains of the excitatory read-out neurons are compared to a binary target sequence to measure the error during learning. The target sequence is the entire sequence of motifs (e.g. AAB). The spontaneous spiking dynamics is convolved and normalized, as described above, to compute the error with the target using dynamic time warping.

Numerical simulations

Protocol—Learning

A start current of 5 kHz is given for 10 ms to the first cluster of the slow clock to initiate a training session. Strong supervising input (50 kHz, see also S8 Fig) to the read-out networks controls the dynamics in the read-out networks. The weights from the read-out networks to the interneurons make sure that also the interneurons follow the target ordering: there is no need for an explicit target current to the interneurons. At the start of each motif the fast clock is activated by giving a strong current of 50 kHz to the first cluster for 40 ms. The high supervisor currents are assumed to originate from a large network of neurons, external to this model.

Protocol—Spontaneous dynamics

A start current of 5 kHz is given for 10 ms to the first cluster of the slow clock to initiate a spontaneous replay. The slow clock determines which interneurons are active, together with an external attention mechanism (if multiple sequences are stored). The interneurons then determine which read-out network is active. The fast dynamics in the read-out networks is controlled by the input from the fast clock.

Simulations

The code used for the training and testing of the spiking network model is built in Matlab. Forward Euler discretisation with a time step of 0.1 ms is used. The code is available on GitHub: https://github.com/amaes-neuro/compositional-sequences.

Changing the slow clock into an all-inhibitory network

The hierarchical model is composed of four networks, each of which can be implemented in various ways. To illustrate this, we implement the slow clock differently here (Fig 8, to be compared with Fig 1). Sequential dynamics can also be obtained with an all-inhibitory network (see for example [36]). Learning the sequence AAB with this alternative hierarchical model leads to similar results (Fig 9, to be compared with Fig 2). Table 11 shows the parameters of the new all-inhibitory slow clock; the other networks are unchanged. Sequential dynamics in the slow clock is ensured by grouping the inhibitory neurons into 20 clusters of 100 neurons. The inhibitory weights w_II within the same cluster are multiplied by a factor of 1/30. The inhibitory weights from cluster i to cluster i + 1 mod 20 (i = 1, …, 20) are multiplied by a factor of 1/2. This weight structure does not lead to sequential dynamics by itself; some form of adaptation has to be introduced. To this end, short-term depression is used:

\tau_{x_d} \frac{dx_d(t)}{dt} = 1 - x_d(t) \qquad (9)

where x_d is a depression variable for each neuron in the all-inhibitory network, and τ_{x_d} = 200 ms. The variable is decreased by 0.07 x_d(t) when the neuron spikes. The outgoing weights of each neuron in the network are multiplied by this depression variable. The slow clock receives excitatory external random Poisson input and projects to the interneuron networks. The syntax synapses follow the same dynamics as Eq 8, but with the right hand side of the equation multiplied by −1 (an inverted STDP window). The parameters are summarized in Table 12.
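The weight structure and the depression dynamics described above can be sketched in Matlab as follows. The weight-indexing convention W(post, pre) and the omission of the neuron dynamics and external input are simplifying assumptions of the sketch.

    % Minimal sketch (illustrative) of the all-inhibitory slow clock: the clustered
    % weight structure plus the short-term depression of Eq 9. Neuron dynamics and
    % external Poisson input are omitted; W(post, pre) is an assumed convention.
    NI = 2000; nclust = 20; csize = 100;     % 20 clusters of 100 inhibitory neurons
    wII = 30;                                % pF, baseline I-to-I strength (Table 11)
    W = wII * ones(NI);                      % all-to-all inhibitory weights
    for i = 1:nclust
        pre  = (i-1)*csize + (1:csize);      % neurons in cluster i
        nxt  = mod(i, nclust)*csize + (1:csize);   % cluster i+1 (mod 20)
        W(pre, pre) = wII/30;                % weaker inhibition within a cluster
        W(nxt, pre) = wII/2;                 % weaker inhibition onto the next cluster
    end
    tau_xd = 200; dt = 0.1;                  % ms
    xd = ones(NI, 1);                        % depression variable per neuron (Eq 9)
    % inside the simulation loop, with 'spiked' the indices of spiking neurons:
    %   xd = xd + dt*(1 - xd)/tau_xd;                 % relaxation back towards 1
    %   xd(spiked) = xd(spiked) - 0.07*xd(spiked);    % decrement on each spike
    %   Weff = W .* repmat(xd', NI, 1);               % scale outgoing weights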

Fig 8. The networks in the model can have different components.

Fig 8

The slow clock is replaced by an all-inhibitory network (compare with Fig 1). The syntax synapses follow the same STDP rule as the motif synapses, only inverted.

Fig 9. Learning sequence AAB with an inhibitory slow clock network.

Fig 9

The target sequence is repeatedly presented to the read-out network. A-D. Spontaneous dynamics is simulated after learning (85 target presentations). Red dots: excitatory neurons; blue dots: inhibitory neurons. A. The fast clock, controlled by interneurons 201 to 300. B. The slow clock, consisting of only inhibitory neurons, inhibits the interneurons in the correct order after learning. C. The read-out networks, driven by the fast clock and controlled by the interneurons. D. The interneurons, controlled by the slow clock. E. The motif synapses show that the target motifs A and B are stored after learning. F. The syntax weights store the correct temporal ordering A-silent-A-silent-B-silent.

Table 11. Slow clock inhibitory network parameters.

Constant Value Description
N_I 2000 Number of recurrent I neurons
w_II 30 pF I to I synaptic strength

Table 12. Syntax plasticity parameters.

Constant Value Description
A_pot^S 0.03 pF Hz Amplitude of potentiation
A_dep^S 0.25 × 10^−5 pF Hz Amplitude of depression
τ_S 25 ms Time constant of low pass filter
W_min^S 0 pF Minimum I to I weight
W_max^S 0.3 pF Maximum I to I weight

Supporting information

S1 Text. Extending the model to more and variable motif lengths.

We extend the model such that it can learn sequences consisting of more than two motifs, with variable durations. In the main text, each motif has the same duration, so the supervisor only needs to provide a starting signal to the fast clock, indicating when a motif starts. In general, a motif can be shorter than a fast clock sequence. In that case, the supervisor has to provide a stop signal to the fast clock, indicating when a motif ends. This stop signal activates the penultimate cluster in the fast clock, which in turn activates the ‘silent’ interneurons. The stop signal is 10 ms long and has the same rate as the start signal. We learn example sequences to illustrate this (S6 Fig; a timing sketch follows below). Specifically, we learn the sequences ABCD and EBCF. Motifs A and B are both 200 ms long. Motifs C, D, E and F are respectively 150 ms, 120 ms, 180 ms and 100 ms long. To keep the sequences as general as possible, we also include variable inter-motif intervals. The silent gaps between motifs A and B, motifs B and C, and motifs C and D are respectively 70 ms, 50 ms, and 80 ms. The silent gaps in the second sequence between motifs E and B, motifs B and C, and motifs C and F are respectively 70 ms, 50 ms, and 150 ms. We observe that the model is able to learn the two sequences, but the replay of the shorter motifs D and F is less accurate. The parameters used in this simulation are the same as in the other simulations, with an increased network size for the interneuron networks and the read-out network.

(PDF)
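As a worked example of the timing described in S1 Text, the sketch below computes the motif start and stop times for the sequence ABCD from the listed durations and silent gaps (times in ms, relative to the start of the sequence). The variable names are illustrative.

    % Minimal sketch (illustrative): start/stop signal times for sequence ABCD,
    % computed from the motif durations and inter-motif gaps given in S1 Text.
    durations = [200 200 150 120];   % ms, motifs A, B, C, D
    gaps      = [70 50 80];          % ms, silent gaps A-B, B-C, C-D
    starts = zeros(1,4); stops = zeros(1,4);
    t = 0;
    for m = 1:4
        starts(m) = t;               % a 10 ms start signal is given to the fast clock
        stops(m)  = t + durations(m);% a 10 ms stop signal ends the motif
        if m < 4
            t = stops(m) + gaps(m);  % the next motif begins after the silent gap
        end
    end
    % starts = [0 270 520 750] ms, stops = [200 470 670 870] ms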

S1 Fig. Dynamics of the hierarchical model during target sequence presentation.

A. The first cluster of the fast clock receives a high input current at the start of each motif presentation. B. The first cluster of the slow clock receives a high input current at the beginning of the sequence presentation. C. The high input current forces spiking in the read-out neurons. D. The read-out neurons activate the interneurons.

(TIF)

S2 Fig. The serial network model.

A. A single recurrent network clock (left) produces sequential dynamics and drives the dynamics in the read-out networks (right). The weights from the serial clock to the read-out network are plastic. B. We learn target sequence AAB. Spontaneous dynamics is simulated after 90 target sequence presentations. C. The read-out weights after learning. Both motif and syntax information are stored in the same weights.

(TIF)

S3 Fig. Total sequence error for hierarchical and serial model, during relearning AAB → ABA.

Spontaneous dynamics is simulated every fifth training iteration and compared with target sequence AAB (brown line) and target sequence ABA (dark green line) to compute the total sequence error. A. Total sequence error for the hierarchical model. Note how the total sequence error (which is the combination of within-motif error and syntax error) relative to AAB decreases for about 30 iterations after target ABA is presented for the first time, due to the continued improvement in the within-motif dynamics. After this, there is a marked increase in the syntax error and the total error relative to AAB. B. Total sequence error for the serial model. The lack of hierarchy in the serial model implies that both the within-motif dynamics and the motif ordering have to be relearned. This leads to a more gradual and slower relearning (note the longer x-axis).

(TIF)

S4 Fig. Total sequence error for various learning rates, during relearning AAB → ABA.

Spontaneous dynamics is simulated every fifth training iteration and compared with target sequence AAB (brown line) and target sequence ABA (dark green line) to compute the total sequence error. The lines show the average of 5 simulations. A. The solid line shows the same total error as in S3(A) Fig (the baseline). The dashed line shows the total error when learning faster: the right hand side of Eq 8 is multiplied by a factor of 2. B. The total error when learning slower: the right hand side of Eq 8 is divided by a factor of 2. More iterations are shown because the model needs more time to learn the sequences.

(TIF)

S5 Fig. Learning two sequences.

The hierarchical model requires an additional interneuron network. An external current is assumed to inhibit the interneurons for sequence BAAB when sequence AAB is presented and vice versa. The serial model duplicates the entire read-out network. Here also, an external current is assumed to inhibit the read-out networks for sequence BAAB when sequence AAB is presented and vice versa.

(TIF)

S6 Fig. Learning 2 sequences with variable motif durations and variable inter-motif intervals.

A. The two target sequences. Individual motifs have durations between 100 and 200 ms. Inter-motif intervals range from 50 to 150 ms. B-E. Red dots: excitatory neurons; blue dots: inhibitory neurons. Brown shaded area: sequence ABCD is played by inhibiting the interneurons related to the second sequence; light green shaded area: sequence EBCF is played by inhibiting the interneurons related to the first sequence. B. Spike raster of the fast clock. C. Spike raster of the slow clock. D. Spike raster of the six read-out networks. E. Spike raster of the interneurons. An external attentional inhibitory current selects which sequence is played. F. The motif weights encode the six motifs. Note that motifs B and C are learned more strongly as they occur in both sequences. G. The syntax weights encode the two sequences. All motif and syntax synapses are plastic at all times during the sequence presentations (see S1 Text for Method details).

(TIF)

S7 Fig. Perturbing the slow clock of the hierarchical network.

Blue shade indicates the perturbation time, during which all excitatory neurons receive no external input for 250 ms. The sequential dynamics in the slow clock breaks down (top right), but random activity in the interneurons (bottom right) leads to sequences in the fast clock (top left), which in turn leads to motif replays (bottom left).

(TIF)

S8 Fig. Limitations on parameters.

A. Spontaneous dynamics is simulated for a range of parameters, for a model that has learned sequence ABA. The potentiated motif synapses have values between 0.7 pF and 1 pF. Raster plots of the read-out network are shown. The lateral inhibition wRIn and the lateral excitation wInR are varied. When the lateral inhibition is too weak, the motifs occur at the same time (top left panel). When the lateral inhibition is sufficiently strong, the motifs are replayed well (bottom right panel). B. A supervisor gives input ABA to the read-out network, for a model that has stored sequence AAB. When the supervisor input is too low (left panel), the stored sequence dominates the dynamics in the read-out network and there will be no relearning. When the supervisor input is sufficiently high (right panel), the stored sequence is overwritten by the supervisor input and there will be relearning.

(TIF)

S9 Fig. Learning curves for different read-out configurations.

The read-out network in the main text consists of two separate networks, which are not interconnected. A. Cartoon of the read-out network without recurrent excitatory connections. B-D: The learning curves when the recurrent connections in the two separate motif networks are zero. The same relearning protocol as in Fig 3 and S4 Fig is used. E. Cartoon of the read-out network when the two motif networks are combined and interconnected into one network. F-H: The learning curves when the two motif networks are combined and interconnected into one network. In this case, the same connections as listed in Table 2 are used but multiplied by 1/2, and N_RE = 600, N_RI = 150. The sparsity of the connections remains p = 0.2. The same relearning protocol as in Fig 3 and S4 Fig is used.

(TIF)

Acknowledgments

We thank Victor Pedrosa and Barbara Feulner for helpful comments.

Data Availability

The code is available online at url: https://github.com/amaes-neuro/compositional-sequences.

Funding Statement

AM acknowledges funding through the EPSRC Centre for Neurotechnology (https://epsrc.ukri.org/skills/students/centres/2013-cdt-exercise/neurotechnologyforlifeandhealth/). MB acknowledges funding through EPSRC award EP/N014529/1 supporting the EPSRC Centre for Mathematics of Precision Healthcare at Imperial (https://www.imperial.ac.uk/mathematics-precision-healthcare). CC acknowledges support by BBSRC BB/N013956/1 (https://bbsrc.ukri.org/), BB/N019008/1, Wellcome Trust 200790/Z/16/Z (https://wellcome.org/?gclid=CjwKCAjw5Kv7BRBSEiwAXGDElTTydWe4MWJu_2waXdH7DsTdOym3ijPyGHfHePBEuai0XKfJa5RbIRoC3KcQAvD_BwE), Simons Foundation 564408 (https://www.simonsfoundation.org/) and EPSRC EP/R035806/1 (https://epsrc.ukri.org/). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Tresch MC, Saltiel P, Bizzi E. The construction of movement by the spinal cord. Nature Neuroscience. 1999;2(2):162–167. 10.1038/5721 [DOI] [PubMed] [Google Scholar]
  • 2. Bizzi E, Cheung VCK, D’Avella A, Saltiel P, Tresch M. Combining modules for movement. Brain Research Reviews. 2008;57(1):125–133. 10.1016/j.brainresrev.2007.08.004 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3. Wiltschko AB, Johnson MJ, Iurilli G, Peterson RE, Katon JM, Pashkovski SL, et al. Mapping Sub-Second Structure in Mouse Behavior. Neuron. 2015;88(6):1121–1135. 10.1016/j.neuron.2015.11.031 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4. Lashley KS. The Problem of Serial Order in Behavior. Cerebral Mechanisms in Behavior. 1951;21:112–146. [Google Scholar]
  • 5. Houghton G, Hartley T. Parallel models of serial behavior: Lashley revisited. Psyche. 1995;2(25):1–25. [Google Scholar]
  • 6. Tanji J. Sequential Organization of Multiple Movements: Involvement of Cortical Motor Areas. Annual Review of Neuroscience. 2001;24:631–651. 10.1146/annurev.neuro.24.1.631 [DOI] [PubMed] [Google Scholar]
  • 7. Kiebel SJ, Daunizeau J, Friston KJ. A hierarchy of time-scales and the brain. PLoS Computational Biology. 2008;4(11). 10.1371/journal.pcbi.1000209 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Murray JD, Bernacchia A, Freedman DJ, Romo R, Wallis JD, Cai X, et al. A hierarchy of intrinsic timescales across primate cortex. Nature Neuroscience. 2014;17(12):1661–1663. 10.1038/nn.3862 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. Seeds AM, Ravbar P, Chung P, Hampel S, Midgley FM, Mensh BD, et al. A suppression hierarchy among competing motor programs drives sequential grooming in Drosophila. eLife. 2014;3:e02951. 10.7554/eLife.02951 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10. Berman GJ, Bialek W, Shaevitz JW. Predictability and hierarchy in Drosophila behavior. Proceedings of the National Academy of Sciences of the United States of America. 2016;113(42):11943–11948. 10.1073/pnas.1607601113 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. Jovanic T, Schneider-Mizell CM, Shao M, Masson JB, Denisov G, Fetter RD, et al. Competitive Disinhibition Mediates Behavioral Choice and Sequences in Drosophila. Cell. 2016;167(3):858–870.e19. 10.1016/j.cell.2016.09.009 [DOI] [PubMed] [Google Scholar]
  • 12. Jin X, Costa RM. Shaping action sequences in basal ganglia circuits. Current Opinion in Neurobiology. 2015;33:188–196. 10.1016/j.conb.2015.06.011 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13. Geddes CE, Li H, Jin X. Optogenetic Editing Reveals the Hierarchical Organization of Learned Action Sequences. Cell. 2018;174(1):32–43.e15. 10.1016/j.cell.2018.06.012 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14. Markowitz JE, Gillis WF, Beron CC, Neufeld SQ, Robertson K, Bhagat ND, et al. The Striatum Organizes 3D Behavior via Moment-to-Moment Action Selection. Cell. 2018;174(1):44–58.e17. 10.1016/j.cell.2018.04.019 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15. Kato S, Kaplan HS, Schrödel T, Skora S, Lindsay TH, Yemini E, et al. Global Brain Dynamics Embed the Motor Command Sequence of Caenorhabditis elegans. Cell. 2015;163(3):656–669. 10.1016/j.cell.2015.09.034 [DOI] [PubMed] [Google Scholar]
  • 16. Kaplan HS, Salazar Thula O, Khoss N, Zimmer M. Nested Neuronal Dynamics Orchestrate a Behavioral Hierarchy across Timescales. Neuron. 2020;105(3):562–576.e9. 10.1016/j.neuron.2019.10.037 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Vogelstein JT, Park Y, Ohyama T, Kerr RA, Truman JW, Priebe CE, et al. Discovery of brainwide neural-behavioral maps via multiscale unsupervised structure learning. Science. 2014;344(6182):386–392. 10.1126/science.1250298 [DOI] [PubMed] [Google Scholar]
  • 18. Berman GJ. Measuring behavior across scales. BMC Biology. 2018;16(1). 10.1186/s12915-018-0494-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19. Hahnloser RHR, Kozhevnikov AA, Fee MS. An ultra-sparse code underlies the generation of neural sequences in a songbird. Nature. 2002;419(6902):65–70. 10.1038/nature00974 [DOI] [PubMed] [Google Scholar]
  • 20. Glaze CM, Troyer TW. Temporal Structure in Zebra Finch Song: Implications for Motor Coding. Journal of Neuroscience. 2006;26(3):991–1005. 10.1523/JNEUROSCI.3387-05.2006 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21. Glaze CM, Troyer TW. Development of temporal structure in zebra finch song. Journal of Neurophysiology. 2013;109(4):1025–1035. 10.1152/jn.00578.2012 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22. Lipkind D, Zai AT, Hanuschkin A, Marcus GF, Tchernichovski O, Hahnloser RHR. Songbirds work around computational complexity by learning song vocabulary independently of sequence. Nature Communications. 2017;8(1). 10.1038/s41467-017-01436-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23. Dominici N, Ivanenko YP, Cappellini G, D’Avella A, Mondì V, Cicchese M, et al. Locomotor primitives in newborn babies and their development. Science. 2011;334(6058):997–999. 10.1126/science.1210617 [DOI] [PubMed] [Google Scholar]
  • 24. Lipkind D, Marcus GF, Bemis DK, Sasahara K, Jacoby N, Takahasi M, et al. Stepwise acquisition of vocal combinatorial capacity in songbirds and human infants. Nature. 2013;498(7452):104–108. 10.1038/nature12173 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25. Ding N, Melloni L, Zhang H, Tian X, Poeppel D. Cortical tracking of hierarchical linguistic structures in connected speech. Nature Neuroscience. 2015;19(1):158–164. 10.1038/nn.4186 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26. Lipkind D, Geambasu A, Levelt CC. The Development of Structured Vocalizations in Songbirds and Humans: A Comparative Analysis. Topics in Cognitive Science. 2019. [DOI] [PubMed] [Google Scholar]
  • 27. Stroud JP, Porter MA, Hennequin G, Vogels TP. Motor primitives in space and time via targeted gain modulation in cortical networks. Nature Neuroscience. 2018;21(12):1774–1783. 10.1038/s41593-018-0276-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28. Logiaco L, Abbott LF, Escola S. A model of flexible motor sequencing through thalamic control of cortical dynamics. bioRxiv. 2019; p. 2019.12.17.880153. [DOI] [PMC free article] [PubMed]
  • 29. Maes A, Barahona M, Clopath C. Learning spatiotemporal signals using a recurrent spiking network that discretizes time. PLoS Computational Biology. 2020;16(1). 10.1371/journal.pcbi.1007606 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30. Nicola W, Clopath C. Supervised learning in spiking neural networks with FORCE training. Nature Communications. 2017;8(1):1–15. 10.1038/s41467-017-01827-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31. Hardy NF, Buonomano D. Encoding Time in Feedforward Trajectories of a Recurrent Neural Network Model. Neural Computation. 2018;30(2):378–396. 10.1162/neco_a_01041 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32. Nicola W, Clopath C. A diversity of interneurons and Hebbian plasticity facilitate rapid compressible learning in the hippocampus. Nature Neuroscience. 2019;22(7):1168–1181. 10.1038/s41593-019-0415-2 [DOI] [PubMed] [Google Scholar]
  • 33. Werbos PJ. Backpropagation Through Time: What It Does and How to Do It. Proceedings of the IEEE. 1990;78(10):1550–1560. 10.1109/5.58337 [DOI] [Google Scholar]
  • 34. Jun JK, Jin DZ. Development of neural circuitry for precise temporal sequences through spontaneous activity, axon remodeling, and synaptic plasticity. PLoS ONE. 2007;2(8). 10.1371/journal.pone.0000723 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35. Zheng P, Triesch J. Robust development of synfire chains from multiple plasticity mechanisms. Frontiers in Computational Neuroscience. 2014;8:1–10. 10.3389/fncom.2014.00066 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36. Murray JM, Escola GS. Learning multiple variable-speed sequences in striatum via cortical tutoring. eLife. 2017;6:e26084. 10.7554/eLife.26084 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37. Brea J, Senn W, Pfister JP. Matching Recall and Storage in Sequence Learning with Spiking Neural Networks. Journal of Neuroscience. 2013;33(23):9565–9575. 10.1523/JNEUROSCI.4098-12.2013 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38. Chenkov N, Sprekeler H, Kempter R. Memory replay in balanced recurrent networks. PLOS Computational Biology. 2017;13(1):e1005359. 10.1371/journal.pcbi.1005359 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39. Billeh YN, Schaub MT. Feedforward architectures driven by inhibitory interactions. Journal of Computational Neuroscience. 2018;44(1):63–74. 10.1007/s10827-017-0669-1 [DOI] [PubMed] [Google Scholar]
  • 40. Setareh H, Deger M, Gerstner W. Excitable neuronal assemblies with adaptation as a building block of brain circuits for velocity-controlled signal propagation. PLoS Computational Biology. 2018;14(7):e1006216. 10.1371/journal.pcbi.1006216 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41. Spreizer S, Aertsen A, Kumar A. From space to time: Spatial inhomogeneities lead to the emergence of spatiotemporal sequences in spiking neuronal networks. PLoS computational biology. 2019;15(10):e1007432. 10.1371/journal.pcbi.1007432 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42. Ikegaya Y, Aaron G, Cossart R, Aronov D, Lampl I, Ferster D, et al. Synfire Chains and Cortical Songs: Temporal Modules of Cortical Activity. Science. 2004;304(5670):559–564. 10.1126/science.1093173 [DOI] [PubMed] [Google Scholar]
  • 43. Harvey CD, Coen P, Tank DW. Choice-specific sequences in parietal cortex during a virtual-navigation decision task. Nature. 2012;484(7392):62–68. 10.1038/nature10918 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44. Peters AJ, Chen SX, Komiyama T. Emergence of reproducible spatiotemporal activity during motor learning. Nature. 2014;510(7504):263–267. 10.1038/nature13235 [DOI] [PubMed] [Google Scholar]
  • 45. Katlowitz KA, Picardo MA, Long MA. Stable Sequential Activity Underlying the Maintenance of a Precisely Executed Skilled Behavior. Neuron. 2018;98(6):1133–1140.e3. 10.1016/j.neuron.2018.05.017 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46. Adler A, Zhao R, Shin ME, Yasuda R, Gan WB. Somatostatin-Expressing Interneurons Enable and Maintain Learning-Dependent Sequential Activation of Pyramidal Neurons. Neuron. 2019;102(1):202–216.e7. 10.1016/j.neuron.2019.01.036 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47. Schaub MT, Billeh YN, Anastassiou CA, Koch C, Barahona M. Emergence of Slow-Switching Assemblies in Structured Neuronal Networks. PLoS Computational Biology. 2015;11(7):1–28. 10.1371/journal.pcbi.1004196 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48. Yang Q, Logan D, Giszter SF. Motor primitives are determined in early development and are then robustly conserved into adulthood. Proceedings of the National Academy of Sciences of the United States of America. 2019;116(24):12025–12034. 10.1073/pnas.1821455116 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49. Fiete IR, Senn W, Wang CZH, Hahnloser RHR. Spike-Time-Dependent Plasticity and Heterosynaptic Competition Organize Networks to Produce Long Scale-Free Sequences of Neural Activity. Neuron. 2010;65(4):563–576. 10.1016/j.neuron.2010.02.003 [DOI] [PubMed] [Google Scholar]
  • 50. Okubo TS, Mackevicius EL, Payne HL, Lynch GF, Fee MS. Growth and splitting of neural sequences in songbird vocal development. Nature. 2015;528(7582):352–357. 10.1038/nature15741 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51. Merel J, Botvinick M, Wayne G. Hierarchical motor control in mammals and machines. Nature Communications. 2019;10 (5489). 10.1038/s41467-019-13239-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52. Logiaco L, Escola GS. Thalamocortical motor circuit insights for more robust hierarchical control of complex sequences. arXiv. 2020;2006(13332v1).
  • 53. Maass W, Natschläger T, Markram H. Real-time computing without stable states: A new framework for neural computation based on perturbations. Neural Computation. 2002;14(11):2531–2560. 10.1162/089976602760407955 [DOI] [PubMed] [Google Scholar]
  • 54. Maass W. Liquid state machines: Motivation, theory, and applications. In: Computability in Context: Computation and Logic in the Real World. Imperial College Press; 2011. p. 275–296. [Google Scholar]
  • 55. Gütig R, Sompolinsky H. The tempotron: A neuron that learns spike timing-based decisions. Nature Neuroscience. 2006;9(3):420–428. 10.1038/nn1643 [DOI] [PubMed] [Google Scholar]
  • 56. Bouchacourt F, Palminteri S, Koechlin E, Ostojic S. Temporal chunking as a mechanism for unsupervised learning of task-sets. eLife. 2020;9:e50469. 10.7554/eLife.50469 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57. Tomov MS, Yagati S, Kumar A, Yang W, Gershman SJ. Discovery of hierarchical representations for efficient planning. PLoS Computational Biology. 2020;16(4). 10.1371/journal.pcbi.1007594 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58. Karuza EA, Thompson-Schill SL, Bassett DS. Local Patterns to Global Architectures: Influences of Network Topology on Human Learning. Trends in Cognitive Sciences. 2016;20(8):629–640. 10.1016/j.tics.2016.06.003 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59. Egnor SER, Branson K. Computational Analysis of Behavior. Annual Review of Neuroscience. 2016;39(1):217–236. 10.1146/annurev-neuro-070815-013845 [DOI] [PubMed] [Google Scholar]
  • 60. Jun JJ, Steinmetz NA, Siegle JH, Denman DJ, Bauza M, Barbarits B, et al. Fully integrated silicon probes for high-density recording of neural activity. Nature. 2017;551(7679):232–236. 10.1038/nature24636 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61. Williams AH, Kim TH, Wang F, Vyas S, Ryu SI, Shenoy KV, et al. Unsupervised Discovery of Demixed, Low-Dimensional Neural Dynamics across Multiple Timescales through Tensor Component Analysis. Neuron. 2018;98(6):1099–1115.e8. 10.1016/j.neuron.2018.05.015 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62. Mackevicius EL, Bahle AH, Williams AH, Gu S, Denisenko NI, Goldman MS, et al. Unsupervised discovery of temporal sequences in high-dimensional datasets, with applications to neuroscience. eLife. 2019;8:e38471. 10.7554/eLife.38471 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63. Kollmorgen S, Hahnloser RHR, Mante V. Nearest neighbours reveal fast and slow components of motor learning. Nature. 2020;577(7791):526–530. 10.1038/s41586-019-1892-x [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64. Brette R, Gerstner W. Adaptive exponential integrate-and-fire model as an effective description of neuronal activity. Journal of Neurophysiology. 2005;94(5):3637–3642. 10.1152/jn.00686.2005 [DOI] [PubMed] [Google Scholar]
  • 65. Kempter R, Gerstner W, van Hemmen JL. Hebbian learning and spiking neurons. Physical Review E—Statistical Physics, Plasmas, Fluids, and Related Interdisciplinary Topics. 1999;59(4):4498–4514. [Google Scholar]
  • 66. Litwin-Kumar A, Doiron B. Formation and maintenance of neuronal assemblies through synaptic plasticity. Nature Communications. 2014;5(5319):1–12. [DOI] [PubMed] [Google Scholar]
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008866.r001

Decision Letter 0

Abigail Morrison, Samuel J Gershman

8 Dec 2020

Dear Dr. Clopath,

Thank you very much for submitting your manuscript "Learning compositional sequences with multiple time scales through a hierarchical network of spiking neurons" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

In particular, the revision should be explicit about critical model assumptions and should clarify both the predictions on network structure and the compatibility with current experimental findings.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Abigail Morrison

Associate Editor

PLOS Computational Biology

Samuel Gershman

Deputy Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Reviewer #1: This paper presents a computational model of sequence generation. The novel aspect of the model is that it focuses on hierarchical sequences—such as the sequences of syllables within a motif, and the sequence of motifs (syntax) that compose a songbird's song. The model assumes two clocks, a fast and a slow clock, responsible for the syllable sequence within a motif and the motif sequence, respectively. There is a readout network for each motif and an interneuron network for each overall sequence (syntax). Learning is supervised but biological, and takes place at the fast clock -> read-out synapses and slow clock -> interneuron synapses. The problem being addressed is important and the model captures a number of interesting biological properties in a flexible manner. But the model also requires some significant assumptions and is fairly hardwired in some regards, for example, in the connectivity from the read-out to interneuron networks.

To allow the reader to better understand the limitations of the model and what has to be improved upon in future models the authors should discuss or address the shortcomings of the model and where it deviates from experimental data, including:

1. It appears that, as presented, each motif (or the inter-motif interval) has to be the same duration. This would appear to imply a periodicity to the motifs which does not seem to be anecdotally consistent with many sequential behaviors, or with the birdsong data in which different motifs can have different durations.

2. I’m not aware of any evidence that support the notion of two separate and independent fast and slow clocks in the biological literature. For example, I don’t think there is any evidence of a fast and slow sequence/clock in the birdsong system. Do the authors consider this to be a prediction of their model?

3. The specific hardwiring of the readout to interneuron connections along with the presence of essentially independent motif network seems like a potential challenge to the biological plausibility.

Clarify whether the Inh units in the read-out networks receive external input, and plastic synapses from the fast clock.

The model will be easier to understand if in Figure 1 the projections from Motif A and Motif B, and Slow clock are shown to project exclusively and specifically to the appropriate population of units in the interneuron network. Otherwise it is difficult to understand how the supervised learning governs plasticity in the interneuron network.

In Figure 2 the authors should also show the weight matrices of the nonplastic weights from the read-out to interneuron networks, and vice-versa, as well as the interneuron to fast clock. This will greatly help the reader understand how the model works.

Clarify what the multiple dots in each iteration in Figure 3 represent.

Reviewer #2: In their manuscript the authors address the interesting problem of storing sequences (e.g. ABA) that are made up of two motifs (A and B), where each motif is again a sequence (e.g. 12321 or 456). A sequence (called syntax) unfolds on a timescale of a second and each motif lasts up to a couple hundred milliseconds. Building on their recent work (Maes, Barahona, Clopath, 2020, Plos Comp. Biol.), Maes et al introduce a spiking neural network structure consisting of four interconnected components: a slow clock, a fast clock, an interneuron network, and read out networks. With this network structure they learn motifs in feedforward weights from the fast clock to read out networks, with each read out network representing one motif. Feedforward weights from the slow clock to the interneuron network switch between predefined interneuron clusters. These clusters control which read out network may become active, suppressing all others. They show that this network structure learns the motifs and syntax synapses independently. Further, compared to a single network (serial model) their network structure is more resistant to perturbations. New combinations of the same motifs can be achieved without adding more readout neurons; however, this requires recruiting copies of the inhibitory network. They also find that the pattern of variability differs when compared to a single network, as there is less within-motif variability but more variability in the timing of motif switching, due to variability in the period of the slow clock.

Overall this is a nice extension of their recent paper (Maes et al, 2020, Plos Comp. Biol.) to now include sequences of sequences on multiple timescales. Previously they used one clock network to control read outs. Switching between read out networks was done manually. The novel aspect here is that a second slower clock is added to control interneuron groups which can switch between read out networks, forming a slow sequence (syntax) of faster subsequences (motif). There remain however concerns about the clarity of presentation, robustness of the architecture, and some of the conclusions drawn, as listed in the following.

Major comments:

- It seems that the model requires specific connectivity and balance between the different weights to achieve the desired behavior. For example, there are specific feedback connections from each motif to one interneuron cluster. Activation of this interneuron cluster recruits lateral inhibition onto all competing motifs. This inhibition must be strong enough to counteract all feedforward excitation from the fast clock to the read out, in the manuscript these inhibitory weights are two orders of magnitude stronger (0.3 pF vs. 50 pF, Table 5). Thus, at least some discussion of the constraints in connectivity, weights, and activity should be added (e.g. ratio of inh/exc weights allowed? How sparse can activity in the motif be before feedback to its interneuron cluster does not recruit sufficient lateral inhibition?). In addition, limitations as well as experimental predictions stemming from these constraints might be helpful to discuss.

- While the idea is compelling, the results shown in the section on the capacity of the model do not sufficiently support the claim. Given that you are talking about how many neurons and synapses need to be added to store a new sequence, it may be more appropriate to discuss and quantify the resources required instead of capacity.

- If understood correctly, if A is repeated at different positions in a sequence (e.g. ABA), or is in different sequences (e.g. AAB vs. BAAB), different representations are required for each instance of A. While the motif for A is preserved, different instances of A are stored in the corresponding weights from slow clock clusters to the inhibitory networks. If this is correct, slow clock clusters can be reused for each instance of A, however different sets of (slow clock to inhibitory network) synapses are required to store multiple representations of A. This strategy seems to save on neurons at the expense of requiring more synapses. Please discuss.

- I agree with the authors when they write ``Non-Markovian sequences are generally hard to learn, because they require a memory about past dynamics". As I understand it, in this work, for a repeating motif in a sequence (e.g. ABA), two different clusters in the slow clock potentiate their projections with interneuron cluster A. The slow clock is used as an index. Perhaps this is a matter of interpretation, but it seems that this bypasses the need for memory about past dynamics. Instead the order of the sequence elements is stored via the slow clock cluster order and is executed through feedforward weights to the interneuron groups, making the problem Markovian. I would be interested to hear the authors' interpretation of this.

- Related to the previous comment: can your system learn and recall overlapping sequences such as ABCD vs. EBCF?

Minor comments:

- Related to point 1, Figure 1 could better reflect the specific connectivity required. For instance, it would be helpful to label the interneurons in the interneuron network. In this case, one belongs to motif A, one to motif B, and one to the fast clock. It would also be helpful to show that the connectivity is predetermined in a very specific way.

- For motif error, is each motif played out independently of the slow clock sequence order? What is exactly stimulated and how?

- Regarding relearning syntax sequences while reusing motifs: which constraints are there on the timescales? Is it correct that the syntax must be learned faster to avoid motifs being relearned?

- Based on Figure 3B, it looks like learning is faster than relearning. Is it possible to speed up the relearning? Does this depend on the timescale of the depotentiation term?

- For movement control, the variation of speed is an essential part. Please discuss, if and how you could control the recall speed in the proposed system.

Reviewer #3: The manuscript by A. Maes et al. presents an interesting spiking neural model for the learning of dynamical spike sequences that are modular, i.e. composed of subsequences. The basic idea is that two clock-like networks, operating at slow and fast speed, govern the progress of the overall sequence and of all individual subsequences, respectively. The selection of a subsequence happens by inhibition of the other subsequences. The responsible interneurons receive input from the slow clock; their input connections are trained such that the right networks are inhibited at the right times. The connections from the fast clock to the different output neuron networks are similarly trained to generate their individual subsequences.

Remarkably, the networks can flexibly relearn the order of subsequences while the precision of individual ones still increases (Fig. 3). This is in contrast to a purely serial model with a single clock. Further, the networks can easily learn sequences where subsequences repeat. The dynamics are robust against deletion of individual subsequences, and the slow clock can be implemented with a sequence generating mechanism that is generically slow as it is driven by short-term depression.

The manuscript is interesting, clear and well-written. I recommend its publication essentially as it is and would like to ask only a few minor questions:

1. In Fig. 1, might it increase clarity to explicitly display the inhibitory connections from IN-A to the network Motif B and from IN-B to the network Motif A?

2. Is the recurrent excitation in the Motif networks important?

3. What resets the fast clock to start with its first group of neurons after inhibition from IN-Silence stops? Fig. 2B seems to indicate that while IN-Silence is active, the last group of the fast network stays active and initiates the first group as soon as inhibition stops.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Christian Tetzlaff

Reviewer #3: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms etc.. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions, please see http://journals.plos.org/compbiol/s/submission-guidelines#loc-materials-and-methods

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008866.r003

Decision Letter 1

Abigail Morrison, Samuel J Gershman

8 Mar 2021

Dear Dr. Clopath,

We are pleased to inform you that your manuscript 'Learning compositional sequences with multiple time scales through a hierarchical network of spiking neurons' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Abigail Morrison

Associate Editor

PLOS Computational Biology

Samuel Gershman

Deputy Editor

PLOS Computational Biology

***********************************************************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors have done a good job addressing my concerns, and I think the paper is appropriate for publication.

Reviewer #2: The authors have clarified all my points. Thank you.

Reviewer #3: I thank the authors for their careful answering of my questions. I can fully recommend acceptance of the paper.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: None

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Dr. Christian Tetzlaff

Reviewer #3: No

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008866.r004

Acceptance letter

Abigail Morrison, Samuel J Gershman

19 Mar 2021

PCOMPBIOL-D-20-01730R1

Learning compositional sequences with multiple time scales through a hierarchical network of spiking neurons

Dear Dr Clopath,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Alice Ellingham

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol
