Abstract
The brain exhibits the capability of fast incremental learning from few noisy examples, as well as the ability to associate similar memories in autonomously-created categories and to combine contextual hints with sensory perceptions. Together with sleep, these mechanisms are thought to be key components of many high-level cognitive functions. Yet, little is known about the underlying processes and the specific roles of different brain states. In this work, we exploited the combination of context and perception in a thalamo-cortical model based on a soft winner-take-all circuit of excitatory and inhibitory spiking neurons. After calibrating this model to express awake and deep-sleep states with features comparable with biological measures, we demonstrate its capability of fast incremental learning from few examples, its resilience when presented with noisy perceptions and contextual signals, and an improvement in visual classification after sleep, due to induced synaptic homeostasis and association of similar memories.
Author summary
We created a thalamo-cortical spiking model (ThaCo) with the purpose of demonstrating a link between two phenomena that we believe to be essential for the brain's capability of efficient incremental learning from few examples in noisy environments. Grounded in two experimental observations—the first about the effects of deep-sleep on pre- and post-sleep firing rate distributions, the second about the combination of perceptual and contextual information in pyramidal neurons—our model joins these two ingredients. ThaCo alternates phases of incremental learning, classification and deep-sleep. Memories of handwritten digit examples are learned through thalamo-cortical and cortico-cortical plastic synapses. In the absence of noise, the combination of contextual information with perception enables fast incremental learning. Deep-sleep becomes crucial when noisy inputs are considered. We observed in ThaCo both homeostatic and associative processes: deep-sleep fights noise in perceptual and internal knowledge, and it supports the categorical association of examples belonging to the same digit class through the reinforcement of class-specific cortico-cortical synapses. The distributions of pre-sleep and post-sleep firing rates during classification change in a manner similar to that of experimental observations. These changes promote energetic efficiency during the recall of memories, a better representation of individual memories and categories, and higher classification performance.
1 Introduction
Experimental evidence is mounting both for the role played by the combination of bottom-up (perceptual) and top-down/lateral (contextual) signals [1] and for the beneficial effects of sleep as key components of many high-level cognitive functions in the brain. In the following, we give an overview of some aspects, drawn from experimental observations, that we have taken as fundamental building blocks for the construction of the model we present.
It is known that the cortex follows a hierarchical structure [2]; starting from this, Larkum et al. [1] propose an associative mechanism built in at the cellular level into the pyramidal neuron (see Fig 1B), exploiting the cortical architectural organization (see Fig 1C). Long-range connectivity in the cortex follows the basic rule that sensory input (i.e., the feed-forward stream) terminates in the middle cortical layers, whereas information from other parts of the cortex (i.e., the feedback stream) mainly projects to the outer layers. This also applies to projections from the thalamus, a structure that serves as both a gateway for feed-forward sensory information to the cortex and a hub for feedback interactions between cortical regions. Indeed, only 10% of the synaptic feedback inputs to the apical tuft come from nearby neurons, while the remaining 90% arise from long-range feedback connections. This feedback information stream is vitally important for cognition and conscious perception: this picture leads to the suggestion that the cortex operates via an interaction between feed-forward and feedback information. Larkum et al. [1] highlight that, counter-intuitively, distal feedback input to the tuft dendrite can dominate the input/output function of the cell: short high-frequency bursts are produced upon a combination of distal and basal input. As a consequence, although small (under-threshold) signals contribute only to their respective spike initiation zones, the fact that input has reached the threshold in one zone is quickly signalled to the other zones. This provides the possibility of a contextual prediction: the activity in the apical tuft of the cell can lower the threshold for the activity driven by the basal region, the target of the specific thalamic nuclei that project the perceptual, feed-forward stream there. In summary, this mechanism is ideally suited to associating feed-forward and feedback cortical pathways. Thus, they propose a conceptual interpretation of these biological pieces of evidence: the feedback signal aims at predicting whether a particular pyramidal neuron could or should be firing. Moreover, any neuron can fire only if it receives enough feed-forward input. Following this interpretation, the brain's internal representation of the world can be matched at every level with ongoing external evidence via a cellular mechanism, allowing the cortex to perform the same operation with massively parallel processing power.
Soft Winner-Take-All (WTA) mechanisms play an important role in many high-level cognitive functions such as decision making [3–5], classification and pattern recognition [6, 7]. Under a rough simplification, this mechanism can be realized through the competition among groups of excitatory neurons projecting to the same population of inhibitory neurons, which in turn projects back to the excitatory groups it arbitrates [8–11]. Under appropriate conditions, the inhibitory signal will be sufficiently high to suppress the signal of all the low-firing excitatory groups of neurons, whereas the high-firing ones survive. Such conditions can be achieved thanks to synaptic plasticity, which strengthens the connections among neurons of the same group and weakens those among competing groups, coupled with homeostatic mechanisms [12, 13]. A minimal sketch of this competition motif is given below.
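The following rate-based sketch illustrates the soft-WTA motif just described: several excitatory groups drive a shared inhibitory pool that suppresses the weaker groups. All constants and group sizes here are illustrative choices, not the ThaCo parameters.

```python
import numpy as np

# Minimal rate-based soft-WTA: L excitatory groups share one inhibitory pool.
L = 3                                # number of competing excitatory groups
w_self, w_exc2inh, w_inh2exc = 0.4, 1.0, 1.2
g_ext = np.array([1.0, 0.8, 0.6])    # external drive; group 0 is the "winner"

rho = np.zeros(L)                    # excitatory group firing rates
for _ in range(200):                 # relax to a fixed point (Euler steps)
    rho_inh = w_exc2inh * rho.sum() / L              # shared inhibition
    drive = g_ext + w_self * rho - w_inh2exc * rho_inh
    rho += 0.1 * (np.maximum(drive, 0.0) - rho)      # threshold-linear dynamics

print(rho)  # the highest-input group keeps the largest rate; weaker ones are suppressed
```

With these values the competition is "soft": all groups remain active, but at rates ordered by their external drive; raising the inhibitory gain pushes the circuit toward a hard WTA in which only the winner survives.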
Spike-timing-dependent plasticity (STDP) has been proposed as one of the essential learning ingredients in the cortex [14–18]. According to this plasticity rule, if the postsynaptic neuron fires an action potential just after a presynaptic spike, the synaptic weight increases, whereas in the opposite case it decreases; a sketch of such a pair-based rule is given below. Through this mechanism, the synapses connecting neurons correlated by a principle of causality are metabolically rewarded. Chen et al. [19] have shown that networks of excitatory and inhibitory spiking neurons with either STDP or short-term plasticity can generate dynamically-stable WTA behaviour under certain conditions on initial synaptic weights.
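As an illustration, the sketch below computes the weight change for a single pre/post spike pair under a standard pair-based STDP window; the amplitudes and time constants are placeholder values, not those of the NLTAH rule [36] used in ThaCo.

```python
import numpy as np

# Pair-based STDP window: potentiation when the postsynaptic spike follows
# the presynaptic one, depression otherwise. Constants are illustrative.
A_plus, A_minus = 0.01, 0.012      # learning amplitudes
tau_plus, tau_minus = 20.0, 20.0   # time constants (ms)

def stdp_dw(t_pre, t_post):
    """Weight change for a single pre/post spike pair (times in ms)."""
    dt = t_post - t_pre
    if dt > 0:   # pre before post: causal pair, potentiation
        return A_plus * np.exp(-dt / tau_plus)
    else:        # post before pre: anti-causal pair, depression
        return -A_minus * np.exp(dt / tau_minus)

print(stdp_dw(10.0, 15.0))   # > 0, synapse strengthened
print(stdp_dw(15.0, 10.0))   # < 0, synapse weakened
```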
Another key aspect that we consider in this study is the role of sleep during learning. Sleep is essential in all animal species, and it is believed to play a crucial role in memory consolidation [20, 21], in the creation of novel associations, as well as in the preparation for tasks expected during the next awake periods. Indeed, young humans spend the majority of their time sleeping, and the youngest are the subjects that have to learn at the fastest rates. In adults, sleep deprivation is detrimental to cognition [22], and it is one of the worst tortures that can be inflicted. Among the multiple effects of sleep on the brain and body, we focus here on the consolidation of learned information [23]. Homeostatic processes could normalize the representation of memories and optimize the energetic working point of the system by recalibrating synaptic weights [24] and firing rates [25]. Specifically, Watson et al. [25] show that fast-firing pyramidal neurons decrease their firing rates over sleep, whereas slow-firing neurons increase their rates, resulting in a narrower population firing rate distribution after sleep. Also, sleep should be able to select memories for association, promoting higher performance during the next awake phases [26]. Indeed, Capone et al. [27] demonstrate the beneficial effects of sleep-wake phases involving homeostatic and associative processes in a visual classification task. In [27], some of us illustrated how to assemble a simplified thalamo-cortical spiking model that can both express deep-sleep-like oscillations (in the form of an emergent, self-induced network phenomenon) and enter an awake-like asynchronous regime. This dynamical behaviour has been obtained by changing a few parameters in the equation that describes the dynamics of excitatory neurons in the spiking model, thus exploiting a well-established modelling principle that captures a few prominent features of brain-state acetylcholine-mediated neuromodulation, able to induce in the model the transition between awake-like asynchronous and deep-sleep-like oscillatory regimes [28]. However, this neuromodulation modelling principle had neither been previously applied to the study of the cognitive effects of deep-sleep nor to simulations of learning-sleep cycles (such as in our previous work [27] and in this study). Specifically, the spiking model we propose is trained on a set of training patterns (here, images of handwritten digits) and then exposed to never-seen examples to be classified (here, among human-assigned digit classes). Also, the model structure proposed in [27] and adopted in this work is able to express an asynchronous awake-like state of the network, by acting on the neural dynamics parameters. When the prescribed changes in the neural parameters induce the network to express deep-sleep-like oscillations, STDP is observed to produce in the model the spontaneous emergence of a differential homeostatic process. First, a down-regulation emerges of the stronger synapses created by STDP during the training, those synapses that connect the best-tuned neurons during the training phase on each training example. At the same time, we observe that STDP increases the strength of synapses among neurons tuned to patterns belonging to the same class. Such hierarchical, spontaneous reorganization promotes better post-sleep classification performances. In short, the underlying mechanism is based on the similarity among the thalamic codings of training examples belonging to the same class.
During deep-sleep oscillations, such similarity supports the preferential activation of thalamo-cortico-thalamic connection paths among neural groups tuned to training examples belonging to the same class, and the consequent coactivation and unsupervised strengthening of class-specific synapses. This point has been illustrated in [27].
Combining the above-described set of cortical principles, we aimed at creating a simplified, yet biologically-plausible, thalamo-cortical spiking model (ThaCo, see Fig 1A). ThaCo exploits the combination of contextual and perceptual signals to construct a soft Winner-Take-All (WTA) mechanism capable of fast learning from few examples [29] in a synaptic matrix shaped by spike-timing-dependent plasticity (STDP). ThaCo has been calibrated to express deep-sleep-like activity and to induce modifications of the distributions of pre- and post-sleep firing rates comparable to biological measures such as those carried out by Watson et al. [25], in order to investigate the effects of deep-sleep on learning and classification (another beneficial aspect, the recovery and restoration of bio-chemical optimality, is not considered at this level of abstraction). In the context of machine learning, a distinction is made between instance-incremental methods, which learn from each training example as it arrives, and batch-incremental methods, in which the training data are organized in groups of examples, called batches, and the model is trained only on complete batches [30]. Depending on how different classes are represented by the examples in the batches, there are three training schemes [31]: new instances (NI), in which each new batch contains different instances of the same classes represented in previous batches; new classes (NC), in which examples belonging to novel classes become available in subsequent batches; and new instances and classes (NIC), in which subsequent training batches contain examples from both known and new classes. However, it should be noted that, when the performance of incremental learning has to be evaluated continuously, the training set is divided into batches for reasons of computational efficiency even for models capable of instance-incremental learning. Shimizu et al. [32] propose a training method based on balanced mini-batches, which reduces the effect of imbalanced data in supervised training. Our work focuses on instance-incremental learning, and the training scheme is based on balanced mini-batches. Specifically, with ThaCo we investigated several brain aspects and learning capabilities: 1- incremental learning from few examples; 2- resilience to noise when trained over degraded-quality examples and asked to classify corrupted images; 3- comparison with the performances of Knn algorithms; 4- the ability to fight noise in the contextual signal thanks to the introduction of a biologically-plausible deep-sleep-like state, inducing beneficial homeostatic and associative synaptic effects.
2 Results
In this work we test the capability of the implemented thalamo-cortical network model (ThaCo) to express incremental learning when trained to learn and recall images (from the MNIST dataset), and we investigate the role and the mechanisms of the occurrence of biological-like deep-sleep dynamics. First, we present a comparison of the ThaCo model behaviour with the biological observations made by Watson et al. [25] on the changes of firing rate distributions in awake, sleep and post-sleep phases (see Fig 2). Indeed, since one of the goals of this work is to implement a biologically-plausible model capable of displaying different “cognitive states”, the comparison with experimental outcomes is important for assessing its plausibility. After the validation against experimental results, we demonstrate the capability of the model to learn incrementally, i.e. to continuously extend its knowledge by learning from new training examples while retaining most of the previously acquired memories. The learning ability of the model was assessed using an approach that alternates incremental training with tests meant to evaluate the pre-sleep and post-sleep classification performance. During the training phase, samples are randomly extracted from the training set of the MNIST database; they are given as input to the system together with example-specific contextual signals that reach the cortical neurons, and a digit-class-specific contextual signal that reaches only the read-out neurons. Notably, to stress the difference with respect to the training phase, during the classification phase no contextual signal is transmitted: the response of the network is recorded, based on the firing rates of the excitatory neurons of the cortex. It should be emphasized that the proposed model is not an engineering solution to the problem of incremental learning in pattern classification, but a simplified model of the low-level processes that support and emulate the ability to learn incrementally in the biological brain. Indeed, the size of the training set is relatively small compared to those often used in machine learning, and some issues that are of primary importance for both artificial and biological incremental learning, such as catastrophic forgetting, go beyond the aims of this paper. We then measure the model's incremental classification performance and compare it to that expressed by the K-Nearest Neighbour (Knn) family of artificial incremental learning algorithms (specifically Knn-1, Knn-3, Knn-5). Knn is an extensively used classification algorithm, which has been successfully applied to a wide range of problems in different fields. Furthermore, unlike many other classification systems used in machine learning, the Knn family is suitable for incremental learning; it also works relatively well even with few training examples and, for large enough training sets, the Knn algorithm is guaranteed to yield an error rate no worse than twice the Bayes error rate, which is the minimum achievable given the distribution of the data [33]. For these reasons, the Knn classifier has been chosen as the reference for the evaluation of the classification ability of the proposed system. We show that—even without the beneficial contribution of sleep—this model shows higher resilience to noisy inputs than Knn.
Finally, we demonstrate the beneficial effects of deep-sleep-like cortical slow oscillations on the post-sleep classification accuracy of MNIST characters when a noisy contextual signal is injected during the awake training (a situation that could be interpreted both as the case of different levels of prior knowledge about the correct classification label of the current example during the training, and as related to the largely stochastic nature of cortical organisation and of the activity of other cortical areas).
2.1 ThaCo model pre- and post-sleep firing rates and comparison with the experiments
We compare the network behaviour of the ThaCo model during three simulated phases (pre-sleep awake-like, deep-sleep-like and post-sleep awake-like, see Fig 2) with those observed in rats by Watson et al. in [25]. When approaching the design of the ThaCo spiking model, an improvement over what some of us presented in [27], we relied on the well-established framework of mean-field theories [34], [35] to construct a network capable of spontaneously displaying two different dynamical regimes. This is obtained by acting on some parameters of the excitatory neurons (specifically, the spike frequency adaptation (SFA) and the excitatory synaptic conductance), to model the acetylcholine-mediated neuromodulation of neural dynamics that supports the transitions between awake-like asynchronous activity and deep-sleep-like slow oscillations [28]. Specifically, in Fig 3A we show the incoming current to each cortical neuron versus its adaptation current during different network stages (pre- and post-sleep classification, beginning and end of the sleeping phase). In particular, sleep states are characterized by high levels of spike frequency adaptation currents (obtained through a modulation of the SFA parameter), inducing oscillations. Moreover, late sleep and post-sleep classification have low levels of input currents, due to a sleep-mediated synaptic depression leading to a reduction in the current circulating in the network. When set in the deep-sleep state, a non-specific stimulus, administered at a low steady firing rate to cortical neurons, is sufficient to elicit the emergence of cortically-generated Up-states and of thalamo-cortical Slow Oscillations (SO). As shown in the top rows of Fig 3B and 3C, in the SO regime the thalamo-cortical spiking network displays a firing rate oscillation frequency between 0.25Hz and 1.0Hz and durations of Up-states (a few hundred ms) comparable with experimental observations in deep-sleep recordings. During the initial stages of SO, Up-states are independently sustained by neuron populations tuned to specific images memorized during the training phase, and tend to reactivate the thalamic neurons coding for the memorized images. Then, thanks to the similarity among training instances, the recruitment of other neural groups in the cortex is promoted. This creates preferential cortico-thalamo-cortical excitatory pathways, inducing an STDP-mediated association of cortical neurons previously tuned to training instances that expressed similar thalamic representations (see Fig 3B and 3C). We name such cortico-thalamic activation, which spontaneously occurs during SO, top-down prediction. During the sleep period, thanks to cortico-cortical plasticity, the coactivation of neurons originally tuned to training instances of the same class becomes a typical feature of each Up-state: the WTA mechanism cooperates in selecting different neuron codings for different classes during each Up-state. Another key aspect is the generalized homeostatic depression, which is known to happen during deep-sleep and serves as a protection, preventing Up-state-mediated associations that could drive the network towards full association.
This effect is modelled thanks to the Non-Linear Temporal Asymmetric Hebbian (NLTAH) learning rule of the STDP we used [36], which reduces the strength of the synapses among the most frequently coactivated neurons, leading to a progressive reduction of the mean firing rates and of the frequency of the Up-states (see Fig 3B and 3C, top rows), consistent with experimental observations, in particular concerning the decrease of SO frequency during the course of the night [37]. The first noteworthy result presented in this paper is that this new calibration of the model greatly enhances the match with experimental data, as detailed in the following. Indeed, while the model of [27] was already able to express the transition between states [38] such as sleep-like slow-oscillation activity and awake-like classification, the refinements of its parameters introduced here make ThaCo more biologically plausible, leveraging as a calibration tool the accurate comparison with experimental observations of differential changes in firing rates. In their work, Watson et al. [25] used large-scale recordings to examine the activity of neurons in the frontal cortex of rats and to observe the distributions of pyramidal cell firing rates in different brain states: awake, REM, nonREM and microarousals. They found that periods of nonREM sleep reduced the post-sleep awake activity of neurons with high pre-sleep firing rates while up-regulating the firing of slow-firing neurons. Moreover, in their experiments, the neuronal firing rate varied with the brain state and, across all states, the distribution of per-cell mean firing rates was strongly positively skewed, with a lognormal tail towards higher frequencies and a supra-lognormal tail towards lower frequencies. We set the model parameters to reproduce these measures. In Fig 2A we present the cumulative distribution of neuronal mean firing rates for both the awake and nonREM states of our model, to be compared with Fig 2A of Watson et al. [25] (REM is not included in ThaCo). Median rates (± SD) of excitatory cortical neurons in ThaCo in each state are: awake, 1.2 ± 1.1Hz, and nonREM, 0.6 ± 0.3Hz.
An interesting feature is that lognormal distributions spontaneously emerge from our simulations. This result is coherent with experimental observations and with theoretical considerations showing that the lognormal distribution of activities in randomly connected recurrent networks is a natural consequence of the non-linearity of the input-output gain function [39]. In agreement with Watson et al. [25], we also found that the arithmetic mean of the population firing rates declined throughout sleep, as visible using a test of correlation of spike rate versus time (see Fig 2D, to be compared with Fig 3B of [25]; the slope of the rate change within time-normalized sleep from all cx neurons in all recordings is R = −0.10, p = 10⁻³). In order to demonstrate that sleep brings varying differential effects across the rate spectrum, we compared mean firing rates in the first and the last 100s of sleep. As depicted in Fig 2C, fast-firing neurons decreased their rates over sleep, whereas slow-firing neurons increased their rates (to be compared with Fig 3D of [25]). To quantify this observation, we assessed the spike rates of the same neurons in the first versus the last 100s of nonREM sleep and found that the slope of this correlation departed significantly from unity (slope, 95% confidence interval: 0.6015–0.6130).
Furthermore, following [25], we divided the ThaCo excitatory cortical neurons into six sextile groups sorted by their awake firing rates (Fig 2A). As shown in Fig 2B, the sextile with the highest firing rates significantly decreased its activity over sleep, in accordance with the results obtained by Watson et al. [25] (see Fig 3B of their work). Finally, we evaluated the impact of sleep on the cortical firing rate distribution during awake states. In Fig 2E, we compare the firing rate distributions pre- and post-sleep, depicting the homeostatic effect of sleep. A sketch of this rate-based analysis is given below.
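The first-versus-last-100s comparison above can be summarized in a short script. The sketch below assumes that per-neuron spike times are available as arrays (the names `spike_times`, `t0` and `t1` are illustrative) and fits the slope of the correlation in log space, as in [25]: a slope below unity indicates that fast-firing neurons slow down while slow-firing neurons speed up.

```python
import numpy as np

def rates_in_window(spike_times, t_start, t_stop):
    """Mean firing rate (Hz) of each neuron within [t_start, t_stop), times in s."""
    width = t_stop - t_start
    return np.array([np.sum((t >= t_start) & (t < t_stop)) / width
                     for t in spike_times.values()])

def sleep_rate_slope(spike_times, t0, t1, window=100.0):
    """Slope of log(last-window rate) vs log(first-window rate) over sleep [t0, t1]."""
    r_first = rates_in_window(spike_times, t0, t0 + window)
    r_last = rates_in_window(spike_times, t1 - window, t1)
    keep = (r_first > 0) & (r_last > 0)      # log-log fit needs non-zero rates
    slope, _ = np.polyfit(np.log10(r_first[keep]), np.log10(r_last[keep]), 1)
    return slope                             # < 1: differential homeostatic effect
```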
2.2 The ThaCo network model and the training protocol
The proposed ThaCo circuit is organized into three layers, as shown in Fig 1A: an input layer, the thalamus, which consists of an excitatory population (tc) whose firing rate is under the control of a reticular inhibitory fully-connected population (re); the cortex, consisting of an excitatory population (cx) and an inhibitory population (in), both fully connected as well; and a readout (ro) layer, to which the cortex is also fully connected, composed of subgroups of neurons associated with each class. The learning protocol is organized as an alternation of training phases—when the internal structure of the network is shaped according to the learnt examples—and testing phases—when the classification performance of the network is evaluated (see Section 4.1). In both training and classification phases, the network is provided with sample images drawn from the MNIST dataset. The sample images are pre-processed to produce stimulus signals that are transmitted to the excitatory neurons of the thalamus (see paragraphs “The datasets of handwritten characters” and “Thalamic coding of visual stimuli” in S1 Text for more details). During the training phase, simultaneously with the input sensory-like stimulus, contextual signals are transmitted to the excitatory neurons of the cortex and to the readout neurons. The observed bursting behaviour of the neurons is a consequence of the temporal coincidence between the impinging perceptual and contextual signals. Specifically, for each example to learn, an example-specific group of excitatory neurons in cx is facilitated through the presentation of a contextual signal. This induces a higher activity in these neurons, causing a strengthening of both thalamo-cortical synapses and recurrent synapses. This example-specific tuning involves each neuron with a single training example only, whose category defines (in an unsupervised manner) a natural category for which the neuron is better tuned. Meanwhile, a subgroup of readout neurons (ro) is stimulated by a digit-class-specific contextual signal, leading to an enhancement of the connections between the cortical neurons trained over the presented example and the subgroup of readout neurons associated with the correct class. The simultaneous stimulation by perceptual and contextual signals emulates the organizing principle of the cerebral cortex as described by Larkum et al. [1], approximating the effects of the dendritic apical amplification mechanism at the cellular level. It is worth noting that this is the only phase when a category-specific (rather than an example-specific) signal is given to the network: protocols concerning this ro layer are supervised training protocols, whereas those for the other layers can be referred to as unsupervised training protocols. During the classification phase, signals resulting from preprocessed images are again transmitted to the thalamus analogously to the training phase; however, no contextual signal is transmitted to either cortical or readout neurons. During this stage, the neuronal activation results from the combination of the current injected by perceptual signals and that injected by the recurrent interconnections, strengthened by the synaptic STDP dynamics during the training and further modified by STDP during sleep cycles. A structural sketch of these layers and their connectivity is given below.
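The following sketch outlines how such a layered structure can be instantiated in the NEST simulator used for this work. Population sizes, weights and the uniform all-to-all connectivity are illustrative placeholders rather than the calibrated ThaCo values; the actual implementation is available in the repository referenced in the Data Availability section.

```python
import nest  # NEST 2.12 Python interface, as used for the ThaCo simulations

nest.ResetKernel()
# Layers (sizes are illustrative): thalamus (tc + reticular re), cortex (cx + in), readout (ro)
tc = nest.Create("aeif_cond_exp", 784)    # thalamic excitatory relay neurons
re = nest.Create("aeif_cond_exp", 200)    # thalamic reticular inhibitory neurons
cx = nest.Create("aeif_cond_exp", 1000)   # cortical excitatory neurons
inh = nest.Create("aeif_cond_exp", 250)   # cortical inhibitory neurons ("in" is a Python keyword)
ro = nest.Create("aeif_cond_exp", 100)    # readout neurons, e.g. 10 per digit class

# Thalamo-cortical, recurrent cortical and cortico-readout excitation are plastic (STDP);
# pathways through the inhibitory populations are static (negative weight -> inhibitory conductance).
plastic = {"model": "stdp_synapse", "weight": 1.0}
nest.Connect(tc, cx, {"rule": "all_to_all"}, plastic)
nest.Connect(cx, cx, {"rule": "all_to_all"}, plastic)
nest.Connect(cx, ro, {"rule": "all_to_all"}, plastic)
nest.Connect(cx, inh, {"rule": "all_to_all"}, {"weight": 1.0})
nest.Connect(inh, cx, {"rule": "all_to_all"}, {"weight": -1.0})
nest.Connect(tc, re, {"rule": "all_to_all"}, {"weight": 1.0})
nest.Connect(re, tc, {"rule": "all_to_all"}, {"weight": -1.0})
```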
We infer the network's answer to the classification task in two different ways: first, unsupervised, taking the class of the example on which the most active subgroup of cortical neurons has been trained; second, supervised, taking the class associated with the most active subgroup of readout neurons. Specifically, the readout layer performs the integration of signals coming from the subgroups of cortical neurons trained over different examples belonging to the same class (see Section 4.1 for a more detailed representation of the learning process). The activity produced by the cortical neurons during training, classification and sleeping phases is depicted in Fig 3B and 3C.
We set the network parameters in an under-threshold regime that enables the training described above through the selected STDP model on a single-compartment standard Adaptive Exponential (AdEx) integrate-and-fire neuron, which would not otherwise distinguish between basal and apical stimuli. See details about the model construction, the presentation of visual stimuli and the addition of noise in Materials and Methods, Section 4, and in S1 Text.
2.3 Incremental learning: Performances
We trained the proposed network over an incremental number of training examples and evaluated its classification performance on a set of images never shown before. We also compared the average accuracy of our thalamo-cortical spiking model with that obtained using standard Knn-n classification systems for different numbers of training examples per digit category; see Fig 4 and Table 1. The model presented in this work enables instance-incremental as well as class-incremental learning. The training protocol we adopted for the results presented here was based on the balanced mini-batches scheme proposed by [32]. More specifically, the training set of handwritten digits was divided into mini-batches of 10 examples each, in which each class was represented by just one example; a sketch of this batching scheme is given below. In S1 Text, we include a comparison of the performance obtained using different training protocols.
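A possible implementation of the balanced batching scheme is sketched here; the helper name and the use of an MNIST label array are assumptions for illustration.

```python
import numpy as np

def balanced_mini_batches(labels, n_batches, n_classes=10, seed=0):
    """Build batches of n_classes indices, exactly one example per class each."""
    rng = np.random.default_rng(seed)
    # Shuffled pools of example indices, one pool per class
    per_class = [rng.permutation(np.where(labels == c)[0]) for c in range(n_classes)]
    batches = []
    for b in range(n_batches):  # requires n_batches <= examples available per class
        batch = [per_class[c][b] for c in range(n_classes)]  # one index per class
        batches.append(rng.permutation(batch))               # shuffle within the batch
    return batches  # list of arrays of 10 image indices each
```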
Table 1. Accuracy achieved by the different learning algorithms over different numbers of training examples.

Values are accuracy (%); columns give the number of training examples per class.

| Algorithm | 1 | 2 | 3 | 5 | 10 | 20 |
|---|---|---|---|---|---|---|
| Knn, k = 1 | 68.0 ± 1.0 | 76.1 ± 1.0 | 80.6 ± 0.8 | 84.5 ± 0.4 | 88.3 ± 0.4 | 90.4 ± 0.3 |
| Knn, k = 3 | 37.2 ± 1.3 | 66.0 ± 1.1 | 75.4 ± 0.9 | 82.7 ± 0.4 | 88.3 ± 0.3 | 91.0 ± 0.3 |
| Knn, k = 5 | 23.9 ± 1.5 | 62.0 ± 1.3 | 72.2 ± 0.9 | 81.9 ± 0.4 | 88.2 ± 0.3 | 91.2 ± 0.3 |
| ThaCo—Digit class readout | 65.7 ± 1.0 | 74.8 ± 0.8 | 80.4 ± 0.7 | 84.8 ± 0.6 | 88.6 ± 0.5 | 91.1 ± 0.3 |
| ThaCo—Example specific group | 65.7 ± 0.9 | 73.6 ± 0.9 | 78.1 ± 0.6 | 82.7 ± 0.4 | 86.4 ± 0.4 | 88.6 ± 0.4 |
MNIST images have been presented to the ThaCo thalamic layer using the improved pre-processing protocol described in the paragraph “Thalamic coding of visual stimuli” of the S1 Text. The accuracy has been evaluated over classification trials, each one including 500 images, and the classification accuracy has been averaged over 20 trials. Fig 4A shows the accuracy for incremental learning as a function of the number of training examples per class. Fig 4B and 4C, on the other hand, depict the average accuracy of the compared training algorithms for the last (10 to 20) and the first (1 to 5) training examples per class respectively, to better show their different behaviour at different stages of the learning process. For the MNIST dataset, higher-order Knn algorithms surpass the performance of Knn-1 only when the training set includes more than 10 examples per digit class. It is worth noting that the soft-WTA mechanism of ThaCo can learn incrementally and has performances comparable to the best Knn-n algorithm for a given number of training examples. Specifically, the supervised ThaCo proves able to integrate signals coming from subgroups of cortical neurons trained over different examples belonging to the same class, and its performances are comparable to those of higher-order Knns, whereas the unsupervised ThaCo performances prove comparable with the Knn-1 performances when few examples are presented. A sketch of the incremental Knn baseline protocol is given below.
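The incremental Knn baseline can be evaluated as sketched below, retraining Knn-k on the growing training set after each balanced mini-batch; this is a possible implementation using scikit-learn, and the array names (`X_train`, `y_train`, `X_test`, `y_test` as flattened MNIST arrays) are assumptions.

```python
from sklearn.neighbors import KNeighborsClassifier

def knn_learning_curve(X_train, y_train, X_test, y_test, batches, k=1):
    """Test accuracy of Knn-k after each incremental balanced mini-batch."""
    accuracies, seen = [], []
    for batch in batches:
        seen.extend(batch)                        # incrementally enlarge the training set
        clf = KNeighborsClassifier(n_neighbors=k)
        clf.fit(X_train[seen], y_train[seen])     # Knn "training" just stores the examples
        accuracies.append(clf.score(X_test, y_test))
    return accuracies
```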
2.4 Classification of noisy input
We evaluated the network behaviour in a noisy input environment and compared it to the Knn performances. To this end, we injected ‘Salt and Pepper’ noise [40] (density = 0.2) into the unprocessed MNIST images; a sketch of this corruption step is given below. The noisy images are then pre-processed (see S1 Text) and presented to the network in both training and classification phases. Fig 4A depicts the average accuracy of the network trained incrementally over a total of 20 noisy examples per class and compares it to the performances of the Knn-n algorithms (as in Section 2.3, both the Knn and ThaCo algorithms are trained incrementally). It is worth noting that in this scenario ThaCo performs better than the Knn-n algorithms.
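A possible implementation of the noise injection is sketched here for 8-bit grayscale images; the function name is illustrative.

```python
import numpy as np

def salt_and_pepper(image, density=0.2, seed=None):
    """Flip a random fraction `density` of pixels to pure white or pure black."""
    rng = np.random.default_rng(seed)
    noisy = image.copy()
    mask = rng.random(image.shape) < density   # pixels selected for corruption
    salt = rng.random(image.shape) < 0.5       # half salt, half pepper
    noisy[mask & salt] = 255                   # salt: white pixels
    noisy[mask & ~salt] = 0                    # pepper: black pixels
    return noisy
```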
2.5 Beneficial effect of deep-sleep in compensating the impact of noisy contextual labels
To introduce more biologically-plausible elements regarding the combination of contextual and perceptual signals, we slightly modify the ThaCo training protocol: the magnitude of the contextual signal given to both the cortex and the readout layer for a new learning example is now randomly drawn from a Gaussian distribution (see the sketch below). As a consequence, some of the presented examples are better represented than others, resembling a more realistic situation in the cortex, in which both the degree of knowledge projected by other areas and the number and strength of the apical synapses carrying the contextual information and raising the perceptual thresholds during learning are not exactly equal for all the presented examples and all the neurons in the selected group.
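A minimal sketch of this modification, with purely illustrative mean and spread (not the calibrated ThaCo values):

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_context_amplitude(mean=1.0, sd=0.3):
    """Per-example contextual amplitude; clipped so facilitation stays non-negative."""
    return max(rng.normal(loc=mean, scale=sd), 0.0)

amplitudes = [noisy_context_amplitude() for _ in range(10)]  # one per example in a batch
```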
We introduce the deep-sleep state in our training protocol as follows: after each training phase, we disconnect ThaCo from external inputs and induce deep-sleep-like oscillations, following the method described in [27] and summarized in Section 2.1 and in S1 Text, paragraph “Sleep-like oscillatory dynamics”. As expected, noise in the contextual signal leads to a drop in performance compared to the idealized situation presented in the previous section (i.e. the careful equalization of the contextual signal), but such a drop can be reduced by sleep, as shown in Fig 5A.
At the synaptic level, it is possible to observe how deep-sleep-like slow oscillations induce in the current ThaCo model both a regularisation of the strength of the memories of individual learned examples, through homeostasis, and an association between groups of neurons trained over different examples of the same class. Figs 5B and 6 report such sleep-induced optimization of the synaptic representation of memories. Specifically, within neurons belonging to the same example-specific group, the distribution of synaptic weights decreases its mean and coefficient of variation (from μ = 74, μ/σ = 0.3, skewness 0.55 pre-sleep to μ = 60, μ/σ = 0.2, skewness −0.26 post-sleep), whereas within neurons belonging to different example-specific groups but coding for the same class, the distribution of synaptic weights increases its mean and coefficient of variation (from μ = 0.005, μ/σ = 0.001, skewness 0.003 pre-sleep to μ = 0.5, μ/σ = 1.6, skewness −2.9 post-sleep).
Specifically, the homeostatic effect of deep-sleep-like SOs can be identified by comparing Fig 6C and 6D: the distribution of synaptic weights sharpens (i.e. it exhibits smaller post-sleep σ and μ) and presents a general depression of synaptic weights. These two variations combine to produce beneficial effects. First, this leads to a lowering of the heterogeneity of the representation of learned examples, a reduction of the energetic cost of memory recall (reduced synaptic strength is associated with a lower metabolic cost of synaptic activity) and lower post-sleep spiking rates (see Section 2.1). Moreover, deep-sleep-like oscillations affect categorical association, as depicted in Fig 5B.b: the weights of synapses connecting groups of neurons trained over different examples belonging to the same digit class increase from a nearly zero pre-sleep value, while synapses connecting representations of memories belonging to different classes are much less affected. This effect is also visible when comparing Fig 6A and 6B, where the synapses connecting representations of memories belonging to the same digit class light up (large squares along the diagonal). Asymmetric STDP induces, on the one hand, the depression of strong synapses and, on the other, the association among neuronal groups coding for the same class (i.e. trained over similar stimuli), through a mechanism of resemblance in their thalamic representation. A sketch of the computation of the distribution statistics quoted above follows.
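For completeness, the statistics reported for the pre- and post-sleep weight distributions (mean, μ/σ and skewness) can be computed as sketched below; the function and array names are illustrative, with `weights` standing for the cortico-cortical weights within (or across) example-specific groups.

```python
import numpy as np
from scipy.stats import skew

def weight_stats(weights):
    """Mean, mean-to-sigma ratio and skewness of a set of synaptic weights."""
    w = np.asarray(weights, dtype=float)
    return {"mu": w.mean(),
            "mu_over_sigma": w.mean() / w.std(),
            "skewness": skew(w)}
```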
3 Discussion
We propose a simplified thalamo-cortical spiking model (ThaCo) that exploits the combination of context and perception to build a soft-WTA circuit, and that is able to express sleep-like slow oscillations. In order to be compliant with biological rhythms, we first verified that the proposed network is able to reproduce the experimental measures of neuronal firing rates during awake and deep-sleep states performed by Watson et al. [25]. The agreement with the experiments has been achieved by further developing the thalamo-cortical spiking model proposed in [27] and by setting the model parameters to better fit the experimental recordings. The model we propose is capable of fast incremental learning from few examples (its performances are comparable to those expressed by Knn, of rank increasing with the number of examples) and of alternating several learning-sleep phases; moreover, it demonstrates resilience when subjected to noisy perceptions, with better performances than the Knn algorithms. These three facts constitute significant extensions of the previous study [27].
In recent years, there has been growing interest in the development of artificial neural networks (ANNs) or deep neural networks inspired by features found in biology, yet still using mechanisms for learning and inference that are fundamentally different from what is actually observed in biology. On the other hand, there are also plenty of computational models aiming at reproducing biological properties in an exact way. Many models have been proposed for pattern recognition tasks that use biologically-plausible mechanisms, combining spiking networks and STDP plasticity [41–43]. The ThaCo model has been developed in line with this philosophy, delivering a spiking neural network that relies on a combination of biologically plausible mechanisms. It uses conductance-based AdEx neurons, STDP and lateral inhibition. A crucial ingredient, which most differentiates our approach from previous works, is the introduction of a contextual signal that drives the training procedure, making it similar to a target-based approach [44, 45] and enabling huge advantages in terms of training speed and precision. Such a mechanism was inspired by the work of Larkum [1], suggesting that the activity of a neuron is amplified when it receives a coincidence of signals from both lower and higher levels of abstraction. This allows the recruitment of new neurons to learn novel examples through the incremental building of a soft-WTA mechanism.
We stress that, even though we have shown that the model successfully reproduces specific experimental observations, the aim of this work is not to exactly reproduce a biological network (for instance, the emulation of metabolic processes goes beyond the scope of this work), but to develop a simplified task-specific spiking neural network able to express biological features, and to receive an indication of how even an approximate emulation of deep-sleep and of the combination of contextual and perceptual information can positively affect the network performances. Specifically, without any pretence of biological realism, the neuron model used in these simulations is a point-like AdEx (see section The neuron model), yet we are able to emulate the behaviour of a multi-compartment neuron (described in [1]) without introducing more complex morphological units in the network. To this end, we approximated the coincidence mechanism by setting both the contextual and sensory inputs impinging on cortical neurons at a subthreshold level (see Section 4.1).
Another major aspect of our work is the effect of sleep on the network and on the memories stored in it. The role played by sleep in memory consolidation has been widely studied from an experimental point of view [46, 47], but only recently has it become the object of theoretical and computational modelling [27, 48–50]. In our work, we investigated computationally the effect of slow oscillations on the structure and the performances of the network when STDP plasticity is turned on. We showed that deep-sleep-like slow oscillations can be beneficial for equalizing the memories stored in a cortico-thalamic structure when they are learned in noisy conditions. Indeed, slow oscillations can compensate for the contextual noise through homeostasis, equalizing synaptic weights and creating beneficial associations that improve classification performance.
The predictions of our model are also a first step toward the reconciliation of recent experimental observations about both an average synaptic down-scaling effect (synaptic homeostatic hypothesis—SHY [51]) and a differential modulation of firing rates [25] induced by deep-sleep, which is believed to be a default state mode for the cortex [52].
As mentioned above, we focused on the role of NREM sleep in memory consolidation. The simulation of a complete sleep cycle that includes REM and micro-arousal phases goes beyond the scope of this paper and is currently under investigation. A further limitation of this work is that it does not take into account the role of synchronization among different brain regions. Indeed, assuming a typical neuronal density of about 5 ⋅ 10⁴ neurons per mm² of cortex, and considering that the maximum size of the proposed model reaches 5000 cortical neurons, such a number is equivalent to a small cortical patch about 300 μm across, well below the size of a single cortical area. To overcome this limitation, we are extending the model to multi-layer and multi-area descriptions.
Finally, this work represents an additional contribution to the understanding of sleep mechanisms and functions, in line with the efforts we are carrying out in data analysis [53, 54] and in large-scale simulations [55], aimed at bridging different elements in a multidisciplinary approach. In particular, it points to a careful balance between architectural abstraction and experimental observations as a valid methodology for the description of brain mechanisms and of their links with cognitive functions.
4 Materials and methods
The results of the ThaCo model (Section 2) have been obtained thanks to the careful implementation and tuning of several features, presented in this Section and in the S1 Text. In particular, Section 4.1 addresses the crucial point of the model calibration, aimed at inducing a soft-WTA mechanism by combining context and perception. This is achieved by setting the network parameters in what we call an under-threshold regime, which enables training through the selected STDP model on a single-compartment standard Adaptive Exponential integrate-and-fire (AdEx) neuron that would not otherwise distinguish between basal and apical stimuli (see S1 Text). MNIST characters are coded by the thalamus according to the scheme presented in S1 Text, which preserves a notion of distance among visual features.
The simulations reported here were executed on dual-socket servers with an eight-core Intel(R) Xeon(R) E5-2620 v4 CPU per socket. The cores are clocked at 2.10GHz with HyperThreading enabled, so that each core can run 2 processes, for a total of 32 processes per server. The ThaCo model has been implemented using the NEST 2.12.0 [56] simulation engine.
4.1 Winner-take-all mechanisms by combining context and perception
We set the network parameters to induce the creation of WTA mechanisms by emulating the organizing principle of the cortex described by Larkum et al. [1].
During the training, the network is set in a hard-WTA regime (the firing rate differs from zero only in a selected example-specific subset of neurons), while during classification it works in a soft-WTA regime (i.e. the firing rate can be different from zero in multiple groups of neurons, and the winner group is assumed to be the one firing at the highest rate). Specifically, during the training, we set the parameters such that the thalamic signal alone is not sufficient to make neurons spike. This is reported in Fig 7C, which represents the mean firing rate and the membrane potential over time for a group of cortical neurons stimulated to encode a training example in three different contextual scenarios: Fig 7C-center shows the network behaviour in the absence of a thalamic signal; Fig 7C-left shows the network response without the contextual signal; Fig 7C-right shows the network behaviour with both the contextual and the thalamic signal. The cortical activity in the absence of the contextual signal is null, and it is very low when only the contextual signal is present. The combined action of the two, on the other hand, yields a markedly higher spiking activity; a toy sketch of this coincidence effect is given below. We can therefore conclude that we put the network in what we named an under-threshold regime. Moreover, to better show the implemented soft winner-take-all dynamics, we present the mean firing rate of three subgroups of cortical neurons trained over different examples belonging to different categories during both retrieval (i.e. training examples are presented again to the network without any contextual signal) and classification phases. The implementation of the WTA dynamics is depicted in Fig 7D.
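The toy calculation below illustrates the under-threshold regime with a passive steady-state membrane equation: neither input alone drives the neuron past threshold, while their coincidence does. All constants are illustrative and are not the calibrated AdEx parameters.

```python
def steady_state_v(i_thal, i_ctx, v_rest=-70.0, r_m=10.0):
    """Passive steady-state membrane voltage (mV) for two input currents."""
    return v_rest + r_m * (i_thal + i_ctx)

V_TH = -54.0  # illustrative firing threshold (mV)
print(steady_state_v(1.0, 0.0) > V_TH)   # False: perception alone stays subthreshold
print(steady_state_v(0.0, 1.0) > V_TH)   # False: context alone stays subthreshold
print(steady_state_v(1.0, 1.0) > V_TH)   # True: their coincidence crosses threshold
```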
4.1.1 Simple mathematical model of soft-WTA creation
In this section, we discuss the capability of our model to learn from a few examples through soft-WTA mechanisms. First, we show how the network is endowed with the capability to behave like a Knn classifier. In the first training step, the network is exposed to one example for each of the ten digit classes (L = 10). Let D(l) = {1 + (l − 1)K, …, lK} be the set of indices of the K excitatory cortical neurons that are induced to fire by the simultaneous presence of the contextual stimulation and the thalamic input, carried by T thalamic neurons (see Fig 7C), when presented with one of the training examples l ∈ {1, …, L}. Also, starting from an initial value $w_0$, let $w_{\mathrm{eq}}$ be the final average weight induced by STDP on the connections between the thalamic excitatory neurons that are active during the learning of the training example l and the K excitatory cortical neurons that are induced to activity. Finally, let $x^{(l)} \in \{0, 1\}^T$ be the binary feature vector of the training example l. The average weight at equilibrium of the connections between the thalamic neurons activated by the example l and the excitatory cortical neurons can be written as:

$$w_j^{(l)} = w_0 + \left( w_{\mathrm{eq}} - w_0 \right) x_j^{(l)}, \qquad j = 1, \dots, T \tag{1}$$
After the training on the first set of L examples, a total of C = KL cortical neurons will have been exposed to the combination of contextual and thalamic stimulation (see Fig 3A). During the classification phase, represented in Fig 3B, when a never-seen stimulus (the image S to be classified) is presented to the network, the average signal from the thalamic layer (composed of T neurons) to the excitatory cortical neurons (in all the L trained cortical groups) is:

$$g_{\mathrm{th}}^{(l)}(S) = \rho_{\mathrm{th}} \sum_{j=1}^{T} w_j^{(l)}\, x_j^{(S)} \tag{2}$$
where $\rho_{\mathrm{th}}$ is the rate of the active thalamic neurons, and $x^{(S)} \in \{0, 1\}^T$ is the binary thalamic feature vector of the novel image S to be classified. Assuming $w_0$ to be much smaller than $w_{\mathrm{eq}}$, the average signal from thalamic neurons to each cortical neuron belonging to the D(l) group can thus be written as:

$$g_{\mathrm{th}}^{(l)}(S) \simeq \rho_{\mathrm{th}}\, w_{\mathrm{eq}}\; x^{(l)} \cdot x^{(S)} \tag{3}$$
where S is the novel stimulus presented during the classification phase and l is the training example over which the set of neurons D(l) has been trained. The vectors of thalamic features can be normalized ($u = x / \|x\|_2$), using their Euclidean norm ($\|x\|_2 = \sqrt{\sum_j x_j^2}$). The Euclidean distance between each training example (l) and the image (S) to be classified can be written as $d_{l,S} = \|u^{(l)} - u^{(S)}\|_2$. It follows that $u^{(l)} \cdot u^{(S)} = 1 - d_{l,S}^2/2$, where we used the normalization condition for both $u^{(l)}$ and $u^{(S)}$. In this way Eq 3 can be rewritten as:

$$g_{\mathrm{th}}^{(l)}(S) \simeq \rho_{\mathrm{th}}\, w_{\mathrm{eq}}\, \|x^{(l)}\|_2\, \|x^{(S)}\|_2 \left( 1 - \frac{d_{l,S}^2}{2} \right) \tag{4}$$
Eq 4 tells us that the thalamic signal is a decreasing function of the distance $d_{l,S}$, if all training examples are equally normalized ($\|x^{(i)}\|_2 = \|x^{(j)}\|_2\ \forall i, j \in 1, \dots, L$) and neglecting for a while the possible changes in the thalamic rate $\rho_{\mathrm{th}}$, which in our model can be mediated by the existing cortico-thalamic feedback path. Under the approximation of constant $\rho_{\mathrm{th}}$ we can immediately show that, after having been exposed to the first set of training examples, the soft-WTA ThaCo excitatory network is at least endowed with the capability to behave as a nearest-neighbour classifier of the first order (Knn-1 classifier). The winning candidate $\hat{l}$ among the L competing cortical groups is initially suggested to the network as the one reached by the strongest thalamic stimulus when presented with the never-seen image S:

$$\hat{l} = \underset{l}{\operatorname{argmax}}\; g_{\mathrm{th}}^{(l)}(S) = \underset{l}{\operatorname{argmin}}\; d_{l,S} \tag{5}$$
Indeed, under the assumption that the neuron activity depends on the incoming signal (both excitatory and inhibitory) through a transfer function monotonically increasing in the total incoming current g, we will now show that: 1) the role of inhibition is to help the computation of (a soft) argmax; 2) the recurrent intra-group cortical excitation provides an additional boost to the selection of the winner. To confirm this, we shall now consider explicitly the contribution of both the recurrent and the inhibitory terms. The total average input signal to each cortical neuron depends on the group l the neuron belongs to and on the stimulus S to be classified:

$$g^{(l)}(S) = g_{\mathrm{th}}^{(l)}(S) + g_{\mathrm{cx}}^{(l)}(S) - g_{\mathrm{inh}}(S) \tag{6}$$
Under the approximation of constant $\rho_{\mathrm{th}}$, the first term in Eq 6 is provided by Eq 3.
Concerning the second term, the training protocol illustrated by Fig 7 creates cortico-cortical synapses of strength $w_{\mathrm{cx,eq}}$ only among neurons belonging to the same group l, i.e. among neurons trained on the same example, while connections among neurons selective for different training examples are left at the initial value $w_{\mathrm{cx},0}$. Assuming that $w_{\mathrm{cx},0} \ll w_{\mathrm{cx,eq}}$, after learning we have:

$$w_{\mathrm{cx}}^{(l,l')} \simeq w_{\mathrm{cx,eq}}\, \delta_{l,l'} \tag{7}$$
Under the assumption that the activities of the K neurons belonging to the same subgroup (l) are similar to each other, the second term in Eq 6 reduces to the recurrent intra-group excitatory contribution:

$$g_{\mathrm{cx}}^{(l)}(S) = K\, w_{\mathrm{cx,eq}}\, \rho^{(l)}(S) \tag{8}$$

where $\rho^{(l)}(S)$ is the average firing rate reached by the cortical group l when activated by the novel stimulus S.
In our simplified model, all $w_{\mathrm{cx}\to\mathrm{inh}}$ and $w_{\mathrm{inh}\to\mathrm{cx}}$ synapses are non-plastic and set to an identical value. Therefore the third term, the input signal from cortico-cortical inhibition, is in our architecture equal to:

$$g_{\mathrm{inh}}(S) = N_{\mathrm{inh}}\, w_{\mathrm{inh}\to\mathrm{cx}}\, \rho_{\mathrm{inh}}(S) \tag{9}$$

where $N_{\mathrm{inh}}$ is the number of cortical inhibitory neurons and $\rho_{\mathrm{inh}}(S)$ is the inhibitory neurons' activity.
In summary, Eq 6, i.e. the total current stimulating each of the L groups of cortical neurons responding to the thalamic stimulus S, can be reformulated as:

$$g^{(l)}(S) = \rho_{\mathrm{th}}\, w_{\mathrm{eq}}\; x^{(l)} \cdot x^{(S)} + K\, w_{\mathrm{cx,eq}}\, \rho^{(l)}(S) - N_{\mathrm{inh}}\, w_{\mathrm{inh}\to\mathrm{cx}}\, \rho_{\mathrm{inh}}(S) \tag{10}$$
When the average rate is well below saturation, its relationship to the total input signal is well described by a threshold-linear function:

$$\rho^{(l)} = \alpha \left( g^{(l)} - g_{\mathrm{thresh}} \right) H\!\left( g^{(l)} - g_{\mathrm{thresh}} \right) \tag{11}$$
where α is a constant coefficient, H is the Heaviside function and $g_{\mathrm{thresh}}$ is the firing threshold. Therefore, assuming that the input signal is above threshold ($g^{(l)} > g_{\mathrm{thresh}}$), we have that:

$$\rho^{(l)}(S) = \frac{\alpha \left( g_{\mathrm{th}}^{(l)}(S) - g_{\mathrm{inh}}(S) - g_{\mathrm{thresh}} \right)}{1 - \alpha K w_{\mathrm{cx,eq}}} \tag{12}$$

We also require that $\alpha K w_{\mathrm{cx,eq}} < 1$, i.e. the self-feedback should be smaller than one, otherwise the system would become unstable.
Considering that the inhibitory signal is equal for all L groups (under the provisional assumption of constant $\rho_{\mathrm{th}}$, i.e. no cortico-thalamic feedback), Eq 12 tells us that the final choice of the network confirms the initial guess of Eq 5:

$$\underset{l}{\operatorname{argmax}}\; \rho^{(l)}(S) = \underset{l}{\operatorname{argmin}}\; d_{l,S} \tag{13}$$

i.e. the network tends to a stationary condition in which the L groups of K neurons settle at different firing rates that decrease with the distance $d_{l,S}$. Moreover, the readout layer combines the signals coming from groups of cortical neurons trained over different examples yet belonging to the same class: thus, the network expresses a behaviour similar to that of a higher-order Knn-n. A numeric check of this equivalence is sketched below.
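The equivalence between the strongest thalamic drive and the nearest-neighbour choice (Eqs 4, 5 and 13) can be verified numerically, as in the sketch below, where random feature vectors stand in for the thalamic codes: for equally normalized vectors, the group receiving the largest drive is exactly the one whose training example is closest in Euclidean distance.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((10, 324))                       # one feature vector per trained group
X /= np.linalg.norm(X, axis=1, keepdims=True)   # equal normalization, as assumed in Eq 4
s = rng.random(324)                             # the novel stimulus S
s /= np.linalg.norm(s)

drive = X @ s                                   # thalamic signal per group, up to rho_th * w_eq
dist = np.linalg.norm(X - s, axis=1)            # Euclidean distances d_{l,S}
assert drive.argmax() == dist.argmin()          # winner by drive == nearest neighbour (Knn-1)
```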
5 Supporting information
Acknowledgments
We thank the artist Lorenzo PONT Pontani for the cat drawing in Fig 1.
Data Availability
The model and data used to draw the results outlined in this manuscript are available at https://github.com/APE-group/ThaCo2.git and at https://doi.org/10.5281/zenodo.4769175.
Funding Statement
This work has been supported by the European Union Horizon 2020 Research and Innovation program under the FET Flagship Human Brain Project (grant agreement SGA3 n. 945539 and grant agreement SGA2 n. 785907; recipient Pier Stanislao Paolucci) and by the INFN APE Parallel/Distributed Computing laboratory. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Larkum M. A cellular mechanism for cortical associations: an organizing principle for the cerebral cortex. Trends in Neurosciences. 2013;36(3):141—151. Available from: http://www.sciencedirect.com/science/article/pii/S0166223612002032. [DOI] [PubMed] [Google Scholar]
- 2. Barone P, Batardiere A, Knoblauch K, Kennedy H. Laminar Distribution of Neurons in Extrastriate Areas Projecting to Visual Areas V1 and V4 Correlates with the Hierarchical Rank and Indicates the Operation of a Distance Rule. The Journal of neuroscience: the official journal of the Society for Neuroscience. 2000. 06;20:3263–81. doi: 10.1523/JNEUROSCI.20-09-03263.2000 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3. Wang XJ. Probabilistic Decision Making by Slow Reverberation in Cortical Circuits. Neuron. 2002;36(5):955—968. Available from: http://www.sciencedirect.com/science/article/pii/S0896627302010929. [DOI] [PubMed] [Google Scholar]
- 4. Furman M, Wang XJ. Similarity Effect and Optimal Control of Multiple-Choice Decision Making. Neuron. 2008;60(6):1153—1168. Available from: http://www.sciencedirect.com/science/article/pii/S0896627308010490. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5. Walther D, Koch C. Modeling attention to salient proto-objects. Neural Networks. 2006;19(9):1395—1407. Brain and Attention. Available from: http://www.sciencedirect.com/science/article/pii/S0893608006002152. [DOI] [PubMed] [Google Scholar]
- 6. Wolfrum P. A Correspondence-Based Neural Model for Face Recognition. In: Heidelberg SB, editor. Information Routing, Correspondence Finding; 2010. p. 29–67. Available from: 10.1007/978-3-642-15254-2_3. [DOI] [Google Scholar]
- 7. Nessler B, Pfeiffer M, Buesing L, Maass W. Bayesian Computation Emerges in Generic Cortical Microcircuits through Spike-Timing-Dependent Plasticity. PLOS Computational Biology. 2013;9(4):1–30. doi: 10.1371/journal.pcbi.1003037
- 8. Coultrip R, Granger R, Lynch G. A cortical model of winner-take-all competition via lateral inhibition. Neural Networks. 1992;5(1):47–54. Available from: http://www.sciencedirect.com/science/article/pii/S0893608005800061.
- 9. Maass W. On the Computational Power of Winner-Take-All. Neural Computation. 2000;12(11):2519–2535. doi: 10.1162/089976600300014827
- 10. Douglas RJ, Martin KAC. Neuronal circuits of the neocortex. Annual Review of Neuroscience. 2004;27(1):419–451. doi: 10.1146/annurev.neuro.27.070203.144152
- 11. Rutishauser U, Douglas RJ, Slotine JJ. Collective Stability of Networks of Winner-Take-All Circuits. Neural Computation. 2011;23(3):735–773. doi: 10.1162/NECO_a_00091
- 12. Jug F, Cook M, Steger A. Recurrent competitive networks can learn locally excitatory topologies. In: The 2012 International Joint Conference on Neural Networks (IJCNN); 2012. p. 1–8.
- 13. Binas J, Rutishauser U, Indiveri G, Pfeiffer M. Learning and stabilization of winner-take-all dynamics through interacting excitatory and inhibitory plasticity. Frontiers in Computational Neuroscience. 2014;8:68. Available from: https://www.frontiersin.org/article/10.3389/fncom.2014.00068.
- 14. Gerstner W, Kempter R, van Hemmen JL, Wagner H. A neuronal learning rule for sub-millisecond temporal coding. Nature. 1996;383(6595):76–78. doi: 10.1038/383076a0
- 15. Song S, Miller KD, Abbott LF. Competitive Hebbian learning through spike-timing-dependent synaptic plasticity. Nature Neuroscience. 2000;3(9):919–926. doi: 10.1038/78829
- 16. Morrison A, Diesmann M, Gerstner W. Phenomenological models of synaptic plasticity based on spike timing. Biological Cybernetics. 2008;98(6):459–478. doi: 10.1007/s00422-008-0233-1
- 17. Sboev A, Vlasov D, Serenko A, Rybka R, Moloshnikov I. On the applicability of STDP-based learning mechanisms to spiking neuron network models. AIP Advances. 2016;6(11):111305. doi: 10.1063/1.4967353
- 18. Levenstein D, Watson BO, Rinzel J, Buzsáki G. Sleep regulation of the distribution of cortical firing rates. Current Opinion in Neurobiology. 2017;44:34–42. Available from: https://www.sciencedirect.com/science/article/pii/S0959438816301933.
- 19. Chen Y, Mckinstry J, Edelman G. Versatile networks of simulated spiking neurons displaying winner-take-all behavior. Frontiers in Computational Neuroscience. 2013;7:16. Available from: https://www.frontiersin.org/article/10.3389/fncom.2013.00016.
- 20. Walker MP, Stickgold R. Sleep, Memory, and Plasticity. Annual Review of Psychology. 2006;57(1):139–166. doi: 10.1146/annurev.psych.56.091103.070307
- 21. Jadhav SP, Kemere C, German PW, Frank LM. Awake Hippocampal Sharp-Wave Ripples Support Spatial Memory. Science. 2012;336(6087):1454–1458. Available from: https://science.sciencemag.org/content/336/6087/1454.
- 22. Killgore WDS. Effects of sleep deprivation on cognition. In: Progress in Brain Research. vol. 185. Elsevier; 2010. p. 105–129. Available from: http://www.sciencedirect.com/science/article/pii/B9780444537027000075.
- 23. Buzsáki G. Hippocampal sharp wave-ripple: A cognitive biomarker for episodic memory and planning. Hippocampus. 2015;25(10):1073–1188. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1002/hipo.22488.
- 24. Tononi G, Cirelli C. Sleep and the Price of Plasticity: From Synaptic and Cellular Homeostasis to Memory Consolidation and Integration. Neuron. 2014;81(1):12–34. Available from: http://www.sciencedirect.com/science/article/pii/S0896627313011860.
- 25. Watson BO, Levenstein D, Greene JP, Gelinas JN, Buzsáki G. Network Homeostasis and State Dynamics of Neocortical Sleep. Neuron. 2016;90(4):839–852. Available from: http://www.sciencedirect.com/science/article/pii/S0896627316300563.
- 26. Smulders FTY, Kenemans JL, Jonkman LM, Kok A. The effects of sleep loss on task performance and the electroencephalogram in young and elderly subjects. Biological Psychology. 1997;45(1):217–239. Available from: http://www.sciencedirect.com/science/article/pii/S0301051196052295.
- 27. Capone C, Pastorelli E, Golosio B, Paolucci PS. Sleep-like slow oscillations improve visual classification through synaptic homeostasis and memory association in a thalamo-cortical model. Scientific Reports. 2019;9:8990. Available from: https://www.nature.com/articles/s41598-019-45525-0.
- 28. Goldman JS, Tort-Colet N, di Volo M, Susin E, Bouté J, Dali M, et al. Bridging Single Neuron Dynamics to Global Brain States. Frontiers in Systems Neuroscience. 2019;13:75. Available from: https://www.frontiersin.org/article/10.3389/fnsys.2019.00075.
- 29. Vul E, Goodman N, Griffiths TL, Tenenbaum JB. One and Done? Optimal Decisions From Very Few Samples. Cognitive Science. 2014;38(4):599–637. Available from: https://onlinelibrary.wiley.com/doi/full/10.1111/cogs.12101.
- 30. Read J, Bifet A, Pfahringer B, Holmes G. Batch-Incremental versus Instance-Incremental Learning in Dynamic and Evolving Data. In: Advances in Intelligent Data Analysis XI. Springer Berlin Heidelberg; 2012. p. 313–323. doi: 10.1007/978-3-642-34156-4_29
- 31. Lomonaco V, Maltoni D. CORe50: a New Dataset and Benchmark for Continuous Object Recognition. In: Levine S, Vanhoucke V, Goldberg K, editors. Proceedings of the 1st Annual Conference on Robot Learning. vol. 78 of Proceedings of Machine Learning Research. PMLR; 2017. p. 17–26. Available from: http://proceedings.mlr.press/v78/lomonaco17a.html.
- 32. Shimizu R, Asako K, Ojima H, Morinaga S, Hamada M, Kuroda T. Balanced Mini-Batch Training for Imbalanced Image Data Classification with Neural Network. In: 2018 First International Conference on Artificial Intelligence for Industries (AI4I); 2018. p. 27–30.
- 33. Cover T, Hart P. Nearest neighbor pattern classification. IEEE Transactions on Information Theory. 1967;13(1):21–27. doi: 10.1109/TIT.1967.1053964
- 34. Gigante G, Mattia M, Giudice PD. Diverse Population-Bursting Modes of Adapting Spiking Neurons. Physical Review Letters. 2007;98:148101. Available from: https://link.aps.org/doi/10.1103/PhysRevLett.98.148101.
- 35. Capone C, Rebollo B, Muñoz A, Illa X, Del Giudice P, Sanchez-Vives MV, et al. Slow Waves in Cortical Slices: How Spontaneous Activity is Shaped by Laminar Structure. Cerebral Cortex. 2017;29(1):319–335. doi: 10.1093/cercor/bhx326
- 36. Gütig R, Aharonov R, Rotter S, Sompolinsky H. Learning Input Correlations through Nonlinear Temporally Asymmetric Hebbian Plasticity. Journal of Neuroscience. 2003;23(9):3697–3714. Available from: https://www.jneurosci.org/content/23/9/3697.
- 37. Hobson JA, Pace-Schott EF. The cognitive neuroscience of sleep: neuronal systems, consciousness and learning. Nature Reviews Neuroscience. 2002;3(9):679–693. doi: 10.1038/nrn915
- 38. Tort-Colet N, Capone C, Sanchez-Vives MV, Mattia M. Attractor competition enriches cortical dynamics during awakening from anesthesia. bioRxiv. 2019. Available from: https://www.biorxiv.org/content/early/2019/01/10/517102.
- 39. Roxin A, Brunel N, Hansel D, Mongillo G, van Vreeswijk C. On the Distribution of Firing Rates in Networks of Cortical Neurons. Journal of Neuroscience. 2011;31(45):16217–16226. Available from: https://www.jneurosci.org/content/31/45/16217.
- 40. Boncelet C. Image Noise Models. In: Bovik A, editor. Handbook of Image and Video Processing. 2nd ed. Burlington: Academic Press; 2005. p. 397–409. Available from: http://www.sciencedirect.com/science/article/pii/B9780121197926500875.
- 41. Diehl P, Cook M. Unsupervised learning of digit recognition using spike-timing-dependent plasticity. Frontiers in Computational Neuroscience. 2015;9:99. Available from: https://www.frontiersin.org/article/10.3389/fncom.2015.00099.
- 42. Mozafari M, Ganjtabesh M, Nowzari-Dalini A, Thorpe SJ, Masquelier T. Bio-inspired digit recognition using reward-modulated spike-timing-dependent plasticity in deep convolutional networks. Pattern Recognition. 2019;94:87–95. Available from: http://www.sciencedirect.com/science/article/pii/S0031320319301906.
- 43. Bagheri A, Simeone O, Rajendran B. Training Probabilistic Spiking Neural Networks with First-to-Spike Decoding. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2018. p. 2986–2990.
- 44. Muratore P, Capone C, Paolucci PS. Target spike patterns enable efficient and biologically plausible learning for complex temporal tasks. PLOS ONE. 2021;16(2):1–22. doi: 10.1371/journal.pone.0247014
- 45. Ingrosso A, Abbott L. Training dynamically balanced excitatory-inhibitory networks. PLOS ONE. 2019;14(8):e0220547. doi: 10.1371/journal.pone.0220547
- 46. Walker MP, Stickgold R. Sleep-Dependent Learning and Memory Consolidation. Neuron. 2004;44(1):121–133. doi: 10.1016/j.neuron.2004.08.031
- 47. Diekelmann S, Born J. The memory function of sleep. Nature Reviews Neuroscience. 2010;11(2):114–126. doi: 10.1038/nrn2762
- 48. Wei Y, Krishnan GP, Komarov M, Bazhenov M. Differential roles of sleep spindles and sleep slow oscillations in memory consolidation. PLOS Computational Biology. 2018;14(7):1–32. doi: 10.1371/journal.pcbi.1006322
- 49. Wei Y, Krishnan GP, Marshall L, Martinetz T, Bazhenov M. Stimulation Augments Spike Sequence Replay and Memory Consolidation during Slow-Wave Sleep. Journal of Neuroscience. 2020;40(4):811–824. Available from: https://www.jneurosci.org/content/40/4/811.
- 50. Fachechi A, Agliari E, Barra A. Dreaming neural networks: Forgetting spurious memories and reinforcing pure ones. Neural Networks. 2019;112:24–40. Available from: http://www.sciencedirect.com/science/article/pii/S0893608019300176.
- 51. Tononi G, Cirelli C. Sleep and synaptic down-selection. European Journal of Neuroscience. 2020;51(1):413–421. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1111/ejn.14335.
- 52. Sanchez-Vives MV, Massimini M, Mattia M. Shaping the default activity pattern of the cortical network. Neuron. 2017;94(5):993–1001. doi: 10.1016/j.neuron.2017.05.015
- 53. De Bonis G, Dasilva M, Pazienti A, Sanchez-Vives MV, Mattia M, Paolucci PS. Analysis Pipeline for Extracting Features of Cortical Slow Oscillations. Frontiers in Systems Neuroscience. 2019;13:70. Available from: https://www.frontiersin.org/article/10.3389/fnsys.2019.00070.
- 54. Celotto M, De Luca C, Muratore P, Resta F, Allegra Mascaro AL, Pavone FS, et al. Analysis and Model of Cortical Slow Waves Acquired with Optical Techniques. Methods and Protocols. 2020;3(1):14. Available from: https://www.mdpi.com/2409-9279/3/1/14.
- 55. Pastorelli E, Capone C, Simula F, Sanchez-Vives MV, Del Giudice P, Mattia M, et al. Scaling of a Large-Scale Simulation of Synchronous Slow-Wave and Asynchronous Awake-Like Activity of a Cortical Model With Long-Range Interconnections. Frontiers in Systems Neuroscience. 2019;13:33. Available from: https://www.frontiersin.org/article/10.3389/fnsys.2019.00033.
- 56. Kunkel S, Morrison A, Weidel P, Eppler JM, Sinha A, Schenck W, et al. NEST 2.12.0. Zenodo; 2017. doi: 10.5281/zenodo.259534