A Neurocomputational Model of Stimulus-Specific Adaptation to Oddball and Markov Sequences

Robert Mill; Martin Coath; Thomas Wennekers; Susan L Denham

doi:10.1371/journal.pcbi.1002117

PLoS Comput Biol. 2011 Aug 18;7(8):e1002117.

Editor: Olaf Sporns

PMCID: PMC3158038 PMID: 21876661

Abstract

Stimulus-specific adaptation (SSA) occurs when the spike rate of a neuron decreases with repetitions of the same stimulus, but recovers when a different stimulus is presented. It has been suggested that SSA in single auditory neurons may provide information to change detection mechanisms evident at other scales (e.g., mismatch negativity in the event related potential), and participate in the control of attention and the formation of auditory streams. This article presents a spiking-neuron model that accounts for SSA in terms of the convergence of depressing synapses that convey feature-specific inputs. The model is anatomically plausible, comprising just a few homogeneously connected populations, and does not require organised feature maps. The model is calibrated to match the SSA measured in the cortex of the awake rat, as reported in one study. The effects of frequency separation, deviant probability, repetition rate and duration upon SSA are investigated. With the same parameter set, the model generates responses consistent with a wide range of published data obtained in other auditory regions using other stimulus configurations, such as block, sequential and random stimuli. A new stimulus paradigm is introduced, which generalises the oddball concept to Markov chains, allowing the experimenter to vary the tone probabilities and the rate of switching independently. The model predicts greater SSA for higher rates of switching. Finally, the issue of whether rarity or novelty elicits SSA is addressed by comparing the responses of the model to deviants in the context of a sequence of a single standard or many standards. The results support the view that synaptic adaptation alone can explain almost all aspects of SSA reported to date, including its purported novelty component, and that non-trivial networks of depressing synapses can intensify this novelty response.

Author Summary

For processing real-life auditory scenes, it is not enough that auditory neurons code only for basic stimulus properties, such as frequency and intensity; at some point, these isolated properties must be woven into a pattern. Stimulus-specific adaptation (SSA), whereby neurons adapt to common stimuli but otherwise remain sensitive to other, rare stimuli, has been proposed as a low-level substrate for such abstract pattern processing. SSA has been previously investigated using ‘oddball sequences’ of tones, in which one frequency is common, the other rare. In this article, we present the first neurocomputational model of SSA and show that it can reproduce a wide range of published data. We also propose a natural generalisation of the oddball paradigm, based on Markov chains, which allows the experimenter to manipulate other characteristics of the sequence such as the rate of switching. Finally, we show that a small network of neurons can distinguish novelty from mere rarity; e.g., a B stands out in the sequence ABAAA in a way that it does not in CBADE, even though it is equally probable in both. We demonstrate that cascades of depressing synapses can adequately encode this difference, whereas the simple adaptation-based models proposed to date cannot.

Introduction

Natural acoustic environments play host to a wide variety of sounds that are either repetitive or follow a regular pattern. If an organism that inhabits one of these environments hears a repeating sound and does not react to the first few salient presentations, then it is unlikely that further repetitions will be behaviourally relevant. On the other hand, if the organism is to respond to changes in its environment, then it cannot adapt to stimuli indiscriminately; rather, it must remain sensitive to even small deviations from an established pattern. It is within such an evolutionary context that the brain has acquired stimulus-specific adaptation (SSA) mechanisms that operate across several time scales and sensory resolutions [1].

SSA in response to tone sequences has been measured in the spiking of single neurons at various stages of the auditory pathway, including the inferior colliculus (IC) in the rat [2], [3], medial geniculate body (MGB) of the thalamus in the mouse [4] and rat [5], thalamic reticular nucleus in the rat [6], and primary auditory cortex in the cat [7], [8] and rat [9]. It has been suggested [7], [8], [10] that SSA in single neurons lies on the path leading to the generation of mismatch negativity (MMN), a frontocentrally negative-going deflection in the event-related potential [11], [12] that is evoked in response to violations of an established temporal sound pattern, including changes in frequency, intensity, duration and even the omission of an expected stimulus (for a recent review, see [13]). It is thought that MMN, in turn, may be implicated in the redirection of attention [14], maintain the representation of the auditory context [12], and contribute to auditory scene analysis [12], [15].

In this article we describe a neurocomputational model of SSA based on a small network of spiking neurons connected by dynamic synapses. The model components are all drawn from the literature [16]–[19] and are implemented without significant modification in order to keep free parameters to a minimum. In terms of its overall architecture, the model rests upon few anatomical assumptions, as it consists solely of a small number of homogeneous populations joined together in uniform patterns of connectivity, which could exist in the brain (e.g., all-to-all, sparse/random). The model requires feature-tuned inputs but does not require that these inputs be mapped topographically. As frequency selectivity in neurons is best understood, and most SSA studies to date have only manipulated frequency, the inputs of our model are tuned to frequencies. This study offers three distinct contributions to the ongoing discussion concerning stimulus-specific adaptation in single neurons: a new model of SSA that accounts for an array of experimental results; a description of a novel stimulus paradigm, accompanied by predictions from the model that can be tested experimentally; and an exploration of the effect of linking adapting processes in series on SSA and novelty detection in general.

It is sometimes remarked that the time scale of recovery from adaptation to tones measured in cortex is consistent with the time it takes cortical synapses to recover from synaptic depression [7], [8], [20]. Despite the strikingly suggestive similarity in the dynamics, and the availability of a light-weight model of a depressing cortical synapse [18], we are not aware of any modelling study to date that has explicitly attempted to bridge this explanatory gap: assembling these model synapses into networks with a view to replicating the results of SSA experiments. Here we undertake just such a study, taking as our primary data the results obtained by von der Behrens et al. [9] in the auditory cortex of the awake rat, which are presented in such a format as to be particularly conducive to the calibration of a model. A more general, theoretical treatment of the properties of networks constructed using this dynamic synapse model is given in [21]. Some mathematical results pertaining to SSA when viewed as an abstract computational process are discussed in [22].

Having configured the model to respond to oddball sequences in a manner consistent with the published physiological data, we then probe it with patterns of standards and deviants generated by first-order Markov chains [23], wherein the probability that a given tone is standard or deviant depends on its immediate predecessor. Oddball sequences actually constitute a specific subset of two-state Markov chains. Progressing to general Markov chains enables one to vary not only the probability of a deviant, but also the probability of switching between deviants and standards; or, from another perspective, to control the degree to which deviants and standards “clump together” in the sequence, whilst maintaining their overall proportions. The model furnishes explicit predictions regarding the response of SSA neurons to tone sequences generated by Markov chains.

Finally, we examine serial arrangements of depressing synapses as a possible basis for certain types of novelty detection. This architecture is motivated by the fact that some neurons respond more vigorously to deviant tones if they are embedded in a background of a single standard frequency than if they appear as one of many, equiprobable random tones [7], [10]. At the very least, the difference in the responses is not so great as one would expect from a model based on adaptation within channels [24]. A similar sensitivity to novelty is also apparent in the mismatch negativity [25], [26]. The idea of a two-layer model rests on the plausible suggestion that the pre-synaptic inputs to some depressing synapses themselves undergo adaptation due to synaptic depression elsewhere.

In the current study, we found that cross-channel adaptation within a single layer of depressing synapses was sufficient to account for the excess response to deviants embedded in a single standard provided that Inline graphic was large enough. However, introducing two layers of synaptic depression in series enhanced the effect, in that this excess response was larger, and the required to elicit the effect was smaller. In summary, on the one hand, our results support the case for an explanation of SSA based solely on adaptation, at least as far as frequency is concerned. On the other hand, commentators that adopt an adaptation-based interpretation of SSA tend to speak exclusively in terms of the depression or fatigue associated with afferents, whereas we demonstrate that linking depressing synapses in series (and, in principle, recurrently) can dramatically modify these effects.

Methods

In this section, we first describe the individual components that constitute the model, and then explain how these components are assembled to form networks containing units that exhibit SSA. We then discuss the time-varying patterns supplied as input to these networks, which are taken to represent the kinds of tone stimuli used in physiological SSA experiments. The neurocomputational models presented in this article are constructed from spiking units partitioned into populations, labelled A to D. We consider three types of network, and designate each according to the populations it contains: the AB model, the ABC model and the ABD model. These networks are illustrated schematically in Figure 1 and are described in greater detail below. Some results from an ABCD model, which contains all four populations, are included in Supplementary Text S1.

Figure 1. The blue boxes depict populations, and the figures printed inside state the number of units (or Poisson groups). The number of sub-populations in population A depends on whether the task is two-tone (96) or multi-tone (144). A) AB model consisting of a single layer of depressing synapses. B) ABC model, which introduces an inhibitory population. C) ABD model consisting of two layers of depressing synapses. The synaptic pathways drawn between populations stand for all-to-all connectivity. An exception is C→B: each unit in B receives 16 synapses at random from units in C.

Model Components

Spiking neuron models

The units in population A are independent Poisson processes, whose firing rates are modulated by the input stimulus. The units in populations B to D implement the adaptive exponential integrate-and-fire (AdEx) model proposed in [16], which incorporates sub-threshold and spike-triggered adaptation currents. Every AdEx unit uses the parameters listed in Table 1 of [16].

Dynamic synapses

The synapses in the model fall into three classes: fast excitatory, fast inhibitory, and fast excitatory with rapid depression and slow recovery. Fast excitatory and inhibitory synapses are based, respectively, on the simplified kinetic models of the AMPA/kainate and $\mathrm{GABA_A}$ receptors described in [17], and we adopt the parameter sets provided there.

The depressing synapse model combines features of the AMPA synapse model from [17] and the model presented in [18]. It assumes that a unit supply of resources is divided amongst three states: recovered ( Inline graphic ), effective () and inactive (). Initially, and . The system of equations governing the flow of transmitter between states is similar to that found in [18]:

Following a pre-synaptic action potential, Inline graphic is set (or reset) to one for a duration of ; afterwards, it returns to zero. In these equations, refers to a quantity analogous to that defined in [18] as . Whereas the model in [18] uses a delta function to represent the effect of a pre-synaptic spike, this model and [17] use a brief, square pulse ( Inline graphic ). The time constants and are taken from [17], and control the rate at which recovered transmitter substance becomes effective, and effective transmitter substance becomes inactive, respectively. The third time constant, , controls the rate at which inactive substance is recovered and is taken from Figure 1B in [18]. Note that by setting Inline graphic one obtains the non-depressing version of the AMPA synapse. The excitatory post-synaptic current for the depressing synapse is then proportional to the fraction of substance that is effective:

where Inline graphic is the post-synaptic membrane potential, is a reversal potential [17], and is an overall synaptic efficacy.
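The three-state scheme described above can be illustrated with a minimal Python sketch. The equations in the code are one form consistent with the verbal description (resources flow from recovered to effective to inactive and back, gated by a brief square pulse after each pre-synaptic spike); they are not copied from [17] or [18]. The state names (`x`, `y`, `z`), the time-constant names (`tau_on`, `tau_off`, `tau_rec`), the pulse width and all numerical values are assumptions chosen only for illustration.

```python
import numpy as np

def simulate_depressing_synapse(spike_times_ms, t_end_ms, dt=0.05,
                                tau_on=1.0, tau_off=5.0, tau_rec=800.0,
                                pulse_ms=1.0):
    """Forward-Euler integration of an assumed three-state resource model.

    x: recovered, y: effective, z: inactive (x + y + z = 1 throughout).
    All parameter values are illustrative assumptions, not those of the paper.
    """
    n = int(round(t_end_ms / dt))
    t = np.arange(n) * dt
    # Square transmitter pulse: 1 for pulse_ms after each spike, 0 otherwise.
    pulse = np.zeros(n)
    for ts in spike_times_ms:
        pulse[(t >= ts) & (t < ts + pulse_ms)] = 1.0

    x, y, z = 1.0, 0.0, 0.0          # all resources start in the recovered state
    y_trace = np.empty(n)
    for k in range(n):
        release = pulse[k] * x / tau_on   # recovered -> effective (only during pulse)
        decay = y / tau_off               # effective -> inactive
        recover = z / tau_rec             # inactive  -> recovered (slow)
        x += dt * (recover - release)
        y += dt * (release - decay)
        z += dt * (decay - recover)
        y_trace[k] = y
    return t, y_trace

def peak_per_spike(t, y_trace, spike_times_ms, window_ms=10.0):
    """Peak effective fraction in a short window after each spike."""
    return [y_trace[(t >= ts) & (t < ts + window_ms)].max()
            for ts in spike_times_ms]

# A regular 20 Hz spike train: successive peaks shrink as resources deplete.
spikes = [10.0 + 50.0 * i for i in range(8)]
t, y = simulate_depressing_synapse(spikes, t_end_ms=500.0)
peaks = peak_per_spike(t, y, spikes)
```

Because the post-synaptic current is proportional to the effective fraction, the shrinking peaks in `peaks` correspond to a depressing response; a long silent gap would let the inactive fraction flow back into the recovered state.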

Noise sources

Altogether, three sources of noise may be identified in the model. First, upon initialisation, the parameters of every synapse in the model (time constants, Inline graphic , synaptic efficacies, reversal potentials) are perturbed by multiplication with log-normal random variables [27] (; ). The neuron parameters are not perturbed.

Secondly, every AdEx neuron is subject to an in vivo-like fluctuating noise current to simulate synaptic background activity. The noise model and its parameters are taken from eqn. 2; Table 1, col. 1 in [28], with two exceptions: the overall magnitude of the current is scaled to compensate for the change in surface area between the neuron modelled in [28] and that modelled here [16], [19]. The standard deviation of the excitatory conductance, designated Inline graphic in [19], is set to one of two values, depending on the experiment. For the AB and ABC models, is hand-tuned to to yield a mean firing rate of approximately , typical of high spontaneous activity in auditory cortex. For the ABD model, . This is the original value used in [17] and causes membrane potential fluctuations, but few spontaneous spikes. The third source of noise is due to variability in the spiking of the Poisson neurons between repeated trials.

The level of spontaneous activity varies amongst SSA studies [5], [9]. A high level of background noise was incorporated into the models to ensure that the SI values obtained from the model were conservative (i.e., likely, if anything, to be higher in a cleaner model), and also to militate against the possibility of obtaining results that required delicately chosen synaptic weights.

Input Population (A)

Population A comprises sub-populations of Poisson neurons, each of which fires at a rate that depends on the frequency of the input tone. The best frequencies of the sub-populations are spaced uniformly on an octave scale. The number of sub-populations and the range of octaves spanned are task-dependent: two-tone tasks utilise 96 inputs spanning a range of 2 octaves; multi-tone tasks utilise 144 inputs spanning a range of 3 octaves. The firing rate (Hz) of sub-population $i$ with best frequency $f_i$ in response to tone frequency $f$ has the form of a raised Gaussian profile,

$$r_i(f) = r_0 + (r_{\max} - r_0)\exp\left(-\frac{(\log_2 f - \log_2 f_i)^2}{2\sigma^2}\right),$$

where $r_0$ is the spontaneous firing rate in the absence of a signal; $r_{\max}$ is the maximum firing rate, elicited when the tone and best frequencies coincide; and $\sigma$ controls the width of the tuning curve.

As a measure of bandwidth, we take the separation, in octaves, between the frequencies that evoke firing rates half-way between the maximum and spontaneous rates, and denote this quantity Inline graphic . Unlike stimulus parameters, which can be chosen to match the original SSA experiments exactly, the tuning of the putative input channels can, at best, only be inferred from the results of the SSA experiments themselves, or estimated in line with other experimental data. We typically set Inline graphic in this study, which we consider to be conservative, given the tuning width of certain neurons in the inferior colliculus [29] and fibres at the auditory periphery [30]. Alternative values for are also investigated, however. Figure 2A depicts the overlap between two tuning curves with best frequencies separated by half an octave.
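A short Python sketch of such input tuning follows. It assumes a Gaussian in log2-frequency raised above a spontaneous rate, which is one form consistent with the description; the function names, the centre frequency and every numerical value (`r_spont`, `r_max`, `sigma_oct`) are hypothetical. The helper `half_height_bandwidth` uses the standard full-width-at-half-height relation for a Gaussian to convert the width parameter into a bandwidth in octaves.

```python
import numpy as np

def tuning_rate(tone_hz, best_hz, r_spont=5.0, r_max=100.0, sigma_oct=0.2):
    """Raised-Gaussian firing rate (Hz) on an octave axis (assumed form)."""
    d_oct = np.log2(np.asarray(tone_hz, dtype=float) / best_hz)
    return r_spont + (r_max - r_spont) * np.exp(-d_oct**2 / (2.0 * sigma_oct**2))

def half_height_bandwidth(sigma_oct):
    """Separation (octaves) between the two frequencies whose rate lies
    half-way between the spontaneous and maximum rates."""
    return 2.0 * np.sqrt(2.0 * np.log(2.0)) * sigma_oct

def best_frequencies(n_channels=96, span_octaves=2.0, centre_hz=8000.0):
    """Best frequencies spaced uniformly on an octave scale around a centre."""
    offsets = np.linspace(-span_octaves / 2.0, span_octaves / 2.0, n_channels)
    return centre_hz * 2.0 ** offsets

bw = half_height_bandwidth(0.2)   # bandwidth in octaves for sigma = 0.2
bf = best_frequencies()           # 96 channels spanning 2 octaves
```

Evaluating `tuning_rate` half a bandwidth away from the best frequency returns the midpoint between the spontaneous and maximum rates, which is the definition of bandwidth given in the text.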

AB Model

The AB model is the simplest instance of an adaptation-based model that exhibits SSA. It consists of two populations labelled A and B (see Figure 1A). The computations relevant to SSA are effectively performed by a single, feed-forward layer of depressing synapses (A→B).

Population B consists of 48 AdEx neurons, each of which receives a connection from a distinct Poisson neuron in every sub-population of A via a depressing, excitatory synapse. Thus population A contains 96 × 48 = 4608 or 144 × 48 = 6912 Poisson neurons, depending on whether the experiment is two-tone or multi-tone, respectively. It is the depression of the synapses that guarantees the basic behaviour required of the model, namely, that the responses in B reduce if the same tone is presented repeatedly, but recover if another tone is presented. This scenario is presented diagrammatically in Figure 2B. The degree of overlap in the tuning of the Poisson inputs determines how SSA varies with the frequency separation between the tones. When the frequency separation is small, the synaptic resources associated with the standard and deviant frequencies coincide to a greater extent, and the SSA measured is smaller.
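The dependence of SSA on frequency separation can be sketched with a rate-level abstraction in Python. This is not the spiking network of the paper: each input channel is reduced to a single resource variable that is depleted in proportion to its drive on every tone and recovers exponentially between tones. The update rule, the deterministic "deviant every tenth tone" pattern, and all names and values (`u`, `isi_ms`, `tau_rec_ms`, `sigma_oct`) are assumptions made for illustration.

```python
import numpy as np

def oddball_si(delta_oct, n_tones=400, deviant_every=10, sigma_oct=0.2,
               u=0.5, isi_ms=300.0, tau_rec_ms=800.0, n_channels=96):
    """SSA index of a summed response under per-channel depression.

    Channels tile a 2-octave axis; the standard and deviant tones sit
    symmetrically about its centre, delta_oct octaves apart.
    """
    centres = np.linspace(-1.0, 1.0, n_channels)          # octaves re. centre
    tone_pos = {"std": -delta_oct / 2.0, "dev": +delta_oct / 2.0}
    keep = np.exp(-isi_ms / tau_rec_ms)   # fraction of depletion left after one interval
    x = np.ones(n_channels)               # recovered resources per channel
    resp = {"std": [], "dev": []}
    for k in range(n_tones):
        kind = "dev" if (k + 1) % deviant_every == 0 else "std"
        drive = np.exp(-(centres - tone_pos[kind]) ** 2 / (2.0 * sigma_oct ** 2))
        resp[kind].append(float(drive @ x))   # response uses resources before depletion
        x *= 1.0 - u * drive                  # tone depletes the driven channels
        x = 1.0 - (1.0 - x) * keep            # partial recovery before the next tone
    d, s = np.mean(resp["dev"]), np.mean(resp["std"])
    return (d - s) / (d + s)

si_wide = oddball_si(0.5)     # tones half an octave apart
si_narrow = oddball_si(0.1)   # tones a tenth of an octave apart
```

With a wide separation the deviant draws on channels that the standard has barely depleted, so the index is larger; with a narrow separation the two tones share most of their channels and the index shrinks towards zero.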

ABC Model

The ABC model extends the AB model by adding an inhibitory population, C, consisting of 48 AdEx neurons, and two additional synaptic pathways, A→C and C→B (Figure 1B). The connectivity of the A→C pathway is identical to that of A→B, described above, with the exception that the synapses involved do not depress. Each unit in population B receives input from sixteen randomly-chosen units in population C via fast, inhibitory synapses, which collectively form the pathway C→B. As in the AB model, SSA is sought in population B.

In this model, the indirect pathway A→C→B does not participate in the generation of SSA. Rather, the tonic inhibition of population B ensures that spontaneous activity is minimised, so that spiking activity reflects the input signal, not the background noise. Peri-stimulus time histograms (PSTH) from a study of SSA in the awake rat [9] show a transient response at the tone onset, followed by a period of spiking below the spontaneous rate, suggestive of inhibition, which lasts for the duration of the tone (see Figures 1A and 3A in [9]; see also Results).

Figure 3. State transition diagrams (A) and example sequences (B) for three two-state Markov chains with different scaled switching metrics. The transition probabilities are represented using line thickness (see Key). Standards and deviants are indicated in blue and red, respectively. Each block shows ninety-nine tones wrapped onto three lines.

In summary, the SSA responses in the ABC model are essentially generated in the same way as those in the AB model, namely, through the depression and recovery of the A→B synapses. There is, however, a difference in the resultant firing patterns. In the AB model, activity in population B persists throughout the tone, until the synapses are depressed to the extent that the units can no longer reach threshold. In the ABC model, in contrast, the neurons receive a strong, delayed, shunting inhibitory input, which suppresses both spontaneous and stimulus-driven spiking. Thus, if a neuron in population B is to fire at all, the excitatory component from population A must cause it to reach threshold in the short time window before it is inhibited. An appropriate balance of excitation and delayed inhibition leads to binary spiking, i.e., the tendency to respond to a stimulus with either no spikes or one spike, which is observed in auditory cortical neurons in general [31], and also in SSA studies in cortex [9] and MGB [5]. Synaptic depression weakens the excitatory contribution to the post-synaptic potential and effectively turns this binary response from ‘on’ to ‘off’.

ABD Model

The ABD model extends the AB model by adding population D, which consists of 48 AdEx neurons, and an excitatory synaptic pathway, B→D. There are no inhibitory populations in this model. The units in population D receive input from population B only, via depressing synapses, connected in an all-to-all pattern (Figure 1C). Our primary interest is SSA in population D, although SSA is also present in population B.

Whilst several authors have suggested adaptation on the inputs to a neuron as the mechanism whereby SSA is generated [7]–[9], none have considered the properties of a network consisting of a cascade of depressing synapses. The ABD model is used to investigate the simplest instance of such a network, in which there are just two depressing pathways (A→B; B→D). The pathway has a recovery time constant of [7], [32]. The synaptic weights are .

The original motivation for the ABD model was the suggestion that the responses obtained for deviants embedded in a single standard exceeded those obtained for the same deviants embedded in a “many standards” control condition [10]. We elaborate on the descriptions of these protocols below. Here it will suffice to sketch the intuitive difference in the stimuli and the behaviour required of the model. If deviant tones are presented against a background of a single, repeating standard frequency, then they are conspicuous, and the model should respond. However, if the same deviant frequency appears as one of many equiprobable random tones, then it is no longer conspicuous; it is simply one tone amongst many, and the model should not respond. In summary, the model must respond to the novelty of the tone, not simply its rarity, which is the same in both conditions.

Figure 2C–D illustrates how the two-layer architecture can make this distinction. Figure 2C shows how the model responds to a deviant embedded in a single standard. A repetitive standard (left) causes the synapses associated with that frequency to depress, and the neurons in population B stop firing. Because the activity in population B is low, the B→D synapses do not depress. When a deviant tone is presented (right), there is a recovered synaptic pathway leading from population A to D, via B, and the neurons in population D respond.

Now we consider the many standards configuration. Figure 2D (left) depicts the presentation of many standards. Because the frequencies of the standards vary, there is usually time for the A→B synapses to recover between presentations. As a consequence, the average response in population B is high, and the B→D synapses are depressed. Now, when the nominal deviant tone is presented (right), there is no longer a complete pathway of recovered synapses leading from A to D, and the neurons in population D are silent. The units in population D of the ABD model react to deviants in an appropriate context-dependent manner, whereas the units in population B do not. In closing, we emphasise that the binary distinctions firing/not firing and depressed/recovered are drawn for the benefit of the illustration. In the model, we seek only differential effects consistent with this general behaviour.

Stimulus Configurations

Oddball sequences

Oddball stimuli are sequences of tones consisting of two frequencies, $f_1$ and $f_2$ (Hz), one of which is deviant, and the other standard. The ratio of standards and deviants is controlled. The frequencies are presented to the model equally-spaced on an octave scale around the centre of the input range. Each oddball sequence is presented twice: $f_1$ and $f_2$ are swapped in the second presentation, but the pattern of standards and deviants is preserved. It is necessary to present the same frequency in a standard and deviant context in order to control for any frequency preference associated with the neuron.
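The paired presentation described above can be sketched in Python as follows. The function names (`tone_pair`, `oddball_pair`), the use of a fixed deviant count at random positions, and the example frequencies are assumptions for illustration; the source states only that the ratio of standards to deviants is controlled and that the two frequencies are swapped while the pattern is preserved.

```python
import numpy as np

def tone_pair(centre_hz, delta_oct):
    """Two frequencies placed symmetrically about a centre on an octave scale."""
    return centre_hz * 2.0 ** (-delta_oct / 2.0), centre_hz * 2.0 ** (delta_oct / 2.0)

def oddball_pair(n_tones, p_deviant, f1_hz, f2_hz, seed=0):
    """Return the deviant pattern and the two presentations that share it.

    The number of deviants is fixed at round(p_deviant * n_tones) so that the
    ratio of standards to deviants is controlled exactly (an assumption).
    """
    rng = np.random.default_rng(seed)
    n_dev = int(round(p_deviant * n_tones))
    is_deviant = np.zeros(n_tones, dtype=bool)
    is_deviant[rng.choice(n_tones, size=n_dev, replace=False)] = True
    first = np.where(is_deviant, f2_hz, f1_hz)    # f1 standard, f2 deviant
    second = np.where(is_deviant, f1_hz, f2_hz)   # roles swapped, same pattern
    return is_deviant, first, second

f1, f2 = tone_pair(8000.0, 0.5)
is_dev, first, second = oddball_pair(100, 0.1, 7000.0, 9000.0)
```

Presenting both `first` and `second` means each frequency is heard once as the standard and once as the deviant, which is what allows a frequency preference of the neuron to be separated from the effect of context.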

As in [7] and elsewhere, $s(f_i)$ and $d(f_i)$ refer to the mean spike count elicited in response to frequency $f_i$ when presented as the standard or deviant, respectively. The degree of stimulus-specific adaptation is quantified using various SSA indices (SI) [7]. The frequency-specific SI is a normalised measure of the difference in responses to $f_i$ when deviant and standard:

$$\mathrm{SI}(f_i) = \frac{d(f_i) - s(f_i)}{d(f_i) + s(f_i)}.$$

The SI is confined to the interval $[-1, 1]$. A larger SI corresponds to a greater excess in the deviant response over the standard. SSA is absent when the SI is not significantly positive. The neuron-specific SI quantifies the overall level of SSA that a neuron exhibits, and it has a similar definition:

$$\mathrm{SI} = \frac{d(f_1) + d(f_2) - s(f_1) - s(f_2)}{d(f_1) + d(f_2) + s(f_1) + s(f_2)}.$$

The term “SI”, without qualification, denotes the neuron-specific SI. (For discussion of an alternative version of the oddball paradigm, called the “switching oddball design”, see [8] and Supplementary Text S1.)
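As a worked illustration, the two indices can be computed from mean spike counts with a few lines of Python. The sketch uses the contrast form in which these indices are usually written following [7], namely (deviant minus standard) divided by (deviant plus standard); the function names and the example spike counts are hypothetical.

```python
def frequency_si(d, s):
    """Frequency-specific SI from mean deviant (d) and standard (s) spike counts."""
    total = d + s
    return (d - s) / total if total > 0 else float("nan")

def neuron_si(d1, s1, d2, s2):
    """Neuron-specific SI pooling both frequencies."""
    total = d1 + d2 + s1 + s2
    return (d1 + d2 - s1 - s2) / total if total > 0 else float("nan")

# Hypothetical mean spike counts per tone for the two frequencies.
si_f1 = frequency_si(d=0.8, s=0.2)
si_f2 = frequency_si(d=0.6, s=0.4)
si_neuron = neuron_si(0.8, 0.2, 0.6, 0.4)
```

For these made-up counts the frequency-specific indices are 0.6 and 0.2 and the neuron-specific index is 0.4; equal deviant and standard responses give an index of zero.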

Two-state Markov chains

Oddball sequences have been widely used to investigate SSA, but little consideration has been given to the possibility of employing Markov chains [23] in a similar capacity, even though Markov chains are a broader class of random process to which oddball sequences belong as a special case. If one designates states 1 and 2 of a two-state Markov chain as ‘deviants’ and ‘standards’, respectively, then the transition matrix for an oddball sequence can be written

$$P = \begin{pmatrix} p_D & 1 - p_D \\ p_D & 1 - p_D \end{pmatrix},$$

where each element, $P_{ij}$, relates the probability of transiting from state $i$ to state $j$, and $p_D$ is the probability of a deviant. The stationary distribution for this Markov chain is the vector $\pi = (p_D, 1 - p_D)$, i.e., $\pi P = \pi$. The probability of switching between states depends on the probability of a deviant; specifically, $p_{sw} = 2 p_D (1 - p_D)$.

Two-state Markov chains offer a way to decouple the probability of switching from the probability of a deviant. This generalised Markov chain has two degrees of freedom, and its transition matrix has the form

$$P = \begin{pmatrix} 1 - \dfrac{p_{sw}}{2 p_D} & \dfrac{p_{sw}}{2 p_D} \\ \dfrac{p_{sw}}{2 (1 - p_D)} & 1 - \dfrac{p_{sw}}{2 (1 - p_D)} \end{pmatrix},$$

where $p_D$ is the probability of a deviant and $p_{sw}$ is the probability of switching between states.

As the maximum valid choice for $p_{sw}$ depends on $p_D$, it is convenient to define a scaled switching metric, which, for a given $p_D$, expresses how often a Markov chain transits from one state to the other, as a value between zero (never switches) and one (switches at highest possible rate). Figure 3 shows state transition diagrams and example realisations of Markov chains where $p_D$ is held fixed and the scaled switching metric is varied. The response of the ABCD model to three-state Markov chains is discussed in Supplementary Text S1.
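A minimal Python sketch of such a generator is given below. It assumes one particular parameterisation: `p_dev` is the stationary probability of a deviant and `p_switch` is the stationary probability that two successive tones differ, so that an oddball sequence corresponds to `p_switch = 2 * p_dev * (1 - p_dev)`. The function names and this choice of parameters are assumptions for illustration and may differ from the notation of the original equations.

```python
import numpy as np

def transition_matrix(p_dev, p_switch):
    """Two-state transition matrix; state 0 = deviant, state 1 = standard.

    Requires 0 < p_dev < 1. In the stationary regime, switches in either
    direction are equally frequent, so each direction carries p_switch / 2.
    """
    max_switch = 2.0 * min(p_dev, 1.0 - p_dev)
    if not 0.0 <= p_switch <= max_switch:
        raise ValueError("p_switch must lie in [0, 2*min(p_dev, 1-p_dev)]")
    dev_to_std = p_switch / (2.0 * p_dev)
    std_to_dev = p_switch / (2.0 * (1.0 - p_dev))
    return np.array([[1.0 - dev_to_std, dev_to_std],
                     [std_to_dev, 1.0 - std_to_dev]])

def scaled_switching(p_dev, p_switch):
    """Switching probability as a fraction of its maximum for this p_dev."""
    return p_switch / (2.0 * min(p_dev, 1.0 - p_dev))

def sample_chain(P, p_dev, n_tones, seed=0):
    """Draw a state sequence, starting from the stationary distribution."""
    rng = np.random.default_rng(seed)
    states = np.empty(n_tones, dtype=int)
    states[0] = 0 if rng.random() < p_dev else 1
    for k in range(1, n_tones):
        states[k] = rng.choice(2, p=P[states[k - 1]])
    return states

P_oddball = transition_matrix(0.1, 2 * 0.1 * 0.9)   # rows are identical
P_clumpy = transition_matrix(0.25, 0.3)             # switches less often than at the maximum
seq = sample_chain(P_clumpy, 0.25, 200)
```

Lowering `p_switch` while holding `p_dev` fixed makes deviants and standards clump together without changing their overall proportions.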

Block, random and sequential stimuli

Multiple tone frequencies are routinely used to evaluate the frequency-response areas of neurons and have also been used to assess stimulus-specific adaptation [2], [7]. Pérez-González et al. [2] measured the responsiveness of neurons in the rat IC to one hundred-tone sequences, consisting of ten frequencies repeated ten times. In this protocol, the tones are presented in three configurations: block, sequential and random. In block mode, tones of identical frequency are presented in blocks, ascending from the lowest frequency to the highest. In sequential mode, an ascending, stepwise series of tones is repeated ten times. In random mode, the tones are ordered randomly.
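The three orderings can be generated from the same set of one hundred tones, as in the Python sketch below. Frequencies are represented by their indices (0 for the lowest); the function name and the choice to build the random mode as a shuffle of the same hundred tones are assumptions for illustration.

```python
import numpy as np

def tone_orders(n_freqs=10, n_repeats=10, seed=0):
    """Block, sequential and random orderings of n_freqs * n_repeats tones."""
    idx = np.arange(n_freqs)
    block = np.repeat(idx, n_repeats)       # 0,0,...,0,1,1,...,1,...
    sequential = np.tile(idx, n_repeats)    # 0,1,...,9,0,1,...,9,...
    shuffled = np.random.default_rng(seed).permutation(block)
    return block, sequential, shuffled

block, sequential, shuffled = tone_orders()
```

All three sequences contain each frequency exactly ten times, so any difference in the responses they evoke reflects the order of presentation rather than the overall tone probabilities.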

Deviants amongst many standards

Oddball experiments cannot, by themselves, adjudicate the question of whether the enhanced response to a deviant, if present, is due to its novelty (the fact that it stands out against a uniform background) or simply its rarity. Previously, to address this issue, the “deviant amongst many standards” protocol has been used as a control condition for MMN oddball experiments [25], [26]. A sequence of many, equiprobable tones is presented, which is constructed in such a way that the deviant frequency still appears in the same positions as it did in an oddball sequence. The average responses to deviants presented in the two contexts are then compared to delineate the effect of the context on the processing of the same sound. (Note that in the many standards condition, the term ‘deviant’ is employed in a nominal sense, as it refers to the true deviant in the corresponding oddball sequence.) The signal is enhanced when the deviant tone is presented against a background of a single standard (the difference is termed the “true MMN”), and the same is true for the spiking responses of single cortical neurons [7] (see discussion in [10]), which we aim to model here.
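One way to build such a control sequence is sketched below in Python. The function name, the index-based representation of frequencies and the uniform random choice of the filler tones are assumptions for illustration. The tones come out equiprobable only if the deviant probability of the oddball sequence equals one over the number of frequencies (for example, 10% deviants with ten frequencies).

```python
import numpy as np

def many_standards_control(is_deviant, deviant_idx, n_freqs, seed=0):
    """Keep the nominal deviant at its oddball positions; fill every other
    position with a tone drawn uniformly from the remaining frequencies."""
    is_deviant = np.asarray(is_deviant, dtype=bool)
    rng = np.random.default_rng(seed)
    others = [i for i in range(n_freqs) if i != deviant_idx]
    seq = rng.choice(others, size=is_deviant.size)
    seq[is_deviant] = deviant_idx
    return seq

# Hypothetical oddball pattern with deviants at three positions.
pattern = np.zeros(50, dtype=bool)
pattern[[4, 17, 33]] = True
control = many_standards_control(pattern, deviant_idx=2, n_freqs=10)
```

Comparing the mean response at the marked positions in `control` with the response to the same frequency as a deviant in the oddball sequence isolates the effect of context, since the tone and its positions are identical in both.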