PLOS Computational Biology. 2012 Mar 8;8(3):e1002385. doi: 10.1371/journal.pcbi.1002385

State-Space Analysis of Time-Varying Higher-Order Spike Correlation for Multiple Neural Spike Train Data

Hideaki Shimazaki 1,*, Shun-ichi Amari 1, Emery N Brown 2,3,4, Sonja Grün 1,5,6
Editor: Olaf Sporns 7
PMCID: PMC3297562  PMID: 22412358

Abstract

Precise spike coordination between the spiking activities of multiple neurons is suggested as an indication of coordinated network activity in active cell assemblies. Spike correlation analysis aims to identify such cooperative network activity by detecting excess spike synchrony in simultaneously recorded multiple neural spike sequences. Cooperative activity is expected to organize dynamically during behavior and cognition; therefore currently available analysis techniques must be extended to enable the estimation of multiple time-varying spike interactions between neurons simultaneously. In particular, new methods must take advantage of the simultaneous observations of multiple neurons by addressing their higher-order dependencies, which cannot be revealed by pairwise analyses alone. In this paper, we develop a method for estimating time-varying spike interactions by means of a state-space analysis. Discretized parallel spike sequences are modeled as multi-variate binary processes using a log-linear model that provides a well-defined measure of higher-order spike correlation in an information geometry framework. We construct a recursive Bayesian filter/smoother for the extraction of spike interaction parameters. This method can simultaneously estimate the dynamic pairwise spike interactions of multiple single neurons, thereby extending the Ising/spin-glass model analysis of multiple neural spike train data to a nonstationary analysis. Furthermore, the method can estimate dynamic higher-order spike interactions. To validate the inclusion of the higher-order terms in the model, we construct an approximation method to assess the goodness-of-fit to spike data. In addition, we formulate a test method for the presence of higher-order spike correlation even in nonstationary spike data, e.g., data from awake behaving animals. The utility of the proposed methods is tested using simulated spike data with known underlying correlation dynamics. 
Finally, we apply the methods to neural spike data simultaneously recorded from the motor cortex of an awake monkey and demonstrate that the higher-order spike correlation organizes dynamically in relation to a behavioral demand.

Author Summary

Nearly half a century ago, the Canadian psychologist D. O. Hebb postulated the formation of assemblies of tightly connected cells in cortical recurrent networks because of changes in synaptic weight (Hebb's learning rule) by repetitive sensory stimulation of the network. Consequently, the activation of such an assembly for processing sensory or behavioral information is likely to be expressed by precisely coordinated spiking activities of the participating neurons. However, the available analysis techniques for multiple parallel neural spike data do not allow us to reveal the detailed structure of transiently active assemblies as indicated by their dynamical pairwise and higher-order spike correlations. Here, we construct a state-space model of dynamic spike interactions, and present a recursive Bayesian method that makes it possible to trace multiple neurons exhibiting such precisely coordinated spiking activities in a time-varying manner. We also formulate a hypothesis test of the underlying dynamic spike correlation, which enables us to detect the assemblies activated in association with behavioral events. Therefore, the proposed method can serve as a useful tool to test Hebb's cell assembly hypothesis.

Introduction

Precise spike coordination within the spiking activities of multiple single neurons is discussed as an indication of coordinated network activity in the form of cell assemblies [1] comprising neuronal information processing. Possible theoretical mechanisms and conditions for generating and maintaining such precise spike coordination have been proposed on the basis of neuronal network models [2]–[4]. The effect of synchronous spiking activities on downstream neurons has been theoretically investigated, and it was demonstrated that synchronous spikes are more effective in generating output spikes [5]. Assembly activity was hypothesized to organize dynamically as a result of sensory input and/or in relation to behavioral context [6]–[10]. Supportive experimental evidence was provided by findings of the presence of excess spike synchrony occurring dynamically in relation to stimuli [11]–[14], behavior [14]–[19], or internal states such as memory retention, expectation, and attention [8], [20]–[23].

Over the years, various statistical tools have been developed to analyze the dependency between neurons, with continuous improvement in their applicability to neuronal experimental data (see [24]–[26] for recent reviews). The cross-correlogram [27] was the first analysis method for detecting the correlation between pairs of neurons and focused on the detection of stationary correlation. The joint peri-stimulus time histogram (JPSTH) introduced by [11], [28] is an extension of the cross-correlogram that allows a time-resolved analysis of the correlation dynamics between a pair of neurons. This method relates the joint spiking activity of two neurons to a trigger event, as was done in the peri-stimulus time histogram (PSTH) [29]–[31] for estimating the time-dependent firing rate of a single neuron. The Unitary Event analysis method [25], [32], [33] further extended the correlation analysis to enable it to test the statistical dependencies between multiple, nonstationary spike sequences against a null hypothesis of full independence among neurons. Staude et al. developed a test method (CuBIC) that enables the detection of higher-order spike correlation by computing the cumulants of the bin-wise population spike counts [34], [35].

In the last decade, other model-based methods have been developed that make it possible to capture the dependency among spike sequences by direct statistical modeling of the parallel spike sequences. Two related approaches based on a generalized linear framework are being extensively investigated. One models the spiking activities of single neurons as a continuous-time point process or as a discrete-time Bernoulli process. The point process intensities (instantaneous spike rates) or Bernoulli success probabilities of individual neurons are modeled in a generalized linear manner using a log link function or a logit link function, respectively [36]–[38]. The dependency among neurons is modeled by introducing coupling terms that incorporate the spike history of other observed neurons into the instantaneous spike rate [38]–[40]. A recent development in causality analysis for point process data [41] makes it possible to perform formal statistical significance tests of the causal interactions in these models. Typically, the models additionally include the covariate stimulus signals in order to investigate receptive field properties of neurons, i.e., the relations between neural spiking activities and the known covariate signals. However, they are not suitable for capturing instantaneous, synchronous spiking activities, which are likely to be induced by an unobserved external stimulus or a common input from an unobserved set of neurons. Recently, a model was proposed to dissociate instantaneous synchrony from the spike-history dependencies; it additionally includes a common, non-spike driven latent signal [42], [43]. These models provide a concise description of the multiple neural spike train data by assuming independent spiking activities across neurons conditional on these explanatory variables. As a result, however, they do not aim to directly model the joint distribution of instantaneous spiking activities of multiple neurons.

In contrast, an alternative approach, which we will follow and extend in this paper, directly models the instantaneous, joint spiking activities by treating the neuronal system as an ensemble binary pattern generator. In this approach, parallel spike sequences are represented as binary events occurring in discretized time bins, and are modeled as a multivariate Bernoulli distribution using a multinomial logit link function. The dependencies among the binary units are modeled in the generalized linear framework by introducing undirected pairwise and higher-order interaction terms for instantaneous, synchronous spike events. This statistical model is referred to as the ‘log-linear model’ [44], [45], or Ising/spin-glass model if the model contains only lower-order interactions. The latter is also referred to as the maximum entropy model when these parameters are estimated under the maximum likelihood principle. In contrast to the former, biologically-inspired network-style modeling, the latter approach using a log-linear model was motivated by the computational theory of artificial neural networks originating from the stationary distribution of a Boltzmann machine [46], [47], which is in turn the stochastic analogue of the Hopfield network model [48], [49] for associative memory.

The merit of the log-linear model is its ability to provide a well-defined measure of spike correlation. While the cross-correlogram and JPSTH provide a measure of the marginal correlation of two neurons, these methods cannot distinguish direct pairwise correlations from correlations that are indirectly induced through other neurons. In contrast, a simultaneous pairwise analysis based on the log-linear model (an analogue of the Ising/spin-glass analysis in statistical mechanics) can sort out all of the pair-dependencies of the observed neurons. A further merit of the log-linear model is that it can provide a measure of the ‘pure’ higher-order spike correlation, i.e., a state that cannot be explained by lower-order interactions. Using the viewpoint of an information geometry framework [50], Amari et al. [44], [45], [51] demonstrated that the higher-order spike correlations can be extracted from the higher-order parameters of the log-linear model (a.k.a. the natural or canonical parameters). The strengths of these parameters are interpreted in relation to the lower-order parameters of the dual orthogonal coordinates (a.k.a. the expectation parameters). The information contained in the higher-order spike interactions of a particular log-linear model can be extracted by measuring the distance (e.g., the Kullback-Leibler divergence) between the higher-order model and its projection to a lower-order model space, i.e., a manifold spanned by the natural parameters whose higher-order interaction terms are fixed at zero [44], [52]–[54].

Recently, a log-linear model that considered only up to pairwise interactions (i.e., an Ising/spin-glass model) was proposed as a model for parallel spike sequences. Its adequacy was shown by the fact that the firing rates and pairwise interactions explained more than Inline graphic% of the data [55]–[57]. However, Roudi et al. [54] demonstrated that the small contribution of higher-order correlations found from their measure based on the Kullback-Leibler divergence could be an artifact caused by the small number of neurons analyzed. Other studies have reported that higher-order correlations are required to account for the dependencies between parallel spike sequences [58], [59], or for stimulus encoding [53], [60]. The authors of [60] reported the existence of triple-wise spike correlations in the spiking activity of the neurons in the visual cortex and showed their stimulus-dependent changes. It should be noted, though, that these analyses assumed stationarity, both of the firing rates of individual neurons and of their spike correlations. This was possible because the authors restricted themselves to data recorded either from in vitro slices or from anesthetized animals. However, in order to assess the behavioral relevance of pairwise and higher-order spike correlations in awake behaving animals, it is necessary to appropriately correct for time-varying firing rates within an experimental trial and provide an algorithm that reliably estimates the time-varying spike correlations within multiple neurons.

We consider the presence of excess spike synchrony, in particular the excess synchrony explained by higher-order correlation, as an indicator of an active cell assembly. If some of the observed neurons are a subset of the neurons that comprise an assembly, they are likely to exhibit nearly completely synchronous spikes every time the assembly is activated. It may be that such spike patterns are not explained by mere pairwise correlations, but require higher-order correlations to explain their occurrence. One of the potential physiological mechanisms for higher-order correlated activity is a common input from a set of unobserved neurons to the assembly that includes the neurons under observation [25], [61]–[63]. Such higher-order activity is transient in nature and expresses a momentary snapshot of the neuronal dynamics. Thus, methods that are capable of evaluating time-varying, higher-order spike correlations are crucial to test the hypothesis that biological neuronal networks organize in dynamic cell assemblies for information processing. However, many of the current approaches based on the log-linear model [44], [45], [53], [55], [56], [61], [62], [64], [65] are not designed to capture their dynamics. Very recently, two approaches were proposed for testing the presence of non-zero pairwise [66] and higher-order [67] correlations using a time-dependent formulation of a log-linear model. In contrast to these methods, the present paper aims to directly provide optimized estimates of the individual time-varying interactions with confidence intervals. This makes it possible to identify short-lasting, time-varying higher-order correlations and thus to relate them to behaviorally relevant time periods.

In this paper, we propose an approach to estimate the dynamic assembly activities from multiple neural spike train data using a ‘state-space log-linear’ framework. A state-space model offers a general framework for modeling time-dependent systems by representing their parameters (states) as a Markov process. Brown et al. [37] developed a recursive filtering algorithm for a point process observation model that is applicable to neural spike train data. Further, Smith and Brown [68] developed a paradigm for joint state-space and parameter estimation for point process observations using an expectation-maximization (EM) algorithm. Since then, the algorithm has been continuously improved and was successfully applied to experimental neuronal spike data from various systems [38], [69]–[71] (see [72] for a review). Here, we extend this framework, and construct a multivariate state-space model of multiple neural spike sequences by using the log-linear model to follow the dynamics of the higher-order spike interactions. Note that we assume for this analysis typical electrophysiological experiments in which multiple neural spike train data are repeatedly collected under identical experimental conditions (‘trials’). Thus, with the proposed method, we deal with the within-trial nonstationarity of the spike data that is expected in the recordings from awake behaving animals. We assume, however, that the dynamics of the spiking statistics within trials, such as time-varying spike rates and higher-order interactions, are identical across the multiple experimental trials (across-trial stationarity).

To validate the necessity of including higher-order interactions in the model, we provide a method for evaluating the goodness-of-fit of the state-space model to the observed parallel spike sequences using the Akaike information criterion [73]. We then formulate a hypothesis test for the presence of the latent, time-varying spike interaction parameters by combining the Bayesian model comparison method [74]–[76] with a surrogate method. The latter test method provides us with a tool to detect assemblies that are momentarily activated, e.g., in association with behavioral events. We test the utility of these methods by applying them to simulated parallel spike sequences with known dependencies. Finally, we apply the methods to spike data of three neurons simultaneously recorded from motor cortex of an awake monkey and demonstrate that a triple-wise spike correlation dynamically organizes in relation to a behavioral demand.

The preliminary results were presented in the proceedings of the IEEE ICASSP meeting in 2009 [77], as well as in conference abstracts (Shimazaki et al., Neuro08, SAND4, NIPS08WS, Cosyne09, and CNS09).

Results

General formulation

Log-linear model of multiple neural spike sequences

We consider an ensemble spike pattern of $N$ neurons. The state of each neuron is represented by a binary random variable, $X_i$ ($i = 1, \dots, N$), where ‘$1$’ denotes a spike occurrence and ‘$0$’ denotes silence. An ensemble spike pattern of $N$ neurons is represented by a vector, $\mathbf{X} = (X_1, X_2, \dots, X_N)$, with the total number of possible spike patterns equal to $2^N$. Let $p(\mathbf{x})$, where $\mathbf{x} = (x_1, x_2, \dots, x_N)$ and $x_i = 0$ or $1$ ($i = 1, \dots, N$), represent the joint probability mass function of the $N$-tuple binary random variables, $\mathbf{X}$. Because of the constraint $\sum_{\mathbf{x}} p(\mathbf{x}) = 1$, the probabilities of all the spike patterns are specified using $2^N - 1$ parameters. In information geometry [44], [50], these $2^N - 1$ parameters are viewed as ‘coordinates’ that span a manifold composed of the set of all probability distributions, $\{p(\mathbf{x})\}$. In the following, we consider two coordinate systems.

The logarithm of the probability mass function can be expanded as

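$$\log p(\mathbf{x}) = \sum_{i} \theta_i x_i + \sum_{i<j} \theta_{ij} x_i x_j + \cdots + \theta_{1 \cdots N}\, x_1 \cdots x_N - \psi(\boldsymbol{\theta}) \tag{1}$$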

where $\boldsymbol{\theta} = (\theta_1, \theta_2, \dots, \theta_{12}, \theta_{13}, \dots, \theta_{1 \cdots N})'$. In this study, a prime indicates the transposition operation to a vector or matrix. $\psi(\boldsymbol{\theta})$ is a log normalization parameter to satisfy $\sum_{\mathbf{x}} p(\mathbf{x}) = 1$. The log normalization parameter, $\psi(\boldsymbol{\theta})$, is a function of $\boldsymbol{\theta}$. Eq. 1 is referred to as the log-linear model. The parameters $\theta_i$, $\theta_{ij}$, …, $\theta_{1 \cdots N}$ are the natural parameters of an exponential family distribution and form the ‘$\theta$-coordinates’. The $\theta$-coordinates play a central role in this paper.
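As an illustration of the log-linear model, the following minimal sketch enumerates all $2^N$ spike patterns for a small $N$ and evaluates their probabilities from a set of natural parameters (written `theta` here) and the resulting log normalization term (`psi`). The function name, the dictionary layout of `theta`, and the numerical values are assumptions made for this example only; they are not taken from the paper.

```python
import itertools
import numpy as np

def loglinear_pmf(theta, n_neurons):
    """Return all binary patterns and their probabilities under a log-linear model.

    theta maps a tuple of neuron indices (a subset of neurons) to its natural
    parameter; subsets that are missing from theta are treated as zero.
    """
    patterns = np.array(list(itertools.product([0, 1], repeat=n_neurons)))
    log_unnorm = np.zeros(len(patterns))
    for subset, value in theta.items():
        # Interaction term: product of x_i over the subset (1 only if all spike).
        log_unnorm += value * patterns[:, list(subset)].prod(axis=1)
    psi = np.log(np.exp(log_unnorm).sum())  # log normalization parameter
    return patterns, np.exp(log_unnorm - psi)

# Hypothetical parameters for three neurons (illustrative values only).
theta = {(0,): -2.0, (1,): -2.0, (2,): -2.0,
         (0, 1): 0.5, (0, 2): 0.5, (1, 2): 0.5,
         (0, 1, 2): 1.0}
patterns, p = loglinear_pmf(theta, 3)
```

Brute-force enumeration is only practical for a handful of neurons, since the number of patterns doubles with each added neuron.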

The other coordinates, called $\eta$-coordinates, can be constructed using the following expectation parameters:

$$\eta_i = E[X_i], \qquad \eta_{ij} = E[X_i X_j] \;\; (i < j), \qquad \dots, \qquad \eta_{1 \cdots N} = E[X_1 \cdots X_N] \tag{2}$$

These parameters represent the expected rates of joint spike occurrences among the neurons indexed by the subscripts. We denote the set of expectation parameters as a vector $\boldsymbol{\eta} = (\eta_1, \eta_2, \dots, \eta_{12}, \eta_{13}, \dots, \eta_{1 \cdots N})'$.

Eqs. 1 and 2 can be compactly written using the following ‘multi-indices’ notation. Let $\Omega_k$ be collections of all the $k$-element subsets of a set with $N$ elements (i.e., a $k$-subset of $N$ elements): $\Omega_1 = \{1, 2, \dots, N\}$, $\Omega_2 = \{12, 13, \dots, (N-1)N\}$, $\Omega_3 = \{123, 124, \dots\}$, etc. We use the ‘multi-indices’ $I \in \{\Omega_1, \dots, \Omega_N\}$ to represent the natural parameters and expectation parameters as $\theta_I$ and $\eta_I$, respectively. Similarly, let us denote the interaction terms in Eq. 1 as

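$$f_i(\mathbf{x}) = x_i, \qquad f_{ij}(\mathbf{x}) = x_i x_j \;\; (i < j), \qquad \dots, \qquad f_{1 \cdots N}(\mathbf{x}) = x_1 \cdots x_N \tag{3}$$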

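Using $f_I(\mathbf{x})$ ($I \in \{\Omega_1, \dots, \Omega_N\}$), Eqs. 1 and 2 can now be compactly written as $\log p(\mathbf{x}) = \sum_I \theta_I f_I(\mathbf{x}) - \psi(\boldsymbol{\theta})$ and $\eta_I = E[f_I(\mathbf{X})]$, respectively.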
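The multi-index notation maps naturally onto a feature matrix whose columns are the interaction terms, one per subset of neurons. The sketch below, under the assumption of a small full model with illustrative parameter values, builds that matrix and converts natural parameters into expectation parameters (the expected joint spike rates). The helper names `multi_indices` and `feature_matrix` are hypothetical.

```python
import itertools
import numpy as np

def multi_indices(n_neurons, max_order):
    """All subsets of neurons with 1..max_order elements, ordered by size."""
    return [s for k in range(1, max_order + 1)
            for s in itertools.combinations(range(n_neurons), k)]

def feature_matrix(patterns, indices):
    """Interaction term f_I(x) for each pattern (rows) and multi-index I (columns)."""
    return np.stack([patterns[:, list(s)].prod(axis=1) for s in indices], axis=1)

n_neurons = 3
patterns = np.array(list(itertools.product([0, 1], repeat=n_neurons)))
indices = multi_indices(n_neurons, n_neurons)   # full model: 2**3 - 1 = 7 terms
feats = feature_matrix(patterns, indices)       # shape (8, 7)

# Illustrative natural parameters, ordered like `indices`.
theta = np.array([-2.0, -2.0, -2.0, 0.5, 0.5, 0.5, 1.0])
log_unnorm = feats @ theta
psi = np.log(np.exp(log_unnorm).sum())          # log normalization parameter
p = np.exp(log_unnorm - psi)

# Expectation parameters: expected value of each interaction term under p.
eta = feats.T @ p
```

Passing a smaller `max_order` to `multi_indices` restricts the model to lower-order interactions, for example `max_order=2` for a pairwise (Ising-type) model.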

The higher-order natural parameters in the log-linear model represent the strength of the higher-order spike interactions. Amari et al. [44], [45], [50], [51] proved that the Inline graphic- and Inline graphic-coordinates are dually ‘orthogonal’ coordinates and demonstrated that the natural parameters that are greater than or equal to the Inline graphicth-order, Inline graphic (Inline graphic, Inline graphic), represent an excess or paucity of higher-order synchronous spikes in the Inline graphic (Inline graphic) coordinates. To understand the potential influence of the higher-order natural parameters on the higher-order joint spike event rates, let us consider a log-linear model in which the parameters higher than or equal to the Inline graphicth-order vanish: Inline graphic (Inline graphic). In the dual representation, the higher-order joint spike event rates, Inline graphic (Inline graphic), are chance coincidences expected from the lower-order joint spike event rates, Inline graphic (Inline graphic). From this, it follows that the non-zero higher-order natural parameters that are greater than or equal to the Inline graphicth-order of a full log-linear model represent the excess or paucity of the higher-order joint spike event rates, Inline graphic (Inline graphic), in comparison to their chance rates expected from the lower-order joint spike event rates, Inline graphic (Inline graphic).

However, in this framework, the excess or scarce synchronous spike events of Inline graphic subset neurons reflected in Inline graphic (Inline graphic) are caused not only by the non-zero Inline graphicth-order interactions Inline graphic (Inline graphic), but also by all non-zero higher-order interactions Inline graphic (Inline graphic). Therefore, one cannot extract the influence of pure Inline graphicth-order spike interactions (Inline graphic) on the higher-order synchrony rates unless the parameters higher than the Inline graphicth-orders vanish, Inline graphic (Inline graphic). To formally extract the Inline graphicth-order spike interactions, let Inline graphic denote a log-linear model whose parameters higher than the Inline graphicth-order are fixed at zero. By successively adding higher-order terms into the model, we can consider a set of log-linear models forming a hierarchical structure as Inline graphic. Here, Inline graphic denotes that Inline graphic is a sub-manifold of Inline graphic in the Inline graphic-coordinates. In Inline graphic, the set of non-zero Inline graphicth-order natural parameters explains the excess or paucity of Inline graphicth-order synchronous spike events, Inline graphic (Inline graphic), in comparison to their chance occurrence rates expected from the lower-order marginal rates, Inline graphic (Inline graphic). In a full model Inline graphic, the last parameter in the log-linear model, Inline graphic, represents the pure Inline graphicth-order spike correlation among the observed Inline graphic neurons. The information geometry theory developed by Amari and others [44], [45], [50], [51] provides a framework for illustrating the duality and orthogonal relations between the Inline graphic- and Inline graphic-coordinates and singling out the higher-order correlations from these coordinates using hierarchical models. 
In the Methods subsection ‘Mathematical properties of log-linear model’ we describe the known properties of the log-linear model utilized in this study, and give references [44], [45], [50], [51] for further details on the information geometry approach to spike correlations.

State-space log-linear model of multiple neural spike sequences

To model the dynamics of the spike interactions, we extend the log-linear model to a time-dependent formulation. To do so, we parameterize the natural parameters as $\boldsymbol{\theta}_t$ in discrete time steps $t$ ($= 1, 2, \dots, T$). We denote the dimension of $\boldsymbol{\theta}_t$ for an $r$th-order model as $d = \sum_{k=1}^{r} \binom{N}{k}$. The time-dependent log-linear model up to the $r$-th order interactions ($r = 1, \dots, N$) at time $t$ is then defined as

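$$p(\mathbf{x} \mid \boldsymbol{\theta}_t) = \exp\Big[ \sum_{I \in \{\Omega_1, \dots, \Omega_r\}} \theta_I^t f_I(\mathbf{x}) - \psi(\boldsymbol{\theta}_t) \Big] \tag{4}$$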

The corresponding time-varying expectation parameters at time $t$ are written as $\boldsymbol{\eta}_t$, where each element ($I \in \{\Omega_1, \dots, \Omega_r\}$) is given as

$$\eta_I^t = E[f_I(\mathbf{X}) \mid \boldsymbol{\theta}_t] \tag{5}$$

In neurophysiological experiments, neuronal responses are repeatedly obtained under identical experimental conditions (‘trials’). Thus, we here consider the parallel spike sequences repeatedly recorded simultaneously from $N$ neurons over $n$ trials and align them to a trigger event. We model these simultaneous spike sequences as time-varying, multivariate binary processes by assuming that these parallel spike sequences are discretized in time into $T$ bins of bin-width $\Delta$. Let $\mathbf{X}^{t,l} = (X_1^{t,l}, \dots, X_N^{t,l})$ be the observed $N$-tuple binary variables in the $t$-th bin of the $l$-th trial, whereby a bin containing ‘$1$’ values indicates that one or more spikes occurred in the bin whereas ‘$0$’ values indicate that no spike occurred in the bin. We regard the observed spike pattern $\mathbf{X}^{t,l}$ at time $t$ as a sample (from a total of $n$ trials) from a joint probability mass function. An efficient estimator of $\eta_I^t$ is the observed spike synchrony rate defined for the $t$-th bin as

$$y_I^t = \frac{1}{n} \sum_{l=1}^{n} f_I(\mathbf{X}^{t,l}) \tag{6}$$

for $I \in \{\Omega_1, \dots, \Omega_r\}$. The synchrony rates up to the $r$-th order, $\mathbf{y}_t$ (the vector of all $y_I^t$ with $I \in \{\Omega_1, \dots, \Omega_r\}$), constitute a sufficient statistic for the log-linear model up to the $r$-th order interaction.
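The preprocessing described above (binarizing aligned spike times and averaging the joint spike events across trials) can be sketched as follows. This is a minimal illustration, not the authors' code: the nested-list input layout, the function names, and the toy spike times are assumptions for the example.

```python
import itertools
import numpy as np

def bin_spikes(spike_times, n_bins, bin_width):
    """Binarize aligned spike times.

    spike_times[l][i] holds the spike times of neuron i in trial l, measured
    from the trigger event in the same unit as bin_width.
    Returns an array of shape (n_trials, n_bins, n_neurons) with 0/1 entries.
    """
    n_trials, n_neurons = len(spike_times), len(spike_times[0])
    X = np.zeros((n_trials, n_bins, n_neurons), dtype=int)
    for l, trial in enumerate(spike_times):
        for i, times in enumerate(trial):
            idx = np.floor(np.asarray(times, dtype=float) / bin_width).astype(int)
            idx = idx[(idx >= 0) & (idx < n_bins)]
            X[l, idx, i] = 1  # one or more spikes in a bin count as a single 1
    return X

def synchrony_rates(X, max_order):
    """Trial-averaged joint spike event rates up to max_order, per bin."""
    n_neurons = X.shape[2]
    subsets = [s for k in range(1, max_order + 1)
               for s in itertools.combinations(range(n_neurons), k)]
    y = np.stack([X[:, :, list(s)].prod(axis=2).mean(axis=0) for s in subsets],
                 axis=1)
    return subsets, y  # y has shape (n_bins, number of subsets)

# Toy data: 2 trials, 2 neurons, 4 bins of width 2.0 (arbitrary time unit).
spike_times = [[[1.0, 1.5, 7.2], [1.2]],
               [[1.9], [5.0]]]
X = bin_spikes(spike_times, n_bins=4, bin_width=2.0)
subsets, y = synchrony_rates(X, max_order=2)
```

In the first bin both trials contain a spike of the first neuron, while only one trial contains a coincident spike of both neurons, so the first row of `y` is `[1.0, 0.5, 0.5]`.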

Using Eqs. 4 and 6, the likelihood of the observed parallel spike sequences is given as

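$$p(\mathbf{y}_{1:T} \mid \boldsymbol{\theta}_{1:T}) = \prod_{l=1}^{n} \prod_{t=1}^{T} \exp\Big[ \sum_{I \in \{\Omega_1, \dots, \Omega_r\}} \theta_I^t f_I(\mathbf{X}^{t,l}) - \psi(\boldsymbol{\theta}_t) \Big] = \prod_{t=1}^{T} \exp\big[ n \big( \mathbf{y}_t' \boldsymbol{\theta}_t - \psi(\boldsymbol{\theta}_t) \big) \big] \tag{7}$$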

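with $\mathbf{y}_{1:T} = \{\mathbf{y}_1, \mathbf{y}_2, \dots, \mathbf{y}_T\}$ and $\boldsymbol{\theta}_{1:T} = \{\boldsymbol{\theta}_1, \boldsymbol{\theta}_2, \dots, \boldsymbol{\theta}_T\}$. Here, we assumed that the observed spike patterns are conditionally independent across bins given the time-dependent natural parameters, and that the samples across trials are independent.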
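Because the synchrony rates are sufficient statistics, the log-likelihood can be evaluated from them without revisiting individual trials: each bin contributes the number of trials times the inner product of the rates with the natural parameters, minus the log normalization term. The sketch below assumes that form; the function names and the two-neuron example are illustrative only.

```python
import itertools
import numpy as np

def log_partition(theta_t, feats):
    """Log normalization term psi(theta_t), by enumeration of all patterns."""
    return np.log(np.exp(feats @ theta_t).sum())

def log_likelihood(y, theta, feats, n_trials):
    """Sum over bins of n_trials * (y_t' theta_t - psi(theta_t))."""
    return sum(n_trials * (y_t @ th_t - log_partition(th_t, feats))
               for y_t, th_t in zip(y, theta))

# Full model of two neurons: columns are x1, x2 and x1*x2.
patterns = np.array(list(itertools.product([0, 1], repeat=2)))
feats = np.column_stack([patterns[:, 0], patterns[:, 1],
                         patterns[:, 0] * patterns[:, 1]])

n_trials, n_bins = 10, 3
y = np.tile([0.3, 0.2, 0.1], (n_bins, 1))   # hypothetical synchrony rates
theta = np.zeros((n_bins, 3))               # all-zero parameters: uniform patterns
ll = log_likelihood(y, theta, feats, n_trials)
```

With all natural parameters at zero, every one of the four patterns has probability 1/4, so the value reduces to minus the number of observations times log 4.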

Our prior assumption about the time-dependent natural parameters is expressed by the following state equation

$$\boldsymbol{\theta}_t = \mathbf{F} \boldsymbol{\theta}_{t-1} + \boldsymbol{\xi}_t \tag{8}$$

for $t = 2, \dots, T$. Matrix $\mathbf{F}$ ($d \times d$ matrix) contains the first order autoregressive parameters. $\boldsymbol{\xi}_t$ ($d \times 1$ matrix) is a random vector drawn from a zero-mean multivariate normal distribution with covariance matrix $\mathbf{Q}$ ($d \times d$ matrix). The initial values obey $\boldsymbol{\theta}_1 \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, with $\boldsymbol{\mu}$ ($d \times 1$ matrix) being the mean and $\boldsymbol{\Sigma}$ ($d \times d$ matrix) the covariance matrix. The parameters $\mathbf{F}$, $\mathbf{Q}$, $\boldsymbol{\mu}$, and $\boldsymbol{\Sigma}$ are called hyper-parameters. In the following, we denote the set of hyper-parameters by $\mathbf{w}$: $\mathbf{w} = [\mathbf{F}, \mathbf{Q}, \boldsymbol{\mu}, \boldsymbol{\Sigma}]$.
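To give a feeling for the prior expressed by the state equation, the following sketch draws one path of natural parameters from a first-order autoregressive process with Gaussian innovations and a Gaussian initial state. The function name and all numerical hyper-parameter values are assumptions chosen for illustration; they are not the values used in the paper.

```python
import numpy as np

def simulate_states(F, Q, mu, Sigma, n_bins, rng):
    """Draw theta_1 ~ N(mu, Sigma), then theta_t = F theta_{t-1} + xi_t, xi_t ~ N(0, Q)."""
    d = len(mu)
    theta = np.empty((n_bins, d))
    theta[0] = rng.multivariate_normal(mu, Sigma)
    for t in range(1, n_bins):
        theta[t] = F @ theta[t - 1] + rng.multivariate_normal(np.zeros(d), Q)
    return theta

rng = np.random.default_rng(0)
d = 3  # e.g. two first-order terms and one pairwise term
theta_path = simulate_states(F=np.eye(d),
                             Q=0.01 * np.eye(d),
                             mu=np.array([-2.0, -2.0, 0.0]),
                             Sigma=0.1 * np.eye(d),
                             n_bins=100,
                             rng=rng)
```

A larger innovation covariance produces rougher parameter paths and a smaller one produces smoother paths, which is why optimizing this hyper-parameter amounts to selecting the smoothness of the estimated dynamics.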

Given the likelihood (Eq. 7) and prior distribution (Eq. 8), we aim at obtaining the posterior density

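$$p(\boldsymbol{\theta}_{1:T} \mid \mathbf{y}_{1:T}, \mathbf{w}) = \frac{p(\mathbf{y}_{1:T} \mid \boldsymbol{\theta}_{1:T})\, p(\boldsymbol{\theta}_{1:T} \mid \mathbf{w})}{p(\mathbf{y}_{1:T} \mid \mathbf{w})} \tag{9}$$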

using Bayes' theorem. The posterior density provides us with the most likely paths of the log-linear parameters given the spike data (maximum a posteriori, MAP, estimates) as well as the uncertainty in their estimation. The posterior density depends on the choice of the hyper-parameters, $\mathbf{w}$. Here, the hyper-parameters, $\mathbf{w}$, except for the covariance matrix $\boldsymbol{\Sigma}$ for the initial parameters, are optimized using the principle of maximizing the logarithm of the marginal likelihood (the denominator of Eq. 9, also referred to as the evidence),

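$$l(\mathbf{w}) = \log p(\mathbf{y}_{1:T} \mid \mathbf{w}) = \log \int p(\mathbf{y}_{1:T} \mid \boldsymbol{\theta}_{1:T})\, p(\boldsymbol{\theta}_{1:T} \mid \mathbf{w})\, d\boldsymbol{\theta}_{1:T} \tag{10}$$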

For non-Gaussian observation models such as Eq. 7, the exact calculation of Eq. 10 is difficult (but see the approximate formula in the Methods section). Instead, we use the EM algorithm [68], [78] to efficiently combine the construction of the posterior density and the optimization of the hyper-parameters under the maximum likelihood principle. Using this algorithm, we iteratively construct a posterior density with the given hyper-parameters (E-step), and then use it to optimize the hyper-parameters (M-step). To obtain the posterior density (Eq. 9) in the E-step, we develop a nonlinear recursive Bayesian filter/smoother. The filter distribution is sequentially constructed by combining the prediction distribution for time Inline graphic based on the state equation (Eq. 8) and the likelihood function for the spike data at time Inline graphic, Eq. 4. Figure 1 illustrates the recursive filtering process in model subspace Inline graphic. In combination with a fixed-interval smoothing algorithm, we derive the smooth posterior density (Eq. 9). The time-dependent log-linear parameters (natural parameters) are estimated as MAP estimates of the optimized smooth posterior density. In the Methods section, please refer to the subsection on ‘Bayesian estimation of dynamic spike interactions’ for the derivation of the optimization method for hyper-parameters, along with the filtering/smoothing methods. We summarize a method for estimating the dynamic spike interactions in Table 1.
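The prediction and correction steps of such a filter can be sketched as follows. The filter equations themselves are given in the Methods section and are not reproduced here, so this example is an assumption: it uses a generic Gaussian (Laplace) approximation in which the filter mean is the posterior mode found by Newton iterations and the filter covariance is the inverse of the negative Hessian at that mode. The function name `filter_step` and all numerical values are hypothetical.

```python
import itertools
import numpy as np

def filter_step(theta_prev, W_prev, y_t, n_trials, feats, F, Q,
                n_iter=100, tol=1e-10):
    """One prediction plus Gaussian-approximation update of the log-linear state."""
    # Prediction from the state equation theta_t = F theta_{t-1} + xi_t.
    theta_pred = F @ theta_prev
    W_pred = F @ W_prev @ F.T + Q
    W_pred_inv = np.linalg.inv(W_pred)

    def moments(theta):
        p = np.exp(feats @ theta)
        p /= p.sum()
        eta = feats.T @ p  # expectation parameters at theta
        # Fisher information: covariance of the interaction terms under p.
        G = (feats * p[:, None]).T @ feats - np.outer(eta, eta)
        return eta, G

    # Correction: Newton ascent on the log posterior (concave in theta).
    theta = theta_pred.copy()
    for _ in range(n_iter):
        eta, G = moments(theta)
        grad = n_trials * (y_t - eta) - W_pred_inv @ (theta - theta_pred)
        neg_hess = n_trials * G + W_pred_inv
        step = np.linalg.solve(neg_hess, grad)
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    _, G = moments(theta)
    W_filt = np.linalg.inv(n_trials * G + W_pred_inv)
    return theta, W_filt

# Two neurons, full model: columns are x1, x2 and x1*x2.
patterns = np.array(list(itertools.product([0, 1], repeat=2)))
feats = np.column_stack([patterns[:, 0], patterns[:, 1],
                         patterns[:, 0] * patterns[:, 1]])
theta_prev = np.array([-1.0, -1.0, 0.5])    # previous filter mean (illustrative)
W_prev = 0.1 * np.eye(3)                    # previous filter covariance
F, Q = np.eye(3), 0.01 * np.eye(3)          # hypothetical hyper-parameters
y_t = np.array([0.30, 0.25, 0.15])          # hypothetical synchrony rates
theta_filt, W_filt = filter_step(theta_prev, W_prev, y_t, 50, feats, F, Q)
```

Running this step over all bins, followed by a backward smoothing pass and a hyper-parameter update, would correspond to the E-step and M-step outlined in the text; those parts are omitted from this sketch.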

Figure 1. Geometric view of recursive filtering in subspace Inline graphic.


Each point in this figure represents a probability distribution, Inline graphic, of an Inline graphic-tuple binary variable, Inline graphic. The underlying time-dependent model is represented by white circles in the space of Inline graphic. The dashed lines indicate projections of the underlying models to the model subspace, Inline graphic. The maximum a posteriori (MAP) estimates of the underlying models projected on subspace Inline graphic were obtained recursively: Starting from the MAP estimate at time Inline graphic (filter estimate, red circle), the model at time Inline graphic is predicted based on the prior knowledge of the state transition, Eq. 8 (blue arrow, prediction; black cross, a predicted distribution). The maximum likelihood estimate (MLE, black circle) for the spike data at time Inline graphic derived by Eq. 4 is expected to appear near the projection point of the underlying model at time Inline graphic in Inline graphic. The filter distribution at time Inline graphic is obtained by correcting the prediction by the observation of data at time Inline graphic (black arrow). The filter estimation at time Inline graphic is used for predicting the model at time Inline graphic and so on. This recursive procedure allows us to retain past information while tracking the underlying time-dependent model based on the current observation.

Table 1. Method for estimating dynamic spike interactions.
I Preprocessing parallel spike data
(1) Align parallel spike sequences from Inline graphic neurons to the onset of an external clock that is repeated Inline graphic times.
(2) Construct binary sequences Inline graphic (Inline graphic and Inline graphic) using Inline graphic bins of width Inline graphic from the spike timing data.
(3) Select Inline graphic, the order of interactions included in the model. At each bin, compute the joint spike event rates up to the Inline graphicth order Inline graphic, using Eq. 6.
II Optimized estimation of time-varying log-linear parameters
(1) Initialize the hyper-parameters†: Inline graphic.
(2) E-step: Apply the recursive Bayesian filter/smoother to obtain posterior densities.
(i) Filtering: For Inline graphic, recursively obtain
the one-step prediction density, Inline graphic, using Eqs. 25 and 26,
the filter density, Inline graphic, using Eqs. 31 and 32.
(ii) Smoothing: For Inline graphic, recursively obtain
the smooth density, Inline graphic, using Eqs. 34 and 35.
(3) M-step: Optimize the hyper-parameters.
Update the hyper-parameters, Inline graphic and Inline graphic, using Eqs. 38 and 39, and Inline graphic.
(4) Repeat (2) and (3) until the iterations satisfy a predetermined convergence criterion‡.

†: In this study, we initialized the hyper-parameters using Inline graphic, Inline graphic, Inline graphic and used a fixed diagonal covariance matrix for an initial density, as Inline graphic, unless specified otherwise in the main text or figure captions.

‡: In our algorithm, we computed the approximate log marginal likelihood, Inline graphic, of the model using Eq. 45. We stopped the EM algorithm if the increment of the log marginal likelihood was smaller than Inline graphic.

Application of state-space log-linear model to simulated spike data

Estimation of time-varying pairwise spike interaction

To demonstrate the utility of the developed methods for the analysis of dynamic spike correlations, we first consider a nonstationary pairwise spike correlation analysis. To this end, we apply the state-space method to two examples of simulated spike data, with Inline graphic neurons. The dynamic spike correlation between two neurons can be analyzed by conceptually simpler histogram-based methods, e.g., a joint peri-stimulus time histogram (JPSTH). However, even for the pair-analysis, the proposed method can be advantageous in the following two aspects. First, the proposed method provides a credible interval (a Bayesian analogue of a confidence interval). Using the recursive Bayesian filtering/smoothing algorithm developed in the Methods section, we obtain the joint posterior density of the log-linear parameters (Eq. 9). The posterior density provides not only the most likely path of the log-linear parameters (MAP estimates), but also the uncertainty in its estimation. The credible interval allows us to examine whether the pairwise spike correlation is statistically significant (but see the later section on “Testing spike correlation in nonstationary spike data” for the formal use of the joint posterior density for testing the existence of the spike correlation in behaviorally relevant time periods). Second, an EM algorithm developed in the proposed method optimizes the smoothness of the estimated dynamics of the pairwise correlation (i.e., optimization of the hyper-parameters, Inline graphic in the state equation, Eq. 8). By the automatic selection of the smoothness parameter, we can avoid both spurious modulation in the estimated dynamic spike correlation caused by local noise and excessive smoothing of the underlying modulation.

Figure 2A displays an application of our state-space method to 2 parallel spike sequences, Inline graphic (Inline graphic, Inline graphic, and Inline graphic), which are correlated in a time-varying fashion. The data are generated as realizations from a time-dependent formulation of a full log-linear model of 2 neurons (Figure 2A left, repeated trials: Inline graphic; duration: Inline graphic bins of width Inline graphic). Here, the underlying model parameters, Inline graphic, Inline graphic, and Inline graphic (Figure 2A right, dashed lines), are designed so that the individual spike rates are constant (Inline graphic and Inline graphic), while the spike correlation between the two neurons, Inline graphic, varies in time, i.e., across bins (synchronous spike events caused by the time-dependent correlation are marked as blue circles in Figure 2A left). While the bin-width, Inline graphic, is an arbitrary value in this simulation analysis, the bin-width typically selected in spike correlation analyses is on the order of milliseconds. If the bin-width is Inline graphic, the individual spike rates of simulated neurons 1 and 2 are 38.4 Hz and 19.4 Hz, respectively. The correlation coefficient calculated from these parallel spike sequences is 0.0763. These values are within the realistic range of values obtained from experimentally recorded neuronal spike sequences. By applying the state-space method to the parallel spike sequences, we obtain the smooth posterior density of the log-linear parameters. The right panels in Figure 2A display the MAP estimates of the log-linear parameters (solid lines). The analysis reveals the time-varying pairwise interaction between the two neurons (Figure 2A right, bottom). The gray bands indicate 99% credible intervals from the posterior density, Eq. 9. We used marginal posterior densities, Inline graphic (Inline graphic), to display the credible intervals for the individual log-linear parameters. The variances of the individual marginal densities were obtained from the diagonal of a covariance matrix of the smooth joint posterior density (Eq. 35).
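The construction of such a credible band from a normal posterior can be sketched as follows. The sketch assumes that the smoothed posterior at each bin is summarized by a mean vector (the MAP estimate) and a covariance matrix; the function and argument names are hypothetical.

```python
from statistics import NormalDist
import numpy as np

def credible_band(theta_map, W_smooth, level=0.99):
    """Marginal credible intervals for normally distributed parameters.

    theta_map: array (T, d) of smoothed MAP estimates.
    W_smooth:  array (T, d, d) of smoothed posterior covariance matrices.
    Returns (lower, upper), each of shape (T, d).
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)   # about 2.576 for a 99% interval
    sd = np.sqrt(np.diagonal(W_smooth, axis1=1, axis2=2))  # marginal standard deviations
    return theta_map - z * sd, theta_map + z * sd

# Illustrative values only: 2 bins, 3 parameters, marginal variance 0.04.
theta_map = np.zeros((2, 3))
W_smooth = np.tile(0.04 * np.eye(3), (2, 1, 1))
lower, upper = credible_band(theta_map, W_smooth)
```

Only the diagonal of each covariance matrix enters, so the band describes each parameter separately and ignores posterior correlations between parameters.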

Figure 2. Estimation of pairwise interactions in two simulated parallel spike sequences.

(A) Application of the state-space log-linear model to parallel spike sequences with time-varying spike interaction. (Left) Based on a time-dependent formulation of the log-linear model (dashed lines in the right panels represent the model parameters), Inline graphic parallel spike sequences, Inline graphic, are simulated repeatedly for Inline graphic trials (duration: Inline graphic bins). The two panels show dot displays of the spike events of the variables, Inline graphic or Inline graphic (Inline graphic and Inline graphic). The observed synchronous spike events across the two spike sequences within the same trials are marked by blue circles. (Right) Smoothed estimates of the log-linear parameters, Inline graphic (solid lines, red: pairwise interaction; blue and green: the first order), estimated from the data shown in the left panels. The gray bands indicate the 99% credible interval from the posterior density of the log-linear parameters. The dashed lines are the underlying time-dependent model parameters used for the generation of the spike sequences in the left panels. (B) Application of the state-space log-linear model to independent parallel spike sequences with time-varying spike rates. Each panel retains the same presentation format as in A.

Figure 2B shows an application of the method to parallel spike sequences of time-varying spike rates. Here, the underlying model parameters are constructed so that the two parallel spike sequences are independent (Inline graphic for Inline graphic), while the individual spike rates vary in time (i.e., across the bins). The observation of synchronous spike events (Figure 2B left, blue circles) confirms that chance spike coincidences frequently and trivially occur at higher spike rates. The analysis based on our state-space method reveals that virtually no spike correlation exists between the two neurons, despite the presence of time-varying rates of synchronous spike events (Figure 2B right, bottom).
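The distinction between chance coincidences and spike correlation can be made concrete with a small numerical sketch. It assumes the standard log-linear form for 2 neurons, p(x1, x2) ∝ exp(θ1 x1 + θ2 x2 + θ12 x1 x2), from which the interaction parameter is the log odds ratio θ12 = log[p(1,1) p(0,0) / (p(1,0) p(0,1))]. The function names and the spike probabilities per bin are hypothetical.

```python
import numpy as np

def theta12_from_probs(p00, p10, p01, p11):
    """Pairwise interaction parameter (log odds ratio) of a 2-neuron log-linear model."""
    return np.log(p11 * p00 / (p10 * p01))

def independent_probs(r1, r2):
    """Pattern probabilities (p00, p10, p01, p11) of two independent neurons
    with spike probabilities r1 and r2 per bin."""
    return (1 - r1) * (1 - r2), r1 * (1 - r2), (1 - r1) * r2, r1 * r2

results = []
for r1, r2 in [(0.02, 0.01), (0.08, 0.06)]:   # low-rate and high-rate bins
    p00, p10, p01, p11 = independent_probs(r1, r2)
    results.append((p11, theta12_from_probs(p00, p10, p01, p11)))

for p11, theta12 in results:
    print(f"coincidence probability = {p11:.4f}, theta12 = {theta12:.2e}")
```

The coincidence probability rises with the rates, whereas the interaction parameter remains zero up to floating-point error.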

Simultaneous estimation of time-varying pairwise spike interactions

In this subsection, we extend the pairwise correlation analysis of 2 neurons to the simultaneous analysis of multiple pairwise interactions in the parallel spike sequences obtained from more than 2 neurons.

Figure 3 demonstrates an application of our method to simulated spike sequences of Inline graphic neurons. As an underlying model, we construct a time-dependent log-linear model of 8 neurons with time-varying rates and pairwise interactions (Inline graphic, Inline graphic, duration: Inline graphic bins). The higher-order log-linear parameters are set to zero, i.e., no higher-order interactions are included in the model. Figure 3A displays snapshots of the dynamics of the parameters of the individual spike rates, Inline graphic (Inline graphic), and pairwise interactions, Inline graphic (Inline graphic), at Inline graphic. Figure 3B shows the parallel spike sequences (50 out of 200 trials are displayed) simulated on the basis of this model. The spikes involved in the pairwise, synchronous spike events between any two of the neurons (in total: 28 pairs) are superimposed and marked with blue circles. Figure 3C displays snapshots of the simultaneous MAP estimates of the pairwise interactions, Inline graphic (Inline graphic), of a log-linear model of 8 neurons applied to the parallel spike train data. In addition, the spike rates were estimated from the dual coordinates, i.e., Inline graphic (Inline graphic). The results demonstrate that the simultaneous estimation of time-varying, multiple pairwise interactions can be carried out by using a state-space log-linear model with up to pairwise interaction terms.
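For a small number of neurons, the dual coordinates (expected spike rates and joint spike rates) of a pairwise log-linear model can be obtained by enumerating all spike patterns. The following is a minimal sketch of that computation for a single time bin; the function names and the parameter values are hypothetical, and the exhaustive enumeration is an illustrative choice that is only practical for small populations.

```python
from itertools import combinations, product
import numpy as np

def pairwise_features(n):
    """Feature matrix of a pairwise model: n single-neuron columns followed by
    n(n-1)/2 pairwise-product columns, one row per binary pattern."""
    patterns = np.array(list(product([0, 1], repeat=n)))        # (2**n, n)
    pairs = list(combinations(range(n), 2))
    pair_feats = np.stack([patterns[:, i] * patterns[:, j] for i, j in pairs], axis=1)
    return np.hstack([patterns, pair_feats]), pairs

def expectation_params(theta, feats):
    """Expected value of each feature under p(x) proportional to exp(feats @ theta)."""
    logits = feats @ theta
    p = np.exp(logits - logits.max())   # subtract the max for numerical stability
    p /= p.sum()
    return feats.T @ p

n = 8
feats, pairs = pairwise_features(n)
rate = 0.05                                           # illustrative spike probability per bin
theta = np.zeros(feats.shape[1])
theta[:n] = np.log(rate / (1 - rate))                 # first-order terms; pairwise terms stay 0
eta = expectation_params(theta, feats)
```

With all pairwise parameters set to zero, the neurons are independent, so the first 8 entries of `eta` equal the chosen spike probability and the remaining 28 entries equal its square.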

Figure 3. Simultaneous estimation of pairwise interactions of 8 simulated neurons.

(A) Snapshots of the underlying model parameters of a time-dependent log-linear model of Inline graphic neurons containing up to pairwise interactions (duration: Inline graphic bins) at Inline graphic bins. No higher-order interactions are included in the model. Each node represents a single neuron. The strength of a pairwise interaction between the Inline graphicth and Inline graphicth neurons, Inline graphic (Inline graphic), is expressed by the color as well as the thickness of the link between the neurons (see legend at the right of panel B). A red solid line indicates a positive pairwise interaction, whereas a blue dashed line represents a negative pairwise interaction. The underlying spike rates of the individual neurons, Inline graphic (Inline graphic), are coded by the color of the nodes (see color bar to the right of panel A). (B) Dot displays of the simulated parallel spike sequences of 8 neurons, Inline graphic, sampled repeatedly for Inline graphic trials from the time-dependent log-linear model shown in A. For better visibility, only the first Inline graphic trials are displayed (Inline graphic). Synchronous spike events between any two neurons (28 pairs in total) are marked by blue circles. (C) Pairwise analysis of the data illustrated in B (using all Inline graphic trials) assuming a pairwise model (Inline graphic) of 8 neurons. The snapshots at the Inline graphic bins show smoothed estimates of the time-varying pairwise interactions, Inline graphic (Inline graphic), and the spike rates, Inline graphic (Inline graphic). For this estimation, we use Inline graphic for the prior density of initial parameters. The scales are identical to the one in panel A.

Estimation of time-varying triple-wise spike interaction

Another important aspect of the proposed method is its ability to estimate time-varying higher-order spike interactions that cannot be revealed by a pairwise analysis. To demonstrate this, we apply the state-space log-linear model to Inline graphic parallel spike sequences by considering up to a triple-wise interaction (i.e., the full log-linear model). Spike data (Figure 4A) are generated by a time-dependent log-linear model (Figure 4C, dashed lines) repeatedly in Inline graphic trials. Figure 4C displays the MAP estimates (solid lines) of the log-linear parameters from the data shown in Figure 4A. Here, non-zero parameter Inline graphic represents a triple-wise spike correlation, i.e., excess synchronous spikes across the three neurons or absence of such synchrony compared to the expectation if assuming pairwise correlations. The gray band is the 99% credible interval from the marginal posterior density, Inline graphic, for Inline graphic.

Figure 4. Estimation of triple-wise interaction from simulated parallel spike sequences of 3 neurons.

(A) Dot displays of the simulated spike sequences, Inline graphic, which are sampled repeatedly for Inline graphic trials from a time-dependent log-linear model containing time varying pairwise and triple-wise interactions (duration: Inline graphic bins; see the dashed lines in C for the model parameters). Each of the 3 panels shows the spike events for each of the 3 variables, Inline graphic (Inline graphic and Inline graphic), as black dots. Synchronous spike events across the 3 neurons as detected in individual trials are marked by blue circles. (B) Observed rates of joint spike events, Inline graphic (Inline graphic). (Top) Observed rates of the synchronous spike events between all possible pair constellations as specified by index Inline graphic (Inline graphic). (Bottom) Observed rate of the synchronous spikes across all 3 neurons, Inline graphic. (C) Smoothed estimates of the time-varying log-linear parameters, Inline graphic. The three panels depict the smoothed estimates (solid lines) of the log-linear parameters, Inline graphic, of the different orders (Inline graphic), as obtained from the data shown in A and B (top and middle: the first and second order log-linear parameters; bottom: triple-wise spike interaction, Inline graphic). The gray bands indicate the 99% credible interval of the marginal posterior densities of the log-linear parameters. The dashed lines indicate the underlying time-dependent parameters used for the generation of the spike sequences.

The credible interval of the higher-order (triple-wise) log-linear parameter in the bottom panel of Figure 4C appears to be larger than those of the lower-order parameters. In general, the observed frequency of simultaneous spike occurrences decreases as the number of neurons that join the synchronous spiking activities increases (note that the marginal joint spike occurrence rate, Eq. 2, is a non-increasing function with respect to the order of interaction, i.e., Inline graphic if the elements of Inline graphic are included in Inline graphic). Thus, the estimation variance typically increases for the higher-order parameters, as seen in the bottom panel of Figure 4C. Related to the above, because of the paucity of samples for higher-order joint spike events, the automatic smoothness optimization method selects hyper-parameters that make the trajectories of the higher-order log-linear parameters stiff in order to avoid statistical fluctuation caused by a local noise structure. Given the limited number of trials available for data analyses, these observations show the necessity of a method to validate inclusion of the higher-order interaction terms in the model.
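The non-increasing property mentioned above follows in one line from the binary nature of the variables. In the sketch below, the notation is an assumption: X_i ∈ {0, 1} denotes the binary spike variable of neuron i, and η_I denotes the joint spike rate of the neurons indexed by the set I.

```latex
% Assumed notation: X_i \in \{0,1\}, \eta_I = E\big[\prod_{i \in I} X_i\big], and I \subseteq J.
\eta_J
  = E\Big[\prod_{i \in I} X_i \prod_{j \in J \setminus I} X_j\Big]
  \le E\Big[\prod_{i \in I} X_i\Big]
  = \eta_I ,
\qquad \text{because } 0 \le \prod_{j \in J \setminus I} X_j \le 1 .
```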

Additionally, we observe that in the later period of spike data (300–500 bins), the dynamics of the estimated triple-wise spike interaction do not follow the underlying trajectory faithfully: The underlying trajectory falls outside the 99% credible interval. Similar results are sometimes observed when an autoregressive parameter, Inline graphic, in a state model is optimized (Eq. 8). In contrast, when we replace the autoregressive parameter with an identity matrix (i.e., Inline graphic, where Inline graphic is the identity matrix), the credible intervals become larger, and such observations do not typically occur. Thus, we also need a method for validating the inclusion of the autoregressive parameter in the state model using an objective criterion. Detailed analyses of these topics will be given in the next section using the example of 3 simulated neurons displayed in Figure 4. In the above example, the state-space log-linear model with an optimized Inline graphic provides a better overall fit to the spike data than the model using Inline graphic, despite an inaccurate representation in part of its estimation. However, for the purpose of testing the spike correlation in a particular period of spike data, we recommend using an identity matrix as an autoregressive parameter, i.e., Inline graphic, in the state model.

Selection of state-space log-linear model

For a given number of neurons, Inline graphic, we can construct state-space log-linear models that contain up to the Inline graphicth-order interactions (Inline graphic). While the inclusion of increasingly higher-order interaction terms in the model improves its accuracy when describing the probabilities of Inline graphic spike patterns, the estimation of the higher-order log-linear parameters of the model may suffer from large statistical fluctuations caused by the paucity of synchronous spikes in the data, leading to an erroneous estimation of such parameters. This problem is known as ‘over-fitting’ the model to the data. An over-fitted model explains the observed data, but loses its predictive ability for unseen data (e.g., spike sequences in a new trial under the same experimental conditions). In this case, the exclusion of higher-order parameters from the model may better explain the unseen data even if an underlying spike generation process contains higher-order interactions. The model that has this predictive ability by optimally resolving the balance between goodness-of-fit to the observed data and the model simplicity is obtained by maximizing the cross-validated likelihood or minimizing the so-called information criterion. In this section, we select a state-space model that minimizes the Akaike information criterion (AIC) [73], which is given as

graphic file with name pcbi.1002385.e305.jpg (11)

The first term is the log marginal likelihood, as in Eq. 10. The second term, which includes Inline graphic, is a penalization term. The AIC uses the number of free parameters in the marginal model (i.e., the number of free parameters in Inline graphic) for Inline graphic. Please see the Methods subsection ‘Selection of state-space model by information criteria’ for an approximation method to compute the marginal likelihood. Selecting a model that minimizes the AIC is expected to be equivalent to selecting a model that minimizes the expected (or average) distance between the estimated model and the unknown underlying distribution that generated the data, where the ‘distance’ measure used is the Kullback-Leibler (KL) divergence. The expectation of the KL divergence is called the KL risk function.
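For orientation, the standard form of the AIC for a marginal model is sketched below. The symbols are assumptions introduced for illustration and may differ from the notation of Eq. 11: ℓ denotes the log marginal likelihood evaluated at the optimized hyper-parameters, and k the number of free hyper-parameters.

```latex
% Standard form of the AIC; \ell and k are assumed symbols.
\mathrm{AIC} = -2\,\ell + 2k
```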

Selection from hierarchical models

Here, we examine the validity of including higher-order interaction terms in the model by using the AIC. We apply the model selection method to the spike train data of Inline graphic simulated neurons. The data are generated by the time-varying, full log-linear model that contains a non-zero triple-wise interaction term shown in Figure 4C (dashed lines). The AICs are computed for hierarchical state-space log-linear models, i.e., for models of interaction orders up to Inline graphic. To test the influence of the data sample size on the model selection, we vary the number of trials, Inline graphic, used to fit the hierarchical log-linear models. The results are shown in Table 2 for Inline graphic. For a small number of trials (Inline graphic), a model without any interaction structure (Inline graphic) is selected. For larger numbers of trials, models with larger interaction orders are selected. For Inline graphic, the full log-linear model (Inline graphic) is selected.

Table 2. AICs for different numbers of trials.
The number of trials Model order
Inline graphic Inline graphic Inline graphic
Inline graphic 775.15* 807.146 841.96
Inline graphic 1957.4 1890.6* 1922
Inline graphic 7857.2 7565.1* 7585.8
Inline graphic 37791 36264 36231*
Inline graphic 75443 72366 72283*

This table displays the AICs of a state-space log-linear model with increasing interaction orders: an independent model (Inline graphic), pairwise model (Inline graphic), and full model (Inline graphic), applied to simulated Inline graphic spike sequences with an increasing number of trials, Inline graphic, considered in the analysis. The spike data are identical to that shown in Figure 4A. The asterisk indicates the model that minimizes the AIC.
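The selection rule applied to Table 2 amounts to taking, for each number of trials, the model order with the smallest AIC. The sketch below applies this rule to the values of Table 2, with rows ordered by increasing number of trials and columns by model order (independent, pairwise, full); the variable names are hypothetical.

```python
import numpy as np

# AIC values copied from Table 2: one row per number of trials (increasing),
# one column per model order (1 = independent, 2 = pairwise, 3 = full).
aic = np.array([
    [775.15, 807.146, 841.96],
    [1957.4, 1890.6, 1922.0],
    [7857.2, 7565.1, 7585.8],
    [37791.0, 36264.0, 36231.0],
    [75443.0, 72366.0, 72283.0],
])

selected_order = aic.argmin(axis=1) + 1            # model order minimizing the AIC per row
delta = aic - aic.min(axis=1, keepdims=True)       # AIC difference to the best model per row

print(selected_order.tolist())
```

The AIC differences in `delta` indicate how close the competing orders are; for the fourth row, the pairwise and full models differ by 33.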

Below, we examine whether the AIC selected a model that minimizes the KL risk function by directly computing its approximation using the known underlying model parameters. First, Table 3 shows how often a specific order is selected by the AIC by repeatedly applying the method to different samples generated from the same underlying log-linear parameters (Figure 4C, dashed lines). We examine two examples: One in which a sample is composed of Inline graphic trials (left) and the other of Inline graphic trials (right). We repeatedly compute the AICs of state-space models of different orders (Inline graphic) applied to 100 data realizations (of the respective number of trials). We then count how often a model of order Inline graphic is selected by minimizing the AIC. For comparison, the table includes the outcomes from other criteria such as the Bayesian information criterion (BIC) [79], [80] and the predictive divergence for indirect observation models (PDIO) [81], which are suggested for models containing latent variables. Please see the Methods section for the details of these criteria. Next, Table 4 displays the most frequently selected model (from Inline graphic = 1,2,3) by the various information criteria when they are applied to 100 data realizations as a function of the number of trials in each data set, Inline graphic (see Table 3 for the outcomes of Inline graphic and Inline graphic).

Table 3. Models selected using different information criteria.
Inline graphic trials Inline graphic trials
Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic
AIC 29 71* 0 0 3 97*
BIC 15 85* 0 0 99* 1
PDIO 46* 25 29 3 60* 37

The state-space log-linear models containing interactions up to the Inline graphicth-order (Inline graphic) are applied to the data from Inline graphic simultaneous spike sequences. The spike data are generated from a time-dependent full log-linear model (see dashed lines in Figure 4C) repeatedly for Inline graphic trials; either Inline graphic (left) or Inline graphic (right) trials. For this data set, we compute three information criteria (AIC, BIC, and PDIO) and find the order of the model that minimizes these information criteria. We repeated the selection of the model order Inline graphic times, using each of the criteria and using the spike data that contain the respective number of trials (Inline graphic or Inline graphic). The count of the order of spike interactions that minimizes the applied information criteria is increased by 1, and finally expressed as a frequency. The asterisk marks the most frequently selected model.

Table 4. Model orders selected by different information criteria for different numbers of trials.
The number of trials
Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic
AIC 1 2 2 3 3
BIC 2 2 2 2 3
PDIO 1 1 1 2 2
KL-risk 1 2 3 3 3
MSE 1 1 3 3 3

The state-space log-linear models of different orders (Inline graphic) are applied to samples of the Inline graphic spike sequences of Inline graphic repeated trials generated from a time-dependent full log-linear model (indicated by the dashed lines in Figure 4C). Three data-driven information criteria, AIC, BIC, and PDIO, are computed for the fitted state-space models of the different orders, Inline graphic. The count for the model order Inline graphic that minimizes the respective criteria is determined by repeating the process for Inline graphic repetitions as in Table 3. The most frequently selected model order, Inline graphic, is displayed for each of the information criteria and for the different numbers of trials, Inline graphic. For comparison, we also show the order of interactions that minimizes the KL risk function (KL-risk) and mean squared error (MSE). We approximated the KL-risk and MSE as follows. At each bin, we compute the KL-divergence (Eq. 21) between a full underlying log-linear model of Inline graphic neurons and the estimated log-linear model whose parameters are given by the MAP estimates of the Inline graphicth-order model. The sum of all KL-divergences from Inline graphic bins is used as the distance between the two (time-dependent) models: i.e., Inline graphic, where the function Inline graphic is given in Eq. 21. Inline graphic represents the underlying log-linear parameters used to generate the data. Inline graphic is its estimate from one sample composed of Inline graphic trials. The parameters higher than the Inline graphicth-order that are not included in the model are set to zero. The KL-risk function is estimated as the average of the KL-divergences of Inline graphic realizations of the spike data, each composed of Inline graphic trials. To obtain the MSE, we first computed the sum of the squared errors (SE): Inline graphic, using one sample composed of Inline graphic trials. The MSE is then estimated as the average of the SEs over 100 samples.
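The distance computation described in this caption can be sketched as follows. The sketch assumes that the KL-divergence of Eq. 21 is the usual sum over spike patterns, Σ_x p(x) log[p(x)/q(x)], with p the underlying model and q the estimated model, and that the squared error is the sum of squared parameter differences over all bins; function names and parameter values are hypothetical.

```python
from itertools import combinations, product
import numpy as np

def full_features(n):
    """Feature matrix of the full log-linear model: one column per non-empty
    subset of neurons (ordered by interaction order), one row per pattern."""
    patterns = np.array(list(product([0, 1], repeat=n)))
    cols = [patterns[:, list(s)].prod(axis=1)
            for k in range(1, n + 1) for s in combinations(range(n), k)]
    return np.stack(cols, axis=1)                    # shape (2**n, 2**n - 1)

def pattern_probs(theta, feats):
    """Pattern probabilities p(x) proportional to exp(feats @ theta)."""
    logits = feats @ theta
    p = np.exp(logits - logits.max())
    return p / p.sum()

def summed_kl_and_se(theta_true, theta_est, feats):
    """theta_true, theta_est: arrays (T, 2**n - 1). Parameters of orders not
    included in the fitted model are expected to be set to zero in theta_est."""
    kl = 0.0
    for th_p, th_q in zip(theta_true, theta_est):
        p, q = pattern_probs(th_p, feats), pattern_probs(th_q, feats)
        kl += np.sum(p * (np.log(p) - np.log(q)))    # KL(p || q) at one bin
    se = np.sum((theta_true - theta_est) ** 2)
    return kl, se

# Illustrative values: 3 neurons, 2 bins, estimate lacking the triple-wise term.
feats = full_features(3)
theta_true = np.tile([-3.0, -3.0, -3.0, 0.5, 0.5, 0.5, 1.0], (2, 1))
theta_est = theta_true.copy()
theta_est[:, -1] = 0.0
kl, se = summed_kl_and_se(theta_true, theta_est, feats)
```

Averaging `kl` and `se` over repeated data realizations would give the estimates of the KL-risk and the MSE described in the caption.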

We compare the outcomes of the AIC and two other information criteria with the model order that minimizes the KL risk function (KL-risk). To this end, we include in Table 4 the model order that minimizes the KL-risk between the underlying log-linear model (Figure 4C, dashed lines) and estimates of the Inline graphicth-order model. In addition to the KL-risk, we calculate the mean squared error (MSE) between the underlying model parameters and corresponding estimates. Please see the caption of Table 4 for how we compute the KL-risk and MSE for this analysis. We find that the KL-risk and MSE select the same model, except for the case of Inline graphic trials. In comparison to the KL-risk, the BIC tends to select models with an excessively high order of interaction (over-fitting) for a small number of trials (Inline graphic) and tends to choose lower-order models for Inline graphic. The PDIO mostly selects lower-order models. In contrast, the AIC follows the selection of the KL-risk minimization principle, except for Inline graphic, where it shows a conservative choice compared to the KL-risk.

We repeat the same analysis for spike data generated from an underlying model that contains up to pairwise interactions, but does not contain the triple-wise interaction term. The purpose of this analysis is to show that the methods do not select models with excessively higher orders of interaction than those actually contained in the data. To construct such an underlying model, we project the full model (shown in Figure 4C) onto the subspace of a pairwise log-linear model, Inline graphic. The projection model does not contain any triple-wise correlation, while the 1st and 2nd order expectation parameters (Inline graphic for Inline graphic) are the same as those of the full model that was used to generate the data in the analysis of Tables 2–4. Table 5 displays the most frequently selected model orders by the AIC, BIC, and PDIO, along with the selections by the KL-risk and MSE. We find that the pairwise model is the most frequently selected model under all of the criteria for the samples with a large number of trials (Inline graphic). Under this condition, only the AIC among the other data-driven methods follows the KL-risk selection. These results lead us to the conclusion that the AIC is a reliable measure to assess the goodness of fit of the state-space log-linear model.

Table 5. Model orders selected using different information criteria for model without triple-wise spike interaction.
The number of trials
Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic
AIC 1 2 2 2 2
BIC 2 2 2 2 2
PDIO 1 1 1 2 2
KL-risk 1 2 2 2 2
MSE 1 2 2 2 2

Similar to Table 4, the state-space log-linear models (Inline graphic) are repeatedly applied to Inline graphic samples of Inline graphic spike sequences with Inline graphic trials. In contrast to Table 4, here a sample of spike sequences is generated from a pairwise (Inline graphic) time-dependent log-linear model of Inline graphic neurons. This underlying model is a projection of the full log-linear model examined in Table 4 (dashed lines in Figure 4C) to the pairwise model space. The frequencies of the Inline graphicth-order model that minimizes AIC, BIC, and PDIO are counted by repeatedly applying models with different orders (Inline graphic) to Inline graphic samples (each with Inline graphic trials). The most frequently selected models are displayed for the different criteria and for different numbers of trials, Inline graphic, as in Table 4. The rows for the KL-risk and MSE display the models that minimize the estimates of the KL-risk and MSE.

Selection of state transition model

In addition to validating the inclusion of the order of spike interactions, we examine state models (Eq. 8) with different conditions for the hyper-parameters by the AIC, using as an example the same spike train data of 3 neurons with Inline graphic trials (displayed in Figure 4A). The tested state models are (i) a time-independent model in which the hyper-parameters are fixed as Inline graphic and Inline graphic, where Inline graphic is an identity matrix; (ii) a random walk model (Inline graphic), in which only the covariance matrix, Inline graphic, is optimized via the EM algorithm; and (iii) a 1st-order autoregressive model, in which Inline graphic and Inline graphic are both optimized. With these settings, the state-space log-linear model of case (i) becomes a stationary log-linear model, which has been frequently employed in spike data analyses, often in the form of a maximum entropy model [55]–[57], [82]. In contrast, the state-space models of cases (ii) and (iii) are nonstationary because the joint distribution of the spike pattern changes in time as a result of the time-varying log-linear parameters. The interaction parameters themselves are assumed to be nonstationary in the state model in case (ii), and can be either stationary or nonstationary in the state model in case (iii). To see this, let Inline graphic be the eigenvalues of Inline graphic, i.e., the solutions for Inline graphic. For a stationary process, the eigenvalues have to satisfy Inline graphic, otherwise the process is nonstationary. This excludes the case Inline graphic.
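The eigenvalue condition can be checked numerically. The sketch below assumes the usual condition for a first-order autoregressive state process, namely that all eigenvalues of the autoregressive matrix have modulus strictly smaller than 1; the function name and the example matrices are hypothetical.

```python
import numpy as np

def is_stationary(F, tol=1e-12):
    """True if every eigenvalue of the autoregressive matrix F lies strictly
    inside the unit circle (assumed stationarity condition)."""
    eigvals = np.linalg.eigvals(F)
    return bool(np.max(np.abs(eigvals)) < 1.0 - tol)

d = 3
print(is_stationary(0.9 * np.eye(d)))            # contracting dynamics
print(is_stationary(np.eye(d)))                  # random walk, eigenvalues equal to 1
print(is_stationary(np.diag([0.5, 0.8, 1.02])))  # one eigenvalue larger than 1
```

Under this condition the random walk model of case (ii), with the identity matrix as autoregressive parameter, is classified as nonstationary, as is any fitted matrix with an eigenvalue larger than 1.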

The AICs of the full model with the various state-equations become progressively smaller for increasingly complex state-space models (38525 for case (i), 36263 for case (ii), and 36231 for case (iii)). The fact that conditions (ii) and (iii) are better fitted to the data than condition (i) confirms that the proposed method of time-varying spike correlation analysis performs better than an analysis based on a stationary log-linear model for these data. In addition, the fact that condition (iii) better fits the data than condition (ii) supports the use of autoregressive models (note: The estimation in Figure 4C is done with state model (iii)). We find that the fitted Inline graphic (to the data in Figure 4A) yields an eigenvalue larger than 1, indicating that the underlying state process (Inline graphic bins) is modeled as a nonstationary process. We consider that the nonstationary state model is selected because of the relatively short observation period, during which a nonstationary trend for the parameters can appear even if the state process is stationary in the long run.

Test for presence of spike correlation in nonstationary spike sequences

One of the goals of a time-resolved analysis of spike correlation is to discover dynamical changes in the correlated activities of neurons that reflect the behavior of an animal. This implies the necessity of dealing with the within-trial nonstationarity that is typically present in the data from awake behaving animals. However, we know from other correlation analysis approaches that, if not well corrected for, nonstationary spike data bear the potential danger of generating false outcomes [25], [83]. Here we deal with the within-trial nonstationarity of the data by using the state-space log-linear model while assuming identical dynamic spiking statistics across trials (across-trial stationarity). In order to correctly detect the time-varying correlation structure within trials, we apply to the state-space log-linear model a Bayesian model comparison method based on the Bayes factor (BF) [74]–[76], and combine it with a surrogate approach. The BF is a likelihood ratio for two different hypothetical models of latent signals, e.g., in our application, different underlying spike correlation structures. Using the BF, we determine which of the two spike correlation models the spike data supports. By computing the BF for a particular task period in a behavioral experiment, we can test whether the assumed correlation structure appears in association with the timing of the animal's behavior. In the following, we denote a specific task period of interest by the time period Inline graphic.

In this study, the BF, Inline graphic, is defined as the ratio of the marginal likelihoods of the observed spike patterns, Inline graphic, in the time period Inline graphic under different models, Inline graphic or Inline graphic, assumed for the hidden state parameters,

graphic file with name pcbi.1002385.e427.jpg (12)

By successively conditioning the past, the BF is computed by the multiplication of the bin-by-bin one-step BF given at time Inline graphic as Inline graphic. Here, the bin-by-bin BF at time Inline graphic, Inline graphic, can be calculated as (see the Methods subsection, ‘Bayesian model comparison method for detecting spike correlation’),

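$$B_{12}(\mathbf{y}_{t} \mid \mathbf{y}_{1:t-1}) = \frac{\int_{S_1} p(\boldsymbol{\theta}_t \mid \mathbf{y}_{1:t})\, d\boldsymbol{\theta}_t \Big/ \int_{S_2} p(\boldsymbol{\theta}_t \mid \mathbf{y}_{1:t})\, d\boldsymbol{\theta}_t}{\int_{S_1} p(\boldsymbol{\theta}_t \mid \mathbf{y}_{1:t-1})\, d\boldsymbol{\theta}_t \Big/ \int_{S_2} p(\boldsymbol{\theta}_t \mid \mathbf{y}_{1:t-1})\, d\boldsymbol{\theta}_t}$$ (13)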

where Inline graphic is the space of the interaction parameters, Inline graphic, for the model, Inline graphic (Inline graphic). In Eq. 13, Inline graphic is the filter density and Inline graphic is called the one-step prediction density, both of which are obtained in the Bayesian recursive filtering algorithm developed in the Methods section (cf. Eqs. 25, 26 and Eqs. 31, 32). Therefore, the bin-by-bin BF at time Inline graphic, Inline graphic, is the ratio of the odds (of opposing models) found by observing the spike train data up to time Inline graphic (filter odds, the numerator in Eq. 13) to the odds predicted from Inline graphic without observing the data at time Inline graphic (prediction odds, the denominator in Eq. 13). Thus, an unexpected synchronous spike pattern that significantly updates the filter odds for the interaction parameters from their predicted odds gives rise to a large absolute value for the BF. Because the posterior densities are approximated as a multivariate normal distribution in our filtering algorithm, the BF at time Inline graphic can be easily computed by using normal distribution functions. Please see the subsection, ‘Bayesian model comparison method for detecting spike correlation’, in the Methods section for the derivation of Eq. 13 and detailed analysis of the BF.
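As a minimal sketch of how the bin-by-bin BF might be evaluated with normal distribution functions, consider the special case of one scalar interaction parameter tested for positivity (the first model assumes the parameter is positive, the second that it is zero or negative). The sketch assumes that the filter and one-step prediction densities of that parameter are univariate normal marginals with known means and standard deviations. The function names and arguments are hypothetical and are not taken from the paper's implementation.

```python
import math

def odds_positive(mean, sd):
    """Odds P(theta > 0) / P(theta <= 0) for theta ~ N(mean, sd**2)."""
    z = mean / (sd * math.sqrt(2.0))
    p_pos = 0.5 * math.erfc(-z)    # P(theta > 0)
    p_nonpos = 0.5 * math.erfc(z)  # P(theta <= 0)
    return p_pos / p_nonpos

def bin_bayes_factor(filter_mean, filter_sd, pred_mean, pred_sd):
    """Ratio of filter odds to prediction odds (the structure of Eq. 13).

    Assumption: both densities are univariate normal marginals of the
    tested interaction parameter.
    """
    return odds_positive(filter_mean, filter_sd) / odds_positive(pred_mean, pred_sd)

# A spike pattern that shifts the filter mean above the predicted mean
# yields a BF larger than 1 (evidence for a positive interaction).
bf_example = bin_bayes_factor(filter_mean=0.5, filter_sd=1.0,
                              pred_mean=0.0, pred_sd=1.0)
```

When the filter density equals the prediction density, the observation carries no evidence and the sketch returns a BF of exactly 1.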

The BF becomes larger than 1 if the data, Inline graphic, support model Inline graphic as opposed to Inline graphic as an underlying spike correlation structure and becomes smaller than 1 if the data support model Inline graphic as opposed to Inline graphic. Alternatively, it is possible to use the logarithm of the BF, known as the ‘weight of evidence’ [75] which becomes positive if the data support model Inline graphic as opposed to model Inline graphic and negative in the opposite situation. Below, we display the results for the BF in bit units (logarithm of the BF to base 2), i.e., the weight of evidence. By sequentially computing the bin-by-bin BF, we can obtain the weight of evidence in a period Inline graphic as the summation of the local weight of evidence: Inline graphic.

An intuitive interpretation of the BF values is provided in the literature [74], [76]. For example, in [76], a BF (weight of evidence) from 1.6 to 4.3 bit was interpreted as ‘positive’ evidence in favor of Inline graphic against Inline graphic. Similarly, a BF from 4.3 to 7.2 bit was interpreted as ‘strong’ evidence, and a BF larger than 7.2 bit was found to be ‘very strong’ evidence in favor of Inline graphic against Inline graphic. While the classical guidelines are useful in practical situations, they are defined subjectively. Thus, in this study, in order to objectively analyze the observed value of the BF, we combine the Bayesian model comparison method with a surrogate approach. In this surrogate method, we test the significance of the observed BF for the tested spike interactions by comparing it with the surrogate BFs computed from the null-data generated by destroying only the target spike interactions while the other structures such as the time-varying spike-rates and lower-order spike interactions are kept intact.
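The summation of local weights of evidence and the classical guideline quoted above can be sketched as follows. The thresholds (1.6, 4.3, and 7.2 bit) are the ones cited in the text from [76]. The function names, the handling of values exactly on a threshold, and the label used below the 'positive' range are assumptions made for illustration only.

```python
import math

def weight_of_evidence_bits(bin_bfs):
    """Sum of log2 of the bin-by-bin BFs over a period (weight of evidence in bit)."""
    return sum(math.log2(b) for b in bin_bfs)

def classify_evidence(bits):
    """Label evidence in favor of the first model using the thresholds quoted from [76].

    Assumption: only evidence favoring the first model is labeled; boundary
    values are assigned to the stronger category except at 7.2 bit.
    """
    if bits > 7.2:
        return "very strong"
    if bits >= 4.3:
        return "strong"
    if bits >= 1.6:
        return "positive"
    return "below positive"

# Three hypothetical bin-by-bin BFs: 1 + 2 - 1 = 2 bit in total.
total_bits = weight_of_evidence_bits([2.0, 4.0, 0.5])
label = classify_evidence(total_bits)
```

Because these labels are subjective, a sketch like this serves only as a descriptive summary; the surrogate test described in the text provides the actual significance assessment.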

The BF in a behaviorally relevant sub-interval Inline graphic can be computed from the optimized state-space log-linear model fitted to the entire spike train data in Inline graphic. Here, for the purpose of testing spike correlation in the sub-interval, we recommend using Inline graphic in the state model because the autoregressive parameters are optimized for the entire spike train data and are not necessarily optimal for the sub-interval. In addition, a typical trial-based experiment is characterized by discrete behavioral or behaviorally relevant events, e.g., movement onset after a go signal or a cue signal for trial start. Thus, on top of the expected smooth time-varying change in the spike rate and spike correlation, sudden transitions may be expected in their temporal trajectories. Because we use time-independent smoothing parameters (i.e., hyperparameter Inline graphic in Eq. 8) that were optimized for the entire data (see the EM algorithm in the Methods section), such abrupt changes may not be captured very well. This may cause a false detection or a failure in the detection of the spike correlation at the edge of a task period. For such data, we suggest applying the Bayesian model comparison method to state-space models that are independently fitted to each of the task periods (or relatively smooth sub-intervals within each task period).

Detecting triple-wise spike correlation in simulated nonstationary spike sequences

In this subsection, we examine a method to test for the presence of a higher-order (triple-wise) spike correlation using simulated spike train data with known spike interaction dynamics. In this context, we again use the simulated spike data of Inline graphic neurons (of length 500 bins with Inline graphic trials) shown in Figure 4A, which was generated by the time-varying model, as shown in Figure 4C (dashed lines). In the following, we assume for this data an underlying experimental protocol that can be segmented into behaviorally-relevant time periods, I–IV, as indicated in Figure 5A, as for example a period before the trial starts, a preparatory period, a movement period, etc. (e.g., see [8], [19] for a typical experimental protocol).

Figure 5. Detection of spike correlations and their relation to pseudo experimental protocol (simulation study).

Figure 5

(A) Sketch of an assumed experimental time course composed of four epochs (I–IV), e.g., of different behavioral task conditions. Epochs I–III have a duration of 100 bins, period IV has a duration of 200 bins. (B) Time-varying triple-wise spike interaction parameter of the underlying model (cf. Figure 4C, Inline graphic) used for the simulation of spike data (Inline graphic neurons, Inline graphic trials) during the time course outlined in A. The gray areas indicate the time intervals in which the triple-wise interaction is positive: Inline graphic. (C) Hypothesis testing for a triple-wise spike correlation based on surrogates. In each time period (I–IV), we perform a test on the Bayes factor (BF) resulting from the original data. The observed BF (Eq. 12), marked by a red line and triangle, is computed as evidence of a positive triple-wise interaction, Inline graphic: Inline graphic, as opposed to a zero or negative triple-wise interaction, Inline graphic: Inline graphic. We then compare the ‘observed BF’ with cumulative distribution functions (CDFs, solid lines) for the BFs derived from surrogate data sets generated from a model containing only up to pairwise interactions (Inline graphic). In all of the CDFs, the gray area indicates the 95% confidence interval of the distribution. If the observed BF falls into the lower tail of the distribution (blue area), Inline graphic is supported; if it falls into the upper tail (red marked area), Inline graphic is supported. (D) Time courses of the underlying pair-interaction parameters of a pairwise log-linear model (Inline graphic), Inline graphic (Inline graphic). These underlying parameters are obtained by projecting the full underlying model (Inline graphic) in Figure 4 to the pairwise model space, Inline graphic. The gray areas indicate the time periods in which all of the pairwise interactions are positive. 
(E) Similar tests as in C, but for the BF computed as evidence for the presence of simultaneous positive pairwise interactions, Inline graphic, as opposed to the absence of such an assembly, Inline graphic. (F) Compact visualization of the test results from C and E. The colored bars show which hypotheses are supported in the different time segments (red and blue) and where the null hypothesis cannot be rejected.

In each of the time periods, we compare opposing models for the hidden log-linear parameters of Inline graphic neurons, using a full model (Inline graphic). In one hypothetical model, Inline graphic, we assume that the triple-wise interaction term is positive, Inline graphic. The time (bin) index Inline graphic expresses that the BF is computed under the same models for all of the time steps, Inline graphic, in the respective time period. In the other model Inline graphic, we assume that the triple-wise interaction term is smaller than or equal to zero, Inline graphic. Neither model makes specific assumptions about the first and second order log-linear parameters; thus, they are allowed to be real numbers. These parameters are integrated out in Eq. 13. In this simulation study, we independently applied a state-space log-linear model to each of the four periods and computed the respective BFs. Figure 5B displays the ground truth of the time-dependent triple-wise interaction parameter, Inline graphic, of the full log-linear model. The gray areas indicate periods where the triple-wise spike interaction, Inline graphic, is positive: Model Inline graphic is true in this period. In the remaining periods (white), the model of a negative triple-wise spike interaction, Inline graphic, is true.

We compare the observed BF for the tested correlation models to the BFs computed from surrogate data sets resampled under a null hypothesis of no tested order of spike interaction. Specifically, to test the existence of a triple-wise spike correlation, the surrogate BFs are computed from resampled spikes generated from a model containing no triple-wise spike interaction. For this, we first apply a pairwise state-space log-linear model (Inline graphic) to the observed spike data to derive estimates of the time-varying spike rates and dynamic pair-interactions. Then we generate 1000 surrogate samples of Inline graphic parallel spike sequences, where each sample is composed of Inline graphic trials, using the fitted pairwise state-space log-linear model. Thus, the resampled spike sequences exhibit the same time-varying spike rates and pairwise correlations as the original data, but the triplet coincidences occur on a chance level, as expected by the individual spike rates and pairwise correlations among the neurons. The surrogate BFs of a triple-wise spike correlation (using Inline graphic, Inline graphic) are computed for each surrogate data set by applying the full model (Inline graphic).
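The resampling step can be illustrated for a single time bin. The sketch below assumes three neurons and a log-linear model whose triple-wise term is set to zero, and it draws one spike pattern per trial from the resulting pattern probabilities. In the actual procedure the parameters vary from bin to bin and come from the fitted pairwise state-space model. Here the parameter values, function names, and data layout are hypothetical.

```python
import itertools
import math
import random

PATTERNS = list(itertools.product((0, 1), repeat=3))  # all 8 spike patterns
PAIRS = [(0, 1), (0, 2), (1, 2)]

def pattern_probabilities(theta1, theta2, theta123=0.0):
    """Probabilities of the 8 patterns under a log-linear model of 3 neurons.

    theta1: three first-order parameters; theta2: three pairwise parameters
    ordered as PAIRS; theta123: triple-wise parameter (zero for the null model).
    """
    weights = []
    for x in PATTERNS:
        e = sum(t * xi for t, xi in zip(theta1, x))
        e += sum(t * x[i] * x[j] for t, (i, j) in zip(theta2, PAIRS))
        e += theta123 * x[0] * x[1] * x[2]
        weights.append(math.exp(e))
    z = sum(weights)  # normalization constant
    return [w / z for w in weights]

def sample_surrogate_bin(theta1, theta2, n_trials, rng):
    """Draw one spike pattern per trial from the pairwise (null) model."""
    probs = pattern_probabilities(theta1, theta2, theta123=0.0)
    return rng.choices(PATTERNS, weights=probs, k=n_trials)

rng = random.Random(0)
surrogate = sample_surrogate_bin([-2.0, -2.0, -2.0], [0.5, 0.5, 0.5],
                                 n_trials=100, rng=rng)
```

Repeating this draw for every bin, and for each of the 1000 surrogate samples, would give null data in which triplet coincidences arise only through the rates and pairwise interactions.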

Figure 5C shows the observed BF (red vertical line and red triangle) and the cumulative distribution function (CDF) of the surrogate BFs (solid black lines) for each of the four time periods, I–IV. In each of the graphs, the gray area indicates the 95% confidence interval derived from the distribution of 1000 respective surrogate BFs. If the observed BF falls outside the confidence interval, the null hypothesis that no tested order of spike interaction exists is rejected. Further, by the two-tailed test, if the observed BF falls outside the confidence interval into the red area of the distribution on the right (i.e., significantly positive BF values), the data supports model Inline graphic as an underlying correlation structure, whereas if it falls into the blue area on the left, we conclude that it is a significantly negative BF, i.e., Inline graphic is supported. If the observed BF falls into the 95% confidence interval, the null hypothesis that no tested order of spike correlation exists is not rejected. We first look at the results for periods II and IV in Figure 5C. In these periods, the null hypothesis of no triple-wise spike correlation is rejected, and the observed BF correctly suggests a positive triple-wise correlation model (Inline graphic), reflecting the fact that the underlying triple-wise interaction term is positive throughout the two periods (see periods II and IV in Figure 5B). In contrast, in periods I and III, where a negative triple-wise correlation model (Inline graphic) is true, the null hypothesis of no triple-wise correlation cannot be rejected.
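The decision rule of the two-tailed surrogate test can be sketched as follows. The sketch takes the central 95% interval of the surrogate BFs from their empirical order statistics; the exact percentile convention used in the paper is not stated, so this choice, as well as the function name and return labels, is an assumption.

```python
def surrogate_test(observed_bf, surrogate_bfs, alpha=0.05):
    """Two-tailed test of an observed BF against surrogate (null) BFs.

    Returns 'M1 supported' (upper tail), 'M2 supported' (lower tail),
    or 'null not rejected' (inside the central interval).
    """
    s = sorted(surrogate_bfs)
    n = len(s)
    k = round(alpha / 2.0 * n)       # number of values cut from each tail
    lower, upper = s[k], s[n - 1 - k]
    if observed_bf > upper:
        return "M1 supported"
    if observed_bf < lower:
        return "M2 supported"
    return "null not rejected"

# Hypothetical surrogate BFs (in bit); 1000 values as in the text.
null_bfs = [float(v) for v in range(1000)]
decision = surrogate_test(980.0, null_bfs)
```

With 1000 surrogates and a 5% level, 25 values are cut from each tail, so the observed BF must exceed the 975th smallest surrogate value (or fall below the 26th) for the null hypothesis to be rejected.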

It is possible that the synchronous spike events observed among multiple neurons are detected as simultaneous increases in pair interaction terms using the log-linear model of the second order (Inline graphic), without having to take higher-order spike interactions into consideration. Paradoxically, this issue becomes relevant when a triple-wise spike correlation is a dominant factor in the generation of synchronous spike events because the projection of a full model (Inline graphic) that contains a positive triple-wise interaction term to a pairwise model induces simultaneous increases in the pair interaction terms in that projected model (Inline graphic). In such a case, an analysis of the higher-order spike correlations might just add redundant information about an animal's behavior to lower-order analyses. Thus, we repeat an analysis similar to the one above under this alternative hypothesis, using a model that contains up to pairwise interactions (Inline graphic). In this test, the BF (Eq. 12) is computed using the following assumed models on the hidden log-linear parameters: Inline graphic, all three neurons simultaneously exhibit positive pairwise interactions (Inline graphic for all Inline graphic); and Inline graphic, at least one of the pairwise interactions is not positive (Inline graphic for at least one Inline graphic). The first order log-linear parameters are again assumed to be real values. Surrogate data sets are obtained by resampling spike sequences from the first order state-space log-linear model (Inline graphic) fitted to the data. The surrogate BFs for the model of simultaneously positive pairwise interactions, as opposed to its complementary model, are computed for each of the surrogate data sets by applying a pairwise model (Inline graphic). Figure 5D displays the ground truth of the dynamics of the pairwise interaction terms, Inline graphic (Inline graphic), of the pairwise log-linear model. 
These are obtained by projecting the full underlying model (Figure 4C dashed lines) to the probability space of the second-order model (Inline graphic). Note that in period II, the pair-interaction terms simultaneously increase because the effect of the triple-wise interaction in Inline graphic is projected to this sub-space model (Inline graphic) (see Figure 4C, middle panel, for comparison, and observe that there is no such increase in the pair-interaction terms of the full log-linear model). The gray areas indicate the period where model Inline graphic (simultaneously positive pairwise interactions) is true. Figure 5E displays the results for testing the simultaneous increases in the pair interactions. We again look at periods II and IV. In period II, the null hypothesis of independent spike sequences is rejected, and model Inline graphic is supported, reflecting the fact that the projected pairwise interactions are positive for most of this time period. In contrast, the same null hypothesis cannot be rejected in period IV because neither model Inline graphic nor Inline graphic alone can support the data in this period: Models Inline graphic and Inline graphic are true in the first and second halves of period IV, respectively (see Figure 5D).
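The effect of projecting a full model onto the pairwise model can be checked numerically for a single time bin. The sketch below uses iterative proportional fitting, a standard way to find the distribution without a triple-wise term that matches all pairwise marginals of a given distribution; the paper does not state how its projection was computed, so this technique is swapped in for illustration. Parameter values and function names are hypothetical.

```python
import itertools
import math

PATTERNS = list(itertools.product((0, 1), repeat=3))
PAIRS = [(0, 1), (0, 2), (1, 2)]

def full_model(theta1, theta2, theta123):
    """Pattern probabilities of a full log-linear model of 3 neurons."""
    w = {}
    for x in PATTERNS:
        e = sum(t * xi for t, xi in zip(theta1, x))
        e += sum(t * x[i] * x[j] for t, (i, j) in zip(theta2, PAIRS))
        e += theta123 * x[0] * x[1] * x[2]
        w[x] = math.exp(e)
    z = sum(w.values())
    return {x: v / z for x, v in w.items()}

def pair_marginal(p, i, j):
    """Joint distribution of neurons i and j."""
    m = {}
    for x, v in p.items():
        m[(x[i], x[j])] = m.get((x[i], x[j]), 0.0) + v
    return m

def project_to_pairwise(p, sweeps=200):
    """Iterative proportional fitting from a uniform start: match all pairwise
    marginals of p while keeping the triple-wise term at zero."""
    q = {x: 1.0 / len(PATTERNS) for x in PATTERNS}
    for _ in range(sweeps):
        for i, j in PAIRS:
            target, current = pair_marginal(p, i, j), pair_marginal(q, i, j)
            q = {x: v * target[(x[i], x[j])] / current[(x[i], x[j])]
                 for x, v in q.items()}
    return q

def pair_interaction(q, i, j):
    """Pairwise log-linear parameter, read off with the third neuron silent."""
    def pat(a, b):
        x = [0, 0, 0]
        x[i], x[j] = a, b
        return tuple(x)
    return math.log(q[pat(1, 1)] * q[pat(0, 0)] / (q[pat(1, 0)] * q[pat(0, 1)]))

def triple_interaction(q):
    """Triple-wise log-linear parameter from the pattern probabilities."""
    num = q[(1, 1, 1)] * q[(1, 0, 0)] * q[(0, 1, 0)] * q[(0, 0, 1)]
    den = q[(1, 1, 0)] * q[(1, 0, 1)] * q[(0, 1, 1)] * q[(0, 0, 0)]
    return math.log(num / den)

# Full model with zero pairwise terms and a positive triple-wise term.
p_full = full_model([-2.0, -2.0, -2.0], [0.0, 0.0, 0.0], 2.0)
q_pair = project_to_pairwise(p_full)
projected_pairs = [pair_interaction(q_pair, i, j) for i, j in PAIRS]
```

In this example the pairwise terms of the full model are zero, yet all three projected pairwise terms come out positive, which mirrors the simultaneous increase described for period II.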

Figure 5F summarizes the results for periods I–IV using bars in corresponding colors. The results of our tests reflect the dynamics of the underlying parameters in Figures 5B and D, and we find that in period IV, only the test for a triple-wise spike correlation detected the presence of interactions among all three neurons in the data. This is because, in this time period, the dynamics of the underlying triple-wise interaction correlate well with the assumed behavioral time table, whereas the dynamics of the simultaneous pairwise interactions do not. In summary, it is possible that the higher-order analysis allows us to discover the correlated activities of multiple neurons associated with behavioral events, which may not be revealed by pairwise analyses.

Detecting triple-wise spike correlation in neural spike data from behaving monkey

Finally, we demonstrate the application of our method on simultaneous spike recordings from the primary motor cortex (MI) of an awake, behaving monkey. The data were recorded by Alexa Riehle and her colleagues [8] to test the hypothesis that neuronal cooperativity is involved in the planning of motor actions. Therefore, Riehle et al. designed a behavioral task in which different durations of preparation intervals were provided to the monkeys before they had to perform an arm movement and touch a target on a screen. The detailed time table of the task is as follows (cf. Figure 6A). A trial started by a signal indicating to the monkey that he may initiate the task by pressing a button. After initiating the trial, the monkey had to wait for 1000 ms until a preparatory signal (PS) was delivered, indicating to the monkey that now the preparation interval started. After appearance of a response signal (RS) at 600, 900, 1200, or 1500 ms (randomly selected with equal probability), the monkey had to move his arm and touch the target position. The reaction times of the monkey showed a dependence on the duration of the preparatory period (PP): The longer the PPs the shorter the reaction times [8], [19], [84]. The conditional probability for the occurrence of the RS increases during the longer PP, in particular at the times when the RS may occur. The hypothesis for the underlying network activity realizing reduced reaction times with longer PPs was that the system is better prepared for the requested movement by increasingly enhancing the synchrony in the relevant network to facilitate the response. Indeed, Riehle et al. showed that cortical neurons modulate the degree of excess synchronous spiking activities independently from their individual firing rates in conjunction with the occurrence of the expected RSs [8], [19], [84]. 
An open question that could not be conclusively answered in that study was whether groups of neurons larger than pairs coordinate their activity in relation to the expected events. Here, we approach this question by applying our newly developed method and demonstrate the existence of a higher-order (triple-wise) spike interaction in relation to the behavioral task.

Figure 6. Analysis of experimental spike data using the state-space log-linear model.

Figure 6

(A) Experimental time table of delayed-response hand movement task. The experiments were designed and conducted by Riehle and her colleagues [8], [84]. During the preparatory period (PP, 1500 ms) that starts with the preparatory signal (PS), the presentation of the response signal (RS) was expected at three distinct moments at 600, 900, and 1200 ms (expected signals, ESs). Here the RS occurs finally after the longest possible delay of 1500 ms. After the RS, the requested movement was executed (reaction and movement time, RT-MT). See [8], [84] and the text for the detailed experimental protocol. (B) Dot displays of the spike sequences (duration: 2 s, sampling resolution 1 ms) of three neurons simultaneously recorded from the primary motor cortex (MI). The spike sequences are aligned at the onset of the PS (Inline graphic trials). The synchronous spike events across the 3 neurons detected in individual trials (detection in bins of Inline graphic width) are marked by blue circles. (C) Observed rates of joint spike events, Inline graphic (Inline graphic). All of the events are detected in bins with a width of Inline graphic. (Top) Observed rates of the spike occurrence of the individual neurons (Inline graphic). (Middle) Observed rates of the synchronous spike events between two neurons specified by index Inline graphic (Inline graphic). (Bottom) Observed rate of the synchronous spike events of all 3 neurons, Inline graphic. (D) Estimation of the time-varying log-linear parameters of 3 neurons, Inline graphic (Inline graphic), from the binary data shown in C, according to the method summarized in Table 1. The panels depict the MAP estimates of the log-linear interaction parameters (solid lines; from top to bottom, the first and second order log-linear parameters and a triple-wise spike interaction, Inline graphic). The gray bands indicate the 95% credible interval. In this analysis, we used an identity matrix as an autoregressive parameter, i.e. 
Inline graphic, in the state model. The covariance matrix of the initial parameter was fixed to Inline graphic. (E) The smoothed estimate of a triple-wise spike interaction, Inline graphic, is computed from binary data constructed using a bin-width of 2 ms (Top) and a bin-width of 5 ms (Bottom). The top of each panel shows the timing of the synchronous spike events of all three neurons.

To that end, we analyze data that were in part analyzed by the Unitary Events analysis method (neuron id 2 and 3; shown in Figure 2 in [8]). Figure 6B shows this particular data set, which consists of three simultaneously recorded neurons (neuron id 2, 3, and 5; here we denote them as neurons 1, 2, and 3) observed during trials (Inline graphic) of the longest preparatory period of 1500 ms. As in Riehle et al. (Figure 2 in [8]), we align the trials at the PS and analyze the data for the interaction parameters as a function of time from 200 ms before the PS to 1800 ms after the PS. This time segment is composed of three behavioral epochs: an interval before the PS (200 ms), the PP with three expected signals (ESs, at 600, 900 and 1200 ms), and a period after the RS at 1500 ms including the reaction time and part of the movement (reaction and movement time, RT-MT) (see Figure 6A). The average firing rates of the neurons during the PP are 29.4, 12.9, and 41.9 Hz for neurons 1, 2, and 3 (neuron id 2, 3, and 5), respectively. We constructed binary sequences, Inline graphic, from the spike times of the neurons by binning the data using a bin-width of Inline graphic. Figure 6C displays the occurrence rates for the individual spiking activities (upper panel) and the pair and triple synchronous spike events (middle and bottom panels) detected in the bins with a width of Inline graphic.
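The construction of binary sequences and of the observed joint event rates can be sketched as follows. The sketch assumes spike times in milliseconds relative to the PS, a bin is set to 1 if it contains at least one spike, and any incomplete bin at the end of the window is dropped. The 3 ms bin-width in the usage lines is the value given in the caption of Figure 7; the function names, data layout, and spike times are hypothetical.

```python
def bin_spike_times(spike_times_ms, t_start_ms, t_stop_ms, bin_width_ms):
    """Binary sequence: 1 if a bin contains at least one spike, else 0."""
    n_bins = int((t_stop_ms - t_start_ms) // bin_width_ms)
    binary = [0] * n_bins
    for t in spike_times_ms:
        k = int((t - t_start_ms) // bin_width_ms)
        if 0 <= k < n_bins:  # ignore spikes outside the analyzed window
            binary[k] = 1
    return binary

def joint_event_rate(trials, neurons):
    """Per-bin fraction of trials in which all listed neurons spike together.

    trials[l][i] is the binary sequence of neuron i in trial l.
    """
    n_trials = len(trials)
    n_bins = len(trials[0][0])
    return [sum(all(trial[i][k] for i in neurons) for trial in trials) / n_trials
            for k in range(n_bins)]

# Window from 200 ms before the PS to 1800 ms after it, 3 ms bins.
seq = bin_spike_times([-200.0, -197.5, 10.0, 1799.0], -200.0, 1800.0, 3.0)

# Two hypothetical trials of three neurons with four bins each.
toy_trials = [
    [[1, 0, 1, 0], [1, 0, 0, 0], [1, 1, 0, 0]],
    [[1, 0, 1, 0], [0, 0, 1, 0], [1, 0, 1, 0]],
]
triple_rate = joint_event_rate(toy_trials, neurons=(0, 1, 2))
```

Passing a single index or a pair of indices as `neurons` would give the individual and pairwise rates in the same way as the triple-wise rate.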

We apply a full log-linear model of Inline graphic neurons with up to a triple-wise interaction term to the binary spike data shown in Figure 6C. Figure 6D displays the estimated dynamics of the interaction parameters in the log-linear model by using the method summarized in Table 1. The thick lines indicate the MAP estimates, i.e., the most probable paths, of the interaction parameters. The gray bands and thin solid lines mark the 95% credible interval (the Bayesian analogue of the confidence interval) computed from a marginal posterior density. The last parameter of the log-linear model, Inline graphic, indicates the triple-wise interaction among the three neurons in the MI. A strong positive (or negative) value of this term indicates that the three neurons are triple-wise correlated, i.e., dependent in a manner that cannot be explained by pair correlations of the neurons. Specifically, a positive triple-wise interaction value means that the occurrence of the synchronous events among the three neurons is more frequent than the chance level expected from the observed individual rates and pairwise spike correlations among them.

The analysis of the MI neurons using the state-space log-linear model reveals that the triple-wise interaction, Inline graphic, gradually increases during the PP, with additional local peaks at the ESs that also increase in height towards the end of the PP (Figure 6D, bottom panel). This result is consistent with the results found by Riehle and collaborators for a subset of the neurons of the same data set (Figure 2 in [8]). In this previous study, neurons 1 and 2 (neuron id 2 and 3) were analyzed for excess spike synchrony using the Unitary Events analysis [32], [33]. The analysis revealed that the two neurons exhibit a modulation of significant excess spike synchrony, with peaks at the ESs (except the first) and at the RS [8], [32], [84]. That observation was interpreted as evidence that the neurons cooperate to prepare for motor action and facilitate the efficiency of movement execution [8], [84]. Similarly, occurrences of synchronous spike events of the three neurons roughly coincide with the ESs (Figure 6C, bottom panel); accordingly, our result shows that the triple-wise interaction, Inline graphic, is also locked to the ESs but decreases at the RS. Note that the triple-wise interaction is not only determined by the frequency of synchronous spike events of all three neurons but is also determined by frequencies of other observed spike patterns across trials. However, for typical neural spike train data with low spike rates and low rates of synchronous spiking of pairs of neurons, the occurrence of synchronous spike events of three neurons can significantly increase the triple-wise interaction. In the subsection ‘Bayesian model comparison method for detecting spike correlation’ in the Methods section we explore the contribution of different spike patterns in different spiking scenarios to the evidence of a triple-wise spike correlation in simulated data.
Figure 6E displays the modulation of the interaction term, Inline graphic, using the binary data constructed with bins of smaller (Inline graphic ms, upper panel) and larger (Inline graphic, lower panel) widths. The top of each panel shows the timing of the synchronous spike events of all three neurons. The sample size of synchronous spike events across all three neurons for the 2 ms bin-width is greatly reduced from that observed for a bin-width of 3 ms. Thus, the estimated dynamics are much less structured, and the data prevent us from detecting the existence of a triple-wise spike correlation using the proposed statistical test (see below for the application of the test method). With larger bins of a width of 5 ms, the precise locking of the synchronous spike events across the three neurons to the ESs is no longer apparent. However, the state-space log-linear model reveals a gradual increase in the strength of the triple-wise interaction until the end of the PP.

To strengthen the finding that the triple-wise interaction term increases during the PP, we test for the presence of a triple-wise spike correlation in the PP using the Bayes factor (marginal likelihood ratio, Eq. 12). We compute the BF for the opposing models using a full model (Inline graphic) of Inline graphic neurons for the hidden log-linear parameters. In one hypothetical model, Inline graphic, we assume that the triple-wise interaction term is positive, Inline graphic. In the other model, Inline graphic, we assume that the triple-wise interaction term is smaller than or equal to zero, Inline graphic. A large positive value of the BF indicates that the spike data in that period support model Inline graphic as opposed to model Inline graphic, i.e., the existence of excess synchronous spike events among the three neurons in comparison with the chance rate given by spike rates and pairwise correlations. A large negative value of the BF shows support of model Inline graphic as opposed to model Inline graphic, i.e., a paucity of synchronous spikes for the three neurons. Figure 7A shows the BF computed bin-by-bin (Eq. 13) in bit units. In the bottom panel of Figure 7A, we indicate the behavioral periods for which we test for the presence of a triple-wise spike correlation (the PP and RT-MT; in a later analysis we divide the PP into early and late stages). The evidence for model Inline graphic, as opposed to Inline graphic, in the behaviorally relevant time periods is obtained by summing the log of the bin-by-bin BF in that period (cf. Eqs. 12 and 13). The BF in the PP is found to be 18.08 bit, which is interpreted as ‘very strong’ evidence for the presence of a positive triple-wise spike correlation by the classical guideline [76].

Figure 7. Detection of triple-wise spike correlation of MI neurons.

Figure 7

(A) (Top) The bin-by-bin Bayes factor (BF) for a model of triple-wise spike interaction computed locally in time in bit units, Eq. 13. The bin-by-bin BF is computed from Eq. 13, using the state-space log-linear model fitted to the spike data (a total of 36 trials, 2 s binned using 3 ms bin-width; cf. Figure 6C). The BF computes evidence of a positive triple-wise spike interaction, Inline graphic, as opposed to a zero or negative triple-wise interaction, Inline graphic. The evidence for model Inline graphic as opposed to Inline graphic in a behaviorally relevant time period is obtained by summing the log of the bin-by-bin BF in that period (cf. Eq. 12). (Bottom) Two behavioral periods (preparatory period, PP, and reaction and movement time, RT-MT) tested for presence of a triple-wise spike correlation. In addition, to examine the evolution of the triple-wise spike correlation in the PP, the PP is divided into early and late stages at the middle of the PP. (B) (Left) The observed BF for an entire PP (marked by a red line and triangle) computed using Eq. 12 (bin-width: 3 ms). We then test the ‘observed BF’ using a distribution function (solid line) of null BFs derived from surrogate data sets generated from a fitted model containing only up to pairwise interaction terms (Inline graphic). The gray area indicates the 95% confidence interval of the distribution. Vertical dashed lines indicate the 90% confidence interval. If the observed BF falls into the lower tail of the distribution (blue area), Inline graphic is supported, if it falls into the upper tail (red marked area), Inline graphic is supported. (Right) The same analysis in the left panel, but with binary data constructed using larger (5 ms, upper panel) and smaller (2 ms, lower panel) bin-widths. (C) (Top) Test of observed BFs (bin-width 3 ms) computed in distinct periods: early PP (0–750 ms), late PP (750–1500 ms), and RT-MT (1500–1800 ms). 
(Bottom) The same test as in the top panel, except that binary data using a bin-width of 5 ms were used. (D) Test of observed BFs (bin-width: 3 ms) in each period using spike data from only the last half of the 36 trials (trials 19–36).

Next, we test the observed value of the BF in the PP (18.08 bit) by comparing it with surrogate BFs. These are computed from null data that contain no triple-wise spike interactions, while keeping the time-varying structure of the pairwise correlations and the individual spike rates the same as those observed in the original spike train data. The construction of the null data follows the same procedure as in the simulation study shown before: We apply a pairwise state-space log-linear model (Inline graphic) to the spike data and then generate 1000 surrogate samples (each with Inline graphic trials) of Inline graphic parallel spike sequences using the fitted pairwise state-space log-linear model. Figure 7B (left panel) displays the surrogate distribution along with the observed BF in the PP. The observed positive BF falls outside the 95% confidence interval, suggesting the presence of a positive triple-wise interaction as an underlying model for the spike train data in this period. In the right panels of Figure 7B, we display the results of the same analysis, but using spike train data binned by a larger (Inline graphic, top panel) and a smaller (Inline graphic, bottom panel) bin-width. With the larger bin-width (5 ms), we obtained almost the same results as in the analysis with a bin-width of 3 ms. However, we could not reach the same results for the smaller bin-width (2 ms) because the small count of synchronous spike events made it impossible to reject the null hypothesis.

If the higher-order dependency among the three neurons is related to motion preparation, the evidence for the triple-wise spike correlation should be stronger in the late stage of the PP than in the early stage. To test this idea, we divided the PP into early and late stages (each with a 750 ms duration, see the bottom of Figure 7A), and investigated whether the three MI neurons exhibit a triple-wise spike correlation in these periods. In addition, we selected a duration of 300 ms after the onset of the RS, during which the animal starts to initiate the motor action (reaction and movement time, RT-MT). The upper panels of Figure 7C display the results obtained with bins of width Inline graphic ms. The weights of evidence for the triple-wise spike correlation models are 7.16, 11.2, and 0.98 bit in the early PP, late PP, and RT-MT periods, respectively: i.e., ‘very strong’ evidence is found at the late stage of the PP. The null hypothesis is rejected in this late period of the PP, whereas the same null hypothesis is not rejected in the earlier period of the PP or in the period after the RS. In the late PP, the significantly large positive BF indicates the existence of a positive triple-wise spike correlation. The bottom panels of Figure 7C display the same analyses, but with a larger (Inline graphic) bin-width. We obtain the same results as in the analysis with a bin-width of 3 ms. When we analyze the data using a smaller bin-width (Inline graphic), the BFs are not significant in any of the periods (not shown here) because of the small samples of synchronous spikes. In addition, as in the simulation study in the previous subsection, we tested whether the observed synchronous spiking activities are explained merely by simultaneous increases in the pairwise interaction terms of the second order log-linear model. We did not find any evidence for such simultaneous increases in the pairwise interactions in any of the three periods (bin-width Inline graphic) (not shown here).
Thus only by the application of the higher-order analysis we were able to detect the task-dependent changes in the joint interactions of all three neurons.

The critical assumptions made in this study are independence and identical sampling across trials (across-trial stationarity). However, in the spike sequences of neuron 2 (neuron id 3) shown in Figure 6B, we notice an increase in the firing rates across trials. Higher firing rates yield a higher chance rate for synchronous spiking events. Thus, underestimation of the spike rate caused by averaging across all trials might induce spurious estimation of higher-order spike interaction. Indeed, synchronous spike events of the three neurons are mostly observed in later trials (e.g., trials 19–36, see blue dots in Figure 6B), when the firing rates were higher than in the earlier trials. Hence, we repeated the same analysis using only the latter half of the trials (trials 19–36). Figure 7D displays the results. We still observe a significantly positive BF in the late PP. We note that the estimated dynamics of the log-linear parameters using only the latter half of the trials do not differ much from those using all trials, i.e., Figure 6D. In the analysis using the first half of the trials (1–18, not shown), the BF is not significant during the late PP. Kilavik et al. [19] reported that “during practice, the temporal structure of synchrony was shaped, with synchrony becoming stronger and more localized in time during late experimental sessions”. We observe similar effects in our triple-wise spike correlation analysis.
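
The concern about pooling nonstationary trials can be illustrated with a short calculation. The sketch below assumes, purely for illustration, independent neurons whose per-bin spike probabilities are 0.02 in the first half of the trials and 0.06 in the second half; these numbers are hypothetical and are not taken from the recordings.

```python
import numpy as np

# Hypothetical per-bin spike probabilities in early and late trials
# (equal numbers of trials in each half; values are illustrative only).
rates = np.array([0.02, 0.06])

# Case 1: all three independent neurons change their rates together.
triple_true = np.mean(rates ** 3)      # trial-averaged chance rate of a triplet
triple_pooled = np.mean(rates) ** 3    # prediction from trial-averaged rates
ratio_covarying = triple_true / triple_pooled

# Case 2: only one neuron changes its rate; the other two stay at 0.04.
fixed = 0.04
ratio_single = np.mean(fixed * fixed * rates) / (fixed * fixed * np.mean(rates))

print(f"co-varying rates: observed/predicted triplet rate = {ratio_covarying:.2f}")
print(f"single drifting neuron: observed/predicted = {ratio_single:.2f}")
```

In this toy setting the pooled prediction underestimates the triplet chance rate only when the rates of the neurons co-vary across trials, which is one reason to repeat the analysis on a subset of trials with more homogeneous rates.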

Discussion

In this study, we introduced a novel method for estimating dynamic spike interactions in multiple parallel spike sequences by means of a state-space analysis (see Methods for details). By applying this method to nonstationary spike train data using the pairwise log-linear model, we can extend the stationary analysis of the spike train data by the Ising/spin-glass model to within-trial nonstationary analysis (Figure 3). In addition, our approach is not limited to a pairwise analysis, but can perform analyses of time-varying higher-order spike interactions (Figure 4). It has been discussed whether higher-order spike correlations are important to characterize neuronal population spiking activities, assuming stationarity in the spike data [54]–[59]. Based on the state-space model optimized by our algorithm, we developed two methods to validate and test its latent spike interaction parameters, in particular the higher-order interaction parameters, which may dynamically change within an experimental trial. In the first method, we selected the proper order for the spike interactions incorporated in the model under the model selection framework using the approximate formula of the AIC for this state-space model (see Methods, ‘Selection of state-space model by information criteria’). This method selects the model that best fits the data overall across the entire observation period. The selected model can then be used to visualize the dynamic spike interactions or for a performance comparison with other statistical models of neuronal spike data. However, more importantly, the detailed structure of the transient higher-order spike interactions needs to be tested locally in time, particularly in conjunction with the behavioral paradigm. To meet this goal, we combined the Bayesian model comparison method (the Bayes factor) with a surrogate method (see Methods, ‘Bayesian model comparison method for detecting spike correlation’). 
The method allows us to test for the presence of higher-order spike correlations and examine their relation to experimentally relevant events. We demonstrated the utility of the method using neural spike train data simultaneously recorded from the primary motor cortex of an awake monkey. The result is consistent with, and further extends, the findings of the previous report [8]: We detected an increase in triple-wise spike interaction among three neurons in the motor cortex during the preparatory period in a delayed motor task, which was also tightly locked to the expected signals. Although the analysis was done for a limited number of neurons, smaller than the expected size of an assembly, it demonstrates that the nonstationary analysis of the higher-order activities is useful to reveal cooperative activities of the neurons that are organized in relation to behavioral demand. Of course, further analysis is required to strengthen the findings made above, including a meta-analysis of many different sets of multiple neurons recorded under the same conditions.

In this study, we adopted the log-linear model to describe the higher-order correlations among the spiking activities of neurons. There are, however, other definitions for ‘higher-order spike correlation’. An important alternative concept is the definition based on cumulants. Using the cumulants of an observed count distribution from a spike train pooled across neurons, Staude et al. developed an iterative test method that can detect the existence of a high amplitude in the jump size distribution of the assumed compound Poisson point process (CPP) model for the pooled spike train [34], [35]. This method can detect an assembly from a few occurrences of synchronous spike events to which many neurons belong, typically by using the lower-order cumulants of the observed spike counts. In contrast, the information geometry measure for the higher-order spike correlation used in this study aims to represent the correlated state that cannot be explained by lower-order interactions. Consequently, the information geometry measure extracts the strength of the higher-order dependency relative to the lower-order correlated state. Therefore, the presence of positive higher-order spike correlations does not necessarily indicate that many neurons spike synchronously whenever they spike because such activities can be induced by positive pairwise spike correlations alone [65], [85], [86] (see also Figure 8A and B in the Methods section). In contrast, the cumulant-based correlation method by Staude et al. [34] infers the presence of ‘higher-order correlation’ for such data by determining the presence of high amplitudes in the jump size distribution of the assumed CPP model. 
Yet another important tool for analyzing higher-order dependency among multiple neurons is the copula function, a standardized cumulative distribution function used to model the dependence structure of multiple random variables (see [87]–[89] for an analysis of neurophysiological data using the copula, including an analysis for higher-order dependency [89]). In summary, it should be remembered that the analysis method used for the higher-order dependency among neuronal spikes inherits its goal from the assumed model for spike generation as well as a parametric measure defined for the ‘higher-order’ spike correlation [34], [62].

Figure 8. Analysis of stationary spike correlations of Inline graphic simulated neurons using Bayes factor.

(A) Sketch of different time periods and the underlying models used for the generation of parallel spike sequences: (I) Model of independent spiking (Inline graphic, Inline graphic, Inline graphic for Inline graphic and Inline graphic); (II) Model of simultaneously positive pairwise interactions, without a triple-wise interaction (Inline graphic, Inline graphic, Inline graphic for Inline graphic); (III) Model of triple-wise interaction, with negative pair interactions (Inline graphic, Inline graphic, Inline graphic for Inline graphic). (B) Raster display of three parallel spike sequences, Inline graphic, within one example trial. Each spike is colored according to the spike pattern in which it appears: Spikes occurring in triplets are shown in red, spikes within doublets (all types) are marked in blue, and spikes not involved in any synchrony pattern are shown in black. (C) The bar plots demonstrate the Bayes factors (BFs), Eq. 48, for each of the time periods I–III in bit units. The upper panel shows the average BF when testing simultaneously positive pairwise interactions (Test 1), averaged across 200 realizations (Inline graphic). A positive value for the log of the BF supports the model for the presence of simultaneously positive pairwise interactions, Inline graphic, while a negative value supports the absence of such an assembly, Inline graphic. The BFs per sample and time period are computed by applying a pairwise state-space log-linear model (Inline graphic) independently to the three periods. We use a state model with Inline graphic. The lower panel shows the BF for the positive triple-wise spike interaction (Test 2), Inline graphic, as opposed to a zero or negative triple-wise spike interaction, Inline graphic, in each of the three periods. The BFs are computed from a full state-space log-linear model (Inline graphic). (D) Bar plot of the bin-by-bin BFs (Eq. 52) sorted by the different spike patterns in the three periods (from top to bottom). 
The contributions to the BFs (shown in C) from the different spike patterns are sorted and displayed using the indicated spike patterns (000, 100, 110 and 111) as representative examples. The gray bars indicate the average BFs of simultaneously positive pairwise interactions, with the average computed for the spike patterns observed in 200 realizations in the respective periods, while the dark gray bars indicate the average BFs for the triple-wise spike correlation.

Although we face a high-dimensional optimization problem in our settings, we are able to successfully obtain MAP estimates of the underlying parameters because of the simplicity of the formulation of the state-space model: The use of the log-concave exponential family distributions [50], [90] in both the state and observation models guarantees that the MAP estimates can be obtained using a convex optimization program. At each bin, the method numerically solves a nonlinear filter equation to obtain the mode of the posterior state density (the MAP estimates, see Eqs. 28, 29, and 30 in Methods). With only a few (3–8) Newton-Raphson iterations, the solution reaches a plateau (the increments of all the elements of the updated state space vector are smaller than Inline graphic). The entire optimization procedure can be performed in a reasonable amount of time: On a contemporary standard laptop computer it takes no longer than 30 seconds to obtain smooth estimates of a full log-linear model for Inline graphic neurons (Inline graphic bins, Figure 4), which includes 100 EM iterations. The method is even faster when approximating the posterior mode using the update formula Eq. 30 without any iterations, using the one-step prediction mean as an initial value. This fast approximation method suggested in [69] could even be utilized in a real-time, on-line application of our filter (the filtering method applied to a single trial, Inline graphic, using predetermined hyper-parameters) at the cost of estimation accuracy.

The pairwise analysis can be applied up to about Inline graphic neurons simultaneously to derive time-dependent pair interactions. However, the current version of the algorithm does not scale to a larger number of neurons because the number of spike patterns that need to be considered suffers from a combinatorial explosion. The major difficulty arises from the coordinate transformation from the Inline graphic-coordinates to the Inline graphic-coordinates that appear in the non-linear filter equation (Eq. 31 in Methods). The coordinate transformation is required in this equation to calculate the innovation signal, i.e., the difference between the observed synchrony rates, Inline graphic, and the expected synchrony rates (Inline graphic-coordinates) based on the model. We numerically derived the exact Inline graphic-coordinates by marginalizing the Inline graphic-dimensional joint probability mass function computed from the Inline graphic-coordinates. Thus, full knowledge of the probability mass function is required even if the model considers only the lower-order interactions. Because this is a common problem in the learning of artificial neural networks [46], [47], [91], sampling algorithms such as the Markov chain Monte Carlo method have been developed to approximate the expectation parameters, Inline graphic, without having to compute the partition function [92]. The inclusion of such methods would allow us to analyze the time-varying lower-order spike interactions from a larger number of parallel spike sequences. Recent progress [93], e.g., in the mean field approach and/or the minimum probability flow learning algorithm for an Ising model, may allow us to further increase the number of neurons that can be treated in this nonstationary pairwise analysis. 
Nonetheless, the method presented here, which aims at a detailed analysis of the dynamics in higher-order spike interactions, may not easily scale to massively parallel spike sequences that can be analyzed by other methods such as those based on the statistics pooled across neurons. Thus, we consider it to be important to combine the detailed analysis method proposed in this study with other state-of-the-art analysis techniques in practical applications. For example, test methods based on population measures such as the Unitary Event method and cumulant-based inference method [34], [35] allow us to detect the existence of statistically dependent neurons in massively parallel spike sequences. If the null-hypothesis of independence among those neurons is not rejected in these methods, we can exclude those neurons from any further detailed analysis of their dynamics using the methods proposed in this study.
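
The scaling problem discussed above can be made concrete by counting. The following sketch compares the number of interaction parameters in a pairwise and a full log-linear model with the number of spike patterns that must be enumerated for the exact coordinate transformation. The function names and the population sizes are illustrative choices, not values used in the study.

```python
from math import comb

def n_parameters(n_neurons, order):
    """Number of interaction terms up to the given order."""
    return sum(comb(n_neurons, k) for k in range(1, order + 1))

def n_patterns(n_neurons):
    """Number of binary spike patterns to enumerate for exact marginalization."""
    return 2 ** n_neurons

for n in (3, 10, 20):
    print(f"N={n:2d}: pairwise parameters={n_parameters(n, 2):4d}, "
          f"full-model parameters={n_parameters(n, n):8d}, "
          f"patterns={n_patterns(n):8d}")
```

Even when the pairwise model has only a few hundred parameters, the exact computation of the expectation parameters still requires a sum over all patterns, which is what limits the population size.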

Several critical assumptions made in the current framework need to be addressed. First, it was assumed in constructing the likelihood (Eq. 7) that no spike history effect exists in the generation of a population spike pattern. Second, we assumed the use of identically and independently distributed samples across trials when constructing the likelihood (Eq. 7). The first assumption may appear to be a strong constraint given the fact that individual neurons exhibit non-Poisson spiking activities [94]. However, as in the case of the estimation of the firing rate of a single neuron, the pooled spike train across the (independent) trials is assumed to obey a Poisson point process because of the general limit theorem for point processes [30], [31], [95], [96]. This is because most of the spikes in the pooled data come from different, independent trials. They are thus nearly statistically independent of each other, even if the individual processes are non-Poisson. Similarly, in our analysis, we used statistics from a pooled binary spike train, assuming independence across trials: The occurrences of joint spikes in the binary data pooled across trials are mostly independent of each other across bins. Because these joint spike occurrences are sparse (i.e., they rarely happen close to each other in the same trial), it is even more feasible to assume their statistical independence across bins. Third, however, while pooling independent and identical trials (the second assumption) may validate the first assumption of the independence of the samples across bins, that assumption of independently and identically distributed samples across trials has itself been challenged [97], [98] and is known to be violated in some cases, e.g., by drifting attention, ongoing brain activity, adaptation, etc. It is possible that the trial-by-trial jitter/variation in the spike data causes spurious higher-order spike correlation. 
Thus, as discussed in the section on the application of our methods to real neuronal data, it is important to examine the stationarity of the spike train data across trials. Note that not only the firing rates but also the spike synchrony can be shaped on a longer time scale by repeatedly practicing a task [19]. In fact, the current analysis method can be used to examine the long-term evolution of pairwise and higher-order spike interactions across trial sessions by replacing the role of a bin with a trial, assuming within-trial stationarity. It will be a challenge to construct a state-space log-linear model that additionally applies a smoothing method across trials (see [98] for such a method for a point process model).

The present method is left with one free parameter, namely the bin-width Inline graphic. The bin-width determines the permissible temporal precision of synchronous spike events. Very large bin-widths result in binary data that are highly synchronized across sequences, while very small bin-widths result in asynchronous multiple spike sequences. In the latter case, we might overlook the existing dependency between multiple neural spike sequences due to disjunct binning [99] (but see [57], [100], which aim to overcome this problem by modeling the spike interactions across different consecutive time bins). Within our proposed modeling framework, which focuses on instantaneous higher-order spike correlations, it is important to capture the innate temporal precision of the neuronal population under investigation using the appropriate bin-width. Thus, the choice may be guided by the biophysical properties of the neurons. However, it may be advantageous to derive the bin-width in a data-driven manner. For example, in the context of an encoding problem, the proper bin-width can be chosen based on the goodness-of-fit test for single neuron spiking activities [101], conditional on the spiking activities of the other neurons [40]. For questions about the relation of coincident spiking to stimulus/behavior, the bin-width may be selected based, for example, on the predictive ability of an external signal. For this goal, it is important to search for the optimal bin-width using elaborate methods such as those developed in the context of the Unitary Event analysis method [99] (see [25] for a review of related methods).
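
The effect of the bin-width on the detected synchrony can be sketched with a toy example. The code below bins three hypothetical spike trains (two near-coincident events with a jitter of about 2 ms) at several bin-widths and counts the bins in which all three neurons spike. The spike times, the trial duration, and the function name `bin_spikes` are assumptions made for illustration.

```python
import numpy as np

def bin_spikes(spike_times_ms, width_ms, duration_ms):
    """Convert spike times into binary sequences with disjunct bins."""
    n_bins = int(np.ceil(duration_ms / width_ms))
    x = np.zeros((len(spike_times_ms), n_bins), dtype=int)
    for i, times in enumerate(spike_times_ms):
        idx = np.floor(np.asarray(times) / width_ms).astype(int)
        x[i, idx] = 1  # several spikes in one bin still give a single 1
    return x

# Hypothetical spike times (ms) of three neurons within one 100 ms trial.
spikes = [[10.2, 50.6], [11.1, 51.9], [12.4, 49.8]]

# Number of bins containing the pattern (1, 1, 1) for each bin-width.
triplets = {w: int(bin_spikes(spikes, w, 100.0).prod(axis=0).sum())
            for w in (1, 3, 5, 10, 20)}
print(triplets)
```

With these particular spike times, no triplet is found at 1 ms or 3 ms, one of the two events is found at 5 ms and 10 ms, and both are found only at 20 ms. The missed event at intermediate widths is an instance of the disjunct binning problem: a bin border falls between spikes that are only about 2 ms apart.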

A substantial number of studies have demonstrated that stimulus and behavioral signals can be decoded simply based on the firing rates of individual neurons. At the same time, it has been discussed whether spike correlations, particularly higher-order spike correlations, are necessary to characterize neuronal population spiking activities [54]–[58] or to encode or decode information related to stimuli [53], [60], [102]. At this point in time, a smaller number of dedicated experiments have supported the conceptual framework of information processing using neuronal assemblies formed by neurons momentarily engaged in coordinated activities, as expressed by temporally precise spike correlations (see [6], [7], [9], [10] for reviews of these experiments). Nevertheless, it is possible that the current perspective on this subject has been partly formed by a lack of proper analysis approaches for simultaneously tracing time-varying individual pairwise spike interactions, and/or their higher-order interactions. Indeed, we demonstrated by the time-resolved higher-order analysis that three cortical neurons coordinated their spiking activities in accordance with behaviorally relevant points in time. Thus our suggested analysis methods are expected to be useful to reveal the dynamics of assembly activities and their neuronal composition, as well as for testing their behavioral relevance. We hope that these methods help shed more light on the cooperative mechanisms of neurons underlying information processing.

Methods

Mathematical properties of log-linear model

In this subsection, we review the known mathematical properties of a log-linear model for binary random variables. These properties are used in constructing recursive filtering/smoothing formulas in the next section. Using the multi-index, Inline graphic (see the Results subsection ‘Log-linear model of multiple neural spike sequences’), the probability mass function (Eq. 1), Inline graphic, where Inline graphic and Inline graphic (Inline graphic), and the expectation parameters (Eq. 2) are compactly written as

graphic file with name pcbi.1002385.e623.jpg (14)

and

graphic file with name pcbi.1002385.e624.jpg (15)

where Inline graphic is a feature function, here representing an interaction among the neurons indicated by the multi-index, Inline graphic (Eq. 3).

The Inline graphic- and Inline graphic-coordinates are dually flat coordinates in the exponential family probability space [44], [50], and the coordinate transformation from one to the other is given by the Legendre transformation [40], [50]. From Eq. 14, the log normalization function, Inline graphic, is written as

graphic file with name pcbi.1002385.e630.jpg (16)

The first derivative of the log normalization function, Inline graphic, with respect to Inline graphic (Inline graphic), provides the expectation parameter, Inline graphic:

graphic file with name pcbi.1002385.e635.jpg (17)

Let Inline graphic be the negative entropy of the distribution:

graphic file with name pcbi.1002385.e637.jpg (18)

Eqs. 17 and 18 complete the Legendre transformation from Inline graphic-coordinates to Inline graphic-coordinates [44], [50]. The Legendre transformation transfers the functional relationship of Inline graphic and Inline graphic to the equivalent relation in the dual coordinates, Inline graphic and Inline graphic. The inverse transformation is given by Eq. 18 and Inline graphic.
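
The relations reviewed above can be checked numerically for a small population. The sketch below enumerates all binary patterns of three neurons, evaluates the log normalization function, and verifies that its gradient equals the expectation parameters and that the Legendre dual equals the negative entropy. The names `theta`, `eta`, `psi` (log normalization) and `phi` (negative entropy), as well as the random parameter values, are illustrative assumptions.

```python
import itertools
import numpy as np

def feature_matrix(n_neurons, order):
    """Rows: all 2^N patterns; columns: interaction features (products of x_i)."""
    patterns = np.array(list(itertools.product([0, 1], repeat=n_neurons)))
    subsets = [I for k in range(1, order + 1)
               for I in itertools.combinations(range(n_neurons), k)]
    Fx = np.stack([patterns[:, list(I)].prod(axis=1) for I in subsets], axis=1)
    return Fx.astype(float)

def psi(theta, Fx):
    """Log normalization function, computed with the log-sum-exp trick."""
    a = Fx @ theta
    return a.max() + np.log(np.exp(a - a.max()).sum())

def pmf(theta, Fx):
    return np.exp(Fx @ theta - psi(theta, Fx))

Fx = feature_matrix(3, 3)                    # full model: 7 features, 8 patterns
rng = np.random.default_rng(0)
theta = rng.normal(scale=0.5, size=Fx.shape[1])

p = pmf(theta, Fx)
eta = Fx.T @ p                               # expectation parameters

# Gradient of psi by central finite differences.
eps = 1e-6
grad_psi = np.array([(psi(theta + eps * e, Fx) - psi(theta - eps * e, Fx)) / (2 * eps)
                     for e in np.eye(theta.size)])

phi = theta @ eta - psi(theta, Fx)           # Legendre dual of psi
neg_entropy = np.sum(p * np.log(p))

print(np.max(np.abs(grad_psi - eta)), phi - neg_entropy)
```

Both printed differences should be numerically negligible, which reflects the duality between the two coordinate systems.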

Using the log normalization function, we can obtain the multivariate cumulants of Inline graphic with respect to the random variables, Inline graphic. The cumulant generating function of the exponential family distribution is given as Inline graphic. Let us compactly write the partial derivative with respect to Inline graphic (i.e., Inline graphic) as Inline graphic. Then, the first-order cumulant is given as Inline graphic, as shown in Eq. 17. In general, the cumulants of the exponential family distribution are given by the derivatives of the log normalization function. Thus, the second derivative of Inline graphic yields the second-order cumulant, Inline graphic (by the cup, Inline graphic, we mean the multi-index representation of a union of the elements of the two multi-indices, e.g., if Inline graphic and Inline graphic, then Inline graphic):

graphic file with name pcbi.1002385.e658.jpg (19)

for Inline graphic. Inline graphic is known as the Fisher metric with respect to the natural parameters. Eqs. 17 and 19 are important relations used in this study to construct a non-linear filtering equation for a dynamic estimate of the natural parameters because we approximate the log-linear model (Eq. 14) with a precision of up to a (log) quadratic function (cf. Eqs. 28 and 29). Similarly, the higher-order derivatives yield higher-order multivariate cumulants. For example, the third-order derivative yields the third order cumulant, Inline graphic, where Inline graphic.
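
The second-order cumulant can be computed in two equivalent ways for binary variables: as the covariance matrix of the interaction features, or from the expectation parameters alone, because the product of two interaction features is the feature of the union of their multi-indices. The following sketch compares both for a full model of three neurons. Variable names and the random parameters are illustrative assumptions.

```python
import itertools
import numpy as np

n = 3
patterns = np.array(list(itertools.product([0, 1], repeat=n)))
subsets = [frozenset(I) for k in range(1, n + 1)
           for I in itertools.combinations(range(n), k)]
Fx = np.stack([patterns[:, sorted(I)].prod(axis=1) for I in subsets],
              axis=1).astype(float)

rng = np.random.default_rng(1)
theta = rng.normal(scale=0.5, size=len(subsets))
a = Fx @ theta
p = np.exp(a - a.max())
p /= p.sum()
eta = Fx.T @ p

# (i) Fisher metric as the covariance matrix of the features under p.
G_cov = Fx.T @ (p[:, None] * Fx) - np.outer(eta, eta)

# (ii) Fisher metric from expectation parameters, using the union of indices.
col = {I: k for k, I in enumerate(subsets)}
G_union = np.empty_like(G_cov)
for i, I in enumerate(subsets):
    for j, J in enumerate(subsets):
        G_union[i, j] = eta[col[I | J]] - eta[i] * eta[j]

print(np.max(np.abs(G_cov - G_union)))
```

For a model truncated at a lower order, the union of two multi-indices may lie outside the parameter set, so route (ii) then needs expectation values of higher-order features as well; route (i) remains applicable.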

The pseudo distance between two different distributions, Inline graphic and Inline graphic is defined using the Kullback-Leibler (KL) divergence

graphic file with name pcbi.1002385.e665.jpg (20)

We represent distribution Inline graphic by using Inline graphic-coordinates as Inline graphic, and Inline graphic by using Inline graphic-coordinates as Inline graphic. Here, we used Inline graphic for the Inline graphic-parameters of Inline graphic (and Inline graphic for Inline graphic-parameters of Inline graphic) in order to differentiate it from the representation of distribution Inline graphic in the Inline graphic-coordinates (and the representation of Inline graphic in the Inline graphic-coordinates). Then, the KL-divergence between the two distributions, Inline graphic and Inline graphic, is computed as [44], [50]

graphic file with name pcbi.1002385.e684.jpg (21)
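
As a numerical illustration of the mixed-coordinate form of the KL divergence, the sketch below evaluates the divergence between two log-linear models both directly from the probability mass functions and from the negative entropy of one distribution, the log normalization function of the other, and the inner product of their dual coordinates. This is the standard identity for dually flat coordinates; the direction of the divergence and all symbol names are assumptions here.

```python
import itertools
import numpy as np

n = 3
patterns = np.array(list(itertools.product([0, 1], repeat=n)))
subsets = [I for k in range(1, n + 1)
           for I in itertools.combinations(range(n), k)]
Fx = np.stack([patterns[:, list(I)].prod(axis=1) for I in subsets],
              axis=1).astype(float)

def psi(theta):
    a = Fx @ theta
    return a.max() + np.log(np.exp(a - a.max()).sum())

def pmf(theta):
    return np.exp(Fx @ theta - psi(theta))

rng = np.random.default_rng(2)
theta_p = rng.normal(scale=0.5, size=Fx.shape[1])   # p in natural coordinates
theta_q = rng.normal(scale=0.5, size=Fx.shape[1])
p, q = pmf(theta_p), pmf(theta_q)

eta_q = Fx.T @ q                  # q in expectation coordinates
phi_q = np.sum(q * np.log(q))     # negative entropy of q

kl_direct = np.sum(q * (np.log(q) - np.log(p)))
kl_dual = phi_q + psi(theta_p) - theta_p @ eta_q

print(kl_direct, kl_dual)
```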

Optimized estimation of dynamic spike interactions

We develop a non-linear recursive Bayesian filtering/smoothing algorithm and its optimization method in order to trace dynamically changing spike interactions from parallel spike sequences. To reach this goal, we use the expectation-maximization (EM) algorithm [68], [73], [103], [104], which is known to efficiently combine the construction of the posterior density of a state (the natural parameters) and the optimization of the hyper-parameters. This method maximizes the lower bound of the log marginal likelihood, Eq. 10. Using Jensen's inequality and nominal hyper-parameters, Inline graphic, the lower bound of the log marginal likelihood with hyper-parameters Inline graphic is given by

graphic file with name pcbi.1002385.e687.jpg (22)

Here, Inline graphic represents a negative entropy. The maximization of the lower bound with respect to Inline graphic is equivalent to maximizing the expected complete data log-likelihood in Eq. 22, known as the Inline graphic-function:

graphic file with name pcbi.1002385.e691.jpg (23)

The expectation in the above equation is read as Inline graphic. We maximize the Inline graphic-function by alternating the expectation (E) and maximization (M) steps. In the E-step, we obtain the expected values with respect to Inline graphic in Eq. 23 using a fixed Inline graphic. In the M-step, we obtain the hyper-parameter, Inline graphic, that maximizes Eq. 23. The resulting Inline graphic is then used in the next E-step. The details of each step are now given as follows.

E-step: Bayesian recursive filter/smoother

The E-step is composed of filtering and smoothing algorithms conducted by forward and backward recursions, respectively. The forward algorithm sequentially constructs a posterior density of the state at time Inline graphic given the spike data up to and including time Inline graphic, whereas the backward algorithm constructs a posterior density at time Inline graphic given the entire data. The posterior density allows us to compute the maximum a posteriori (MAP) estimate or Bayes estimator and provides uncertainty for the estimate. In the following, Inline graphic and Inline graphic denote the conditional mean, Inline graphic, and covariance, Inline graphic. The filter mean and covariance are denoted as Inline graphic and Inline graphic, respectively. The mean and covariance of a smooth posterior density are denoted as Inline graphic and Inline graphic, respectively.

We first compute the one-step prediction density, Inline graphic. This is the conditional density of the state at time Inline graphic given the observation of parallel spike sequences up to time Inline graphic. The one-step prediction density is written using the Chapman-Kolmogorov equation as [37], [69], [105]

graphic file with name pcbi.1002385.e712.jpg (24)

Here the transition probability, Inline graphic, is a multivariate normal distribution with mean Inline graphic and covariance Inline graphic, as defined in the state equation, Eq. 8. For an initial prior, Inline graphic, we use a normal distribution with mean Inline graphic and covariance Inline graphic. The other distribution, Inline graphic, in Eq. 24 is the filter density at time Inline graphic. In the next paragraph, the filter density will be obtained by approximating it with a normal distribution whose mean and covariance are denoted as Inline graphic and Inline graphic. Under this condition, the one-step prediction density (Eq. 24) again becomes a normal distribution whose mean, Inline graphic, and covariance, Inline graphic, are given as [37], [68]

graphic file with name pcbi.1002385.e725.jpg (25)
graphic file with name pcbi.1002385.e726.jpg (26)
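
The one-step prediction can be sketched in a few lines. The code assumes a first-order auto-regressive state equation with Gaussian noise, so that the prediction mean is the filter mean propagated by the auto-regressive matrix and the prediction covariance is the propagated filter covariance plus the state noise covariance. The names `F`, `Q`, and the numerical values are hypothetical.

```python
import numpy as np

def predict(theta_filt, W_filt, F, Q):
    """One-step prediction of mean and covariance for a linear-Gaussian state model."""
    theta_pred = F @ theta_filt
    W_pred = F @ W_filt @ F.T + Q
    return theta_pred, W_pred

d = 3
F = 0.95 * np.eye(d)             # hypothetical auto-regressive parameter
Q = 0.01 * np.eye(d)             # hypothetical state noise covariance
theta_filt = np.array([-2.0, -2.0, 0.5])
W_filt = 0.05 * np.eye(d)

theta_pred, W_pred = predict(theta_filt, W_filt, F, Q)
print(theta_pred, np.diag(W_pred))
```

The state noise covariance sets how much uncertainty is added at each bin, and thereby how strongly the subsequent observation can move the estimate.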

The filter density, Inline graphic, is the conditional distribution of the state given the observation of parallel spike sequences up to time Inline graphic. Using the likelihood function and one-step prediction density, the filter density is given by Bayes' theorem as

graphic file with name pcbi.1002385.e729.jpg (27)

This posterior density is a complicated function with respect to the natural parameter, Inline graphic. Here, we apply a Gaussian approximation to the posterior density using the Laplace method [37], [104], [106], [107]: The filter mean, Inline graphic, is identified with a mode of the posterior density as Inline graphic, and the filter covariance, Inline graphic, is determined from the Hessian of the log posterior at the mode as Inline graphic. The approximate posterior mode is obtained using the iterative procedure of a gradient ascent method or the Newton-Raphson method using the gradient and Hessian matrix. The gradient and Hessian of the log of the posterior density at Inline graphic are calculated as

graphic file with name pcbi.1002385.e736.jpg (28)
graphic file with name pcbi.1002385.e737.jpg (29)

Note that, as in Eqs. 17 and 19, the first derivative of the log normalization function, Inline graphic, with respect to Inline graphic provides the dual coordinates, Inline graphic: Inline graphic. Furthermore, the second derivative yields the Fisher metric: Inline graphic. In this study, we adopt the Newton-Raphson method:

graphic file with name pcbi.1002385.e743.jpg (30)

The gradient and Hessian, Eqs. 28 and 29, are evaluated using an old value, Inline graphic. Here, Inline graphic is a learning coefficient that was introduced in the context of the ‘natural’ gradient search algorithm [91]. For the small population size analyzed in this study, we used Inline graphic. However, it is recommended that a smaller positive value be selected for the analysis of a larger system to avoid numerical instability. Because both the likelihood function and prior density are logarithmically concave functions, the posterior is also log-concave. Thus, this optimization problem is convex, which guarantees a unique solution for the filter estimate [90]. The optimized natural parameter is selected as the filter mean, Inline graphic. The filter covariance is approximated as Inline graphic by using Inline graphic. It is also possible to use a simple gradient ascent method to obtain the mode, and then compute the Hessian matrix at the mode.
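
A minimal sketch of the Newton-Raphson search for the posterior mode in a single bin is given below. It assumes that the log posterior is the sum of the log-linear log-likelihood of the trial-averaged synchrony rates (weighted by the number of trials) and a Gaussian log prior given by the one-step prediction. The function names, the pairwise model of three neurons, the noise-free "observed" rates, and all numerical values are hypothetical.

```python
import itertools
import numpy as np

def feature_matrix(n_neurons, order):
    patterns = np.array(list(itertools.product([0, 1], repeat=n_neurons)))
    subsets = [I for k in range(1, order + 1)
               for I in itertools.combinations(range(n_neurons), k)]
    return np.stack([patterns[:, list(I)].prod(axis=1) for I in subsets],
                    axis=1).astype(float)

def eta_and_fisher(theta, Fx):
    """Expectation parameters and Fisher metric by exact enumeration."""
    a = Fx @ theta
    p = np.exp(a - a.max())
    p /= p.sum()
    eta = Fx.T @ p
    G = Fx.T @ (p[:, None] * Fx) - np.outer(eta, eta)
    return eta, G

def filter_update(y, n_trials, theta_pred, W_pred, Fx,
                  step=1.0, tol=1e-10, max_iter=50):
    """Newton-Raphson search for the posterior mode (Laplace approximation)."""
    W_pred_inv = np.linalg.inv(W_pred)
    theta = theta_pred.copy()            # start from the one-step prediction mean
    for n_iter in range(1, max_iter + 1):
        eta, G = eta_and_fisher(theta, Fx)
        grad = n_trials * (y - eta) - W_pred_inv @ (theta - theta_pred)
        hess = -n_trials * G - W_pred_inv
        delta = -step * np.linalg.solve(hess, grad)
        theta = theta + delta
        if np.max(np.abs(delta)) < tol:
            break
    _, G = eta_and_fisher(theta, Fx)
    W_filt = np.linalg.inv(W_pred_inv + n_trials * G)   # negative inverse Hessian
    return theta, W_filt, n_iter

Fx = feature_matrix(3, 2)                               # pairwise model, 6 parameters
theta_true = np.array([-1.0, -1.0, -1.0, 0.5, 0.5, 0.5])
y, _ = eta_and_fisher(theta_true, Fx)                   # noise-free synchrony rates
theta_pred = np.array([-1.5, -1.5, -1.5, 0.0, 0.0, 0.0])
W_pred = 0.05 * np.eye(6)

theta_filt, W_filt, n_iter = filter_update(y, 50, theta_pred, W_pred, Fx)
print(n_iter, theta_filt)
```

At convergence the gradient vanishes, so the estimate satisfies a fixed-point relation in which the prediction mean is corrected by the difference between observed and expected synchrony rates, scaled by the number of trials and the prediction covariance.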

The above method is equivalent to solving the following nonlinear recursive filter formulas:

graphic file with name pcbi.1002385.e750.jpg (31)
graphic file with name pcbi.1002385.e751.jpg (32)

Eq. 31 was obtained from Inline graphic. Eqs. 31 and 32 are recursively computed in combination with the one-step prediction equations, Eqs. 25 and 26, for Inline graphic. Because Inline graphic and Inline graphic are dual representations of the same probability distribution, Eq. 31 is a nonlinear equation and needs to be solved as suggested above (Eqs. 28, 29, 30). As pointed out for a point process adaptive filter [37], [69], the residual, Inline graphic, in Eq. 31 acts similarly to an innovation vector of a standard Kalman filter. The same error signal between the observed synchrony rates and expected synchrony rates is utilized in training the Boltzmann machine [46], [47], [91]. The innovation term corrects the one-step prediction mean, Inline graphic, i.e., an expected state by the prior distribution. The degree of the correction is determined by the number of repeated trials, Inline graphic, and uncertainty of the predicted state, Inline graphic. The hyper-parameters of the prior density significantly affect the latter gain (see Eq. 26), and thereby the smoothness of the estimated processes. The filter covariance equation, Eq. 32, describes the reduction of the prediction uncertainty, Inline graphic, by observing the parallel spike sequences at time Inline graphic, with the amount determined by the number of repeated trials, Inline graphic, and the Fisher information, Inline graphic. Please see Figure 1 for a geometric view of the recursive Bayesian filter.

Finally, we compute the smooth posterior density using the backward recursive algorithm [68], [105], [107],

graphic file with name pcbi.1002385.e764.jpg (33)

Because the density functions in the recursive formula were approximated as a normal distribution, we follow the fixed-interval smoothing algorithm [37], [68], [105] established for a Gaussian state and observation equation. Starting from Inline graphic and Inline graphic, which are obtained from the filtering algorithm, we obtain the smoothed mean and covariance,

graphic file with name pcbi.1002385.e767.jpg (34)
graphic file with name pcbi.1002385.e768.jpg (35)

with

graphic file with name pcbi.1002385.e769.jpg (36)

for Inline graphic. The lag-one covariance smoother, Inline graphic, which appears in the Inline graphic-function, is obtained using the method of De Jong and Mackinnon [108]:

graphic file with name pcbi.1002385.e773.jpg (37)
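For illustration, the backward recursion can be sketched for a scalar state. This is a minimal sketch of the standard fixed-interval (Rauch-Tung-Striebel type) smoother for a Gaussian first-order autoregressive state model, not a transcription of Eqs. 34 to 37. The function name and arguments are hypothetical, and the filter means and variances are assumed to be available from the forward pass.

```python
def fixed_interval_smoother(filt_mean, filt_var, F, Q):
    """Backward smoothing for a scalar Gaussian state (illustrative sketch).

    filt_mean[t], filt_var[t]: filter mean and variance at bin t (t = 0..T-1).
    F, Q: scalar autoregressive coefficient and state-noise variance
          (assumed names for the hyper-parameters of the state model).
    Returns smoothed means, smoothed variances, and lag-one covariances,
    where lag_one[t] is the smoothed covariance between bins t-1 and t.
    """
    T = len(filt_mean)
    sm_mean = list(filt_mean)   # the last smoothed value equals the last filter value
    sm_var = list(filt_var)
    lag_one = [None] * T        # lag_one[0] is undefined
    for t in range(T - 2, -1, -1):
        # One-step prediction from bin t to bin t+1.
        pred_mean = F * filt_mean[t]
        pred_var = F * filt_var[t] * F + Q
        # Smoother gain.
        gain = filt_var[t] * F / pred_var
        sm_mean[t] = filt_mean[t] + gain * (sm_mean[t + 1] - pred_mean)
        sm_var[t] = filt_var[t] + gain * (sm_var[t + 1] - pred_var) * gain
        # Lag-one covariance: gain times the smoothed variance at t+1.
        lag_one[t + 1] = gain * sm_var[t + 1]
    return sm_mean, sm_var, lag_one


sm_mean, sm_var, lag_one = fixed_interval_smoother(
    [0.0, 1.0, 2.0], [1.0, 1.0, 1.0], F=1.0, Q=1.0)
```

In the multivariate case the scalar products become matrix products and the division becomes a multiplication by the inverse of the one-step prediction covariance.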

M-step: Optimization of hyper-parameters

At the M-step, we optimize hyper-parameter Inline graphic given the posterior density under the principle of maximizing the Inline graphic-function. From Inline graphic, the update rule of the covariance matrix, Inline graphic, is obtained as

graphic file with name pcbi.1002385.e778.jpg (38)

Here, Inline graphic, Inline graphic, and Inline graphic are the smoother mean and covariance, and the lag-one covariance matrix given by Eqs. 34, 35, and 37, respectively. Similarly, from Inline graphic, the auto-regressive parameter, Inline graphic, is updated according to

graphic file with name pcbi.1002385.e784.jpg (39)

The mean of the initial distribution is updated with Inline graphic from Inline graphic. The covariance matrix, Inline graphic, is not updated; instead, we use a fixed matrix, Inline graphic. Alternatively, the covariance matrix of the initial distribution can be updated according to Inline graphic from Inline graphic, while the mean, Inline graphic, is fixed. Both methods work well with appropriate choices for the fixed parameters. In this study, we updated the mean vector, Inline graphic, of the initial normal distribution, and used a fixed diagonal matrix for its covariance matrix, Inline graphic. It was also suggested to use a stationary mean and covariance of an unconstrained state process (Eq. 8) as parameters of the initial prior distribution [68], [109]. The equilibrium condition from Eq. 8 yields Inline graphic and Inline graphic. The solutions are obtained as Inline graphic and Inline graphic (p. 121 and p. 426 in [109], p. 112 in [110]). In the latter, Inline graphic denotes the Kronecker product (tensor product) and the Inline graphic operator creates a single column vector from a matrix by stacking its column vectors. However, the stationary condition is not always satisfied. As we found that the fitted Inline graphic sometimes violates the stationary assumption, we did not adopt this approach in the current study.
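The stationary solution mentioned above can be sketched numerically. The example below assumes that the unconstrained state process is a first-order autoregressive model with zero-mean Gaussian noise, so that the stationary covariance solves a discrete Lyapunov equation via the Kronecker product and the vec operator. The function name `stationary_prior` and the argument names `F` and `Q` are hypothetical, and NumPy is assumed to be available. The check on the spectral radius makes explicit the case in which the stationary condition is violated.

```python
import numpy as np


def stationary_prior(F, Q):
    """Stationary mean and covariance of theta_t = F theta_{t-1} + noise (sketch).

    F: (d, d) autoregressive matrix; Q: (d, d) noise covariance.
    Raises ValueError if F does not define a stationary process.
    """
    d = F.shape[0]
    if np.max(np.abs(np.linalg.eigvals(F))) >= 1.0:
        raise ValueError("F is not stationary (spectral radius >= 1)")
    mean = np.zeros(d)  # mean = F mean has only the zero solution here
    # vec() stacks the columns of a matrix, hence order="F" (column-major).
    vec_Q = Q.reshape(-1, order="F")
    # Solve (I - F kron F) vec(S) = vec(Q), i.e. S = F S F' + Q.
    vec_S = np.linalg.solve(np.eye(d * d) - np.kron(F, F), vec_Q)
    cov = vec_S.reshape((d, d), order="F")
    return mean, cov


F_example = np.array([[0.5, 0.1], [0.0, 0.3]])  # hypothetical values
Q_example = np.eye(2)
mean_example, cov_example = stationary_prior(F_example, Q_example)
```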

Selection of state-space model by information criteria

The method developed in the previous subsection is applicable to a full log-linear model, as well as a model that considers an arbitrary order of interactions. In order to select the most predictive model among the hierarchical models in accordance with the observed spike data, we select the state-space model that minimizes the Akaike information criterion (AIC) for a model with latent variables [73], [105]:

graphic file with name pcbi.1002385.e801.jpg (40)

Here, Inline graphic is the log marginal likelihood (Eq. 10) and Inline graphic is the number of free hyper-parameters in the prior distribution. For the Inline graphicth order model, the number of natural parameters is given by Inline graphic. The number of free parameters in the prior distribution is computed as Inline graphic, where each term corresponds to the number of free parameters in Inline graphic, Inline graphic, and Inline graphic. Note that the AIC applied to the state-space model is sometimes referred to as the Akaike Bayesian information criterion (ABIC) [73]. In the following, we derive the approximation method to evaluate the AIC for the state-space log-linear model.
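The parameter counting can be made concrete with a short sketch. The number of natural parameters follows directly from the definition of a log-linear model truncated at a given interaction order. The count of free hyper-parameters below is an assumption made for illustration (a full autoregressive matrix, a symmetric state-noise covariance matrix, and an initial mean vector, with the initial covariance held fixed); all function names are hypothetical.

```python
from math import comb


def num_natural_parameters(n_neurons, order):
    """Number of interaction terms up to the given order for n_neurons neurons."""
    return sum(comb(n_neurons, k) for k in range(1, order + 1))


def num_free_hyperparameters(d):
    """Free hyper-parameters of the prior (assumed parameterization).

    Assumption: a full d x d autoregressive matrix (d*d), a symmetric d x d
    noise covariance (d*(d+1)/2), and a d-dimensional initial mean (d).
    """
    return d * d + d * (d + 1) // 2 + d


def aic(log_marginal_likelihood, k):
    """Akaike information criterion: -2 log-likelihood + 2 (number of parameters)."""
    return -2.0 * log_marginal_likelihood + 2.0 * k


d_pairwise = num_natural_parameters(3, 2)   # 3 neurons, up to pairwise terms
d_full = num_natural_parameters(3, 3)       # 3 neurons, full model
```

For three neurons, the pairwise model has six natural parameters and the full model has seven, so under the assumed parameterization the full model pays a larger penalty and is preferred only if its log marginal likelihood is sufficiently higher.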

The log marginal likelihood, Eq. 10, can be written as

graphic file with name pcbi.1002385.e810.jpg (41)

We make a log quadratic approximation to evaluate the integral. To accomplish this, we denote

graphic file with name pcbi.1002385.e811.jpg (42)

with

graphic file with name pcbi.1002385.e812.jpg (43)

The Laplace approximation of the integral in Eq. 42 is given as [80]

graphic file with name pcbi.1002385.e813.jpg (44)

By applying Eqs. 42, 43 and 44 to Eq. 41, the log marginal likelihood is approximated as

graphic file with name pcbi.1002385.e814.jpg (45)

We confirmed that the log-quadratic approximation provided a better estimate of the marginal likelihood than the first order approximation used in [77] by comparing them with a Monte Carlo approximation of the integral in Eq. 42. We select the state-space log-linear model that minimizes the AIC (Eq. 40), where the log marginal likelihood is approximated using Eq. 45.
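The idea behind the log-quadratic (Laplace) approximation can be illustrated in one dimension. The sketch below is a generic example rather than the integrand of Eq. 42: the function `q` is a hypothetical log integrand that combines a binomial-type log-likelihood for `n` trials with a Gaussian prior, and all numerical values are made up. The mode is found by Newton's method, and the approximation is compared with a brute-force numerical integral.

```python
import math


def log_laplace_integral(q, dq, d2q, theta0=0.0, n_newton=50):
    """Log of the Laplace approximation to the integral of exp(q(theta)).

    q, dq, d2q: the log integrand and its first and second derivatives.
    Returns (log of the approximate integral, mode of q).
    """
    theta = theta0
    for _ in range(n_newton):          # Newton iterations toward the mode
        theta -= dq(theta) / d2q(theta)
    curvature = -d2q(theta)            # positive at a maximum
    return q(theta) + 0.5 * math.log(2.0 * math.pi / curvature), theta


def log_numeric_integral(q, lo=-10.0, hi=10.0, steps=20000):
    """Log of a trapezoidal-rule estimate of the same integral."""
    h = (hi - lo) / steps
    total = 0.5 * (math.exp(q(lo)) + math.exp(q(hi)))
    total += sum(math.exp(q(lo + i * h)) for i in range(1, steps))
    return math.log(total * h)


# Hypothetical example: n trials, observed spike rate y, prior N(m, v).
n, y, m, v = 50, 0.2, 0.0, 1.0


def q(th):
    return n * (y * th - math.log1p(math.exp(th))) - 0.5 * (th - m) ** 2 / v


def dq(th):
    return n * (y - 1.0 / (1.0 + math.exp(-th))) - (th - m) / v


def d2q(th):
    p = 1.0 / (1.0 + math.exp(-th))
    return -n * p * (1.0 - p) - 1.0 / v


log_laplace, mode = log_laplace_integral(q, dq, d2q)
log_numeric = log_numeric_integral(q)
```

Because the integrand is sharply peaked when the number of trials is moderately large, the two log values agree closely in this example; the agreement generally degrades as the integrand departs from a Gaussian shape.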

For comparison with the AIC, we compute two other information criteria that employ different forms of the penalization term. The Bayesian information criterion [79], [80] (also known as Schwarz's criterion) is obtained by replacing the penalization term of Eq. 40, Inline graphic, with Inline graphic:

graphic file with name pcbi.1002385.e817.jpg (46)

Shimodaira's predictive divergence for indirect observation models (PDIO) [81] is given as

graphic file with name pcbi.1002385.e818.jpg (47)

Here, we redefine Inline graphic as a one-dimensional vector of free hyper-parameters, while Inline graphic denotes the one-step operator of EM iteration. To obtain the Jacobian matrix for the EM operator, we follow the algorithm described in Meng and Rubin [111]. In this method, the Jacobian matrix was approximated using a numerical differentiation of the EM operator. By perturbing one hyper-parameter and then computing a one-step EM iteration, numerical differentiations of the hyper-parameters with respect to the perturbed hyper-parameter were obtained. An entire Jacobian matrix was approximated by repeating the process while changing the hyper-parameter to be perturbed.
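The perturbation procedure can be sketched as a forward-difference Jacobian of a generic one-step map. This is a minimal sketch under stated assumptions: `em_step` stands for any function that maps a list of hyper-parameters to the updated list after one EM iteration, the toy map below is a hypothetical stand-in for it, and the step size is an arbitrary choice.

```python
def em_jacobian(em_step, w, rel_step=1e-6):
    """Forward-difference Jacobian of a one-step EM operator (sketch).

    em_step: function taking and returning a list of hyper-parameters.
    w: the point at which to differentiate (typically the converged estimate).
    Returns jac with jac[i][j] = d em_step(w)[j] / d w[i], i.e. row i holds
    the response of all hyper-parameters to a perturbation of the i-th one.
    """
    base = em_step(w)
    k = len(w)
    jac = []
    for i in range(k):
        h = rel_step * max(1.0, abs(w[i]))
        perturbed = list(w)
        perturbed[i] += h              # perturb one hyper-parameter
        stepped = em_step(perturbed)   # run a single EM iteration
        jac.append([(stepped[j] - base[j]) / h for j in range(k)])
    return jac


def toy_em_step(w):
    """Hypothetical linear stand-in for an EM update, used only for the demo."""
    return [0.5 * w[0] + 0.1 * w[1], 0.2 * w[1] + 1.0]


w_star = [0.25, 1.25]                  # fixed point of the toy map
jacobian = em_jacobian(toy_em_step, w_star)
```

Each row requires one additional EM iteration, so the cost grows linearly with the number of free hyper-parameters.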

Bayesian model comparison method for detecting spike correlation

In this subsection, we formulate a method for detecting the hidden structure of spike interaction by means of a Bayesian model comparison based on the Bayes factor (BF) [74]–[76]. The BF is a ratio of likelihoods for the observed data, based on two different assumptions about the hidden states (models Inline graphic and Inline graphic). Here we reiterate the definition of the BF for model Inline graphic as opposed to model Inline graphic used in this paper (cf. Eq. 12):

graphic file with name pcbi.1002385.e825.jpg (48)

The BF becomes larger than 1 if the data, Inline graphic in a time period Inline graphic, supports model Inline graphic as opposed to model Inline graphic, and becomes smaller than 1 if the data supports model Inline graphic as opposed to model Inline graphic. The BF can be computed from the one-step prediction and filter density obtained in the method developed in the preceding subsection. From Eq. 48, the BF can be rewritten as

graphic file with name pcbi.1002385.e832.jpg (49)

Let us define the bin-by-bin BF for the spike data at time Inline graphic as

graphic file with name pcbi.1002385.e834.jpg (50)

Using Bayes' theorem, we obtain [74]–[76]

graphic file with name pcbi.1002385.e835.jpg (51)

for Inline graphic. Using Eq. 51, we can rewrite the bin-by-bin BF as

graphic file with name pcbi.1002385.e837.jpg (52)

where Inline graphic and Inline graphic denote spaces that the natural parameters occupy, supported by models Inline graphic and Inline graphic, respectively. Here, Inline graphic and Inline graphic are the filter density and one-step prediction density, respectively. By sequentially computing Eq. 52, we can obtain the BF with respect to the sub-interval Inline graphic as Inline graphic.
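As an illustration of the sequential computation, consider a test on a single interaction parameter, with one model restricting it to positive values and the other to non-positive values. The sketch below assumes that the bin-by-bin BF is the ratio of the posterior odds (computed from the filter density) to the prior odds (computed from the one-step prediction density), and that both densities are Gaussian. The function names are hypothetical. A test on several parameters at once would require multivariate normal orthant probabilities, which are not covered here.

```python
import math


def prob_positive(mean, var):
    """P(theta > 0) for theta ~ N(mean, var)."""
    return 0.5 * (1.0 + math.erf(mean / math.sqrt(2.0 * var)))


def bin_log2_bf(filt_mean, filt_var, pred_mean, pred_var):
    """Bin-by-bin weight of evidence in bits for theta > 0 versus theta <= 0.

    Note: probabilities of exactly 0 or 1 would cause a division by zero;
    this sketch does not guard against that case.
    """
    post = prob_positive(filt_mean, filt_var)    # from the filter density
    prior = prob_positive(pred_mean, pred_var)   # from the one-step prediction
    post_odds = post / (1.0 - post)
    prior_odds = prior / (1.0 - prior)
    return math.log2(post_odds / prior_odds)


def weight_of_evidence(filt_means, filt_vars, pred_means, pred_vars):
    """Sum of bin-by-bin log2 BFs over a sub-interval (log of the product)."""
    return sum(
        bin_log2_bf(fm, fv, pm, pv)
        for fm, fv, pm, pv in zip(filt_means, filt_vars, pred_means, pred_vars)
    )


# Hypothetical values for two bins.
evidence = weight_of_evidence([0.5, 0.3], [0.5, 1.0], [0.0, 0.3], [1.0, 1.0])
```

A bin in which the observation leaves the density unchanged contributes zero bits, whereas a bin that shifts the filter density toward positive values relative to its prediction contributes positive evidence.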

A test with the following models in the sub-interval Inline graphic allows us to detect a momentarily active cell assembly of more than two neurons by the presence of simultaneously positive Inline graphicth-order spike interactions. In the Inline graphicth-order log-linear model of Inline graphic neurons, the natural parameters, Inline graphic (Inline graphic), represent the Inline graphic th-order spike interactions among neurons denoted in index Inline graphic. We examine whether a subset of Inline graphic neurons among the total Inline graphic neurons simultaneously exhibit positive Inline graphic th-order interactions. Let Inline graphic be the subset of Inline graphic neurons from Inline graphic neurons, e.g., Inline graphic from Inline graphic if Inline graphic and Inline graphic. Let Inline graphic be an Inline graphic-subset from Inline graphic, e.g. Inline graphic if Inline graphic. Then, the model where the subset neurons simultaneously exhibit positive Inline graphicth-order interactions (Inline graphic) and its complementary hypothesis (Inline graphic) are represented as

graphic file with name pcbi.1002385.e872.jpg (53)
graphic file with name pcbi.1002385.e873.jpg (54)

for Inline graphic. The remaining parameters are real: Inline graphic (Inline graphic and Inline graphic, excluding Inline graphic). These parameters are integrated out in Eq. 52. The above definition of the assembly is a clique [12], a subset in which each neuron is connected to every other neuron through the positive Inline graphicth-order interactions. Depending on the assembly structure one wishes to uncover, other models can be tested, such as one in which neurons are bound in a non-exclusive manner.

Analysis of Bayesian model comparison method using simulated stationary spike sequences

In this subsection, we analyze the Bayesian model comparison method by using simulated stationary spike sequences. Figure 8 illustrates an analysis where the BF is applied to stationary spike sequences of Inline graphic simulated neurons. To demonstrate the kinds of correlation scenarios that are distinguished by the BF, we generate stationary spike sequences using log-linear models of different spike correlation structures in three distinct time periods (I–III, Figure 8A). These are: (I) an independent model (Inline graphic for Inline graphic and Inline graphic for Inline graphic at Inline graphic), (II) a model containing simultaneously positive pairwise interaction terms, without a triple-wise interaction (Inline graphic for Inline graphic, Inline graphic for Inline graphic, and Inline graphic at Inline graphic), and (III) a model containing a positive triple-wise interaction, with negative pairwise interactions (Inline graphic for Inline graphic, Inline graphic for Inline graphic, Inline graphic at Inline graphic). A sample of the spike sequences from a single trial (Inline graphic) is shown in Figure 8B. In all of the models, the first order log-linear parameters, Inline graphic (Inline graphic), are adjusted such that the spike rates of the individual neurons are Inline graphic (Inline graphic). Therefore, the time segments cannot be distinguished on the basis of the firing rates. In addition, the model in period III is designed such that the projection to Inline graphic yields zero pairwise interactions, i.e., the projection yields an independent model. In this model, the second-order joint synchrony rates, Inline graphic (Inline graphic), are at the chance level expected from the individual spike rates, Inline graphic (Inline graphic). However, the model expresses excess joint spike synchrony between all three neurons, Inline graphic, which is larger than the expected rate (0.001). The presence of a positive triple-wise correlation, combined with the absence of marginal pairwise correlations, leads to very few excess synchronous spike triplets, as illustrated by the dot displays in period III in Figure 8B. In contrast, a period containing only positive pairwise correlations without a triple-wise interaction (see Figure 8B, period II) contains a relatively large number of triplets, which are, however, the mere expression of excess pairwise correlations.
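The relation between natural parameters and synchrony rates in a three-neuron log-linear model can be sketched as follows. The code enumerates the eight binary spike patterns and normalizes their weights. The parameter values in the example are hypothetical and are not those used for Figure 8; a per-bin spike probability of 0.1 is chosen only because it gives a chance triplet rate of 0.001 under independence.

```python
import itertools
import math


def pattern_probabilities(theta1, theta2, theta3):
    """Pattern probabilities of a log-linear model of three binary neurons.

    theta1: list of three first-order parameters.
    theta2: dict mapping neuron pairs (i, j) to pairwise parameters.
    theta3: triple-wise parameter.
    """
    log_w = {}
    for x in itertools.product((0, 1), repeat=3):
        s = sum(theta1[i] * x[i] for i in range(3))
        s += sum(val * x[i] * x[j] for (i, j), val in theta2.items())
        s += theta3 * x[0] * x[1] * x[2]
        log_w[x] = s
    log_z = math.log(sum(math.exp(v) for v in log_w.values()))  # log normalizer
    return {x: math.exp(v - log_z) for x, v in log_w.items()}


def synchrony_rate(probs, neurons):
    """Probability that all listed neurons spike in the same bin."""
    return sum(p for x, p in probs.items() if all(x[i] == 1 for i in neurons))


# Hypothetical independent model with spike probability 0.1 per bin.
th = math.log(0.1 / 0.9)
no_pairs = {(0, 1): 0.0, (0, 2): 0.0, (1, 2): 0.0}
independent = pattern_probabilities([th, th, th], no_pairs, 0.0)
triplet_rate = synchrony_rate(independent, (0, 1, 2))
```

Setting the pairwise or triple-wise parameters to nonzero values changes the individual rates as well, which is why the first-order parameters have to be readjusted if the rates are to be held fixed across models.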

We calculate the BFs to test for the presence of the two different correlation structures, i.e., the same tests discussed in the Results section: Test 1 checks for the presence of simultaneously positive pairwise interactions and Test 2 checks for the presence of a triple-wise spike correlation. In one test (Test 1) we ask whether the three neurons simultaneously exhibit positive pairwise interactions by applying a model that contains up to pairwise interactions (Inline graphic) to each of the three time periods. In that case, the BF (Eq. 48) is computed using a model which assumes that all of the pairwise terms are positive: Inline graphic, specified as Inline graphic for all Inline graphic, as opposed to a model where at least one of the pairwise interactions is not positive: Inline graphic, specified as Inline graphic for at least one Inline graphic. The time (bin) index, Inline graphic, expresses that the BF is computed under the same models for all time steps Inline graphic in the respective time period. In both models, we do not make specific assumptions for the first order log-linear parameters; thus, they are allowed to be real numbers. In the other test (Test 2), we ask whether the neurons exhibit excess triple-wise synchrony by applying a full model (Inline graphic) to each of the periods. Here, the BF is computed with the opposing models: Inline graphic, specified as Inline graphic, and Inline graphic, specified as Inline graphic. The lower-order log-linear parameters are again allowed to take any real value.

Figure 8C displays the average BFs (in units of bits) computed from 200 samples (each sample contains Inline graphic trial) within each of the periods I–III. In addition, we show the standard deviations of the BFs (shown as Inline graphic) derived from the BFs of the 200 realizations. The upper panel shows the result of Test 1, expressing the weight of evidence for the presence of excess synchrony realized by simultaneously positive pairwise interactions. We find substantial evidence for the simultaneously positive pairwise interactions in period II, but not in the other two periods. The lower panel displays the evidence for the presence of a positive triple-wise correlation as evaluated by Test 2. The positive triple-wise interaction present in period III is well detected by the large BF, and the very low BFs in periods I and II correctly indicate the absence of triple-wise correlation.

In Figure 8D, we examine how much each individual spike pattern contributes to the calculation of the BFs discussed above. The weight of evidence (log of the BF) in an observation period Inline graphic is obtained by summing the pieces of evidence computed locally at time Inline graphic using Eq. 52, as Inline graphic. Here we elucidate the contributions of each individual spike pattern to the BF by sorting the bin-by-bin evidence, Inline graphic, by spike pattern. Figure 8D displays the average BFs for each spike pattern in the three periods in units of bits. The degree of the contribution by a specific spike pattern to the evidence for the tested spike interaction model varies depending on the context in which the spike pattern is observed. For example, the magnitudes of the BFs for a triple-wise correlation (Test 2) found by observing a spike triplet (Inline graphic) vary greatly across the three periods (see dark gray bars at Inline graphic in periods I, II, and III). In period III, where pair-synchronous spikes are rarely observed because of the existence of negative pairwise interaction terms, the observation of triplet spikes provides substantial evidence of a triple-wise correlation. In contrast, in period II, where pair-synchronous spikes frequently occur because of the existence of positive pairwise interactions, the chance triplet spikes do not provide substantial evidence of a triple-wise correlation. These results come from the fact that an unexpected synchronous spike pattern that significantly alters and updates the filter density from its one-step prediction density gives rise to a large absolute BF value (Eq. 52). By the same logic, the BFs for the Inline graphic and Inline graphic patterns (and Inline graphic and Inline graphic, not shown) are small because they do not substantially change the posterior densities. However, they should not be neglected because these patterns are abundant as a result of the low firing rates. For example, in period I in Figure 8C, the accumulation of the small weight of evidence from the abundant spike patterns (such as Inline graphic and Inline graphic) offsets the large weight of evidence induced by a few chance coincidences such as triplet spikes (Inline graphic), thus producing virtually zero weight of evidence for this period.

Acknowledgments

We thank B. Staude, S. Koyama, Z. Chen, D. Ba, K. Sadeghi, and T. Toyoizumi for the valuable discussions, and the Diesmann Laboratory for providing the computing environment. We also thank Prof. Alexa Riehle for kindly allowing the use of her data in this study.

Footnotes

The authors have declared that no competing interests exist.

This work was supported in part by JSPS Research Fellowships for Young Scientists (HS), R01 MH59733 (ENB), DP1 OD003646 (ENB), R01 MH071847 (ENB), and RIKEN Strategic Programs for R&D (SG). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.Hebb DO. The Organization of Behavior: A Neuropsychological Theory. New York: Wiley; 1949. [Google Scholar]
  • 2.Abeles M. Corticonics: Neural Circuits of the Cerebral Cortex. Cambridge; New York: Cambridge University Press; 1991. [Google Scholar]
  • 3.Diesmann M, Gewaltig M, Aertsen A. Stable propagation of synchronous spiking in cortical neural networks. Nature. 1999;402:529–533. doi: 10.1038/990101. [DOI] [PubMed] [Google Scholar]
  • 4.Kumar A, Rotter S, Aertsen A. Spiking activity propagation in neuronal networks: reconciling different perspectives on neural coding. Nat Rev Neurosci. 2010;11:615–627. doi: 10.1038/nrn2886. [DOI] [PubMed] [Google Scholar]
  • 5.Kuhn A, Aertsen A, Rotter S. Higher-order statistics of input ensembles and the response of simple model neurons. Neural Comput. 2003;15:67–101. doi: 10.1162/089976603321043702. [DOI] [PubMed] [Google Scholar]
  • 6.Gerstein GL, Bedenbaugh P, Aertsen MH. Neuronal assemblies. IEEE Trans Biomed Eng. 1989;36:4–14. doi: 10.1109/10.16444. [DOI] [PubMed] [Google Scholar]
  • 7.Fujii H, Ito H, Aihara K, Ichinose N, Tsukada M. Dynamical cell assembly hypothesis - theoretical possibility of spatio-temporal coding in the cortex. Neural Netw. 1996;9:1303–1350. [PubMed] [Google Scholar]
  • 8.Riehle A, Grün S, Diesmann M, Aertsen A. Spike synchronization and rate modulation differentially involved in motor cortical function. Science. 1997;278:1950–1953. doi: 10.1126/science.278.5345.1950. [DOI] [PubMed] [Google Scholar]
  • 9.Sakurai Y. Population coding by cell assemblies–what it really is in the brain. Neurosci Res. 1996;26:1–16. doi: 10.1016/0168-0102(96)01075-9. [DOI] [PubMed] [Google Scholar]
  • 10.Harris KD. Neural signatures of cell assembly organization. Nat Rev Neurosci. 2005;6:399–407. doi: 10.1038/nrn1669. [DOI] [PubMed] [Google Scholar]
  • 11.Aertsen AM, Gerstein GL, Habib MK, Palm G. Dynamics of neuronal firing correlation: modulation of “effective connectivity”. J Neurophysiol. 1989;61:900–917. doi: 10.1152/jn.1989.61.5.900. [DOI] [PubMed] [Google Scholar]
  • 12.Berger D, Warren D, Normann R, Arieli A, Grün S. Spatially organized spike correlation in cat visual cortex. Neurocomputing. 2007;70:2112–2116. [Google Scholar]
  • 13.Maldonado P, Babul C, Singer W, Rodriguez E, Berger D, et al. Synchronization of neuronal responses in primary visual cortex of monkeys viewing natural images. J Neurophysiol. 2008;100:1523–1532. doi: 10.1152/jn.00076.2008. [DOI] [PubMed] [Google Scholar]
  • 14.Ito H, Maldonado PE, Gray CM. Dynamics of stimulus-evoked spike timing correlations in the cat lateral geniculate nucleus. J Neurophysiol. 2010;104:3276–3292. doi: 10.1152/jn.01000.2009. [DOI] [PubMed] [Google Scholar]
  • 15.Ahissar E, Vaadia E, Ahissar M, Bergman H, Arieli A, et al. Dependence of cortical plasticity on correlated activity of single neurons and on behavioral context. Science. 1992;257:1412–1415. doi: 10.1126/science.1529342. [DOI] [PubMed] [Google Scholar]
  • 16.Vaadia E, Haalman I, Abeles M, Bergman H, Prut Y, et al. Dynamics of neuronal interactions in monkey cortex in relation to behavioral events. Nature. 1995;373:515–518. doi: 10.1038/373515a0. [DOI] [PubMed] [Google Scholar]
  • 17.Ishikane H, Gangi M, Honda S, Tachibana M. Synchronized retinal oscillations encode essential information for escape behavior in frogs. Nat Neurosci. 2005;8:1087–1095. doi: 10.1038/nn1497. [DOI] [PubMed] [Google Scholar]
  • 18.Fujisawa S, Amarasingham A, Harrison MT, Buzsáki G. Behavior-dependent short-term assembly dynamics in the medial prefrontal cortex. Nat Neurosci. 2008;11:823–833. doi: 10.1038/nn.2134. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Kilavik BE, Roux S, Ponce-Alvarez A, Confais J, Grün S, et al. Long-term modi–cations in motor cortical dynamics induced by intensive practice. J Neurosci. 2009;29:12653–12663. doi: 10.1523/JNEUROSCI.1554-09.2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Sakurai Y. Hippocampal and neocortical cell assemblies encode memory processes for different types of stimuli in the rat. J Neurosci. 1996;16:2809–2819. doi: 10.1523/JNEUROSCI.16-08-02809.1996. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Steinmetz PN, Roy A, Fitzgerald PJ, Hsiao SS, Johnson KO, et al. Attention modulates synchronized neuronal firing in primate somatosensory cortex. Nature. 2000;404:187–190. doi: 10.1038/35004588. [DOI] [PubMed] [Google Scholar]
  • 22.Sakurai Y, Takahashi S. Dynamic synchrony of firing in the monkey prefrontal cortex during working-memory tasks. J Neurosci. 2006;26:10141–10153. doi: 10.1523/JNEUROSCI.2423-06.2006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Takahashi S, Sakurai Y. Sub-millisecond firing synchrony of closely neighboring pyramidal neurons in hippocampal CA1 of rats during delayed non-matching to sample task. Front Neural Circuits. 2009;3:9. doi: 10.3389/neuro.04.009.2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Brown EN, Kass RE, Mitra PP. Multiple neural spike train data analysis: state-of-the-art and future challenges. Nat Neurosci. 2004;7:456–461. doi: 10.1038/nn1228. [DOI] [PubMed] [Google Scholar]
  • 25.Grün S. Data-driven significance estimation for precise spike correlation. J Neurophysiol. 2009;101:1126–1140. doi: 10.1152/jn.00093.2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Grün S, Rotter S, editors. Analysis of Parallel Spike Trains. New York: Springer; 2010. [Google Scholar]
  • 27.Perkel DH, Gerstein GL, Moore GP. Neuronal spike trains and stochastic point processes. II. simultaneous spike trains. Biophys J. 1967;7:419–440. doi: 10.1016/S0006-3495(67)86597-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Gerstein GL, Perkel DH. Simultaneously recorded trains of action potentials - analysis and functional interpretation. Science. 1969;164:828–830. doi: 10.1126/science.164.3881.828. [DOI] [PubMed] [Google Scholar]
  • 29.Gerstein GL, Kiang NYS. An approach to the quantitative analysis of electrophysiological data from single neurons. Biophys J. 1960;1:15–28. doi: 10.1016/s0006-3495(60)86872-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Shimazaki H, Shinomoto S. A method for selecting the bin size of a time histogram. Neural Comput. 2007;19:1503–1527. doi: 10.1162/neco.2007.19.6.1503. [DOI] [PubMed] [Google Scholar]
  • 31.Shimazaki H, Shinomoto S. Kernel bandwidth optimization in spike rate estimation. J Comput Neurosci. 2010;29:171–182. doi: 10.1007/s10827-009-0180-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Grün S, Diesmann M, Aertsen A. Unitary events in multiple single-neuron spiking activity: I. detection and significance. Neural Comput. 2002;14:43–80. doi: 10.1162/089976602753284455. [DOI] [PubMed] [Google Scholar]
  • 33.Grün S, Diesmann M, Aertsen A. Unitary events in multiple single-neuron spiking activity: II. nonstationary data. Neural Comput. 2002;14:81–119. doi: 10.1162/089976602753284464. [DOI] [PubMed] [Google Scholar]
  • 34.Staude B, Rotter S, Grün S. Cubic: cumulant based inference of higher-order correlations in massively parallel spike trains. J Comput Neurosci. 2010;29:327–350. doi: 10.1007/s10827-009-0195-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Staude B, Grün S, Rotter S. Higher-order correlations in non-stationary parallel spike trains: statistical modeling and inference. Front Comput Neurosci. 2010;4:16. doi: 10.3389/fncom.2010.00016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Chornoboy ES, Schramm LP, Karr AF. Maximum likelihood identification of neural point process systems. Biol Cybern. 1988;59:265–275. doi: 10.1007/BF00332915. [DOI] [PubMed] [Google Scholar]
  • 37.Brown EN, Frank LM, Tang D, Quirk MC, Wilson MA. A statistical paradigm for neural spike train decoding applied to position prediction from ensemble firing patterns of rat hippocampal place cells. J Neurosci. 1998;18:7411–7425. doi: 10.1523/JNEUROSCI.18-18-07411.1998. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Truccolo W, Eden UT, Fellows MR, Donoghue JP, Brown EN. A point process framework for relating neural spiking activity to spiking history, neural ensemble, and extrinsic covariate effects. J Neurophysiol. 2005;93:1074–1089. doi: 10.1152/jn.00697.2004. [DOI] [PubMed] [Google Scholar]
  • 39.Okatan M, Wilson MA, Brown EN. Analyzing functional connectivity using a network likelihood model of ensemble neural spiking activity. Neural Comput. 2005;17:1927–1961. doi: 10.1162/0899766054322973. [DOI] [PubMed] [Google Scholar]
  • 40.Pillow JW, Shlens J, Paninski L, Sher A, Litke AM, et al. Spatio-temporal correlations and visual signalling in a complete neuronal population. Nature. 2008;454:995–999. doi: 10.1038/nature07140. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Kim S, Putrino D, Ghosh S, Brown EN. A granger causality measure for point process models of ensemble neural spiking activity. PLoS Comput Biol. 2011;7:e1001110. doi: 10.1371/journal.pcbi.1001110. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Kulkarni JE, Paninski L. Common-input models for multiple neural spike-train data. Network: Comp Neural Sys. 2007;18:375–407. doi: 10.1080/09548980701625173. [DOI] [PubMed] [Google Scholar]
  • 43.Vidne M, Ahmadian Y, Shlens J, Pillow JW, Kulkarni J, et al. Modeling the impact of common noise inputs on the network activity of retinal ganglion cells. J Comput Neurosci. 2011 doi: 10.1007/s10827-011-0376-2. In press. DOI: 101007/s10827-011-0376-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Amari SI. Information geometry on hierarchy of probability distributions. IEEE T Inform Theory. 2001;47:1701–1711. [Google Scholar]
  • 45.Nakahara H, Amari SI. Information-geometric measure for neural spikes. Neural Comput. 2002;14:2269–2316. doi: 10.1162/08997660260293238. [DOI] [PubMed] [Google Scholar]
  • 46.Ackley D, Hinton G, Sejnowski T. A learning algorithm for boltzmann machines. Cognitive Sci. 1985;9:147–169. [Google Scholar]
  • 47.Amari S, Kurata K, Nagaoka H. Information geometry of boltzmann machines. IEEE T Neural Networ. 1992;3:260–271. doi: 10.1109/72.125867. [DOI] [PubMed] [Google Scholar]
  • 48.Amari S. Learning patterns and pattern sequences by self-organizing nets of threshold elements. IEEE Trans Comput. 1972;21:1197–1206. [Google Scholar]
  • 49.Hopfield J. Neural networks and physical systems with emergent collective computational abilities. Proc Natl Acad Sci U S A. 1982;79:2554–2558. doi: 10.1073/pnas.79.8.2554. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Amari S, Nagaoka H. Methods of Information Geometry. Providence: American Mathematical Society; 2000. [Google Scholar]
  • 51.Amari S. Measure of correlation orthogonal to change in firing rate. Neural Comput. 2009;21:960–972. doi: 10.1162/neco.2008.03-08-729. [DOI] [PubMed] [Google Scholar]
  • 52.Schneidman E, Still S, Berry MJ, Bialek W. Network information and connected correlations. Phys Rev Lett. 2003;91:238701. doi: 10.1103/PhysRevLett.91.238701. [DOI] [PubMed] [Google Scholar]
  • 53.Montani F, Ince RAA, Senatore R, Arabzadeh E, Diamond ME, et al. The impact of high-order interactions on the rate of synchronous discharge and information transmission in somatosensory cortex. Philos Transact A Math Phys Eng Sci. 2009;367:3297–3310. doi: 10.1098/rsta.2009.0082. [DOI] [PubMed] [Google Scholar]
  • 54.Roudi Y, Nirenberg S, Latham PE. Pairwise maximum entropy models for studying large biological systems: when they can work and when they can't. PLoS Comput Biol. 2009;5:e1000380. doi: 10.1371/journal.pcbi.1000380. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Schneidman E, Berry MJ, Segev R, Bialek W. Weak pairwise correlations imply strongly correlated network states in a neural population. Nature. 2006;440:1007–1012. doi: 10.1038/nature04701. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Shlens J, Field GD, Gauthier JL, Grivich MI, Petrusca D, et al. The structure of multineuron firing patterns in primate retina. J Neurosci. 2006;26:8254–8266. doi: 10.1523/JNEUROSCI.1282-06.2006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Tang A, Jackson D, Hobbs J, Chen W, Smith JL, et al. A maximum entropy model applied to spatial and temporal correlations from cortical networks in vitro. J Neurosci. 2008;28:505–518. doi: 10.1523/JNEUROSCI.3359-07.2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Santos GS, Gireesh ED, Plenz D, Nakahara H. Hierarchical interaction structure of neural activities in cortical slice cultures. J Neurosci. 2010;30:8720–8733. doi: 10.1523/JNEUROSCI.6141-09.2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Yu S, Yang H, Nakahara H, Santos GS, Nikolic D, et al. Higher-order interactions characterized in cortical activity. J Neurosci. 2011;31:17514–17526. doi: 10.1523/JNEUROSCI.3127-11.2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Ohiorhenuan IE, Mechler F, Purpura KP, Schmid AM, Hu Q, et al. Sparse coding and high-order correlations in fine-scale cortical networks. Nature. 2010;466:617–621. doi: 10.1038/nature09178. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Martignon L, Hasseln HV, Grün S, Aertsen A, Palm G. Detecting higher-order interactions among the spiking events in a group of neurons. Biol Cybern. 1995;73:69–81. doi: 10.1007/BF00199057. [DOI] [PubMed] [Google Scholar]
  • 62.Martignon L, Deco G, Laskey K, Diamond M, Freiwald W, et al. Neural coding: higherorder temporal patterns in the neurostatistics of cell assemblies. Neural Comput. 2000;12:2621–2653. doi: 10.1162/089976600300014872. [DOI] [PubMed] [Google Scholar]
  • 63.Macke JH, Opper M, Bethge M. Common input explains higher-order correlations and entropy in a simple model of neural population activity. Phys Rev Lett. 2011;106:208102. doi: 10.1103/PhysRevLett.106.208102. [DOI] [PubMed] [Google Scholar]
  • 64.Gütig R, Aertsen A, Rotter S. Analysis of higher-order neuronal interactions based on conditional inference. Biol Cybern. 2003;88:352–359. doi: 10.1007/s00422-002-0388-0. [DOI] [PubMed] [Google Scholar]
  • 65.Amari SI, Nakahara H, Wu S, Sakai Y. Synchronous firing and higher-order interactions in neuron pool. Neural Comput. 2003;15:127–142. doi: 10.1162/089976603321043720. [DOI] [PubMed] [Google Scholar]
  • 66.Long J, II, Carmena J. A statistical description of neural ensemble dynamics. Front Comput Neurosci. 2011;5:52. doi: 10.3389/fncom.2011.00052. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67. Kass RE, Kelly RC, Loh WL. Assessment of synchrony in multiple neural spike trains using loglinear point process models. Ann Appl Stat. 2011;5:1262–1292. doi: 10.1214/10-AOAS429.
  • 68. Smith AC, Brown EN. Estimating a state-space model from point process observations. Neural Comput. 2003;15:965–991. doi: 10.1162/089976603765202622.
  • 69. Eden UT, Frank LM, Barbieri R, Solo V, Brown EN. Dynamic analysis of neural encoding by point process adaptive filtering. Neural Comput. 2004;16:971–998. doi: 10.1162/089976604773135069.
  • 70. Stevenson IH, Rebesco JM, Hatsopoulos NG, Haga Z, Miller LE, et al. Bayesian inference of functional connectivity and network structure from spikes. IEEE Trans Neural Syst Rehabil Eng. 2009;17:203–213. doi: 10.1109/TNSRE.2008.2010471.
  • 71. Truccolo W, Hochberg LR, Donoghue JP. Collective dynamics in human and monkey sensorimotor cortex: predicting single neuron spikes. Nat Neurosci. 2010;13:105–111. doi: 10.1038/nn.2455.
  • 72. Chen Z, Barbieri R, Brown EN. State-space modeling of neural spike train and behavioral data. In: Oweiss KG, editor. Statistical Signal Processing for Neuroscience and Neurotechnology. Burlington: Elsevier Science; 2010. pp. 175–218.
  • 73. Akaike H. Likelihood and the Bayes procedure. Trabajos de Estadística y de Investigación Operativa. 1980;31:143–166.
  • 74. Jeffreys H. Theory of Probability. Oxford: Clarendon Press; 1961.
  • 75. Good IJ. Weight of Evidence: A brief survey. In: Bernardo JM, De Groot MH, Lindley DV, Smith AFM, editors. Bayesian statistics 2. New York: North Holland; 1985. pp. 249–269.
  • 76. Kass RE, Raftery AE. Bayes factors. J Am Stat Assoc. 1995;90:773–795.
  • 77. Shimazaki H, Amari SI, Brown EN, Grün S. State-space analysis on time-varying correlations in parallel spike sequences. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, 2009. 2009. pp. 3501–3504. ICASSP-2009.
  • 78. Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. J Roy Stat Soc B Met. 1977;39:1–38.
  • 79. Schwarz G. Estimating the dimension of a model. Ann Stat. 1978;6:461–464.
  • 80. Rissanen J. Information and Complexity in Statistical Modeling. New York: Springer; 2009.
  • 81. Shimodaira H. A new criterion for selecting models from partially observed data. In: Cheeseman P, Oldford RW, editors. Selecting models from data - artificial intelligence and statistics IV. New York: Springer-Verlag; 1994. pp. 21–30.
  • 82. Shlens J, Field GD, Gauthier JL, Greschner M, Sher A, et al. The structure of large-scale synchronized firing in primate retina. J Neurosci. 2009;29:5022–5031. doi: 10.1523/JNEUROSCI.5187-08.2009.
  • 83. Louis S, Gerstein G, Grün S, Diesmann M. Surrogate spike train generation through dithering in operational time. Front Comput Neurosci. 2010;4:127. doi: 10.3389/fncom.2010.00127.
  • 84. Riehle A, Grammont F, Diesmann M, Grün S. Dynamical changes and temporal precision of synchronized spiking activity in monkey motor cortex during movement preparation. J Physiol Paris. 2000;94:569–582. doi: 10.1016/s0928-4257(00)01100-1.
  • 85. Bohte SM, Spekreijse H, Roelfsema PR. The effects of pair-wise and higher order correlations on the firing rate of a post-synaptic neuron. Neural Comput. 2000;12:153–179. doi: 10.1162/089976600300015934.
  • 86. Staude B, Grün S, Rotter S. Higher order correlations and cumulants. In: Grün S, Rotter S, editors. Analysis of parallel spike trains. New York: Springer; 2010. pp. 253–280.
  • 87. Jenison RL, Reale RA. The shape of neural dependence. Neural Comput. 2004;16:665–672. doi: 10.1162/089976604322860659.
  • 88. Berkes P, Wood F, Pillow J. Characterizing neural dependencies with copula models. In: Koller D, Schuurmans D, Bengio Y, Bottou L, editors. Advances in NIPS 21. 2009. pp. 129–136.
  • 89. Onken A, Grünewälder S, Munk MHJ, Obermayer K. Analyzing short-term noise dependencies of spike-counts in macaque prefrontal cortex using copulas and the flashlight transformation. PLoS Comput Biol. 2009;5:e1000577. doi: 10.1371/journal.pcbi.1000577.
  • 90. Paninski L. Log-concavity results on Gaussian process methods for supervised and unsupervised learning. In: Saul LK, Weiss Y, Bottou L, editors. Advances in NIPS 17. 2005. pp. 1025–1032.
  • 91. Amari S. Natural gradient works efficiently in learning. Neural Comput. 1998;10:251–276.
  • 92. Murray I, Ghahramani Z. Bayesian learning in undirected graphical models: approximate MCMC algorithms. In: Proceedings of the 20th Annual Conference on Uncertainty in Artificial Intelligence (UAI-2004). Arlington: AUAI Press; 2004. pp. 392–399.
  • 93. Schaub MT, Schultz SR. The Ising decoder: reading out the activity of large neural ensembles. J Comput Neurosci. 2011;32:101–118. doi: 10.1007/s10827-011-0342-z.
  • 94. Shinomoto S, Kim H, Shimokawa T, Matsuno N, Funahashi S, et al. Relating neuronal firing patterns to functional differentiation of cerebral cortex. PLoS Comput Biol. 2009;5:e1000433. doi: 10.1371/journal.pcbi.1000433.
  • 95. Daley D, Vere-Jones D. An Introduction to the Theory of Point Processes. New York: Springer-Verlag; 1988.
  • 96. Kass RE, Ventura V, Brown EN. Statistical issues in the analysis of neuronal data. J Neurophysiol. 2005;94:8–25. doi: 10.1152/jn.00648.2004.
  • 97. Brody CD. Correlations without synchrony. Neural Comput. 1999;11:1537–1551. doi: 10.1162/089976699300016133.
  • 98. Czanner G, Eden UT, Wirth S, Yanike M, Suzuki WA, et al. Analysis of between-trial and within-trial neural spiking dynamics. J Neurophysiol. 2008;99:2672–2693. doi: 10.1152/jn.00343.2007.
  • 99. Grün S, Diesmann M, Grammont F, Riehle A, Aertsen A. Detecting unitary events without discretization of time. J Neurosci Methods. 1999;94:67–79. doi: 10.1016/s0165-0270(99)00126-0.
  • 100. Marre O, Boustani SE, Frégnac Y, Destexhe A. Prediction of spatiotemporal patterns of neural activity from pairwise correlations. Phys Rev Lett. 2009;102:138101. doi: 10.1103/PhysRevLett.102.138101.
  • 101. Brown EN, Barbieri R, Ventura V, Kass RE, Frank LM. The time-rescaling theorem and its application to neural spike train data analysis. Neural Comput. 2002;14:325–346. doi: 10.1162/08997660252741149.
  • 102. Oizumi M, Ishii T, Ishibashi K, Hosoya T, Okada M. Mismatched decoding in the brain. J Neurosci. 2010;30:4815–4826. doi: 10.1523/JNEUROSCI.4360-09.2010.
  • 103. Shumway R, Stoffer D. An approach to time series smoothing and forecasting using the EM algorithm. J Time Ser Anal. 1982;3:253–264.
  • 104. Smith AC, Frank LM, Wirth S, Yanike M, Hu D, et al. Dynamic analysis of learning in behavioral experiments. J Neurosci. 2004;24:447–461. doi: 10.1523/JNEUROSCI.2908-03.2004.
  • 105. Kitagawa G. Non-Gaussian state-space modeling of nonstationary time series. J Am Stat Assoc. 1987;82:1032–1041.
  • 106. Fahrmeir L. Posterior mode estimation by extended Kalman filtering for multivariate dynamic generalized linear models. J Am Stat Assoc. 1992;87:501–509.
  • 107. Koyama S, Paninski L. Efficient computation of the maximum a posteriori path and parameter estimation in integrate-and-fire and more general state-space models. J Comput Neurosci. 2010;29:89–105. doi: 10.1007/s10827-009-0150-x.
  • 108. De Jong P, Mackinnon MJ. Covariances for smoothed estimates in state space models. Biometrika. 1988;75:601–602.
  • 109. Harvey AC. Forecasting Structural Time Series Models and the Kalman Filter. Cambridge; New York: Cambridge University Press; 1989.
  • 110. Durbin J, Koopman SJ. Time Series Analysis by State Space Methods. Oxford; New York: Oxford University Press; 2001.
  • 111. Meng X, Rubin D. Using EM to obtain asymptotic variance-covariance matrices: the SEM algorithm. J Am Stat Assoc. 1991;86:899–909.

