Author manuscript; available in PMC: 2009 Jun 8.
Published in final edited form as: J Neurosci Methods. 2006 Dec 22;162(1-2):320–332. doi: 10.1016/j.jneumeth.2006.12.008

Causal Entropies – a measure for determining changes in the temporal organization of neural systems

Jack Waddell a, Rhonda Dzakpasu a, Victoria Booth b,c, Brett Riley b, Jonathan Reasor d, Gina Poe b,e, Michal Zochowski a,f
PMCID: PMC2693078  NIHMSID: NIHMS22053  PMID: 17275095

Abstract

We propose a novel measure to detect temporal ordering in the activity of individual neurons in a local network, which is thought to be a hallmark of activity-dependent synaptic modifications during learning. The measure, called Causal Entropy, is based on the time-adaptive detection of asymmetries in the relative temporal patterning between neuronal pairs. We characterize the properties of the measure on both simulated data and experimental multiunit recordings of hippocampal neurons from the awake, behaving rat, and show that the metric detects such asymmetries more readily than standard cross correlation-based techniques, and can do so rapidly and dynamically owing to its temporal sensitivity.

Keywords: temporal pattern formation, multi-unit recording, hippocampal CA1, long term potentiation, asymmetric correlation

Introduction

The availability of multiple single unit recordings of brain activity allows for the investigation of the interaction among neurons and their changing relationships within neural systems. For example, methods using optical imaging (Wu et al, 1994; Zochowski et al, 2000) or multiunit recordings with multi-electrode arrays (Potter, 2001), tetrodes (Gothard et al., 1996; Poe et al., 2002) or multiple single electrodes (Fries et al., 2001) all provide a measurement of neuronal activity at different locations simultaneously, allowing for the evaluation of distributed network activity.

Spatio-temporal pattern formation in the brain has been studied extensively over the past few decades, providing information about neural dynamics during different cognitive tasks such as the formation of stimulus representation, learning and memory. The temporal interdependencies between the neurons can be grouped into two major categories: temporal coincidence (co-occurrence) and causal dependence.

The first category is limited to the case when neurons are likely to fire at about the same time, but lack causal ordering between their activities. Temporal coincidence could be linked to parallel processing within independent channels. Examples of such temporal coincidence interdependency include the oscillatory modulation of population activity observed during multiple oscillatory rhythms, such as the temporally patterned participation of individual neurons in the 20 Hz rhythm during olfactory stimulation or in the 5–10 Hz theta and 200 Hz ripples during place learning.

Causal (or directional) dependence refers to the case when, in addition to temporal co-occurrence, the ordering between the firing patterns of the cells or local networks can be intermittently ascertained. Such directional locking may provide insight about the direction of information flow between the individual cells or local networks. Also, causal interdependence of interacting neurons is thought to be the underlying mechanism of activity-dependent synaptic modifications, long term potentiation (LTP) and depotentiation (DP), where the direction of the synaptic modification (i.e. strengthening or weakening of the synapse) is directly related to the relative timing of the pre- and post-synaptic spikes (for example see Abbott & Nelson, 2000). In the hippocampus, cross correlations between cell pairs have been considered a signature of LTP (Wilson and McNaughton, 1994), which is itself thought to be a building block of learning. When hippocampal pyramidal cells fire coincidentally on a maze, there is an increased probability of their firing together during spontaneous discharge after the learning experience. However, to obtain statistically significant results, the temporal coincidence measure of cross correlation has to be taken over a large time window, and transient changes in the correlation as they occur in time may not be detected. Moreover, when only small temporal shifts occur between different firing patterns, it may be impossible to determine whether the co-occurrence of activity is symmetrical (no directionality present) or whether a causal firing pattern is established, with one neuron following the other.

Improved detection of local coincidence and causal ordering should provide additional information about the dynamics of information processing in a studied system, which themselves are considered to be reflections of modifications in the underlying functional connectivity of the network. It should be noted however that from temporal pattern interaction analyses it is not possible to assess direct structural links between individual neurons since different realizations of network structure may lead to the same functional dependencies of its units. This difficulty in distinguishing structural causality is ubiquitous in the measurement of temporal relationships, and is not limited to this measure.

Multiple techniques have been developed to capture temporal interdependencies within neuronal activity. Many of these operate in a pair-wise manner on the inner product of two signals (usually time series or frequency spectra); see Okatan et al. (2005) for a brief review. Such methods include cross-intensity functions (Cox & Lewis, 1972), product densities, cumulant densities, cumulant spectra (Bartlett, 1966), cross-correlation (Perkel et al., 1967), the joint peristimulus time histogram (Gerstein & Perkel, 1969), and coherence (Brillinger, 1975). Others operate on the entire space of interaction. These include dimensional reduction methods, such as principal component analysis and multiple discriminant analysis (Lin et al., 2005), optimal linear estimation (Salinas and Abbott, 1994), and Bayesian methods (Zhang et al., 1998). Another class of methods stems from bivariate autoregressive calculations. Such methods include Granger causality (Wiener, 1956; Granger, 1969; Chen, Bressler, & Ding, 2006) and directed coherence and partial directed coherence (Schelter et al., 2006). See Brea et al. (2006) for a comparison of several directed measures.

Okatan et al. (2005) have implemented a discrete computation method for the application of the maximum likelihood method proposed by Chornoboy et al. (1988). This method allows the simultaneous analysis of pair-wise interactions amongst all pairs through the likelihood optimization. The analysis can detect both excitatory and inhibitory interactions, but involves extensive computation.

Another measure, directed information (Massey, 1990), is calculated from two histories: the complete history of one channel, and the earlier history of another channel, upon which the probability of the current state of the second channel is conditioned. This introduces feedback from one channel into mutual information, making it an asymmetric measure.

We focus here on characterizing the properties of a recently developed metric of causal interdependence, Causal Entropies (Zochowski & Dzakpasu, 2004; Dzakpasu & Zochowski, 2005), and its application to the analysis of real and simulated neuronal data. The measure detects the amplitudes of fluctuations of relative inter-spike intervals (ISIs) between spikes recorded in a pair of neurons. The underlying idea is that if there is a unidirectional causal relationship between the two neurons, due either to the intrinsic properties of a synchronous state (Rosenblum et al., 1996; Pikovsky et al., 1997; Zhou et al., 2002) or to a direction of information transfer between the two cells, the fluctuations of relative ISIs will be smaller if measured between a spike of the driving neuron and the first subsequent spike of the following neuron than if measured in the reverse direction.

First, we show on simulated data that the metric provides a robust, fast and time-adaptive way of measuring asymmetrical pair-wise temporal interdependences in the system. Then, we compare Causal Entropies with cross-correlations, both applied to single-unit activity recorded from hippocampal CA1 pyramidal cells of freely behaving rats as they explored different spatial environments. During the recordings, the rats performed a spatial memory task (Poe et al., 2002), which is known to activate hippocampal populations and to require the hippocampus to dynamically change functional interconnected circuitries as the animals learn (see Shapiro, 2001 for review). We show a pair-wise increase in the asymmetric temporal interdependence between neurons when a novel environment is presented to the animal, which could be directly indicative of increased activity-dependent plasticity between the neurons. We also investigate temporal interdependencies in the whole network and show that there are ongoing changes in the network dynamics during exploration of the novel environment, and that the final temporal structure of the network is modified; that is, the changes in interdependencies incorporated during presentation of the novel stimulus remain in the network when the animal is returned to the familiar environment.

Methods

1. Calculation of Causal Entropies

Causal Entropy (CEij) is an asymmetric, time-adaptive, event-based measure of the regularity of the phase- or time-lag (Rosenblum et al, 1997) with which neuron i fires after neuron j. It is calculated from two components: a non-parametric, time-adaptive estimate of the probability density of the spike time lag between two neurons i and j such that i follows j (and, independently, of the distribution of j following i), and a cost function that estimates the spread and stability of the distribution.

The density estimator adapts in such a way that time lags distant in history contribute less weight than recently observed time lags. In principle, the density estimator and cost function may be any such devices that preserve the characteristics described above. Here we choose an event-normalized histogram as the time-adaptive density estimator (described in detail below) and the Shannon entropy as the cost function, i.e. $CE_{ij}(t) = -\sum_k P_{ij}(k)\cdot\log(P_{ij}(k))$. This choice and its consequences are detailed in the first sub-section of the Results section.

The measure monitors temporal variations (i.e. widths of the distribution) of interspike intervals between an event of one neuron (i.e. j-th neuron) and the first subsequent event of the other neuron (i.e. i-th neuron). In this way, the set of intervals taken to measure the complementary distributions CEij and CEji are exclusive. Since the firing pattern of neurons is usually aperiodic (even though it can be modulated by oscillatory patterning), the measure may provide insight into asymmetric locking between pairs of recorded neurons. We refer to the measure as Causal Entropies since we use an entropy function to monitor changes in the functional, temporal interdependence between the two neurons (Zochowski & Dzakpasu, 2004; Dzakpasu & Zochowski, 2005).

The density estimator is a continually updated and normalized histogram of the time lags between a spike-time of neuron i and the most recent spike-time of neuron j (see Figure 1). That is, as neuron i fires, the time difference from the most recent, previous spike of neuron j is computed, $\Delta^a_{ij}$, where a indexes the spike number. The bin corresponding to the value of $\Delta^a_{ij}$ in the histogram $P_{ij}$ is updated by adding a value ΔP to that bin, where ΔP is a constant, free parameter that sets the rate of attenuation of older events in the history.

Figure 1.

Interspike intervals and the time-adaptive density estimator. A) Schematic of spike train over time and the inter-spike intervals used in the CE calculation. After neuron j fires, for example, only the interval from the most recent, previous event of i is considered. B,C,D) Examples of the behavior of the metric for different temporal interdependencies of spiking activity: locked with probability 1; unlocked (random) with equal average frequencies; locked with probability 1 and no lag; periodic with equal periods; unlocked (random), but with one frequency ten times greater than the other. For each, we show a sample raster plot along with the calculated causal entropy difference and sum between the pairs (B), the inter-spike interval histogram from the entire duration (C), and the time-adaptive ISI estimated density at the end of the trains (D). The lag locked case is easily distinguishable from the other cases by the CED.

The histogram Pij is then normalized by dividing each bin by (1+ΔP), so that:

$$P_{ij}(t+1)=\begin{cases}\dfrac{P_{ij}(t)}{1+\Delta P} & \text{if the bin was not updated}\\[1.5ex]\dfrac{P_{ij}(t)+\Delta P}{1+\Delta P} & \text{if the bin was updated}\end{cases} \qquad (1)$$

where (t + 1) denotes the time of the next update of the distribution, which takes place when the next spike is generated by either of the neurons.

Thus a bin that has not been updated over the course of n events is attenuated by a factor $(1+\Delta P)^{-n}$. The time resolution of the CE is therefore determined by the rate of attenuation ΔP, which allows for a time-adaptive measure of changes in temporal interdependencies between the neurons. Similar results can be obtained using a sliding window of appropriate width if the frequency of the firing units (and the speed of the process underlying the temporal interdependencies) is stable over the recorded interval. However, if the spatio-temporal pattern formation happens over varying timescales, one must choose a window size that trades off statistically significant event counts per window against temporal resolution. Since CEs are event based rather than time based (or, more accurately, their rate sensitivity is implicitly coupled to the hidden rate function of the point processes), they can readily detect processes that are embedded in the data and happen on different timescales.

In our measurements, we set ΔP = 0.2, which sets the history of $P_{ij}$ at roughly 20 events in which i fires after j. We use 10 bins of size 10 ms. Thus, if $\Delta^a_{ij}$ is greater than 100 ms, the time lag between spikes of neurons i and j is considered too long for a causal relationship that would affect synaptic plasticity, and the histogram is not updated.

After neuron i has fired at time t and the histogram of time lags has been updated and normalized, the Shannon entropy CEij(t) is calculated according to the standard equation:

$$CE_{ij}(t)=-\sum_{k}P_{ij}(k)\cdot\log\big(P_{ij}(k)\big), \qquad (2)$$

where k is the index across the bins of the histogram Pij. In this way, the causal entropy is a time-varying measure of the regularity of the firing relationship of neuron i relative to neuron j within a time period of, in this example, 100 ms. The causal entropies CEij and CEji are computed for all pairs of neurons over the entire recording session, in the case of experimental data, or over the entire network simulation, for model networks.
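
To make the procedure concrete, the following is a minimal Python/NumPy sketch of the CE calculation under the stated parameters (ΔP = 0.2, ten 10-ms bins); the function name, the spike-time format (milliseconds), and the explicit renormalization before the entropy step are our own choices, not prescribed by the text.

```python
import numpy as np

def causal_entropy_series(spikes_i, spikes_j, dP=0.2, bin_ms=10.0, n_bins=10):
    """CE_ij(t), evaluated each time neuron i fires after a spike of neuron j."""
    spikes_j = np.asarray(spikes_j)
    P = np.zeros(n_bins)                  # time-adaptive lag histogram P_ij
    times, values = [], []
    last_j, j_idx = -np.inf, 0
    for t_i in spikes_i:
        # most recent spike of j preceding (or coincident with) this spike of i
        while j_idx < len(spikes_j) and spikes_j[j_idx] <= t_i:
            last_j = spikes_j[j_idx]
            j_idx += 1
        if not np.isfinite(last_j):
            continue                      # no spike of j seen yet
        k = int((t_i - last_j) // bin_ms)
        if k >= n_bins:
            continue                      # lag > 100 ms: too long to be causal
        P[k] += dP                        # update the hit bin ...
        P /= 1.0 + dP                     # ... and attenuate all bins (eq. 1)
        p = P[P > 0] / P.sum()            # explicit normalization (histogram starts empty)
        times.append(t_i)
        values.append(-np.sum(p * np.log(p)))   # Shannon entropy (eq. 2)
    return np.array(times), np.array(values)
```

The CED is then obtained by running the function twice with the arguments swapped and subtracting, CEDij = CEij − CEji.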

The significance levels of CEDs were determined by measuring the variation of an ensemble of ten surrogate data sets (though other options are available; see subsection 6 of Methods). These surrogate data sets were created by uniformly randomly shuffling the neuron labels associated with each spike time stamp. That is, the time sequence of action potentials remained unchanged, but the identity of the neuron that fired at each time was randomized with a random permutation of all neuron labels in the data set. In this way, the randomization maintained the original distribution of the number of action potentials so that any global patterns remained unchanged, as did the overall average firing frequency of each neuron.

CEs were calculated for each surrogate data set, s, and for each pair of neurons ij. The maximum absolute difference between $CE^s_{ij}$ and $CE^s_{ji}$ was recorded ($CED^s_{ij}$). The significance region for the CE difference (CEij − CEji) was determined to be outside of $\pm\big(\langle CED^s_{ij}\rangle_s + 2\,SE(CED^s_{ij})_s\big)$, where the subscript s indicates that the average and standard error (SE) were taken over all the surrogate data sets.
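
A sketch of this shuffling procedure, assuming the data arrive as parallel arrays of spike times and neuron labels and reusing the `causal_entropy_series` sketch above; pairing the two CE series by event index is a simplification (in practice one would evaluate both on a common time grid).

```python
import numpy as np

def ced_threshold(times, labels, i, j, n_surrogates=10, seed=None, **ce_kwargs):
    """Significance threshold <CED_s> + 2*SE(CED_s) from label-shuffled surrogates."""
    rng = np.random.default_rng(seed)
    times, labels = np.asarray(times), np.asarray(labels)
    max_ced = []
    for _ in range(n_surrogates):
        shuffled = rng.permutation(labels)     # spike times fixed, identities permuted
        _, ce_ij = causal_entropy_series(times[shuffled == i], times[shuffled == j], **ce_kwargs)
        _, ce_ji = causal_entropy_series(times[shuffled == j], times[shuffled == i], **ce_kwargs)
        n = min(len(ce_ij), len(ce_ji))
        if n:                                  # max |CED| for this surrogate
            max_ced.append(np.max(np.abs(ce_ij[:n] - ce_ji[:n])))
    max_ced = np.asarray(max_ced)
    se = max_ced.std(ddof=1) / np.sqrt(len(max_ced))
    return max_ced.mean() + 2.0 * se
```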

2. Calculation of Cross Correlations

Cross correlations (XC) were calculated for each pair of neurons using the definition $XC_{i,j}(\tau)=\frac{\sum_t\big(A_i(t)-\mu_i\big)\cdot\big(A_j(t+\tau)-\mu_j\big)}{\sigma_i\cdot\sigma_j\cdot(T-\tau)}$, where $A_i(t)$ is the discrete function representing the time series of the action potentials of neuron i, with $A_i(t_0) = 1$ if neuron i fired between times $t = t_0$ and $t = t_0 + 10\,\mathrm{ms}$, and $A_i(t_0) = 0$ otherwise. The total duration of the signal is T, $\mu_i$ is the mean of $A_i(t)$, and $\sigma_i$ is the standard deviation of $A_i(t)$. The time lag τ measures the shift in time for the calculation, and is taken to be between [−100 ms, 100 ms] in steps of 10 ms.

The standard deviations of the cross correlations (XCs) were calculated from the Bartlett estimator (Netoff & Schiff, 2002; Bartlett, 1946), which controls for spurious XCs due to finite autocorrelations. It is calculated as $sd(\tau)=\sqrt{\frac{1}{n-\tau}\sum_{T=-n}^{n}XC_{i,i}(T)\cdot XC_{j,j}(T)}$, where n is on the order of the length of the time trace in the simulation and of a single lap in the experimental data.
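
A sketch of the binned cross correlation and Bartlett estimate as defined above; binarization at 10 ms and the ±100 ms lag range follow the text, while the helper names and the evaluation of the Bartlett sum at a single lag are our simplifications.

```python
import numpy as np

def binarize(spike_times, T_ms, bin_ms=10.0):
    """A(t0) = 1 if the neuron fired in [t0, t0 + bin_ms), else 0."""
    n = int(np.ceil(T_ms / bin_ms))
    A = np.zeros(n)
    idx = (np.asarray(spike_times) / bin_ms).astype(int)
    A[idx[idx < n]] = 1.0
    return A

def cross_correlation(Ai, Aj, max_lag_bins=10):
    """Normalized XC_ij(tau) for tau in [-max_lag, +max_lag] bins."""
    T = len(Ai)
    mi, mj, si, sj = Ai.mean(), Aj.mean(), Ai.std(), Aj.std()
    lags = np.arange(-max_lag_bins, max_lag_bins + 1)
    xc = np.empty(len(lags))
    for n, tau in enumerate(lags):
        if tau >= 0:
            prod = (Ai[:T - tau] - mi) * (Aj[tau:] - mj)
        else:
            prod = (Ai[-tau:] - mi) * (Aj[:T + tau] - mj)
        xc[n] = prod.sum() / (si * sj * (T - abs(tau)))
    return lags, xc

def bartlett_sd(Ai, Aj, n_lags=10):
    """Bartlett estimate of sd(XC) from the two autocorrelations (at tau = 0)."""
    _, aci = cross_correlation(Ai, Ai, n_lags)
    _, acj = cross_correlation(Aj, Aj, n_lags)
    return np.sqrt(np.sum(aci * acj) / n_lags)
```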

3. Experimental design

Single-unit recordings of CA1 hippocampal pyramidal cells were accomplished as in Gothard et al (1996) while the rat completed a spatial maze task (Poe et al., 2002) for food reward. Briefly, two adult male Fischer 344 rats were pre-trained to search for food on an 8-box rectangular maze with 3 of the 8 boxes baited with food. In order to encourage spatial learning and the use of the hippocampus, after 10 laps the relationship between the food positions and local (i.e. intramaze) cues was disrupted by rotating the track 180 degrees and switching baited boxes such that the same room locations held baited boxes. Every 5 laps the rat was removed from the track for 2 minutes and replaced in a semi-random track position. After pretraining for at least 1 week, the rats were anesthetized with sodium pentobarbital (55 mg/kg), placed in a stereotaxic frame and implanted with a multi-electrode drive assembly (Venkatachalam et al, 1999) containing 12 recording tetrodes and 2 reference tetrodes. After recovery from surgery, rats were retrained on the familiar maze until they could run at least 45 laps in an hour. Over the course of a week, tetrodes were gradually lowered into the CA1 pyramidal layer. Two 27-channel op-amp preamplifier circuit boards connected to the drive assembly amplified the tetrode electrical signals, which were then directed through a commutator to filtering gain amplifiers and data acquisition boards (Neuralynx, Tucson, AZ). Cells were discriminated using cluster analysis software (Plexon, Inc.).

For the experiment, rats were allowed to run the familiar track as usual for 15 laps with a 2-minute break every 5 laps and a maze rotation after the 10th lap. The rat was then introduced to what was, on the first day, an unfamiliar (novel) environment, where it ran 15 laps with the same 2-minute breaks and maze rotation. This novel 8-box maze was configured with a different set of baited boxes than the set used on the familiar maze. The rat was then placed again on the familiar maze to run a final 15 laps. Rats thus ran 45 laps in this familiar-novel-familiar maze sequence over the course of about 1 hour a day, and did so for 3 days. The behavior of recorded cell pairs over all days was considered together for the present analysis.

4. Generation of interlocked spike train sequences

The spike trains are generated such that one train is completely random, with a fixed probability (0.01) of a spike occurring at each time step. The second train is generated by adding a spike with a given probability (the locking probability) for each spike in the first train. If a spike is added, it is placed after the corresponding spike from the first train, with a latency given by a half-Gaussian with a width of 10 ms. Except where stated, the frequency of the second train is matched to that of the first by randomly adding spikes from a uniform distribution.
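
A sketch of this generator under the stated parameters (per-step spike probability 0.01, half-Gaussian latency of width 10 ms); the function signature and the 1-ms time step are our assumptions.

```python
import numpy as np

def interlocked_trains(n_steps=100_000, p_spike=0.01, p_lock=1.0,
                       lag_width_ms=10.0, dt_ms=1.0, seed=None):
    """Generate a random master train and a second train locked to it."""
    rng = np.random.default_rng(seed)
    t1 = np.flatnonzero(rng.random(n_steps) < p_spike) * dt_ms   # fully random train
    locked = t1[rng.random(len(t1)) < p_lock]                    # spikes that get a partner
    t2 = locked + np.abs(rng.normal(0.0, lag_width_ms, len(locked)))  # half-Gaussian latency
    extra = len(t1) - len(t2)                                    # rate-match train 2 ...
    if extra > 0:                                                # ... with uniform extra spikes
        t2 = np.concatenate([t2, rng.uniform(0, n_steps * dt_ms, extra)])
    return np.sort(t1), np.sort(t2)
```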

5. Network Simulation

To demonstrate how CEs may be used to investigate the formation of temporal interdependencies between neurons within a network (N = 10), we consider a system of coupled Hindmarsh-Rose neurons. The network has all-to-all connectivity and diffusive coupling:

$$\begin{aligned}\dot{x}_i &= y_i - a\,x_i^3 + b\,x_i^2 - z_i + I_{0i} + \sum_{j=1}^{N}\alpha\,(x_j - x_i)\\ \dot{y}_i &= c - d\,x_i^2 - y_i\\ \dot{z}_i &= r\,\big[s\,(x_i - x_0) - z_i\big]\end{aligned} \qquad (3)$$

where x represents the membrane potential and y and z represent two more slowly changing currents. The following parameters were used: a = 1.0, b = 3.0, c = 1.0, d = 5.0, r = 0.006, s = 4.0, x0 = −1.6, α = 1.1. The subscript i indexes the neuron number. The control parameter $I_{0i} \in (1.5, 3.35)$ denotes the amplitude of the external current acting on the neuron. With increasing $I_{0i}$, the behavior of the neuron changes from periodic firing to bursting, and its mean firing frequency increases.
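
The following is a minimal sketch of this network using plain Euler integration with the parameter values above; the integrator, time step, and initial conditions are our assumptions (the text does not specify them).

```python
import numpy as np

def simulate_hr_network(I0, t_end=1000.0, dt=0.01, a=1.0, b=3.0, c=1.0,
                        d=5.0, r=0.006, s=4.0, x0=-1.6, alpha=1.1, seed=None):
    """Euler integration of eq. (3); returns the membrane traces x_i over time."""
    rng = np.random.default_rng(seed)
    I0 = np.asarray(I0, dtype=float)
    N = len(I0)
    x = rng.uniform(-1.5, -0.5, N)                 # arbitrary initial conditions
    y, z = np.zeros(N), np.zeros(N)
    steps = int(t_end / dt)
    xs = np.empty((steps, N))
    for k in range(steps):
        coupling = alpha * (x.sum() - N * x)       # sum_j alpha * (x_j - x_i), all-to-all
        dx = y - a * x**3 + b * x**2 - z + I0 + coupling
        dy = c - d * x**2 - y
        dz = r * (s * (x - x0) - z)
        x, y, z = x + dt * dx, y + dt * dy, z + dt * dz
        xs[k] = x
    return xs

# example: N = 10 neurons with drives drawn from the stated range (1.5, 3.35)
# traces = simulate_hr_network(np.sort(np.random.uniform(1.5, 3.35, 10)))
```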

6. Statistical estimates of significance levels

6.1 Surrogate data sets

As explained above, the surrogate data sets from which the statistics are derived are generated by the random permutation of the neuron identification number while maintaining the time stamps of action potentials. This allows the retention of both the average frequency of each neuron and the global firing patterns in the system.

6.2. Analytical estimate of significance levels for two Poisson processes with dissimilar rates

This estimate uses a statistical model of uncorrelated neural activity to quickly generate large samples of short sequences of interspike intervals, from which the causal entropy difference can be efficiently calculated. Here we use a Poisson process to model the neural spiking activity, which is characterized only by an average firing rate for each neuron as observed in the experimental data (other statistical models may have other parameters which must also be estimated from the data). It should be noted that these firing rates may be varying with time, so that the control value may change with the changes in neural activity patterns.

Given a pair of rates (one rate for each neuron being considered), a sequence of inter-spike intervals is generated. The sequence need only be as long as the history length used by the time-adaptive density estimator (characterized in Fig 4). From this sequence, a single CE value is calculated. The difference (and/or sum) of these CE values for different sequences is stored, and the process is repeated until a statistical population of those values is collected, at which point the mean and standard deviation may be calculated. This provides a robust estimate of the interval of CED values that would be expected by chance from uncorrelated neurons with the defined rates. CED values observed in the experimental data that fall outside of this range are significant.

Figure 4.

A) Significance levels of locking estimated for the sliding window estimator and the time-adaptive estimator as a function of window size (spike number) and ΔP, respectively. The gray region at the bottom of the graph denotes one significance interval (two standard deviations from the mean); values that fall within this region (<5 spikes/window) do not reach statistical significance. B) The effective history length as a function of ΔP for the time-adaptive estimator. As ΔP increases, the history grows shorter. When a low number of spike counts is included in the history (for ΔP values of 0.6 and above), the time-adaptive estimator yields significantly higher confidence levels than the window estimator.

In the Poisson-distributed model, the probability of an event occurring in a duration δt is a constant ρ. One can show that, in the case of discrete bins, the probability density of the first action potential of neuron i following an action potential of neuron j (with no intervening action potentials of the two) is written as:

$$P(\tau)=(\rho_i+\rho_j-\rho_i\cdot\rho_j)\cdot\big((1-\rho_i)\cdot(1-\rho_j)\big)^{\tau},$$

where ρi and ρj are the probabilities of neurons i and j firing in a particular bin, and τ ≥ 0 is the number of bins separating the events. The ISI-probability distribution is symmetric in the firing rates of i and j.

The inter-spike interval sequences may be quickly computed by transforming a uniform probability density on [0, 1). This is done by selecting a random value r from the uniform density and transforming it to τ via:

$$\tau=\frac{\log(1-r)}{\log\big[(1-\rho_1)\cdot(1-\rho_2)\big]}-1$$

The advantages of this method of estimating the significance levels are the following: it is independent of the quality or quantity of the data (per channel); it is easy to have the control adapt over time to the changing firing rates of the neurons; and it is adaptable to other statistical neuron models, which may be used to maintain other statistical properties of the observed units while preserving the lack of correlation between units.
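
A sketch of this analytic control: lags are drawn by inverse-transform sampling of the geometric ISI density above, short sequences are histogrammed, and a null population of CED values is accumulated. The plain (non-adaptive) histogram over a fixed history of ~20 events approximates the ΔP = 0.2 estimator, and all names are ours. Note that `floor(log(1-r)/log(q))` is almost surely equal to the integer-rounded form of the transform above, so it realizes the same discrete distribution.

```python
import numpy as np

def sample_lags(rho_i, rho_j, n, rng):
    """Binned lags (tau >= 0) to the first spike of one unit after the other."""
    q = (1.0 - rho_i) * (1.0 - rho_j)
    return np.floor(np.log(1.0 - rng.random(n)) / np.log(q)).astype(int)

def ced_null(rho_i, rho_j, history=20, n_samples=10_000, n_bins=10, seed=None):
    """Mean and sd of the chance CED for two uncorrelated Poisson units."""
    rng = np.random.default_rng(seed)
    ceds = np.empty(n_samples)
    for m in range(n_samples):
        ce = []
        for rho_a, rho_b in ((rho_i, rho_j), (rho_j, rho_i)):
            taus = sample_lags(rho_a, rho_b, history, rng)
            counts = np.bincount(taus[taus < n_bins], minlength=n_bins)
            if counts.sum() == 0:            # all lags beyond the 100 ms window
                ce.append(0.0)
                continue
            p = counts[counts > 0] / counts.sum()
            ce.append(-np.sum(p * np.log(p)))
        ceds[m] = ce[0] - ce[1]
    return ceds.mean(), ceds.std()           # significance region: mean +/- 2 sd
```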

Results

I. Investigation of properties of Causal Entropy metric

Causal entropy is calculated from two components: the cost function, in the form of the Shannon entropy, and a time-adaptive density estimator for the inter-spike intervals, in the form of the update-normalized histogram. In this section, we illustrate properties of both of these components and compare them with other temporal coincidence measures in several different cases. In short, we show below that the cost function performs comparably to reduced-bias entropy estimators, and that the time-adaptive density estimator performs better than a sliding window estimator for low spike counts. Further, we find that CEs robustly detect temporal locking (or the lack thereof) over large ranges of spike train frequency mismatch. Also, we provide examples where CEs can distinguish between persistent locking and temporal coincidence better than cross correlations, and show that, under some circumstances, CEs may be used to infer simple structural interdependencies between the neurons in the network.

1. Causal entropy difference as a determinant of temporal locking

We will first characterize the properties of the cost function. A low CEij indicates that neuron i fires after neuron j with a highly regular time lag, whereas high values of CEij indicate large variability in the firing pattern of neuron i with respect to neuron j. Thus, four possible situations can be detected by the CE measurement:

CEij ≅ CEji ≫ 0. In this case, the inter-spike interval distributions of both neurons are fairly wide, and thus the activity of the neurons does not reveal any temporal interdependencies, implying a lack of direct functional coupling.

CEij ≅ 0 and CEji ≫ 0. Here, the ij-th distribution is narrow whereas the ji-th distribution is wide. This indicates a persistent temporal lag of the i-th neuron relative to the j-th neuron.

CEij ≫ 0 and CEji ≅ 0. This implies the reverse situation: a persistent temporal lag of the j-th neuron relative to the i-th neuron.

CEij ≅ CEji ≅ 0. Both distributions are peaked, which usually implies complete synchronization of both neurons. The last part of this condition, both entropies equal to zero, is relaxed in the presence of noise (see Zochowski & Dzakpasu (2004)). Another possibility is that both processes are periodic, but not causal, in which case the temporal interdependence cannot be determined solely on the basis of lag variability.

To detect asymmetry in the temporal interdependencies between neural activities, it is useful to monitor the Causal Entropy difference (CED), CEDij = CEij − CEji.

2. Role of entropy bias estimators in the calculation of causal entropy difference (CED)

In practice, such a direct calculation of entropy (eq 2) may lead to a significant bias. However, since we are only interested in the significance of the CED between the two distributions, the bias is effectively eliminated and there is no need to use entropy bias estimators (see Grassberger, 1988; Roulston, 1999; Schürmann, 2004; Nemenman et al, 2004). To illustrate this point we calculate the cost function on simulated spike train sequences with a locked phase lag (see Methods) using several entropy estimators: the Shannon entropy; the Grassberger entropy estimator, which reduces the bias and allows it to be calculated, and which is formulated for Poisson processes (Grassberger, 1988; Schürmann, 2004); and the NSB entropy estimator, which has zero bias under proper conditions (Nemenman et al, 2004). All estimators are calculated on the distributions obtained from the complete set of interspike intervals generated over the entire time series (using the same ISIs as in the CE calculation – see Fig 1; however, no adaptive updating is used). The entropies are estimated for each direction (spike of train 1 following the last spike of train 2, and vice versa) and the difference (CED) is calculated. The CED value is then compared to that obtained from the surrogate datasets. The results (Fig 2) indicate that the Causal Entropy performs very similarly to the zero-bias estimator, and justify the direct use of the Shannon entropy estimator as the cost function.

Figure 2.

Comparison of the significance of entropy differences using several entropy estimators for a simulated pair of locked spike trains. CE denotes the direct Shannon entropy calculation without adaptive updating. Grassberger (Grassberger, 1988) is a reduced-bias entropy estimator for which the bias can be estimated. NSB (Nemenman et al, 2004) is a zero-bias entropy estimator. The gray region indicates non-significance. Causal entropy performs comparably to the NSB and Grassberger methods.

3. Effects of spike train frequency mismatch on the CED calculation

Next we investigate the effects of frequency mismatch between the compared spike trains on the causal entropy metric (Figure 3A). We show that the CED measurement detects locking patterns over widely varying frequencies of the spike trains (note the log scale of the frequency ratio). The significance of the measurement is lost only when the lag latency becomes of the order of the spiking period of the faster spike train, in which case the trains can no longer be considered locked. On the other hand, for uncorrelated spike trains no locking artifact is created at any frequency mismatch (Fig 3A: control).

Figure 3.

A) Average causal entropy difference for a pair of spike trains with frequency mismatch. The second train is locked to the first one with a lag latency of 1, 10, or 50 ms. For the control, the two spike trains are independent. B) Average causal entropy difference for locked trains, and significance thresholds (2 standard deviations from the mean) of the control values for a pair of spike trains with frequency mismatch. For the calculation of the CED, a lag latency of 50 ms was used. For the shuffle significance estimates, eight additional surrogate trains were generated as described in the methods. For the probabilistic significance estimates, the significance levels were estimated as described in section 6. As expected, the shuffle estimate is more conservative than the one derived for spike trains having a Poisson distribution.

The probability that spikes from two units randomly occur close together clearly depends on the unit frequencies. Thus, the control values of the CED will depend on the relative frequencies. In figure 3B, we illustrate the effect of frequency on the calculation of the two controls discussed in the methods. The conservative shuffle control value rises rapidly, while the statistical control rises only moderately over this range.

4. Properties of the time-adaptive density estimator

We next characterize the time-adaptive density estimator used in the CEs - the event-normalized histogram. This density estimator is compared to another commonly used density estimator - the histogram of ISIs contained within a sliding window. The advantage of our estimator is that it automatically adapts the measurement to the frequency of the monitored events and is not constrained to any time window. This in principle makes it easier to follow processes that happen within the observed spike trains but on different time scales. The principal parameter in the sliding window estimator is the window size, whereas the CE estimator uses ΔP, which determines the history dependence of the measure. We investigated the significance of the measurements of the two estimators for stationary locked signals for varying window size (sliding window estimator) and ΔP (time-adaptive estimator). The results (Fig 4A) are presented in terms of significance levels (two standard deviations from the surrogate data mean). As expected, the significance of the locking increases monotonically as a function of window size (number of spikes per window) for the sliding window estimator. Conversely, significance diminishes with increasing ΔP as the history contribution in the distribution estimate becomes limited. To estimate how many spikes effectively contribute to the CE measurement, we calculated the time-adaptive distribution for various numbers of independent events as a function of ΔP. We then calculated the entropy of the distribution. The distribution saturates with varying speed (dependent upon ΔP). We then calculated how many of the most recent events cumulatively contribute 95% of the saturated entropy value. This number is plotted as a function of ΔP in Fig 4B. For large spike numbers the sliding window estimator obtains better significance levels. However, for relatively low spike counts (e.g. for 5 spikes and below in Fig 4), the time-adaptive estimator performs significantly better; it remains significant where the windowed entropy becomes non-significant.
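
One way to approximate the effective-history estimate behind Fig. 4B is sketched below: feed the time-adaptive estimator a stream of independent random lags, track how its entropy saturates, and count the events needed to reach 95% of the plateau. The uniform lag distribution and the plateau estimate over the last 100 events are our choices.

```python
import numpy as np

def effective_history(dP, n_bins=10, n_events=500, seed=None):
    """Events needed for the time-adaptive entropy to reach 95% of saturation."""
    rng = np.random.default_rng(seed)
    P = np.zeros(n_bins)
    H = np.empty(n_events)
    for m, k in enumerate(rng.integers(0, n_bins, n_events)):  # independent random lags
        P[k] += dP
        P /= 1.0 + dP
        p = P[P > 0] / P.sum()
        H[m] = -np.sum(p * np.log(p))
    H_sat = H[-100:].mean()                                    # saturated entropy level
    return int(np.argmax(H >= 0.95 * H_sat)) + 1
```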

5. Comparison of CE metric and cross correlation

Causal entropy is principally suited to tracking how stable temporal ordering changes between pairs of spike trains. Here we show an example where the CE calculation yields significantly different results while the cross correlation remains virtually unchanged. Two pairs of locked spike trains were generated. In the first pair, the lead-lag relationship alternated on a fast time scale, at half the average frequency of the spike trains. In the second pair, the locking alternated on a slow time scale, roughly a fiftieth of the spike frequency.

Figure 5 shows the results of the calculation of the CE metric and cross correlations for both cases. The cross correlations of the two cases are very similar and symmetrical. The causal entropy, in contrast, distinguishes between the slow and fast switching cases. Thus, even if there is no causal locking on any physiological scale but the two spike trains are coincident, the cross correlation will show highly significant locking. However, this locking is unlikely to contribute to structural changes in the network driven by activity-dependent synaptic plasticity.

Figure 5.

Comparison of causal entropy and cross correlation measures for two pairs of simulated spike trains with alternating lead-lag relationships. The darkened regions on panels A-D indicate the non-significant regions. In the first case (A & C), the lead-lag relationship changed quickly, at a frequency half that of the trains. Note that in this case, coincidence is established, but there is no persistent ordering relationship. This is reflected in the low CED but high cross correlation values. In the second case (B & D), the lead-lag relationship switched slowly, with a frequency 50 times slower than that of the trains. In this latter case, the temporally ordered relationship is observed as a significant CED. The cross correlation is significant and similar to the previous case; the directionality of the relationship is not seen in the cross correlation. Also note that both cross correlations are highly symmetrical; this is expected, given that several switching events occur, and the first train leads the second roughly as often as the second leads the first.

6. Detection of causal relationships among more than two cells

In networks of complex structure, the lead-lag relationship may indicate the general direction of information flow in the network, but often it will not provide detailed information about the structural dependence between the neurons. We study an example of two simple network motifs: in the first motif, two spike trains follow a third one, but with different latencies; in the second motif, the first train is locked to a second train which, in turn, is locked to a third one. Each motif is considered under two conditions: either perfect locking between the spike trains, or locking with a probability of 0.6. Figure 6 demonstrates that for perfect locking the two motifs cannot be distinguished. However, if the locking probability is not unity, the motifs can be distinguished by the CE measurement: in the first motif, it is the CE between trains 2 and 3 that is high, due to the reduced joint probability of 3 following 2 when locking is imperfect; in the second motif, it is the probability of 3 following 1 that is reduced.

Figure 6.

Causal entropies for motifs composed of sets of three spike trains with one of the following two relationships: Either 3 follows 2, which in turn follows 1 (C and D), or 2 and 3 both follow 1, but 3 has a longer latency (A and B). With perfect locking (A and B), the two motifs cannot be distinguished by CEs. With imperfect locking (C and D), differences in the temporal relationships can be detected.

II. Measurement of network properties on simulated and biological data

1. Cell Pair Synchrony

We measured the temporal interdependencies between neural activities in cell pairs recorded from the freely behaving rat using both the causal entropy measure and cross correlations. Both measures were applied to the entire duration of the recording. Two examples of such recordings are shown in Figure 7. The raster plots of the analyzed neurons are shown at the top, with the specific pattern of maze presentations and lap-running periods shown below. The CED calculation is shown in the bottom part of the figure. The gray area corresponds to the non-significant regions of the measure response, as defined in the methods section, i.e. regions where the measured CEDs did not differ significantly from those of the randomly shuffled data sets.

Figure 7.

The raster plots, behavioral profiles and causal entropy differences (CED) for two pairs of recorded neurons. Shaded regions in the bottom CED plots denote lack of significance in the measured temporal interdependencies. The behavioral time stamps (center) and raster plots (top) of the neuronal spike times are displayed above the CED traces. The dark gray outlined rectangles of the behavioral profiles (center) denote exploration of the novel track, whereas the light gray bordered boxes denote times exploring the familiar track. CEs were calculated using ΔP = 0.2, with 10 ms bins over a window of 100 ms.

One can clearly observe a significant jump in the asymmetry in the temporal interdependencies when the animal moved to the novel environment. The change in CED in these examples lasted for most of the duration of the foraging in the novel environment. The change in asymmetry then diminished in the second familiar maze run portion of the experiment.

We examined the effects of random perturbations of relative spike timings on the cross correlation and CED measures for the experimental as well as simulated spike trains. We added a value drawn from a Gaussian distribution with varying standard deviation to every spike time. The standard deviation of the distribution was varied from 0 to 20 ms for the experimental data and from 0 to 2 ms for the simulated spike trains. The results for the experimental data were averaged over all pairwise comparisons for which the non-perturbed values of the CEDs were significant. The results are displayed as a multiple of the SD obtained from the respective calculations. As illustrated in figure 8a, the significance of the causal entropy difference falls quickly compared to that of the cross correlations, since the noise disrupts the time-ordered relationship between the neurons. This occurs when the noise variability is on the order of the typical lag time between the neurons. Thus, causal entropies provide a more sensitive observation of the persistence of inter-event intervals between pairs of neurons.

Figure 8.

Significance of causal entropy differences and cross correlations for noisy data. For cross correlations, the peak value is divided by the Bartlett estimate of the standard deviation. For causal entropies, the expectation of the surrogate maximum CED value is subtracted from the mean causal entropy difference, and then divided by the surrogate standard deviation. Thus, the CE significance values can be negative. In both A & B, a normally distributed value was added to each timestamp, with amplitude plotted on the x axis. A) Results for experimental hippocampal data. The causal entropy falls off quickly as a function of noise, indicating a disruption of the temporal order relationship. The cross correlation standard deviations are reduced to a much smaller extent. B) Results for simulated Hindmarsh-Rose data. These data support the results of the experimental data, indicating a sharp decrease in the significance of the causal entropy difference as the temporal order relationship is disrupted, but only a mild decrease in the significance of the cross correlations.

This sensitivity is highlighted in the analysis of simulated spike trains (figure 8b), using persistent input currents of 3.1 and 3.3 in the HR equations (eq 3), which result in stable driving patterns between the simulated neurons. Here the magnitude of the ISIs between the leading and following neurons is much smaller, and the CED significance decreases rapidly over much smaller values of the noise standard deviation. The cross correlation remains virtually unchanged over the same interval.

2. Network Behavior

We have extended our analysis to monitor global properties of the whole network. We performed CE calculations over every pair combination, separately, for every phase (familiar, novel, second familiar) of the experiment. We then calculated the dissimilarities between every 5-lap run in every phase of the experiment. The dissimilarity was calculated according to the equation $|CED^{r}_{ij} - CED^{r'}_{ij}|$, where $CED^{r}_{ij}$ is the average causal entropy difference between neurons i and j during phase r. The dissimilarity index was obtained by averaging these absolute differences over all data sets for all pairs of phases r and r′. The smaller the dissimilarity index, the more similar the two network configurations as measured by the CEs.
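
A sketch of this dissimilarity index, assuming the pairwise average CEDs of each phase (or run) are stored as equally ordered arrays; the data layout is our assumption.

```python
import numpy as np

def dissimilarity(ced_r, ced_rp):
    """Mean absolute difference of pairwise average CEDs between two phases/runs."""
    return np.mean(np.abs(np.asarray(ced_r) - np.asarray(ced_rp)))

def dissimilarity_matrix(phase_ceds):
    """Full matrix of dissimilarities, e.g. over (familiar, novel, familiar) phases."""
    n = len(phase_ceds)
    D = np.empty((n, n))
    for r in range(n):
        for rp in range(n):
            D[r, rp] = dissimilarity(phase_ceds[r], phase_ceds[rp])
    return D
```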

To better understand the behavior of the network we first performed a similar analysis on the simulated network. To simulate the experimental design, the input to the neurons in the network was generated with random input currents I0i and then sorted such that I0i < I0j if i < j. The network was allowed to evolve freely for 11 s, at which point the input for every neuron was replaced with new, random values. This phase of the simulation was integrated for another 11 s. The input values to individual neurons were then restored to their original values and the network continued to be integrated for 11 s. The CEs were calculated for the last 5 s of each segment of time between all pairs of neurons and the patterns of CEs were then compared.

The results for the simulated network are plotted in Figure 9. The temporal interdependencies change dramatically with the new inputs to the individual neurons. As soon as the inputs were restored to their original values, the original temporal configuration of the network was recovered. Figures 9a and 9c show that a similar organization was present for the same configuration of inputs, while the organization was significantly different (Figure 9b) for the randomized one. Figure 9d depicts the dissimilarity measurements for the three simulation phases. As expected, there was high similarity between the first (familiar) and last (also familiar) phases, but low similarity between the random (novel) second phase and the other two.

Figure 9.

Similarity between different phases in a simulated network: A–C: The CEs between pairs of neurons. (A) The inputs to individual neurons are random, but sorted; (B) The input configuration is randomized; (C) The initial configuration of the inputs is re-established. Every phase was run for 11 s. The elements of the array represent the values of the CE averaged over the last 5 s of every phase. D) Calculation of the dissimilarity between the temporal configurations obtained in every phase (please refer to the text). The temporal ordering in the first and last phase is nearly identical.

Similar calculations performed on the experimental data yielded somewhat different results. To examine the experimental data from in vivo networks, the CEDs from each of the 6 experiments were averaged in nine time regions corresponding to the sets of five laps, each called a run: three runs each on an initial familiar track, a novel track, and again on the final familiar track. We discarded the first run of the initial 5 laps on the first familiar maze experience to eliminate any initial transients. The dissimilarities were calculated between every pair of runs, excluding self-comparisons of runs. The calculated dissimilarities were then combined by track to form an average comparison between novel and familiar phases of the experiments (Familiar 1–Novel, Familiar 1–Familiar 2, and Familiar 2–Novel, in addition to the comparison of each track with itself). The configuration of temporal interdependencies in every phase of the experiment was significantly different (Figure 10). However, the configurations obtained from the segments were more self-similar (diagonal of the matrix) than similar to one another, which indicates that the network configuration was relatively stable within a phase, but that a relatively large reconfiguration took place between phases. The self-similarity between runs in the novel phase was somewhat reduced, indicating an ongoing alteration of the functional network (e.g., learning). Additionally, the familiar tracks were somewhat more similar to one another than to the novel track, indicating perhaps that fewer plastic changes occurred between these two familiar phases than occurred during the novel phase, as would be expected.

Figure 10.

Average absolute difference of pair-wise CE differences between the three phases (Familiar 1, Novel and Familiar 2); this is the dissimilarity measure. (Left) Matrix representation of the dissimilarities; light is more similar, dark more dissimilar. (Right) Comparison of the changes in dissimilarity across the phases. There is a relatively large self-similarity of the configurations of temporal interdependencies within the familiar phases and a (marginally significant) decrease of self-similarity within the novel phase. Additionally, both familiar phases are somewhat more similar to each other than when comparing novel and familiar phases of the experiment. The small increase in the similarity between the novel and second familiar phase may indicate that the reconfiguration during the novel phase causes the temporal representations to be somewhat preserved in the second encounter with the familiar environment.

One can also observe a decrease (though not significant) in dissimilarity between the second familiar phase (Familiar 2) and the novel phase, as compared with the dissimilarity between the first familiar phase and the novel phase. An increased similarity between the novel and second familiar phases could indicate that the reconfiguration of the network during learning results in the formation of a temporal order configuration that overlaps those two (novel and second familiar) configurations of temporal interdependencies. In other words, the neural network was probably altered by the novel experience which slightly, but measurably, changed the activity of cell pairs active in the subsequent familiar maze. Since the two environments were adjacent, it is possible that the familiar map was altered to include new information gained from exploring the other side of the room behind the barrier.

Conclusions

We have presented a novel time-adaptive method to measure the temporal ordering of neuronal activities in a network. We have shown that the Causal Entropy measure can detect asymmetries in temporal dependencies with more sensitivity than standard measures based on cross correlations. Applying the measure to simulated spike train data, we illustrated the similarity of the CE cost function to low-bias entropy estimators and the advantage of the CE time-adaptive density estimator when spike counts are low. In data gathered from hippocampal neurons in freely behaving rats, we showed that the CE measure can detect asymmetries in the temporal correlations between individual neurons during exploration. Changes in CE could be a direct hallmark of activity-dependent synaptic reorganization and learning. We also extended the measure to examine the configuration of temporal interdependencies in a local network. Those results indicated a possible rewiring of the hippocampal network during the novel phase of the experiment.

Acknowledgments

This research was done with the support of the National Institutes of Health (1 R21 EB003583) (M.Z.), an NIH Molecular Biology Training Grant (2T32GM08270-17) (J.W.), the National Institute of Mental Health (MH060670 and MH076280) and the Department of Anesthesiology (G.P., V.B., J.R., B.R.). V.B. was also funded by the National Science Foundation (DBI-0340687 and DMS-0315862). J.R. was funded by Department of Neurology Training Grant T32NS07222.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

1. Abbott LF, Nelson SB. Synaptic plasticity: taming the beast. Nature Neuroscience. 2000;3:1178–1183. doi: 10.1038/81453.
2. Bartlett MS. On the Theoretical Specification and Sampling Properties of Autocorrelated Time-Series. J R Statist Soc. 1946;B8:27–41.
3. Bartlett MS. An introduction to stochastic processes. 2nd ed. Cambridge: Cambridge University Press; 1966.
4. Brea J, Russell DF, Neiman AB. Measuring direction in the coupling of biological oscillators: A case study for electroreceptors of paddlefish. Chaos. 2006;16:026111. doi: 10.1063/1.2201466.
5. Brillinger DR. The identification of point process systems. Annals of Probability. 1975;3:909–929.
6. Chen Y, Bressler SL, Ding M. Frequency decomposition of conditional Granger causality and application to multivariate neural field potential data. Journal of Neuroscience Methods. 2006;150:228–237. doi: 10.1016/j.jneumeth.2005.06.011.
7. Chornoboy ES, Schramm LP, Karr AF. Maximum likelihood identification of neuronal point process systems. Biological Cybernetics. 1988;59:265–275. doi: 10.1007/BF00332915.
8. Cox DR, Lewis PAW. Multivariate point processes. Proceedings of the Sixth Berkeley Symposium on Probability and Mathematical Statistics. 1972;3:401–448.
9. Dzakpasu R, Zochowski M. Discriminating differing types of synchrony in neural systems. Physica D. 2005;208:115–122.
10. Fries P, Neuenschwander S, Engel AK, Goebel R, Singer W. Rapid feature selective neuronal synchronization through correlated latency shifting. Nat Neurosci. 2001;4(2):194–200. doi: 10.1038/84032.
11. Gerstein GL, Perkel DH. Simultaneously recorded trains of action potentials: Analysis and functional interpretation. Science. 1969;164:828–830. doi: 10.1126/science.164.3881.828.
12. Gothard K, Skaggs WE, Moore KM, McNaughton BL. Binding of hippocampal CA1 neural activity to multiple reference frames in a landmark-based navigation task. J Neurosci. 1996;16:823–835. doi: 10.1523/JNEUROSCI.16-02-00823.1996.
13. Granger CWJ. Investigating causal relations by econometric models and cross-spectral methods. Econometrica. 1969;37:424–38.
14. Grassberger P. Finite sample corrections to entropy and dimension estimates. Phys Lett A. 1988;128:369–373.
15. Grillenzoni C. Sequential Kernel Estimation of the Conditional Intensity of Nonstationary Point Processes. Statistical Inference for Stochastic Processes. 2006;9:135–160.
16. Lin L, Osan R, Shoham S, Jin W, Zuo W, Tsien JZ. Identification of network-level coding units for real-time representation of episodic experiences in the hippocampus. PNAS. 2005;102(17):6125–6130. doi: 10.1073/pnas.0408233102.
17. Massey JL. Causality, Feedback, and Directed Information. Proc 1990 Intl Symp on Info Th and its Applications. 1990.
18. Nemenman I, Bialek W, de Ruyter van Steveninck R. Entropy and information in neural spike trains: Progress on the sampling problem. Phys Rev E. 2004;69:056111. doi: 10.1103/PhysRevE.69.056111.
19. Netoff TI, Schiff SJ. Decreased Neuronal Synchronization during Experimental Seizures. J Neurosci. 2002;22:7297–7307. doi: 10.1523/JNEUROSCI.22-16-07297.2002.
20. Okatan M, Wilson MA, Brown EN. Analyzing Functional Connectivity Using a Network Likelihood Model of Ensemble Neural Spiking Activity. Neural Computation. 2005;17(9):1927–1961. doi: 10.1162/0899766054322973.
21. Perkel DH, Gerstein GL, Moore GP. Neuronal spike trains and stochastic point processes II. Simultaneous spike trains. Biophysical Journal. 1967;7:419–440. doi: 10.1016/S0006-3495(67)86597-4.
22. Pikovsky AS, Rosenblum MG, Osipov GV, Kurths J. Phase synchronization of chaotic oscillators by external driving. Physica D. 1997;104(3–4):219–238.
23. Poe GR, Thompson CM, Riley BT, Tysor MK, Bjorness TE, Steinhoff BP, Ferluga ED. A spatial memory task appropriate for electrophysiological recordings. J Neurosci Methods. 2002;121(1):65–74. doi: 10.1016/s0165-0270(02)00233-9.
24. Potter SM. Distributed processing in cultured neuronal networks. Prog Brain Res. 2001;130:49–62. doi: 10.1016/s0079-6123(01)30005-5.
25. Rosenblum MG, Pikovsky AS, Kurths J. Phase synchronization in chaotic oscillators. Phys Rev Lett. 1996;76:1804–1807. doi: 10.1103/PhysRevLett.76.1804.
26. Rosenblum MG, Pikovsky AS, Kurths J. From phase to lag synchronization in coupled chaotic oscillators. Phys Rev Lett. 1997;78:4193–6.
27. Roulston MS. Estimating the errors on measured entropy and mutual information. Physica D. 1999;125:285–294.
28. Salinas E, Abbott LF. Vector reconstruction from firing rates. J Comput Neurosci. 1994;1:89–107. doi: 10.1007/BF00962720.
29. Schelter B, Winterhalder M, Eichler M, Peifer M, Hellwig B, Guschlbauer B, Lücking CH, Dahlhaus R, Timmer J. Testing for directed influences among neural signals using partial directed coherence. J Neurosci Methods. 2006;152:210–219. doi: 10.1016/j.jneumeth.2005.09.001.
30. Schürmann T. Bias analysis in entropy estimation. J Phys A. 2004;37:L295–L301.
31. Shapiro M. Plasticity, hippocampal place cells, and cognitive maps. Arch Neurol. 2001;58(6):874–81. doi: 10.1001/archneur.58.6.874.
32. Venkatachalam S, Fee MS, Kleinfeld D. Ultra-miniature headstage with 6-channel drive and vacuum-assisted micro-wire implantation for chronic recording from the neocortex. J Neurosci Methods. 1999;90:37–46. doi: 10.1016/s0165-0270(99)00065-5.
33. Wiener N. The theory of prediction. In: Beckenbach EF, editor. Modern Mathematics for Engineers. New York: McGraw-Hill; 1956. (Chapter 8).
34. Wilson MA, McNaughton BL. Reactivation of hippocampal ensemble memories during sleep. Science. 1994;265(5172):676–679. doi: 10.1126/science.8036517.
35. Wu JY, Cohen LB, Falk CX. Neuronal activity during different behaviors in Aplysia: a distributed organization? Science. 1994;263(5148):820–3. doi: 10.1126/science.8303300.
36. Zochowski M, Cohen LB, Fuhrmann G, Kleinfeld D. Distributed and partially separate pools of neurons are correlated with two different components of the gill-withdrawal reflex in Aplysia. J Neurosci. 2000;20(22):8485–92. doi: 10.1523/JNEUROSCI.20-22-08485.2000.
37. Zochowski M, Dzakpasu R. Conditional entropies, phase synchronization and changes in the directionality of information flow in neural systems. J Phys A. 2004;37:3823–4.
38. Zhang K, Ginzburg I, McNaughton BL, Sejnowski TJ. Interpreting neuronal population activity by reconstruction: unified framework with application to hippocampal place cells. J Neurophysiol. 1998;79:1017–1044. doi: 10.1152/jn.1998.79.2.1017.
39. Zhou C, Kurths J, Kiss IZ, Hudson JL. Noise-enhanced phase synchronization of chaotic oscillators. Phys Rev Lett. 2002;89:014101. doi: 10.1103/PhysRevLett.89.014101.
