Author manuscript; available in PMC: 2013 May 6.
Published in final edited form as: IEEE Trans Biomed Eng. 2009 Jan;56(1):83–92. doi: 10.1109/TBME.2008.2002153

A Spatiotemporal Filtering Methodology for Single-Trial ERP Component Estimation

Ruijiang Li 1, Jose C Principe 2, Margaret Bradley 3, Vera Ferrari 4
PMCID: PMC3645474  NIHMSID: NIHMS452025  PMID: 19224722

Abstract

A new spatiotemporal filtering method for single-trial event-related potential (ERP) estimation is proposed. Instead of attempting to model the entire ERP waveform, the method relies on modeling ERP component descriptors (amplitude and latency) through the spatial diversity of multichannel recordings, and thus, it is tailored to extract signals in negative SNR conditions. The model allows for both amplitude and latency variability in the ERP component under investigation. The extracted ERP component is constrained through a spatial filter to have minimal distance (with respect to some metric) in the temporal domain from a user-designed template component. The spatial filter may be interpreted as a noise canceller in the spatial domain. Studies with both simulated data and real cognitive ERP data show the effectiveness of the proposed method.

Index Terms: Event-related potentials (ERPs), single-trial estimation, spatiotemporal filtering

I. Introduction

Traditionally, event-related potential (ERP) analysis techniques have relied on the signal time course in a single recording channel of the EEG. In most cases, due to the low SNR, ensemble averaging of the EEG data over multiple trials is required to recover the ERP. This simple averaging technique is based on the assumption that the ERP is a deterministic signal, relative to the stimulus onset. However, research has shown that the nature of ERP (in particular, the peak latencies and amplitudes) is quite variable; therefore, it violates the assumption implicit in averaging [1]. Moreover, there may be systematic changes of the ERP due to habituation with the repetition of the stimulus [2]. In these cases, it may be desirable to estimate the ERP signature on a single-trial basis.

Woody [3] proposed a simple and widely used adaptive filtering method (called latency-corrected average) to deal with the temporal variability of the ERP signal. The method is based on the matched filtering concept, where the cross-correlation of a template and the single-trial EEG data is computed within a certain interval until the maximum is achieved. Pham et al. [4] adopted the maximum likelihood (ML) formalism to estimate the latencies of ERP assuming a constant shape and amplitude. Their ML method was extended in [5] to take amplitude variability into consideration. These methods are similar in that they all explicitly model the local descriptors (latency and amplitude) of the ERP. Another similarity is that they all rely on analysis of the EEG data in a single channel.

The advent of high-density EEG recordings has opened the door to exploit methodologies that combine effectively time series of multiple channels and enable the extraction of faint signals in very low SNR environments. Parra et al. provided a set of “recipes” using a linear spatial integration of multiple channels for the analysis of multivariate EEG data [6]. Depending upon the desired statistical properties of the output, different spatial integration methods arise. Two methods that have been made popular for ERP analysis are principal component analysis (PCA) [7] and independent component analysis (ICA) [8], [9]. One of the drawbacks of these popular methods is that they work best for large signals (e.g., ocular artifacts), but are not specifically designed to detect signals in very low SNR environments, as is common for ERPs. Other methods using sparse decomposition [10] have also been proposed.

A related but different problem is the (supervised) single-trial EEG classification problem. It is generally less difficult than the (unsupervised) single-trial estimation problem in the sense that the availability of label information for classification facilitates learning. Many spatiotemporal filtering methods have been proposed for the single-trial EEG classification problem, which include, but are not limited to, common spatial patterns (CSP) [11], linear discriminant analysis [12], common spatiospectral patterns (CSSPs) [13], and bilinear discriminant component analysis [14]. Recently, a two-stage (spatial and temporal) filtering method with regularization has been proposed for EEG classification [15].

In order to take full advantage of the spatial information in EEG analysis, a method must be found to exploit the localized nature of neural activation. Macroscopic events of clinical relevance such as ERPs do not fit this criterion since they are a combination of cortical responses with different localized sources in the brain. Therefore, in this paper, we propose a methodology that treats each one of the ERP sources independently and shows promise to quantify single brain events. The method does not work with the ERP waveform, but only with features of its main components (e.g., P300, N400, etc.). It constrains, through a spatial filter, the extracted ERP component to have minimal distance in the temporal domain from a user-designed template component. This componentwise, model-based approach allows for amplitude and latency variability in the actual component. We wish to stress that we do not attempt to recover the waveform of the ERP nor of its components (as most conventional methods do), but will instead focus on explicitly finding the local descriptors (latency and amplitude) on a single-trial basis, because these local descriptors, rather than the exact morphology of the waveform, are, in general, of more interest and more psychologically significant. This is fortunate because it opens the door to methodologies that can work in negative SNR environments, and corroborates our claim that it is a new spatiotemporal methodology for ERP analysis.

The organization of the paper is as follows. Section II introduces a widely used linear generative EEG model, which serves as the foundation for our method. Section III presents the details of the new spatiotemporal filtering approach to the single-trial ERP estimation problem. Results on both simulated and real cognitive ERP data are presented in Section IV. We conclude the paper in Section V.

II. Linear Generative EEG Model

We start with the “neuronal generator” assumption of EEG data, i.e., neuron populations in cortical and subcortical brain tissues act as current sources [16], [17]. Within the EEG frequency range, brain tissues can be assumed to be primarily a resistive medium governed by Ohm’s law [18]. Thus, in EEG data, the electrical potentials collected at each electrode on the scalp are basically a linear combination of neural current sources (and nonneural artifacts) as a result of volume conduction. The linear generative model for EEG data can be written in matrix form:

$$X = a s^T + \sum_{i=1}^{N} b_i n_i^T \qquad (1)$$

where $X$ is a $D \times T$ matrix representing the single-trial EEG data with $D$ channels and $T$ samples, $s$ is the time course of the ERP component to be extracted, and $n_i$ denotes noise in general. The vectors $a$ and $b_i$ represent the projection of the corresponding source to each electrode on the scalp and are called the scalp projection associated with the source. They are generally unknown and depend on the orientation of the current source as well as the conductivity distribution of the underlying brain tissues, skull, skin, and electrodes [19].

The EEG model in (1) can be rewritten as follows:

$$X = \sigma_s a_o s_o^T + \sum_{i=1}^{N} \sigma_i b_{oi} n_{oi}^T \qquad (2)$$

where $a_o$, $s_o$, $b_{oi}$, and $n_{oi}$ are the normalized versions of their counterparts in (1).

The scalars $\sigma_s$ and $\sigma_i$ can be seen as the overall contributions of the sources to the EEG data. For a meaningful ERP component, it must have a stable normalized scalp projection $a_o$. Thus, we may assume that $a_o$ is fixed for all trials. We also assume that the waveform of a particular ERP component $s_o$ remains the same for all trials, although its amplitude $\sigma_s$ may change across trials.
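As an illustration of the normalized model in (2), the following NumPy sketch assembles one synthetic trial from a unit-norm scalp projection, a unit-norm time course, and a set of noise sources. The dimensions, the Gaussian-shaped component, and the random noise sources are assumptions chosen only for the example; they are not taken from the experiments reported here.

```python
import numpy as np

def unit(v):
    """Scale a vector to unit Euclidean norm."""
    return v / np.linalg.norm(v)

def simulate_trial(a_o, s_o, sigma_s, B_o, N_o, sigma_n):
    """Assemble X = sigma_s * a_o s_o^T + sum_i sigma_i * b_oi n_oi^T.

    a_o: (D,) unit-norm scalp projection of the component
    s_o: (T,) unit-norm time course of the component
    B_o: (D, N) noise scalp projections, one unit-norm column per source
    N_o: (N, T) noise time courses, one unit-norm row per source
    sigma_n: (N,) overall contribution of each noise source
    """
    signal = sigma_s * np.outer(a_o, s_o)
    noise = (B_o * sigma_n) @ N_o      # scales column i of B_o by sigma_n[i]
    return signal + noise

# Hypothetical sizes and sources, for illustration only.
rng = np.random.default_rng(0)
D, T, N = 8, 150, 20
a_o = unit(rng.standard_normal(D))
s_o = unit(np.exp(-0.5 * ((np.arange(T) - 50) / 8.0) ** 2))
B_o = rng.standard_normal((D, N))
B_o /= np.linalg.norm(B_o, axis=0)
N_o = rng.standard_normal((N, T))
N_o /= np.linalg.norm(N_o, axis=1, keepdims=True)
sigma_n = rng.uniform(1.0, 5.0, N)

X = simulate_trial(a_o, s_o, 2.0, B_o, N_o, sigma_n)
```

With all noise contributions set to zero, the trial reduces to a rank-one matrix whose Frobenius norm equals the component amplitude, which is one way to read the role of the scalar in front of the component term.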

III. New Spatiotemporal Filtering Method

Most ERP components are monophasic waveforms with compact support in time. The morphology of the waveform can be considered relatively fixed, but may vary in both its peak latency and amplitude from trial to trial. Based on this, we assume that a particular ERP component can be modeled by a fixed dimensionless template (i.e., one with no physical unit) in the temporal domain, denoted by $g_o(l)$ (where $l$ is the unknown peak latency), multiplied by an unknown amplitude $\sigma_s$ that may vary across trials.

It can be shown that under general conditions, there exists a spatial filter that will completely reject the interference from the $D - 1$ largest noise sources in the output when it is applied to the single-trial EEG data. In principle, such a spatial filter can be chosen as the first row of the inverse of the square matrix $[a \; b_1 \cdots b_{D-1}]$ (if it is invertible). In this sense, the spatial filter may be interpreted as a noise canceller in the spatial domain. The question is how to find such a spatial filter in practice, where the scalp projections $a$ and $b_i$ are unknown.
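The noise-cancelling interpretation can be checked numerically in the idealized case where the projections are known and there are exactly D − 1 noise sources. The sketch below is only such a check under those assumptions (random mixing matrix and random sources, both hypothetical); it does not address the practical problem of unknown projections.

```python
import numpy as np

rng = np.random.default_rng(1)
D, T = 6, 200

# Columns of M play the role of [a, b_1, ..., b_{D-1}] (assumed known here).
M = rng.standard_normal((D, D))
# Rows of S play the role of [s, n_1, ..., n_{D-1}].
S = rng.standard_normal((D, T))
S[1:] *= 10.0                 # make the noise sources much stronger than s

X = M @ S                     # observed multichannel data

w = np.linalg.inv(M)[0]       # first row of the inverse of the mixing matrix
y = w @ X                     # spatially filtered output, shape (T,)
```

Because the first row of the inverse is orthogonal to every noise projection and has unit inner product with the component projection, the output `y` reproduces the first source regardless of how strong the noise sources are.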

A. Finding the Peak Latency for the ERP Component

Since we do not know the peak latency in a single trial, we denote the template as $g_o(\tau)$ with a variable time lag parameter $\tau$ and slide it one lag at a time to search for the peak latency. The search for the optimal filter could be realized by minimizing some distance measure between the spatially filtered output $w^T X$ and the template for the particular ERP component. We propose the following cost function based on second-order statistics (SOS):

$$\min_{w} \left\| w^T X - g_o(\tau)^T \right\|_2^2. \qquad (3)$$

Note that the above optimization is with respect to $w$ only, with $\tau$ fixed. The optimal solution for $w$ is given by:

$$w(\tau) = (X X^T)^{-1} X g_o(\tau). \qquad (4)$$

Obviously, the optimal spatial filter depends on which ERP component is to be extracted, and also, is a function of the variable time lag. From (3) and (4), we obtain the cost solely as a function of the time lag:

$$J(\tau) = \left\| g_o(\tau)^T \left[ X^T (X X^T)^{-1} X - I \right] \right\|_2^2. \qquad (5)$$
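For readers who want the intermediate steps, the following is a short derivation sketch of the optimal filter and of the lag-dependent cost, assuming that the matrix $XX^T$ is invertible. The projector $P$ is a symbol introduced here only for compactness.

```latex
\begin{aligned}
\frac{\partial}{\partial w}\,\bigl\| X^T w - g_o(\tau) \bigr\|_2^2
  &= 2X\bigl(X^T w - g_o(\tau)\bigr) = 0
  \;\Longrightarrow\; w(\tau) = (XX^T)^{-1} X g_o(\tau), \\
w(\tau)^T X - g_o(\tau)^T
  &= g_o(\tau)^T \bigl[ X^T (XX^T)^{-1} X - I \bigr], \\
J(\tau) &= g_o(\tau)^T (I - P)\, g_o(\tau), \qquad P = X^T (XX^T)^{-1} X .
\end{aligned}
```

The last line uses the fact that $P$ is symmetric and idempotent, so the cost is the squared norm of the part of the template that lies outside the row space of $X$.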

The peak latency of the ERP component can be set as the time lag where the local minimum of $J(\tau)$ occurs within the meaningful range of peak latencies ($T_s$) for that particular component (provided that its waveform is monophasic), i.e.,

$$l = \arg\min_{\tau \in T_s} J(\tau). \qquad (6)$$

The (normalized) estimated ERP component is then:

$$y_o(l) = \frac{X^T w(l)}{\left\| X^T w(l) \right\|_2}. \qquad (7)$$

It can be shown that under certain conditions, the solution in (6) is identical to the true peak latency of the ERP component (see Appendix).
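The latency search of this subsection can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the Gaussian-shaped `gaussian_template`, the search range, and the synthetic trial (one component plus D − 1 random noise sources) are assumptions made for the example, and the filter is obtained with a least-squares solver, which gives the same result as the closed-form solution when the data covariance is invertible.

```python
import numpy as np

def gaussian_template(T, lag, width=6.0):
    """Hypothetical unit-norm monophasic template peaking at sample `lag`."""
    t = np.arange(T)
    g = np.exp(-0.5 * ((t - lag) / width) ** 2)
    return g / np.linalg.norm(g)

def estimate_latency(X, lags, template=gaussian_template):
    """Slide the template over `lags` and return (latency, filter, component, cost).

    X has shape (D, T). For every lag, the spatial filter minimising the
    squared distance between X^T w and the template is computed, and the
    residual cost is stored; the latency is the lag with the smallest cost.
    """
    T = X.shape[1]
    J = np.empty(len(lags))
    filters = []
    for i, lag in enumerate(lags):
        g = template(T, lag)
        # Least-squares filter; equals (X X^T)^{-1} X g when X X^T is invertible.
        w = np.linalg.lstsq(X.T, g, rcond=None)[0]
        J[i] = np.sum((X.T @ w - g) ** 2)
        filters.append(w)
    best = int(np.argmin(J))
    y = X.T @ filters[best]
    return int(lags[best]), filters[best], y / np.linalg.norm(y), J

# Synthetic trial: one component at sample 70 mixed with D - 1 noise sources.
rng = np.random.default_rng(2)
D, T, true_lag = 8, 150, 70
mixing = rng.standard_normal((D, D))      # columns: [a, b_1, ..., b_{D-1}]
sources = rng.standard_normal((D, T))     # rows 1..D-1 act as noise sources
sources[0] = 5.0 * gaussian_template(T, true_lag)
X = mixing @ sources

lags = np.arange(30, 121)                 # assumed search range, in samples
l, w, y_o, J = estimate_latency(X, lags)
```

In this idealized setting (template identical to the component, no more than D − 1 noise sources) the cost drops to numerical zero at the true lag. With real data the minimum is only approximate, and the search range should be restricted to plausible latencies for the component of interest.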

B. Finding the Scalp Projection and Amplitude for the ERP Component

In the following, we make the index for trial number $k$ explicit. We can absorb the scalar $\sigma_k$ into a variable scalp projection:

$$a_k = \sigma_k a_o. \qquad (8)$$

In order to estimate the unknown scalp projection and amplitude of the ERP component, we assume that the ERP component is uncorrelated with all the noise sources. Replacing the dimensionless ERP component in (2) by its estimate in (7) and right-multiplying both sides of (2) by $y_{ok}(l_k)$, we get an estimate for the single-trial scalp projection (the cross terms $n_{oi}^T y_{ok}(l_k)$ vanish because of the uncorrelatedness assumption):

$$\hat{a}_k = X_k \, y_{ok}(l_k). \qquad (9)$$

Taking the normalized version we have

$$\hat{a}_{ok} = \frac{\hat{a}_k}{\left\| \hat{a}_k \right\|_2}. \qquad (10)$$

To estimate the scalp projection for a stable ERP component, we propose the following cost function:

$$\min_{a_o} \sum_{k=1}^{K} \left\| a_o - \hat{a}_{ok} \right\|_2^2. \qquad (11)$$

This corresponds to an ML estimator for $a_o$ under the assumption that each entry of the normalized single-trial scalp projection is an independent identically distributed (i.i.d.) Gaussian random variable. The optimal solution for (11) is a simple average of the estimated single-trial scalp projections for all $K$ trials. Taking the normalized version, the following estimate for $a_o$ is obtained:

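$$\hat{a}_o = \frac{\frac{1}{K} \sum_{k=1}^{K} \hat{a}_{ok}}{\left\| \frac{1}{K} \sum_{k=1}^{K} \hat{a}_{ok} \right\|_2}. \qquad (12)$$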

Notice that (12) is, in fact, a weighted average of the estimates in (9). We also point out that (12) is different from summing up directly (9) for all trials since the peak latency parameter is involved and it changes from trial to trial.

In the ideal case, the two vectors $a_o$ and $a_k$ are identical except for a scaling factor, which is exactly the unknown amplitude $\sigma_k$ associated with the ERP component in the $k$th trial. Substituting their respective estimates, we can find $\sigma_k$ using again an SOS criterion:

$$\min_{\sigma_k} \left\| \hat{a}_k - \sigma_k \hat{a}_o \right\|_2^2. \qquad (13)$$

Simple calculation leads to the following estimate for the amplitude:

$$\hat{\sigma}_k = \hat{a}_o^T \hat{a}_k. \qquad (14)$$

Note that we do not directly compute the amplitude from the estimated component, nor do we measure it in any single channel. Instead, the amplitude is computed in (14) indirectly through an inner product of two scalp projections, which involves information from all the available channels. These estimates for the scalp projection, peak latencies, and amplitudes of ERP component can be used to analyze its psychological significance on a single-trial basis.
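The estimation chain of this subsection (single-trial projections, their normalized average, and the inner-product amplitude) can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name, the synthetic trials, and the noise construction (noise time courses made exactly orthogonal to the component, to mimic the uncorrelatedness assumption) are hypothetical and chosen so that the result can be checked.

```python
import numpy as np

def estimate_projection_and_amplitudes(trials, components):
    """Estimate the stable scalp projection and the single-trial amplitudes.

    trials:     list of K arrays X_k with shape (D, T)
    components: list of K unit-norm estimated components y_ok(l_k), shape (T,)
    """
    # Single-trial scalp projections, one row per trial: shape (K, D).
    a_hat = np.stack([X @ y for X, y in zip(trials, components)])
    # Normalise each single-trial projection to unit norm.
    a_hat_o = a_hat / np.linalg.norm(a_hat, axis=1, keepdims=True)
    # The least-squares solution is the plain average, renormalised afterwards.
    mean = a_hat_o.mean(axis=0)
    a_o = mean / np.linalg.norm(mean)
    # Amplitude: inner product of the stable projection with each trial's projection.
    sigma = a_hat @ a_o
    return a_o, sigma

# Synthetic check with hypothetical sizes.
rng = np.random.default_rng(3)
D, T, K = 8, 100, 20
a_true = rng.standard_normal(D)
a_true /= np.linalg.norm(a_true)
s_o = np.hanning(T)
s_o /= np.linalg.norm(s_o)
sigma_true = rng.uniform(0.5, 2.0, K)

trials = []
for k in range(K):
    noise_tc = rng.standard_normal((D - 1, T))
    noise_tc -= np.outer(noise_tc @ s_o, s_o)   # remove any correlation with s_o
    B = rng.standard_normal((D, D - 1))
    trials.append(sigma_true[k] * np.outer(a_true, s_o) + B @ noise_tc)

a_o_hat, sigma_hat = estimate_projection_and_amplitudes(trials, [s_o] * K)
```

When the noise is exactly uncorrelated with the component, as constructed here, the projection and amplitudes are recovered exactly; with real data the cross terms only vanish approximately, which is one source of the estimation variance discussed in the results.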

IV. Results

We present in this section a simulation study with synthetic data and real EEG data recorded from subjects during a passive picture-viewing experiment. The goal of the simulation study is to evaluate the latency and amplitude precision of synthetically generated transients immersed in real EEG background with different SNRs and waveform mismatching conditions.

EEG data were recorded from subjects during a passive picture-viewing experiment consisting of 12 alternating phases of two types: the habituation phase and the mixed phase. Each phase had 30 trials. During the 30 trials of the habituation phase, the same picture was presented 30 times. During the mixed phase, the 30 pictures were all different. Each trial lasted 1600 ms, with 600 ms prestimulus and 1000 ms poststimulus.

The scalp electrodes were placed according to the 128-channel Geodesic Sensor Nets standards. All 128 channels were referred to channel Cz and were digitally sampled for analysis at 250 Hz. A bandpass filter between 0.01 and 40 Hz was applied to all channels, which were then converted to average reference. Ocular artifacts (eye movement) were corrected with EOG recordings using the off-line EMCP method [20].

A. Simulation Study

Lange et al. [21] have used a Gaussian function as the template for an ERP component. Here, we prefer the Gamma function for the shape of the synthetic ERP component because this is a very flexible function for waveform modeling and has been used extensively in neurophysiological modeling [22], [23]. Freeman in [30] argued that the macroscopic EEG electrical field is created from spike trains by a nonlinear generator with a second-order linear component with real poles. According to this model, the impulse response of the system is a monophasic waveform with a single mode, where the rise time depends on the relative magnitude of the two real poles. This may be approximated by a Gamma function, which is expressed by:

$$g(t) = c \, t^{k-1} \exp\!\left(-\frac{t}{\theta}\right), \quad t > 0 \qquad (15)$$

where k > 0 is a shape parameter, θ > 0 is a scale parameter, and c is a normalizing constant. The Gamma function is a monophasic waveform with the mode at t = (k − 1)θ, (k > 1). It has a short rise time and a longer tail for small k, and approximates a symmetric waveform for large k.
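As a quick numerical check of the stated mode, the Gamma waveform can be evaluated on a sample grid. The sketch below assumes time is measured in samples and uses the two parameter pairs that appear in this paper (k = 3, θ = 13 and k = 5, θ = 6); the function name and the grid length are choices made for the example.

```python
import numpy as np

def gamma_waveform(t, k, theta, c=1.0):
    """Evaluate c * t**(k-1) * exp(-t/theta) for t > 0, and 0 elsewhere."""
    t = np.asarray(t, dtype=float)
    g = np.zeros_like(t)
    pos = t > 0
    g[pos] = c * t[pos] ** (k - 1) * np.exp(-t[pos] / theta)
    return g

t = np.arange(150)                            # time axis in samples (assumed)
g_a = gamma_waveform(t, k=3, theta=13)
g_b = gamma_waveform(t, k=5, theta=6)

mode_a = int(np.argmax(g_a))                  # expected at (3 - 1) * 13 = 26
mode_b = int(np.argmax(g_b))                  # expected at (5 - 1) * 6 = 24
```

At the 250 Hz sampling rate used for the recordings, a mode at 24 samples corresponds to 96 ms after the onset of the waveform.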

The scalp projection of the synthetic ERP component is chosen as the normalized P300 scalp projection from another study [24]. The simulation data were created by taking the superposition of the 600 ms (150 samples) prestimulus data from 120 trials as the background EEG data and the scalp projected Gamma waveform as a proxy for the ERP component.

The SNR levels given the background EEG data can be easily adjusted by modifying the normalizing constant c in (15). We define the SNR given the single-trial EEG data as:

$$\mathrm{SNR} = 20 \log \left( \frac{\sigma_s}{\sqrt{\mathrm{Tr}(X X^T)/T}} \right). \qquad (16)$$

We will compare the performance of the proposed method with Woody’s filter [3]. The reasons are twofold: first, both methods explicitly model the latency of the ERP component through SOS; second, advantages (if any) of using spatial information for ERP estimation may be demonstrated comparing with single-channel EEG analysis. We will also compare with the well-established techniques based on PCA, which have been widely used for ocular artifact correction. Particularly, we will use the spatial PCA technique [25].

First, we will test the performance of the proposed spatiotemporal filtering method at varying SNR levels (from −20 to 12 dB), where we have access to the “actual” (synthetic) ERP component. Two case studies will be investigated: one where there is an exact match between the synthetic and the ERP component template and the other where there is a mismatch between the two components. For the case of exact match, we use the parameters k = 3, θ = 13 for both ERP components. For the mismatch case, the synthetic ERP component remains the same, but the template has a different waveform with parameters k = 5, θ = 6. Fig. 1(a) shows the waveforms of the two components for the mismatch case. We fixed the peak latency of the synthetic component at 200 ms for all the 120 trials.

Fig. 1. Waveforms of synthetic and presumed ERP components. (a) Synthetic component: Gamma, k = 3, θ = 13; presumed component: Gamma, k = 5, θ = 6. (b) Synthetic component: Gamma, k = 3, θ = 13; presumed components: Gamma, k = 3, θ = 7 and θ = 19, and Gaussian with a spread of 20.

For Woody’s filter, we selected channel Pz for analysis, and use the initial ensemble average as the template, avoiding the iterative update on the template (since the true latency is fixed, this is the best-case scenario for Woody’s filter). We search around the true latency within 100 ms for maximum correlation. We estimate the peak amplitude by taking the average of the peak value and its two adjacent values (corresponding to a noncausal low-pass filter with cutoff frequency of 12 Hz). For spatial PCA, we select the eigenvector that has the maximal correlation with the P300 scalp projection (note that this is an ideal case for PCA since in reality we do not know exactly the true scalp projection, nor the exact time course).
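For comparison, the single-pass variant of Woody's filter described above can be sketched as a cross-correlation search in one channel. This is a simplified illustration under assumptions: the helper names, the Gaussian-shaped template, and the noise-free test signal are hypothetical, and the search window of 25 samples corresponds to 100 ms at 250 Hz.

```python
import numpy as np

def shift(v, lag):
    """Shift a 1-D array by `lag` samples (positive = later), padding with zeros."""
    out = np.zeros_like(v)
    if lag > 0:
        out[lag:] = v[:-lag]
    elif lag < 0:
        out[:lag] = v[-lag:]
    else:
        out[:] = v
    return out

def woody_latency(x, template, template_peak, half_window):
    """Latency maximising the cross-correlation of x with the shifted template.

    x:             single-channel, single-trial data, shape (T,)
    template:      fixed template (e.g. the ensemble average), shape (T,)
    template_peak: sample index of the template peak
    half_window:   maximum shift, in samples, searched on either side
    """
    lags = np.arange(-half_window, half_window + 1)
    scores = np.array([shift(template, int(lag)) @ x for lag in lags])
    return template_peak + int(lags[int(np.argmax(scores))])

def peak_amplitude(x, latency):
    """Average of the value at the latency and its two adjacent samples."""
    return float(np.mean(x[latency - 1: latency + 2]))

# Noise-free check: a template peaking at sample 50, delayed by 5 samples.
t = np.arange(150)
template = np.exp(-0.5 * ((t - 50) / 6.0) ** 2)
x = 2.0 * shift(template, 5)

latency = woody_latency(x, template, template_peak=50, half_window=25)
amplitude = peak_amplitude(x, latency)
```

The three-point average slightly underestimates the true peak value even without noise, because the neighbouring samples lie below the peak.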

The simulation results are summarized in Tables I–III, which show the mean and standard deviation of the estimated latency and of the ratio between estimated and true peak amplitude (since the true peak amplitude changes from trial to trial with fixed SNR), as well as the correlation coefficient between estimated and true scalp projection. Since PCA does not give an explicit estimate for latency and amplitude, we omit the latency estimation for PCA and only compute the amplitude ratio between the estimated ERP component and the synthetic ERP component.

Table I. Latency Estimation (mean ± standard deviation, in ms)

| SNR (dB) | Woody | Exact match | Mismatch | Refined |
|---|---|---|---|---|
| −20 | 191 ± 60 | 199 ± 5 | 194 ± 21 | 207 ± 23 |
| −16 | 190 ± 60 | 200 ± 2 | 191 ± 16 | 205 ± 19 |
| −12 | 190 ± 61 | 200 ± 1 | 191 ± 14 | 205 ± 16 |
| −8 | 191 ± 63 | 200 ± 0 | 191 ± 14 | 204 ± 13 |
| −4 | 195 ± 60 | 200 ± 0 | 191 ± 13 | 204 ± 12 |
| 0 | 195 ± 54 | 200 ± 0 | 191 ± 13 | 204 ± 12 |
| 4 | 199 ± 45 | 200 ± 0 | 191 ± 13 | 203 ± 11 |
| 8 | 199 ± 32 | 200 ± 0 | 191 ± 13 | 203 ± 11 |
| 12 | 200 ± 15 | 200 ± 0 | 191 ± 13 | 202 ± 9 |

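Table III. Scalp Projection Estimation (correlation coefficient between estimated and true scalp projection)

| SNR (dB) | PCA | Exact match (proposed) | Mismatch (proposed) | Mismatch (refined) |
|---|---|---|---|---|
| −20 | 0.568 | 0.898 | 0.794 | 0.772 |
| −16 | 0.579 | 0.952 | 0.888 | 0.857 |
| −12 | 0.601 | 0.978 | 0.946 | 0.925 |
| −8 | 0.649 | 0.990 | 0.977 | 0.965 |
| −4 | 0.759 | 0.996 | 0.990 | 0.985 |
| 0 | 0.906 | 0.998 | 0.996 | 0.994 |
| 4 | 0.979 | 0.999 | 0.998 | 0.997 |
| 8 | 0.996 | 1.000 | 0.999 | 0.999 |
| 12 | 1.000 | 1.000 | 1.000 | 1.000 |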

First, we note that the single-trial estimation of the peak latency is very stable in the case of exact match. Notably, for EEG data with SNR higher than −12 dB, the method estimates the latency correctly for all the trials. Second, the amplitude estimate for the exact match case is also stable, but is more variable than the latency estimate. The mean amplitude approaches 1 and the standard deviation decreases as the SNR increases. We may say that in the case of exact match between the model and the component, the estimator for the amplitude is asymptotically unbiased and asymptotically consistent with increasing SNR.

The mismatch between the model and the generated component introduces a bias in the estimation of the latency. For SNR higher than −12 dB, the mean latency approaches 191 ms, yielding a bias of around two samples. Moreover, the variance does not drop with increasing SNR; therefore, the latency estimate is not consistent, with a standard deviation of 13 ms (around three samples). The mismatch of components also introduces a bias in the estimation of the amplitude, which is due to the difference in the waveforms of the synthetic component and template. The ratio of the amplitudes converges to around 1.17 for high SNR. Thus, in the case of mismatch, the estimator for the amplitude is biased and asymptotically consistent with respect to SNR. To assess the statistical significance of the bias at various SNR (last column in Table II), we performed a two-sample, two-sided t-test, using the sample mean (1.17) at 12 dB as the baseline, where the null hypothesis is that the biased mean amplitude does not change at a particular SNR from the baseline. We computed the p-values as 0.052, 0.061, 0.126, and 0.229 at SNR of −20, −16, −12, and −8 dB, respectively. Thus, we reason that as long as the SNR does not fall below −20 dB, the change in the estimated amplitude mean due to varying SNR is not statistically significant (at a significance level of 5%). Of course, the estimation variance does increase notably as SNR decreases. But, as evident from Table II, our method at −12 dB still gives an estimation variance smaller than Woody’s method at 0 dB.

Table II. Amplitude Estimation (True Amplitude: 1)

| SNR (dB) | PCA | Exact match (proposed) | Mismatch: Woody | Mismatch: proposed | Mismatch: refined |
|---|---|---|---|---|---|
| −20 | 10.7 ± 2.12 | 1.37 ± 2.21 | −0.22 ± 9.32 | 1.59 ± 2.35 | 2.18 ± 2.96 |
| −16 | 6.80 ± 1.35 | 1.14 ± 1.23 | 0.07 ± 5.95 | 1.44 ± 1.56 | 1.68 ± 1.73 |
| −12 | 4.33 ± 0.86 | 1.07 ± 0.76 | 0.40 ± 3.84 | 1.31 ± 0.99 | 1.42 ± 1.02 |
| −8 | 2.78 ± 0.55 | 1.04 ± 0.48 | 0.36 ± 2.60 | 1.24 ± 0.63 | 1.30 ± 0.62 |
| −4 | 1.86 ± 0.36 | 1.0238 ± 0.30 | 0.53 ± 1.75 | 1.21 ± 0.40 | 1.23 ± 0.39 |
| 0 | 1.35 ± 0.23 | 1.0144 ± 0.19 | 0.64 ± 1.14 | 1.19 ± 0.26 | 1.19 ± 0.24 |
| 4 | 1.13 ± 0.13 | 1.0089 ± 0.12 | 0.80 ± 0.74 | 1.18 ± 0.17 | 1.17 ± 0.15 |
| 8 | 1.05 ± 0.08 | 1.0055 ± 0.08 | 0.90 ± 0.45 | 1.18 ± 0.11 | 1.16 ± 0.10 |
| 12 | 1.02 ± 0.05 | 1.0034 ± 0.05 | 0.98 ± 0.24 | 1.17 ± 0.07 | 1.15 ± 0.06 |

Tables I and II clearly show the advantage of using spatial information, in contrast to the Woody filter based on single-channel analysis. Specifically, the estimation variance of the Woody filter for both latency and amplitude is much larger for realistic (negative) SNRs.

PCA overestimates the amplitude of the ERP component for low SNR data. In contrast to our method, PCA gives a statistically significant bias even at 0 dB from the baseline at 12 dB (p-value less than 0.0001). This means that varying SNR (below 0 dB) imposes a serious problem on the application of PCA in low SNR conditions.

Finally, the simulated mismatch of components affects the estimation of the scalp projection negligibly for the proposed method when SNR is higher than −10 dB. In fact, the estimation for scalp projection with mismatch with our method at −8 dB is comparable to that of PCA at +4 dB.

The second simulation concerns the effects of the mismatch on the estimation, specifically mismatch in the spread parameter θ, which is the most important. We use the same synthetic component as before and vary the spread parameter with k fixed. Fig. 1(b) shows the waveforms of the synthetic component and three of the templates, including a Gaussian with a spread of 20. The results are summarized in Tables IV and V, for SNR = −20 and −10 dB, respectively. We have included a new quantification for the amplitude estimation: the coefficient of variation (CV), which is defined as the ratio of the standard deviation to the mean of a random variable. It is used as a measure of dispersion of the estimated amplitude (since its true mean is fixed at 1).
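As a worked example of the CV, consider the amplitude estimate reported in Table II for the exact-match case at −20 dB, 1.37 ± 2.21:

```latex
\mathrm{CV} = \frac{\text{standard deviation}}{\text{mean}}
            = \frac{2.21}{1.37} \approx 1.61
```

A CV above 1 means that the spread of the single-trial amplitude estimates exceeds their mean, which is typical of the −20 dB condition.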

Table IV. Effects of Mismatch (SNR = −20 dB; last row is for the Gaussian template)

| Spread parameter | Latency (ms) | Amplitude | CV | Scalp projection |
|---|---|---|---|---|
| 7 | 183 ± 18 | 2.17 ± 2.74 | 1.27 | 0.78 |
| 9 | 186 ± 12 | 1.84 ± 2.64 | 1.43 | 0.82 |
| 11 | 192 ± 12 | 1.52 ± 2.28 | 1.51 | 0.86 |
| 13 | 200 ± 5 | 1.37 ± 2.21 | 1.61 | 0.89 |
| 15 | 213 ± 12 | 1.19 ± 1.97 | 1.66 | 0.92 |
| 17 | 226 ± 14 | 1.06 ± 1.73 | 1.63 | 0.92 |
| 19 | 245 ± 18 | 0.92 ± 1.47 | 1.60 | 0.90 |
| 20 | 210 ± 22 | 2.14 ± 2.89 | 1.35 | 0.76 |

Table V. Effects of Mismatch (SNR = −10 dB; last row is for the Gaussian template)

| Spread parameter | Latency (ms) | Amplitude | CV | Scalp projection |
|---|---|---|---|---|
| 7 | 179 ± 12 | 1.39 ± 0.86 | 0.62 | 0.95 |
| 9 | 183 ± 9 | 1.27 ± 0.77 | 0.61 | 0.96 |
| 11 | 190 ± 7 | 1.14 ± 0.68 | 0.59 | 0.98 |
| 13 | 200 ± 0 | 1.06 ± 0.61 | 0.57 | 0.99 |
| 15 | 210 ± 5 | 0.97 ± 0.53 | 0.55 | 0.99 |
| 17 | 219 ± 8 | 0.89 ± 0.47 | 0.53 | 0.99 |
| 19 | 229 ± 9 | 0.81 ± 0.42 | 0.51 | 0.99 |
| 20 | 208 ± 11 | 1.30 ± 0.78 | 0.60 | 0.95 |

From Tables IV and V, we can see that at the same SNR level, the degree of variability in the amplitude estimation (measured by the coefficient of variation) for the mismatch cases exceeds that of the exact match case by less than 10%. (In some cases, the CV is even smaller than in the exact match case, which gives a better estimate, but only in terms of the amplitude.) This is important because although the estimated amplitudes differ in the mismatch case, as long as we use the same template, these amplitudes on average will always be magnified or attenuated by a constant factor at a certain SNR level (or by a factor that is not statistically significant above −20 dB, as shown in the previous simulation). Intuitively, this means that given a fixed template and varying SNR (>−20 dB), the dominant source of variability in amplitude estimation is the estimation variance (not bias), and this variability (measured by CV) is well bounded from above for a range of spread parameters. Therefore, these amplitude estimates may still be effectively compared across experimental conditions as long as the same (meaningful) template is used and the SNR of the ERP data does not fall below −20 dB. However, the mismatch clearly introduces a bias in the latency estimation, which may be as large as 20 ms in absolute terms. This may or may not be significant depending on the application. Mismatch may also worsen the estimation of the scalp projection if the spread parameter is not well chosen, particularly for low SNR (−20 dB) data.

Note that our estimation for the latency through (6) may be interpreted as a two-step operation: first, we estimate the waveform of the ERP component, and then we calculate the latency based on the distance between the estimated waveform and the template. Zibulevsky and Zeevi [10] described an ERP waveform extraction method using regularization and sparse decomposition and showed that it gave a more accurate estimate for the waveform than the MSE criterion. Motivated by this, we start with the same template as in the mismatch case and calculate the refined template according to [10]. This algorithm has two free parameters: the regularization and the number of iterations; the regularization parameter was set to 100, which does not affect the resulting waveform very much, as suggested in [10]. Then, we reestimate the latency and amplitude parameters using this new template. The results are summarized in the last column of Tables I–III. We notice that under most SNR conditions, the refined template gives a more accurate estimate for the latency, with smaller bias and variance. It also gives a marginally better estimate for the amplitude for positive SNRs, but the estimates become progressively worse for negative SNRs. This may be because the true waveform becomes more and more difficult to estimate as SNR decreases. At first glance, this may seem to contradict the results in [10]. However, the comparisons made here are between the same estimation methods that use different templates, while those in [10] were made between two different estimation methods. The estimate for the scalp projection is generally slightly worse than in the original mismatch case. Of course, all of this comes at the cost of more extensive computation and choice of parameters.

B. Results on Cognitive ERP Data

We assume that the entire ERP may be decomposed into several monophasic components with compact support. We will estimate their parameters (amplitude and latency) one by one, using the Gamma template as in the simulation study. The present study illustrates the application of the method for a single late potential component, and the Gamma template is not adapted. Its parameters are selected based on neurophysiological plausibility and are set as k = 5, θ = 6, corresponding to a rise time of 96 ms.

In reality, we do not know a priori exactly how many components there are in a single trial, nor do we know when they occur. However, we may be able to estimate these values from single-trial EEG data in the data analysis session. Fig. 2 shows the smoothed histogram of the time lags corresponding to the local minima of the cost function in (5) from 200 ms up to 1000 ms after stimulus onset, using the Parzen windowing pdf estimator [26] with a Gaussian kernel bandwidth of 4.2, selected according to Silverman’s rule [27], which is given by $h = 1.06\,\sigma N^{-0.2}$, where $N$ is the number of samples and $\sigma$ is the standard deviation of the data. The number of peaks does depend on the kernel size, but we found that a kernel size between 0.5h and 2h gives the same number of peaks in the smoothed histogram for this data. We can clearly see five distinct peaks after 300 ms of the stimulus onset. Assuming that the error in the latency estimation is equally biased and independent from trial to trial and since there are also about five local minima for each trial, we conjecture that these peaks correspond to the latencies of five distinct components. These components, which may have different origins, are likely to constitute the late positive potentials (LPPs). According to Codispoti et al. [28], the grand average of LPP is maximal around 400 to 500 ms after stimulus. We will concentrate on the component with a latency of 500 ms to exemplify the methodology. We search between 440 and 560 ms (which correspond to the two neighboring local minima) and set the component latency as the local minimum closest to 500 ms.
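The smoothing step can be sketched as follows: Silverman's rule gives the bandwidth, a Parzen (Gaussian kernel) estimate gives the smoothed histogram, and peaks are counted as strict local maxima on a grid. The sample values below are a deterministic, hypothetical stand-in for the time lags of the local minima; they are not the data of this study, and the function names are choices made for the example.

```python
import numpy as np

def silverman_bandwidth(samples):
    """Silverman's rule of thumb: h = 1.06 * sigma * N**(-0.2)."""
    samples = np.asarray(samples, dtype=float)
    return 1.06 * samples.std(ddof=1) * samples.size ** (-0.2)

def parzen_density(samples, grid, h):
    """Parzen window density estimate with a Gaussian kernel of bandwidth h."""
    z = (grid[:, None] - samples[None, :]) / h
    kernel_sum = np.exp(-0.5 * z ** 2).sum(axis=1)
    return kernel_sum / (samples.size * h * np.sqrt(2.0 * np.pi))

def count_peaks(density):
    """Number of strict local maxima in the interior of a sampled curve."""
    inner = density[1:-1]
    return int(np.sum((inner > density[:-2]) & (inner > density[2:])))

# Hypothetical time lags (ms): two clusters centred at 400 and 500 ms.
samples = np.concatenate([400 + np.linspace(-20, 20, 101),
                          500 + np.linspace(-20, 20, 101)])
h = silverman_bandwidth(samples)
grid = np.arange(300.0, 601.0)                # evaluation grid, 1 ms steps
density = parzen_density(samples, grid, h)
n_peaks = count_peaks(density)
```

Repeating the last two steps with bandwidths such as `0.5 * h` and `2 * h` is a simple way to check how sensitive the number of peaks is to the kernel size, in the spirit of the robustness check described above.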

Fig. 2. Smoothed histogram of the time lags corresponding to the local minima of the cost function in (5), using a Gaussian kernel bandwidth of 4.2, selected according to Silverman’s rule.

To avoid the influence of EEG outliers from unexpected artifacts, we reject those trials whose amplitude is three or more times that of the minimum one. This eliminates 14 trials from the total of 360 trials (rejection rate: 4%).

Fig. 3 shows the results of estimated latency, amplitude, and scalp projection of the ERP component from the ERP data for subject 1. The scalp projections were plotted using EEGLAB [29]. Each point in Fig. 3(b) stands for the average amplitude over six trials with the same index in the same phase (habituation or mixed). It is clear that for the habituation phase, the amplitude diminishes rapidly with the trial index, while for the mixed phase, the amplitude does not show significant decay. To make the figure more intuitive, we also include the best fit (in the least-squares sense) to the estimated amplitude for both the habituation and mixed phases. We fitted a straight line for the mixed phase, while an exponential curve was fitted to the estimated amplitude of the habituation phase. The fitted exponential curve for the habituation phase has a time constant of around 1.5 trials, which suggests that after three or four trials, the LPP amplitude decreases close to zero. We estimate the SNR for the mixed phase at around −4.1 dB.
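One simple way to obtain such least-squares fits is sketched below. The fitting procedure actually used is not specified in the text, so this is only an assumed approach: a pure exponential decay without offset, fitted by a grid search over the time constant with a closed-form amplitude, and a straight line fitted with `np.polyfit`. The amplitude values are synthetic and noise-free, chosen only to check the code.

```python
import numpy as np

def fit_exponential(n, y, taus):
    """Least-squares fit of y ~ A * exp(-n / tau) over a grid of time constants.

    For each candidate tau, the optimal amplitude A has a closed form;
    the pair with the smallest squared error is returned as (A, tau).
    """
    best_err, best_A, best_tau = np.inf, None, None
    for tau in taus:
        basis = np.exp(-n / tau)
        A = (basis @ y) / (basis @ basis)
        err = np.sum((y - A * basis) ** 2)
        if err < best_err:
            best_err, best_A, best_tau = err, A, tau
    return float(best_A), float(best_tau)

# Synthetic, noise-free amplitudes over 30 trial indices (hypothetical values).
n = np.arange(30.0)
y_habituation = 3.0 * np.exp(-n / 1.5)
y_mixed = np.full(30, 2.0)

A, tau = fit_exponential(n, y_habituation, np.arange(0.5, 5.01, 0.1))
slope, intercept = np.polyfit(n, y_mixed, 1)   # straight-line fit

# Fraction of the initial amplitude left after three trials, exp(-3 / tau).
remaining_after_3 = float(np.exp(-3.0 / tau))
```

With a time constant of 1.5 trials, about 14% of the initial amplitude remains after three trials and about 7% after four, which is consistent with the statement that the amplitude is close to zero by then.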

Fig. 3. Estimation results for subject 1. (a) Histogram of the estimated latency of the ERP component. (b) Estimated amplitude for mixed and habituation phase. (c) Estimated scalp projection. Each point in (b) stands for the average amplitude for six trials with the same index in the same phase (habituation or mixed).

Similar results were obtained with the ERP data from two other subjects (see Figs. 4 and 5), where the same rejection criterion was applied, leading to the rejection of 8 (2%) and 33 (9%) trials, respectively. The difference for subject 2 is that the scalp projection shifts its strength slightly toward the occipital area. It may be that the pictures shown to the three subjects caused some emotional bias. It is also possible that the SNR of the ERP data is too low to allow for a stable estimate of the scalp projection across subjects (note that in the habituation phase, the LPP amplitude decreases quickly with the trial index). The fitted exponential curves for the habituation phase for these two subjects have time constants of around 1.5 and 2.0 trials, respectively. The SNRs of the mixed phase for these two subjects are estimated to be around −7.6 and −2.4 dB, respectively.

Fig. 4.

Estimation results for subject 2. (a) Histogram of the estimated latency of the ERP component. (b) Estimated amplitude for mixed and habituation phase. (c) Estimated scalp projection. Each point in (b) stands for the average amplitude for six trials with the same index in the same phase (habituation or mixed).

Fig. 5.

Estimation results for subject 3. (a) Histogram of the estimated latency of the ERP component. (b) Estimated amplitude for mixed and habituation phase. (c) Estimated scalp projection. Each point in (b) stands for the average amplitude for six trials with the same index in the same phase (habituation or mixed).

V. Conclusion

We have introduced a new spatiotemporal filtering method for the problem of single-trial ERP estimation. Our method relies on explicit modeling of ERP components (not the full ERP waveform), and its output is limited to local descriptors (amplitude and latency) of these components. The reason that we model the ERP components instead of the full ERP waveform is to exploit the localization of the scalp projection for each single ERP component, which is impossible to do for the entire ERP. Indeed, note that the ensemble ERPs in different channels usually have different morphologies because there are multiple neural sources originating from different locations of the brain that give rise to different scalp projections. Since one spatial filter can effectively extract only one scalp projection, only a component-based analysis can utilize the spatial information in a meaningful way. Concentrating only on the latency and amplitude of each component, together with optimal spatial filtering, presents an alternative way of dealing with the negative SNR. Moreover, since these are, in fact, the features of importance in cognitive studies, the methodology has the same descriptive power as traditional approaches. The weak link of the method is that it requires an explicit template that is unknown a priori, and that it assumes no temporal overlap among components. The methodology as presented is based on least squares, but it can be further extended to robust estimation [31] for better results. Although the method implicitly uses template matching, we did not see drastic improvements with more sophisticated matching algorithms [10].

The proposed methodology can be seen as a generalization of Woody’s filter in the spatial domain for latency estimation, but it also obtains an explicit expression for amplitude estimation. By design, the method is especially suitable for extracting ERP features embedded in spontaneous EEG activity, in contrast with PCA and ICA, which work best for reliable (large) signals. Another distinction is that, unlike most methods based on PCA and ICA, our method explicitly utilizes the timing information as well as the spatial information.

Using simulated ERP data, we have shown that although the mismatch between the presumed and synthetic ERP components introduces a bias for both latency and amplitude estimation, the bias for the latency is relatively small and the estimated amplitudes are still comparable across experimental conditions for ERP data with an SNR higher than −20 dB. Furthermore, the mismatch of components has minimal influence on the estimation of the scalp projection. When applied to real cognitive ERP data, the method gives interpretable results, showing a diminishing LPP amplitude during the habituation phase for multiple subjects.

The use of a parametric template (Gamma function) provides the flexibility to change the shape and scale parameters continuously. Using a stochastic model formulation, the method can be extended to a noisy template model, and potentially, the two Gamma parameters can be extracted from the data also for best fit.

However, care must be taken not to over-interpret the results. A crucial requirement for amplitude estimation is a sufficiently high SNR (> −20 dB). This may not be satisfied in the habituation phase, where the amplitude of the LPP drops rapidly to a very small value. Our ability to accurately infer the template, which is selected heuristically from real data, deteriorates with decreasing SNR. As a rule of thumb, we would recommend against the use of the present method for data with an SNR of less than −10 dB.
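To put these thresholds in perspective, the short sketch below converts SNR values in decibels to signal-to-noise power ratios. It assumes the common power-ratio definition, SNR(dB) = 10 log10(P_signal / P_noise); the function names are hypothetical.

```python
import math


def db_to_power_ratio(snr_db):
    """Convert an SNR in dB to a signal-to-noise power ratio."""
    return 10.0 ** (snr_db / 10.0)


def power_ratio_to_db(ratio):
    """Convert a signal-to-noise power ratio to dB."""
    return 10.0 * math.log10(ratio)


# SNR values quoted in the text for the three subjects and the two thresholds.
for snr_db in (-2.4, -4.1, -7.6, -10.0, -20.0):
    ratio = db_to_power_ratio(snr_db)
    print(f"{snr_db:6.1f} dB -> signal power = {ratio:.3f} x noise power")
```

Under this definition, the −10 dB rule of thumb corresponds to a signal power of one tenth of the noise power, and the −20 dB limit to one hundredth.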

Acknowledgment

The authors would like to thank Dr. A. Keil for helpful discussions on the ERP analysis literature.

This work was supported in part by a Graduate Alumni Fellowship from the University of Florida and in part by the National Institute of Mental Health (NIMH) under Grant P50 MH072850-01.

Biographies

Ruijiang Li (S’05) was born in Shandong, China. He received the B.S. degree in electrical engineering from Zhejiang University, Hangzhou, China, in 2004. He is currently working toward the Ph.D. degree at the Computational Neuro Engineering Laboratory (CNEL), Department of Electrical and Computer Engineering, University of Florida, Gainesville.

His current research interests include signal processing, statistical estimation, and their applications in biomedical engineering.

Jose C. Principe (M’83–SM’90–F’00) received the B.S. degree from the University of Porto, Porto, Portugal, in 1972, and the M.Sc. and Ph.D. degrees from the University of Florida, Gainesville, in 1974 and 1979, respectively, all in electrical engineering.

He is currently a Distinguished Professor of Electrical and Biomedical Engineering at the University of Florida, where he teaches advanced signal processing and artificial neural networks (ANNs) modeling, and is a BellSouth Professor and the Founder and Director of its Computational Neuro Engineering Laboratory (CNEL). His current research interests include biomedical signal processing, in particular, the electroencephalogram (EEG) and the modeling and applications of adaptive systems. He is the author or coauthor of more than 129 papers published in refereed journals, 15 book chapters, and over 300 conference papers. He has directed more than 50 Ph.D. dissertations and 61 master’s degree theses.

Dr. Principe is the President of the International Neural Network Society and former Secretary of the Technical Committee on Neural Networks of the IEEE Signal Processing Society. He is an American Institute for Medical and Biological Engineering (AIMBE) Fellow and a recipient of the IEEE Engineering in Medicine and Biology Society Career Service Award. He is also a member of the Scientific Board of the Food and Drug Administration, and a member of the Advisory Board of the McKnight Brain Institute at the University of Florida.

Margaret Bradley, photograph and biography not available at the time of publication.

Vera Ferrari, photograph and biography not available at the time of publication.

Appendix

We justify the use of the time lag corresponding to the local minimum of $J(\tau)$ in (5) as the peak latency. Given the single-trial data matrix $X$, the peak latency of an ERP component coincides with the local minimum of $J(\tau)$ if the following conditions are satisfied:

  1. the presumed component $g_o$ and the actual component $s$ have the same morphology;

  2. $n_i^T g_o(\tau) = 0$, for $i = 1, \ldots, N$, and $\tau \in T_S$ (the signal and noise are uncorrelated); and

  3. $XX^T$ is full rank.

Proof: The optimal spatial filter is given by (4). We plug it into (3) and get:

$$J(\tau) = \left\| g_o(\tau)^T (C - I) \right\|_2^2 = g_o(\tau)^T \left( C^T C - C^T - C + I \right) g_o(\tau)$$

where $C = X^T (XX^T)^{-1} X$. Note that $C^T C = C$, so

$$J(\tau) = -g_o(\tau)^T C\, g_o(\tau) + g_o(\tau)^T g_o(\tau) = -\left[ g_o(\tau)^T s \right]^2 \left( a^T R^{-1} a \right) + g_o(\tau)^T g_o(\tau)$$

where $R = XX^T$ is a positive definite matrix independent of the time lag.

With the constraint that $g_o(\tau)^T g_o(\tau) = \text{const}$, the minimum of the cost function $J(\tau)$ is achieved when $g_o(\tau)^T s$ achieves its maximum, since $a^T R^{-1} a$ is positive. This happens when $\tau$ coincides with the peak latency $l$ of the actual ERP component $s$.
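The two identities used in the proof can be spelled out as follows. This is a sketch under a stated assumption: the single-trial data are taken to follow $X = a s^T + E$, where $a$ is the scalp projection (with the component amplitude absorbed into it) and $E$ is a noise matrix whose rows are the $n_i^T$ of condition 2. The symbol $E$ is introduced here only for illustration.

```latex
% Symmetry and idempotence of C, using R = X X^T = R^T:
C^T = \left( X^T R^{-1} X \right)^T = X^T R^{-T} X = C,
\qquad
C^T C = X^T R^{-1} \left( X X^T \right) R^{-1} X = X^T R^{-1} X = C .

% Condition 2 removes the noise term from X g_o(\tau):
X g_o(\tau) = \left( a s^T + E \right) g_o(\tau)
            = a \left[ s^T g_o(\tau) \right] + E\, g_o(\tau)
            = a \left[ s^T g_o(\tau) \right] .

% Substituting into the quadratic form:
g_o(\tau)^T C\, g_o(\tau)
  = \left[ X g_o(\tau) \right]^T R^{-1} \left[ X g_o(\tau) \right]
  = \left[ g_o(\tau)^T s \right]^2 \left( a^T R^{-1} a \right) .
```

The first line shows that $C$ is an orthogonal projector onto the row space of $X$, which is why the four-term expansion of $J(\tau)$ collapses to two terms.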

Footnotes

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Contributor Information

Ruijiang Li, Email: ruijiang@cnel.ufl.edu, Computational NeuroEngineering Laboratory, Electrical and Computer Engineering Department, University of Florida, Gainesville, FL 32611 USA.

Jose C. Principe, Email: principe@cnel.ufl.edu, Computational NeuroEngineering Laboratory, Electrical and Computer Engineering Department, University of Florida, Gainesville, FL 32611 USA.

Margaret Bradley, Email: bradley@ufl.edu, National Institute of Mental Health (NIMH) Center for the Study of Emotion and Attention, University of Florida, Gainesville, FL 32611 USA.

Vera Ferrari, Email: vferrari@ufl.edu, National Institute of Mental Health (NIMH) Center for the Study of Emotion and Attention, University of Florida, Gainesville, FL 32611 USA.

References

  • 1. Brazier MAB. Evoked responses recorded from the depths of the human brain. Ann. New York Academy Sci. 1964;vol. 112:33–59. doi: 10.1111/j.1749-6632.1964.tb26741.x.
  • 2. Bruin KJ, Kenemans JL, Verbaten MN, Van Der Heijden AH. Habituation: An event-related potential and dipole source analysis study. Int. J. Psychophysiol. 2000;vol. 36:199–209. doi: 10.1016/s0167-8760(99)00114-2.
  • 3. Woody CD. Characterization of an adaptive filter for the analysis of variable latency neuroelectric signals. Med. Biological Eng. Comput. 1967;vol. 5:539–553.
  • 4. Pham D, Mocks J, Kohler W, Gasser T. Variable latencies of noisy signals: Estimation and testing in brain potential data. Biometrika. 1987;vol. 74:525–533.
  • 5. Jaskowski P, Verleger R. Amplitudes and latencies of single-trial ERP’s estimated by a maximum-likelihood method. IEEE Trans. Biomed. Eng. 1999 Aug;vol. 46(no. 8):987–993. doi: 10.1109/10.775409.
  • 6. Parra L, Spence C, Gerson A, Sajda P. Recipes for the linear analysis of EEG. Neuroimage. 2005;vol. 28:326–341. doi: 10.1016/j.neuroimage.2005.05.032.
  • 7. Chapman R, McCrary J. EP component identification and measurement by principal components analysis. Brain Cogn. 1995;vol. 27(no. 3):288–310. doi: 10.1006/brcg.1995.1024. (review, Erratum in: Brain Cogn., vol. 28, no. 3, 342, 1995)
  • 8. Makeig S, Bell A, Jung T, Sejnowski T. Adv. in Neural Inform. Process. Syst. vol. 8. Cambridge, MA: MIT Press; 1996. Independent component analysis of electroencephalographic data; pp. 145–151.
  • 9. Tang A, Pearlmutter B, Malaszenko N, Phung D, Reeb B. Independent components of magnetoencephalography: Localization. Neural Comput. 2002;vol. 14(no. 8):1827–1858. doi: 10.1162/089976602760128036.
  • 10. Zibulevsky M, Zeevi YY. Extraction of a single source from multichannel data using sparse decomposition. Neurocomputing. 2002;vol. 49:163–173.
  • 11. Ramoser H, Mueller-Gerking J, Pfurtscheller G. Optimal spatial filtering of single trial EEG during imagined hand movement. IEEE Trans. Rehabil. Eng. 2000 Dec;vol. 8(no. 4):441–446. doi: 10.1109/86.895946.
  • 12. Parra L, Alvino C, Tang A, Pearlmutter B, Young N, Osman A, Sajda P. Linear spatial integration for single-trial detection in encephalography. NeuroImage. 2002;vol. 17:223–230. doi: 10.1006/nimg.2002.1212.
  • 13. Lemm S, Blankertz B, Curio G, Muller K-R. Spatio-spectral filters for improving the classification of single trial EEG. IEEE Trans. Biomed. Eng. 2005 Sep;vol. 52(no. 9):1541–1548. doi: 10.1109/TBME.2005.851521.
  • 14. Dyrholm M, Christoforou C, Parra LC. Bilinear discriminant component analysis. J. Mach. Learning Res. 2007;vol. 8:1097–1111.
  • 15. Model D, Zibulevsky M. Learning subject-specific spatial and temporal filters for single-trial EEG classification. NeuroImage. 2006;vol. 32(no. 4):1631–1641. doi: 10.1016/j.neuroimage.2006.04.224.
  • 16. Caspers H, Speckmann EJ, Lehmenkühler A. Electrogenesis of cortical dc potentials. In: Kornhuber HH, Deecke L, editors. Progress in Brain Research (Motivation, Motor and Sensory Processes of the Brain: Electrical Potentials, Behavior and Clinical use 3-15) The Netherlands: Elsevier: Amsterdam; 1980.
  • 17. Sams M, Alho K, Näätänen R. Short-term habituation and dishabituation of the mismatch negativity of the ERP. Psychophysiology. 1984;vol. 21:434–441. doi: 10.1111/j.1469-8986.1984.tb00223.x.
  • 18. Reilly J. Applied Bioelectricity. New York: Springer-Verlag; 1992.
  • 19. Parra L, Spence C, Gerson A, Sajda P. Recipes for the linear analysis of EEG. Neuroimage. 2005;vol. 28:326–341. doi: 10.1016/j.neuroimage.2005.05.032.
  • 20. Gratton G, Coles MGH, Donchin E. A new method for offline removal of ocular artifact. Electroencephalogr. Clin. Neurophysiol. 1983;vol. 55:468–484. doi: 10.1016/0013-4694(83)90135-9.
  • 21. Lange D, Pratt H, Inbar G. Modeling and estimation of single evoked brain potential components. IEEE Trans. Biomed. Eng. 1997 Sep;vol. 44:791–799. doi: 10.1109/10.623048.
  • 22. Patterson RD, Robinson K, Holdsworth J, McKeown D, Zhang C, Allerhand MH. Complex sounds and auditory images. In: Cazals Y, Demany L, Horner K, editors. Auditory Physiology and Perception. Oxford, U.K.: Pergamon; 1992. pp. 429–446.
  • 23. Koch C, Poggio T, Torre V. Nonlinear interactions in a dendritic tree: Localization, timing, and role in information processing. Proc. Natl. Acad Sci. USA. 1983;vol. 80:2799–2802. doi: 10.1073/pnas.80.9.2799.
  • 24. Li R, Principe JC. Proc. IEEE EMBS Conf. Vancouver, CA: 2006. Blinking artifact removal in cognitive EEG data using ICA; pp. 6273–6278.
  • 25. Donchin E, Spencer KM, Dien J. The varieties of deviant experience: ERP manifestations of deviance processors. In: van Boxtel GJM, Bocker KBE, editors. Brain and Behavior: Past, Present, and Future. Tilburg: Tilburg Univ. Press; 1997. pp. 67–91.
  • 26. Parzen E. On estimation of a probability density function and mode. Ann. Math. Statist. 1962;vol. 33:1065–1076.
  • 27. Silverman BW. Density Estimation for Statistics and Data Analysis. London, U.K.: Chapman & Hall; 1986.
  • 28. Codispoti M, Ferrari V, Bradley M. Repetitive picture processing: Autonomic and cortical correlates. Brain Res. 2006;vol. 1068:213–220. doi: 10.1016/j.brainres.2005.11.009.
  • 29. Delorme A, Makeig S. EEGLAB: An open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J. Neurosci. Methods. 2004;vol. 134(no. 1):9–21. doi: 10.1016/j.jneumeth.2003.10.009.
  • 30. Freeman W. Mass Activation in the Nervous System. New York: Academic; 1975.
  • 31. Li R, Principe JC, Bradley M, Ferrari V. Robust single-trial ERP estimation based on spatiotemporal filtering; Proc. IEEE EMBS Conf; 2007. pp. 5206–5209.
