Author manuscript; available in PMC: 2023 Jan 22.
Published in final edited form as: Annu Int Conf IEEE Eng Med Biol Soc. 2022 Jul;2022:4888–4891. doi: 10.1109/EMBC48229.2022.9871447

Elimination of pseudo-HFOs in iEEG using sparse representation and Random Forest classifier

Behrang Fazli Besheli 1, Zhiyi Sha 2, Thomas Henry 3, Jay R Gavvala 4, Candan Gürses 5, Sacit Karamürsel 6, Nuri F Ince 7
PMCID: PMC9867883  NIHMSID: NIHMS1865634  PMID: 36086345

Abstract

High-Frequency Oscillation (HFO) is a promising biomarker of the epileptogenic zone. However, sharp artifacts might easily pass the conventional HFO detectors as real HFOs and reduce the accuracy of seizure onset zone (SOZ) localization. We hypothesize that, unlike pseudo-HFOs, which originate from artifacts with sharp changes or arbitrary waveform characteristics, real HFOs can be represented by a limited number of oscillatory waveforms. Accordingly, to distinguish true HFOs from pseudo-HFOs, we established a new classification method based on the sparse representation of candidate events that passed an initial detector with high sensitivity but low specificity. Specifically, using Orthogonal Matching Pursuit (OMP) and a redundant Gabor dictionary, each event was represented sparsely in an iterative fashion. The approximation errors estimated over 30 iterations were concatenated to form a 30-dimensional feature vector, which was fed to a random forest classifier. Based on the selected dictionary elements, our method can further classify HFOs into Ripples (R) and Fast Ripples (FR). In this scheme, two experts visually inspected 2075 events captured in iEEG recordings from 5 different subjects and labeled them as true HFO or pseudo-HFO. We reached 90.22% classification accuracy on labeled events and a 21.16% improvement in SOZ localization compared to the conventional amplitude-threshold-based detector. Our sparse representation framework also classified the detected HFOs into R and FR subcategories. We reached 91.24% SOZ accuracy with the detected R+FR events.

I. Introduction

Epilepsy is one of the most common neurological disorders associated with abrupt seizures [1]. While medication is widely used to control seizures, 30% of patients do not respond successfully to anti-seizure medication therapy. In these drug-resistant cases, surgery involving the resection of seizure-generating brain tissue becomes a viable solution [2]. Prolonged intracranial EEG (iEEG) monitoring is generally used to identify the seizure onset zone (SOZ) and guide the surgical therapy.

During the past years, high-frequency oscillations (HFOs) of iEEG have been accepted as a promising biomarker of the SOZ [3]. HFOs are low-amplitude transients that last around tens of milliseconds with frequencies above 80 Hz [4]. Based on their frequency content, HFOs are further divided into Ripples (R: 80–250 Hz) and Fast Ripples (FR: 250–600 Hz). Recent work suggests that events with both R and FR are more specific to the SOZ [5]. Various methods [6] have been established to find these low-amplitude transients in long-term iEEG recordings. The conventional algorithms [6] filter the iEEG above 80 Hz using high-pass or bandpass filters and use a set of constraints, such as the duration and the number of zero crossings, to reject events that originate from noise with an arbitrary waveform or from sharp, abrupt changes in the signal. Such events are called pseudo-HFOs and are described as high-frequency signals of non-physiological or artifactual origin. Pseudo-HFOs can easily pass conventional detectors and contaminate the pool of detected events [7, 8, 9], which causes misinterpretation of the spatial distribution of HFOs across channels. Consequently, such events lead to erroneous localization of the SOZ, particularly when the rate of pseudo-HFOs is high. Newer methods [10, 11] use a second analysis step to remove pseudo-HFOs from the pool of initially detected events. In contrast, our method is built on a simple hypothesis and is not biased toward a specific type of noise or a specific set of time-frequency features [8].

This study hypothesized that pseudo-HFO events, which generally have arbitrary waveform characteristics or sharp transients, cannot be represented in a sparse fashion as a combination of a limited number of oscillatory waveforms. In this scheme, we formed a redundant analytical dictionary built from Gabor atoms and applied Orthogonal Matching Pursuit (OMP) [12, 13] in conjunction with this redundant oscillatory dictionary to represent the events that passed the initial amplitude-threshold-based detector. We observed that, unlike real HFOs, which can be represented efficiently using a limited number of dictionary elements, pseudo-HFOs could not be represented properly due to their random nature or sharp characteristics. We compared our method to a previously established detector based on time-frequency analysis of candidate events [10] and showed that our classification results are superior.

II. Materials and methods

A. Data Acquisition

This study was approved by the Institutional Review Board (IRB) of the University of Houston. The iEEG recordings were obtained from five patients with refractory epilepsy who were monitored in the epilepsy monitoring unit (EMU) at the hospital of the University of Minnesota (MN, USA), Istanbul University (Istanbul, Turkey), or Baylor College of Medicine (TX, USA). All recordings were sampled at 2 kHz or higher. The study protocol was approved by the institutional review boards at each site. An epileptologist reviewed the prolonged iEEG recordings of each subject and identified the SOZ channels. For predicting the SOZ from the distribution of HFO events, 30 minutes of interictal iEEG recordings were randomly selected in each patient, without any preprocessing or channel selection. That is, all recorded channels were used in the analysis, and no segments with artifacts were eliminated from the 30-minute recording.

B. Signal Processing

The schematic diagram of the proposed method is shown in Fig. 1. In the first stage (Fig. 1A), after converting the raw iEEG data into bipolar derivation, we captured candidate events using an amplitude-threshold-based detector. We formed a pool of events, including real HFOs and pseudo-HFOs, where each event was 512 samples long. Using OMP and a redundant dictionary built from oscillatory atoms, we reconstructed each event and fed the approximation error of the reconstructed event at each iteration to a random forest (RF) classifier [14]. Finally, the classifier removed the pseudo-HFOs from the pool of events. In the second stage (Fig. 1B), once the classifier assigned an event to the real-HFO category, the event was further categorized as R, FR, or R+FR based on the frequency index and energy of the atoms selected in the reconstruction stage. All the processes were developed in MATLAB 2019b (MathWorks, MA, USA). In the following sections, we describe each step in more detail:

Figure 1.


Schematic diagram of the proposed method. (A) Initial events are extracted by the amplitude-threshold-based detector. All events go into the OMP reconstruction step and are represented by Gabor oscillatory atoms. The approximation error (εi) is computed from the residual (Ri) waveform. The approximation error of the OMP reconstruction serves as a feature for the random forest classifier. Finally, the random forest classifier separates the initial event pool into real HFOs and noise. (B) Starting from the result of the previous step, HFOs are further classified into ripple (R), fast ripple (FR), and ripple+fast ripple (R+FR). All atoms with frequencies above 80 Hz that were used for the reconstruction of an event are selected, and the waveform they form is passed to the amplitude-threshold-based detector. If the detector accepts this waveform, the event is classified as R, FR, or R+FR based on the frequencies of the selected atoms.

1). Initial Detection

We used the previously established amplitude-threshold-based detector [10], which has high sensitivity but low specificity, to capture HFO candidates from the raw iEEG recordings (Fig. 2). The detector employed two 64th-order FIR bandpass filters covering the R and FR ranges. In the next step, the standard deviation of the bandpass-filtered signals was computed in a 100 ms sliding window with 50% overlap. The amplitude threshold was set to three times the median of the standard deviation time series. Events that crossed the threshold at least four times were placed into the event pool. Finally, in each subject, two experts labeled at least 200 real HFOs and 200 pseudo-HFOs that passed the initial amplitude-threshold-based detector (2075 in total). These events were used to train an RF classifier model using the leave-one-subject-out method, and the trained model was used to classify all unlabeled events.
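The thresholding rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the detector of [10]: the input is assumed to be already bandpass filtered, a crossing is counted as an upward crossing of the rectified signal within one analysis window, and the function names (`sliding_std`, `count_crossings`, `detect_candidates`) are hypothetical.

```python
import statistics

def sliding_std(x, win, hop):
    """Standard deviation of x in windows of `win` samples taken every `hop` samples."""
    return [statistics.pstdev(x[s:s + win])
            for s in range(0, len(x) - win + 1, hop)]

def count_crossings(segment, thr):
    """Count upward crossings of `thr` by the rectified segment (assumed definition)."""
    above = [abs(v) >= thr for v in segment]
    return sum(1 for a, b in zip(above, above[1:]) if b and not a)

def detect_candidates(filtered, fs, min_crossings=4):
    """Return start indices of windows that cross the threshold often enough."""
    win = int(round(0.100 * fs))        # 100 ms sliding window
    hop = win // 2                      # 50% overlap
    sd = sliding_std(filtered, win, hop)
    thr = 3.0 * statistics.median(sd)   # three times the median of the SD series
    starts = range(0, len(filtered) - win + 1, hop)
    return [s for s in starts
            if count_crossings(filtered[s:s + win], thr) >= min_crossings]
```

Because the threshold is derived from the median of the standard deviation series, a few high-amplitude windows do not raise it much, which is consistent with the high sensitivity and low specificity described for this stage.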

Figure 2.


Noise in iEEG data. The sharp artifacts in channel LPF 7–8 (shown by red arrows) mislead the amplitude-threshold-based detector and might cause some sharp artifacts to be detected as HFOs.

2). Event Representation and Classification

To represent the HFO events, we first designed an analytical redundant Gabor dictionary. The Gabor atoms are defined as the product of a Gaussian window and a cosine basis:

$$g_{u,\sigma}(t)=\frac{1}{\sigma}\,e^{-\frac{(t-u)^{2}}{2\sigma^{2}}}\cos\!\left(\frac{2\pi\omega}{N}(t-u)\right) \qquad (1)$$

which is defined by σ, the time spread of the Gaussian window; u, the shift in time; and ω, the frequency of the cosine basis. We tuned these parameters to obtain full coverage of the time and frequency spaces. The frequency starts from zero and goes up to 600 Hz, and the windowing is designed to have different scales based on the dyadic sequence (size = $512 \cdot 2^{-j}$, j = 0, …, 4). To ensure that the dictionary has more localized atoms at higher frequencies, the number of shifts in time also decreases as the frequency increases. Fig. 1 shows some of these atoms from the dictionary. Assume that $y \in \mathbb{R}^{m \times 1}$ represents a candidate HFO event. The sparse representation of y using a dictionary $D \in \mathbb{R}^{m \times n}$, whose columns are the atoms, can be formulated as the optimization problem

$$\min_{\alpha}\ \|\alpha\|_{0} \quad \text{s.t.} \quad y = D\alpha \qquad (2)$$

that minimizes the number of nonzero coefficients in α, where $\|\cdot\|_0$ denotes the $\ell_0$-norm and $\alpha \in \mathbb{R}^{n \times 1}$ holds the coefficients of all atoms used in the representation. Although it is not guaranteed to find the optimal solution, we used the greedy OMP method [13] to represent the events in a sparse fashion using the oscillatory atoms in D. Fig. 3A shows the matrix formulation of the sparse representation. Fig. 3C and D show the OMP representation of one real HFO and one pseudo-HFO that passed the amplitude-threshold-based detector. Furthermore, the atoms selected in each iteration and the residual signal after the 15th iteration are presented in a matrix format. The residual waveform $r_i$ of event y was computed during the OMP process at the ith iteration:

$$r_i = y - \sum_{k=1}^{i} a_k d_k \qquad (3)$$

where $d_k$ is the atom selected at iteration k and $a_k$ is the corresponding coefficient. The approximation error $\varepsilon_i$ is then computed as the ratio of the energy of the residual to the energy of the event y:

$$\varepsilon_i = \frac{\|r_i\|^{2}}{\|y\|^{2}}. \qquad (4)$$

We fed $\varepsilon = [\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_{30}]$ as a feature vector to an RF classifier. We used a maximum of 30 iterations because, by that point, the mean approximation error for real HFOs had fallen below 5% (Fig. 3B) and all HFO components could be represented properly. The shaded plot of the $\varepsilon_i$ features over iterations clearly shows that the real HFO (Fig. 3C) and the pseudo-HFO (Fig. 3D) behave differently. The real-HFO events had noticeably lower approximation error over iterations than the pseudo-HFO events.
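Equations (1), (3) and (4) can be tied together in a short standard-library sketch. It is an illustration under stated assumptions rather than the implementation used in the study (which was written in MATLAB): atoms are rescaled to unit norm, so the leading scale factor of Eq. (1) drops out; N is taken to be the event length; the dictionary grid in `build_dictionary` is a small hypothetical one; and the least-squares step of OMP is carried out by Gram-Schmidt orthogonalization of the selected atoms. All function names are hypothetical.

```python
import math

def gabor_atom(n, u, sigma, omega):
    """Gabor atom of Eq. (1), sampled at t = 0..n-1 and rescaled to unit norm."""
    g = [math.exp(-((t - u) ** 2) / (2.0 * sigma ** 2))
         * math.cos(2.0 * math.pi * omega * (t - u) / n) for t in range(n)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]

def build_dictionary(n, sigmas, omegas, n_shifts):
    """Small illustrative grid over (sigma, omega, u); not the grid used in the study."""
    centres = [(s + 0.5) * n / n_shifts for s in range(n_shifts)]
    return [gabor_atom(n, u, sigma, omega)
            for sigma in sigmas for omega in omegas for u in centres]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def omp_errors(y, atoms, n_iter):
    """Greedy OMP; returns (selected atom indices, approximation error per iteration)."""
    energy = dot(y, y)
    if energy == 0.0:
        raise ValueError("event has zero energy")
    residual = list(y)
    basis, chosen, errors = [], [], []
    for _ in range(n_iter):
        # Select the unused atom most correlated with the current residual.
        k = max((j for j in range(len(atoms)) if j not in chosen),
                key=lambda j: abs(dot(residual, atoms[j])))
        # Orthogonalize it against the atoms already selected (Gram-Schmidt).
        q = list(atoms[k])
        for b in basis:
            c = dot(q, b)
            q = [qi - c * bi for qi, bi in zip(q, b)]
        q_norm = math.sqrt(dot(q, q))
        if q_norm < 1e-12:          # atom adds no new direction; stop early
            break
        q = [v / q_norm for v in q]
        # Removing the component along q is equivalent to re-fitting all
        # coefficients of the selected atoms by least squares.
        c = dot(residual, q)
        residual = [r - c * v for r, v in zip(residual, q)]
        basis.append(q)
        chosen.append(k)
        errors.append(dot(residual, residual) / energy)   # Eq. (4)
    return chosen, errors

# Toy comparison: an oscillatory event built from two atoms versus a sharp spike.
atoms = build_dictionary(128, [8.0, 16.0, 32.0], range(0, 40, 4), 4)
event = [2.0 * a + 0.5 * b for a, b in zip(atoms[17], atoms[70])]
spike = [1.0 if t == 64 else 0.0 for t in range(128)]
_, eps_event = omp_errors(event, atoms, 5)
_, eps_spike = omp_errors(spike, atoms, 5)
```

In this toy setting, `eps_event` and `eps_spike` play the role of the feature vector ε, truncated to five iterations instead of 30.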

Figure 3.


(A) A matrix-form description of the sparse representation using Di atoms. (B) The shaded plot of approximation error over iterations for labeled events. The blue line shows the approximation error of noise events, while the red line shows the approximation error of the HFO events over iterations. (C, D) OMP reconstruction of one HFO example and one noise example, with the corresponding residual waveforms after 15 iterations. The noise example leaves a large error in the residual waveform, whereas the HFO is reconstructed properly with minimal error in the residual waveform.

3). HFO categorization

The detected HFOs are further classified into the R, FR, and R+FR categories based on the dictionary atoms selected in the sparse representation process. Specifically, we reconstructed the high-frequency part of each HFO from those atoms in the sparse reconstruction with frequencies higher than 80 Hz (Fig. 1B). We then fed the reconstructed high-frequency part to a detector that counts the number of threshold crossings. The threshold is the same one that was learned in the first stage from the background iEEG signal. If the reconstructed waveform had atoms between 80 and 250 Hz and passed the detector (two consecutive oscillations), we marked the event as R. Similarly, if it had atoms above 250 Hz and passed the detector, we marked the event as FR. Finally, if the reconstructed high-frequency signal had components in both frequency ranges, we marked the event as R+FR.
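The labeling rule above can be written as a small function. This is a sketch under assumptions: the hypothetical `categorize_hfo` takes the center frequencies (in Hz) of the atoms selected by OMP and a flag saying whether the reconstructed high-frequency waveform passed the detector; an atom at exactly 250 Hz is counted as R; 600 Hz is used as the upper limit because the dictionary stops there; and atom energy, which the method also considers, is ignored here.

```python
def categorize_hfo(atom_freqs_hz, passed_detector):
    """Label a real HFO from the center frequencies (Hz) of its selected atoms."""
    if not passed_detector:
        return None                      # high-frequency part rejected by the detector
    has_r = any(80.0 <= f <= 250.0 for f in atom_freqs_hz)
    has_fr = any(250.0 < f <= 600.0 for f in atom_freqs_hz)
    if has_r and has_fr:
        return "R+FR"
    if has_fr:
        return "FR"
    if has_r:
        return "R"
    return None                          # no selected atom lies in the HFO range
```

For example, an event whose selected atoms sit at 12, 120 and 400 Hz would be labeled R+FR, provided the reconstructed waveform passes the detector.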

4). Validation of method using SOZ localization

After classifying the labeled events and establishing the RF model, we validated the method over 30 minutes of iEEG data. We calculated the SOZ identification accuracy as the ratio of the number of events inside the SOZ to the total number of events. We repeated the same process before and after removing the pseudo-HFOs. We further computed the SOZ accuracy for the HFO events categorized as R, FR, and R+FR and applied a statistical test to quantify the difference in accuracy between the groups.
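The accuracy measure defined above is a simple ratio, sketched below. The function name `soz_accuracy` and the channel labels in the example are hypothetical and are not taken from the recordings.

```python
def soz_accuracy(event_channels, soz_channels):
    """Fraction of detected events whose channel lies inside the SOZ."""
    if not event_channels:
        raise ValueError("no events detected")
    soz = set(soz_channels)
    inside = sum(1 for ch in event_channels if ch in soz)
    return inside / len(event_channels)

# Hypothetical example: one channel label per detected event.
before = ["A1", "A1", "B2", "B2", "B2", "C3"]   # initial detector output
after = ["A1", "A1", "B2", "C3"]                # two events on B2 rejected as pseudo-HFOs
acc_before = soz_accuracy(before, ["A1"])
acc_after = soz_accuracy(after, ["A1"])
```

Removing events from non-SOZ channels shrinks the denominator while leaving the numerator unchanged, which is how rejecting pseudo-HFOs can raise this measure.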

III. Results

Fig. 4A shows the number of selected features at each iteration, and Fig. 4B shows the features selected by the RF and the corresponding thresholds in 2D. We noted that the RF mainly used the approximation error around iteration 20 (Fig. 4A), with a split threshold around 0.12 (Fig. 4B). The classification results estimated with the leave-one-subject-out cross-validation method are provided as confusion matrices in Fig. 4C and D for our method and for [10], respectively. Compared to the previously established method, which uses a Gaussian mixture model and time-frequency features to detect the pseudo-HFOs and remove them from the pool of events [10], the improvement in HFO detection accuracy was 8.1% (proposed method: 90.22%; [10]: 82.12%).
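The leave-one-subject-out scheme keeps all events of one subject out of training and tests on them, repeating this for every subject. A minimal sketch of how such folds can be generated is given below; the generator name `leave_one_subject_out` and the subject labels are hypothetical, and classifier training itself is not shown.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

# One subject label per labeled event (hypothetical).
subjects = ["P1", "P1", "P2", "P3", "P3"]
folds = list(leave_one_subject_out(subjects))
```

Splitting by subject rather than by event prevents events from the same patient from appearing in both the training and the test set.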

Figure 4.


(A) Features selected in the first three splits of the random forest classifier, showing that the representation of events is sparse around iteration 20. (B) Selected features and their corresponding thresholds in the first three splits of the random forest, showing that a suitable threshold is around 0.12. (C) Confusion matrix of the proposed method. (D) Confusion matrix of [10].

We validated the SOZ identification accuracy of our method on 30 minutes of iEEG data. We first applied the amplitude-threshold-based detector and computed the SOZ localization accuracy without eliminating the pseudo-HFO events. We then rejected pseudo-HFO events by applying the sparse representation and the RF classifier learned from the training data. As shown in Fig. 5A, the spatial distribution of initially detected events (P3) over channels pointed to a wrong channel (LeftPF6–7). After applying our method, however, the pseudo-HFOs were removed from this channel and the SOZ localization accuracy improved. Fig. 5B shows the SOZ accuracy of the amplitude-threshold-based detector (initial detector) compared to the proposed method. The mean SOZ accuracy of the initial detector was 50.98%, while the mean accuracy of our method was 72.14% (paired t-test p-value: 0.0095), and the improvement was significant across subjects (Fig. 5C). Finally, we categorized the detected HFOs into R, FR, and R+FR. As shown in Fig. 5D and Fig. 5E, the difference in SOZ accuracy between R (mean accuracy: 71.88%) and R+FR (mean accuracy: 91.24%) was significant (paired t-test p-value: 0.023). However, there was no significant difference between FR (mean accuracy: 80.97%) and R+FR, and no significant difference between R and FR.

Figure 5.


(A) Spatial distribution of HFOs (P3). SOZ channels are marked with a red arrow, and channel labels (abbreviated) are noted beneath the subplot. (B) SOZ accuracy across subjects before and after removing pseudo-HFOs. (C) Box plot of SOZ accuracy. (D) SOZ accuracy of HFOs that were further classified into ripple (R), fast ripple (FR), and ripple+fast ripple (R+FR). (E) Box plot of the SOZ accuracy of each category (R, FR, R+FR).

IV. Discussion

This study aims to remove pseudo-HFOs from the pool of events using a sparse signal representation strategy. We showed that a sparse representation of HFOs with a redundant analytical dictionary and a maximum of 30 iterations could find the real high-frequency components and further categorize HFOs into R, FR, and R+FR at iteration 30. We showed that the quality of reconstruction, quantified by the approximation error, is an informative feature for distinguishing between real and pseudo-HFOs. This sparse feature might be critical, as it translated into improved SOZ identification under leave-one-subject-out validation. This study also confirms that R+FR might be a better biomarker for SOZ identification and could reach more than 90% accuracy in SOZ localization. Finally, in the future, our method could be a useful tool for the analysis of iEEG data recorded intraoperatively or in the EMU, which are usually distorted by noise and artifacts originating from nearby hardware, and could help to identify SOZ-specific HFO patterns.

Clinical Relevance—

This sparse representation framework establishes a new approach to distinguish real from pseudo-HFOs in prolonged iEEG recordings. It also provides reliable SOZ identification without the selection of artifact-free segments.

Acknowledgments

This study was supported by National Institutes of Health—National Institute of Neurological Disorders and Stroke (Grants R01NS112497 and 1UH3NS117944-01A1).

Contributor Information

Behrang Fazli Besheli, Department of Biomedical Engineering, University of Houston, TX.

Zhiyi Sha, University of Minnesota, Minnesota, MN, USA.

Thomas Henry, University of Minnesota, Minnesota, MN, USA.

Jay R Gavvala, Baylor College of Medicine, Houston, TX USA.

Candan Gürses, Koc University, Istanbul, Turkey.

Sacit Karamürsel, Koc University, Istanbul, Turkey.

Nuri F. Ince, Department of Biomedical Engineering, University of Houston, TX.

References

  • [1] Sander JW and Shorvon SD, “Epidemiology of the epilepsies,” Journal of Neurology, Neurosurgery and Psychiatry, vol. 61, no. 5, pp. 433–443, 1996.
  • [2] Habibi M, “Refractory Epilepsy,” U.S. Pharmacist, vol. 34, no. 3, pp. 8–14, 2009.
  • [3] Engel J, Pitkänen A, Loeb JA, Dudek FE, Bertram EH, Cole AJ, Moshé SL, Wiebe S, Jensen FE, Mody I, Nehlig A, and Vezzani A, “Epilepsy Biomarkers,” Epilepsia, vol. 54, pp. 61–69, 2013.
  • [4] Worrell GA, Jerbi K, Kobayashi K, Lina JM, Zelmann R, and Le Van Quyen M, “Recording and analysis techniques for high-frequency oscillations,” Progress in Neurobiology, vol. 98, no. 3, pp. 265–278, 2012.
  • [5] Fedele Tommaso, Burnos Sergey, Boran Ece, Krayenbühl Niklaus, Hilfiker Peter, Grunwald Thomas, and Sarnthein Johannes, “Resection of high frequency oscillations predicts seizure outcome in the individual patient,” Sci Rep, vol. 7, 2017.
  • [6] Remakanthakurup Sindhu K, Staba R, and Lopour BA, “Trends in the use of automated algorithms for the detection of high-frequency oscillations associated with human epilepsy,” Epilepsia, vol. 61, no. 8, pp. 1553–1569, 2020.
  • [7] Lee S, Issa NP, Rose S, Tao JX, Warnke PC, Towle VL, van Drongelen W, and Wu S, “DC shifts, high frequency oscillations, ripples and fast ripples in relation to the seizure onset zone,” Seizure, vol. 77, pp. 52–58, 2020.
  • [8] Bénar CG, Chauvière L, Bartolomei F, and Wendling F, “Pitfalls of high-pass filtering for detecting epileptic oscillations: a technical note on “false” ripples,” Clinical Neurophysiology, vol. 121, no. 3, pp. 301–310, 2010.
  • [9] Thomschewski A, Hincapié AS, and Frauscher B, “Localization of the Epileptogenic Zone Using High Frequency Oscillations,” Frontiers in Neurology, vol. 10, no. 94, 2019.
  • [10] Liu Su, Sha Zhiyi, Sencer Altay, Aydoseli Aydin, Bebek Nerse, Abosch Aviva, Henry Thomas, Gurses Candan, and Ince Nuri Firat, “Exploring the time-frequency content of high frequency oscillations for automated identification of seizure onset zone in epilepsy,” J Neural Eng, 2016.
  • [11] Gliske SV, Qin Z, Lau K, Alvarado-Rojas C, Salami P, Zelmann R, and Stacey WC, “Distinguishing false and true positive detections of high frequency oscillations,” Journal of Neural Engineering, vol. 17, no. 5, p. 056005, 2020.
  • [12] Mallat SG and Zhang Z, “Matching Pursuits With Time-Frequency Dictionaries,” IEEE Transactions on Signal Processing, vol. 41, no. 12, pp. 3397–3415, 1993.
  • [13] Pati YC, Rezaiifar R, and Krishnaprasad PS, “Orthogonal Matching Pursuit: Recursive function approximation with applications to wavelet decomposition,” Proceedings of 27th Asilomar Conference on Signals, Systems and Computers, vol. 1, pp. 40–44, 1993.
  • [14] Breiman L, “Random Forests,” Machine Learning, vol. 45, pp. 5–32, 2001.
