Abstract
Objective
High-frequency band (HFB) activity, measured using implanted sensors over the cortex, is increasingly considered as a feature for the study of brain function and the design of neural-implants, such as Brain-Computer Interfaces (BCIs). One common way of extracting these power signals is using a wavelet dictionary, which involves the selection of different temporal sampling and temporal smoothing parameters, such that the resulting HFB signal best represents the temporal features of the neuronal event of interest. Typically, the use of neuro-electrical signals for closed-loop BCI control requires a certain level of signal downsampling and smoothing in order to remove uncorrelated noise, optimize performance and provide fast feedback. However, a fixed setting of the sampling and smoothing parameters may lead to a suboptimal representation of the underlying neural responses and poor BCI control. This problem can be resolved with a systematic assessment of parameter settings.
Approach
With classification of HFB power responses as performance measure, different combinations of temporal sampling and temporal smoothing values were applied to data from sensory and motor tasks recorded with high-density and standard clinical electrocorticography (ECoG) grids in 12 epilepsy patients.
Main results
The results suggest that HFB ECoG responses are best extracted with high sampling and subsequent smoothing. For the paradigms used in this study, optimal temporal sampling ranged from 29 Hz to 50 Hz. Regarding optimal smoothing, values were similar between tasks (0.1-0.9 s), except for executed complex hand gestures, for which two possible optimal smoothing windows were found (0.4-0.6 s and 0.9-2.7 s).
Significance
The range of optimal values indicates that parameter optimization depends on the functional paradigm and may be subject-specific. Our results advocate a methodical assessment of parameter settings for optimal decodability of ECoG signals.
Keywords: Electrocorticography, Brain-Computer Interface, Sensorimotor cortex, High-frequency band, Wavelets, Temporal sampling, Temporal smoothing
1. Introduction
The brain’s spontaneous electrical activity is a complex combination of multiple oscillations and other broadband non-oscillatory phenomena with independent frequency bands but with complementary underlying mechanisms (Crone et al., 1999; Miller et al., 2009b). Several recording techniques have been used to capture these signals, such as electroencephalography (EEG), magnetoencephalography (MEG) and electrocorticography (ECoG). From these signals, researchers try to extract information about neuronal mechanisms underlying perception and behavior. ECoG has recently received increasing interest, since it provides a direct measure of neuronal activity in the human brain and benefits from a very high spatial and temporal signal resolution, permitting, in particular, the capture of higher-frequency changes in the signal (Crone et al., 1999; Leuthardt et al., 2004). These features are especially valuable in the design of neural-implants, such as Brain-Computer Interfaces (BCI) (e.g., Hochberg et al., 2012; Vansteensel et al., 2016). In the last decades, the study of brain function in this field has relied mostly on the interpretation of this higher-frequency spectral content (>50 Hz, also referred to as high-frequency band, HFB, power) of the brain signals (Crone et al., 1999; Miller et al., 2007, 2009a), since it is associated with a time-locked neuronal response to discrete stimuli or motor actions (Buzsáki and Wang, 2012; Crone et al., 1999; Hermes et al., 2012). Evoked neuronal responses expressed as transient changes in HFB power have been associated with synaptic mechanisms and firing rates of neurons (Buzsáki and Wang, 2012; Miller et al., 2009b), but the detailed relationship between HFB signals and their underlying neuronal mechanisms is not yet fully understood.
Considering that electrical brain signals are composed of task-evoked neuronal events with certain temporal features (actions or responses to cues) and electrophysiological noise, produced by uncontrolled neuronal processes, the matter of separating these two is key to extracting information about brain function. The extraction of HFB power from the raw signal, as a proxy for these neural events, can be performed with one of several methods, including Fourier, Hilbert or Wavelet transform. Even though these three methods are mathematically equivalent (Bruns, 2004), wavelet-based signal analysis is a commonly used approach for power extraction, particularly in paradigms where the signals are expected to show dynamics on a short temporal scale (Bruns, 2004). The usual computational implementation of this method (and of the above mathematically equivalent approaches) typically requires the specification of at least two parameters: the temporal sampling of the extracted power signal and the temporal smoothing applied to the resulting signal. Since the temporal features of the underlying task-evoked events are unknown, there is, currently, no “gold-standard” for the optimal values of these parameters. However, the choice of these parameters for extraction of HFB signals must be taken into account, especially when designing and implementing BCI systems (Branco et al., 2017a; Hotson et al., 2016). For example, in the case of real-time neuro-feedback, most systems rely on acquisition and processing filters, such as those implemented in the BCI2000 platform (Schalk et al., 2004). Brain signals acquired and processed in real-time require digitization in order to be processed by a computer. An efficient system has a very tight compromise between speed of processing and the amount of information digitized from the analog signals (sampling rate). That is, the more samples the system extracts per unit of time, the longer the delay between source and feedback.
Additionally, in the offline analysis of brain function using ECoG, sophisticated analysis over large amounts of ECoG data (i.e., tens of electrode channels and hours of data) often demands downsampling of signals to avoid exhausting computer memory and processing capacity and to limit computation time. Temporal smoothing, on the other hand, is useful to mitigate the occurrence of transient, spiky and uncorrelated noisy components. Importantly, however, downsampling and smoothing may affect the shape of the HFB signal, for example by introducing low-frequency components (known as aliases) into the signal and causing it to deviate from the actual shape of the underlying neural response. Consequently, the careful selection of sampling and smoothing parameters may be the key to an optimal interpretation and usage of the HFB power signals.
In the current study, we addressed the following question: how do parameter settings affect decoding performance of discrete events? We present a method for determining the minimum required temporal sampling of the signal and the optimal temporal smoothing to reduce or filter the signal’s noise component. Two different metrics, both relevant for BCI purposes, were used to assess the quality of the HFB power signal: classification decoding accuracy, which allows different classes of the task to be distinguished, and the significant normalized difference between active and rest conditions, as a measure of contrast-to-noise ratio. In order to investigate the generalizability of the results, we compared the results for multiple sensory and motor tasks and illustrated the impact of selected sub-optimal parameters on the classification accuracy of a four-class BCI.
2. Materials and Methods
2.1. Subjects
We included 12 epilepsy patients (mean age 27, range 15-49; see Table 1) who were implanted with subdural ECoG grids to localize the seizure focus. Two types of grids were used. Standard ECoG grids had an inter-electrode distance center-to-center of 1 cm and 2.3 mm exposed surface diameter, whereas high-density grids had either 32 or 64 channels, with 1.3 mm exposed surface diameter and an inter-electrode distance of 3 mm center-to-center (all AdTech, Racine, USA). For the following analysis, some electrodes were excluded because they showed flat or unstable signals, or high power-line noise levels. From the remaining electrodes, only those located over sensorimotor cortex (as determined by visual inspection, see Figure 1) were considered for further analysis. For this, the electrodes were localized using co-registration between a post-implantation Computerized Tomography (CT) scan (Philips Tomoscan SR7000, Best, the Netherlands) and a pre-operative T1-weighted anatomical scan on a 3T Magnetic Resonance system (Philips 3T Achieva, Best, the Netherlands), corrected for brain shift (Hermes et al., 2009; Branco et al., 2017b) and projected on a cortex surface rendering (Figure 1). The Medical Ethical Committee of the Utrecht University Medical Center approved the study and all patients signed informed consent according to the Declaration of Helsinki (2013).
Table 1. Patient characteristics and ECoG grid information.
| Patient No. | Age | Gender | Handedness | Implanted hemisphere | Grid type | Number included electrodes |
|---|---|---|---|---|---|---|
| S1 | 29 | Male | Right | Left | High-density | 29 |
| S2 | 42 | Male | Right | Left | High-density | 24 |
| S3 | 19 | Female | Right | Left | High-density | 32 |
| S4 | 19 | Male | Right | Left | High-density | 31 |
| S5 | 45 | Female | Left | Right | High-density | 59 |
| S6 | 18 | Male | Right | Left | High-density | 60 |
| S7 | 28 | Male | Left | Right | High-density | 60 |
| S8 | 20 | Male | Right | Right | High-density | 32 |
| S9 | 26 | Male | Right | Left | Standard | 26 |
| S10 | 49 | Male | Right | Left | Standard | 21 |
| S11 | 15 | Female | Right | Left | Standard | 20 |
| S12 | 15 | Female | Right | Left | Standard | 23 |
Figure 1. ECoG grid localization.
A-D) Brain surface renderings with projected electrode locations of all individual subjects based on their T1-weighted anatomical MRI scans and post-implantation CT scan. Central Sulcus is indicated in white. Yellow and blue squares indicate hand and mouth regions of interest, respectively. For each subject, all electrodes presented in the figure were used for common averaged re-referencing. A) Five patients with a high-density grid (electrodes in black circles) over the sensorimotor hand region (yellow square). B) Three patients with a high-density grid over the sensorimotor mouth region (blue square). C) Three patients with a standard ECoG grid over the sensorimotor cortex. D) One patient with a standard ECoG grid over both mouth (blue square) and hand (yellow square) sensorimotor areas. C-D) Colored stars indicate the electrodes selected for further analysis (see section 2.2.3 and 2.2.4), where each color represents a different sensorimotor region of interest.
2.2. Task description
Four different tasks were analyzed: two multi-class tasks (four hand gestures and four phonemes), and two single-class tasks (button press and tactile stimulation of finger and lip). The former were analyzed with a different performance metric than the latter (see section 2.6 for details). Each task was performed by an independent group of subjects. Hereafter, the task description and the results sections are divided into four sub-sections addressing these four different tasks.
2.2.1. High-density ECoG and hand gestures
Five subjects (S1-S5) who had a high-density grid covering the hand sensorimotor region were asked to execute four different hand gestures (Bleichner et al., 2014; Branco et al., 2017a) taken from the American Sign Language finger spelling alphabet (‘D’, ‘F’, ‘V’ and ‘Y’). The participants were asked to copy the gesture presented on the screen and hold it for 6 seconds. The trials were interleaved with a rest condition (6 seconds), where the subject was asked to place their hand in a relaxed open hand position. Each run consisted of 40 gesture trials (10 per gesture, randomized). Subject S5 performed two runs.
2.2.2. High-density ECoG and phonemes
Three subjects (S6-S8) who had a high-density grid over the mouth sensorimotor region were asked to pronounce four different phonemes of the International Phonetic Alphabet (/p/, /k/, /u/ and /a:/) (Ramsey et al., 2017). Trials lasted for 1.5 seconds, during which a single phoneme was displayed on the screen, and were interleaved with inter-trial intervals (1.5 s). Subjects S6-S8 performed 200, 86 and 250 trials of randomized phonemes, respectively.
2.2.3. Clinical ECoG and single finger movement
Three subjects (S9-S11) who had a standard clinical grid over the sensorimotor cortex performed a button-press task, where they were asked to press a button with their right thumb (contralateral to the implanted hemisphere) to indicate whether the given answer to an equation was correct or incorrect (Vansteensel et al., 2014). For the current study, we only analyzed the moments of the button press, as well as the inter-trial intervals (rest trials), during which a fixation cross was presented on the screen and patients relaxed. There were 40 equation trials (button-press opportunities) and 40 rest trials in total, of which the subjects missed a maximum of two equations. For each subject, two electrodes (one over primary motor cortex, M1, and another over primary somatosensory cortex, S1) with the largest, and significant (t-test, p-value < 0.05) response to the task were selected for further analysis (Figure 1).
2.2.4. Clinical ECoG and tactile stimulation
Subject S12 had a standard clinical ECoG grid over the sensorimotor cortex and performed a tactile stimulation task of two different body parts: thumb and lips. Stimulations were delivered by a custom-made device, which stimulated the skin with a brief pressure on the thumb or a short vibration to the lips, in a controlled manner. Each run consisted of 51 trials, each with a brief pressure/vibration within a 1.5 s window, that were interleaved with a random inter-trial interval of 5-7 s. Even though the device did not allow for a precise measure of the stimulation duration, this was estimated to be approximately 1 second. The patient performed two runs, one for each body part. The subject was asked to put on headphones and look at a screen during the task, to minimize the effect of attention. In total, four electrodes with the largest, and significant (t-test, p-value < 0.05) response to the task were selected for further analysis (Figure 1). Two of these four electrodes were located over the sensorimotor mouth region, one over M1 and another over S1. The other two electrodes were located over the sensorimotor hand region, one electrode over M1 and another over S1 (above the posterior flank of the central sulcus).
2.3. ECoG acquisition and preprocessing
ECoG signals were continuously recorded using a 128-channel clinical system (Micromed, Treviso, Italy; 22 bits, hardware band-pass filter 0.15-134 Hz) with 512 Hz sampling frequency. The data were analyzed offline and preprocessed using the open source FieldTrip® toolbox (Oostenveld et al., 2010) for MATLAB® (MathWorks Inc.) as follows: first, the continuous data were filtered using a notch filter (center at 50 and 100 Hz) and re-referenced using the common average reference (CAR) of all included channels (see Figure 1); second, the data were divided into epochs that were centered either on an electrophysiological marker (Branco et al., 2017a) (hand gestures and phonemes tasks), a response marker (button-press task) or the pre- and post-stimulation markers (tactile stimulation task). Note that for the hand gestures and phonemes tasks, the use of an electrophysiological marker provides a way of optimal trial alignment with reduction of temporal biases (inter-trial jitter) when using spatio-temporal classification as a performance metric (see section 2.6.1 and Branco et al., 2017a). On the other hand, for the button-press and tactile stimulation paradigms, the performance metric used in the current study (see section 2.6.2) treats these tasks as a continuous signal and detects significant increases in amplitude by randomly shuffling the markers in time. Hence, the inter-trial jitter does not pose an issue.
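As an illustration of the re-referencing step, the following minimal sketch applies a common average reference to a small channels-by-samples array. It is written in plain Python rather than FieldTrip, and the function name `common_average_reference` and the toy values are assumptions made for this example only.

```python
def common_average_reference(data):
    """Subtract the across-channel mean from every channel, sample by sample.

    `data` is a list of channels, each a list of samples of equal length.
    """
    n_channels = len(data)
    n_samples = len(data[0])
    # Mean over channels at each time point (the common average).
    car = [sum(ch[t] for ch in data) / n_channels for t in range(n_samples)]
    return [[ch[t] - car[t] for t in range(n_samples)] for ch in data]


# Toy example: two channels, three samples (hypothetical values).
rereferenced = common_average_reference([[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]])
```

Because the average is taken over the included channels only, excluding noisy electrodes beforehand keeps their artifacts out of the reference.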
2.4. Feature extraction
High-frequency band power traces were extracted using the Morlet wavelet dictionary, with multiplication in the frequency domain (length equal to 3 standard deviations of the implicit Gaussian kernel and width 7 cycles), as implemented in FieldTrip® Toolbox (Oostenveld et al., 2010). Notably, besides temporal sampling and temporal smoothing, another crucial parameter of this method is the full-width-half-maximum (FWHM) of the wavelet. This parameter has an inherent trade-off between spectral and temporal specificity, which is closely related to temporal smoothing. That is, a broader FWHM leads to less temporal resolution, thus a more smoothed signal. In this study we opted for a good temporal resolution (width 7 cycles), as we focus on HFB power signals that show a similar response for a wide range of frequencies from 65 to 125 Hz, and subsequently optimized the smoothing of the computed power signal. Hence, for each trial, the mean power over the HFB frequency bins (65-125 Hz, steps of 1 Hz) was calculated per channel, for a given time window relative to the respective marker mentioned in section 2.3. Time windows were -1 to 2.6 s for gestures and -1.1 to 1.1 s for phonemes, which include both baseline and active periods. For the button-press task, the baseline window was -3 to -1 s with respect to the equation trigger, while the response window was defined as -0.7 to 0.4 s with respect to the response marker (button-press). Lastly, for the tactile stimulation task, the baseline window was defined as 0 to 4 s with respect to the end of stimulation, while the response window was defined as 0 to 1.2 s with respect to the onset of the stimulation. Note that all tasks used within the current study were designed for different purposes, which explains the different specifications for each task. Hence, the baseline and active windows per task were selected a posteriori to include similar active and rest events across tasks.
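To make the trade-off between spectral and temporal specificity concrete, the sketch below computes the temporal and spectral standard deviations of a Morlet wavelet with a width of 7 cycles at the edges of the HFB range. It assumes the common parameterization in which the spectral standard deviation is f/width and the temporal standard deviation is 1/(2π·σ_f). The helper name `morlet_resolution` is hypothetical, and this is not FieldTrip code.

```python
import math


def morlet_resolution(freq_hz, width_cycles=7):
    """Return (sigma_t in s, sigma_f in Hz) for a Morlet wavelet.

    Assumes sigma_f = f / width and sigma_t = 1 / (2 * pi * sigma_f).
    """
    sigma_f = freq_hz / width_cycles
    sigma_t = 1.0 / (2.0 * math.pi * sigma_f)
    return sigma_t, sigma_f


# Edges of the HFB range used in the text (65 and 125 Hz).
for f in (65, 125):
    sigma_t, sigma_f = morlet_resolution(f)
    print(f"{f} Hz: sigma_t = {sigma_t * 1000:.1f} ms, sigma_f = {sigma_f:.1f} Hz")
```

Under these assumptions the temporal standard deviation is roughly 17 ms at 65 Hz and 9 ms at 125 Hz, so the wavelet itself contributes little smoothing compared with most of the smoothing windows explored afterwards.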
2.5. Two-parameter optimization map
The computational implementation of the wavelet dictionary requires the specification of the temporal sampling (δ, in ms) and temporal smoothing window (SW, in s), where the temporal sampling, in ms, is the inverse of a sampling frequency in Hz. Power signals were computed for windows spaced every δ ms and subsequently smoothed using a moving average filter with window size SW (Figure 2). Here, a two-parameter optimization map (Figure 2) was obtained for each task, in which temporal sampling and smoothing window were applied sequentially (temporal smoothing after temporal sampling) and varied from 2 to 200 ms and from 0 to 3 s (see section 2.8), respectively. For each parameter combination, a performance metric (PM) was calculated in order to evaluate the optimal parameter settings, as described below. In this study we define optimal parameter settings as those that result in an acceptable range of either of two types of performance metrics (see sections 2.6, 2.7 and 2.8).
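The sequential application of temporal sampling and smoothing can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it assumes the power trace has already been computed at the original sampling rate and is then decimated, and it assumes a causal (trailing) moving average. The function name `sample_and_smooth` and its arguments are hypothetical.

```python
def sample_and_smooth(power, fs_hz, delta_ms, sw_s, start=0):
    """Keep one power sample every `delta_ms`, then apply a moving average.

    Assumptions: `power` is already a power trace sampled at `fs_hz`;
    the moving average is causal (trailing) and shrinks at the left edge.
    """
    # Number of original samples between two retained samples.
    step = max(1, round(delta_ms * fs_hz / 1000.0))
    sampled = power[start::step]
    # Smoothing window expressed in retained samples (SW = 0 means no smoothing).
    n_win = max(1, round(sw_s * fs_hz / step))
    smoothed = []
    for i in range(len(sampled)):
        window = sampled[max(0, i - n_win + 1): i + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Toy trace at 1000 Hz, sampled every 2 ms and smoothed over 4 ms.
trace = [float(v) for v in range(10)]
result = sample_and_smooth(trace, fs_hz=1000, delta_ms=2, sw_s=0.004)
```

Looping such a function over a grid of δ and SW values, and scoring each output with a performance metric, would yield one entry of the optimization map per parameter combination.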
Figure 2. Two-parameter optimization map.
Wavelet parameter analysis using a two-dimensional map where temporal sampling (δ) and smoothing window (SW) size varied from 2 (high sampling frequency) to 200 (low sampling frequency) ms and from 0 (i.e., no smoothing, blue line in top-right plot) to 3 s, respectively. First, the power signal was computed every δ ms from the raw signal amplitude (bottom-left plot). As an example, we show three red wavelet-time-windows spaced δ ms apart. In this plot, the x-axis indicates time in ms. Second, the power signal was smoothed using a moving average window of size SW (in s; top-right plot). As an example, the unsmoothed power signal is shown in blue, and the smoothed version is shown in red. The gray region indicates the smoothing window (SW). The x-axis displays time, t in s. The bottom-right plot shows a schematic example of the optimization map, where each entry (i.e., each combination of δ and SW) is colored according to the performance metric values (color-bar bottom right plot).
2.6. Performance metric
Two performance metrics, typically used in the BCI field, were used to evaluate the best parameter settings. On the one hand, we used a template matching classification method to distinguish between 4 classes (section 2.6.1). This method is frequently used in studies with high-density grids and allows discrimination between, for example, multiple hand or mouth movements. On the other hand, when only clinical ECoG grids are available, a typical BCI study uses a single channel for control. The selection of that channel is normally made using a single-class paradigm by determining the channel with the largest difference between active and non-active states over the cortical region of interest (section 2.6.2). In order to compare the optimal parameters between M1 and S1 regions during both motor and sensory tasks, we analyzed and compared two separate channels with the largest response to the task, one over M1 and one over S1.
2.6.1. Multi-class paradigm
A standard way of studying and classifying multi-class paradigms is to distinguish between brain activity patterns produced by each class, under the assumption that these patterns are consistent across trials. Usually, spatio-temporal features represented in the data are used to discriminate which brain signal was produced and detect which class each trial belongs to. Here, we investigated the influence of temporal sampling and smoothing windows on the accuracy of such classification based on the HFB response. For that, an offline classification scheme was used, with a template matching method that compared a model (matched template) with each individual trial by means of a Pearson Correlation. This method capitalizes on the spatio-temporal signatures of the HFB response (Bouchard et al., 2013; Branco et al., 2017a). The matched template comprised the average of the feature space over trials, yielding a two-dimensional feature space of channels x time. Using a winner-takes-all leave-one-out cross-validation scheme, each single gesture or phoneme trial was compared with the matched template of each class (average over remaining trials per class) and labeled according to the highest correlation score. The final classification accuracy was computed as the percentage of correctly labeled trials.
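A minimal sketch of such a leave-one-out template matching scheme is given below. It assumes each trial has already been flattened from a channels x time matrix into a single feature vector and that every class has at least two trials. The names `pearson` and `loo_template_accuracy` and the toy data are illustrative assumptions, not the authors' code.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # undefined correlation treated as no match
    return cov / math.sqrt(var_a * var_b)


def loo_template_accuracy(trials, labels):
    """Leave-one-out template matching; returns the fraction correct.

    Each trial is a flattened channels x time feature vector.
    """
    classes = sorted(set(labels))
    correct = 0
    for i, trial in enumerate(trials):
        scores = {}
        for c in classes:
            # Template: mean over the remaining trials of class c.
            members = [t for j, t in enumerate(trials) if labels[j] == c and j != i]
            template = [sum(vals) / len(members) for vals in zip(*members)]
            scores[c] = pearson(trial, template)
        # Winner-takes-all: label with the highest correlation.
        if max(scores, key=scores.get) == labels[i]:
            correct += 1
    return correct / len(trials)


# Toy data: two classes with opposite patterns (hypothetical values).
trials = [
    [1.0, 0.1, 0.9, 0.0], [0.9, 0.0, 1.0, 0.1], [1.1, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.1, 0.9], [0.1, 0.9, 0.0, 1.0], [0.0, 1.1, 0.0, 1.0],
]
labels = ["A", "A", "A", "B", "B", "B"]
accuracy = loo_template_accuracy(trials, labels)
```

Excluding the test trial from its own class template is what keeps the accuracy estimate from being inflated by self-correlation.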
2.6.2. Single-class paradigm
Single-class paradigms are usually studied by comparing the active trials with rest trials to extract information about the performed task. One way to evaluate the quality of the active features in contrast to rest is to describe them as a normalized function of the rest condition. Here, for each task and subject, a single channel with the largest, and significant (t-test, p-value < 0.05) response to the task (see Figure 1C-D) was selected and the signal was baseline-corrected by subtracting the mean of the signal during rest periods from the real amplitude trace and dividing by the standard deviation of the signal during rest periods. Then, the mean trace over trials was computed along with a surrogate ensemble of mean traces calculated by randomly shifting stimuli onsets in time 10000 times (Canolty et al., 2007). The distribution of all sample points in the surrogate traces was used to z-score the stimulus- or event-related response. For each task, the durations of the active and baseline conditions were set according to the task description (see section 2.4). Subsequently, the surrogate traces were used to determine the uncorrected two-tailed probability that the deviation seen in the real amplitude trace was due to chance. Thus, the higher the output z-scored traces, the more different were the active and the rest states. For each smoothing and sampling frequency parameter combination, the maximum z-scored value of the mean over trials was computed and used as a performance metric for determination of optimal parameter settings.
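The surrogate-based z-scoring can be sketched as follows. This is a simplified illustration under stated assumptions: each surrogate applies one random circular shift to all onsets, the baseline correction step is omitted, and only 200 surrogates are drawn instead of the 10000 used in the study. The function names and the synthetic signal are hypothetical.

```python
import random
import statistics


def mean_trace(signal, onsets, win):
    """Average the `win` samples following each onset (circular indexing)."""
    n = len(signal)
    return [sum(signal[(o + t) % n] for o in onsets) / len(onsets) for t in range(win)]


def max_surrogate_z(signal, onsets, win, n_surrogates=200, seed=0):
    """Maximum z-score of the event-related mean trace against surrogates.

    Assumption: each surrogate applies one random circular shift to all
    onsets; the text uses 10000 surrogates, fewer are used here for speed.
    """
    rng = random.Random(seed)
    real = mean_trace(signal, onsets, win)
    pool = []
    for _ in range(n_surrogates):
        shift = rng.randrange(len(signal))
        pool.extend(mean_trace(signal, [o + shift for o in onsets], win))
    # All sample points of all surrogate traces form the null distribution.
    mu, sd = statistics.mean(pool), statistics.pstdev(pool)
    return max((v - mu) / sd for v in real)


# Synthetic signal: unit-variance noise plus a 20-sample response after each onset.
rng = random.Random(1)
signal = [rng.gauss(0.0, 1.0) for _ in range(2000)]
onsets = [150, 430, 790, 1020, 1390, 1710]
for o in onsets:
    for t in range(20):
        signal[o + t] += 5.0
z_max = max_surrogate_z(signal, onsets, win=40)
```

The returned maximum plays the role of the performance metric for one combination of sampling and smoothing parameters.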
2.7. Determination of optimal smoothing window
The optimal smoothing window (SW) was evaluated by setting the temporal sampling equal to the finest value (δ = 2 ms, for a sampling frequency of 512 Hz) and computing the performance metric (PM), that is accuracy or maximum z-score. The PM was computed for a smoothing window size ranging from 0.1 to 3 s in steps of 0.1 s for the multi-class paradigms, and from 0 to 1 s in steps of 0.01 s for the single-class paradigms. For single-class paradigms a smaller range was selected since we expected a shorter response in time compared with multi-class paradigms. Optimal SW criterion was defined by setting a maximum relative standard deviation (rstd = std(PM)/mean(PM)) of the PM values, such that the PM values would be within an acceptable range (std) of the maximum PM. Here we considered as acceptable a relative standard deviation of 5%:
$$\mathrm{rstd} = \frac{\mathrm{std}(PM)}{\mathrm{mean}(PM)} \leq 0.05 \qquad (1)$$
2.8. Determination of optimal temporal sampling
Temporal sampling optimization was not as straightforward as smoothing window optimization, because one should account for the chance that a different sampling starting point may result in a different selection of the most informative time points. For example, sampling the signal by starting at sample number 1 and then extracting every 10th sample yields a different signal than starting at the 3rd sample. This would be a temporal sampling confound, leading to a high variance (high instability) in PM for the same temporal sampling, depending on the starting sampling point. To detect the minimum temporal sampling that showed the least variability in PM, we computed the PM for all possible starting samples with each temporal sampling value (Table 2) and computed its standard deviation. This way, the temporal sampling optimization takes into account the stability (low variance) of the signals after downsampling.
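The dependence on the starting sample can be illustrated with a short sketch. It assumes downsampling by keeping every `step`-th sample and uses the peak amplitude of a toy signal as a stand-in for the performance metric. The function name `pm_std_over_starts` is hypothetical.

```python
import statistics


def pm_std_over_starts(signal, step, pm_fn):
    """Standard deviation of a performance metric over all starting samples.

    `step` is the downsampling factor (keep every `step`-th sample);
    `pm_fn` maps a downsampled signal to a scalar performance metric.
    """
    values = [pm_fn(signal[start::step]) for start in range(step)]
    return statistics.pstdev(values)


# Toy signal with one narrow peak; the peak amplitude stands in for the PM.
signal = [0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 0.0, 0.0]
std_fine = pm_std_over_starts(signal, step=1, pm_fn=max)    # every sample kept
std_coarse = pm_std_over_starts(signal, step=4, pm_fn=max)  # peak often missed
```

With every sample kept the metric is stable, whereas with a step of 4 only one of the four starting points captures the peak, so the metric varies strongly with the starting sample.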
Table 2. Temporal sampling range tested and optimal smoothing windows per subject.
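Multi-class paradigm

| Subject | Optimal smoothing window size (in s) | Temporal sampling, δ, values tested (in ms) |
|---|---|---|
| S1 | 1.6 | 2, 4, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100 |
| S2 | 2.53 | 2, 4, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100 |
| S3 | 1.64 | 2, 4, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100 |
| S4 | 0.84 | 2, 4, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100 |
| S5 | 3 | 2, 4, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100 |
| S6 | 0.42 | 2, 4, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100 |
| S7 | 0.59 | 2, 4, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100 |
| S8 | 0 | 2, 4, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100 |

Single-class paradigm

| Subject | Optimal smoothing window size, M1 channel (in s) | Optimal smoothing window size, S1 channel (in s) | Temporal sampling, δ, values tested (in ms) |
|---|---|---|---|
| S9 | 0.19 | 0.02 | 2, 4, 8, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180 and 200 |
| S10 | 0.04 | 0 | 2, 4, 8, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180 and 200 |
| S11 | 0.2 | 0.17 | 2, 4, 8, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180 and 200 |
| S12 (thumb) | 0.16 | 0.15 | 2, 4, 8, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180 and 200 |
| S12 (lips) | 0.35 | 0.28 | 2, 4, 8, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180 and 200 |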
When optimizing the temporal sampling, the SW was kept constant at the optimal value per subject. For this step only, and to provide a more precise estimation of the optimal SW, a polynomial surface function was used to fit the two-parameter optimization map (Figure 2), using a 4th or 5th order polynomial for SW and 3rd order polynomial for δ. This interpolation step was only used to compute the optimal SW. The optimal temporal smoothing window defined as such is indicated in Table 2 per subject.
In contrast to the optimal smoothing window criterion, the optimal temporal sampling (δ) criterion was defined in terms of standard deviations per δ, because what was being controlled was the inaccuracy of the PM when simply changing the starting sample. Hence, here, to capture the accuracy of the result per δ, the acceptable standard deviation of the PM was set to a maximum of 2% of the maximum PM (max(PM)) across all tasks and subjects (100% in case of multi-class paradigms and 15 z-score units in the case of single-class paradigms). That is, 0.02 x 100% = 2% for multi-class paradigms and 0.02 x 15 = 0.3 z-score units for single-class paradigms:
$$\mathrm{std}(PM_{\delta}) \leq 0.02 \times \max(PM) \qquad (2)$$
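Applying this threshold can be sketched as below. The standard deviations in the example are hypothetical values chosen for illustration, not results from the study, and the function name `acceptable_deltas` is an assumption. The last line converts the coarsest acceptable sampling interval into the corresponding minimum sampling rate.

```python
def acceptable_deltas(std_by_delta, pm_max, fraction=0.02):
    """Return the δ values (ms) whose PM standard deviation meets the criterion.

    `std_by_delta` maps δ in ms to the standard deviation of the PM over
    starting samples; `pm_max` is 100 (%) or 15 (z-score units) in the text.
    """
    threshold = fraction * pm_max
    return sorted(d for d, s in std_by_delta.items() if s <= threshold)


# Hypothetical standard deviations for a single-class paradigm (z-score units).
std_by_delta = {2: 0.0, 10: 0.1, 20: 0.25, 40: 0.5}
ok = acceptable_deltas(std_by_delta, pm_max=15)
coarsest_ms = max(ok)
min_rate_hz = 1000.0 / coarsest_ms  # sampling interval in ms -> rate in Hz
```

In this made-up example the threshold is 0.3 z-score units, so sampling every 20 ms (50 Hz) is the coarsest setting that still satisfies the stability criterion.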
3. Results
3.1. Parameters for decoding multiple classes with high-density ECoG grids
Decoding four hand gestures from high-density ECoG grids placed over sensorimotor cortex was optimal with δ < 25 ms (> 40 Hz) and SW ranging from 0.4 to 2.7 s (Figure 3A). Although the minimum temporal sampling was consistent across subjects (Figure 3E), the optimal SW varied strongly (Figure 3C and Figure 4A-B). This result indicates that SW is subject-specific, as can be seen in Figure 4A-B. Notably, for three out of five subjects (S1-S3) there seemed to be two peaks along the SW values, one around 0.5 s and another around 2.5 s, suggesting that the first, shortest peak, visible in all subjects, could be enough to use for decoding.
Figure 3. Two-parameter optimization for multi-class paradigms.
A-B) Averaged two-parameter optimization map across subjects for the gestures (5 subjects) and phonemes (3 subjects) tasks, respectively. Investigated smoothing window (SW) varied between 0.1 and 3 s, whereas temporal sampling varied between 2 and 100 ms. The color-scale indicates the decoding accuracy as performance metric (PM), ranging from 0 to 1. Optimal parameter regions are indicated with a black box. C-D) Smoothing window optimization for the gestures and phonemes task, respectively. The blue line indicates the decoding accuracy of the mean across subjects (A-B) for each smoothing value. To optimize the smoothing window, temporal sampling was kept at its lowest possible value (2 ms). The PM values above criterion (1) (section 2.7), indicated by the black horizontal line, are shaded in gray and were used to delineate the optimal parameter region in A-B. E-F) Temporal sampling optimization. The blue line indicates the mean standard deviation (STD) of decoding accuracy across subjects. The values below criterion (2) (section 2.8), indicated by the horizontal black line, are shaded in gray and were used to delineate the optimal parameter region in A-B.
Figure 4. Inter-subject variability.
Two-parameter optimization map for the gestures (A-B) and phonemes (C-D) in individual subjects. The smoothing window (SW) varied between 0.1 and 3 s, and temporal sampling varied between 2 and 100 ms. Color-scale indicates the decoding accuracy as performance metric (PM), ranging from 0 to 1. Black boxes indicate the optimal SW and δ, based on criteria (1) (section 2.7) and (2) (section 2.8) for the average across subjects (A and C). For S5, the displayed two-parameter optimization map represents the mean across two runs.
Phonemes had on average lower classification scores than gestures, but the model showed a consistent optimal SW between 0.2 and 0.9 s, and a minimum δ value of 30 ms (> 33 Hz; Figure 3B). In contrast to the hand gestures, the phonemes showed consistent optimal values across the three subjects (Figure 3D,F and Figure 4C-D), thus indicating signal reproducibility across subjects.
3.2. Parameters for single class activation detection with clinical ECoG grids
Single-class paradigm tasks showed a very well defined optimal region on the optimization maps (Figure 5A-B, Figure 6A-B and Figure 7A-B). When comparing the motor task and the tactile stimulation task, one can appreciate that the optimal SWs were similar between the tasks (Figure 5C-D, Figure 6C-D and Figure 7C-D). The optimal SW values for the motor task were between 0 and 0.3 s, and for the tactile stimulation the optimal SW was between 0.2 and 0.5 s for finger and between 0.1 and 0.2 s for lips stimulation. The optimal δ was smaller for motor activation when compared to sensory stimulation (Figure 5E-F, Figure 6E-F and Figure 7E-F). Hence, to optimally capture the information carried by the ECoG HFB power in the single-class paradigms, channels should be sampled at least every 20 ms (> 50 Hz) for motor activation and every 35 ms (> 29 Hz) for sensory tactile stimulation, although for thumb stimulation temporal sampling could be even lower (δ < 70 ms, that is > 14 Hz, Figure 7E-F).
Figure 5. Two-parameter optimization for motor single-class paradigms.
A-B) Averaged two-parameter optimization map across subjects (3 subjects) for M1 and S1 channels during the button-press motor tasks. Color-scale indicates the average maximum z-score value across trials as performance metric (PM), ranging from 0 to 15. The optimal parameter region is indicated with a black box. C-D) Smoothing window optimization for M1 and S1 channels, respectively. The maximum z-score values (mean across subjects) above criterion (1) (section 2.7), indicated by the black horizontal line, were shaded in gray and used to delineate the optimal parameter region in A-B. E-F) Temporal sampling optimization. The values of standard deviation (STD) of maximum z-score (mean across subjects) below criterion (2) (section 2.8), indicated by the horizontal black line, are shaded in gray and were used to estimate the optimal parameter region in A-B.
Figure 6. Two-parameter optimization for the lips sensory single-class paradigms.
A-B) Two-parameter optimization map for M1 and S1 channels (for subject S12) during the sensory tactile stimulation of the lips. Color-scale indicates the average maximum z-score value across trials as performance metric (PM), ranging from 0 to 15. Optimal parameter region is indicated with a black box. C-D) Smoothing window optimization for M1 and S1 channels, respectively, for the lips. The values above criterion (1) (section 2.7), indicated by the black horizontal line, are shaded in gray and used to delineate the optimal parameter region in A-B. To optimize smoothing window, temporal sampling was kept at its lowest possible value (2 ms). E-F) Temporal sampling optimization. The values below the criterion (2) (section 2.8), indicated by the horizontal black line, were shaded in gray and used to estimate the optimal parameter region in A-B.
Figure 7. Two-parameter optimization for the thumb sensory single-class paradigms.
A-B) Two-parameter optimization map for M1 and S1 channels (for subject S12) during the sensory tactile stimulation of the thumb. Color-scale indicates the average maximum z-score value across trials as performance metric (PM), ranging from 0 to 15. Optimal parameter region is indicated with a black box. C-D) Smoothing window optimization for M1 and S1 channels, respectively, for the thumb. The values above criterion (1) (section 2.7), indicated by the black horizontal line, are shaded in gray and used to delineate the optimal parameter region in A-B. To optimize smoothing window, temporal sampling was kept at its lowest possible value (2 ms). E-F) Temporal sampling optimization. The values below the criterion (2) (section 2.8), indicated by the horizontal black line, were shaded in gray and used to estimate the optimal parameter region in A-B.
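The performance metric used in Figures 5-7 is the maximum z-score of each trial, averaged across trials. A minimal Python sketch of such a metric is given below. It assumes that the z-score is computed against the mean and standard deviation of a separate baseline (rest) segment; that baseline definition, and the names `max_z_score` and `performance_metric`, are assumptions made for illustration and are not taken from the Methods.

```python
from statistics import mean, stdev


def max_z_score(trial, baseline):
    """Largest z-score in one trial, relative to the baseline mean and SD (assumed reference)."""
    mu, sd = mean(baseline), stdev(baseline)
    return max((x - mu) / sd for x in trial)


def performance_metric(trials, baseline):
    """Average of the per-trial maximum z-scores."""
    return mean(max_z_score(trial, baseline) for trial in trials)


# Toy numbers, not recorded data
baseline = [0.0, 1.0, 2.0]            # mean 1, sample SD 1
trials = [[1.0, 4.0], [2.0, 6.0]]     # per-trial maxima: z = 3 and z = 5
pm = performance_metric(trials, baseline)
print(f"PM = {pm:.1f}")
```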
3.3. Wavelet parameter optimization
An overview of the results can be seen in Table 3. Optimal parameters were evaluated according to criteria (1) (section 2.7) and (2) (section 2.8) specified above, such that the optimal SW and δ parameters yielded high PM values. These optimal parameter settings were drawn from the results shown in sections 3.1 and 3.2. Analyzing the results per paradigm allowed the optimal temporal sampling and temporal smoothing parameters to be grouped for each paradigm. For the single-class paradigms, where we investigated M1 and S1 channels separately, the results per paradigm were similar for both M1 and S1 channels (see Figures 5-7).
Table 3. Summary of optimal SW and δ range for different single- and multi-class paradigms.
| Parameter | Multi-class paradigm: Gestures | Multi-class paradigm: Phonemes | Single-class paradigm: Motor execution | Single-class paradigm: Sensory stimulation |
|---|---|---|---|---|
| δ | < 25 ms (> 40 Hz) | < 30 ms (> 33 Hz) | < 20 ms (> 50 Hz) | < 35 ms (> 29 Hz) |
| SW | [0.4 0.6] s and [0.9 2.7] s | [0.2 0.9] s | [0 0.3] s | [0.1 0.5] s |
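For readers who want to check a given setting against Table 3, the ranges can be encoded as a small lookup. The Python sketch below is illustrative only: the dictionary keys, the helper name `in_optimal_region`, and the choice to treat the SW bounds as inclusive are assumptions rather than part of the study.

```python
# Table 3 values: (upper limit on delta in ms, optimal SW ranges in s)
OPTIMAL = {
    "gestures": (25, [(0.4, 0.6), (0.9, 2.7)]),
    "phonemes": (30, [(0.2, 0.9)]),
    "motor_execution": (20, [(0.0, 0.3)]),
    "sensory_stimulation": (35, [(0.1, 0.5)]),
}


def in_optimal_region(paradigm, delta_ms, sw_s):
    """True if delta is below the limit and SW lies in one of the group-level ranges."""
    max_delta_ms, sw_ranges = OPTIMAL[paradigm]
    return delta_ms < max_delta_ms and any(lo <= sw_s <= hi for lo, hi in sw_ranges)


print(in_optimal_region("motor_execution", 10, 0.2))   # inside both limits
print(in_optimal_region("gestures", 15, 0.84))         # SW falls between the two gesture ranges
```

The second call uses the single-subject setting shown in Figure 8B (δ = 15 ms, SW = 0.84 s), which lies between the two group-level gesture windows. This suggests that the ranges in Table 3 are best read as group summaries rather than as a replacement for subject-specific optimization.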
3.4. Illustrating sub-optimal parameter settings
To illustrate the effect of optimal and sub-optimal parameters on the HFB signal interpretation, we compared several combinations of SW and δ, applied to data acquired during one hand gesture trial. A sub-optimal selection of the δ parameter (Figure 8D-E) may yield a power signal completely different from the original neural response (Figure 8A), while the optimal δ captures the entire signal’s shape (Figure 8B). Sub-optimal selection of SW (Figure 8C,E), on the other hand, leads to a considerable smoothing of the characteristic shape of the signal (Figure 8A), while the minimum optimal SW preserves the shape well (Figure 8B). Here we show that a combination of optimal δ and SW (Figure 8B) is required to preserve the original signal’s shape (Figure 8A) without introducing or removing any temporal characteristics.
Figure 8. Comparison between optimal and sub-optimal parameter selection.
A-E) HFB power trace of one trial during the performance of a specific hand gesture (subject S4, gesture 1, trial 1, channel 6). The y-axis indicates normalized power amplitude in arbitrary units (a.u.) over a 12 s time window (-6 to 6 s, with t = 0 s being activity onset). Parameter settings used are indicated in the gray box. A) HFB power trace computed with a temporal sampling of δ = 2 ms, which corresponds to the original sampling frequency of 512 Hz, and no smoothing (SW = 0 s). B) Optimal δ and SW parameters (δ = 15 ms, SW = 0.84 s), which allow trials of this gesture to be distinguished with high accuracy from trials of the three other gestures. C-E) Examples of sub-optimal settings of δ, SW, or both, where sub-optimal δ shows aliasing. B-E) Parameters used are also indicated on the two-parameter optimization map (right panel) by a small black square. Dashed black lines indicate the lower boundary of the optimal parameters. Color-scale indicates decoding accuracy for this subject ranging from 0 to 1.
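The qualitative effects described for Figure 8 can be reproduced on synthetic data. The Python sketch below is not the wavelet-based extraction used in the study. It assumes a full-resolution trace with a 2 ms step, subsamples it by keeping one sample every δ, and then smooths it with a centered boxcar average of roughly SW seconds. The boxcar kernel, the Gaussian test response and all names are assumptions chosen for illustration.

```python
import math

BASE_STEP_MS = 2  # assumed step of the full-resolution power trace


def subsample(trace, delta_ms):
    """Keep one sample every delta_ms, without any anti-alias filtering."""
    step = max(1, round(delta_ms / BASE_STEP_MS))
    return trace[::step]


def smooth(trace, sw_s, delta_ms):
    """Centered boxcar average spanning roughly sw_s seconds (SW = 0 leaves the trace unchanged)."""
    half = round(sw_s * 1000 / delta_ms) // 2
    out = []
    for i in range(len(trace)):
        lo, hi = max(0, i - half), min(len(trace), i + half + 1)
        out.append(sum(trace[lo:hi]) / (hi - lo))
    return out


# Synthetic trace from -6 s to 6 s: a brief response (sigma = 0.1 s) peaking at t = 0.5 s
times = [-6 + i * BASE_STEP_MS / 1000 for i in range(6001)]
trace = [math.exp(-(((t - 0.5) / 0.1) ** 2) / 2) for t in times]

fine = subsample(trace, 20)        # delta = 20 ms (50 Hz)
coarse = subsample(trace, 1000)    # delta = 1 s: samples fall on whole seconds and miss the response
light = smooth(fine, 0.1, 20)      # short window: peak largely preserved
heavy = smooth(fine, 2.0, 20)      # long window: peak flattened

for name, sig in [("fine", fine), ("coarse", coarse), ("fine, SW 0.1 s", light), ("fine, SW 2 s", heavy)]:
    print(f"{name:>15}: {len(sig):4d} samples, peak = {max(sig):.3f}")
```

In this toy case the coarsely sampled trace no longer contains the response at all, and the heavily smoothed trace retains only a small fraction of its peak amplitude, which mirrors the two failure modes described above.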
4. Discussion
The high-frequency band (HFB) power components of brain signals are increasingly used for the study of brain function and the development of BCIs. Invasive and non-invasive closed-loop systems based on neuro-electrical activity need to reduce the amount of processed information for fast feedback and to minimize signal noise for accurate decoding. To that end, the signal is typically downsampled and subsequently smoothed. Importantly, and as also illustrated here, these procedures may significantly affect the shape of the resulting HFB response, and sub-optimal sampling and smoothing may affect the interpretation of the topographical and temporal characteristics of the underlying neuronal response. At the same time, extracting an HFB representation that closely describes this underlying response is required for optimal classification and consistency. Therefore, we investigated multiple combinations of two key parameters of wavelet-based power extraction methods: the temporal sampling (δ) and the temporal smoothing (SW) of the extracted power signal. We optimized these two parameters for maximal distinction between conditions in multiple single- and multi-class sensorimotor tasks commonly used in the clinical and scientific community to investigate sensorimotor control. Our results confirm that these parameters affect the outcome of the analyses considerably. We also find that it is feasible to determine the optimal parameters for an individual subject and an individual task, and present a method for doing so. Moreover, the comparable range of the optimal parameters between subjects and tasks suggests that the shape of the HFB response, and possibly the shape of the underlying neural response, is similar between people and the paradigms used here.
Although especially relevant for improving closed-loop performance in typical BCI settings, the method can also be used in the study of EEG and ECoG neuronal signals in situations where downsampling and smoothing of the signal is relevant.
4.1. Optimal parameters for HFB power extraction
The presented method proved to be a systematic way to characterize temporal HFB responses. In general, HFB power responses should be sampled at more than 29 Hz and smoothed with a window of up to approximately 0.9 s (Table 3). From the optimal values obtained for the different tasks, three clear observations emerge. First, we showed that single motor execution (a button press) requires the highest temporal sampling (faster than once every 20 ms, i.e. at a frequency higher than 50 Hz) and the least amount of smoothing (window size from 0 to 0.3 s), indicating that the neuronal event associated with controlled impulse-like motor acts is precise and short in time.
Second, all tasks performed by more than one subject revealed consistent temporal sampling and temporal smoothing across subjects, with the exception of complex hand gestures. The latter showed two optimal temporal smoothing windows across subjects, 0.4-0.6 s and 0.9-2.7 s, of which the first is consistent with the optimal window obtained in the other tasks. The existence of a second, longer optimal smoothing window may be associated with the more complex motor strategy required to perform the gestures. For such a task, optimal power extraction parameters could be considered on a subject-specific basis. Nevertheless, the short optimal smoothing window (0.4-0.6 s) was also found to yield good performance across subjects, suggesting that the neuronal population underlying gesture production is also active within the short temporal range indicated for the remaining sensorimotor tasks. Notably, phoneme production and button presses are faster, shorter and less complicated than gesture execution, and are therefore likely to be more reproducible across individuals.
Lastly, we stress that the results presented here serve as an indication of the SW and δ values that are optimal for the frequency range (HFB) and tasks here presented. For each of the paradigms, and likewise for each task, a narrow optimal range of values was found, with mostly a high degree of correspondence between subjects. However, the remaining unexplained variance between tasks and subjects suggests that performance may be optimized on a task- and subject-specific level.
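A subject- and task-specific assessment of this kind amounts to evaluating a performance metric on a grid of (δ, SW) combinations. The Python sketch below shows only that outer loop. The scoring function is a placeholder with an arbitrary optimum and stands in for the full pipeline of power extraction followed by classification or z-scoring. The sketch also simply returns the best-scoring combination and does not implement criteria (1) and (2) of sections 2.7 and 2.8. All names are hypothetical.

```python
from itertools import product


def grid_search(deltas_ms, windows_s, score):
    """Evaluate score(delta_ms, sw_s) on every combination; return the full map and the best setting."""
    pm_map = {(d, w): score(d, w) for d, w in product(deltas_ms, windows_s)}
    best = max(pm_map, key=pm_map.get)
    return pm_map, best


def toy_score(delta_ms, sw_s):
    """Placeholder metric with an arbitrary optimum at (10 ms, 0.3 s); not derived from data."""
    return -(abs(delta_ms - 10) / 10 + abs(sw_s - 0.3))


deltas_ms = [2, 10, 20, 50]
windows_s = [0.0, 0.3, 1.0]
pm_map, best = grid_search(deltas_ms, windows_s, toy_score)
print("evaluated settings:", len(pm_map))
print("best (delta_ms, sw_s):", best)
```

In practice, `toy_score` would be replaced by a function that extracts HFB power with the given parameters for one subject and task and returns the resulting decoding accuracy or z-score based metric.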
4.2. Limitations
This study was performed using executed movements and tactile stimulation in patients scheduled for removal of epileptic foci. Although channels over later resected tissue were not included in this analysis, the effects of inter-ictal activity cannot be completely ruled out. A second limitation is that the tasks for which the parameters were optimized were limited to the sensorimotor cortex, and optimization involved a limited number of electrodes. Hence, the optimal parameters for other brain regions remain to be determined. Furthermore, the multi-class paradigms were recorded using high-density ECoG grids of 32 or 64 electrodes, which are beneficial for spatiotemporal decoding, while the single-class paradigms were analyzed using data from a single standard ECoG electrode. Indeed, as recently studied, there can be a difference in correlation, coherence and phase of local field potentials between these two grid sizes (Kellis et al., 2015; Wang et al., 2016), suggesting that the optimal parameters could be further assessed and compared between grid sizes.
5. Conclusion
In this study we show that the temporal profiles of high-frequency power ECoG signals extracted from the raw neuronal signal depend on the setting of two parameters during the power extraction step. We demonstrate that suboptimal sampling and excessive smoothing of the signals compromise the interpretation and usability of the resulting HFB power traces for high-accuracy discrimination of task conditions, as used in BCI applications. We present the optimal parameters for four distinct sensorimotor tasks and conclude that the analysis of such signals improves with high temporal sampling and some degree of smoothing.
Acknowledgments
This research was funded by the ERC-Advanced ‘iConnect’ project (grant ADV 320708), the BrainGain Smart Mix Programme (grant SSM06011) and the Dutch Technology Foundation STW (grant UGT7685) (N.R.). We thank Frans Leijten, Cyrille Ferrier, Geert-Jan Huiskamp, and Tineke Gebbink for their help in collecting data, Peter Gosselaar and Peter van Rijen for implanting the electrodes, the technicians, the staff of the clinical neurophysiology department and the subjects for their time and effort, as well as Sybrand Aarnoutse for the help with developing the in-house somatosensory stimulation device.
References
- Bleichner M, Freudenburg Z, Jansma J, Aarnoutse E, Vansteensel M, Ramsey N. Give me a sign: decoding four complex hand gestures based on high-density ECoG. Brain Structure and Function. 2014:1–14. doi: 10.1007/s00429-014-0902-x.
- Bouchard KE, Mesgarani N, Johnson K, Chang EF. Functional organization of human sensorimotor cortex for speech articulation. Nature. 2013;495:327. doi: 10.1038/nature11911.
- Branco MP, Freudenburg ZV, Aarnoutse EJ, Bleichner MG, Vansteensel MJ, Ramsey NF. Decoding hand gestures from primary somatosensory cortex using high-density ECoG. Neuroimage. 2017a;147:130–142. doi: 10.1016/j.neuroimage.2016.12.004.
- Branco MP, Gaglianese A, Glen D, Hermes D, Saad ZS, Petridou N, Ramsey NF. ALICE: A tool for automatic localization of intra-cranial electrodes for clinical and high-density grids. J Neurosci Methods. 2017b. doi: 10.1016/j.jneumeth.2017.10.022.
- Bruns A. Fourier-, Hilbert- and wavelet-based signal analysis: are they really different approaches? Journal of Neuroscience Methods. 2004;137. doi: 10.1016/j.jneumeth.2004.03.002.
- Buzsáki G, Wang X-J. Mechanisms of Gamma Oscillations. Annu Rev Neurosci. 2012;35:203–225. doi: 10.1146/annurev-neuro-062111-150444.
- Canolty R, Soltani M, Dalal S, Edwards E, Dronkers N, Nagarajan S, Kirsch H, Barbaro N, Knight R. Spatiotemporal dynamics of word processing in the human brain. Frontiers in neuroscience. 2007;1:185–96. doi: 10.3389/neuro.01.1.1.014.2007.
- Crone NE, Miglioretti DL, Gordon B, Lesser RP. Functional mapping of human sensorimotor cortex with electrocorticographic spectral analysis. II. Event-related synchronization in the gamma band. Brain : a journal of neurology. 1999;121(Pt 12):2301–15. doi: 10.1093/brain/121.12.2301.
- Hermes D, Miller K, Noordmans H, Vansteensel M, Ramsey N. Automated electrocorticographic electrode localization on individually rendered brain surfaces. J Neurosci Methods. 2009;185:293–8. doi: 10.1016/j.jneumeth.2009.10.005.
- Hermes D, Miller K, Vansteensel M, Aarnoutse E, Leijten F, Ramsey N. Neurophysiologic correlates of fMRI in human motor cortex. Hum Brain Mapp. 2012;33:1689–1699. doi: 10.1002/hbm.21314.
- Hochberg L, Bacher D, Jarosiewicz B, Masse N, Simeral J, Vogel J, Haddadin S, Liu J, Cash S, Smagt P, Donoghue J. Reach and grasp by people with tetraplegia using a neurally controlled robotic arm. Nature. 2012;485:372–375. doi: 10.1038/nature11076.
- Hotson G, McMullen D, Fifer M, Johannes M, Katyal K, Para M, Armiger R, Anderson W, Thakor N, Wester B, Crone N. Individual finger control of a modular prosthetic limb using high-density electrocorticography in a human subject. J Neural Eng. 2016;13:026017. doi: 10.1088/1741-2560/13/2/026017.
- Kellis S, Sorensen L, Darvas F, Sayres C, O’Neill K, Brown R, House P, Ojemann J, Greger B. Multi-scale analysis of neural activity in humans: Implications for micro-scale electrocorticography. Clin Neurophysiology Official J Int Fed Clin Neurophysiology. 2015;127:591–601. doi: 10.1016/j.clinph.2015.06.002.
- Leuthardt E, Miller K, Anderson N, Schalk G, Dowling J, Miller J, Moran D, Ojemann J. Electrocorticographic frequency alteration mapping: a clinical technique for mapping the motor cortex. Neurosurgery. 2007;60:260–70; discussion 270–1. doi: 10.1227/01.NEU.0000255413.70807.6E.
- Leuthardt E, Schalk G, Wolpaw J, Ojemann J, Moran D. A brain-computer interface using electrocorticographic signals in humans. Journal of Neural Engineering. 2004;1:63. doi: 10.1088/1741-2560/1/2/001.
- Miller K, Leuthardt E, Schalk G, Rao R, Anderson N, Moran D, Miller J, Ojemann J. Spectral Changes in Cortical Surface Potentials during Motor Movement. The Journal of Neuroscience. 2007;27:2424–2432. doi: 10.1523/JNEUROSCI.3886-06.2007.
- Miller K, Sorensen L, Ojemann J, Nijs M. Power-Law Scaling in the Brain Surface Electric Potential. PLoS Computational Biology. 2009b;5. doi: 10.1371/journal.pcbi.1000609.
- Miller, Zanos, Fetz, den Nijs, Ojemann. Decoupling the Cortical Power Spectrum Reveals Real-Time Representation of Individual Finger Movements in Humans. The Journal of Neuroscience. 2009a;29:3132–3137. doi: 10.1523/JNEUROSCI.5506-08.2009.
- Oostenveld R, Fries P, Maris E. FieldTrip: open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Computational intelligence and neuroscience. 2010;2011. doi: 10.1155/2011/156869.
- Ramsey NF, Salari E, Aarnoutse EJ, Vansteensel MJ, Bleichner MB, Freudenburg ZV. Decoding spoken phonemes from sensorimotor cortex with high-density ECoG grids. Neuroimage. 2017. doi: 10.1016/j.neuroimage.2017.10.011.
- Schalk G, McFarland D, Hinterberger T. BCI2000: a general-purpose brain-computer interface (BCI) system. IEEE Transactions on biomedical engineering. 2004;51:1034–1043. doi: 10.1109/TBME.2004.827072.
- Vansteensel MJ, Bleichner MG, Freudenburg ZV, Hermes D, Aarnoutse EJ, Leijten FSS, Ferrier CH, Jansma JM, Ramsey NF. Spatiotemporal characteristics of electrocortical brain activity during mental calculation. Hum Brain Mapp. 2014;35:5903–20. doi: 10.1002/hbm.22593.
- Vansteensel MJ, Pels EGMG, Bleichner MG, Branco MP, Denison T, Freudenburg ZV, Gosselaar P, Leinders S, Ottens TH, Van Den Boom MA, Van Rijen PC, et al. Fully Implanted Brain-Computer Interface in a Locked-In Patient with ALS. N Engl J Med. 2016;375:2060–2066. doi: 10.1056/NEJMoa1608085.
- Wang P, King C, McCrimmon C, Lin J, Sazgar M, Hsu F, Shaw S, Millet D, Chui L, Liu C, Do A, Nenadic Z. Comparison of decoding resolution of standard and high-density electrocorticogram electrodes. J Neural Eng. 2016;13:026016. doi: 10.1088/1741-2560/13/2/026016.








