Abstract
Electroencephalographic (EEG) signals present a myriad of challenges to analysis, beginning with the detection of artifacts. Prior approaches to noise detection have utilized multiple techniques, including visual methods, independent component analysis and wavelets. However, no single method is broadly accepted, inviting alternative ways to address this problem. Here, we introduce a novel approach based on a statistical physics method, multiscale entropy (MSE) analysis, which quantifies the complexity of a signal. We postulate that noise-corrupted EEG signals have lower information content and, therefore, reduced complexity compared with their noise-free counterparts. We test the new method on an open-access database of EEG signals with and without added artifacts due to electrode motion.
I. INTRODUCTION
Electroencephalographic (EEG) signals are essential to monitor brain function. Physiological and clinical analyses typically require huge amounts of data. Therefore, automated or semi-automated approaches that minimize and focus expert intervention are desirable. A major challenge is the detection of artifacts, which may be caused by external (e.g., electrode instability, power line noise) or internal (e.g., muscle or eye movement) factors [1].
Multiple approaches to noise detection have been proposed, including those based on independent component analysis (ICA) [2, 3], moment-based statistical methods [4], wavelet analysis [5], regression [6, 7], blind source separation [8, 9], averaged artifact subtraction [10], Bayesian classification [11], and combinations of methods [12-15]. All these methods have different strengths and limitations. However, currently, no consensus exists on the optimal ways to detect different types of EEG noise.
We approach this problem from the perspective of information theory. Our method, based on multiscale entropy (MSE) analysis [16, 17], is simple to implement and computationally efficient. This approach is motivated by the hypothesis that artifacts degrade signal information content, which can be quantified using the MSE method applied in a moving window.
II. MATERIALS AND METHODS
A. Database
We employed the Motion Artifact Contaminated EEG Database [18, 19], freely available on the PhysioNet website [20] at http://physionet.org/physiobank/database/motion-artifact/. This dataset comprises 23 recordings lasting approximately 8-9 minutes. Each recording includes two EEG signals from the pre-frontal cortex, acquired from transducers in close proximity to each other. In each case, one of the two transducers was undisturbed, while the other was manipulated to produce motion artifacts of variable duration. Simultaneous outputs of 3-axis accelerometers affixed to each transducer were also recorded to document motion-related noise. The EEG signals were sampled at 2048 Hz; the acceleration signals at 200 Hz.
The following procedure, illustrated in Fig. 1, was adopted to recognize movement artifacts inside each epoch:
Fig. 1.

Identification of EEG epochs with movement. (Top) Acceleration time series obtained by computing the amplitude of the acceleration vector from its three components x, y and z given in arbitrary units (a.u.). (Middle) Rectified detrended time series. (Bottom) Time series obtained by low-pass filtering the signal shown in the middle panel. The grey rectangles indicate the epochs with movement artifacts.
(i) Derivation of the acceleration time series (Fig. 1, top panel) by computing the amplitude of the acceleration vector from its three components x, y and z.
(ii) High-pass filtering of the acceleration signal to remove frequencies well below (< 0.2 Hz) those characteristic of movement artifact. We used a parabolic interpolation filter (function available at www.mit.edu/~gari/CODE/ECGtools/, parameter: n=1000 data points).
(iii) Rectification of the detrended acceleration signal by squaring its amplitude (Fig. 1, middle panel).
(iv) Low-pass filtering of the rectified detrended acceleration signal (Fig. 1, bottom panel) using a 200-data-point moving average window (1 second at the 200 Hz sampling rate of the acceleration signals).
(v) Determination of the temporal location and duration of the movement-contaminated segments (Fig. 1, bottom panel) by comparing the amplitude of the signal obtained from step (iv) with an empirically determined threshold (amplitude > 9 × 10⁻⁷ a.u. implies movement artifact). For the development of the EEG noise detection algorithm, we use the derived time series of 0 and 1 values (1 for movement, 0 for no movement) as the reference signal.
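Steps (i) to (v) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `moving_average` and `movement_reference` are hypothetical, and step (ii) is approximated by subtracting a moving-average baseline, which is a stand-in for the parabolic interpolation filter used in the study and not a reimplementation of it. The default threshold is only meaningful in the arbitrary units of the database.

```python
import numpy as np

def moving_average(x, width):
    """Moving average returning an array of the same length as x.

    The ends are padded with the edge values, so a constant input
    stays constant all the way to the boundaries.
    """
    left = width // 2
    right = width - 1 - left
    padded = np.pad(np.asarray(x, dtype=float), (left, right), mode="edge")
    return np.convolve(padded, np.ones(width) / width, mode="valid")

def movement_reference(ax, ay, az, detrend_width=1000, smooth_width=200,
                       threshold=9e-7):
    """Return a 0/1 array (1 = movement) from 3-axis accelerometer samples."""
    ax, ay, az = (np.asarray(a, dtype=float) for a in (ax, ay, az))
    # (i) Amplitude of the acceleration vector.
    magnitude = np.sqrt(ax**2 + ay**2 + az**2)
    # (ii) ASSUMPTION: remove slow trends by subtracting a moving-average
    # baseline (stand-in for the parabolic interpolation filter, n=1000).
    detrended = magnitude - moving_average(magnitude, detrend_width)
    # (iii) Rectify by squaring.
    rectified = detrended**2
    # (iv) Low-pass filter with a 200-point moving average (1 s at 200 Hz).
    envelope = moving_average(rectified, smooth_width)
    # (v) Compare with the empirical threshold to get the reference signal.
    return (envelope > threshold).astype(int)
```

Applied to a synthetic 20-second recording with a 2-second oscillatory burst on one axis, the function returns 1 inside the burst and 0 far from it.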
B. Multiscale Entropy Analysis
MSE [16, 17] quantifies the complexity of a signal by assessing the entropy of a set of time series, called coarse-grained time series, each representing the system's dynamics on a different time scale. The coarse-grained time series for scale s is obtained by averaging the data points inside consecutive non-overlapping windows, each with s data points.
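The coarse-graining step can be illustrated with a short Python sketch. The function name `coarse_grain` is hypothetical, and dropping the trailing samples that do not fill a complete window is an assumption made here for simplicity.

```python
import numpy as np

def coarse_grain(x, scale):
    """Average consecutive non-overlapping windows of `scale` data points."""
    x = np.asarray(x, dtype=float)
    n_windows = len(x) // scale  # incomplete trailing window is dropped
    return x[:n_windows * scale].reshape(n_windows, scale).mean(axis=1)

# Scale 2 averages the pairs (1, 2), (3, 4), (5, 6); the last sample is dropped.
print(coarse_grain([1, 2, 3, 4, 5, 6, 7], 2).tolist())  # -> [1.5, 3.5, 5.5]
```

For scale 1 the coarse-grained series is the original series itself.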
As a measure of entropy, the MSE method uses sample entropy (SampEn), which is the negative logarithm of the conditional probability that m-component patterns that match within a certain tolerance r will also match when their lengths increase by one data point. In this study, we chose m=2 and r=15% of the standard deviation of the signal.
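A minimal sketch of SampEn in Python is given below for illustration. The function name `sample_entropy` is hypothetical. The sketch assumes the common conventions of a maximum-absolute-difference (Chebyshev) distance, exclusion of self-matches, and a match when the distance is less than or equal to r; implementations differ in such details, so values may not agree exactly with those of other software.

```python
import numpy as np

def sample_entropy(x, m=2, r_frac=0.15):
    """SampEn with tolerance r = r_frac * SD of x (sketch, O(N^2) time)."""
    x = np.asarray(x, dtype=float)
    n_templates = len(x) - m  # same number of templates for lengths m and m+1
    r = r_frac * np.std(x)

    def count_matching_pairs(length):
        # Row i holds the template x[i], ..., x[i + length - 1].
        templates = np.column_stack(
            [x[k:k + n_templates] for k in range(length)])
        count = 0
        for i in range(n_templates - 1):
            # Distance from template i to every later template, so each pair
            # is counted once and self-matches are excluded.
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(dist <= r))
        return count

    b = count_matching_pairs(m)      # pairs matching for m points
    a = count_matching_pairs(m + 1)  # pairs still matching for m + 1 points
    if a == 0 or b == 0:
        return float("inf")          # undefined: no matches found
    return float(-np.log(a / b))
```

With this sketch a regular signal such as a sine wave yields a lower SampEn value than white noise of the same length.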
To derive the time series of the complexity indices (CI, unit-less), we applied the MSE algorithm to consecutive non-overlapping windows of 2 seconds (4096 samples). The complexity index was defined as the summation of the entropy values for scales 1 to 5. This range was selected based on the following considerations: i) By construction, the length of the coarse-grained time series for scale s is N/s, where N represents the total number of data points; ii) SampEn is largely independent of time series length for time series longer than 750 samples [21]; iii) Since our 2 second segments comprised 4096 samples, we were able to expand our analysis up to scale 5 (4096/750 ≈ 5). For scale 5, each data point represents an observation of approximately 2.5 ms (5/2048 s). Thus, 2- and 3-component patterns have durations of approximately 5 and 7.5 ms, respectively.
For artifact detection we then analyzed the CI time series. Fig. 2 shows an example of an EEG signal with 4 movement artifact segments and of the derived CI time series. Note that the segments corresponding to artifacts are characterized by lower values of the CI.
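The windowed CI computation can be sketched as follows. All function names are hypothetical. The sketch assumes that the tolerance r is computed once from the SD of each 2-second window and then held fixed across scales, as is usual in MSE analysis; the text above does not state this detail explicitly. The SampEn helper uses the same conventions as a plain pairwise implementation (Chebyshev distance, no self-matches).

```python
import numpy as np

def coarse_grain(x, scale):
    """Average consecutive non-overlapping windows of `scale` points."""
    n = len(x) // scale
    return x[:n * scale].reshape(n, scale).mean(axis=1)

def sample_entropy(x, m, r):
    """SampEn for pattern length m and absolute tolerance r."""
    n_templates = len(x) - m
    counts = []
    for length in (m, m + 1):
        templates = np.column_stack(
            [x[k:k + n_templates] for k in range(length)])
        count = 0
        for i in range(n_templates - 1):
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(dist <= r))
        counts.append(count)
    b, a = counts
    return float("inf") if a == 0 or b == 0 else float(-np.log(a / b))

def complexity_index(window, scales=(1, 2, 3, 4, 5), m=2, r_frac=0.15):
    """Sum of SampEn values of the coarse-grained series over `scales`."""
    window = np.asarray(window, dtype=float)
    r = r_frac * np.std(window)  # ASSUMPTION: r fixed across scales
    return sum(sample_entropy(coarse_grain(window, s), m, r) for s in scales)

def ci_time_series(signal, window_len=4096, **kwargs):
    """CI of consecutive non-overlapping windows (4096 samples = 2 s at 2048 Hz)."""
    signal = np.asarray(signal, dtype=float)
    n_windows = len(signal) // window_len
    return np.array([
        complexity_index(signal[k * window_len:(k + 1) * window_len], **kwargs)
        for k in range(n_windows)
    ])
```

The pairwise comparison is quadratic in the window length, so this sketch favors readability over speed; a production implementation would likely use an optimized SampEn routine.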
Fig. 2.

EEG noise detection. (Top) Original EEG signal. (Bottom) Complexity index (CI) time series. CI values are obtained from MSE analysis of consecutive 2-s EEG epochs. The grey rectangles indicate the epochs contaminated by movement.
C. Evaluation of the method
We started by concatenating the CI time series from all the EEG channels contaminated by movement artifact into a single time series. Next, we classified each 2-s EEG epoch as artifact free or movement artifact depending on whether the CI value for that epoch was above or below a given threshold, respectively. The ROC curve was derived by repeating this process for a range of threshold values, varying from the minimum to the maximum of the concatenated CI time series. The area under the curve (AUC) was used as an index of performance (Table 1, first row). The optimal CI threshold value (1.34, marked in red on the ROC curve in Fig. 3) was defined as the one providing the highest accuracy (number of correctly recognized epochs/total number of epochs).
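The threshold sweep described above can be sketched in Python as follows. The function name `roc_sweep` and the returned dictionary keys are hypothetical, the AUC is approximated here with the trapezoidal rule, and epochs whose CI equals the threshold are treated as artifact free; these are assumptions of the sketch, not details taken from the study. Both classes must be present in the labels.

```python
import numpy as np

def roc_sweep(ci, is_artifact, step=0.01):
    """Sweep a CI threshold and return ROC points, AUC and the best threshold."""
    ci = np.asarray(ci, dtype=float)
    is_artifact = np.asarray(is_artifact, dtype=bool)
    # Thresholds from the minimum CI upward; the final value lies above the
    # maximum so that the curve ends at (1, 1).
    thresholds = np.append(np.arange(ci.min(), ci.max(), step), ci.max() + step)
    n_pos = is_artifact.sum()
    n_neg = (~is_artifact).sum()
    sens = np.empty(len(thresholds))
    spec = np.empty(len(thresholds))
    acc = np.empty(len(thresholds))
    for k, th in enumerate(thresholds):
        flagged = ci < th  # CI below threshold -> classified as artifact
        tp = np.sum(flagged & is_artifact)
        tn = np.sum(~flagged & ~is_artifact)
        sens[k] = tp / n_pos
        spec[k] = tn / n_neg
        acc[k] = (tp + tn) / len(ci)
    fpr = 1.0 - spec
    # Trapezoidal area under the (fpr, sens) curve.
    auc = float(np.sum(np.diff(fpr) * (sens[1:] + sens[:-1]) / 2))
    best = int(np.argmax(acc))  # first threshold with the highest accuracy
    return {"thresholds": thresholds, "sensitivity": sens, "specificity": spec,
            "accuracy": acc, "auc": auc, "best_threshold": float(thresholds[best]),
            "best_accuracy": float(acc[best])}
```

Here `ci` would be the concatenated CI time series and `is_artifact` the reference labels derived from the accelerometer signals, resampled to one label per 2-s epoch.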
Table 1.
Detection statistics using the “optimal” threshold.
| | Sensitivity (%) | Specificity (%) | Accuracy (%) |
|---|---|---|---|
| Movement-disturbed leads | 79 | 98 | 96 |
| Control leads | Not applicable (a) | 98 | 98 |

(a) There are no noise-corrupted epochs in the control leads.
Fig. 3.

Receiver Operating Characteristic (ROC) curve derived from the complexity analysis of the database of movement-corrupted EEG leads. The threshold values range from 0.03 to 9.6, with a step size of 0.01. The red marker represents the threshold that provides maximum accuracy.
We also computed the sensitivity, specificity and accuracy values for the analysis of the control leads using the “optimal” CI threshold defined above (Table 1, second row).
III. RESULTS
The ROC curve derived from analysis of the movement-corrupted EEG signals is shown in Fig. 3. The AUC was 0.95 (95% confidence interval: [0.94, 0.96]).
As in the cases here, movement artifacts may appear on EEG as transient high amplitude spikes, leading to an increase of the SD of the time series for the affected epochs. Therefore, one might hypothesize that the performance of a method based on the analysis of the SD time series would be equivalent to that of the CI time series. However, the complexity analysis provides information not contained in the mean or the variance of a signal. Indeed, the CI and the SD are independent of each other. Note that the r parameter (tolerance) of the SampEn algorithm is chosen here as a percentage of the SD in order to eliminate the effect of signal amplitude on the entropy measure.
To illustrate the potential advantages of the CI method over the use of the SD, we next evaluated two examples of signals contaminated by low amplitude artifacts:
1) Artifacts containing periodic oscillations: We selected a noise-free EEG signal from our database and, at random locations, replaced a given amount of data with a periodic wave of similar amplitude (Fig. 4, top panel). By construction, the local SD values computed from noise-free segments were similar to those obtained from the artifact-laden segments (Fig. 4, bottom panel). In contrast, the complexity index was substantially higher for noise-free segments (~5) than for the segments of periodic artifact (~0).
Fig. 4.
(Top) EEG signal corrupted with square-wave artifacts of random duration (solid line). A square wave (dashed line) is used to indicate noise-free (lower values) and noise-corrupted (higher values) periods. (Middle) CI time series. Note that noise-corrupted epochs are characterized by low CI values. (Bottom) Standard deviation (SD) time series. Note that noise-corrupted and noise-free epochs have similar SD values.
2) Artifacts of low amplitude due to movement: We selected an EEG signal from our database with movement artifact and detrended it, again using the parabolic interpolation filter (parameter: n=500 data points), to eliminate slow baseline drifts on time scales much larger than those characteristic of movement artifact. We next rescaled the amplitude of the segments corresponding to movement artifact to match those of the surrounding noise-free segments (Fig. 5, top panel). By construction, the artifact-corrupted segments could not be identified from the analysis of the time series of local SD values. In contrast, the time series of the complexity indices showed a marked decrease in complexity for the noise-corrupted periods compared with the noise-free ones (Fig. 5, middle panel).
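The idea behind this first example can be reproduced on synthetic data with a short Python sketch. Several simplifications are assumed: Gaussian white noise is used as a stand-in for a noise-free EEG signal, the artifact is a square wave inserted at a fixed rather than random location, and SampEn at scale 1 only is used in place of the full CI. The function name `sample_entropy` is hypothetical.

```python
import numpy as np

def sample_entropy(x, m=2, r_frac=0.15):
    """SampEn with tolerance r = r_frac * SD of x (Chebyshev distance)."""
    x = np.asarray(x, dtype=float)
    n_templates = len(x) - m
    r = r_frac * np.std(x)
    counts = []
    for length in (m, m + 1):
        templates = np.column_stack(
            [x[k:k + n_templates] for k in range(length)])
        count = 0
        for i in range(n_templates - 1):
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(dist <= r))
        counts.append(count)
    b, a = counts
    return float("inf") if a == 0 or b == 0 else float(-np.log(a / b))

rng = np.random.default_rng(0)
window = 500
signal = rng.standard_normal(4 * window)  # ASSUMPTION: stand-in for clean EEG

# Replace the third window with a square wave (period 50 samples) scaled to
# the SD of the data it replaces, so the amplitude is similar by construction.
segment = slice(2 * window, 3 * window)
square = np.where((np.arange(window) // 25) % 2 == 0, 1.0, -1.0)
signal[segment] = square * np.std(signal[segment])

windows = [signal[k * window:(k + 1) * window] for k in range(4)]
sd = [float(np.std(w)) for w in windows]    # similar in all four windows
se = [sample_entropy(w) for w in windows]   # much lower in the third window
print("SD:    ", np.round(sd, 2))
print("SampEn:", np.round(se, 2))
```

In this toy setting the local SD cannot separate the periodic segment from the others, whereas the entropy value of that segment is far lower.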
Fig. 5.
(Top) EEG signal corrupted with low amplitude movement artifacts (solid line). A square wave (dashed line) is used to indicate noise-free (lower values) and noise-corrupted (higher values) periods. (Middle) CI time series. Note that periods with artifact have lower complexity. (Bottom) SD time series. Note that noise-corrupted and noise-free periods have similar SD values.
IV. DISCUSSION
We address the problem of movement artifact detection from EEG signals by introducing a computationally efficient method based on complexity analysis (as measured by the MSE method).
Previous articles have proposed the use of classical (i.e., single scale) entropy measures in artifact detection. For example, Delorme and colleagues [22] employed Shannon’s entropy together with kurtosis for the rejection of independent components of EEG signals. Inuso and colleagues [23] employed Renyi’s entropy in conjunction with wavelet decomposition.
Here, we explore for the first time, to our knowledge, the possibility of using MSE to detect artifact in EEG based on the hypothesis that physiological signals are more complex than their noise-corrupted counterparts.
We employed the MSE method to analyze an open-access database of EEG recordings affected by sensor movements. The preliminary findings are promising with respect to sensitivity and specificity. A major limitation of this study is that it focused on only one class of artifacts. The utility of the method with respect to a wide range of EEG artifacts, as well as pathophysiological signals related, for example, to seizures, needs to be systematically explored. Furthermore, it is likely that semi-automated approaches to EEG artifact detection will require an ensemble of methods, given the broad range of possible contaminating signals. Comparison with other methods also requires future studies.
Finally, we note that parameters of the detection algorithm can be adapted to different needs. Here we employed a window of 2 seconds. Depending on the resolution requirements in detecting noisy segments and on EEG sampling frequency, the time window can be appropriately adjusted.
V. CONCLUSIONS
We introduce a simple method based on multiscale entropy (MSE) analysis to facilitate EEG artifact detection. This method is easy to implement and can be applied in conjunction with other artifact detection methods. Further studies are needed to assess its utility and limitations and to compare it to currently used techniques.
Acknowledgments
Research supported by the Wyss Institute, the James S. McDonnell Foundation, the G. Harold & Leila Y. Mathers Foundation and the National Institutes of Health (NIA, NIGMS, and NHLBI) Grants - R24HL114473; R00AG030677 and R01GM104987.
Contributor Information
Sara Mariani, Wyss Institute for Biologically Inspired Engineering at Harvard University, Boston, MA, USA.
Ana F. T. Borges, Department of Integrative Neurophysiology, Center for Neurogenomics and Cognitive Research, VU University Amsterdam, Amsterdam, Netherlands (aftborges@gmail.com)
Teresa Henriques, Wyss Institute for Biologically Inspired Engineering at Harvard University, Boston, MA, USA.
Ary L. Goldberger, Wyss Institute for Biologically Inspired Engineering at Harvard University and with the Margret and H.A. Rey Institute of Nonlinear Dynamics in Physiology and Medicine, Division of Interdisciplinary Medicine and Biotechnology, Beth Israel Deaconess Medical Center, Boston, MA, USA (agoldber@bidmc.harvard.edu)
Madalena D. Costa, Wyss Institute for Biologically Inspired Engineering at Harvard University and with the Margret and H.A. Rey Institute of Nonlinear Dynamics in Physiology and Medicine, Division of Interdisciplinary Medicine and Biotechnology, Beth Israel Deaconess Medical Center, Boston, MA, USA (mcosta3@bidmc.harvard.edu)
REFERENCES
- [1]. Mammone N, Morabito FC. Independent component analysis and high-order statistics for automatic artifact rejection. IJCNN'05 Proceedings. 2005. pp. 2447–2452.
- [2]. Mantini D, Perrucci MG, Cugini S, Ferretti A, Romani GL, Del Gratta C. Complete artifact removal for EEG recorded during continuous fMRI using independent component analysis. Neuroimage. 2007;34:598–607. doi: 10.1016/j.neuroimage.2006.09.037.
- [3]. James CJ, Gibson OJ. Temporally constrained ICA: an application to artifact rejection in electromagnetic brain signal analysis. IEEE T Biomed-Eng. 2003;50:1108–1116. doi: 10.1109/TBME.2003.816076.
- [4]. Junghöfer M, Elbert T, Tucker DM, Rockstroh B. Statistical control of artifacts in dense array EEG/MEG studies. Psychophysiology. 2000;37:523–532.
- [5]. Mammone N, La Foresta F, Morabito FC. Automatic artifact rejection from multichannel scalp EEG by wavelet ICA. IEEE Sens Jour. 2012;12:533–542.
- [6]. Schlögl A, Keinrath C, Zimmermann D, Scherer R, Leeb R, Pfurtscheller G. A fully automated correction method of EOG artifacts in EEG recordings. Clin Neurophys. 2007;118:98–104. doi: 10.1016/j.clinph.2006.09.003.
- [7]. Gasser T, Möcks J. Correction of EOG artifacts in event-related potentials of the EEG: Aspects of Reliability and Validity. Psychophysiology. 1982;19:472–480. doi: 10.1111/j.1469-8986.1982.tb02509.x.
- [8]. Joyce CA, Gorodnitsky IF, Kutas M. Automatic removal of eye movement and blink artifacts from EEG data using blind component separation. Psychophysiology. 2004;41:313–325. doi: 10.1111/j.1469-8986.2003.00141.x.
- [9]. Jung T, Makeig S, Humphries C, Lee T, Mckeown MJ, Iragui V, Sejnowski TJ. Removing electroencephalographic artifacts by blind source separation. Psychophysiology. 2000;37:163–178.
- [10]. Brookes MJ, Mullinger KJ, Stevenson CM, Morris PG, Bowtell R. Simultaneous EEG source localisation and artifact rejection during concurrent fMRI by means of spatial filtering. Neuroimage. 2008;40:1090–1104. doi: 10.1016/j.neuroimage.2007.12.030.
- [11]. LeVan P, Urrestarazu E, Gotman J. A system for automatic artifact removal in ictal scalp EEG based on independent component analysis and Bayesian classification. Clin Neurophys. 2006;117:912–927. doi: 10.1016/j.clinph.2005.12.013.
- [12]. Gwin JT, Gramann K, Makeig S, Ferris DP. Removal of movement artifact from high-density EEG recorded during walking and running. J. Neurophysiol. 2010 Jun;103:3526–3534. doi: 10.1152/jn.00105.2010.
- [13]. Reilly R, Nolan H. FASTER: Fully Automated Statistical Thresholding for EEG artifact Rejection. J Neurosci Meth. 2010. doi: 10.1016/j.jneumeth.2010.07.015.
- [14]. Mognon A, Jovicich J, Bruzzone L, Buiatti M. ADJUST: An automatic EEG artifact detector based on the joint use of spatial and temporal features. Psychophysiology. 2011;48:229–240. doi: 10.1111/j.1469-8986.2010.01061.x.
- [15]. Delorme A, Sejnowski T, Makeig S. Enhanced detection of artifacts in EEG data using higher-order statistics and independent component analysis. Neuroimage. 2007;34:1443–1449. doi: 10.1016/j.neuroimage.2006.11.004.
- [16]. Costa M, Goldberger AL, Peng CK. Multiscale entropy analysis of complex physiologic time series. Phys. Rev. Lett. 2002;89:068102. doi: 10.1103/PhysRevLett.89.068102.
- [17]. Costa M, Goldberger AL, Peng CK. Multiscale entropy analysis of biological signals. Phys Rev E. 2005;71:021906. doi: 10.1103/PhysRevE.71.021906.
- [18]. Sweeney KT, Ayaz H, Ward TE, Izzetoglu M, McLoone SF, Onaral B. A methodology for validating artifact removal techniques for physiological signals. IEEE T Inf Technol B. 2012;16:918–926. doi: 10.1109/TITB.2012.2207400.
- [19]. Sweeney KT, McLoone SF, Ward TE. The use of ensemble empirical mode decomposition with canonical correlation analysis as a novel artifact removal technique. IEEE T Bio-Med Eng. 2013;60:97–105. doi: 10.1109/TBME.2012.2225427.
- [20]. Goldberger AL, Amaral LA, Glass L, Hausdorff JM, Ivanov PC, Mark RG, Mietus JE, Moody GB, Peng CK, Stanley HE. PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation. 2000;101:E215–E220. doi: 10.1161/01.cir.101.23.e215.
- [21]. Richman JS, Moorman JR. Physiological time-series analysis using approximate entropy and sample entropy. Am J Physiol-Heart C. 2000;278:H2039. doi: 10.1152/ajpheart.2000.278.6.H2039.
- [22]. Delorme A, Makeig S, Sejnowski T. Automatic artifact rejection for EEG data using high-order statistics and independent component analysis. International Workshop on ICA; San Diego, CA. 2001.
- [23]. Inuso G, La Foresta F, Mammone N, Morabito FC. Brain activity investigation by EEG processing: Wavelet analysis, kurtosis and Renyi's entropy for artifact detection. Information Acquisition, 2007. ICIA'07. International Conference On. 2007. pp. 195–200.


