ABSTRACT
Sleep electroencephalogram (EEG) recordings can be contaminated by artifacts. Visual and automatic methods have been developed to mark such erroneous segments of EEG data. Here, we systematically explored the effect of artifacts on the sleep EEG power spectrum density (PSD), and we compared gold‐standard visual detections to a simple automatic detector using Hjorth parameters to identify artifacts. We found that most distortions in the all‐night average PSD occur because of a small minority of highly anomalous artifacts, which mainly affect the beta and gamma frequency ranges and NREM delta. Visual and automatic detections only showed moderate agreement in which data segments are artifactual. However, the resulting all‐night average PSD is highly similar across all methods, and PSDs calculated with all methods successfully recover the known correlations of PSD with age and sex. No parameter settings of the automatic detector clearly outperformed others. Additionally, we showed that accurate average PSD estimates can be recovered from just a fraction of available data epochs. Our results suggest that artifacts represent a minor and easily solvable problem in sleep EEG recordings. Most visually identified artifacts do not seriously distort estimates of mid‐frequency activity in the sleep EEG spectrum, and distortions to low and high frequencies can be eliminated using a simple automatic detection method nearly as well as with visual detections. These findings show that the visual inspection of EEG data is not necessary to eliminate the effects of artifacts, which is encouraging for the expected performance of automatic preprocessing in large sleep EEG databases.
Keywords: artifacts, automatic data processing, data quality, EEG, sleep
Short abstract
Our study evaluates the effects of artifacts on the power spectrum density (PSD) of the sleep electroencephalogram (EEG) and compares visual and automatic artifact detection methods. It demonstrates that accurate PSD estimates can be achieved without extensive visual inspection, significantly facilitating data processing in large EEG datasets. This work supports the reliability of simple automatic methods in sleep EEG analysis and the feasibility of using automated processes in large‐scale research.
1. Introduction
With the advent of large, freely accessible databases, there is a renewed interest in the sleep electroencephalogram (EEG) as a source of disease biomarkers and features related to normal brain function (Zhang et al. 2018; Redline and Purcell 2021). The sleep EEG has high within‐individual stability (De Gennaro et al. 2005), rendering it especially promising as a marker of modestly time‐invariant characteristics. For example, age can be predicted with high accuracy from the sleep EEG in healthy populations (Sun et al. 2019; Ujma et al. 2019), while sleep age is higher than chronological age in clinical samples (Sun et al. 2023). The sleep EEG has been used to seek biomarkers of, for example, intelligence (Ujma et al. 2017; Ujma et al. 2023), depression (Olbrich and Arns 2013; de Aguiar Neto and Rosa 2019), schizophrenia (Chan et al. 2017), and Alzheimer disease (Zhang et al. 2022). The sleep EEG is altered in stereotypical ways across many health conditions (Ujma and Bódizs 2024a). Within‐individual fluctuations in the sleep EEG can reflect homeostatic pressure as a result of extended wakefulness (Borbély et al. 1981; Tononi and Cirelli 2014; Ujma and Bódizs 2024b) (or, hypothetically, a more engaging wakeful period preceding sleep (Taji, Pierson, and Ujma 2023; Horne and Minard 1985)) and possibly sleep quality (but see Pierson‐Bartel and Ujma 2024). Poor replication records of underpowered neuroscience studies (Button et al. 2013; Szucs and Ioannidis 2017) necessitate the use of very large samples whenever such sleep EEG biomarkers are sought (Marek et al. 2022). In large samples, the data cleaning required to process sleep EEG data increasingly requires automatic methods.
Sleep EEG data are free from blinking artifacts and less contaminated by movement artifacts than wakeful EEG; however, various artifacts related to muscle activity during arousals, cardiac activity, the temporary disconnection of electrodes, or perspiration may contaminate it (Islam, Rastegarnia, and Yang 2016). In small sleep EEG studies, it is possible to visually detect artifacts, that is, have a person screen all EEG data and mark up artifactual epochs by hand. However, this task is increasingly impractical in case of large databases (Mariani et al. 2018). Various automatic detection methods have been developed to detect artifacts in sleep EEG data, using various features such as the slow/fast power ratio (Mariani et al. 2018), abnormal amplitude or power (D'Rozario et al. 2015; Brunner et al. 1996), Riemannian geometry (Saifutdinova et al. 2019), artifact subspace reconstruction (Somervail et al. 2023), or semi‐automatic methods (Leach, Sousouri, and Huber 2023) (see Islam, Rastegarnia, and Yang 2016; Cox and Fell 2020; Anderer et al. 1999; Urigüen and Garcia‐Zapirain 2015; Motamedi‐Fakhr et al. 2014; Malafeev et al. 2019 for reviews).
Many previously published automatic artifact detection algorithms achieved high agreement with visual detections (Mariani et al. 2018; Saifutdinova et al. 2019; Malafeev et al. 2019; 't Wallant et al. 2016) and significantly improved signal quality. However, to our knowledge, no empirical study has systematically investigated how features calculated from the sleep EEG are affected by the erroneous inclusion of artifactual data (but see Malafeev et al. 2019 for a partial analysis and Brunner et al. 1996 for a systematic description of the effect of muscle artifacts). There is also no consensus about simple and effective automatic methods which can replace the visual scoring of recordings before analysis, even though this would be critical in large datasets where visual scoring is not feasible. In the current study, specifically designed to bridge this gap, we used a large sample of healthy volunteers to compare visual detections with a simple automatic method pioneered in recent analyses of large sleep EEG datasets available at sleepdata.org (Purcell et al. 2017; Djonlagic et al. 2021) and implemented in the free Luna tool developed to process these datasets (https://zzz.bwh.harvard.edu/luna/). We showed that a minority of visually scored artifacts are responsible for most of the distortions of EEG spectra and that these are readily captured by a simple method based on finding epochs with outlying Hjorth parameters. The resulting power spectrum density (PSD) estimates are similar to those obtained with visual scorings and reliably demonstrate similar correlations with age and sex.
2. Methods
2.1. Participants and EEG Acquisition
We used all‐night EEG data from 252 healthy volunteers (mean age = 25.14 [SD = 12.21, range 3.8–69], 130 males, 122 females). The technical details of EEG acquisition in this dataset were previously reported (Bódizs et al. 2017; Bódizs et al. 2022) and reproduced in Table S1. In short, this is a multicenter dataset of the Max Planck Institute of Psychiatry (Munich, Germany) and the Psychophysiology and Chronobiology Research Group of Semmelweis University (Budapest, Hungary). Participants underwent all‐night polysomnography recordings for two consecutive nights. We used data from the second night, from EEG channels common to all recordings: Fp1, Fp2, F3, F4, C3, C4, P3, P4, O1, and O2, all referenced to the mathematically linked mastoids. Impedances were kept at < 10 kΩ. Hypnograms for all recordings were visually scored on a 20 s basis according to standard criteria (Iber et al. 2007). In the current version of the dataset, an additional 28‐year‐old male was included in the “MPIP‐I” subsample (see the previous studies for subsample description).
Study procedures were approved by the ethical boards of Semmelweis University, the Medical Faculty of the Ludwig Maximilian University, or the Budapest University of Technology and Economics. All participants were volunteers who gave informed consent in line with the Declaration of Helsinki. In case of participants who were minors, written informed consent was provided by the parents.
2.2. Artifact Detection and Power Spectral Density (PSD) Analysis
Manual artifact detection was completed in all recordings based on visual inspection of the EEG waveforms by experts in sleep EEG scoring and analysis. This detection was performed on a 4‐s basis. Each 4‐s epoch was either confirmed as good quality or rejected, with this scoring applied to all channels. Artifact scoring aimed to exclude all visually identifiable artifacts from quantitative EEG processing, including physiological/biological ones (myogenic, eye or body movement‐related, teeth grinding, sweating, respiration, or pulse‐related) and technical ones (electrode pop).
Automatic artifact detection was based on Hjorth parameters (Hjorth 1970), which have been used for artifact detection in several recent studies of large‐scale sleep datasets (Purcell et al. 2017; Djonlagic et al. 2021). Hjorth parameters are computationally simple indicators of the statistical properties of time series signals. The three Hjorth parameters are activity, mobility, and complexity.
Activity is defined as

$$\text{Activity}(y(t)) = \operatorname{var}(y(t))$$

where y(t) is the signal. In other words, activity is simply the variance of the signal amplitude, which according to Parseval's theorem equals the total area under the power spectral density curve.
Mobility is defined as

$$\text{Mobility}(y(t)) = \sqrt{\frac{\operatorname{var}\left(\frac{dy(t)}{dt}\right)}{\operatorname{var}(y(t))}}$$

where y(t) is the signal and dy(t)/dt is the first derivative of the signal. Mobility reflects the average slope of the EEG relative to overall signal amplitude (a preponderance of sudden changes in EEG amplitude leads to higher mobility).
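Complexity is defined as

$$\text{Complexity}(y(t)) = \frac{\text{Mobility}\left(\frac{dy(t)}{dt}\right)}{\text{Mobility}(y(t))}$$
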
In other words, complexity is the mobility of the first derivative of the signal divided by the mobility of the signal. The minimal value (comp = 1) of this ratio would emerge in the case of a pure sine wave (no complexity).
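The three definitions above can be illustrated with a minimal sketch. This is not the custom MATLAB code used in the study; the function name `hjorth_parameters`, the finite-difference approximation of the derivatives, the 250 Hz sampling rate, and the 10 Hz test sine are all assumptions made for illustration.

```python
import math
from statistics import pvariance

def hjorth_parameters(y, dt=1.0):
    """Return (activity, mobility, complexity) for one epoch of samples y.

    Derivatives are approximated by first-order finite differences,
    which is an assumption of this sketch.
    """
    d1 = [(b - a) / dt for a, b in zip(y, y[1:])]    # first derivative
    d2 = [(b - a) / dt for a, b in zip(d1, d1[1:])]  # second derivative
    activity = pvariance(y)                           # variance of the signal
    mobility = math.sqrt(pvariance(d1) / activity)
    mobility_d1 = math.sqrt(pvariance(d2) / pvariance(d1))
    complexity = mobility_d1 / mobility               # ratio of mobilities
    return activity, mobility, complexity

# Hypothetical example: one 4-s epoch of a pure 10 Hz sine sampled at 250 Hz
fs = 250
sine = [math.sin(2 * math.pi * 10 * n / fs) for n in range(4 * fs)]
activity, mobility, complexity = hjorth_parameters(sine, dt=1 / fs)
```

For this pure sine, complexity comes out very close to its minimal value of 1, and mobility is close to the angular frequency of the sine, in line with the description above.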
We calculated all three Hjorth parameters for all 4‐s epochs in each participant, on each channel, separately in NREM and REM using custom code written in MATLAB R2021a.
Hjorth parameters follow a continuous distribution, but the decision whether to include an epoch in analyses is binary. We explored how individual average spectra calculated using various decision criteria relate to the gold standard visual detection method. First, we calculated within‐participant, within‐channel z‐scores of all Hjorth parameters across all 4‐s epochs within the same vigilance state (NREM or REM). Second, we calculated individual mean spectra only considering epochs in which z‐scores did not reach a z‐score threshold. These thresholds were set at z = 1, 1.5, 2, 2.5, 3, and 4, representing six increasingly lax criteria for which epochs are considered artifactual. In a further step, we explored a dual‐threshold method where z‐scores were re‐calculated after artifactual epochs were rejected by the initial threshold, and some of the remaining epochs were rejected again based on a new threshold applied to the updated z‐scores (Purcell et al. 2017). The rationale behind this approach is that the first threshold removes gross outliers and the second targets more moderate artifacts. For the dual‐threshold approach, we used five thresholds of z = 1/1, 1.5/1.5, 2/2, 2.5/2.5, and 3/3 for the first and second thresholds, respectively. In total, 13 individual average power spectra were calculated for each participant, each channel, and each vigilance state (NREM and REM): one with visual artifact detection, one with no artifact detection at all, and 11 with different Hjorth parameter thresholds applied in the automatic analyses. Power spectra were log10 transformed, but not relativized. Where analyses were performed within frequency bands, we used the following band criteria: delta (0.5–4 Hz), theta (4–8 Hz), alpha (8–10 Hz), low sigma (10–12.5 Hz), high sigma (12.5–16 Hz), beta (16–30 Hz), and gamma (30–49 Hz). Frequency bins at the boundaries were always included in the lower frequency band.
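The single‐ and dual‐threshold logic can be sketched as follows. This is an illustrative reading of the procedure rather than the study's implementation: the function name `reject_epochs`, the use of absolute z‐scores (two‐sided rejection), the sample standard deviation, and the toy data are assumptions.

```python
from statistics import mean, stdev

def reject_epochs(params, thresholds):
    """Flag artifactual epochs from per-epoch Hjorth parameters.

    params: list of (activity, mobility, complexity) tuples, one per epoch,
            all from one participant, channel, and vigilance state.
    thresholds: one z threshold per pass, e.g. (2.5,) for a single
                threshold or (2.5, 2.5) for the dual-threshold method.
    Returns a list of booleans, True where the epoch is flagged.
    """
    flagged = [False] * len(params)
    for thr in thresholds:
        # z-scores are recomputed on the epochs that survived earlier passes
        kept = [i for i, f in enumerate(flagged) if not f]
        for k in range(len(params[0])):
            vals = [params[i][k] for i in kept]
            m, s = mean(vals), stdev(vals)
            if s == 0:
                continue  # constant parameter: nothing can be an outlier
            for i in kept:
                # Assumption: two-sided rejection on the absolute z-score
                if abs(params[i][k] - m) / s > thr:
                    flagged[i] = True
    return flagged

# Hypothetical toy data: 30 ordinary epochs, one gross and one moderate outlier
params = [(1.0 + 0.01 * (i % 5), 1.0, 1.0) for i in range(30)]
params += [(50.0, 1.0, 1.0), (2.0, 1.0, 1.0)]
single = reject_epochs(params, (3,))
dual = reject_epochs(params, (3, 3))
```

In this toy case the gross outlier inflates the standard deviation so much that the moderate outlier passes the first threshold; it is only caught once z‐scores are recomputed in the second pass, which is the rationale given above for the dual‐threshold approach.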
Power spectral density was calculated for each 4‐s epoch in the data (with 2 s overlap) using the periodogram() MATLAB function with a Hamming window. All‐night average power spectra were obtained by averaging the spectra of all nonartifactual epochs. An epoch was considered nonartifactual (in both visual and automatic detections) if neither its starting point nor its end point was included in an epoch marked as artifactual.
2.3. Statistical Analysis
In our analyses, we explored (1) the agreement between visual and automatic artifact detection methods on the epoch level, (2) the similarity of the individual average PSD across visual and automatic methods, and (3) the extent to which the correlations of individual average PSD with age and sex can be recovered using various detection methods. Visual and automatic artifact detections were compared using Cohen's kappa. PSD similarity was quantified as the Pearson correlation (between participants) of the log10‐transformed PSD estimates resulting from visual and automatic detections in each frequency bin. Pearson correlations between log10‐transformed PSD estimates and age and sex were also calculated for each frequency bin. Sex was binary‐coded as 1 for males and 0 for females, rendering the calculated correlation a point biserial correlation, with negative coefficients indicating lower PSD estimates in males.
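As a brief illustration of the sex coding, the point biserial correlation is simply a Pearson correlation in which one variable takes the values 0 and 1. The sketch below uses a hand‐written `pearson_r` helper and six invented log10 PSD values; neither comes from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values for one frequency bin: 1 = male, 0 = female
sex = [1, 1, 1, 0, 0, 0]
log_psd = [1.0, 1.1, 0.9, 1.4, 1.5, 1.3]
r_sex = pearson_r(sex, log_psd)  # negative: lower PSD in males
```

The same helper applied to the visual‐detection and automatic‐detection PSD estimates of all participants in one frequency bin would give the binwise PSD similarity described above.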
Throughout the manuscript, we refrain from reporting p‐values in detail and focus on effect sizes instead. This is because, for the purposes of our analyses, merely rejecting the null hypothesis of no similarity between automatic and visual artifact detections is in our view not sufficient, and the degree of similarity (e.g., a correlation coefficient) is more convincing. For example, at our sample size (N = 252), the critical correlation coefficient at alpha = 0.05 is r = 0.124. A correlation of this size, while statistically significant, would in practice mean that the similarity of automatic and visual methods is not sufficient to make them interchangeable. At effect sizes sufficiently high to support this conclusion, p‐values are infinitesimally small (e.g., even at r = 0.28, p < 0.00001).
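The critical value quoted above can be recovered from the standard relationship between a Pearson correlation and the t statistic. The worked step below assumes a two‐tailed test with N − 2 = 250 degrees of freedom, for which the critical t is approximately 1.97.

$$r_{\text{crit}} = \frac{t_{\text{crit}}}{\sqrt{t_{\text{crit}}^{2} + N - 2}} \approx \frac{1.97}{\sqrt{1.97^{2} + 250}} \approx 0.124$$

Conversely, a correlation of r = 0.28 corresponds to

$$t = \frac{r\sqrt{N-2}}{\sqrt{1-r^{2}}} = \frac{0.28\sqrt{250}}{\sqrt{1-0.28^{2}}} \approx 4.61,$$

which lies far in the tail of the t distribution with 250 degrees of freedom.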
3. Results
3.1. Distribution of Visually Identified Artifact Effects
To simulate how much it affects average PSD estimates if a visual or automatic artifact detector misses errors in the data, we calculated the average PSD with each visually identified artifact erroneously included in analyses (but leaving other artifacts excluded). Figures 1 and 2 illustrate the result of these simulations in NREM and REM, respectively.
FIGURE 1.

The effect of artifact nonexclusion on NREM PSD estimates. Panel (A) shows the average PSD estimate from a random participant and a random channel (O2) if one of the artifacts (but not the others) is erroneously included. Note that the lines almost perfectly overlap, except for high‐frequency activity. The coefficient of variation (standard deviation of artifact estimates divided by the mean, calculated separately for each frequency bin) gives a quantitative estimate of this effect. Coefficient of variation (shown as a red line on a separate y axis) was calculated for each participant from the same channel and averaged across participants. Panel (B) shows histograms of the biasing effects of artifacts, estimated as the de‐artifacted/artifactual log PSD summed across all bins in the frequency range. Ratios are shown by frequency range and pooled across participants and channels (between‐channel variance was minimal, and added between‐channel SD shading was not visible). Ratios < 0.97 and > 1.03 (over 3% change in the final PSD estimate with the inclusion of an artifact) were set to 0.97 and 1.03, respectively. Note the symmetrical distribution of the mid‐frequency artifact effects, but highly asymmetrical delta and beta/gamma effects.
FIGURE 2.

The effect of artifact nonexclusion on REM PSD estimates. Panel (A) shows the average PSD estimate from a random participant and a random channel (O2) if one of the artifacts (but not the others) is erroneously included. Note that the lines almost perfectly overlap, except for high‐frequency activity. The coefficient of variation (standard deviation of artifact estimates divided by the mean, calculated separately for each frequency bin) gives a quantitative estimate of this effect. Coefficient of variation (shown as a green line on a separate y axis) was calculated for each participant from the same channel and averaged across participants. Panel (B) shows histograms of the biasing effects of artifacts, estimated as the de‐artifacted/artifactual log PSD summed across all bins in the frequency range. Ratios are shown by frequency range and pooled across participants and channels (between‐channel variance was minimal, and added between‐channel SD shading was not visible). Ratios < 0.97 and > 1.03 (over 3% change in the final PSD estimate with the inclusion of an artifact) were set to 0.97 and 1.03, respectively. Note the symmetrical distribution of the mid‐frequency artifact effects, but highly asymmetrical delta and beta/gamma effects.
In both vigilance states, missing individual artifacts made little difference to the final PSD estimate in the ~2–15 Hz frequency range. The coefficient of variation, which indicates how much the final PSD estimate varies after the erroneous inclusion of individual artifacts, started notably increasing over 30 Hz in REM, but already at over 15 Hz in NREM, with an additional increase in the low‐frequency ranges. The modal artifact made minimal difference to the PSD estimates if erroneously included in analyses. However, a sizable minority affected PSD estimates by more than 3%; note that this is the effect of misidentifying a single artifact out of thousands of epochs. In the theta, alpha, and sigma ranges and in REM delta, the effect of artifacts on PSD estimates was symmetrical, with some artifacts causing an under‐ and others an overestimate of PSD. This was, however, not the case for NREM delta and for beta and gamma in both vigilance states. NREM delta PSD estimates were almost always lower and NREM/REM beta and gamma PSD estimates were almost always higher if artifacts were erroneously included. In other words, artifacts in NREM contained less low‐frequency activity than normal data, while in both vigilance states, artifacts contained more high‐frequency activity.
3.2. Visual and Automatic Artifact Detections Show Moderate Agreement
Various metrics of detection agreement are shown in Table 1. Table S2 shows a detailed breakdown of all detections as true positive, true negative, false positive, or false negative.
TABLE 1.
Artifact detection characteristics.
| | Visual detection | Hjorth 1 | Hjorth 1.5 | Hjorth 2 | Hjorth 2.5 | Hjorth 3 | Hjorth 4 | Hjorth 1/1 | Hjorth 1.5/1.5 | Hjorth 2/2 | Hjorth 2.5/2.5 | Hjorth 3/3 | No artifact detection |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NREM | | | | | | | | | | | | | |
| Artifact rate | 0.08 | 0.31 | 0.13 | 0.06 | 0.03 | 0.02 | 0.01 | 0.72 | 0.32 | 0.15 | 0.09 | 0.05 | |
| Detection vs. visual | | 0.09 | 0.18 | 0.22 | 0.22 | 0.20 | 0.16 | 0.03 | 0.11 | 0.19 | 0.22 | 0.22 | |
| PSD vs. visual | | 0.97 | 0.96 | 0.95 | 0.94 | 0.93 | 0.90 | 0.96 | 0.97 | 0.97 | 0.97 | 0.96 | 0.78 |
| REM | | | | | | | | | | | | | |
| Artifact rate | 0.18 | 0.23 | 0.14 | 0.09 | 0.05 | 0.03 | 0.01 | 0.58 | 0.29 | 0.19 | 0.14 | 0.09 | |
| Detection vs. visual | | 0.15 | 0.20 | 0.20 | 0.18 | 0.15 | 0.11 | 0.06 | 0.16 | 0.23 | 0.24 | 0.22 | |
| PSD vs. visual | | 0.96 | 0.93 | 0.91 | 0.88 | 0.85 | 0.81 | 0.97 | 0.97 | 0.97 | 0.95 | 0.93 | 0.67 |
Note: This table shows the proportion of epochs marked as artifacts in each method (“artifact rate”), agreement with visual detections based on Cohen's kappa (“Detection vs. visual”), and the PSD correlations (“PSD vs. visual”) between the visual detection method and each alternative. Cohen's kappa values are averaged across channels. PSD correlations are averaged across channels and frequency bins. For automatic detections, Hjorth parameter thresholds are expressed in standard deviation units. In case of dual thresholds, the two thresholds are separated by /.
Visual artifact detection marked on average (across participants and channels) 8% of epochs in NREM and 18% of epochs in REM as artifactual. This rate was overestimated by the lowest z‐score thresholds (e.g., z = 1), which reject the most epochs, and generally underestimated by the highest thresholds (e.g., z = 4), which reject the fewest. In terms of agreement with visual detections, the best‐performing automatic detector was the 2.5 SD dual‐threshold method in both NREM and REM.
We calculated Cohen's kappa between the binary artifact vectors produced by the visual method and those produced by automatic methods. This measure of agreement remained low regardless of the method used. The highest agreement in NREM (0.22) was achieved by several threshold settings, and in REM the z = 2.5/2.5 dual‐threshold method produced the highest agreement (0.24).
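For reference, Cohen's kappa for two binary artifact vectors can be computed as in the sketch below. The function name `cohens_kappa` and the ten‐epoch example vectors are invented for illustration and are not the study's data.

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two equally long binary (0/1) vectors."""
    n = len(a)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    p_a, p_b = sum(a) / n, sum(b) / n
    # Chance agreement: both mark artifact, or both mark clean
    p_expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical example: 1 = artifactual epoch, 0 = clean epoch
visual = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
automatic = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
kappa = cohens_kappa(visual, automatic)
```

In this toy case the two vectors agree on 8 of 10 epochs, yet kappa is only 0.375, because most of that agreement is expected by chance when artifacts are rare. This is one reason kappa can remain low even when two detectors retain largely the same data.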
In order to estimate what recording characteristics affected the similarity of visual and automatic detection, we fitted a mixed‐effects linear model using the fitlme() function in MATLAB, with Cohen's kappa values between automatic and visual detections as the dependent variable, participant age, vigilance state, recording channel, and artifact detection method as fixed effects, and a random intercept per participant. Results are detailed in Table 2. Briefly, we found that lower participant age, NREM vigilance state, and a more posterior channel location were associated with lower agreement between visual and automatic detections. Less stringent detection criteria, especially with dual thresholds, were also associated with higher agreement. Effect sizes, however, were generally very low.
TABLE 2.
The effect of participant age, vigilance state, recording channel, and the Hjorth parameter threshold used in automatic analyses on the agreement between visual and automatic detectors.
| Predictor | Level | B | t | p |
|---|---|---|---|---|
| Intercept | | 0.187 | 18.198 | < 0.0001 |
| Age | | −0.002 | −5.994 | < 0.0001 |
| REM | | 0.006 | 8.933 | < 0.0001 |
| Channel | C4 | −0.001 | −0.906 | 0.365 |
| | Fp1 | −0.002 | −1.506 | 0.132 |
| | Fp2 | −0.008 | −5.097 | < 0.0001 |
| | F3 | −0.005 | −3.078 | 0.002 |
| | F4 | −0.005 | −2.937 | 0.003 |
| | P3 | −0.031 | −19.267 | < 0.0001 |
| | P4 | −0.030 | −19.121 | < 0.0001 |
| | O1 | −0.033 | −20.475 | < 0.0001 |
| | O2 | −0.033 | −21.011 | < 0.0001 |
| Hjorth threshold | 1.5 | 0.070 | 41.950 | < 0.0001 |
| | 2 | 0.091 | 54.441 | < 0.0001 |
| | 2.5 | 0.079 | 47.272 | < 0.0001 |
| | 3 | 0.057 | 33.932 | < 0.0001 |
| | 4 | 0.012 | 7.237 | < 0.0001 |
| | 1/1 | −0.077 | −45.864 | < 0.0001 |
| | 1.5/1.5 | 0.013 | 7.854 | < 0.0001 |
| | 2/2 | 0.085 | 50.775 | < 0.0001 |
| | 2.5/2.5 | 0.107 | 64.134 | < 0.0001 |
| | 3/3 | 0.102 | 61.126 | < 0.0001 |
Note: Categorical variable effects are expressed relative to the reference categories NREM, channel C3, and Hjorth parameter threshold z = 1.
Raincloud plots (Allen et al. 2019) illustrating the similarity of automatic detections with various thresholds on various channels to the visual detection gold standard are available in the Supporting Information for NREM (Figure S1) and REM (Figure S2).
3.3. Different Methods Result in Similar Average Power Spectra
In the next step, we explored the similarity between average PSDs calculated using visual and automatic artifact detection methods. For this, we extracted the PSD estimate at each 0.25 Hz bin between 0 and 49 Hz from each participant produced by the two methods (visual method and the automatic method to be compared with it) and calculated Pearson correlations between the two estimates. Figures 3 and 4 show the resulting estimates in NREM and in REM, respectively (see also Table 1 for more coarse estimates and for information about the proportion of epochs included in both analyses).
FIGURE 3.

The similarity of binwise NREM PSD estimates obtained using visual and automatic detections. Line plots illustrate the Pearson correlation coefficient (between participants) between the PSD estimates obtained with visual detections and each automatic detector. Correlations are shown for each frequency bin and EEG channel. The last panel illustrates the similarity of PSD estimates with visual detections to a case where all epochs are included in analyses without regard for artifacts. All correlations are significant (maximum p = 10⁻¹⁰), including after Bonferroni correction.
FIGURE 4.

The similarity of binwise REM PSD estimates obtained using visual and automatic detections. Line plots illustrate the Pearson correlation coefficient (between participants) between the PSD estimates obtained with visual detections and each automatic detector. Correlations are shown for each frequency bin and EEG channel. The last panel illustrates the similarity of PSD estimates with visual detections to a case where all epochs are included in analyses without regard for artifacts. All correlations are significant (maximum p = 10⁻⁷), including after Bonferroni correction.
PSD correlations between the visual and automatic methods were very high, at least in the high delta to sigma range and especially in NREM, almost always exceeding 0.9 and often 0.95. Notably, very high correlations were obtained in this range even if no artifact detection method was used at all. Visual‐automatic correlations were lower in REM and above the sigma range, especially if no artifact detection method was used. While dual‐threshold methods did not show better agreement with visual detections than single‐threshold methods (see the previous section), they did produce noticeably higher mean PSD correlations, indicating that they keep epochs which are more representative of those a visual observer would keep.
In order to quantitatively assess these findings, we ran a linear model with visual‐automatic PSD correlation as the dependent variable and vigilance state, channel, method (dichotomized as single threshold or dual threshold for simplicity), and frequency band as predictors. The results are summarized in Table 3. We found overall very high correlations (intercept = 0.963), with somewhat lower values in REM, in single‐threshold methods and in the delta, beta, and gamma frequency ranges. Topographic differences were significant, but low in magnitude and not systematic.
TABLE 3.
The effect of vigilance state, recording channel, detection method, and frequency range on the correlation between PSD estimates derived from visual and automatic artifact detection.
| Predictor | Level | B | t | p |
|---|---|---|---|---|
| Intercept | | 0.963 | 161.160 | < 0.001 |
| REM | | −0.034 | −11.737 | < 0.001 |
| Channel | C4 | 0.019 | 2.972 | 0.003 |
| | Fp1 | −0.043 | −6.719 | < 0.001 |
| | Fp2 | −0.007 | −1.083 | 0.279 |
| | F3 | 0.014 | 2.216 | 0.027 |
| | F4 | 0.028 | 4.408 | < 0.001 |
| | P3 | 0.007 | 1.082 | 0.279 |
| | P4 | 0.025 | 3.890 | < 0.001 |
| | O1 | 0.019 | 2.890 | 0.004 |
| | O2 | 0.021 | 3.292 | 0.001 |
| Dual threshold | | 0.030 | 9.041 | < 0.001 |
| Band | Theta | 0.020 | 3.712 | < 0.001 |
| | Alpha | 0.014 | 2.538 | 0.011 |
| | Low sigma | 0.008 | 1.466 | 0.143 |
| | High sigma | −0.003 | −0.589 | 0.556 |
| | Beta | −0.035 | −6.572 | < 0.001 |
| | Gamma | −0.091 | −16.989 | < 0.001 |
Note: Categorical variable effects are expressed relative to the reference categories NREM, channel C3, single‐threshold methods, and delta.
3.4. External Correlations
In subsequent analyses, we explored to what extent known correlates of individual average PSD can be reproduced using various artifact detection methods. We calculated correlations between PSD and age and sex.
Using the gold standard visual detection method, known correlations (Muehlroth and Werkle‐Bergner 2020), including those reported in this dataset (Ujma et al. 2022; Pótári et al. 2017), emerged. Higher age was associated with less low‐frequency and spindle activity in NREM and with a reduction of low‐frequency and an increase in beta‐frequency activity in REM. Men had lower power across all frequency ranges in both NREM and REM, except the lowest and highest frequencies.
In line with the high correlations seen for PSD estimates, these patterns were generally well‐recovered using various artifact detection methods, including no artifact detection at all. As expected, correlations with high‐frequency components—such as age‐related increases—were less well reproduced without artifact detections, but clearly seen once any automatic method was applied. External correlations are illustrated in Figure 5. More detailed plots, showing results for each channel separately, are available in Figure S3 (NREM, age), Figure S4 (NREM, sex), Figure S5 (REM, age), and Figure S6 (REM, sex).
FIGURE 5.

Correlations of PSD estimates with external indicators. The line plots show Pearson correlation coefficients between age, sex, and PSD estimates obtained using different artifact detection methods (including ignoring artifacts completely). Males are coded as 1 and females as 0, with a negative correlation implying higher PSD in females. Correlations are averaged across channels. Legend entries for automatic methods refer to using one (single number) or two (double number) SD thresholds to identify artifacts using outlying Hjorth parameters. Note the high overlap between lines, especially at the lower frequencies, indicating that the correlation of PSD estimates to external indicators is well‐recovered regardless of the artifact detection method used.
3.5. Simulating Poor‐Quality Recordings
In further analyses, we investigated the effect of excess artifacts on PSD estimates. For these analyses, we compared original PSD estimates to simulations where we randomly excluded good‐quality epochs, emulating poor‐quality recordings or false‐positive artifact detections by a visual scorer or an automatic detector.
In each participant, we set a random selection of epochs constituting 25%, 50%, or 75% of good‐quality data to artifactual. False‐positive artifacts were selected from anywhere in the night, from the first half of the night only, or from the second half of the night only. We calculated PSD estimates using these artificial epoch sets and compared them with actual PSD estimates. Bandwise PSD averages were calculated by averaging frequency bins belonging to the same frequency band.
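The simulation can be sketched as follows. This is a simplified illustration, not the study's code: the function name `simulate_false_positives`, the interpretation of the fraction as a share of the epochs within the chosen region, the fixed random seed, and the toy power values are assumptions.

```python
import random
from statistics import mean

def simulate_false_positives(power, fraction, region="any", seed=0):
    """Ratio of the mean after dropping random good epochs to the original mean.

    power: per-epoch band power of good-quality epochs, in temporal order.
    fraction: share of epochs in the chosen region set to artifactual
              (assumption: the share refers to the region, not the night).
    region: "any", "first" (first half), or "second" (second half).
    """
    rng = random.Random(seed)
    n = len(power)
    half = n // 2
    pool = {"any": range(n), "first": range(half), "second": range(half, n)}[region]
    dropped = set(rng.sample(list(pool), int(fraction * len(pool))))
    kept = [p for i, p in enumerate(power) if i not in dropped]
    return mean(kept) / mean(power)

# Hypothetical toy night: power declines linearly across 1000 epochs
power = [2 - i / 1000 for i in range(1000)]
ratio_first = simulate_false_positives(power, 0.5, region="first")
ratio_second = simulate_false_positives(power, 0.5, region="second")
ratio_flat = simulate_false_positives([1.0] * 1000, 0.75, region="any")
```

With a signal that changes systematically across the night, dropping epochs from one half shifts the average in a predictable direction, whereas a stationary signal is unaffected. The toy values exaggerate this trend for clarity and say nothing about the size of the effect in real recordings.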
Results from the representative channel C3 are summarized in Figure 6 for NREM and Figure 7 for REM. Results from additional channels Fp1 and P3 are shown in Figures S7–S10.
FIGURE 6.

The effect of false‐positive artifacts on PSD estimates from the representative channel C3 in NREM. Panel (A): PSD estimates from three randomly selected participants after randomly setting 0% (original PSD estimate), 25%, 50%, or 75% of epochs (separate lines) from either anywhere during the night (left panels), from the first half of the night (middle panels), or from the second half of the night (right panels) to false‐positive artifacts. Panel (B): Grand average (across all participants) bandwise PSD estimates after the exclusion of false positives, expressed relative to the original.
FIGURE 7.

The effect of false‐positive artifacts on PSD estimates from the representative channel C3 in REM. Panel (A): PSD estimates from three randomly selected participants after randomly setting 0% (original PSD estimate), 25%, 50%, or 75% of epochs (separate lines) from either anywhere during the night (left panels), from the first half of the night (middle panels), or from the second half of the night (right panels) to false‐positive artifacts. Panel (B): Grand average (across all participants) bandwise PSD estimates after the exclusion of false positives, expressed relative to the original.
Overall, minimal changes in PSD estimates were observed regardless of the settings used to simulate false‐positive artifacts. The mean change (across all conditions and frequency bands) was 0.5% and the mean absolute change was 1.16%, with a range of −8% to 4.8%, indicating that small subsets of the actually available data epochs provide reliable approximations of the true PSD average even after the majority of data are excluded.
3.6. Short Data Segments and Specific Sleep Stages
In our original analyses, we estimated the effect of artifacts and automatic artifact removal on all‐night PSD estimates. These analyses were based on a large number of data epochs, which may dilute artifact effects and may not be relevant for analyses of less data.
To address this limitation, we re‐ran the analyses using only randomly selected short data segments for each participant. At most 25% of the randomly selected epochs were allowed to be marked as artifacts based on visual screening; if more artifacts were present, a new random segment was chosen. We used four specifications for data selection:
100 contiguous epochs (400 s per participant).
500 contiguous epochs (2000 s per participant).
100 noncontiguous epochs (400 s in total per participant).
500 noncontiguous epochs (2000 s in total per participant).
The first two specifications simulate cases in which only a limited amount of data is available per participant, such as naps or the analysis of single sleep cycles. The latter two specifications simulate cases in which only short EEG segments separated in time are of interest, such as event‐related EEG.
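The segment selection rule can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: the function name, the boolean artifact flags, and the cap on the number of redraws are assumptions.

```python
import random

def pick_segment(is_artifact, n_select, contiguous,
                 max_artifact_ratio=0.25, max_tries=1000, seed=0):
    """Draw `n_select` epochs (contiguous or scattered) and redraw until at
    most `max_artifact_ratio` of them are visually marked artifacts.

    `is_artifact` holds one boolean per epoch. Returns the sorted epoch
    indices, or None if no acceptable draw is found within `max_tries`
    (the cap on redraws is an assumption, not taken from the text).
    """
    n = len(is_artifact)
    if n_select > n:
        raise ValueError("not enough epochs to select from")
    rng = random.Random(seed)
    for _ in range(max_tries):
        if contiguous:
            start = rng.randrange(n - n_select + 1)
            idx = list(range(start, start + n_select))
        else:
            idx = sorted(rng.sample(range(n), n_select))
        n_artifacts = sum(1 for i in idx if is_artifact[i])
        if n_artifacts <= max_artifact_ratio * n_select:
            return idx
    return None
```

With 4‐s epochs, `pick_segment(flags, 100, contiguous=True)` corresponds to the 400‐s contiguous specification and `pick_segment(flags, 500, contiguous=False)` to the 2000‐s noncontiguous one.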
High correlations were obtained between PSD estimates using visual and automatic scoring even from these short segments (Figures S11 and S12), with somewhat better performance in NREM and with dual‐threshold methods. In contrast, PSD estimates calculated without any artifact detection correlated poorly with those based on visual scoring.
We also ran separate analyses for N2 and SWS, using all epochs from these sleep stages instead of pooling all NREM epochs. Results confirmed that PSD estimates based on visual scoring are well recovered with simple automatic artifact detection algorithms in specific sleep stages as well (Figure S13).
These results show that PSD estimates are accurately recovered using a simple Hjorth‐parameter‐based method even if only shorter EEG segments are available.
3.7. Artifact Epoch Duration
Both artifact detection and spectral analysis may be affected by the duration of data epochs. To investigate the importance of this problem, we re‐ran artifact detection and PSD analysis using 2‐s, 6‐s, and 8‐s epochs, using both a single‐threshold and a dual‐threshold artifact detection approach. In both approaches, epochs were discarded if any Hjorth parameter exceeded the mean by 1.5 SD, a threshold chosen based on the good performance of this setting in the original 4‐s analyses.
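As an illustration of this kind of detector, the sketch below computes the three Hjorth parameters (activity, mobility, complexity; Hjorth 1970) for each epoch and flags outlying epochs. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, deviations are treated as two‐sided (absolute distance from the mean), and the dual‐threshold approach is assumed to be a second pass in which the mean and SD are recomputed from the epochs that survived the first pass.

```python
import math
from statistics import mean, pstdev

def _variance(x):
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)

def hjorth(signal):
    """Return (activity, mobility, complexity) of one epoch of samples."""
    d1 = [b - a for a, b in zip(signal, signal[1:])]  # first difference
    d2 = [b - a for a, b in zip(d1, d1[1:])]          # second difference
    activity = _variance(signal)
    var_d1 = _variance(d1)
    if activity == 0 or var_d1 == 0:
        return activity, 0.0, 0.0  # flat epoch: ratios are undefined
    mobility = math.sqrt(var_d1 / activity)
    complexity = math.sqrt(_variance(d2) / var_d1) / mobility
    return activity, mobility, complexity

def flag_epochs(params, thresholds=(1.5,)):
    """Flag epochs whose Hjorth parameters deviate from the mean.

    `params` holds one (activity, mobility, complexity) tuple per epoch.
    `thresholds` holds one SD multiplier per pass: (1.5,) is a single
    threshold, (1.5, 1.5) is the assumed dual-threshold variant.
    """
    flagged = [False] * len(params)
    for threshold in thresholds:
        good = [i for i, f in enumerate(flagged) if not f]
        for k in range(3):  # activity, mobility, complexity
            values = [params[i][k] for i in good]
            mu, sd = mean(values), pstdev(values)
            if sd == 0:
                continue
            for i in good:
                if abs(params[i][k] - mu) > threshold * sd:
                    flagged[i] = True
    return flagged
```

Because each pass only adds flags, the assumed dual‐threshold variant always rejects at least as many epochs as the single‐threshold variant with the same first threshold.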
Because spectral resolution is affected by epoch duration, results from various duration settings are not directly compatible at the bin level. Therefore, we summed binwise power estimates to create power estimates for the delta (0.5–4 Hz), theta (4–8 Hz), alpha (8–10 Hz), low sigma (10–13 Hz), high sigma (13–16 Hz), beta (16–25 Hz), and gamma (25–48 Hz) frequency bands. Similar metrics were calculated using the visually identified artifacts and their 4‐s epochs.
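The band summation can be sketched as follows, using the band limits listed above. The half‐open interval convention (lower edge included, upper edge excluded) and the function name are assumptions made for illustration; the text does not state how bins falling on band edges were assigned.

```python
# Band limits in Hz, as listed in the text.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 10.0),
    "low_sigma": (10.0, 13.0),
    "high_sigma": (13.0, 16.0),
    "beta": (16.0, 25.0),
    "gamma": (25.0, 48.0),
}

def band_power(psd, freqs, bands=BANDS):
    """Sum binwise power within each band.

    `psd` and `freqs` are equally long lists of bin powers and bin
    frequencies. A bin is assigned to a band if lo <= f < hi, so each
    bin is counted at most once (an assumed convention).
    """
    return {
        name: sum(p for p, f in zip(psd, freqs) if lo <= f < hi)
        for name, (lo, hi) in bands.items()
    }
```

The frequency resolution is the reciprocal of the epoch duration (0.25 Hz for 4‐s epochs, 0.125 Hz for 8‐s epochs), so the same band contains a different number of bins at each epoch duration, which is why the comparison is made at the band level.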
Estimates resulting from the three alternative epoch duration settings (2, 6, and 8 s) were extremely similar to those from the 4‐s window (r ~ 1). Similarly, all settings resulted in bandwise power estimates highly correlated (r > 0.95) with those resulting from visual artifact detections, with little difference between settings except for slightly lower fidelity with 2‐s windows and a trend for slightly higher estimates using 4‐s windows, as in visual detections. Results are illustrated in Figures S14 and S15. Overall, our findings suggest that automatic detection results are robust to the choice of artifact epoch duration.
4. Discussion
The processing of large‐scale sleep EEG databases is greatly facilitated by automatic methods. In our work, we systematically explored (1) the effect of the erroneous inclusion of artifactual data on PSD estimates, (2) the effect of data loss, including time‐biased data loss, on PSD estimates, (3) the similarity of visual detections to the results of a simple automatic algorithm previously used in large datasets (Djonlagic et al. 2021), and (4) the similarity of the shape and external correlations of PSD estimates resulting from the use of visual or automatic detectors.
We found that while visual scorers marked a substantial percentage of epochs (8% in NREM and 18% in REM) as artifactual, the vast majority of these artifacts would have had little or no effect on PSD estimates if left undetected. The peak of the histogram of artifact effects was at zero, and the distribution of artifact effects was symmetrical for most frequency ranges, except for beta and gamma in both vigilance states and for NREM delta. In other words, artifacts tend to have, on average, a neutral effect on PSD estimates, with the exception of high frequencies where muscle artifacts tend to result in overestimates of PSD and NREM delta where artifactual epochs contain less of the usual low‐frequency activity that characterizes this vigilance state.
However, a substantial minority of artifacts had significant effects on PSD estimates, resulting in a change of several percent due to the erroneous inclusion of a single artifactual epoch. Such artifacts mostly affected the lowest and highest frequency activities (see Figures 1 and 2 for an illustration of the high variability of PSD estimates at these frequencies as a result of including artifacts). In alternate analyses looking at the between‐participant similarity of PSD estimates with or without artifact detections (Figures 3 and 4), we also found that artifacts mostly affected PSD estimates above the spindle frequency range, likely because these distortions result from high‐amplitude muscle artifacts (Brunner et al. 1996; Yuval‐Greenberg and Deouell 2009). This mirrors a previous study (Malafeev et al. 2019), which also reported that high frequencies are the most sensitive to artifacts. Interestingly, PSD estimates at or below the spindle frequency range were very well recovered in NREM (r > 0.9, Figure 3) even if no artifact detection was used at all. External correlations with age and sex (Figure 5) were also reasonably well recovered in this frequency range from PSD estimates calculated without considering artifacts.
Our findings about the effect of visually detected artifacts suggest that manual scorers, at least in our sample, might be too inclusive and mark many data epochs as artifactual even though they contain no anomalies that would significantly affect PSD estimates. This is especially true for the middle‐frequency ranges, including the spindle frequency range, where physiological activity likely overpowers all but the most substantial artifacts.
However, low and especially high frequencies are still affected by artifacts, a significant minority of which cause substantial changes in the PSD estimates by themselves. We employed a very simple artifact detection algorithm, using extreme values of the Hjorth parameters of epochs (with a set of thresholds) to identify anomalous data. We found that using this method with any setting resulted in substantial improvements in the quality of PSD estimates. No threshold setting clearly resulted in the best performance. REM sleep contained a higher ratio of visually marked artifacts and, correspondingly, stricter criteria in automatic methods produced higher agreement (Table 1). Generally, the rate of visual artifact detection was best approximated with dual thresholds and intermediate stringency (2.5–2.5 SD). However, this criterion is dubious because, as previously discussed, a large proportion of visually detected artifacts does not make meaningful differences in PSD estimates. Agreement with visual detections was lowest with the most stringent detectors; however, it remained relatively constant with all less stringent methods. All detectors resulted in high (r > 0.9, typically r > 0.95) correlations between PSD estimates using visual and automatic artifact detections (Figures 3 and 4), highlighting their utility especially in accurately recovering high‐frequency PSD estimates. While no clearly superior algorithm emerged, dual‐threshold settings tended to approximate visually scored PSD estimates especially well at high frequencies. External correlations with age and sex (Figure 5) were also well recovered with any automatic detection method, although the strongest (and possibly most accurate) correlations were typically found with visual detections.
We chose full‐night average PSD as the primary outcome of interest because this has been shown to be an individually stable, highly heritable biomarker (De Gennaro et al. 2005; De Gennaro et al. 2008) related to other characteristics such as age, somatic and psychiatric disease, or intelligence (Ujma et al. 2019; Ujma et al. 2017; Ujma and Bódizs 2023; Bódizs, Gombos, and Kovács 2012). Changes in all‐night PSD are also linked to sleep quality (Gabryelska et al. 2019) and homeostatic sleep pressure (Borbély et al. 1981; Tononi and Cirelli 2014). This application entails the simultaneous analysis of a large amount of EEG data in which artifact effects may be diluted. However, our additional analyses show that high visual‐automatic PSD agreement is seen even if only short contiguous or noncontiguous sleep segments are analyzed. Thus, our analyses support the validity of simple, Hjorth‐parameter‐based artifact detection in the studies in which it was already employed (e.g., Ujma et al. 2023; Djonlagic et al. 2021), as well as for future applications using shorter EEG segments such as sleep cycles, naps, or event‐based EEG. Although a large number of artifact detection algorithms are available (see Introduction) and novel algorithms using artificial intelligence to mimic visual scoring are likely to appear soon, a key advantage of the approach demonstrated here is its simplicity and computational efficiency.
Our work has a number of limitations. Most significantly, we investigated full‐night laboratory PSG recordings of reasonably high quality. The effect of artifacts and the difficulty of identifying them are possibly different in other types of recordings, for example, in routine wake EEG, during naps, or in recordings performed with mobile EEG headbands (Arnal et al. 2020). Another limitation is the use of PSD estimates as the benchmark of artifact detection quality. We chose PSD because it is a frequently used simple EEG marker with well‐known correlates most researchers are familiar with. A large number of other metrics (Stancin, Cifrek, and Jovic 2021), such as time‐frequency coupling, coherence, entropy, or fractal dimensions, can be calculated from EEG, and specific waveforms such as slow waves and spindles can be detected; these other metrics may be differently affected by artifacts. Our dataset consisted of second‐night recordings of healthy volunteers. In specific datasets, such as recordings of disturbed sleep, recordings of participants with sleep disorders, or first‐night recordings, the performance or optimal settings of automatic detectors may be different. The automatic method investigated here may also be suboptimal if continuous artifacts are present on a channel, making it impossible to detect anomalous data based on the distribution of Hjorth parameters.
A global evaluation of our findings suggests that artifacts in night sleep EEG recordings do not represent a critical problem. Most artifacts minimally affect PSD estimates (and vice versa, PSD estimates are accurately recovered even if less than a fourth of all data is available for calculations, and even if artifacts are heavily concentrated in only one half of the night). PSD is estimated with reasonable accuracy up to the spindle frequency range even without any regard for excluding artifacts. The remaining problems resulting from artifacts are mostly eliminated with the use of a simple algorithm, with no substantial difference resulting from algorithm settings. These findings are encouraging for the prospects of large, freely available datasets. In these datasets, the meticulous visual observation of data is not feasible, access to sophisticated detection algorithms may be limited for some researchers, and the proliferation of detection algorithms may cause issues with reproducibility. Our findings, in line with previous findings exploring other algorithms (Mariani et al. 2018; Saifutdinova et al. 2019; Malafeev et al. 2019; 't Wallant et al. 2016), suggest that accurate PSD estimates strongly resembling the output of visual scoring can be achieved by very simple detection algorithms and, with some limitations, even without any regard for artifact detection.
Author Contributions
Péter P. Ujma: conceptualization, formal analysis, methodology, writing – original draft, writing – review and editing. Martin Dresler: methodology, writing – original draft, writing – review and editing. Róbert Bódizs: methodology, writing – original draft, writing – review and editing.
Conflicts of Interest
The authors declare no conflicts of interest.
Supporting information
Data S1.
Acknowledgments
This work was supported by the Ministry of Innovation and Technology of Hungary from the National Research, Development and Innovation Fund, financed under the TKP2021‐EGA‐25 funding scheme and by the National Research, Development and Innovation Office—NKFIH (grant number: 138935).
Funding: This work was supported by Nemzeti Kutatási Fejlesztési és Innovációs Hivatal (138935), Nemzeti Kutatási, Fejlesztési és Innovaciós Alap (TKP2021‐EGA‐25) and Magyar Tudományos Akadémia, János Bolyai Research Scholarship.
Data Availability Statement
The code used in analyses and visualization is available at https://zenodo.org/record/7934586. Raw data are also accessible using the resources posted here.
References
- Allen, M., Poggiali D., Whitaker K., Marshall T. R., and Kievit R. A. 2019. “Raincloud Plots: A Multi‐Platform Tool for Robust Data Visualization.” Wellcome Open Research 4: 63.
- Anderer, P., Roberts S., Schlögl A., et al. 1999. “Artifact Processing in Computerized Analysis of Sleep EEG ‐ a Review.” Neuropsychobiology 40: 150–157.
- Arnal, P. J., Thorey V., Debellemaniere E., et al. 2020. “The Dreem Headband Compared to Polysomnography for Electroencephalographic Signal Acquisition and Sleep Staging.” Sleep 43: 1–13.
- Bódizs, R., Gombos F., and Kovács I. 2012. “Sleep EEG Fingerprints Reveal Accelerated Thalamocortical Oscillatory Dynamics in Williams Syndrome.” Research in Developmental Disabilities 33: 153–164.
- Bódizs, R., Gombos F., Ujma P. P., et al. 2017. “The Hemispheric Lateralization of Sleep Spindles in Humans.” Sleep Spindles & Cortical up States 1: 42–54.
- Bódizs, R., Horváth C. G., Szalárdy O., et al. 2022. “Sleep‐Spindle Frequency: Overnight Dynamics, Afternoon Nap Effects, and Possible Circadian Modulation.” Journal of Sleep Research 31: e13514.
- Borbély, A. A., Baumann F., Brandeis D., Strauch I., and Lehmann D. 1981. “Sleep Deprivation: Effect on Sleep Stages and EEG Power Density in Man.” Electroencephalography and Clinical Neurophysiology 51: 483–495.
- Brunner, D. P., Vasko R., Detka C., Monahan J., Reynolds C. H. III, and Kupfer D. 1996. “Muscle Artifacts in the Sleep EEG: Automated Detection and Effect on All‐Night EEG Power Spectra.” Journal of Sleep Research 5: 155–164.
- Button, K. S., Ioannidis J. P. A., Mokrysz C., et al. 2013. “Power Failure: Why Small Sample Size Undermines the Reliability of Neuroscience.” Nature Reviews. Neuroscience 14: 365–376.
- Chan, M.‐S., Chung K.‐F., Yung K.‐P., and Yeung W.‐F. 2017. “Sleep in Schizophrenia: A Systematic Review and Meta‐Analysis of Polysomnographic Findings in Case‐Control Studies.” Sleep Medicine Reviews 32: 69–84.
- Cox, R., and Fell J. 2020. “Analyzing Human Sleep EEG: A Methodological Primer With Code Implementation.” Sleep Medicine Reviews 54: 101353.
- de Aguiar Neto, F. S., and Rosa J. L. G. 2019. “Depression Biomarkers Using Non‐Invasive EEG: A Review.” Neuroscience and Biobehavioral Reviews 105: 83–93.
- De Gennaro, L., Ferrara M., Vecchio F., Curcio G., and Bertini M. 2005. “An Electroencephalographic Fingerprint of Human Sleep.” NeuroImage 26: 114–122.
- De Gennaro, L., Marzano C., Fratello F., et al. 2008. “The Electroencephalographic Fingerprint of Sleep Is Genetically Determined: A Twin Study.” Annals of Neurology 64: 455–460.
- Djonlagic, I., Mariani S., Fitzpatrick A. L., et al. 2021. “Macro and Micro Sleep Architecture and Cognitive Performance in Older Adults.” Nature Human Behaviour 5: 123–145.
- D'Rozario, A. L., Dungan G. C., Banks S., et al. 2015. “An Automated Algorithm to Identify and Reject Artefacts for Quantitative EEG Analysis During Sleep in Patients With Sleep‐Disordered Breathing.” Sleep & Breathing 19: 607–615.
- Gabryelska, A., Feige B., Riemann D., et al. 2019. “Can Spectral Power Predict Subjective Sleep Quality in Healthy Individuals?” Journal of Sleep Research 28: e12848.
- Hjorth, B. 1970. “EEG Analysis Based on Time Domain Properties.” Electroencephalography and Clinical Neurophysiology 29: 306–310.
- Horne, J. A., and Minard A. 1985. “Sleep and Sleepiness Following a Behaviourally “Active” Day.” Ergonomics 28: 567–575.
- Iber, C., Ancoli‐Israel S., Chesson A. L., and Quan S. F. 2007. The AASM Manual for the Scoring of Sleep and Associated Events: Rules, Terminology and Technical Specification. American Academy of Sleep Medicine.
- Islam, M. K., Rastegarnia A., and Yang Z. 2016. “Methods for Artifact Detection and Removal From Scalp EEG: A Review.” Neurophysiologie Clinique 46: 287–305.
- Leach, S., Sousouri G., and Huber R. 2023. “‘High‐Density‐SleepCleaner’: An Open‐Source, Semi‐Automatic Artifact Removal Routine Tailored to High‐Density Sleep EEG.” Journal of Neuroscience Methods 391: 109849.
- Malafeev, A., Omlin X., Wierzbicka A., et al. 2019. “Automatic Artefact Detection in Single‐Channel Sleep EEG Recordings.” Journal of Sleep Research 28: e12679.
- Marek, S., Tervo‐Clemmens B., Calabro F. J., et al. 2022. “Reproducible Brain‐Wide Association Studies Require Thousands of Individuals.” Nature 603: 654–660.
- Mariani, S., Tarokh L., Djonlagic I., et al. 2018. “Evaluation of an Automated Pipeline for Large‐Scale EEG Spectral Analysis: The National Sleep Research Resource.” Sleep Medicine 47: 126–136.
- Motamedi‐Fakhr, S., Moshrefi‐Torbati M., Hill M., Hill C. M., and White P. R. 2014. “Signal Processing Techniques Applied to Human Sleep EEG Signals—A Review.” Biomedical Signal Processing and Control 10: 21–33.
- Muehlroth, B. E., and Werkle‐Bergner M. 2020. “Understanding the Interplay of Sleep and Aging: Methodological Challenges.” Psychophysiology 57: e13523.
- Olbrich, S., and Arns M. 2013. “EEG Biomarkers in Major Depressive Disorder: Discriminative Power and Prediction of Treatment Response.” International Review of Psychiatry 25: 604–618.
- Pierson‐Bartel, R., and Ujma P. P. 2024. “Objective Sleep Quality Predicts Subjective Sleep Ratings.” Scientific Reports 14: 5943.
- Pótári, A., Ujma P. P., Konrad B. N., et al. 2017. “Age‐Related Changes in Sleep EEG Are Attenuated in Highly Intelligent Individuals.” NeuroImage 146: 554–560.
- Purcell, S. M., Manoach D. S., Demanuele C., et al. 2017. “Characterizing Sleep Spindles in 11,630 Individuals From the National Sleep Research Resource.” Nature Communications 8: 15930.
- Redline, S., and Purcell S. M. 2021. “Sleep and Big Data: Harnessing Data, Technology, and Analytics for Monitoring Sleep and Improving Diagnostics, Prediction, and Interventions‐An Era for Sleep‐Omics?” Sleep 44: 1–5.
- Saifutdinova, E., Congedo M., Dudysova D., Lhotska L., Koprivova J., and Gerla V. 2019. “An Unsupervised Multichannel Artifact Detection Method for Sleep EEG Based on Riemannian Geometry.” Sensors 19: 1–15.
- Somervail, R., Cataldi J., Stephan A. M., Siclari F., and Iannetti G. D. 2023. “Dusk2Dawn: An EEGLAB Plugin for Automatic Cleaning of Whole‐Night Sleep Electroencephalogram Using Artifact Subspace Reconstruction.” Sleep 46, no. 12: zsad208. 10.1093/sleep/zsad208.
- Stancin, I., Cifrek M., and Jovic A. 2021. “A Review of EEG Signal Features and Their Application in Driver Drowsiness Detection Systems.” Sensors 21, no. 11: 3786. 10.3390/s21113786.
- Sun, H., Paixao L., Oliva J. T., et al. 2019. “Brain Age From the Electroencephalogram of Sleep.” Neurobiology of Aging 74: 112–120.
- Sun, H., Ye E., Paixao L., et al. 2023. “The Sleep and Wake Electroencephalogram Over the Lifespan.” Neurobiology of Aging 124: 60–70.
- Szucs, D., and Ioannidis J. P. A. 2017. “Empirical Assessment of Published Effect Sizes and Power in the Recent Cognitive Neuroscience and Psychology Literature.” PLoS Biology 15: e2000797.
- 't Wallant, D. C., Muto Z., Gaggioni G., et al. 2016. “Automatic Artifacts and Arousals Detection in Whole‐Night Sleep EEG Recordings.” Journal of Neuroscience Methods 258: 124–133.
- Taji, W., Pierson R., and Ujma P. P. 2023. “Protocol of the Budapest Sleep, Experiences, and Traits Study: An Accessible Resource for Understanding Associations Between Daily Experiences, Individual Differences, and Objectively Measured Sleep.” PLoS One 18: e0288909.
- Tononi, G., and Cirelli C. 2014. “Sleep and the Price of Plasticity: From Synaptic and Cellular Homeostasis to Memory Consolidation and Integration.” Neuron 81: 12–34.
- Ujma, P. P., and Bódizs R. 2023. “Sleep Alterations as a Function of 88 Health Indicators.” medRxiv. 10.1101/2023.11.20.23298781.
- Ujma, P. P., and Bódizs R. 2024a. “Sleep Alterations as a Function of 88 Health Indicators.” BMC Medicine 22, no. 134: 1–15.
- Ujma, P. P., and Bódizs R. 2024b. “Sleep Homeostasis in a Naturalistic Setting.” bioRxiv. 10.1101/2024.07.02.601682.
- Ujma, P. P., Bódizs R., Dresler M., et al. 2023. “Multivariate Prediction of Cognitive Performance From the Sleep Electroencephalogram.” NeuroImage 279: 120319. 10.1016/j.neuroimage.2023.120319.
- Ujma, P. P., Dresler M., Simor P., et al. 2022. “The Sleep EEG Envelope Is a Novel, Neuronal Firing‐Based Human Biomarker.” Scientific Reports 12: 18836.
- Ujma, P. P., Konrad B. N., Gombos F., et al. 2017. “The Sleep EEG Spectrum Is a Sexually Dimorphic Marker of General Intelligence.” Scientific Reports 7: 18070.
- Ujma, P. P., Simor P., Steiger A., Dresler M., and Bódizs R. 2019. “Individual Slow‐Wave Morphology Is a Marker of Aging.” Neurobiology of Aging 80: 71–82.
- Urigüen, J. A., and Garcia‐Zapirain B. 2015. “EEG Artifact Removal‐State‐Of‐The‐Art and Guidelines.” Journal of Neural Engineering 12: 031001.
- Yuval‐Greenberg, S., and Deouell L. Y. 2009. “The Broadband‐Transient Induced Gamma‐Band Response in Scalp EEG Reflects the Execution of Saccades.” Brain Topography 22: 3–6.
- Zhang, G.‐Q., Cui L., Mueller R., et al. 2018. “The National Sleep Research Resource: Towards a Sleep Data Commons.” Journal of the American Medical Informatics Association 25: 1351–1358.
- Zhang, Y., Ren R., Yang L., et al. 2022. “Sleep in Alzheimer's Disease: A Systematic Review and Meta‐Analysis of Polysomnographic Findings.” Translational Psychiatry 12: 136.
