Abstract
Research utilizing ERP methods is generally biased with regard to sample representativeness. Among the myriad of factors that contribute to sample bias are researchers’ assumptions about the extent to which racial differences in hair texture, volume, and style impact electrode placement, and subsequently, study eligibility. The current study examines these impacts using data collected from n = 213 individuals aged 17–19 years, and offers guidance on collection of ERP data across the full spectrum of hair types. Individual differences were quantified for hair texture using a visual scale, and for hair volume by measuring the amount of gel used in cap preparation. EEG data quality was assessed with multiple metrics at the pre-processing, post-processing, and variable generation stages. Results indicate that hair volume is associated with small, but systematic differences in signal quality and signal amplitude. Such differences are highly problematic as they could be misattributed to cognitive differences among groups. However, inclusion of gel volume as a covariate to account for individual differences in hair volume significantly reduced, and in most cases eliminated, group differences. We discuss strategies for overcoming real and perceived technical barriers for researchers seeking to achieve greater inclusivity and representativeness in ERP research.
Keywords: Race, Ethnicity, Hair Style, Diversity, EEG, Representation
There is a lack of diverse representation in psychophysiological research broadly. Indeed, an examination of publications in Psychophysiology across a 3-year period found that only 14.5% of the empirical studies published provided any racial information about their participants, making the extent of the problem difficult to determine (Gatzke-Kopp, 2016; Kissel & Friedman, 2023). Recently, attention has been paid to the particularly pronounced lack of diversity in electroencephalography (EEG) and event-related potentials (ERP) research specifically (Bradford et al., 2022; Choy, Baker, & Stavropoulos, 2022; Etienne et al., 2020; Parker & Ricard, 2022; Webb, Etter, & Kwasa, 2022). While there are many reasons contributing to this under-representation (see Gatzke-Kopp, 2016), one prominent factor is researcher assumptions that racial differences associated with hair texture, volume, and style, impede effective scalp access needed for quality ERP measurement (Webb et al., 2022). Although this presumption is widely held, it has not, to our knowledge, been examined empirically. Examining whether race has a systematic influence on ERP data quality is of critical importance. As researchers work to expand the inclusiveness of participants in their studies, such knowledge would not only guide specific practices and techniques, but ensure that racially-correlated confounds are not misinterpreted as psychologically-driven effects.
ERP relies on the use of non-invasive electrodes placed upon the scalp to detect and record electrical activity. Because electrical activity at the scalp consists of both cortical and non-cortical electrical signal (e.g., muscle, cardiac, and electrodermal activity), adequate signal to noise ratios are critical to quality ERP recording. A variety of electrode types and application systems have been developed, but they commonly consist of a cap with pre-set electrode holders designed to rest on the scalp surface and contain conductive medium (e.g. gel, sponge wetted with a conductive solution). The quality of data recorded depends on the degree of impedance between the scalp and the electrode. Typically, researchers will seek to minimize impedance values by using a blunt needle to move hair (a non-conductive medium) aside and gently abrade the scalp. The density and volume of hair will naturally impact scalp access, and increase the distance between the scalp surface and the electrode surface. Given this challenge, researchers often employ screening criteria for hair characteristics that are likely to prevent quality data collection (Webb et al., 2022). Such practices disproportionately affect people of color, particularly racial groups characterized by highly textured hair and dense hair follicles, such as individuals of African descent. It is also not uncommon for individuals with this hair type to style their hair in semi-permanent configurations such as braids, that are intended to help preserve the health of their hair (i.e., protective hair styles), but can also make the distribution of hair across the scalp uneven. In other words, methodological limitations inherent in ERP recording contribute directly to under-representation of certain populations (Choy et al., 2022; Parker & Ricard, 2022).
Despite the wide-spread perspective that certain hair types and hair styles are incompatible with this type of research, empirical evidence of the effect of hair type on ERP data quality is lacking. If factors pertaining to hair type are relevant for recording quality, it would be important to establish the specific features pertinent for participant screening. Although features of hair quality are known to correspond with racial heritage, they are far from synonymous, and race is a poor surrogate for determining research eligibility. Identifying such factors could also contribute to the development of equipment better suited for different hair conditions, thus facilitating recording ERP data from a broader and more representative pool of participants (see Etienne et al., 2020).
Traditionally, hair has been categorized into three types, i.e., African, Asian, and Caucasian (de la Mettrie et al., 2007). Significant morphological variability exists across these categories with regard to hair follicle diameter, shape (e.g., curl), and scalp density (Bernard, 2003; Franbourg, Hallegot, Baltenneck, Toutaina, & Leroy, 2003; Kreplak et al., 2001; Lindelöf, 1988). On average, Asian ancestry is associated with straight hair follicles, whereas African ancestry is associated with tight curl patterns. However, hair morphology derives from multiple genetic influences that do not necessarily align with genetic determinants of skin color, allowing for wide variability in hair characteristics even within a common racial lineage. As such, a race-based classification system may not accurately reflect the breadth of variation in hair, and is particularly limited in categorizing bi- or multi-racial people. More recently, de la Mettrie et al. (2007) and Loussouarn et al. (2007) examined indicators of hair shape (i.e., curve diameter, curl index, number of twists/kinks, and number of waves) and identified 8 categories of hair types that span all race/ethnicity boundaries. This system provides for objective, non-race-based measurement and examination of how hair type impacts ERP recording and analysis. However, such strand-level hair characteristics may be less relevant for electrode placement than the total volume of hair. Hair density will affect the distance between the scalp surface and the recording electrode, with higher densities resulting in a greater distance. Although this distance can be bridged with conductive gel that allows for the electrical signal to be carried from the scalp to the electrode, this increased distance could contribute to a decay of the signal strength. Identifying the specific facets of individual hair characteristics that impact ERP data quality is critical in identifying practical solutions.
The Present Study
This paper used data obtained during collection and analysis of ERP recordings from a diverse sample of 213 participants with a wide variety of hair types to examine (A) if and how differences in hair type impact various features of pre-processed data, post-processed data, and analysis-ready data; and (B) if and how those impacts are alleviated by accounting for individual differences in hair volume. Hair quality was classified according to the eight-point scale of Loussouarn et al., 2007, as rated by the participant and by the research assistant. Because hair volume (i.e., distance between the scalp and electrode) may be a more pertinent factor than hair texture, we additionally quantified the amount of gel used in cap preparation per participant. We examined basic components of signal quality, including DC offset and Signal-to-Noise ratio calculated during a neutral rest condition and a cognitive task. We further examined whether hair type affected data quality for event-related potential analyses, by examining the number of trials rejected for artifact, the standardized measurement error (SME) quantifying reliability of ERP components, and average ERP amplitudes. In addition to these quantitative metrics, we also examined whether hairstyle factors contributed to participant willingness to participate in EEG research, which was presented as an optional component of a larger clinic-based protocol.
Method
Participants
The present data were provided by participants enrolled in the ongoing Family Life Project (FLP). Briefly, the FLP is a longitudinal epidemiological study that has followed 1292 families recruited at the time of the child’s birth in regions of Pennsylvania (n = 519) and North Carolina (n = 773) to investigate the effects of poverty and rurality on early child development; additional details regarding the recruitment and maintenance of the complete FLP sample have been reported elsewhere (Vernon-Feagans, Cox, & The Family Life Project Key Investigators, 2013). When participants were approximately 18 years of age they were invited to participate in a clinic-based assessment, which involved the collection of biological samples, body morphology, and cognitive testing, lasting approximately 2 hours. At the end of the session participants were given the option of completing an ERP protocol (described below) for an additional payment. At the time of our analysis, n = 235 participants had completed the clinic visit, and ERP data were successfully collected from n = 213 participants. Participants were M = 18.34, SD = 0.42 years of age, ranging from 17.52 to 19.22 years. Of these participants, n = 103 (48.36%) were identified as female at birth, n = 52 (24%) as African-American or Black, n = 146 (69%) as White or Caucasian, and n = 15 (7%) as Bi-racial or Multi-racial.
Procedure
All study procedures were approved by the NYU Langone School of Medicine Institutional Review Board, with reliance from The Pennsylvania State University and the University of North Carolina, Chapel Hill. As part of the ongoing study, research staff contacted parents (for children under the age of 18) and/or the target participant by phone to inform them of the opportunity to participate in the current wave of data collection. Interested participants completed an electronic consent and assent (if appropriate) process over the phone and scheduled a clinic visit that consisted of multiple assessments, including the optional ERP component. Compensation was provided for the regular visit, and participants were told they would be able to participate in an ERP assessment that would involve playing a game in which they had the opportunity to win “up to $50” in addition to the compensation received for the full clinic visit. Data were collected at two sites using identical procedures. Although there was not complete separation, Black participants were more likely to be part of the North Carolina assessment site, and research staff at that site are Black. Similarly, White participants were more likely to be part of the Pennsylvania site, where research staff are White.
Measurement of Hair Type
Hair Type.
At the start of the ERP protocol, participants were shown a card with photos of 8 different types of hair (photos from Loussouarn et al., 2007). Images depicted a view of the back of a head such that only the hair was visible, and the hair occupied two-thirds of the image space with a white background. Hair color was standardized across all 8 images. Participants were asked to indicate which of the 8 photos depicted their hair type (Category 1 to Category 8). In parallel, the research assistant (RA) administering the protocol provided their own rating of the participant’s hair type using the same pictorial scale. In most cases (n = 199) the RA also took a photo of the back of the participant’s head, protecting anonymity while still capturing information about hair that might be referred to later. Participants’ self-rating and RA rating were correlated at r = .823 (p < .001), and for n = 112 (53%) cases the RA rating and the participant rating were identical. When the participant’s self-rating and the RA rating differed by two or more levels, a third rater used the participant photos to classify the participant’s hair. In this manner, n = 28 participants were evaluated by the third rater.
Hair Group.
The distribution of participants’ hair type along with exemplar photos from study participants are shown in Figure 1. For convenience of analysis and statistical power to discover differences, the 8-category classification of hair types was reduced by placing individuals into two groups. Using a classification cutoff where Group 1 consisted of hair types 1–3 (n = 153), and Group 2 consisted of hair types 4–8 (n = 60) resulted in the greatest agreement among raters.
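The rating and grouping rules described above can be summarized in a short sketch. The function names are hypothetical, and two details are assumptions rather than statements from the protocol: that the self-rating is retained when the two ratings differ by a single level, and that the third rater's classification replaces both ratings when they differ by two or more levels.

```python
from typing import Optional


def resolve_hair_type(self_rating: Optional[int], ra_rating: int,
                      third_rating: Optional[int] = None) -> int:
    """Return a single hair category (1-8) from the available ratings.

    Assumption: the self-rating is kept unless it is missing or differs
    from the RA rating by two or more levels.
    """
    if self_rating is None:
        # No self-classification: substitute the RA rating.
        return ra_rating
    if abs(self_rating - ra_rating) >= 2:
        # Large disagreement: defer to the photo-based third rating.
        if third_rating is None:
            raise ValueError("third rating required when ratings differ by 2+ levels")
        return third_rating
    return self_rating


def hair_group(category: int) -> int:
    """Collapse the 8 categories: 1-3 -> Hair Group 1, 4-8 -> Hair Group 2."""
    if not 1 <= category <= 8:
        raise ValueError("hair category must be between 1 and 8")
    return 1 if category <= 3 else 2
```

For example, a self-rating of 2 paired with an RA rating of 5 would be sent to the third rater, and a resulting category of 4 would place the participant in Hair Group 2.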
Figure 1 – Classification of hair type and texture across the 8 possible categories for the low and high texture/curl groups.

Note: The histogram illustrates the distribution of participants across the 8 hair categories. Beneath each category is an example image of a participant who self-identified in that category. For the participants who did not provide a self-classification, the RA rating was substituted. Following guidance in previous research, categories were reduced into two groups: Hair Group 1 consisted of categories 1 through 3 and Hair Group 2 consisted of categories 4 through 8. Images for categories 6 and 7 additionally illustrate protective hairstyles represented in the sample.
Hair Group was not associated with sex (p = .21), but was associated with race (χ²(2) = 162.44, p < 0.001), with individuals in Hair Group 1 more likely to be White or Caucasian and individuals in Hair Group 2 more likely to be African-American or Black. We additionally examined whether groups differed in socioeconomic status by comparing the income-to-needs ratio (INR) computed for participants at age 6 months. Research indicates that childhood socioeconomic status is a stronger predictor of later brain function than adult (or concurrent) socioeconomic status (Ursache & Noble, 2016). Groups differed significantly in INR (t = 5.23, p < .001), with Group 1 (M = 2.60, SD = 2.12) closer to 2.5 times the federal poverty line and Group 2 (M = 1.36, SD = 1.10) closer to the poverty line on average.
Notably 34 of the 60 participants in Hair Group 2 had protective hair styles (e.g., cornrows, braids, dreadlocks as illustrated in Figure 1 categories 6 and 7) that are often considered incompatible with EEG assessments (see Choy et al., 2022). In these cases, capping proceeded as normal, and no modifications were made to the participants’ hair.
Gel Volume.
Because hair texture/type is unlikely to be problematic in and of itself, we additionally sought to quantify individual differences in hair volume. This was quantified as the volume of electrode gel used when placing the EEG cap and electrodes on each participant’s head (as described below). The further the distance between the scalp and the electrode holder (into which the electrode sits), the more gel used to fill the column. RAs used plastic syringes with volume indicators (mL) to apply the gel into each electrode holder. RAs recorded the difference between the volume of gel drawn into the syringe and the amount remaining when set-up was completed. In instances where syringes were refilled during the capping process, the cumulative total mL of gel used was recorded.
ERP Protocol and Recording
ERP data were recorded during a rest condition and a cognitive task, both of which were presented using Presentation (Version 21.21.0, Build: 06.06.19), and captured using a BioSemi ActiveTwo system (BioSemi, Amsterdam, Netherlands) with DC amplifiers set at a gain of 1 and a 24-bit A-D conversion resolution. Participants were fitted with a 34-lead unipolar montage (which included additional FCz and Iz electrodes) of sintered silver electrodes placed symmetrically into BioSemi electrode caps at standard extended 10–20 locations over the whole head. Additional electrodes were placed on the left and right mastoid processes, the left suborbital ridge below the pupil, and on the left radial styloid process. Data were digitized at 1024 Hz according to the BioSemi zero reference principle (individual electrode voltage is quantified relative to the common mode sense and driven right leg loop). However, due to RA error, n = 33 recordings were digitized at 512 Hz, and so all recordings were down-sampled to 512 Hz in pre-processing for consistency. Throughout the recordings, electrode DC offsets were maintained below 40 μV as a substitute for traditional impedance measures of signal quality, as recommended by the manufacturer.
During the resting condition, participants sat quietly for 3 minutes while viewing a fixation cross. Due to a set-up template error, 34 participants were shown a moving star-scape rather than a fixation cross (follow-up analyses, presented in Supplemental Tables 1–4, confirm no difference in results as a function of baseline condition, and so the present results are based on the full sample). Following this initial recording, the participants completed the cognitive task as described below, after which the clinic visit was concluded.
Electrophysiological Monetary Incentive Delay Task
Participants completed a version of the monetary incentive delay (MID) task (Knutson, Fong, Adams, Varner, & Hommer, 2001; Knutson, Westdorp, Kaiser, & Hommer, 2000) specifically designed for EEG data collection (eMID) (Broyd et al., 2012). The purpose of the present analysis does not include examining hypotheses related to reward responding for which this task was designed, but rather the task is used to extract basic perceptual and cognitive ERP indicators (described below) to examine whether differences in hair type and volume have a systematic association with ERP data quality and peak amplitude. The task structure is illustrated in Figure 2. Each trial consists of a cue stimulus denoting the nature of the trial (gain, loss, neutral) lasting for 250 ms, followed by a 2-second delay (randomly jittered ± 50 ms) before the presentation of the target stimulus, to which participants are required to respond as quickly as possible via button press. The required response time window was dynamically adjusted at each trial, based on the outcome of the previous trial, to force an individual error rate of approximately 50% and balance the number of successful and erroneous trials. After the response participants are subjected to an additional 1500 ms delay before being provided feedback tied to their performance. When the participant is shown the gain cue (+) this indicates that correct performance will be rewarded, resulting in an addition to the monetary winnings to be accumulated across the task. Incorrect performance on such trials has no effect on current winnings. On loss trials, correct performance is needed to avoid losing money. On neutral trials no wins or losses will take place.
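One common way to hold the error rate near 50% is a one-up/one-down staircase on the response window, sketched below. This illustrates the general idea only. The text states that the window was adjusted from the outcome of the previous trial, but the function name, step size, and bounds here are hypothetical values chosen for the example, not parameters of the eMID task.

```python
def next_window_ms(current_ms: float, previous_hit: bool,
                   step_ms: float = 20.0,
                   min_ms: float = 100.0, max_ms: float = 1000.0) -> float:
    """One-up/one-down staircase for the allowed response window.

    A hit on the previous trial shortens the window (harder); a miss
    lengthens it (easier). With equal step sizes this rule settles where
    hits and misses are about equally likely. All numeric defaults are
    illustrative assumptions.
    """
    proposed = current_ms - step_ms if previous_hit else current_ms + step_ms
    # Keep the window inside the assumed plausible range.
    return max(min_ms, min(max_ms, proposed))
```

Because the rule tracks each participant's own speed, accuracy is roughly equated across individuals, which is why reaction time rather than accuracy is the informative engagement measure in such a design.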
Figure 2 – Trial format of the electrophysiology Monetary Incentive Delay (eMID) task.

Note: On each trial, participants were initially presented with a cue stimulus indicating the trial type (i.e., gain, loss, neutral) and possible outcome (i.e., gain points, avoid gaining points, lose points, avoid losing points, no loss or gain). Following the first inter-stimulus interval (ISI), participants were shown a target stimulus to which they were required to respond as quickly as possible (marked by the red line, response_t). After a second ISI, participants were shown a feedback stimulus indicating the outcome of the trial.
The task was programmed such that participants completed a total of 120 trials (50 gain, 50 loss, 20 neutral trials) that were split as evenly as possible into 3 blocks: block 1 consisted of 17 gain trials, 16 loss trials, 7 neutral trials; block 2 consisted of 17 gain trials, 17 loss trials, 6 neutral trials; and block 3 consisted of 16 gain trials, 17 loss trials, 7 neutral trials. No two successive trials were identical. A 1-minute break was programmed between blocks to provide the participants a short rest. During this rest period participants were provided general feedback of their cumulative score at that point. After completing the third block of trials, participants were shown a screen which presented their final score, and the data collection was concluded. All participants were informed they had won the full $50 prize and were paid in cash.
Of the n = 213 who completed the EEG assessment, 2 participants had corrupt rest-condition data files and 2 different participants had corrupted task-condition data files, making the final analysis sample size n = 211 for each component.
ERP Data Processing
Raw EEG data were processed with Matlab R2019a using the EEGLab (v2021.1; Delorme & Makeig, 2004) and ERPLab (v8.30; Lopez-Calderon & Luck, 2014) toolboxes. In order to objectively examine how processing and analysis of the EEG data may be impacted by differences in hair type or volume, all the EEG data were processed using a typical ERP pre-processing pipeline in a fully automated manner (visually depicted in Figure 3 and described below) that eliminated the possibility of any subjective decisions clouding assessments of data quality across hair groups.
Figure 3 – Schematic illustration of the automated ERP pre-processing pipeline used and its various output variables.

Note: The present processing pipeline consisted of 9 possible steps and their various sub-steps. Steps 1–5 (marked in green) were applied to both the baseline recording and the task recording, while Steps 6–9 (marked in orange) were only applied to the task recording. Solid lines indicate the progression of the pipeline through its various steps, and the dashed lines indicate the points at which the output variables were exported from the pipeline.
SME = Standardized Measurement Error; SNR = Signal-to-Noise Ratio
Raw EEG data.
Raw data recordings, approximately 3 minutes for the resting condition and 15 minutes for the eMID task condition, were imported into EEGLab and combined with channel location information.
Quality of raw EEG Data: DC Offset.
Manufacturer instructions guide researchers to minimize the DC offset, that is, the mean displacement or bias away from zero of the raw signal recorded at a given electrode at the time of setup, in an effort to ensure the effective scalp contact needed for quality recording. RAs are instructed to monitor the DC offset for electrode values that are unstable or out of range. We examine whether hair group is associated with greater offsets by quantifying average DC offset as the average amplitude value (in μV) over the full length of the recording (approximately 3 and 15 minutes for the resting and task conditions, respectively) prior to any pre-processing steps. Manufacturer recommendations indicate that DC offset values between −20 and 20 μV are ideal, and values between −40 and 40 μV are acceptable.
In order to reduce the amount of data for analysis, electrodes were clustered regionally by lobe. Because offsets could be positive or negative, absolute values were used in the regional averages to ensure that high and low values did not cancel each other out and give a false impression of low offset data. DC offset was computed for the following five clusters: Frontal lobe [Fp1, AF3, F3, FC1, FC2, F4, AF4, Fp2, Fz, and FCz]; Central lobe [C3, Cz, and C4]; Temporal lobe [F7, FC5, T7, CP5, P7, F8, FC6, T8, CP6, and P8]; Parietal lobe [CP1, P3, Pz, PO3, CP2, P4, and PO4]; and Occipital lobe [O1, Oz, O2, and Iz]. This allows for analyses to examine regional specificity of data quality compromises that might occur given that hair density is not always distributed evenly across the scalp (for instance, longer hair could contribute to increased density in the occipital region). Given the relative symmetry of skull topography, laterality is not examined. Because midline electrodes are utilized with great frequency in ERP analyses, we analyze these separately.
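The regional averaging of absolute offsets can be illustrated with a minimal sketch. The electrode clusters are those listed above; the function name and the input layout (a mapping from channel label to raw samples in μV) are assumptions made for the example.

```python
from statistics import fmean
from typing import Dict, List

# Electrode clusters as listed in the text.
LOBES: Dict[str, List[str]] = {
    "Frontal": ["Fp1", "AF3", "F3", "FC1", "FC2", "F4", "AF4", "Fp2", "Fz", "FCz"],
    "Central": ["C3", "Cz", "C4"],
    "Temporal": ["F7", "FC5", "T7", "CP5", "P7", "F8", "FC6", "T8", "CP6", "P8"],
    "Parietal": ["CP1", "P3", "Pz", "PO3", "CP2", "P4", "PO4"],
    "Occipital": ["O1", "Oz", "O2", "Iz"],
}


def regional_dc_offset(raw_uv: Dict[str, List[float]]) -> Dict[str, float]:
    """Mean absolute DC offset (in microvolts) per lobe for one recording.

    raw_uv maps a channel label to its raw samples (hypothetical layout).
    Each channel's offset is the mean of its raw signal; the absolute
    value is taken before averaging so that positive and negative
    offsets within a lobe cannot cancel.
    """
    result: Dict[str, float] = {}
    for lobe, channels in LOBES.items():
        offsets = [abs(fmean(raw_uv[ch])) for ch in channels if ch in raw_uv]
        if offsets:
            result[lobe] = fmean(offsets)
    return result
```

With four occipital channels sitting at +10, −10, +20, and −20 μV, the signed average would be 0 μV, whereas the absolute-value average is 15 μV, which is the quantity of interest here.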
Processed EEG data.
Recordings were down-sampled to 512 Hz, temporarily re-referenced to Cz, and parsed through the bad channel detection algorithm embedded in the FASTER plug-in (v1.2.3b; Nolan, Whelan, & Reilly, 2010). The algorithm computes and uses the mean Pearson correlation between the channel of interest and other channels, the variance of the channel of interest relative to other channels, and the Hurst exponent combined with Z-score values greater than 3 to identify and flag bad channels. On average, M = 2.15 (SD = 0.99) bad channels were removed from the rest condition recordings, and M = 2.24 (SD = 0.97) bad channels were removed from task recordings. As a note, due to the size of our electrode montage and relatively large distance between individual electrodes, we chose to not interpolate removed channels.
Remaining data were DC corrected and bandpass filtered using half-amplitude values of 0.1 and 30 Hz and a 2nd-order Butterworth infinite impulse response (IIR) filter with a 12 decibel per octave roll-off (implemented using ERPLab’s zero-phase ‘pop_basicfilter’ function). After filtering, data were re-referenced to the common average and marked as Filtered Signal. The data were then decomposed using Independent Component Analysis (specifically the implementation via the ‘runica’ EEGLab function), and the ICLabel plug-in (v1.3; Pion-Tonachini, Kreutz-Delgado, & Makeig, 2019) was used to identify and remove components classified as 90% or greater likelihood of being eye- and muscle-activity artifacts or channel noise. On average, M = 2.20 (SD = 1.59) components were removed from rest condition recordings (95.69% of those components were classified as eye related, 4.09% as channel noise, and 0.22% as muscle activity), and M = 2.32 (SD = 1.10) components from the eMID task recordings (98.57% were classified as eye activity, 0.41% as channel noise, and 1.02% as muscle activity). The resulting data were labeled as ‘Noise-reduced signal’.
Signal-to-Noise Ratio.
Quality of the processed data from each electrode was quantified with respect to its Signal-to-Noise Ratio (SNR). The SNR of a recording describes the comparative level of a desired signal (i.e., neural activity) to the level of background noise (e.g. non-neural electrical activity, line noise, etc.) within a single recording. Here, SNR was calculated for each electrode recording for each participant following Radüntz (2018) as:
$$\mathrm{SNR} = 10 \log_{10}\left(\frac{\frac{1}{N}\sum_{i=1}^{N} x_i^{2}}{\frac{1}{N}\sum_{i=1}^{N} (s_i - x_i)^{2}}\right) \tag{1}$$
where $N$ is the number of sample points in a recording, $x_i$ is the noise-reduced or cleaned signal at time $i$, and $s_i$ is the band-pass filtered signal at time $i$. As was done for raw data quality, lobe-specific processed data quality was quantified as the average of SNR for electrodes in the Frontal, Central, Temporal, Parietal, and Occipital regions, and across the individual midline electrodes.
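A per-electrode SNR of this kind can be sketched as follows. The sketch assumes that SNR is expressed in decibels as the ratio of the mean power of the cleaned signal to the mean power of what cleaning removed (the filtered signal minus the cleaned signal). That form should be checked against Radüntz (2018) before use, and the function name is hypothetical.

```python
import math
from typing import Sequence


def snr_db(cleaned: Sequence[float], filtered: Sequence[float]) -> float:
    """SNR in dB for one electrode (assumed form, see the note above).

    cleaned  : noise-reduced signal x_i
    filtered : band-pass filtered signal s_i, same length as cleaned
    """
    if len(cleaned) != len(filtered) or len(cleaned) == 0:
        raise ValueError("signals must be non-empty and of equal length")
    n = len(cleaned)
    signal_power = sum(x * x for x in cleaned) / n
    # Noise is taken to be whatever the cleaning step removed.
    noise_power = sum((s - x) ** 2 for s, x in zip(filtered, cleaned)) / n
    if noise_power == 0.0:
        return math.inf   # nothing was removed
    if signal_power == 0.0:
        return -math.inf  # no signal left after cleaning
    return 10.0 * math.log10(signal_power / noise_power)
```

Under this definition, a recording in which the removed activity has one hundredth of the power of the retained signal yields an SNR of about 20 dB.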
ERP analysis data.
The eMID task is designed to examine individual differences in sensitivity to reward and loss by comparing neural activity across trial types. However, our current objective was to identify ERP components that would be minimally related to individual differences in psychological or cognitive processing, such that any group differences detected would likely reflect measurement limitations related to hair volume. As such, we extracted ERP components averaged across all trial types (i.e., gain, loss, neutral). Specifically, two event-related potentials were selected to capture early sensory processing (P1) and later cognitive processing (P3b) of the cue stimuli. By averaging across trial types we postulate that these ERP components reflect basic attentional engagement related to general task performance, and that hair groups will not differ with regard to general task engagement. Because accuracy was artificially constrained, we examined group differences in task engagement by comparing average reaction times. Groups did not differ with regard to reaction time (t = 0.37, p = 0.72; Group 1 M = 210.34 ms, SD = 32.06; Group 2 M = 212.97 ms, SD = 37.52).
The final noise-reduced signals from the eMID task recordings were segmented from −500 to 1000 ms around the cue stimuli and baseline corrected across the −500 to −300 ms pre-stimulus period. After segmentation, segments containing a voltage step greater than 100 μV between successive 200 ms windows (with a 50% overlap), a 30 μV or greater change between sampling points, or a voltage value beyond −100 μV to 100 μV were marked as artifact and removed. On average, these three criteria led to the removal of M = 26.75 (SD = 33.69) trials from the task recordings.
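The three rejection criteria can be expressed as a single-channel check, sketched below. This is not the ERPLab implementation used in the study. In particular, the step criterion is read here as a step-function test (mean of the second half of each 200 ms window minus mean of the first half, with the window advanced by half its length), which is an assumption about how "successive windows" was operationalized. The function name and argument layout are hypothetical.

```python
from statistics import fmean
from typing import Sequence


def is_artifact(epoch_uv: Sequence[float], srate_hz: float = 512.0,
                window_ms: float = 200.0, overlap: float = 0.5,
                step_uv: float = 100.0, point_uv: float = 30.0,
                abs_uv: float = 100.0) -> bool:
    """Return True if one channel of one epoch violates any criterion."""
    # Absolute threshold: any value beyond -100 to 100 microvolts.
    if any(abs(v) > abs_uv for v in epoch_uv):
        return True
    # Sample-to-sample change of 30 microvolts or more.
    if any(abs(b - a) >= point_uv for a, b in zip(epoch_uv, epoch_uv[1:])):
        return True
    # Step criterion over a moving window (assumed half-vs-half reading).
    win = max(2, round(window_ms * srate_hz / 1000.0))
    hop = max(1, round(win * (1.0 - overlap)))
    half = win // 2
    for start in range(0, len(epoch_uv) - win + 1, hop):
        first = fmean(epoch_uv[start:start + half])
        second = fmean(epoch_uv[start + half:start + win])
        if abs(second - first) > step_uv:
            return True
    return False
```

In a multi-channel recording, a trial would typically be dropped when any channel of interest is flagged; that aggregation rule is likewise not specified in the text and is left to the caller.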
Data from the remaining trials, average of M = 93.26 (SD = 33.70) per participant, were used to derive the ERP components. P1 was extracted from electrodes P7 and P8. The grand average waveform is presented in Figure 4. P1 was defined as the mean amplitude across the 60 – 160 ms post-stimulus window. P3b was extracted from the Pz electrode, and the grand average waveform is presented in Figure 5. P3b was defined as the mean amplitude across the 250 – 500 ms post-stimulus window.
Figure 4 – Grand average P1 waveforms and corresponding scalp topography for all participants.

Note: Plot A presents the grand average P1 waveform as derived at the P7 electrode and Plot B presents the grand average P1 waveform at the P8 electrode for the full sample. Polarity is such that positive values are plotted up, and the shaded region indicates the window of measurement. On the right-hand side of the image, the scalp topography corresponding to these waveforms is plotted as the average amplitudes across the 60 – 160 ms post-cue stimuli window.
Figure 5 – Grand average P3b waveform and corresponding scalp topography for all participants.

Note: The left-hand side of the image plots the grand average P3b waveforms as derived at the Pz electrode for the full sample. Polarity is such that positive values are plotted up, and the shaded region indicates the window of measurement. On the right-hand side of the image, the scalp topography corresponding to the Pz waveform is plotted as the average amplitudes across the 250 – 550 ms post-cue stimuli window.
In addition to amplitude, ERPs were examined in terms of their standardized measurement error (SME), which quantifies the data quality (i.e., consistency) for the specific amplitude or latency value of an ERP component (Luck, Stewart, Simmons, & Rhemtulla, 2021). The SME is derived using the following equation:
$$\mathrm{SME} = \frac{\hat{\sigma}}{\sqrt{N}} \tag{2}$$
where $\hat{\sigma}$ is the standard deviation of the time-window mean amplitude (i.e., the mean amplitude across the chosen time window for an individual trial) across all included trials, and $N$ is the number of included trials. The SME was computed across the measurement windows for both the P1 and P3b ERPs using the implementation of the above equation built directly into ERPLab.
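For a mean-amplitude score, the SME reduces to the standard error of the single-trial window means, which the following sketch computes. It is an illustration of the calculation, not the ERPLab routine; the function names and the representation of an epoch as parallel lists of voltages and time stamps are assumptions.

```python
import math
from statistics import fmean, stdev
from typing import Sequence


def window_mean(epoch_uv: Sequence[float], times_ms: Sequence[float],
                start_ms: float, end_ms: float) -> float:
    """Mean amplitude of one trial within [start_ms, end_ms]."""
    values = [v for v, t in zip(epoch_uv, times_ms) if start_ms <= t <= end_ms]
    if not values:
        raise ValueError("no samples fall inside the measurement window")
    return fmean(values)


def sme_mean_amplitude(trial_scores: Sequence[float]) -> float:
    """SME = sample SD of single-trial window means / sqrt(number of trials)."""
    n = len(trial_scores)
    if n < 2:
        raise ValueError("at least two trials are needed to estimate the SD")
    return stdev(trial_scores) / math.sqrt(n)
```

In use, `window_mean` would be applied to every retained trial over the P1 (60 to 160 ms) or P3b (250 to 500 ms) window, and the resulting list passed to `sme_mean_amplitude`. Because the denominator grows with the number of trials, participants who lose more trials to artifact rejection will tend to have larger SME values even when trial-to-trial variability is the same.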
Statistical Analysis
Descriptive analyses were used to examine whether hair type or style were associated with disproportionate refusal of the EEG assessment. We then use t-tests to assess whether gel volume did indeed differ across the two hair groups. Both hair group and gel volume are then tested (via correlation and Chi-square tests) to determine whether they are systematically associated with artifact leading to greater channel/trial loss during the processing stages.
Linear mixed effects models are used to examine group differences in the selected measures of EEG signal including DC Offset, SNR, SME and ERP amplitude. Linear mixed effects models accommodate the nested nature of the data (repeated measures across 5 lobes nested within persons), and are robust to the condition of unequal group sizes. For each measure, models are conducted separately for the rest condition and the task condition data as a function of Hair Group (between-person factor), Lobe (within-person factor) and the Hair Group x Lobe interaction, while accommodating individual differences in overall level via person-level random effects. A comparable model structure is used to examine the midline electrodes. Upon discovery of significant group differences, we check whether those differences are reduced or eliminated through statistical control for differences in gel volume, that is, by adding gel volume and the gel volume x lobe interaction as additional predictors in the model.
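One way to write the model described above, assuming that the person-level random effects consist of a random intercept only (the text does not state whether random slopes were included), is the following, where $y_{ij}$ is the outcome for person $i$ at lobe $j$ and $\mathrm{Lobe}_{jk}$ are indicator variables for the non-reference lobes:

```latex
\begin{align}
y_{ij} &= \beta_0 + \beta_1\,\mathrm{Group}_i
        + \sum_{k}\beta_{2k}\,\mathrm{Lobe}_{jk}
        + \sum_{k}\beta_{3k}\,\bigl(\mathrm{Group}_i \times \mathrm{Lobe}_{jk}\bigr)
        + u_i + \varepsilon_{ij}, \\
u_i &\sim \mathcal{N}(0, \tau^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2).
\end{align}
```

The covariate-adjusted model would then add $\gamma_1\,\mathrm{Gel}_i + \sum_{k}\gamma_{2k}\,(\mathrm{Gel}_i \times \mathrm{Lobe}_{jk})$ to the fixed part, with $\mathrm{Gel}_i$ centered at the sample mean. The symbols $\beta$, $\gamma$, $\tau^2$, and $\sigma^2$ are notation introduced here for illustration.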
All models are estimated in R using the base package for linear regressions, the lme4 package (Bates, Mächler, Bolker, & Walker, 2015) for the linear mixed effects models, and the afex (Singmann, Bolker, Westfall, Aust, & Ben-Shachar, 2023) and emmeans (Lenth, 2022) packages for model evaluation and post-hoc comparisons. Figures were created using the ggplot2 package (Wickham, 2016). Prior to analysis, gel volume was centered at the sample mean. Statistical significance was evaluated at p < .05.
Results
Participation
Of the n = 235 participants who completed the clinic visit, n = 18 individuals declined the invitation to participate in the supplemental ERP assessment. Reasons for declining included: no specific reason (n = 4), a migraine headache (n = 1), lack of time (n = 1), and hair-related reasons (n = 12). Individuals declining for hair-related reasons included both males (n = 2) and females (n = 10), and individuals identifying as Black (n = 6), White (n = 4), and Multi-racial (n = 2). Specific reasons for declining included: having recently had their hair styled (4 Black, 2 White); not wanting gel in their hair (1 Multi-racial, 1 Black, 2 White), having a scalp condition (1 Black), and not having time to wash or restyle their hair before their next engagement (1 White).
Differences in Data Recording: Gel Volume
As expected, the volume of gel used in preparation of the cap differed between Hair Groups 1 and 2 (t = −10.23, p < .001). On average, greater volumes of gel were used when placing the EEG cap on participants in Hair Group 2 (M = 24.17, SD = 10.52 mL) than on participants in Hair Group 1 (M = 9.72, SD = 4.06 mL).
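For readers who want to see how a two-group comparison of this kind is computed, the sketch below implements a t statistic for independent samples in plain Python. The text does not state which t-test variant was used; Welch's unequal-variance form is assumed here because the two groups' standard deviations differ considerably. The gel volumes are invented for illustration and are not the study data.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of the difference in means
    se_sq = var_a / n_a + var_b / n_b
    t_value = (mean_a - mean_b) / math.sqrt(se_sq)
    df = se_sq ** 2 / (
        (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
    )
    return t_value, df


# Hypothetical gel volumes in mL (not the study data)
group1_ml = [6.0, 8.0, 10.0, 12.0, 14.0]
group2_ml = [14.0, 20.0, 24.0, 28.0, 34.0]

t_stat, df = welch_t(group1_ml, group2_ml)
print(round(t_stat, 2), round(df, 2))
```

A negative t value here, as in the reported result, simply reflects that the first group's mean is subtracted from a larger second-group mean.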
Differences in Data Removal During Pre-Processing
There were no significant differences between hair groups in the number of channels that were removed during data cleaning from the rest condition (t = 1.43, p = .16; Hair group 1 M = 2.21, SD = 1.00 vs. Hair Group 2 M = 2.00, SD = .95) or from the task condition (t = −.66, p = .51; Hair group 1 M = 2.21, SD = .93 vs. Hair Group 2 M = 2.32, SD = 1.08). Similarly, there were no significant differences between groups in the number of ICA components (i.e., eye-blink, movement, and noise artifacts) removed for the rest condition (t = −.56, p = .58; Hair Group 1 M = 2.15, SD = 1.28 vs. Hair Group 2 M = 2.32, SD = 2.22) or the task recording (t = 0.57, p = .57; Hair Group 1 M = 2.34, SD = 1.11 vs. Hair Group 2 M = 2.25, SD = 1.07) or in the number of trials rejected as artifact (t = −.40, p = .69; Hair Group 1 M = 26.12, SD = 31.74 vs. Hair Group 2 M = 28.33, SD = 38.41) during the preparation of the ERP data. As shown in Table 1, and consistent with the conductive properties of gel, there was some evidence that greater gel volume was associated with removal of fewer bad channels (r = −0.14, p = .05) from the rest condition. However, gel volume was not related to any other aspect of artifact, channel, or trial removal. Comparable results for the task condition are reported in Supplemental Table 5.
Table 1 – Zero order bivariate correlations among, and means and standard deviations of participant age, data collection variables, and data preprocessing variables
| | 1. | 2. | 3. | 4. | 5. | 6. |
|---|---|---|---|---|---|---|
| 1. Gel Volume | − | | | | | |
| 2. Number of Channels Removed from baseline | −0.14* | − | | | | |
| 3. Number of ICA components removed from baseline | 0.09 | −0.04 | − | | | |
| 4. Number of Channels Removed from eMID Task | 0.00 | 0.14* | −0.01 | − | | |
| 5. Number of ICA components removed from eMID Task | 0.03 | 0.09 | 0.26*** | −0.01 | − | |
| 6. Number of eMID Task Trials rejected as Artifact | 0.00 | 0.02 | −0.07 | −0.20** | −0.20** | − |
| M (SD) | 14.05 (9.40) | 2.15 (0.99) | 2.20 (1.59) | 2.24 (0.97) | 2.32 (1.10) | 26.75 (33.69) |
Note: M = mean, SD = standard deviation. ICA = Independent Component Analysis
* p < .05, ** p < .01, *** p < .001
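The zero-order correlations in Table 1 are Pearson coefficients between pairs of variables. A minimal sketch of that computation, together with the usual conversion of r to a t statistic with n − 2 degrees of freedom, is given below. The data are hypothetical values chosen for illustration, and the helper names are ours.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def r_to_t(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Hypothetical values: gel volume (mL) and number of channels removed
gel_ml = [5.0, 10.0, 15.0, 20.0, 25.0]
channels_removed = [3, 2, 3, 1, 2]

r = pearson_r(gel_ml, channels_removed)
t_value = r_to_t(r, len(gel_ml))
print(round(r, 3), round(t_value, 3))
```

The conversion makes clear why a small coefficient can still reach significance in a large sample: the t statistic scales with the square root of n − 2.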
Differences in Quality of EEG Data
Group differences in quality of raw and processed EEG data (as indicated by DC Offset and SNR, respectively) were examined across five lobes and across the five midline electrodes for both baseline and task recordings. For parsimony of presentation, when the pattern of results was the same across all the various parsing of the data (condition, electrode parsing), only one model is presented in the main text and the remaining analyses are presented in Supplemental Files.
DC Offset.
Means and standard deviations for the absolute value of DC offsets are reported in Table 2, along with the correlations with gel volume. Significant associations between DC offset and gel volume were observed for all electrode clusters, such that greater gel volume was associated with greater deviations from zero. In Figure 6A, the raw values for DC offsets during the baseline recording are plotted for each lobe and group (comparable results for the midline electrodes are reported in Supplemental Figure 1). As illustrated in the plot, the majority of values were well within the typically targeted −20 to 20 μV range. Analysis of variance in DC offset, however, revealed significant effects for Lobe (F(4, 772) = 17.33, p < .001), Hair Group (F(1, 193) = 39.97, p < .001), and the Hair Group x Lobe interaction (F(4, 772) = 9.16, p < .001). Thus, even though the data quality is in the acceptable range, there are systematic hair-related differences in DC offset present in these data. As indicated in the left panel of Figure 6, post-hoc comparisons of the lobe-specific estimated marginal means indicate significant differences between hair groups at all five regions (t’s ≥ 4.10, p’s ≤ .001). On average, individuals in Hair Group 2 had a more negative offset value (estimated differences ranging from −4.75 to −9.44 μV).
Table 2 – Zero order bivariate correlations between EEG gel volume and data quality metrics calculated for the baseline recording
| | Fz | FCz | Cz | Pz | Oz | Front | Central | Temp | Par | Occ |
|---|---|---|---|---|---|---|---|---|---|---|
| DC Offset | | | | | | | | | | |
| Gel Vol | 0.25*** | 0.34*** | 0.16* | 0.28*** | 0.38*** | 0.40*** | 0.34*** | 0.36*** | 0.34*** | 0.30*** |
| M (SD) | 6.50 (5.58) | 5.77 (5.48) | 6.06 (5.30) | 6.93 (6.66) | 6.81 (5.81) | 6.41 (5.53) | 5.24 (5.06) | 6.69 (5.87) | 5.42 (5.31) | 6.78 (6.60) |
| Signal-to-Noise Ratio | | | | | | | | | | |
| Gel Vol | 0.04 | −0.13† | 0.18* | 0.02 | 0.04 | 0.04 | 0.09 | 0.03 | 0.03 | 0.04 |
| M (SD) | 6.09 (8.88) | 16.82 (9.65) | 10.86 (8.05) | 5.70 (9.17) | 4.66 (8.88) | 6.30 (7.97) | 9.78 (8.14) | 6.64 (8.84) | 5.83 (8.50) | 4.98 (8.82) |
Note: Absolute values of the offsets were computed prior to conducting the correlation with gel volume, to quantify distance from 0 without regard for directionality and so that the magnitude of offset would not be obscured in the averaging. Means and standard deviations reported are of the absolute values.
B = Baseline; C/Cent = Central; F/Front = Frontal; FC = Frontocentral; O/Occ = Occipital; Off = Offset; P/Par = Parietal; Temp = Temporal; Vol = Volume; z = Midline
† p < 0.1, * p < 0.05, ** p < 0.01, *** p ≤ 0.001
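The note to Table 2 explains that absolute values were taken before averaging so that offsets of opposite sign would not cancel. The short sketch below illustrates the point with hypothetical offsets in μV (not study data).

```python
# Hypothetical DC offsets (in microvolts) for four electrodes in one cluster
offsets_uv = [-12.0, 9.0, -7.0, 10.0]

# Averaging the signed values lets positive and negative offsets cancel
signed_mean = sum(offsets_uv) / len(offsets_uv)

# Averaging the absolute values preserves the typical distance from zero
abs_mean = sum(abs(value) for value in offsets_uv) / len(offsets_uv)

print(signed_mean, abs_mean)  # prints "0.0 9.5"
```

Here the signed mean suggests a perfectly centered recording, while the mean absolute offset shows that the electrodes sit, on average, 9.5 μV away from zero.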
Figure 6 – Comparison of baseline DC Offset values between self-categorized hair groups across different brain regions.

Note: Figure 6A presents the observed DC offset values computed for the five investigated brain regions, and the accompanying post-hoc between-group comparisons of the estimated marginal means from the simple model predicting DC offset of the baseline recording. Figure 6B presents the results for the same model with gel volume included as a covariate. Across both figures the solid and dashed lines plot the estimated marginal means (with error bars representing the standard error) for hair groups 1 (green) and 2 (orange) respectively, while the violin plots depict the observed data. Underneath each plot are the tabulated type 3 omnibus test results for the respective model that predicted lobular DC offset of the baseline recording.
Cent = Central Lobe; df = Degrees of freedom; Front = Frontal Lobe; Occ = Occipital Lobe; Par = Parietal Lobe; Temp = Temporal Lobe
** p < 0.01, *** p < 0.001
Figure 6B presents the model incorporating gel volume as a statistical covariate. As illustrated in the figure, the main effect for group is no longer significant once gel volume is accounted for. However, the group x lobe interaction remains significant. Post hoc comparisons indicate that while the group differences in the DC offset were eliminated in most regions, controlling for gel volume mitigated, but did not eliminate, the difference in the occipital region (t = 2.80, p = .005). Comparable results for the task condition are reported in Supplemental Figure 2.
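The logic of covariate adjustment can be illustrated with a deliberately simplified sketch. It uses ordinary least squares on one observation per person, not the mixed effects models fitted in the study, and entirely simulated values in which the outcome depends on gel volume alone while gel volume differs between groups. All names, numbers, and the small linear-algebra helpers are hypothetical.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(predictors, outcome):
    """Coefficients (intercept first) for outcome ~ 1 + predictors."""
    rows = [[1.0] + list(values) for values in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(k)]
    return solve(xtx, xty)


# Simulated data: group 2 (coded 1) receives more gel than group 1 (coded 0)
group = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
gel_ml = [6.0, 8.0, 10.0, 12.0, 14.0, 18.0, 22.0, 24.0, 26.0, 30.0]
noise = [0.3, -0.2, 0.1, -0.3, 0.1, -0.1, 0.2, -0.3, 0.3, -0.1]

# The outcome is driven by gel volume only; group has no direct effect
offset_uv = [-0.4 * g + e for g, e in zip(gel_ml, noise)]

# Center the covariate at the sample mean, as described in the Methods
gel_mean = sum(gel_ml) / len(gel_ml)
gel_centered = [g - gel_mean for g in gel_ml]

unadjusted = ols([group], offset_uv)[1]
adjusted = ols([group, gel_centered], offset_uv)[1]
print(round(unadjusted, 2), round(adjusted, 2))
```

In this toy example the apparent group difference of several microvolts nearly vanishes once gel volume enters the model, because the group effect was carried entirely by the covariate. A residual group effect after adjustment, as reported for the occipital region, indicates that gel volume does not capture everything that differs between the groups.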
Signal-to-Noise Ratio.
Correlations, as reported in Table 2, indicate little association between SNR and gel volume, with the only significant correlation observed at Cz (r = 0.18, p < .05). Analysis of variance in SNR of the processed EEG data for the resting condition recording (reported in Supplemental Figure 3) revealed significant effects for lobe (F(4, 772) = 66.11, p < .001), but not for Hair Group (F(1, 193) = 1.67, p = .20) or the Hair Group x Lobe interaction (F(4, 772) = 0.90, p = .46). Analysis of the SNR of the midline electrodes (Figure 7A) likewise revealed a significant main effect for electrode position (F(4, 729.91) = 143.68, p < .001), but not for Hair Group (F(1, 192.94) = 0.55, p = .46). In contrast to the lobe analysis, however, the Hair Group x Electrode interaction was significant (F(4, 729.91) = 10.15, p < .001). Post hoc comparisons of the estimated marginal means revealed a group difference for SNR at Cz (t = −3.07, p = .002), with average SNR of Hair Group 2 being 4.33 units higher than the average SNR of Hair Group 1.
Figure 7 – Comparison of baseline Signal-to-Noise ratio values between self-categorized hair groups across midline electrodes.

Note: Figure 7A presents the observed Signal-to-Noise values computed for the five investigated midline electrode locations, and the accompanying post-hoc between-group comparisons of the estimated marginal means from the simple model predicting Signal-to-Noise ratio of the baseline recording. Figure 7B presents the same results but for the model that also included gel volume as a covariate. Across both figures the solid and dashed lines plot the estimated marginal means (with error bars representing the standard error) for hair groups 1 (green) and 2 (orange) respectively, while the violin plots depict the observed data. Underneath each plot are the tabulated type 3 omnibus test results for the respective model that predicted electrode-specific Signal-to-Noise ratio of the baseline recording.
C = Central; df = Degrees of freedom; F = Frontal; O = Occipital; P = Parietal; z = Midline
† p < 0.1, * p < 0.05, ** p < 0.01
As with previous analyses, the model was computed again with gel volume entered as a covariate. Although attenuated when gel volume was included in the model (Figure 7B), the Hair Group x Electrode term remained significant (F(4, 726.45) = 3.25, p = .01). Post hoc analysis at the Cz site revealed a decrease in the magnitude of the effect (t = −2.06, p = .04); all other electrode comparisons remained non-significant (p’s > .40). In sum, the analyses reported here and in the supplement suggest that hair-related differences in the quality of EEG, both prior to and after processing, can be reduced or eliminated via statistical control of gel volume. Comparable results for lobes and midline electrodes during the task condition are reported in Supplemental Figures 4 and 5.
Differences in Quality of ERP
For parsimony, ERP amplitude was modeled for all three electrode sites in a single model, presented in Figure 8. In the simple model (Figure 8A), there is a significant main effect of Hair Group (F(1, 177.76) = 24.39, p < .001), reflecting the lower amplitudes on average among participants with more textured hair. Follow-up analyses revealed significant differences in P1 amplitude at P7 (t = 3.79, p < .001) and at P8 (t = 3.94, p < .001), with average amplitudes being lower in Group 2 (M’s = 0.70 and 0.80, SD’s = 1.30 and 2.03) relative to Group 1 (M’s = 1.56 and 2.08, SD’s = 1.53 and 1.98). Group differences in P3b ERP amplitude did not reach statistical significance (t = 1.89, p = .06; Hair Group 1 M = 1.88, SD = 1.62 vs. Hair Group 2 M = 1.40, SD = 1.57). Consistent with the association between hair group and gel volume, correlations between gel volume and ERP amplitude followed the same pattern, with a significant negative correlation between gel volume and P1 amplitude at both P7 (r = −0.17, p = .03) and P8 (r = −0.22, p = .005), but not between gel volume and P3b amplitude (r = −0.11, p = .18).
Figure 8 – Comparison of ERP amplitude values between self-categorized hair groups across the P7, P8, and Pz electrodes.

Note: Figure 8A presents the observed ERP amplitude values from the task recording computed for P7, P8, and Pz electrode locations, and the accompanying post-hoc between-group comparisons of the estimated marginal means from the simple model predicting ERP amplitude. Figure 8B presents the same results but for the model that also included gel volume as a covariate. Across both figures the solid and dashed lines plot the estimated marginal means (with error bars representing the standard error) for hair groups 1 (green) and 2 (orange) respectively, while the violin plots depict the observed data. Underneath each plot are the tabulated type 3 omnibus test results for the respective model that predicted electrode-specific ERP amplitude.
df = Degrees of freedom; P = Parietal; z = Midline
† p < 0.1; * p < 0.05, **p < 0.01
In Figure 8B, the same model is presented with gel volume included as a covariate. As with the previous analyses, inclusion of gel volume mitigated the group differences in amplitude (the magnitude of the F statistic was reduced by more than half); however, the main effect of hair group remained significant (F(1, 177.11) = 10.98, p = .001). Group differences in P1 amplitude were reduced in magnitude, although they remained significant at both the P7 (t = 2.39, p = .02) and P8 (t = 2.52, p = .01) electrode sites.
We then examined whether group differences might be accounted for by demographic differences, specifically income-to-needs ratio (INR). Simple correlations between INR and ERP amplitude were not significant for P3b. A significant association did emerge for P1, but only for the P8 electrode (r = 0.19, p = .02). Inclusion of INR as a covariate in the model did not alter the main effect of Hair Group.
Lastly, a similar single-model analysis was run to examine the SME of the ERP components, presented in Figure 9. Analysis of variance of the simple model (Figure 9A) revealed significant effects for electrode (F(2, 340.89) = 5.55, p = .004), but not for Hair Group (F(1, 178.70) = 2.41, p = .12) or the Hair Group x Electrode interaction (F(2, 340.89) = 0.12, p = .89), indicating no significant differences in the internal consistency and reliability of the P1 and P3b ERP components between the two hair groups. These results remained non-significant when including gel volume as a covariate in the model, presented in Figure 9B.
Figure 9 – Comparison of ERP Standardized Measurement Error (SME) values between self-categorized hair groups across the P7, P8, and Pz electrodes.

Note: Figure 9A presents the observed Standardized Measurement Error (SME) values that correspond to the P1 and P3b ERP components computed at P7, P8, and Pz electrode locations, and the accompanying post-hoc between-group comparisons of the estimated marginal means from the simple model predicting SME. Figure 9B presents the same results but for the model that also included gel volume as a covariate. Across both figures the solid and dashed lines plot the estimated marginal means (with error bars representing the standard error) for hair groups 1 (green) and 2 (orange) respectively, while the violin plots depict the observed data. Underneath each plot are the tabulated type 3 omnibus test results for the respective model that predicted electrode-specific SME.
df = Degrees of freedom; P = Parietal; z = Midline
Discussion
The present study aimed to evaluate the effect, if any, that hair type and/or volume has on the quality of ERP data. Results, both qualitative and quantitative, support the feasibility of conducting ERP research with a diverse range of participants. In our sample we experienced a small degree of refusal to complete the ERP assessment, which was slightly higher among Black participants than White participants. However, the majority of Black participants, including those with protective hairstyles, were interested in participation. This desire speaks to the importance of ensuring that ERP assessments are sufficiently inclusive to accommodate all participants. All participants had good signal quality at setup, indicated by stable DC offset values within the target range, and measures of signal quality throughout the processing pipeline did not indicate cause for concern. However, empirical results suggest modest, yet systematic differences in ERP amplitudes. It is critical to consider the implications of this systematic difference, as ERP amplitude differences between racial groups could be misinterpreted as differences in cognitive processes. These differences in signal amplitude are likely a function of the greater distance that electrical signals must travel when the electrode is separated from the scalp due to an excess of hair volume or density. In the current study, we tested this hypothesis by quantifying the amount of gel used during participant set-up, and demonstrate that statistically controlling for gel volume reduces the magnitude of these group differences.
Research assistants were able to place EEG caps successfully on all participants, and DC offsets were within the target range indicative of quality signal recording. This is particularly noteworthy given the prevalence of protective hairstyles among participants, suggesting that as long as participants are fully informed and comfortable with the procedure, special considerations are not needed with regard to cap placement. Although DC offsets were within the target range, analyses did reveal a significant systematic association between hair group and the magnitude of DC offsets observed, with participants in hair group 2 showing greater deviations from zero relative to those in hair group 1. Higher DC offsets could reflect less efficient contact with the skin, potentially as a result of greater hair density. Consistent with this hypothesis, our results show that gel volume correlates with DC offsets across all regions, and inclusion of gel volume as a covariate in the predictive model greatly reduced group differences. Once gel volume was accounted for, group differences were eliminated in the frontal, temporal, and parietal regions, and were attenuated in the occipital region. These results suggest that gel volume may be an effective way to quantify individual differences among participants that could contribute to between-participant (and between-group) variations in data quality.
Additional metrics of data quality were also generally indicative of successful data acquisition across groups. For instance, groups did not differ with regard to the number of bad channels removed, or in the number of artifact components identified in the pre-processing stages. Furthermore, SNR values did not differ between groups, with the exception of the Cz electrode, and this effect indicated greater signal quality in Hair Group 2. Although these metrics present an encouraging story regarding data quality, there was a main effect of hair group on the average amplitude of the measured ERP components, with higher amplitudes observed among hair group 1 on average. Additional analyses examining the standardized measurement error of the ERP components revealed no significant group differences, indicating that differences in amplitude were not attributable to differences in the consistency and reliability of the ERP components across trials. Given the generally positive indicators of data quality, the question remains whether differences in average amplitude should be interpreted as a difference in cognitive processing or as an artifact of hair characteristics. Notably, the inclusion of gel volume in the model eliminated the group difference in P3b amplitude, and group differences in P1 amplitude at both the P7 and P8 sites were greatly attenuated. This suggests that at least some of the variance between the groups is accounted for by differences in hair volume. It is possible that the residual group differences at the P7/P8 electrodes indicate that the effects of hair on data quality are amplified by scalp topography. Indeed, the effects that were not eliminated by accounting for gel volume were observed in the occipital and lateral regions. If this is the case, researchers should carefully consider potential group-specific implications of electrode location when interpreting effects.
It is also possible, however, that residual differences in P1 amplitude are attributable to group differences other than hair. Although all participants were originally recruited as a part of a study on rural poverty, participants in group 2 experienced greater poverty on average. A recent review of studies examining associations between poverty and ERP components noted inconsistent associations with P1 (Perera, Salehuddin, Khairudin, & Schaefer, 2021). Our own data indicate a small but significant association between INR and P1 amplitude, but only at the P8 electrode, suggesting that the effect is not particularly robust. However, there could be other demographic factors with a stronger association with P1 than INR. More research is needed to understand such effects, especially given that previous research did not control for gel volume, which our data suggest contributes significantly to amplitude levels and differs systematically by racially-correlated hair features.
Conclusions
The findings from this study suggest that individual differences in hair should not be considered exclusionary to ERP research, but that hair volume contributes to individual differences in the distance between the electrode and the scalp surface, which may result in lower signal strength, a metric not captured by the other indicators of data quality. While this is likely to be true across all participants, this potential nuisance can only be ignored if it can be assumed that hair volume is distributed across participants at random. This assumption is violated by potentially systematic differences in hair volume, including racially-correlated differences in follicle density or differences in hair styles that could also differ across genders. Thus, hair could introduce a systematic bias into ERP research that could be misattributed to psychological differences among groups. Our findings suggest that measuring gel volume provides a more accurate, individualized, and sensitive measure of individual differences that can account for this potential confound, and can be easily incorporated into cap preparation. While the quantification of gel volume may help mitigate this problem, additional research is needed to further examine potential measurement confounds that could impact the validity of psychophysiological techniques for certain populations. In the meantime, researchers are encouraged to carefully consider possible measurement confounds, particularly with regard to regional topography that may moderate the impact of hair volume on ERP components in certain locations.
It is important to acknowledge, however, that this approach is only applicable to gel-based EEG systems, and that a number of recording systems exist that do not use gel as a conductive medium. For instance, some systems utilize sponges that absorb and hold an electrolyte solution. Our findings do not address whether such systems are less capable of accommodating different hair types. Additional research is needed to evaluate whether alternative measures of hair volume, for instance the distance between the scalp and the hair surface when the hair is compressed into the cap, would be equally effective in reducing the confounding effect of hair volume on signal quality. Given the interest in expanding the diversity of participants in EEG research, and the implications of misattributing signal differences, we strongly encourage researchers to take steps to quantify elements of data quality for their recordings. These metrics can then be used to examine whether any identified between-group and/or -person effects are a function of potential artifact or data quality rather than a true psychophysiological difference.
It is likely that gel-based systems do offer an advantage in this regard, and they have been associated with generally better signal quality (Troller‐Renfree et al., 2021). Morphological, physical, and mechanical properties associated with different hair types may also affect how different systems perform. For instance, highly textured hair associated with African heritage is less likely to absorb liquid compared to Asian and Caucasian hair (Franbourg et al., 2003). We postulate that this physical property combined with greater density provides a strong boundary for holding the gel in place to carry the signal to the electrode. Such hair may be less prone to absorbing liquid that could lead to drying and degrading signal over time, as well as less seepage of the solution, which can lead to bridging artifact in high density montages.
There has been a substantial increase in attention to issues of diversifying participation in EEG research in the past few years. A variety of papers have offered guidance to researchers on the importance of increasing research team awareness of the breadth of hair types and styles that they might encounter, and of ensuring that the team is appropriately trained to engage with all participants with respect and sensitivity (see Louis, Webster, Gloe, & Moser, 2022; Parker & Ricard, 2022). Researchers can also access, and contribute to, a dynamic crowd-sourced guidance document for conducting EEG research with a diverse range of participants, including guidance on preparing participants in advance of the recording session, guidance on maximizing scalp access, and recommendations to individuals regarding post-visit hair care (see https://hellobrainlab.com/research/eeg-hair-project/). Drawing from our own experiences over the past 15 years of working with Black participants in psychophysiological research, we echo the feasibility of collecting EEG/ERP data across all hair types and styles. We encourage researchers to avoid routine screening and exclusion of participants on the basis of hair. Thorough advance communication about the process is important, and individuals can decide for themselves whether and when they wish to participate. Videos or brochures can be a helpful resource for potential participants.
Limitations
While we provide some insight into data quality across varying hair types and textures, our study is not without limitations. Firstly, it should be noted that this is only a single dataset recorded from one specific hardware system. Additionally, the density of the present montage (34 electrodes) is fairly low, and it is possible that increasing montage density to 64 or 128 electrodes may interact with a participant’s hair to present additional challenges not seen in our montage. As such, replications should aim to expand the size of the electrode montage as well as the hardware used, so as to further explore how hair texture/type affects signal quality and noise. It is also possible that such efforts could be enacted by analyzing data quality metrics from archival data if already available, although it is unlikely that measures of gel volume or an alternative quantification of hair volume are available for archival datasets.
In a discussion of EEG data quality, we would be remiss not to mention the breadth of ways in which EEG and ERP data are processed prior to analysis (e.g., filter and reference selection, interpolation, ICA decomposition, artifact cleaning methods, etc.) and how indices are measured and quantified (e.g., peak and mean amplitude, difference waves, base-to-peak, peak-to-peak, sliding average, etc.) across the broader literature. It is possible that variations within data processing pipelines and index quantification could generate differences in some data quality metrics and may present a challenge when it comes to comparing data quality across cohorts.
Moving forward, to address the lack of diversity among its research participants, psychophysiological and EEG research needs to more actively embrace community-centered approaches to the collection of data. More specifically, we should look to engage and work with community members to recruit under-represented participants and be active in seeking to understand the different burdens of participation across participants. Finally, increased efforts to report the demographic details of samples will help establish a greater foundation of literature from which meta-analyses and literature reviews can begin to explore possible sources of systematic bias in research findings. In addition, we would also highlight the need to explicitly report demographics for those participants who were excluded from the study both before and after data processing, as this will better enable us, as a field, to identify sources of bias that may contribute to systematic exclusion of minority-identifying individuals from EEG research.
Supplementary Material
Impact Statement.
Differences in hair volume that frequently vary by race may introduce systematic bias in ERP amplitudes. Our results indicate that accounting for these differences by quantifying gel volume used during cap setup provides a means to statistically control this confound, facilitating greater inclusivity of participants in EEG research.
Acknowledgments
We wish to extend our thanks to the study participants and their families for providing a detailed and extended look at their daily lives, and to the many research assistants who helped obtain such rich data. We also wish to recognize and thank our RA teams, in particular the UNC team, for their community and cultural understanding which has been invaluable in this project. Funding for this project was provided by the Environmental Influences on Child Health (ECHO) program (UH3 OD023332). Ty Lees is now at the Laboratory of Affective and Translational Neuroscience, Center for Depression, Anxiety and Stress Research, McLean Hospital and the Department of Psychiatry, Harvard Medical School. We have no known conflicts of interest to disclose. The data that support the findings of this study are available on request from the corresponding author.
Footnotes
CRediT Statement
Conceptualization: LMGK, MMS, TL
Data curation: TL
Formal Analysis: TL, NR
Funding Acquisition: LMGK, MMS
Methodology: LMGK, TL
Project Administration: LMGK, MMS
Resources: LMGK, MMS
Supervision: LMGK
Visualization: TL
Writing – Original Draft: TL, LMGK, NR
Writing – review & editing: TL, LMGK, MMS, NR
References
- Bates D, Mächler M, Bolker B, & Walker S (2015). Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67(1), 1–48. 10.18637/jss.v067.i01
- Bernard BA (2003). Hair shape of curly hair. Journal of the American Academy of Dermatology, 48(6), S120–S126. 10.1067/mjd.2003.279
- Bradford DE, DeFalco A, Perkins ER, Carbajal I, Kwasa J, Goodman FR, … Joyner KJ (2022). Whose Signals Are Being Amplified? Toward a More Equitable Clinical Psychophysiology. Clinical Psychological Science, 216770262211121. 10.1177/21677026221112117
- Broyd SJ, Richards HJ, Helps SK, Chronaki G, Bamford S, & Sonuga-Barke EJS (2012). An electrophysiological monetary incentive delay (e-MID) task: A way to decompose the different components of neural response to positive and negative monetary reinforcement. Journal of Neuroscience Methods, 209(1), 40–49. 10.1016/j.jneumeth.2012.05.015
- Choy T, Baker E, & Stavropoulos K (2022). Systemic Racism in EEG Research: Considerations and Potential Solutions. Affective Science, 3(1), 14–20. 10.1007/s42761-021-00050-0
- de la Mettrie R, Saint-Léger D, Loussouarn G, Garcel A, Porter C, & Langaney A (2007). Shape Variability and Classification of Human Hair: A Worldwide Approach. Human Biology, 79(3), 265–281. 10.1353/hub.2007.0045
- Delorme A, & Makeig S (2004). EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. Journal of Neuroscience Methods, 134(1), 9–21. 10.1016/j.jneumeth.2003.10.009
- Etienne A, Laroia T, Weigle H, Afelin A, Kelly SK, Krishnan A, & Grover P (2020). Novel Electrodes for Reliable EEG Recordings on Coarse and Curly Hair. 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), 2020-July, 6151–6154. IEEE. 10.1109/EMBC44109.2020.9176067
- Franbourg A, Hallegot P, Baltenneck F, Toutaina C, & Leroy F (2003). Current research on ethnic hair. Journal of the American Academy of Dermatology, 48(6), S115–S119. 10.1067/mjd.2003.277
- Gatzke-Kopp LM (2016). Diversity and representation: Key issues for psychophysiological science. Psychophysiology, 53(1), 3–13. 10.1111/psyp.12566
- Keil A, Debener S, Gratton G, Junghöfer M, Kappenman ES, Luck SJ, … Yee CM (2014). Committee report: Publication guidelines and recommendations for studies using electroencephalography and magnetoencephalography. Psychophysiology, 51(1), 1–21. 10.1111/psyp.12147
- Kissel HA, & Friedman BH (2023). Participant diversity in Psychophysiology. Psychophysiology, 60(11), e14369. 10.1111/psyp.14369
- Knutson B, Fong GW, Adams CM, Varner JL, & Hommer D (2001). Dissociation of reward anticipation and outcome with event-related fMRI. NeuroReport, 12(17). Retrieved from https://journals.lww.com/neuroreport/toc/2001/12040
- Knutson B, Westdorp A, Kaiser E, & Hommer D (2000). FMRI Visualization of Brain Activity during a Monetary Incentive Delay Task. NeuroImage, 12(1), 20–27. 10.1006/nimg.2000.0593
- Kreplak L, Briki F, Duvault Y, Doucet J, Merigoux C, Leroy F, … Dumas P (2001). Profiling lipids across Caucasian and Afro-American hair transverse cuts, using synchrotron infrared microspectrometry. International Journal of Cosmetic Science, 23(6), 369–374. 10.1046/j.0412-5463.2001.00118.x
- Lenth RV (2022). Estimated Marginal Means, aka Least-Squares Means. Retrieved from https://cran.r-project.org/package=emmeans
- Lindelöf B (1988). Human Hair Form. Archives of Dermatology, 124(9), 1359. 10.1001/archderm.1988.01670090015003
- Lopez-Calderon J, & Luck SJ (2014). ERPLAB: an open-source toolbox for the analysis of event-related potentials. Frontiers in Human Neuroscience, Vol. 8. 10.3389/fnhum.2014.00213
- Louis CC, Webster CT, Gloe LM, & Moser JS (2022). Hair me out: Highlighting systematic exclusion in psychophysiological methods and recommendations to increase inclusion. Frontiers in Human Neuroscience, 16. 10.3389/fnhum.2022.1058953
- Loussouarn G, Garcel A, Lozano I, Collaudin C, Porter C, Panhard S, … de La Mettrie R (2007). Worldwide diversity of hair curliness: a new method of assessment. International Journal of Dermatology, 46(s1), 2–6. 10.1111/j.1365-4632.2007.03453.x
- Luck SJ (2014). An Introduction to the Event-Related Potential Technique (2nd ed.). MIT Press.
- Nolan H, Whelan R, & Reilly RB (2010). FASTER: Fully Automated Statistical Thresholding for EEG artifact Rejection. Journal of Neuroscience Methods, 192(1), 152–162. 10.1016/j.jneumeth.2010.07.015
- Parker TC, & Ricard JA (2022). Structural racism in neuroimaging: perspectives and solutions. The Lancet Psychiatry, 9(5), e22. 10.1016/S2215-0366(22)00079-7
- Perera-W.A. H, Salehuddin K, Khairudin R, & Schaefer A (2021). The relationship between socioeconomic status and scalp event-related potentials: A systematic review. Frontiers in Human Neuroscience, 15, e601489. 10.3389/fnhum.2021.601489
- Picton TW, Bentin S, Berg P, Donchin E, Hillyard SA, Johnson R JR., … Taylor MJ (2000). Guidelines for using human event-related potentials to study cognition: Recording standards and publication criteria. Psychophysiology, 37(2), 127–152. 10.1111/1469-8986.3720127
- Pion-Tonachini L, Kreutz-Delgado K, & Makeig S (2019). ICLabel: An automated electroencephalographic independent component classifier, dataset, and website. NeuroImage, 198, 181–197. 10.1016/j.neuroimage.2019.05.026
- Pivik RT, Broughton RJ, Coppola R, Davidson RJ, Fox N, & Nuwer MR (1993). Guidelines for the recording and quantitative analysis of electroencephalographic activity in research contexts. Psychophysiology, 30(6), 547–558. 10.1111/j.1469-8986.1993.tb02081.x
- Radüntz T (2018). Signal Quality Evaluation of Emerging EEG Devices. Frontiers in Physiology, 9(FEB), 1–12. 10.3389/fphys.2018.00098
- Singmann H, Bolker B, Westfall J, Aust F, & Ben-Shachar M (2023). afex: Analysis of Factorial Experiments. Retrieved from https://cran.r-project.org/package=afex
- Troller-Renfree SV, Morales S, Leach SC, Bowers ME, Debnath R, Fifer WP, … Noble KG (2021). Feasibility of assessing brain activity using mobile, in-home collection of electroencephalography: methods and analysis. Developmental Psychobiology, 63(6), 1–11. 10.1002/dev.22128
- Ursache A, & Noble KG (2016). Neurocognitive development in socioeconomic context: Multiple mechanisms and implications for measuring socioeconomic status. Psychophysiology, 53, 7–82. 10.1111/psyp.12547
- Vernon-Feagans L, Cox M, & The Family Life Project Key Investigators. (2013). The Family Life Project: an epidemiological and developmental study of young children living in poor rural communities. Monographs of the Society for Research in Child Development, 78(5), vii–vii. 10.1111/mono.12046
- Webb EK, Etter JA, & Kwasa JA (2022). Addressing racial and phenotypic bias in human neuroscience methods. Nature Neuroscience, 25(4), 410–414. 10.1038/s41593-022-01046-0
- Wickham H (2016). ggplot2. New York, NY: Springer New York. 10.1007/978-0-387-98141-3
