Abstract
Researchers are increasingly utilizing physiological data like electrodermal activity (EDA) to understand how stress “gets under the skin”. Results of EDA studies in autistic children are mixed, with some suggesting autistic hyperarousal, others finding hypoarousal, and others finding no difference compared to non-autistics. Some of this variability likely stems from the different techniques used to assess EDA. Therefore, the purpose of this study is to investigate and compare commonly used metrics of EDA (frequency of peaks, average amplitude of peaks, and standard deviation of skin conductance level) using two data processing programs (NeuroKit2 and Ledalab) and their link to observed child behavior. EDA data was collected using Empatica E4 Wristbands from 60 autistic children and adolescents (5-18 years old) during a seven-minute play interaction with their primary caregiver. The play interaction was coded for a range of child behaviors including mood, social responsiveness, dysregulation, and cooperation. Results indicate a strong correlation between NeuroKit2 and Ledalab and a weak correlation between metrics within each program. Furthermore, the frequency of peaks was associated with more positive child social behaviors, and the magnitude of peaks was associated with less adaptive child behaviors. Recommendations for replication and the need for generalizability of this research are given.
The search for neurophysiological mechanisms and biomarkers related to outcomes for individuals with intellectual and developmental disabilities (IDD) is a top research priority (NICHD, 2020). One neurophysiological mechanism that has gained traction is electrodermal activity (EDA), a marker of the sympathetic branch of the autonomic nervous system. Skin conductance, the most widely studied property of EDA, is a non-invasive assessment of under-the-skin processes associated with stressful, sensory, and affective stimuli or situations (Braithwaite et al., 2015). The use of EDA with IDD populations, particularly autism, is growing; however, this research has yet to establish a common set of methodological and analytic strategies to use with this data, severely limiting research generalizability. This paper examines the conceptual and methodological challenges associated with EDA research in autism and provides an example of these core issues using EDA data from a sample of 60 autistic children.
The sympathetic branch (SNS) of the autonomic nervous system initiates the fight or flight response. A metric of SNS response is EDA. Skin conductance is a widely studied property of EDA and can be estimated by applying a constant, undetectable low voltage between two points of skin contact and measuring the current flow between them (Posada-Quintero & Chon, 2020). When the SNS is activated, sweat rises to the skin's surface via cholinergic innervation, resulting in increased skin conductance and decreased skin resistance. Unlike other metrics of autonomic functioning that typically capture trait-level arousal, EDA reflects moment-to-moment engagement with the environment (Braithwaite et al., 2015). EDA is also one of the few measures of SNS functioning that is uncontaminated by parasympathetic activity, making it an ideal candidate for a neurophysiological biomarker of under-the-skin processes (Boucsein, 2012; Braithwaite et al., 2015; Dawson et al., 2017; Posada-Quintero & Chon, 2020). Measuring under-the-skin processes can be particularly helpful when working with individuals for whom self-report of arousal states may be difficult such as those on the autism spectrum (Mazefsky, Kao, & Oswald, 2011; White et al., 2009; for a review of interoceptive ability in autism see DuBois et al., 2016).
The polyvagal theory posits that autistic children should have an overly sensitive sympathetic response (Barbier, Chen, & Huizinga, 2022). Thus, hyperarousal of the autonomic nervous system would reflect a continuous state of “fight or flight”, resulting in an exaggerated sympathetic response to stress; however, empirical support for this theory in autism is mixed. Some studies report hyperarousal of the SNS, some evidence suggests hypoarousal, and others report no difference in SNS functioning as measured by EDA compared to non-autistic children (Airij et al., 2020; Ferguson et al., 2019; Kushki et al., 2013; Lydon et al., 2016; Panju et al., 2015; Visnovcova et al., 2022). A recent review of studies examining potential autonomic nervous system dysfunction in autism found little empirical evidence for sympathetic hyperarousal as measured by EDA (Barbier et al., 2022).
Initial research efforts hypothesized that a higher degree of autistic traits would be associated with an increased EDA response (hyperarousal) to various types of stimuli (e.g., social, sensory, etc.; e.g., Fenning et al., 2017; Prince et al., 2017); however, other evidence suggests that EDA may be a better indicator of behavioral functioning in emotionally challenging situations (Vernetti et al., 2020) or co-occurring internalizing conditions like anxiety (Barbier et al., 2022). Alternative evidence suggests that hypoarousal may be more likely to characterize certain autistic children for whom externalizing issues are present (Baker et al., 2018). Others have found evidence of different profiles of SNS functioning among autistic children, with certain profiles of EDA variability associated with distinct behavioral phenotypes (Parma et al., 2021; Schoen et al., 2008). Altogether, the empirical support linking EDA and autism is mixed at best.
There are several potential explanations for such inconclusive evidence. It is often difficult to obtain a good EDA signal when working with children generally, but especially with populations, such as autistic children, that have sensory sensitivities. Some (autistic) children, for example, cannot tolerate sticky electrodes or the gel needed to place electrodes on the skin. The inconclusive evidence may also be related to how researchers process and analyze EDA data. Researcher degrees of freedom surrounding EDA data are quite large, with choices beginning during data collection, continuing through data processing, and culminating in data analysis with little guidance currently provided in the literature. Specifically, many choices are made during the multi-step pre-processing of the raw EDA data. First, researchers need to account for significant individual differences in baseline skin conductance and range of skin conductance values. Second, EDA is often heavily skewed and/or kurtotic, and data transformations are usually recommended. Finally, to make the data meaningful, there is a need to distinguish “noise” from the true neurophysiological response. Undertaking data preprocessing requires the use of specific software, and researchers must choose between expensive, proprietary software or several open-source packages, both of which often require significant expertise and use different criteria for “cleaning” the data, applying filters, and defining variables (e.g., peaks), some of which are modifiable and some of which are not. It is unclear whether the results from different data processing programs are comparable.
Researchers must also choose what metric best characterizes the EDA response. EDA includes hundreds, if not thousands of data points for each individual, and researchers are tasked with distilling large amounts of data into a usable number for analysis. The three most common metrics used in situations that do not apply a specific stimulus are (1) the frequency of peaks, (2) the average amplitude of peaks, and (3) the standard deviation of the skin conductance level (SCL). The frequency of EDA peaks is thought to be a quantitative metric of emotional arousal by capturing quick, sudden changes in phasic responses of the sympathetic nervous system. The average amplitude of peaks is also considered a metric of sympathetic activity that captures the average magnitude of sympathetic responses. Peak amplitude is calculated by taking the difference between the skin conductance values at a peak and the preceding trough (Benedek & Kaernbach, 2010). Finally, the standard deviation of SCL is thought to capture an individual’s level of general arousal. It is unclear which metrics should be used in what circumstance and if they are interchangeable.
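As a rough illustration of how the three metrics differ, the following Python sketch computes them from a short conductance series. The function names, the simple local-maximum peak rule, and the use of the whole signal as a stand-in for SCL are assumptions made for illustration only; this should not be read as the algorithm used by either NeuroKit2 or Ledalab.

```python
from statistics import mean, stdev


def find_peaks_and_troughs(signal):
    """Return (trough_index, peak_index) pairs, one per local maximum."""
    pairs = []
    trough = 0  # index of the most recent local minimum (starts at sample 0)
    for i in range(1, len(signal) - 1):
        if signal[i] <= signal[i - 1] and signal[i] < signal[i + 1]:
            trough = i
        elif signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]:
            pairs.append((trough, i))
    return pairs


def summarize_eda(signal, min_amplitude=0.03):
    """Compute peak frequency, mean peak amplitude, and SD of the signal."""
    # Peak amplitude = conductance at the peak minus the preceding trough.
    amplitudes = [signal[p] - signal[t] for t, p in find_peaks_and_troughs(signal)]
    amplitudes = [a for a in amplitudes if a >= min_amplitude]
    return {
        "peak_frequency": len(amplitudes),
        "mean_amplitude": mean(amplitudes) if amplitudes else 0.0,
        # Simplification: the raw signal stands in for the tonic SCL here.
        "scl_sd": stdev(signal),
    }
```

Two recordings can share a similar peak frequency while differing widely in mean amplitude or variability, which is one reason the metrics need not be interchangeable.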
The use of these metrics, alongside differences in data processing, may play a role in the mixed findings that characterize EDA research in autism and inhibit the potential use of EDA as a biomarker. Therefore, the purpose of the current study is to describe EDA collected during a parent-child interaction task in a sample of 60 autistic individuals (5-18 years old) using two open-source processing packages. First, we compare the three most common metrics of EDA (frequency of peaks, amplitude of peaks, and standard deviation of SCL) both within and between each program to examine if the programs and the metrics are capturing the same underlying construct. Second, we examine whether any of the six EDA metrics are correlated with observed child behavior during a parent-child play interaction. Because our goals are primarily descriptive in nature, we do not have a priori hypotheses about how the EDA metrics may be associated with each other or with behavioral observation.
Methods
Participants
Seventy-eight autistic children (5-18 years old) participated in a semi-structured parent-child interaction task as part of a larger longitudinal study on families of autistic children [removed for review]. All children had a documented diagnosis of autism spectrum disorder from a medical or education professional, assessed using the Autism Diagnostic Observation Schedule (ADOS-2; Lord et al., 2012). See Table 1 for demographic information about the sample.
Table 1.
Sample demographic characteristics
| Child Variables | |
|---|---|
| Age (M, SD; range) | 12.323 (3.26); 5-18 |
| Sex [n (% male)] | 51 (82.3) |
| ID Status [n (% yes)] | 19 (20.0) |
| Age (in months) at autism dx (M, SD; range) | 48.14 (22.53); 18-140 |
| Parent & Family Variables | |
| Age (M, SD; range) | 42.69 (5.62); 28-58 |
| Race [n (%)] | |
| White | 72 (91.7) |
| African-American/Black | 2 (2.1) |
| Hispanic | 2 (2.1) |
| American Indian | 0 |
| Asian or Pacific Islander | 0 |
| Multiple chosen | 2 (2.1) |
| Yearly Household Income [n (%)] | |
| Less than $10k | 12 (19.0) |
| $10-19,999 | 8 (12.7) |
| $20-29,999 | 8 (12.7) |
| $30-39,999 | 5 (7.9) |
| $40-49,999 | 8 (12.7) |
| $50-59,999 | 2 (3.2) |
| $60-69,999 | 8 (12.7) |
| $70-79,999 | 1 (1.6) |
| $80-89,999 | 2 (3.2) |
| $90-99,999 | 1 (1.6) |
| $100,000 or more | 2 (3.2) |
| Parent Employment [n (%)] | |
| Unemployed | 9 (14.3) |
| Homemaker | 10 (15.9) |
| Full-time | 19 (30.2) |
| Missing | 4 (6.3) |
Procedure
All procedures for the larger study were approved by an institutional review board. Families completed a 2-hour research session either at home or in a laboratory setting according to their preference. Each parent-child dyad completed a seven-minute, semi-structured, goal-based play session involving either putting together a puzzle or a Lego set, as was developmentally appropriate for the child. The current task was adapted from the structured task paradigm used by Blacher, Baker, & Kaladjian (2012). The dyad was given broad instructions to “complete the puzzle/Lego set” and for the parent to “help their child in whatever way they saw fit”. The child wore the Empatica E4 wristband (McCarthy et al., 2016) during the interaction. The E4 wristband measures a range of neurophysiological indicators, including EDA. Of the total sample, 60 children completed the parent-child interaction while wearing the E4 device.
Measures
Electrodermal Activity
EDA was collected using Empatica E4 Wristbands (McCarthy et al., 2016) sampled at 4 Hz, with valid values ranging from 0.01 to 100 μSiemens. Each participant had seven minutes of data resulting in 1680 EDA samples per participant. Data were processed using two open-source programs: Ledalab and NeuroKit2 (Benedek & Kaernbach, 2010; Makowski et al., 2021). Ledalab is a free and open-source Matlab-based program used for the analysis of skin conductance data (Benedek & Kaernbach, 2010). NeuroKit2 is a Python package designed for processing neurophysiological signals, including EDA (Makowski et al., 2021).
The code used to process the raw EDA data in both programs is described in Appendix A. Peaks were defined as EDA responses with a minimum amplitude of 0.03 μSiemens. Very low values (< 0.01 μSiemens) were excluded from the data as noise. Peaks that occurred within 0.5 seconds (2 samples) of a “jump” in the amplitude of 0.5 μSiemens or greater were considered artifacts and removed from the data. Following guidelines by Dawson and colleagues (2017), peaks with an amplitude greater than 1.0 μSiemens (or 3 standard deviations) were also excluded. Finally, after peaks near jumps and peaks with too high an amplitude had been excluded, we excluded the lower-amplitude peak in any pair of peaks within 1 second of each other (Hernandez et al., 2014). Given the large variability in baseline EDA levels, the data were standardized prior to analysis to facilitate interpretation.
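The exclusion rules above can be sketched in Python as follows. This is an illustrative reading of the rules, not the code in Appendix A: the function names are hypothetical, a “jump” is assumed to mean a sample-to-sample change of 0.5 μSiemens or more, the alternative 3-standard-deviation ceiling is omitted, and close pairs of peaks are resolved greedily in time order.

```python
def find_jump_times(signal, sampling_rate=4, jump_size=0.5):
    """Times (s) at which the signal changes by jump_size or more between samples."""
    return [
        i / sampling_rate
        for i in range(1, len(signal))
        if abs(signal[i] - signal[i - 1]) >= jump_size
    ]


def filter_peaks(peaks, jump_times, min_amp=0.03, max_amp=1.0,
                 jump_window_s=0.5, min_gap_s=1.0):
    """Apply the exclusion rules to a list of (time_s, amplitude) tuples."""
    kept = []
    for time_s, amp in sorted(peaks):
        # Rule 1 and rule 3: amplitude floor and ceiling.
        if amp < min_amp or amp > max_amp:
            continue
        # Rule 2: drop peaks that fall close to a jump artifact.
        if any(abs(time_s - j) <= jump_window_s for j in jump_times):
            continue
        # Rule 4: of two surviving peaks within min_gap_s, keep the larger.
        if kept and time_s - kept[-1][0] <= min_gap_s:
            if amp > kept[-1][1]:
                kept[-1] = (time_s, amp)
            continue
        kept.append((time_s, amp))
    return kept
```

Because the amplitude and jump checks run before the spacing check, the order matches the description above: close pairs are only compared among peaks that survived the earlier exclusions.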
Child Observed Behavior
The goal-based play session videos were later coded by a research team of post-doctoral and graduate-level researchers. All codes characterize global-level behaviors. The positive and negative mood codes were adapted from the Parent-Child Interaction Rating Scale (Belsky, Crnic, & Woodworth, 1995), which has been used in studies with autistic populations (e.g., Blacher, Baker, & Kaladjian, 2013); all other codes except emotion dysregulation (i.e., responsiveness to caregiver, social initiation, autonomy/independence, noncompliance, dyadic reciprocity, dyadic cooperation, and dyadic conflict) were taken from the Parent-Child Interaction System (Deater-Deckard, Pylas, & Petrill, 1997). Observers rated child and dyadic behaviors on a 7-point Likert scale (1 = ‘no occurrence of behavior’ to 7 = ‘continual occurrence of behavior’). Emotion dysregulation was coded based on methods used in Hoffman, Crnic, & Baker (2006) using a one to five Likert scale in which higher scores indicated high frequency and/or high-intensity displays of dysregulation. All coders reached an 80% reliability cutoff during training and reliability was maintained throughout the process (ICC values ranged from .625 to .967).
Autism Traits
Parents reported their child’s autism traits using the 65-item Social Responsiveness Scale (SRS-2; Constantino & Gruber, 2012). The SRS-2 queries a range of characteristics and behaviors from the previous six months related to the child’s social communication, restricted interests, and repetitive behaviors, rated on a scale of 1 (‘not true’) to 4 (‘almost always true’). Items were summed to create a total score such that higher scores reflect a greater degree of autism traits.
Demographics
Child demographic characteristics such as sex assigned at birth, age, and intellectual disability status (ID; coded as 1 = ‘yes’, 0 = ‘no’) were provided by parents.
Data Analysis
Data were exported from both Ledalab and NeuroKit2 into a .csv file. Analyses were conducted in SPSS 29.0. Three participants were identified as outliers (+/− 3 standard deviations from the sample mean on at least 3 EDA metrics) and were removed from the dataset. We used bivariate correlations to examine associations between child sociodemographic characteristics (sex assigned at birth, age, ID status), autistic traits, and all six metrics of EDA (i.e., separate Ledalab and NeuroKit2 values for frequency of peaks, amplitude of peaks, and standard deviation of SCL) to identify additional covariates aside from body temperature and movement (ACC) collected via the E4 wristband. Next, partial correlations and paired-sample t-tests were used to examine associations within and between metrics using Ledalab and NeuroKit2. Finally, partial correlations examined associations between EDA metrics and observed child- and dyadic-level behaviors during the play task.
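For readers less familiar with partial correlations, the sketch below shows one standard way to compute them, using the recursive formula that removes one covariate at a time. The analyses reported here were run in SPSS; this Python version, including its function names, is an illustrative assumption rather than the analysis code.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, covariates):
    """Correlation of x and y controlling for a list of covariate columns."""
    if not covariates:
        return pearson(x, y)
    *rest, z = covariates
    # Each term is itself a partial correlation controlling for `rest`.
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

In this study the covariate list would hold body temperature, movement, and ID status, so that an association between two EDA metrics is not simply a by-product of shared dependence on those variables.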
Results
Bivariate correlations between child age, gender, ID status, and Ledalab EDA metrics were all nonsignificant (ps > .05, r range .016 - .208). Intellectual disability status was significantly associated with NeuroKit2 frequency of peaks (r = .268, p = .042) such that the presence of a co-occurring intellectual disability was associated with more frequent peaks. Child age, gender, and ID status were not associated with the NeuroKit2 mean amplitude (r ≤ .221, p ≥ .102) or standard deviation of SCL (r ≤ .189, p ≥ .163). No EDA metric was significantly correlated with parent-reported autistic traits (ps > .05). Subsequent analyses use body temperature, body movement (see Braithwaite et al., 2015), and ID status as covariates.
Partial correlations between and within Ledalab and NeuroKit2 as well as descriptive statistics for each EDA metric can be found in Table 2. Of note is the pattern of relatively high correlations in corresponding metrics between the two packages. For instance, the frequency of peaks in NeuroKit2 and the same metric in Ledalab are highly correlated (r = .721, p < .001). In contrast, the pattern of correlations amongst metrics within a program was less consistent. The frequency of peaks and the average amplitude of peaks were not associated in NeuroKit2 (r = −.015, p > .05) and were weakly, negatively, and not significantly associated in Ledalab (r = −.23, p > .05).
Table 2.
Intercorrelations amongst the six EDA metrics from NeuroKit2 and Ledalab
| | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1. Frequency NK | -- | | | | | |
| 2. Mean NK | −.015 | -- | | | | |
| 3. SD NK | −.299* | .711*** | -- | | | |
| 4. Frequency LL | .724*** | .086 | −.104 | -- | | |
| 5. Mean LL | −.315** | .576*** | .567*** | −.230 | -- | |
| 6. SD LL | −.318** | .503*** | .477*** | −.338** | .965*** | -- |
| Mean (SD) | 51.28 (42.862) | .793 (.464) | .538 (.236) | 169.03 (122.09) | 1.598 (1.520) | .978 (.839) |
| Min-Max | 0-184 | .094-2.149 | .076-.989 | 13-545 | .103-9.248 | .065-3.852 |
Note. NK = NeuroKit2; LL = Ledalab; SD = standard deviation. * p < .05; ** p < .01; *** p < .001
To further explore intercorrelations among EDA metrics between and within each program, we used a median split to divide the sample into “high” and “low” groups for each metric. Between packages, those who were identified as “high” or “low” in NeuroKit2 were often also similarly identified as such by Ledalab (80% of participants had “matching” group categorization for frequency, 83.3% for mean amplitude, and 71.7% for SD). Within packages, however, the metrics provided less consistent groupings. In NeuroKit2, only 33.3% of the sample was identified as “low” or “high” across all three metrics. In Ledalab, 45% of the sample was identified as “low” or “high” across all three of the metrics. The same patterns of results were found when creating groups using +/− 1 standard deviation instead of the median.
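The median-split comparison can be expressed compactly. The following Python sketch is a hypothetical illustration of the between-package agreement calculation; the function names and the handling of values exactly at the median (labeled “low” here) are assumptions.

```python
from statistics import median


def median_split(values):
    """Label each value 'high' if above the sample median, otherwise 'low'."""
    cut = median(values)
    return ["high" if v > cut else "low" for v in values]


def percent_agreement(values_a, values_b):
    """Percentage of participants given the same label by both programs."""
    labels_a = median_split(values_a)
    labels_b = median_split(values_b)
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100 * matches / len(labels_a)
```

Each list holds one value per participant, in the same participant order, for the same metric as computed by each program. Because each program is split at its own median, agreement reflects rank ordering rather than raw values, which is why it can be high even when the programs report very different peak counts.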
We used paired-sample t-tests to determine if mean differences exist between programs for each metric. Results indicated that NeuroKit2 (M = 51.28, SD = 42.86) identified significantly fewer peaks compared to Ledalab (M = 169.03, SD = 122.09) [t(59) = −8.896, p < .001] and that the average amplitude of those peaks was lower in NeuroKit2 (M = .794, SD = .463) compared to Ledalab (M = 1.467, SD = 1.161) [t(57) = −5.318, p < .001]. The standard deviation of the SCL was also lower according to NeuroKit2 (M = .538, SD = .236) compared to Ledalab (M = .974, SD = .848) [t(57) = −4.368, p < .001].
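For reference, the paired-sample t statistic is the mean of the within-participant differences divided by its standard error, with n − 1 degrees of freedom. A minimal Python sketch, with a hypothetical function name and not the SPSS procedure used here, is:

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(values_a, values_b):
    """Return (t statistic, degrees of freedom) for paired observations."""
    diffs = [a - b for a, b in zip(values_a, values_b)]
    n = len(diffs)
    # Standard error of the mean difference uses the sample SD of the differences.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1
```

With NeuroKit2 values passed first and Ledalab values second, a negative t indicates lower NeuroKit2 values, matching the sign of the statistics reported above.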
Our second aim was to explore which, if any, EDA metrics were correlated with the child’s observed behavior during the play interaction. Descriptive statistics for each observational code and partial correlations between EDA and coded child and dyadic level behaviors can be found in Table 3. The general pattern of results indicates that more frequent peaks in EDA are positively associated with child responsiveness to mothers and their level of autonomy during the task and negatively associated with the child’s negative mood and displays of emotion dysregulation. In contrast, the average amplitude of peaks is positively associated with child negative mood and dyadic conflict and negatively associated with child responsiveness to mothers.
Table 3.
Descriptive statistics for observed behaviors and partial correlations between EDA metrics and observed child and dyadic behaviors
| | Min-Max | Mean (SD) | NK Frequency | NK Mean | NK SD | LL Frequency | LL Mean | LL SD |
|---|---|---|---|---|---|---|---|---|
| Positive Mood | 1-7 | 2.52 (1.378) | .140 | −.053 | .086 | .027 | −.121 | −.184 |
| Negative Mood | 1-6 | 1.48 (.850) | −.249† | .324* | .205 | .016 | .120 | .189 |
| Responsiveness | 2-7 | 5.22 (1.332) | .272† | −.130 | −.265† | .198 | −.311* | −.197 |
| Initiation | 1-7 | 3.30 (1.39) | .027 | .107 | .010 | −.094 | −.146 | −.069 |
| Autonomy | 1-7 | 4.10 (2.098) | .258† | .175 | .034 | .276* | −.190 | −.131 |
| Noncompliance | 1-7 | 1.70 (1.228) | −.122 | .043 | .088 | −.141 | .050 | .002 |
| Emotion Dysregulation | 1-4 | 1.29 (.701) | −.279* | .136 | .096 | −.154 | .038 | .005 |
| Dyadic Reciprocity | 1-7 | 2.94 (1.409) | .203 | .202 | .033 | .041 | −.152 | −.141 |
| Dyadic Cooperation | 1-7 | 5.02 (1.994) | .216 | .022 | −.056 | −.054 | −.062 | −.092 |
| Dyadic Conflict | 1-5 | 1.65 (.930) | −.100 | .304* | .140 | −.029 | .229 | .223 |
Note. NK = NeuroKit2; LL = Ledalab; SD = standard deviation. * p < .05; † p < .10
Discussion
Calls for research examining biomarkers of under-the-skin processes, especially in populations for whom self-report or recognition of arousal states may not be easily captured, continue to grow. The purpose of this study was to describe the EDA pre-processing and metric calculation steps in Ledalab and NeuroKit2 and to compare these metrics using a semi-structured goal-based interaction in a sample of 60 autistic children. Our focus was on the three most commonly reported metrics of EDA (frequency of peaks, amplitude of peaks, and standard deviation of SCL).
Results indicated strong correlations between the frequency of peaks and peak amplitude metrics between the two programs (Ledalab and NeuroKit2). These findings suggest that using these different programs to process EDA data should yield similar patterns when analyzing the relation between EDA and outcomes of interest (e.g., child behavior). However, there were notable differences between the programs in actual peak values. NeuroKit2 identified significantly fewer peaks compared to Ledalab and those peaks also had a lower average amplitude. Thus, the comparison of peak frequency and amplitude raw values across programs is not meaningful.
In contrast, the correlations of EDA metrics within each program were less robust, indicating that these metrics may not be interchangeable and may be proxies for different neurophysiological processes. Metrics within programs did a poor job of classifying EDA as “high” or “low” based on sample medians and standard deviations and correlations with observed child behavior differed by metric. This is not necessarily unexpected, nor is it problematic per se. Rather, it suggests that each metric is a different variable measuring a unique piece of sympathetic physiology. The challenge is understanding which metric to use in which circumstance to capture which physiological response and for researchers to agree to such criteria to improve comparability, generalizability, and replicability. The evidence presented here, which requires replication, implies that researchers cannot directly compare results based on one of these indices to results from other studies that use different indices of EDA.
A higher number of peaks was associated with observations of more autonomy during the interaction, fewer instances of negative mood, and less emotion dysregulation by the autistic child. We also found that a higher amplitude of peaks was associated with higher levels of child negative mood and more conflict within the dyad. Thus, it may be beneficial for autistic children to have frequent responses, as this shows engagement and sensitivity to the interaction partner, but low-level EDA responses (i.e., not overly emotionally aroused) during social interactions. This is in line with general research suggesting that autonomic arousal (i.e., EDA) facilitates behavioral responses to salient environmental information, particularly social information (Critchley, 2002; Sequeira et al., 2009). Others have found autistic individuals’ autonomic arousal in response to social stimuli to be similar to that of neurotypical individuals (e.g., Louwerse et al., 2014), suggesting the “facilitator theory” of autonomic arousal in the context of socially relevant information may also be true in autistic people. It should be noted, however, that the literature on differences in autonomic arousal in autistic and non-autistic individuals varies methodologically and analytically, impeding our ability to draw conclusions (Lydon et al., 2014).
The current results also mirror evidence from previous autism studies. For instance, similar to our findings that fewer peaks (i.e., EDA hypo-arousal) were associated with more emotion dysregulation and more negative mood during the interaction task, Baker and colleagues (2018) found evidence that low EDA arousal during a parent-child compliance task was associated with higher child externalizing symptoms. It could be that children with fewer autism symptoms were coded as more autonomous, displayed less negative mood, and less emotion dysregulation; however, given that no metric of EDA arousal was significantly associated with parent-reported autism symptoms, it is unlikely that autism symptoms alone can explain the pattern of correlations between EDA and observed child behavior. Fenning and colleagues (2017) suggest that accounting for both frequency and intensity of EDA response is important in capturing individual differences in autonomic responses in autism specifically and the results of this study reinforce their findings.
The inconsistencies we observed within EDA preprocessing programs have likely contributed to the mixed results in the literature. It is unclear whether the discrepancy between metrics within programs reflects an issue with data processing in the program itself, suggests possible evidence that each of these commonly used EDA metrics may be tapping into different components of sympathetic reactivity, or simply indicates that these gross metrics are not characterizing the data well. Regardless of the underlying mechanism, this exemplifies a larger issue in the field – interpretation of results and implementation of EDA research findings in autism are hampered by our inability to compare data across studies, which is at least partly due to the way we handle this data. Moving forward, it will be important to focus on more fluid within-person EDA changes in response to interactional partners or environmental changes rather than oversimplified single gross metrics such as frequency and amplitude of peaks. In other words, identifying whether or not autistic children have hypo- or hyperresponsive sympathetic nervous system reactivity via average EDA values may be less useful than identifying profiles of EDA activity and/or how EDA changes in response to parental prompts or changes in their environment (Parma et al., 2021; Schoen et al., 2008). It will also be important to add context to EDA data. Rather than examining the frequency of peaks in isolation, for instance, we may need to align the occurrence of peaks in time with interactional partner responses. Theoretically, if high-frequency/low-intensity EDA responses are optimal for social interaction, as evidenced in the current study, then the types of social behavior that elicit such responses will likely be different for each child and each caregiver-child interaction.
This may influence the quality of parenting interventions and has implications for behavioral interventions that target child emotional regulation as a main outcome. The opposite is also useful – understanding the contexts that elicit high-frequency/high-intensity responses can also help parents, clinicians, and educators prevent dysregulated or maladaptive behaviors by creating a more optimal environment in which autistic children can thrive. Larger studies with more diverse samples are needed to test this hypothesis and replicate these findings.
The potential for autonomic arousal via EDA measurement to inform intervention development is complicated by the well-known heterogeneity in autism presentations across the spectrum. It is unlikely, given the heterogeneous autistic experience, that there will be a single EDA indicator that can act as a reliable and valid biomarker in autism. It is also unlikely that research will identify a “one-size-fits-all” (or even one-size-fits-most) context or environment that can be targeted in an intervention associated with optimal EDA arousal in all autistic children. Indeed, the ways in which EDA informs intervention development may entirely depend upon this heterogeneity. We would argue, however, that the “one-size-fits-all” approach should not be the goal. Rather, future EDA research should incorporate autistic heterogeneity into the design and analytic approach as a way to increase the external validity of research findings. As others have suggested (e.g., Hobson & Petty, 2021), restricting the diversity of autistic presentation in research samples only perpetuates bias and further limits what we can learn about the autistic experience. We agree that the inherent diversity found in autism represents a challenge for scientists, including and maybe especially for those working with neurophysiological data like EDA; however, by ignoring the inherent heterogeneity in this population we send a message that certain types of autism are more valid than others. Moving forward, EDA research in autism should embrace a “heterogeneity framework” to guide research design, implementation, and interpretation of results (Georgiades, Szatmari, & Boyle, 2013).
The results of the current study should be considered in light of its limitations. First and foremost, this is a cross-sectional study using a small sample of mostly male autistic children from primarily White, middle-income families. Although age was not correlated with EDA, it should also be noted that the wide age range of our sample could be impacting results. Second, the observational task in which the EDA data was collected was designed to be challenging but it was not a “stress task”. Rather than elicit a stress response, we wanted to examine sympathetic nervous activity in a more naturalistic context. It may be that more specific stress induction would produce different results. Observational coding is also not without its challenges and although we used well-validated, global coding systems, these measures were not designed for use in autistic samples. It can also be difficult to obtain a clean EDA signal when working with autistic children and this may impact the type of child that is able to participate in this type of research. This was a driving factor in our decision to use the Empatica watches given that there are no wires, no stickers, and no gel associated with its use; however, there were children who were unwilling to wear the device.
EDA holds promise as a biomarker for understanding “under-the-skin” processes in autism, but challenges in processing and interpreting EDA data remain. This study is in line with other research demonstrating that individual differences in autistic children’s sympathetic arousal, as measured by EDA, are the norm (e.g., Baker et al., 2018; Fenning et al., 2017). We also provide evidence that two different data processing programs, NeuroKit2 and Ledalab, are correlated with one another but different EDA metrics within each program are not. An individual differences approach would help make sense of these differences while also pushing the field away from comparative studies and group averages, which can lead to overgeneralization and difficulties comparing across studies. Large-scale replication efforts are needed to advance our understanding of how sympathetic arousal can be used to inform intervention.
Acknowledgments
The research reported in this publication was approved by the Institutional Review Board at the University of Wisconsin-Madison and all participants provided informed consent and permission for their child to participate. This research was supported by the Eunice Kennedy Shriver National Institute of Child Health and Human Development of the National Institutes of Health [T32HD007489; U54 HD090256], National Institute of Mental Health [R01MH099190 to S. Hartley], the National Institute on Deafness and Other Communication Disorders [F31 DC0108716], and the University of Wisconsin-Madison. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Appendix A: Code for Processing EDA Data
## Analysis Pipeline 1: Ledalab
Ledalab is a free and open-source Matlab-based program used for analysis of skin conductance data.
The software is documented at and can be obtained from [ledalab.de](http://ledalab.de). Installation consists of downloading the Ledalab files into a directory and adding that directory to the Matlab path.
Ledalab provides two methods to decompose the raw data into tonic and phasic components:
Continuous Decomposition Analysis (CDA) and Discrete Decomposition Analysis (DDA). Comments by the primary software developer in the support forum suggested that DDA would be unlikely to work well with data sampled at a low rate (below 10 Hz), so CDA was used here.
Ledalab has both an interactive mode and a batch mode. The interactive mode is useful for
visual inspection of the raw data and provides tools for manually replacing artifacts
with interpolated data points. In the case of this dataset, artifacts
could not be unambiguously identified, so it was decided not to use those tools.
The analysis was conducted in three phases:
1. Create data files in a format that Ledalab could read.
A Python script was used to generate one data file for each line of data in the original data source. Files were created in the "text 1" format described in the Ledalab documentation, where each line includes three values (time, EDA response, and optionally, an event marker) separated by tabs.
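The file-generation step might look something like the sketch below. It is not the script used in the study: the function name `write_ledalab_text`, the constant event marker of 0, and the number formatting are assumptions made for illustration, and the 4 Hz sampling rate is the nominal EDA rate of the Empatica E4.

```python
import io


def write_ledalab_text(eda_values, out, sampling_rate=4.0, marker=0):
    """Write one record as tab-separated lines: time, EDA value, event marker.

    `out` is any writable text stream (an open file or io.StringIO).
    The constant marker value is an illustrative assumption.
    """
    for i, value in enumerate(eda_values):
        t = i / sampling_rate  # time in seconds since the start of the record
        out.write(f"{t:.3f}\t{value:.6f}\t{marker}\n")


# Example: write three samples to an in-memory buffer instead of a file.
buffer = io.StringIO()
write_ledalab_text([0.12, 0.13, 0.15], buffer)
ledalab_lines = buffer.getvalue().splitlines()
```

In practice, `out` would be a file opened in the data directory, with one file per participant record.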
2. Run Ledalab in batch mode to generate a set of output files.
A single command at the Matlab command prompt will process files in a
specified directory. The specific command used was:
```
>> Ledalab(data_dir, 'open', 'text', 'analyze', 'CDA', 'export_scrlist', [0.03 2 zscore])
```
where `data_dir` is the path to the directory where the data files are stored. The 'open', 'text'
parameters specify that the data files are in "text 1" format, described above. The 'analyze', 'CDA'
parameters call for CDA analysis. The 'export_scrlist', [0.03 2 zscore] parameters specify that the
SCR peaks identified by CDA must have an amplitude of at least 0.03 microSiemens, should
be exported as text files (2), and should be written with peak amplitudes in units of microSiemens or z-scores,
depending on the value of `zscore`. The z-score parameter is nonstandard, relying on
a modified version of leda_batchanalysis.m, described later in this document.
Output files are written to the current Matlab working directory, so if the Ledalab files are located in *ledalab_dir*, input data files are located in *data_dir*, and output should be written to *output_dir*, then the Matlab commands should be something like:
```
>> addpath ledalab_dir
>> cd output_dir
>> Ledalab(data_dir, 'open', 'text', 'analyze', 'CDA', 'export_scrlist', [0.03 2 1])
```
### Ledalab modification: Generating output as z-score
By default, Ledalab does not support output formatted as z-scores when running
in batch mode. A minor modification was made to leda_batchanalysis.m which
adds a third parameter to the 'export_scrlist' option.
Four lines, beginning "% Add option for exporting scrlist…", were inserted in leda_batchanalysis.m to enable the new output format.
In context, the modified code is:
```matlab
%Export Scrlist
if any(args.export_scrlist)
    leda2.set.export.SCRmin = args.export_scrlist(1);
    if length(args.export_scrlist) > 1
        leda2.set.export.savetype = args.export_scrlist(2);
        % Add option for exporting scrlist as z-scores
        if length(args.export_scrlist) > 2
            leda2.set.export.zscale = args.export_scrlist(3);
        end
    else
        leda2.set.export.savetype = 1;
    end
    export_scrlist;
end
```
3. Process the Ledalab output files to generate the measures of interest.
The output files produced by running Ledalab in batch mode have names based on the source data files, but with an appended "_scrlist" or "_scrlist_z", depending on whether z-score output is requested. Each line of the output files includes an onset time and amplitude for one identified peak. These files were processed using a small Python script, which, for each Ledalab output file, computes the number of peaks, the mean
peak value, and the standard deviation of the peak values.
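A minimal sketch of this post-processing step is shown below. It is an illustration rather than the script used in the study. It assumes that each data line holds an onset time followed by an amplitude, separated by whitespace, and that any line that does not parse as numbers (such as a header) can be skipped. The function name `summarize_scrlist`, the header text in the example, and the use of the sample standard deviation are assumptions.

```python
import statistics


def summarize_scrlist(lines):
    """Return (n_peaks, mean_amplitude, sd_amplitude) for one output file.

    Lines that do not contain two parseable numbers are ignored.
    """
    amplitudes = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            float(fields[0])  # onset time, parsed only to validate the line
            amplitudes.append(float(fields[1]))
        except ValueError:
            continue  # header or malformed line
    n = len(amplitudes)
    mean = statistics.mean(amplitudes) if n else float("nan")
    # The sample standard deviation needs at least two peaks.
    sd = statistics.stdev(amplitudes) if n > 1 else float("nan")
    return n, mean, sd


# Hypothetical file contents: a header line followed by three peaks.
example = ["SCR-Onset\tSCR-Amplitude", "1.25\t0.10", "4.50\t0.30", "9.75\t0.20"]
n_peaks, mean_amp, sd_amp = summarize_scrlist(example)
```

Passing an open file object as `lines` would process a real output file in the same way.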
## Analysis Pipeline 2: NeuroKit2
NeuroKit2 is a Python package designed for processing neurophysiological signals, including EDA data. It can be installed from PyPI and is documented at [neurokit2.readthedocs.io](https://neurokit2.readthedocs.io).
```
pip install neurokit2
```
The analysis steps are, roughly:
1. Characterize the raw data, identifying any invalid or suspect data points.
2. Decompose the raw signal into tonic and phasic components.
3. Find the peaks in the phasic component.
4. Filter the peaks to exclude artifacts or outliers.
5. Compute measures of interest based on the retained peaks.
Of these, steps 2 and 3 used functions included with NeuroKit2: `eda_phasic` and `eda_peaks`.
### 1. Characterize the raw data
The overall mean and standard deviation of the raw signal are computed. These values are not used in subsequent analysis.
Data values less than **0.01 μS** are considered invalid, and are set to a value of 0. If the data are standardized, these below-threshold values are excluded from the calculations, as discussed below.
The number of data values less than 0.5 μS is reported, as there are suggestions in the literature [Doberenz et al. 2011] that values in that range should be excluded. However, in this data set, that threshold seems inappropriate, as most (in many cases, all) of the data points in most records were below that threshold.
It was speculated that sudden, large changes in the signal might be artifacts, so the number and location of "jumps" -- changes of 0.5 μS between successive data points -- were identified. These jump locations are used to exclude artifacts later in the analysis.
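The characterization rules above can be sketched in a few lines. This is an illustration under stated assumptions, not the study code: the function name `characterize_raw` is hypothetical, jumps are detected on the raw (not the zeroed) signal, and a jump is recorded at the index of the sample that follows the sudden change.

```python
def characterize_raw(signal, invalid_below=0.01, low_below=0.5, jump_size=0.5):
    """Zero out invalid samples, count low samples, and locate jumps.

    Thresholds are in microsiemens and follow the values given in the text.
    Returns (cleaned_signal, n_low, jump_indices).
    """
    # Values below the validity threshold are set to 0.
    cleaned = [x if x >= invalid_below else 0.0 for x in signal]
    # Count (but do not remove) samples below the 0.5 microsiemens threshold.
    n_low = sum(1 for x in signal if x < low_below)
    # A jump is a change of at least `jump_size` between successive samples.
    jump_indices = [
        i
        for i in range(1, len(signal))
        if abs(signal[i] - signal[i - 1]) >= jump_size
    ]
    return cleaned, n_low, jump_indices


raw = [0.005, 0.20, 0.25, 0.90, 0.95, 0.30]
cleaned, n_low, jumps = characterize_raw(raw)
```

Dividing each jump index by the sampling rate gives the jump time in seconds, which is the form needed when excluding nearby peaks.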
### Optionally, standardize the data
The data were found to be highly variable between subjects, so it was reasoned that standardizing the data would allow more consistent applications of criteria for peak inclusion. Since very low values (< 0.01 μS) are believed to be invalid, they were excluded from the standardization calculation. The signal was converted to z-scores using:
```python
import numpy as np


def standardize_z(signal, exclude_zero=True):
    """Convert a signal to z-scores, optionally excluding zero values from calculations.

    In this application, the signal is known to be non-negative.
    """
    if exclude_zero:
        # Invalid samples were set to 0 earlier, so leave them out of the statistics.
        nonzeros = signal > 0
        mu = np.mean(signal[nonzeros])
        sigma = np.std(signal[nonzeros])
    else:
        mu = np.mean(signal)
        sigma = np.std(signal)
    return (signal - mu) / sigma
```
### 2. Decompose the raw signal into tonic and phasic components
The NeuroKit2 function `eda_phasic` was used to decompose the signal into tonic and phasic components. That function uses a Butterworth filter with a cutoff of 0.05 Hz for the decomposition.
```python
import neurokit2 as nk

eda_decomposed = nk.eda_phasic(signal, sampling_rate=sampling_rate)
```
### 3. Find the peaks in the phasic component
The NeuroKit2 function `eda_peaks` was used to find the peaks in the phasic component of the signal. The default 'neurokit' method was used for the peak finding algorithm. Peaks with amplitude less than 10% of the maximum amplitude in the signal were excluded (controlled by the `amplitude_min` parameter).
```python
peak_signal, info = nk.eda_peaks(
    eda_decomposed["EDA_Phasic"].values,
    sampling_rate=sampling_rate,
    method="neurokit",  # the default peak-finding method
    amplitude_min=0.1,  # exclude peaks below 10% of the maximum amplitude
)
```
### 4. Filter the peaks to exclude artifacts or outliers
Based on the supposition that "jumps" of at least 0.5 μS are artifacts, peaks that occurred within 0.5 s (2 samples) of a jump were marked for exclusion.
It was suggested in the literature (Dawson et al. [2000, 2007]) that peaks with amplitude greater than **1.0 μS** should be excluded, so when using non-standardized data, that threshold was used to exclude peaks. For standardized data, a threshold of **3 σ** was used.
Finally, [Hernandez et al. 2014] suggest that the minimum distance between NS-SCRs should be 1 second. Accordingly, the lower-amplitude peak in each pair of peaks separated by less than 1 s was excluded. This step occurred only after peaks near jumps or of too-high amplitude had been excluded.
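The three exclusion rules, applied in the order described, might be sketched as follows. This is one possible reading of the procedure rather than the study code: the function name `filter_peaks` is hypothetical, peaks are represented as (time in seconds, amplitude) pairs, and close pairs are resolved greedily from earliest to latest.

```python
def filter_peaks(peaks, jump_times, max_amplitude=1.0, jump_window=0.5, min_gap=1.0):
    """Apply the exclusion rules to a list of (time_s, amplitude) peaks.

    1. Drop peaks within `jump_window` seconds of any jump.
    2. Drop peaks with amplitude above `max_amplitude`.
    3. Among surviving peaks closer than `min_gap` seconds, keep the larger one.
    """
    kept = [
        (t, a)
        for t, a in sorted(peaks)
        if all(abs(t - j) > jump_window for j in jump_times) and a <= max_amplitude
    ]
    result = []
    for t, a in kept:
        if result and t - result[-1][0] < min_gap:
            # Too close to the previously retained peak: keep the higher one.
            if a > result[-1][1]:
                result[-1] = (t, a)
        else:
            result.append((t, a))
    return result


peaks_in = [(2.0, 0.20), (2.5, 0.35), (6.0, 1.40),
            (10.0, 0.15), (10.25, 0.10), (15.0, 0.25)]
filtered = filter_peaks(peaks_in, jump_times=[14.75])
```

For standardized data, the same function would be called with `max_amplitude=3.0` to apply the 3 σ threshold.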
### 5. Compute measures of interest
The frequency, mean, and standard deviation of the remaining peaks were then computed.
A final reported measure, notated as freq03, is the number of peaks with an amplitude of at least 0.03 (units of μS for non-standardized data, σ for standardized data) and a rise time of no more than 3 seconds. Rise time is defined as the time from the previous trough in the signal to the peak.
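As an illustration, the summary measures could be computed along the following lines. This is a sketch under stated assumptions, not the study code: the function name `summarize_peaks` is hypothetical, peaks are given as (amplitude, rise time in seconds) pairs, the 0.03 amplitude criterion for freq03 is read as a minimum, frequency is reported both as a count and as peaks per minute, and the sample standard deviation is used.

```python
import statistics


def summarize_peaks(peaks, duration_s, min_amplitude=0.03, max_rise_time=3.0):
    """Summarize retained peaks given as (amplitude, rise_time_s) pairs."""
    amplitudes = [a for a, _ in peaks]
    n = len(amplitudes)
    return {
        "n_peaks": n,
        "peaks_per_min": n / (duration_s / 60.0),
        "mean_amp": statistics.mean(amplitudes) if n else float("nan"),
        # The sample standard deviation needs at least two peaks.
        "sd_amp": statistics.stdev(amplitudes) if n > 1 else float("nan"),
        # freq03: peaks that are large enough and rise quickly enough.
        "freq03": sum(
            1 for a, rise in peaks if a >= min_amplitude and rise <= max_rise_time
        ),
    }


# Four hypothetical peaks from a seven-minute (420 s) recording.
retained = [(0.02, 1.0), (0.10, 2.5), (0.30, 4.0), (0.20, 1.5)]
measures = summarize_peaks(retained, duration_s=420)
```

In this example, the first peak fails the amplitude criterion and the third fails the rise-time criterion, so only two of the four peaks count toward freq03.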
Footnotes
We have no conflicts of interest to disclose.
It should be noted that guidelines for processing EDA data do exist (see Braithwaite et al., 2015; Kleckner et al., 2018); however, the published guidelines are not consistent and it is unclear which to follow or in what contexts different rules apply.
References
- Achenbach TM, & Edelbrock C (1991). Child behavior checklist. Burlington (VT), 7, 371–392.
- Airij AG, Sudirman R, Sheikh UU, Khuan LY, & Zakaria NA (2020). Significance of electrodermal activity response in children with autism spectrum disorder. Indonesian Journal of Electrical Engineering and Computer Science, 19, 1113–1120.
- Baker JK, Fenning RM, Erath SA, Baucom BR, Moffitt J, & Howland MA (2018). Sympathetic under-arousal and externalizing behavior problems in children with autism spectrum disorder. Journal of Abnormal Child Psychology, 46, 895–906.
- Barbier A, Chen JH, & Huizinga JD (2022). Autism spectrum disorder in children is not associated with abnormal autonomic nervous system function: hypothesis and theory. Frontiers in Psychiatry, 13.
- Belsky J, Crnic K, & Woodworth S (1995). Personality and parenting: Exploring the mediating role of transient mood and daily hassles. Journal of Personality, 63(4), 905–929.
- Benedek M, & Kaernbach C (2010). A continuous measure of phasic electrodermal activity. Journal of Neuroscience Methods, 190(1), 80–91.
- Blacher J, Baker BL, & Kaladjian A (2013). Syndrome specificity and mother-child interactions: Examining positive and negative parenting across contexts and time. Journal of Autism and Developmental Disorders, 43, 761–774.
- Boucsein W. (2012). Electrodermal activity. Springer Science & Business Media.
- Braithwaite JJ, Watson DG, Jones R, & Rowe M (2013). A guide for analysing electrodermal activity (EDA) & skin conductance responses (SCRs) for psychological experiments. Psychophysiology, 49(1), 1017–1034.
- Constantino JN, & Gruber CP (2012). Social responsiveness scale: SRS-2 (p. 106). Torrance, CA: Western psychological services.
- Critchley HD (2002). Electrodermal responses: what happens in the brain. The Neuroscientist, 8(2), 132–142.
- Dawson ME, Schell AM, & Filion DL (2017). The electrodermal system. In Cacioppo JT, Tassinary LG, & Berntson GG (Eds.), Handbook of psychophysiology (pp. 217–243). Cambridge University Press.
- Deater-Deckard K, Pylas M, & Petrill SA (1997). Parent-child interaction coding system. London, UK: Institute of Psychiatry.
- DuBois D, Ameis SH, Lai MC, Casanova MF, & Desarkar P (2016). Interoception in autism spectrum disorder: A review. International Journal of Developmental Neuroscience, 52, 104–111.
- Fenning RM, Baker JK, Baucom BR, Erath SA, Howland MA, & Moffitt J (2017). Electrodermal variability and symptom severity in children with autism spectrum disorder. Journal of Autism and Developmental Disorders, 47, 1062–1072.
- Ferguson BJ, Hamlin T, Lantz JF, Villavicencio T, Coles J, & Beversdorf DQ (2019). Examining the association between electrodermal activity and problem behavior in severe autism spectrum disorder: A feasibility study. Frontiers in Psychiatry, 10, 654.
- Georgiades S, Szatmari P, & Boyle M (2013). Importance of studying heterogeneity in autism. Neuropsychiatry, 3(2), 123.
- Hobson H, & Petty S (2021). Moving forwards not backwards: heterogeneity in autism spectrum disorders. Molecular Psychiatry, 26(12), 7100–7101.
- Hoffman C, Crnic KA, & Baker JK (2006). Maternal depression and parenting: Implications for children’s emergent emotion regulation and behavioral functioning. Parenting: Science and Practice, 6, 271–295.
- Kleckner IR, Jones RM, Wilder-Smith O, Wormwood JB, Akcakaya M, … & Goodwin MS (2018). Simple, transparent, and flexible automated quality assessment procedures for ambulatory electrodermal activity data. IEEE Transactions on Biomedical Engineering, 65, 1460–1467.
- Kushki A, Drumm E, Pla Mobarak M, Tanel N, Dupuis A, Chau T, & Anagnostou E (2013). Investigating the autonomic nervous system response to anxiety in children with autism spectrum disorders. PLoS One, 8(4), e59730.
- Lord C, Rutter M, DiLavore P, Risi S, Gotham K, & Bishop S (2012). Autism Diagnostic Observation Schedule (ADOS-2) Modules 1-4.
- Louwerse A, Tulen JH, van der Geest JN, van der Ende J, Verhulst FC, & Greaves-Lord K (2014). Autonomic responses to social and nonsocial pictures in adolescents with autism spectrum disorder. Autism Research, 7(1), 17–27.
- Lydon S, Healy O, Reed P, Mulhern T, Hughes BM, & Goodwin MS (2016). A systematic review of physiological reactivity to stimuli in autism. Developmental Neurorehabilitation, 19(6), 335–355.
- Makowski D, Pham T, Lau ZJ, Brammer JC, Lespinasse F, Pham H, … & Chen SA (2021). NeuroKit2: A Python toolbox for neurophysiological signal processing. Behavior Research Methods, 1–8.
- Mazefsky CA, Kao J, & Oswald D (2011). Preliminary evidence suggesting caution in the use of psychiatric self-report measures with adolescents with high-functioning autism spectrum disorders. Research in Autism Spectrum Disorders, 5(1), 164–174.
- McCarthy C, Pradhan N, Redpath C, & Adler A (2016). Validation of the Empatica E4 wristband. IEEE EMBS International Student Conference, 1–4.
- Panju S, Brian J, Dupuis A, Anagnostou E, & Kushki A (2015). Atypical sympathetic arousal in children with autism spectrum disorder and its association with anxiety symptomatology. Molecular Autism, 6(1), 1–10.
- Parma V, Cellini N, Guy L, McVey AJ, Rump K, Worley J, … & Herrington J (2021). Profiles of autonomic activity in autism spectrum disorder with and without anxiety. Journal of Autism and Developmental Disorders, 1–12.
- Posada-Quintero HF, & Chon KH (2020). Innovations in electrodermal activity data collection and signal processing: A systematic review. Sensors, 20(2), 479.
- Prince EB, Kim ES, Wall CA, Gisin E, Goodwin MS, Simmons ES, … & Shic F (2017). The relationship between autism symptoms and arousal level in toddlers with autism spectrum disorder, as measured by electrodermal activity. Autism, 21(4), 504–508.
- Schoen SA, Miller LJ, Brett-Green B, & Hepburn SL (2008). Psychophysiology of children with autism spectrum disorder. Research in Autism Spectrum Disorders, 2(3), 417–429.
- Sequeira H, Hot P, Silvert L, & Delplanque S (2009). Electrical autonomic correlates of emotion. International Journal of Psychophysiology, 71(1), 50–56.
- Vernetti A, Shic F, Boccanfuso L, Macari S, Kane-Grade F, Milgramm A, … & Chawarska K (2020). Atypical emotional electrodermal activity in toddlers with autism spectrum disorder. Autism Research, 13(9), 1476–1488.
- Visnovcova Z, Ferencova N, Grendar M, Ondrejka I, Olexova LB, Bujnakova I, & Tonhajzerova I (2022). Electrodermal activity spectral and nonlinear analysis-potential biomarkers for sympathetic dysregulation in autism. General Physiology & Biophysics, 41(2).
- White SW, Oswald D, Ollendick T, & Scahill L (2009). Anxiety in children and adolescents with autism spectrum disorders. Clinical Psychology Review, 29(3), 216–229.
