Abstract
The aim of this test–retest reliability study was to evaluate the reliability and reactivity of heart rate variability (HRV) and pupillometry metrics under conditions with controlled cognitive stimulation and paced breathing within a virtual reality protocol. After habituation, 30 English-speaking university students completed a four-phase protocol on two occasions separated by 1 week. HRV and pupillometry were continuously measured during the following phases: baseline, cognitive testing, guided breathing with nature immersion, and spontaneous breathing with nature immersion. Strong day-to-day relative reliability was confirmed for both HRV (pooled ICC: 0.75 to 0.83) and pupillometry (pooled ICC: 0.66 to 0.87). HRV metrics of sympathovagal balance in the time, frequency, and non-linear domains showed reactivity with significant differences between all phases. Pupillometry metrics increased progressively from cognitive testing to guided breathing nature immersion to nature immersion, suggesting psychological rather than respiratory influences. Relatively large minimal detectable change values were determined across HRV (22 to 54% deviation from baseline) and pupillometry (33 to 88% deviation from baseline) metrics. Although the relatively large ratio limits of agreement and minimal detectable change values suggest that detecting systematic changes in these metrics over time might be difficult at the individual level, strong relative reliability supports the use of HRV and pupillometry metrics to detect differences in sympathovagal balance between groups. Additionally, the responsiveness of these metrics demonstrates the efficacy of the proposed virtual reality protocol in inducing detectable physiological reactivity across HRV and pupillometry metrics.
Keywords: CONVIRT, Guided breathing, HRV, Minimal detectable change, Stress, Pupil asymmetry
Introduction
There is well-established evidence that imbalanced autonomic nervous system (ANS) function, through overactivation of the neurophysiological stress response (Jarczok et al., 2020; Wyller et al., 2009), is implicated in many chronic health conditions (Eddy et al., 2016). For example, autonomic dysfunction has been associated with increased risk of developing cardiovascular disease (Lupien et al., 2009; Merz et al., 2002), myocardial infarction (Bigger Jr et al., 1992), and neurological conditions such as depression (Burns et al., 2014; Hemmerle et al., 2012; Novakova et al., 2013). Autonomic arousal has also been associated with acute decrements in functioning, including decreased cognitive performance and increased anxiety (Corrone et al., 2021; Levy, 2013; Miu et al., 2009; Wright et al., 2022). Therefore, chronic autonomic arousal can have deleterious effects on both long-term health and optimal function.
In addition to chronic change in autonomic balance, the impact of changes in physiological state is detectable through measurement of acute reactivity, whereby chronically stressed individuals have been shown to exhibit greater physiological reactivity to acute conditions (Castaldo et al., 2015; Pike et al., 1997), or blunted responses possibly related to burnout (Al Abdi et al., 2018; Brugnera et al., 2019; Schiweck et al., 2019; Wekenborg et al., 2019; Xin et al., 2020). Therefore, changes in acute physiological responsiveness, measured by ANS reactivity, provide an opportunity to identify those who are at increased risk of poor health outcomes associated with chronic autonomic arousal (Manser et al., 2021). The first step, however, is understanding precisely what constitutes a ‘change’ in autonomic arousal. To do this, it is important to determine ‘normal’ variation among indices of autonomic activity.
The sympathetic nervous system (SNS) initiates a physiological state of autonomic arousal (Vingerhoets, 1985), which is typically dominant during periods of perceived stress (Wirtz & von Känel, 2017). Responses include cardiac output becoming more stable (Pereira et al., 2017) and a predominant state of pupil dilation (Turnbull et al., 2017). Conversely, the parasympathetic nervous system (PNS) exerts influence during periods of rest and low perceived stress (Vingerhoets, 1985), where cardiac output becomes more varied, and pupil constriction occurs (Kim et al., 2022). A more flexible, adaptable ANS function suggests a healthy balance in the interchange between the SNS and PNS (Pereira et al., 2017) and a high level of resilience to the physical and psychological impact of stressful states (McCraty & Shaffer, 2015). Therefore, physiological indices estimating sympathetic and parasympathetic influence can indicate the physical, emotional, or cognitive state of an individual, particularly in relation to stress (Parnandi & Gutierrez-Osuna, 2013).
Heart rate variability (HRV) is a non-invasive measure of autonomic balance (Malik, 1996; Riganello et al., 2012; Shaffer et al., 2014), where HRV reflects the changes in timed intervals between consecutive heartbeats, also known as R-R intervals (Malik, 1996). Low total HRV can be indicative of SNS predominance, suggesting an overactive stress response (Kim et al., 2018). This state is associated with many poor health conditions including chronic stress (Jarczok et al., 2020; Shaffer et al., 2014), cardiovascular disease (Fang et al., 2020), neurological conditions (Thayer et al., 2012) and mortality (La Rovere et al., 2022). Additionally, higher total HRV has been linked to greater physiological resilience, coping capacity, and adaptability (McCraty & Shaffer, 2015; Shaffer et al., 2014; Thayer et al., 2012). Multiple indices of HRV are used as proxy measures of autonomic arousal. For example, low frequency to high frequency ratio (LF/HF ratio) reflects the distribution of power across different frequency bands, which is used as an indirect measure of sympathovagal balance, or the relational interplay between sympathetic and parasympathetic influences (Malik, 1996; Riganello et al., 2012). Time domain analyses of HRV reflect overall variability and short-term changes in cardiac output, reflecting sympathovagal balance (Sollers et al., 2007). Such measures have been investigated over short-term readings in response to situational changes; for example, social stress has been shown to have an acute impact on LF/HF ratio (Castaldo et al., 2015). Therefore, both chronic changes and the acute reactivity in HRV offer opportunities to measure ANS function that reflect the state of autonomic arousal (Hamilton & Alloy, 2016). Although much of the literature supports the use of HRV as a surrogate measure of autonomic activity, less research has assessed if changes in pupil activity reflect changes in autonomic activity.
The assessment of pupil function has been promoted as an alternative measure of ANS functioning (Reith et al., 2016), with commonly employed metrics including pupil diameter (Emelifeonwu et al., 2018). Increases in average pupil diameter throughout cognitive testing sessions serve as a reliable indicator of increased cognitive load (Mandrick et al., 2016; Wahn et al., 2016). Given that some cognitive tasks can influence autonomic arousal akin to a mild stressor (Horan et al., 2020), assessing pupillometry metrics during a task that manipulates states of demand and relaxation can offer valuable insights into ANS function. Correlations between mean pupil diameter and R-R intervals of HRV have been reported (Kaltsatou et al., 2011), prompting suggestions that HRV measures might be reliably estimated using pupil diameter metrics (Parnandi & Gutierrez-Osuna, 2013).
Reliability of HRV has generally been reported to be high within individuals across short time period measurements (<5 minutes) (Chen et al., 2020; Farah et al., 2016; Sookan & McKune, 2012). Although some authors have reported large random variations and moderate reliability in day-to-day readings of HRV (Dupuy et al., 2012; Sookan & McKune, 2012), the reliability of HRV metrics to detect reactivity across time and what constitutes a meaningful level of change has not been systematically investigated. Determining the minimum detectable change (MDC) is important to ensure that measurement error or natural fluctuations do not lead to inappropriate interpretations. Therefore, quantifying the day-to-day reliability of physiological measures reflecting ANS balance and reactivity is important to determine meaningful change from a typical state, which could identify individuals who are most affected by stressors and at greater risk of experiencing poor health outcomes (Brindle et al., 2014).
Combining HRV and pupillometry methods presents a promising approach for ANS evaluation by possibly capturing a wider scope of ANS function. The interaction between these metrics might enhance our understanding of an individual’s physiological state, potentially unveiling nuances that a single measurement approach might not reveal. However, little data exists to compare the reliability and reactivity of HRV and pupillometry metrics in response to changing controlled conditions. Therefore, the aim of this study was to investigate the day-to-day reliability and reactivity of HRV and pupillometry metrics in university students, under controlled conditions designed to perturbate the autonomic nervous system.
Method
Participants
Forty-one participants provided written informed consent in line with institutional ethics (HEC20077) to participate in this study. Australian university students were approached using a recruitment script and volunteered to participate. Participants were included if they were current students and able to read English. Individuals were excluded if they had a history of loss of consciousness, psychiatric or neurological illness, dementia, or head injury; had current significant health issues (including emotional disorders, or cardiac, mental, or thyroid problems) or physical fragility; used medications that might affect their ability to concentrate; or were presently ill or unwell (cold, flu).
Using the COnsensus-based Standards for the selection of health Measurement Instruments (COSMIN) sample size tool (Mokkink et al., 2019) based on projected good average ICC (r = 0.8) with a 0.3 confidence interval (lower bound r = 0.65), a moderate amount of expected variance in scores and no systematic difference expected between our test–retest measures, a sample size of 30 participants was required to determine the test–retest reliability for this two-way mixed-effects model (absolute definition with average rater type) (Koo & Li, 2016). Additionally, 30 participants were sufficient to determine the reactivity of measures using a within-factor repeated measures ANOVA with a conservative small effect of ρ = 0.17, power set at 0.80, correlation of 0.7, and p = 0.05 (Version 3.1; G*Power, Kiel University, Germany).
Experimental design
All participants repeated the test protocol at three timepoints in the same environmentally controlled (22 °C), quiet room without consuming fluids or food throughout testing. The first timepoint (n = 41) served to habituate participants to the protocol, and data from the second (Trial 1; n = 32) and third (Trial 2; n = 30) timepoints (~1 week apart) were used to assess test–retest reliability. At each timepoint, a respiration band (Respiratory Belt Transducer; AD Instruments, USA) and a 5-lead Holter monitor (Medilog AR; Schiller, USA) were fitted according to manufacturer guidelines. Academic stress was assessed using the student form of the effort–reward imbalance (ERI) questionnaire, consisting of 14 items measuring efforts, rewards, and overcommitment. Respondents rated each item on a four-point Likert scale ranging from 1 (“Strongly Disagree”) to 4 (“Strongly Agree”) (Wege et al., 2017). Psychological affect was measured with the Depression Anxiety Stress Scale-21 (DASS-21), consisting of 21 items, with seven items each measuring levels of depression, anxiety, and stress. Respondents rated each item on a four-point Likert scale ranging from 0 (“Did not apply to me at all”) to 3 (“Applied to me very much or most of the time”) (Lovibond & Lovibond, 1995). Participants were then familiarized with the CONVIRT Virtual Reality (VR) environment (Horan et al., 2020) before completing three additional test phases in the VR environment: a cognitive test battery, 5 min of guided breathing at six breaths·min−1 in a self-selected nature environment, and 5 min of spontaneous breathing in the same nature environment. Recordings were stopped and participants removed the electrocardiogram (ECG) and breathing band before booking their next testing time, which was ~10 weeks after habituation. Trial 1 and Trial 2 were completed at the same time of day in the same laboratory 1 week apart.
Procedures
The ECG monitor (MediLog Holter; MediLog, USA) was applied to the participant’s chest via self-adhesive electrodes, using detailed instructions and a printed visual guide. Subsequently, the respiration band (LabCharts Respiration Monitor; ADInstruments, USA) was attached over clothing across the chest and connected to a data logger (PowerLab 26 Series; ADInstruments, USA) to calibrate the minimum expiration volume (> 0 mV) and maximum inspiration volume (< 100 mV). The HRV recording was then started, and participants completed questionnaires that captured demographic information, academic stress (student form ERI; Wege et al., 2017), and psychological affect (DASS-21; Lovibond & Lovibond, 1995). Participants were then equipped with the VR headset (HTC VIVE Pro Eye Tracking VR Head Mounted Display) and a handheld thumb-press button. After familiarization with the VR environment and eye-tracking calibration, four cognitive tests assessing attention, decision making and working memory were undertaken using the CONVIRT test battery (Horan et al., 2020). The CONVIRT testing environment includes eye-tracking to capture pupil metrics (the HTC VIVE Pro Eye HMD has an eye-tracking accuracy of 0.5–1.1 degrees with a refresh rate of 120 Hz) (VIVE™ Australia, 2018) and provides a first-person perspective that the participant uses to navigate their way through a hospital setting (luminance values from 60.6 to 70.1 cd/m2), where they responded with a thumb press, visual gaze, or both in response to specific commands. Participants were then shown an interface where they selected their preferred calm virtual environment (forest, beach, or lake). Each environment was matched for luminance (70 cd/m2). After selection, pre-recorded verbal instructions on how to complete the guided breathing phase were provided within the environment, alongside an example visual. This visual showed a circle that progressively enlarged for 5 s with paired audio instructions to ‘breathe in’, followed by a progressive constriction of the circle for 5 s paired with audio instructions to ‘breathe out’. Participants followed this pattern for 5 min to elicit a breathing rate of 6 breaths·min−1. The guided visual stimulus was then removed, and participants returned to free breathing in the same VR environment for a further 5 min.
Measurement of HRV
Manufacturer’s software (Darwin V2; Medilog, Schiller AG, Switzerland) was used to process raw ECG data and determine R-to-R intervals. After removing non-sinus R-R intervals, the remaining N-N intervals were imported into a customized program developed in LabView (Version 2019; National Instruments, UK) to determine HRV outcomes in the time, frequency and non-linear domains in 1-min epochs according to standard methods (Malik, 1996). Time domain variables included the standard deviation of the N-N intervals (SDNN), the root mean square of successive differences between N-N intervals (RMSSD), and the SDNN/RMSSD ratio. For frequency domain analyses, data were detrended and windowed using a Hanning window before fast Fourier transformation of 256 samples with 50% overlap to determine the power spectral density in the LF bandwidth (0.04–0.15 Hz) and HF bandwidth (0.15–0.40 Hz) and the LF/HF ratio as previously described (Kingsley et al., 2005). In addition, Poincaré SD1 (length of the transverse line of the Poincaré plot area), Poincaré SD2 (length of the longitudinal line of the Poincaré plot area), and the SD1/SD2 ratio were calculated because Poincaré analyses can be used to detect irregularities that may otherwise be difficult to determine with conventional time and frequency domain variables (Laitio et al., 2000).
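The time domain and Poincaré variables described above can be illustrated with a minimal Python sketch. This is not the LabView program used in the study; the function name `hrv_time_and_poincare` and the example epoch are hypothetical, and the sketch assumes the commonly used Poincaré relations SD1² = ½·Var(ΔNN) and SD2² = 2·SDNN² − SD1². Frequency domain metrics are omitted.

```python
import math
import statistics


def hrv_time_and_poincare(nn_ms):
    """Compute SDNN, RMSSD, SD1, SD2 and their ratios for one epoch.

    nn_ms: sequence of N-N intervals in milliseconds (non-sinus beats
    are assumed to have been removed already).
    """
    if len(nn_ms) < 3:
        raise ValueError("need at least three N-N intervals")

    # Successive differences between adjacent N-N intervals.
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]

    sdnn = statistics.stdev(nn_ms)  # sample SD of the intervals
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))

    # Assumed Poincaré relations (not taken from the study's software).
    sd1 = math.sqrt(statistics.variance(diffs) / 2)
    sd2 = math.sqrt(max(2 * sdnn ** 2 - sd1 ** 2, 0.0))

    return {
        "sdnn": sdnn,
        "rmssd": rmssd,
        "sd1": sd1,
        "sd2": sd2,
        "sdnn_rmssd": sdnn / rmssd if rmssd else float("nan"),
        "sd1_sd2": sd1 / sd2 if sd2 else float("nan"),
    }


# Hypothetical N-N intervals (ms), for illustration only.
epoch = [800, 820, 840, 830, 810, 790, 780, 800]
metrics = hrv_time_and_poincare(epoch)
```

In practice each 1-min epoch would be passed through such a function and the results averaged within a test phase.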
Methods of pupillometry
Pupil diameter, fluctuation, and asymmetry were measured using the eye-tracking feature of the HTC VIVE Pro Eye VR headset, which recorded changes in pupil diameter at a rate of 120 samples per second during the cognitive tasks. These measurements were then calculated across their respective testing phases. Pupil metrics included the mean pupil size, pupil fluctuation by calculating the standard deviation of pupil diameter data collected continuously within each phase, and pupil asymmetry by comparing the mean absolute difference between left and right pupils across each respective phase.
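As a rough illustration of the three pupil metrics, the following Python sketch computes them for one phase. The function name `pupil_metrics` and the sample values are hypothetical, and averaging the left and right eyes at each sample before computing diameter and fluctuation is an assumption; the text does not state how the two eyes were combined.

```python
import statistics


def pupil_metrics(left_mm, right_mm):
    """Summarize pupil diameter samples (mm) from one testing phase."""
    if len(left_mm) != len(right_mm) or len(left_mm) < 2:
        raise ValueError("need two equal-length series with >= 2 samples")

    # Assumption: combine both eyes by averaging them sample by sample.
    both = [(l + r) / 2 for l, r in zip(left_mm, right_mm)]
    abs_diff = [abs(l - r) for l, r in zip(left_mm, right_mm)]

    return {
        "mean_diameter": statistics.fmean(both),   # mean pupil size
        "fluctuation": statistics.stdev(both),     # SD within the phase
        "asymmetry": statistics.fmean(abs_diff),   # mean |left - right|
    }


# Hypothetical samples (mm), for illustration only.
left = [3.0, 3.2, 3.4]
right = [3.2, 3.2, 3.0]
phase_summary = pupil_metrics(left, right)
```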
Statistical analysis
All analyses were performed using the IBM SPSS Statistics computer software package (Version 29.0). Statistical significance was set at p < 0.05. Data were screened for normality using the Shapiro–Wilk statistic and breaches were identified across the majority of HRV and pupillometry metrics. Consequently, all HRV and pupillometry data were arcsine transformed prior to two-way repeated measures ANOVAs (within factors: trial and testing phase). Where the trial x phase interaction effects did not reach significance, the pattern of response was deemed to be consistent across trials and the main effects of trial and phase were consulted. Mauchly’s test of sphericity was consulted, and the Greenhouse–Geisser adjusted values were used when the assumption of sphericity was violated. Significant main effects of phase were followed up by pairwise comparisons with Bonferroni correction. Significant pairwise differences between adjacent phases were interpreted to demonstrate reactivity of the metric to changes in testing phases.
Intra-class correlations were used to measure the stability of individual data between trials, providing a measure of relative agreement according to guidelines with 95% confidence intervals (McGraw & Wong, 1996) after pooling with Fisher’s Z transformation (Silver & Dunlap, 1987). Two-way random effects of agreement on average scores were chosen to allow generalization of findings, with ICC values interpreted as being “good” (0.75 to 0.90), “moderate” (0.5 to < 0.75) or “poor” (< 0.50) (Koo & Li, 2016). Ratio limits of agreement were used to quantify the absolute agreement between HRV and pupillometry metrics from Trial 1 to Trial 2 (day-to-day variation), where the 95% ratio limits of agreement (95% RLOA) were calculated using the method for replicated data in pairs (Bland & Altman, 1986; Bland & Altman, 1999). Minimal detectable change (MDC) was calculated to identify the threshold value that can be considered as a change greater than the expected variation in the measurement on a day-to-day basis. The standard error of measurement (SEM) was used to calculate the MDC with the equation MDC = SEM x z-score (95% CI), where the z-score equals 1.96 (McGraw & Wong, 1996).
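The pooling and MDC steps can be sketched in Python as follows. The helper names are hypothetical, and the SEM formula (SD × √(1 − ICC)) is a commonly used estimate that is assumed here because the text does not state how the SEM was obtained. The MDC follows the equation given above (SEM × 1.96); the additional √2 factor used in some formulations is deliberately not applied.

```python
import math


def pool_icc_fisher_z(iccs):
    """Average correlation-type coefficients via Fisher's Z transformation.

    Each coefficient must lie strictly between -1 and 1.
    """
    z_values = [math.atanh(r) for r in iccs]    # r -> z
    return math.tanh(sum(z_values) / len(z_values))  # mean z -> r


def sem_from_icc(sd, icc):
    """Assumed SEM estimate: between-subject SD times sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem, z=1.96):
    """MDC as stated in the text: SEM multiplied by the z-score."""
    return sem * z


# Hypothetical inputs, for illustration only.
pooled = pool_icc_fisher_z([0.70, 0.80, 0.85])
sem = sem_from_icc(sd=10.0, icc=0.75)  # 10 * sqrt(0.25) = 5.0
mdc = mdc95(sem)                        # 5.0 * 1.96 = 9.8
```

Expressing the MDC as a percentage of the baseline mean, as in the tables of the Results, is then a matter of dividing by that mean and multiplying by 100.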
Results
Thirteen male and 17 female university students aged 18–35 years old (M = 21.8, SD = 4.0) with mean body mass of 68.8 kg (SD = 14.5), mean stature of 1.68 m (SD = 0.12) and mean body mass index 22.3 kg/m2 (SD = 4.2) completed all requirements and were included in analyses.
Psychological affect did not significantly differ from Trial 1 to Trial 2, with no differences in academic stress (t = 1.12, p = 0.266), depression (t = 0.00, p = 0.999) or anxiety (t = 1.10, p = 0.283). Phasic changes in breathing rates are presented by trial in Fig. 1. Breathing rates were not statistically different between trials (trial effect: F(1,29) = 2.38, p = 0.134, η2 = 0.08), but differed by intervention phase (phase effect: F(3,87) = 209.68, p < 0.001, η2 = 0.88). Breathing rate did not significantly differ between baseline (16.3 breaths·min−1, 95% CI: 15.3 to 17.2 breaths·min−1) and cognitive testing (16.8 breaths·min−1, 95% CI: 15.8 to 17.9 breaths·min−1). Guided breathing with nature immersion (GBNI), where participants were prompted to breathe at a rate of six breaths·min−1 while experiencing nature immersion, resulted in a reduction in breathing rate to 6.3 breaths·min−1 (95% CI: 5.8 to 6.8 breaths·min−1; p < 0.001), and breathing rate returned towards baseline when guided breathing was withdrawn during the final phase with nature immersion (NI; 10.9 breaths·min−1, 95% CI: 10.0 to 11.9 breaths·min−1). Respiration depth (% of respiration range) did not differ significantly in pattern by trial (trial x phase interaction effect: F(3,87) = 1.32, p = 0.272, η2 = 0.04), but differed by phase (F(3,87) = 75.39, p < 0.001, η2 ≥ 0.72) with pairwise differences from cognitive testing to GBNI (Trial 1: MD: – 25%, 95% CI: – 33 to – 18%, p < 0.001; Trial 2: MD: – 29%, 95% CI: – 40 to – 18%, p < 0.001) and from GBNI to NI (Trial 1: MD: 20%, 95% CI: 13 to 27%, p < 0.001; Trial 2: MD: 25%, 95% CI: 14 to 37%, p < 0.001).
Fig. 1. Mean (95% confidence interval) breathing rate between trials and across phases
Heart Rate Variability (HRV)
The pattern of response did not differ significantly across trials for all HRV metrics (trial x phase interaction effect: F(3,87) ≤ 1.17, p ≥ 0.321, η2 ≤ 0.04) and no systematic between trial differences were observed for any HRV metrics (F(1,29) ≤ 0.12, p ≥ 0.526, η2 ≤ 0.01; Fig. 2). Heart rate (HR) was also consistent across trials (trial x phase interaction effect: F(3,87) = 2.92, p = 0.065, η2 = 0.09) with no systematic between trial differences (F(1,29) = 0.88, p = 0.355, η2 = 0.03) and a significant difference between phases (F(3,87) = 5.21, p = 0.005, η2 ≥ 0.15) from baseline to cognitive testing (MD: – 1.8 beats·min−1, 95% CI: – 1.3 to – 2.3 beats·min−1, p < 0.001, d = 0.94).
Fig. 2. HRV scores (LF/HF ratio, SDNN/RMSSD ratio, SD1/SD2 ratio, LF power, SDNN, SD2, HF power, RMSSD and SD1) between trials and across phases
Day-to-day reliability and MDC for all HRV metrics are presented in Table 1. Relative reliability was good for all HRV metrics (ICC: 0.75 to 0.83). Absolute reliability had consistently low bias (1 to 6% for 78% of metrics) with variability ranging from SD1/SD2 ratio (RLOA: 0.90 to 1.05) to HF Power (RLOA: 0.67 to 1.20), being tighter for non-linear and time domain metrics in comparison to frequency domain metrics. The minimum detectable change in day-to-day values ranged from 22% (SDNN and SDNN/RMSSD ratio) to 73% of baseline values (HF power).
Table 1. Absolute reliability (bias and ratio limits of agreement), relative reliability (intra-class correlations) and minimal detectable change values for heart rate variability metrics
| HRV metric | Mean bias (%) | RLOA (ratio) | ICC (95% CI) | MDC (metric unit) | MDC (% Baseline) |
|---|---|---|---|---|---|
| LF/HF ratio | 1 | 0.78 to 1.26 | 0.76 (0.66 to 0.83) | 1.85 | 54 |
| LF Power (ms2) | 15 | 0.69 to 1.11 | 0.83 (0.72 to 0.89) | 183 | 32 |
| HF Power (ms2) | 12 | 0.67 to 1.20 | 0.77 (0.59 to 0.87) | 225 | 73 |
| SDNN/RMSSD ratio | 2 | 0.95 to 1.09 | 0.76 (0.74 to 0.79) | 0.38 | 22 |
| SDNN (ms) | 3 | 0.87 to 1.09 | 0.76 (0.66 to 0.83) | 12 | 22 |
| RMSSD (ms) | 6 | 0.81 to 1.10 | 0.80 (0.69 to 0.87) | 12 | 32 |
| SD1/SD2 (%) | 3 | 0.90 to 1.05 | 0.79 (0.76 to 0.81) | 13 | 38 |
| SD1 (ms) | 6 | 0.81 to 1.10 | 0.80 (0.69 to 0.87) | 8 | 31 |
| SD2 (ms) | 3 | 0.87 to 1.08 | 0.75 (0.65 to 0.82) | 17 | 23 |
RLOA: ratio limits of agreement, ICC: intra-class correlation, CI: confidence interval, MDC: minimal detectable change, LF: low frequency, HF: high frequency, SDNN: standard deviation of normal-to-normal intervals, RMSSD: root mean square of successive differences, SD1: Poincaré standard deviation 1, SD2: Poincaré standard deviation 2
In the frequency domain, LF power, HF power and LF/HF ratio all differed by phase (F(3,87) ≥ 6.96, p ≤ 0.002, η2 ≥ 0.19). LF/HF ratio was different for all phases of the protocol (Fig. 2A), with reactivity from baseline to cognitive testing (MD: – 1.0, 95% CI: – 0.6 to – 1.4, p < 0.001, d = 0.63), cognitive testing to guided breathing (MD: 11.1, 95% CI: 9.0 to 13.1, p < 0.001, d = 1.39) and guided breathing to nature immersion (MD: – 8.6, 95% CI: – 6.4 to – 10.8, p < 0.001, d = 0.99). LF power (Fig. 2B) increased from cognitive testing to guided breathing (MD: 3179 ms2, 95% CI: 2442 to 3918 ms2, p < 0.001, d = 1.11) and decreased from guided breathing to nature immersion (MD: – 2379 ms2, 95% CI: – 1679 to – 3078 ms2, p < 0.001, d = 0.88). HF power (Fig. 2C) increased from baseline to cognitive testing (MD: 64 ms2, 95% CI: 16 to 114 ms2, p = 0.002, d = 0.34).
In the time domain, SDNN, RMSSD and SDNN/RMSSD ratio were different between all phases (F(3, 87) ≥ 22.09, p < 0.001, η2 ≥ 0.43) except for SDNN/RMSSD ratio from guided breathing to nature immersion (MD: – 0.06, 95% CI: – 0.02 to 0.14, p = 0.183, d = 0.17). SDNN/RMSSD ratio (Fig. 2D) decreased from baseline to cognitive testing (MD: – 0.21, 95% CI: – 0.15 to – 0.28, p < 0.001, d = 0.82) and increased to guided breathing (MD: 0.42, 95% CI: 0.34 to 0.49, p < 0.001, d = 1.34). SDNN (Fig. 2E) decreased from baseline to cognitive testing (MD: – 3.5 ms, 95% CI: – 1.3 to – 5.8 ms, p = 0.035, d = 0.40), increased to guided breathing (MD: 41.4 ms, 95% CI: 34.9 to 47.9 ms, p < 0.001, d = 1.65) then decreased to nature immersion (MD: – 26.4 ms, 95% CI: – 19.8 to – 33.1 ms, p < 0.001, d = 1.03). RMSSD (Fig. 2F) increased from baseline to cognitive testing (MD: 2.2 ms, 95% CI: 0.4 to 4.1 ms, p = 0.004, d = 0.32), increased to guided breathing (MD: 14.6 ms, 95% CI: 10.0 to 19.1 ms, p < 0.001, d = 1.11) then reduced to nature immersion (MD: – 14.3 ms, 95% CI: – 9.6 to – 19.0 ms, p < 0.001, d = 0.78).
SD1/SD2 ratio, SD1 and SD2 all differed by phase (F(3, 87) ≥ 22.07, p < 0.001, η2 ≥ 0.43; Fig. 2). SD1/SD2 ratio (Fig. 2G) increased from baseline to cognitive testing (MD: 0.06, 95% CI: 0.4 to 0.8, p < 0.001, d = 0.94) and decreased to guided breathing (MD: – 0.1, 95% CI: – 0.1 to – 0.1, p < 0.001, d = 1.16). SD2 (Fig. 2H) decreased from baseline to cognitive testing (MD: – 6.1 ms, 95% CI: – 2.9 to – 9.3 ms, p = 0.007, d = 0.51), increased to guided breathing (MD: 52.5 ms, 95% CI: 43.7 to 61.3 ms, p < 0.001, d = 1.61) then decreased to nature immersion (MD: – 36.5 ms, 95% CI: – 27.0 to – 46.0 ms, p < 0.001, d = 1.03). SD1 (Fig. 2I) increased from baseline to cognitive testing (MD: 2.8 ms, 95% CI: 1.4 to 4.1 ms, p < 0.001, d = 0.54), increased to guided breathing (MD: 9.3 ms, 95% CI: 6.3 to 12.3 ms, p < 0.001, d = 0.83) and then decreased to nature immersion (MD: – 10.4 ms, 95% CI: – 6.8 to – 14.0 ms, p < 0.001, d = 0.78).
Pupillometry
The pattern of response was not significantly different across trials for all pupillometry metrics (trial x phase interaction effect: F(1,29) ≤ 2.45, p ≥ 0.090, η2 ≤ 0.08). Systematic trial differences were observed for pupil diameter and pupil fluctuation metrics (trial effect: F(1,29) ≥ 4.60, p ≤ 0.041, η2 ≥ 0.14) but not asymmetry (trial effect: F(1,29) = 1.82, p = 0.188, η2 = 0.06). Each pupillometry metric differed between phases (phase effect: F(1,29) ≥ 12.15, p < 0.001, η2 ≥ 0.29) except for asymmetry from guided breathing to nature immersion (MD: 0.01 mm, 95% CI: – 0.02 to 0.05 mm, p = 1.00, d = 0.12; Fig. 3C).
Fig. 3. Pupillometry scores (pupil diameter, pupil fluctuation and pupil asymmetry) between trials across phases (mean and 95% confidence intervals)
Day-to-day reliability and MDC for all pupillometry metrics are presented in Table 2. Relative reliability was good for pupil diameter and moderate for pupil fluctuation and pupil asymmetry (ICC: 0.66 to 0.87). Absolute reliability of diameter and fluctuation had consistently low bias (2 to 4%) with variability ranging from pupil fluctuation (RLOA: 1.03 to 1.19) to pupil diameter (RLOA: 0.92 to 0.99), being tighter than pupil asymmetry metrics (bias: 18%, RLOA: 0.88 to 2.59). The MDC in day-to-day values ranged from 33% to 88% of baseline values for pupil diameter and asymmetry, respectively.
Table 2. Summary of reliability and minimal detectable change values for pupillometry measures
| Measure | Mean bias (%) | RLOA ratio | ICC (95% CI) | MDC (mm) | Proportion of baseline mean (%) |
|---|---|---|---|---|---|
| Pupil diameter (mm) | 2 | 0.92 to 0.99 | 0.87 (0.85 to 0.89) | 0.22 | 33 |
| Pupil fluctuation (mm) | 4 | 1.03 to 1.19 | 0.67 (0.54 to 0.77) | 0.29 | 64 |
| Pupil asymmetry (mm) | 18 | 0.88 to 2.59 | 0.66 (0.60 to 0.71) | 1.47 | 88 |
RLOA: ratio limits of agreement, ICC: intra-class correlation, CI: confidence interval, MDC: minimal detectable change, SD: standard deviation
Pupil diameter increased from cognitive testing to guided breathing (MD: 0.43 mm, 95% CI: 0.01 to 0.08 mm, p < 0.001, d = 0.28) and then to nature immersion (MD: 0.75 mm, 95% CI: 0.64 to 0.88 mm, p < 0.001, d = 1.61; Fig. 3A). Pupil fluctuation increased from cognitive testing to guided breathing (MD: 0.27 mm, 95% CI: 0.23 to 0.30 mm, p < 0.001, d = 1.91) and then to nature immersion (MD: 0.16 mm, 95% CI: 0.11 to 0.21 mm, p < 0.001, d = 0.81; Fig. 3B). Pupil asymmetry increased from cognitive testing to guided breathing (MD: 0.10 mm, 95% CI: 0.05 to 0.14 mm, p < 0.001, d = 0.57).
Discussion
The results of this study confirm strong day-to-day relative reliability for HRV and pupillometry measurements taken 1 week apart, with academic stress and affect remaining stable, indicating good stability of measurement over time. HRV metrics of sympathovagal balance and pupillometry displayed discernible reactivity throughout the test phases. These findings demonstrate the effectiveness of both cognitive testing and paced breathing in modulating physiological responsiveness, suggesting that the current protocol manipulates elements of psychological demand and relaxation in a manner that is suitable for detecting physiological reactivity using HRV and pupillometry metrics.
Day-to-day reliability of heart rate variability and pupillometry
When investigating reliability, the distinction between relative and absolute reliability is important (Maestri et al., 2010; Weir, 2005). Relative reliability considers the extent to which individuals maintain their within-sample rank across repeated measurements (Sole et al., 2007; Watson, 2004; Weir, 2005). In line with the current findings, where ICCs ranged from 0.75 to 0.83 for HRV metrics, HRV measures have previously been reported to demonstrate good relative reliability across short time intervals (Bertsch et al., 2012; Cipryan & Litschmannova, 2013, 2014; Kobayashi, 2009; Maestri et al., 2010; Pinna et al., 2007). The relative reliability of pupillometry measures was moderate for pupil fluctuations and asymmetry, and good for pupil diameter (Couret et al., 2016; Farah et al., 2016).
Absolute reliability considers the variation in repeated measures for an individual, regardless of their relative position (Atkinson & Nevill, 1998). Mean bias was low (≤ 6%) for the majority of HRV metrics, but the range in RLOA was variable with the tightest being SD2 and the broadest being HF power. Although the absolute reliability of HRV reactivity across timepoints has received little attention, these results extend previous findings that individual variations in HRV are high (Dupuy et al., 2012; Sookan & McKune, 2012), with large variability being reported for absolute reliability (Cipryan & Litschmannova, 2013, 2014; Maestri et al., 2010; Pinna et al., 2007).
Reactivity of heart rate variability and pupillometry to changes in cognitive stimulation and breathing
The breathing patterns of participants in response to the test protocol were consistent across trials, with similar breathing rate and depth during baseline and cognitive testing (~16 breaths·min−1), decreasing to ~6 breaths·min−1 during paced breathing and then returning towards baseline values when paced breathing was withdrawn. The high reactivity that was observed in the HRV measures from guided breathing with nature immersion to spontaneous breathing with nature immersion can therefore be attributed to respiratory modulation. This response was expected and in accordance with the established association of respiration-driven changes in HRV occurring through vagally mediated respiratory sinus arrhythmia (Kromenacker et al., 2018; Miu et al., 2009; Pagani et al., 1984). Stability in the breathing rate and depth across baseline and cognitive testing demonstrates that changes in physiological reactivity from baseline to cognitive testing were not driven by respiratory modulation. Furthermore, heart rate displayed strong reliability between trials and only a small change was identified from baseline to cognitive testing (MD: – 1.8 beats·min−1). The consistency of heart rate across phases demonstrates that it was unlikely to systematically influence the HRV results in this study.
HRV measures of sympathovagal balance (LF/HF ratio, SDNN/RMSSD ratio and SD1/SD2) all showed change from baseline to cognitive testing, with SD1/SD2 displaying the expected reciprocal pattern of response when compared with LF/HF ratio and SDNN/RMSSD. This similarity in reactivity aligns with the known correlation between LF/HF ratio and SD1/SD2 (Guzik et al., 2007; Zerr et al., 2015). The change from baseline to cognitive testing, while the virtual environment and respiration remained consistent, demonstrates the sensitivity of these HRV measures to changes in cognitive load, as the cognitive distractor influenced parasympathetic balance, matching previous findings (Manser et al., 2021). The finding that breathing at 6 breaths·min−1 increased power predominantly in the traditional LF bandwidth is consistent with the known influence of slow-paced breathing on autonomic regulation (Lehrer, 2007), and high reactivity in response to guided breathing aligns with the influence of respiratory modulation of the ANS. Furthermore, the reversal in direction in these measures of sympathovagal balance after baseline and then cognitive testing strengthens these findings by discounting systematic temporal influences, such as habituation.
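The link between breathing rate and spectral band can be made concrete with a short calculation. The sketch below converts a breathing rate to its fundamental frequency and reports which band it falls in. The band limits (LF: 0.04 to 0.15 Hz; HF: 0.15 to 0.40 Hz) are the conventional short-term limits and are assumed here rather than taken from this section; the function name is hypothetical.

```python
# Conventional short-term HRV band limits in Hz (assumed, not taken from this section)
LF_BAND = (0.04, 0.15)
HF_BAND = (0.15, 0.40)


def breathing_band(breaths_per_min):
    """Return (frequency in Hz, band label) for a given breathing rate."""
    freq_hz = breaths_per_min / 60.0
    if LF_BAND[0] <= freq_hz < LF_BAND[1]:
        return freq_hz, "LF"
    if HF_BAND[0] <= freq_hz <= HF_BAND[1]:
        return freq_hz, "HF"
    return freq_hz, "outside LF/HF"


paced = breathing_band(6)         # 0.1 Hz, inside the LF band
spontaneous = breathing_band(16)  # about 0.27 Hz, inside the HF band
```

Under these assumed limits, paced breathing at 6 breaths·min−1 places the respiratory oscillation at 0.1 Hz within the LF band, whereas spontaneous breathing at ~16 breaths·min−1 places it within the HF band.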
LF power, SDNN and SD2 are traditionally associated with SNS activity but also include PNS activity (Shaffer et al., 2014). As expected, due to the high correlations between LF power, SDNN (Umetani et al., 1998), and SD2 (Brennan et al., 2002), the results across these metrics were consistent. Similar to the current study, LF power calculated while sitting upright has been shown to reflect a higher contribution from PNS activity (Berntson et al., 1997; Eckberg, 1983). The seated position might explain why these metrics followed a trend similar to that of the sympathovagally mediated metrics, with both SNS and PNS contributions being detected.
HF power, RMSSD, and SD1 estimate PNS activity and indicate vagally mediated changes in HRV (Kleiger et al., 2005; Shaffer et al., 2014). HF power is highly correlated with RMSSD (Kleiger et al., 1987) while RMSSD and SD1 provide identical measures (Ciccone et al., 2017). RMSSD and SD1 displayed changes across all phases, with the greatest change occurring during guided breathing nature immersion. These metrics all increased from baseline to cognitive testing. Therefore, detection of change due to a mild cognitive stressor might include SNS activation, which is less influential in these metrics. HF power only showed a change from baseline to cognitive testing. HF power displayed weaker reliability and less reactivity than the other similar metrics, suggesting HF power is a less reliable measure of PNS reactivity during this protocol. Conversely, HRV metrics with PNS and SNS components were more sensitive to reactivity throughout the phases.
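The equivalence of RMSSD and SD1 noted above (Ciccone et al., 2017) can be checked numerically. In the sketch below, SD1 is computed as the standard deviation of successive RR differences divided by √2, so it equals RMSSD/√2 whenever the mean successive difference is zero and is very close to it otherwise. The function names and the RR series are hypothetical, and this is an illustrative sketch rather than the software used in this study.

```python
import math


def successive_differences(rr_ms):
    """Differences between consecutive RR intervals (ms)."""
    return [b - a for a, b in zip(rr_ms, rr_ms[1:])]


def rmssd(rr_ms):
    """Root mean square of successive RR differences."""
    diffs = successive_differences(rr_ms)
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def sd1(rr_ms):
    """Poincaré SD1: SD of successive differences divided by sqrt(2)."""
    diffs = successive_differences(rr_ms)
    mean_d = sum(diffs) / len(diffs)
    var_d = sum((d - mean_d) ** 2 for d in diffs) / len(diffs)
    return math.sqrt(var_d / 2.0)


# Hypothetical RR series (ms) whose first and last values match,
# so the mean successive difference is exactly zero
rr = [800, 810, 790, 805, 795, 800]
ratio = rmssd(rr) / sd1(rr)  # equals sqrt(2) for this series
```

Because the two metrics differ only by a constant factor, they carry the same information about short-term variability, which is why their reactivity and reliability track each other.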
With the exception of pupil asymmetry, pupillometry metrics all showed increases from cognitive testing to guided breathing nature immersion to nature immersion. No measure of pupillometry was taken at the baseline phase because the luminance in this static environment was not consistent with the luminance of the testing environment. During the assessed phases, luminance was stable (luminance values ranged from 70 cd/m² in guided breathing to 61 cd/m² in nature immersion). Systematic differences between trials for pupil diameter and fluctuation might be explained by differences in baseline levels, as trends remained constant across phases. Increases from cognitive testing to guided breathing nature immersion matched HRV metrics, although the continued increase from guided breathing nature immersion to nature immersion was distinct for pupillometry metrics. It has been established that inhalation and expiration affect pupil diameter (Borgdorff, 1975; Häbler et al., 1994), but breathing rate and deep breathing at six breaths·min−1 do not (Debnath et al., 2021; Schaefer et al., 2023). The divergent mechanisms underlying pupillometry and HRV across cognitive testing and paced breathing extend other findings indicating an alternative control mechanism in the cognitive state of task engagement, through which respiratory driven autonomic control is superseded (Nakamura et al., 2019). Contrary to expectations, each pupil metric increased across phases, which matched the order of most cognitively demanding (cognitive testing) to least cognitively demanding (nature immersion). As luminance was controlled, and breathing rate is unlikely to have influenced pupillary measures, the increases across phases might not be explained by cognitive load, but by changes in psychological affect, a known factor impacting pupillometry function (Carvalho et al., 2015; Peinkhofer et al., 2019; Thomas et al., 2021).
Previous findings demonstrate that both meditation and virtual nature immersion are associated with increases in positive affect (Frost et al., 2022), and such changes in affect are associated with modulation of pupil diameter (Tichon et al., 2014). Although state affect was not measured in the current study, it likely improved in line with immersion in a calming environment of the participant's choice. Therefore, even though pupillometry and HRV are autonomically influenced, these findings indicate that differential influences exist between these measures.
Minimal detectable change
The MDC values, when presented as percentage change from baseline, ranged from 22% for SDNN/RMSSD ratio to 88% for pupil asymmetry. On an individual basis, MDC values have been applied in practical settings as the amount of change in repeated measurements that can be interpreted as exceeding normal day-to-day variability. The relatively large MDC values in this study are expected for physiological measures which are known to fluctuate regularly, and explain the difficulty in using these indices to detect systematic changes from one time point to another at an individual level.
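To show how an MDC expressed as a percentage of baseline can arise, the sketch below uses the commonly reported formulation in which the standard error of measurement is SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM (see Weir, 2005). Whether this exact formulation was applied in this study is an assumption; the function name and input values are hypothetical.

```python
import math


def minimal_detectable_change(sd, icc, baseline_mean, z=1.96):
    """Return (MDC in original units, MDC as % of the baseline mean).

    sd            -- between-participant standard deviation of the metric
    icc           -- test-retest intraclass correlation coefficient
    baseline_mean -- mean baseline value used to express MDC as a percentage
    z             -- z-score for the confidence level (1.96 for 95%)
    """
    sem = sd * math.sqrt(1.0 - icc)   # standard error of measurement
    mdc = z * math.sqrt(2.0) * sem    # sqrt(2) reflects error in both measurements
    return mdc, 100.0 * mdc / baseline_mean


# Hypothetical metric: SD = 10, ICC = 0.75, baseline mean = 50
mdc, mdc_percent = minimal_detectable_change(sd=10.0, icc=0.75, baseline_mean=50.0)
```

With these hypothetical inputs the MDC is about 13.9 units, or roughly 28% of baseline, illustrating how a metric with strong relative reliability can still require a large individual change before it exceeds day-to-day variability.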
Limitations
Several confounds inherent in HRV and pupillometry measurement need consideration. HRV is known to be sensitive to various conditions, including posture, postural changes, and ambient temperature. Standardized testing conditions were implemented to address these nuisance factors, with participants remaining seated throughout testing in a stable laboratory condition of 22 °C. Pupillometry measures are influenced by light, and the CONVIRT system ensured luminance values were stable during testing. To preserve statistical power, other factors such as fitness and BMI were not included as covariates in analyses, but the impact of between-person variability is likely minimal given the within-subjects design employed. Respiration belt accuracy was likely enhanced by a standardized protocol where participants were fitted before HRV recordings to ensure accurate sensitivity and were monitored throughout testing. Finally, although the sample size in the present study provided adequate statistical power for our analyses, the generalizability of our findings would be improved with a larger and more diverse sample.
Conclusion
Across all metrics, relative day-to-day reliability was high with minimal systematic differences between repeated measures, meaning that these measures are likely to be useful to identify differences in means between groups. Absolute reliability was characterized by relatively wide RLOA values and MDC values were high, suggesting that these metrics are less likely to be useful in detecting change at the individual level. The intervention protocol used in this study elicited physiological reactivity in a repeatable fashion over time, and HRV and pupillometry metrics offer promising measures to assess the reactivity of individuals to changes in psychological activities and states. Protocols that provide controlled conditions with varying psychological activities (e.g., demands and relaxation) should be considered when researchers wish to monitor physiological reactivity of the ANS using HRV and pupillometry measurements.
Funding
Open Access funding enabled and organized by CAUL and its Member Institutions
Data availability
The data for the project are available at https://osf.io/e695b/
Code availability
Not applicable
Declarations
Ethics approval
Institutional ethics approval was obtained (HEC20077)
Consent to participate
All participants signed informed consent forms.
Consent for publication
Not applicable
Conflicts of interest/Competing interests
Wright and Horan may look to commercialize CONVIRT software in the future.
References
- Al Abdi, R. M., Alhitary, A. E., Abdul Hay, E. W., & Al-Bashir, A. K. (2018). Objective detection of chronic stress using physiological parameters. Medical and Biological Engineering and Computing, 56, 2273–2286.
- Atkinson, G., & Nevill, A. M. (1998). Statistical methods for assessing measurement error (reliability) in variables relevant to sports medicine. Sports Medicine, 26, 217–238.
- Berntson, G. G., Thomas Bigger Jr, J., Eckberg, D. L., Grossman, P., Kaufmann, P. G., Malik, M., Nagaraja, H. N., Porges, S. W., Saul, J. P., & Stone, P. H. (1997). Heart rate variability: origins, methods, and interpretive caveats. Psychophysiology, 34(6), 623–648.
- Bertsch, K., Hagemann, D., Naumann, E., Schaechinger, H., & Schulz, A. (2012). Stability of heart rate variability indices reflecting parasympathetic activity. Psychophysiology, 49(5), 672–682.
- Bigger, J. T., Jr., Fleiss, J. L., Steinman, R. C., Rolnitzky, L. M., Kleiger, R. E., & Rottman, J. N. (1992). Frequency domain measures of heart period variability and mortality after myocardial infarction. Circulation, 85(1), 164–171.
- Bland, J. M., & Altman, D. (1986). Statistical methods for assessing agreement between two methods of clinical measurement. Lancet, 327(8476), 307–310.
- Bland, J. M., & Altman, D. G. (1999). Measuring agreement in method comparison studies. Statistical Methods in Medical Research, 8(2), 135–160.
- Borgdorff, P. (1975). Respiratory fluctuations in pupil size. American Journal of Physiology-Legacy Content, 228(4), 1094–1102.
- Brennan, M., Palaniswami, M., & Kamen, P. (2002). Poincaré plot interpretation using a physiological model of HRV based on a network of oscillators. American Journal of Physiology-Heart and Circulatory Physiology, 283(5), H1873–H1886.
- Brindle, R. C., Ginty, A. T., Phillips, A. C., & Carroll, D. (2014). A tale of two mechanisms: A meta-analytic approach toward understanding the autonomic basis of cardiovascular reactivity to acute psychological stress. Psychophysiology, 51(10), 964–976.
- Brugnera, A., Zarbo, C., Tarvainen, M. P., Carlucci, S., Tasca, G. A., Adorni, R., Auteri, A., & Compare, A. (2019). Higher levels of depressive symptoms are associated with increased resting-state heart rate variability and blunted reactivity to a laboratory stress task among healthy adults. Applied Psychophysiology and Biofeedback, 44, 221–234.
- Burns, M. N., Nawacki, E., Kwasny, M. J., Pelletier, D., & Mohr, D. C. (2014). Do positive or negative stressful events predict the development of new brain lesions in people with multiple sclerosis? Psychological Medicine, 44(2), 349–359.
- Carvalho, N., Laurent, E., Noiret, N., Chopard, G., Haffen, E., Bennabi, D., & Vandel, P. (2015). Eye movement in unipolar and bipolar depression: A systematic review of the literature. Frontiers in Psychology, 6, Article 1809.
- Castaldo, R., Melillo, P., Bracale, U., Caserta, M., Triassi, M., & Pecchia, L. (2015). Acute mental stress assessment via short-term HRV analysis in healthy adults: A systematic review with meta-analysis. Biomedical Signal Processing and Control, 18, 370–377.
- Chen, Y.-S., Lu, W.-A., Pagaduan, J. C., & Kuo, C.-D. (2020). A novel smartphone app for the measurement of ultra–short-term and short-term heart rate variability: Validity and reliability study. JMIR mHealth and uHealth, 8(7), Article e18761.
- Ciccone, A. B., Siedlik, J. A., Wecht, J. M., Deckert, J. A., Nguyen, N. D., & Weir, J. P. (2017). Reminder: RMSSD and SD1 are identical heart rate variability metrics. Muscle & Nerve, 56(4), 674–678.
- Cipryan, L., & Litschmannova, M. (2013). Intra-day and inter-day reliability of heart rate variability measurement. Journal of Sports Sciences, 31(2), 150–158.
- Cipryan, L., & Litschmannova, M. (2014). Intra-session stability of short-term heart rate variability measurement: Gender and total spectral power influence. Journal of Human Sport and Exercise, 9(1), 68–80.
- Corrone, M., Nanev, A., Amato, I., Bicknell, R., Wundersitz, D. W. T., van den Buuse, M., & Wright, B. J. (2021). Brain-derived neurotropic factor Val66Met is a strong predictor of decision-making and attention performance on the CONVIRT virtual reality cognitive battery. Neuroscience, 455, 19–29.
- Couret, D., Boumaza, D., Grisotto, C., Triglia, T., Pellegrini, L., Ocquidant, P., Bruder, N. J., & Velly, L. J. (2016). Reliability of standard pupillometry practice in neurocritical care: An observational, double-blinded study. Critical Care, 20(1), 1–9.
- Debnath, S., Levy, T. J., Bellehsen, M., Schwartz, R. M., Barnaby, D. P., Zanos, S., Volpe, B. T., & Zanos, T. P. (2021). A method to quantify autonomic nervous system function in healthy, able-bodied individuals. Bioelectronic Medicine, 7, 1–17.
- Dupuy, O., Mekary, S., Berryman, N., Bherer, L., Audiffren, M., & Bosquet, L. (2012). Reliability of heart rate measures used to assess post-exercise parasympathetic reactivation. Clinical Physiology and Functional Imaging, 32(4), 296–304.
- Eckberg, D. L. (1983). Human sinus arrhythmia as an index of vagal cardiac outflow. Journal of Applied Physiology (1985), 54(4), 961–966.
- Eddy, P., Heckenberg, R., Wertheim, E. H., Kent, S., & Wright, B. J. (2016). A systematic review and meta-analysis of the effort-reward imbalance model of workplace stress with indicators of immune function. Journal of Psychosomatic Research, 91, 1–8.
- Emelifeonwu, J. A., Reid, K., Rhodes, J. K., & Myles, L. (2018). Saved by the pupillometer!–A role for pupillometry in the acute assessment of patients with traumatic brain injuries? Brain Injury, 32(5), 675–677.
- Fang, S.-C., Wu, Y.-L., & Tsai, P.-S. (2020). Heart rate variability and risk of all-cause death and cardiovascular events in patients with cardiovascular disease: A meta-analysis of cohort studies. Biological Research For Nursing, 22(1), 45–56.
- Farah, B. Q., Lima, A. HRd. A., Cavalcante, B. R., de Oliveira, L. M. F. T., Brito, ALd. S., de Barros, M. V. G., & Ritti-Dias, R. M. (2016). Intra-individuals and inter- and intra-observer reliability of short-term heart rate variability in adolescents. Clinical Physiology and Functional Imaging, 36(1), 33–39.
- Frost, S., Kannis-Dymand, L., Schaffer, V., Millear, P., Allen, A., Stallman, H., Mason, J., Wood, A., & Atkinson-Nolte, J. (2022). Virtual immersion in nature and psychological well-being: A systematic literature review. Journal of Environmental Psychology, 80, Article 101765.
- Guzik, P., Piskorski, J., Krauze, T., Schneider, R., Wesseling, K. H., Wykretowicz, A., & Wysocki, H. (2007). Correlations between the Poincaré plot and conventional heart rate variability parameters assessed during paced breathing. The Journal of Physiological Sciences, 57(1), 63–71.
- Häbler, H.-J., Jänig, W., & Michaelis, M. (1994). Respiratory modulation in the activity of sympathetic neurones. Progress in Neurobiology, 43(6), 567–606.
- Hamilton, J. L., & Alloy, L. B. (2016). Atypical reactivity of heart rate variability to stress and depression across development: Systematic review of the literature and directions for future research. Clinical Psychology Review, 50, 67–79.
- Hemmerle, A. M., Herman, J. P., & Seroogy, K. B. (2012). Stress, depression and Parkinson’s disease. Experimental Neurology, 233(1), 79–86.
- Horan, B., Heckenberg, R., Maruff, P., & Wright, B. (2020). Development of a new virtual reality test of cognition: Assessing the test–retest reliability, convergent and ecological validity of CONVIRT. BMC Psychology, 8(1), 1–10.
- Jarczok, M. N., Jarczok, M., & Thayer, J. F. (2020). Work stress and autonomic nervous system activity. Handbook of socioeconomic determinants of occupational health: From macro-level to micro-level evidence, 1–33.
- Kaltsatou, A., Kouidi, E., Fotiou, D., & Deligiannis, P. (2011). The use of pupillometry in the assessment of cardiac autonomic function in elite different type trained athletes. European Journal of Applied Physiology, 111, 2079–2087.
- Kim, E., Lim, J. A., Choi, C. H., Lee, S. Y., Kwak, S., & Kim, J. (2022). Assessment of the changes in cardiac sympathetic nervous activity using the pupil size changes measured in seated patients whose stellate ganglion is blocked by interscalene brachial plexus block. Korean Journal of Anesthesiology.
- Kim, H.-G., Cheon, E.-J., Bai, D.-S., Lee, Y. H., & Koo, B.-H. (2018). Stress and heart rate variability: A meta-analysis and review of the literature. Psychiatry Investigation, 15(3), 235.
- Kingsley, M., Lewis, M., & Marson, R. (2005). Comparison of polar 810 s and an ambulatory ECG system for RR interval measurement during progressive exercise. International Journal of Sports Medicine, 26(01), 39–44.
- Kleiger, R. E., Miller, J. P., Bigger, J. T., Jr., & Moss, A. J. (1987). Decreased heart rate variability and its association with increased mortality after acute myocardial infarction. The American Journal of Cardiology, 59(4), 256–262.
- Kleiger, R. E., Stein, P. K., & Bigger, J. T., Jr. (2005). Heart rate variability: Measurement and clinical utility. Annals of Noninvasive Electrocardiology, 10(1), 88–101.
- Kobayashi, H. (2009). Does paced breathing improve the reproducibility of heart rate variability measurements? Journal of Physiological Anthropology, 28(5), 225–230.
- Koo, T. K., & Li, M. Y. (2016). A guideline of selecting and reporting intraclass correlation coefficients for reliability research. Journal of Chiropractic Medicine, 15(2), 155–163.
- Kromenacker, B. W., Sanova, A. A., Marcus, F. I., Allen, J. J., & Lane, R. D. (2018). Vagal mediation of low-frequency heart rate variability during slow yogic breathing. Psychosomatic Medicine, 80(6), 581–587.
- La Rovere, M. T., Gorini, A., & Schwartz, P. J. (2022). Stress, the autonomic nervous system, and sudden death. Autonomic Neuroscience, 237, Article 102921.
- Laitio, T. T., Huikuri, H. V., Kentala, E., Mäkikallio, T. H., Jalonen, J. R., Helenius, H., Sariola-Heinonen, K., Yli-Mäyry, S., & Scheinin, H. (2000). Correlation properties and complexity of perioperative RR-interval dynamics in coronary artery bypass surgery patients. Anesthesiology, 93(1), 69–80.
- Lehrer, P. M. (2007). Biofeedback training to increase heart rate variability. Principles and Practice of Stress Management, 3, 227–248.
- Levy, B. (2013). Autonomic nervous system arousal and cognitive functioning in bipolar disorder. Bipolar Disorders, 15(1), 70–79.
- Lovibond, P. F., & Lovibond, S. H. (1995). The structure of negative emotional states: Comparison of the Depression Anxiety Stress Scales (DASS) with the Beck Depression and Anxiety Inventories. Behaviour Research and Therapy, 33(3), 335–343.
- Lupien, S. J., McEwen, B. S., Gunnar, M. R., & Heim, C. (2009). Effects of stress throughout the lifespan on the brain, behaviour and cognition. Nature Reviews Neuroscience, 10(6), 434–445.
- Maestri, R., Raczak, G., Danilowicz-Szymanowicz, L., Torunski, A., Sukiennik, A., Kubica, J., La Rovere, M. T., & Pinna, G. D. (2010). Reliability of heart rate variability measurements in patients with a history of myocardial infarction. Clinical Science, 118(3), 195–201.
- Malik, M. (1996). Heart rate variability: Standards of measurement, physiological interpretation, and clinical use: Task force of the European Society of Cardiology and the North American Society for Pacing and Electrophysiology. Annals of Noninvasive Electrocardiology, 1(2), 151–181.
- Mandrick, K., Peysakhovich, V., Rémy, F., Lepron, E., & Causse, M. (2016). Neural and psychophysiological correlates of human performance under stress and high mental workload. Biological Psychology, 121, 62–73.
- Manser, P., Thalmann, M., Adcock, M., Knols, R. H., & de Bruin, E. D. (2021). Can reactivity of heart rate variability be a potential biomarker and monitoring tool to promote healthy aging? A systematic review with meta-analyses. Frontiers in Physiology, 12, Article 686129.
- McCraty, R., & Shaffer, F. (2015). Heart rate variability: New perspectives on physiological mechanisms, assessment of self-regulatory capacity, and health risk. Global Advances in Health and Medicine, 4(1), 46–61.
- McGraw, K. O., & Wong, S. P. (1996). Forming inferences about some intraclass correlation coefficients. Psychological Methods, 1(1), 30.
- Merz, C. N. B., Dwyer, J., Nordstrom, C. K., Walton, K. G., Salerno, J. W., & Schneider, R. H. (2002). Psychosocial stress and cardiovascular disease: Pathophysiological links. Behavioral Medicine, 27(4), 141–147.
- Miu, A. C., Heilman, R. M., & Miclea, M. (2009). Reduced heart rate variability and vagal tone in anxiety: Trait versus state, and the effects of autogenic training. Autonomic Neuroscience, 145(1–2), 99–103.
- Mokkink, L. B., Prinsen, C., Patrick, D. L., Alonso, J., Bouter, L. M., De Vet, H., & Terwee, C. B. (2019). COSMIN Study Design checklist for Patient-reported outcome measurement instruments. Amsterdam, The Netherlands, 2019, 1–32.
- Nakamura, N. H., Fukunaga, M., & Oku, Y. (2019). Respiratory fluctuations in pupil diameter are not maintained during cognitive tasks. Respiratory Physiology & Neurobiology, 265, 68–75.
- Novakova, B., Harris, P. R., Ponnusamy, A., & Reuber, M. (2013). The role of stress as a trigger for epileptic seizures: A narrative review of evidence from human and animal studies. Epilepsia, 54(11), 1866–1876.
- Pagani, M., Lombardi, F., Guzzetti, S., Sandrone, G., Rimoldi, O., Malfatto, G., Cerutti, S., & Malliani, A. (1984). Power spectral density of heart rate variability as an index of sympatho-vagal interaction in normal and hypertensive subjects. Journal of Hypertension. Supplement: official journal of the International Society of Hypertension, 2(3), S383–385.
- Parnandi, A., & Gutierrez-Osuna, R. (2013). Contactless measurement of heart rate variability from pupillary fluctuations. 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction.
- Peinkhofer, C., Knudsen, G. M., Moretti, R., & Kondziella, D. (2019). Cortical modulation of pupillary function: Systematic review. PeerJ, 7, Article e6882.
- Pereira, V. H., Campos, I., & Sousa, N. (2017). The role of autonomic nervous system in susceptibility and resilience to stress. Current Opinion in Behavioral Sciences, 14, 102–107.
- Pike, J. L., Smith, T. L., Hauger, R. L., Nicassio, P. M., Patterson, T. L., McClintick, J., Costlow, C., & Irwin, M. R. (1997). Chronic life stress alters sympathetic, neuroendocrine, and immune responsivity to an acute psychological stressor in humans. Psychosomatic Medicine, 59(4), 447–457.
- Pinna, G. D., Maestri, R., Torunski, A., Danilowicz-Szymanowicz, L., Szwoch, M., La Rovere, M. T., & Raczak, G. (2007). Heart rate variability measures: A fresh look at reliability. Clinical Science, 113(3), 131–140.
- Reith, F. C., Van den Brande, R., Synnot, A., Gruen, R., & Maas, A. I. (2016). The reliability of the Glasgow Coma Scale: A systematic review. Intensive Care Medicine, 42, 3–15.
- Riganello, F., Garbarino, S., & Sannita, W. G. (2012). Heart rate variability, homeostasis, and brain function. Journal of Psychophysiology. 10.1027/0269-8803/a000080
- Schaefer, M., Edwards, S., Nordén, F., Lundström, J. N., & Arshamian, A. (2023). Inconclusive evidence that breathing shapes pupil dynamics in humans: A systematic review. Pflügers Archiv - European Journal of Physiology, 475(1), 119–137.
- Schiweck, C., Piette, D., Berckmans, D., Claes, S., & Vrieze, E. (2019). Heart rate and high frequency heart rate variability during stress as biomarker for clinical depression. A systematic review. Psychological Medicine, 49(2), 200–211.
- Shaffer, F., McCraty, R., & Zerr, C. L. (2014). A healthy heart is not a metronome: An integrative review of the heart’s anatomy and heart rate variability. Frontiers in Psychology, 5, Article 1040.
- Silver, N. C., & Dunlap, W. P. (1987). Averaging correlation coefficients: Should Fisher’s z transformation be used? Journal of Applied Psychology, 72(1), 146.
- Sole, G., Hamrén, J., Milosavljevic, S., Nicholson, H., & Sullivan, S. J. (2007). Test–retest reliability of isokinetic knee extension and flexion. Archives of Physical Medicine and Rehabilitation, 88(5), 626–631.
- Sollers, J. J., Buchanan, T. W., Mowrer, S. M., Hill, L. K., & Thayer, J. F. (2007). Comparison of the ratio of the standard deviation of the RR interval and the root mean squared successive differences (SD/rMSSD) to the low frequency-to-high frequency (LF/HF) ratio in a patient population and normal healthy controls. Biomedical Sciences Instrumentation, 43, 158–163.
- Sookan, T., & McKune, A. J. (2012). Heart rate variability in physically active individuals: Reliability and gender characteristics: Cardiovascular topics. Cardiovascular Journal of Africa, 23(2), 67–72.
- Thayer, J. F., Åhs, F., Fredrikson, M., Sollers, J. J., III., & Wager, T. D. (2012). A meta-analysis of heart rate variability and neuroimaging studies: Implications for heart rate variability as a marker of stress and health. Neuroscience and Biobehavioral Reviews, 36(2), 747–756.
- Thomas, E. H., Steffens, M., Harms, C., Rossell, S. L., Gurvich, C., & Ettinger, U. (2021). Schizotypy, neuroticism, and saccadic eye movements: New data and meta-analysis. Psychophysiology, 58(1), Article e13706.
- Tichon, J. G., Mavin, T., Wallis, G., Visser, T. A., & Riek, S. (2014). Using pupillometry and electromyography to track positive and negative affect during flight simulation. Aviation Psychology and Applied Human Factors. 10.1027/2192-0923/a000052
- Turnbull, P. R., Irani, N., Lim, N., & Phillips, J. R. (2017). Origins of pupillary hippus in the autonomic nervous system. Investigative Ophthalmology & Visual Science, 58(1), 197–203.
- Umetani, K., Singer, D. H., McCraty, R., & Atkinson, M. (1998). Twenty-four-hour time domain heart rate variability and heart rate: Relations to age and gender over nine decades. Journal of the American College of Cardiology, 31(3), 593–601.
- Vingerhoets, A. J. (1985). The role of the parasympathetic division of the autonomic nervous system in stress and the emotions. International Journal of Psychosomatics.
- Wahn, B., Ferris, D. P., Hairston, W. D., & König, P. (2016). Pupil sizes scale with attentional load and task experience in a multiple-object tracking task. PloS One, 11(12), Article e0168087.
- Watson, D. (2004). Stability versus change, dependability versus error: Issues in the assessment of personality over time. Journal of Research in Personality, 38(4), 319–350.
- Wege, N., Li, J., Muth, T., Angerer, P., & Siegrist, J. (2017). Student ERI: Psychometric properties of a new brief measure of effort-reward imbalance among university students. Journal of Psychosomatic Research, 94, 64–67.
- Weir, J. P. (2005). Quantifying test–retest reliability using the intraclass correlation coefficient and the SEM. Journal of Strength and Conditioning Research, 19(1), 231–240.
- Wekenborg, M. K., von Dawans, B., Hill, L. K., Thayer, J. F., Penz, M., & Kirschbaum, C. (2019). Examining reactivity patterns in burnout and other indicators of chronic stress. Psychoneuroendocrinology, 106, 195–205.
- Wirtz, P. H., & von Känel, R. (2017). Psychological stress, inflammation, and coronary heart disease. Current Cardiology Reports, 19, 1–10.
- Wright, B. J., Wilson, K.-E., Kingsley, M., Maruff, P., Li, J., Siegrist, J., & Horan, B. (2022). Gender moderates the association between chronic academic stress with top-down and bottom-up attention. Attention, Perception, & Psychophysics, 84(2), 383–395.
- Wyller, V. B., Eriksen, H. R., & Malterud, K. (2009). Can sustained arousal explain the chronic fatigue syndrome? Behavioral and Brain Functions, 5(1), 1–10.
- Xin, Y., Yao, Z., Wang, W., Luo, Y., Aleman, A., & Wu, J. (2020). Recent life stress predicts blunted acute stress response and the role of executive control. Stress (Amsterdam, Netherlands), 23(3), 359–367.
- Zerr, C., Kane, A., Vodopest, T., Allen, J., Hannan, J., Cangelosi, A., Owen, D., Fabbri, M., Williams, C., & Cary, B. (2015). The nonlinear index SD1 predicts diastolic blood pressure and HRV time and frequency domain measurements in healthy undergraduates. Applied Psychophysiology and Biofeedback.