Abstract
Trace fear conditioning is an important research paradigm to model aversive learning in biological or clinical scenarios, where predictors (conditioned stimuli, CS) and aversive outcomes (unconditioned stimuli, US) are separated in time. The optimal measurement of human trace fear conditioning, and in particular of memory retention after consolidation, is currently unclear. We conducted two identical experiments (N1 = 28, N2 = 28) with a 15‐s trace interval and a recall test 1 week after acquisition, while recording several psychophysiological observables. In a calibration approach, we explored which learning and memory measures distinguished CS+ and CS− in the first experiment and confirmed the most sensitive measures in the second experiment. We found that in the recall test without reinforcement, only fear‐potentiated startle, but not skin conductance, pupil size, heart period, or respiration amplitude, differentiated CS+ and CS−. During acquisition without startle probes, skin conductance responses and pupil size responses, but not heart period or respiration amplitude, differentiated CS+ and CS−. As a side finding, there was no evidence for extinction of fear‐potentiated startle over 30 trials without reinforcement. These results may be useful to inform future substantive research using human trace fear conditioning protocols.
Keywords: calibration, fear conditioning, fear memory, fear‐potentiated startle, human neuroscience, psychophysiological modeling, retrodictive validity, trace fear memory
Short abstract
Our report generates new insights into the psychophysiology of trace fear conditioning using long trace intervals (15 s) in humans. By employing a strong explorative/confirmative strategy and comparing multiple psychophysiological measurements, we identify skin conductance and pupil size responses as measures that reflect fear acquisition and, in agreement with previous research on delay fear conditioning, show that startle eye‐blink responses reflect fear retention even a week later.
1. INTRODUCTION
Fear conditioning is a standard laboratory model for investigating the neurobiology of aversive learning (LeDoux, 2014), and for preclinical research into the modification of unwanted aversive memory (Milad & Quirk, 2012). In this paradigm, a conditioned stimulus (CS+) is contingently followed by an aversive unconditioned stimulus (US). Humans and many non‐human species learn to predict the US and express this prediction in preparatory behavior (Davis, 1992; Fanselow, 1994). A common instantiation of this approach in non‐human (Kochli et al., 2015) and human (Lonsdorf et al., 2017) experiments is delay fear conditioning, where the CS and US overlap in time. Such overlap, however, is not the case in many biological environments (e.g., lightning precedes thunder). Similarly, in the phenomenology of post‐traumatic stress disorder, remote cues that preceded the trauma can come to elicit arousal and memory intrusions (Ehlers et al., 2002). This situation is modeled by trace fear conditioning where CS and US are separated in time (Mertens et al., 2020; Rescorla, 1988; Sehlmeyer et al., 2009). While delay and trace fear conditioning are procedurally very similar, there are major differences in the neural circuits that support both types of learning, in particular an essential involvement of the hippocampus in trace fear conditioning as revealed in rodent work (Gilmartin et al., 2012).
Hence, it appears important to test candidate procedures for memory modification in trace fear conditioning protocols. However, the most appropriate quantification of the CS‐US association in human trace fear conditioning is currently unclear, partly because it is unknown which conditioned responses are elicited, and at what point in time they are expressed. While it has been observed that trace conditioning produces weaker responses than delay conditioning (Bouton, 2007), it is debated whether this reflects weaker learning (Raybuck & Lattal, 2014). Indeed, it has been suggested that the expression of trace conditioning in behavior might be different from the expression of delay conditioning, and might appear weaker because quantification strategies were optimized for delay conditioning (Raybuck & Lattal, 2014). For example, in delay conditioning, where conditioned responses are well characterized (see Ojala & Bach, 2020 for a comprehensive review), varying the time interval between CS and US onset can shift the expression of the CS‐US association in time (Castegnetti et al., 2016, 2017). This is likely to be of even more importance for trace fear conditioning.
Typical trace intervals in previous human trace fear conditioning studies were on the order of 1–10 s (Sehlmeyer et al., 2009). Neuroimaging has been used to verify hippocampal involvement in such paradigms. Hippocampal fMRI responses are reported both in delay and trace fear conditioning, with no confirmed dependency on the duration of the trace interval (Sehlmeyer et al., 2009). Also, it is not clear whether this hippocampal activation reflects CS‐US conditioning, or other processes such as concurrent episodic memory formation (Fullana et al., 2016). Using a wider range of trace intervals, non‐human animal studies have reported that hippocampal involvement in trace fear conditioning depends on the trace interval (Chowdhury et al., 2005; Misane et al., 2005). Chowdhury et al. (2005) demonstrated that lesions to the dorsal hippocampus attenuated freezing only after training at a 20 s trace interval, but not at a 1–3 s trace interval. Misane et al. (2005) found that injecting the NMDA (N‐methyl‐D‐aspartate) receptor antagonist APV ([2R]‐amino‐5‐phosphonovaleric acid) bilaterally into the dorsal hippocampus before fear conditioning impaired conditioned freezing in an assessment 24 h after training at trace intervals of 15 and 30 s, but not at shorter trace intervals of 1–10 s. Thus, it seems that in rodents, the hippocampus is required for conditioning only with trace intervals longer than 10 s (Chowdhury et al., 2005; Misane et al., 2005), which have not typically been used in humans. Although neurobiological insights garnered from rodent experiments do not always extrapolate to humans, these results motivated a trace interval of 15 s for our experiments. To reduce potential backward conditioning, we used a longer inter‐trial interval of 30 s, during which participants performed an incidental task.
Our paradigm and the observables we recorded were based on delay fear conditioning research. We used visual CS and electric shock as US. Previous work has identified fear‐potentiated startle (Blumenthal et al., 2005) as a measure with high sensitivity to quantify fear memory retention after consolidation (Bach & Melinscak, 2020; Khemka et al., 2017), but the startle probes required for this measure can interfere with fear acquisition (de Haan et al., 2018; Sjouwerman et al., 2016). Hence, we recorded skin conductance, pupil size, electrocardiogram (ECG) and respiration during acquisition. During the recall session, 1 week after acquisition and without US, we elicited startle eye‐blink responses (SEBR), and recorded orbicularis oculi electromyogram (EMG), together with the same measures as during acquisition. As we aimed to apply the methods developed here for memory modification research, which usually employs a separate memory recall session after an intervention, our focus was the quantification of memory retention after overnight consolidation. With this goal in mind, we sought to find the optimal measure for each session, rather than identify measures that are comparable across sessions.
We sought to characterize conditioned responses and develop an analysis scheme in a first exploratory experiment, and confirm these results in a second, independent sample. Our validation criterion was retrodictive validity, i.e., the sensitivity to distinguish CS+ and CS−. This metric, in a given data set, is monotonically (inversely) related to measurement error (Bach et al., 2020; Bach & Melinscak, 2020; Bach, Tzovara, & Vunder, 2018).
2. METHOD
2.1. Participants
Two independent samples of healthy individuals were recruited from the general population. All participants confirmed that they had no history of neurological, psychiatric, or systemic medical conditions, and all had normal or corrected‐to‐normal vision. For Experiment 1 (data set code TFC1) we recorded data from 31 participants. One participant withdrew during acquisition, as they could not tolerate the US. Another participant did not return for the recall session, and one participant had incomplete data due to a technical failure. Thus, we included data from 28 participants in the analysis (7 males, 21–40 years, mean age ± SD: 27.25 ± 5.01 years). For Experiment 2 (data set code TFC2) we recorded data from 30 participants. One participant did not return for the recall session, and an additional participant was excluded due to malfunction of the US delivery. Therefore, we included data from 28 participants in the analysis (13 males, 20–39 years, mean age ± SD: 25.53 ± 4.54 years). The samples in Experiments 1 and 2 did not differ in age (t[54] = 1.34, p = .19, d = 0.36) or gender distribution (χ² = 1.95, p = .16). All participants gave written informed consent before the experiment. The study was conducted in accordance with the Declaration of Helsinki and approved by the governmental research ethics committee (Kantonale Ethikkomission Zürich KEK‐ZH‐2013‐0118).
2.2. Calibration approach and sample size
As in previous work, we followed a calibration strategy, which assumes that participants acquire and retain a CS‐US association, and aims to find the conditioned response quantification with the highest retrodictive validity, that is, the largest effect size for differentiating CS+ and CS−. It has been shown that in such an approach, high retrodictive validity reflects low measurement error (Bach et al., 2020). To determine sample size, our goal was to obtain a robust estimate of population retrodictive validity, rather than achieve high power in a null hypothesis significance test. As no formal framework is available for sample size calculation in this approach, sample size was heuristically based on previous methodological work. Post hoc, the relatively similar effect sizes observed in both experiments suggest that the effect size estimates are reasonably robust.
2.3. Design and stimuli
Both experiments used the same design with an acquisition and a recall session. In the acquisition phase, participants were presented with 40 CS: 20 CS+ with 100% reinforcement, and 20 CS− that predicted the absence of the US. In the recall session 1 week later, US electrodes were attached, and participants were presented with the same CS 30 times: 15 CS+ and 15 CS−. None of the CS were reinforced, and a startle probe was delivered on each trial 13 s after CS offset (i.e., 2 s before the expected US delivery), both in CS+ and CS− trials. For both experimental phases, CS order was randomized for each participant, with the constraint that maximally three CS of the same type were presented consecutively.
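The constrained randomization of CS order can be illustrated with a short sketch. It assumes simple rejection sampling (reshuffle until no run of identical CS types exceeds three); the text does not state which algorithm was actually used, and the function names here are hypothetical.

```python
import random

def max_run_length(sequence):
    """Return the length of the longest run of identical consecutive items."""
    longest = current = 0
    previous = object()  # sentinel that compares unequal to every label
    for item in sequence:
        current = current + 1 if item == previous else 1
        longest = max(longest, current)
        previous = item
    return longest

def randomize_cs_order(n_plus, n_minus, max_run=3, rng=None):
    """Shuffle CS+/CS- labels until no run exceeds max_run (rejection sampling).

    Assumption: the source only states the constraint, not the algorithm.
    """
    rng = rng or random.Random()
    order = ["CS+"] * n_plus + ["CS-"] * n_minus
    while True:
        rng.shuffle(order)
        if max_run_length(order) <= max_run:
            return list(order)

# Trial counts taken from the text: 20/20 in acquisition, 15/15 in recall.
acquisition_order = randomize_cs_order(20, 20, rng=random.Random(1))
recall_order = randomize_cs_order(15, 15, rng=random.Random(2))
```

Rejection sampling keeps every admissible sequence equally likely, which a greedy trial-by-trial construction would not guarantee.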
CS were differently colored isoluminant triangles (yellow, RGB: 225, 224, 177; purple, RGB: 238, 194, 244) of ~4.1° visual angle, presented for 2 s at the center of an isoluminant gray (RGB: 175, 175, 175) computer screen. CS‐color relation was counterbalanced across participants. During the 15‐s trace interval, a white (RGB: 255, 255, 255) fixation cross of ~0.8° visual angle was presented at the center of the gray background screen. Attentiveness and stimulus recognition might be especially relevant for trace conditioning (Han et al., 2003); therefore, participants were asked to indicate the color of the CS by pressing the left/right cursor keys on a standard computer keyboard during CS presentation. If participants gave the wrong response, the words “wrong key” were presented immediately after CS offset for 1 s; if they gave no response during CS presentation, the words “no response” were presented instead.
During a 30‐s ITI, with a ±2‐s jitter, participants were instructed to perform a simple visual detection task, in order to reduce backward conditioning, as well as drowsiness due to the long ITIs (de Haan et al., 2018). We presented a stream of 13 white (RGB: 255, 255, 255) single digits with a red (RGB: 255, 0, 0) fixation cross embedded in the stream, at a rate of 1 Hz with a presentation time of 0.2 s. Participants were asked to respond with a keypress to the red cross. The onset of this task was randomized between 5 and 10 s after US offset.
After both experimental sessions, participants were asked to rate CS‐US contingency for both CS from 0 to 100. In Experiment 2 only, participants were additionally asked after the recall session to indicate the shock contingency for each CS in the acquisition session.
2.4. Settings and equipment
US were trains of 83 square electric pulses of 0.2 ms duration with a duty cycle of 1.67%, resulting in a total US duration of approximately 1000 ms. Electric pulses were delivered on the participants' dominant forearm with a pin‐cathode/ring anode configuration with a constant current stimulator (Digitimer DS7A, Digitimer, Welwyn Garden City, UK). The stimulus was set to a perceived intensity of approximately 90% of a clearly painful stimulus. This pain threshold was estimated in two phases. In the first phase, the intensity was increased from an imperceptible to a painful level, thereby defining an upper limit for the second phase. In the second phase, participants were asked to rate the subjectively perceived intensity of 14 stimuli with different intensities. These ratings were linearly interpolated to estimate the intensity corresponding to 90% of the clearly painful stimulus. This estimate was then used in the experiment. This resulted in currents of 1.35–48.40 mA (mean ± SD: 4.84 ± 8.67 mA) in Experiment 1 and 1.30–7.39 mA (mean ± SD: 4.05 ± 1.52 mA) in Experiment 2.
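As a quick consistency check on the pulse train parameters, the following sketch derives the pulse period and total train duration from the reported pulse width, duty cycle, and pulse count. It assumes that duty cycle is defined as pulse width divided by pulse period; the function name is hypothetical.

```python
PULSE_WIDTH_MS = 0.2   # from the text
DUTY_CYCLE = 0.0167    # 1.67 %, from the text
N_PULSES = 83          # from the text

def pulse_train_timing(pulse_width_ms, duty_cycle, n_pulses):
    """Derive period, rate and total duration of a rectangular pulse train.

    Assumption: duty cycle = pulse width / pulse period.
    """
    period_ms = pulse_width_ms / duty_cycle
    return {
        "period_ms": period_ms,
        "rate_hz": 1000.0 / period_ms,
        "train_duration_ms": n_pulses * period_ms,
    }

timing = pulse_train_timing(PULSE_WIDTH_MS, DUTY_CYCLE, N_PULSES)
print(timing)
```

Under this assumption the pulse period is about 12 ms and the train lasts about 994 ms, in line with the stated duration of approximately 1000 ms.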
Startle probes were 20‐ms, instantaneous rise time, white noise sounds of 102 dB loudness, delivered binaurally with headphones (HD 202, Sennheiser, Wedemark–Wennebostel, Germany).
The experiment was presented on a Dell P2014h 20″ screen, set to an aspect ratio of 4:3 at 60 Hz, with a resolution of 1152 × 864 pixels. The experiments took place in a dark, soundproof chamber. Participants were positioned on a chin rest at 70 cm distance from the monitor and 47 cm from the eyetracker.
2.5. Psychophysiological recordings
We recorded the EMG from the orbicularis oculi muscle of participants' left eye with two 4 mm Ag/AgCl cup electrodes filled with high‐conductance gel. One was placed below the lower eyelid on the muscle in a line with the pupil in forward gaze, the other below the lateral canthus, at approximately 1–2 cm distance (Blumenthal et al., 2005). The electromyogram was amplified with a gain of 2000 and filtered with a band‐pass filter at 1 and 500 Hz (EMG100C, Biopac Systems). To measure skin conductance responses (SCR), disposable Ag/AgCl snap electrodes (EL507, Biopac Systems), filled with 0.5% NaCl (Hygge & Hugdahl, 1985) electrolyte gel (GEL101, Biopac Systems), were placed on the thenar/hypothenar of the non‐dominant hand. A ground electrode was additionally placed on the left elbow. SCR were measured with a 0.5 V constant voltage coupler/amplifier (EDA100C, Biopac Systems). To measure ECG, pre‐gelled disposable Ag/AgCl snap electrodes (01‐7500, TIGA‐MED) were placed on both wrists and above the right ankle. Lead I configuration was generated and amplified (ECG100C, Biopac Systems). Respiration was measured with a single‐belt cushion system (RSP100C, Biopac Systems). All signals were digitized (MP160, Biopac Systems) at 2000 Hz and recorded (AcqKnowledge, Biopac Systems).
Pupil diameter and gaze direction for both eyes were recorded with an EyeLink 1000 System (SR Research) at a sampling rate of 500 Hz. To calibrate gaze direction, we used the nine‐point protocol implemented in the EyeLink 1000 software.
2.6. Data analysis
Pre‐processing and analysis of psychophysiological data were performed in MATLAB (Version R2018a, MathWorks) using procedures implemented in PsPM 4.1.1 (Psychophysiological modeling, http://pspm.sourceforge.net), a MATLAB toolbox for model‐based analysis of psychophysiological data (Bach & Friston, 2013; Bach, Tzovara, et al., 2018), and in R 4.0.2 (R Core Team, 2020). For pre‐processing of pupil data only, we used PsPM 5.1.0 (bachlab.github.io/pspm).
2.6.1. Data pre‐processing
EMG
EMG pre‐processing for quality control followed the procedure developed by Khemka et al. (2017). Data were filtered with a 4th order Butterworth filter with 50 and 470 Hz cut‐off frequencies. To remove mains noise, we used a 50 Hz notch filter. Data were rectified and then smoothed with a 4th order Butterworth low‐pass filter with a time constant of 3 ms. Pre‐processed EMG was visually inspected. Three participants in Experiment 1 and two in Experiment 2 had no discernible average SEBR over all trials and were excluded from further EMG analysis.
Skin conductance
Data of some participants showed artifacts from the electric stimulation used as US. To account for this, the period from 0.2 s before US onset to 1.6 s after US onset was treated as missing values for all analyses. Two participants in Experiment 1 and one participant in Experiment 2 showed no visually discernible average skin conductance response to all US and were excluded from further SCR analyses.
Pupil size
We first converted from the EyeLink 1000 system's arbitrary units to true diameter using the transform derived in Hayes and Petrov (2016). Data were then pre‐processed following the procedure by Kret and Sjak‐Shie (2019) as implemented in PsPM 5.1.0. In brief, this determines valid samples by range, speed, edge, trendline, and isolated sample filtering and then smoothes valid data by filtering, interpolation, and combination of data from both eyes. Finally, data points for which gaze direction was outside ±5° visual angle were treated as missing data points. For time‐course analysis and visualization, data were interpolated and smoothed with a moving average of 1 s width. Trials with more than 50% missing data between CS onset and US onset were excluded from the analysis.
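The last two steps of this paragraph (gaze‐based masking and trial exclusion) can be sketched as follows. This is a minimal illustration, not the PsPM implementation: it assumes gaze is expressed in degrees of visual angle relative to screen center and that the ±5° limit is applied to radial eccentricity, which the text does not specify. All function names are hypothetical.

```python
import math

def mask_by_gaze(pupil, gaze_x_deg, gaze_y_deg, limit_deg=5.0):
    """Replace pupil samples with NaN where gaze is farther than limit_deg from center.

    Assumption: the limit is applied to radial eccentricity; a per-axis
    criterion would be an equally plausible reading of the text.
    """
    nan = float("nan")
    return [p if math.hypot(x, y) <= limit_deg else nan
            for p, x, y in zip(pupil, gaze_x_deg, gaze_y_deg)]

def missing_fraction(samples):
    """Fraction of samples that are missing (None or NaN)."""
    if not samples:
        return 1.0
    n_missing = sum(1 for s in samples if s is None or math.isnan(s))
    return n_missing / len(samples)

def keep_trial(samples, max_missing=0.5):
    """Keep a trial unless more than max_missing of its samples are missing."""
    return missing_fraction(samples) <= max_missing

# Toy example: three pupil samples, two of them recorded off fixation.
masked = mask_by_gaze([3.0, 3.1, 3.2], [0.0, 6.0, 3.0], [0.0, 0.0, 4.5])
print(masked, keep_trial(masked))
```

In the experiment, the samples passed to `keep_trial` would span CS onset to US onset, that is, 17 s of data per trial.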
ECG
ECG pre‐processing followed Paulus et al. (2016) using the standard procedures implemented in PsPM. This uses a modified offline implementation of the real‐time QRS detection algorithm by Pan and Tompkins (1985). Inter‐beat intervals (IBIs) were then assigned to the following heartbeat, while rejecting IBI values outside the interval 600–1200 ms (corresponding to 50–100 bpm). Heart period was linearly interpolated with 10 Hz sampling frequency and filtered with a fourth order bidirectional bandpass Butterworth filter (cut‐off frequencies: 0.015–0.5 Hz).
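The conversion from detected heartbeats to an evenly sampled heart period series can be sketched as below. This is an illustrative simplification rather than the PsPM code: QRS detection and the band‐pass filter are omitted, R‐peak times are assumed to be given in seconds, and the function names are hypothetical.

```python
def heart_period_series(r_peak_times_s, lo_ms=600.0, hi_ms=1200.0):
    """Assign each IBI to the heartbeat that ends it; drop out-of-range IBIs."""
    times, periods = [], []
    for previous, current in zip(r_peak_times_s, r_peak_times_s[1:]):
        ibi_ms = (current - previous) * 1000.0
        if lo_ms <= ibi_ms <= hi_ms:  # 600-1200 ms corresponds to 100-50 bpm
            times.append(current)
            periods.append(ibi_ms)
    return times, periods

def interpolate_linear(times, values, rate_hz=10.0):
    """Resample (times, values) on a regular grid by linear interpolation."""
    if len(times) < 2:
        raise ValueError("need at least two valid heartbeats")
    step = 1.0 / rate_hz
    n_samples = int((times[-1] - times[0]) * rate_hz + 1e-9) + 1
    grid, out, j = [], [], 0
    for i in range(n_samples):
        t = times[0] + i * step
        # advance to the segment [times[j], times[j + 1]] that contains t
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        w = (t - times[j]) / (times[j + 1] - times[j])
        grid.append(t)
        out.append(values[j] + w * (values[j + 1] - values[j]))
    return grid, out

# Toy example: the 400 ms interval ending at 2.0 s is rejected as out of range.
beat_times, heart_periods = heart_period_series([0.0, 0.8, 1.6, 2.0, 3.0, 3.8])
grid, hp_10hz = interpolate_linear(beat_times, heart_periods)
```

Rejected intervals simply leave a gap that the linear interpolation bridges between the neighboring valid heartbeats.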
Respiration amplitude
Respiration data pre‐processing followed Bach et al. (2016) and Castegnetti et al. (2017) using the standard procedures implemented in PsPM. This algorithm detects the beginning of inspiration cycles and assigns cycle amplitude to the beginning of the next cycle. These data are then interpolated with 10 Hz sampling frequency. No additional filtering was implemented.
2.6.2. Time course analysis
In a first analysis, we took a model‐free perspective to investigate the time course of conditioned responses in this paradigm. To confirm CS+/CS− differences, we used a cluster‐level random permutation test. We extracted individual trial data (0–17 s after CS onset), resampled at 10 Hz, for skin conductance, pupil size, heart period, and respiration amplitude. Missing data in SCR (due to US artifacts) and pupil size (due to loss of fixation) were linearly interpolated. Data were averaged over trials, separately for each condition per participant. We computed a paired t‐test for CS+ versus CS−, for each datapoint in the trial‐average time course, resulting in 170 t‐tests. We then permuted trial labels 10,000 times within each participant, each time averaging over trials and performing the same 170 t‐tests. We used a cluster‐level correction for multiple comparisons (Maris & Oostenveld, 2007), which controls the false positive rate for the statement that there is a condition difference anywhere within the time course. For illustration, we identified and visualized significant clusters within the trial‐average time course.
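The logic of the cluster‐level permutation test can be sketched in plain Python. Several details are assumptions because the text does not specify them: the cluster statistic is taken to be the summed t‐value over contiguous same‐sign supra‐threshold points, the cluster‐forming threshold is passed in by hand (2.365 is the two‐sided .05 critical t for 7 degrees of freedom), and the p‐value uses a +1 correction. The demo runs on simulated data with 20 time points rather than 170, and all names are hypothetical.

```python
import math
import random

def paired_t(diffs):
    """t statistic of participant-wise CS+ minus CS- differences against zero."""
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0.0:
        return 0.0  # degenerate case, not expected with real data
    return mean / math.sqrt(var / n)

def condition_difference(trials, labels):
    """Mean CS+ time course minus mean CS- time course for one participant."""
    n_time = len(trials[0])
    sums = {"CS+": [0.0] * n_time, "CS-": [0.0] * n_time}
    counts = {"CS+": 0, "CS-": 0}
    for trial, label in zip(trials, labels):
        counts[label] += 1
        for k, value in enumerate(trial):
            sums[label][k] += value
    return [sums["CS+"][k] / counts["CS+"] - sums["CS-"][k] / counts["CS-"]
            for k in range(n_time)]

def t_course(data):
    """One paired t-value per time point across participants."""
    diffs = [condition_difference(trials, labels) for trials, labels in data]
    return [paired_t([d[k] for d in diffs]) for k in range(len(diffs[0]))]

def max_cluster_mass(t_values, threshold):
    """Largest absolute summed t over runs of same-sign supra-threshold points."""
    best = mass = 0.0
    sign = 0
    for t in t_values:
        s = (t > threshold) - (t < -threshold)
        if s != 0 and s == sign:
            mass += t
        else:
            mass, sign = (t, s) if s != 0 else (0.0, 0)
        best = max(best, abs(mass))
    return best

def cluster_permutation_test(data, threshold, n_perm=1000, seed=0):
    """Permute trial labels within participants; compare maximum cluster masses."""
    rng = random.Random(seed)
    observed = max_cluster_mass(t_course(data), threshold)
    exceed = 0
    for _ in range(n_perm):
        shuffled = []
        for trials, labels in data:
            permuted = list(labels)
            rng.shuffle(permuted)
            shuffled.append((trials, permuted))
        if max_cluster_mass(t_course(shuffled), threshold) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)

def simulate(n_participants=8, n_trials=10, n_time=20, seed=1):
    """Simulated data: CS+ trials carry an added response at time points 8-13."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_participants):
        labels = ["CS+", "CS-"] * (n_trials // 2)
        trials = [[rng.gauss(0.0, 0.5)
                   + (1.0 if label == "CS+" and 8 <= k < 14 else 0.0)
                   for k in range(n_time)] for label in labels]
        data.append((trials, labels))
    return data

observed_mass, p_value = cluster_permutation_test(simulate(), threshold=2.365, n_perm=200)
```

Because only the maximum cluster mass per permutation enters the null distribution, a single test covers the whole time course, which is what controls the false positive rate for the statement that a condition difference exists somewhere.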
2.6.3. Psychophysiological modeling
The second set of analyses built on psychophysiological models, that is, forward models that specify how CS‐US association is expressed in behavior. These models were then inverted, which yields, separately for each trial or condition, an estimate of the extent to which CS‐US association is expressed in behavior; these estimates were then compared between conditions.
2.6.3.1. Startle eye‐blink response
For SEBR, two different methods were used. First, we used a general linear model, which quantifies for each trial an amplitude of the SEBR by linear regression of the EMG data onto a canonical SEBR with variable onset (Khemka et al., 2017). This method used the pre‐processing strategy detailed above. Secondly, we employed a peak scoring method (Balderston et al., 2017) in the version used by Khemka et al. (2017). To this end, we high‐pass filtered the raw EMG signal with a 4th order Butterworth filter at 30 Hz and applied a notch filter to remove 50 Hz mains noise. Filtered EMG data were rectified and smoothed using a 20‐ms moving average. The peak startle amplitude for each trial was then defined as the maximum EMG amplitude between 20 and 100 ms after startle sound onset as determined from recordings of the audio output.
For both analyses, each participant's amplitude estimate was normalized by dividing through the mean SEBR in this participant's CS− trials (Bach et al., 2019; Bach, Castegnetti, et al., 2018) to correct for differences in electrode impedance and muscle anatomy.
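The peak scoring and normalization steps can be sketched as follows. The sketch omits the high‐pass and notch filters and starts from an already filtered signal; it assumes the 2000 Hz sampling rate reported above and a centered moving average (the alignment of the smoothing window is not stated in the text). Function names are hypothetical.

```python
def moving_average(signal, width):
    """Centered moving average; the window is truncated at the signal edges."""
    half = width // 2
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - half):i + half + 1]
        out.append(sum(window) / len(window))
    return out

def peak_score(emg, onset_index, rate_hz=2000, smooth_ms=20, window_ms=(20, 100)):
    """Rectify, smooth, and take the maximum 20-100 ms after startle sound onset."""
    rectified = [abs(x) for x in emg]
    smoothed = moving_average(rectified, int(smooth_ms * rate_hz / 1000))
    start = onset_index + int(window_ms[0] * rate_hz / 1000)
    stop = onset_index + int(window_ms[1] * rate_hz / 1000)
    return max(smoothed[start:stop + 1])

def normalize_by_cs_minus(amplitudes, labels):
    """Divide every amplitude by the participant's mean CS- amplitude."""
    cs_minus = [a for a, label in zip(amplitudes, labels) if label == "CS-"]
    reference = sum(cs_minus) / len(cs_minus)
    return [a / reference for a in amplitudes]

# Toy trial: 200 ms at 2000 Hz, sound onset at sample 100, EMG burst 60-80 ms later.
trial = [0.1 if i % 2 == 0 else -0.1 for i in range(400)]
for i in range(220, 261):
    trial[i] = 2.0 if i % 2 == 0 else -2.0
peak = peak_score(trial, onset_index=100)
normalized = normalize_by_cs_minus([4.0, 2.0, 6.0, 2.0], ["CS+", "CS-", "CS+", "CS-"])
```

After normalization, a value of 1 corresponds to the participant's average CS− response, so CS+ values above 1 indicate potentiation.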
2.6.3.2. Skin conductance responses
SCR pre‐processing and analysis extended the procedure developed by Staib et al. (2015). Data were filtered with a first order uni‐directional band‐pass Butterworth filter (0.0159–5 Hz) and downsampled to 10 Hz. We then employed two analysis methods. Firstly, we used the standard non‐linear model implemented in PsPM. This provides trial‐by‐trial estimates of sudomotor bursts, modeled as Gaussian bump functions (Bach et al., 2010). We defined three bursts, two with constant latency after CS and US time points, and one with estimated latency (but fixed dispersion) during the trace interval, from 8 s after CS offset to 1 s before US onset. Each participant's amplitude estimate was normalized by dividing through the mean estimate in this participant's CS− trials (Bach et al., 2019; Bach, Castegnetti, et al., 2018). Secondly, we summarized SCR during the trace interval by computing the area under the curve (AUC) from CS onset to US timepoint using the PsPM module for spontaneous fluctuations (SF) (Bach et al., 2010). This analysis did not reveal any additional insights and is not included in the results.
2.6.3.3. Pupil size responses
Previous work on delay fear conditioning with variable CS‐US interval has suggested that the anticipatory pupil size response (PSR) to CS+ is time‐locked to CS onset (Korn et al., 2017), but only delay fear conditioning procedures were tested. Here, we first estimated the anticipatory pupil response using PsPM's standard single‐trial GLM (Korn et al., 2017). Secondly, we built a specific response function on the pupil size time‐course during the acquisition session of Experiment 1. This model was then inverted again using the single‐trial GLM (Korn et al., 2017). For both inversions, we normalized each participant's amplitude estimates by dividing through the mean estimate in this participant's CS− trials.
2.6.3.4. Heart period responses
In delay fear conditioning with variable CS‐US interval, the expression of fear‐conditioned bradycardia appeared to be time‐locked to the US (Castegnetti et al., 2016, 2017). However, US onset and CS offset coincided in these experiments, and so it is unclear how the model should be adapted for trace conditioning. Hence, we first inverted the US‐locked model, and secondly built a specific response function on the heart period time course during the recall session of Experiment 1. This model was then inverted using the standard GLM approach as in Castegnetti et al. (2016).
2.6.3.5. Respiration amplitude responses
For respiration amplitude responses (RAR), previous work provided some weak evidence that the expression of US anticipation may be time‐locked to the US (Castegnetti et al., 2017), although this was less clear than for heart period responses (HPR). Our time‐course analysis identified no appreciable CS+/CS− difference such that no new forward model could be identified. In an exploratory analysis, we used PsPM's standard condition‐wise GLM with early and late response functions (Castegnetti et al., 2017).
2.6.3.6. Statistical analysis
Statistical analysis was performed in R (www.r‐project.org), version 4.0.2. For the acquisition session, we averaged response estimates across all CS+ and CS− trials separately and computed a paired t‐test for the CS+/CS− difference. To account for potential extinction during the recall session, the time course of which is unknown, we averaged the first n CS+ and CS− trials, with n ranging from 1 to 15, and then performed 15 paired t‐tests for Experiment 1. The average with the highest effect size was then confirmed in Experiment 2. t‐tests were performed using the function t.test() of the package “stats” version 4.0.2; Cohen's d was estimated using cohensD() of the package “lsr” version 0.5.
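The cumulative averaging scheme for the recall session can be sketched as follows. This is an illustration in Python rather than the R code used in the study. It assumes that Cohen's d is computed as the mean paired difference divided by the standard deviation of the differences; the text names cohensD() but not the method passed to it. The data are made up for the example.

```python
import math

def paired_stats(x, y):
    """Paired t statistic, degrees of freedom, and Cohen's d for paired samples.

    Assumption: d = mean difference / SD of differences.
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean = sum(diffs) / n
    sd = math.sqrt(sum((d - mean) ** 2 for d in diffs) / (n - 1))
    return {"t": mean / (sd / math.sqrt(n)), "df": n - 1, "d": mean / sd}

def cumulative_condition_tests(cs_plus, cs_minus):
    """For n = 1..N, average the first n trials per condition and test CS+ vs CS-.

    cs_plus[p][k] is the response of participant p on the k-th CS+ trial.
    """
    n_trials = len(cs_plus[0])
    results = []
    for n in range(1, n_trials + 1):
        mean_plus = [sum(trials[:n]) / n for trials in cs_plus]
        mean_minus = [sum(trials[:n]) / n for trials in cs_minus]
        results.append((n, paired_stats(mean_plus, mean_minus)))
    return results

# Made-up responses for 4 participants and 3 trials per condition.
cs_plus = [[2.0, 1.5, 1.0], [1.8, 1.6, 1.2], [2.2, 1.4, 0.9], [1.5, 1.7, 1.1]]
cs_minus = [[1.0, 1.0, 1.0], [1.2, 1.1, 0.9], [0.9, 1.3, 1.0], [1.1, 0.8, 1.2]]
results = cumulative_condition_tests(cs_plus, cs_minus)
best_n, best = max(results, key=lambda item: item[1]["d"])
```

Selecting the best n in one sample inflates its effect size, which is why the selected average is then tested again in an independent sample.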
Because SEBR and SCR response estimates habituate over time, independently of extinction processes, and the sequence of CS+ and CS− is determined randomly, it might be more sensitive to analyze responses in a linear mixed effects (LME) model with trial number as predictor across conditions. To this end, we used linear mixed effects modeling as implemented in the R package “lme4” (version 1.1.23) with the function lmer(). For SEBR, we modeled habituation by an exponential decay. The decay parameter was fitted to data from all trials of Experiment 1, based on ordinary least squares, and then used for analysis of Experiments 1 and 2. For SCR, we used a linear decay. We tested for the effects of trial and condition (i.e., CS type), and their interaction. F‐statistics were determined using the anova() function from the package “car” version 3.0.9, p‐values with the pf() function of the package “stats” version 4.0.2. We fitted two models with different random effects structures (random intercept only, or random effects for intercept and trial order) on data from Experiment 1, and selected the model with the smallest Akaike Information Criterion. For all LME model analyses, the model accounting for random effects for subject and trial was selected (formula: lmer(data ~ [1 + trial | subject] + trial * condition)). For Experiment 2, only the model selected in Experiment 1 was fitted. Effect sizes for LME were calculated using the R package “effectsize” version 0.5 with the function eta_squared(); confidence intervals were computed using the package “MBESS” version 4.8.1 with the function ci.pvaf(), both for SEBR and SCR.
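How a decay parameter can be fitted by ordinary least squares is sketched below. The parametrization (response = intercept + scale · exp(−rate · trial)) and the grid search over candidate rates are assumptions made for illustration; the text only states that an exponential decay was fitted by ordinary least squares. The mixed model itself is not reproduced here, and all names are hypothetical.

```python
import math

def ols_line(x, y):
    """Closed-form simple linear regression; returns intercept, slope, and SSE."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, sse

def fit_decay(trials, responses, rates):
    """Pick the decay rate whose exponential predictor minimizes the squared error.

    Assumed model: response = intercept + scale * exp(-rate * trial).
    """
    best = None
    for rate in rates:
        predictor = [math.exp(-rate * t) for t in trials]
        intercept, scale, sse = ols_line(predictor, responses)
        if best is None or sse < best["sse"]:
            best = {"rate": rate, "intercept": intercept, "scale": scale, "sse": sse}
    return best

# Noise-free toy data generated with rate 0.2, intercept 1.0, and scale 3.0.
trial_numbers = list(range(1, 31))
toy_responses = [1.0 + 3.0 * math.exp(-0.2 * t) for t in trial_numbers]
fit = fit_decay(trial_numbers, toy_responses, [0.01 * i for i in range(1, 101)])
habituation_predictor = [math.exp(-fit["rate"] * t) for t in trial_numbers]
```

Once the rate is fixed, the transformed trial number (`habituation_predictor` here) can enter a linear model as an ordinary predictor, which keeps the mixed model linear in its parameters.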
For robustness analysis of SEBR results, we additionally performed an ANOVA with trial as a predictor within condition, using the R function aov() of the “stats” package version 4.0.2. Effect sizes were calculated using “effectsize” version 0.5 with the functions eta_squared() and cohens_f(), and confidence intervals were computed using the package “MBESS” version 4.8.1 with the function ci.pvaf().
3. RESULTS
3.1. Time‐course analysis
We first analyzed conditioned responses in a model‐free way to ask how and when the anticipation of US might be expressed in psychophysiological signals. Notably, this analysis averaged over all trials such that it may be less suited to find conditioned responses with variable timing over trials or participants. Furthermore, we did not include SEBR in this analysis as they are always locked to the startle probe. Although condition differences were visible in pupil size and heart period (Figure 1), none of these were significant after correction for multiple comparisons across data points. No condition differences were found for skin conductance and respiration amplitude in Experiment 1.
FIGURE 1.

Averaged time course of pupil size, heart period data and skin conductance. Red line and shaded area: CS+ trials ± SEM, blue line and shaded area: CS− trials ± SEM, green section: CS presentation, yellow section: US presentation, black line: startle sound presentation.
3.2. Analysis of response estimates
Next, we built on psychophysiological models that specify how US expectation is expressed in conditioned responses, and derive, for each trial or condition, an estimate of US expectation.
3.2.1. Startle eye‐blink responses
First, we separately averaged the first 1–15 CS+ and CS− trials for each participant and performed 15 paired t‐tests to compare the conditions (see supplementary information). For the GLM estimates, there were no significant condition differences in either of the two experiments. For the peak‐scoring estimates in Experiment 1, the CS+/CS− difference in SEBR was most pronounced when averaging over all 15 trials (t[24] = 2.19, p = .038, Cohen's d = 0.44). This result was confirmed in Experiment 2 (t[25] = 2.25, p = .033, d = 0.44). Effect sizes from all comparisons are summarized in Table 1, and results are illustrated in Figure 2.
TABLE 1.
SEBR results, averaged over all trials of the recall session, for Experiments 1–2 for peak scoring and model inversion analysis
| Experiment | Method | Cohen's d | Hedges' g |
|---|---|---|---|
| Experiment 1 | Peak scoring | 0.44 | 0.42 |
| Experiment 1 | GLM | 0.35 | 0.34 |
| Experiment 2 | Peak scoring | 0.44 | 0.43 |
| Experiment 2 | GLM | 0.37 | 0.36 |
FIGURE 2.

Scatterplot of CS+/CS− differences in SEBR and SCR. (a). Normalized SEBR (peak scored) difference between CS+/CS− in the recall session, averaged over all trials. (b). Normalized SCR difference between CS+/CS− at CS presentation in the acquisition session. (c). Normalized SCR difference between CS+/CS− during the trace interval in the acquisition session. Average values are depicted with a horizontal line, SEM with a vertical line, scatterplots show individual values, asterisks denote significant difference (p < .05) from 0.
For GLM estimates, a linear mixed effects model with trial as a factor across conditions and random effects accounting for trial and subject intercept revealed response habituation over trials (main effect trial: F[1, 722] = 47.45, p < .001, η² = 0.66) and larger responses in CS+ compared to CS− trials (main effect condition: F[1, 722] = 4.68, p = .031, η² = 0.01) with no interaction (F[1, 722] = 1.97, p = .161, η² < 0.01) in Experiment 1. Both the effect of trial (F[1, 751] = 28.80, p < .001, η² = 0.42) and the effect of condition (F[1, 751] = 8.74, p = .003, η² < 0.01) were confirmed in Experiment 2; the interaction (F[1, 751] = 0.08, p = .772, η² < 0.01) was again not significant.
For peak‐scoring estimates, the LME model showed response habituation over trials (main effect trial: F[1, 722] = 61.84, p < .001, η² = 0.72) and larger responses in CS+ compared to CS− trials (main effect condition: F[1, 722] = 5.48, p = .020, η² = 0.01) with no interaction (F[1, 722] = 1.92, p = .167, η² < 0.01) in Experiment 1. This was replicated in Experiment 2 for trial (F[1, 751] = 80.03, p < .001, η² = 0.76) and condition (F[1, 751] = 6.04, p = .014, η² = 0.01), while the interaction was again not significant (F[1, 751] = 1.08, p = .30, η² < 0.01). To verify robustness of the LME model results, we also conducted an ANOVA on the peak‐scoring estimates, with trial number within conditions as factor. This confirmed significant main effects of CS (F[1, 690] = 4.77, p = .029, η² = 0.01, Cohen's f = 0.08) and trial number (F[29, 690] = 7.43, p < .001, η² = 0.23, f = 0.56) with no interaction (F[29, 690] = 0.80, p = .760, η² = 0.02, f = 0.18). This was further confirmed in Experiment 2 (CS: F[1, 720] = 7.20, p = .008, η² = 0.01, f = 0.10; trial number: F[29, 720] = 10.05, p < .001, η² = 0.28, f = 0.64; interaction: F[29, 720] = 0.95, p = .506, η² = 0.03, f = 0.20).
3.2.2. Skin conductance responses
Averaged across the entire acquisition session, estimated SCR amplitudes in Experiment 1 were higher in CS+ compared to CS− trials, both for the timepoint of CS presentation (t[25] = 3.14, p = .004, d = 0.22) and during the trace interval (t[25] = 3.87, p = .001, d = 0.43) respectively. This result was confirmed in Experiment 2 for the timepoint of CS presentation (t[26] = 3.12, p = .004, d = 0.20) and the trace interval (t[26] = 3.62, p = .001, d = 0.32).
For the recall phase in Experiment 1, paired t‐tests revealed no significant (p < .050) difference between CS+/CS− trials during CS presentation or the trace interval, irrespective of the number of trials included in the analysis (see supplementary information). For the time point of US presentation, CS− amplitudes were significantly higher than CS+ amplitudes when averaged over 1–11, 1–12 or 1–13 trials, but not when all trials were included, with the largest effect size for 12 trials. However, this result was not confirmed in Experiment 2, nor did any other significant condition difference emerge in Experiment 2.
Linear mixed‐effects modeling of the normalized data during the recall phase in Experiment 1 revealed significant habituation (main effect trial: F[1, 809] = 35.28, p < .001, η² = 0.57), larger CS+ than CS− responses (main effect condition: F[1, 809] = 5.53, p = .019, η² = 0.02) and an interaction (F[1, 809] = 10.94, p < .001, η² = 0.01) during the trace interval. However, while the effect of trial was replicated in Experiment 2 (F[1, 780] = 21.01, p < .001, η² = 0.45), the effect of condition was not confirmed (F[1, 780] = 0.190, p = .17, η² < 0.01), nor was the interaction (F[1, 780] = 1.55, p = .21, η² < 0.01). A combined analysis of both data sets again confirmed the effect of trial (F[1, 1563] = 42.84, p < .001, η² = 0.45) but not the effect of condition (F[1, 1563] = 3.80, p = .051, η² < 0.01) or the interaction (F[1, 1563] = 3.10, p = .078, η² < 0.01).
3.2.3. Pupil dilation
Estimated CS‐locked anticipatory PSR, averaged across the entire acquisition session, showed a significant difference between CS+/CS− trials in Experiment 1 (t[27] = 3.10, p = .004, d = 0.59). This result was not confirmed in Experiment 2 (t[27] = 1.34, p = .191, d = 0.25). For the recall session, none of the paired t‐tests between CS+/CS− trials reached significance (p < .050) in either experiment, irrespective of how many trials were included in the analysis (see supplementary information).
Estimated pupil dilation, fitted to the time course of Experiment 1, showed a larger response to CS+ than to CS− trials in the acquisition session (t[27] = 8.20, p < .001, d = 1.55). This result was replicated (without re‐fitting the pupil response) in Experiment 2 (t[27] = 5.94, p < .001, d = 1.12). Paired‐sample t‐tests for the recall session were not significant in either experiment, irrespective of how many trials were included.
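The pupil effect sizes above can be cross‐checked against the t statistics. The sketch below assumes that d is the paired‐samples effect size d_z = t / sqrt(n), with n = df + 1 = 28; this is an inference from the reported numbers, not a documented step of the analysis, and the function name is hypothetical. The SCR effect sizes in Section 3.2.2 do not follow this relation, so they were presumably computed differently.

```python
import math


def cohens_dz(t_value: float, n_pairs: int) -> float:
    """Paired-samples effect size implied by a t statistic: d_z = t / sqrt(n)."""
    return t_value / math.sqrt(n_pairs)


# (label, t, n = df + 1, d as reported in the text)
pupil_tests = [
    ("Exp 1, CS-locked PSR, acquisition", 3.10, 28, 0.59),
    ("Exp 2, CS-locked PSR, acquisition", 1.34, 28, 0.25),
    ("Exp 1, fitted response, acquisition", 8.20, 28, 1.55),
    ("Exp 2, fitted response, acquisition", 5.94, 28, 1.12),
]

for label, t_value, n_pairs, d_reported in pupil_tests:
    d_computed = cohens_dz(t_value, n_pairs)
    print(f"{label}: computed d = {d_computed:.2f}, reported d = {d_reported:.2f}")
```

Under this assumption, the computed and reported values agree to two decimals for all four tests.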
3.2.4. Heart period
Estimated US‐locked bradycardia amplitudes were not significantly different between CS+ and CS− in either experiment. Based on the observed heart period time courses in the recall session of Experiment 1, we built a response function and estimated responses in Experiment 2. This analysis did not reveal any significant results.
3.2.5. Respiration amplitude
We found no CS+/CS− differences in the acquisition phase in either experiment. Smaller respiration amplitude responses for CS+ versus CS− trials during recall in Experiment 1 (t[27] = 3.55, p = .001, d = 0.67) were not confirmed in Experiment 2 (t[27] = 0.30, p = .768, d = 0.06).
3.3. Declarative memory and task performance
Participants learned the CS+/CS− contingency differences (paired t‐tests, p < .001), remembered them until after the recall session (paired t‐test, p = .010, for Exp. 2), and learned the new contingency for CS+ in the recall session (paired t‐test, p < .001, for acquisition—recall), although their numerical estimates substantially deviated from the correct ones (Table 2, Figure 3). Performance and accuracy in the incidental CS identification task were above 95% in most metrics (Table 3). Hit rate in the ITI task was above 90% and the false alarm rate was below 4% in all metrics (Table 4).
TABLE 2.
Objective shock presentation and subjective contingency ratings for shock and CS‐US association in Experiments 1 and 2. Participants were asked to indicate how likely they thought each CS was followed by a US. In Experiment 2, participants were additionally asked after the recall session to remember how likely CS was followed by US in the acquisition session
| CS | Session | Experiment | US contingency (%) | Mean contingency rating (%) (±SD) |
|---|---|---|---|---|
| CS+ | Acquisition | 1 | 100 | 78.51 ± 24.11 |
| CS+ | Acquisition | 2 | 100 | 76.79 ± 26.74 |
| CS+ | Memory | 2 | 100 | 65.07 ± 33.11 |
| CS+ | Recall | 1 | 0 | 11.89 ± 22.38 |
| CS+ | Recall | 2 | 0 | 10.49 ± 21.38 |
| CS− | Acquisition | 1 | 0 | 25.34 ± 27.84 |
| CS− | Acquisition | 2 | 0 | 26.66 ± 30.95 |
| CS− | Memory | 2 | 0 | 34.61 ± 32.49 |
| CS− | Recall | 1 | 0 | 13.79 ± 27.22 |
| CS− | Recall | 2 | 0 | 10.99 ± 22.46 |
FIGURE 3.

Contingency ratings, displayed as CS+/CS‐ differences. CS+/CS− difference in actual contingency is 100% in acquisition and 0% in recall session. (a) Experiment 1, individual ratings depicted in yellow. (b) Experiment 2. Individual ratings depicted in green. Mean values are marked with a horizontal black line, SEM is depicted with a vertical line. Timepoints: 1—directly after the acquisition phase, 2—acquisition contingency as remembered after the recall session, 3—recall session.
TABLE 3.
Performance (percentage of trials with a response) and accuracy (percentage of correct responses in all trials) for the incidental CS identification task in the two experiments
| Experiment | Session | Performance (%), mean ± SD | Accuracy (%), mean ± SD |
|---|---|---|---|
| 1 | Acquisition | 97.05 ± 5.05 | 94.11 ± 8.37 |
| 1 | Recall | 98.81 ± 1.63 | 96.79 ± 4.84 |
| 2 | Acquisition | 98.21 ± 2.88 | 96.61 ± 4.04 |
| 2 | Recall | 98.10 ± 6.38 | 96.67 ± 7.37 |
TABLE 4.
Hit rate (percentage of trials with a response to the target stimuli) and false alarm rate (percentage of trials with a response to non‐target stimuli) for the ITI task
| Experiment | Session | Hit rate (%), mean ± SD | False alarms (%), mean ± SD |
|---|---|---|---|
| 1 | Acquisition | 91.16 ± 25.28 | 2.14 ± 4.23 |
| 1 | Recall | 98.33 ± 3.80 | 1.67 ± 4.11 |
| 2 | Acquisition | 96.96 ± 4.05 | 2.77 ± 3.49 |
| 2 | Recall | 93.45 ± 17.79 | 3.69 ± 5.62 |
4. DISCUSSION
Trace fear conditioning is an important model of realistic biological and clinical aversive learning scenarios, where predictors and outcomes are separated in time. In rodent studies, trace intervals longer than 10 s clearly involve hippocampal learning. The best strategy for quantifying learning and memory at such long trace intervals in humans is currently unclear. In this study, we conducted two independent trace fear conditioning experiments with a 15‐s trace interval, to establish a method for quantification of learning and of memory retention. We included several data types in which conditioned responses are commonly observed in delay fear conditioning paradigms. Our exploration‐confirmation strategy allowed us to freely investigate many possible analysis algorithms in Experiment 1 and confirm the most sensitive ones in Experiment 2. We used retrodictive validity as a metric to assess measurement error (Bach et al., 2020; Bach & Melinscak, 2020; Bach, Tzovara, & Vunder, 2018).
The first finding is that among psychophysiological measures, only SEBR differentiated CS+ and CS− trials in a recall test 1 week after acquisition. A previously proposed peak‐scoring protocol (Balderston et al., 2017) emerged as the most sensitive analysis and was superior to GLM‐based amplitude estimates (Khemka et al., 2017). These results replicated across both experiments and were robust in t‐tests, as well as in linear mixed effects models and repeated‐measures ANOVA accounting for SEBR habituation. We found no evidence of SEBR extinction over 30 trials in the recall session. The effect size for distinguishing CS+/CS− was d = 0.44, which would require 34 participants to achieve 80% power to demonstrate this difference in a one‐tailed t‐test at an alpha level of .05, and 514 participants to demonstrate an at least 50% reduction in fear memory in an intervention versus placebo trial.
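The order of magnitude of these sample sizes can be approximated with a normal‐approximation power formula. The sketch below is not the procedure used for the numbers above: it replaces the t distribution with the standard normal, and it assumes that the 50% reduction corresponds to a between‐group effect of d = 0.22 in a two‐group design. Function names are hypothetical.

```python
import math
from statistics import NormalDist


def n_paired_one_tailed(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate n for a one-tailed paired (one-sample) test with effect size d."""
    z_sum = NormalDist().inv_cdf(1 - alpha) + NormalDist().inv_cdf(power)
    return math.ceil((z_sum / d) ** 2)


def n_per_group_one_tailed(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate n per group for a one-tailed two-sample test with effect size d."""
    z_sum = NormalDist().inv_cdf(1 - alpha) + NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_sum / d) ** 2)


# Within-subject CS+/CS- difference with d = 0.44
n_within = n_paired_one_tailed(0.44)

# Assumed reading of "50% reduction": the group difference is half of 0.44
n_total_trial = 2 * n_per_group_one_tailed(0.44 / 2)

print(f"paired design: about {n_within} participants")
print(f"two-group trial: about {n_total_trial} participants in total")
```

This approximation yields 32 participants for the within‐subject contrast and 512 in total for the two‐group trial. Because it ignores the heavier tails of the t distribution, it slightly underestimates the t‐based values of 34 and 514 given above.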
During the acquisition session, SCR and PSR were the only psychophysiological measures that differentiated CS+/CS− trials. Our SCR model encompassed a response to the CS and one during the trace interval; both responses differentiated CS+/CS−, with the trace interval response being more sensitive. This result was replicated across both experiments. Additionally, our PSR model fitted on time‐course data of Experiment 1 revealed significant CS+/CS− differentiation for Experiment 2. We were unable to establish fear retention in SCR and PSR. Participants had a robust if imprecise declarative memory representation of the CS‐US association. The initial CS‐US association was remembered until after the recall session when appropriately prompted.
It has already been established that learning CS‐US associations in trace fear conditioning with shorter intervals can be measured with psychophysiological measures (Büchel et al., 1999; Haritha et al., 2013). The challenge in the present study was the long 15‐s trace interval and the recall test 1 week after acquisition. To our knowledge, aversive memory quantification in such a paradigm has not yet been probed.
Regarding our SEBR finding, it is typically assumed that CS‐US association is extinguished during a recall test without reinforcement, and hence memory would be best quantified during the first few trials of recall. This is reflected in our previous results in delay fear conditioning, where SEBR averaged over the first 3–5 CS+ and CS− trials of the recall session provided the best measure of US memory (Khemka et al., 2017), both for PsPM model inversion and peak‐scoring. In the present study, fear retention was only measurable when averaged over all trials of the recall session. While our main results favor a peak‐scoring strategy, the same pattern was found for GLM inversion, where highest retrodictive validity—even if non‐significant—is achieved when including all trials of the recall session. This is in keeping with the lack of evidence for extinction in the current study. This might indicate that trace fear memory is harder to extinguish, as previously suggested (Ewald et al., 2014). This would be of general importance both to understand the basic neurobiology of aversive learning in biological scenarios, and perhaps even more so in clinical intervention research. Future studies might include a larger number of trials in the recall session to investigate the time course of extinction.
A priori, the exploration‐confirmation approach with a relatively modest sample size carries a certain risk that an analysis overfitted to the exploratory data set does not generalize to the confirmatory data set, while a less specific analysis might be sensitive in both experiments. This is only a concern for those results that did not generalize from Experiment 1 to Experiment 2. We addressed it by repeating the exploration procedure in the confirmation data set and by analyzing the combined data sets, which did not reveal any additional insights here.
We based the choice of trace interval duration on previous rodent work. A drawback in comparing rodent and human fear conditioning is that rodent work typically uses single‐cue protocols, whereas differential conditioning is common in human work (Haaker et al., 2019; Lonsdorf et al., 2017) and was also used in the present study. Differential conditioning might involve additional learning processes related to the CS−. In rodents, this additional safety learning is associated with activation in the bed nucleus of the stria terminalis (BNST) (Foilb et al., 2021) and ventrolateral orbitofrontal cortex (vlOFC) (Sarlitto et al., 2018). In humans, the ventromedial prefrontal cortex (vmPFC) has been found to play an important role (Savage et al., 2020). It remains to be shown whether and to what extent the human hippocampus is required for delay and trace fear conditioning, as the literature is sparse in this respect (Bechara et al., 1995).
While we investigated multiple measures and analysis methods, the trace conditioning paradigm itself might also be varied to improve the quantification of trace fear memory. For example, one could vary the timepoint of startle sound presentation, assess a potential impact of its perceived aversiveness, remove the task during ITI, or add startle sounds during acquisition to improve comparability between sessions. Furthermore, research into trace fear conditioning memory might help to elucidate these open questions.
To summarize, we identify suitable measures for trace fear conditioning: SEBR during recall, and SCR/PSR during acquisition. We hope that these results will be useful for substantive research in the field of aversive memory modification.
AUTHOR CONTRIBUTIONS
Jelena M. Wehrli: Conceptualization; data curation; formal analysis; investigation; project administration; visualization; writing – original draft. Yanfang Xia: Conceptualization; data curation; investigation; methodology. Samuel Gerster: Resources; software. Dominik R Bach: Conceptualization; formal analysis; funding acquisition; methodology; project administration; resources; software; supervision; writing – original draft.
FUNDING INFORMATION
This work was supported by the Clinical Research Priority Program of the University of Zurich for the CRPP “Synapse & Trauma”. DRB is supported by funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (Grant agreement No. ERC‐2018 CoG‐816564 ActionContraThreat). The Wellcome Centre for Human Neuroimaging is supported by core funding from the Wellcome (203147/Z/16/Z).
CONFLICT OF INTEREST
The authors have no known conflict of interest to disclose.
Supporting information
TABLE S1 Experiment 1: recall SEBR GLM modelled paired t‐test, not corrected for multiple comparisons
TABLE S2 Experiment 2: recall SEBR GLM modelled paired t‐test, not corrected for multiple comparisons
TABLE S3 Experiment 1: recall SEBR peak scored paired t‐test, not corrected for multiple comparisons
TABLE S4 Experiment 2: recall SEBR peak scored paired t‐test, not corrected for multiple comparisons
TABLE S5 Experiment 1: recall SCR DCM modelled paired t‐test, not corrected for multiple comparisons
TABLE S6 Experiment 2: recall SCR DCM modelled paired t‐test, not corrected for multiple comparisons
TABLE S7 Experiment 1: recall PSR standard RF, paired t‐test, not corrected for multiple comparisons
TABLE S8 Experiment 2: recall PSR standard RF, paired t‐test, not corrected for multiple comparisons
TABLE S9 Experiment 1: recall PSR fitted RF, paired t‐test, not corrected for multiple comparisons
TABLE S10 Experiment 2: recall PSR fitted RF, paired t‐test, not corrected for multiple comparisons
TABLE S11 Experiment 1: recall SF, paired t‐test, not corrected for multiple comparisons
TABLE S12 Experiment 2: recall SF, paired t‐test, not corrected for multiple comparisons
TABLE S13 Acquisition paired t‐test CS+/CS−, not corrected for multiple comparisons
TABLE S14 Paired t‐test CS+/CS−, not corrected for multiple comparisons
TABLE S15 Experiment 1: SCR DCM estimated response peak during trace interval in s (from start of trial)
TABLE S16 Experiment 2: SCR DCM estimated response peak during trace interval in s (from start of trial)
TABLE S17 Experiment 1: SCR LME in recall
TABLE S18 Experiment 2: SCR LME in recall
TABLE S19 Experiment 1 and 2 combined: SCR LME in recall
TABLE S20 Experiment 1: SEBR GLM LME in recall
TABLE S21 Experiment 2: SEBR GLM LME in recall
TABLE S22 Experiment 1: SEBR peak scoring LME in recall
TABLE S23 Experiment 2: SEBR peak scoring LME in recall
TABLE S24 Experiment 1: SEBR peak scoring ANOVA in recall
TABLE S25 Experiment 2: SEBR peak scoring ANOVA in recall
ACKNOWLEDGMENTS
We thank Sarrina Tursunova and Sophia Bodle for help with data acquisition and Ivan Rojkov for help with data analysis. Open access funding provided by Universität Zürich.
Wehrli, J. M. , Xia, Y. , Gerster, S. , & Bach, D. R. (2022). Measuring human trace fear conditioning. Psychophysiology, 59, e14119. 10.1111/psyp.14119
DATA AVAILABILITY STATEMENT
Analysis code is available from OSF (https://osf.io/wbkfj/), all anonymized data sets are available on www.zenodo.org (Experiment 1: https://doi.org/10.5281/zenodo.6024202, Experiment 2: https://doi.org/10.5281/zenodo.6024245).
REFERENCES
- Bach, D. R., Castegnetti, G., Korn, C. W., Gerster, S., Melinscak, F., & Moser, T. (2018). Psychophysiological modeling: Current state and future directions. Psychophysiology, 55(11), e13214. 10.1111/psyp.13209
- Bach, D. R., Daunizeau, J., Friston, K. J., & Dolan, R. J. (2010). Dynamic causal modeling of anticipatory skin conductance responses. Biological Psychology, 85(1), 163–170. 10.1016/j.biopsycho.2010.06.007
- Bach, D. R., & Friston, K. J. (2013). Model‐based analysis of skin conductance responses: Towards causal models in psychophysiology: Causal models in psychophysiology. Psychophysiology, 50(1), 15–22. 10.1111/j.1469-8986.2012.01483.x
- Bach, D. R., Gerster, S., Tzovara, A., & Castegnetti, G. (2016). A linear model for event‐related respiration responses. Journal of Neuroscience Methods, 270, 147–155. 10.1016/j.jneumeth.2016.06.001
- Bach, D. R., & Melinscak, F. (2020). Psychophysiological modeling and the measurement of fear conditioning. Behavior Research and Therapy, 127, 103576. 10.1016/j.brat.2020.103576
- Bach, D. R., Melinščak, F., Fleming, S. M., & Voelkle, M. C. (2020). Calibrating the experimental measurement of psychological attributes. Nature Human Behavior, 4(12), 1229–1235. 10.1038/s41562-020-00976-8
- Bach, D. R., Näf, M., Deutschmann, M., Tyagarajan, S. K., & Quednow, B. B. (2019). Threat Memory Reminder Under Matrix Metalloproteinase 9 Inhibitor Doxycycline Globally Reduces Subsequent Memory Plasticity. The Journal of Neuroscience, 39(47), 9424–9434. 10.1523/JNEUROSCI.1285-19.2019
- Bach, D. R., Tzovara, A., & Vunder, J. (2018). Blocking human fear memory with the matrix metalloproteinase inhibitor doxycycline. Molecular Psychiatry, 23(7), 1584–1589. 10.1038/mp.2017.65
- Balderston, N. L., Mathur, A., Adu‐Brimpong, J., Hale, E. A., Ernst, M., & Grillon, C. (2017). Effect of anxiety on behavioral pattern separation in humans. Cognition and Emotion, 31(2), 238–248. 10.1080/02699931.2015.1096235
- Bechara, A., Tranel, D., Damasio, H., Adolphs, R., Rockland, C., & Damasio, A. R. (1995). Double dissociation of conditioning and declarative knowledge relative to the amygdala and hippocampus in humans. Science, 269(5227), 1115–1118. 10.1126/science.7652558
- Blumenthal, T. D., Cuthbert, B. N., Filion, D. L., Hackley, S., Lipp, O. V., & Van Boxtel, A. (2005). Committee report: Guidelines for human startle eyeblink electromyographic studies. Psychophysiology, 42(1), 1–15. 10.1111/j.1469-8986.2005.00271.x
- Bouton, M. E. (2007). Learning and behavior: A contemporary synthesis. Sinauer Associates Publishers.
- Büchel, C., Dolan, R. J., Armony, J. L., & Friston, K. J. (1999). Amygdala–Hippocampal Involvement in Human Aversive Trace Conditioning Revealed through Event‐Related Functional Magnetic Resonance Imaging. The Journal of Neuroscience, 19(24), 10869–10876. 10.1523/JNEUROSCI.19-24-10869.1999
- Castegnetti, G., Tzovara, A., Staib, M., Gerster, S., & Bach, D. R. (2017). Assessing fear learning via conditioned respiratory amplitude responses: Fear‐conditioned respiratory amplitude in humans. Psychophysiology, 54(2), 215–223. 10.1111/psyp.12778
- Castegnetti, G., Tzovara, A., Staib, M., Paulus, P. C., Hofer, N., & Bach, D. R. (2016). Modeling fear‐conditioned bradycardia in humans: Modeling fear‐conditioned bradycardia in humans. Psychophysiology, 53(6), 930–939. 10.1111/psyp.12637
- Chowdhury, N., Quinn, J. J., & Fanselow, M. S. (2005). Dorsal hippocampus involvement in trace fear conditioning with long, but not short, trace intervals in mice. Behavioral Neuroscience, 119(5), 1396–1402. 10.1037/0735-7044.119.5.1396
- Davis, M. (1992). The Role of the Amygdala in Fear and Anxiety. Annual Review of Neuroscience, 15(1), 353–375. 10.1146/annurev.ne.15.030192.002033
- de Haan, M. I. C., van Well, S., Visser, R. M., Scholte, H. S., van Wingen, G. A., & Kindt, M. (2018). The influence of acoustic startle probes on fear learning in humans. Scientific Reports, 8(1), 14552. 10.1038/s41598-018-32646-1
- Ehlers, A., Hackmann, A., Steil, R., Clohessy, S., Wenninger, K., & Winter, H. (2002). The nature of intrusive memories after trauma: The warning signal hypothesis. Behavior Research and Therapy, 40(9), 995–1002. 10.1016/S0005-7967(01)00077-8
- Ewald, H., Glotzbach‐Schoon, E., Gerdes, A. B. M., Andreatta, M., Müller, M., Mühlberger, A., & Pauli, P. (2014). Delay and trace fear conditioning in a complex virtual learning environment—Neural substrates of extinction. Frontiers in Human Neuroscience, 8, 323. 10.3389/fnhum.2014.00323
- Fanselow, M. S. (1994). Neural organization of the defensive behavior system responsible for fear. Psychonomic Bulletin & Review, 1(4), 429–438. 10.3758/BF03210947
- Foilb, A. R., Sansaricq, G. N., Zona, E. E., Fernando, K., & Christianson, J. P. (2021). Neural correlates of safety learning. Behavioral Brain Research, 396, 112884. 10.1016/j.bbr.2020.112884
- Fullana, M. A., Harrison, B. J., Soriano‐Mas, C., Vervliet, B., Cardoner, N., Àvila‐Parcet, A., & Radua, J. (2016). Neural signatures of human fear conditioning: An updated and extended meta‐analysis of fMRI studies. Molecular Psychiatry, 21(4), 500–508. 10.1038/mp.2015.88
- Gilmartin, M. R., Kwapis, J. L., & Helmstetter, F. J. (2012). Trace and contextual fear conditioning are impaired following unilateral microinjection of muscimol in the ventral hippocampus or amygdala, but not the medial prefrontal cortex. Neurobiology of Learning and Memory, 97(4), 452–464. 10.1016/j.nlm.2012.03.009
- Haaker, J., Maren, S., Andreatta, M., Merz, C. J., Richter, J., Richter, S. H., Meir Drexler, S., Lange, M. D., Jüngling, K., Nees, F., Seidenbecher, T., Fullana, M. A., Wotjak, C. T., & Lonsdorf, T. B. (2019). Making translation work: Harmonizing cross‐species methodology in the behavioral neuroscience of Pavlovian fear conditioning. Neuroscience & Biobehavioral Reviews, 107, 329–345. 10.1016/j.neubiorev.2019.09.020
- Han, C. J., O'Tuathaigh, C. M., van Trigt, L., Quinn, J. J., Fanselow, M. S., Mongeau, R., Koch, C., & Anderson, D. J. (2003). Trace but not delay fear conditioning requires attention and the anterior cingulate cortex. Proceedings of the National Academy of Sciences, 100(22), 13087–13092. 10.1073/pnas.2132313100
- Haritha, A. T., Wood, K. H., Ver Hoef, L. W., & Knight, D. C. (2013). Human trace fear conditioning: Right‐lateralized cortical activity supports trace‐interval processes. Cognitive, Affective, & Behavioral Neuroscience, 13(2), 225–237. 10.3758/s13415-012-0142-6
- Hayes, T. R., & Petrov, A. A. (2016). Mapping and correcting the influence of gaze position on pupil size measurements. Behavior Research Methods, 48(2), 510–527. 10.3758/s13428-015-0588-x
- Hygge, S., & Hugdahl, K. (1985). Skin conductance recordings and the NaCl concentration of the electrolyte. Psychophysiology, 22(3), 365–367. 10.1111/j.1469-8986.1985.tb01616.x
- Khemka, S., Tzovara, A., Gerster, S., Quednow, B. B., & Bach, D. R. (2017). Modeling startle eyeblink electromyogram to assess fear learning: Modeling startle‐blink EMG to assess fear learning. Psychophysiology, 54(2), 204–214. 10.1111/psyp.12775
- Kochli, D. E., Thompson, E. C., Fricke, E. A., Postle, A. F., & Quinn, J. J. (2015). The amygdala is critical for trace, delay, and contextual fear conditioning. Learning & Memory, 22(2), 92–100. 10.1101/lm.034918.114
- Korn, C. W., Staib, M., Tzovara, A., Castegnetti, G., & Bach, D. R. (2017). A pupil size response model to assess fear learning: Pupil responses and fear conditioning. Psychophysiology, 54(3), 330–343. 10.1111/psyp.12801
- Kret, M. E., & Sjak‐Shie, E. E. (2019). Preprocessing pupil size data: Guidelines and code. Behavior Research Methods, 51(3), 1336–1342. 10.3758/s13428-018-1075-y
- LeDoux, J. E. (2014). Coming to terms with fear. Proceedings of the National Academy of Sciences, 111(8), 2871–2878. 10.1073/pnas.1400335111
- Lonsdorf, T. B., Menz, M. M., Andreatta, M., Fullana, M. A., Golkar, A., Haaker, J., Heitland, I., Hermann, A., Kuhn, M., Kruse, O., Drexler, S. M., Meulders, A., Nees, F., Pittig, A., Richter, J., Römer, S., Shiban, Y., Schmitz, A., & Merz, C. J. (2017). Do not fear “fear conditioning”: Methodological considerations for the design and analysis of studies on human fear acquisition, extinction, and return of fear. Neuroscience & Biobehavioral Reviews, 77, 247–285. 10.1016/j.neubiorev.2017.02.026
- Maris, E., & Oostenveld, R. (2007). Nonparametric statistical testing of EEG‐ and MEG‐data. Journal of Neuroscience Methods, 164(1), 177–190. 10.1016/j.jneumeth.2007.03.024
- Mertens, G., Krypotos, A.‐M., & Engelhard, I. M. (2020). A review on mental imagery in fear conditioning research 100 years since the “Little Albert” study. Behavior Research and Therapy, 126, 103556. 10.1016/j.brat.2020.103556
- Milad, M. R., & Quirk, G. J. (2012). Fear extinction as a model for translational neuroscience: ten years of progress. Annual Review of Psychology, 63(1), 129–151. 10.1146/annurev.psych.121208.131631
- Misane, I., Tovote, P., Meyer, M., Spiess, J., Ögren, S. O., & Stiedl, O. (2005). Time‐dependent involvement of the dorsal hippocampus in trace fear conditioning in mice. Hippocampus, 15(4), 418–426. 10.1002/hipo.20067
- Ojala, K. E., & Bach, D. R. (2020). Measuring learning in human classical threat conditioning: Translational, cognitive and methodological considerations. Neuroscience & Biobehavioral Reviews, 114, 96–112. 10.1016/j.neubiorev.2020.04.019
- Pan, J., & Tompkins, W. J. (1985). A real‐time QRS detection algorithm. IEEE Transactions on Biomedical Engineering, BME‐32(3), 230–236. 10.1109/TBME.1985.325532
- Paulus, P. C., Castegnetti, G., & Bach, D. R. (2016). Modeling event‐related heart period responses. Psychophysiology, 53(6), 837–846. 10.1111/psyp.12622
- Raybuck, J. D., & Lattal, K. M. (2014). Bridging the interval: Theory and neurobiology of trace conditioning. Behavioral Processes, 101, 103–111. 10.1016/j.beproc.2013.08.016
- Rescorla, R. A. (1988). Behavioral studies of pavlovian conditioning. Annual Review of Neuroscience, 11(1), 329–352. 10.1146/annurev.ne.11.030188.001553
- Sarlitto, M. C., Foilb, A. R., & Christianson, J. P. (2018). Inactivation of the ventrolateral orbitofrontal cortex impairs flexible use of safety signals. Neuroscience, 379, 350–358. 10.1016/j.neuroscience.2018.03.037
- Savage, H. S., Davey, C. G., Fullana, M. A., & Harrison, B. J. (2020). Clarifying the neural substrates of threat and safety reversal learning in humans. NeuroImage, 207, 116427. 10.1016/j.neuroimage.2019.116427
- Sehlmeyer, C., Schöning, S., Zwitserlood, P., Pfleiderer, B., Kircher, T., Arolt, V., & Konrad, C. (2009). Human fear conditioning and extinction in neuroimaging: A systematic review. PLoS ONE, 4(6), e5865. 10.1371/journal.pone.0005865
- Sjouwerman, R., Niehaus, J., Kuhn, M., & Lonsdorf, T. B. (2016). Do not startle me‐Interference of startle probe presentations and intermittent ratings with fear acquisition: Startle probe and rating: Impact on fear learning. Psychophysiology, 53(12), 1889–1899. 10.1111/psyp.12761
- Staib, M., Castegnetti, G., & Bach, D. R. (2015). Optimizing a model‐based approach to inferring fear learning from skin conductance responses. Journal of Neuroscience Methods, 255, 131–138. 10.1016/j.jneumeth.2015.08.009
