Abstract
Memories of experiences are often tied together, where remembering one experience can spark memories of others. One may link temporally distant events based on their meaning or recall closely occurring events together. Linking based on temporal organization has been studied in simple list-learning paradigms, but less is known about how these effects manifest in real-life experiences. Here, we investigate how meaningful connections in a narrative interact with temporal context to influence recall. Participants encoded picture stories featuring multiple subplots, some of which connected across time (Coherent Narratives), while others did not (Unrelated Narratives). Replicating prior findings, Coherent Narratives were better recalled than Unrelated Narratives, regardless of temporal distance between events using character-cued (N=39 and N=36) and free recall (N=39) tasks. Extending this work, free recall analyses revealed that temporally separated Coherent Narrative event pairs were recalled contiguously, suggesting that memories are bridged through meaningful connections, not just temporal structure.
Keywords: episodic memory, narrative coherence, temporal context, naturalistic
Introduction
Memories of experiences often weave a complex tapestry of interconnected moments, where remembering one experience can trigger memories of others. Consider the example of remembering the ingredients bought for a cake: It might evoke a temporally proximal memory like driving home afterward to walk your dog, or it might bring to mind a temporally distant memory linked by meaningful connections, such as your friend’s birthday later that week. A wealth of research has assessed memory for sequences of items, such as lists of words, with some studies focusing on what is remembered (i.e., recall performance) and others on how the items are recalled (i.e., recall organization).
Studies of recall organization have emphasized the role of time in the organization of episodic memories (Crowder, 2014), with slowly drifting temporal contexts acting as a bridge between items (Howard & Kahana, 2002). Items encountered close together in time—that is, within a similar temporal context—tend to be recalled contiguously (Estes, 1955; Kahana et al., 1996; Howard & Kahana, 2002). Moreover, in the absence of explicit task instructions, people often organize information based on time (Murdock, 1974).
Although the temporal contiguity effect is well established, most of these studies rely on list-learning paradigms (Hintzman, 2016). Unlike lists of words or items, information encountered in everyday life tends to be neither isolated nor arbitrary. Simply listing the people or things encountered at places and times would neither effectively nor efficiently capture the rich, dynamic connections between meaningful moments. For instance, one could recall items like milk, sugar, car, dog, pan, oven, and friend based solely on the fact that these things became relevant at similar points in time. However, this strategy would not capture that these items were encountered in the context of buying ingredients for a cake, driving home, walking your dog, and then baking a cake in preparation for your friend’s birthday.
Instead, in real life, people tend to tell stories to help make sense of the past (Bietti et al., 2019), enabling them to capture meaningful relationships between moments and events (Lee et al., 2020). The causal and meaningful associations emphasized by narrative connections facilitate comprehension and memory beyond temporal adjacency (Trabasso & Sperry, 1985; Mandler & Johnson, 1977). Thus, important aspects of real-world events that influence the structure of memories may not be captured well by a list-learning task.
Recent studies by Cohn-Sheehy and colleagues (Cohn-Sheehy, Delarazan et al., 2021; Cohn-Sheehy et al. 2022) established an important role for narrative coherence—the degree to which individual units of information can be interrelated into a larger overarching story (Bartlett, 1932; Graesser et al., 1994)—in memory for complex events. Pairs of events that were narratively coherent were recalled in greater detail than event pairs that merely shared a recurring character. Critically, these narratively coherent events did not occur together in time, so mere temporal proximity could not have played a role in binding them in memory.
However, these studies assessed the amount of information recalled, whereas classic list-learning studies that demonstrate temporal contiguity effects primarily investigate the organization of recall. Although there is a strong indication that recall organization and recall performance are related, with effective recall organization typically leading to a greater amount of information recalled (Sederberg et al., 2010), one does not necessitate the other. Thus, an important question remains unanswered: Do meaningful connections in a lifelike narrative interact with temporal context to influence the success and organization of recall? Answering this question can shed light on the extent to which well-established phenomena from list-learning tasks translate to memory for complex experiences.
We conducted a series of experiments to directly assess whether and how temporal context and narrative coherence interact during recall of lifelike events. Similar to prior work (Cohn-Sheehy, Delarazan et al., 2022; Cohn-Sheehy et al., 2021), participants were presented with an extended fictional story comprised of events centered on a main character (mainplot). The main character interacted with four side-characters twice, and each of these subplot interactions was tangential to the main narrative (i.e., a sideplot event; Fig. 1A and 1B). Critically, for each side-character, the respective pair of sideplot events could either be meaningfully integrated into a single, coherent narrative, or instead constituted two separate unrelated narratives (Fig. 1C).
Fig. 1.

Participants were presented with a fictional story in the form of an audio picture book. (a) The 18 story events were divided into 10 main character events and 8 side-character events. Events involving side-characters are referred to as sideplots, which do not relate to the mainplot events (green). Each side-character appeared twice, and each pair of sideplot events could form either one Coherent Narrative (blue) or two separate Unrelated Narratives (red). All temporally distant sideplot event pairs were categorized as Short Lag (separated by 4 intervening events) or Long Lag (separated by 12 intervening events) pairs. This resulted in 4 sideplot conditions: Coherent Narrative Short Lag, Coherent Narrative Long Lag, Unrelated Narrative Short Lag, and Unrelated Narrative Long Lag. Story version was randomized across participants. (b) The story consisted of 18 events/clips, each containing 8 sentences and 8 associated images (total of 144 sentences/images). Images appeared throughout the narration of the story. (c) Coherent Narratives (blue) consisted of events from the same version (e.g., Sandra Act 1 and 2, Version A), whereas Unrelated Narrative (red) events were drawn from different possible Coherent Narrative event pairs (e.g., Sandra Act 1, Version A, and Sandra Act 2, Version B). Synopses and image examples are provided for two possible pairs of events for Sandra. (d) Character-Cued Recall (Experiment 1): Participants were cued with each of the four side-characters and the main character, one by one, and asked to recall everything they could remember about the cued character in as much detail as possible. (e) Free recall (Experiment 2): Participants were asked to recall everything they could remember about the entire story in as much detail as possible, in any order they preferred.
We presented pairs of sideplot events that were separated in time by shorter or longer distances between them to test whether effects of narrative coherence interacted with the temporal context between events in a pair. Experiment 1 (Experiment 1A and 1B) tested whether coherence effects on recall performance are modulated by temporal distances using recall cued by character (Fig. 1D). Experiment 2 assessed free recall of story events (Fig. 1E), allowing participants to structure their recall with no specific task constraints. Experiment 2 also aimed to test whether narrative coherence interacts with temporal contiguity in predicting the organization of free recall.
Across the experiments, the number of details recalled was higher for Coherent Narrative events than for Unrelated Narrative events, regardless of the temporal lag between events or whether the task was character-cued or free recall. In Experiment 2, free recall was generally temporally organized, but pairs of Coherent Narrative events were recalled closer together in sequence compared with their original positions in the presented story. Together, these findings suggest that narrative structure provides a scaffold to benefit memory for lifelike events, but this can also reorganize the event timeline as people structure their recall.
Experiment 1
In all experiments, participants were presented with paired sideplot events that either formed one Coherent Narrative or two separate, Unrelated Narratives embedded in an overarching, causally related mainplot story (Fig. 1A and 1B). Building on prior work (Cohn-Sheehy, Delarazan, et al. 2022; Cohn-Sheehy et al., 2021), Experiment 1 (Experiment 1A and 1B) was designed to determine the influence of minimizing or maximizing contextual temporal drift on narrative coherence by manipulating the distances between paired events. All paired sideplot events were either separated by four (Short Lag) or twelve (Long Lag) intervening events. This resulted in four sideplot conditions: Coherent Narrative Short Lag, Coherent Narrative Long Lag, Unrelated Narrative Short Lag, and Unrelated Narrative Long Lag. Participants then completed character-cued recall, in which they were cued with each of the five characters and instructed to recall the events involving that character. The stimuli and recall procedures of Experiments 1A and 1B were identical, but each experiment also included a distinct test of explicit temporal memory, allowing exploratory analyses of whether narrative coherence affected explicit memory for time. In Experiment 1A, we tested the perceived distance between event pairs. In Experiment 1B, we tested the estimated absolute time of individually cued events. These data are not central to the questions addressed by the present manuscript, and we will not detail the procedures or results here. Briefly, explicit temporal memory was not influenced by narrative coherence (see Supplementary Materials).
In line with prior work, we hypothesized that Coherent Narratives would be recalled more successfully than Unrelated Narratives. We additionally predicted that, if slowly drifting temporal context plays a role in bridging events, then we would observe an interaction between narrative coherence and temporal distance. Specifically, Coherent Narratives may be remembered better than Unrelated Narratives with Long Lags, but not Short Lags. That is, the benefit of coherence may be diminished in Short Lags as temporal proximity alone (or shared temporal context) would facilitate recall, even for Unrelated Narratives.
Alternatively, other accounts argue that temporal context may shift abruptly based on the structure of experience, rather than drift gradually over time (DuBrow et al., 2017). In this view, drastic changes in events may trigger contextual shifts, disrupting any benefit of temporal-based continuity. Evidence in favor of this view would result in Coherent Narratives being recalled better than Unrelated Narratives across both Short and Long Lags, as Short Lags would also cause a shift in temporal context. In other words, from this perspective, the key factor is not absolute temporal distance per se, but merely the fact that events have been segmented from one another.
Method
Participants
Across all experiments, participants were recruited from a pool of undergraduate students enrolled in psychology courses at Washington University in St. Louis and received course credit for their participation. Inclusion criteria were normal hearing, normal or corrected-to-normal vision, no history of major neurological or psychiatric illness, and English as a native language. Participants were excluded for either failure to complete a part of the study or technical difficulties encountered during the study.
A power analysis was conducted with a targeted effect size of Cohen’s d = 0.66, derived from a prior study that used similar materials and compared recall performance for Coherent and Unrelated events in a delayed recall group (Cohn-Sheehy, Delarazan, et al., 2022). With an alpha (α) level of 0.05 and a desired power (1−𝛽) of 0.80, the calculation indicated that a minimum sample size of 20 participants would be required to detect the specified effect. Given that we altered the design by adding a variable of temporal lag, we collected more than the minimum required sample size.
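The sample-size figure above can be roughly reproduced with a normal approximation. The sketch below is illustrative only: the text does not state which test the power analysis assumed, so a two-sided paired (within-participant) t-test is assumed here, and the function name `approx_n_paired` is hypothetical. It uses the standard normal-approximation formula with a small-sample correction term rather than an exact noncentral-t calculation.

```python
from math import ceil
from statistics import NormalDist


def approx_n_paired(d, alpha=0.05, power=0.80):
    """Approximate sample size for a two-sided paired (one-sample) t-test.

    Assumption: normal approximation plus the correction term z_alpha**2 / 2.
    This is not necessarily the procedure used in the original study.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for a two-sided test
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    n = ((z_alpha + z_beta) / d) ** 2 + z_alpha ** 2 / 2
    return ceil(n)


# Targeted effect size reported in the text (Cohen's d = 0.66).
print(approx_n_paired(0.66))
```

Under these assumptions the approximation gives 20 participants, in line with the reported minimum; an exact calculation in a dedicated power-analysis tool may differ slightly.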
In Experiment 1A, forty-one participants (age: M = 19.10, SD = 1.28; range = 18–24; 21 female) were recruited and two were excluded (failure to complete a part of the study: N = 1; technical difficulties: N = 1), resulting in thirty-nine usable participants. In Experiment 1B, forty participants (age: M = 19.00, SD = 1.01; range = 18–21; 30 female) were recruited and four were excluded (failure to complete a part of the study: N = 3; technical difficulties: N = 1), resulting in thirty-six usable participants. This research received approval from the Washington University in St. Louis Institutional Review Board (ID: 202010173).
Procedure
Participants completed two sessions that were 24 hours apart: Session 1 (Encoding) and Session 2 (Retrieval). Prior research has found that the narrative coherence effect is more robust after a delay (Cohn-Sheehy, Delarazan, et al. 2022); therefore, incorporating a delay in the current study allowed for a more precise examination of the temporal distance between events. In Session 1, participants first completed basic demographic questionnaires and the Survey of Autobiographical Memory (SAM), which are routinely collected in our lab. Next, participants completed a familiarization task in which they were first introduced to the names and faces of story characters and then tested on them (Supplementary Materials, Fig. S6). Finally, participants were presented with the story and instructed that they would later encounter a series of tasks that would test their memory for the story in detail. In Session 2, participants first completed the character-cued recall task and then temporal judgment tasks (reported in Supplementary Materials; Fig. S8). We tested the perceived distance between event pairs in Experiment 1A and the estimated absolute time of individually cued events in Experiment 1B. Across both tasks, we did not observe evidence that narrative coherence influences explicit memory for time (Supplementary Materials, Fig. S9 and S10). Participants were then debriefed and awarded course credit for their participation.
Methods and Materials
Story Presentation.
Participants were presented with a fictional story in the form of an audio narration accompanied by static illustrations (Fig. 1B). The story is centered on a main character who is determined to get a promotion at a newspaper company (see examples in Supplementary Materials, Fig. S1–S5). The main character interacts with four side-characters whose appearances (sideplot events) are unrelated to events that are central to the main character’s story (mainplot events). Each side-character appears in two temporally distant, distinct sideplot events that are separated by four or twelve intervening events (i.e., Short Lag or Long Lag, respectively; Fig. 1A and 1B). Lag between sideplot events was directly manipulated in order to investigate the influence of temporal proximity on memory, and the lag values were chosen to ensure a balanced experimental design by providing an equal number of observations across conditions of narrative coherence and temporal lag. These distances helped preserve the overall comprehension of the narrative, as extremely short or long temporal gaps could hinder understanding. Furthermore, the two sideplot events involving each recurring side-character could either form one larger Coherent Narrative, in which meaningful links could be drawn between events, or separate Unrelated Narratives, in which no meaningful links could be drawn between events other than sharing a recurring side-character (Fig. 1C).
The story consisted of ten mainplot events and eight sideplot events. Each event contained eight sentences with eight corresponding images (see examples in Supplementary Materials, Fig. S1–S5). In order to minimize potential confounds, thirty-two alternate versions of the fictional story were created and pseudo-randomly assigned across participants (Supplementary Materials, Table S1). Across these alternate story versions, for each pair of sideplot events, we systematically varied the specific sideplot event content and whether or not these events could form a coherent narrative. Additionally, we varied the particular order in which each side character appeared in the context of the main story. As such, across subjects, this design should minimize any confounding effects of specific event content and/or character identity within reported outcomes.
Character-Cued Recall Task.
For each character in the story, participants were instructed to recall everything they could remember involving the character in as much detail as possible. Participants were presented with the character’s name and face onscreen, with a textbox for typing recall. Participants were required to spend a minimum of three minutes for each side-character and six minutes on the main character (Fig. 1D). Character cues were randomized for side-characters across participants, but the main character was always cued last.
Analyses
Data and Code Availability.
Study materials are publicly available (https://osf.io/2ercv/). Story materials are available on request to the corresponding author. Data: All primary data are publicly available (https://osf.io/2ercv/). Analysis scripts: All analysis scripts are publicly available (https://osf.io/2ercv/; https://github.com/aidelarazan/kramer2.0). No aspects of the experiments were preregistered.
Recall Performance.
Similar to prior studies, recall performance was operationalized as the number of words recalled for each condition (Flores et al., 2017). False recall and confabulations were minimal in our data. Errors often involved misnaming a character rather than describing false content; in such cases, responses were scored based on the described character rather than the incorrect name. Moreover, a separate analysis of a previous study that used the same stories (Cohn-Sheehy, Delarazan et al. 2022; Cohn-Sheehy et al. 2021) was conducted, and we found that word count correlated highly with manual scoring of recall (Pearson r(88) = .91, CI = [0.86, 0.94], p < .001; Supplementary Materials, Fig. S7). The manual scoring method in the previous study was adapted from the Autobiographical Memory Interview (Levine et al., 2002): Each recall transcript was segmented into meaningful detail units and then assigned to labels that describe their content. Raters scored the number of verifiable details that each participant recalled for each event (mean interrater reliability: Pearson r = 0.83; Cohn-Sheehy, Delarazan, et al., 2022).
Recall performance was analyzed using repeated-measures analyses of variance (ANOVAs) contrasting narrative coherence (Coherent Narratives vs. Unrelated Narratives) and temporal distance (Short Lag vs. Long Lag) on total word count across conditions. Post-hoc contrasts were corrected for multiple comparisons using the Bonferroni method. Bayes factors (BF) are included for all statistical tests. We use standard ranges for interpretation: BFs > 3 are interpreted as substantial evidence in favor of the experimental hypothesis, BFs < 1/3 as substantial evidence in favor of the null hypothesis, BFs between 1 and 3 as modest evidence in favor of the experimental hypothesis, and BFs between 1/3 and 1 as modest evidence in favor of the null hypothesis. Statistical analyses were performed in Python (Version 3.8.3) using the Pingouin package (https://pingouin-stats.org/build/html/index.html).
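As a rough illustration of the 2 × 2 repeated-measures design described above, each one-degree-of-freedom effect in a fully within-participant 2 × 2 ANOVA can be obtained as the square of a one-sample t statistic computed on per-participant contrast scores. The sketch below uses only the Python standard library rather than Pingouin; the function name `contrast_f`, the column order, and the toy word counts are hypothetical and are not taken from the study's data or analysis scripts.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical column order for each participant's word counts:
# (Coherent-Short, Coherent-Long, Unrelated-Short, Unrelated-Long)
COHERENCE = (1, 1, -1, -1)    # main effect of narrative coherence
LAG = (1, -1, 1, -1)          # main effect of temporal lag
INTERACTION = (1, -1, -1, 1)  # coherence x lag interaction


def contrast_f(scores, weights):
    """Return (F, (df1, df2)) for a 1-df within-participant contrast.

    In a 2 x 2 fully within-participant design, F(1, n - 1) equals the
    squared one-sample t statistic on the per-participant contrast scores.
    """
    values = [sum(w * x for w, x in zip(weights, row)) for row in scores]
    n = len(values)
    t = mean(values) / (stdev(values) / sqrt(n))
    return t * t, (1, n - 1)


# Toy data (made up for illustration): one row per participant.
toy_scores = [
    [90, 95, 60, 62],
    [110, 100, 80, 85],
    [70, 75, 65, 60],
    [120, 118, 90, 95],
    [85, 80, 70, 72],
]

for name, weights in [("coherence", COHERENCE), ("lag", LAG),
                      ("interaction", INTERACTION)]:
    f_value, df = contrast_f(toy_scores, weights)
    print(f"{name}: F{df} = {f_value:.2f}")
```

This equivalence holds only because every factor has two levels; designs with more levels need the full ANOVA decomposition, and effect sizes such as generalized eta squared are not computed here.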
Results
Character-cued Recall Differences are Driven by Narrative Coherence, but Not Temporal Distances Between Events
Performance on character-cued recall showed that events that could be formed into one Coherent Narrative were better remembered than Unrelated Narratives, regardless of the temporal distances between paired events across Experiment 1A and 1B (Fig. 2A and 2B). In Experiment 1A, a two-way repeated-measures ANOVA on recall performance that incorporated within-subjects factors of narrative coherence (Coherent Narratives vs. Unrelated Narratives) and temporal distance (Short Lag vs. Long Lag) revealed a significant main effect of narrative coherence [F(1, 38) = 37.06, ηG2 = 0.17, p < .001; Fig. 2A]. Neither a significant main effect of temporal lag [F(1, 74) = 0.57, ηG2 = 0.001, p = .455] nor an interaction [F(1, 38) = n.s., ηG2 = 0.00, p = .971] was observed. Pairwise comparisons showed that recall performance was better for Coherent Narratives than for Unrelated Narratives [Coherent Narrative Short Lag (M = 93.333, SD = 44.203) v. Unrelated Narrative Short Lag (M = 64.128, SD = 36.080): t(38) = 4.88, Hedges’ g = 0.72, p < .001 corrected, BF10 = 1086.82, 95% CI = [17.08, 41.33]; Coherent Narrative Long Lag (M = 93.410, SD = 44.425) v. Unrelated Narrative Long Lag (M = 63.923, SD = 39.066): t(38) = 4.67, Hedges’ g = 0.70, p < .001 corrected, BF10 = 607.88, 95% CI = [16.71, 42.27]].
Fig. 2. Replication of Prior Work with Recall Performance.

Replication of overall recall performance driven by narrative coherence in character-cued recall in (a) Experiment 1A and (b) Experiment 1B. Key: Points represent individual participants’ average performance. Boxplots and violin plots display the distribution of the data, including the range, IQR, and median. Significance markers: + p < .1, * p < .05, ** p < .01, *** p < .001.
Experiment 1B recall replicated the results of Experiment 1A, demonstrating that memory for Coherent Narratives was better than that for Unrelated Narratives [F(1, 35) = 16.90, ηG2 = 0.05, p < .001; Fig. 2B]. Neither a significant main effect of temporal distance [F(1, 35) = 0.99, ηG2 = 0.00, p = .326] nor an interaction [F(1, 35) = 0.92, ηG2 = 0.00, p = .344] was observed. Pairwise comparisons indicated that Coherent Narratives were better recalled than Unrelated Narratives, particularly for longer lags in this dataset [Coherent Narrative Long Lag (M = 98.500, SD = 47.573) v. Unrelated Narrative Long Lag (M = 73.917, SD = 41.244): t(35) = 4.69, Hedges’ g = 0.55, p < .001 corrected; BF10 = 563.90, 95% CI = [13.94, 35.23]]. No other pairwise comparison was significant with correction for multiple comparisons. Lastly, given that recall procedures for Experiment 1A and 1B were identical, we also analyzed the experiments combined. These analyses yielded results similar to those above, with Coherent Narratives better recalled than Unrelated Narratives [F(1, 74) = 52.07, ηG2 = 0.08, p < .001], no main effect of temporal lag [F(1, 74) = 0.57, ηG2 = 0.001, p = .455], and no interaction [F(1, 74) = 0.58, ηG2 = 0.001, p = .450].
Discussion
Experiment 1 replicated the finding that narrative coherence increases the likelihood that events are remembered (Cohn-Sheehy, Delarazan, et al. 2022; Cohn-Sheehy et al., 2021). The present experiment extended this finding, showing that narrative coherence provided recall benefits regardless of the temporal distance between paired events across two experiments. These findings suggest that context in lifelike events, such as those in the story, shifts abruptly rather than only drifting smoothly over time.
However, it remains unclear whether effects of how memories are organized influence overall recall performance. The fact that we cued recall with each character may have pushed people to remember the story based on character connections, thus minimizing the opportunity for temporal lag effects to emerge and influence overall recall performance. To address this possibility, Experiment 2 used the same stimulus as Experiment 1 but employed an unconstrained free recall task. The free recall approach also allowed us to apply classic list-learning analyses to narratives.
Experiment 2
Experiment 1 used character-cued recall, which encouraged participants to bridge temporally distant events based on characters. This revealed robust effects of narrative coherence in overall recall success, but likely masked our ability to directly test the temporal organization of event recall, and how narrative coherence might influence that organization. Experiment 2 incorporated free recall of the story to investigate the impact of narrative coherence on the temporal structure of recall. Instead of cueing by character, participants were instructed to freely recall the entire story in any order (Fig. 1E).
In classic list-learning experiments, participants are presented with a list of items (e.g., words or images) and later asked to recall them. Similar list-learning recall analyses were conducted to investigate how participants organize their memory for naturalistic events in Experiment 2. Each story event was treated as a functional unit, comparable to a single word or image in a typical list-learning task. This approach enabled us to measure recall performance for each event and apply well-established techniques for analyzing temporal memory patterns (e.g., serial position effects, lag-conditional response probabilities, and temporal clustering scores). By doing so, we assessed how time and narrative coherence shape the organization of memories.
We hypothesized that coherence benefits to recall would also be observed in the free recall task, supporting the unique contributions of time and narrative coherence to the organization of memories. We further hypothesized that distant Coherent Narrative events would be recalled closer together in time than their original positions in the story. Lastly, we expected that Unrelated Narrative events, which lack meaningful links, would be more likely to be recalled near their original positions in the story, as predicted by temporal contiguity mechanisms.
Method
Participants
Fifty-two participants (M = 19.64, SD = 1.15; 24 female) were recruited in Experiment 2. Thirteen participants were excluded (failure to complete a part of the study: N = 8; technical difficulties: N = 5), resulting in thirty-nine usable participants.
Procedure
Participants completed two sessions that were 24 hours apart. Retrieval consisted of the free recall task.
Free Recall.
Free recall was aimed at assessing how participants organize their recall in the absence of cues by asking participants to freely recall the entire story in any order. Participants were first shown a single screen containing the images of all story characters (side-characters, main character, and minor characters within mainplot events). Participants were then presented with a blank screen with a textbox and were prompted to recall as much of the story as possible, even minor details. Participants were required to spend a minimum of fifteen minutes on this task and were encouraged to continue typing for as long as they remembered additional details.
Analyses
Recall Scoring Approach.
Two raters (E.B. and V.F-L.) independently segmented participants’ recall and labeled each segment according to the event it pertained to. Following approaches used to apply list-learning analyses to naturalistic stimuli (Diamond & Levine, 2020; Heusser et al., 2021), each segment was assigned its ordinal position from the original story presentation. From this, we derived a vector of order tags, representing the events recalled by the participant and the sequence in which they were recalled. Analyses were conducted using the psifr package in Python (Morton 2020).
An intraclass correlation coefficient (ICC) was calculated on a subset of participants (N = 5) using the pingouin package in Python. Raters assigned event labels to participants’ recall, and these labels were compared to assess inter-rater reliability. Analyses revealed high inter-rater reliability [Single raters absolute: ICC = 0.95, F(4, 5) = 42.36, p < .001, 95% CI = [0.70, 0.99]]. Raters cross-checked each other’s work on the remaining participants, and any disagreements were resolved through consultation with a third rater (A.I.D.).
Recall Performance.
Once raters identified the event to which each segment of recall pertained, recall performance was assessed based on the total number of words for each condition, following a procedure similar to Experiment 1.
Serial Position Curve.
The serial position curve represents the proportion of participants who recalled each story event as a function of the event’s position during story encoding. This approach was primarily used to investigate whether previously established properties of serial position in list learning apply to narrative event recall in the context of narrative coherence (Diamond & Levine, 2020; Heusser et al., 2021). Exploratory analyses were conducted to evaluate differences in recall performance between Act 1 and Act 2 events across narrative conditions (i.e., Coherent, Unrelated, and Main Narratives). Chi-squared tests compared observed recall frequencies against the null hypothesis of equal recall across conditions (Coherent Narratives: Two possible events in Act 1 and two possible events in Act 2; Unrelated Narratives: Two possible events in Act 1 and two possible events in Act 2; Main Narratives: Five possible events in Act 1 and five possible events in Act 2).
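The serial position curve described above can be sketched as follows. This is a minimal illustration, assuming that each participant's recall has already been converted to a list of 1-based story positions; the function name `serial_position_curve` and the example recall sequences are hypothetical and do not come from the study's data or scripts.

```python
def serial_position_curve(recalls, n_events=18):
    """Proportion of participants who recalled each story position.

    recalls: one list of recalled story positions (1-based) per participant.
    Returns a list whose element i is the proportion for position i + 1.
    """
    n_participants = len(recalls)
    recalled_sets = [set(r) for r in recalls]  # recall order is irrelevant here
    return [
        sum(position in recalled for recalled in recalled_sets) / n_participants
        for position in range(1, n_events + 1)
    ]


# Three made-up participants recalling a shortened, 5-event story.
example = [[1, 2, 5], [1, 4], [1, 2, 3, 5]]
print(serial_position_curve(example, n_events=5))
```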
Lag-CRP.
The lag-conditional response probability (lag-CRP) curve represents the probability of recalling a given event after the just-recalled event, as a function of their relative encoding positions (lag; Kahana, 1996). A lag of 1 indicates that a recalled event was presented immediately after the previously recalled event, and a lag of −3 indicates that a recalled event was presented three events before the previously recalled event. For each event transition, we computed the lag between the currently recalled event and the next recalled event, normalized by the total number of possible transitions. This resulted in a matrix with dimensions of participants by the number of lags (−17 to +17; 34 lags in total excluding lags of 0). A group-averaged lag-CRP curve was calculated by averaging across the rows of the matrix.
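For readers unfamiliar with the measure, the lag-CRP computation can be sketched in plain Python. The study used the psifr package; the code below does not call psifr and is only an illustration of the general actual-over-possible logic. It assumes each event appears at most once in a participant's order tags (repeats removed beforehand), and the names `lag_crp` and `group_average`, as well as the treatment of lags that were never possible, are assumptions.

```python
from collections import Counter


def lag_crp(recall_order, n_events=18):
    """Lag-CRP for one participant: {lag: actual / possible, or None}.

    recall_order: story positions (1-based) in the order they were recalled,
    with each event appearing at most once (assumption).
    """
    actual, possible = Counter(), Counter()
    recalled = set()
    for current, following in zip(recall_order, recall_order[1:]):
        recalled.add(current)
        # Every not-yet-recalled event is a possible transition target.
        for candidate in range(1, n_events + 1):
            if candidate not in recalled:
                possible[candidate - current] += 1
        actual[following - current] += 1
    return {
        lag: (actual[lag] / possible[lag] if possible[lag] else None)
        for lag in range(-(n_events - 1), n_events)
        if lag != 0
    }


def group_average(curves):
    """Average each lag over the participants for whom it was defined."""
    averaged = {}
    for lag in curves[0]:
        values = [c[lag] for c in curves if c[lag] is not None]
        averaged[lag] = sum(values) / len(values) if values else None
    return averaged


# Made-up order tags for two participants and an 18-event story.
curves = [lag_crp([1, 2, 3, 7, 8]), lag_crp([5, 6, 2, 3])]
print(group_average(curves)[1])  # mean probability of a lag +1 transition
```

With 18 events, each participant's curve has 34 entries (lags −17 to +17, excluding 0), matching the matrix dimensions described above.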
Temporal Clustering Scores.
Building upon lag-CRP curves, we calculated temporal clustering scores to quantify the extent to which participants recalled events according to their temporal order. For each transition between recalled events, considering each participant separately, we sorted all not-yet-recalled event transitions according to their absolute lag (that is, their distance away in the story). We then computed the percentile rank of the next event the participant recalled and averaged the percentile ranks across all participants for each condition. Recalling the story events in the exact order would yield a score of 1, and recalling the events in random order would yield a score of 0.5, corresponding to chance clustering. Clustering above chance is evidence of a temporal contiguity effect.
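The percentile-rank logic behind the temporal clustering score can be illustrated with a short sketch. It assumes order tags with no repeated events, assigns tied absolute lags half credit, and skips transitions with only one remaining candidate (where a rank is undefined); these handling choices and the function name `temporal_clustering` are assumptions rather than details confirmed by the text.

```python
def temporal_clustering(recall_order, n_events=18):
    """Mean percentile rank of the chosen transition by absolute lag.

    1.0 means the nearest available event was always recalled next;
    0.5 is the value expected under random transitions.
    """
    recalled = set()
    ranks = []
    for current, following in zip(recall_order, recall_order[1:]):
        recalled.add(current)
        candidates = [e for e in range(1, n_events + 1) if e not in recalled]
        if len(candidates) < 2:
            continue  # rank is undefined with a single candidate
        chosen_lag = abs(following - current)
        farther = sum(abs(e - current) > chosen_lag for e in candidates)
        # Other candidates at the same absolute lag count as half (assumption).
        tied = sum(abs(e - current) == chosen_lag for e in candidates) - 1
        ranks.append((farther + 0.5 * tied) / (len(candidates) - 1))
    return sum(ranks) / len(ranks) if ranks else None


print(temporal_clustering(list(range(1, 19))))  # recall in exact story order
print(temporal_clustering([1, 4, 3, 2], n_events=4))
```

Recalling all events in story order yields a score of 1, consistent with the description above, because the next event is always the nearest not-yet-recalled candidate.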
Temporal clustering scores were initially calculated based on the overall story, treating all narrative events as part of a single story without considering distinctions between narrative categories. This initial analysis provided a baseline measure of temporal organization in recall, independent of narrative category. We then conducted further analyses to compare temporal clustering scores across different narrative categories (Coherent, Unrelated, and Main Narratives) to investigate how narrative coherence influences the temporal organization of recall. For these analyses, transitions were based on the narrative category of the originating event (e.g., transitions originating from Coherent, Unrelated, or Main Narratives). Importantly, the narrative category of the subsequent recalled event was not restricted, allowing us to capture the influence of the originating event’s narrative category on recall dynamics, irrespective of the subsequent item’s narrative category. We conducted a one-way ANOVA to evaluate whether certain narrative categories promote stronger temporal clustering during recall. To assess whether temporal clustering for each narrative category exceeded chance levels (a score of 0.5), we conducted paired t-tests comparing mean temporal clustering score within each narrative condition against the chance threshold.
Lastly, we analyzed temporal clustering scores for individual conditions (Coherent Narrative Short Lag, Coherent Narrative Long Lag, Unrelated Narrative Short Lag, and Unrelated Narrative Long Lag) to investigate interactions between narrative coherence and temporal lag. Due to unequal numbers of observations across conditions, we used a mixed-effects model: Temporal clustering scores ~ Coherence x Lag + (1|Participant). In this model, narrative coherence and temporal lag were treatment coded, with Coherent and Long Lag as the reference levels. The interaction term (Coherence x Lag) captures the combined effect of narrative coherence and lag, while the random intercept for participants (1|Participant) accounts for interindividual variability. We also conducted separate t-tests comparing the mean temporal clustering score within each condition against chance.
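To make the treatment coding concrete, the sketch below builds the fixed-effects design row for each cell of the Coherence x Lag design, with Coherent and Long Lag as the reference levels. It is an illustrative sketch only: the function names and the coefficient values are hypothetical, and the participant random intercept is omitted for simplicity.

```python
def design_row(coherence, lag):
    """Return [intercept, unrelated, short, unrelated_x_short] for one cell."""
    unrelated = 1 if coherence == "Unrelated" else 0  # 0 = Coherent (reference)
    short = 1 if lag == "Short" else 0                # 0 = Long Lag (reference)
    return [1, unrelated, short, unrelated * short]

def predicted_mean(coefs, coherence, lag):
    """Fixed-effects prediction for one cell (random intercept omitted)."""
    return sum(b * x for b, x in zip(coefs, design_row(coherence, lag)))

# Hypothetical coefficients, chosen only to illustrate the coding scheme.
example_coefs = [0.55, 0.20, 0.01, 0.00]
for coherence in ("Coherent", "Unrelated"):
    for lag in ("Long", "Short"):
        value = predicted_mean(example_coefs, coherence, lag)
        print(coherence, lag, round(value, 3))
```

Under this coding, the intercept is the predicted score for Coherent Narrative Long Lag, the coherence coefficient is the Unrelated minus Coherent difference at Long Lag, and the lag coefficient is the Short minus Long difference for Coherent Narratives.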
Distance Between Sideplot Event Pairs.
While the temporal clustering analyses can reveal how narrative categories influence recall organization, we can also assess effects of narrative coherence and temporal lag on the temporal organization of recall, operationalized by how far apart paired sideplot events were recalled from each other. Recall distances between paired sideplot events were obtained from the number of intervening events between them in the recall sequence, such that consecutively recalled events have a distance of one. Smaller distances indicate that paired events were recalled closer together, and larger distances indicate that they were recalled farther apart. Recall distances were averaged across participants for each condition. Additionally, the mean distance for each condition was compared to the actual distance between paired events in the story.
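The distance measure can be sketched in a few lines of Python. This is a simplified illustration, not the study's analysis code: the function name and the event numbers are hypothetical, and distance is taken as the difference in recall positions, so that consecutively recalled events have a distance of one (an assumption consistent with the example in Fig. 5B).

```python
def recall_distance(recall_order, first_event, second_event):
    """Absolute difference in recall position between two paired events.

    Returns None when either event was not recalled, mirroring the
    requirement that both events of a pair must be recalled.
    """
    if first_event not in recall_order or second_event not in recall_order:
        return None
    first_pos = recall_order.index(first_event)
    second_pos = recall_order.index(second_event)
    return abs(first_pos - second_pos)

# Hypothetical pair: events 6 and 10, presented four positions apart.
print(recall_distance([1, 3, 6, 10, 5], 6, 10))  # recalled consecutively
print(recall_distance([6, 1, 3, 5, 10], 6, 10))  # three intervening events
print(recall_distance([1, 3, 6, 5], 6, 10))      # second event not recalled
```

In the first toy sequence the pair is recalled back to back (distance 1, closer than in the story), and in the second it is separated by three intervening events (distance 4).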
Successful recall of paired events varied, so we analyzed the influence of narrative coherence and temporal lag on the distance between recalled events using the following mixed effects model: Recall Distances between Paired Events ~ Coherence x Lag + (1|Participant) + (1|Version). The dependent variable (recall distances between paired events) is the absolute distance between paired events. Treatment coding for narrative coherence and temporal lag was used, with the “Coherent” and “Long Lag” conditions as the reference levels. The interaction term (Coherence x Lag) reflects the combined effect of narrative coherence and lag. The model also includes a random intercept for participants (1|Participant) to account for interindividual differences in baseline recall distances, and a random intercept for story version (1|Version) to account for potential differences across story event combinations.
Our mixed effects models rely on the following assumptions: linearity between predictors and recall distance, normality of residuals, and consistent residual variance across predictor levels. While the model accounts for within-participant recall variability, trials were assumed to be independent across events. Our assessment indicated that the model assumptions were adequately met, supporting the validity of our statistical inferences. The models were fit using the statsmodels package in Python. Lastly, due to the unequal number of observations in each condition, we conducted separate t-tests comparing the mean paired event distances for each narrative coherence condition to the actual paired event distances (i.e., 4 for Short Lag and 12 for Long Lag).
Results
Free Recall Performance is Driven by Narrative Coherence
Participants’ memory for Coherent Narratives was better than that for Unrelated Narratives, leading to a significant main effect of narrative coherence [F(1, 38) = 33.53, ηG² = 0.10, p < .001; Fig. 3A]. Neither a significant main effect of temporal distance [F(1, 38) = 0.47, ηG² = 0.00, p = .499] nor an interaction was observed [F(1, 38) = 0.04, ηG² = 0.00, p = .852]. Pairwise comparisons revealed that Coherent Narratives were better recalled than Unrelated Narratives across all lag conditions [Coherent Narrative Short Lag (M = 71.000, SD = 50.819) v. Unrelated Narrative Short Lag (M = 44.051, SD = 32.074): t(38) = 3.26, Hedges’ g = 0.63, p = .014 corrected, BF10 = 14.51, 95% CI = [10.23, 43.67]; Coherent Narrative Short Lag v. Unrelated Narrative Long Lag (M = 47.564, SD = 35.283): t(38) = 3.63, Hedges’ g = 0.53, p = .005 corrected, BF10 = 36.59, 95% CI = [10.38, 36.49]; Coherent Narrative Long Lag (M = 72.487, SD = 36.392) v. Unrelated Narrative Short Lag: t(38) = 5.65, Hedges’ g = 0.82, p < .001 corrected, BF10 = 1.01e4, 95% CI = [18.24, 38.63]; Coherent Narrative Long Lag v. Unrelated Narrative Long Lag: t(38) = 4.58, Hedges’ g = 0.69, p < .001 corrected, BF10 = 463.86, 95% CI = [13.89, 35.95]]. Experiment 2 thus extends prior findings, revealing that narrative coherence benefits recall not only when specific organizing cues are provided, but also when naturalistic events are freely recalled.
Fig. 3. Extension of Prior Work with Recall Performance.

(a) Extension of overall recall performance driven by narrative coherence in free recall (Experiment 2). (b) Serial position curves for each narrative category (i.e., Main, Coherent, and Unrelated Narratives), reflecting the proportion of participants who remembered each event as a function of the event’s position during story encoding. Key: Points represent individual participants’ average performance. Boxplots and violin plots display the distribution of the data, including the range, IQR, and median. Significance: + p < .1, * p < .05, ** p < .01, *** p < .001.
Serial Position Curve.
Differences in overall recall performance may be driven by the relative advantage of bridging events together in Coherent compared to Unrelated Narratives. Exploratory analyses of serial position curves showed that Unrelated Narrative events were recalled less often in Act 2 compared to Coherent and Main Narrative events (Fig. 3B). Moreover, observed frequencies significantly differed from a null expectation of equal recall of Act 1 and Act 2 events for Unrelated Narratives [χ² (1, 38) = 30.16, p < .001] but not Coherent Narratives [χ² (1, 38) = 5.38, p = .613] or Main Narrative events [χ² (1, 38) = 6.824, p = .655]. For Unrelated Narratives, lower recall in Act 2 than Act 1 is comparable to well-established primacy effects found in non-narrative, list-learning memory tasks. In contrast, Coherent and Main Narratives exhibit successful binding of Act 1 and Act 2, overcoming the decline seen in Act 2 for Unrelated Narratives. These findings suggest that the boost in overall recall performance for Coherent Narratives may be driven by meaningful narrative connections, which sustain recall performance into Act 2 and guard against the drop-off seen in Unrelated Narratives.
Evaluating Temporal Clusters in Free Recall
We next analyzed the temporal order in which the story events were presented and recalled. The sequence of story events presented and the sequence recalled were each treated as a list of items, as in a list-learning paradigm, allowing us to examine the lag-conditional response probability (lag-CRP) and calculate temporal clustering scores.
Lag-Conditional Response Probability Curve.
Recall transitions between events revealed patterns reflecting temporal structure (Fig. 4A). Recall also reflected additional influences not limited to transitions between events that neighbored each other in time. For instance, the highest lag-CRP at +2 reflects the interleaving of mainplot and sideplot narratives, such that participants tended to recall all mainplot events in chronological order while skipping over sideplot events. This is supported by exploratory post-hoc analyses revealing that recall transitions at a lag of +2 (M = 0.416, SD = 0.203) were more likely than those at +1 (M = 0.165, SD = 0.165) [pairwise comparison of lag-CRP +1 v. lag-CRP +2: t(38) = −5.588, Hedges’ g = −1.433, p < .001, BF10 = 8587.52, 95% CI = [−0.36, −0.17]]. This runs counter to standard contiguity effects often found in list-learning experiments (Kahana et al., 1996), but is simply a consequence of our stimulus design. Relatedly, we observed an increase in lag-CRP at −16 (transitioning from the last mainplot event at Event 18 to the first sideplot event at Event 2). This suggests that after mainplot events were recalled, participants tended to recall sideplot events. Together, these findings align with prior work demonstrating a strong within-category temporal contiguity effect (Polyn et al., 2011).
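For readers unfamiliar with the measure, the following sketch shows one common way to compute a lag-CRP: for each lag, the number of transitions actually made is divided by the number of times a transition at that lag was available. It is a simplified illustration, not the analysis code used here. The function name and the toy recall sequence are hypothetical, events are assumed to be numbered in story order, and repeated recalls are assumed to have been removed.

```python
from collections import Counter

def lag_crp(recall_sequences, n_events):
    """Conditional response probability for each lag, pooled over sequences."""
    actual = Counter()    # transitions that were made, keyed by lag
    possible = Counter()  # transitions that were available, keyed by lag
    for seq in recall_sequences:
        recalled = set()
        for current, nxt in zip(seq, seq[1:]):
            recalled.add(current)
            # Every not-yet-recalled event is an available transition target.
            for candidate in range(1, n_events + 1):
                if candidate not in recalled:
                    possible[candidate - current] += 1
            actual[nxt - current] += 1
    return {lag: actual[lag] / possible[lag] for lag in sorted(possible)}

# Toy example: every other event recalled in order, mimicking recall of one
# storyline that was interleaved with another during encoding.
crp = lag_crp([[1, 3, 5, 7, 9]], n_events=10)
print(crp[1], crp[2])
```

In this toy sequence the curve peaks at a lag of +2 rather than +1, the same qualitative signature that interleaved storylines would be expected to produce.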
Fig. 4. Evaluating Temporal Clustering Scores in Free Recall.

(a) Lag-CRP: Reflects the probability of recalling a given item after the just-recalled item, as a function of their relative encoding positions (lag), combined across narrative categories. (b) Temporal Clustering Scores: Participants’ tendency to organize their recall in relation to the original story’s sequence. Higher values indicate recalling events in the order that they were encoded. (c) Correlations between recall performance and temporal clustering scores. Key: Points represent individual participants’ average performance and shaded areas represent bootstrap-derived standard errors (1,000 samples). Boxplots display the distribution of the data, including the range, IQR, and median. Dotted vertical grey lines represent chance. Solid lines represent the line of best fit and confidence interval.
Overall Temporal Clustering Score.
Building on these patterns, we calculated temporal clustering scores, which quantify the extent to which participants recalled events according to their temporal proximity. Recall reflected organization based on temporal structure (M = 0.658, SD = 0.197) that was significantly different from randomly ordered recall [t(38) = 5.798, Hedges’ g = 1.30, p < .001, BF10 = 1.59e4, 95% CI = [0.61, 0.72]; Fig. 4B]. In fact, only 6 out of 39 participants (or 15%) scored below chance and did not exhibit temporal contiguity effects. We also found modest evidence that participants with higher temporal clustering scores had better overall memory for the story [Pearson r(38) = 0.35, p = .029, BF10 = 1.99, 95% CI = [0.078, 0.620]], suggesting that recalling events in chronological order may aid overall memory performance (Fig. 4C). Our findings align with prior list-learning studies demonstrating that recall tends to follow temporal structure, as well as with more recent studies involving naturalistic elements (Diamond & Levine, 2020). Because we were specifically interested in potential differences in recall organization across narrative categories (Coherent, Unrelated, and Main Narratives), we conducted additional analyses to investigate this.
Narrative Coherence Influences the Temporal Organization of Free Recall
Temporal Clustering Based on Narrative Conditions.
We conducted further analyses comparing temporal clustering scores across different narrative categories (Coherent, Unrelated, and Main Narratives) to examine influences of narrative coherence on the temporal organization of recall (Fig. 5A and 5B). We observed a significant effect of narrative category [F(2, 108) = 8.19, ηG² = 0.13, p < .001; Fig. 5C]. Pairwise comparisons revealed that Coherent Narratives (M = 0.601, SD = 0.291) showed a significantly lower temporal clustering score than Unrelated Narratives (M = 0.758, SD = 0.220), [t(32) = −2.934, Hedges’ g = −0.64, p = .018 corrected, BF10 = 6.62, 95% CI = [−0.29, −0.05]] and Main Narratives (M = 0.745, SD = 0.104), [t(32) = −3.308, Hedges’ g = −0.72, p = .007 corrected, BF10 = 15.33, 95% CI = [−0.26, −0.06]]. No significant difference was observed between Unrelated Narratives and Main Narratives [t(32) = 0.237, Hedges’ g = 0.05, p = .861 corrected, BF10 = 0.197, 95% CI = [−0.07, 0.08]].
Fig. 5. Narrative Coherence Affects the Temporal Organization of Free Recall.

(a) Example of perfect recall order based on the encoded story. Individual shapes represent a condition: Mainplot events (green circle), Coherent Narrative Short Lag (light blue square), Coherent Narrative Long Lag (dark blue triangle), Unrelated Narrative Short Lag (light red square), Unrelated Narrative Long Lag (dark red triangle). (b) Example of participant recall based on the encoded story. The distance between paired events in each condition was calculated based on the number of intervening events between the recalled pair. For example, in the Coherent Narrative Long Lag condition, the events are recalled consecutively, resulting in a distance of one. Conversely, in the Unrelated Narrative Short Lag condition, there are three intervening events between the paired events, resulting in a distance of four. (c) Temporal clustering scores for each narrative category. (d) Average distance between paired events for each condition. Key: Points represent individual participants’ average performance and shaded areas represent bootstrap-derived standard errors (1,000 samples). Boxplots display the distribution of the data, including the range, IQR, and median. The red X mark represents the actual distance between the pairs (Short Lag: 4 and Long Lag: 12).
These findings reveal that participants were inclined to diverge from the originally encoded story order when recalling Coherent Narratives but tended to adhere to the temporal structure in which the story was encoded for Unrelated and Main Narratives. We conducted separate t-tests to determine whether temporal clustering scores across narrative categories were significantly different from randomly ordered recall (score of 0.5). We found above-chance temporal clustering scores for Unrelated Narratives [t(33) = 6.96, Hedges’ g = 1.581, p < .001, BF10 = 9.11e4, 95% CI = [0.67, 0.83]] and Main Narratives [t(38) = 11.41, Hedges’ g = 2.559, p < .001, BF10 = 9.72e10, 95% CI = [0.68, 0.76]], indicating the presence of temporal organization for these aspects of the story. However, Coherent Narratives did not show above-chance temporal organization [t(37) = 1.06, Hedges’ g = 0.240, p = .298; BF10 = 0.29, 95% CI = [0.45, 0.65]]. That is, in addition to the pairwise differences across conditions reported above, we observe no evidence for temporal clustering of Coherent Narrative events.
We applied similar analyses to the individual conditions (Coherent Narrative Short Lag, Coherent Narrative Long Lag, Unrelated Narrative Short Lag, and Unrelated Narrative Long Lag) to examine whether temporal lag influences clustering within each narrative category. Using the mixed effects model Temporal clustering score ~ Coherence x Lag + (1|Participant), we found a main effect of narrative coherence [β = 0.196, SE = 0.076, z = 2.59, p = .01, 95% CI = [0.048, 0.344]]. There was no significant main effect of temporal lag [β = 0.029, SE = 0.071, z = 0.406, p = .685, 95% CI = [−0.110, 0.168]] nor an interaction [β = −0.003, SE = 0.106, z = −0.025, p = .980, 95% CI = [−0.210, 0.205]]. Lastly, we conducted separate t-tests to determine whether temporal clustering scores across conditions were significantly different from randomly ordered recall (score of 0.5). Similar to the analysis above, we found above-chance temporal clustering scores for Unrelated Narratives for both Short (M = 0.770, SD = 0.249), [t(27) = 5.744, Hedges’ g = 1.514, p < .001, BF10 = 4758, 95% CI = [0.17, 0.37]] and Long Lag (M = 0.752, SD = 0.293), [t(26) = 4.461, Hedges’ g = 1.197, p < .001, BF10 = 196, 95% CI = [0.14, 0.37]]. Coherent Narratives did not exhibit temporal clustering above chance for either Short (M = 0.562, SD = 0.320), [t(34) = 1.14, Hedges’ g = 0.269, p = .262, BF10 = 0.329, 95% CI = [−0.05, 0.17]] or Long Lag (M = 0.551, SD = 0.390), [t(32) = 0.75, Hedges’ g = 0.182, p = .460, BF10 = 0.241, 95% CI = [−0.09, 0.19]]. Together, these findings suggest that when presented with a more informative cue, such as meaningful connections within a story, people structured recall based on those connections rather than strictly following temporal structure.
Coherent, but not Unrelated Narrative Event Pairs are Recalled Contiguously.
We assessed effects of narrative coherence and temporal lag on the temporal organization of recall, operationalized by how far apart paired sideplot events were recalled from each other (Fig. 5A and 5B). To be included in these analyses for a given condition (i.e., Coherent Narrative Short Lag, Coherent Narrative Long Lag, Unrelated Narrative Short Lag, or Unrelated Narrative Long Lag), participants had to have recalled both events of the pair (e.g., the Coherent Narrative Short Lag events in Act 1 and Act 2). This resulted in an unequal number of observations in each condition (Coherent Narrative Short Lag: 30 participants, Coherent Narrative Long Lag: 28 participants, Unrelated Narrative Short Lag: 12 participants, and Unrelated Narrative Long Lag: 11 participants).
We used the following mixed effects model to analyze the influence of narrative coherence and temporal lag on the distance between recalled events: Recall Distances between Paired Events ~ Coherence x Lag + (1|Participant) + (1|Version). Our assessment indicated that the model assumptions were adequately met, supporting the validity of our statistical inferences. Residual normality was assessed using histograms and Q-Q plots, which indicated that residuals were approximately normally distributed, though with minor deviations in the tails. Linearity was evaluated via residual plots, which indicated an approximately linear relationship between predictors and recall distance. Homogeneity of variance was tested using Levene’s test, which indicated no significant difference in variance between coherence conditions (W = 0.504, p = .480), suggesting that residual variance was approximately equal across groups.
The model showed that narrative coherence significantly affected the temporal organization of recall [β = 2.55, SE = 1.219, z = 2.10, p = .036, 95% CI = [0.166, 4.946]] such that paired Coherent Narrative events were recalled closer together (M = 5.258, SD = 4.767) than paired Unrelated Narrative events (M = 7.478, SD = 4.907; Fig. 5D). No statistically significant fixed effect of temporal lag (Short Lag: M = 5.333, SD = 4.200; Long Lag: M = 6.487, SD = 5.515), [β = −0.36, SE = 0.843, z = −0.427, p = .669, 95% CI = [−2.013, 1.292]] or an interaction [β = −1.775, SE = 1.645, z = −1.079, p = .281, 95% CI = [−5.000, 1.450]] was observed.
Given that there were fewer Unrelated Narrative event pairs recalled than there were Coherent Narrative event pairs, we conducted separate t-tests comparing the mean paired event distances for each narrative coherence condition to its actual paired event distances (i.e., 4 for Short Lag and 12 for Long Lag). A significant difference emerged for Coherent Narratives, showing that Coherent Narratives were recalled closer together than their presented temporal distance in the story [t(33) = −2.901, Hedges’ g = −0.693, p = .007, BF10 = 6.174, 95% CI = [−4.18, −0.73]]. Unrelated Narratives, on the other hand, were recalled similarly to their original position in the story, showing evidence for less temporal reorganization [t(17) = −0.27, Hedges’ g = −0.066, p = .788, BF10 = 0.251, 95% CI = [−2.43, 1.87]]. These findings together indicate a deviation from the originally encoded story for Coherent but not Unrelated Narrative event pairs. Interestingly, when comparing each temporal lag condition against its original positions, we also found a significant difference for Long [t(30) = −6.120, Hedges’ g = −1.535, p < .001, BF10 = 1.76e4, 95% CI = [−7.59, −3.79]] but not Short [t(31) = 1.749, Hedges’ g = 0.432, p = .090, BF10 = 0.736, 95% CI = [−0.22, 2.88]] Lag event pairs.
Discussion
Experiment 2 aimed to evaluate how narrative coherence influences the temporal structure of recall. Our findings indicate that narrative coherence enhanced recall not only with specific cues but also during free recall of naturalistic events. When provided with a more informative cue emphasizing meaningful connections in a story, people structured recall around those connections rather than strictly adhering to temporal order. This was supported by the finding that Coherent Narrative sideplot event pairs were recalled closer together than Unrelated Narrative event pairs. Additionally, temporal clustering analyses showed less temporal clustering for Coherent Narratives compared to Unrelated Narratives, indicating a greater deviation from the presented order. While our analyses are primarily concerned with narrative coherence and temporal contiguity effects on the organization of recall, exploratory analyses calculating semantic similarity between events (see Supplementary Materials, Fig. S11) further suggest that deviations from the timeline during recall were based on higher organizing principles such as semantic relatedness and narrative coherence. Interestingly, exploratory analyses demonstrate that temporal clustering scores predicted recall performance while semantic clustering scores did not (Supplementary Materials).
General Discussion
In the present study, we used narrative recall tasks to ask how narrative coherence and temporal context affect which events are remembered and in what order. We presented narratives that included sideplots separated by intervening events (Short and Long Lags). These sideplots formed either Coherent Narratives or Unrelated Narratives. We replicated and extended prior work (Cohn-Sheehy, Delarazan, et al. 2022) demonstrating that narrative coherence enhanced recall during character-cued recall (Experiment 1) and unconstrained, free recall (Experiment 2). Critically, narratively coherent events that were separated in time were reorganized into contiguous clusters—in contrast to temporally separated events that shared a character but not a common narrative arc. Alongside the general recall enhancement associated with narrative coherence, this suggests a possible adaptive function of anchoring memories to meaningful links, even if this deviates from the ground-truth temporal context.
Considerable work has emphasized the importance of temporal organization as people recall episodic memories. For instance, well-established temporal contiguity effects in list-learning highlight the tendency for nearby items to be recalled together and influence recall performance (Kahana et al., 1996). Subsequent models have incorporated recall clustering based on semantic associations to better characterize recall (Polyn et al., 2009). These models can account for the fact that people tend to recall items of the same taxonomic category (e.g., animals, occupation, vegetables) together, even when they are presented randomly (Bousfield 1953; Romney et al. 1993). However, these results are largely based on word-level associations in unstructured lists, which lack higher-order meaning and associations.
In contrast, real-life experiences consist of dynamic events with complex, meaningful connections. The experimental design used here incorporated these elements and directly tested how meaningful connections can modulate recall organization. In these meaningfully structured materials, recall preserved elements of the given temporal structure but also showed strong clustering based on coherent subcomponents in the narrative. As a result, narratively coherent event pairs tended to be recalled closer together than they were presented. In other words, recall of lifelike events clustered based on meaningful connections, potentially rewriting the narrative timeline to span across temporal gaps.
Meaningful structures such as narratives can significantly bolster memory for specific information. For instance, encoding random words in the context of a story elicits deeper processing (Delaney & Knowles, 2005) and enhanced recall of those words (Laming, 2006; Bower & Clark, 1969). This is further supported in the current study, reflecting enhanced recall of Coherent Narratives, which tap into an overarching story structure. Unrelated Narratives do not tap into this structure, which according to the discourse processing literature may lead to an adaptive pruning process (Kintsch & Welsch, 2013). That is, the second Unrelated Narrative event may be actively pruned during encoding due to its lack of meaningful connection to the first event, which is reflected in relatively poorer recall of those events. Additionally, Coherent Narratives could reflect integration of incoming information with existing mental models (Radvansky & Zacks, 2017). An important avenue for future research will be to continue to clarify the nature of narrative coherence, including when and how it exerts its influences (e.g., through an ongoing process or through a replay mechanism after a connecting event is encountered).
Other studies have directly examined the role of time in narratives (Mandler & Johnson 1977; Keven, 2016). Linking events into a temporal sequence has been identified as a means to organize information. However, temporal organization may not be effective in the reconstruction of complex, lifelike memories (Kintsch 1968; Mandler 1986). It has been argued that linking events via causal connections is a stronger strategy than relying on sequential order (Trabasso & Van Den Broek, 1985; Trabasso & Sperry, 1985). This is especially crucial when, much like in real life, connected events can span large temporal gaps (Keven, 2016). Recent work further highlights the importance of causal structure in narrative recall: when participants recalled scenes from a nonlinear film, causal and chronological strategies were stronger than temporal or semantic strategies (Antony et al., 2024). The current findings align with these ideas. Specifically, Unrelated Narratives showed a heightened reliance on temporal structure compared to those that formed a Coherent Narrative.
An intriguing possibility, which can be fleshed out by future studies, is that effects of narrative coherence are at least in part related to causality. Recent work has shown that events with higher causal and semantic centrality are remembered better than low centrality events (Lee & Chen, 2022). Our findings align with this pattern, as Main and Coherent Narratives, both of which exhibit stronger meaningful connections, were better recalled than Unrelated Narratives. However, it is important to note that while coherence and causality are often intertwined, they are not always synonymous. Our stimuli include some Coherent Narrative event sequences that imply causal order, whereas others remain flexible in their ordering without disrupting comprehension. Additionally, prior work using the same narrative materials (Cohn-Sheehy, Delarazan, et al. 2022) tested whether recall performance could be explained by semantic similarity (as measured using Universal Sentence Encoder embeddings; Cer et al., 2018). Importantly, narrative coherence remained a significant predictor of recall performance even after accounting for sentence-level similarity, reinforcing that coherence effects extend beyond semantic associations alone.
Consistent with this, exploratory analyses revealed that semantic similarity (via correlating language model embeddings between events) exerted an influence on recall transitions. However, we observed below- rather than above-chance semantic clustering, suggesting that other constructs such as coherence and temporal organization may have overridden the influence of simple semantic associations. Importantly, semantic clustering scores did not significantly predict overall recall performance, suggesting that coherence reflects broader narrative-level organization rather than local semantic overlap (Supplementary Materials). Future work can directly compare and disentangle the contributions of narrative coherence, causality, and semantic relatedness in memory organization. Together, these findings suggest that, although temporal context does influence recall of complex events, people may gravitate less toward temporally structuring recall when more meaningful connections, such as coherence and causality, are available (Mandler & Johnson, 1977).
The current results also align with the proposal that narratives themselves tend to be organized to follow an idealized internal structure (or story schema) that aids in event comprehension (Mandler & Johnson, 1977). The degree to which narratives maintain schema-relevant structure is known to influence memory (Mandler & Johnson, 1977). The divergence from temporal structure observed within Coherent Narratives is consistent with a tendency to adhere to an ideal narrative structure, despite the fact that it deviates from the encoded timeline. The arrangement of the narratives in our study was designed such that mainplot and sideplot events were intentionally interleaved with each other. Thus, following the exact chronological order of events would disrupt the ideal narrative structure of separate but co-existing narratives (i.e., Coherent and Main Narratives) and reflect prioritization of temporal associations.
However, our results revealed a different tendency: Recall transitions were influenced by higher-order narrative structure. Specifically, the uptick in +2 lag transitions indicates a preference for within-narrative transitions, driven by interleaved Main Narrative events. This pattern aligns with prior research showing similar within-category transitions in free-recall studies with intermixed items from different semantic categories. While overall temporal contiguity appeared diminished, separate analyses revealed strong temporal contiguity within categories, with participants transitioning to the nearest same-category item. Similarly, our findings suggest that participants prioritized within-narrative transitions, favoring higher-order structure over temporal contiguity when possible. Moreover, Coherent Narrative event pairs were recalled closer together, reflecting a bias to align with the overarching narrative structure rather than temporal context. Our findings align with a more event-based organization, with clear delineations and connections between events based on meaning and significance, rather than organization based on a drifting temporal context (DuBrow et al., 2017).
It is important to note that, although Coherent Narratives led to temporal reorganization during recall, memory for the story overall was associated with strong temporal clustering. These findings align with prior research that examined recall organization in naturalistic contexts (e.g., navigating a museum tour) and demonstrated a positive association between temporal clustering and the number of internal details recalled (Diamond & Levine, 2020). Thus, we do not take our findings to suggest that it would be adaptive to abandon temporal context altogether as a means of organizing event recall. In fact, other studies have shown that while categories could disrupt fine-grained temporal organization, coarse temporal organization was still preserved (Hong et al., 2024). Time is a critical source of information that is widespread throughout episodic memory systems in the brain (Howard, 2017), and is itself strongly related to the causal flow of narratives (Briner et al., 2012). However, the way in which memories are remembered need not follow the straight arrow of time. That is, events may be remembered in a way that disrupts their original chronological or causal sequence. We reveal that narratively meaningful connections are particularly strong influences on memory retrieval, such that it can be advantageous to bridge even large temporal gaps to reconstruct a coherent narrative.
The present work is, of course, not without limitations. Although these results establish that narrative coherence affects the organization of recall, the current method cannot assess whether narrative coherence distorts an internal timeline or affects how information is read off during recall. Relatedly, it remains unclear whether these temporal distortions are due to automatic processes or strategic control of output order during recall. Future research can manipulate explicit recall strategies to explore whether these findings hold under various conditions.
Additionally, we intentionally placed events in their current positions (e.g., Short Lags in the middle and Long Lags towards the end) to maintain balance between conditions while telling a comprehensive overarching story. This design choice, however, leaves open the question of whether the effects of temporal lag and narrative coherence would differ if certain events occurred at the beginning or end of the sequence. Future studies should explore how varying the placement of events might influence the role of time and coherence in recall.
While the present study aims to take a step toward reflecting naturalistic experiences, our findings may not directly translate to the real world. In particular, time is referenced in the context of our stimuli. While these stimuli are built to emulate real-world events, they clearly do not occur at the timescale of real life. Thus, one potential factor underlying the lack of observed temporal lag effects is that the temporal distances between events in this study (on the order of minutes) are negligible in the scope of distances between real-world events (on the order of hours, days, or weeks). Future research should test whether temporal lag effects emerge when events are separated by longer, more naturalistic intervals.
In sum, the present results strongly suggest that when people recall the complex events of their lives, they leverage links of meaning that bind disjointed components across time. Representations of time and of narrative structure interact to facilitate remembering the past, but we tend to use higher-order organizing principles such as narrative coherence, even if this rearranges the original timeline of events.
Supplementary Material
Acknowledgments
We thank Sarah Morse for helping with data collection as well as Ata B. Karagoz, Erwin M. Macalalad, and members of the Complex Memory Lab and Dynamic Cognition Lab for helpful discussions and support.
Funding:
This material is based upon work supported by the National Institutes of Health under Training Grant GR0028512 (A.I.D.) and Grant R01AG062438 (J.M.Z.), the National Science Foundation under Grant DGE-2139839 (A.I.D.) and DGE-1745038 (A.I.D.), and the Office of Naval Research under Grant N00014-17-1-2961 (J.M.Z.).
Footnotes
Conflict of interest: All authors declare no conflicts of interest.
Artificial Intelligence: No artificial intelligence assisted technologies were used in this research or the creation of this article.
Ethics: This research received approval from a Washington University in St. Louis Institutional Review Board (ID: 202010173).
Preregistration: No aspects of the experiments were preregistered. Materials: Some study materials are publicly available (https://osf.io/2ercv/). Story materials are available on request to the corresponding author. Data: All primary data are publicly available (https://osf.io/2ercv/). Analysis scripts: All analysis scripts are publicly available (https://osf.io/2ercv/; https://github.com/aidelarazan/kramer2.0).
References
- Antony J, Lozano A, Dhoat P, Chen J, & Bennion K (2024). Causal and chronological relationships predict memory organization for nonlinear narratives. Journal of Cognitive Neuroscience, 36(11), 2368–2385.
- Bartlett FC (1932). Remembering: A study in experimental and social psychology. Cambridge: Cambridge University Press.
- Bietti LM, Tilston O, & Bangerter A (2019). Storytelling as adaptive collective sensemaking. Topics in Cognitive Science, 11(4), 710–732.
- Bousfield WA (1953). The occurrence of clustering in the recall of randomly arranged associates. The Journal of General Psychology, 49(2), 229–240.
- Bower GH, & Clark MC (1969). Narrative stories as mediators for serial learning. Psychonomic Science, 14(4), 181–182.
- Briner SW, Virtue S, & Kurby CA (2012). Processing causality in narrative events: Temporal order matters. Discourse Processes, 49(1), 61–77.
- Cer D, Yang Y, Kong SY, Hua N, Limtiaco N, John RS, … & Kurzweil R (2018, November). Universal sentence encoder for English. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations (pp. 169–174).
- Lee H, & Chen J (2022). Predicting memory from the network structure of naturalistic events. Nature Communications, 13(1), 4235.
- Cohn-Sheehy BI, Delarazan AI, Crivelli-Decker JE, Reagh ZM, Mundada NS, Yonelinas AP, … & Ranganath C (2022). Narratives bridge the divide between distant events in episodic memory. Memory & Cognition, 50(3), 478–494.
- Cohn-Sheehy BI, Delarazan AI, Reagh ZM, Crivelli-Decker JE, Kim K, Barnett AJ, … & Ranganath C (2021). The hippocampus constructs narrative memories across distant events. Current Biology, 31(22), 4935–4945.
- Conway MA (2009). Episodic memories. Neuropsychologia, 47(11), 2305–2313.
- Crowder RG (2014). Principles of learning and memory: Classic edition. Psychology Press.
- Delarazan AI, Bosak E, Cohn-Sheehy BI, Foureaux-Lee V, Zacks JM, & Reagh Z (2025, March 17). Narrative coherence warps the timeline of recalled naturalistic events. Retrieved from osf.io/2ercv
- Delaney PF, & Knowles ME (2005). Encoding strategy changes and spacing effects in the free recall of unmixed lists. Journal of Memory and Language, 52(1), 120–130.
- Diamond NB, & Levine B (2020). Linking detail to temporal structure in naturalistic-event recall. Psychological Science, 31(12), 1557–1572.
- DuBrow S, & Davachi L (2016). Temporal binding within and across events. Neurobiology of Learning and Memory, 134, 107–114.
- DuBrow S, Rouhani N, Niv Y, & Norman KA (2017). Does mental context drift or shift? Current Opinion in Behavioral Sciences, 17, 141–146.
- Ezzyat Y, & Davachi L (2014). Similarity breeds proximity: pattern similarity within and across contexts is related to later mnemonic judgments of temporal proximity. Neuron, 81(5), 1179–1189.
- Faber M, & Gennari SP (2015). In search of lost time: Reconstructing the unfolding of events from memory. Cognition, 143, 193–202.
- Faber M, & Gennari SP (2017). Effects of learned episodic event structure on prospective duration judgments. Journal of Experimental Psychology: Learning, Memory, and Cognition, 43(8), 1203.
- Flores S, Bailey HR, Eisenberg ML, & Zacks JM (2017). Event segmentation improves event memory up to one month later. Journal of Experimental Psychology: Learning, Memory, and Cognition, 43(8), 1183.
- Graesser AC, Singer M, & Trabasso T (1994). Constructing inferences during narrative text comprehension. Psychological Review, 101(3), 371.
- Heusser AC, Fitzpatrick PC, & Manning JR (2021). Geometric models reveal behavioural and neural signatures of transforming experiences into memories. Nature Human Behaviour, 5(7), 905–919.
- Hintzman DL (2016). Is memory organized by temporal contiguity? Memory & Cognition, 44(3), 365–375.
- Hong MK, Gunn JB, Fazio LK, & Polyn SM (2024). The modulation and elimination of temporal organization in free recall. Journal of Experimental Psychology: Learning, Memory, and Cognition, 50(7), 1035.
- Horner AJ, Bisby JA, Wang A, Bogus K, & Burgess N (2016). The role of spatial boundaries in shaping long-term event representations. Cognition, 154, 151–164.
- Howard MW (2017). Temporal and spatial context in the mind and brain. Current Opinion in Behavioral Sciences, 17, 14–19.
- Howard MW, & Kahana MJ (2002). A distributed representation of temporal context. Journal of Mathematical Psychology, 46(3), 269–299.
- Jeunehomme O, & D’Argembeau A (2019). The time to remember: Temporal compression and duration judgements in memory for real-life events. Quarterly Journal of Experimental Psychology, 72(4), 930–942.
- Kahana MJ (1996). Associative retrieval processes in free recall. Memory & Cognition, 24(1), 103–109.
- Keven N (2016). Events, narratives and memory. Synthese, 193(8), 2497–2517.
- Kintsch W (1968). Recognition and free recall of organized lists. Journal of Experimental Psychology, 78(3p1), 481.
- Kintsch W, & Welsch DM (2013). The construction-integration model: A framework for studying memory for text. In Relating theory and data (pp. 381–400). Psychology Press.
- Laming D (2006). Predicting free recalls. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32(5), 1146.
- Levine B, Svoboda E, Hay JF, Winocur G, & Moscovitch M (2002). Aging and autobiographical memory: dissociating episodic from semantic retrieval. Psychology and Aging, 17(4), 677.
- Lee H, Bellana B, & Chen J (2020). What can narratives tell us about the neural bases of human memory? Current Opinion in Behavioral Sciences, 32, 111–119.
- Mandler JM (1986). On the comprehension of temporal order. Language and Cognitive Processes, 1(4), 309–320.
- Mandler JM, & Johnson NS (1977). Remembrance of things parsed: Story structure and recall. Cognitive Psychology, 9(1), 111–151.
- Morton NW (2020). Psifr: Analysis and visualization of free recall data. Journal of Open Source Software, 5(54), 2669.
- Murdock BB (1974). Human memory. New York.
- Murdock BB (1982). A theory for the storage and retrieval of item and associative information. Psychological Review, 89(6), 609.
- Polyn SM, Erlikhman G, & Kahana MJ (2011). Semantic cuing and the scale insensitivity of recency and contiguity. Journal of Experimental Psychology: Learning, Memory, and Cognition, 37(3), 766.
- Polyn SM, Norman KA, & Kahana MJ (2009). A context maintenance and retrieval model of organizational processes in free recall. Psychological Review, 116(1), 129.
- Radvansky GA, & Zacks JM (2017). Event boundaries in memory and cognition. Current Opinion in Behavioral Sciences, 17, 133–140.
- Romney AK, Brewer DD, & Batchelder WH (1993). Predicting clustering from semantic structure. Psychological Science, 4(1), 28–34.
- Sederberg PB, Miller JF, Howard MW, & Kahana MJ (2010). The temporal contiguity effect predicts episodic memory performance. Memory & Cognition, 38(6), 689–699.
- Speer NK, & Zacks JM (2005). Temporal changes as event boundaries: Processing and memory consequences of narrative time shifts. Journal of Memory and Language, 53(1), 125–140.
- Sridharan D, Levitin DJ, Chafe CH, Berger J, & Menon V (2007). Neural dynamics of event segmentation in music: converging evidence for dissociable ventral and dorsal networks. Neuron, 55(3), 521–532.
- Swallow KM, Barch DM, Head D, Maley CJ, Holder D, & Zacks JM (2011). Changes in events alter how people remember recent information. Journal of Cognitive Neuroscience, 23(5), 1052–1064.
- Trabasso T, & Sperry LL (1985). Causal relatedness and importance of story events. Journal of Memory and Language, 24(5), 595–611.
- Trabasso T, & Van Den Broek P (1985). Causal thinking and the representation of narrative events. Journal of Memory and Language, 24(5), 612–630.
- Zwaan RA, Langston MC, & Graesser AC (1995). The construction of situation models in narrative comprehension: An event-indexing model. Psychological Science, 6(5), 292–297.
