Philosophical Transactions of the Royal Society B: Biological Sciences. 2022 Oct 31; 377(1866): 20210339. doi: 10.1098/rstb.2021.0339

What would have happened? Counterfactuals, hypotheticals and causal judgements

Tobias Gerstenberg
PMCID: PMC9629435 · PMID: 36314143

Abstract

How do people make causal judgements? In this paper, I show that counterfactual simulations are necessary for explaining causal judgements about events, and that hypotheticals do not suffice. In two experiments, participants viewed video clips of dynamic interactions between billiard balls. In Experiment 1, participants either made hypothetical judgements about whether ball B would go through the gate if ball A were not present in the scene, or counterfactual judgements about whether ball B would have gone through the gate if ball A had not been present. Because the clips featured a block in front of the gate that sometimes moved and sometimes stayed put, hypothetical and counterfactual judgements came apart. A computational model that evaluates hypotheticals and counterfactuals by running noisy physical simulations accurately captured participants’ judgements. In Experiment 2, participants judged whether ball A caused ball B to go through the gate. The results showed a tight fit between counterfactual and causal judgements, whereas hypotheticals did not predict causal judgements. I discuss the implications of this work for theories of causality, and for studying the development of counterfactual thinking in children.

This article is part of the theme issue ‘Thinking about possibilities: mechanisms, ontogeny, functions and phylogeny’.

Keywords: causality, counterfactual, hypothetical, conditional, mental simulation, intuitive physics

1. Introduction

How do people make causal judgements? Consider the diagram shown in figure 1a. Ball A and ball B enter the scene from the right, collide with one another, and ball B goes through the gate. Did ball A cause ball B to go through the gate? Intuitively, the answer is ‘yes’. But why?

Figure 1.

Two diagrams illustrating the difference between a situation in which (a) ball A caused ball B to go through the gate, and (b) one in which ball A did not cause ball B to go through the gate. In (a), ball B would have missed the gate if ball A had not been present in the scene. In (b), ball B would have gone through the gate even if ball A had not been present.

Reaching causal verdicts about scenes like this one requires going beyond what actually happened, and considering what would have happened in a relevant counterfactual situation [1–3]. If ball A had not been present in the scene then ball B would not have gone through the gate. The fact that this counterfactual is true suggests that ball A caused ball B to go through the gate. By the same logic, in figure 1b, ball A did not cause ball B to go through the gate. Here, ball B would have gone through the gate even if ball A had not been present in the scene. Gerstenberg et al. [4] show that a model based on counterfactual simulation accurately captures people’s causal judgements about physical events like this one (see also [3,5]). But are counterfactuals really necessary, or might it be possible to explain causal judgements differently? One such possibility is that another kind of cognitive operation may suffice: hypothetical simulation. But what is the difference between counterfactual and hypothetical simulation?

A counterfactual simulation involves observing what actually happened, mentally travelling back in time to imagine a change to what actually happened, and then simulating how this alternative possibility would have played out. If the outcome in the counterfactual situation would have been different from what actually happened then the event of interest caused the outcome. By contrast, a hypothetical simulation involves imagining a possible future. This does not require going back in time and mentally changing something that already happened. Instead, one considers a possible change in the future. While the counterfactual asks whether ball B would have gone through the gate if ball A had not been there, the hypothetical asks whether ball B would go through the gate if ball A were not there. So, counterfactuals and hypotheticals differ in whether the mind travels to the past or to the future.

For example, when judging causation in figure 1a, as the balls enter the scene an observer may consider a hypothetical simulation of where ball B would go if ball A were not present in the scene, and then compare what actually happened to the outcome of this future hypothetical. In fact, for the clips shown in figure 1, the hypothetical probability (would B go through the gate if ball A were not there) and the counterfactual probability (would B have gone through the gate if ball A had not been there) are the same. These cases cannot distinguish between what kind of mental time travel is involved in making causal judgements. We need new evidence to determine whether counterfactuals are necessary for explaining causal judgements, or whether hypotheticals suffice.1

In this paper, I present new evidence that bears on the question of what kind of mental simulation is involved when people make causal judgements. First, I will clarify the conceptual distinction between conditionals, hypotheticals and counterfactuals using the formal framework developed by Pearl [6]. I will then discuss prior research focusing on the role that counterfactuals play in theories of causal judgement. The empirical evidence so far does not conclusively show that people engage in counterfactual simulation when making causal judgements. I will present such evidence. I develop a physical simulation model that generates both hypothetical and counterfactual simulations, and test the model in two experiments. The experiments feature a set of physical scenarios in which the outcomes of hypothetical and counterfactual simulations differ. Experiment 1 shows that the model accurately captures participants’ hypothetical and counterfactual judgements. Experiment 2 tests whether causal judgements are better explained by hypotheticals or counterfactuals. The results clearly support the counterfactual account. I discuss the implications of these findings for theories of causality, and for psychological research into the development of counterfactual reasoning.

(a) Counterfactuals versus hypotheticals: same, same but different

Counterfactuals and hypotheticals are both thoughts about possibilities. The key difference is when a change to actuality is imagined to take place. Hypotheticals are thoughts about changes that lie in the future. For example, as ball A and ball B enter the scene in figure 2a and before they are about to collide with one another, one might wonder whether ball B would go through the gate if ball A was removed from the scene. Imagining such a future hypothetical, an observer mentally removes ball A from the scene and simulates what path ball B would take. Hypotheticals are essential for decision-making and planning [7]. Making good decisions requires evaluating the likely consequences of different hypothetical actions and then choosing the action with the highest expected utility.

Figure 2.

Illustration of the (a) one-cause scenario and (b) two-cause scenario. The images show what happened in each scenario. The diagrams and structural equations below capture the causal dependence between the variables that represent the relevant events (X = whether ball A is present or absent, Y = whether ball B goes through the gate or not, Z = whether the block ends up in front of the gate or not). In the one-cause scenario, the hypothetical probability (Would ball B go through the gate if ball A were not there?) and the counterfactual probability (Would ball B have gone through the gate if ball A had not been there?) are the same. In the two-cause scenario, the hypothetical and counterfactual probability come apart because it is uncertain whether or not the block will move. The hypothetical probability is determined at the beginning of the clip (at which point it is unclear whether or not the block will move), whereas the counterfactual probability is determined after the clip has played (at which point it is now clear whether or not the block moved).

Counterfactuals are thoughts about changes that lie in the past. For example, after ball B ended up going through the gate in figure 2a, one might wonder whether ball B would still have gone through the gate even if ball A had not been present. Here, the observer first takes in everything that happened, then goes back in time, and mentally simulates how things would have played out if the change of interest had taken place.

Note that for the scene shown in figure 2a, the hypothetical and counterfactual outcomes are the same. Where ball B would go if ball A was not present (the hypothetical) is the same as where ball B would have gone if ball A had not been present (the counterfactual). This is the case because there are no other factors that influence the outcome (beyond the presence or absence of ball A) about which an observer may have some degree of uncertainty (e.g. other balls that could enter at a later point, the possibility that the gate may close, …). However, as we will see soon, hypotheticals and counterfactuals can come apart.

(i) A hierarchy of causal concepts

Formal theories of causality make a conceptual distinction between hypotheticals and counterfactuals. Pearl & Mackenzie [8] propose a metaphorical ladder of causation where each rung represents what kinds of causal questions can be answered [6,9]. Let us consider a simple setting with two binary variables X ∈ {x, x′} (the candidate cause), and Y ∈ {y, y′} (the candidate effect), where x (or y) indicates that the event of interest happened, and x′ (or y′) indicates that it did not happen. Table 1 summarizes the three levels of the causal hierarchy (see also [10]).

Table 1.

The causal hierarchy adapted from Pearl [9]. On level I, one can only answer questions about probabilistic dependence. On level II, one can distinguish genuine causation from mere correlation. On level III, one can answer questions about why a particular event of interest happened.

level I (observation/prediction): expression p(y|x); activity: seeing; question: How does x change my belief in y? Example: Would the grass be dry if we found the sprinkler off?
level II (intervention/hypothetical): expression p(y|do(x)); activity: doing; question: Would y happen if I did x? Example: Would the grass be dry if we made sure that the sprinkler was off?
level III (counterfactual): expression p(y_x|x′, y′); activity: explaining; question: Would y have happened instead of y′, if I had done x instead of x′? Example: Would the grass have been dry if the sprinkler had been off, given that the grass is wet and the sprinkler on?

On level I of the ladder, one can answer conditional questions, such as how likely y is to happen if x happens, p(y|x). On level II, one can answer hypothetical questions, such as how likely y would happen if x were made true, p(y|do(x)). The formally defined concept of an intervention, do(), distinguishes causal from merely correlational relationships [6]. Intuitively, when X causes Y, intervening on X increases the probability that Y will happen (i.e. p(y|do(x)) > p(y)). When X and Y are merely correlated, by contrast, it is possible that this correlation is due to another factor, such as a common cause C that brings about both X and Y (in this case, p(y|do(x)) = p(y)). Finally, on level III, one can answer counterfactual questions, such as whether y would have happened if x had been made true, given that in fact neither x nor y happened, p(y_x|x′, y′). Answering counterfactual questions requires a combination of conditioning on what actually happened, and then (mentally) changing an event that already happened, to see whether things would have played out differently.

As we have seen above, hypothetical and counterfactual probabilities do not always come apart. Let us say that in figure 2a, X denotes whether or not ball A was present in the scene with X ∈ {x = ball A is present, x′ = ball A is absent}, and Y denotes whether or not ball B goes through the gate with Y ∈ {y = ball B goes through the gate, y′ = ball B does not go through the gate}. In this simple setting, the hypothetical probability p(y|do(x′)) is the same as the counterfactual probability p(y_{x′}|x, y). The probability that ball B would go through the gate if ball A were not there is the same as the probability that ball B would have gone through if ball A had not been there. This is the case because there are no other factors that influence ball B’s going through the gate except for ball A. In order for hypothetical and counterfactual probabilities to come apart, we need to go beyond such simple one-cause scenarios.

Figure 2b shows a situation in which ball A and ball B interact with one another in the same way as they did in figure 2a. However, this time there is another object in the scene that affects the outcome: a block in front of the gate that sometimes moves and sometimes stays put. I will use the variable Z for the block with Z ∈ {z = the block is in front of the gate, z′ = the block is not in front of the gate}. Let us assume that the block has a 50% chance of moving, p(Z = z) = 0.5. In this scenario, the hypothetical probability and the counterfactual probability come apart. The hypothetical probability that ball B would go through the gate if ball A were not present is p(y|do(x′)) ≈ 0.5. Ball B would only go through the gate (in ball A’s absence) if the block moved, but whether or not this will happen is unclear at the beginning of the clip. The counterfactual probability that ball B would have gone through the gate if ball A had not been present is p(y_{x′}|x, z′, y) ≈ 1. This is because when considering the counterfactual, we condition on what actually happened: ball A was present (x), the block was not in front of the gate (z′), and ball B went through the gate (y). Because the counterfactual intervention on A’s presence does not affect whether or not the block moves (there is no causal link from X to Z), it is clear that the block would still have been out of the way in the counterfactual situation in which ball A had not been present, and that ball B would still have gone through the gate in that case.
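To make the contrast concrete, here is a minimal Python sketch of the two-cause scenario rendered as structural equations. The rendering of Y is my own simplification (I assume the collision with ball A sends ball B through the gate regardless of the block), so the numbers illustrate the logic rather than the paper’s model.

```python
# Structural-equation sketch of the two-cause scenario (figure 2b).
# Simplifying assumption (mine): with ball A present, the collision sends
# ball B through the gate regardless of the block; without ball A, ball B
# goes through only if the block is out of the way.

P_Z = 0.5  # prior probability that the block ends up in front of the gate

def goes_through(x, z):
    """Y: does ball B go through the gate? x: ball A present, z: block in front."""
    return True if x else not z

# Level II, hypothetical p(y | do(x')): Z is still unknown, so average over
# its prior; ball B only goes through in the cases where the block is away.
p_hypothetical = (P_Z * goes_through(False, True)
                  + (1 - P_Z) * goes_through(False, False))  # = 0.5

# Level III, counterfactual p(y_{x'} | x, z', y): condition on the observed
# block position (z': not in front of the gate), then intervene on X.
p_counterfactual = float(goes_through(False, False))         # = 1.0
```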

(ii) The sprinkler example

As another illustration of how inferences based on observations, interventions and counterfactuals come apart, consider the example shown in figure 3 (adapted from Pearl [6]). Picture yourself in sunny California wondering whether it was the sprinkler (S ∈ {s = sprinkler on, s′ = sprinkler off}) or the rain (R ∈ {r = rain present, r′ = rain absent}) that caused the grass on your lawn to be wet (W ∈ {w = grass is wet, w′ = grass is dry}). The clouds (C ∈ {c = clouds present, c′ = clouds absent}) cause the rain, and they prevent the sprinkler from running (it is one of these Silicon Valley smart sprinklers that only runs if there are no clouds). The grass is wet if the sprinkler is on, if it rains, or if both are the case (figure 3a). Somewhat unrealistically, there is a 50% chance on any given day that there are clouds, p(C = c) = 0.5.

Figure 3.

Diagrammatic illustration of the difference between making inferences based on observations, interventions, and counterfactuals. (a) The causal structure of the setting. (b) What actually happened. (c) Observation: What can be inferred from observing that the sprinkler is off? (d) Intervention: What can be inferred from intervening to turn the sprinkler off? (e) Counterfactual: What can be inferred from first observing what actually happened, and then intervening to turn the sprinkler off? Here, only the counterfactual level yields the intuitively correct response that the sprinkler caused the grass to be wet in the actual situation.

On Monday morning, you go outside and you see that there are no clouds, that the sprinkler is on, that there is no rain, and that the grass is wet (figure 3b). You wonder, did the sprinkler cause the grass to be wet, p(s → w)? Intuitively, the answer is ‘yes’, of course. After all, the sprinkler was on, and there was no rain. But what verdict would we reach based on the three different levels in the causal hierarchy?

To answer the question of whether the sprinkler caused the grass to be wet, we want to test whether the grass would have been dry if the sprinkler had been off. On level I, we can only condition on observations (figure 3c). Observing that the sprinkler is off licenses the diagnostic inference that there must be clouds, which in turn means that it rained, which in turn means that the grass is wet. So, on this level, the grass would not be dry if one observed the sprinkler to be off, p(w′|s′) = 0. On level II, we condition on hypothetically intervening on the scene (figure 3d). Intervening to turn the sprinkler off breaks the causal link between clouds (C) and sprinkler (S). Intervening on a variable removes all the incoming links into that variable (and thereby breaks any diagnostic inferences from the intervened-on variable to its parents). There is a 50% chance that the grass would be dry if we intervened to turn the sprinkler off because it now depends on whether or not there would be clouds, p(w′|do(s′)) = 0.5. On level III, we first condition on what actually happened, and then consider a counterfactual intervention that would have turned the sprinkler off (figure 3e). This yields the inference that the grass would have been dry had the sprinkler been turned off, p(w′_{s′}|c′, s, r′, w) = 1. So, only on the counterfactual level do we get the intuitive verdict that it was the sprinkler that caused the grass to be wet.
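The three levels can be checked mechanically. Below is a minimal Python sketch of the sprinkler network as I encode it from figure 3 (the clouds are the only stochastic variable); it reproduces the three probabilities derived above.

```python
# Sprinkler network (figure 3): C ~ Bernoulli(0.5); S := not C; R := C;
# W := S or R. Everything is deterministic given the clouds.

P_C = 0.5  # prior probability of clouds

# Level I, observation p(w' | s'): the sprinkler is off only in worlds with
# clouds, which also have rain and hence wet grass, so p(w' | s') = 0.
p_dry_observe = 0.0

# Level II, intervention p(w' | do(s')): cut the C -> S link and force the
# sprinkler off; the grass is then dry iff there is no rain (no clouds).
def dry_under_do_off(c):
    rain = c
    return not (False or rain)  # wet := sprinkler or rain, sprinkler forced off

p_dry_do = (1 - P_C) * dry_under_do_off(False) + P_C * dry_under_do_off(True)  # = 0.5

# Level III, counterfactual p(w'_{s'} | c', s, r', w): keep the observed
# exogenous value (no clouds, hence no rain), force the sprinkler off, and
# recompute downstream: no sprinkler and no rain means dry grass.
p_dry_counterfactual = float(not (False or False))  # = 1.0
```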

(b) Prior work

There is a vast literature on how conditional and counterfactual reasoning relates to causality in philosophy (e.g. [11–14]), linguistics [15–20] and psychology (e.g. [21–28]). Here, I will focus on work in psychology that has directly been inspired by Pearl’s [6] formal modelling framework (for an overview, see [2]).

A substantial body of work has shown that people differentiate between levels I and II of the causal hierarchy (table 1). People are sensitive to the different inferences that are licensed by ‘seeing’ versus ‘doing’ [7,29–35]. For example, whereas observing an effect makes it more likely that a cause was present, intervening on an effect blocks the diagnostic inference about the likelihood of its cause.

Psychologists have also studied whether the way in which people reason about counterfactuals accords with Pearl’s [6] framework. Much of this work has focused on the question of whether or not people ‘backtrack’. For example, consider a causal chain structure A → B → C in which none of the events happened. A backtracking counterfactual question asks whether A would have happened if one had intervened to make B happen. According to Pearl’s framework, the answer is ‘no’ (i.e. p(a_b|a′, b′, c′) = 0). Because counterfactuals are construed as interventions that break any incoming links into the intervened-on variable, only the values of variables that are downstream from the intervention may change. There is no backtracking in Pearl’s framework (i.e. no changing the values of variables upstream from the intervention). However, several studies have shown that people sometimes do backtrack [36–41]. There is also a rich literature on the development of counterfactual reasoning, which I will say more about in the General Discussion.
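A minimal Python rendering of what ‘no backtracking’ means formally (my encoding of the deterministic chain described above):

```python
# Deterministic chain A -> B -> C with B := A and C := B; A is exogenous.
# Observed: none of the events happened (a', b', c').
a_observed = False

# Counterfactual query p(a_b | a', b', c'): intervene do(B = True). The
# intervention removes the incoming A -> B link, so only variables
# downstream of B (here, C) can change; A keeps its observed value.
b_intervened = True
c_counterfactual = b_intervened   # C := B, downstream of the intervention
a_counterfactual = a_observed     # no backtracking: p(a_b | a', b', c') = 0
```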

Only a few studies have looked directly into whether people distinguish between the second and third level of the causal hierarchy. Meder et al. [42] studied what causal inferences participants draw based on evidence from observations (level I), evidence from interventions (level II) or evidence from counterfactuals (level III). The experiment was designed such that, normatively, different inferences were licensed for each kind of evidence. While the results showed that participants differentiated between observational and interventional evidence, they did not distinguish counterfactual from interventional evidence.

Most relevant to the question of how hypothetical and counterfactual judgements relate to causal judgements is a recent paper by Skovgaard-Olsen et al. [43]. Across a series of six experiments, the authors show that people differentiate between indicative conditionals (if x happens then y happens) and counterfactual conditionals (if x had happened then y would have happened). For example, consider a situation in which A is a common cause of both B and C, B ← A → C. In this case, the indicative conditional ‘if b happens, then c happens’ is true (i.e. p(c|b) is high). But the counterfactual conditional ‘if b had happened, then c would have happened’ is false (i.e. p(c_b|b′, c′) is low). Considering a counterfactual intervention that had changed B would not have affected C.

In their Experiment 5, Skovgaard-Olsen et al. [43] used such a common-cause structure to test whether participants’ judgements about indicative conditionals and counterfactual conditionals come apart and, if so, which type of conditional better aligns with causal judgements. Participants were either asked about the relationship between A and B (predictive), B and A (diagnostic), or B and C (spurious). Each participant judged the probability of an indicative conditional being true (e.g. ‘if a then b’ in the predictive condition), a counterfactual being true (e.g. if a′ then b′, phrased as ‘if a had not happened, then b would not have happened’), or a causal statement being true (e.g. ‘a caused b’). The results showed that participants responded differently to the different question types. Whereas their responses for indicative conditionals were essentially the same in each of the three conditions (predictive, diagnostic, spurious), for counterfactual conditionals and causal statements their answers differed between the conditions. For example, they said that A caused B in the predictive condition, but that A did not cause B in the diagnostic condition. Importantly, participants’ counterfactual and causal judgements were closely aligned with one another, whereas participants’ judgements about the indicative conditionals did not match their causal judgements.

(i) The role of counterfactuals in theories of causal judgement

Much prior research has argued that counterfactuals and causal judgements are intimately linked [1,44]. Here, by causal judgements, I mean judgements about what caused what to happen in a particular situation, such as whether the sprinkler caused the grass to be wet. As we have seen, counterfactuals also form the basis for recent approaches in computer science that formally model causal judgements [6,45]. In these approaches, causal knowledge is expressed in the form of causal Bayes nets or structural equations that capture the causal dependence between the variables in the model. Counterfactuals are construed as interventions that set a variable to a desired value (see also [46–49]). By considering such counterfactual interventions, the formalism yields verdicts about which variables actually caused some outcome of interest [50]. Intuitively, it is those variables that were pivotal for the outcome which caused it to come about (see [51–54] for work showing how this idea of being pivotal is important for judgements of responsibility as well).

While a simple counterfactual test fails in situations of causal overdetermination (where two or more individually sufficient causes brought about an outcome), more sophisticated tests have been developed to deal with such situations (see [45,50], for details). These tests consider not only whether a variable was pivotal in the actual situation, but also whether it would have been pivotal in other possible situations that could have arisen.2

An alternative class of approaches, process theories of causation, explains causal judgements merely in terms of what actually happened, without relying on counterfactuals [55,56]. Empirical work has shown that people’s causal judgements are indeed sensitive to the way in which the outcome came about, and not just to mere counterfactual dependence [55,57–61].

Inspired by both counterfactual theories and process theories of causation, Gerstenberg et al. [3] developed the counterfactual simulation model (CSM) of causal judgement for physical events. The CSM predicts that people’s causal judgements are a function of their subjective degree of belief that the candidate cause made a difference to whether or not the outcome of interest happened. The dashed paths in figure 1 show how ball B would have moved if ball A had not been present in the scene. An observer does not have direct access to what would have happened. Instead, they need to use their intuitive understanding of the domain to simulate the counterfactual. Gerstenberg et al. [3] showed that the CSM accurately captured people’s quantitative causal judgements. The more certain participants were that ball A’s presence made a difference to whether or not the outcome happened, the more they agreed that ball A caused ball B to go through the gate. So, for example, participants gave high causal ratings for clips like the one in figure 1a, and low causal ratings for clips like the one in figure 1b. Participants gave intermediate judgements whenever it was unclear whether ball B would have gone through the gate if ball A had not been present in the scene (i.e. when ball B was initially headed to one of the edges of the gate).

(ii) Direct evidence for spontaneous counterfactual simulation?

Gerstenberg et al. [3] show that the CSM captures people’s causal judgements to a high degree of quantitative accuracy. However, they do not show directly that people engage in counterfactual simulation when making causal judgements. By tracking participants’ eye-movements, Gerstenberg et al. [4] demonstrated that participants spontaneously tried to assess where ball B would go if ball A was not present in the scene. When asked to make causal judgements, participants did not just focus their attention on what actually happened. Instead, their eyes saccaded to where ball B would go if ball A was not present. Saccades are fast eye movements from one place to another that exceed a certain velocity threshold. These eye movements were more frequent in situations in which the counterfactual outcome was less clear (i.e. situations in which ball B was headed toward one of the edges of the gate), suggesting they may serve the purpose of reducing uncertainty about the counterfactual outcome. By contrast, when participants were asked to make a judgement about the actual outcome (i.e. how closely ball B went through the gate, or missed the gate), they tended to focus on what actually happened and only rarely saccaded to where ball B would have gone. Looks to where ball B would have gone were recruited specifically in service of making causal judgements.

Figure 4 shows the endpoints of participants’ saccades for one of the clips from the experiment, separated by the experimental condition. The conditions only differed in terms of what questions participants were asked to answer about the clip. Gerstenberg et al. [4] termed those looks ‘counterfactual saccades’ for which the endpoint of the saccade was close to the path that ball B would have taken if ball A had been absent. The results showed that participants produced more counterfactual saccades in the counterfactual condition (where they were asked to say whether ball B would have gone into the gate if ball A had not been there), and in the causal condition, compared with the outcome condition.

Figure 4.

Saccade plots for one of the trials from Gerstenberg et al. [4] in the counterfactual, causal and outcome condition. Each point in the plot shows an endpoint of a saccade (a fast eye movement from one position to another). Saccade endpoints that were both close to the counterfactual path that ball B would have taken if ball A had not been present in the scene, and far enough to the left of where the collision happened, were classified as ‘counterfactual saccade’ (white points). The rest were classified as ‘other saccade’ (black points). The time window for this analysis was constrained to range from after the two balls entered the scene to before they collided with one another. This was done because after the two balls collide, ball A travels on a similar path to the one that ball B would have taken if ball A had not been present in the scene. By restricting the time window to before the collision, one can be sure that the ‘counterfactual saccades’ are anticipatory saccades to where ball B would go, rather than saccades to where ball A currently is.

While Gerstenberg et al. [4] termed these looks ‘counterfactual saccades’, it is important to note that these looks actually happened before the two balls collided. So, in some sense, these looks were ‘hypothetical saccades’ to where ball B would go if ball A were removed from the scene. The CSM postulates that people make causal judgements by comparing what actually happened with what they believe would have happened in the relevant counterfactual situation, such as in the situation in which ball A had been removed from the scene. Another possibility, however, is that people compute the probability of a future hypothetical outcome instead, and then compare what actually happened to that hypothetical outcome.

As discussed earlier, Skovgaard-Olsen et al. [43] show that people are sensitive to the difference between inferences on level I and level III of Pearl’s [6] causal hierarchy, and that level III inferences are more closely aligned with causal judgements than level I inferences. The work presented here is a natural follow-up. I will show that level III inferences are critical for capturing causal judgements about dynamic physical interactions, and that level II inferences do not suffice. I will present a computational model that implements hypothetical and counterfactual inference as mental simulations in a physical setting, and then test in two experiments which kind of mental simulation better explains people’s causal judgements.

2. Simulation model

In the experiments below, I ask different groups of participants to make three different kinds of judgements: hypothetical judgements about what would happen in the future, counterfactual judgements about what would have happened if things had been different, or causal judgements. Figure 5a shows an example of the kinds of video clips that participants saw in the experiments. I will represent this clip with three variables. X denotes whether ball A was present in the scene (x) or not (x′). Y denotes whether ball B went through the gate (y) or did not go through (y′), and Z denotes whether the final position of the block was in front of the gate (z) or out of the way (z′).

Figure 5.

Diagrams illustrating what actually happened as well as how hypothetical simulations and counterfactual simulations are generated. (a) In the actual situation, ball A was present in the scene (x), the block did not end up in front of the gate (z′), and ball B went through the gate (y). (b) To compute the hypothetical probability of whether ball B would miss the gate if ball A were not present in the scene p(y′|do(x′)), the model removes ball A from the scene, and then simulates what would happen. The dashed path illustrates ball B’s movement in the simulation. At each moment in the simulation, a small degree of noise is added to ball B’s trajectory to capture the fact that participants have some degree of uncertainty about exactly how ball B would move if ball A were not there. In some of the simulations the block stays put (see top example) while in others the block moves (bottom example). (c) To compute the counterfactual probability of whether ball B would have missed the gate if ball A had not been present in the scene p(y′_{x′}|x, y), the model first takes into account everything that actually happened, which includes whether or not the block moved. It then goes back in time (indicated by the dotted arrow in the diagram) to replay the clip with ball A removed. Based on the outcome of many such simulations, the hypothetical probability with which ball B would go through the gate if ball A were not present is around 50% because the block moves with 50% probability. The counterfactual probability with which ball B would have gone through the gate if ball A had not been present in the scene is close to 100% because the block always moves in each counterfactual simulation just like it did in the actual situation. (Online version in colour.)

In the hypothetical condition, the model computes the probability of whether ball B would go through the gate if ball A were not present in the scene, p(y|do(x′)). The model computes this probability by running a number of hypothetical simulations. The model first conditions on what it actually observed. That is, it considers the balls’ initial trajectories and the initial state of the block. The model then removes ball A from the scene and simulates what would happen in its absence. Figure 5b shows two runs of the simulation model in the hypothetical condition. In the top one, ball B did not go into the gate because the block stayed put. In the bottom one, ball B ended up going into the gate because the block moved out of the way.

In the counterfactual condition, the model computes the probability of whether ball B would have gone through the gate if ball A had not been present in the scene, p(y_{x′}|x, y). Here, instead of only conditioning on what happened up to the point at which the two balls collided in the actual situation, the model takes into account the full clip until the end. So it observes whether or not ball B went into the gate, and also whether or not (and when) the block moved. When the model now simulates what would have happened if ball A had not been present in the scene, it still has uncertainty about ball B’s movement, as well as about the exact time at which the block would have moved. However, importantly, it does not have any uncertainty about whether or not the block moved. Figure 5c shows two runs of the simulation model in the counterfactual condition. In both situations, ball B ends up going through the gate, as the block always moves out of the way (just like it did in the actual situation).

The simulation model has two sources of uncertainty. First, the model has uncertainty about ball B’s movement. To capture this uncertainty, the model adds noise to ball B’s movement at each time step of the physics simulation: the direction of ball B’s velocity vector is randomly perturbed, with the perturbation drawn from a Gaussian distribution N(0, σ_ball), where σ_ball determines how strong the random perturbations are.

Second, the model has uncertainty about whether, and if so when, the block moves. In the experiment, I tell participants that the block sometimes moves and sometimes stays put. Across the clips that participants see in the experiment, the block moves half of the time. So, in the model, I set the probability that the block will move to p = 0.5. If the block moves, then the model is uncertain about exactly when the block will start moving. I model this uncertainty by adding noise to the actual time point in the physical simulation at which the block moves. This noise is drawn from a Gaussian distribution N(0, σ_block), where σ_block determines the degree of uncertainty about when the block would move. The time point at which the block starts moving is constrained to lie between the time at which the two balls collide and the time at which the clip ends. This constraint ensures that, in a hypothetical simulation, the block cannot start moving before the point at which the balls collided in the actual situation. Remember that the model conditions on what happened up to this point, at which the block is still in its initial position.

To compute the hypothetical probability p(y|do(x′)) and the counterfactual probability p(y_{x′}|x, y), the model generates a large number of simulations for each clip and then simply computes the proportion of simulations in which ball B ended up going through the gate. The key difference between hypothetical and counterfactual simulations is that for counterfactual simulations, the model conditions on what actually happened all the way until the end of the clip. In contrast, for hypothetical simulations, the model only conditions on what happened up until before the two balls collided. This means that the model has more uncertainty about the hypothetical outcome (because the block may or may not move) than about the counterfactual outcome. For example, for the clip shown in figure 5a, the hypothetical probability is close to 50% (ball B only goes through if the block goes out of the way), whereas the counterfactual probability is close to 100% (the block moves in each of the simulations, but it might sometimes not move out of the way in time).
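The following Python sketch shows the logic of the two kinds of simulation in schematic form. It is a stand-in, not the paper’s implementation: the actual model perturbs ball B’s velocity vector inside a 2D physics engine, whereas this sketch collapses trajectory noise into noise on ball B’s arrival time at the gate. All function names, parameter values and timings are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

SIGMA_BALL = 0.05     # noise on ball B's trajectory (illustrative value)
SIGMA_BLOCK = 10.0    # noise, in frames, on when the block moves (illustrative)
P_BLOCK_MOVES = 0.5   # prior probability that the block moves
N_SIM = 10_000

def rollout(block_in_front_initially, block_moves, t_move, t_collision, t_end,
            t_arrival=90.0):
    """One noisy simulation of ball B with ball A removed.

    Schematic physics: ball B reaches the gate at a (noisy) arrival time and
    goes through unless the block is in front of the gate at that moment.
    """
    t_b = t_arrival * (1 + rng.normal(0.0, SIGMA_BALL))
    blocked = block_in_front_initially
    if block_moves:
        # Noisy movement time, constrained to lie between collision and clip end
        t = float(np.clip(t_move + rng.normal(0.0, SIGMA_BLOCK), t_collision, t_end))
        if t <= t_b:
            blocked = not blocked  # the block toggles its position in time
    return not blocked  # True: ball B goes through the gate

def hypothetical_prob(block_in_front_initially, t_collision, t_end):
    """p(y | do(x')): whether/when the block moves is unknown at judgement time."""
    hits = 0
    for _ in range(N_SIM):
        moves = rng.random() < P_BLOCK_MOVES
        t_move = rng.uniform(t_collision, t_end)
        hits += rollout(block_in_front_initially, moves, t_move, t_collision, t_end)
    return hits / N_SIM

def counterfactual_prob(block_in_front_initially, block_moved, t_move_observed,
                        t_collision, t_end):
    """p(y_{x'} | x, y, z): condition on whether (and roughly when) the block moved."""
    hits = 0
    for _ in range(N_SIM):
        hits += rollout(block_in_front_initially, block_moved, t_move_observed,
                        t_collision, t_end)
    return hits / N_SIM
```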

The simulation model then uses the hypothetical or counterfactual probabilities to compute the probability that x caused y. According to the hypothetical simulation model, the probability that x caused y is given by

p(x → y) = p(y′|do(x′)).    (2.1)

The more likely ball B would miss the gate (y′) if ball A were not there (do(x′)), the more likely ball A caused ball B to go into the gate.

According to the counterfactual simulation model, the probability that x caused y is given by

p(x → y) = p(y′_{x′}|x, y).    (2.2)

The more likely ball B would have missed the gate if ball A had not been there (y′_{x′}), when in fact ball A was present (x) and ball B went into the gate (y), the more likely ball A caused ball B to go into the gate.3
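Using the probability functions from the earlier sketch (again, the names and the timings in frames are my illustrative assumptions), the two model variants’ causal predictions for a clip like figure 5a come out as follows:

```python
# Clip like figure 5a: block initially in front of the gate, and it
# actually moved out of the way (timings in frames are made up).
p_hyp = hypothetical_prob(block_in_front_initially=True, t_collision=30, t_end=120)
p_cf = counterfactual_prob(block_in_front_initially=True, block_moved=True,
                           t_move_observed=60, t_collision=30, t_end=120)

# Equation (2.1): p(x -> y) = p(y' | do(x'))
p_cause_hypothetical = 1 - p_hyp   # intermediate: the block may or may not move
# Equation (2.2): p(x -> y) = p(y'_{x'} | x, y)
p_cause_counterfactual = 1 - p_cf  # close to 0: ball B would have gone in anyway
```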

Experiment 1 tests whether the simulation model accurately captures participants’ hypothetical and counterfactual judgements. Experiment 2 then tests whether participants’ causal judgements are better explained by hypothetical simulations (equation (2.1)), or by counterfactual simulations (equation (2.2)).

3. Experiment 1: Hypothetical versus counterfactual simulations

In this experiment, I wanted to see how people make hypothetical and counterfactual judgements, and whether their judgements would be accurately captured by the simulation model.

(a) Methods

All of the materials including the data, experiment code and analysis scripts are available here: https://github.com/cicl-stanford/counterfactual_hypothetical/.

(i) Participants

One hundred and ten participants (age: M = 35, s.d. = 10; gender: 35 female, 72 male, 1 non-binary, 2 preferred not to say; race: 87 White, 11 Black, 10 Asian, 1 Native American, 1 preferred not to say; ethnicity: 13 Hispanic, 96 not Hispanic, 1 preferred not to say) were recruited via Amazon Mechanical Turk using psiTurk [62]. Only participants based in the USA with an approval rating of 95% or higher were able to participate [63].

(ii) Design and procedure

The experiment had two conditions that differed in whether participants were asked to answer a hypothetical question about what would happen if ball A was removed, or a counterfactual question about what would have happened if ball A had been removed.

The instructions in both conditions were largely identical. Participants were told that their task would be to make judgements about video clips, and they viewed two diagrams similar to those in figure 6 illustrating what the clips would look like. Participants learned that in each clip, two balls, ball A and ball B, enter the scene from the right and collide with one another. In some of the clips, ball B ends up going through the red gate on the left, and in some of the clips ball B misses the gate. They were also told that there is a brown block on a track that may or may not move. They were not provided with any specific information about how likely it was that the block would move and, if so, at what time. In fact, the block moved in half of the clips and stayed put in the other half. For the clips in which it moved, it moved at the same point in time.

Figure 6.

Diagrams of the eight test clips that participants saw. In each clip, both ball A and ball B are initially out of sight, enter the scene from the right, and collide with one another. The diagrams show the full trajectory of ball B, and the trajectory of ball A up until the collision (what trajectory ball A took after the collision is not shown). (a) In clips 1–4, ball B goes through the gate on the left. (b) In clips 5–8, ball B misses the gate. In half of the clips, the brown block in front of the gate moves its position, whereas in the other half it stays put. For example, in clip 1, the block is initially in front of the gate, and then moves up. In this clip, ball B would have gone through the gate even if ball A had not been present in the scene (because the block moved out of the way before ball B would have gotten there). In the hypothetical condition, the clip paused shortly before the two balls collided. The block was still at its initial position at this point in time. In the counterfactual and causal condition, the clip played until the end. (Online version in colour.)

Participants were asked a set of multiple-choice comprehension check questions. One of the questions made sure that participants had paid attention to the fact that the block sometimes slides along the track, and that it sometimes stays put. If any of the comprehension check questions were answered incorrectly, participants were redirected to read the instructions again. Only participants who answered all of the comprehension check questions correctly were able to proceed to the test phase.

Participants first watched two practice clips to familiarize themselves with the setting and the task. In both of the clips, the two balls collided with one another. In one of the clips, ball B did not go through the gate and the block moved. In the other clip, ball B went through the gate and the block did not move. After the practice clips, the eight test clips shown in figure 6 were presented in randomized order. In all of the clips, balls A and B enter the scene from the right and collide with one another. In clips 1–4, ball B goes through the gate (figure 6a). In clips 5–8, ball B does not go through the gate (figure 6b). As the figure shows, ball B’s full trajectory and ball A’s trajectory up until the two balls collided were identical within each of the two sets of clips. What differed between the clips was the initial position of the block, and whether or not it moved. For example, in clip 1, the block was initially in front of the gate but then moved out of the way. In clip 2, the block did not move and stayed in front of the gate the whole time.

In the hypothetical condition (N = 50), each clip paused shortly before the two balls collided with one another. Participants were allowed to replay the clip as many times as they liked. They were then asked to what extent they agreed with the statement ‘Ball B would go through the gate if ball A wasn’t there’ and indicated their answer on a sliding scale with the endpoints being labelled ‘not at all’ (0) and ‘very much’ (100). After having provided their judgement on the slider, participants viewed the full clip of what actually happened until the end. The reason I wanted participants to see the whole clip until the end was so that they could see that the block sometimes moved, and sometimes did not move. Note that they did not get to see a clip of what would have happened if ball A had not been there, so they did not get direct feedback about whether or not their judgement was correct.

In the counterfactual condition (N = 60), each clip played until the end. Here, too, participants were able to replay the clips as many times as they liked. They were then asked to what extent they agreed with the statement ‘Ball B would have gone through the gate if ball A hadn’t been there’ and indicated their answer on a sliding scale with the endpoints being labelled ‘not at all’ (0) and ‘very much’ (100). So the key differences between the hypothetical and the counterfactual condition were how the question was phrased, and at what time point in the clip the question was asked—shortly before the collision in the hypothetical condition, or at the end of the clip in the counterfactual condition.

After the test phase participants provided demographic information. They were also asked what factors influenced how they made their judgement and responded using a free text form. On average, it took participants 7.9 min (s.d. = 3.6) to complete the experiment.

(b) Results

Figure 7 shows participants’ ratings in the hypothetical condition and in the counterfactual condition for the eight test clips, together with the model predictions. I will discuss the results from the hypothetical condition and the counterfactual condition in turn before comparing participants’ ratings with the model predictions.

Figure 7.

Mean ratings (bars) in the hypothetical (blue) and counterfactual (red) condition for situations in which ball B went in (left) or ball B missed (right), together with the model predictions (circles). The images on the x-axis illustrate the initial and final position of the block. The block is in front of the gate if it is at the bottom. For example, in clip 1, the block was initially in front of the gate but then moved out of the way. Note: Error bars are 95% bootstrapped confidence intervals. (Online version in colour.)

(c) Hypothetical condition

Figure 7 shows that participants’ mean judgements in the hypothetical condition tended to be close to the midpoint of the scale, and that they were affected only by the initial position of the block (see table 2, block initial).4 For example, judgements of whether ball B would go through the gate if ball A were not there were lower in clips 1 and 2, where the block was initially in the way, than in clips 3 and 4, for which the block was initially out of the way. Recall that in the hypothetical condition, participants only viewed the clip up until shortly before the collision, at which point the clip paused. It is therefore no surprise that the final position of the block did not affect their judgements (as they could not have known at the time of making the judgement what the final position of the block would be). Whether ball B ended up going into the gate or missed it also did not affect participants’ hypothetical judgements. Again, at the time of judgement participants did not know what the outcome would be.

Table 2.

Posterior means and 95% highest density intervals for each fixed effect in the Bayesian mixed-effects regression model. I fitted the model separately for participants’ hypothetical and counterfactual judgements. The results show that participants’ hypothetical judgements are most strongly influenced by the initial position of the block, and that counterfactual judgements are most strongly influenced by the final position of the block. Note: I used sum contrasts for the predictor variables with no/yes for the block variables, and miss/hit for the outcome variable. Model specification: judgment ~ 1 + block_initial + block_final + outcome + (1 | participant).

hypothetical: intercept 49.26 [44.95, 53.61]; block initial 9.67 [6.64, 12.66]; block final 1.84 [−1.08, 4.94]; outcome −2.08 [−5.30, 1.03]
counterfactual: intercept 51.20 [48.02, 54.46]; block initial 3.05 [0.48, 5.75]; block final 28.51 [25.91, 31.11]; outcome 0.59 [−2.03, 3.29]
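For readers who want to run this kind of analysis themselves, the sketch below fits a frequentist analogue of the mixed-effects model in Python with statsmodels. The paper’s model is Bayesian (the table reports posterior means and highest density intervals), and the column names and file name here are placeholders.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Long-format data, one row per participant x clip. Column names and the
# file name are placeholders: judgment (0-100), block_initial, block_final,
# outcome, participant.
df = pd.read_csv("judgments.csv")

# Frequentist analogue of the paper's Bayesian model:
#   judgment ~ 1 + block_initial + block_final + outcome + (1 | participant)
model = smf.mixedlm("judgment ~ block_initial + block_final + outcome",
                    data=df, groups="participant")
print(model.fit().summary())
```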

(d) Counterfactual condition

Participants’ counterfactual judgements were most strongly affected by the final position of the block (see table 2, block final). For example, they agreed that ball B would have gone into the gate if ball A had not been present when the block moved out of the way (clip 1) or had stayed out of the way (clip 4), but disagreed when the block did not move out of the way (clip 2) or moved into the way (clip 3). The same pattern of judgements holds for those clips in which ball B missed the gate.

While the final position of the block was most important for participants’ counterfactual judgements, there was also a small effect of the initial position of the block (see table 2, block initial). Participants’ counterfactual judgements tended to be higher when the block had stayed out of the way the whole time (clips 4 and 8) compared to when the block moved out of the way (clips 1 and 5). Similarly, their judgements tended to be lower when the block was in front of the gate the whole time (clips 2 and 6) compared to when it moved into the way (clips 3 and 7).

(e) Model comparison

As figure 7 shows, the simulation model closely tracks participants’ hypothetical and counterfactual judgements (Pearson correlation: r = 0.97; Spearman correlation: r_s = 0.79; root mean squared error (RMSE) = 12.50). The simulation model has two free parameters: one that captures participants’ uncertainty about how exactly ball B would have moved if ball A had not been present in the scene (σ_ball), and one that captures participants’ uncertainty about the moment in time at which the block would have started to move (σ_block). I fitted these two parameters to participants’ judgements by minimizing the sum of squared errors between model predictions and participants’ mean hypothetical and counterfactual judgements (see appendix for more details on how the parameters were fitted). This analysis reveals that uncertainty about when the block will move is more important for capturing participants’ judgements than the uncertainty associated with the ball’s movement trajectory. As ball B travels on a straight horizontal path towards the middle of the gate, extrapolating how it would have moved if ball A had not been present is fairly straightforward. By contrast, remembering at what moment in time the block moved, and whether it would have moved early enough to get out of the way (or into the way), is more difficult to assess.
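A minimal sketch of the fitting procedure, assuming a `model_predictions(sigma_ball, sigma_block)` function that returns the model’s predicted probabilities for all clips and conditions (the grids, the 0–100 rescaling and the function name are my assumptions, not the paper’s):

```python
import numpy as np
from itertools import product

def fit_parameters(mean_judgments, model_predictions,
                   ball_grid=np.linspace(0.0, 0.2, 11),
                   block_grid=np.linspace(0.0, 30.0, 16)):
    """Grid search for (sigma_ball, sigma_block) minimizing the sum of
    squared errors between model predictions and mean judgements (0-100)."""
    best_params, best_sse = None, np.inf
    for s_ball, s_block in product(ball_grid, block_grid):
        pred = 100.0 * np.asarray(model_predictions(s_ball, s_block))
        sse = float(np.sum((pred - np.asarray(mean_judgments)) ** 2))
        if sse < best_sse:
            best_params, best_sse = (s_ball, s_block), sse
    return best_params, best_sse
```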

It is worth noting that the simulation model captures the effect that the initial position of the block has on participants’ counterfactual judgements. For example, participants’ counterfactual judgements are slightly higher in clips 4 and 8 than in clips 1 and 5. When the block is initially in the way, there is some chance that it is not going to move out of the way in time. So the chances of ball B being blocked are a little greater in clips 1 and 5 than in clips 4 and 8. On the other hand, the model predicts a larger difference between counterfactual judgements for clips 2 and 6 versus clips 3 and 7 than what was observed. When the block stays put in front of the gate, there is only a very small chance that ball B would have gone through the gate if ball A had not been present in the scene. While the model predicts a very low rating in this case, participants’ judgements were a little higher.

(f) Discussion

The results of Experiment 1 show that hypothetical and counterfactual judgements come apart in this paradigm. In the hypothetical condition, the clips paused shortly before the collision between the balls and participants judged whether ball B would go through the gate if ball A was not present in the scene. Here, participants’ judgements were only affected by where the block was positioned at the time when video clip paused. Participants’ hypothetical judgements that ball B would go through were a little higher when the block was initially out of the way than when it was in the way. I had told participants in the instructions that the block sometimes moves. While in fact the probability that the block moved was 50% across the ten clips that participants saw, they did not know this. So it makes sense that they would assume that ball B would be less likely to go into the gate when the block was initially in the way than when it was out of the way.5

In the counterfactual condition, the clips played until the end so participants were able to see whether or not the block moved in the actual situation. Here, participants’ judgements were most strongly affected by the final position of the block. They agreed that ball B would have gone through the gate if ball A had not been present when the final position of the block was out of the way, and disagreed when the block ended up in front of the gate. Their judgements were also somewhat affected by the initial position of the block. Counterfactual judgements were higher when the block was out of the way the whole time, and lower when the block was in front of the gate the whole time (compared to when it moved).

The reason that hypothetical and counterfactual judgements come apart in this paradigm is that the truth of the hypothetical (or counterfactual) depends on a factor that is independent of the causal event of interest (ball A’s presence). Whereas participants in the counterfactual condition get to see whether or not the block moved, participants in the hypothetical condition do not. As a result, the hypothetical of what would happen if ball A was not present differs from the counterfactual of what would have happened if ball A had not been present. Participants in the hypothetical condition were more uncertain about what would happen than participants in the counterfactual condition were about what would have happened.

The simulation model accurately captured participants’ hypothetical and counterfactual judgements. The model assumes that observers may be uncertain about how exactly ball B would have moved in the absence of ball A, and at what point in time the block would have started to move. These two sources of uncertainty are sufficient to explain the overall pattern of responses.

4. Experiment 2: causal judgements

Experiment 1 established that hypothetical and counterfactual judgements come apart in this paradigm, and that the simulation model captures both kinds of judgements. In Experiment 2, I asked participants to make a causal judgement about what happened in each clip. Specifically, whether ball A caused ball B to go through the gate (when it went through), and whether ball A prevented ball B from going through the gate (when it missed). Will participants’ causal judgements be better explained assuming that they compare what actually happened to the simulation of a future hypothetical, or to the simulation of a counterfactual?

(a) Methods

(i) Participants

Sixty-seven participants (age: M = 34, s.d. = 10; gender: 20 female, 46 male, 1 non-binary; ethnicity: 9 Hispanic, 57 not Hispanic, 1 preferred not to say) were recruited via Amazon Mechanical Turk using psiTurk [62]. Only participants based in the USA with an approval rating of 95% or higher were able to participate.

(ii) Design and procedure

The procedure was largely identical to that of the counterfactual condition in Experiment 1. The only thing that differed was what questions participants were asked. Participants were asked to what extent they agreed with the statement ‘Ball A has caused ball B to go through the gate’ if ball B went through the gate, or ‘Ball A has prevented ball B from going through the gate’ if ball B did not go through the gate. Participants indicated their answer on a sliding scale with the endpoints labelled ‘not at all’ (0) and ‘very much’ (100). Just like in the counterfactual condition of Experiment 1, participants watched the clip until the end before providing their causal judgement.

(b) Results

Figure 8 shows participants’ mean causal judgements together with the model predictions based on participants’ hypothetical and counterfactual judgements from Experiment 1. Participants’ causal judgements are explained by an interaction between the final position of the block and the outcome (see table 3). For situations in which ball B went in the gate, participants’ causal judgements were high when the final position of the block was in front of the gate (clips 2 and 3), and low when the block was not in front of the gate (clips 1 and 4). Conversely, when ball B missed the gate, participants’ judgements were high when the final position of the block was not in front of the gate (clips 5 and 8), and low when the block was in front of the gate (clips 6 and 7). The initial position of the block had little to no effect on participants’ causal judgements. What mattered was whether ball B ended up in the gate, and whether the block would have blocked ball B if ball A had not been present in the scene.

Figure 8.

Mean causal ratings (grey bars) as a function of whether ball B went in (left) or ball B missed the gate (right). Model predictions based on participants’ judgements in the hypothetical condition (blue) and the counterfactual condition (red) are shown as circles. The images on the x-axis illustrate the initial and final position of the block. The block is in front of the gate if it is at the bottom. For example, in clip 1, the block was initially in front of the gate but then moved out of the way. Note that the model predictions for when ball B went in are flipped compared to those shown in figure 7. This is because in Experiment 1, we asked participants whether ball B would go in (hypothetical), or would have gone in (counterfactual) without ball A. But here, the model uses the probability that ball B would not go in (hypothetical), or would not have gone in (counterfactual) without ball A (see equations (2.1) and (2.2) as well as endnote 6). Note: Error bars are 95% bootstrapped confidence intervals. (Online version in colour.)

Table 3.

Posterior means and 95% highest density intervals for each fixed effect in the Bayesian mixed-effects regression model. The results show that the interaction between the final position of the block and the outcome (block final : outcome) predicts most of the variance in participants’ causal judgements. Note: I used sum contrasts for the predictor variables with no/yes for the block variables, and miss/hit for the outcome variable. Model specification: judgment ~ 1 + (block_initial + block_final) * outcome + (1 | participant).

intercept: 61.46 [58.1, 64.42]
block initial: 0.11 [−2.31, 2.22]
block final: 0.71 [−1.58, 2.78]
outcome: −1.02 [−3.22, 1.07]
block initial : outcome: 1.91 [−0.3, 4.25]
block final : outcome: 23.97 [21.72, 26.13]
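For readers who want to run an analysis of this form, the sketch below fits the same specification with the Python package bambi. The data file and column names are assumptions, and the sum contrasts are hand-coded as ±1 rather than relying on any particular formula syntax.

```python
import arviz as az
import bambi as bmb
import pandas as pd

# Long format: one row per participant x clip, with the causal judgement
# (0-100) and the clip's block/outcome properties. File name and column
# names are assumptions.
data = pd.read_csv("experiment2_data.csv")

# Sum contrasts as in the note to table 3: no/yes for the block
# variables, miss/hit for the outcome, coded as -1/+1.
data["block_initial"] = data["block_initial"].map({"no": -1, "yes": 1})
data["block_final"] = data["block_final"].map({"no": -1, "yes": 1})
data["outcome"] = data["outcome"].map({"miss": -1, "hit": 1})

model = bmb.Model(
    "judgment ~ 1 + (block_initial + block_final) * outcome"
    " + (1 | participant)",
    data,
)
idata = model.fit(draws=2000, chains=4)
print(az.summary(idata, hdi_prob=0.95))  # posterior means and 95% HDIs
```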

As figure 8 also shows, participants’ causal judgements in Experiment 2 lined up nicely with participants’ counterfactual judgements in Experiment 1. Participants’ mean counterfactual and causal judgements were highly correlated with one another (r = 0.99, rs = 0.79, RMSE = 12.50). By contrast, the correlation between mean hypothetical and causal judgements was much lower (r = 0.24, rs = 0.40, RMSE = 27.31).6
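The fit statistics reported above can be computed as in the following sketch, where the two input arrays hold the eight per-clip mean judgements (model predictions and causal ratings, respectively).

```python
import numpy as np
from scipy import stats

def fit_statistics(model_means, causal_means):
    """Pearson r, Spearman rs and root-mean-squared error between
    per-clip model predictions and mean causal judgements."""
    model_means = np.asarray(model_means, dtype=float)
    causal_means = np.asarray(causal_means, dtype=float)
    r, _ = stats.pearsonr(model_means, causal_means)
    rs, _ = stats.spearmanr(model_means, causal_means)
    rmse = float(np.sqrt(np.mean((model_means - causal_means) ** 2)))
    return r, rs, rmse
```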

(c) . Discussion

The results of Experiment 2 demonstrate that causal judgements are best explained by counterfactuals and not by hypotheticals. Participants’ causal judgements in Experiment 2 closely aligned with participants’ counterfactual judgements in Experiment 1, whereas the correlation between causal judgements and hypothetical judgements was much lower. For example, when judging whether ball A caused ball B to go through the gate, it matters what would have happened if ball A had not been present in the scene. If ball B would have gone through the gate even without ball A (because there was no block in the way), then people tended to disagree with the statement that ball A caused ball B to go through the gate. However, when ball B would have been blocked had ball A not been present in the scene, then participants tended to agree that ball A caused ball B to go through the gate. The same holds for participants’ judgements about whether ball A prevented ball B from going through the gate. They agreed when ball B would have gone in in the absence of ball A (i.e. when it would not have been blocked), but disagreed when ball B would have missed even if ball A had been removed (because the gate was blocked).

5. General discussion

This paper asked whether counterfactuals are necessary for explaining causal judgements, or whether comparing what actually happened with a future hypothetical simulation suffices. The answer is clear: counterfactuals are necessary. People make causal judgements about particular events by comparing what actually happened with what would have happened in a counterfactual situation [3].

Prior work had tested the idea that causal judgements are intimately linked to counterfactual simulations and found a close fit between causal and counterfactual judgements [4]. Participants’ judgements that ball A caused ball B to go through the gate were higher the more certain participants were that ball B would not have gone through if ball A had not been there. However, the results of those studies were also consistent with the idea that people compare what actually happened with the outcome of a hypothetical future situation. To tease hypothetical and counterfactual probabilities apart, situations are required in which multiple factors influence the outcome, and in which the observer has some uncertainty about at least one of these factors. I generated video clips in which hypotheticals and counterfactuals come apart by introducing a block in front of the gate that sometimes moves and sometimes stays put.

Experiment 1 asked one group of participants to make hypothetical judgements about whether ball B would go through the gate if ball A were not there, and another group of participants to make counterfactual judgements about whether ball B would have gone through the gate if ball A had not been there. As predicted, participants’ hypothetical and counterfactual judgements came apart in these clips. Both kinds of judgements were well captured by a computational model that computes hypothetical and counterfactual probabilities by running noisy physical simulations, incorporating people’s uncertainty about what would happen or what would have happened. Experiment 2 then asked participants to make causal judgements. The results showed that participants’ causal judgements in Experiment 2 were closely in line with participants’ counterfactual judgements from Experiment 1. By contrast, the hypothetical judgements from Experiment 1 did not capture participants’ causal judgements as well.

In the remainder, I will discuss what implications these results have for theories of causality, and for research into the development of counterfactual reasoning. I conclude by pointing out some limitations of the work presented here that motivate possible (hypothetical) future directions.

(a) . Implications for theories of causality

The term ‘counterfactual’ is often used quite broadly to refer to any alternative possibility (which could lie in the future, or in the past). However, as the work presented here shows, it matters whether the imagined changes that lead to these alternative possibilities lie in the future or in the past. Pearl [6] proposes a causal hierarchy in which knowledge on level I supports prediction, knowledge on level II supports hypothetical reasoning about the possible consequences of future interventions, and knowledge on level III supports counterfactual reasoning about how things could have turned out differently from how they actually did. Much work has demonstrated that the difference between levels I and II matters: correlation is different from causation [29–32]. This study demonstrates that the difference between levels II and III matters, too. Counterfactual simulation explains causal judgements, whereas hypothetical simulation does not. This result extends recent work showing that people differentiate between indicative conditionals (level I) and counterfactual conditionals (level III), and that causal judgements align more closely with counterfactual judgements [43].
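Stated as probabilistic queries for the causation case (with x denoting ball A's presence, y denoting ball B going through the gate, and primes denoting negation; cf. endnote 3, which gives the corresponding expressions for prevention), the three levels correspond to:

```latex
\begin{align*}
\text{Level I (association):} \quad & p(y' \mid x') \\
\text{Level II (intervention):} \quad & p(y' \mid \mathrm{do}(x')) \\
\text{Level III (counterfactual):} \quad & p(y'_{x'} \mid x, y)
\end{align*}
```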

The tight link between counterfactuals and causal judgements puts pressure on theories of causal judgement that do not distinguish between counterfactuals and other types of conditionals (e.g. [64]). For example, the mental model theory analyses causation in terms of temporally ordered sets of possibilities [65,66]. It defines ‘A causes B to occur’ as ‘given A, B occurs’, whereas ‘A enables B to occur’ is defined as ‘given A, it is possible for B to occur’. However, as we have seen above, for capturing causal judgements about particular events, indicative conditionals will not suffice. For instance, ‘Given A, B occurs’ would be true if both A and B were the effects of some common cause C, A ← C → B. But in this case, A does not cause B (even if A were to regularly precede B in time).

The results also put pressure on process theories of causation that aim to explain causal judgements without the use of counterfactuals [67,68]. For example, Wolff’s [55] force dynamics theory of causation analyses different causal expressions such as caused, enabled or prevented in terms of configurations of force vectors that represent the forces that are at play at the time of interaction between cause and effect (see [69], for a counterfactual account that captures people’s use of different causal expressions). For A to have caused B, A’s and B’s force vectors need to have pointed in different directions, whereas for A to have enabled B, the force vectors need to have been aligned. However, in the video clips that I used, the way in which balls A and B interact with one another is identical in clips 1 through 4, and in clips 5 through 8. The actual process by which A brings about the outcome does not change. What does change is the initial and final position of the block, which influences what would have happened in the relevant counterfactual situation. A process model of causation that does not incorporate counterfactuals has no way of producing a different verdict across these sets of clips, while people clearly do (see [61], for a process model that does incorporate some counterfactual machinery).

The results reported here also have implications for work on causal selection [70]. In most situations, outcomes are the result of a multitude of contributing factors. However, people systematically select some factors and not others as ‘the’ cause of the outcome [71–74]. What explains people’s causal selections? Researchers have identified a number of factors that influence causal selections, including the normality of events and the causal structure of the situation. For instance, when two events bring about an outcome conjunctively, such that each event was necessary for the outcome to come about, people have a tendency to cite the abnormal event as the cause of the outcome rather than the normal event. By contrast, when two events combine disjunctively, such that each individual event would have been sufficient for the outcome to come about, people tend to cite the normal event as the cause of the outcome rather than the abnormal event [75–77].

A number of different accounts have been proposed that strive to explain people’s causal selections [74,78,79]. One prominent view is that causal selections are influenced by counterfactuals [5,80]. Accordingly, certain counterfactuals come to mind more easily than others, and this affects which events are selected as causes [81]. Another, related view suggests that people select as causes those events that would make for good points of intervention [58,82–84]. While these theoretical accounts lead to similar predictions in many instances, they come apart in scenarios like the one presented here. This is because what would have been good to do this time (counterfactual) need not be the thing one should do the next time (hypothetical). Future work needs to look at situations in which optimal interventions and counterfactuals come apart to better understand what drives causal selections.

(b) . The development of counterfactual reasoning

Counterfactual reasoning is an impressive cognitive feat. One has to take into account what actually happened, mentally travel back in time, consider an intervention on the course of events, imagine how things would have played out, and compare that to what actually happened. Perhaps unsurprisingly, children appear to master this task relatively late, by around 5 years of age [85–93]. While earlier work claimed that 3-year-old children can already reason correctly about counterfactuals [94], later work argued that these early successes may have been false alarms. In one of the scenarios in Harris et al.’s [94] study, Carol walks across a floor with dirty shoes. When asked ‘What if Carol had taken her shoes off—would the floor be dirty?’ even 3-year-old children answered correctly with ‘no’. However, it is possible that children answered this question without running through a counterfactual simulation of what would have happened, relying instead on basic conditional reasoning [95]. In general, if shoes are dirty the floor gets dirty, and if shoes are clean the floor stays clean. To tease apart counterfactual reasoning from basic conditional reasoning, Rafetseder et al. [96] added a second character who also walked across the floor with dirty shoes. Now the correct answer to the question of whether the floor would have been dirty if Carol had taken her shoes off is ‘yes’, because of the other child. Rafetseder et al. [96] found that in this setting, in which the outcome was causally overdetermined, even 6-year-old children tended to get it wrong.

Recently, Nyhout & Ganea [97] reported mature counterfactual reasoning in 4- and 5-year-olds. In their experiments, children saw blocks being put on a box that had the potential to make the box light up. In one of the trials, a blue block and a green block are put on the box, and the box lights up. Children were asked ‘If she had not put the green one on the box, would the light still have been on?’. While this is clearly a counterfactual question, children may be able to answer it without considering a counterfactual. Instead, they might merely consider the hypothetical situation of just putting the blue block on the box and then try to simulate what would happen.

While the literature on the development of counterfactual reasoning has witnessed some false alarms, it is very likely that there have been some misses too. Demonstrating counterfactual reasoning in young children is challenging because of the verbal processing demands. The question ‘Would ball B have gone into the gate if ball A had not been there?’ is a mouthful. Other cognitive capacities, such as theory of mind (the ability to reason about other people’s mental states), have been demonstrated in young children by replacing explicit verbal measures with implicit measures, such as where children are looking on a screen ([98], but see also [99]). It is possible that children would be able to display the capacity for counterfactual reasoning much earlier than has been shown so far, if verbal task demands could be circumvented (cf. [86]).

That said, the results reported here raise the bar for what is required to demonstrate successful counterfactual reasoning. To show that children are really simulating counterfactual rather than hypothetical situations, we need a setting like the one here in which counterfactuals and hypotheticals come apart. Using the metaphor of Pearl’s [6] ladder of causation: while early work may have misinterpreted level I reasoning as level III counterfactual reasoning, more work is required to make sure that we are not misinterpreting level II reasoning as counterfactual reasoning either [100].

If, as I argue, counterfactual reasoning is indeed required to accurately answer causal questions in this setting, then experiments like the ones presented here would provide a new approach for demonstrating counterfactual reasoning in children. This approach does not rely on asking children any explicit counterfactual questions. Instead, to demonstrate counterfactual reasoning, it would appear to be sufficient to show that children’s causal judgements align with those of adults.

(c) . What could have been better, and what would be good

The work presented here suggests that the process of counterfactual simulation is critical for understanding causal judgements. However, there are some theoretical and empirical limitations. On the theoretical side, there is a question about how exactly the distinction between hypotheticals and counterfactuals in Pearl’s [6] framework maps onto the distinction drawn in this paper. In Pearl’s [6] framework, the difference between hypotheticals and counterfactuals arises from the computational tasks that the model can solve at different levels of the hierarchy. Only on level III is the model able to first condition on what actually happened, and then consider an intervention that runs counter to what actually happened. As illustrated in the sprinkler example (figure 3), one way to implement this computationally is by using a twin network, whereby the conditioning step and the intervention step are carried out one after the other in separate networks (see also [10]). More generally, in order to be able to answer counterfactual questions, a model needs to know how the different variables functionally relate to one another (see the structural equations in figure 3). By contrast, to answer hypothetical questions, it is sufficient to know the probabilistic contingencies between variables as well as their causal connections (knowing the structural equations is not necessary).
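As a schematic illustration of this recipe, the Python sketch below implements the abduction, action and prediction steps by rejection sampling for a sprinkler-style example. The structural equations and probabilities are illustrative assumptions, not those of figure 3.

```python
import random

def sample_exogenous():
    """Illustrative priors over the exogenous noise terms."""
    return {"u_rain": random.random() < 0.3,
            "u_sprinkler": random.random() < 0.5}

def structural_equations(u, do=None):
    """Endogenous variables as functions of the noise terms; `do` maps
    variable names to values forced by an intervention."""
    do = do or {}
    rain = do.get("rain", u["u_rain"])
    # the sprinkler runs on its timer, but is skipped when it rains
    sprinkler = do.get("sprinkler", u["u_sprinkler"] and not rain)
    wet = do.get("wet", rain or sprinkler)
    return {"rain": rain, "sprinkler": sprinkler, "wet": wet}

def counterfactual(evidence, intervention, query, n=100_000):
    """Level III query via abduction, action and prediction:
    1. abduction: keep only noise settings consistent with the evidence;
    2. action:    rerun the equations under the intervention (the twin);
    3. prediction: read off the query variable in the twin run."""
    hits = kept = 0
    for _ in range(n):
        u = sample_exogenous()
        actual = structural_equations(u)
        if all(actual[k] == v for k, v in evidence.items()):
            kept += 1
            hits += structural_equations(u, do=intervention)[query]
    return hits / kept

# It rained and the grass is wet: would the grass have been wet
# if it had not rained? (True whenever the sprinkler timer was set.)
print(counterfactual({"rain": True, "wet": True}, {"rain": False}, "wet"))
```

Conditioning and intervening in separate runs over shared noise terms is exactly what the twin network representation makes explicit: the two networks share their exogenous variables, and the intervention is applied only in the twin.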

In the experiments presented here, the difference between hypotheticals and counterfactuals is not due to the model (or the participant) having causal knowledge at different levels of the hierarchy. Our participants know how physics works and they can simulate different possible scenarios.7 Instead, the difference between hypotheticals and counterfactuals arises from how much information participants have about what happened. While the clip was paused shortly before the causal event of interest in the hypothetical condition of the experiment, in the other conditions, participants viewed the clip until the end. I highlighted that a key difference between hypotheticals and counterfactuals is whether the causal intervention takes place in the future or in the past. This difference in the time point of the intervention does not perfectly map onto the difference between level II and level III in Pearl’s hierarchy. Nonetheless, the results reported here show that counterfactuals are critical for causal judgements. To explain the pattern of causal judgements across the different video clips, a model requires the causal knowledge and computational capacities on level III of Pearl’s hierarchy.

On the empirical side, it will be important to document more broadly how counterfactual simulation and causal judgements relate. For example, I focused on a single setting here in which participants were asked to make causal judgements about physical events. Future work should expand this by assessing how counterfactuals and causal judgements are linked in a broader range of physical settings, as well as in settings that go beyond the physical domain (e.g. [104]). The experiments featured a relatively small set of clips. For a better quantitative assessment of how well the simulation model captures people’s judgements, future work should include a larger set of test clips. In the setting here, there was a 50% chance that the block would move. Future work could manipulate this probability to see how it affects participants’ judgements. The simulation model predicts that uncertainty about the block’s movement should affect only hypothetical judgements, not counterfactual or causal judgements. That said, much work has shown that the (ab)normality of events influences causal judgements [72–76], and it is possible that such effects would be observed in this setting too (see [77,83]). For example, consider a situation in which the block is initially out of the way but then moves in front of the gate. When ball A knocked ball B into the gate (via the wall and around the block), would causal judgements be greater when there was a low chance that the block would move compared to when there was a high chance? Finally, it would also be interesting to use process tracing techniques, such as eye-tracking [4], to gain more direct evidence for where the mind travels when it makes causal judgements.

Acknowledgements

Thanks to Jingren Wang for help in the early stages of the project, to Ari Beller for help with implementing the simulation model, to David Rose for thoughtful comments on the manuscript, and to Thomas Icard for many fruitful discussions.

Appendix A. Parameter search for the simulation model

The simulation model has two free parameters. One parameter determines how much the ball’s velocity vector is rotated at each time step in the physical simulation. The degree of rotation is drawn from a Gaussian distribution N(0, σball). The other parameter determines at what moment in time the block starts moving out of the way. The time point is determined by adding Gaussian noise N(0, σblock) to the true moment in time at which the block moved. In clips 1–4, the block starts to move at time step 280 in the video clip, and in clips 5–8 it moves at time step 290. When determining at what time step the block moves in a simulation, the model made sure that the block would not begin moving before the two balls collided, and not after the end of the clip (the clip timed out at time step 700). When simulating what would happen in the hypothetical condition, the model first determines whether or not the block moves (it moves with 50% probability), and then determines at which point in time it starts moving.
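A minimal sketch of this sampling step is given below. The appendix does not say how the timing constraint is imposed; clamping to the allowed interval is assumed here, and the collision time is a placeholder input.

```python
import random

def sample_block_move_time(true_move_time, collision_time,
                           sigma_block=175.0, clip_end=700):
    """Sample when the block starts to move in one simulation run:
    Gaussian jitter around the true movement time (time step 280 for
    clips 1-4, 290 for clips 5-8), constrained to fall after the balls
    collide and before the clip times out."""
    t = random.gauss(true_move_time, sigma_block)
    return min(max(t, collision_time), clip_end)
```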

Figure 9 shows the results of a grid search over the parameter space. For each parameter setting, I ran 1000 simulations for each of the eight clips in the hypothetical and the counterfactual condition. The best-fitting set of parameters is σball = 0.6 and σblock = 175. These parameters minimize the squared error between model predictions and participants’ mean judgements for the different clips across both the hypothetical and counterfactual condition in Experiment 1. The model predictions shown in figure 7 use these best-fitting parameters.
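The grid search itself could look like the following sketch, where `predict` is any function returning the model's 16 Monte Carlo probabilities (8 clips × 2 conditions) for a given parameter pair; the example grids in the usage comment are placeholders for the ranges shown in figure 9.

```python
import itertools
import numpy as np

def grid_search(predict, human_means, sigma_ball_grid, sigma_block_grid):
    """Return the (sigma_ball, sigma_block) pair minimising the summed
    squared error between model predictions and participants' mean
    judgements (both on a 0-1 scale).

    predict(sigma_ball, sigma_block) -> array aligned with human_means,
    where human_means is a numpy array of per-clip, per-condition means.
    """
    def loss(params):
        preds = np.asarray(predict(*params))
        return float(np.sum((preds - human_means) ** 2))

    return min(itertools.product(sigma_ball_grid, sigma_block_grid),
               key=loss)

# e.g. best = grid_search(model_predictions, means,
#                         np.arange(0.1, 1.21, 0.1),
#                         np.arange(25, 301, 25))
```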

Figure 9.

Results of a grid search over the ball motion noise parameter (σball) and the block movement noise parameter (σblock). The loss displayed here is the sum of squared errors between model predictions and participants’ mean judgements (both on a scale from 0 to 1) for the eight clips in the hypothetical and counterfactual condition (as shown in figure 7).

Endnotes

1

Hypotheticals are also sometimes expressed using indicative language (Conditional I: ‘If this thing happens, that thing will happen.’) rather than subjunctive language (Conditional II: ‘If this thing happened, that thing would happen.’). I prefer to use the subjunctive form because I think it helps to differentiate between conditionals that are based on observations, those that are based on (hypothetical) interventions, and those that are based on counterfactuals (Conditional III: ‘If this thing had happened, that thing would have happened.’).

2

The challenge is then to impose restrictions onto what possible situations may be considered in such a way that the verdicts of this formalism agree with people’s intuitions about which events caused the outcome (see [45], for details).

3

To answer the question of whether ball A prevented ball B from going through the gate, the model computes the probability that ball A caused ball B to miss. For the hypothetical simulation model, this probability is given by p(x → y′) = p(y | do(x′)), and for the counterfactual simulation model, it is given by p(x → y′) = p(y_x′ | x, y′).

4

I will refer to a factor as having influenced participants’ judgements when the 95% credible interval of the posterior distribution for that factor excludes 0.

5

Participants were able to learn how likely the block is to move over the course of the experiment as they got to see the full video clip after having made their hypothetical prediction.

6

Before correlating the judgements with one another, I subtracted participants’ hypothetical and counterfactual judgements from 100 for the situations in which ball B went through the gate. Participants in Experiment 1 were asked to judge whether ball B would (or would have gone) through the gate if ball A were not (or had not been) there. To map these onto participants’ causal judgements in Experiment 2 when ball B went through the gate, we need participants’ hypothetical and counterfactual judgements that ball B would not go (or would not have gone) through the gate (see equations (2.1) and (2.2)).

7

In fact, the causal knowledge that people bring to bear on this task arguably goes beyond what is captured in Pearl’s hierarchy. While structural equations are a very useful formal tool for representing causal knowledge, they do not naturally capture the kinds of spatio-temporal dynamics that are at play in these physical interactions (see [101–103]).

Ethics

The research was approved by Stanford’s IRB.

Data accessibility

All the data are available here: https://github.com/cicl-stanford/counterfactual_hypothetical/.

Conflict of interest declaration

I declare I have no competing interests.

Funding

No funding has been received for this article.

References

  • 1.Kahneman D, Tversky A. 1982. The simulation heuristic. In Judgment under uncertainty: heuristics and biases (eds D Kahneman, A Tversky), pp. 201–208. Cambridge, UK: Cambridge University Press.
  • 2.Sloman SA, Lagnado D. 2015. Causality in thought. Annu. Rev. Psychol. 66, 223-247. ( 10.1146/annurev-psych-010814-015135) [DOI] [PubMed] [Google Scholar]
  • 3.Gerstenberg T, Goodman ND, Lagnado DA, Tenenbaum JB. 2021. A counterfactual simulation model of causal judgments for physical events. Psychol. Rev. 128, 936-975. ( 10.1037/rev0000281) [DOI] [PubMed] [Google Scholar]
  • 4.Gerstenberg T, Peterson MF, Goodman ND, Lagnado DA, Tenenbaum JB. 2017. Eye-tracking causality. Psychol. Sci. 28, 1731-1744. ( 10.1177/0956797617713053) [DOI] [PubMed] [Google Scholar]
  • 5.Gerstenberg T, Stephan S. 2021. A counterfactual simulation model of causation by omission. Cognition 26, 104842. ( 10.1016/j.cognition.2021.104842) [DOI] [PubMed] [Google Scholar]
  • 6.Pearl J. 2000. Causality: models, reasoning and inference. Cambridge, UK: Cambridge University Press. [Google Scholar]
  • 7.Sloman SA, Hagmayer Y. 2006. The causal psycho-logic of choice. Trends Cogn. Sci. 10, 407-412. ( 10.1016/j.tics.2006.07.001) [DOI] [PubMed] [Google Scholar]
  • 8.Pearl J, Mackenzie D. 2018. The book of why: the new science of cause and effect. New York, NY: Basic Books. [Google Scholar]
  • 9.Pearl J. 2019. The seven tools of causal inference, with reflections on machine learning. Commun. ACM 62, 54-60. ( 10.1145/3241036) [DOI] [Google Scholar]
  • 10.Bareinboim E, Correa J, Ibeling D, Icard T. 2020. On Pearl’s hierarchy and the foundations of causal inference. Sociol. Methodol. 40, 75-149. ( 10.1145/3501714.3501743) [DOI] [Google Scholar]
  • 11.Adams E. 1965. The logic of conditionals. Inquiry 8, 166-197. ( 10.1080/00201746508601430) [DOI] [Google Scholar]
  • 12.Lewis D. 1976. Probabilities of conditionals and conditional probabilities. Phil. Rev. 85, 297-315. ( 10.2307/2184045) [DOI] [Google Scholar]
  • 13.Stalnaker RC. 1970. Probability and conditionals. Phil. Sci. 37, 64-80. ( 10.1086/288280) [DOI] [Google Scholar]
  • 14.Edgington D. 1995. On conditionals. Mind 104, 235-329. ( 10.1093/mind/104.414.235) [DOI] [Google Scholar]
  • 15.Lassiter D. 2017. Probabilistic language in indicative and counterfactual conditionals. Semantics Linguistic Theory, 27, 525–546. ( 10.3765/salt.v27i0.4188) [DOI]
  • 16.Lassiter D. 2017. Complex antecedents and probabilities in causal counterfactuals. In Proc. 21st Amsterdam Colloquium, Amsterdam, 2017, pp. 45-54. [Google Scholar]
  • 17.Ciardelli I, Zhang L, Champollion L. 2018. Two switches in the theory of counterfactuals. Linguist. Phil. 41, 577-621. ( 10.1007/s10988-018-9232-4) [DOI] [Google Scholar]
  • 18.Kratzer A. 1981. Partition and revision: the semantics of counterfactuals. J. Phil. Logic 10, 201-216. ( 10.1007/BF00248849) [DOI] [Google Scholar]
  • 19.Kaufmann S. 2013. Causal premise semantics. Cogn. Sci. 37, 1136-1170. ( 10.1111/cogs.12063) [DOI] [PubMed] [Google Scholar]
  • 20.Schulz K. 2011. ‘If you’d wiggled A, then B would’ve changed’: causality and counterfactual conditionals. Synthese 179, 239-251. ( 10.1007/s11229-010-9780-9) [DOI] [Google Scholar]
  • 21.van Rooij R, Schulz K. 2019. Conditionals, causality and conditional probability. J. Logic Lang. Inf. 28, 55-71. ( 10.1007/s10849-018-9275-5) [DOI] [Google Scholar]
  • 22.Cheng PW. 1997. From covariation to causation: a causal power theory. Psychol. Rev. 104, 367-405. ( 10.1037/0033-295X.104.2.367) [DOI] [Google Scholar]
  • 23.Douven I, Verbrugge S. 2010. The Adams family. Cognition 117, 302-318. ( 10.1016/j.cognition.2010.08.015) [DOI] [PubMed] [Google Scholar]
  • 24.Oaksford M, Chater N. 2007. Bayesian rationality: the probabilistic approach to human reasoning. Oxford, UK: Oxford University Press. [DOI] [PubMed] [Google Scholar]
  • 25.Over DE, Hadjichristidis C, Evans JSB, Handley SJ, Sloman SA. 2007. The probability of causal conditionals. Cognit. Psychol. 54, 62-97. ( 10.1016/j.cogpsych.2006.05.002) [DOI] [PubMed] [Google Scholar]
  • 26.Over DE, Evans J. 2003. The probability of conditionals: the psychological evidence. Mind Lang. 18, 340-358. ( 10.1111/1468-0017.00231) [DOI] [Google Scholar]
  • 27.Byrne RM. 2016. Counterfactual thought. Annu. Rev. Psychol. 67, 135-157. ( 10.1146/annurev-psych-122414-033249) [DOI] [PubMed] [Google Scholar]
  • 28.Byrne RMJ. 2005. The rational imagination: how people create alternatives to reality. Cambridge, MA: MIT Press. [DOI] [PubMed] [Google Scholar]
  • 29.Sloman SA, Lagnado DA. 2005. Do we ‘do’? Cogn. Sci. 29, 5-39. ( 10.1207/s15516709cog2901_2) [DOI] [PubMed] [Google Scholar]
  • 30.Steyvers M, Tenenbaum JB, Wagenmakers EJ, Blum B. 2003. Inferring causal networks from observations and interventions. Cogn. Sci. 27, 453-489. ( 10.1207/s15516709cog2703_6) [DOI] [Google Scholar]
  • 31.Bramley NR, Dayan P, Griffiths TL, Lagnado DA. 2017. Formalizing Neurath’s ship: approximate algorithms for online causal learning. Psychol. Rev. 124, 301. ( 10.1037/rev0000061) [DOI] [PubMed] [Google Scholar]
  • 32.Bramley NR, Gerstenberg T, Tenenbaum JB, Gureckis TM. 2018. Intuitive experimentation in the physical world. Cognit. Psychol. 105, 9-38. ( 10.1016/j.cogpsych.2018.05.001) [DOI] [PubMed] [Google Scholar]
  • 33.Meder B, Gerstenberg T, Hagmayer Y, Waldmann MR. 2010. Observing and intervening: rational and heuristic models of causal decision making. Open Psychol. J. 3, 119-135. ( 10.2174/1874350101003010119) [DOI] [Google Scholar]
  • 34.McCormack T, Bramley N, Frosch C, Patrick F, Lagnado D. 2016. Children’s use of interventions to learn causal structure. J. Exp. Child Psychol. 141, 1-22. ( 10.1016/j.jecp.2015.06.017) [DOI] [PubMed] [Google Scholar]
  • 35.Gopnik A, Glymour C, Sobel D, Schulz L, Kushnir T, Danks D. 2004. A theory of causal learning in children: causal maps and Bayes nets. Psychol. Rev. 111, 1-31. ( 10.1037/0033-295X.111.1.3) [DOI] [PubMed] [Google Scholar]
  • 36.Dehghani M, Iliev R, Kaufmann S. 2012. Causal explanation and fact mutability in counterfactual reasoning. Mind Lang. 27, 55-85. ( 10.1111/j.1468-0017.2011.01435.x) [DOI] [Google Scholar]
  • 37.Hiddleston E. 2005. A causal theory of counterfactuals. Noûs 39, 632-657. ( 10.1111/j.0029-4624.2005.00542.x) [DOI] [Google Scholar]
  • 38.Rips LJ, Edwards BJ. 2013. Inference and explanation in counterfactual reasoning. Cogn. Sci. 37, 1107-1135. ( 10.1111/cogs.12024) [DOI] [PubMed] [Google Scholar]
  • 39.Rips LJ. 2010. Two causal theories of counterfactual conditionals. Cogn. Sci. 34, 175-221. ( 10.1111/j.1551-6709.2009.01080.x) [DOI] [PubMed] [Google Scholar]
  • 40.Gerstenberg T, Bechlivanidis C, Lagnado DA. 2013. Back on track: backtracking in counterfactual reasoning. In Proceedings of the 35th Annual Conference of the Cognitive Science Society (eds M Knauff, M Pauen, N Sebanz, I Wachsmuth), pp. 2386–2391. Austin, TX: Cognitive Science Society.
  • 41.Lucas CG, Kemp C. 2015. An improved probabilistic account of counterfactual reasoning. Psychol. Rev. 122, 700-734. ( 10.1037/a0039655) [DOI] [PubMed] [Google Scholar]
  • 42.Meder B, Hagmayer Y, Waldmann MR. 2009. The role of learning data in causal reasoning about observations and interventions. Memory Cogn. 37, 249-264. ( 10.3758/MC.37.3.249) [DOI] [PubMed] [Google Scholar]
  • 43.Skovgaard-Olsen N, Stephan S, Waldmann M. 2021. Conditionals and the hierarchy of causal queries. J. Exp. Psychol. Gen. 150, 2472-2505. ( 10.1037/xge0001062) [DOI] [PubMed] [Google Scholar]
  • 44.Alicke MD, Mandel DR, Hilton D, Gerstenberg T, Lagnado DA. 2015. Causal conceptions in social explanation and moral evaluation: a historical tour. Perspect. Psychol. Sci. 10, 790-812. ( 10.1177/1745691615601888) [DOI] [PubMed] [Google Scholar]
  • 45.Halpern JY. 2016. Actual causality. Cambridge, MA: MIT Press. [Google Scholar]
  • 46.Woodward J. 2003. Making things happen: a theory of causal explanation. Oxford, UK: Oxford University Press. [Google Scholar]
  • 47.Yablo S. 2002. De facto dependence. J. Phil. 99, 130-148. ( 10.2307/3655640) [DOI] [Google Scholar]
  • 48.Hitchcock C. 2001. A tale of two effects. Phil. Rev. 110, 361-396. ( 10.1215/00318108-110-3-361) [DOI] [Google Scholar]
  • 49.Woodward J. 2021. Causation with a human face: normative theory and descriptive psychology. Oxford, UK: Oxford University Press. [Google Scholar]
  • 50.Halpern JY, Pearl J. 2005. Causes and explanations: a structural-model approach. Part I: Causes. Br. J. Phil. Sci. 56, 843-887. ( 10.1093/bjps/axi147) [DOI] [Google Scholar]
  • 51.Lagnado DA, Gerstenberg T, Zultan R. 2013. Causal responsibility and counterfactuals. Cogn. Sci. 47, 1036-1073. ( 10.1111/cogs.12054) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Chockler H, Halpern JY. 2004. Responsibility and blame: a structural-model approach. J. Artif. Intell. Res. 22, 93-115. ( 10.1613/jair.1391) [DOI] [Google Scholar]
  • 53.Langenhoff AF, Wiegmann A, Halpern JY, Tenenbaum JB, Gerstenberg T. 2021. Predicting responsibility judgments from dispositional inferences and causal attributions. Cognit. Psychol. 129, 101412. ( 10.1016/j.cogpsych.2021.101412) [DOI] [PubMed] [Google Scholar]
  • 54.Gerstenberg T, Ullman TD, Nagel J, Kleiman-Weiner M, Lagnado DA, Tenenbaum JB. 2018. Lucky or clever? From expectations to responsibility judgments. Cognition 177, 122-141. ( 10.1016/j.cognition.2018.03.019) [DOI] [PubMed] [Google Scholar]
  • 55.Wolff P. 2007. Representing causation. J. Exp. Psychol. Gen. 136, 82-111. ( 10.1037/0096-3445.136.1.82) [DOI] [PubMed] [Google Scholar]
  • 56.Talmy L. 1988. Force dynamics in language and cognition. Cogn. Sci. 12, 49-100. ( 10.1207/s15516709cog1201_2) [DOI] [Google Scholar]
  • 57.Mandel DR. 2003. Judgment dissociation theory: an analysis of differences in causal, counterfactual and covariational reasoning. J. Exp. Psychol. Gen. 132, 419-434. ( 10.1037/0096-3445.132.3.419) [DOI] [PubMed] [Google Scholar]
  • 58.Lombrozo T. 2010. Causal-explanatory pluralism: how intentions, functions, and mechanisms influence causal ascriptions. Cognit. Psychol. 61, 303-332. ( 10.1016/j.cogpsych.2010.05.002) [DOI] [PubMed] [Google Scholar]
  • 59.Walsh CR, Sloman SA. 2011. The meaning of cause and prevent: the role of causal mechanism. Mind Lang. 26, 21-52. ( 10.1111/j.1468-0017.2010.01409.x) [DOI] [Google Scholar]
  • 60.Shultz TR. 1982. Rules of causal attribution. Monogr. Soc. Res. Child Dev. 47, 1-51. ( 10.2307/1165893) [DOI] [Google Scholar]
  • 61.Wolff P, Barbey AK, Hausknecht M. 2010. For want of a nail: how absences cause events. J. Exp. Psychol. Gen. 139, 191-221. ( 10.1037/a0018129) [DOI] [PubMed] [Google Scholar]
  • 62.Gureckis TM, Martin J, McDonnell J, Rich AS, Markant D, Coenen A, Halpern D, Hamrick JB, Chan P. 2016. psiTurk: an open-source framework for conducting replicable behavioral experiments online. Behav. Res. Methods 48, 829-842. ( 10.3758/s13428-015-0642-8) [DOI] [PubMed] [Google Scholar]
  • 63.Mason W, Suri S. 2012. Conducting behavioral research on Amazon’s Mechanical Turk. Behav. Res. Methods 44, 1-23. ( 10.3758/s13428-011-0124-6) [DOI] [PubMed] [Google Scholar]
  • 64.Sebben S, Ullrich J. 2021. Can conditionals explain explanations? A modus ponens model of B because A. Cognition 215, 104812. ( 10.1016/j.cognition.2021.104812) [DOI] [PubMed] [Google Scholar]
  • 65.Khemlani SS, Barbey AK, Johnson-Laird PN. 2014. Causal reasoning with mental models. Front. Human Neurosci. 8, 849. ( 10.3389/fnhum.2014.00849) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Goldvarg E, Johnson-Laird PN. 2001. Naive causality: a mental model theory of causal meaning and reasoning. Cogn. Sci. 25, 565-610. ( 10.1207/s15516709cog2504_3) [DOI] [Google Scholar]
  • 67.Salmon WC. 1994. Causality without counterfactuals. Phil. Sci. 61, 297-312. ( 10.1086/289801) [DOI] [Google Scholar]
  • 68.Dowe P. 2000. Physical causation. Cambridge, UK: Cambridge University Press. [Google Scholar]
  • 69.Beller A, Bennett E, Gerstenberg T. 2020. The language of causation. In Proc. 42nd Annual Conf. Cognitive Science Society (eds S Denison, M Mack, Y Xu, BC Armstrong), pp. 3133–3139. Cognitive Science Society.
  • 70.Hesslow G. 1988. The problem of causal selection. In Contemporary science and natural explanation: commonsense conceptions of causality (ed. DJ Hilton), pp. 11–32. Brighton, UK: Harvester Press.
  • 71.Henne P, Niemi L, Knobe J. 2019. A counterfactual explanation for the action effect in causal judgment. Cognition 190, 157-164. ( 10.1016/j.cognition.2019.05.006) [DOI] [PubMed] [Google Scholar]
  • 72.Kahneman D, Miller DT. 1986. Norm theory: comparing reality to its alternatives. Psychol. Rev. 93, 136-153. ( 10.1037/0033-295X.93.2.136) [DOI] [Google Scholar]
  • 73.Hilton DJ, Slugoski BR. 1986. Knowledge-based causal attribution: the abnormal conditions focus model. Psychol. Rev. 93, 75-88. ( 10.1037/0033-295X.93.1.75) [DOI] [Google Scholar]
  • 74.Hitchcock C, Knobe J. 2009. Cause and norm. J. Phil. 11, 587-612. ( 10.5840/jphil20091061128) [DOI] [Google Scholar]
  • 75.Kominsky JF, Phillips J, Gerstenberg T, Lagnado DA, Knobe J. 2015. Causal superseding. Cognition 137, 196-209. ( 10.1016/j.cognition.2015.01.013) [DOI] [PubMed] [Google Scholar]
  • 76.Icard TF, Kominsky JF, Knobe J. 2017. Normality and actual causal strength. Cognition 161, 80-93. ( 10.1016/j.cognition.2017.01.010) [DOI] [PubMed] [Google Scholar]
  • 77.Gerstenberg T, Icard TF. 2020. Expectations affect physical causation judgments. J. Exp. Psychol. Gen. 149, 599-607. ( 10.1037/xge0000670) [DOI] [PubMed] [Google Scholar]
  • 78.Livengood J, Sytsma J, Rose D. 2017. Following the FAD: folk attributions and theories of actual causation. Rev. Phil. Psychol. 8, 273-294. ( 10.1007/s13164-016-0316-1) [DOI] [Google Scholar]
  • 79.Quillien T. 2020. When do we think that X caused Y? Cognition 205, 104410. ( 10.1016/j.cognition.2020.104410) [DOI] [PubMed] [Google Scholar]
  • 80.Kominsky JF, Phillips J. 2019. Immoral professors and malfunctioning tools: counterfactual relevance accounts explain the effect of norm violations on causal selection. Cogn. Sci. 43, e12792. ( 10.1111/cogs.12792) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Phillips J, Luguri J, Knobe J. 2015. Unifying morality’s influence on non-moral judgments: the relevance of alternative possibilities. Cognition 145, 30-42. ( 10.1016/j.cognition.2015.08.001) [DOI] [PubMed] [Google Scholar]
  • 82.Girotto V, Legrenzi P, Rizzo A. 1991. Event controllability in counterfactual thinking. Acta Psychol. 78, 111-133. ( 10.1016/0001-6918(91)90007-M) [DOI] [Google Scholar]
  • 83.Kirfel L, Icard TF, Gerstenberg T. 2022. Inference from explanation. J. Exp. Psychol. Gen. 151, 1481. ( 10.1037/xge0001151) [DOI] [PubMed] [Google Scholar]
  • 84.Hitchcock C. 2012. Portable causal dependence: a tale of consilience. Phil. Sci. 79, 942-951. ( 10.1086/667899) [DOI] [Google Scholar]
  • 85.McCormack T, Ho M, Gribben C, O’Connor E, Hoerl C. 2018. The development of counterfactual reasoning about doubly-determined events. Cogn. Dev. 45, 1-9. ( 10.1016/j.cogdev.2017.10.001) [Google Scholar]
  • 86.Kominsky JF, Gerstenberg T, Pelz M, Sheskin M, Singmann H, Schulz L, Keil FC. 2021. The trajectory of counterfactual simulation in development. Dev. Psychol. 57, 253-268. ( 10.1037/dev0001140) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.Beck SR, Riggs KJ. 2014. Developing thoughts about what might have been. Child Dev. Perspect. 8, 175-179. ( 10.1111/cdep.12082) [DOI] [Google Scholar]
  • 88.Carey S, Leahy B, Redshaw J, Suddendorf T. 2020. Could it be so? The cognitive science of possibility. Trends Cogn. Sci. 24, 3-4. ( 10.1016/j.tics.2019.11.007) [DOI] [PubMed] [Google Scholar]
  • 89.Rafetseder E, O’Brien C, Leahy B, Perner J. 2021. Extended difficulties with counterfactuals persist in reasoning with false beliefs: evidence for teleology-in-perspective. J. Exp. Child Psychol. 204, 105058. ( 10.1016/j.jecp.2020.105058) [DOI] [PubMed] [Google Scholar]
  • 90.Beck SR, Guthrie C. 2011. Almost thinking counterfactually: children’s understanding of close counterfactuals. Child Dev. 82, 1189-1198. ( 10.1111/j.1467-8624.2011.01590.x) [DOI] [PubMed] [Google Scholar]
  • 91.Nyhout A, Henke L, Ganea PA. 2019. Children’s counterfactual reasoning about causally overdetermined events. Child Dev. 90, 610-622. ( 10.1111/cdev.12913) [DOI] [PubMed] [Google Scholar]
  • 92.McCormack T, O’Connor E, Beck S, Feeney A. 2016. The development of regret and relief about the outcomes of risky decisions. J. Exp. Child Psychol. 148, 1-19. ( 10.1016/j.jecp.2016.02.008) [DOI] [PubMed] [Google Scholar]
  • 93.Koskuba K, Gerstenberg T, Gordon H, Lagnado DA, Schlottmann A. 2018. What’s fair? How children assign reward to members of teams with differing causal structures. Cognition 177, 234-248. ( 10.1016/j.cognition.2018.03.016) [DOI] [PubMed] [Google Scholar]
  • 94.Harris PL, German T, Mills P. 1996. Children’s use of counterfactual thinking in causal reasoning. Cognition 61, 233-259. ( 10.1016/S0010-0277(96)00715-9) [DOI] [PubMed] [Google Scholar]
  • 95.Leahy B, Rafetseder E, Perner J. 2014. Basic conditional reasoning: how children mimic counterfactual reasoning. Studia Logica 102, 793-810. ( 10.1007/s11225-013-9510-7) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 96.Rafetseder E, Schwitalla M, Perner J. 2013. Counterfactual reasoning: from childhood to adulthood. J. Exp. Child Psychol. 114, 389-404. ( 10.1016/j.jecp.2012.10.010) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97.Nyhout A, Ganea PA. 2019. Mature counterfactual reasoning in 4- and 5-year-olds. Cognition 183, 57-66. ( 10.1016/j.cognition.2018.10.027) [DOI] [PubMed] [Google Scholar]
  • 98.Low J, Perner J. 2012. Implicit and explicit theory of mind: state of the art. Brit. J. Dev. Psychol. 30, 1-13. ( 10.1111/j.2044-835X.2011.02074.x) [DOI] [PubMed] [Google Scholar]
  • 99.Kulke L, von Duhn B, Schneider D, Rakoczy H. 2018. Is implicit theory of mind a real and robust phenomenon? Results from a systematic replication study. Psychol. Sci. 29, 888-900. ( 10.1177/0956797617747090) [DOI] [PubMed] [Google Scholar]
  • 100.Beck SR, Robinson EJ, Carroll DJ, Apperly IA. 2006. Children’s thinking about counterfactuals and future hypotheticals as possibilities. Child Dev. 77, 413-426. ( 10.1111/j.1467-8624.2006.00879.x) [DOI] [PubMed] [Google Scholar]
  • 101.Gerstenberg T, Tenenbaum JB. 2017. Intuitive theories. In Oxford handbook of causal reasoning (ed. Waldmann M), pp. 515-548. Oxford, UK: Oxford University Press. [Google Scholar]
  • 102.Goodman ND, Tenenbaum JB, Gerstenberg T. 2015. Concepts in a probabilistic language of thought. In The conceptual mind: new directions in the study of concepts (eds E Margolis, S Lawrence), pp. 623–653. New York, NY: MIT Press.
  • 103.Ullman TD, Spelke E, Battaglia P, Tenenbaum JB. 2017. Mind games: game engines as an architecture for intuitive physics. Trends Cogn. Sci. 21, 649-665. ( 10.1016/j.tics.2017.05.012) [DOI] [PubMed] [Google Scholar]
  • 104.Sosa FA, Ullman TD, Tenenbaum JB, Gershman SJ, Gerstenberg T. 2021. Moral dynamics: grounding moral judgment in intuitive physics and intuitive psychology. Cognition 217, 104890. ( 10.1016/j.cognition.2021.104890) [DOI] [PubMed] [Google Scholar]
