Abstract
Humans discount delayed reward relative to more immediate reward. A plausible explanation is that impatience arises partly from uncertainty, or risk, implicit in delayed reward. Existing theories of discounting-as-risk focus on a probability that delayed reward will not materialize. By contrast, we examine how uncertainty in the magnitude of delayed reward contributes to delay discounting. We propose a model wherein reward is discounted proportional to the rate of random change in its magnitude across time, termed volatility. We find evidence to support this model across three experiments (total N = 158). First, using a task where participants chose when to sell products, whose price dynamics they previously learned, we show discounting increases in line with price volatility. Second, we show that this effect holds over naturalistic delays of up to 4 months. Using functional magnetic resonance imaging, we observe a volatility-dependent decrease in functional hippocampal–prefrontal coupling during intertemporal choice. Third, we replicate these effects in a larger online sample, finding that volatility discounting within each task correlates with baseline discounting outside of the task. We conclude that delay discounting partly reflects time-dependent uncertainty about reward magnitude, that is, volatility. Our model captures how discounting adapts to volatility, thereby partly accounting for individual differences in impatience. Our imaging findings suggest a putative mechanism whereby uncertainty reduces prospective simulation of future outcomes.
Keywords: discounting, impulsivity, uncertainty, risk, volatility
Humans and other animals exhibit impatience when offered reward (Berns et al., 2007; Frederick et al., 2002; Kalenscher & Pennartz, 2008; Loewenstein et al., 2003). Experiments show that the subjective value of reward decreases approximately in inverse proportion to its delay (Bickel et al., 2012; Green et al., 1994, 1999; Kirby et al., 1999; Madden et al., 2003). Specifically, people appear to discount the value of delayed reward following a hyperbolic function, such that the value, V, of a reward of magnitude X, available after a delay t, is given by the following equation:
$$V = \frac{X}{1 + K t^{s}} \tag{1}$$
Here K, referred to as a discount rate, determines how steeply reward value decreases as it is delayed, with higher values of K generating greater impatience. s is an additional parameter corresponding to power law time perception (Rachlin, 2006). The wider relevance of delay discounting is supported by a correlation with impulsive or shortsighted real-world behavior, from substance misuse to undersaving for retirement (Bickel & Marsch, 2001; Bickel et al., 2014; Critchfield & Kollins, 2001; Daugherty & Brase, 2010; Epstein et al., 2014; Koffarnus et al., 2013; MacKillop et al., 2011; Meier & Sprenger, 2012; Snider et al., 2019).
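As a minimal illustration of how K and s shape value, the following Python sketch evaluates the two-parameter hyperbola of Rachlin (2006), assumed here to take the form X / (1 + K t^s). The function name and the parameter values are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
def hyperbolic_value(magnitude: float, delay: float, k: float, s: float = 1.0) -> float:
    """Subjective value of a delayed reward, assuming V = X / (1 + K * t**s)."""
    if delay < 0 or k < 0:
        raise ValueError("delay and k must be non-negative")
    return magnitude / (1.0 + k * delay ** s)


# Illustrative (made-up) values: a reward of 10 delayed by 4 time units.
patient = hyperbolic_value(10.0, delay=4.0, k=0.05)    # shallow discounting
impatient = hyperbolic_value(10.0, delay=4.0, k=0.5)   # steep discounting
print(patient, impatient)
```

A higher K lowers the value of the same delayed reward, which is the sense in which K indexes impatience.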
An influential theory posits that delay discounting arises partly from uncertainty implicit in future reward (e.g., Andersen et al., 2008; Andreoni & Sprenger, 2012a; Harrison et al., 2005; Jones & Rachlin, 2009; Keren & Roelofsma, 1995; Kurth-Nelson et al., 2012; Luhmann et al., 2008; Noussair & Wu, 2006; Rachlin et al., 1991; Sozou, 1998; B. J. Weber & Chapman, 2005). Where the range of possible outcomes and their probabilities are precisely known, uncertainty over outcomes is termed “first-order uncertainty” (Bach et al., 2011), commonly conceptualized as “risk” (Von Neumann & Morgenstern, 1947). By contrast, uncertainty about the probability distribution of outcomes is referred to as “second-order uncertainty” (Bach et al., 2011; Knight, 1921), often conceptualized as “ambiguity” (Ellsberg, 1961). Here, for simplicity, we limit our discussion to first-order uncertainty, or risk.
Models of risk-sensitivity commonly assume that a preference for reward is affected by its variability (e.g., Bell, 1995; D’Acremont & Bossaerts, 2008; Kahneman & Tversky, 1979; Kroll et al., 1984; Sharpe, 1964; Symmonds et al., 2011; Weber et al., 2004). Furthermore, it has been proposed that delay discounting occurs because delayed reward entails greater risk than immediate reward (e.g., Jones & Rachlin, 2009; Rachlin et al., 1991; Sozou, 1998). However, rather than focusing on a variability associated with delayed reward, extant theories of discounting-as-risk have instead tended to emphasize a possibility that delayed reward will not be received. Such theories ascribe a probability per unit time that delayed reward will fail to materialize, termed a hazard rate (Friston et al., 2013; Jones & Rachlin, 2009; Kacelnik, 1997; Luhmann et al., 2008; Rachlin et al., 1991; Sozou, 1998; Yi & Landes, 2012). In support, delay discounting increases when people are faced with more hazardous life circumstances (Lahav et al., 2011; Sheffer et al., 2018) and is found to be higher both among people who are hopeless about the future (Pulcu et al., 2014), and among those who report reduced subjective survival probability (Chao et al., 2009). Furthermore, delay discounting correlates with a subjective probability that reward will not be paid out as promised (Patak & Reynolds, 2007; Reynolds et al., 2007; Takahashi et al., 2007).
Hazard rate models emphasize that delayed reward has a lower probability of being realized than immediate reward and hence has a lower expected (mean) value. By contrast, first-order uncertainty is greatest when all reward magnitudes are equally likely, which, in the case of reward versus no-reward, occurs when p(no-reward) = 0.5; see Rushworth & Behrens, 2008. Thus, if delay discounting is indeed sensitive to uncertainty about future reward then, even for instances where the expected (mean) value of future reward remains constant, a growing variance with time should lead to greater discounting. In other words, even where receipt of future reward is guaranteed, uncertainty about its magnitude may grow with delay. For example, owing to random changes in currency markets, the purchasing power of £100 1 year ahead is less predictable than it is today.
In support of this idea, delay discounting is higher in risk averse individuals (Eckel et al., 2005; Epper et al., 2011; Hayden & Platt, 2007; Leigh, 1986) and in those expressing a greater subjective intolerance of uncertainty (Luhmann et al., 2011). Finally, an observation that adding delay to choices between gambles reduces a bias toward riskless options (Noussair & Wu, 2006; B. J. Weber & Chapman, 2005) suggests that delay implicitly entails variability in reward. However, existing models of discounting make no claims about whether discounting is sensitive to future reward variance nor have previous approaches directly tested this experimentally. Previous work has examined uncertainty in the timing of future reward (Chesson & Viscusi, 2003; Dasgupta & Maskin, 2005); however, here we are concerned with magnitude uncertainty.
We address these issues here by formalizing first-order future uncertainty as volatility, the perceived rate of continuous random change in the reward environment, governing an increase in variance with delay (Figure 1a and b). We consider a simple model in which decision-makers discount future reward hyperbolically, with rate proportional to volatility. This model has a Bayesian interpretation as an optimal integration of future reward according to its magnitude uncertainty (see supplemental material; MacKay, 2003).
Figure 1. A Volatility Discounting Model.
Note. (a) Samples of reward magnitude drawn from a Gaussian random walk. Solid blue bars represent 1 SD above and below the mean at each timepoint. Green lines schematically represent Gaussian probability densities at the same timepoints. (b) Variance in expected reward magnitude grows linearly with delay with a slope given by volatility. (c) Immediate reward magnitude, x, at indifference, is plotted as a function of time delay for different levels of risk aversion (left panel) and volatility (right panel). Rewards are hyperbolically discounted, with discount rate given by an interaction between risk aversion m and volatility, σ2. Here, for simplicity, we assume η = 0, s = 1. See the online article for the color version of this figure.
We adopt a modeling framework wherein agents have an internal, or generative model, of reward dynamics (cf. Behrens et al., 2007; Mathys et al., 2011). Within this framework, reward magnitude and its volatility are (subjective) parameters of the internal model. We assume that these parameters are learned and therefore come to reflect both an agent’s preferences and the objective dynamics of the environment. Thus, magnitude uncertainty might entail either variance in the objective size of rewards (e.g., the value of a stock, or the weight of corn produced per hectare of land), or in the subjective value of reward (e.g., appetite for food, fashion tastes). Here, we manipulate volatility in objective reward magnitude, though we consider the very same model could be applied to volatility in subjective value.
A Volatility Discounting Model
We consider a situation wherein a decision-maker is promised a reward, with an expected magnitude, if redeemed immediately, of R0 (e.g., a £10 voucher for a local deli, which is expected to buy a kilogram of food). However, the reward is only available after a delay, t (e.g., the voucher is redeemable after a month). The decision-maker believes the magnitude of the reward might change during the delay (e.g., the price of food at the deli may fall or rise, meaning that the voucher will buy either more or less food).
Specifically, we assume that the decision-maker is equipped with a generative model wherein expected future reward magnitude, Rt > 0, evolves based on a random walk, with volatility, σ2, such that:
$$R_{t+1} \sim \mathcal{N}(R_t, \sigma^2) \tag{2}$$
We also consider an additional “emission noise” in the observed rewards, rt, which does not vary with time (e.g., portion sizes vary slightly from day-to-day), and has standard deviation, ϑ:
$$r_t \sim \mathcal{N}(R_t, \vartheta^2) \tag{3}$$
A key intuition here is that a latent quantity, Rt, governs reward magnitude and varies over time. However, the reward realized at a given time step, rt, is not precisely equal to Rt, but instead it is drawn from a Gaussian distribution with a mean of Rt.
Since there is no drift in the random walk and no bias in the emission noise, future reward has the same expected (mean) magnitude as if it were consumed immediately (e.g., one can expect, on average, a £10 voucher will buy the same amount of food next month as it would buy today):
$$\mathrm{E}[r_t] = R_0 \tag{4}$$
However, by the additive property of Gaussian random variables, the variance of future reward grows linearly with delay, reflecting an accumulation of random changes over time:
$$\mathrm{Var}[r_t] = \sigma^2 t + \vartheta^2 \tag{5}$$
Thus volatility, σ2, determines how quickly uncertainty in reward magnitude grows with delay (e.g., changeability in food prices). The above definition of volatility follows previous learning models (Behrens et al., 2007; Mathys et al., 2011); formally similar definitions occur in financial modeling (e.g., Black & Scholes, 1973; Sharpe, 1964).
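The linear growth of variance with delay can be checked numerically. The sketch below is an illustration only: it simulates a driftless Gaussian random walk with added emission noise and compares the empirical variance of the observed reward with σ2t + ϑ2. The function name and parameter values are hypothetical, and for simplicity the simulation ignores the constraint that Rt > 0.

```python
import random
import statistics


def simulate_reward(r0, sigma, emission_sd, delay, rng):
    """Draw one observed reward after `delay` steps of a driftless Gaussian random walk."""
    latent = r0
    for _ in range(delay):
        latent += rng.gauss(0.0, sigma)      # random change in the latent magnitude R_t
    return rng.gauss(latent, emission_sd)    # emission noise around R_t


rng = random.Random(0)
r0, sigma, emission_sd = 10.0, 1.0, 0.5      # illustrative values only
for delay in (0, 4, 16):
    draws = [simulate_reward(r0, sigma, emission_sd, delay, rng) for _ in range(20000)]
    empirical = statistics.pvariance(draws)
    predicted = sigma ** 2 * delay + emission_sd ** 2   # variance grows linearly with delay
    print(delay, round(statistics.fmean(draws), 2), round(empirical, 2), predicted)
```

Across delays the sample mean should stay close to R0 while the sample variance tracks the predicted value, up to sampling error.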
We posit that, given estimates of σ2 and ϑ2, the decision-maker values reward in inverse proportion to its variance:
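$$V(r_t) = \frac{\mathrm{E}[r_t]}{\mathrm{Var}[r_t]} = \frac{\mathrm{E}[r_t]}{\sigma^2 t + \vartheta^2} \tag{6}$$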
Here, rewards whose value is more precisely known contribute more to a long-run estimate of future value; distant rewards are more uncertain, and therefore receive less weight (MacKay, 2003).
We envisage σ2 and ϑ2 as parameters within the decision-maker’s internal model of reward dynamics. Here we assume that the volatility of the reward environment is static, and known to the decision-maker; we therefore fix σ2 to the objective, external volatility. However, in Equation 6, the value of reward tends to infinity as its variance tends to zero. To address this, we define the emission noise parameter, ϑ2, as a sum of internal and external sources of noise:
$$\vartheta^2 = \eta^2 + \theta^2 \tag{7}$$
where η2 denotes the objective, external noise, and θ2 denotes a degree of irreducible internal uncertainty associated with all reward, even a nominally certain reward. As described below, this formalism allows θ2 to capture individual differences in sensitivity to variance in objective reward magnitude.
To derive a discount function, we consider a choice between an immediate reward, E[r0] = x, and a larger delayed reward, E[rt] = X. At indifference, by Equation 6:
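$$\frac{x}{\mathrm{Var}[r_0]} = \frac{X}{\mathrm{Var}[r_t]} \quad\Rightarrow\quad x = X\,\frac{\mathrm{Var}[r_0]}{\mathrm{Var}[r_t]} \tag{8}$$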
Thus, delayed reward is discounted according to its uncertainty, relative to that of immediate reward. For the case where immediate reward is nominally certain (that is, it carries only the internal uncertainty θ2):
$$x = \frac{X\,\theta^2}{\sigma^2 t + \eta^2 + \theta^2} \tag{9}$$
Dividing the numerator and denominator by θ2 and substituting m = 1/θ2 gives the following equation:
$$x = \frac{X}{1 + m(\sigma^2 t + \eta^2)} \tag{10}$$
This arrangement thus yields hyperbolic discounting of risky reward according to its objective variance, with rate m. Where t = 0, m captures individual differences in risk aversion, with the variance of a risky prospect given by η2.
In fitting the model, we also allow for a possibility of nonlinear time perception, whereby uncertainty increases as a function of subjective time:
$$x = \frac{X}{1 + m(\sigma^2 t^{s} + \eta^2)} \tag{11}$$
This arrangement is commensurate with existing models of hyperbolic discounting (Rachlin, 2006). Where objective time-independent risk is negligible (η2 = 0), Equation 11 reduces to hyperbolic delay discounting (as shown in Equation 1), with rate proportional to volatility, K = mσ2, and where individual differences in “volatility discounting” are captured by m (shown in Figure 1c).
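The relationship between the volatility discounting function and the standard hyperbola can be made concrete with a short numerical check. The Python sketch below assumes the discount function takes the form x = X / (1 + m(σ2 t^s + η2)), as implied by the definitions above; function names and parameter values are hypothetical and purely illustrative.

```python
def volatility_discounted_value(magnitude, delay, m, volatility, noise_var=0.0, s=1.0):
    """Immediate amount x at indifference, assuming
    x = X / (1 + m * (volatility * delay**s + noise_var)),
    where `volatility` is sigma^2 and `noise_var` is eta^2."""
    return magnitude / (1.0 + m * (volatility * delay ** s + noise_var))


def hyperbolic_value(magnitude, delay, k, s=1.0):
    """Standard hyperbola X / (1 + K * t**s), for comparison."""
    return magnitude / (1.0 + k * delay ** s)


m, volatility = 0.2, 0.5      # illustrative values, not fitted estimates
k = m * volatility            # K = m * sigma^2 when eta^2 = 0
theta2 = 1.0 / m              # internal uncertainty, since m = 1 / theta^2
for delay in (0, 1, 4, 12):
    x = volatility_discounted_value(20.0, delay, m, volatility)
    # With eta^2 = 0 the model coincides with hyperbolic discounting at rate K.
    assert abs(x - hyperbolic_value(20.0, delay, k)) < 1e-12
    # The same value follows from weighting by theta^2 / (sigma^2 * t + theta^2).
    assert abs(x - 20.0 * theta2 / (volatility * delay + theta2)) < 1e-12
```

Doubling either m or σ2 doubles the effective discount rate, which is why the two panels of Figure 1c show the same qualitative pattern.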
Gabaix and Laibson (2017) derive a closely related model by considering a decision-maker’s internal uncertainty associated with future reward. In the resulting model, agents attempt to mentally simulate the future and combine these noisy signals with their prior expectations to generate posterior beliefs. By analogy to volatility, noise in the sampling process accrues linearly with delay. The authors show that this arrangement results in hyperbolic discounting, with a rate given by the precision of future simulations relative to prior expectations. A corollary of this arrangement is that at long delays, where the value of future reward is imprecisely known, average value estimates return to prior expectations.
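The shrinkage toward prior expectations described above follows from textbook Gaussian updating. The sketch below is not Gabaix and Laibson's full model; it assumes a Gaussian prior over value and a single simulated signal whose noise variance grows linearly with delay, and all names and numbers are hypothetical.

```python
def posterior_mean(signal, prior_mean, prior_var, noise_var_per_unit_time, delay):
    """Posterior mean of a reward's value after one noisy mental simulation.

    Assumes a Gaussian prior N(prior_mean, prior_var) and a signal whose noise
    variance is noise_var_per_unit_time * delay.
    """
    weight = prior_var / (prior_var + noise_var_per_unit_time * delay)
    return prior_mean + weight * (signal - prior_mean)


# With a prior mean of zero, the weight on the signal is
# 1 / (1 + (noise_var_per_unit_time / prior_var) * delay), a hyperbola in delay.
for delay in (0, 4, 400):
    print(delay, posterior_mean(10.0, 0.0, 4.0, 1.0, delay))
```

At zero delay the estimate equals the signal, and at long delays it approaches the prior mean, matching the corollary noted above.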
Gershman and Bhui (2020) extend the above framework by postulating that more precise future simulation demands mental effort. The authors show how this model can account for a well-known finding that people are more patient for larger rewards, by proposing that increased simulation effort is deemed more worthwhile for larger reward. Their model also accounts for a finding that choices are more variable for smaller rewards, an effect that is found to diminish for more uncertain rewards. Here by contrast we experimentally vary volatility in external reward. The present study therefore links Bayesian approaches to discounting with extant notions of discounting-as-risk.
Neural Correlates of Volatility Discounting
A further outstanding question is how uncertainty-dependent valuation of future reward is implemented in the brain. One process model we consider is that uncertainty reduces the extent to which future outcomes are incorporated into a person’s model of their future situation (Gershman & Daw, 2017; Kurth-Nelson et al., 2012; Schacter et al., 2008). Such high-level models are supported by medial temporal lobe (MTL) and related structures, in particular, the hippocampus (HC; Addis et al., 2007; Hassabis et al., 2007; Johnson & Redish, 2007; Peters & Büchel, 2010; Schacter et al., 2008; Tsao et al., 2018). Important here are observations that encouraging imagination of delayed reward, by embedding rewards within future episodes of a person’s life (e.g., an upcoming birthday), decreases delay discounting (Daniel et al., 2015; Peters & Büchel, 2010) and increases connectivity between MTL and prefrontal regions encoding discounted value (Peters & Büchel, 2010). However, if future rewards are more variable, then representing the ensuing range of future scenarios is more cognitively demanding (Gershman & Bhui, 2020). We therefore suggest that, when faced with uncertain future reward, rather than expend cognitive resources, people engage simpler value representations that are less enriched by episodic forecasts. This predicts that MTL regions should participate less in the evaluation of more volatile delayed rewards. In functional magnetic resonance imaging (fMRI), this would appear as a decreased correlation between MTL activity and discounted value, and/or a weakening of MTL connectivity with regions representing discounted value (Kable & Glimcher, 2007; McClure et al., 2004, 2007).
Summary of Experiments
We test the above predictions across three experiments (total N = 158), in which we manipulate reward volatility in a combined learning and intertemporal choice (ITC) task. In Experiment 2, a subset of participants performed the task whilst undergoing fMRI, to probe a relationship between uncertainty-dependent valuation of future reward and MTL activity. Notably, in reinforcement-learning models, discounting of past reward is determined by a learning rate, where a high learning rate entails steeper discounting of past rewards and faster value updates (Daw et al., 2005, 2006; Dolan & Dayan, 2013; Mathys et al., 2011; Rescorla & Wagner, 1972; Sutton & Barto, 1998; Wilson et al., 2010). As predicted by optimality, learning rates in humans increase when reward contingencies are more changeable, or volatile (Behrens et al., 2007; Iigaya, 2016; McGuire et al., 2014; Nassar et al., 2010, 2012). Both learning rate and discount rate ought therefore to increase when reward dynamics become more changeable. Furthermore, if learning rate and discount rate both depend on perceived volatility, then the two parameters might be correlated across participants. We also examine these hitherto untested predictions.
Experiment 1
In Experiment 1, participants were briefed to imagine that they owned a farming business, selling produce to the highest bidder in a marketplace. Participants learned how the prices of three different products (wheat, chicken, and beans) evolved week-by-week, where a week corresponded to a trial of the experiment (Figure 2). The three products had different levels of volatility in price evolution. Participants subsequently made ITCs about when to sell each product, either immediately for a guaranteed price or in the marketplace following a delay.
Figure 2. Design of Experiment 1: Farming Futures Task.
Note. Participants tracked prices of three agricultural products, where a single trial corresponded to a “week.” For one product (a), no volatility, the underlying market price was constant, with added Gaussian emission noise. For another two products, the market price underwent shifts across time, with the same emission noise. For a low volatility product (b) shifts in the market price were small, while for a high volatility product (c), shifts were more extreme. (d) After observing prices over several trials, participants were asked to predict upcoming prices 1 week ahead. Within each block, participants performed three phases of observation and prediction: The first consisted of 70 observation trials followed by 70 prediction trials, while the subsequent two phases each consisted of 45 observation trials and 5 prediction trials. (Example predictions from one participant shown in color in Panels a–c). The time series was paused at three points (vertical blue lines), where participants predicted market prices further into the future. They were subsequently asked to indicate the lowest price they would accept to sell the product on the market after a delay (not shown), providing an estimate of the discount factor at each delay. See the online article for the color version of this figure.
Method
Ethics Statement
All participants gave full informed consent before taking part in the study. The study procedures received approval from the UCL Research Ethics Committee (3450/002) and were carried out in accordance with these guidelines.
Data and Code Availability
Behavioral data supporting the findings of this study are publicly available online in a third-party repository: https://doi.org/10.5061/dryad.47d7wm3k2 (G. Story, 2023). Computer codes that support the findings of this study are available from the corresponding author upon reasonable request.
Participant Recruitment and Sample Size
This experiment was designed as a pilot and therefore focused on testing for larger, within-participant effects. Participants were recruited from the UCL Institute of Cognitive Neuroscience subject database. Twenty participants (mean age 27.4 years, SD 6.9 years; nine female) completed the experiment.
Baseline Discounting
Prior to the main task, we elicited discount functions for riskless quantities of money. Participants were required to indicate the smallest immediate monetary reward, termed their indifference amount, that they would be willing to accept instead of a larger stated quantity of money (£8, £9, £11, or £12) to be received at a specified delay (1, 2, 4, 26, or 52 weeks). Each delay was presented twice for each larger reward amount, creating 40 choices in total. One choice was selected to be paid for real, at the stated delay, in postdated Amazon vouchers. To achieve this in an incentive-compatible manner, for the selected choice, we randomly selected an immediate reward from a uniform distribution between £0 and the magnitude of the larger reward (e.g., £12); if this amount was below or equal to the participant’s stated indifference point, they received the delayed reward; if it was above the indifference point, they received the randomly drawn immediate reward. Participants were fully briefed on this procedure. We fitted a two-parameter hyperbolic model of the form shown in Equation 1 to participants’ indifference points (Rachlin, 2006).
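To illustrate what fitting the two-parameter hyperbola to indifference points involves, the sketch below recovers K and s from synthetic data by least squares over a coarse grid. This is a stand-in for illustration, assuming the form X / (1 + K t^s); it is not the fitting routine used in the study, and the function names, grid ranges, and generating parameters are hypothetical.

```python
def hyperbolic_value(magnitude, delay, k, s):
    """Two-parameter hyperbola, assumed form X / (1 + K * t**s)."""
    return magnitude / (1.0 + k * delay ** s)


def fit_hyperbola(choices):
    """Grid-search least-squares estimate of (K, s).

    `choices` is a list of (delayed_magnitude, delay_weeks, indifference_amount).
    """
    best = None
    for i in range(-60, 21):            # log10(K) from -6.0 to 2.0 in steps of 0.1
        k = 10 ** (i / 10)
        for j in range(1, 31):          # s from 0.1 to 3.0 in steps of 0.1
            s = j / 10
            sse = sum((amount - hyperbolic_value(x, t, k, s)) ** 2
                      for x, t, amount in choices)
            if best is None or sse < best[0]:
                best = (sse, k, s)
    return best[1], best[2]


# Synthetic, noise-free indifference points generated from known parameters (no real data),
# using the reward magnitudes and delays of the baseline task.
true_k, true_s = 0.1, 0.7
data = [(x, t, hyperbolic_value(x, t, true_k, true_s))
        for x in (8, 9, 11, 12) for t in (1, 2, 4, 26, 52)]
k_hat, s_hat = fit_hyperbola(data)
print(k_hat, s_hat)
```

Searching K on a logarithmic scale mirrors the common practice of analyzing log K, since discount rates tend to be strongly right-skewed.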
Learning Price Dynamics
During the task, participants observed and predicted the price of each product, displayed on a linear scale ranging from £0 to £25, as it evolved over the course of 240 trials. Each trial of the experiment was described as a “week.” After passively observing prices over several “weeks” (trials), participants were asked to predict upcoming prices 1 week ahead; the task therefore involved both observational and instrumental learning (Figure 2a–c). Participants were instructed about two sources of variability in prices: Gaussian emission noise, applying equally to all products, which we described as “variability in bidding,” and changes in the underlying “market price.” For one of the three products (“no volatility”), the market price was held constant; the market price of the other two products (“low volatility” and “high volatility”) underwent random changes across time, with the same Gaussian emission noise. We used two predefined sequences of outcomes for each product; participants were then allocated at random to one of the two sequences. We estimated learning rates for the three products separately by fitting a Rescorla–Wagner learning model (see supplemental material; Rescorla & Wagner, 1972) to participants’ price predictions from the first block of 70 prediction trials (shown in Figure 2a–c).
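For readers unfamiliar with the Rescorla–Wagner model, the following sketch shows the delta-rule update and a simple way to estimate a learning rate from one-step-ahead predictions. The details of the actual fit are in the supplemental material; the grid search, function names, and example prices here are assumptions made for illustration.

```python
def rescorla_wagner_predictions(prices, alpha, initial_estimate):
    """One-step-ahead predictions from a delta rule: V <- V + alpha * (price - V)."""
    estimate = initial_estimate
    predictions = []
    for price in prices:
        predictions.append(estimate)             # prediction made before seeing this price
        estimate += alpha * (price - estimate)   # update by a fraction of the prediction error
    return predictions


def fit_learning_rate(prices, reported_predictions, initial_estimate):
    """Grid-search alpha in [0, 1] minimising squared error to the reported predictions."""
    def sse(alpha):
        model = rescorla_wagner_predictions(prices, alpha, initial_estimate)
        return sum((p - q) ** 2 for p, q in zip(model, reported_predictions))
    return min((a / 100 for a in range(101)), key=sse)


# Synthetic example: predictions generated with alpha = 0.4 are recovered by the fit.
prices = [10.0, 12.0, 9.0, 14.0, 11.0, 13.0]
reported = rescorla_wagner_predictions(prices, 0.4, initial_estimate=10.0)
alpha_hat = fit_learning_rate(prices, reported, initial_estimate=10.0)
print(alpha_hat)
```

A learning rate near 1 means predictions follow the most recent price almost exactly, whereas a rate near 0 means predictions average over a long history.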
ITC Procedure
At three points during each block, participants were asked to predict the market price further into the future, at delays of 1, 4, 7, 12, or 18 weeks. Participants subsequently chose when to sell the product, either immediately for a fixed price (x), or on the market after a stated delay (1, 4, 7, 12, or 18 weeks). Specifically, they were asked to indicate the smallest fixed price that would just tempt them away from selling on the market. Participants were informed that the future price would evolve according to the same process they had previously observed and was also subject to the same Gaussian emission noise. By contrast, the immediate price was fixed, with no objective risk.
Participants were informed that, after the experiment, we would select one of their choices to be paid out for real. To realize this in an incentive-compatible manner, for the selected choice, we randomly selected an immediate fixed price from a uniform distribution between £0 and £25; if this amount was below the participant’s stated indifference point, they received the simulated future market price for the product as a bonus payment. If the selected price was above the participant’s indifference point, they received the randomly drawn fixed price. All bonus payments were made on the same day, at the end of the experiment.
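The payout rule can be written out explicitly. It resembles the Becker–DeGroot–Marschak mechanism, under which misreporting one's indifference point can only lead to a less preferred outcome. The sketch below is an illustration with hypothetical names; the handling of an exact tie between the drawn price and the indifference point is an assumption, since the text does not specify it for this task.

```python
import random


def resolve_choice(indifference_price, future_market_price, rng, max_price=25.0):
    """Resolve one selected choice with a randomly drawn fixed price.

    Returns (payout, sold_immediately).
    """
    offer = rng.uniform(0.0, max_price)       # random fixed price between 0 and max_price
    if offer < indifference_price:
        return future_market_price, False     # offer too low: sell on the market instead
    return offer, True                        # offer acceptable: sell now at the drawn price


rng = random.Random(42)
payout, sold_now = resolve_choice(indifference_price=12.0, future_market_price=17.5, rng=rng)
print(payout, sold_now)
```

Because the drawn price, not the stated indifference point, determines the immediate payout, participants have no incentive to inflate or deflate their reports.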
Statistical Analyses
We used a Bayesian hierarchical mixed-effects model fitting routine, the details of which have been described previously (Huys et al., 2011, 2012). Models were compared using the integrated Bayesian information criterion (BICi), which approximates the model evidence (Huys et al., 2011, 2012). We also computed exceedance probabilities (φ), which estimate the likelihood of a given model outperforming the alternatives, given the model evidence for each across participants (Stephan et al., 2009).
We fitted participants’ reported indifference points with a volatility discounting model. The model assumed two independent contributions to discounting: a baseline component due to effects other than volatility, with rate K, (see Equation 1) and an additional component due to volatility (see Equation 11):
12 |
Here x represents the participant’s immediate price at indifference, while X represents the expected future market price. We tested alternative models in which X was given by either (a) each participant’s individual estimate of the future price at the relevant delay, or (b) the mean of this estimate across participants. σ2 represents objective volatility, and η2 objective emission noise. m is a participant-specific risk aversion parameter. c is a bias term that does not vary with condition and is set to zero for immediate options. s is an additional parameter corresponding to power law time perception (Rachlin, 2006). Using nonlinear optimization in MATLAB (MathWorks, Natick, MA), with a Gaussian likelihood function, parameters were sought that minimized differences between reported indifference amounts and those predicted by the model.
We first fitted hyperbolic curves to each product separately, omitting the volatility discounting term (setting m = 0), and testing for an effect of volatility on log K using linear mixed-effects regression. The contribution of each participant’s data to this analysis was weighted by the reliability of their log K estimates (see Huys et al., 2012). We went on to fit the full model to the three products jointly, with differences between products parameterized by m. Finally, we tested a different class of model wherein risk preference is accounted for by concave utility over reward magnitude (see supplemental material).
Results
In-Task Discount Rate Increased With Volatility
As shown in Figure 3a, a volatility discounting model in which future reward magnitude was estimated from participants’ individual price predictions outperformed a version based on mean price predictions across participants (ΔBICi = 1,104; higher model evidence in 16/20 participants; φ = 0.998). This model also outperformed a null model in which m = 0 (ΔBICi = 1,252; higher model evidence in 16/20 participants; φ = 0.999), and an alternative model based on concave utility (ΔBICi = 400; higher model evidence in 16/20 participants; φ = 0.999), supporting an effect of volatility discounting.
Figure 3. Model Fits, Price Forecasts, and Discount Factors in Experiment 1.
Note. (a) Model comparison; blue asterisk indicates the best fitting volatility discounting model. (b) Observed indifference amounts are plotted against indifference amounts predicted by the best fitting model. (c) Mean indifference amounts (immediate selling prices) across all choices are plotted as a function of delay, separated by product (N = 20 participants). Solid lines show the fits of a volatility discounting model, where subjective forecasted future prices are hyperbolically discounted according to their variance. (d) Mean price forecasts for the three products separated by initial market price for the volatile products. (e) To illustrate effects of initial price on delay discounting, indifference amounts are transformed to discount factors (immediate price/future price), using the initial market price as an estimate of the future price. Indifference amounts predicted by the model are also divided by the initial market price, and overlaid with the observed data. When the price of the high volatility product was expected to rise, participants chose to defer selling the product (left panel). By contrast, participants were more likely to sell the high volatility product immediately when its price was expected to fall (right panel). When the expected future price was constant (central panel), discounting increased with increasing volatility. Throughout, error bars indicate one standard error. BICi = integrated Bayesian information criterion. See the online article for the color version of this figure.
As shown in Figure 3b and c, a volatility discounting model (Equation 12) provided a good fit to participants’ choices. Figure 3c shows observed indifference amounts for the three products, averaged across choices and participants, together with those predicted by the model. Discounting increased with volatility (linear mixed-effects regression on log K fitted to each product separately: βcondition = 0.19, t(58) = 2.38, p = .020).
Participants’ future price predictions for the three products are shown in Figure 3d. For the purposes of illustrating effects of initial price on delay discounting, we transformed indifference amounts to discount factors (immediate price/future price), using the initial market price as an estimate of the future price. Indifference amounts predicted by the model were transformed in the same way, overlaid with the observed discount factors in Figure 3e.
When the initial price was close to the long-run mean, participants correctly predicted zero net growth in price, and discounting increased with increasing volatility. For the high volatility product, when the current price was below average, participants expected the future price to increase across time, and accordingly showed a preference to defer reward. For the low volatility product, when the current price was below average, expected price increases were more subtle, and discount factors predominantly reflected a small effect of volatility. By contrast, when the initial price was above average, participants expected the price of the high and low volatility items to fall and accordingly were more likely to prefer the immediate reward. These effects reflect the experimental design, wherein price evolution was bounded between £0 and £25, leading participants to expect that eccentric prices would tend to return toward the long-run average. The effects are captured by the model, which is furnished with participants’ subjective estimates of future market prices.
Baseline Discount Rate Correlated With In-Task Volatility Discounting
Importantly, volatility discounting within the task, governed by parameter m, showed a significant positive correlation with baseline discounting (Spearman ρ = 0.49, p = .049; N = 17; three participants who answered £0 in response to all baseline questions were excluded from this analysis). That is, people who showed greater discounting of uncertainty within the task tended to show steeper discounting of money outside of the task.
Learning Rate Increased With Volatility
As previously reported (Behrens et al., 2007; Lee et al., 2020), Rescorla–Wagner learning rates increased with increasing volatility (linear mixed-effects regression on α: βcondition = 0.88, t(58) = 6.61, p < .001; no volatility: mean α = 0.39, SD = 0.22; low volatility: mean α = 0.55, SD = 0.17; high volatility: mean α = 0.75, SD = 0.11). However, in-task discounting showed no significant correlation with learning rate across participants in any of the three conditions (log K vs. α; no volatility: ρ = 0.41, p = .076; low volatility: ρ = −0.12, p = .612; high volatility: ρ = −0.28, p = .232; supplemental Figure S1).
Summary
In summary, in this pilot experiment, we found that within-task delay discounting increased in line with reward volatility. Furthermore, people who showed greater volatility-dependent increases in discounting within the task tended to show steeper discounting of money across real delays at baseline. This finding supports the hypothesis that discounting of time-dependent uncertainty contributes to individual differences in delay discounting.
Experiment 2
Experiment 2 tested whether the effects observed in Experiment 1 replicated in a larger sample and also probed neural correlates of volatility discounting. Here, to test whether effects of volatility extend to timescales used in conventional discounting tasks, we superimposed the timescale of the task onto longer delays. Specifically, one actual ITC was selected to be paid out at the stated delay, on the order of weeks. To further test the veridicality of the model, we measured risk aversion outside the main task, and elicited participants’ subjective estimates of future uncertainty within-task.
Method
Ethics Statement
All participants gave full informed consent before taking part in the study. The study procedures received approval from the UCL Research Ethics Committee (3450/002) and were carried out in accordance with these guidelines.
Data and Code Availability
Behavioral data supporting the findings of this study are publicly available online in a third-party repository: https://doi.org/10.5061/dryad.47d7wm3k2 (G. Story, 2023). Computer codes and imaging data that support the findings of this study are available from the corresponding author upon reasonable request.
Learning Phase
Participants learned price dynamics according to a similar procedure as described for Experiment 1 (see supplemental material and Figure 4a and b). Here only two products were used, to simplify the neuroimaging analysis. For one of the two products (“stable”), the market price was held constant at £25, and participants were explicitly informed about this; the market price of the other product (“volatile”) evolved according to a Gaussian random walk, with zero mean drift and volatility σ = 3.5, upper bounded at £50 and lower bounded at £0 (Figure 4a). We used two predefined sequences of outcomes sampled from a random walk with these properties; participants were then allocated at random to one of the two sequences. To estimate the optimal learning rate for the volatile product, we fitted a Rescorla–Wagner model so as to minimize differences between observed price inputs and those predicted by the model. We did this for the specific stimulus sequence observed by each participant, and averaged the estimates.
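The procedure described above can be illustrated with a minimal sketch, which simulates a bounded Gaussian random walk with emission noise and then finds the Rescorla–Wagner learning rate that minimizes squared one-step-ahead prediction error. This is not the study code: the function names, the number of trials, the use of clipping to impose the price bounds, and the grid search over α are all assumptions made for illustration.

```python
import random

def simulate_prices(n_trials, sigma=3.5, eta=2.0, start=25.0, lo=0.0, hi=50.0, seed=0):
    """Observed prices: bounded zero-drift random walk plus Gaussian emission noise."""
    rng = random.Random(seed)
    price, observed = start, []
    for _ in range(n_trials):
        # Clipping to [lo, hi] is one possible way to impose the bounds (assumption).
        price = min(hi, max(lo, price + rng.gauss(0.0, sigma)))
        observed.append(price + rng.gauss(0.0, eta))
    return observed

def rw_squared_error(observed, alpha, v0=25.0):
    """Sum of squared one-step-ahead prediction errors under a Rescorla-Wagner rule."""
    v, sse = v0, 0.0
    for x in observed:
        delta = x - v          # prediction error on this trial
        sse += delta * delta
        v += alpha * delta     # Rescorla-Wagner update
    return sse

def best_learning_rate(observed, grid_size=101):
    """Grid search over alpha in [0, 1]; returns the error-minimizing value."""
    alphas = [i / (grid_size - 1) for i in range(grid_size)]
    return min(alphas, key=lambda a: rw_squared_error(observed, a))

if __name__ == "__main__":
    volatile = simulate_prices(500, sigma=3.5, seed=1)  # hypothetical trial count
    stable = simulate_prices(500, sigma=0.0, seed=1)
    print("volatile:", best_learning_rate(volatile))
    print("stable:", best_learning_rate(stable))
```

Under these assumptions, the error-minimizing learning rate is high for the volatile sequence and close to zero for the stable sequence, because recent observations are informative about the next price only when the underlying market price itself changes.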
Figure 4. Design of Experiment 2: Online Marketplace Task.
Note. (a) Participants first observed prices of two products to be sold in an imaginary online marketplace, where one trial corresponded to 1 week. For one of the two products, the market price (solid gray line) was constant at £25, with added Gaussian emission noise. For the other product, the market price changed as a Gaussian random walk (volatility σ = 3.5), with the same emission noise (standard deviation η = 2). Sample inputs from one participant are shown. (b) Participants predicted prices 1 week ahead. (c) Subsequently, inside the fMRI scanner, participants chose between selling each product today for a guaranteed price, or after a delay in the simulated online marketplace. This differs from a standard delay discounting task because the uncertainty of the future payoff in the volatile condition grows with delay. See the online article for the color version of this figure.
ITC Phase
After observing price evolution for both products, participants entered a separate ITC phase of the experiment. Here, participants made a series of binary choices about when to sell each product, either immediately for a guaranteed (i.e., riskless) price (less than £25), or in the marketplace after a stated delay (0, 1, 4, or 17 weeks) from a starting price of £25 (Figure 4c). We included a delay of 0 weeks to allow the intercept of the discounting curve to be reliably estimated. We selected a set of guaranteed immediate prices that allowed a plausible range of discount factors to be estimated (the full choice set, with equivalent K values at indifference, is shown in supplemental Table 4).
Participants were informed that one of their choices would be selected and realized after the experiment and, depending on their actual choice, participants would be paid either the guaranteed amount on the day of the experiment (if they opted for the immediate choice), or the simulated future price of the product at the stated number of weeks in the future (if they opted for the delayed choice). Here, since delays were real, rather than embedded within the timescale of the task, we expected a significant degree of discounting even in the stable condition, due to the influence of effects outside of the task. Modeling analyses followed equivalent procedures as described for Experiment 1: we fitted log K separately for the two products and also fitted volatility discounting models. We also tested a model wherein risk preference is accounted for by concave utility over reward magnitude.
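To make the structure of the fitted models concrete, the sketch below gives one possible reading of a volatility discounting model. Equation 12 is not reproduced in this section, so the functional form here is an assumption: a hyperbolic discount function in which the effective discount rate is the baseline rate K plus risk aversion m multiplied by volatility σ², with time-independent emission variance η² entering through m as well. The logistic choice rule, and the way the slope s and bias c enter it, are likewise hypothetical.

```python
import math

def discounted_value(x, t, K, m, sigma2, eta2=0.0):
    """ASSUMED form: V = X / (1 + K*t + m*(sigma2*t + eta2)).

    With m = 0 this reduces to standard hyperbolic discounting, X / (1 + K*t);
    with t = 0 it reduces to a hyperbolic risk discount, X / (1 + m*eta2).
    """
    return x / (1.0 + K * t + m * (sigma2 * t + eta2))

def p_choose_delayed(x_now, x_later, t, K, m, sigma2, eta2=0.0, s=1.0, c=0.0):
    """Hypothetical logistic choice rule with slope s and bias c against the delayed option."""
    v_later = discounted_value(x_later, t, K, m, sigma2, eta2)
    return 1.0 / (1.0 + math.exp(-(s * (v_later - x_now) - c)))

if __name__ == "__main__":
    # Illustrative parameter values only; these are not estimates from the study.
    for sigma2 in (0.0, 3.5 ** 2):
        v = discounted_value(25.0, t=17, K=0.01, m=0.001, sigma2=sigma2)
        print(f"sigma^2 = {sigma2:5.2f}: value of £25 in 17 weeks = {v:.2f}")
```

Under this assumed form, a participant with m > 0 values the same delayed price less when the product is volatile, and the size of that reduction grows with delay.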
Participant Recruitment, Sample Size, and Power Calculation
We conducted a behavioral pilot experiment with 11 participants, using the above design. In this pilot experiment, we observed an effect size of d = 0.76, based on the mean difference in log K between stable and volatile products, suggesting that a medium effect size was a plausible assumption. Sample size for the imaging experiment was therefore determined so as to achieve at least 80% power to detect a medium effect size (Cohen’s d = 0.5), based on a paired t-test, indicating a required sample size of 34 participants. We aimed to recruit at least 34 participants for the main experiment, in addition to the pilot sample. Thirty-six participants were recruited from the UCL Institute of Cognitive Neuroscience subject database, and underwent MRI scanning. Including the behavioral pilot, a total of 47 participants (mean age 28.0 years, SD 8.4 years, 32 female) completed the experiment.
fMRI Methods
Imaging methods are described in online supplemental material.
Baseline Risk Aversion
Before participants were introduced to the market behavior, we measured their risk preferences for lotteries to be paid out on the same day. Each lottery had prices drawn from a Gaussian distribution with mean £25, and one of four standard deviations (ranging from £5 to £13). For each lottery participants observed 36 outcomes drawn from the relevant distribution, before making a series of choices between receiving a guaranteed amount (between £11 and £25) or accepting a one-off play of the lottery. We fitted participants’ risk choices with a hyperbolic risk discounting model of the form shown in Equation 12, with t = 0, setting η² to the objective variance of the lottery. We also tested a standard additive mean–variance model (Bell, 1995; D’Acremont & Bossaerts, 2008; Kroll et al., 1984; Symmonds et al., 2011; Weber et al., 2004; see supplemental material).
Subjective Future Uncertainty
Following the learning phase, we elicited participants’ predictions about each product’s future price, for a scenario in which the current market price was stated to be £25. Using a graphical interface, participants were also asked to indicate the lower and upper bounds of an interval in which they were 90% certain the future price would lie, described as “highest and lowest reasonable estimates” (for similar estimation procedures see Cesarini et al., 2006; Delavande & Rohwedder, 2008; O’Connor, 1989). We fitted a model to participants’ confidence intervals based on the true generative process, that is, a Gaussian random walk (see supplemental material), to derive a subjective estimate of σ² for each participant in each condition, which we termed “subjective future uncertainty” (SFU).
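One way to derive a subjective volatility estimate from such intervals is sketched below. The fitting procedure itself is described only in the supplemental material, so this is an assumption rather than the authors' method: under a Gaussian random walk the predictive variance at delay t is σ²t plus a time-independent term, so the squared half-width of a 90% interval, rescaled by the corresponding normal quantile, should be linear in delay. The function name and the ordinary least squares fit are hypothetical choices.

```python
from statistics import NormalDist, mean

# Half-width of a central 90% interval, in standard deviations (about 1.645).
Z90 = NormalDist().inv_cdf(0.95)

def fit_subjective_volatility(delays, lowers, uppers):
    """Least-squares fit of implied variance = sigma2 * delay + intercept.

    Returns (sigma2, intercept): the slope is the subjective volatility
    (variance added per unit delay); the intercept is time-independent uncertainty.
    """
    implied_var = [((u - l) / (2.0 * Z90)) ** 2 for l, u in zip(lowers, uppers)]
    t_bar, v_bar = mean(delays), mean(implied_var)
    sxx = sum((t - t_bar) ** 2 for t in delays)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(delays, implied_var))
    sigma2 = sxy / sxx
    return sigma2, v_bar - sigma2 * t_bar

if __name__ == "__main__":
    # Hypothetical responses from an observer who reports exact 90% intervals
    # for sigma = 3.5 and emission noise eta = 2, around a price of £25.
    delays = [1, 4, 9, 13]
    half = [Z90 * (3.5 ** 2 * t + 2.0 ** 2) ** 0.5 for t in delays]
    lowers = [25.0 - h for h in half]
    uppers = [25.0 + h for h in half]
    sigma2, intercept = fit_subjective_volatility(delays, lowers, uppers)
    print(f"subjective sigma = {sigma2 ** 0.5:.2f}, intercept = {intercept:.2f}")
```

The square root of the fitted slope is on the same scale as the generative volatility σ, which is what allows subjective and true volatility to be compared directly.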
Statistical Analyses
As shown in Figure 1, the volatility discounting model predicts that a component of discounting is proportional to an interaction between future uncertainty (σ²) and risk aversion (m). To test this, we examined the relationship between log K and an (SFU × m) interaction, where m is measured from risk choices outside of the discounting task. The interaction term denotes the “subjective cost of future uncertainty.” We tested for an effect on log K of a subjective cost of future uncertainty in stable and volatile conditions separately, using simple linear regression. Here, the contribution of each participant’s data was weighted by the reliability of their log K estimates (see Huys et al., 2012), where log K was estimated separately for the two conditions (setting m = 0 when fitting discounting choices).
We also tested whether an effect of condition on discounting was greater amongst participants who showed a greater increase in the subjective cost of uncertainty between conditions. To do so, we calculated, for each participant, the change in SFU between conditions: dSFU = SFU_volatile − SFU_stable. We then implemented a mixed-effects linear regression on log K, with m, dSFU, condition, (Condition × dSFU), (Condition × m), and (Condition × dSFU × m) as predictor variables, where condition is coded as a within-subject dummy variable. We hypothesized that participants who were (a) risk averse and (b) showed a greater subjective increase in uncertainty between conditions would be more sensitive to an effect of condition, manifest in a significant three-way (Condition × dSFU × m) interaction.
Results
Baseline Risk Aversion
A hyperbolic risk discounting model substantially outperformed a standard mean–variance model in accounting for baseline risk preferences (higher model evidence in 39/45 participants; φ > 0.999). Participants were on average risk averse (mean log m = −7.44, 95% CI [−7.82, −7.06]) and also showed a bias away from choice of the risky option (mean c = 3.43, 95% CI [2.86, 4.00]).
Subjective Future Uncertainty Increased With Volatility
As shown in Figure 5a, subjective confidence intervals increased with delay. On average, in the volatile condition participants accurately predicted that the expected future price would remain £25 (Figure 5b; mean slope of future predictions, β = 0.03, 95% CI of β [−0.38, 0.44]). Consistent with an effect of volatility, SFU was significantly greater in the volatile than in the stable condition, paired t(46) = 8.7, two-tailed p < .0001, d = 1.27; Figure 5a and c. In the volatile condition, participants tended to slightly overestimate the true volatility (group mean , true σ = 3.50, 95% CI of [3.44, 4.90]). Notably there was a shallow positive slope even in the stable condition, in keeping with a prior belief that uncertainty grows with delay (group mean , true σ = 0, 95% CI of [2.11, 2.80]).
Figure 5. Subjective Future Uncertainty and Price Forecasts in Experiment 2.
Note. (a) Mean width of subjective confidence intervals on future prices. Participants accurately predicted that uncertainty grows as a function of delay in the volatile condition (red). Solid lines show the fit of a random walk model. (b) Mean “best guesses” about future prices. Participants accurately predicted that expected prices are constant across delay. Solid lines show the fit of log growth curves. N = 47 for all analyses shown. Error bars indicate one standard error. (c) Subjective future uncertainty (SFU), that is subjective volatility derived from growth in confidence intervals across delay. Filled gray bars show group means, solid green and red bars 95% CIs; black dots show individual parameter estimates. See the online article for the color version of this figure.
*** p < .001.
Delay Discounting Increased With Volatility
Two of 47 participants always selected the immediate option, rendering their discounting preferences inestimable, and consequently they were excluded from analysis (N = 45). The proportion of choices on which participants chose the delayed option decreased with delay, indicating participants discounted delayed payoffs (Figure 6a). Fitting the baseline discount rate, K, separately for the two conditions (setting m = 0 in Equation 12, leaving K, c, and s as free parameters), we found that log K was significantly greater in the volatile condition, mean difference in log K = 0.37, 95% CI [0.21, 0.53], t(44) = 4.62, p < .001, d = 0.69, in keeping with steeper discounting.
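As a check on the arithmetic, the reported effect sizes for paired comparisons are consistent with the standardized mean difference for paired data, d = t / √N. The short sketch below assumes this is how d was obtained; the section does not state the formula, and the function name is hypothetical.

```python
import math

def cohens_d_paired(t_stat, n_pairs):
    """Standardized mean difference for paired data: d = t / sqrt(N). Assumed formula."""
    return t_stat / math.sqrt(n_pairs)

if __name__ == "__main__":
    # Difference in log K between conditions: t(44) = 4.62, N = 45.
    print(round(cohens_d_paired(4.62, 45), 2))
    # Difference in subjective future uncertainty: t(46) = 8.7, N = 47.
    print(round(cohens_d_paired(8.7, 47), 2))
```

Both values agree with the effect sizes reported in the text (d = 0.69 and d = 1.27, respectively).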
Figure 6. Delay Discounting as a Function of Risk Aversion and Future Uncertainty in Experiment 2.
Note. (a) Mean discount functions for stable (green) and volatile conditions (red) for participants separated by risk aversion, N = 15 per tertile. Low risk aversion: log(m) ≤ −8.06; medium: −8.06 < log(m) ≤ −4.75; high: log(m) > −4.75. Error bars represent 1 SE. Solid lines represent fits of a volatility discounting model to both conditions together, with risk aversion, m, estimated from risk choices outside of the ITC task. (b) Discounting depends on the subjective cost of future uncertainty. Log K fitted directly to volatile (red) and stable (green) conditions is plotted as a function of risk aversion (m, Z-scored), SFU (shown here for high risk averse participants, N = 15) and an (SFU × Risk Aversion) interaction. Color intensity reflects the contribution of each data point to the weighted least squares regression. P-values are for each condition and regressor separately. (c) Model comparison; blue asterisk indicates the best fitting volatility discounting model, with risk aversion, m, estimated from risk choices outside of the ITC task. ITC = intertemporal choice. (d) Observed choice proportions are plotted against proportions predicted by the best fitting model. SFU = subjective future uncertainty; BICi = integrated Bayesian information criterion. See the online article for the color version of this figure.
Discounting Correlated With the Subjective Cost of Uncertainty
We first tested for an effect on log K of a subjective cost of future uncertainty in stable and volatile conditions separately. We found that log K in both conditions showed the expected positive relationship with the cost of future uncertainty, (SFU × m), Figure 6b, panel iii; stable condition: β = 0.16, t(43) = 2.97, p = .005; volatile condition: β = 0.12, t(43) = 4.68, p < .001. In a subsequent analysis, we tested whether an effect of condition (volatile vs. stable) on discounting was greater amongst participants who: (a) showed a greater subjective increase in uncertainty between conditions and (b) were more risk averse. Here, we found evidence for the hypothesized three-way interaction, (m × dSFU × Condition): β = 0.10, t(83) = 3.04, p = .003, indicating that a change in discounting between conditions was sensitive to an added cost of future uncertainty. Main effects of condition (β = 0.41, t(83) = 4.19, p < .001) and risk aversion (m), β = 0.39, t(83) = 2.27, p = .026, were also significant. Additional terms included as covariates were not significant, [dSFU]: β = 0.03, t(83) = 0.32, p = .752; [dSFU × Condition]: β = −0.02, t(83) = −0.62, p = .539; [m × Condition]: β = −0.10, t(83) = −1.10, p = .275. In summary, as predicted by a volatility discounting model, both baseline discounting and a volatility-dependent shift in discounting were commensurate with participants’ subjective sensitivity to future uncertainty.
Volatility Discounting Correlated With Baseline Discounting
We went on to fit a volatility discounting model (Equation 12) directly to the delay discounting data, fitting both conditions with the same set of parameters. As shown in Figure 6c, this model outperformed an alternative based on concave utility. Consistent with Experiment 1, within this model baseline discounting (log K) was significantly correlated with volatility discounting (log m; Spearman ρ = 0.31, p = .037). That is, participants who showed greater discounting in the absence of extraneous volatility also increased their discounting more in response to volatility.
Finally, we fitted a volatility discounting model by carrying participants’ risk aversion parameters forward to fit their ITCs. Here, each participant’s estimate of m was carried over from their risk choices, which were fitted separately. This model, which leverages information about participant-specific risk aversion, provided a better fit to the data than a null model with m = 0 (higher model evidence in 41/45 participants; φ > 0.999). The fit of this model to participants’ ITCs is shown in Figure 6a. Here, an effect of condition predicted by the model derives from participants’ idiosyncratic degrees of risk aversion, measured separately.
Learning Rate Increased With Volatility
Rescorla–Wagner learning rates were substantially higher in the volatile condition, paired t(46) = 17.5, two-tailed p < .0001, mean α-volatile = 0.68, SD = 0.11, and close to zero in the stable condition (mean α-stable = 0.03, SD = 0.06). The mean learning rate in the volatile condition was close to the optimal learning rate of 0.75. We found neither a significant correlation between learning rate and log K (stable condition: r = 0.14, p = .346; volatile condition: r = 0.006, p = .966; supplemental Figure S2) nor between change in learning rate and change in log K (r = −0.14, p = .376).
Reduced Hippocampal–Prefrontal Coupling Under Volatile Reward Dynamics
We first sought to replicate previously published observations by investigating neural representation of subjective value for a delayed choice option, corresponding to the time of presentation of this delayed option. Participants who did not discount delayed reward (N = 8) were excluded from this analysis. Consistent with prior results (Owens et al., 2017; Wesley & Bickel, 2014), we found clusters in middle temporal gyrus (left −48 −46 7, t = 6.58) and dorsolateral prefrontal cortex (right 39 8 40 and 54 26 25, t = 6.57, extending into right anterior insula) that survived whole brain correction at family-wise error (FWE) p < .05.
We hypothesized that higher volatility should lead to a decrease in the reliability of simulated future scenarios. Accordingly, we predicted that medial-temporal lobe (MTL) and related regions (which support imagining future scenarios), would participate less in the evaluation of more volatile delayed rewards. In fMRI data, this predicts: (a) a decreased correlation between MTL activity and discounted value in the volatile condition compared to the stable condition and (b) an associated modulation in functional coupling between MTL and prefrontal regions tracking discounted value.
To evaluate the first of these two hypotheses, we tested for regions whose activity correlated more strongly with discounted value in the stable compared to the volatile condition. This contrast revealed activation in left amygdala (FWE corrected within a bilateral amygdala mask, −30 2 −23, t = 3.65, p = .029; supplemental Figure S3b); however, this did not survive small volume correction for a larger a priori region of interest encompassing bilateral HC and amygdala (p = .11 corrected). The reverse contrast yielded no significant suprathreshold clusters at p < .001.
To test the second hypothesis, we performed a psychophysiological interaction (PPI) analysis with a seed in the HC, with Condition ([stable–volatile]) as a modulating variable. Although we did not observe hippocampal activation in our primary contrast of interest, we selected a hippocampal seed region on an a priori basis, since HC is known to provide for a cognitive map (Addis et al., 2007; Hassabis et al., 2007; Peters & Büchel, 2010; see supplemental material). In order to test which regions showing increased HC-coupling in the stable condition also correlated with discounted value, we used the discounted value contrast (without an effect of condition) as a mask for this PPI. For left HC-coupling, we found a significant peak in left dorsolateral prefrontal cortex (left dlPFC −39 26 25, t = 5.22, p = .025, FWE corrected for the volume of the discounted value mask), which both tracked discounted value and showed decreased coupling with HC in the volatile condition (supplemental Figure S8). The equivalent analysis for right HC-coupling yielded no significant activations. There were no significant clusters at p < .001 uncorrected whose connectivity with HC was greater in the volatile condition.
Summary
Here, we illustrate effects of volatility on discounting at a timescale commensurate with that used in conventional discounting questionnaires, across delays of up to 4 months. Our findings support the conclusion that delay discounting incorporates uncertainty discounting. We find a decrease in functional coupling under volatile conditions between MTL (HC) and a region of left dorsolateral prefrontal cortex (dlPFC), which tracked discounted value. This finding suggests that volatility-dependent increases in discounting are associated with reduced engagement of MTL structures known to participate in prospective forecasts.
Experiment 3
Experiment 3 tested for replicability of effects in Experiments 1 and 2 in a larger, online sample. A replication test was motivated by findings that estimated correlation coefficients are unstable in smaller sample sizes (Schönbrodt & Perugini, 2013). The design followed that of Experiment 1.
Method
Ethics Statement
All participants gave full informed consent before taking part in the study. The study procedures received approval from the UCL Research Ethics Committee (20399/001) and were carried out in accordance with these guidelines.
Data and Code Availability
Behavioral data supporting the findings of this study are publicly available online in a third-party repository: https://doi.org/10.5061/dryad.47d7wm3k2 (G. Story, 2023). Computer codes that support the findings of this study are available from the corresponding author upon reasonable request.
Sample Size and Power Calculation
Budget constraints limited our sample size to approximately 100 participants. Statistical simulations indicate that this sample size allows the Pearson correlation coefficient r to be estimated within an interval of ±0.15 with 80% confidence, for a true correlation of r = 0.3 (approximately the size of correlation observed between log K and risk aversion in Experiment 2; Schönbrodt & Perugini, 2013). Participants were recruited from Prolific.co, an online subject database.
Baseline Delay Discounting
Prior to starting the task, participants (N = 101, mean age 28.9 years, SD 9.7 years; 52 female) made a series of binary choices between a monetary reward of magnitude £23, £23.50, £25, or £26.50, delayed by 3, 7, 12, or 18 weeks, respectively, and a smaller quantity of money available immediately. Choices were selected according to an adaptive procedure (see supplemental material). Participants were informed that we would select one choice from every twenty participants to be paid for real as a bonus in Prolific.
Learning Price Dynamics
Market prices evolved according to a Gaussian random walk, upper bounded at £40 and lower bounded at £0. For one of the three products (“no volatility”), the market price was held constant at £20.00; the market price of the other two products (“low volatility” and “high volatility”) evolved with volatility σ = 1.5 and σ = 3.5, respectively. All prices were subject to Gaussian emission noise, with standard deviation η = 3. Price profiles were selected so that the market price on the final trial was equal to the long-run mean of £20.00.
ITC Procedure
After observing the price dynamics of each product, participants were asked to predict each product’s future price and report a subjective confidence interval, as described for Experiment 2. Eight out of 101 participants did not adjust the subjective confidence interval from its starting value in at least one of the three conditions, suggesting inattention to the task; these participants were therefore excluded. Participants subsequently chose when to sell the product, either immediately for a guaranteed price, or for the market price after a stated delay (1, 4, 9, or 13 weeks). As for Experiment 1, one choice was selected to be realized and was paid out as a bonus at the end of the experiment. An adaptive procedure was used to estimate indifference points at each delay by adjusting the immediate price (see online supplemental material). Statistical analyses followed those previously described for Experiments 1 and 2. In contrast to Experiments 1 and 2, here future sales were for the market price, without emission noise; we therefore set η² to zero when fitting models.
Results
Nonveridical Estimates of Subjective Future Uncertainty
As shown in Figure 7a, future price predictions largely reflected a veridical pattern of no net growth or decay. Subjective confidence intervals about future prices increased as a function of delay (Figure 7b). However, unlike in Experiment 2, group average confidence intervals did not recapitulate the statistics of the true generative process. Instead, participants overestimated volatility in the no volatility condition (σ² = 0, mean ) and underestimated volatility in the low volatility (σ² = 1.5, mean ) and high volatility (σ² = 3.5, mean ) conditions. There was no significant difference between subjective future uncertainty (SFU) in high volatility and no volatility conditions, paired t(92) = 1.16, two-tailed p = .24 (Figure 7c). However, the intercept term, corresponding to an estimate of time-independent uncertainty, was significantly higher in the high volatility condition, paired t(92) = 4.73, two-tailed p < .0001. This pattern suggests that, in this online experiment, participants either did not fully differentiate price dynamics in the three conditions, or did not fully attend to delay when providing subjective confidence intervals.
Figure 7. Subjective Future Uncertainty and Delay Discounting in Experiment 3.
Note. (a) Mean “best guesses” about future prices. Participants accurately predicted that expected prices are constant across delay. (b) Mean width of subjective confidence intervals on future prices. Solid lines show the fit of a random walk model. (c) Subjective future uncertainty (SFU), that is, subjective volatility derived from growth in confidence intervals across delay. Filled gray bars show group means, solid green, yellow and red bars 95% CIs; black dots show individual parameter estimates. Participants overestimated future uncertainty in the no volatility and low volatility conditions and underestimated future uncertainty in the high volatility condition. (d) Model comparison; blue asterisk indicates the best fitting volatility discounting model. (e) Discount curves in each condition, fitted with a volatility discounting model. The model overestimated discounting in the high volatility condition, commensurate with participants’ underestimating future uncertainty in this condition. BICi = integrated Bayesian information criterion. See the online article for the color version of this figure.
Delay Discounting Increased With Volatility
We fitted a volatility discounting model to participants’ choices. A model in which reward magnitude was given by mean future price predictions across participants outperformed a version in which magnitude was estimated from participants’ individual future price predictions (Figure 7d; ΔBICi = 1,544, higher model evidence in 81/93 participants; φ > 0.999). This model also outperformed a null model in which m = 0 (ΔBICi = 317; higher model evidence in 48/93 participants; φ = 0.999), supporting the conclusion that volatility increases delay discounting. Finally, the volatility discounting model outperformed an alternative based on concave utility (ΔBICi = 2,000; higher model evidence in 93/93 participants; φ > 0.999). Discount curves for the three products are shown in Figure 7e, plotted using mean future price predictions as estimates of future reward magnitude. As predicted, discounting increased in proportion to volatility, linear mixed effects on log K fitted to each condition separately: βcondition = 0.04, t(277) = 4.13, p < .001.
Baseline Discount Rate Correlated With In-Task Volatility Discounting
Volatility discounting within the task, governed by parameter m, showed a significant positive correlation with baseline discounting, measured outside of the task (log m vs. log K: Pearson r = 0.22, p = .035). As shown in Figure 8, this finding was consistent across all three experimental tasks. Pooling data across the three experiments revealed a highly significant positive relationship between log m and log K, linear mixed-effects regression with random slope and intercept parameters, grouped by experiment: fixed slope βm = 0.24, t(153) = 2.84, p = .005.
Figure 8. Baseline Discounting Versus In-Task Volatility Discounting.
Note. (a) For Experiment 1, volatility discounting (log m), that is, sensitivity of discounting to volatility within-task, correlates with baseline discounting for nonrisky rewards measured outside of the task (N = 17; three participants excluded due to inestimable baseline discounting data). (b) In Experiment 2 (N = 45), where in-task discounting was measured over naturalistic timescales, volatility discounting (log m, estimated from discounting choices) correlates with baseline discounting (log K0, discounting in stable condition). (c) In Experiment 3 (N = 93), in-task volatility discounting (log m) correlates with discounting for nonrisky rewards measured outside of the task. Solid lines show fits of a linear mixed-effects regression with experiment as a random grouping variable, fixed slope βm = 0.24, t(153) = 2.84, p = .005. See the online article for the color version of this figure.
Learning Rate Increased With Volatility
Learning rates increased overall with increasing volatility, linear mixed effects on α: βcondition = 0.32, t(277) = 6.16, p < .001. However, the pattern was nonmonotonic, suggesting that participants did not fully distinguish between no volatility and low volatility conditions (no volatility: mean α = 0.16, SD = 0.43; low volatility: mean α = 0.05, SD = 0.60; high volatility: mean α = 0.77, SD = 0.88). We speculate that this was due to inattention to the task. It is also possible that increased emission noise in this experiment (relative to Experiment 2) obscured participants’ perceptions of volatility. As previously, in-task discounting showed no significant correlation with learning rate across participants in any of the three conditions (no volatility: r = −0.02, p = .815; low volatility: r = −0.08, p = .422; high volatility: r = −0.19, p = .051; supplemental Figure S5).
Summary
In summary, Experiment 3 reproduced findings that in-task delay discounting increased with increasing volatility, to an extent that correlated with baseline discounting. Notably, here participants’ subjective estimates of future uncertainty and learning rate during the task did not reflect the true reward statistics, perhaps suggesting lower attention to the task during this online experiment than was achieved by participants in the previous laboratory experiments.
General Discussion
A central idea in behavioral economics considers impatience as arising, at least partly, from risk implicit in a delay. Existing approaches have emphasized that delayed rewards are less probable than immediate ones, rather than that the value of delayed reward is less precisely known. By contrast, here, we advance a model in which random changes in reward value accumulate over time. If risky rewards are discounted according to their variance, this model yields impatience, to an extent that is proportional to volatility (changeability) of reward value. This model has affinities with concepts found in finance, such as the price of risk, the excess rate of return on an investment demanded to compensate for an increase in volatility, expressed per unit of volatility (Sharpe, 1964). Consistent with volatility discounting, we find that people discount delayed reward more steeply in volatile environments, an effect that also pertains over naturalistic delays of up to 4 months.
In the experiments presented here, objective outcomes (prices) are uncertain, whereas in conventional discounting tasks the objective outcomes are certain (e.g., $10 or $20 tomorrow), but subjective values are uncertain. It may be argued that manipulating volatility experimentally in this manner simply adds extraneous risk to a baseline delay discounting process, rather than revealing that subjective values incorporate estimates of volatility. However, three findings mitigate this concern. First, in the stable condition of Experiment 2, where there is no extraneous time-dependent risk, delay discounting nevertheless correlated significantly with a subjective cost of future uncertainty. Second, participants who showed steeper discounting in the stable condition also increased discounting more in response to volatility. Third, within-task volatility discounting in Experiments 1 and 3 correlated with delay discounting of nominally riskless rewards outside of the task. Taken together, these findings support a conclusion that impatience for reward is partly determined by an aversion to future uncertainty. Nevertheless, the experiments presented here manipulate volatility explicitly, so participants may have been primed to attend to this. Future work might examine effects on discounting of implicit manipulations of volatility.
We do not suggest that volatility is the sole factor underlying delay discounting. Rather, normative factors independent of risk, such as an opportunity cost associated with delay, ought to affect discounting (Niv et al., 2007; Rachlin, 2006), perhaps accounting for observations that probability and delay discounting are subject to distinct influences (Cox et al., 2020; Estle et al., 2006; Lee et al., 2020; Luhmann et al., 2008; Peters & Büchel, 2009; Weber & Huettel, 2008). Furthermore, while volatility pertains to uncertainty over the magnitude of future reward, in conventional discounting tasks both the probability of receipt (e.g., Sozou, 1998) and the timing of the receipt may also be uncertain (Dasgupta & Maskin, 2005).
Sampling Future Reward in Prospective Memory
Gabaix and Laibson (2017) propose that, owing to time-dependent uncertainty, mental simulations of distant future events are less precise than those of more immediate ones. The formal implications for discounting are similar to those considered here. A key hypothesis is that cognitive resources ought to be devoted to simulating the future only where doing so is worth the reward gained and is effective in reducing future uncertainty (Gershman & Bhui, 2020). Our imaging findings are consistent with this idea. In Experiment 2, we investigated volatility-dependent increases in discounting in terms of relative engagement of MTL structures known to participate in prospective forecasts, in particular, the HC (Addis et al., 2007; Hassabis et al., 2007; Johnson & Redish, 2007; Peters & Büchel, 2010; Schacter et al., 2008; Tsao et al., 2018). We found a decrease in functional coupling under volatile conditions between HC and a region of left dorsolateral prefrontal cortex (dlPFC), which tracked discounted value.
Anatomically, dlPFC is reciprocally connected to HC via the parahippocampal gyrus, subiculum, and presubiculum (Goldman-Rakic et al., 1984) and is a region implicated in the exercise of cognitive control (Hare et al., 2009; Hecht et al., 2013; MacDonald et al., 2000; Rudorf & Hare, 2014). An interpretation is therefore that dlPFC maintains an online representation of delayed reward, by retrieving contextual information from MTL, and this coupling is diminished under unpredictable reward dynamics. We tentatively suggest that this effect reflects a cost-benefit trade-off whereby imagining the future is more effortful and is therefore downweighted. This finding is in keeping with a previous report that in a reinforcement-learning task, dlPFC was more strongly activated under predictable than under unpredictable state transition rules (Tanaka et al., 2006). Notably, however, our primary contrast of interest did not reveal hippocampal activation, placing our connectivity analysis on a relatively weak evidential footing.
Effects of Nonlinear Utility and Internal State
In economic models, risk aversion is often accounted for by postulating a concave utility function over reward magnitude, such that rewards of increasing magnitude are associated with decreasing marginal benefits (Bell, 1995; Kahneman & Tversky, 1979; Pine et al., 2009). We found that a volatility discounting model outperformed a model based on a concave power law utility function. However, effects of nonlinear utility cannot be fully delineated from the model presented here, since the two effects both converge on a prediction that discounting depends on the variance of future reward. Indeed, it is likely that there exists some alternative form of utility function that would approximate hyperbolic discounting of variance.
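The overlap between the two accounts can be seen from a standard second-order Taylor expansion of expected utility around the mean reward. The sketch below assumes a delayed reward $X_t$ with mean $\mu$ and variance $\sigma^2 t$ (a random walk with per-step variance $\sigma^2$) and a power utility $u(x) = x^{\rho}$ with $0 < \rho < 1$. These symbols are introduced for illustration only and are not the notation of the fitted models.

```latex
\begin{align}
\mathbb{E}\left[u(X_t)\right]
  &\approx u(\mu) + \tfrac{1}{2}\,u''(\mu)\,\operatorname{Var}(X_t) \\
  &= \mu^{\rho} - \tfrac{1}{2}\,\rho\,(1-\rho)\,\mu^{\rho-2}\,\sigma^{2}\,t .
\end{align}
```

The first-order term vanishes because $\mathbb{E}[X_t - \mu] = 0$. Because $u''(\mu) < 0$ for any concave utility, expected utility falls as variance accumulates with delay, so under this approximation a concave utility function also predicts a variance-dependent penalty on delayed reward.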
In the experiments presented here, we manipulate volatility in external reward magnitude. However, the same model might also be applied to changes in the subjective utility of reward deriving from changes in internal state. For example, a person asked to specify what they would like to eat for dinner a week in advance might be unsure of whether their preferences will be the same in a week’s time. Effects of internal state changes on both risk attitude (Kacelnik, 1997; Smallwood, 1996) and delay discounting (Giordano et al., 2002; Kirk & Logue, 1997; Mitchell, 2004) are well documented and examining the interactions of such effects with those of future uncertainty suggests an important direction for future research.
Learning Rate and Discount Rate as Separable Markers of Impulsivity
As predicted by optimality, volatility-dependent increases in temporal discounting seen here were also associated with an increase in learning rate (Behrens et al., 2007; Iigaya, 2016; Nassar et al., 2010, 2012). In reinforcement-learning models, the value of an action is estimated by updating a recency-weighted average of rewards which followed that action in the past (Mathys et al., 2011; Wilson et al., 2010), where a high learning rate entails steeper discounting of past rewards and faster value updates (Rescorla & Wagner, 1972; Sutton & Barto, 1998). A high setting of either rate can generate impulsive behavior: A high learning rate leads to behavior driven by recent successes or failures, rather than the long-run context, while a high discount rate produces behavior driven by its immediate consequences, thereby neglecting delayed benefits. We show that volatility engenders an increase in both parameters, reflecting the fact that where reward contingencies change frequently, neither past rewards nor promised future rewards are fully informative of current action value (Daw et al., 2005, 2006; Dolan & Dayan, 2013).
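The sense in which a high learning rate discounts past rewards more steeply can be illustrated with a minimal delta-rule sketch. Unrolling the update shows that the reward received k trials ago carries weight α(1 − α)^k, so a larger α concentrates weight on the most recent outcomes. The function names and the example reward sequence are hypothetical and are not taken from the task code.

```python
def delta_rule(rewards, alpha, initial_value=0.0):
    """Rescorla-Wagner style update: V <- V + alpha * (reward - V)."""
    value = initial_value
    for reward in rewards:
        value += alpha * (reward - value)
    return value


def recency_weights(n_trials, alpha):
    """Weight placed on the reward received k trials ago (k = 0 is most recent)."""
    return [alpha * (1.0 - alpha) ** k for k in range(n_trials)]


if __name__ == "__main__":
    rewards = [1.0, 0.0, 0.0, 1.0, 1.0]  # illustrative sequence, oldest first
    for alpha in (0.2, 0.8):
        weights = recency_weights(len(rewards), alpha)
        # With initial_value = 0, the delta-rule estimate equals the weighted
        # sum of past rewards, most recent reward first.
        weighted_sum = sum(w * r for w, r in zip(weights, reversed(rewards)))
        print(f"alpha={alpha}: weights={[round(w, 3) for w in weights]}")
        print(f"  delta rule={delta_rule(rewards, alpha):.4f}  "
              f"weighted sum={weighted_sum:.4f}")
```

The weights decay geometrically into the past, which parallels discounting of future reward but, as noted in the text that follows, does not share the functional form of the discounting model used here.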
That we did not find a correlation between learning rate and discount rate across participants suggests that future discounting engages processes different to those involved in learning. In particular, we find that while learning rate is sensitive to volatility, discounting is sensitive to an interaction between volatility and risk aversion. Within this interaction, risk aversion appears to contribute the greater source of between-participant variability. An additional consideration is that the form of weighted averaging over past rewards predicted by Rescorla–Wagner learning does not correspond directly to the model used here for future discounting. Further work might examine whether integration of past and future rewards might be fitted with the same model.
Finally, we note that increasing time-independent uncertainty ought to decrease both learning rate (Mathys et al., 2011; Wilson et al., 2010) and delay discounting (by Equation 8 and supplemental Equation S12). This accords with findings that adding risk to both immediate and delayed rewards reduces a bias toward immediate reward (Anderhub et al., 2001; Andreoni & Sprenger, 2012b; Keren & Roelofsma, 1995; Stevenson, 1992; though see Ahlbrecht & Weber, 1997) and that increasing emission noise reduces learning rate (Diederen & Schultz, 2015). In this regard, an interesting direction for future research would be to examine a tendency to misperceive emission noise as volatility, which may underpin maladaptive impulsivity. For example, in learning tasks, people with high trait anxiety show smaller adjustments in learning rate in response to changing volatility (Browning et al., 2015), and increased reliance on “lose-shift” strategies (Huang et al., 2017). These findings suggest that anxious individuals may misperceive chance fluctuations as underlying environmental changes, particularly following losses. Further research might explore whether this effect correlates with the increased delay discounting reported amongst anxious individuals (Xia et al., 2017).
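The opposing effects of volatility and emission noise on learning rate described above can be illustrated with a textbook idealization: a Kalman filter tracking a Gaussian random walk, whose steady-state gain plays the role of a learning rate. This is a simplified normative benchmark chosen for illustration, not the learning model fitted in the experiments, and the function and parameter names are assumptions.

```python
import math


def steady_state_gain(volatility_var, noise_var):
    """Steady-state Kalman gain for a random walk observed with Gaussian noise.

    volatility_var: variance of the per-step change in the hidden mean (q).
    noise_var: variance of the emission (observation) noise (r).
    The steady-state prior variance P solves P^2 = q * P + q * r.
    """
    q, r = volatility_var, noise_var
    if q == 0.0:
        return 0.0  # nothing changes, so new observations are eventually ignored
    prior_var = (q + math.sqrt(q * q + 4.0 * q * r)) / 2.0
    return prior_var / (prior_var + r)


if __name__ == "__main__":
    print("gain rises with volatility (noise variance fixed at 1):")
    for q in (0.1, 1.0, 10.0):
        print(f"  q={q:5.1f}  gain={steady_state_gain(q, 1.0):.3f}")
    print("gain falls with emission noise (volatility variance fixed at 1):")
    for r in (0.1, 1.0, 10.0):
        print(f"  r={r:5.1f}  gain={steady_state_gain(1.0, r):.3f}")
```

In this idealization, an observer who attributes chance fluctuations to volatility rather than to emission noise would adopt a higher gain than the environment warrants.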
Conclusions
We conclude that individual differences in delay discounting in part reflect differential discounting of uncertainty, whereby more impatient people are more sensitive to future risk. The present study contributes to a growing body of work illustrating how discounting can be derived from beliefs about the statistics of future reward across time (e.g., Patak & Reynolds, 2007; Reynolds et al., 2007; Stevens et al., 2005; Takahashi et al., 2007). We propose that measuring such beliefs directly offers to enrich our understanding and prediction of impulsive behavior. This idea can also reconcile state- and trait-level influences on discounting in that, over long timescales, prevailing environmental conditions ought to be reflected in trait-based differences in discounting, which are subject to state-based effects when conditions change (Griskevicius et al., 2011; Haynes et al., 2021; Koffarnus et al., 2013; Odum, 2011; Peviani et al., 2019). However, more integrative studies are required to examine how the various documented influences on discounting interact to shape impulsive behavior “in the wild.”
Supplementary Material
References
- Addis D. R., Wong A. T., & Schacter D. L. (2007). Remembering the past and imagining the future: Common and distinct neural substrates during event construction and elaboration. Neuropsychologia, 45(7), 1363–1377. 10.1016/j.neuropsychologia.2006.10.016
- Ahlbrecht M., & Weber M. (1997). An empirical study on intertemporal decision making under risk. Management Science, 43(6), 813–826. 10.1287/mnsc.43.6.813
- Anderhub V., Güth W., Gneezy U., & Sonsino D. (2001). On the interaction of risk and time preferences: An experimental study. German Economic Review, 2(3), 239–253. 10.1111/1468-0475.00036
- Andersen S., Harrison G. W., Lau M. I., & Rutström E. E. (2008). Eliciting risk and time preferences. Econometrica, 76(3), 583–618. 10.1111/j.1468-0262.2008.00848.x
- Andreoni J., & Sprenger C. (2012a). Estimating time preferences from convex budgets. The American Economic Review, 102(7), 3333–3356. 10.1257/aer.102.7.3333
- Andreoni J., & Sprenger C. (2012b). Risk preferences are not time preferences. The American Economic Review, 102(7), 3357–3376. 10.1257/aer.102.7.3357
- Bach D. R., Hulme O., Penny W. D., & Dolan R. J. (2011). The known unknowns: Neural representation of second-order uncertainty, and ambiguity. The Journal of Neuroscience, 31(13), 4811–4820. 10.1523/JNEUROSCI.1452-10.2011
- Behrens T. E., Woolrich M. W., Walton M. E., & Rushworth M. F. (2007). Learning the value of information in an uncertain world. Nature Neuroscience, 10(9), 1214–1221. 10.1038/nn1954
- Bell D. E. (1995). Risk, return, and utility. Management Science, 41(1), 23–30. 10.1287/mnsc.41.1.23
- Berns G. S., Laibson D., & Loewenstein G. (2007). Intertemporal choice—Toward an integrative framework. Trends in Cognitive Sciences, 11(11), 482–488. 10.1016/j.tics.2007.08.011
- Bickel W. K., Jarmolowicz D. P., Mueller E. T., Koffarnus M. N., & Gatchalian K. M. (2012). Excessive discounting of delayed reinforcers as a trans-disease process contributing to addiction and other disease-related vulnerabilities: Emerging evidence. Pharmacology & Therapeutics, 134(3), 287–297. 10.1016/j.pharmthera.2012.02.004
- Bickel W. K., Koffarnus M. N., Moody L., & Wilson A. G. (2014). The behavioral- and neuro-economic process of temporal discounting: A candidate behavioral marker of addiction. Neuropharmacology, 76, 518–527. 10.1016/j.neuropharm.2013.06.013
- Bickel W. K., & Marsch L. A. (2001). Toward a behavioral economic understanding of drug dependence: Delay discounting processes. Addiction, 96(1), 73–86. 10.1046/j.1360-0443.2001.961736.x
- Black F., & Scholes M. (1973). The pricing of options and corporate liabilities. Journal of Political Economy, 81(3), 637–654. 10.1086/260062
- Browning M., Behrens T. E., Jocham G., O’Reilly J. X., & Bishop S. J. (2015). Anxious individuals have difficulty learning the causal statistics of aversive environments. Nature Neuroscience, 18(4), 590–596. 10.1038/nn.3961
- Cesarini D., Sandewall Ö., & Johannesson M. (2006). Confidence interval estimation tasks and the economics of overconfidence. Journal of Economic Behavior & Organization, 61(3), 453–470. 10.1016/j.jebo.2004.10.010
- Chao L. W., Szrek H., Pereira N. S., & Pauly M. V. (2009). Time preference and its relationship with age, health, and survival probability. Judgment and Decision Making, 4(1), 1–19. 10.1017/S1930297500000668
- Chesson H., & Viscusi W. (2003). Commonalities in time and ambiguity aversion for long-term risks. Theory and Decision, 54(1), 57–71. 10.1023/A:1025095318208
- Cox D. J., Dolan S. B., Johnson P., & Johnson M. W. (2020). Delay and probability discounting in cocaine use disorder: Comprehensive examination of money, cocaine, and health outcomes using gains and losses at multiple magnitudes. Experimental and Clinical Psychopharmacology, 28(6), 724–738. 10.1037/pha0000341
- Critchfield T. S., & Kollins S. H. (2001). Temporal discounting: Basic research and the analysis of socially important behavior. Journal of Applied Behavior Analysis, 34(1), 101–122. 10.1901/jaba.2001.34-101
- D’Acremont M., & Bossaerts P. (2008). Neurobiological studies of risk assessment: A comparison of expected utility and mean-variance approaches. Cognitive, Affective & Behavioral Neuroscience, 8(4), 363–374. 10.3758/CABN.8.4.363
- Daniel T. O., Said M., Stanton C. M., & Epstein L. H. (2015). Episodic future thinking reduces delay discounting and energy intake in children. Eating Behaviors, 18, 20–24. 10.1016/j.eatbeh.2015.03.006
- Dasgupta P., & Maskin E. (2005). Uncertainty and hyperbolic discounting. The American Economic Review, 95(4), 1290–1299. 10.1257/0002828054825637
- Daugherty J. R., & Brase G. L. (2010). Taking time to be healthy: Predicting health behaviors with delay discounting and time perspective. Personality and Individual Differences, 48(2), 202–207. 10.1016/j.paid.2009.10.007
- Daw N. D., Niv Y., & Dayan P. (2005). Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nature Neuroscience, 8(12), 1704–1711. 10.1038/nn1560
- Daw N. D., O’Doherty J. P., Dayan P., Seymour B., & Dolan R. J. (2006). Cortical substrates for exploratory decisions in humans. Nature, 441(7095), 876–879. 10.1038/nature04766
- Delavande A., & Rohwedder S. (2008). Eliciting subjective probabilities in Internet surveys. Public Opinion Quarterly, 72(5), 866–891. 10.1093/poq/nfn062
- Diederen K. M., & Schultz W. (2015). Scaling prediction errors to reward variability benefits error-driven learning in humans. Journal of Neurophysiology, 114(3), 1628–1640. 10.1152/jn.00483.2015
- Dolan R. J., & Dayan P. (2013). Goals and habits in the brain. Neuron, 80(2), 312–325. 10.1016/j.neuron.2013.09.007
- Eckel C., Johnson C., & Montmarquette C. (2005). Savings decisions of the working poor: Short- and long-term horizons. In Carpenter J., Harrison G. W., & List J. A. (Eds.), Field experiments in economics, research in experimental economics (Vol. 10, pp. 219–260). JAI Press. 10.1016/S0193-2306(04)10006-9
- Ellsberg D. (1961). Risk, ambiguity, and the savage axioms. The Quarterly Journal of Economics, 75(4), 643–669. 10.2307/1884324
- Epper T., Fehr-Duda H., & Bruhin A. (2011). Viewing the future through a warped lens: Why uncertainty generates hyperbolic discounting. Journal of Risk and Uncertainty, 43(3), 169–203. 10.1007/s11166-011-9129-x
- Epstein L. H., Jankowiak N., Fletcher K. D., Carr K. A., Nederkoorn C., Raynor H. A., & Finkelstein E. (2014). Women who are motivated to eat and discount the future are more obese. Obesity, 22(6), 1394–1399. 10.1002/oby.20661
- Estle S. J., Green L., Myerson J., & Holt D. D. (2006). Differential effects of amount on temporal and probability discounting of gains and losses. Memory & Cognition, 34(4), 914–928. 10.3758/BF03193437
- Frederick S., Loewenstein G., & O’Donoghue T. (2002). Time discounting and time preference: A critical review. Journal of Economic Literature, 40(2), 351–401. 10.1257/jel.40.2.351
- Friston K., Schwartenbeck P., Fitzgerald T., Moutoussis M., Behrens T., & Dolan R. J. (2013). The anatomy of choice: Active inference and agency. Frontiers in Human Neuroscience, 7, Article 598. 10.3389/fnhum.2013.00598
- Gabaix X., & Laibson D. (2017). Myopia and discounting (No. w23254). National Bureau of Economic Research. http://www.nber.org/papers/w23254.pdf
- Gershman S. J., & Bhui R. (2020). Rationally inattentive intertemporal choice. Nature Communications, 11(1), Article 3365. 10.1038/s41467-020-16852-y
- Gershman S. J., & Daw N. D. (2017). Reinforcement learning and episodic memory in humans and animals: An integrative framework. Annual Review of Psychology, 68(1), 101–128. 10.1146/annurev-psych-122414-033625
- Giordano L. A., Bickel W. K., Loewenstein G., Jacobs E. A., Marsch L., & Badger G. J. (2002). Mild opioid deprivation increases the degree that opioid-dependent outpatients discount delayed heroin and money. Psychopharmacology, 163(2), 174–182. 10.1007/s00213-002-1159-2
- Goldman-Rakic P. S., Selemon L. D., & Schwartz M. L. (1984). Dual pathways connecting the dorsolateral prefrontal cortex with the hippocampal formation and parahippocampal cortex in the rhesus monkey. Neuroscience, 12(3), 719–743. 10.1016/0306-4522(84)90166-0
- Green L., Fristoe N., & Myerson J. (1994). Temporal discounting and preference reversals in choice between delayed outcomes. Psychonomic Bulletin & Review, 1(3), 383–389. 10.3758/BF03213979
- Green L., Myerson J., & Ostaszewski P. (1999). Discounting of delayed rewards across the life span: Age differences in individual discounting functions. Behavioural Processes, 46(1), 89–96. 10.1016/S0376-6357(99)00021-2
- Griskevicius V., Tybur J. M., Delton A. W., & Robertson T. E. (2011). The influence of mortality and socioeconomic status on risk and delayed rewards: A life history theory approach. Journal of Personality and Social Psychology, 100(6), 1015–1026. 10.1037/a0022403
- Hare T. A., Camerer C. F., & Rangel A. (2009). Self-control in decision-making involves modulation of the vmPFC valuation system. Science, 324(5927), 646–648. 10.1126/science.1168450
- Harrison G. W., Lau M. I., Rutström E. E., & Sullivan M. B. (2005). Eliciting risk and time preferences using field experiments: Some methodological issues. Research in Experimental Economics, 10, 125–218. 10.1016/S0193-2306(04)10005-7
- Hassabis D., Kumaran D., Vann S. D., & Maguire E. A. (2007). Patients with hippocampal amnesia cannot imagine new experiences. Proceedings of the National Academy of Sciences of the United States of America, 104(5), 1726–1731. 10.1073/pnas.0610561104
- Hayden B. Y., & Platt M. L. (2007). Temporal discounting predicts risk sensitivity in rhesus macaques. Current Biology, 17(1), 49–53. 10.1016/j.cub.2006.10.055
- Haynes J. M., Galizio A., Frye C. C. J., Towse C. C., Morrissey K. N., Serang S., & Odum A. L. (2021). Discounting of food and water in rats shows trait- and state-like characteristics. Journal of the Experimental Analysis of Behavior, 115(2), 495–509. 10.1002/jeab.677
- Hecht D., Walsh V., & Lavidor M. (2013). Bi-frontal direct current stimulation affects delay discounting choices. Cognitive Neuroscience, 4(1), 7–11. 10.1080/17588928.2011.638139
- Huang H., Thompson W., & Paulus M. P. (2017). Computational dysfunctions in anxiety: Failure to differentiate signal from noise. Biological Psychiatry, 82(6), 440–446. 10.1016/j.biopsych.2017.07.007
- Huys Q. J., Cools R., Gölzer M., Friedel E., Heinz A., Dolan R. J., & Dayan P. (2011). Disentangling the roles of approach, activation and valence in instrumental and pavlovian responding. PLoS Computational Biology, 7(4), Article e1002028. 10.1371/journal.pcbi.1002028
- Huys Q. J., Eshel N., O’Nions E., Sheridan L., Dayan P., & Roiser J. P. (2012). Bonsai trees in your head: How the pavlovian system sculpts goal-directed choices by pruning decision trees. PLoS Computational Biology, 8(3), Article e1002410. 10.1371/journal.pcbi.1002410
- Iigaya K. (2016). Adaptive learning and decision-making under uncertainty by metaplastic synapses guided by a surprise detection system. eLife, 5, Article e18073. 10.7554/eLife.18073
- Johnson A., & Redish A. D. (2007). Neural ensembles in CA3 transiently encode paths forward of the animal at a decision point. The Journal of Neuroscience, 27(45), 12176–12189. 10.1523/JNEUROSCI.3761-07.2007
- Jones B. A., & Rachlin H. (2009). Delay, probability, and social discounting in a public goods game. Journal of the Experimental Analysis of Behavior, 91(1), 61–73. 10.1901/jeab.2009.91-61
- Kable J. W., & Glimcher P. W. (2007). The neural correlates of subjective value during intertemporal choice. Nature Neuroscience, 10(12), 1625–1633. 10.1038/nn2007
- Kacelnik A. (1997). Normative and descriptive models of decision making: Time discounting and risk sensitivity. In Bock G. R. & Cardew G. (Eds.), Ciba foundation symposium (Vol. 208, pp. 51–67). Wiley.
- Kahneman D., & Tversky A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47(2), 263–292. 10.2307/1914185
- Kalenscher T., & Pennartz C. M. (2008). Is a bird in the hand worth two in the future? The neuroeconomics of intertemporal decision-making. Progress in Neurobiology, 84(3), 284–315. 10.1016/j.pneurobio.2007.11.004
- Keren G., & Roelofsma P. (1995). Immediacy and certainty in intertemporal choice. Organizational Behavior and Human Decision Processes, 63(3), 287–297. 10.1006/obhd.1995.1080
- Kirby K. N., Petry N. M., & Bickel W. K. (1999). Heroin addicts have higher discount rates for delayed rewards than non-drug-using controls. Journal of Experimental Psychology: General, 128(1), 78–87. 10.1037/0096-3445.128.1.78
- Kirk J. M., & Logue A. W. (1997). Effects of deprivation level on humans’ self-control for food reinforcers. Appetite, 28(3), 215–226. 10.1006/appe.1996.0071
- Knight F. H. (1921). Risk, uncertainty and profit. Houghton Mifflin.
- Koffarnus M. N., Jarmolowicz D. P., Mueller E. T., & Bickel W. K. (2013). Changing delay discounting in the light of the competing neurobehavioral decision systems theory: A review. Journal of the Experimental Analysis of Behavior, 99(1), 32–57. 10.1002/jeab.2
- Kroll Y., Levy H., & Markowitz H. M. (1984). Mean-variance versus direct utility maximization. The Journal of Finance, 39(1), 47–61. 10.1111/j.1540-6261.1984.tb03859.x
- Kurth-Nelson Z., Bickel W., & Redish A. D. (2012). A theoretical account of cognitive effects in delay discounting. The European Journal of Neuroscience, 35(7), 1052–1064. 10.1111/j.1460-9568.2012.08058.x
- Lahav E., Benzion U., & Shavit T. (2011). The effect of military service on soldiers’ time preference-Evidence from Israel. Judgment and Decision Making, 6(2), 130–138. 10.1017/S1930297500004071
- Lee S., Gold J. I., & Kable J. W. (2020). The human as delta-rule learner. Decision, 7(1), 55–66. 10.1037/dec0000112
- Leigh J. P. (1986). Accounting for tastes: Correlates of risk and time preferences. Journal of Post Keynesian Economics, 9(1), 17–31. 10.1080/01603477.1986.11489597
- Loewenstein G., Read D., & Baumeister R. (Eds.). (2003). Time and decision economic and psychological perspectives of intertemporal choice. SAGE Publications.
- Luhmann C. C., Chun M. M., Yi D. J., Lee D., & Wang X. J. (2008). Neural dissociation of delay and uncertainty in intertemporal choice. The Journal of Neuroscience, 28(53), 14459–14466. 10.1523/JNEUROSCI.5058-08.2008
- Luhmann C. C., Ishida K., & Hajcak G. (2011). Intolerance of uncertainty and decisions about delayed, probabilistic rewards. Behavior Therapy, 42(3), 378–386. 10.1016/j.beth.2010.09.002
- MacDonald A. W. III, Cohen J. D., Stenger V. A., & Carter C. S. (2000). Dissociating the role of the dorsolateral prefrontal and anterior cingulate cortex in cognitive control. Science, 288(5472), 1835–1838. 10.1126/science.288.5472.1835
- MacKay D. J. (2003). Information theory, inference and learning algorithms. Cambridge University Press.
- MacKillop J., Amlung M. T., Few L. R., Ray L. A., Sweet L. H., & Munafò M. R. (2011). Delayed reward discounting and addictive behavior: A meta-analysis. Psychopharmacology, 216(3), 305–321. 10.1007/s00213-011-2229-0
- Madden G. J., Begotka A. M., Raiff B. R., & Kastern L. L. (2003). Delay discounting of real and hypothetical rewards. Experimental and Clinical Psychopharmacology, 11(2), 139–145. 10.1037/1064-1297.11.2.139
- Mathys C., Daunizeau J., Friston K. J., & Stephan K. E. (2011). A bayesian foundation for individual learning under uncertainty. Frontiers in Human Neuroscience, 5, Article 39. 10.3389/fnhum.2011.00039
- McClure S. M., Ericson K. M., Laibson D. I., Loewenstein G., & Cohen J. D. (2007). Time discounting for primary rewards. The Journal of Neuroscience, 27(21), 5796–5804. 10.1523/JNEUROSCI.4246-06.2007
- McClure S. M., Laibson D. I., Loewenstein G., & Cohen J. D. (2004). Separate neural systems value immediate and delayed monetary rewards. Science, 306(5695), 503–507. 10.1126/science.1100907
- McGuire J. T., Nassar M. R., Gold J. I., & Kable J. W. (2014). Functionally dissociable influences on learning rate in a dynamic environment. Neuron, 84(4), 870–881. 10.1016/j.neuron.2014.10.013
- Meier S., & Sprenger C. D. (2012). Time discounting predicts creditworthiness. Psychological Science, 23(1), 56–58. 10.1177/0956797611425931
- Mitchell S. H. (2004). Effects of short-term nicotine deprivation on decision-making: Delay, uncertainty and effort discounting. Nicotine & Tobacco Research, 6(5), 819–828. 10.1080/14622200412331296002
- Nassar M. R., Rumsey K. M., Wilson R. C., Parikh K., Heasly B., & Gold J. I. (2012). Rational regulation of learning dynamics by pupil-linked arousal systems. Nature Neuroscience, 15(7), 1040–1046. 10.1038/nn.3130
- Nassar M. R., Wilson R. C., Heasly B., & Gold J. I. (2010). An approximately Bayesian delta-rule model explains the dynamics of belief updating in a changing environment. The Journal of Neuroscience, 30(37), 12366–12378. 10.1523/JNEUROSCI.0822-10.2010
- Niv Y., Daw N. D., Joel D., & Dayan P. (2007). Tonic dopamine: Opportunity costs and the control of response vigor. Psychopharmacology, 191(3), 507–520. 10.1007/s00213-006-0502-4
- Noussair C., & Wu P. (2006). Risk tolerance in the present and the future: An experimental study. MDE. Managerial and Decision Economics, 27(6), 401–412. 10.1002/mde.1278
- O’Connor M. (1989). Models of human behaviour and confidence in judgement: A review. International Journal of Forecasting, 5(2), 159–169. 10.1016/0169-2070(89)90083-6
- Odum A. L. (2011). Delay discounting: Trait variable? Behavioural Processes, 87(1), 1–9. 10.1016/j.beproc.2011.02.007
- Owens M. M., Gray J. C., Amlung M. T., Oshri A., Sweet L. H., & MacKillop J. (2017). Neuroanatomical foundations of delayed reward discounting decision making. NeuroImage, 161, 261–270. 10.1016/j.neuroimage.2017.08.045
- Patak M., & Reynolds B. (2007). Question-based assessments of delay discounting: Do respondents spontaneously incorporate uncertainty into their valuations for delayed rewards? Addictive Behaviors, 32(2), 351–357. 10.1016/j.addbeh.2006.03.034
- Peters J., & Büchel C. (2009). Overlapping and distinct neural systems code for subjective value during intertemporal and risky decision making. The Journal of Neuroscience, 29(50), 15727–15734. 10.1523/JNEUROSCI.3489-09.2009
- Peters J., & Büchel C. (2010). Episodic future thinking reduces reward delay discounting through an enhancement of prefrontal-mediotemporal interactions. Neuron, 66(1), 138–148. 10.1016/j.neuron.2010.03.026
- Peviani K. M., Kahn R. E., Maciejewski D., Bickel W. K., Deater-Deckard K., King-Casas B., & Kim-Spoon J. (2019). Intergenerational transmission of delay discounting: The mediating role of household chaos. Journal of Adolescence, 72(1), 83–90. 10.1016/j.adolescence.2019.03.002
- Pine A., Seymour B., Roiser J. P., Bossaerts P., Friston K. J., Curran H. V., & Dolan R. J. (2009). Encoding of marginal utility across time in the human brain. The Journal of Neuroscience, 29(30), 9575–9581. 10.1523/JNEUROSCI.1126-09.2009
- Pulcu E., Trotter P. D., Thomas E. J., McFarquhar M., Juhász G., Sahakian B. J., Deakin J. F., Zahn R., Anderson I. M., & Elliott R. (2014). Temporal discounting in major depressive disorder. Psychological Medicine, 44(9), 1825–1834. 10.1017/S0033291713002584
- Rachlin H. (2006). Notes on discounting. Journal of the Experimental Analysis of Behavior, 85(3), 425–435. 10.1901/jeab.2006.85-05
- Rachlin H., Raineri A., & Cross D. (1991). Subjective probability and delay. Journal of the Experimental Analysis of Behavior, 55(2), 233–244. 10.1901/jeab.1991.55-233
- Rescorla R. A., & Wagner A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In Black A. H. & Prokasy W. F. (Eds.), Classical conditioning II: Current research and theory (pp. 64–99). Appleton-Century-Crofts.
- Reynolds B., Patak M., & Shroff P. (2007). Adolescent smokers rate delayed rewards as less certain than adolescent nonsmokers. Drug and Alcohol Dependence, 90(2–3), 301–303. https://doi.org/10.1016/j.drugalcdep.2007.04.008
- Rudorf S., & Hare T. A. (2014). Interactions between dorsolateral and ventromedial prefrontal cortex underlie context-dependent stimulus valuation in goal-directed choice. The Journal of Neuroscience, 34(48), 15988–15996. https://doi.org/10.1523/JNEUROSCI.3192-14.2014
- Rushworth M. F., & Behrens T. E. (2008). Choice, uncertainty and value in prefrontal and cingulate cortex. Nature Neuroscience, 11(4), 389–397. https://doi.org/10.1038/nn2066
- Schacter D. L., Addis D. R., & Buckner R. L. (2008). Episodic simulation of future events: Concepts, data, and applications. Annals of the New York Academy of Sciences, 1124(1), 39–60. https://doi.org/10.1196/annals.1440.001
- Schönbrodt F. D., & Perugini M. (2013). At what sample size do correlations stabilize? Journal of Research in Personality, 47(5), 609–612. https://doi.org/10.1016/j.jrp.2013.05.009
- Sharpe W. F. (1964). Capital asset prices: A theory of market equilibrium under conditions of risk. The Journal of Finance, 19(3), 425–442. https://doi.org/10.1111/j.1540-6261.1964.tb02865.x
- Sheffer C. E., Miller A., Bickel W. K., Devonish J. A., O’Connor R. J., Wang C., Rivard C., & Gage-Bouchard E. A. (2018). The treasure of now and an uncertain future: Delay discounting and health behaviors among cancer survivors. Cancer, 124(24), 4711–4719. https://doi.org/10.1002/cncr.31759
- Smallwood P. D. (1996). An introduction to risk sensitivity: The use of Jensen’s inequality to clarify evolutionary arguments of adaptation and constraint. American Zoologist, 36(4), 392–401. https://doi.org/10.1093/icb/36.4.392
- Snider S. E., DeHart W. B., Epstein L. H., & Bickel W. K. (2019). Does delay discounting predict maladaptive health and financial behaviors in smokers? Health Psychology, 38(1), 21–28. https://doi.org/10.1037/hea0000695
- Sozou P. D. (1998). On hyperbolic discounting and uncertain hazard rates. Proceedings: Biological Sciences, 265(1409), 2015–2020. https://doi.org/10.1098/rspb.1998.0534
- Stephan K. E., Penny W. D., Daunizeau J., Moran R. J., & Friston K. J. (2009). Bayesian model selection for group studies. NeuroImage, 46(4), 1004–1017. https://doi.org/10.1016/j.neuroimage.2009.03.025
- Stevens J. R., Hallinan E. V., & Hauser M. D. (2005). The ecology and evolution of patience in two New World monkeys. Biology Letters, 1(2), 223–226. https://doi.org/10.1098/rsbl.2004.0285
- Stevenson M. K. (1992). The impact of temporal context and risk on the judged value of future outcomes. Organizational Behavior and Human Decision Processes, 52(3), 455–491. https://doi.org/10.1016/0749-5978(92)90029-7
- Story G. (2023). Discounting future reward in an uncertain world: Behavioural data [Dataset]. Dryad. https://doi.org/10.5061/dryad.47d7wm3k2
- Sutton R. S., & Barto A. G. (1998). Introduction to reinforcement learning. MIT Press. https://doi.org/10.1109/TNN.1998.712192
- Symmonds M., Wright N. D., Bach D. R., & Dolan R. J. (2011). Deconstructing risk: Separable encoding of variance and skewness in the brain. NeuroImage, 58(4), 1139–1149. https://doi.org/10.1016/j.neuroimage.2011.06.087
- Takahashi T., Ikeda K., & Hasegawa T. (2007). A hyperbolic decay of subjective probability of obtaining delayed rewards. Behavioral and Brain Functions, 3(1), Article 52. https://doi.org/10.1186/1744-9081-3-52
- Tanaka S. C., Samejima K., Okada G., Ueda K., Okamoto Y., Yamawaki S., & Doya K. (2006). Brain mechanism of reward prediction under predictable and unpredictable environmental dynamics. Neural Networks, 19(8), 1233–1241. https://doi.org/10.1016/j.neunet.2006.05.039
- Tsao A., Sugar J., Lu L., Wang C., Knierim J. J., Moser M.-B., & Moser E. I. (2018). Integrating time from experience in the lateral entorhinal cortex. Nature, 561(7721), 57–62. https://doi.org/10.1038/s41586-018-0459-6
- Von Neumann J., & Morgenstern O. (1947). Theory of games and economic behavior. Princeton University Press.
- Weber B. J., & Chapman G. B. (2005). The combined effects of risk and time on choice: Does uncertainty eliminate the immediacy effect? Does delay eliminate the certainty effect? Organizational Behavior and Human Decision Processes, 96(2), 104–118. https://doi.org/10.1016/j.obhdp.2005.01.001
- Weber B. J., & Huettel S. A. (2008). The neural substrates of probabilistic and intertemporal decision making. Brain Research, 1234, 104–115. https://doi.org/10.1016/j.brainres.2008.07.105
- Weber E. U., Shafir S., & Blais A. R. (2004). Predicting risk sensitivity in humans and lower animals: Risk as variance or coefficient of variation. Psychological Review, 111(2), 430–445. https://doi.org/10.1037/0033-295X.111.2.430
- Wesley M. J., & Bickel W. K. (2014). Remember the future II: Meta-analyses and functional overlap of working memory and delay discounting. Biological Psychiatry, 75(6), 435–448. https://doi.org/10.1016/j.biopsych.2013.08.008
- Wilson R. C., Nassar M. R., & Gold J. I. (2010). Bayesian online learning of the hazard rate in change-point problems. Neural Computation, 22(9), 2452–2476. https://doi.org/10.1162/NECO_a_00007
- Xia L., Gu R., Zhang D., & Luo Y. (2017). Anxious individuals are impulsive decision-makers in the delay discounting task: An ERP study. Frontiers in Behavioral Neuroscience, 11, Article 5. https://doi.org/10.3389/fnbeh.2017.00005
- Yi R., & Landes R. D. (2012). Temporal and probability discounting by cigarette smokers following acute smoking abstinence. Nicotine & Tobacco Research, 14(5), 547–558. https://doi.org/10.1093/ntr/ntr252
Data Availability Statement
Behavioral data supporting the findings of this study are publicly available online in a third-party repository: https://doi.org/10.5061/dryad.47d7wm3k2 (G. Story, 2023). Computer code and imaging data that support the findings of this study are available from the corresponding author upon reasonable request.