PLOS ONE. 2014 Feb 28;9(2):e90138. doi: 10.1371/journal.pone.0090138

Intertemporal Choice as Discounted Value Accumulation

Christian A Rodriguez 1,*, Brandon M Turner 1, Samuel M McClure 1
Editor: Jean Daunizeau 2
PMCID: PMC3938649  PMID: 24587243

Abstract

Two separate cognitive processes are involved in choosing between rewards available at different points in time. The first is temporal discounting, which consists of combining information about the size and delay of prospective rewards to represent subjective values. The second involves a comparison of available rewards to enable an eventual choice on the basis of these subjective values. While several mathematical models of temporal discounting have been developed, the reward selection process has been largely unexplored. To address this limitation, we evaluated the applicability of the Linear Ballistic Accumulator (LBA) model as a theory of the selection process in intertemporal choice. The LBA model formalizes the selection process as a sequential sampling algorithm in which information about different choice options is integrated until a decision criterion is reached. We compared several versions of the LBA model to demonstrate that choice outcomes and response times in intertemporal choice are well captured by the LBA process. The relationship between choice outcomes and response times that derives from the LBA model cannot be explained by temporal discounting alone. Moreover, the drift rates that drive evidence accumulation in the best-fitting LBA model are related to independently estimated subjective values derived from various temporal discounting models. These findings provide a quantitative framework for predicting dynamics of choice-related activity during the reward selection process in intertemporal choice and link intertemporal choice to other classes of decisions in which the LBA model has been applied.

Introduction

In order to choose between rewards available at different points in time it is often necessary to evaluate the tradeoff between the size of potential rewards and the corresponding delays until their receipt. For example, deciding whether to save or spend a certain amount of money requires determining whether ensuring greater future wealth is worth delaying the pleasure of spending and consuming now. When engaged in this form of decision making, a class of decisions known as intertemporal choice, humans and other species discount the value of rewards in proportion to the delay at which they are available. Moreover, the behavior observed in intertemporal choice experiments reveals preferences consistent with a steep reduction in the value of rewards delayed from the present moment but more modest discounting of rewards delayed from future time points [1]. This property is particularly evident as a greater reluctance to forego immediate for delayed rewards compared with when both outcomes are delayed, a tendency that manifests itself in impulsivity and a predilection for procrastination. Several mathematical models have been shown to account for this pattern of delay discounting [2]. However, subjective valuation is only one of the cognitive processes involved in intertemporal choice behavior [3], [4].

In addition to representing the value of delayed rewards, intertemporal choices require comparing alternatives and selecting among them. One proposal for how delayed rewards might be compared and selected is through a process of sequential sampling of discounted values [3]. Similar processes are commonly assumed to underlie perceptual judgments based on sensory evidence [5]. This hypothesis suggests that there exists a direct connection between choices made on the basis of discounted values and other choices which have been argued to derive from sequential sampling processes. However, the hypothesis that a sequential sampling process underlies intertemporal decision-making has not been empirically tested. Therefore, our primary goal is to determine whether intertemporal choice behavior can be explained by a sequential sampling process based on discounted value.

There are several computational models that employ sequential sampling mechanisms to explain choice behavior (cf. [6]–[8]). A major accomplishment of all of these models is their ability to provide a process-level account of how experimental manipulations such as time pressure and stimulus ambiguity simultaneously affect response times (RT) and error rates. While many of these models might be able to explain intertemporal choice behavior, we used the Linear Ballistic Accumulator (LBA) model [8] in our analyses. The LBA model incorporates the fundamental features of all sequential sampling models, including trial-to-trial variability in the rate of evidence accumulation, a decision criterion, and constants to account for perception and motor execution times. The major advantage of the LBA model is its analytical tractability, which facilitates testing several versions of the model to determine which combination of parameters best accounts for intertemporal choice behavior. We show that the LBA model provides an excellent description of the relationship between choice outcomes and RT and that best-fitting model parameters can be directly related to subjective values.

Materials and Methods

Subjects

Fifty healthy adults participated in this study (28 females, ages 19–46 years, mean 24.36 years). All subjects gave written informed consent. Stanford University's Institutional Review Board approved the study. One subject was excluded because the behavior did not allow us to estimate reliable temporal discounting parameters. Another three subjects were excluded because of data collection problems. Data from a total of forty-six subjects were analyzed (28 females, ages 19–46 years, mean 24.26 years).

Temporal discounting model and task design

The experiments were conducted over two sessions. The purpose of the first session was to estimate each individual's discount rate using a hyperbolic discounting model. For half of our subjects (Inline graphic) the second session consisted of an electroencephalography (EEG) experiment. For the other half the second session consisted of a functional magnetic resonance imaging (fMRI) experiment. The analyses reported below were obtained from the behavior observed during these EEG and fMRI sessions.

We assumed that the subjective value of delayed rewards was discounted according to

$$V = \frac{r}{1 + kt} \qquad (1)$$

where r is the magnitude of a reward offered at delay t. The individually-determined parameter k is the discount factor [9]. While subjects completed the first session, we used a stair-stepping procedure to approximate k. All choices required participants to select between a delayed reward (of amount r available at delay t) and a fixed immediate reward of $10. For any choice, indifference between the immediate and delayed options implies a discount rate of k = (r − 10)/(10t). We refer to this implied equivalence point as k_eq; our procedure amounted to varying k_eq systematically until indifference was reached. Specifically, we began with Inline graphic. If the delayed offer was chosen, k_eq was decreased by a step size of Inline graphic for the next trial. Otherwise, k_eq increased by the same amount. At every second choice reversal, occurring within five consecutive trials, the step size was reduced by 5%. A total of 60 trials were completed. We placed no limits on the time subjects could take to respond, and presented both offers on the screen, as “$10 now” on the left side, and “$r in t days” on the right.
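The quantities in this paragraph can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, the starting value and step size in the example are made up (the values used in the study are not reproduced here), and the step-size reduction at choice reversals is omitted.

```python
IMMEDIATE = 10.0  # fixed immediate reward in dollars (from the task description)


def hyperbolic_value(amount, delay_days, k):
    """Subjective value of `amount` dollars delayed by `delay_days` (hyperbolic form)."""
    return amount / (1.0 + k * delay_days)


def implied_k(amount, delay_days, immediate=IMMEDIATE):
    """Discount rate at which the delayed offer is worth exactly `immediate`."""
    return (amount - immediate) / (immediate * delay_days)


def amount_for_k(k_eq, delay_days, immediate=IMMEDIATE):
    """Delayed amount that leaves a subject with discount rate `k_eq` indifferent."""
    return immediate * (1.0 + k_eq * delay_days)


def staircase_update(k_eq, step, chose_delayed):
    """Lower k_eq after a delayed choice, raise it after an immediate choice."""
    return k_eq - step if chose_delayed else k_eq + step


# Hypothetical example values (not taken from the paper).
k_eq, step, delay = 0.02, 0.01, 30
offer = amount_for_k(k_eq, delay)                 # about 16 dollars in 30 days
k_next = staircase_update(k_eq, step, chose_delayed=True)
```

Choosing the delayed offer signals that the subject discounts less steeply than the offer's implied rate, which is why the next offer is built from a smaller k_eq.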

Critically, our use of the hyperbolic discounting model to summarize behavior in this first experimental session had no bearing on the modeling results that follow. We used the hyperbolic model because it provided a good fit to behavior with a single parameter (Inline graphic) summarizing preferences. Fits of this model were used solely to generate choices for the second experimental session. Alternative delay discounting functions that may or may not provide better fits to behavior would have a subtle impact on the choice set (dollar amounts of choice options) for the second study, but no impact on the model fitting that is the primary aim of the current study.

After completing the first session, we fit a softmax decision function to participants' choices. Intuitively, this procedure allowed us to determine how consistently participants selected the option with greater subjective value. Practically, we fit the softmax to better equate choices during the second session, across participants. In particular, our aim was to equate the relative impact of delayed rewards, across subjects, with respect to actual choice outcomes (i.e. the likelihood of selecting the delayed option). Best fitting softmax functions were estimated by maximizing the likelihood of observed choices. We assumed that the likelihood of selecting a delayed reward (Inline graphic) was given by

$$P_D = \frac{1}{1 + e^{-m(V_D - V_I)}} \qquad (2)$$

where V_D is given by Equation 1, V_I = 10 (i.e., the fixed value of the immediate reward, also given by the right side of Equation 1) and m describes a subject's sensitivity to changes in V_D − V_I.
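A maximum-likelihood fit of the softmax sensitivity can be sketched as follows. The sketch assumes a logistic choice rule in the value difference, uses a simple grid search in place of whatever optimizer the authors used, and all names (`p_delayed`, `fit_sensitivity`, `m`) and the example trials are hypothetical.

```python
import math


def p_delayed(v_delayed, v_immediate, m):
    """Logistic (softmax) probability of choosing the delayed reward."""
    return 1.0 / (1.0 + math.exp(-m * (v_delayed - v_immediate)))


def neg_log_likelihood(m, trials):
    """trials: iterable of (v_delayed, v_immediate, chose_delayed) tuples."""
    nll = 0.0
    for v_d, v_i, chose_delayed in trials:
        p = p_delayed(v_d, v_i, m)
        p = min(max(p, 1e-12), 1.0 - 1e-12)       # guard against log(0)
        nll -= math.log(p if chose_delayed else 1.0 - p)
    return nll


def fit_sensitivity(trials, grid=None):
    """Return the grid value of m that maximizes the likelihood of the choices."""
    grid = grid or [i / 100.0 for i in range(1, 501)]   # 0.01 .. 5.00
    return min(grid, key=lambda m: neg_log_likelihood(m, trials))


# Made-up choices: discounted value of the delayed offer, immediate value, choice.
example_trials = [(12.0, 10.0, True), (11.0, 10.0, True), (11.0, 10.0, False),
                  (9.0, 10.0, False), (8.0, 10.0, False), (9.5, 10.0, True)]
m_hat = fit_sensitivity(example_trials)
```

A larger fitted m means choices track the value difference more deterministically; a value near zero means choices are close to random.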

We used individually determined values of the discount factor k and the softmax sensitivity m to generate choices for the second session. At every trial, the delay t was randomly selected from a range of 30–45 days. We then calculated and offered an amount r that would give a probability of choosing the delayed reward, P_D, of 0.1, 0.3, 0.5, 0.7, or 0.9 (Figure 1a–b). The EEG group completed 30 trials at every P_D level, except at P_D = 0.5, for which they completed 60 trials. The fMRI group completed 40 trials at every P_D level, except at P_D = 0.5, for which they completed 80 trials. Non-uniform trial distributions as a function of P_D were introduced to allow us to study the effects of choice difficulty on EEG and fMRI measures, with equal numbers of trials at each difficulty level (pooling the 0.1 and 0.9 conditions, and the 0.3 and 0.7 conditions, yields as many trials as the 0.5 condition). We report the results of these analyses elsewhere. Trial types were randomized and counterbalanced over two blocks for the EEG group and over four blocks for the fMRI group. We also counterbalanced the mapping between choices and button presses for every subject. During the first half of the second session, approximately half of subjects (13 in EEG, 11 in fMRI) indicated choices of the delayed reward by pressing a button with their left index finger and immediate choices by pressing a different button with their right index finger. The other subjects indicated their choices by the inverse left-right mapping. All subjects switched the initial response mapping during the second half of the session.
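Generating an offer for a target choice probability amounts to inverting the softmax and then the hyperbolic discount function. The sketch below assumes those two functional forms; the function names and the example values of k and m are hypothetical.

```python
import math


def amount_for_target(p_target, delay_days, k, m, immediate=10.0):
    """Delayed amount whose discounted value yields choice probability p_target."""
    log_odds = math.log(p_target / (1.0 - p_target))
    v_delayed = immediate + log_odds / m           # invert the softmax
    return v_delayed * (1.0 + k * delay_days)      # invert hyperbolic discounting


def choice_probability(amount, delay_days, k, m, immediate=10.0):
    """Forward model, used here only to check the inversion."""
    v_delayed = amount / (1.0 + k * delay_days)
    return 1.0 / (1.0 + math.exp(-m * (v_delayed - immediate)))


# Hypothetical subject parameters and a 40-day delay.
k, m, delay = 0.02, 0.8, 40
offers = {p: amount_for_target(p, delay, k, m) for p in (0.1, 0.3, 0.5, 0.7, 0.9)}
```

The sketch does not guard against implausible offers: for a subject with very low sensitivity, the amount required for a low target probability can fall below zero.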

Figure 1. Experimental design.

(a) Delayed reward offers corresponded with one of five different levels of discounted value. Each level of discounted value corresponded to one of five probabilities of choosing the delayed reward: 0.1, 0.3, 0.5, 0.7 or 0.9. (b) Every delay could be combined with any of five different amounts to yield a different discounted value and probability of choosing the delayed reward. (c) Delay and amount information was presented sequentially. Delays were presented first for 1000 ms. Amounts were presented second, replacing the presentation of the delay and remaining on the screen for a maximum of 4000 ms. After every trial, a fixation cross was presented on the center of the screen for a randomly chosen inter-trial-interval on the order of hundreds of milliseconds during the EEG experiment and several seconds during the fMRI experiment.

To ensure reliable neural measures, we used a sequential presentation of delay and amount during the second session (Figure 1c). During pilot studies we found that a simultaneous presentation of delay and amount caused participants to sequentially fixate the information, producing excessive EEG artifacts. Having the information presented sequentially allowed subjects to maintain central fixation during the task, avoiding these artifacts. As we show below, this sequential presentation of delayed reward information had no adverse effects on behavior. We maintained the same sequential presentation during the fMRI study for the purpose of facilitating direct comparisons and pooling of behavioral data. We report RT as measured from the onset of the decision period, 1000 ms into the trial. The duration of the decision period was fixed at 4000 ms. When subjects made choices in less than 4000 ms, the amount information disappeared and the screen remained blank until 4000 ms elapsed. Trial length was thus fixed at 5000 ms. We discarded any trial in which a response was made in less than 200 ms or fell outside of the decision period. To optimize experimental time and separability of neural signals across trials for both groups, we introduced a long inter-trial-interval for the fMRI group (4–10 s), whereas the inter-trial-interval was shorter for the EEG group (100–350 ms). In exchange for participation, subjects received $10 cash and an additional amount, determined by their choice in a randomly selected trial, taken from either the first or the second session.

Model specification and fitting

Figure 2 provides an illustrative diagram of our LBA model of intertemporal choice. To provide a formal description of the model, we denote the RT on the Inline graphicth trial for the Inline graphicth subject in the Inline graphicth value condition as Inline graphic, and the corresponding choice as Inline graphic where Inline graphic. Inline graphic and Inline graphic are the immediate and delayed rewards respectively. The model assumes that evidence for Inline graphic and Inline graphic is accumulated independently in separate accumulators. Both accumulators begin with some choice bias, which is provided as independent amounts of starting point evidence Inline graphic, sampled from a common uniform distribution Inline graphic. Evidence then increases through time at rates Inline graphic, which are sampled from independent normal distributions with means Inline graphic. Mean accumulation rates vary across value conditions, but the standard deviation Inline graphic is the same for Inline graphic and Inline graphic. Therefore, Inline graphic and Inline graphic. Each accumulator gathers evidence until either one reaches a response threshold Inline graphic. The observed RT is the sum of the decision time, plus some extra time Inline graphic, which accounts for non-comparison and selection processes, such as temporal discounting and motor execution. Letting Inline graphic  =  a and Inline graphic  =  d, the RT in any given trial is given by

graphic file with name pone.0090138.e060.jpg (3)

Figure 2. Illustrative diagram of the Linear Ballistic Accumulator model for intertemporal choice, where each response option is represented as a separate accumulator.

Following the presentation of a stimulus and some non-decision time Inline graphic, information accumulates ballistically for each alternative. A decision is made that coincides with the accumulator that reaches the threshold Inline graphic first. The model assumes trial-to-trial variation in both starting point and drift rate.
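The generative process that Equation 3 summarizes can be sketched as a short simulation: each accumulator draws a uniform starting point and a normal drift rate, rises linearly, and the first to reach the threshold determines the choice, with non-decision time added to the finishing time. The notation (`A` for the upper limit of the start-point distribution, `b` for the threshold, `s` for the drift standard deviation, `tau` for non-decision time) follows common LBA conventions and is an assumption here, as are all parameter values.

```python
import random


def simulate_lba_trial(v_immediate, v_delayed, s, A, b, tau, rng):
    """Simulate one trial; return (choice, rt), or None if neither accumulator rises."""
    finish = {}
    for label, v in (("immediate", v_immediate), ("delayed", v_delayed)):
        start = rng.uniform(0.0, A)       # starting-point evidence
        drift = rng.gauss(v, s)           # trial-specific accumulation rate
        if drift > 0.0:                   # a non-positive drift never reaches b
            finish[label] = (b - start) / drift
    if not finish:
        return None
    choice = min(finish, key=finish.get)  # first accumulator to reach threshold
    return choice, finish[choice] + tau


# Hypothetical parameter values; mean drift rates sum to one as in the text.
rng = random.Random(1)
trials = [simulate_lba_trial(0.3, 0.7, 0.25, 0.5, 1.0, 0.3, rng) for _ in range(5000)]
trials = [t for t in trials if t is not None]
share_delayed = sum(1 for choice, _ in trials if choice == "delayed") / len(trials)
```

Because choice and RT come from the same race, changing a drift rate shifts the choice proportion and the RT distributions together, which is the joint account the model is meant to provide.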

The model provides a closed-form and joint account of RT and choice probability across value conditions by specifying “defective” probability density functions (PDF) for Inline graphic and Inline graphic in terms of the parameters just described. These defective PDFs give the probabilities of each accumulator reaching the bound at time Inline graphic. For our best fitting model, the full PDFs are given by

graphic file with name pone.0090138.e064.jpg (4)

where Inline graphic and Inline graphic are the PDF and cumulative density functions of each accumulator (see [8] for details).
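For readers who want to evaluate these densities, the sketch below implements the single-accumulator finishing-time PDF and CDF in the form published for the LBA model [8], and combines them into a defective density for one response. The parameter names (`v`, `s`, `A`, `b`, `tau`) are conventional LBA notation and are assumptions here; in the best-fitting model the non-decision time would additionally vary by value condition.

```python
import math


def _phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def lba_cdf(t, v, s, A, b):
    """Probability that one accumulator has reached b by decision time t."""
    if t <= 0.0:
        return 0.0
    z1 = (b - A - t * v) / (t * s)
    z2 = (b - t * v) / (t * s)
    return (1.0
            + (b - A - t * v) / A * _Phi(z1)
            - (b - t * v) / A * _Phi(z2)
            + t * s / A * _phi(z1)
            - t * s / A * _phi(z2))


def lba_pdf(t, v, s, A, b):
    """Density of the time at which one accumulator reaches b."""
    if t <= 0.0:
        return 0.0
    z1 = (b - A - t * v) / (t * s)
    z2 = (b - t * v) / (t * s)
    return (-v * _Phi(z1) + s * _phi(z1) + v * _Phi(z2) - s * _phi(z2)) / A


def defective_pdf(rt, v_win, v_lose, s, A, b, tau):
    """Density of responding at `rt` with the winning accumulator finishing first."""
    t = rt - tau
    return lba_pdf(t, v_win, s, A, b) * (1.0 - lba_cdf(t, v_lose, s, A, b))
```

Summing the log of `defective_pdf` over observed (choice, RT) pairs gives the log-likelihood that a fitting procedure would maximize or sample from.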

We estimated LBA model parameters using a hierarchical Bayesian procedure. This procedure offers two advantages over conventional maximum likelihood methods: it provides measures of uncertainty for every parameter estimate and allows the sharing of information across subjects (e.g., [10], [11]), which improves fitting accuracy [12]–[14]. We assume that the data for each subject are characterized by an individual set of LBA model parameters Inline graphic, and that these subject-specific parameters are constrained by a set of group-level parameters ϕ, which characterize the central tendency and dispersion of Inline graphic across subjects. The procedure first samples the posterior distributions for every subject's Inline graphic and uses these estimates to derive the posterior distribution of ϕ. On every subsequent iteration, the posterior estimates of ϕ are used to constrain the sampling of possible values of Inline graphic for every subject. We specified mildly informative priors for Inline graphic, based on empirical evidence from previous fits of the LBA model using the hierarchical Bayesian procedure [15]. For ϕ, we specified a conjugate relationship between prior and posterior (see, e.g., [16]). Assuming a conjugate relationship at the group level allowed us to derive exact conditional posterior distributions, so that we could perform the estimation of all of the parameters simultaneously, based on a single sample of subject-level parameters. The joint posterior distribution estimated by this procedure is given by:

$$p(\theta, \phi \mid \mathrm{data}) \propto L(\theta \mid \mathrm{data})\, p(\theta \mid \phi)\, p(\phi) \qquad (5)$$

where p(ϕ) is the prior distribution for ϕ, p(θ | ϕ) is the prior distribution for the subject-level parameters θ given ϕ, and L(θ | data) is the likelihood function of the data under the LBA model (given by Equation 4).

To satisfy scaling conditions, we imposed a constraint such that the drift rates sum to one (i.e., Inline graphic). Consequently, it is sufficient to only estimate the drift rate for the delayed reward. For the subject-specific parameters, we first transformed the parameters so that they had continuous, infinite support (i.e., can take on any real value). Thus, for parameters bounded by zero, we applied a log transformation, whereas for the drift rates – which were bounded by zero and one – we used a logit transformation. Following these transformations, we specified the following priors for Inline graphic:

graphic file with name pone.0090138.e080.jpg
graphic file with name pone.0090138.e081.jpg
graphic file with name pone.0090138.e082.jpg
graphic file with name pone.0090138.e083.jpg
graphic file with name pone.0090138.e084.jpg
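The transformations described before these priors (a log transform for parameters bounded below by zero, a logit transform for the drift rate, and the sum-to-one constraint) can be sketched as follows. The function names are hypothetical and the example values are made up.

```python
import math


def log_transform(x):
    """Map a positive parameter (e.g., a threshold) to the whole real line."""
    return math.log(x)


def inv_log_transform(y):
    return math.exp(y)


def logit(p):
    """Map a parameter in (0, 1), such as a drift rate, to the whole real line."""
    return math.log(p / (1.0 - p))


def inv_logit(y):
    return 1.0 / (1.0 + math.exp(-y))


def drift_pair(v_delayed_logit):
    """Recover both mean drift rates from the single estimated (logit-scale) value."""
    v_delayed = inv_logit(v_delayed_logit)
    return 1.0 - v_delayed, v_delayed      # (immediate, delayed), summing to one


# Made-up example: a sampler proposes values on the unconstrained scale.
v_immediate, v_delayed = drift_pair(0.85)
threshold = inv_log_transform(0.2)
```

Sampling on the unconstrained scale lets normal priors and symmetric proposals be used without ever producing an out-of-range parameter value.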

To obtain the desired conjugate relationship between prior and posterior at the level of ϕ, we specified the following priors for the group-level means:

graphic file with name pone.0090138.e085.jpg
graphic file with name pone.0090138.e086.jpg
graphic file with name pone.0090138.e087.jpg
graphic file with name pone.0090138.e088.jpg
graphic file with name pone.0090138.e089.jpg

and the following priors for the group-level standard deviations,

graphic file with name pone.0090138.e090.jpg
graphic file with name pone.0090138.e091.jpg
graphic file with name pone.0090138.e092.jpg
graphic file with name pone.0090138.e093.jpg
graphic file with name pone.0090138.e094.jpg

where Inline graphic denotes the inverse gamma distribution with shape parameter Inline graphic, and scale parameter Inline graphic. This particular choice of Inline graphic and Inline graphic for the priors produces a skewed distribution with an approximate 95% credible set of (1.14, 9.05), and an expected value of 3.32. These choices reflect our a priori beliefs: we did not expect the between-subject variability to be less than 1, and felt that larger values would become increasingly less likely to account for these data.
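Summaries of this kind can be computed directly for any inverse gamma prior. The sketch below does so with the standard library for an integer shape parameter. The shape and scale values are not reproduced in this copy of the text, so shape = 4 and scale = 10 are hypothetical inputs, chosen only because they give summaries in the neighborhood of those reported; they should not be read as the authors' values.

```python
import math


def erlang_cdf(x, shape):
    """CDF of a Gamma(shape, rate=1) variable for integer shape."""
    if x <= 0.0:
        return 0.0
    term, total = 1.0, 1.0
    for n in range(1, shape):
        term *= x / n
        total += term
    return 1.0 - math.exp(-x) * total


def inv_gamma_quantile(p, shape, scale):
    """Quantile of InvGamma(shape, scale), found by bisection.

    If X ~ InvGamma(shape, scale) then scale / X ~ Gamma(shape, 1), so
    P(X <= x) = 1 - erlang_cdf(scale / x, shape).
    """
    lo, hi = 1e-9, 1e9
    for _ in range(200):
        mid = math.sqrt(lo * hi)                    # bisect on a log scale
        if 1.0 - erlang_cdf(scale / mid, shape) < p:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def inv_gamma_mean(shape, scale):
    """Mean of InvGamma(shape, scale); defined for shape > 1."""
    return scale / (shape - 1.0)


shape, scale = 4, 10.0                              # hypothetical values
lower = inv_gamma_quantile(0.025, shape, scale)
upper = inv_gamma_quantile(0.975, shape, scale)
mean = inv_gamma_mean(shape, scale)
```

Under these assumed inputs the exact interval is roughly (1.14, 9.2) with mean 10/3, close to but not identical to the reported summaries, which may have been obtained by simulation.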

While our prior selections were informed by other similar modeling applications (see, e.g., [15]), we remained conservative in our choices to avoid undue parameter constraint, because our experimental task was considerably different from prior research using the hierarchical version of the LBA model.

We used Gibbs sampling to estimate parameters at the group level [16], and differential evolution with Markov chain Monte Carlo to estimate parameters at the subject level (DE-MCMC; [15], [17]). For the subject-level estimates, we used 24 chains and obtained 5,000 samples after a burn-in period of 5,000 samples. We then thinned the chains to reduce autocorrelation by retaining every fourth sample. Thus, our estimates of the joint posterior distribution of LBA model parameters are based on 30,000 samples (24 chains × 5,000 samples, thinned by a factor of four). The burn-in period allowed us to converge quickly to the high-density regions of the posterior distribution, while the rest of the samples allowed us to improve the reliability of the estimates.
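The core of a differential-evolution sampler is its proposal: each chain moves along the difference between two other randomly chosen chains. The sketch below applies that idea to a one-dimensional toy target rather than to the LBA posterior. The scaling constant, jitter width, toy target, and run lengths are all assumptions for illustration, not the settings used in the study.

```python
import math
import random


def de_mcmc(log_post, n_chains, n_iter, rng, jitter=1e-3):
    """Differential-evolution MCMC for a one-dimensional target density."""
    gamma = 2.38 / math.sqrt(2.0)            # a common default scaling in one dimension
    state = [rng.gauss(0.0, 1.0) for _ in range(n_chains)]
    logp = [log_post(x) for x in state]
    draws = []                                # draws[iteration][chain]
    for _ in range(n_iter):
        for i in range(n_chains):
            others = [j for j in range(n_chains) if j != i]
            m, n = rng.sample(others, 2)      # two distinct other chains
            proposal = (state[i] + gamma * (state[m] - state[n])
                        + rng.uniform(-jitter, jitter))
            lp = log_post(proposal)
            # Metropolis acceptance; 1 - random() lies in (0, 1], so log is safe.
            if math.log(1.0 - rng.random()) < lp - logp[i]:
                state[i], logp[i] = proposal, lp
        draws.append(list(state))
    return draws


def toy_log_post(x):
    """Unnormalized log density of a normal distribution with mean 2 and sd 1."""
    return -0.5 * (x - 2.0) ** 2


rng = random.Random(2014)
draws = de_mcmc(toy_log_post, n_chains=24, n_iter=2000, rng=rng)
kept = [x for row in draws[500::4] for x in row]    # drop burn-in, keep every 4th
posterior_mean = sum(kept) / len(kept)

retained_in_paper = 24 * 5000 // 4                  # sample count reported in the text
```

Because proposals are built from the current spread of the chains, the step size adapts to the scale of the posterior without manual tuning.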

To find the optimal number of parameters needed to account for intertemporal choice behavior, we tested a variety of model variants where different sets of parameters were assumed to vary across value conditions. We fit a total of eight variants, following a model building approach based on the Bayesian predictive information criterion (BPIC; [18]). Table 1 shows the model variants we fit (left column) with the particular constraints that were imposed (right column) along with the resulting BPIC values obtained (middle column). We started with the simplest possible model and added parameters only if they improved model fits on the basis of BPIC. The most basic model (M1) only allowed the mean drift rates Inline graphic to vary across value conditions. Another four models freed each of the remaining parameters (Inline graphic and Inline graphic), independently, across value conditions. Because the model that freed Inline graphic (M2) was superior to M1, we considered three additional models that freed Inline graphic and Inline graphic, together with each of the remaining parameters independently. None of these three models improved fits, indicating that no additional parameter combinations needed to be tested. We did not consider any models that freed parameters other than Inline graphic between Inline graphic and Inline graphic because we found no a priori justification for them.

Table 1. Mean Bayesian predictive information criterion fit statistics for each model variant we tested (standard deviations of the BPIC values computed across chains appear in parentheses).

Model BPIC (std. dev.) Constraint
M1 20101.37 (19.02) Inline graphic
M2 20090.73 (26.91) Inline graphic, Inline graphic
M3 20168.13 (71.25) Inline graphic, Inline graphic
M4 20197.20 (58.93) Inline graphic, Inline graphic
M5 20135.33 (46.86) Inline graphic, Inline graphic
M6 20138.64 (98.01) Inline graphic, Inline graphic, Inline graphic
M7 20111.70 (44.94) Inline graphic, Inline graphic, Inline graphic
M8 20153.45 (28.34) Inline graphic, Inline graphic, Inline graphic

For each model, the third column indicates the set of parameters assumed to vary across value conditions.
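The comparison in Table 1 can be reproduced in a few lines: lower BPIC indicates a better trade-off between fit and complexity, so the preferred model is the one with the smallest value. The BPIC values are copied from Table 1; the variable names are hypothetical.

```python
# Mean BPIC values from Table 1.
bpic = {
    "M1": 20101.37, "M2": 20090.73, "M3": 20168.13, "M4": 20197.20,
    "M5": 20135.33, "M6": 20138.64, "M7": 20111.70, "M8": 20153.45,
}

ranking = sorted(bpic, key=bpic.get)          # best (lowest BPIC) first
best = ranking[0]
delta = {model: round(value - bpic[best], 2) for model, value in bpic.items()}
```

Differences from the best model, rather than raw values, are what matter for the comparison, and they should be read against the across-chain standard deviations reported in the table.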

Results

Model fits

Table 1 shows BPIC results for all the models tested. The best overall model, albeit by a small margin, was M2, which allowed mean drift rates (Inline graphic) and non-decision times (Inline graphic) to vary across experimental conditions. Figure 3 shows the quality of the fits obtained with this model. The match between the data and the model predictions is clear in each of the defective PDFs and histograms shown on the top row. These fits speak to the LBA model's ability to simultaneously account for observed RT distributions and choice probabilities during intertemporal choice.

Figure 3. A comparison of model fits to empirical data.

The top row shows the aggregated posterior predictive distribution (densities) overlaid on the aggregated empirical data (histograms). The response time distribution for the immediate reward is plotted on the left (i.e., with a negative axis; red), whereas the delayed reward is plotted on the right (green). The choice probability can be inferred by comparing the relative heights of the two distributions. The bottom row shows the same distributions as overlapping density functions with corresponding colors. The model fits are shown as black densities. The median response times for the empirical data are shown as the dashed vertical lines with corresponding colors.

The bottom row of Figure 3 shows the model fits with the RT distributions for both accumulators on the same axis to better illustrate the relationship between choice probability and RT. As net value (i.e. Inline graphic) increases, choices for the reward of less subjective value are slower relative to choices for the reward of greater value. This finding is illustrated by the increased separation of RT medians as the probability of choosing the delayed reward deviates further from Inline graphic (Figure 3). We confirmed the reliability of this pattern in the data by analyzing RT medians for choices that were consistent versus inconsistent with estimated subjective values. Specifically, we performed a rank test on RT medians for consistent and inconsistent choices for all value conditions for which Inline graphic and confirmed that inconsistent responses were slower than consistent responses in all conditions where Inline graphic (Inline graphic). A similar relationship between RT and choice probability is commonly observed during perceptual decision making under stressed accuracy conditions. As choice probabilities deviate from Inline graphic, the means of the drift rate distributions (Inline graphic) grow further apart (cf. [19], [20]). Recall that Inline graphic. However, subjects maintain an elevated accumulation bound (Inline graphic) relative to the starting points (Inline graphic). As a result, choices for the reward of less subjective value only occur in the improbable trials where the drift rate for the higher-valued reward is unusually low, the drift rate for the lower-valued reward is unusually high, and subjects require more accumulated information before a decision can be made. If the starting points were large relative to the decision bound, we would observe the opposite interdependence of RT and choice probabilities. Inconsistent choices would be faster than consistent choices, because fast errors occur when the initial choice bias drives the accumulation close to the decision bound before much evidence influences the decision. This value accumulation mechanism can explain why our model fitting results indicated that variability in Inline graphic or Inline graphic was not required to provide a good fit for these data (i.e., M1 and M2 performed better than M3, M4, M6, and M7).
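Both regimes described in this paragraph can be illustrated with a small simulation of the accumulator race. All parameter values are hypothetical and chosen only to make the contrast visible; non-decision time is left out because it shifts both distributions equally.

```python
import random
import statistics


def median_rts(n_trials, v_low, v_high, s, A, b, rng):
    """Median decision times for value-consistent and value-inconsistent choices."""
    consistent, inconsistent = [], []
    for _ in range(n_trials):
        d_high = rng.gauss(v_high, s)         # drift for the higher-valued option
        d_low = rng.gauss(v_low, s)           # drift for the lower-valued option
        t_high = (b - rng.uniform(0.0, A)) / d_high if d_high > 0 else float("inf")
        t_low = (b - rng.uniform(0.0, A)) / d_low if d_low > 0 else float("inf")
        if t_high == float("inf") and t_low == float("inf"):
            continue                          # neither accumulator rises
        if t_high < t_low:
            consistent.append(t_high)
        else:
            inconsistent.append(t_low)
    return statistics.median(consistent), statistics.median(inconsistent)


rng = random.Random(7)

# Regime 1: bound well above the start points, substantial drift variability.
slow_err = median_rts(20000, v_low=0.3, v_high=0.7, s=0.25, A=0.2, b=1.0, rng=rng)

# Regime 2: start points reach close to the bound, little drift variability.
fast_err = median_rts(20000, v_low=0.45, v_high=0.55, s=0.02, A=0.9, b=1.0, rng=rng)
```

In the first regime an inconsistent choice needs an unusually unfavorable pair of drift rates, and such races finish late; in the second, it mostly reflects a head start, so it finishes early.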

Non-decision time

The best fitting model, M2, specifies a total of 13 subject-specific parameters, four more than the next-best and simplest model, M1. The four additional parameters modeled differences in non-decision time (Inline graphic) by value condition (Inline graphic). To evaluate whether there was indeed systematic variance in non-decision time, we first inspected group-level estimates of Inline graphic, shown in the left panel of Figure 4. These parameter estimates showed a positive quadratic pattern centered at Inline graphic. To test the quadratic relationship between Inline graphic and value, we performed a mixed-effects regression analysis with the nlme package in R (Pinheiro et al., 2013), specifying subjects as random effects, and the regressor Inline graphic as a predictor of subject-specific maximum a posteriori (MAP) estimates of Inline graphic. The results corroborated a positive quadratic relationship between Inline graphic estimates and value (Inline graphic, Inline graphic), suggesting that there is an increase in valuation and/or motor-execution times as net value increases.

Figure 4. Relationships between model parameters, choice probability, and RT statistics.

The left panel shows the estimated group level non-decision time parameter for each value condition. The middle and right panels show the maximum a posteriori (MAP) estimate for each subject's non-decision time parameter against their minimum and median response time, respectively.

In the LBA model, Inline graphic functions as an offset term that captures differences in condition-wise RT that are not captured by the other parameters. The obvious empirical statistics related to average RT differences are condition-wise median and minimum RT. We therefore next tested (1) whether Inline graphic estimates were related to either median or minimum RT, and (2) whether minimum and/or median RT differed by value condition, as suggested by the positive quadratic relationship between Inline graphic estimates and value.

The middle and right panels of Figure 4 plot subject-specific MAP estimates of Inline graphic against minimum and median RT, respectively. We conducted two mixed-effects regressions (using subjects as random effects) to determine whether Inline graphic estimates were related to minimum or median RT at each value condition. As hypothesized, Inline graphic estimates showed a significant linear relationship with minimum RT (Inline graphic, Inline graphic, Inline graphic) and also a significant linear relationship with median RT (Inline graphic, Inline graphic, Inline graphic).

Given these results, we next sought to determine whether RT differed across value conditions in the same manner as did estimates of Inline graphic. To test this hypothesis, we ran two additional mixed-effects regressions using the quadratic regressor Inline graphic as a predictor of minimum and median RT (with subjects again as random effects). Recall that Inline graphic estimates showed a positive quadratic relationship with value. This relationship with value was not evident in analyses of minimum or median RT. Specifically, minimum RT did not show a significant quadratic relationship (Inline graphic, Inline graphic), and median RT showed a significant negative relationship with value (Inline graphic, Inline graphic). We conclude from these results that neither minimum nor median RT alone can explain the positive quadratic relationship between Inline graphic and value. Taken together, our results suggest that the additional degrees of freedom in M2 allowed the model to capture within-subject changes in minimum RT and residual variance of median RT across value conditions.

Drift rates and value

To obtain a more precise characterization of M2 as a mechanistic theory of discounted value accumulation, we examined the relationship between independently estimated accumulation rates and discounted values. We first tested whether there were systematic differences in group-level estimates of Inline graphic as a function of Inline graphic. Group-level means of Inline graphic increased as a function of Inline graphic. Specifically, we ran a mixed-effects regression of subject-specific MAP estimates of Inline graphic on Inline graphic (using subjects as random-effects). This test revealed a significant positive linear relationship (Inline graphic, Inline graphic, Inline graphic; Figure 5, left plot).

Figure 5. Relationships between model parameters, choice probability, and discounted value.

The left panel shows the estimated group level drift rate for each value condition. The middle panel shows the maximum a posteriori (MAP) estimate for each subject's drift rate against observed choice probabilities for the delayed reward (Inline graphic). The right panel shows the MAP estimate as a function of subject-specific discounted values for the delayed reward (Inline graphic).

Next, we tested for a relationship between observed choice probabilities for the delayed reward and MAP estimates of Inline graphic and Inline graphic at the level of individual subjects. Specifically, we hypothesized that drift rates (Inline graphic) should be related to subjective value through a linear transform, with a slope parameter to account for differences in scale (i.e. Inline graphic and Inline graphic are restricted to be between 0 and 1 but Inline graphic and Inline graphic are in dollars with a mean of $10) and an offset parameter to account for differences in drift rate and value means. We further reasoned that if drift rates were directly related to discounted subjective value then drift rates ought to be related to choice probability in the same way that differences in value are related to choice probability. Based on fits of the hyperbolic temporal discounting model (Equation 1) to choice outcomes, we already knew that a sigmoidal relationship (Equation 2) existed between subjective value (i.e. Inline graphic) and choice probabilities (i.e. Inline graphic). If modeled drift rates had the same relationship then we would expect a similar relationship between Inline graphic, Inline graphic, and Inline graphic. However, Inline graphic and Inline graphic were not independent in our model specification. They were restricted such that Inline graphic. Thus, the difference in drift rates, Inline graphic reduces to a linear transformation of Inline graphic. We therefore tested whether a sigmoidal relationship exists between subject- and condition-specific Inline graphic and a linear transform of Inline graphic:

graphic file with name pone.0090138.e201.jpg (6)

where Inline graphic and Inline graphic are subject specific parameters.

We tested for evidence to support Equation 6 in two ways. First, we performed a mixed-effects logistic regression using Inline graphic to predict Inline graphic, with subjects as random effects. This analysis revealed a significant fit (Inline graphic, Inline graphic, Inline graphic, Inline graphic). The sigmoidal relationship is also clearly evident in the center plot of Figure 5 which plots Inline graphic against Inline graphic. Next, we tested whether the relationship between Inline graphic and Inline graphic (i.e., Equation 2) was directly related to the relationship between Inline graphic and Inline graphic (i.e., Equation 6). If so, then the logistic function in both analyses should be equivalent and the following relationship should hold:

graphic file with name pone.0090138.e216.jpg (7)

We estimated all of the parameters in Equation 7 from separate logistic regression analyses. Namely, Inline graphic and Inline graphic were obtained from fitting Equation 6, Inline graphic was derived from fitting Equation 2, and Inline graphic was obtained from best fits of Equation 1, all independently for every subject. In a group-level analysis, we used a mixed-effects regression with subjects as a random effect and the right side of Equation 7 as the predictor. This analysis revealed a highly significant slope near unity (Inline graphic, Inline graphic, Inline graphic). Together, these analyses indicated that there was a strong and direct relationship between drift rates and discounted value. Parameter estimates derived from fitting the LBA model to behavior therefore provided an independent means of estimating subjective values. Moreover, subjective values estimated from the LBA model corresponded closely with values estimated using a hyperbolic discounting model.
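The reasoning that leads from the two logistic relationships to a linear prediction for drift rates can be sketched as follows. The symbols are assumptions introduced for illustration, because the original notation is not reproduced in this text: $P_D$ is the probability of choosing the delayed reward, $\Delta V$ is the difference in discounted value between the delayed and immediate rewards, $m$ is the logistic slope from Equation 2, $d$ is the drift rate for the delayed reward, and $a$ and $b$ are the subject-specific slope and offset of Equation 6. The exact arrangement of terms in Equation 7 may differ from this sketch.

```latex
\begin{align}
P_D &= \frac{1}{1 + e^{-m\,\Delta V}}
  && \text{(logistic choice rule on value differences)} \\
P_D &= \frac{1}{1 + e^{-(a\,d + b)}}
  && \text{(logistic rule on a linear transform of drift rate)} \\
m\,\Delta V &= a\,d + b
  \quad\Longrightarrow\quad
  d = \frac{m\,\Delta V - b}{a}
  && \text{(equating the two logistic arguments)}
\end{align}
```

Under these assumptions, regressing the estimated drift rates on the right-hand side of the last line should give a slope close to one if both logistic functions describe the same underlying relationship.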

Generalizability of the relationship between drift rates and value

The previous analysis showed that a relationship existed between drift rates derived from LBA model fits and subjective value calculated based on a hyperbolic discount function. Of course, subjective value may actually be determined in a manner that differs in functional form from the hyperbolic equation (cf. [2]). Indeed, numerous functions have been proposed to account for delay discounting. In this final section, we aimed to show that drift rates derived from the LBA model are related to subjective value more generally; that is, that the relationship between drift rates and subjective value does not strictly depend on capturing subjective value using the hyperbolic discount function. To do so, we first fitted two additional discounting models to individual subjects' choices, substituting the right side of Equation 1 with exponential and “quasi-hyperbolic” value functions. For the exponential discounting function, we assumed Inline graphic to be given by:

graphic file with name pone.0090138.e225.jpg (8)

where Inline graphic is the delayed reward amount, Inline graphic is the discount rate, and Inline graphic is the delay. Similarly, for the quasi-hyperbolic discounting function, we assumed Inline graphic to be given by:

graphic file with name pone.0090138.e230.jpg (9)

where Inline graphic is again the delayed reward amount, Inline graphic is Inline graphic when there is no delay or some fixed value between Inline graphic and Inline graphic when there is a delay, Inline graphic is between Inline graphic and Inline graphic, and Inline graphic is the delay (always greater than zero).
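The three discounting models discussed here can be compared with a minimal sketch. It assumes the standard textbook forms: the Mazur hyperbolic form for Equation 1, an exponential decay with rate `k` for Equation 8, and the beta-delta form for Equation 9. The function and parameter names (`amount`, `delay`, `k`, `beta`, `delta`) are chosen for illustration and are not taken from the paper, and the parameter values are arbitrary rather than fitted.

```python
import math


def hyperbolic_value(amount, delay, k):
    """Hyperbolic discounting (assumed Mazur form): amount / (1 + k * delay)."""
    return amount / (1.0 + k * delay)


def exponential_value(amount, delay, k):
    """Exponential discounting (assumed form): amount * exp(-k * delay)."""
    return amount * math.exp(-k * delay)


def quasi_hyperbolic_value(amount, delay, beta, delta):
    """Quasi-hyperbolic (beta-delta) discounting (assumed form).

    beta is replaced by 1 when there is no delay; delta ** delay
    discounts the reward for each unit of delay.
    """
    present_bias = 1.0 if delay == 0 else beta
    return present_bias * (delta ** delay) * amount


# Illustrative comparison for a $20 reward delayed by 30 days.
amount, delay = 20.0, 30.0
values = {
    "hyperbolic": hyperbolic_value(amount, delay, k=0.01),
    "exponential": exponential_value(amount, delay, k=0.01),
    "quasi-hyperbolic": quasi_hyperbolic_value(amount, delay, beta=0.8, delta=0.99),
}
for name, value in values.items():
    print(f"{name}: {value:.2f}")
```

Each function returns the full reward amount when the delay is zero, so the models differ only in how quickly value falls as the delay grows.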

We then obtained estimates of Inline graphic using Equation 8 and Equation 9, as well as two independent estimates of Inline graphic, one for each discounting function, from Equation 2, for every subject. Next, we ran mixed-effects regression analyses with subjects as a random effect and the right side of Equation 7 as the predictor of subject-specific drift rate estimates. The analysis using Inline graphic revealed a significant slope near unity (Inline graphic, Inline graphic, Inline graphic) and the analysis using Inline graphic also revealed a significant positive slope (Inline graphic, Inline graphic, Inline graphic). We therefore conclude that drift rates are related to subjective value independent of the specific functional form assumed for delay discounting.

Discussion

We have shown that intertemporal choice behavior is consistent with a process of discounted value accumulation instantiated by the LBA model. Our findings support the broader hypothesis that selecting among delayed rewards can be explained by a sequential sampling process that corresponds closely with mechanisms known to predict other types of choices (cf. [3]). Thus, perceptual and value-based decision making may depend on similar comparison and selection processes. It is interesting to speculate on whether this similarity reflects a direct correspondence between the cognitive and neural processes that support selection across diverse domains or whether there is simply a common motif for action selection used in separate choice domains.

The LBA model we employed here has been used to explain neural activity during perceptual decision making (cf. [20]–[22]). Furthermore, sequential sampling processes such as that implemented by the LBA model provide a direct link between neural dynamics and decision making behavior. For example, evidence about visual motion is believed to be integrated in the lateral intraparietal (LIP) area, resulting in a progressive increase in LIP neuron firing rates that reflects the accumulation of sensory evidence and predicts choice outcomes and response times [23], [24]. Our results represent a first step in extending such findings from perceptual decision making tasks to generate quantitative predictions about discounted value accumulation in intertemporal choice. Moreover, our hierarchical LBA model fitting method might be particularly advantageous for studying the neural mechanisms of value accumulation when used in combination with the “joint modeling framework”, which was designed to simultaneously explain neuroimaging and choice data [25], [26]. Using this framework, Turner et al. [25] have shown that it is possible to link neural and behavioral measures in a way that maps the mechanisms assumed by cognitive models directly to neural function. This approach allows for the specification of a priori predictions for how neural mechanisms should influence the modeled cognitive processes that presumably best explain behavior, providing a basis for hypothesis tests that are simultaneously informed by neural data, model parameters, and behavior.

Our results revealed a relationship between response time and choice probability, such that low probability choices are associated with increased response time. Similar results have been observed in previous studies using accumulation models to account for behavior in risk preference [27]–[29] and simple choice tasks [30]–[34]. Our observation that the LBA model can accommodate the relationship between response times and choice probability during intertemporal choice is thus consistent with previous findings and suggests that the LBA model might also be useful in accounting for behavior in other value-based decision domains.
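A small simulation can illustrate how an accumulator race ties response time to choice probability. The sketch below is a generic two-accumulator LBA with arbitrary parameter values, not the fitted model from this study. The names `simulate_lba` and `summarize`, the threshold, start-point range, drift variability, and non-decision time are all assumptions made for illustration. Accumulator 0 stands for the delayed reward and accumulator 1 for the immediate reward, with mean drift rates that sum to one.

```python
import random
import statistics


def simulate_lba(drift_means, n_trials, threshold=1.0, start_max=0.5,
                 drift_sd=0.3, non_decision=0.2, seed=0):
    """Simulate a two-accumulator LBA; return a list of (choice, rt) tuples."""
    rng = random.Random(seed)
    trials = []
    while len(trials) < n_trials:
        finish_times = []
        for mean in drift_means:
            start = rng.uniform(0.0, start_max)   # uniform start point
            drift = rng.gauss(mean, drift_sd)     # trial-specific drift rate
            # An accumulator with non-positive drift never reaches threshold.
            if drift > 0:
                finish_times.append((threshold - start) / drift)
            else:
                finish_times.append(float("inf"))
        fastest = min(finish_times)
        if fastest == float("inf"):
            continue  # neither accumulator finished; redraw the trial
        trials.append((finish_times.index(fastest), fastest + non_decision))
    return trials


def summarize(trials):
    """Return P(choose accumulator 0) and the median response time."""
    p_delayed = sum(1 for choice, _ in trials if choice == 0) / len(trials)
    median_rt = statistics.median(rt for _, rt in trials)
    return p_delayed, median_rt


# Easy condition: drift rates far apart. Hard condition: drift rates close.
easy_trials = simulate_lba((0.8, 0.2), n_trials=20000, seed=1)
hard_trials = simulate_lba((0.55, 0.45), n_trials=20000, seed=1)
p_easy, rt_easy = summarize(easy_trials)
p_hard, rt_hard = summarize(hard_trials)
print(f"easy: P(delayed)={p_easy:.2f}, median RT={rt_easy:.2f}")
print(f"hard: P(delayed)={p_hard:.2f}, median RT={rt_hard:.2f}")
```

The median is used instead of the mean because finishing times are proportional to the reciprocal of the drift rate, which produces a heavy right tail. In this sketch, the condition with choice probability closer to 0.5 also has the slower median response, so a single set of drift rates determines both measures.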

Our best-fitting model included variability in drift rates and non-decision times across value conditions. This result violated our a priori expectation that drift rate variability across value conditions would be sufficient to account for our behavioral manipulation. However, the model containing non-decision time variability performed only slightly better than the simplest model, which was the one consistent with our theoretical expectation. Thus, from a purely theoretical standpoint, we favor the simplest model. For methodological consistency and empirical validity, however, we retained and analyzed the fits obtained from the best-fitting model. The BPIC statistic provides a measure of model quality that penalizes for the total number of parameters in the model [18]. Relying on the BPIC statistic, we corroborated our prediction that very few parameters needed to vary across conditions, but also found that the best model was not the simplest one. Future studies using the LBA could test whether the simplest model in fact generalizes better than the model with variability in non-decision time.
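The trade-off between fit and complexity that BPIC captures can be sketched with a commonly used approximation: the posterior mean deviance plus twice the effective number of parameters. This text does not state the formula used in the study, so the expression below is an assumption. The function name `bpic` and the deviance values are invented for illustration only.

```python
import statistics


def bpic(deviance_samples, deviance_at_posterior_mean):
    """Approximate BPIC as mean deviance + 2 * p_D.

    p_D (effective number of parameters) is the mean posterior deviance
    minus the deviance evaluated at the posterior mean of the parameters.
    Lower BPIC indicates a better trade-off between fit and complexity.
    """
    mean_deviance = statistics.fmean(deviance_samples)
    effective_params = mean_deviance - deviance_at_posterior_mean
    return mean_deviance + 2.0 * effective_params


# Toy comparison with invented deviances: the flexible model fits slightly
# better on average but pays a larger complexity penalty.
simple_model = bpic([102.0, 104.0, 106.0], deviance_at_posterior_mean=100.0)
flexible_model = bpic([101.0, 103.0, 105.0], deviance_at_posterior_mean=97.0)
print(simple_model, flexible_model)  # prints 112.0 115.0
```

In this toy case the simpler model has the lower score even though its mean deviance is higher, which is the kind of penalty for added parameters described above.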

We showed that drift rates estimated with the model are directly related to discounted subjective values independently derived from behavioral models of intertemporal choice. The drift rate parameters of the LBA model therefore have a direct psychological interpretation and suggest a powerful means to estimate subjective values independent of assuming and fitting a specific form for temporal discounting (e.g. the hyperbolic model in Equation 1). In contrast, we are uncertain about how to interpret the variability in non-decision times across value conditions. On average, non-decision times decreased with increased difficulty. Moreover, although median RT showed a modest relationship with non-decision times, median RT increased with choice difficulty, reflecting a dependence on accumulation rates. Non-decision times also correlated strongly with minimum RT, which did not vary systematically across value conditions, but was highly variable across subjects. This suggests that our best-fitting model is reflecting the fact that minimum RT varies considerably across value conditions. It is unclear what to conclude from these findings. Our belief is that non-decision times capture idiosyncratic differences in choice strategies and valuation processes across subjects and that incorporating a parameter to absorb these trends improves model fits overall and the interpretability of drift rates more specifically.

In summary, we have demonstrated that an LBA model provides an excellent description of the choice process in intertemporal decision making. The model fits RT distributions, provides an explanation for interdependence between RT and choice probability, and can be interpreted in terms of value accumulation. These results validate the LBA model as a complementary tool to temporal discounting models for studying the cognitive and neural mechanisms of intertemporal choice. Because the LBA has been applied to a wide range of perceptual decision making tasks, our findings not only demonstrate that a general mechanism of evidence accumulation drives decision making but also support a common and analytically tractable framework for explaining it.

Funding Statement

This work was supported by the National Science Foundation's (NSF) Graduate Research Fellowship Program (GRFP), the NSF's Integrative Graduate Education and Research Traineeship (IGERT) program, and the National Institutes of Health (F32GM103288). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Ainslie G (1991) Derivation of “rational” economic behavior from hyperbolic discount curves. The American Economic Review 81: 334–340.
  • 2. van den Bos W, McClure SM (2013) Towards a general model of temporal discounting. Journal of the Experimental Analysis of Behavior 99: 58–73.
  • 3. Kable JW, Glimcher PW (2009) The neurobiology of decision: Consensus and controversy. Neuron 63: 733–745.
  • 4. Rangel A, Camerer C, Montague PR (2008) A framework for studying the neurobiology of value-based decision making. Nature Reviews Neuroscience 9: 545–556.
  • 5. Gold JI, Shadlen MN (2007) The neural basis of decision making. Annual Review of Neuroscience 30: 535–574.
  • 6. Ratcliff R (1978) A theory of memory retrieval. Psychological Review 85: 59–108.
  • 7. Usher M, McClelland JL (2001) On the time course of perceptual choice: The leaky competing accumulator model. Psychological Review 108: 550–592.
  • 8. Brown S, Heathcote A (2008) The simplest complete model of choice reaction time: Linear ballistic accumulation. Cognitive Psychology 57: 153–178.
  • 9. Mazur JE (1987) An adjusting procedure for studying delayed reinforcement. In: Quantitative Analysis of Behavior: The Effects of Delay and Intervening Events on Reinforcement Value, Hillsdale, New Jersey: Lawrence Erlbaum Associates. pp. 55–73.
  • 10. Shiffrin RM, Lee MD, Kim W, Wagenmakers EJ (2008) A survey of model evaluation approaches with a tutorial on hierarchical Bayesian methods. Cognitive Science 32: 1248–1284.
  • 11. Lee MD, Wagenmakers EJ (2012) A course in Bayesian graphical modeling for cognitive science. Available: http://www.ejwagenmakers.com/BayesCourse/BayesBookWeb.pdf. Accessed 2012 Jan 1.
  • 12. Lee MD (2008) Three case studies in the Bayesian analysis of cognitive models. Psychonomic Bulletin and Review 15: 1–15.
  • 13. Rouder JN, Lu J (2005) An introduction to Bayesian hierarchical models with an application in the theory of signal detection. Psychonomic Bulletin and Review 12: 573–604.
  • 14. Rouder JN, Lu J, Speckman P, Sun D, Jiang Y (2005) A hierarchical model for estimating response time distributions. Psychonomic Bulletin and Review 12: 195–223.
  • 15. Turner BM, Sederberg PB, Brown S, Steyvers M (2013) A method for efficiently sampling from distributions with correlated dimensions. Psychological Methods 18: 368–384.
  • 16. Gelman A, Carlin JB, Stern HS, Rubin DB (2004) Bayesian Data Analysis. New York, NY: Chapman and Hall.
  • 17. ter Braak CJF (2006) A Markov chain Monte Carlo version of the genetic algorithm Differential Evolution: easy Bayesian computing for real parameter spaces. Statistics and Computing 16: 239–249.
  • 18. Ando T (2007) Bayesian predictive information criterion for the evaluation of hierarchical Bayesian and empirical Bayes models. Biometrika 94: 443–458.
  • 19. Ratcliff R, Rouder JN (1998) Modeling response times for two-choice decisions. Psychological Science 9: 347–356.
  • 20. Forstmann BU, Tittgemeyer M, Wagenmakers EJ, Derrfuss J, Imperati D, et al. (2011) The speed-accuracy tradeoff in the elderly brain: A structural model-based approach. Journal of Neuroscience 31: 17242–17249.
  • 21. Forstmann BU, Dutilh G, Brown S, Neumann J, von Cramon DY, et al. (2008) Striatum and pre-SMA facilitate decision-making under time pressure. Proceedings of the National Academy of Sciences 105: 17538–17542.
  • 22. Forstmann BU, Anwander A, Schäfer A, Neumann J, Brown S, et al. (2010) Cortico-striatal connections predict control over speed and accuracy in perceptual decision making. Proceedings of the National Academy of Sciences 107: 15916–15920.
  • 23. Shadlen MN, Newsome WT (2001) Neural basis of a perceptual decision in the parietal cortex (area LIP) of the rhesus monkey. Journal of Neurophysiology 86: 1916–1936.
  • 24. Roitman JD, Shadlen MN (2002) Response of neurons in the lateral intraparietal area during a combined visual discrimination reaction time task. The Journal of Neuroscience 22: 9475–9489.
  • 25. Turner BM, Forstmann BU, Wagenmakers EJ, Brown SD, Sederberg PB, et al. (2013) A Bayesian framework for simultaneously modeling neural and behavioral data. NeuroImage.
  • 26. Turner BM (2013) Constraining cognitive abstractions through Bayesian modeling. In: Forstmann BU, Wagenmakers EJ, editors, An introduction to model-based cognitive neuroscience, New York, NY: Springer. p. 22.
  • 27. Busemeyer JR (1985) Decision making under uncertainty: a comparison of simple scalability, fixed-sample, and sequential-sampling models. Journal of Experimental Psychology: Learning, Memory, and Cognition 11: 538.
  • 28. Busemeyer JR, Townsend JT (1993) Decision field theory: a dynamic-cognitive approach to decision making in an uncertain environment. Psychological Review 100: 432.
  • 29. Hunt LT, Kolling N, Soltani A, Woolrich MW, Rushworth MF, et al. (2012) Mechanisms underlying cortical activity during value-guided choice. Nature Neuroscience 15: 470–476.
  • 30. Krajbich I, Armel C, Rangel A (2010) Visual fixations and the computation and comparison of value in simple choice. Nature Neuroscience 13: 1292–1298.
  • 31. Krajbich I, Rangel A (2011) Multialternative drift-diffusion model predicts the relationship between visual fixations and choice in value-based decisions. Proceedings of the National Academy of Sciences 108: 13852–13857.
  • 32. Milosavljevic M, Malmaud J, Huth A, Koch C, Rangel A (2010) The drift diffusion model can account for value-based choice response times under high and low time pressure. Judgement & Decision Making 5: 437–449.
  • 33. De Martino B, Fleming SM, Garrett N, Dolan RJ (2012) Confidence in value-based choice. Nature Neuroscience 16: 105–110.
  • 34. Towal RB, Mormann M, Koch C (2013) Simultaneous modeling of visual saliency and value computation improves predictions of economic choice. Proceedings of the National Academy of Sciences 110: E3858–E3867.

Articles from PLoS ONE are provided here courtesy of PLOS
