Dopaminergic Modulation of Human Intertemporal Choice: A Diffusion Model Analysis Using the D2-Receptor Antagonist Haloperidol

Ben Wagner; Mareike Clos; Tobias Sommer; Jan Peters

doi:10.1523/JNEUROSCI.0592-20.2020

J Neurosci. 2020 Oct 7;40(41):7936–7948. doi: 10.1523/JNEUROSCI.0592-20.2020

PMCID: PMC7548690 PMID: 32948675

Abstract

The neurotransmitter dopamine is implicated in diverse functions, including reward processing, reinforcement learning, and cognitive control. The tendency to discount future rewards over time has long been discussed in the context of potential dopaminergic modulation. Here we examined the effect of a single dose of the D2 receptor antagonist haloperidol (2 mg) on temporal discounting in healthy female and male human participants. Our approach extends previous pharmacological studies in two ways. First, we applied combined temporal discounting drift diffusion models to examine choice dynamics. Second, we examined dopaminergic modulation of reward magnitude effects on temporal discounting. Hierarchical Bayesian parameter estimation revealed that the data were best accounted for by a temporal discounting drift diffusion model with nonlinear trialwise drift rate scaling. This model showed good parameter recovery, and posterior predictive checks revealed that it accurately reproduced the relationship between decision conflict and response times in individual participants. We observed reduced temporal discounting and substantially faster nondecision times under haloperidol compared with placebo. Discounting was steeper for low versus high reward magnitudes, but this effect was largely unaffected by haloperidol. Results were corroborated by model-free analyses and modeling via more standard approaches. We previously reported elevated caudate activation under haloperidol in this sample of participants, supporting the idea that haloperidol elevated dopamine neurotransmission (e.g., by blocking inhibitory feedback via presynaptic D2 auto-receptors). The present results reveal that this is associated with an augmentation of both lower-level (nondecision time) and higher-level (temporal discounting) components of the decision process.

SIGNIFICANCE STATEMENT Dopamine is implicated in reward processing, reinforcement learning, and cognitive control. Here we examined the effects of a single dose of the D2 receptor antagonist haloperidol on temporal discounting and choice dynamics during the decision process. We extend previous studies by applying computational modeling using the drift diffusion model, which revealed that haloperidol reduced the nondecision time and reduced impulsive choice compared with placebo. These findings are compatible with a haloperidol-induced increase in striatal dopamine (e.g., because of a presynaptic mechanism). Our data provide novel insights into the contributions of dopamine to value-based decision-making and highlight how comprehensive model-based analyses using sequential sampling models can inform the effects of pharmacological modulation on choice processes.

Keywords: computational modeling, decision making, dopamine, haloperidol, intertemporal choice, pharmacology

Introduction

Future rewards are discounted in value (Peters and Büchel, 2011) such that humans and many animals prefer smaller-sooner (SS) rewards over larger-but-later (LL) rewards (temporal discounting). Steep discounting of reward value is associated with a range of maladaptive behaviors ranging from substance use disorders (Bickel et al., 2014), attention-deficit hyperactivity disorder (Jackson and MacKillop, 2016), and obesity (Amlung et al., 2016) to behavioral addictions, such as gambling disorder (Wiehler and Peters, 2015). Temporal discounting has thus been suggested to constitute a transdiagnostic process (Amlung et al., 2019; Lempert et al., 2019) with relevance for many psychiatric conditions.

Dopamine (DA) plays a central role in addiction (Robinson and Berridge, 1993). In rodents, reductions versus moderate increases in DA transmission led to increases and decreases in discounting, respectively, whereas the corresponding human literature is small and more heterogeneous (D'Amour-Horvat and Leyton, 2014). For example, de Wit et al. (2002) found that acute administration of d-amphetamine decreased impulsivity, such that temporal discounting was reduced under d-amphetamine. However, a later study did not replicate this effect (Acheson and de Wit, 2008). Administration of the D2/D3 receptor agonist pramipexole did not affect measures of impulsivity in another study (n = 10) from the same group (Hamidovic et al., 2008). In contrast, Pine et al. (2010) observed increased temporal discounting following administration of the catecholamine precursor L-DOPA compared with placebo in healthy control participants (n = 13), while the D2-receptor antagonist haloperidol did not modulate discounting. In a recent within-subjects study using L-DOPA in a substantially larger sample (n = 87), there was no overall effect on temporal discounting (Petzold et al., 2019). Rather, effects depended on baseline impulsivity, which the authors interpreted in the context of the inverted-U-model of DA effects on cognitive control functions (Cools and D'Esposito, 2011). Two recent studies have reported a reduction in discounting following administration of the selective D2/D3-receptor antagonist amisulpride (Weber et al., 2016) as well as the D2 receptor antagonist metoclopramide (Arrondo et al., 2015). Although the latter is primarily used clinically for its peripheral effects, it can pass the blood-brain barrier and act centrally (Shakhatreh et al., 2019).

A similar heterogeneity is evident when considering model-based reinforcement learning (RL) (Doll et al., 2012), which in some studies (Shenhav et al., 2017), but not others (Solway et al., 2017), was associated with reduced temporal discounting. However, in contrast to temporal discounting (see above), L-DOPA instead increased reliance on model-based RL in healthy controls (Wunderlich et al., 2012) and Parkinson's disease patients (Sharp et al., 2016). Notably, this overall effect was not observed in a recent study in a substantially larger sample (n = 65) (Kroemer et al., 2019). Here, increased model-based RL under L-DOPA was restricted to participants with high working memory capacity.

One well-replicated behavioral effect in temporal discounting (magnitude effect) refers to the observation that the rate of temporal discounting decreases with increasing reward magnitude (Green et al., 1997). In humans, this effect depends on lateral PFC processing (Ballard et al., 2017); and in rodents, D-amphetamine effects on temporal discounting are more pronounced for large-magnitude conditions (Krebs et al., 2016). However, it is unclear whether DA impacts the magnitude effect in humans.

In the present study, we examined these processes using a between-subjects double-blind placebo-controlled pharmacological study with the D2-receptor antagonist haloperidol (2 mg). We previously reported increased dorsal striatal activation under haloperidol versus placebo in these participants (Clos et al., 2019a,b), compatible with a predominantly presynaptic effect of haloperidol that increases striatal dopaminergic signaling. Importantly, we extended previous pharmacological studies by applying a temporal discounting modeling framework based on a combination of discounting models with the drift diffusion model (DDM) (Pedersen et al., 2017; Fontanesi et al., 2019; Shahar et al., 2019; Peters and D'Esposito, 2020), allowing us to comprehensively examine drug effects on response time (RT) components related to both valuation and non–valuation-related processes.

Materials and Methods

Participants

Fifty-four healthy participants were initially enrolled in the study. Participants were screened by a physician for current diseases and current intake of prescription drugs or drugs of abuse. All participants were presently in good health and had no history of neurologic or psychiatric disorder with no current intake of prescription medication. Only healthy subjects were allowed to participate. Twenty-seven participants were randomly assigned to each group (placebo/haloperidol). Two participants from the haloperidol group did not complete the temporal discounting task. Technical problems led to working memory data loss from 4 participants (3 from the haloperidol, 1 from the placebo group), but these participants were still included in the temporal discounting data analysis.

Following filtering of RTs (see below; the fastest and slowest 2.5% of trials were excluded per participant), we examined the individual RT histograms for each subject (see Extended Data Fig. 1-1). This revealed that, even after filtering, the 3 participants with the fastest minimum RTs (2 from the haloperidol group and 1 from the placebo group) still showed implausibly fast responses on a number of trials (minimum RTs of 2, 2, and 234 ms in Subjects 24, 25, and 41, respectively), such that their minimum RTs were substantially faster than those of the remaining participants (min(RT) z scores of −2.04, −2.04, and −1.7, respectively; see Extended Data Fig. 1-2). These subjects were therefore excluded from further modeling.

We verified that there were no significant differences in demographic background in terms of age or baseline working memory capacity (Table 1). Potential side effects of the medication were monitored via multiple blood pressure and pulse measurements and evaluated via mood questionnaires. These analyses did not reveal significant group differences in terms of reported mood, side effects, or physiological parameters, as reported in our previous study (Clos et al., 2019b). Before enrollment, participants provided informed written consent, and all study procedures were approved by the local institutional review board (Hamburg Board of Physicians).

Table 1.

Demographic and working memory data^a

	Placebo	Haloperidol	Group comparison
Age (yr)	24.4 ± 3.4	23.3 ± 2.5	t_(45.614) = 1.40, p = 0.17
Sex (M/F)	7/19	6/17	χ²₍₁₎ = 0.001, p = 1
WM baseline (z score)	−0.0453 ± 0.665	0.0943 ± 0.556	t_(46.826) = −0.80, p = 0.43
Weight (kg) (M/F)	70.7 ± 3.39/63.5 ± 3.19	80.5 ± 2.80/62.5 ± 2.32	t_(36.702) = −0.68, p = 0.50

^aData are mean ± SD.

Experimental design

General procedure

The study consisted of two testing sessions performed on separate days. On the first day (T0), participants completed a background screening and a set of working memory tasks (see below). On the second day (T1), participants received either placebo or haloperidol (2 mg). In line with the pharmacokinetics of haloperidol (Franken et al., 2017), testing on T1 was performed 5 h after drug administration to ensure appropriate plasma levels of haloperidol. During the first 2.5 h, participants were under constant observation, and pulse as well as blood pressure levels were checked 30 min and 2 h after drug administration. During the waiting period, participants filled out questionnaires on current mood and medication effects. Participants then completed a number of unrelated tasks during an fMRI scanning session (total scan time 2.5 h). Following scanning, they first completed the temporal discounting task outlined below, followed by a set of working memory tasks (digit span forward and backward, block span forward and backward, complex working memory span) (for detailed results, see Clos et al., 2019b).

Temporal discounting task

Participants performed 210 trials of a temporal discounting task where on each trial they made a choice between an SS reward available immediately and an LL reward. SS and LL rewards were randomly displayed on the left and right sides of the screen, and participants were free to make their choice at any time. For half the trials, the SS reward consisted of 20€; and for the remaining trials, the SS reward was fixed at 100€. These trials were presented randomly intermixed. LL options were computed via all combinations of a set of LL reward amounts (constructed by multiplying the SS reward with [1.01, 1.02, 1.05, 1.10, 1.20, 1.50, 1.80, 2.50, 2, 3, 4, 5, 7, 10, 13]) and LL delays (1, 2, 3, 5, 8, 30, 60 d), yielding 105 trials in total per magnitude condition. As is typically the case for temporal discounting tasks investigating magnitude effects (Green et al., 1997), all choices were hypothetical.
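The construction of the trial set described above can be sketched in a few lines. This is an illustrative sketch only: the function and field names are hypothetical, rounding LL amounts to two decimals is an assumption, and the random left/right placement and trial order are omitted.

```python
from itertools import product

# Multipliers and delays as listed in the task description.
LL_MULTIPLIERS = [1.01, 1.02, 1.05, 1.10, 1.20, 1.50, 1.80, 2.50,
                  2, 3, 4, 5, 7, 10, 13]
LL_DELAYS_DAYS = [1, 2, 3, 5, 8, 30, 60]
SS_AMOUNTS = [20, 100]  # euros: low- and high-magnitude conditions


def build_trials():
    """Return one dict per trial (hypothetical field names)."""
    trials = []
    for ss in SS_AMOUNTS:
        for mult, delay in product(LL_MULTIPLIERS, LL_DELAYS_DAYS):
            trials.append({
                "ss_amount": ss,
                # Assumption: amounts rounded to cents.
                "ll_amount": round(ss * mult, 2),
                "ll_delay": delay,
            })
    return trials


trials = build_trials()
```

With 15 multipliers and 7 delays, each magnitude condition contributes 105 trials, which gives the 210 trials reported for the task.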

Computational modeling

Temporal discounting model

We applied a simple single-parameter hyperbolic discounting model to describe how value changes as a function of delay (Mazur, 1987; Green and Myerson, 2004) as follows:

$$SV(LL_t) = \frac{A_t}{1 + \exp(k + s_k \cdot I_t) \cdot D_t}$$

(1)

Here, A_t is the numerical reward amount of the LL option on trial t, D_t is the LL delay in days on trial t, and I_t is an indicator variable that takes on a value of 0 for trials from the large-magnitude condition (SS amount = 100€) and 1 for trials from the small-magnitude condition (SS amount = 20€). The model has two free parameters: k is the hyperbolic discounting rate from the large-magnitude condition (modeled in log-space) and s_k is a weighting parameter that models the degree of change in discounting for small versus large SS rewards (i.e., higher values in s_k reflect a greater magnitude effect) (Green et al., 1997).
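As a minimal sketch of Equation 1, the discounted value of an LL option could be computed as follows. The function and argument names are hypothetical, and the parameter values used in any call are illustrative rather than estimates from this study.

```python
import math


def subjective_value_ll(amount, delay_days, log_k, s_k, is_low_magnitude):
    """Hyperbolic discounting (Eq. 1) with the rate modeled in log-space.

    is_low_magnitude is the indicator I_t (0 for SS = 100 euros,
    1 for SS = 20 euros).
    """
    k = math.exp(log_k + s_k * is_low_magnitude)
    return amount / (1.0 + k * delay_days)
```

Because the shift s_k is added in log-space, a positive s_k multiplies the discount rate in the low-magnitude condition by exp(s_k), which lowers the subjective value of a given delayed reward.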

Softmax action selection

Softmax action selection models choice probabilities as a sigmoid function of value differences (Sutton and Barto, 1998) as follows:

$$P(LL)_t = \frac{\exp\left((\beta + s_\beta \cdot I_t) \cdot SV(LL_t)\right)}{\exp\left((\beta + s_\beta \cdot I_t) \cdot SV(SS_t)\right) + \exp\left((\beta + s_\beta \cdot I_t) \cdot SV(LL_t)\right)}$$

(2)

Here, SV is the subjective value of the delayed (LL) reward according to Equation 1 and β is an inverse temperature parameter, modeling choice stochasticity (for β = 0, choices are random and as β increases, choices become more dependent on the option values). SV(SS_t) was fixed at 100 for the large-magnitude condition and fixed at 20 for the small-magnitude condition. I_t is again the dummy-coded condition regressor, and s_β models the magnitude effect on β.
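The Softmax rule of Equation 2 can be illustrated with a short sketch. It uses the algebraically equivalent logistic form, P(LL) = 1 / (1 + exp(−β_eff · (SV(LL) − SV(SS)))), evaluated in a way that avoids overflow for large value differences. The function and argument names are hypothetical.

```python
import math


def p_choose_ll(sv_ll, sv_ss, beta, s_beta, is_low_magnitude):
    """Probability of choosing the LL option (Eq. 2)."""
    beta_eff = beta + s_beta * is_low_magnitude
    x = beta_eff * (sv_ll - sv_ss)
    # Branch on the sign of x so that exp() never receives a large
    # positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)
```

For β = 0 the function returns 0.5 regardless of the option values, matching the description of random choice given above.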

Temporal discounting DDMs

To more comprehensively examine dopaminergic effects on choice dynamics, we additionally replaced Softmax action selection with a series of DDM-based choice rules. In the DDM, choices arise from a noisy evidence accumulation process that terminates as soon as the accumulated evidence exceeds one of two response boundaries. In the present setting, the upper boundary was defined as selection of the LL option, whereas the lower boundary was defined as selection of the SS option.

RTs for choices of the SS option were multiplied by −1 before model fitting. We furthermore used a percentile-based cutoff, such that, for each participant, the fastest and slowest 2.5% of trials were excluded from the analysis. We then first examined a null model (DDM₀) without any value modulation. Here, the RT on each trial t is distributed according to the Wiener First Passage Time (wfpt) as follows:

$$RT_t \sim \mathrm{wfpt}(\alpha + s_\alpha \cdot I_t,\ \tau + s_\tau \cdot I_t,\ z + s_z \cdot I_t,\ v + s_v \cdot I_t)$$

(3)

The parameter α models the boundary separation (i.e., the amount of evidence required before committing to a decision), τ models the nondecision time (i.e., components of the RT related to motor preparation and stimulus processing), z models the starting point of the evidence accumulation process (i.e., a bias toward one of the response boundaries, with z > 0.5 reflecting a bias toward the LL boundary, and z < 0.5 reflecting a bias toward the SS boundary), and v models the rate of evidence accumulation. For each parameter x, we also included a parameter s_x that models the change in that parameter from the high-magnitude (SS = 100€) to the low-magnitude (SS = 20€) condition (coded via the dummy-coded condition regressor I_t).
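The RT preprocessing described before Equation 3 (percentile-based trimming and sign coding of SS choices) might look roughly like the sketch below. The function name is hypothetical, trimming is applied to the unsigned RTs, and rounding the number of trimmed trials down is an assumption; the original analysis may have handled ties and rounding differently.

```python
def preprocess_rts(rts, chose_ll, trim_per_mille=25):
    """Trim the fastest and slowest 2.5% of trials, then sign-code RTs.

    rts: unsigned response times for one participant.
    chose_ll: booleans, True where the LL option was chosen.
    Returns RTs in original trial order, negative for SS choices.
    """
    n = len(rts)
    # Integer arithmetic avoids floating-point surprises; rounds down.
    n_cut = n * trim_per_mille // 1000
    order = sorted(range(n), key=lambda i: rts[i])
    kept = sorted(order[n_cut:n - n_cut])
    return [rts[i] if chose_ll[i] else -rts[i] for i in kept]
```

The signed RTs are then in the format expected by the boundary coding used here, with the upper boundary corresponding to LL choices and the lower boundary to SS choices.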

As in previous work (Pedersen et al., 2017; Fontanesi et al., 2019; Peters and D'Esposito, 2020), we then set up temporal discounting diffusion models by making trialwise drift rates proportional to the difference in subjective values between options. First, we set up a linear modeling scheme (DDM_lin) (Pedersen et al., 2017) as follows:

$$v_t = (v_{coeff} + s_{v_{coeff}} \cdot I_t) \cdot (SV(LL_t) - SV(SS_t))$$

(4)

Here, the drift rate on trial t is calculated as the scaled value difference between the LL and SS rewards. As noted above, RTs for SS options were multiplied by −1 before model estimation, such that this formulation predicts SS choices whenever SV(SS) > SV(LL) (the trialwise drift rate is negative) and predicts longest RTs for trials with the highest decision conflict (i.e., in the case of SV(SS) = SV(LL) the trialwise drift rate is zero). We next examined a DDM with nonlinear trialwise drift rate scaling (DDM_S) that has recently been reported to account for the value dependency of RTs better than the DDM_lin (Fontanesi et al., 2019; Peters and D'Esposito, 2020). In this model, the scaled value difference from Equation 4 is additionally passed through a sigmoid function with asymptote v_max as follows:

$$v_t = S\left[(v_{coeff} + s_{v_{coeff}} \cdot I_t) \cdot (SV(LL_t) - SV(SS_t))\right]$$

(5)

$$S(m) = \frac{2 \cdot (v_{max} + s_{v_{max}} \cdot I_t)}{1 + \exp(-m)} - (v_{max} + s_{v_{max}} \cdot I_t)$$

(6)

All parameters, including v_coeff and v_max, were again allowed to vary according to the reward magnitude condition, such that we included s_x parameters for each parameter x that were multiplied with the dummy-coded condition predictor I_t (see above).
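To make the difference between Equations 4 and 5–6 concrete, the two drift rate mappings can be sketched as follows. Function names are hypothetical, and the condition-specific values (x + s_x · I_t) are assumed to be computed by the caller and passed in as v_coeff and v_max. The sigmoid uses the identity 2/(1 + exp(−m)) − 1 = tanh(m/2), which is numerically stable for large |m|.

```python
import math


def drift_linear(sv_ll, sv_ss, v_coeff):
    """Linear drift rate scaling (Eq. 4)."""
    return v_coeff * (sv_ll - sv_ss)


def drift_sigmoid(sv_ll, sv_ss, v_coeff, v_max):
    """Nonlinear drift rate scaling (Eqs. 5 and 6).

    Equivalent to 2 * v_max / (1 + exp(-m)) - v_max, bounded by +/- v_max.
    """
    m = v_coeff * (sv_ll - sv_ss)
    return v_max * math.tanh(m / 2.0)
```

Both mappings give a drift rate of zero when SV(LL) = SV(SS), but only the sigmoid version saturates at ±v_max for large value differences.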

Hierarchical linear regression

Here we used the median posterior log(k) parameter of each participant from the DDM_S model (see above) to compute the discounted values for all LL options. We then computed the trialwise decision conflict from the absolute difference between the subjective value of the LL reward and the corresponding SS reward. To ensure that the intercept in the regression model corresponds to the RT for the lowest decision conflict and to account for the strongly skewed distribution of value differences, we took the inverse of the absolute difference in SS and discounted LL values in each trial. To avoid numerical instabilities when taking the inverse of absolute differences < 1 (high conflict, e.g., SV(LL) = 20.10€, SS = 20€), these value differences were floored at 1 (i.e., absolute differences < 1 were set to 1) before computing the inverse. We then ran a hierarchical linear regression model in JAGS with 1/RT (to account for the skewed RT distribution) as dependent variable and decision conflict (inverse of the absolute value difference) as a predictor.
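The conflict predictor described above reduces to a one-line transformation. The sketch below uses a hypothetical function name and assumes that value differences below 1 are set to 1 before inversion, so that the predictor lies in (0, 1].

```python
def conflict_predictor(sv_ll, sv_ss):
    """Inverse absolute value difference, with differences below 1 set to 1."""
    abs_diff = abs(sv_ll - sv_ss)
    return 1.0 / max(abs_diff, 1.0)
```

For the example given in the text (SV(LL) = 20.10€, SS = 20€), the predictor takes its maximum value of 1, whereas a value difference of 40€ yields 0.025.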

Statistical analyses

Hierarchical Bayesian models

Models were fit to all trials from all participants using a hierarchical Bayesian modeling approach with separate group-level distributions for all parameters for the placebo and haloperidol groups. Model fitting was performed using Markov Chain Monte Carlo as implemented in the JAGS software package (Plummer, 2003) (version 4.3) using the Wiener module for JAGS that implements the Wiener First Passage Time (Wabersich and Vandekerckhove, 2014) (see Eq. 3) in combination with R (version 3.4) and the R2Jags package. For group-level means, we used uniform priors defined over numerically plausible parameter ranges (see Code and data availability). For all s_x parameters modeling condition effects on model parameters, we used Gaussian priors with means of 0 and SDs of 2. For group-level precisions, we used gamma-distributed priors (0.001, 0.001). We initially ran 2 chains with a burn-in period of 900,000 samples and thinning of two. Chain convergence was then assessed via the Gelman-Rubin convergence diagnostic $\hat{R}$, and sampling was continued until $1 \leq \hat{R} \leq 1.1$ for all group-level and individual-subject parameters. This occurred after a maximum of 1.3 million samples. For most parameters, $1 \leq \hat{R} \leq 1.01$ (Softmax: all parameters; DDM₀: all parameters; DDM_lin: 5 parameters with $1.01 \leq \hat{R} \leq 1.1$; DDM_S: 9 parameters with $1.01 \leq \hat{R} \leq 1.1$). Relative model comparison was performed via the deviance information criterion (DIC), where lower values reflect a superior fit of the model (Spiegelhalter et al., 2002). A total of 10,000 additional samples were then retained for further analysis. We show posterior group distributions for all parameters of interest as well as their 85% and 95% highest density intervals (HDIs).

For group comparisons, we report Bayes factors (BFs) for directional effects (Kass and Raftery, 1995) for the hyperparameter difference distributions of placebo − haloperidol, estimated via kernel density estimation using R (version 4.01) via the RStudio (version 1.3) interface. These are computed as the ratio of the integral of the posterior difference distribution from 0 to ∞ versus the integral from 0 to −∞. Using common criteria (Beard et al., 2016), we considered BFs between 1 and 3 as anecdotal evidence, BFs >3 as moderate evidence, and BFs >10 as strong evidence. BFs >30 and >100 were considered as very strong and extreme evidence, respectively, whereas the inverses of these values reflect evidence in favor of the opposite hypothesis.
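The logic of the directional BF can be illustrated with a simplified sketch. Note that it substitutes a plain ratio of sample counts for the kernel density estimate used in the analysis, so it approximates the same ratio of posterior mass above versus below zero but is not the procedure applied here. Function names are hypothetical.

```python
def directional_bf(diff_samples):
    """Ratio of posterior mass above 0 to mass below 0.

    diff_samples: posterior samples of the placebo - haloperidol difference.
    Simplification: sample counts replace kernel density estimation.
    """
    n_pos = sum(1 for d in diff_samples if d > 0)
    n_neg = sum(1 for d in diff_samples if d < 0)
    if n_neg == 0:
        return float("inf")
    return n_pos / n_neg


def evidence_label(bf):
    """Map a directional BF onto the verbal categories listed above."""
    strength = float("inf") if bf <= 0 else max(bf, 1.0 / bf)
    direction = "placebo > haloperidol" if bf >= 1 else "placebo < haloperidol"
    cutoffs = [(100, "extreme"), (30, "very strong"),
               (10, "strong"), (3, "moderate")]
    for threshold, label in cutoffs:
        if strength > threshold:
            return f"{label} evidence for {direction}"
    return "anecdotal evidence"
```

With a finite number of samples, the count-based ratio becomes unstable when almost all posterior mass lies on one side of zero, which is one motivation for smoothing the difference distribution before integrating.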

Parameter recovery analyses

To ensure that the parameters underlying the data-generating process could be recovered using our modeling procedures, we performed parameter recovery analyses for the best-fitting model (DDM_S). During model estimation, we generated 10,000 datasets simulated from the posterior distribution of the DDM_S. Ten of these simulated datasets were randomly selected and refit with the DDM_S (see previous section) (Fontanesi et al., 2019; Peters and D'Esposito, 2020). Parameter recovery was then assessed in two ways. For group-level parameters, we examined whether the estimated 95% highest posterior density intervals contained the true generating parameters. For subject-level parameters, we examined scatter plots of generating versus estimated single-subject parameters, pooled across all 10 simulations.

Posterior predictive checks

To check whether the best-fitting model indeed captured key aspects of the data, in particular the value dependency for RTs, we performed posterior predictive checks (Peters and D'Esposito, 2020) as follows. For each individual participant, we binned trials into five bins, according to the absolute difference in LL versus SS value (“decision conflict,” computed according to each participant's median posterior log(k) parameter from the DDM_S, and separately for the high- and low-magnitude conditions). For each participant and condition, we then plotted the mean observed RTs as a function of decision conflict, as well as the mean RTs across 10,000 datasets simulated from the posterior distributions of the DDM₀, DDM_lin and DDM_S.
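The binning step of the posterior predictive check can be sketched as below. The function name is hypothetical, and splitting trials into five equal-count bins by rank is an assumption, since the binning rule is not specified further above. The same function would be applied to observed RTs and to each simulated dataset.

```python
def mean_rt_by_conflict_bin(abs_value_diffs, rts, n_bins=5):
    """Mean RT per bin of absolute LL-vs-SS value difference.

    Trials are sorted by value difference and split into n_bins
    equal-count bins (assumes at least n_bins trials).
    """
    order = sorted(range(len(rts)), key=lambda i: abs_value_diffs[i])
    means = []
    for b in range(n_bins):
        lo = b * len(order) // n_bins
        hi = (b + 1) * len(order) // n_bins
        idx = order[lo:hi]
        means.append(sum(rts[i] for i in idx) / len(idx))
    return means
```

Comparing the observed bin means with the bin means averaged over simulated datasets shows whether a model reproduces the slowing of responses at small value differences.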

Code and data availability

Model code is available on the Open Science Framework (https://osf.io/wm7ud/). Raw choice data are available from Zenodo.org (https://doi.org/10.5281/zenodo.4006531) for researchers meeting the criteria for access to confidential data.

Results

Subjective and physiological drug effects

As reported in detail in our previous papers (Clos et al., 2019a,b), there were no significant group differences with respect to reported side effects, subjective mood, heart rate, or blood pressure relative to baseline. Likewise, groups did not differ with respect to the actual and guessed drug condition (haloperidol vs placebo) (Clos et al., 2019b).

Model-free analysis of temporal discounting

Figure 1a shows the overall RT distributions per group with choices of the LL option coded as positive RTs and choices of the SS option coded as negative RTs. As a model-free measure of temporal discounting, we examined proportions of LL choices as a function of group (placebo vs haloperidol) and condition (100€ vs 20€ reference reward). Raw proportions of LL choices are plotted in Figure 1b. ANOVA on arcsine-square-root transformed proportion values with the within-subject factor magnitude (high [100€] vs low [20€] SS reward) and the between-subject factor drug (placebo vs haloperidol) confirmed a significant magnitude effect (F_(1,47) = 96.86, p < 0.001) such that participants overall made more LL selections in the high-magnitude condition. Furthermore, effects of drug (F_(1,47) = 3.47, p = 0.068) and drug × magnitude (F_(1,47) = 3.31, p = 0.075) showed trend-level significance.
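The arcsine-square-root transformation applied to the choice proportions before the ANOVA is a standard variance-stabilizing transform for proportions. A minimal sketch, with a hypothetical function name:

```python
import math


def arcsine_sqrt(proportion):
    """Arcsine-square-root transform of a proportion in [0, 1]."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    return math.asin(math.sqrt(proportion))
```

The transform maps proportions from [0, 1] onto [0, π/2] and stretches values near 0 and 1, where raw proportions have compressed variance.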

Extended Data Figure 1-1

RT histograms for each individual participant (placebo group: blue, haloperidol group: orange) following exclusion of each participant's 2.5% fastest and 2.5% slowest trials. Subjects 24, 25, and 41 still had a number of implausibly fast trials even after filtering (see Extended Data Figure 1-2 and Materials and Methods) and were therefore excluded from modeling.

Extended Data Figure 1-2

Histogram of minimum RTs across subjects following percentile-based trial filtering (i.e., following exclusion of each participant's 2.5% fastest and 2.5% slowest trials). Three participants (leftmost bars in the plot) still had implausibly fast minimum RTs (z scores < −1.65), corresponding to Subjects 24, 25, and 41 from Extended Data Figure 1-1. These participants were excluded from modeling.

Softmax choice rule

First, we analyzed our data using a standard Softmax choice rule (Fig. 2). This analysis revealed an overall drug effect on log(k), such that discounting was substantially lower in the haloperidol group compared with the placebo group (Fig. 2a). Examination of BFs indicated that a decrease in log(k) in haloperidol versus placebo was ∼116 times more likely than an increase (Table 2).

Figure 2. Modeling results (blue: placebo, orange: haloperidol) from a hierarchical Bayesian model with Softmax choice rule. a, Log(k) is the log(discount rate) from the high-magnitude condition (smaller-sooner reward = 100€). b, Log(k)_shift is the change in log(k) from the high-magnitude condition to the low-magnitude condition (smaller-sooner reward = 20€). c, β is the inverse temperature parameter. d, β_shift is the corresponding shift in inverse temperature from the high- to the low-magnitude condition. The thin (thick) horizontal lines denote 95% (85%) highest posterior density intervals.

Table 2.

Summary of group differences in model parameters for the temporal discounting Softmax model^a

	Baseline		Magnitude effect
Parameter	M_diff	dBF	M_diff	dBF
Log(k)	2.66	116.34	−0.10	0.42
Inverse temperature (β)	0.03	1.36	−0.01	0.89

^aFor each parameter, we report mean posterior group differences (M_diff) and BFs (dBF) testing for directional effects on both the baseline parameter in the 100€ condition (left columns) and on the magnitude effect on each parameter (right columns). BFs < 0.33 reflect evidence for placebo < haloperidol, whereas BFs >3 reflect evidence for placebo > haloperidol. For details, see Statistical analyses.

Model comparison

We next compared three versions of the DDM that varied in the way that they accounted for the influence of value differences on trialwise drift rates, based on the DIC (Spiegelhalter et al., 2002). In each model, we included separate group-level distributions for the two drug conditions (haloperidol vs placebo). Furthermore, for each parameter x, we included a shift parameter s_x modeling the change in parameter x from the high-magnitude condition (SS reward = 100€) to the low-magnitude condition (SS reward = 20€) (see Materials and Methods). These s_x parameters were modeled with Gaussian priors with means of zero (see Materials and Methods). A null model assuming constant drift rates independent of value (DDM₀) was compared with two variants of the DDM using either linear (DDM_lin) (Pedersen et al., 2017) or nonlinear (sigmoid) drift rate scaling (DDM_S) (Fontanesi et al., 2019; Peters and D'Esposito, 2020). In both drug conditions as well as overall (Table 3), the data were best accounted for by a DDM with nonlinear drift rate scaling (DDM_S).

Table 3.

Model comparison of three variants of the DDM based on the DIC (Spiegelhalter et al., 2002) where lower values indicate a better model fit^a

Model	DIC (Placebo)	DIC (Haloperidol)	DIC (Full model)
DDM₀	11792.1	10034.5	21833.8
DDM_lin	10835.0	10092.1	20923.9
DDM_S	8586.5	8161.7	16771.8

^aThe data were generally better accounted for by a temporal discounting DDM with nonlinear drift rate scaling (DDM_S) compared with DDM_lin and DDM₀.

We also compared the three diffusion models and the Softmax model with respect to the proportion of binary choices (LL vs SS selections) that they correctly accounted for. As can be seen from Table 4, the DDM_S performed numerically on par with the Softmax model, whereas the DDM_lin performed slightly worse.

Table 4.

Proportion of correctly predicted binary choices for each group and model^a

	Placebo	Haloperidol
Softmax	0.89 (0.77–1.00)	0.90 (0.78–0.98)
DDM₀	0.73 (0.57–1.00)	0.80 (0.60–0.98)
DDM_lin	0.88 (0.71–0.97)	0.85 (0.62–0.98)
DDM_S	0.89 (0.81–1.00)	0.90 (0.82–0.98)

^aData are mean (range).

Overall group differences

We next examined overall group differences in model parameters for the baseline (SS reward =100€) condition. Results are plotted in Figure 3, and BFs for all group comparisons are listed in Table 5. In both groups, there was a positive association between trialwise drift rates and value differences, as the 95% HDI for the drift rate coefficient parameter did not include 0 in either group (Fig. 3b). Likewise, there was a slight bias toward the SS option in both groups, as the 95% HDI for bias was <0.5 in both cases (Fig. 3e).

Table 5.

Summary of group differences in model parameters for the temporal discounting DDM^a

	Baseline		Magnitude effect
Model parameter	M_diff	dBF	M_diff	dBF
Log(k)	2.26	77.9	−0.093	0.47
Drift rate coefficient	−0.365	0.061	0.020	2.73
Nondecision time	0.180	98.4	−0.0001	0.95
Boundary separation	−0.047	0.60	0.017	1.47
Starting point bias	−0.004	0.74	−0.017	0.26
Drift rate maximum	0.18	8.27	0.16	16.88

We furthermore observed substantially lower group-level discount rates log(k) in the haloperidol group compared with placebo, such that the 95% HDI of the posterior group difference in log(k) was >0 (Fig. 3a; Table 5). Interestingly, the nondecision time was likewise substantially lower in the haloperidol group (Fig. 3c; Table 5), amounting, on average, to 180 ms faster nondecision times.

Magnitude effects on model parameters

We next turned to the effects of the magnitude manipulation on diffusion model parameters, that is, the change in each parameter in the low-magnitude condition compared with the high-magnitude baseline condition. Results are plotted in Figure 4, and BFs for all directional group comparisons are listed in Table 5. There was a substantial magnitude effect on log(k), such that discounting was steeper in the low-magnitude condition (Fig. 4a). Interestingly, this pattern of results was not mirrored in the magnitude effect on the starting point/bias parameter. Instead, the bias was shifted in the direction of a neutral bias (0.5) in the low-magnitude condition (Fig. 4e) in both groups. An additional interesting observation is that the nondecision time was increased in the low-magnitude condition by on average ∼30 ms (Fig. 4c).

Figure 4. Posterior distributions (blue: placebo, orange: haloperidol) of the change in each parameter from the high-magnitude (baseline) to the low-magnitude condition (top row: a, Log(k)_shift; b, Drift rate coefficient_shift; c, Nondecision time_shift; d, Boundary separation_shift; e, Starting point bias_shift; f, Drift rate maximum_shift) and corresponding group differences (bottom row, placebo–haloperidol). Thin (thick) horizontal lines denote 95% (85%) highest posterior density intervals.

Both drift rate components (v_coeff and v_max) were increased in the 20€ condition (Fig. 4b,f). This overall effect might in part be attributable to the fact that, in the model, these two parameters effectively scale the trialwise value differences to the appropriate scale of the DDM (Pedersen et al., 2017). Because average value differences spanned a smaller absolute range in the 20€ condition, this is compensated in the model by increasing both v_coeff (Fig. 4b) and v_max (Fig. 4f). Notably, under haloperidol, the drift rate coefficient was somewhat increased, whereas the maximum drift rate was attenuated. There might be some trade-off between the drift rate components, which could contribute to such contrasting effects, such that increases in one component can be compensated by decreases in the other. There was also some evidence for a reduced magnitude effect on the maximum drift rate (Fig. 4f) in the haloperidol group. This could be a reflection of the fact that the magnitude effect on LL choice proportions was numerically attenuated under haloperidol (Fig. 1a), leading to overall more homogeneous values in the two conditions. Difference distributions in the remaining model parameters were centered at zero, indicating no systematic group differences.
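The scaling role of the two drift rate components, and the trade-off between them, can be made concrete with a small sketch. It assumes hyperbolic discounting of the delayed reward and an S-shaped (sigmoid) mapping from trialwise value differences to drift rates that is bounded at ±v_max. Both functional forms and all function names are illustrative assumptions; the Materials and Methods give the exact model specification.

```python
import math


def subjective_value(amount, delay, log_k):
    """Hyperbolic discounting (assumed form): amount / (1 + k * delay)."""
    return amount / (1.0 + math.exp(log_k) * delay)


def drift_rate(value_diff, v_coeff, v_max):
    """Sigmoid scaling (assumed form): 0 at indifference, saturating at +/- v_max."""
    return 2.0 * v_max / (1.0 + math.exp(-v_coeff * value_diff)) - v_max


# Near indifference the mapping is approximately linear with slope v_max * v_coeff / 2,
# so a larger v_coeff paired with a smaller v_max yields similar drift rates there.
small_diff = 0.1
pair_a = drift_rate(small_diff, v_coeff=0.2, v_max=2.0)
pair_b = drift_rate(small_diff, v_coeff=0.4, v_max=1.0)
```

In this sketch the two parameter pairs give nearly identical drift rates for small value differences and diverge only for large ones, which is one way that an increase in one component could be offset by a decrease in the other.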

Correlation of model parameters

For descriptive purposes, we show the full correlation matrices for all single-subject median posterior parameters in Figure 5a for haloperidol and Figure 5b for placebo.

Hierarchical linear regression

We also explored whether the qualitative pattern of results could be reproduced using a hierarchical linear regression, modeling trialwise inverse RTs as a function of value differences (see Materials and Methods). Full posterior distributions of all parameters are shown in Figure 6. This analysis reproduced effects observed for the full DDM. For example, the slope was overall negative, reflecting the decrease in 1/RT for increasing conflict (Fig. 6a). The intercept was numerically smaller under haloperidol (dBF = 0.11; see Table 6), mirroring the drug effect on the nondecision time in the DDM_S. However, a direct comparison with DDM parameters is complicated by the fact that the intercept in the regression model also captures RT components that in the DDM are reflected in the boundary separation, as well as potentially additional nonlinear aspects of the evidence accumulation process that cannot be accounted for by the slope. These effects are visualized in Figure 6e, where we plot the 1/RT predicted by this regression model as a function of group, condition, and decision conflict. This illustrates again the slope effect in the baseline condition and the attenuated intercept under haloperidol.
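The logic of this regression can be sketched for a single participant and condition. The example below is a deliberately simplified, non-hierarchical ordinary least-squares fit, not the hierarchical Bayesian model reported here. It assumes that decision conflict is coded so that 0 marks the lowest conflict; the function name is hypothetical.

```python
def fit_inverse_rt(conflict, rts):
    """Least-squares fit of 1/RT = intercept + slope * conflict for one participant."""
    y = [1.0 / rt for rt in rts]
    n = len(y)
    mean_x = sum(conflict) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in conflict)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(conflict, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # predicted 1/RT at conflict == 0
    return intercept, slope
```

A negative slope in such a fit means that responses slow down as conflict increases, and the intercept gives the predicted 1/RT at the lowest conflict level.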

Figure 6. — Modeling results (blue: placebo, orange: haloperidol) from a hierarchical linear regression with decision conflict as a predictor and 1/RT as dependent variable. Top row: The slope in a, represents the influence of increasing decision conflict (decreasing value differences) on 1/RT. The intercept in c, here corresponds to 1/RT for the lowest decision conflict (highest subjective value difference) from the high magnitude condition (smaller-sooner reward = 100€). Shift-parameters again reflect the change in slope and intercept (b, d) from the high to the low magnitude condition. e, Illustrates 1/RT predicted by this regression model as a function of group, condition and decision conflict. Bottom row: Corresponding group differences (placebo–haloperidol). The thin (thick) horizontal lines denote 95% (85%) highest posterior density intervals.

Table 6.

Summary of group differences in model parameters for the hierarchical linear regression model^a

Model parameter	Baseline		Magnitude effect
Model parameter	M_diff	dBF	M_diff	dBF
Slope	0.02	2.09	−0.07	0.09
Intercept	−0.10	0.11	0.01	2.59

Associations with working memory span

Exploratory analyses did not reveal associations between model parameters of interest (log(k), nondecision time, drift rate scaling) and working memory score (all |r| < 0.38).

Posterior predictive checks

We next performed extensive posterior predictive checks to ensure that the best-fitting model (DDM_S) could account for RTs of individual participants in both groups. To this end, we binned the trials of each individual participant into five bins, according to the absolute difference in LL versus SS value (computed according to each participant's median posterior log(k) parameter from the DDM_S). For each bin, participant, and condition, we then plotted the mean observed RT, as well as the mean simulated RT across 10,000 datasets simulated from the posterior distributions of the DDM₀, DDM_lin, and DDM_S. These results are shown in Figure 7 for the placebo group and Figure 8 for the haloperidol group. As can be seen, the DDM_S provided a much better account of how RTs vary as a function of decision conflict than the DDM_lin in the vast majority of participants in both groups. This was mainly because the DDM_lin overestimated RTs with medium decision conflict and underestimated RTs in cases of very low decision conflict (Peters and D'Esposito, 2020).
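The binning step of this check can be sketched as follows. The example sorts one participant's trials by the absolute LL versus SS value difference, splits them into five bins of (near) equal size, and returns the mean RT per bin. The same function could be applied to observed and to simulated RTs. The function name and the ordering of bins from lowest to highest conflict are assumptions made for illustration.

```python
def bin_mean_rts(value_diffs, rts, n_bins=5):
    """Mean RT per decision-conflict bin, ordered from lowest to highest conflict."""
    # Large absolute value differences mean low conflict, so sort in descending order.
    order = sorted(range(len(rts)), key=lambda i: abs(value_diffs[i]), reverse=True)
    n = len(order)
    means = []
    for b in range(n_bins):
        members = order[b * n // n_bins:(b + 1) * n // n_bins]
        means.append(sum(rts[i] for i in members) / len(members))
    return means
```

Comparing the per-bin means of observed RTs with those of model-simulated RTs then shows whether a model captures how RTs change with decision conflict.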

Figure 7. — Placebo condition posterior predictive checks. For each participant and condition (left facet: high-magnitude condition; right facet: low-magnitude condition), trials were binned into five equal-sized bins according to the absolute difference in subjective value between the LL and SS options (decision conflict bin). Plotted are mean observed RTs per bin (data) as well as model-generated RTs (blue represents DDM₀; red represents DDM_lin; orange represents DDM_S) averaged over 10,000 datasets simulated from the posterior distribution of each hierarchical model.

Figure 8. — Haloperidol condition posterior predictive checks. For each participant and condition (left facet: high-magnitude condition; right facet: low-magnitude condition), trials were binned into five equal-sized bins according to the absolute difference in subjective value between the LL and SS options (decision conflict bin). Plotted are mean observed RTs per bin (data) as well as model-generated RTs (blue represents DDM₀; red represents DDM_lin; orange represents DDM_S) averaged over 10,000 datasets simulated from the posterior distribution of each hierarchical model.

Some additional nontrivial patterns in the data deserve mention. For example, while the DDM_S in most cases predicted longest RTs for choices with the highest decision conflict, this was not always the case (see, e.g., the low-magnitude condition of Participant 34 from the placebo group in Fig. 7). In this case, in the low-magnitude condition, the participant exhibited a relatively small boundary separation (1.84) and drift rate coefficient (0.24), in combination with a bias toward the SS boundary (0.43) and a high discount rate log(k) (−0.7). In such a constellation, the bias toward the SS boundary can only be overcome when value evidence is accumulated for a relatively long time (because v_coeff is relatively small), giving rise to long RTs for LL choices (which in this case only occurred in the case of low decision conflict).
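How such a parameter constellation shapes RTs can be explored with a generic diffusion simulation. The sketch below is a simple Euler-Maruyama simulator of a two-boundary diffusion process with unit within-trial noise, not the implementation used for model fitting. Following the coding implied above (bias values below 0.5 favor SS), the upper boundary is taken to represent LL choices and the lower boundary SS choices. The function name, step size, and time limit are illustrative assumptions.

```python
import math
import random


def simulate_ddm_trial(drift, boundary, bias, ndt, rng, dt=0.001, max_t=10.0):
    """Simulate one trial; returns (choice, rt) with 1 = upper (LL), 0 = lower (SS)."""
    x = bias * boundary      # starting point as a fraction of the boundary separation
    t = 0.0
    step_sd = math.sqrt(dt)  # unit diffusion coefficient (assumption)
    while 0.0 < x < boundary and t < max_t:
        x += drift * dt + rng.gauss(0.0, step_sd)
        t += dt
    if x >= boundary:
        choice = 1
    elif x <= 0.0:
        choice = 0
    else:
        choice = None        # no boundary reached within max_t
    return choice, ndt + t
```

With a starting point below 0.5 and a small positive drift, simulated LL choices have to travel further than SS choices, so this kind of simulation can be used to examine how a bias toward the SS boundary interacts with a small drift rate coefficient.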

Parameter recovery

As a final model check, we ran a series of parameter recovery simulations. Here, we randomly selected 10 datasets simulated from the posterior distribution of the DDM_S (see Materials and Methods), and refit these synthetic data with the DDM_S. Results are shown in Figure 9 for the baseline (high magnitude 100€) parameters, and Figure 10 for the parameters modeling condition effects. As can be seen from these plots, group-level parameters (Figs. 9, 10, bottom rows) recovered well for both baseline and condition effects, such that the true generating parameters were generally contained in the estimated 95% HDIs.

Figure 10. — Parameter recovery analysis for all shift parameters using the DDM_S (a, Log(k)_shift; b, Drift rate coefficient_shift; c, Nondecision time_shift; d, Boundary separation_shift; e, Starting point bias_shift; f, Drift rate maximum_shift). Top row: Generating parameters vs. fitted parameters for each subject across ten simulations for haloperidol group (yellow) and placebo group (blue). Second row: True generating group-level hyperparameter means (points) and estimated 95% highest density intervals (lines) per fitted simulation. Bottom row: True generating group-level standard deviations (points) and estimated 95% highest density intervals (lines) per fitted simulation.

Extended Data Figure 9-1

Nonparametric Spearman correlation coefficients (generating versus fitted) for all subjects.

Parameter recovery for individual-subject parameters was excellent for all baseline (100€ magnitude) parameters (Fig. 9, top row) such that the correlation between generating and estimated individual-subject parameters was >0.9 for all parameters. For the parameters modeling condition effects (magnitude effects, Fig. 10, top row), these correlations were lower for some parameters, in particular for condition effects on boundary separation and log(k). The likely reason is that the synthetic data were simulated from the actual posterior distribution, and there was overall little between-subject variance in some of these parameters in our data (see, e.g., Fig. 10a,f).
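The two recovery criteria used here, the rank correlation between generating and fitted individual-subject parameters and the containment of generating group-level parameters in the estimated 95% HDIs, can be sketched with the standard library alone. The function names are hypothetical, and the inputs are assumed to be plain lists of per-subject values or (low, high) interval tuples.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for idx in order[i:j + 1]:
            ranks[idx] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def spearman(generating, fitted):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(generating), average_ranks(fitted)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def coverage(true_value, intervals):
    """Fraction of (low, high) intervals that contain the generating value."""
    hits = sum(1 for low, high in intervals if low <= true_value <= high)
    return hits / len(intervals)
```

Because a rank correlation depends on how well subjects can be ordered, it will tend to be low when the generating parameters barely differ between subjects, even if each estimate is close to its generating value.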

Discussion

We investigated the effects of a single dose of the D2-receptor antagonist haloperidol (2 mg) on temporal discounting in a between-subjects study in a double-blind placebo-controlled setting. A diffusion model-based analysis revealed substantially smaller log(k) parameters and a substantial reduction in nondecision times under haloperidol versus placebo.

We applied a recent class of value-based decision models based on the DDM (Pedersen et al., 2017; Fontanesi et al., 2019; Shahar et al., 2019; Peters and D'Esposito, 2020). Comprehensive RT-based analysis was not possible in previous studies because of the specifics of task timing (Pine et al., 2010) or low trial numbers (Weber et al., 2016; Petzold et al., 2019). Model comparison confirmed previous results (Fontanesi et al., 2019; Peters and D'Esposito, 2020), such that the data were better accounted for by a model assuming a nonlinear trialwise scaling of the drift rate, and this was confirmed via posterior predictive checks of single-subject data. Extensive parameter recovery analyses confirmed that group-level parameters recovered well (Fontanesi et al., 2019; Peters and D'Esposito, 2020). Recovery of individual-subject baseline parameters (100€ magnitude condition) was excellent, whereas recovery of parameters modeling condition effects was somewhat lower. This is likely because of some parameters (e.g., boundary separation shift) showing low between-subject variance. Modeling was further validated by the observation that drug effects were fully reproduced using a Softmax choice rule (Sutton and Barto, 1998) and by the finding that the magnitude effect (Green et al., 1997; Ballard et al., 2017; Mellis et al., 2017) was likewise replicated using the DDM-based approach. The qualitative pattern of RT effects was reproduced using a hierarchical linear regression model of trialwise inverse RTs as a function of decision conflict.

The human literature on DA and impulsivity is heterogeneous (D'Amour-Horvat and Leyton, 2014), and interpretation of these findings is complicated by several factors. First, effects of dopaminergic drugs might depend on baseline DA availability (Cools and D'Esposito, 2011), such that the same drug might impair or enhance performance in different participants, according to an inverted U-shaped function (or a different process-dependent function) (Floresco, 2013). Second, the action of D2-receptor antagonists is often interpreted in terms of a reduction in DA neurotransmission (Pessiglione et al., 2006; Pine et al., 2010). But such drugs might indeed enhance DA release by predominantly binding at presynaptic DA auto-receptors, at least at lower dosages (Frank and O'Reilly, 2006) as shown in animal (Pehek, 1999; Schwarz et al., 2004) and human studies (Chen et al., 2005).

Interpretation of D2-receptor antagonist effects as a presynaptically mediated elevation of DA release might reconcile a number of conflicting results. First, our finding of reduced temporal discounting under haloperidol is in line with two recent studies that reported reduced temporal discounting following administration of D2/D3-receptor antagonists (Arrondo et al., 2015; Weber et al., 2016). On the other hand, a reduction of temporal discounting following administration of haloperidol was not observed in an earlier within-subjects study in n = 13 participants (Pine et al., 2010) that used a slightly lower dosage of 1.5 mg (we used 2 mg). Lower dosages of D2/D3-receptor antagonists might increase (rather than decrease) DA signaling (Frank and O'Reilly, 2006), an effect mediated by inhibitory feedback through presynaptic D2 auto-receptors (Grace, 1991), which may lead to an enhancement of phasic (vs. tonic) DA signaling (Frank and O'Reilly, 2006), a point that we return to below. However, we do acknowledge that such an interpretation is not general consensus in the cognitive literature on DA drug effects (Pessiglione et al., 2006; Pine et al., 2010).

Our results advance previous findings regarding the role of D2/D3-receptor antagonists in temporal discounting in several ways. First, participants performed an unrelated memory task during fMRI directly before completing the temporal discounting task. Those data revealed an overall main effect of drug condition on trial onset-related activity in caudate nucleus (Clos et al., 2019a,b) (i.e., caudate activity was increased under haloperidol). Although this neural read-out was obtained before the discounting task, both the fMRI and temporal discounting time points were well within the time of maximum haloperidol plasma levels (Franken et al., 2017). This observation is arguably more compatible with the idea that the dosage of haloperidol applied here increased (rather than decreased) striatal DA signaling. Similar neural evidence was lacking in most previous human pharmacological studies on DA effects on discounting (de Wit et al., 2002; Hamidovic et al., 2008; Arrondo et al., 2015; Weber et al., 2016). Second, the DDM-based modeling approach adopted in the present study allowed us to examine the dynamics underlying decision-making much more comprehensively than previous human pharmacological studies (de Wit et al., 2002; Hamidovic et al., 2008; Pine et al., 2010; Arrondo et al., 2015; Weber et al., 2016; Petzold et al., 2019). In addition to the drug effect on the discount rate log(k), diffusion modeling revealed substantially shorter nondecision times in the haloperidol group that amounted to ≈180 ms on average. Such a robust enhancement of lower-level motor and/or perceptual RT components is also more compatible with an increase, rather than a decrease, in DA transmission (Weed and Gold, 1998) and resonates with previous findings regarding a dopaminergic enhancement of RT-based response vigor (Guitart-Masip et al., 2011; Beierholm et al., 2013).
An exploratory inspection of parameter correlations revealed that log(k) and nondecision time were positively correlated in both groups, suggesting that they might capture similar aspects of the data and/or might both be modulated by changes in phasic dopaminergic responses. In support of this interpretation, augmentation of DA levels in Parkinson's disease patients reduces temporal discounting (Foerde et al., 2016) and improves model-based RL (Sharp et al., 2016). Finally, this interpretation of available human D2-receptor antagonist effects would also reconcile the human and animal literature on acute dopaminergic effects on impulsivity (D'Amour-Horvat and Leyton, 2014). Together, these considerations lead us to suggest that haloperidol increased (rather than decreased) striatal DA neurotransmission, resulting in enhanced cognitive control (reduced discounting) and a substantial facilitation of motor responding (shorter nondecision times).

By what mechanism might haloperidol attenuate the impact of delay on reward valuation? According to models of basal ganglia contributions to action selection (Maia and Frank, 2011), the probability for selecting a given candidate action depends on the relative difference in activation between the direct (go) and the indirect (nogo) pathways. A similar striatal gating mechanism might underlie working memory and/or prefrontal control functions (Cools, 2011). By increasing phasic DA responses, haloperidol might increase the signal-to-noise ratio in striatal value representations, thereby increasing the likelihood that objectively smaller and/or more delayed LL rewards gain access to processing in the PFC. Naturally, other modes of action are likewise conceivable. Frontal and striatal regions are interconnected via a series of loops that follow a dorsal-to-ventral organization (Haber and Knutson, 2010), and haloperidol might impact functional interactions within these circuits (Cools, 2011), for example, related to top-down control of value representations (Hare et al., 2009, 2014; Figner et al., 2010; Peters and D'Esposito, 2016). Finally, haloperidol might have directly augmented control processes in specific PFC regions (Figner et al., 2010). However, because of the much greater expression of D2 receptors in striatum compared with PFC (Seamans and Yang, 2004), it is generally assumed that prefrontal action of D2 antagonists requires substantially higher dosages than those applied in the studies examined here (Seamans and Yang, 2004; Frank and O'Reilly, 2006).

The present study has a number of limitations that need to be acknowledged. First, we did not run a within-subjects design, which would have allowed us to account for individual-participant baseline parameters in the analysis of the drug effects. Second, this also precluded us from comprehensively analyzing potential modulatory influences of, for example, individual differences in working memory on the drug effects, which might modulate DA effects on discounting (Petzold et al., 2019) and cognitive control more generally (Cools and D'Esposito, 2011). Third, the proportion of female participants was relatively large. Given the known association of ovarian hormones with the DA system (Yoest et al., 2018), future studies would benefit from testing larger sample sizes that allow for the examination of gender effects and/or from directly controlling menstrual cycle phase. Fourth, rewards were hypothetical because of the inclusion of the high-magnitude condition. However, preferences for real and hypothetical outcomes in temporal discounting tasks show a very good correspondence (Johnson and Bickel, 2002) and rely on similar neural circuits (Bickel et al., 2009). Also, neural haloperidol effects vary across brain regions and functions (Wächtler et al., 2020), complicating interpretation as no task-related imaging data were obtained here.

In conclusion, our data show that the D2-receptor antagonist haloperidol attenuated temporal discounting and substantially shortened nondecision times, as revealed by comprehensive computational modeling of choices and RTs using hierarchical Bayesian parameter estimation. These data are best accounted for by a model in which low dosages of haloperidol lead to an enhancement of phasic DA responses because of reduced feedback inhibition from D2 auto-receptors, leading to an augmentation of both lower-level (nondecision time) and higher-level (temporal discounting) decision components.

Footnotes

This work was supported by Deutsche Forschungsgemeinschaft PE1627/5-1 to J.P. and SO952/3-1 to T.S.

The authors declare no competing financial interests.

References

Acheson A, de Wit H (2008) Bupropion improves attention but does not affect impulsive behavior in healthy young adults. Exp Clin Psychopharmacol 16:113–123. 10.1037/1064-1297.16.2.113 [DOI] [PMC free article] [PubMed] [Google Scholar]
Amlung M, Petker T, Jackson J, Balodis I, MacKillop J (2016) Steep discounting of delayed monetary and food rewards in obesity: a meta-analysis. Psychol Med 46:2423–2434. 10.1017/S0033291716000866 [DOI] [PubMed] [Google Scholar]
Amlung M, Marsden E, Holshausen K, Morris V, Patel H, Vedelago L, Naish KR, Reed DD, McCabe RE (2019) Delay discounting as a transdiagnostic process in psychiatric disorders: a meta-analysis. JAMA Psychiatry 76:1176 10.1001/jamapsychiatry.2019.2102 [DOI] [PMC free article] [PubMed] [Google Scholar]
Arrondo G, Aznárez-Sanado M, Fernández-Seara MA, Goñi J, Loayza FR, Salamon-Klobut E, Heukamp FH, Pastor MA (2015) Dopaminergic modulation of the trade-off between probability and time in economic decision-making. Eur Neuropsychopharmacol 25:817–827. 10.1016/j.euroneuro.2015.02.011 [DOI] [PubMed] [Google Scholar]
Ballard IC, Kim B, Liatsis A, Aydogan G, Cohen JD, McClure SM (2017) More is meaningful: the magnitude effect in intertemporal choice depends on self-control. Psychol Sci 28:1443–1454. 10.1177/0956797617711455 [DOI] [PMC free article] [PubMed] [Google Scholar]
Beard E, Dienes Z, Muirhead C, West R (2016) Using Bayes factors for testing hypotheses about intervention effectiveness in addiction in addiction research. Addiction 111:2230–2247. 10.1111/add.13501 [DOI] [PMC free article] [PubMed] [Google Scholar]
Beierholm U, Guitart-Masip M, Economides M, Chowdhury R, Duzel E, Dolan R, Dayan P (2013) Dopamine modulates reward-related vigor. Neuropsychopharmacology 38:1495–1503. 10.1038/npp.2013.48 [DOI] [PMC free article] [PubMed] [Google Scholar]
Bickel WK, Pitcock JA, Yi R, Angtuaco EJ (2009) Congruence of BOLD response across intertemporal choice conditions: fictive and real money gains and losses. J Neurosci 29:8839–8846. 10.1523/JNEUROSCI.5319-08.2009 [DOI] [PMC free article] [PubMed] [Google Scholar]
Bickel WK, Koffarnus MN, Moody L, Wilson AG (2014) The behavioral- and neuro-economic process of temporal discounting: a candidate behavioral marker of addiction. Neuropharmacology 76:518–527. 10.1016/j.neuropharm.2013.06.013 [DOI] [PMC free article] [PubMed] [Google Scholar]
Chen YC, Choi JK, Andersen SL, Rosen BR, Jenkins BG (2005) Mapping dopamine D2/D3 receptor function using pharmacological magnetic resonance imaging. Psychopharmacology (Berl) 180:705–715. 10.1007/s00213-004-2034-0 [DOI] [PubMed] [Google Scholar]
Clos M, Bunzeck N, Sommer T (2019a) Dopamine is a double-edged sword: dopaminergic modulation enhances memory retrieval performance but impairs metacognition. Neuropsychopharmacology 44:555–563. 10.1038/s41386-018-0246-y [DOI] [PMC free article] [PubMed] [Google Scholar]
Clos M, Bunzeck N, Sommer T (2019b) Dopamine enhances item novelty detection via hippocampal and associative recall via left lateral prefrontal cortex mechanisms. J Neurosci 39:7920–7933. 10.1523/JNEUROSCI.0495-19.2019 [DOI] [PMC free article] [PubMed] [Google Scholar]
Cools R. (2011) Dopaminergic control of the striatum for high-level cognition. Curr Opin Neurobiol 21:402–407. 10.1016/j.conb.2011.04.002 [DOI] [PubMed] [Google Scholar]
Cools R, D'Esposito M (2011) Inverted-U-shaped dopamine actions on human working memory and cognitive control. Biol Psychiatry 69:e113–e125. 10.1016/j.biopsych.2011.03.028 [DOI] [PMC free article] [PubMed] [Google Scholar]
D'Amour-Horvat V, Leyton M (2014) Impulsive actions and choices in laboratory animals and humans: effects of high vs. low dopamine states produced by systemic treatments given to neurologically intact subjects. Front Behav Neurosci 8:432. [DOI] [PMC free article] [PubMed] [Google Scholar]
de Wit H, Enggasser JL, Richards JB (2002) Acute administration of D-amphetamine decreases impulsivity in healthy volunteers. Neuropsychopharmacology 27:813–825. 10.1016/S0893-133X(02)00343-3 [DOI] [PubMed] [Google Scholar]
Doll BB, Simon DA, Daw ND (2012) The ubiquity of model-based reinforcement learning. Curr Opin Neurobiol 22:1075–1081. 10.1016/j.conb.2012.08.003 [DOI] [PMC free article] [PubMed] [Google Scholar]
Figner B, Knoch D, Johnson EJ, Krosch AR, Lisanby SH, Fehr E, Weber EU (2010) Lateral prefrontal cortex and self-control in intertemporal choice. Nat Neurosci 13:538–539. 10.1038/nn.2516 [DOI] [PubMed] [Google Scholar]
Floresco SB. (2013) Prefrontal dopamine and behavioral flexibility: shifting from an “inverted-U” toward a family of functions. Front Neurosci 7:62. [DOI] [PMC free article] [PubMed] [Google Scholar]
Foerde K, Figner B, Doll BB, Woyke IC, Braun EK, Weber EU, Shohamy D (2016) Dopamine modulation of intertemporal decision-making: evidence from Parkinson disease. J Cogn Neurosci 28:657–667. 10.1162/jocn_a_00929 [DOI] [PubMed] [Google Scholar]
Fontanesi L, Gluth S, Spektor MS, Rieskamp J (2019) A reinforcement learning diffusion decision model for value-based decisions. Psychon Bull Rev 26:1099–1121. 10.3758/s13423-018-1554-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
Frank MJ, O'Reilly RC (2006) A mechanistic account of striatal dopamine function in human cognition: psychopharmacological studies with cabergoline and haloperidol. Behav Neurosci 120:497–517. 10.1037/0735-7044.120.3.497 [DOI] [PubMed] [Google Scholar]
Franken LG, Mathot RA, Masman AD, Baar FP, Tibboel D, van Gelder T, Koch BC, de Winter BC (2017) Population pharmacokinetics of haloperidol in terminally ill adult patients. Eur J Clin Pharmacol 73:1271–1277. 10.1007/s00228-017-2283-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
Grace AA. (1991) Phasic versus tonic dopamine release and the modulation of dopamine system responsivity: a hypothesis for the etiology of schizophrenia. Neuroscience 41:1–24. 10.1016/0306-4522(91)90196-u [DOI] [PubMed] [Google Scholar]
Green L, Myerson J (2004) A discounting framework for choice with delayed and probabilistic rewards. Psychol Bull 130:769–792. 10.1037/0033-2909.130.5.769 [DOI] [PMC free article] [PubMed] [Google Scholar]
Green L, Myerson J, McFadden E (1997) Rate of temporal discounting decreases with amount of reward. Mem Cognit 25:715–723. 10.3758/bf03211314 [DOI] [PubMed] [Google Scholar]
Guitart-Masip M, Beierholm UR, Dolan R, Duzel E, Dayan P (2011) Vigor in the face of fluctuating rates of reward: an experimental examination. J Cogn Neurosci 23:3933–3938. 10.1162/jocn_a_00090 [DOI] [PubMed] [Google Scholar]
Haber SN, Knutson B (2010) The reward circuit: linking primate anatomy and human imaging. Neuropsychopharmacology 35:4–26. 10.1038/npp.2009.129 [DOI] [PMC free article] [PubMed] [Google Scholar]
Hamidovic A, Kang UJ, de Wit H (2008) Effects of low to moderate acute doses of pramipexole on impulsivity and cognition in healthy volunteers. J Clin Psychopharmacol 28:45–51. 10.1097/jcp.0b013e3181602fab [DOI] [PubMed] [Google Scholar]
Hare TA, Camerer CF, Rangel A (2009) Self-control in decision-making involves modulation of the vmPFC valuation system. Science 324:646–648. 10.1126/science.1168450 [DOI] [PubMed] [Google Scholar]
Hare TA, Hakimi S, Rangel A (2014) Activity in dlPFC and its effective connectivity to vmPFC are associated with temporal discounting. Front Neurosci 8:50. 10.3389/fnins.2014.00050 [DOI] [PMC free article] [PubMed] [Google Scholar]
Jackson JN, MacKillop J (2016) Attention-deficit/hyperactivity disorder and monetary delay discounting: a meta-analysis of case-control studies. Biol Psychiatry Cogn Neurosci Neuroimaging 1:316–325. 10.1016/j.bpsc.2016.01.007 [DOI] [PMC free article] [PubMed] [Google Scholar]
Johnson MW, Bickel WK (2002) Within-subject comparison of real and hypothetical money rewards in delay discounting. J Exp Anal Behav 77:129–146. 10.1901/jeab.2002.77-129 [DOI] [PMC free article] [PubMed] [Google Scholar]
Kass RE, Raftery AE (1995) Bayes factors. J Am Stat Assoc 90:773–795. 10.1080/01621459.1995.10476572 [DOI] [Google Scholar]
Krebs CA, Reilly WJ, Anderson KG (2016) Reinforcer magnitude affects delay discounting and influences effects of D-amphetamine in rats. Behav Processes 130:39–45. 10.1016/j.beproc.2016.07.004 [DOI] [PubMed] [Google Scholar]
Kroemer NB, Lee Y, Pooseh S, Eppinger B, Goschke T, Smolka MN (2019) L-DOPA reduces model-free control of behavior by attenuating the transfer of value to action. Neuroimage 186:113–125. 10.1016/j.neuroimage.2018.10.075 [DOI] [PubMed] [Google Scholar]
Lempert KM, Steinglass JE, Pinto A, Kable JW, Simpson HB (2019) Can delay discounting deliver on the promise of RDoC? Psychol Med 49:190–199. 10.1017/S0033291718001770 [DOI] [PubMed] [Google Scholar]
Maia TV, Frank MJ (2011) From reinforcement learning models to psychiatric and neurological disorders. Nat Neurosci 14:154–162. 10.1038/nn.2723 [DOI] [PMC free article] [PubMed] [Google Scholar]
Mazur JE. (1987) An adjusting procedure for studying delayed reinforcement. In: Quantitative analyses of behavior (Commons ML, Mazur JE, Nevin JA, Rachlin H, eds), pp 555–573. Hillsdale, NJ: Erlbaum. [Google Scholar]
Mellis AM, Woodford AE, Stein JS, Bickel WK (2017) A second type of magnitude effect: reinforcer magnitude differentiates delay discounting between substance users and controls. J Exp Anal Behav 107:151–160. 10.1002/jeab.235 [DOI] [PMC free article] [PubMed] [Google Scholar]
Pedersen ML, Frank MJ, Biele G (2017) The drift diffusion model as the choice rule in reinforcement learning. Psychon Bull Rev 24:1234–1251. 10.3758/s13423-016-1199-y [DOI] [PMC free article] [PubMed] [Google Scholar]
Pehek EA. (1999) Comparison of effects of haloperidol administration on amphetamine-stimulated dopamine release in the rat medial prefrontal cortex and dorsal striatum. J Pharmacol Exp Ther 289:14–23. [PubMed] [Google Scholar]
Pessiglione M, Seymour B, Flandin G, Dolan RJ, Frith CD (2006) Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans. Nature 442:1042–1045. 10.1038/nature05051 [DOI] [PMC free article] [PubMed] [Google Scholar]
Peters J, Büchel C (2011) The neural mechanisms of inter-temporal decision-making: understanding variability. Trends Cogn Sci 15:227–239. doi:10.1016/j.tics.2011.03.002
Peters J, D'Esposito M (2016) Effects of medial orbitofrontal cortex lesions on self-control in intertemporal choice. Curr Biol 26:2625–2628. doi:10.1016/j.cub.2016.07.035
Peters J, D'Esposito M (2020) The drift diffusion model as the choice rule in inter-temporal and risky choice: a case study in medial orbitofrontal cortex lesion patients and controls. PLoS Comput Biol 16:e1007615. doi:10.1371/journal.pcbi.1007615
Petzold J, Kienast A, Lee Y, Pooseh S, London ED, Goschke T, Smolka MN (2019) Baseline impulsivity may moderate L-DOPA effects on value-based decision-making. Sci Rep 9:5652. doi:10.1038/s41598-019-42124-x
Pine A, Shiner T, Seymour B, Dolan RJ (2010) Dopamine, time, and impulsivity in humans. J Neurosci 30:8888–8896. doi:10.1523/JNEUROSCI.6028-09.2010
Plummer M. (2003) JAGS: a program for analysis of Bayesian graphical models using Gibbs sampling. In: Proceedings of the 3rd International Workshop on Distributed Statistical Computing, p 125. Technische Universität Wien.
Robinson TE, Berridge KC (1993) The neural basis of drug craving: an incentive-sensitization theory of addiction. Brain Res Brain Res Rev 18:247–291. doi:10.1016/0165-0173(93)90013-p
Schwarz A, Gozzi A, Reese T, Bertani S, Crestan V, Hagan J, Heidbreder C, Bifone A (2004) Selective dopamine D(3) receptor antagonist SB-277011-A potentiates phMRI response to acute amphetamine challenge in the rat brain. Synapse 54:1–10. doi:10.1002/syn.20055
Seamans JK, Yang CR (2004) The principal features and mechanisms of dopamine modulation in the prefrontal cortex. Prog Neurobiol 74:1–58. doi:10.1016/j.pneurobio.2004.05.006
Shahar N, Hauser TU, Moutoussis M, Moran R, Keramati M, Dolan RJ, NSPN Consortium (2019) Improving the reliability of model-based decision-making estimates in the two-stage decision task with reaction-times and drift-diffusion modeling. PLoS Comput Biol 15:e1006803. doi:10.1371/journal.pcbi.1006803
Shakhatreh M, Jehangir A, Malik Z, Parkman HP (2019) Metoclopramide for the treatment of diabetic gastroparesis. Expert Rev Gastroenterol Hepatol 13:711–721. doi:10.1080/17474124.2019.1645594
Sharp ME, Foerde K, Daw ND, Shohamy D (2016) Dopamine selectively remediates “model-based” reward learning: a computational approach. Brain 139:355–364. doi:10.1093/brain/awv347
Shenhav A, Rand DG, Greene JD (2017) The relationship between intertemporal choice and following the path of least resistance across choices, preferences, and beliefs. Judgm Decis Mak 12:1–18.
Solway A, Lohrenz T, Montague PR (2017) Simulating future value in intertemporal choice. Sci Rep 7:43119. doi:10.1038/srep43119
Spiegelhalter DJ, Best NG, Carlin BP, Van Der Linde A (2002) Bayesian measures of model complexity and fit. J R Stat Soc B 64:583–639. doi:10.1111/1467-9868.00353
Sutton RS, Barto AG (1998) Reinforcement learning: an introduction. Cambridge, MA: Massachusetts Institute of Technology.
Wabersich D, Vandekerckhove J (2014) Extending JAGS: a tutorial on adding custom distributions to JAGS (with a diffusion model example). Behav Res Methods 46:15–28. doi:10.3758/s13428-013-0369-3
Wächtler CO, Chakroun K, Clos M, Bayer J, Hennies N, Beaulieu JM, Sommer T (2020) Region-specific effects of acute haloperidol in the human midbrain, striatum and cortex. Eur Neuropsychopharmacol 35:126–135.
Weber SC, Beck-Schimmer B, Kajdi ME, Müller D, Tobler PN, Quednow BB (2016) Dopamine D2/3- and μ-opioid receptor antagonists reduce cue-induced responding and reward impulsivity in humans. Transl Psychiatry 6:e850. doi:10.1038/tp.2016.113
Weed MR, Gold LH (1998) The effects of dopaminergic agents on reaction time in rhesus monkeys. Psychopharmacology (Berl) 137:33–42. doi:10.1007/s002130050590
Wiehler A, Peters J (2015) Reward-based decision making in pathological gambling: the roles of risk and delay. Neurosci Res 90:3–14. doi:10.1016/j.neures.2014.09.008
Wunderlich K, Smittenaar P, Dolan RJ (2012) Dopamine enhances model-based over model-free choice behavior. Neuron 75:418–424. doi:10.1016/j.neuron.2012.03.042
Yoest KE, Quigley JA, Becker JB (2018) Rapid effects of ovarian hormones in dorsal striatum and nucleus accumbens. Horm Behav 104:119–129. doi:10.1016/j.yhbeh.2018.04.002

Associated Data

Supplementary Materials

Extended Data Figure 1-1

Extended Data Figure 1-2

Extended Data Figure 9-1

Nonparametric Spearman correlation coefficients (generating versus fitted) for all subjects. Figure 9-1 is available as a DOCX file (20.8 KB).

