Abstract
Rationale and Objectives
Norepinephrine mediates the adjustment of error-driven learning to match the rate of change of the environment, while phasic dopamine signals prediction errors. We tested the hypothesis that pharmacologic manipulation may modulate this process.
Methods
We administered a single dose of methylphenidate, a norepinephrine/dopamine reuptake inhibitor, or placebo in double-blind randomized fashion to 20 healthy human males, who then performed a probabilistic learning task. Each subject was tested in two sessions, receiving methylphenidate in one session and placebo in the other, in randomized order. Task performance was quantified by the percentage of trials on which subjects chose the most likely option, while learning rate was measured using a computational model-based parameter as well as with a behavioral analogue of this parameter.
Results
There was a substance-by-session interaction effect on behavioral learning rate and model-based learning rate, such that subjects receiving methylphenidate exhibited higher learning rates than those receiving placebo in session 1, with no difference observed in session 2, suggesting that subjects retained the increased learning rate across sessions. Higher behavioral learning rate was associated with both higher task performance and with the model-based learning rate. Higher learning rates were advantageous given the high rate of change on the task. Subjects receiving methylphenidate and placebo began the task in session 1 with a similar behavioral learning rate, but those receiving methylphenidate rapidly increased learning rate toward the optimal value, suggesting that methylphenidate accelerated the adaptation of learning rate based on the environment.
Conclusions
The results suggest that methylphenidate may improve disrupted probabilistic learning in disorders involving noradrenergic or dopaminergic dysfunction.
Keywords: Learning, Decision Making, Methylphenidate, Norepinephrine
1. Introduction
Adaptive behavior relies on the ability to make adjustments in the face of motivationally relevant, unexpected outcomes. This process, known as error-driven learning, is described by the classic Rescorla-Wagner (RW) model (Rescorla and Wagner 1972; Sutton and Barto 1998). A key parameter for this type of learning, known as the learning rate, is a scaling factor that determines the degree of modification in beliefs (and consequently behavior) after a given prediction error. When the learning rate is high, new data is heavily weighted in forming beliefs about the environment and driving behavior; when it is low, old data is given relatively greater weight. Learning rate helps determine the reactivity or stability of behavior and may have important implications for individual differences and psychopathology.
The optimal learning rate in a given environment specifies the most favorable response after a prediction error. Determining when to alter behavior after an unexpected event and when to stay the course is a ubiquitous dilemma faced by organisms in uncertain environments. Normative models provide a framework for understanding these decisions by decomposing the uncertainty underlying prediction errors into different sources. Expected uncertainty results from known unreliability of predictive signals in the environment, while unexpected uncertainty reflects changes in the environment itself (Yu and Dayan 2005). After an environmental change, recent data will be more accurate than old data in predicting outcomes, so learning rate should be shifted upward in the presence of higher unexpected uncertainty (i.e. higher probability of change). In this way, an organism should modulate learning rate according to estimates of environmental volatility. Based on a synthesis of physiological, pharmacological, and behavioral data, Yu and Dayan proposed that unexpected uncertainty is signaled by brain norepinephrine (NE) (2005).
Evidence has accumulated that humans adjust learning rates according to environmental volatility (Behrens et al. 2007; Nassar et al. 2012; Payzan-LeNestour et al. 2013). Furthermore, learning rate or unexpected uncertainty is associated with NE signaling measured by pupil diameter (Browning et al. 2015; Nassar et al. 2012; Nassar et al. 2010) and by fMRI of the locus coeruleus (LC), the principal location of NE neurons in the brain (Payzan-LeNestour et al. 2013). Environmental volatility is also associated with activity in the dorsal anterior cingulate cortex (dACC) (Behrens et al. 2007). dACC is one of two major cortical inputs to LC (Aston-Jones and Cohen 2005) and is closely associated with peripheral markers of NE activity (Critchley et al. 2003; Ebitz and Platt 2015).
Learning rates display individual differences associated with NE activity (Nassar et al. 2012) and dACC activity (Behrens et al. 2007). Such differences may have important implications for psychopathology, as individuals with high trait anxiety display a blunted ability to adjust learning rate based on volatility, a deficit linked to altered NE responses (Browning et al. 2015).
Learning rates can be influenced by manipulating NE activity using a task-irrelevant stimulus (Nassar et al. 2012). This raises the possibility that learning rates could be manipulated by medications targeting NE. To our knowledge, such an effect has not previously been demonstrated. Here, we report the influence on learning rate of methylphenidate (MPH), which inhibits reuptake of dopamine (DA) and NE. MPH increases synaptic NE, which inhibits spontaneous neural activity more than evoked activity and thereby increases the signal-to-noise ratio of neurons (Foote et al. 1983). MPH and other NE reuptake inhibitors reduce spontaneous activity in LC neurons via NE action at α-adrenergic receptors on LC cells (Bari and Aston-Jones 2013; Devilbiss and Berridge 2006; Foote et al. 1983). MPH is thereby thought to shift LC activity toward a phasic mode, in which LC cells respond more strongly (and specifically) to salient events (Aston-Jones et al. 2007). In addition to its effects on NE, acute MPH increases striatal DA (Volkow et al. 2015). This could also affect error-driven learning by modulating phasic DA signaling of prediction error. Evidence indicates that the DA precursor L-DOPA increases learning rate in older adults (Chowdhury et al. 2013), while the DA D2 agonist pramipexole reduces learning rate (which is attributed to a reduction in phasic DA release) (Huys et al. 2013). We hypothesized that MPH would influence learning rates so that true environmental changes were signaled more strongly, i.e. that the learning rate would shift in a more optimal direction.
2. Materials and Methods
2.1 Subjects
Twenty healthy adult male volunteers aged 18 through 40 years (mean age 25.9 ± 6.6 years) were recruited from the community. Inclusion criteria included weight of ≥55 kg (120 lbs) and BMI between 18 and 30 kg/m², inclusive. Exclusion criteria included any major medical illness, any history of psychiatric disorder, use of any medications within 14 days with exceptions including acetaminophen and occasional use of sleeping medication (not taken on the evening prior to a visit), and being a current smoker. All participants provided informed written consent. The study was approved by the University of California, San Diego Institutional Review Board.
2.2 Experimental Design
Subjects performed a probabilistic learning task during two experimental sessions (separated by at least one month). At each session, subjects received a single dose of MPH 40 mg or placebo 1 hour prior to the task, with peak blood concentration and peak cognitive effects expected to occur 1–2 hours after administration (Kimko et al. 1999). Each subject received MPH in one session and placebo in the other, with the order of sessions randomized in double-blind fashion. The dose of 40 mg was chosen because this dose has previously been shown to be safe and well tolerated and to exert significant subjective and cognitive effects in acute-dosing studies (Korostenskaja et al. 2008; Pauls et al. 2012; Schmid et al. 2014). In order to measure the subjective effects of MPH, subjects completed visual analog scales (VAS) indicating their level of alertness and attentiveness at baseline (2 hours prior to dosing) as well as 1.5 hours after dosing.
2.3 Experimental Task
The probabilistic learning task was a modification of the task used by Yu and Huang (2014). It consisted of a practice session with 10 trials followed by 3 blocks with 60 trials each. On each trial, subjects attempted to locate a target stimulus in one of 3 alternative locations. The target stimulus consisted of a random-dot stimulus (Gold and Shadlen 2000) with a percentage of dots moving in a pre-specified direction; the other two locations contained distractor stimuli with dots moving in the opposite direction. At the beginning of the trial, both targets and distractors were hidden from view, with three black circles indicating the three locations. Within each trial, subjects sequentially chose locations to view (indicated by key presses ‘R’, ‘B’, and ‘J’) until they located the target stimulus (indicated by pressing the space bar). At the end of each trial, correct target location and points were shown on the screen. Points were calculated based on search time (st), number of switches (sw, 0 if only one patch opened), and final response (c, 1 for correct, −1 for incorrect):
The practice session consisted of 10 trials with random-dot motion coherence of 50%. Subjects had to reach 75% accuracy to proceed to the main experiment, or the same session had to be repeated. Each experimental session consisted of 3 blocks of 60 trials per block, with random-dot motion coherence of 30%. In the experimental sessions, the target appeared in the three locations (ABC) with relative frequency 1:3:9. The assignment of these frequencies to locations (e.g., ABC vs. CBA for 1:3:9) changed without being signaled, with the number of trials between changes drawn from N(10,1). That is, a change in the target probabilities across the three locations occurred every 10 trials on average.
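The environment described above can be illustrated with a short Python sketch that generates target locations for one block. This is a minimal sketch under stated assumptions, not the task code used in the study: the function name `simulate_block`, the rounding of change intervals to whole trials, and the rule that each change reshuffles the 1:3:9 frequencies to a different assignment are all hypothetical choices made for illustration.

```python
import random


def simulate_block(n_trials=60, weights=(1, 3, 9), mean_gap=10.0, sd_gap=1.0, seed=0):
    """Generate target locations for one block of the task.

    Returns (targets, change_points), where targets[i] is the location
    index (0, 1, or 2) of the target on trial i and change_points lists
    the trials on which the probability assignment changed.
    """
    rng = random.Random(seed)
    total = sum(weights)
    # Probabilities 1/13, 3/13, 9/13, assigned to locations in random order.
    config = [w / total for w in weights]
    rng.shuffle(config)

    targets, change_points = [], []
    # Assumption: intervals are drawn from N(10, 1) and rounded to whole trials.
    next_change = max(1, round(rng.gauss(mean_gap, sd_gap)))
    for trial in range(n_trials):
        if trial == next_change:
            # Assumption: a change always produces a different assignment.
            previous = list(config)
            while config == previous:
                rng.shuffle(config)
            change_points.append(trial)
            next_change = trial + max(1, round(rng.gauss(mean_gap, sd_gap)))
        # Sample the target location from the current (unsignaled) configuration.
        targets.append(rng.choices([0, 1, 2], weights=config)[0])
    return targets, change_points


if __name__ == "__main__":
    targets, change_points = simulate_block(seed=1)
    print("change points:", change_points)
    print("first 20 targets:", targets[:20])
```

Because change points arrive roughly every 10 trials, an observer who weights only the last few outcomes will track the most likely location better than one who averages over the whole block.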
2.4 Behavioral Analysis
As a measure of task performance, we calculated the percentage of trials on which the subject chose to view the most likely location first for each subject and session. For each subject and session, we then calculated a behavioral analogue of the RW learning rate parameter, measuring change in behavior after an error rather than change in (inferred) expectations. This behavioral learning rate is the percent of trials after an error trial on which subjects switch their first location choice to the location of the previous target.
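The two behavioral measures can be sketched in Python as follows. This is an illustrative sketch rather than the analysis code used in the study. It assumes that an error trial is one on which the first location viewed did not contain the target, and the function and argument names (`performance`, `behavioral_learning_rate`, `first_choices`, `targets`, `most_likely`) are hypothetical.

```python
def performance(first_choices, most_likely):
    """Percent of trials on which the first viewed location was the most likely one."""
    hits = sum(c == m for c, m in zip(first_choices, most_likely))
    return 100.0 * hits / len(first_choices)


def behavioral_learning_rate(first_choices, targets):
    """Percent of post-error trials on which the first choice switched
    to the location of the previous target.

    Assumption: an error trial is one whose first choice was not the target.
    """
    post_error = 0
    switched = 0
    # Pair each trial (choice, target) with the first choice on the next trial.
    for prev_choice, prev_target, choice in zip(first_choices, targets, first_choices[1:]):
        if prev_choice != prev_target:
            post_error += 1
            if choice == prev_target:
                switched += 1
    if post_error == 0:
        return float("nan")  # undefined when no error trial is followed by another trial
    return 100.0 * switched / post_error


if __name__ == "__main__":
    first_choices = [0, 0, 1, 1]
    targets = [1, 1, 0, 1]
    most_likely = [1, 1, 1, 1]
    print(performance(first_choices, most_likely))
    print(behavioral_learning_rate(first_choices, targets))
```

The final trial of a sequence never contributes to the behavioral learning rate, because there is no following trial on which a switch could be observed.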
2.5 Model
Two free parameters (η,β) were estimated for each subject using standard Rescorla-Wagner learning (Rescorla and Wagner 1972; Sutton and Barto 1998), in which η is the learning rate parameter in the value update function, and β is the inverse temperature parameter in the softmax decision function, with large β corresponding to a more deterministic choice favoring the option with the highest value, and lower β corresponding to more random decisions. Values of the stimuli were initialized at 0 at the beginning of each block. Rn is 1 for a rewarded choice and 0 for an unrewarded choice.
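For readers unfamiliar with this class of model, the following Python sketch shows a standard RW value update combined with a softmax choice rule, together with the negative log-likelihood that would be minimized to estimate η and β. It is a sketch under assumptions, not the fitting code used in the study: it assumes that the modeled choice is the first location viewed, that only the chosen option's value is updated, and that parameters are found by a simple grid search. The names `rw_negative_log_likelihood` and `fit_grid` are hypothetical.

```python
import math


def rw_negative_log_likelihood(eta, beta, choices, rewards, n_options=3):
    """Negative log-likelihood of a choice sequence under RW learning with softmax.

    eta  : learning rate in the value update
    beta : inverse temperature in the softmax decision function
    """
    values = [0.0] * n_options  # values initialized at 0 at the start of a block
    nll = 0.0
    for choice, reward in zip(choices, rewards):
        # Softmax probabilities; subtracting the maximum avoids overflow.
        v_max = max(values)
        exps = [math.exp(beta * (v - v_max)) for v in values]
        p_choice = exps[choice] / sum(exps)
        nll -= math.log(p_choice)
        # RW update: move the chosen value toward the outcome by eta times the error.
        # Assumption: unchosen options are left unchanged.
        values[choice] += eta * (reward - values[choice])
    return nll


def fit_grid(choices, rewards, etas, betas):
    """Return the (eta, beta) pair on the grid with the lowest negative log-likelihood."""
    return min(
        ((eta, beta) for eta in etas for beta in betas),
        key=lambda p: rw_negative_log_likelihood(p[0], p[1], choices, rewards),
    )


if __name__ == "__main__":
    choices = [0, 0, 1, 1, 1, 2, 1, 1]
    rewards = [0, 0, 1, 1, 0, 0, 1, 1]  # R_n: 1 for a rewarded choice, 0 otherwise
    etas = [i / 20 for i in range(1, 20)]
    betas = [0.5 * i for i in range(1, 21)]
    print(fit_grid(choices, rewards, etas, betas))
```

With β = 0 the softmax assigns equal probability to every option regardless of η, which is why large β corresponds to more deterministic choices and small β to more random ones.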
2.6 Statistical Analysis
In order to assess the subjective effects of MPH, we performed 2x2 analysis of variance (ANOVA) tests with change from baseline in alertness ratings and change from baseline in attentiveness ratings as dependent variables and substance and session as factors.
Our primary variables of interest were percent of trials choosing most likely location (the performance measure), behavioral learning rate, and RW learning rate η. For each of these, we performed a 2x2 ANOVA with substance and session as factors. To assess the between-subject effect of methylphenidate within session 1, we performed t-tests for our primary variables of interest comparing subjects receiving methylphenidate and those receiving placebo within session 1. In order to distinguish whether group differences in behavioral learning rate in session 1 were present from the beginning of the task or whether groups may have differentially adjusted behavioral learning rates over time, we calculated average behavioral learning rate across session 1, block 1 using a moving window of 10 trials. A substance-by-time interaction for the behavioral learning rate was tested using a mixed effects model in R with moving window average of learning rate as dependent variable, substance and trial as fixed effects, with random effects including a random intercept term and a random slope term. Including the random slope term improved model fit compared to a model with random intercept only based on the Akaike Information Criterion (AIC). In order to test the association between behavioral learning rate and performance, a mixed effects model was created in R with performance as the dependent variable, session and behavioral learning rate as fixed effects, and subject as a random effect.
To test the significance of the main effect of behavioral learning rate, a null model was constructed without behavioral learning rate as a main effect and was compared to the full model with behavioral learning rate as a fixed effect using the likelihood ratio test (modeled by a chi-square distribution). To test the association between learning rate η and behavioral learning rate and performance across sessions, two mixed effects models were created in R using performance or behavioral learning rate as dependent variables, session and learning rate η as fixed effects, and subject as a random effect. For significance tests of the main effect of learning rate η, null models were created without learning rate η as a fixed effect and were compared to the full models with learning rate η as a fixed effect using the likelihood ratio test (modeled by a chi-square distribution). Significance threshold for all statistical tests was p=0.05.
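The likelihood ratio test used for these model comparisons can be illustrated with a short Python sketch. The mixed effects models themselves were fitted in R; this sketch only shows how a p-value follows from the log-likelihoods of two nested models, assuming that they differ by a single fixed-effect term so that the test statistic has one degree of freedom. The function names `chi2_sf_df1` and `likelihood_ratio_test_df1` are hypothetical.

```python
import math


def chi2_sf_df1(stat):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom.

    A chi-square variable with 1 df is the square of a standard normal,
    so P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    if stat < 0:
        raise ValueError("test statistic must be non-negative")
    return math.erfc(math.sqrt(stat / 2.0))


def likelihood_ratio_test_df1(loglik_null, loglik_full):
    """Compare nested models that differ by one parameter.

    Returns (statistic, p_value), where statistic = 2 * (loglik_full - loglik_null).
    """
    stat = 2.0 * (loglik_full - loglik_null)
    return stat, chi2_sf_df1(max(stat, 0.0))


if __name__ == "__main__":
    # Hypothetical log-likelihoods, for illustration only.
    stat, p = likelihood_ratio_test_df1(loglik_null=-120.0, loglik_full=-118.0)
    print(stat, p)
    # A statistic near 3.84 corresponds to the p = 0.05 threshold with 1 df.
    print(chi2_sf_df1(3.84))
```

For comparisons involving more than one added parameter, the closed form above no longer applies and a general chi-square distribution function would be needed.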
3. Results
One subject who had unusually high spatial bias (i.e. chose the same location on 98% of all trials) was excluded from the data analysis.
To assess the subjective effects of MPH, we performed 2x2 ANOVA tests with change from baseline in alertness ratings and change from baseline in attentiveness ratings as dependent variables and substance and session as factors. For both ratings, there was a significant main effect of substance such that increase from baseline was higher with MPH than placebo (alertness: main effect of substance, F₁=16.59, p<0.01; main effect of session, F₁=3.29, p=0.08; interaction between substance and session, F₁=0.99, p=0.33; attentiveness: main effect of substance, F₁=4.77, p=0.04; main effect of session, F₁=2.18, p=0.15; interaction between substance and session, F₁=0.05, p=0.83).
A 2x2 ANOVA with performance as the dependent variable and substance and session as factors found that main effect of substance (F₁=0.55, p=0.46), main effect of session (F₁=3.48, p=0.07), and interaction between substance and session (F₁=2.94, p=0.096) did not meet the threshold for significance (see Fig. 1a). A t-test found that, within session 1, subjects receiving methylphenidate exhibited significantly higher performance than those receiving placebo (t₁₇=2.12, p<0.05).
A 2x2 ANOVA with behavioral learning rate as the dependent variable and substance and session as factors found no significant main effect of substance (F₁=0.81, p=0.38), but did find a significant main effect of session (F₁=6.92, p=0.01) (with overall behavioral learning rates being higher in session 2) and interaction between substance and session (F₁=4.74, p=0.04) (see Fig. 1b). Based on t-tests, subjects receiving MPH had a significantly higher behavioral learning rate than those receiving placebo in session 1 (t₁₇=2.48, p=0.02), but subjects receiving MPH and placebo were not significantly different in session 2 (t₁₇=0.75, p=0.46).
Across both sessions, behavioral learning rate was positively associated with performance (χ²₁=30.88, p<0.01) based on a linear mixed effects model (LME). See Fig. 1c for a plot of behavioral learning rate against performance in session 1. It is important to note that a higher learning rate was advantageous for this task because the environmental volatility was high (i.e. a change point occurred every 10 trials). A high learning rate would not be advantageous in a more stable environment. The high learning rate among subjects receiving MPH therefore could be explained by a general bias toward higher learning rate or a rapid adaptation of the learning rate toward the optimal rate for a given environment (which is high in this experiment). In order to distinguish these possibilities, we plotted behavioral learning rate across session 1, block 1 using a moving window of 10 trials (Fig. 1d). Subjects receiving MPH and placebo began block 1 with similar behavioral learning rates, but subjects receiving MPH rapidly increased behavioral learning rates from trials 1–10 to trials 11–20 (group-by-time interaction, χ²₁=3.95, p<0.05) based on a linear mixed effects model (LME).
We estimated a standard RW learning model with two free parameters per subject, per block (learning rate η and inverse temperature parameter β). Model accuracy in predicting individual decisions was high (0.76). A 2x2 ANOVA with learning rate η as dependent variable and substance and session as factors found that main effect of substance (F₁=0.84, p=0.37) and main effect of session (F₁=3.28, p=0.08) did not meet threshold for significance, but that there was a significant interaction between substance and session (F₁=5.27, p=0.03) (see Fig. 2a). Based on t-tests, subjects receiving MPH had a significantly higher learning rate η than those receiving placebo in session 1 (t₁₇=2.13, p<0.05), but subjects receiving MPH and placebo were not significantly different in session 2 (t₁₇=1.02, p=0.32). Across both sessions, learning rate η was positively associated with both behavioral learning rate (χ²₁=17.33, p<0.01) and performance (χ²₁=14.19, p<0.01) based on linear mixed effects models (LME) (Fig. 2b).
4. Discussion
In this study, we showed that a single dose of MPH increased a behavioral measure of error-driven learning, although this effect disappeared when subjects were retested in a later session. Higher behavioral learning rate was strongly associated with higher performance on the task. It is important to note that a higher learning rate was adaptive on this task because of the high environmental volatility (with a change point every 10 trials on average). In a more stable environment, a lower learning rate would be more advantageous. This raises the question of whether MPH merely biases the learning rate upward, or whether it accelerates adaptation of learning rate toward a more optimal level for a given environment. Our data show that subjects receiving MPH began the task in session 1 with a learning rate similar to that of subjects receiving placebo. However, the subjects receiving MPH then rapidly increased learning rate toward a higher and more optimal level. This suggests that MPH does not simply bias learning rate upward, but actually accelerates the adaptation of learning rate toward a more optimal level. Future work can test this more directly by including blocks with less frequent change points, on which lower learning rates would be optimal. Our prediction would be that subjects receiving MPH would exhibit lower learning rates on these blocks.
While behavioral learning rate was strongly associated with performance, the performance measure failed to reach the threshold of significance in the 2x2 ANOVA with substance and session as factors. Behavioral learning rate is likely a less noisy measure than performance, because behavioral learning rate depends only on subject choices, while performance additionally depends on the randomly determined stimulus locations. Higher learning rates were advantageous on this task, implying that methylphenidate resulted in more optimal choices in session 1. Additionally, a t-test performed to assess the between-subject effect of methylphenidate within session 1 did find that methylphenidate was associated with a significant improvement in performance in session 1.
We calculated two different learning rates for each subject. One learning rate was calculated by fitting the RW model, while the other was a behavioral operationalization of the RW learning rate. Our behavioral learning rate is highly similar conceptually to the behavioral learning rate calculated by Nassar et al. (2010), except that our decision task is categorical while the previous task was continuous. In our study, the RW learning rate η and the behavioral learning rate were positively associated, as expected. Like behavioral learning rate, the RW learning rate exhibited a significant interaction between substance and session in 2x2 ANOVA, such that subjects receiving methylphenidate exhibited a higher learning rate than those receiving placebo in session 1, but subjects receiving methylphenidate and placebo were not significantly different in session 2.
It is possible that a more complex model could have shown a greater difference between MPH and placebo. Other applications of models similar to RW have added additional parameters. For example, RW has been modified to include a “memory” parameter that scales the influence of the previous value estimate (just as the learning rate scales the influence of the prediction error) (Dombrovski et al. 2010), to include separate learning rates for rewards and positive and negative prediction errors (Cazé and van der Meer 2013), and to include a reward sensitivity parameter (Huys et al. 2013). While not all of these approaches are compatible with our task, it is possible that a similar extension of RW could increase the utility of the model in clarifying the influence of MPH on error-driven learning. Another extension of the RW model is the model-based reinforcement learning (RL) approach. According to this approach, individuals construct an internal, explicit model of the environment to guide choices rather than merely tracking the expected reward associated with each action (Dayan and Niv 2008). A different methodology that also posits that an individual explicitly models the environment is Bayesian learning modeling, in which individuals are assumed to incorporate new evidence with prior beliefs according to fundamental rules of probability. Examples of this approach include the Hierarchical Gaussian Filter (Iglesias et al. 2013; Mathys et al. 2014) and the Dynamic Belief Model (Yu and Huang 2014). Future research can incorporate more complex modeling approaches to further examine the effect of methylphenidate on probabilistic learning.
The difference in behavioral learning rate and RW learning rate between MPH and placebo disappeared in the second session of our crossover design. There was a significant substance-by-session interaction, suggesting that subjects retained the increased learning rate between sessions. This effect was unexpected, and our study was designed and powered based on the expectation that there would be no main effects of session or interactions between substance and session. While our performance measure did not reach significance in a 2x2 ANOVA test, we cannot rule out an effect of methylphenidate on this measure, especially considering that a t-test revealed a between-subject effect of methylphenidate on performance within session 1. The possibility that subjects may learn and retain high-level aspects of experimental tasks is an important consideration when using within-subject designs in other computational studies, as well as for possible clinical applications in which repeat measures within subjects may be crucial.
As noted above, the influence of methylphenidate on both NE and DA could account for effects on learning rates (NE through tracking of environmental volatility and DA through altered signaling of prediction errors). Importantly, subjects receiving MPH in session 1 began the task with similar behavioral learning rates as those receiving placebo, but then increased learning rates toward the optimal level given the high volatility. This argues for an influence of MPH on volatility tracking via NE, as exaggerated prediction errors mediated by DA would be expected to result in a higher learning rate from the beginning of the task. However, further research is needed to more clearly resolve this issue, using a task with both stable and volatile conditions or using agents such as atomoxetine, a selective NE reuptake inhibitor (Michelson et al. 2003). Future research would also be necessary to better disentangle the effects of NE and DA reuptake inhibition in cortical vs. subcortical areas.
This study highlights the potential utility of computational approaches in elucidating the cognitive effects of medications such as MPH. A meta-analysis of single-dose studies of MPH concluded that MPH had significant positive effects on working memory and speed of processing, while effects on verbal learning and memory, attention and vigilance, and reasoning and problem solving were less consistent (Linssen et al. 2014). Understanding of the discrepant effects of MPH on different cognitive domains can be expanded and refined using computational approaches in which the role of neurotransmitter systems in task performance is modeled explicitly. In this study, we examined the possible influence of manipulation of NE and DA on error-driven, probabilistic learning. Our approach yields insights into aspects of learning which differ from verbal learning and memory (e.g. word list learning or story recall) or visual memory, yet which may have substantial effects on decision making in uncertain, real-world environments.
The pharmacologic manipulation of error-driven learning may have important potential clinical applications. Recent evidence indicates that individuals with high trait anxiety have difficulty adjusting learning rate depending on environmental volatility (Browning et al. 2015). This difficulty was associated with a lack of modulation of pupil diameter based on environmental volatility, suggesting that it results from underlying LC dysfunction. This finding is consistent with previous evidence that LC, as a central component of the “stress circuitry”, plays an important role in the pathophysiology of anxiety (Itoi and Sugimoto 2010; Sullivan et al. 1999; Tanaka et al. 2000). The cortical region dACC, which is one of two major cortical inputs to LC (Aston-Jones and Cohen 2005), is also associated with the rate of error-driven learning (Behrens et al. 2007). This region is implicated in fear expression (Milad et al. 2007), autonomic arousal (Critchley et al. 2003), and anxiety (Etkin et al. 2011). This evidence suggests that the rate of error-driven learning is regulated by a reciprocally connected dACC-LC circuit, which itself may be dysfunctional in anxiety disorders and lead to an impaired ability to adjust error-driven learning to the environment. MPH may have the potential to correct this deficit by accelerating the adaptation of error-driven learning to environmental volatility. Supporting this, a recent multicenter randomized controlled trial found that treatment with MPH was associated with a remarkably robust reduction in symptoms of posttraumatic stress disorder (PTSD) (McAllister et al. 2015). Future research can incorporate functional neuroimaging along with the computational approach described here in order to assess the influence of MPH on the dACC-LC circuit both in normal individuals and individuals with anxiety disorders. 
The combination of neuroimaging and computational modeling of error-driven learning may also be applicable to other disorders involving NE or DA dysfunction.
Our findings demonstrate the utility of a computational approach to probe the behavioral manifestations of a pharmacologic perturbation of a neural circuit. The ability to measure pharmacologic influences on error-driven learning could have important clinical implications for anxiety disorders and other disorders involving noradrenergic or dopaminergic dysfunction. More broadly, computational models of neuromodulator function could be used to develop noninvasive, specific behavioral probes of drug actions in other disorders. Such a computational psychopharmacology approach holds promise as a key tool in future rational drug development for psychiatric conditions and an important application of the emerging field of computational psychiatry.
Acknowledgments
This work was supported by National Institute of Mental Health grant T32 MH018399 to Dr. Howlett and Swiss Foundation for Medical-Biological Grants P3SMP3_155341/1 to Dr. Hysek.
Footnotes
The authors declare no conflict of interest.
References
- Aston-Jones G, Cohen JD. An integrative theory of locus coeruleus-norepinephrine function: adaptive gain and optimal performance. Annu Rev Neurosci. 2005;28:403–450. doi: 10.1146/annurev.neuro.28.061604.135709. [DOI] [PubMed] [Google Scholar]
- Aston-Jones G, Iba M, Clayton E, Rajkowski J, Cohen J. The locus coeruleus and regulation of behavioral flexibility and attention: clinical implications. In: Ordway GA, Schwartz MA, Frazer A, editors. Brain Norepinephrine: Neurobiology and Therapeutics. Cambridge University Press; New York: 2007. pp. 196–235. [Google Scholar]
- Bari A, Aston-Jones G. Atomoxetine modulates spontaneous and sensory-evoked discharge of locus coeruleus noradrenergic neurons. Neuropharmacology. 2013;64:53–64. doi: 10.1016/j.neuropharm.2012.07.020. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Behrens TEJ, Woolrich MW, Walton ME, Rushworth MFS. Learning the value of information in an uncertain world. Nat Neurosci. 2007;10:1214–1221. doi: 10.1038/nn1954. [DOI] [PubMed] [Google Scholar]
- Browning M, Behrens TE, Jocham G, O’Reilly JX, Bishop SJ. Anxious individuals have difficulty learning the causal statistics of aversive environments. Nat Neurosci. 2015;18:590–596. doi: 10.1038/nn.3961. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cazé RD, van der Meer MA. Adaptive properties of differential learning rates for positive and negative outcomes. Biol Cybern. 2013;107:711–719. doi: 10.1007/s00422-013-0571-5. [DOI] [PubMed] [Google Scholar]
- Chowdhury R, Guitart-Masip M, Lambert C, Dayan P, Huys Q, Düzel E, Dolan RJ. Dopamine restores reward prediction errors in old age. Nature neuroscience. 2013;16:648–653. doi: 10.1038/nn.3364. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Critchley HD, Mathias CJ, Josephs O, O’Doherty J, Zanini S, Dewar BK, Cipolotti L, Shallice T, Dolan RJ. Human cingulate cortex and autonomic control: converging neuroimaging and clinical evidence. Brain. 2003. doi: 10.1093/brain/awg216.
- Dayan P, Niv Y. Reinforcement learning: the good, the bad and the ugly. Curr Opin Neurobiol. 2008;18:185–196. doi: 10.1016/j.conb.2008.08.003.
- Devilbiss DM, Berridge CW. Low-dose methylphenidate actions on tonic and phasic locus coeruleus discharge. J Pharmacol Exp Ther. 2006;319:1327–1335. doi: 10.1124/jpet.106.110015.
- Dombrovski AY, Clark L, Siegle GJ, Butters MA, Ichikawa N, Sahakian BJ, Szanto K. Reward/punishment reversal learning in older suicide attempters. Am J Psychiatry. 2010. doi: 10.1176/appi.ajp.2009.09030407.
- Ebitz RB, Platt ML. Neuronal activity in primate dorsal anterior cingulate cortex signals task conflict and predicts adjustments in pupil-linked arousal. Neuron. 2015;85:628–640. doi: 10.1016/j.neuron.2014.12.053.
- Etkin A, Egner T, Kalisch R. Emotional processing in anterior cingulate and medial prefrontal cortex. Trends Cogn Sci. 2011;15:85–93. doi: 10.1016/j.tics.2010.11.004.
- Foote SL, Bloom FE, Aston-Jones G. Nucleus locus ceruleus: new evidence of anatomical and physiological specificity. Physiol Rev. 1983;63:844–914. doi: 10.1152/physrev.1983.63.3.844.
- Gold JI, Shadlen MN. Representation of a perceptual decision in developing oculomotor commands. Nature. 2000;404:390–394. doi: 10.1038/35006062.
- Huys QJ, Pizzagalli DA, Bogdan R, Dayan P. Mapping anhedonia onto reinforcement learning: a behavioural meta-analysis. Biol Mood Anxiety Disord. 2013;3:12. doi: 10.1186/2045-5380-3-12.
- Iglesias S, Mathys C, Brodersen KH, Kasper L, Piccirelli M, den Ouden HE, Stephan KE. Hierarchical prediction errors in midbrain and basal forebrain during sensory learning. Neuron. 2013;80:519–530. doi: 10.1016/j.neuron.2013.09.009.
- Itoi K, Sugimoto N. The brainstem noradrenergic systems in stress, anxiety and depression. J Neuroendocrinol. 2010;22:355–361. doi: 10.1111/j.1365-2826.2010.01988.x.
- Kimko HC, Cross JT, Abernethy DR. Pharmacokinetics and clinical effectiveness of methylphenidate. Clin Pharmacokinet. 1999;37:457–470. doi: 10.2165/00003088-199937060-00002.
- Korostenskaja M, Kičić D, Kähkönen S. The effect of methylphenidate on auditory information processing in healthy volunteers: a combined EEG/MEG study. Psychopharmacology. 2008;197:475–486. doi: 10.1007/s00213-007-1065-8.
- Linssen A, Sambeth A, Vuurman E, Riedel W. Cognitive effects of methylphenidate in healthy volunteers: a review of single dose studies. Int J Neuropsychopharmacol. 2014;17:961–977. doi: 10.1017/S1461145713001594.
- Mathys CD, Lomakina EI, Daunizeau J, Iglesias S, Brodersen KH, Friston KJ, Stephan KE. Uncertainty in perception and the Hierarchical Gaussian Filter. Front Hum Neurosci. 2014;8:825. doi: 10.3389/fnhum.2014.00825.
- McAllister TW, Zafonte R, Jain S, Flashman LA, George MS, Grant GA, He F, Lohr JB, Andaluz N, Summerall L. Randomized placebo-controlled trial of methylphenidate or galantamine for persistent emotional and cognitive symptoms associated with PTSD and/or traumatic brain injury. Neuropsychopharmacology. 2015. doi: 10.1038/npp.2015.282.
- Michelson D, Adler L, Spencer T, Reimherr FW, West SA, Allen AJ, Kelsey D, Wernicke J, Dietrich A, Milton D. Atomoxetine in adults with ADHD: two randomized, placebo-controlled studies. Biol Psychiatry. 2003;53:112–120. doi: 10.1016/s0006-3223(02)01671-2.
- Milad MR, Quirk GJ, Pitman RK, Orr SP, Fischl B, Rauch SL. A role for the human dorsal anterior cingulate cortex in fear expression. Biol Psychiatry. 2007;62:1191–1194. doi: 10.1016/j.biopsych.2007.04.032.
- Nassar MR, Rumsey KM, Wilson RC, Parikh K, Heasly B, Gold JI. Rational regulation of learning dynamics by pupil-linked arousal systems. Nat Neurosci. 2012;15:1040–1046. doi: 10.1038/nn.3130.
- Nassar MR, Wilson RC, Heasly B, Gold JI. An approximately Bayesian delta-rule model explains the dynamics of belief updating in a changing environment. J Neurosci. 2010;30:12366–12378. doi: 10.1523/JNEUROSCI.0822-10.2010.
- Pauls AM, O’Daly OG, Rubia K, Riedel WJ, Williams SC, Mehta MA. Methylphenidate effects on prefrontal functioning during attentional-capture and response inhibition. Biol Psychiatry. 2012;72:142–149. doi: 10.1016/j.biopsych.2012.03.028.
- Payzan-LeNestour E, Dunne S, Bossaerts P, O’Doherty JP. The neural representation of unexpected uncertainty during value-based decision making. Neuron. 2013;79:191–201. doi: 10.1016/j.neuron.2013.04.037.
- Rescorla RA, Wagner AR. A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement. In: Black AH, Prokasy WF, editors. Classical Conditioning II: Current Research and Theory. Appleton-Century-Crofts; New York: 1972. pp. 64–99.
- Schmid Y, Hysek CM, Simmler LD, Crockett MJ, Quednow BB, Liechti ME. Differential effects of MDMA and methylphenidate on social cognition. J Psychopharmacol. 2014;28:847–856. doi: 10.1177/0269881114542454.
- Sullivan GM, Coplan JD, Kent JM, Gorman JM. The noradrenergic system in pathological anxiety: a focus on panic with relevance to generalized anxiety and phobias. Biol Psychiatry. 1999;46:1205–1218. doi: 10.1016/s0006-3223(99)00246-2.
- Sutton RS, Barto AG. Reinforcement Learning: An Introduction. MIT Press; Cambridge: 1998.
- Tanaka M, Yoshida M, Emoto H, Ishii H. Noradrenaline systems in the hypothalamus, amygdala and locus coeruleus are involved in the provocation of anxiety: basic studies. Eur J Pharmacol. 2000;405:397–406. doi: 10.1016/s0014-2999(00)00569-0.
- Volkow ND, Wang G-J, Smith L, Fowler JS, Telang F, Logan J, Tomasi D. Recovery of dopamine transporters with methamphetamine detoxification is not linked to changes in dopamine release. NeuroImage. 2015;121:20–28. doi: 10.1016/j.neuroimage.2015.07.035.
- Yu AJ, Dayan P. Uncertainty, neuromodulation, and attention. Neuron. 2005;46:681–692. doi: 10.1016/j.neuron.2005.04.026.
- Yu AJ, Huang H. Maximizing masquerading as matching in human visual search choice behavior. Decision. 2014;1:275.