Proceedings of the National Academy of Sciences of the United States of America. 2015 Jul 6;112(29):E3930–E3939. doi: 10.1073/pnas.1418014112

Insular neural system controls decision-making in healthy and methamphetamine-treated rats

Hiroyuki Mizoguchi a,1, Kentaro Katahira b,1, Ayumu Inutsuka c,1, Kazuya Fukumoto a,b, Akihiro Nakamura a, Tian Wang d, Taku Nagai d, Jun Sato a, Makoto Sawada e, Hideki Ohira b, Akihiro Yamanaka c, Kiyofumi Yamada d,2
PMCID: PMC4517258  PMID: 26150496

Significance

Patients with addiction have a greater tendency to engage in risk-taking behavior. However, the neural substrates responsible for these deficits remain unknown. Here we demonstrated that chronic methamphetamine-treated rats preferred high-risk/high-reward actions and assigned higher value to high returns, indicative of altered decision-making. Pharmacological studies revealed that the insular neural system controls decision-making in both healthy and methamphetamine-treated rats. We further confirmed the role of the insular cortex in decision-making using designer receptor exclusively activated by designer drug technology. Because decision-making is a cognitive process that influences many aspects of daily living and both mental and physical health, the findings of this study have broader implications.

Keywords: decision-making, methamphetamine, insular cortex, DREADD, motivational value

Abstract

Patients suffering from neuropsychiatric disorders such as substance-related and addictive disorders exhibit altered decision-making patterns, which may be associated with their behavioral abnormalities. However, the neuronal mechanisms underlying such impairments are largely unknown. Using a gambling test, we demonstrated that methamphetamine (METH)-treated rats chose a high-risk/high-reward option more frequently and assigned higher value to high returns than control rats, suggestive of changes in decision-making choice strategy. Immunohistochemical analysis following the gambling test revealed aberrant activation of the insular cortex (INS) and nucleus accumbens in METH-treated animals. Pharmacological studies, together with in vivo microdialysis, showed that the insular neural system played a crucial role in decision-making. Moreover, manipulation of INS activation using designer receptor exclusively activated by designer drug technology resulted in alterations to decision-making. Our findings suggest that the INS is a critical region involved in decision-making and that insular neural dysfunction results in risk-taking behaviors associated with altered decision-making.


Decision-making is a key activity of everyday life. Consequently, disturbances in the ability to make appropriate decisions or anticipate their possible consequences can result in massive social, medical, and financial problems. Decision-making depends on three temporally and partially functionally distinct sets of processes: (i) assessment and formation of preferences among possible options, (ii) selection and execution of an action, and (iii) experience or evaluation of the outcome (1). These processes are thought to require an extended neural network, mainly comprising glutamatergic, serotonergic, noradrenergic, and dopaminergic pathways in the frontostriatal and limbic loops, including the cingulate cortex, orbitofrontal cortex (OFC), insular cortex (INS), striatum (St), basal ganglia, and amygdala (2). Analysis of these processes helps to distinguish which aspects of decision-making are differentially affected in various neuropsychiatric disorders (1).

Poor decision-making is a symptom of several neuropsychiatric diseases, such as depression, schizophrenia, Parkinson’s disease, and drug dependence (1, 3–5). Pathological decision-making in neuropsychiatric disorders is associated with an inability to make profitable long-term decisions that incorporate expectations of future outcomes (6). Thus, pathological decision-making is recognized as a core problem in neuropsychiatric disorders, and a better understanding of the mechanisms underlying altered decision-making should provide insights that could lead to successful treatments for these diseases.

A hallmark of addiction is continuous use of substances despite negative consequences or the absence of positive consequences (7). Addicts are less able to flexibly adapt their behavior to changes in reward contingencies, and they have difficulties in integrating reinforcers to guide future behavior (8). Several altered decision-making patterns have been observed in patients with drug dependence: in particular, they tend to preferentially select actions associated with larger short-term gains but long-term losses over actions associated with smaller short-term gains and overall long-term gains (1, 4). Moreover, brain imaging studies have demonstrated that cocaine-dependent patients exhibit reduced cortical thickness in the INS and dorsolateral prefrontal cortex in association with abnormal judgment and decision-making (9), and patients with methamphetamine (METH) dependence exhibit dysfunction in the OFC (1, 4) and activation in the INS (1, 10, 11). In animal studies, dopamine signaling in prefrontal cortex, including the INS and OFC, has been implicated in impulsive choice (12) and risky decision-making (13). However, the specific contributions of these regions to impaired decision-making remain unclear, and it is not known whether the dysfunctions in decision-making and the underlying neural substrates are preexisting conditions that contribute to the initiation of drug use or are instead a consequence of repeated use of drugs.

To address this issue, we tested the effect of chronic METH treatment on decision-making in rats, using a gambling test developed for rodents. Furthermore, we examined the effects of pharmacogenetic manipulations of neuronal activity in the INS on decision-making by using the designer receptor exclusively activated by designer drug (DREADD) technology.

Results

Rats Show Preference for Low-Risk/Low-Reward over High-Risk/High-Reward in a Gambling Test.

Rats were subjected to a gambling test developed for rodents, in which they selected one of four arms (one low-risk/low-reward [L-L] arm, one high-risk/high-reward [H-H] arm, and two empty arms) in 16 trials per day for 14 d. Choice of the L-L arm resulted in a small reward (one food pellet) with high probability (14/16 trials, 87.5%) but sometimes resulted in a quinine-coated food pellet (2/16 trials, 12.5%). On the other hand, choice of the H-H arm resulted in a large reward (seven food pellets) with low probability (2/16 trials, 12.5%) but usually (14/16 trials, 87.5%) resulted in a quinine pellet (Fig. S1B).
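The reward schedule described above can be checked with a short calculation. The sketch below assumes 45-mg pellets (the mass given in the Fig. S1 legend) and counts quinine pellets as zero reward; the function and variable names are illustrative and do not come from the study.

```python
PELLET_MG = 45  # pellet mass stated in the Fig. S1 legend


def session_reward_mg(rewarded_trials, pellets_per_reward):
    """Total reward mass (mg) in one 16-trial session if only this arm is chosen.

    Quinine pellets are counted as zero reward (an assumption of this sketch).
    """
    return rewarded_trials * pellets_per_reward * PELLET_MG


ll_total = session_reward_mg(rewarded_trials=14, pellets_per_reward=1)  # L-L arm
hh_total = session_reward_mg(rewarded_trials=2, pellets_per_reward=7)   # H-H arm
print(ll_total, hh_total)
```

Both arms come to 630 mg per session, which agrees with the total reward value quoted for the standard condition. Under this accounting the two arms differ in risk, not in total payoff.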

Fig. S1.


Gambling test for rodents using an eight-arm radial arm maze. (A) The apparatus consisted of two start arms and four choice arms [one low-risk/low-return (L-L) arm, one high-risk/high-return (H-H) arm, and two empty arms] on an elevated six-arm Plexiglas radial maze. The other two arms of the original eight-arm maze were not used (removed). The food cups were glued onto the surfaces of the arms. The entrances from the start arms to the central platform could be blocked by black guillotine doors. In the decision-making task, rewards were 45-mg banana-flavored food pellets, and negative outcomes (food that was disappointing in comparison with the reward) were 45-mg quinine-coated food pellets that were unpalatable but not inedible. Rats could obtain either rewards or quinine pellets at the ends of the baited arms, but no food pellets were placed in empty arms. (B) Reward probabilities and reward values of the H-H and L-L arms. Under the standard condition, choice of the L-L arm resulted in frequent (14/16 trials, reward probability: 87.5%) small rewards (one 45-mg food pellet; open circle) with infrequent (2/16 trials) negative outcomes (one quinine-coated pellet; closed circle). Choice of the H-H arm resulted in infrequent (2/16 trials, reward probability: 12.5%) large rewards (seven 45-mg food pellets; double open circle) with frequent (14/16 trials) negative outcomes (one quinine-coated pellet; closed circle). Quinine pellets in the L-L arm and large rewards in the H-H arm were provided in a random manner at a rate of 2 out of 16 trials per day. (C) Comparison of the standard condition and the double-reward condition.

When the reward probability of the H-H arm was set at 12.5%, rats also showed preference for the L-L arm over the H-H arm, as do healthy human subjects in the Iowa gambling task (6): H-H arm choice was 9.6 ± 1.3% on day 14 (Fig. S2A). When the reward probability of the H-H arm was increased from 12.5% to 25.0% or 50% and the reward probability of the L-L arm was decreased from 87.5% to 75% or 50%, H-H arm preference increased as a function of reward probability (Fig. S2A). The number of entries into empty arms, considered to represent an index of learning and reference memory of the task rules, decreased steeply during the first 5 d of testing and then remained stable (Fig. S2B).

Fig. S2.


(A) Effect of changing the reward probability on H-H arm choice ratio in the gambling test. Reward probability of H-H arm was changed from 12.5% to 50%, and reward probability of L-L arm was changed from 87.5% to 50%. Values are means ± SE (n = 4–6). Repeated ANOVA revealed significant effects of reward probability [F(2,13) = 5.04, P < 0.05], time [F(13,169) = 1.81, P < 0.05], and their interaction [F(26,169) = 1.84, P < 0.05]. P < 0.05. *P < 0.05 vs. the 12.5% reward probability group on each day. #P < 0.05 vs. the 25% reward probability group on each day. (B) Effect of changing the reward probability on the number of entries into empty arms. Values are means ± SE (n = 4–6). Repeated ANOVA revealed no significant effect of reward probability [F(2,13) = 2.69, P > 0.05] but significant effects of time [F(13,169) = 59.10, P < 0.05] and their interaction [F(26,169) = 3.46, P < 0.05]. *P < 0.05 vs. the 12.5% reward probability group on each day. #P < 0.05 vs. the 25% reward probability group on each day. (C) Effect of changing reward values of the H-H and L-L arms on H-H arm choice ratio. Total reward values of the H-H and L-L arms were doubled from 630 mg to 1,260 mg, but the reward probabilities remained the same as in the standard condition (12.5% and 87.5%, respectively). Values are means ± SE (n = 10). Repeated ANOVA revealed no significant main effect of reward value [F(1,18) = 2.15, P > 0.05] but significant effects of time [F(13,234) = 18.18, P < 0.05] and their interaction [F(13,234) = 3.86, P < 0.05]. On days 2, 3, 6, 8, and 10, simple main effects were significant (P < 0.05). (D) Effect of changing reward values of the H-H and L-L arms on the number of entries into empty arms. Values are means ± SE (n = 10). The effect of time was significant [F(13,234) = 49.56, P < 0.05], whereas the effects of reward value [F(1,18) = 3.70, P = 0.07] and the interaction [F(13,234) = 1.66, P = 0.07] were not.

Next, to examine the effect of reward values of the H-H and L-L arms on decision-making behavior, the total reward values of the H-H and L-L arms were doubled (from 630 mg to 1,260 mg), but the reward probabilities remained the same as in the standard condition (12.5% and 87.5%, respectively). Under this double-reward condition, we observed apparent facilitation of development of risk avoidance behavior (i.e., reduction of H-H arm choice) relative to that in the standard condition (Fig. S2C). On days 6, 8, and 10, the rate of H-H arm choice was significantly lower under the double-reward condition than under the standard condition. The number of entries into empty arms tended to be lower in the double-reward condition than under standard conditions (Fig. S2D). These results suggest that risk avoidance behaviors in this gambling test are controlled by the probability of reward as well as the minimum reward values of the options.

Chronic METH-Treated Rats Chose the H-H Arm More Frequently Than Control Animals.

To determine the effect of chronic drug administration on decision-making, rats that had previously been treated with METH (4 mg/kg, once a day for 30 d) were subjected to the gambling test under the standard condition (Fig. 1A). As shown in Fig. 1B, vehicle-treated control rats gradually developed preference for the L-L arm (blue squares) over the H-H arm (blue circles), as did nontreated naïve rats; however, chronic METH-treated rats chose the H-H arm (red circles) more frequently and the L-L arm (red squares) less frequently than saline-treated control rats. The number of entries into empty arms did not differ between saline- and METH-treated animals (Fig. 1C). There was no difference in preference between two empty arms in both saline- and METH-treated rats (saline group: 50.3 ± 3.4% for one and 49.8 ± 3.4% for the other; METH group: 54.4 ± 5.8% for one and 45.6 ± 5.8% for the other).

Fig. 1.


(A) Experimental scheme. Rats were given METH at a dose of 4 mg/kg once a day for 30 d. The gambling test was initiated after 2 wk of withdrawal from METH. (B) Performance of METH-treated rats in the gambling test. A dotted line indicates the level of chance for arm choice. Values are means ± SE (n = 5–6). P < 0.05. Repeated ANOVA for L-L arm revealed effects on treatment [F(1,9) = 12.61, P < 0.05], time [F(13,117) = 33.57, P < 0.05], and their interaction [F(13,117) = 0.90, P > 0.05]. Repeated ANOVA for H-H arm revealed effects on treatment [F(1,9) = 11.67, P < 0.05], time [F(13,117) = 8.74, P < 0.05], and their interaction [F(13,117) = 1.98, P < 0.05]. *P < 0.05 vs. the H-H arm choice in saline-treated group on each day. (C) Number of entries into empty arms. Values are means ± SE (n = 5–6). n.s., not significant. Repeated ANOVA revealed effects on treatment [F(1,9) = 1.12, P > 0.05], time [F(13,117) = 34.00, P < 0.05], and their interaction [F(13,117) = 2.50, P < 0.05]. (D–G) Win–stay/lose–shift behavior of METH-treated rats in the gambling test. Values are means ± SE (n = 5–6). *P < 0.05 vs. saline-treated group. Significant effects were observed in D (P < 0.05 by u test) and G (P < 0.05 by u test) but not E (P > 0.05 by u test) or F (P > 0.05 by u test).

Next, to evaluate the effect of reward prediction error (RPE) on subsequent arm choice behaviors, we analyzed the choice data as a three-choice task including the empty arm. When animals received the large reward in the H-H arm (large positive RPE), the METH-treated rats chose the H-H arm (win–stay behavior) more frequently in the next trial than control rats did (Fig. 1D). There was no difference in H-H arm choice ratio between control and METH-treated animals after a quinine pellet was encountered in the H-H arm (negative RPE) (Fig. 1E). Alternatively, when METH-treated rats encountered a quinine pellet following a choice of the L-L arm (negative RPE), they chose the H-H arm in the next trial (lose–shift behavior) more frequently than control animals did (Fig. 1G). Again, there was no difference in H-H arm choice ratio between the two groups of rats after a normal food pellet was encountered in the L-L arm (small positive RPE) (Fig. 1F). Furthermore, there were no differences in empty arm choice ratio between saline- and METH-treated rats after receiving large reward (Fig. S3A), small reward (Fig. S3C), or quinine pellet (Fig. S3 B and D). These results suggest that METH-treated animals may suffer from impairments in the neural systems that control responses evoked by both large positive and negative RPEs, leading to impulsive choice behaviors.
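The win–stay and lose–shift measures used here are conditional choice ratios: the fraction of trials on which the H-H arm is chosen immediately after a given arm and outcome. A minimal sketch of that computation follows. The trial sequence is hypothetical (not data from the study), the labels are illustrative, and pairing consecutive trials without regard to day boundaries is an assumption of this sketch.

```python
def next_hh_ratio(trials, prev_arm, prev_outcome):
    """Fraction of H-H choices on the trial following (prev_arm, prev_outcome).

    `trials` is a list of (arm, outcome) tuples in the order they occurred.
    Returns None when the conditioning event never occurs.
    """
    hits = total = 0
    for (arm, outcome), (next_arm, _) in zip(trials, trials[1:]):
        if arm == prev_arm and outcome == prev_outcome:
            total += 1
            hits += next_arm == "HH"
    return hits / total if total else None


# Hypothetical sequence for illustration only.
trials = [
    ("HH", "large"), ("HH", "quinine"), ("LL", "small"), ("LL", "quinine"),
    ("HH", "quinine"), ("LL", "quinine"), ("LL", "small"),
]

win_stay = next_hh_ratio(trials, "HH", "large")      # stay on H-H after a large reward
lose_shift = next_hh_ratio(trials, "LL", "quinine")  # shift to H-H after quinine in L-L
print(win_stay, lose_shift)
```

Computing one ratio per animal and then comparing groups corresponds to the per-panel comparisons in Fig. 1 D–G.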

Fig. S3.


(A and C) Win–empty/(B and D) lose–empty behavior of METH-treated rats in the gambling test. Values are means ± SE (n = 5–6). There were no significant effects (P > 0.05 by u test).

Next, we analyzed the numbers of large rewards and quinine pellets acquired in the first (days 1–7) and second (days 8–14) halves of the gambling test. Our analysis revealed that the elevated H-H arm choice in METH-treated animals was not due to chance encounters of either large rewards following choice of the H-H arm during the initial phase of the gambling test (Fig. S4 A–C) or quinine pellets following choice of the L-L arm (Fig. S4D). Rather, over all trials of the gambling test, METH-treated rats encountered quinine pellets more frequently than saline-treated rats did (Fig. S4 E and F). There was no difference in the total number of reward pellets between the two groups (Fig. S4 G and H). In the food consumption test, there was no difference between control and METH-treated groups with respect to approach time (82.7 ± 42.1 s for control vs. 58.5 ± 41.5 s for METH), consumption time (151.6 ± 39.7 s for control vs. 185.2 ± 6.5 s for METH), number of rewards eaten (26.4 ± 4.9 pellets for control vs. 35.0 ± 2.4 pellets for METH), or the ratio of the number of rewards eaten to consumption time (0.18 ± 0.01 pellets per s for control vs. 0.19 ± 0.01 pellets per s for METH). These results suggest that the altered arm choice behavior in METH-treated rats was not due to a change in their motivation to obtain food. Moreover, the altered choice behavior in METH-treated rats was a consequence not only of repeated treatment with METH but also of subsequent experience manifested under uncertain conditions in which RPEs were inserted into the test (Figs. S5 and S6).

Fig. S4.


Difference between saline- and METH-treated rats in the number of reward pellets acquired. (A–C) Number of large rewards acquired in the H-H arm. Values are means ± SE (n = 5–6). P < 0.05. *P < 0.05 vs. saline-treated group. (A) Number of large rewards acquired per day. (B) Total number of large rewards acquired over 14 d. (C) Number of large rewards acquired in the first (days 1–7) and second (days 8–14) halves of the gambling test. There were significant effects in the number of large rewards acquired per day [A; treatment: F(1,9) = 21.76, P < 0.05; time: F(13,117) = 2.59, P < 0.05; interaction: F(13,117) = 1.04, P > 0.05] and in the total number of large rewards acquired over 14 d (B; P < 0.05). There was no difference between the two groups in the first block (C; days 1–7, P > 0.05; days 8–14, P < 0.05). (D–F) Number of quinine pellets acquired. Values are means ± SE (n = 5–6). (D) Number of quinine pellets acquired per day in the L-L arm. There was no significant main effect on treatment [F(1,9) = 1.99, P > 0.05], time [F(13,117) = 1.72, P > 0.05], or their interaction [F(13,117) = 1.69, P > 0.05]. (E) Total number of quinine pellets acquired in both arms over 14 d. *P < 0.05 vs. saline-treated group. (F) Total number of quinine pellets acquired in both arms in the first (days 1–7) and second (days 8–14) halves of the gambling test. *P < 0.05 vs. the saline-treated group. (G–I) Number of reward pellets acquired. Values are means ± SE (n = 5–6). (G) Number of reward pellets acquired per day. There was no significant effect on treatment [F(1,9) = 0.003, P > 0.05], time [F(13,117) = 0.83, P > 0.05], or their interaction [F(13,117) = 1.19, P > 0.05]. (H) Total number of reward pellets acquired over 14 d (P > 0.05). (I) Number of reward pellets acquired in the first half (days 1–7) and second half (days 8–14) of the gambling test (days 1–7, P > 0.05; days 8–14, P > 0.05).

Fig. S5.


Posttreatment with METH altered the established preference for the L-L arm in the gambling test. (A) Experimental scheme of the behavioral assessment. Animals that had been previously subjected to the gambling test were given METH (4 mg/kg, s.c.) once a day for 14 consecutive days. The effect of posttreatment with METH on choice behavior in the gambling test was investigated three times (the first gambling test on days 7 and 8, the second on days 14 and 15, and the third on days 21 and 22) after initiation of METH administration. Data from the first, second, and third gambling tests were expressed as mean performance over 2 d. (B) Effect on H-H arm choice ratio in the gambling test. Values are means ± SE (n = 10). P < 0.05. Repeated ANOVA revealed significant effects [treatment: F(1,18) = 8.90, P < 0.05; time: F(2,36) = 1.89, P > 0.05; interaction: F(2,36) = 1.23, P > 0.05]. (C) Effect on the number of entries into empty arms in the gambling test. Values are means ± SE (n = 10). There were no significant effects [treatment: F(1,18) = 2.35, P > 0.05; time: F(2,36) = 0.24, P > 0.05; interaction: F(2,36) = 0.08, P > 0.05]. (D–G) Win–stay/lose–shift behavior in the three postgambling tests (first through third tests). Values are means ± SE [(D) n = 7–9, (E) n = 6–10, (F) n = 10, and (G) n = 10]. *P < 0.05 vs. saline-treated group. There were significant effects in D (P < 0.05 by u test) and G (P < 0.05 by u test) but not in E (P > 0.05 by u test) or F (P > 0.05 by u test).

Fig. S6.


Preference for the H-H arm in METH-treated rats is dependent on subsequent experience. (A) Experimental scheme of behavioral assessment. Animals that had been previously subjected to the gambling test were given METH (4 mg/kg, s.c.) once a day for 14 consecutive days. The effect of posttreatment with METH on choice behavior in the gambling test was investigated three times (first gambling test on days 7 and 8, second on days 14 and 15, and third on days 21 and 22) after initiation of METH administration. Data from the first, second, and third gambling tests are expressed as the mean performance over 2 d. METH or saline-treated rats were trained under a 0% ratio of quinine-coated pellets in the L-L arm and of large reward in the H-H arm. The effect on decision-making of the unexpected experience of a large reward or quinine pellet after posttreatment with METH was investigated 7, 8, 14, and 15 d after the first administration, as well as 21 and 22 d afterward (during withdrawal). (B) Effect on H-H arm choice. Values are means ± SE (n = 4–5). Repeated ANOVA revealed no significant effects [treatment: F(1,7) = 0.12, P > 0.05; time: F(2,14) = 0.33, P > 0.05; interaction: F(2,14) = 0.07, P > 0.05]. (C) Effect on the number of entries into empty arms in the gambling test. Values are means ± SE (n = 4–5). Repeated ANOVA revealed no significant effects [treatment: F(1,7) = 0.36, P > 0.05; time: F(2,14) = 0.78, P > 0.05; interaction: F(2,14) = 0.45, P > 0.05].

Finally, to examine the effect of the reward value of the H-H and L-L arms on the change in choice strategy in METH-treated rats, we subjected these animals to the gambling test under the double-reward value condition. Surprisingly, under these conditions, METH-treated animals did not exhibit a difference in arm choice behavior relative to saline-treated control animals, although they exhibited a significant difference in the rate of entries into empty arms. Furthermore, there were no differences in win–stay or lose–shift behavior between control and METH-treated animals (Fig. S7).

Fig. S7.


Performance of chronic METH-treated rats in the gambling test under the double-reward condition. Rats were given METH at a dose of 4 mg/kg once a day for 30 d. The gambling test was initiated after 2 wk of METH withdrawal, as shown in Fig. 1. (A) Effect on H-H arm choice ratio in the gambling test. Values are means ± SE (n = 7–9). Repeated ANOVA revealed no significant effect on H-H arm choice ratio [treatment: F(1,14) = 0.01, P > 0.05; time: F(13,182) = 12.76, P < 0.05; interaction: F(13,182) = 1.10, P > 0.05]. (B) Number of entries into empty arms. Values are means ± SE (n = 7–9). Repeated ANOVA revealed significant effect on the number of entries into empty arms [treatment: F(1,14) = 4.95, P < 0.05; time: F(13,182) = 32.70, P < 0.05; interaction: F(13,182) = 2.25, P < 0.05]. P < 0.05. *P < 0.05 vs. the saline-treated group on each day. (C–F) Win–stay/lose–shift behavior. Values are means ± SE (n = 7–9). There were no significant effects in any of these figures (P > 0.05 by u test). (G–J) Win–empty/lose–empty behavior. Values are means ± SE (n = 7–9). A significant effect was observed in I (P = 0.0495 by u test) but not G, H, or J (P > 0.05 by u test).

Chronic METH-Treated Rats Assign Higher Motivational Values to Large Rewards.

There are several possible reasons why the METH-treated rats preferred the H-H arm to the L-L arm. Because the rate of H-H arm choice in the METH-treated group did not exceed 33% (the level of chance), two trivial explanations might apply: first, the METH treatment may have merely impaired or retarded the learning process, and second, METH treatment may simply have made the choice more random. A third, nontrivial explanation is that the METH treatment altered the relative motivational values associated with small reward, large reward, and quinine. To determine which mechanism best explains the data, we applied reinforcement learning models to the rats’ behavioral data and estimated the process underlying the trial-by-trial learning dynamics (14). If the first possibility is correct, the learning rate α, which determines the magnitude of the value update resulting from a single trial, would differ between METH-treated and control rats. If the second possibility is correct, the inverse temperature parameter β, which determines the randomness of choice, would differ between the two groups. On the other hand, if the third possibility is correct, the motivational value parameters for the large reward κBR (relative to quinine and small reward) should differ between groups.
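The three candidate explanations correspond to different parameters of a reinforcement learning model. The sketch below assumes a generic Q-learning update with softmax choice, in which the outcome of each trial enters as a motivational value κ. The source cites ref. 14 for its modeling approach; the exact update rule, the function names, and all numerical values here are illustrative assumptions.

```python
import math


def softmax(q_values, beta):
    """Choice probabilities; a larger inverse temperature beta gives less random choice."""
    m = max(q_values)  # subtract the maximum for numerical stability
    exps = [math.exp(beta * (q - m)) for q in q_values]
    z = sum(exps)
    return [e / z for e in exps]


def update(q_values, arm, kappa, alpha):
    """Move the chosen arm's value toward the outcome's motivational value kappa."""
    rpe = kappa - q_values[arm]  # reward prediction error
    new_q = list(q_values)
    new_q[arm] += alpha * rpe    # alpha is the learning rate
    return new_q


# Arms indexed as 0 = L-L, 1 = H-H, 2 = empty; all values are hypothetical.
q = [0.0, 0.0, 0.0]
q = update(q, arm=1, kappa=4.0, alpha=0.5)  # a large reward in the H-H arm
probs = softmax(q, beta=1.0)
print(q, probs)
```

In this form, a larger κ for the large reward raises the value of the H-H arm after each win, whereas α only scales how quickly values change and β only scales how deterministically the current values are followed.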

Fig. 2 and Fig. S8 show the estimated parameters for each group. Regarding the standard condition, we observed significant differences between groups in regard to the motivational value parameters for large reward and empty outcome, as well as the inverse temperature parameters. Among them, the differences in the motivational value for large rewards best explain the difference in the behavior of saline- and METH-treated rats. In addition, the inverse temperature was larger for METH-treated rats. Because larger inverse temperature value induces less random choice, the difference in the inverse temperature itself cannot account for the observation that the METH-treated rats chose the H-H arm more often than the control rats. For the double-reward condition, only the inverse temperature parameter yielded a significant difference between the two groups. These observations indicated that the METH-treated rats assigned higher motivational values to large rewards than control rats under the standard condition but not under the double-reward condition. This result supports the third explanation offered above, i.e., that METH treatment alters the balance of motivational values, rather than merely impairing learning or choice behaviors.

Fig. 2.


Reinforcement learning model-based analysis of rats’ performance. (A–D) Estimated model parameters of the best model, selected based on the AIC. The selected model includes the learning rate that is shared between the two groups (METH-treated rats and saline-treated control rats) (A) and the motivational value parameter for the large reward κLR, which has different values between groups [χ²(1) = 12.54, P < 0.001, likelihood ratio test]; the motivational value for empty outcome κE, which has different values between groups [χ²(1) = 7.96, P = 0.0048]; the inverse temperature, which also has different values between groups [χ²(1) = 6.97, P = 0.0083]; and zero motivational value for the quinine pellet, κQ. A dotted line indicates one motivational value for small reward in each group. (E and F) Model simulation results using the parameters of the best model. Fractions of arm choices on each day were averaged over 10,000 simulation runs. A dotted line indicates the level of chance for arm choice. This figure corresponds to Fig. 1B.
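The P values reported for the likelihood ratio tests in this legend follow from the chi-square distribution with 1 degree of freedom. As a hedged check, the sketch below uses the identity P(χ²₁ > x) = erfc(√(x/2)); the function name is illustrative and this is not the authors' analysis code.

```python
import math


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# Test statistics taken from the legend above.
for stat in (12.54, 7.96, 6.97):
    print(stat, chi2_sf_df1(stat))
```

The identity holds because a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so its upper tail is the two-sided normal tail at √x.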

Fig. S8.


Reinforcement learning model-based analysis of rats’ performance under the double-reward condition. The conventions are the same as in Fig. 2. (A–D) The model in which the learning rate differs between METH-treated rats and control rats and the motivational value for empty arm has nonzero value whereas the value for a quinine pellet is set to zero was selected based on the AIC (AIC = 6,595.24). A dotted line indicates one motivational value for small reward in each group. The difference in learning rate did not reach the level of significance [χ²(1) = 2.83, P = 0.092, likelihood ratio test], whereas the difference in the motivational value of the empty arm yielded a significant effect [χ²(1) = 11.20, P < 0.001]. Thus, we used the model in which all of the parameters, except for the motivational value of the empty arm, were shared by the two groups (AIC = 6,596.07). (E and F) Accordingly, the simulated learning curves (the development of H-H arm choice ratio) and the number of entries into empty arms were almost identical.
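The two AIC values and the likelihood ratio statistic quoted in this legend can be cross-checked against each other. The sketch assumes the standard definition AIC = 2k − 2 ln L and that the two models are nested and differ by exactly one free parameter (the group-specific learning rate); the function and variable names are illustrative.

```python
def aic_difference(lr_chi2, extra_params):
    """AIC(reduced) - AIC(full) for nested models.

    With AIC = 2k - 2 ln L and lr_chi2 = 2 * (ln L_full - ln L_reduced),
    the difference equals lr_chi2 - 2 * extra_params.
    """
    return lr_chi2 - 2 * extra_params


predicted = aic_difference(lr_chi2=2.83, extra_params=1)
reported = 6596.07 - 6595.24  # AIC values quoted in the legend
print(predicted, reported)
```

Both differences are about 0.83. A likelihood ratio statistic between 2 and the 5% critical value of 3.84 produces exactly this pattern: the fuller model has the slightly lower AIC even though the extra parameter is not significant.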

Changes in c-Fos Expression Induced by the Gambling Test.

Previous studies have suggested the involvement of the brain reward system, as well as the frontostriatal neuronal system, in decision-making (2, 13, 15). To identify the possible brain areas that could be involved in the altered decision-making of METH-treated rats, saline-treated control and METH-treated animals were subjected to the gambling test under either the standard or double-reward condition for 14 d and then subjected to c-Fos immunohistochemistry. METH-treated animals that were subjected to the gambling test under the double-reward condition were regarded as an alternative control group because they showed normal arm choice behavior under this condition (Figs. S7 and S8).

Quantitative analysis of c-Fos staining revealed significant differences between saline- and METH-treated animals in the numbers of c-Fos–positive cells in the INS, OFC, core of nucleus accumbens (NAc), and St but no differences in other brain areas such as anterior cingulate cortex (ACC), prelimbic cortex (PrL), and shell of nucleus accumbens (NAs) (Fig. 3 and Fig. S9). In particular, the numbers of c-Fos–positive cells in the INS, St, and NAc of METH-treated rats subjected to the gambling test under the standard condition were significantly higher than those in saline-treated control rats (Fig. 3; P < 0.05). This increase in c-Fos expression was specific to the standard condition of the gambling test and was not observed under the double-reward condition. Accordingly, neural activity in the INS, St, and NAc of METH-treated rats was associated with arm choice behavior in the gambling test under both the standard (Fig. 1) and double-reward conditions (Fig. S7). By contrast, although c-Fos expression was elevated in the OFC of METH-treated rats, this change was not associated with these animals’ behavior in the gambling test. These results suggest that neural activation in the INS, NAc, and St is associated with altered decision-making in METH-treated rats regarding arm choice behavior in the gambling test but is not associated with previous drug treatment history. In subsequent experiments, we focused on the role of the INS in arm choice decision-making in the gambling test because the INS is involved in computing the level of risk associated with specific actions and plays a role in making decisions that require weighting of uncertain positive and negative consequences (16–19).

Fig. 3.


Changes in c-Fos expression evoked by the gambling test under the standard or double-reward condition. c-Fos expression was analyzed immunohistochemically in saline- and METH-treated rats 2 h after the gambling test. Photographs show typical examples of c-Fos expression in various brain areas of saline- or METH-treated rats, following the gambling test under the standard condition. Values are means ± SE (n = 3–4). *P < 0.05 vs. saline-treated group (standard condition). P < 0.05 vs. saline-treated group (double-reward condition). #P < 0.05 vs. METH-treated group (double-reward condition). n.s., not significant. OFC, orbitofrontal cortex [F(3,9) = 5.65, P < 0.05]; INS, insular cortex [F(3,9) = 4.93, P < 0.05]; NAc, core of the nucleus accumbens [F(3,10) = 11.3, P < 0.05]; NAs, shell of the nucleus accumbens [F(3,10) = 1.74, P > 0.05]; St, striatum [F(3,10) = 8.25, P < 0.05]. (Scale bar: 100 µm.)

Fig. S9.

No changes in c-Fos expression in the ACC and PrL of METH-treated rats were evoked by the gambling test under the standard condition. c-Fos expression was analyzed immunohistochemically in saline- and METH-treated rats 2 h after the gambling test as in Fig. 3. Photographs show typical examples of c-Fos expression in various brain areas of saline- or METH-treated rats after the gambling test under the standard condition. Values are means ± SE (n = 3–4). ACC, anterior cingulate cortex; PrL, prelimbic cortex. (Scale bar: 100 µm.)

Manipulation of Insular Neural Activity Modifies Arm Choice Behavior in the Gambling Test.

After being subjected to the gambling test under the standard condition for 14 d, METH-treated rats with higher H-H arm preference than saline-treated rats were divided into two groups: one group received bilateral microinjections of GABA receptor agonists to suppress INS neural activity, and the other received injections of vehicle. Microinjections of vehicle into the INS of METH-treated (METH/vehicle) rats had little effect on their arm choice behavior in the gambling test; thus, the METH/vehicle group exhibited a stable high level of H-H arm preference over 5 d (Fig. 4A). However, when INS neural activity was suppressed by GABA receptor agonists in METH-treated (METH/GABA agonist) rats, their elevated H-H arm choice ratio gradually decreased (Fig. 4A). In METH-treated rats, microinjections of vehicle or GABA receptor agonist into the INS had little effect on the number of entries into empty arms (Fig. 4B). Following treatment with GABA receptor agonist, win–stay behavior in METH-treated rats did not significantly decrease (Fig. 4C), whereas lose–shift behavior was significantly reduced (Fig. 4D). When the METH/GABA agonist rats were subjected to the gambling test for an additional 3 d without pretreatment with GABA receptor agonist, their reduced H-H arm choice ratio returned to the previous level (control group: 41.8 ± 5.2%; GABA receptor agonist-treated group: 28.6 ± 1.9%).

Fig. 4.

Dysfunction of the GABA system in the INS of METH-treated rats is involved in altered decision-making. (A) Effect of GABA receptor agonist (mixture of GABAA and GABAB receptor agonists) on the increase of H-H arm choice ratio in the gambling test in METH-treated rats. One group of animals received bilateral microinjections of GABA receptor agonist into the INS, whereas the other group received vehicle as a control. They were subjected to the gambling test 15 min after the injection, and the experiment was repeated for 5 consecutive days. Values are means ± SE (n = 5). P < 0.05. Repeated ANOVA revealed a significant effect of treatment [F(1,8) = 9.74, P < 0.05] but no significant effect of time [F(4,32) = 2.46, P > 0.05] or their interaction [F(4,32) = 2.38, P > 0.05]. (B) Number of entries into empty arms. Values are means ± SE (n = 5). There were no significant effects [treatment: F(1,8) = 0.18, P > 0.05; time: F(4,32) = 1.84, P > 0.05; interaction: F(4,32) = 1.95, P > 0.05]. (C) Win–stay behavior. Values are means ± SE (n = 4–5). There were no significant effects (P > 0.05 by U test). (D) Lose–shift behavior. Values are means ± SE (n = 5). *P < 0.05 vs. METH/vehicle group. Significant effects were observed (P < 0.05 by U test). (E) Effect of GABA receptor antagonist (mixture of GABAA and GABAB receptor antagonists) on H-H arm choice ratio in the gambling test in normal rats. GABA receptor antagonist was injected into the INS of normal rats that had been previously subjected to the gambling test. These animals were again subjected to the gambling test 15 min after the injection; the experiment was repeated for 5 consecutive days. Values are means ± SE (n = 7). P < 0.05. Repeated ANOVA revealed a significant effect of treatment [F(1,12) = 6.36, P < 0.05] but no significant effect of time [F(4,48) = 0.68, P > 0.05] or their interaction [F(4,48) = 2.31, P > 0.05]. 
(F) Number of entries into empty arms [treatment: F(1,12) = 0.29, P > 0.05; time: F(4,48) = 0.56, P > 0.05; interaction: F(4, 48) = 0.65, P > 0.05]. Values are means ± SE (n = 7). (G) Depolarization-evoked increase in the extracellular GABA levels in INS of METH-treated and saline-treated rats. Values are means ± SE (n = 9–11). Significant main effects were not observed [group: F(1,18) = 2.38, P > 0.05; time: F(16,288) = 12.6, P < 0.05; interaction: F(16, 288) = 0.81, P > 0.05]. The basal levels of GABA were not significantly different between the two groups (P > 0.05). (H) AUC of high (60 mM) K+-evoked GABA release for 60 min. Values are means ± SE (n = 9–11). *P < 0.05 vs. saline-treated group.

We also examined the effect of a mixture of GABAA and GABAB receptor antagonists on arm choice behavior in the gambling test. When INS neural activity was disinhibited by GABA receptor antagonist in naïve rats that preferred the L-L arm to the H-H arm, these rats began to choose the H-H arm more frequently than vehicle-treated control rats did (Fig. 4E), although there was no change in the number of entries into empty arms (Fig. 4F). These findings suggest that the INS neural system plays a crucial role in arm choice behavior in the gambling test.

GABA Transmission Is Disrupted in the INS of METH-Treated Rats.

To analyze activity-dependent GABA release in the INS of METH-treated rats, we measured depolarization-evoked GABA release in the INS by in vivo microdialysis. In saline-treated control rats, high (60 mM) K+ stimulation increased extracellular GABA levels in the INS to 346.6 ± 55.3% of basal levels at the peak, and higher (100 mM) K+ stimulation increased the level to 517.4 ± 105.8% (Fig. 4G). In METH-treated rats, depolarization-evoked GABA release in the INS was reduced compared with that in control rats (Fig. 4G). In particular, the AUC of high (60 mM) K+-evoked GABA release for 60 min was significantly reduced in METH-treated rats compared with control rats (Fig. 4H). There was no significant difference in basal extracellular GABA levels between control and METH-treated rats (2.37 ± 0.57 nM in saline group and 2.44 ± 0.45 nM in METH group). These results suggest that METH-treated rats have impairments in activity-dependent GABA release in the INS and that dysfunction in GABA neurotransmission in the INS may be associated with the altered decision-making of METH-treated rats.

Manipulation of Neural Activity in the INS Using the DREADD Technology Alters Arm Choice Behavior in Normal and METH-Treated Rats.

Finally, we investigated whether manipulating the activity of the INS using the DREADD technology would affect arm choice behavior in the gambling test. Specifically, to determine whether aberrant activation of the INS was necessary and sufficient to alter the choice strategy in METH-treated rats, we used a mutant human M4 muscarinic receptor (hM4Di) transgene to suppress the aberrant neural activity in the INS of METH-treated rats, whereas we injected mutant human M3 muscarinic receptor (hM3Dq) to activate INS neurons in normal rats.

Normal rats overexpressing hM3Dq and METH-treated rats overexpressing hM4Di in the INS were subjected to the daily gambling test for 14 d; each test occurred 30 min after i.p. injection of either clozapine-N-oxide (CNO) or vehicle (control group). First, we confirmed that CNO treatment induced c-Fos expression in the INS of hM3Dq-overexpressing rats, indicating that CNO treatment activated the INS (Fig. 5 A–C). In normal rats overexpressing hM3Dq in the INS, CNO injection resulted in an increase in H-H arm choice relative to vehicle-treated rats (Fig. 5D) but no difference in the number of entries into empty arms (Fig. 5E). There was no difference in win–stay behavior in CNO-treated rats relative to vehicle-treated rats (Fig. 5F), whereas lose–shift behavior was significantly elevated in response to CNO treatment (Fig. 5G).

Fig. 5.

Manipulation of neural activity in the INS using the DREADD technology resulted in alterations of arm choice behavior in the gambling test in normal and METH-treated rats. (A) Representative photograph of the expression of hM3Dq-mCherry in the INS. (B) Representative photograph of c-Fos expression induced by CNO treatment in the INS of hM3Dq-overexpressing rats. (Scale bar: 50 µm.) (C) Quantitation of c-Fos expression induced by CNO treatment in the INS of hM3Dq-overexpressing rats. Rats were killed 1.5 h after CNO treatment. Values are means ± SE (n = 4–5). *P < 0.05 vs. vehicle-treated group. (D) hM3Dq was overexpressed in the INS of naïve rats. The rats were given i.p. injections of CNO (0.5 mg/kg) or vehicle 30 min before each gambling test. Values are means ± SE (n = 4). P < 0.05. Repeated ANOVA revealed significant effects on treatment [F(1,6) = 27.96, P < 0.05], time [F(13,78) = 9.76, P < 0.05], and their interaction [F(13,78) = 2.09, P < 0.05]. *P < 0.05 vs. the vehicle-treated group on each day. (E) Effect on number of entries into empty arms. Values are means ± SE (n = 4). There were no significant effects [treatment: F(1,6) = 0.64, P > 0.05; time: F(13,78) = 17.35, P < 0.05; interaction: F(13,78) = 0.28, P > 0.05]. (F) Win–stay behavior in hM3Dq-overexpressing rats. Values are means ± SE (n = 4). No significant effect was observed (P > 0.05). (G) Lose–shift behavior in hM3Dq-overexpressing rats. Values are means ± SE (n = 4). *P < 0.05 vs. vehicle-treated group. (H) hM4Di was overexpressed in the INS of METH-treated rats. The rats were given i.p. injections of CNO (0.5 mg/kg) or vehicle 30 min before each gambling test. Values are means ± SE (n = 7). P < 0.05. Repeated ANOVA revealed significant effects on treatment [F(1,12) = 20.27, P < 0.05], time [F(13,156) = 5.38, P < 0.05], and their interaction [F(13,156) = 5.41, P < 0.05]. *P < 0.05 vs. METH/vehicle group on each day. (I) Effect on the number of entries into empty arms. 
Values are means ± SE (n = 7). Repeated ANOVA revealed significant effects [treatment: F(1,12) = 6.09, P < 0.05; time: F(13,156) = 23.87, P < 0.05; interaction: F(13, 156) = 1.11, P > 0.05]. P < 0.05. (J) Win–stay behavior of hM4Di-overexpressing rats. Values are means ± SE (n = 7). *P < 0.05 vs. METH/vehicle-treated group. (K) Lose–shift behavior of hM4Di-overexpressing rats. Values are means ± SE (n = 7). *P < 0.05 vs. METH/vehicle-treated group.

By contrast, when hM4Di was overexpressed in the INS of rats that had been previously treated with METH for 30 d, CNO treatment resulted in a significant decrease in H-H arm choice ratio relative to vehicle treatment (Fig. 5H). As with the normal rats described above, the treatment had little effect on the number of entries into empty arms (Fig. 5I). The elevations in win–stay (Fig. 5J) and lose–shift behavior (Fig. 5K) in METH-treated rats were significantly decreased in response to CNO treatment. These behaviors suggest that the INS plays a critical role in arm choice in the gambling test used in this study.

Discussion

Computational and neurobiological studies of decision-making have provided much insight into the neural mechanisms that underlie suboptimal decision-making behaviors observed in patients with various psychiatric and neurological disorders (15). However, because optimal decision-making according to the demands of specific tasks is likely to involve multiple algorithms and brain systems in various combinations, it is challenging to make generalizations about the nature of decision-making deficits in different disorders (15).

In this study, we developed a gambling test for rodents that follows rules similar to those of the human Iowa gambling test. Using this test, we demonstrated that METH-treated rats exhibited sustained/enduring alteration of decision-making or changes in choice strategy (increase in risky choice) even 1 mo after withdrawal.

Our behavioral results definitively showed that arm choice behavior was changed in METH-treated rats under the standard condition but not under the double-reward condition. In prospect theory, the shape of the value function is concave for gains and convex for losses, accounting for the empirical findings that humans are risk-averse for gains but risk-seeking for losses (3). METH-treated rats exhibited the same risk-averse behavior as control animals (Fig. 1B), but the value function of the small reward (one food pellet) in METH-treated rats under the standard condition may have been lower than that in control rats. Under the double-reward condition, such a difference in value function of small reward (two food pellets) between the two groups of rats may have been concealed by differences in the slopes of concave reward value function curves, leading to the disappearance of the change in arm choice strategy in METH-treated rats under the double-reward condition. From the viewpoint of clinical treatment, our findings suggest that an enriched environment may provide a means to ameliorate the altered decision-making in METH-dependent subjects.

It remains unclear whether this behavior in METH-treated rats reflects an impulse-like disorder rather than slower learning. To provide a hypothetical explanation for the arm choice behavior in METH-treated rats, we applied a reinforcement-learning model and estimated the learning rate, randomness of arm choice, and motivational value of the large reward. This analysis indicated that the alteration of arm choice behavior in METH-treated rats is not merely due to impairment of the learning process or randomness of arm choice. Instead, the results showed that METH-treated rats assigned a higher motivational value to the large reward than control animals did. Yechiam et al. (20) applied a reinforcement learning model that distills performance in the Iowa gambling task of 10 populations, including chronic cocaine abusers, chronic cannabis abusers, young polydrug abusers, and young alcohol abusers, into three different underlying psychological components: the relative effect of rewards and punishments on evaluations of options, the rate at which contingent payoffs are learned, and the consistency between learning and responding. Their analysis indicated that chronic cocaine abusers pay significantly higher attention to gains (21), and cannabis abusers show both significantly higher attention to gains and greater recency effects compared with control subjects. By contrast, young polydrug or alcohol abusers exhibited parameter values very similar to those of their respective control groups. Thus, although all drug-abusing subjects share a common decision-making deficit, the psychological processes underlying this impairment may vary according to the type of drug abused, the duration of abuse, and the age of the subject.

Together, the behavioral data obtained using the rodent gambling test, theoretical analysis of the results using a reinforcement learning model, and the c-Fos expression studies suggest that dysfunction in the frontostriatal network may contribute to the observed change in arm choice strategy in METH-treated rats and that the outcome of processing and modification of option values in the decision-making process may be altered by chronic METH treatment. Both win–stay and lose–shift behaviors were increased in the METH-treated group, suggesting that both positive and negative RPEs may contribute to an increased tendency toward risky choice in METH-treated rats. Accumulating evidence suggests that RPEs are represented by the activity of dopaminergic neurons in the ventral tegmental area and used in a reinforcement-learning process (22, 23). The dorsal striatum contributes directly to aspects of decision-making, especially action selection and initiation, through the integration of sensorimotor, cognitive, and motivational/emotional information within specific corticostriatal circuits involving discrete regions of striatum (24). By contrast, the ventral striatal signal changes dynamically over time, dependent on the phase of the reward process and on learning status, and thereby acts as a motivational engine for the continuation of behavior (25, 26). Recent studies have suggested that ventral striatal dopamine D2 receptor expression is associated with risky decision-making (13, 27). Accordingly, further studies are warranted to clarify the contribution of dopaminergic neurons, as well as the striatal direct and indirect efferent pathways (28), to the altered arm choice strategy in METH-treated rats.

The INS is also critically involved in decision-making (16–19). INS activation is related to reward expectations in decision-making, as are the amygdala, basal ganglia, and OFC (2). The INS contributes to conscious drug urges and decision-making processes that precipitate relapse (29). Because the INS is connected to the amygdala (1) and striatum (30), the INS constructs the frontostriatal and limbic loops related to decision-making. Several neuroimaging studies have revealed dysfunctions of the prefrontal cortex in stimulant-dependent subjects (5, 29), and the close correlation between risky responses, harm avoidance, and activation of the INS suggests that this area plays a role in punishment (1, 31, 32). INS activation during decision-making tasks may alert the individual to expected aversive outcomes, and reduced activation in this area would be consistent with a diminished ability to differentiate between choices that lead to good vs. poor outcomes, potentially a key factor in METH relapse (33). We detected elevated c-Fos expression in the INS, striatum, and NAc of METH-treated rats that exhibited alteration of decision-making in the gambling test, suggesting that aberrant activation in the frontostriatal loop may contribute to the change in choice strategy. Previous studies have demonstrated that chronic treatment of rats with psychostimulants such as amphetamine and cocaine increases the density of dendritic spines on medium spiny neurons in the frontostriatal loop including the nucleus accumbens, caudate-putamen, and prefrontal cortex (34, 35), which may underlie the aberrant activation of these brain areas by the gambling test.

Behavioral errors are associated with activation of the salience network, an intrinsic connectivity network characterized by spatially consistent functional connectivity of intrinsic brain activity (36), and the anterior right INS plays a central role in the salience network response to errors (37). Based on our observation that the INS was aberrantly activated in METH-treated rats following the gambling test, we speculate that the salience network was responding to the positive and negative RPEs in the test and that this response is what caused the aberrant INS activity. Thus, aberrant activation of the INS following the gambling test in METH-treated rats may reflect the response of the salience network to positive and negative RPEs in the test. Because signals related to value functions and reward prediction errors are observed in many different areas (38), many other brain areas are likely to be involved in the process of updating action value functions; consequently, impaired functions of various brain areas might contribute to more impulsive choices in METH-dependent individuals (39).

Previously, we demonstrated that baclofen acutely ameliorates the cognitive deficits in repeatedly METH-treated mice, an animal model for cognitive deficits in METH abuse and schizophrenia (40). In this study, we also demonstrated that GABA transmission in the INS plays roles in decision-making in control and METH-treated rats and that reduced activity-dependent GABA release in the INS is related to alteration of decision-making in METH-treated animals. Consistent with the findings reported here, a previous report demonstrated that GABA release in the frontal cortex is reduced following METH treatment (41). Based on the analysis of win–stay and lose–shift behaviors in the gambling test, it is likely that METH-treated rats have a dysfunction in the neural system that controls responses evoked by both positive and negative RPEs, leading to their impulsive choice behaviors. Conversely, GABAergic neurons in the INS, as part of the salience network system, may provide an inhibitory control of the emotional/impulsive choice behaviors evoked by RPE in normal animals. Consistent with this hypothesis, suppression of INS activity by local injection of GABA receptor agonists normalized the increased H-H arm choice ratio in METH-treated rats, whereas activation of the INS by GABA receptor antagonists increased H-H arm choice ratio in normal rats. Thus, GABA in the INS may be implicated in the motivational values of rewards, as is dopamine in the reward system, serotonin in impulsivity and patience (42), and noradrenaline in attention and arousal (43).

Disinhibition of glutamatergic neurons, due to dysfunction of GABAergic interneurons in the INS, may play a role in altered decision-making in METH-treated rats. In other words, the excitation/inhibition balance in the INS may be important for appropriate decision-making. In fact, when hM3Dq was overexpressed in the INS of naïve rats, the rats exhibited increased H-H arm choice following i.p. injection of CNO. In particular, lose–shift behavior was significantly increased after treatment with CNO, indicating that aberrant activation in the INS leads to altered decision-making and suggesting that the INS plays a critical role in the response to the negative outcome. On the other hand, when hM4Di was overexpressed in the INS of METH-treated rats, the increased tendency to choose the H-H arm was significantly reduced by CNO treatment, indicating that inactivation of the INS can ameliorate the perturbation of choice strategy in METH-treated rats. Our findings obtained by this pharmacogenetic approach are consistent with results obtained by the pharmacologic and neurochemical experiments, and support the hypothesis described above. This is the first report to our knowledge to demonstrate a causal role for the INS in decision-making, strongly supporting many fMRI studies that showed a correlation between INS activity and conscious urges in human studies of addiction (29). It remains to be determined whether glutamatergic afferent neurons in INS in which hM3Dq or hM4Di were overexpressed are activated or inhibited by CNO treatment, respectively.

One of the limitations of our study is that we used a passive drug exposure procedure at a fixed dose; however, the current standard for modeling human drug abuse uses a paradigm of self-administration, which has greater face validity (44). For example, Mitchell et al. (27) used a rat model of a risky decision-making task and cocaine self-administration to demonstrate that the relationship between elevated risk-taking and cocaine self-administration is bidirectional and that low striatal dopamine D2 receptor expression may predispose subjects to both maladaptive decision-making and cocaine use. We believe that our findings provide some insights into alterations of decision-making following overexposure to METH that have some relevance to the related condition in humans who misuse METH. Nevertheless, to justify experimenter-administered dosing, as opposed to letting the animals have extended access to the drug and escalating their intake on their own (45), we would need to perform additional experiments.

Methods

Animals.

Male Wistar rats (7–8 wk old; weighing 220 ± 5 g at the beginning of the experiments) were obtained from Japan SLC, Inc. The animals were housed in plastic cages and kept in a regulated environment (23 ± 1 °C; 50 ± 5% humidity) with a 12-h light−dark cycle (lights on at 9:00 AM). Food (CE-2; CLEA Japan, Inc.) and tap water were available ad libitum. All experiments were performed in accordance with the Guidelines for Animal Experiments of Nagoya University, the Guiding Principles for the Care and Use of Laboratory Animals approved by the Japanese Pharmacological Society, and the United States National Institutes of Health Guide for the Care and Use of Laboratory Animals. All experimental procedures were approved by the Institutional Animal Care and Use Committee of the Research Institute of Environmental Medicine, Nagoya University.

Methamphetamine Treatment.

METH (Dainippon Sumitomo Pharma Co.) was dissolved in saline. Rats were given METH s.c. at a dose of 4 mg/kg once a day for 30 d (Fig. 1); the dose course was designed according to previous studies, in which rats given amphetamine at a dose of 3–4 mg/kg at least 20 times exhibited persistent changes in the density of dendritic spines on neurons in the nucleus accumbens and prefrontal cortex (34, 35). In subsequent experiments, animals that exhibited stable high-risk/high-reward (H-H) arm choice behavior (12.9 ± 2.5%) following repeated gambling tests (usually >14 times) were given METH (4 mg/kg, s.c.) once a day for 14 consecutive days (Fig. S5). The animals were then subjected to the gambling test during repeated METH treatment, as well as after the cessation of treatment.

Gambling Test for Rodents Using a Radial Arm Maze.

The apparatus consisted of two start arms and four choice arms (two baited arms and two empty arms) on an elevated six-arm Plexiglas radial maze (BrainScience Idea Co., Ltd., Japan) located in a sound-attenuated testing room illuminated at 40 lx. The other two arms of the original eight-arm maze were removed from the apparatus. The central platform and arms were black in color. Each arm contained a plastic food cup (3 cm in diameter and 1.5 cm deep), glued onto the surfaces of the arms ∼1.5 cm from the end of each arm. The entrances from the start arms to the central platform could be blocked with black guillotine doors (30 cm in height and 11 cm in width). The central platform was 32 cm in diameter, and the arms were 48 cm long and 12 cm wide, with a 5-cm edge around the apparatus. The maze was elevated 45 cm from floor level (Fig. S1A).

Before the start of the decision-making task for rodents, rats were habituated to the apparatus in a 10-min free exploration trial, conducted once per day for 10 d, during which 45-mg food pellets with banana flavor (Bioserve Inc.) were placed in the food cups of all four choice arms. The rats used in these trials were mildly food-restricted (about 1,300 mg solid feed per day). In the decision-making task, rewards were 45-mg banana-flavored food pellets, and negative outcomes (foods that were disappointing in comparison with the rewards) were quinine-coated 45-mg food pellets (Bioserve Inc.) that were unpalatable but not inedible. Rats could obtain either rewards or quinine pellets in the baited arms (see below), but no food pellets were placed in the empty arms. Rats were randomly placed in one of two start arms, and they learned to enter the central platform and then choose one of the four choice arms when the guillotine door was opened. When a rat obtained a food pellet during a trial, it was returned to either start arm by the experimenter for the next trial. When a rat entered an empty arm, it was returned to the start arm, and the trial was continued until the animal entered a baited arm and received a reward or quinine pellet. Each session, held once per day, consisted of 16 trials with intertrial intervals of ∼5 s, and the sessions were repeated for 14 consecutive days.

In the decision-making task, rats were trained to choose one of four choice arms [one low-risk/low-return (L-L) arm, one high-risk/high-return (H-H) arm, and two empty arms]. Under the standard condition (Fig. S1 B and C), choice of the L-L arm resulted in frequent (14/16 trials; reward probability: 87.5%) small rewards (one banana-flavored food pellet; reward value: 45 mg) with infrequent (2/16 trials) negative outcomes (one quinine pellet). Choice of the H-H arm resulted in infrequent (2/16 trials, reward probability: 12.5%) large rewards (seven banana-flavored food pellets; reward value: 315 mg) with frequent (14/16 trials) negative outcomes (one quinine pellet). Spatial locations of the L-L arm, the H-H arm, the two empty arms, and the two start arms were fixed among subjects throughout the experiments. A quinine pellet in the L-L arm and large rewards in the H-H arm were provided randomly at a rate of 2 out of 16 trials per day. Accordingly, there was no difference in the total reward value (630 mg) between the L-L and H-H arms. For three-choice task analysis, H-H arm or L-L arm choice (%) was expressed as follows: arm choice (%) = [((total number of either arm choice)/(16 trials + total number of entries into empty arms)) × 100].
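The payoff arithmetic and the arm-choice formula above can be checked with a short sketch. This is only an illustration in Python, not the authors' analysis code; the names `SCHEDULE`, `total_reward_mg`, and `arm_choice_percent` are hypothetical, and the totals assume the nominal 16-trial schedule stated in the text.

```python
PELLET_MG = 45           # mass of one banana-flavored pellet, as stated in the text
TRIALS_PER_SESSION = 16  # trials per daily session

# Standard condition: (rewarded trials out of 16, pellets per reward)
SCHEDULE = {
    "L-L": (14, 1),  # frequent small reward
    "H-H": (2, 7),   # infrequent large reward
}

def total_reward_mg(arm):
    """Nominal total reward (mg) over the 16-trial schedule for one arm."""
    rewarded_trials, pellets = SCHEDULE[arm]
    return rewarded_trials * pellets * PELLET_MG

def arm_choice_percent(n_arm_choices, n_empty_entries):
    """Arm choice (%) = choices / (16 trials + entries into empty arms) x 100."""
    return 100.0 * n_arm_choices / (TRIALS_PER_SESSION + n_empty_entries)

print(total_reward_mg("L-L"), total_reward_mg("H-H"))  # both 630 mg
print(arm_choice_percent(4, 4))  # 4 H-H choices with 4 empty-arm entries
```

Both arms give the same nominal total (14 × 45 mg = 2 × 315 mg = 630 mg), so a preference between them cannot be explained by total reward alone.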

Win–Stay/Lose–Shift Behavior.

We analyzed the choice data as a three-choice task, including empty arms (H-H arm, L-L arm, and empty arms). Win–stay behavior was measured in trials immediately after the animal encountered the large reward in the H-H arm, whereas lose–shift behavior was analyzed in trials immediately after the animal encountered the quinine pellet in the L-L arm (46). When a rat encountered the large reward in the H-H arm [large positive reward prediction error (RPE)], its subsequent choice was counted as a win–stay if it revisited the H-H arm in the next trial and otherwise as a win–shift. Alternatively, when a rat encountered the quinine pellet in the L-L arm, its subsequent choice was scored as lose–shift (negative RPE) if it switched to the H-H arm and otherwise as lose–stay. Win–stay or lose–shift behavior was expressed as follows: H-H arm choice (%) for win–stay behavior = [((total number of win–stay behaviors)/(total number of large rewards acquired)) × 100], considering all win–stay behaviors during the 5- or 14-d training session, and H-H arm choice (%) for lose–shift behavior = [((total number of lose–shift behaviors)/(total number of quinine pellets acquired in the L-L arm)) × 100], considering all lose–shift behaviors during the 5- or 14-d training session.

If a rat never encountered the large reward in the H-H arm or the quinine pellet in L-L arm, its behavior was omitted from the data. If a rat chose the empty arm between a current H-H arm choice and a previous baited arm choice, its behavior was omitted from the win–stay and lose–shift behaviors. As a control for the win–stay and lose–shift behaviors immediately after the positive and negative RPEs, respectively, we analyzed the behavioral pattern of arm choice in trials immediately after the animal encountered the quinine pellet in the H-H arm or the small reward in the L-L arm (negative RPE or small positive RPE, respectively).
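The scoring rules for win–stay and lose–shift behavior can be sketched as follows. This is a minimal illustration, not the authors' code. The function name and the trial encoding are hypothetical, and two points are assumptions: an event followed by an empty-arm entry is dropped from both numerator and denominator, and an event on the last trial of a session (with no following trial) is not scored.

```python
def win_stay_lose_shift(trials):
    """Score one session.

    trials: ordered list of (arm, outcome) tuples, where arm is "H-H", "L-L",
    or "empty" and outcome is "large", "small", "quinine", or "empty".
    Returns (win_stay_percent, lose_shift_percent); a value is None when the
    rat never met the corresponding event (such data are omitted in the text).
    """
    wins = stays = losses = shifts = 0
    # Pair each trial with the trial that immediately follows it.
    for (arm, outcome), (next_arm, _) in zip(trials, trials[1:]):
        if next_arm == "empty":
            continue  # assumption: events followed by an empty-arm entry are omitted
        if arm == "H-H" and outcome == "large":
            wins += 1
            stays += int(next_arm == "H-H")   # revisited H-H: win-stay
        elif arm == "L-L" and outcome == "quinine":
            losses += 1
            shifts += int(next_arm == "H-H")  # switched to H-H: lose-shift
    win_stay = 100.0 * stays / wins if wins else None
    lose_shift = 100.0 * shifts / losses if losses else None
    return win_stay, lose_shift

session = [
    ("L-L", "small"), ("H-H", "large"), ("H-H", "quinine"), ("L-L", "quinine"),
    ("H-H", "quinine"), ("H-H", "large"), ("L-L", "small"), ("L-L", "quinine"),
    ("empty", "empty"), ("L-L", "small"),
]
print(win_stay_lose_shift(session))
```

In this made-up session, one of two scored large rewards is followed by an H-H choice, and the single scored L-L quinine pellet is followed by a shift to H-H; the second L-L quinine pellet is followed by an empty-arm entry and is therefore not scored.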

Reinforcement Learning Model-Based Analysis in the Gambling Test.

To estimate the underlying process governing the rats’ behavior, we used reinforcement learning model-based analysis (14). The standard reinforcement learning model represents the value of each action (e.g., selection of one arm); this value is called the “action value.” Let Q_i(t) denote the action value for choosing arm i (i = H-H, L-L, or empty) in trial t, where each entry into an empty arm was counted as a single trial. For simplicity, we treated the two empty arms as a single option. The action values are updated according to the choice and the resulting outcome (small reward, large reward, or quinine pellet). Let a(t) denote the option the rat chooses in trial t. If a(t) = i, then the action value corresponding to the selected option is updated as follows:

Qi(t+1)=Qi(t)+αδ(t),
δ(t)=R(t)Qi(t),

whereas the action value corresponding to the unselected option does not change. Here α(0α1) is the learning rate that determines the magnitude of the update, and R(t) is the motivational value for the outcome presented in trial t, which is specified below. δ(t) is called the “reward prediction error.” Given a set of action values, a choice is assumed to be made according to the probability of choosing option 1 [P(a(t)=HH)] given by the soft-max function:

P[a(t)=i]=exp[βθ(m)Qi(t)]kexp[βθ(m)Qk(t)].

At the beginning of each day, rats tended to behave randomly, and they then gradually recovered their preferences based on previous experience. To model this random choice tendency at the beginning of each day and the gradual reinstatement of preferences, our model included θ(m), with specific form θ(m)=1γm, where m is the index of the trial in each day and γ (0γ1) is a free parameter that determines the persistence of the random-choice tendency. The model set the motivational value R(t) as follows:

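$$R(t) = \begin{cases} \kappa_E & \text{if the outcome was empty} \\ \kappa_Q & \text{if the outcome was a quinine pellet} \\ 1 & \text{if the outcome was small reward} \\ \kappa_{LR} & \text{if the outcome was large reward.} \end{cases}$$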

Because we were interested only in the relative motivational values of the four types of outcomes, rather than their absolute values [because the absolute values are inseparable from the magnitude of the inverse temperature β (47)], we set either κE or κQ to 0 and the value of the small reward to 1. We treated the models with κE = 0 and the models with κQ = 0 as different models and selected the best model using a statistical model selection method described in Supporting Information. By estimating the motivational value parameters from rats’ choice data, we can estimate the balance of motivational values in the subjects (47, 48).
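The update rule, the soft-max choice rule, and the motivational values described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the initial action values of 0, the trial index m starting at 1, and all numeric parameter values are assumptions chosen for the example, and the function names are hypothetical.

```python
import math

ARMS = ("H-H", "L-L", "empty")  # the two empty arms are treated as one option


def motivational_value(outcome, kappa_e, kappa_q, kappa_lr):
    """Return R(t) for an outcome label; the small reward is fixed at 1."""
    return {
        "empty": kappa_e,
        "quinine": kappa_q,
        "small": 1.0,
        "large": kappa_lr,
    }[outcome]


def choice_probabilities(q, beta, gamma, m):
    """Soft-max over action values, scaled by theta(m) = 1 - gamma**m."""
    theta = 1.0 - gamma ** m
    weights = {arm: math.exp(beta * theta * q[arm]) for arm in ARMS}
    total = sum(weights.values())
    return {arm: w / total for arm, w in weights.items()}


def update(q, arm, outcome, alpha, kappa_e, kappa_q, kappa_lr):
    """Update only the chosen arm's value; return the reward prediction error."""
    delta = motivational_value(outcome, kappa_e, kappa_q, kappa_lr) - q[arm]
    q[arm] += alpha * delta
    return delta


# Illustrative values only (assumed, not estimates from the study).
q = {arm: 0.0 for arm in ARMS}                      # assumed initial values
params = dict(kappa_e=0.0, kappa_q=-1.0, kappa_lr=5.0)
delta = update(q, "H-H", "large", alpha=0.2, **params)
probs = choice_probabilities(q, beta=2.0, gamma=0.5, m=1)
```

With these assumed values, a large reward in the H-H arm produces a positive reward prediction error and raises only Q for the H-H arm; because θ(1) = 1 − γ is small early in the day, the resulting choice probabilities stay closer to uniform than they would later in the session.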

Microinjections of GABA Receptor Agonist and Antagonist in the INS.

Rats were anesthetized with sodium pentobarbital and placed in a stereotaxic apparatus (49). According to the rat brain atlas (50), metal guide cannulae were inserted stereotaxically into the INS (AP +3.0, ML +4.0 from the bregma, DV −5.5 from the skull). A mixture of the GABAA receptor agonist muscimol and the GABAB receptor agonist baclofen (final concentration of each compound: 500 pmol/μL) or vehicle (PBS) was bilaterally microinjected at a rate of 1 μL/min (total injection volume, 1 μL). Alternatively, a mixture of the GABAA receptor antagonist picrotoxin and the GABAB receptor antagonist CGS54626 (final concentrations: 25 ng and 1 ng/μL, respectively) or vehicle (PBS) was bilaterally microinjected at a rate of 1 μL/min (total injection volume, 1 μL). Fifteen minutes after injection, rats were subjected to the gambling test.

In Vivo Microdialysis of GABA in the INS.

Rats were anesthetized with sodium pentobarbital for stereotaxic implantation of a guide cannula into the INS, as described above. A dialysis probe (AI-6–1; 1 mm of membrane length; Eicom) was inserted through the guide cannula and perfused with brain dialysis medium (147 mM NaCl, 4 mM KCl, 2.3 mM CaCl2) at a flow rate of 1.0 µL/min, as described (51). Outflow fractions were collected every 20 min. After the collection of two baseline fractions, brain dialysis medium containing 60 or 100 mM KCl (for depolarization stimulation) was delivered through the dialysis probe for 60 min. GABA levels in the dialysate samples were analyzed by high-performance liquid chromatography (HPLC) with o-phthaldialdehyde derivatization.

Surgery for AAV Injection and Viral Gene Transfer.

The mCherry-tagged Dq-DREADD receptor [human M3 muscarinic receptor (hM3Dq)] and Di-DREADD receptor [human M4 muscarinic receptor (hM4Di)] transgenes were expressed under the control of the CMV promoter using adeno-associated virus (AAV) vectors. All AAV vectors were produced using the AAV Helper-Free System (Agilent Technologies) and purified according to a published method (52).

Rats were anesthetized with sodium pentobarbital for stereotaxic implantation of a capillary tube into the INS, as described above. Then, 1 µL of hM3Dq-mCherry (3 × 10¹² particles/mL) or 3 µL of hM4Di-mCherry (3 × 10¹² particles/mL) was infused bilaterally with a glass micropipette at a flow rate of 0.5 µL/min. The injector was left in place for an additional 15 min to minimize diffusion up the injector tract. For all experiments, accuracy of injection coordinates was confirmed by visualization of mCherry in the injection needle tracks in 40-µm tissue sections.

Both hM3Dq and hM4Di can be activated only by clozapine-N-oxide (CNO), which is otherwise pharmacologically inert. CNO is able to silence or reduce the activity of neurons that express hM4Di, whereas the compound activates neurons that express hM3Dq (53). Two weeks after AAV injection, the gambling test was initiated. Rats were given CNO at a dose of 0.5 mg/kg (i.p.) 30 min before the gambling test (54). Vehicle (10% DMSO in saline)-treated rats that expressed hM3Dq or hM4Di in the INS were used as controls in these experiments.

c-Fos Immunohistochemistry and Quantitative Analysis.

c-Fos immunostaining was performed as described previously (40). Selected areas were as follows: OFC, INS, anterior cingulate cortex (ACC), prelimbic cortex (PrL), core (NAc) and shell (NAs) of the nucleus accumbens, and striatum. Detailed methods can be found in Supporting Information.

Statistical Analyses.

All data are expressed as means ± SE. Statistical significance was determined using Student’s t test or the Mann–Whitney U test for comparisons of two groups, one-way or two-way analysis of variance (ANOVA) for multigroup comparisons, or repeated-measures ANOVA. Fisher’s LSD test was used for post hoc comparison when the F value was significant (P < 0.05).

We used the maximum likelihood approach to estimate the reinforcement learning model parameters from the rats’ choice data; full details of the statistical analyses are given in Supporting Information.

SI Methods

Gambling Test for Rodents Using a Radial Arm Maze.

In other experiments, reward probabilities and reward values of the H-H and L-L arms were varied to test the influence on performance in the gambling test for rodents. Accordingly, reward probability and reward value of the H-H arm were varied as follows: 12.5% probability (2/16 trials) of a large reward of 7 food pellets (315 mg; standard condition), 25% probability (4/16 trials) of a large reward of 3.5 food pellets (158 mg), and 50% probability (8/16 trials) of a large reward of 1.75 food pellets (79 mg). Reward probability and reward value of the L-L arm were varied as follows: 87.5% probability (14/16 trials) of a small reward (1 food pellet, 45 mg), 75% probability (12/16 trials) of a small reward of 1.17 food pellets (52.5 mg), and 50% probability (8/16 trials) of a small reward of 1.75 food pellets (79 mg). Consequently, under these conditions the total reward values of the H-H and L-L arms were always equal (14 food pellets, 630 mg) (Fig. S1 B and C).
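The claim that total reward is matched across conditions can be checked with simple arithmetic. The sketch below multiplies the number of rewarded trials per 16-trial session by the reward size stated above; the variable and function names are hypothetical, and the 45-mg pellet weight is taken from the standard small reward.

```python
PELLET_MG = 45  # weight of one food pellet, from the standard small reward

# (rewarded trials out of 16, pellets per reward) for each condition
HH_CONDITIONS = [(2, 7.0), (4, 3.5), (8, 1.75)]
LL_CONDITIONS = [(14, 1.0), (12, 1.17), (8, 1.75)]


def total_pellets(rewarded_trials, pellets_per_reward):
    """Total pellets obtainable from one arm over a 16-trial session."""
    return rewarded_trials * pellets_per_reward


hh_totals = [total_pellets(n, size) for n, size in HH_CONDITIONS]
ll_totals = [total_pellets(n, size) for n, size in LL_CONDITIONS]

# Every condition comes to about 14 pellets (about 630 mg); the 1.17-pellet
# reward is a rounded value, so that condition gives 14.04 rather than 14.
for label, totals in (("H-H", hh_totals), ("L-L", ll_totals)):
    for total in totals:
        print(f"{label}: {total:.2f} pellets = {total * PELLET_MG:.0f} mg")
```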

In the double-reward condition, the total reward values of the H-H and L-L arms were both doubled (1,260 mg), but the reward probabilities were the same as those in the standard condition, with a large-reward probability of 12.5% (2/16 trials) in the H-H arm and a small-reward probability of 87.5% (14/16 trials) in the L-L arm. Thus, under the double-reward condition, the large reward in the H-H arm was 14 food pellets (630 mg), and the small reward in the L-L arm was 2 food pellets (90 mg) (Fig. S1 B and C).

Empty Arm Choice.

The ratio of empty arm choices immediately after receiving a large reward, small reward, or quinine pellet was expressed as follows, in each case over 16 trials per day × 14 d = 224 trials:

Empty arm choice (%) after large reward = [((total number of empty arm choice behaviors immediately after receiving large rewards)/(total number of large rewards acquired)) × 100].

Empty arm choice (%) after quinine pellet in the H-H arm = [((total number of empty arm choice behaviors immediately after receiving quinine pellets)/(total number of quinine pellets acquired in the H-H arm at any two trials)) × 100].

Empty arm choice (%) after small reward = [((total number of empty arm choice behaviors immediately after receiving small rewards)/(total number of small rewards acquired in the L-L arm at any two trials)) × 100].

Empty arm choice (%) after quinine pellet in the L-L arm = [((total number of empty arm choice behaviors immediately after receiving quinine pellets)/(total number of quinine pellets acquired in the L-L arm)) × 100].
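As a rough illustration of these ratios, the sketch below computes the percentage of empty arm choices that immediately follow a given outcome in a given arm. The data layout (an ordered list of arm/outcome pairs), the function name, and the example sequence are assumptions made for illustration; the sketch treats the list as one continuous sequence and does not model day boundaries.

```python
def empty_choice_percent(trials, arm, outcome):
    """Percent of (arm, outcome) events immediately followed by an empty arm choice.

    trials: ordered list of (arm, outcome) tuples (hypothetical layout).
    Returns None when the outcome never occurred in that arm.
    """
    # Denominator: every time the outcome was obtained in this arm.
    denominator = sum(1 for a, o in trials if a == arm and o == outcome)
    if denominator == 0:
        return None
    # Numerator: those events whose very next trial was an empty arm choice.
    numerator = sum(
        1
        for (a, o), (next_arm, _) in zip(trials, trials[1:])
        if a == arm and o == outcome and next_arm == "empty"
    )
    return 100.0 * numerator / denominator


# Made-up example sequence, not data from the study.
example = [
    ("H-H", "large"), ("empty", "empty"), ("L-L", "small"),
    ("H-H", "large"), ("L-L", "small"), ("empty", "empty"),
]
after_large = empty_choice_percent(example, "H-H", "large")
after_small = empty_choice_percent(example, "L-L", "small")
after_quinine_hh = empty_choice_percent(example, "H-H", "quinine")
```

In this made-up sequence, one of the two large rewards and one of the two small rewards are followed by an empty arm choice, so both ratios are 50%; no quinine pellet was encountered in the H-H arm, so that ratio is undefined.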

Gambling Test for Examining the Effect of Posttreatment with METH on Choice Behavior.

Rats were subjected to the gambling test for 14 d and then received repeated METH or saline treatment for 14 d. The animals were subjected to the gambling test twice during the METH treatment, on days 7 and 8 and days 14 and 15. A third postgambling test was conducted 1 wk after withdrawal.

Moreover, to determine whether preference for the H-H arm in METH-treated rats is dependent on subsequent experience, the gambling test was performed under a condition of 100% small reward (always one food pellet) in the L-L arm and 0% large reward (always one quinine pellet) in the H-H arm. The animals were first subjected to the gambling test for 14 d under the standard condition and then received repeated METH or saline treatment for 14 d. The animals were subjected to the gambling test twice during saline or METH treatment (i.e., on days 7 and 8 and days 14 and 15) and 1 wk after the withdrawal (on days 21 and 22).

Win–stay and lose–shift behaviors during the postgambling tests were expressed in the same manner as in Fig. 1.

Food Consumption Test in the Radial Maze Task.

Rats were mildly food-restricted. The apparatus contained one start arm and one food arm, which contained a food reward (100 pellets) in a food cup. When the guillotine door was opened in the start arm, rats had no choice but to enter the food arm. Under this condition, rats ate food pellets in the food arm, and once they were satisfied, they came back to the central platform. The following indices of motivation for food were analyzed: approach time (i.e., the time taken by rats to approach the food cup in the food arm after the guillotine door opened), consumption time (i.e., the period during which the rat was eating the food), number of rewards eaten, and number of rewards eaten divided by the consumption time.

Production and Purification of Viral Vectors Expressing Designer Receptors Exclusively Activated by Designer Drug (DREADDs).

HEK293 cells were transfected with a pAAV vector plasmid containing the gene of interest, pHelper, and pAAV-RC (serotype 10) (provided by Penn Vector Core) using a standard calcium-phosphate method. Three days later, the transfected cells were collected and suspended in artificial CSF (124 mM NaCl, 3 mM KCl, 26 mM NaHCO3, 2 mM CaCl2, 1 mM MgSO4, 1.25 mM KH2PO4, 10 mM d-glucose). After four freeze–thaw cycles and subsequent centrifugation, the supernatant was treated with Benzonase nuclease at 45 °C for 15 min. Purified viruses suspended in artificial CSF were titered by quantitative PCR and stored at −80 °C before use. Plasmids pAAV-CMV-hM3Dq-mCherry and pAAV-CMV-hM4Di-mCherry were constructed from plasmids pAAV-hSyn-FLEX-hM3Dq and pAAV-hSyn-FLEX-hM4Di, respectively, purchased from Addgene (IDs: 44361 and 44362).

c-Fos Immunohistochemistry and Quantitative Analysis.

c-Fos immunostaining was performed as described previously (40). Animals were deeply anesthetized with sodium pentobarbital (50 mg/kg) 2 h after the gambling test and then transcardially perfused with ice-cold PBS, followed by 4% paraformaldehyde in PBS. The brains were removed, postfixed in the same fixative, and then cryoprotected in 10–30% sucrose in PBS. Frozen serial coronal slices (40 μm) of the entire brain were made and then incubated with rabbit anti–c-Fos antibody (1:3,000; sc-253; Santa Cruz Biotechnology) for 24 h at 4 °C. We used the EnVision system-HRP (Dako), based on an HRP-labeled polymer conjugated to secondary antibodies. To quantify the number of c-Fos–positive cells in the brain, we used a microscope with a cooled CCD digital camera system (NanoZoomer 2.0; Hamamatsu) to scan the slices and calculated the cell numbers from the digitized images using the Win ROOF image analysis software (ver. 5.6; Mitani Co.). From both the saline- and METH-treated groups, we selected animals that exhibited typical/average responses in the gambling test and then counted c-Fos–positive cells in six to eight different sections from each animal (40). Selected areas were as follows: OFC, INS, anterior cingulate cortex (ACC), prelimbic cortex (PrL), core (NAc) and shell (NAs) of the nucleus accumbens, and striatum (St).

To examine the effect of CNO treatment on neural activity in the INS, rats were killed 1.5 h after CNO treatment, and then c-Fos immunohistochemistry (1:1,000; sc-253) was performed as described above. Affinity-purified FITC-conjugated goat anti-rabbit IgG was used as the secondary antibody. We selected more than ten different sections from each animal, and the images were analyzed with a deconvolution fluorescence microscope system (BZ-9000; Keyence). The average numbers of c-Fos–positive cells were used for statistical analysis.

Statistical Analyses for Reinforcement Learning Model-Based Analysis.

We used the maximum likelihood approach to estimate the reinforcement learning model parameters from the rats’ choice data. To determine which parameters differed between subject groups (saline- and METH-treated), we adopted the following procedure. First, we constructed a model set of all combinations for each parameter. The parameters were either allowed to have different values between the two subject groups (saline and METH) or shared the same value between the groups. We also examined combinations in which we set either κE or κQ to 0, both of them to a nonzero value, or both of them to 0. We then used fixed-effect analysis; to obtain a stable estimator, for each model a single parameter set was estimated for all subjects considered as a whole (14). The model parameters were optimized by minimizing the negative log-likelihood using the Matlab function “fmincon.” We compared the models based on the Akaike information criterion (AIC). The model that yielded the smallest AIC was deemed the best model. Based on the best model selected by the AIC, we tested whether the differences in parameters between groups were significant by conducting the likelihood ratio test, with the null hypothesis that the improvement in likelihood gained by allowing the model parameters to differ between groups occurred by chance alone.
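The model comparison step can be illustrated with a short Python sketch (the original analysis used Matlab). It computes the AIC from a minimized negative log-likelihood and runs a likelihood ratio test for the case where the two nested models differ by exactly one free parameter, using the chi-square distribution with 1 degree of freedom. The function names, the parameter counts, and the negative log-likelihood values are placeholders invented for the example; they are not results from the study.

```python
import math


def aic(neg_log_lik, n_params):
    """Akaike information criterion from a minimized negative log-likelihood."""
    return 2.0 * neg_log_lik + 2.0 * n_params


def likelihood_ratio_test_df1(nll_shared, nll_separate):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns (statistic, p_value). For 1 degree of freedom the chi-square
    survival function is erfc(sqrt(x / 2)).
    """
    statistic = max(2.0 * (nll_shared - nll_separate), 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Placeholder values for illustration only.
nll_shared = 1000.0    # parameter shared between saline and METH groups
nll_separate = 997.0   # same parameter allowed to differ between groups
aic_shared = aic(nll_shared, n_params=5)
aic_separate = aic(nll_separate, n_params=6)
statistic, p_value = likelihood_ratio_test_df1(nll_shared, nll_separate)
```

With these placeholder numbers the model with a group-specific parameter has the smaller AIC and the likelihood ratio test rejects the shared-parameter model at P < 0.05. When models differ by more than one parameter, a general chi-square survival function is needed in place of the 1-degree-of-freedom shortcut.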

SI Results

To determine whether posttreatment with METH altered the established preference for L-L arm in the gambling test, rats were subjected to the gambling test for 14 d and then received repeated METH or saline treatment for 14 d. The animals were subjected to the gambling test twice during the METH treatment, on days 7 and 8 and days 14 and 15. A third postgambling test was conducted 1 wk after withdrawal (on days 21 and 22; Fig. S5A).

In the saline-treated group, the H-H arm choice ratios in the first, second, and third postgambling tests remained stable at ∼10% (Fig. S5B). Following METH treatment, the animals chose the H-H arm more than saline-treated control rats did, and this increase was maintained at least 7 d after the withdrawal (Fig. S5B). Accordingly, in the second and third postgambling tests, H-H arm choice ratios in the METH-treated group were significantly elevated relative to the saline-treated group. There was no difference between saline- and METH-treated rats in the number of entries into empty arms (Fig. S5C).

In addition, we analyzed the effect of posttreatment with METH on win–stay/lose–shift behavior in the gambling test (Fig. S5 D–G). The effects were quite similar to those observed in the pretreatment experiment (Fig. 1 D–G). When METH-treated rats received a large reward following a choice of the H-H arm, they chose the H-H arm in the next trial more frequently than control rats did (Fig. S5D). Alternatively, when METH-treated rats obtained a quinine pellet following a choice of the L-L arm, they chose the H-H arm in the next trial more frequently than did control animals (Fig. S5G). There were no differences in H-H arm choice ratios after animals received a quinine pellet in the H-H arm (Fig. S5E) or the small reward in the L-L arm (Fig. S5F).

Finally, we investigated the effect of repeated METH treatment on arm choice behavior in the gambling test under a condition of 100% small reward (always one food pellet) in the L-L arm and 0% large reward (always one quinine-coated pellet) in the H-H arm. Accordingly, the animals were first subjected to the gambling test for 14 d under the standard condition and then received repeated METH or saline treatment for 14 d. The animals were subjected to the gambling test twice during the METH treatment (i.e., on days 7 and 8 and days 14 and 15) and 1 wk after the withdrawal (on days 21 and 22) (Fig. S6). Under these conditions, both groups of rats preferentially chose the L-L arm, and there were no differences between the saline- and METH-treated groups in H-H arm choice ratio (Fig. S6B) or number of entries into empty arms (Fig. S6C). Thus, alteration of decision-making induced by METH treatment was manifested under conditions of uncertainty, i.e., when RPEs were inserted during the test.

Acknowledgments

This work was supported by the following funding sources: Grants-in-Aid for Scientific Research (24111518, 25116515, 25460094, 26120713, 26118506, and 26118507) from Ministry of Education, Culture, Sports, Science and Technology (MEXT); “Integrated Research on Neuropsychiatric Disorders” and “Bioinformatics for Brain Sciences,” carried out under the Strategic Research Program for Brain Sciences from MEXT; a Grant-in-Aid for Health Science Research from the Ministry of Health, Labour and Welfare of Japan; a grant from the Smoking Research Foundation, Japan; a grant from the Uehara Memorial Foundation; a grant from the Takeda Science Foundation; and the Program for Promotion of Fundamental Studies in Health Sciences of the National Institute of Biomedical Innovation.

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1418014112/-/DCSupplemental.

References

  • 1. Ernst M, Paulus MP. Neurobiology of decision making: A selective review from a neurocognitive and clinical perspective. Biol Psychiatry. 2005;58(8):597–604. doi: 10.1016/j.biopsych.2005.06.004.
  • 2. Gleichgerrcht E, Ibáñez A, Roca M, Torralva T, Manes F. Decision-making cognition in neurodegenerative diseases. Nat Rev Neurol. 2010;6(11):611–623. doi: 10.1038/nrneurol.2010.148.
  • 3. Lee D. Decision making: from neuroscience to psychiatry. Neuron. 2013;78(2):233–248. doi: 10.1016/j.neuron.2013.04.008.
  • 4. Dom G, Sabbe B, Hulstijn W, van den Brink W. Substance use disorders and the orbitofrontal cortex: Systematic review of behavioural decision-making and neuroimaging studies. Br J Psychiatry. 2005;187:209–220. doi: 10.1192/bjp.187.3.209.
  • 5. Tanabe J, et al. Reduced neural tracking of prediction error in substance-dependent individuals. Am J Psychiatry. 2013;170(11):1356–1363. doi: 10.1176/appi.ajp.2013.12091257.
  • 6. Bechara A, Damasio AR, Damasio H, Anderson SW. Insensitivity to future consequences following damage to human prefrontal cortex. Cognition. 1994;50(1-3):7–15. doi: 10.1016/0010-0277(94)90018-3.
  • 7. Kalivas PW, Volkow ND. The neural basis of addiction: A pathology of motivation and choice. Am J Psychiatry. 2005;162(8):1403–1413. doi: 10.1176/appi.ajp.162.8.1403.
  • 8. Park SQ, et al. Prefrontal cortex fails to learn from reward prediction errors in alcohol dependence. J Neurosci. 2010;30(22):7749–7753. doi: 10.1523/JNEUROSCI.5587-09.2010.
  • 9. Makris N, et al. Cortical thickness abnormalities in cocaine addiction—A reflection of both drug use and a pre-existing disposition to drug abuse? Neuron. 2008;60(1):174–188. doi: 10.1016/j.neuron.2008.08.011.
  • 10. Paulus MP, Lovero KL, Wittmann M, Leland DS. Reduced behavioral and neural activation in stimulant users to different error rates during decision making. Biol Psychiatry. 2008;63(11):1054–1060. doi: 10.1016/j.biopsych.2007.09.007.
  • 11. Paulus MP. Decision-making dysfunctions in psychiatry—Altered homeostatic processing? Science. 2007;318(5850):602–606. doi: 10.1126/science.1142997.
  • 12. Zeeb FD, Floresco SB, Winstanley CA. Contributions of the orbitofrontal cortex to impulsive choice: Interactions with basal levels of impulsivity, dopamine signalling, and reward-related cues. Psychopharmacology (Berl). 2010;211(1):87–98. doi: 10.1007/s00213-010-1871-2.
  • 13. Simon NW, et al. Dopaminergic modulation of risky decision-making. J Neurosci. 2011;31(48):17460–17470. doi: 10.1523/JNEUROSCI.3772-11.2011.
  • 14. Daw ND. Trial-by-trial data analysis using computational models. Decision Making, Affect, and Learning: Attention and Performance XXIII. Oxford University Press, Oxford; 2009. pp 1–26.
  • 15. Sharp C, Monterosso J, Montague PR. Neuroeconomics: A bridge for translational research. Biol Psychiatry. 2012;72(2):87–92. doi: 10.1016/j.biopsych.2012.02.029.
  • 16. Naqvi NH, Bechara A. The insula and drug addiction: An interoceptive view of pleasure, urges, and decision-making. Brain Struct Funct. 2010;214(5-6):435–450. doi: 10.1007/s00429-010-0268-7.
  • 17. Ishii H, Ohara S, Tobler PN, Tsutsui K, Iijima T. Inactivating anterior insular cortex reduces risk taking. J Neurosci. 2012;32(45):16031–16039. doi: 10.1523/JNEUROSCI.2278-12.2012.
  • 18. Sanfey AG, Hastie R, Colvin MK, Grafman J. Phineas gauged: Decision-making and the human prefrontal cortex. Neuropsychologia. 2003;41(9):1218–1229. doi: 10.1016/s0028-3932(03)00039-3.
  • 19. Clark L, et al. Differential effects of insular and ventromedial prefrontal cortex lesions on risky decision-making. Brain. 2008;131(Pt 5):1311–1322. doi: 10.1093/brain/awn066.
  • 20. Yechiam E, Busemeyer JR, Stout JC, Bechara A. Using cognitive models to map relations between neuropsychological disorders and human decision-making deficits. Psychol Sci. 2005;16(12):973–978. doi: 10.1111/j.1467-9280.2005.01646.x.
  • 21. Stout JC, Busemeyer JR, Lin A, Grant SJ, Bonson KR. Cognitive modeling analysis of decision-making processes in cocaine abusers. Psychon Bull Rev. 2004;11(4):742–747. doi: 10.3758/bf03196629.
  • 22. Hart AS, Rutledge RB, Glimcher PW, Phillips PE. Phasic dopamine release in the rat nucleus accumbens symmetrically encodes a reward prediction error term. J Neurosci. 2014;34(3):698–704. doi: 10.1523/JNEUROSCI.2489-13.2014.
  • 23. Lak A, Stauffer WR, Schultz W. Dopamine prediction error responses integrate subjective value from different reward dimensions. Proc Natl Acad Sci USA. 2014;111(6):2343–2348. doi: 10.1073/pnas.1321596111.
  • 24. Balleine BW, Delgado MR, Hikosaka O. The role of the dorsal striatum in reward and decision-making. J Neurosci. 2007;27(31):8161–8165. doi: 10.1523/JNEUROSCI.1554-07.2007.
  • 25. Howe MW, Tierney PL, Sandberg SG, Phillips PE, Graybiel AM. Prolonged dopamine signalling in striatum signals proximity and value of distant rewards. Nature. 2013;500(7464):575–579. doi: 10.1038/nature12475.
  • 26. Heekeren HR, et al. Role of ventral striatum in reward-based decision making. Neuroreport. 2007;18(10):951–955. doi: 10.1097/WNR.0b013e3281532bd7.
  • 27. Mitchell MR, et al. Adolescent risk taking, cocaine self-administration, and striatal dopamine signaling. Neuropsychopharmacology. 2014;39(4):955–962.
  • 28. Freeze BS, Kravitz AV, Hammack N, Berke JD, Kreitzer AC. Control of basal ganglia output by direct and indirect pathway projection neurons. J Neurosci. 2013;33(47):18531–18539. doi: 10.1523/JNEUROSCI.1278-13.2013.
  • 29. Naqvi NH, Bechara A. The hidden island of addiction: The insula. Trends Neurosci. 2009;32(1):56–67. doi: 10.1016/j.tins.2008.09.009.
  • 30. Schilman EA, Uylings HB, Galis-de Graaf Y, Joel D, Groenewegen HJ. The orbital cortex in rats topographically projects to central parts of the caudate-putamen complex. Neurosci Lett. 2008;432(1):40–45. doi: 10.1016/j.neulet.2007.12.024.
  • 31. Critchley HD, Mathias CJ, Dolan RJ. Neural activity in the human brain relating to uncertainty and arousal during anticipation. Neuron. 2001;29(2):537–545. doi: 10.1016/s0896-6273(01)00225-2.
  • 32. O’Doherty J, Critchley H, Deichmann R, Dolan RJ. Dissociating valence of outcome from behavioral control in human orbital and ventral prefrontal cortices. J Neurosci. 2003;23(21):7931–7939. doi: 10.1523/JNEUROSCI.23-21-07931.2003.
  • 33. Paulus MP, Tapert SF, Schuckit MA. Neural activation patterns of methamphetamine-dependent subjects during decision making predict relapse. Arch Gen Psychiatry. 2005;62(7):761–768. doi: 10.1001/archpsyc.62.7.761.
  • 34. Li Y, Kolb B, Robinson TE. The location of persistent amphetamine-induced changes in the density of dendritic spines on medium spiny neurons in the nucleus accumbens and caudate-putamen. Neuropsychopharmacology. 2003;28(6):1082–1085. doi: 10.1038/sj.npp.1300115.
  • 35. Robinson TE, Kolb B. Alterations in the morphology of dendrites and dendritic spines in the nucleus accumbens and prefrontal cortex following repeated treatment with amphetamine or cocaine. Eur J Neurosci. 1999;11(5):1598–1604. doi: 10.1046/j.1460-9568.1999.00576.x.
  • 36. Holroyd CB, et al. Dorsal anterior cingulate cortex shows fMRI response to internal and external error signals. Nat Neurosci. 2004;7(5):497–498. doi: 10.1038/nn1238.
  • 37. Ham T, Leff A, de Boissezon X, Joffe A, Sharp DJ. Cognitive control and the salience network: An investigation of error processing and effective connectivity. J Neurosci. 2013;33(16):7091–7098. doi: 10.1523/JNEUROSCI.4692-12.2013.
  • 38. Lee D, Seo H, Jung MW. Neural basis of reinforcement learning and decision making. Annu Rev Neurosci. 2012;35:287–308. doi: 10.1146/annurev-neuro-062111-150512.
  • 39. Hoffman WF, et al. Cortical activation during delay discounting in abstinent methamphetamine dependent individuals. Psychopharmacology (Berl). 2008;201(2):183–193. doi: 10.1007/s00213-008-1261-1.
  • 40. Arai S, et al. Involvement of pallidotegmental neurons in methamphetamine- and MK-801-induced impairment of prepulse inhibition of the acoustic startle reflex in mice: Reversal by GABAB receptor agonist baclofen. Neuropsychopharmacology. 2008;33(13):3164–3175. doi: 10.1038/npp.2008.41.
  • 41. Han W, et al. NMDA receptors in the medial prefrontal cortex and the dorsal hippocampus regulate methamphetamine-induced hyperactivity and extracellular amino acid release in mice. Behav Brain Res. 2012;232(1):44–52. doi: 10.1016/j.bbr.2012.03.038.
  • 42. Miyazaki K, Miyazaki KW, Doya K. The role of serotonin in the regulation of patience and impulsivity. Mol Neurobiol. 2012;45(2):213–224. doi: 10.1007/s12035-012-8232-6.
  • 43. Takahashi H, et al. Norepinephrine in the brain is associated with aversion to financial loss. Mol Psychiatry. 2013;18(1):3–4. doi: 10.1038/mp.2012.7.
  • 44. Kitamura O, Wee S, Specio SE, Koob GF, Pulvirenti L. Escalation of methamphetamine self-administration in rats: A dose-effect function. Psychopharmacology (Berl). 2006;186(1):48–53. doi: 10.1007/s00213-006-0353-z.
  • 45. Yan Y, Nitta A, Mizoguchi H, Yamada K, Nabeshima T. Relapse of methamphetamine-seeking behavior in C57BL/6J mice demonstrated by a reinstatement procedure involving intravenous self-administration. Behav Brain Res. 2006;168(1):137–143. doi: 10.1016/j.bbr.2005.11.030.
  • 46. van den Bos R, Jolles J, van der Knaap L, Baars A, de Visser L. Male and female Wistar rats differ in decision-making performance in a rodent version of the Iowa Gambling Task. Behav Brain Res. 2012;234(2):375–379. doi: 10.1016/j.bbr.2012.07.015.
  • 47. Katahira K. The relation between reinforcement learning parameters and the influence of reinforcement history on choice behavior. J Math Psychol. 2015;66:59–69.
  • 48. Katahira K, Fujimura T, Okanoya K, Okada M. Decision-making based on emotional images. Front Psychol. 2011;2:311. doi: 10.3389/fpsyg.2011.00311.
  • 49. Mizoguchi H, et al. Matrix metalloproteinase-9 contributes to kindled seizure development in pentylenetetrazole-treated mice by converting pro-BDNF to mature BDNF in the hippocampus. J Neurosci. 2011;31(36):12963–12971. doi: 10.1523/JNEUROSCI.3118-11.2011.
  • 50. Paxinos G, Watson C. The Rat Brain in Stereotaxic Coordinates. 6th Ed. Academic, Amsterdam; 2007.
  • 51. Mizoguchi H, et al. Reduction of methamphetamine-induced sensitization and reward in matrix metalloproteinase-2 and -9-deficient mice. J Neurochem. 2007;100(6):1579–1588. doi: 10.1111/j.1471-4159.2006.04288.x.
  • 52. Lazarus M, et al. Arousal effect of caffeine depends on adenosine A2A receptors in the shell of the nucleus accumbens. J Neurosci. 2011;31(27):10067–10075. doi: 10.1523/JNEUROSCI.6730-10.2011.
  • 53. Wess J, Nakajima K, Jain S. Novel designer receptors to probe GPCR signaling and physiology. Trends Pharmacol Sci. 2013;34(7):385–392. doi: 10.1016/j.tips.2013.04.006.
  • 54. Garner AR, et al. Generation of a synthetic memory trace. Science. 2012;335(6075):1513–1516. doi: 10.1126/science.1214985.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
