Author manuscript; available in PMC: 2021 Aug 1.
Published in final edited form as: J Abnorm Psychol. 2020 Aug;129(6):544–555. doi: 10.1037/abn0000503

Computational Models of Drug Use and Addiction: A Review

Jessica A Mollick 1, Hedy Kober 2
PMCID: PMC7416739  NIHMSID: NIHMS1068341  PMID: 32757599

Abstract

In this brief review, we describe current computational models of drug-use and addiction that fall into two broad categories: mathematically-based models that rely on computational theories, and brain-based models that link computations to brain areas or circuits. Across categories, many are models of learning and decision-making, which may be compromised in addiction. Several mathematical models take predictive coding approaches, focusing on Bayesian prediction error. Other models focus on learning processes and (traditional) prediction error. Brain-based models have incorporated prefrontal cortex, basal ganglia, and the dopamine system, based on the effects of drugs on dopamine, motivation, and executive control circuits. Several models specifically describe how behavioral control may transition from goal-directed to habitual systems, consistent with computational accounts of compromised “model-based” control. Some brain-based models have linked this to the transition of behavioral control from ventral to dorsal striatum. Overall, we propose that while computational models capture some aspects of addiction and have advanced our thinking, most have focused on the effects of drug use rather than addiction per-se, most have not been tested on and/or supported by human data, and few capture multiple stages and symptoms of addiction. We conclude by suggesting a path forward for computational models of addiction.

Keywords: Substance use disorders, computational psychiatry, learning and decision-making, Bayesian, predictive coding

Scientific Summary

One of the approaches that has been used to understand addiction is computational modeling; this involves simulating key variables and, in some cases, neural circuits, that contribute to the disorder. We propose that while published models have captured some aspects of addiction and have advanced our thinking, most have not captured multiple stages and symptoms of addiction and have focused on simple forms of drug use behavior instead. We briefly review those models and suggest a path forward for computational models of addiction.

Introduction

Substance use disorders (SUDs; addictions) are the most prevalent and costly psychiatric conditions, associated with lifetime prevalence of 35.3% (NIMH, 2007), and costs exceeding $700 billion in the US alone (Suzuki & Kober, 2018). Clinically, SUDs are chronic, relapsing conditions, characterized by problematic drug use leading to clinically-significant impairment or distress (APA, 2013). For diagnosis, patients must also report at least two of eleven symptoms, which describe risky/compulsive use, impaired control, physiological alterations, and craving. The complexity of this clinical disorder highlights the need to understand the different components that contribute to the development and maintenance of SUDs. Further, understanding the underlying neural circuitry will help us develop more effective treatments for SUDs as well as prevention strategies. Thus, it is unsurprising that many models have been proposed to explain these disorders.

As Box (1979) noted: “all models are wrong but some are useful”. This is particularly true for addiction, because it is a complex phenomenon that develops in stages, involves many symptoms, and has many underlying environmental, biological, and neuropsychological causes. Thus, models of addiction are necessarily a simplification, and yet can still describe important aspects of the disorder, and drive thinking forward. Models have ranged from purely-psychological (e.g., Kavanagh, Andrade, & May, 2005; Prochaska, DiClemente, & Norcross, 1992; Ryan, 2002; Tiffany, 1990) to primarily circuit-based (e.g., Lüscher, 2016; Lüscher & Malenka, 2011; Nestler, 2005). Within this range, many prominent theoretical models of addiction are either neurobiologically-inspired or posit neurobiological components (e.g., Goldstein & Volkow, 2002; Koob & Volkow, 2010). Such models have been useful for linking psychology/psychopathology to addiction, providing a framework for understanding addiction, formalizing components and pathways, synthesizing data, providing graphical representations, and inspiring considerable bodies of work (Everitt & Robbins, 2005; Koob & Volkow, 2010).

Furthermore, several of these theoretical brain-based models of addiction have been particularly useful in understanding its complexity. They have done so by describing the different stages of addiction, their underlying neurobiological adaptations, differentiating addiction from casual drug use, honoring psychological states of craving (and cue-reactivity), and incorporating impaired cognitive control, binge patterns, and the consequences that follow, including negative affect and withdrawal (e.g., Goldstein & Volkow, 2002; Koob & Volkow, 2010; Lüscher, 2016; Lüscher & Malenka, 2011). However, such theoretical models have not formalized the specific functions or processes involved in the development and maintenance of SUDs, and have not always led to testable hypotheses. In contrast, computational models can both formalize the relevant processes and allow descriptions of specific computations and predictions.

Computational approaches are particularly useful for clinical psychology because they force us to make explicit predictions about the representational properties of model components and their interactions. Further, the level of detail in such models allows them to make specific predictions that can be tested experimentally (Wang & Krystal, 2014). In the context of SUDs, computational models might formalize how basic components relate to the symptoms of this multi-faceted disorder. Modeling might also help us understand the considerable heterogeneity of substance users and predict effective treatments for different subcategories of substance users. Indeed, both heterogeneity and comorbidity of clinical disorders were flagged as some of the major problems in psychiatry that computational modeling approaches can address (Wiecki, Poland, & Frank, 2015). Further advantages include the ability to formally compare the evidence for different theories (Adams, Huys, & Roiser, 2016), and the ability to integrate and move between models at different levels of analysis, abstraction, and biological plausibility (Wang & Krystal, 2014). However, computational models in themselves have not been a panacea: they occasionally lack theory, and their components do not necessarily link well with underlying psychological processes, or to human data more broadly. In our view, an effective model of SUDs should describe a range of addiction symptoms, as well as the progression and stages of addiction beyond repetitive drug use, to be of practical clinical use. Below, we briefly review the existing literature with this view in mind.

Computational Models of Addiction and/or Drug Use

As SUDs are characterized by many brain changes in reward and motivation circuitry (Belin, Belin-Rauscent, Murray, & Everitt, 2013; Goldstein & Volkow, 2002; Koob & Volkow, 2010), most computational models of drug use and addiction have focused specifically on these systems. Broadly, such computational models fall into two categories: (1) mathematical models that do not make precise mappings of model components to neural circuits, and (2) biologically-based models that map components to specific brain regions or systems. These categories are related to Marr’s levels of analysis (Marr, 1982), such that mathematical models are at the algorithmic level, as they describe the particular algorithm implementing the computations described. In contrast, biologically-based models are at the implementational level, as they describe the specific brain hardware that performs those computations. We will explain key computational mechanisms of several models in each category and describe the evidence for each model, when available (see Table 1 for a summary).

Table 1:

Summary of models.

| Authors | What Does it Model? | Multiple Symptoms? | Stages of Addiction | Symptoms Modeled | Existing Data Simulated (if any) | Evidence [indirect] | Primary Category | Secondary Category | Marr’s level | Main points |
|---|---|---|---|---|---|---|---|---|---|---|
| Redish and Johnson (2007) | Craving | Yes | Yes | Craving | None | Animal data | Mathematical | Decision-making | Algorithmic | A model of the planning system in decision-making can account for craving, based on recognition of a high-value option. |
| Redish, Jensen, and Johnson (2008) | Addiction | Yes | Yes | Withdrawal, Relapse, Incentive salience/Over-valuation of drugs, Impulsivity, Decreased executive function | None | Animal data | Mathematical | Decision-making | Algorithmic | Decision-making consists of a goal-based planning system and a habit learning process. Addiction can be explained by vulnerabilities in this decision-making system. |
| Dezfouli et al. (2009) | Drug use | Yes | Yes: Early to late, no withdrawal, craving | Compulsive behavior: taking drugs despite punishment, Over-valuation of drugs | Yes: Use despite punishment (Vanderschuren & Everitt, 2004), Blocking (Panlilio et al., 2007), Impulsive choice | Animal data | Mathematical | Decision-making | Algorithmic | A TD-based average reinforcement learning model; proposes that addiction raises the basal reward threshold, and that prediction errors are calculated relative to a capped reward level. |
| Simon & Daw (2012) | Drug use | No | No | Craving/Goal directed behaviors | None | Animal data | Mathematical | Decision-making | Algorithmic | Drug-seeking effects may be explained by both model-based and model-free systems. Outlines architectures for potential interactions between the two systems. |
| Friston (2012) | Drug use | No | No | Impaired reversal learning | None | Animal & human data | Mathematical | Predictive coding | Algorithmic | Addictive behavior is a consequence of impaired perceptual learning, particularly increases in precision, which impair updating of priors with precision-based prediction errors. |
| Schwartenbeck et al. (2015) | Drug use | No | No | Impulsivity | None | None | Mathematical | Predictive coding | Algorithmic | Addiction leads to decreased precision in beliefs about policies, which increases impulsivity. |
| Gu & Filbey (2017), Gu (2018) | Craving | No | No. Later work extends model to abstinence | Craving, Effects of withdrawal duration on craving | Yes: Effect of beliefs on craving (Gu et al., 2016), Effects of withdrawal duration on craving (Lu et al., 2004) | Animal & human data | Mathematical | Predictive coding | Algorithmic | Drug use leads to more precise beliefs about physiological states. Models data showing that beliefs about drug effects influence craving. |
| Becker & Murphy (1988) | Drug use | Yes | Yes: Early to late, Recovery | Initial use, Recovery | None | Human data | Mathematical | Decision-making | Algorithmic | Economic model; proposes that choices to use drugs are rational and explained by expected utility, when benefits outweigh costs. |
| Bernheim & Rangel (2004) | Drug use | Yes | Yes | Cue-triggered craving, Relapse, Recovery | None | Animal data | Mathematical | Decision-making | Algorithmic | Economic model; proposes that drug use decisions arise from interactions of a ‘hot’ system and a ‘cold’, rational system. Cues may activate the ‘hot’ decision-making system and lead to irrational drug choice. |
| Redish (2004) | Drug use | No | Yes: Early to late, no withdrawal, craving | Continued learning about drugs, Over-valuation of drugs | Yes: Dopamine bursts continue to occur at drug receipt (Phillips et al., 2003), Greater elasticity (cost-sensitivity) of drug rewards (Bickel & Marsh, 2001) | Some human data, not animal data | Mathematical | Learning | Algorithmic | Learning model; predicts that prediction errors continue to occur for drug rewards due to effects on the dopamine system. |
| Zhang et al. (2009) | Drug use | No | No. Craving | Incentive salience/Over-valuation of drugs | Yes: Multiplicative sensitization by amphetamine (Wyvell & Berridge, 2001), Enhanced value signals for salt after salt deprivation (Tindell et al., 2009) | Animal & human data | Mathematical | Learning | Algorithmic | Learning model; predicts that a physiological drive state (like craving) multiplies the value of a conditioned cue without needing new learning. |
| Takahashi, Schoenbaum, and Niv (2008) | Drug use | No | Yes: Early to late, no withdrawal, craving | Dysfunctional learning | None | Animal data | Mathematical and Brain-based | Learning: Actor/Critic | Implementational | Both actor (action selection) and critic (learning/prediction error) systems contribute to decision-making. In addiction, impaired encoding of state values in the critic may capture animal data on the effects of drugs on these systems. |
| Piray et al. (2010) | Drug use | No | Yes: Early to late; no withdrawal, craving | Compulsive behavior: taking drugs despite punishment | Yes: Use despite punishment (Deroche-Gamonet, Belin, and Piazza, 2004), Increased involvement of dorsal striatum (“actor”) over learning | Animal data | Mathematical and Brain-based | Learning: Actor/Critic | Implementational | An increased learning rate for appetitive compared to aversive stimuli in the critic (learning/prediction error system) can account for addiction. |
| Keramati & Gutkin (2013) | Drug use | Yes | Early to late; no withdrawal, craving | Compulsive behavior: taking drugs despite punishment; Over-valuation of drugs; Impaired executive control | Yes: Dopamine in VS/DS (Willuhn et al., 2012), Use despite punishment (Deroche-Gamonet, Belin, and Piazza, 2004), Blocking (Panlilio et al., 2007) | Animal data | Mathematical and Brain-based | Decision-making | Implementational | Addiction leads to increased control of behavior by habitual systems. The model captures data about the effects of drugs on habitual and goal-directed systems in the corticostriatal hierarchy. |

This table summarizes the published models on drug use and addiction, as reviewed in this paper. Note that not all models are supported by data, and very few are supported by human data. Additionally, most of them do not model multiple stages or symptoms of addiction, but rather describe circumscribed changes associated with instances of drug use, such as the effects of drug use on learning and decision-making.

Mathematical models

Perhaps the earliest model is Becker’s “rational theory of addiction” (e.g., Becker & Murphy, 1988). This model proposes that individuals plan to maximize their utility, by making choices to take drugs when the benefits outweigh costs, which further increase the utility of using drugs. The model also considered factors that decrease utility, including price, effort, and penalties. Bernheim and Rangel (2004) incorporated some of these elements to characterize drug use as an irrational mistake, or a divergence between choices and preferences. They further specify that drugs sensitize an individual to environmental cues that promote use, by activating a “hot” decision-making system that exaggerates positive consequences of drug use. This increases subsequent choices to use drugs, even if it is not the most rational option. This model accounts for precommitment (limiting drug exposure) and for regret over drug-taking, as decisions made by the rational or “cold” system. Neither model has been applied to human data or differentiates between drug use and addiction per-se.

Reinforcement learning and dopamine-based models.

One category of mathematical models focuses on reinforcement learning mechanisms, and how they are changed by drug use. These models focus on the mesolimbic dopamine system, based on data suggesting that dopamine neurons encode prediction errors (PE) – the difference between expectations and outcomes (Schultz, 1997). Such models typically do not describe how drug-induced learning affects other brain regions, but instead focus on simulating behavioral effects of drug use. One such account suggests that drug use sensitizes the dopamine system, enhancing the attribution of incentive salience to drug cues, leading to enhanced craving and motivation to use (Berridge, 2012). Specifically, Zhang, Berridge, Tindell, Smith, and Aldridge (2009) suggested that a physiological drive state multiplies the value of a conditioned cue, without needing new learning to enhance the cue’s value. The model is consistent with findings that salt deprivation increases neural signaling in the ventral pallidum (a region associated with value encoding) for a salt-predictive cue before the animal had experienced the new value of salt (Tindell, Berridge, Zhang, Pecina, & Aldridge, 2005). Recent work showing that craving has a multiplicative effect on value for food provides support for this model (Konova, Louie, & Glimcher, 2018). However, this model does not describe addiction per-se, but rather drug use. Further, it is inconsistent with prediction-error (PE) based accounts (described below) that assume that new learning must occur prior to any changes in value.
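The contrast between PE-driven value change and drive-state modulation can be illustrated with a minimal Python sketch. It is not code from any of the cited papers: it assumes a simple delta-rule learner for the PE account and a scalar gain (named `kappa` here for illustration) for the drive-state account, and all function names and parameter values are hypothetical.

```python
def delta_rule_update(value, reward, alpha=0.1):
    """One prediction-error (PE) update: the value moves toward the outcome."""
    pe = reward - value              # PE = outcome minus expectation
    return value + alpha * pe, pe


def incentive_value(learned_value, kappa=1.0):
    """Drive-state gain: rescales a cue's learned value without relearning."""
    return kappa * learned_value


# PE route: the cue's value changes only through repeated experience.
cue_value = 0.0
for _ in range(50):
    cue_value, _ = delta_rule_update(cue_value, reward=1.0)

# Gain route: a change in physiological state (kappa > 1) raises the cue's
# motivational value immediately, with no further learning trials.
baseline = incentive_value(cue_value, kappa=1.0)
sensitized = incentive_value(cue_value, kappa=3.0)
print(round(baseline, 3), round(sensitized, 3))
```

In this toy version the stored `cue_value` is identical in both conditions; only the gain differs, which is the sense in which the drive-state account does not require new learning.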

Indeed, other models have focused on the effect of drug use on dopamine, capitalizing on cocaine’s effect on the dopamine transporter (Redish, 2004). These models predict that PEs would consistently occur for drug rewards, such that cues that lead to drug rewards would continue to grow in value. In turn, the model predicts that blocking – the ability of previous learning about a cue to prevent new learning about another cue – would not occur. This model is consistent with reduced blocking in human methamphetamine users (Freeman et al., 2013), and yet inconsistent with the occurrence of blocking in drug-taking animals (Panlilio, Thorndike, & Schindler, 2007) and human cigarette smokers (Freeman, Morgan, Beesley, & Curran, 2012). Additional evidence suggests that negative PEs can occur for drug rewards, as rats reduce lever pressing for a smaller-than-expected drug reward (Marks, Kearns, Christensen, Silberberg, & Weiss, 2010), in contrast to the model’s prediction that drug rewards always cause positive PEs.
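The core prediction, that a drug-induced dopamine signal cannot be "predicted away", can be sketched in a few lines of Python. This is a simplified single-cue illustration rather than the published model: it assumes the drug contributes a term (`drug_boost`, a hypothetical name) that sets a floor on the PE, and the learning rate and reward magnitudes are arbitrary.

```python
def natural_pe(value, reward):
    """Standard PE: shrinks to zero once the reward is fully predicted."""
    return reward - value


def drug_pe(value, reward, drug_boost):
    """Assumed drug PE: the pharmacological term sets a floor on the PE,
    so the PE stays positive however well the drug is predicted."""
    return max(reward - value + drug_boost, drug_boost)


def train(pe_fn, n_trials, alpha=0.1, **kwargs):
    """Repeatedly pair one cue with an outcome; return the value per trial."""
    value, history = 0.0, []
    for _ in range(n_trials):
        value += alpha * pe_fn(value, **kwargs)
        history.append(value)
    return history


natural = train(natural_pe, 200, reward=1.0)
drug = train(drug_pe, 200, reward=1.0, drug_boost=0.2)
print(round(natural[-1], 3), round(drug[99], 3), round(drug[-1], 3))
```

Under these assumptions the value of the natural-reward cue converges on the reward, whereas the value of the drug cue keeps increasing. Because the PE never reaches zero, a second cue added later could still acquire value, which is why an account of this kind predicts an absence of blocking.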

Drug taking as instrumental behavior.

Several models extend this focus on learning and PEs to understand drug-taking as instrumental behavior. One of these models uses the state-action value learning framework to model the effects of drug use on learning values for states and actions (Dezfouli et al., 2009). This model proposes that drug use raises the basal reward level, and that PEs are calculated relative to a capped reward level. This enhances PEs and thus the value of drug-taking actions, especially early in learning. This model simulates data whereby rats continue lever pressing for a drug even when the lever is presented with a punishment cue (Vanderschuren & Everitt, 2004). This is because drug value grows over time, allowing it to compete with the lower-valued punishment. Importantly, this model makes additional predictions about extended drug use, whereby drug value plateaus, because the basal reward level slowly rises. Thus, PEs are minimized over time, reducing the value of drug-related actions (e.g., lever pressing). This reduction in drug value allows the model to simulate blocking after extended drug use, consistent with prior data (Freeman et al., 2012; Panlilio et al., 2007), and in contrast with the model proposed by Redish (2004). This model also simulates impulsivity. However, it also focuses on drug-taking, and does not incorporate many aspects of addiction. Further, the evidence regarding reward PE in addiction is mixed, such that some studies have shown differences between participants with and without SUDs (Parvaz et al., 2015; Rose et al., 2014; Tanabe et al., 2013), while others have not (Park et al., 2010; Reiter et al., 2016).

“Model-based,” “model-free,” goals, and habits.

Another class of mathematical models differentiates between “model-free” and “model-based” decision-making. While neurobiological substrates have been proposed for these systems, the primary mechanisms are computational (Doll, Simon, & Daw, 2012). In “model-free” control, PEs update action values based on past experience, and these action values are saved and used to guide future action-selection. However, in “model-based” control, an internal model of the world evaluates actions based on their prospective consequences, allowing updating of action values without directly experiencing all potential outcomes (Daw, Niv, & Dayan, 2005; Doll et al., 2012). “Model-based” control has been linked to goal-directed instrumental control that is sensitive to the current outcome values, in contrast to “model-free” habitual instrumental control, whereby responding persists despite changes in outcome values (Voon, Reiter, Sebold, & Groman, 2017). It has been proposed that both systems interact to control ongoing behavior (Daw et al., 2005) and rely on value computation in ventral striatum (VS; Daw, Gershman, Seymour, Dayan, & Dolan, 2011; Simon & Daw, 2011). Simon and Daw (2012) applied the Dyna computational framework to the interactions of “model-based” and “model-free” systems in drug use. This model suggests that the high value of actions associated with drug-reward leads to enhanced “model-based” updating of states (e.g., cue-induced reactivity) and actions (e.g., conditioned drug-seeking) that precede drug-taking, via mental simulation, which is a feature of the Dyna framework.

One advantage of this model is that it provides a framework for both “model-based” and “model-free” control to contribute to learning. A further advantage is that it incorporates goal-directed states and craving; however, it does not discuss the different stages of addiction, or model human data.
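The way Dyna-style mental simulation spreads drug value back to earlier states can be illustrated with a small Python sketch. It is a generic Dyna-Q toy, not the simulations of Simon and Daw (2012): the three-state chain (cue, approach, drug available), the fixed "seek" behavior, and all parameter values are assumptions made for illustration.

```python
import random

SEEK, ABSTAIN = 0, 1
N_STATES = 3            # hypothetical chain: cue -> approach -> drug available
TERMINAL = None


def step(state, action):
    """Toy environment: seeking moves along the chain; the last step pays off."""
    if action == ABSTAIN:
        return 0.0, TERMINAL
    if state == N_STATES - 1:
        return 1.0, TERMINAL
    return 0.0, state + 1


def q_update(q, s, a, r, s2, alpha=0.5, gamma=0.9):
    """One-step Q-learning update driven by a prediction error."""
    target = r if s2 is TERMINAL else r + gamma * max(q[s2])
    q[s][a] += alpha * (target - q[s][a])


def run(n_episodes, n_planning, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    model = {}                                   # learned model: (s, a) -> (r, s2)
    for _ in range(n_episodes):
        s = 0
        while s is not TERMINAL:
            a = SEEK                             # fixed behavior keeps the demo simple
            r, s2 = step(s, a)
            q_update(q, s, a, r, s2)             # "model-free" update from real experience
            model[(s, a)] = (r, s2)
            for _ in range(n_planning):          # "model-based" replay of remembered transitions
                ps, pa = rng.choice(list(model))
                pr, ps2 = model[(ps, pa)]
                q_update(q, ps, pa, pr, ps2)
            s = s2
    return q


q_model_free = run(n_episodes=3, n_planning=0)
q_dyna = run(n_episodes=3, n_planning=20)
print(q_model_free[0][SEEK], q_dyna[0][SEEK])
```

After the same three real episodes, the value of seeking at the initial cue is larger when simulated updates are interleaved, because replay carries the drug reward back along the chain without further real experience.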

Paralleling “model-based” and “model-free” systems, Redish, Jensen, and Johnson (2008) proposed a decision-making model of addiction focusing on the interaction of a goal-based planning system, and a habit-learning process. The planning system consists of interacting components: a recognition component that identifies the current situation, a prediction component that calculates potential action consequences, and an evaluative component that calculates the value of consequences. However, this model does not discuss the different factors leading to the deployment of habit-based or goal-directed control. In an earlier version of this model, Redish and Johnson (2007) suggested that cravings may occur when the planning system recognizes a high-value drug option. This causes a bias towards retrieving associated actions in memory, which leads to recurrent drug-seeking. Redish and Johnson (2007) further propose that the planning system component is consistent with findings that rats pause before making decisions, and that hippocampal firing during pauses reflects evaluation of different maze trajectories (Johnson & Redish, 2007). In a later iteration, Redish et al. (2008) proposed that behavioral control shifts from the planning system to the habitual system, consistent with some animal studies (e.g., Hikosaka et al., 1999; Packard & McGaugh, 1996). However, this model conflates Pavlovian and instrumental learning processes within a single planning system because it uses situational cues to calculate the consequences of future actions, and the value of those consequences (Ostlund & Balleine, 2008). Indeed, Pavlovian-instrumental transfer effects (when a conditioned stimulus invigorates an instrumental response) are not sensitive to outcome devaluation (Holland, 2004; Rescorla, 1994). Further, valuations from Pavlovian and instrumental systems are neurally dissociable (Ostlund & Balleine, 2008). In response, Redish et al. (2008) suggested adding another system to encode Pavlovian stimulus-outcome associations, which could drive actions through Pavlovian conditioning (separately from the planning system, which would then only represent instrumental action). However, it is not clear how this planning system might interact with the new Pavlovian system, which is an important aspect of addictive behaviors.

Broadly, computational models of drug use that propose “model-based” vs. “model-free” components are consistent with animal models that describe addiction as a transition from goal-directed to habitual behavioral control (Everitt & Robbins, 2005; Lucantonio, Caprioli, & Schoenbaum, 2014). They are further supported by data from a study with abstinent methamphetamine users who exhibited relatively more “model-free” behavior in a two-step choice task (Voon et al., 2015), and another study suggesting impaired “model-based” control in alcohol-dependent subjects (Sebold et al., 2014). However, such models are inconsistent with evidence suggesting that drug using animals can and do make goal-directed or “model-based” decisions (Halbout, Liu, & Ostlund, 2016; Root et al., 2009), as can human drug users (Hogarth, 2012), and humans with SUDs (Hogarth & Chase, 2011). These models are also inconsistent with evidence suggesting that both systems can operate in parallel (Balleine & O’Doherty, 2010). Further, evidence is lacking that the complex pattern of behaviors exhibited in human addiction can become fully habitual under any circumstances (e.g., getting up, obtaining money, going to purchase drugs, etc). These models also do not account for craving states creating the explicit goal to get drugs, which renders future instrumental drug-seeking actions goal-directed (as they are directed towards reducing craving). Nevertheless, these approaches are useful, as they provide a way for thinking about the process that determines value estimates, and have been applied to many clinical phenomena (Voon et al., 2017).

Predictive coding models.

Other models are based on Bayesian probability theory, which provides a framework for performing inference, and has also been applied to drug use and addiction. Specifically, it formalizes how prior beliefs are integrated with observed data to calculate the posterior probability of a particular hypothesis. In Bayesian inference, the prior reflects prior knowledge (before considering data), while the likelihood reflects the probability of the observed data, given the hypothesis. These two terms are multiplied to calculate the posterior probability – the probability of the hypothesis given the data and the prior, normalized by the probability of the data given all possible hypotheses. The next time data are encountered, this updated posterior probability serves as the prior (Griffiths, Kemp, & Tenenbaum, 2008; Olshausen, 2004).
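The updating cycle described above (prior times likelihood, normalized, with the posterior becoming the next prior) can be written out as a short Python sketch. The two hypotheses and their reward probabilities are invented for illustration only.

```python
def bayes_update(prior, likelihood):
    """Posterior over hypotheses: prior x likelihood, normalized by the evidence."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(unnormalized)     # probability of the data under all hypotheses
    return [u / evidence for u in unnormalized]


# Two hypothetical hypotheses about a cue:
# H0 says it is followed by reward 20% of the time, H1 says 80%.
p_reward = [0.2, 0.8]
belief = [0.5, 0.5]                  # prior before seeing any data

for rewarded in [True, True, False, True]:
    likelihood = [p if rewarded else 1.0 - p for p in p_reward]
    belief = bayes_update(belief, likelihood)   # posterior becomes the next prior

print([round(b, 3) for b in belief])
```

Because each posterior is reused as the next prior, processing the observations one at a time yields the same belief as processing them all at once.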

Friston (2005) applied the Bayesian framework to perception and cognition and their underlying neural mechanisms in a predictive coding framework. He suggested that the brain uses an internal model of the world to generate predictions about causes of sensations; sensory samples are then tested against these predictions to update beliefs about their causes (Friston, 2005, 2009, 2010). Generally, he proposed that internal brain states and actions are selected to minimize free-energy, or the difference between prior and posterior beliefs. In turn, posterior beliefs are updated after sampling the data using Bayesian updating. In Friston’s view, the difference between the prior and posterior probabilities corresponds to Bayesian surprise, or Bayesian prediction error (BPE), which is used to update beliefs and drive learning (Friston, 2005, 2009, 2010). Friston extended this account to neural hierarchies by suggesting that higher-level brain regions predict the inputs expected in the lower level, and minimize BPEs (Friston, 2005, 2009). Thus, BPEs are an algorithmic element of this predictive coding model inasmuch as they update other levels in a hierarchy. Importantly, in this framework, BPEs resulting from sensory states are weighted by their confidence, or precision (inverse variance), which is captured by the dopamine system in this formulation (Friston et al., 2012). Interestingly, other Bayesian models have used signed prediction errors to update beliefs (Mathys, Daunizeau, Friston, & Stephan, 2011; see Supplementary materials for additional discussion).

Based on his own approach, Friston (2012) proposed the first Bayesian account of addiction as a natural consequence of impaired perceptual learning. Incorporating physiological states, physical states, and hidden states governing causal dynamics, Friston (2012) suggests that drug-taking – which increases dopamine (Di Chiara & Imperato, 1988) – leads to high precision. This subsequently impairs learning, precision-weighted BPEs, and subsequent updating of priors. Further, this leads to a strong expectation of reward in a previously-rewarded state while ignoring sensory evidence to the contrary (i.e., inaccurate priors). Given these parameters, the model further predicts that reversal learning (updating values after contingencies change) would be greatly impaired. Consistently, this model is supported by several animal studies showing that perseverative responses are enhanced after cocaine use (Calu et al., 2007; Schoenbaum, Saddoris, Ramus, Shaham, & Setlow, 2004), and two human studies demonstrating perseverative responses in drug users (Ersche et al., 2011; Ersche, Roiser, Robbins, & Sahakian, 2008). Further, the dopamine system is known to be involved in reversal learning, as administering a D2 dopamine receptor agonist reversed performance deficits in stimulant users (Ersche et al., 2011). However, impaired reversal learning could also be due to impaired encoding of (non-Bayesian) PEs or value updating, rather than enhanced precision. Further, multiple aspects of addiction are not modeled.
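How excessive precision could slow reversal learning can be illustrated with a standard Gaussian belief update in Python. This is a textbook precision-weighted update, not Friston's free-energy scheme. It assumes, purely for illustration, that the elevated precision attaches to the prior belief and is held fixed across trials; the function names and numerical values are hypothetical.

```python
def precision_weighted_update(prior_mean, prior_precision, observation, obs_precision):
    """Shift the belief by a prediction error scaled by relative precision."""
    pe = observation - prior_mean
    gain = obs_precision / (prior_precision + obs_precision)
    return prior_mean + gain * pe


def belief_after_reversal(prior_precision, n_trials=10):
    """Start out believing a state is rewarded (1.0), then observe no reward (0.0)."""
    mean = 1.0
    for _ in range(n_trials):
        # Simplifying assumption: the prior's precision does not change over trials.
        mean = precision_weighted_update(mean, prior_precision,
                                         observation=0.0, obs_precision=1.0)
    return mean


flexible = belief_after_reversal(prior_precision=1.0)
rigid = belief_after_reversal(prior_precision=20.0)
print(round(flexible, 3), round(rigid, 3))
```

With a low-precision prior, the belief in reward is almost fully revised after ten unrewarded trials; with a high-precision prior, the same evidence leaves most of the original expectation intact, which resembles the perseveration the model predicts.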

Gu and Filbey (2017) recently proposed a Bayesian account focused on the effect of prior beliefs on drug craving. This model draws on the effects of drugs on the dopamine system, increasing precision (Friston et al., 2012), leading specifically to more precise beliefs about physiological states in addicted individuals. The model proposes that drug-addicted individuals form priors about drug effects, including reduction in craving. Thus, with no expectation to use, the prior would shift towards greater discomfort, leading the posterior probability to shift toward greater discomfort, or more craving, upon unexpected drug receipt. In comparison, the model predicts that when drug-addicted individuals expect and receive drugs, the posterior belief would shift towards reduction of craving and lower relative discomfort. This model simulated data showing that craving was only significantly reduced when participants both expected and received drugs (Gu et al., 2016); however, it is unclear which real-life situations are modeled in this study.

Recently, Gu (2018) extended this framework to reflect changes in craving over the abstinence period, moving towards stages of addiction and recovery. However, this model – and the notion of enhanced precision of bodily state estimates – is inconsistent with findings of impaired interoceptive insight in addiction (Bechara & Damasio, 2002; Çöl, Sönmez, & Vardar, 2016; Goldstein et al., 2009; Sönmez, Kahyacı Kılıç, Ateş Çöl, Görgülü, & Köse Çınar, 2017; Verdejo-Garcia, Clark, & Dunn, 2012), unless such estimates are both more precise and inaccurate. Further, the idea that drug use increases precision contradicts another Bayesian predictive coding account of addiction, which proposes that individuals with addiction have decreased precision in beliefs about policies (mappings from states of the world to actions), which reflects lower confidence in reaching a goal (Schwartenbeck et al., 2015). This decreased precision makes individuals more impulsive, which would increase habitual behavior, consistent with accounts of “model-based” control (Schwartenbeck et al., 2015).

Overall, predictive coding models posit several useful concepts, like the idea of uncertainty arbitrating behavioral control, and the importance of priors and beliefs. However, given these conflicting accounts, we suggest that such models should clarify the role of precision in learning and decision-making behavior, or integrate these accounts to discuss how precision may operate differently for bodily state estimates and policy selection.

Brain-based models

Several brain-based models extend the focus on learning, PEs, and action selection to the interactions between the dopamine system, prefrontal cortex (PFC), and basal ganglia – crucial circuits involved in addiction (Everitt & Robbins, 2005; Goldstein & Volkow, 2011; Koob & Volkow, 2010). These include actor-critic models (Barto, 1995; Barto, Sutton, & Anderson, 1983), whereby the actor system selects different actions with an action-selection policy, while the critic system evaluates selected actions as good or bad. Importantly, the critic system does this using PEs, which in these models indicate whether outcomes were better or worse than expected (unlike BPEs).
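
The division of labor between actor and critic can be sketched in a few lines. This is a generic, single-state illustration and not any of the specific models reviewed here: the function name `run_actor_critic`, the two-action task, the reward values, and the learning rates are all assumptions. Because there is only one state, the PE reduces to the obtained reward minus the critic's current value.

```python
import math
import random


def softmax(prefs):
    """Convert action preferences into choice probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def run_actor_critic(rewards, n_trials=200, alpha_critic=0.1, alpha_actor=0.1, seed=0):
    """Single-state actor-critic with deterministic rewards per action."""
    rng = random.Random(seed)
    value = 0.0                    # critic: expected outcome in the single state
    prefs = [0.0] * len(rewards)   # actor: one preference per action
    for _ in range(n_trials):
        probs = softmax(prefs)
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        pe = rewards[action] - value        # better or worse than expected
        value += alpha_critic * pe          # critic learns from the PE
        prefs[action] += alpha_actor * pe   # actor is trained by the same PE
    return value, prefs


# Hypothetical task: action 0 is never rewarded, action 1 always is.
value, prefs = run_actor_critic(rewards=[0.0, 1.0])
print(value, prefs)
```

The point of the design is that the actor never sees the reward directly; it is shaped only by the critic's PE, which is why a degraded critic could, in principle, distort action selection.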

Actor-critic models have been applied to interactions between the VS and dorsal striatum (DS) in addiction, as studies suggest that DS encodes action values (the actor; Burton, Nakamura, & Roesch, 2015; O’Doherty et al., 2004), while the VS encodes values of different states that contribute to PEs (the critic). An early model by Takahashi, Schoenbaum, and Niv (2008) suggests that impaired encoding of state values (cues or contexts) by the VS critic could lead to inappropriate action selection. The model is consistent with animal findings that cocaine use disrupts encoding in rat VS, while DS encoding remains (mostly) intact (Takahashi, Roesch, Stalnaker, & Schoenbaum, 2007). Impaired VS encoding could then contribute to inappropriate PEs and thus impaired updating of action values in the actor system. This could allow the model to capture later stages of addiction; however, this was not explicitly simulated.

Another actor-critic model suggests that a higher learning rate for appetitive learning compared to aversive learning (which increases the value of drug cues) in the VS critic could explain compulsive addictive behavior (Piray, Keramati, Dezfouli, Lucas, & Mokri, 2010). The authors modeled data observed by Deroche-Gamonet, Belin, and Piazza (2004), where a subset of vulnerable rats continued to nose-poke for a drug reward cue even when the drug cue was paired with a shock-predictive cue. Specifically, higher appetitive learning rates caused the positive value of nose-poking to outweigh the punishment. Further, the model captures enhanced control of behavior by the actor in later stages of addiction. However, neither actor-critic model has been tested in human addiction.
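
The effect of asymmetric learning rates can be shown with a deliberately simplified sketch. It is not the Piray et al. (2010) implementation: the function name `learn_value`, the alternating sequence of drug reward (+1) and shock (-1), and the learning-rate values are assumptions made for illustration.

```python
def learn_value(outcomes, alpha_appetitive, alpha_aversive):
    """Track the value of one action with separate rates for positive and negative PEs."""
    value = 0.0
    for outcome in outcomes:
        pe = outcome - value
        # Positive PEs use the appetitive rate, negative PEs the aversive rate.
        alpha = alpha_appetitive if pe > 0 else alpha_aversive
        value += alpha * pe
    return value


# Hypothetical nose-poke history: drug reward (+1) alternating with shock (-1).
outcomes = [1.0, -1.0] * 100

balanced = learn_value(outcomes, alpha_appetitive=0.1, alpha_aversive=0.1)
appetitive_biased = learn_value(outcomes, alpha_appetitive=0.3, alpha_aversive=0.05)

print(balanced, appetitive_biased)
```

With equal rates, the learned value of nose-poking stays close to zero; with a higher appetitive rate, it settles well above zero even though reward and punishment are equally frequent and equally large, which is the sense in which the positive value can outweigh the punishment.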

A related action-selection model extended the idea that control might pass from VS to DS to consider interactions between basal ganglia circuits and the PFC (Keramati & Gutkin, 2013). This model also computes action values based on PEs. Further, the model incorporates hierarchical PFC and basal ganglia loops organized by level of abstraction, such that higher levels of the model that encode abstract knowledge about the value of different options are used for updating action values at more detailed levels. This is based on anatomical and functional evidence suggesting that representations in rostral regions of the PFC and basal ganglia are more abstract, while those in dorsal regions are more specific (Badre, 2008; Badre & Frank, 2011; Nee et al., 2012; Verstynen, Badre, Jarbo, & Schneider, 2012). Importantly, the model proposes that abstract processes control behavior in the early stages of learning, due to higher flexibility. However, abstract regions (VS) are also less precise, leading to greater uncertainty as learning progresses, allowing control to pass to the more specific/less abstract regions (DS). In support of this model, some data suggest that dopamine responses to reward cues are initially higher in VS, but shift to DS over the course of learning (Willuhn, Burgeno, Everitt, & Phillips, 2012); control of drug-seeking responses may also gradually shift from the VS to DS over the course of drug use (Corbit, Nie, & Janak, 2012; Everitt & Robbins, 2016; Vanderschuren, Di Ciano, & Everitt, 2005). The model thus captures early and late addiction, but does not model many addiction symptoms like withdrawal or craving.
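
The handover of control described above can be caricatured as an uncertainty comparison between two levels. The sketch below is an assumption-laden illustration, not the Keramati and Gutkin (2013) model: the function names, the fixed uncertainty of the abstract level, and the `1 / (1 + n)` decay of uncertainty at the detailed level are all hypothetical choices.

```python
def detailed_uncertainty(n_experiences):
    """Assumed: uncertainty at the detailed level shrinks with experience."""
    return 1.0 / (1.0 + n_experiences)


def controlling_level(n_experiences, abstract_uncertainty=0.2):
    """Assumed rule: the level with the lower uncertainty controls behavior.

    The abstract level is flexible from the start but keeps a fixed,
    irreducible uncertainty (a hypothetical simplification).
    """
    if detailed_uncertainty(n_experiences) < abstract_uncertainty:
        return "detailed"
    return "abstract"


# First amount of experience at which the detailed level takes over.
first_handover = next(n for n in range(1000) if controlling_level(n) == "detailed")

print(controlling_level(0), controlling_level(50), first_handover)
```

In this toy version, the abstract level controls behavior early on and the detailed level takes over once its estimates become more certain, mirroring the proposed shift from VS to DS over the course of learning.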

Summary and Path Forward

Theoretical and neurobiological models describe addictions as complex, multi-stage diseases, characterized by disturbances in subcortical (midbrain, basal ganglia, amygdala) and cortical (prefrontal) systems, and are often based on findings from human addiction. Conversely, computational models have often focused on limited aspects of drug-taking decisions, and have relied on animal findings, which has limited their ecological validity and utility in understanding human addiction. While each modeled aspect of drug-taking may be important, few computational models captured the complexity of SUDs, as they have not integrated both the multiple stages of addiction and the multi-faceted set of symptoms occurring in SUDs. Furthermore, none of the models addressed processes associated with treatment for or recovery from addiction.

Indeed, although computational models have formalized specific functions or processes involved in drug use, they have often conceptualized addiction as continued drug-taking actions that occur when drug rewards affect learning and action selection. The latter occurs either by enhancing the value of drug cues or drug-taking actions (Berridge, 2012; Dezfouli et al., 2009; Gutkin, Dehaene, & Changeux, 2006; Keramati & Gutkin, 2013; Piray et al., 2010; Redish, 2004), or by affecting precision of estimates as in predictive coding accounts (Friston, 2012; Gu & Filbey, 2017; Schwartenbeck et al., 2015). Some models address stages of addiction by proposing that control gradually shifts towards more habitual processes (Keramati & Gutkin, 2013; Redish et al., 2008), while others allow for goal-directed processes (Redish & Johnson, 2007; Simon & Daw, 2011). Commendably, a few of the models were fit to human data, focusing on craving (Gu, 2018; Gu & Filbey, 2017) and drug valuation (Redish, 2004). Others simulated animal data with addiction-relevant features such as continued drug-taking despite punishment (Dezfouli et al., 2009; Keramati & Gutkin, 2013; Piray et al., 2010). However, humans reliably develop multiple addiction symptoms concurrently (APA, 2013), and recent reverse-translational work successfully demonstrated that some drug-taking animals also develop several addiction-like symptoms, including difficulty stopping drug intake, increased effort to take drugs, and continued use despite punishment (Belin-Rauscent, Fouyssac, Bonci, & Belin, 2016; Deroche-Gamonet et al., 2004). Thus, across species, more of these behaviors could be incorporated into – and simulated by – computational models to further link them to human addiction. Finally, while each of the models made important theoretical contributions by pointing out relevant components, the lack of integration between components makes it difficult to make clinically-relevant predictions or draw clinically-relevant conclusions (i.e., an intervention improving one component could potentially worsen another).

In our view, the most clinically-useful models are those that integrate the best of all worlds: those that are grounded in theory, describe psychological phenomena associated with addiction (not just drug use), rely on both psychological and neurobiological data (including circuits, whenever possible), formalize the components into modules, and specify computations to allow clinically-relevant predictions, preferably fit to human data. An effective model of SUDs should describe the full range of addiction symptoms, along with the progression and different stages of addiction. To date, few if any such models have been put forth for addiction. Comprehensive models of this type are promising for understanding clinical disorders because they allow us to understand how addiction arises from an interaction of underlying components, and how these contribute to symptoms, which may help tailor treatments to heterogeneous patient groups or sub-groups.

One promising approach that can integrate the best features of existing models is a value-based decision-making (VBDM) framework, grounded in neurobiological circuits (Mollick & Kober, In prep). A VBDM framework in the context of addiction is similar to the “Decision Theoretic Psychiatry” framework proposed by Huys, Guitart-Masip, Dolan, and Dayan (2015). This account broadly proposes that normative, instrumental choice behavior can break down – across psychiatric disorders – along three different major fault lines: determining what problem to solve, how to solve the problem, and the effects of experience. Comparatively, a VBDM framework is more specific, in proposing component computational processes that contribute to the decision-making processes described by Huys and colleagues (2015). Furthermore, we propose a specific application of the VBDM framework to SUDs, and outline several components that are unique to these disorders. As such, this VBDM framework can be applied to form a specific theory of substance use development and maintenance (see Wellman and Gelman (1992) for a discussion of specific theories).

Indeed, VBDM seems particularly fitting to SUDs because their clinical definition explicitly describes decisions to continue using drugs despite negative consequences (Bernheim & Rangel, 2004; Redish et al., 2008). Ideally, this framework can model stages of SUD development, whereby the initial decision to take drugs can be understood in the context of interacting brain systems (e.g., craving, control), and the computation of priors. SUDs might then develop from a series of subsequent decisions, changing the weights between different nodes in the computational model, leading to future decisions to take drugs. This idea is consistent with models that showed how drug use leads to increased valuation of drug cues and actions and decisions to use despite negative consequences (Dezfouli et al., 2009; Piray et al., 2010). Building on Redish and Johnson (2007) and Simon and Daw (2012), this VBDM framework can incorporate the maintenance of goal-directed cognition in the PFC, and enhanced incentive salience to drug cues (Berridge, 2012), which characterizes craving states, and contributes to impaired executive control in addiction (Goldstein & Volkow, 2011).

Further, a VBDM framework can also incorporate learning processes, using Pavlovian and instrumental conditioning circuits in basal ganglia and amygdala. Such a framework can also incorporate the effects of beliefs on goals to take drugs, building on Bayesian models that capture the effect of beliefs on behavior, as well as the effect of bodily states on goals and motivational states (Gu & Filbey, 2017). Building on goal-directed and habitual processes (Redish et al., 2008), as well as “model-based” and “model-free” decision-making (Daw et al., 2005; Lucantonio et al., 2014; Voon et al., 2017), we suggest that engaging executive control during decision-making depends on the incentives provided, and becomes more difficult in the face of strong cue-induced craving.

Finally, unlike existing models, a VBDM framework is well-positioned to provide some insight into the ways that people recover from addiction. For example, motivational interviewing increases treatment initiation and reduces drug use by helping patients consider their long-term goals, and the relationship between their substance use and these goals. Within the VBDM framework, it can be described as enhancing the acquisition of alternative goals to drug use, which decreases the impact of drug-seeking motivations on valuation of drug cues. Contingency management (CM) is an effective treatment (Higgins, Heil, & Lussier, 2004; Regier & Redish, 2015) that provides tangible rewards for abstinence (e.g., a voucher or money for each drug-negative urine sample). In the VBDM framework, CM can be understood as changing the incentive structure for abstinence, leading to enhanced executive control in moments of craving (Kober, Kross, Mischel, Hart, & Ochsner, 2010). In turn, this makes the decision to abstain more likely. Further, cognitive behavioral therapies (CBT) emphasize understanding of drug use within the context of antecedents and consequences, and provide skills training to recognize situations and states where one would be vulnerable to use, including strategies for regulating craving. Within a VBDM framework, CBT can be understood as enhancing executive control systems, allowing for more deliberative, less-impulsive decision-making. This is consistent with data showing that CBT-based strategies engage executive control systems (Kober, Mende-Siedlecki, et al., 2010) and that CBT leads to lasting changes in these systems (DeVito et al., 2017; DeVito et al., 2012). Additionally, recent evidence suggests that mindfulness-based treatments reduce drug use (Bowen et al., 2009; Bowen et al., 2014); they may do so by reducing the likelihood of maintaining a craving state after seeing drug cues, which reduces substance use behavior (e.g., Elwafi, Witkiewitz, Mallik, & Brewer, 2013; Kober, Brewer, Tuit, & Sinha, 2017).

Importantly, the complexity of the described VBDM framework may allow it to model human behavior and its many underlying causes and consequences. As such, this framework integrates recent neurobiological research on decision-making and addiction with features from existing computational models. In doing so, it expands their scope by considering the different stages involved in drug-taking, as well as multiple addiction symptoms, and making clinically-relevant predictions. Ultimately, to be of practical use, any model of addiction would need to allow for the complexity as well as the heterogeneity of clinical presentation in humans suffering from SUDs.

Supplementary Material

Supplemental Material

Acknowledgments

The writing of this manuscript was supported by R01 DA043690 to Hedy Kober. In addition, Dr. Kober received consulting fees from Indivior, Inc. These were not related to Dr. Kober’s research, and not related to this paper.

The authors wish to thank Uri Berger, Kai Krueger, Wendy Sun, Yihan (Sophy) Xiong, and Nilo Vafay for helpful comments. Some of the ideas from this paper were described as part of poster presentations at the annual meetings of the Society for Affective Science (2019) and Reinforcement Learning and Decision Making meetings (2019).

Contributor Information

Jessica A. Mollick, Yale University, Department of Psychiatry

Hedy Kober, Yale University, Department of Psychiatry

References

  1. Adams RA, Huys QJ, & Roiser JP (2016). Computational psychiatry: towards a mathematically informed understanding of mental illness. J Neurol Neurosurg Psychiatry, 87(1), 53–63.
  2. APA. (2013). Diagnostic and Statistical Manual of Mental Disorders - 5th Edition. Washington, DC: American Psychiatric Association.
  3. Badre D (2008). Cognitive control, hierarchy, and the rostro–caudal organization of the frontal lobes. Trends in cognitive sciences, 12(5), 193–200.
  4. Badre D, & Frank MJ (2011). Mechanisms of hierarchical reinforcement learning in cortico–striatal circuits 2: Evidence from fMRI. Cerebral cortex, 22(3), 527–536.
  5. Balleine BW, & O’Doherty JP (2010). Human and rodent homologies in action control: corticostriatal determinants of goal-directed and habitual action. Neuropsychopharmacology, 35(1), 48.
  6. Barto AG (1995). Adaptive critics and the basal ganglia. In Davis JL & Beiser DG (Eds.), Models of information processing in the basal ganglia. Cambridge, MA: MIT Press.
  7. Barto AG, Sutton RS, & Anderson CW (1983). Neuronlike adaptive elements that can solve difficult learning control problems. IEEE transactions on systems, man, and cybernetics(5), 834–846.
  8. Bechara A, & Damasio H (2002). Decision-making and addiction (part I): impaired activation of somatic states in substance dependent individuals when pondering decisions with negative future consequences. Neuropsychologia, 40(10), 1675–1689.
  9. Becker GS, & Murphy KM (1988). A theory of rational addiction. Journal of political Economy, 96(4), 675–700.
  10. Belin D, Belin-Rauscent A, Murray JE, & Everitt BJ (2013). Addiction: Failure of control over maladaptive incentive habits. Current opinion in neurobiology, 23(4), 564–572.
  11. Belin-Rauscent A, Fouyssac M, Bonci A, & Belin D (2016). How preclinical models evolved to resemble the diagnostic criteria of drug addiction. Biological psychiatry, 79(1), 39–46.
  12. Bernheim BD, & Rangel A (2004). Addiction and cue-triggered decision processes. American economic review, 94(5), 1558–1590.
  13. Bickel WK, & Marsch LA (2001). Toward a behavioral economic understanding of drug dependence: delay discounting processes. Addiction, 96(1), 73–86.
  14. Berridge KC (2012). From prediction error to incentive salience: mesolimbic computation of reward motivation. European Journal of Neuroscience, 35(7), 1124–1143.
  15. Bowen S, Chawla N, Collins SE, Witkiewitz K, Hsu S, Grow J, . . . Marlatt A (2009). Mindfulness-based relapse prevention for substance use disorders: A pilot efficacy trial. Substance Abuse, 30(4), 295–305.
  16. Bowen S, Witkiewitz K, Clifasefi SL, Grow J, Chawla N, Hsu SH, . . . Lustyk MK (2014). Relative efficacy of mindfulness-based relapse prevention, standard relapse prevention, and treatment as usual for substance use disorders: a randomized clinical trial. JAMA Psychiatry, 71(5), 547–556.
  17. Box GE (1979). Robustness in the strategy of scientific model building. In Robustness in statistics (pp. 201–236). Elsevier.
  18. Burton AC, Nakamura K, & Roesch MR (2015). From ventral-medial to dorsal-lateral striatum: neural correlates of reward-guided decision-making. Neurobiology of learning and memory, 117, 51–59.
  19. Calu DJ, Stalnaker TA, Franz TM, Singh T, Shaham Y, & Schoenbaum G (2007). Withdrawal from cocaine self-administration produces long-lasting deficits in orbitofrontal-dependent reversal learning in rats. Learning & memory, 14(5), 325–328.
  20. Çöl IA, Sönmez MB, & Vardar ME (2016). Evaluation of interoceptive awareness in alcohol-addicted patients. Nöro Psikiyatri Arşivi, 53(1), 17.
  21. Corbit LH, Nie H, & Janak PH (2012). Habitual alcohol seeking: time course and the contribution of subregions of the dorsal striatum. Biological psychiatry, 72(5), 389–395.
  22. Daw ND, Gershman SJ, Seymour B, Dayan P, & Dolan RJ (2011). Model-based influences on humans’ choices and striatal prediction errors. Neuron, 69(6), 1204–1215.
  23. Daw ND, Niv Y, & Dayan P (2005). Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nature Neuroscience, 8(12), 1704.
  24. Deroche-Gamonet V, Belin D, & Piazza PV (2004). Evidence for addiction-like behavior in the rat. Science, 305(5686), 1014–1017.
  25. DeVito EE, Dong G, Kober H, Xu J, Carroll KM, & Potenza MN (2017). Functional neural changes following behavioral therapies and disulfiram for cocaine dependence. Psychology of Addictive Behaviors, 31(5), 534.
  26. DeVito EE, Worhunsky PD, Carroll KM, Rounsaville BJ, Kober H, & Potenza MN (2012). A preliminary study of the neural effects of behavioral therapy for substance use disorders. Drug and Alcohol Dependence, 122(3), 228–235.
  27. Dezfouli A, Piray P, Keramati MM, Ekhtiari H, Lucas C, & Mokri A (2009). A neurocomputational model for cocaine addiction. Neural computation, 21(10), 2869–2893.
  28. Di Chiara GD, & Imperato A (1988). Drugs abused by humans preferentially increase synaptic dopamine concentrations in the mesolimbic system of freely moving rats. Proceedings of the National Academy of Sciences, 85(14), 5274–5278.
  29. Doll BB, Simon DA, & Daw ND (2012). The ubiquity of model-based reinforcement learning. Current opinion in neurobiology, 22(6), 1075–1081.
  30. Elwafi HM, Witkiewitz K, Mallik S, & Brewer JA (2013). Mindfulness training for smoking cessation: Moderation of the relationship between craving and cigarette use. Drug and Alcohol Dependence, 130(1–3), 222–229.
  31. Ersche KD, Barnes A, Jones PS, Morein-Zamir S, Robbins TW, & Bullmore ET (2011). Abnormal structure of frontostriatal brain systems is associated with aspects of impulsivity and compulsivity in cocaine dependence. Brain, 134(7), 2013–2024.
  32. Ersche KD, Roiser JP, Robbins TW, & Sahakian BJ (2008). Chronic cocaine but not chronic amphetamine use is associated with perseverative responding in humans. Psychopharmacology, 197(3), 421–431.
  33. Everitt BJ, & Robbins TW (2005). Neural systems of reinforcement for drug addiction: from actions to habits to compulsion. Nature Neuroscience, 8(11), 1481–1489.
  34. Everitt BJ, & Robbins TW (2016). Drug addiction: updating actions to habits to compulsions ten years on. Annu Rev Psychol, 67(1), 23–50.
  35. Freeman T, Morgan C, Beesley T, & Curran H (2012). Drug cue induced overshadowing: selective disruption of natural reward processing by cigarette cues amongst abstinent but not satiated smokers. Psychological medicine, 42(1), 161–171.
  36. Freeman T, Morgan CJ, Pepper F, Howes OD, Stone JM, & Curran HV (2013). Associative blocking to reward-predicting cues is attenuated in ketamine users but can be modulated by images associated with drug use. Psychopharmacology, 225(1), 41–50.
  37. Friston KJ (2005). A theory of cortical responses. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 360(1456), 815–836.
  38. Friston KJ (2009). The free-energy principle: a rough guide to the brain? Trends in cognitive sciences, 13(7), 293–301.
  39. Friston KJ (2010). The free-energy principle: a unified brain theory? Nature Reviews Neuroscience, 11(2), 127.
  40. Friston KJ (2012). Policies and priors. In Gutkin B & Ahmed SH (Eds.), Computational neuroscience of drug addiction (Vol. 10, pp. 237–283). New York, NY: Springer.
  41. Friston KJ, Shiner T, FitzGerald T, Galea JM, Adams R, Brown H, . . . Bestmann S (2012). Dopamine, affordance and active inference. PLoS computational biology, 8(1), e1002327.
  42. Goldstein RZ, Bechara A, Garavan H, Childress AR, Paulus MP, & Volkow ND (2009). The neurocircuitry of impaired insight in drug addiction. Trends in cognitive sciences, 13(9), 372–380.
  43. Goldstein RZ, & Volkow ND (2002). Drug addiction and its underlying neurobiological basis: Neuroimaging evidence for the involvement of the frontal cortex. American journal of Psychiatry, 159(10), 1642–1652.
  44. Goldstein RZ, & Volkow ND (2011). Dysfunction of the prefrontal cortex in addiction: neuroimaging findings and clinical implications. Nature Reviews Neuroscience, 12(11), 652–669.
  45. Griffiths TL, Kemp C, & Tenenbaum JB (2008). Bayesian models of cognition. In Sun R (Ed.), Cambridge Handbook of Computational Cognitive Modeling (pp. 59–100). Cambridge, MA: Cambridge University Press.
  46. Gu X (2018). Incubation of craving: a Bayesian account. Neuropsychopharmacology: official publication of the American College of Neuropsychopharmacology.
  47. Gu X, & Filbey F (2017). A bayesian observer model of drug craving. JAMA Psychiatry, 74(4), 419–420.
  48. Gu X, Lohrenz T, Salas R, Baldwin PR, Soltani A, Kirk U, . . . Montague PR (2016). Belief about Nicotine Modulates Subjective Craving and Insula Activity in Deprived Smokers. Frontiers in Psychiatry, 7, 126.
  49. Gutkin BS, Dehaene S, & Changeux J-P (2006). A neurocomputational hypothesis for nicotine addiction. Proceedings of the National Academy of Sciences, 103(4), 1106–1111.
  50. Halbout B, Liu AT, & Ostlund SB (2016). A closer look at the effects of repeated cocaine exposure on adaptive decision-making under conditions that promote goal-directed control. Frontiers in Psychiatry, 7, 44.
  51. Higgins ST, Heil SH, & Lussier JP (2004). Clinical implications of reinforcement as a determinant of substance use disorders. Annu Rev Psychol, 55, 431–461. doi: 10.1146/annurev.psych.55.090902.142033
  52. Hikosaka O, Nakahara H, Rand MK, Sakai K, Lu X, Nakamura K, . . . Doya K (1999). Parallel neural networks for learning sequential procedures. Trends in neurosciences, 22(10), 464–471.
  53. Hogarth L (2012). Goal-directed and transfer-cue-elicited drug-seeking are dissociated by pharmacotherapy: Evidence for independent additive controllers. Journal of Experimental Psychology: Animal Behavior Processes, 38(3), 266.
  54. Hogarth L, & Chase HW (2011). Parallel goal-directed and habitual control of human drug-seeking: Implications for dependence vulnerability. Journal of Experimental Psychology: Animal Behavior Processes, 37(3), 261.
  55. Holland PC (2004). Relations between Pavlovian-instrumental transfer and reinforcer devaluation. Journal of Experimental Psychology: Animal Behavior Processes, 30(2), 104.
  56. Huys QJ, Guitart-Masip M, Dolan RJ, & Dayan P (2015). Decision-theoretic psychiatry. Clinical Psychological Science, 3(3), 400–421.
  57. Johnson A, & Redish AD (2007). Neural ensembles in CA3 transiently encode paths forward of the animal at a decision point. Journal of Neuroscience, 27(45), 12176–12189.
  58. Kavanagh DJ, Andrade J, & May J (2005). Imaginary relish and exquisite torture: the elaborated intrusion theory of desire. Psychological review, 112(2), 446.
  59. Keramati M, & Gutkin B (2013). Imbalanced decision hierarchy in addicts emerging from drug-hijacked dopamine spiraling circuit. PloS one, 8(4), e61489.
  60. Kober H, Brewer JA, Tuit K, & Sinha R (2017). Neural stress reactivity relates to smoking outcomes and differentiates between mindfulness and cognitive-behavioral treatments. NeuroImage, 151, 4–13.
  61. Kober H, Kross EF, Mischel W, Hart CL, & Ochsner KN (2010). Regulation of craving by cognitive strategies in cigarette smokers. Drug and Alcohol Dependence, 106(1), 52–55.
  62. Kober H, Mende-Siedlecki P, Kross EF, Weber J, Mischel W, Hart CL, & Ochsner KN (2010). Prefrontal-striatal pathway underlies cognitive regulation of craving. Proceedings of the National Academy of Sciences, 107(33), 14811–14816.
  63. Konova AB, Louie K, & Glimcher PW (2018). The computational form of craving is a selective multiplication of economic value. Proceedings of the National Academy of Sciences, 115(16), 4122–4127.
  64. Koob GF, & Volkow ND (2010). Neurocircuitry of addiction. Neuropsychopharmacology, 35(1), 217–238.
  65. Lu L, Grimm JW, Hope BT, & Shaham Y (2004). Incubation of cocaine craving after withdrawal: a review of preclinical data. Neuropharmacology, 47, 214–226.
  66. Lucantonio F, Caprioli D, & Schoenbaum G (2014). Transition from ‘model-based’ to ‘model-free’ behavioral control in addiction: involvement of the orbitofrontal cortex and dorsolateral striatum. Neuropharmacology, 76, 407–415.
  67. Lüscher C (2016). The Emergence of a Circuit Model for Addiction. Annual review of neuroscience, 39, 257–276.
  68. Lüscher C, & Malenka RC (2011). Drug-evoked synaptic plasticity in addiction: from molecular changes to circuit remodeling. Neuron, 69(4), 650–663.
  69. Marks KR, Kearns DN, Christensen CJ, Silberberg A, & Weiss SJ (2010). Learning that a cocaine reward is smaller than expected: a test of Redish’s computational model of addiction. Behavioural brain research, 212(2), 204–207.
  70. Marr D (1982). Vision: A computational investigation into the human representation and processing of visual information. San Francisco: WH Freeman.
  71. Mathys C, Daunizeau J, Friston KJ, & Stephan KE (2011). A Bayesian foundation for individual learning under uncertainty. Frontiers in human neuroscience, 5(39), 1–20.
  72. Mollick JA, & Kober H (In prep). A hierarchical value-based decision-making model of addiction.
  73. Nee DE, Brown JW, Askren MK, Berman MG, Demiralp E, Krawitz A, & Jonides J (2012). A meta-analysis of executive components of working memory. Cerebral cortex, 23(2), 264–282.
  74. Nestler EJ (2005). Is there a common molecular pathway for addiction? Nature Neuroscience, 8(11), 1445–1449.
  75. NIMH. (2007). National Comorbidity Survey: Lifetime prevalence estimates. Available from http://www.hcp.med.harvard.edu/ncs/
  76. O’Doherty J, Dayan P, Schultz J, Deichmann R, Friston K, & Dolan RJ (2004). Dissociable roles of ventral and dorsal striatum in instrumental conditioning. Science, 304(5669), 452–454.
  77. Olshausen BA (2004). Bayesian probability theory. The Redwood Center for Theoretical Neuroscience, Helen Wills Neuroscience Institute at the University of California at Berkeley, Berkeley, CA.
  78. Ostlund SB, & Balleine BW (2008). The disunity of Pavlovian and instrumental values. Behavioral and Brain Sciences, 31(4), 456–457.
  79. Packard MG, & McGaugh JL (1996). Inactivation of hippocampus or caudate nucleus with lidocaine differentially affects expression of place and response learning. Neurobiology of learning and memory, 65(1), 65–72.
  80. Panlilio LV, Thorndike EB, & Schindler CW (2007). Blocking of conditioning to a cocaine-paired stimulus: testing the hypothesis that cocaine perpetually produces a signal of larger-than-expected reward. Pharmacology Biochemistry and Behavior, 86(4), 774–777.
  81. Park SQ, Kahnt T, Beck A, Cohen MX, Dolan RJ, Wrase J, & Heinz A (2010). Prefrontal cortex fails to learn from reward prediction errors in alcohol dependence. Journal of Neuroscience, 30(22), 7749–7753.
  82. Parvaz MA, Konova AB, Proudfit GH, Dunning JP, Malaker P, Moeller SJ, . . . Goldstein RZ (2015). Impaired neural response to negative prediction errors in cocaine addiction. Journal of Neuroscience, 35(5), 1872–1879.
  83. Phillips PE, Stuber GD, Heien ML, Wightman RM, & Carelli RM (2003). Subsecond dopamine release promotes cocaine seeking. Nature, 422(6932), 614.
  84. Piray P, Keramati MM, Dezfouli A, Lucas C, & Mokri A (2010). Individual differences in nucleus accumbens dopamine receptors predict development of addiction-like behavior: a computational approach. Neural computation, 22(9), 2334–2368.
  85. Prochaska JO, DiClemente CC, & Norcross JC (1992). In search of how people change: applications to addictive behaviors. American Psychologist, 47(9), 1102.
  86. Redish AD (2004). Addiction as a computational process gone awry. Science, 306(5703), 1944–1947.
  87. Redish AD, Jensen S, & Johnson A (2008). Addiction as vulnerabilities in the decision process. Behavioral and Brain Sciences, 31(4), 415–487.
  88. Redish AD, & Johnson A (2007). A computational model of craving and obsession. Annals of the New York Academy of Sciences, 1104(1), 324–339.
  89. Regier PS, & Redish AD (2015). Contingency management and deliberative decision-making processes. Frontiers in Psychiatry, 6, 76.
  90. Reiter AM, Deserno L, Kallert T, Heinze H-J, Heinz A, & Schlagenhauf F (2016). Behavioral and neural signatures of reduced updating of alternative options in alcohol-dependent patients during flexible decision-making. Journal of Neuroscience, 36(43), 10935–10948.
  91. Rescorla RA (1994). Transfer of instrumental control mediated by a devalued outcome. Animal Learning & Behavior, 22(1), 27–33. [Google Scholar]
  92. Root DH, Fabbricatore AT, Barker DJ, Ma S, Pawlak AP, & West MO (2009). Evidence for habitual and goal-directed behavior following devaluation of cocaine: a multifaceted interpretation of relapse. PloS one, 4(9), e7170. [DOI] [PMC free article] [PubMed] [Google Scholar]
  93. Rose EJ, Salmeron BJ, Ross TJ, Waltz J, Schweitzer JB, McClure SM, & Stein EA (2014). Temporal difference error prediction signal dysregulation in cocaine dependence. Neuropsychopharmacology, 39(7), 1732. [DOI] [PMC free article] [PubMed] [Google Scholar]
  94. Ryan F (2002). Detected, selected, and sometimes neglected: cognitive processing of cues in addiction. Experimental and Clinical Psychopharmacology, 10(2), 67. [DOI] [PubMed] [Google Scholar]
  95. Schoenbaum G, Saddoris MP, Ramus SJ, Shaham Y, & Setlow B (2004). Cocaine-experienced rats exhibit learning deficits in a task sensitive to orbitofrontal cortex lesions. European Journal of Neuroscience, 19(7), 1997–2002. [DOI] [PubMed] [Google Scholar]
  96. Schultz W (1997). A Neural Substrate of Prediction and Reward. Science, 275(5306), 1593–1599. doi: 10.1126/science.275.5306.1593 [DOI] [PubMed] [Google Scholar]
  97. Schwartenbeck P, FitzGerald TH, Mathys C, Dolan R, Wurst F, Kronbichler M, & Friston K (2015). Optimal inference with suboptimal models: addiction and active Bayesian inference. Medical hypotheses, 84(2), 109–117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  98. Sebold M, Deserno L, Nebe S, Schad DJ, Garbusow M, Hägele C, . . . Smolka M (2014). Model-based and model-free decisions in alcohol dependence. Neuropsychobiology, 70(2), 122–131. [DOI] [PubMed] [Google Scholar]
  99. Simon DA, & Daw ND (2011). Neural correlates of forward planning in a spatial decision task in humans. Journal of Neuroscience, 31(14), 5526–5539. [DOI] [PMC free article] [PubMed] [Google Scholar]
  100. Simon DA, & Daw ND (2012). Dual-system learning models and drugs of abuse. In Gutkin B & Ahmed SH (Eds.), Computational neuroscience of drug addiction (Vol. 10, pp. 145–161). New York, NY: Springer.
  101. Sönmez MB, Kahyacı Kılıç E, Ateş Çöl I, Görgülü Y, & Köse Çınar R (2017). Decreased interoceptive awareness in patients with substance use disorders. Journal of Substance Use, 22(1), 60–65.
  102. Suzuki S, & Kober H (2018). Substance-Related and Addictive Disorders. In Butcher JN, Hooley J, & Kendall PC (Eds.), APA Handbook of Psychopathology. Vol. 1, Psychopathology: Understanding, Assessing, and Treating Adult Mental Disorders (Vol. 1, pp. 481–506). Washington, DC: American Psychological Association.
  103. Takahashi Y, Roesch MR, Stalnaker TA, & Schoenbaum G (2007). Cocaine exposure shifts the balance of associative encoding from ventral to dorsolateral striatum. Frontiers in Integrative Neuroscience, 1, 11.
  104. Takahashi Y, Schoenbaum G, & Niv Y (2008). Silencing the critics: understanding the effects of cocaine sensitization on dorsolateral and ventral striatum in the context of an actor/critic model. Frontiers in Neuroscience, 2, 14.
  105. Tanabe J, Reynolds J, Krmpotich T, Claus E, Thompson LL, Du YP, & Banich MT (2013). Reduced neural tracking of prediction error in substance-dependent individuals. American Journal of Psychiatry, 170(11), 1356–1363.
  106. Tiffany ST (1990). A cognitive model of drug urges and drug-use behavior: role of automatic and nonautomatic processes. Psychological Review, 97(2), 147–168.
  107. Tindell AJ, Berridge KC, Zhang J, Pecina S, & Aldridge JW (2005). Ventral pallidal neurons code incentive motivation: amplification by mesolimbic sensitization and amphetamine. European Journal of Neuroscience, 22(10), 2617–2634.
  108. Tindell AJ, Smith KS, Berridge KC, & Aldridge JW (2009). Dynamic computation of incentive salience: “wanting” what was never “liked”. Journal of Neuroscience, 29(39), 12220–12228.
  109. Vanderschuren LJ, Di Ciano P, & Everitt BJ (2005). Involvement of the dorsal striatum in cue-controlled cocaine seeking. Journal of Neuroscience, 25(38), 8665–8670.
  110. Vanderschuren LJ, & Everitt BJ (2004). Drug seeking becomes compulsive after prolonged cocaine self-administration. Science, 305(5686), 1017–1019.
  111. Verdejo-Garcia A, Clark L, & Dunn BD (2012). The role of interoception in addiction: a critical review. Neuroscience & Biobehavioral Reviews, 36(8), 1857–1869.
  112. Verstynen TD, Badre D, Jarbo K, & Schneider W (2012). Microstructural organizational patterns in the human corticostriatal system. Journal of Neurophysiology, 107(11), 2984–2995.
  113. Voon V, Derbyshire K, Rück C, Irvine MA, Worbe Y, Enander J, . . . Sahakian BJ (2015). Disorders of compulsivity: a common bias towards learning habits. Molecular Psychiatry, 20(3), 345.
  114. Voon V, Reiter A, Sebold M, & Groman S (2017). Model-based control in dimensional psychiatry. Biological Psychiatry, 82(6), 391–400.
  115. Wang X-J, & Krystal JH (2014). Computational psychiatry. Neuron, 84(3), 638–654.
  116. Wellman HM, & Gelman SA (1992). Cognitive development: Foundational theories of core domains. Annu Rev Psychol, 43(1), 337–375.
  117. Wiecki TV, Poland J, & Frank MJ (2015). Model-based cognitive neuroscience approaches to computational psychiatry: clustering and classification. Clinical Psychological Science, 3(3), 378–399.
  118. Willuhn I, Burgeno LM, Everitt BJ, & Phillips PE (2012). Hierarchical recruitment of phasic dopamine signaling in the striatum during the progression of cocaine use. Proceedings of the National Academy of Sciences, 109(50), 20703–20708.
  119. Wyvell CL, & Berridge KC (2001). Incentive sensitization by previous amphetamine exposure: increased cue-triggered “wanting” for sucrose reward. Journal of Neuroscience, 21(19), 7831–7840.
  120. Zhang J, Berridge KC, Tindell AJ, Smith KS, & Aldridge JW (2009). A neural computational model of incentive salience. PLoS Computational Biology, 5(7), e1000437.
