Abstract
Drug self-administration has been regarded as a gold-standard preclinical model of addiction and substance-use disorder (SUD). However, investigators are becoming increasingly aware that certain aspects of addiction or SUDs experienced by humans are not accurately captured in our preclinical self-administration models. The current review will focus on two such aspects of current preclinical drug self-administration models: 1) predictable vs. unpredictable drug access in terms of the time and effort put into obtaining drugs (i.e., response requirement) and drug quality (i.e., amount), and 2) rich vs. lean access to drugs. Some behavioral and neurobiological mechanisms that could contribute to excessive allocation of behavior toward drug-seeking and drug-taking at the expense of engaging in nondrug-related activities are discussed, and some directions for future research are identified. Based on the experiments reviewed, lean and unpredictable drug access could worsen drug-seeking and drug-taking behavior in individuals with SUDs. Once more fully explored, this area of research will help determine whether and how unpredictable and lean cost requirements affect drug self-administration in preclinical laboratory studies with nonhuman subjects and will help determine whether incorporating these conditions in current self-administration models will increase their predictive validity.
Keywords: Drug self-administration, Substance-use disorder, Variable-ratio schedules, Variable reinforcer amounts, Unpredictable outcomes
Introduction
For several decades, drug self-administration in nonhuman animals has been regarded as a gold-standard preclinical model of addiction and substance-use disorder (SUD). The procedure has a high degree of face validity, and drugs that are misused by humans, with few exceptions, are self-administered by nonhuman animals (see Huskinson et al., 2014; Platt and Rowlett, 2012; O’Connor et al., 2011 for reviews). While procedures vary widely, the core feature of the approach is the contingent relation between engaging in a predefined response (e.g., a lever press) and the delivery of a drug. Drug self-administration studies have established that environmental factors such as the cost or number of responses required to earn drug and nondrug reinforcers, the amount and frequency of reinforcement, and the delay to reinforcer delivery can alter drug-taking behavior in nonhumans (e.g., Anderson et al., 2002; Campbell and Carroll, 2000; Huskinson et al., 2015; 2016; Maguire et al., 2013; Nader and Woolverton, 1991; 1992; Negus, 2003; Pickens and Thompson, 1968; Woolverton and Anderson, 2006). Importantly, these environmental variables that alter drug self-administration in nonhumans generally translate well to human laboratory and clinical studies (e.g., Greenwald and Steinmiller, 2009; Higgins et al., 1994; Lile et al., 2016; Packer et al., 2012; Silverman et al., 1999; Stoops et al., 2012).
Investigators are becoming increasingly aware that certain aspects of addiction or SUDs experienced by humans are not accurately captured in preclinical self-administration models in nonhuman subjects (e.g., Banks and Negus, 2012; Cadet, 2019; Smith, 2020; Vanderschuren and Ahmed, 2013). While drug self-administration research in nonhumans has contributed to successful pharmacological and behavioral therapies (e.g., agonist replacement therapies, antagonist therapies, and contingency management), there is room for improvement. For example, there are currently no FDA-approved pharmacotherapies for stimulant-use disorders. In addition, some drugs (e.g., buspirone, lorcaserin, kappa-opioid receptor antagonists) have had positive results in preclinical studies with nonhumans but had no effect or worsened abuse-related outcomes in human laboratory studies or clinical trials (e.g., Bolin et al., 2016; Brandt et al. 2020; Ling et al., 2016; Pike et al., 2016; Winhusen et al., 2014). An apparent exception to the false positives identified in traditional self-administration studies with nonhuman subjects is drug vs. nondrug choice studies, which have predicted human outcomes quite well (e.g., Banks and Negus, 2017).
The purpose of the current review is to discuss how predictable schedules of reinforcement and rich access to drugs found in typical self-administration procedures may not model the unpredictable and lean conditions that likely are experienced in the natural environment by individuals with SUDs. In addition, some of what is known about unpredictable and lean schedules of reinforcement, most commonly studied with nondrug reinforcers, will be reviewed. Finally, potential mechanisms underlying behavioral outcomes that occur with unpredictable and lean schedules will be described as well as future directions for researchers interested in incorporating these aspects into their self-administration procedures with nonhumans.
The vast majority of self-administration studies with nonhumans have used predictable schedules of drug availability, such as fixed-ratio (FR) schedules of reinforcement, wherein a set number of responses is required for each drug delivery, or progressive-ratio (PR) schedules, wherein an increasing yet predictable number of responses is required for serial drug deliveries. However, it is unlikely that individuals across different stages of illicit drug use receive drug reinforcers under circumstances as predictable as alternative nondrug reinforcers in the environment (see Lagorio and Winger, 2014). A nondrug reinforcer such as a paycheck generally occurs at predictable points in time, in exchange for specific work requirements. Other nondrug reinforcers like consumable goods or hobbies are available at relatively predictable locations and prices. In contrast, illicit drugs like cocaine and heroin may be less predictable in terms of their availability, quality, location, and price, and in the time and effort required to obtain them.
In addition to using conditions of predictable access in self-administration experiments with nonhumans, it is especially common for researchers to use a low-value FR schedule (e.g., FR 1, FR 5) and a relatively large drug dose, which can be referred to as conditions of “rich drug access”. For some individuals, access to drugs in the natural environment may be available under similarly rich-access conditions, perhaps prior to the development of a SUD or for individuals who are able to obtain drugs under relatively resource-replete circumstances (e.g., while still employed or in possession of substantial savings). For example, individuals with relatively higher incomes who use cocaine or heroin report higher amounts and frequencies of cocaine or heroin use (Greenwald and Steinmiller, 2014; Roddy and Greenwald, 2009; Roddy et al., 2011). The opposite also may be true for individuals with SUDs for whom drugs are available only at relatively large cost requirements and in relatively small amounts, or what can be referred to as conditions of “lean drug access”. This latter scenario is especially likely for individuals living in impoverished environments (e.g., people experiencing homelessness, unemployment, or low incomes) who do not have the financial resources to purchase large amounts of drugs on demand. Indeed, individuals with relatively lower incomes who use cocaine or heroin report lower amounts and frequencies of use (Greenwald and Steinmiller, 2014; Roddy and Greenwald, 2009; Roddy et al., 2011). Lean-access conditions also could occur for individuals who began drug-taking in a resource-rich environment, and as their SUD developed and became more severe, their environment became more impoverished. Under lean-access conditions, an individual may spend the majority of his or her time trying to earn sufficient funds to purchase drugs, often repeating the cycle of drug procurement daily or multiple times per day because small quantities of drugs are purchased at a time. Importantly, it does not appear that making contact with a dealer is difficult; individuals who use heroin report living relatively close to their suppliers and report that their suppliers are relatively reliable (Roddy and Greenwald, 2009; Roddy et al., 2011). Therefore, large response costs may be related to the time and effort required to earn sufficient funds to purchase drugs.
Few researchers have evaluated whether and how unpredictable and lean drug access affects drug self-administration in nonhumans, and whether such access differentially alters neurobiological outcomes associated with drug taking compared to the more commonly used, predictable and rich-access schedules of reinforcement. Similarly, it is well established that adding nondrug alternatives to the environment can reduce drug self-administration in nonhumans, yet no one to my knowledge has evaluated whether and how unpredictable access to nondrug alternatives affects drug self-administration when drug and nondrug reinforcers are concurrently available. Although there are additional aspects of unpredictability that could affect reinforcement and punishment processes involved in SUDs, the current review will focus on drugs as reinforcers under unpredictable response requirements and unpredictable amounts. These aspects arguably model the time and effort put into obtaining drugs (i.e., response requirement) and drug quality, respectively (the latter generally referring to drug amount; see Madden et al., 2007 for a review of variable response requirements and variable amounts with nondrug reinforcers). Because a defining feature of SUD is continued drug use in the face of negative consequences, it also will be critical for future work to consider the unpredictable nature of negative outcomes associated with SUDs and to evaluate how unpredictable consequences in the form of negative reinforcement or punishment influence drug self-administration in nonhuman animals. Indeed, some of this work has already been done (e.g., Marchant et al., 2013; Negus, 2005).
2. Unpredictable Response Requirements
Predictability and unpredictability can be studied in the nonhuman laboratory with fixed and variable schedules of reinforcement. While the literature on this topic is limited in terms of drug reinforcement, direct comparisons have been made between fixed and variable schedules of nondrug reinforcement in nonhumans (e.g., food or liquid reinforcers). Variable-ratio (VR) schedules of reinforcement are response-based schedules that require a varying number of responses per reinforcer delivery. For example, under a VR 20 schedule, a given reinforcer may require as few as one response or considerably more than 20 responses, but on average, 20 responses are required for reinforcer delivery. A random-ratio (RR) schedule is similar to a VR schedule in that a variable number of responses is required for any given reinforcer delivery. However, the scheduling is such that each response has a constant probability of being reinforced (Madden et al., 2007; Schoenfeld et al., 1956). For example, on an RR 20 schedule, the probability that each response will result in reinforcer delivery is 1 in 20, or a 5% chance. The availability of illicit drugs in the natural environment may mirror VR or RR schedules in that drug seeking sometimes results in relatively immediate reinforcement, such as the high associated with drug taking or relief of withdrawal symptoms. Alternatively, drug seeking sometimes requires a large amount of time and effort, resulting in more delayed acquisition of the drug effect.
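To make the distinction concrete, the following minimal simulation sketch (illustrative only, not code from any of the studies reviewed; Python is used here simply for convenience) generates response requirements under FR 20 and RR 20 schedules. The FR requirement is always exactly 20, whereas RR requirements average 20 but occasionally equal 1 and occasionally far exceed 20.

```python
import random

def fr_requirement(n=20):
    """Fixed ratio: every reinforcer costs exactly n responses."""
    return n

def rr_requirement(n=20):
    """Random ratio: each response is reinforced with probability 1/n, so the
    number of responses per reinforcer is geometrically distributed (mean = n),
    and a single response occasionally produces a reinforcer."""
    responses = 0
    while True:
        responses += 1
        if random.random() < 1.0 / n:
            return responses

random.seed(0)
rr_costs = [rr_requirement() for _ in range(10_000)]
print("FR 20 cost per reinforcer:", fr_requirement())             # always 20
print("RR 20 mean cost:", round(sum(rr_costs) / len(rr_costs)))   # approximately 20
print("RR 20 minimum and maximum costs:", min(rr_costs), max(rr_costs))
```

A VR schedule differs from the RR sketch above only in that the varying requirements are typically drawn from a preset list of ratios that averages the nominal value, rather than being generated probabilistically response by response.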
In nonhuman animals, large-value VR schedules of reinforcement result in high-rate behavior, with little pausing after reinforcer delivery or between response bouts, whereas large-value FR schedules also result in high-rate behavior but with relatively long pauses after reinforcer delivery (e.g., Ferster and Skinner, 1957). Comparisons between responding maintained by a range of FR and VR schedule values have been made in behavioral-economic procedures in the nonhuman laboratory. In traditional behavioral-economic experiments, the number of responses required for each reinforcer delivery systematically increases across several sessions until the number of reinforcers delivered reaches zero or near-zero levels (e.g., Hursh, 1991; Hursh and Silberberg, 2008). The number of reinforcers earned (consumption) is plotted as a function of response requirement (price), and the “elasticity” of the resulting demand curve is calculated based on the slope obtained from non-linear regression. When decreases in consumption are proportionally less than increases in price, demand is deemed inelastic, and when decreases in consumption are proportionally greater than increases in price, demand is deemed elastic. Reinforcers that result in more elastic demand are thought to be less effective reinforcers than those that result in more inelastic demand.
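For reference, one widely used formulation of the demand curve described above is the exponential demand equation of Hursh and Silberberg (2008): log Q = log Q0 + k(e^(−α·Q0·C) − 1), where Q is consumption at cost C, Q0 is consumption at zero (or minimal) cost, k specifies the range of consumption in logarithmic units, and α is the rate of decline in consumption as cost increases. Larger values of α correspond to more elastic demand (a less effective, or less “essential,” reinforcer), whereas smaller values of α correspond to more inelastic demand.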
In nonhuman subjects, demand for a nondrug reinforcer under RR schedules is less elastic than demand for the same reinforcer under FR schedules (Madden et al., 2005), indicating that the effectiveness of the same reinforcer was increased by making its response requirement unpredictable. Only recently was this area of study applied to drug self-administration (Lagorio and Winger, 2014). In rhesus monkeys, behavior was maintained by cocaine, remifentanil, or ketamine under FR and RR schedules. Compared with FR schedules, self-administration maintained under RR schedules resulted in less elastic demand, indicating that the effectiveness of the drug in maintaining behavior was greater under RR compared with FR schedules. In addition, the discrepancy between behavior maintained by FR and RR schedules was most robust when relatively small drug doses were available under relatively large average response requirements. In other words, conditions of lean, unpredictable drug reinforcement resulted in greater behavioral output (i.e., greater reinforcing effectiveness) compared to FR schedules of the same average value. If the results obtained by Lagorio and Winger (2014) with rhesus monkeys generalize to humans, they would indicate that lean and unpredictable access to illicit drugs is not likely to reduce drug-seeking behavior. Rather, drug-seeking may be enhanced during periods of lean and unpredictable drug availability.
Recent work by Mascia and colleagues (2019) has shown a different, yet interesting relation between VR schedules and drug self-administration. Rats with a history of responding under a VR schedule of saccharin delivery subsequently self-administered more amphetamine under a PR schedule of reinforcement compared to rats with a history of working for saccharin under an FR schedule. This work suggests that unpredictability of commodity acquisition in the environment can strengthen drug-taking behavior, even when the unpredictability is unrelated to drug procurement. If this translates to humans, an unpredictable source of nondrug alternatives in the natural environment also could worsen drug-seeking behavior, which has implications for the role of impoverished conditions in the development of SUDs.
In choice arrangements, we have known for some time that rodents and pigeons choose nondrug reinforcers associated with VR or RR schedules over the same reinforcer associated with an FR schedule (e.g., Fantino, 1967; Field et al., 1996; Madden and Hartman, 2006), and they do this even when the average response requirement for the VR schedule is greater than that for the FR schedule (Ahearn et al., 1992; Goldshmidt and Fantino, 2004; Johnson et al., 2011, 2012). Thus, in some situations, rodents will allocate their behavior toward a VR schedule despite the occasional occurrence of a more delayed payoff and overall reinforcer loss. Importantly, and like results with a single-lever arrangement, choice of nondrug reinforcers associated with VR schedules is most robust under larger average response requirements (Fantino, 1967; Field et al. 1996; Madden and Hartman, 2006). For example, in Madden and Hartman’s (2006) experiment, when choice was between FR 3 and RR 3 schedules, nonhuman subjects allocated their behavior approximately equally among alternatives. As the average value of both schedules was raised, however, choice switched primarily to the reinforcer associated with the RR schedule at values as low as 48 and persisted to values as high as 384 or 768.
To date, a recent report from my laboratory is the only drug self-administration experiment to evaluate choice between drug reinforcers associated with FR or VR schedules in nonhuman animals (Huskinson et al., 2017). In this experiment, three of four rhesus monkeys chose cocaine associated with a VR 30 schedule over cocaine associated with an FR 30 schedule. However, this effect was repeatable in only two of the four subjects, and the effect was not statistically significant. It is possible that the average cost was too small to result in reliable choice of the cocaine option associated with the VR schedule. In Lagorio and Winger’s (2014) experiment, also with rhesus monkeys but not in the context of choice, reliable differences in responding were not observed at average response requirements lower than 100. Taken together, the results from experiments using both nondrug and drug reinforcers suggest that unpredictable access under relatively lean response costs results in excessive allocation of behavior toward such outcomes. If translatable to individuals with SUDs, unpredictable drug access under relatively lean conditions could contribute to excessive allocation of behavior toward procuring drug reinforcers at the expense of engaging in activities that result in more predictable nondrug alternatives.
Finally, it is important to highlight a handful of self-administration experiments with nonhumans in which a VR or RR schedule of reinforcement was used. Panlilio and colleagues (1996, 2000) used a relatively small-value VR schedule (i.e., VR 3 or 5) to maintain heroin or cocaine self-administration, and Mello and colleagues (2013) used a VR 16 schedule to maintain cocaine self-administration (with an FR component that produced a light). In these experiments, the VR was not directly compared to an FR schedule. Thus, it is difficult to say whether or how the VR schedule impacted drug self-administration. In other experiments, an RR 25 schedule was used to maintain ethanol self-administration for subsequent resurgence tests (Podlesnik et al., 2006; Pyszcynski and Shahan, 2013). These experiments also were not designed to compare an RR schedule to an FR schedule. Podlesnik and colleagues (2006) did note that resurgence of responding previously maintained by ethanol in their experiment was greater than reinstatement of responding previously maintained by ethanol in other experiments. Obviously, procedural differences across these studies preclude firm conclusions regarding the role of RR schedules in enhanced drug taking or seeking. These studies do, however, suggest an effect of RR schedules on these behaviors that should be studied systematically in future work.
3. Unpredictable Amounts
In addition to unpredictability in terms of the time and effort an individual devotes to obtaining drugs, the quality of an individual’s drug supply may be unpredictable. Drug producers and dealers frequently use cutting agents for several reasons (e.g., Broséus et al., 2016; Fiorentin et al., 2019). This conceivably creates variation in the quality of a given drug supply, with greater amounts of cutting agents resulting in less overall drug and, thus, poorer quality. Anecdotally, individuals with past or current SUDs report getting “blanks” or purchasing poor-quality drugs from dealers (Szalavitz, 2016). Individuals who use heroin estimate a wide range of purity of their heroin supply (2–100%; Roddy and Greenwald, 2009; Roddy et al., 2011), and seized drugs like heroin, cocaine, and methamphetamine vary widely in terms of purity (e.g., Fiorentin et al., 2019), altogether indicating that unpredictability in drug quality may not be a significant deterrent that prevents individuals from continuing to purchase drugs. Conversely, fentanyl and fentanyl analogs have been increasingly reported in illicit opioids like heroin or counterfeit prescription opioids, and more recently, fentanyl has made its way into illicit stimulants like cocaine and methamphetamine (e.g., Ciccarone, 2017; Zibbell, 2019). The addition of fentanyl in this manner creates a unique aspect of unpredictable drug quality that has resulted in a marked increase in the potency and toxicity of illicit-drug sources.
In nonhumans, there are several operant- and classical-conditioning experiments in which variable nondrug reinforcer or reward amounts, respectively, were evaluated. A large portion of this work has involved choice procedures with nondrug reinforcers where choice was between variable vs. fixed reinforcer amounts (e.g., Lagorio and Hackenberg, 2012; Logan, 1965; Marsh and Kacelnik, 2002; McCoy and Platt, 2005; McSweeney et al., 2003). Unlike the results discussed above for variable response requirements, results from nonhuman choice studies with variable nondrug reinforcer amounts are mixed. Indeed, many report the opposite of what is found with variable response requirements: fixed nondrug reinforcer amounts are chosen over variable amounts, or choice is indifferent. As with variable schedules, almost none of the nonhuman animal work with variable reinforcer amounts was done with drug reinforcers. An exception is the Huskinson et al. (2017) study described above, where rhesus monkeys chose between a fixed dose of cocaine available under an FR 30 schedule and variable doses of cocaine also available under an FR 30 schedule. When the total intake possible was held constant by maintaining the same average dose on both options, subjects chose the variable-dose option over the fixed-dose option, a statistically significant effect that was present in all four subjects (some more robust than others). However, when the average doses were adjusted so that they were no longer equal, only one subject continued to choose the variable-amount option even when it resulted in less possible drug intake. The other three subjects chose the larger average dose when doses were adjusted, regardless of whether the larger average dose was fixed or variable.
Though no other preclinical study with nonhumans has been done to examine choice between self-administered drug reinforcers of fixed vs. variable amounts, there are two studies (Bickel et al., 2004; Kirshenbaum et al., 2006) in which human participants made choices between fixed and variable hypothetical heroin amounts or potencies and fixed and variable delays to hypothetical heroin delivery that were equal on average. Bickel and colleagues (2004) found that heroin-dependent individuals, under a simulated state of opioid withdrawal, chose the variable option more than the fixed one, an effect that was most robust with larger average amounts and potencies and longer average delays. However, under a simulated state of opioid satiation, the same participants chose the fixed option more than the variable one, except with relatively long average delays. The second study conducted by Kirshenbaum and colleagues (2006) reported similar, yet nuanced effects with fixed vs. variable hypothetical heroin amounts. Importantly, they replicated the finding that simulated opioid withdrawal enhanced choice of the variable option and that this effect was more robust with intravenous users compared with intranasal users. Clearly, more research is needed to determine whether and under what conditions (e.g., withdrawal vs. satiation) variable drug amounts are chosen over fixed drug amounts, though the nonhuman and human experiments described here suggest that unpredictable drug amounts garner more behavioral allocation than predictable drug amounts. If translatable to the natural environment, unpredictable drug quality may not deter drug-taking behavior and perhaps could worsen this behavior when an individual is experiencing withdrawal.
Classical conditioning protocols also have been used to evaluate variability of reward deliveries on learning in nonhuman animals. For example, in incentive-salience research (see Berridge and Robinson, 2016 for review), two common dependent measures are sign- and goal-tracking behaviors. Sign-tracking is measured as behaviors directed at a reward-predictive cue (i.e., a conditioned stimulus), and goal-tracking is measured as behaviors directed at the location of reward delivery (i.e., directed toward the food aperture). Nonhumans that tend to display sign-tracking have phenotypes argued to resemble prominent features of addiction and SUDs (see Tomie et al., 2008, for a review), and this is considered to be a vulnerable phenotype compared to nonhumans that prominently display goal-tracking. In outbred rats, it is common to see that some animals develop sign-tracking, some goal-tracking, and some show a mixture of the two when reward deliveries are predictable or certain (Fitzpatrick et al., 2013). However, when both the amount and probability of reward delivery were variable from trial to trial, rates of sign-tracking were significantly greater compared to conditions of reward certainty (Anselme et al., 2013; Robinson et al., 2014, 2015), suggesting that unpredictable reward deliveries can enhance the development of this vulnerable phenotype. No study, to my knowledge, has examined effects of variable delivery of drug rewards in this context. The possibility remains that variable drug delivery would produce more robust sign-tracking than fixed nondrug rewards.
4. Behavioral Mechanisms and Theories
Several potential mechanisms or theories have been proposed to account for the behavioral outcomes seen with variable schedules and reinforcer amounts compared with fixed schedules and reinforcer amounts (e.g., Goldshmidt and Fantino, 2004; Houston et al., 2014; Kacelnik and Bateson, 1996; Kacelnik and El Mouden, 2013; Madden et al., 2007; Mishra, 2014). One important difference is clear (especially for relatively large average response requirements): Under a VR or RR schedule, while some reinforcers require more responses than the average cost requirement, it also is possible for reinforcer delivery to occur after a single response or after a small number of responses. Conversely, FR schedules always require the same scheduled number of responses to obtain a reinforcer. When the schedule is a lean FR, the response cost is always relatively large. The relatively immediate payoff of receiving a reinforcer after a low response-cost assignment under a VR or RR schedule is likely a controlling factor in the high levels of behavior seen with single-lever variable schedules and in the disproportionate choice of reinforcers associated with variable schedules over fixed ones. In fact, reinforcer choice associated with VR schedules is greatest when the smallest possible ratio value is 1, and the effect diminishes as the smallest possible ratio value approaches that of the FR schedule (e.g., Fantino, 1967; Field et al., 1996).
4.1. Hyperbolic Delay Discounting
Madden and colleagues (2005, 2007, 2011) as well as others (Field et al., 1996) have proposed that the disproportionate allocation of behavior toward reinforcers associated with VR or RR schedules can be explained within a delay-discounting framework. Completing a response requirement necessarily takes time, and larger requirements take more time to complete. Thus, one can think of smaller response requirements as resulting in shorter delays to reinforcement and larger response requirements as resulting in longer delays to reinforcement. Furthermore, it is well established that a reinforcer loses its value as the delay to its delivery increases, and the rate of decline in value is a hyperbolic function of delay (e.g., Mazur, 1987). That is, a reinforcer loses its value most precipitously at shorter delays and less rapidly as delays become longer. According to Madden and colleagues, the overall value of a reinforcer available under the range of response requirements that constitute a VR schedule (i.e., VR schedules have small and large values) is subjectively greater than the value of the fixed but always delayed reinforcer available under an FR schedule.
Figure 1 shows hypothetical discounting curves that can be used to illustrate the point made by Madden and colleagues (2005, 2007, 2011). In Figure 1, the subjective value of a delayed reinforcer with a minimum possible value of 0 and a maximum possible value of 100 is plotted as a function of delay to reinforcer delivery in arbitrary units. Focusing first on curve A (open symbols) and an average delay of 25 units, when the delay to reinforcer delivery is fixed, its subjective value is 16 (see Figure 1 for values used for each data point). Using a two-value variable schedule that is equal on average to a fixed schedule, the reinforcer is sometimes available immediately and sometimes after a delay of 50 units. When immediate, the subjective value is 100 (i.e., it retains its full value), and when delivered after a delay of 50 units, the subjective value is 8.5. The sum of the two values under the variable schedule after multiplying each by their probability of occurrence (p = 0.5) is 54.25. Thus, the combined subjective value under the variable schedule (54.25) is greater than that of the fixed but equal-on-average schedule (16). Given the hyperbolic nature of temporal discounting, greater “weight” is therefore applied to the sometimes-immediate delivery that can occur under variable schedules and proportionately less weight is applied to the sometimes-delayed delivery that can occur under variable schedules.
Figure 1. Hypothetical Delay Discounting Curves.
The subjective value of a delayed reinforcer with a minimum value equal to 0 and a maximum value equal to 100 is plotted as a function of the delay to reinforcer delivery in arbitrary units. Data points in curves A (open circles) and B (closed circles) were fit with the hyperbolic discounting equation: V = A/(1 + kD), where V is the present value of the delayed reinforcer, A is the magnitude of the delayed reinforcer, k is a parameter that reflects the rate of discounting, and D is the delay to reinforcer delivery (Mazur, 1987). Curve A y-axis data points (subjective values) are 100, 16, 8.5, 4, 3, 3, and 2.5 at x-axis delays 0, 25, 50, 100, 150, 175, and 200, respectively, and the resulting k value is 0.2. Curve B y-axis data points (subjective values) are 100, 67, 50, 33, and 20 at x-axis delays 0, 25, 50, 100, and 200, respectively, and the resulting k value is 0.02.
A delay-discounting framework also can account for the finding that choice becomes more extreme with larger average response requirements. To illustrate this point, recall the example above with fixed and variable schedules with an equal-on-average delay of 25 units and respective subjective values of 16 and 54.25. The difference between the subjective values in this example is 38.25. If we carry out the thought experiment with a larger delay of 100 units, the subjective value of the reinforcer associated with the fixed schedule is 4, while the subjective values of the equal-on-average variable delays of 0 and 200 units are 100 and 2.5, respectively. Multiplying each value by its probability of occurrence (p = 0.5) and summing the two values results in a combined subjective value of 51.25 for the variable delay of 100 units. The difference between subjective values of the fixed and variable delays of 100 units is now 47.25 (i.e., 51.25 − 4 = 47.25) and is larger than the difference in subjective values obtained with fixed vs. variable delays of 25 units (i.e., 38.25). Thus, the larger the average delay to reinforcer delivery, the greater the difference in subjective value between reinforcers associated with VR compared to FR schedules. In addition, a delay-discounting framework can account for the finding that reinforcer choice associated with VR schedules is greatest when the smallest possible ratio value is 1 and that the effect diminishes as the smallest possible ratio value approaches that of the FR schedule. Using an average delay of 100 units, the combined subjective values of the following two-value schedules, 0 and 200, 25 and 175, 50 and 150, and 100 and 100, are 51.25, 9.5, 5.75, and 4, respectively.
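To make the arithmetic above explicit, the following short sketch (an illustrative reconstruction, not code from the original reports) applies the hyperbolic equation from Figure 1 to fixed delays and to equal-on-average, two-value variable delays. Because the worked examples above use the rounded data-point values plotted in Figure 1 rather than values computed directly from the fitted curve, the numbers produced by the code differ slightly from those in the text, but the ordering and the growth of the fixed-variable difference with the average delay are the same.

```python
def hyperbolic_value(delay, amount=100.0, k=0.2):
    """Mazur (1987) hyperbolic discounting: V = A / (1 + k*D)."""
    return amount / (1.0 + k * delay)

def variable_value(delays, amount=100.0, k=0.2):
    """Probability-weighted value of a variable schedule whose component
    delays occur equally often (p = 0.5 each for a two-value schedule)."""
    return sum(hyperbolic_value(d, amount, k) for d in delays) / len(delays)

for average_delay in (25, 100):
    fixed = hyperbolic_value(average_delay)
    variable = variable_value([0, 2 * average_delay])  # equal on average to the fixed delay
    print(f"average delay {average_delay}: fixed = {fixed:.2f}, "
          f"variable = {variable:.2f}, difference = {variable - fixed:.2f}")
```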
Figure 1 also shows a hypothetical curve B. Note that curve B is shallower than curve A and has a k value that is an order of magnitude smaller than the k value for curve A (smaller k values are associated with shallower discounting). In the delay-discounting field, shallow discounting is thought to reflect self-controlled choice because a relatively shallow function indicates that the subject or participant will wait longer for a larger reinforcer, while a relatively steep function indicates that the subject or participant will not wait as long for a larger reinforcer and therefore chooses the smaller, more immediate reinforcer more often. According to the account proposed by Madden and colleagues (2007, 2011), individuals who display steeper discounting will choose reinforcers associated with variable schedules more robustly than individuals who display shallower discounting, thus suggesting a relation between trait impulsivity and the effects of variability on reinforcer value. The thought experiment can be carried out with curve B as was done above for curve A. The result is that the difference in subjective values between fixed and variable delays is less extreme with curve B compared to curve A. This has relevance to individuals with SUDs, who have been shown consistently over more than two decades of research to discount monetary rewards more steeply than control participants who do not have a SUD (for a meta-analysis, see Amlung et al., 2017). Individuals with SUDs also discount hypothetical drug rewards more steeply than hypothetical money (e.g., Coffey et al., 2003; Madden et al., 1997). Altogether, this suggests that individuals with SUDs may be more likely than individuals without SUDs to allocate their behavior toward outcomes with variable response costs or delays to reinforcement than to outcomes with fixed response costs or delays to reinforcement. Similarly, it is possible that the steeper discounting of drug reinforcers compared to nondrug reinforcers seen in individuals with SUDs could result in an even greater propensity to allocate behavior toward variable drug outcomes compared with variable nondrug outcomes.
4.2. Risk-Sensitive Foraging Theory
With variable reinforcer amounts, at least in the case of drug vs. drug choice, it is possible that a similar effect occurred, in which greater weight was applied to the sometimes-larger drug delivery and less weight was applied to the sometimes-smaller drug delivery. I am aware of only three experiments in which choice was between fixed and variable drug amounts. One was with cocaine in nonhuman primates (Huskinson et al., 2017), and the other two were in human participants with hypothetical heroin outcomes (Bickel et al., 2004; Kirshenbaum et al., 2006). While these experiments found fairly reliable and robust choice of the variable outcomes, other experiments conducted with variable nondrug reinforcer amounts have not reliably resulted in choice of the variable option (Kacelnik and El Mouden, 2013). In fact, the opposite is often true: animals tend to choose the fixed option or display indifference between options.
Several foraging theories have been developed to account for choices made between variable and fixed reinforcer amounts as well as variable and fixed delays to reinforcer delivery (see Craft, 2016; Houston and Wiesner, 2020; Kacelnik and El Mouden, 2013; Mishra, 2014 for recent discussions). Perhaps the most well-known foraging theory is risk-sensitive foraging, developed by behavioral ecologists to account for foraging decisions made by animals in the acquisition of food (e.g., Mishra, 2014). Risk-sensitive foraging theory was developed from optimal foraging theory, which predicts that behavior will be allocated among available foraging opportunities in a way that maximizes reproductive value or, said another way, that maximizes caloric gains and minimizes the time or effort spent foraging (see Pyke, 1984 for an early review). Optimal foraging theory predicts indifference among alternatives that are equal, but it is now well established that nonhumans and humans do not always display indifference among options that are equal on average. Therefore, risk-sensitive foraging theory was developed to describe allocation of behavior among available options when at least one outcome is uncertain or variable. Here, a subject is described as risk prone when it chooses an uncertain or variable outcome and as risk averse when it chooses a certain or fixed outcome.
With variable reinforcer amounts, nonhuman subjects tend to be risk averse or indifferent, but some have noted that when animals do choose variable reinforcer amounts, the choice appears to be linked to the amount of food available in the subject’s environment, or in other words, the subject’s income (see Kacelnik and Bateson, 1996 for a review; Madden et al., 2007 for a discussion). This observation led to the development of the budget rule, which predicts that subjects will be risk averse when in a positive energy budget (i.e., the subject has excess caloric intake) and risk prone when in a negative energy budget (i.e., the subject has insufficient caloric intake). Translation of the budget rule to drug reinforcers was made in two studies with human participants choosing between fixed and variable heroin outcomes under simulated states of opioid satiation (a positive energy budget) or opioid withdrawal (a negative energy budget) (Bickel et al., 2004; Kirshenbaum et al., 2006), and their results largely supported the budget rule.
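One simple way to express the budget rule described above (an illustrative formalization based on the verbal description, not an equation taken from the studies cited) is as follows: if n choices remain, each option yields an average amount m per choice, and R is the amount required (e.g., the calories needed to survive the night or, by analogy, the drug needed to stave off withdrawal), then a subject maximizing the probability of meeting the requirement should choose the fixed option when n × m ≥ R (risk aversion, because the certain option suffices) and the variable option when n × m < R (risk proneness, because only the variable option offers any chance of reaching R).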
While risk-sensitive foraging theory’s budget rule historically had somewhat strong support, some have argued that it has received less support over time. According to Kacelnik and El Mouden (2013), only 24% of the studies with nonhuman subjects reviewed by those authors provided support for the budget rule when it comes to nondrug reinforcers. However, others have noted that despite mixed results obtained with nonhumans foraging for food, recent research evaluating human behavior has been more supportive (Mishra, 2014). According to this author, studies with humans have provided somewhat strong support for risk-sensitive foraging theory and the budget rule. Clearly, more research with drug reinforcers is needed to make strong conclusions about the ability of the budget rule to predict choices among fixed vs. variable drug reinforcers in humans or nonhuman animals.
Other foraging theories that were not discussed here have been developed to account for some risk-sensitive foraging outcomes (e.g., scalar utility theory, prospect theory). In general, these theories tend to predict allocation of choice to variable delays and costs over fixed ones and tend to predict the opposite with variable amounts (i.e., that choice will be allocated to fixed amounts over variable ones; see Craft, 2016; Houston and Wiesner, 2020; Kacelnik and El Mouden, 2013; Mishra, 2014 for recent discussions of different theories). To my knowledge, these theories have not been applied to drug reinforcers, and additional research is needed to determine which of these theories can best account for choice among drug reinforcers that are fixed or variable in nature.
5. Potential Neurochemical/Neurobiological Mechanisms
Some neurochemical evidence exists that could help explain why variable situations are sometimes chosen over fixed ones. A relatively large body of literature describes a phenomenon referred to as “dopamine prediction error”, which suggests that unpredictable reward delivery produces a larger, more sustained dopaminergic response compared to predictable reward delivery (Fiorillo et al., 2003) and that the dopaminergic response can predict subjects’ choices of predictable or unpredictable reinforcer amounts (Sugam et al., 2012; see Nasser et al., 2017; Schultz, 2016 for recent reviews of prediction error). A similar effect has been demonstrated with VR schedules. When rats’ behavior was maintained by saccharin delivery, dopamine overflow in the nucleus accumbens (NAc) was stable across increasing response requirements for rats in the FR group, whereas for rats in the VR group, dopamine overflow in the NAc was an increasing function of the average response requirement (Mascia et al., 2019). As the average VR requirement was raised, dopamine overflow became significantly greater for the VR group compared with the FR group. This enhanced dopamine overflow at larger average cost requirements under a VR schedule is consistent with the finding that more robust behavioral effects are seen with larger average response requirements.
Interestingly, exposure to a VR schedule of saccharin delivery has other effects related to drug taking. Compared with FR exposure, a history of VR exposure subsequently resulted in an enhanced locomotor response in rats following an amphetamine challenge, effects that also occur following amphetamine sensitization protocols (Mascia et al., 2019; Singer et al., 2012). These outcomes have led some to suggest that exposure to uncertainty produces a similar effect on the brain and behavior as amphetamine sensitization protocols; in other words, unpredictable access to nondrug reinforcers or rewards could be sensitizing dopamine neurons in a manner similar to drug exposure (Robinson and Anselme, 2019; Zack et al., 2014). Since most drugs of abuse robustly enhance dopamine overflow in the NAc, it is possible that the effects of the drug and of uncertainty combine to produce an even larger dopaminergic response. Such an outcome likely would contribute to the behavioral outcomes described above for humans and nonhumans under conditions of unpredictable drug availability.
6. Future Directions
While there is insufficient evidence at this stage to know whether incorporating conditions of lean and unpredictable access in nonhuman animal models of self-administration will improve the model’s value or its predictive validity, this area of research has numerous unexplored directions that could be taken and numerous independent variables that have not been evaluated in a systematic way. Some of this work is currently underway in my nonhuman primate laboratory, and it is not feasible to describe all of the possibilities for future research here. Many different self-administration procedures could easily incorporate unpredictable and lean drug access. For example, if unpredictable drug access contributes to excessive or worsened drug-taking patterns, one would expect quicker acquisition, greater intake, and greater resistance to extinction or punishment.
Unpredictable drug access also can be incorporated into drug-choice procedures. The only drug self-administration experiment to examine choice between predictable and unpredictable outcomes used drug vs. drug choice with rhesus monkeys (i.e., intravenous cocaine; Huskinson et al., 2017). Investigating choice between reinforcers of the same type was a logical first step in studying choice between unpredictable and predictable outcomes because such studies are easier to conduct and to interpret. However, drug vs. drug choice does not capture the excessive allocation of behavior toward drug taking at the expense of engaging in nondrug-related activities. Drug vs. nondrug choice studies are better suited for studying allocation of behavior toward drug at the expense of obtaining nondrug alternatives. When used in preclinical models, drug vs. nondrug choice studies have been demonstrated to have high predictive validity (see Banks et al., 2015 for a review), as evidenced by the fact that behavioral or pharmacological treatments applied to this choice situation align with outcomes (both positive and negative) in the human laboratory and in contingency-management settings (e.g., Greenwald and Steinmiller, 2009; Higgins et al., 1994; Lile et al., 2016; Packer et al., 2012; Stoops et al., 2012). Thus, future research should examine choice between predictable and unpredictable schedules in drug vs. food (or other nondrug reinforcer) choice procedures. If unpredictable access contributes to excessive drug taking, one would expect drug choice to be enhanced when its delivery is unpredictable and nondrug reinforcer delivery is predictable compared with traditional drug vs. nondrug choice in which both alternatives are available under a predictable schedule. Excessive drug taking would be reflected by an increase in the potency of the drug as a reinforcer (i.e., a leftward shift of the dose-choice function) under the unpredictable arrangement.
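As a purely hypothetical illustration of what such a comparison might look like (the schedules, doses, and condition names below are placeholders, not parameters from any published or planned study), a drug vs. food choice experiment could arrange a baseline in which both options are predictable and a test condition in which only the drug option is unpredictable, with percent drug choice determined across a range of unit doses in each condition:

```python
# Hypothetical condition grid for a drug vs. food choice experiment comparing
# predictable and unpredictable drug schedules. All values are illustrative placeholders.
conditions = {
    "baseline (both predictable)": {"drug_schedule": "FR 30", "food_schedule": "FR 30"},
    "test (drug unpredictable)":   {"drug_schedule": "VR 30", "food_schedule": "FR 30"},
}
unit_doses = [0.003, 0.01, 0.03, 0.1]  # e.g., mg/kg/injection, spanning the choice function

for condition, schedules in conditions.items():
    for dose in unit_doses:
        # Dependent measure: percent choice of the drug option at each dose.
        # If unpredictable access enhances drug taking, percent drug choice at a given
        # dose should be higher in the test condition than at baseline (i.e., a leftward
        # shift, reflecting an increase in the drug's potency as a reinforcer).
        print(f"{condition}: {dose} vs. food "
              f"[drug: {schedules['drug_schedule']}, food: {schedules['food_schedule']}]")
```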
Future research also could be conducted to evaluate whether drug choice can be reduced by making the nondrug reinforcer unpredictable in terms of amount and cost. This type of research could have implications for contingency management, a behavioral therapy modeled by drug vs. nondrug choice and, importantly, an effective psychosocial treatment for SUDs across a variety of substances and populations (e.g., Davis et al., 2016; Dutra et al., 2008). Clinical research in contingency management has examined effects of probabilistic, “prize-based” schedules of nondrug reinforcer delivery as well as “standard” contingency management (e.g., Olmstead and Petry, 2009; Petry et al., 2005). Future research with nonhuman subjects could determine parameters of unpredictable reinforcement that are most effective at reducing drug choice and increasing nondrug choice.
Considering the potential neurochemical mechanisms described above, future research could determine the dopaminergic response to predictable compared with unpredictable drug deliveries in rich- and lean-access conditions in any of the operant- or classical-conditioning procedures described above. No research has been conducted to determine whether a similar enhancement of the dopaminergic response would occur when drug self-administration behavior is maintained by a VR schedule as has been reported with saccharin (Mascia et al., 2019). In addition, we know neurotransmitters other than dopamine and brain regions other than the NAc can be altered by drugs and can play a significant role in drug-taking behavior. The neuroscience community could identify a number of directions for evaluating the effects of unpredictable and lean drug access on neurochemical and neurobiological outcomes (e.g., McCoy and Platt, 2005; Monosov and Hikosaka, 2013; Robinson and Anselme, 2019; Soltani and Izquierdo, 2019).
Finally, this review focused on unpredictability in terms of cost or quality by manipulating the schedule of reinforcement or the amount of each reward or reinforcer delivery. It is not likely that these are the only dimensions of drug reinforcement that are variable in the natural environment. In addition, there are negative outcomes associated with drug seeking and drug taking that are unpredictable. For example, overdose or drug toxicity, incarceration, and violence associated with drug use are unpredictable. Intoxication itself can result in behaviors one might not normally engage in when not under the influence of an illicit drug (e.g., having risky sex, sharing needles), and these behaviors can have unpredictable or uncertain consequences that are worthy of study in their own right. While unpredictability in terms of reinforcement and punishment processes can be modeled in nonhuman subjects, some of these aspects of drug taking are uniquely human.
7. Conclusions
To the extent that the time and effort required to obtain drugs, or other features of the drug-taking environment, are unpredictable, unpredictable drug access could worsen the drug-seeking and drug-taking behavior seen in individuals with SUDs. Once more fully explored, this area of research will help determine whether and how unpredictable and lean cost requirements affect drug self-administration in preclinical laboratory studies with nonhuman subjects. Hopefully, this review will help guide other researchers who wish to explore unpredictable and lean-access conditions in their current self-administration procedures. Inclusion of such outcomes and conditions may be more reflective of the natural environment for people with SUDs (i.e., may have a greater degree of face validity). However, more research is needed to determine whether incorporating lean and unpredictable conditions in current self-administration models will increase their predictive validity. It should be noted that the behavioral outcomes obtained with lean and unpredictable schedules of nondrug reinforcement (e.g., high-rate behavior, resistance to disruption, and excessive allocation) suggest that incorporating these features into self-administration would create a baseline more akin to the type of drug seeking and taking that is so difficult to disrupt in humans. If this baseline is indeed more difficult to disrupt and garners more behavioral allocation to drug reinforcers, it has the potential to reduce false positives and better predict outcomes in humans. If greater predictive validity can be obtained, it is possible that this area of research will improve our understanding of drug taking in humans in the natural environment and may one day help to provide effective treatments for individuals with SUDs.
Highlights.
Some aspects of addiction are not captured in preclinical drug self-administration
Access to illicit drugs in the natural environment may be lean and unpredictable
Lean and unpredictable access could worsen drug seeking and taking behavior
More research is needed to determine the predictive validity of the directions proposed
Acknowledgements
The author would like to thank Kevin B. Freeman, Ph.D. and James K. Rowlett, Ph.D. for their feedback on versions of this manuscript. Preparation of this manuscript was supported by a National Institute on Drug Abuse grant DA045011 to S.L.H. The funding source had no role in the writing of this review.
References
- Ahearn W, Hineline PN, David FG, 1992. Relative preferences for various bivalued ratio schedules. Animal Learning & Behavior, 20, 407–415. 10.3758/BF03197964
- Ahmed SH, 2010. Validation crisis in animal models of drug addiction: Beyond non-disordered drug use toward drug addiction. Neuroscience & Biobehavioral Reviews, 35, 172–184. 10.1016/j.neubiorev.2010.04.005
- Ahmed SH, 2012. The science of making drug-addicted animals. Neuroscience, 211, 107–125. 10.1016/j.neuroscience.2011.08.014
- Ahmed SH, Lenoir M, Guillem K, 2013. Neurobiology of addiction versus drug use driven by lack of choice. Current Opinion in Neurobiology, 23, 581–587. 10.1016/j.conb.2013.01.028
- Amlung M, Vedelago L, Acker J, Balodis I, MacKillop J, 2017. Steep delay discounting and addictive behavior: A meta-analysis of continuous associations. Addiction, 112, 51–62. 10.1111/add.13535
- Anderson KG, Velkey AJ, Woolverton WL, 2002. The generalized matching law as a predictor of choice between cocaine and food in rhesus monkeys. Psychopharmacology, 163, 319–326. 10.1007/s00213-002-1012-7
- Anselme P, Robinson MJF, Berridge KC, 2013. Reward uncertainty enhances incentive salience attribution as sign-tracking. Behavioural Brain Research, 238, 53–61. 10.1016/j.bbr.2012.10.006
- Banks ML, Hutsell BA, Schwienteck KL, Negus SS, 2015. Use of preclinical drug vs. food choice procedures to evaluate candidate medications for cocaine addiction. Current Treatment Options in Psychiatry, 2, 136–150. 10.1007/s40501-015-0042-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Banks ML, Negus SS, 2012. Preclinical determinants of drug choice under concurrent schedules of drug self-administration. Advances in Pharmacological Sciences, 2012. 10.1155/2012/281768 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Banks ML, Negus SS, 2017. Insights from preclinical choice models on treating drug addiction. Trends in Pharmacological Sciences, 38, 181–194. 10.1016/j.tips.2016.11.002 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Berridge K,C, Robinson TE, 2016. Liking, wanting and the incentive-sensitization theory of addiction. American Psychologist, 71, 670–679. 10.1037/amp0000059 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bickel WK, Giordano LA, Badger GJ, 2004. Risk-sensitive foraging theory elucidates risky choices made by heroin addicts. Addiction, 99, 855–861. 10.1111/j.1360-0443.2004.00733.x [DOI] [PubMed] [Google Scholar]
- Bolin BL, Lile JA, Marks KR, Beckmann JS, Rush CR, Stoops WW, 2016. Buspirone reduces sexual risk-taking intent but not cocaine self-administration. Experimental and Clinical Psychopharmacology, 24, 162–173. 10.1037/pha0000076 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Brandt L, Jones JD, Martinez S, Manubay JM, Mogali S, Ramey T, Levin FR, Comer SD, 2020. Effects of lorcaserin on oxycodone self-administration and subjective responses in participants with opioid use disorder. Drug and Alcohol Dependence, 208, 107859 10.1016/j.drugalcdep.2020.107859 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Broséus J, Gentile N, Esseiva P, 2016. The cutting of cocaine and heroin: A critical review. Forensic Science International, 262, 73–83. 10.1016/j.forsciint.2016.02.033 [DOI] [PubMed] [Google Scholar]
- Cadet JL, 2019. Animal models of addiction: Compulsive drug taking and cognition. Neuroscience and Biobehavioral Reviews, 106, 5–6. 10.1016/j.neubiorev.2019.05.026 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Campbell UC, Carroll ME, 2000. Reduction of drug self-administration by an alternative non-drug reinforcer in rhesus monkeys: Magnitude and temporal effects. Psychopharmacology, 147, 418–425. 10.1007/s002130050011 [DOI] [PubMed] [Google Scholar]
- Ciccarone D, 2017. Fentanyl in the US heroin supply: A rapidly changing risk environment. The International Journal on Drug Policy, 46, 107–111. 10.1016/j.drugpo.2017.06.010 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Coffey SF, Gudleski GD, Saladin ME, Brady KT, 2003. Impulsivity and rapid discounting of delayed hypothetical rewards in cocaine-dependent individuals. Experimental and Clinical Psychopharmacology, 11, 18–25. 10.1037//1064-1297.11.1.18 [DOI] [PubMed] [Google Scholar]
- Craft BB, 2016. Risk-sensitive foraging: Changes in choice due to reward quality and delay. Animal Behaviour, 111, 41–47. 10.1016/j.anbehav.2015.09.030 [DOI] [Google Scholar]
- Davis DR, Kurti AN, Skelly JM, Redner R, White TJ, Higgins ST, 2016. A review of the literature on contingency management in the treatment of substance use disorders, 2009–2014. Preventive Medicine, 92, 36–46. 10.1016/j.ypmed.2016.08.008 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dutra L, Stathopoulou G, Basden SL, Leyro TM, Powers MB, Otto MW, 2008. A meta-analytic review of psychosocial interventions for substance use disorders. American Journal of Psychiatry, 165, 179–187. 10.1176/appi.ajp.2007.06111851 [DOI] [PubMed] [Google Scholar]
- Fantino E, 1967. Preference for mixed- versus fixed-ratio schedules 1. Journal of the Experimental Analysis of Behavior, 10, 35–43. 10.1901/jeab.1967.10-35 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ferster CB, Skinner BF, 1957. Schedules of reinforcement. Acton, Massachusetts
- Field DP, Tonneau F, Ahearn W, Hineline PN, 1996. Preference between variable-ratio and fixed-ratio schedules: Local and extended relations. Journal of the Experimental Analysis of Behavior, 66, 283–295. 10.1901/jeab.1996.66-283
- Fiorentin TR, Krotulski AJ, Martin DM, Browne T, Triplett J, Conti T, Logan BK, 2019. Detection of cutting agents in drug-positive seized exhibits within the United States. Journal of Forensic Sciences, 64, 888–896. 10.1111/1556-4029.13968
- Fiorillo CD, Tobler PN, Schultz W, 2003. Discrete coding of reward probability and uncertainty by dopamine neurons. Science, 299, 1898–1902. 10.1126/science.1077349
- Fitzpatrick CJ, Gopalakrishnan S, Cogan ES, Yager LM, Meyer PJ, Lovic V, … Flagel SB, 2013. Variation in the form of Pavlovian conditioned approach behavior among outbred male Sprague-Dawley rats from different vendors and colonies: Sign-tracking vs. goal-tracking. PLoS One, 8, e75042. 10.1371/journal.pone.0075042
- Goldshmidt JN, Fantino E, 2004. Economic context and pigeons’ risk-taking: An integrative approach. Behavioural Processes, 65, 133–154. 10.1016/j.beproc.2003.08.002
- Greenwald MK, Steinmiller CL, 2009. Behavioral economic analysis of opioid consumption in heroin-dependent individuals: Effects of alternative reinforcer magnitude and post-session drug supply. Drug and Alcohol Dependence, 104, 84–93. 10.1016/j.drugalcdep.2009.04.006
- Greenwald MK, Steinmiller CL, 2014. Cocaine behavioral economics: From the naturalistic environment to the controlled laboratory setting. Drug and Alcohol Dependence, 141, 27–33. 10.1016/j.drugalcdep.2014.04.028
- Higgins ST, Bickel WK, Hughes JR, 1994. Influence of an alternative reinforcer on human cocaine self-administration. Life Sciences, 55, 179–187. 10.1016/0024-3205(94)00878-7
- Houston AI, Fawcett TW, Mallpress DE, McNamara JM, 2014. Clarifying the relationship between prospect theory and risk-sensitive foraging theory. Evolution and Human Behavior, 35, 502–507. 10.1016/j.evolhumbehav.2014.06.010
- Houston AI, Wiesner K, 2020. Gains v. losses, or context dependence generated by confusion? Animal Cognition, 23, 361–366. 10.1007/s10071-019-01339-1
- Hursh SR, 1991. Behavioral economics of drug self-administration and drug abuse policy. Journal of the Experimental Analysis of Behavior, 56, 377–393. 10.1901/jeab.1991.56-377
- Hursh SR, Silberberg A, 2008. Economic demand and essential value. Psychological Review, 115, 186–198. 10.1037/0033-295X.115.1.186
- Huskinson SL, Freeman KB, Petry NM, Rowlett JK, 2017. Choice between variable and fixed cocaine injections in male rhesus monkeys. Psychopharmacology, 234, 2353–2364. 10.1007/s00213-017-4659-9
- Huskinson SL, Myerson J, Green L, Rowlett JK, Woolverton WL, Freeman KB, 2016. Shallow discounting of delayed cocaine by male rhesus monkeys when immediate food is the choice alternative. Experimental and Clinical Psychopharmacology, 24, 456–463. 10.1037/pha0000098
- Huskinson SL, Naylor JE, Rowlett JK, Freeman KB, 2014. Predicting abuse potential of stimulants and other dopaminergic drugs: Overview and recommendations. Neuropharmacology, 87, 66–80. 10.1016/j.neuropharm.2014.03.009
- Huskinson SL, Woolverton WL, Green L, Myerson J, Freeman KB, 2015. Delay discounting of food by rhesus monkeys: Cocaine and food choice in isomorphic and allomorphic situations. Experimental and Clinical Psychopharmacology, 23, 184–193. 10.1037/pha0000015
- Johnson PS, Madden GJ, Brewer AT, Pinkston JW, Fowler SC, 2011. Effects of acute pramipexole on preference for gambling-like schedules of reinforcement in rats. Psychopharmacology, 213, 11–18. 10.1007/s00213-010-2006-5
- Johnson PS, Madden GJ, Stein JS, 2012. Effects of acute pramipexole on male rats’ preference for gambling-like rewards II. Experimental and Clinical Psychopharmacology, 20, 167–172. 10.1037/a0027117
- Kacelnik A, Bateson M, 1996. Risky theories—the effects of variance on foraging decisions. American Zoologist, 36, 402–434. 10.1093/icb/36.4.402
- Kacelnik A, El Mouden C, 2013. Triumphs and trials of the risk paradigm. Animal Behaviour, 86, 1117–1129. 10.1016/j.anbehav.2013.09.034
- Kirshenbaum AP, Bickel WK, Boynton DM, 2006. Simulated opioid withdrawal engenders risk-prone choice: A comparison of intravenous and intranasal-using populations. Drug and Alcohol Dependence, 83, 130–136. 10.1016/j.drugalcdep.2005.11.002
- Lagorio CH, Hackenberg TD, 2012. Risky choice in pigeons: Preference for amount variability using a token-reinforcement system. Journal of the Experimental Analysis of Behavior, 98, 139–154. 10.1901/jeab.2012.98-139
- Lagorio CH, Winger G, 2014. Random-ratio schedules produce greater demand for iv drug administration than fixed-ratio schedules in rhesus monkeys. Psychopharmacology, 231, 2981–2988. 10.1007/s00213-014-3477-6
- Lile JA, Stoops WW, Rush CR, Negus SS, Glaser PEA, Hatton KW, Hays LR, 2016. Development of a translational model to screen medications for cocaine use disorder II: Choice between intravenous cocaine and money in humans. Drug and Alcohol Dependence, 165, 111–119. 10.1016/j.drugalcdep.2016.05.022
- Ling W, Hillhouse MP, Saxon AJ, Mooney LJ, Thomas CM, Ang A, Matthews AG, Hasson A, Annon J, Sparenborg S, Liu DS, McCormack J, Church S, Swafford W, Drexler K, Schuman C, Ross S, Wiest K, Korthuis PT, Lawson W, … Rotrosen J, 2016. Buprenorphine + naloxone plus naltrexone for the treatment of cocaine dependence: the Cocaine Use Reduction with Buprenorphine (CURB) study. Addiction, 111, 1416–1427. 10.1111/add.13375
- Logan FA, 1965. Decision making by rats: Uncertain outcome choices. Journal of Comparative and Physiological Psychology, 59, 246–251. 10.1037/h0021850
- Madden GJ, Petry NM, Badger GJ, Bickel WK, 1997. Impulsive and self-control choices in opioid-dependent patients and non-drug-using control participants: Drug and monetary rewards. Experimental and Clinical Psychopharmacology, 5, 256–262. 10.1037//1064-1297.5.3.256
- Madden GJ, Dake JM, Mauel EC, Rowe RR, 2005. Labor supply and consumption of food in a closed economy under a range of fixed- and random-ratio schedules: Tests of unit price. Journal of the Experimental Analysis of Behavior, 83, 99–118. 10.1901/jeab.2005.32-04
- Madden GJ, Ewan EE, Lagorio CH, 2007. Toward an animal model of gambling: Delay discounting and the allure of unpredictable outcomes. Journal of Gambling Studies, 23, 63–83. 10.1007/s10899-006-9041-5
- Madden GJ, Francisco MT, Brewer AT, Stein JS, 2011. Delay discounting and gambling. Behavioural Processes, 87, 43–49. 10.1016/j.beproc.2011.01.012
- Madden GJ, Hartman EC, 2006. A steady-state test of the demand curve analysis of relative reinforcer efficacy. Experimental and Clinical Psychopharmacology, 14, 79–86. 10.1037/1064-1297.14.1.79
- Maguire DR, Gerak LR, France CP, 2013. Delay discounting of food and remifentanil in rhesus monkeys. Psychopharmacology, 229, 323–330. 10.1007/s00213-013-3121-x
- Marchant NJ, Khuc TN, Pickens CL, Bonci A, Shaham Y, 2013. Context-induced relapse to alcohol seeking after punishment in a rat model. Biological Psychiatry, 73, 256–262. 10.1016/j.biopsych.2012.07.007
- Marsh B, Kacelnik A, 2002. Framing effects and risky decisions in starlings. Proceedings of the National Academy of Sciences of the United States of America, 99, 3352–3355. 10.1073/pnas.042491999
- Mascia P, Neugebauer NM, Brown J, Bubula N, Nesbitt KM, Kennedy RT, Vezina P, 2019. Exposure to conditions of uncertainty promotes the pursuit of amphetamine. Neuropsychopharmacology, 44, 274–280. 10.1038/s41386-018-0099-4
- Mazur JE, 1987. An adjusting procedure for studying delayed reinforcement, in: Commons ML, Mazur JE, Nevin JA, Rachlin H (Eds.), Quantitative analysis of behavior: The effect of delay and of intervening events on reinforcement value. Erlbaum, Hillsdale, New Jersey, pp. 55–73.
- McCoy AN, Platt ML, 2005. Risk-sensitive neurons in macaque posterior cingulate cortex. Nature Neuroscience, 8, 1220–1227. 10.1038/nn1523
- McSweeney FK, Kowal BP, Murphy ES, 2003. The effect of rate of reinforcement and time in session on preference for variability. Animal Learning & Behavior, 31, 225–241. 10.3758/BF03195985
- Mello NK, Fivel PA, Kohut SJ, Bergman J, 2013. Effects of chronic buspirone treatment on cocaine self-administration. Neuropsychopharmacology, 38, 455–467. 10.1038/npp.2012.202
- Mishra S, 2014. Decision-making under risk: Integrating perspectives from biology, economics, and psychology. Personality and Social Psychology Review, 18, 280–307. 10.1177/1088868314530517
- Monosov IE, Hikosaka O, 2013. Selective and graded coding of reward uncertainty by neurons in the primate anterodorsal septal region. Nature Neuroscience, 16, 756–762. 10.1038/nn.3398
- Nader MA, Woolverton WL, 1991. Effects of increasing the magnitude of an alternative reinforcer on drug choice in a discrete-trials choice procedure. Psychopharmacology, 105, 169–174. 10.1007/BF02244304
- Nader MA, Woolverton WL, 1992. Effects of increasing response requirement on choice between cocaine and food in rhesus monkeys. Psychopharmacology, 108, 295–300. 10.1007/BF02245115
- Nasser HM, Calu DJ, Schoenbaum G, Sharpe MJ, 2017. The dopamine prediction error: Contributions to associative models of reward learning. Frontiers in Psychology, 8, 244. 10.3389/fpsyg.2017.00244
- Negus SS, 2003. Rapid assessment of choice between cocaine and food in rhesus monkeys: Effects of environmental manipulations and treatment with d-amphetamine and flupenthixol. Neuropsychopharmacology, 28, 919–931. 10.1038/sj.npp.1300096
- Negus SS, 2005. Effects of punishment on choice between cocaine and food in rhesus monkeys. Psychopharmacology, 181, 244–252. 10.1007/s00213-005-2266-7
- O’Connor EC, Chapman K, Butler P, Mead AN, 2011. The predictive validity of the rat self-administration model for abuse liability. Neuroscience & Biobehavioral Reviews, 35, 912–938. 10.1016/j.neubiorev.2010.10.012
- Olmstead TA, Petry NM, 2009. The cost-effectiveness of prize-based and voucher-based contingency management in a population of cocaine- or opioid-dependent outpatients. Drug and Alcohol Dependence, 102, 108–115. 10.1016/j.drugalcdep.2009.02.005
- Packer RR, Howell DN, McPherson S, Roll JM, 2012. Investigating reinforcer magnitude and reinforcer delay: A contingency management analog study. Experimental and Clinical Psychopharmacology, 20, 287–292. 10.1037/a0027802
- Panlilio LV, Weiss SJ, Schindler CW, 1996. Cocaine self-administration increased by compounding discriminative stimuli. Psychopharmacology, 125, 202–208. 10.1007/BF02247329
- Panlilio LV, Weiss SJ, Schindler CW, 2000. Effects of compounding drug-related stimuli: Escalation of heroin self-administration. Journal of the Experimental Analysis of Behavior, 73, 211–224. 10.1901/jeab.2000.73-211
- Petry NM, Alessi SM, Marx J, Austin M, Tardif M, 2005. Vouchers versus prizes: Contingency management treatment of substance abusers in community settings. Journal of Consulting and Clinical Psychology, 73, 1005–1014. 10.1037/0022-006X.73.6.1005
- Pickens R, Thompson T, 1968. Cocaine-reinforced behavior in rats: Effects of reinforcement magnitude and fixed-ratio size. The Journal of Pharmacology and Experimental Therapeutics, 161, 122–129.
- Pike E, Stoops WW, Rush CR, 2016. Acute buspirone dosing enhances abuse-related subjective effects of oral methamphetamine. Pharmacology, Biochemistry, and Behavior, 150, 87–93. 10.1016/j.pbb.2016.09.009
- Platt DM, Rowlett JK, 2012. Nonhuman primate models of drug and alcohol addiction, in: Abee CR, Mansfield K, Tardif S, Morris T (Eds.), Nonhuman Primates in Biomedical Research: Diseases, Volume 2, Elsevier Inc, pp. 817–839.
- Podlesnik CA, Jimenez-Gomez C, Shahan TA, 2006. Resurgence of alcohol seeking produced by discontinuing non-drug reinforcement as an animal model of drug relapse. Behavioural Pharmacology, 17, 369–374. 10.1097/01.fbp.0000224385.09486.ba
- Pyke GH, 1984. Optimal foraging theory: A critical review. Annual Review of Ecology and Systematics, 15, 523–575.
- Pyszczynski AD, Shahan TA, 2013. Loss of nondrug reinforcement in one context produces alcohol seeking in another context. Behavioural Pharmacology, 24, 496–503. 10.1097/FBP.0b013e328364502a
- Robinson MJF, Anselme P, 2019. How uncertainty sensitizes dopamine neurons and invigorates amphetamine-related behaviors. Neuropsychopharmacology, 44, 237–238. 10.1038/s41386-018-0130-9
- Robinson MJF, Anselme P, Fischer AM, Berridge KC, 2014. Initial uncertainty in Pavlovian reward prediction persistently elevates incentive salience and extends sign-tracking to normally unattractive cues. Behavioural Brain Research, 266, 119–130. 10.1016/j.bbr.2014.03.004
- Robinson MJF, Anselme P, Suchomel K, Berridge KC, 2015. Amphetamine-induced sensitization and reward uncertainty similarly enhance incentive salience for conditioned cues. Behavioral Neuroscience, 129, 502–511. 10.1037/bne0000064
- Roddy J, Greenwald M, 2009. An economic analysis of income and expenditures by heroin-using research volunteers. Substance Use & Misuse, 44, 1503–1518. 10.1080/10826080802487309
- Roddy J, Steinmiller CL, Greenwald MK, 2011. Heroin purchasing is income and price sensitive. Psychology of Addictive Behaviors, 25, 358–364. 10.1037/a0022631
- Schoenfeld WN, Cumming WW, Hearst E, 1956. On the classification of reinforcement schedules. Proceedings of the National Academy of Sciences of the United States of America, 42, 563. 10.1073/pnas.42.8.563
- Schultz W, 2016. Dopamine reward prediction-error signalling: A two-component response. Nature Reviews Neuroscience, 17, 183–195. 10.1038/nrn.2015.26
- Silverman K, Chutuape M, Bigelow GE, Stitzer ML, 1999. Voucher-based reinforcement of cocaine abstinence in treatment-resistant methadone patients: Effects of reinforcement magnitude. Psychopharmacology, 146, 128–138. 10.1007/s002130051098
- Singer BF, Scott-Railton J, Vezina P, 2012. Unpredictable saccharin reinforcement enhances locomotor responding to amphetamine. Behavioural Brain Research, 226, 340–344. 10.1016/j.bbr.2011.09.003
- Smith MA, 2020. Nonhuman animal models of substance use disorders: Translational value and utility to basic science. Drug and Alcohol Dependence, 206, 107733. 10.1016/j.drugalcdep.2019.107733
- Soltani A, Izquierdo A, 2019. Adaptive learning under expected and unexpected uncertainty. Nature Reviews Neuroscience, 20, 635–644. 10.1038/s41583-019-0180-y
- Stoops WW, Lile JA, Glaser PEA, Hays LR, Rush CR, 2012. Alternative reinforcer response cost impacts cocaine choice in humans. Progress in Neuro-Psychopharmacology & Biological Psychiatry, 36, 189–193. 10.1016/j.pnpbp.2011.10.003
- Sugam JA, Day JJ, Wightman RM, Carelli RM, 2012. Phasic nucleus accumbens dopamine encodes risk-based decision-making behavior. Biological Psychiatry, 71, 199–205. 10.1016/j.biopsych.2011.09.029
- Szalavitz M, 2017. Unbroken brain: A revolutionary new way of understanding addiction. Picador/St. Martin’s Press, New York.
- Tomie A, Grimes KL, Pohorecky LA, 2008. Behavioral characteristics and neurobiological substrates shared by Pavlovian sign-tracking and drug abuse. Brain Research Reviews, 58, 121–135. 10.1016/j.brainresrev.2007.12.003
- Vanderschuren LJ, Ahmed SH, 2013. Animal studies of addictive behavior. Cold Spring Harbor Perspectives in Medicine, 3, a011932. 10.1101/cshperspect.a011932
- Volkow ND, Michaelides M, Baler R, 2019. The neuroscience of drug reward and addiction. Physiological Reviews, 99, 2115–2140. 10.1152/physrev.00014.2018
- Winhusen TM, Kropp F, Lindblad R, Douaihy A, Haynes L, Hodgkins C, Chartier K, Kampman KM, Sharma G, Lewis DF, VanVeldhuisen P, Theobald J, May J, Brigham GS, 2014. Multisite, randomized, double-blind, placebo-controlled pilot clinical trial to evaluate the efficacy of buspirone as a relapse-prevention treatment for cocaine dependence. The Journal of Clinical Psychiatry, 75, 757–764. 10.4088/JCP.13m08862
- Woolverton WL, Anderson KG, 2006. Effects of delay to reinforcement on the choice between cocaine and food in rhesus monkeys. Psychopharmacology, 186, 99–106. 10.1007/s00213-006-0355-x
- Zack M, Featherstone RE, Mathewson S, Fletcher PJ, 2014. Chronic exposure to a gambling-like schedule of reward predictive stimuli can promote sensitization to amphetamine in rats. Frontiers in Behavioral Neuroscience, 8, 36. 10.3389/fnbeh.2014.00036
- Zibbell JE, 2019. The latest evolution of the opioid crisis: Changing patterns in fentanyl adulteration of heroin, cocaine, and methamphetamine and associated overdose risk. RTI International. https://www.rti.org/insights/latest-evolution-opioid-crisis-changing-patterns-fentanyl-adulteration-heroin-cocaine-and