Abstract
Deficits in decision making are at the heart of many psychiatric diseases, such as substance abuse disorders and attention deficit hyperactivity disorder. Consequently, rodent models of decision making are germane to understanding the neural mechanisms underlying adaptive choice behavior and how such mechanisms can become compromised in pathological conditions. A critical factor that must be integrated with reward value to ensure optimal decision making is the occurrence of consequences, which can differ based on probability (risk of punishment) and temporal contiguity (delayed punishment). This article will focus on two models of decision making that involve explicit punishment, both of which recapitulate different aspects of consequences during human decision making. We will discuss each behavioral protocol, the parameters to consider when designing an experiment and, finally, how such animal models can be utilized in studies of psychiatric disease.
Introduction
Whether we are deciding what to eat for dinner or evaluating a job offer, our ability to make optimal decisions is vital to our survival. This is evident when considering the fact that many psychiatric diseases, such as substance use disorder (SUD), are characterized by deficits in decision making (Bechara, 2005; Gowin, Mackey, & Paulus, 2013; Gowin, Sloan, Ramchandani, Paulus, & Lane, 2018). Not only can these cognitive impairments perpetuate the disease itself, but they can also lead to other poor life outcomes, such as criminality and financial loss. Animal models of decision making therefore have great utility in the study of the behavioral and neurobiological mechanisms underlying such decision-making deficits. Further, they allow for the controlled manipulation of the different variables being evaluated during decision making to determine how different consequences impact choice behavior.
The “Risky Decision-making Task” (RDT) is one such animal model that simulates the choice between options that differ in their relative reward and the risk of adverse consequences that may accompany them (Orsini, Blaes, Setlow, & Simon, 2019; Simon, Gilbert, Mayse, Bizon, & Setlow, 2009). In the RDT, subjects choose between a small, safe food reward and a large food reward that is accompanied by an escalating risk of footshock punishment. Performance in this task recapitulates individual differences in risk taking observed in the human population, endowing it with high translational validity (Simon et al., 2009; Simon et al., 2011). Further, performance in this task is easily dissociable from measures of anxiety, motivation and pain and is sensitive to pharmacological manipulations (Blaes et al., 2018; Orsini et al., 2018; Simon et al., 2009; Simon et al., 2011) and chronic drug self-administration (Mitchell et al., 2014). Finally, consistent with human literature (Byrnes, Miller, & Schafer, 1999; Cross, Copping, & Campbell, 2011; Grissom & Reyes, 2019; Killgore, Grugle, Killgore, & Balkin, 2010), there are clear sex differences in choice behavior in the RDT, with females exhibiting greater risk aversion than males (Orsini, Willis, Gilbert, Bizon, & Setlow, 2016). Considered together, these task attributes demonstrate that the RDT is a behavioral paradigm that reliably models risk-based decision making, providing a useful means by which to understand risk taking, dysfunctional or otherwise, in humans.
Rewards associated with immediate and certain punishments are typically avoided, but punishments occurring later in time exhibit diminished control over behavior (Banks & Vogel-Sprott, 1965; Baron, 1965). This is particularly relevant to motivational disorders such as SUD, during which consequences (such as withdrawal symptoms and financial concerns) often manifest long after drug use (Bechara, Dolan, & Hindes, 2002). The “Delayed Punishment Decision-making Task” (DPDT) measures sensitivity to delayed vs. immediate consequences utilizing a behavioral schema comparable to the RDT, with subjects given a choice between small, safe and large, punished rewards. Unlike the probabilistic punishment in RDT, shock occurrence is guaranteed following punished reward choice, with DPDT instead manipulating time elapsed between the decision and the punishment (Liley, Gabriel, Sable, & Simon, 2019). Initially, punishment occurs immediately upon a decision; as the task progresses, however, the delay preceding punishment is incrementally increased, resulting in reduced temporal contingency between action and aversive outcome. Critically, reward is always delivered immediately after a decision, rendering shock delay as the only dynamic factor within the task. The DPDT reveals that, as in humans (Banks & Vogel-Sprott, 1965; Yates, 1975), rats underestimate, or “discount”, the negative value of delayed punishment, leading to increased preference for the punished reward. Moreover, male rats discount delayed punishment more than females, despite showing comparable avoidance of rewards associated with immediate punishment (Liley et al., 2019). 
Thus, this task is among the first (see also Rodriguez, Bouzas, & Orduna, 2018; Woolverton, Freeman, Myerson, & Green, 2012) animal protocols that enables investigation of neuronal mechanisms of delayed punishment during decision making, which has utility toward development of sex-specific biomarkers and therapeutic intervention relevant to aberrant sensitivity to consequences in psychopathological disorders.
Strategic Planning
Several considerations should be addressed when planning an experiment using the protocols described below. The first is verifying that the protocols in the preferred software work correctly in the behavioral system. While this seems trivial, the coding involved in these protocols, irrespective of whether GraphicState 4 (Coulbourn Instruments) or MED PC (Med Associates) is used, is complex and will likely contain errors in its first iteration. The coding process differs between these two software packages, but both are equally capable of executing these protocols.
Secondly, it is important to identify the objective of the experiment as it will determine how shock parameters are adjusted during training in the tasks. For example, if innate group differences are being tested, such as young vs. aged or male vs. female, the shock intensities used in the task should be identical for all subjects in the study. Further, as it is common that shocks will need to be adjusted over the course of an experiment, any changes to shock intensities should occur for all subjects similarly. This is an important consideration for this type of study since shock intensities are powerful drivers of punishment-related decision-making behavior (Shimp, Mitchell, Beas, Bizon, & Setlow, 2015). Hence, if differences in choice behavior do exist, they can be attributed to true innate group differences (i.e., age or sex, respectively) rather than differences in shock intensities. Similarly, if individual differences are being examined in relation to choice preference, shock intensities should also remain identical across subjects (and adjusted as such) and, if there are inherent group differences (e.g., males vs. females) that are not central to the objective of the study, shock intensities should be identical and adjusted equivalently within each group, even if they may differ between groups. This is in contrast with the strategy used when assessing effects of a specific manipulation on choice behavior. In this instance, the shock intensities should be titrated individually for each subject to ensure that their mean choice performance is as close to the center of the parametric space as possible, reducing as much between-subjects variability as possible. As an example, if the question is whether lesions of a brain region alter choice performance, one would want the pre-lesion baseline to have sufficient parametric space above and below it to readily observe either increases or decreases in choice behavior as a result of the lesion. 
Hence, identifying which strategy to use when setting and adjusting shock parameters prior to the start of the experiment will establish the optimal conditions in which to address specific scientific questions.
Notably, certain strains of rats acquire the task more rapidly than other strains. While Long-Evans and Sprague-Dawley rats readily shape to press levers, strains such as Fischer 344 rats, Fischer 344 × Brown Norway hybrids and Lewis rats typically require additional shaping sessions to reach criteria. Additionally, these strains may need more RDT training sessions than Long-Evans and Sprague-Dawley rats to reach behavioral stability (see Critical Parameters). Despite these slight differences, all strains are capable of learning the contingencies required for both tasks, and on average will display robust discounting curves. Recently, the RDT has been extended and adapted for use in mice using a touchscreen system (Glover, in press). Like rats, mice are sensitive to risk of punishment, decreasing their choice of the large, risky reward as the probability of punishment increases. The DPDT, however, has yet to be adapted for use in mice.
Prior to behavioral training in the operant boxes, it is important to acclimate the rats to the experimenter(s) and, to avoid neophobia during initial training, expose them to the food pellets that will serve as the reward in the RDT or DPDT (place 5-10 pellets in home cage). After several days of acclimation, rats should be food restricted to 85-90% of their free feeding weight. Training can begin 5-7 days after food restriction, at which point the rats should be close to their “target weight” and sufficiently motivated to work for food. If rats have undergone an invasive procedure (e.g., jugular catheterization), food restriction should begin at least 1 week after the procedure to ensure full recovery. To account for growth over the course of an experiment, which can last weeks to several months, the target weight should be increased by 5 g/week (with the exception of aged rats).
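The target weight arithmetic described above can be sketched as follows. This is a minimal illustration, not part of the published protocol: the function name and its parameters are hypothetical, and it assumes that the 5 g/week growth allowance is simply added to the 85-90% target range.

```python
def target_weight_range(free_feeding_g, weeks_elapsed=0, aged=False,
                        weekly_growth_g=5.0):
    """Return (low, high) target weight in grams.

    Assumption: the growth allowance (5 g/week) is added to both ends of
    the 85-90% range, and is skipped entirely for aged rats.
    """
    growth = 0.0 if aged else weekly_growth_g * weeks_elapsed
    low = 0.85 * free_feeding_g + growth
    high = 0.90 * free_feeding_g + growth
    return low, high


# Hypothetical rat with a 400 g free feeding weight, 4 weeks into a study.
low, high = target_weight_range(400.0, weeks_elapsed=4)
print(f"Target weight: {low:.0f}-{high:.0f} g")
```

Recomputing the range weekly, rather than once at the start of the experiment, keeps the restriction level comparable across a study that lasts several months.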
Materials
The following materials can be used for all Protocols:
Male and female rats (Long-Evans, but see Strategic Planning for discussion; optimal age is approximately 90 days old)
Windows-based PC with monitor, keyboard and mouse
Software to control behavioral system and collect data
GraphicState 4 (Coulbourn Instruments)
MED PC (Med Associates)
Customized protocols coded in GraphicState 4, MED PC and/or Matlab (available upon request)
Operant chamber that includes the following (See Figures 1 and 2 for Coulbourn and Med Associates operant chambers, respectively):
2 retractable levers
Recessed food trough with a light source and beams to detect entries
Pellet dispenser
Grid floor through which shock is delivered
Shock generator
House light
Locomotor sensor
Drip pan
Figure 1. Coulbourn operant chamber.
The Risky Decision-making Task can be conducted in a Coulbourn operant chamber (Orsini et al., 2016; Simon et al., 2009). Each chamber consists of two metal side walls, Plexiglass front and back walls and a stainless-steel grid floor through which footshocks are delivered. The chamber is equipped with a recessed food trough (not shown) and two retractable levers positioned to the left and right of the trough. Food is delivered via a food hopper located directly above the food trough. A house light is located on the top left part of the control board (“Environmental Control Board”) located directly behind the operant chamber. Finally, an activity monitor is attached to the top of the chamber to detect locomotor activity in the chamber below.
Figure 2. Med Associates operant chamber.
Both the Risky Decision-making Task and the Delayed Punishment Decision-making Task have been conducted using this operant chamber format (Freels et al., 2020; Liley et al., 2019). Each chamber consists of two metal side walls, Plexiglass front and back walls and a stainless-steel grid floor through which footshocks are delivered. The chamber is equipped with a recessed food trough and retractable levers positioned to the left and right of the trough (trough pictured is extra tall to enable unimpeded entry with a cranial implant). Food is delivered via a food hopper located directly above the food trough. A house light is located on the top central part of the back wall of the chamber. Finally, activity monitors are attached to both Plexiglass walls. The control board is located in the bottom corner behind the front wall (not shown).
Food pellets
Options include:
Sucrose pellets (Bioserv)
Soy-free pellets (TestDiet; see Critical Parameters)
Paper towels
Notebook
Disinfectant
Options include:
Quatricide
Nolvasan
Basic Protocol 1
Behavioral training:
Magazine training:
Rats are first shaped to perform the basic components of the RDT or DPDT, such as nosepoking and lever pressing. In the first of such shaping sessions (magazine training), a single food pellet is delivered 38 times across a 64-minute session, with each delivery occurring every 100 ± 40 seconds. Rats typically learn the association between the sound of food delivery and food availability in the food trough in a single session. This can be confirmed by checking the number of nosepokes into the food trough: if it is evident that the rat reliably entered the food trough (~ ≥ 100 nosepokes), the rat can proceed to the next phase of shaping. If not, additional sessions can be run until the rat reaches criterion.
Load the protocol for magazine training in the software program associated with the behavioral system and enter the subject names/numbers of the rats being trained in the session.
Place each rat in their corresponding operant chamber, closing both the door of the chamber as well as those of the sound-attenuating cubicle housing the operant chamber.
Once all rats are in the operant chambers, start the test session.
The number of nosepokes should be monitored while the rats are running and recorded when the session ends.
When the session is complete, remove the rat from the operant chamber.
Confirm that the pellets were consumed by the subject (i.e., check if there are pellets remaining in the food trough). If it is evident that rats did not retrieve and consume the pellets, repeat the session the following day.
Clean each chamber with disinfectant (e.g., Nolvasan), including the bottom of the grid floor.
Passing criterion: only when a rat reliably enters the food trough (~ ≥ 100 nosepokes) can the rat proceed to the next phase of shaping.
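The magazine training schedule can be sketched in a few lines of Python. This is an illustrative sketch only: the protocol specifies 38 deliveries at 100 ± 40 seconds, but the uniform sampling of intervals between 60 and 140 seconds is an assumption, and the function names are hypothetical rather than taken from the GraphicState 4 or MED PC code.

```python
import random


def magazine_schedule(n_deliveries=38, mean_s=100.0, jitter_s=40.0, seed=None):
    """Return cumulative pellet delivery times in seconds.

    Assumption: each inter-delivery interval is drawn uniformly from
    [mean_s - jitter_s, mean_s + jitter_s], i.e. 60-140 s by default.
    """
    rng = random.Random(seed)
    elapsed = 0.0
    times = []
    for _ in range(n_deliveries):
        elapsed += rng.uniform(mean_s - jitter_s, mean_s + jitter_s)
        times.append(elapsed)
    return times


def passes_magazine_training(nosepokes, criterion=100):
    """Apply the approximate passing criterion of >= 100 nosepokes."""
    return nosepokes >= criterion


schedule = magazine_schedule(seed=0)
print(f"{len(schedule)} deliveries, last at {schedule[-1] / 60:.1f} min")
```

Because the intervals are random, the time of the final delivery varies around 38 × 100 s (about 63 minutes), which is consistent with the 64-minute session length.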
Lever shaping:
Following magazine training, rats are trained to press levers for the delivery of a single food pellet (accompanied by the illumination of the light in the food trough). In this 30-minute session, the lever is always extended and the house light remains illuminated for the duration of the session. The rats are trained to press each lever in two separate sessions. If a rat presses one lever ≥ 50 times, the rat is then shaped to press the other lever in a similar manner in a separate session. Notably, the order in which the lever shaping sessions occur should be counterbalanced across rats. When this criterion has been met for both levers, the rat can proceed to the next phase of shaping. Training on each lever can take anywhere between 1 and 4 sessions.
Load the protocol for lever shaping in the software program associated with the behavioral system and enter the subject names/numbers of the rats being trained in the session.
Place each rat in their corresponding operant chamber, closing both the door of the chamber as well as those of the sound-attenuating cubicle housing the operant chamber.
Once all rats are in the operant chambers, start the test session.
To encourage lever pressing, a single food pellet can be placed directly on the extended lever.
Upon conclusion of the session, remove the rat from the operant chamber.
Record the total number of lever presses for each rat.
Clean each chamber with disinfectant (e.g., Nolvasan), including the bottom of the grid floor.
Passing criterion: a rat must press each lever in its corresponding shaping session ≥ 50 times before proceeding to the next phase of shaping.
Nosepoke shaping:
In the next phase of shaping, rats learn to execute the basic sequence of the final task (initiate trial with a nosepoke, press a lever for reward delivery, collect food reward). Specifically, rats learn that a nosepoke results in the extension of either lever, a press on which results in the delivery of a single food pellet. Importantly, the order in which levers are extended is pseudorandom whereby no lever is presented more than twice in a row across consecutive trials. The beginning of each trial is signaled by the illumination of both the food trough and house lights. A nosepoke extinguishes the trough light and leads to the extension of one of the two levers. If no nosepoke occurs within 10 seconds of initial illumination, both the house light and trough light are extinguished and the intertrial interval (ITI) begins (20-25 seconds). If rats successfully nosepoke, they have 10 seconds in which to press the extended lever. A lever press results in the immediate delivery of a food pellet and lever retraction, but the failure to lever press results in lever retraction and the beginning of the ITI. To reach criterion, a rat must press ≥ 30 times on both levers within the 60-minute session. Although it is possible for rats to meet criterion in a single session, many rats usually require training in this protocol for 3-4 days.
Load the protocol for nosepoke shaping in the software program associated with the behavioral system and enter the subject names/numbers of the rats being trained in the session.
Place each rat in their corresponding operant chamber, closing both the door of the chamber as well as those of the sound-attenuating cubicle housing the operant chamber.
Once all rats are in the operant chambers, start the test session.
Upon conclusion of the session, remove the rat from the operant chamber.
Record the number of lever presses for each lever.
Clean each chamber with disinfectant (e.g., Nolvasan), including the bottom of the grid floor.
Passing criterion: a rat is required to press ≥ 30 times on each lever within a 60-minute session.
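The pseudorandom lever order used in nosepoke shaping (no lever presented more than twice in a row) can be generated as sketched below. The function name and the "left"/"right" labels are hypothetical; the actual task code may implement the constraint differently.

```python
import random


def pseudorandom_levers(n_trials, max_run=2, seed=None):
    """Return a lever sequence in which no lever repeats more than max_run times."""
    rng = random.Random(seed)
    sequence = []
    for _ in range(n_trials):
        options = ["left", "right"]
        # If the last max_run trials all used the same lever, force a switch.
        if len(sequence) >= max_run and len(set(sequence[-max_run:])) == 1:
            options.remove(sequence[-1])
        sequence.append(rng.choice(options))
    return sequence


print(pseudorandom_levers(12, seed=1))
```

Building the sequence trial by trial guarantees the run-length constraint, but it does not guarantee equal numbers of presentations per lever; a balanced design would need an additional check.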
Support Protocol 1
Equipment testing:
Each day before behavioral sessions commence, it is recommended that the operant system is checked to ensure all components of the operant box are functional and that there are no clogged feeders.
Turn on the computer that controls the operant boxes (Coulbourn or Med Associates), the power source for the operant boxes (e.g., the Power Base for Coulbourn systems) and the shock generators.
- Run a protocol in which each component of the operant box is tested. A typical test protocol would mimic or involve the same elements used in the actual behavioral tasks. Using such a protocol, perform the following steps for each operant box:
- Upon illumination of the trough light and house light, insert a finger into the food trough to trigger extension of levers.
- Press one lever to trigger the delivery of food pellets into the food trough and the delivery of scrambled shocks via the grid floor.
- Verify the delivery of the correct number of food pellets.
- Verify the delivery of shock by lightly placing a hand to the grid floor (ensure the shock intensity is set to a level at which it can be detected by the person who is testing the equipment).
A voltmeter or oscilloscope can be used to precisely assess shock output. For Coulbourn systems, a voltmeter is connected to a “Shock Level Tester” (Coulbourn Instruments), which itself is wired to the grid floor with alligator clips. Using this system, the voltage corresponding to a specific shock intensity (in mA) can be verified. For example, a shock intensity of 0.5 mA should result in a value of 4-5 V. If the reading is below the expected voltage, the grid floor can either be replaced and/or thoroughly cleaned and tested again. An oscilloscope can also be connected to the “Shock Level Tester” and used alone or in combination with the voltmeter to accurately measure shock intensity. While it is recommended that this procedure is done weekly, it may be cumbersome to do on a daily basis and is not necessary if grid floors are well maintained.
- Upon re-illumination of the trough light and house light, insert a finger to trigger extension of levers.
- Press the other lever to trigger the delivery of food pellets and delivery of scrambled shocks, using the same verification methods as described above.
- Verify that the correct number of finger “pokes” and lever presses for each operant box were recorded in the software.
- Remove any food pellets remaining in the food trough and/or in the drip pan located underneath the grid floor.
If any component does not work, try to identify the source of the problem and if it cannot be fixed, replace the part with a spare one (and then run the test protocol again to verify that the spare part also works; see Troubleshooting). To avoid unexpected gaps in data production, it is recommended to maintain at least one spare part for each of the above components.
If the behavioral system is functioning properly, behavioral sessions can begin.
Alternate Protocol 1
Reward discrimination:
Although not necessary, acquisition of the final task is often facilitated by an additional training phase in which rats learn that a press on one lever yields a large (2-3 pellets) food reward whereas a press on the other lever yields a small (1 pellet) food reward. Importantly, the identities of the levers (large vs. small reward) are counterbalanced across rats and should remain fixed throughout both reward discrimination and RDT or DPDT training. When given this choice, rats develop a strong preference for the lever associated with the large reward. In this 60-minute session, each 40-second trial begins with the illumination of the food trough and house light. A nosepoke into the food trough extinguishes the food trough light and results in the extension of either one (forced choice; see below) or both (free choice; see below) levers. Failure to press a lever within 10 seconds causes lever retraction and the beginning of the ITI, which ranges from 20 to 35 seconds (the duration of this interval is a function of the latency with which rats perform each step of the task). A successful lever press, however, causes the delivery of the small or large food reward (depending on lever choice) and the illumination of the food trough light, which is extinguished after 10 seconds or upon food collection, whichever occurs first.
Each session begins with 8 forced choice trials, in which each lever is individually presented four times in a pseudorandom order (similar to that used in nosepoke shaping). These trials establish the lever contingencies (1 vs. 2-3 pellets) prior to choice trials. The following 10 trials are free choice trials in which both levers are extended, and rats can choose either option, with both levers retracting after a choice is made. To proceed to the RDT or DPDT, a rat must choose the large reward on ≥ 80% of the free choice trials for a minimum of 3 consecutive days. Rats typically require 3-5 days to reach this criterion.
Load the protocol for reward discrimination in the software program associated with the behavioral system and enter the subject names/numbers of the rats being trained in the session.
Place each rat in their corresponding operant chamber, closing both the door of the chamber as well as those of the sound-attenuating cubicle housing the operant chamber.
Once all rats are in the operant chambers, start the test session.
Upon conclusion of the session, remove the rat from the operant chamber.
Record the number of lever presses for each lever.
Clean each chamber with disinfectant (e.g., Nolvasan), including the bottom of the grid floor.
Passing criterion: before proceeding to the RDT or DPDT, a rat must choose the large reward on ≥ 80% of the free choice trials for a minimum of 3 consecutive days.
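The reward discrimination criterion (≥ 80% choice of the large reward on at least 3 consecutive days) is easy to check programmatically when daily percentages are logged. The following is a minimal sketch; the function name and the input format (one percentage per training day, in chronological order) are assumptions for illustration.

```python
def meets_discrimination_criterion(daily_pct_large, threshold=80.0,
                                   days_required=3):
    """Return True once the threshold is met on days_required consecutive days."""
    run = 0
    for pct in daily_pct_large:
        # Extend the current run of passing days, or reset it after a miss.
        run = run + 1 if pct >= threshold else 0
        if run >= days_required:
            return True
    return False


# Hypothetical rat: criterion is met on day 4 (days 2-4 are all >= 80%).
print(meets_discrimination_criterion([70.0, 80.0, 90.0, 100.0]))
```

Note that a single day below 80% resets the count, so three passing days that are not consecutive do not satisfy the criterion.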
Basic Protocol 2
Risky Decision-making Task (RDT):
The standard RDT is 60 minutes in duration and consists of five blocks of 18 trials (Figure 3). Each 40-second trial begins with the illumination of the food trough and house light. A nosepoke into the food trough extinguishes its light and results in the extension of either one (forced choice) or both (free choice) levers. Failure to nosepoke within 10 seconds results in the termination of the food trough light and the beginning of the ITI (20-35 seconds). A press on one lever results in the delivery of a small, “safe” food reward (1 food pellet) and a press on the other lever results in the delivery of a large, “risky” food reward (2-3 pellets). Failure to lever press causes the levers to retract, initiating the ITI. The identity of these levers (small, safe vs. large, risky) should match the identity of the levers in the reward discrimination training. In the RDT, however, the delivery of the large reward is accompanied by a varying probability of a one-second footshock, which increases across the five blocks (0, 25, 50, 75, 100%). Importantly, the large food reward is always delivered, irrespective of shock delivery. Reward delivery is accompanied by the illumination of the food trough light, which is terminated after 10 seconds or upon food collection, whichever occurs first.
Figure 3. Representation of a free choice trial in the Risky Decision-making Task.
A. At the start of each trial, the food trough light is illuminated. B. A poke into the lit trough causes both levers to extend. Rats choose between the two levers, causing both to retract. C. Choice of the left lever results in immediate delivery of a single pellet, with no other programmed consequences (small, “safe” reward). D. Choice of the right lever results in immediate delivery of three pellets accompanied by increasing probabilities of footshock punishment (large, “risky” reward). After either lever press sequence, the trial progresses to the intertrial interval, and then repeats.
Each 18-trial block begins with eight forced choice trials, in which each lever is individually extended four times in a pseudorandom order (similar to that used in nosepoke shaping), followed by 10 free choice trials, in which both levers are extended. Forced choice trials are necessary to establish the shock contingencies for that block. For example, in the 75% block, three out of four lever presses on the large, risky lever will result in shock delivery, indicating that when presented with both levers in the free choice trials of that block, there is a 75% chance that upon selecting the large, risky lever, the reward will be accompanied by a footshock.
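The forced choice structure of a block can be sketched as follows. The sketch assumes, in line with the 75% example above, that the shock proportion is applied exactly to the four forced presses on the large, risky lever (e.g., three of four shocked at 75%), and that shock on free choice trials is drawn independently with the block probability. The latter, along with all function names, is an assumption rather than a description of the actual task code.

```python
import random

SHOCK_PROBS = [0.0, 0.25, 0.50, 0.75, 1.0]  # blocks 1-5


def forced_choice_trials(shock_prob, n_per_lever=4, max_run=2, rng=None):
    """Return 8 (lever, shocked) tuples for the forced choice phase of a block."""
    rng = rng or random.Random()
    n_shocked = round(shock_prob * n_per_lever)
    risky_outcomes = [True] * n_shocked + [False] * (n_per_lever - n_shocked)
    rng.shuffle(risky_outcomes)
    levers = ["safe"] * n_per_lever + ["risky"] * n_per_lever
    # Reshuffle until no lever appears more than max_run times in a row.
    while True:
        rng.shuffle(levers)
        windows = (levers[i:i + max_run + 1] for i in range(len(levers) - max_run))
        if all(len(set(w)) > 1 for w in windows):
            break
    outcomes = iter(risky_outcomes)
    return [(lev, next(outcomes) if lev == "risky" else False) for lev in levers]


def free_choice_shock(shock_prob, rng):
    """Assumed rule: shock is drawn independently on each risky free choice."""
    return rng.random() < shock_prob


rng = random.Random(0)
for block, p in enumerate(SHOCK_PROBS, start=1):
    print(block, forced_choice_trials(p, rng=rng))
```

The session length also follows from these numbers: 5 blocks × 18 trials × 40 seconds per trial equals 3600 seconds, or 60 minutes.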
It will take approximately 15-30 days for rats to reliably learn the task and exhibit stable performance. It is not uncommon, however, for it to take slightly longer if female or aged subjects are included in the study. Nevertheless, it is critical to analyze the data on a daily basis to determine whether certain parameters (e.g., shock intensities) need to be adjusted and to identify the point at which behavior has stabilized.
Load the protocol for the RDT in the software program associated with the behavioral system and enter the subject names/numbers of the rats being trained in the session.
Change the shock intensities for each corresponding operant chamber.
Place each rat in their associated operant chamber, closing both the door of the chamber as well as those of the sound-attenuating cubicle housing the operant chamber.
Once all rats are in the operant chambers, start the test session.
It is important to monitor rats’ performance during the session in case issues arise. For example, reduced lever pressing could indicate a broken component of the operant chamber. Similarly, if a rat’s performance differs significantly from previous sessions, it could indicate that the rat might be in the incorrect box. By monitoring the sessions closely, it is possible to correct these mistakes before they have a confounding impact on the data.
Upon conclusion of the session, remove the rat from the operant chamber.
Record the number of lever presses for each lever.
Clean each chamber with disinfectant (e.g., Nolvasan), including the bottom of the grid floor.
Data analysis:
The primary dependent variable used in the analysis of behavior in the RDT is the percentage of completed free choice trials in each block on which the rat chose the large, risky reward. Below is a step-by-step process of how to analyze and plot this output, followed by a discussion of other data analysis methods and strategies.
Compile the number of lever presses for the large, risky reward and for the small, safe reward in the free choice trials in each of the five blocks.
Forced choice trials should not be included, as these do not measure preference between options. Data analysis templates and/or code for the RDT are available upon request.
To calculate the percentage of large, risky choices for each block of trials, divide the number of free choice lever presses for the large, risky reward by the total number of free choice lever presses made in that block and multiply that number by 100: [large, risky lever presses / (large, risky lever presses + small, safe lever presses)] × 100.
Once these values have been calculated for each block for each subject, calculate the mean and standard error of the percentage of large, risky choices for each block across subjects.
This will result in five data points (and their corresponding standard error), each representing the mean percentage of large, risky choices for each of the five blocks for a group of subjects.
The data can be plotted as a five-point curve, with “Risk of punishment” on the x-axis and “Percent choice of the large, risky reward” on the y-axis.
At baseline, it is expected that rats will decrease their choice of the large, risky reward (otherwise referred to as decreased risk taking or decreased risky choice) as the risk of punishment increases (Figure 4). In other words, rats should discount the large reward as the risk of punishment increases. Note that although both males and females exhibit this typical discounting behavior, sex differences do exist such that males prefer the large, risky reward to a greater extent than females (Orsini et al., 2016).
Figure 4. Representation of stable behavior in the Risky Decision-making Task.
On average, as the risk of punishment increases across the behavioral test session, the choice of the large, risky reward decreases. Data are represented as mean choice of the large, risky reward ± standard error of the mean.
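The percentage and group summary calculations described in the data analysis steps can be sketched in Python as shown here. This is not the authors' analysis code (which is available from them upon request); the function names, the handling of blocks with no completed free choice trials, and the example counts are all hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def percent_risky(risky_presses, safe_presses):
    """Percent choice of the large, risky reward among completed free choice trials."""
    total = risky_presses + safe_presses
    if total == 0:
        return None  # every free choice trial in the block was omitted
    return 100.0 * risky_presses / total


def block_summary(per_subject_pcts):
    """Return (mean, SEM) across subjects for one block, skipping missing values."""
    values = [v for v in per_subject_pcts if v is not None]
    sem = stdev(values) / sqrt(len(values)) if len(values) > 1 else float("nan")
    return mean(values), sem


# Hypothetical free choice counts (risky, safe) for two rats across five blocks.
rats = [
    [(10, 0), (8, 2), (6, 4), (3, 7), (1, 9)],
    [(9, 1), (6, 4), (4, 6), (2, 8), (0, 10)],
]
for block in range(5):
    pcts = [percent_risky(*rat[block]) for rat in rats]
    m, sem = block_summary(pcts)
    print(f"Block {block + 1}: {m:.1f} ± {sem:.1f}")
```

Dividing by completed free choice trials, rather than by the 10 trials offered, prevents omissions from being counted as safe choices.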
Additional considerations for data analyses:
If a single data point is needed for data analyses (e.g., for pairwise correlations), the area under the curve (AUC) or the slope of the discounting curve can be calculated for this purpose. For these calculations, it is recommended that only blocks 2-5 are used, as they capture changes in choice performance when there is a risk of punishment (as opposed to block 1 in which such risk is absent; Orsini et al., 2020). Although slope is a useful index of choice elasticity, it is important to note that rats with flat curves either at the ceiling (risk-seeking rats) or at the floor (risk-averse rats) will have equivalent slopes despite pronounced differences in their choice preference. This underscores the importance of visually examining choice performance across blocks if slope will be used as an outcome measure.
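As a sketch of how these single-value indices might be computed, the code below uses the trapezoidal rule for AUC and an ordinary least-squares fit for slope over blocks 2-5. Both methods are assumptions, since the text does not specify how AUC or slope should be calculated, and the function names are hypothetical.

```python
RISK_BLOCKS_2_TO_5 = [25.0, 50.0, 75.0, 100.0]  # risk of punishment (%)


def auc_trapezoid(x, y):
    """Area under the curve by the trapezoidal rule (assumed method)."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2.0
               for i in range(len(x) - 1))


def ols_slope(x, y):
    """Least-squares slope of y on x (assumed method)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical rats: one risk-seeking (ceiling), one risk-averse (floor).
ceiling = [100.0, 100.0, 100.0, 100.0]
floor = [0.0, 0.0, 0.0, 0.0]
for label, curve in (("ceiling", ceiling), ("floor", floor)):
    print(label,
          "slope:", ols_slope(RISK_BLOCKS_2_TO_5, curve),
          "AUC:", auc_trapezoid(RISK_BLOCKS_2_TO_5, curve))
```

The two hypothetical rats have identical slopes of zero but very different AUC values, which illustrates why slope alone cannot distinguish risk-seeking from risk-averse animals with flat curves.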
Additional variables of interest include locomotor activity during the ITIs (e.g., allows one to determine whether a drug is on board if there is no effect on choice performance), locomotor activity during footshock delivery (e.g., allows one to determine whether a certain manipulation affects sensitivity to footshock) and latencies to press levers (e.g., serves as an indirect measure of incentive salience). These variables, in addition to percentage of large, risky choices, are typically analyzed when assessing group differences (e.g., males vs. females; aged vs. young) or the effects of a specific manipulation (e.g., drug administration, lesions or optogenetic stimulation).
Basic Protocol 3
Delayed Punishment Decision-making Task (DPDT):
Prior to training in the DPDT, rats must undergo the same steps, beginning with acclimation and food restriction and proceeding through all behavioral shaping procedures as described in Basic Protocol 1. Rather than progressing to the RDT, rats will commence training in the DPDT after meeting criterion in reward discrimination training (Alternate Protocol 1). As noted previously with the RDT, it is critical to test the equipment daily prior to each training session (see Support Protocol 1). The DPDT is 30 minutes in duration and consists of six blocks of 12 trials (Figure 5). As in the RDT and reward discrimination training, each trial begins with illumination of the trough and house light. Entry into the recessed trough extinguishes both lights, and causes extension of either one (forced choice) or both (free choice) levers. Failure to complete this nosepoke within 10 seconds results in the trial being scored as an omission, and causes termination of all lights and initiation of the ITI (10-12 seconds). A press on the “safe” lever results in the immediate delivery of one food pellet into the trough, while a press on the “punished” lever results in the immediate delivery of three food pellets. Failure to press either lever within 10 seconds causes the levers to retract and the trial to proceed to the ITI. The identity of each lever should be identical to that used in reward discrimination training. Unlike the probabilistic punishment in the RDT, in the DPDT the large, punished reward is always accompanied by a one-second footshock (except in block six). However, the latency between punished lever choice and shock increases across the first five blocks (0, 4, 8, 12, 16 seconds), such that the shock becomes more temporally distant from the lever press/food delivery with each subsequent block. After each shock, the trial progresses to the ITI.
To ensure that post-decision trial length is consistent regardless of lever choice, a delay matched to punishment latency in that block (0, 4, 8, 12, 16 seconds) follows food pellet delivery after safe lever choice before proceeding to the ITI. Reward delivery is always accompanied by illumination of the trough light, which is terminated after three seconds or upon food collection. After completion of the first five blocks, the DPDT culminates with a sixth block of trials in which the shock is absent, and no delays occur with either lever choice (i.e., rats can choose between a large and small reward with no other associated consequences).
Figure 5. Representation of a free choice trial in the Delayed Punishment Decision-making Task.
A. At the start of each trial, the food trough light is illuminated. B. A poke into the lit trough causes both levers to extend. Rats choose between the two levers, causing both to retract. C. Choice of the left lever results in immediate delivery of a single pellet, followed by a delay period that varies based on trial block. D. Choice of the right lever results in immediate delivery of three pellets, followed by a delay period, then a one-second footshock. Delays are comparable for each lever, and increase throughout the session. After either lever press sequence, the trial progresses to the intertrial interval, and then repeats.
Each 12-trial block begins with two forced choice trials, in which each lever is individually extended once in a pseudorandom order. These trials are necessary to establish the pre-shock delay length for that block; because shock probabilities do not change across blocks, fewer forced choice trials are necessary in comparison to the RDT [using two forced choice trials is also effective in reward delay discounting tasks (Evenden & Ryan, 1996; Simon, Mendez, & Setlow, 2007)]. Forced choice trials will be followed by 10 free choice trials, in which both levers are extended, allowing rats to choose a preferred option between the safe and punished levers.
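The block structure described above can be sketched as a simple trial schedule. This Python sketch is only an illustration of the session layout (six blocks, two forced choice trials followed by 10 free choice trials, delays of 0, 4, 8, 12 and 16 seconds, and no shock or delay in block six). The function name, the dictionary keys and the use of a shuffle for the pseudorandom forced choice order are hypothetical and do not represent the code of any behavioral software package.

```python
import random

SHOCK_DELAYS_S = [0, 4, 8, 12, 16]  # blocks 1-5; block 6 has no shock
FREE_TRIALS_PER_BLOCK = 10


def build_dpdt_session(seed=None):
    """Return a list of trial dicts describing one DPDT session (hypothetical layout)."""
    rng = random.Random(seed)
    trials = []
    for block in range(1, 7):
        shock = block <= 5
        delay = SHOCK_DELAYS_S[block - 1] if shock else 0
        # Two forced choice trials: each lever extended once, in shuffled order.
        forced = ["safe", "punished"]
        rng.shuffle(forced)
        for lever in forced:
            trials.append({"block": block, "type": "forced", "levers": [lever],
                           "delay_s": delay, "shock": shock})
        # Ten free choice trials: both levers extended.
        for _ in range(FREE_TRIALS_PER_BLOCK):
            trials.append({"block": block, "type": "free",
                           "levers": ["safe", "punished"],
                           "delay_s": delay, "shock": shock})
    return trials


session = build_dpdt_session(seed=1)
print(len(session))  # 6 blocks x 12 trials
```

Here `delay_s` stands for both the pre-shock latency after a punished choice and the matched delay after a safe choice, since the protocol equates the two within each block.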
Due to the immediate and consistent occurrence of punishment following punished lever choice in the first block of the task, some punishment-averse rats will begin omitting trials early in training, preventing them from successfully learning the task contingencies. This concern is most prominent when using female subjects. To circumvent this, training begins using a comparatively mild footshock (0.10 mA). After rats complete a session with fewer than 10 omissions, shock intensity is increased in 0.05 mA increments, until reaching the intensity of 0.35 mA. From that point, training requires approximately 15-20 sessions to reach stable performance.
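One way to keep track of this titration is sketched below in Python. The sketch assumes that each 0.05 mA step requires its own session with fewer than 10 omissions; the text does not state this explicitly, so treat it as one possible reading. The function name and the omission counts are hypothetical.

```python
START_MA = 0.10
STEP_MA = 0.05
TARGET_MA = 0.35
MAX_OMISSIONS = 10  # a session must have fewer omissions than this to step up


def next_shock_intensity(current_ma, omissions):
    """Return the shock intensity (mA) for the next training session."""
    if omissions < MAX_OMISSIONS and current_ma < TARGET_MA:
        # round() guards against floating-point drift (e.g., 0.15000000000000002).
        return min(round(current_ma + STEP_MA, 2), TARGET_MA)
    return current_ma


# Hypothetical omission counts from consecutive sessions of one rat.
intensity = START_MA
history = [intensity]
for omissions in [4, 15, 8, 2, 0, 12, 3, 1]:
    intensity = next_shock_intensity(intensity, omissions)
    history.append(intensity)
print(history)
```

In this example, the intensity is held after the sessions with 15 and 12 omissions and is capped once it reaches 0.35 mA.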
Load the protocol for the DPDT in the software program associated with the behavioral system and enter the subject names/numbers of the rats being trained in the session.
Change the shock intensities for each corresponding operant chamber.
Place each rat in their associated operant chamber/sound-attenuating cubicle and close doors.
Once all rats are in the operant chambers, start the test session.
As with the RDT above, monitor rats’ performance via the computer running the software and the live video feed (if available) to check for experimenter error and/or equipment malfunction.
Upon conclusion of the session, remove the rat from the operant chamber.
Record the number of lever presses for each lever.
Clean each chamber with disinfectant (e.g., Nolvasan), including the bottom of the grid floor.
Data analysis:
The primary dependent variable used in the analysis of behavior in the DPDT is the percentage of completed free choice trials in each block on which the rat chose the large, punished reward. Below is a step-by-step process of how to analyze and plot this output, followed by a discussion of other data analysis methods and strategies.
- Compile the number of lever presses for the large, punished reward and for the small, safe reward in the free choice trials in each of the six blocks. Forced choice trials should not be included, as these do not measure preference between options. Data analysis templates and/or code for the DPDT are available upon request.
To calculate the percentage of punished choices for each block of trials, divide the number of lever presses for the punished reward by the total number of lever presses that were made in the free choice trials of that block and multiply that number by 100 [punished lever presses / (punished lever presses + safe lever presses) X 100].
- Once these values have been calculated for each block for each subject, calculate the mean and standard error of the percentage of large, punished choices for each block across subjects. This will result in six data points (and their corresponding standard error), each representing the mean percentage of punished choices for each of the six blocks for a group of subjects.
- The data can be plotted as a six-point curve, with “Punishment delay” on the x-axis and “Percent choice of the large, punished reward” on the y-axis (Figure 6). Increased percent choice of the punished reward with increasing delays (a curve with a positive slope) is indicative of underestimation of the negative value of delayed punishment. Thus, when comparing two conditions, the curve with a higher slope (i.e., a larger difference between the immediate and delayed punishment blocks) represents a greater degree of delayed punishment discounting. Note that although both males and females exhibit this typical discounting behavior, sex differences do exist such that males show greater delayed punishment discounting than females (Liley et al., 2019).
Figure 6. Representation of stable behavior in the Delayed Punishment Decision-making Task.
On average, rats shift choice toward the punished reward as the delay increases, indicative of underestimation of delayed punishment. Data are represented as mean choice of the large, punished reward ± standard error of the mean. NS = No Shock.
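The calculation steps above (percent punished choice per block, then mean and standard error across subjects) can be sketched in Python as follows. The subject labels and lever press counts are invented for illustration, and the handling of a block in which every free choice trial was omitted (returning no value and excluding it from the group mean) is an assumption rather than a rule stated in the protocol.

```python
from math import sqrt
from statistics import mean, stdev


def percent_punished(punished, safe):
    """Percent choice of the punished reward among completed free choice trials."""
    total = punished + safe
    if total == 0:
        return None  # all free choice trials omitted; no percentage is defined
    return 100 * punished / total


def mean_and_sem(values):
    """Mean and standard error of the mean, skipping undefined (None) entries."""
    valid = [v for v in values if v is not None]
    return mean(valid), stdev(valid) / sqrt(len(valid))


# Hypothetical free choice counts: subject -> six (punished, safe) pairs, blocks 1-6.
counts = {
    "rat1": [(1, 9), (2, 8), (4, 6), (6, 4), (8, 2), (10, 0)],
    "rat2": [(0, 10), (1, 9), (3, 7), (5, 5), (6, 3), (9, 1)],
    "rat3": [(2, 8), (3, 7), (5, 5), (7, 3), (9, 1), (10, 0)],
}

per_subject = {rat: [percent_punished(p, s) for p, s in blocks]
               for rat, blocks in counts.items()}
for block in range(6):
    m, sem = mean_and_sem([per_subject[rat][block] for rat in counts])
    print(f"block {block + 1}: {m:.1f} ± {sem:.1f}")
```

Because the denominator is the number of completed free choice trials, a block with an omission (such as block 5 for the hypothetical "rat2") is scored out of nine presses rather than ten.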
Additional considerations for data analyses:
In some cases, correlation or regression analysis may be required to compare DPDT performance with a second variable of interest (such as a biological measure or performance in a second behavioral assay). To obtain a cumulative measure of each subject’s performance, the AUC or slope across all blocks that include punishment (blocks 1-5) can be used as a dependent variable (Liley et al., 2019). Importantly, the same caveat for using slope as an outcome measure in the RDT also applies to the DPDT: rats with a flat curve at floor or ceiling will have comparable slopes despite profound differences in choice behavior. Therefore, it is critical to visually inspect data from the DPDT before using slope as the primary dependent variable.
Locomotor activity should be assessed during multiple points in the task. Summated activity across all ITIs serves as a general measure of spontaneous movement, which is useful for testing the efficacy of psychostimulant drug manipulations (as in the RDT, see above). Locomotion should also be tracked during the pre-shock delay period after choice of the punished reward and the post safe-reward delay period after choice of a safe reward (Figure 5). Locomotion during the pre-shock period provides an indirect measure of shock expectation after choice of the punished reward, as anticipation of impending shock reduces locomotion (Fanselow, 1980). This should be calculated as locomotor units/second to control for differences in total delay length based on choice preference. This measure can then be compared with locomotor units/second during the post safe-reward delay period, with reduced locomotion prior to shock vs after safe choice suggesting punishment anticipation (Liley et al., 2019). This important measure provides evidence that choice of the punished reward was not elicited by lack of awareness of impending shock, but was instead caused by diminished influence of delayed punishment over decision making.
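The rate normalization described above can be sketched in Python as follows. The locomotor totals, the delay times and the function name are hypothetical, and the sketch assumes that locomotor units and delay durations have already been summed for each period type within a session.

```python
def locomotor_rate(units, seconds):
    """Locomotor units per second; None when no delay time was accumulated."""
    return units / seconds if seconds > 0 else None


# Hypothetical totals for one rat in one session: summed locomotor units and
# summed delay time (s) for each period type.
pre_shock = {"units": 120.0, "seconds": 96.0}    # delays after punished choices
post_safe = {"units": 300.0, "seconds": 120.0}   # delays after safe choices

pre_rate = locomotor_rate(pre_shock["units"], pre_shock["seconds"])
post_rate = locomotor_rate(post_safe["units"], post_safe["seconds"])
print(pre_rate, post_rate)

if pre_rate is not None and post_rate is not None and pre_rate < post_rate:
    print("Lower pre-shock locomotion: consistent with punishment anticipation")
```

Dividing by the accumulated delay time keeps the comparison fair when a rat chooses one lever far more often than the other and therefore spends unequal time in the two delay periods.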
Commentary
Background information
Rodent models allow precise regulation of punishment occurrence during economic decision making. The tasks described herein manipulate either risk of punishment (RDT) or delay preceding punishment (DPDT), both of which must be evaluated to determine an optimal course of action. Importantly, each task manipulates these variables repeatedly within a single session, yielding a discounting curve that reflects how reinforcer value is affected by changes in punishment parameters for each subject. Tightly-controlled behavioral protocols such as these are necessary to determine how the brain utilizes information about impending consequences to guide reward-seeking, and how this processing goes awry in psychopathological disorders.
Using the RDT, we have begun constructing a neurobiological model of the mechanisms by which the brain processes and computes decision making involving risk of explicit punishment. Among the first of these discoveries was the finding that the basolateral amygdala (BLA) has a critical role in this form of risk-based decision making. Initial studies showed that excitotoxic lesions of the BLA resulted in a pronounced increase in risky choice (Orsini, Trotta, Bizon, & Setlow, 2015), leading to the conclusion that BLA integrity is necessary for the integration of reward- and punishment-related information to guide optimal risk-based choice behavior. Additional studies using an optogenetic approach further dissected the functional contribution of the BLA, revealing that this brain region is differentially recruited during different phases of the decision process (Orsini et al., 2017). Whereas BLA activity appears to promote the choice of larger, riskier rewards during the period in which a choice is being made (deliberation), potentially by attributing greater salience to the more rewarding option, BLA activity appears to divert future choice away from the large, risky option after receiving such an option accompanied by punishment.
The dissociable role of the BLA during risky choice is likely mediated through its numerous connections with other brain regions in the mesocorticolimbic circuit known to be involved in risk-based decision making (Orsini, Hernandez, Bizon, & Setlow, 2019; Orsini et al., 2017). Two such brain regions are the orbitofrontal cortex (OFC) and the medial prefrontal cortex (mPFC). Indeed, we have shown that both of these brain areas are necessary for risky choice (Deng, Orsini, Shimp, & Setlow, 2018; Orsini et al., 2018; Orsini et al., 2015), albeit for different aspects of this cognitive process. Based on findings from lesion experiments in the RDT, the OFC seems to be required to match the expected outcomes of a choice with the actual outcomes of said choice and update these expectations for future decisions (Orsini et al., 2015). Absent these “model-based” cognitive expectations, choice behavior becomes reliant only on the immediate reinforcement history of decisions without consideration of broader probability contingencies that may exist. With OFC lesions, this manifests as the persistent choice of the small, safe reward even at low probabilities of punishment due to the inability to accurately calculate the probabilities of punishment associated with each block of trials in the RDT. In contrast, the mPFC contributes to the behavioral flexibility needed to respond to changing risk contingencies (Orsini et al., 2018). This is evident based on pharmacological inactivation experiments whereby mPFC inactivation resulted in an increase in risky choice when the probabilities of punishment ascended, but resulted in a decrease in risky choice when probabilities of punishment descended. Considered together, both of these subregions are clearly critical to the ability to adjust choice behavior in the face of changing risk contingencies, whether it is via updating outcome expectations or disengaging from ongoing behavior to adapt to new rules and outcome probabilities.
The nucleus accumbens (NAc) is another integral member of the circuit mediating risk-based decision making (Orsini, Hernandez, et al., 2019; Winstanley & Floresco, 2016). This brain region receives input from both the BLA and PFC (Groenewegen, Wright, Beijer, & Voorn, 1999), among many others, and has long been considered the limbic-motor interface (Mogenson, Jones, & Yim, 1980), integrating information from various limbic areas of the brain and translating them into motor output via its downstream connections (Floresco, 2014). The NAc also receives dense dopaminergic projections from the ventral tegmental area, which serve to modulate the activity of medium spiny neurons via interactions with G-protein coupled dopamine receptors (Gerfen & Surmeier, 2011; Surmeier, Ding, Day, Wang, & Shen, 2007; Tritsch & Sabatini, 2012). Using the RDT, we have shown that risky decision making is associated with elevated phasic dopamine release in the shell subregion of the NAc (Freels, Gabriel, Lester, & Simon, 2020). The relationship between risk taking and dopamine is regulated specifically by D2 dopamine receptors (D2DR) in the NAc shell (Mitchell et al., 2014). Similar to systemic administration of D2DR agonists (Blaes et al., 2018; Simon et al., 2011), microinjections of the D2DR agonist quinpirole directly into the NAc decrease risky choice. Further, levels of D2DR mRNA expression in the NAc shell are negatively correlated with risk-taking preference such that low D2DR mRNA expression in the NAc shell is associated with greater risky choice (Mitchell et al., 2014). Notably, although levels of D1 dopamine receptor (D1DR) mRNA in the NAc shell are positively correlated with risk taking (Simon et al., 2011), administration of D1DR agonists or antagonists fails to affect risky choice, supporting the notion that D2DRs have a distinctive role in dopaminergic modulation of decision making involving risk of punishment.
The neuronal circuitry underlying the discounting of delayed punishment during decision making (quantified in the DPDT) is less clear. Both the RDT and DPDT involve integration of rewards with punishment as well as reward magnitude discrimination, but it is likely that probabilistic and delayed punishment are encoded in a distinctive fashion. Understanding the divergences and commonalities in neural substrates between these tasks will help shape understanding of the complex role of punishment in guiding decision making. Methodological consistency of these tasks across experiments and labs is paramount to addressing this critical research question.
While little is known about how the brain engenders the underestimation (or “discounting”) of delayed punishment during DPDT, there is substantial background literature on delayed rewards (Cardinal, Winstanley, Robbins, & Everitt, 2004; Frost & McNaughton, 2017; McClure & Bickel, 2014). It is unknown whether reward and punishment delay discounting are mediated by separate neural systems, or whether a unitary circuit regulates the discounting of outcomes independent of valence (i.e., delayed rewarding and aversive stimuli are underestimated in similar fashion). Interestingly, reward delay discounting and DPDT are not correlated in rats (Liley et al., 2019), which suggests that these processes may employ unique neural systems and/or different patterns of functional activity.
Critical Parameters
Behavioral Stability:
For both the RDT and DPDT, the number of sessions required to complete training may vary between subjects/cohorts. Therefore, subjects are required to achieve a stability criterion rather than a fixed number of training sessions. To determine stable performance, the percent choice of the large, punished reward from the last 3-5 sessions is subjected to a two-way day X trial block repeated-measures ANOVA. Performance is considered to be stable if there is 1.) no main effect of day, 2.) no day X trial block interaction and 3.) a main effect of trial block [indicative of a discounting curve (Freels et al., 2020; Liley et al., 2019; Orsini, Blaes, et al., 2019)]. With small sample sizes, these statistics may be insufficient to ascertain stability due to a lack of power. In these instances, visual inspection of daily discounting curves should also be performed. Critically, performance must stabilize prior to any experimental manipulation to avoid conflation between experimental effects and ongoing task acquisition.
Sex Differences Considerations:
If the primary research question for an experiment involves sex differences, consistent shock intensity should be maintained across both male and female subjects. If shock levels must be adjusted due to an overabundance of subjects displaying ceiling/floor effects, then these changes should be equivalent across both sexes. If an experiment includes both sexes but the primary comparison is repeated measures (such as comparing an acute drug treatment with saline vehicle), then shock levels can be titrated for each subject regardless of sex.
Due to satiation and elevated sensitivity to shock, female subjects often require a longer training regimen than males to achieve behavioral stability, and are more prone to trial omission. Accordingly, experimental parameters may require adjustment to ensure that females complete enough trials for statistical analyses. In these instances, large reward size can be reduced from three to two pellets, or lower caloric pellets can be utilized. Critically, any task parameters that differ between females and males must be reported and taken into consideration when interpreting sex differences in behavior.
In both the RDT and DPDT, baseline female choice behavior is independent of estrous cycle (Liley et al., 2019; Orsini et al., 2016). However, as dopamine dynamics and neuronal activity in some brain regions differ as a function of estrous cycle (Blume et al., 2017; Calipari et al., 2017), it is feasible that the effects of pharmacological or neurobiological manipulations on task performance or the functional activity underlying punishment-based decision making may be modulated by estrous phase. If estrous cycle is deemed a variable of interest and needs to be monitored, daily samples should be obtained after behavioral sessions to prevent the influence of vaginal lavage-induced stress on decision making (Liley et al., 2019; Orsini et al., 2016). To equate experiences between sexes, males should also be handled in a similar manner as females undergoing lavages. Additionally, using soy-free pellets and wood chip bedding, which are both sources of phytoestrogens, is recommended for any experiment that includes female subjects as it removes the possibility of modulation of behavior by exogenous estrogen.
Electrophysiology/Imaging Considerations:
Synchronization between task events and neuronal activity requires a substantial number of trials to compensate for trial-by-trial variability in firing rate. Thus, both the RDT and DPDT require alterations for measuring functional activity, with each block including 30-50 free choice trials. Multiple blocks with different punishment contingencies are necessary to facilitate repeated-measures comparison of neuronal activity throughout the session, but should be limited to mitigate satiation due to the inclusion of more trials. We recommend two blocks for the RDT (block 1: small, safe reward vs large, safe reward; block 2: small, safe reward vs large reward with risk of shock), and three for the DPDT (block 1: small, safe reward vs large reward with immediate shock; block 2: small, safe reward vs large reward with delayed shock; block 3: small, safe reward vs large, safe reward). Additionally, a gap (0.5-1 seconds) should be inserted between actions (lever press) and outcome delivery (reward/shock) to allow measurement of outcome anticipation-evoked activity that is distinguishable from action-evoked activity. Punishment-evoked activity is difficult to quantify due to electrical noise caused by shock; as an alternative, analyze activity during the pre-action period, action, and outcome anticipation. Finally, an optional modification is to train subjects to sustain a nosepoke into the trough for 0.5-1 seconds prior to each event of interest. This behavioral “hold” enables assessment of task-evoked activity that is not obfuscated by differences in movement between trials.
Troubleshooting
The best way to avoid mishaps and errors is ultimately to analyze the data on a daily basis, checking to see if performance in subjects has changed noticeably within a short period of time or to determine whether shock intensities need to be adjusted to avoid floor or ceiling effects (this depends on the objective of the experiment; see Strategic Planning for a discussion as well as below for ways in which to deal with these unwanted effects).
Behavioral equipment:
It is not uncommon for components of the operant chamber to fail over the course of an experiment. This could include lever malfunction, jammed food pellets in the feeder, misaligned beams in the food trough (resulting in the failure to detect a nosepoke) and/or decreases in shock conductance through the grid floors. Any of these issues can result in changes in performance, the most common being an increase in omissions or a complete reversal in choice preference. The best way to avoid encountering these problems is to run a daily test protocol before behavioral sessions begin (see Basic Protocol 1: Behavioral Training). If there are any malfunctioning components, it will be easier to replace them or determine the source of the problem before subjects are tested rather than having to determine post hoc the reason why the subject’s behavior is so variable. Problems with shock conductance (typically a reduction in the shock intensity that reaches the floor) in the grid floors may be less immediately apparent and, consequently, it is recommended that the experimenter test the actual shock intensity output with a voltmeter or oscilloscope on a weekly basis (see Support Protocol 1). If the shock output deviates from the intended output, the experimenter can try thoroughly cleaning the grid floor and re-testing the shock output and/or installing a new (and tested) grid floor.
Floor and ceiling performance:
Even early in training, it is critical to be vigilant about daily data analysis to avoid floor or ceiling performance effects. In the case of the former, subjects will predominantly choose the small, safe reward and, in extreme instances, will choose this option even in the block of trials when there is no risk of punishment (first block in RDT and sixth block in DPDT). Conversely, subjects may predominantly choose the large, risky reward in all trial blocks (ceiling effect), despite the fact that, in the last block of trials in the RDT or in all of the first five blocks of the DPDT, the probability of punishment is absolute. In such circumstances, the experimenter should adjust the shock intensity: with ceiling effects, the shock intensity should be increased while the shock intensity should be decreased in the case of floor effects. Note that even in experiments in which individual differences in choice performance are critical to the objective of the study, it may still be necessary to adjust the shock intensities until there is a wider distribution of choice performance within the parametric space, although these adjustments should be consistent across all subjects.
Remedial training:
In cases in which subjects continue to choose the small reward in the block with no punishment (block 1 in the RDT, block 6 in the DPDT) even after decreasing the shock intensity, remedial training may be required. This may also be necessary if subjects continue to predominantly choose the punished reward at extremely high shock intensities (> 0.8 mA), although it is recommended to first verify the shock output of the grid floor before considering remedial training. Such corrective measures include returning rats to shaping protocols, such as nosepoke shaping, to encourage engagement in both levers again, or to reward discrimination training to remind subjects of the reward magnitude differences. Alternatively, rats can be trained in a protocol built to extinguish lever pressing on the biased lever. Such remedial training typically requires 4-5 sessions before rats can resume training in the decision-making tasks. It is always better to err on the side of caution and conduct extra training sessions than too few as it reduces the chance that rats will easily revert back to problematic choice performance.
Body weights:
Because subjects are food restricted, body weights should be monitored throughout the duration of the experiment. In addition to serving as a proxy for health, body weights may also provide information about aberrant or unwanted behavior. For example, if rats’ body weights are consistently far above their target weight and they predominantly prefer the small reward (or heavily omit free choice trials), it is conceivable that they are not sufficiently motivated to withstand the risk of punishment to obtain the larger reward. Decreasing daily portions of food may not only reduce body weight to target levels, but also promote choice of the large reward. Importantly, however, if a subject is underweight (>20 g below target weight), this may indicate that the rat is in distress and may need supplemental nutrition, removal from the study, and/or veterinary care.
Anticipated Results
Experimenters should anticipate that it may take approximately one month for behavior to stabilize in the RDT (Basic Protocol 2). Upon reaching behavioral stability, rats will display a pattern of behavior whereby, on average, their choice of the large, risky reward decreases as the risk of punishment increases (Figure 4). Across all subjects, there will be a degree of between-subjects variability, but the magnitude of this variability depends on the objective of the study (i.e., assessing individual differences or baseline group differences vs. assessing how average choice performance changes as a result of a certain manipulation). There are also well-established sex differences in risky choice in the RDT, with males choosing the large, risky reward significantly more than females (Orsini et al., 2016). Despite this difference, however, both males and females will exhibit risk discounting (i.e., both will decrease their choice of the large, risky reward as the risk of punishment increases even if the rate at which this occurs differs between sexes).
Oftentimes, it may be necessary to reverse the order of punishment probabilities to determine, for example, whether a specific manipulation that affected choice in the ascending probability version (0, 25, 50, 75, 100%) similarly alters choice performance in the descending probability version (100, 75, 50, 25, 0%). This is a particularly helpful control experiment to determine whether changes or differences in risky choice are actually secondary to changes or differences in behavioral flexibility. Under baseline conditions in the descending version of the RDT, however, the experimenter should expect to observe an increase in the choice of the large, risky reward as the risk of punishment decreases.
The DPDT (Basic Protocol 3) also requires approximately one month of training for a cohort of subjects to achieve stability. Average performance across all subjects typically manifests as a positive curve, with minimal choice of the immediately punished large reward transitioning to increased choice of the large reward when punishment is delayed (Figure 6). The final block, in which no punishment is presented, should yield the greatest preference for the large reward, although it should be noted that subjects may continue to avoid this option due to carryover effects of punishment from previous blocks. If a manipulation increases the positive slope/preference for the punished reward, this is indicative of increased delayed punishment discounting (i.e., subjects avoid rewards associated with immediate punishment, but are less likely to avoid rewards when punishment is delayed). Notably, the data should differ between males and females. Typically, both sexes demonstrate comparable avoidance of the immediately punished reward, but males show greater preference for the punished reward when punishment is delayed. Thus, when data are visualized as a curve, the male curve is steeper than in females (Liley et al., 2019), which is indicative of increased discounting of delayed punishment. Despite this sex difference, both males and females are expected to demonstrate delayed punishment discounting (both statistically increase choice of the punished reward when punishment is delayed).
Several of the fundamental principles of analyzing data with the RDT also hold true for the DPDT. As with the RDT, substantial individual differences will be observed if shock intensity is held consistent across all subjects (Liley et al., 2019). In addition, reversal of delays from ascending to descending (no punishment, 16, 12, 8, 4, 0 seconds) may be necessary to elucidate if treatment effects are a result of altered discounting or behavioral flexibility.
ACKNOWLEDGEMENTS
We would like to acknowledge our funding sources that supported the preparation of this article (NIH/NIDA R00DA041493 to CAO and NIH/NIDA R15DA046797 to NWS).
References
- Banks RK, & Vogel-Sprott M (1965). Effect of delayed punishment on an immediately rewarded response in humans. J Exp Psychol, 70(4), 357–359. doi: 10.1037/h0022233
- Baron A (1965). Delayed Punishment of a Runway Response. J Comp Physiol Psychol, 60, 131–134. doi: 10.1037/h0022326
- Bechara A (2005). Decision making, impulse control and loss of willpower to resist drugs: a neurocognitive perspective. Nat Neurosci, 8(11), 1458–1463. doi: 10.1038/nn1584
- Bechara A, Dolan S, & Hindes A (2002). Decision-making and addiction (part II): myopia for the future or hypersensitivity to reward? Neuropsychologia, 40(10), 1690–1705. doi: 10.1016/s0028-3932(02)00016-7
- Blaes SL, Orsini CA, Mitchell MR, Spurrell MS, Betzhold SM, Vera K, … Setlow B. (2018). Monoaminergic modulation of decision-making under risk of punishment in a rat model. Behav Pharmacol, 29(8), 745–761. doi: 10.1097/FBP.0000000000000448
- Blume SR, Freedberg M, Vantrease JE, Chan R, Padival M, Record MJ, … Rosenkranz JA. (2017). Sex- and Estrus-Dependent Differences in Rat Basolateral Amygdala. J Neurosci, 37(44), 10567–10586. doi: 10.1523/JNEUROSCI.0758-17.2017
- Byrnes JP, Miller DC, & Schafer WD (1999). Gender Differences in Risk Taking: A Meta-Analysis. Psychol Bull, 125(3), 367–383.
- Calipari ES, Juarez B, Morel C, Walker DM, Cahill ME, Ribeiro E, … Nestler EJ. (2017). Dopaminergic dynamics underlying sex-specific cocaine reward. Nat Commun, 8, 13877. doi: 10.1038/ncomms13877
- Cardinal RN, Winstanley CA, Robbins TW, & Everitt BJ (2004). Limbic corticostriatal systems and delayed reinforcement. Ann N Y Acad Sci, 1021, 33–50. doi: 10.1196/annals.1308.004
- Cross CP, Copping LT, & Campbell A (2011). Sex differences in impulsivity: a meta-analysis. Psychol Bull, 137(1), 97–130. doi: 10.1037/a0021591
- Deng JV, Orsini CA, Shimp KG, & Setlow B (2018). MeCP2 Expression in a Rat Model of Risky Decision Making. Neuroscience, 369, 212–221. doi: 10.1016/j.neuroscience.2017.11.016
- Evenden JL, & Ryan CN (1996). The pharmacology of impulsive behaviour in rats: the effects of drugs on response choice with varying delays of reinforcement. Psychopharmacology (Berl), 128(2), 161–170.
- Fanselow MS (1980). Conditioned and unconditional components of post-shock freezing. Pavlov J Biol Sci, 15(4), 177–182. doi: 10.1007/bf03001163
- Floresco SB (2014). The Nucleus Accumbens: An Interface Between Cognition, Emotion, and Action. Annu Rev Psychol. doi: 10.1146/annurev-psych-010213-115159 [DOI] [PubMed] [Google Scholar]
- Freels TG, Gabriel DBK, Lester DB, & Simon NW (2020). Risky decision-making predicts dopamine release dynamics in nucleus accumbens shell. Neuropsychopharmacology, 45(2), 266–275. doi: 10.1038/s41386-019-0527-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Frost R, & McNaughton N (2017). The neural basis of delay discounting: A review and preliminary model. Neurosci Biobehav Rev, 79, 48–65. doi: 10.1016/j.neubiorev.2017.04.022 [DOI] [PubMed] [Google Scholar]
- Gerfen CR, & Surmeier DJ (2011). Modulation of striatal projection systems by dopamine. Annu Rev Neurosci, 34, 441–466. doi: 10.1146/annurev-neuro-061010-113641 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Glover LR, Postle AF, Holmes A (in press). Touchscreen-based assessment of risky-choice in mice. Behav Brain Res. doi: 10.1016/j.bbr.2020.112748 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gowin JL, Mackey S, & Paulus MP (2013). Altered risk-related processing in substance users: Imbalance of pain and gain. Drug Alcohol Depend, 132(1-2), 13–21. doi: 10.1016/j.drugalcdep.2013.03.019 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gowin JL, Sloan ME, Ramchandani VA, Paulus MP, & Lane SD (2018). Differences in decision-making as a function of drug of choice. Pharmacol Biochem Behav, 164, 118–124. doi: 10.1016/j.pbb.2017.09.007 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Grissom NM, & Reyes TM (2019). Let’s call the whole thing off: evaluating gender and sex differences in executive function. Neuropsychopharmacology, 44(1), 86–96. doi: 10.1038/s41386-018-0179-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Groenewegen HJ, Wright CI, Beijer AV, & Voorn P (1999). Convergence and segregation of ventral striatal inputs and outputs. Ann N Y Acad Sci, 877, 49–63. [DOI] [PubMed] [Google Scholar]
- Killgore WD, Grugle NL, Killgore DB, & Balkin TJ (2010). Sex differences in self-reported risk-taking propensity on the Evaluation of Risks scale. Psychol Rep, 106(3), 693–700. doi: 10.2466/PR0.106.3.693-700 [DOI] [PubMed] [Google Scholar]
- Liley AE, Gabriel DBK, Sable HJ, & Simon NW (2019). Sex Differences and Effects of Predictive Cues on Delayed Punishment Discounting. eNeuro, 6(4). doi: 10.1523/ENEURO.0225-19.2019 [DOI] [PMC free article] [PubMed] [Google Scholar]
- McClure SM, & Bickel WK (2014). A dual-systems perspective on addiction: contributions from neuroimaging and cognitive training. Ann N Y Acad Sci, 1327, 62–78. doi: 10.1111/nyas.12561 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mitchell MR, Weiss VG, Beas BS, Morgan D, Bizon JL, & Setlow B (2014). Adolescent risk taking, cocaine self-administration, and striatal dopamine signaling. Neuropsychopharmacology, 39(4), 955–962. doi: 10.1038/npp.2013.295 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mogenson GJ, Jones DL, & Yim CY (1980). From motivation to action: functional interface between the limbic system and the motor system. Prog Neurobiol, 14(2-3), 69–97. doi: 10.1016/0301-0082(80)90018-0 [DOI] [PubMed] [Google Scholar]
- Orsini CA, Blaes SL, Dragone RJ, Betzhold SM, Finner AM, Bizon JL, & Setlow B (2020). Distinct relationships between risky decision making and cocaine self-administration under short- and long-access conditions. Prog Neuropsychopharmacol Biol Psychiatry, 98, 109791. doi: 10.1016/j.pnpbp.2019.109791 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Orsini CA, Blaes SL, Setlow B, & Simon NW (2019). Recent Updates in Modeling Risky Decision Making in Rodents. Methods Mol Biol, 2011, 79–92. doi: 10.1007/978-1-4939-9554-7_5 [DOI] [PubMed] [Google Scholar]
- Orsini CA, Hernandez CM, Bizon JL, & Setlow B (2019). Deconstructing value-based decision making via temporally selective manipulation of neural activity: Insights from rodent models. Cogn Affect Behav Neurosci, 19(3), 459–476. doi: 10.3758/s13415-018-00649-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Orsini CA, Hernandez CM, Singhal S, Kelly KB, Frazier CJ, Bizon JL, & Setlow B (2017). Optogenetic Inhibition Reveals Distinct Roles for Basolateral Amygdala Activity at Discrete Time Points during Risky Decision Making. J Neurosci, 37(48), 11537–11548. doi: 10.1523/JNEUROSCI.2344-17.2017 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Orsini CA, Heshmati SC, Garman TS, Wall SC, Bizon JL, & Setlow B (2018). Contributions of medial prefrontal cortex to decision making involving risk of punishment. Neuropharmacology, 139, 205–216. doi: 10.1016/j.neuropharm.2018.07.018 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Orsini CA, Trotta RT, Bizon JL, & Setlow B (2015). Dissociable Roles for the Basolateral Amygdala and Orbitofrontal Cortex in Decision-Making under Risk of Punishment. J Neurosci, 35(4), 1368–1379. doi: 10.1523/JNEUROSCI.3586-14.2015 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Orsini CA, Willis ML, Gilbert RJ, Bizon JL, & Setlow B (2016). Sex differences in a rat model of risky decision making. Behav Neurosci, 130(1), 50–61. doi: 10.1037/bne0000111 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Rodriguez W, Bouzas A, & Orduna V (2018). Temporal discounting of aversive consequences in rats. Learn Behav, 46(1), 38–48. doi: 10.3758/s13420-017-0279-9 [DOI] [PubMed] [Google Scholar]
- Shimp KG, Mitchell MR, Beas BS, Bizon JL, & Setlow B (2015). Affective and cognitive mechanisms of risky decision making. Neurobiol Learn Mem, 117, 60–70. doi: 10.1016/j.nlm.2014.03.002 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Simon NW, Gilbert RJ, Mayse JD, Bizon JL, & Setlow B (2009). Balancing risk and reward: a rat model of risky decision making. Neuropsychopharmacology, 34(10), 2208–2217. doi: 10.1038/npp.2009.48 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Simon NW, Mendez IA, & Setlow B (2007). Cocaine exposure causes long-term increases in impulsive choice. Behav Neurosci, 121(3), 543–549. doi: 10.1037/0735-7044.121.3.543 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Simon NW, Montgomery KS, Beas BS, Mitchell MR, LaSarge CL, Mendez IA, … Setlow B. (2011). Dopaminergic modulation of risky decision-making. J Neurosci, 31(48), 17460–17470. doi: 10.1523/JNEUROSCI.3772-11.2011 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Surmeier DJ, Ding J, Day M, Wang Z, & Shen W (2007). D1 and D2 dopamine-receptor modulation of striatal glutamatergic signaling in striatal medium spiny neurons. Trends Neurosci, 30(5), 228–235. doi: 10.1016/j.tins.2007.03.008 [DOI] [PubMed] [Google Scholar]
- Tritsch NX, & Sabatini BL (2012). Dopaminergic modulation of synaptic transmission in cortex and striatum. Neuron, 76(1), 33–50. doi: 10.1016/j.neuron.2012.09.023 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Winstanley CA, & Floresco SB (2016). Deciphering Decision Making: Variation in Animal Models of Effort- and Uncertainty-Based Choice Reveals Distinct Neural Circuitries Underlying Core Cognitive Processes. J Neurosci, 36(48), 12069–12079. doi: 10.1523/JNEUROSCI.1713-16.2016 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Woolverton WL, Freeman KB, Myerson J, & Green L (2012). Suppression of cocaine self-administration in monkeys: effects of delayed punishment. Psychopharmacology (Berl), 220(3), 509–517. doi: 10.1007/s00213-011-2501-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yates JF, Watts RA (1975). Preferences for deferred losses. Organ Behav Human Perform(13), 294–306. [Google Scholar]