Biological Psychiatry Global Open Science. 2024 Jul 20;4(6):100362. doi: 10.1016/j.bpsgos.2024.100362

Computational Modeling Differentiates Learning Rate From Reward Sensitivity Deficits Produced by Early-Life Adversity in a Rodent Touchscreen Probabilistic Reward Task

Brian D Kangas, Yuen-Siang Ang, Annabel K Short, Tallie Z Baram, Diego A Pizzagalli
PMCID: PMC11387686  PMID: 39262818

Abstract

Background

Exposure to adversity, including unpredictable environments, during early life is associated with neuropsychiatric illness in adulthood. One common factor in this sequela is anhedonia, the loss of responsivity to previously reinforcing stimuli. To accelerate the development of new treatment strategies for anhedonic disorders induced by early-life adversity, animal models have been developed to capture critical features of early-life stress and the behavioral deficits that such stressors induce. We have previously shown that rats exposed to the limited bedding and nesting protocol exhibited blunted reward responsivity in the probabilistic reward task, a touchscreen-based task reverse translated from human studies.

Methods

To test the quantitative limits of this translational platform, we examined the ability of Bayesian computational modeling and probability analyses identical to those optimized in previous human studies to quantify the putative mechanisms that underlie these deficits with precision. Specifically, 2 parameters that have been shown to independently contribute to probabilistic reward task outcomes in patient populations, reward sensitivity and learning rate, were extracted, as were trial-by-trial probability analyses of choices as a function of the preceding trial.

Results

Significant deficits in reward sensitivity, but not learning rate, contributed to the anhedonic phenotypes in rats exposed to early-life adversity.

Conclusions

The current findings confirm and extend the translational value of these rodent models by verifying the effectiveness of computational modeling in distinguishing independent features of reward sensitivity and learning rate that complement the probabilistic reward task’s signal detection end points. Together, these metrics serve to objectively quantify reinforcement learning deficits associated with anhedonic phenotypes.

Keywords: Anhedonia, Bayesian computational modeling, Computational psychiatry, Early-life adversity, Probabilistic reward task, Research domain criteria

Plain Language Summary

Exposure to early-life adversity can lead to psychiatric illness, including anhedonia, the loss of pleasure from previously rewarding activities. This article describes findings from rats exposed to a model of simulated poverty on a touchscreen-based assay reverse translated from a task used to characterize anhedonia in humans. We documented the ability of Bayesian computational modeling and probability analyses, identical to those used with humans, to objectively quantify reinforcement learning deficits associated with anhedonia in rats.


Childhood and adolescent exposure to poverty, trauma, and chaotic environments have been associated with the development of neuropsychiatric illness that can persist into adulthood (1, 2, 3). Early-life adversity affects more than 30% of children in the United States (4) and has been implicated in the emergence and maintenance of mental illnesses including major depression, bipolar disorder, posttraumatic stress disorder, and substance use disorders. Although several psychological domains have been hypothesized to mediate such risk—including cognitive (e.g., executive functioning, memory) and affective (e.g., reward processing, reactivity to social and affective stimuli, emotion regulation) domains—the emergence of anhedonia, a blunted responsivity to previously rewarding activities, has attracted substantial interest (5, 6, 7). Coordinated bidirectional cross-species research efforts, including clinical and preclinical studies, have aimed to characterize key features of early-life adversity and consequent anhedonic phenotypes in humans and model them in laboratory animals to accelerate the development of novel therapeutic strategies (8, 9, 10, 11).

Regarding early-life adversity, one rodent paradigm that was specifically designed to simulate early-life poverty and unpredictable maternal care is the limited bedding and nesting (LBN) protocol. Numerous studies have documented that implementing resource scarcity to dams and their pups elicits a chaotic and unpredictable/fragmented environment, which in turn has been shown to produce depressive- and anhedonic-like behavioral phenotypes in males and associated abnormalities in the reward circuit (12,13). Importantly, corresponding research endeavors of functionally similar early-life adversity phenotypes in the human laboratory have demonstrated that unpredictable maternal behaviors lead to neurodevelopmental deficits in children (14). Accordingly, self-report instruments have been developed to quantify exposure to fragmented and unpredictable environments during childhood, thereby allowing for risk assessments of subsequent psychiatric disorders (15,16).

Regarding anhedonia, one laboratory-based technique that has proven useful in objectively quantifying deficits in reward responsivity is the probabilistic reward task (PRT). This paradigm uses visual discrimination methodology and asymmetric probabilistic reinforcement contingencies such that correct responses to one alternative (rich) are rewarded more often than correct responses to the other (lean). As predicted by signal detection theory (17, 18, 19), healthy control participants reliably demonstrate an adaptive response bias toward the rich stimulus; however, as was originally observed when probing anhedonic phenotypes in patients with major depressive disorder (MDD) (20,21) and subsequently other disorders including bipolar disorder (22) and substance use disorder (23,24), a blunted response bias is commonly observed that is correlated with self-reported anhedonia. Highlighting the richness of the behavioral repertoire that can be derived from this task, PRT studies of patient samples have also probed for possible abnormalities in the probability of specific responses as a function of the immediately preceding trial. For example, compared with healthy control participants, unmedicated individuals with MDD were characterized by a lower probability of selecting the more frequently rewarded (rich) stimulus after correct identification of the rich stimulus in the immediately preceding trial that had not been rewarded (because a probabilistic reward had not been scheduled) (21). Thus, a blunted response bias in MDD was mainly driven by a reduced ability to sustain a preference for the more advantageous stimulus in the absence of immediate rewards.

Due to the ability of the PRT to provide objective quantifications of anhedonic phenotypes, it has been selected as a recommended task to probe positive valence systems in the latest revision (25) of the Research Domain Criteria’s (26) initiative toward advancing medication development for psychiatric conditions that include anhedonia. Given this collective value, and in the spirit of coordinated research efforts in preclinical therapeutics development across species, the PRT has been reverse translated using touchscreen technology for rats (27), mice (28), and nonhuman primates (29), all of which produce outcomes similar to those observed in human participants (18).

Given the parallel development of the LBN paradigm and anhedonia task described above, we evaluated in a recent study whether the LBN protocol in male rats would produce expected deficits in the PRT’s signal detection metrics of reward responsivity. As described in Kangas et al. (30), response bias toward the more richly rewarded stimulus was blunted in 2 independent cohorts of adult rats previously exposed to the LBN procedure during early life compared with control rats reared under standard housing conditions. Moreover, this anhedonic phenotype was shown to be significantly associated with unpredictable and chaotic maternal interaction with the affected pups, as quantified by entropy metrics of dam/pup behavior, thus confirming construct validity.

The purpose of the current work was to test the quantitative limits of cross-species continuity in this LBN/anhedonia platform. Specifically, several recent human PRT studies have highlighted the value of Bayesian computational modeling as an important analytic technique to complement the signal detection end points and elucidate the behavioral mechanisms of blunted reward responsivity in anhedonic patient populations with greater precision (31). Two key parameters that have been shown to independently contribute to PRT outcomes are reward sensitivity (which captures the immediate hedonic impact of rewards, i.e., consummatory pleasure) and learning rate (which captures the ability to learn from reinforcing consequences, specifically from reward prediction errors, i.e., the difference between expected and obtained rewards). For example, in a recent study of adolescents with either a low or high risk of depression based on maternal history (32), expected blunting of reward responsivity in the PRT and neuroimaging correlates were driven by deficits in reward sensitivity but not learning rate. In another study (33), compared with placebo, an 8-week treatment with a kappa opioid receptor antagonist, which has been hypothesized to have anti-anhedonic effects, was associated with higher response biases as well as higher learning rates. Of note, an effect on learning rate, but not reward sensitivity, after kappa opioid receptor antagonist treatment was hypothesized a priori in light of 1) preclinical evidence that kappa opioid receptor antagonism restores dopaminergic signaling within brain reward pathways [e.g., nucleus accumbens (34)] and 2) computational modeling indicating that dopaminergic manipulations affect learning rate but not reward sensitivity (31,35,36).

Despite their effectiveness in characterizing deficits in human PRT performance, it is currently unknown whether these computational modeling approaches can effectively distinguish reward deficit determinants in laboratory animals. Therefore, secondary analyses were conducted of anhedonic phenotypes in male rats exposed to the LBN paradigm versus outcomes in control rats (30). Specifically, trial-by-trial reinforcement learning was examined using Bayesian computational modeling and probability analyses optimized in previous human studies to determine 1) whether these quantitative approaches are similarly effective in rats at distinguishing reward sensitivity from learning rate and, if so, 2) how these profiles contribute to the LBN-induced anhedonic phenotypes observed.

Methods and Materials

Subjects

Thirty-two male Sprague Dawley rats, offspring of 6 timed-pregnant dams (delivered embryonic day 15; Envigo), were used in the current studies, the primary outcomes of which were presented in Kangas et al. (30). These rats were subjected to either standard or LBN rearing protocols at the University of California, Irvine. Around postnatal day (P) 100, these rats were transported to McLean Hospital via overnight shipping. Upon their arrival, subjects were quarantined in an isolated, climate-controlled vivarium bay with unrestricted access to rodent chow and water. Following clearance from quarantine, they were transferred to a larger vivarium with otherwise identical housing conditions. Although subjects continued to have unrestricted access to water in their home cage, to establish sweetened condensed milk as a reinforcer, they were food restricted via daily postsession portions of 10 to 15 g of rodent chow. All research assistants and vivarium technicians responsible for conducting the behavioral studies and animal husbandry duties were blinded to the subjects’ group assignment (i.e., control or LBN). The study’s procedures were approved by the Institutional Animal Care and Use Committee at McLean Hospital and were in accordance with guidelines from the Committee on Care and Use of Laboratory Animals of the Institute of Laboratory Animal Resources, Commission on Life Sciences (37).

Early-Life Adversity Paradigm

On P2, 6 dams were randomly assigned as either control or LBN. To create litters of 12 with equal numbers of sexes where possible, pups were cross-fostered between litters born within 12 hours of each other. Early-life adversity was imposed using the LBN paradigm, which consists of limiting the nesting and bedding materials in home cages between P2 and P9 as described previously (12,13). For the LBN group, a plastic-coated mesh platform was placed 2.5 cm above the floor of a standard cage. Cob bedding was reduced to cover the cage floor sparsely, and only one half of a single paper towel was provided for nesting material on the platform. Conversely, control dams and litters resided in standard home cages that contained ample cob bedding and 1 whole paper towel that dams shredded for nesting material. Figure 1 shows a representative photograph of control and LBN housing conditions. Both control and LBN cages were undisturbed from P2 to P9 and housed in temperature- and humidity-controlled rooms. On P10, LBN groups were transferred to home cages identical to control conditions. Rats were weaned on P21 and then group housed.

Figure 1.

Representative photographs of the LBN (upper left) and control (lower left) housing conditions, PRT task schematic (upper right), and PRT (lower right). LBN, limited bedding and nesting; P, postnatal day; PRT, probabilistic reward task.

PRT Training and Testing

Upon their arrival and subsequent release from quarantine at McLean Hospital and following the establishment of food restriction conditions, subjects began PRT training and testing protocols. Empirical validation and task optimization of the touchscreen-based rat PRT can be found in Kangas et al. (27). Details of the rat touch-sensitive experimental chamber can be found in Kangas and Bergman (38), and a task schematic and photograph are presented in Figure 1.

Line-Length Discrimination Training

Trials began with presentation of a white line on a black background positioned 3 cm above 2 blue response boxes (5 × 5 cm) that were left and right of center. The line was either long (600 × 123 pixels; 31.5 × 6.5 cm) or short (200 × 60 pixels; 10.5 × 3.25 cm). Long and short line-length trial types varied in a quasi-random order across 100-trial sessions such that there were exactly 50 trials of each type, but a given trial type was not presented more than 5 times in a row. Subjects were trained to respond to the left or right response box depending on the length of the white line (long line: respond left, short line: respond right, or vice versa). Response box designation was counterbalanced across subjects. During the line-length discrimination training phase, each correct response was reinforced with 0.1 mL of 30% sweetened condensed milk that was paired with an 880-ms yellow screen flash and a 440-Hz tone and followed by a 5-second blackout period. Each incorrect response immediately resulted in a 10-second blackout period without reinforcement. A correction procedure (39) was programmed during initial discrimination training in which each incorrect trial was repeated until a correct response was made and was discontinued after <10 repeats of each trial type occurred in 2 consecutive sessions. Concordant with the performance criteria used in previous human PRT studies (20, 21, 22), discrimination training sessions continued without correction until accuracies for both line-length trial types were ≥80% correct for 2 consecutive sessions. After this training criterion had been met, PRT testing commenced.
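The quasi-random ordering constraint described above (exactly 50 trials of each type per 100-trial session, with no trial type presented more than 5 times in a row) can be illustrated with a short sketch. This is not the authors' task software; the function names and the use of rejection sampling are assumptions made purely for illustration.

```python
import random


def longest_run(seq):
    """Return the length of the longest stretch of identical consecutive items."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def make_trial_sequence(n_per_type=50, max_run=5, seed=None):
    """Shuffle long/short trials until no trial type repeats more than max_run times.

    Rejection sampling is an illustrative choice (hypothetical); the source does
    not say how the session order was actually generated.
    """
    rng = random.Random(seed)
    trials = ["long"] * n_per_type + ["short"] * n_per_type
    while True:
        rng.shuffle(trials)
        if longest_run(trials) <= max_run:
            return list(trials)


session = make_trial_sequence(seed=0)
```

Shuffling a fixed pool guarantees the 50/50 split, and the run-length check enforces the limit of 5 consecutive presentations.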

PRT Testing

On approximately P200, subjects underwent a 5-session testing protocol using 3:1 probabilistic reinforcement contingencies such that a correct response to one of the line lengths (long or short) was reinforced 60% of the time (rich stimulus), whereas a correct response to the other line length was reinforced 20% of the time (lean stimulus). Incorrect responses were never reinforced. The line length associated with the rich and lean contingency was determined for each subject during their final 2 line-length discrimination training sessions by examining their accuracies and designating the line length with a higher mean accuracy as the stimulus to be rewarded on the lean schedule. This method was specifically designed to examine the effects of early-life adversity on response bias generated by responsivity to asymmetrical probabilistic contingencies rather than the amplification of a preexisting inherent bias that is a function of uncontrolled variables.
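A minimal sketch of the testing contingencies described above follows. It assumes that each correct response is reinforced by an independent random draw at the stated probability; the source does not specify whether rewards were drawn per trial or prescheduled, and all names here (including the tie-breaking rule) are hypothetical.

```python
import random

# Reinforcement probabilities for correct responses, as stated in the text.
REWARD_PROB = {"rich": 0.60, "lean": 0.20}


def assign_contingencies(acc_long, acc_short):
    """Give the lean schedule to the line length with the higher training accuracy.

    Ties default to long = rich; this tie rule is an assumption, not from the source.
    """
    if acc_long > acc_short:
        return {"long": "lean", "short": "rich"}
    return {"long": "rich", "short": "lean"}


def is_reinforced(stimulus_role, correct, rng):
    """Incorrect responses are never reinforced; correct ones pay off probabilistically."""
    if not correct:
        return False
    return rng.random() < REWARD_PROB[stimulus_role]


roles = assign_contingencies(acc_long=0.91, acc_short=0.86)
rewarded = is_reinforced(roles["short"], correct=True, rng=random.Random(0))
```

Assigning the lean schedule to the stimulus that was discriminated more accurately means that any bias toward the rich stimulus has to emerge against the subject's preexisting tendency, which is the rationale given in the text.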

Data Analysis

To interrogate the effects of LBN on PRT performance beyond the blunted log b response bias metrics reported previously [cf., (30)], we fit a computational model of trial-level performance, which allows the parsing of 2 constructs critically implicated in reward learning tasks: reward sensitivity, which captures consummatory pleasure, and learning rate, which captures the subject’s ability to learn from reward feedback (31,40). Within this Bayesian framework, 4 reinforcement learning models were fit to the group-level PRT choice data [for details about the mathematical implementation, see (31)]. Model 1 was the stimulus-action Rescorla-Wagner model with separate sensitivities for reward and nonreward events, which postulates that subjects treat nonrewards as actual punishments and assumes that subjects correctly assign the rewards to particular stimulus-action combinations. Model 2 was the basic stimulus-action Rescorla-Wagner model, which assumes that subjects correctly assign the rewards to particular stimulus-action combinations. Model 3 was the belief model, which assumes that because subjects are unsure about the presented stimulus, they assign rewards to both stimuli, with only a certain preference for the actually presented stimulus. Model 4 was the action-only model, which assumes that subjects only learn about the value of each action, independent of the stimuli.

Following prior recommendations (31), models were fitted using an empirical Bayesian random-effects approach and compared through integrated group-level Bayesian information criterion factors. To constrain individual subjects’ parameter inference by an empirical prior distribution, data were fitted concurrently. Finally, to evaluate the robustness of the findings, the Bayesian modeling was run 3 times, and in each run, the model with the lowest integrated Bayesian information criterion value was considered the winning model. All 4 models yielded the 2 parameters of interest, reward sensitivity and learning rate. Unpaired t tests and Cohen’s d were used to evaluate group differences and effect sizes (control vs. LBN) in reward sensitivity and learning rate, respectively, using the winning model. These computational parameters can be disentangled using a mathematical formulation of reward learning, which relies on prediction errors and has been associated with dopaminergic activity (36,41,42). Thus, consider an experiment in which rewards are administered stochastically on a select number of trials (in the case of the PRT, 30 rich rewards and 10 lean rewards within a block of 100 trials). We denote $r_t = 1$ when a reward is received in trial $t$ and $r_t = 0$ when no reward is dispensed. The variable $\rho$ represents the subjective value that a subject assigns to the reward. Within this framework, a subject holds an expectation ($Q_t$) of the average reward that it might gain on a given trial, and this expectation is updated through a prediction error, $\delta_t = \rho r_t - Q_t$ (i.e., the discrepancy between the obtained reward $\rho r_t$ and the expected reward $Q_t$). This prediction error is then used to adjust future expectations (43) according to the formula $Q_{t+1} = Q_t + \varepsilon \delta_t$, where $0 \le \varepsilon \le 1$ is a learning rate. Therefore, the 2 parameters, $\rho$ and $\varepsilon$, could each contribute to anhedonic behavior. The larger the $\rho$, the more sensitive a subject is to the reward. Conversely, $\varepsilon$ captures the extent to which reward prediction errors modulate learning, specifically the speed at which reward affects behavior (44). Thus, a high $\varepsilon$ points to a large impact of prior reward feedback on the current decision, whereas a low $\varepsilon$ reflects a relatively small impact.
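The prediction-error update described above can be written out as a brief sketch. It implements only the generic update rule stated in the text (the prediction error is the subjectively valued reward minus the current expectation, and the expectation moves toward it by a fraction given by the learning rate); it is not the belief model or the authors' fitting code, and the function names are hypothetical.

```python
def update_expectation(q, reward, rho, epsilon):
    """Apply one update: delta = rho * r - Q, then Q <- Q + epsilon * delta."""
    delta = rho * reward - q
    return q + epsilon * delta, delta


def simulate_expectations(rewards, rho, epsilon, q0=0.0):
    """Track the expectation Q across a sequence of 0/1 reward outcomes."""
    q = q0
    history = []
    for r in rewards:
        q, _ = update_expectation(q, r, rho, epsilon)
        history.append(q)
    return history


# Same reward sensitivity, different learning rates (illustrative values only).
slow = simulate_expectations([1] * 10, rho=2.0, epsilon=0.1)
fast = simulate_expectations([1] * 10, rho=2.0, epsilon=0.5)
```

With a constant reward, the expectation approaches the reward sensitivity value at a speed set by the learning rate. A lower reward sensitivity therefore lowers the level that expectations can reach, whereas a lower learning rate only slows the approach, which is why the 2 parameters can be separated in principle.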

As an additional check of model fit, we also generated surrogate datasets for each subject using the winning model’s parameters. The subject-specific model parameters were fed back into the underlying equations to simulate responses that were theoretically plausible under the same task conditions. This simulation process was repeated 500 times for each subject to ensure a robust sample of predicted responses. Then, we examined whether the surrogate response biases reasonably captured the general pattern observed in the empirical data.

In subsequent analyses, to investigate the effects of LBN on PRT performance with more granularity, as in prior human PRT studies (21,22), we computed the probability that rats chose rich- or lean-associated response as a function of the prior stimulus type (rich vs. lean) and whether their correct response was rewarded or not. For example, what is the probability that rats chose the rich-associated response when the preceding rich or lean stimulus had been rewarded or not (because the probabilistic reward was not scheduled)? As in prior human PRT studies that used these metrics, an arcsine transformation was applied to these percentages. Next, 2 mixed analyses of variance (ANOVAs) were performed. In the first analysis, we evaluated the effects of prior rewards by running a mixed ANOVA with preceding stimulus (rich, lean), current stimulus (rich, lean), and response (rich-associated, lean-associated) as repeated measures and group (LBN, control) as the between-subject factor. In the second analysis, an identical ANOVA was performed, but prior nonrewarded trials were considered. For brevity, only effects involving group were followed up and reported.
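The conditional choice probabilities described above can be sketched as follows. The trial record layout (dictionaries with `stimulus`, `response`, `correct`, and `rewarded` keys) is hypothetical, and the arcsine-square-root form of the transformation is an assumption, because the text does not state which variant of the arcsine transformation was applied.

```python
import math


def p_rich_after(trials, prev_stimulus, prev_rewarded):
    """Proportion of rich-associated responses on trials that follow a correct
    trial of the given stimulus type and reward outcome."""
    hits = total = 0
    for prev, cur in zip(trials, trials[1:]):
        if (prev["correct"]
                and prev["stimulus"] == prev_stimulus
                and prev["rewarded"] == prev_rewarded):
            total += 1
            hits += cur["response"] == "rich"
    return hits / total if total else float("nan")


def arcsine_transform(p):
    """Arcsine-square-root transform of a proportion (assumed variant)."""
    return math.asin(math.sqrt(p))


# Tiny made-up session, for illustration only (not real data).
demo = [
    {"stimulus": "rich", "response": "rich", "correct": True, "rewarded": True},
    {"stimulus": "lean", "response": "rich", "correct": False, "rewarded": False},
    {"stimulus": "rich", "response": "rich", "correct": True, "rewarded": True},
    {"stimulus": "lean", "response": "lean", "correct": True, "rewarded": False},
]
p = p_rich_after(demo, prev_stimulus="rich", prev_rewarded=True)
```

In this toy session, 2 trials follow a rewarded rich trial and the rich-associated response is chosen on 1 of them, so `p` is 0.5. The transformed per-subject values would then serve as the dependent variable in the mixed ANOVAs.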

Results

Model Selection

Of the 4 reinforcement learning models examined (stimulus-action Rescorla-Wagner model with separate sensitivities for reward and nonreward events, basic stimulus-action Rescorla-Wagner model, belief model, action-only model) across the 3 runs, the belief model was associated with the lowest integrated Bayesian information criterion value at the group level and thus yielded the most parsimonious account of the data (see Figure S1 for model outcome comparisons). As detailed above, the belief model assumes that because subjects are unsure about the presented stimulus, they assign rewards to both stimuli, with only a certain preference for the stimulus that was actually presented. As an additional verification of model fit, surrogate datasets were generated for each subject using belief model parameters and fed back into the underlying equations to simulate across 500 iterations the responses that were theoretically plausible under the same task conditions. Figure 2 presents the surrogate response bias as predicted by the belief model, which also captured the general pattern of response bias reasonably well. This suggests that these computational parameters serve as a reliable proxy for understanding and predicting performance on the PRT.

Figure 2.

Real (white bars) and surrogate (gray bars) mean (±SEM) response bias data using the belief model.

Computational Outcomes

Reward sensitivity and learning rate were extracted from run 1, which fit the data best. Figure 3 presents outcomes from Bayesian computational modeling and probability analyses using trial-by-trial reinforcement learning metrics identical to those previously optimized in human PRT studies (31). As the left panel shows, significant group differences were observed in reward sensitivity (t30 = 4.10, p = .0003, d = 1.46). Conversely, however, as shown in the right panel, learning rate did not differ statistically between groups (t30 = 1.67, p = .11, d = 0.59).
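As a rough consistency check on the reported statistics, an unpaired, equal-variance t statistic can be converted to a pooled-SD Cohen's d using the group sizes given in the Figure 3 caption (n = 17 control, n = 15 LBN, hence 30 degrees of freedom). The helper below is a hypothetical illustration; the source does not state which effect-size formula was used.

```python
import math


def cohens_d_from_t(t, n1, n2):
    """Cohen's d (pooled SD) implied by an unpaired, equal-variance t statistic."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


d_reward_sensitivity = cohens_d_from_t(4.10, 17, 15)
d_learning_rate = cohens_d_from_t(1.67, 17, 15)
```

This gives values of about 1.45 and 0.59, close to the reported d = 1.46 and d = 0.59. The small difference for reward sensitivity may reflect rounding of the t statistic or a slightly different effect-size formula.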

Figure 3.

Effects of control (n = 17) and LBN (n = 15) rearing conditions [see (30)] on reward sensitivity (left panel) and learning rate (right panel). Horizontal lines represent group mean (±SEM), and data points represent values for individual subjects. ∗∗∗p < .001. LBN, limited bedding and nesting.

Probability Analyses: Effects of Prior Rewards

The preceding stimulus (rich, lean) × current stimulus (rich, lean) × response (rich-associated, lean-associated) × group (LBN, control) model was run on trials following rewarded responses and revealed various main effects and 2-way interactions not involving group. Critically, several interactions involving group were significant, including the preceding stimulus × group (F1,30 = 5.56, p < .025) and response × group (F1,30 = 10.06, p < .003) interactions, which were qualified by a 4-way interaction effect (F1,30 = 4.74, p < .037). To disentangle the 4-way interaction, we performed 2 follow-up ANOVAs and entered current stimulus, response, and group as factors for trials that were preceded by a rewarded rich versus lean stimulus. As shown in Figure 4A, when considering trials preceded by a rewarded rich trial, the probability of choosing the rich-associated alternative was significantly higher than the probability of choosing the lean-associated alternative for both groups (both ps < .005). However, the only effect that involved group was a significant response × group interaction (F1,30 = 4.84, p < .036). Bonferroni-corrected simple tests revealed that, compared with the control group, the LBN group displayed a significantly lower probability of choosing the rich-associated alternative after the preceding rich trial had been rewarded (irrespective of the current stimulus type; p < .034) and instead had a significantly higher probability of choosing the lean-associated alternative (p < .038). As shown in Figure 4B, when considering trials preceded by a rewarded lean trial, the only significant effect that involved group was the main effect of group (F1,30 = 4.97, p < .035), which was qualified by a significant response × group interaction (F1,30 = 5.99, p < .020). Bonferroni-corrected simple tests revealed that, compared with the control group, the LBN group had a significantly lower probability of choosing the rich-associated alternative after the preceding lean trial had been rewarded (irrespective of the current stimulus type; p < .019) and instead showed a significantly higher probability of choosing the lean-associated alternative (p < .022). In the control group (p < .001), but not the LBN group (p > .39), the probability of choosing the rich-associated alternative was significantly higher than the probability of choosing the lean-associated alternative.

Figure 4.

Effects of control (gray bars) and LBN (black bars) rearing conditions [see (30)] on the likelihood of choosing the rich- vs. lean-associated response when the prior trial was a rich and rewarded trial type (A), a lean and rewarded trial type (B), or the prior trial was not rewarded (C). ∗p < .05, ∗∗p < .005. LBN, limited bedding and nesting.

Probability Analyses: Effects of Prior No Rewards

As shown in Figure 4C, the preceding stimulus (rich, lean) × current stimulus (rich, lean) × response (rich-associated, lean-associated) × group (LBN, control) run on trials following correct but not rewarded responses revealed only a significant response × group interaction (F1,30 = 17.86, p < .001). For both groups, the probability of choosing the rich-associated alternative was significantly higher than the probability of choosing the lean-associated alternative (irrespective of the prior and current stimulus type; both ps < .001). However, Bonferroni-corrected simple tests revealed that, compared with the control group, the LBN group showed a significantly lower probability of choosing the rich-associated alternative after the preceding correct trial had not been rewarded (irrespective of the prior and current stimulus type; p < .001) and instead had a significantly higher probability of choosing the lean-associated alternative (p < .001).

Discussion

Exposure to early-life adversity using a rodent model of simulated poverty induced blunted reward responsivity as quantified by PRT signal detection metrics [see (30) for psychophysical details]. The current work extends the translational value of this research platform via secondary analyses of Kangas et al. (30), which confirmed that Bayesian computational models, identical to those optimized for use with human PRT performance (31), can distinguish independent features of reward sensitivity and learning rate that contribute to deficits associated with anhedonic phenotypes. Specifically, reward sensitivity was significantly impaired in LBN-exposed rats compared with control subjects. Conversely, learning rate did not differ significantly between the groups. These computational findings are significant considering that youths at increased risk for MDD (32) as well as adults who eventually show poor response to an 8-week antidepressant treatment (45) are characterized by reduced reward sensitivity using the same Bayesian modeling. Thus, the current findings suggest a platform for evaluating novel pharmacological treatments to alleviate anhedonic phenotypes, which are poorly addressed by currently available interventions (46). More fundamentally, we believe that demonstration in laboratory animals and humans of similar effects of given manipulations on precise computational parameters, which dissect behavior in more granularity, offers some of the most compelling cross-species confluence (47). Moreover, confirming correspondence in behavioral mechanisms across quantitative approaches [i.e., signal detection theory (30) and Bayesian analyses (current findings)] should bolster the predictive validity of preclinical models and in turn enhance translational relevance.

Likewise, sequential analyses of choice and consequence during rich versus lean stimulus trial types also served to provide detailed trial-by-trial characterizations of response allocation that, when aggregated, produced the signal detection–derived anhedonic phenotypes. Specifically, they revealed that, although both groups were more likely to choose the response associated with the rich stimulus, compared with control subjects, the LBN group was less likely to select the rich-associated response following a trial in which they were rewarded in the presence of the rich stimulus, rewarded in the presence of the lean stimulus, or not rewarded at all on the preceding trial. This suggests a lack of trial type specificity in their contributions to the session-wide blunting of adaptive response biases in subjects exposed to early-life adversity compared with healthy control subjects.

Conclusions

Taken together, isolating distinct critical mechanistic features in task outcomes across species can identify behavioral mechanisms in reward processing. Ultimately, it is hoped that coordinated bidirectional studies that use these computerized tasks and computational metrics across humans and experimental animals will accelerate the development of therapeutic strategies for neuropsychiatric conditions prominently characterized by anhedonia.

Acknowledgments and Disclosures

This work was supported in part by the National Institute of Mental Health (Grant Nos. P50 MH119467 and R37 MH068376 [to DAP] and P50 MH096889 [to TZB]) and the National Institute on Drug Abuse (Grant No. R01 DA047575 [to BDK]). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

During the past 3 years, BDK has had sponsored research agreements with BlackThorn Therapeutics, Compass Pathways, Delix Therapeutics, Engrail Therapeutics, Neurocrine Biosciences, and Takeda Pharmaceuticals. During the past 3 years, DAP has received consulting fees from Boehringer Ingelheim, Compass Pathways, Engrail Therapeutics, Karla Therapeutics, Neumora Therapeutics, Neurocrine Biosciences, Neuroscience Software, Otsuka, Sage Therapeutics, Sama Therapeutics, Sunovion Therapeutics, and Takeda; he has received honoraria from the American Psychological Association, Psychonomic Society, and Springer (for editorial work), as well as Alkermes; he has received research funding from the Brain & Behavior Research Foundation, Dana Foundation, Wellcome Leap, Millennium Pharmaceuticals, and the National Institute of Mental Health; he has received stock options from Compass Pathways, Engrail Therapeutics, Neumora Therapeutics, and Neuroscience Software. No funding from these entities was used to support the current work, and all views expressed are solely those of the authors. All other authors report no biomedical financial interests or potential conflicts of interest.

Footnotes

Supplementary material cited in this article is available online at https://doi.org/10.1016/j.bpsgos.2024.100362.

Supplementary Material

Key Resources Table
mmc1.xlsx (20.6KB, xlsx)
Supplementary Material
mmc2.pdf (315.7KB, pdf)

References

  • 1. Danese A., Lewis S.J. Psychoneuroimmunology of early-life stress: The hidden wounds of childhood trauma? Neuropsychopharmacology. 2017;42:99–114. doi: 10.1038/npp.2016.198.
  • 2. Green J.G., McLaughlin K.A., Berglund P.A., Gruber M.J., Sampson N.A., Zaslavsky A.M., Kessler R.C. Childhood adversities and adult psychiatric disorders in the national comorbidity survey replication I: Associations with first onset of DSM-IV disorders. Arch Gen Psychiatry. 2010;67:113–123. doi: 10.1001/archgenpsychiatry.2009.186.
  • 3. Short A.K., Baram T.Z. Early-life adversity and neurological disease: Age-old questions and novel answers. Nat Rev Neurol. 2019;15:657–669. doi: 10.1038/s41582-019-0246-5.
  • 4. Bonuck K., McGrath K., Gao Q. National Parent Survey 2017: Worries, hopes, and child well-being. J Community Psychol. 2020;48:2532–2551. doi: 10.1002/jcop.22434.
  • 5. Corral-Frías N.S., Nikolova Y.S., Michalski L.J., Baranger D.A., Hariri A.R., Bogdan R. Stress-related anhedonia is associated with ventral striatum reactivity to reward and transdiagnostic psychiatric symptomatology. Psychol Med. 2015;45:2605–2617. doi: 10.1017/S0033291715000525.
  • 6. Pechtel P., Pizzagalli D.A. Effects of early life stress on cognitive and affective function: An integrated review of human literature. Psychopharmacol (Berl). 2011;214:55–70. doi: 10.1007/s00213-010-2009-2.
  • 7. Pizzagalli D.A. Depression, stress, and anhedonia: Toward a synthesis and integrated model. Annu Rev Clin Psychol. 2014;10:393–423. doi: 10.1146/annurev-clinpsy-050212-185606.
  • 8. Bale T.L., Baram T.Z., Brown A.S., Goldstein J.M., Insel T.R., McCarthy M.M., et al. Early life programming and neurodevelopmental disorders. Biol Psychiatry. 2010;68:314–319. doi: 10.1016/j.biopsych.2010.05.028.
  • 9. Birnie M.T., Levis S.C., Mahler S.V., Baram T.Z. Developmental trajectories of anhedonia in preclinical models. Curr Top Behav Neurosci. 2022;58:23–41. doi: 10.1007/7854_2021_299.
  • 10. Birnie M.T., Short A.K., de Carvalho G.B., Taniguchi L., Gunn B.G., Pham A.L., et al. Stress-induced plasticity of a CRH/GABA projection disrupts reward behaviors in mice. Nat Commun. 2023;14:1088. doi: 10.1038/s41467-023-36780-x.
  • 11. Chen Y., Baram T.Z. Toward understanding how early-life stress reprograms cognitive and emotional brain networks. Neuropsychopharmacology. 2016;41:197–206. doi: 10.1038/npp.2015.181.
  • 12. Molet J., Maras P.M., Avishai-Eliner S., Baram T.Z. Naturalistic rodent models of chronic early-life stress. Dev Psychobiol. 2014;56:1675–1688. doi: 10.1002/dev.21230.
  • 13. Walker C.D., Bath K.G., Joels M., Korosi A., Larauche M., Lucassen P.J., et al. Chronic early life stress induced by limited bedding and nesting (LBN) material in rodents: Critical considerations of methodology, outcomes and translational potential. Stress. 2017;20:421–448. doi: 10.1080/10253890.2017.1343296.
  • 14. Davis E.P., Stout S.A., Molet J., Vegetabile B., Glynn L.M., Sandman C.A., et al. Exposure to unpredictable maternal sensory signals influences cognitive development across species. Proc Natl Acad Sci U S A. 2017;114:10390–10395. doi: 10.1073/pnas.1703444114.
  • 15. Glynn L.M., Stern H.S., Howland M.A., Risbrough V.B., Baker D.G., Nievergelt C.M., et al. Measuring novel antecedents of mental illness: The Questionnaire of Unpredictability in Childhood. Neuropsychopharmacology. 2019;44:876–882. doi: 10.1038/s41386-018-0280-9.
  • 16. Lindert N.G., Maxwell M.Y., Liu S.R., Stern H.S., Baram T.Z., Poggi Davis E., et al. Exposure to unpredictability and mental health: Validation of the brief version of the Questionnaire of Unpredictability in Childhood (QUIC-5) in English and Spanish. Front Psychol. 2022;13. doi: 10.3389/fpsyg.2022.971350.
  • 17. Davison M.C., Tustin R.D. The relation between the generalized matching law and signal-detection theory. J Exp Anal Behav. 1978;29:331–336. doi: 10.1901/jeab.1978.29-331.
  • 18. Luc O.T., Pizzagalli D.A., Kangas B.D. Toward a quantification of anhedonia: Unified matching law and signal detection for clinical assessment and drug development. Perspect Behav Sci. 2021;44:517–540. doi: 10.1007/s40614-021-00288-w.
  • 19. McCarthy D. Measures of response bias at minimum-detectable luminance levels in the pigeon. J Exp Anal Behav. 1983;39:87–106. doi: 10.1901/jeab.1983.39-87.
  • 20. Pizzagalli D.A., Jahn A.L., O’Shea J.P. Toward an objective characterization of an anhedonic phenotype: A signal-detection approach. Biol Psychiatry. 2005;57:319–327. doi: 10.1016/j.biopsych.2004.11.026.
  • 21. Pizzagalli D.A., Iosifescu D., Hallett L.A., Ratner K.G., Fava M. Reduced hedonic capacity in major depressive disorder: Evidence from a probabilistic reward task. J Psychiatr Res. 2008;43:76–87. doi: 10.1016/j.jpsychires.2008.03.001.
  • 22. Pizzagalli D.A., Goetz E., Ostacher M., Iosifescu D.V., Perlis R.H. Euthymic patients with bipolar disorder show decreased reward learning in a probabilistic reward task. Biol Psychiatry. 2008;64:162–168. doi: 10.1016/j.biopsych.2007.12.001.
  • 23. Boger K.D., Auerbach R.P., Pechtel P., Busch A.B., Greenfield S.F., Pizzagalli D.A. Co-occurring depressive and substance use disorders in adolescents: An examination of reward responsiveness during treatment. J Psychother Integr. 2014;24:109–121. doi: 10.1037/a0036975.
  • 24. Janes A.C., Pedrelli P., Whitton A.E., Pechtel P., Douglas S., Martinson M.A., et al. Reward responsiveness varies by smoking status in women with a history of major depressive disorder. Neuropsychopharmacology. 2015;40:1940–1946. doi: 10.1038/npp.2015.43.
  • 25. National Institute of Mental Health. Behavioral Assessment Methods for RDoC Constructs, August 2016. 2016. Available at: https://www.nimh.nih.gov/about/advisory-boards-and-groups/namhc/reports/behavioral-assessment-methods-for-rdoc-constructs
  • 26. Insel T., Cuthbert B., Garvey M., Heinssen R., Pine D.S., Quinn K., et al. Research Domain Criteria (RDoC): Toward a new classification framework for research on mental disorders. Am J Psychiatry. 2010;167:748–751. doi: 10.1176/appi.ajp.2010.09091379.
  • 27. Kangas B.D., Wooldridge L.M., Luc O.T., Bergman J., Pizzagalli D.A. Empirical validation of a touchscreen probabilistic reward task in rats. Transl Psychiatry. 2020;10:285. doi: 10.1038/s41398-020-00969-1.
  • 28. Luc O.T., Kangas B.D. Validation of a touchscreen probabilistic reward task for mice: A reverse-translated assay with cross-species continuity. Cogn Affect Behav Neurosci. 2024;24:281–288. doi: 10.3758/s13415-023-01128-x.
  • 29. Wooldridge L.M., Bergman J., Pizzagalli D.A., Kangas B.D. Translational assessments of reward responsiveness in the marmoset. Int J Neuropsychopharmacol. 2021;24:409–418. doi: 10.1093/ijnp/pyaa090.
  • 30. Kangas B.D., Short A.K., Luc O.T., Stern H.S., Baram T.Z., Pizzagalli D.A. A cross-species assay demonstrates that reward responsiveness is enduringly impacted by adverse, unpredictable early-life experiences. Neuropsychopharmacology. 2022;47:767–775. doi: 10.1038/s41386-021-01250-9.
  • 31. Huys Q.J., Pizzagalli D.A., Bogdan R., Dayan P. Mapping anhedonia onto reinforcement learning: A behavioural meta-analysis. Biol Mood Anxiety Disord. 2013;3:12. doi: 10.1186/2045-5380-3-12.
  • 32. Belleau E.L., Kremens R., Ang Y.S., Pisoni A., Bondy E., Durham K., et al. Reward functioning abnormalities in adolescents at high familial risk for depressive disorders. Biol Psychiatry Cogn Neurosci Neuroimaging. 2021;6:270–279. doi: 10.1016/j.bpsc.2020.08.016.
  • 33. Pizzagalli D.A., Smoski M., Ang Y.S., Whitton A.E., Sanacora G., Mathew S.J., et al. Selective kappa-opioid antagonism ameliorates anhedonic behavior: Evidence from the Fast-fail Trial in Mood and Anxiety Spectrum Disorders (FAST-MAS). Neuropsychopharmacology. 2020;45:1656–1663. doi: 10.1038/s41386-020-0738-4.
  • 34. Carlezon W.A. Jr., Krystal A.D. Kappa-opioid antagonists for psychiatric disorders: From bench to clinical trials. Depress Anxiety. 2016;33:895–906. doi: 10.1002/da.22500.
  • 35. Montague P.R., Hyman S.E., Cohen J.D. Computational roles for dopamine in behavioural control. Nature. 2004;431:760–767. doi: 10.1038/nature03015.
  • 36. Waelti P., Dickinson A., Schultz W. Dopamine responses comply with basic assumptions of formal learning theory. Nature. 2001;412:43–48. doi: 10.1038/35083500.
  • 37. National Research Council. Guide for the Care and Use of Laboratory Animals. 8th ed. Washington, DC: National Academies Press; 2011.
  • 38. Kangas B.D., Bergman J. Touchscreen technology in the study of cognition-related behavior. Behav Pharmacol. 2017;28:623–629. doi: 10.1097/FBP.0000000000000356.
  • 39. Kangas B.D., Branch M.N. Empirical validation of a procedure to correct position and stimulus biases in matching-to-sample. J Exp Anal Behav. 2008;90:103–112. doi: 10.1901/jeab.2008.90-103.
  • 40. Webb C.A., Dillon D.G., Pechtel P., Goer F.K., Murray L., Huys Q.J., et al. Neural correlates of three promising endophenotypes of depression: Evidence from the EMBARC study. Neuropsychopharmacology. 2016;41:454–463. doi: 10.1038/npp.2015.165.
  • 41. Bayer H.M., Glimcher P.W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron. 2005;47:129–141. doi: 10.1016/j.neuron.2005.05.020.
  • 42. Montague P.R., Dayan P., Sejnowski T.J. A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J Neurosci. 1996;16:1936–1947. doi: 10.1523/JNEUROSCI.16-05-01936.1996.
  • 43. Sutton R., Barto A. Reinforcement Learning: An Introduction. Cambridge: MIT Press; 1998.
  • 44. Smith A., Li M., Becker S., Kapur S. A model of antipsychotic action in conditioned avoidance: A computational approach. Neuropsychopharmacology. 2004;29:1040–1049. doi: 10.1038/sj.npp.1300414.
  • 45. Ang Y.S., Kaiser R., Deckersbach T., Almeida J., Phillips M.L., Chase H.W., et al. Pretreatment reward sensitivity and frontostriatal resting-state functional connectivity are associated with response to bupropion after sertraline nonresponse. Biol Psychiatry. 2020;88:657–667. doi: 10.1016/j.biopsych.2020.04.009.
  • 46. Klein M.E., Grice A.B., Sheth S., Go M., Murrough J.W. Pharmacological treatments for anhedonia. Curr Top Behav Neurosci. 2022;58:467–489. doi: 10.1007/7854_2022_357.
  • 47. Pizzagalli D.A. Toward a better understanding of the mechanisms and pathophysiology of anhedonia: Are we ready for translation? Am J Psychiatry. 2022;179:458–469. doi: 10.1176/appi.ajp.20220423.

Articles from Biological Psychiatry Global Open Science are provided here courtesy of Elsevier
