Author manuscript; available in PMC: 2023 Sep 25.
Published in final edited form as: Curr Biol. 2022 Jul 20;32(17):3690–3703.e5. doi: 10.1016/j.cub.2022.06.089

Transient food insecurity during the juvenile-adolescent period affects adult weight, cognitive flexibility, and dopamine neurobiology

Wan Chen Lin 1, Christine Liu 1, Polina Kosillo 3, Lung-Hao Tai 1, Ezequiel Galarce 4, Helen Bateup 1,3,5, Stephan Lammel 1,3, Linda Wilbrecht 1,2,6,*
PMCID: PMC10519557  NIHMSID: NIHMS1903535  PMID: 35863352

Summary

A major challenge for neuroscience, public health, and evolutionary biology is to understand the effects of scarcity and uncertainty on the developing brain. Currently, a significant fraction of children and adolescents worldwide experience insecure access to food. The goal of our work was to test in mice if transient experience of insecure versus secure access to food during the juvenile-adolescent period produced lasting differences in learning, decision making, and the dopamine system in adulthood. We manipulated feeding schedule in mice from postnatal day (P) 21 to 40 as food insecure or ad libitum and found that when tested in adulthood (after P60), males with different developmental feeding history showed significant differences in multiple metrics of cognitive flexibility in learning and decision making. Adult females with different developmental feeding history showed no differences in cognitive flexibility, but did show significant differences in adult weight. We next applied reinforcement learning models to these behavioral data. The best fit models suggested that in males, developmental feeding history altered how mice updated their behavior after negative outcomes. This effect was sensitive to task context and reward contingencies. Consistent with these results, in males we found that the two feeding history groups showed significant differences in AMPAR/NMDAR ratio of excitatory synapses on nucleus accumbens-projecting midbrain dopamine neurons and evoked dopamine release in dorsal striatal targets. Together, these data show in a rodent model that transient differences in feeding history in the juvenile-adolescent period can have significant impacts on adult weight, learning, and decision making.

Keywords: Food Insecurity, Cognitive Flexibility, Feeding, Adaptive, Dopamine

Introduction

Food insecurity is defined as uncertain or limited access to sufficient, nutritionally adequate, and safe food, and is distinguished from starvation and malnutrition1,2. Before the COVID-19 pandemic, approximately 14.5 million households with children in the United States and over 2 billion people worldwide experienced food insecurity in their daily life3,4. Since the onset of COVID-19, numbers have increased worldwide4.

The epidemiological literature shows that children and adolescents who have experienced food insecurity are at higher risk for a number of mental health and behavioral problems5–11, including internalizing and externalizing behaviors7,10,12 and issues with self-control10,13. Food insecurity is also associated with differences in learning14, lower IQ metrics15, and worse math, reading, and vocabulary scores16,17. A functional brain imaging study of children who had experienced food insecurity found they performed worse than children who were food secure in a task based on reaction time and showed lower fronto-striatal white matter integrity18.

The effects of food insecurity on human development are often confounded with other factors associated with poverty and adversity. While researchers can statistically control for these variables in carefully designed human studies, it is particularly hard to fully isolate childhood food insecurity from parental stress and depression8,19,20. It is also of great interest to understand the effects of feeding history on brain development from the perspective of evolutionary biology21–23. These factors motivated us to develop a mouse model of juvenile and adolescent food insecurity to explore effects on behavioral and brain development while controlling genetic and other environmental factors.

Systems that control learning and decision making evolved in large part to support foraging strategies. Foraging strategies can be considered behavioral phenotypes. Phenotypes that affect survival will be under selective pressure. An individual at birth may be capable of expressing multiple possible adult phenotypes, but the phenotypes ultimately expressed may be informed by the environment encountered by each individual during development21. In evolution and ecology, this complex developmental interaction of genes and environment has been termed adaptive developmental plasticity and encompasses ideas from life history theory24–27. In theoretical, lab, and field work, it has been shown to be advantageous for organisms to make use of abundance and scarcity cues experienced in development to promote expression of a phenotype optimized for these conditions in the adult environment21,28,29. It is thought that when environments are relatively consistent across time, information acquired during development can be used to trigger a predictive adaptive response21,24,30. These ideas have been well established in evolutionary biology and may be bridged with studies of experience-dependent plasticity and sensitive periods in neuroscience22,30.

When forming our hypotheses about the effects of food insecurity on learning and decision making, we considered both human epidemiology and adaptive developmental plasticity frameworks. Based on human epidemiology, one might be inclined to predict that mice who experienced food insecurity, when compared to those who always experienced ad libitum access to food, would show worse cognitive performance in learning and decision making tasks. However, when considering adaptive developmental plasticity, we may also anticipate observing gains in performance in specific contexts in animals that experienced food insecurity. One recent study reported that increased past exposure to uncertainty was associated with greater cognitive flexibility in human subjects, but only when subjects were tested in conditions of uncertainty, which possibly match or mimic a more unstable developmental environment31.

Here, we present the paradigm we used to manipulate developmental juvenile-adolescent feeding history in mice from postnatal day(P) 21 to P40. We present results showing how these differences in feeding history affected weight, behavioral performance, and neurobiological measures in adulthood after P60.

Results

We manipulated the juvenile-adolescent feeding schedule in both male and female mice between P21 and P40, during which mice either had free access to food (ad libitum, AL) or had fluctuating, uncertain, and limited amounts of food (food insecure, FI). After P41, all mice had free access to food (Figure 1A–C).

Figure 1. Food insecurity feeding paradigm, experimental timeline, and mouse weight gain.

A, Mice were assigned to 2 different groups, Ad lib (AL) and Food Insecurity (FI), at P21. After P41, both AL and FI mice were fed ad libitum until testing. Behavioral or neurobiological testing was performed after P60. B, Schematic illustrates the P21-P40 treatment differences. AL mice had free access to food daily while FI mice received food in alternating ‘feast and famine’ days. C, Food was delivered to the FI mice with variable ratio (with a 5.0g total baseline delivered every 48 hours). D, FI mice showed transient disruption of weight gain during P21-40 (n(AL)=30, n(FI)=25). E, Male mice (n(AL)=16, n(FI)=12) showed no group differences in weight gain in adulthood. Note, all mice underwent food restriction for the 4COF task in the P60s. F, Female mice (n(AL)=14, n(FI)=13) showed significant differences in weight gain in adulthood. **p<0.01, ***p<0.001, ****p<0.0001. D-F, Data are represented as mean ± SEM. See also Figure S1.

Feeding history in development affected adult weight in females but not males

During P21-40, FI mice were significantly lighter than AL mice (Figure 1D, treatment: F(1,1028)=41.17, p<0.0001, age: F(19,1028)=163.5, p<0.0001, interaction: F(19,1028)=3.24, p<0.0001; post-hoc Sidak’s multiple comparison: P37: p=0.0023, P39: p<0.0001, P41: p<0.0001). The P21-40 weights of the FI mice were comparable to those of a food restricted (FR) group that received a stable but limited amount of food during P21-40, which allowed them to maintain 80-90% of the average weight of P21-40 AL mice (Figure S1A,B). The FI mice regained weight quickly and were comparable to the same sex AL mice by P43, the first weight measurement after FI mice were returned to ad libitum food (Figure 1D, P43: p>0.99, Figure S1B).

In adulthood, male FI mice maintained weights comparable to male AL mice (Figure 1E, treatment: F(1,261)=2.07, p=0.15, age: F(11,261)=42.83, p<0.0001, interaction: F(11,261)=1.13, p=0.34; Figure S1C). Female FI mice grew significantly heavier than female AL mice (Figure 1F, treatment: F(1,300)=75, p<0.0001, age: F(11,300)=31.65, p<0.0001, interaction: F(11,300)=2.98, p=0.0009; P110: p=0.054, P120: p=0.0004, P130: p<0.0001, P140: p=0.0003, P150: p=0.0003; Figure S1D). These data are in line with findings in human literature that females are at higher risk to develop obesity with food insecurity experience32–34.

In male mice, differences in feeding history (P21-40) affected reversal learning in adulthood

We next used a 4-choice odor-based foraging (4COF) task35–38 (Figure 2A) to test how juvenile-adolescent feeding history affected capacity for learning and cognitive flexibility in adulthood. The mice were tested in discrimination and reversal learning phases in which a reward was obtained by digging selectively in one of four pots with different scented shavings. The spatial location of the pots was shuffled in each trial. To meet the criterion in each phase, mice needed to make 8 correct choices out of 10 consecutive trials. In the discrimination phase, adult (P60-70) male AL and FI mice took similar numbers of trials to criterion (Figure 2B, t(20)=0.33, p=0.75; Figure S2A) and made a similar number of total errors (Figure 2C, t(20)=0.97, p=0.34; Figure S2B), indicating that feeding history in development did not affect the capacity for initial associative learning in adult male mice.
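The criterion rule described above (8 correct choices out of 10 consecutive trials) can be illustrated with a short sketch. This is not the authors' analysis code. The function name, the boolean encoding of outcomes, and the choice to let the window be shorter than 10 trials at the start of a session are assumptions made for illustration.

```python
def trials_to_criterion(outcomes, n_correct=8, window=10):
    """Return the 1-based trial on which the criterion is first met, or None.

    outcomes: sequence of booleans, True for a correct choice.
    Assumption: early in a session the window simply contains fewer trials.
    """
    for end in range(1, len(outcomes) + 1):
        start = max(0, end - window)
        if sum(outcomes[start:end]) >= n_correct:
            return end
    return None


# Hypothetical session: two early errors, then a run of correct choices.
session = [False, True, False] + [True] * 8
print(trials_to_criterion(session))
```

For the hypothetical session shown, the criterion is first met on trial 10, when the most recent 10 trials contain 8 correct choices.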

Figure 2. In male mice, developmental feeding history affected adult cognitive flexibility in reversal learning. RL modeling suggests this effect was driven by differences in the learning rate in response to negative outcomes.

A, Schematic of the 4COF task. O1 was rewarded in the discrimination phase and unrewarded in the reversal phase. Previously unrewarded O2 became rewarded in the reversal phase. B,C, Discrimination performance was similar between the adult male AL (n=10) and FI (n=12) mice. D,E, The FI mice took significantly more trials than the AL mice to reach criterion in the reversal phase, driven by a greater number of errors. F, The FI mice made significantly more reversal errors (O1 Error), especially Perseverative O1 errors. G,I, Inverse temperatures in both phases, βdis and βrev, were similar between the AL and FI mice. H, Learning rates in the discrimination phase, αdis, were comparable. J,K, In the reversal phase, learning rates in response to positive outcomes, αrevpos (α+), were not significantly different between groups, but learning rates in response to negative outcomes, αrevneg (α−), were significantly smaller in the FI group. **p<0.01, ***p<0.001, ****p<0.0001. Data are represented as mean ± SEM. See also Figure S2,S3 and Table S1.

In the reversal phase, we found clear group differences in male mice. Adult male FI mice showed less cognitive flexibility, taking significantly more trials to reach criterion (Figure 2D, t(20)=5.29, p<0.0001; Figure S2C) and made more total errors (Figure 2E, t(20)=4.98, p<0.0001; Figure S2D), compared to male AL mice. When the error types were examined, adult male FI mice made significantly more reversal errors (odor 1, O1 errors; Figure 2F, t(20)=4.49, p=0.0002). The majority of these O1 errors were perseverative errors (Perseverative O1), defined as errors choosing O1 before making the first correct choice (Figure 2F, t(20)=2.88, p=0.0092). There was no significant difference in regressive errors (Regressive O1), defined as errors choosing O1 after making the first correct choice, yet the data show a possible trend level difference (Mann Whitney U=31, p=0.054). There were no differences in irrelevant errors defined as choosing the never rewarded odor (O3) (t(20)=1.98, p=0.061) or novel errors defined as choosing a newly added odor O4’ (U=45.5, p=0.29). Both groups also had approximately the same number of omission trials in which no digging choice was made within a 3-minute time limit (Figure 2F, t(20)=0.25, p=0.80).
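The error taxonomy used in the reversal phase can be made concrete with a small sketch that tallies each trial by type. The labels and the function name are hypothetical: "O4" stands in for the newly added odor O4’, and None marks an omission trial. The definitions of perseverative and regressive O1 errors follow the text above.

```python
def classify_reversal_choices(choices):
    """Count reversal-phase outcomes by type.

    choices: per-trial labels, one of "O1", "O2", "O3", "O4", or None
    (None marks an omission, i.e. no dig within the time limit).
    """
    counts = {"correct": 0, "perseverative_O1": 0, "regressive_O1": 0,
              "irrelevant": 0, "novel": 0, "omission": 0}
    seen_correct = False
    for choice in choices:
        if choice is None:
            counts["omission"] += 1
        elif choice == "O2":            # rewarded odor in the reversal phase
            counts["correct"] += 1
            seen_correct = True
        elif choice == "O1":            # previously rewarded odor
            key = "regressive_O1" if seen_correct else "perseverative_O1"
            counts[key] += 1
        elif choice == "O3":            # never-rewarded odor
            counts["irrelevant"] += 1
        elif choice == "O4":            # newly added odor
            counts["novel"] += 1
        else:
            raise ValueError(f"unknown choice label: {choice!r}")
    return counts


# Hypothetical reversal session.
example = classify_reversal_choices(["O1", "O1", "O3", None, "O2", "O1", "O4", "O2"])
print(example)
```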

In two further cohorts of male mice, we replicated these 4COF behavioral results (Figure S2E–L). As a further control, we also performed comparisons between FI and AL mice and a third food restriction group (FR). The male FR mice were intermediate between AL and FI mice in terms of reversal performance (Figure S2A–H), suggesting FI treatment has effects beyond those of food restriction with daily feeding.

These results indicated that juvenile-adolescent feeding history had robust effects on reversal learning in adult male mice (Figure 2; Figure S2A–L). A history of P21-40 food insecurity was associated with more perseverative choices in reversal learning in the deterministic context of the 4COF task.

Different feeding history affected cognitive flexibility by altering learning rates in adult male mice

To better understand the differences found in the 4COF task, we used reinforcement learning (RL) models to fit the trial-by-trial data (Figure 2). Comparing multiple submodels (Table S1), we found that our RL5 model with 5 parameters had the lowest average Akaike information criterion (AIC) score39 and good simulation recovery of mouse behavioral data (Figure S3A–D). The parameters in the RL5 model were βdis and αdis for the discrimination phase and βrev, αrevpos (α+), and αrevneg (α−) for the reversal phase, where the β inverse temperature parameters capture the stochasticity of the action selection policy and the α parameters capture learning rates in response to outcomes in each phase of the 4COF task.
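The following sketch illustrates the kind of model described here: a delta-rule learner with a softmax policy, an inverse temperature β, and separate learning rates for rewarded and unrewarded outcomes, scored with AIC. It is an illustration under stated assumptions and not the RL5 model as fitted in this study. The function names, the zero initial values, and the 0/1 reward coding are all assumptions.

```python
import math


def softmax(values, beta):
    """Softmax choice probabilities with inverse temperature beta."""
    m = max(values)
    weights = [math.exp(beta * (v - m)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def negative_log_likelihood(choices, rewards, beta, alpha_pos, alpha_neg, n_options=4):
    """NLL of a choice sequence under delta-rule learning with split learning rates.

    choices: option indices (0..n_options-1); rewards: 1 if rewarded, else 0.
    """
    q = [0.0] * n_options                 # assumed initial action values
    nll = 0.0
    for choice, reward in zip(choices, rewards):
        nll -= math.log(softmax(q, beta)[choice])
        alpha = alpha_pos if reward > 0 else alpha_neg
        q[choice] += alpha * (reward - q[choice])   # prediction-error update
    return nll


def aic(nll, n_params):
    """Akaike information criterion: 2k - 2 ln(L), with ln(L) = -NLL."""
    return 2 * n_params + 2 * nll


# Hypothetical data: four trials over four odors.
nll = negative_log_likelihood([0, 1, 1, 2], [0, 1, 1, 0],
                              beta=2.0, alpha_pos=0.3, alpha_neg=0.1)
print(nll, aic(nll, n_params=3))
```

In a model comparison of the sort reported in Table S1, the AIC would be computed per mouse for each candidate submodel at its best-fitting parameters, and the submodel with the lowest average AIC would be preferred.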

In the discrimination phase, we found no significant differences between male AL and FI groups in either βdis (Figure 2G, U=49, p=0.50, MED(AL)=0.059, MED(FI)=0.081) or learning rate αdis (Figure 2H, U=50, p=0.54, MED(AL)=0.063, MED(FI)=0.047). In the reversal phase, we found that feeding history differentially affected learning rates αrevpos (α+) and αrevneg (α−) (Figure 2J, t(20)=2.01, p=0.058; Figure 2K, U=17, p=0.0034, MED(AL)=0.23, MED(FI)=0.080), but did not affect βrev (Figure 2I, t(20)=0.96, p=0.35). The αrevneg (α−) in response to unrewarded outcomes, or negative prediction error, in adulthood was significantly smaller in the FI group, likely contributing to the less flexible and perseverative performance in the reversal phase.
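To see why a smaller learning rate for negative outcomes could produce perseveration, consider a simple worked example. It assumes a delta-rule update, a learned value of 1 for the previously rewarded odor, and an arbitrary threshold of 0.5. None of these assumptions come from the fitted model, and the function name is hypothetical. Only the two learning rates are taken from the group medians reported above.

```python
def unrewarded_choices_until_below(alpha_neg, q_start=1.0, threshold=0.5):
    """Count unrewarded choices needed for a learned value to fall below threshold."""
    if not 0.0 < alpha_neg <= 1.0:
        raise ValueError("alpha_neg must be in (0, 1]")
    q, n = q_start, 0
    while q >= threshold:
        q += alpha_neg * (0.0 - q)   # delta-rule update after an unrewarded choice
        n += 1
    return n


print(unrewarded_choices_until_below(0.23))    # median alpha- reported for AL males
print(unrewarded_choices_until_below(0.080))   # median alpha- reported for FI males
```

Under these assumptions the value of the old odor halves after 3 unrewarded choices at α−=0.23 but only after 9 at α−=0.080, which is qualitatively consistent with the larger number of perseverative O1 errors in the FI group.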

In female mice, differences in feeding history did not affect reversal learning in adulthood

We next examined the effects of juvenile-adolescent AL and FI feeding history (Figure 1A–C) on adult performance of female mice in the 4COF task (Figure 3A). We found that FI during P21-40 did not affect performance in discrimination and reversal learning in adulthood in any metric (Figure 3B–E; Figure S2M–P). We also applied the RL models to these female data and again found that there were no differences between groups for any parameter in our RL5 model (Figure S3Q–U). Together, these data show that juvenile-adolescent feeding history does not affect initial associative learning or cognitive flexibility in reversal learning in a deterministic context in adult female mice.

Figure 3. In female mice, developmental feeding history did not affect adult cognitive flexibility in reversal learning.

A, Schematic of the 4COF task. B,C, Discrimination phase, B, Trials to criterion (t(21)=0.72, p=0.48) C, Total numbers of errors (t(21)=0.72, p=0.48). D-F, Reversal phase. D, Trials to criterion (t(21)=0.40, p=0.70), E, Total numbers of errors (t(21)=0.46, p=0.65), F, O1 errors (t(21)=0.61, p=0.55), Perseverative O1 error (U=56.5, p=0.61), Regressive O1 error (t(21)=0.0035, p=0.99), Irrelevant error (t(21)=0.41, p=0.69), Novel error (U=62.5, p=0.89) and Omission (t(21)=0.79, p=0.44). n(AL)=10, n(FI)=13. Unpaired two-tailed t-test or Mann-Whitney two-tailed test. Data are represented as mean ± SEM. See also Figure S2,S3 and Table S1.

Differences in feeding history had no impact on measures of palatable food consumption behavior in adulthood in males and weak effects in females

While our study did not focus on feeding behavior, we did perform a small study of consumption of high-fat food (HFF) in both adult male and female mice. We tested HFF intake in three different consumption test conditions (baseline, restricted, and resated), followed by 3 weeks of 2-hour access on an intermittent schedule (Figure S1L). We found no differences between adult male AL, FR, and FI mice in any of these consumption tests (Figure S1F–K,T,V). In adult female mice, we found a significantly greater increase in the resated session relative to baseline session 3 in FI mice compared to AL mice (Figure S1W), but no significant group differences in other consumption tests (Figure S1N–S,U).

In male mice, differences in feeding history affected flexibility under probabilistic and volatile reward conditions in adulthood

We next tested male AL and FI mice in a probabilistic 2-armed bandit task (2ABT)40. This task also tests cognitive flexibility but differentially taxes decision making systems due to a probabilistic reward contingency and experience of repeated switching over hundreds of trials. Water was used as a reinforcer and no discrimination cues were provided in this task.

To probe behavior under different conditions of uncertainty, mice were trained in three phases with different reward contingencies: Phase 1-75%, Phase 2-90%, and Phase 3-65% (Figure 4A–D). Mice initiated a trial by poking at the center initiation port and chose either the left (L) or right (R) peripheral port. In the L-port rewarded blocks of Phase 1, there was a 75% chance of reward delivery when mice made a correct decision at the L-port and a 0% chance of reward delivery when mice made an R-port choice (Figure 4A). Volatility came from block switching, i.e. a change from L-port rewarded to R-port rewarded side or vice versa, which occurred every 15±8 rewards in all phases.
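The task structure described above can be sketched as a small simulation environment. This is an illustration only. The class name is hypothetical, and the block length is assumed to be drawn uniformly from 7 to 23 rewards, since the text gives only "15±8 rewards" without stating the distribution.

```python
import random


class TwoArmedBandit:
    """Minimal 2ABT environment: one port pays with probability p_reward, the other never."""

    def __init__(self, p_reward=0.75, mean_block=15, jitter=8, seed=None):
        self.p_reward = p_reward
        self.mean_block = mean_block
        self.jitter = jitter
        self.rng = random.Random(seed)
        self.rewarded_side = self.rng.choice(["L", "R"])
        self.rewards_left = self._draw_block_length()

    def _draw_block_length(self):
        # Assumption: uniform on [mean - jitter, mean + jitter] rewards, inclusive.
        return self.rng.randint(self.mean_block - self.jitter,
                                self.mean_block + self.jitter)

    def step(self, choice):
        """Return 1 if the choice is rewarded, else 0; switch blocks once the quota is met."""
        rewarded = choice == self.rewarded_side and self.rng.random() < self.p_reward
        if rewarded:
            self.rewards_left -= 1
            if self.rewards_left == 0:
                self.rewarded_side = "R" if self.rewarded_side == "L" else "L"
                self.rewards_left = self._draw_block_length()
        return int(rewarded)


env = TwoArmedBandit(p_reward=0.75, seed=1)   # Phase 1 contingency
print(env.rewarded_side, env.step("L"), env.step("R"))
```

Because a block switch is triggered by a reward count, the last trial before each switch is always a rewarded one in this sketch.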

Figure 4. In male mice, developmental feeding history affected adult cognitive flexibility and the learning rate in response to negative outcomes in the probabilistic 2-armed bandit task.

A, Schematic of the 2ABT. B-D, Compared with adult male AL mice (n=8), FI mice (n=8) took significantly fewer trials to switch in Phase 1 and 3 (when the context was more probabilistic, 75% and 65% respectively). Groups did not differ in Phase 2 (90%). E-G, After a reward block switch, performance drops and then recovers. The FI mice reached 0.5 fraction of correct choice faster after a switch trial in Phase 1 and 3. This difference was present but less prominent in Phase 2. H, Within both groups, mice reached 0.5 fraction of L-choice faster in Phase 3. I-L, RL model with 4 parameters, β, αpos (α+), αneg (α−), and st, in Phase 1. The FI group had significantly greater αneg (α−) and smaller st values than the AL group. *p<0.05, ****p<0.0001. Dotted line at trial 0 indicates the reward block switching. Note the trial before the switch was always rewarded. Data are represented as mean ± SEM. See also Figure S4,S5 and Table S1.

One of our primary outcome measures in the 2ABT was the number of trials mice took to switch their chosen side when the action-outcome contingency changed at a block switch. During Phase 1-75%, we found that adult male FI mice, on average, took significantly fewer trials to switch than the AL mice (Figure 4B, t(14)=3.93, p=0.0015, AL:3.08±0.09, FI:2.50±0.11). This was again found in Phase 3-65% (Figure 4D, t(14)=2.67, p=0.018, AL:3.00±0.083, FI:2.58±0.13). However, the two groups took a comparable number of trials to switch in Phase 2-90% (Figure 4C, t(14)=0.76, p=0.46, AL:2.15±0.048, FI:2.04±0.13).

When we examined the switching behavior more closely (trial by trial after a switch), we again found that adult male FI mice switched significantly faster than the AL mice in Phase 1-75% (Figure 4E, treatment: F(1,182)=20.11, p<0.0001, trials relative to switch: F(12,182)=877.6, p<0.0001, interaction: F(12,182)=8.10, p<0.0001). Adult male FI mice reached the fraction of correct choice equaling 0.5 faster and the fraction of correct choice was significantly higher at first, second and third trials after the switch (Figure 4E, Sidak’s: 1st: p=0.014, 2nd: p<0.0001; 3rd: p<0.0001). A similar behavioral difference between groups was observed in Phase 3-65% (Figure 4G, 2nd: p<0.0001, 3rd: p=0.027; treatment: F(1,182)=3.48, p=0.064, trials relative to switch: F(12,182)=777, p<0.0001, interaction: F(12,182)=3.61, p<0.0001). In Phase 2-90%, this trial-by-trial difference was also significant but was less prominent (Figure 4F, 1st: p=0.0013, treatment: F(1,182)=0.31, p=0.58, trials relative to switch: F(12,182)=717.9, p<0.0001, interaction: F(12,182)=2.45, p=0.006). Together, these data suggest that adult male mice with juvenile-adolescent FI feeding history can show significantly more flexible behavior than their AL counterparts when the reward contingency and context is more uncertain and probabilistic (≤75%). For further analyses see supplemental information and Figure S5.

Different feeding history affected switching behavior in adult male mice by altering learning rates and ‘sticky choice’

We again turned to RL models to better understand the latent variables contributing to performance differences in the 2ABT (Figure 4). We found that a model (RL2a1b1s) that included β for inverse temperature, αpos (α+) for learning rate associated with positive outcomes, αneg (α−) for learning rate associated with negative outcomes, and st for ‘stickiness’ (a parameter that accounts for staying with a previous choice affecting the policy stage) had the lowest average AIC scores (Figure S4A).
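One common way to express a stickiness parameter is as a bonus added to the previously chosen action at the policy stage. The sketch below takes that form. Whether the bonus is added to the softmax input in exactly this way, unscaled by β, is an assumption and may differ from the RL2a1b1s model as fitted. The function name is hypothetical.

```python
import math


def sticky_choice_probabilities(q_values, beta, stickiness, previous_choice=None):
    """Softmax over action values plus a bonus for repeating the previous choice."""
    logits = [beta * q for q in q_values]
    if previous_choice is not None:
        logits[previous_choice] += stickiness   # assumed additive, policy-stage bonus
    m = max(logits)
    weights = [math.exp(x - m) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


# With equal action values, a positive stickiness biases the animal to repeat itself.
print(sticky_choice_probabilities([0.5, 0.5], beta=3.0, stickiness=1.0, previous_choice=0))
print(sticky_choice_probabilities([0.5, 0.5], beta=3.0, stickiness=0.0, previous_choice=0))
```

In this form, a smaller st value means that the previous choice contributes less to the next decision, so the simulated animal switches sides more readily when values are similar.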

Adult male AL and FI groups showed similar inverse temperature β parameters in all phases (Figure 4I, Phase 1: t(14)=0.58, p=0.57, Figure S4C,G, Phase 2: t(14)=0.19, p=0.85, Phase 3: t(14)=0.95, p=0.36) and similar αpos (α+) (Figure 4J, Phase 1: t(14)=0.36, p=0.73, Figure S4D,H, Phase 2: t(14)=0.079, p=0.94, Phase 3: t(14)=0.63, p=0.54). Differences between groups emerged in αneg (α−). The adult male FI group showed a significantly greater learning rate αneg (α−) than the AL group in Phase 1 (Figure 4K, t(14)=2.46, p=0.028). There were no significant differences between groups in Phase 2 or 3 (Figure S4E,I). AL and FI groups also showed significant differences in st in Phase 1 (Figure 4L, t(14)=2.7, p=0.017) and Phase 3 (Figure S4J, U=4, p=0.0019, MED(AL)=0.15, MED(FI)=0.096), in which the FI group had smaller values. This suggests the FI group were less perseverative in the 2ABT (stayed with a previous choice less) than the AL group when reward probability was equal to or less than 75% but not when it was 90%. Measures of integration of reward history using a logistic regression analysis (Equation 5) suggested the AL and FI mice consistently differed in their integration of unrewarded trials in all three phases of the 2ABT (Figure S5E–G).

Together, our modeling analyses support the interpretation that juvenile-adolescent feeding history affects updating of behavior particularly after negative outcomes in adult male mice.

In female mice, differences in feeding history did not affect flexibility and learning rates under probabilistic and volatile conditions in adulthood

We also ran female mice in the 2ABT to test if juvenile-adolescent feeding history affected the behavioral processes engaged by this task in females (Figure 5).

Figure 5. Developmental feeding history had no impact on performance in the 2-armed bandit task in adult female mice.

A-C, Adult female AL (n=8) and FI mice (n=8) did not differ in the number of trials to switch in all phases. D-F, The female AL and FI groups did not differ in fraction of correct choice in all phases. Data are represented as mean ± SEM. See Figure S4,S5 and Table S1.

Comparing adult female AL and FI mice, there were no differences in trials to switch in Phase 1, 2 or 3 on average (Figure 5A–C, t(14)<0.77, p>0.45), or when trial events were aligned to the switch trial (Figure 5D–F, Phase 1-3: treatment: F(1,182)<0.20, p>0.65, trials relative to switch: F(12,182)>672, p<0.0001, interaction: F(12,182)<0.72, p>0.73). We further applied the same RL modeling and logistic regression analyses to these female 2ABT data. Results suggested that there was no difference in β, αpos (α+), αneg (α−), or st parameters between adult female AL and FI groups in any of the three phases (Figure S4K–V). Female AL and FI groups also showed similar logistic regression weights of both past rewarded and past unrewarded trials, indicating past rewarded and unrewarded history had similar effects on switching behavior in these three different probabilistic reward conditions (Figure S5H–J).

Together, these data suggest that juvenile-adolescent feeding history did not affect cognitive flexibility or updating to either positive or negative outcomes in a probabilistic context in adult female mice.

Feeding history affected synaptic strength of excitatory synapses onto mesolimbic dopamine neurons in adult male mice

We next turned to examine the neurobiology of dopamine neurons in adult male mice with different feeding history to investigate possible sources of their differences in cognitive flexibility. We first targeted dopamine neurons of the ventral tegmental area (VTA) identified via retrobeads injected into the nucleus accumbens (NAc) core region. In ex vivo slices, we measured excitatory postsynaptic currents (EPSCs) in labeled VTA neurons evoked by local electrical stimulation (Figure 6A). The dual EPSCs mediated by both α-amino-3-hydroxy-5-methyl-4-isoxazole propionic acid receptors (AMPAR) and N-methyl-D-aspartate receptors (NMDAR) were recorded while neurons were held at a membrane potential of +40 mV. The NMDAR antagonist D-2-amino-5-phosphonopentanoate (D-AP5) was then applied to block NMDAR and isolate the AMPAR-mediated EPSCs (Figure 6B). We found that the AMPAR/NMDAR ratio in NAc core-projecting VTA dopamine neurons was significantly smaller in slices from the FI group (0.335±0.045, n=10) compared to the AL group (0.523±0.051, n=11) (Figure 6C, t(19)=2.72, p=0.014). These data suggest that juvenile-adolescent feeding history can modulate the strength of glutamatergic inputs onto VTA dopamine neurons in adulthood (in the absence of any training on a task).
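The ratio described above can be sketched as follows. The sketch assumes that the NMDAR component is obtained by point-by-point subtraction of the AMPAR-only trace (after D-AP5) from the dual-component trace, and that peak amplitudes are compared. This is a common convention but is an assumption here, since the exact measurement procedure is not given in this section. The function name and sample values are hypothetical.

```python
def ampar_nmdar_ratio(dual_trace, ampar_trace):
    """Peak AMPAR current divided by peak NMDAR current, both recorded at +40 mV.

    dual_trace: averaged EPSC before D-AP5 (AMPAR + NMDAR components).
    ampar_trace: averaged EPSC after D-AP5 (AMPAR component only).
    Traces are equal-length, baseline-subtracted current samples in pA.
    """
    if len(dual_trace) != len(ampar_trace):
        raise ValueError("traces must have the same number of samples")
    # Assumption: NMDAR current = dual current minus AMPAR current, sample by sample.
    nmdar_trace = [d - a for d, a in zip(dual_trace, ampar_trace)]
    nmdar_peak = max(nmdar_trace)   # currents are outward (positive) at +40 mV
    if nmdar_peak <= 0:
        raise ValueError("no positive NMDAR component found")
    return max(ampar_trace) / nmdar_peak


# Hypothetical traces (pA): a fast AMPAR component and a slower NMDAR component.
dual = [0, 30, 60, 50, 30, 10]
ampar = [0, 20, 15, 5, 0, 0]
print(ampar_nmdar_ratio(dual, ampar))
```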

Figure 6. In male mice, developmental feeding history affected AMPAR/NMDAR ratio in mesolimbic dopamine neurons in adulthood.

A, Retrobeads were injected into the NAc core and recordings were made from the VTA in naïve male AL and FI mice at P61-70. B, Example of evoked EPSC traces before and after application of D-AP5. The dual components of AMPAR-mediated and NMDAR-mediated EPSCs were recorded. D-AP5 was applied to isolate AMPAR-mediated currents. C, The AMPAR/NMDAR ratio was significantly reduced in the FI group (n=10) compared to the AL group (n=11). D,E, AMPAR-mediated EPSC(I)-Voltage(V) relationship curve. There was a trend level difference in current at +40mV in the slices from the FI group (treatment: F(1,90)=1.367, p=0.25, voltage: F(4,90)=56.56, p<0.0001, interaction: F(4,90)=1.105, p=0.36, post-hoc Tukey at +40mV: p=0.096). There was no significant difference in rectification index between the AL(n=9) and FI (n=9) groups. F,G, Paired-pulse ratios were not significantly different between groups at 50, 100, and 200 ms intervals. *p<0.05. Data are represented as mean ± SEM.

We also examined the AMPAR current (I)-voltage (V) relationship (Figure 6D) and calculated the rectification index (Equation 6). We found a trend level difference in AMPAR-mediated EPSCs at +40 mV in the FI group (Figure 6D, post-hoc Tukey: p=0.096), but the rectification index did not significantly differ between the two groups (Figure 6E, t(16)=0.9775, p=0.34, AL: 2.12±0.42, n=9, FI: 1.64±0.27, n=9).

To assess potential changes in glutamatergic presynaptic release probability in these neurons, we delivered pairs of stimuli at different time intervals and calculated the paired-pulse ratio. We found no significant differences between groups in the paired-pulse ratios (Figure 6F,G, treatment: F(1,50)=0.78, p=0.38, paired pulse interval: F(2,50)=2.18, p=0.12, interaction: F(2,50)=0.21, p=0.81; Interval: 50-ms: AL=1.14±0.12, FI=1.11±0.18, 100-ms: AL=1.10±0.08, FI 0.95±0.09, and 200-ms: AL=0.93±0.04, FI=0.89±0.08).

These electrophysiology data together suggest that differences found in the AMPAR/NMDAR ratios between the male AL and FI groups (Figure 6C) possibly result from differences in AMPAR signaling and composition of AMPAR subunit expression without differences in presynaptic release probability. The changes in AMPAR/NMDAR ratios also suggest that the excitatory synapses onto VTA dopamine neurons were ‘weaker’ in the FI group than in the AL group (even after 20 days of ad libitum feeding had resumed for the FI group).

Feeding history affected evoked DA release in the nigrostriatal system in adult male mice

We next used fast scan cyclic voltammetry (FSCV) to investigate dopamine release in multiple striatal subregions, including dorsomedial striatum (DMS), dorsocentral striatum (DCS), dorsolateral striatum (DLS), central striatum (CS), ventrolateral striatum (VLS), ventromedial striatum (VMS), and NAc core (Figure 7A). We found that peak dopamine concentration ([DA]o) in the DLS evoked by a single stimulation (1p) was significantly lower in the FI group (in μM: 0.71±0.08) compared to peak [DA]o in the AL group (1.05±0.09) (Figure 7B, t(19)=4.31, p=0.0004, paired two-tailed t-test). Electrically-evoked peak [DA]o was comparable between groups in other striatal regions (NAc core: t(31)=1.11, p=0.28; DMS: t(19)=0.69, p=0.50; DCS: t(18)=0.54, p=0.60; CS: t(19)=1.27, p=0.22; VLS: t(16)=1.36, p=0.19; VMS: t(19)=0.77, p=0.45; paired two-tailed t-test).

Figure 7. In male mice, developmental feeding history affected evoked dopamine release in the dorsal striatum in adulthood.

A, Evoked dopamine release [DA]o in striatal subregions showing single pulse (1p) data. Inset, cyclic voltammogram shows characteristic dopamine waveforms. B, Quantification of peak [DA]o by 1p stimulation. N=17-32 transients per site from 5 mice per group. The evoked peak [DA]o in the DLS was significantly lower in the FI group compared to the AL group. C, Peak [DA]o by a 4p train 100 Hz stimulation. N=9-16 transients per site from 5 mice per group. The evoked peak [DA]o in the DLS was significantly lower in the FI group. D, Ratio of 4p/1p peak [DA]o. The 4p/1p ratio in the DMS was significantly lower in the FI group compared to the AL group. See also Figure S6. Paired two-tailed t-test. Slices were paired such that one FI and one AL brain were recorded using the same electrode on the same day. *p<0.05, **p<0.01, ***p<0.001. DMS, dorsomedial striatum; DCS, dorsocentral striatum; DLS, dorsolateral striatum; CS, central striatum; VLS, ventrolateral striatum; NAc, nucleus accumbens; VMS, ventromedial striatum. Data are represented as mean ± SEM.

We also measured dopamine release evoked by a short train of high frequency stimuli (4 pulses at 100 Hz, 4p) to simulate a burst firing state. Again we found that peak [DA]o evoked by 4p stimulation was significantly lower in the DLS of the FI group (in μM: 1.35±0.19) compared to peak [DA]o in the AL group (1.81±0.26, t(9)=2.43, p=0.038). The 4p stimulation produced no significant difference between groups in other striatal subregions (Figure 7C).

We calculated and compared the ratio of peak [DA]o evoked by a 4p 100Hz train to 1p stimulation as a measure of pre-synaptic release probability. The ratio of 4p/1p peak [DA]o was significantly lower in the DMS in the FI group (n= 5 mice) compared to that in the AL group (n=5 mice) (Figure 7D, t(4)=2.81, p=0.04, paired two-tailed t-test, AL: 1.68±0.10, FI:1.38±0.02), suggestive of increased release probability.

These data show that there is altered evoked dopamine release within the dorsal striatum of males with a history of food insecurity during the juvenile-adolescent period.

Discussion

In this work, we generated a mouse model of developmental food insecurity to investigate the impact of juvenile-adolescent feeding history on adult learning and decision making with a focus on metrics of cognitive flexibility. We also investigated weight, food consumption and neurobiology of the striatal dopamine systems.

We found that male mice with different juvenile-adolescent feeding histories (P21-40) did not show a difference in adult weight (Figure 1E; Figure S1C), high fat diet consumption (Figure S1E–W), or simple discrimination learning, but did have significant differences in cognitive flexibility in a foraging task (4COF) (Figure 2; Figure S2) and a probabilistic 2-armed bandit task (2ABT) (Figure 4). Effects on cognitive flexibility in the 4COF task were replicated in two additional cohorts (Figure S2). In female mice, the experience of food insecurity at P21-40 significantly increased adult weight (Figure 1F; Figure S1D) and had some effect on high fat food consumption (Figure S1), but did not have significant impacts on learning and cognitive flexibility in adulthood in the two tasks used here (Figure 3,5).

Effects of food insecurity on weight and high fat food consumption

In previous human studies, researchers have found that developmental and adult food insecurity is associated with increased weight gain and greater risk of developing obesity, and that this phenomenon is more pronounced and more consistent in females32–34. It is thought that increased body weight after a history of food insecurity or other harsh circumstances, and increased HFF consumption after experience of acute restriction, may serve caloric and somatic preparedness for reproduction, especially for females27,41,42. Future work will be needed to investigate the biological mechanisms resulting in the sex specificity of increased adult body weight in FI vs AL females.

Feeding history effects on cognition and sensitivity to negative outcomes

Our male mouse behavioral data are consistent with literature on human development suggesting that food insecurity can affect cognition. The experience of food insecurity has been associated with negative impacts on academic performance and mental health16,17,43–45. Studies that specifically isolate food insecurity and the cognitive domain, however, are rare. Aurino et al. (2019) found that chronic food insecurity through development, as well as acute, early, and more punctuated episodes of food insecurity in the later juvenile period, impaired academic performance at 12 years of age16. Dennison et al. (2019) also found that people with childhood adversity had altered reward processing and responses in a monetary incentive delay task18. By controlling for multiple factors, they determined that these differences were mediated by the experience of food insecurity, but not by other forms of adversity such as neglect and abuse. More general studies of early life adversity in humans have also shown that the experience of early life adversity can result in reduced cognitive flexibility, but without explicit report of sex differences46–50. Notably, one study suggests that effects of adversity on flexibility may be positive in specific contexts: Mittal et al. (2015) compared human subjects who had experienced greater uncertainty in their past with ‘controls’ who lacked this exposure, and found that the performance of the two groups was comparable in a neutral context, but that the exposed subjects were more flexible when tested in an uncertain context31.

Using RL modeling analyses, we were able to more closely investigate the trial-by-trial behavior to examine how learning from rewarded and unrewarded outcomes contributed to task performance. We found that juvenile-adolescent feeding history affected αneg, the learning rate for unrewarded outcomes, in both tasks (Figure 2,4). Interestingly, while cognitive flexibility and sensitivity to negative outcomes were significantly lower in the FI group in the deterministic 4COF task, the effects observed in the probabilistic 2ABT were in a different direction. Adult male FI mice switched faster and used a larger αneg than the AL mice when tested in a 75% reward contingency (Phase 1) but showed more comparable behavior when tested in a 90% reward contingency (Phase 2) (Figure 4; Figure S4). These data suggest that male mice with FI history do not just have a simple impairment in updating in response to negative outcomes, but that feeding history interacts with testing context (including reward probability within the 2ABT) to elicit this ability. Thus, cognitive flexibility is differentially gated by uncertainty in the AL and FI groups.

Other recent studies in humans have revealed that learning rates are not intrinsic to a subject but instead are sensitive to uncertainty and volatility and perhaps further aspects of task context51,52. Uncertainty may be a particularly influential contextual factor because it is known to affect the dopamine system53,54.

Feeding history in development can generate detectable neurobiological differences in adulthood

In our neurobiological studies, we found that VTA dopamine neurons that project to the NAc core had a reduced AMPAR/NMDAR ratio in the FI group compared to the AL group (Figure 6). We also found that regulation of dopamine release from dopamine terminals in the dorsal striatum was affected by feeding history (Figure 7). These data suggest that FI mice differed from AL mice in adulthood in at least two ways: the strength of glutamatergic inputs onto VTA dopamine neurons was weaker, and dopamine release in the dorsal striatum was also likely lower (if neurons were firing similarly). VTA and substantia nigra pars compacta (SNpc) dopamine neurons have been implicated in signaling reward prediction error55,56, reward probability and uncertainty in reinforcement learning57, and uncertainty associated with reward probabilities in a probabilistic environment58,59. In addition, two studies that manipulated activity and signaling of dopamine neurons and used the same 4COF task in adult animals also found effects on cognitive flexibility in the reversal phase38,60. We posit that a complex interplay of differences in firing patterns and intracellular signaling on time scales of seconds to minutes, as well as hours and days, could lead to functional differences in dopamine neurons and cortico-striatal circuits that result in behavioral differences in flexible updating in the tasks (see Lin et al. 2020 for a working model22).

Our neurobiological data are consistent with other studies of food restriction and diet induced obesity in adult rodents. A previous study found that adult rats that experienced three to four weeks of food restriction had lower AMPAR- and/or NMDAR-mediated currents compared to rats that were fed ad libitum61. Food and feeding experience in adult animals have also been found to affect dopamine release at axonal terminals in both ventral and dorsal striatum62–69. Our study adds to these existing data by showing that feeding history during development can generate changes in the mesolimbic and nigrostriatal dopamine systems that can be observed in adulthood, 20 days after food insecurity has ended.

Limitations of the study and future directions

In our study, we found differences in neurobiology that could plausibly cause the different behavioral phenotypes38,60, but we did not test this connection with further manipulations or in vivo measurement. In addition, we only took neurobiological measures from adult male mice. Future studies in females are required to know if changes in neurobiology were sex-specific and to understand how females are resilient to cognitive effects. Finally, we do not yet know what specific role may be played by food restriction (and possibly transient malnourishment) versus experience of uncertainty alone. Future study designs may better isolate uncertainty and/or measure effects of restriction.

Conclusion and public health relevance

Our results suggest that the experience of food insecurity during the juvenile-adolescent period impacts cognitive flexibility and responsiveness to negative outcomes in adult male mice and impacts weight gain and high fat food consumption in adult female mice. Our data are consistent with epidemiological studies of human subjects that find relationships between food insecurity and depression, substance abuse, academic outcomes, and obesity. While mice may not fully model human biology, they may help understand the mechanisms that lead to these major public health issues in humans.

Our data also reveal that increased cognitive flexibility can occur in mice with a history of food insecurity when they are tested in more uncertain contexts. This suggests feeding history does not simply stunt the capacity for flexible updating but rather affects how flexible updating is recruited in a context-dependent manner. Drawing from theoretical work, we posit that the flexibility we observe in the context of uncertainty may be a predictive adaptive response to scarcity or adversity.

We hope our study will inform public health decision making and galvanize efforts to provide secure access to food for all children and adolescents. Our data show that feeding history in the juvenile-adolescent period is a major variable that can significantly impact adult weight and behavior. These data suggest feeding programs not only reduce hunger, but also likely affect longer term metabolic and cognitive function.

STAR Methods

Resources availability

Lead contact

Further information and requests for resources should be directed to the Lead Contact, Dr. Linda Wilbrecht (wilbrecht@berkeley.edu).

Materials availability

This study did not generate new unique reagents.

Data and code availability

All data reported in this paper will be shared by the lead contact upon request.

All code used for analysis in this paper will be shared by the lead contact upon request.

Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

Experimental model and subject details

We used C57BL/6N mice from Taconic Biosciences, Inc. All mice used in experiments were born in the animal facility of University of California, Berkeley. We chose to use Taconic mice (C57BL/6N) to avoid a mutation in the metabolism relevant nuclear-encoded, mitochondrial protein Nnt gene. An Nnt mutation can be found in C57BL/6J mice from Jackson laboratory70. Nicholson et al. (2010) found that the C57BL/6J mice with Nnt mutation had higher non-fasting level of glucose in plasma and more severe glucose intolerance compared to C57BL/6N71.

Mice were housed on a 12h/12h reverse light-dark cycle (lights off at 10 AM). Teklad Global 18% Protein Rodent Diet 2918 (Envigo) was used as the standard diet in all feeding experience experiments and regular rearing. All animals received nesting materials and water ad libitum in their home cages. Behavioral testing was conducted after P60 and during the dark cycle period. Animals were assigned to two or three different experimental groups: Ad Libitum (AL), Food Insecurity (FI), and Food Restriction (FR). There were 39 males (AL=16, FI=12, FR=11) and 40 females (AL=14, FI=13, FR=13) used in the weight monitoring experiment. In the 4COF task, 33 males (AL=10, FI=12, FR=11) and 36 females (AL=10, FI=13, FR=13) were used. In the two replications of the 4COF task, a total of 53 male mice (1st replication: AL=6, FI=7, FR=7; 2nd replication: AL=17, FI=16) were used. In the 2ABT task, a total of 32 mice were used (n=8 per group per sex). In the electrophysiology experiment, 10 male mice (AL=5, FI=5) were used. In the fast-scan cyclic voltammetry experiment, a total of 10 male mice were used (AL=5, FI=5). All procedures were approved by the UC Berkeley Animal Care and Use Committee.

Method details

Food insecurity and ad libitum feeding paradigm

Mice were weaned, individually housed, and assigned to 2 or 3 different groups at P21. Mice in the Ad Libitum (AL) group (AL mice) had abundant access to standard rodent chow, while mice in the Food Insecurity (FI) group (FI mice) experienced food scarcity from P21 to P40 that held them at 80-90% of the average weight of the AL mice. In pilot and some control experiments (shown in Figure S1,S2) we also included a food restricted (FR) group, which was fed a constant and restricted amount daily from P21 to P40 to achieve ~85% of the average weight of the AL mice. In the 20-day treatment period, FI mice received variable food delivery with alternating high versus low amounts. The baseline food amount was set at 5.0g per 48 hours. All mice were weighed every two days to track their growth. AL mice weights were used to adjust the 48-hour (2-day) total food amounts, starting from the baseline 5.0g, given to FI and FR mice to keep them at ~85% of the average weight of the AL mice. However, the delivery of this 48h amount to FI mice was varied: the daily amounts of food for Day1 and Day2 in each 48h period of P21-40 followed a ratio of 100%:0%, 80%:20%, or 90%:10% (Day1:Day2) (Figure 1; Figure S1). Note that for FI mice, high and low amounts varied with predictable regularity, but zero food days were rarer and unpredictable. At P41, all FI and FR mice were placed back on ad libitum food, and thereafter feeding was matched among groups. Nesting materials and water were always provided and freely available in their home cages. All behavioral and neurobiological experiments were performed after P60.
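The Day1:Day2 split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and dictionary are hypothetical, the 5.0 g input is the baseline 48-hour amount given in the text, and neither the order in which ratios were used across the treatment period nor the adjustment to AL weights is modeled.

```python
# Day1:Day2 split ratios named in the text, as fractions of the 48-hour total.
SPLIT_RATIOS = {"100:0": (1.0, 0.0), "80:20": (0.8, 0.2), "90:10": (0.9, 0.1)}

def split_48h_amount(total_g, ratio_key):
    """Return (day1_g, day2_g) for a 48-hour food total given in grams."""
    day1_frac, day2_frac = SPLIT_RATIOS[ratio_key]
    # Round to 0.01 g; the weighing precision is an assumption of this sketch.
    return round(total_g * day1_frac, 2), round(total_g * day2_frac, 2)

# Split the baseline 5.0 g allotment under each ratio.
for key in SPLIT_RATIOS:
    print(key, split_48h_amount(5.0, key))
```

In practice the 48-hour total would first be adjusted from the AL group weights before being split.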

High fat food consumption test

To test adult food consumption, adult mice were given access to the high fat food (HFF, Oreo cookies, original chocolate flavor) for 2 hours in each consumption session between 9 am and 1 pm. The cookies were crushed into powder using a hand blender, placed into a plastic cup, and covered with a metal feeder top to prevent spillage. In the HFF consumption test, adult mice were given 3 baseline consumption sessions (session 1-3) with standard rodent chow ad libitum on Day 1-3. On Day 4 and 5, mice were food restricted to 80-90% of ad lib weight. On Day 6, mice were given a consumption session under the food restricted condition. After the 2-hour food-restricted consumption session, mice were returned to food ad libitum for 2 days (Day 6-8). On Day 9, mice were given a consumption session with food ad libitum under the re-sated condition. After the HFF consumption test, mice were given access to HFF for 2 hours on an intermittent consumption schedule on Mondays, Wednesdays, and Fridays for 3 weeks. Mouse body weights were taken before each consumption session. The HFF container was weighed before and after the 2-hour consumption session to measure HFF intake.

4-choice odor-based foraging (4COF) task

The 4-choice odor-based foraging (4COF) task has been described in previously published work35–37. In the task, all mice were first mildly food restricted for two days to reach about 80-90% of ad libitum fed weight. On the next day, mice were habituated to the testing arena, with four ceramic pots each containing a piece of cheerio reward (HoneyNut Cheerios, General Mills), for three 10-min sessions. The testing arena was a 12” x 12” x 9” square maze with four transparent acrylic walls partially dividing the arena into four quadrants. On the following day, mice learned to dig to retrieve a cheerio reward buried in a pot with gradually increasing levels of unscented Aspen wood shavings (Kaytee Products, Inc) over a total of 12 trials. The location of the pot was pseudo-randomly shuffled, allowing each quadrant to be rewarded equally. On the behavioral testing day, mice went through both discrimination and reversal phases in which four pots with scented shavings (with 4 unique odors O1-4) were present at the four corners of the arena. In each trial, mice had a maximum of three minutes to make a choice by digging in one of the four ceramic pots (O1, O2, O3, O4). Once the first bimanual digging choice was made, mice were gently locked into the selected quadrant using a central cylinder. All pots were sham-baited with a cheerio under a mesh screen to control for the odor of cheerio rewards. The location of the scented pots was pseudorandomized across the four quadrants in each trial. The same odor never appeared in the same quadrant in consecutive trials. In the discrimination phase, O1 was rewarded while O2, O3, and O4 were not rewarded. Discrimination learning was considered complete when mice reached a criterion of 8 out of 10 consecutive trials correct. Mice then immediately began the reversal testing phase in the next trial. In the reversal phase, the previously unrewarded odor O2 became rewarded and O1 was no longer rewarded. We also replaced O4 with a novel odor (O4’) to test if a novel odor in the environment was sampled more heavily by any group. Undiluted anise extract (McCormick) was used as odor O1 at 0.02 ml/g of shavings. Essential oils clove and litsea (San Francisco Massage Supply Co) diluted 1:10 in mineral oil were used as O2 and O3, respectively, at 0.02 ml/g of shavings. The thyme odor (O4) was made from Thymol diluted 1:20 in 50% ethanol and was used at 0.01 ml/g of shavings. Essential oil eucalyptus (San Francisco Massage Supply Co) diluted 1:10 in mineral oil was used as the novel odor in the reversal phase (O4’) at 0.02 ml/g of shavings. Choices made by digging, entries into each quadrant, and latency to dig in each trial were recorded.

2-armed bandit task (2ABT)

After completion of the 4COF task, mice were trained in the 2-armed bandit task (2ABT) (starting at ~P110). In this task40, mice were trained to nose poke for a probabilistic water reward whose location alternated periodically at random intervals. Mice were mildly water restricted 1-2 days prior to the training sessions to motivate learning. During the training sessions, mice were placed in an operant chamber with 3 different ports on the same wall. To self-initiate a trial, mice needed to poke their nose into the center initiation port and then indicate a decision by poking one of the two peripheral ports, the left (L) or right (R) port, for probabilistic water reward delivery. White LED lights, indicating a Go cue, were turned on at both peripheral ports when mice poked and held at the center port long enough to initiate a trial. Water reward was only delivered at one peripheral port at a time. Infrared photodiode and phototransistor pairs (Island Motion) were used for detecting port entries and exits. Water reward delivered by water valves (Neptune Research) was calibrated to a constant volume (2 μl) for rewarded choices.

In our version of the 2ABT, there were three training phases. In the first training phase, correct choices were rewarded 75% of the time while incorrect choices were always unrewarded (75% vs 0%). The side of the rewarded port was switched every 15±8 rewarded trials, depending on the total number of rewards delivered in each block. The reward probabilities for the correct choice in the second and third training phases were changed to 90% vs 0% and 65% vs 0%, respectively. Male and female mice were trained for at least 3 sessions in each phase (Phase 1 - 75%: 6 - 10 sessions, Phase 2 - 90%: 3 - 6 sessions, Phase 3 - 65%: 5 - 15 sessions). The total numbers of trials per mouse included in the analysis were 11847±1207 in Phase 1, 5264±272.3 in Phase 2, and 11032±579.2 in Phase 3.

Retrograde labeling and electrophysiology

Male AL and FI mice were unilaterally injected with red retrobeads (100 nl; LumaFluor Inc.) into left NAc core (bregma +1.1 mm, lateral 1.4 mm, ventral −4.4 mm from skull) 2 days before electrophysiology experiments at P61-70. Mice were deeply anaesthetized with pentobarbital (200 mg/kg i.p.; Vortech). After intracardial perfusion with ice-cold artificial cerebrospinal fluid (ACSF), 200 μm coronal midbrain slices were prepared. ACSF solutions contained in mM: 2.5 glucose, 50 sucrose, 125 NaCl, 2.5 KCl, 25 NaHCO3, 1.25 NaH2PO4, 0.1 CaCl2, and 4.9 MgCl2, and oxygenated with 95% O2/5% CO2. After 90 minutes of recovery, slices were transferred to a recording chamber and perfused continuously with oxygenated ACSF containing in mM: 11 glucose, 125 NaCl, 2.5 KCl, 25 NaHCO3,1.25 NaH2PO4, 1.3 MgCl2, and 2.5 CaCl2. Patch pipettes (3.8-4.4 MΩ) were pulled from borosilicate glass (G150TF-4; Warner Instruments) and filled with internal solution containing in mM: 117 CsCH3SO3, 20 HEPES, 0.4 EGTA, 2.8 NaCl, 5 TEA, 4 MgATP, 0.3 NaGTP, 5 QX314, and 0.1 Spermine, pH7.3 (270-285 mOsm). D-AP5 (50 μM) was applied to block NMDA receptors.

Electrophysiological recordings were made at 30-32° C using a MultiClamp700B amplifier and acquired using a Digidata 1440A/1550 digitizer, sampled at 10 kHz, and filtered at 2 kHz. A concentric bipolar stimulating electrode was placed 100-300 μm lateral to the recording electrode, controlled by an ISO-Flex stimulus isolator (A.M.P.I). All data acquisition was performed using pCLAMP software (Molecular Devices). Labeled neurons in the VTA of the midbrain slices were identified by retrobead labeling; the majority of VTA neurons projecting to the NAc core are dopaminergic72.

Fast scan cyclic voltammetry (FSCV)

Dopamine release was monitored using FSCV in acute coronal slices containing striatum38,73. Separate cohorts of male AL and FI mice at P61-70 were anesthetized with isoflurane and decapitated. Following decapitation, the brain was removed. Coronal slices with 275 μm thickness were cut on a vibratome (Leica VT1000S) in ice-cold high Mg2+ ACSF containing in mM: 85 NaCl, 25 NaHCO3, 2.5 KCl, 1.25 NaH2PO4, 0.5 CaCl2, 7 MgCl2, 10 glucose, 65 sucrose, oxygenated with 95% O2/5% CO2. Slices between +1.5 mm and +0.5 mm from bregma containing dorsal striatum and NAc were used for experimentation74. Slices were then placed in ACSF containing in mM: 130 NaCl, 25 NaHCO3, 2.5 KCl, 1.25 NaH2PO4, 2 CaCl2, 2 MgCl2, 10 glucose at room temperature during 1 hour recovery and at 32° C in recording chamber.

Striatal DA release following electrical stimulation with a bipolar concentric stimulating electrode (2 ms, 600 μA) was monitored with fast scan cyclic voltammetry at carbon-fiber microelectrodes (CFMs). Electrical stimulation was controlled by an ISO-Flex stimulus isolator (A.M.P.I.) and was delivered out of phase with voltammetric scans. A triangular waveform was applied to CFMs scanning from −0.7V to +1.3V and back, against the Ag/AgCl reference electrode, at a rate of 800 V/s.

CFMs were approximately 100 μm away from the stimulating electrode. Evoked dopamine transients were sampled at 8 Hz, and data were acquired at 50 kHz using AxoScope 10.2 (Molecular Devices).

Electrical stimulation was delivered in the following sequence: single pulse, pulse train of 4 pulses at 100 Hz, and single pulse. Each pulse or pulse train was delivered 2.5 minutes apart. Slices from the different treatment groups were recorded with the same CFM for every treatment pair. There were two release events per recording site per slice for single-pulse data, while 4-pulse data consisted of one release event per recording site per slice. Sampling subregions included dorsomedial striatum (DMS), dorsocentral striatum (DCS), dorsolateral striatum (DLS), central striatum (CS), ventrolateral striatum (VLS), ventromedial striatum (VMS), and nucleus accumbens core (NAc).

Quantification and statistical analysis

Group values were reported as mean (M) ± standard error of mean (SEM) or median (MED). Data were tested for normality and then analyzed using two-tailed t-tests or ANOVAs with post-hoc analysis. Data with non-normal distribution were analyzed with nonparametric tests. GraphPad Prism 7 was used for statistical analysis. MATLAB 2016a was used for reinforcement learning model fitting and simulation and logistic regression modeling and analysis.

RL modeling of the 4COF task

Reinforcement learning (RL) models were used to further examine the impact of different juvenile-adolescent developmental feeding experiences on latent processes underlying learning, updating, and decision making. Classic RL algorithms assume that subjects learn information in the environment by updating their value estimates of different cues and/or actions (options) incrementally through iterative trial-and-error processes75,76. The RL models use prediction error (δ) to update the estimated expected value (Q) of each available option (i.e. the 4 different odors in each phase of the 4COF task), where the prediction error (δ) is the difference between the current feedback value (λ) obtained from the outcome and the expected value of action a, Q(a).

In our RL models for the 4COF task, the feedback value (λ) was set as 100 for rewarded choices and set as 0 for unrewarded choices. The value updating from the prediction error (δ) was scaled by a learning rate parameter (α), with 0 ≤ α ≤ 1 (Equation 1).

$$Q_t(a) = Q_{t-1}(a) + \alpha \times \delta_t(a), \qquad \delta_t(a) = \lambda - Q_{t-1}(a)$$ (Equation 1)
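The update in Equation 1 can be sketched as a single function. This is a minimal illustration, not the analysis code used in the study (which was written in MATLAB); the function name is hypothetical and the learning rate of 0.5 in the usage lines is an arbitrary example value. The starting values are the initial Q values reported for the discrimination phase.

```python
def update_q(q_values, action, feedback, alpha):
    """Return a copy of q_values after one delta-rule update of `action`.

    feedback is lambda in Equation 1 (100 for rewarded, 0 for unrewarded
    choices in the 4COF models); alpha is the learning rate in [0, 1].
    """
    delta = feedback - q_values[action]          # prediction error
    updated = dict(q_values)                     # leave the input untouched
    updated[action] = q_values[action] + alpha * delta
    return updated

# Initial values reported for the discrimination phase.
q = {"O1": 35.48, "O2": 12.86, "O3": 1.19, "O4": 50.48}
q = update_q(q, "O1", feedback=100.0, alpha=0.5)  # alpha chosen for illustration
print(q["O1"])
```

Only the chosen option is updated on each trial; the values of the unchosen options carry over unchanged.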

The action probability P(a), or the relative probability of selecting each action, was calculated for each trial by transforming the expected value of action a, Q(a), to a relative probability using the softmax function76. The inverse temperature parameter (β) in the function indicates the stochasticity of the action selection policy, with 0 ≤ β ≤ 1 (Equation 2).

$$P_t(a) = \frac{e^{\beta \times Q_t(a)}}{\sum_{a'} e^{\beta \times Q_t(a')}}$$ (Equation 2)
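As a hedged illustration, the softmax transformation can be written as below, assuming the standard form in which each exponentiated, β-scaled value is normalized by the sum over all available options. The function name is hypothetical, and subtracting the maximum value before exponentiating is a numerical-stability choice of this sketch that does not change the resulting probabilities.

```python
import math

def softmax_policy(q_values, beta):
    """Map expected values to choice probabilities with inverse temperature beta."""
    max_q = max(q_values.values())
    # Shift by the maximum so exp() never overflows; the ratio is unchanged.
    weights = {a: math.exp(beta * (q - max_q)) for a, q in q_values.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}

# beta = 0 yields uniform choice; larger beta concentrates choice on the best option.
probs = softmax_policy({"O1": 35.48, "O2": 12.86, "O3": 1.19, "O4": 50.48}, beta=0.1)
print(probs)
```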

We fit the discrimination and learning behavioral data with several alternative classic RL models (Table S1). The basic RL model (RL2) had one α and one β parameter, assuming the agent had the same learning rate and action selection policy in both phases of the task. We also set up RL models (RL3, RL4) with two separate learning rate parameters, αdis and αrev, for the discrimination and reversal phases, respectively. In RL3, there was one inverse temperature, β, assuming the mice had the same selection strategy in both phases of the 4COF task. In RL4, we had separate inverse temperature parameters, βdis and βrev, for the discrimination and reversal phases, respectively, to examine if the learned experience of the task structure in the discrimination phase changed the learning rate and selection strategy in the reversal phase. It is possible that different mechanisms support learning from rewarded and unrewarded outcomes; we therefore also set up an RL model (RL5) with separate learning rates for positive and negative prediction errors, αrevpos and αrevneg respectively, in the reversal phase.

Based on the error type analysis in the reversal phase, we also considered another alternative family of RL models, adding a single sticky parameter, st, in both phases of the task (Table S1). The sticky parameter st acts at the level of transforming expected value Q(a) to action probability P(a), with 0 ≤ st ≤ 1 (Equations 3–4).

When sticky parameter st equals one, the estimated action value of previously selected action a applied to the softmax function U(a) is increased by a hundred (Equation 3), suggesting agents will be more likely to choose the same previous action in the current trial (Equation 4).

$$U_t = Q_t, \qquad U_t(a) = Q_t(a) + 100 \times st$$ (Equation 3)
$$P_t(a) = \frac{e^{\beta \times U_t(a)}}{\sum_{a'} e^{\beta \times U_t(a')}}$$ (Equation 4)
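A minimal sketch of the sticky variant follows, assuming that the bonus of 100 × st is added only to the previously selected action before a standard softmax is applied. The function and argument names are hypothetical, and the parameter values in the usage lines are arbitrary examples.

```python
import math

def sticky_action_probs(q_values, previous_action, beta, stickiness, bonus=100.0):
    """Softmax over U, where U equals Q plus a bonus on the previous action.

    bonus is 100 for the 4COF models (feedback scaled 0-100); pass bonus=1.0
    for a task in which feedback is scaled 0-1.
    """
    u = dict(q_values)
    if previous_action is not None:              # no bonus on the first trial
        u[previous_action] += bonus * stickiness
    max_u = max(u.values())                      # shift for numerical stability
    weights = {a: math.exp(beta * (v - max_u)) for a, v in u.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}

# With equal Q values, any stickiness > 0 favors repeating the previous choice.
print(sticky_action_probs({"O1": 20.0, "O2": 20.0}, "O1", beta=0.1, stickiness=0.2))
```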

The RL model with the lowest Akaike Information Criterion (AIC) score was selected as the current working RL model; this was RL5, composed of 5 parameters: αdis, βdis, αrevpos, αrevneg, and βrev. In the 4COF task, there were 5 odors in total. We assumed that mice began the task with subjective values for each of the 5 odors (O1, O2, O3, O4, O4’) that reflected innate preferences. We calculated the percentage of choosing each odor in the first 4 trials of the discrimination phase for each mouse and used the averages of these percentages as the initial values for each odor option or associated action. Similar methods for identifying initial values have been published77. The initial values were set to Q(O1)=35.48, Q(O2)=12.86, Q(O3)=1.19, and Q(O4)=50.48. In the reversal phase, the initial values for O1, O2, and O3 were the same as the values in the very last trial of the discrimination phase, and the value for O4’ was calculated using the same method as described above (Q(O4’)=0.7).
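Model comparison by AIC can be illustrated with the standard definition, AIC = 2k − 2 ln L, where k is the number of free parameters and ln L is the maximized log-likelihood. The sketch below is not the authors' fitting code; the function names, model labels, and log-likelihood values are hypothetical placeholders chosen only to show the comparison.

```python
def aic(num_params, log_likelihood):
    """Akaike Information Criterion: lower values indicate a better trade-off."""
    return 2 * num_params - 2 * log_likelihood

def best_model(candidates):
    """candidates maps a model name to (num_params, log_likelihood)."""
    return min(candidates, key=lambda name: aic(*candidates[name]))

# Hypothetical placeholder fits, not values from the study.
fits = {"model_A": (2, -110.0), "model_B": (5, -100.0)}
print(best_model(fits))
```

The penalty term means that an extra parameter is only favored when it improves the log-likelihood by more than one unit.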

RL modeling of the 2ABT

Four different standard RL models with different numbers of parameters were employed and compared for the 2ABT behavioral data (Table S1). The RL1α1β model had one learning rate that accounted for all outcomes, while the RL2α1β model separated the learning rate for rewarded (αpos) and unrewarded (αneg) outcomes. The feedback value (λ) was similarly set to 1 for rewarded choices and 0 for unrewarded choices in the 2ABT. The learning rate parameter (α) was constrained with 0 ≤ α ≤ 1 (Equation 1). The inverse temperature parameter (β) was constrained with 0 ≤ β ≤ 10 in this task (Equation 2). As suggested by the logistic regression results (Figure S5), we also added a single sticky parameter, st, to the models (RL1α1β1s and RL2α1β1s) for model comparison (Figure S4). The st parameter again acts at the level of transforming expected value Q(a) to action probability P(a), with 0 ≤ st ≤ 1 (Equations 3–4), but with a slight modification to Equation 3, which is shown as Equation 3.1. The initial values for the two options (L-port and R-port) were set to 0.5.

$$U_t = Q_t, \qquad U_t(a) = Q_t(a) + 1 \times st$$ (Equation 3.1)
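The separate learning rates for rewarded and unrewarded outcomes in the 2ABT models can be sketched as follows. This is an illustrative reading in which the learning rate is selected by the outcome of the trial; the function name and the example parameter values are hypothetical, while the feedback coding (1 or 0) and the initial values of 0.5 follow the text.

```python
def update_two_rate(q_values, choice, rewarded, alpha_pos, alpha_neg):
    """One update with outcome-specific learning rates (feedback is 1 or 0)."""
    feedback = 1.0 if rewarded else 0.0
    alpha = alpha_pos if rewarded else alpha_neg   # rate depends on the outcome
    updated = dict(q_values)
    updated[choice] = q_values[choice] + alpha * (feedback - q_values[choice])
    return updated

q = {"L": 0.5, "R": 0.5}                           # initial values from the text
q = update_two_rate(q, "L", rewarded=False, alpha_pos=0.6, alpha_neg=0.4)
print(q)
```

A larger alpha_neg lowers the value of an unrewarded port more quickly, which is one way a model can produce faster switching after a block change.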

Logistic regression analysis of the 2ABT

We also employed the multivariate logistic regression model analysis40 to analyze the behavioral data from the 2ABT. The logistic regression model (Equation 5) can be used to determine the relative contribution of past rewarded and unrewarded outcomes on a trial-by-trial basis to predict upcoming choice behavior.

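$$\log\left(\frac{P_L(i)}{1 - P_L(i)}\right) = \sum_{j=1}^{n} W_j^{Rewarded}\left(Y_L(i-j) - Y_R(i-j)\right) + \sum_{j=1}^{n} W_j^{Unrewarded}\left(N_L(i-j) - N_R(i-j)\right) + W_0$$ (Equation 5)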

PL(i) is the probability of choosing the L port. The variables YL or YR indicate if a water reward is received (1) or not received (0) at the L or R port, respectively, while NL or NR indicate the absence of water reward (1 or 0) at either the selected L or R port, respectively. i indicates that the event happened in the i-th trial. The variable n represents the number of trials in the past that were included in the model (n=4). The regression coefficients WRewarded and WUnrewarded represent the contribution of past rewarded history and past unrewarded history, respectively, and W0 represents the intrinsic bias of choosing the L or R port of the animal.
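The predictors in Equation 5 can be assembled from the choice and reward history as sketched below. The function name and input encoding ('L'/'R' strings and 0/1 rewards) are assumptions of this example; fitting the weights would then be done with any standard logistic regression routine, which is not shown.

```python
def history_predictors(choices, rewards, n_back=4):
    """Build logistic-regression rows from past trials.

    choices: list of 'L' or 'R'; rewards: list of 1 (rewarded) or 0.
    Each row holds n_back rewarded terms (Y_L - Y_R) followed by n_back
    unrewarded terms (N_L - N_R), ordered from 1 to n_back trials back.
    Targets are 1 when the current choice is 'L', else 0.
    """
    rows, targets = [], []
    for i in range(n_back, len(choices)):
        rewarded_terms, unrewarded_terms = [], []
        for j in range(1, n_back + 1):
            side = 1 if choices[i - j] == "L" else -1   # +1 for L, -1 for R
            rewarded_terms.append(side if rewards[i - j] == 1 else 0)
            unrewarded_terms.append(side if rewards[i - j] == 0 else 0)
        rows.append(rewarded_terms + unrewarded_terms)
        targets.append(1 if choices[i] == "L" else 0)
    return rows, targets

X, y = history_predictors(["L", "R", "L", "L", "R"], [1, 0, 1, 0, 1])
print(X, y)
```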

Electrophysiology data analysis

The AMPAR/NMDAR ratio at +40 mV was calculated from the average excitatory postsynaptic currents (EPSCs) recorded before and after application of D-AP5, where NMDAR-EPSCs were obtained by digitally subtracting the average EPSC recorded with D-AP5 from the average EPSC recorded without it72. The rectification index (RI) was calculated by plotting average EPSCs at −70, −50, 0, +20, and +40 mV and taking the ratio of the slopes between currents (I) at different potentials (V) using the formula shown below (Equation 6)78,79.

$$RI = \left\{\frac{I_{+40} - I_0}{I_0 - I_{-70}}\right\} \times \frac{7}{4}$$ (Equation 6)
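Both calculations can be sketched in a few lines. Two assumptions are made here and should be read as such: the scaling factor in Equation 6 is taken to be 7/4, the ratio of the 70 mV and 40 mV voltage steps, so that RI is the ratio of the two I-V slopes; and the AMPAR/NMDAR ratio is computed from peak amplitudes of the averaged traces, which the text does not specify. Function names and the example traces are hypothetical.

```python
def rectification_index(i_plus40, i_zero, i_minus70):
    """Slope of the I-V relation from 0 to +40 mV over the slope from -70 to 0 mV."""
    return (i_plus40 - i_zero) / (i_zero - i_minus70) * 7.0 / 4.0

def ampar_nmdar_ratio(epsc_control, epsc_with_ap5):
    """Ratio of peak AMPAR to peak NMDAR current at +40 mV.

    epsc_control: averaged trace before D-AP5 (AMPAR + NMDAR components).
    epsc_with_ap5: averaged trace in D-AP5 (AMPAR component only).
    Currents at +40 mV are outward, so peaks are taken as maxima (assumption).
    """
    if len(epsc_control) != len(epsc_with_ap5):
        raise ValueError("traces must have the same number of samples")
    nmdar = [c - a for c, a in zip(epsc_control, epsc_with_ap5)]  # digital subtraction
    return max(epsc_with_ap5) / max(nmdar)

# A perfectly linear I-V relation gives an index of 1.
print(rectification_index(80.0, 0.0, -140.0))
# Hypothetical toy traces in arbitrary units.
print(ampar_nmdar_ratio([0.0, 10.0, 6.0], [0.0, 6.0, 2.0]))
```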

Fast scan cyclic voltammetry data analysis

Two recording sites within the NAc core were averaged together for analysis. FSCV data were first processed using the AxoScope 10.2 software and analyzed using Excel and GraphPad Prism. Peak-evoked dopamine release levels were compared. Peak [DA]o by 1p stimulation was calculated from n= 17-32 transients per site from 5 mice per treatment group. Peak [DA]o by a 4p train 100 Hz stimulation per subregion was calculated from n= 9-16 transients per site from 5 mice per treatment group.
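For readers who want to reproduce this kind of paired comparison outside Prism, the sketch below computes the 4p/1p ratio and the paired t statistic using only the Python standard library. Function names are hypothetical and the numbers in the usage lines are made-up placeholders, not data from this study; the p-value would come from a t distribution with the returned degrees of freedom.

```python
import math
import statistics

def burst_ratio(peak_4p, peak_1p):
    """Ratio of peak [DA]o evoked by the 4p train to that evoked by 1p."""
    return peak_4p / peak_1p

def paired_t(group_a, group_b):
    """Return (t, degrees_of_freedom) for matched values from two groups."""
    if len(group_a) != len(group_b) or len(group_a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(group_a, group_b)]
    mean_diff = statistics.mean(diffs)
    sem_diff = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff / sem_diff, len(diffs) - 1

# Placeholder values only: pairs recorded with the same electrode on the same day.
t_stat, dof = paired_t([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
print(t_stat, dof)
```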

Supplementary Material

Supplemental Material

Key resources table

REAGENT or RESOURCE | SOURCE | IDENTIFIER
Chemicals, peptides, and recombinant proteins
D-AP5 | Hello Bio | HB0225
Experimental models: Organisms/strains
Mouse: C57BL/6 | Taconic Biosciences | C57BL/6NTac

Acknowledgements

We thank Anne Collins for discussion of modeling. We thank Michael McDannald, Kristen Delevich, and members of the Wilbrecht lab for discussion. We thank Amy Zuo, Alagia Cirolia, Becky Lee, and Aishwarya Pattnaik for assistance with experiments, data analysis, and modeling. This work was supported by National Institutes of Health (NIH) R21 AA025172 (to L.W.), a seed grant from the Robert Wood Johnson Foundation, Health & Society Scholars Program (to E.G.), and NIH U19 1U19NS113201 (to L.W.). H.S.B. is a Chan Zuckerberg Biohub Investigator and a Weill Neurohub Investigator.

Footnotes

Declaration of interests

The authors declare no competing interests.

References

1. Cook JT, and Frank DA (2008). Food security, poverty, and human development in the United States. Ann N Y Acad Sci 1136, 193–209. 10.1196/annals.1425.001.
2. Coleman-Jensen A, Nord M, and Singh A (2013). Household Food Security in the United States in 2012. USDA Economic Research Report. http://ers.usda.gov/media/1183208/err-155.pdf.
3. Coleman-Jensen A, Rabbitt MP, Gregory CA, and Singh A (2019). Household food insecurity in the United States in 2018. USDA Economic Research Report. https://www.ers.usda.gov/webdocs/publications/94849/err-270.pdf?v=7256.
4. FAO, IFAD, UNICEF, WFP, and WHO (2021). The State of Food Security and Nutrition in the World 2021. Transforming food systems for food security, improved nutrition and affordable healthy diets for all (FAO, IFAD, UNICEF, WFP, WHO). 10.4060/cb4474en.
5. Burke MP, Martini LH, Cayir E, Hartline-Grafton HL, and Meade RL (2016). Severity of Household Food Insecurity Is Positively Associated with Mental Disorders among Children and Adolescents in the United States. J Nutr 146, 2019–2026. 10.3945/jn.116.232298.
6. Poole-Di Salvo E, Silver EJ, and Stein RE (2016). Household Food Insecurity and Mental Health Problems Among Adolescents: What Do Parents Report? Acad Pediatr 16, 90–96. 10.1016/j.acap.2015.08.005.
7. Weigel MM, and Armijos RX (2018). Household Food Insecurity and Psychosocial Dysfunction in Ecuadorian Elementary Schoolchildren. Int J Pediatr 2018, 6067283. 10.1155/2018/6067283.
8. Kotchick BA, Whitsett D, and Sherman MF (2021). Food Insecurity and Adolescent Psychosocial Adjustment: Indirect Pathways through Caregiver Adjustment and Caregiver-Adolescent Relationship Quality. J Youth Adolesc 50, 89–102. 10.1007/s10964-020-01322-x.
9. Rani D, Singh JK, Acharya D, Paudel R, Lee K, and Singh SP (2018). Household Food Insecurity and Mental Health Among Teenage Girls Living in Urban Slums in Varanasi, India: A Cross-Sectional Study. Int J Environ Res Public Health 15. 10.3390/ijerph15081585.
10. Kimbro RT, and Denney JT (2015). Transitions Into Food Insecurity Associated With Behavioral Problems And Worse Overall Health Among Children. Health Aff (Millwood) 34, 1949–1955. 10.1377/hlthaff.2015.0626.
11. Jackson DB, and Vaughn MG (2017). Household food insecurity during childhood and adolescent misconduct. Prev Med 96, 113–117. 10.1016/j.ypmed.2016.12.042.
12. Slopen N, Fitzmaurice G, Williams DR, and Gilman SE (2010). Poverty, food insecurity, and the behavior for childhood internalizing and externalizing disorders. J Am Acad Child Adolesc Psychiatry 49, 444–452. 10.1097/00004583-201005000-00005.
13. Jackson DB, Newsome J, Vaughn MG, and Johnson KR (2018). Considering the role of food insecurity in low self-control and early delinquency. J Crim Just 56, 127–139. 10.1016/j.jcrimjus.2017.07.002.
14. Howard LL (2011). Does food insecurity at home affect non-cognitive performance at school? A longitudinal analysis of elementary student classroom behavior. Economics of Education Review 30, 157–176. 10.1016/j.econedurev.2010.08.003.
15. Belsky DW, Moffitt TE, Arseneault L, Melchior M, and Caspi A (2010). Context and sequelae of food insecurity in children’s development. Am J Epidemiol 172, 809–818. 10.1093/aje/kwq201.
16. Aurino E, Fledderjohann J, and Vellakkal S (2019). Inequalities in adolescent learning: Does the timing and persistence of food insecurity at home matter? Economics of Education Review 70, 94–108. 10.1016/j.econedurev.2019.03.003.
17. Winicki J, and Jemison K (2003). Food insecurity and hunger in the kindergarten classroom: Its effect on learning and growth. Contemp Econ Policy 21, 145–157. 10.1093/cep/byg001.
18. Dennison MJ, Rosen ML, Sambrook KA, Jenness JL, Sheridan MA, and McLaughlin KA (2019). Differential Associations of Distinct Forms of Childhood Adversity With Neurobehavioral Measures of Reward Processing: A Developmental Pathway to Depression. Child Dev 90, e96–e113. 10.1111/cdev.13011.
19. Whitaker RC, Phillips SM, and Orzol SM (2006). Food insecurity and the risks of depression and anxiety in mothers and behavior problems in their preschool-aged children. Pediatrics 118, e859–868. 10.1542/peds.2006-0239.
20. Melchior M, Caspi A, Howard LM, Ambler AP, Bolton H, Mountain N, and Moffitt TE (2009). Mental health context of food insecurity: a representative cohort of families with young children. Pediatrics 124, e564–572. 10.1542/peds.2009-0583.
21. Nettle D, and Bateson M (2015). Adaptive developmental plasticity: what is it, how can we recognize it and when can it evolve? Proc Biol Sci 282, 20151005. 10.1098/rspb.2015.1005.
22. Lin WC, Delevich K, and Wilbrecht L (2020). A role for adaptive developmental plasticity in learning and decision making. Current Opinion in Behavioral Sciences 36, 48–54. 10.1016/j.cobeha.2020.07.010.
23. Biro PA, and Stamps JA (2008). Are animal personality traits linked to life-history productivity? Trends Ecol Evol 23, 361–368. 10.1016/j.tree.2008.04.003.
24. Bateson P, Gluckman P, and Hanson M (2014). The biology of developmental plasticity and the Predictive Adaptive Response hypothesis. J Physiol 592, 2357–2368. 10.1113/jphysiol.2014.271460.
25. Lea AJ, Tung J, Archie EA, and Alberts SC (2017). Developmental plasticity: Bridging research in evolution and human health. Evol Med Public Health 2017, 162–175. 10.1093/emph/eox019.
26. Lafuente E, and Beldade P (2019). Genomics of Developmental Plasticity in Animals. Front Genet 10, 720. 10.3389/fgene.2019.00720.
27. Ellis BJ, Figueredo AJ, Brumbach BH, and Schlomer GL (2009). Fundamental Dimensions of Environmental Risk: The Impact of Harsh versus Unpredictable Environments on the Evolution and Development of Life History Strategies. Hum Nat 20, 204–268. 10.1007/s12110-009-9063-7.
28. Bloxham L, Bateson M, Bedford T, Brilot B, and Nettle D (2014). The memory of hunger: developmental plasticity of dietary selectivity in the European starling, Sturnus vulgaris. Animal Behaviour 91, 33–40. 10.1016/j.anbehav.2014.02.025.
29. Smallegange IM (2011). Complex environmental effects on the expression of alternative reproductive phenotypes in the bulb mite. Evolutionary Ecology 25, 857–873. 10.1007/s10682-010-9446-6.
30. Frankenhuis WE, and Walasek N (2020). Modeling the evolution of sensitive periods. Dev Cogn Neurosci 41, 100715. 10.1016/j.dcn.2019.100715.
31. Mittal C, Griskevicius V, Simpson JA, Sung S, and Young ES (2015). Cognitive adaptations to stressful environments: When childhood adversity enhances adult executive function. J Pers Soc Psychol 109, 604–621. 10.1037/pspi0000028.
32. Franklin B, Jones A, Love D, Puckett S, Macklin J, and White-Means S (2012). Exploring mediators of food insecurity and obesity: a review of recent literature. J Community Health 37, 253–264. 10.1007/s10900-011-9420-4.
33. Dinour LM, Bergen D, and Yeh MC (2007). The food insecurity-obesity paradox: a review of the literature and the role food stamps may play. J Am Diet Assoc 107, 1952–1961. 10.1016/j.jada.2007.08.006.
34. Davis CR, Dearing E, Usher N, Trifiletti S, Zaichenko L, Ollen E, Brinkoetter MT, Crowell-Doom C, Joung K, Park KH, et al. (2014). Detailed assessments of childhood adversity enhance prediction of central obesity independent of gender, race, adult psychosocial risk and health behaviors. Metabolism 63, 199–206. 10.1016/j.metabol.2013.08.013.
35. Johnson C, and Wilbrecht L (2011). Juvenile mice show greater flexibility in multiple choice reversal learning than adults. Dev Cogn Neurosci 1, 540–551. 10.1016/j.dcn.2011.05.008.
36. Thomas AW, Caporale N, Wu C, and Wilbrecht L (2016). Early maternal separation impacts cognitive flexibility at the age of first independence in mice. Dev Cogn Neurosci 18, 49–56. 10.1016/j.dcn.2015.09.005.
37. Vandenberg A, Lin WC, Tai LH, Ron D, and Wilbrecht L (2018). Mice engineered to mimic a common Val66Met polymorphism in the BDNF gene show greater sensitivity to reversal in environmental contingencies. Dev Cogn Neurosci 34, 34–41. 10.1016/j.dcn.2018.05.009.
38. Kosillo P, Doig NM, Ahmed KM, Agopyan-Miu A, Wong CD, Conyers L, Threlfell S, Magill PJ, and Bateup HS (2019). Tsc1-mTORC1 signaling controls striatal dopamine release and cognitive flexibility. Nat Commun 10, 5426. 10.1038/s41467-019-13396-8.
39. Akaike H (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control 19, 716–723. 10.1109/TAC.1974.1100705.
40. Tai LH, Lee AM, Benavidez N, Bonci A, and Wilbrecht L (2012). Transient stimulation of distinct subpopulations of striatal neurons mimics changes in action value. Nat Neurosci 15, 1281–1289. 10.1038/nn.3188.
41. Hochberg Z, and Belsky J (2013). Evo-devo of human adolescence: beyond disease models of early puberty. BMC Med 11, 113. 10.1186/1741-7015-11-113.
42. Roff DA (2002). Life history evolution (Sinauer Associates).
43. Belachew T, Hadley C, Lindstrom D, Gebremariam A, Lachat C, and Kolsteren P (2011). Food insecurity, school absenteeism and educational attainment of adolescents in Jimma Zone Southwest Ethiopia: a longitudinal study. Nutr J 10, 29. 10.1186/1475-2891-10-29.
44. Jyoti DF, Frongillo EA, and Jones SJ (2005). Food insecurity affects school children’s academic performance, weight gain, and social skills. J Nutr 135, 2831–2839. 10.1093/jn/135.12.2831.
45. Raskind IG, Haardorfer R, and Berg CJ (2019). Food insecurity, psychosocial health and academic performance among college and university students in Georgia, USA. Public Health Nutr 22, 476–485. 10.1017/S1368980018003439.
46. Amitai N, Young JW, Higa K, Sharp RF, Geyer MA, and Powell SB (2014). Isolation rearing effects on probabilistic learning and cognitive flexibility in rats. Cogn Affect Behav Neurosci 14, 388–406. 10.3758/s13415-013-0204-4.
47. Goodwill HL, Manzano-Nieves G, LaChance P, Teramoto S, Lin S, Lopez C, Stevenson RJ, Theyel BB, Moore CI, Connors BW, and Bath KG (2018). Early Life Stress Drives Sex-Selective Impairment in Reversal Learning by Affecting Parvalbumin Interneurons in Orbitofrontal Cortex of Mice. Cell Rep 25, 2299–2307 e2294. 10.1016/j.celrep.2018.11.010.
48. Harms MB, Shannon Bowen KE, Hanson JL, and Pollak SD (2018). Instrumental learning and cognitive flexibility processes are impaired in children exposed to early life stress. Dev Sci 21, e12596. 10.1111/desc.12596.
49. Hurtubise JL, and Howland JG (2017). Effects of stress on behavioral flexibility in rodents. Neuroscience 345, 176–192. 10.1016/j.neuroscience.2016.04.007.
50. Wang L, Jiao J, and Dulawa SC (2011). Infant maternal separation impairs adult cognitive performance in BALB/cJ mice. Psychopharmacology (Berl) 216, 207–218. 10.1007/s00213-011-2209-4.
51. Eckstein MK, Master SL, Dahl RE, Wilbrecht L, and Collins AGE (2022). Reinforcement learning and Bayesian inference provide complementary models for the unique advantage of adolescents in stochastic reversal. Dev Cogn Neurosci 55. 10.1016/j.dcn.2022.101106.
52. Eckstein MK, Wilbrecht L, and Collins AGE (2021). What do reinforcement learning models measure? Interpreting model parameters in cognition and neuroscience. Current Opinion in Behavioral Sciences 41, 128–137. 10.1016/j.cobeha.2021.06.004.
53. Gershman SJ, and Uchida N (2019). Believing in dopamine. Nat Rev Neurosci 20, 703–714. 10.1038/s41583-019-0220-7.
54. Starkweather CK, Gershman SJ, and Uchida N (2018). The Medial Prefrontal Cortex Shapes Dopamine Reward Prediction Errors under State Uncertainty. Neuron 98, 616–629 e616. 10.1016/j.neuron.2018.03.036.
55. Glimcher PW (2011). Understanding dopamine and reinforcement learning: the dopamine reward prediction error hypothesis. Proc Natl Acad Sci U S A 108 Suppl 3, 15647–15654. 10.1073/pnas.1014269108.
56. Schultz W (1997). Dopamine neurons and their role in reward mechanisms. Curr Opin Neurobiol 7, 191–197. 10.1016/s0959-4388(97)80007-4.
57. Fiorillo CD, Tobler PN, and Schultz W (2003). Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299, 1898–1902. 10.1126/science.1077349.
58. de Lafuente V, and Romo R (2011). Dopamine neurons code subjective sensory experience and uncertainty of perceptual decisions. Proc Natl Acad Sci U S A 108, 19767–19771. 10.1073/pnas.1117636108.
59. Tennyson SS, Brockett AT, Hricz NW, Bryden DW, and Roesch MR (2018). Firing of Putative Dopamine Neurons in Ventral Tegmental Area Is Modulated by Probability of Success during Performance of a Stop-Change Task. eNeuro 5. 10.1523/ENEURO.0007-18.2018.
60. Luo SX, Timbang L, Kim JI, Shang Y, Sandoval K, Tang AA, Whistler JL, Ding JB, and Huang EJ (2016). TGF-beta Signaling in Dopaminergic Neurons Regulates Dendritic Growth, Excitatory-Inhibitory Synaptic Balance, and Reversal Learning. Cell Rep 17, 3233–3245. 10.1016/j.celrep.2016.11.068.
61. Pan Y, Chau L, Liu S, Avshalumov MV, Rice ME, and Carr KD (2011). A food restriction protocol that increases drug reward decreases tropomyosin receptor kinase B in the ventral tegmental area, with no effect on brain-derived neurotrophic factor or tropomyosin receptor kinase B protein levels in dopaminergic forebrain regions. Neuroscience 197, 330–338. 10.1016/j.neuroscience.2011.08.065.
62. Avena NM, Rada P, and Hoebel BG (2008). Underweight rats have enhanced dopamine release and blunted acetylcholine response in the nucleus accumbens while bingeing on sucrose. Neuroscience 156, 865–871. 10.1016/j.neuroscience.2008.08.017.
63. Avena NM, Murray S, and Gold MS (2013). Comparing the effects of food restriction and overeating on brain reward systems. Exp Gerontol 48, 1062–1067. 10.1016/j.exger.2013.03.006.
64. Bassareo V, and Di Chiara G (1999). Modulation of feeding-induced activation of mesolimbic dopamine transmission by appetitive stimuli and its relation to motivational state. Eur J Neurosci 11, 4389–4397. 10.1046/j.1460-9568.1999.00843.x.
65. Brown HD, McCutcheon JE, Cone JJ, Ragozzino ME, and Roitman MF (2011). Primary food reward and reward-predictive stimuli evoke different patterns of phasic dopamine signaling throughout the striatum. Eur J Neurosci 34, 1997–2006. 10.1111/j.1460-9568.2011.07914.x.
66. Pothos EN, Creese I, and Hoebel BG (1995). Restricted eating with weight loss selectively decreases extracellular dopamine in the nucleus accumbens and alters dopamine response to amphetamine, morphine, and food intake. J Neurosci 15, 6640–6650.
67. de Lartigue G, and McDougle M (2019). Dorsal striatum dopamine oscillations: Setting the pace of food anticipatory activity. Acta Physiol (Oxf) 225, e13152. 10.1111/apha.13152.
68. Burke MV, and Small DM (2016). Effects of the modern food environment on striatal function, cognition and regulation of ingestive behavior. Curr Opin Behav Sci 9, 97–105. 10.1016/j.cobeha.2016.02.036.
69. Fritz BM, Munoz B, Yin F, Bauchle C, and Atwood BK (2018). A High-fat, High-sugar ‘Western’ Diet Alters Dorsal Striatal Glutamate, Opioid, and Dopamine Transmission in Mice. Neuroscience 372, 1–15. 10.1016/j.neuroscience.2017.12.036.
70. Toye AA, Lippiat JD, Proks P, Shimomura K, Bentley L, Hugill A, Mijat V, Goldsworthy M, Moir L, Haynes A, et al. (2005). A genetic and physiological study of impaired glucose homeostasis control in C57BL/6J mice. Diabetologia 48, 675–686. 10.1007/s00125-005-1680-z.
71. Nicholson A, Reifsnyder PC, Malcolm RD, Lucas CA, MacGregor GR, Zhang W, and Leiter EH (2010). Diet-induced obesity in two C57BL/6 substrains with intact or mutant nicotinamide nucleotide transhydrogenase (Nnt) gene. Obesity (Silver Spring) 18, 1902–1905. 10.1038/oby.2009.477.
72. Lammel S, Ion DI, Roeper J, and Malenka RC (2011). Projection-specific modulation of dopamine neuron synapses by aversive and rewarding stimuli. Neuron 70, 855–862. 10.1016/j.neuron.2011.03.025.
73. Threlfell S, Lalic T, Platt NJ, Jennings KA, Deisseroth K, and Cragg SJ (2012). Striatal dopamine release is triggered by synchronized activity in cholinergic interneurons. Neuron 75, 58–64. 10.1016/j.neuron.2012.04.038.
74. Paxinos G, and Franklin KBJ (2008). The mouse brain in stereotaxic coordinates (Elsevier Science).
75. Rescorla RA (1976). Stimulus generalization: some predictions from a model of Pavlovian conditioning. J Exp Psychol Anim Behav Process 2, 88–96. 10.1037//0097-7403.2.1.88.
76. Sutton RS, and Barto AG (2018). Reinforcement learning: An introduction (The MIT Press).
77. Johnson CM, Peckler H, Tai LH, and Wilbrecht L (2016). Rule learning enhances structural plasticity of long-range axons in frontal cortex. Nat Commun 7, 10785. 10.1038/ncomms10785.
78. Panicker S, Brown K, and Nicoll RA (2008). Synaptic AMPA receptor subunit trafficking is independent of the C terminus in the GluR2-lacking mouse. Proc Natl Acad Sci U S A 105, 1032–1037. 10.1073/pnas.0711313105.
79. Adesnik H, and Nicoll RA (2007). Conservation of glutamate receptor 2-containing AMPA receptors during long-term potentiation. J Neurosci 27, 4598–4602. 10.1523/JNEUROSCI.0325-07.2007.

Data Availability Statement

All data reported in this paper will be shared by the lead contact upon request.

All code used for analysis in this paper will be shared by the lead contact upon request.

Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
