Learning & Memory. 2021 Dec;28(12):435–439. doi: 10.1101/lm.053472.121

Maintained goal-directed control with overtraining on ratio schedules

Eric Garr 1, Yasmin Padovan-Hernandez 2, Patricia H Janak 1,2, Andrew R Delamater 3,4
PMCID: PMC8600976  PMID: 34782401

Abstract

It is thought that goal-directed control of actions weakens or becomes masked by habits over time. We tested the opposing hypothesis that goal-directed control becomes stronger over time, and that this growth is modulated by the overall action–outcome contiguity. Despite group differences in action–outcome contiguity early in training, rats trained under random and fixed ratio schedules showed equivalent goal-directed control of lever pressing that appeared to grow over time. We confirmed that goal-directed control was maintained after extended training under another type of ratio schedule—continuous reinforcement—using specific satiety and taste aversion devaluation methods. These results add to the growing literature showing that extensive training does not reliably weaken goal-directed control and that it may strengthen it, or at least maintain it.


Flexible behavior requires an understanding of how one's actions influence the surrounding environment. Goal-directed control, defined as control over an action by its anticipated consequence, is known to be influenced by the schedule of reinforcement that the action has been acquired under. Goal-directed control has been shown to emerge under random ratio schedules more readily than random interval schedules (Gremel and Costa 2013), and more readily under fixed interval than random interval schedules (DeRusso et al. 2010; Garr et al. 2020). In a recent report (Garr et al. 2020), it was argued that a potentially important variable driving these schedule effects is the overall action–outcome (A-O) contiguity, defined as the mean time between an instrumental action and its subsequent outcome. Specifically, the hypothesis maintains that (1) goal-directed control of instrumental actions grows with repeated experience of A-O pairings, and (2) the overall A-O contiguity controls the rate of this growth, such that shorter times between actions and outcomes facilitate the onset of goal-directed control while longer times have a retarding effect. The idea that learning becomes stronger under more favorable contiguity conditions is an old one with significant empirical support (e.g., Balsam et al. 2010).

The idea that goal-directed control becomes stronger, or is at least maintained, with learning receives some support in the literature (Colwill and Rescorla 1985, 1988; Colwill and Triola 2002; Jonkman et al. 2010; Garr et al. 2020) but contradicts the popular idea that goal-directed control becomes weaker or masked by habitual control over training (Adams 1982; Tricomi et al. 2009; Smith and Graybiel 2013). Furthermore, the idea that A-O contiguity is a primary determinant of goal-directed control needs further testing. Urcelay and Jonkman (2019) trained rats on a continuous reinforcement (CRF) schedule but varied A-O contiguity between groups. They observed that rats trained with a 20-sec delay between lever press and food reward failed to develop goal-directed control, whereas rats trained with immediate reward were goal-directed. We sought to follow up on these ideas by training rats on random and fixed ratio schedules and subsequently testing goal-directed control in tests of reward devaluation. We predicted that the random schedule would induce relatively poor A-O contiguity especially early in training compared with the fixed schedule, and this should translate to slower growth of goal-directed control. We also predicted that the magnitude of goal-directed control would grow over the course of training rather than weaken.

Forty-eight Long-Evans rats (24 male and 24 female) were run in three replications (n = 16 per replication). They were maintained at 85% of their ad libitum weight. Rats were first given one session of magazine training with one pellet type (TestDiet MLabRodent 45-mg grain or Bio-serv 45-mg purified). Twenty pellets were delivered on a 60-sec random time schedule. Rats were then trained to press a lever on a CRF schedule such that each lever press yielded one pellet. Following one session of CRF, the reinforcement schedule was switched to either a fixed (n = 24) or random (n = 24) ratio schedule. The fixed schedule was arranged so that a pellet was delivered into the magazine every 20 (FR-20; n = 16) or 25 (FR-25; n = 8) presses. The random schedule was arranged so that, each time the lever was pressed, a pellet was delivered with a 0.05 (RR-20; n = 16) or 0.04 (RR-25; n = 8) probability. Two rats (one FR and one RR) did not learn to lever press and were excluded. At the beginning of each session, the lever was inserted and remained available until 50 pellets were earned or 60 min elapsed, whichever occurred first. Ratio training continued for 30 daily sessions, with devaluation test cycles conducted repeatedly after sessions 2, 10, 20, and 30. These test intervals were chosen based on a previous study with interval schedules (Garr et al. 2020), with an additional test cycle to examine truly extensive training.
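The two delivery rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' control software: the function names (`fixed_ratio`, `random_ratio`, `presses_to_earn`) are hypothetical, the 60-min session limit is ignored, and the random seed is arbitrary.

```python
import random


def fixed_ratio(n):
    """Return a press handler that delivers a pellet on every n-th press (FR-n)."""
    count = 0

    def on_press():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return on_press


def random_ratio(p, rng):
    """Return a press handler that delivers a pellet with probability p on each press."""

    def on_press():
        return rng.random() < p

    return on_press


def presses_to_earn(on_press, pellets=50):
    """Count the presses needed to earn the session maximum of pellets.

    Assumption: the 60-min time limit is never reached.
    """
    presses = 0
    earned = 0
    while earned < pellets:
        presses += 1
        if on_press():
            earned += 1
    return presses


rng = random.Random(0)  # arbitrary seed, for reproducibility only
fr_presses = presses_to_earn(fixed_ratio(20))
rr_presses = presses_to_earn(random_ratio(0.05, rng))
print(fr_presses)  # 1000: FR-20 always requires 20 presses per pellet
print(rr_presses)  # varies around 1000: RR-20 requires 20 presses per pellet only on average
```

Both schedules demand the same average number of presses per pellet; they differ only in whether the requirement is constant or varies from pellet to pellet.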

Each devaluation test cycle comprised two tests separated by a session of retraining. Prior to the first test, rats were isolated in wire cages and given 1 h of unlimited access to either the pellet type that was associated with lever pressing (“devalued” test) or the other pellet used as a control for general satiety (“valued” test). Beginning 2 d prior to the first test, rats were preexposed to the unfamiliar pellet type and consumed 20 pellets each day after training sessions. Immediately after the satiation period, rats were placed in the operant chambers and given a 5-min test in which the lever was available but no rewards were delivered. The retraining session was run the following day and was identical to ratio training. The next day, each rat was sated on the other pellet type and tested again. Group assignment, pellet outcome, sex, and order of devaluation tests were counterbalanced.

All statistical tests were conducted using analysis of variance (ANOVA) with α = 0.05. Significant interactions were followed up with one-way ANOVAs and mutually orthogonal post hoc contrasts using the recommendations of Rodger (1974). Degrees of freedom for follow-up one-way ANOVAs were derived from Satterthwaite's (1946) approximation. We also provide a measure of effect size based on Perlman and Rasmussen's (1975) estimate of the noncentrality parameter Δ.

During instrumental training, rats showed increases in lever pressing rates across sessions (Fig. 1A). An ANOVA with session, schedule randomness, and ratio value as factors showed only a main effect of session (F(3,126) = 211.94, P < 0.05). We recorded the time from each lever press to the subsequent pellet delivery and computed an average A-O temporal distance per session (Fig. 1B). An ANOVA on this measure revealed significant main effects of schedule randomness (F(1,44) = 15.49, P < 0.05) and session (F(1,44) = 49.19, P < 0.05), and a randomness × session interaction (F(1,44) = 7.97, P < 0.05). Follow-up one-way ANOVAs on sessions 2, 10, 20, and 30 revealed a significant difference only during session 2 (F(1,176) = 32.64, Δ = 31.27, P < 0.05), with the RR group showing weaker A-O contiguity compared with the FR group.
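One way to compute the per-session A-O temporal distance from event timestamps is sketched below. The function name `mean_ao_distance` and the example timestamps are hypothetical. The sketch also makes two assumptions that the text does not spell out: a press that itself triggers the pellet contributes a distance of zero, and presses made after the final pellet (which have no subsequent outcome) are dropped.

```python
from bisect import bisect_left


def mean_ao_distance(press_times, pellet_times):
    """Mean time from each lever press to the next pellet delivery, in seconds.

    Assumptions: a press that coincides with a delivery counts as 0 s,
    and presses with no later delivery are excluded from the mean.
    """
    pellet_times = sorted(pellet_times)
    delays = []
    for t in press_times:
        i = bisect_left(pellet_times, t)  # index of first delivery at or after the press
        if i < len(pellet_times):
            delays.append(pellet_times[i] - t)
    return sum(delays) / len(delays) if delays else float("nan")


# Hypothetical timestamps (s): pellets arrive on the presses at 3.0 and 6.0
presses = [0.0, 1.0, 2.0, 3.0, 5.0, 6.0]
pellets = [3.0, 6.0]
ao = mean_ao_distance(presses, pellets)
print(ao)  # 7/6, about 1.17 s (delays are 3, 2, 1, 0, 1, 0)
```

A larger value means weaker contiguity: on average, more time separates a press from the outcome that follows it.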

Figure 1.

(A) Mean lever presses per minute across training sessions. (B) Mean time separating an individual lever press from the subsequent pellet delivery across sessions. Data are collapsed across FR/RR-20 and FR/RR-25 subgroups. Vertical bars indicate SEM. (*) P < 0.05.

For reward devaluation tests, we subjected mean lever pressing rates to an ANOVA with session, outcome value, schedule randomness, ratio value, and sex as factors. Pressing rates are shown in Figure 2A. Significant main effects were detected only for session (F(3,114) = 16.47, P < 0.05), outcome value (F(1,38) = 87.37, P < 0.05), and ratio value (F(1,38) = 10.80, P < 0.05). The main effect of ratio value reflects the fact that the FR-25 and RR-25 subgroups pressed at overall higher rates than the FR-20 and RR-20 subgroups, but since this variable did not interact with any other variable, we chose to collapse across these subgroups. The only significant interaction was between session and outcome value (F(3,114) = 8.61, P < 0.05). A follow-up one-way ANOVA collapsed across groups revealed significant differences among the eight tests (F(7,308) = 26.86, Δ = 176.74, P < 0.05). Post hoc contrasts could not detect a significant devaluation effect after 2 d of ratio training, but showed robust devaluation effects after 10, 20, and 30 d (Fs(7,308) > 4.72, P < 0.05) as well as a greater overall rate of pressing during the final three test cycles compared with the first (F(7,308) = 6.41, P < 0.05). We also calculated devaluation scores as (valued − devalued)/(valued + devalued) and plotted individual scores (Fig. 2B). These analyses suggest, for both groups, that, once manifested, goal-directed control was maintained over training.
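The devaluation score is a normalized difference bounded between −1 and 1, as the short sketch below shows. The function name and the example rates are hypothetical, and returning `None` for a rat that did not press in either test is an assumption, since the text does not say how that case was handled.

```python
def devaluation_score(valued, devalued):
    """Return (valued - devalued) / (valued + devalued) for one rat's pressing rates.

    Assumption: the score is undefined (None) when the rat did not press in either test.
    """
    total = valued + devalued
    if total == 0:
        return None
    return (valued - devalued) / total


# Hypothetical pressing rates (presses per min), not data from the study
print(devaluation_score(12.0, 4.0))  # 0.5: pressing was lower in the devalued test
print(devaluation_score(8.0, 8.0))   # 0.0: no sensitivity to devaluation
```

A score of 0 indicates equal pressing in both tests, and a score of 1 indicates that the rat pressed only in the valued test.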

Figure 2.

(A) Lever pressing rates during nonrewarded tests as a function of test type (valued vs. devalued) and training length. (Left) fixed ratio group. (Right) random ratio group. Rats were repeatedly tested. (*) P < 0.05. (B) Devaluation scores over time. Scores were calculated as (valued − devalued)/(valued + devalued). Circles are individual rats and horizontal bars are means. (C) Amount of pellets consumed during 1 h satiation periods prior to testing as a function of test type (valued vs. devalued) and training length. (D) Consumption scores over time. Scores were calculated as (devalued − valued)/(valued + devalued).

We also measured consumption during the satiation periods (Fig. 2C). There was a main effect of session (F(3,132) = 26.70, P < 0.05), but no main effect of value or group (Fs(1,44) < 0.73, Ps > 0.05). There were no interactions. The main effect of session reflects the fact that consumption increased over tests. Consumption scores were calculated as (devalued − valued)/(devalued + valued) and plotted for individual rats (Fig. 2D). We conclude that goal-directed control of instrumental actions is initially difficult to detect but, once detected, is maintained comparably on fixed and random ratio schedules, and seems unaffected by variations in A-O contiguity early in training.

Our hypothesis that differences in A-O contiguity between FR and RR groups should have resulted in different acquisition rates of goal-directed control was not supported by the data, although it is worth noting that A-O contiguity differed between groups only early in training and goal-directed control was detectable in both groups sometime between sessions 2 and 10. Further tests between these two training levels would be required to more fully evaluate the hypothesis. The data also support the view that goal-directed control may strengthen with overtraining, and do not match the assumption that actions transition to outcome-insensitive habits with overtraining on ratio schedules. Therefore, we next sought to replicate the first experiment that reported habit formation with overtraining on a ratio schedule (Adams 1982). This experiment involves training separate groups of rats for either two or 10 sessions of CRF followed by testing after reward devaluation. In the original report, rats given two training sessions showed goal-directed control of lever pressing, but goal-directed control was attenuated in rats given 10 training sessions. That experiment used taste aversion to devalue the instrumental outcome. We continued with our method of using specific satiety but, in a separate experiment, also used taste aversion with lithium chloride (LiCl) because it has been suggested that the specific satiety procedure may break habits (Bouton et al. 2020).

In the specific satiety experiment, 16 rats (eight male and eight female) were treated as before, with the following exceptions. Following magazine training, rats were trained only on CRF schedules for either two (n = 8) or 10 (n = 8) sessions. A single devaluation test cycle (5-min tests) was conducted after the assigned length of training. Satiation occurred in the home cage. The pellet types used were Bio-serv 45-mg grain pellets and Bio-serv 45-mg banana-flavored sucrose pellets. One rat in each group did not learn to lever press and was excluded.

The degree of goal-directed control appeared to differ between groups (Fig. 3A), and this was supported by statistical analysis. A session × outcome value × sex ANOVA revealed a main effect of outcome value (F(1,12) = 31.78, P < 0.05) and an outcome value × group interaction (F(1,12) = 8.59, P < 0.05). No other main effects or interactions reached significance. Follow-up one-way ANOVAs did not detect a devaluation effect in the 2-d group (F(1,12) = 3.66, P > 0.05) but did detect an effect in the 10-d group (F(1,12) = 36.71, Δ = 29.59, P < 0.05). Devaluation scores for individual rats showed a notable difference in variability between groups—while there was a high degree of variability within the 2-d group, all the rats within the 10-d group showed a strong devaluation effect (Fig. 3B). We also measured consumption during the satiation periods (Fig. 3C). There was no main effect of outcome value (F(1,12) = 2.58, P > 0.05), a main effect of group (F(1,12) = 7.68, P < 0.05), and no outcome value × group interaction (F(1,12) = 0.90, P > 0.05). Consumption scores are plotted for individual rats (Fig. 3D). Of note, the outlier in Figure 3B is not the same rat as the outlier in Figure 3D. This analysis supports the conclusion that goal-directed control was greater after 10 versus 2 d of CRF training.

Figure 3.

(A) Lever pressing rates during nonrewarded tests as a function of test type (valued vs. devalued) and group following either two or 10 sessions of CRF. Devaluation tests were conducted using specific satiety. (*) P < 0.05. (B) Devaluation scores calculated as (valued − devalued)/(valued + devalued). Circles are individual rats and horizontal bars are means. (C) Amount of pellets consumed during 1 h satiation periods prior to testing as a function of test type (valued vs. devalued) and group. (D) Consumption scores were calculated as (devalued − valued)/(valued + devalued). (E) Lever pressing rates during nonrewarded tests as a function of test type (valued vs. devalued) and group following either two or 10 sessions of CRF. Devaluation tests were conducted using taste aversion. (F) Pellets consumed over time during taste aversion.

In the LiCl experiment, 32 rats (15 male and 17 female) were evenly divided into a 2-d and a 10-d training group. Rats were once again trained to press a lever on a CRF schedule for either grain or banana sucrose pellets. Pellet outcome, group assignment, and sex were counterbalanced. Reward devaluation was conducted over five 2-d cycles. During the first day of each cycle, all rats were given 50 pellets in the absence of the lever in the operant box randomly over 15 min followed by an injection of either LiCl for the devalued subgroups (0.3 M, 15 mL/kg) or saline for the valued subgroups. During the second day, rats were placed in the operant box without access to pellets for 15 min and then given an injection of either saline for the devalued subgroups or LiCl for the valued subgroups.
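For readers who think in mass-based doses, the stated concentration and injection volume can be converted with simple arithmetic. The sketch below assumes a LiCl molar mass of 42.39 g/mol (from standard atomic weights); the function name `licl_dose` is hypothetical, and the conversion is an illustration rather than a figure reported by the authors.

```python
LICL_MOLAR_MASS_G_PER_MOL = 42.39  # Li 6.94 + Cl 35.45, standard atomic weights


def licl_dose(body_weight_kg, molarity=0.3, volume_ml_per_kg=15.0):
    """Return (injection volume in mL, LiCl in mmol, LiCl in mg) for one injection."""
    volume_ml = volume_ml_per_kg * body_weight_kg
    mmol = molarity * volume_ml             # mol/L x mL = mmol
    mg = mmol * LICL_MOLAR_MASS_G_PER_MOL   # mmol x g/mol = mg
    return volume_ml, mmol, mg


# Dose per kg of body weight at 0.3 M and 15 mL/kg
volume_ml, mmol, mg = licl_dose(1.0)
print(volume_ml, round(mmol, 2), round(mg, 1))  # 15.0 4.5 190.8
```

Under these assumptions, 0.3 M at 15 mL/kg corresponds to 4.5 mmol/kg, or roughly 191 mg/kg of LiCl.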

During the final training session, the mean presses per min for the 2-d valued, 2-d devalued, 10-d valued, and 10-d devalued groups were 5.02, 5.03, 12.35, and 11.51, respectively. One rat from the 10-d devalued group failed to learn to press and was excluded. Taste aversion proceeded smoothly (Fig. 3F), with the exception that one rat from the 2-d devalued group did not receive pellets during the final session due to a technical malfunction and was excluded. Pressing rates during the 5-min tests are presented in Figure 3E. A session × outcome value × sex ANOVA showed a main effect of outcome value (F(1,22) = 6.45, P < 0.05), reflecting the fact that pressing rates were greater for valued than devalued groups regardless of training length. No other main effects or interactions reached significance. This analysis supports the conclusion that goal-directed control was equally strong after 10 versus 2 d of CRF training.

The present set of experimental results adds to the growing literature showing that extensive training does not weaken goal-directed control and can strengthen or maintain it (Colwill and Rescorla 1985, 1988; Colwill and Triola 2002; Jonkman et al. 2010; Corbit et al. 2012; de Wit et al. 2018; Thrailkill et al. 2018; Garr and Delamater 2019; Garr et al. 2020; Pool et al. 2021). Rats trained on fixed and random ratio schedules displayed equivalent levels of goal-directed control that became more evident between sessions 2 and 10 of instrumental training and were maintained from sessions 10 to 30. This is despite the fact that fixed and random ratio schedules induced a large initial difference in A-O contiguity, which was hypothesized to drive differences in the rate of A-O learning. A-O contiguity may be less critical for A-O learning than originally thought (DeRusso et al. 2010; Garr et al. 2020). However, more direct manipulations of A-O contiguity with a wider variety of operant schedules may be necessary to further evaluate this idea (e.g., Urcelay and Jonkman 2019).

It has long been recognized that the schedule of reinforcement influences the degree to which an action becomes goal-directed (Dickinson 1985), and we can now update our knowledge of these schedule effects. Specifically, after rats or mice have been trained for six to 10 training sessions to press a single lever, lever pressing is reliably goal-directed under fixed ratio, random ratio, and fixed interval schedules, but not random interval schedules (Dickinson et al. 1983; DeRusso et al. 2010; Gremel and Costa 2013; Garr et al. 2020). What we have shown in our current and previous (Garr et al. 2020) reports is that once instrumental training reaches 20 sessions, lever-pressing becomes goal-directed under all schedules. It is notable that extensive training is often defined in a range of six to 14 sessions (e.g., Colwill and Rescorla 1988; Malvaez et al. 2018) and training rarely exceeds that limit.

One caveat in interpreting the results of the current set of experiments is that lower overall levels of pressing after two training sessions potentially impede the ability to detect a devaluation effect because pressing rates are close to the floor. Strong devaluation effects are more easily detectable when baseline levels are higher. Moreover, across studies, consumption during the satiation periods increased over time. This could reflect increasing hunger, which could raise pressing rates higher above the floor and thus facilitate the detection of a devaluation effect for rats that underwent extensive training. It is therefore difficult to state with certainty that goal-directed control grew between session 2 and session 10, only that it did not decline. However, it is worth pointing out that even when these confounds are controlled for, there is still strong evidence for strengthening of goal-directed control with extensive training (Colwill and Rescorla 1988).

An interesting question is why random interval schedules produce sluggish goal-directed learning. A recent study in humans found that, after training on a random interval schedule, self-reported stress levels served as a moderating variable of performance on a specific satiety devaluation test such that high stress levels predicted devaluation-insensitive performance (Pool et al. 2021). In a randomized controlled study, a stress manipulation also predicted devaluation-insensitive performance (Schwabe and Wolf 2009), and stress has been appealed to as part of a neurobiological mechanism mediating the effect of a high-fat diet on instrumental sensitivity to food devaluation (Tantot et al. 2017). Whether random interval schedules induce more stress than fixed interval or ratio schedules in rodents, and whether this can account for devaluation effects under specific satiety and/or taste aversion paradigms, remains to be investigated.

Of note, we found that goal-directed control either grew or was maintained when rats were tested after 10 versus two sessions on a CRF schedule, and this was confirmed with two types of reward devaluation methods. This result constitutes a failure to replicate the first study that examined the effect of training length on instrumental sensitivity to reward devaluation (Adams 1982). It is difficult to decipher why the results did not replicate. One possibility is a difference in reward types. Adams (1982) used sucrose pellets while we used banana-flavored sucrose pellets and grain pellets, and there is some evidence that the reward type can change performance during devaluation tests (Vandaele et al. 2017). Nonetheless, it is noteworthy that there are not many instances where the effects of overtraining on ratio schedules have been examined in relation to goal-directed control. To our knowledge, actions appear to convert to outcome-insensitive “habits” under only a rather limited set of training conditions; namely, training on fairly lean random interval schedules. This leads us to question the generality of the phenomenon.

Acknowledgments

The research reported here was supported by a National Institute of General Medical Sciences grant (SC1 DA034995) awarded to A.R.D. and a National Institute on Drug Abuse grant (R01 DA035943) awarded to P.H.J. We thank Badrunnesa Bushra, Chloé Pierre-Louis, Norman Tu, and Dan Siegel for assistance with data collection.

References

  1. Adams CD. 1982. Variations in the sensitivity of instrumental responding to reinforcer devaluation. Q J Exp Psychol B Comp Physiol Psychol 34: 77–98. doi: 10.1080/14640748208400878
  2. Balsam PD, Drew MR, Gallistel CR. 2010. Time and associative learning. Comp Cogn Behav Rev 5: 1–22. doi: 10.3819/ccbr.2010.50001
  3. Bouton ME, Broomer MC, Rey CN, Thrailkill EA. 2020. Unexpected food outcomes can return a habit to goal-directed action. Neurobiol Learn Mem 169: 107163. doi: 10.1016/j.nlm.2020.107163
  4. Colwill RM, Rescorla RA. 1985. Instrumental responding remains sensitive to reinforcer devaluation after extensive training. J Exp Psychol Anim Behav Process 11: 520–536. doi: 10.1037/0097-7403.11.4.520
  5. Colwill RM, Rescorla RA. 1988. The role of response-reinforcer associations increases throughout extended instrumental training. Anim Learn Behav 16: 105–111. doi: 10.3758/BF03209051
  6. Colwill RM, Triola SM. 2002. Instrumental responding remains under the control of the consequent outcome after extended training. Behav Process 57: 51–64. doi: 10.1016/S0376-6357(01)00204-2
  7. Corbit LH, Nie H, Janak PH. 2012. Habitual alcohol seeking: time course and the contribution of subregions of the dorsal striatum. Biol Psychiatry 72: 389–395. doi: 10.1016/j.biopsych.2012.02.024
  8. DeRusso AL, Fan D, Gupta J, Shelest O, Costa RM, Yin HH. 2010. Instrumental uncertainty as a determinant of behavior under interval schedules of reinforcement. Front Integr Neurosci 4: 1–8. doi: 10.3389/fnint.2010.00017
  9. de Wit S, Kindt M, Knot SL, Verhoeven AAC, Robbins TW, Gasull-camos J, Gillan CM. 2018. Shifting the balance between goals and habits: five failures in experimental habit induction. J Exp Psychol Gen 147: 1043–1065. doi: 10.1037/xge0000402.supp
  10. Dickinson A. 1985. Actions and habits: the development of behavioural autonomy. Philos Trans R Soc Lond B Biol Sci 308: 67–78. doi: 10.1098/rstb.1985.0010
  11. Dickinson A, Nicholas DJ, Adams CD. 1983. The effect of the instrumental training contingency on susceptibility to reinforcer devaluation. Q J Exp Psychol B 35: 35–51. doi: 10.1080/14640748308400912
  12. Garr E, Delamater AR. 2019. Exploring the relationship between actions, habits, and automaticity in an action sequence task. Learn Mem 26: 128–132. doi: 10.1101/lm.048645.118
  13. Garr E, Bushra B, Tu N, Delamater AR. 2020. Goal-directed control on interval schedules does not depend on the action–outcome correlation. J Exp Psychol Anim Learn Cogn 46: 47–64. doi: 10.1037/xan0000229
  14. Gremel CM, Costa RM. 2013. Orbitofrontal and striatal circuits dynamically encode the shift between goal-directed and habitual actions. Nat Commun 4: 1–12. doi: 10.1038/ncomms3264
  15. Jonkman S, Kosaki Y, Everitt BJ, Dickinson A. 2010. The role of contextual conditioning in the effect of reinforcer devaluation on instrumental performance by rats. Behav Process 83: 276–281. doi: 10.1016/j.beproc.2009.12.017
  16. Malvaez M, Greenfield VY, Matheos DP, Angelillis NA, Murphy MD, Kennedy PJ, Wood MA, Wassum KM. 2018. Habits are negatively regulated by histone deacetylase 3 in the dorsal striatum. Biol Psychiatry 84: 383–392. doi: 10.1016/j.biopsych.2018.01.025
  17. Perlman MD, Rasmussen U. 1975. Some remarks on estimating a noncentrality parameter. Commun Stat Theory Methods 4: 455–468. doi: 10.1080/03610927508827262
  18. Pool ER, Gera R, Fransen A, Perez OD, Cremer A, Aleksic M, Tansmith S, Quail S, Ceceli AO, Manfredi DA, et al. 2021. Determining the effects of training duration on the behavioral expression of habitual control in humans: a multi-laboratory investigation. PsyArXiv. doi: 10.31234/osf.io/z756h
  19. Rodger RS. 1974. Multiple contrasts, factors, error rate and power. Br J Math Stat Psychol 27: 179–198. doi: 10.1111/j.2044-8317.1974.tb00539.x
  20. Satterthwaite FE. 1946. An approximate distribution of estimates of variance components. Biometrics 2: 110–114. doi: 10.2307/3002019
  21. Schwabe L, Wolf OT. 2009. Stress prompts habit behavior in humans. J Neurosci 29: 7191–7198. doi: 10.1523/JNEUROSCI.0979-09.2009
  22. Smith KS, Graybiel AM. 2013. A dual operator view of habitual behavior reflecting cortical and striatal dynamics. Neuron 79: 361–374. doi: 10.1016/j.neuron.2013.05.038
  23. Tantot F, Parkes SL, Marchand AR, Boitard C, Naneix F, Layé S, Trifilieff P, Coutureau E, Ferreira G. 2017. The effect of high-fat diet consumption on appetitive instrumental behavior in rats. Appetite 108: 203–211. doi: 10.1016/j.appet.2016.10.001
  24. Thrailkill EA, Trask S, Vidal P, Alcalá JA, Bouton ME. 2018. Stimulus control of actions and habits: a role for reinforcer predictability and attention in the development of habitual behavior. J Exp Psychol Anim Learn Cogn 44: 370–384. doi: 10.1037/xan0000188
  25. Tricomi E, Balleine BW, O'Doherty JP. 2009. A specific role for posterior dorsolateral striatum in human habit learning. Eur J Neurosci 29: 2225–2232. doi: 10.1111/j.1460-9568.2009.06796.x
  26. Urcelay GP, Jonkman S. 2019. Delayed rewards facilitate habit formation. J Exp Psychol Anim Learn Cogn 45: 413–421. doi: 10.1037/xan0000221
  27. Vandaele Y, Pribut HJ, Janak PH. 2017. Lever insertion as a salient stimulus promoting insensitivity to outcome devaluation. Front Integr Neurosci 11: 1–13. doi: 10.3389/fnint.2017.00023

Articles from Learning & Memory are provided here courtesy of Cold Spring Harbor Laboratory Press
