Abstract
The effect of repetitive training on learned actions has been a major subject in behavioural neuroscience. Many studies of instrumental conditioning in mammals, including humans, suggested that learned actions early in training are goal-driven and controlled by outcome expectancy, but they become more automatic and insensitive to reduction in the value of the outcome after extended training. It was unknown, however, whether the development of value-insensitive behaviour also occurs by extended training of Pavlovian conditioning in any animals. Here we show that crickets Gryllus bimaculatus that had received minimal training to associate an odour with water (unconditioned stimulus, US) did not exhibit conditioned response (CR) to the odour when they were given water until satiation before the test, but those that had received extended training exhibited CR even when they were satiated with water. Further pharmacological experiments suggested that octopamine neurons, the invertebrate counterparts of noradrenaline neurons, mediate US value signals and control execution of CR after minimal training, but the control diminishes with the progress of training and hence the CR becomes insensitive to US devaluation. The results suggest that repetitive sensory experiences can lead to a change from a goal-driven response to a more automatic one in crickets.
Keywords: habit formation, classical conditioning, octopamine, cricket, reward devaluation
1. Introduction
Many studies in mammals, including humans, suggest that learned actions early in instrumental training are goal-directed and controlled by outcome expectancy, but similar behaviours often proceed as automatic responses to antecedent stimuli with repetition of training [1–3]. Such a change, called habit formation, has been demonstrated using a post-training outcome devaluation procedure, in which the value of the outcome (food) is decreased by either taste-aversion learning or satiation: action early in training is, in large part, sensitive to the devaluation but the sensitivity decreases with extended training [1,2]. The development of behavioural autonomy with extended training, however, is not found in some experimental situations, for example, when animals are subjected to complex conditioning tasks [1,4,5]. Therefore, the number of training trials is not the only factor to determine the development of behavioural autonomy. In humans, this shift can be beneficial in everyday life but can also lead to loss of control over compulsive behaviour, namely drug seeking in addiction [6–8]. Lesion, electrophysiological and transgenic studies in rodents suggested that the transition from goal-oriented to habitual behaviour involves transitions in the neural circuits controlling the behaviours [7,9,10], but the cellular and molecular mechanisms of habit formation remain largely unknown.
The effect of extended training has also been studied in Pavlovian conditioning in rats, in which they learn the relationship between a previously neutral stimulus (conditioned stimulus, CS) and a stimulus with biological value (unconditioned stimulus, US) and produce a response (conditioned response, CR) to the CS thereafter. After minimal training, animals that received a procedure to decrease the value of the US exhibited a decrement of the CR, and hence it has been argued that the CR is controlled by CS-induced activations of representations of the US, analogous to the way in which actions are driven by representations of goals in instrumental conditioning. In contrast to instrumental conditioning, however, the CR remains sensitive to devaluation of the US after extended training, and hence it has been concluded that control of the CR by US representations does not diminish after extended training [11–13]. To our knowledge, loss of sensitivity of the CR to US devaluation by extended Pavlovian training has not been reported in any animals.
In this study, we investigated whether extended Pavlovian training leads to a change of sensitivity of CR to devaluation of US in the cricket, Gryllus bimaculatus. Crickets are newly emerging animal models for the study of basic mechanisms of learning and memory, because they have excellent learning capabilities and it is relatively easy to apply various experimental manipulations such as pharmacological intervention, RNA interference, and genome editing [14–19]. We previously showed that two to four pairing trials to associate an odour with water, with inter-trial intervals (ITIs) of 5 min, are sufficient to lead to protein synthesis-dependent long-term memory that lasts for at least 4 days [20,21]. We refer to this training regimen with a single session of four trials with 5 min ITIs as standard training. For extended training, we applied three such training sessions, one session per day on three consecutive days. We show that the execution of the CR is sensitive to devaluation of the US after standard training but not after extended Pavlovian training in crickets. Moreover, we previously suggested that a class of octopamine (OA, invertebrate counterpart of noradrenaline) neurons conveys appetitive US signals in Pavlovian conditioning in crickets [14–16,22–25], and that execution of the CR after standard training requires CS-induced activation of another class of OA neurons, which may convey US value signals that guide execution of the CR [16]. Here we show that control of the CR by OA neurons is lost after extended training. We thus propose that control of the CR by OA neurons decreases with extended training and this leads to the CR being more independent of the current value of the US.
2. Material and methods
(a). Animals
Adult male crickets, Gryllus bimaculatus, at one week after the imaginal moult were used. They were reared in a 12 L : 12 D cycle at 27 ± 2°C and were fed a diet of insect pellets and water ad libitum. Three days before the start of the experiment, about 50 animals were individually placed in 100 ml glass beakers and deprived of drinking water to enhance their motivation to search for water.
(b). Classical olfactory conditioning
We used an elemental appetitive conditioning procedure in which an odour (CS) was paired with water (US) as described previously [20]. A syringe containing water was used for conditioning. A filter paper soaked with an odour essence was attached to the needle of the syringe. The filter paper was placed within 1 cm of the cricket's head for 3 s so as to present an odour, and then a drop (ca. 10–13 µl) of water was presented to the mouth. After the conditioning trials, the air in the beaker was ventilated. The cricket was kept in a beaker until the next training or test. Crickets were subjected to four conditioning trials with 5 min ITIs for 1 day (four trials × 1 day training) or for three consecutive days (four trials × 3 days training).
For conditioning with sucrose solution, crickets were not given food for 3 days with water provided ad libitum and were then subjected to four-trial training to associate an odour with 0.5 M sucrose solution with 5 min ITIs.
(c). Odour preference test
All groups of animals were subjected to preference tests between the odour used in training and an odour not used in training (control odour) before and after training [14,20]. The test apparatus consisted of two waiting chambers and a test chamber. On the floor of the test chamber, there were two holes that connected the chamber with two odour sources. Each odour source consisted of a plastic container containing a filter paper soaked with 3 µl solution of odour essence, covered with a fine gauze net. Three containers were mounted on a rotatable holder and two of the three odour sources could be located simultaneously just below the holes of the test chamber. Before the odour preference test, a cricket was transferred to the waiting chamber at the waiting position and left for about 4 min to become accustomed to the surroundings. The waiting chamber was then slid into the entrance position, and the door to the test chamber was opened. When the cricket entered the test chamber, the door was closed and the test was started. Two min later, the relative positions of the odour sources were changed by rotating the container holder. The preference test lasted for 4 min. If the total time of visits of an animal to either source was less than 10 s, we considered that the animal was less motivated to visit odour sources, possibly owing to a poor physical condition, and the data were rejected. Animals that fell into this category were less than 10% of all animals. After the test, the cricket was returned to a beaker. In the experiment in which the post-training test was performed 3 days after training, two drops (ca. 20–26 µl) of water were given every day until the day of the test to prevent excessive dehydration.
We used two different pairs of odours for experiments. The first was peppermint odour and vanilla odour. Crickets prefer vanilla odour over peppermint odour [20] and thus we used peppermint odour as a CS to be associated with water, and vanilla odour was used as a control odour in a test to study relative preference between the CS and the control odour. Experiments with this odour pair led to a high learning score and allowed us to obtain significant statistical results with a relatively small number of samples [20] and thus are used for initial exploratory stages of this study. In experiments to examine whether results obtained with this odour pair can be generalized to another odour pair, apple odour and banana odour were used. Crickets prefer these two odours almost equally [14], and we used one of the two odours as a CS and the other odour as a control odour in a counterbalanced manner. Groups in which either apple odour or banana odour was used as the CS showed similar learning scores [14], and thus data from the groups were pooled.
In the test, an odour source was considered to have been visited when the cricket probed the top net with its mouth. The time spent probing the top net of each odour source was recorded cumulatively. Relative odour preference of each animal was determined by using the preference index (PI), defined as (tr/(tr + tnr)) × 100 (%), where tr was the time spent exploring the odour associated with reward and tnr was the time spent exploring the odour not associated with reward.
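The PI calculation can be sketched in a few lines of Python. This is only an illustration, not the analysis code used in the study: the function name `preference_index` and the constant `MIN_TOTAL_VISIT_S` are hypothetical, and the rejection rule is read here as "summed probing time at both sources below 10 s", which is an assumption about the criterion stated in the test procedure.

```python
from typing import Optional

# Assumed reading of the rejection criterion: animals whose summed probing
# time at the two odour sources is below 10 s are excluded from analysis.
MIN_TOTAL_VISIT_S = 10.0


def preference_index(t_r: float, t_nr: float) -> Optional[float]:
    """Return PI = t_r / (t_r + t_nr) * 100 (%), or None if the animal is rejected.

    t_r  : time (s) spent probing the odour associated with reward
    t_nr : time (s) spent probing the odour not associated with reward
    """
    if t_r < 0 or t_nr < 0:
        raise ValueError("probing times must be non-negative")
    total = t_r + t_nr
    if total < MIN_TOTAL_VISIT_S:
        return None  # too little exploration; data point is discarded
    return t_r / total * 100.0
```

Under this sketch, a PI of 50% means equal time at both odours, and values above 50% indicate a preference for the reward-associated odour.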
(d). Water satiation procedure
One day after completing the standard (four trials × 1 day) training or extended (four trials × 3 days) training, crickets were given water to their mouths by a syringe until they stopped consuming the water. They typically consumed 130–230 µl of water. At 1 h after the water satiation (or devaluation) trial, they were given a post-training test. In parallel experiments, we observed that when tested 1 h after the satiation procedure crickets typically consumed only 0–40 µl of water, confirming that the satiation procedure was successful.
(e). Drug injection
By using a 10 µl syringe, saline solutions (each 3 µl) containing drugs were injected into the head haemolymph of the crickets from a small hole made in the median ocellus. Epinastine, a highly effective antagonist of insect OA receptors (see discussion in [25]), was purchased from SIGMA (Tokyo, Japan). The drug was dissolved in cricket saline [21]. The dose of epinastine was determined on the basis of previous studies [14,15].
(f). Experimental design and statistical analyses
We performed between-group and within-group comparisons of odour preferences to examine whether the two comparisons produce consistent results. Wilcoxon's signed-rank test (WCX test) was used to compare odour preferences before and after training of a given group (within-group comparison), and the Mann–Whitney test (M–W test) was used to compare odour preferences between the experimental and control groups, because the data often deviated from a normal distribution. A Kruskal–Wallis test was used to compare odour preferences in the test prior to the training among groups used in this study. Sample sizes were determined by intrinsic variation of the dataset. The post hoc Holm's method was used to adjust the p-value when comparing among three groups. All statistical results are described in the electronic supplementary material, table S1. p-values of less than 0.05 were considered statistically significant. We used R (v. 3.3.1) for statistical analysis.
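For readers unfamiliar with Holm's method, the step-down adjustment can be sketched as follows. This is a plain-Python illustration rather than the R code used for the analysis; the function name `holm_adjust` is hypothetical. The p-values are sorted in ascending order, the k-th smallest is multiplied by (m - k + 1), where m is the number of comparisons, and the adjusted values are forced to be non-decreasing and capped at 1.

```python
from typing import List, Sequence


def holm_adjust(p_values: Sequence[float]) -> List[float]:
    """Return Holm-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

With three pairwise comparisons, as in the three-group case mentioned above, each adjusted value would then be compared against the 0.05 threshold.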
3. Results
(a). Effect of water satiation on execution of the conditioned response to water-associated odour after standard training
We used two different odour pairs for conditioning experiments, namely, a peppermint odour and vanilla odour pair and an apple odour and banana odour pair. In initial stages of this study, we used the former odour pair for experiments, while the latter odour pair was used for experiments to confirm results obtained in earlier stages (see Methods).
At first, two groups of crickets were deprived of water for 3 days and then subjected to four-trial training (or four trials × 1 day training) to associate peppermint odour with water reward. Relative preference between peppermint odour (CS) and vanilla odour (control odour) was tested before and 1 day after training. Crickets in one group (water-satiated or reward devalued group) were given water until they stopped drinking, and after 1 h they were given a post-training test. Separate experiments showed that crickets were sufficiently satiated with water at the time of the test (see Methods). Crickets in another group (control group) were not given water prior to the post-training test.
The control group exhibited significantly increased preference for the conditioned odour after training compared to that before training (figure 1a; for details of the statistical results, see the electronic supplementary material, table S1). We previously showed that this change is pairing-specific and not owing to a non-associative effect caused by the exposure to the odour or water [20]. On the other hand, the water-satiated group did not exhibit significantly increased preference for the CS. Between-group comparison confirmed that the post-training preference for the CS was significantly lower in the water-satiated group than in the control group (figure 1a). This is in contrast to the Kruskal–Wallis test, which showed no significant differences in odour preference in the pre-training test among groups used in this study (electronic supplementary material, table S1). We thus conclude that water satiation diminishes CRs to water-associated odour. We reached the same conclusion when we evaluated the data by the time spent visiting each odour (electronic supplementary material, figure S1).
Is suppression of response to the CS by providing water until satiation specific to the response to the CS associated with water? To address this question, two groups of crickets were given no food for 3 days and then received four-trial conditioning of peppermint odour with 0.5 M sucrose solution [24]. On the next day, crickets in one group were given water until satiation and those in the other group were not. The water-satiated group exhibited a significant increase of the preference for the CS, the level of which was as high as that of the control group in which no water satiation procedure was performed (figure 1b). A parallel experiment with another group of crickets that received four-trial conditioning of the same odour with water and were then given water until satiation reproduced the results in the water-satiated group shown in figure 1a, namely, they exhibited no significant level of CR to the CS (figure 1b). Thus, the effect of providing water until satiation is specific to the response to the CS associated with water US.
It could be argued that the absence of increased preference for the water-associated odour in the water-satiated group in figure 1a (and in electronic supplementary material, figure S1) might be because (i) the water satiation procedure changed the unconditioned preference for the odour, namely, reduced the preference for peppermint odour; or (ii) it led to extinction of memory, rather than inhibition of responses to the CS. The former possibility was ruled out by our observation that untrained crickets which had been given water until satiation exhibited no change of odour preference (electronic supplementary material, figure S2). The latter possibility is also unlikely since we observed in figure 1b that crickets which had received the water satiation procedure exhibited increased preferences for sucrose-associated peppermint odour. We thus conclude that devaluation of water US after standard (four trials × 1 day) training leads to the suppression of responses to water-associated CS. Moreover, the results of the experiment with extended training, which will be described next, serve as further evidence against the account based on the non-specific effect of water satiation.
(b). Effect of reward devaluation on execution of the conditioned response after extended training
Next, we performed reward devaluation experiments with four trials × 3 days training with water used as US. Peppermint odour was used as the CS and vanilla odour was used as the control odour. One group was subjected to the water satiation procedure prior to the post-test, whereas the other group was not subjected to the procedure. Both the devalued and control groups exhibited significantly increased preference for the CS after training as compared to the pre-training test (figure 2a), indicating that responses to the water-associated CS (CRs) were not inhibited by water satiation after extended (four trials × 3 days) training. This was in contrast to observations in figure 1a in which the CR was fully inhibited by water satiation after standard training. The contrasting effects of water satiation on CRs after standard and extended training confirm that inhibition of the CR by water satiation after standard training was not owing to non-specific effects of water satiation on the execution of the CR.
It could be argued that the loss of sensitivity of the CR to the US devaluation procedure after four-trials × 3 days training shown in figure 2a might be owing to the passage of time for 3 days after the start of training, not owing to the increased amount of training. To examine this possibility, crickets in two groups were subjected to four-trial training. Three days later, one group (devalued group) was given a test 1 h after being given water until satiation and the other group (control group) was given a test without receiving water (figure 2b). The control group exhibited significantly increased preference for the CS after training compared to that before training but the devalued group did not. The results indicate that the passage of time is not the reason for the loss of sensitivity of the CR to the water satiation procedure after extended training.
We next investigated whether the observed loss of sensitivity of the CR to water satiation procedure by extended training is reproducible in experiments with apple odour and banana odour, in which one of the odours was used as CS and the other was used as a control odour in a counterbalanced manner. The results confirmed that the CR is initially sensitive to water satiation procedure after standard (four trials × 1 day) training (figure 2c) but that the sensitivity is lost after extended (four trials × 3 days) training (figure 2d). The results confirm the effect of extended training observed in figure 2a.
We next performed experiments to determine whether the total number of trials (12 trials) or the intervals between training sessions (1 day intervals) is more important to lead to the insensitivity of the CR to US devaluation. We tested 12 trials × 1 day training and six trials × 2 days training, the number of trials being the same as that of four trials × 3 days training, using peppermint odour as CS and vanilla odour as the control odour. In the experiments with 12 trials × 1 day training, the control group exhibited significantly increased preference for the CS after training compared to that before training, but the devalued group did not (electronic supplementary material, figure S3a), indicating that the CR is sensitive to reward devaluation. In experiments with six trials × 2 days training, both the control and devalued groups exhibited significantly increased preference for the CS after training compared to that before training (electronic supplementary material, figure S3b), but between-group comparison showed that the preference for the CS after training in the devalued group was significantly lower than in the control group. This indicates that six trials × 2 days training is not sufficient to make the CR completely insensitive to US devaluation. We conclude that an increased number of trials is not sufficient to lead to the loss of sensitivity of the CR to US devaluation. Rather, repetitions of trials with sufficient intervals are needed for this change.
(c). A neural model for the conditioned response to lose sensitivity to reward devaluation by extended training
How can the decrement of control of the CR by the current value of the US with extended training be accounted for by changes in synaptic transmission in the neural circuitry mediating the learned behaviour? To help answer this question, we propose a neural circuit model (figure 3) by modifying our previously proposed model [16,22,23], which was assumed to represent the neural circuitry in lobes of the mushroom body, a secondary olfactory centre that plays critical roles in olfactory learning [26,27]. The model was designed to account for the findings that administration of an OA receptor antagonist impairs both acquisition of conditioning and execution of CRs after standard Pavlovian training [14,15,18]. Our model assumes two classes of OA neurons, namely, OA1 neurons that govern the learning process and OA2 neurons that govern the execution of CR [22,23], and in the model shown in figure 3, we focus on the roles of OA2 neurons. In the model, we assume that: (i) presentation of the CS activates ‘CS’ neurons and then ‘OA2’ neurons, (ii) activation of both types of neurons is needed to activate ‘CR’ neurons that produce the CR after standard training (figure 3a) [16], (iii) activation of ‘OA2’ neurons is inhibited when animals are satiated with water, and (iv) the requirement of activation of ‘OA2’ neurons for production of the CR is lost after extended training, owing either to enhancement of the efficacy of synaptic transmission from ‘CS’ neurons to ‘CR’ neurons in this circuitry by extended training, so that the requirement of activation of ‘OA2’ neurons for activation of ‘CR’ neurons is lost (figure 3b), or to the development of a novel neural circuitry, outside the area where initial learning occurs, that can produce the CR without activation of ‘OA2’ neurons (figure 3c).
In short, the model proposes that a type of OA neurons represents the current value of the US and that CS-induced activation of the OA neurons controls the CR early in training, but the control is lost and hence the CR becomes independent of the current value of the US with extended training.
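The qualitative logic of assumptions (i) to (iv) can be written out as a toy Boolean sketch. This is an illustration of the verbal model only, not a simulation taken from the study: the names `CricketState` and `cr_executed` are hypothetical, neurons are reduced to on/off switches, the animal is assumed to have already learned the CS-US association, and OA receptor blockade is represented simply as loss of the effective 'OA2' signal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CricketState:
    extended_training: bool   # True after four trials x 3 days, False after standard training
    water_satiated: bool      # True after the water satiation (US devaluation) procedure
    oa_blocked: bool = False  # True when OA receptors are blocked (assumed equivalent to no OA2 signal)


def cr_executed(state: CricketState, cs_present: bool = True) -> bool:
    """Return True if the toy circuit produces the CR in a trained animal."""
    # (i) Presentation of the CS activates 'CS' neurons, which in turn drive 'OA2' neurons.
    cs_active = cs_present
    # (iii) 'OA2' activation is inhibited by water satiation; receptor blockade
    # is modelled here as removing the effective OA2 signal.
    oa2_signal = cs_active and not state.water_satiated and not state.oa_blocked
    if state.extended_training:
        # (iv) After extended training, 'CS' activity alone suffices to drive 'CR' neurons.
        return cs_active
    # (ii) After standard training, both 'CS' and 'OA2' activity are required.
    return cs_active and oa2_signal
```

In this sketch, `cr_executed(CricketState(extended_training=False, water_satiated=True))` evaluates to `False`, whereas the same call with `extended_training=True` evaluates to `True`, mirroring the devaluation results after standard and extended training. The sketch does not distinguish between the two alternative mechanisms in figure 3b and figure 3c.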
(d). Effect of blockade of octopamine receptors on execution of the conditioned response after standard training and extended training
The model predicts that blockade of OA-ergic transmission impairs execution of the CR after standard training but not after extended training. We next investigated whether this indeed occurs. At first, two groups were subjected to standard (four trials × 1 day) training, and animals in these groups received an injection of saline or saline containing epinastine, a potent antagonist of insect OA receptors [25], before the post-test (control and epinastine groups). Peppermint odour was used as the CS and vanilla odour was used as the control odour. The control group exhibited a significantly increased preference for the CS after training compared to that before training, but the epinastine group did not (figure 4a). We previously showed that injection of epinastine leads to suppression of the CR but not impairment of memory retention [16]. In experiments with extended (four trials × 3 days) training, on the other hand, both the control and epinastine groups exhibited a significantly increased preference for the CS after training compared to that before training (figure 4b). Thus, activation of OA neurons is needed for expression of the CR after standard training but not after extended training, in accordance with our model.
To confirm the reproducibility of the results shown in figure 4a,b, similar experiments were performed with either apple odour or banana odour being used as CS and the other being used as the control odour. In experiments with standard training, the control (saline-injected) group exhibited a significantly increased preference for the CS after training, but the epinastine group did not (figure 4c). On the other hand, in experiments with extended training, both the control group and epinastine group exhibited a significantly increased preference for the CS after training (figure 4d). The results confirm that normal synaptic transmission from OA neurons is needed for execution of the CR after standard training but not after extended training, regardless of the kinds of odours used in experiments.
Next, we performed experiments to determine whether the amount of training required to make the CR insensitive to OA receptor blockade matches that required to make the CR insensitive to reward devaluation. In experiments with 12 trials × 1 day training, the saline-injected control group exhibited a significantly increased preference for the CS after training compared to that before training, but the epinastine group did not (electronic supplementary material, figure S4a). In experiments with six trials × 2 days training, the saline-injected control group exhibited a significantly increased preference for the CS after training (electronic supplementary material, figure S4b), but the epinastine group did not. Between-group comparison, however, showed that the post-training preference for the CS in the epinastine group did not significantly differ from that in the control group. Thus, six trials × 2 days training is not sufficient to make the CR completely insensitive to blockade of OA receptors. We conclude that the amount of training to make the CR insensitive to blockade of OA receptors is, in general, in accordance with that to make the CR insensitive to US devaluation, although the match might not be perfect.
4. Discussion
This study demonstrated a novel feature of Pavlovian conditioning, namely, execution of the CR is initially sensitive to devaluation of US (reward) but becomes insensitive to reward devaluation after extended training in crickets. Many studies in instrumental conditioning in mammals (mostly in rats) suggest that learned actions are initially goal-directed and are sensitive to devaluation of the outcomes but that they become more insensitive to outcome devaluation after extended training [1–3], although it should be cautioned that such change does not occur in some types of training [1,4,5]. On the other hand, no evidence showing a decrement of the sensitivity of the CR to US devaluation by extended Pavlovian conditioning has been reported in mammals [11–13]. To our knowledge, this is the first study showing decrement of the control of the CR by current value of the US with extended Pavlovian training in any species of animal.
There is evidence, however, that contents of learning change with extended Pavlovian training in rats [11–13]. For example, after minimal training to associate an auditory or visual CS with a food US, consumption of food was reduced when the CS was paired with toxin (LiCl), indicating that the CS-evoked representation of the food US entered into association with the toxin, and this learning is referred to as mediated conditioning [11,12]. After extended training, however, the pairing of the CS with toxin did not reduce consumption of food, owing possibly to the reduction in the effectiveness of the CS to activate sensory aspects of representation of the food US after extended training. On the other hand, conditioned responding to the CS was reduced when the US was paired with toxin after both minimal and extended training, possibly because the effectiveness of the CS to activate motivational aspects of US representation does not change with the extension of training [11,12]. Therefore, the effect of extended training to reduce the contents of learning in rats is much subtler than that in crickets.
It should also be cautioned that we used a satiation procedure for reward devaluation in crickets, while taste-aversion learning has been used in most studies in rats [1,11,12]. Future studies are needed to investigate whether the different effects of reward devaluation on the CR after extended Pavlovian training in rats and crickets might be owing to the different procedures of reward devaluation.
We suggest that the loss of sensitivity of CR to reward devaluation by extended training is owing to loss of control of the CR by OA neurons that may mediate value signals of the US. Interestingly, it has been reported in Pavlovian conditioning in rats that blockade of dopaminergic (DA-ergic) transmission impairs execution of CR after minimal training but the sensitivity of the CR to blockade of DA-ergic transmission decreases with extended training [28,29]. It is not known, however, whether the decrease of DA-dependency of CR accompanies a decrease of sensitivity to US devaluation procedure in rats.
We propose an updated conceptual definition of habit formation to match the fact that the loss of sensitivity of learned behaviour to reward devaluation can occur in both instrumental conditioning and Pavlovian conditioning. Theorists argue that learned action early in instrumental training in mammals often depends on action–outcome (A–O) association but that the action becomes dependent more on stimulus–response (S–R) association with the progress of training [1–3,30]. In Pavlovian conditioning in crickets, it can be argued from our model that execution of the CR requires activation of both stimulus–stimulus (S–S) association and S–R association early in training, but that the CR becomes dependent solely on the S–R association after extended training (see figure 3). Thus, habit formation can be defined as learned behaviour becoming dependent more on S–R association by extended training, regardless of the associative learning being instrumental or Pavlovian. In rats, the distinction between S–R systems and S–S systems has been well documented in Pavlovian conditioning [31] and related tasks [2,32]. Both systems form from an early stage of training, and which of the two systems has more influence on behaviour depends on the type of training [33]. Since the distinction between S–S and S–R systems (and between A–O and S–R systems) matches the distinction between model-based and model-free systems in reinforcement learning theories [6], habit formation can also be considered as learned behaviour becoming more dependent on a model-free system after extended training.
In rats, it has been reported that reward devaluation reduces CRs after first- but not second-order conditioning, and argued that S–R association but not S–S association underlies CRs after second-order conditioning [34]. In crickets, we have proposed a model in which both S–S and S–R associations underlie CRs after second-order conditioning [16], and the model predicts that CRs after second-order conditioning are sensitive to reward devaluation. This prediction needs to be tested.
What is the possible functional significance of the changes of associative substrates underlying the CR with the extension of Pavlovian training? In instrumental conditioning, it has been argued that advantages of ‘action-to-habit’ transition include (i) making higher order cognitive centres free from routine tasks, and (ii) allowing more rapid responses [7]. Both might be applicable to Pavlovian conditioning, but the former might be particularly important for insects because insects have very small brains and thus neuronal resources that can be allocated for new learning would be limited.
In rodents, there is evidence that the shift from initial goal-directed action to habitual response with extended instrumental training is accompanied by a shift in the neural substrates subserving memory storage and execution of learned actions [7,9,10]. Therefore, our next step should be to study the neural circuitry that mediates changes in the CR with extended Pavlovian training, and the model proposed in figure 3 should provide a basis for this study. There is evidence in some insects that olfactory learning initially occurs in the mushroom body [26,27,35], and investigation is needed to determine whether the mushroom body or another area of the central olfactory pathway (figure 3b,c), such as the antennal lobe (primary olfactory centre), mediates the CR after extended Pavlovian training. Studies with microinjections of epinastine into specific brain regions, as well as immunohistochemical and electrophysiological characterization of OA neurons projecting to the mushroom body or the antennal lobe, will help to clarify this point.
In fruit flies, it has been reported that memory formed by Pavlovian conditioning with a small number of training sessions was generalized or transferred to a new behaviour, but that the generalization decreased as the number of training sessions increased. Moreover, evidence has been provided that the mushroom body is needed for this generalization [36]. That study demonstrated that extended training changes the neural substrates subserving learned behaviour in fruit flies. Investigation is needed to determine whether extended training in that paradigm is also accompanied by a change in sensitivity to the US devaluation procedure.
Psychologists argue that not only actions but also the way of thinking in humans (the way in which humans deal with sensory or social experiences) may become more stereotyped or automatic through repetitive sensory or social experiences, and that this may lead to psychological disorders [37,38]. Thus, it is of great theoretical interest to understand whether repetitive experiences of a given contingency (correlation) could give rise to a habit-like inflexible cognition or response in humans and other animals. In the current study, we provide, to our knowledge, the first demonstration of such a case in crickets.
Supplementary Material
Acknowledgements
We thank Yutaka Kosaki for helpful comments on the manuscript.
Data accessibility
A summary of the data for each experiment is provided in a PDF file as the electronic supplementary material.
Authors' contributions
M.M. designed the experiments; S.H., A.S., R.A., K.T. and Y.M. performed the behavioural and pharmacological experiments; S.H., K.T. and M.M. performed the statistical analysis; and M.M. wrote the manuscript with the help of the other authors. All authors gave final approval for publication.
Competing interests
The authors have no competing interests.
Funding
This study was supported by Grants-in-Aid for Scientific Research from the Ministry of Education, Science, Culture, Sports and Technology of Japan (nos. 16H04814 and 16K18586 to M.M.) and by a Grant-in-Aid for JSPS fellows (no. 15J01414 to K.T.).
References
- 1. Dickinson A. 1985. Action and habits: the development of behavioral autonomy. Phil. Trans. R. Soc. Lond. B 308, 67–78. (doi:10.1098/rstb.1985.0010)
- 2. Yin HH, Knowlton BJ. 2006. The role of the basal ganglia in habit formation. Nat. Rev. Neurosci. 7, 464–476. (doi:10.1038/nrn1919)
- 3. Smith KS, Graybiel AM. 2014. Investigating habits: strategies, technologies and models. Front. Behav. Neurosci. 8, 39.
- 4. Kosaki Y, Dickinson A. 2010. Choice and contingency in the development of behavioral autonomy during instrumental conditioning. J. Exp. Psychol. Anim. Behav. Process. 36, 334–342. (doi:10.1037/a0016887)
- 5. DeRusso AL, Fan D, Gupta J, Shelest O, Costa RM, Yin HH. 2010. Instrumental uncertainty as a determinant of behavior under interval schedules of reinforcement. Front. Integr. Neurosci. 4, 17. (doi:10.3389/fnint.2010.00017)
- 6. Dolan RJ, Dayan P. 2013. Goals and habits in the brain. Neuron 80, 312–325. (doi:10.1016/j.neuron.2013.09.007)
- 7. Wood W, Rünger D. 2016. Psychology of habit. Annu. Rev. Psychol. 67, 289–314. (doi:10.1146/annurev-psych-122414-033417)
- 8. Everitt BJ, Robbins TW. 2016. Drug addiction: updating actions to habits to compulsions ten years on. Annu. Rev. Psychol. 67, 23–50. (doi:10.1146/annurev-psych-122414-033457)
- 9. O'Hare JK, Ade KK, Sukharnikova T, Van Hooser SD, Palmeri ML, Yin HH, Calakos N. 2016. Pathway-specific striatal substrates for habitual behavior. Neuron 89, 472–479. (doi:10.1016/j.neuron.2015.12.032)
- 10. Gremel CM, Chancey JH, Atwood BK, Luo G, Neve R, Ramakrishnan C, Deisseroth K, Lovinger DM, Costa RM. 2016. Endocannabinoid modulation of orbitostriatal circuits gates habit formation. Neuron 90, 1312–1324. (doi:10.1016/j.neuron.2016.04.043)
- 11. Holland P. 1998. Amount of training affects associatively-activated event representation. Neuropharmacology 37, 461–469. (doi:10.1016/S0028-3908(98)00038-0)
- 12. Holland PC. 2005. Amount of training effects in representation-mediated food aversion learning: no evidence of a role for associability changes. Learn. Behav. 33, 464–478. (doi:10.3758/BF03193185)
- 13. Holland PC, Lasseter H, Agrawal I. 2008. Amount of training and cue-evoked taste-reactivity responding in reinforcer devaluation. J. Exp. Psychol. 34, 119–132.
- 14. Unoki S, Matsumoto Y, Mizunami M. 2005. Participation of octopaminergic reward system and dopaminergic punishment system in insect olfactory learning revealed by pharmacological study. Eur. J. Neurosci. 22, 1409–1416. (doi:10.1111/j.1460-9568.2005.04318.x)
- 15. Unoki S, Matsumoto Y, Mizunami M. 2006. Roles of octopaminergic and dopaminergic neurons in mediating reward and punishment signals in insect visual learning. Eur. J. Neurosci. 24, 2031–2038. (doi:10.1111/j.1460-9568.2006.05099.x)
- 16. Mizunami M, Unoki S, Mori Y, Hirashima D, Hatano A, Matsumoto Y. 2009. Roles of octopaminergic and dopaminergic neurons in appetitive and aversive memory recall in an insect. BMC Biol. 7, 46. (doi:10.1186/1741-7007-7-46)
- 17. Mizunami M, Terao K, Alvarez B. 2018. Application of a prediction error theory to Pavlovian conditioning in an insect. Front. Psychol. 9, 1272. (doi:10.3389/fpsyg.2018.01272)
- 18. Mizunami M, Matsumoto Y. 2017. Roles of octopamine and dopamine neurons for mediating appetitive and aversive signals in Pavlovian conditioning in crickets. Front. Physiol. 8, 1027. (doi:10.3389/fphys.2017.01027)
- 19. Matsumoto Y, Matsumoto CS, Mizunami M. 2018. Signaling pathways for long-term memory formation in the cricket. Front. Psychol. 9, 1014. (doi:10.3389/fpsyg.2018.01014)
- 20. Matsumoto Y, Mizunami M. 2002. Temporal determinants of long-term retention of olfactory memory in the cricket Gryllus bimaculatus. J. Exp. Biol. 205, 1429–1437.
- 21. Matsumoto Y, Noji S, Mizunami M. 2003. Time course of protein synthesis-dependent phase of olfactory memory in the cricket Gryllus bimaculatus. Zool. Sci. 20, 409–416. (doi:10.2108/zsj.20.409)
- 22. Terao K, Matsumoto Y, Mizunami M. 2015. Critical evidence for the prediction error theory in associative learning. Sci. Rep. 5, 8929.
- 23. Terao K, Mizunami M. 2017. Roles of dopamine neurons in mediating the prediction error in aversive learning in insects. Sci. Rep. 7, 14694. (doi:10.1038/s41598-017-14473-y)
- 24. Awata H, Watanabe T, Hamanaka Y, Mito T, Noji S, Mizunami M. 2015. Knockout crickets for the study of learning and memory: dopamine receptor Dop1 mediates aversive but not appetitive reinforcement in crickets. Sci. Rep. 5, 15885. (doi:10.1038/srep15885)
- 25. Awata H, et al. 2016. Roles of OA1 octopamine receptor and Dop1 dopamine receptor in mediating appetitive and aversive reinforcement revealed by RNAi studies. Sci. Rep. 6, 29696. (doi:10.1038/srep29696)
- 26. Liu C, et al. 2012. A subset of dopamine neurons signals reward for odour memory in Drosophila. Nature 488, 512–516. (doi:10.1038/nature11304)
- 27. Giurfa M. 2013. Cognition with few neurons: higher-order learning in insects. Trends Neurosci. 36, 285–294. (doi:10.1016/j.tins.2012.12.011)
- 28. Choi WY, Balsam PD, Horvitz JC. 2005. Extended habit training reduces dopamine mediation of appetitive response expression. J. Neurosci. 25, 6729–6733. (doi:10.1523/JNEUROSCI.1498-05.2005)
- 29. Clark JJ, Collins AL, Sanford CA, Phillips PE. 2013. Dopamine encoding of Pavlovian incentive stimuli diminishes with extended training. J. Neurosci. 33, 3526–3532. (doi:10.1523/JNEUROSCI.5119-12.2013)
- 30. Balleine BW, Dickinson A. 1998. Goal-directed instrumental action: contingency and incentive learning and their cortical substrates. Neuropharmacology 37, 407–419. (doi:10.1016/S0028-3908(98)00033-1)
- 31. Holland PC. 2008. Cognitive versus stimulus-response theories of learning. Learn. Behav. 36, 227–241. (doi:10.3758/LB.36.3.227)
- 32. Perks SM, Clifton PG. 1997. Reinforcer revaluation and conditioned place preference. Physiol. Behav. 61, 1–5. (doi:10.1016/S0031-9384(96)00243-0)
- 33. Clark JJ, Hollon NG, Phillips PE. 2012. Pavlovian valuation systems in learning and decision making. Curr. Opin. Neurobiol. 22, 1054–1061. (doi:10.1016/j.conb.2012.06.004)
- 34. Holland PC, Rescorla RA. 1975. The effect of two ways of devaluing the unconditioned stimulus after first- and second-order appetitive conditioning. J. Exp. Psychol. Anim. Behav. Process. 1, 355–363. (doi:10.1037/0097-7403.1.4.355)
- 35. Menzel R. 2012. The honeybee as a model for understanding the basis of cognition. Nat. Rev. Neurosci. 13, 758–768. (doi:10.1038/nrn3357)
- 36. Brembs B. 2009. Mushroom bodies regulate habit formation in Drosophila. Curr. Biol. 19, 1351–1355. (doi:10.1016/j.cub.2009.06.014)
- 37. Verplanken B, Fisher N. 2014. Habitual worrying and benefits of mindfulness. Mindfulness 5, 566–573. (doi:10.1007/s12671-013-0211-0)
- 38. Gardner B. 2015. A review and analysis of the use of ‘habit’ in understanding, predicting and influencing health-related behaviour. Health Psychol. Rev. 9, 277–295. (doi:10.1080/17437199.2013.876238)