Biology Letters. 2019 Jul 3;15(7):20190084. doi: 10.1098/rsbl.2019.0084

An optogenetic analogue of second-order reinforcement in Drosophila

Christian König 1, Afshin Khalili 1, Thomas Niewalda 1, Shiqiang Gao 2, Bertram Gerber 1,3,4
PMCID: PMC6684970  PMID: 31266421

Abstract

In insects, odours are coded by the combinatorial activation of ascending pathways, including their third-order representation in mushroom body Kenyon cells. Kenyon cells also receive intersecting input from ascending and mostly dopaminergic reinforcement pathways. Indeed, in Drosophila, presenting an odour together with activation of the dopaminergic mushroom body input neuron PPL1-01 leads to a weakening of the synapse between Kenyon cells and the approach-promoting mushroom body output neuron MBON-11. As a result of such weakened approach tendencies, flies avoid the shock-predicting odour in a subsequent choice test. Thus, increased activity in PPL1-01 stands for punishment, whereas reduced activity in MBON-11 stands for predicted punishment. Given that punishment-predictors can themselves serve as punishments of second order, we tested whether presenting an odour together with the optogenetic silencing of MBON-11 would lead to learned odour avoidance, and found this to be the case. In turn, the optogenetic activation of MBON-11 together with odour presentation led to learned odour approach. Thus, manipulating activity in MBON-11 can be an analogue of predicted, second-order reinforcement.

Keywords: Drosophila melanogaster, reinforcement, second-order conditioning, mushroom body, dopamine, prediction

1. Introduction

Animals and humans go to great lengths to obtain rewards, such as food and water, and to avoid punishment, such as bodily damage and pain. Essential to these processes is the learning of cues predictive of such actual or first-order reinforcement. Critically, predictive cues not only acquire learned valence but, once predictive relationships are established, can also confer learned valence themselves; i.e. they can serve as second-order reinforcement [1–3]. In humans, for example, learning that money can buy food establishes money as a second-order reward. In general, second-order conditioning may underlie chains of predictions and early anticipatory behaviour in humans and animals. Indeed, the capacity for second-order conditioning is widely distributed across the animal kingdom, including insects [4–7], and is implemented in many computational models of associative learning [8].

In flies, presenting odour A with an electric shock punishment and odour B without punishment leads to learned avoidance of A in a subsequent choice test. This learning of an odour as a predictor of electric shock takes place in the Kenyon cells (KCs) of the mushroom body (figure 1a) [9–12]. The mushroom body provides a sparse, combinatorial representation of the sensory environment, including odours. Along their long axonal fibres, the KCs further receive intersecting input from neurons mediating internal reinforcement, many of which are dopaminergic (DANs). The coincidence of activation by odour and of DAN signalling can lead to presynaptic plasticity at the cholinergic synapse between the KCs and the output neurons of the mushroom body (MBONs). Arborizations from DANs and MBONs overlap and are regionally confined along the KC fibres, establishing a characteristic compartmental organization. In the case of the PPL1-01 DAN mediating an internal punishment signal, synaptic strength between the odour-coding KCs and the approach-promoting MBON-11 is reduced [18,19]. For the punished odour, the innate balance between approach and avoidance is thus tilted in favour of avoidance. In other words, activity in PPL1-01 can provide first-order punishment, and an odour that predicts first-order punishment leads to reduced activity in MBON-11. We therefore wondered whether, in experimentally naive flies, optogenetically silencing MBON-11 might be an analogue of a punishment-predicting odour, such that it confers a second-order punishing effect upon an actually present odour associated with such silencing (also see [20]), and whether in turn optogenetically activating MBON-11 might have a rewarding effect.

Figure 1.


(a) Simplified account of odour–shock associative learning in flies (after [9–14]). Odour presentation in untrained animals mediates balanced approach and avoidance tendencies of mushroom body output neurons (MBONs). Coincidence of odour-evoked activity in the mushroom body Kenyon cells (KCs) and activity of the dopaminergic neuron PPL1-01 evoked by the electric shock leads to a depression of the synapses from these KCs to an approach-promoting MBON. In a subsequent test, this allows avoidance tendencies through non-depressed KC–MBON synapses in parallel compartments to prevail. The organization of innate olfactory, punishment- and reward-related behaviour largely bypasses the mushroom body. For simplicity KC–KC, KC–DAN, DAN–MBON and MBON–MBON synapses are omitted from this figure [15,16]. Cloud: odour; star: depressed/non-depressed KC–MBON synapse. A possible feedback from the MBONs towards the DANs is indicated. Note the multiple targets of MBON-11 within the ipsi- and contralateral mushroom body, as well as outside the mushroom body sketched in (d). (b) Presenting odour (cloud) with green light (star) leads to aversive associative memory in flies expressing the green-light-gated anion-channel GtACR1 in MBON-11, but not in genetic controls. (c) As in (b), using three training trials with an inter-trial interval of 3 min. (d) Sketch of connectivity of MBON-11; Greek letters refer to mushroom body lobes. Target regions of MBON-11 outside the mushroom body include MBON-01, the crepine (CRE) and the superior medial, intermediate and lateral protocerebrum (SMP, SIP, SLP) (after [13]). Postsynaptic partners of the contralateral branch of MBON-11 include PPL1-01 [17]. All these target regions could contribute to the reinforcing effects of manipulating the activity of MBON-11. Data are displayed as box plots (middle line: median; box boundaries and whiskers: 25/75% and 10/90% quantiles, respectively).
Data were analysed across groups by Kruskal–Wallis tests at p < 0.05, followed in the case of significance by pairwise comparisons with Mann–Whitney U-tests at p < 0.05 with Bonferroni–Holm correction (asterisk). Underlying preference scores can be found in the electronic supplementary material, figure S1. Sample sizes and statistical results can be found in the electronic supplementary material, table S1. A ‘+’ below box plots indicates the presence of the respective transgene. (Online version in colour.)

2. Material and methods

Procedures follow [21], unless mentioned otherwise. Drosophila melanogaster were maintained on standard food, with 60–70% relative humidity, at 25°C, and in constant darkness to prevent unintended optogenetic effects. Flies aged 1–3 days after hatching were collected and kept at 18°C for up to four additional days. MB320C and MB085C (Fly Light Split-GAL4 Driver Collection) [13] as driver strains covering the PPL1-01 and MBON-11 neurons, respectively, were crossed to UAS-ChR2-XXL (Bloomington stock number: 58374) [22] or UAS-GtACR1 as effectors for optogenetic activation or silencing, respectively. To generate the latter strain, the GtACR1 DNA was synthesized (Thermo Fisher Scientific) according to the published sequence [23] with codon usage optimized to D. melanogaster. The synthesized GtACR1 DNA with a C-terminal YFP was inserted into the expression vector pJFRC7. Embryo injection (BestGene Inc.) was performed to establish flies carrying UAS-GtACR1. Crosses for genetic controls yielded animals heterozygous for either construct. Synonyms for PPL1-01 are PPL1-γ1pedc and MB-MP1; synonyms for MBON-11 are MBON-γ1pedc>α/β and MB-MVP2.

Behavioural experiments used a set-up from CON-ELEKTRONIK (Greussenheim, Germany) and took place at 23–25°C and 60–80% relative humidity. Training was performed in red light, which is invisible to flies, and testing in darkness. As odorants, 50 µl benzaldehyde (BA) and 250 µl 3-octanol (OCT) (CAS 100-52-7, 589-98-0; both from Fluka, Steinheim, Germany) were applied to 1 cm-deep Teflon containers of 5 and 14 mm diameter, respectively. Flies were presented with both odours during training, but only one was paired with light for optogenetic activation (465 nm) or silencing (520 nm), whereas the other odour was presented alone (see electronic supplementary material, figure S2, for more details). The flies were then tested in a T-maze for their choice between the two odours. From the number of flies choosing each odour (#), the relative preference was calculated as

\[ \text{BA Preference} = \left( \frac{\#\text{BA} - \#\text{OCT}}{\#\text{Total}} \right) \times 100 \tag{2.1} \]

The presentation of BA and OCT with or without the light (*) was alternated between repetitions of the experiment, allowing an associative memory score to be obtained from reciprocally trained sets of flies as

\[ \text{Memory score} = \frac{\text{BA Preference}_{\text{BA}*} - \text{BA Preference}_{\text{OCT}*}}{2} \tag{2.2} \]
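As an illustration only, the two scores can be computed as in the following minimal Python sketch. It assumes that the preference is the difference between the BA and OCT counts divided by the total number of flies, times 100, and that the memory score is half the difference between the BA preferences of the two reciprocally trained groups (BA paired with light versus OCT paired with light). The function names and the fly counts are hypothetical and are not taken from the study.

```python
def ba_preference(n_ba: int, n_oct: int, n_total: int) -> float:
    """Relative preference for BA, ranging from -100 to 100."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * (n_ba - n_oct) / n_total


def memory_score(pref_ba_paired: float, pref_oct_paired: float) -> float:
    """Memory score from two reciprocally trained groups of flies.

    pref_ba_paired:  BA preference after BA was paired with light.
    pref_oct_paired: BA preference after OCT was paired with light.
    """
    return (pref_ba_paired - pref_oct_paired) / 2.0


# Hypothetical counts, for illustration only (not data from the study).
pref_ba_star = ba_preference(n_ba=30, n_oct=70, n_total=100)   # BA paired with light
pref_oct_star = ba_preference(n_ba=60, n_oct=40, n_total=100)  # OCT paired with light
score = memory_score(pref_ba_star, pref_oct_star)
print(pref_ba_star, pref_oct_star, score)
```

Under this convention, a negative memory score indicates that the flies avoided whichever odour had been paired with light (aversive memory), and a positive score indicates approach (appetitive memory).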

Data were analysed with Kruskal–Wallis tests (KW-tests) to compare more than two groups, Mann–Whitney U-tests (U-test) for pairwise comparisons, one-sample sign-tests for comparisons to chance level (i.e. zero), in all cases with Bonferroni–Holm corrections of p < 0.05 significance levels as appropriate, using Statistica 11.0 (StatSoft, Hamburg, Germany) and R 2.15.1 (www.r-project.org).
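To make two of these steps concrete, the following is a minimal Python sketch of a two-sided one-sample sign test against zero and of a Bonferroni–Holm step-down correction. It is not the Statistica or R code used for the study; the function names, the handling of ties (values equal to the reference are dropped) and the example scores are assumptions made for illustration.

```python
from math import comb


def sign_test_p(values, reference=0.0):
    """Two-sided one-sample sign test p-value against `reference`."""
    n_pos = sum(v > reference for v in values)
    n_neg = sum(v < reference for v in values)
    n = n_pos + n_neg  # values equal to the reference are dropped
    if n == 0:
        return 1.0
    k = min(n_pos, n_neg)
    # Binomial(n, 0.5) probability of k or fewer signs of the rarer kind.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


def holm_reject(p_values, alpha=0.05):
    """Bonferroni-Holm step-down procedure; one reject flag per p-value."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The smallest p is compared with alpha/m, the next with alpha/(m-1), ...
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # stop at the first non-significant comparison
    return reject


# Hypothetical memory scores for two groups (not data from the study).
group_a = [-30, -25, -40, -10, -35, -20, -15, -28, -33, -22]
group_b = [5, -3, 8, -6, 2, -1, 4, -7, 3, -2]
p_values = [sign_test_p(group_a), sign_test_p(group_b)]
print(p_values, holm_reject(p_values))
```

The Kruskal–Wallis and Mann–Whitney U-tests are not reimplemented here; in practice they would come from an established statistics package.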

3. Results

Presenting an odour together with optogenetically silencing MBON-11 via the green-light-gated anion-channel GtACR1 established aversive memory for the odour (figure 1b). This effect was replicated using three training cycles (figure 1c). Consideration of the genetic controls suggests a weak appetitive olfactory memory through the pairing of odour with the green light, which is visible to the flies. Critically, relative to either genetic control, silencing MBON-11 had a punishing effect. Conversely, does activating MBON-11 have a rewarding effect?

Presenting an odour together with optogenetically activating MBON-11 via the blue-light-gated cation-channel ChR2-XXL established appetitive memory for the odour (figure 2). Corresponding to what is typically observed for primary food rewards such as sugar [24], this appetitive memory appeared slightly stronger under starved conditions (figure 2c; indeed starvation was shown to facilitate MBON-11 activity [25]). In the case of blue light too, the data from the genetic controls suggest a weakly rewarding effect. We further note that relative to the respective genetic controls, the punishing effect of silencing MBON-11 (figure 1c) appears to be stronger than the rewarding effect of activating it (figure 2b).

Figure 2.


(a,b) The same as in figure 1b, c but using ChR2-XXL to activate MBON-11 (star). This leads to stronger appetitive learning in the experimental genotype than in genetic controls. (c) Same as the experiment in (a), but with an initial 18 h period of wet starvation, which improves appetitive learning [24]. Underlying preference scores can be found in electronic supplementary material, figure S3. Sample sizes and statistical results can be found in electronic supplementary material, table S1. Other details as in the legend of figure 1. (Online version in colour.)

We conclude that silencing/activating MBON-11 has a punishing/rewarding effect.

4. Discussion

MBON-11 is GABAergic [13]. It targets premotor circuitry outside the mushroom bodies, and hetero-compartmental regions in the ipsi- and the contralateral mushroom body, and furthermore features a homo-compartmental and contralateral feedback loop onto the dopaminergic, punishing PPL1-01 neuron (figure 1d) [13,17,25,26]. All of these regions could contribute to reinforcement through manipulation of MBON-11 activity, and we expressly do not draw a conclusion as to which of these regions is indeed involved in these reinforcing effects. One scenario is that silencing MBON-11 lifts inhibition from PPL1-01, promotes PPL1-01 activity and thus exerts a punishing effect (but see [20]). Accordingly, the observation that activating MBON-11 has just a mild rewarding effect (figure 2) would suggest that spontaneous activity in PPL1-01 is moderate, and thus that silencing PPL1-01 would have less effect than activating it. Indeed, as previously reported, activating PPL1-01 is very strongly punishing (electronic supplementary material, figure S4B) [14,19], whereas silencing it has no measurable rewarding effect (electronic supplementary material, figure S4C) (see [27] for a punishing effect of silencing the DAN of the γ3 compartment). This scenario would therefore suggest that targets other than PPL1-01 are responsible for the rewarding effect of activating MBON-11 (also see [20]).

Interestingly, the pathway from MBON-11 onto the glutamatergic MBON-01 neuron of the γ5 compartment and further from MBON-01 to the rewarding DANs of that compartment is critical for extinction learning after aversive training ([26]; also see [25]) (synonyms for MBON-01 are MBON-γ5β′2a and MB-M6). According to the scenario put forward in [26, fig. 7E-F], odours presented with MBON-11 silencing should lift inhibition from MBON-11 to MBON-01 and should thus drive the rewarding DANs of the γ5 compartment. This indirect, hetero-compartmental connection would thus support appetitive learning through MBON-11 silencing, whereas aversive learning would result for odours presented with MBON-11 activation—which is the opposite of what we report here! To reconcile this contradiction, consider that during second-order conditioning a stimulus X is first paired with primary reinforcement, and then X is presented together with a novel stimulus A in the absence of primary reinforcement. Whereas during AX training the effects of X as a reinforcement-predicting, second-order reinforcer will initially dominate, extended AX training will extinguish the X-with-reinforcement association. The above scenario would thus suggest that the opposing effects of second-order reinforcement and extinction learning, well known to practitioners of this paradigm, are related to homo- versus hetero-compartmental processes.

We note that placing the behavioural effects of manipulating MBON-11 activity into an experimental psychology framework of secondary reinforcement processing also encompasses the effect labelled ‘BGAM’ (for blockade of MBON-γ1pedc-induced aversive memory) [20, fig. 3B,C], obtained by blocking synaptic output from MBON-11 (also see [28]). Critically, the present framework suggests that silencing MBON-11 or preventing synaptic output from it leads to aversive learning about the odour paired with such treatment, whereas [20, p. 569] suggests that synaptic output from MBON-11 is necessary to prevent aversive learning about odours presented in an unpaired manner (for a discussion of paired and unpaired learning, see [29]).

We think that it is interesting that activity in a cell such as MBON-11 can be an analogue of second-order reinforcement, because this is the earliest site efferent to the memory trace in the presynaptic terminals of the mushroom body KCs for such an effect. This might inform the search for such analogues of secondary reinforcement in other species. It also raises the question of how much further down efferent pathways such analogues of second-order reinforcement can be observed, and indeed what the relation of action to valence is.

Supplementary Material

Figure S1: rsbl20190084supp1.tiff (2.9MB, tiff)
Figure S2: rsbl20190084supp2.tiff (2.9MB, tiff)
Figure S3: rsbl20190084supp3.tiff (2.9MB, tiff)
Figure S4: rsbl20190084supp4.tiff (2.9MB, tiff)
Table S1: rsbl20190084supp5.xlsx (89KB, xlsx)

Acknowledgements

We thank R. Kittel, University of Leipzig, Germany and G. Nagel, University of Würzburg, Germany, for providing the effector strains for GtACR1 and ChR2-XXL expression, and R. D. V. Glasgow, Zaragoza, Spain, for language editing.

Ethics

All experiments comply with applicable law and ethics regulations.

Data accessibility

For data and statistical report, see the electronic supplementary material, table S1.

Authors' contributions

B.G. and C.K. conceived and coordinated the project; S.G. generated UAS-GtACR1; other authors collected and/or analysed the data. All authors contributed to the preparation of the manuscript by drafting or critically revising it, gave their final approval, and are accountable for its content.

Competing interests

We declare we have no competing interests.

Funding

This study was supported by Deutsche Forschungsgemeinschaft CRC 779-B11, GE1091/4-1 and FOR 2705 (to B.G.), as well as Leibniz Institute for Neurobiology, Magdeburg.

References

1. Pavlov IP. 1927. Conditioned reflexes. (Transl. by GV Anrep.) Oxford, UK: Oxford University Press.
2. Rescorla RA. 1980. Pavlovian second order conditioning: studies in associative learning. Hillsdale, NJ: Erlbaum.
3. Parkes SL, Westbrook RF. 2011. Role of the basolateral amygdala and NMDA receptors in higher-order conditioned fear. Rev. Neurosci. 22, 317–333. (doi:10.1515/RNS.2011.025)
4. Bitterman ME, Menzel R, Fietz A, Schäfer S. 1983. Classical conditioning of proboscis extension in honeybees (Apis mellifera). J. Comp. Psychol. 97, 107–119.
5. Hussaini SA, Komischke B, Menzel R, Lachnit H. 2007. Forward and backward second-order Pavlovian conditioning in honeybees. Learn. Mem. 14, 678–683. (doi:10.1101/lm.471307)
6. Brembs B, Heisenberg M. 2001. Conditioning with compound stimuli in Drosophila melanogaster in the flight simulator. J. Exp. Biol. 204, 2849–2859.
7. Tabone CJ, de Belle JS. 2011. Second-order conditioning in Drosophila. Learn. Mem. 18, 250–253. (doi:10.1101/lm.2035411)
8. Malaka R. 1999. Models of classical conditioning. Bull. Math. Biol. 61, 33. (doi:10.1006/bulm.1998.9998)
9. Heisenberg M. 2003. Mushroom body memoir: from maps to models. Nat. Rev. Neurosci. 4, 266–275. (doi:10.1038/nrn1074)
10. Guven-Ozkan T, Davis RL. 2014. Functional neuroanatomy of Drosophila olfactory memory formation. Learn. Mem. 21, 519–526. (doi:10.1101/lm.034363.114)
11. Owald D, Waddell S. 2015. Olfactory learning skews mushroom body output pathways to steer behavioral choice in Drosophila. Curr. Opin. Neurobiol. 35, 178–184. (doi:10.1016/j.conb.2015.10.002)
12. Gerber B, Aso Y. 2017. Localization, diversity and behavioral expression of associative engrams in Drosophila. In Learning and memory: a comprehensive reference (ed. Byrne J.), vol. 1 (Learning theory and behavior (ed. R Menzel)), pp. 463–473, 2nd edn. Oxford, UK: Elsevier.
13. Aso Y, et al. 2014. The neuronal architecture of the mushroom body provides a logic for associative learning. eLife 3, e04577. (doi:10.7554/elife.04577)
14. Aso Y, et al. 2014. Mushroom body output neurons encode valence and guide memory-based action selection in Drosophila. eLife 3, e04580. (doi:10.7554/elife.04580)
15. Eichler K, et al. 2017. The complete connectome of a learning and memory centre in an insect brain. Nature 548, 175–182. (doi:10.1038/nature23455)
16. Takemura SY, et al. 2017. A connectome of a learning and memory center in the adult Drosophila brain. eLife 6, e26975. (doi:10.7554/eLife.26975)
17. Pavlowsky A, Schor J, Plaçais P-Y, Preat T. 2018. A GABAergic feedback shapes dopaminergic input on the Drosophila mushroom body to promote appetitive long-term memory. Curr. Biol. 28, 1783–1793. (doi:10.1016/j.cub.2018.04.040)
18. Aso Y, Siwanowicz I, Bräcker L, Ito K, Kitamoto T, Tanimoto H. 2010. Specific dopaminergic neurons for the formation of labile aversive memory. Curr. Biol. 20, 1445–1451. (doi:10.1016/j.cub.2010.06.048)
19. Hige T, Aso Y, Modi MN, Rubin GM, Turner GC. 2015. Heterosynaptic plasticity underlies aversive olfactory learning in Drosophila. Neuron 88, 985–998. (doi:10.1016/j.neuron.2015.11.003)
20. Ueoka Y, Hiroi M, Abe T, Tabata T. 2017. Suppression of a single pair of mushroom body output neurons in Drosophila triggers aversive associations. FEBS Open Biol. 7, 562–576. (doi:10.1002/2211-5463.12203)
21. König C, Khalili A, Ganesan M, Nishu AP, Garza AP, Niewalda T, Gerber B, Aso Y, Yarali A. 2018. Reinforcement signaling of punishment versus relief in fruit flies. Learn. Mem. 25, 247–257. (doi:10.1101/lm.047308.118)
22. Dawydow A, et al. 2014. Channelrhodopsin-2–XXL, a powerful optogenetic tool for low-light applications. Proc. Natl Acad. Sci. USA 111, 13 972–13 977. (doi:10.1073/pnas.1408269111)
23. Govorunova EG, Sineshchekov OA, Janz R, Liu XQ, Spudich JL. 2015. Natural light-gated anion channels: a family of microbial rhodopsins for advanced optogenetics. Science 349, 647–650. (doi:10.1126/science.aaa7484)
24. Tempel BL, Bonini N, Dawson DR, Quinn WG. 1983. Reward learning in normal and mutant Drosophila. Proc. Natl Acad. Sci. USA 80, 1482–1486. (doi:10.1073/pnas.80.5.1482)
25. Perisse E, Owald D, Barnstedt O, Talbot CB, Huetteroth W, Waddell S. 2016. Aversive learning and appetitive motivation toggle feed-forward inhibition in the Drosophila mushroom body. Neuron 90, 1086–1099. (doi:10.1016/j.neuron.2016.04.034)
26. Felsenberg J, et al. 2018. Integration of parallel opposing memories underlies memory extinction. Cell 175, 709–722. (doi:10.1016/j.cell.2018.08.021)
27. Yamagata N, Hiroi M, Kondo S, Abe A, Tanimoto H. 2016. Suppression of dopamine neurons mediates reward. PLoS Biol. 14, e1002586. (doi:10.1371/journal.pbio.1002586)
28. Ichinose T, Aso Y, Yamagata N, Abe A, Rubin GM, Tanimoto H. 2015. Reward signal in a recurrent circuit drives appetitive long-term memory formation. eLife 4, e10719. (doi:10.7554/elife.10719)
29. Schleyer M, Fendt M, Schuller S, Gerber B. 2018. Associative learning of stimuli paired and unpaired with reinforcement: evaluating evidence from maggots, flies, bees, and rats. Front. Psychol. 9, 1494. (doi:10.3389/fpsyg.2018.01494)



Articles from Biology Letters are provided here courtesy of The Royal Society
