Skip to main content
eLife logoLink to eLife
. 2020 Jan 15;9:e50901. doi: 10.7554/eLife.50901

A genetic, genomic, and computational resource for exploring neural circuit function

Fred P Davis 1,2,†,‡,, Aljoscha Nern 1,, Serge Picard 1, Michael B Reiser 1, Gerald M Rubin 1, Sean R Eddy 1,3,4, Gilbert L Henry 1,5,
Editors: Hugo J Bellen6, K VijayRaghavan7
PMCID: PMC7034979  PMID: 31939737

Abstract

The anatomy of many neural circuits is being characterized with increasing resolution, but their molecular properties remain mostly unknown. Here, we characterize gene expression patterns in distinct neural cell types of the Drosophila visual system using genetic lines to access individual cell types, the TAPIN-seq method to measure their transcriptomes, and a probabilistic method to interpret these measurements. We used these tools to build a resource of high-resolution transcriptomes for 100 driver lines covering 67 cell types, available at http://www.opticlobe.com. Combining these transcriptomes with recently reported connectomes helps characterize how information is transmitted and processed across a range of scales, from individual synapses to circuit pathways. We describe examples that include identifying neurotransmitters, including cases of apparent co-release, generating functional hypotheses based on receptor expression, as well as identifying strong commonalities between different cell types.

Research organism: D. melanogaster

eLife digest

In the brain, large numbers of different types of neurons connect with each other to form complex networks. In recent years, researchers have made great progress in mapping all the connections between these cells, creating ‘wiring diagrams’ known as connectomes.

However, charting the connections between neurons does not give all the answers as to how the brain works; for example, it does not necessarily reveal the nature of the information two connected cells exchange. Assessing which genes are switched on in different neurons can give insight into neuronal properties that are not obvious from physical connections alone.

To fill that knowledge gap, Davis, Nern et al. aimed to measure the genes expressed in a well-characterized network of neurons in the fruit fly visual system. First, 100 fly strains were established, each carrying a single type of neuron colored with a fluorescent marker. Then, a biochemical approach was developed to extract the part of the cell that contains the genetic code from the neurons with the marker. Finally, a statistical tool was used to assess which genes were on in each type of neurons. This led to the creation of a database that shows whether 15,000 genes in each neuron type across 100 fly strains were switched on.

Combining this information with previous knowledge about the flies’ visual system revealed new information: for example, it helped to understand which chemicals the neurons use to communicate, and whether certain cells activate or inhibit each other.

The work by Davis, Nern et al. demonstrates how genetic approaches can complement other methods, and it offers a new tool for other scientists to use in their work. With more advanced genetic methods, it may one day become possible to better grasp how complex brains in other organisms are organized, and how they are disrupted in disease.

Introduction

The anatomy of neural circuits is being characterized with increasing resolution and throughput, in part following a dramatic increase in the size of circuits amenable to detailed electron microscopy reconstruction (Swanson and Lichtman, 2016) and the development of genetic tools to access individual cell types (Luo et al., 2018). These efforts reveal anatomy at unprecedented detail, but not the molecular properties of cells. In principle, the genes expressed in each cell of a neural circuit should serve as a molecular proxy for cell physiology. However, most genomic efforts have focused on surveying neuronal diversity rather than characterizing circuit function (Ecker et al., 2017). To develop a resource exploring molecular correlates of circuit function, here we use an approach that genetically targets cell types within a well-characterized brain region to measure high-quality transcriptomes that can be integrated with connectomes.

Drosophila affords an ideal system to study neural circuits in detail, as both excellent genetic tools and high resolution connectomes are available. Here we focus on the repeating columnar circuits of the visual system, found in the optic lobes, a widely used model for studying circuit development and function with an extensive genetic toolbox and well-described anatomy (Figure 1A; Nériec and Desplan, 2016; Silies et al., 2014; Apitz and Salecker, 2014). This network begins with photoreceptor neurons and contains several layers of connected neurons which process incoming luminance signals into multiple parallel streams of visual information (Figure 1B). Many of its cellular components have been described by light microscopy, including classical Golgi studies (Fischbach and Dittrich, 1989) and recent analyses using genetic methods (Morante and Desplan, 2008Otsuna and Ito, 2006; Nern et al., 2015; Wu et al., 2016). Electron microscopy reconstruction work has characterized the synaptic connections of many optic lobe neurons (Meinertzhagen and O'Neil, 1991; Meinertzhagen and Sorra, 2001; Rivera-Alba et al., 2011; Takemura et al., 2013; Takemura et al., 2015; Takemura et al., 2017; Shinomiya et al., 2019). Comparative studies have also explored the evolution of this ancient brain structure (Strausfeld, 2009). Despite this wealth of information, many of its fundamental properties remain unknown, including the neurotransmitters used at many of its synapses.

Figure 1. Genetic tools to access cell types in the Drosophila visual system.

(A) Major brain regions profiled in this study (brain image from Jenett et al., 2012). The optic lobes have a repetitive structure of ~750 retinotopically arranged visual columns of similar cellular composition. (B,C) Examples of single cells in the optic lobe. (B) Left, subregions of the fly visual system. Right, examples of layers and neuropil patterns of various classes of visual system neurons. (C) We profiled cell types arborizing in the lamina (blue), medulla (purple) and lobula complex (green) of the visual system. Many cells contribute to multiple neuropiles so other groupings are possible. Note, some cell types are present at one cell per column, while others are less numerous with cells that each contribute to several columns. For example, the main synaptic region of the first optic lobe layer, the lamina, contains processes of some 13,000 cells but these belong to only 17 main cell types: 14 neuronal and three glial (Figure 1C, top row). A small number of additional neurons (lamina tangential cells, Lat) project to a region just distal to the main lamina neuropile. (D) Representative expression patterns of driver lines that target specific cell types. Each image is a maximum intensity projection of a whole brain confocal stack (only one optic lobe is shown). In each image the brain is counter-stained (magenta) with a neuropil marker and both the targeted cell type and the driver are indicated in the lower left and right corner, respectively. Additional images (focusing on drivers first described in this study) are shown in Figure 1—figure supplements 1 and 2. Imaging parameters and brightness and contrast were adjusted individually for each image. For genotypes and image details see Supplementary file 1E.

Figure 1.

Figure 1—figure supplement 1. Whole brain expression patterns of new driver lines generated in this study.

Figure 1—figure supplement 1.

(A) Maximum intensity projection of confocal stacks taken from whole fly brains (only one optic lobe is shown). Expression patterns of the driver lines (myristoylated-GFP) are in green and a neuropil marker is in magneta. Imaging parameters and brightness and contrast were individually adjusted for each sample. (B, B’) T4 and T5 cells comprise four subtypes (a,b,c,d) each of which project to specific layers of the lobula plate (B’). (C, C’). Individual T4 and T5 driver lines label combinations of subtypes but show preferential expression in some subtypes. Subtypes were identified by their projections to specific layers in the lobula plate (C,C’). For example, T5_d2 mainly labels lobula plate layers one and two, indicating expression in T5a and T5b. Each of the lower panels is a higher magnification view of the lobula plate region (C’). In (C) both the driver and the split identifier are indicated in the lower left and right corner respectively.

Figure 1—figure supplement 2. Optic lobe patterns of driver lines.

Figure 1—figure supplement 2.

(A) Optic lobe expression patterns of new driver lines used in this study. All images orient the mediolateral axis of the brain along the lower left to upper right diagonal and are resampled substack projections generated from high resolution (63x) confocal stacks. (B) Examples of segmented single cells illustrating cell morphology (left), the complete optic lobe expression pattern (middle) or individual cells labeled by MultiColor FlpOut (MCFO, right). In all images the neuropil marker is in gray and both the targeted cell type and driver are indicated in the lower left and right corner respectively.

Measuring the genes expressed in specific cells of the brain is challenging due to its compact and complex organization. RNA sequencing (RNA-seq) addresses this challenge by profiling either single cells or genetically labeled populations of cells (Ecker et al., 2017). The latter approach requires genetic tools to access individual cell types but provides more direct access to cells of interests than sampling of unmarked single cells, especially for sparse cell types. Profiling identified cell types provides a direct link to previous work on the anatomy and physiology of those cell types. Cell type-specific drivers also facilitate follow-up experiments, for example evaluating the role of individual genes in individual cells. In Drosophila, large collections of GAL4 driver lines (Jenett et al., 2012; Tirian and Dickson, 2017) and the possibility to further refine these patterns with intersectional methods such as split-GAL4 (Luan et al., 2006; Dionne et al., 2018) enable genetic access to many neuronal populations (see, for example, Tuthill et al., 2013; Aso et al., 2014; Wu et al., 2016). We therefore chose the genetic, rather than single cell, approach to build a genomics resource to explore circuit function.

We previously developed an Isolation of Nuclei Tagged in a specific Cell Type (INTACT) method (Deal and Henikoff, 2010) to measure transcriptomes and epigenomes of genetically-marked neuronal populations in Drosophila (Henry et al., 2012) and mouse (Mo et al., 2015). Here, we develop a tandem affinity purification of INTACT nuclei (TAPIN) method with increased specificity, sensitivity, and throughput. By combining this method with an extensive set of new driver lines with predominant expression in specific cell types and a new probabilistic method to interpret transcript abundance, we build a resource of high-quality transcriptomes for one hundred driver lines. We selected drivers that expressed in cell types constituting the lamina (Fischbach and Dittrich, 1989; Tuthill et al., 2013; Edwards et al., 2012) as well as the major cell types of the circuits that compute the direction of visual motion (Mauss et al., 2017; Figure 1C). We further included neuronal populations in two central brain regions, the mushroom body and central complex, primarily to serve as informative outgroups.

By profiling these driver lines, we develop an expression catalog for 67 Drosophila cell types as well as several broader cell populations. Through validation experiments and comparisons to the literature we demonstrate that this resource is useful both for identifying individual genes expressed in specific cell types and for revealing broader patterns such as the expression of all members of a gene family across many cell types. As an example, we describe the expression of neurotransmitters and their receptors and use this information to interpret synaptic connectivity. For example, we unexpectedly found that the R8 photoreceptors express acetylcholine in addition to histamine and show that this apparent co-transmitter phenotype is further supported by differential expression of neurotransmitter receptors in R8 postsynaptic partners. Our results demonstrate that combining expression and connectomes leads to specific testable hypotheses about circuit mechanisms that are inaccessible to either approach alone.

Results

Genetic tools for labeling the visual system

To enable transcriptome analyses of defined cell populations, we first assembled a collection of genetic drivers to access them. For this study, we combined drivers from existing collections for cell types in the lamina (Tuthill et al., 2013), the mushroom body (Aso et al., 2014), and the lobula (Wu et al., 2016) with new driver lines for many additional optic lobe cell types and also some neurons of the central complex (Wolff and Rubin, 2018), T. Wolff, personal communication). Nearly all of these drivers were generated using an intersectional method, split-GAL4, to refine expression patterns of GAL4 driver lines. To characterize new driver lines, we imaged expression patterns across the entire fly brain to determine overall driver specificity (Figure 1D, Figure 1—figure supplement 1) and examined anatomical features such as layer patterns in higher resolution images to identify specific cell types (Supplementary file 1A, Figure 1—figure supplement 2). For most lines, we further confirmed the identity of labeled cells by examining the morphology of individual cells using stochastic labeling (Figure 1—figure supplement 2). Although we noted that a few patterns also include some additional contaminating cells (Supplementary file 1A), these driver lines are the most specific tools currently available to access individual cell types in the optic lobe.

Purifying nuclei with TAPIN

Next, we employed an improved INTACT method to measure nuclear transcriptomes in genetically defined cell populations (Henry et al., 2012), and we also developed a new variant of the method that permits higher throughput with increased purity and sensitivity. In both approaches, nuclei are purified using a nuclear tag whose expression is driven in a cell population of interest by either a standard or split GAL4 driver (Figure 2A). The new variant protocol, tandem affinity purification of INTACT nuclei (TAPIN), uses a bacterial protease (IdeZ) to specifically cleave antibodies in the hinge region separating their Fc and antigen binding F(ab’)2 fragments (Figure 2B, Figure 2—figure supplement 1B). Treating protein A magnetic bead-bound nuclei with this protease generates both nucleus-F(ab’)2 and bead-Fc complexes. Soluble nucleus-F(ab’)2 is then recaptured on protein G magnetic beads, removing non-specifically bound material from the first capture. INTACT successfully profiled many of the abundant cell types in the optic lobe (>1000 cells per brain), but failed for sparser cell types and those whose nuclei were difficult to purify by differential centrifugation (photoreceptors, glia, T4, T5). We solved these problems with TAPIN, which does not purify nuclei prior to bead capture.

Figure 2. Tandem-affinity purification of INTACT nuclei (TAPIN) enables neuronal genomics.

(A) Cell type-specific drivers enable expression of the UNC84-2XGFP nuclear tag (green) in specific populations of cells. Both the targeted cell type and driver are indicated in the lower left and right corner, respectively. (B) Following nuclei harvest, two rounds of magnetic bead capture serially purify target nuclei. After the first round of protein A bead capture, bacterial protease IdeZ cleaves the anti-GFP antibody in the flexible hinge region, allowing a second round of bead capture with protein G, which recognizes the F(ab’)2 region. Protein G, unlike Protein A, can bind both the Fc and F(ab’)2 regions of an immunoglobulin. (C) Two capture rounds reduce the level of non-specific background (gray bars, mock IgG control) while maintaining the cDNA yield from the captured target nuclei (green bars). Bars represent the mean of two replicates (shown as points). (D) RNA-seq libraries created with more nuclei yield more cDNA (circles). TAPIN libraries had lower non-specific background than INTACT (blue vs orange triangles). (E) Libraries with more cDNA detect more genes. (F) Libraries with more cDNA have more reproducible transcript abundances. (G) Previously identified markers of lamina monopolar and inner photoreceptor neurons (Tan et al., 2015) are enriched in the expected cells.

Figure 2.

Figure 2—figure supplement 1. Two variants of nuclei capture, INTACT and TAPIN.

Figure 2—figure supplement 1.

(A) We used an INTACT-seq variant, that we originally developed for mouse (Mo et al., 2015), that purifies nuclei by differential centrifugation. (B) TAPIN-seq replaces the space- and time-intensive centrifugation with a two-step capture enabled by antibody hinge cleavage with the bacterial protease IdeZ. Both protein A and protein G bind the Fc region, while only protein G is able to bind F(ab’)2. (C) Libraries built from more nuclei have more transcript molecules (estimated using synthetic spike-ins). (D) Nearly all libraries showed relatively unbiased positional coverage across gene bodies. (E) The maximum bias in positional coverage observed in each library was inversely correlated with cDNA yield, although with large variance in bias for lower yield libraries. (F) T4.T5 transcriptomes of female (y-axis) and male (x-axis) flies are well correlated, but also recover known sex-specific genes including RNA on X 1 (roX1) and roX2 (Amrein and Axel, 1997) and yolk protein 1 (Yp1) and Yp3 (Belote et al., 1985). (G) Estimated transcript abundances were reproducible as evaluated by Pearson correlation (of log-transformed transcript abundance) between biological replicates (black), alternative drivers for the same cell type (orange), or comparing TAPIN to INTACT profiles (blue).
Figure 2—figure supplement 2. TAPIN-seq vs FACS-seq comparison.

Figure 2—figure supplement 2.

(A) TAPIN-seq expression of marker genes identified from an independent FACS-seq dataset covering 12 cell types we also profiled (Konstantinides et al., 2018). We defined marker genes for each cell type as the top-10 most highly expressed genes relative to the mean of all cell types, requiring at least 4x higher abundance than the mean and a relative abundance of at least 50 TPM. (B) We more broadly compared the TAPIN-seq and FACS-seq datasets by first identifying cell type-enriched genes within each dataset (at least two-fold higher than mean expression; at least 50 TPM in one sample) and then quantifying the degree of overlap between datasets using the overlap coefficient: 100 * (# genes enriched in both TAPIN-seq AND FACS-seq)/minimum(# genes enriched in TAPIN-seq, # genes enriched in FACS-seq)).

The greatest advantage of TAPIN is its ability to purify nuclei from sparse cell types (<50 cells/brain) (Supplementary file 1A). INTACT is not suitable for these lines because of loss during differential centrifugation. This difficulty cannot be overcome by processing more brains per experiment because differential centrifugation is difficult to scale. TAPIN solves this problem by running a first capture on crude extracts generated from hundreds to thousands of fly heads. The substantial background in this first capture is reduced 5- to 6- fold in a second capture with only a modest decline in both the yield of nuclei and amplified cDNA (Figure 2C).

Measuring transcriptomes with INTACT- and TAPIN-seq

We applied INTACT and TAPIN to the cell populations defined by the genetic drivers we described above (Supplementary file 1B). Most drivers express in a single anatomically defined cell type or a small group of related cell types. We note that a few of our cell types are strictly groups of related cell types (for example, the muscle cells or, at a different level of a cell type hierarchy, the T4 and T5 cells, with four subtypes each). Additional drivers target more heterogeneous cell populations sharing a common property (e.g., driver lines aimed at recapitulating the expression of a neurotransmitter marker). Altogether, we built 250 RNA-seq libraries from 242 samples of purified nuclei (46 using INTACT and 196 using TAPIN) and eight manually dissected samples (Supplementary file 1B). We estimated relative transcript abundance in each library using kallisto (Bray et al., 2016). Libraries built from more nuclei yielded more cDNA (Figure 2D), allowed more genes to be detected (Figure 2E), had more estimated transcripts (Figure 2—figure supplement 1C), more reproducible transcript abundance (Figure 2F), and less bias in coverage across gene bodies (Figure 2—figure supplement 1D,E). We focused on 203 libraries that had at least 8500 genes detected, 3µg cDNA yield, and 0.85 Pearson’s correlation of transcript abundances in two biological replicates. These 203 libraries consist of at least two biological replicates built from 100 drivers that covered 67 cell types (53 visual system, 7 mushroom body, 5 central complex, 2 muscle), 6 broader cell populations (ChAT, Gad1, VGlut, Kdm2, Crz, and NPF), and 2 manually dissected tissues (the lamina and remainder of the optic lobe) (Materials and methods). We provide the read and abundance data for the remaining sub-optimal libraries (47 libraries covering 24 cell types) in the event they may be informative, but we do not consider these to be of sufficient quality and do not consider them further here. We did not sort the sex of flies when preparing TAPIN-seq libraries, as we did not observe large differences in male and female expression profiles (Figure 2—figure supplement 1F).

We were encouraged by the clear enrichment of previously identified markers in cell types where they were expected. For example, we recovered transcription factors (TFs) previously found in the developing monopolar interneurons and inner photoreceptors (Tan et al., 2015; Figure 2G). We further confirmed our measurements by comparing TAPIN-seq results for twelve cell types that were also recently profiled by FACS-seq (Konstantinides et al., 2018; Figure 2—figure supplement 2A,B) and found concordant expression of cell type-enriched genes. This concordance also argues against major differences between nuclear and cytoplasmic transcriptomes. In combination with the technical quality of our libraries, this confirmation by independent gene expression measurements validated our approach, and also motivated us to explore how to best interpret a large dataset of relative abundances.

Interpreting transcript abundance with mixture modeling

Deriving biological insights from a matrix of transcript abundances is not straightforward. While a cell’s expression of a gene can be used to infer a specific functional property of that cell, the level of expression that is needed to establish confidence in such an inference is much less clear. For example, expressing the vesicular acetylcholine transporter (VAChT) implies that a neuron is cholinergic. However, VAChT transcript abundance exhibits a wide distribution and it is not clear, a priori, what level is necessary to conclude that a cell is cholinergic (Figure 3A).

Figure 3. Mixture modeling accurately interprets TAPIN-seq measurements.

(A) The distribution of Vesicular acetylcholine transporter (VAChT) abundance fit with a mixture of two log-normal components. (B) Interpreting these components as ‘off’ and ‘on’ states unambiguously infers expression state in essentially all drivers. (C) Mixture modeling transforms our catalog of relative transcript abundances (top) to discretized expression states (bottom). (D) Histogram of expression breadth per gene. (E) Cumulative distributions of expression breadth for all genes (gray), transcription factors (black), homeobox TFs (orange; InterPro domain IPR001356), neuropeptides (red), and genes involved in synaptic vesicle endocytosis (blue). (F,G) The fkh modeling results were compared to its protein expression pattern as evaluated with a BAC transgenic (See Figure 3—figure supplement 2A). (F) Histogram bars represent raw abundance of all cells in our catalog. Blue and orange curves represent the inferred off and on components, respectively. Points represent the cells tested for transgene expression showing either detectable GFP (Green) or no signal (Black). The points’ vertical position reflect the estimated probability of gene expression. (G) Forkhead-GFP expression in selected cell types. Fkh-GFP (mainly nuclear, in green) and cell type-specific expression of a membrane marker (in magenta) are shown. Because of the wide range of fkh expression levels, imaging parameters and brightness and contrast adjustments are not identical for different panels. Cells with detectable nuclear GFP signal above the background in the same image were scored as expressing fkh. (H) As in J, to evaluate Ets65A modeling results (See Figure 3—figure supplement 2B).

Figure 3.

Figure 3—figure supplement 1. Overview of INTACT-seq and TAPIN-seq libraries.

Figure 3—figure supplement 1.

(A) Libraries with fewer nuclei had greater carry-over of ninaE transcript, which encodes the abundant rhodopsin in the fly eye. The upper outliers are libraries made from R1-6 photoreceptors, the only cells that express ninaE. The lower outliers are appendage muscle libraries created after heads are removed from the fly bodies, effectively eliminating ninaE carry-over from photoreceptors. (B, C). Modeling the distribution of ninaE correctly distinguishes true expression by R1-6 from transcript carry-over in the remaining samples. (D) Distribution of mean on-state transcript abundance across all modeled genes. (E) Distribution of dynamic range across all modeled genes. (F) Concordance of inferred expression states between replicates. Concordance was computed as the number of genes predicted to express (p(on)>=0.8) or not (p(on)<=0.2) in both replicates divided by the number of genes predicted to express or not in either replicate. (G) Cumulative distribution of inferred expression probabilities for gene/cell pairs reported to express in FlyBase (n = 193 positive benchmark points). Our mixture model correctly inferred expression of 179 of the 193 gene/cell pairs. The 14 discordant pairs involved six genes (labeled in black). (H–M) Modeling results for the six genes with benchmark mismatches. The on and off components are represented as orange and blue curves, respectively. Black points represent the inferred probabilities of expression for all drivers. Red points highlight the drivers where the model results disagreed with the benchmark. The transcript abundance (x-axis) reflects the average of all ‘high quality’ replicates (minimum two per cell type).
Figure 3—figure supplement 2. Validation of fkh and Ets65A model inferences.

Figure 3—figure supplement 2.

(A, B) To evaluate our modeling results for fkh and Ets65A we evaluated protein expression in several cell types (related to Figure 3F–H) using GFP-fusion proteins (Materials and methods). The indicated cell types (lower left corner) were visualized with a membrane marker (magenta). The second Mi15 panel (in B) includes examples of occasional cells without detectable Ets65-GFP expression (arrows).

We used mixture modeling to address this challenge by describing the expression levels of each gene as arising from a mixture of two log-normal distributions representing binary ‘on’ and ‘off’ states (Figure 3A; Materials and methods). Genes can of course express in more than two states, but we show through extensive validation that this simplifying assumption is a useful one. Modelling VAChT expression in the high-quality TAPIN/INTACT-seq libraries unambiguously inferred VAChT states for all drivers (Figure 3B). We also found that the model was useful for addressing transcript-carryover, evident in our data (as well as published bulk and single cell studies in the fly [Davie et al., 2018] and mouse [Siegert et al., 2012; Macosko et al., 2015]) as photoreceptor transcripts detected in non-photoreceptor cells (Figure 3—figure supplement 1A). For example, the model correctly inferred that only R1-6 photoreceptors expressed the primary rhodopsin ninaE, although ninaE abundance in other cells reached as high as 2,702 TPM (the mushroom body cell type PAM_1) (Figure 3—figure supplement 1B,C). We used this method to transform our catalog of transcript abundances to probabilities of expression (Figure 3C), observing a wide spectrum of on levels and dynamic ranges between on and off states (Figure 3—figure supplement 1D,E). To further simplify these probabilities, we discretized them into on (p>=0.8) and off (p<=0.2) states, and otherwise considered them to be ambiguous (0.2 < p < 0.8). The expression states inferred for replicates had a median 95% concordance (Figure 3—figure supplement 1F). We combined information from replicates to infer expression at the driver and cell type levels (Materials and methods).

We found many genes that express in all cell types, and many that express in only one, with a range in between (Figure 3D,E). As expected, given their roles in specifying identity, homeobox transcription factors (TF) expressed more specifically than transcription factors in general (Figure 3E). Neuropeptides also expressed specifically, while genes with the more general function of synaptic vesicle endocytosis were broadly expressed. We explore these functional properties in more detail later (Figure 4C).

Figure 4. TAPIN-seq profiles identify genes enriched in cell types and groups.

(A) Cells grouped by a minimum evolution tree of their inferred expression states. (B) Heatmap of marker genes enriched in photoreceptors, glia, muscle, and pigment cells. (C) Distribution of expression breadth for genes in ‘terminal’ FlyBase gene groups with more than 10 members in our expression probability matrix. The least- and most- broadly expressed gene groups are labeled, along with the DPR-interacting, beat and DPR family of extracellular proteins. (D) TfAP-2 transcription factor distinguishes closely related cell types T4 and T5. (E,F) TfAP-2 protein is specifically expressed in T4 and not in T5, confirming this detection of differential expression levels. GFP-tagged TfAP-2 (mainly nuclear, in green; see Supplementary file 1E and Materials and methods) is shown together with a membrane marker (magenta) expressed in T4 (E) or T5 (F) cells. (G) Comparison of genes with differential expression in two driver lines for T5 neurons expressing in different subtypes, identify genes that differentially label layers of the lobula plate (corresponding to different subtypes of T5 cells). (H) Confirming the TAPIN-seq identification, klg protein (detected using a GFP tag (green); see Supplementary file 1E and Materials and methods) is expressed in T4/T5 cells with the expected layer specificity (layers 3 and 4) in the lobula plate (LP). A neuropil marker is shown in magenta.

Figure 4.

Figure 4—figure supplement 1. TAPIN-seq profiles identify genes enriched in cell types and groups.

Figure 4—figure supplement 1.

(A) Cell adhesion molecules specifically expressed across our transcriptome catalog. (B–D) The expression pattern of all beat, DIP, and Dpr family members depicted as heatmaps of probabilities of expression (left), heatmaps of relative transcript abundance (middle), or cumulative density curves of normalized expression level (right). The density curves, each depicting a bimodally expressed gene in the gene family, show expression levels that were normalized using the mean expression levels of the modeled off an on states for each gene; normalized level = (logE - mu_off) / (mu_on - mu_off). The density curves illustrate that DIP genes are sparsely expressed, followed by beat and Dpr genes. Of the three families, the Dpr genes exhibit transcript abundance that appears more continuous between the estimated off and on states rather than discretely bimodal. (E,F) The number of interacting pairs of extracellular protein pairs (Özkan et al., 2013) expressed by pairs of cells in the lamina (E) is not sufficient to predict the synaptic connectivity of these cells F; data from Rivera-Alba et al. (2011). To match our expression data, we summed the synapse counts for the individual R1-R6 photoreceptors originally reported by Rivera-Alba et al. (2011). For the same reason, we also duplicated the subtype-unidentified Lawf synapse counts as separate Lawf1 and Lawf2 entries in the connectome matrix.

Evaluating accuracy of TAPIN-seq measurements

To validate our TAPIN-seq measurements, we first compared our inferred expression states to FlyBase curated reports of protein expression (n = 197 data points of gene/cell pairs; four negative points, 193 positive points; n = 22 cells; n = 69 genes, Supplementary file 1C). Protein expression can of course differ from that of the corresponding mRNA due to post-transcriptional regulation. However, since most functional interpretations of transcriptome data are implicitly about protein expression, we used this as a more stringent and practical test of our model. We found 93% concordance (183 matches; 14 mismatches from six genes; 0 mismatches for negative benchmark points; Figure 3—figure supplement 1G). The benchmark mismatches fell into three categories: expression levels near the transition between inferred on and off components (veli, verm, para; Figure 3—figure supplement 1H–J), genes with a wide dynamic range of expression (Syx, Rab11; Figure 3—figure supplement 1K,L), and genes with undetected transcript but previously detected protein (Myo61F; Figure 3—figure supplement 1M). The first two categories likely arise from imprecision in the model’s fitted components and its representation of transcript abundance as bimodal, rather than continuous. The third category (conflicting transcript and protein levels) could reflect either technical issues (low sensitivity in our measurements, or false positives in the prior work due to antibody cross-reaction) or biological complexities (e.g., long-lived transcripts, subcellular localization, post-transcriptional regulation).

To further evaluate our results for genes expressed across a wide range of levels, we compared the model output to protein expression patterns for two transcription factors: Forkhead (fkh) and Ets65A. We visualized each protein using a C-terminal GFP tag; the tagged proteins were expressed from BAC transgenes with large flanking sequences to ensure a near native genomic context (Kudron et al., 2018). From the transcript data, we inferred fkh gene expression in 14 cell types across a 35-fold range of abundance (60 to 2,103 TPM). Of 28 cell types that we visualized at the protein level, fkh was detected in all but one that we expected from TAPIN-seq (Figure 3F,G, Figure 3—figure supplement 2A). The sole exception, Tm4, has a fkh abundance (60 TPM) near the border between the inferred off and on states (Figure 3F). However, we did detect protein in Dm9, which had a near identical raw transcript abundance (61 TPM). Similarly evaluating Ets65A expression identified two mismatches out of 11 tested cells (Figure 3H, Figure 3—figure supplement 2B). Ets65a protein was not detected in Tm20 (70 TPM) and epithelial glia (161 TPM), while it was weakly detected in Dm3 (77 TPM). These results further support the accuracy of TAPIN-seq and our statistical model even for genes with a wide dynamic range. The agreement between our transcript on/off calls and protein expression encouraged us to use the discretized on/off calls for all further analyses; the unprocessed relative abundances in TPM are reserved for deeper analysis when needed.

Examining the relation between cell types using transcriptomes

To study the relation between cell types, we built a dendrogram based on the expression states we inferred for the whole transcriptome and estimated the support for each branch point with bootstrap resampling (Figure 4A). The broad groupings were well supported and mostly intuitive: muscle were outgroups, followed by a mushroom body cell type (PAM_4), the glia, the photoreceptors, and the remaining neurons. Several fine groupings of anatomically closely related neurons were also well supported (e.g., Kenyon cells; C2, C3; Lawf1, Lawf2; T4, T5; LPLC1, LPLC2). However, mid-level branchings were not well supported, indicating the lack of a simple hierarchical relationship. Neurons were generally grouped by region: central complex, mushroom body, and optic lobe. One surprise was the grouping of Tm20 and Dm1, away from all other optic lobe cell types. Upon closer examination, the identity of genes expressed exclusively in these two lines (lz, Pdh, bw) suggest that this grouping is driven by shared pigment cell contamination in the GAL4-tagged patterns of these driver lines. Similarly, the unusual position of PAM_4 is likely due to some unidentified non-neuronal cells in the driver. These are examples of imperfections in the GAL4 driver lines. While they can lead to some false positives for the main target cell types, they can also provide additional information. For example, analyzing the overlap between Dm1 and Tm20 allowed us to infer marker genes expressed in the pigment cell population.

Identifying genes that mark cell types and groups

We next identified genes that marked cell groups in the tree, using three criteria: genes that expressed in all the cells within a group, at most two cells outside this group, and with transcript abundance higher than all cells outside the group (For simplicity, we will hereon refer to cell type as just cell. This is not meant to exclude the possibility of heterogeneity within the individual cells of a profiled population.). We used these criteria to identify markers for photoreceptors (n = 108), glia (n = 60), and muscle (n = 76) (Figure 4B, Supplementary file 1D). These genes included many known as well as new markers. For example, genes enriched in photoreceptors include signaling components (Arr2, Galphaq) and transporters (trpl, Eaat2) with known physiological roles as well as uncharacterized orphan transporters (e.g., CG8468). We also identified 18 markers for pigment cells using the Tm20 and Dm1 profiles. In addition to the three types of lamina glia we profiled, several other glia types are present in both the lamina and the medulla. Genes expressed exclusively in the dissected samples (lamina, remainder of optic lobe) and not in the TAPIN libraries identified marker genes for optic lobe cells that we did not directly profile, such as medulla glia. Indeed, the genes identified in this way included several known markers for astrocytes (alrm, wun2, Obp44a) (Huang et al., 2015).

We examined the breadth of expression of different functional groups of genes, as defined by FlyBase gene group curation. HOX-like homeobox TFs were among the most specifically expressed group, while groups of core cellular machinery (e.g., beta importins, mitochondrial complexes) were among the most broadly expressed groups (Figure 4C). Some groups included both broadly and very specifically expressed genes. For example, among cell adhesion molecules, we noted an interesting distribution for three gene groups proposed to be involved in protein-protein interactions that underlie synaptic connectivity (Özkan et al., 2013; Tan et al., 2015; Carrillo et al., 2015). While 11 DPR-interacting proteins (DIP) were among the most specifically expressed genes (expressed in a median of 6 cells), beat (median, 25.5 cells) and DPR (median, 51 cells) genes were more broadly expressed (Figure 4—figure supplement 1A–D). As physical interactions among these and other extracellular proteins have been systematically characterized (Özkan et al., 2013), we combined their expression and interaction patterns to estimate the number of potential interaction between cells in the lamina (Figure 4—figure supplement 1E), many of which are in actual contact (Figure 4—figure supplement 1F). Except for a clear paucity of transcripts encoding interacting protein pairs expressed by glia, these global expression-based patterns did not correlate well with connectivity in the lamina. However, we found that every pair of lamina cells expressed mRNAs for tens of interacting protein pairs, highlighting the broad potential for cell-cell interactions not only in the developing (Tan et al., 2015) but also adult optic lobe.

Transcriptomes can distinguish closely related cell types and subtypes

To ask whether we could identify genes distinguishing closely related cell types, we examined T4 and T5. These cells had similar transcriptomes and were neighbors in the phylogenetic tree, but we found one transcription factor, TfAP-2, that was expressed nearly two orders of magnitude higher in T4 (390 TPM) than T5 (6 TPM) (Figure 4D). We confirmed this pattern at the protein level (Figure 4E,F).

T4 and T5 cells can each be further divided into four subtypes that preferentially respond to motion in one of four cardinal directions and differ in anatomical details such as the lobula plate layer to which they project axons. While our split-GAL4 lines do not isolate single T4/T5 subtypes, the T5_d1 and T5_d2 drivers show differences in subtype expression (Figure 1—figure supplement 1B,B’,C,C’). Comparing the transcriptomes of these two drivers confirmed previously described markers (Con, bi, dac; Apitz and Salecker, 2018) that distinguish T4/T5 cells of lobula plate layers 1/2 and 3/4, and indicated additional genes, including a transcription factor (dysf) and cell adhesion molecules (klg, Dscam3) with selective expression in these subtypes (Figure 4G). As a further confirmation of this finding, we verified that a tagged klg protein showed layer-specific expression in the lobula-plate consistent with these T4/T5 subtypes (Figure 4H).

Reference bulk transcriptomes help interpret single cell transcriptomes

Single cell RNA-seq (scRNA-seq) was recently used to map the optic lobe (Konstantinides et al., 2018) and brain (Davie et al., 2018). Despite its routine use, interpreting scRNA-seq measurements – clustering single cell transcriptomes and labeling these clusters as known cell types – remains challenging. For example, the 52 single cell clusters found in the optic lobe (23 labeled as known cell types, including 7 types of glia; Konstantinides et al., 2018) and the 87 clusters found in the whole brain (41 labeled, also including seven glia; Davie et al., 2018) far under-estimate the expected diversity of cell types – over one hundred anatomically distinct neuronal cell types have been described in the optic lobe alone (Fischbach and Dittrich, 1989; Morante and Desplan, 2008; Otsuna and Ito, 2006; Nern et al., 2015; Wu et al., 2016). Furthermore, comparing the number of single cells in each optic lobe cluster to the true abundance of each cell type (as established by neuroanatomical studies) reveals that the single cell map does not proportionally represent abundance (ranging from ~5 times fewer Dm8/Tm5c cells to ~7 times more Dm12 cells than expected in the optic lobe map; Figure 5A). The whole brain map, measured using the more sensitive 10X scRNA-seq platform rather than Drop-seq used for the optic lobe map (Svensson et al., 2017), showed similar cell type abundances (Figure 5B). Without detailed neuroanatomy to serve as ground-truth, this similarity could be interpreted as reproducibility across platforms. Instead, our results suggest caution when interpreting cell type frequencies from scRNA-seq maps, as they can be skewed by experimental artifacts such as cell type-specific differences in RNA isolation yields, computational over-clustering, or inaccurate cell type labeling. Given the known number of cell types and their frequencies, it is clear that interpreting single cell measurements is challenging.

Figure 5. TAPIN-seq complements single cell RNA-seq profiling.

(A, B) We evaluated whether single cell RNA-seq of the optic lobe (A) (Konstantinides et al., 2018) and brain (Davie et al., 2018) proportionally represents cell types found in the optic lobe. By comparing the single cell cluster sizes to the true abundance of each cell type (estimated as described in the Materials and methods) we found that the scRNA-seq map can both under- and over-estimate the abundance of each cell type (assuming accurate cell type labels), or that the cell type is incorrectly assigned (i.e. contains different or additional cell types). To estimate the true cell count, we made use of known anatomy (for example, several cell types are known to be present exactly once in each of the ~2×750 medulla columns per brain) or relied on published counts. In addition, we performed some new counts. (See Materials and methods for details.) Observed/expected ratio = ((size of cluster labeled as cell type X/size of cluster labeled as T1) / (true abundance of cell type X/true abundance of T1)). (C) We used non-negative least squares regression to model each TAPIN-seq profile as a linear weighted sum of single cell clusters in the whole brain scRNA-seq map. The heatmap represents the regression coefficients of each single cell cluster (rows) contributing to the TAPIN-seq profile of each cell type, normalized within rows. (D) We evaluated expression of genes that mark selected single cell clusters (Davie et al., 2018) in our TAPIN-seq profiles of visual system neurons. (see Figure 5—figure supplement 2 for the complete heatmap).

Figure 5.

Figure 5—figure supplement 1. Regressing TAPIN-seq profiles against optic lobe single cell clusters.

Figure 5—figure supplement 1.

(A) We used non-negative least squares regression to model each TAPIN-seq profile as a linear weighted sum of optic lobe single cell clusters (Konstantinides et al., 2018). The heatmap represents the regression coefficients of each single cell cluster (rows) contributing to the TAPIN-seq profile of each cell type, normalized within rows.
Figure 5—figure supplement 2. TAPIN-seq expression of genes marking single cell clusters.

Figure 5—figure supplement 2.

(A) We evaluated expression of marker genes for each optic lobe single cell cluster (as reported in Konstantinides et al., 2018) in our TAPIN-seq profiles of visual system neurons. If a single cell cluster marker corresponds to one of our identified cell types, we expect to see its marker genes highly enriched in the corresponding cell type’s expression. Note that some of the single cell clusters with the best apparent cell type matches (e.g., cluster 15/TmY5a, cluster 55/Mi15) were originally reported with a different annotation. (B) Expression of marker genes for each brain single cell cluster (as reported in Davie et al., 2018), as in (A).
Figure 5—figure supplement 3. kn-GAL4 expression.

Figure 5—figure supplement 3.

(A) kn-GAL4 driven expression of a membrane-targeted GFP (green) in the optic lobe. Single confocal section with a reference marker (anti-Brp) in magenta. TmY14 cell bodies are unusual in that they are found only in a subregion of the medulla cell body rind (see http://flweb.janelia.org/cgi-bin/view_flew_imagery.cgi?line=R10G02 for an image of a GAL4 line that in the optic lobe expresses mainly in TmY14). Both the cell body distribution and optic lobe layer pattern (compare cells in (C)) indicate that kn-GAL4 is expressed in optic lobe cell types other than TmY14. (B) Stochastic labeling of kn-GAL4 neurons using MCFO. A TmY5a cell and an LC4 cell are indicated. Other cell types, including several TmY14, are also labeled. Image shows a single confocal section without a reference marker. (C,D) Examples of TmY14 (C) and TmY5a (D) cells. Reconstructed views generated from confocal stacks of MCFO-labeled cells using the indicated driver lines.

Comparison with our data reveals additional challenges in mapping scRNA-seq clusters to known cell types. To compare our data to the brain map, we used non-negative least squares regression to model each TAPIN-seq transcriptome as a linear weighted sum of single cell cluster transcriptomes, assuming that large regression coefficients reflect matching cell types (Davie et al., 2018; Cao et al., 2019). We interpreted the results of this comparison against an ideal scenario where single cell clusters were perfectly resolved and accurately labeled, and assuming that the driver lines used for TAPIN-seq profiling had minimal expression outside of the main target cell type. In this scenario, we would expect a unique cluster matching each of our TAPIN-seq profiles of cell type-specific driver lines, as well as many unmatched clusters, reflecting cell types that we did not profile. However, we observed few one-to-one matching clusters for our TAPIN-seq profiles (e.g., T1, Tm1, Dm8, Dm9, Pm3), several one-to-many matches (e.g., photoreceptor cluster #53 matching our R1-6, R7, R8-Rh5, and R8-Rh6 profiles; also, L1-5 cluster #20, Lawf1/2 cluster #58, and T4/T5 cluster #24), clusters with no TAPIN-seq matches (e.g., clusters #7, 15, 23), as well as TAPIN-seq cell types without a matching cluster (e.g., Dm4, Dm11, Mi4, Tm2, Tm20, LPLC1) (Figure 5C). The matches confirmed several clusters labeled as single cell types (e.g., Tm1, Mi1) or multiple cell types (e.g., photoreceptors, L1-5, Lawf1/2, T4/5, Kenyon cells) and also suggested possible labels for previously unlabeled clusters (e.g., Dm8 cluster #52, Dm9 cluster #74, pigment cell cluster #76) and alternative labels for previously labeled clusters (e.g., TmY5a TAPIN matches the TmY14 cluster #11; Lai TAPIN matches Dm8/Tm5c cluster #39). We observed similar results when analyzing the optic lobe map, with few apparent single cell – TAPIN-seq matches (Figure 5—figure supplement 1). Although this result could arise from major errors in our TAPIN-seq profiles, this possibility is unlikely given our earlier validation results and the concordance between our TAPIN-seq profiles and cell type-enriched genes identified from independent FACS-seq measurements (Figure 2—figure supplement 2).

As a separate comparison of the bulk and single cell profiles, we examined the bulk expression of genes marking each single cell cluster (Figure 5—figure supplement 2). Confirming the regression results, this analysis also found few one-to-one matches in which single cell cluster markers were enriched in only a single TAPIN-seq profile. Instead, most cluster markers were either enriched in multiple bulk cell types (over-clustering), or were not enriched in our data (cell types we did not profile by TAPIN-seq). As before, many TAPIN-seq profiles were not enriched for cluster markers, reflecting cell types that were either missing or clustered with other cell types in the single cell map.

We further explored specific examples where the TAPIN-seq data offered new insight into the single cell maps by suggesting alternative labels or labeling unannotated clusters. The single cell clusters labeled as TmY14 matched the TmY5a TAPIN profile (Figure 5C). The cluster was originally labeled based on the expression of a single transcription factor, knot (kn), as determined using a kn-GAL4 reporter. We also observed kn expression in our TmY5a measurements and further confirmed its expression in both TmY14 and TmY5a cells using the kn-GAL4 reporter line, suggesting that the cluster likely includes not only TmY14 cells, but also TmY5a and other kn-expressing cells (Figure 5—figure supplement 3). Similarly, we found that the Dm2 cluster (optic lobe cluster #55), which was labeled based on a Dm2 FACS-seq profile, matched our Mi15 profile (Figure 5—figure supplement 2A). This observation is concordant with previous reports that the line used to FACS sort Dm2 also expresses in Mi15 (Supplementary Figure 2 in Takemura et al., 2013, Supplementary file 1 Table S4 in Nern et al., 2015). Finally, we found that the Dm8/Tm5c cluster (brain cluster #39) matches our Lai TAPIN-seq profile, while unlabeled clusters match our Dm8 and Dm9 TAPIN-seq profiles (Figure 5D). Our measurements also suggest labels for other previously unannotated clusters, such as brain cluster #76 which likely reflects pigment cell, as demonstrated by enrichment of its marker genes in both Dm1 and Tm20 profiles – both measured with lines that also express in pigment cells (Figure 5—figure supplement 2B). As expected, the genes marking this cluster include known pigment cell markers (e.g., Pdh, rdhB). Altogether, our results demonstrate that cell type-identified data, such as bulk transcriptomes, can help interpret single cell RNA-seq measurements.

Profiles identify candidate neurotransmitter output for most neuron types

The proteins that synthesize and transport neurotransmitters are well known, enabling us to use their expression to predict neurotransmitter phenotype. We used histamine decarboxylase (Hdc), glutamate decarboxylase (Gad1), the vesicular acetylcholine transporter (VAChT), and the vesicular glutamate transporter (VGlut) to identify potential histaminergic, GABAergic, cholinergic, and glutamatergic cell types, respectively (Figure 6A). Our model unambiguously inferred expression states for these genes and indicated a single transmitter (from this group) for nearly all neurons we profiled. A second cholinergic marker, choline acetyltransferase (ChaT), matched VAChT expression almost perfectly (the two genes also share an exon). The sole exception, apparent expression of ChAT but not VAChT in R7 photoreceptors, likely results from a subset of dorsal rim R8 cells labeled by the R7 driver line (further discussed below, also see Supplementary file 1A).

Figure 6. Expression of synthesis and transport genes indicate neurotransmitter phenotypes.

(A) Expression of neurotransmitter marker genes indicate the neurotransmitters produced in nearly all profiled cells. With few exceptions, nearly all cell types express only one fast neurotransmitter. (B, C) We confirm TAPIN-seq results at the protein level (green) for (B) Vesicular monoamine transporter (Vmat) expressed in Mi15 (magenta) and (C) Nitric oxide synthase (Nos) in C3 (magenta). Top panel in (C) shows a section through the optic lobe, lower panels C3 cell bodies. (D) Several neuropeptides and receptors also express specifically (examples). (E) Allatostatin A (AstA) protein expression in the medulla as an example of a neuropeptide with a very specific optic lobe expression pattern. The AstA distribution in the optic lobe matches the distribution and layer pattern of Pm3 cells, consistent with the TAPIN-Seq data.

Figure 6.

Figure 6—figure supplement 1. Transcriptional regulators of neurotransmitter identity.

Figure 6—figure supplement 1.

(A) Transcription factors whose expression is predictive of neurotransmitter phenotype (i.e., P(neurotransmitter output | transcription factor expressed)). The ten most predictive transcription factors are shown for each neurotransmitter output marker. (B) Summary of orthologous transcription factors in worm and mouse and their association with specific neurotransmitter types. (C) The Gad1-associated gene Lim3 does not express in cholinergic Dm12 neurons, but does in the GABA-ergic Dm10 neurons. Double labeling using LexA-markers for Dm12 and Dm10 (green) with a Lim3 protein-trap-GAL4 driving RFP (magenta). This example highlights a case where Lim3 expression identifies a cell type in a group of similar cells (the Dm cells profiled in this study) that is GABAergic (all other Dms in this group are glutamatergic).

Besides these four neurotransmitters that we identified by one or two marker genes, we also identified candidate dopaminergic neurons based on the combined expression of tyrosine hydroxylase (ple), dopa decarboxylase (ddc), vesicular monoamine transporter (Vmat) and dopamine transporter (DAT). While DAT, ple, and ddc were also expressed individually in several cell types that did not express Vmat, only known dopaminergic cell types and one medulla neuron (Mi15) expressed this combination (Figure 6A).

One neuronal cell type, T1, expressed none of the neurotransmitter markers VGlut, VAChT, Vmat, and Gad1 (Figure 6A). Although T1 does express most pan-neuronal genes, it does not express bruchpilot (brp), a key component of presynaptic active zones. Consistent with this result, EM reconstruction has identified very few T1 presynaptic specializations (Takemura et al., 2008).

Transmitters for nearly half of our cell types have been previously proposed and generally agree with our results. For example, VAChT/ChAT expression in Kenyon cells supports recent reports showing they are cholinergic (Barnstedt et al., 2016; Crocker et al., 2016). Fluorescence in situ hybridization and immunolabeling guided by our measurements confirmed the expression of ChAT, Gad1, and VGlut in Mi1, Mi4, and Mi9, respectively (Long et al., 2017; Takemura et al., 2017). However, we see considerable differences between our assignments and some previous work that used reporter transgenes (Raghu and Borst, 2011; Varija Raghu et al., 2011; Raghu et al., 2013), which we generally attribute to unfaithful transgene expression patterns. We believe our assignments to be more reliable, however they are not without problems. For example, one assignment inferred by our model that seems unlikely and is not supported by other available data is the presence of Gad1 in Mi9, which was not detected in the FISH or antibody experiments mentioned above. Given the presence of some contaminating Mi4 cells in at least one Mi9 driver and the lower Gad1 abundance (mean 276 TPM in Mi9; 2165 TPM in Mi4; 1870 mean TPM in predicted GABAergic cells), we attribute the Mi9 Gad1 signal to contaminating contributions from other GABAergic cells such as Mi4.

Transcriptional regulation of neurotransmitter output

We next tried to identify transcriptional regulators of neurotransmitter output, by searching for TF genes expressed in strong correlation with transmitter phenotype. However, we only found such TFs for histaminergic output (Figure 6—figure supplement 1A), which in our dataset is only represented by photoreceptor neurons. This observation agrees with work on neuronal identity showing that single TFs rarely encode transmitter identity, but rather different TFs and TF combinations are used to specify the same neurotransmitter output (Hobert, 2016). We thus expanded our search to TFs whose expression was informative about transmitter phenotype (i.e., cells expressing TF A are likely to produce neurotransmitter B; even if not all cells producing neurotransmitter B express TF A; Figure 6—figure supplement 1A). This search identified candidate TFs for nearly all neurotransmitter types. For example, the 19 neuronal types (including the broad chat-GAL4 line) expressing apterous (ap) are cholinergic. Its worm ortholog, ttx-3, regulates the cholinergic phenotype of the AIY neuron (Wenick and Hobert, 2004). Several other TFs we identified also have worm or mouse orthologs implicated in neuronal identity (Figure 6—figure supplement 1B). Several TFs appeared to identify a transmitter phenotype within a group of cell types but not across the entire dataset. For example, Lim3 distinguishes the GABAergic Dm10 from the other Dm cell types in our dataset and is also expressed in several other GABAergic cells (Mi4, Pm3, Pm4) but was also detected in the cholinergic LC4 and the glutamatergic TmY5a and Tm29. We confirmed the differential Lim3 protein expression in Dm10 and Dm12 cells (Figure 6—figure supplement 1C). Several of the transcription factors that we found to be informative of neurotransmitter output were also implicated by single cell RNA-seq data, including ap (cholinergic), tj (glutamatergic), and Lim3 (GABAergic) (Konstantinides et al., 2018). Our data also indicate exceptions to these patterns (i.e., neurons expressing tj and Lim3 but with a different neurotransmitter phenotype; Figure 6—figure supplement 1A). These observations indicate that neuronal features are likely regulated in a context-dependent and combinatorial manner, and that transcriptomes can identify putative regulators.

Co-expression of canonical small molecule transmitters with non-canonical transmitters is widespread

Co-release of multiple neurotransmitters can enhance the signaling capacity of neurons and neural circuits. For example, the same cell type might release different transmitters under distinct conditions or use them to elicit distinct responses in different target cells. In addition to Mi9 (discussed above as being likely due to contamination), we observed two cases of potential co-transmission involving the canonical small molecule neurotransmitters. Both Mi15 drivers express dopaminergic and cholinergic markers, and both R8 drivers expressed cholinergic and histaminergic markers. We confirmed expression of Vmat protein in Mi15 (Figure 6B), the first identified columnar dopaminergic cell type within the optic lobe, and further below we confirm the unexpected VAChT expression in R8 (Figure 8A).

Evidence for co-transmission involving additional molecules, such as neuropeptides or nitric oxide, appears frequently in our data set. Nitric oxide is a widely conserved signaling molecule that can act on many kinds of cells, including neurons (Lowenstein and Snyder, 1992). We observed very specific expression of its synthesizing enzyme, nitric oxide synthase (Nos), in the lamina (C2, C3, and Lawf2) and medulla (Mi4, Pm4, Tm4 and Mi15). To further validate these results, we confirmed Nos expression at the protein level in C3 neurons (Figure 6C). Nitric oxide can be released extra-synaptically, potentially enabling signaling between neurons that are not synaptic partners.

Several neuropeptides and their receptors were also expressed in distinct patterns suggesting widespread yet specific peptidergic signaling in the visual system (Figure 6D). For example, AstA was observed in just one cell type (Pm3; confirmed at the protein level; Figure 6E), while AstC was expressed in several cell types, and pigment-dispersing factor (Pdf) was expressed in none of the optic lobe cells we profiled. The receptors for all three of these neuropeptides were more broadly expressed (Figure 6D). The broad expression of Pdf receptor (Pdfr) is consistent with the extensive arborization previously observed for Pdf-expressing neurons at the surface of the medulla.

While we focused on genes with well-known functions, our expression patterns also suggest new functions for poorly characterized genes (Figure 6A). For example, photoreceptors specifically expressed CG8468, an orphan transporter in the solute carrier 16 (SLC16) family of monocarboxylate transporters. This gene might represent a candidate vesicular or plasma membrane transporter of histamine, which remains unidentified in any species. We also observed photoreceptor-specific expression of CG45782 (lovit), a member of the SLC45 sucrose transporter family recently reported as a putative histamine transporter (Xu and Wang, 2019).

Broad and patterned expression of neurotransmitter receptors

Since the functional consequences of the release of a neurotransmitter depend on which receptors for this transmitter are expressed in the receiving cell, measuring the expression of both neurotransmitter input and output genes is necessary to assign potential synaptic signs to connectomes. For example, glutamatergic transmission in Drosophila may be either inhibitory or excitatory, depending on the receptors.

In general, neurotransmitter receptors are broadly expressed, qualifying each cell type to detect multiple neurotransmitters (Figure 7A). Patterns for individual receptors (or receptor subunits) varied widely. Some receptors, such as the GluClalpha glutamate-gated chloride channel, thought to be the main mediator of inhibitory glutamatergic transmission in flies, were expressed in most but not all cell types (Figure 7A,B). Expression of others was much more restricted, such as the EKAR glutamate receptor subunit detected only in photoreceptor neurons, consistent with previous work (Hu et al., 2015). Nearly all cells expressed receptors for acetylcholine, GABA, and glutamate, as expected from the combination of predicted transmitter phenotypes and connectomics data. Receptors for neuromodulators such as serotonin, dopamine, octopamine, and neuropeptides in general were also widespread (Figure 6D). For example, octopamine receptors were expressed in broad, yet gene- and cell-type specific patterns, consistent with widespread octopaminergic modulation of visual processing (for example, Arenz et al., 2017; Strother et al., 2018; Tuthill et al., 2014). We confirmed Oamb expression at the protein level in specific lamina neurons and glia, including Lawf2 cells previously shown to be octopamine sensitive (Tuthill et al., 2014; Figure 7C).

Figure 7. Patterns of neurotransmitter receptor expression.

(A) Neurotransmitter receptors are widely expressed in specific patterns. With the exception of histamine, most cells express receptors or receptor subunits for nearly all neurotransmitters. (B) Expression of the glutamate-gated chloride channel (GluClalpha), detected using a GFP-tag (green), in the optic lobe. The lamina pattern includes many neurons as well as proximal satellite, epithelial and marginal glia. A glia-specific nuclear marker (anti-repo) is shown in magenta. (C) Octopamine receptor (Oamb) expressing cells in the optic lobe detected with a protein-trap GAL4 driving expression of a membrane targeted GFP (green). Anti-repo (magenta). In the lamina (to the top and left of the image), Lawf1/2 and L5 neurons and marginal glia are recognizable.

Figure 7.

Figure 7—figure supplement 1. Patterns of neurotransmitter receptor expression complement connectomics.

Figure 7—figure supplement 1.

(A) Transcriptomes reveal the neurotransmitters in core cell types of the ON and OFF components of the motion detection pathway. The ON and OFF motion detection pathways supply inputs to directionally sensitive T4 and T5 neurons, respectively (Takemura et al., 2017; Shinomiya et al., 2019). Our results show that all of the inputs to T5 (Tm1, Tm2, Tm4, and Tm9) are cholinergic, whereas the inputs to T4 are a mixture of GABAergic (C3, Mi4), cholinergic (Mi1, Tm3), and glutamatergic (Mi9), suggesting different input signs. Discovering the functional signs of inputs to the directionally selective neurons is an essential step in understanding the mechanism of this long-studied neuronal computation (Strother et al., 2017). In addition, our data reveals aspects of the motion pathway that have not yet been functionally examined, such as the identification of other signaling components (see B). (B) Examples of the expression of genes involved in neuropeptide, non-canonical small molecule (nitric oxide), or gap junction communication in the cell types in (A).

Combining transcriptomes and connectomes

A principal goal of our work is to provide a foundation for combining molecular data such as neurotransmitter and receptor expression patterns with anatomical or functional connectivity data. One application of expression information is to constrain mechanistic models of neural circuits such as the extensively studied motion detection circuit in the fly eye (reviewed in Mauss et al., 2017; Figure 7—figure supplement 1A). The combined availability of expression and connectomics data for many cell types in a brain region also makes it possible to systematically identify and further explore unusual patterns of receptor or transmitter expression; for example, cell types in which an otherwise widely expressed receptor is absent or cells with unusual combinations of receptor subunits. Below we discuss three examples, focused on potential signs of synaptic transmission, of how such patterns can lead to specific and unexpected hypotheses about circuit function. As we illustrate, combining expression data with synapse-level anatomy permits analyses which are inaccessible to either approach alone.

Presynaptic cholinergic markers and absence of histamine receptors in some postsynaptic targets: R8 photoreceptors may signal via two fast transmitters

Fly photoreceptors have long been known to release histamine (Hardie, 1987; Sarthy, 1991). Our data indicate that inner (color vision) R8 photoreceptors also express the cholinergic markers ChAT and VAChT, suggesting an unexpected additional cholinergic phenotype (Figure 6A). To confirm these results, we visualized a tagged VAChT protein (VAChT-HA; Pankova and Borst, 2017), expressed from the endogenous locus, selectively in photoreceptor cells. These experiments showed VAChT-HA labeling in medulla terminals of R8 but not R7 cells (Figure 8A,A’,A’’,B), including the specialized polarized light-responsive R8-cells in the dorsal rim of the medulla. The latter express the rhodopsin Rh3 (which is otherwise expressed in R7s; Fortini and Rubin, 1990), consistent with the presence of ChAT and VAChT transcripts in the R7 driver line (for which the model inferred expression for VAChT but not ChAT).

Figure 8. Molecular and connectomics analyses suggest R8 photoreceptors signal via both histaminergic and cholinergic neurotransmission.

Figure 8.

(A, A’, A’’, B). Expression of VAChT in R8 cells. Expression of a HA-tagged VAChT was induced in R8 cells by recombinase-mediated excision of an interruption cassette from a modified genomic copy of the VAChT gene (Pankova and Borst, 2017). R7 and R8 cells project to different layers of the medulla (A, schematic). Single confocal sections show R7 and R8 cells in magenta and anti-HA immunolabeling in green. R7 and R8 cells (labeled with mAb 24B10) are shown in magenta. Stop-cassette excision in R8 photoreceptors (using sens-FLP) results in VAChT-HA labeling of R8 terminals in both the main medulla (A’) and the dorsal rim (where R7 and R8 cells project to very similar layer positions) (A’’). Stop-cassette excision in all photoreceptors (using ey3.5-FLP) also produces VAChT-HA labeling in R8 while expression in R7 was not detected (B). Scale bar, 10 µm. (C) Heatmap of receptor expression probabilities (color) and relative abundance (numbers; transcripts per million) in R8 targets identified by EM (at least five synapses in Takemura et al., 2013). (D) Connectivity network for R8 cells, overlaid with receptor expression. (E) Possible distributions of postsynaptic receptors at R8 synapses. Individual active zones can interact with multiple postsynaptic cells which could be grouped in distinct ways. (F) Classification of postsynaptic cells at individual R8 active zones (Takemura et al., 2013) based on histamine receptor expression. (G) Same analysis as in F but for an R7 cell.

We asked whether the apparent co-transmitter phenotype of R8 neurons was reflected in the expression of neurotransmitter receptors in their different postsynaptic partners. Postsynaptic partners of R8 cells identified by electron microscopy reconstructions include seven cell types in our dataset: Dm9, Mi1, Mi4, Mi15, R7, L1 and Tm20 (Figure 8C; Takemura et al., 2013; Takemura et al., 2015). All of these express one or more nAChR subunits (Figure 7A). By contrast, expression of the histamine-gated chloride channels HisCl1 and ort, which mediate histaminergic transmission by photoreceptors (Pantazis et al., 2008), was more selective (Figure 8C,D): L1, Tm20 and Dm9 express ort, consistent with previous reports (Gao et al., 2008), while HisCl1 transcripts were detected in the R7 as well as R8 driver lines, in agreement with another recent report (Schnaitmann et al., 2018; Tan et al., 2015). However, we did not find evidence of expression of ort or HisCl1 in Mi4, Mi1 and Mi15, further supporting R8 signaling via a transmitter other than histamine.

We were interested in whether release of ACh and histamine might occur at spatially distinct locations. Insect synapses often consist of multiple postsynaptic sites apposed to the same presynapse. For cells that release more than one transmitter, two general distributions of postsynaptic processes at such multicomponent synapses are possible (Figure 8E). Postsynaptic cells with different receptors could be grouped at different sites based on receptor expression (Figure 8E-left) or occur together at the same locations (Figure 8E-right). To distinguish these possibilities for R8 cells, we used EM reconstruction data (Takemura et al., 2013) to map the predicted expression of histamine receptors in postsynaptic cells at the single synapse level for all presynaptic sites of one reconstructed R8 cell (Figure 8F). The resulting pattern indicates that processes of cell types with and without histamine receptor expression are often located near the same R8 presynapse (Figure 8F), whereas this is not the case for a reconstructed R7 cell (Figure 8G). This is consistent with the VAChT-HA labeling observed throughout the medulla terminals of R8s (Figure 8A). This spatial pattern is compatible with either co-release of histamine and ACh or independently regulated release from different vesicles at the same sites.

A combined cholinergic and histaminergic phenotype has been reported for a small group of extraretinal photoreceptors (the Hofbauer-Buchner eyelet) located near the lamina (Yasuyama and Meinertzhagen, 1999) but was unexpected for R-cells of the compound eye. Establishing the functional significance of potential acetylcholine release by R8 cells will require further experiments. However, we note that double mutants lacking both histamine receptors are not completely blind (Gao et al., 2008), consistent with histamine-independent transmission by photoreceptor neurons. In addition, a very recent study suggests a role of cholinergic R8 signaling in the entrainment of the fly’s circadian rhythm to light-dark cycles (Alejevski et al., 2019), perhaps similar to that of ACh-release from the Hofbauer-Buchner eyelet (Schlichting et al., 2016).

Potentially excitatory GABA-A receptors in lamina monopolar cells

Fast GABAergic transmission via GABA-A receptors is a major source of inhibition in the nervous system. However, some GABA-A subunit combinations could mediate depolarizing GABA-signaling: in vitro assays indicate that homomeric Rdl or heteromeric Rdl/Lcch3 receptors are typical GABA-gated chloride channels (Zhang et al., 1995), while Lcch3/Grd form GABA-gated cation channels (Gisselmann et al., 2004). However, the in vivo significance of this difference is unknown. Rdl and Lcch3 were expressed in nearly all neurons in our dataset (Figure 7A, Figure 9A,B), consistent with the general inhibitory nature of GABA signaling. By contrast, Grd and another predicted GABA-A receptor subunit, CG8916, were expressed in a minority of cell types (Figure 7A, Figure 9A,B). Photoreceptor neurons, for which no major GABAergic inputs have been identified by connectomics, expressed none of the four transcripts (Figure 9B). Lamina monopolar L1 and L2 were the only neurons other than photoreceptors that did not express significant levels of Rdl. However, both express Grd, Lcch3 and also CG8916. Together with the in vitro findings mentioned above, this result suggests that some or all GABA-A receptors in L1 and L2 may be cation rather than chloride channels. Remarkably, lamina monopolar cells in the housefly Musca, which are thought to have very similar functional properties to those in Drosophila, depolarize in response to GABA (Hardie, 1987) but hyperpolarize in response to histamine (via ort-containing chloride channels). Thus our data identify a potential link between in vivo electrophysiology, in vitro receptor properties and cell type differences in GABA-A subunit (Rdl or Grd) expression.

Figure 9. Using gene expression to functionally interpret circuit structure.

Figure 9.

(A) Different properties of GABA-A receptors in Drosophila observed in in vitro studies. GABA-A receptor subunits can form either cation or anion channels depending on subunit composition. (B) Expression of GABA-A subunits in selected cell types. (C) L1 and two of its target cells form strong reciprocal connections with C2 neurons. (D) Distribution of Rdl and Grd expressing cells at individual C2 synapses. (E) Glutamate receptors can also be excitatory or inhibitory. (F) Examples of expression patterns for selected glutamate receptors and transporters. (G, H) Morphology of Lai (G) and Dm9 (H) cells. Illustrations based on MCFO images of single cells. (I, J) Analysis of the input and output pathways of Lai (I) and Dm9 (J) neurons suggests a potentially similar functional role for these cells. The predicted absence of GluCl-alpha in Dm9 suggests that glutamatergic input from Dm8 to Dm9 may be excitatory.

We next asked whether depolarizing GABA-signaling to L1 and L2 was plausible given their synaptic connectivity. Based on synapse counts and our transmitter data, the main GABAergic inputs to L1 and L2 are C2 and C3 neurons (Meinertzhagen and O'Neil, 1991; Rivera-Alba et al., 2011; Takemura et al., 2013; Takemura et al., 2015). Conversely, L1 is the main input to both C2 and C3 cells, followed by the cholinergic L1 targets L5 and Mi1. These strong connections (illustrated for C2 in Figure 9C) indicate that the effective sign of GABA input to L1 and L2 is almost certainly of functional significance. In the illustrated circuit (Figure 9C), L1 cells hyperpolarize in response to luminance increases (as histamine from photoreceptors opens ort chloride channels). The resulting reduced secretion of glutamate is thought to depolarize L1 targets such as Mi1 (via closing of GluClalpha channels). One plausible, though speculative scenario, is that similar to Mi1, C2 cells also depolarize in response to light. In this case, GABA-gated cation channels in L1 (formed by Grd and Lcch3) would enable negative feedback (counter-acting) from C2 to L1, which for example could return the membrane potential closer to resting levels – speeding up the response to subsequent luminance changes. By contrast, opening of conventional GABA-A receptors (GABA-gated chloride channels) in L1 would resemble a light response (opening of histamine-gated chloride channels), and thus provide positive (reinforcing) feedback in this case. The latter possibility appears less consistent with the transient nature of the L1 (and L2) response to light (Jarvilehto and Zettler, 1971; Laughlin and Hardie, 1978). Distinguishing these and other possibilities will of course require future experimental work.

Similar to the findings for histamine receptors described above (Figure 8F), again using connectivity data for the medulla, we observed that cells with different GABA-A profiles can be postsynaptic at the same synapse (Figure 9D). In addition to L1 and L2, Grd expression indicated several other candidates for cells with unusual GABA responses (Figures 7A and 9B). In these neurons (e.g., Dm8 or Mi4), Rdl and Grd were detected together, raising questions such as whether their subcellular distribution is synapse-specific or whether these subunits might co-assemble into channels with yet unexplored properties.

Diverse patterns of glutamate receptor expression in the targets of a single local interneuron type in the lamina

The diverse expression of glutamate receptor subunits, which can mediate both inhibitory and excitatory signaling, was particularly striking in the lamina (Figures 7A and 9E,F). Notable patterns include the photoreceptor-specific expression of EKAR, the predicted absence of GluClalpha, otherwise broadly expressed in neurons, from some cell types, including photoreceptors (Figures 7A and 9F) and its strong expression in epithelial glia. CG3822, a Kainate-type receptor subunit recently reported to function in presynaptic homeostatic control at the neuromuscular junction (Kiragasi et al., 2017), was strongly enriched in the lamina intrinsic Lai cells. Since Lai neurons are the only known source of vesicular glutamate release in the lamina, CG3822 function in Lai is predicted to also be pre- or perhaps extrasynaptic. T1 and L3, while not expressing VGlut, might also influence glutamate levels in the lamina via the Eaat1 plasma membrane glutamate transporter. The strong expression of this transporter in T1 rather than glia is another unusual feature of glutamatergic signaling in the lamina and may be a clue to the enigmatic function of T1 cells (Tuthill et al., 2013). These examples further highlight how a transmitter released by one neuron type, here Lai, is predicted to have very different effects on target cells due to the receptors they express.

Comparisons of cell shape, synaptic connectivity and receptor expression reveal multiple similarities between local interneurons Lai in the lamina and Dm9 in the medulla

The combination of highly specific EKAR expression and the unusual absence of GluClalpha in photoreceptors prompted us to further explore cellular sources and potential functions for glutamatergic signaling to photoreceptor neurons.

Photoreceptor neurons function over an extremely wide range of light levels, from moonlight to bright sunlight. One mechanism proposed to enable this behavior is a depolarizing feedback signal from photoreceptor targets that increases photoreceptor output under low light conditions, but reduces output at higher light intensities (Zheng et al., 2009). As Lai cells express ort, and thus, like other ort-expressing photoreceptor targets, are thought to hyperpolarize in response to light, increased glutamate release from Lai could provide such light-dependent feedback via EKAR in R-cells. This scenario is consistent with reduced photoreceptor responses at low light intensities after reduction of Lai output or EKAR function (Hu et al., 2015). EKAR is also expressed in R7 and R8 cells, which project to the medulla and are not postsynaptic to Lai. We therefore asked whether there might be a medulla counterpart of Lai neurons.

Synaptic connectivity data identify Dm9 as a strong candidate for such a role: Dm9 is both a major pre- and postsynaptic partner of R7 and R8; it is the only identified R7/R8 target with these properties (other known R7 or R8 targets appear to form few if any feedback synapses on these cells). Remarkably, the overall anatomy of Dm9 cells is also very similar to Lai (Figure 9G,H): Both Lai and Dm9 cells span multiple visual columns but the precise number and distribution of columns innervated by each individual cell is variable. Finally, Lai and Dm9 share key molecular properties: for example, both are glutamatergic and express ort histamine receptors. Based on connectivity and gene expression (Figure 9F,J), Dm9 cells are predicted to receive hyperpolarizing R7 and R8 input via ort and excitatory input from the photoreceptor targets L3 and Dm8. Thus, similar to Lai (Figure 9G,I), Dm9 appears qualified to increase photoreceptor output in the medulla under low light conditions, similar to a proposed function of Lai in the lamina.

One notable difference between Lai and Dm9 is that in contrast to Lai, Dm9 cells receive input from photoreceptor neurons with different spectral tuning. This input involves direct (R7, R8) and indirect pathways (R7 via Dm8, R1-6 via L3) (Figure 9J). This integration of multiple spectral inputs could support a role of Dm9 in color processing. Indeed, the anatomical and predicted functional properties of Dm9 match those of an as yet unidentified ort expressing cell type proposed to contribute to color opponent signaling between R7 and R8 cells (Schnaitmann et al., 2018).

Discussion

We present an approach to characterize the function of neural circuits by combining genetic tools to access their component cells, TAPIN-seq to measure their transcriptomes, and a probabilistic model to interpret these measurements (Figures 1, 2 and 3). We used this approach to establish an extensive resource of the genes expressed in 67 Drosophila cell types, including 53 in the visual system, covering photoreceptors, lamina, and components of the motion detection circuit (Figure 4) and systematically compare our results to single cell RNA-seq (Figure 5). Our approach enables an extensive analysis of neurotransmission in the Drosophila visual system, including the neurotransmitters sent and received across the network as well as transcription factors that potentially regulate neurotransmitter identity (Figures 6 and 7). We also provide specific examples of integrating transcriptomes and connectomes to illuminate circuit function (Figures 8 and 9).

Many recent studies have explored gene expression in neurons. However, only a few of these were aimed at neurons in genetically tractable organisms and brain regions for which detailed anatomical data, especially at the level of synaptic connections, are available. Previous work in the mouse retina has used both genetic (Siegert et al., 2012) and single cell approaches (Macosko et al., 2015) to characterize transcriptional regulators as well as classify cell types. More recent work in Drosophila used single cell RNA-sequencing to characterize heterogeneity in olfactory projection neurons (Li et al., 2017), the midbrain (Croset et al., 2018), the optic lobe (Konstantinides et al., 2018), and the whole brain (Davie et al., 2018). The expression patterns of many genes have also been mapped in C. elegans neurons, whose connectivity has long been known, although these studies typically focus on individual genes rather than genome-wide catalogs (Hobert, 2016). The unique combination of an extensive genetic toolbox to access individual cell types in the Drosophila visual system and systematic efforts to map its connectivity, make it well suited for exploring whether a comprehensive catalog of gene expression is useful for understanding circuit function. Towards this end, we profiled a diverse array of cell types including all of the neuronal cell types that populate the lamina and a subset of cell types in the medulla and lobula complex including those known to play a central role in the detection of motion. We also analyzed a number of cell types residing in deeper brain structures such as the mushroom body and central complex.

Our approach requires genetic driver lines to obtain transcriptomes of specific cell populations. The recent availability of large collections of reagents for split-GAL4 intersections (Dionne et al., 2018; Tirian and Dickson, 2017) make it possible to obtain such lines for virtually any cell type of interest. This expanding genetic toolbox works well with our TAPIN-seq method to profile transcriptomes.

In some cases, available driver lines, including some used in this study (see Supplementary file 1A), may label some additional cell types. In addition to the presence of different, anatomically distinct cell types in a driver pattern, heterogeneity could also result from as yet unrecognized subpopulations within a seemingly uniform group of cells. For example, R7 and R8 photoreceptor neurons each include two major subtypes (pale and yellow) with different rhodopsin expression but very similar, if not identical, cell morphology (Wernet and Desplan, 2004). While drivers with even higher specificity could be obtained through testing of additional split-GAL4 intersections or perhaps triple intersections (Dolan et al., 2017), we did not find the contributions of small numbers of ‘off-target’ cells to be a major limitation for many applications of expression data. In general, the transcriptomes support the high specificity of the intersectional lines we used to access visual system cells (Figure 1). For example, we found specific expression of known marker genes (Figures 2G and 4B) and also that most neurons only express genes for a single neurotransmitter type (Figure 6A). In general, the expression patterns observed in our validation experiments (for example, Figure 3G and Figure 3—figure supplement 2) were highly consistent within a cell type. This suggests that individual cells of many if not all of the anatomical cell types we profile do indeed share specific molecular signatures, even if subdivisions within some types might exist. The availability of specific driver lines makes such validation at cellular resolution possible in a way that is otherwise difficult, for example in single cell RNA-seq studies. Driver lines also permit repeated access to the same cell type in multiple animals at defined time points, enabling the study of behavioral or circadian conditions in individual cell types without having to sequence the whole brain or dissected brain regions.

Modifying the one-step affinity capture in the original INTACT method to a two-step capture in TAPIN-seq increased its specificity, sensitivity, and throughput without the need for time-consuming and labor-intensive centrifugation steps (Figure 1). We initially tried improving the original INTACT method by using density gradient centrifugation to purify nuclei prior to the bead capture step, but this was cumbersome, low throughput, and ineffective for cell types with few nuclei per brain. In addition, for reasons that remain unclear, both photoreceptors and T4 cells consistently yielded few nuclei with this approach. Even with TAPIN, the libraries obtained with some sparser driver lines did not meet the quality control standards we applied. We suspect that the quality of these sub-optimal libraries can be improved by starting with more flies, which is simplified by TAPIN-seq’s ability to use frozen material, enabling the collection of many flies on multiple days at defined time points. In contrast, manual or FACS sorting of dissociated cells is more challenging to scale up, because these more labor-intensive tissue procurement schemes cannot be simplified in the same way. It is also worth noting that our tandem affinity purification approach can improve the specificity of any immunopurification method that uses a capture antibody that is cleavable by IdeZ (all IgG subclasses), without requiring expression of a traditional TAP tag (Rigaut et al., 1999).

TAPIN-seq complements single-cell RNA-seq studies of neurons in several ways (Ecker et al., 2017; Konstantinides et al., 2018). First, our high-resolution transcriptomes will serve as a reference for interpreting single-cell measurements. In particular, comparing our expression catalog to recent single cell maps of the optic lobe and whole brain highlights the challenges in interpreting single cell measurements. Several cell types that we profiled don’t appear as clusters in the single cell map, while others are grouped into the same cluster. The well-established neuroanatomy of the optic lobe makes it an ideal setting to evaluate the accuracy of single cell RNA-seq measurements and raises a broader caution when interpreting scRNA-seq surveys of less well-characterized tissues: the composition of cell types (or states or clusters) observed by scRNA-seq can deviate significantly from their true abundance and requires validation with independent methods. While we analyze fly neurons here, the cautions may also be worth considering in other tissues and species (e.g., the Human Cell Atlas effort; Regev et al., 2017). Having both deep bulk transcriptomes and single cell maps of the same tissue also provides an opportunity for developing new analytical tools that can harness available cell type-identified information while clustering single cell data. Second, combining our approach with single-cell profiling could more efficiently profile heterogeneity within a brain region or genetically defined cell population. Such a combined approach could also help to characterize known cell types for which specific driver lines are not yet available or help to identify and exclude contributions from ‘off-target’ cells to a bulk profile. Finally, the complementarity between bulk and single-cell measurements extends to other genomic features that can be measured in TAPIN-seq purified nuclei, including accessible chromatin and modified histones. We expect this combination of genomic tools to help decipher the transcriptional and epigenetic regulation of neuronal expression programs.

Transcriptome measurements can be of limited utility because it is challenging to interpret relative transcript abundance. In this study we developed a probabilistic mixture modeling approach to classify relative abundances into binary on and off states. This model was a useful guide for interpreting our measurements; most genes are readily described with the two-state model, although the expression of some genes is not (e.g., Rab11; Figure 3—figure supplement 1L). Even for specific genes whose expression is more continuous than bimodal, the results still offer a useful family-wide summary of expression patterns. For example, DPR family members are more broadly expressed than DIP genes (Figure 4C; Figure 4—figure supplement 1D), an observation supported by two recent studies using different methods (Cosmanescu et al., 2018; Venkatasubramanian et al., 2019). Despite our model’s utility, it is important to remember the many potential sources of error (minor cell types in driver line patterns, transcript carry over during TAPIN, biases in RNA-seq library construction and sequencing, etc.) that can affect measurements of relative transcript abundance and the resulting model inferences. Having observed most discrepancies between our modeling results and protein-level expression near the boundary between on and off states, it is prudent to treat these cases more carefully.

Our resource provides additional foundation for systematic functional and molecular studies of the Drosophila visual system. We illustrated how the resource can characterize neurotransmission in the network, particularly when combined with connectome information detailing connectivity between cell types as well as the grouping of post-synaptic partner cell types. We predict neurotransmitters used by every cell we profiled and found two likely cases of co-transmission (Figure 6A). The expression patterns of the major fast-acting transmitters histamine, acetylcholine, glutamate and GABA were comparatively simple: Nearly all neuronal cell types in our catalogue appear to express exactly one of these four transmitters. However, the transcriptomes suggest that many cells also have the potential to release specific neuropeptides, other chemical messengers such as nitric oxide, or form gap junctions with other cells.

While selected transmitter markers (e.g., Gad1 or VGlut) could also be assigned to cell types using methods such as immunolabeling or FISH, these approaches are not practical for comprehensive sampling of markers across these different modes of cell-cell communication. This is particularly clear when the expression patterns of neurotransmitter receptors are also considered (Figure 7A). Our results suggest that, for canonical small molecule transmitters, neurotransmitter output space is tightly tuned while input space is not: neurons typically send just one type of signal but can receive many (Figures 6 and 7). The expression patterns of neurotransmitter receptors provide further context for determining circuit mechanisms (Figure 7—figure supplement 1, Figures 8 and 9). Our results also implicate transcription factors involved in regulating neurotransmitter phenotype, including several that appear to have conserved roles in specifying neuronal identity in other species (Figure 6—figure supplement 1).

The availability of connectivity data for many neurons in the visual system allowed us to interpret neurotransmitter use and receptor distribution in the context of circuit architecture (Takemura et al., 2013; Takemura et al., 2015; Rivera-Alba et al., 2011). For example, our data show expression of both histaminergic and cholinergic markers in R8 photoreceptors. We further find that some major synaptic targets of R8 cells, as identified by electron microscopy, do not express known receptors for histamine (Figure 8F). Given that R7 and R8 cells were previously thought to be exclusively histaminergic both results are unexpected and individually might appear difficult to explain (Gao et al., 2008). However, in combination, these findings make a strong case for a dual histaminergic and cholinergic transmitter phenotype of R8 cells. In contrast, R7 only expresses the histaminergic marker Hdc, and all of its targets express a histamine receptor (Figure 8G).

Finally, our approach especially complements ongoing efforts to map circuit connectivity, which is complete for C. elegans, and is becoming accessible on a whole brain level for Drosophila (Zheng et al., 2018), and for portions of the mouse brain such as the retina. Methods to obtain and interpret serial electron micrographs, array tomography and other methods for mapping connectivity are rapidly progressing (Swanson and Lichtman, 2016; Micheva and Smith, 2007; Kebschull et al., 2016). All told, we are entering a period in neuroscience where connectomics will become pivotal. We expect that genomic approaches, such as the methods for data collection and analysis that we describe here, will enhance these efforts by using transcriptomes to provide, at high-throughput, a molecular proxy for physiological features that are otherwise inaccessible to connectomic methods.

Materials and methods

All reagents and other resources are listed in the Key Resource Table (Supplementary file 2). A detailed description of split-GAL4 hemidrivers (https://bdsc.indiana.edu/stocks/gal4/split_intro.html) and cell-type specific split-GAl4 lines is available online (https://www.janelia.org/split-GAL4).

Experimental models and subject details

Flies were reared on standard cornmeal/molasses food at 25°C. For profiling experiments adults, 4–7 days of age, were entrained to a 12:12 light:dark cycle and anesthetized by CO2 at ZT8 - ZT12. Samples can be stored indefinitely at −80°C after flash freezing in liquid N2. We used female flies for all anatomical characterizations.

Method details

Anatomical analyses

Details of individual genotypes and labeling methods used in the characterization of the driver lines and other anatomical experiments are summarized in Supplementary file 1E. Details of the driver lines are provided in Supplementary file 1A. For the naming of RNA-seq samples, we identified all drivers with a main cell type or cell types (e.g., Mi9_d1). Most of these cell types have been described in detail and were identified based on prior descriptions (for details see Supplementary file 1A; Takemura et al., 2013; Gao et al., 2008; Nern et al., 2015; Fischbach and Dittrich, 1989; Tuthill et al., 2013; Aso et al., 2014; Wu et al., 2016; Wolff and Rubin, 2018; Wolff and Rubin, 2018; Edwards et al., 2012; Helfrich-Förster et al., 2007; Panser et al., 2016; Mauss et al., 2015). The driver names do not attempt to include additional cells present in some drivers. A few of our cell types are strictly groups of related cell types (for example, the muscle cells or, at a different level of a cell type hierarchy, the T4 and T5 cells, with four subtypes each, or R7 photoreceptor neurons, which include R7s of pale and yellow ommatidia).

Generation and characterization of new driver lines

Split-GAL4 and GAL4 driver lines (Supplementary file 1A) were used to express UNC84-2XGFP in defined cell populations. Previously published driver lines were from the following studies (see Supplementary file 1A for details; Tuthill et al., 2013; Diao et al., 2015; Aso et al., 2014; Wu et al., 2016; Strother et al., 2017; Park et al., 2003; Rulifson et al., 2002; Tayler et al., 2012; Taghert et al., 2001; Brand and Perrimon, 1993; Wu et al., 2003; Park et al., 2000; Sweeney et al., 1995; Wolff and Rubin, 2018; von Reyn et al., 2017). New split-GAl4 lines were generated as in previous work (Tuthill et al., 2013; Wu et al., 2016). Briefly, we first identified GAL4 lines with expression in the cell type of interest by screening images of the expression patterns of large collections of such lines (Jenett et al., 2012; Tirian and Dickson, 2017). Typically, several candidate combinations of AD- and DBD-hemidrivers were tested to identify lines with sufficient specificity.

To characterize new driver lines, we examined both overall expression pattern in the brain and optic lobe and, for most lines, confirmed the identity of the main cell type or types using MultiColor FlpOut (MCFO)-labeled single cells (Nern et al., 2015). Since details of the expression patterns of GAL4 or split-GAL4 driver lines can depend on the particular UAS reporter used, we re-imaged 20 drivers with the TAPIN nuclear marker used for the profiling experiments (Figure 2A). In general, the distribution of labeled nuclei in these images appeared to match the expression patterns and specificity expected from the driver line’s original characterization using a membrane marker. As expected, a small number of off-target cells were detectable (often more weakly labeled) in many driver lines.

Validation experiments

For validation experiments, we examined expression patterns of tagged proteins expressed in a near native genomic context using either large BAC-transgenes or modifications of the endogenous loci (Nagarkar-Jaiswal et al., 2015; Diao et al., 2015; Kudron et al., 2018; Lee et al., 2018).

We classified fkh-GFP and Ets65A-GFP as expressed or not expressed by visually comparing nuclear GFP signal in cells of interest (identified using a split-GAL4 driver) to background labeling in surrounding cells. Because of considerable differences in the GFP signal for different cell types, confocal settings and post-imaging adjustments were done individually for different cell types for these experiments.

The following transgenes were used (also see Supplementary file 1E):

PBac{y[+mDint2] w[+mC]=fkh GFP.FPTB}VK00037 (RRID:BDSC_43951), PBac{y[+mDint2] w[+mC]=Ets65 A-GFP.FLAG}VK00037 (RRID:BDSC_38640), Mi{PT-GFSTF.0}Nos[MI09718-GFSTF.0] (RRID:BDSC_60278), Mi{Trojan-GAL4.1}Oamb[MI12417-TG4.1] (RRID:BDSC_67506), Mi{Trojan-GAL4.1}Lim3[MI03817-TG4.1] (RRID:BDSC_67450), Mi{PT-GFSTF.1}klg[MI02135-GFSTF.1] (RRID:BDSC_59787), Mi{PT-GFSTF.2}GluClalpha[MI02890-GFSTF.2] (RRID:BDSC_60533), Mi{PT-GFSTF.0}TfAP-2[MI04611-GFSTF.0] (RRID:BDSC_61776), Mi{Trojan-GAL4.2}kn[MI15480-TG4.2] (RRID:BDSC_67516) pJFRC12-10XUAS-IVS-myr::GFP in attP2 (RRID:BDSC_32197), pJFRC19-13XLexAop2-IVS-myr::GFP in su(Hw)attP8 (RRID:BDSC_32211), and pJFRC21-10XUAS-IVS-mCD8::RFP in attP18.

The VAChT-FRT-STOP-FRT-HA transgene (TI{TI}VAChT[FRT-STOP-FRT.HA] (RRID:BDSC_76021) described in Pankova and Borst (2017) was used to examine VAChT expression in photoreceptor neurons. Flp-recombinase, either sens-FLP (expressed in R8 cells; Chen et al., 2014) (fly stock w[*] P{y[+t7.7] w[+mC]=sens-FLPG5.C}attP18; wg[Sp-1]/CyO; sens[Ly-1]/TM6B, Tb[1] (RRID:BDSC_55768) or ey3.5FLP (expressed in all R-cells; Bazigou et al., 2007) (fly stock P{w[+mC]=ey3.5 FLP.B}1, y[1] w[*]; CyO/In(2LR)Gla, wg[Gla-1] PPO1[Bc] (RRID:BDSC_35542) was used to induce VAChT stop-cassette excision.

Histology

Visualization of split-GAL4 driver line expression patterns with pJFRC51-3XUAS-IVS-Syt::smHA in su(Hw)attP1 and pJFRC225-5XUAS-IVS-myr::smFLAG in VK00005 (Nern et al., 2015) or, in a few cases, 20XUAS-CsChrimson-mVenus in attP18 (Klapoetke et al., 2014) as reporters was performed as described (Aso et al., 2014; Wu et al., 2016). Detailed protocols are also available online (https://www.janelia.org/project-team/flylight/protocols under ‘IHC - Anti-GFP’, ‘IHC - Polarity Sequential’ and ‘DPX mounting’). Multicolor Flp-out (MCFO) markers were detected by immunolabeling with antibodies against HA, FLAG and V5 epitopes as described (Nern et al., 2015). Detailed protocols are also available online (https://www.janelia.org/project-team/flylight/protocols under ‘IHC - MCFO’.

For other experiments, brains of female flies were dissected in insect cell culture medium (Schneider’s Insect Medium, Sigma Aldrich, #S0146) and fixed with 2% PFA (w/v) (prepared from a 20% stock solution, Electron Microscopy Sciences: 15713) also in cell culture medium for 1 hr at room temperature. Brains were washed with 0.5% (v/v) TX-100 (Sigma Aldrich: X100) in PBS and incubated in PBT-NGS (5% Goat Serum [ThermoFisher: 16210–064] in PBT) for at least 30 min. Incubations with primary antibodies and subsequently, after additional PBT washes, secondary antibodies, were in PBT-NGS at 4°C overnight. After additional washes with PBT and then PBS, brains were mounted in SlowFadeGold (ThermoFisher: S36937) and imaged on a Zeiss LSM 710 confocal microscope using 20 × 0.8 NA, 40x NA 1.3 or 63 × 1.4 NA objectives. A few specimens were mounted in DPX following the protocol described in Nern et al. (2015). For experiments using only native fluorescence, brains were fixed as above and mounted and imaged after the initial post-fixation washes.

Primary antibodies used in each experiment are indicated in Supplementary file 1E. Primary antibodies were anti-GFP rabbit polyclonal (ThermoFisher: A-11122, RRID:AB_221569; used at 1:1000 dilution), anti-GFP mouse monoclonal 3E6 (ThermoFisher: A-11120, RRID:AB_221568; dilution 1:100), anti-dsRed rabbit polyclonal (Clontech Laboratories, Inc.: 632496, RRID:AB_10013483; dilution 1:1000), anti-HA rabbit monoclonal C29F4 (Cell Signaling Technologies: 3724S, RRID:AB_1549585; dilution 1:300), anti-FLAG rat monoclonal (DYKDDDDK Epitope Tag Antibody [L5], Novus Biologicals: NBP1-06712, RRID:AB_1625981; 1:200), DyLight 549 or DyLight 550 conjugated anti-V5 mouse monoclonals (AbD Serotec: MCA1360D549GA or MCA1360D550GA, RRID:AB_10850329 or RRID:AB_2687576; 1:500 dilution), anti-cockroach allatostatin (Ast7) mouse monoclonal 5F10 (Stay et al., 1992) (also detects Drosophila AstA (Hergarden et al., 2012); Developmental Studies Hybridoma Bank (DSHB): RRID:AB_528076; dilution 1:5), anti-CadN rat monoclonal DN-Ex #8 (DSHB: RRID:AB_528121; dilution 1:20) (Iwai et al., 1997), anti-chaoptin mouse monoclonal 24B10 (DSHB: RRID:AB_528161, dilution 1:20. Fujita et al., 1982) and anti-Brp mouse monoclonal nc82 (Wagh et al., 2006) (DSHB: RRID:AB_2314866; dilution 1:30).

Secondary antibodies (all from Jackson ImmunoResearch Laboratories, Inc) were DyLight 488-AffiniPure Donkey Anti-Mouse IgG (H+L): 715-485-151, 1:500 dilution; DyLight 594 AffiniPure Donkey anti Rabbit IgG (H+L): 711-515-152, 1:300 dilution; Alexa Fluor 647 AffiniPure Donkey Anti-Rat IgG (H+L): 712-605-153, 1:300 dilution; Alexa Fluor 594 AffiniPure Donkey Anti-Mouse IgG (H+L): 715-585-151,1:300 dilution; Alexa Fluor 647 AffiniPure Donkey Anti-Mouse IgG (H+L): 715-605-151, 1:300 dilution and Alexa Fluor 488 AffiniPure Donkey Anti-Rabbit IgG (H+L): 711-545-152, 1:1000 dilution.

Image processing

Image analyses and processing were mainly done using Fiji (http://fiji.sc) and Vaa3D (Peng et al., 2010). Brightness and contrast were adjusted separately for individual images and channels. Figure panels were assembled using Adobe Indesign. This included selection of fields of view and adjustments of image size. Some images were rotated or mirrored. In some panels with rotated images, empty space outside the original image was filled in with zero pixels. Most of the images in Figure 1—figure supplement 1C,C’ and Figure 1—figure supplement 2 show resampled views that were generated from three dimensional image stacks using the Neuronannotator mode of Vaa3D and exported as TIFF format screenshots.

INTACT purification of nuclei

Frozen adult flies were decapitated by vigorous vortexing. Heads or wings/appendages were then collected on cooled metal sieves (H and C Sieving Systems: 1296, 1297, 1298, 1301). Both flies and purified frozen material can be stored indefinitely at −80°C. In a typical experiment 100–500 frozen heads were added to 5 ml of 20 mM β-glycerophosphate pH7, 200 mM NaCl, 2 mM EDTA, 0.5% NP40, 0.5 mM spermidine, 0.15 mM spermine, 1 mM DTT, 1X complete protease inhibitor (Sigma: 5056489001), 1.5 mg/ml BSA (ThermoFisher: AM2618), 1 mg/ml torula yeast RNA (ThermoFisher: AM7118), 0.6 mg/ml carboxyl coated Dynabeads (ThermoFisher: 14306D) and 2µg anti-GFP antibody (ThermoFisher: G10362, RRID:AB_2536526). Homogenization was carried out on ice by 50 tractions in a Dounce homogenizer using the tight pestle followed by filtration over a 10µm cup filter (Partec: 0400422314). Released chromatin and broken nuclei were adsorbed to carboxyl coated magnetic beads for 30 min at 4°C with constant rotation. Beads were removed on a magnetic stand and the supernatant was diluted to 50 ml with 20 mM β-glycerophosphate pH7, 200 mM NaCl, 2 mM EDTA, 0.5% NP40, 0.5 mM spermidine, 0.15 mM spermine, 1 mM DTT and 1X complete protease inhibitor (Sigma: 5056489001), filtered over a 1µm cup filter (Pluriselect: 435000103) and split into two equal volumes. A 40% Optiprep (Sigma: D1556), 20 mM β-glycerophosphate pH7, 2 mM EDTA and 0.5% NP40 solution was then gently placed under each aliquot, followed by a lower layer of 50% Optiprep, 20 mM β-glycerophosphate pH7, 2 mM EDTA and 0.5% NP40. Nuclei were then pelleted on to the 50% layer for 30 min at 2300Xg. Purified nuclei were passed over a 10µm cup filter, diluted to 10 ml with 20 mM β-glycerophosphate pH7, 200 mM NaCl, 2 mM EDTA, 0.5% NP40, 0.5 mM spermidine, 0.15 mM spermine, 1 mM DTT and 1X complete protease inhibitor and incubated with 30µl of protein G Dynabeads (ThermoFisher: 10004D) for 40 min on ice with occasional agitation. Bead-bound nuclei were recovered on a magnet stand followed by a 20 min incubation on ice in 9mls of 20 mM β-glycerophosphate pH7, 300 mM NaCl, 1M urea, 0.5% NP40, 2 mM EDTA, 0.5 mM spermidine, 0.15 mM spermine, 1 mM DTT, 1X complete protease inhibitor, 0.075 mg/ml torula RNA and 0.05 U/ml Superasin (ThermoFisher: AM2696). Nuclei were then recovered on a magnet stand, resuspended in 1 ml of the previous buffer, passed over a 10µm cup filter, a 5µl aliquot was withdrawn for quantitation and the remainder of the sample solubilized in Arcturus Picopure RNA extraction buffer (ThermoFisher: KIT0204).

TAPIN purification of nuclei

100–3000 frozen heads were added to 5 ml of 20 mM sodium acetate pH8.5, 2.5 mM MgCl2, 250 mM sucrose, 0.5% NP-40, 0.6 mM spermidine, 0.2 mM spermine, 1 mM DTT, 1X complete protease inhibitor, 0.5 mg/ml torula RNA, 0.6 mg/ml carboxyl coated Dynabeads and 2µg anti-GFP antibody (Supplementary file 1F). Homogenization was carried out on ice by 50 tractions in a Dounce homogenizer using the tight pestle followed by filtration over either a 10 or 20µm cup filter (Partec: 0400422314 or 040042315). Released chromatin and broken nuclei were adsorbed to carboxyl coated magnetic beads for 30 min at 4°C with constant rotation. Unbound antibody was removed by incubating the sample on ice for 20 min with 100µl of UNOsphere SUPra resin (Biorad: 1560218), which was previously washed 2X with 500 mM sodium acetate ph8.5/0.5% NP40 and 2 × 20 mM sodium acetate ph8.5/0.5% NP40. After the resin was removed on a 10µm cup filter and the carboxyl beads on a magnet stand, the nuclei-containing supernatant was mixed with an equal volume of 500 mM sodium acetate pH8.5, 250 mM sucrose, 6 mM EGTA, 6 mM EDTA, 0.6 mM spermidine, 0.2 mM spermine, 1 mM DTT, 1X complete protease inhibitor, 0.25 mg/ml torula yeast RNA and 30µl Protein A Dynabeads (ThermoFisher: 10002D) (Supplementary file 1F). A 2 hr incubation on ice with occasional agitation was used to recover tagged nuclei. Bead-bound nuclei were then recovered on a magnet stand and washed twice with 250 mM sodium acetate ph8.5, 250 mM sucrose and 0.1% NP40 (Supplementary file 1F). Nuclei were then released at 37°C for 1 hr by incubation in 50µl of 10 mM Tris pH7.5, 2.5 mM MgCl2, 0.5 mM CaCl2, 250 mM sucrose, 0.1% NP40, 1 mg/ml torula RNA, 40 units RNAsin (Promega: N2515), 2 units DNAseI (NEB: M0303L), 320 units IdeZ protease (NEB: P0770S) (Supplementary file 1F). The sample was diluted to 100µl with 10 mM Tris pH7.5, 2.5 mM MgCl2, 0.5 mM CaCl2, 250 mM sucrose and 0.1% NP40, EGTA was added to 1 mM and the suspension was rapidly triturated 100 times. After returning the sample to a magnet stand, 90µls of buffer containing released nuclei was removed and added to 1.5µl of Protein G Dynabeads that were previously resuspended in 10µl of 10 mM Tris pH7.5, 2.5 mM MgCl2, 0.5 mM CaCl2, 250 mM sucrose and 0.1% NP40. The second binding reaction was run for 1–3 hr on ice with occasional agitation, followed by two 250µl washes in 10 mM Tris pH7.5, 2.5 mM MgCl2, 0.5 mM CaCl2, 250 mM sucrose and 0.1% NP40. Prior to the last wash a 5µl aliquot was removed for quantitation and the remainder of the sample was solubilized in Arcturus Picopure RNA extraction buffer.

RNA-seq library construction

Nuclear RNA was DNAseI (Qiagen: 79254) treated and purified using the Arcturus PicoPure (ThermoFisher: KIT0204) system as instructed by the supplier. Purified RNA was mixed with a 1:100,000 dilution of ERCC standard RNA mix #1 (ThermoFisher: 4456740) and amplified using the Nugen Ovation v2 system (Nugen: 7102–32). cDNA was then blunted, ligated to barcoded linkers (Nugen: 0319–32, 0320–32) and sequenced on an Illumina Hiseq 2500 to 50 bp read length using Rapid Run flow cells.

In total we built 266 RNA-seq libraries, including 46 INTACT-seq, 196 TAPIN-seq, eight total RNA libraries from dissected tissues, and 16 control libraries that we used to characterize each INTACT/TAPIN-seq step (Figure 2C, Supplementary file 1A).

RNA-seq data processing

We trimmed five nucleotides from the 5’ end of reads using seqtk (https://github.com/lh3/seqtk) to remove potential contaminating adapter sequence from the NuGen Ovation kit. We estimated the abundance of annotated genes using kallisto (v0.43.1; Bray et al., 2016) to pseudo-align trimmed reads to the fly transcriptome (cDNA and ncRNA transcript sequences from ENSEMBL release 91, based on FlyBase release 2017_04), ERCC spike-ins, and the INTACT construct sequences GAL4-DBD, p65-AD, and UNC84_2XGFP. ERCC, INTACT tag constructs, and rRNA genes were removed from the abundance tables and the estimated abundances of the remaining genes were renormalized to one million total transcripts. The ERCC spike-ins and nuclear yield values allowed us to convert relative transcript abundance (in Transcripts Per Million, TPM) to absolute abundance (Figure 3—figure supplement 1G). However, we only used relative abundance for our analyses. We also aligned the trimmed reads to the genome using STAR (v2.5.3c; Dobin et al., 2013) and evaluated gene body coverage bias using Picard (v 1.9.1; http://broadinstitute.github.io/picard).

We used three criteria to quantify the quality of each library: the number of genes detected, the pearson correlation between transcript abundances measured in replicates, and the cDNA yield. We used only high-quality libraries (at least 8500 genes detected, 3µg cDNA yield, and 0.85 Pearson’s correlation of transcript abundances in two biological replicates) as input to the model described below.

Comparison to published single cell and FACS-seq datasets

We obtained genes reported to mark the single cell clusters in a recent scRNA-seq study of the optic lobe (Konstantinides et al., 2018). We also obtained the cluster assignments for each single cell in this dataset from the SCope database (Davie et al., 2018), using Seurat clustering resolution 4.0, as reported by the authors. We analyzed FACS-sorted RNA-seq samples reported by Konstantinides et al. by downloading the raw sequencing reads from the NCBI Sequence Read Archive (https://www.ncbi.nlm.nih.gov/sra) and estimating transcript abundance using kallisto and the same transcriptome index as above.

To compare actual and predicted cluster sizes, we used the following numbers for cells per type: Cell types that are thought to be present once (L1, L2, L3, L4, L5, T1, Mi1, T2, T3, Tm2, Tm9, C2,C3, T4a,T4b,T4c,T4d,T5a,T5b,T5c,T5d) or approximately once (Dm8, Tm3) per medulla column based on EM studies (Takemura et al., 2013; Takemura et al., 2015; Takemura et al., 2017) and light microscopy of specific driver lines (for example, T4/T5 Mauss et al., 2014; lamina cells Tuthill et al., 2013, T4 inputs Strother et al., 2017, Dm8 Nern et al., 2015) were estimated as 1 cell/column * 750 columns/medulla * two hemispheres = 1500 cells per brain. Estimates for Dm12 (~120×2 cells per brain, Nern et al., 2015) and Lawf2 (~140×2 cells per brain, Tuthill et al., 2014) were as published. We performed new counts for Pm3 (mean +/- SD 37 +/- 3 cells per optic lobe; n = 4 optic lobes; driver line SS00328) and Lawf1 (151+/- 7 cells per optic lobe; n = 4 optic lobes; two optic lobes each for driver lines SS00689 and SS00800). No precise count was available for Tm5c; since this cell type is known to be present in many but not all medulla columns, we used an estimate of 400 cells per optic lobe x two hemispheres = 800 cells per brain (Takemura et al., 2013; Melnattur et al., 2014; Karuppudurai et al., 2014).

To compare our bulk TAPIN-seq profiles to published single cell datasets, we used non-negative least squares regression to model each bulk TAPIN-seq profile as a linear weighted sum of single cell clusters and a profile-specific residual (TAPIN ~cluster). We began with single-cell expression matrices extracted from the loom files deposited in the SCope database, using the SCopeLoomR package (Davie et al., 2018). After normalizing the transcriptome of each cell from transcript counts to counts per million, we calculated the mean transcriptome for each single cell cluster. We then selected genes that were enriched in either single cell clusters or TAPIN-seq cell types, using the following criteria (adapted from Cao et al., 2019): union of the top-50 genes that were most enriched relative to the average of all other clusters (or TAPIN-seq cell types) and the top-50 genes that were most enriched relative to the maximum level in all other clusters (or TAPIN-seq cell types). We then performed NNLS regression using the Lawson-Hanson implementation (Lawson and Hanson, 1974) available through the nnls R package (Mullen and Stokkum, 2012). To visualize the results, we created heatmaps of the regression coefficients, normalizing the values for each single cell cluster across TAPIN-seq cell types.

We also performed the NNLS regression in the opposite direction (cluster ~TAPIN, explaining single cell clusters as combination of TAPIN-seq profiles), as we thought this direction would more naturally describe mixed clusters composed of multiple cell types (e.g., all photoreceptors, or all monopolar cells). In practice however, this regression assigned coefficients of exactly zero to several TAPIN-seq profiles – as expected given the power of NNLS to recover sparse solutions (Slawski and Hein, 2013) and the collinearity amongst TAPIN-seq profiles (e.g., highly correlated expression amongst photoreceptor subtypes). In contrast, the TAPIN ~ cluster regression correctly matched mixed clusters with the corresponding TAPIN-seq profiles.

Inferring expression state from transcript abundance

We begin with a catalog of S RNA-seq samples generated from nuclei isolated from cell type cell(s) and the estimated abundance (in TPM), of Egs transcripts from gene g in each sample s. We consider only protein-coding genes with at least 10 TPM abundance in at least one sample (n = 12,377 of 13,931 total coding genes).

To interpret Egs, we assume that all genes express in either an 'on' or an 'off' state. Our goal is to infer from these abundances the probability that each gene is expressed in each cell type, P(zgc=on). Depending on the cell types in our catalog, we will observe some genes in both on and off states (bimodal), while others are exclusively off (unimodal-off) or on (unimodal-on). We deal with these scenarios in turn below.

Assuming that a gene is bimodal, we model its expression as arising from a mixture of two gene-specific log-normal distributions describing expression in cells where the gene is off, P(Eg|z=off), and those where the gene is on, p(Eg|z=on), combined with a mixing weight, πg. We use the same standard deviation for both on and off distributions to ensure a monotonic relationship between transcript abundance and the posterior probability of the on state. If we use different standard deviations for each component distribution, the wider one would become more probable than the narrower one at both low and high expression levels.

logEg|z𝒩(μgz,σg)

We estimate the posterior probability of the on state (assuming bimodal expression):

P(zg=on|bimodal)=πgp(Egs|z=on)πgp(Egs|z=on)+(1πg)p(Egs|z=off)

We treated each replicate sample of the same driver as an independent probe of the same underlying driver-line expression state. To combine replicates of the same driver we sum over their likelihoods:

P(zgd=on|bimodal)=πgΠsp(Egs|z=on)πgΠsp(Egs|z=on)+(1πg)Πsp(Egs|z=off)

Similarly, to combine samples from the same cell type we sum over their likelihoods:

P(zgc=on|bimodal)=πgΠsp(Egs|z=on)πgΠsp(Egs|z=on)+(1πg)Πsp(Egs|z=off)

We estimated parameters for each gene-specific mixture model by maximizing the likelihood for the observed sample-level data:

=gs(πgp(Egs|z=on,μgz,σg)+(1πg)p(Egs|z=off,μgz,σg))

Because we assume independence of genes, we separately optimized the model parameters for each gene. To model the possibility that a gene is unimodally expressed across the cell types we analyzed, we also model the data using a single log-normal distribution, estimating the distribution parameters μ and σ and estimating the data likelihood as:

=gsp(Egs|μg,σg)

Deciding whether a gene is bimodally or unimodally expressed is an example of the model selection problem in statistics. To compare the quality of the unimodal and bimodal models for each gene, we used a recently developed approach to leave-one-out cross-validation that uses Pareto-smoothed importance sampling (PSIS-LOO; Vehtari et al., 2017). Specifically, we performed 10-fold cross validation, by randomly holding out 1/10 of the samples as a ‘test’ set (requiring that at least one replicate of each driver exist in the remaining ‘training’ set), fitting the models using only the training data, and then evaluating the likelihood of the test data using the fitted parameters. Each of the ten cross-validation fits, i, returns an ensemble of S = 500 draws from the posterior distribution of the model parameters. We estimated the expected log pointwise predictive density (elpd) of each cross-validation fit by evaluating the likelihood of each held-out dataset i using each parameter draw s:

elpdi^=log(1drawssdrawsp(yi|θs,k))

We then combined the pointwise log-likelihoods for each cross-validation fit to calculate a single estimate for each model:

elpd^=elpdi^

To compare the unimodal and bimodal models, we calculated the difference in elpd as well as its standard error:

Δelpd^=elpd^bimodalelpd^unimodal
se(Δelpd^)=nVin(elpd^ibimodalelpd^iunimodal)

We then picked the model with the higher elpd, unless the difference in elpd was within two multiples of its standard error (abs(Δelpd^)2se(Δelpd^), corresponding approximately to the half-width of a 95% confidence interval in a normal sampling distribution) in which case we considered the two models’ performance to be indistinguishable and chose the simpler unimodal model.

If we decide a gene is unimodal, we must still decide if it is expressed or not. To model the expression state of unimodal genes, we created two separate log-normal distributions of abundances of confidently bimodal genes (Δelpd^>10) using samples where they were either estimated to be on according to the bimodal model (P(zgs=on|bimodal)>0.9) and where they were estimated to be off (P(zgs=on|bimodal)<0.1), combined with a mixing weight, π, set to the fraction of datapoints that were estimated to be ‘on’ according to the bimodal model.

logEg|z𝒩(μz,σz)

We estimate the posterior probability of the on state assuming unimodal expression as:

P(zg=on|unimodal,μg)=πp(μg|z=on)πp(μg|z=on)+(1π)p(μg|z=off)

To build the final matrix of P(zgs=on) calls, we used bimodal estimates for genes where the bimodal model was a better fit than the unimodal model, and the unimodal estimates for the remaining genes.

P(zgs=on)={P(zgs=on|bimodal),if elpd^bimodal(g)>elpd^unimodal(g) and Δelpd^>2se(Δelpd^),P(zg=on|unimodal),otherwise

We did not include the transcriptomes of the dissected samples in the mixture models because we were concerned that their cellular heterogeneity would violate our assumption of binary gene expression in each sample. That is, genes expressed in a subset of the cells of a dissected sample would give rise to transcript abundance intermediate between the off and on states, and thus make it more difficult to accurately infer the component distributions. However, in some cases the dissected samples could be useful for interpreting transcript levels in the cells that we profiled, by providing examples that extend the observed dynamic range. For example, in the case of a gene expressed in a dissected tissue, but not in the cells that we specifically profiled, the dissected levels would add ‘on’ examples that would make it easier to interpret the levels in the cell types as ‘off’. To use the dissected samples to better model dynamic range, we added two ‘dummy’ samples to each model: the minimum and maximum observed level across both the cell catalog and the dissected samples. This choice allowed us to use the dissected levels if they in fact outflanked the cell type-specific levels, while not confusing the model with intermediate abundance levels. Once the models were fit, we could use the inferred parameters to estimate expression probabilities for samples that were not used in the model fit. For example, we estimated the probabilities of expression in the dissected samples to search for genes expressed exclusively in the dissected samples and not in the anatomically defined cell type libraries, indicating potential markers for cells that we did not specifically profile.

We implemented all models using RStan (Stan Development Team, 2017; Carpenter et al., 2017) to infer the posterior distribution of unknown parameters using hamiltonian Markov chain Monte Carlo. We used the same weak prior (N(7,5)) for the mean log-expression levels of both on and off components, allowing us to use Stan’s positive_ordered data type to describe the location of the two components. The models were sampled using 4 independent Markov chains, each run for 500 iterations. Additional details of the model fitting procedure can be found in the STAN files specifying the model and the R code that calls the model fitting procedures (github repository http://github.com/fredpdavis/opticlobe; copy archived at https://github.com/elifesciences-publications/opticlobeDavid, 2020).

Evaluating model accuracy

To evaluate the accuracy of the mixture modeling approach we created a benchmark set of expression data extracted from FlyBase. Specifically, we queried the FlyBase website (http://flybase.org) for genes expressed in the optic lobe or the photoreceptor. The resulting benchmark set included 193 positive and 4 negative expression datapoints. We quantified the model’s accuracy on this benchmark in two ways. First, we quantified concordance between the benchmark expression state and our model’s inferred state. Second, we computed the cumulative distribution function of the inferred probabilities of expression for the positive benchmark datapoints.

Expression-based tree of cell types

To study cell relationships, we used phylogenetic tree-building to compare their expression profiles. We first selected a subset of genes with on-component means of at least exp(3) ~ 21 TPM and difference between on and off components of at least exp(1.5) ~ 4.5 fold. We then encoded the expression profile of each cell as a ‘sequence’ of expression states, where each position represents a gene, and the character indicates the gene is expressed (‘A’, P(zgc=on)> 0.8), not expressed (‘C’, P(zgc=on) < 0.2), or its expression is uncertain (‘N’, 0.2 < P(zgc=on) < 0.8). We computed the Hamming distance between pairs of expression ‘sequences’ considering only unambiguous positions, using the dist.dna() routine in the ape R package (Paradis et al., 2004). We then used the minimum evolution approach to estimate the ‘phylogeny’ of the cells, using the balanced weighting scheme (Desper and Gascuel, 2002), as implemented in the ape fastme.bal() routine. We then built trees from 1000 bootstrapped replicates and quantified the support for each branch on the original tree. We visualized the tree using the phytools R package (Revell, 2012).

Identifying marker genes

We identified marker genes specifically enriched in individual cell types and groups of cells (photoreceptor, glia, muscle, neuron) by searching for genes inferred to be almost exclusively expressed in a single cell type or cell group (P(zgc=on)>=0.9 for all cells within a group, and at most two cells outside a group) and with transcript abundance higher than all cells outside the group.

Evaluating expression patterns for genes with different functions

We used FlyBase Gene Groups (release 2018_02) to assign functions to genes, and considered the most terminal groups in the hierarchy that had at least 10 genes.

Mapping receptor expression onto synapses

To map receptor expression onto synaptic connectivity, we first obtained synapse pairs from Takemura et al. (2013) to identify synaptic targets of R8 (cell #111), R7 (cell #205), and C2 (cell #214) cells in the medulla. When multiple instances of a cell type were available in the synaptic table, we chose the one with the greatest number of synaptic partners. For target cell types that we profiled with TAPIN/INTACT-seq, we discretized their expression as either on (p(on)>=0.8) or off (p(on)<0.8). For cell types that we did not profile, we classified them as unknown receptor expression.

Data and software availability

All raw and processed transcriptome data is available from NCBI GEO (accession GSE116969). The shell scripts used to process the raw RNA-seq data, and the R and Stan programs that implement the mixture model as well as generate all figures and tables in this paper are available at github (http://github.com/fredpdavis/opticlobe). The cell type-level expression table can be explored interactively at http://www.opticlobe.com.

Acknowledgements

This project was supported by the Howard Hughes Medical Institute. FPD was supported in part by the Intramural Research Program of the National Institute of Arthritis and Musculoskeletal and Skin Diseases of the National Institutes of Health. We would like to thank Teresa Tian and all members of the Janelia Fly Facility for their help in maintaining and rearing flies, Tanya Wolff and Arnim Jenett for the central complex and Heather Dionne for the l-LNv split-GAL4 lines, and the FlyLight Project Team for imaging of split-GAL4 expression patterns.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Fred P Davis, Email: fredpdavis@gmail.com.

Gilbert L Henry, Email: henry@cshl.edu.

Hugo J Bellen, Baylor College of Medicine, United States.

K VijayRaghavan, National Centre for Biological Sciences, Tata Institute of Fundamental Research, India.

Funding Information

This paper was supported by the following grants:

  • National Institute of Arthritis and Musculoskeletal and Skin Diseases Intramural Research Program to Fred P Davis.

  • Howard Hughes Medical Institute to Fred P Davis, Aljoscha Nern, Serge Picard, Michael B Reiser, Gerald M Rubin, Sean R Eddy.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Data curation, Software, Formal analysis, Investigation, Methodology, Writing - original draft, Writing - review and editing.

Conceptualization, Data curation, Validation, Investigation, Methodology, Writing - original draft, Writing - review and editing.

Resources, Investigation.

Funding acquisition, Writing - review and editing.

Funding acquisition, Writing - review and editing.

Conceptualization, Supervision, Funding acquisition, Writing - review and editing.

Conceptualization, Data curation, Investigation, Methodology, Writing - original draft, Writing - review and editing.

Additional files

Supplementary file 1. Supplemental tables.

(A) All drivers. (B) RNAseq samples. (C) Benchmark entries. (D) Marker genes. (E) Anatomy details. (F) TAPIN-seq buffers.

elife-50901-supp1.xlsx (56.5KB, xlsx)
Supplementary file 2. Key Resource Table.

List of key resources and reagents used in this study.

elife-50901-supp2.docx (149.4KB, docx)
Transparent reporting form

Data availability

All raw and processed transcriptome data is available from NCBI GEO (accession GSE116969).

The following dataset was generated:

Davis FP, Nern A, Picard S, Reiser MB, Rubin GM, Eddy SR, Henry GL. 2019. A genetic, genomic, and computational resource for exploring neural circuit function. NCBI Gene Expression Omnibus. GSE116969

The following previously published datasets were used:

Konstantinides N, Kapuralin K, Desplan C. 2018. RNA sequencing of Drosophila melanogaster optic lobe cell types. NCBI Gene Expression Omnibus. GSE103772

Konstantinides N, Kapuralin K, Desplan C. 2018. Single-cell RNA sequencing of Drosophila melanogaster optic lobe cells. NCBI Gene Expression Omnibus. GSE103771

Davie K, Janssens J, Koldere D, Aerts S. 2018. A single-cell transcriptome atlas of the ageing Drosophila brain. NCBI Gene Expression Omnibus. GSE107451

References

  1. Alejevski F, Saint-Charles A, Michard-Vanhée C, Martin B, Galant S, Vasiliauskas D, Rouyer F. The HisCl1 histamine receptor acts in photoreceptors to synchronize Drosophila behavioral rhythms with light-dark cycles. Nature Communications. 2019;10:252. doi: 10.1038/s41467-018-08116-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Amrein H, Axel R. Genes expressed in neurons of adult male Drosophila. Cell. 1997;88:459–469. doi: 10.1016/S0092-8674(00)81886-3. [DOI] [PubMed] [Google Scholar]
  3. Apitz H, Salecker I. A challenge of numbers and diversity: neurogenesis in the Drosophila optic lobe. Journal of Neurogenetics. 2014;28:233–249. doi: 10.3109/01677063.2014.922558. [DOI] [PubMed] [Google Scholar]
  4. Apitz H, Salecker I. Spatio-temporal relays control layer identity of direction-selective neuron subtypes in Drosophila. Nature Communications. 2018;9:2295. doi: 10.1038/s41467-018-04592-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Arenz A, Drews MS, Richter FG, Ammer G, Borst A. The temporal tuning of the Drosophila motion detectors is determined by the dynamics of their input elements. Current Biology. 2017;27:929–944. doi: 10.1016/j.cub.2017.01.051. [DOI] [PubMed] [Google Scholar]
  6. Aso Y, Hattori D, Yu Y, Johnston RM, Iyer NA, Ngo TT, Dionne H, Abbott LF, Axel R, Tanimoto H, Rubin GM. The neuronal architecture of the mushroom body provides a logic for associative learning. eLife. 2014;3:e04577. doi: 10.7554/eLife.04577. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Barnstedt O, Owald D, Felsenberg J, Brain R, Moszynski JP, Talbot CB, Perrat PN, Waddell S. Memory-Relevant mushroom body output synapses are cholinergic. Neuron. 2016;89:1237–1247. doi: 10.1016/j.neuron.2016.02.015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Bazigou E, Apitz H, Johansson J, Lorén CE, Hirst EM, Chen PL, Palmer RH, Salecker I. Anterograde jelly belly and alk receptor tyrosine kinase signaling mediates retinal axon targeting in Drosophila. Cell. 2007;128:961–975. doi: 10.1016/j.cell.2007.02.024. [DOI] [PubMed] [Google Scholar]
  9. Belote JM, Handler AM, Wolfner MF, Livak KJ, Baker BS. Sex-specific regulation of yolk protein gene expression in Drosophila. Cell. 1985;40:339–348. doi: 10.1016/0092-8674(85)90148-5. [DOI] [PubMed] [Google Scholar]
  10. Brand AH, Perrimon N. Targeted gene expression as a means of altering cell fates and generating dominant phenotypes. Development. 1993;118:401–415. doi: 10.1242/dev.118.2.401. [DOI] [PubMed] [Google Scholar]
  11. Bray NL, Pimentel H, Melsted P, Pachter L. Near-optimal probabilistic RNA-seq quantification. Nature Biotechnology. 2016;34:525–527. doi: 10.1038/nbt.3519. [DOI] [PubMed] [Google Scholar]
  12. Cao J, Spielmann M, Qiu X, Huang X, Ibrahim DM, Hill AJ, Zhang F, Mundlos S, Christiansen L, Steemers FJ, Trapnell C, Shendure J. The single-cell transcriptional landscape of mammalian organogenesis. Nature. 2019;566:496–502. doi: 10.1038/s41586-019-0969-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Carpenter B, Gelman A, Hoffman MD, Lee D, Goodrich B, Betancourt M, Brubaker M, Guo J, Li P, Riddell A. Stan: a probabilistic programming language. Journal of Statistical Software. 2017;76:1–32. doi: 10.18637/jss.v076.i01. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Carrillo RA, Özkan E, Menon KP, Nagarkar-Jaiswal S, Lee PT, Jeon M, Birnbaum ME, Bellen HJ, Garcia KC, Zinn K. Control of synaptic connectivity by a network of Drosophila IgSF cell surface proteins. Cell. 2015;163:1770–1782. doi: 10.1016/j.cell.2015.11.022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Chen Y, Akin O, Nern A, Tsui CY, Pecot MY, Zipursky SL. Cell-type-specific labeling of synapses in vivo through synaptic tagging with recombination. Neuron. 2014;81:280–293. doi: 10.1016/j.neuron.2013.12.021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Cosmanescu F, Katsamba PS, Sergeeva AP, Ahlsen G, Patel SD, Brewer JJ, Tan L, Xu S, Xiao Q, Nagarkar-Jaiswal S, Nern A, Bellen HJ, Zipursky SL, Honig B, Shapiro L. Neuron-Subtype-Specific expression, interaction affinities, and specificity determinants of DIP/Dpr cell recognition proteins. Neuron. 2018;100:1385–1400. doi: 10.1016/j.neuron.2018.10.046. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Crocker A, Guan X-J, Murphy CT, Murthy M. Cell-Type-Specific transcriptome analysis in the Drosophila mushroom body reveals Memory-Related changes in gene expression. Cell Reports. 2016;15:1580–1596. doi: 10.1016/j.celrep.2016.04.046. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Croset V, Treiber CD, Waddell S. Cellular diversity in the Drosophila midbrain revealed by single-cell transcriptomics. eLife. 2018;7:e34550. doi: 10.7554/eLife.34550. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. David FP. Github; 2020. https://github.com/fredpdavis/opticlobe [Google Scholar]
  20. Davie K, Janssens J, Koldere D, De Waegeneer M, Pech U, Kreft Ł, Aibar S, Makhzami S, Christiaens V, Bravo González-Blas C, Poovathingal S, Hulselmans G, Spanier KI, Moerman T, Vanspauwen B, Geurs S, Voet T, Lammertyn J, Thienpont B, Liu S, Konstantinides N, Fiers M, Verstreken P, Aerts S. A Single-Cell transcriptome atlas of the aging Drosophila brain. Cell. 2018;174:982–998. doi: 10.1016/j.cell.2018.05.057. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Deal RB, Henikoff S. A simple method for gene expression and chromatin profiling of individual cell types within a tissue. Developmental Cell. 2010;18:1030–1040. doi: 10.1016/j.devcel.2010.05.013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Desper R, Gascuel O. Fast and accurate phylogeny reconstruction algorithms based on the minimum-evolution principle. Journal of Computational Biology. 2002;9:687–705. doi: 10.1089/106652702761034136. [DOI] [PubMed] [Google Scholar]
  23. Diao F, Ironfield H, Luan H, Diao F, Shropshire WC, Ewer J, Marr E, Potter CJ, Landgraf M, White BH. Plug-and-play genetic access to Drosophila cell types using exchangeable exon cassettes. Cell Reports. 2015;10:1410–1421. doi: 10.1016/j.celrep.2015.01.059. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Dionne H, Hibbard KL, Cavallaro A, Kao JC, Rubin GM. Genetic reagents for making Split-GAL4 lines in Drosophila. Genetics. 2018;209:31–35. doi: 10.1534/genetics.118.300682. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15–21. doi: 10.1093/bioinformatics/bts635. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Dolan MJ, Luan H, Shropshire WC, Sutcliffe B, Cocanougher B, Scott RL, Frechter S, Zlatic M, Jefferis G, White BH. Facilitating Neuron-Specific genetic manipulations in Drosophila melanogaster Using a Split GAL4 Repressor. Genetics. 2017;206:775–784. doi: 10.1534/genetics.116.199687. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Ecker JR, Geschwind DH, Kriegstein AR, Ngai J, Osten P, Polioudakis D, Regev A, Sestan N, Wickersham IR, Zeng H. The BRAIN initiative cell census consortium: lessons learned toward generating a comprehensive brain cell atlas. Neuron. 2017;96:542–557. doi: 10.1016/j.neuron.2017.10.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Edwards TN, Nuschke AC, Nern A, Meinertzhagen IA. Organization and metamorphosis of Glia in the Drosophila visual system. The Journal of Comparative Neurology. 2012;520:2067–2085. doi: 10.1002/cne.23071. [DOI] [PubMed] [Google Scholar]
  29. Fischbach K-F, Dittrich APM. The optic lobe of Drosophila Melanogaster. I. A golgi analysis of wild-type structure. Cell and Tissue Research. 1989;258:441–475. doi: 10.1007/BF00218858. [DOI] [Google Scholar]
  30. Fortini ME, Rubin GM. Analysis of cis-acting requirements of the Rh3 and Rh4 genes reveals a bipartite organization to rhodopsin promoters in Drosophila Melanogaster. Genes & Development. 1990;4:444–463. doi: 10.1101/gad.4.3.444. [DOI] [PubMed] [Google Scholar]
  31. Fujita SC, Zipursky SL, Benzer S, Ferrús A, Shotwell SL. Monoclonal antibodies against the Drosophila nervous system. PNAS. 1982;79:7929–7933. doi: 10.1073/pnas.79.24.7929. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Gao S, Takemura SY, Ting CY, Huang S, Lu Z, Luan H, Rister J, Thum AS, Yang M, Hong ST, Wang JW, Odenwald WF, White BH, Meinertzhagen IA, Lee CH. The neural substrate of spectral preference in Drosophila. Neuron. 2008;60:328–342. doi: 10.1016/j.neuron.2008.08.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Gisselmann G, Plonka J, Pusch H, Hatt H. Drosophila Melanogaster GRD and LCCH3 subunits form heteromultimeric GABA-gated cation channels. British Journal of Pharmacology. 2004;142:409–413. doi: 10.1038/sj.bjp.0705818. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Hardie RC. Is histamine a neurotransmitter in insect photoreceptors? Journal of Comparative Physiology A. 1987;161:201–213. doi: 10.1007/BF00615241. [DOI] [PubMed] [Google Scholar]
  35. Helfrich-Förster C, Shafer OT, Wülbeck C, Grieshaber E, Rieger D, Taghert P. Development and morphology of the clock-gene-expressing lateral neurons of Drosophila Melanogaster. The Journal of Comparative Neurology. 2007;500:47–70. doi: 10.1002/cne.21146. [DOI] [PubMed] [Google Scholar]
  36. Henry GL, Davis FP, Picard S, Eddy SR. Cell type-specific genomics of Drosophila neurons. Nucleic Acids Research. 2012;40:9691–9704. doi: 10.1093/nar/gks671. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Hergarden AC, Tayler TD, Anderson DJ. Allatostatin-A neurons inhibit feeding behavior in adult Drosophila. PNAS. 2012;109:3967–3972. doi: 10.1073/pnas.1200778109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Hobert O. Terminal selectors of neuronal identity. Current Topics in Developmental Biology. 2016;116:455–475. doi: 10.1016/bs.ctdb.2015.12.007. [DOI] [PubMed] [Google Scholar]
  39. Hu W, Wang T, Wang X, Han J. Ih channels control feedback regulation from amacrine cells to photoreceptors. PLOS Biology. 2015;13:e1002115. doi: 10.1371/journal.pbio.1002115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Huang Y, Ng FS, Jackson FR. Comparison of Larval and Adult Drosophila Astrocytes Reveals Stage-Specific Gene Expression Profiles. G3: Genes|Genomes|Genetics. 2015;5:551–558. doi: 10.1534/g3.114.016162. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Iwai Y, Usui T, Hirano S, Steward R, Takeichi M, Uemura T. Axon patterning requires DN-cadherin, a novel neuronal adhesion receptor, in the Drosophila embryonic CNS. Neuron. 1997;19:77–89. doi: 10.1016/S0896-6273(00)80349-9. [DOI] [PubMed] [Google Scholar]
  42. Jarvilehto M, Zettler F. Localized intracellular potentials from pre- and postsynaptic components in the external plexiform layer of an insect retina. Zeitschrift FuR Vergleichende Physiologie. 1971;75:422–440. doi: 10.1007/BF00630561. [DOI] [Google Scholar]
  43. Jenett A, Rubin GM, Ngo T-TB, Shepherd D, Murphy C, Dionne H, Pfeiffer BD, Cavallaro A, Hall D, Jeter J, Iyer N, Fetter D, Hausenfluck JH, Peng H, Trautman ET, Svirskas RR, Myers EW, Iwinski ZR, Aso Y, DePasquale GM, Enos A, Hulamm P, Lam SCB, Li H-H, Laverty TR, Long F, Qu L, Murphy SD, Rokicki K, Safford T, Shaw K, Simpson JH, Sowell A, Tae S, Yu Y, Zugates CT. A GAL4-Driver line resource for Drosophila neurobiology. Cell Reports. 2012;2:991–1001. doi: 10.1016/j.celrep.2012.09.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Karuppudurai T, Lin TY, Ting CY, Pursley R, Melnattur KV, Diao F, White BH, Macpherson LJ, Gallio M, Pohida T, Lee CH. A hard-wired glutamatergic circuit pools and relays UV signals to mediate spectral preference in Drosophila. Neuron. 2014;81:603–615. doi: 10.1016/j.neuron.2013.12.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Kebschull JM, Garcia da Silva P, Reid AP, Peikon ID, Albeanu DF, Zador AM. High-Throughput mapping of Single-Neuron projections by sequencing of barcoded RNA. Neuron. 2016;91:975–987. doi: 10.1016/j.neuron.2016.07.036. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Kiragasi B, Wondolowski J, Li Y, Dickman DK. A presynaptic glutamate receptor subunit confers robustness to neurotransmission and homeostatic potentiation. Cell Reports. 2017;19:2694–2706. doi: 10.1016/j.celrep.2017.06.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Klapoetke NC, Murata Y, Kim SS, Pulver SR, Birdsey-Benson A, Cho YK, Morimoto TK, Chuong AS, Carpenter EJ, Tian Z, Wang J, Xie Y, Yan Z, Zhang Y, Chow BY, Surek B, Melkonian M, Jayaraman V, Constantine-Paton M, Wong GK, Boyden ES. Independent optical excitation of distinct neural populations. Nature Methods. 2014;11:338–346. doi: 10.1038/nmeth.2836. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Konstantinides N, Kapuralin K, Fadil C, Barboza L, Satija R, Desplan C. Phenotypic convergence: distinct transcription factors regulate common terminal features. Cell. 2018;174:622–635. doi: 10.1016/j.cell.2018.05.021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Kudron MM, Victorsen A, Gevirtzman L, Hillier LW, Fisher WW, Vafeados D, Kirkey M, Hammonds AS, Gersch J, Ammouri H, Wall ML, Moran J, Steffen D, Szynkarek M, Seabrook-Sturgis S, Jameel N, Kadaba M, Patton J, Terrell R, Corson M, Durham TJ, Park S, Samanta S, Han M, Xu J, Yan KK, Celniker SE, White KP, Ma L, Gerstein M, Reinke V, Waterston RH. The ModERN resource: genome-wide binding profiles for hundreds of Drosophila and Caenorhabditis elegans Transcription Factors. Genetics. 2018;208:937–949. doi: 10.1534/genetics.117.300657. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Laughlin SB, Hardie RC. Common strategies for light adaptation in the peripheral visual systems of fly and dragonfly. Journal of Comparative Physiology ? A. 1978;128:319–340. doi: 10.1007/BF00657606. [DOI] [Google Scholar]
  51. Lawson C, Hanson R. Solving Least Squares Problems. Society for Industrial and Applied Mathematics; 1974. [DOI] [Google Scholar]
  52. Lee PT, Zirin J, Kanca O, Lin WW, Schulze KL, Li-Kroeger D, Tao R, Devereaux C, Hu Y, Chung V, Fang Y, He Y, Pan H, Ge M, Zuo Z, Housden BE, Mohr SE, Yamamoto S, Levis RW, Spradling AC, Perrimon N, Bellen HJ. A gene-specific T2A-GAL4 library for Drosophila. eLife. 2018;7:e35574. doi: 10.7554/eLife.35574. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Li H, Horns F, Wu B, Xie Q, Li J, Li T, Luginbuhl DJ, Quake SR, Luo L. Classifying Drosophila olfactory projection neuron subtypes by Single-Cell RNA sequencing. Cell. 2017;171:1206–1220. doi: 10.1016/j.cell.2017.10.019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Long X, Colonell J, Wong AM, Singer RH, Lionnet T. Quantitative mRNA imaging throughout the entire Drosophila brain. Nature Methods. 2017;14:703–706. doi: 10.1038/nmeth.4309. [DOI] [PubMed] [Google Scholar]
  55. Lowenstein CJ, Snyder SH. Nitric oxide, a novel biologic messenger. Cell. 1992;70:705–707. doi: 10.1016/0092-8674(92)90301-R. [DOI] [PubMed] [Google Scholar]
  56. Luan H, Peabody NC, Vinson CR, White BH. Refined spatial manipulation of neuronal function by combinatorial restriction of transgene expression. Neuron. 2006;52:425–436. doi: 10.1016/j.neuron.2006.08.028. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Luo L, Callaway EM, Svoboda K. Genetic dissection of neural circuits: a decade of progress. Neuron. 2018;98:256–281. doi: 10.1016/j.neuron.2018.03.040. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Macosko EZ, Basu A, Satija R, Nemesh J, Shekhar K, Goldman M, Tirosh I, Bialas AR, Kamitaki N, Martersteck EM, Trombetta JJ, Weitz DA, Sanes JR, Shalek AK, Regev A, McCarroll SA. Highly parallel Genome-wide expression profiling of individual cells using nanoliter droplets. Cell. 2015;161:1202–1214. doi: 10.1016/j.cell.2015.05.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Mauss AS, Meier M, Serbe E, Borst A. Optogenetic and pharmacologic dissection of feedforward inhibition in Drosophila motion vision. Journal of Neuroscience. 2014;34:2254–2263. doi: 10.1523/JNEUROSCI.3938-13.2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Mauss AS, Pankova K, Arenz A, Nern A, Rubin GM, Borst A. Neural circuit to integrate opposing motions in the visual field. Cell. 2015;162:351–362. doi: 10.1016/j.cell.2015.06.035. [DOI] [PubMed] [Google Scholar]
  61. Mauss AS, Vlasits A, Borst A, Feller M. Visual circuits for direction selectivity. Annual Review of Neuroscience. 2017;40:211–230. doi: 10.1146/annurev-neuro-072116-031335. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Meinertzhagen IA, O'Neil SD. Synaptic organization of columnar elements in the Lamina of the wild type in Drosophila Melanogaster. The Journal of Comparative Neurology. 1991;305:232–263. doi: 10.1002/cne.903050206. [DOI] [PubMed] [Google Scholar]
  63. Meinertzhagen IA, Sorra KE. Synaptic organization in the fly's optic lamina: few cells, many synapses and divergent microcircuits. Progress in Brain Research. 2001;131:53–69. doi: 10.1016/s0079-6123(01)31007-5. [DOI] [PubMed] [Google Scholar]
  64. Melnattur KV, Pursley R, Lin TY, Ting CY, Smith PD, Pohida T, Lee CH. Multiple redundant medulla projection neurons mediate color vision in Drosophila. Journal of Neurogenetics. 2014;28:374–388. doi: 10.3109/01677063.2014.891590. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Micheva KD, Smith SJ. Array tomography: a new tool for imaging the molecular architecture and ultrastructure of neural circuits. Neuron. 2007;55:25–36. doi: 10.1016/j.neuron.2007.06.014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Mo A, Mukamel EA, Davis FP, Luo C, Henry GL, Picard S, Urich MA, Nery JR, Sejnowski TJ, Lister R, Eddy SR, Ecker JR, Nathans J. Epigenomic signatures of neuronal diversity in the mammalian brain. Neuron. 2015;86:1369–1384. doi: 10.1016/j.neuron.2015.05.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Morante J, Desplan C. The color-vision circuit in the medulla of Drosophila. Current Biology. 2008;18:553–565. doi: 10.1016/j.cub.2008.02.075. [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Mullen KM, Stokkum IHMvan. nnls: The Lawson-Hanson algorithm for non- negative least squares (NNLS) 2012
  69. Nagarkar-Jaiswal S, Lee P-T, Campbell ME, Chen K, Anguiano-Zarate S, Cantu Gutierrez M, Busby T, Lin W-W, He Y, Schulze KL, Booth BW, Evans-Holm M, Venken KJT, Levis RW, Spradling AC, Hoskins RA, Bellen HJ. A library of MiMICs allows tagging of genes and reversible, spatial and temporal knockdown of proteins in Drosophila. eLife. 2015;4:e05338. doi: 10.7554/eLife.05338. [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Nériec N, Desplan C. From the eye to the brain: development of the Drosophila visual system. Current Topics in Developmental Biology. 2016;116:247–271. doi: 10.1016/bs.ctdb.2015.11.032. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Nern A, Pfeiffer BD, Rubin GM. Optimized tools for multicolor stochastic labeling reveal diverse stereotyped cell arrangements in the fly visual system. PNAS. 2015;112:E2967–E2976. doi: 10.1073/pnas.1506763112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Otsuna H, Ito K. Systematic analysis of the visual projection neurons of Drosophila Melanogaster. I. Lobula-specific pathways. The Journal of Comparative Neurology. 2006;497:928–958. doi: 10.1002/cne.21015. [DOI] [PubMed] [Google Scholar]
  73. Özkan E, Carrillo RA, Eastman CL, Weiszmann R, Waghray D, Johnson KG, Zinn K, Celniker SE, Garcia KC. An extracellular interactome of immunoglobulin and LRR proteins reveals receptor-ligand networks. Cell. 2013;154:228–239. doi: 10.1016/j.cell.2013.06.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Pankova K, Borst A. Transgenic line for the identification of cholinergic release sites in Drosophila Melanogaster. The Journal of Experimental Biology. 2017;220:1405–1410. doi: 10.1242/jeb.149369. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Panser K, Tirian L, Schulze F, Villalba S, Jefferis G, Bühler K, Straw AD. Automatic segmentation of Drosophila neural compartments using GAL4 expression data reveals novel visual pathways. Current Biology. 2016;26:1943–1954. doi: 10.1016/j.cub.2016.05.052. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Pantazis A, Segaran A, Liu CH, Nikolaev A, Rister J, Thum AS, Roeder T, Semenov E, Juusola M, Hardie RC. Distinct roles for two histamine receptors (hclA and hclB) at the Drosophila photoreceptor synapse. Journal of Neuroscience. 2008;28:7250–7259. doi: 10.1523/JNEUROSCI.1654-08.2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Paradis E, Claude J, Strimmer K. APE: analyses of phylogenetics and evolution in R language. Bioinformatics. 2004;20:289–290. doi: 10.1093/bioinformatics/btg412. [DOI] [PubMed] [Google Scholar]
  78. Park JH, Helfrich-Förster C, Lee G, Liu L, Rosbash M, Hall JC. Differential regulation of circadian pacemaker output by separate clock genes in Drosophila. PNAS. 2000;97:3608–3613. doi: 10.1073/pnas.97.7.3608. [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Park JH, Schroeder AJ, Helfrich-Förster C, Jackson FR, Ewer J. Targeted ablation of CCAP neuropeptide-containing neurons of Drosophila causes specific defects in execution and circadian timing of ecdysis behavior. Development. 2003;130:2645–2656. doi: 10.1242/dev.00503. [DOI] [PubMed] [Google Scholar]
  80. Peng H, Ruan Z, Long F, Simpson JH, Myers EW. V3D enables real-time 3D visualization and quantitative analysis of large-scale biological image data sets. Nature Biotechnology. 2010;28:348–353. doi: 10.1038/nbt.1612. [DOI] [PMC free article] [PubMed] [Google Scholar]
  81. Raghu SV, Claussen J, Borst A. Neurons with GABAergic phenotype in the visual system of Drosophila. The Journal of Comparative Neurology. 2013;521:252–265. doi: 10.1002/cne.23208. [DOI] [PubMed] [Google Scholar]
  82. Raghu SV, Borst A. Candidate glutamatergic neurons in the visual system of Drosophila. PLOS ONE. 2011;6:e19472. doi: 10.1371/journal.pone.0019472. [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Regev A, Teichmann SA, Lander ES, Amit I, Benoist C, Birney E, Bodenmiller B, Campbell P, Carninci P, Clatworthy M, Clevers H, Deplancke B, Dunham I, Eberwine J, Eils R, Enard W, Farmer A, Fugger L, Göttgens B, Hacohen N, Haniffa M, Hemberg M, Kim S, Klenerman P, Kriegstein A, Lein E, Linnarsson S, Lundberg E, Lundeberg J, Majumder P, Marioni JC, Merad M, Mhlanga M, Nawijn M, Netea M, Nolan G, Pe'er D, Phillipakis A, Ponting CP, Quake S, Reik W, Rozenblatt-Rosen O, Sanes J, Satija R, Schumacher TN, Shalek A, Shapiro E, Sharma P, Shin JW, Stegle O, Stratton M, Stubbington MJT, Theis FJ, Uhlen M, van Oudenaarden A, Wagner A, Watt F, Weissman J, Wold B, Xavier R, Yosef N, Human Cell Atlas Meeting Participants The human cell atlas. eLife. 2017;6:e27041. doi: 10.7554/eLife.27041. [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Revell LJ. Phytools: an R package for phylogenetic comparative biology (and other things) Methods in Ecology and Evolution. 2012;3:217–223. doi: 10.1111/j.2041-210X.2011.00169.x. [DOI] [Google Scholar]
  85. Rigaut G, Shevchenko A, Rutz B, Wilm M, Mann M, Séraphin B. A generic protein purification method for protein complex characterization and proteome exploration. Nature Biotechnology. 1999;17:1030–1032. doi: 10.1038/13732. [DOI] [PubMed] [Google Scholar]
  86. Rivera-Alba M, Vitaladevuni SN, Mishchenko Y, Mischenko Y, Lu Z, Takemura SY, Scheffer L, Meinertzhagen IA, Chklovskii DB, de Polavieja GG. Wiring economy and volume exclusion determine neuronal placement in the Drosophila brain. Current Biology. 2011;21:2000–2005. doi: 10.1016/j.cub.2011.10.022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Rulifson EJ, Kim SK, Nusse R. Ablation of insulin-producing neurons in flies: growth and diabetic phenotypes. Science. 2002;296:1118–1120. doi: 10.1126/science.1070058. [DOI] [PubMed] [Google Scholar]
  88. Sarthy PV. Histamine: a neurotransmitter candidate for Drosophila photoreceptors. Journal of Neurochemistry. 1991;57:1757–1768. doi: 10.1111/j.1471-4159.1991.tb06378.x. [DOI] [PubMed] [Google Scholar]
  89. Schlichting M, Menegazzi P, Lelito KR, Yao Z, Buhl E, Dalla Benetta E, Bahle A, Denike J, Hodge JJ, Helfrich-Förster C, Shafer OT. A neural network underlying circadian entrainment and photoperiodic adjustment of sleep and activity in Drosophila. Journal of Neuroscience. 2016;36:9084–9096. doi: 10.1523/JNEUROSCI.0992-16.2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  90. Schnaitmann C, Haikala V, Abraham E, Oberhauser V, Thestrup T, Griesbeck O, Reiff DF. Color processing in the early visual system of Drosophila. Cell. 2018;172:318–330. doi: 10.1016/j.cell.2017.12.018. [DOI] [PubMed] [Google Scholar]
  91. Shinomiya K, Huang G, Lu Z, Parag T, Xu CS, Aniceto R, Ansari N, Cheatham N, Lauchie S, Neace E, Ogundeyi O, Ordish C, Peel D, Shinomiya A, Smith C, Takemura S, Talebi I, Rivlin PK, Nern A, Scheffer LK, Plaza SM, Meinertzhagen IA. Comparisons between the ON- and OFF-edge motion pathways in the Drosophila brain. eLife. 2019;8:e40025. doi: 10.7554/eLife.40025. [DOI] [PMC free article] [PubMed] [Google Scholar]
  92. Siegert S, Cabuy E, Scherf BG, Kohler H, Panda S, Le YZ, Fehling HJ, Gaidatzis D, Stadler MB, Roska B. Transcriptional code and disease map for adult retinal cell types. Nature Neuroscience. 2012;15:487–495. doi: 10.1038/nn.3032. [DOI] [PubMed] [Google Scholar]
  93. Silies M, Gohl DM, Clandinin TR. Motion-detecting circuits in flies: coming into view. Annual Review of Neuroscience. 2014;37:307–327. doi: 10.1146/annurev-neuro-071013-013931. [DOI] [PubMed] [Google Scholar]
  94. Slawski M, Hein M. Non-negative least squares for high-dimensional linear models: consistency and sparse recovery without regularization. Electronic Journal of Statistics. 2013;7:3004–3056. doi: 10.1214/13-EJS868. [DOI] [Google Scholar]
  95. Stan Development Team RStan: the R interface to stan. 2017 http://mc-stan.org
  96. Stay B, Chan KK, Woodhead AP. Allatostatin-immunoreactive neurons projecting to the corpora Allata of adult Diploptera punctata. Cell & Tissue Research. 1992;270:15–23. doi: 10.1007/BF00381875. [DOI] [PubMed] [Google Scholar]
  97. Strausfeld NJ. Brain organization and the origin of insects: an assessment. Proceedings of the Royal Society B: Biological Sciences. 2009;276:1929–1937. doi: 10.1098/rspb.2008.1471. [DOI] [PMC free article] [PubMed] [Google Scholar]
  98. Strother JA, Wu ST, Wong AM, Nern A, Rogers EM, Le JQ, Rubin GM, Reiser MB. The emergence of directional selectivity in the visual motion pathway of Drosophila. Neuron. 2017;94:168–182. doi: 10.1016/j.neuron.2017.03.010. [DOI] [PubMed] [Google Scholar]
  99. Strother JA, Wu ST, Rogers EM, Eliason JLM, Wong AM, Nern A, Reiser MB. Behavioral state modulates the ON visual motion pathway of Drosophila. PNAS. 2018;115:E102–E111. doi: 10.1073/pnas.1703090115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  100. Svensson V, Natarajan KN, Ly LH, Miragaia RJ, Labalette C, Macaulay IC, Cvejic A, Teichmann SA. Power analysis of single-cell RNA-sequencing experiments. Nature Methods. 2017;14:381–387. doi: 10.1038/nmeth.4220. [DOI] [PMC free article] [PubMed] [Google Scholar]
  101. Swanson LW, Lichtman JW. From cajal to connectome and beyond. Annual Review of Neuroscience. 2016;39:197–216. doi: 10.1146/annurev-neuro-071714-033954. [DOI] [PubMed] [Google Scholar]
  102. Sweeney ST, Broadie K, Keane J, Niemann H, O'Kane CJ. Targeted expression of tetanus toxin light chain in Drosophila specifically eliminates synaptic transmission and causes behavioral defects. Neuron. 1995;14:341–351. doi: 10.1016/0896-6273(95)90290-2. [DOI] [PubMed] [Google Scholar]
  103. Taghert PH, Hewes RS, Park JH, O'Brien MA, Han M, Peck ME. Multiple amidated neuropeptides are required for normal circadian locomotor rhythms in Drosophila. The Journal of Neuroscience. 2001;21:6673–6686. doi: 10.1523/JNEUROSCI.21-17-06673.2001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  104. Takemura SY, Lu Z, Meinertzhagen IA. Synaptic circuits of the Drosophila optic lobe: the input terminals to the medulla. The Journal of Comparative Neurology. 2008;509:493–513. doi: 10.1002/cne.21757. [DOI] [PMC free article] [PubMed] [Google Scholar]
  105. Takemura SY, Bharioke A, Lu Z, Nern A, Vitaladevuni S, Rivlin PK, Katz WT, Olbris DJ, Plaza SM, Winston P, Zhao T, Horne JA, Fetter RD, Takemura S, Blazek K, Chang LA, Ogundeyi O, Saunders MA, Shapiro V, Sigmund C, Rubin GM, Scheffer LK, Meinertzhagen IA, Chklovskii DB. A visual motion detection circuit suggested by Drosophila connectomics. Nature. 2013;500:175–181. doi: 10.1038/nature12450. [DOI] [PMC free article] [PubMed] [Google Scholar]
  106. Takemura SY, Xu CS, Lu Z, Rivlin PK, Parag T, Olbris DJ, Plaza S, Zhao T, Katz WT, Umayam L, Weaver C, Hess HF, Horne JA, Nunez-Iglesias J, Aniceto R, Chang LA, Lauchie S, Nasca A, Ogundeyi O, Sigmund C, Takemura S, Tran J, Langille C, Le Lacheur K, McLin S, Shinomiya A, Chklovskii DB, Meinertzhagen IA, Scheffer LK. Synaptic circuits and their variations within different columns in the visual system of Drosophila. PNAS. 2015;112:13711–13716. doi: 10.1073/pnas.1509820112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  107. Takemura SY, Nern A, Chklovskii DB, Scheffer LK, Rubin GM, Meinertzhagen IA. The comprehensive connectome of a neural substrate for 'ON' motion detection in Drosophila. eLife. 2017;6:e24394. doi: 10.7554/eLife.24394. [DOI] [PMC free article] [PubMed] [Google Scholar]
  108. Tan L, Zhang KX, Pecot MY, Nagarkar-Jaiswal S, Lee PT, Takemura SY, McEwen JM, Nern A, Xu S, Tadros W, Chen Z, Zinn K, Bellen HJ, Morey M, Zipursky SL. Ig superfamily ligand and receptor pairs expressed in synaptic partners in Drosophila. Cell. 2015;163:1756–1769. doi: 10.1016/j.cell.2015.11.021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  109. Tayler TD, Pacheco DA, Hergarden AC, Murthy M, Anderson DJ. A neuropeptide circuit that coordinates sperm transfer and copulation duration in Drosophila. PNAS. 2012;109:20697–20702. doi: 10.1073/pnas.1218246109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  110. Tirian L, Dickson B. The VT GAL4, LexA, and split-GAL4 driver line collections for targeted expression in the Drosophila nervous system. bioRxiv. 2017 doi: 10.1101/198648. [DOI]
  111. Tuthill JC, Nern A, Holtz SL, Rubin GM, Reiser MB. Contributions of the 12 neuron classes in the fly Lamina to motion vision. Neuron. 2013;79:128–140. doi: 10.1016/j.neuron.2013.05.024. [DOI] [PMC free article] [PubMed] [Google Scholar]
  112. Tuthill JC, Nern A, Rubin GM, Reiser MB. Wide-field feedback neurons dynamically tune early visual processing. Neuron. 2014;82:887–895. doi: 10.1016/j.neuron.2014.04.023. [DOI] [PubMed] [Google Scholar]
  113. Varija Raghu S, Reiff DF, Borst A. Neurons with cholinergic phenotype in the visual system of Drosophila. The Journal of Comparative Neurology. 2011;519:162–176. doi: 10.1002/cne.22512. [DOI] [PubMed] [Google Scholar]
  114. Vehtari A, Gelman A, Gabry J. Practical bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing. 2017;27:1413–1432. doi: 10.1007/s11222-016-9696-4. [DOI] [Google Scholar]
  115. Venkatasubramanian L, Guo Z, Xu S, Tan L, Xiao Q, Nagarkar-Jaiswal S, Mann RS. Stereotyped terminal axon branching of leg motor neurons mediated by IgSF proteins DIP-α and Dpr10. eLife. 2019;8:e42692. doi: 10.7554/eLife.42692. [DOI] [PMC free article] [PubMed] [Google Scholar]
  116. von Reyn CR, Nern A, Williamson WR, Breads P, Wu M, Namiki S, Card GM. Feature integration drives probabilistic behavior in the Drosophila escape response. Neuron. 2017;94:1190–1204. doi: 10.1016/j.neuron.2017.05.036. [DOI] [PubMed] [Google Scholar]
  117. Wagh DA, Rasse TM, Asan E, Hofbauer A, Schwenkert I, Dürrbeck H, Buchner S, Dabauvalle MC, Schmidt M, Qin G, Wichmann C, Kittel R, Sigrist SJ, Buchner E. Bruchpilot, a protein with homology to ELKS/CAST, is required for structural integrity and function of synaptic active zones in Drosophila. Neuron. 2006;49:833–844. doi: 10.1016/j.neuron.2006.02.008. [DOI] [PubMed] [Google Scholar]
  118. Wenick AS, Hobert O. Genomic cis-regulatory architecture and trans-acting regulators of a single interneuron-specific gene battery in C. elegans. Developmental Cell. 2004;6:757–770. doi: 10.1016/j.devcel.2004.05.004. [DOI] [PubMed] [Google Scholar]
  119. Wernet MF, Desplan C. Building a retinal mosaic: cell-fate decision in the fly eye. Trends in Cell Biology. 2004;14:576–584. doi: 10.1016/j.tcb.2004.09.007. [DOI] [PubMed] [Google Scholar]
  120. Wolff T, Rubin GM. Neuroarchitecture of the Drosophila central complex: a catalog of Nodulus and asymmetrical body neurons and a revision of the protocerebral bridge catalog. Journal of Comparative Neurology. 2018;526:2585–2611. doi: 10.1002/cne.24512. [DOI] [PMC free article] [PubMed] [Google Scholar]
  121. Wu Q, Wen T, Lee G, Park JH, Cai HN, Shen P. Developmental control of foraging and social behavior by the Drosophila neuropeptide Y-like system. Neuron. 2003;39:147–161. doi: 10.1016/S0896-6273(03)00396-9. [DOI] [PubMed] [Google Scholar]
  122. Wu M, Nern A, Williamson WR, Morimoto MM, Reiser MB, Card GM, Rubin GM. Visual projection neurons in the Drosophila lobula link feature detection to distinct behavioral programs. eLife. 2016;5:e21022. doi: 10.7554/eLife.21022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  123. Xu Y, Wang T. LOVIT is a putative vesicular histamine transporter required in Drosophila for vision. Cell Reports. 2019;27:1327–1333. doi: 10.1016/j.celrep.2019.04.024. [DOI] [PubMed] [Google Scholar]
  124. Yasuyama K, Meinertzhagen IA. Extraretinal photoreceptors at the compound eye's posterior margin in Drosophila melanogaster. The Journal of Comparative Neurology. 1999;412:193–202. doi: 10.1002/(SICI)1096-9861(19990920)412:2&#x0003c;193::AID-CNE1&#x0003e;3.0.CO;2-0. [DOI] [PubMed] [Google Scholar]
  125. Zhang HG, Lee HJ, Rocheleau T, ffrench-Constant RH, Jackson MB. Subunit composition determines picrotoxin and bicuculline sensitivity of Drosophila gamma-aminobutyric acid receptors. Molecular Pharmacology. 1995;48:835–840. [PubMed] [Google Scholar]
  126. Zheng L, Nikolaev A, Wardill TJ, O'Kane CJ, de Polavieja GG, Juusola M. Network adaptation improves temporal representation of naturalistic stimuli in Drosophila eye: i dynamics. PLOS ONE. 2009;4:e4307. doi: 10.1371/journal.pone.0004307. [DOI] [PMC free article] [PubMed] [Google Scholar]
  127. Zheng Z, Lauritzen JS, Perlman E, Robinson CG, Nichols M, Milkie D, Torrens O, Price J, Fisher CB, Sharifi N, Calle-Schuler SA, Kmecova L, Ali IJ, Karsh B, Trautman ET, Bogovic JA, Hanslovsky P, Jefferis G, Kazhdan M, Khairy K, Saalfeld S, Fetter RD, Bock DD. A complete electron microscopy volume of the brain of adult Drosophila Melanogaster. Cell. 2018;174:730–743. doi: 10.1016/j.cell.2018.06.019. [DOI] [PMC free article] [PubMed] [Google Scholar]

Decision letter

Editor: Hugo J Bellen1

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Thank you for submitting your article "A genetic, genomic, and computational resource for exploring neural circuit function" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by a Reviewing Editor and K VijayRaghavan as the Senior Editor. The reviewers have opted to remain anonymous.

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

The three reviewers are in agreement that this work is a very valuable contribution to the literature and that it should be published in eLife. I am also of the same opinion. However, they all three raised issue that will require some textual changes, better explanations of the modeling. Here is a summary of the most important issues.

Summary

In this manuscript Davis et al. have used different enhancer combinations to mark diverse subtypes of cells, mostly from the visual system, and conduct RNA-seq analysis on isolated nuclei to assign gene expression patterns observed in 67 cell types. The authors first refined the INTACT method, which helps to isolate nuclei specified by a GAL4 expression pattern. In the new method (TAPIN) the authors add tandem affinity purification to the nucleus isolation pipeline. The authors show that this modification increases sensitivity of detection of transcripts and specificity of isolated nuclei compared to the previous methodology. The authors convert the often-graded expression levels to two normal distributions of binary on-off states. They then convert this data to probability of expression scores by using mathematical modeling. The accuracy of this modelling approach is demonstrated in multiple examples. The authors compare the expression patterns that they obtained to the recently available single cell sequencing data sets. Interestingly very few expression patterns directly map to a single cell transcription profile. Nevertheless, this comparison helped to determine the identity of many single cell sequencing clusters obtained in previous studies. Finally, the authors analyze the expression patterns of neurotransmitters and their receptors. By combining high resolution connectome obtained by EM with the expression patterns obtained by TAPIN and INTACT methods the authors find multiple wiring paradigms in the Drosophila brain.

The authors try to find relationships between cell types. They unsurprisingly find grouping of cell types with similar developmental origins and structures. They compare their results to published single-cell sequencing datasets from the optic lobes and the brain and confirmed that the single-cell datasets consisted of clusters that include more than one cell types, which was already known based on the number of clusters itself. They manage to annotate and correct some of the single-cell clusters, showing that bulk transcriptomes can be used to interpret single-cell clusters. They go on to analyze neurotransmitter and neurotransmitter receptor expression in different cell types, before finally trying to interpret connectomic data in the light of their transcriptomes.

Major comments:

– There are issues that need to be discussed more openly. For instance, in many instances the RNA expression pattern of a cell type is referred to a cells expression pattern. This would assume that the expression pattern is homogenous in a given cell type. This is a big assumption. The fact that the TAPIN expression patterns do not directly correlate with the single cell expression profiles can indicate that the cells from a cell type can have divergence in their expression profile. This has implications about the cell types that show multiple neurotransmitters. This issue should be more clearly discussed.

– Another issue is that the authors use the RNA levels and protein levels interchangeably, disregarding the possibility of post transcriptional regulation (This is more pronounced when they use the RNA expression data about cell adhesion molecules in Figure 4—figure supplement 1). This should also be addressed.

– The extent to which single cell sequencing clusters can be compared to TAPIN profiles should be better discussed. The TAPIN profiles are RNA expression of cells that share one or two enhancers (by the use of GAL4 or splitGAL4). Unbiased clustering of single cell sequencing data is based on multiple principle components. The manuscript compares the two approaches and suggest one is better than the other. These are complementary approaches and the data are different and difficult to compare (due to possible RNA expression heterogeneity within cell types). This is reflected in the fact that mid-level hierarchical clustering was not well supported by the TAPIN data (Figure 4A) whereas discrete clustering can be observed in single cell sequencing data. The best experiment to address this would be to do single cell sequencing on TAPIN isolated population of neurons nuclei for one or two cell types. I am not requiring this but they should discuss this better

– Although this is a resource paper, the resource is not accessible to people with limited programming or computational skills. In the web portal opticlobe.com, a simple heatmap can be generated based on the user input. This visualization can hardly reflect the depth and complexity of the data. Please refer to https://gtexportal.org/home/ as a reference to add more tools for general users.

– The description of the mixture model is confusing. The lowercase and uppercase p is mixed for different probabilistic events. Do they denote different functions? Conditioned on "bimodal", the posterior probability equation in subsection “Inferring expression state from transcript abundance” is very confusing. Did you integrate out the p(bimodal)? If not, how is the Bayesian rule applied here? The math equations are not well defined.

– On the model selection section, the first equation used sums of probability, the second equation uses a log sum of probability. Is there a reference or proof for such modeling choices from STAT or machine learning literature?

– The mixture model is well studied in both Bayesian and Frequentist framework, how does this approach compare to the standard Bayesian Gaussian mixture model?

– Figure 3C is the main result of the mixture model and the lower heatmap showed about HALF of the cells in the matrix are read and the other half is blue across all the genes and samples. This is likely due to an artifact of the customized mixture modeling. Since most of the results in the manuscript depend on the probability estimated from the mixture model, it is crucial to show mathematically that the model is correct.

eLife. 2020 Jan 15;9:e50901. doi: 10.7554/eLife.50901.sa2

Author response


Major comments:

– There are issues that need to be discussed more openly. For instance, in many instances the RNA expression pattern of a cell type is referred to a cells expression pattern. This would assume that the expression pattern is homogenous in a given cell type. This is a big assumption. The fact that the TAPIN expression patterns do not directly correlate with the single cell expression profiles can indicate that the cells from a cell type can have divergence in their expression profile. This has implications about the cell types that show multiple neurotransmitters. This issue should be more clearly discussed.

The reviewers raise several important points. By construction, our approach aims to profile the bulk expression of all the cells in a given GAL4 or split-GAL4 pattern. Most of the driver lines we used were selected to be primarily expressed in cells with shared anatomical properties that distinguish these cells from the cells of other types. In particular, many of the optic lobe cell types we profile are anatomically very well defined and have long been accepted as distinct cell types in the literature on the fly visual system. While we go through extensive efforts to characterize the specificity of the driver patterns by anatomy, it is true that the cells in such a pattern could exhibit different profiles. In a few cases such heterogeneity is expected: As we discuss in the text and detail in Supplementary file 1A, some of our driver lines combine related cell types or include “contaminants” (cells of other types) in addition to the targeted neurons. In addition, as we now explicitly discuss, cells without obvious anatomical differences at the light microscopy level could still include hidden subtypes with distinct gene expression (such as the pale and yellow subtypes of R7 and R8 photoreceptor neurons with different rhodopsin expression but similar anatomy). Finally, there are of course sources of variation in gene expression other than cell type differences. For example, even within the exact same cell, the transcriptome can vary over time, e.g. due to circadian rhythms or, over a different time-scale, aging (Davie et al., 2018).

While we sometimes refer to a cell type’s expression as a cell’s expression, we now clarify this wording in the text: “(For simplicity, we will hereon refer to cell type as just cell. This is not meant to exclude the possibility of heterogeneity within the individual cells of a profiled population.)”

For many applications, combined data for populations of related cell types can be useful, since such cells are likely to share many molecular properties. For example, we profile driver lines expressed in motion-sensitive T4/T5 neurons that combine eight anatomically distinct subpopulations. As we highlight in Figure 4D-H, using drivers that express in different subsets of these eight cell types, we can identify clear differences between the transcriptomes of some subtypes. Nevertheless, most molecular properties of T4/T5 cells appear to be shared across the different subpopulations. Recently, the Zipursky lab has sorted T4 and T5 cells and used single cell RNA-seq to more comprehensively characterize T4/T5 subtypes (Kurmangaliyev et al., eLife 2019). This scRNA analysis identified eight T4/T5 subclusters, exactly in line with the predictions from anatomy. This work also confirms our finding of Tfap2 and klg as markers for different T4/T5 subsets and supports the general similarity of the eight subtypes.

To the reviewers specific point of how heterogeneity within cell types might explain differences between TAPIN and single cell profiles, we would remind the reviewer that we compared our TAPIN profiles to the average profile of single cells within each cluster, not to individual single cells. And so, any heterogeneity within a single cell cluster would be averaged out in the comparison to TAPIN-seq profiles. In fact, the resolution of the single cell maps that we analyzed (e.g. collapsing the eight T4/T5 subtypes into a single cluster) makes it unlikely that they resolved many named cell types into previously unknown distinct molecular subtypes, and the authors of those maps make no such claims.

Finally, we agree that, using only a given bulk profile, it is not possible to formally rule out that two neurotransmitter markers (or, for that matter, any other combination of transcripts) that appear co-expressed are actually present in distinct, non-overlapping sets of cells. Single cell studies do not necessarily avoid this issue either: For example, Konstatinides et al. report two clusters with apparent co-expression of cholinergic with glutamatergic or GABAergic markers, respectively, but are unable to determine whether this is true co-expression (and in addition do not know the identity of the cell type(s) involved). In our view, the best approach to resolve such uncertainty are follow-up experiments with an independent method: For example, we used immunolabeling of a tagged VAChT protein (Figure 8A) to confirm the unexpected expression of cholinergic markers in R8 photoreceptors cells. Together with published work on histamine expression in R-cells and our observation that histamine receptors are expressed in some but not all synaptic partners of R8 cells (Figure 8B-F), this strongly suggests a co-transmitter phenotype of these neurons. In general, because TAPIN-seq profiles are linked to markers (driver lines) for the profiled cells, performing validation experiments for selected transcripts is straightforward and we provide several examples of this.

– Another issue is that the authors use the RNA levels and protein levels interchangeably, disregarding the possibility of post transcriptional regulation (This is more pronounced when they use the RNA expression data about cell adhesion molecules in Figure 4—figure supplement 1). This should also be addressed.

All of the TAPIN-seq data that we measured were at the transcript level. We compared it to protein expression as a more stringent functional-requirement on “real” expression. The reviewer is correct that there could be additional levels of post-transcriptional regulation and we clarify this point in the text.

“Protein expression can of course differ from that of the corresponding mRNA due to post-transcriptional regulation. However, since most functional interpretations of transcriptome data are implicitly about protein expression, we used this as a more stringent and practical test of our model.”

– The extent to which single cell sequencing clusters can be compared to TAPIN profiles should be better discussed. The TAPIN profiles are RNA expression of cells that share one or two enhancers (by the use of GAL4 or splitGAL4). Unbiased clustering of single cell sequencing data is based on multiple principle components. The manuscript compares the two approaches and suggest one is better than the other. These are complementary approaches and the data are different and difficult to compare (due to possible RNA expression heterogeneity within cell types). This is reflected in the fact that mid-level hierarchical clustering was not well supported by the TAPIN data (Figure 4A) whereas discrete clustering can be observed in single cell sequencing data. The best experiment to address this would be to do single cell sequencing on TAPIN isolated population of neurons nuclei for one or two cell types. I am not requiring this but they should discuss this better

We agree with the reviewer that single cell sequencing and bulk profiling such as TAPIN-seq are complementary approaches. Indeed, we mention this repeatedly in the text, as well as titling Figure 5 to reflect this idea: “TAPIN-seq complements single cell RNA seq profiling”.

We also devote extensive discussion to this theme, including the reviewers suggestion of combining single cell sequencing and TAPIN profiling.

“TAPIN-seq complements single-cell RNA-seq studies of neurons in several ways”“

“Having both deep bulk transcriptomes and single cell maps of the same tissue also provides an opportunity for developing new analytical tools that can harness available cell type-identified information while clustering single cell data.”

“Second, combining our approach with single-cell profiling could more efficiently profile heterogeneity within a brain region or genetically defined cell population”

“Finally, the complementarity between bulk and single-cell measurements extends to other genomic features that can be measured in TAPIN-seq purified nuclei, including accessible chromatin and modified histones.”

In the recent Zipursky lab single cell study of T4/T5 neurons (see above), a GAL4 driver was used to mark cells for presorting by FACS prior to scRNASeq. In this case, the results provide strong support for the notion that anatomical features can be reliable criteria for cell type classification (both anatomy and single cell sequencing indicate exactly eight T4/T5 subtypes).

We anticipate that similar experiments with other optic lobe neurons using our driver lines would produce similar results (i.e. support for anatomically and genetically defined cell types), but do not exclude the possibility that there could be unexpected subdivisions within some of the cell types we profile (as we now discuss).

The reviewer suggests that unbiased clustering of single cell data is somehow more informative (“uses multiple principal components”, a dimension reduction of the 15K gene abundances) and better powered to detect cellular differences than TAPIN-seq profiles of cells that “share one or two enhancers” (although we also measure 15K gene levels). We think this might reflect a mis-understanding of what exactly the one or two enhancers represent. The key feature of our driver lines is not the number of elements used in their construction but the specificity of the expression patterns. The cells labeled by one of these GAL4 or split-GAL4 lines are not randomly related, but through extensive anatomical efforts, are believed to be of the same (anatomically-defined) cell type (or group of cell types). Therefore, the expectation, and our observation in this paper, are that the transcriptomes we measure in these cells using TAPIN-seq in fact reflect the transcriptome of specific cell types.

Bulk and single-cell data are difficult to compare for a number of reasons, that we discuss in the manuscript. Within-cell heterogeneity could potentially be one reason, but we doubt that is the main reason. In fact, as we showed, the unbiased clustering of single cell maps often merge cells of different types, which TAPIN-seq is able to resolve thanks to the genetic tools that differentially mark them. Even if the single cells were correctly clustered so that each cluster does reflect a cell type, their transcriptomes would still be difficult to compare directly to bulk transcriptomes, as the measurement characteristics are quite different, with single cell data using current technologies exhibiting high levels of dropout, as we describe in the text. Nevertheless, as we showed, comparing the two datasets is still a fruitful way of overcoming the main limitations of both: the requirement for genetic tools of TAPIN-seq, and the difficulty of resolving and labeling cell types in scRNA-seq. We also further emphasized the potential of combining bulk and single cell methods in the Discussion:

”Second, combining our approach with single-cell profiling could more efficiently profile heterogeneity within a brain region or genetically defined cell population. Such a combined approach could also help to characterize known cell types for which specific driver lines are not yet available or help to identify and exclude contributions from “off-target” cells to a bulk profile.”

The lack of support for a mid-level hierarchical clustering in the TAPIN-seq profiles is not contradicted by a discrete clustering of single cell data. Whether single cells of a type cluster together is not the same as asking which other cell types/clusters they are most closely related to. We went to lengths to evaluate the uncertainty within our clustering trees and present the confidence values alongside the tree. In contrast, the trees presented in most single cell approaches, either do not perform this evaluation or do not present it alongside the tree. For example, the tree presented in Figure 3 of Konstantinides et al., 2018 does not show this type of evaluation. They do show a similar bootstrapped evaluation as a supplemental Figure (S4), and this shows that very few of their branch points have significant statistical support. In brief, just because you can build a tree doesn’t mean it accurately reflects the transcriptome measurements, let alone provide reliable clues to biological features such as developmental or evolutionary lineage that might conceivably be reflected in such trees.

– Although this is a resource paper,the resource is not accessible to people with limited programming or computational skills. In the web portal opticlobe.com, a simple heatmap can be generated based on the user input. This visualization can hardly reflect the depth and complexity of the data. Please refer to https://gtexportal.org/home/ as a reference to add more tools for general users.

We agree with the reviewer that access to the data is key. For this reason, we have made the expression tables available for download so that any user can explore them using user-friendly spreadsheet programs, such as Microsoft Excel. To remove the hurdle of downloading and opening the table in Excel, we also developed a web portal to support simple queries. We see this web portal mainly as a tool for initial queries which can then be followed up by downloading the spreadsheets or even reanalyzing the original data. This web portal has been useful for others in the field (eg, doi 10.1523/JNEUROSCI.3213-18.2019).

We agree that a better developed web interface would be ideal, and the GTEX portal is indeed an informative guide for this: This project received $800K from the NIH this year to develop “a portal and integrative collaborative analysis platform for GTEX” (https://projectreporter.nih.gov/project_info_description.cfm?aid=9729842&icde=47359899). This kind of effort is beyond the scope of this paper.

– The description of the mixture model is confusing. The lowercase and uppercase p is mixed for different probabilistic events. Do they denote different functions? Conditioned on "bimodal", the posterior probability equation in subsection “Inferring expression state from transcript abundance” is very confusing. Did you integrate out the p(bimodal)? If not, how is the Bayesian rule applied here? The math equations are not well defined.

We thank the reviewer for bringing the confusion about lowercase and uppercase p to our attention. In the statistics literature, uppercase P refers to the probability of a discrete event and lowercase p refers to a continuous density function. We fixed three cases where we inconsistently used uppercase P where it should have been lowercase p.

Yes, in a discrete sense, we did integrate out p(bimodal). As we described in the text, we used the PSIS-LOO approach to selecting whether each gene was better described by a unimodal or bimodal model. Depending on this binary call (either unimodal or bimodal was selected), we chose either the p(expressed | bimodal) or the p(expressed | unimodal). This is equivalent to using the model selection to decide p(gene type = bimodal) = 1 or p(gene type = unimodal) = 1.

– On the model selection section, the first equation used sums of probability, the second equation uses a log sum of probability. Is there a reference or proof for such modeling choices from STAT or machine learning literature?

This math is widely adopted in the Bayesian inference field, and comes from the PSIS-LOO paper that we cite at the beginning of this section (See Equations 20 and 21 from Vehtari et al., 2015; https://arxiv.org/pdf/1507.04544.pdf).

– The mixture model is well studied in both Bayesian and Frequentist framework, how does this approach compare to the standard Bayesian Gaussian mixture model?

Our model is mostly a standard bayesian mixture model, with the wrinkle of the model selection process that we describe in the paper. It would have been ideal if there was a single hierarchical model that jointly learned both the gene type (bimodal vs unimodal) as well as the gene-specific on/off distributions. However, we were unable to build such a model, and the one that we did ultimately build was sufficient to help us successfully navigate our transcriptome catalog.

– Figure 3C is the main result of the mixture model and the lower heatmap showed about HALF of the cells in the matrix are read and the other half is blue across all the genes and samples. This is likely due to an artifact of the customized mixture modeling. Since most of the results in the manuscript depend on the probability estimated from the mixture model, it is crucial to show mathematically that the model is correct.

While there are indeed several choices that go into the modeling, we go through extensive comparison to published expression datasets and extensive protein-level validation of two genes across thirty cell types. In our hands, the probability model was a useful guide for navigating our data. In the case that one is not comfortable with the model calls, they can also use the raw expression levels, which we also provide for download.

In reality, all mathematical models are wrong. As we described in the text, genes are not just on and off; different RNA species have different degradation rates, different translation rates, etc. Like all experiments, every step in a transcriptome study suffers from various kinds of artifacts, ranging from the raw measurements to interpretation.

But in practice, we showed that the measurements are informative, recapitulate known biology, serve as a guide to interpret other genomics datasets, are consistent and corroborated by connectomics data, and provide new insight into the Drosophila visual system.

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Data Citations

    1. Davis FP, Nern A, Picard S, Reiser MB, Rubin GM, Eddy SR, Henry GL. 2019. A genetic, genomic, and computational resource for exploring neural circuit function. NCBI Gene Expression Omnibus. GSE116969 [DOI] [PMC free article] [PubMed]
    2. Konstantinides N, Kapuralin K, Desplan C. 2018. RNA sequencing of Drosophila melanogaster optic lobe cell types. NCBI Gene Expression Omnibus. GSE103772
    3. Konstantinides N, Kapuralin K, Desplan C. 2018. Single-cell RNA sequencing of Drosophila melanogaster optic lobe cells. NCBI Gene Expression Omnibus. GSE103771
    4. Davie K, Janssens J, Koldere D, Aerts S. 2018. A single-cell transcriptome atlas of the ageing Drosophila brain. NCBI Gene Expression Omnibus. GSE107451 [DOI] [PMC free article] [PubMed]

    Supplementary Materials

    Supplementary file 1. Supplemental tables.

    (A) All drivers. (B) RNAseq samples. (C) Benchmark entries. (D) Marker genes. (E) Anatomy details. (F) TAPIN-seq buffers.

    elife-50901-supp1.xlsx (56.5KB, xlsx)
    Supplementary file 2. Key Resource Table.

    List of key resources and reagents used in this study.

    elife-50901-supp2.docx (149.4KB, docx)
    Transparent reporting form

    Data Availability Statement

    All raw and processed transcriptome data is available from NCBI GEO (accession GSE116969). The shell scripts used to process the raw RNA-seq data, and the R and Stan programs that implement the mixture model as well as generate all figures and tables in this paper are available at github (http://github.com/fredpdavis/opticlobe). The cell type-level expression table can be explored interactively at http://www.opticlobe.com.

    All raw and processed transcriptome data is available from NCBI GEO (accession GSE116969).

    The following dataset was generated:

    Davis FP, Nern A, Picard S, Reiser MB, Rubin GM, Eddy SR, Henry GL. 2019. A genetic, genomic, and computational resource for exploring neural circuit function. NCBI Gene Expression Omnibus. GSE116969

    The following previously published datasets were used:

    Konstantinides N, Kapuralin K, Desplan C. 2018. RNA sequencing of Drosophila melanogaster optic lobe cell types. NCBI Gene Expression Omnibus. GSE103772

    Konstantinides N, Kapuralin K, Desplan C. 2018. Single-cell RNA sequencing of Drosophila melanogaster optic lobe cells. NCBI Gene Expression Omnibus. GSE103771

    Davie K, Janssens J, Koldere D, Aerts S. 2018. A single-cell transcriptome atlas of the ageing Drosophila brain. NCBI Gene Expression Omnibus. GSE107451


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd

    RESOURCES