eLife. 2019 Oct 8;8:e46711. doi: 10.7554/eLife.46711

Embryo polarity in moth flies and mosquitoes relies on distinct old genes with localized transcript isoforms

Yoseop Yoon 1, Jeff Klomp 1, Ines Martin-Martin 2, Frank Criscione 2, Eric Calvo 2, Jose Ribeiro 2, Urs Schmidt-Ott 1
Editors: Patricia J Wittkopp3, Patricia J Wittkopp4
PMCID: PMC6783274  PMID: 31591963

Abstract

Unrelated genes establish head-to-tail polarity in embryos of different fly species, raising the question of how they evolve this function. We show that in moth flies (Clogmia, Lutzomyia), a maternal transcript isoform of odd-paired (Zic) is localized in the anterior egg and adopted the role of anterior determinant without essential protein change. Additionally, Clogmia lost maternal germ plasm, which contributes to embryo polarity in fruit flies (Drosophila). In culicine (Culex, Aedes) and anopheline mosquitoes (Anopheles), embryo polarity rests on a previously unnamed zinc finger gene (cucoid), or pangolin (dTcf), respectively. These genes also localize an alternative transcript isoform at the anterior egg pole. Basal-branching crane flies (Nephrotoma) also enrich maternal pangolin transcript at the anterior egg pole, suggesting that pangolin functioned as ancestral axis determinant in flies. In conclusion, flies evolved an unexpected diversity of anterior determinants, and alternative transcript isoforms with distinct expression can adopt fundamentally distinct developmental roles.

Research organism: D. melanogaster, Other

eLife digest

With very few exceptions, animals have ‘head’ and ‘tail’ ends that develop when they are an embryo. The genes involved in specifying these ends vary between species and even closely-related animals may use different genes for the same roles. For example, the products of two unrelated genes called bicoid in fruit flies and panish in common midges accumulate at one end of their respective eggs to distinguish head from tail ends. It remained unclear how other fly species, which have neither a bicoid nor a panish gene, distinguish the head from the tail end, or how genes can evolve the specific function of bicoid and panish.

Cells express genes by producing gene templates called messenger ribonucleic acids (or mRNAs for short). The central portions of mRNAs, known as protein-coding sequences, are then used to produce the protein. Proteins can play several distinct roles, which they acquire through evolution. This can happen in different ways, for example, genetic mutations in the part of a gene that codes for protein may alter the resulting protein, giving it a new activity. Alternatively, sequences at the beginning and the end of an mRNA molecule that do not code for protein, but regulate when and where proteins are made, can influence a protein’s role by changing its environment. Many genes produce mRNAs with alternative sequences at the beginning or the end, a process known as alternative transcription.

Here, Yoon et al. identified three unrelated genes that perform similar roles to bicoid and panish in the embryos of several different moth flies and mosquitoes. These genes appear to have acquired their activity because one of their alternative transcripts accumulated at the future head end, rather than through mutations in the protein-coding sequences. Studying multiple species also made it clear that panish inherited its function from a localized alternative transcript of an old gene that duplicated and diverged.

These findings suggest that alternative transcription may provide opportunities for genes to evolve new roles in fundamental processes in flies. Most animal genes use alternative start and stop sites for transcription, but the reasons for this remain largely obscure. This is especially the case in the human brain. The findings of Yoon et al., therefore, raise the question of whether alternative transcription has played an important role in the evolution of the human brain.

Introduction

The specification of the primary axis (head-to-tail) in embryos of flies (Diptera) offers important advantages for studying how new essential gene functions evolve in early development. This process rests on lineage-specific maternal mRNAs that are localized at the anterior egg pole (‘anterior determinants’), which, surprisingly, have changed during the evolution of flies. While the anterior determinants of most flies remain unknown, they can be identified by comparing the transcriptomes of anterior and posterior egg halves (Klomp et al., 2015). Furthermore, their function can be analyzed in the syncytial early embryos of a broad range of species via microinjection, considering timing and subcellular localization. It is therefore possible to conduct phylogenetic comparisons at the functional level. Finally, when the function of anterior determinants is suppressed, embryos develop into an unambiguous, predictable phenotype: these embryos lack all anterior structures and develop as two outward facing tail ends (‘double abdomen’).

Anterior determinants can be encoded by new genes with a dedicated function in establishing embryonic polarity. One example is bicoid in the fruit fly Drosophila melanogaster. Maternal mRNA of bicoid is localized in the anterior pole of the egg and Bicoid protein is expressed in a gradient in the early embryo (Berleth et al., 1988). Bicoid-deficient embryos fail to develop anterior structures and instead form a second tail end, or a symmetrical double abdomen when the maternal activity gradient of another gene, hunchback, is disrupted simultaneously (Driever, 1993). The bicoid gene originated in the lineage of cyclorrhaphan flies more than 140 million years ago by duplication of zerknüllt (zen; aka Hox3), which, in insects, plays an important role in extraembryonic tissue development (Schmidt-Ott et al., 2010). The expression and function of cyclorrhaphan bicoid orthologs are conserved but bicoid has not been found outside this group, and has been lost in some lineages within the Cyclorrhapha.

Another example is panish, which encodes the anterior determinant of a midge, Chironomus riparius. This gene evolved by gene duplication of the Tcf homolog pangolin (pan) and capture of the maternal promoter of a nucleoside kinase gene, and has been called panish (for ‘pan-ish’) (Klomp et al., 2015). Pangolin functions as the effector of the β-catenin-dependent Wnt signaling pathway (‘canonical’ Wnt signaling), but Panish lacks the β-catenin domain of Pangolin, and sequence similarity between Pangolin and Panish is limited to the cysteine-clamp domain (30 amino acids). panish has not been found outside the family Chironomidae, suggesting that lower dipterans use different anterior determinants.

Here, we have used embryos of a wider range of dipteran species that lack bicoid and panish to address the question of how anterior determinants evolve. We started our analysis with moth flies (Psychodidae: Clogmia albipunctata, Lutzomyia longipalpis) and subsequently extended it to mosquitoes (Culicidae: Culex quinquefasciatus, Aedes aegypti, Anopheles gambiae, Anopheles coluzzii), and to crane flies (Tipulidae: Nephrotoma suturalis) (Figure 1A). Our results reveal three distinct old genes that evolved anterior determinants by localizing an alternative maternal transcript isoform at the anterior egg pole of the respective species. Therefore, alternative transcription might have played an important role in the evolution of this gene function and gene regulatory networks in fly embryos.

Figure 1. Expression of alternative Cal-opa transcripts in Clogmia embryos.

(A) Phylogenetic relationship of fly species referred to in the text (Wiegmann et al., 2011). (B) Differential expression analysis of maternal transcripts between anterior and posterior halves of 1 hr-old Clogmia embryos. logFC: log fold-change. (C) Stage-specific RNA-seq read coverage of Cal-opa locus. Transcription start sites (TSS) and transcription termination sites (TTS) are indicated on the genomic scaffold (solid line with 1000 bp intervals marked) and were confirmed by RACE. Exon-intron sketches of Cal-opaMat and Cal-opaZyg transcript variants are shown with the open reading frame in black and the position of in situ hybridization probes and dsRNA underlined in blue. (D) RNA in situ hybridization of Cal-opaMat and Cal-opaZyg transcripts in 1 hr-old preblastoderm and 7 hr-old cellular blastoderm embryos. Anterior is left and dorsal up. Scale bar: 100μm.

Figure 1—figure supplement 1. Protein alignment of predicted dipteran Odd-paired orthologs.

Odd-paired orthologs were aligned using ClustalW 2.1 (Larkin et al., 2007). Black triangles mark Cal-Opa truncations referred to in the main text (see Figure 2). Cal: Clogmia albipunctata (Psychodidae); Llo: Lutzomyia longipalpis (Psychodidae); Aga: Anopheles gambiae (Culicidae); Cri: Chironomus riparius (Chironomidae); Sca: Stomoxys calcitrans (Muscidae); Bcu: Bactrocera cucurbitae (Tephritidae); Dme: Drosophila melanogaster (Drosophilidae). Conserved zinc finger domains are indicated with colored boxes. Conserved amino acids are shown in color.

Results

An alternative maternal transcript of the conserved segmentation gene odd-paired functions as anterior determinant in Clogmia

We annotated 5602 transcripts from the anterior and posterior transcriptomes of 1 hr-old bisected Clogmia embryos and ranked them according to the magnitude of their differential expression scores and P values (Figure 1B). In the anterior embryo, the most enriched transcript was homologous to odd-paired, the Drosophila homolog of mammalian Zic (zinc finger of the cerebellum) genes. ZIC proteins are known to function as transcription factors or co-factors (Houtmeyers et al., 2013). odd-paired was discovered in a screen for early Drosophila segmentation genes and subsequently classified as a ‘pair-rule’ gene, since odd-paired mutants fail to develop alternating segments (Jürgens et al., 1984). During the Drosophila segmentation process, odd-paired is expressed in a single broad domain and controls the 'frequency-doubling' of other pair-rule genes (Clark and Akam, 2016).
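
The ranking step described above can be illustrated with a minimal sketch. It assumes hypothetical read counts for a handful of made-up transcripts and computes a simple log2 fold-change with a pseudocount. It is not the authors' pipeline, and it omits library-size normalization and the statistical testing that produces P values.

```python
import math

# Hypothetical read counts (anterior half, posterior half) per transcript.
# Names and numbers are illustrative only, not data from the study.
counts = {
    "transcript_a": (950, 40),
    "transcript_b": (300, 280),
    "transcript_c": (25, 410),
    "transcript_d": (120, 60),
}


def log_fold_change(anterior, posterior, pseudocount=1.0):
    """log2(anterior / posterior), with a pseudocount to avoid division by zero."""
    return math.log2((anterior + pseudocount) / (posterior + pseudocount))


def rank_by_anterior_enrichment(table):
    """Return (name, logFC) pairs, most anterior-enriched first."""
    scored = [(name, log_fold_change(a, p)) for name, (a, p) in table.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


ranking = rank_by_anterior_enrichment(counts)
for name, lfc in ranking:
    print(f"{name}\t{lfc:+.2f}")
```

In this toy table, the transcript at the top of the ranking plays the role that odd-paired plays in the Clogmia data, and the one at the bottom would be a candidate posteriorly enriched transcript.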

The Clogmia genome contains a single odd-paired locus (Cal-opa) (Vicoso and Bachtrog, 2015). Using RNA-seq data from preblastoderm and blastoderm embryos and Rapid Amplification of cDNA Ends (RACE), we identified maternal and zygotic Cal-opa transcripts with alternative first exons that we mapped onto a 54 kb genomic scaffold (Figure 1C). The maternal transcript (Cal-opaMat) was detected in preblastoderm embryos (0.5 hr-old) and syncytial blastoderm embryos (4 hr-old). The zygotic transcript (Cal-opaZyg) was found in cellularized blastoderm embryos (7 hr-old) and gastrulating embryos (9 hr-old). Protein alignments with homologs from other flies suggest that Cal-opaZyg encodes the full-length Cal-Opa protein (655 amino acids), while Cal-opaMat encodes a truncated protein variant (635 amino acids), lacking the N-terminal 20 amino acids of Cal-OpaZyg (Figure 1—figure supplement 1).

To confirm the alternative Cal-opaMat and Cal-opaZyg transcripts and their non-overlapping expression patterns, we performed whole mount RNA in situ hybridization experiments with transcript-specific probes. The Cal-opaMat transcript was anteriorly localized in preblastoderm embryos but absent at the cellular blastoderm stage. Conversely, the Cal-opaZyg transcript was absent in preblastoderm embryos but expressed broadly in the trunk region of 7 hr-old blastoderm embryos (Figure 1D), like odd-paired in Drosophila. These observations suggest that Cal-opa produces transcript isoforms with spatially and temporally distinct expression patterns.

To determine the function of Cal-opaMat and Cal-opaZyg, we established a protocol for microinjecting early Clogmia embryos and conducted transcript-specific RNA interference (RNAi) experiments. Injection of Cal-opaMat double-stranded RNA (dsRNA) led to mirror-image duplications of the tail end (double abdomen; Figure 2A and Figure 2—figure supplement 1). In contrast, injection of dsRNA targeting Cal-opaZyg resulted in half the number of segmental expression domains of Cal-slp (the ortholog of pair-rule gene sloppy-paired) and caused defects in segmentation, dorsal closure, and head development but did not alter embryo polarity (Figure 2B). Finally, injection of dsRNA targeting both Cal-opaMat and Cal-opaZyg resulted in double abdomens with missing segments (Figure 2—figure supplement 2). These observations indicate distinct roles of Cal-opaMat and Cal-opaZyg in specifying embryo polarity and in segmentation, respectively.

Figure 2. Function of alternative Cal-opa transcripts in early Clogmia embryo.

(A) 1st instar larval cuticle of wild type (top) and following Cal-opaMat RNAi (middle). RNA in situ hybridization of Cal-cad in a wild-type preblastoderm embryo (bottom left) and stage-matched Cal-opaMat RNAi embryos (bottom right). Anterior is left and dorsal up. T: thoracic segment; A: abdominal segment. Segment numbers in Cal-opaMat RNAi larval cuticle were assigned based on the assumption of polarity reversal. Scale bar: 100μm. (B) 1st instar larval cuticle phenotype following Cal-opaZyg RNAi (top) and RNA in situ hybridization of Cal-slp in extending wild-type germband (middle) and stage-matched Cal-opaZyg RNAi embryo (bottom). Anterior is left and dorsal up. Md: mandibular segment; Mx: maxillary segment; Lb: labial segment; T: thoracic segment; A: abdominal segment. Scale bar: 100μm. (C) RNA in situ hybridizations of Cal-otd in wild-type gastrula (top left) and stage-matched embryo following posterior Cal-opaMat mRNA injection (top right) are shown in ventral view. A live wild-type embryo (bottom left) and a stage-matched embryo following posterior Cal-opaMat mRNA injection (bottom right) in lateral view. Anterior is left. Scale bar: 100μm. (D) Posterior injection of Cal-opa mRNA and mutated variants. Complete, symmetrical duplication of the bilateral Cal-otd expression domain in gastrulating embryos was counted as double head (blue, see Figure 2C). All other phenotypes, including incomplete duplications and wild type, were conservatively counted as ‘no double head’ (black). Sketches of predicted Cal-Opa proteins are shown with ZIC/Opa conserved motif (ZOC) in yellow, the ZIC family protein N-terminal conserved domain (ZFNC) in green, and zinc finger domains in orange. The Met21Leu mutation in Cal-OpaZyg-Met21Leu is marked in red. Cal-opaMat (late): Cal-opaMat was injected during the syncytial blastoderm stage (4 hr). ns: p>0.05; ***: p<0.001; ****: p<0.0001, Fisher’s exact test. (E) RNA in situ hybridization of Cal-nos4 in a gastrulating embryo (left) and stage-matched Cal-opaMat RNAi embryos (right).

Figure 2—figure supplement 1. Frequencies of strong Cal-opaMat and Cal-opaZyg RNAi phenotypes.

(A) Cal-opaMat expression in preblastoderm embryos injected with panish dsRNA (negative control, left) or Cal-opaMat dsRNA (right). Asterisk in light blue: WT Cal-opaMat expression; asterisk in red: no Cal-opaMat expression (p<0.0001, Fisher’s exact test). RNAi experiments were performed in parallel using the same dsRNA concentration (1 µg/µl) and embryos were fixed one hour after the injection. RNA in situ hybridizations were performed simultaneously with the same probe concentration and Alkaline Phosphatase incubation time. (B) Cuticle phenotypes of 1st instar larvae. For representative examples see Figure 2A and Figure 2B. ****: p<0.0001, Fisher’s exact test. (C) Cal-cad and Cal-slp expression phenotypes in embryos. For representative examples see Figure 2A and B. Cal-cad in situ hybridizations were done on Cal-opaMat RNAi embryos that reached cellular blastoderm stage. Cal-slp in situ hybridizations were done on 14- to 18 hr-old Cal-opaZyg RNAi embryos. Embryos that had failed to reach the stage of germband extension were excluded from the analysis.

Figure 2—figure supplement 2. Cuticle phenotype of a 1st instar larva following Cal-opaMat and Cal-opaZyg double RNAi.

A double abdomen with several missing segments is shown in lateral view. More severe phenotypic defects were not found in 1st instar larvae, possibly due to embryo death prior to cuticle secretion. Scale bar: 100μm.

Figure 2—figure supplement 3. Maternal transcript localization and RNAi phenotypes of Cal-slp and Cal-mira in Clogmia.

(A) Cal-slp and Cal-mira RNA in situ hybridization in embryos at preblastoderm stage. Anterior is left. Scale bar: 100μm. (B) Cuticle phenotypes of 1st instar larvae following Cal-slp RNAi (left) and Cal-mira RNAi (right) in ventral view. Anterior is left. Scale bar: 100μm.

Figure 2—figure supplement 4. Cuticle phenotype of a 1st instar Clogmia larva following Cal-opaMat mRNA injection.

Dorsal view. Scale bar: 100μm.

Figure 2—figure supplement 5. Clogmia homologs of nanos, vasa, tudor, and germ cell-less.

(A) Protein alignment of Cal-Nos paralogs using ClustalW 2.1 (Larkin et al., 2007). Conserved amino acids are highlighted. (B) Cal-nos1, Cal-nos2, Cal-nos3, and Cal-nos4 RNA in situ hybridizations in embryos at preblastoderm, syncytial blastoderm, cellular blastoderm, and gastrulation stages. Anterior is left and dorsal up. Scale bar: 100μm. (C) Cal-vas, Cal-tud, and Cal-gcl RNA in situ hybridizations in embryos at gastrulation stage. Anterior is left and dorsal up. Scale bar: 100μm.

We noticed that maternal transcripts of Cal-slp and Cal-mira, a homolog of miranda, which encodes an adaptor protein for cell fate determinants in Drosophila (Ikeshima-Kataoka et al., 1997; Adams et al., 2000), were also slightly enriched in the anterior portion of the embryo (Figure 1B). This observation was confirmed by RNA in situ hybridizations (Figure 2—figure supplement 3). Injection of Cal-slp dsRNA resulted in head and dorsal closure defects, while Cal-mira dsRNA caused labrum and antennal defects, but in both cases embryo polarity was retained (Figure 2—figure supplement 3).

To test whether Cal-opaMat can induce head development ectopically, we injected Cal-opaMat mRNA into the posterior pole of 1 hr-old embryos. These embryos expressed a head marker, Cal-otd (ortholog of ocelliless/orthodenticle), on both ends of the embryo and developed a symmetrical double head, including some duplicated thoracic elements (Figure 2C and Figure 2—figure supplement 4 and Video 1). These observations suggest that anterior enrichment of maternal transcripts other than Cal-opaMat mRNA is not essential for head development, and that Cal-opaMat localization is sufficient for establishing embryo polarity.

Video 1. Live double head embryo in ventrolateral view.

DOI: 10.7554/eLife.46711.011

The larval cuticle of this embryo is shown in Figure 3—figure supplement 1.

The anterior determinant function of Cal-opa is sensitive to expression timing but insensitive to 5’ truncation of the open reading frame

Next, we asked whether the early timing of odd-paired expression is critical for its function as anterior determinant in moth flies. To test this hypothesis, we conducted posterior injections of Cal-opaMat mRNA during the syncytial blastoderm stage (4 hr) and examined Cal-otd expression after gastrulation. These embryos developed with normal head-to-tail polarity (Figure 2D). This result places the requirement of odd-paired for axis specification prior to the syncytial blastoderm stage and suggests that early timing of odd-paired activity is essential for its function as anterior determinant.

Cal-opaMat and Cal-opaZyg mRNAs differ not only in the timing of expression, but also in their 5’UTRs and predicted N-terminal protein sequences, as mentioned above (Figure 1C and Figure 1—figure supplement 1). To test whether the open reading frame difference is required for the anterior determinant function, we injected Cal-opaZyg mRNA at the posterior pole of preblastoderm embryos. Embryos from this experiment also developed as double heads (Figure 2D). Because the translation start site of Cal-OpaMat is located downstream of the Cal-OpaZyg translation start site, we also tested Cal-opaZyg mRNA in which the putative start codon for Cal-OpaMat was mutated to encode leucine (Cal-opaZyg-Met21Leu). Posterior injection of Cal-opaZyg-Met21Leu mRNA also resulted in double heads (Figure 2D). These findings indicate that the protein difference between Cal-OpaMat and Cal-OpaZyg is not essential for the anterior determinant function of odd-paired in Clogmia.
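
The logic of the Met21Leu construct can be sketched on a toy open reading frame. The sequence below is invented for illustration (it is not the Cal-opa sequence), and CTG is an assumed choice of leucine codon, since the text does not state which codon was used. The sketch only shows that a second in-frame ATG at codon 21 gives a product 20 residues shorter, and that replacing that codon removes the internal start.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons_from(seq, start=0):
    """Yield successive in-frame codons beginning at nucleotide index `start`."""
    for i in range(start, len(seq) - 2, 3):
        yield seq[i:i + 3]


def in_frame_start_positions(orf):
    """1-based codon positions of every in-frame ATG before the first stop codon."""
    positions = []
    for number, codon in enumerate(codons_from(orf), start=1):
        if codon in STOP_CODONS:
            break
        if codon == "ATG":
            positions.append(number)
    return positions


def mutate_codon(orf, codon_number, new_codon):
    """Return a copy of `orf` with the codon at the 1-based position replaced."""
    i = (codon_number - 1) * 3
    return orf[:i] + new_codon + orf[i + 3:]


# Toy ORF (hypothetical): Met, 19 Ala, Met at codon 21, 9 Ala, stop.
toy_orf = "ATG" + "GCT" * 19 + "ATG" + "GCT" * 9 + "TAA"

starts = in_frame_start_positions(toy_orf)
full_length = len(toy_orf) // 3 - 1            # residues, stop codon excluded
truncated_length = full_length - (starts[1] - 1)

# Assumed leucine codon CTG at position 21, mimicking a Met21Leu substitution.
mutant_orf = mutate_codon(toy_orf, 21, "CTG")
mutant_starts = in_frame_start_positions(mutant_orf)

print(starts, full_length, truncated_length, mutant_starts)
```

In the toy case the full-length and truncated products differ by 20 residues, mirroring the 655 versus 635 amino acid difference reported for Cal-OpaZyg and Cal-OpaMat.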

To test whether only a small portion of the Cal-opa open reading frame is required for its function as anterior determinant, we examined the ability of various truncated variants of Cal-Opa mRNAs (Figure 1—figure supplement 1) to induce head development at the posterior egg pole (Figure 2D). mRNAs of protein variants with large N-terminal truncation (Cal-Opa115-655 and Cal-Opa182-655) retained the ability to induce double heads. However, mRNAs of protein variants with C-terminal truncation (Cal-Opa182-445 and Cal-Opa1-337+22, a hypothetical splice variant) failed to induce double heads. These results indicate that the ability of Cal-Opa to specify embryo polarity requires the C-terminal portion of the protein but is largely insensitive to N-terminal truncation, corroborating our above conclusion that N-terminal differences between Cal-OpaMat and Cal-OpaZyg were not essential for evolving the anterior determinant function of Cal-opa.

Cal-opaMat suppresses zygotic germ cell specification at the anterior pole and Clogmia lacks maternal germ plasm

In Drosophila and other dipterans, maternal germ plasm in the posterior embryo not only specifies primordial germ cells but also contributes to and stabilizes embryo polarity via nanos, which suppresses the translation of anterior determinants in the posterior embryo (Tautz, 1988; Gavis and Lehmann, 1992; Struhl et al., 1992; Lemke and Schmidt-Ott, 2009). The activity of nanos in the posterior preblastoderm is dependent on oskar (Lehmann, 2016), which is conserved in many insects (Ewen-Campen et al., 2010). However, in Clogmia, expression profiling of anterior and posterior egg halves did not reveal any posteriorly localized maternal transcripts (Figure 1B, alpha = 0.001, unadj.), and no oskar homolog was found in our Clogmia transcriptomes or the Clogmia genome (Vicoso and Bachtrog, 2015). To test whether Clogmia lacks maternal germ plasm, we examined the expression of candidate germ cell markers, including Clogmia homologs of nanos (Cal-nos1, Cal-nos2, Cal-nos3, and Cal-nos4), vasa (Cal-vas), tudor (Cal-tud), and germ cell-less (Cal-gcl). Cal-nos1, Cal-nos3, and Cal-nos4 were not localized in the posterior of preblastoderm embryos but were expressed in a small set of cells at the posterior pole of cellular blastoderm and gastrulating embryos along with Cal-vas, Cal-tud, and Cal-gcl that were expressed more broadly (Figure 2—figure supplement 5). These observations suggest that Clogmia lacks maternal germ plasm and that Clogmia may induce the germ cell fate zygotically. To test this hypothesis, we examined Cal-nos expression in Cal-opaMat RNAi embryos. Cal-nos positive cells were duplicated in double abdomens (Figure 2E), indicating that Clogmia uses an inductive mechanism for germ cell specification, which is repressed in the anterior embryo by Cal-opaMat. Therefore, axis specification in the Clogmia embryo is independent from germ cell specification. To our knowledge, Clogmia is also the first example of inductive germ cell specification in flies.

Evolution of the anterior determinant function of moth fly odd-paired

Maternal odd-paired transcript is absent in freshly deposited eggs of chironomids (Klomp et al., 2015) and mosquitoes (Akbari et al., 2013), both of which belong to the Culicomorpha lineage (Figure 1A). To test whether localized maternal odd-paired transcript is broadly conserved in the Psychodomorpha lineage, we examined maternal transcript localization in the eggs of the sand fly Lutzomyia longipalpis, a moth fly species of public health concern due to its role in the transmission of visceral leishmaniasis. Of 5392 annotated transcripts, the most enriched maternal transcript in the anterior half of 1–2 hr old embryos was homologous to odd-paired and was therefore named Llo-opaMat (Figure 3A). In the posterior Lutzomyia embryo, the most enriched transcript was homologous to oskar, indicating that Lutzomyia eggs contain maternal germ plasm at the posterior pole, unlike the Clogmia eggs. These findings suggest that a broad range of moth flies use odd-paired transcript as anterior determinant, and that maternal germ plasm was lost only in the Clogmia lineage. Close examination of Lutzomyia transcriptomes from 1 hr-old and 24 hr-old embryos also revealed zygotic odd-paired transcript (Llo-opaZyg) (Figure 3B). Llo-opaMat and Llo-opaZyg share the same open reading frame but differ at their untranslated 5’ and 3’ ends. Since the N-terminal ends of Llo-OpaMat/Llo-OpaZyg and Cal-OpaZyg proteins are homologous (Figure 1—figure supplement 1), we infer that the N-terminal truncation of Cal-OpaMat occurred after the transcript had evolved maternal expression and anterior localization. The detection of Llo-opaZyg transcript in 24 hr-old embryos coincided with that of gap and pair-rule segmentation gene homologs (Figure 3—figure supplement 1), indicating that Llo-opaZyg functions during segmentation.

Figure 3. Alternative odd-paired transcript isoforms in Lutzomyia and ectopic head induction by mRNAs derived from Cal-opa orthologs in Clogmia.

(A) Differential expression analysis of maternal transcripts between anterior and posterior halves of 1–2 hr-old Lutzomyia embryos. logFC: log fold-change. (B) Stage-specific RNA-seq read coverage of Llo-opa genomic locus and sketches of Llo-opaMat and Llo-opaZyg transcripts (see also legend to Figure 1A). (C) Posterior injection of odd-paired mRNAs from Lutzomyia, Chironomus, and Drosophila in Clogmia embryo. Phenotypes were counted as in (Figure 2D) and double head from Dme-opa mRNA injection is shown as example. Cri-opa and Dme-opa mRNAs include Kozak sequence of Cal-opaMat (TAAG upstream of the predicted translation start site). Cri-kozak and Dme-kozak refer to odd-paired sequences with donor-specific Kozak-sequences from Chironomus (AAAA) and Drosophila (GACC), respectively. ns.: p>0.05; *: p<0.05. ***: p<0.001, Fisher’s exact test.

Figure 3—figure supplement 1. Expression level of select Lutzomyia gap gene (red) and pair-rule gene (blue) homologs.

Based on RNA-seq data from 1, 4, 6, 8, 12, and 24 hr-old embryos. TPM: transcripts per million.

The odd-paired gene of ancestral moth flies could have evolved the ability to establish embryo polarity via specific amino acid substitutions. In this case, odd-paired homologs from species with a different anterior determinant, such as Drosophila or Chironomus, should not induce ectopic head development in Clogmia embryos. Alternatively, odd-paired could have evolved its role as axis determinant in moth flies independent of any amino acid substitution via co-option. In this case, odd-paired homologs from Drosophila or Chironomus could have the ability to induce head development in Clogmia embryos when appropriately expressed. To test this possibility, we injected odd-paired mRNA from Lutzomyia, Chironomus, or Drosophila into the posterior pole of early Clogmia embryos. All of these odd-paired homologs induced double heads in Clogmia, provided that the endogenous Kozak sequence of Cal-opaMat was used for optimal translation efficiency (Figure 3C). Since neither Drosophila nor Chironomus uses odd-paired for specifying embryo polarity, these results suggest that amino acid substitutions were not essential for the evolution of the anterior determinant function of odd-paired in moth flies. We therefore propose that this gene function evolved via co-option when an alternative maternal transcript of moth fly odd-paired became enriched at the anterior egg pole.

A previously uncharacterized C2H2 zinc finger gene, named cucoid, functions as anterior determinant in culicine mosquitoes

Given that freshly deposited mosquito eggs lack maternal odd-paired transcript orthologs (Akbari et al., 2013), we extended our search for anterior determinants to mosquitoes. Initially, we focused on the Southern House Mosquito Culex quinquefasciatus, a vector of West Nile virus (the leading cause of mosquito-borne disease in the continental United States) and of Wuchereria bancrofti (the major cause of lymphatic filariasis). This species was chosen because its eggs are large and have clearly distinguishable anterior and posterior egg poles. We annotated 8239 Culex transcripts from the pooled anterior and posterior transcriptomes of 1 hr-old preblastoderm embryos and ranked them according to the magnitude of their differential expression scores and P values (Figure 4A). In the posterior embryo, the most enriched transcript was related to nanos, consistent with the presence of maternal germ plasm in this species (Figure 4D) (Juhn et al., 2008). The most enriched transcript in the anterior embryo was closely related to an uncharacterized gene of Drosophila (CG9215). However, reciprocal BLAST searches suggest that CG9215 belongs to a poorly defined larger gene family in Drosophila that might be represented by a single gene in mosquitoes. We named this gene cucoid to reflect its bicoid-like function in a culicine mosquito. cucoid encodes a protein with five C2H2 zinc finger domains (Figure 4—figure supplement 1). RACE experiments with cDNA from 0 to 7 hr-old embryos revealed three alternative cucoid transcripts with distinct 3’ ends (cucoidA, cucoidB, and cucoidC) (Figure 4B), but only cucoidB and cucoidC were recovered from cDNA of 0–2 hr-old preblastoderm embryos, suggesting that one or both of these transcripts might be maternally localized at the anterior pole.

To test this hypothesis, we performed RNA in situ hybridizations with specific probes for cucoidA (probe A) or cucoidB (probe B) and, due to the very short sequence unique to cucoidC (121 nucleotides), a probe against all three isoforms (probe C). cucoidA and cucoidB expression was detected in the fore and hind gut of extended germbands but not in 1 hr-old preblastoderm embryos. In contrast, the probe against all three isoforms detected maternally localized cucoid transcript at the anterior pole in addition to the zygotic expression pattern (Figure 4C). Taken together, these results suggest that only the cucoidC isoform is maternally localized at the anterior pole and could function as anterior determinant. To test this hypothesis, we injected cucoid dsRNA from the shared 5’ region and examined the expression of a posterior marker (Cqu-cad) in gastrulating embryos. Many of these embryos expressed Cqu-cad in the anterior and underwent ectopic gastrulation at the anterior pole, suggesting that normal head-to-tail polarity was lost (Figure 4E). Taken together, our results suggest that cucoid acquired the anterior determinant function via the localization of a maternal transcript isoform with an alternative 3’ end.

Figure 4. Expression and function of alternative cucoid isoforms in Culex.

(A) Differential expression analysis of maternal transcripts between anterior and posterior halves of 1 hr-old Culex embryos. logFC: log fold-change. (B) Sketches of cucoid transcript isoforms based on RNA-seq and RACE experiments (see also legend to Figure 1A). Blue lines mark the position of in situ hybridization probes. (C) RNA in situ hybridization of cucoidA, cucoidB, and cucoidC transcripts in 1 hr-old preblastoderm and germband extending embryos. Anterior is left and dorsal up. Scale bar: 100μm. (D) RNA in situ hybridization of Cqu-nos in 1 hr-old preblastoderm embryo. Anterior is left. Scale bar: 100μm. (E) RNA in situ hybridization of Cqu-cad in a wild-type gastrulating embryo (left) and in a stage-matched cucoid RNAi embryo (right; 16/48 with cucoid dsRNA versus 0/25 with lacZ control dsRNA; p<0.0005). Anterior is left and dorsal up. Scale bar: 100μm.
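
The phenotype counts reported in the RNAi legends (16/48 versus 0/25 for Culex, and 9/26 versus 0/22 for Aedes) can be checked with a small sketch of Fisher's exact test. The legends do not say which test or tail was used for these panels, so the one-sided form below is an assumption, and the function name is hypothetical. The sketch sums the hypergeometric probabilities of tables at least as extreme as the one observed, using only the standard library.

```python
from math import comb


def fisher_one_sided(affected_a, total_a, affected_b, total_b):
    """One-sided Fisher's exact test on a 2x2 table.

    Returns the probability, with all margins fixed, of seeing at least
    `affected_a` affected embryos in group A (assumed: the RNAi group).
    """
    n = total_a + total_b            # all embryos scored
    k = affected_a + affected_b      # all affected embryos
    denom = comb(n, k)
    p = 0.0
    for x in range(affected_a, min(k, total_a) + 1):
        # Hypergeometric probability of x affected embryos falling in group A.
        p += comb(total_a, x) * comb(total_b, k - x) / denom
    return p


# Counts taken from the figure legends; group B is the control dsRNA.
p_culex = fisher_one_sided(16, 48, 0, 25)
p_aedes = fisher_one_sided(9, 26, 0, 22)
print(f"Culex cucoid RNAi: p = {p_culex:.2e}")
print(f"Aedes cucoid RNAi: p = {p_aedes:.2e}")
```

Under this one-sided assumption both values fall below the thresholds quoted in the legends (p<0.0005 and p<0.005, respectively). A two-sided test would give somewhat larger values.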

Figure 4—figure supplement 1. Protein alignment of dipteran cucoid orthologs and a Drosophila paralog CG4424.

Proteins were aligned using ClustalW 2.1 (Larkin et al., 2007). Conserved zinc finger domains are indicated with colored boxes. Black asterisks (*) indicate the last amino acid of each protein. Nsu: Nephrotoma suturalis (Tipulidae); Cal: Clogmia albipunctata (Psychodidae); Aae: Aedes aegypti (Culicidae); Cqu: Culex quinquefasciatus (Culicidae); Aga: Anopheles gambiae (Culicidae); Cri: Chironomus riparius (Chironomidae).

We obtained similar results in another culicine mosquito, the Yellow Fever Mosquito Aedes aegypti, which transmits Dengue, Chikungunya, and Zika viruses. In this species, expression profiling of 5802 transcripts from the anterior and posterior transcriptomes of 1 hr-old preblastoderm embryos also identified cucoid (Aae-cucoid) as the gene with the most significantly enriched transcript in the anterior embryo (Figure 5A). RACE experiments with cDNA from 1 hr to 6 hr-old embryos revealed three similar Aae-cucoid transcripts with alternative 3’ ends (Aae-cucoidA, Aae-cucoidB, and Aae-cucoidC) (Figure 5B), and RNA in situ hybridization experiments with a probe against all three isoforms confirmed the anterior localization of Aae-cucoid transcript in preblastoderm embryos (Figure 5C). Aae-cucoid expression in Aedes germbands could not be examined for technical reasons.

Figure 5. Expression and function of cucoid in Aedes.

(A) Differential expression analysis of maternal transcripts between anterior and posterior halves of 1 hr-old Aedes embryos. logFC: log fold-change. (B) Sketches of Aae-cucoid transcript isoforms based on RNA-seq and RACE experiments with the position of the in situ hybridization probe and dsRNA underlined in blue (see also legend to Figure 1A). (C) RNA in situ hybridization of Aae-cucoid (left) and Aae-nos (right) transcripts in 1 hr-old preblastoderm embryos. Anterior is left and dorsal up. Scale bar: 100μm. (D) 1st instar Aedes larval cuticle of wild type (top) and following Aae-cucoid RNAi (bottom; 9/26 versus 0/22 with control dsRNA; p<0.005). Scale bar: 100μm.

In the posterior embryo, no highly enriched transcripts were observed. This was unexpected given that whole mount in situ hybridizations revealed posterior localized transcript of Aedes nanos in 1 hr-old embryos (Figure 5C) and that maternal transcript of Aedes oskar is also localized at the posterior pole (Juhn and James, 2006). Low statistical power of our differential expression analysis in Aedes might explain this discrepancy, since we could have confounded the anterior and posterior pole in some of the bisected Aedes eggs (see Materials and methods). Alternatively, only a small portion of these transcripts might be localized at the posterior pole. Injection of Aae-cucoid dsRNA against the shared region of all transcripts resulted in double abdomens (Figure 5D), suggesting that cucoid evolved its function as anterior determinant prior to the divergence of the Culex and Aedes lineages.
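The RNAi penetrance counts reported in the legends to Figure 4E (16/48 versus 0/25 in Culex) and Figure 5D (9/26 versus 0/22 in Aedes) can be checked with a small exact test. The sketch below is an assumption-laden illustration rather than the authors' procedure: those two legends do not name the test, so a one-sided Fisher's exact test (the test named in one of the Figure 6 supplement legends) is assumed here, and the function name fisher_one_sided is hypothetical.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: affected embryos after gene-specific dsRNA, b: unaffected after gene-specific dsRNA,
    c: affected embryos after control dsRNA,      d: unaffected after control dsRNA.
    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    treated = a + b
    control = c + d
    affected = a + c
    denom = comb(treated + control, affected)
    upper = min(treated, affected)
    numer = sum(
        comb(treated, x) * comb(control, affected - x) for x in range(a, upper + 1)
    )
    return numer / denom


# Counts taken from the figure legends; sidedness is an assumption.
p_culex = fisher_one_sided(16, 48 - 16, 0, 25)
p_aedes = fisher_one_sided(9, 26 - 9, 0, 22)
print(f"Culex cucoid RNAi: p = {p_culex:.2e}")
print(f"Aedes Aae-cucoid RNAi: p = {p_aedes:.2e}")
```

Under this one-sided assumption, both values fall below the thresholds quoted in the legends (p<0.0005 and p<0.005, respectively). A two-sided test would give somewhat larger values, so the comparison should be read as a plausibility check only.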

pangolin/Tcf functions as anterior determinant in anopheline mosquitoes

The Anopheles gambiae species complex constitutes an outgroup to the Culex-Aedes clade. It includes eight or more sub-Saharan species that are difficult to distinguish due to widespread genealogical heterogeneity across the genome, incomplete lineage sorting and introgression (Thawornwattana et al., 2018). We interchangeably used A. gambiae and A. coluzzii, two sibling species within this species complex that are responsible for the majority of malaria transmission in Africa, to identify the anterior determinant of this mosquito lineage. Whole mount RNA in situ hybridizations with a probe against the Anopheles gambiae ortholog of cucoid did not detect any anterior localized transcript in 1 hr-old embryos, suggesting that Anopheles uses a different anterior determinant than Culex and Aedes.

To further test this possibility, we sequenced the anterior and posterior transcriptomes of 1 hr-old preblastoderm embryos of A. gambiae and ranked 9353 transcripts according to the magnitude of their differential expression scores and P values. In the posterior embryo, the most enriched transcript was homologous to nanos. In the anterior embryo, the most enriched transcript was homologous to pangolin (also known as Tcf) (Figure 6A). To test for potential alternative maternal and zygotic isoforms of pangolin in Anopheles, we mapped the assembled transcripts and 5’ and 3’ RACE products from 1 to 6 hr-old embryos onto an available A. gambiae genome assembly (AgamP4). We identified two alternative transcript variants with non-overlapping 3’UTRs but nearly identical open reading frames that we named Aga-panMat and Aga-panZyg, respectively (Figure 6B and Figure 6—figure supplement 1). Aga-panMat was tightly localized at the anterior pole of 1–2 hr-old preblastoderm embryos and only weakly expressed in elongated germbands, whereas Aga-panZyg was expressed segmentally in elongated germbands but not in preblastoderm embryos younger than 2 hr (Figure 6C). Both pangolin isoforms were conserved in Anopheles coluzzii with sequence identity above 99%, and the maternal isoform (Aco-panMat) was localized at the anterior pole (Figure 6—figure supplement 2). The stage-specific expression of both pangolin isoforms was also conserved outside the Anopheles gambiae species complex in Anopheles stephensi (Figure 6—figure supplement 3). Alignments of dipteran Pangolin proteins suggest that, in Anopheles, the maternal variant includes an additional seven amino acids at the C-terminal end due to alternative polyadenylation and splicing (Figure 6—figure supplement 1).
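The ranking step used to nominate candidate determinants can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the record layout, the sign convention (positive logFC meaning anterior enrichment), the tie-breaking rule, and above all the numeric values, which are invented placeholders and not data from this study.

```python
from typing import NamedTuple


class Transcript(NamedTuple):
    name: str
    log_fc: float   # assumed convention: positive = enriched in the anterior half
    p_value: float


def rank_enriched(transcripts, anterior=True):
    """Return transcripts enriched in one half, most significant first.

    Ties in P value are broken by the magnitude of the log fold-change.
    """
    sign = 1 if anterior else -1
    enriched = [t for t in transcripts if sign * t.log_fc > 0]
    return sorted(enriched, key=lambda t: (t.p_value, -abs(t.log_fc)))


# Invented placeholder values for illustration only; NOT data from the study.
toy = [
    Transcript("transcript_1", 4.2, 1e-8),
    Transcript("transcript_2", -3.9, 1e-7),
    Transcript("transcript_3", 0.3, 0.4),
    Transcript("transcript_4", 1.1, 1e-3),
]

top_anterior = rank_enriched(toy, anterior=True)[0].name
top_posterior = rank_enriched(toy, anterior=False)[0].name
print(top_anterior, top_posterior)
```

In the study itself, the top-ranked anterior and posterior transcripts were then examined for homology (pangolin and nanos in A. gambiae) and confirmed by RNA in situ hybridization; the ranking alone only nominates candidates.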

Figure 6. Expression and function of alternative pangolin isoforms in Anopheles.

(A) Differential expression analysis of maternal transcripts between anterior and posterior halves of 1 hr-old Anopheles embryos. logFC: log fold-change. (B) Sketches of Aga-panMat and Aga-panZyg transcripts based on RNA-seq and RACE experiments (see also legend to Figure 1A). Blue lines indicate the positions of in situ hybridization probes and dsRNA. (C) RNA in situ hybridization of Aga-panMat and Aga-panZyg transcripts in 1 hr-old preblastoderm and germband extending embryos of A. gambiae. Anterior is left and dorsal up. Scale bar: 100μm. (D) 1st instar A. coluzzii larval cuticle of wild type (top). Cuticle phenotypes of double abdomen (middle; 3/37) and intermediate, anterior truncation phenotype (bottom; 8/37) following Aco-panMat RNAi. Scale bar: 100μm. (E) Differential expression analysis of maternal transcripts between anterior and posterior halves of 1 hr-old Nephrotoma (Tipulidae) embryos. logFC: log fold-change. (F) RNA in situ hybridization of Nsu-panMat.

Figure 6—figure supplement 1. Protein alignment of dipteran Pangolin orthologs and Panish.

Proteins were aligned using ClustalW 2.1 (Larkin et al., 2007). Conserved domains are indicated with colored boxes. Dme: Drosophila melanogaster (Drosophilidae); Nsu: Nephrotoma suturalis (Tipulidae); Cal: Clogmia albipunctata (Psychodidae); Llo: Lutzomyia longipalpis (Psychodidae); Aae: Aedes aegypti (Culicidae); Cqu: Culex quinquefasciatus (Culicidae); Aga: Anopheles gambiae (Culicidae); Ast: Anopheles stephensi (Culicidae); Cri: Chironomus riparius (Chironomidae). Conserved amino acids are shown in color.
Figure 6—figure supplement 2. RNA in situ hybridization of Aco-panMat and Aco-panZyg transcripts in 1 hr-old A. coluzzii preblastoderm embryos.

Anterior is left and dorsal up. Scale bar: 100μm.
Figure 6—figure supplement 3. Stage-specific RNA-seq read coverage of Ast-pan genomic locus and sketches of Ast-panMat and Ast-panZyg transcripts.

Black box, open reading frame; white box, untranslated regions. See also legend to Figure 1A.
Figure 6—figure supplement 4. Cuticle phenotype of an A. coluzzii 1st instar larva following Aco-panZyg RNAi (11/23).

The phenotype difference between Aco-panMat RNAi and Aco-panZyg RNAi is significant (p<0.05, Fisher’s exact test). See Figure 6D for a wild-type cuticle. Scale bar: 100μm.
Figure 6—figure supplement 5. panish RNAi rescue experiments with Cri-pan mRNA.

(A) Sketches of Cri-Pan and Panish (HMG: HMG-box domain; C: Cysteine-clamp domain) and protein alignment of conserved C-clamp domains between Panish and Pan orthologs. The Panish-specific sequence change (RR->CQ) is indicated by an arrow. Pnu: Polypedilum nubifer; Pva: Polypedilum vanderplanki; Cte: Chironomus tentans; Cpi: Chironomus piger; Cri: Chironomus riparius; Aga: Anopheles gambiae; Dme: Drosophila melanogaster. (B) panish RNAi rescue experiments using panish mRNA, panish FS (frameshift mutation) mRNA, Cri-pan mRNA, and Cri-pantrunc mRNA. Cri-pantrunc mRNA encodes a Cri-Pan protein that lacks the N-terminal sequences preceding the C-clamp domain (N-terminal 365 aa deleted, including the HMG-box domain) and carries the Panish-specific sequence change (RR->CQ) in the C-clamp domain, as in (A).

To investigate the function of localized maternal pangolin expression in Anopheles, we specifically targeted this isoform in Anopheles coluzzii. Injection of Aco-panMat-specific dsRNA into several hundred 1 hr-old A. coluzzii embryos resulted in only 37 cuticles with variable phenotypes, including anterior truncations and, in extreme cases, double abdomens (Figure 6D). We noticed perturbed segmentation boundaries in the double abdomens, suggesting that Aga-panMat may also function in segmentation, as suggested by its weak zygotic expression pattern. Injection of Aco-panZyg-specific dsRNA into 1 hr-old A. coluzzii embryos resulted in severe segmentation defects that were difficult to characterize, but no double abdomens or anterior-specific truncation defects were found (Figure 6—figure supplement 4). Taken together with the isoform-specific transcript localization data presented above, these RNAi results support the hypothesis that pangolin acquired the anterior determinant function via the localization of a maternal transcript isoform with an alternative 3’ end.

Localization of maternal pangolin transcript in crane flies suggests that pangolin functioned as anterior determinant in ancestral flies

Anterior-localized maternal pangolin (Tc-pan) transcript has also been observed in the eggs of a beetle (Tribolium castaneum) (Bucher et al., 2005), but the function of this transcript remains unknown. Previous Tc-pan RNAi experiments targeted both maternal and zygotic transcripts and only revealed a function in posterior development, due to the role of zygotic Tc-pan in canonical Wnt signaling in the posterior growth zone (Bolognesi et al., 2008; Fu et al., 2012; Prühs et al., 2017; Ansari et al., 2018). To test whether ancestral dipterans localized maternal pangolin transcript at the anterior pole of the egg, we established a culture of the crane fly Nephrotoma suturalis (Tipulidae), which belongs to the Tipulomorpha, one of the oldest branches of dipterans (Grimaldi and Engel, 2005; Wiegmann et al., 2011). We sequenced the anterior and posterior transcriptomes of freshly deposited Nephrotoma egg halves and ranked 5371 transcripts according to the magnitude of their differential expression scores and P values (Figure 6E). The most enriched transcript in the posterior embryo was related to oskar, suggesting that crane fly eggs contain maternal germ plasm at the posterior pole. The most enriched transcript in the anterior embryo was homologous to pangolin and therefore named Nsu-pan. The anterior localization of this transcript was confirmed by RNA in situ hybridization (Figure 6F). RACE experiments with cDNA from 1 hr-old embryos identified multiple isoforms with slightly variable 5’ ends but the same open reading frame. An alignment of the predicted Nsu-Pan protein from this open reading frame with other dipteran Pangolin homologs revealed conserved N-terminal and C-terminal ends in Nsu-Pan. Taken together with our Anopheles data, our results in Nephrotoma suggest that ancestral dipteran insects localized maternal pangolin transcript in the anterior egg pole, where this transcript may have functioned as anterior determinant.

Pangolin cannot substitute for Panish in Chironomus

In the midge Chironomus, the ortholog of pangolin (Cri-pan) is not expressed maternally but its diverged paralog panish functions maternally as anterior determinant (Klomp et al., 2015). Given that panish evolved from pangolin via gene duplication, panish probably inherited its role from pangolin. Therefore, it is possible that Cri-pan and panish are still functionally equivalent when expressed at the anterior pole of preblastoderm Chironomus embryos. Alternatively, Panish may have co-evolved with the targets required for anterior patterning, such that Pangolin can no longer interact with those targets. In this case, Cri-pan should no longer be able to fulfill the function of panish. To distinguish between these possibilities, we examined the ability of panish and Cri-pan mRNAs to rescue the RNAi phenotype of panish. We have previously shown that dsRNA of the panish 3’UTR can induce the double abdomen phenotype with a penetrance of nearly 100%, and that this phenotype can be rescued in roughly half of the embryos by injecting panish mRNA with heterologous UTRs at the anterior pole, shortly after the injection of dsRNA (Klomp et al., 2015). We used this assay to compare the functions of panish mRNA (positive control), frameshifted panish mRNA (negative control), Cri-pan mRNA, and a modified Cri-pan mRNA designed to better resemble panish mRNA (Cri-pantrunc mRNA). Cri-pantrunc mRNA encodes an N-terminally truncated Cri-Pan variant, lacking the β-Catenin binding and HMG box domains, with two mutations in the cysteine-clamp domain to mimic conserved changes of the Panish cysteine-clamp (Figure 6—figure supplement 5A). Only panish mRNA rescued panish RNAi embryos (Figure 6—figure supplement 5B), suggesting that panish co-evolved with its targets and functionally diverged after its origin via gene duplication from pangolin.

Discussion

Role of alternative transcription in the evolution of embryonic axis determinants from old genes

In this study, we have identified three unrelated old genes that encode the anterior determinant in moth flies, culicine mosquitoes, and anopheline mosquitoes, respectively (Figure 7). All three genes not only localize their maternal transcript at the anterior egg pole; they also are subject to alternative transcription, which allows a single gene to generate multiple transcript isoforms with distinct 5’ and 3’ ends through the use of alternative promoters (alternative transcription initiation) and polyadenylation signals (alternative transcription termination).

Figure 7. Anterior axis determinants in Diptera.

The phylogenetic tree of dipteran families is based on published data and shows Cyclorrhapha in green (Wiegmann et al., 2011). Mya, million years ago.

Figure 7—figure supplement 1. Occurrence of panish in Chironomidae genomes.

Orthologs of panish in Chironomus piger and C. tentans have been reported previously (Klomp et al., 2015). We examined panish in recently published genome sequences of Polypedilum vanderplanki (Gusev et al., 2014), P. nubifer (Gusev et al., 2014), Clunio marinus (Kaiser et al., 2016), Belgica antarctica (Kelley et al., 2014), and Parochlus steinenii (Kim et al., 2017). The phylogeny is based on Cranston et al. (2012).
Figure 7—figure supplement 2. Maternal and zygotic transcript variant of Clogmia zerknüllt (Cal-zen).

Stage-specific RNA-seq read coverage of Cal-zen genomic locus above schematic sketches of Cal-zenMat and Cal-zenZyg transcript variants with open reading frame (black) and recovered UTR sequences. The RNA-seq reads were obtained from preblastoderm (0.5 hr-old), syncytial blastoderm (4 hr-old), cellular blastoderm (7 hr-old), and gastrulation stages (9 hr-old). Maternal zerknüllt expression is common in lower Diptera but was probably lost in the stem lineage of Cyclorrhapha (Stauber et al., 2002). It is possible that zerknüllt evolved a localization signal in this stem lineage, because in Megaselia abdita (Phoridae), a cyclorrhaphan fly with maternal bicoid expression and zygotic zerknüllt expression, the maternal bicoid transcript is localized at the anterior pole of the embryo while the zygotic zerknüllt transcript is localized on the apical side of blastoderm cells (Stauber et al., 1999), presumably through the same microtubule-dependent machinery (Bullock and Ish-Horowicz, 2001).

In moth flies, the localized maternal odd-paired transcript that functions as anterior determinant has an alternative first exon compared to the canonical isoform (Figure 1B–D, Figure 2, and Figure 3A–B). In the case of mosquitoes, maternal transcript isoforms of cucoid (in culicine mosquitoes) or pangolin (in anopheline mosquitoes) with alternative last exons are localized at the anterior pole of the egg and function as anterior determinant (Figure 4, Figure 5, and Figure 6). Since anterior determinants are localized in the anterior egg, and signals for the subcellular localization of transcripts are typically found in UTRs (Holt and Bullock, 2009), it is possible that alternative transcription facilitates the evolution of anterior determinants by providing the UTR sequence for isoform-specific localization signals that do not interfere with other gene functions. For example, it has been shown that alternative last exons of transcript isoforms confer isoform-specific localization in neurons (Taliaferro et al., 2016; Ciolli Mattioli et al., 2019). Additional experiments will be needed to test whether the unique UTR sequences of anterior determinants are essential for their localization at the anterior egg pole.

In addition to changes in UTR sequences, alternative transcription can also result in the truncation or elongation of the open reading frame. For example, the anterior determinant of Clogmia (Cal-OpaMat) lacks the N-terminal 20 amino acids (Figure 1—figure supplement 1), and the anterior determinant of Anopheles (Aga-panMat) encodes a protein that includes seven additional amino acids at the C-terminal end (Figure 6—figure supplement 1). However, these changes to the protein may not have been important for adopting a function as anterior determinant. The truncation in the maternal Odd-paired protein that we observed in Clogmia is not conserved in Lutzomyia, in which Llo-opaMat and Llo-opaZyg encode the same protein, and full-length Odd-paired homologs from these and other species can function as anterior determinant in Clogmia (Figure 3B–C). Also, the elongation of Aga-PanMat protein is not conserved in Nephrotoma, in which the localized Nsu-panMat transcript encodes a Pangolin protein with a conserved C-terminal end (Figure 6—figure supplement 1). Therefore, modifications in the open reading frame of these genes may reflect secondary changes.

Evolution of new genes that encode embryonic axis determinants

Unlike the anterior determinants identified in this study, the previously described anterior determinants of Drosophila and Chironomus are encoded by newly evolved genes, bicoid and panish. These genes seem to be dispensable outside the context of axis specification (Driever, 1993; Klomp et al., 2015), suggesting that they evolved specifically for this function. They could have acquired their function de novo via protein evolution or via inheritance from the progenitor gene. Our findings suggest that the role of pangolin in axis specification was already present in ancestral dipterans (Figure 7). We therefore propose that panish, which evolved from pangolin via gene duplication in the Chironominae lineage (Figure 7—figure supplement 1), inherited its function from pangolin. Future examinations of pangolin isoforms and their expression in the eggs of chironomids that lack panish orthologs (species representing basal chironomid lineages) could reveal intermediate steps in this process, such as a localized truncated pangolin isoform.

Similarly, bicoid could have acquired its function de novo via protein evolution or via inheritance from its progenitor gene, zerknüllt. Several previous studies have hypothesized that Bicoid replaced Orthodenticle, a conserved homeodomain protein with similar DNA-binding affinity that functions in animal head development (Wimmer et al., 2000; Schröder, 2003; Lynch et al., 2006; Datta et al., 2018). Ancestrally reconstructed homeodomains confirmed that a single amino acid change in the homeodomain of Bicoid (Q50K), which is shared by Bicoid and Orthodenticle, caused a dramatic shift of Bicoid’s DNA-binding affinity in vitro and target recognition in vivo (Liu et al., 2018).

We cannot rule out that orthodenticle functioned as anterior determinant in ancestral brachyceran flies. However, in analogy with our findings, it is also possible that a maternal zerknüllt isoform became localized at the anterior pole of the egg and acquired the role of anterior determinant via co-option, prior to the origin of bicoid via gene duplication (Figure 7—figure supplement 2). Maternal zerknüllt expression is common in lower Diptera but was lost in Cyclorrhapha (Stauber et al., 2002). If bicoid inherited its function from zerknüllt, the Q50K mutation in the homeodomain of Bicoid must have been a secondary, potentially maladaptive change. In this case, it may have been fixed in the cyclorrhaphan stem lineage via a compensatory or balancing mechanism and would have driven co-evolution of its targets.

It may be objected that, in Drosophila, reverting the K50 residue of the Bicoid homeodomain to Q50 is lethal and results in a bicoid null phenotype (Liu et al., 2018). However, how the ancestral gene network responded to the Q50K mutation of Bicoid cannot be inferred from observations in Drosophila. Moreover, published biochemical data suggest that the Q50K mutation increases interaction with the consensus Bicoid binding DNA motif much more strongly than it reduces interaction with the consensus Zerknüllt binding DNA motif. It is therefore conceivable that the Q50K mutation had a less dramatic effect in ancestral flies, in which the target genes of the anterior determinant were activated via Zerknüllt binding sites, than in Drosophila, in which the target genes of the anterior determinant are activated via Bicoid binding sites. Examination of the mechanisms that determine embryo polarity in non-cyclorrhaphan Brachycera flies might help to test this hypothesis. If panish and bicoid inherited their functions, their evolutionary origin and divergence could have served the purpose of reducing the pleiotropy of their progenitor genes, pangolin and zerknüllt, rather than allowing them to take on an entirely new function in development.

Role of alternative transcription in the evolution

Recent genome-wide analyses have shown that alternative transcription is a widespread phenomenon. For example, there are on average four alternative transcription start sites per gene in humans (Forrest et al., 2014) and at least 50–70% of mammalian genes are subject to alternative polyadenylation (Shepard et al., 2011; Adams et al., 2000; Derti et al., 2012). Alternative transcript isoforms can be tightly regulated in a cell or tissue specific manner and can affect transcription and translation efficiency as well as splicing (An et al., 2008; Davuluri et al., 2008; Lau et al., 2010; Pinto et al., 2011; Smibert et al., 2012; Anvar et al., 2018; Tushev et al., 2018; Taliaferro et al., 2016; Ciolli Mattioli et al., 2019; Shabalina et al., 2010). Functional studies in model organisms have shown that alternative transcription can generate dominant negative and alternatively localized protein isoforms (Davuluri et al., 2008; Bharti et al., 2008; Vacik et al., 2011; Berkovits and Mayr, 2015; Schrankel et al., 2016), while misregulation of alternative transcript isoforms has been associated with human diseases including cancer (Mayr and Bartel, 2009; Wiesner et al., 2015; Pal et al., 2012; Shapiro et al., 2011). However, the contribution of alternative promoters (alternative transcription initiation) and polyadenylation signals (alternative transcription termination) to the evolution of new gene functions and regulatory networks remains poorly understood (Carroll et al., 2005; Peter and Davidson, 2011; Wittkopp and Kalay, 2011; de Klerk and 't Hoen, 2015).

The results of several previous studies suggest that alternative transcription may underlie the evolutionary diversification of gene functions. For example, a large fraction of alternative promoter sequences is conserved between humans and mice, but those with cell or tissue restricted expression have frequently changed during mammalian evolution (Baek et al., 2007; Forrest et al., 2014), suggesting that alternative promoters may have played a significant role in cell type evolution. Other likely substrates for evolutionary diversification are the protein terminal ends generated by alternative transcription in conjunction with alternative splicing. These terminal ends commonly contain intrinsically disordered regions, which are enriched in sites that mediate protein-protein interactions (Buljan et al., 2013; Shabalina et al., 2014). A recent case study found that a light fur color variant of beach mice evolved repeatedly via selection for an alternative agouti isoform with increased translation efficiency (Linnen et al., 2013; Mallarino et al., 2017).

Our in vivo study revealed three old genes that evolved the anterior determinant function by localizing an alternative transcript isoform at the anterior pole of the egg. Therefore, we propose that differential expression of alternative transcript isoforms can result in the evolution of new gene functions, independent of, and prior to, gene duplication and sub-functionalization. Given that alternative transcription is a widespread phenomenon, it could play an important role in the evolution of gene regulatory networks.

Materials and methods

Key resources table.

| Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
| Strain, strain background (Nephrotoma suturalis) | Nsu | https://doi.org/10.1083/jcb.25.1.95 | NA | |
| Strain, strain background (Chironomus riparius) | Cri | https://doi.org/10.1126/science.aaa7105; https://doi.org/10.1111/mec.1411 | CRIP_Laufer | |
| Strain, strain background (Clogmia albipunctata) | Cal | https://doi.org/10.1007/s004270050238 | NA | |
| Strain, strain background (Culex quinquefasciatus) | Cqu | NIAID | NA | |
| Strain, strain background (Anopheles gambiae) | A. gambiae | NIAID | G-3 strain | |
| Strain, strain background (Anopheles coluzzii) | A. coluzzii | Insect transformation facility, University of Maryland | NA | |
| Strain, strain background (Aedes aegypti) | Aae | NIAID | Liverpool ‘Black eye’ | |
| Gene (Clogmia albipunctata) | Cal-opaMat | This paper | MN122104 | See Materials and methods - Cloning procedures and mRNA/dsRNA synthesis; RNA in situ hybridization; Rapid Amplification of cDNA Ends (RACE) |
| Gene (Clogmia albipunctata) | Cal-opaZyg | This paper | MN122105 | See Materials and methods - Cloning procedures and mRNA/dsRNA synthesis; RNA in situ hybridization; Rapid Amplification of cDNA Ends (RACE) |
| Gene (Clogmia albipunctata) | Cal-slp | This paper | MN122106 | See Materials and methods - Cloning procedures and mRNA/dsRNA synthesis; RNA in situ hybridization |
| Gene (Clogmia albipunctata) | Cal-mira | This paper | MN122107 | See Materials and methods - Cloning procedures and mRNA/dsRNA synthesis; RNA in situ hybridization |
| Gene (Clogmia albipunctata) | Cal-otd | This paper | MN122108 | See Materials and methods - RNA in situ hybridization |
| Gene (Clogmia albipunctata) | Cal-nos1 | This paper | MN122109 | See Materials and methods - RNA in situ hybridization |
| Gene (Clogmia albipunctata) | Cal-nos2 | This paper | MN122110 | See Materials and methods - RNA in situ hybridization |
| Gene (Clogmia albipunctata) | Cal-nos3 | This paper | MN122111 | See Materials and methods - RNA in situ hybridization |
| Gene (Clogmia albipunctata) | Cal-nos4 | This paper | MN122112 | See Materials and methods - RNA in situ hybridization |
| Gene (Clogmia albipunctata) | Cal-vas | This paper | MN122113 | See Materials and methods - RNA in situ hybridization |
| Gene (Clogmia albipunctata) | Cal-tud | This paper | MN122114 | See Materials and methods - RNA in situ hybridization |
| Gene (Clogmia albipunctata) | Cal-gcl | This paper | MN122115 | See Materials and methods - RNA in situ hybridization |
| Gene (Lutzomyia longipalpis) | Llo-opaMat | This paper | MN122116 | See Materials and methods - Rapid Amplification of cDNA Ends (RACE) |
| Gene (Lutzomyia longipalpis) | Llo-opaZyg | This paper | MN122117 | See Materials and methods - Rapid Amplification of cDNA Ends (RACE) |
| Gene (Lutzomyia longipalpis) | Llo-osk | This paper | MN122118 | See Materials and methods - RNA in situ hybridization |
| Gene (Chironomus riparius) | Cri-opa | This paper | MN122119 | See Materials and methods - RNA in situ hybridization |
| Gene (Culex quinquefasciatus) | cucoidA | This paper | MN122120 | See Materials and methods - RNA in situ hybridization |
| Gene (Culex quinquefasciatus) | cucoidB | This paper | MN122121 | See Materials and methods - RNA in situ hybridization |
| Gene (Culex quinquefasciatus) | cucoidC | This paper | MN122122 | See Materials and methods - Cloning procedures and mRNA/dsRNA synthesis; RNA in situ hybridization |
| Gene (Culex quinquefasciatus) | Cqu-nos | This paper | MN122123 | See Materials and methods - RNA in situ hybridization |
| Gene (Aedes aegypti) | Aae-cucoidA | This paper | MN122124 | See Materials and methods - Rapid Amplification of cDNA Ends (RACE) |
| Gene (Aedes aegypti) | Aae-cucoidB | This paper | MN122125 | See Materials and methods - Rapid Amplification of cDNA Ends (RACE) |
| Gene (Aedes aegypti) | Aae-cucoidC | vectorbase.org | AAEL013321 | See Materials and methods - Rapid Amplification of cDNA Ends (RACE) |
| Gene (Aedes aegypti) | Aae-nos | vectorbase.org | AAEL012107 | See Materials and methods - RNA in situ hybridization |
| Gene (Anopheles gambiae) | Aga-panMat | This paper | MN122126 | See Materials and methods - Cloning procedures and mRNA/dsRNA synthesis; RNA in situ hybridization; Rapid Amplification of cDNA Ends (RACE) |
| Gene (Anopheles gambiae) | Aga-panZyg | This paper | MN122127 | See Materials and methods - Cloning procedures and mRNA/dsRNA synthesis; RNA in situ hybridization; Rapid Amplification of cDNA Ends (RACE) |
| Gene (Anopheles gambiae) | Aga-cad | This paper | MN122128 | See Materials and methods - RNA in situ hybridization |
| Gene (Nephrotoma suturalis) | Nsu-pan | This paper | MN122129 | See Materials and methods - RNA in situ hybridization |
| Sequence-based reagent | dsRNAs | This paper | | See Materials and methods - Cloning procedures and mRNA/dsRNA synthesis |
| Sequence-based reagent | RNA probes | This paper | | See Materials and methods - RNA in situ hybridization |
| Commercial assay or kit | SMARTer RACE 5'/3' Kit | Takara | 634858 | |
| Commercial assay or kit | mMESSAGE mMACHINE SP6 | Thermo Fisher | AM1340 | |
| Commercial assay or kit | QuikChange Lightning Site-Directed Mutagenesis Kit | Agilent | 210515 | |
| Software, algorithm | Geneious | https://www.geneious.com | RRID:SCR_010519 | version 11.1.5 |
| Software, algorithm | GraphPad Prism | https://graphpad.com | RRID:SCR_015807 | version 1.4 |

Cloning procedures and mRNA/dsRNA synthesis

Coding sequences from Clogmia, Lutzomyia, Chironomus, Anopheles, Nephrotoma, Culex, and Aedes were amplified from embryonic cDNA with primers designed from RNA-seq data. The coding sequence of odd-paired was amplified from cDNA (FI01113) that was obtained from the BDGP Gold collection of the Drosophila Genomics Resource Center. Amplified cDNA was cloned into the expression vector pSP35T (Amaya et al., 1991) using the In-Fusion HD Cloning Kit (Clontech), and PstI- or EcoRI-linearized vector was used for mRNA synthesis using mMESSAGE mMACHINE SP6. panish, Cri-pan, and Cri-pantrunc mRNAs were synthesized from a PCR template (containing a T7 polymerase binding site) using mMESSAGE mMACHINE T7ultra. Mutations in the open reading frame (Cal-opaZyg-Met21Leu, panish FS, Cri-pantrunc) and in the Kozak sequences of Dme-opa and Cri-opa were generated using the QuikChange Lightning Site-Directed Mutagenesis Kit (Agilent). Double-stranded RNA (dsRNA) was generated from PCR-amplified templates using embryonic cDNA and primers containing T7 polymerase binding sites as described (Klomp et al., 2015).
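The dsRNA primers listed in this section all begin with the same 5’ tail carrying the T7 polymerase binding site, followed by the gene-specific sequence. As a minimal sketch (not the authors' procedure), the shared tail can be recovered as the longest common prefix of the primers; the four sequences below are the Cal-opaMat and Aga-panMat dsRNA primers copied from the primer list, while the variable names and the use of os.path.commonprefix are illustrative choices. T7_CORE is the widely used core T7 promoter sequence, included here only as a sanity check.

```python
import os

# Core T7 promoter sequence, used only to confirm that the shared tail contains it.
T7_CORE = "TAATACGACTCACTATAG"

# dsRNA primers copied from the primer list in this section.
primers = {
    "Cal-opaMat_fwd": "CAGAGATGCATAATACGACTCACTATAGGGAGAAAACAATTGTGAAGTGCGACA",
    "Cal-opaMat_rev": "CAGAGATGCATAATACGACTCACTATAGGGAGACAAATTTCCAAACGATGACAGA",
    "Aga-panMat_fwd": "CAGAGATGCATAATACGACTCACTATAGGGAGACACACAGGGCACAATAATCG",
    "Aga-panMat_rev": "CAGAGATGCATAATACGACTCACTATAGGGAGAGACTGCATGTCCGTCGTCTA",
}

# The longest common prefix approximates the shared 5' tail. It could overshoot
# by a base or more if all gene-specific parts happened to start with the same
# nucleotides, so it should be checked against the intended tail.
tail = os.path.commonprefix(list(primers.values()))

# Everything after the shared tail is treated as the gene-specific portion.
gene_specific = {name: seq[len(tail):] for name, seq in primers.items()}

print("shared tail:", tail, f"({len(tail)} nt)")
for name, seq in gene_specific.items():
    print(name, seq)
```

Splitting the primers this way recovers the distinction between the common tail and the gene-specific sequence in plain text, where typographic marking of the gene-specific part is not available.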

Forward and reverse primer sequences for generating templates for mRNA synthesis were:

  • Cal-opaMat

  • 5’-TAAGATGAGTCCGAATCACTTACTGGCC

  • 5’-TTAATAGGCCGTCGCTGCACC

  • Cal-opaZyg

  • 5’-CAACATGATGATGAACGCTTTTATGGAA

  • 5’-TTAATAGGCCGTCGCTGCACC

  • Cal-opa115-655

  • 5’-ATGCTCTTCTCAAATCACTCTTCAGC

  • 5’-TTAATAGGCCGTCGCTGCACC

  • Cal-opa182-655

  • 5’-ATGAACCCGGGAACCTTGGG

  • 5’-TTAATAGGCCGTCGCTGCACC

  • Cal-opa182-445

  • 5’-ATGAACCCGGGAACCTTGGG

  • 5’-TTACGCGGGATTCAGCTGACTATG

  • Cal-opa1-337+22

  • 5’-CAACATGATGATGAACGCTTTTATGGAA

  • 5’-TCACAAAATTTCACTGAATTCCGTCAAAATATCACTAGA

  • Llo-opa

  • 5’-AAAGATGATGATGAATGCATTTATGGACACAG

  • 5’-TCAGTACGCCGTGGCGGCG

  • Dme-opaDme kozak

  • 5’-GACCATGATGATGAACGCCTTCA

  • 5’-GTCAATACGCCGTCGCTGCGCCGGG

  • Cri-opaCri kozak

  • 5’-AAAAATGATGATGAATGGTTTTATGGACACA

  • 5’-TCAATAAGCTGTCGTTGGACCGTGAT

  • Cri-pantrunc

  • 5’- AAAAATGTATCCAGATTGGAGCTCGC

  • 5’- TTACGTCACACTAATAGCATTTCCATCATCCC

Forward and reverse primer sequences for dsRNA (lengths of dsRNAs in brackets; gene specific sequence of primers underlined):

  • Cal-opaMat (222 bp):

  • 5’-CAGAGATGCATAATACGACTCACTATAGGGAGAAAACAATTGTGAAGTGCGACA

  • 5’-CAGAGATGCATAATACGACTCACTATAGGGAGACAAATTTCCAAACGATGACAGA

  • Cal-opaZyg (315 bp):

  • 5’- CAGAGATGCATAATACGACTCACTATAGGGAGAACTACCGCCGCGAACACACG

  • 5’-CAGAGATGCATAATACGACTCACTATAGGGAGAGTCCAGTCGATTCCATAAAAGC

  • Cal-slp (927 bp):

  • 5’- CAGAGATGCATAATACGACTCACTATAGGGAGATCGATCAGCTCCCTTTTGCC

  • 5’-CAGAGATGCATAATACGACTCACTATAGGGAGATGAGATCGTTCCCGTTGGAC

  • Cal-mira (993 bp):

  • 5’- CAGAGATGCATAATACGACTCACTATAGGGAGAACAGCAAAAAGGAAGCGAAA

  • 5’-CAGAGATGCATAATACGACTCACTATAGGGAGAGGGATTCAATTTGCCTTTGA

  • Aga-panMat (976 bp):

  • 5’- CAGAGATGCATAATACGACTCACTATAGGGAGACACACAGGGCACAATAATCG

  • 5’- CAGAGATGCATAATACGACTCACTATAGGGAGAGACTGCATGTCCGTCGTCTA

  • Aga-panZyg (843 bp):

  • 5’- CAGAGATGCATAATACGACTCACTATAGGGAGAACATCACACACCCCACACAC

  • 5’- CAGAGATGCATAATACGACTCACTATAGGGAGATTGGTCCGTTCGTGATTGTA

  • cucoid (926 bp):

  • 5’ - CAGAGATGCATAATACGACTCACTATAGGGAGACGAGGATGTTGCTGGAGAAT

  • 5’- CAGAGATGCATAATACGACTCACTATAGGGAGAACTCCCGAAATCGGAAAACT

  • Aae-cucoid (857 bp):

  • 5’-CAGAGATGCATAATACGACTCACTATAGGGAGACGACAAGCCCTACAAATGCT

  • 5’-CAGAGATGCATAATACGACTCACTATAGGGAGATGATCTGGATGTTGCCGTAG
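The underlining that marked the gene-specific part of each dsRNA primer is not preserved in this text. All dsRNA primers listed above share the same 33-nucleotide 5’ tail, which contains the T7 promoter core. The following minimal sketch recovers the gene-specific part by removing that tail. It assumes that the shared prefix is the complete tail; the function name `split_t7_primer` is hypothetical and not part of the published methods.

```python
from typing import Tuple

# Shared 5' tail, inferred from the common prefix of the dsRNA primers listed
# above (assumption: everything after this prefix is gene-specific sequence).
T7_TAIL = "CAGAGATGCATAATACGACTCACTATAGGGAGA"
T7_CORE = "TAATACGACTCACTATAGGG"  # T7 promoter core contained in the tail


def split_t7_primer(primer: str) -> Tuple[str, str]:
    """Return (tail, gene_specific) for a T7-tailed primer.

    Whitespace is removed first, because some primers in the list carry a
    space after the 5' mark.
    """
    seq = "".join(primer.split()).upper()
    if not seq.startswith(T7_TAIL):
        raise ValueError("primer does not start with the expected T7 tail")
    return T7_TAIL, seq[len(T7_TAIL):]


# Example: the cucoid dsRNA primer pair from the list above.
cucoid_fwd = "CAGAGATGCATAATACGACTCACTATAGGGAGACGAGGATGTTGCTGGAGAAT"
cucoid_rev = "CAGAGATGCATAATACGACTCACTATAGGGAGAACTCCCGAAATCGGAAAACT"
_, fwd_specific = split_t7_primer(cucoid_fwd)
_, rev_specific = split_t7_primer(cucoid_rev)
print(fwd_specific, rev_specific)
```

For cucoid, the recovered forward sequence matches the untailed forward primer listed for the cucoidC in situ probe, which supports the inferred tail boundary.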

Microinjection of embryos

Chironomus embryo injection was done as previously described (Klomp et al., 2015). Clogmia eggs were dissected from ovaries and activated under water. Eggs of Aedes, Culex, and Anopheles were collected in a dark chamber on a moist filter paper for about 30 min (Fisher Scientific, Cat. No 09–795C). These eggs were transferred to another filter paper cut into 4 cm x 2 cm pieces and aligned perpendicularly to the edge of a cover glass (Fisher Scientific, Cat. No 12-648-5C) with the prospective injection side pointing towards the glass edge. We noticed that injecting eggs near the anterior or posterior pole was critical for survival of the procedure. During the alignment procedure, water was applied to the filter paper as needed to prevent eggs from desiccation. After aligning the eggs, the cover glass was removed and excess water on the filter paper was absorbed using filter paper. A second cover glass with a layer of double-sided tape (Scotch 3M) was slightly pressed against the aligned eggs to transfer the eggs to the double-sided tape. The embryos were then immediately covered with halocarbon oil to prevent desiccation. For cuticle preparations, the embryos were injected under halocarbon oil 27 (Sigma, MKBZ7202V). The oil was washed off under a gentle stream of water immediately after injection. In the case of Clogmia, Aedes, and Culex the cover glass was transferred to a moist chamber (petri dish with wet kimwipe paper) and kept at 28 °C, and water was added every day to prevent desiccation. In the case of Anopheles, the eggs were allowed to develop under water. Removal of the halocarbon oil was critical to ensure embryo survival until late developmental stages and hatching. For eggs to be fixed within a day following injection, we used a 1:1 mixture of halocarbon oil 27 and halocarbon oil 700 (Sigma, MKCB5817) and left the injected eggs immersed in the oil until fixation. Embryos were injected with quartz needles using a Narishige IM-300 microinjector. 
Quartz capillaries (Sutter Instrument Q100-70-10) were pulled with a Sutter Instrument P-2000 laser-based micropipette puller. Our settings for the needle puller were: Heat 645, Fil 4, Vel 40, Del 125, Pul 130. Needles were back-filled and the tip was broken open at the time of injection by slightly touching the first egg.

Embryo fixation

Clogmia embryos were dechorionated using a 10% dilution of commercial bleach (8.25% sodium hypochlorite) for 3 min. For Nephrotoma embryos, a 25% dilution was used for 3 min, until the chorion became slightly transparent. Embryos of Aedes, Culex, and Anopheles were dechorionated as described (Juhn and James, 2012). Dechorionated embryos were fixed in a 50 mL Falcon tube, using 20 mL of boiling salt/detergent solution (100 μL 10% Triton-X, 500 μL 28% NaCl, up to 20 mL of water). After 10 s, water was added to the tube to cool down the embryos. If needed, the embryos were devitellinized in a 1:1 mixture of n-heptane and methanol by gentle shaking. Embryos with the vitelline membrane still attached were further devitellinized using sharp tungsten needles in an agar plate covered with methanol. Devitellinized embryos were stored in 100% methanol at −20°C.
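The working concentrations of the salt/detergent solution follow from the stock volumes given above by simple dilution (final = stock concentration × stock volume / total volume). A minimal sketch of that arithmetic is shown below; the helper name `final_percent` is hypothetical.

```python
def final_percent(stock_percent: float, stock_volume_ml: float,
                  total_volume_ml: float) -> float:
    """Final concentration (%) after diluting a stock into a total volume."""
    return stock_percent * stock_volume_ml / total_volume_ml


# Recipe from the text: 100 uL of 10% detergent and 500 uL of 28% NaCl,
# made up to 20 mL with water.
detergent = final_percent(10.0, 0.100, 20.0)
nacl = final_percent(28.0, 0.500, 20.0)
print(f"detergent: {detergent:.2f}%  NaCl: {nacl:.2f}%")
```

Under these values the fixation solution contains about 0.05% detergent and 0.7% NaCl.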

RNA in situ hybridization

RNA in situ hybridizations were conducted as described (Klomp et al., 2015), using digoxigenin (DIG)-labeled probes and Fab fragments from anti-DIG antibodies conjugated with alkaline phosphatase (AP) (Roche, IN, USA). Probes were prepared from PCR templates, using sequence-specific forward primers and reverse primers with T7 promoter sequence (see above for Cal-opaMat, Cal-opaZyg, Cal-cad, Cal-slp, Cal-mira, Aga-panMat, Aga-panZyg, and Aae-cucoid; gene specific sequence underlined).

  • Cal-nos1 (450 bp):

  • 5’-AGCACTTTTCCCCCAAGAGT

  • 5’-CAGAGATGCATAATACGACTCACTATAGGGAGAGGCATTCATATTTCCTCAGCA

  • Cal-nos2 (475 bp):

  • 5’-AATTATTCTGTTCCAAAGTTGAGATT

  • 5’-CAGAGATGCATAATACGACTCACTATAGGGAGACCCCAGACTGGTGACAAAT

  • Cal-nos3 (548 bp):

  • 5’-TGAGTTAAATAGAGTGAAAACAGCAAA

  • 5’-CAGAGATGCATAATACGACTCACTATAGGGAGATACCGTCTCGTGCTTAATCG

  • Cal-nos4 (440 bp):

  • 5’-GGCAAAATTTTCCAAGTGAA

  • 5’-CAGAGATGCATAATACGACTCACTATAGGGAGACGTGTCCTCAAGCGTGTAGAT

  • Cal-vas (938 bp):

  • 5’-CTGAGGCGAACTTGTGTGAA

  • 5’-CAGAGATGCATAATACGACTCACTATAGGGAGAATTGGCAATGTCCAGTCCTC

  • Cal-tud (921 bp):

  • 5’-ATTCTGCAAGTCGTCGAGGT

  • 5’-CAGAGATGCATAATACGACTCACTATAGGGAGACCTGTACCAGCCATTGTCCT

  • Cal-gcl (450 bp):

  • 5’-GCAGAACCCCTTGGACATTA

  • 5’-CAGAGATGCATAATACGACTCACTATAGGGAGAGTAACGCCCACAATTCGTCT

  • cucoidA (939 bp):

  • 5’-ACGATGAGGAGGAGGGTTCT

  • 5’- CAGAGATGCATAATACGACTCACTATAGGGAGACGCACTTCACCGTGTGTAAC

  • cucoidB (717 bp):

  • 5’-GGGGCGACATCTATATCTCACT

  • 5’-CAGAGATGCATAATACGACTCACTATAGGGAGAACAGTGAGAAAAATTCCCAACTTTAGT

  • cucoidC (926 bp):

  • 5’-CGAGGATGTTGCTGGAGAAT

  • 5’-CAGAGATGCATAATACGACTCACTATAGGGAGAACTCCCGAAATCGGAAAACT

  • Cqu-cad (956 bp):

  • 5’-CACGTGTTCCATCAGTCCAG

  • 5’-CAGAGATGCATAATACGACTCACTATAGGGAGAATGAGGCTTAACGAGGATGG

  • Cqu-nos (927 bp):

  • 5’-AAGTGCCGTGAATTTTGTCC

  • 5’-CAGAGATGCATAATACGACTCACTATAGGGAGAGCGAAACCAATTCGACAGTT

  • Nsu-pan (966 bp):

  • 5’-TCGCGGCAAGATCATAGTCC

  • 5’-CAGAGATGCATAATACGACTCACTATAGGGAGACCTGCAGGGTTTACACCACT

  • Aae-nos (914 bp):

  • 5’- CAAACGTGAAGCGGAAGATT

  • 5’- CAGAGATGCATAATACGACTCACTATAGGGAGAATTACGTCCGGAAGTGTTCG

Rapid Amplification of cDNA Ends (RACE)

Total RNA was phenol/chloroform extracted from Clogmia (1 hr-old and 9 hr-old embryos), Anopheles (1–6 hr-old embryos), Culex (0–7 hr old embryos), and Nephrotoma (1–29 hr-old embryos) fixed in TRIzol Reagent (Invitrogen) and precipitated with isopropanol. 5’/3’ RACE was performed using SMARTer RACE 5'/3' Kit (Clontech) with the custom-made primers (including at the 5’ end 15 nucleotides of pRACE vector sequence). Gene specific sequences are underlined.

  • Cal-opa 5’RACE primer: 5’-GATTACGCCAAGCTTCTGGGTGACGCCGTGGGCAAGGACGTCA

  • Cal-opa 3’RACE primer: 5’-GATTACGCCAAGCTTCGCGTCGATCGTCACGCCCCCAAATTCG

  • Aga-pan 5’RACE primer: 5’-GATTACGCCAAGCTTCGAATCTCCGGCCGCGGAATTGAGACTT

  • Aga-pan 3’RACE primer: 5’-GATTACGCCAAGCTTAGCTTCACGCGACCAGCAAAACCAACGG

  • cucoid 5’RACE primer: 5’- GATTACGCCAAGCTTCGTGACGGCTTCGATGGTTGGTTTTTCC

  • cucoid 3’RACE primer: 5’- GATTACGCCAAGCTTCGCACGTGTTGAACAGTCACATGTTGAC

  • Aae-cucoid 5’RACE primer: 5’- GATTACGCCAAGCTTGATCCGGTGGATCGGACTTGGCCGAGAT

  • Aae-cucoid 3’RACE primer: 5’- GATTACGCCAAGCTTAACCTCCCTCGGGGTTGAACGTGAAGCT

  • Nsu-pan 5’RACE primer: 5’-GATTACGCCAAGCTTTCTGGTCGTGCGACGTTCTTCCAAATCG

  • Nsu-pan 3’RACE primer: 5’-GATTACGCCAAGCTTTCCCGTTGGTGCAAATCCACGAGATGTG

Cuticle preparations

Cuticles were prepared four to five days after injection. Eggshells were removed with tungsten needles and the embryos were transferred to a glass block dish with a drop of 1:4 glycerol/acetic acid. Following incubation in 1:4 glycerol/acetic acid overnight at room temperature, the cuticles were transferred onto a glass slide, oriented, mounted in 1:1 Hoyer’s medium/lactic acid (Stern and Sucena, 2000), covered with a cover glass, and dried overnight at 65 °C.

RNA-seq sample preparation and sequencing

Bisection of anterior and posterior embryo halves, RNA extraction, and sequencing were conducted as described (Klomp et al., 2015). In the case of Clogmia, anterior or posterior embryo halves from three 1 hr-old embryos were pooled and RNA-seq data were obtained from two replicates. In the case of Lutzomyia, embryo halves from ten 1–2 hr-old embryos were pooled and four replicates were generated. In the case of Anopheles (G-3 strain), embryo halves from five 1 hr-old embryos were pooled and three replicates were generated. In the case of Culex, embryo halves from seven 1 hr-old embryos were pooled and three replicates were generated. In the case of Aedes (Liverpool ‘black eye’ strain), embryo halves from five 1 hr-old embryos were pooled and four replicates were generated. In the case of Nephrotoma, embryo halves from nine 1 hr-old embryos were pooled and three replicates were generated. Stage-specific Clogmia transcriptomes were generated from the offspring of a single mother, and total RNA from five embryos was used for each stage. In the case of Lutzomyia, about 100 staged embryos were pooled for RNA extraction, and two independent RNA extractions from each time point were combined and submitted for sequencing.

Prior to library construction, RNA integrity, purity, and concentration were assessed using an Agilent 2100 Bioanalyzer with an RNA 6000 Nano Chip (Agilent Technologies, USA). Purification of messenger RNA (mRNA) was performed using the oligo-dT beads provided in the Illumina TruSEQ mRNA RNA-SEQ kit (Illumina, USA). Complementary DNA (cDNA) libraries for Illumina sequencing were constructed using the same kit, following the manufacturer-specified protocol. Briefly, the mRNA was chemically fragmented and primed with random oligos for first strand cDNA synthesis. Second strand cDNA synthesis was then carried out with dUTPs to preserve strand orientation information. The double-stranded cDNA was then purified, end repaired, and ‘A-tailed’ for adaptor ligation. Following ligation, the samples were size-selected to a final library size (adapters included) of 400–550 bp using sequential AMPure XP bead isolation (Beckman Coulter, USA). The libraries were sequenced on an Illumina HiSeq 4000 DNA sequencer, utilizing a paired-end sequencing flow cell with a HiSeq Reagent Kit v4 (Illumina, USA).

RNA-seq data preprocessing

The TrimGalore (Krueger, 2012) wrapper for Cutadapt (Martin, 2011) and FastQC (Andrews, 2010) was used to remove adapters and low quality sequences from raw fastq files. Overlapping reads were combined with Flash (Magoč and Salzberg, 2011) prior to assembly.

Transcriptome assembly and annotation

Trinity 2.4.0 (Grabherr et al., 2011), run on the Indiana University Karst high-performance computing cluster, was used to assemble contiguous sequences (contigs) from the paired-end (PE) sequence data of Clogmia, Lutzomyia, Anopheles, and Nephrotoma. ABySS 2.0 (Jackman et al., 2017) was used to assemble contigs from Culex and Aedes data. Only contigs of 200 nucleotides or longer were retained. BLAST+ tools (Camacho et al., 2009) were used to annotate contigs by conducting best-reciprocal-blast searches, first against the Drosophila melanogaster transcriptome (BDGP6) peptide sequences (blastx/tblastn) and then against the coding sequences (tblastx), with a maximum E-value threshold of 1e-10. The Biomart and AnnotationDbi packages were used to retrieve gene IDs and names. The longest open reading frames (ORFs) of unannotated transcripts were compared to the RefSeq invertebrate protein database (downloaded 4-1-2017) using blastp (maximum E-value 1e-10), followed by a similar comparison to remove transcripts with ORFs matching RefSeq plant, protozoan, archaea, bacteria, fungi, plasmid, or viral sequences (downloaded 6-1-2017). Remaining transcripts were designated by the top BLAST hit in D. melanogaster.
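The best-reciprocal-blast criterion used for annotation can be illustrated with a short sketch. It is not the authors' pipeline: it assumes that hits have already been parsed into (query, subject, E-value, bit score) tuples, that the best hit is the one with the highest bit score, and all identifiers in the toy data are hypothetical.

```python
def best_hits(rows, max_evalue=1e-10):
    """Map each query to its highest-bit-score subject among hits that pass
    the E-value cutoff."""
    best = {}
    for query, subject, evalue, bitscore in rows:
        if evalue > max_evalue:
            continue  # hit does not meet the significance threshold
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward_rows, reverse_rows, max_evalue=1e-10):
    """Keep contig-to-gene pairs that are each other's best hit."""
    forward = best_hits(forward_rows, max_evalue)   # contig -> gene
    reverse = best_hits(reverse_rows, max_evalue)   # gene -> contig
    return {c: g for c, g in forward.items() if reverse.get(g) == c}


# Toy data with hypothetical identifiers.
contig_vs_fly = [
    ("contig1", "geneA", 1e-50, 200.0),
    ("contig1", "geneB", 1e-20, 90.0),
    ("contig2", "geneA", 1e-30, 120.0),
    ("contig3", "geneC", 1e-5, 40.0),    # fails the 1e-10 cutoff
]
fly_vs_contig = [
    ("geneA", "contig1", 1e-50, 200.0),
    ("geneA", "contig2", 1e-30, 120.0),
    ("geneB", "contig1", 1e-20, 90.0),
    ("geneC", "contig3", 1e-5, 40.0),
]
annotation = reciprocal_best_hits(contig_vs_fly, fly_vs_contig)
print(annotation)
```

In the toy data only contig1 and geneA are mutual best hits; contig2 also matches geneA, but geneA's best hit is contig1, so contig2 stays unannotated at this step.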

Alignment and differential expression analysis

Cleaned paired-end read data were aligned and analyzed using R base (Ihaka and Gentleman, 1996) and Bioconductor (Gentleman et al., 2004) software packages. Sequence alignment was conducted with the seed-and-vote aligner Subread, as implemented in the Rsubread package (Liao et al., 2013), with up to five multi-mapping locations, six mismatches, and 20 subreads/seeds per read. Sequence file manipulation, including sorting and indexing of ‘.bam’ files, was done using Rsamtools (Morgan et al., 2013).

To avoid potential biases in transcript localization unrelated to anterior-posterior axis formation, transcripts annotated with mitochondrial, ribosomal, or ambiguous status (e.g., predicted, hypothetical, or uncharacterized) were filtered out prior to the differential expression comparisons. Transcripts with 20 or fewer counts in any of the A-P pairs were also excluded from the analysis prior to library normalization. Lower scoring, potentially related transcripts matching a given gene from the D. melanogaster transcriptome were retained for initial differential expression comparisons but removed for clarity of presentation in subsequent analyses and volcano plots. Trimmed mean of M-values (TMM) (Robinson and Oshlack, 2010) was used for normalization, and edgeR (Robinson et al., 2010) was used to perform quasi-likelihood F-tests between A-P samples, corrected for multiple testing using the false discovery rate (FDR; Benjamini-Hochberg). Following filtering based on annotation and detection of >20 counts per A-P pair, we used the following numbers of transcripts for differential expression comparisons: 5602 for Clogmia; 5392 for Lutzomyia; 8239 for Culex; 5802 for Aedes; 9353 for Anopheles; 5371 for Nephrotoma.
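Two of the steps above, the count filter and the Benjamini-Hochberg correction, are simple enough to sketch in a few lines. The sketch below is illustrative only and does not reproduce the edgeR analysis (TMM normalization and quasi-likelihood F-tests are not shown). It assumes that the ">20 counts" rule applies to the combined anterior plus posterior count of each replicate pair, which the text does not state explicitly; function names are hypothetical.

```python
def passes_count_filter(pair_counts, min_count=20):
    """pair_counts: list of (anterior, posterior) read counts, one tuple per
    replicate pair. Assumption: a transcript is kept only if every pair has
    more than `min_count` reads in total."""
    return all(anterior + posterior > min_count
               for anterior, posterior in pair_counts)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by rising p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy example with hypothetical values.
kept = passes_count_filter([(15, 10), (30, 5)])      # pair totals 25 and 35
dropped = passes_count_filter([(15, 5), (30, 5)])    # first pair totals 20
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(kept, dropped, fdr)
```

Summing each pair, rather than requiring both halves to exceed the threshold, keeps strongly localized transcripts that are nearly absent from one half; this is a design choice of the sketch, not a statement about the published analysis.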

Mapping RNA-seq reads to genomic loci

RNA-seq reads from stage-specific transcriptomes were mapped to genomic scaffolds containing a gene of interest using the TopHat RNA-seq aligner (Trapnell et al., 2009). Publicly available Anopheles stephensi transcriptomes used in this paper were: SRR515316, SRR515341, SRR514863, and SRR515304.

Data availability

This project was deposited at the National Center for Biotechnology Information under Bioproject ID PRJNA454000, and the reads were deposited in the Sequence Read Archive under accessions SRR7132661, SRR7132662, SRR7132659, SRR7132660, SRR7132665, SRR7132666, SRR7132663, and SRR7132664 for Clogmia; SRR7134470, SRR7134469, SRR7134472, SRR7134471, SRR7134468, and SRR7134467 for Lutzomyia; SRR8729860, SRR8729859, SRR8729858, SRR8729857, SRR8729856, and SRR8729855 for Anopheles; SRR8729854, SRR8729853, SRR8729852, and SRR8729851 for Aedes; SRR8729868, SRR8729867, SRR8729870, SRR8729869, SRR8729864, and SRR8729863 for Culex; and SRR8729866, SRR8729865, SRR8729872, SRR8729871, SRR8729861, and SRR8729862 for Nephrotoma. Transcript sequences are listed in the Key Resources Table.

Acknowledgements

We thank Chun Wai Kwan, Claudia Vacca, and Nicole Horio for technical assistance, Arthur Forer (York University, Canada) for the Nephrotoma culture, Robert Harrell and Channa Aluvihare (Insect Transformation Facility, University of Maryland, MD) for shipping blood-fed mosquitoes, Vanessa Macias and Anthony James (University of California at Irvine, CA) for Aedes and Culex reagents, Molly Duman Scheel (Indiana University, IN) for logistic support, and Edwin L Ferguson (University of Chicago, IL) and M Feder (University of Chicago, IL) for laboratory equipment. EL Ferguson provided detailed comments on a manuscript draft. This work was supported by funds from the National Science Foundation (IOS-1355057), the National Institute of General Medical Sciences (5R01GM127366-02), the University of Chicago, and the National Center for Advancing Translational Sciences of the National Institutes of Health (UL1 TR000430) to US-O, and the Intramural Program of the National Institute of Allergy and Infectious Diseases to JR. YY was the recipient of an award from the University of Chicago Henry Hinds Funds for Graduate Student Research in Evolutionary Biology. Transcriptomic data are available at the National Center for Biotechnology Information Sequence Read Archive (Bioproject ID PRJNA454000). This work utilized the computational resources of the NIH HPC Biowulf cluster.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Urs Schmidt-Ott, Email: uschmid@uchicago.edu.

Patricia J Wittkopp, University of Michigan, United States.

Funding Information

This paper was supported by the following grants:

  • National Science Foundation IOS-1355057 to Urs Schmidt-Ott.

  • National Institute of General Medical Sciences R01 GM127366-01A1 to Urs Schmidt-Ott.

  • National Center for Advancing Translational Sciences UL1 TR000430 to Urs Schmidt-Ott.

  • University of Chicago Institutional fund to Urs Schmidt-Ott.

  • National Institute of Allergy and Infectious Diseases Intramural Program to Jose Ribeiro.

  • University of Chicago Henry Hinds Funds for Graduate Student Research in Evolutionary Biology to Yoseop Yoon.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Formal analysis, Funding acquisition, Investigation, Visualization, Writing—original draft, Writing—review and editing.

Conceptualization, Formal analysis, Investigation, Visualization, Writing—original draft, Writing—review and editing.

Formal analysis, Investigation, Visualization.

Investigation.

Supervision.

Resources, Formal analysis, Supervision, Funding acquisition.

Conceptualization, Supervision, Funding acquisition, Writing—original draft, Project administration, Writing—review and editing.

Additional files

Supplementary file 1. Annotation and quantitation of Clogmia transcriptome.

Gene names are those indicated as best-reciprocal-blast hits (see Methods). Contig names correspond to transcriptome assembly. Read counts are given based on non-unique mapping of pre-processed RNA-seq data to annotated transcriptome.

elife-46711-supp1.csv (747.2KB, csv)
DOI: 10.7554/eLife.46711.026
Supplementary file 2. Annotation and quantitation of Lutzomyia transcriptome.

Gene names are those indicated as best-reciprocal-blast hits (see Methods). Contig names correspond to transcriptome assembly. Read counts are given based on non-unique mapping of pre-processed RNA-seq data to annotated transcriptome.

elife-46711-supp2.csv (714.7KB, csv)
DOI: 10.7554/eLife.46711.027
Supplementary file 3. Annotation and quantitation of Culex transcriptome.

Gene names are those indicated as best-reciprocal-blast hits (see Methods). Contig names correspond to transcriptome assembly. Read counts are given based on non-unique mapping of pre-processed RNA-seq data to annotated transcriptome.

elife-46711-supp3.csv (852.7KB, csv)
DOI: 10.7554/eLife.46711.028
Supplementary file 4. Annotation and quantitation of Aedes transcriptome.

Gene names are those indicated as best-reciprocal-blast hits (see Methods). Contig names correspond to transcriptome assembly. Read counts are given based on non-unique mapping of pre-processed RNA-seq data to annotated transcriptome.

elife-46711-supp4.csv (434.7KB, csv)
DOI: 10.7554/eLife.46711.029
Supplementary file 5. Annotation and quantitation of Anopheles transcriptome.

Gene names are those indicated as best-reciprocal-blast hits (see Methods). Contig names correspond to transcriptome assembly. Read counts are given based on non-unique mapping of pre-processed RNA-seq data to annotated transcriptome.

elife-46711-supp5.csv (1.1MB, csv)
DOI: 10.7554/eLife.46711.030
Supplementary file 6. Annotation and quantitation of Nephrotoma transcriptome.

Gene names are those indicated as best-reciprocal-blast hits (see Methods). Contig names correspond to transcriptome assembly. Read counts are given based on non-unique mapping of pre-processed RNA-seq data to annotated transcriptome.

elife-46711-supp6.csv (660.7KB, csv)
DOI: 10.7554/eLife.46711.031
Transparent reporting form
DOI: 10.7554/eLife.46711.032

Data availability

Sequencing data have been deposited at the National Center for Biotechnology Information Sequence Read Archive (Bioproject ID PRJNA454000).

The following dataset was generated:

Yoon Y, Klomp J, Martin-Martin I, Criscione F, Calvo E, Ribeiro J, Schmidt-Ott U. 2018. Evolution of an Embryonic Axis Determinant via Alternative Transcription. NCBI Bioproject. PRJNA454000

The following previously published dataset was used:

Xiaofang Jiang. 2012. Anopheles stephensi strain:Indian Wild Type (Walter Reid) Transcriptome or Gene expression. NCBI Bioproject. PRJNA168517

References

  1. Adams MD, Celniker SE, Holt RA, Evans CA, Gocayne JD, Amanatides PG, Scherer SE, Li PW, Hoskins RA, Galle RF, George RA, Lewis SE, Richards S, Ashburner M, Henderson SN, Sutton GG, Wortman JR, Yandell MD, Zhang Q, Chen LX, Brandon RC, Rogers YH, Blazej RG, Champe M, Pfeiffer BD, Wan KH, Doyle C, Baxter EG, Helt G, Nelson CR, Gabor GL, Abril JF, Agbayani A, An HJ, Andrews-Pfannkoch C, Baldwin D, Ballew RM, Basu A, Baxendale J, Bayraktaroglu L, Beasley EM, Beeson KY, Benos PV, Berman BP, Bhandari D, Bolshakov S, Borkova D, Botchan MR, Bouck J, Brokstein P, Brottier P, Burtis KC, Busam DA, Butler H, Cadieu E, Center A, Chandra I, Cherry JM, Cawley S, Dahlke C, Davenport LB, Davies P, de Pablos B, Delcher A, Deng Z, Mays AD, Dew I, Dietz SM, Dodson K, Doup LE, Downes M, Dugan-Rocha S, Dunkov BC, Dunn P, Durbin KJ, Evangelista CC, Ferraz C, Ferriera S, Fleischmann W, Fosler C, Gabrielian AE, Garg NS, Gelbart WM, Glasser K, Glodek A, Gong F, Gorrell JH, Gu Z, Guan P, Harris M, Harris NL, Harvey D, Heiman TJ, Hernandez JR, Houck J, Hostin D, Houston KA, Howland TJ, Wei MH, Ibegwam C, Jalali M, Kalush F, Karpen GH, Ke Z, Kennison JA, Ketchum KA, Kimmel BE, Kodira CD, Kraft C, Kravitz S, Kulp D, Lai Z, Lasko P, Lei Y, Levitsky AA, Li J, Li Z, Liang Y, Lin X, Liu X, Mattei B, McIntosh TC, McLeod MP, McPherson D, Merkulov G, Milshina NV, Mobarry C, Morris J, Moshrefi A, Mount SM, Moy M, Murphy B, Murphy L, Muzny DM, Nelson DL, Nelson DR, Nelson KA, Nixon K, Nusskern DR, Pacleb JM, Palazzolo M, Pittman GS, Pan S, Pollard J, Puri V, Reese MG, Reinert K, Remington K, Saunders RD, Scheeler F, Shen H, Shue BC, Sidén-Kiamos I, Simpson M, Skupski MP, Smith T, Spier E, Spradling AC, Stapleton M, Strong R, Sun E, Svirskas R, Tector C, Turner R, Venter E, Wang AH, Wang X, Wang ZY, Wassarman DA, Weinstock GM, Weissenbach J, Williams SM, Woodage T, Worley KC, Wu D, Yang S, Yao QA, Ye J, Yeh RF, Zaveri JS, Zhan M, Zhang G, Zhao Q, Zheng L, Zheng XH, Zhong FN, Zhong W, Zhou X, Zhu S, 
Zhu X, Smith HO, Gibbs RA, Myers EW, Rubin GM, Venter JC. The genome sequence of Drosophila melanogaster. Science. 2000;287:2185–2195. doi: 10.1126/science.287.5461.2185. [DOI] [PubMed] [Google Scholar]
  2. Akbari OS, Antoshechkin I, Amrhein H, Williams B, Diloreto R, Sandler J, Hay BA. The developmental transcriptome of the mosquito Aedes aegypti, an invasive species and major arbovirus vector. G3: Genes|Genomes|Genetics. 2013;3:1493–1509. doi: 10.1534/g3.113.006742. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Amaya E, Musci TJ, Kirschner MW. Expression of a dominant negative mutant of the FGF receptor disrupts mesoderm formation in xenopus embryos. Cell. 1991;66:257–270. doi: 10.1016/0092-8674(91)90616-7. [DOI] [PubMed] [Google Scholar]
  4. An JJ, Gharami K, Liao GY, Woo NH, Lau AG, Vanevski F, Torre ER, Jones KR, Feng Y, Lu B, Xu B. Distinct role of long 3' UTR BDNF mRNA in spine morphology and synaptic plasticity in hippocampal neurons. Cell. 2008;134:175–187. doi: 10.1016/j.cell.2008.05.045. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Andrews S. FastQC. Babraham Bioinformatics 2010
  6. Ansari S, Troelenberg N, Dao VA, Richter T, Bucher G, Klingler M. Double abdomen in a short-germ insect: zygotic control of Axis formation revealed in the beetle Tribolium castaneum. PNAS. 2018;115:1819–1824. doi: 10.1073/pnas.1716512115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Anvar SY, Allard G, Tseng E, Sheynkman GM, de Klerk E, Vermaat M, Yin RH, Johansson HE, Ariyurek Y, den Dunnen JT, Turner SW, 't Hoen PAC. Full-length mRNA sequencing uncovers a widespread coupling between transcription initiation and mRNA processing. Genome Biology. 2018;19:46. doi: 10.1186/s13059-018-1418-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Baek D, Davis C, Ewing B, Gordon D, Green P. Characterization and predictive discovery of evolutionarily conserved mammalian alternative promoters. Genome Research. 2007;17:145–155. doi: 10.1101/gr.5872707. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Berkovits BD, Mayr C. Alternative 3' UTRs act as scaffolds to regulate membrane protein localization. Nature. 2015;522:363–367. doi: 10.1038/nature14321. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Berleth T, Burri M, Thoma G, Bopp D, Richstein S, Frigerio G, Noll M, Nüsslein-Volhard C. The role of localization of bicoid RNA in organizing the anterior pattern of the Drosophila embryo. The EMBO Journal. 1988;7:1749–1756. doi: 10.1002/j.1460-2075.1988.tb03004.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Bharti K, Liu W, Csermely T, Bertuzzi S, Arnheiter H. Alternative promoter use in eye development: the complex role and regulation of the transcription factor MITF. Development. 2008;135:1169–1178. doi: 10.1242/dev.014142. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Bolognesi R, Farzana L, Fischer TD, Brown SJ. Multiple wnt genes are required for segmentation in the short-germ embryo of tribolium castaneum. Current Biology. 2008;18:1624–1629. doi: 10.1016/j.cub.2008.09.057. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Bucher G, Farzana L, Brown SJ, Klingler M. Anterior localization of maternal mRNAs in a short germ insect lacking bicoid. Evolution and Development. 2005;7:142–149. doi: 10.1111/j.1525-142X.2005.05016.x. [DOI] [PubMed] [Google Scholar]
  14. Buljan M, Chalancon G, Dunker AK, Bateman A, Balaji S, Fuxreiter M, Babu MM. Alternative splicing of intrinsically disordered regions and rewiring of protein interactions. Current Opinion in Structural Biology. 2013;23:443–450. doi: 10.1016/j.sbi.2013.03.006. [DOI] [PubMed] [Google Scholar]
  15. Bullock SL, Ish-Horowicz D. Conserved signals and machinery for RNA transport in Drosophila oogenesis and embryogenesis. Nature. 2001;414:611–616. doi: 10.1038/414611a. [DOI] [PubMed] [Google Scholar]
  16. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10:421. doi: 10.1186/1471-2105-10-421. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Carroll SB, Grenier JK, Weatherbee SD. From DNA to Diversity: Molecular Genetics and the Evolution of Animal Design. Oxford: Blackwell; 2005. [Google Scholar]
  18. Ciolli Mattioli C, Rom A, Franke V, Imami K, Arrey G, Terne M, Woehler A, Akalin A, Ulitsky I, Chekulaeva M. Alternative 3' UTRs direct localization of functionally diverse protein isoforms in neuronal compartments. Nucleic Acids Research. 2019;47:2560–2573. doi: 10.1093/nar/gky1270. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Clark E, Akam M. Odd-paired controls frequency doubling in Drosophila segmentation by altering the pair-rule gene regulatory network. eLife. 2016;5:e18215. doi: 10.7554/eLife.18215. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Cranston PS, Hardy NB, Morse GE. A dated molecular phylogeny for the Chironomidae (Diptera) Systematic Entomology. 2012;37:172–188. doi: 10.1111/j.1365-3113.2011.00603.x. [DOI] [Google Scholar]
  21. Datta RR, Ling J, Kurland J, Ren X, Xu Z, Yucel G, Moore J, Shokri L, Baker I, Bishop T, Struffi P, Levina R, Bulyk ML, Johnston RJ, Small S. A feed-forward relay integrates the regulatory activities of bicoid and orthodenticle via sequential binding to suboptimal sites. Genes & Development. 2018;32:723–736. doi: 10.1101/gad.311985.118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Davuluri RV, Suzuki Y, Sugano S, Plass C, Huang TH. The functional consequences of alternative promoter use in mammalian genomes. Trends in Genetics. 2008;24:167–177. doi: 10.1016/j.tig.2008.01.008. [DOI] [PubMed] [Google Scholar]
  23. de Klerk E, 't Hoen PA. Alternative mRNA transcription, processing, and translation: insights from RNA sequencing. Trends in Genetics. 2015;31:128–139. doi: 10.1016/j.tig.2015.01.001. [DOI] [PubMed] [Google Scholar]
  24. Derti A, Garrett-Engele P, Macisaac KD, Stevens RC, Sriram S, Chen R, Rohl CA, Johnson JM, Babak T. A quantitative atlas of polyadenylation in five mammals. Genome Research. 2012;22:1173–1183. doi: 10.1101/gr.132563.111. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Driever W. Maternal control of anterior development in the Drosophila embryo. In: The Development of Drosophila melanogaster. Cold Spring Harbor: Cold Spring Harbor Laboratory Press; 1993. [Google Scholar]
  26. Ewen-Campen B, Schwager EE, Extavour CG. The molecular machinery of germ line specification. Molecular Reproduction and Development. 2010;77:3–18. doi: 10.1002/mrd.21091. [DOI] [PubMed] [Google Scholar]
  27. Forrest AR, Kawaji H, Rehli M, Baillie JK, de Hoon MJ, Haberle V, Lassmann T, Kulakovskiy IV, Lizio M, Itoh M, Andersson R, Mungall CJ, Meehan TF, Schmeier S, Bertin N, Jørgensen M, Dimont E, Arner E, Schmidl C, Schaefer U, Medvedeva YA, Plessy C, Vitezic M, Severin J, Semple C, Ishizu Y, Young RS, Francescatto M, Alam I, Albanese D, Altschuler GM, Arakawa T, Archer JA, Arner P, Babina M, Rennie S, Balwierz PJ, Beckhouse AG, Pradhan-Bhatt S, Blake JA, Blumenthal A, Bodega B, Bonetti A, Briggs J, Brombacher F, Burroughs AM, Califano A, Cannistraci CV, Carbajo D, Chen Y, Chierici M, Ciani Y, Clevers HC, Dalla E, Davis CA, Detmar M, Diehl AD, Dohi T, Drabløs F, Edge AS, Edinger M, Ekwall K, Endoh M, Enomoto H, Fagiolini M, Fairbairn L, Fang H, Farach-Carson MC, Faulkner GJ, Favorov AV, Fisher ME, Frith MC, Fujita R, Fukuda S, Furlanello C, Furino M, Furusawa J, Geijtenbeek TB, Gibson AP, Gingeras T, Goldowitz D, Gough J, Guhl S, Guler R, Gustincich S, Ha TJ, Hamaguchi M, Hara M, Harbers M, Harshbarger J, Hasegawa A, Hasegawa Y, Hashimoto T, Herlyn M, Hitchens KJ, Ho Sui SJ, Hofmann OM, Hoof I, Hori F, Huminiecki L, Iida K, Ikawa T, Jankovic BR, Jia H, Joshi A, Jurman G, Kaczkowski B, Kai C, Kaida K, Kaiho A, Kajiyama K, Kanamori-Katayama M, Kasianov AS, Kasukawa T, Katayama S, Kato S, Kawaguchi S, Kawamoto H, Kawamura YI, Kawashima T, Kempfle JS, Kenna TJ, Kere J, Khachigian LM, Kitamura T, Klinken SP, Knox AJ, Kojima M, Kojima S, Kondo N, Koseki H, Koyasu S, Krampitz S, Kubosaki A, Kwon AT, Laros JF, Lee W, Lennartsson A, Li K, Lilje B, Lipovich L, Mackay-Sim A, Manabe R, Mar JC, Marchand B, Mathelier A, Mejhert N, Meynert A, Mizuno Y, de Lima Morais DA, Morikawa H, Morimoto M, Moro K, Motakis E, Motohashi H, Mummery CL, Murata M, Nagao-Sato S, Nakachi Y, Nakahara F, Nakamura T, Nakamura Y, Nakazato K, van Nimwegen E, Ninomiya N, Nishiyori H, Noma S, Noma S, Noazaki T, Ogishima S, Ohkura N, Ohimiya H, Ohno H, Ohshima M, Okada-Hatakeyama M, Okazaki Y, Orlando V, Ovchinnikov DA, Pain A, Passier R, Patrikakis M, Persson H, Piazza S, Prendergast JG, Rackham OJ, Ramilowski JA, Rashid M, Ravasi T, Rizzu P, Roncador M, Roy S, Rye MB, Saijyo E, Sajantila A, Saka A, Sakaguchi S, Sakai M, Sato H, Savvi S, Saxena A, Schneider C, Schultes EA, Schulze-Tanzil GG, Schwegmann A, Sengstag T, Sheng G, Shimoji H, Shimoni Y, Shin JW, Simon C, Sugiyama D, Sugiyama T, Suzuki M, Suzuki N, Swoboda RK, 't Hoen PA, Tagami M, Takahashi N, Takai J, Tanaka H, Tatsukawa H, Tatum Z, Thompson M, Toyodo H, Toyoda T, Valen E, van de Wetering M, van den Berg LM, Verado R, Vijayan D, Vorontsov IE, Wasserman WW, Watanabe S, Wells CA, Winteringham LN, Wolvetang E, Wood EJ, Yamaguchi Y, Yamamoto M, Yoneda M, Yonekura Y, Yoshida S, Zabierowski SE, Zhang PG, Zhao X, Zucchelli S, Summers KM, Suzuki H, Daub CO, Kawai J, Heutink P, Hide W, Freeman TC, Lenhard B, Bajic VB, Taylor MS, Makeev VJ, Sandelin A, Hume DA, Carninci P, Hayashizaki Y, FANTOM Consortium and the RIKEN PMI and CLST (DGT). A promoter-level mammalian expression atlas. Nature. 2014;507:462–470. doi: 10.1038/nature13182. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Fu J, Posnien N, Bolognesi R, Fischer TD, Rayl P, Oberhofer G, Kitzmann P, Brown SJ, Bucher G. Asymmetrically expressed axin required for anterior development in Tribolium. PNAS. 2012;109:7782–7786. doi: 10.1073/pnas.1116641109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Gavis ER, Lehmann R. Localization of Nanos RNA controls embryonic polarity. Cell. 1992;71:301–313. doi: 10.1016/0092-8674(92)90358-J. [DOI] [PubMed] [Google Scholar]
  30. Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, Hornik K, Hothorn T, Huber W, Iacus S, Irizarry R, Leisch F, Li C, Maechler M, Rossini AJ, Sawitzki G, Smith C, Smyth G, Tierney L, Yang JY, Zhang J. Bioconductor: open software development for computational biology and bioinformatics. Genome Biology. 2004;5:R80. doi: 10.1186/gb-2004-5-10-r80. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, Chen Z, Mauceli E, Hacohen N, Gnirke A, Rhind N, di Palma F, Birren BW, Nusbaum C, Lindblad-Toh K, Friedman N, Regev A. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nature Biotechnology. 2011;29:644–652. doi: 10.1038/nbt.1883. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Grimaldi D, Engel MS. Evolution of Insects. Cambridge: Cambridge University Press; 2005. [Google Scholar]
  33. Gusev O, Suetsugu Y, Cornette R, Kawashima T, Logacheva MD, Kondrashov AS, Penin AA, Hatanaka R, Kikuta S, Shimura S, Kanamori H, Katayose Y, Matsumoto T, Shagimardanova E, Alexeev D, Govorun V, Wisecaver J, Mikheyev A, Koyanagi R, Fujie M, Nishiyama T, Shigenobu S, Shibata TF, Golygina V, Hasebe M, Okuda T, Satoh N, Kikawada T. Comparative genome sequencing reveals genomic signature of extreme desiccation tolerance in the anhydrobiotic midge. Nature Communications. 2014;5:4784. doi: 10.1038/ncomms5784. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Holt CE, Bullock SL. Subcellular mRNA localization in animal cells and why it matters. Science. 2009;326:1212–1216. doi: 10.1126/science.1176488. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Houtmeyers R, Souopgui J, Tejpar S, Arkell R. The ZIC gene family encodes multi-functional proteins essential for patterning and morphogenesis. Cellular and Molecular Life Sciences. 2013;70:3791–3811. doi: 10.1007/s00018-013-1285-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Ihaka R, Gentleman R. R: a language for data analysis and graphics. Journal of Computational and Graphical Statistics. 1996;5:299–314. doi: 10.2307/1390807. [DOI] [Google Scholar]
  37. Ikeshima-Kataoka H, Skeath JB, Nabeshima Y, Doe CQ, Matsuzaki F. Miranda directs Prospero to a daughter cell during Drosophila asymmetric divisions. Nature. 1997;390:625–629. doi: 10.1038/37641. [DOI] [PubMed] [Google Scholar]
  38. Jackman SD, Vandervalk BP, Mohamadi H, Chu J, Yeo S, Hammond SA, Jahesh G, Khan H, Coombe L, Warren RL, Birol I. ABySS 2.0: resource-efficient assembly of large genomes using a Bloom filter. Genome Research. 2017;27:768–777. doi: 10.1101/gr.214346.116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Juhn J, Marinotti O, Calvo E, James AA. Gene structure and expression of Nanos (nos) and Oskar (osk) orthologues of the vector mosquito, Culex quinquefasciatus. Insect Molecular Biology. 2008;17:545–552. doi: 10.1111/j.1365-2583.2008.00823.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Juhn J, James AA. Oskar gene expression in the vector mosquitoes, Anopheles gambiae and Aedes aegypti. Insect Molecular Biology. 2006;15:363–372. doi: 10.1111/j.1365-2583.2006.00655.x. [DOI] [PubMed] [Google Scholar]
  41. Juhn J, James AA. Hybridization in situ of salivary glands, ovaries, and embryos of vector mosquitoes. Journal of Visualized Experiments. 2012:3709. doi: 10.3791/3709. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Jürgens G, Wieschaus E, Nüsslein-Volhard C, Kluding H. Mutations affecting the pattern of the larval cuticle in Drosophila melanogaster: II. Zygotic loci on the third chromosome. Wilhelm Roux's Archives of Developmental Biology. 1984;193:283–295. doi: 10.1007/BF00848157. [DOI] [PubMed] [Google Scholar]
  43. Kaiser TS, Poehn B, Szkiba D, Preussner M, Sedlazeck FJ, Zrim A, Neumann T, Nguyen LT, Betancourt AJ, Hummel T, Vogel H, Dorner S, Heyd F, von Haeseler A, Tessmar-Raible K. The genomic basis of circadian and circalunar timing adaptations in a midge. Nature. 2016;540:69–73. doi: 10.1038/nature20151. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Kelley JL, Peyton JT, Fiston-Lavier AS, Teets NM, Yee MC, Johnston JS, Bustamante CD, Lee RE, Denlinger DL. Compact genome of the Antarctic midge is likely an adaptation to an extreme environment. Nature Communications. 2014;5:4611. doi: 10.1038/ncomms5611. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Kim S, Oh M, Jung W, Park J, Choi H-G, Shin SC. Genome sequencing of the winged midge, Parochlus steinenii, from the Antarctic Peninsula. GigaScience. 2017;6 doi: 10.1093/gigascience/giw009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Klomp J, Athy D, Kwan CW, Bloch NI, Sandmann T, Lemke S, Schmidt-Ott U. Embryo development. A cysteine-clamp gene drives embryo polarity in the midge Chironomus. Science. 2015;348:1040–1042. doi: 10.1126/science.aaa7105. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Krueger F. Babraham Bioinformatics; 2012. [Google Scholar]
  48. Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG. Clustal W and Clustal X version 2.0. Bioinformatics. 2007;23:2947–2948. doi: 10.1093/bioinformatics/btm404. [DOI] [PubMed] [Google Scholar]
  49. Lau AG, Irier HA, Gu J, Tian D, Ku L, Liu G, Xia M, Fritsch B, Zheng JQ, Dingledine R, Xu B, Lu B, Feng Y. Distinct 3'UTRs differentially regulate activity-dependent translation of brain-derived neurotrophic factor (BDNF). PNAS. 2010;107:15945–15950. doi: 10.1073/pnas.1002929107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Lehmann R. Germ plasm biogenesis--an Oskar-centric perspective. Current Topics in Developmental Biology. 2016;116:679–707. doi: 10.1016/bs.ctdb.2015.11.024. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Lemke S, Schmidt-Ott U. Evidence for a composite anterior determinant in the hover fly Episyrphus balteatus (Syrphidae), a cyclorrhaphan fly with an anterodorsal serosa anlage. Development. 2009;136:117–127. doi: 10.1242/dev.030270. [DOI] [PubMed] [Google Scholar]
  52. Liao Y, Smyth GK, Shi W. The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote. Nucleic Acids Research. 2013;41:e108. doi: 10.1093/nar/gkt214. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Linnen CR, Poh YP, Peterson BK, Barrett RD, Larson JG, Jensen JD, Hoekstra HE. Adaptive evolution of multiple traits through multiple mutations at a single gene. Science. 2013;339:1312–1316. doi: 10.1126/science.1233213. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Liu Q, Onal P, Datta RR, Rogers JM, Schmidt-Ott U, Bulyk ML, Small S, Thornton JW. Ancient mechanisms for the evolution of the bicoid homeodomain's function in fly development. eLife. 2018;7:e34594. doi: 10.7554/eLife.34594. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Lynch JA, Brent AE, Leaf DS, Pultz MA, Desplan C. Localized maternal orthodenticle patterns anterior and posterior in the long germ wasp Nasonia. Nature. 2006;439:728–732. doi: 10.1038/nature04445. [DOI] [PubMed] [Google Scholar]
  56. Magoč T, Salzberg SL. FLASH: fast length adjustment of short reads to improve genome assemblies. Bioinformatics. 2011;27:2957–2963. doi: 10.1093/bioinformatics/btr507. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Mallarino R, Linden TA, Linnen CR, Hoekstra HE. The role of isoforms in the evolution of cryptic coloration in Peromyscus mice. Molecular Ecology. 2017;26:245–258. doi: 10.1111/mec.13663. [DOI] [PubMed] [Google Scholar]
  58. Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal. 2011;17:10–12. doi: 10.14806/ej.17.1.200. [DOI] [Google Scholar]
  59. Mayr C, Bartel DP. Widespread shortening of 3'UTRs by alternative cleavage and polyadenylation activates oncogenes in cancer cells. Cell. 2009;138:673–684. doi: 10.1016/j.cell.2009.06.016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Morgan M, Pages H, Obenchain V, Hayden N. Rsamtools: Binary alignment (BAM), FASTA, variant call (BCF), and tabix file import. Bioconductor 2013
  61. Pal S, Gupta R, Davuluri RV. Alternative transcription and alternative splicing in cancer. Pharmacology & Therapeutics. 2012;136:283–294. doi: 10.1016/j.pharmthera.2012.08.005. [DOI] [PubMed] [Google Scholar]
  62. Peter IS, Davidson EH. Evolution of gene regulatory networks controlling body plan development. Cell. 2011;144:970–985. doi: 10.1016/j.cell.2011.02.017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Pinto PA, Henriques T, Freitas MO, Martins T, Domingues RG, Wyrzykowska PS, Coelho PA, Carmo AM, Sunkel CE, Proudfoot NJ, Moreira A. RNA polymerase II kinetics in polo polyadenylation signal selection. The EMBO Journal. 2011;30:2431–2444. doi: 10.1038/emboj.2011.156. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Prühs R, Beermann A, Schröder R. The roles of the Wnt-antagonists axin and Lrp4 during embryogenesis of the red flour beetle Tribolium castaneum. Journal of Developmental Biology. 2017;5:10. doi: 10.3390/jdb5040010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26:139–140. doi: 10.1093/bioinformatics/btp616. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Robinson MD, Oshlack A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biology. 2010;11:R25. doi: 10.1186/gb-2010-11-3-r25. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Schmidt-Ott U, Rafiqi AM, Lemke S. Hox3/zen and the evolution of extraembryonic epithelia in insects. Advances in Experimental Medicine and Biology. 2010;689:133–144. doi: 10.1007/978-1-4419-6673-5_10. [DOI] [PubMed] [Google Scholar]
  68. Schrankel CS, Solek CM, Buckley KM, Anderson MK, Rast JP. A conserved alternative form of the purple sea urchin HEB/E2-2/E2A transcription factor mediates a switch in E-protein regulatory state in differentiating immune cells. Developmental Biology. 2016;416:149–161. doi: 10.1016/j.ydbio.2016.05.034. [DOI] [PubMed] [Google Scholar]
  69. Schröder R. The genes orthodenticle and hunchback substitute for bicoid in the beetle Tribolium. Nature. 2003;422:621–625. doi: 10.1038/nature01536. [DOI] [PubMed] [Google Scholar]
  70. Shabalina SA, Spiridonov AN, Spiridonov NA, Koonin EV. Connections between alternative transcription and alternative splicing in mammals. Genome Biology and Evolution. 2010;2:791–799. doi: 10.1093/gbe/evq058. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Shabalina SA, Ogurtsov AY, Spiridonov NA, Koonin EV. Evolution at protein ends: major contribution of alternative transcription initiation and termination to the transcriptome and proteome diversity in mammals. Nucleic Acids Research. 2014;42:7132–7144. doi: 10.1093/nar/gku342. [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Shapiro IM, Cheng AW, Flytzanis NC, Balsamo M, Condeelis JS, Oktay MH, Burge CB, Gertler FB. An EMT-driven alternative splicing program occurs in human breast cancer and modulates cellular phenotype. PLOS Genetics. 2011;7:e1002218. doi: 10.1371/journal.pgen.1002218. [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Shepard PJ, Choi EA, Lu J, Flanagan LA, Hertel KJ, Shi Y. Complex and dynamic landscape of RNA polyadenylation revealed by PAS-Seq. RNA. 2011;17:761–772. doi: 10.1261/rna.2581711. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Smibert P, Miura P, Westholm JO, Shenker S, May G, Duff MO, Zhang D, Eads BD, Carlson J, Brown JB, Eisman RC, Andrews J, Kaufman T, Cherbas P, Celniker SE, Graveley BR, Lai EC. Global patterns of tissue-specific alternative polyadenylation in Drosophila. Cell Reports. 2012;1:277–289. doi: 10.1016/j.celrep.2012.01.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Stauber M, Jäckle H, Schmidt-Ott U. The anterior determinant bicoid of Drosophila is a derived Hox class 3 gene. PNAS. 1999;96:3786–3789. doi: 10.1073/pnas.96.7.3786. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Stauber M, Prell A, Schmidt-Ott U. A single Hox3 gene with composite bicoid and zerknullt expression characteristics in non-Cyclorrhaphan flies. PNAS. 2002;99:274–279. doi: 10.1073/pnas.012292899. [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Stern DL, Sucena E. Preparation of larval and adult cuticles for light microscopy. In: Sullivan W, Ashburner M, Hawley RS, editors. Drosophila Protocols. Cold Spring Harbor: CSHL Press; 2000. [Google Scholar]
  78. Struhl G, Johnston P, Lawrence PA. Control of Drosophila body pattern by the hunchback morphogen gradient. Cell. 1992;69:237–249. doi: 10.1016/0092-8674(92)90405-2. [DOI] [PubMed] [Google Scholar]
  79. Taliaferro JM, Vidaki M, Oliveira R, Olson S, Zhan L, Saxena T, Wang ET, Graveley BR, Gertler FB, Swanson MS, Burge CB. Distal alternative last exons localize mRNAs to neural projections. Molecular Cell. 2016;61:821–833. doi: 10.1016/j.molcel.2016.01.020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Tautz D. Regulation of the Drosophila segmentation gene hunchback by two maternal morphogenetic centres. Nature. 1988;332:281–284. doi: 10.1038/332281a0. [DOI] [PubMed] [Google Scholar]
  81. Thawornwattana Y, Dalquen D, Yang Z. Coalescent analysis of phylogenomic data confidently resolves the species relationships in the Anopheles gambiae species complex. Molecular Biology and Evolution. 2018;35:2512–2527. doi: 10.1093/molbev/msy158. [DOI] [PMC free article] [PubMed] [Google Scholar]
  82. Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009;25:1105–1111. doi: 10.1093/bioinformatics/btp120. [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Tushev G, Glock C, Heumüller M, Biever A, Jovanovic M, Schuman EM. Alternative 3' UTRs modify the localization, regulatory potential, stability, and plasticity of mRNAs in neuronal compartments. Neuron. 2018;98:495–511. doi: 10.1016/j.neuron.2018.03.030. [DOI] [PubMed] [Google Scholar]
  84. Vacik T, Stubbs JL, Lemke G. A novel mechanism for the transcriptional regulation of wnt signaling in development. Genes & Development. 2011;25:1783–1795. doi: 10.1101/gad.17227011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Vicoso B, Bachtrog D. Numerous transitions of sex chromosomes in Diptera. PLOS Biology. 2015;13:e1002078. doi: 10.1371/journal.pbio.1002078. [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Wiegmann BM, Trautwein MD, Winkler IS, Barr NB, Kim JW, Lambkin C, Bertone MA, Cassel BK, Bayless KM, Heimberg AM, Wheeler BM, Peterson KJ, Pape T, Sinclair BJ, Skevington JH, Blagoderov V, Caravas J, Kutty SN, Schmidt-Ott U, Kampmeier GE, Thompson FC, Grimaldi DA, Beckenbach AT, Courtney GW, Friedrich M, Meier R, Yeates DK. Episodic radiations in the fly tree of life. PNAS. 2011;108:5690–5695. doi: 10.1073/pnas.1012675108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Wiesner T, Lee W, Obenauf AC, Ran L, Murali R, Zhang QF, Wong EW, Hu W, Scott SN, Shah RH, Landa I, Button J, Lailler N, Sboner A, Gao D, Murphy DA, Cao Z, Shukla S, Hollmann TJ, Wang L, Borsu L, Merghoub T, Schwartz GK, Postow MA, Ariyan CE, Fagin JA, Zheng D, Ladanyi M, Busam KJ, Berger MF, Chen Y, Chi P. Alternative transcription initiation leads to expression of a novel ALK isoform in cancer. Nature. 2015;526:453–457. doi: 10.1038/nature15258. [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. Wimmer EA, Carleton A, Harjes P, Turner T, Desplan C. Bicoid-independent formation of thoracic segments in Drosophila. Science. 2000;287:2476–2479. doi: 10.1126/science.287.5462.2476. [DOI] [PubMed] [Google Scholar]
  89. Wittkopp PJ, Kalay G. Cis-regulatory elements: molecular mechanisms and evolutionary processes underlying divergence. Nature Reviews Genetics. 2011;13:59–69. doi: 10.1038/nrg3095. [DOI] [PubMed] [Google Scholar]

Decision letter

Editor: Patricia J Wittkopp
Reviewed by: Leslie Pick, Aleksander Popadic, Michael Akam

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Evolution of embryonic axis determinants via alternative transcription" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by Patricia Wittkopp as the Reviewing and Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Leslie Pick (Reviewer #1); Aleksander Popadic (Reviewer #2); Michael Akam (Reviewer #3).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary

Yoon et al. present a study of anterior determinants in Diptera, none of which utilize the Drosophila determinant, bcd. They identify opa, CG9215 and pangolin in different species through a combination of RNAseq, in situ hybridization, microinjection and RNAi. Shared across genes and species, they find that alternate transcripts (alternate 3' or 5' ends) function in maternal and zygotic patterning. The first three figures present experiments on the moth fly, Clogmia. The authors convincingly demonstrate that opa functions as an anterior determinant in this species by analyzing its expression and function. They further show differential function of maternal and zygotic transcripts, which differ in expression and in 20 aa at the N-termini. Functions of the two isoforms are differentiated by RNAi, as well as ectopic injections, showing both loss and gain of function for this gene. The authors next demonstrate that opa coding sequences from other species are capable of functioning as anterior determinants in Clogmia. Next, the authors compare Clogmia opa to that of the sand fly Lutzomyia, finding alternate 5' and 3' UTRs in this species but no difference in protein coding regions of a maternal and zygotic transcript. Together, these results suggest that cis-regulatory evolution rather than protein evolution was the driver of opa taking on a role in anterior patterning. The authors next examine several mosquito species, in which opa is not maternally deposited (Figures 4 and 5). Rather, RNAseq for Culex identified CG9215 as an anteriorly localized gene that, when tested by RNAi, also appears to function as an anterior determinant. This gene also is differentially expressed and functions in Aedes mosquitoes. However, this gene was not detected as a potential anterior determinant in Anopheles mosquitoes and crane flies. Here, RNAseq identified pangolin as an anteriorly localized transcript.
In all cases, alternate 5' ends were identified for maternal and zygotic transcripts. The paper also documents which of these species show posteriorly localised transcripts of the genes known to be involved in germ line specification. They identify such transcripts in all but the moth fly Clogmia, which appears not to localise any of the known germ cell determinants before blastoderm formation. This result is surprising, and another important finding of the paper. It perhaps deserves more prominence in the Abstract.

This is an impressive paper building on the established technical and conceptual expertise of the Schmidt-Ott group in working with axis specification in lower dipteran embryos. It extends work previously published in Science (Klomp et al., 2015), to generate remarkable further results. The study includes species of medical importance (Aedes, Anopheles, Lutzomyia) and work on a taxonomically significant group of basal dipterans, the crane flies, that have not previously been used for embryonic manipulation, to our knowledge. RNAseq analysis in bisected embryos was successful in identifying differentially localized gene products in both anterior and posterior regions of the embryo and led to the identification of novel anterior determinants in these species. The successful use of this approach is valuable for the field. Further, the authors present high quality in situ hybridization results in multiple species and, importantly, follow up on expression analysis with functional studies using both RNAi and injection of mRNA. These experiments appear to have been done rigorously and the figures are of very high quality. The data in Figure 2 are particularly impressive.

Essential revisions

1) There was consensus among the reviewers that although the authors emphasize the theme of alternative transcripts, another key result from these experiments is the range of genes that can function as an anterior determinant in different species. The reviewers found this to be the more interesting and better supported conclusion of the work. As one reviewer wrote: The case for using these data as evidence of alternative transcription (AT) is much less convincing, and it would need to be documented by rigorous testing. In addition, there is inconsistency in experimental design and over-reaching writing that detracts from the otherwise solid study. What the authors have been able to show, without any question, is that different proteins have been recruited to the anterior pole of the embryo to provide the anterior determination in different dipterans. But this is not the same as documenting that this is due to alternative transcription. For example, in the best worked out species (Clogmia), the authors show that both maternal and zygotic opa transcripts can generate the same phenotype (double-headed larva), and consequently have the same function. In addition, the injections of mutated opa mRNA variants (Figure 3B) show that the obvious difference (an additional 20 aa at the 5' end of the zygotic transcript) has no effect on determining the anterior end. Surprisingly, it is the loss of 3' end sequence that has an effect. However, a closer inspection of the sequence alignment in Figure 1 supplemental shows that these regions are identical in both transcripts – hence the conundrum of explaining the difference in function. The only clue is provided in the Discussion "All anterior determinants that we report in this study contain either 5' or 3' UTR sequence that is not shared with the corresponding zygotic isoforms." But the authors do not provide any evidence in supplemental material to support this claim.
If that information exists, can't that information be used to generate opa variants that can test the functional significance of these sequences? These types of experiments are required to generate evidence of alternative transcription. The paper suggests that in most cases the maternally expressed transcript is specific and likely newly evolved for this role. This is on the basis that (with one exception) the maternal transcripts are not expressed during the later stages of embryogenesis that have been examined. However, no data are shown to rule out the possibility that the maternal promoter is also expressed and functional at other life stages – for example, in the adult nervous system. It would be desirable to rule out this possibility, particularly if the authors want to stress the idea that these alternative transcripts really are novel inventions, and not simply the redeployment of an already established transcript isoform to a new role during oogenesis. The paper is written to strongly emphasise the fact that each of these novel anterior determinants is encoded by an alternative transcript form, and argues that this may be a common way for genes to acquire novel functions.

The claim about generality of AT as a mechanism and its significance is presented in the title, Abstract, Introduction, and Discussion, so there should be much stronger evidence in support of this claim. To show that AT is indeed responsible for the observed anterior axis determination the authors would have to commit to and perform rigorous experiments to show the differences in function between different transcripts, but these experiments would likely require more than two months to perform. We therefore encourage the authors to refocus the manuscript on the diversity of anterior determinants, removing prominent claims about alternative transcripts in the title and Abstract, at least. By their very nature as localised transcripts, these anterior determinants must contain specific localisation signals embedded in the RNA. This requirement may significantly increase the probability that such determinants evolve through alternative transcription, which allows novel DNA sequence to be expressed as RNA. It remains to be seen whether this would also be true for a randomly chosen set of novel gene functions. For that reason, and given the very considerable biological interest of the observation that novel determinants have evolved so frequently within the Diptera, we feel that stressing the message of alternative transcription might not be the best way to frame the paper.

2) The number of species used here, while quite impressive, is quite confusing for those outside the very immediate field. The species tree in Figure 1 has too little detail while the one in Figure 6 has too much and doesn't indicate the species used in this paper. Perhaps a small table based on phylogenetic relationships among the species used would help? Photos of the species used? I would mention these relationships when each new species is introduced. For the purposes of this paper, those relationships are more important than the biomedical relevance of the species chosen.

3) Some experimental inconsistencies/over-interpretations were also noted. Some examples include: 1) probes used to detect expression patterns of the three Cqu-CG9215 transcripts (probe C is located in a conserved, shared region and is not specific to C-transcript only; it likely detects the expression of all three transcripts); 2) in Nephrotoma the authors have identified a single pangolin transcript that was labeled as Nsu-pan mat – the presence/absence of zygotic transcript has to be explicitly confirmed and stated as such; stating that on the basis of its expression "..localized maternal pangolin transcript functioned as anterior determinant in ancestral dipteran" is misleading; 3) similarly, stating that "For example, the anterior determinant of Clogmia (Cal-OpaMat) lacks the N-terminal 20 amino acids (Figure 1—figure supplement 1), […] Such changes to the protein could have been important for adopting a function as anterior determinant." when in fact this difference was shown not to be significant (Figure 3B) is erroneous and misleading. Please correct any errors and re-check the full manuscript carefully to correct such cases.

Optional suggestion from one reviewer:

1) The discussion of germ plasm and gene expression at the posterior pole is certainly of interest and evidence for inductive germ cell specification in flies would be of great interest. However, I find this information distracting, at least in its current form and placement within the paper. I have never before suggested to an author that they should remove data from a manuscript but I am thinking that these data might be better presented in a separate study.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "A Range of Conserved Genes Establish Embryo Polarity in Moth Flies and Mosquitoes via Alternative Transcription" for further consideration at eLife. Your revised article has been favorably evaluated by Patricia Wittkopp as the Reviewing and Senior Editor, and three reviewers.

The manuscript has been improved but there are some remaining issues that need to be addressed before acceptance, as outlined below:

We all agreed that this study presents a very impressive body of work. The most strongly supported finding is that factors specifying the anterior end of the embryo are evolving more often than anticipated, and specific transcripts that play roles in this process are identified in moth flies and mosquito species. I think that this work will have a major impact on the field of evolution and development, and I look forward to seeing it published in eLife. However, there is one very important issue left to be resolved that requires only text changes.

This issue is the same as the primary issue raised during the first round of review: the framing of the work around alternative transcription (AT) is not justified by the strength of evidence presented that the AT was in fact responsible for the functional differences observed among species. All of the cases of AT described also include a difference in expression (presence and localization of the maternal transcript). The authors seem to be convinced that this difference in expression must be due to the differences in transcript structure (presumably differences in UTRs), but they do not do any experiments to directly demonstrate this (e.g., swapping UTRs between transcripts and showing that this causes the differences in localization, or expressing the zygotic transcript with the maternal promoter and showing this does not get localized similar to the maternal transcript). I realize that these are difficult experiments to perform in the species examined, so I am not suggesting that they be done. Rather, I’m saying that the conclusions need to be modified to reflect the fact these experiments haven't been done. For example, the different transcript structures might be a neutral byproduct of different promoters used to drive maternal and zygotic transcription. Without experiments disentangling AT from the expression differences, statements like "via Alternative Transcription" in the title, which implies causation, are inappropriate. Other examples of places where the strength of the evidence for AT is overstated include:

End of Abstract: "independently evolved the function of axis determinant via alternative transcription" (same issue as with title).

Discussion section: "All three genes are subject to alternative transcription." They are also all subject to differences in expression pattern, which should also be mentioned here.

In the response, the authors write "we wish to emphasize that AT of old genes provided opportunity for evolving the anterior determinant function." The problem is that AT facilitating this evolution isn't demonstrated. So, this is a fine possibility to raise in the discussion, but not appropriate for title and main conclusion in Abstract. Indeed, the response also says "Therefore, we propose that moth fly odd-paired evolved its specific function as axis determinant via a change in expression mediated by AT (maternal expression and localization), rather than in response to protein change." Again, proposing this model in the Discussion is fine; presenting it as the take-home message of the paper in the title and Abstract is not.

In the response to reviewers, response 3 lists ways in which it was demonstrated that there are alternative transcripts with different expression patterns. But, the reviewers' question here wasn't whether evidence of AT was sufficiently strong, but rather whether the evidence that this AT was responsible for the new function (presumably by altering localization of the transcript) was sufficiently strong.

In other places in the manuscript, the statements made about AT are appropriate. For example:

End of introduction (with a slight tweak): "Our results show that a range of distinct old genes function as anterior determinant in different species by localizing alternative maternal transcript isoforms at the anterior egg pole. We therefore propose that AT “might have” played an important role in the evolution of this gene function and gene regulatory networks in fly embryos."

Results section: "We therefore propose that this gene function evolved via co-option when alternative maternal transcript of moth fly odd-paired became enriched at the anterior egg pole. " Wording here makes it clear this is a hypothesis/model.

Results section: "Taken together, our results suggest that cucoid acquired the anterior determinant function via the localization of a maternal transcript isoform with an alternative 3' end." This is agnostic to whether localization or AT causes the effect.

Results section: AT for pan: "Taken together with the isoform-specific transcript localization data presented above, these RNAi results support the hypothesis that pangolin acquired the anterior determinant function via the localization of a maternal transcript isoform with an alternative 3' end. " Good because it includes both the AT and expression difference.

Discussion section: "it is possible that AT facilitates the evolution of anterior determinants by providing the UTR sequence for isoform-specific localization signals that do not interfere with other gene functions" and "Additional experiments will be needed to test whether the unique UTR sequences of anterior determinants are essential for their localization at the anterior egg pole." A clear statement of the missing evidence to support the model. Good.

I think that changing the title more substantially will improve the paper. There are other important elements to this work that are not related to AT that will likely be missed by readers with a title and Abstract focused so specifically on AT. That is, I think the current framing of the work will reduce its impact on the field because it masks other important results such as the change in germ cell specification and the role of slp and mira in embryo polarity in Clogmia, which are not related to AT. I agree with the reviewers that these findings could make their own nice paper, but I defer to the authors' preference to keep them in this work. Having dedicated section headings will help keep them from getting lost. Choosing a broader title less focused on AT (combined with mentioning these findings in the Abstract) would also help readers discover these results more readily. Perhaps something like "Divergent mechanisms of embryonic patterning (or polarity) among insects" would work for a title?

Finally, we think that reorganizing the Results section a bit to streamline it would improve readability of the manuscript (see reviewer comment below). However, I recognize that this is more subjective and leave it up to the authors' discretion to decide whether and how to change the manuscript in response to this feedback.

Below are some of the comments from individual reviewers that elaborate on these concerns:

1) Insistence of interpreting data as a case of AT

In the authors' words "[...] Specifically, we provide evidence that the odd-paired ortholog of moth flies evolved without essential protein change an additional function as anterior determinant by maternally expressing and localizing transcript with alternative first exon in the anterior egg. This point is supported by RACE experiments and transcriptome profiling (Figure 1C), transcript-specific expression patterns (Figure 1D), transcript-specific functions (Figure 2), and comparative gain of function experiments (Figure 3)."

I tried to understand the authors' rationale, especially regarding the uppercase sentence that seems to suggest that it is an alternative first exon (present in the Cal-opa mat transcript) that is responsible for the new function as anterior determinant. Data in Figure 1C shows the presence of two Cal-opa transcripts (Cal-OpaMat and Cal-OpaZyg) in embryos and Figure 1D shows that these two transcripts have different expression patterns. This is followed by RNAi functional experiments that show that depletion of Cal-OpaMat results in double-abdomens, while Cal-OpaZyg knockdowns have defects in segmentation and dorsal closure (Figure 2A-B). So far, so good. Then we reach Figure 3D that beautifully illustrates that Cal-OpaMat and Cal-OpaZyg are interchangeable in their ability to function as an anterior determinant. In other words, this new function in A-P axis determination is not due to changes in the protein sequence! Instead, it is driven by temporal and spatial changes in regulation of opa expression. Most people would consider this an example of regulatory evolution, which is not associated with AT per se. On that note, it is unclear why the authors imply that the first exon is responsible for the anterior determination function – this exon is absent in Cal-OpaZyg, and yet this transcript can functionally substitute for Cal-OpaMat.

2) Data presentation and logic flow

This study is rather complex, encompassing multiple species and genes and a number of different experimental approaches. The authors should keep this in mind and streamline their presentation and description of results. For example, the effects of Cal-OpaMat and Cal-OpaZyg RNAi are presented first in Figure 2A-B, to be followed by a description of Cal-nos expression in Figure 2A. Then, the text describes the effects of Cal-OpaMat mRNA injections in Figure 3A, to be followed by a description of differential expression analysis in Lutzomyia (Figure 3B-C), then back to Clogmia and testing the effects of injections of mRNA of Cal-OpaMat and Cal-OpaZyg (Figure 3D), and so on.

A streamlined presentation of results would be greatly appreciated by the general readership. This would entail keeping the current Figure 1 and text describing it, then modifying Figures 2 and 3 and reorganizing the current text, starting with the Cal-OpaMat RNAi results (current Figure 2A bottom), followed by expression studies of selected marker genes (current Figure 2A top), and followed by results of Cal-OpaMat mRNA injections (current Figure 3A). This way, all functional experiments for Cal-OpaMat are seamlessly presented in one place. This can be followed by similarly presented results for Cal-OpaZyg, and finished with results of mRNA injections of Cal-OpaMat and Cal-OpaZyg that highlight the functional interchangeability of the two transcripts.

The structure and focus of the paper as re-submitted are similar to those of the first submission, despite the consensus view of the reviewers that it might usefully be changed. I think it unwise of the authors to ignore the reviewers' advice, but I do not think this a reason to reject the paper, so long as the specific concerns of the reviewers have been adequately addressed.

The Abstract is much improved and is now fine.

The structure of the Introduction remains not to my taste, but it is OK.

The new subheadings are useful, and new paragraphs providing clarification are generally helpful.

The handling of the germ cell data in a separate section is acceptable, though reviewer 1's suggestion to publish it elsewhere seemed to me to have merit.

eLife. 2019 Oct 8;8:e46711. doi: 10.7554/eLife.46711.039

Author response


Essential revisions

1) There was consensus among the reviewers that although the authors emphasize the theme of alternative transcripts, another key result from these experiments is the range of genes that can function as an anterior determinant in different species. The reviewers found this to be the more interesting and better supported conclusion of the work. As one reviewer wrote: The case for using these data as an evidence of AT is much less convincing, and it would need to be documented by rigorous testing. In addition, there is inconsistency in experimental design and over-reaching writing that detracts from the otherwise solid study. What the authors have been able to show, without any question, is that different proteins have been recruited to the anterior pole of the embryo to provide the anterior determination in different dipterans. But, this is not the same as documenting that this is due to alternative transcription.

We agree that the range of genes that can function as an anterior determinant in different species is an interesting result of our study. However, our study also provides insight into the mechanism by which old genes evolve the anterior determinant function, and we would like to emphasize both. Specifically, we provide evidence that the odd-paired ortholog of moth flies evolved without essential protein change an additional function as anterior determinant by maternally expressing and localizing transcript with alternative first exon in the anterior egg. This point is supported by RACE experiments and transcriptome profiling (Figure 1C), transcript-specific expression patterns (Figure 1D), transcript-specific functions (Figure 2), and comparative gain of function experiments (Figure 3). Additionally, in Culex, only one of three alternative cucoid (CG9215) transcript isoforms with alternative last exon appears to be maternally expressed and localized (Figure 4B, C), and is therefore a plausible candidate for this gene’s anterior determinant function. Finally, in Anopheles, a pangolin transcript isoform with alternative last exon is maternally expressed and localized, and specifically required as anterior determinant (Figure 6B-D). The literature on AT is large but we are not aware of any comparative in vivo study that better documents the new function of an alternative transcript isoform than this case. Therefore, we wish to emphasize that AT of old genes provided opportunity for evolving the anterior determinant function.

This is not the same as claiming that the anterior determinant function evolved “due to alternative transcription”, which on its own does not necessarily result in localized gene activity at the anterior egg pole. We realize that this difference was not made sufficiently clear in the original manuscript. To clarify this issue and to better weigh our results, we modified the title of the manuscript and carefully revised all sections of the manuscript, including the summaries provided in the Abstract, at the end of the Introduction, and at the beginning of the Discussion (see highlighted text in the comparison of the original and the revised manuscript texts).

Importantly, in the Results section, we added several subheadings and rationale to better define our goals in each species. In the revised version, results in Culex and Aedes are presented in separate figures (Figure 4 and Figure 5) because these two species serve slightly different purposes. Moreover, we moved the Nephrotoma results after the Anopheles data and clarify our specific goal with this species. Our goal with Nephrotoma has been to test whether anterior-localized maternal pangolin transcript is an old dipteran heritage. This was confirmed. We did not conduct a detailed analysis of alternative pangolin transcripts in this species because an assembled genome sequence is not available and because the purpose for including this species in the analysis was to test whether pangolin preceded panish as the anterior determinant gene.

For example, in the best worked out species (Clogmia), the authors show that both maternal and zygotic opa transcripts can generate the same phenotype (double-headed larva), and consequently have the same function. In addition, the additional injections of mutated opa mRNA variants (Figure 3B) show that the obvious difference (additional 20aa in 5' end of zygotic transcript) has no effect on determining the anterior end. Surprisingly, it is the loss of 3'end sequence that has an effect. However, a closer inspection of the sequence alignment in Figure 1 supplemental shows that these regions are identical in both transcripts – hence the conundrum of explaining the difference in function.

We revised the Results section to clarify our message. In the moth fly section, we provide evidence that the C-terminal sequence is important for Cal-Opa’s function as anterior determinant but this sequence could be important for any other function of Odd-paired as well. More surprisingly, we found that large N-terminal truncations do not preclude its role as anterior determinant (Figure 3D). In agreement with this observation, we propose that the natural occurrence of the N-terminal truncation in Cal-OpaMat may not have played an important role in evolving the anterior determinant function. This hypothesis is supported by three additional lines of evidence:

· The zygotic ORF can function as anterior determinant if the corresponding transcript is presented in the pole region of very early embryos.

· The odd-paired ORFs of Drosophila and Chironomus can induce head development in Clogmia, if the corresponding transcript is presented in the pole region of very early embryos.

· The N-terminal truncation is not conserved in Lutzomyia, and the untruncated ORF of Llo-opaMat can induce head development in Clogmia, if the corresponding transcript is presented in the pole region of very early embryos.

Therefore, we propose that moth fly odd-paired evolved its specific function as axis determinant via a change in expression mediated by AT (maternal expression and localization), rather than in response to protein change. This is a very important point because it provides insight into the mechanism by which moth fly odd-paired evolved its function as anterior determinant.

To put this in context, we would like to point out that a large body of work has been conducted to better understand how an axis determinant, such as bicoid, acquired its function (e.g., Datta et al., 2017; Liu et al., 2018). This work has been conducted under the premise that protein changes enabled Bicoid’s unique function as axis determinant, but this premise remains contentious (see last paragraph in the Discussion section of Liu et al., 2018). Our results in moth flies provide proof-of-concept for an alternative mechanism by which genes can evolve the anterior determinant function.

The only clue is provided in the Discussion "All anterior determinants that we report in this study contain either 5' or 3' UTR sequence that is not shared with the corresponding zygotic isoforms." But the authors do not provide any evidence in supplemental material to support this claim. If that information exists, can't that information be used to generate opa variants that can test the functional significance of these sequences? These types of experiments are required to generate evidence of alternative transcription.

Please also refer to response above. Based on our RACE, RNA-seq, and RNA in situ hybridization results, Cal-opa (Figure 1C, D), cucoid (Figure 4B, C) and Aga-pangolin (Figure 6B, C) are subject to alternative transcription. Information on non-overlapping UTR sequences is included in these figures and all sequences have been made publicly available (see accession numbers under Data availability).

In the case of cucoid, only probe C detected localized maternal transcript. This probe targets a region that is shared by all three cucoid transcripts and therefore, on its own, cannot establish isoform-specific transcription. However, probes specific to the two longer transcript variants (cucoidA and cucoidB; including the variant with complete ORF) only detected zygotic expression, indicating that those transcripts are not localized at the anterior pole of the egg. We therefore infer that of the three identified cucoid isoforms, only the very short cucoidC isoform is localized. Directly studying the expression of this isoform is challenging due to its short unique sequence (121 nucleotides). This rationale is explained in the main text in the paragraph summarizing our results in Culex.

The paper suggests that in most cases the maternally expressed transcript is specific and likely newly evolved for this role. This is on the basis that (with one exception) the maternal transcripts are not expressed during the later stages of embryogenesis that have been examined. However, no data are shown to rule out the possibility that the maternal promoter is also expressed and functional at other life stages – for example, in the adult nervous system. It would be desirable to rule out this possibility, particularly if the authors want to stress the idea that these alternative transcripts really are novel inventions, and not simply the redeployment of an already established transcript isoform to a new role during oogenesis.

We do not know if the localized isoforms evolved specifically for their role as anterior determinant. Additional RACE experiments in Clogmia failed to identify Cal-OpaMat transcript in other tissues, including adult flies, provided that ovarian tissue was removed. However, it is conceivable that our experiments failed to detect transcript that was expressed at low level or in few cells.

In any case, it seems likely that the maternally expressed transcript isoforms are older than their ability to localize tightly at the anterior pole and to function as anterior determinant. In the revised manuscript, we emphasize that maternally expressed transcripts with alternative 5’ or 3’ end provided opportunity for evolving the anterior determinant function via transcript localization and co-option, irrespective of the evolutionary age and potential other uses of these transcript isoforms in other cell types.

The paper is written to emphasise strongly the fact that each of these novel anterior determinants is encoded by an alternative transcript form, and argues that this may be a common way for genes to acquire novel functions.

The claim about generality of AT as a mechanism and its significance is presented in the title, Abstract, Introduction, and Discussion, so there should be much stronger evidence in support of this claim. To show that AT is indeed responsible for the observed anterior axis determination the authors would have to commit to and perform rigorous experiments to show the differences in function between different transcripts, but these experiments would likely require more than two months to perform. We therefore encourage the authors to refocus the manuscript on the diversity of anterior determinants, removing prominent claims about alternative transcripts in the title and Abstract, at least.

Please refer to our first response above.

By their very nature as localised transcripts, these anterior determinants must contain specific localisation signals embedded in the RNA. This requirement may significantly increase the probability that such determinants evolve through alternative transcription, which allows novel DNA sequence to be expressed as RNA.

We agree and have made this point at the end of the first paragraph of the Discussion section. The development of a suitable assay to map the transcript localization signal of Cal-opa is something we pursue but is not trivial in this species and beyond the goal of the present study.

It remains to be seen whether this would also be true for a randomly chosen set of novel gene functions. For that reason, and given the very considerable biological interest of the observation that novel determinants have evolved so frequently within the Diptera, we feel that stressing the message of alternative transcription might not be the best way to frame the paper.

Please refer to our first and second response above.

2) The number of species used here, while quite impressive, is quite confusing for those outside the very immediate field. The species tree in Figure 1 has too little detail, while the one in Figure 6 has too much and doesn't indicate the species used in this paper. Perhaps a small table based on phylogenetic relationships among the species used would help? Photos of the species used? I would mention these relationships when each new species is introduced. For the purposes of this paper, those relationships are more important than the biomedical relevance of the species chosen.

To address this issue, we made changes to the relevant figures (Figure 1A and Figure 7) and clarified the phylogenetic position of each species in the main text.

3) Some experimental inconsistencies/over-interpretations were also noted. Some examples include: 1) probes used to detect expression patterns of the three Cqu-CG9215 transcripts (probe C is located in a conserved, shared region and is not specific to C-transcript only; it likely detects the expression of all three transcripts);

This is correct. Probe C detects all three transcripts. However, probe C is the only probe that detected localized maternal expression of this gene and probes specific to the two longer transcript variants only detected zygotic expression, indicating that those transcripts are not localized at the anterior pole of the egg. Please refer to our third response above for additional details.

2) in Nephrotoma the authors have identified a single pangolin transcript that was labeled as Nsu-pan mat – the presence/absence of zygotic transcript has to be explicitly confirmed and stated as such; stating that on the basis of its expression "localized maternal pangolin transcript functioned as anterior determinant in ancestral dipteran" is misleading;

See also our first response above. We changed the labeling to Nsu-pan and revised the conclusion: “Taken together with our Anopheles data, our results in Nephrotoma suggest that ancestral dipteran insects localized maternal pangolin transcript in the anterior egg pole, where this transcript may have functioned as anterior determinant.”

3) similarly, stating that "For example, the anterior determinant of Clogmia (Cal-OpaMat) lacks the N-terminal 20 amino acids (Figure 1—figure supplement 1), […] Such changes to the protein could have been important for adopting a function as anterior determinant." when in fact this difference was shown not to be significant (Figure 3B) is erroneous and misleading. Please correct any errors and re-check the full manuscript carefully to correct such cases.

This quote is presented out of context. We present alternative hypotheses. To avoid misunderstanding, we slightly modified the relevant paragraph (second paragraph of the Discussion).

Optional suggestion from one reviewer:

1) The discussion of germ plasm and gene expression at the posterior pole is certainly of interest and evidence for inductive germ cell specification in flies would be of great interest. However, I find this information distracting, at least in its current form and placement within the paper. I have never before suggested to an author that they should remove data from a manuscript but I am thinking that these data might be better presented in a separate study.

We considered this possibility before submitting our manuscript. We decided against it because the question of a potential loss of the maternal germ plasm is raised by our profiling data. The loss of maternal germ plasm in Clogmia is supported by the duplication of nos-positive cells in double abdomens and adds confidence in the transcript localization data (Figure 1B). To address the reviewer's concern, we now present these data under a separate subheading. This change gives readers the option to skip this section.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

[…] This issue is the same as the primary issue raised during the first round of review: the framing of the work around alternative transcription (AT) is not justified by the strength of evidence presented that the AT was in fact responsible for the functional differences observed among species. All of the cases of AT described also include a difference in expression (presence and localization of the maternal transcript). The authors seem to be convinced that this difference in expression must be due to the differences in transcript structure (presumably differences in UTRs), but they do not do any experiments to directly demonstrate this (e.g., swapping UTRs between transcripts and showing that this causes the differences in localization, or expressing the zygotic transcript with the maternal promoter and showing this does not get localized similarly to the maternal transcript). I realize that these are difficult experiments to perform in the species examined, so I am not suggesting that they be done. Rather, I’m saying that the conclusions need to be modified to reflect the fact these experiments haven't been done. For example, the different transcript structures might be a neutral byproduct of different promoters used to drive maternal and zygotic transcription. Without experiments disentangling AT from the expression differences, statements like "via Alternative Transcription" in the title, which implies causation, are inappropriate. Other examples of places where the strength of the evidence for AT is overstated include:

End of Abstract: "independently evolved the function of axis determinant via alternative transcription" (same issue as with title).

We actually agree with the reviewers on this point. Maternal expression and anterior transcript localization (or activity localization) seem to be key for evolving the anterior determinant function. However, whether the localization signal or maternal expression came first (or together), and whether the localization signal is confined to transcript-specific sequence of the maternal isoform, may vary from case to case.

AT can accommodate the change in expression for evolving the anterior determinant function (maternal plus localized activity) by providing a promoter for maternal expression and/or by providing sequence for evolving a strong localization signal de novo, or for strengthening a more diffuse pre-existing localization signal. In each case, AT makes it possible to minimize interference with pre-existing gene regulation and function when a new anterior determinant evolves. But this is not to say that the localization signals of anterior determinants must be confined to transcript-specific sequence of the maternal isoform.

In this context, we would like to point out that the transcripts of many pair-rule genes of Drosophila contain localization signals of variable length, and would likely be localized like bicoid if expressed maternally, because they use the same microtubule-dependent machinery (e.g., Bullock, Stauber … and Schmidt-Ott, 2004, and other work from Bullock et al.). Conversely, bicoid mRNA is localized apically like pair-rule gene transcripts when injected into the syncytial blastoderm.

To avoid being misunderstood, we removed the expression “via alternative transcription” in the title, which now reads: “Embryo polarity in moth flies and mosquitoes relies on distinct old genes with localized transcript isoforms”. Additionally, we revised the Abstract and conclude: “In conclusion, flies evolved an unexpected diversity of anterior determinants, and alternative transcript isoforms with distinct expression can adopt fundamentally distinct developmental roles.” The latter point is mainly based on our Clogmia results but seems important to us, because the neglect of alternative transcription in the evo devo literature suggests that this possibility is not widely appreciated in this field.

Discussion section: "All three genes are subject to alternative transcription." They are also all subject to differences in expression pattern, which should also be mentioned here.

The sentence appears in the first paragraph of the Introduction and was changed to: “All three genes not only localize their maternal transcript at the anterior egg pole; they also are subject to alternative transcription, which allows a single gene to generate multiple transcript isoforms with distinct 5’ and 3’ ends through the use of alternative promoters (alternative transcription initiation) and polyadenylation signals (alternative transcription termination).”

The additional explanation was necessary, because we shifted the section on alternative transcription from the Introduction to the Discussion.

In the response, the authors write "we wish to emphasize that AT of old genes provided opportunity for evolving the anterior determinant function." The problem is that AT facilitating this evolution isn't demonstrated. So, this is a fine possibility to raise in the discussion, but not appropriate for title and main conclusion in Abstract. Indeed, the response also says "Therefore, we propose that moth fly odd-paired evolved its specific function as axis determinant via a change in expression mediated by AT (maternal expression and localization), rather than in response to protein change." Again, proposing this model in the discussion is fine; presenting it as the take-home message of the paper in the title and Abstract is not.

We moved the introductory paragraphs on alternative transcription, in a condensed form, to the Discussion section.

In the response to reviewers, response above lists ways in which it was demonstrated that there are alternative transcripts with different expression patterns. But, the reviewers' question here wasn't whether evidence of AT was sufficiently strong, but rather whether the evidence that this AT was responsible for the new function (presumably by altering localization of the transcript) was sufficiently strong.

We do not wish to claim that AT per se was responsible for the new function, but AT could have facilitated the evolution of localization signals and/or maternal expression without interfering with pre-existing gene regulation/function (as discussed under response above).

In other places in the manuscript, the statements made about AT are appropriate. For example:

End of introduction (with a slight tweak): "Our results show that a range of distinct old genes function as anterior determinant in different species by localizing alternative maternal transcript isoforms at the anterior egg pole. We therefore propose that AT “might have” played an important role in the evolution of this gene function and gene regulatory networks in fly embryos."

The end of the Introduction now concludes with: “Our results reveal three distinct old genes that evolved anterior determinants by localizing an alternative maternal transcript isoform at the anterior egg pole of the respective species. Therefore, alternative transcription might have played an important role in the evolution of this gene function and gene regulatory networks in fly embryos.”

Results section: "We therefore propose that this gene function evolved via co-option when alternative maternal transcript of moth fly odd-paired became enriched at the anterior egg pole." Wording here makes it clear this is a hypothesis / model.

Results section: "Taken together, our results suggest that cucoid acquired the anterior determinant function via the localization of a maternal transcript isoform with an alternative 3' end." This is agnostic to whether localization or AT causes the effect.

Results section: AT for pan: "Taken together with the isoform-specific transcript localization data presented above, these RNAi results support the hypothesis that pangolin acquired the anterior determinant function via the localization of a maternal transcript isoform with an alternative 3' end." Good because it includes both the AT and expression difference.

Discussion section: "it is possible that AT facilitates the evolution of anterior determinants by providing the UTR sequence for isoform-specific localization signals that do not interfere with other gene functions" and "Additional experiments will be needed to test whether the unique UTR sequences of anterior determinants are essential for their localization at the anterior egg pole." A clear statement of the missing evidence to support the model. Good.

Thank you for this helpful feedback.

I think that changing the title more substantially will improve the paper. There are other important elements to this work that are not related to AT and that will likely be missed by readers with a title and Abstract focused so specifically on AT. That is, I think the current framing of the work will reduce its impact on the field because it masks other important results, such as the change in germ cell specification and the role of slp and mira in embryo polarity in Clogmia, which are not related to AT. I agree with the reviewers that these findings could make their own nice paper, but I defer to the authors' preference to keep them in this work. Having dedicated section headings will help keep them from getting lost.

We added the following sentence in the Abstract: “Additionally, Clogmia lost maternal germ plasm, which contributes to embryo polarity in fruit flies (Drosophila).” In the Results section, we discuss these data at the end of the Clogmia section under the sub-heading: “Cal-OpaMat suppresses zygotic germ cell specification at the anterior pole and Clogmia lacks maternal germ plasm”.

Choosing a broader title less focused on AT (combined with mentioning these findings in the Abstract) would also help readers discover these results more readily. Perhaps something like "Divergent mechanisms of embryonic patterning (or polarity) among insects" would work for a title?

As mentioned above, we revised the title: “Embryo polarity in moth flies and mosquitoes relies on distinct old genes with localized transcript isoforms”. We hope this title is acceptable. It focuses on facts that we wish to stress but does not imply causality. The title that you suggest would apply equally well to several previous studies on axis specification in Tribolium, Nasonia, and Chironomus, and would fail to specify what is new about the present study. Our study is the first to show that the anterior determinant function can evolve via a change in expression alone.

Finally, we think that reorganizing the Results section a bit to streamline it would improve readability of the manuscript (see reviewer comment below). However, I recognize that this is more subjective and leave it to the authors' discretion to decide whether and how to change the manuscript in response to this feedback.

We essentially followed the reviewer suggestions (2). Specifically, we moved the paragraphs on AT from the Introduction to the last section of the Discussion. We wish to discuss our results in the broader context of this literature because, to our knowledge, there is a lack of well-documented examples in which alternative transcripts of a gene have taken on clearly distinct developmental roles.

Importantly, we also reorganized the Results section, keeping the Clogmia data together, as suggested by the reviewer. Accordingly, some main figure compositions, and the numbering of some supplementary figures, have changed in the revised manuscript. In the revised manuscript, the Clogmia section ends with the loss of maternal germ plasm in this species. We think that it is important to include these data, not only because the volcano plot raises this issue, but also because germ plasm has been implicated in axis specification (e.g., Nos-dependent maternal Hb gradient in Drosophila embryos). To clarify the rationale for including this section, we made slight changes to it and added two references that established the role of the maternal Hb gradient in axis specification in Drosophila (Tautz, 1988 and Struhl et al., 1992).

We did not expand the section on sloppy-paired and miranda transcripts in Clogmia, because the localization of these transcripts does not seem to be essential for axis specification (see gain of function experiments with Cal-opa), and because their localization is not conserved in the other moth fly, Lutzomyia, unlike the localization of odd-paired transcript.

Below are some of the comments from individual reviewers that elaborate on these concerns:

1) Insistence of interpreting data as a case of AT

In the authors' words "[…] Specifically, we provide evidence that the odd-paired ortholog of moth flies evolved without essential protein change an additional function as anterior determinant by maternally expressing and localizing transcript with alternative first exon in the anterior egg. This point is supported by RACE experiments and transcriptome profiling (Figure 1C), transcript-specific expression patterns (Figure 1D), transcript-specific functions (Figure 2), and comparative gain of function experiments (Figure 3)."

I tried to understand the authors' rationale, especially regarding the uppercase sentence that seems to suggest that it is an alternative first exon (present in the Cal-OpaMat transcript) that is responsible for the new function as anterior determinant. Data in Figure 1C show the presence of two Cal-opa transcripts (Cal-OpaMat and Cal-OpaZyg) in embryos, and Figure 1D shows that these two transcripts have different expression patterns. This is followed by RNAi functional experiments that show that depletion of Cal-OpaMat results in double-abdomens, while Cal-OpaZyg knockdowns have defects in segmentation and dorsal closure (Figure 2A-B). So far, so good. Then we reach Figure 3D, which beautifully illustrates that Cal-OpaMat and Cal-OpaZyg are interchangeable in their ability to function as an anterior determinant. In other words, this new function in A-P axis determination is not due to changes in the protein sequence! Instead, it is driven by temporal and spatial changes in regulation of opa expression. Most people would consider this an example of regulatory evolution, which is not associated with AT per se.

We do indeed propose that regulatory evolution was key. Maternal expression and transcript localization (gene activity localization) had to be achieved for odd-paired to adopt the role of anterior determinant. In the case of Cal-opa, the maternal promoter resulted from AT, and we suspect that the Cal-OpaMat-specific 5’UTR (also a result of AT) provided opportunity for evolving a transcript-specific localization signal after maternal expression. Alternatively, the localization signal was already in place (anywhere in the Cal-opa transcript) and evolved before maternal expression. A combination of both models is also possible. The links of AT to transcript localization can be manifold, as outlined under response one above, and probably provide yet another example of opportunistic evolution. What we wish to make clear is that AT is a common theme in the evolution of anterior determinants but its specific use varies case by case.

On that note, it is unclear why the authors imply that the first exon is responsible for the anterior determination function: this exon is absent in Cal-OpaZyg, and yet this transcript can functionally substitute for Cal-OpaMat.

In the gain-of-function experiments, we tested mRNAs with heterologous UTRs provided by the vector, i.e., only the open reading frames of these transcripts were compared under the same artificial “localization” conditions (mRNA injection at the posterior pole). The distinct open reading frames of Cal-OpaMat and Cal-OpaZyg were functionally indistinguishable in our assay, indicating that the alternative transcription start site and/or unique 5’UTR of Cal-OpaMat (both a result of AT) were key for evolving the anterior determinant function of odd-paired in moth flies.

2) Data presentation and logic flow

This study is rather complex, encompassing multiple species and genes and a number of different experimental approaches. The authors should keep this in mind and streamline their presentation and description of results. For example, the effects of Cal-OpaMat and Cal-OpaZyg RNAi are presented first in Figure 2A-B, to be followed by a description of Cal-nos expression in Figure 2A. Then, the text describes the effects of Cal-OpaMat mRNA injections in Figure 3A, to be followed by a description of differential expression analysis in Lutzomyia (Figure 3B-C), then back to Clogmia and testing the effects of injections of mRNA of Cal-OpaMat and Cal-OpaZyg (Figure 3D), and so on.

A streamlined presentation of results would be greatly appreciated by general readership. This would entail keeping the current Figure 1 and text describing it, then modifying Figure 2 and 3 and reorganizing current text starting with Cal-OpaMat RNAi results (current Figure 2A bottom), followed by expression studies of selected marker genes (current Figure 2A top), and followed by results of Cal-OpaMat mRNA injections (current Figure 3A). This way, all functional experiments for Cal-OpaMat are seamlessly presented at one place. This can be followed by similarly presented results for Cal-OpaZyg, and finished with results of mRNA injections of Cal-OpaMat and Cal-OpaZyg that highlight the functional interchangeability of two transcripts.

The structure and focus of the paper as re-submitted are similar to those of the first submission, despite the consensus view of the reviewers that it might usefully be changed. I think it unwise of the authors to ignore the reviewers' advice, but I do not think this a reason to reject the paper, so long as the specific concerns of the reviewers have been adequately addressed.

The Abstract is much improved and is now fine.

The structure of the Introduction remains not to my taste, but it is OK.

The new subheadings are useful, and new paragraphs providing clarification are generally helpful.

The handling of the germ cell data in a separate section is acceptable, though reviewer 1's suggestion to publish it elsewhere seemed to me to have merit.

We are grateful for these comments and implemented them as described under response one above.

Associated Data

    Data Citations

    1. Yoon Y, Klomp J, Martin-Martin I, Criscione F, Calvo E, Ribeiro J, Schmidt-Ott U. 2018. Evolution of an Embryonic Axis Determinant via Alternative Transcription. NCBI Bioproject. PRJNA454000
    2. Xiaofang Jiang. 2012. Anopheles stephensi strain:Indian Wild Type (Walter Reid) Transcriptome or Gene expression. NCBI Bioproject. PRJNA168517

    Supplementary Materials

    Supplementary file 1. Annotation and quantitation of Clogmia transcriptome.

    Gene names are those indicated as best-reciprocal-blast hits (see Methods). Contig names correspond to transcriptome assembly. Read counts are given based on non-unique mapping of pre-processed RNA-seq data to annotated transcriptome.

    elife-46711-supp1.csv (747.2KB, csv)
    DOI: 10.7554/eLife.46711.026
    Supplementary file 2. Annotation and quantitation of Lutzomyia transcriptome.

    Gene names are those indicated as best-reciprocal-blast hits (see Methods). Contig names correspond to transcriptome assembly. Read counts are given based on non-unique mapping of pre-processed RNA-seq data to annotated transcriptome.

    elife-46711-supp2.csv (714.7KB, csv)
    DOI: 10.7554/eLife.46711.027
    Supplementary file 3. Annotation and quantitation of Culex transcriptome.

    Gene names are those indicated as best-reciprocal-blast hits (see Methods). Contig names correspond to transcriptome assembly. Read counts are given based on non-unique mapping of pre-processed RNA-seq data to annotated transcriptome.

    elife-46711-supp3.csv (852.7KB, csv)
    DOI: 10.7554/eLife.46711.028
    Supplementary file 4. Annotation and quantitation of Aedes transcriptome.

    Gene names are those indicated as best-reciprocal-blast hits (see Methods). Contig names correspond to transcriptome assembly. Read counts are given based on non-unique mapping of pre-processed RNA-seq data to annotated transcriptome.

    elife-46711-supp4.csv (434.7KB, csv)
    DOI: 10.7554/eLife.46711.029
    Supplementary file 5. Annotation and quantitation of Anopheles transcriptome.

    Gene names are those indicated as best-reciprocal-blast hits (see Methods). Contig names correspond to transcriptome assembly. Read counts are given based on non-unique mapping of pre-processed RNA-seq data to annotated transcriptome.

    elife-46711-supp5.csv (1.1MB, csv)
    DOI: 10.7554/eLife.46711.030
    Supplementary file 6. Annotation and quantitation of Nephrotoma transcriptome.

    Gene names are those indicated as best-reciprocal-blast hits (see Methods). Contig names correspond to transcriptome assembly. Read counts are given based on non-unique mapping of pre-processed RNA-seq data to annotated transcriptome.

    elife-46711-supp6.csv (660.7KB, csv)
    DOI: 10.7554/eLife.46711.031
    Transparent reporting form
    DOI: 10.7554/eLife.46711.032

    Data Availability Statement

    This project was deposited at the National Center for Biotechnology Information under Bioproject ID PRJNA454000 and the reads were deposited in the Short Reads Archives under accessions SRR7132661, SRR7132662, SRR7132659, SRR7132660, SRR7132665, SRR7132666, SRR7132663 and SRR7132664 for Clogmia; SRR7134470, SRR7134469, SRR7134472, SRR7134471, SRR7134468 and SRR7134467 for Lutzomyia; SRR8729860, SRR8729859, SRR8729858, SRR8729857, SRR8729856 and SRR8729855 for Anopheles; SRR8729854, SRR8729853, SRR8729852 and SRR8729851 for Aedes; SRR8729868, SRR8729867, SRR8729870, SRR8729869, SRR8729864 and SRR8729863 for Culex; and SRR8729866, SRR8729865, SRR8729872, SRR8729871, SRR8729861 and SRR8729862 for Nephrotoma. Transcript sequences are listed in the Key Resources Table.
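
For readers who want to work with these reads programmatically, the accession list in the statement above can be written as a simple lookup table. The following is a minimal sketch: the accession strings and genus names are copied from the statement, while the dictionary name `SRA_RUNS` and the helper `runs_for` are hypothetical names introduced here for illustration only and are not part of the study.

```python
# Run accessions grouped by genus, copied from the Data Availability Statement.
# SRA_RUNS and runs_for are illustrative names (assumptions), not the authors' code.
SRA_RUNS = {
    "Clogmia": [
        "SRR7132661", "SRR7132662", "SRR7132659", "SRR7132660",
        "SRR7132665", "SRR7132666", "SRR7132663", "SRR7132664",
    ],
    "Lutzomyia": [
        "SRR7134470", "SRR7134469", "SRR7134472",
        "SRR7134471", "SRR7134468", "SRR7134467",
    ],
    "Anopheles": [
        "SRR8729860", "SRR8729859", "SRR8729858",
        "SRR8729857", "SRR8729856", "SRR8729855",
    ],
    "Aedes": ["SRR8729854", "SRR8729853", "SRR8729852", "SRR8729851"],
    "Culex": [
        "SRR8729868", "SRR8729867", "SRR8729870",
        "SRR8729869", "SRR8729864", "SRR8729863",
    ],
    "Nephrotoma": [
        "SRR8729866", "SRR8729865", "SRR8729872",
        "SRR8729871", "SRR8729861", "SRR8729862",
    ],
}


def runs_for(genus):
    """Return a copy of the run accessions listed for one genus."""
    return list(SRA_RUNS[genus])


# Flatten the table to count every run listed in the statement.
ALL_RUNS = [run for runs in SRA_RUNS.values() for run in runs]

if __name__ == "__main__":
    for genus, runs in SRA_RUNS.items():
        print(f"{genus}: {len(runs)} runs")
    print(f"total: {len(ALL_RUNS)} runs")
```

Keeping the accessions in one table makes it easy to check that no run is listed twice before passing them to whatever download tool is used.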

    Sequencing data have been deposited at the National Center for Biotechnology Information Sequence Read Archive (Bioproject ID PRJNA454000).

    The following dataset was generated:

    Yoon Y, Klomp J, Martin-Martin I, Criscione F, Calvo E, Ribeiro J, Schmidt-Ott U. 2018. Evolution of an Embryonic Axis Determinant via Alternative Transcription. NCBI Bioproject. PRJNA454000

    The following previously published dataset was used:

    Xiaofang Jiang. 2012. Anopheles stephensi strain:Indian Wild Type (Walter Reid) Transcriptome or Gene expression. NCBI Bioproject. PRJNA168517


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
