Abstract
In this report, we document an unusual mode of tissue-enriched gene expression that is primarily mediated by alternative and inefficient splicing. We have analyzed posttranscriptional regulation of the Drosophila erect wing gene, which provides a vital neuronal function and is essential for the formation of certain muscles. Its predominant protein product, the 116-kDa EWG protein, a putative transcriptional regulator, can provide all known erect wing-associated functions. Moreover, consistent with its function, the 116-kDa protein is highly enriched in neurons and is also observed transiently in migrating myoblasts. In contrast to the protein distribution, we observed that erect wing transcripts are present at comparable levels in neuron-enriched heads and neuron-poor bodies of adult Drosophila. Our analyses show that the erect wing transcript consists of 10 exons and is alternatively spliced and that a subset of introns are inefficiently spliced. We also show that the 116-kDa EWG protein-encoding splice isoform is head enriched. In contrast, bodies have lower levels of transcripts that can encode the 116-kDa protein and greater amounts of unprocessed erect wing RNA. Thus, the enrichment of the 116-kDa protein in heads is ensured by tissue-specific alternative and inefficient splicing and not by transcriptional regulation. Furthermore, this regulation is biologically important, as an increased level of the 116-kDa protein outside the nervous system is lethal.
Most eukaryotic primary RNA transcripts undergo posttranscriptional processing requiring splicing of introns. The best-appreciated regulatory outcome of posttranscriptional processing is alternatively spliced transcripts that differ in the coding exons or have distinct 3′ or 5′ untranslated ends (reviewed in reference 32). A second consequence of posttranscriptional regulation is the modulation of amounts of specific transcripts dependent on differential splicing efficiencies of different splice sites. It is generally thought that differences in cell-type-specific splicing machineries result in cell type-enriched or -specific alternative splicing (5, 19). In addition, efficiency of splicing could play a major role in gene regulation as primary transcripts that are not completely processed are generally not transported to the cytoplasm and are unlikely to code functional proteins (6, 21, 23). We decided to investigate the role of alternative and inefficient splicing in the regulation of the Drosophila erect wing (ewg) gene, as previous studies indicated a complex transcript profile, intron-containing cDNAs, as well as poly(A)+ transcripts with retained introns (9).
The Drosophila ewg gene provides a function that is vital in the nervous system and essential to the development of certain muscles (16). EWG protein contains an unusual DNA binding domain that is homologous to sea urchin P3A2 protein (4, 10), zebrafish Nrf (3), and mammalian transcription factors NRF-1 and initiation binding receptor (13, 18, 31). Our previous studies suggested that ewg primary transcript may be alternatively spliced, since the ewg gene has several introns and its Northern pattern shows multiple transcripts that are tissue and developmental stage modulated (15). However, at the protein level, only one major polypeptide, a 116-kDa, 733-amino-acid-long polypeptide encoded by the SC3 cDNA open reading frame (ORF), was observed in immunoblot analysis, although many other cross-reacting bands were also observed (9, 10). The translation start site of the SC3 ORF is an unconventional CTG codon, suggesting that translational regulation of ewg may be an important aspect of ewg regulation (10). Transgenes expressing the 116-kDa EWG protein provide compelling evidence that the 116-kDa protein is the major functional protein, as expression of 116-kDa protein in the neurons rescues lethality and general expression rescues both lethal and muscle phenotypes associated with ewg alleles (8, 10). An antibody generated against the 116-kDa EWG protein selectively labels all neurons in the embryonic and larval stages and certain migrating myoblasts in early pupae (8–10), suggesting a distinct tissue-specific expression of the protein and possibly transcript.
We investigated the splicing patterns of ewg RNA to determine whether ewg transcripts are indeed alternatively or inefficiently spliced and whether the pattern of splicing shows tissue-specific differences. In this paper, we report the results on ewg splicing, using reverse transcription (RT)-PCR in head and body RNAs as representative of neuron-enriched and neuron-poor tissue, respectively. Our results show the following. (i) ewg is more widely transcribed than previously recognized, and total ewg RNA levels in heads and bodies are comparable. (ii) A subset of ewg introns are efficiently spliced, but another subset are inefficiently spliced and retained in poly(A)+ RNA. (iii) ewg RNA in bodies has a greater representation of unprocessed RNAs and of RNAs that include two exons that are not part of the SC3 ORF. One of these new exons is not included in ewg transcripts present in heads. (iv) SC3 ORF RNA is enriched in adult heads but low in the bodies. (v) Modest expression of the SC3-encoded ORF in the body can be lethal. Thus, ewg, which is widely transcribed, is primarily regulated by posttranscriptional mechanisms.
MATERIALS AND METHODS
Fly stocks and genetic crosses.
Drosophila melanogaster flies were raised on standard media at 25°C. The Canton-S strain was used as the wild type. EWGHS and EWGNS are two white+ marked transgenes that encode the 116-kDa EWG protein isoform under the control of the heat shock hsp promoter and the neuron-specific elav promoter, respectively (8). ewgl1 is a lethal, protein-null allele of the ewg gene (10). Df(1)cin-arth uncovers several loci, including cin, ewg, and y (14). Dp243 is a free duplication derived from Dp1187 which is y+ and also carries a P element marked with rosy (35).
Crosses to check rescue of ewg deletion by transgene rescue consisted of crossing females of genotype Df(1)cin-arth wa vOf f/FM7a; EWGNS to y; ry; Dp243 y+,ry+ males; Df(1)cin-arth wa vOf f/Y; Dp243 y+ males have a synthetic deletion of the ewg locus. Males of the genotype Df(1)cin-arth y wa vOf f/Y; EWGNS/+; Dp243 y+ ry+ were found at the expected frequency, while flies without EWGNS4 did not survive.
To assess the impact of overall increased expression of the 116-kDa protein, females of the genotype ewgl1 y w sn/w; +/+; EWGHS1/TM6,Tb were crossed to yw/Y; +/+; EWGHS7/+ males. To assess EWG protein levels expressed by the EWGHS genes, females of the genotype ewgl1 y w sn/w; +/+; EWGHS1/TM6,Tb were crossed to ewgl1/Y; +/+; EWGHS7/+ males.
RT-PCR.
Total RNA was isolated using Trizol reagent (GIBCO-BRL) from heads and bodies of 2-day-old adults. After DNase I (GIBCO-BRL) treatment, RT of 1 μg of total RNA was primed with an oligo(dT) or gene-specific probe, using a Superscript II cDNA synthesis kit (GIBCO-BRL) according to the manufacturer’s instructions except that the RNA was kept at 50°C for 5 min before initiation of the RT reaction. The manufacturer’s instructions were followed to synthesize cDNAs primed by random hexamers. The RNase H step was omitted. Controls were done with no RNA and no reverse transcriptase. The sequence of the gene-specific probe is 5′-ACACTGTTCCATCGCTGTTCGT-3′, which hybridizes to exon H. Cycle parameters for the PCRs were 30 s at 95°C, 40 s at 60°C, and 45 s at 72°C for 30 cycles, with an initial 2 min at 95°C and a final 8-min extension at 72°C. All PCRs were carried out in 50 μl, 8 μl of which was loaded on agarose gels. The Mg2+ concentration was optimized for each primer pair. Taq polymerase was from GIBCO-BRL, and PCR conditions were according to the manufacturer’s instructions. Primers were used at a final concentration of 4 ng/μl. Primer positions are outlined in Fig. 1B, and their sequences are shown in Table 1. cDNA and genomic sequences were used for primer design. Primer3, a web-based software program by Rozen and Skaletsky (27a), was used to assist in primer design. Identity of PCR bands was determined by restriction digests, internal primer PCRs, and/or direct sequencing. Direct sequencing was done with ABI automated sequencing equipment.
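As a rough cross-check of the cycling program and primer concentration given above, the following sketch adds up the programmed hold times and converts 4 ng/μl to an approximate molarity. It ignores ramp times, and the average mass of 330 g/mol per nucleotide is a common rule-of-thumb assumption rather than a value from this study; the function names are illustrative only.

```python
# Rough arithmetic for the PCR program in the text (hold times only, no ramping).


def program_seconds(initial_s, steps_s, cycles, final_s):
    """Total programmed hold time for a simple three-step PCR."""
    return initial_s + cycles * sum(steps_s) + final_s


def primer_micromolar(ng_per_ul, length_nt, g_per_mol_per_nt=330.0):
    """Approximate primer molarity; 330 g/mol per nucleotide is an assumed average."""
    grams_per_litre = ng_per_ul * 1e-3  # 1 ng/ul is the same as 1 mg/l
    mol_per_litre = grams_per_litre / (length_nt * g_per_mol_per_nt)
    return mol_per_litre * 1e6


# 2 min at 95 degC, then 30 cycles of 30 s / 40 s / 45 s, then 8 min at 72 degC
total = program_seconds(initial_s=120, steps_s=(30, 40, 45), cycles=30, final_s=480)
print(f"programmed time: {total} s ({total / 60:.1f} min)")

# Most primers in Table 1 are about 22 nt long
print(f"22-nt primer at 4 ng/ul: ~{primer_micromolar(4, 22):.2f} uM")

# 8 ul of each 50-ul reaction was loaded on the gel
print(f"fraction of each reaction loaded: {8 / 50:.0%}")
```

Under these assumptions each primer is present at roughly half a micromolar, and about one-sixth of each reaction is visualized on the gel.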
FIG. 1.
Schematic representation of ewg exons, cDNAs, and PCR primers. (A) Genomic map of the ewg locus and two cDNAs, MPA-1 and SC3, that share the SC3 ORF shown as filled boxes (adapted from reference 9). A map of all characterized ewg exons (A to J) and the nomenclature of alternatively spliced introns is shown below the SC3 cDNA. Alternative splicing occurs only in introns 3 and 6. The new exons E and I are present within introns 3c and 6, respectively. (B) The primers are named according to the intron excision events that they were used to assess; for example, 2F and 2R amplify transcripts that span intron 2. Primers designated In were used to amplify intron-containing transcripts; all In primers with the exception of In1R and In3cF are within introns. Note that introns 1 and 6 are not drawn to scale. (C) Sequence of the 74-bp ewg exon E. The underlined nucleotide T is a silent nucleotide polymorphism, as a C in this position was found by genomic sequencing. (D) Sequence of the body-enriched 38-bp exon I.
TABLE 1.
PCR primers
Splice event | Forward primer (sequence, 5′ to 3′) | Return primer (sequence, 5′ to 3′) | Mg2+ (mM) |
---|---|---|---|
rp49 | Frp49 (AAGCTGTCGCACAAATGGCG) | Rrp49 (TACCTCGTTCTTCTTGAGACGCA) | 2.5 |
1 | 1F1 (GGGACAGGCAGCTGAAAACTA) | 1R1 (TTCCAGTCGAACTCCTTTGGCT) | 1.5 |
Intron 1 | In1F (AGCAAACAAACAACGAACGAGC) | In1R (TGCTTCCGATCCTCTTCTTTCC) | 1 |
2 | 2F1 (GAGCCTTTATTCCGCTGATGCT) | 2R1 (TCTTGCGTAGAGCATGTGTCCA) | 2.5 |
Intron 2 | 2F1 (GAGCCTTTATTCCGCTGATGCT) | In2R (TAGGATTCAAAGGAGATCGGGG) | 1.5 |
3a | 3aF1 (ATCCTCCTACCATCCACACGGT) | 3aR1 (TGATTGTCGGAGCAGGAGTGTT) | 1 |
Intron 3a | 3aF1 (ATCCTCCTACCATCCACACGGT) | In3aR (GACTTATCACAGGGGAGTCGCA) | 1 |
3c | 3cF1 (CTCAAGTCTGCATCGAGCCAAT) | 3cR1 (TGAACCTGGGCAGTTGTACCAT) | 1 |
Intron 3c | In3cF (AGATCGCCAATGCTCAAGTCTG) | In3cR (TTGGGAAGCACAACAGCAATTT) | 1 |
3b | 3aF1 (ATCCTCCTACCATCCACACGGT) | 3cR1 (TGAACCTGGGCAGTTGTACCAT) | 1 |
4 | 4F1 (ATGGTACAACTGCCCAGGTTCA) | 4R1 (CCATCCTGTGTCTCGGTGAGAT) | 1 |
Intron 4 | In4F (ACTACAAACGCATTCGAGTCCT) | 4R1 (CCATCCTGTGTCTCGGTGAGAT) | 1.5 |
5 | 5F1 (AAGGCGGTGCTACCATTCAAAC) | 5R1 (TGAGATCACATTGCTCACCGAA) | 1 |
Intron 5 | 5F1 (AAGGCGGTGCTACCATTCAAAC) | In5R (TGCGAGTTCGGTAAGTGCGTAT) | 1.5 |
6 | 6F1 (ATATCCCGTTTCGGTGAGCAAT) | 6R1 (CGGAATTAATGGCCTCCATAGC) | 1 |
Intron 6 | In6F1 (CGCGGAGAAATGAGTTTACGAG) | 6R1 (CGGAATTAATGGCCTCCATAGC) | 1.5 |
Additional return primers | 38R (AGTTTTCCTATCTGCATCGGCG) | RV (TTTTCCCGGTGGATCTCTTGTT) | 1.5 |
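For readers who want a quick sanity check on the primers in Table 1, the sketch below computes length, GC content, and a Wallace-rule melting temperature estimate for a few of them. The Wallace rule is a crude approximation intended for short oligonucleotides and is not the method used by Primer3; it is shown here only as an illustrative assumption, and the helper names are hypothetical.

```python
# Length, GC content and a Wallace-rule Tm estimate for primers copied from Table 1.

PRIMERS = {
    "Frp49": "AAGCTGTCGCACAAATGGCG",
    "Rrp49": "TACCTCGTTCTTCTTGAGACGCA",
    "4F1": "ATGGTACAACTGCCCAGGTTCA",
    "5R1": "TGAGATCACATTGCTCACCGAA",
}


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Wallace rule: 2 degC per A/T plus 4 degC per G/C (rough estimate only)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(
        f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
        f"Wallace Tm ~{wallace_tm(seq)} degC"
    )
```

Such estimates tend to run high for primers of this length, so they are best read as a relative comparison between primers rather than as a guide to the 60°C annealing step used here.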
Sequencing of ewgl1.
ewgl1 y w sn embryos were collected 24 h after egg laying and selected by the y marker after dechorionation. DNA was extracted by homogenizing ∼50 embryos in 100 mM NaCl–10 mM Tris-HCl (pH 7.5)–1 mM EDTA. The homogenate was incubated in 1% sodium dodecyl sulfate (SDS)–1 mg of proteinase K per ml for 16 h at 55°C, phenol-chloroform extracted, and precipitated. A 2.4-kb genomic fragment spanning exons B to D was amplified by PCR with primers In1F and In3cR, using Pwo polymerase (Boehringer Mannheim). This fragment was then reamplified by using primers In1F/In2R and 2F1/e4R and sequenced on both strands. The sequence was compared to both cDNA and genomic DNA sequences, which match.
Protein expression.
EWG protein was divided into three fragments for expression: exons B and C (EH1), exon D (EH2), and exons F to H containing J (EM3). Additionally, we made two fragments containing overlapping parts: exons B to D (EH4) and exons D to H containing J (EH5). Fragments were amplified from the SC3 cDNA, using the following primers containing an XbaI site in the return primer for cloning: 5′-CTGGCCACCACAAGCTATC-3′ (e23F) and 5′-GCTCTAGATCAGTTATTGCTGTTGCCCGTC-3′ (e23R) for EH1, 5′-CAACCGCAGCAGGTGAAT-3′ (e4F) and 5′-GCTCTAGATCAATCAACATCGCTGAGCGTAA-3′ (e4R) for EH2, and 5′-TACACCACGCAAACGGTC-3′ (e6-8F) and 5′-GCTCTAGATCAGCTCCAGCTATTGTTCCAT-3′ (e10R) for EM3. EM3 was cloned into pMal-c2 (New England Biolabs), using the XmnI and XbaI sites in pMal, yielding an N-terminal fusion to maltose binding protein (MBP). The remaining fragments were cloned into pSG05 via the SnaBI and NheI sites (17), yielding an N-terminal His tag, since we were unable to clone these fragments with the pMal system. Protein expression was done as described previously (17) for the EH fusions and according to the manufacturer’s instructions for the EM fusions.
Immunoblot analysis.
Drosophila protein extracts were prepared and resolved as previously described (30). Drosophila embryos were 14 to 18 h old. Bacterial extracts were prepared by pelleting the bacteria and dissolving them in 2× sample buffer. Proteins were resolved by SDS-polyacrylamide gel electrophoresis on 12.5% gels for the bacterial extracts and 8% gels for the Drosophila extracts. Anti-EWG antibody (10) was used at 1:5,000 for immunoblot analysis of bacterial extracts and at 1:1,500 for Drosophila extracts. A peroxidase-conjugated goat anti-rabbit secondary antibody (Amersham) was used at a dilution of 1:2,000, and blots were developed by chemiluminescence (LumiGLO; Kirkegaard & Perry).
Nucleotide sequence accession numbers.
The cDNA sequence (accession no. L11345) and genomic sequence (accession no. AF135590) have been submitted to GenBank.
RESULTS
Splicing of known ewg exons differs in adult heads and bodies.
Figure 1A represents our current understanding of the exon/intron structure of the ewg gene. This map is based on sequences of ewg genomic DNA (15), two cDNAs, SC3 and MPA-1 (9, 10), that shared a common ORF, referred to as the SC3 ORF, consisting of exons B, C, D, F, G, H, and J, and the RT-PCR analysis presented in this paper (see below). The SC3 cDNA differed from MPA-1 in that it contained part of intron 1 and lacked the noncoding exon A (Fig. 1A).
To characterize the splicing of ewg RNA, we used head and body RNAs, since differences between these tissues based on Northern patterns were expected (15). The splicing profile of ewg was determined by RT-PCR analysis of oligo(dT)-primed cDNAs and exon-specific primers in exons A, B, C, D, F, G, H, and J. Figure 1B shows the locations of primer pairs, and Table 2 summarizes the expected sizes for spliced and nonspliced products. Table 3 summarizes the PCR products detected by using exon-specific primers.
TABLE 2.
Summary of splicing of ewg introns in wild-type adult heads and bodies^a
Intron (PCR primer pair) | Splicing events: figure, lanes | Splicing events: PCR product(s) (size [bp]) | Abundance: head | Abundance: body |
---|---|---|---|---|
1 (1F/1R) | 2, 1 and 2 | 1 spl (476) | ++ | +++ |
2 (2F/2R) | 2, 3 and 4 | 2 spl (177) | +++ | +++ |
3a (3aF/3aR) | 2, 5 and 6 | 3a spl (150) | +++ | ++ |
| | | In3a (449) | ++ | ++ |
3b (3aF/3cR) | 2, 7 and 8 | No ExD (169) | ++ | ++ |
| | | ExE, no ExD (243) | + | +++ |
| | | ExD (637) | +++ | + |
| | | ExD, ExE (711) | ND | ++ |
3c (3cF/3cR) | 2, 9 and 10 | 3c spl (154) | +++ | + |
| | | ExE (228) | +1/2 | ++ |
| | | ExE, In3c-1 (773) | ND | + |
| | | No ExE, In3c (961) | ND | ++ |
4 (4F/4R) | 2, 19 and 20 | 4 spl (139) | +++ | +++ |
5 (5F/5R) | 2, 12 and 13 | 5 spl (155) | +++ | +++ |
6 (6F/6R) | 2, 14 and 15 | 6 spl (278) | +++ | + |
| | | ExI (316) | ND | + |
1 ret (In1F/In1R) | 4D, 1 and 2 | ExB, In1 (345) | +++ | +1/2 |
2 ret (2F/In2R) | 4D, 3 and 4 | ExB, In2 (200) | ND | ND |
3a ret (3aF/In3aR) | 4D, 6 and 7 | ExC, In3a (277) | +++ | +++ |
3c ret (In3cF/In3cR) | 4D, 8 and 9 | ExD, In3c (105) | +++ | +++ |
4 ret (In4F/4R) | 4D, 10 and 11 | ExG, In4 (147) | ND | ND |
5 ret (5F/In5R) | 4D, 12 and 14 | ExG, In5 (195) | ND | ND |
6 ret (In6F/6R) | 4D, 15 and 16 | ExJ, In6 (799) | +++ | ++ |
^a Encapsulation of data from Fig. 2 and 4D. Abundance was assessed as +++, ++, +, +1/2, or ND (not detected) for each head and body pair by visual inspection; the strongest band of each primer pair was arbitrarily assigned a value of +++. In some instances, bands designated +1/2 might not be discernible in the figure. In, intron; Ex, exon; ret, retained; spl, spliced.
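The product sizes listed in Table 2 can be cross-checked against the intron sizes in Table 4 and the lengths of exons E (74 bp) and I (38 bp) given in the text: each larger product should equal the fully spliced product plus whatever retained intron or included exon lies between the two primers. The sketch below performs this arithmetic for the primer pairs where all required lengths are reported; the dictionary and function names are illustrative.

```python
# Expected RT-PCR product size = fully spliced product + any retained intron
# or included alternative exon lying between the two primers.

INTRON_BP = {"In3a": 299, "In3c": 807, "In3c-1": 545}  # intron sizes from Table 4
EXON_BP = {"E": 74, "I": 38}                            # new exon lengths from the text
SPLICED_BP = {"3aF/3aR": 150, "3cF/3cR": 154, "6F/6R": 278}  # spliced bands, Table 2


def product_size(primer_pair, extra=()):
    """Spliced product size plus the lengths of the named extra elements."""
    lengths = {**INTRON_BP, **EXON_BP}
    return SPLICED_BP[primer_pair] + sum(lengths[name] for name in extra)


# (primer pair, extra elements in the product, size reported in Table 2)
CHECKS = [
    ("3aF/3aR", ("In3a",), 449),
    ("3cF/3cR", ("E",), 228),
    ("3cF/3cR", ("E", "In3c-1"), 773),
    ("3cF/3cR", ("In3c",), 961),
    ("6F/6R", ("I",), 316),
]

for pair, extra, reported in CHECKS:
    size = product_size(pair, extra)
    status = "matches Table 2" if size == reported else "differs from Table 2"
    print(f"{pair} + {'+'.join(extra)}: {size} bp ({status})")
```

All five computed sizes agree with the reported values, which supports the assignment of the larger bands to intron retention or inclusion of exons E and I.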
TABLE 3.
Sizes of ewg PCR products detected by primers flanking more than one intron^a
Primer pair | Figure, lane(s) | Splicing event (product size [bp]) | Abundance: head | Abundance: body |
---|---|---|---|---|
1F/3aR | 4C, 3 and 14 | ExA–D (1,712) | +++ | ++ |
| | | ExA–D, In3a (2,011) | ++ | + |
1F/3cR | 4C, 11 and 12 | ExA–C,D,F (2,193) | +++ | +1/2 |
| | | ExA–C,F, no D (1,730) | ++ | ++ |
| | | ExA–C,E,F, no D (1,804) | +1/2 | +++ |
| | | ExA–F (2,267) | ND | +1/2 |
1F/4R | 4C, 9 and 10 | ExA–D,F,G (2,310) | ++ | + |
| | | ExA–C,F,G, no D (1,847) | ++ | ++ |
| | | ExA–C,E–G, no D (1,921) | +1/2 | +++ |
| | | ExA–G (2,384) | ND | +1/2 |
2F/3aR | 4B, 1 and 2 | ExB–D (453) | +++ | ++ |
| | | ExB–D, In3a (752) | + | + |
2F/3cR | 4B, 3 and 4; 4C, 6 and 7 | ExB–D,F (933) | +++ | +1/2 |
| | | ExB–F (1,007) | ND | +1/2 |
| | | ExB,C,F, no D (471) | ++ | ++ |
| | | ExB,C,E,F, no D (545) | +1/2 | +++ |
2F/RV | 4C, 2 and 3 | ExB,C,F–H, no D, In6 (1,056) | ++ | ++ |
| | | ExB,C,E–H, no D, In6 (1,130) | +1/2 | +++ |
| | | ExB–H, In6 (1,593) | ND | +1/2 |
| | | ExB–D,F–H, In6 (1,519) | +1/2 | +1/2 |
3aF/4R | 4B, 5 and 6 | ExC,D,F,G (747) | +++ | +1/2 |
| | | ExC,F,G, no D (285) | +++ | +++ |
| | | ExC,E–G, no D (359) | +1/2 | +++ |
| | | ExC–G (821) | ND | +1/2 |
3aF/38R | 4A, 7 | ExC,F–H,I (451) | ND^b | +++ |
| | | ExC,E–H,I (525) | ND^b | +++ |
3aF/6R | 4A, 3 and 4 | ExC,D,F–H,J (1,058) | +++ | +1/2 |
| | | ExC,D,F–J (1,096) | ND | + |
| | | ExC,F–H,J, no D (598) | ++ | ND |
| | | ExC,F–J, no D (634) | ND | + |
| | | ExC,E–J, no D (708) | ND | + |
3aF/RV | 4A, 1 and 2; 4C, 4 and 5 | ExC,D,F–H, In6 (1,215) | + | + |
| | | ExC–H, In6 (1,289) | ND | + |
| | | ExC,F–H, no D, In6 (753) | +++ | +++ |
| | | ExC,E–H, no D, In6 (827) | + | +++ |
3cF/4R | 4B, 7 and 8 | ExD,F,G (270) | +++ | + |
| | | ExD–G (344) | ND | + |
3cF/5R | 4B, 10 and 11; 3A, 2, 4, 7, and 9 | ExD,F–H (334) | +++ | ++ |
| | | ExD–H (408) | ND | ++ |
| | | ExD,F–H, In3c (1,141) | + | + |
3cF/38R | 4A, 6 | ExD,F–H,I (436) | ND^b | ++ |
| | | ExD,E–H,I (510) | ND^b | ++ |
| | | ExD,E–H,I, In3c-1 (1,055) | ND^b | +++ |
3cF/RV | 4A, 9 and 10 | ExD,F–H, In6 (738) | ND | + |
| | | ExD,E–H, In6 (812) | ND | + |
| | | ExD,E–H, In3c-1,6 (1,357) | ND | + |
| | | ExD,F–H, In3c,6 (1,545) | + | +++ |
4F/5R | 4B, 12 and 13; 3A, 1, 3, 6, and 8 | ExF–H (202) | +++ | +++ |
4F/6R | 4B, 14 and 15 | ExF–H,J (449) | +++ | + |
| | | ExF–J (487) | ND | + |
5F/6R | 4B, 16 and 17 | ExG,H,J (401) | +++ | + |
| | | ExG–J (439) | ND | + |
^a Summary of RT-PCR data from Fig. 3A and 4A to C. Transcripts with SC3-like splicing are underlined. Abundance was assessed as +++, ++, +, +1/2, or ND (not detected) for each head and body pair by visual inspection; the strongest band of each primer pair was arbitrarily assigned a value of +++. In some instances, bands designated +1/2 might not be discernible in the figure. In, intron; Ex, exon.
^b Data not shown.
(i) Introns 2, 4, and 5.
Analysis of RT-PCR products using exon-specific primers revealed that introns 2, 4, and 5 are excised efficiently, as in each case a single band representing the spliced product was observed for head or body RNA. Moreover, for each primer set the band densities in both head and body lanes were comparable (Fig. 2, lanes 3, 4, 19, 20, 12, and 13). The efficient splicing of these introns, which are all small (Table 4), is to be expected since small introns are known to be spliced efficiently in Drosophila (29). However, the observation that the PCR bands for both body and head RNAs were comparable was unexpected, as it implied that nonneural tissues expressed ewg RNA to a greater extent than previously thought (see below and Fig. 3). In these RNA samples, levels of a control ribosomal protein transcript, rp49, are similar in the two tissues (Fig. 2, lanes 17 and 18).
FIG. 2.
Characterization and comparison of ewg splicing in wild-type adult tissues. All RT-PCR assays were carried out with DNase I-treated total RNA isolated from 2-day-old heads (H) or bodies (B). The italicized letters below each pair of lanes represent the specific splice events as outlined in Fig. 1B, e.g., primers 3aF and 3aR for intron 3a splicing. These data are summarized in Table 2, which also lists the lengths of PCR products. rp49 transcripts were used as a control. Molecular size markers (GIBCO-BRL) are shown in lanes 11 and 16. Note that splicing of introns 3a, 3c, and 6 in heads is mostly in the mode of the SC3 cDNA.
TABLE 4.
Catalog of 5′ and 3′ splice sites and branch point sequences of ewg introns^a
Site | 5′ splice site sequence | 3′ splice site sequence | Intron size |
---|---|---|---|
Long-intron consensus | MAGgtragta | tttyyytyytncagRT | 81–5,392 bp |
Short-intron consensus | AGgtragtw | tttttyyyytncagRT | 51–80 bp |
In1 | CAGgtgcgt | gtttttccgaacagTT | ∼4.5 kb |
In2 | CAGgtgggtg | gtctccctcttcagAT | 77 bp |
In3a | AATgtaagta | aatcatcatcacagCA (8/16) | 299 bp |
In3c | GATgttcgta (4/10) | actttaatacacagTA (8/16) | 807 bp |
In3b | AATgtaagta | actttaatacacagTA | 1,568 bp |
In3b-1 | AATgtaagta | gtgttctgttgcagCT | 1,306 bp |
In3b/c-2 | CATgtaaatac | actttaatacacagTA | 188 bp |
In3c-1 | GATgttcgta | gtgttctgttgcagCT | 545 bp |
In4 | ACCgtaagta | aacttctttttcagCT | 62 bp |
In5 | AAGgtattta | ctcgcattttttagGA (7/16) | 80 bp |
In6 | CAGgtagata (5/10) | ttatctcctgaaagGT | 1,722 bp |
In6a | CAGgtagata | tttcttttaaccagAT | 1,620 bp |
In6b | TCTgtaaaga | ttatctcctgaaagGT | 64 bp |
^a Consensus sites as defined by Mount et al. (26); the numbers in parentheses indicate the number of nucleotides that diverge from consensus out of the total number of nucleotides assessed for weak splice sites. Criteria defined by Mount et al. (26) were used to determine highly divergent splice sites, i.e., splice sites that have at least 40% of the nucleotides that occur at a frequency less than 20% or nearly 50% of the nucleotides that occur at a frequency less than 30%. Upper- and lowercase letters represent exon and intron sequences, respectively.
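One simple way to see how a splice site departs from the consensus strings in Table 4 is to count positions that fall outside the IUPAC ambiguity code at each consensus position. The sketch below does this for three of the sites marked as weak. It is a simplification: the criteria of Mount et al. are based on nucleotide frequencies, so this positional count should not be assumed to reproduce every value in the table, and the function name is illustrative.

```python
# Count positions where a splice site departs from an IUPAC-coded consensus.

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "Y": "CT", "N": "ACGT",
}


def mismatches(consensus, site):
    """Number of positions in `site` not allowed by the consensus code."""
    if len(consensus) != len(site):
        raise ValueError("consensus and site must have the same length")
    return sum(
        base.upper() not in IUPAC[code.upper()]
        for code, base in zip(consensus, site)
    )


# Long-intron consensus sequences copied from Table 4
LONG_5SS = "MAGgtragta"
LONG_3SS = "tttyyytyytncagRT"

print("In3c 5' site:", mismatches(LONG_5SS, "GATgttcgta"), "of", len(LONG_5SS))
print("In3a 3' site:", mismatches(LONG_3SS, "aatcatcatcacagCA"), "of", len(LONG_3SS))
print("In3c 3' site:", mismatches(LONG_3SS, "actttaatacacagTA"), "of", len(LONG_3SS))
```

For these three sites the positional count gives 4, 8, and 8 mismatches, in agreement with the values in parentheses in Table 4.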
FIG. 3.
Abundance of ewg transcripts is independent of polyadenylation and is equal in heads (H) and bodies (B). (A) RT-PCR using random-primed or gene-specific-primed cDNAs. The RT reaction shown in lanes 1 to 4 was primed with a primer in exon H, an exon common to all ewg transcripts. The RT reaction shown in lanes 6 to 9 was primed with random hexamers. The italicized letters below each pair of lanes represent the specific splice events assayed as outlined in Fig. 1B. The uppermost bands in lanes 2, 4, 7, and 9 show ewg transcripts containing intron 3c. The 408-bp band in lanes 2 and 7 contains exon E. Molecular size markers (GIBCO-BRL) are shown in lane 5. (B and C) Cycle titration of PCRs using primers 4F and 5R to amplify parts of ewg transcripts common to all ewg transcripts. cDNAs were synthesized with a gene-specific primer in exon H. Aliquots were removed from the PCR beginning at cycle 18 and continuing until cycle 28. Quantitation of bands reveals that PCR is in the linear range (data not shown). Note that the intensities of bands in both heads and bodies are similar at all cycles.
(ii) Intron 1.
Intron 1 is more efficiently spliced in body than in head RNA (Fig. 2, lanes 1 and 2). This inefficient splicing in heads perhaps explains the SC3 cDNA, which contains part of intron 1 (Fig. 1A).
(iii) Introns 3a, 3b, and 3c.
Intron 3a is inefficiently spliced in both heads and bodies, as both spliced (150-bp lower band) and unspliced (449-bp upper band) products are observed in RT-PCRs using primers 3aF and 3aR with poly(A)+ head and body RNAs (Fig. 2, lanes 5 and 6). The excision of intron 3c (154-bp lower band) occurs more efficiently in adult heads than in bodies (lanes 9 and 10), as the body lane shows the unspliced product (highest, 961-bp band in lane 9), while it is undetectable in heads under these assay conditions (lanes 9 and 10). Two unexplained bands were present predominantly in the body lane in RT-PCRs using primers 3cF and 3cR (lanes 9 and 10), representing a new exon (see below). Thus, both introns 3a and 3c are retained in a fraction of body transcripts, while 3a is also retained in a fraction of head RNAs.
Since both introns 3a and 3c are inefficiently spliced, we wondered if splicing events that exclude exon D also occur, as was previously indicated by a partial cDNA, SC1 (9). Primer pair 3aF/3cR amplified a band of comparable density in a position expected for transcripts that exclude exon D (169-bp lowest band) in both RNA samples (Fig. 2, lanes 7 and 8; Table 2). Using a bridge primer that hybridizes to both exons C and F, we confirmed that the exclusion of exon D occurred at equal levels in heads and bodies (data not shown). Further, a band representing transcripts where both introns 3a and 3c are spliced (637 bp) is highly enriched in heads. Again, two additional bands were present almost exclusively in the body lane due to the presence of a new exon.
(iv) Intron 6.
Using primer set 6F/6R and head RNA, only one band (278 bp) expected from splicing of intron 6 was seen. However, in the body RNA lane, two relatively faint bands of equal intensities were observed; the lower band represents the splicing of intron 6, while a slightly larger and unexpected band represents a second new exon (Fig. 2, lanes 14 and 15; see below). The low level of spliced product in the body lane suggests that intron 6 is inefficiently spliced and/or that some transcripts terminate within it.
In summary, the results indicate the following. (i) The spliced products resulting from the excision of introns 2, 4, and 5 are present at similar levels in head and body, implying that ewg RNA is expressed outside the nervous system, likely in many tissues. Further, the splicing of these introns is unlikely to be regulated in neurons, as no significant differences are detected between neuron-enriched heads and neuron-poor bodies. (ii) The excision of introns 3c and 6 takes place at a higher efficiency in heads, which results in higher levels of mRNAs that encode the 116-kDa EWG protein (SC3 ORF) in adult heads than in bodies (Fig. 1A). This suggests that the splicing of introns 3c and 6 is likely to be regulated in neurons. (iii) Introns 3a and 3c are retained in a fraction of polyadenylated ewg RNAs, demonstrating that these introns are not spliced efficiently. (iv) ewg RNA undergoes alternative splicing in both heads and bodies by excluding exon D. (v) Levels of intron 3b splicing are similar in heads and bodies. (vi) Body-enriched novel PCR bands were detected in the region of introns 3c and 6. That body tissue is representative of neuron-poor splicing events was supported by the identical splicing profile of ewg in RNA from abdomens, which are more neuron poor than adult bodies, which contain the thoracic and abdominal ganglia (data not shown).
Characterization of new ewg exons E and I.
The novel bands were isolated and directly sequenced on both strands with the primers that had been used for their amplification to determine if they resulted from additional exons in the ewg gene. The sequence of the 230- to 250-bp product detected in the 3aF/3cR (Fig. 2, lane 7) and 3cF/3cR (lane 9) PCRs revealed the presence of a 74-bp exon in intron 3c. This new exon, E in Fig. 1A, codes for 24 amino acids and alters the translational frame. Further RT-PCR analysis of exon E revealed high enrichment in female abdomens (data not shown).
Sequencing of the upper band amplified from body RNA with primers 6F and 6R revealed the presence of a 38-bp exon (Fig. 1D), exon I in Fig. 1A, present within intron 6. Exon I is exclusive to ewg transcripts in bodies, encodes 12 amino acids, and also alters the translational frame (Fig. 2, lane 14; see also Fig. 4B, lanes 15 and 17).
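The statement that exons E and I each alter the translational frame follows directly from their lengths, since neither is a multiple of 3. A minimal sketch of the arithmetic, with an illustrative function name:

```python
# An inserted exon preserves the downstream reading frame only if its length
# is a multiple of 3; otherwise the remainder gives the frame offset.


def frame_effect(exon_bp):
    """Return (complete codons, leftover nucleotides) for an exon length."""
    return divmod(exon_bp, 3)


for name, length in (("E", 74), ("I", 38)):
    codons, leftover = frame_effect(length)
    shifted = "shifts" if leftover else "preserves"
    print(
        f"exon {name}: {length} bp = {codons} codons + {leftover} nt, "
        f"{shifted} the downstream frame"
    )
```

The complete codon counts (24 and 12) match the numbers of amino acids reported for exons E and I, and the two leftover nucleotides in each case account for the change in frame.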
FIG. 4.
Alternative splicing of ewg transcripts is restricted to introns 3 and 6, alternative splice events are independent of each other, and splicing of introns 1, 3, and 6 is inefficient. Comparison was done between head (H) and body (B) poly(A)+ RNAs. (A) PCR products spanning introns 3 and 6 reveal all combinations of alternatively spliced introns. The italicized letters below each pair of lanes show the amplified section of ewg transcripts. Table 3 provides a complete listing of PCR products and summarizes the data. The forward primer was 3aF or 3cF; the return primer was RV, 6R, or 38R. Molecular size markers (GIBCO-BRL) are shown in lanes 5 and 11. In lane 8, a lambda HindIII digest was used as a marker. Note that heads and bodies show differences in the ewg transcript population and abundance due to the body-enriched usage of exons E and I, while increased inclusion of exon D occurs in heads. (B and C) Primer pairs spanning several introns reveal no additional alternative splice events. The italicized letters below each pair of lanes show the amplified section of ewg transcripts. Table 3 provides a complete listing of PCR products and summarizes the data. Molecular size markers (GIBCO-BRL) are shown in lanes 9 and 18 of panel B and lanes 1 and 8 of panel C. In lane 15 of panel C, a lambda HindIII digest was used as a marker. (D) Introns 1, 3a, 3c, and 6 are retained in polyadenylated ewg transcripts. A listing of PCR products and summary of the data are shown in Table 2. Note that differences between heads and bodies are mainly detected in the retention of introns 1 and 6 but not 3. Molecular size markers (GIBCO-BRL) are shown in lanes 5 and 13.
The 5′ and 3′ splice sites for the new exons matched the splice site consensus (Table 4) (27). Moreover, the flanking introns have sequences that match candidate branch point sequences at appropriate distances from the relevant 3′ splice site (26). Thus, exons E and I fit the criteria of authentic exons capable of being spliced appropriately. Both of these new exons match the genomic sequence except for one nucleotide substitution in exon E (Fig. 1C).
ewg RNA is abundant in heads and bodies.
Comparison of levels of efficiently spliced introns 2, 4, and 5 (Fig. 2, lanes 3, 4, 12, 13, 19, and 20) suggests that ewg RNA is present in heads and bodies at comparable levels. To verify that ewg RNA was indeed present at high levels in bodies, either random hexamers or a gene-specific primer in exon H was used for RT, allowing the amplification of all splice isoforms of ewg regardless of their state of polyadenylation in subsequent PCR. For both reactions, primers 4F and 5R yielded very similar signals with head and body RNAs (Fig. 3A, lanes 1, 3, 6, and 8). Primers 3cF and 5R revealed that both tissues contain ewg RNAs that either retain or excise intron 3c (lanes 2, 4, 7, and 9). Exon E-containing transcripts were detected mostly in bodies (lanes 2 and 7). Also, some intron 3c-1 retention in body RNA is observed (lanes 2 and 7). Thus, ewg RNAs appear to be efficiently polyadenylated, as the ewg splicing profiles are similar for cDNAs primed with oligo(dT), gene-specific, and random hexamer primers.
To verify that PCR amplification is in the linear range, the accumulation of the spliced product was assessed every two cycles from cDNAs primed with a gene-specific primer in exon H. Comparable signals were obtained in head and body lanes throughout the linear range of amplification using primers 4F and 5R (Fig. 3B and C; Fig. 1). Thus, ewg transcript is expressed in bodies and heads at similar levels.
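A common way to judge whether a cycle titration such as the one described above lies in the exponential range is to fit a line to log2 band intensity against cycle number; a near-linear fit with a slope close to 1 indicates near-doubling per cycle. The sketch below illustrates this check and is not necessarily the quantitation used in this study (those data are not shown). The intensity values are hypothetical placeholders, not measurements from this work, and the function name is illustrative.

```python
import math


def log2_fit(cycles, intensities):
    """Least-squares line through (cycle, log2 intensity); returns (slope, r_squared)."""
    ys = [math.log2(value) for value in intensities]
    n = len(cycles)
    mean_x = sum(cycles) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cycles, ys))
    return sxy / sxx, (sxy * sxy) / (sxx * syy)


# Aliquots were taken every two cycles from cycle 18 to cycle 28.
cycles = [18, 20, 22, 24, 26, 28]

# HYPOTHETICAL band intensities in arbitrary units, for illustration only.
head = [1.0, 3.6, 13.0, 47.0, 170.0, 600.0]
body = [0.9, 3.3, 12.0, 44.0, 160.0, 570.0]

for label, values in (("head", head), ("body", body)):
    slope, r_squared = log2_fit(cycles, values)
    print(f"{label}: slope {slope:.2f} log2 units per cycle, R^2 {r_squared:.3f}")
```

Similar slopes and intercepts for the two tissues would correspond to the comparable head and body signals described in the text, whereas a slope that flattens at later cycles would indicate that the reaction has left the exponential range.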
Alternative splicing of ewg exons.
The RT-PCR studies suggested that both head and body RNAs have populations of alternatively spliced transcripts that include or exclude exon D and that body RNA is enriched in transcripts that include exon I. Further, the splicing of introns 3a and 3c and splicing of intron 6 in bodies appear to be independent of each other, since ewg transcripts containing exon I were found with or without exon D (436 bp in Fig. 4A, lane 6; 451 bp in lane 7; summarized in Table 3).
To further support these data and to test whether any other exons are alternatively spliced, PCR was done with primer pairs spanning several exons. No further alternatively spliced ewg transcripts were detected (Fig. 4B and C). Thus, alternative splicing in ewg transcripts is restricted to introns 3a, 3c, and 6.
ewg is inefficiently spliced.
To determine if introns other than 3a and 3c were present in poly(A)+ ewg RNA, PCR was done with intron-specific primers. Introns 1, 3a, 3c, and 6 (Fig. 4D) are present in poly(A)+ ewg transcripts of heads and bodies. Of these introns, only intron 1 is differentially retained in body RNA compared to head RNA. All of these introns are larger than 81 bp and are classified as large Drosophila introns (26). Among these introns, the 5′ and 3′ splice sites of 3c and the 3′ splice site of 3a diverge significantly from the Drosophila consensus (Table 4) (26). Introns 2, 4, and 5 were not detected in poly(A)+ ewg transcripts, confirming that they were efficiently spliced (Fig. 4D, lanes 3, 4, 10, 11, 12, and 14; Table 3). The overall levels of unspliced ewg transcripts were assessed in assays using primer 3cF and return primer RV in intron 6. This analysis indicated the presence of greater amounts of unprocessed ewg transcripts containing both intron 3c and intron 6 in bodies than in heads (Fig. 4A, lanes 9 and 10; data summarized in Table 3).
The splicing situation in the region of intron 6 is complex, as evidenced by two spliced transcripts that exclude or include exon I and RNAs that retain intron 6, and possibly transcripts that terminate in intron 6, most of which show differential distribution in heads and bodies. First, intron 6 is spliced more efficiently in head RNA than in body RNA, where overall splicing appears to be significantly reduced (Fig. 4D, lanes 15 and 16). Second, about half of the spliced body transcripts show inclusion of exon I (Fig. 2, lanes 14 and 15; Fig. 4B, lanes 14 to 17). Finally, RNAs that retain proximal intron 6 sequences are more prevalent in body RNA, as seen by use of the return primer RV, 5′ to the polyadenylation sites in intron 6 (Fig. 4A, lanes 1, 2, 9, and 10; Fig. 4C, lanes 2 to 5). In contrast, when the return primer in exon J is used, many fewer products are amplified in bodies than in heads (Fig. 4A, lanes 1 to 4; Fig. 4B, lanes 14 to 17). This result suggests that in bodies, exon J-containing RNA is underrepresented compared to RNA containing the 3′ region of intron 6. Thus, some RNAs in the body may be terminated before exon J, using the putative cleavage/polyadenylation sites in intron 6 (nucleotides 5876 and 6742, AATAAA). The presence of such transcripts was previously suggested by a partial cDNA, SC1, that in addition to excluding exon D also retained part of intron 6 (9).
Expression of 116-kDa protein is able to rescue ewg-null phenotypes.
We previously demonstrated that the 116-kDa protein is able to rescue the three well-characterized phenotypes associated with ewg mutations: embryonic lethality, erect wings, and formation of dorsal longitudinal muscles (8, 10). Expression from two cDNAs expressing ewg minigenes, EWGNS (neuron specific) and EWGHS (basal level expression), rescued viability of ewgl1, an ethyl methanesulfonate-induced lethal allele, which was thought to be a genetic null and a 116-kDa protein null (10). The possibility that the ewg locus was formally able to generate several other isoforms made us wonder if ewgl1 was a true null for all possible EWG proteins. Sequencing of genomic DNA from exons B to D revealed a C-to-T base pair change in exon B that resulted in the termination of the ORF at amino acid 187. Since exon B is part of all putative EWG isoforms, the ewgl1 allele is a functional null allele for all possible EWG proteins; moreover, it lacks the DNA binding domain.
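How a single C-to-T change can terminate an ORF follows from the standard genetic code. The sketch below enumerates every sense codon that becomes a stop codon through one C-to-T substitution, assuming the change is read on the coding strand. The text does not state which codon is affected in ewgl1, so this only lists the possibilities. The function name is illustrative.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def c_to_t_nonsense_codons():
    """Map each sense codon to the stop codon produced by one C-to-T change."""
    hits = {}
    for codon in ("".join(bases) for bases in product("ACGT", repeat=3)):
        if codon in STOP_CODONS:
            continue
        for i, base in enumerate(codon):
            if base == "C":
                mutant = codon[:i] + "T" + codon[i + 1:]
                if mutant in STOP_CODONS:
                    hits[codon] = mutant
    return hits


NONSENSE = c_to_t_nonsense_codons()
print(NONSENSE)  # {'CAA': 'TAA', 'CAG': 'TAG', 'CGA': 'TGA'}
```

Only two glutamine codons (CAA, CAG) and one arginine codon (CGA) qualify. If the change were instead reported on the noncoding strand, it would correspond to a G-to-A change in the codon, which this sketch does not cover.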
The rescue of ewg-associated phenotypes by the transgenes expressing the 116-kDa EWG protein was further confirmed by using a synthetic genomic deletion of the ewg locus as described in Materials and Methods (data not shown). Thus, although it is formally possible that several EWG isoforms are generated, the 116-kDa EWG protein is sufficient to provide the known EWG functions. We cannot rule out, however, the possibility that flies rescued by the 116-kDa EWG protein have subtle abnormalities that were not discerned.
Putative isoforms encoded by the ewg locus.
ewg can potentially encode several polypeptides in addition to the 116-kDa EWG protein that includes exon D. Figure 5 depicts the conceptual ewg-generated isoforms. The presence or absence of exon D, which encodes 154 amino acids, does not interrupt the translational frame of the ewg ORF, while inclusion of exon E or I alters the translational frame, leading to a premature stop compared to that of the SC3 ORF (Fig. 5). Additional protein isoforms can also be generated by the ewg transcripts that retain either intron 3a, 3c, or 6. All intron 3a-, 3c-, and 6-containing RNAs result in a premature stop in the SC3 ORF, encoding 408, 574, and 840 amino acids, respectively.
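The frame argument above reduces to arithmetic on exon length: including or skipping a cassette exon preserves the downstream reading frame only if its length is a multiple of 3. The sketch below assumes that exon D contributes 154 whole codons (462 nucleotides). The 100-nucleotide cassette is a hypothetical length used only for contrast and is not a measured length of exon E or I.

```python
def reading_frame_shift(length_nt):
    """Return 0 if adding or removing this many nucleotides keeps the frame,
    otherwise the size of the resulting frameshift (1 or 2)."""
    return length_nt % 3


EXON_D_CODONS = 154             # from the text
EXON_D_NT = 3 * EXON_D_CODONS   # 462 nt, assuming whole codons

HYPOTHETICAL_CASSETTE_NT = 100  # illustrative only, not an ewg exon length

print(reading_frame_shift(EXON_D_NT))                 # 0: frame preserved
print(reading_frame_shift(HYPOTHETICAL_CASSETTE_NT))  # 1: frame shifted
```

A length divisible by 3 is necessary but not sufficient for an intact ORF, since an in-frame insertion can still carry its own stop codon.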
FIG. 5.
ORFs of ewg splice isoforms deduced by RT-PCR analysis. All deduced ewg RNA isoforms are outlined from data presented in Table 2. The size of each ORF is given in amino acids (aa). Transcripts including exons E and I are not shown since the ORF terminates in exon F. Also, ORFs resulting from intron-retaining transcripts are not shown.
All putative EWG isoforms contain exons B and C. These exons also show the highest homology with the DNA binding motifs of other ewg-like proteins (4, 18, 31). Omitting exon D from the SC3 ORF, however, significantly increased the homology of EWG to sea urchin P3A2 compared to its homology in the alignment in reference 31, which was done with exon D. Protein sequences encoded by exon E, exon I, and retained introns did not reveal any significant homologies.
Are isoforms other than the 116-kDa protein synthesized? These isoforms should be detectable in immunoblot analysis by the polyclonal antibody generated against the 116-kDa protein, as it recognizes epitopes throughout the protein (Fig. 6A). In fact, the anti-EWG antibody reveals several bands on immunoblots, although the 116-kDa protein is the major band (Fig. 6B) (10). To determine whether the additional bands are ewg related, we analyzed wild-type (genomic ewg), ewgl1; EWGNS4 (expression of SC3 ORF in neurons), and ewgl1 (protein-null) embryonic extracts. Comparison of these extracts should indicate whether the additional bands are protein isoforms of EWG, 116-kDa protein degradation products, or unrelated cross-reacting proteins. As expected, the 116-kDa band was absent from the ewgl1 lane (Fig. 6B). Further, the wild-type and ewgl1; EWGNS4 patterns were identical, with the prominent 116-kDa band and several minor bands which are likely products of degradation; the other bands were common to all three extracts. Consistent with the embryonic data, wild-type and ewgl1; EWGNS4 (EWGNS4 is not transcribed in nonneural tissues) adult heads or abdomens showed no significant differences in their immunoblot profiles (Fig. 6C and D and data not shown). Thus, the 116-kDa protein is the major isoform, and other putative isoforms are either minor or not synthesized.
FIG. 6.
Characterization of anti-EWG antibody and immunoblot analysis of EWG proteins in Drosophila embryos and heads. (A) Anti-EWG antibody recognizes epitopes all along the EWG protein. Different parts of EWG protein were expressed in E. coli and tagged with either six histidines (6His) or MBP at the N terminus. As controls either bacterial extracts expressing only MBP or bacterial extracts from uninduced E. coli (lanes −) were loaded. Note that the part encoded by exons B and C has a calculated molecular mass of 43 kDa but runs at ∼70 kDa. Molecular mass markers are shown on the left in kilodaltons. (B) Anti-EWG antibody recognizes one major and several minor proteins in wild-type embryos (+; lane 1), which are absent in ewgl1 embryos (lane 2). Proteins marked with arrows are degradation products of the 116-kDa EWG protein, since they are also present in ewgl1; EWGNS4 embryos (NS4; lane 3). EWGNS4 is a rescue construct where a cDNA for the 116-kDa EWG protein is fused to an elav promoter fragment restricting expression to the nervous system. Molecular mass markers are shown on the left in kilodaltons. (C) Basal expression of the 116-kDa EWG protein under a heat shock promoter is comparable to EWG levels in the wild type. Head extracts from wild-type females (Tb [+/+; +/+; +/Tb]; lane 1) are compared to head extracts from females with either one copy (HS1 [ewgl1/ewgl1; +/+; EWGHS1/+] [lane 2] and HS7, Tb [ewgl1/ewgl1; EWGHS7/+; +/Tb] [lane 3]) or two copies (HS1, HS7 [ewgl1/ewgl1; EWGHS7/+; EWGHS1/Tb] [lane 4]). Note that the amount of EWG protein is about half of the wild-type amount with only one copy. Molecular mass markers are shown on the left in kilodaltons. (D) The 40-kDa proteins recognized by the anti-EWG antibody in head extracts are not EWG isoforms. Head extracts from the wild type (+; lane 1) are compared to head extracts from an ewgl1; EWGNS4 mutant (NS4; lane 2). Molecular mass markers are shown on the left in kilodaltons.
Overexpression of 116-kDa EWG is lethal.
The splicing regulation of ewg guarantees that the 116-kDa protein is generated in the head but downregulated in the body. To test if this regulation is biologically important, the effect of overexpression of 116-kDa protein in nonneural tissues was tested. Overall overexpression was achieved by using EWGHS1 and EWGHS7, two independent insert lines for a transgene in which the SC3 cDNA is driven with the heat shock promoter. The expression from the EWGHS transgene at 25°C mimics endogenous transcription, likely because of the inclusion of regulatory sequences upstream of the ORF (SC3 cDNA in Fig. 1A). Both of these transgenes lead to lethality when homozygous. We determined the viability of flies with different doses of the wild-type ewg allele that also carry EWGHS1 and EWGHS7 (Table 5). Indeed, flies carrying two doses of heat shock transgenes are less viable, and the males have a lower viability index than females. Males of genotype ewg+/Y; EWGHS7/Tb or ewg+/Y; EWGHS1/+ have viability indices of 0.76 and 1.00, respectively, but the viability index of ewg+/Y; EWGHS7/Tb; EWGHS1/+ drops to 0.005. Female survival is better than male survival, but the viability indices are still low: 0.10 and 0.12 for females carrying both transgenes and one or two doses of wild-type ewg, respectively (Table 5).
TABLE 5.
Viability of flies expressing one or two doses of EWGHS transgenes
Genotype^a | n^b | Viability^c |
---|---|---|
Females | | |
ewgl1/+; +/+; +/Tb | 204 | 0.84 |
ewgl1/+; EWGHS7/+; +/Tb | 179 | 0.74 |
ewgl1/+; +/+; EWGHS1/+^d | 242 | 1.00 |
ewgl1/+; EWGHS7/+; EWGHS1/+ | 25 | 0.10 |
+/+; +/+; +/Tb | 210 | 0.87 |
+/+; EWGHS7/+; +/Tb | 172 | 0.71 |
+/+; +/+; EWGHS1/+ | 238 | 0.98 |
+/+; EWGHS7/+; EWGHS1/+ | 28 | 0.12 |
Males | | |
ewgl1/Y; +/+; +/Tb | 0 | 0 |
ewgl1/Y; EWGHS7/+; +/Tb | 102 | 0.49 |
ewgl1/Y; +/+; EWGHS1/+ | 170 | 0.82 |
ewgl1/Y; EWGHS7/+; EWGHS1/+ | 1 | 0.005 |
+/Y; +/+; +/Tb | 171 | 0.83 |
+/Y; EWGHS7/+; +/Tb | 158 | 0.76 |
+/Y; +/+; EWGHS1/+^d | 207 | 1.00 |
+/Y; EWGHS7/+; EWGHS1/+ | 1 | 0.005 |

^a Females of the genotype ewgl1y w sn/w; +/+; EWGHS1/TM6,Tb were crossed to yw/Y; +/+; EWGHS7/+ males.
^b n, number of flies.
^c Viability of F1 genotypes; n/n of the most viable genotype of the same sex.
^d Most viable genotype used as control to calculate viability.
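The viability index in Table 5 can be recomputed from the fly counts. The sketch below follows the table's footnotes: each count is divided by the count of the most viable genotype of the same sex. The dictionary layout and function name are illustrative, and the counts are copied from Table 5.

```python
# Fly counts (n) from Table 5, grouped by sex.
COUNTS = {
    "females": {
        "ewgl1/+; +/+; +/Tb": 204,
        "ewgl1/+; EWGHS7/+; +/Tb": 179,
        "ewgl1/+; +/+; EWGHS1/+": 242,
        "ewgl1/+; EWGHS7/+; EWGHS1/+": 25,
        "+/+; +/+; +/Tb": 210,
        "+/+; EWGHS7/+; +/Tb": 172,
        "+/+; +/+; EWGHS1/+": 238,
        "+/+; EWGHS7/+; EWGHS1/+": 28,
    },
    "males": {
        "ewgl1/Y; +/+; +/Tb": 0,
        "ewgl1/Y; EWGHS7/+; +/Tb": 102,
        "ewgl1/Y; +/+; EWGHS1/+": 170,
        "ewgl1/Y; EWGHS7/+; EWGHS1/+": 1,
        "+/Y; +/+; +/Tb": 171,
        "+/Y; EWGHS7/+; +/Tb": 158,
        "+/Y; +/+; EWGHS1/+": 207,
        "+/Y; EWGHS7/+; EWGHS1/+": 1,
    },
}


def viability_indices(counts):
    """Divide each count by the count of the most viable genotype in the group."""
    best = max(counts.values())
    return {genotype: n / best for genotype, n in counts.items()}


VIABILITY = {sex: viability_indices(group) for sex, group in COUNTS.items()}

for sex, group in VIABILITY.items():
    print(sex)
    for genotype, index in group.items():
        print(f"  {genotype}: {index:.3f}")
```

For example, females carrying both transgenes in an ewgl1/+ background give 25/242, or about 0.10, and males carrying both transgenes give 1/207, or about 0.005, matching the tabulated values.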
The level of expression of 116-kDa protein in different genotypes was assessed in immunoblots (Fig. 6C and D). Comparison of 116-kDa protein levels between head extracts of wild-type females and ewgl1 females carrying both EWGHS transgenes shows that levels of 116-kDa signals are comparable in these two genotypes but lower in ewgl1 flies carrying only a single dose of EWGHS1 or EWGHS7 (Fig. 6C). Since both neuronal and nonneuronal tissues contribute to the signal in head extracts of flies carrying EWGHS transgenes, the neural expression in flies carrying both transgenes is likely to be much lower than the neural level of the 116-kDa protein in the wild type. Therefore, the EWGHS-driven 116-kDa protein expression in the neural tissue is not likely to be the cause of lethality of the EWGHS7/Tb; EWGHS1/+ genotype.
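This inference can be written as a simple balance. The symbols are introduced here for illustration and are not in the original: S is the 116-kDa immunoblot signal in a head extract, N the neural contribution, and X the nonneural contribution. The wild-type nonneural contribution is assumed to be negligible, in line with the neuronal enrichment of the protein.

```latex
\begin{align*}
S_{\mathrm{wt}} &= N_{\mathrm{wt}} + X_{\mathrm{wt}}, & X_{\mathrm{wt}} &\approx 0 \\
S_{\mathrm{HS}} &= N_{\mathrm{HS}} + X_{\mathrm{HS}}, & X_{\mathrm{HS}} &> 0 \\
S_{\mathrm{HS}} \approx S_{\mathrm{wt}}
  &\;\Longrightarrow\;
  N_{\mathrm{HS}} \approx N_{\mathrm{wt}} - X_{\mathrm{HS}} < N_{\mathrm{wt}}
\end{align*}
```

How far the neural level with two EWGHS doses falls below the wild-type level depends on the size of the nonneural term, which the immunoblots do not measure separately.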
DISCUSSION
EWG protein provides a vital function in the nervous system, as exclusive neural tissue-specific expression of the 116-kDa EWG protein rescues the lethality caused by the ewg-null alleles. This neural role of ewg is underscored by a robust expression of the 116-kDa protein in neurons at all stages. In fact, the high expression of EWG seen with anti-EWG antibody and the functional information had led us to assume that the weak signal seen outside the nervous system (with the exception of myoblast expression) with the antibody staining in larvae and adults was due to high background (9). Thus, it is surprising and puzzling that despite the broad distribution of the transcript, the two known functions of ewg are associated with specific cell types: neurons and myoblasts. The studies described in this paper show that although the gene is broadly transcribed, the functional transcript encoding the 116-kDa protein is highly enriched in neuron-rich head tissue compared to that in neuron-poor bodies. The neurons in the body must contribute to the functional transcript observed in the body RNA; however, our data do not address if cell types other than neurons are also capable of low-level production of the functional transcript.
Generally, transcriptional controls ensure that specific proteins are synthesized in specific tissues. In the case of ewg, however, posttranscriptional mechanisms are critical to ensure that the 116-kDa protein is present at high levels in neurons. Few examples that show broad tissue distribution of transcript but restricted protein expression are known. In Drosophila, the gene encoding P transposase is transcribed broadly but the productive splice is made only in the germ line cells (1). A second example is the Drosophila Sex-lethal gene (Sxl), which is transcribed in both males and females but generates the functional Sxl protein only in females (22). Similar to the case of ewg, in both of these instances alternative splicing is involved; in the case of Sxl, a large number of intron-containing transcripts are also present (28).
Efficiency of the alternative splice events that result in an SC3-like transcript is higher in head RNA. Splice events that are crucial for the production of the functional SC3 transcript are inclusion of exons D and J and exclusion of exons I and E. Inclusion of exon D requires splicing of introns 3a and 3c instead of 3b. The 3′ splice sites of both 3a and 3c and the 5′ splice site of 3c diverge from the Drosophila consensus (Table 4) (26), making them likely targets for splicing regulation. Intron 3a is inefficiently spliced in both head and body RNAs, whereas 3c is inefficiently spliced in body RNA only. Since exclusion of D resulting from exon skipping is seen in both body and head RNAs, it is likely to be the default mode, with inclusion of D requiring a positive regulatory step. To what extent the inefficient splicing of 3a and 3c affects this regulation is difficult to assess. SC3 transcript also requires the appropriate choice of a 5′ splice site, resulting in the excision of intron 6. The body RNA shows inefficient splicing in the intron 6 region and, among the spliced products, about equal amounts of exon I inclusion and intron 6 excision. It is likely that RNAs that retain intron 6 become polyadenylated, as consensus sequences for polyadenylation exist in intron 6; if these are used, transcripts that encode different C termini will be generated.
Splicing of introns 1 (∼4.5 kb), 3a (299 bp), 3c (807 bp), and 6 (1,722 bp) is inefficient; all these are large introns. Some examples of intron retention in Drosophila include transcripts of Dopa decarboxylase (2), Suppressor-of-white-apricot (34), and Sxl (28). Whether intron-containing ewg transcripts exit the nucleus is not known, although the presence of 5′ and 3′ splice sites in a transcript is often sufficient to retain most intron-containing RNAs in the nucleus (23). However, the presence of intron-containing messages in the cytoplasm has been reported; examples include insulin pre-mRNA (33) and bovine growth hormone pre-mRNA (11). Intron retention can be a powerful means of gene regulation; the Drosophila Transformer-2 protein self-regulates the retention of intron M1 in germ line tissue (25). This intron retention in tra-2 mRNA has important phenotypic consequences, and it is thought to help maintain appropriate levels of Tra-2 protein in germ line tissues (25). Intron retention can also be a means to generate alternate protein isoforms, examples of which include C-CAM3 and rVDR1 (7, 12).
Alternative splicing is a widespread mechanism to regulate functional properties of DNA binding proteins to modulate DNA binding activity and dimerization properties (reviewed in reference 24). Since all conceptual alternative EWG isoforms contain the DNA binding domain encoded by exon B, these proteins are likely to differ in either dimerization or activation properties. The activation domain of NRF-1, a protein that has homology to EWG, has several glutamine-containing hydrophobic clusters (20). There are five such clusters in head-enriched exon D and two in exon H. Thus, the head-enriched 116-kDa protein may have stronger activation properties than proteins that lack exon D. The dimerization region for EWG has not been characterized.
Are any of the putative isoforms synthesized, do they have a function, or will they provide ewg function? The putative alternate protein isoforms have no essential role in the nervous system, as the 116-kDa EWG protein rescues viability of null ewg alleles, although they may still provide subtle functions in vivo or substitute for the 116-kDa function. Whether the alternative proteins are synthesized at low levels is not known.
The studies with heat shock transgenes show that overexpression of 116-kDa protein in nonneural tissues can be lethal, yet it is the expression of the 116-kDa protein in neurons that provides the viability function associated with ewg. With two doses of the EWGHS transgenes, both ewgl1 and ewgl1/+ flies showed about equal lethality, suggesting a formally neomorphic effect. The lethality could result from the expression of 116-kDa protein in nonneural tissue or be due to general overexpression. An alternative possibility of the transgene generating a toxic novel protein cannot be discounted but is unlikely because the transgene was constructed by using a cDNA and not genomic DNA. Since the 116-kDa protein levels in the head with two doses of EWGHS are comparable to the wild-type levels, the protein is unlikely to attain a higher than normal level in the neural tissue. Moreover, higher than normal levels of 116-kDa protein generated through two doses of EWGNS transgenes, which are expressed only in the nervous system, are not lethal (our unpublished observations). Therefore, expression outside the nervous system is likely to be the primary cause of lethality. Thus, given that ewg is broadly transcribed, its posttranscriptional regulation is crucial to the fulfillment of its function.
ACKNOWLEDGMENTS
We thank D. Bordne for technical assistance, S. Goodwin for advice on RT-PCR protocols, S. Ghosh for advice on bacterial protein expression, L. Torroja for discussions, and E. Dougherty for imaging assistance.
This work was supported by National Institutes of Health grants GM 22350 and NS 36179. M.S. was supported by a fellowship from the Swiss National Science Foundation. S.M.D. was supported by NIH training grant 5-T32GM-07122.
REFERENCES
- 1. Adams M D, Tarng R S, Rio D C. The alternative splicing factor PSI regulates P-element third intron splicing in vivo. Genes Dev. 1997;11:129–138. doi: 10.1101/gad.11.1.129.
- 2. Beall C J, Hirsh J. High levels of intron-containing RNAs are associated with expression of the Drosophila DOPA decarboxylase gene. Mol Cell Biol. 1984;4:1669–1674. doi: 10.1128/mcb.4.9.1669.
- 3. Becker T S, Burgess S M, Amsterdam A H, Allende M L, Hopkins N. Not really finished is crucial for development of the zebrafish outer retina and encodes a transcription factor highly homologous to human nuclear respiratory factor-1 and avian initiation binding repressor. Development. 1998;125:4369–4378. doi: 10.1242/dev.125.22.4369.
- 4. Calzone F J, Hoog C, Teplow D B, Cutting A E, Zeller R W, Britten R J, Davidson E H. Gene regulatory factors of the sea urchin embryo. I. Purification by affinity chromatography and cloning of P3A2, a novel DNA-binding protein. Development. 1991;112:335–350. doi: 10.1242/dev.112.1.335.
- 5. Chabot B. Directing alternative splicing: cast and scenarios. Trends Genet. 1996;12:472–478. doi: 10.1016/0168-9525(96)10037-8.
- 6. Chang D D, Sharp P A. Regulation by HIV Rev depends upon recognition of splice sites. Cell. 1989;59:789–795. doi: 10.1016/0092-8674(89)90602-8.
- 7. Cheung P H, Culic O, Qiu Y, Earley K, Thompson N, Hixson D C, Lin S-H. The cytoplasmic domain of C-CAM is required for C-CAM-mediated adhesion function: studies of a C-CAM transcript containing an unspliced intron. Biochem J. 1993;295:427–435. doi: 10.1042/bj2950427.
- 8. DeSimone S, Coelho C, Roy S, VijayRaghavan K, White K. ERECT WING, the Drosophila member of a family of DNA binding proteins is required in imaginal myoblasts for flight muscle development. Development. 1996;122:31–39. doi: 10.1242/dev.122.1.31.
- 9. DeSimone S M. Ph.D. thesis. Waltham, Mass: Brandeis University; 1992.
- 10. DeSimone S M, White K. The Drosophila erect wing gene, which is important for both neuronal and muscle development, encodes a protein which is similar to the sea urchin P3A2 DNA binding protein. Mol Cell Biol. 1993;13:3641–3649. doi: 10.1128/mcb.13.6.3641.
- 11. Dirksen W P, Sun Q, Rottman F M. Multiple splicing signals control alternative intron retention of bovine growth hormone pre-mRNA. J Biol Chem. 1995;270:5346–5352. doi: 10.1074/jbc.270.10.5346.
- 12. Ebihara K, Masuhiro Y, Kitamoto T, Suzawa M, Uematsu Y, Yoshizawa T, Ono T, Harada H, Matsuda K, Hasegawa T, Masushige S, Kato S. Intron retention generates a novel isoform of the murine vitamin D receptor that acts in a dominant negative way on the vitamin D signalling pathway. Mol Cell Biol. 1996;16:3393–3400. doi: 10.1128/mcb.16.7.3393.
- 13. Efiok B J S, Chiorini J A, Safer B. A key transcription factor for eukaryotic initiation factor-2a is strongly homologous to developmental transcription factors and may link metabolic genes to cellular growth and development. J Biol Chem. 1994;269:18921–18930.
- 14. Fleming R J. Ph.D. thesis. Waltham, Mass: Brandeis University; 1987.
- 15. Fleming R J, DeSimone S M, White K. Molecular isolation and analysis of the erect wing locus in Drosophila melanogaster. Mol Cell Biol. 1989;9:719–725. doi: 10.1128/mcb.9.2.719.
- 16. Fleming R J, Zusman S, White K. Developmental genetic analysis of lethal alleles at the ewg locus and their effects on muscle development in Drosophila melanogaster. Dev Genet. 1983;3:347–363.
- 17. Ghosh S, Lowenstein J M. A multifunctional vector system for heterologous expression of proteins in Escherichia coli: expression of native and hexahistidyl fusion proteins, rapid purification of the fusion proteins, and removal of fusion peptide by Kex2 protease. Gene. 1996;176:249–255. doi: 10.1016/0378-1119(96)00260-0.
- 18. Gomez-Cuadrado A, Martin M, Noel M, Ruiz-Carrillo A. Initiation binding repressor, a factor that binds to the transcription initiation site of the histone h5 gene, is a glycosylated member of a family of cell growth regulators. Mol Cell Biol. 1995;15:6670–6685. doi: 10.1128/mcb.15.12.6670.
- 19. Grabowski P. Splicing regulation in neurons: tinkering with cell-specific control. Cell. 1998;92:709–712. doi: 10.1016/s0092-8674(00)81399-9.
- 20. Gugneja S, Virbasius C A, Scarpulla R C. Nuclear respiratory factors 1 and 2 utilize similar glutamine-containing clusters of hydrophobic residues to activate transcription. Mol Cell Biol. 1996;16:5708–5716. doi: 10.1128/mcb.16.10.5708.
- 21. Hamm J, Mattaj I W. Monomethylated cap structures facilitate RNA export from the nucleus. Cell. 1990;63:109–118. doi: 10.1016/0092-8674(90)90292-m.
- 22. Horabin J I, Schedl P. Regulated splicing of the Drosophila Sex-lethal male exon involves a blockage mechanism. Mol Cell Biol. 1993;13:1408–1414. doi: 10.1128/mcb.13.3.1408.
- 23. Legrain P, Rosbash M. Some cis-acting and trans-acting mutants for splicing target pre-mRNA to the cytoplasm. Cell. 1989;57:573–583. doi: 10.1016/0092-8674(89)90127-x.
- 24. López A J. Developmental role of transcription factor isoforms generated by alternative splicing. Dev Biol. 1995;172:396–411. doi: 10.1006/dbio.1995.8050.
- 25. Mattox W, Baker B S. Autoregulation of the splicing of transcripts from the transformer-2 gene of Drosophila. Genes Dev. 1991;5:786–796. doi: 10.1101/gad.5.5.786.
- 26. Mount S M, Burks C, Hertz G, Stormo G D, White O, Fields C. Splicing signals in Drosophila: intron size, information content, and consensus sequences. Nucleic Acids Res. 1992;20:4255–4262. doi: 10.1093/nar/20.16.4255.
- 27. Padgett R A, Grabowski P J, Konarska M M, Seiler S, Sharp P A. Splicing of messenger RNA precursors. Annu Rev Biochem. 1986;55:1119–1150. doi: 10.1146/annurev.bi.55.070186.005351.
- 27a. Rozen, S., and H. Skaletsky. 1996, posting date. Primer3. [Online.] http://www.genome.wi.mit.edu/genome_software/other/primer3.html [7 April 1999, last date accessed.]
- 28. Samuels M E, Schedl P, Cline T W. The complex set of late transcripts from the Drosophila sex determination gene Sex-lethal encodes multiple related polypeptides. Mol Cell Biol. 1991;11:3584–3602. doi: 10.1128/mcb.11.7.3584.
- 29. Talerico M, Berget S M. Intron definition in splicing of small Drosophila introns. Mol Cell Biol. 1994;14:3434–3445. doi: 10.1128/mcb.14.5.3434.
- 30. Torroja L, Luo L, White K. APPL, the Drosophila member of the APP-family, exhibits differential trafficking and processing in CNS neurons. J Neurosci. 1996;16:4638–4650. doi: 10.1523/JNEUROSCI.16-15-04638.1996.
- 31. Virbasius C A, Virbasius J V, Scarpulla R C. NRF-1, an activator involved in nuclear-mitochondrial interactions, utilizes a new DNA-binding domain conserved in a family of developmental regulators. Genes Dev. 1993;7:2431–2445. doi: 10.1101/gad.7.12a.2431.
- 32. Wang J, Manley J L. Regulation of pre-mRNA splicing in metazoa. Curr Opin Genet Dev. 1997;7:205–211. doi: 10.1016/s0959-437x(97)80130-x.
- 33. Wang J, Shen L, Najafi H, Kolberg J, Matschinsky F M, Urdea M, German M. Regulation of insulin preRNA splicing by glucose. Proc Natl Acad Sci USA. 1997;94:4360–4365. doi: 10.1073/pnas.94.9.4360.
- 34. Zachar Z, Chou T-B, Bingham P M. Evidence that a regulatory gene autoregulates splicing of its transcript. EMBO J. 1987;6:4105–4111. doi: 10.1002/j.1460-2075.1987.tb02756.x.
- 35. Zhang P, Spradling A C. Efficient and dispersed local P element transposition from Drosophila females. Genetics. 1993;133:361–373. doi: 10.1093/genetics/133.2.361.