Author manuscript; available in PMC: 2011 Jan 1.
Published in final edited form as: Cell Res. 2010 May 4;20(7):763–783. doi: 10.1038/cr.2010.64

Dynamic regulation of alternative splicing and chromatin structure in Drosophila gonads revealed by RNA-seq

Qiang Gan 1,3, Iouri Chepelev 2,3, Gang Wei 2, Lama Tarayrah 1, Kairong Cui 2, Keji Zhao 2, Xin Chen 1,4
PMCID: PMC2919574  NIHMSID: NIHMS222587  PMID: 20440302

Abstract

Both transcription and post-transcriptional processes, such as alternative splicing, play crucial roles in controlling developmental programs in metazoans. The recently developed RNA-seq method has brought our understanding of eukaryotic transcriptomes to a new level, because it can resolve both gene expression levels and alternative splicing events simultaneously.

To gain a better understanding of cellular differentiation in gonads, we analyzed mRNA profiles from Drosophila testes and ovaries using RNA-seq. We identified a set of genes that have sex-specific isoforms in wild-type (wt) gonads, including several transcription factors. We found that differentiation of sperm from undifferentiated germ cells induced a dramatic down-regulation of RNA splicing factors. Our data confirmed that RNA splicing events are significantly more frequent in the undifferentiated cell-enriched bag of marbles (bam) mutant testis, but down-regulated upon differentiation in wt testis. Consistent with this, we showed that genes required for meiosis and terminal differentiation in wt testis were mainly regulated at the transcriptional level, but not by alternative splicing. Unexpectedly, we observed an increase in expression of all families of chromatin remodeling factors and histone modifying enzymes in the undifferentiated cell-enriched bam testis. More interestingly, chromatin regulators and histone modifying enzymes with opposite enzymatic activities are co-enriched in undifferentiated cells in testis, suggesting that these cells may possess a dynamic chromatin architecture. Finally, our data revealed many new features of the Drosophila gonadal transcriptomes, and will lead to a more comprehensive understanding of how differential gene expression and splicing regulate gametogenesis in Drosophila. Our data provide a foundation for the systematic study of gene expression and alternative splicing in many interesting areas of germ cell biology in Drosophila, such as the molecular basis for sexual dimorphism and the regulation of the proliferation vs. terminal differentiation programs in germline stem cell lineages. The GEO accession number for the raw and analyzed RNA-seq data is GSE16960.

Keywords: Transcription, alternative splicing, differentiation, testis, ovary, Drosophila

Introduction

Drosophila melanogaster is an excellent model organism for studying the molecular mechanisms underlying cellular differentiation. Of all cell types, germ cells are unique in their ability to produce the next generation of an organism upon fertilization 1. The Drosophila male and female germlines are stereotypical adult stem cell lineages that provide powerful model systems for studying the molecular mechanisms that regulate stem cell maintenance vs. differentiation 2–7. Recent studies using loss-of-function assays for individual genes indicate that a unique chromatin structure in germline stem cells (GSCs) is critical for their self-renewal 8, 9. Previous microarray analysis also demonstrated that GSCs express specific isoforms of genes, which encode components distinct from those in the canonical transcriptional or translational machinery 10. These data implicate alternative splicing in the regulation of GSC maintenance.

Alternative splicing, through which a single pre-mRNA gives rise to different mature mRNAs, significantly increases protein diversity in higher eukaryotes 11, 12. Several well-known examples of how splicing regulates cellular differentiation during Drosophila development come from studying sexual development of somatic tissues, where the X chromosome to autosome ratio is fundamental in the establishment of somatic sexual identity through sex-specific gene splicing 13–17. Previous microarray studies of gene expression in whole flies revealed a broad spectrum of sex-specific expression of alternatively spliced isoforms 18. However, those studies could not resolve the difference between somatic tissues and gonads. Recent exon-specific microarray studies have identified a handful of genes that are sex-specifically spliced in Drosophila gonads, demonstrating that alternative splicing may also contribute to sex determination of gonads 19. On the other hand, the unequal dosage of X chromosomes between males and females necessitates compensatory mechanisms for X chromosomal gene expression. Through a process known as dosage compensation, the transcription level of genes on the single X chromosome in males is doubled to match the level of their expression in females 20. Previous microarray analysis of gene expression has shown that male-biased genes are depleted on the X chromosome 21–23. However, it is not clear whether such a preferential gene distribution applies to other groups of differentially expressed genes, such as stage-specific genes during gametogenesis.

Extensive global gene expression studies over the past decade, which have relied primarily on hybridization-based (e.g., microarray 24) and Sanger sequencing-based (e.g., serial analysis of gene expression (SAGE) 25) techniques, have described transcriptomes in a variety of cell types and developmental stages. However, these techniques are not optimal for resolving different isoforms of genes. Previous studies of alternative splicing mainly relied on expressed sequence tag (EST) sequencing data, with which genome-wide coverage is very difficult to reach. Recently developed massively parallel signature DNA sequencing technologies have been applied to profile transcriptomes in yeast 26, 27, plants 28, and mammalian cells 29–31. The single-nucleotide resolution data from RNA-seq simultaneously provide information about mRNA levels and alternatively spliced isoforms (reviewed in 32).

To obtain a comprehensive understanding of gene expression and splicing events during sexual and cellular differentiation of Drosophila gonads, we used the RNA-seq technique to profile mRNAs in wt gonads and mutant gonads enriched with undifferentiated cells, from both male and female adult flies. Our high-resolution, genome-wide transcriptome data revealed the existence of a large number of genes that exhibit differential splicing in male and female gonads, suggesting that alternative splicing may also be a critical mechanism underlying sexual dimorphism in gonads. A dramatic increase in the expression of all families of chromatin modifying enzymes, including both ATP-dependent chromatin remodeling factors [Brahma associated proteins (BAPs) and ATP-dependent nucleosome remodeling factor (NURF)] and histone modifying enzymes [histone methyltransferases (HMTs), histone demethylases (HDMTs), histone acetyltransferases (HATs), and histone deacetylases (HDACs)], was detected in the undifferentiated cell-enriched testis, indicating a need for the dynamic regulation of chromatin structure in those cells. At the same time, splicing factors were expressed at high levels in the undifferentiated cell-enriched testis, suggesting that alternative splicing is important in maintaining the undifferentiated status of those cells in testis. In contrast, genes required for meiosis and terminal differentiation were mainly regulated at the transcriptional level, and not by alternative splicing, in the differentiated cells of testis. Interestingly, we found that the differentiation genes in testis have a significantly biased distribution on the 2L autosomal arm. In contrast, these genes are the most under-represented on the X chromosome, indicating a potential mechanism for them to escape the X-inactivation, which is thought to occur during late spermatocyte stage. 
In summary, our high resolution and genome-wide RNA-seq data provided a platform to study sexual and cellular differentiation of Drosophila gonads at a molecular level.

Results

Generation of sex- and stage-specific gonadal transcriptomes using RNA-seq

To characterize the molecular signatures of sexual and cellular differentiation during gametogenesis, we isolated female and male gonadal poly(A) RNAs from two types of adult flies (Fig. 1A): ovaries and testes from bam mutant flies, and ovaries and testes from wt flies. In bam mutant gonads, the transition from undifferentiated cells to differentiating cells is abolished and the tissues are enriched with over-proliferative undifferentiated cells, including GSCs, transit-amplifying spermatogonial cells, as well as somatic cells 33, 34. Although the bam mutant gonads have some abnormal features, they are highly enriched with early-stage germ cells and have thus been used in previous microarray analyses to study molecular characteristics of undifferentiated cells in gonads 10, 35.

Figure 1. Generation of gonadal transcriptomes using RNA-seq.

(A) The four samples used for RNA-seq and four pair-wise comparisons in two categories to investigate: 1. sex-biased; 2. stage-specific patterns of gene expression and alternative splicing; (B) The anti-RNA Pol II ChIP-seq results using the bam testis sample. The five gene groups were classified according to their RPKM value based on RNA-seq results. The thresholds for low, moderate and high expression level were set up to reach the same number of genes in each group.

Using these gonadal samples, we performed RNA-seq analysis using the Illumina/Solexa genome analyzer, as described in the Methods and Figure S1A. We retrieved about ten million non-redundant unique reads per sample to reach a sequencing depth sufficient to cover the entire predicted fly transcriptome approximately ten times (Table 1). We normalized transcript abundance for all expressed genes as sequencing reads/per kilobase merged exonic region/per million mapped reads (RPKM value 29, see Fig. S1B and Methods). To evaluate how well the RPKM value reflects transcriptional activity of genes, we used ChIP-seq (chromatin immunoprecipitation followed by high-throughput sequencing) to plot the enrichment of RNA Pol II at the transcription start sites (TSSs) for five groups of genes in the bam testis sample, classified according to their RPKM values (30, Fig. 1B). The Pol II occupancy at TSSs positively correlated with genes’ RPKM values. The silent group (RPKM=0) showed no Pol II enrichment.
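
The RPKM normalization described above can be illustrated with a minimal sketch. It assumes only the definition given in the text (reads per kilobase of merged exonic region per million mapped reads); the function name `rpkm` and the example read count and exon length are hypothetical and are not taken from the paper's pipeline.

```python
def rpkm(read_count, exonic_length_bp, total_mapped_reads):
    """Reads per kilobase of merged exonic region per million mapped reads."""
    if exonic_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exonic length and library size must be positive")
    kilobases = exonic_length_bp / 1_000          # merged exonic length in kb
    millions = total_mapped_reads / 1_000_000     # library size in millions of reads
    return read_count / (kilobases * millions)


# Hypothetical gene: 500 reads on a 2,000 bp merged exonic region, in a library
# of 10,163,916 reads (the subsample size listed in Table 1).
example_rpkm = rpkm(500, 2_000, 10_163_916)
print(round(example_rpkm, 2))
```

Because both the gene length and the library size appear in the denominator, values computed this way are comparable between genes of different lengths and between samples of different sequencing depth.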

Table 1.

Summary of the Drosophila RNA-seq results.

| Mapping summary | bam testis | wt testis | bam ovary | wt ovary |
|---|---|---|---|---|
| Total reads | 41730579 | 30013546 | 32806238 | 29736261 |
| Low-quality and no matched reads | 14736702 | 8645118 | 14287662 | 8220597 |
| Reads with multiple matches (>1 hits) | 6952842 | 2667877 | 1862272 | 1094363 |
| Uniquely mapped reads (genomic+junction) | 20041035 | 18700551 | 16656304 | 20421301 |
| Uniquely mapped non-redundant reads (genomic+junction) | 13028364 | 10853191 | 11765719 | 11206968 |
| Uniquely mapped non-redundant reads (genomic+junction) subsample | 10163916 | 10163916 | 10163916 | 10163916 |
| Uniquely mapped non-redundant reads (genomic) | 9780037 | 9827900 | 9763089 | 9774561 |
| Uniquely mapped non-redundant reads (junction) | 383879 | 336016 | 400827 | 389355 |
| Unique and non-redundant reads mapping to annotated exons | 9311716 | 9416811 | 9216015 | 9444989 |
| Unique and non-redundant reads mapping to annotated introns | 208725 | 174567 | 274958 | 152291 |
| Unique and non-redundant reads mapping to intergenic region | 259596 | 236522 | 272116 | 177281 |
| ENSEMBL genes with at least 1 RPKM | 9383 | 10735 | 9486 | 8434 |
| ENSEMBL genes with at least 5 RPKM | 7777 | 8233 | 7828 | 7147 |
| Alternatively spliced genes | 989 | 805 | 988 | 879 |
| Total CFTRs | 2123 | 2455 | 2487 | 1500 |
| CFTRs in intron* | 719 | 922 | 959 | 501 |
| CFTRs in intergenic regions | 1404 | 1533 | 1528 | 999 |
| CFTRs supported by mRNA or EST | 1402 | 1149 | 1727 | 1241 |
| CFTRs predicted by N-SCAN or CONTRAST | 332 | 370 | 408 | 304 |
| CFTRs hit to NCBI non-redundant protein sequence | 227 | 378 | 223 | 150 |
| Average length of CFTRs (bp) | 419 | 348 | 417 | 415 |

* The RPKM of intronic CFTRs is ≥ 10% of the neighboring gene’s (or genes’) RPKM.

Using the RPKM values of individual genes (Table S1), we performed four pair-wise comparisons between different samples in two categories (Fig. 1A): (1) sex-biased genes in bam mutant and wt gonads, respectively, to examine key regulators for sexual dimorphism in gonads; (2) genes that are expressed in a stage-specific manner in male and female gonads, to study stem cell or undifferentiated cell-enriched factors, as well as genes required for cellular differentiation of gametes.

The wt gonads showed a pronounced sex-biased gene expression pattern, reflected in a low correlation coefficient (r=0.182 for the wt ovary vs. wt testis comparison). The sexual difference in undifferentiated cell-enriched bam gonads was much smaller (r=0.716 for the bam ovary vs. bam testis comparison). In addition, more differential gene expression was observed in testes than in ovaries when comparing gonads with different-staged germline and somatic cells: testis samples showed much higher variation (r=0.242 for bam vs. wt testis) than ovary samples (r=0.865 for bam vs. wt ovary). A possible explanation for this difference is that we used flies within one day post-eclosion, which is sufficient for testes to reach full-term spermatogenesis but may be too early for ovaries to contain mature eggs 36, 37.

To further evaluate the data quality, we compared our RNA-seq results with existing microarray gene expression profiling data. We found that our data were largely consistent with the microarray data (22 and http://www.flyatlas.org/). For example, in the wt testis vs. wt ovary comparison (Fig. S2), greater than 77% (in most cases, greater than 89%) of the sex-biased genes identified by microarrays overlapped with the same category of genes identified by RNA-seq, indicating that the RNA-seq method largely recapitulated the microarray results and will be useful for identifying additional differentially expressed genes in gonads. Our data also showed that approximately 56% of all genes in wt gonads and 21% in bam mutant gonads showed sex-biased expression patterns (≥ 2-fold difference between sexes). These data were consistent with previous work indicating that approximately half of the Drosophila genes have a sex-biased expression pattern 18 and that wt gonads contain most of them 38–40.

Enrichment of chromatin regulators in undifferentiated cell-enriched bam testis

The differentiation of germ cells, especially in testis, is characterized by a dramatic morphological change and genome-wide chromatin condensation with replacement of histones by protamines 41. To probe the molecular circuitries regulating gamete differentiation programs, we compared gene expression profiles between wt and bam mutant gonads in both males and females.

These comparisons indicated that spermatogenesis has a distinct gene expression profile switch from undifferentiated cell-enriched bam testis to fully differentiated wt testis: We found that 501 and 1,894 genes were uniquely expressed in bam and wt testis, respectively (Fig. 2A). These results were consistent with the thought that a large pool of meiotic and terminal differentiation genes are transcribed in spermatocytes (42, Fuller, M.T. and White-Cooper, H. personal communications). Examples of such genes included the testis-specific TAF homologs (tTAFs) and components of a testis-specific version of the MIP/DREAM complex (tMAC), which are remarkably up-regulated in wt testis (Fig. 2B) 42–45.

Figure 2. Enrichment of the chromatin regulators in undifferentiated cell-enriched bam testis.

(A) Heat map showing genes that are uniquely expressed in both bam testis vs. wt testis, and bam ovary vs. wt ovary comparisons. Unique expression means that a certain gene is expressed in one sample (RPKM ≥ 1) but silent in the other sample (RPKM < 0.5). The RPKM cutoffs for expressed and silent genes are based on 98. Heat scale: log2 RPKM. To calculate the log2 RPKM value of each gene, a pseudo-count of 1 was added to its original RPKM value. (B) Enriched chromatin remodeling/modifying factors and histone modifying enzymes in bam testis. Abbreviations: PcG - Polycomb group complex, PRC1 and PRC2 - Polycomb repressive complex 1 and 2, TrxG - Trithorax group complex, HMT - histone methyltransferase, HAT - histone acetyltransferase, HDAC - histone deacetylase, tTAFs - testis TAF homologs, tMAC - a testis-specific version of the MIP/DREAM complex, meiotic genes and TD - terminal differentiation genes for spermiogenesis. (C) In situ data using antisense riboprobes that recognize the trx (HMT) gene and lid (HDMT) gene in wt testis, respectively. There is a co-enrichment of the trx and lid transcripts in undifferentiated cells located at the tip of the wt testis, indicated by the black arrows. (D) In situ data using antisense riboprobes that recognize the Pcaf (HAT) gene and Rpd3 (HDAC) gene in wt testis, respectively. There is a co-enrichment of the Pcaf and Rpd3 transcripts in undifferentiated cells located at the tip of the wt testis, indicated by the black arrows.
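
The "unique expression" rule and the heat scale from the Figure 2A legend can be sketched as follows. The thresholds (RPKM ≥ 1 for expressed, RPKM < 0.5 for silent) and the pseudo-count of 1 come from the legend; the function and constant names are hypothetical, and the sketch is illustrative rather than the code used to draw the figure.

```python
import math

EXPRESSED_MIN = 1.0  # RPKM >= 1 counts as expressed (Figure 2A legend)
SILENT_MAX = 0.5     # RPKM < 0.5 counts as silent (Figure 2A legend)


def unique_expression(rpkm_a, rpkm_b):
    """Return 'a' or 'b' if the gene is uniquely expressed in that sample, else None."""
    if rpkm_a >= EXPRESSED_MIN and rpkm_b < SILENT_MAX:
        return "a"
    if rpkm_b >= EXPRESSED_MIN and rpkm_a < SILENT_MAX:
        return "b"
    return None  # expressed in both, silent in both, or in the gap between cutoffs


def heat_value(rpkm):
    """log2 of RPKM after adding a pseudo-count of 1, so that RPKM = 0 maps to 0."""
    return math.log2(rpkm + 1.0)
```

Under this rule a gene with an RPKM between 0.5 and 1 in either sample is never called uniquely expressed, which leaves a buffer between the "silent" and "expressed" classes.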

Previous studies have suggested that the undifferentiated status of female GSCs is maintained by a unique chromatin structure 8, 9, 46. To examine the potential epigenetic mechanisms that maintain the undifferentiated status of male GSCs and spermatogonia, we compared the expression levels of all families of chromatin remodeling factors and histone modifying enzymes in bam testis vs. wt testis: We found that 62.5% (20 out of 32) of chromatin remodeling factors and 70.2% (40 out of 57) of histone modifying enzymes are highly enriched in the undifferentiated cells of bam testis (Table 2). For example, BAP60 and BAP55, components of the ATP-dependent SWI/SNF family of chromatin remodeling BAP complexes 47, 48, were elevated 6.1-fold and 5.8-fold, respectively, in bam testis compared to wt testis. Nurf-38, a subunit of the NURF complexes 49, 50, was elevated 3.4-fold in bam testis compared to wt testis (Table 2). The Polycomb group (PcG) and Trithorax group (TrxG) complexes, as well as other histone modifying enzymes, including HMTs, HDMTs, HATs, and HDACs, were also remarkably elevated in undifferentiated cells in bam testis. Interestingly, chromatin modifying enzymes that have antagonizing functions were co-enriched in bam testis. For example, the activities of the PcG and TrxG protein complexes counteract each other in the determination of cell fate 51, 52. We detected a 4.1-fold enrichment of a key PcG component Enhancer of zeste [E(z)] and a 3.3-fold enrichment of a critical TrxG component trithorax (trx) in bam testis, compared to wt testis (Fig. 2B and Table 2). In addition, genes encoding histone modifying enzymes that have opposite biochemical activities, such as the H3K4me3 HMT (e.g., trx) 53 and HDMT (e.g., little imaginal disc or lid: 5.6-fold enrichment) 54–56, as well as HATs (e.g., Pcaf: 4.9-fold enrichment) 57 and HDACs (e.g., Rpd3: 15.4-fold enrichment) 58, were both highly expressed in bam testis compared to wt testis. 
Interestingly, the lid gene was genetically identified as a TrxG gene 59, indicating that it may act with trx in a coordinated manner to antagonize PcG activities and regulate male germ cell differentiation. Recent studies have also demonstrated that the HAT and HDAC enzymes act together to regulate gene transcription 60, and different histone enzymes associate with each other and affect their respective enzymatic activities 61. Consistent with those findings, here our data demonstrated that chromatin modifying enzymes with opposite activities were co-enriched and may act cooperatively to regulate the undifferentiated status of male germ cells.

Table 2. Chromatin regulators are enriched in bam testis compared to wt testis.

The chromatin remodeling factors were retrieved from http://web.wi.mit.edu/young/pub/chromatin_remodeling.html; the histone modifying enzymes were from 99–107 (All chromatin remodeling factors and histone modifying enzymes listed here are supported by published research papers.). For each chromatin regulator, the fold change in bam/wt gonads is listed, as well as the RPKM values in bam and wt gonads, respectively. Red numbers indicate enrichment in bam gonads, and blue numbers indicate enrichment in wt gonads. Most of the chromatin regulators have high expression in bam testis (RPKM > 10), therefore the down-regulation in wt testis is significant.

| Gene Name | bam/wt testis | bam testis (RPKM) | wt testis (RPKM) | Gene Name | bam/wt testis | bam testis (RPKM) | wt testis (RPKM) | Gene Name | bam/wt testis | bam testis (RPKM) | wt testis (RPKM) | Gene Name | bam/wt ovary | bam ovary (RPKM) | wt ovary (RPKM) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Rpd3 | 15.35 | 158.02 | 10.29 | CG11033 | 4.70 | 43.67 | 9.30 | trr | 3.05 | 26.58 | 8.72 | tna | 2.81 | 48.70 | 17.31 |
| ph-p | 13.46 | 17.27 | 1.28 | Suv4-20 | 4.56 | 78.75 | 17.25 | mof | 2.90 | 34.20 | 11.80 | Iswi | 2.29 | 122.00 | 53.20 |
| Art4 | 11.92 | 43.10 | 3.62 | kis | 4.45 | 120.73 | 27.16 | tlk | 2.89 | 80.46 | 27.86 | SuUR | 0.29 | 8.37 | 28.63 |
| CG3654 | 8.45 | 48.74 | 5.77 | Mi-2 | 4.30 | 138.92 | 32.32 | Snr1 | 2.82 | 59.58 | 21.12 | Art4 | 0.36 | 26.46 | 73.45 |
| tara | 8.38 | 17.31 | 2.07 | osa | 4.20 | 174.52 | 41.54 | Su(var)3-9 | 2.74 | 85.46 | 31.20 | Ada2b | 0.37 | 28.12 | 75.50 |
| mod | 8.24 | 127.82 | 15.52 | CG13902 | 4.19 | 22.48 | 5.36 | Su(var)2-HP2 | 2.58 | 37.82 | 14.65 | Sgf11 | 0.48 | 14.46 | 30.38 |
| egg | 6.77 | 76.40 | 11.29 | ial | 4.11 | 56.06 | 13.62 | CG2982 | 2.52 | 72.24 | 28.65 | borr | 0.48 | 28.40 | 59.37 |
| MED21 | 6.53 | 14.72 | 2.25 | E(z) | 4.06 | 51.80 | 12.75 | wda | 2.48 | 18.39 | 7.42 | z | 0.50 | 22.94 | 46.21 |
| Psc | 6.31 | 15.90 | 2.52 | not | 4.02 | 66.83 | 16.62 | Hcf | 2.32 | 43.82 | 18.91 | Scm | 0.49 | 23.64 | 48.50 |
| Rpb4 | 6.19 | 20.81 | 3.36 | Bre1 | 3.96 | 52.01 | 13.13 | Su(var)3-3 | 2.27 | 28.13 | 12.39 | Set2 | 0.50 | 29.84 | 60.13 |
| BAP60 | 6.12 | 69.93 | 11.43 | CG12879 | 3.90 | 15.40 | 3.95 | Sce | 2.17 | 47.07 | 21.73 | | | | |
| esc | 6.03 | 51.57 | 8.55 | CG8165 | 3.87 | 10.56 | 2.73 | Invadolysin | 2.14 | 50.45 | 23.56 | | | | |
| BAP55 | 5.77 | 87.57 | 15.17 | phol | 3.77 | 68.00 | 18.03 | Su(var)205 | 2.14 | 101.00 | 47.23 | | | | |
| lid | 5.56 | 149.49 | 26.87 | pr-set7 | 3.52 | 44.45 | 12.64 | Taf1 | 2.05 | 46.89 | 22.86 | | | | |
| Hdac3 | 5.56 | 68.80 | 12.38 | Nurf-38 | 3.39 | 191.34 | 56.43 | ash1 | 0.49 | 18.95 | 38.48 | | | | |
| Art1 | 5.53 | 140.78 | 25.47 | Sir2 | 3.37 | 39.03 | 11.58 | Mst35Ba | 0.49 | 0.00 | 84.60 | | | | |
| mor | 5.39 | 103.02 | 19.12 | trx | 3.28 | 29.73 | 9.07 | Kdm4B | 0.49 | 54.43 | 111.24 | | | | |
| Dp1 | 5.37 | 157.07 | 29.24 | Pc | 3.28 | 12.13 | 3.70 | tna | 0.21 | 42.48 | 205.15 | | | | |
| Elp3 | 5.30 | 57.39 | 10.83 | Nurf55 | 3.21 | 123.73 | 38.50 | Mst35Bb | 0.01 | 0.00 | 135.32 | | | | |
| Acf1 | 4.97 | 55.55 | 11.17 | HDAC4 | 3.16 | 31.95 | 10.12 | CG31281 | 0.00 | 0.00 | 282.35 | | | | |
| Pcaf | 4.86 | 50.72 | 10.44 | Su(z)12 | 3.10 | 75.43 | 24.30 | | | | | | | | |
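
As a worked check of how the fold changes quoted in the text relate to the RPKM values in Table 2, the sketch below recomputes bam/wt ratios for four genes. The RPKM pairs are copied from Table 2; the helper name `fold_change` and the choice to return infinity when the wt value is zero are assumptions made for illustration. Because the tabulated RPKM values are rounded, a recomputed ratio can differ from the tabulated fold change in the second decimal place.

```python
import math


def fold_change(bam_rpkm, wt_rpkm):
    """Ratio of bam to wt expression; infinite when the gene is silent in wt."""
    if wt_rpkm == 0:
        return math.inf
    return bam_rpkm / wt_rpkm


# (bam testis RPKM, wt testis RPKM) pairs copied from Table 2.
table2_testis = {
    "Rpd3": (158.02, 10.29),
    "BAP60": (69.93, 11.43),
    "lid": (149.49, 26.87),
    "trx": (29.73, 9.07),
}

for gene, (bam, wt) in table2_testis.items():
    print(gene, round(fold_change(bam, wt), 1))
```

Rounded to one decimal place, these ratios reproduce the enrichments quoted in the text for Rpd3 (15.4-fold), BAP60 (6.1-fold), lid (5.6-fold) and trx (3.3-fold).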

To further validate these RNA-seq results, we performed in situ hybridization using gene-specific antisense riboprobes for a number of chromatin regulators. Indeed, we found co-enrichment of the trx and lid transcripts in undifferentiated cells located at the tip of the wt testis (Fig. 2C). The same co-enrichment pattern was also obtained for the antagonizing Pcaf and Rpd3 genes (Fig. 2D). These data confirmed the RNA-seq results and implied that the chromatin status of undifferentiated cells in testis may be dynamically regulated by balanced activities of counteracting chromatin regulators and histone modifying enzymes. In contrast, among the 925 genes that are uniquely expressed in bam ovary (Fig. 2A), only two chromatin regulator genes (tna and Iswi) were enriched in the bam ovary compared to wt ovary. These observations were consistent with previous reports that female GSCs have a relatively repressed transcriptome 10, 62 and possibly a repressive chromatin landscape.

In summary, our data suggest that the undifferentiated state of cells in bam testis is associated with high expression of a cohort of chromatin regulators. These regulators may be critical for maintaining the unique molecular identities and cellular behavior of undifferentiated male GSCs and spermatogonial cells, which have been shown to retain the plasticity to de-differentiate into GSC-like cells 7, 63, 64.

RNA-splicing factors are highly enriched in undifferentiated cell-enriched bam testis

Interestingly, we found significant enrichment of splicing factors in the bam testis: Approximately 56.9% of genes that encode characterized or putative splicing factors were enriched at least 2-fold in bam testis relative to wt testis (177 out of 311, Fig. 3A and Table S2), which was significantly higher than random distribution ($P < 10^{-15}$). The bam testis-enriched splicing factors included seven genes encoding SR proteins, which regulate splice site selection (Table S2). In contrast, only 8.4% of splicing factors were enriched in wt testis compared to bam testis (26 out of 311, Fig. 3A and Table S2), which was significantly lower than random distribution ($P < 10^{-8}$). These data suggested that the undifferentiated cells in testis may be associated with a genome-wide increase of splicing activities.

Figure 3. Enhanced splicing activities in undifferentiated cell-enriched bam testis.

(A) Expression of splicing factors is up-regulated in bam testis compared to wt testis. The red and green lines are the 2-fold cutoff lines. 177 splicing factors are 2-fold more enriched in bam testis relative to wt testis, while 26 splicing factors are 2-fold more enriched in wt testis relative to bam testis. The enrichment of splicing factors in bam testis is significant ($P < 10^{-15}$). The P value was calculated using the chi-square test. (B) Enhanced overall splicing activities in bam testis compared to wt testis, shown by the CDF plot. The P value was calculated using the one-sided Kolmogorov-Smirnov test. (C) Percentage of differentiation genes as single-isoform vs. multi-isoform genes [Differentiation genes are defined as silent in bam testis (RPKM in bam testis < 0.5) but expressed in wt testis (RPKM in wt testis ≥ 1).]. Only 13% of the differentiation genes are multi-isoform genes, which is significantly ($P < 10^{-15}$) lower than the 23% genome-wide proportion of multi-isoform genes, based on the Ensembl annotation. The P value was calculated using the one-sided Fisher exact test.

Enhanced alternative splicing activities in undifferentiated cell-enriched bam testis

To test the above hypothesis, we used gene entropy (S) as a measure of overall splicing complexity at any given splicing locus (Methods and 65). We computed gene entropy using the formula $S = -\sum_k p_k \log_2 p_k$, where $p_k$ is the frequency of a particular transcribed isoform k, estimated from the RNA-seq data (Methods and Table S3). We then used the normalized entropy of genes to plot a cumulative distribution function (CDF) for each sample (Methods).
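
A minimal sketch of the entropy measure may help make it concrete. It assumes the standard Shannon form of gene entropy, i.e. the negative sum over isoforms of p_k times log2 of p_k, with p_k taken as each isoform's share of the gene's total abundance. The normalization step mentioned in the text is described in the Methods and is not reproduced here, and the function names `gene_entropy` and `empirical_cdf` are hypothetical.

```python
import math


def gene_entropy(isoform_abundances):
    """Shannon entropy (in bits) of a gene's isoform frequency distribution."""
    total = sum(isoform_abundances)
    if total <= 0:
        return 0.0  # gene not detected: treat as no splicing complexity
    entropy = 0.0
    for abundance in isoform_abundances:
        if abundance > 0:  # isoforms with zero frequency contribute nothing
            p_k = abundance / total
            entropy -= p_k * math.log2(p_k)
    return entropy


def empirical_cdf(values):
    """Sorted (value, fraction of values <= value) pairs for a CDF plot."""
    ordered = sorted(values)
    n = len(ordered)
    return [(value, (rank + 1) / n) for rank, value in enumerate(ordered)]
```

A gene expressed as a single isoform has an entropy of 0, and a gene with two equally abundant isoforms has an entropy of 1 bit, so a sample whose CDF is shifted to the right contains more genes with complex isoform usage.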

We first examined alternative splicing activities in bam gonads and compared the results with those in wt gonads. Indeed, we found that there were 23% more alternatively spliced genes in bam testis compared to wt testis (Table 1). Consistently, the CDF plot of bam testis shifted to the right side of wt testis, indicating an overall higher complexity of transcripts in bam testis (P < 0.001, Fig. 3B). Similar but less significant results were obtained from the bam ovary vs. wt ovary comparison (P < 0.02, data not shown).

The significantly decreased expression of splicing factors in wt testis (Fig. 3A) implied that single-isoform genes may be common among terminal differentiation genes in testis (RPKM < 0.5 in bam testis and RPKM ≥ 1 in wt testis). Indeed, we found that multi-isoform genes are significantly ($P < 10^{-15}$) under-represented among terminal differentiation genes in testis (Fig. 3C): Only 13% of testis differentiation genes are multi-isoform genes, whereas 23% of all Ensembl annotated genes are multi-isoform genes. In addition, when we computed the ratio (percentage) of the intronic region size relative to the total gene size, we found a unique feature of testis differentiation genes: The mean ratio of intron size to entire gene size for all Drosophila genes is approximately 25%. However, for the terminal differentiation genes in testis, this ratio was significantly reduced to 19.3% ($P < 10^{-16}$). These intriguing results indicated that this particular group of genes might have evolved to become the simplest single-isoform ones, which could be the molecular mechanism that ensures efficient and coordinated transcription of a cohort of genes required for meiosis and spermiogenesis. However, this finding appeared to be in contrast with previous studies in mammals, which have suggested that alternative splicing is prevalent in testis 66, 67. This apparent discrepancy could be due to mixed-stage germ cells in the mammalian studies, incomplete coverage of the mammalian EST libraries, or a difference between mammals and flies that arose during evolution.
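
The under-representation test named in the Figure 3C legend (a one-sided Fisher exact test) can be sketched with the standard library alone, since the one-sided P value for depletion equals the lower tail of a hypergeometric distribution. The counts used below are hypothetical, chosen only to mirror the reported 13% and 23% proportions; the actual gene counts behind the reported P value are not given in this section, so the result here is not the paper's value.

```python
from math import comb


def fisher_lower_tail(population, successes, draws, observed):
    """One-sided P value for under-representation.

    Probability of seeing `observed` or fewer marked items when `draws` items
    are sampled without replacement from `population` items of which
    `successes` are marked (hypergeometric lower tail).
    """
    failures = population - successes
    smallest = max(0, draws - failures)          # fewest marked items possible
    largest = min(observed, draws, successes)    # cap at what can be drawn
    favourable = sum(
        comb(successes, k) * comb(failures, draws - k)
        for k in range(smallest, largest + 1)
    )
    return favourable / comb(population, draws)


# Hypothetical counts: 1,000 annotated genes, 230 of them multi-isoform (23%);
# 100 differentiation genes, 13 of them multi-isoform (13%).
p_value = fisher_lower_tail(population=1000, successes=230, draws=100, observed=13)
print(p_value)
```

With gene sets of genome scale the same proportions give a far smaller P value, because the test statistic depends on the absolute counts and not only on the percentages.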

Sex-specific isoforms in bam and wt gonads

Alternative splicing is well known to play a critical role in sexual differentiation in Drosophila somatic tissues 17, 68. However, sex determination of germ cells in gonads is not as well understood, but is thought to take a different route and rely on signals emanating from surrounding somatic cells 69, 70. Interestingly, recent studies have demonstrated that alternative splicing may also contribute to sexual differentiation of gonads, with a set of representative examples 19. Here we used the RNA-seq data to identify sex-specific isoforms in gonads on a genome-wide scale. We analyzed uniquely detectable isoforms of each alternatively spliced gene in all four samples (Methods and Table S3). We then plotted each alternatively spliced gene for the presence or absence of sex-specific isoforms in both bam testis vs. bam ovary, and wt testis vs. wt ovary comparisons, respectively. Our analysis revealed 462 genes with sex-specific isoforms in the bam testis vs. bam ovary comparison (Fig. 4A and Table S4-1). We also identified 614 genes that have sex-specific isoforms in wt testis and wt ovary (Fig. 4B and Table S4-2). Surprisingly, p53 was among these genes with different isoforms in wt male and female gonads (Table 3 and Table S4-2). Using real-time PCR assays with isoform-specific primers, we found that the p53-RA isoform was 2-fold more enriched in wt ovary than in wt testis, whereas the p53-RB isoform was 14.6-fold more enriched in wt testis than in wt ovary (Fig. 4C–D). The p53 gene is well known for its regulation of cell cycle and programmed cell death in response to cellular stresses 71, 72. The Drosophila p53 is also known to regulate primordial germ cell development in embryos 73. Our discovery that there are distinct p53 isoforms in wt testis and ovary implied potentially dimorphic roles of p53 in regulating apoptosis, which could contribute to distinct cellular differentiation pathways of male and female gametes.

Figure 4. Visualization of sex-specific isoforms of individual genes in bam and wt testis vs. ovary comparisons.

(A) Genes with sex-specific isoforms in bam testis vs. bam ovary. All genes with ≥2 uniquely detectable isoforms (Methods) were analyzed for the presence (red) and absence (blue) of sex-specific isoforms. Presence (red) in the “both” category indicates that the corresponding gene has at least one isoform expressed in both bam testis and bam ovary samples. The gene numbers of each class are labeled on the Y-axis of the panel. The RRR, RBR, BRR and RRB classes had at least one sex-specific isoform and were therefore labeled with black numbers (see Table S4-1 for details). (B) Genes with sex-specific isoforms in wt testis vs. wt ovary. The presence (red) and absence (blue) of sex-specific isoforms for each individual gene were analyzed in wt testis and wt ovary samples; the presence (red) in both somatic samples indicated a lack of sex specificity for a particular isoform(s) of an individual gene (Table S4-2). (C) The wt ovary-biased p53 isoform is CG33336-RA or FBtr0084359. The wt testis-biased p53 isoform is CG33336-RB or FBtr0084360. (D) Real-time RT-PCR using isoform-specific primer sets showed that p53-RA is about 2-fold more enriched in wt ovary compared to wt testis, while p53-RB is approximately 14.6-fold more enriched in wt testis compared to wt ovary. The levels of both p53-RA and p53-RB were normalized to the total p53 level, using a primer set that amplifies a common region of both isoforms (Methods). The total p53 level in wt ovary and wt testis is also shown, normalized to the sample with the smaller RPKM (wt testis in this case, Table S1). (E) The wt ovary-biased Rab14 isoform is CG4212-RA or FBtr0080626. The wt testis-biased Rab14 isoform is CG4212-RB or FBtr0080627. (F) Real-time RT-PCR using isoform-specific primer sets showed that Rab14-RA is about 13.9-fold more enriched in wt ovary compared to wt testis, while Rab14-RB is approximately 171.4-fold more enriched in wt testis compared to wt ovary. The levels of both Rab14-RA and Rab14-RB were normalized to the total Rab14 level, using a primer set that amplifies a common region of both isoforms (Methods). The total Rab14 level in wt ovary and wt testis is also shown, normalized to the sample with the smaller RPKM (wt ovary in this case, Table S1).

Table 3. Genes that exhibit sex-specific or sex-biased isoforms in bam and wt gonads.

Genes that encode transcription factors or putative transcription factors were highlighted in red. Genes that encode splicing factors or putative splicing factors were highlighted in yellow.

bam testis vs. bam ovary | wt testis vs. wt ovary | wt testis vs. wt ovary
CG18812 (unknown) | CG6151 (unknown) | mge (transmembrane transporter activity)
Jbug (actin binding) | CG6767 (ribose phosphate diphosphokinase activity) | Mlf (unknown)
ltd (GTPase activity) | CG6921 (unknown) | mud (protein binding)
Mctp (unknown) | CG8709 (unknown) | Pabp (mRNA 3’-UTR binding)
PRL-1 (tyrosine phosphatase activity) | cnn (microtubule binding) | Pfk (6-phosphofructokinase activity)
wt testis vs. wt ovary | CoRest (chromatin binding) | PhKgamma (phosphorylase kinase activity)
Alh (transcription factor activity) | Cpr (NADPH-hemoprotein reductase activity) | Picot (high affinity inorganic phosphate:sodium symporter activity)
aop (protein binding) | CycB (cyclin-dependent protein kinase regulator activity) | qua (actin binding)
aret (RNA binding) | eIF-4E (translation initiation factor activity) | Rab14 (GTPase activity)
BicC (protein binding) | exu (RNA localization) | rdx (protein binding)
bun (protein homodimerization activity) | fs(2)ltoPP43 (unknown) | Rpn6 (endopeptidase activity)
capu (microtubule binding) | garz (guanyl-nucleotide exchange factor activity) | Rtnl1 (unknown)
Cbl (ligase activity) | granny-smith (aminopeptidase activity) | sgg (protein kinase activity)
CG12360 (unknown) | Hmgcr (NADPH activity) | sle (unknown)
CG1244 (ATPase activity) | I-2 (phosphoprotein phosphatase inhibitor activity) | smg (translation repressor activity)
CG14619 (ubiquitin-specific protease activity) | Imp (mRNA binding) | SPoCk (manganese-transporting ATPase activity)
CG1640 (L-alanine:2-oxoglutarate aminotransferase activity) | p53 (transcription factor activity) | SRPK (protein kinase activity)
CG17034 (ATPase activity) | jog (unknown) | ssh (protein phosphatase activity)
CG18135 (protein binding) | kdn (citrate synthase activity) | tamo (protein binding)
CG1882 (catalytic activity) | kis (ATP-dependent helicase activity) | toc (protein kinase binding)
CG3074 (endopeptidase activity) | klar (protein binding) | Tpi (triose-phosphate isomerase activity)
CG33523 (structural molecule activity) | Lk6 (protein kinase activity) | Tpr2 (heat shock protein binding)
CG3994 (zinc ion transmembrane transporter activity) | lola (transcription factor activity) | vig (mRNA binding)
CG4238 (ubiquitin-protein ligase activity) | ltd (GTPase activity) | CG5315 (hormone binding)

Another interesting example is the Rab14 gene (Table 3 and Table S4-2). Our real-time PCR analyses revealed that the two Rab14 isoforms showed opposite enrichment in wt ovary (Rab14-RA, with 13.9-fold enrichment in ovary) and wt testis (Rab14-RB, with 171.4-fold enrichment in testis, Fig. 4E–F). The Rab genes encode small guanosine triphosphatases (GTPases), which have important roles in regulating vesicle trafficking and actin filament assembly in flies 74, 75. Interestingly, certain Rab genes, such as Rab11, have been shown to maintain the identities of female GSCs 76. It will be intriguing to explore whether Rab14 regulates germ cell dimorphism or has dimorphic functions through its distinct isoforms. More examples of genes that exhibit sex-specific (or sex-biased) isoforms are shown in Figure S3. In summary, our data indicate that sex-specific isoforms may contribute to sex-specific gametogenesis, and provide a splicing atlas for further studies.

Splicing factors (or putative splicing factors) themselves exhibit sex- and stage-specific isoforms in gonads

It has been reported that splicing factors themselves are regulated by alternative splicing during C. elegans development 77. To obtain a comprehensive understanding of this phenomenon in Drosophila gonads, we searched all 311 genes that encode characterized or putative splicing factors (Table S2) for distinct isoforms in both bam and wt gonads. Indeed, we found that 21 of them have differential isoforms in at least one of the four pair-wise comparisons (Table S4-1 to Table S4-4). One interesting example is the exuperantia (exu) gene (Table 3), which regulates RNA localization 78, 79 and possibly splicing as well (Table S2). The exu gene exhibits a germline-specific expression pattern in adult flies 80 and has been demonstrated to undergo male germ cell-specific splicing under the control of the Transformer 2 (Tra2) splicing factor 81. Our RNA-seq data confirmed previous reports that exu-RC is a testis-specific isoform. In addition, using real-time PCR assays, we found that exu-RA is a testis-biased isoform (with 4.9-fold enrichment in wt testis) and exu-RB is an ovary-biased isoform (with 94.3-fold enrichment in wt ovary, Fig. 5A–B). The differences among all exu transcripts are located in the untranslated regions (UTRs), which may lead to different Exu protein levels in male and female germ cells. It will be interesting to examine whether such a difference allows Exu to execute sex-specific functions, such as splicing, in germ cells.

Figure 5. Examples of characterized or putative splicing factors that have sex- or stage-specific isoforms.

(A) UCSC snapshots show that the exu gene has sex-specific and sex-biased isoforms: exu-RC is a wt testis-specific isoform, exu-RA is a wt testis-biased isoform and exu-RB is a wt ovary-biased isoform. (B) Real-time RT-PCR results showed that exu-RA is 4.9-fold more enriched in wt testis compared to wt ovary, while exu-RB is approximately 94.3-fold more enriched in wt ovary compared to wt testis. The levels of both exu-RA and exu-RB were normalized to the total exu level, using a primer set that amplifies a common region of all three isoforms (Methods). The total exu level in wt ovary and wt testis is also shown, normalized to the sample with the smaller RPKM (wt ovary in this case, Table S1). (C) UCSC snapshots show that the imp gene is expressed in both undifferentiated spermatogonia-enriched bam testis and differentiating spermatid-containing wt testis. The imp gene has stage-specific isoforms: imp-RA/RB/RC are bam testis-specific, and imp-RG/RH are wt testis-specific. (D) Real-time RT-PCR results showed that imp-RA/RB/RC are 40.2-fold more enriched in bam testis compared to wt testis, while imp-RG/RH are approximately 1,966-fold more enriched in wt testis compared to bam testis. The levels of both subsets of imp isoforms were normalized to the total imp level, using a primer set that amplifies a common region of all isoforms (Methods). The total imp level in bam testis and wt testis is also shown, normalized to the sample with the smaller RPKM (bam testis in this case, Table S1).

Another interesting example is a splicing factor with stage-specific isoforms (Table S4-3 and Table S5). The IGF-II mRNA-binding protein (Imp) is a component of the Drosophila spliceosomal complex 82. The Imp gene is expressed in both undifferentiated spermatogonia and differentiating spermatids 83. Our RNA-seq data and real-time PCR validation experiments revealed both bam testis-specific isoforms (imp-RA, imp-RB and/or imp-RC with 40.2-fold enrichment in bam testis) and wt testis-specific isoforms (imp-RG and/or imp-RH with 1,966-fold enrichment in wt testis, Fig. 5C–D), indicating that Imp may regulate splicing at both stages using distinct isoforms. Overall, our results suggest that a subset of splicing factors (or putative splicing factors) may regulate cell type-specific splicing through their own sex- or stage-specific isoforms.

Biased distribution of testis terminal differentiation genes on the 2L chromosomal arm

We next investigated whether and how chromosomal territories affect gene expression by studying the chromosomal distribution of stage-specific genes during spermatogenesis. In mammals, it has been reported that spermatogonial genes are over-represented on the X chromosome 84–86. Here, we mapped spermatogonia-enriched genes based on the comparison of bam testis vs. wt testis (Fig. 6A). We found that the bam testis (thus spermatogonia)-enriched genes were significantly enriched on the X chromosome (P < 10^−15), consistent with what has been reported from mammalian studies. Interestingly, this biased gene distribution on the X chromosome was reversed on the 2L arm, where the spermatogonia-enriched genes were the least frequent (P < 10^−7). In contrast, terminal differentiation genes that are highly expressed in wt testis were significantly under-represented on the X chromosome (P < 10^−5), and were most enriched on the 2L arm (P < 10^−5) (Fig. 6B). X-inactivation is thought to occur during the late spermatocyte stage 87, which could have provided the selective pressure for differentiation genes to translocate from the X chromosome to autosomes, where they could avoid inactivation. These data suggest that, in addition to the X chromosome, the 2L arm is critical in determining proper gene expression during spermatogenesis. The fact that the X chromosome and the 2L chromosomal arm were reciprocally favored or avoided suggests that the gene distribution on these two chromosomes may have been subjected to selective pressures during evolution.

Figure 6. Preferential gene distribution of testis terminal differentiation genes on the 2L chromosomal arm.

Preferential chromosomal distribution of stage-biased genes in testis on the X chromosome and the 2L chromosomal arm. Percentage was calculated as the number of differentially expressed genes (≥ 2-fold change) divided by the total number of genes on a particular chromosome arm (X, 2L, 2R, 3L and 3R). To calculate the RPKM ratio of bam testis/wt testis or wt testis/bam testis, the RPKM value was set to 0.5 if it was less than 0.5. The P-value was calculated with Pearson’s chi-squared test (Methods).
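The calculation described in this legend can be illustrated with a minimal sketch. The RPKM floor of 0.5 comes from the legend; the layout of the 2×2 contingency table (genes on one arm vs. all other genes, differentially expressed vs. not) is an assumption, since the legend does not specify how the chi-squared test was set up, and the function names are hypothetical.

```python
import math

RPKM_FLOOR = 0.5  # from the legend: RPKM values below 0.5 are set to 0.5


def fold_change(rpkm_a, rpkm_b, floor=RPKM_FLOOR):
    """Ratio rpkm_a / rpkm_b after raising both values to the floor."""
    return max(rpkm_a, floor) / max(rpkm_b, floor)


def chi2_2x2(hits_on_arm, genes_on_arm, hits_total, genes_total):
    """Pearson chi-squared test (1 d.f., no continuity correction).

    Assumed table layout: rows = this arm / all other arms,
    columns = differentially expressed / not differentially expressed.
    """
    a = hits_on_arm
    b = genes_on_arm - hits_on_arm
    c = hits_total - hits_on_arm
    d = (genes_total - genes_on_arm) - c
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-squared distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

A gene would count as differentially expressed when `fold_change(...) >= 2`; the floor prevents division by zero and dampens ratios driven by very weakly expressed genes.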

New features of the Drosophila gonadal transcriptomes

Our RNA-seq data revealed many new features of the Drosophila gonadal transcriptomes, including thousands of transcribed regions that are not included in the current FlyBase annotation. We called them Complementary-to-Flybase Transcribed Regions or CFTRs. In each sample, we identified ~1,500–2,500 CFTRs (Methods). The average size of the CFTRs was approximately 400 bp (Table 1). Most of the CFTRs were located in intergenic regions (Fig. 7A). On average, the expression level of CFTRs was approximately 30–43% of that of the annotated genes (Fig. 7B). To test whether the CFTRs had been previously predicted or identified, we compared them to the N-SCAN, CONTRAST, mRNA and EST databases. We found that about 47–83% of the identified CFTRs in each sample were supported by the mRNA or EST databases, and about 15–20% of the identified CFTRs in each sample were present in predicted gene datasets (Table S6 and Methods). Using an extensive BLAST search, we found that only 8.9–15.4% of the CFTRs contain known protein coding sequences (Table 1, Table S6 and Methods). The remaining approximately 85% of the CFTRs could encode novel peptides or non-coding RNAs. Interestingly, analysis of the tissue specificity of the CFTRs revealed hundreds of tissue-specific CFTRs in each pair-wise comparison (Fig. S4). This result indicates that CFTRs are another source of differentially expressed transcripts in gonads.

Figure 7. New features of the Drosophila gonadal transcriptome.

(A) Distribution of identified CFTRs in intronic and intergenic regions, based on the Ensembl annotation (http://www.ensembl.org/index.html) and the most up-to-date FlyBase version (r5.19). *: A retained intronic CFTR must have an RPKM value greater than 10% of the RPKM(s) of the neighboring gene(s), to avoid contamination from pre-mRNAs. (B) Box plot of the expression level (log2RPKM) of annotated expressed genes and CFTRs in all four samples.

In summary, our data revealed many new features of the Drosophila gonadal transcriptomes, which will lead to a more comprehensive understanding of how differential gene expression and splicing regulate sexual and cellular differentiation of the fly gonads 88–90.

Discussion

Epigenetic regulation in male gonads

Emerging evidence indicates that embryonic stem cells maintain their identities by a unique transcription network and chromatin structure (reviewed by 91, 92). However, it is not well understood whether adult stem cells, such as GSCs, maintain their unique features using a particular chromatin structure; and if so, how developmental programs change such a structure and regulate terminal differentiation (reviewed by 93, 94). Previous studies in the male germline lineage revealed that the cell type-specific tTAFs counteract Polycomb functions upon differentiation 44, which indicates that the chromatin structure in germ cells may switch upon the transition from undifferentiated to differentiating status. Here our RNA-seq data demonstrated a remarkable enrichment of nearly all families of chromatin modifying enzymes, including both ATP-dependent chromatin remodeling factors and histone modifying enzymes, in the undifferentiated cell-enriched testis compared with fully differentiated testis. These data suggest that a dynamically regulated chromatin structure may be required for maintaining the undifferentiated status of male GSCs and transit-amplifying spermatogonial cells. In addition, developmentally programmed mechanisms are required for switching such a chromatin landscape in order to initiate the terminal differentiation gene expression program.

Enhanced splicing factors in undifferentiated-cell enriched testis may contribute to stage-specific splicing events in testis

Differential gene expression and alternative splicing contribute significantly to pleiotropic phenotypic and behavioral features 10, 22, 35. In addition to the dynamic chromatin structure, our studies revealed enrichment of the majority of the splicing factors in the undifferentiated cell-enriched testis. The significantly higher splicing activities may collaborate with the highly expressed chromatin remodeling factors and histone modifying enzymes to maintain an intricate but sensitive transcription network through post-transcriptional mechanisms. Upon differentiation, the developmental program switches the gene network from undifferentiated to differentiating status in a unidirectional mode. The terminal differentiation genes that are turned on in differentiating cells are less dependent on splicing factors. The lower transcript complexity of this group of genes may contribute to the higher degree of transcriptional efficiency, in order to coordinate the co-expression of a large number of genes in spermatocytes 42. It is not clear how these stage-specific features of transcriptomes are regulated. However, these phenomena are amenable to further genetic and molecular studies.

Sex-specific isoforms in gonads

Among all the phenotypic traits of Drosophila, sexual dimorphism is undoubtedly one of the most distinct features, and gonads are the most sexually distinguishable tissues. Despite a wealth of knowledge about sexual differentiation of soma, sexual differentiation of germ cells is not well understood. Our data demonstrate that wt gonads are not only the major tissues with sex-specifically expressed genes, but also the tissues with prevalent sex-specific isoforms of genes (Fig. 4B). Studying these genes, especially those with regulatory roles such as transcription factors or splicing factors will undoubtedly shed light on how sexual dimorphism is established and maintained during gametogenesis. And our work has provided much needed information to address these intriguing biological questions at a molecular level in gonads.

In summary, we have discovered that the dynamic regulation of chromatin remodeling factors and histone modifying enzymes is characteristic of undifferentiated cell-enriched Drosophila testis. The sex-specific isoforms of critical transcription factors and splicing factors may contribute to the sexual differentiation of gonads. Our single base-pair resolution, genome-wide RNA-seq data provide a foundation for the systematic study of many interesting areas in Drosophila reproductive biology and stem cell biology, such as the molecular basis for sexual dimorphism of gonads, and the regulation of proliferation vs. terminal differentiation programs in GSC lineages.

Materials and Methods

Fly strains and tissue preparation

All fly stocks were grown in a 25°C incubator on standard medium. The bam[1]/TM3 stock was obtained from the Bloomington Drosophila Stock Center. The bam[delta86]/TM3 strain was a gift from Dr. Allan C. Spradling. The bam[114-97]/TM6B strain was a gift from Dr. Margaret Fuller. Testes from bam[1]/bam[114-97] mutant males and ovaries from bam[1]/bam[delta86] females were dissected in DEPC-treated 1× PBS buffer in 20-min intervals, followed by immediate snap freezing in liquid nitrogen. The wt testes and ovaries were dissected from y, w flies using the same method. All flies were less than one day post-eclosion. Notably, we found some accessory gland genes (e.g. Acp genes) in our bam testis RNA-seq dataset (Table S1), which may reflect the technical difficulty of completely separating testes from accessory glands in bam mutant males. In fact, we found the same issue in previous microarray data from a similar fly strain 35, visible as high deviation among biological replicates.

Library preparation for RNA-seq

Total RNA was extracted using TRIzol (Invitrogen, #15596-018) following the manufacturer’s instructions. Samples used for total RNA extraction were from ~ 200 pairs of bam testes (8.5 μg), ~200 pairs of bam ovaries (6.9 μg), ~200 pairs of wt testes (19 μg) and 45 pairs of wt ovaries (15 μg). DNA was degraded using 2 Units of DNase I (Fermentas, #EN0521) at 37°C for 20 mins. The integrity of RNA was checked by gel electrophoresis (1% agarose).

From ~10 μg total RNA for each sample, we performed two rounds of mRNA isolation using Dynabeads mRNA purification kit (Invitrogen, #610-06), according to the manufacturer’s instructions. The final mRNAs were eluted in 13.5μl 10 mM Tris-HCl (pH=7.5) and immediately used to generate the first strand cDNA, using 4 μl random hexamers (ABI, #N8080127) and SuperScript II Reverse Transcription Kit (Invitrogen, #18064-014) in a 30μl final volume, following the manufacturer’s instructions. The second strand cDNA was generated with the following recipe: 10 μl 5× second strand buffer (500 mM Tris-HCl pH7.8, 50 mM MgCl2, 10 mM DTT), 30 nmol dNTPs (Invitrogen, #18427-013), 2 Units of RNase H (Invitrogen, #18021-014) and 50 Units of DNA Pol I (Invitrogen, #18010-025). The entire reaction mix was incubated at 16°C for 2.5 hours. The double-stranded DNA (dsDNA) was purified with QIAquick PCR purification kit (Qiagen, #28106) and the concentration was quantified by a Qubit fluorometer (Invitrogen).

To generate libraries for sequencing, about 300 ng dsDNA of each sample was fragmented by sonication using a Bioruptor (Diagenode, UCD-200-TM-EX) under the following conditions: medium power output for 30 mins in ice water. The resulting DNA fragments were analyzed on an agarose gel to verify a ~100–300 bp size range. Sequencing libraries were prepared as follows: end-repair (DNA end-repair kit from Epicenter, #ER0720); A-tailing (300 ng dsDNA, 5 μl Thermo buffer, 10 nmol dATP, 15 Units of Taq polymerase, at 70°C for 30 mins); Solexa adaptor ligation (300 ng dsDNA, 4 μl DNA Ligase buffer, 1 μl Solexa adaptor mix, 3 μl DNA Ligase, at 70°C overnight); PCR amplification with adaptor primers (98°C 10 sec, 65°C 30 sec, 72°C 30 sec for 16 cycles; then an additional 72°C for 5 mins); and size selection (200–400 bp). The library dsDNA for each sample was then sequenced on a Solexa 1G sequencer at a concentration of 10 ng per lane.

Primary annotation information

Drosophila exon annotation information was downloaded using BioMart from the Drosophila BDGP5.4, Ensembl database (release 50).

Preliminary analysis of short read data

Short reads alignment and filtering

The quality-filtered 30 bp short sequence reads were aligned to a reference sequence consisting of the dm3 Drosophila melanogaster genome plus a library of synthetic exon junction sequences using the ELAND (Efficient Local Alignment of Nucleotide Data) software, allowing up to two mismatches with the reference sequence (Fig. S1B). The library of exon junction sequences was created as follows. Drosophila exon sequences were retrieved from the Ensembl database (release 50). All possible pairs of exons that belonged to the same transcript were joined such that the genomic order of exons was retained. A junction sequence consists of the last 26 bp of the 5’ exon and the first 26 bp of the 3’ exon. Redundant sequences were removed from the resulting set of 52 bp exon junction sequences. The numbers of reads that aligned uniquely to the genome and exon junctions are shown in Table 1. In order to remove possible PCR amplification artifacts and to reduce confounding effects of systematically bad sequencing cycles in short sequence reads, we retained a single copy of each unique read. This filtering procedure yielded the set of non-redundant unique reads.
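As an illustration of the junction library and the read filtering described above, here is a minimal sketch. The 26 bp flank is taken from the text; the input layout (a dictionary of exon sequences per transcript), the function names, and the choice to identify duplicate reads by identical sequence are assumptions made for the example.

```python
JUNCTION_FLANK = 26  # bp taken from each side of the junction, as in the text


def junction_sequence(exon5, exon3, flank=JUNCTION_FLANK):
    """Last `flank` bp of the 5' exon joined to the first `flank` bp of the 3' exon.

    Exons shorter than `flank` contribute their full length, so the
    junction can be shorter than 2 * flank in that case.
    """
    return exon5[-flank:] + exon3[:flank]


def junction_library(transcripts, flank=JUNCTION_FLANK):
    """Build a non-redundant junction set.

    transcripts: dict mapping a transcript name to its exon sequences,
    listed in genomic (5' to 3') order. Every ordered exon pair within
    a transcript is joined; a set removes redundant sequences.
    """
    library = set()
    for exons in transcripts.values():
        for i in range(len(exons)):
            for j in range(i + 1, len(exons)):
                library.add(junction_sequence(exons[i], exons[j], flank))
    return library


def non_redundant(reads):
    """Keep a single copy of each read sequence, preserving first-seen order."""
    return list(dict.fromkeys(reads))
```

For a transcript with three exons this yields three candidate junctions (1-2, 2-3 and 1-3), each 52 bp long when all exons are at least 26 bp.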

Generating an in-silico fly transcriptome

As described above we created a library of exon junction sequences by joining pairs of exons that belong to the same transcript such that the genomic order of exons is retained. Not all junctions in the resulting library are present in the annotated Ensembl transcripts. Junctions of neighboring exons in a transcript belong to annotated transcripts, whereas junctions of non-neighbor exons potentially belong to novel transcripts.
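The distinction between annotated and potentially novel junctions can be sketched as follows. The exon identifiers and the function name are hypothetical; the sketch only looks at one transcript at a time, so a pair flagged as a candidate here could still be a neighboring pair in another annotated transcript of the same gene.

```python
def classify_junctions(exon_ids):
    """Split all ordered exon pairs of one transcript into two lists.

    exon_ids: exon identifiers of a single transcript in genomic order.
    Neighboring exons form junctions present in the annotated transcript;
    non-neighboring exons form candidate junctions of novel transcripts.
    """
    annotated, candidate_novel = [], []
    for i in range(len(exon_ids)):
        for j in range(i + 1, len(exon_ids)):
            pair = (exon_ids[i], exon_ids[j])
            if j == i + 1:
                annotated.append(pair)
            else:
                candidate_novel.append(pair)
    return annotated, candidate_novel
```

A transcript with n exons gives n − 1 annotated junctions and (n − 1)(n − 2)/2 candidate junctions under this scheme.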

Sequencing depths in different samples

There was some variability in sequencing depths in different tissue samples. At least two technical replicates were used and 10–13 million 30 bp sequencing reads were obtained per sample (Table 1). In order to characterize splicing and gene expression differences between samples in an unbiased way, we used the following procedure to equate the sequencing depths of different samples. We randomly sub-sampled approximately 10 million non-redundant unique reads (300 million bp) in each of the four samples and used these reads for all downstream analyses, which was sufficient to cover the entire predicted fly transcriptome 10 times (~30 million bp transcribed sequences according to Ensembl database release 50).
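The depth equalization and the coverage estimate above can be sketched as below. The fixed random seed and the function names are assumptions added for reproducibility of the example; the text does not say how the random sub-sampling was implemented.

```python
import random


def subsample_reads(reads, target, seed=0):
    """Randomly keep `target` reads without replacement (all reads if fewer exist)."""
    reads = list(reads)
    if len(reads) <= target:
        return reads
    rng = random.Random(seed)  # hypothetical fixed seed, for reproducibility
    return rng.sample(reads, target)


def fold_coverage(n_reads, read_length_bp, transcriptome_bp):
    """Total sequenced bases divided by the size of the transcribed sequence."""
    return n_reads * read_length_bp / transcriptome_bp
```

With the numbers in the text, `fold_coverage(10_000_000, 30, 30_000_000)` gives 10.0, i.e. 300 million bp of reads over ~30 million bp of transcribed sequence.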

Calculation of gene expression level

For each tissue sample, we used the following procedure to compute a single number that summarizes overall expression level of a gene. All exon regions belonging to a gene are merged and the total number of non-redundant unique reads in the resulting merged exonic region is counted. The resulting number of reads is normalized with respect to the total size of merged exonic region of the gene and the total number of genomic unique and non-redundant reads in the particular sample, and RPKM value is computed. The use of merged regions to compute read counts avoids the problem of double-counting in regions where exons overlap with each other.
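A minimal sketch of this gene-level calculation follows. It assumes half-open exon intervals `[start, end)`, assigns each read by its start position, and uses the standard RPKM definition (reads per kilobase of exonic region per million reads); these conventions and the function names are illustrative choices, not details stated in the text.

```python
def merge_intervals(intervals):
    """Merge overlapping or abutting half-open intervals [start, end)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


def rpkm(read_count, region_bp, total_reads):
    """Reads per kilobase of region per million reads in the sample."""
    return read_count * 1e9 / (region_bp * total_reads)


def gene_rpkm(exons, read_starts, total_reads):
    """RPKM over the merged exonic region of one gene.

    Merging first means a read in a region shared by two overlapping
    exons is counted once, and the shared bases are measured once.
    """
    merged = merge_intervals(exons)
    length = sum(end - start for start, end in merged)
    count = sum(any(s <= pos < e for s, e in merged) for pos in read_starts)
    return rpkm(count, length, total_reads)
```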

Using the RPKM values, the correlation coefficient (r) among the technical replicates for each sample was calculated, and this coefficient showed very little variation (r>0.983) in all samples, consistent with the idea that the RNA-seq method is highly reproducible. Thus, data from all technical replicates for each sample were combined for further analysis.
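For reference, the correlation between two technical replicates can be computed as in the sketch below. It assumes the Pearson coefficient on per-gene RPKM vectors; the text does not state which coefficient was used or whether values were log-transformed first.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```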

ChIP-seq procedure

We dissected 200 pairs of bam testes in cold PBS and pooled them in 200 μl PBS containing protease inhibitor (Roche complete mini, # 11836153001) and 0.5 mM PMSF (MP Biomedicals, #195381). We then added 5.5 μl 37% fresh formaldehyde (Supelco, # 47083-U) and incubated at 37°C for 15 mins. The testes were spun at 2k for 2 mins and washed 2 times with 450 μl cold 1×PBS (with inhibitors and PMSF). Then 200 μl lysis buffer (50 mM Tris-HCl, pH7.6, 1 mM CaCl2, 0.2% Triton X-100, 5 mM butyrate, 1× proteinase inhibitor cocktail, and 0.5 mM fresh PMSF) was added and the tissues were homogenized thoroughly, followed by incubation at RT for 10 mins. We then sonicated the samples with a Microtip (Misonix, Inc, Microson XL-2000) using the following procedure: 5 sec at power 20, rest for 50 sec, 3–4 times, followed by spinning at 14k rpm for 10 min at 4°C. The chromatin was diluted 10× with RIPA buffer (10 mM Tris, pH7.6, 1 mM EDTA, 0.1% SDS, 0.1% Na-Deoxycholate, 1% Triton X-100, with protease inhibitors and PMSF) and 50 μl of this dilution was taken out as input.

We washed 40 μl of Dynabeads Protein A (Dynal Biotech ASA, Oslo, Norway) once with 1× PBS. We then added 4 μg anti-RNA Pol II antibody (Abcam, ab5408) to the Dynabeads and incubated at RT for 40mins, followed by washing with 1× PBS. Next 1ml of the chromatin extract was added to the beads and the mixture was rotated at 4°C overnight. Subsequent washing was performed per the manufacturer’s instructions. The beads were suspended in 100 μl 1× TE containing 3 μl 10% SDS and 5 μl 20 mg/ml proteinase K. After overnight incubation at 65°C, the supernatant was transferred to a new tube using a magnet (Dynal MPC-S) to precipitate the Dynabeads. Samples were treated by Phenol/Chloroform extraction, salt/EtOH precipitation, and resuspended in 50 μl 1×TE. The products were processed for Solexa sequencing according to the established protocol 95.

Comparison of RNA-seq with ChIP-seq

All Drosophila Ensembl genes were classified into five groups according to their RPKM values. The 8,943 genes with an RPKM of at least 1 were divided into high (2,981 genes, 41.07<RPKM<5307), moderate (2,981 genes, 15.36<RPKM<41.06) and low (2,981 genes, 1.00<RPKM<15.36) groups. The 2,294 genes with an RPKM between 0.01 and 1.00 were considered an uncertain group, and the remaining 1,902 genes were classified as a silent group (RPKM=0). The genes in each group were aligned at their transcription start site (TSS), using the UCSC annotation (ftp://hgdownload.cse.ucsc.edu/goldenPath/dm3/database/). The read density was calculated in 5 bp windows.
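The grouping and the windowed read density can be sketched as follows. The RPKM cut-offs are the ones quoted in the text; how values exactly on a boundary were handled is not stated, so the comparisons below are an assumption. The `flank` parameter (distance profiled on each side of the TSS) and the function names are hypothetical, and strand orientation is ignored for brevity.

```python
def expression_group(rpkm, low_cut=15.36, high_cut=41.06):
    """Assign a gene to one of the five expression groups by RPKM."""
    if rpkm == 0:
        return "silent"
    if rpkm < 1.0:
        return "uncertain"
    if rpkm <= low_cut:
        return "low"
    if rpkm <= high_cut:
        return "moderate"
    return "high"


def tss_window_counts(read_positions, tss, flank, window=5):
    """Count reads in consecutive `window`-bp bins spanning [tss - flank, tss + flank)."""
    n_windows = (2 * flank) // window
    counts = [0] * n_windows
    for pos in read_positions:
        offset = pos - (tss - flank)
        if 0 <= offset < n_windows * window:
            counts[offset // window] += 1
    return counts
```

Summing the per-window counts over all genes of a group, position by position, would give the aggregate ChIP-seq profile around the TSS for that group.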

Comparison of RNA-seq data with published microarray data

We first downloaded the CEL files generated by Parisi et al 22 and Chintapalli et al 96 from the NCBI GEO database. We then extracted and normalized microarray signals with the RMA function embedded in the limma package [downloaded from Bioconductor R packages (http://www.bioconductor.org)]. Genes with multiple probes were filtered out if different probes gave inconsistent Present (P) or Absent (A) calls. Genes with at least three P calls or three A calls from four independent biological replicates for each sample were retained for further analysis. Differentially expressed genes were identified using a combination of P-value (P≤0.05) and fold change (≥2) cutoffs, and were then compared with their expression levels in the RNA-seq data.
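The two filters described above can be sketched as follows. The thresholds (three of four consistent calls, P ≤ 0.05, fold change ≥ 2) are from the text; the function names are hypothetical, and the sketch assumes positive, linear-scale expression means (log2-scale values such as RMA output would need to be converted or compared by difference instead).

```python
def consistent_call(calls, min_agree=3):
    """Return 'P' or 'A' if at least `min_agree` replicates agree, else None."""
    for label in ("P", "A"):
        if calls.count(label) >= min_agree:
            return label
    return None


def is_differential(mean_a, mean_b, p_value, min_fold=2.0, max_p=0.05):
    """Apply the combined P-value and fold-change cutoffs (linear-scale means)."""
    fold = max(mean_a, mean_b) / min(mean_a, mean_b)
    return p_value <= max_p and fold >= min_fold
```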

Heat map analyses

To visualize gene expression across different samples, heat maps were generated using the TIGR software MeV v4.3.01 (http://www.tm4.org/mev.html). Hierarchical clustering was performed using Euclidean distance and average linkage. The sample columns were fixed and the genes were clustered with optimized gene leaf order.

In situ hybridization

PCR primers were appended with a T7 RNA polymerase binding site at their 5’ ends (5’-AAGGATCCTAATACGACTCACTATAGGGAGA-3’). For each of the following genes, we designed two sets of primers to synthesize both sense (S) and antisense (AS) riboprobes. The primer sequences are:

  • T7_AS_trx: AAGGATCCTAATACGACTCACTATAGGGAGAAGGTCTCCTTGCCAAGCTTCAGAT;

  • S_trx: TAGAAACGTGCTGGAGACAAGCGA;

  • T7_S_trx: AAGGATCCTAATACGACTCACTATAGGGAGATAGAAACGTGCTGGAGACAAGCGA;

  • AS_trx: AGGTCTCCTTGCCAAGCTTCAGAT;

  • T7_AS_lid: AAGGATCCTAATACGACTCACTATAGGGAGACGCCACTATTGCTGTTGCTATTGG;

  • S_lid: TCAAGAAGCGATTATGGCGCAGCA;

  • T7_S_lid: AAGGATCCTAATACGACTCACTATAGGGAGATCAAGAAGCGATTATGGCGCAGCA;

  • AS_lid: CGCCACTATTGCTGTTGCTATTGG;

  • T7_AS_rpd3: AAGGATCCTAATACGACTCACTATAGGGAGATGCGTTATTCGCCACATTGGATCG;

  • S_rpd3: ACAGCAACAAGGCATCCTCAGAGA;

  • T7_S_rpd3: AAGGATCCTAATACGACTCACTATAGGGAGAACAGCAACAAGGCATCCTCAGAGA;

  • AS_rpd3: TGCGTTATTCGCCACATTGGATCG;

  • T7_AS_Pcaf: AAGGATCCTAATACGACTCACTATAGGGAGAACGTTCTCATCCCGCGACACATTA;

  • S_Pcaf: AAGGATGATTCGCCCATCTGGGAT;

  • T7_S_Pcaf: AAGGATCCTAATACGACTCACTATAGGGAGAAAGGATGATTCGCCCATCTGGGAT;

  • AS_Pcaf: ACGTTCTCATCCCGCGACACATTA.
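Each T7-tagged primer in the list above is the T7 binding-site sequence followed directly by the corresponding gene-specific primer. The short sketch below reconstructs the tagged primers from the untagged ones, which is a convenient way to check the list for transcription errors. The pairing of primers into probe templates (T7-tagged AS primer with the plain S primer for the antisense probe, and the reverse for the sense probe) is an interpretation of the list order, and the names in the code are hypothetical.

```python
T7_TAG = "AAGGATCCTAATACGACTCACTATAGGGAGA"  # T7 binding site given in the text

GENE_PRIMERS = {
    "trx": {"S": "TAGAAACGTGCTGGAGACAAGCGA", "AS": "AGGTCTCCTTGCCAAGCTTCAGAT"},
    "lid": {"S": "TCAAGAAGCGATTATGGCGCAGCA", "AS": "CGCCACTATTGCTGTTGCTATTGG"},
    "rpd3": {"S": "ACAGCAACAAGGCATCCTCAGAGA", "AS": "TGCGTTATTCGCCACATTGGATCG"},
    "Pcaf": {"S": "AAGGATGATTCGCCCATCTGGGAT", "AS": "ACGTTCTCATCCCGCGACACATTA"},
}


def t7_tagged(primer):
    """Prepend the T7 RNA polymerase binding site to a gene-specific primer."""
    return T7_TAG + primer


def probe_primer_pairs(gene):
    """Primer pairs for the two PCR templates of one gene (assumed pairing)."""
    p = GENE_PRIMERS[gene]
    return {
        "antisense_probe": (t7_tagged(p["AS"]), p["S"]),
        "sense_probe": (t7_tagged(p["S"]), p["AS"]),
    }
```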

After the PCR amplification (Fermentas PCR purification Kit, #K0702), we synthesized the RNA probes using DIG-labeled NTPs (Roche #11277073910) and T7 RNA polymerase (Roche #10881767001) at 37°C for 2 hours, followed by hydrolysis with carbonate buffer (120 mM Na2CO3; 80 mM NaHCO3; pH=10.2) at 65°C for 10 min. Whole mount testes in situ hybridization was carried out as previously described 97 with the following modifications: For each reaction, ~15 pairs of testes were dissected from y,w males that were less than one day post-eclosion. RNA hybridization to DIG-labeled probes was carried out in RNase-free tubes at 65°C for 16–18 hours. The testes were subsequently washed 6 times (15 minutes each time) with 500 μL Hybe B [10.75 ml DEPC H2O, 25 ml Formamide (Fermentas #BP227500), 12.5 ml 20× SSC, 1 ml sheared herring sperm DNA (5 mg/ml, Sigma #D3159-10G), 250 μL heparin (10 mg/ml, Sigma #H3393-100KU), 500 μL Tween 20 (Promega #H5151)] in a 65°C water bath; followed by 15 min with each of the following solutions at room temperature (RT): Hybe B: PBST (4:1); Hybe B: PBST (3:2); Hybe B: PBST (2:3); Hybe B: PBST (1:4); and 2× 15 min with 500 μL PBST (1× PBS plus 0.1% Tween-20).

The results were next developed using anti-Digoxigenin antibody (Roche #11093274910, at a 1:2000 dilution) in RNase-free tubes at 4°C overnight. The testes were then washed 4× 20 min with 500 μL PBST at RT, followed by 3× 5 min washes in freshly made NMTT (for 10 ml NMTT: 8.3 ml DEPC H2O, 1 ml Tris-HCl (pH 9.0), 200 μL 5M NaCl, 500 μL 1M MgCl2, 10 μL Tween 20) at RT. The testes were then transferred from the tubes into the wells of a dissecting dish. The NMTT buffer in each well was replaced by 300 μL NBT (4-Nitro blue tetrazolium chloride) staining solution [298 μL NMTT, 1.35 μL NBT (Roche #11383213001) and 1.05 μL X-phosphate (Roche #11383221001)]. The color reaction was allowed to proceed until a differential staining signal could be detected between testes stained with the antisense probe and those stained with the sense probe. The reaction was then stopped by 3× quick washes using PBST; 1× quick wash using 100% EtOH: PBST (1:1); 2× quick washes using 100% EtOH; and 2× quick washes using 100% EtOH: Methyl salicylate (1:1). Approximately 300 μL GMM [3 ml Methyl Salicylate (Fermentas #03695-500), and 12 ml Canada Balsam (Fermentas #B10-100)] was added to the testes, followed by overnight shaking at RT. Finally, the testes were mounted on slides and visualized using a CCD camera. The images were processed with Adobe Photoshop.

Transcript isoform expression from RNA-seq data

Since many transcripts overlap with each other, not all isoforms are unambiguously detectable using RNA-seq short reads. In order to facilitate transcript isoform detection, we introduce the following natural ID system for transcripts. For each transcript T, let $R_T$ be the union of its exonic regions that do not overlap with any other Ensembl transcript. $R_T$ uniquely identifies transcript T. We call a transcript T with a non-empty $R_T$ a uniquely detectable isoform. However, not all transcripts possess a non-empty $R_T$, because some are fully overlapped by other transcripts. We estimated the expression level of each uniquely detectable transcript isoform T by counting the number of unique and non-redundant reads in the $R_T$ region, and then used the total size of $R_T$ as the normalization factor to compute the RPKM value. We used RPKM ≥ 1 as the cutoff for positive transcription.
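The idea of a uniquely detectable isoform can be made concrete with a small sketch. For clarity it represents exonic regions as sets of base positions, which is simple but not memory-efficient for a whole genome; half-open exon intervals, read assignment by start position, and all names are assumptions made for the example.

```python
def covered(exons):
    """Set of base positions covered by a list of half-open exons [start, end)."""
    return {pos for start, end in exons for pos in range(start, end)}


def unique_region(target, transcripts):
    """Exonic positions of `target` not covered by any other transcript."""
    own = covered(transcripts[target])
    for name, exons in transcripts.items():
        if name != target:
            own -= covered(exons)
    return own


def isoform_rpkm(target, transcripts, read_starts, total_reads):
    """RPKM over the unique region; None if the isoform is not uniquely detectable."""
    region = unique_region(target, transcripts)
    if not region:
        return None
    count = sum(pos in region for pos in read_starts)
    return count * 1e9 / (len(region) * total_reads)
```

In this scheme an isoform would be called expressed when `isoform_rpkm(...)` is at least 1, matching the cutoff in the text, and an isoform whose exons are all shared with other transcripts returns `None`.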

We did not use exon-exon junction read data to estimate the expression levels of alternatively spliced isoforms. Estimates of expression level based on exon-exon junction read counts are not reliable because of the high level of sampling noise at the small junction regions. The relative sampling noise level is of the order $n^{-1/2}$, where n is the expected number of reads at an exon-exon junction, which depends on the expression level of the corresponding transcript.
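The $n^{-1/2}$ scaling follows if the read count at a junction is modeled as a Poisson variable with mean n. The Poisson model is an assumption of this note (it is the usual model for read sampling and is not stated explicitly above). Under it, the standard deviation is $\sqrt{n}$, so the noise relative to the mean is:

```latex
\frac{\sigma}{\mu} = \frac{\sqrt{n}}{n} = n^{-1/2},
\qquad n = 4 \;\Rightarrow\; 50\%,
\qquad n = 100 \;\Rightarrow\; 10\%.
```

A junction expected to receive only a handful of reads therefore carries a relative uncertainty of tens of percent.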

Entropy as a measure of alternatively spliced gene complexity

Let $T_1, T_2, \ldots, T_n$ be the set of uniquely detectable isoforms of a gene, and $x_1, x_2, \ldots, x_n$ the corresponding RPKM values estimated as described above. If only a few isoforms of a multi-isoform gene are expressed, the gene has low AS complexity. If the majority of isoforms are expressed, the gene has high AS complexity. As a measure of AS complexity, we used gene entropy (abbreviated S) 65. The probability $p_k$ of isoform $T_k$ is estimated as the frequency $p_k = x_k/N$, where $N = x_1 + x_2 + \ldots + x_n$ is the sum of the isoforms’ RPKM values for the corresponding gene. We added a pseudo-count ε = 0.001 to each $x_k$ to avoid zeroes. Using the pseudo-count-adjusted probability estimates, we computed the entropy of the gene as $S = -\sum_{k=1}^{n} p_k \log_2 p_k$.

By definition, the entropy is non-negative (S ≥ 0). It is zero if and only if all $p_k$, except one, are zero. This is a desired property because if only one isoform of a multi-isoform gene is expressed, then the AS complexity of the gene should be zero. Another property of entropy is that it depends only on the probabilities (estimated as frequencies), and not on absolute RPKM values, as long as these values are large enough for a reliable estimate of frequencies. Thus, for a sufficiently expressed gene, entropy does not depend on the overall expression level of the gene. Therefore, it is meaningful to compare gene entropies across different samples. Entropy estimates for genes expressed at very low levels are less reliable. However, since we are mainly interested in comparing the overall AS complexity of a large number of genes across different samples, the effect of unreliable entropy estimates for a small subset of genes is insignificant.

The largest possible entropy for a gene with k uniquely detectable isoforms is $\log_2 k$, which is achieved when the frequencies of all isoforms are equal to 1/k. We normalized gene entropy by dividing it by $\log_2 k$. Therefore, the normalized entropy $S_{norm}$ is always within the range from zero to one ($0 \le S_{norm} \le 1$).
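The entropy calculation described in this section can be sketched in a few lines. The function name `normalized_entropy` is hypothetical, and the sketch assumes that the sum N is taken over the pseudo-count-adjusted values so that the frequencies sum to one.

```python
import math


def normalized_entropy(rpkms, pseudo=0.001):
    """Normalized entropy of one gene from its isoform RPKM values.

    `rpkms` holds the RPKM values of the uniquely detectable isoforms.
    The pseudo-count is added to every value before computing frequencies.
    """
    adjusted = [x + pseudo for x in rpkms]
    total = sum(adjusted)                      # N, after adjustment (assumption)
    probs = [x / total for x in adjusted]
    entropy = -sum(p * math.log2(p) for p in probs)
    k = len(rpkms)
    # A single-isoform gene has no AS complexity; avoid dividing by log2(1) = 0.
    return entropy / math.log2(k) if k > 1 else 0.0


even = normalized_entropy([5.0, 5.0, 5.0, 5.0])    # all isoforms equally expressed
skewed = normalized_entropy([10.0, 0.0])           # one isoform dominates
```

With equal isoform frequencies the value is at its maximum of one, and with a single dominant isoform it is close to zero (not exactly zero, because of the pseudo-count).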

Cumulative Distribution Function (CDF)

For each normalized entropy value x, the function CDF(x) gives the fraction of genes whose normalized entropies are smaller than x. For example, CDF(0.1) = 0.9 means that 90% of genes have a normalized entropy less than 0.1. To test statistically whether the normalized entropies of AS loci in one sample are significantly larger than those in another sample, we used the one-sided Kolmogorov-Smirnov test, which yielded the P values shown in Fig. 3B.
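The comparison of two entropy distributions can be illustrated with a small pure-Python sketch. It is not the implementation used for Fig. 3B: the names `ecdf` and `ks_one_sided` are hypothetical, the empirical CDF here counts values less than or equal to x (the usual convention for the test statistic), and the P value uses the large-sample approximation exp(−2D²nm/(n+m)), which is an assumption of this example and is only approximate for small samples.

```python
import math
from bisect import bisect_right


def ecdf(values):
    """Return a function giving the fraction of `values` that are <= x."""
    ordered = sorted(values)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def ks_one_sided(larger, smaller):
    """One-sided two-sample KS test that `larger` tends to exceed `smaller`.

    If `larger` is shifted to higher values, its CDF lies below that of
    `smaller`; the statistic is the largest such vertical gap.
    """
    f_large, f_small = ecdf(larger), ecdf(smaller)
    points = sorted(set(larger) | set(smaller))
    d = max(0.0, max(f_small(x) - f_large(x) for x in points))
    n, m = len(larger), len(smaller)
    p = math.exp(-2.0 * d * d * n * m / (n + m))   # asymptotic approximation
    return d, p


d, p = ks_one_sided([0.6, 0.7, 0.8, 0.9], [0.1, 0.2, 0.3, 0.4])
```

In this toy input the two samples do not overlap, so the statistic reaches its maximum of one.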

Differential splicing analysis

In the Ensembl 50 database, 3,463 Drosophila genes (approximately 23%) have multiple isoforms. Among them, 2,094 genes had more than one uniquely detectable isoform in our dataset.

In order to visually display differential splicing of genes in two samples, we used the following method. For each gene, we asked the following questions:

  1. Does it have transcript isoforms expressed both in samples 1 and 2?

  2. Does it have transcript isoforms expressed only in sample 1?

  3. Does it have transcript isoforms expressed only in sample 2?

The answer to each of these questions is yes (Y) or no (N). Thus, for each gene, there are 8 possible combinations of answers: YYY, YNY, NYY, YYN, NNY, YNN, NYN, NNN. In Fig. 4A and Fig. 4D, Y is denoted by red and N by blue (the NNN category is not shown for any comparison).
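The three questions translate directly into a three-letter code per gene. The sketch below is illustrative only: the function name `splicing_pattern` is hypothetical, and it assumes that an isoform counts as expressed when its RPKM is at least 1, the cutoff for positive transcription given earlier in the Methods.

```python
def splicing_pattern(rpkm_1, rpkm_2, cutoff=1.0):
    """Return the Y/N answers to the three questions for one gene.

    `rpkm_1` and `rpkm_2` list the RPKM values of the same uniquely
    detectable isoforms, in the same order, in samples 1 and 2.
    """
    on_1 = [x >= cutoff for x in rpkm_1]
    on_2 = [x >= cutoff for x in rpkm_2]
    pairs = list(zip(on_1, on_2))
    both = any(a and b for a, b in pairs)          # question 1
    only_1 = any(a and not b for a, b in pairs)    # question 2
    only_2 = any(b and not a for a, b in pairs)    # question 3
    return "".join("Y" if flag else "N" for flag in (both, only_1, only_2))


pattern = splicing_pattern([5.0, 3.0, 0.0], [4.0, 0.0, 7.0])
```

Here the first isoform is shared, the second is specific to sample 1 and the third to sample 2, so the gene falls in the YYY category.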

We generated the list of candidate genes that undergo alternative splicing in a sex-specific or sex-biased manner (Table 3) as follows. The number of observed RNA-seq reads from a small exonic region is subject to large sampling noise. The relative sampling noise level is of the order $n^{-1/2}$, where n is the expected number of reads at the exonic region, which depends on the expression level of the corresponding transcript. To avoid this noise, and thus false-positive differentially spliced genes, we removed all uniquely detectable transcript isoforms whose ID region R had a total size smaller than 100 bp. Only genes with two or more uniquely detectable long R isoforms (UDLRI) were retained for further analysis. A differentially spliced gene must: (1) be sufficiently expressed in both samples (i.e. the most highly expressed UDLRI isoforms have RPKM ≥ 10), and (2) contain at least one UDLRI isoform that is not expressed (RPKM < 2) in one sample but highly expressed (RPKM ≥ 10) in the other. The differentially spliced genes identified using the above procedure were then manually inspected on the UCSC genome browser to remove false positives, and are listed in Table 3 and Table S5.
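The automated part of this filter can be sketched as below. The function name `is_differentially_spliced` and the input layout are hypothetical, and the first criterion is interpreted as requiring the most highly expressed UDLRI isoform in each sample to reach RPKM ≥ 10, which is one reading of the text. The manual inspection step is not represented.

```python
def is_differentially_spliced(isoforms, min_region_bp=100, high=10.0, low=2.0):
    """Apply the UDLRI criteria to one gene.

    `isoforms` is a list of (region_bp, rpkm_sample1, rpkm_sample2) tuples,
    one per uniquely detectable isoform.
    """
    # Keep only isoforms whose ID region R is at least 100 bp (UDLRI).
    udlri = [(a, b) for size, a, b in isoforms if size >= min_region_bp]
    if len(udlri) < 2:
        return False
    # (1) The gene is sufficiently expressed in both samples (interpretation).
    expressed_in_both = (max(a for a, _ in udlri) >= high
                         and max(b for _, b in udlri) >= high)
    # (2) At least one isoform is off in one sample and high in the other.
    switched = any((a < low and b >= high) or (b < low and a >= high)
                   for a, b in udlri)
    return expressed_in_both and switched


candidate = is_differentially_spliced([(150, 20.0, 0.5), (200, 12.0, 15.0)])
```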

Chromosomal mapping assay

Chromosomal mapping was done with Pearson’s chi-squared test in the R programming environment (R version 2.5.0, downloaded from http://www.r-project.org). For the bam testis vs. wt testis comparison in Figure 6, we calculated the P-value from a 2×2 table containing the numbers of genes in the following categories: differentially expressed genes (≥ 2-fold change between bam testis and wt testis) on a given chromosome arm (e.g. chrX); non-differentially expressed genes on that chromosome arm; differentially expressed genes on the other four chromosome arms (e.g. chr2L, chr2R, chr3L and chr3R combined); and non-differentially expressed genes on the other four chromosome arms. P < 0.01 was used as the threshold for a significant difference in chromosomal distribution.
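For readers without R at hand, the 2×2 test can be reproduced with a short sketch. The function name `chi2_2x2` is hypothetical. R's `chisq.test` applies Yates' continuity correction to 2×2 tables by default; whether the correction was used in the study is not stated, so it is exposed here as a parameter. The gene counts in the example are made up for illustration.

```python
import math


def chi2_2x2(a, b, c, d, yates=True):
    """Pearson chi-squared test on the 2x2 table [[a, b], [c, d]].

    a: differentially expressed genes on the arm of interest
    b: non-differentially expressed genes on that arm
    c: differentially expressed genes on the other arms
    d: non-differentially expressed genes on the other arms
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2.0)      # continuity correction
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-squared distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p


stat, p = chi2_2x2(30, 10, 10, 30, yates=False)   # made-up counts
significant = p < 0.01
```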

Identification of Complementary-to-Flybase Transcribed Regions (CFTRs)

The identification of CFTRs was modified from 95. All reads that could be mapped to known exons were removed. Known exons were obtained from the Ensembl database (http://www.ensembl.org/index.html) and the newest Flybase version (r5.19) (http://flybase.org/). The remaining reads were analyzed to discover CFTRs. For this purpose, CFTRs were defined for each sample by first calculating the number of reads aligning to each 40-bp window across the genome; windows passing a P-value threshold (Poisson) of 0.05 were retained. Consecutive windows were grouped to form a larger region, allowing a gap of up to two windows that did not satisfy the P-value cutoff. Finally, all CFTRs that were smaller than 100 bp (to reduce false positives caused by non-specific PCR amplification in small regions with a high density of sequencing reads) or that contained fewer than 10 reads were eliminated. The RPKM for each CFTR was calculated from its read count, its size and the total number of unique sequencing reads in the sample.
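The window-based calling procedure can be sketched as follows. This is an illustration under stated assumptions, not the original script: the names `poisson_sf` and `call_regions` are hypothetical, the background rate `lam` (expected reads per window) must be supplied by the caller because its estimation is not described above, and reads falling in gap windows are counted toward the region total.

```python
import math


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)
    cdf = term
    for i in range(1, k):
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)


def call_regions(window_counts, lam, window_bp=40, alpha=0.05,
                 max_gap=2, min_bp=100, min_reads=10):
    """Group significant windows into regions and apply the size/read filters."""
    passing = [poisson_sf(c, lam) < alpha for c in window_counts]
    regions, start, last = [], None, None
    for i, ok in enumerate(passing):
        if not ok:
            continue
        if start is not None and i - last - 1 > max_gap:
            regions.append((start, last))     # gap too long: close the region
            start = None
        if start is None:
            start = i
        last = i
    if start is not None:
        regions.append((start, last))
    kept = []
    for first, final in regions:
        size = (final - first + 1) * window_bp
        reads = sum(window_counts[first:final + 1])
        if size >= min_bp and reads >= min_reads:
            kept.append((first * window_bp, (final + 1) * window_bp, reads))
    return kept


# Toy read counts per 40-bp window; coordinates are relative to the first window.
cftrs = call_regions([0, 5, 6, 0, 0, 7, 0, 0, 0, 4, 0], lam=0.5)
```

In the toy input, the windows at indices 1, 2 and 5 merge across a two-window gap into one 200-bp region, while the isolated window at index 9 is discarded as too small.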

CFTRs were mapped to intergenic or intronic regions according to their mapped genomic loci (Ensembl). To eliminate possible contamination by pre-mRNA, intronic CFTRs were removed if their RPKM value was less than 10% of the RPKM of neighboring gene(s).

To validate the CFTRs, we compared them with the EST and mRNA sequences retrieved from the UCSC or NCBI databases, and with genes predicted by N-SCAN (http://mblab.wustl.edu/predictions/Drosophila/dm3/) or CONTRAST (http://contra.stanford.edu/contrast/dm3.html). We found that approximately 14–23% of the CFTRs in each sample overlapped with these predicted genes (≥ 20 bp) at their corresponding genomic loci.

BLAST search

To determine potential products encoded by CFTR transcripts (e.g. protein-coding sequences or non-coding RNA sequences), all CFTR sequences were extracted with a Perl script based on BDGP v5.0 and analyzed by BLASTX search against all non-redundant protein sequences, with entries from GenPept, Swissprot, PIR, PRF, PDB and NCBI RefSeq (4-Oct-2007), downloaded from ftp://ftp.ncbi.nih.gov/blast/db (24 December 2008). BLASTX was performed on a local Linux system with the following cutoffs: minimum score = 50, e-value ≤ $10^{-6}$, percent identity ≥ 50%, and a minimum match of 30 amino acids.
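The cutoffs can be applied to BLAST tabular output with a small filter such as the one below. This is a sketch under assumptions: it expects the standard 12-column tab-separated format (query, subject, percent identity, alignment length, mismatches, gap openings, query start and end, subject start and end, e-value, bit score), it treats "minimum score" as the bit score, and the function name `passes_cutoffs` and the example hit are hypothetical.

```python
def passes_cutoffs(line, min_score=50.0, max_evalue=1e-6,
                   min_identity=50.0, min_length=30):
    """Return True if one tabular BLAST hit satisfies all four cutoffs."""
    fields = line.rstrip("\n").split("\t")
    identity = float(fields[2])     # percent identity
    length = int(fields[3])         # alignment length
    evalue = float(fields[10])
    score = float(fields[11])       # bit score (assumed meaning of "score")
    return (score >= min_score and evalue <= max_evalue
            and identity >= min_identity and length >= min_length)


# Made-up hit line for illustration only.
hit = "CFTR_1\tprotA\t62.5\t48\t18\t0\t1\t144\t10\t57\t3e-12\t71.2"
keep = passes_cutoffs(hit)
```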

Box plot analysis

The distribution of gene expression levels was analyzed using boxplot in the R programming environment (R version 2.5.0, downloaded from http://www.r-project.org). The box spans the 25th to 75th percentiles, with the 50th percentile shown as a black bar. The whiskers extend to the most extreme data points that lie within 1.5× IQR (interquartile range) of the box; points beyond the whiskers are outliers. The Y axis represents the log2 RPKM value.
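The quantities drawn in such a box plot can be computed as in the sketch below. The function name `boxplot_stats` is hypothetical, and the quartiles come from Python's `statistics.quantiles` with the inclusive method, which may differ slightly from the hinges that R's `boxplot` uses. The input would be the log2 RPKM values of one sample; the numbers in the example are arbitrary.

```python
import statistics


def boxplot_stats(values):
    """Quartiles, whisker ends and outliers using the 1.5x IQR rule."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whiskers": (min(inside), max(inside)),   # most extreme non-outliers
        "outliers": outliers,
    }


stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 100])
```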

Real time RT-PCR validation

Real time RT-PCR was used to validate sample-specific isoforms in all four pair-wise comparisons. In each experiment, two biological samples were prepared independently for the real time PCR experiments. Each PCR reaction was run in duplicate using SYBR Green PCR Master Mix (Fermentas #K0221) in an ABI 7300 system. The following primers were used to validate the corresponding genes for their common regions (c) and isoform-specific regions:

  • p53.cF: GGAGAAGCAAAGGAACACACGCAA;

  • p53.cR: ACTCGATTCCGCTGAACAAGCTCT;

  • p53.RA-F: ATTCCGATCCCGATACCTCCACC;

  • p53.RA-R: CAGCCAATGTCGTGGCACAAAGAA;

  • p53.RB-F: CTCTGCAGAAACTTCGTTGCCGAT;

  • p53.RB-R: GCGGACACAAATCGCAACTGCTAA;

  • rab14.cF: TCTGGAGACCGCACGCAAGATTTA;

  • rab14.cR: TTTAGCACGAGCACTGATCCTTGG;

  • rab14.RA-F: ATTGCAATCGAATTCCGCACAGCC;

  • rab14.RA-R: TTTAGTCCACCTTAGGGAGCGAAC;

  • rab14.RB-F: TAAGCAGCGACTACGGTTGGACAT;

  • rab14.RB-R: TGCGCACTTTGCTCATCTTGACAC;

  • exu.cF: ACTTGTCACCTCCTGCTCCAAACT;

  • exu.cR: TGCTCGAGCTTCTGGACAGCTATT;

  • exu.RA-F: ACGCCCACCAGGATATAATTACCG;

  • exu.RA-R: AAAGCGAAAGAGCCCATCGAAACC;

  • exu.RB-F: ACGCCCACCAGGATATAATTACCG;

  • exu.RB-R: ATCTAGTGAAAGCGGTTCGCGT;

  • imp.cF: CATTTCGCTCTGCACAAGAATGCG;

  • imp.cR: TGTTGGTCTGAACGGTGTCGAGTT;

  • imp.RA/RC/RB-F: CTGGCCGACTGTTGAGTTTCTTTC;

  • imp.RA/RC/RB-R: GCAATAACTACAACAACACACGGCT;

  • imp.RG/RH-F: AACTTGGTTGTGCGTTGCGA;

  • imp.RG/RH-R: AAGGCCAAAGGAAAGGCGAAAGAC.

Supplementary Material

supple Table 1. Table S1: The RPKM values for expressed genes in at least one of the four samples.

Where available, the probe IDs from the Affymetrix GeneChip Drosophila Genome 2.0 Array are listed for comparison with microarray data.

supple Table 2. Table S2: Expression of genes involved in splicing in bam vs. wt testis.

A total of 311 genes involved in splicing were extracted from several sources, including 21 splicing factors from http://www.sdbonline.org/fly/aignfam/splice.htm, 161 genes with the “RNA splicing” GO annotation in the Amigo database (http://amigo.geneontology.org), 97 genes that are involved in pre-mRNA processing (http://www.wam.umd.edu/~smount/DmRNAfactors/table.html), and 207 genes that are components of the spliceosome according to the proteomic study 82. The genes from these different sources are not mutually exclusive. We found that 177 of the 311 genes (56.9%) are enriched in bam testis (bam testis/wt testis ≥ 2), but only 26 of the 311 genes (8.4%) are enriched in wt testis (wt testis/bam testis ≥ 2). All genes that encode SR proteins are highlighted in green, SR-like proteins in light green and hnRNPs in yellow.

supple Table 3. Table S3: Meta table of alternatively spliced genes in all four samples.

Isoform-specific RPKM values for 2,094 genes with two or more uniquely detectable isoforms. The geneAS column shows array T1|T2|…|Tn of uniquely detectable transcripts of each individual gene.

supple Table 4-1. Table S4: Four pair-wise comparisons of genes with differential isoforms in gonads.

(4-1) bam testis vs. bam ovary comparison; (4-2) wt testis vs. wt ovary comparison; (4-3) wt testis vs. bam testis comparison; (4-4) wt ovary vs. bam ovary comparison. All splicing factors or putative splicing factors that have different isoforms in each pair-wise comparison are highlighted in yellow.

supple Table 4-2
supple Table 4-3
supple Table 4-4
supple Table 5. Table S5: Genes that exhibit stage-specific or stage-biased isoforms in bam and wt gonads.

Genes that encode transcription factors or putative transcription factors were highlighted in red. Genes that encode splicing factors or putative splicing factors were highlighted in yellow.

supple Table 6. Table S6: Detection of CFTRs in all four samples.

The identification and analysis of CFTRs were described in Methods.

supple figures

Figure S1: Methods of RNA-seq. (A) Flow chart of the experimental design; (B) Scheme of the data processing procedure.

Figure S2: Comparison of RNA-seq data with microarray data. (A) Comparison of RNA-seq data with the microarray data from 22. Of the 11,893 genes on the FlyGEM microarray, 11,341 have the same gene ID as used in our RNA-seq analysis; these 11,341 genes were therefore used for this comparison. Of the genes identified in the Parisi et al. study, 89% of the testis-enriched genes and 93% of the ovary-enriched genes were also identified in our study. In the Parisi et al. study, the y,w strain was used as the wt strain, and ovaries and testes were dissected 3 to 5 days post-eclosion. We used the same strain but dissected within 1 day post-eclosion, which may contribute to some of the differences between the two data sets. (B) Comparison of RNA-seq data with the microarray data from FlyAtlas (http://www.flyatlas.org/). We used the 12,394 genes shared between FlyAtlas and our RNA-seq data for this comparison. We detected 3,615 and 3,910 genes enriched in wt testis and wt ovary, respectively, whereas FlyAtlas identified 2,602 and 3,939 genes, respectively. Of the genes identified by FlyAtlas, 91% of the testis-enriched genes and 77% of the ovary-enriched genes were also identified in our RNA-seq data. In FlyAtlas, the Canton S strain was used as the wt strain, and ovaries and testes were dissected 7 days post-eclosion, which may contribute to some of the differences between the two data sets.

Figure S3: More examples of genes that have sex-specific isoforms. (A–F) UCSC snapshots show that the Itd gene (A) has sex-specific isoforms in bam testis and bam ovary samples, and that the smg (B), aret (C), BicC (D), qua (E) and capu (F) genes have sex-specific isoforms in wt testis and wt ovary samples.

Figure S4: Venn diagrams of pair-wise comparisons of CFTRs. In each comparison, CFTRs were considered to be shared between the two samples if their corresponding genomic loci overlapped by ≥ 20 bp. The numbers in brackets represent the shared CFTRs from the right-hand sample in each pair-wise comparison. All CFTRs had ≥ 10 sequencing reads and ≥ 2 RPKM. Identification of CFTRs is described in Methods, and all CFTRs are listed in Table S6.

Acknowledgments

We would like to thank Drs Karen Beemon, Allan Spradling, Mark Van Doren and Chen lab members for critical reading of the manuscript and for their suggestions. We thank Dr. Dustin E. Schones for help in setting up the initial data analysis pipeline, and Caitlin Choi and Ankit Vartak for technical assistance with the PCR experiments. This work is supported in part by Research Grant No. 05-FY09-88 from the March of Dimes Foundation, the R00HD055052 NIH Pathway to Independence Award from NICHD, the 49th Mallinckrodt Scholar Award from the Edward Mallinckrodt, Jr. Foundation, support from the Johns Hopkins University (X.C.) and the Division of Intramural Research, the National Heart, Lung and Blood Institute, NIH (K.Z.).

References

  • 1.Cinalli RM, Rangan P, Lehmann R. Germ cells are forever. Cell. 2008;132 (4):559–562. doi: 10.1016/j.cell.2008.02.003. [DOI] [PubMed] [Google Scholar]
  • 2.Fuller MT, Spradling AC. Male and female Drosophila germline stem cells: two versions of immortality. Science. 2007;316 (5823):402–404. doi: 10.1126/science.1140861. [DOI] [PubMed] [Google Scholar]
  • 3.Kiger AA, Jones DL, Schulz C, Rogers MB, Fuller MT. Stem cell self-renewal specified by JAK-STAT activation in response to a support cell cue. Science. 2001;294 (5551):2542–2545. doi: 10.1126/science.1066707. [DOI] [PubMed] [Google Scholar]
  • 4.Tulina N, Matunis E. Control of stem cell self-renewal in Drosophila spermatogenesis by JAK-STAT signaling. Science. 2001;294 (5551):2546–2549. doi: 10.1126/science.1066700. [DOI] [PubMed] [Google Scholar]
  • 5.Yamashita YM, Jones DL, Fuller MT. Orientation of asymmetric stem cell division by the APC tumor suppressor and centrosome. Science. 2003;301 (5639):1547–1550. doi: 10.1126/science.1087795. [DOI] [PubMed] [Google Scholar]
  • 6.Yamashita YM, Mahowald AP, Perlin JR, Fuller MT. Asymmetric inheritance of mother versus daughter centrosome in stem cell division. Science. 2007;315 (5811):518–521. doi: 10.1126/science.1134910. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Brawley C, Matunis E. Regeneration of male germline stem cells by spermatogonial dedifferentiation in vivo. Science. 2004;304 (5675):1331–1334. doi: 10.1126/science.1097676. [DOI] [PubMed] [Google Scholar]
  • 8.Maines JZ, Park JK, Williams M, McKearin DM. Stonewalling Drosophila stem cell differentiation by epigenetic controls. Development. 2007;134 (8):1471–1479. doi: 10.1242/dev.02810. [DOI] [PubMed] [Google Scholar]
  • 9.Buszczak M, Paterno S, Spradling AC. Drosophila stem cells share a common requirement for the histone H2B ubiquitin protease scrawny. Science. 2009;323 (5911):248–251. doi: 10.1126/science.1165678. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Kai T, Williams D, Spradling AC. The expression profile of purified Drosophila germline stem cells. Dev Biol. 2005;283 (2):486–502. doi: 10.1016/j.ydbio.2005.04.018. [DOI] [PubMed] [Google Scholar]
  • 11.Maniatis T, Tasic B. Alternative pre-mRNA splicing and proteome expansion in metazoans. Nature. 2002;418 (6894):236–243. doi: 10.1038/418236a. [DOI] [PubMed] [Google Scholar]
  • 12.Kim E, Magen A, Ast G. Different levels of alternative splicing among eukaryotes. Nucleic Acids Res. 2007;35 (1):125–131. doi: 10.1093/nar/gkl924. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Zarkower D. Establishing sexual dimorphism: conservation amidst diversity? Nat Rev Genet. 2001;2 (3):175–185. doi: 10.1038/35056032. [DOI] [PubMed] [Google Scholar]
  • 14.McKeown M. Sex differentiation: the role of alternative splicing. Curr Opin Genet Dev. 1992;2 (2):299–303. doi: 10.1016/s0959-437x(05)80288-6. [DOI] [PubMed] [Google Scholar]
  • 15.Burtis KC, Baker BS. Drosophila doublesex gene controls somatic sexual differentiation by producing alternatively spliced mRNAs encoding related sex-specific polypeptides. Cell. 1989;56 (6):997–1010. doi: 10.1016/0092-8674(89)90633-8. [DOI] [PubMed] [Google Scholar]
  • 16.Demir E, Dickson BJ. fruitless splicing specifies male courtship behavior in Drosophila. Cell. 2005;121 (5):785–794. doi: 10.1016/j.cell.2005.04.027. [DOI] [PubMed] [Google Scholar]
  • 17.Nagoshi RN, McKeown M, Burtis KC, Belote JM, Baker BS. The control of alternative splicing at genes regulating sexual differentiation in D. melanogaster. Cell. 1988;53 (2):229–236. doi: 10.1016/0092-8674(88)90384-4. [DOI] [PubMed] [Google Scholar]
  • 18.McIntyre LM, Bono LM, Genissel A, et al. Sex-specific expression of alternative transcripts in Drosophila. Genome Biol. 2006;7 (8):R79. doi: 10.1186/gb-2006-7-8-r79. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Telonis-Scott M, Kopp A, Wayne ML, Nuzhdin SV, McIntyre LM. Sex-specific splicing in Drosophila: widespread occurrence, tissue specificity and evolutionary conservation. Genetics. 2009;181 (2):421–434. doi: 10.1534/genetics.108.096743. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Gupta V, Parisi M, Sturgill D, et al. Global analysis of X-chromosome dosage compensation. J Biol. 2006;5 (1):3. doi: 10.1186/jbiol30. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Parisi M, Nuttall R, Naiman D, et al. Paucity of genes on the Drosophila X chromosome showing male-biased expression. Science. 2003;299 (5607):697–700. doi: 10.1126/science.1079190. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Parisi M, Nuttall R, Edwards P, et al. A survey of ovary-, testis-, and soma-biased gene expression in Drosophila melanogaster adults. Genome Biol. 2004;5 (6):R40. doi: 10.1186/gb-2004-5-6-r40. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Sturgill D, Zhang Y, Parisi M, Oliver B. Demasculinization of X chromosomes in the Drosophila genus. Nature. 2007;450 (7167):238–241. doi: 10.1038/nature06330. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Schena M, Shalon D, Davis RW, Brown PO. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science. 1995;270 (5235):467–470. doi: 10.1126/science.270.5235.467. [DOI] [PubMed] [Google Scholar]
  • 25.Velculescu VE, Zhang L, Vogelstein B, Kinzler KW. Serial analysis of gene expression. Science. 1995;270 (5235):484–487. doi: 10.1126/science.270.5235.484. [DOI] [PubMed] [Google Scholar]
  • 26.Wilhelm BT, Marguerat S, Watt S, et al. Dynamic repertoire of a eukaryotic transcriptome surveyed at single-nucleotide resolution. Nature. 2008;453 (7199):1239–1243. doi: 10.1038/nature07002. [DOI] [PubMed] [Google Scholar]
  • 27.Nagalakshmi U, Wang Z, Waern K, et al. The transcriptional landscape of the yeast genome defined by RNA sequencing. Science. 2008;320 (5881):1344–1349. doi: 10.1126/science.1158441. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Lister R, O’Malley RC, Tonti-Filippini J, et al. Highly integrated single-base resolution maps of the epigenome in Arabidopsis. Cell. 2008;133 (3):523–536. doi: 10.1016/j.cell.2008.03.029. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 2008;5 (7):621–628. doi: 10.1038/nmeth.1226. [DOI] [PubMed] [Google Scholar]
  • 30.Sultan M, Schulz MH, Richard H, et al. A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome. Science. 2008;321 (5891):956–960. doi: 10.1126/science.1160342. [DOI] [PubMed] [Google Scholar]
  • 31.Marioni JC, Mason CE, Mane SM, Stephens M, Gilad Y. RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res. 2008;18 (9):1509–1517. doi: 10.1101/gr.079558.108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Wang Z, Gerstein M, Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 2009;10 (1):57–63. doi: 10.1038/nrg2484. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.McKearin DM, Spradling AC. bag-of-marbles: a Drosophila gene required to initiate both male and female gametogenesis. Genes Dev. 1990;4 (12B):2242–2251. doi: 10.1101/gad.4.12b.2242. [DOI] [PubMed] [Google Scholar]
  • 34.Gonczy P, Matunis E, DiNardo S. bag-of-marbles and benign gonial cell neoplasm act in the germline to restrict proliferation during Drosophila spermatogenesis. Development. 1997;124 (21):4361–4371. doi: 10.1242/dev.124.21.4361. [DOI] [PubMed] [Google Scholar]
  • 35.Terry NA, Tulina N, Matunis E, DiNardo S. Novel regulators revealed by profiling Drosophila testis stem cells within their niche. Dev Biol. 2006;294 (1):246–257. doi: 10.1016/j.ydbio.2006.02.048. [DOI] [PubMed] [Google Scholar]
  • 36.Fuller MT. Spermatogenesis. In: Bate M, Martinez Arias A, editors. The Development of Drosophila melanogaster. I. Cold Spring Harbor: Cold Spring Harbor Press; 1993. [Google Scholar]
  • 37.Spradling AC. Developmental genetics of oogenesis. In: Bate M, Martinez Arias A, editors. The Development of Drosophila melanogaster. I. Cold Spring Harbor: Cold Spring Harbor Press; 1993. [Google Scholar]
  • 38.Arbeitman MN, Fleming AA, Siegal ML, Null BH, Baker BS. A genomic analysis of Drosophila somatic sexual differentiation and its regulation. Development. 2004;131 (9):2007–2021. doi: 10.1242/dev.01077. [DOI] [PubMed] [Google Scholar]
  • 39.Lebo MS, Sanders LE, Sun F, Arbeitman MN. Somatic, germline and sex hierarchy regulated gene expression during Drosophila metamorphosis. BMC Genomics. 2009;10:80. doi: 10.1186/1471-2164-10-80. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Arbeitman MN, Furlong EE, Imam F, et al. Gene expression during the life cycle of Drosophila melanogaster. Science. 2002;297 (5590):2270–2275. doi: 10.1126/science.1072152. [DOI] [PubMed] [Google Scholar]
  • 41.Rathke C, Baarends WM, Jayaramaiah-Raja S, et al. Transition from a nucleosome-based to a protamine-based chromatin configuration during spermiogenesis in Drosophila. J Cell Sci. 2007;120 (Pt 9):1689–1700. doi: 10.1242/jcs.004663. [DOI] [PubMed] [Google Scholar]
  • 42.Fuller MT. Genetic control of cell proliferation and differentiation in Drosophila spermatogenesis. Semin Cell Dev Biol. 1998;9 (4):433–444. doi: 10.1006/scdb.1998.0227. [DOI] [PubMed] [Google Scholar]
  • 43.Beall EL, Lewis PW, Bell M, et al. Discovery of tMAC: a Drosophila testis-specific meiotic arrest complex paralogous to Myb-Muv B. Genes Dev. 2007;21 (8):904–919. doi: 10.1101/gad.1516607. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Chen X, Hiller M, Sancak Y, Fuller MT. Tissue-specific TAFs counteract Polycomb to turn on terminal differentiation. Science. 2005;310 (5749):869–872. doi: 10.1126/science.1118101. [DOI] [PubMed] [Google Scholar]
  • 45.Hiller M, Chen X, Pringle MJ, et al. Testis-specific TAF homologs collaborate to control a tissue-specific transcription program. Development. 2004;131 (21):5297–5308. doi: 10.1242/dev.01314. [DOI] [PubMed] [Google Scholar]
  • 46.Xi R, Xie T. Stem cell self-renewal controlled by chromatin remodeling factors. Science. 2005;310 (5753):1487–1489. doi: 10.1126/science.1120140. [DOI] [PubMed] [Google Scholar]
  • 47.Tamkun JW. The role of brahma and related proteins in transcription and development. Curr Opin Genet Dev. 1995;5 (4):473–477. doi: 10.1016/0959-437x(95)90051-h. [DOI] [PubMed] [Google Scholar]
  • 48.Elfring LK, Deuring R, McCallum CM, Peterson CL, Tamkun JW. Identification and characterization of Drosophila relatives of the yeast transcriptional activator SNF2/SWI2. Mol Cell Biol. 1994;14 (4):2225–2234. doi: 10.1128/mcb.14.4.2225. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Gdula DA, Sandaltzopoulos R, Tsukiyama T, Ossipow V, Wu C. Inorganic pyrophosphatase is a component of the Drosophila nucleosome remodeling factor complex. Genes Dev. 1998;12 (20):3206–3216. doi: 10.1101/gad.12.20.3206. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Wu C, Tsukiyama T, Gdula D, et al. ATP-dependent remodeling of chromatin. Cold Spring Harb Symp Quant Biol. 1998;63:525–534. doi: 10.1101/sqb.1998.63.525. [DOI] [PubMed] [Google Scholar]
  • 51.Orlando V. Polycomb, epigenomes, and control of cell identity. Cell. 2003;112 (5):599–606. doi: 10.1016/s0092-8674(03)00157-0. [DOI] [PubMed] [Google Scholar]
  • 52.Francis NJ, Kingston RE. Mechanisms of transcriptional memory. Nat Rev Mol Cell Biol. 2001;2 (6):409–421. doi: 10.1038/35073039. [DOI] [PubMed] [Google Scholar]
  • 53.Smith ST, Petruk S, Sedkov Y, et al. Modulation of heat shock gene expression by the TAC1 chromatin-modifying complex. Nat Cell Biol. 2004;6 (2):162–167. doi: 10.1038/ncb1088. [DOI] [PubMed] [Google Scholar]
  • 54.Lee N, Zhang J, Klose RJ, et al. The trithorax-group protein Lid is a histone H3 trimethyl-Lys4 demethylase. Nat Struct Mol Biol. 2007;14 (4):341–343. doi: 10.1038/nsmb1216. [DOI] [PubMed] [Google Scholar]
  • 55.Secombe J, Li L, Carlos L, Eisenman RN. The Trithorax group protein Lid is a trimethyl histone H3K4 demethylase required for dMyc-induced cell growth. Genes Dev. 2007;21 (5):537–551. doi: 10.1101/gad.1523007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Eissenberg JC, Shilatifard A. Histone H3 lysine 4 (H3K4) methylation in development and differentiation. Dev Biol. 2009 doi: 10.1016/j.ydbio.2009.08.017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Kuo MH, Allis CD. Roles of histone acetyltransferases and deacetylases in gene regulation. Bioessays. 1998;20 (8):615–626. doi: 10.1002/(SICI)1521-1878(199808)20:8<615::AID-BIES4>3.0.CO;2-H. [DOI] [PubMed] [Google Scholar]
  • 58.De Rubertis F, Kadosh D, Henchoz S, et al. The histone deacetylase RPD3 counteracts genomic silencing in Drosophila and yeast. Nature. 1996;384 (6609):589–591. doi: 10.1038/384589a0. [DOI] [PubMed] [Google Scholar]
  • 59.Gildea JJ, Lopez R, Shearn A. A screen for new trithorax group genes identified little imaginal discs, the Drosophila melanogaster homologue of human retinoblastoma binding protein 2. Genetics. 2000;156 (2):645–663. doi: 10.1093/genetics/156.2.645. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Wang Z, Zang C, Cui K, et al. Genome-wide mapping of HATs and HDACs reveals distinct functions in active and inactive genes. Cell. 2009;138 (5):1019–1031. doi: 10.1016/j.cell.2009.06.049. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Lee N, Erdjument-Bromage H, Tempst P, Jones RS, Zhang Y. The H3K4 demethylase lid associates with and inhibits histone deacetylase Rpd3. Mol Cell Biol. 2009;29 (6):1401–1410. doi: 10.1128/MCB.01643-08. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Spradling AC, Nystul T, Lighthouse D, et al. Stem cells and their niches: integrated units that maintain Drosophila tissues. Cold Spring Harb Symp Quant Biol. 2008;73:49–57. doi: 10.1101/sqb.2008.73.023. [DOI] [PubMed] [Google Scholar]
  • 63.Cheng J, Turkel N, Hemati N, et al. Centrosome misorientation reduces stem cell division during ageing. Nature. 2008;456 (7222):599–604. doi: 10.1038/nature07386. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Sheng XR, Brawley CM, Matunis EL. Dedifferentiating spermatogonia outcompete somatic stem cells for niche occupancy in the Drosophila testis. Cell Stem Cell. 2009;5 (2):191–203. doi: 10.1016/j.stem.2009.05.024. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Ritchie W, Granjeaud S, Puthier D, Gautheret D. Entropy measures quantify global splicing disorders in cancer. PLoS Comput Biol. 2008;4 (3):e1000011. doi: 10.1371/journal.pcbi.1000011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Yeo G, Holste D, Kreiman G, Burge CB. Variation in alternative splicing across human tissues. Genome Biol. 2004;5 (10):R74. doi: 10.1186/gb-2004-5-10-r74. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.Elliott DJ, Grellscheid SN. Alternative RNA splicing regulation in the testis. Reproduction. 2006;132 (6):811–819. doi: 10.1530/REP-06-0147. [DOI] [PubMed] [Google Scholar]
  • 68.Cline TW. The Drosophila sex determination signal: how do flies count to two? Trends Genet. 1993;9 (11):385–390. doi: 10.1016/0168-9525(93)90138-8. [DOI] [PubMed] [Google Scholar]
  • 69.Casper A, Van Doren M. The control of sexual identity in the Drosophila germline. Development. 2006;133 (15):2783–2791. doi: 10.1242/dev.02415. [DOI] [PubMed] [Google Scholar]
  • 70.Wawersik M, Milutinovich A, Casper AL, et al. Somatic control of germline sexual development is mediated by the JAK/STAT pathway. Nature. 2005;436 (7050):563–567. doi: 10.1038/nature03849. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.Ko LJ, Prives C. p53: puzzle and paradigm. Genes Dev. 1996;10 (9):1054–1072. doi: 10.1101/gad.10.9.1054. [DOI] [PubMed] [Google Scholar]
  • 72.Giaccia AJ, Kastan MB. The complexity of p53 modulation: emerging patterns from divergent signals. Genes Dev. 1998;12 (19):2973–2983. doi: 10.1101/gad.12.19.2973. [DOI] [PubMed] [Google Scholar]
  • 73.Yamada Y, Davis KD, Coffman CR. Programmed cell death of primordial germ cells in Drosophila is regulated by p53 and the Outsiders monocarboxylate transporter. Development. 2008;135 (2):207–216. doi: 10.1242/dev.010389. [DOI] [PubMed] [Google Scholar]
  • 74.Zhang J, Schulze KL, Hiesinger PR, et al. Thirty-one flavors of Drosophila rab proteins. Genetics. 2007;176 (2):1307–1322. doi: 10.1534/genetics.106.066761. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Zhang J, Fonovic M, Suyama K, Bogyo M, Scott MP. Rab35 controls actin bundling by recruiting fascin as an effector protein. Science. 2009;325 (5945):1250–1254. doi: 10.1126/science.1174921. [DOI] [PubMed] [Google Scholar]
  • 76.Lighthouse DV, Buszczak M, Spradling AC. New components of the Drosophila fusome suggest it plays novel roles in signaling and transport. Dev Biol. 2008;317 (1):59–71. doi: 10.1016/j.ydbio.2008.02.009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77.Barberan-Soler S, Zahler AM. Alternative splicing regulation during C. elegans development: splicing factors as regulated targets. PLoS Genet. 2008;4 (2):e1000001. doi: 10.1371/journal.pgen.1000001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78.Johnstone O, Lasko P. Translational regulation and RNA localization in Drosophila oocytes and embryos. Annu Rev Genet. 2001;35:365–406. doi: 10.1146/annurev.genet.35.102401.090756. [DOI] [PubMed] [Google Scholar]
  • 79.Macdonald PM, Luk SK, Kilpatrick M. Protein encoded by the exuperantia gene is concentrated at sites of bicoid mRNA accumulation in Drosophila nurse cells but not in oocytes or embryos. Genes Dev. 1991;5 (12B):2455–2466. doi: 10.1101/gad.5.12b.2455. [DOI] [PubMed] [Google Scholar]
  • 80.Hazelrigg T, Watkins WS, Marcey D, et al. The exuperantia gene is required for Drosophila spermatogenesis as well as anteroposterior polarity of the developing oocyte, and encodes overlapping sex-specific transcripts. Genetics. 1990;126 (3):607–617. doi: 10.1093/genetics/126.3.607. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Hazelrigg T, Tu C. Sex-specific processing of the Drosophila exuperantia transcript is regulated in male germ cells by the tra-2 gene. Proc Natl Acad Sci U S A. 1994;91 (22):10752–10756. doi: 10.1073/pnas.91.22.10752. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.Herold N, Will CL, Wolf E, et al. Conservation of the protein composition and electron microscopy structure of Drosophila melanogaster and human spliceosomal complexes. Mol Cell Biol. 2009;29 (1):281–301. doi: 10.1128/MCB.01415-08. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.Fabrizio JJ, Hickey CA, Stabrawa C, et al. Imp (IGF-II mRNA-binding protein) is expressed during spermatogenesis in Drosophila melanogaster. Fly (Austin). 2008;2 (1). doi: 10.4161/fly.5659. [DOI] [PubMed] [Google Scholar]
  • 84.Khil PP, Smirnova NA, Romanienko PJ, Camerini-Otero RD. The mouse X chromosome is enriched for sex-biased genes not subject to selection by meiotic sex chromosome inactivation. Nat Genet. 2004;36 (6):642–646. doi: 10.1038/ng1368. [DOI] [PubMed] [Google Scholar]
  • 85.Wang PJ, McCarrey JR, Yang F, Page DC. An abundance of X-linked genes expressed in spermatogonia. Nat Genet. 2001;27 (4):422–426. doi: 10.1038/86927. [DOI] [PubMed] [Google Scholar]
  • 86.Wu CI, Xu EY. Sexual antagonism and X inactivation--the SAXI hypothesis. Trends Genet. 2003;19 (5):243–247. doi: 10.1016/s0168-9525(03)00058-1. [DOI] [PubMed] [Google Scholar]
  • 87.Hense W, Baines JF, Parsch J. X chromosome inactivation during Drosophila spermatogenesis. PLoS Biol. 2007;5 (10):e273. doi: 10.1371/journal.pbio.0050273. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 88.Brett D, Pospisil H, Valcarcel J, Reich J, Bork P. Alternative splicing and genome complexity. Nat Genet. 2002;30 (1):29–30. doi: 10.1038/ng803. [DOI] [PubMed] [Google Scholar]
  • 89.Park JW, Parisky K, Celotto AM, Reenan RA, Graveley BR. Identification of alternative splicing regulators by RNA interference in Drosophila. Proc Natl Acad Sci U S A. 2004;101 (45):15974–15979. doi: 10.1073/pnas.0407004101. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90.Black DL. Mechanisms of alternative pre-messenger RNA splicing. Annu Rev Biochem. 2003;72:291–336. doi: 10.1146/annurev.biochem.72.121801.161720. [DOI] [PubMed] [Google Scholar]
  • 91.Boyer LA, Mathur D, Jaenisch R. Molecular control of pluripotency. Curr Opin Genet Dev. 2006;16 (5):455–462. doi: 10.1016/j.gde.2006.08.009. [DOI] [PubMed] [Google Scholar]
  • 92.Jaenisch R, Young R. Stem cells, the molecular circuitry of pluripotency and nuclear reprogramming. Cell. 2008;132 (4):567–582. doi: 10.1016/j.cell.2008.01.015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 93.Chen X. Stem cells: What can we learn from flies? Fly (Austin). 2008;2 (1). doi: 10.4161/fly.5872. [DOI] [PubMed] [Google Scholar]
  • 94.Buszczak M, Spradling AC. Searching chromatin for stem cell identity. Cell. 2006;125 (2):233–236. doi: 10.1016/j.cell.2006.04.004. [DOI] [PubMed] [Google Scholar]
  • 95.Barski A, Cuddapah S, Cui K, et al. High-resolution profiling of histone methylations in the human genome. Cell. 2007;129 (4):823–837. doi: 10.1016/j.cell.2007.05.009. [DOI] [PubMed] [Google Scholar]
  • 96.Chintapalli VR, Wang J, Dow JA. Using FlyAtlas to identify better Drosophila melanogaster models of human disease. Nat Genet. 2007;39 (6):715–720. doi: 10.1038/ng2049. [DOI] [PubMed] [Google Scholar]
  • 97.White-Cooper H, Schafer MA, Alphey LS, Fuller MT. Transcriptional and post-transcriptional control mechanisms coordinate the onset of spermatid differentiation with meiosis I in Drosophila. Development. 1998;125 (1):125–134. doi: 10.1242/dev.125.1.125. [DOI] [PubMed] [Google Scholar]
  • 98.Gan Q, Schones DE, Eun S, Wei G, Cui K, Zhao K, Chen X. Monovalent and unpoised status of most genes in undifferentiated cell-enriched Drosophila testis. In revision for resubmission to Genome Biology. doi: 10.1186/gb-2010-11-4-r42. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 99.Klose RJ, Kallin EM, Zhang Y. JmjC-domain-containing proteins and histone demethylation. Nat Rev Genet. 2006;7 (9):715–727. doi: 10.1038/nrg1945. [DOI] [PubMed] [Google Scholar]
  • 100.Allis CD, Berger SL, Cote J, et al. New nomenclature for chromatin-modifying enzymes. Cell. 2007;131 (4):633–636. doi: 10.1016/j.cell.2007.10.039. [DOI] [PubMed] [Google Scholar]
  • 101.Wang L, Charroux B, Kerridge S, Tsai CC. Atrophin recruits HDAC1/2 and G9a to modify histone H3K9 and to determine cell fates. EMBO Rep. 2008;9 (6):555–562. doi: 10.1038/embor.2008.67. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 102.Martin C, Zhang Y. The diverse functions of histone lysine methylation. Nat Rev Mol Cell Biol. 2005;6 (11):838–849. doi: 10.1038/nrm1761. [DOI] [PubMed] [Google Scholar]
  • 103.Foglietti C, Filocamo G, Cundari E, et al. Dissecting the biological functions of Drosophila histone deacetylases by RNA interference and transcriptional profiling. J Biol Chem. 2006;281 (26):17968–17976. doi: 10.1074/jbc.M511945200. [DOI] [PubMed] [Google Scholar]
  • 104.Zhu X, Singh N, Donnelly C, Boimel P, Elefant F. The cloning and characterization of the histone acetyltransferase human homolog Dmel\TIP60 in Drosophila melanogaster: Dmel\TIP60 is essential for multicellular development. Genetics. 2007;175 (3):1229–1240. doi: 10.1534/genetics.106.063685. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 105.Ciurciu A, Komonyi O, Pankotai T, Boros IM. The Drosophila histone acetyltransferase Gcn5 and transcriptional adaptor Ada2a are involved in nucleosomal histone H4 acetylation. Mol Cell Biol. 2006;26 (24):9413–9423. doi: 10.1128/MCB.01401-06. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 106.Zeremski M, Stricker JR, Fischer D, Zusman SB, Cohen D. Histone deacetylase dHDAC4 is involved in segmentation of the Drosophila embryo and is regulated by gap and pair-rule genes. Genesis. 2003;35 (1):31–38. doi: 10.1002/gene.10159. [DOI] [PubMed] [Google Scholar]
  • 107.Rudolph T, Yonezawa M, Lein S, et al. Heterochromatin formation in Drosophila is initiated through active removal of H3K4 methylation by the LSD1 homolog SU(VAR)3-3. Mol Cell. 2007;26 (1):103–115. doi: 10.1016/j.molcel.2007.02.025. [DOI] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

supple Table 1. Table S1: RPKM values for genes expressed in at least one of the four samples.

Where available, the probe IDs from the Affymetrix GeneChip Drosophila Genome 2.0 Array are listed for comparison with microarray data.

supple Table 2. Table S2: Expression of genes involved in splicing in bam vs. wt testis.

A total of 311 genes involved in splicing were compiled from several sources: 21 splicing factors from (http://www.sdbonline.org/fly/aignfam/splice.htm), 161 genes with the “RNA splicing” GO annotation in the AmiGO database (http://amigo.geneontology.org), 97 genes involved in pre-mRNA processing (http://www.wam.umd.edu/~smount/DmRNAfactors/table.html), and 207 genes encoding components of the spliceosome according to the proteomic study 82. These sources are not mutually exclusive, so a gene may appear in more than one of them. We found that 177 of the 311 genes (56.9%) are enriched in bam testis (bam testis/wt testis ≥ 2), whereas only 26 of the 311 genes (8.4%) are enriched in wt testis (wt testis/bam testis ≥ 2). Genes that encode SR proteins are highlighted in green, SR-like proteins in light green, and hnRNPs in yellow.
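
The two-fold enrichment criterion described for Table S2 can be illustrated with a minimal sketch. It assumes that each gene has one RPKM value per sample; the function names are hypothetical and are not taken from the original analysis pipeline. Only the counts 177, 26, and 311 come from the legend above.

```python
def classify_enrichment(bam_rpkm, wt_rpkm, fold=2.0):
    """Classify a gene as enriched in 'bam' testis, 'wt' testis, or 'neither'.

    Hypothetical helper: a gene counts as enriched in one sample when its
    RPKM there is at least `fold` times its RPKM in the other sample.
    Multiplication is used instead of division so that a zero RPKM in the
    other sample does not raise an error.
    """
    if bam_rpkm > 0 and bam_rpkm >= fold * wt_rpkm:
        return "bam"
    if wt_rpkm > 0 and wt_rpkm >= fold * bam_rpkm:
        return "wt"
    return "neither"


def percent_of(count, total):
    """Percentage of `count` in `total`, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Percentages reported in the Table S2 legend.
bam_enriched_pct = percent_of(177, 311)  # 56.9
wt_enriched_pct = percent_of(26, 311)    # 8.4
```

Genes expressed in neither sample fall into the "neither" class here; how the original analysis handled such genes is not stated in the legend.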

supple Table 3. Table S3: Meta table of alternatively spliced genes in all four samples.

Isoform-specific RPKM values for 2,094 genes with two or more uniquely detectable isoforms. The geneAS column lists the uniquely detectable transcripts of each gene in the form T1|T2|…|Tn.

supple Table 4-1. Table S4: Four pair-wise comparisons of genes with differential isoforms in gonads.

(4-1) bam testis vs. bam ovary comparison; (4-2) wt testis vs. wt ovary comparison; (4-3) wt testis vs. bam testis comparison; (4-4) wt ovary vs. bam ovary comparison. All splicing factors or putative splicing factors that have different isoforms in each pair-wise comparison are highlighted in yellow.

supple Table 4-2
supple Table 4-3
supple Table 4-4
supple Table 5. Table S5: Genes that exhibit stage-specific or stage-biased isoforms in bam and wt gonads.

Genes that encode transcription factors or putative transcription factors are highlighted in red. Genes that encode splicing factors or putative splicing factors are highlighted in yellow.

supple Table 6. Table S6: Detection of CFTRs in all four samples.

The identification and analysis of CFTRs are described in Methods.

supple figures

Figure S1: Methods of RNA-seq. (A) Flow chart of the experimental design; (B) Scheme of the data processing procedure.

Figure S2: Comparison of RNA-seq data with microarray data. (A) Comparison of RNA-seq data with the microarray data from 22. Of the 11,893 genes on the FlyGEM microarray, 11,341 have the same gene ID as used for our RNA-seq analysis, so these 11,341 genes were used for this comparison. Of the genes identified in the Parisi et al. study, 89% of the testis-enriched genes and 93% of the ovary-enriched genes were also identified in our study. In the Parisi et al. study, the y,w strain was used as the wt strain, and ovaries and testes were dissected 3 to 5 days post-eclosion. We used the same strain but dissected within 1 day post-eclosion, which may contribute to some of the differences between the two data sets. (B) Comparison of RNA-seq data with the microarray data from FlyAtlas (http://www.flyatlas.org/). We used the 12,394 genes shared between FlyAtlas and our RNA-seq data for this comparison. We detected 3,615 and 3,910 genes enriched in wt testis and wt ovary, respectively, whereas FlyAtlas identified 2,602 and 3,939 genes, respectively. Of the genes identified by FlyAtlas, 91% of the testis-enriched genes and 77% of the ovary-enriched genes were also identified in our RNA-seq data. In FlyAtlas, the Canton S strain was used as the wt strain, and ovaries and testes were dissected 7 days post-eclosion, which may contribute to some of the differences between the two data sets.

Figure S3: More examples of genes that have sex-specific isoforms. (A–F) UCSC snapshots show that the Itd gene (A) has sex-specific isoforms in bam testis and bam ovary samples, and that the smg (B), aret (C), BicC (D), qua (E), and capu (F) genes have sex-specific isoforms in wt testis and wt ovary samples.

Figure S4: Venn diagrams of pair-wise comparisons of CFTRs. In each comparison, CFTRs were considered shared between the two samples if their corresponding genomic loci overlapped by ≥ 20 bp. The numbers in brackets represent the shared CFTRs from the right sample in each pair-wise comparison. All CFTRs had ≥ 10 sequencing reads and ≥ 2 RPKM. Identification of CFTRs is described in Methods, and all CFTRs are listed in Table S6.
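
The ≥ 20 bp overlap rule used to call a CFTR shared can be sketched as follows. This is an illustration only: it assumes that each CFTR is given as a (chromosome, start, end) tuple with half-open coordinates, and the function names and example regions are hypothetical rather than taken from the original pipeline. Because one region in the left sample can overlap several regions in the right sample (and vice versa), the two shared counts need not be equal, which is why a Venn diagram can carry a separate bracketed number for the right sample.

```python
def overlap_bp(start_a, end_a, start_b, end_b):
    """Number of bases shared by two half-open intervals (0 if disjoint)."""
    return max(0, min(end_a, end_b) - max(start_a, start_b))


def count_shared(left, right, min_overlap=20):
    """Count shared regions on each side of a pair-wise comparison.

    `left` and `right` are lists of (chrom, start, end) tuples. A region is
    counted as shared if it overlaps at least one region on the same
    chromosome in the other sample by `min_overlap` bases or more.
    Returns (shared_in_left, shared_in_right).
    """
    def is_shared(region, others):
        chrom, start, end = region
        return any(
            chrom == o_chrom
            and overlap_bp(start, end, o_start, o_end) >= min_overlap
            for o_chrom, o_start, o_end in others
        )

    shared_left = sum(1 for r in left if is_shared(r, right))
    shared_right = sum(1 for r in right if is_shared(r, left))
    return shared_left, shared_right


# Hypothetical example regions, not real CFTR coordinates.
left_sample = [("2L", 100, 200), ("2L", 160, 230), ("2L", 500, 600)]
right_sample = [("2L", 150, 260), ("2L", 585, 700), ("3R", 100, 200)]
shared_left, shared_right = count_shared(left_sample, right_sample)
```

In this example two left-sample regions overlap the same right-sample region, so the left count is 2 and the right count is 1. The all-against-all scan is quadratic; sorted intervals or an interval tree would be the usual choice for genome-scale lists.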
