Proceedings of the National Academy of Sciences of the United States of America. 2010 Jul 1;107(29):12975–12979. doi: 10.1073/pnas.1007586107

Global analysis of trans-splicing in Drosophila

C Joel McManus, Michael O Duff, Jodi Eipper-Mains, Brenton R Graveley
PMCID: PMC2919919  PMID: 20615941

Abstract

Precursor mRNA (pre-mRNA) splicing can join exons contained on either a single pre-mRNA (cis) or on separate pre-mRNAs (trans). Trans-splicing between protein-coding exons is exceedingly rare and has been demonstrated for only two Drosophila genes: mod(mdg4) and lola. It has also been suggested that trans-splicing is a mechanism for the generation of chimeric RNA products containing sequence from multiple distant genomic sites. Because most high-throughput approaches cannot distinguish cis- and trans-splicing events, the extent to which trans-splicing occurs between protein-coding exons in any organism is unknown. Here, we used paired-end deep sequencing of mRNA to identify genes that undergo trans-splicing in Drosophila interspecies hybrids. We did not observe credible evidence for the existence of chimeric RNAs generated by trans-splicing of RNAs transcribed from distant genomic loci. Rather, our data suggest that experimental artifacts are the source of most, if not all, apparent chimeric RNA products. We did, however, identify 80 genes that appear to undergo trans-splicing between homologous alleles and can be classified into three categories based on their organization: (i) genes with multiple 3′ terminal exons, (ii) genes with multiple first exons, and (iii) genes with very large introns, often containing other genes. Our results suggest that trans-splicing between homologous alleles occurs more commonly in Drosophila than previously believed and may facilitate expression of architecturally complex genes.

Keywords: chimeric RNA, RNA-seq, genomics, bioinformatics, deep sequencing


Precursor mRNA (pre-mRNA) splicing is an essential process in eukaryotic gene expression. Splicing can occur either within a single pre-mRNA (in cis) or between two different pre-mRNAs (in trans) (1, 2). The best-characterized form of trans-splicing occurs commonly in nematodes and trypanosomes. In these organisms, spliced-leader RNAs are added to the 5′ ends of many, if not all pre-mRNAs (3, 4). Examples of trans-splicing that do not involve spliced-leader RNAs, but rather occur between coding exons, are exceedingly rare, and only two Drosophila genes are known to be trans-spliced: mod(mdg4) (5, 6) and lola (7).

The Drosophila genes mod(mdg4) and lola both contain common 5′ exons and multiple alternative 3′ terminal exons. Although the exons of mod(mdg4) are encoded on both DNA strands (5, 6), and therefore require trans-splicing, all of the lola exons are encoded on the same DNA strand (7), suggesting that they are cis-spliced. However, interallelic complementation studies have demonstrated that at least some lola isoforms are generated by trans-splicing (7). This finding demonstrates that trans-spliced genes cannot be identified based on their genomic organization alone, and raises the possibility that other Drosophila genes could use trans-splicing for mRNA synthesis.

Trans-splicing may also be a mechanism for the generation of so-called chimeric RNAs, which contain sequences originating from distant genomic loci (8). However, apparent chimeric RNAs can also be generated by homology-driven template switching during RT-PCR (9–11), and adequate controls are needed to identify these experimental artifacts. One of the more complete reports describing chimeric RNAs found an enrichment of short homologous sequences (SHSs) at chimeric RNA junction sites (12). Although the authors suggested that cellular RNA polymerases switch DNA templates at SHSs (12), RT-PCR strand-switching at SHSs is a more likely explanation, given that both reverse-transcriptase and Taq DNA polymerase are known to strand-switch and multiple amplification cycles were used. A more recent study described the existence of several hundred chimeric RNAs in the rice transcriptome; however, control experiments to eliminate strand-switching as an explanation were not provided (13).

We used high-throughput sequencing of Drosophila hybrid mRNA and a mixed mRNA-negative control sample to investigate the extent and specificity of trans-splicing. The trans-splicing of mod(mdg4) and lola was extremely specific, as no chimeric products between these two genes were observed. In addition, 80 other candidate trans-spliced genes were identified, 6 of which were validated. These unique trans-spliced genes have complex genomic architecture, suggesting that trans-splicing may facilitate expression of genes whose structure would otherwise pose challenges to the gene-expression machinery. Finally, we report a high background of chimeric mRNA products in our negative control sample, which suggests that mRNAs that appear to link distant genomic loci likely result from experimental errors.

Results

Paired-End mRNA-seq to Identify trans-Spliced Genes.

To search for additional trans-spliced genes, we performed paired-end deep sequencing of mRNA isolated from F1 hybrid progeny generated from crossing Drosophila melanogaster females to Drosophila sechellia males (Fig. 1). These species were chosen because their genome assemblies are of sufficient quality and these two species have sufficient sequence divergence (∼2–3% across annotated genes) to map RNA-seq reads allele-specifically. To differentiate trans-spliced RNAs generated in the animal from chimeric products generated through library preparation artifacts (9–11) or sequencing errors, we also sequenced a negative control library prepared by mixing equal amounts of RNA isolated from the D. melanogaster and D. sechellia parents. We obtained 49 and 54 million mate-pairs from the control and hybrid libraries, respectively. All reads were separately aligned to both the D. melanogaster and D. sechellia genomes to identify reads that mapped perfectly (without mismatches) and uniquely to only one species. This alignment resulted in 9,815,247 hybrid and 9,198,164 control mate-pairs, where both reads were species-specific. Mate-pairs where both reads map to the same species are referred to as cis–mate-pairs (9,678,331 hybrid and 9,069,982 control mate-pairs). In contrast, mate-pairs where each read maps to a different species are referred to as trans–mate-pairs (136,916 hybrid and 128,182 control mate-pairs). We next mapped the reads in the cis– and trans–mate-pairs to exons of protein-coding genes. Mate-pairs in which the two reads mapped to different exons (either within the same gene or between different annotated genes) were considered as candidates for being generated by splicing (49.4% of the hybrid and 56.4% of the control mate-pairs).
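The cis/trans definition above can be illustrated with a minimal Python sketch. The function name, the species labels `mel` and `sec`, and the toy input are assumptions made for illustration and are not the authors' scripts; the only real values are the mate-pair counts quoted in the text, which the sketch uses to check that cis and trans counts sum to the reported totals.

```python
from collections import Counter

def classify_mate_pair(species_read1, species_read2):
    """Return 'cis' if both reads map to the same species, 'trans' otherwise.

    Each argument is the species a read mapped to perfectly and uniquely
    ('mel' or 'sec'; hypothetical labels), or None if the read was not
    species-specific. Pairs with a non-specific read return None.
    """
    if species_read1 is None or species_read2 is None:
        return None
    return "cis" if species_read1 == species_read2 else "trans"

# Toy input (hypothetical, not real data).
pairs = [("mel", "mel"), ("sec", "sec"), ("mel", "sec"), ("sec", None)]
counts = Counter(classify_mate_pair(a, b) for a, b in pairs)

# Consistency check on the counts reported in the text.
hybrid_cis, hybrid_trans = 9_678_331, 136_916
control_cis, control_trans = 9_069_982, 128_182
hybrid_total = hybrid_cis + hybrid_trans
control_total = control_cis + control_trans
hybrid_trans_fraction = hybrid_trans / hybrid_total
control_trans_fraction = control_trans / control_total
```

With the reported counts, trans–mate-pairs make up roughly 1.4% of species-specific mate-pairs in both the hybrid and the control library, so the library-wide trans fraction alone does not separate genuine trans-splicing from background.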

Fig. 1.

Deep sequencing to search for trans-spliced genes and chimeric RNAs. Sequencing libraries were prepared from poly(A)-selected RNA from F1 hybrids of D. melanogaster and D. sechellia, and from a mixture of parental RNA (control). Libraries were subjected to paired-end deep sequencing, and species-specific sequence reads were identified by comparing genomic alignments. Sequence mate-pairs in which both reads mapped to the same (cis) or different (trans) species were mapped to genes to identify pairs indicative of splicing.

Frequency and Specificity of mod(mdg4) and lola trans-Splicing.

Examination of the two known trans-spliced genes, mod(mdg4) and lola, revealed that this approach can indeed identify trans-splicing events. For mod(mdg4), we obtained 50 trans–mate-pairs from the hybrid but only 2 from the control (Fig. S1). Similarly, for lola we obtained 43 trans–mate-pairs from the hybrid library and none from the control library. Importantly, six mod(mdg4) and four lola trans-splicing events, including one previously identified event in mod(mdg4), were supported by as few as one trans–mate-pair.

Previous studies demonstrated trans-splicing for only 6 of 28 and 4 of 22 known mod(mdg4) and lola 3′ terminal exon groups, and many trans-spliced products were detected only in the context of overexpressed transgenes, which may not reflect natural phenomena (5–7, 14). Our results show that 22 of 24 (92%) and 12 of 17 (71%) of the expressed, annotated mod(mdg4) and lola isoforms, respectively, have at least one trans–mate-pair or reside on the antisense strand, and are therefore trans-spliced (Fig. 2). Thus, mod(mdg4) and lola mRNAs appear to be generated almost entirely by trans-splicing.

Fig. 2.

Trans-splicing of mod(mdg4) and lola. The sequencing results obtained for mod(mdg4) (A) and lola (B) are shown. The horizontal gray line separates the sense and antisense exons of mod(mdg4). The 3′ terminal exon groups for which deep sequencing data support trans-splicing (green), only cis-splicing (red), or are not expressed in the hybrid (gray) are shown. Isoforms for which trans-splicing was previously reported are depicted with an asterisk. The number of cis– (red) and trans– (green) mate-pairs observed for each isoform of mod(mdg4) and lola are shown (bar graphs).

As the mod(mdg4) and lola 3′ terminal exons all have the same reading frame, chimeric mRNAs synthesized by trans-splicing of mod(mdg4) common exons to lola variable exons (and vice versa) would be refractory to nonsense-mediated decay. We therefore assessed the frequency of aberrant mod(mdg4) and lola trans-splicing by searching for mate-pairs between mod(mdg4) and lola. Importantly, we did not observe any mate-pairs, from either the same or opposite species, between mod(mdg4) and lola in the hybrid dataset. Furthermore, although we did observe some single mate-pairs between mod(mdg4) or lola and other genes, these were more prevalent in the control (49 mate-pairs) than in the hybrid (26 mate-pairs), suggesting that these are most likely artifacts (Table S1). Thus, trans-splicing of mod(mdg4) and lola is highly specific.

Detection and Validation of Novel trans-Splicing Events.

We next searched for new examples of trans-splicing within the same gene. Two thousand one hundred seventy-seven genes had at least one trans–mate-pair and were considered candidate trans-spliced genes. However, several factors including strand-switching, deep sequencing errors, or reference genome errors resulted in false-positives (Fig. S2). We therefore visually evaluated each candidate gene in a genome browser to remove those with potential false-positive signals (see Materials and Methods, Fig. S2, and Tables S2 and S3). This visual curation step resulted in a final collection of 80 trans-splicing candidate genes.

We used a species-specific RT-PCR/sequencing assay (15) to validate the existence of trans-spliced mRNAs for mod(mdg4), lola, and six candidate genes. To confirm trans-splicing, we required that an RT-PCR product was obtained from the hybrid RNA, but not from the individual parents or the control. The hybrid RT-PCR products were cloned and sequenced to verify that SNPs between the primers and the exon boundaries showed a clean transition at exon-exon junctions. Using these stringent criteria, we confirmed trans-splicing for three undocumented isoforms from mod(mdg4) and lola, and all of the tested candidate genes (Fig. 3 and Fig. S3).

Fig. 3.

Examples of newly identified trans-spliced genes. Trans-splicing was validated using RT-PCR with primers specific to D. melanogaster (red) and D. sechellia (blue). Trans-splicing is validated by the presence of RT-PCR products when using opposite species forward and reverse primers with hybrid, but not mixed control (Mix) cDNA. Several clones of these putative trans-splicing products were sequenced to verify a clean transition of species-specific sequences at splicing junctions. (A) CG42235 contains a set of common 5′ exons which are trans-spliced to multiple alternative 3′ terminal exon groups. (B) The ome gene has multiple alternative transcription initiation exons which are trans-spliced to a set of common 3′ terminal exons. (C) Nmdmc is an example of trans-splicing of nested genes, as the gene Rel is located within the intron.

Candidate Chimeric RNA Products Carry Hallmarks of RT-PCR Artifacts.

We searched for cases of trans-splicing of exons located in different annotated genes on the same chromosome or on different chromosomes. As with mod(mdg4) and lola trans-splicing, we expect that the genes involved in any new cases would be specific (not promiscuous), would involve splicing of RNA derived from the transcribed strand of the annotated isoforms, and would not involve genes from the mitochondrial genome. Of the 128,958 pairs of genes connected by at least one intergenic mate-pair, 74,383 (58%) had at least one mate-pair derived from the noncoding strand and 1,307 (1%) involved genes from the mitochondrial genome. Nearly all (54,558) of the remaining 54,575 gene pairs were promiscuous, in that at least one of the genes in a pair was involved in more than one intergenic pairing. Strikingly, 16 of the 17 coding, nonpromiscuous intergenic pairs involved single mate-pairs between adjacent or nested genes on the same chromosome, and none of these were trans–mate-pairs (opposite allele pairs), suggesting the genes connected by these mate-pairs may be misannotated, are part of the same transcription unit, and are therefore actually cases of intragenic cis-splicing (Table S4). The remaining coding, nonpromiscuous intergenic gene pair involves a single cis–mate-pair between two paralogs of His3 (CG33845 and CG33821) that differ in sequence by a single nucleotide, suggesting that this mate-pair resulted from a sequencing error. Given these results, we next investigated whether the intergenic trans–mate-pairs in our dataset could result from strand-switching artifacts generated by RT-PCR during library preparation.
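The three exclusion criteria described above (noncoding-strand mate-pairs, mitochondrial genes, promiscuity) can be sketched as a simple filter. This is a minimal illustration, not the authors' pipeline: the function name, the input layout, and the gene names are hypothetical, and the sketch assumes that promiscuity is counted among the pairs that survive the first two filters, which the text does not specify. It also does not attempt to reproduce how the mitochondrial pairs overlap with the other categories.

```python
from collections import Counter

def filter_gene_pairs(pairs, mito_genes):
    """Keep coding, non-mitochondrial, nonpromiscuous gene pairs.

    pairs: list of (gene_a, gene_b, has_noncoding_mate_pair) tuples.
    mito_genes: set of gene names located on the mitochondrial genome.
    """
    # Drop pairs with a noncoding-strand mate-pair or a mitochondrial gene.
    kept = [
        (a, b)
        for a, b, has_noncoding in pairs
        if not has_noncoding and a not in mito_genes and b not in mito_genes
    ]
    # Count how many remaining pairings each gene takes part in
    # (assumption: promiscuity is evaluated on the remaining pairs).
    partners = Counter()
    for a, b in kept:
        partners[a] += 1
        partners[b] += 1
    # A pair is nonpromiscuous only if both genes occur in exactly one pairing.
    return [(a, b) for a, b in kept if partners[a] == 1 and partners[b] == 1]

# Toy input with hypothetical gene names.
toy_pairs = [
    ("g1", "g2", False),
    ("g1", "g3", False),       # g1 is promiscuous
    ("g4", "g5", True),        # noncoding-strand mate-pair
    ("g6", "mito_gene", False),
    ("g7", "g8", False),
]
survivors = filter_gene_pairs(toy_pairs, mito_genes={"mito_gene"})

# Arithmetic from the text: pairs left after removing noncoding-strand pairs,
# then after removing promiscuous pairs.
remaining_after_noncoding = 128_958 - 74_383
nonpromiscuous = remaining_after_noncoding - 54_558
```

The reported numbers are internally consistent: 128,958 minus 74,383 gives the 54,575 remaining gene pairs, and removing the 54,558 promiscuous pairs leaves the 17 coding, nonpromiscuous pairs discussed in the text.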

Strand-switching is dependent on two major factors: template homology and concentration. We found that the most frequent cases of intergenic trans–mate-pairs involved different members of highly homologous gene families. For example, Actin paralogs located on different chromosomes were the most abundant intergenic mate-pairs in our dataset. We also observed strong correlations between gene template concentration, measured in total mapped reads, and the number of intergenic trans–mate-pairs (Pearson's r = 0.88 and 0.81 for different genes on different chromosomes and on the same chromosome, respectively). For comparison, the correlation between template concentration and same-gene trans–mate-pairs was relatively weak (Pearson's r = 0.33). Finally, we compared the tissue-specific expression patterns of the interchromosomal gene pairs to examine whether mRNAs from these genes were expressed in the same tissues (16). We find that ∼7.4% of interchromosomal gene pairs are not coexpressed in D. melanogaster (Table S5). Together, these observations suggest that the vast majority of intergenic trans–mate-pairs are derived from RT-PCR strand-switching artifacts and sequencing errors. Thus, we do not find reliable evidence of chimeric RNA production in adult Drosophila.
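For readers unfamiliar with the statistic, the sketch below computes Pearson's r between per-gene read counts and per-gene trans–mate-pair counts. The data are toy values invented for illustration, and the text does not say whether counts were transformed (for example, log-scaled) before the correlation was computed, so no transformation is applied here.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Toy data (hypothetical): total mapped reads per gene and the number of
# intergenic trans-mate-pairs involving that gene.
mapped_reads = [100, 200, 300, 400]
trans_pairs = [2, 4, 6, 8]
r_toy = pearson_r(mapped_reads, trans_pairs)
```

A value near 1, as in the intergenic case reported above, is what one would expect if chimeric pairs arise in proportion to template abundance, which is the signature of a concentration-dependent artifact.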

Discussion

The approach described in this study is unique in providing a genome-wide survey of trans-splicing, and reveals that trans-splicing between protein-coding exons is more widespread than previously appreciated. At the same time, our results indicate that trans-splicing in Drosophila is extremely specific. Interestingly, homologous chromosomes are paired in Drosophila somatic cells (17) and chromosomal pairing appears to be required for efficient lola trans-splicing (7). This suggests the possibility that chromosomal pairing may be a general requirement for efficient, specific trans-splicing between homologous genes in Drosophila.

The candidate trans-spliced genes we identified can be grouped into three categories. The first class consists of genes that contain at least two alternative 3′ terminal exons, like mod(mdg4) and lola (Fig. 3A). The most notable example from this class is CG42235, in which trans–mate-pairs mapped to the CG42235-RD and CG42235-RE isoforms, both of which were validated. The second class contains genes with at least two alternative 5′ terminal exons, such as ome (Fig. 3B). The final category included genes with large introns, which frequently contain nested genes within the intron, such as Nmdmc (Fig. 3C). Intriguingly, the architecture of trans-spliced genes in each class creates obstacles for the gene-expression machinery. For example, collisions of transcription complexes may occur in nested genes. For genes containing alternative 5′ terminal exons, use of distal exons requires active repression of proximal exons. Finally, for genes containing alternative 3′ terminal exons, it is necessary to actively repress all proximal 3′ splice sites, premature 3′ end formation, and transcription termination before synthesis and splicing of the distal exons. In each of these cases, trans-splicing of separate pre-mRNAs generated using distinct promoters and transcription termination sites would overcome all of these obstacles.

In some cases, the frequency of trans-splicing is very low. This finding may reflect a low background of “noisy” trans-splicing or a low level of strand-switching or sequencing errors that occurred only in the hybrid sample. Alternatively, the trans–mate-pairs from these genes could have resulted from cis-splicing of transcripts expressed in a small population of cells in which somatic recombination has occurred between the D. melanogaster and D. sechellia alleles. Although we cannot exclude these possibilities, we note that our validation experiments were performed using biological replicate samples. Thus, it seems unlikely that the same experimental errors or somatic recombination events would occur in multiple biological samples.

Our approach also allowed us to evaluate the extent of strand-switching that occurs in deep sequencing experiments. Most cross-chromosomal trans–mate-pairs we observed result from RT-PCR artifacts and do not represent biologically generated chimeric mRNAs. Consequently, we do not find credible evidence of intergenic chimeric RNA production in adult Drosophila. Because we observed a large number of false-positive chimeric RNA signals, our data further suggest that reports of chimeric RNAs should be treated with caution, especially when the supporting data are generated using RT-PCR. However, our results do not preclude the existence of chimeric RNAs in other species. For example, exons from the mosquito bursicon mRNA were recently found to be encoded on two separate chromosomes, suggesting that trans-splicing is required for bursicon mRNA synthesis (18). Another recent report described a chimeric RNA comprised of exons from the human JJAZ1 and JAZF1 genes located on chromosomes 7 and 17, respectively (15). This chimeric RNA can be formed in in vitro splicing reactions, suggesting the possibility that the chimeric RNA can be produced via trans-splicing in vivo. However, this result does not entirely eliminate the possibility that the chimeric product was produced during RT-PCR amplification. Improvements in direct RNA sequencing (19, 20) should eventually allow the direct detection of any genuine chimeric RNAs, without the introduction of strand-switching artifacts inherent to reverse transcription and PCR amplification.

Regardless of the precise mechanism by which trans-splicing occurs and the purpose of trans-splicing, the results presented here identify several additional protein-coding genes that are trans-spliced in Drosophila. Because we have only examined trans-splicing in adult females, these results certainly underestimate the frequency of trans-splicing. Thus, deeper sequencing to analyze trans-splicing throughout Drosophila development will likely identify additional trans-spliced genes. Conducting similar experiments in other species, including humans, whose genomes contain many genes with long introns (e.g., c-Abl), multiple promoters (e.g., PCDHGA), and multiple 3′ terminal exons (e.g., IGHA1), may reveal that trans-splicing between protein-coding exons is even more ubiquitous.

Materials and Methods

Flies/Crosses.

Flies were reared on standard cornmeal/molasses medium at 25 °C. The F1 hybrids used resulted from crossing 7 females of the D. melanogaster strain 14021–0231.36 (y[1]; Gr22b[1] Gr22d[1] cn[1] CG33964[R4.2] bw[1] sp[1]; LysC[1] MstProx[1] GstD5[1] Rh6[1]) with approximately 30 males of the D. sechellia strain 14021–0248.25 (wild-type). Only female hybrids are viable from this cross.

Library Preparation and Sequencing.

mRNA sequencing libraries were prepared according to the manufacturer's specifications (Illumina). Total RNA was prepared from whole flies using TRIzol (Invitrogen) and treated with DNase I to remove any contaminating DNA. Nine micrograms of total RNA from hybrid and control (4.5 μg D. melanogaster RNA + 4.5 μg D. sechellia RNA) females was used as input for library preparation. Poly(A)+ RNA selected using Dynal magnetic beads (Invitrogen) was fragmented using RNA fragmentation reagent (Ambion), and reverse-transcribed using random primers and SuperScript II (Invitrogen). The resulting cDNA was size-selected (∼370 bp) on 2% agarose (TAE) gels. Libraries were subsequently prepared for sequencing using the Paired-end Genomic DNA Library kit (Illumina). Libraries were sequenced in six (hybrid) and four (control) lanes on an Illumina GAIIx using a 37-cycle paired-end sequencing protocol, and one (hybrid) and two (control) lanes using a 76-cycle paired-end protocol. Sequence reads from 76-cycle runs were trimmed to their first 37 bases for comparison with the other data.
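The trimming step mentioned at the end of this paragraph amounts to keeping the first 37 bases of each 76-base read, together with the matching quality scores. A minimal sketch, with a hypothetical function name and toy read, might look like this:

```python
TRIM_LENGTH = 37  # read length of the shorter runs, as stated in the text

def trim_read(sequence, qualities, length=TRIM_LENGTH):
    """Keep only the first `length` bases and their quality scores."""
    return sequence[:length], qualities[:length]

# Toy 76-base read (hypothetical sequence and quality string).
long_read = "ACGT" * 19
long_qual = "I" * 76
seq37, qual37 = trim_read(long_read, long_qual)
```

Reads already at or below 37 bases pass through unchanged, so the same function can be applied to data from both run types.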

mRNA-seq Data Analysis.

Sequence image analysis was performed using the Firecrest, Bustard and GERALD programs (Illumina). Sequences were aligned separately to both the D. melanogaster (2006, dm3) and D. sechellia (droSec1) genome assemblies (21) using Bowtie (22). Allele-specific sequence read assignments were performed as previously described (23). Briefly, sequence reads were aligned requiring no mismatches, and alignment results were compared to identify sequences that aligned to only one genome and mapped to a single genomic location. The coordinates of D. sechellia-specific reads were converted to their syntenic D. melanogaster coordinates using the lift-over tool (http://genome.ucsc.edu). Species-specific sequence reads were mapped to all annotated exons (Flybase 5.11) using a custom perl script “exonhitter” (23).
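The allele-specific assignment rule described above (a perfect match in exactly one genome, at exactly one location) can be written as a short decision function. This is a sketch under stated assumptions: the function name, the species labels, and the representation of alignments as lists of perfect-match locations are hypothetical and do not reproduce the authors' scripts or Bowtie's output format.

```python
def assign_species(mel_hits, sec_hits):
    """Assign a read to a species, or return None if it is not species-specific.

    mel_hits / sec_hits: lists of perfect-match (zero-mismatch) alignment
    locations in the D. melanogaster and D. sechellia genomes, respectively.
    A read is species-specific only if it aligns to one genome and maps to
    a single location there.
    """
    if len(mel_hits) == 1 and not sec_hits:
        return "mel"
    if len(sec_hits) == 1 and not mel_hits:
        return "sec"
    return None

# Toy alignments (hypothetical coordinates).
only_mel = assign_species([("2L", 1000)], [])
only_sec = assign_species([], [("scaffold_0", 500)])
both = assign_species([("2L", 1000)], [("scaffold_0", 500)])
multi = assign_species([("2L", 1000), ("3R", 42)], [])
```

Reads that match both genomes, or that match one genome at several locations, carry no allele information and are discarded, which is why only a fraction of the sequenced mate-pairs enter the cis/trans analysis.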

Additional custom scripts were used to identify cis– and trans–mate-pairs and for further downstream analyses. Mate-pairs were first examined to identify pairs whose reads mapped to different exons. These pairs were further separated into “same gene” and “different gene” categories if the ends of the pair mapped to the same or different genes, respectively. “Different gene” read-pairs were parsed into same- and different-chromosome categories, if the genes to which they mapped were located on the same or different chromosomes. The number of mate-pairs mapping to each exon pair was counted.
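The categorization just described can be sketched as follows. The `ExonHit` record, the category labels, and the toy gene and exon names are hypothetical; the original analysis used custom scripts that are not reproduced here.

```python
from collections import Counter
from typing import NamedTuple, Optional

class ExonHit(NamedTuple):
    gene: str
    exon: str
    chrom: str

def categorize_pair(hit1: ExonHit, hit2: ExonHit) -> Optional[str]:
    """Categorize a mate-pair whose two reads were each mapped to an exon."""
    if hit1.gene == hit2.gene:
        if hit1.exon == hit2.exon:
            return None  # same exon: not a candidate for splicing
        return "same_gene"
    if hit1.chrom == hit2.chrom:
        return "different_gene_same_chromosome"
    return "different_gene_different_chromosome"

# Toy mate-pairs (hypothetical genes and exons).
pairs = [
    (ExonHit("geneA", "e1", "2L"), ExonHit("geneA", "e2", "2L")),
    (ExonHit("geneA", "e2", "2L"), ExonHit("geneA", "e1", "2L")),
    (ExonHit("geneA", "e1", "2L"), ExonHit("geneA", "e1", "2L")),
    (ExonHit("geneA", "e1", "2L"), ExonHit("geneB", "e1", "2L")),
    (ExonHit("geneA", "e1", "2L"), ExonHit("geneC", "e1", "3R")),
]

category_counts = Counter()
exon_pair_counts = Counter()
for a, b in pairs:
    category = categorize_pair(a, b)
    if category is None:
        continue
    category_counts[category] += 1
    # Sort the two (gene, exon) keys so read order does not matter.
    key = tuple(sorted([(a.gene, a.exon), (b.gene, b.exon)]))
    exon_pair_counts[key] += 1
```

Sorting the exon keys before counting means a pair observed as (e1, e2) and another observed as (e2, e1) contribute to the same exon-pair tally.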

The total number of cis- and trans- “same gene” mate-pairs was calculated for each gene. All genes with at least one hybrid trans–mate-pair were considered as trans-splicing candidates. Custom browser tracks were generated to view the location of allele-specific sequence reads and trans–mate-pairs on the University of California–Santa Cruz genome browser. The trans-splicing candidate genes were visually evaluated to identify genes whose hybrid and negative control trans–mate-pairs align to the same sets of SNPs, which is indicative of strand-switching and mapping bias because of reference genome errors (Fig. S2). Candidate trans-splicing events containing these potential sources of error were not considered further.

The mRNA-seq protocol used in this study results in sequences that are not strand-specific (i.e., one does not know from which strand an observed mRNA-seq read was generated). However, the relative strands of each sequence mate-pair can be analyzed. If a mate-pair was generated from a continuous mRNA, the mate-pair reads should map to opposite strands in the reference genome. We used this relative strand information to determine whether the reads in putative chimeric read-pairs could both come from a protein-coding sequence. For example, if two genes were encoded on the positive DNA strand, the reads in a chimeric mate-pair derived from the coding sequence of both genes would align to opposite DNA strands. If the reads in a mate-pair aligned to the same DNA strand, the sequence from one read in the pair must have originated from the noncoding strand of a gene. We calculated the frequency of coding and noncoding mate-pairs for each apparent chimeric junction between two different genes using a custom perl script. Custom scripts were also used to identify genes with multiple chimeric junctions (gene1 to gene2, gene1 to gene3, and so forth). Tissue-specific gene expression patterns were downloaded from FlyAtlas (http://flyatlas.org/) (16) and custom perl scripts were used to compare the expression patterns of genes in each potential chimeric gene pair. Genes were considered to be expressed in a tissue if all four of the microarray experiments reported expression.
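The relative-strand test described above can be expressed as a small predicate. The text gives the rule only for two genes encoded on the same DNA strand; the extension to genes on opposite strands in this sketch is an inference (each read's orientation is compared with its own gene's strand) and should be read as an assumption, as should the function and parameter names.

```python
def is_coding_consistent(read1_strand, gene1_strand, read2_strand, gene2_strand):
    """Return True if both reads of a mate-pair could derive from coding sequence.

    Strands are '+' or '-'. In a mate-pair from one continuous mRNA, one read
    matches the sense orientation of its gene and the other matches the
    antisense orientation. For two genes on the same strand this reduces to
    the rule in the text: the reads must align to opposite DNA strands.
    """
    read1_is_sense = read1_strand == gene1_strand
    read2_is_sense = read2_strand == gene2_strand
    return read1_is_sense != read2_is_sense

# Both genes on the positive strand (the case described in the text).
opposite_strands = is_coding_consistent("+", "+", "-", "+")
same_strand = is_coding_consistent("+", "+", "+", "+")
```

Under this rule, a mate-pair between two positive-strand genes whose reads align to the same strand is flagged as involving the noncoding strand of one gene, matching the example given in the text.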

Validation.

RNA from different biological replicates was reverse-transcribed using SuperScript II RT (Invitrogen) to prepare cDNA for validation PCR. Species-specific primers (Table S6) were designed for 23 isoforms of 20 candidate genes [including mod(mdg4) and lola]. Species-specific PCR amplification was successful for 11 genes, failed completely (no product was generated) for 3 genes, and was nonspecific for 6 genes. RT-PCR products generated from hybrid cDNA were cloned and sequenced to verify clean SNP transitions at exon-exon junctions (Fig. S3).

Acknowledgments

We thank members of the B.R.G. laboratory for discussions and comments on the manuscript, Thom Theara for assistance with the Illumina GAIIx, and the University of Connecticut Health Center Translational Genomics Core Facility for use of the instrument. This work was supported by National Institutes of Health Grant GM062516 (to B.R.G.).

Footnotes

The authors declare no conflict of interest.

Data deposition: The data reported in this paper have been deposited in the Gene Expression Omnibus (GEO) database, www.ncbi.nlm.nih.gov/geo (accession no. GSE20421).

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1007586107/-/DCSupplemental.

References

  • 1. Konarska MM, Padgett RA, Sharp PA. Trans splicing of mRNA precursors in vitro. Cell. 1985;42:165–171. doi: 10.1016/s0092-8674(85)80112-4.
  • 2. Solnick D. Trans splicing of mRNA precursors. Cell. 1985;42:157–164. doi: 10.1016/s0092-8674(85)80111-2.
  • 3. Sutton RE, Boothroyd JC. Evidence for trans splicing in trypanosomes. Cell. 1986;47:527–535. doi: 10.1016/0092-8674(86)90617-3.
  • 4. Nilsen TW. Evolutionary origin of SL-addition trans-splicing: Still an enigma. Trends Genet. 2001;17:678–680. doi: 10.1016/s0168-9525(01)02499-4.
  • 5. Dorn R, Reuter G, Loewendorf A. Transgene analysis proves mRNA trans-splicing at the complex mod(mdg4) locus in Drosophila. Proc Natl Acad Sci USA. 2001;98:9724–9729. doi: 10.1073/pnas.151268698.
  • 6. Labrador M, et al. Protein encoding by both DNA strands. Nature. 2001;409:1000. doi: 10.1038/35059000.
  • 7. Horiuchi T, Giniger E, Aigaki T. Alternative trans-splicing of constant and variable exons of a Drosophila axon guidance gene, lola. Genes Dev. 2003;17:2496–2501. doi: 10.1101/gad.1137303.
  • 8. Gingeras TR. Implications of chimaeric non-co-linear transcripts. Nature. 2009;461:206–211. doi: 10.1038/nature08452.
  • 9. Cocquet J, Chong A, Zhang G, Veitia RA. Reverse transcriptase template switching and false alternative transcripts. Genomics. 2006;88:127–131. doi: 10.1016/j.ygeno.2005.12.013.
  • 10. Odelberg SJ, Weiss RB, Hata A, White R. Template-switching during DNA synthesis by Thermus aquaticus DNA polymerase I. Nucleic Acids Res. 1995;23:2049–2057. doi: 10.1093/nar/23.11.2049.
  • 11. Tasic B, et al. Promoter choice determines splice site selection in protocadherin alpha and gamma pre-mRNA splicing. Mol Cell. 2002;10:21–33. doi: 10.1016/s1097-2765(02)00578-6.
  • 12. Li X, Zhao L, Jiang H, Wang W. Short homologous sequences are strongly associated with the generation of chimeric RNAs in eukaryotes. J Mol Evol. 2009;68:56–65. doi: 10.1007/s00239-008-9187-0.
  • 13. Zhang G, et al. Deep RNA sequencing at single base-pair resolution reveals high complexity of the rice transcriptome. Genome Res. 2010;20:646–654. doi: 10.1101/gr.100677.109.
  • 14. Gabler M, et al. Trans-splicing of the mod(mdg4) complex locus is conserved between the distantly related species Drosophila melanogaster and D. virilis. Genetics. 2005;169:723–736. doi: 10.1534/genetics.103.020842.
  • 15. Li H, Wang J, Mor G, Sklar J. A neoplastic gene fusion mimics trans-splicing of RNAs in normal human cells. Science. 2008;321:1357–1361. doi: 10.1126/science.1156725.
  • 16. Chintapalli VR, Wang J, Dow JA. Using FlyAtlas to identify better Drosophila melanogaster models of human disease. Nat Genet. 2007;39:715–720. doi: 10.1038/ng2049.
  • 17. Metz CW. Chromosome studies on the Diptera II. The paired association of chromosomes in the Diptera and its significance. J Exp Zool. 1916;21:213–279.
  • 18. Robertson HM, Navik JA, Walden KK, Honegger HW. The bursicon gene in mosquitoes: An unusual example of mRNA trans-splicing. Genetics. 2007;176:1351–1353. doi: 10.1534/genetics.107.070938.
  • 19. Ozsolak F, et al. Direct RNA sequencing. Nature. 2009;461:814–818. doi: 10.1038/nature08390.
  • 20. Mamanova L, et al. FRT-seq: Amplification-free, strand-specific transcriptome sequencing. Nat Methods. 2010;7:130–132. doi: 10.1038/nmeth.1417.
  • 21. Clark AG, et al. Drosophila 12 Genomes Consortium. Evolution of genes and genomes on the Drosophila phylogeny. Nature. 2007;450:203–218. doi: 10.1038/nature06341.
  • 22. Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10:R25. doi: 10.1186/gb-2009-10-3-r25.
  • 23. McManus CJ, et al. Regulatory divergence in Drosophila revealed by mRNA-seq. Genome Res. 2010;20:816–825. doi: 10.1101/gr.102491.109.
